The gSOAP level-2 DOM parser features "smart" XML namespace handling and can be used to mix gSOAP XML serializers with plain XML parsing. The DOM parser is also an essential component of the wsse plugin to verify digital signatures.
The DOM parser is not a stand-alone application. The DOM parser is integrated with the SOAP engine to populate a node set and to render a node set in XML.
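Two files are needed to work with DOM node sets: dom.h, which is imported into the gSOAP header file, and dom.cpp (or dom.c for C), which is linked with your application code. The import is: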
#import "dom.h"
By importing dom.h, two special data types become available: xsd__anyType and xsd__anyAttribute, which represent a hierarchical DOM node set of elements and attributes, respectively. The DOM node element and attribute data structures can be used within structs, classes, STL containers, and as arguments of service operations. For example:
    #import "dom.h"
    #import "wsu.h"
    class ns__myProduct
    { public:
        @char*             wsu__Id;
        @xsd__anyAttribute atts;
        _wsu__Timestamp*   wsu__Timestamp;
        char*              name;
        int                SKU;
        double             price;
        xsd__anyType*      elts;
        ns__myProduct();
        ~ns__myProduct();
    };
It is important to declare the xsd__anyType member at the end of the struct or class, since the DOM parser consumes any XML element (the field name, 'elts' in this case, is irrelevant). The other fields must therefore be declared first, so that they are populated before the DOM node set absorbs every XML element that was not matched earlier. Likewise, the xsd__anyAttribute member should be placed after the other attributes.
Note that we also imported wsu.h as an example to show how to add a wsu:Id attribute to a struct or class if we want to digitally sign instances, and how to add a standardized wsu:Timestamp element to record creation and expiration times.
To compile, run soapcpp2 (with -Iimport) and compile your code by linking dom.cpp (or dom.c for C). Note that the DOM data structures are declared in stdsoap2.h, while the DOM operations are defined in dom.cpp (or dom.c for C).
Methods to populate and traverse DOM node sets will be explained later. First, let's take a look at parsing and generating XML documents.
    #include "soapH.h"  // generated by soapcpp2
    #include "ns.nsmap" // a namespace table with the XML namespace used
The C++ std::iostream operators are overloaded to parse XML octet streams into node sets and to emit XML from node sets:
    soap_dom_element dom;
    dom.soap = soap_new1(SOAP_DOM_TREE | SOAP_C_UTFSTRING);
    cin >> dom; // parse XML
    if (dom.soap->error)
      ... // parse error
    cout << dom; // display XML
    if (dom.soap->error)
      ... // output error
    soap_destroy(dom.soap);
    soap_end(dom.soap);
    soap_done(dom.soap);
    free(dom.soap);
In the example above we copied an XML document from stdin to stdout.
In C we use the DOM "serializers" to accomplish this as follows:
    soap_dom_element dom;
    dom.soap = soap_new1(SOAP_DOM_TREE | SOAP_C_UTFSTRING);
    dom.soap->recvfd = 0; // recvfd is an int file descriptor: 0 is stdin
    if (soap_begin_recv(dom.soap)
     || NULL == soap_in_xsd__anyType(dom.soap, NULL, &dom, NULL) // NULL means failure
     || soap_end_recv(dom.soap))
      ... // parse error
    dom.soap->sendfd = 1; // sendfd is an int file descriptor: 1 is stdout
    if (soap_begin_send(dom.soap))
      ... // output error
    dom.soap->ns = 2; // note: must use this to omit namespaces table dumping
    if (soap_out_xsd__anyType(dom.soap, NULL, 0, &dom, NULL)
     || soap_end_send(dom.soap))
      ... // output error
    soap_end(dom.soap);
    soap_done(dom.soap);
    free(dom.soap);
The SOAP_DOM_NODE flag is used to instruct the parser to populate a DOM node set with deserialized C and C++ data structures using the data type's deserializers that were generated with soapcpp2 from a header file with the data type declarations. Suppose for example that the following header file was used (in fact, this declaration appears in wsu.h):
    typedef struct _wsu__Timestamp
    { @char* wsu__Id;
      char*  Created;
      char*  Expires;
    } _wsu__Timestamp;
Note that the leading underscore of the type name indicates an XML element definition (rather than a complexType definition), so the name of the data type is relevant when comparing XML element tags to C/C++ data types by the deserializers.
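As a rough illustration of this naming convention, the sketch below maps a type name to the XML tag it stands for. It covers only two rules: a leading underscore marks an element definition, and the double underscore separates the namespace prefix from the local name. The helper xml_tag_of and the XmlTag struct are hypothetical names made up for this example and are not part of gSOAP; the real soapcpp2 translation handles more cases.

```cpp
#include <string>

// Hypothetical result type for this sketch (not a gSOAP type).
struct XmlTag
{
  bool is_element;   // true when the type name had a leading underscore
  std::string qname; // "prefix:local", or just "local" when unqualified
};

// Hypothetical helper, not a gSOAP function: converts a soapcpp2-style type
// name such as "_wsu__Timestamp" into the qualified tag "wsu:Timestamp".
// Simplified: other soapcpp2 name-translation rules are ignored.
XmlTag xml_tag_of(const std::string& type_name)
{
  XmlTag tag;
  tag.is_element = !type_name.empty() && type_name[0] == '_';
  // Drop the leading underscore before looking for the prefix separator.
  std::string s = tag.is_element ? type_name.substr(1) : type_name;
  std::string::size_type sep = s.find("__");
  if (sep == std::string::npos)
    tag.qname = s;
  else
    tag.qname = s.substr(0, sep) + ":" + s.substr(sep + 2);
  return tag;
}
```

With this sketch, xml_tag_of("_wsu__Timestamp") reports an element named wsu:Timestamp, while xml_tag_of("xsd__anyType") reports a non-element type named xsd:anyType.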
When an XML document is parsed with one or more <wsu:Timestamp> elements, the DOM will be automatically populated with the _wsu__Timestamp objects. Suppose the XML document root is a <wsu:Timestamp>, then the root node of the DOM is a _wsu__Timestamp object:
    soap_dom_element dom;
    dom.soap = soap_new1(SOAP_DOM_NODE);
    cin >> dom; // parse XML
    if (dom.soap->error)
      ... // parse error
    if (dom.type == SOAP_TYPE__wsu__Timestamp)
    { _wsu__Timestamp *t = (_wsu__Timestamp*)dom.node;
      cout << "Start " << (t->Created ? t->Created : "")
           << " till " << (t->Expires ? t->Expires : "") << endl;
    }
Note that the soapcpp2 compiler generates a unique type identification constant SOAP_TYPE_X for each data type X, which is used to determine the node's type in the example above.
When objects occur deeper within the DOM node set, the DOM tree must be traversed to reach them. This subject is discussed next.
    soap_dom_element dom;
    dom.soap = soap_new1(SOAP_DOM_TREE | SOAP_C_UTFSTRING);
    ...
    for (soap_dom_element::iterator iter = dom.begin(); iter != dom.end(); ++iter)
      for (soap_dom_attribute::iterator attr = (*iter).atts.begin(); attr != (*iter).atts.end(); ++attr)
        ...
In C code, use:
    soap_dom_element dom, *iter;
    soap_dom_attribute *attr;
    dom.soap = soap_new1(SOAP_DOM_TREE | SOAP_C_UTFSTRING);
    ...
    for (iter = &dom; iter; iter = soap_dom_next_element(iter))
      for (attr = iter->atts; attr; attr = soap_dom_next_attribute(attr))
        ...
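The soap_dom_element and soap_dom_attribute structs essentially form linked lists, so it is not difficult to write your own tree walkers. A node's child elements and attributes are reached through soap_dom_element::elts and soap_dom_element::atts, and its parent through soap_dom_element::prnt.
The linked lists of sibling element nodes and attribute nodes are chained through soap_dom_element::next and soap_dom_attribute::next, respectively.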
Note that for a root node, the soap_dom_element::prnt and soap_dom_element::next are both NULL.
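To make the linked-list layout concrete, here is a minimal sketch of a hand-written walker that visits elements in document order. It uses a simplified stand-in struct named Node that models only the prnt, next, elts and name links; Node, next_in_document_order and tag_names are hypothetical names for this example, and the real soap_dom_element declared in stdsoap2.h has more fields.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-in for soap_dom_element (assumption: only the link fields
// and the tag name are modeled).
struct Node
{
  Node *next;       // next sibling, NULL for the last sibling
  Node *prnt;       // parent, NULL for the root
  Node *elts;       // first child, NULL for a leaf
  const char *name; // tag name
};

// Returns the node that follows 'n' in document order, or NULL at the end:
// the first child if there is one, else the next sibling, else the next
// sibling of the nearest ancestor that has one.
Node *next_in_document_order(Node *n)
{
  if (n->elts)
    return n->elts;
  while (n)
  {
    if (n->next)
      return n->next;
    n = n->prnt; // climb until an ancestor has a sibling, or the root is passed
  }
  return NULL;
}

// Collects the tag names in document order. 'root' is expected to be a true
// root node (prnt and next both NULL), otherwise the walk leaves the subtree.
std::vector<std::string> tag_names(Node *root)
{
  std::vector<std::string> names;
  for (Node *n = root; n; n = next_in_document_order(n))
    names.push_back(n->name);
  return names;
}
```

The walker needs no recursion or explicit stack because the prnt link lets it climb back up after the last child of a node has been visited.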
Tag names of elements and attributes are stored in soap_dom_element::name and soap_dom_attribute::name strings, respectively. The names are UTF-8 encoded.
XML namespace bindings are explicitly propagated throughout the DOM node set for those elements and attributes that are namespace qualified (either with a namespace prefix or when they occur in an xmlns default namespace scope). The namespaces are stored in the soap_dom_element::nstr and soap_dom_attribute::nstr strings. The following example shows how to traverse a DOM node set and print the elements with their namespace URIs when present:
    soap_dom_element dom;
    dom.soap = soap_new1(SOAP_DOM_TREE | SOAP_C_UTFSTRING);
    cin >> dom;
    for (soap_dom_element::iterator iter = dom.begin(); iter != dom.end(); ++iter)
    { cout << "Element " << (*iter).name;
      if ((*iter).nstr)
        cout << " has namespace " << (*iter).nstr;
      cout << endl;
    }
    soap_destroy(dom.soap);
    soap_end(dom.soap);
    soap_done(dom.soap);
    free(dom.soap);
Text content of a node is stored in the soap_dom_element::data string in UTF-8 format. This string is populated if the SOAP_C_UTFSTRING flag was set. Otherwise the data content will be stored in the soap_dom_element::wide wide-character string.
The following example prints those element nodes that have text content (in UTF-8 format):
    soap_dom_element dom;
    ...
    for (soap_dom_element::iterator iter = dom.begin(); iter != dom.end(); ++iter)
    { cout << "Element " << (*iter).name;
      if ((*iter).data)
        cout << " = " << (*iter).data;
      cout << endl;
    }
    ...
When a DOM node set contains deserialized objects (enabled with the SOAP_DOM_NODE flag), the soap_dom_element::type and soap_dom_element::node values are set:
    soap_dom_element dom;
    ...
    for (soap_dom_element::iterator iter = dom.begin(); iter != dom.end(); ++iter)
    { cout << "Element " << (*iter).name;
      if ((*iter).type) // non-zero type: node points to a deserialized object
        cout << " contains a deserialized object";
      cout << endl;
    }
    ...
The soap_dom_element::type is 0 or a SOAP_TYPE_X constant, where X is the name of the deserialized type. The soap_dom_element::node points to the deserialized object. If this is a char* string, it points directly to the character sequence.
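The type/node pair works like a tagged pointer: the type constant says how the untyped node pointer may be cast. The sketch below shows that dispatch pattern in isolation. TypedNode, describe and the TYPE_ constants are hypothetical stand-ins invented for this example; in a real application the constants are the SOAP_TYPE_X values generated by soapcpp2 and the pair lives inside soap_dom_element.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical type constants standing in for generated SOAP_TYPE_X values.
enum { TYPE_NONE = 0, TYPE_string = 1, TYPE_float = 2 };

// Simplified stand-in for the type/node pair of soap_dom_element.
struct TypedNode
{
  int type;         // 0 or a type constant
  const void *node; // points to the deserialized object when type != 0
};

// Describes the payload of a node by switching on its type constant and
// casting the untyped pointer back to the matching C++ type.
std::string describe(const TypedNode& n)
{
  std::ostringstream out;
  switch (n.type)
  {
    case TYPE_string: // a char* string: node points directly at the characters
      out << "string: " << static_cast<const char*>(n.node);
      break;
    case TYPE_float:
      out << "float: " << *static_cast<const float*>(n.node);
      break;
    default: // type 0 (or unknown): nothing to cast
      out << "no deserialized object";
  }
  return out.str();
}
```

Checking the type constant before every cast is what keeps the void pointer safe to use; a node with type 0 must not be dereferenced.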
Note: the SOAP_DOM_TREE flag restricts the parser to DOM content only, so deserializers are not used. When the SOAP_DOM_TREE flag is not used, gSOAP MAY use an appropriate deserializer when an element contains an id attribute and gSOAP can determine the type from the id attribute reference and/or the xsi:type attribute of the element.
For C++ code, the built-in soap_dom_element::iterator can be used to search for matching element nodes. C programmers have no equivalent and must write explicit loops to search for nodes.
The soap_dom_element::find method returns a search iterator. The method takes an optional namespace URI and element name to match elements in the DOM node set. For example, to iterate over all "product" elements:
    soap_dom_element dom;
    ...
    for (soap_dom_element::iterator iter = dom.find(NULL, "product"); iter != dom.end(); ++iter)
      cout << "Element " << (*iter).name << endl;
    ...
To iterate over all elements in a particular namespace:
    soap_dom_element dom;
    ...
    for (soap_dom_element::iterator iter = dom.find("http://www.w3.org/2001/XMLSchema", NULL); iter != dom.end(); ++iter)
      cout << "Element " << (*iter).name << endl;
    ...
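Where the search iterator is not available, the same kind of lookup can be written as an explicit loop over the child and sibling links. The following is a minimal sketch under stated assumptions: Elem is a simplified stand-in for soap_dom_element that models only next, elts, nstr and name, find_all is a hypothetical helper, and only exact string matches are supported.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Simplified stand-in for soap_dom_element (assumption: only these fields).
struct Elem
{
  Elem *next;       // next sibling, NULL for the last sibling
  Elem *elts;       // first child, NULL for a leaf
  const char *nstr; // namespace URI, NULL when unqualified
  const char *name; // tag name
};

// Appends to 'out' every element in the subtree rooted at 'e' whose
// namespace URI and tag name match exactly. A NULL 'nstr' or 'name'
// argument matches anything.
void find_all(Elem *e, const char *nstr, const char *name, std::vector<Elem*>& out)
{
  if (!e)
    return;
  bool ns_ok = !nstr || (e->nstr && std::strcmp(e->nstr, nstr) == 0);
  bool name_ok = !name || std::strcmp(e->name, name) == 0;
  if (ns_ok && name_ok)
    out.push_back(e);
  // Visit the children in order, so matches come out in document order.
  for (Elem *child = e->elts; child; child = child->next)
    find_all(child, nstr, name, out);
}
```

Calling find_all(&root, NULL, "product", hits) collects every element named product regardless of namespace, and passing a URI with a NULL name collects every element in that namespace.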
Since namespaces may have different versions, a '*' wildcard can be used in the namespace string. Likewise, tag names may be namespace qualified with prefixes that are not relevant to the search, so a '*' wildcard can stand in for the prefix:
    soap_dom_element dom;
    ...
    for (soap_dom_element::iterator iter = dom.find("http://www.w3.org/*XMLSchema", "*:schema"); iter != dom.end(); ++iter)
      cout << "Element " << (*iter).name << endl;
    ...
This searches for qualified elements in one of the XSD namespaces.
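To see why a pattern such as "http://www.w3.org/*XMLSchema" covers several XSD namespace versions, consider the small glob matcher sketched below. It is a hypothetical illustration of '*' matching in general, with wildcard_match as an invented name; it is not gSOAP's internal matcher, whose exact rules may differ.

```cpp
#include <cstddef>

// Hypothetical glob matcher, not gSOAP's implementation: '*' matches any
// (possibly empty) run of characters, everything else must match literally.
bool wildcard_match(const char *pattern, const char *text)
{
  const char *star = NULL;  // position of the most recent '*' in the pattern
  const char *retry = NULL; // text position that the '*' currently extends to
  while (*text)
  {
    if (*pattern == '*')
    {
      star = pattern++; // remember the '*' and first try matching it to nothing
      retry = text;
    }
    else if (*pattern == *text)
    {
      ++pattern;
      ++text;
    }
    else if (star)
    {
      pattern = star + 1; // backtrack: let the '*' swallow one more character
      text = ++retry;
    }
    else
      return false;
  }
  while (*pattern == '*') // trailing '*' may match the empty remainder
    ++pattern;
  return *pattern == '\0';
}
```

Under this sketch the pattern matches both http://www.w3.org/2001/XMLSchema and http://www.w3.org/1999/XMLSchema, but not http://www.w3.org/2001/XMLSchema-instance, and "*:schema" matches xsd:schema but not an unprefixed schema.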
The following examples are shown in C++. C programmers can use the soap_dom_element::elts list and the soap_dom_element::atts list to add child nodes and attribute nodes, respectively.
    soap_dom_element dom;
    dom.soap = soap_new1(SOAP_C_UTFSTRING | SOAP_XML_INDENT);
    const char *myURI = "http://domain/schemas/product.xsd";
    ns__myProduct product; // no parentheses: "product()" would declare a function
    product.soap_default(dom.soap); // method generated by soapcpp2
    product.name = (char*)"Ernie"; // cast: name is declared as char*
    product.SKU = 123;
    product.price = 9.95;
    dom.set(myURI, "list");
    dom.add(soap_dom_attribute(dom.soap, myURI, "version", "0.9"));
    dom.add(soap_dom_element(dom.soap, myURI, "documentation", "List of products"));
    dom.add(soap_dom_element(dom.soap, myURI, "product", &product, SOAP_TYPE_ns__myProduct));
    cout << dom;
    ...
Assuming that myURI is associated with namespace prefix "ns" in the namespace table, the rendition is
    <?xml version="1.0" encoding="UTF-8"?>
    <ns:list
      xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
      xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xmlns:xsd="http://www.w3.org/2001/XMLSchema"
      xmlns:ns="http://domain/schemas/product.xsd"
      version="0.9" >
      <ns:documentation>List of products</ns:documentation>
      <ns:product>
        <name>Ernie</name>
        <SKU>123</SKU>
        <price>9.95</price>
      </ns:product>
    </ns:list>
Note that the namespace table content is "dumped" into the XML rendition.
The global namespace mapping table "namespaces[]" contains the namespace bindings that should be meaningful to the application. The soap context can be set to a new table as follows:
    Namespace myNamespaces[] = { { "ns", "..." }, ... , { NULL } };
    soap_dom_element dom;
    dom.soap = soap_new1(SOAP_C_UTFSTRING | SOAP_XML_INDENT);
    dom.soap->namespaces = myNamespaces;
To produce cleaner XML, initialize the soap context with the SOAP_XML_CANONICAL flag. The rendition then becomes:
    <ns:list xmlns:ns="http://domain/schemas/product.xsd" version="0.9" >
      <ns:documentation>List of products</ns:documentation>
      <ns:product>
        <name>Ernie</name>
        <SKU>123</SKU>
        <price>9.95</price>
      </ns:product>
    </ns:list>
Note that the xmlns bindings are rendered automatically. When parsing an XML document, xmlns bindings are not added to the attribute node set. The soap_dom_element::nstr and soap_dom_attribute::nstr namespace strings are set to retain namespace URIs. The XML rendering algorithm uses the namespace strings to add xmlns bindings that are not already in the namespace table.
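The idea of adding only those xmlns bindings that the namespace table does not already provide can be sketched as a lookup with a fallback. The class below is a hypothetical illustration, not gSOAP code: PrefixTable, bind and prefix_for are invented names, and the scheme for generated prefixes ("n1", "n2", and so on) is made up for the example.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical sketch, not gSOAP code: tracks prefix bindings while rendering.
class PrefixTable
{
public:
  PrefixTable() : generated_(0) {}

  // Registers a binding that is known up front, as a namespace table would.
  void bind(const std::string& prefix, const std::string& uri)
  {
    prefix_of_[uri] = prefix;
  }

  // Returns the prefix to use for 'uri'. When the URI has no binding yet, a
  // new prefix is invented and the xmlns attribute that must be emitted is
  // stored in 'xmlns_attr'; otherwise 'xmlns_attr' is left empty.
  std::string prefix_for(const std::string& uri, std::string& xmlns_attr)
  {
    xmlns_attr.clear();
    std::map<std::string, std::string>::const_iterator it = prefix_of_.find(uri);
    if (it != prefix_of_.end())
      return it->second;
    std::ostringstream prefix;
    prefix << "n" << ++generated_; // invented naming scheme for this sketch
    prefix_of_[uri] = prefix.str();
    xmlns_attr = "xmlns:" + prefix.str() + "=\"" + uri + "\"";
    return prefix.str();
  }

private:
  std::map<std::string, std::string> prefix_of_; // URI -> prefix
  int generated_;                                // number of invented prefixes
};
```

A renderer built this way would call prefix_for with each node's namespace string and write the returned xmlns attribute only when it is non-empty, so a URI is declared once and reused afterwards.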
When it is desirable to render XML exactly as represented in the DOM node set, e.g. when xmlns bindings are explicitly included in the attribute node set, use the SOAP_DOM_ASIS flag:
    soap_dom_element dom;
    dom.soap = soap_new1(SOAP_C_UTFSTRING | SOAP_DOM_ASIS);
    #import "dom.h"
    typedef float xsd__float;
Consider invoking the XMethods delayed stock quote service to obtain a stock quote. The float deserializer is used to store the floating-point value of a stock given that the <result> element has an xsi:type="xsd:float" attribute.
    struct soap *soap = soap_new1(SOAP_C_UTFSTRING | SOAP_DOM_NODE);
    soap_dom_element envelope(soap, "http://schemas.xmlsoap.org/soap/envelope/", "Envelope");
    soap_dom_element body(soap, "http://schemas.xmlsoap.org/soap/envelope/", "Body");
    soap_dom_attribute encodingStyle(soap, "http://schemas.xmlsoap.org/soap/envelope/", "encodingStyle", "http://schemas.xmlsoap.org/soap/encoding/");
    soap_dom_element request(soap, "urn:xmethods-delayed-quotes", "getQuote");
    soap_dom_element symbol(soap, NULL, "symbol", "IBM");
    soap_dom_element response(soap);
    envelope.add(body);
    body.add(encodingStyle);
    body.add(request);
    request.add(symbol);
    cout << "Request message:" << endl << envelope << endl;
    if (soap_connect(soap, "http://services.xmethods.net/soap", "")
     || soap_out_xsd__anyType(soap, NULL, 0, &envelope, NULL)
     || soap_end_send(soap)
     || soap_begin_recv(soap)
     || NULL == soap_in_xsd__anyType(soap, NULL, &response, NULL) // NULL means failure
     || soap_end_recv(soap)
     || soap_closesock(soap))
    { soap_print_fault(soap, stderr);
      soap_print_fault_location(soap, stderr);
    }
    else
    { cout << "Response message:" << endl << response << endl;
      // visit every node that holds a deserialized xsd__float value
      for (soap_dom_element::iterator walker = response.find(SOAP_TYPE_xsd__float); walker != response.end(); ++walker)
        cout << "Quote = " << *(xsd__float*)(*walker).node << endl;
    }
    soap_destroy(soap);
    soap_end(soap);
    soap_done(soap);
    free(soap);
    soap_dom_element dom;
    dom.soap = soap_new1(... flags ...);
    ...
    soap_destroy(dom.soap);
    soap_end(dom.soap);
    soap_done(dom.soap);
    soap_free(dom.soap);
The nodes are removed with soap_destroy (for C++) and soap_end. The soap_done function should only be used before the soap context is deallocated.
The soap context flags that control the parsing and rendition of XML are:
The DOM traversal operations:
The soap_dom_element fields:
The soap_dom_element types:
The soap_dom_element methods:
The soap_dom_element constructors:
The soap_dom_attribute fields:
The soap_dom_attribute types:
The soap_dom_attribute methods:
The soap_dom_attribute constructors: