Friday, January 9, 2009

01 - XML in SQL Server



Here we'll cover below –


-          Transact-SQL, an extension to the SQL database programming language, is a powerful language offering many features.

-          Transact-SQL provides the SQL Server developer with several useful functions, conditional processing methods, advanced transaction control, exception and error handling, scrollable cursors, and much more.



XML Technologies


-          There are a number of XML-related technologies. Some of these technologies are used

-          For displaying XML, like XHTML (HTML 5.0) – XSL and XSLT

-          To model and map the XML document - DTD or XML Schema.

-          To manipulate the contents of an XML document programmatically -  DOM and SAX

-          To query the contents of an XML document - XPath (XML Path language), XLINK (XML Linking Language), and XQL (XML Query Language).

-          To transfer arbitrary XML documents between systems - SOAP


-          This latest version of SQLXML includes SOAP functionality. Shortly after SQL Server 2000 was released, Microsoft began offering free downloads of SQLXML (XML for SQL Server), which further extended interoperability between XML and SQL Server 2000. SOAP with SQLXML enables SQL Server stored procedures, user-defined functions, and other SQLXML technologies, to be exposed as web services. Such web services can then be accessed from multiple platforms or programming languages, by using the SOAP protocol.



-          Other SQLXML technologies





-    OPENXML uses Transact-SQL extensions and system stored procedures to load an XML document into the SQL Server 2000 memory space.

-    The source XML document must be stored in a CHAR, NCHAR, VARCHAR, NVARCHAR, TEXT, or NTEXT table column within a table.

-    Once the document is in memory, we can use a rowset view of the XML data.


-    OPENXML uses the Microsoft XML parser, called Microsoft XML Core Services (MSXML), for parsing the document.




-    MSXML as a free download, can be used with or independently of SQL Server.

-    MSXML includes an XML parser, XSLT engine, DOM APIs, XSD, XPATH, and SAX.

-    Using a programming language such as Visual Basic 6.0, and ADO, you can export or load data to or from SQL Server by accessing the DOM exposed by MSXML.

-    MSXML also includes two programming classes that enable HTTP access.


Microsoft XML OLE DB Simple Provider


-    ADO can also work with the Microsoft XML OLE DB Simple Provider, which can be used to read XML documents into a recordset, and then used for importing into SQL Server 2000.


SQLXML XML Bulk Load Utility (BCP)



-    SQLXML version 1.0 introduced the XML Bulk Load utility, which allows high-speed bulk loads of data

-    packaged inside XML tags into SQL Server.

-    Unlike OPENXML, the entire XML document is not loaded into memory, so XML Bulk Load can be used to load very large documents.


-    XML Bulk Load is a standalone COM object, and can be referenced and invoked by COM compliant programming languages



-    Used within the Transact-SQL statement to output data in XML hierarchical format.


SQLXML Client-side XML Processing



-    SQLXML 3.0 SP1 includes client-side XML processing, which converts a relational result set to a hierarchical XML document format on the client side.

-    If you call a SELECT FOR XML query, with client-side XML formatting enabled, only the SELECT statement (without the FOR XML) is passed to SQL Server. The rowset is then converted to an XML document by SQLXML on the client workstation.


XSD Annotated Schemas



-    XSD is an extension of the W3C XML Schema specification.

-    Microsoft added annotated schemas, which allow you to avoid complicated XML FOR EXPLICIT clause statements by binding the XSD definition directly to the database schema.

-    This binding is also called XML views, allowing these views to be queried by the XML Path language (XPath).


-    You should consider using XSD, instead of FOR XML, if you are more familiar with XML and XPath than Transact-SQL. XPath is not as expressive as Transact-SQL (for example, XPath lacks wildcards, has limited data types, and doesn't have a UNION clause).


Direct URL Queries


-    IIS 5.0 includes XML and SQL Server 2000 integration features.

-    In conjunction with IIS, you can embed SQL queries directly into URL strings. SQL Server can then return the results as an XML document, and display the results in the browser.


XML Templates


-    Also in conjunction with IIS 5.0, XML templates contain SQL statements that incoming URL requests can invoke.

-    Templates are more secure and flexible than direct URL queries, and are not limited in size or complexity.

-    XML templates are stored on the server itself.


SQLXML Updategrams



-    Allow you to modify data in SQL Server by using special XML tags.

-    With updategrams, you use an XML grammar to specify before and after images for fragments of the modified data.

-    Updategrams implement an XML-to-SQL mapping that eliminates the need to write Transact-SQL update queries.


SQLXML Diffgrams


-    Diffgrams are similar to updategrams but they can be generated automatically from an ADO.NET Dataset object.


SQLXML Managed Classes



-    SQLXML managed classes consist of .NET objects (classes) that allow programmers unfamiliar with traditional SQL to use XML templates or server-side XPath queries against SQL Server instead.

-    To use these classes, the .NET framework and SQLXML free download must be installed on the machine where you plan to use them.






-          The Transact-SQL SELECT statement FOR XML clause allows you to convert relational rowset data into hierarchical XML output.

-          FOR XML has three modes.




-    Returns each table used in the FROM clause of the query as an element, and each column referenced in the SELECT clause as an attribute associated with the element.

-    AUTO mode is ideal for queries using complex JOIN operations, easing the conversion to hierarchical format.


-    AUTO mode can also be specified with elements; when adding elements, this AUTO mode maps columns to elements instead of attributes.


-    The GROUP BY clause and aggregate functions are not allowed in conjunction with FOR XML AUTO






-    RAW mode generates one row element for each row of the query result set, and includes the column data as the row element's attributes.

-    RAW mode does not support retrieval of binary data.













Leaders.ID, Leaders.vchTitle, Leaders.vchCountry, Leaders.vchLastName, Country.iPopulation

FROM Leaders, Country

WHERE Leaders.vchCountry = Country.vchName

<FOR Clause>






<Leaders ID="1" vchTitle="President" vchCountry="Afghanistan" vchLastName="Karzai">

<Country iPopulation="26668251"/>


<Leaders ID="2" vchTitle="President" vchCountry="Albania" vchLastName="Moisiu">

<Country iPopulation="3119000"/>



























<row ID="1" vchTitle="President" vchCountry="Afghanistan" vchLastName="Karzaia" iPopulation="26668251"/>

<row ID="2" vchTitle="President" vchCountry="Albania"vchLastName="Moisiu" iPopulation="3119000"/>






<Schema name="Schema4" xmlns="urn:schemas-microsoft-com:xml-data" xmlns:dt="urn:schemas-microsoft-com:datatypes">

<ElementType name="Leaders" content="empty" model="closed">

<AttributeType name="ID" dt:type="i4"/>

<AttributeType name="vchTitle" dt:type="string"/>

<AttributeType name="vchCountry" dt:type="string"/>

<attribute type="ID"/>

<attribute type="vchTitle"/><attribute type="vchCountry"/>




<Leaders xmlns="x-schema:#Schema4" ID="1" vchTitle="President" vchCountry="Afghanistan"/>

<Leaders xmlns="x-schema:#Schema4" ID="1" vchTitle="President" vchCountry="Afghanistan"/>









-          The OPENXML Transact-SQL command allows us to query XML documents stored in a table column like a relational table.

-          OPENXML allows retrieval of both elements and attributes from an XML document fragment.


-          The sp_xml_preparedocument system SP is used in conjunction with OPENXML to move the XML document into memory.

-          The document is stored in the internal cache of SQL Server, using up to one eighth of the total available SQL Server memory.

-          The procedure reads the XML document and parses it internally using the MSXML (Microsoft XML) parser.

-          The sp_xml_preparedocument stored procedure is responsible for returning a handle that we can use to access the internal representation of the XML document.

-          The handle can be used for the duration of the user connection to SQL Server, or until the handle is de-allocated using sp_xml_removedocument.

-          The sp_xml_removedocument procedure is used to clear the document from memory once finished.


-          Syntax









sp_xml_preparedocument hdoc OUTPUT [, xmltext] [, xpath_namespaces]








OPENXML(idoc int [in], rowpattern nvarchar[in], flags byte[in]])


  [WITH (SchemaDeclaration | TableName)]







Is the document handle of the internal representation of an XML document



Is the XPath pattern used to identify the nodes to be processed as rows



Indicates the mapping that should be used between the XML data and the relational rowset, and how the spill-over column should be filled. flags is an optional input parameter, and can be one of these values.




Defaults to attribute-centric mapping



Use the attribute-centric mapping.

Can be combined with XML_ELEMENTS; in which case, attribute-centric mapping is applied first, and then element-centric mapping is applied for all columns not yet dealt with.



Use the element-centric mapping.

Can be combined with XML_ATTRIBUTES; in which case, attribute-centric mapping is applied first, and then element-centric mapping is applied for all columns not yet dealt with.



Can be combined (logical OR) with XML_ATTRIBUTES or XML_ELEMENTS.

In context of retrieval, this flag indicates that the consumed data should not be copied to the overflow property @mp:xmltext.









Is the schema definition of the form:

ColName ColType [ColPattern | MetaProperty][, ColName ColType [ColPattern | MetaProperty]...]



Is the column name in the rowset.



Is the SQL data type of the column in the rowset. If the column types differ from the underlying XML data type of the attribute, type coercion occurs. If the column is of type timestamp, the present value in the XML document is disregarded when selecting from an OPENXML rowset, and the autofill values are returned.



Is an optional, general XPath pattern that describes how the XML nodes should be mapped to the columns.









Is the table name that can be given (instead of SchemaDeclaration) if a table with the desired schema already exists and no column patterns are required.


The WITH clause provides a rowset format (and additional mapping information as necessary) using either SchemaDeclaration or specifying an existing TableName. If the optional WITH clause is not specified, the results are returned in an edge table format. Edge tables represent the fine-grained XML document structure (e.g. element/attribute names, the document hierarchy, the namespaces, PIs etc.) in a single table.




-          Examples



Attribute Centric



DECLARE @idoc      int

DECLARE @KeyWord   varchar(100)

DECLARE @KeyWordsXML varchar(1000)



set @KeyWordsXML='<ROOT>

            <Keywords KeyWord="DB" />

            <Keywords KeyWord="Link" />

            <Keywords KeyWord="1" />




EXEC sp_xml_preparedocument @idoc OUTPUT, @KeyWordsXML




OPENXML (@idoc, '/ROOT/Keywords', 1)

WITH (KeyWord  varchar(25) '@KeyWord')



EXEC sp_xml_removedocument @idoc



Result :-









Element Centric



DECLARE @idoc        int

DECLARE @KeyWord11   varchar (100)

DECLARE @KeyWordsXML varchar (1000)



set @KeyWordsXML='<ROOT>













EXEC sp_xml_preparedocument @idoc OUTPUT, @KeyWordsXML




OPENXML (@idoc, '/ROOT/Keywords', 2)

WITH (KeyWord11 varchar (25) 'KeyWord')


EXEC sp_xml_removedocument @idoc



Result :-












Thanks & Regards,

Arun Manglick || Tech Lead

No comments:

Post a Comment