Part 5 – OpenXML and XQuery Optimisation Tips


This is the fifth and final post in a series of five on Using TVP’s and XML to Transport Relational Data.

In this post we’ll take a look at some tips and tricks used to increase OpenXML and XQuery performance.


The purpose behind this series of posts is to provide .Net Developers with an insight into three technologies that can be used to send multiple rows of data to SQL Server 2008(R2).

In Part 2 and Part 3 we looked at how to use the OpenXML Rowset Provider and XQuery (respectively) to shred an XML document into a rowset, and in Part 4 we looked at Microsoft’s latest offering; Table-Valued Parameters (TVP’s). We also looked at how easy it is to swap between these technologies when using .Net Framework, ADO.NET Datasets and Microsoft’s Enterprise Library 5.0.


I could only find two real performance tips relating to OpenXML (three if you count don’t use it!)

1) A parsed document is stored in the internal cache of SQL Server and the MSXML parser (Msxmlsql.dll) uses one-eighth the total memory available for SQL Server so to avoid running out of memory, execute sp_xml_removedocument as soon as possible.

2) If you need to use the output of the OpenXML function several times in your procedure, populate a table variable with the shredded output and use the table variable instead of using the OpenXML function over and over again.


The following performance tips where sourced from Bob Beauchemin’s SQL Server 2008 Microsoft Certified Master (MCM) XML readiness video.

1) Queries that use parent axis generate extra query plan steps so use multiple CROSS APPLY steps to get nodes at multiple nesting levels rather than using the parent axis.

Bad: select o.value('../@id', 'int') as CustID, o.value('@id', 'int') as OrdID from T cross apply x.nodes('/doc/customer/orders') as N(o) Good: select c.value('@id', 'int') as CustID, o.value('@id', 'int') as OrdID from T cross apply x.nodes('/doc/customer') as N1(c) cross apply c.nodes('orders') as N2(o)

2) Move ordinals to the end of path expressions



3) Avoid predicates in the middle of path expressions

book[@ISBN = "1-8610-0157-6"]/author[first-name = "Davis"]

/book[@ISBN = "1-8610-0157-6"] "n" /book/author[first-name = "Davis"] 

4) Use context item in predicate to lengthen path in exist()

FROM docs 
WHERE 1 = xCol.exist('/book/subject[text() = "security"]') 

FROM docs 
WHERE 1 = xCol.exist('/book/subject/text()[. = "security"]') 

5) Casting from XML to SQL

BAD: CAST( CAST(xmldoc.query('/a/b/text()') as nvarchar(500)) as int)
GOOD: xmldoc.value('(/a/b/text())[1]', 'int')

BAD: node.query('.').value('@attr', 'nvarchar(50)')
GOOD: node.value('@attr', 'nvarchar(50)')

6) Use temp table (insert into #temp select … from nodes()) or Table-valued parameter instead of XML

7) Specify a single root node in query as the optimizer assumes that XML can be a fragment.

OpenXML vs. XQuery Performance

There are a ton of forum threads and blog posts out there which have compared the performance of OpenXML and XQuery so I’m not going to post any of my own tests. What I will say though is that OpenXML appears to be faster than XQuery for large documents but slower for small documents. Given this, you’ll probably end up using a mixture of both OpenXML and XQuery (and maybe even a bit of Table-valued Parameters) in your applications.


About Mr. Wharty's Ramblings
Jeff holds a Master’s Degree in Systems Development (.NET Stream) and a Master’s Degree in Database Design and Management (SQL Server). Jeff also holds MCPD, MCITP, MCDBA, MCSD and MCSE certifications.

Leave a Reply

Please log in using one of these methods to post your comment: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: