Converting HIERARCHYID to XML in T-SQL

Not sure if you’ve attempted to convert a table with HIERARCHYID to  an XML representation, and if you have, I’m sure you’ve experienced the same woes as me. Sure, I could have taken the route of using C# to create the XML, and it very well may be a better way to make such a conversion, but after deciding that I had to be able to do this in T-SQL; and so began the journey (albeit it a short one) to find such a solution…

Since the XML modify method can only insert into a single node in an XML document, I had to either attempt to generate a string representation of the xml form the data (no simple task) or I could cursor through the data one row at a time (yes, cursor) and insert each node. For this implementation, I choose the cursor method.

Since the XQuery to insert nodes must be a static value and cannot be a variable, I found it difficult to figure out how to insert a node into another node since there was no point of reference. So at first I thought, the data would probably be uniquely identifiable, and so I could use that “id” to add an attribute to every node that I constructed and then cursor through and insert into the node that matched the parent id of the node I was inserting.

In other words, I would create a cursor that contained the parent node ID and concatenated values from the row of data to create the node with an “id” attribute.

CAST

('<' + NodeName + ' id="' CAST(NodeID AS VARCHAR(20)) + '">' + ISNULL(NodeText, '') + '</' + NodeName + '>' AS XML) AS XmlToInsert

I would then iterate through the cursor and insert the node as follows:


SET
@XR.modify('insert sql:variable("@xcol") into (//*[@id=sql:variable("@hparentid")])[1]')

This could become more problematic if the unique key contained multiple fields. I also may not want the “id” attribute in my results. There were lots of things that could go wrong with this implementation. So I scrapped it and moved on. And although the version I am about to present has its own potential for issues, I felt it was more flexible and cleaner in its approach. Essentially what I decided to do is to use a temp table that contained the generated the XML node, the original HIERARCHYID value, a row number, generated with ROW_NUMBER() ordered by the hierarchy order, and a parent row number, which would initially set to 0 then updated using a self join on the temp table.

Then since the XML nodes position will match the generated row number based on the HIERARCHYID position, we can simply insert the new node into the parent node based on its position.


– Sample Data to test with

CREATE TABLE
#HTable (NodeName sysname, Attributes xml, NodeText VARCHAR(MAX), HierarchyNode HIERARCHYID)
INSERT INTO #HTable (NodeName, Attributes, NodeText, HierarchyNode)
VALUES 
  
('a', '<a attr="1" />', NULL, 0x),
   (
'b', NULL, NULL, 0x58),
   (
'c', '<a xyz="3" />', 'abc', 0x5AC0),
   (
'c', NULL, 'def', 0x5B40),
   (
'b', '<a id="111" pid="1234" />', NULL, 0x68),
   (
'c', NULL, 'abc', 0x6AC0),
   (
'c', NULL, 'def', 0x6B40)

CREATE TABLE #T (XmlToInsert XML, HierarchyNode HIERARCHYID, RowNum INT, ParentRowNum INT)

-- INSERT the generated XML node, the original HIERARCHYID, a unique row number, and a parent row number (set to 0)
INSERT INTO #T (XmlToInsert, HierarchyNode, RowNum, ParentRowNum
SELECT 
  
CAST(
          
'<' + NodeName + ' '
          
+ CASE WHEN Attributes IS NOT NULL 
              
THEN SUBSTRING(CAST(Attributes AS VARCHAR(MAX)), 3, LEN(CAST(Attributes AS VARCHAR(MAX))) - 4)
              
ELSE '' END
          
+ '>' + ISNULL(NodeText, '') + '</' + NodeName + '>'
      
AS XML) AS XmlToInsert
  
, HierarchyNode 
  
, ROW_NUMBER() OVER (ORDER BY HierarchyNode) AS RowNum
  
, 0 AS ParentRowNum
FROM #HTable
ORDER BY HierarchyNode

-- UPDATE the parent row number using the HIERARCHYID method GetAncestor in the self join
UPDATE T1
SET T1.ParentRowNum = T2.RowNum
FROM #T AS T1
  
INNER JOIN #T AS T2 ON T2.HierarchyNode = T1.HierarchyNode.GetAncestor(1)

DECLARE @xcol XML, @parentrownum INT, @flag BIT = 0, @XR XML = ''

-- We actually only need the generated XML and the parent row number to do the rest of this work
DECLARE crH CURSOR READ_ONLY FOR SELECT XmlToInsert, ParentRowNum FROM #T ORDER BY RowNum
   
OPEN crH

FETCH NEXT FROM crH INTO @xcol, @parentrownum
WHILE(@@FETCH_STATUS = 0)
BEGIN
       -- First time through, we add a root node
       IF
@flag = 0
          
SET @XR.modify('insert sql:variable("@xcol") into (/)[1]')
      
ELSE -- Subsequent passes we find the parent node by position
           SET
@XR.modify('insert sql:variable("@xcol") into (//*)[sql:variable("@parentrownum")][1]')
       
      
SET @flag = 1
      
FETCH NEXT FROM crH INTO @xcol, @parentrownum
END
CLOSE
crH
DEALLOCATE crH
DROP TABLE #T
DROP TABLE #HTable
SELECT @xr

Please let me know if you have any ideas that might optimize this, and if you have an implementation (T-SQL or .NET), please share.


Cross-posted from SQLBlog! – http://www.sqlblog.com

Advertisement
Explore posts in the same categories: Blogroll

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Connecting to %s


Follow

Get every new post delivered to your Inbox.