Cross-posted from SQLBlog! – http://www.sqlblog.com
I’d like to take a moment to let you know that SQLblog found a new home this past weekend and was moved onto a much-needed, much better server infrastructure. SQLblog continues to use MaximumASP (now CBeyond Cloud Services, but still found at www.maximumasp.com). We are now in our fourth year of being hosted at MaximumASP, and we have been very happy with our hosting and support.
If you’ve attempted to convert a table with a HIERARCHYID column to an XML representation, I’m sure you’ve experienced the same woes as me. Sure, I could have taken the route of using C# to create the XML, and that may very well be a better way to make such a conversion, but I decided that I had to be able to do this in T-SQL, and so began the journey (albeit a short one) to find such a solution…
Since the XML modify method can only insert into a single node in an XML document, I had to either attempt to generate a string representation of the XML from the data (no simple task) or cursor through the data one row at a time (yes, cursor) and insert each node. For this implementation, I chose the cursor method.
Since the XQuery passed to modify must be a string literal and cannot be a variable, I found it difficult to figure out how to insert a node into another node, since there was no point of reference. So at first I thought that, since the data would probably be uniquely identifiable, I could use that “id” to add an attribute to every node that I constructed, and then cursor through and insert each node into the node whose “id” matched its parent id.
In other words, I would create a cursor that contained the parent node ID and concatenated values from the row of data to create the node with an “id” attribute.
CAST('<' + NodeName + ' id="' + CAST(NodeID AS VARCHAR(20)) + '">' + ISNULL(NodeText, '') + '</' + NodeName + '>' AS XML) AS XmlToInsert
I would then iterate through the cursor and insert the node as follows:
SET @XR.modify('insert sql:variable("@xcol") into (//*[@id=sql:variable("@hparentid")])[1]')
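As a minimal, self-contained sketch of that single insert, the parent can be located by its “id” attribute. The document, the new node, and the parent id below are made-up values for illustration, not the real data:

```sql
-- Hypothetical values for illustration only
DECLARE @XR XML = '<a id="1"><b id="2" /></a>';  -- document built so far
DECLARE @xcol XML = '<c id="3">abc</c>';         -- node to insert
DECLARE @hparentid VARCHAR(20) = '2';            -- id of the intended parent

-- Insert @xcol as a child of the first element whose id attribute matches
SET @XR.modify('insert sql:variable("@xcol") into (//*[@id=sql:variable("@hparentid")])[1]');

SELECT @XR;  -- the <c> element should now sit inside <b id="2">
```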
This could become more problematic if the unique key contained multiple fields. I also may not want the “id” attribute in my results. There were lots of things that could go wrong with this implementation, so I scrapped it and moved on. Although the version I am about to present has its own potential for issues, I felt it was more flexible and cleaner in its approach. Essentially, what I decided to do was use a temp table that contained the generated XML node, the original HIERARCHYID value, a row number generated with ROW_NUMBER() ordered by the hierarchy order, and a parent row number, which would initially be set to 0 and then updated using a self join on the temp table.
Then, since each XML node’s position in document order will match the row number generated from its HIERARCHYID position, we can simply insert the new node into the parent node based on its position.
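To make the positional lookup concrete, here is a small sketch of one such insert in isolation. The document and values are made up for illustration. The expression (//*) lists every element in document order, and the predicate picks the one at the parent's row number:

```sql
-- Hypothetical values for illustration only
-- Document order of the elements: a = 1, b = 2, c = 3, c = 4, b = 5
DECLARE @XR XML = '<a><b><c>abc</c><c>def</c></b><b /></a>';
DECLARE @xcol XML = '<c>ghi</c>';   -- node to insert
DECLARE @parentrownum INT = 5;      -- position of the second <b>

-- Insert @xcol into the element found at position @parentrownum
SET @XR.modify('insert sql:variable("@xcol") into (//*)[sql:variable("@parentrownum")][1]');

SELECT @XR;  -- the new <c> should be a child of the second <b>
```

This only works because the rows are processed in HIERARCHYID order, so every node already in the document keeps the position that matches its row number.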
-- Sample data to test with
CREATE TABLE #HTable (NodeName sysname, Attributes xml, NodeText VARCHAR(MAX), HierarchyNode HIERARCHYID)
INSERT INTO #HTable (NodeName, Attributes, NodeText, HierarchyNode)
VALUES
('a', '<a attr="1" />', NULL, 0x),
('b', NULL, NULL, 0x58),
('c', '<a xyz="3" />', 'abc', 0x5AC0),
('c', NULL, 'def', 0x5B40),
('b', '<a id="111" pid="1234" />', NULL, 0x68),
('c', NULL, 'abc', 0x6AC0),
('c', NULL, 'def', 0x6B40)
CREATE TABLE #T (XmlToInsert XML, HierarchyNode HIERARCHYID, RowNum INT, ParentRowNum INT)
-- INSERT the generated XML node, the original HIERARCHYID, a unique row number, and a parent row number (set to 0)
INSERT INTO #T (XmlToInsert, HierarchyNode, RowNum, ParentRowNum)
SELECT
    CAST(
        '<' + NodeName + ' '
        + CASE WHEN Attributes IS NOT NULL
               THEN SUBSTRING(CAST(Attributes AS VARCHAR(MAX)), 3, LEN(CAST(Attributes AS VARCHAR(MAX))) - 4)
               ELSE '' END
        + '>' + ISNULL(NodeText, '') + '</' + NodeName + '>'
    AS XML) AS XmlToInsert
    , HierarchyNode
    , ROW_NUMBER() OVER (ORDER BY HierarchyNode) AS RowNum
    , 0 AS ParentRowNum
FROM #HTable
ORDER BY HierarchyNode
-- UPDATE the parent row number using the HIERARCHYID method GetAncestor in the self join
UPDATE T1
SET T1.ParentRowNum = T2.RowNum
FROM #T AS T1
INNER JOIN #T AS T2 ON T2.HierarchyNode = T1.HierarchyNode.GetAncestor(1)
DECLARE @xcol XML, @parentrownum INT, @flag BIT = 0, @XR XML = ''
-- We actually only need the generated XML and the parent row number to do the rest of this work
DECLARE crH CURSOR READ_ONLY FOR SELECT XmlToInsert, ParentRowNum FROM #T ORDER BY RowNum
OPEN crH
FETCH NEXT FROM crH INTO @xcol, @parentrownum
WHILE (@@FETCH_STATUS = 0)
BEGIN
    -- First time through, we add a root node
    IF @flag = 0
        SET @XR.modify('insert sql:variable("@xcol") into (/)[1]')
    ELSE -- Subsequent passes we find the parent node by position
        SET @XR.modify('insert sql:variable("@xcol") into (//*)[sql:variable("@parentrownum")][1]')
    SET @flag = 1
    FETCH NEXT FROM crH INTO @xcol, @parentrownum
END
CLOSE crH
DEALLOCATE crH
DROP TABLE #T
DROP TABLE #HTable
-- Variable name matches its declaration so the script also runs under a case-sensitive collation
SELECT @XR
Please let me know if you have any ideas that might optimize this, and if you have an implementation (T-SQL or .NET), please share.
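For readers less familiar with HIERARCHYID, the self join above relies on GetAncestor(1) returning a node's immediate parent. A minimal sketch with made-up paths (the variable names are only for illustration):

```sql
-- Hypothetical paths for illustration only
DECLARE @child HIERARCHYID = CAST('/1/2/' AS HIERARCHYID);
DECLARE @root HIERARCHYID = hierarchyid::GetRoot();

SELECT
    @child.ToString() AS ChildPath,                  -- the node itself
    @child.GetAncestor(1).ToString() AS ParentPath,  -- one level up
    @child.GetLevel() AS Depth;                      -- number of levels below the root

-- The root has no parent, so GetAncestor(1) returns NULL for it
SELECT @root.GetAncestor(1) AS ParentOfRoot;
```

Because the root's ancestor is NULL, the root row finds no match in the INNER JOIN and keeps its initial ParentRowNum of 0.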
In a previous blog post, I discussed a method of shredding XML to a table with HIERARCHYID, and realized that it had a dependency that I was not too keen about: the XML data required an “id” attribute in order to create the hierarchy. I had sorted out a way to inject a unique attribute ID into all the nodes (I’ll discuss this in a follow-up post), but having to modify the original XML didn’t have much appeal. Upon reading another post by my fellow blogger, Adam Machanic, I realized it could be done without this requirement. Using the technique that Adam presented, I can generate unique paths to be parsed into a HIERARCHYID column.
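As a small sketch of what “paths to be parsed into a HIERARCHYID” means, the following casts a few hand-written path strings and sorts by the result. The table variable and its values are made up for illustration:

```sql
-- Hypothetical path strings, deliberately inserted out of order
DECLARE @Paths TABLE (HierarchyPath VARCHAR(1000));
INSERT INTO @Paths (HierarchyPath)
VALUES ('/2/'), ('/1/2/'), ('/'), ('/1/'), ('/1/1/');

-- Casting the string yields a HIERARCHYID; ordering by it gives depth-first order
SELECT
    HierarchyPath,
    CAST(HierarchyPath AS HIERARCHYID) AS HierarchyNode
FROM @Paths
ORDER BY CAST(HierarchyPath AS HIERARCHYID);
```

The rows should come back parent first, then each parent's children, which is the same order the XML elements appear in the document.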
DECLARE @x XML = '<a someAttribute="1"><b><c>abc</c><c anotherAttribute="2">def</c></b><b><c>abc</c><c>def</c></b></a>'
DECLARE @T TABLE (NodeName VARCHAR(255), Attributes XML, NodeText VARCHAR(MAX), HierarchyNode HIERARCHYID)
;WITH N (Node, NodeName, Attributes, NodeText, HierarchyPath)
AS
( SELECT
CAST(Expr.query('.') AS XML) -- Node
, CAST(Expr.value('local-name(.)', 'varchar(255)') AS VARCHAR(255)) -- NodeName
, CASE WHEN Expr.value('count(./@*)', 'INT') > 0
THEN Expr.query('<a>{for $a in ./@* return $a}</a>')
ELSE NULL END -- Attributes
, CAST(Expr.value('./text()[1]', 'varchar(max)') AS VARCHAR(MAX)) -- NodeText
, CAST('/' AS VARCHAR(1000)) -- HierarchyPath
FROM @x.nodes('/*[1]') AS Res(Expr)
UNION ALL
SELECT
Expr.query('.') -- Node
, CAST(Expr.value('local-name(.)', 'varchar(255)') AS VARCHAR(255)) -- NodeName
, CASE WHEN Expr.value('count(./@*)', 'INT') > 0
THEN Expr.query('<a>{for $a in ./@* return $a}</a>')
ELSE NULL END -- Attributes
, CAST(Expr.value('./text()[1]', 'varchar(max)') AS VARCHAR(MAX)) -- NodeText
, CAST(N.HierarchyPath
+ CAST(DENSE_RANK() OVER (ORDER BY Expr) AS VARCHAR(1000))
+ '/' AS VARCHAR(1000)) -- HierarchyPath
FROM N CROSS APPLY Node.nodes('*/*') AS Res(Expr)
)
INSERT INTO @T (NodeName, Attributes, NodeText, HierarchyNode)
SELECT NodeName, Attributes, NodeText, CAST(HierarchyPath AS HIERARCHYID)
FROM N
ORDER BY CAST(HierarchyPath AS HIERARCHYID)
SELECT * FROM @T
For this example, I simply grab the node name, the node text, and the attributes (when they exist) as a simple XML value of the format:
<a [attribute1="attribute value" [attribute2="attribute value"]…] />
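To see that attribute wrapper on its own, here is a minimal sketch that applies the same XQuery expression to a single made-up element (the element and its attributes are assumptions for illustration):

```sql
-- Hypothetical element for illustration only
DECLARE @x XML = '<c anotherAttribute="2" id="7">def</c>';

SELECT
    Expr.value('local-name(.)', 'varchar(255)') AS NodeName,
    -- Copy every attribute of the current element onto a new <a> element
    Expr.query('<a>{for $a in ./@* return $a}</a>') AS Attributes
FROM @x.nodes('/*[1]') AS Res(Expr);
```

The Attributes column should hold one empty <a> element carrying both attributes.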
Of course, these values could also be shredded into the hierarchy. One way of doing this would be to add an additional column to the results that represents the type of entry in the hierarchy (node versus attribute). My challenge to you is to create that solution.
Have fun!