{"id":1808,"date":"2012-03-07T08:52:40","date_gmt":"2012-03-06T21:52:40","guid":{"rendered":"http:\/\/blog.mozilla.org\/nnethercote\/?p=1808"},"modified":"2012-03-07T08:52:40","modified_gmt":"2012-03-06T21:52:40","slug":"n-ary-trees-in-c","status":"publish","type":"post","link":"https:\/\/blog.mozilla.org\/nnethercote\/2012\/03\/07\/n-ary-trees-in-c\/","title":{"rendered":"N-ary trees in C"},"content":{"rendered":"<p>If you want to implement an N-ary tree of integers in C, the obvious way to represent each node is something like the following.<\/p>\n<pre>struct t {\r\n    int n;\r\n    int numKids;\r\n    struct t **kids;\r\n}<\/pre>\n<p>Here <code>kids<\/code> is an array of kid pointers.\u00a0 Here&#8217;s an example tree.<\/p>\n<p><a href=\"http:\/\/blog.mozilla.org\/nnethercote\/files\/2012\/03\/IMG_1552.jpg\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-1809\" title=\"IMG_1552\" src=\"http:\/\/blog.mozilla.org\/nnethercote\/files\/2012\/03\/IMG_1552.jpg\" alt=\"N-ary tree with kids array\" width=\"640\" height=\"480\" srcset=\"https:\/\/blog.mozilla.org\/nnethercote\/files\/2012\/03\/IMG_1552.jpg 640w, https:\/\/blog.mozilla.org\/nnethercote\/files\/2012\/03\/IMG_1552-300x225.jpg 300w\" sizes=\"(max-width: 640px) 100vw, 640px\" \/><\/a><\/p>\n<p>This representation is a bit annoying because the <code>kids<\/code> array must be allocated separately from the node, and you need to reallocate the <code>kids<\/code> array every time you add a new kid to a node. Alternatively, you could instead over-allocate in anticipation of more nodes being added, but then you have to record the array capacity within the node.\u00a0 (The latter representation is exactly what I used for the tree of stack traces in <a href=\"http:\/\/valgrind.org\/docs\/manual\/ms-manual.html\">Massif<\/a>.)<\/p>\n<p>While reading through the code for <a href=\"https:\/\/wiki.mozilla.org\/Performance:Leak_Tools#Trace-malloc\">trace-malloc<\/a> yesterday, I learnt a better representation.<\/p>\n<pre>struct t {\r\n    int n;\r\n    struct t *kids;\r\n    struct t *siblings;\r\n}<\/pre>\n<p>This looks odd at first.\u00a0 Here&#8217;s the example tree with this new representation.<\/p>\n<p><a href=\"http:\/\/blog.mozilla.org\/nnethercote\/files\/2012\/03\/IMG_1553.jpg\"><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-1810\" title=\"IMG_1553\" src=\"http:\/\/blog.mozilla.org\/nnethercote\/files\/2012\/03\/IMG_1553.jpg\" alt=\"N-ary tree with sibling pointers\" width=\"640\" height=\"480\" srcset=\"https:\/\/blog.mozilla.org\/nnethercote\/files\/2012\/03\/IMG_1553.jpg 640w, https:\/\/blog.mozilla.org\/nnethercote\/files\/2012\/03\/IMG_1553-300x225.jpg 300w\" sizes=\"(max-width: 640px) 100vw, 640px\" \/><\/a><\/p>\n<p>From any single node you can only directly access its first kid, through the <code>kids<\/code> pointer. You access the second kid through <code>kids-&gt;siblings<\/code>, the third kid through <code>kids-&gt;siblings-&gt;siblings<\/code>, and so on.\u00a0 In other words, instead of storing a node&#8217;s kids in an array, you store them in a list; it just so happens that this list can be embedded elegantly in the nodes themselves. (There&#8217;s an argument to be made that the fields should be called <code>firstKid<\/code> and <code>nextSibling<\/code>, though I can see the sense in both naming schemes.)<\/p>\n<p>There are several nice things about this representation.<\/p>\n<ul>\n<li>Each node is a fixed size;\u00a0 no auxiliary arrays are needed.<\/li>\n<li>Many algorithms can be expressed more easily because it&#8217;s just a binary tree.<\/li>\n<li>You can treat each sibling list as a <a href=\"http:\/\/c2.com\/cgi\/wiki?MoveToFrontLists\">move-to-front list<\/a>.\u00a0 In the example above, if you accessed node 11 you would move nodes 4 and 11 to the front of their sibling lists. This may make subsequent accesses faster if you have locality in your access patterns. (You could also do this with the first representation, but you&#8217;d have to shuffle some of the pointers in the <code>kids<\/code> array along by one.)<\/li>\n<\/ul>\n<p>It appears this is a moderately well-known representation &#8212; <a href=\"http:\/\/stackoverflow.com\/questions\/189855\/n-ary-trees-in-c\">this question<\/a> on Stack Overflow has two answers, one for each of the representations I described.\u00a0 But I didn&#8217;t know about it, so I thought I&#8217;d share it.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>If you want to implement an N-ary tree of integers in C, the obvious way to represent each node is something like the following. struct t { int n; int numKids; struct t **kids; } Here kids is an array of kid pointers.\u00a0 Here&#8217;s an example tree. This representation is a bit annoying because the [&hellip;]<\/p>\n","protected":false},"author":139,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[15268],"tags":[],"_links":{"self":[{"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/posts\/1808"}],"collection":[{"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/users\/139"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/comments?post=1808"}],"version-history":[{"count":0,"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/posts\/1808\/revisions"}],"wp:attachment":[{"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/media?parent=1808"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/categories?post=1808"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.mozilla.org\/nnethercote\/wp-json\/wp\/v2\/tags?post=1808"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}