{"id":392,"date":"2015-01-27T15:39:50","date_gmt":"2015-01-27T20:39:50","guid":{"rendered":"http:\/\/blog.mozilla.org\/nfroyd\/?p=392"},"modified":"2015-01-27T15:39:50","modified_gmt":"2015-01-27T20:39:50","slug":"examples-of-poor-api-design-1n-pldhash-functions","status":"publish","type":"post","link":"https:\/\/blog.mozilla.org\/nfroyd\/2015\/01\/27\/examples-of-poor-api-design-1n-pldhash-functions\/","title":{"rendered":"examples of poor API design, 1\/N &#8211; pldhash functions"},"content":{"rendered":"<p>The other day in the <code>#content<\/code> IRC channel:<\/p>\n<pre>&lt;bz&gt; I have learned so many things about how to not define APIs in my work with Mozilla code ;)\r\n&lt;bz&gt; (probably lots more to learn, though)<\/pre>\n<p>I, too, am still learning a lot about what makes a good API. Like a lot of other things, it&#8217;s easier to point out poor API design than to describe examples of good API design, and that&#8217;s what this blog post is about. In particular, the venerable XPCOM datastructure <code>PLDHashTable<\/code> has been undergoing a number of changes lately, all aimed at bringing it up to date. (The question of why we have our own implementations of things that exist in the C++ standard library is for a separate blog post.)<\/p>\n<p>The whole effort started with <a href=\"https:\/\/bugzilla.mozilla.org\/show_bug.cgi?id=1118024\">noticing that <code>PL_DHashTableOperate<\/code> is not a well-structured API<\/a>. It&#8217;s necessary to quote some more of the API surface to fully understand what&#8217;s going on here:<\/p>\n<pre>typedef enum PLDHashOperator {\r\n    PL_DHASH_LOOKUP = 0,        \/* lookup entry *\/\r\n    PL_DHASH_ADD = 1,           \/* add entry *\/\r\n    PL_DHASH_REMOVE = 2,        \/* remove entry, or enumerator says remove *\/\r\n    PL_DHASH_NEXT = 0,          \/* enumerator says continue *\/\r\n    PL_DHASH_STOP = 1           \/* enumerator says stop *\/\r\n} PLDHashOperator;\r\n\r\ntypedef PLDHashOperator\r\n(* PLDHashEnumerator)(PLDHashTable *table, PLDHashEntryHdr *hdr, uint32_t number,\r\n                      void *arg);\r\n\r\nuint32_t\r\nPL_DHashTableEnumerate(PLDHashTable *table, PLDHashEnumerator etor, void *arg);\r\n\r\nPLDHashEntryHdr*\r\nPL_DHashTableOperate(PLDHashTable* table, const void* key, PLDHashOperator op);<\/pre>\n<p>(<code>PL_DHashTableOperate<\/code> no longer exists in the tree due to other cleanup bugs; the above is approximately what it looked like at the end of 2014.)<\/p>\n<p>There are several problems with the above slice of the API:<\/p>\n<ul>\n<li><code>PL_DHashTableOperate(table, key, PL_DHASH_ADD)<\/code> is a long way to spell what should have been named <code>PL_DHashTableAdd(table, key)<\/code><\/li>\n<li>There&#8217;s another problem with the above: it&#8217;s making a runtime decision (based on the value of <code>op<\/code>) about what should have been a compile-time decision: this particular call will always and forever be an add operation. We shouldn&#8217;t have the (admittedly small) runtime overhead of dispatching on <code>op<\/code>. It&#8217;s worth noting that compiling with LTO and a quality inliner will remove that runtime overhead, but we might as well structure the code so non-LTO compiles benefit and the code at callsites reads better.<\/li>\n<li>Given the above definitions, you can say <code>PL_DHashTableOperate(table, key, PL_DHASH_STOP)<\/code> and nothing will complain. The <code>PL_DHASH_NEXT<\/code> and <code>PL_DHASH_STOP<\/code> values are really only for a function of type <code>PLDHashEnumerator<\/code> to return, but nothing about the above definition enforces that in any way. Similarly, you can return <code>PL_DHASH_LOOKUP<\/code> from a <code>PLDHashEnumerator<\/code> function, which is nonsensical.<\/li>\n<li>The requirement to always return a <code>PLDHashEntryHdr*<\/code> from <code>PL_DHashTableOperate<\/code> means doing a <code>PL_DHASH_REMOVE<\/code> has to return <em>something<\/em>; it happens to return <code>nullptr<\/code> always, but it really should return <code>void<\/code>. In a similar fashion, <code>PL_DHASH_LOOKUP<\/code> always returns a non-<code>nullptr<\/code> pointer (!); one has to check <code>PL_DHASH_ENTRY_IS_{FREE,BUSY}<\/code> on the returned value. The typical style for an API like this would be to return <code>nullptr<\/code> if an entry for the given key didn&#8217;t exist, and a non-<code>nullptr<\/code> pointer if such an entry did.  The free-ness or busy-ness of a given entry should be a property entirely internal to the hashtable implementation (it&#8217;s possible that some scenarios could be slightly more efficient with direct access to the busy-ness of an entry).<\/li>\n<\/ul>\n<p>We might infer corresponding properties of a good API from each of the above issues:<\/p>\n<ul>\n<li>Entry points for the API produce readable code.\n<li>The API doesn&#8217;t enforce unnecessary overhead.\n<li>The API makes it impossible to talk about nonsensical things.\n<li>It is always reasonably clear what return values from API functions describe.\n<\/ul>\n<p>Fixing the first two bulleted issues, above, was the subject of <a href=\"https:\/\/bugzilla.mozilla.org\/show_bug.cgi?id=1118024\">bug 1118024<\/a>, done by Michael Pruett. Once that was done, we really didn&#8217;t need <code>PL_DHashTableOperate<\/code>, and removing <code>PL_DHashTableOperate<\/code> and related code was done in <a href=\"https:\/\/bugzilla.mozilla.org\/show_bug.cgi?id=1121202\">bug 1121202<\/a> and <a href=\"https:\/\/bugzilla.mozilla.org\/show_bug.cgi?id=1124920\">bug 1124920<\/a> by Michael Pruett and Nicholas Nethercote, respectively. Fixing the unusual return convention of <code>PL_DHashTableLookup<\/code> is being done in <a href=\"https:\/\/bugzilla.mozilla.org\/show_bug.cgi?id=1124973\">bug 1124973<\/a> by Nicholas Nethercote. Maybe once all this gets done, we can move away from C-style <code>PL_DHashTable*<\/code> functions to C++ methods on <code>PLDHashTable<\/code> itself!<\/p>\n<p>Next time we&#8217;ll talk about the actual contents of a <code>PL_DHashTable<\/code> and how improvements have been made there, too.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The other day in the #content IRC channel: &lt;bz&gt; I have learned so many things about how to not define APIs in my work with Mozilla code \ud83d\ude09 &lt;bz&gt; (probably lots more to learn, though) I, too, am still learning a lot about what makes a good API. Like a lot of other things, it&#8217;s [&hellip;]<\/p>\n","protected":false},"author":320,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[72428,72429,540,5,72421],"_links":{"self":[{"href":"https:\/\/blog.mozilla.org\/nfroyd\/wp-json\/wp\/v2\/posts\/392"}],"collection":[{"href":"https:\/\/blog.mozilla.org\/nfroyd\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.mozilla.org\/nfroyd\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.mozilla.org\/nfroyd\/wp-json\/wp\/v2\/users\/320"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.mozilla.org\/nfroyd\/wp-json\/wp\/v2\/comments?post=392"}],"version-history":[{"count":0,"href":"https:\/\/blog.mozilla.org\/nfroyd\/wp-json\/wp\/v2\/posts\/392\/revisions"}],"wp:attachment":[{"href":"https:\/\/blog.mozilla.org\/nfroyd\/wp-json\/wp\/v2\/media?parent=392"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.mozilla.org\/nfroyd\/wp-json\/wp\/v2\/categories?post=392"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.mozilla.org\/nfroyd\/wp-json\/wp\/v2\/tags?post=392"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}