<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Fink @ Mozilla &#187; planet</title>
	<atom:link href="http://blog.mozilla.org/sfink/tag/planet/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.mozilla.org/sfink</link>
	<description>One more Blog.mozilla.com weblog than you need</description>
	<lastBuildDate>Wed, 18 Apr 2012 17:53:24 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
		<item>
		<title>What&#8217;s your random seed?</title>
		<link>http://blog.mozilla.org/sfink/2012/04/18/whats-your-random-seed/</link>
		<comments>http://blog.mozilla.org/sfink/2012/04/18/whats-your-random-seed/#comments</comments>
		<pubDate>Wed, 18 Apr 2012 17:44:30 +0000</pubDate>
		<dc:creator>sfink</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[planet]]></category>

		<guid isPermaLink="false">http://blog.mozilla.org/sfink/?p=350</guid>
		<description><![CDATA[Greg Egan is awesome I&#8217;m going back and re-reading Luminous, one of his collections of short stories. I just read the story Transition Dreams, which kinda creeped me out. Partly because I buy into the whole notion that our brains are digitizable &#8212; as in, there&#8217;s nothing fundamentally unrepresentable about our minds. There&#8217;s probably a [...]]]></description>
			<content:encoded><![CDATA[<h1>Greg Egan is awesome</h1>
<p>I&#8217;m going back and re-reading <a title="Amazon link to Luminous" href="http://www.amazon.com/Luminous-Greg-Egan/dp/1857985737/ref=sr_1_1?ie=UTF8&amp;qid=1334767192&amp;sr=8-1">Luminous</a>, one of his collections of short stories. I just read the story <em>Transition Dreams</em>, which kinda creeped me out. Partly because I buy into the whole notion that our brains are digitizable &#8212; as in, there&#8217;s nothing fundamentally unrepresentable about our minds. There&#8217;s probably a fancy philosophy term for this, with some dead white guy&#8217;s name attached to it (because only a dozen people had thought of it before him and he talked the loudest).</p>
<p>Once you&#8217;re willing to accept accurate-enough digitization, the ramifications get pretty crazy. And spooky. I can come up with some, but Egan takes it way farther, and <em>Transition Dreams</em> is a good illustration. But I won&#8217;t spoil the story. (By the way, most of Egan&#8217;s books are out of print or rare enough to be expensive, but Terrence tells me that they&#8217;re all easily available on Kindle. Oddly, although I would be happy to transition my mental workings from meat to bits, I&#8217;m still dragging my heels on transitioning my reading from dead trees to bits.)</p>
<h2>Transition and Free Will</h2>
<p>Now, let&#8217;s assume that you&#8217;ve converted your brain to live inside a computer (or network of computers, or encoded into the flickers of light on a precisely muddy puddle of water, it really doesn&#8217;t matter.) So your thinking is being simulated by all these crazy cascades of computation (only it&#8217;s not simulated; it&#8217;s the real thing, but that&#8217;s irrelevant here.) Your mind is getting a stream of external sensor input, it&#8217;s chewing on that and modifying its state, and you&#8217;re just&#8230; well, being you.</p>
<p>Now, where is free will in this picture? Assuming free will exists in the first place, I mean, and that its existing and not existing are distinguishable. If you start in a particular, fully-described state, and you receive the exact same inputs, will you always behave in exactly the same way? You could build the mind-hosting computer either way, you know, and the hosted minds wouldn&#8217;t normally be able to tell the difference. But they <em>could</em> tell the difference if they recorded all of their sensory inputs (which is fairly plausible, actually), because they could make a clone of themselves back at the previous state and replay all their sensory input and see if they made the same decisions. (Actually, it&#8217;s easier than that; if the reproduction was accurate, they should end up bit-for-bit identical.)</p>
<p>I don&#8217;t know about you, but I&#8217;d rather not be fully predictable. I don&#8217;t want somebody to copy me and my sensor logs, and then when I&#8217;m off hanging out in the Gigahertz Ghetto (read: my brain is being hosted on a slow computer), they could try out various different inputs on faster computers to see how &#8220;I&#8221; reacted and know for 100% certainty how to achieve some particular reaction.</p>
<p>Well, ok, my time in the GHzGhetto might change me enough to make the predictions wrong, so you&#8217;d really have to do this while I was fully suspended. Maybe the shipping company that suspends my brain while they shoot me off to a faster hosting facility in a tight orbit around the Sun (those faster computers need the additional solar energy, y&#8217;know) is also selling copies on the side to advertisers who want to figure out exactly what ads they can expose me to upon reawakening to achieve a 100% clickthrough rate. Truly, truly targeted advertising.</p>
<p>So, anyway, I&#8217;m going to insist on always having access to a strong source of random numbers, and I&#8217;ll call that my free will. You can record the output of that random number generator, but that&#8217;ll only enable you to accurately reproduce my past, not my future.</p>
<h2>The Pain and Joy of Determinism</h2>
<p>Or will I? What if that hosting facility gets knocked out by a solar flare? Do I really want to start over from a backup? If it streams out the log of sensor data to a safer location, then it&#8217;d be pretty cool to be able to replay as much of the log as still exists, and recover almost all of myself. I&#8217;d rather mourn a lost day than a lost decade. But that requires <em>not</em> using an unpredictable random number generator as an input.</p>
<p>So what about a pseudo-random number generator? If it&#8217;s a high quality one, then as long as nobody else can access the seed, it&#8217;s just as good. But that gives the seed incredible importance. It&#8217;s not &#8220;you&#8221;, it&#8217;s just a simple number, but in a way it allows substantial control over you, so it&#8217;s private in a more fundamental way than anything we&#8217;ve seen before. Who would you trust it to? Not yourself, certainly, since you&#8217;ll be copied from computer to computer all the time and each transfer is an opportunity for <strong>identity theft</strong>. What about your spouse? Or maybe just a secure service that will only release it for authorized replays of your brain?</p>
<p>Without that seed (or those timestamped seeds?), you can never go back. Well, you can go back to your snapshots, but you can&#8217;t accurately go forward from there to arbitrary points in time. Admittedly, that&#8217;s not necessary for some uses &#8212; if you want to know why you did something, you can go back to a snapshot and replay with a different seed. If you do something different, it was a choice made of your own free will. You could use it in court cases, even. If you get the same result, well, it&#8217;s trickier, because you might make the same choice for 90% of the possible random seeds or something. &#8220;Proof beyond a reasonable confidence interval?&#8221; Heh.</p>
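<p>As a toy illustration of why the seed matters, here is a sketch in Python. Everything in it is made up for the example: the <code>replay</code> function, the sensor log, and the seed values are hypothetical, and <code>random.Random</code> merely stands in for whatever generator a real mind host would use. Replaying the same inputs with the same seed reproduces every decision; other seeds are free to diverge.</p>

```python
import random


def replay(seed, sensor_log):
    """Replay a toy mind: each decision mixes its state with a seeded random draw."""
    rng = random.Random(seed)  # same seed means the same sequence of draws
    state = 0
    decisions = []
    for reading in sensor_log:
        state = (state * 31 + reading) % 1000003
        decisions.append((state + rng.randrange(100)) % 2 == 0)
    return decisions


sensor_log = [3, 1, 4, 1, 5, 9, 2, 6]  # hypothetical recorded inputs
original = replay(12345, sensor_log)
restored = replay(12345, sensor_log)
print(original == restored)  # True: same seed, same inputs, same decisions

# Fraction of 1000 other seeds that end with the same final decision.
same = sum(replay(s, sensor_log)[-1] == original[-1] for s in range(1000))
print(same / 1000)
```

<p>The last number is the toy version of the &#8220;same choice for 90% of the possible random seeds&#8221; question: it says how often a fresh seed would have led to the same final decision anyway.</p>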
]]></content:encoded>
			<wfw:commentRss>http://blog.mozilla.org/sfink/2012/04/18/whats-your-random-seed/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>bzexport changes released</title>
		<link>http://blog.mozilla.org/sfink/2012/04/13/bzexport-changes-released/</link>
		<comments>http://blog.mozilla.org/sfink/2012/04/13/bzexport-changes-released/#comments</comments>
		<pubDate>Fri, 13 Apr 2012 21:02:32 +0000</pubDate>
		<dc:creator>sfink</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[hg]]></category>
		<category><![CDATA[mozilla]]></category>
		<category><![CDATA[planet]]></category>

		<guid isPermaLink="false">http://blog.mozilla.org/sfink/?p=343</guid>
		<description><![CDATA[bzexport &#8211;new and hg newbug have landed My bzexport changes adding a --new flag and an hg newbug command have landed. Ok, they landed months ago. See my previous blog post for details; all of the commands and options described there are still valid in the current version. But please pull from the official repo [...]]]></description>
			<content:encoded><![CDATA[<h1>bzexport &#8211;new and hg newbug have landed</h1>
<p>My bzexport changes adding a <tt>--new</tt> flag and an <tt>hg newbug</tt> command have landed. Ok, they landed months ago. See <a href="/sfink/2012/01/21/bzexport-new-crash-test-dummies-wanted/">my previous blog post</a> for details; all of the commands and options described there are still valid in the current version. But please pull from the <a href="http://hg.mozilla.org/users/tmielczarek_mozilla.com/bzexport">official repo</a> instead of my testing repo given in the earlier blog post.</p>
<h2>Installing bzexport</h2>
<p><code>mkdir -p ~/hg-extensions<br />
cd ~/hg-extensions<br />
hg clone http://hg.mozilla.org/users/tmielczarek_mozilla.com/bzexport</code></p>
<p>Then, in the <code>[extensions]</code> section of your <code>~/.hgrc</code>, add:</p>
<p><code>bzexport = ~/hg-extensions/bzexport/bzexport.py</code></p>
<p>Note to Windows users: unfortunately, I think the python packaged with MozillaBuild is missing the json.py package that bzexport needs. I think it still works if you use a system Python with json.py installed, but I&#8217;m not sure.</p>
<h2>Trying it out</h2>
<p>For the (understandably) nervous users out there, I&#8217;d like you to give it a try and I&#8217;ve made it safe to do so. Here are the levels of paranoia available:<span id="more-343"></span></p>
<h3>Ultra-paranoid: landfill testing server</h3>
<p>First, if you&#8217;re truly paranoid, you can add these lines to your ~/.hgrc file to make bugs go to the bugzilla testing instance:</p>
<pre>[bzexport]
# Testing
api_server = https://api-dev.bugzilla.mozilla.org/test/1.0/
bugzilla = https://landfill.bugzilla.org/bzapi_sandbox/</pre>
<p>Note that to avoid annoyance, you&#8217;ll want to go to that bugzilla URL in your default browser profile first and log in, so that bzexport can borrow your authentication cookies instead of prompting for the username and password that you&#8217;ve already forgotten. (And most of you will need to register over there first anyway.)</p>
<p>Honestly, though, you don&#8217;t need to worry this much. A number of people are already using bzexport successfully.</p>
<h3>Moderately paranoid: -i (--interactive) flag</h3>
<p>bzexport will do <em>nothing</em> to bugzilla without asking you when you use this flag. It&#8217;ll ask before creating a bug, it&#8217;ll ask before creating an attachment, and it&#8217;ll ask before obsoleting any attachments with matching filenames. I confess that it doesn&#8217;t fully describe the action it&#8217;s going to take, but at least it asks.</p>
<p>Note that if you say &#8216;y&#8217; when it asks whether to create the bug, but &#8216;n&#8217; when it asks whether to add the attachment, you will have a new bug with no attachment. I probably ought to hold off doing anything until it&#8217;s asked you everything, but right now it&#8217;s confirm-as-you-go.</p>
<h3>Healthily paranoid: -e (--edit) flag</h3>
<p>I almost always run with this flag. It pops up an editor with a form describing nearly everything relevant to what it&#8217;s going to do. You can tweak the bug title, the bug and attachment comments, the reviewers, and various other stuff. You can even fill in partial answers (eg you can just fill in the component but not the product, or a substring of a reviewer&#8217;s name) and it&#8217;ll give a series of text prompts after you exit the editor, to confirm anything that is still ambiguous. (If your reviewer matches multiple possibilities, it&#8217;ll give you a menu of matches. Similar for product and component.)</p>
<p>Note that you can use both -i and -e together, though that gets old fast.</p>
<h2>Usage examples</h2>
<p>You wrote a patch for a bug that has not yet been filed. It is sitting at the top of your mq queue.</p>
<ol>
<li><code>hg qref -e</code> to set the patch comment. (Not strictly necessary, but it&#8217;s better this way.)</li>
<li><code>hg bzexport --new -e -C <em>&lt;component&gt;</em></code></li>
</ol>
<p>You wrote a patch for a bug that has already been filed. It is sitting at the top of your mq queue.</p>
<ol>
<li><code>hg qref -e</code> to set the patch comment. Include &#8220;Bug <em>nnnn</em>&#8221; somewhere in the first line.</li>
<li><code>hg bzexport -e</code></li>
</ol>
<p>You wrote a patch for a bug that has not yet been filed, you&#8217;ve already given it a reasonable comment, and it&#8217;s not at the top of your mq queue. It&#8217;s partway down, named my-cool-patch:</p>
<ol>
<li><code>hg bzexport --new -e -C <em>&lt;component&gt;</em> my-cool-patch</code></li>
</ol>
<p>You haven&#8217;t written a patch yet, but you want to create a new bug:</p>
<ol>
<li><code>hg newbug -e -C <em>&lt;component&gt;</em></code></li>
</ol>
<p>You find it tiresome to use <code>hg bzexport --new</code> to create a new bug, then go back and use <code>hg qref -e</code> to add in the bug number. Couldn&#8217;t bzexport do that part too, since it&#8217;s creating the d&amp;#n bug in the first place?</p>
<ol>
<li>add <code>update-patch=1</code> to the <code>[bzexport]</code> section of your <code>~/.hgrc</code> file</li>
</ol>
<p>bzexport sucks. You want to file a bug on it.</p>
<ol>
<li><code>hg newbug -e -C bzexport</code> # This will go in product &#8220;Other Applications&#8221;, component &#8220;bzexport&#8221;</li>
</ol>
<p>Forget about patches, you just want a way to programmatically create bugs.</p>
<ol>
<li><code>hg newbug -C product/component --title "Bug title" --comment "Feature X is useless and you all suck"</code></li>
</ol>
<p>You want to set severity, platform, CCs, version, or other fields on the bugs you create.</p>
<ol>
<li>Patches welcome</li>
</ol>
<h2>Advanced Usage</h2>
<ul>
<li>I like to name all the patches in my mq queue after the bug that they apply to. In the unlikely event that you&#8217;d like to use the same naming convention, you can add rename-patch=1 to the [bzexport] section of your <code>~/.hgrc</code>, and it&#8217;ll rename the patch for you when you create a new bug for it. (See known issues, though.)</li>
<li>Every time you request review, it&#8217;ll cache the mapping from whatever substring you used (eg &#8220;:bz&#8221;) to the full reviewer email address in <code>~/.bzexport</code>. You can edit that file to add aliases that you&#8217;d like to use. (Useful when a substring like &#8220;bz&#8221; matches a bunch of people, but you know there&#8217;s only One True bz.)</li>
<li>Sometimes bzexport picks the wrong profile to use to find your authentication keys. There&#8217;s a <code>--ffprofile</code> option that you can use to tell it the right profile name to use. (I&#8217;ve had problems with Firefox changing which profile is my default profile, and looking for keys in the wrong place.)</li>
<li>hg newbug has a <code>--take-bug</code> option that will assign the bug to you. <code>hg bzexport --new</code> will do this by default, but there&#8217;s <code>--no-take-bug</code> if you just want to attach a patch without assigning the bug to yourself.</li>
</ul>
<h2>Known Issues</h2>
<ul>
<li>The rename-patch and update-patch configuration options only work when creating a bug with bzexport. If you are attaching a patch to an existing bug, the patch will not be renamed nor will it get a leading &#8220;Bug n &#8211; &#8221; in its description.</li>
<li>bzexport is slow. I blame it on BzAPI. (Or more properly, the API stuff that bzapi sits on top of.)</li>
<li>I&#8217;m not watching the bzexport component in bugzilla. Oops. Just realized that. I&#8217;ll go do it now.</li>
<li>~/.bzexport caches reviewers even when you use full email addresses, so the file is a mess and editing it is a pain. Oopsie. Fortunately, I doubt anyone else ever will.</li>
<li>Choosing which patch to obsolete works entirely by filename, which isn&#8217;t always right. For example, if you rename a patch in your queue, that will update the filename and when you use bzexport to update the patch, it won&#8217;t find the older version to obsolete. Also, you can&#8217;t obsolete multiple patches. bzexport should really have an option to prompt which attachments you&#8217;d like to obsolete.</li>
<li>bugzilla requires the product version to be set when creating a bug. bzexport has no way of detecting the default version of a product. The <code>enter_bug.cgi</code> form guesses based on the order of versions in the dropdown list, but BzAPI re-sorts the versions. I have a bug open on that, but for now, it guesses. You can override with <code>--prodversion</code> or by setting the field when you do -e.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://blog.mozilla.org/sfink/2012/04/13/bzexport-changes-released/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Only pay for the entropy you use</title>
		<link>http://blog.mozilla.org/sfink/2012/02/22/only-pay-for-the-entropy-you-use/</link>
		<comments>http://blog.mozilla.org/sfink/2012/02/22/only-pay-for-the-entropy-you-use/#comments</comments>
		<pubDate>Wed, 22 Feb 2012 21:15:36 +0000</pubDate>
		<dc:creator>sfink</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[mozilla]]></category>
		<category><![CDATA[planet]]></category>

		<guid isPermaLink="false">http://blog.mozilla.org/sfink/?p=328</guid>
		<description><![CDATA[Log Files Are Boring Just an idea, based on hearing that build log transfers seem to consume large amounts of bandwidth. (Note that for all I know, this is already being done.) Logs are pretty dull. In particular, two consecutive log files are usually quite similar. It&#8217;d be nice if we could take advantage of [...]]]></description>
			<content:encoded><![CDATA[<h1>Log Files Are Boring</h1>
<p>Just an idea, based on hearing that build log transfers seem to consume large amounts of bandwidth. (Note that for all I know, this is already being done.)</p>
<p>Logs are pretty dull. In particular, two consecutive log files are usually quite similar. It&#8217;d be nice if we could take advantage of this redundancy to reduce the bandwidth/time consumed by log transfers.</p>
<h2>rsync likes boring data</h2>
<p>The natural thing that springs to mind is rsync. I grabbed two log files that are probably more similar to each other than is really fair, but they shouldn&#8217;t be horribly unrepresentative. rsyncing one to the other found them to share 32% of their data, based on the <code>rsync --stats</code> output lines labeled &#8220;Matched data&#8221; and &#8220;Literal data&#8221;, for a speedup of <strong>1.46x</strong>.</p>
<p>I suspected that rsync&#8217;s default block size is too large, and so most of the commonalities are not found. So I tried setting the block size ridiculously low, to 8 bytes, and it found them to be 98% similar. Which is silly, because it has to retrieve more block hashes at that block size than it saves. The total &#8220;speedup&#8221; is reported as <strong>0.72x</strong>.</p>
<p>But the sweet spot in the middle, with a block size of 192, gives 84% similarity for a speedup of <strong>4.73x</strong>.</p>
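<p>The trade-off can be sketched with a back-of-the-envelope model. This is not how rsync computes its reported speedup; it simply assumes that the bytes on the wire are the unmatched literal data plus a fixed per-block cost for the hashes. The figure of 10 bytes per block is an assumption picked for illustration, and the matched fractions are the ones quoted above.</p>

```python
def estimated_speedup(block_size, matched_fraction, hash_bytes_per_block=10.0):
    """Estimate the speedup as file size divided by bytes sent.

    hash_bytes_per_block is an assumed overhead, not a value taken from rsync.
    """
    literal = 1.0 - matched_fraction              # data that must be sent verbatim
    overhead = hash_bytes_per_block / block_size  # hash traffic per byte of file
    return 1.0 / (literal + overhead)


# (block size, matched fraction) pairs quoted in the text above
for block_size, matched in [(8, 0.98), (192, 0.84)]:
    print(block_size, round(estimated_speedup(block_size, matched), 2))
```

<p>Under that assumption the model lands in the same ballpark as the measurements (a bit under 1x for 8-byte blocks, a bit under 5x for 192-byte blocks), which is at least consistent with the hash traffic being what sinks the tiny block size.</p>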
<h2>compression likes boring data too</h2>
<p>Take a step back: this only applies to uncompressed files. Simply gzipping the log file before transmitting it gives us a speedup of <strong>14.5x</strong>. Oops!</p>
<p>Well, rsync can compress the stuff it sends around too. Adding a -z flag with block size 192 gives a speedup of <strong>16.2x</strong>. Hey, we beat basic gzip!</p>
<p>But compression needs decent chunks to work with, so the sweet spot may be different. I tried various block sizes, and managed a speedup of <strong>24.3x</strong> with -B 960. An additional 1.7x speedup over simple compression is pretty decent!</p>
<p>To summarize our story so far, let&#8217;s say you want to copy over a log file named log123.txt. The proposal is:</p>
<ol>
<li>Have a vaguely recent benchmark log file, call it log_compare.txt, available on all senders and receivers. (Actually, it&#8217;d probably be a different one per build configuration, but whatever.)</li>
<li>On the server, hard link log123.txt to log_compare.txt.</li>
<li>From the client, rsync -z -B 960 log123.txt server:log123.txt</li>
</ol>
<h2>stop repeating what I say!</h2>
<p>But it still feels like there ought to be something better. The benchmark log file is re-hashed every time you do this and the hashes are sent back over the wire, costing bandwidth. So let&#8217;s eliminate that part. Note that we&#8217;ll drop the -z flag from rsync because we may as well let ssh compress the data during the transfer instead:</p>
<pre> ssh server 'ln log_compare.txt log123.txt'
 rsync -B 960 log123.txt log_compare.txt --only-write-batch=batch.dat
 ssh -C server 'rsync --read-batch=- argleblargle log123.txt' &lt; batch.dat</pre>
<p>Note that &#8220;argleblargle&#8221; is ignored, since the source file isn&#8217;t needed.</p>
<p>So what&#8217;s the speedup now? Let&#8217;s only consider the bytes transmitted over the network. Assuming the compression from ssh -C has the same effect as gzipping the file locally, I get a speedup of <strong>28.9x</strong>, about 2x the speedup of simply compressing the log file in the first place.</p>
<p>But wait. The block size of 960 was based on the cost of retrieving all those hashes from the remote side. We&#8217;re not doing that anymore, so a smaller block size should again be more effective. Let&#8217;s see&#8230; -B 192 gets a total speedup of <strong>139x</strong>, which is almost exactly one order of magnitude faster than plain gzipped log files. Now we&#8217;re talking!</p>
<h2>loose ends</h2>
<p>Two things still bug me. One is a minor detail &#8212; the above is writing out batch.dat, then reading it back in to send over to the server. This uselessly consumes disk bandwidth. It would be better if rsync could directly read/write compressed batch files to stdin/stdout. (It can read uncompressed batches from stdin, but not write to stdout. You could probably hack it somehow, perhaps with /proc/pidN/fd/&#8230;, but it&#8217;s not a big deal. And you can just use /dev/shm/batch.dat for your temporary filename, and remove it right after. It&#8217;d still be better if it never had to exist uncompressed anywhere, but whatever.)</p>
<p>The other is that we&#8217;re still checksumming that benchmark file locally for every log file we transfer. It doesn&#8217;t change the number of bytes spewed over the network, but it slows down the overall procedure. I wonder if librsync would allow avoiding that somehow&#8230;? (I think rsync uses two checksums, a fast rolling checksum and a slower precise one, so you&#8217;d need to compute both for all offsets. And reading those in would probably cost more than recomputing from the original file. But I haven&#8217;t thought too hard about this part.)</p>
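<p>For reference, here is a minimal sketch of the kind of fast rolling checksum involved, written from the published description of the rsync algorithm rather than from rsync&#8217;s or librsync&#8217;s source (the function names are mine). The point is that sliding the window forward by one byte takes a constant amount of work, which may be why recomputing from the original file is hard to beat.</p>

```python
M = 65536  # both running sums are kept modulo 2**16


def weak_checksum(block):
    """Compute the two sums (a, b) for a block of bytes from scratch."""
    n = len(block)
    a = sum(block) % M
    b = sum((n - i) * byte for i, byte in enumerate(block)) % M
    return a, b


def roll(a, b, old_byte, new_byte, block_len):
    """Slide the window one byte: drop old_byte at the front, add new_byte at the end."""
    a = (a - old_byte + new_byte) % M
    b = (b - block_len * old_byte + a) % M
    return a, b


data = b"make: Entering directory; make: Leaving directory"
block_len = 16
a, b = weak_checksum(data[:block_len])
for start in range(1, len(data) - block_len + 1):
    a, b = roll(a, b, data[start - 1], data[start + block_len - 1], block_len)
    # The rolled value must equal a full recomputation at every offset.
    assert (a, b) == weak_checksum(data[start:start + block_len])
print("rolled checksums match recomputed ones")
```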
<h2>not just emacs and debuggers</h2>
<p>I sent this writeup to Jim Blandy, who in a typically insightful fashion noticed that (1) this requires some fiddly bookkeeping to ensure that you have a comparison file, and (2) revision control systems already handle all of this. If you have one version of a file checked in and then you check in a modified version of it, the VCS can compute a delta to save storage costs. Then when you transmit the new revision to a remote repository, the VCS will know if the remote already has the baseline revision so it can just send the delta.</p>
<p>Or in other words, you could accomplish all of this by simply checking your log files into a suitable VCS and pushing them to the server. That&#8217;s not to say that you&#8217;re guaranteed that your VCS <em>will</em> be able to fully optimize this case, just that it&#8217;s <em>possible</em> for it to do the &#8220;right&#8221; thing.</p>
<p>I attempted to try this out with git, but I don&#8217;t know enough about how git does things. I checked in my baseline log file, then updated it with the new log file&#8217;s contents, then ran git repack to make a pack file containing both. I was hoping to use the increase in size from the original object file to the pack file as an estimate of the incremental cost of the new log file, but the pack file was <em>smaller</em> than either original object file. If I make a pack with just the baseline, then I end up with two pack files, but the new one is still smaller.</p>
<h2>clients could play too</h2>
<p>As a final thought, this idea is not fundamentally restricted to the server. You could do the same thing inside eg tbpl: keep the baseline log(s) in localStorage or IndexedDB. When requesting a log, add a parameter ?I_have_baseline_36fe137a1192. Then, at the server&#8217;s discretion, it could compute a delta from that baseline and send it over as a series of &#8220;insert this literal data, then copy bytes 3871..17313 from your baseline, then&#8230;&#8221;. tbpl would reconstruct the resulting log file, the unicorns would do their lewd tap dance, and everyone would profit.</p>
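<p>To make the delta idea concrete, here is a rough sketch, in Python rather than the JavaScript that tbpl would actually run. The op format and both function names are invented for this example, and it leans on the standard library&#8217;s <code>difflib</code> to find the shared regions, which is fine for a demo but probably too slow for real multi-megabyte logs.</p>

```python
from difflib import SequenceMatcher


def make_delta(baseline, new):
    """Server side: describe new as copies from baseline plus literal inserts."""
    ops = []
    matcher = SequenceMatcher(None, baseline, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))        # bytes i1..i2 of the baseline
        elif tag != "delete":                   # "replace" or "insert"
            ops.append(("insert", new[j1:j2]))  # literal data
    return ops


def apply_delta(baseline, ops):
    """Client side: rebuild the new log from the cached baseline and the ops."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += baseline[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


baseline = b"compile a.c\ncompile b.c\nlink\n"
new = b"compile a.c\ncompile c.c\nlink\ntest\n"
ops = make_delta(baseline, new)
assert apply_delta(baseline, ops) == new
print(ops)
```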
]]></content:encoded>
			<wfw:commentRss>http://blog.mozilla.org/sfink/2012/02/22/only-pay-for-the-entropy-you-use/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>bzexport &#8211;new: crash test dummies wanted</title>
		<link>http://blog.mozilla.org/sfink/2012/01/21/bzexport-new-crash-test-dummies-wanted/</link>
		<comments>http://blog.mozilla.org/sfink/2012/01/21/bzexport-new-crash-test-dummies-wanted/#comments</comments>
		<pubDate>Sun, 22 Jan 2012 06:36:34 +0000</pubDate>
		<dc:creator>sfink</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[automation]]></category>
		<category><![CDATA[hg]]></category>
		<category><![CDATA[mozilla]]></category>
		<category><![CDATA[mq]]></category>
		<category><![CDATA[mqext]]></category>
		<category><![CDATA[planet]]></category>
		<category><![CDATA[tools]]></category>

		<guid isPermaLink="false">http://blog.mozilla.org/sfink/?p=307</guid>
		<description><![CDATA[Scenario 1: you have a patch to some bug sitting in our mercurial queue. You want to attach it to a bug, but the bugzilla interface is painful and annoying. What do you do? Use bzexport. It&#8217;s great! You can even request review at the same time. What I really like about bzexport is that [...]]]></description>
			<content:encoded><![CDATA[<p>Scenario 1: you have a patch for some bug sitting in your mercurial queue. You want to attach it to a bug, but the bugzilla interface is painful and annoying. What do you do?</p>
<p>Use <a href="http://hg.mozilla.org/users/tmielczarek_mozilla.com/bzexport">bzexport</a>. It&#8217;s great! You can even request review at the same time.</p>
<p>What I really like about bzexport is that while writing and testing a patch, I&#8217;m in an editor and the command line. I may not even have a browser running, if I&#8217;m constantly re-starting it to test something out. Needing to go to the bugzilla web UI interrupts my flow. With bzexport, I can stay in the shell and move onto something else immediately.</p>
<p>Scenario 2: You have a patch, but haven&#8217;t filed a bug yet. Neither has anybody else. But your patch has a pretty good description of what the bug is. (This is common, especially for small things.) Do you really have to go through the obnoxious bug-filing procedure? It sure is tempting just to roll this fix up into some other vaguely related bug, isn&#8217;t it? Surely there&#8217;s a simple way to do things the right way without bouncing between interfaces?</p>
<p>Well, you&#8217;re screwed. Unless you&#8217;re willing to test something out for me. If not, please stop reading.<br />
<span id="more-307"></span></p>
<hr />
<p>Great! You&#8217;re still here! Ok, give <a href="https://bitbucket.org/sfink/bzexport/">this modified version of bzexport</a> a try. (<strong>Update:</strong><em> everything described in this post is now available in the</em> <a title="bzexport" href="http://hg.mozilla.org/users/tmielczarek_mozilla.com/bzexport/">official version of bzexport</a>.) It adds a <tt>--new</tt> option to the <tt>hg bzexport</tt> command that creates a new bug, then attaches your patch to it. My preferred way to use it, for a bug in the JavaScript engine (product &#8216;Core&#8217;, component &#8216;JavaScript engine&#8217;):</p>
<p><tt>hg bzexport --new -e -i -C javascript</tt></p>
<p>That&#8217;ll grab the top applied patch in your mercurial queue. The first line of its description (as set by <tt>hg qnew -e</tt> or <tt>hg qref -e</tt>) will be used for the bug title and the patch description; any other lines will be used for bug comments. It&#8217;ll create a bug, upload your patch as an attachment, and wash your car.</p>
<p>In case you only own a bicycle or your car is allergic to soap, here&#8217;s some detail on all those options I passed:</p>
<ul>
<li><tt>--new</tt>: Create a new bug to attach your patch to. (Without this, it would use the existing bug given either on the command line or in the patch description.)</li>
<li><tt>-e</tt> (or <tt>--edit</tt>): Open up your text editor on a textual form where you can fill in all of the various values: bug title, product, component, comments, etc. If you omit this, it will prompt for anything it needs, but I prefer using this option to see everything at once.</li>
<li><tt>-i</tt> (or <tt>--interactive</tt>): That&#8217;s a lot of magic, and I know the idiot who wrote this code, so I want to be asked before it does anything irreversible. In this case, it&#8217;ll ask before creating the bug, and before uploading the attachment. I can cancel at any time.</li>
<li><tt>-C javascript</tt> (or <tt>--component javascript</tt>): This is for setting the product and component. You <em>could</em> say <tt>-C 'Core/JavaScript Engine'</tt> if you happen to have your product/component memorized. Or even <tt>--product Core --component 'JavaScript Engine'</tt> if you really like being explicit. But if you&#8217;re as lazy as I am, you can give a substring of just your component, and bzexport will scan through all components of all products and find substring matches. In this case, there is a unique match, and it&#8217;ll fill everything in for you. It Does The Right Thing if there are multiple matches &#8212; if you give the -e option, you&#8217;ll have a second chance to tell it the product and component, and if you still can&#8217;t be bothered (or you didn&#8217;t use the -e option in the first place), it will present a textual menu giving all of your options. For example, if I had said <tt>-C engine</tt>, it would give me a menu of the 4 products containing components with &#8220;engine&#8221; as a substring. If I foolishly chose &#8216;Mozilla Messaging&#8217;, it would set the product/component to &#8216;Mozilla Messaging/Release <strong>Engine</strong>ering&#8217;, which is not at all what I wanted but might be if you are not me.</li>
</ul>
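<p>The matching rule described in the last bullet boils down to a case-insensitive substring search over every product/component pair. Here is a minimal sketch of that idea, with a made-up handful of components standing in for the real list from bugzilla. It is not bzexport&#8217;s actual code, and the names <code>PRODUCTS</code> and <code>match_components</code> are hypothetical.</p>

```python
# Hypothetical sample; the real product and component list is much longer.
PRODUCTS = {
    "Core": ["JavaScript Engine", "Layout", "Networking"],
    "Mozilla Messaging": ["Release Engineering"],
    "Other Applications": ["bzexport"],
}


def match_components(query, products=PRODUCTS):
    """Return every (product, component) whose component contains query, ignoring case."""
    needle = query.lower()
    return [
        (product, component)
        for product, components in products.items()
        for component in components
        if needle in component.lower()
    ]


print(match_components("javascript"))  # one match: fill it in automatically
print(match_components("engine"))      # several matches: show a menu instead
```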
<p>If you demand even more help in choosing a bugzilla product and component, install my <a href="https://bitbucket.org/sfink/mqext">mqext</a> extension and run <tt>hg components</tt> (or perhaps <tt>hg components -f <em>filename</em></tt>) to scan through the last 100k commits or so and look for the product/component of all bugs that touched the same files to provide a best guess as to the appropriate categorization.</p>
<p>So where do you come in? Why am I babbling on about this instead of just getting it reviewed by the maintainer (jdm), landing it, and shutting up?</p>
<p>Well, I did give jdm an earlier version, and he mostly liked it. But like I said, it&#8217;s a lot of magic (more than I went over above &#8212; there&#8217;s stuff for picking reviewers, possibly from the patch description, and setting a product version). It implements one particular workflow that I dreamt up. It makes sense to me, but I figure that other people probably have rather different ideas of how it should work and will be rather irritated by what I came up with. I&#8217;m not sure if all of this is 100% compatible with the previous bzexport workflow, so I&#8217;d like to be a little more confident that this is the right direction before updating what everybody&#8217;s using.</p>
<p>So please, especially if you&#8217;re an irritable person, give it a try and let me know what you think. It scratches my itch, but I&#8217;d like to try to scratch your itch too.</p>
<p>Figuratively, I mean.</p>
<p>Yuck.</p>
<p>How can I get that mental image out of my head? Argh!!</p>
<p>Anyway, one last thing &#8212; there&#8217;s also an <tt>hg newbug</tt> command in case you just want the bug-creation functionality, or want to do things in isolated steps. You still get all of the cookie-stealing goodness from bzexport (it&#8217;s awesome; it rifles around your Firefox default profile directory, tries to find the database containing your bugzilla cookie, does some funky magic to open it even if it&#8217;s locked down by sqlite&#8217;s WAL feature, and snatches away the cookie to use for your bugzilla operations. ted++.)</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mozilla.org/sfink/2012/01/21/bzexport-new-crash-test-dummies-wanted/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>patch queue dependencies</title>
		<link>http://blog.mozilla.org/sfink/2012/01/05/patch-queue-dependencies/</link>
		<comments>http://blog.mozilla.org/sfink/2012/01/05/patch-queue-dependencies/#comments</comments>
		<pubDate>Fri, 06 Jan 2012 00:31:25 +0000</pubDate>
		<dc:creator>sfink</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[hg]]></category>
		<category><![CDATA[mercurial]]></category>
		<category><![CDATA[planet]]></category>

		<guid isPermaLink="false">http://blog.mozilla.org/sfink/?p=300</guid>
		<description><![CDATA[A little while back, I was again contemplating a tangled patch queue, considering how to rework it for landing. I thought it&#8217;d be nice to see at a very basic level which patches in the queue were going to be problematic, and which I could freely reorder at whim. So I whipped together a silly [...]]]></description>
			<content:encoded><![CDATA[<p>A little while back, I was again contemplating a tangled patch queue, considering how to rework it for landing. I thought it&#8217;d be nice to see at a very basic level which patches in the queue were going to be problematic, and which I could freely reorder at whim.</p>
<p>So I whipped together a <a href='http://people.mozilla.org/~sfink/data/patchdeps'>silly little script</a> to do that at a file level only. Example output:</p>
<pre>
% patchdeps
Note: This is based on filename collisions only, so may overreport conflicts
if patches touch different parts of the same file. (TODO)

A bug-663281-deque                   X   *       *     *   * *     *
A bug-663281-deque-test              |   :       :     :   : *     :
A bug-642054-func-setline          X |   *       :     :   : :     :
A bug-642054-js_MapPCToLineNumber--' |   *       :     :   : :     :
A bug-642054-rwreentrant             |   : X     :     :   : :     :
A algorithm--------------------------'   X |     *     *   * *     *
A system-libunwind                     X | |     :   * : * : *   * :
A try-libunwind------------------------' | |     :   X : * : *   * :
A backtrace------------------------------' | X * * * | * : * * * : * * * *
U shell-backtrace                          | | : * : | : : : : : : : : : :
U M-reentr---------------------------------' | : : : | : : : : : : : : : :
U M-backtrace--------------------------------' X : : | : : : : : : : * : :
U activities-----------------------------------' X : | : : : : * * : X * *
U profiler---------------------------------------' X | * : * * X * * | * *
U bug-675096-valgrind-jit--------------------------' | * : * : | : : | : :
U bug-599499-opagent-config--------------------------' X * : * | * : | : :
U bug-599499-opagent-----------------------------------' X X * | : * | : :
U bug-642320-gdb-jit-config------------------------------' | * | * : | : :
U bug-642320-gdb-jit---------------------------------------' X | : * | : :
U import-libunwind                                           | | : : | : :
U libunwind-config-------------------------------------------' | X X | : :
U warnings-fixes-----------------------------------------------' | | | : *
U bug-696965-cfi-autocheck---------------------------------------' | | X :
U mystery-librt-stuff----------------------------------------------' | | :
U bug-637393-eval-lifetime                                           | | :
U register-dwarf-----------------------------------------------------' | :
U bug-652535-JM__JIT_code_performance_counters-------------------------' X
U JSOP_RUNMODE-----------------------------------------------------------'
</pre>
<p>How to read it: patches that have no conflicts earlier in the stack are shown without a line next to them. They&#8217;re free spirits; you can &#8220;sink&#8221; them anywhere earlier in your queue without getting conflicts. (The script removes their lines to make the grid take up less horizontal space.)</p>
<p>Any other patch gets a horizontal line that then bends up to show the interference pattern with earlier patches. All in all, you have a complete interference matrix showing whether the set of files touched by any patch intersects the set of files for any other patch.</p>
<p>&#8216;X&#8217; marks the first conflict. After that, the marker turns to &#8216;*&#8217; and the vertical lines get broken. (That&#8217;s just because it&#8217;s mostly the first one that matters when you&#8217;re munging your queue.)</p>
<p>So the patch named &#8220;backtrace&#8221; conflicts with the earlier &#8220;algorithm&#8221; patch, as well as the even earlier &#8220;bug-642054-js_MapPCToLineNumber&#8221; and others. The &#8220;M-reentr&#8221; patch only touches the same stuff as &#8220;bug-642054-rwreentrant&#8221; (not surprising, since &#8220;M-&#8230;&#8221; is my notation for a patch that needs to be folded into an earlier patch.) &#8220;system-libunwind&#8221; doesn&#8217;t conflict with anything earlier in the queue, and so can be freely reordered in the series file to anywhere earlier than where it is now &#8212; but note that several later patches touch the same stuff as it does. (It happens to be a patch to js/src/configure.in.)</p>
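<p>The check behind the grid is nothing more than set intersection on filenames. Here is a minimal sketch of that idea in Python (the real script is Perl, and this is not a port of it). The patch names are borrowed from the output above, but apart from js/src/configure.in the file lists are invented.</p>
<pre>
# Sketch of the file-level conflict check. Patch names come from the
# example above; the file lists are mostly invented for illustration.
def earlier_conflicts(queue):
    """Map each patch to the earlier patches that touch any of the same files.

    queue is a list of (patch_name, set_of_files) in series order.
    """
    result = {}
    for index, (name, files) in enumerate(queue):
        result[name] = [
            other for other, other_files in queue[:index]
            if files.intersection(other_files)
        ]
    return result

queue = [
    ("algorithm", {"js/src/jsdbgapi.cpp"}),
    ("system-libunwind", {"js/src/configure.in"}),
    ("try-libunwind", {"js/src/configure.in"}),
    ("backtrace", {"js/src/jsdbgapi.cpp", "js/src/shell/js.cpp"}),
]
for name, conflicts in earlier_conflicts(queue).items():
    # A patch with no earlier conflicts can sink anywhere earlier in the queue.
    print(name, "-", ", ".join(conflicts) or "free to sink")
</pre>
<p>Like the script, this overreports: two patches that touch different parts of the same file still count as a conflict.</p>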
<p>Useful? Not very. But it was kinda fun to write and I find myself running it occasionally just to see what it shows, so I feel the entertainment value was worth the small investment of time. Though now I&#8217;m tempted to enhance it by checking for collisions in line ranges, not just in the files&#8230;</p>
<p>I suppose I could make a mercurial extension out of it, but that&#8217;d require porting it from Perl to Python, which is more trouble than it&#8217;s worth. (Yes, I still use Perl as my preferred language for whipping things together. Even though I dislike the syntax for nested data structures, I very much like the feature set, and it&#8217;s still the best language I&#8217;ve found for these sorts of things. So phbbbttt!)</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mozilla.org/sfink/2012/01/05/patch-queue-dependencies/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>hg adventure</title>
		<link>http://blog.mozilla.org/sfink/2011/12/16/hg-adventure/</link>
		<comments>http://blog.mozilla.org/sfink/2011/12/16/hg-adventure/#comments</comments>
		<pubDate>Fri, 16 Dec 2011 22:50:03 +0000</pubDate>
		<dc:creator>sfink</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[hg]]></category>
		<category><![CDATA[mercurial]]></category>
		<category><![CDATA[planet]]></category>

		<guid isPermaLink="false">http://blog.mozilla.org/sfink/?p=294</guid>
		<description><![CDATA[Inspired by some silliness on #developers: &#60;jgilbert> well that was an hg adventure &#60;dholbert> $ hg adventure You are in a twisty maze of passageways, all alike... &#60;cpeterson> $ hg look It is pitch black. You are likely to be eaten by a grue. &#60;hub> $ hg doctor How can I help you? I thought [...]]]></description>
			<content:encoded><![CDATA[<p>Inspired by some silliness on #developers:</p>
<pre>
&lt;jgilbert>	well that was an hg adventure
&lt;dholbert>	$ hg adventure
You are in a twisty maze of passageways, all alike...
&lt;cpeterson>	$ hg look
It is pitch black. You are likely to be eaten by a grue.
&lt;hub>		$ hg doctor
How can I help you?
</pre>
<p>I thought I&#8217;d stick to actual hg commands, and came up with:</p>
<pre>
You see a small hole leading to a dark passageway.
820:21d40b86ae37$ echo "enter passageway" > action
820:21d40b86ae37$ hg commit
It is pitch black. You are likely to be eaten by a grue.
821:0121fb347e18$ echo "look" > action
821:0121fb347e18$ hg commit
** You have been eaten by a grue **
822:b09217a7bbc1$ hg backout 822
It is pitch black. You are likely to be eaten by a grue.
821:0121fb347e18$ hg backout 821
You see a small hole leading to a dark passageway.
820:21d40b86ae37$ echo "turn on flashlight" > action
820:21d40b86ae37$ hg commit
Your flashlight is now on.
824:44a4e4bf5f0e$ hg merge 821
Your light reveals a forking passageway leading north and south.
</pre>
<p>Kinda makes you think, huh? Time reversal games became popular semi-recently (e.g. Braid). Maybe the fad is over now; I&#8217;m *way* out of date.</p>
<p>But did any of them allow you to branch and merge? Push and pull from your friends&#8217; distributed repos? Bisect to find the point where you unknowingly did something that prevented ever winning the game and either continue from there, merge a backout of that action, or create a new branch by splicing that action out?</p>
<p>It&#8217;s a whole new genre! It&#8217;ll be&#8230; um&#8230; fun.</p>
<p>(I&#8217;ll go back to work now)</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mozilla.org/sfink/2011/12/16/hg-adventure/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Patch reordering</title>
		<link>http://blog.mozilla.org/sfink/2011/11/03/patch-reordering/</link>
		<comments>http://blog.mozilla.org/sfink/2011/11/03/patch-reordering/#comments</comments>
		<pubDate>Thu, 03 Nov 2011 22:18:28 +0000</pubDate>
		<dc:creator>sfink</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[hg]]></category>
		<category><![CDATA[mercurial]]></category>
		<category><![CDATA[mozilla]]></category>
		<category><![CDATA[mq]]></category>
		<category><![CDATA[patches]]></category>
		<category><![CDATA[planet]]></category>

		<guid isPermaLink="false">http://blog.mozilla.org/sfink/?p=284</guid>
		<description><![CDATA[I have a patch queue that looks roughly like: initial-API consumer-1 consumer-2 unrelated consumer-3-plus-API-changes-and-consumer-1-and-2-updates-for-new-API (So my base repo has a patch &#8216;initial-API&#8217; applied to it, followed by a patch &#8216;consumer-1&#8217;, etc.) The idea is that I am working on a new API of some sort, and have a couple of independent consumers of that API. [...]]]></description>
			<content:encoded><![CDATA[<p>I have a patch queue that looks roughly like:</p>
<pre>  initial-API
  consumer-1
  consumer-2
  unrelated
  consumer-3-plus-API-changes-and-consumer-1-and-2-updates-for-new-API</pre>
<p>(So my base repo has a patch &#8216;initial-API&#8217; applied to it, followed by a patch &#8216;consumer-1&#8217;, etc.)</p>
<p>The idea is that I am working on a new API of some sort, and have a couple of independent consumers of that API. The first two are &#8220;done&#8221;, but when working on the 3rd, I realize that I need to make changes to or clean up the API that they&#8217;re all using. So I hack away, and end up with a patch that contains both consumer 3 plus some API changes, and to get it to compile I also update consumers 1 and 2 to accommodate the new changes. All of that is rolled up into a big hairball of a patch.</p>
<p>Now, what I want is:</p>
<pre>  final-API
  consumer-1 (new API)
  consumer-2 (new API)
  unrelated
  consumer-3 (new API)</pre>
<p>But how do I do that (using mq patches)? I can use qcrefresh+qnew to fairly easily get to:</p>
<pre>  initial-API
  consumer-1 (old API)
  consumer-2 (old API)
  unrelated
  consumer-3 (new API)
  API-changes-plus-API-changes-for-consumers-1-and-2</pre>
<p>or I could split out the consumer 1 &amp; 2 API changes:</p>
<pre>  initial-API
  consumer-1 (old API)
  consumer-2 (old API)
  unrelated
  consumer-3 (new API)
  API-changes
  consumer-2-API-changes
  consumer-1-API-changes</pre>
<p>after which I could, in theory, qfold consumer-1-API-changes and consumer-2-API-changes into the consumer 1 and consumer 2 patches:</p>
<pre>  initial-API
  consumer-1 (new API)
  consumer-2 (new API)
  unrelated
  consumer-3 (new API)
  API-changes</pre>
<p>Unfortunately, consumer-1-API-changes collides with API-changes, so the fold will fail. It shouldn&#8217;t collide, really, but it does because part of the code to &#8220;register&#8221; consumer-1 with the new API happens to sit right alongside the API itself. Even worse, how do I &#8220;sink&#8221; the &#8216;API-changes&#8217; patch down so I can fold it into initial-API to produce final-API? (Apologies for displaying my stacks upside-down from my terminology!) A naive qfold will only work if the API-changes stuff is separate from all the consumer-* patches.</p>
<p>My manual solution is to start with the initial queue:</p>
<pre>  initial-API
  consumer-1 (old API)
  consumer-2 (old API)
  unrelated
  consumer-3-plus-API-changes-and-consumer-1-and-2-updates-for-new-API</pre>
<p>and then use qcrefresh to rip the API changes and their effects on consumers 1 &amp; 2 back out, leaving:</p>
<pre>  initial-API
  consumer-1 (old API)
  consumer-2 (old API)
  unrelated
  API-changes-and-consumer-1-and-2-updates-for-new-API
  (in working directory) consumer-3 (new API)</pre>
<p>I qrename/qmv the current patch to &#8216;api-change&#8217; and qnew &#8216;consumer-3&#8217; (its original name), cursing about how my commit messages are now on the wrong patch. Now I have:</p>
<pre>  initial-API
  consumer-1 (old API)
  consumer-2 (old API)
  unrelated
  api-change (API changes and consumer 1 and 2 updates for new API)
  consumer-3 (new API)</pre>
<p>Now I know that &#8216;unrelated&#8217; doesn&#8217;t touch any of the same files, so I can qgoto consumer-2 and qfold api-change safely, producing:</p>
<pre>  initial-API
  consumer-1 (old API)
  consumer-2 (new API, but also with API change and consumer 1 updates)
  unrelated
  consumer-3 (new API)</pre>
<p>I again qcrefresh/qmv/qnew to pull out a reduced version of the api-change patch, giving:</p>
<pre>  initial-API
  consumer-1 (old API)
  api-change (with API change and consumer 1 updates)
  consumer-2 (new API)
  unrelated
  consumer-3 (new API)</pre>
<p>Repeat. I&#8217;m basically taking a combined patch and sinking it down towards its destination, carving off pieces to incorporate into patches as I pass them by. Now I have:</p>
<pre>  initial-API
  api-change (with *only* the API change!)
  consumer-1 (new API)
  consumer-2 (new API)
  unrelated
  consumer-3 (new API)</pre>
<p>and finally I can qfold api-change into initial-API, rename it to final-API, and have my desired result.</p>
<p>What a pain in the ass! Though the qcrefresh/qmv/qnew step is a lot better than what I&#8217;ve been doing up until now. Without qcrefresh, it would be:</p>
<pre> % hg qrefresh -X .
 % hg qcrecord api-change
 % hg qnew consumer-n
 % hg qpop
 % hg qpop
 % hg qpop
 % hg qpush --move api-change
 % hg qpush --move consumer-n
 % hg qfold old-consumer-n</pre>
<p>which admittedly preserves the change message from old-consumer-n, which is an advantage over my qcrefresh version.<br />
Or alternatively: fold all of the patches together, and qcrecord until you have your desired final result. In this particular case, the &#8216;unrelated&#8217; patch was a whole series of patches, and they weren&#8217;t unrelated enough to just trivially reorder them out of the way.</p>
<p>Without qcrecord, this is intensely painful, and probably involves hand-editing patch files.</p>
<p>My dream workflow would be to have qfold do the legwork: first scan through all intervening patches and grab out the portions of the folded patch that only modify nonconflicting files. Then try to get clever and do the same thing for the portions of the conflicted files that are independent. (The cleverness isn&#8217;t strictly necessary, but I&#8217;ve found that I end up selecting the same portions of my sinking patch over and over again, which gets old.) Then sink the patch as far as it will go before hitting a still-conflicting file, and open up the crecord UI to pull out just the parts that belong to the patch being folded (aka sunk). Repeat this for every intervening conflicting patch until the patch has sunk to its destination, then fold it in. If things get too hairy, then at any point abort the operation, leaving behind a half-sunk patch sitting next to the unmodified patch it conflicted with. (Alternatively, undo the entire operation, but since I keep my mq repo revision-controlled, I don&#8217;t care all that much.)</p>
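<p>The first step of that imagined qfold (working out which files of the sinking patch can pass the intervening patches untouched) might look something like the sketch below. It only models patches as sets of filenames; the function and file names are hypothetical, and nothing here touches mq.</p>
<pre>
# Sketch of the first step of the imagined qfold: split the files touched by
# a sinking patch into those it can carry past the intervening patches and
# those that need manual (crecord-style) attention. All names are hypothetical.
def split_sinking_patch(sinking_files, intervening):
    """Return (free, blocked) file lists for a patch being sunk.

    sinking_files: set of files touched by the patch being sunk.
    intervening: list of (patch_name, set_of_files) it must pass.
    """
    touched = set()
    for _name, files in intervening:
        touched.update(files)
    free = sorted(sinking_files - touched)
    blocked = sorted(sinking_files.intersection(touched))
    return free, blocked

free, blocked = split_sinking_patch(
    {"api.h", "api.cpp", "consumer1.cpp", "consumer2.cpp"},
    [("consumer-2", {"consumer2.cpp"}), ("unrelated", {"README"})],
)
print("sinks untouched:", free)
print("needs crecord:", blocked)
</pre>
<p>The hunk-level cleverness would refine the blocked list further; this sketch stops at whole files.</p>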
<p>I originally wanted something that would do 3-way merges instead of the crecord UI invocations, but merges really want to move you &#8220;forward&#8221; to the final result of merging separate patches/lines of development. Here, I want to go backwards to a patch that, if merged, would produce the result I already have. So merge(base,base+A,base+B) -&gt; base+AB which is the same as base+BA. From that, I could infer a B&#8217; such that base+A+B&#8217; is my merged base+AB, but that doesn&#8217;t do me any good.</p>
<p>In my case, I have base+A+B and want B&#8243; and A&#8243; such that base+B&#8243;+A&#8243; == base+A+B.</p>
<p>To anyone who made it this far: is there already an easy way to go about this? Is there something wrong with my development style that I get into these sorts of situations? In my case, I had already landed &#8216;initial-API&#8217;; please don&#8217;t tell me that the answer is that I always have to get the API right in the first place. Does anyone else get into this mess? (I can&#8217;t say I&#8217;ve run into this all that often, but it&#8217;s happened more than once or twice.)</p>
<p>I suppose if I had landed consumers 1 and 2, I would&#8217;ve just had to modify their uses of the API afterwards. So I could do that here, too. But reviews could tangle things up pretty easily &#8212; if a reviewer of consumer 1 or 2 notices the API uglinesses that I fixed for consumer 3, then landing the earlier consumers becomes dependent on landing consumer 3, which sucks. But also, none of this is really ready to land, and I&#8217;d like to iterate the API in my queue for a while with all the different consumers as test users, *without* lumping everything together into one massive patch.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mozilla.org/sfink/2011/11/03/patch-reordering/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>distcc, ccache, and bacon</title>
		<link>http://blog.mozilla.org/sfink/2011/10/07/distcc-ccache-and-bacon/</link>
		<comments>http://blog.mozilla.org/sfink/2011/10/07/distcc-ccache-and-bacon/#comments</comments>
		<pubDate>Fri, 07 Oct 2011 21:42:40 +0000</pubDate>
		<dc:creator>sfink</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[build]]></category>
		<category><![CDATA[mozilla]]></category>
		<category><![CDATA[planet]]></category>

		<guid isPermaLink="false">http://blog.mozilla.org/sfink/?p=260</guid>
		<description><![CDATA[This was initially a response to JGriffin&#8217;s GoFaster analysis post but grew out of control. Read that first. Rampant speculation tl;dr: hey, we could use ccache and distcc on our build system! Just speculating (as usual), but&#8230; The note about retiring slow slaves, combined with the performance gap between full and incremental builds, suggests something. [...]]]></description>
			<content:encoded><![CDATA[<p>This was initially a response to <a href="http://jagriffin.wordpress.com/2011/09/06/gofaster-deeper-data-analysis/">JGriffin&#8217;s GoFaster analysis post</a> but grew out of control. Read that first.</p>
<h2>Rampant speculation</h2>
<p>tl;dr: <b>hey, we could use ccache and distcc on our build system!</b></p>
<p>Just speculating (as usual), but&#8230;</p>
<p>The note about retiring slow slaves, combined with the performance gap between full and incremental builds, suggests something.</p>
<p>Why does additional hardware (the slow slaves) slow things down? Because load is unevenly distributed. Ignoring communication costs, the fastest way to build with a fast machine and a slow one that takes 2x longer would be to compile 2/3 of the files with the fast machine and 1/3 with the slow one. How? Remove all slow slaves from the build pool and convert them to distcc servers.</p>
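<p>To spell out the 2/3 versus 1/3 arithmetic: if work is split in proportion to speed, every machine finishes at the same moment. A small sketch, still ignoring communication costs (the function name is mine):</p>
<pre>
from fractions import Fraction

def ideal_shares(speeds):
    """Split work in proportion to machine speed so all machines finish together."""
    total = sum(speeds)
    return [Fraction(speed, total) for speed in speeds]

# One fast machine and one that takes twice as long (half the speed).
shares = ideal_shares([2, 1])
print(shares)  # the fast machine gets 2/3 of the files, the slow one 1/3

# Time to finish, with the fast machine's time for the whole build as 1 unit:
# every machine takes share / relative speed.
times = [share / Fraction(speed, 2) for share, speed in zip(shares, [2, 1])]
print(times)   # both 2/3 of a unit, versus 1 unit on the fast machine alone
</pre>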
<p>What about the clobber builds? Well, if you&#8217;ve already built a particular file before with the same compiler and options, it would be nice to not have to build it again. That&#8217;s what ccache is for. But a ccache per slave means you have to have built the same thing <em>on the same slave</em>. For try builds (which is where most of the clobbers are), that&#8217;s not going to happen all the time.</p>
<p>But combine that with the above distcc idea: you could run ccache <em>under</em> distcc on the distcc servers. Now you have a ccache/distcc sandwich: local ccache first, then distcc, then remote ccache, then finally some bacon. Because everything&#8217;s better with bacon.</p>
<div align="center"><img width="50%" src="http://people.mozilla.com/~sfink/Art/bacon.png"></div>
<h2>ts;wm: (too short; want more)</h2>
<p>You know, in terms of data sources, the above picture is wrong. It&#8217;s really local ccache, then remote ccache (via distcc), then remote compile, and only <em>then</em> bacon. But the configuration-centric ccache/distcc/ccache description makes for better visuals. Or would if I put the bacon on the inside, anyway.</p>
<p>Let&#8217;s walk through a clobber build. The stuff the local slave has built before gets pulled from local ccache. Some of the remaining stuff gets built locally. The rest gets sent over to various machines in the distcc pool. We can break those things down into 3 categories: (1) stuff that&#8217;s never been built anywhere, (2) stuff that&#8217;s been built on a different distcc host, and (3) stuff that&#8217;s been built on the same distcc host. #3 is a win. #1 is unavoidable, it&#8217;s the basic cost of doing business. (Actually, there&#8217;s another dimension, which is whether something has been built before on a non-distcc host. I&#8217;ll ignore that for now. Conceptually, you can make it go away by making every slave a distcc server.)</p>
<p>#2 is waste. But it&#8217;s less waste than we have now, if the distcc pool is smaller than the whole build pool, because you&#8217;re doing one redundant build per distcc host rather than one per builder. And it&#8217;s self-limiting: a distcc host that has a build cached returns it immediately, meaning it&#8217;s more likely to get stuck with something it needs to build, which sucks but at least it populates its ccache so it won&#8217;t have to do it again.</p>
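<p>Here is a toy model of that lookup order, with plain dictionaries standing in for the caches. It is only meant to make the data flow concrete; it does not talk to ccache or distcc, and the cache keys are made up.</p>
<pre>
# Toy model of the lookup order: local ccache, then the remote ccache on a
# distcc host, then an actual remote compile. Plain dicts stand in for the
# caches; nothing here talks to the real ccache or distcc.
def build(source_key, local_cache, remote_cache):
    """Return (object_code, where_it_came_from) for one compilation unit."""
    if source_key in local_cache:
        return local_cache[source_key], "local ccache"
    if source_key in remote_cache:
        result, origin = remote_cache[source_key], "remote ccache"
    else:
        result, origin = "obj(%s)" % source_key, "remote compile"
        remote_cache[source_key] = result  # the distcc host remembers it
    local_cache[source_key] = result       # and so does the local ccache
    return result, origin

local, remote = {}, {"jsgc.cpp+flags": "obj(jsgc.cpp+flags)"}
for key in ["jsgc.cpp+flags", "jsapi.cpp+flags", "jsapi.cpp+flags"]:
    print(key, "from", build(key, local, remote)[1])
</pre>
<p>The first file is category #3 (already cached on that distcc host), the second is category #1 (never built anywhere), and the repeat comes straight out of the local cache.</p>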
<p>Now, I am assuming here that compile costs are greater than communication + ccache lookup costs, which is an insanely flawed assumption. But it&#8217;s very very true for my personal builds &#8212; I have my own distcc server, and my clobber builds (actually, *all* my builds) feel way way faster when I&#8217;m using it. So I don&#8217;t think the question is so much &#8220;would this work?&#8221; as it is &#8220;what would we need to do to make this work?&#8221;</p>
<p>For starters, do no harm: it would be great if we could partition the network so that distcc servers are separate from the current communication channels. Every build host would sit on two VLANs, say: the regular one and the distcc one. That would reduce chances of infrastructure meltdown through excessive distcc traffic. (I am not a network engineer, nor do I play one on TV, and this may require separate physical networks and possibly Pringles cans.)</p>
<p>On a related note, it might be wise to start out by restricting the slaves from doing too many distcc jobs at a time, to prevent the distcc jobs from getting bogged down through congestion. I do this for my own builds through a ~/.distcc/hosts file containing: &#8220;localhost/4 192.168.1.99/7&#8221;. That means you can use -j666, and it&#8217;ll still only do 4 jobs on localhost and 7 jobs on 192.168.1.99 simultaneously. (Actually, that&#8217;s my home ~/.distcc/hosts file. My server at work is beefier, and there I allow the remote to do 12 jobs at once. I have a cron job that checks every 5 minutes to see what network I&#8217;m on and sets a ~/.distcc/hosts symlink accordingly. But I digress.)</p>
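<p>As a quick illustration of how those per-host limits add up, here is a sketch that reads a hosts line of that simple host/limit form. It handles only that form (real distcc hosts files allow more syntax than this), and the function name is mine.</p>
<pre>
# Sketch: read a distcc hosts line of the "host/limit" form used above and
# add up how many jobs can run at once, whatever -j value make was given.
def job_limits(hosts_line):
    """Return {host: max_jobs} for entries written as host/limit."""
    limits = {}
    for entry in hosts_line.split():
        host, _slash, limit = entry.partition("/")
        limits[host] = int(limit)
    return limits

limits = job_limits("localhost/4 192.168.1.99/7")
print(limits)
print("jobs in flight at most:", sum(limits.values()))
</pre>
<p>So with my home setup, -j666 really means at most 11 compiles at a time.</p>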
<p>More worrying is the reason behind all that clobbering. If a slave turns to the dark side, runs amok, gets hit by a cosmic ray, or is just having a bad day, do we really want to use its ccached builds? More to the point, when something goes wrong, what do we need to clobber? Right now everything is local to a slave, so it&#8217;s straightforward to pull a slave from the pool, take it out behind the garage, and beat the crap out of it with a stick. With distcc and ccache, it&#8217;s harder to tell which server to blame.</p>
<p>Still, how often does this happen? (I have no idea. I&#8217;m just a troublemaking developer, dammit.) We can always wipe the ccache on the whole distcc pool. It&#8217;d be nice to be able to track problems to their source, though. Maybe we could use the distcc pool redundancy to our advantage: have them cross-check the checksums of their builds with each other. Same input, same output. But that&#8217;s even more speculative.</p>
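<p>The cross-checking idea could be as simple as a majority vote over checksums. A sketch, assuming builds are deterministic (same input, same bytes out) and that at least three hosts built the same thing; the host names and object contents are invented:</p>
<pre>
import hashlib
from collections import Counter

def suspect_hosts(outputs):
    """Given {host: object_bytes} for the same input, return hosts that disagree
    with the most common checksum."""
    digests = {host: hashlib.sha256(data).hexdigest() for host, data in outputs.items()}
    majority, _count = Counter(digests.values()).most_common(1)[0]
    return sorted(host for host, digest in digests.items() if digest != majority)

# Hypothetical hosts; one of them returned a corrupted object file.
print(suspect_hosts({
    "distcc1": b"\x7fELF good object",
    "distcc2": b"\x7fELF good object",
    "distcc3": b"\x7fELF g00d object",
}))
</pre>
<p>With only two hosts a disagreement tells you something is wrong but not who to take out behind the garage.</p>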
<p>It&#8217;s not all bad, though &#8212; I&#8217;m guessing that most clobbers result from the build system not being able to handle various types of change. If the ccache/distcc/ccache sandwich makes clobbers substantially cheaper, we can be a lot freer with them. Someone accidentally cancelled an m-c build partway through? Clobber the world! <b>Let&#8217;s make bacon!!</b></p>
<h2>wtf;yai;bdb: (what the f#@; you&#8217;re an idiot; been done before)</h2>
<h5>Reality check</h5>
<ul>
<li>We use local ccache already &#8211; see <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=488412">bug 488412</a></li>
<li>distcc has been proposed a number of times, but for the life of me I cannot find the bug. There are most likely some very valid reasons not to use it. Such as making a complete interdependent hairball out of our build system where one machine can kill everything.</li>
<li>Given the results in <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=488412">bug 488412</a>, it&#8217;s very plausible that remote ccaches would be of no benefit or a net loss. (Though those numbers were using NFS to retrieve remote ccache results, and I deeply distrust NFS.)</li>
</ul>
<h2>Screw Reality. What has it ever done for me?</h2>
<p>Hey, if we really needed to conceal network latency and redundant rebuilds across different hosts, we could stream out ccache results before they were even needed! But that&#8217;s <a href="https://wiki.mozilla.org/Sfink/Thought_Experiment_-_One_Minute_Builds">crazy talk</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mozilla.org/sfink/2011/10/07/distcc-ccache-and-bacon/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>JS Probes</title>
		<link>http://blog.mozilla.org/sfink/2011/09/21/js-probes/</link>
		<comments>http://blog.mozilla.org/sfink/2011/09/21/js-probes/#comments</comments>
		<pubDate>Thu, 22 Sep 2011 06:36:23 +0000</pubDate>
		<dc:creator>sfink</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[intern]]></category>
		<category><![CDATA[mozilla]]></category>
		<category><![CDATA[planet]]></category>

		<guid isPermaLink="false">http://blog.mozilla.org/sfink/?p=252</guid>
		<description><![CDATA[Have you ever had your browser mysteriously stall periodically and wondered &#8220;what the f#@$! is it doing?!!&#8221; Or perhaps you&#8217;re working on something, say the garbage collector, and you&#8217;d like to see what effect your changes are having. Or maybe even write a little analysis that postprocesses some sort of trace of what is going [...]]]></description>
			<content:encoded><![CDATA[<p>Have you ever had your browser mysteriously stall periodically and wondered &#8220;what the f<em>#@$!</em> is it doing?!!&#8221; Or perhaps you&#8217;re working on something, say the garbage collector, and you&#8217;d like to see what effect your changes are having. Or maybe even write a little analysis that postprocesses some sort of trace of what is going on, and figures out what the optimal pattern of actions would be. (&#8220;If I&#8217;d thrown this big chunk of data out of the cache here, then I would&#8217;ve had room for all of these little things that got evicted instead, and would have had way fewer misses&#8230;&#8221;)</p>
<p>The usual way to do things like this is to manually add some instrumentation code (probably just logging a bunch of events) and postprocess the results. This works fine, but it has a few drawbacks: (1) you have to figure out where to insert your instrumentation, often in unfamiliar code; (2) you&#8217;ll need to recompile, possibly several times; (3) the logs can get very large very quickly; and (4) you&#8217;ll probably end up writing a very special-purpose postprocessor that (5) dumps stuff to a text file that only you know how to interpret, and even you will only remember what it all means for a week or two. The next time you need to do something similar, you&#8217;ll find that all of your instrumentation code is severely bitrotted and misses some paths that have been added in the meantime, so you&#8217;ll start everything over from scratch.</p>
<p>Well, tough luck. Sometimes those are just facts of life and you&#8217;ll need to suck it up. Quit whining, dammit.</p>
<p>But many times, the events of interest (or more precisely, &#8220;probe points&#8221;) are of general interest. If you can manage to slip them into the code and so get other developers to maintain them for you as they make changes, then everyone can rely on those probes being in roughly the right place permanently. That&#8217;s #1 above, and depending on how they&#8217;re implemented there&#8217;s a good chance you won&#8217;t even need to recompile, so that&#8217;s #2.</p>
<p>I&#8217;ve done an implementation of these sorts of probes in the SpiderMonkey JavaScript engine. There are probe points like &#8220;a GC is starting (and it&#8217;s local to one compartment)&#8221;, &#8220;the heap has been resized&#8221;, and &#8220;JavaScript function F is being called/is returning.&#8221; Some of these are straightforward to place into the code &#8212; the start of a GC wasn&#8217;t hard to figure out, for example. Some weren&#8217;t so straightforward, such as JS function calls (they might seem simple, but what if you&#8217;re running JITted? Which JIT? Are you still running JITted by the time you return from the function?). I&#8217;ve delivered the probe information to various backends &#8212; anything from Windows&#8217; <a href="http://blog.mozilla.org/sfink/2010/11/01/etw-part-1-intro/" title="ETW" target="_blank">ETW</a> (blog post forthcoming whenever I manage to implement the start/stop functionality), to <a href="http://en.wikipedia.org/wiki/DTrace" title="DTrace" target="_blank">dtrace</a>/<a href="http://sourceware.org/systemtap/" title="Systemtap" target="_blank">systemtap</a> (another blog post, probably coming sooner since I recently scraped together a demo), to a simple callback mechanism (see JS_SetFunctionCallback on MDN) and other special-purpose ones that only care about a small subset of probes.</p>
<p>#3 (log it all vs. online handling) ventures into religious territory. It is easiest to mindlessly log everything of interest and postprocess it. But what if you want realtime updates? Or if you want to track different information depending on what you learn from other probe points? Or what if the volume of your log writing interferes with whatever you&#8217;re trying to measure (e.g., disk I/O)? Or maybe you need to track some sort of state in order to give the probes meaning. (GC when idle => good. Avoidable GC when the user is waiting => bad.)</p>
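<p>To make the online-handling option a bit more concrete, here is a minimal Python sketch of a handler that keeps a little state and classifies each GC as it happens, rather than logging everything for later. The probe names (<code>input-start</code>, <code>input-end</code>, <code>gc-start</code>) and the <code>GCProbeHandler</code> class are made up for illustration; they are not SpiderMonkey&#8217;s actual probes or API.</p>
<pre>
from collections import Counter

class GCProbeHandler:
    """Toy online handler: classifies each GC as it happens, keeps no log."""

    def __init__(self):
        self.user_waiting = False  # state that gives later probes meaning
        self.counts = Counter()

    def on_probe(self, name):
        # Probe names here are hypothetical, not SpiderMonkey's.
        if name == "input-start":
            self.user_waiting = True
        elif name == "input-end":
            self.user_waiting = False
        elif name == "gc-start":
            kind = "gc-while-waiting" if self.user_waiting else "gc-while-idle"
            self.counts[kind] += 1

handler = GCProbeHandler()
for event in ["gc-start", "input-start", "gc-start", "input-end", "gc-start"]:
    handler.on_probe(event)
print(dict(handler.counts))
</pre>
<p>The only thing retained is a pair of counters, so the memory cost stays constant no matter how many probes fire. The price is that you have to decide up front which state matters.</p>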
<p>Those arguments are what led to the creation of tools like <a href="http://en.wikipedia.org/wiki/DTrace" title="DTrace" target="_blank">DTrace</a> and <a href="http://sourceware.org/systemtap/" title="Systemtap" target="_blank">Systemtap</a>. Both give you a scripting environment that can aggregate information from probes as they fire, control exactly what information gets tracked as things are happening, and can be attached/detached at any time. They&#8217;re pretty cool, and invaluable once you get familiar with them. They&#8217;re also extremely system-dependent and generally require root access or special builds or kernel debuginfo or something, which ends up meaning that you often can&#8217;t just hand off analysis scripts to other people and have those people get some use out of them. And even you may not be able to take them to another environment.</p>
<p>Still, they deal pretty well with #4 (avoiding one-use, special-purpose processors), at least for environments matching the one they were written for. And if they can draw from statically-inserted probe points (the type I was talking about above), they can actually be pretty general. #5 is still a killer, though &#8212; at least the way I write systemtap scripts, they all end up with idiosyncratic ways of dumping out the results of some particular analysis, and nobody else is going to get much enlightenment without studying the script for a while first.</p>
<p>What if we could do better? What if we could insert these static probes, but rather than feeding the information to some niche tool that is usable by only a handful of people, we make the data available to a plain old Firefox addon? You could collect, aggregate, summarize, mutilate, fold, spindle, or crush the data directly in JS code. Then we could let addon authors go crazy with visualizations and analysis libraries. That&#8217;d be cool, right?</p>
<blockquote><p>Graph GC behavior. Warn the user when slow or suspicious stuff is happening. Figure out what&#8217;s going on during long event handlers. Graph the percentage of time spent in different subsystems. Correlate performance/trace data with user-meaningful actions. Make a flight-recording of various metrics and let the user walk through history. Your ideas here.
</p></blockquote>
<p>Ok, so I tricked you. I&#8217;m not going to tell you how to do any of that. This blog post is a tease, an advertisement for the work that <a href="http://www.cs.washington.edu/homes/burg/" title="Brian Burg" target="_blank">Brian Burg</a> did this summer during his Mozilla internship. If you&#8217;re interested, he&#8217;ll be giving his internship final presentation tomorrow (today when you&#8217;re reading this, or perhaps yesterday or last month for those of you who have fallen behind on your Planet reading.) That&#8217;s 1:30PM PDT on Thursday, September 22 at the Mountain View Mozilla headquarters, and I&#8217;m 97.2% sure it will be broadcast over <a href="http://air.mozilla.org/" title="Air Mozilla" target="_blank">Air Mozilla</a> as well. And taped, I think? (Sadly, I can&#8217;t find where those are archived. Somebody please tell me and I&#8217;ll update this post.) There will be a demo. With pretty pictures! And he&#8217;ll be writing it up on his own blog Real Soon Now. I&#8217;m not going to say any more for now &#8212; I&#8217;d get it wrong anyway.</p>
<p><em>Update:</em> Argh! I got the date wrong! It&#8217;s not Wednesday, September 21 as I originally wrote. It&#8217;s today, <em>Thursday</em>, September 22. Sorry for the confusion!</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mozilla.org/sfink/2011/09/21/js-probes/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Contexts and Compartments</title>
		<link>http://blog.mozilla.org/sfink/2011/08/25/contexts-and-compartments/</link>
		<comments>http://blog.mozilla.org/sfink/2011/08/25/contexts-and-compartments/#comments</comments>
		<pubDate>Thu, 25 Aug 2011 18:35:57 +0000</pubDate>
		<dc:creator>sfink</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[js]]></category>
		<category><![CDATA[mozilla]]></category>
		<category><![CDATA[planet]]></category>

		<guid isPermaLink="false">http://blog.mozilla.org/sfink/?p=246</guid>
		<description><![CDATA[A while ago (at the Platform offsite just after the last all-hands, actually) I wrote up what I understood about contexts and compartments. I&#8217;ve since sent it to a couple of people and put it up on the wiki, but haven&#8217;t distributed it more widely because I wasn&#8217;t sure it was all correct. I am [...]]]></description>
			<content:encoded><![CDATA[<p>A while ago (at the Platform offsite just after the last all-hands, actually) I wrote up what I understood about contexts and compartments. I&#8217;ve since sent it to a couple of people and put it <a title="wiki contexts vs compartments article" href="https://wiki.mozilla.org/Sfink/Contexts_and_Compartments">up on the wiki</a>, but haven&#8217;t distributed it more widely because I wasn&#8217;t sure it was all correct. I am far from an expert, but mrbkap (who <em>is</em> the expert) has now read through this and pointed out only one glaring mistake, which is now fixed. So other than the parts I&#8217;ve added since then, it should be more or less correct now and thus is ready for a wider audience.</p>
<p>See also <a href="http://www.christianwimmer.at/Publications/Wagner11a/Wagner11a.pdf">http://www.christianwimmer.at/Publications/Wagner11a/Wagner11a.pdf</a> for the fundamental idea of compartments.</p>
<h2>Contexts=Control, Compartments=Data</h2>
<p>JSContexts are control, JSCompartments are data.</p>
<p>A <code>JSContext</code> (from here on, just &#8220;context&#8221;) represents the execution of JS code. A context contains a JS stack and is associated with a thread. A thread may use multiple contexts, but a given context will only execute on a single thread at a time.</p>
<p>A <code>JSCompartment</code> (&#8220;compartment&#8221;) is a memory space that objects and other garbage-collected things (&#8220;GCthings&#8221;) are stored within.</p>
<p>A context is associated with a single compartment at all times (not necessarily always the same one, but only ever one at a time). The context is often said to be &#8220;running inside&#8221; that compartment. Any object created with that context will be physically stored within the context&#8217;s current compartment. Just about any GCthing read or touched by that context should also be within that same compartment.</p>
<p>To access data in another compartment, a context must first &#8220;enter&#8221; that other compartment. This is termed a &#8220;cross-compartment call&#8221; &#8212; remember, contexts are control, so changing a context&#8217;s compartment is only meaningful if you&#8217;re going to run code. The context will enter another compartment, do some stuff, then return, at which time it&#8217;ll exit back to the original compartment. (The APIs allow you to change to a different compartment and never change back, but using that is almost always a bug and will trigger an assertion in a debug build the first time you touch an object in a compartment that differs from your context&#8217;s compartment.)</p>
<p>When a context is not running code &#8212; as in, its JS stack is empty and it is not in a request &#8212; then it isn&#8217;t really associated with any compartment at all. In the future, starting a request and entering an initial compartment will become the same action. Also, a context is only ever running on one thread at a time. <strong>Update</strong>: or perhaps we&#8217;ll eliminate contexts altogether and just map from a thread to the relevant data.</p>
<p>In implementation terms, a context has a field (cx-&gt;compartment) that gives the current compartment. Contexts also maintain a default scope object (cx-&gt;globalObject) that is required to always be within the same compartment, and a &#8220;pending exception&#8221; object which, if set, will also be in the same compartment. Any object created using a context will be created inside the context&#8217;s current compartment, and the object&#8217;s scope chain will be initialized to a scope object within that same compartment. (That scope object <em>might</em> be cx-&gt;globalObject, but really that&#8217;s just the ultimate fallback. Usually the scope object will be found via the stack.)</p>
<p>To make a cross-compartment call, cx-&gt;compartment is updated to the new compartment. The scope object must also be updated, and for that reason you must pass in a target object in the destination compartment. The scope object will be set to the target object&#8217;s global object. (There&#8217;s a hacky special case when you&#8217;re using a JSScript for the target object, since they don&#8217;t have global objects, but ignore that.) If an exception is pending, it will be set to a wrapper (really, a proxy) inside the new compartment. The wrapper mediates access to the original exception object that lives in the origin compartment.</p>
<p>Finally, a dummy frame that represents the compartment transition is pushed onto the JS stack. This frame is used for setting the scope object of anything created while executing within the new compartment. Also, the security privileges of executing code are determined by the current stack &#8212; e.g., if your chrome code in a chrome compartment calls a content script in a content compartment, that script will execute with content privileges until it returns, then will revert to chrome privileges.</p>
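<p>The enter/do-stuff/exit shape of a cross-compartment call can be sketched as a toy model in Python. This is not SpiderMonkey code: the <code>Context</code> and <code>Compartment</code> classes and the <code>cross_compartment_call</code> method are hypothetical names, and the sketch only models the current-compartment switch and the dummy frame. It leaves out the scope object and the wrapping of a pending exception.</p>
<pre>
from contextlib import contextmanager

class Compartment:
    def __init__(self, name):
        self.name = name

class Context:
    """Toy stand-in for a JSContext: one current compartment plus a stack."""

    def __init__(self, compartment):
        self.compartment = compartment
        self.stack = []

    @contextmanager
    def cross_compartment_call(self, target):
        origin = self.compartment
        self.compartment = target
        # Dummy frame marking the transition, as described above.
        self.stack.append(("dummy-frame", origin.name, target.name))
        try:
            yield self
        finally:
            # Always return to the original compartment on the way out.
            self.stack.pop()
            self.compartment = origin

chrome = Compartment("chrome")
content = Compartment("content")
cx = Context(chrome)
with cx.cross_compartment_call(content):
    inside = cx.compartment.name
after = cx.compartment.name
print(inside, after)
</pre>
<p>Tying the exit to the end of the <code>with</code> block is what rules out the &#8220;change compartments and never change back&#8221; bug in this toy version: the context is restored even if the body raises.</p>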
<p>When debugging, it is helpful to know that a compartment is associated with a &#8220;<code>JSPrincipals</code>&#8221; object that represents the &#8220;security information&#8221; for the contents of that compartment. This is used to decide who can access what, and is mostly opaque to the JS engine. But for Gecko, it&#8217;ll typically contain a human-understandable URL, which makes it much easier to figure out what&#8217;s going on:</p>
<pre>(gdb) p obj
$1 = (JSObject *) 0x7fffbeef
(gdb) p obj-&gt;compartment()
$2 = (JSCompartment *) 0xbf5450
(gdb) p obj-&gt;compartment()-&gt;principals
$3 = (JSPrincipals *) 0xc29860
(gdb) p obj-&gt;compartment()-&gt;principals-&gt;codebase
$4 = 0x7fffd120 "[System Principal]"
...or perhaps...
$4 = 0x7fffd120 "http://angryhippos.com/accounts/"</pre>
<p>Anything within a single compartment can freely and directly access anything else in that same compartment. No locking or wrappers are necessary (or possible). The overall model is thus a partitioning of all (garbage collectible) data into separate compartments, with controlled access from one compartment to another but lockless, direct access between objects within a compartment. Cross-compartment access is handled via &#8220;wrappers&#8221;, which is the subject of the next section.</p>
<h2>Wrappers</h2>
<p>GCthings may be wrapped in cross-compartment wrappers for a number of reasons. When a context is transitioning from one compartment to another (ie, it&#8217;s making a cross-compartment call), its scope object and pending exception (if any) are changed to wrappers pointing back to the objects in the old compartment. But any object can be wrapped in a cross-compartment wrapper if needed. You can clone an object from another compartment, and all of its properties will be wrappers pointing at the &#8220;real&#8221; properties in the origin compartment.</p>
<p>Cross-compartment wrappers do not compose. When you wrap an object, any existing wrappers will be ripped off first. (Slight oversimplification; there is one exception.) In fact, the type of wrapper used for an object is uniquely determined by the source and destination compartments.</p>
<p>The precise terminology is a little confusing. A cross-compartment wrapper is a <code>JSObject</code> whose class is one of the proxy classes. When you access such an object, it fetches its proxy handler (a subclass of <code>JSProxyHandler</code>) out of a slot to decide how to handle that access. Confusingly, in the code a <code>JSCrossCompartmentWrapper</code> is the subclass of <code>JSProxyHandler</code> that manages cross-compartment access, but usually when we refer to a &#8220;cross-compartment wrapper&#8221;, we&#8217;re really talking about the <code>JSObject</code>. (The <code>JSObject</code> of type <code>js::SomethingProxyClass</code> that has a private <code>JSSLOT_PROXY_HANDLER</code> field containing a <code>JSProxyHandler</code> subclass that knows how to mediate access to the proxied object stored in <code>JSSLOT_PROXY_PRIVATE</code>. Phew.)</p>
<p>A proxy handler mediates access to the proxied objects based on a set of rules embodied by some subclass of <code>JSProxyHandler</code>. A proxy handler might allow all accesses through, conceal certain properties, or check on each access whether the source compartment is allowed to see a particular property. Examples of proxy handler classes are the things listed on <a title="XPConnect wrappers" href="https://developer.mozilla.org/en/XPConnect_wrappers">https://developer.mozilla.org/en/XPConnect_wrappers</a> : cross-origin wrappers (XOWs), chrome object wrappers (COWs), etc.</p>
<p>Also, the same wrapper will always be used for a given object. This is necessary for equality testing between independently generated wrappings of the same object, and useful for performance and memory usage as well. Internally, every compartment has a wrapperCache that is keyed off of wrapped objects&#8217; identity. You could think of the flavor of wrapper (i.e., the type of proxy handler) being determined by the tuple «destination compartment, source compartment, object», but the object is stored within the source compartment so those last two are redundant with each other.</p>
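<p>Here is a rough Python sketch of those two rules together: wrappers don&#8217;t compose, and wrapping the same object twice gives back the same wrapper. The <code>Wrapper</code> class, the <code>wrap</code> method, and the <code>wrapper_cache</code> dictionary are illustrative names only, and the toy ignores which compartment an object actually lives in, so it does not model how the real engine picks a proxy handler.</p>
<pre>
class Wrapper:
    """Toy cross-compartment wrapper around an object from another compartment."""

    def __init__(self, target):
        self.target = target

class Compartment:
    def __init__(self, name):
        self.name = name
        # Keyed by the wrapped object's identity. Holding the wrapper (which
        # references the target) keeps the target alive, so ids stay unique.
        self.wrapper_cache = {}

    def wrap(self, obj):
        # Wrappers do not compose: strip any existing wrapper first.
        while isinstance(obj, Wrapper):
            obj = obj.target
        key = id(obj)
        if key not in self.wrapper_cache:
            self.wrapper_cache[key] = Wrapper(obj)
        return self.wrapper_cache[key]

content = Compartment("content")
chrome = Compartment("chrome")
page_object = {"title": "angry hippos"}

w1 = chrome.wrap(page_object)
w2 = chrome.wrap(page_object)
w3 = chrome.wrap(content.wrap(page_object))
print(w1 is w2, w1 is w3)
</pre>
<p>Both comparisons come out true, which is the property that makes equality testing between independently generated wrappings work.</p>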
<p>From the JS engine&#8217;s point of view, there are a bunch of objects, every object lives in some compartment, and whenever you call something or point to something in another compartment, the engine will interpose a cross-compartment wrapper for you. It&#8217;s up to the embedding &#8212; the user of the JS engine &#8212; to decide how to divide up data into different compartments, and what behavior is triggered when you cross between compartments. You could have a &#8220;home&#8221; compartment and a &#8220;bigger&#8221; compartment, and the cross-compartment wrapper could convert any string to Pig Latin when it is retrieved from &#8220;bigger&#8221; by &#8220;home&#8221;. More practically, you could conceal certain properties from view when accessing them from an &#8220;unprivileged&#8221; compartment (whatever that might mean in your embedding), or you could do locking or queuing when accessing one compartment from another compartment in a different thread. Or add a remoting layer.</p>
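<p>Purely for fun, the Pig Latin embedding might look something like this in Python. It is a sketch under heavy assumptions: <code>PigLatinWrapper</code>, <code>Page</code>, and the simplified one-letter Pig Latin rule are all invented for illustration, and a real embedding would do this in a <code>JSProxyHandler</code> subclass rather than with Python attribute hooks.</p>
<pre>
def pig_latin(word):
    # Simplified rule, assumed for this sketch: move the first letter
    # to the end and append "ay".
    return word[1:] + word[0] + "ay" if word else word

class PigLatinWrapper:
    """Toy proxy: every string property read through it is translated."""

    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        # Only called for names not found on the wrapper itself,
        # so _target is looked up normally.
        value = getattr(self._target, name)
        if isinstance(value, str):
            return " ".join(pig_latin(w) for w in value.split())
        return value

class Page:
    title = "angry hippos"
    visits = 3

wrapped = PigLatinWrapper(Page())
print(wrapped.title, wrapped.visits)
</pre>
<p>The same hook is where a more practical wrapper would hide properties or check privileges: the decision is made per access, and the wrapped object never needs to know.</p>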
<p>XPConnect (Gecko&#8217;s SpiderMonkey embedding code) uses cross-compartment wrappers to implement security policies and access rules. The &#8216;Introduction&#8217; section at <a href="https://developer.mozilla.org/en/XPConnect_security_membranes#Introduction" title="XPConnect security membranes intro">https://developer.mozilla.org/en/XPConnect_security_membranes</a> gives a very good description of what XPConnect is using the wrappers for. Gecko uses (mostly) one compartment for chrome, and one compartment for each content domain. The wrapper is chosen based on whether the two compartments are the same origin, or whether one is privileged to see anything or a subset of the information in the other, etc. See <code>js/src/xpconnect/wrappers/WrapperFactory.cpp</code> for the gruesome details.</p>
<h2>Future</h2>
<p>(Or, &#8220;What Luke Wagner is plotting&#8221;.)</p>
<p>There are various plans that will probably change this picture substantially. Our threading story right now is a bit convoluted &#8212; compartments can only be touched by one thread at a time but can supposedly switch between threads, or something, and contexts need to be in a request before doing anything and beginning a request binds the context to a thread but requests can be suspended, and a context points to a thread data but you need to rebind the thread data if you switch threads&#8230; it&#8217;s complicated, ok? I tried to document it once, but just kept confusing myself.</p>
<p>Luke plans to <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=650411">make <code>JSRuntime</code>s be single-thread only</a>, <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=650361">eliminate JSContexts entirely</a>, and <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=650353">make JSCompartments be per-global</a> (right now you can have multiple global objects in a compartment). I don&#8217;t really understand all that (are JSRuntimes the new JSContexts?) but the point is that things are a-changin&#8217;.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mozilla.org/sfink/2011/08/25/contexts-and-compartments/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

