29
Jul 16

a git pre-commit hook for tooltool manifest checking

I’ve recently been uploading packages to tooltool for my work on Rust-in-Gecko and Android toolchains. The steps I usually follow are:

  1. Put together tarball of files.
  2. Call tooltool.py from build-tooltool to create a tooltool manifest.
  3. Upload files to tooltool with said manifest.
  4. Copy bits from said manifest into one of the manifest files automation uses.
  5. Do try push with new manifest.
  6. Admire my completely green try push.

That would be the ideal, anyway.  What usually happens at step 4 is that I forget a comma, or I forget a field in the manifest, and so step 5 winds up going awry, and I end up taking several times as long as I would have liked.

After running into this again today, I decided to implement some minimal validation for automation manifests.  I use a fork of gecko-dev for development, as I prefer Git to Mercurial. Git supports running programs when certain things occur; these programs are known as hooks and are usually implemented as shell scripts. The hook I’m interested in is the pre-commit hook, which is looked for at .git/hooks/pre-commit in any git repository. Repositories come with a sample hook for every hook supported by Git, so I started with:

cp .git/hooks/pre-commit.sample .git/hooks/pre-commit

The sample pre-commit hook checks trailing whitespace in files, which I sometimes leave around, especially when I’m editing Python, and can check for non-ASCII filenames being added.  I then added the following lines to that file:

if git diff --cached --name-only | grep -q releng.manifest; then
    for f in $(git diff --cached --name-only | grep releng.manifest); do
	if ! python -<<EOF
import json
import sys
try:
    with open("$f", 'r') as f:
        json.loads(f.read())
    sys.exit(0)
except:
    sys.exit(1)
EOF
	    then
	    echo $f is not valid JSON
	    exit 1
	fi
     done
fi

In prose, we’re checking to see if the current commit has any releng.manifest files being changed in any way. If so, then we’ll try parsing each of those files as JSON, and throwing an error if one doesn’t parse.

There are several ways this check could be more robust:

  • The check will error if a commit is removing a releng.manifest, because that file won’t exist for the script to check;
  • The check could ensure that the unpack field is set for all files, as the manifest file used for the upload in step 3, above, doesn’t include that field: it needs to be added manually.
  • The check could ensure that all of the digest fields are the correct length for the specified digest in use.
  • …and so on.

So far, though, simple syntax errors are the greatest source of pain for me, so that’s what’s getting checked for.  (Mismatched sizes have also been an issue, but I’m unsure of how to check that…)

What pre-commit hooks have you found useful in your own projects?


06
Jul 16

on the usefulness of computer books

I have a book, purchased during my undergraduate days, entitled Introduction to Algorithms. Said book contains a wealth of information about algorithms and data structures, has its own Wikipedia page, and even a snappy acronym people use (“CLRS”, for the first letters of its authors’ last names).

When I bought it, I expected it to be both an excellent textbook and a book I would refer to many times throughout my professional career.  I cannot remember whether it was a good textbook in the context of my classes, and I cannot remember the last time I opened it to find some algorithm or verify some subtle point.  Mostly, it has served two purposes: an excellent support for my monitor to position the monitor more closely to eye level, and as extra weight to move around when I have had to transfer my worldly possessions from place to place.

Whether this reflects on the sort of code I have worked on, or the rise of the Internet for answering questions, I am unsure.

I have another book, also purchased during my undergraduate days, entitled Programming with POSIX Threads.  Said book contains a wealth of information about POSIX threads (“pthreads”), is only mentioned in “Further Reading” on the Wikipedia page for POSIX threads, and has no snappy acronym associated with it.

I purchased this book because I thought I might assemble a library of programming knowledge, and of course threads would be a part of that.  Mostly, it would sit on the shelves to show people I was a Real Programmer(tm).

Instead, I have found it to be one of those books to always have close at hand, particularly working on Gecko.  Its explanations of the basic concepts of synchronization are clear and extensive, its examples of how to structure multithreaded algorithms are excellent, and its secondary coverage of “real-world” things such as memory ordering and signals + threads (short version: “don’t”) have been helpful when people have asked me for opinions or to review multi-threaded code.  When I have not followed the advice of this book, I have found myself in trouble later on.

My sense when searching for some of the same topics the book covers is that finding the same quality of coverage for those topics online is rather difficult, even taking into account that topics might be covered by disparate people.

If I had to trim my computer book library down significantly, I’m pretty sure I know what book I would choose.

What book have you found unexpectedly (un)helpful in your programming life?