Jan 12

Version String Management in Python: Introducing python-versioneer

What’s a good way to manage version numbers in a Python project? I don’t mean:

  • where it should be stored, so that other code can find it. PEP 8 tells us to use __version__, and distutils tells us to call setup() with a version= argument. The embedded string is particularly useful for reporting or recording a version in bug reports.
  • what format it should take: PEP 386 describes a format (N.N[.N]+[{a|b|c|rc}N[.N]+][.postN][.devN]) that enables comparison, so packaging tools can evaluate things like “dependency > 1.2.0”. (I happen to find this format really limiting, and this tool doesn’t necessarily produce PEP 386-compliant strings, but that’s not what this post is about.)
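To make the first point concrete, here’s a minimal sketch (the version value is just illustrative) of the PEP 8 convention, plus a crude tuple-based comparison of the kind a packaging tool performs when evaluating “dependency > 1.2.0”. Real tools handle the full PEP 386 grammar (a/b/c/rc, .postN, .devN); this only handles plain N.N.N:

```python
# mypkg/__init__.py -- PEP 8 convention: a module-level __version__ string
__version__ = "1.2.0"

def version_tuple(s):
    # split a dotted release number into a tuple of ints, so that
    # comparison is numeric ("1.10.0" > "1.2.0") rather than
    # lexicographic (where "1.10.0" < "1.2.0")
    return tuple(int(part) for part in s.split("."))

assert version_tuple("1.10.0") > version_tuple("1.2.0")
```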

What I do mean is:

  • how does the right version string get into the code?
  • what does a release manager need to do when it’s time to make a new release?

The traditional approach, ages old, is to have a static string embedded in the code. Each time you’re about to make a new release, you make a commit that updates this string. It’s nice and simple, but it has some problems.
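The static-string approach is easy to sketch. Assuming a hypothetical package named mypkg, the hand-edited file and the setup.py that re-reads it might look something like this (the regex-scan trick, rather than importing the package, is one common way to keep setup.py working before the package’s dependencies are installed):

```python
import re

# mypkg/_version.py would contain a single hand-edited line like this,
# bumped in a commit just before each release:
VERSION_PY = '__version__ = "0.4.2"\n'

# setup.py then scans for the assignment instead of importing the
# package, so it works even in a bare checkout:
def get_version(source=VERSION_PY):
    m = re.search(r'__version__ = "([^"]+)"', source)
    return m.group(1)

print(get_version())  # -> 0.4.2
```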

Jan 12

Improving the pynacl build process

I’ve been hacking on my copy of pynacl this week. pynacl is a set of Python bindings to the NaCl cryptography library (by djb and friends). The bindings, first written by Sean Lynch and then picked up by “k3d3”, are great. They even support py2.6 through py3.2.

But actually building them is a hassle, because the NaCl build process is so idiosyncratic. It consists of a 500-line undocumented shell script named “do”, and running it costs you 25 minutes of 100% CPU in stony silence (all progress messages are redirected to a logfile). If you can wait that long, and think to explore the directory afterwards, you’ll be rewarded with a build/HOSTNAME/ directory that contains a libnacl.a and a set of header files that are pretty easy to use. What’s actually going on behind the scenes is that the script exhaustively compiles and tests a large matrix of compiler flags (-O vs -O3 vs -O3 -funroll-loops), ABI variants, and alternative implementations. The goal is apparently to:

  • select the fastest possible implementation and compiler options, using any assembly-language tricks specific to the processor (SSE3, etc)
  • make sure the unit tests pass
  • construct a performance report to send back to the authors

Unfortunately, this doesn’t play well with other build systems that might want to embed a copy, such as Python’s distutils, because:

  • some compiler flags (-fPIC) are needed to build the .so files that python can load at runtime: distutils knows what these are, “do” doesn’t
  • when building e.g. Debian packages, the results will be used on other machines, so processor-specific optimizations aren’t OK: you might build your packages on a machine with some feature that’s not present on the machines that will use them, so you have to stick with least-common-denominator code.
  • running “do” requires a Bourne Shell interpreter, standard on unix systems but not so obvious on windows. distutils knows how to compile things on windows, but you have to tell it the source files, and it will run the compiler itself.
  • having a separate “./do” compile step means that “setup.py build” is not enough, which means that easy_install won’t work, making it hard to use pynacl as a dependency in virtualenv or pip environments.
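One way around most of these points is to hand distutils the source files directly and let it drive the compiler, so it supplies -fPIC on unix and knows what to do on windows. A rough sketch of what that setup.py could look like (the file names and module layout here are hypothetical, not the actual NaCl or pynacl tree):

```python
# setup.py -- let distutils pick the compiler and platform flags,
# instead of shelling out to NaCl's "./do" script
from distutils.core import setup, Extension

nacl_module = Extension(
    "nacl._nacl",
    # hypothetical file list; the real NaCl tree ships several
    # alternative implementations of each primitive, and one would
    # have to pick a portable "ref" implementation per primitive
    sources=["src/crypto_box.c", "src/crypto_sign.c", "nacl_module.c"],
    include_dirs=["src/include"],
)

setup(
    name="pynacl",
    version="0.1",
    packages=["nacl"],
    ext_modules=[nacl_module],
)
```

With the sources declared this way, “setup.py build” is sufficient, which in turn lets easy_install and pip treat pynacl like any other dependency.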

In addition, waiting 25 minutes for an otherwise small and elegant library to build is just a drag.
