Jan 13

analyzing linker max vsize

mozilla-inbound is currently approval-only due to issues with Windows PGO builds. The short explanation is that we turn on aggressive code optimization for our Windows builds, and this optimization causes the linker that comes with Visual Studio to run out of virtual memory. The current situation is especially problematic because we can’t increase the amount of virtual memory the linker can access (unlike last time, where we “just” moved the builds to 64-bit machines).

We don’t really have a good handle on what causes these issues (other than the obvious “more code”), but at least we are tracking the linker’s vsize and we’ll soon have pretty pictures of the same. We hadn’t expected to have to deal with this problem for several more months. The graph below helps explain why we’re hitting this problem sooner than expected. The data for this graph was taken from the Windows nightly build logs.

[Graph: Win32 linker max vsize]

Notice the massive spike in October, as well as the ~100MB worth of growth in early January.  While the data is not especially fine-grained (nightly builds can include tens of changesets, and we’d really like information on the vsize growth on a per-changeset basis), looking at the biggest increases over the last ten months might prove helpful.  There have been ~300 nightly builds since we started recording data; below is a list of the top 20 daily increases in linker max vsize.  The date in the table is the date the nightly build was done; the newly-included changeset range is linked to for your perusal.

Nightly build date    vsize increase (MB)
2012-05-18            282.363281
2012-10-06            103.609375
2012-10-08             90.769531
2013-01-10             49.699219
2012-06-02             49.199219
2012-10-19             32.976562
2012-12-25             32.332031
2013-01-06             32.015625
2013-01-20             30.144531
2013-01-22             27.222656
2012-10-04             19.273438
2012-05-10             18.234375
2012-11-23             17.937500
2012-08-03             17.738281
2013-01-07             17.671875
2012-09-08             17.386719
2012-12-23             17.269531
2012-12-27             17.156250
2012-11-11             17.085938
2012-12-06             17.003906
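
For anyone who wants to produce a list like this from their own data, here is a minimal sketch of the computation. It is not the script used for the table above: the `Sample` and `Increase` types and the `top_increases` function are hypothetical names, and the sketch assumes the max vsize for each nightly has already been extracted from the build logs into one value per date.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One data point per nightly build. Hypothetical layout, not the real log format.
struct Sample {
    std::string date;      // ISO date, e.g. "2012-05-18"
    double max_vsize_mb;   // linker max vsize for that build, in MB
};

struct Increase {
    std::string date;      // date of the later of the two builds compared
    double delta_mb;       // growth relative to the previous build, in MB
};

// Returns the n largest build-over-build increases, biggest first.
std::vector<Increase> top_increases(const std::vector<Sample>& samples, std::size_t n) {
    // Precondition: samples are ordered by date (ISO dates sort lexicographically).
    assert(std::is_sorted(samples.begin(), samples.end(),
                          [](const Sample& a, const Sample& b) { return a.date < b.date; }));

    std::vector<Increase> deltas;
    for (std::size_t i = 1; i < samples.size(); ++i) {
        deltas.push_back({samples[i].date,
                          samples[i].max_vsize_mb - samples[i - 1].max_vsize_mb});
    }

    // Only the first n entries need to be in order, so a partial sort is enough.
    n = std::min(n, deltas.size());
    auto middle = deltas.begin() + static_cast<std::ptrdiff_t>(n);
    std::partial_sort(deltas.begin(), middle, deltas.end(),
                      [](const Increase& a, const Increase& b) { return a.delta_mb > b.delta_mb; });
    deltas.erase(middle, deltas.end());
    return deltas;
}
```

Calling `top_increases(samples, 20)` on the full series would give the same kind of ranking as the table. A per-changeset version would only need finer-grained samples.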

Mike Hommey suggested that trying to divine the whys and hows of extra memory usage would be a fruitless endeavor. Looking at the above pushlogs, I am inclined to agree with him. There’s nothing in any of them that jumps out. I didn’t try clicking through to individual changesets to figure out what might have added large chunks of code, though.

Jan 13

64-bit multiplication pitfalls

I’ve seen several instances of code recently that look something like this:

void madd(int64_t *sum, int32_t x, int32_t y)
{
  *sum += x * y;
}

Or this:

void func(int64_t);
  int32_t x, y = ...;
  func(x * y);

Unfortunately, neither of these cases does what the programmer intended. The intended result was to compute the full 64-bit product from multiplying two 32-bit numbers. What gets computed instead is the lower 32 bits of the desired product, sign-extended to 64 bits, which is quite different! (Strictly speaking, overflowing a signed multiplication is undefined behavior in C and C++, so even that result is not guaranteed; it is simply what compilers produce in practice.) The assembly produced by x86-64 GCC at -O2 for the first example looks like:

    imull   %edx, %esi    # int32_t multmp = x * y
    movslq  %esi, %rsi    # int64_t exttmp = static_cast<int64_t>(multmp)
    addq    %rsi, (%rdi)  # *sum += exttmp

If the full 64-bit product is desired, one of the arguments needs to be cast to a 64-bit value first.

void madd(int64_t *sum, int32_t x, int32_t y)
{
  *sum += static_cast<int64_t>(x) * y;
}

(The standard-ese for this is that the type of an arithmetic expression is determined by the types of its operands, not by the type of whatever the result is assigned or passed to. Integers smaller than int are promoted to int, which is what you want most of the time. Of course, here we’re dealing with things that are already int-sized[*], so we have to explicitly ask for a wider type.)

With the cast in place, the compiler produces the desired:

    movslq  %esi, %rsi    # int64_t xtmp = static_cast<int64_t>(x)
    movslq  %edx, %rdx    # int64_t ytmp = static_cast<int64_t>(y)
    imulq   %rdx, %rsi    # int64_t multmp = xtmp * ytmp
    addq    %rsi, (%rdi)  # *sum += multmp
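
To see the difference in actual numbers, here is a small sketch comparing the two computations. The function names are made up for illustration. Because signed overflow is undefined behavior, `truncated_product` does not literally compute `x * y` on `int32_t`; it imitates the imull/movslq sequence above using unsigned arithmetic, assuming a platform where int is 32 bits and where converting an out-of-range value to `int32_t` wraps (guaranteed since C++20, and what mainstream compilers do before that).

```cpp
#include <cassert>
#include <cstdint>

// Imitates the buggy version: keep the low 32 bits of the product,
// then sign-extend to 64 bits. Unsigned multiplication wraps modulo 2^32,
// so this avoids the undefined behavior of overflowing int32_t.
int64_t truncated_product(int32_t x, int32_t y) {
    uint32_t low = static_cast<uint32_t>(x) * static_cast<uint32_t>(y);
    return static_cast<int64_t>(static_cast<int32_t>(low));
}

// The fixed version: widen one operand first, so the multiply is done in 64 bits.
int64_t full_product(int32_t x, int32_t y) {
    return static_cast<int64_t>(x) * y;
}

// Two sample inputs whose products do not fit in 32 bits.
void check_products() {
    // 100000 * 100000 = 10^10; its low 32 bits are 1410065408.
    assert(full_product(100000, 100000) == 10000000000LL);
    assert(truncated_product(100000, 100000) == 1410065408LL);

    // 65536 * 32768 = 2^31; the low 32 bits have the sign bit set,
    // so sign-extension turns the result negative.
    assert(full_product(65536, 32768) == 2147483648LL);
    assert(truncated_product(65536, 32768) == -2147483648LL);
}
```

The same fix applies to the second example: `func(static_cast<int64_t>(x) * y)`.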

The above examples are semi-obvious instances, but when dealing with types whose sizes are not specified, similar problems occur. Consider replacing int64_t with off_t and int32_t with size_t in the example above. While such code will mostly work (most files are well under 2GB or 4GB in size), off_t and size_t do not need to be the same size: try printing sizeof(off_t) from a program compiled with -D_FILE_OFFSET_BITS=64 on your favorite 32-bit Linux sometime.
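
Here is a minimal sketch of the off_t/size_t variant with the same fix applied. `record_offset` is a hypothetical helper, not something from a real codebase. The sketch assumes a POSIX system with glibc-style large file support, where defining `_FILE_OFFSET_BITS` before any include has the same effect as passing -D_FILE_OFFSET_BITS=64 on the command line.

```cpp
#define _FILE_OFFSET_BITS 64  // must come before every #include to take effect
#include <sys/types.h>        // off_t (POSIX)
#include <cassert>
#include <cstddef>

// Byte offset of record number `index` in a file of fixed-size records.
// Casting before multiplying makes the multiply happen at off_t width;
// `index * record_size` alone would be computed at size_t width, which is
// only 32 bits on a 32-bit target even when off_t is 64 bits.
// Assumes both arguments fit in off_t.
off_t record_offset(std::size_t index, std::size_t record_size) {
    return static_cast<off_t>(index) * static_cast<off_t>(record_size);
}

void check_record_offset() {
    // 70000 * 70000 = 4900000000, which does not fit in 32 bits.
    assert(record_offset(70000, 70000) == 4900000000LL);
}
```

On a 64-bit target the uncast multiplication happens to work too, since size_t is already 64 bits there, which is exactly why this kind of bug tends to go unnoticed.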

[*] Assuming we’re on a fairly standard 32-bit or 64-bit machine, of course.