Conservative stack scanning 101

Maybe you’re curious about how the MMgc garbage collector works. Here’s my rambling attempt to explain it on IRC this morning (edited for clarity and layout).

(Note: The MMgc documentation is another good place to start, if you already know all about the C++ stack and heap.)

peter: this "scanning the C stack" is a mystery to me
peter: I had no idea C was reflective like this
jorendorff: Oh, it's not :)
peter: ok good because I never heard of such things!
jorendorff: Happy to explain.  :)
jorendorff: Or try. :)
jorendorff: But you need to understand a little about how C works in practice.
jorendorff: What do you know about "the stack"?
peter: not much I guess
peter: I know about regular C dynamic memory use
jorendorff: Well, when your program runs, and main() gets called,
jorendorff: say you've got some local variables in there.
jorendorff: C has to store them somewhere.  What happens is
jorendorff: ...C has some memory, maybe 64K bytes, set aside
jorendorff: ...for local variables.
peter: yep
peter: set aside by the OS, correct?
jorendorff: eh, not exactly
jorendorff: C asks the OS for 64K bytes, the OS says sure, here you go,
peter: right
peter: ok
jorendorff: ...and C uses those for the stack.
jorendorff: But the OS doesn't know what C is doing with them.
peter: ok

Here when I talk about “C” doing things, I really mean “the C language runtime”. This is code that the C linker automatically links into every C/C++ program. It’s called libc on Unix. On Windows, they call it the CRT.

jorendorff: So.  C remembers how much of the stack is in use.
jorendorff: (It has a machine register pointing to "the top of the stack")
jorendorff: (...on intel x86, this is ESP)
peter: C's malloc asks for memory from this 64kb and manages it
peter: making sure things that need to be in even alignment, are
peter: and so on
jorendorff: Well, the stack is separate from all malloc'd memory.
peter: ahh ok
jorendorff: Local variables don't go in malloc'd memory.
peter: gotcha
jorendorff: C actually sets up both the stack and the malloc() heap
jorendorff: before main even runs.
peter: ok
peter: and the stack is the automanaged part for locals
jorendorff: exactly
peter: thats the part everyone likes :)
jorendorff: right.
jorendorff: OK.  So when main() calls another function,
jorendorff: ...say it calls printf()
jorendorff: C does all this automatically
jorendorff: ...*before* any code in printf() actually executes:
jorendorff: (1) stores a "return address"
jorendorff: so it knows where to pick up again when printf() eventually returns
jorendorff: (2) sets aside some stack space for printf's local variables.
jorendorff: I guess (2) actually happens in some code within printf
jorendorff: ...that is generated automatically by the compiler.
jorendorff: Then printf() runs.
peter: ok
jorendorff: When printf() returns, all those local variables go away.
jorendorff: My point is--
jorendorff: can you see what the data structure for this is like?
jorendorff: "The stack" is just a contiguous chunk of memory.
jorendorff: main()'s locals are at the bottom.
peter: sure
jorendorff: If main() calls another function, those locals are adjacent
jorendorff: and so on
peter: makes sense
jorendorff: it's actually a "stack" in the data structures sense.
jorendorff: ok, what this means is
jorendorff: all local variables, both for the current function
jorendorff: and all its callers,
jorendorff: are in this contiguous chunk of memory.
jorendorff: For each platform (meaning, OS + CPU + compiler)
jorendorff: ...there's a way to find out where that chunk of memory is.
jorendorff: Tamarin does that,
jorendorff: then it scans all that memory during GC
jorendorff: to see where the locals point to.
peter: there is some C header can be included that allows access to all this?
jorendorff: peter: not standard C, no
peter: no but some header
jorendorff: yeah, look in the tamarin source :)

This magic lives in a macro, MMGC_GET_STACK_EXTENTS, defined in the header MMgc/GC.h. As you can see, there’s a separate implementation for each platform.

At any given moment, some locals might be in CPU registers and not on the stack. To cope with this, the macro uses a few lines of assembly code to dump the contents of all the registers onto the stack. That way MMgc can just scan the stack and it’ll see all local variables.

peter: I have been thinking there would be ways to fool this stack scan
jorendorff: peter: Yes, there would be.  :)
jorendorff: Can you give me an example, though?
peter: if a local pointer points to  dynamically allocated bit of memory
peter: that points to other dynamically allocated bits
peter: that should not be garbage collected
peter: and there have been a bunch of type casts along the way
jorendorff: Oh, type casts don't affect MMgc at all.
jorendorff: MMgc doesn't look at C/C++ code.
jorendorff: It's looking at raw memory.
peter: ahh right
peter: just inference by bit patterns?
jorendorff: yep
peter: and it follows from local pointers to malloc()ed memory,
peter: and follows the pointers in that memory to other memory, and so on?
jorendorff: yes, in principle -- but
jorendorff: MMgc has its own allocator
jorendorff: that you must use instead of malloc().
peter: ahh
peter: ok
jorendorff: Since it knows all the internal data structures
jorendorff: used by that allocator,
jorendorff: when it follows pointers
jorendorff: it knows the size of every region
jorendorff: and can scan each reachable region for further pointers.

In addition, MMgc does enough bookkeeping that it knows exactly which parts of memory are GC-managed. So it can’t be fooled into following a pointer into bad memory (such as NULL or a pointer to a page of memory that has been returned to the operating system, either of which would immediately crash the process). The same bookkeeping information helps MMgc avoid mistakenly writing to non-GC-managed memory. (This matters because it does write to GC-managed memory, just like any other mark-and-sweep garbage collector.)

There are other ways to fool MMgc. Simply using malloc() is enough —MMgc never scans memory allocated using malloc(). It’s up to the programmer to avoid fooling MMgc. This is not such a heavy burden in practice.

peter: fascinating stuff
peter: who's the genius that thought up all this stuff?
jorendorff: Some Lisp hacker in prehistory :)
jorendorff: :)
jorendorff: Lisp was the first language with GC
jorendorff: you can look up garbage collection online
jorendorff: It's a whole field of computer science.
jorendorff: It *is* fascinating.
peter: do you know the proper comp sci terminology
peter: for this stack-scan style of GC?
peter: or is it just part of the collective knowledge?
jorendorff: Um,
jorendorff: What MMgc does is called "conservative GC"
jorendorff: which means (a) it doesn't really know the types of variables,
jorendorff: so it scans everything whether it's really a pointer variable or not
peter: ok I've seen that written in a few places recently
jorendorff: and usually (b) it treats all the likely places as "roots"
jorendorff: including the stack
peter: The root of the issue is I had no idea C could analyze it's own stack
jorendorff: heh
jorendorff: C can do anything ;)

Indeed it can—which is why, despite their many deep, fundamental problems, C and C++ remain the weapons of choice for operating system and VM designers.

5 Responses to Conservative stack scanning 101

  1. > C can do anything

    Not quite (please correct me if I’m wrong). That’s where the “conservative” part comes in. Since C don’t really know what is on the stack or in the heap (i.e. what data types), it just errs on the safe side and assumes everything (well maybe only the n-byte aligned?) is pointers. So even if it sees a string it will treat it as a pointer and if the characters of that string can be interpreted as a pointer to a block of memory that the allocator have allocated, that block will be marked as referenced. But (a) this probably doesn’t happen very often in practice, since the character sequence (or the integer or floating point number) must be interpreted as a pointer that point exactly to the start of an allocated block and (b) even if it did, the worst thing that could happen is that a block of memory wouldn’t be collected immediately.

    But this also means that another way of fooling the GC is by instead of keeping a pointer the the start of an allocated block, holding a pointer to the second byte (or any other place relative to the allocated block). While your code might know how to access the block based on the offset into the block, the GC probably isn’t conservative enough to recognize the “disguised” reference and will collect and free the referenced block in error.

  2. Anders,

    Yes, errors of both kinds (incorrectly marking something that isn’t really reachable, and incorrectly collecting something that is) are possible. Even C can’t create a rock it can’t lift, or (which is the moral equivalent) a GC that it can’t fool.

    For another way to fool MMgc, open MMgc/GC.h and search for GCHiddenPointer. :)

    MMgc actually does detect the “disguised” reference in the specific case you describe. However, see bug 388070.

  3. That’s really quite clever. Thanks for the dissection!

    Has this been ported to Solaris yet? The SPARC ABI uses a nifty sliding-register scheme for function argument passing, I’d like to see how THAT works.

  4. Someone has! Specifically, Leon Sha. See bug 391333. The patch for Solaris is attached.

    I hate to disappoint, but it looks pretty easy:

    #ifdef MMGC_SPARC
    #define FLUSHWIN() asm(“ta 3″);
    #define FLUSHWIN()

  5. I just found your site while looking at an old IOCCC entry your wrote called adventure. Then I saw this entry.

    I love this stuff.

    I was actually looking you up in reference to that IOCCC entry. If you would please e-mail me.