Why it’s hard to ship non-crashy software
June 14th, 2011
I was just looking at some data produced from our crash reporting system, and I continue to be amazed at the amount of third-party code that gets loaded into Firefox on Windows. That data file contains a list of all unique binary files (EXE or DLL) that were listed in Windows crash reports in a single day. A quick look at it shows:
$ cut -f1 -d, 20110613-modulelist.txt | sort -u | wc -l
10385
There are over 10,000 unique filenames in a single day’s worth of crash reports. That sure seems like a lot! Now, certainly, a lot of these modules look like they’ve been randomly named, which probably indicates that they’re some kind of virus (like 0eYZf0QFDSGEAbTRWD3F.dll, for example), so those are likely to inflate the number. There’s a bug on file asking that we collect MD5 hashes of every DLL in our crash reports so we could more easily detect malware/virus DLLs that use these tactics, as well as integrate with lists of known malware and viruses from antivirus vendors.
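To make the hashing idea a bit more concrete, here is a minimal sketch of what content hashes buy you over filenames. It assumes a local directory of DLL files and GNU `md5sum`; the function names `hash_modules` and `duplicate_hashes` are made up for illustration and have nothing to do with how the crash reporter actually works. The point is only that a randomly named DLL still hashes to the same value as any other copy of itself.

```shell
# Hypothetical helper: print "md5  path" for every DLL under a directory,
# sorted by hash so identical files with different names end up adjacent.
hash_modules() {
    dir=$1
    find "$dir" -type f -iname '*.dll' -exec md5sum {} + | sort
}

# Hypothetical helper: print each hash that shows up under more than one
# filename, which is what randomly renamed copies of one DLL would look like.
duplicate_hashes() {
    hash_modules "$1" | cut -d' ' -f1 | uniq -d
}
```

Running `duplicate_hashes some_dir` prints one line per hash shared by two or more files, which could then be checked against a vendor's list of known-bad hashes.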
In the past, we have had problems with plugins and extensions causing crashes for many Firefox users. We have ways of mitigating those through blacklisting. We can also blacklist specific DLLs from loading in the Firefox process, which is not used as often because it’s harder to get right and provides little feedback to users about what’s been disabled. However, given the sheer number of possible things that can be loaded in our process, it’s unlikely that we’ll ever be able to block all software that causes crashes for users. This is unfortunate, because any one of these pieces of software can cause a crash in Firefox, and all the user sees is “Firefox crashed”. I suppose we now know how Microsoft feels when users blame Windows for crashes caused by faulty drivers.
June 14th, 2011 at 9:32 am
Well, we actually can’t blacklist “specific DLLs”, we can blacklist DLL names or name/version combinations – if we could blacklist by MD5 sum or so, that malware blocking stuff would be easier.
And yes, it’s crazy how much stuff we routinely load into our process, and we need better ways to inform people of ad-/malware infections.
June 14th, 2011 at 11:24 am
Perhaps you could employ some user-side security – when Mozilla ships for Windows, it has a set of known good DLLs that have MD5s available. Store the list in an encrypted/signed file. When new DLLs appear between runs, OK them with the browser’s user before loading them, and if accepted, generate an additional (or non-Mozilla) list – or perhaps keep these lists off the machine in sync space. If the user ever experiences problems, first have them delete the secondary list, or have an option to run in a protected mode where only Mozilla DLLs are loaded.
Anyway, however this checking is done, my point is that if my browser said at startup “new DLL 0eYZf0QFDSGEAbTRWD3F.dll found, DLLs contain executable code, this could be a virus, should I load it?” I think I’d say no. As an extra check, those new MD5s could be checked with a central blacklist server if the user asks for help deciding what to do.
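A rough sketch of that "what's new since last run" check, assuming GNU `md5sum` and a plain text file with one known-good MD5 hash per line. The function name `new_modules` and the file layout are hypothetical, and nothing here is signed or encrypted, so it only illustrates the comparison step.

```shell
# Hypothetical check: print the path of every DLL under "$dir" whose MD5
# hash does not appear in the known-good list (one hash per line).
new_modules() {
    dir=$1
    known=$2
    find "$dir" -type f -iname '*.dll' -exec md5sum {} + |
    while read -r hash path; do
        # -x matches the whole line, -F treats the hash as a fixed string
        grep -qxF "$hash" "$known" || printf '%s\n' "$path"
    done
}
```

Each path it prints is a candidate for the "should I load it?" prompt, or for a lookup against a central blacklist server.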
June 14th, 2011 at 11:25 am
“if we could blacklist by MD5 sum or so, that malware blocking stuff would be easier.”
The moment you do that, they switch to another technique. Not sure there’s any benefit.
I do wish we could tell users about it though. People think I’m being rude when I point out to them “almost every problem people come to me about turns out not to be Firefox’s fault”. The only time they’re not offended is when they’re convinced they’re superior to the average user. Then those people are even more upset when they find out they’re not – after investigation.
People are downright incredulous at the idea that Firefox could not be the source of these problems. Much like they are with Windows. I feel we could do with a bit more naming and shaming. Better than going the Apple (*cough* /AMO *cough*) route.
June 14th, 2011 at 12:21 pm
Note that all the blacklisting techniques are only aimed at misbehaving software, like poorly written plugins and extensions. If we started trying to lock things down further we would be inserting ourselves directly into a war with malware/virus authors, and that is not a war we can win. Any protection we can add client-side can be broken by code that’s already running on the user’s machine.
June 14th, 2011 at 12:43 pm
Isn’t that a good reason to work on the Electrolysis project, where it might be easier to have separate binaries that do not contain any other code?
Or is that just wishful thinking?
June 14th, 2011 at 1:57 pm
If a Windows machine is compromised, you’re pretty much out of options, but then that can’t really be blamed on the browser. However, surrendering because you know you’re in a war that you can’t win seems defeatist. At the very least you can put up a few extra hurdles to have a higher barrier of entry, and knock down the field of participants to the more determined malware writers.
June 15th, 2011 at 8:01 pm
@Mark Richardson: At the very least you can put up a few extra hurdles to have a higher barrier of entry, and knock down the field of participants to the more determined malware writers.
Maybe. But that’s also how it has come to pass that hospitals are fighting an uphill battle against the most infectious and antibiotic-resistant germs in the world (google “nosocomial” if you don’t see what I mean).
June 16th, 2011 at 12:40 am
Sad that everything’s moving towards hating on loading binary bits in the app (see dropping of any binary API compatibility, repeated declarations of them being deprecated from people working on the basic XPCOM layer, etc. for examples). Instead of making it easier to write non-crashy software, it’s just harder to write anything that works at all… so everything ends up being ugly hacks that are way more fragile.
June 27th, 2011 at 8:28 am
Realistically, third-party binary code will *always* be harder to support and more likely to cause crashes. That’s just life: C++ doesn’t guarantee memory safety. Getting people to write more of their code in JS is the best way to improve our stability. That being said, for many use cases js-ctypes is way easier than writing a binary XPCOM component anyway. You get the usefulness of interacting with C code without all the horrible boilerplate of XPCOM getting in your way.