Files are considered fragmented when they aren’t laid out in a continuous chunk on disk. This causes extra seeks even if the file is being read sequentially.
I was discussing startup over dinner, someone asked about how much of an issue fragmentation is in Firefox.
Early on I decided to pretend that fragmentation does not exist as we had bigger fish to fry. We were opening too many files on startup, effectively causing our own high-level fragmentation. Luckily, that problem should be mostly solved in Firefox 4 once omnijar and fat xul bugs land (unfortunately, extensions can cause similar issues until we stick em into a single file).
To measure fragmentation I used my SystemTap script to get a list of files opened (one could also use strace or any similar tool) and piped the results to filefrag. Filefrag is a Linux fragmentation-measuring utility. On Windows one can use contig and Mac OS X features hfsdebug. I’m using ext4 on Linux.
My top offenders were:
places.sqlite: 34 extents
cookies.sqlite: 18 extents
XPC.mfasl: 11 extents
Cache/_CACHE_003_: 11 extents
urlclassifier3.sqlite: 6 extents
Cache/_CACHE_002_: 6 extents
Cache/_CACHE_001_: 6 extents
XUL.mfasl: 5 extents
formhistory.sqlite: 5 extents
content-prefs.sqlite: 4 extents
libxul.so: 4 extents
signons.sqlite: 3 extents
icon-theme.cache: 2 extents
libatk-1.0.so.0.2809.1: 2 extents
libflashplayer.so: 2 extents
Cache/_CACHE_MAP_: 2 extents
I did an informal poll of my friends and it seems that the order of fragmentation is similar among them, only the magnitude differs. For example, XFS tends to be 10-20 times more fragmented than ext4 :). I don’t have any numbers for HFS+, but I suspect XFS takes the crown as most fragmentation-prone filesystem to run Firefox.
Interestingly, my friend running NTFS reported similar fragmentation to ext4. That was disappointing as Windows Prefetch supposedly defragments files used with the first 10 seconds of startup. Clearly, isn’t keeping up in this case.
places.sqlite is the largest and most performance-critical file in Firefox. It contains browser history and bookmarks. It is the brains behind the AwesomeBar.The fact that it is severely affected by fragmentation significantly impacts Firefox responsiveness. There are no easy fixes for fragmentation there. mak suggested moving history to a separate file to mitigate this, but that isn’t an easy change.
In contrast, cookies.sqlite is tiny(<1mb for me) and probably so fragmented due to cookie expiration. I am guessing that easiest workaround here is to write a new sqlite file every time there is a mass update to the file.
urlclassifier.sqlite is a large file that may be mitigated similarly to cookies.
SQLite 3.7.0 came out today which features WAL logging, which may reduce fragmentation (or make battling it easier). In general, sqlite’s VACUUM (used to clean and compact the database) command does not help with fragmentation, we really need to be doing something like hot backup which would create a new database file every VACUUM.
Our cache code is ancient and sucks. The cache files get fragmented immediately and severely. They are accessed in insane patterns and they get laid out insanely on disk. There are some efforts to improve the code, but I suspect that’s equivalent to putting lipstick on a pig.
*.mfasl files are due to be obsoleted by a startup cache jar. It may get less fragmented. Should be a straight-forward fix it if it does get fragmented.
I’m disappointed to see the .so files get fragmented. This might be an ext4 bug or has something to do with how the updater works (both ours and yum on Fedora).
I would like to see more data on fragmentation on Windows/OSX. Feel free to leave a comment with fragmentation numbers for cache, mfasl, sqlite and .dll files in your Firefox. We should look into online defragmentation APIs in modern OSes.
Easiest way to fix fragmented files is to make a copy of the original file, delete the original and then rename the copy. This works on sane filesystems, apparently it doesn’t work too well on OS X.