pdf.js reached its first milestone

Last Friday, pdf.js reached the state we wanted it to be in before announcing it loudly: it renders the Tracemonkey paper perfectly*. So, we’re announcing it!

Try out version 0.2.

We’re very excited about the progress since the cat was let out of the bag two weeks ago. Below is a comparison of some pages as rendered by the version of pdf.js initially covered by the press and our v0.2 release. In each pair of screenshots, the rendering of the older version is on top, and the rendering of 0.2 is on the bottom.

This is the most dramatic demonstration of pdf.js’s biggest feature in 0.2: loading Type 1 fonts. (In fact, the difference between the captures above should have been even more dramatic, except that we had hard-coded into pdf.js the choice of font used for most body text in the paper, so that we could more easily focus on other unimplemented features.) Dynamically loading Type 1 fonts into a web application was a big challenge. We’re trying to get Vivien to write about it; stay tuned. It’s hard to overstate how important this feature is for pdf.js.

Figure 2 on this page shows off several parts of pdf.js’s renderer:

  • The very obvious improvement in the labels on elements in the figure in 0.2 is due to pdf.js loading TrueType fonts properly.
  • The shadows under the rounded boxes are masked images, which Shaon implemented.
  • The dashed lines are drawn using a new API we’ve added to Firefox’s <canvas> and are in the process of standardizing.
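As a rough illustration of the dashed-line bullet, here is a minimal sketch of stroking a dashed segment on a 2D canvas context. It assumes the `setLineDash` method that current canvas 2D contexts provide; the post does not name the API it refers to, so treat that mapping as an assumption. `strokeDashedLine` is a hypothetical helper, not a pdf.js function.

```javascript
// Hypothetical helper (not part of pdf.js): stroke one dashed line segment.
// Assumes ctx is a 2D canvas context that supports setLineDash.
function strokeDashedLine(ctx, from, to, dashPattern) {
  ctx.save();                    // keep the caller's dash state untouched
  ctx.setLineDash(dashPattern);  // e.g. [4, 2]: 4 units on, 2 units off
  ctx.beginPath();
  ctx.moveTo(from.x, from.y);
  ctx.lineTo(to.x, to.y);
  ctx.stroke();
  ctx.restore();                 // restore the previous (usually solid) style
}
```

In a browser this could be called as `strokeDashedLine(canvas.getContext("2d"), {x: 10, y: 10}, {x: 200, y: 10}, [4, 2])`.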

Figure 4 is another dramatic demonstration of the difference made by loading Type 1 fonts and measuring them accurately.

The prettily colored, filled bars in Figure 10 are also thanks to Shaon; they’re “shading patterns” (custom, parameterized functions) that pdf.js evaluates all in JS, drawing resulting pixel values to canvas. These particular bars are “axial shading” patterns, aka “linear gradients”. The text in the description of Figure 10 also looks vastly better in 0.2, as it mixes several font faces that are now being loaded thanks to Vivien.
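To make “evaluating a shading pattern in JS” a bit more concrete, here is a minimal sketch of an axial (linear) gradient computed pixel by pixel. It is not pdf.js’s actual code: `axialShading` and its parameters are hypothetical, it interpolates between just two colours, and it simply clamps outside the axis endpoints, whereas real PDF shadings are driven by arbitrary functions.

```javascript
// Hypothetical sketch: fill an RGBA buffer with a two-colour axial gradient.
// p0/p1 are the axis endpoints in pixel coordinates; c0/c1 are [r, g, b].
function axialShading(width, height, p0, p1, c0, c1) {
  const pixels = new Uint8ClampedArray(width * height * 4);
  const dx = p1.x - p0.x;
  const dy = p1.y - p0.y;
  const lenSq = dx * dx + dy * dy;
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      // Project the pixel onto the axis to get a parameter t in [0, 1].
      let t = lenSq === 0 ? 0 : ((x - p0.x) * dx + (y - p0.y) * dy) / lenSq;
      t = Math.min(1, Math.max(0, t));
      const i = (y * width + x) * 4;
      for (let k = 0; k < 3; k++) {
        pixels[i + k] = c0[k] + (c1[k] - c0[k]) * t; // linear interpolation
      }
      pixels[i + 3] = 255; // fully opaque
    }
  }
  return pixels;
}
```

In a browser, the resulting buffer could be drawn with `ctx.putImageData(new ImageData(pixels, width, height), 0, 0)`.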

In Figure 12, we see a nice demonstration of a couple more new features in 0.2: the labels in the figure are being drawn because pdf.js now loads TrueType fonts. The hatched segments of bars in the graph are being rendered faithfully now because Shaon implemented tiled fills of patterns.

And last but not least, it’s obvious in all the screenshots above that the user interface in version 0.2 is much more usable and prettier than in the initial version. That’s the work of justindarc. This screenshot shows off a really cool new feature of pdf.js’s new viewer: a “preview panel” that pops out when the mouse hovers over the dark bar on the left side of the page. You’ll also notice the lower screenshot, from 0.2, shows the viewer straddling two pages; the first version, shown above, could only display one page at a time.

We chose the pixel-perfect rendering of this paper as our first milestone because getting there required solving some hard problems, and it’s easier to focus attention on one target. We want to prove that a competitive HTML5 PDF renderer really is feasible, and not just fun talk. Many more hard problems remain, but we haven’t come across any so far that are so much harder than what we’ve already solved to make us rethink the viability of pdf.js.

Community

pdf.js has a great and growing community. As we noted above, justindarc totally overhauled the viewer UI. notmasteryet implemented support for encrypted PDFs and embedded JPEGs (among other things). jvierek added a Web Workers backend (among other things) that will be one of the biggest features of our next milestone. sayrer has greatly improved our testing infrastructure. Everyone has done their fair share of bug fixing. The list of contributors will probably have grown between the time we write this and the time you read it, so be sure to check out the current list.

More browsers/OSes, more problems

We intend pdf.js to work in all HTML5-compliant browsers. And that, by definition, means pdf.js should work equally well on all operating systems that those browsers run on.

Reality is different. pdf.js produces different results on pretty much every element in the browser×OS matrix. We said above that pdf.js renders the Tracemonkey paper “perfectly” … if you’re running a Firefox nightly. On a Windows 7 machine where Firefox can use Direct2D and DirectWrite. If you ignore what appears to be a bug in DirectWrite’s font hinting.

The paper is rendered less well on other platforms and in older Firefoxen, and even worse in other browsers. But such is life on the bleeding edge of the web platform.

pdf.js has now reached the point where a significant portion of its issues are actually browser-rendering-engine bugs, or missing features. Finding these gaps and filling some of them has been one of the biggest returns on our investment in pdf.js so far.

What’s next?

For our next release, we have two big goals: first is to continue adding features needed to render PDFs (of course!). Our next target is a bit more ambitious: pixel-perfect rendering of the PDF 1.7 specification itself. Work has already begun on this, during the stabilization period for the 0.2 release. Second is to improve pdf.js’s architecture. This itself has two parts: use Web Workers to parallelize computationally-intensive tasks, and allow pdf.js’s main-thread computations to be interrupted to improve UI responsiveness. (Ideally the web platform would allow us to do all computationally-intensive tasks like drawing to <canvas> off the UI thread, but that’s a hard and unsolved problem.)
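One common way to make main-thread work interruptible is to split it into small slices and yield to the event loop between them. The sketch below shows that idea only; it is an assumption about the general technique, not a description of how pdf.js does or will do it, and `runInSlices` with all of its parameters is hypothetical.

```javascript
// Hypothetical sketch: run a list of small tasks in slices, yielding
// between slices so the UI thread can handle input and repaint.
//   tasks     - array of zero-argument functions
//   sliceSize - how many tasks to run before yielding
//   schedule  - function that runs a callback later, e.g. fn => setTimeout(fn, 0)
//   onDone    - optional callback invoked after the last task
function runInSlices(tasks, sliceSize, schedule, onDone) {
  let next = 0;
  function slice() {
    const end = Math.min(next + sliceSize, tasks.length);
    while (next < end) {
      tasks[next++]();
    }
    if (next < tasks.length) {
      schedule(slice); // yield, then continue with the next slice
    } else if (onDone) {
      onDone();
    }
  }
  slice();
}
```

In a page this might be used as `runInSlices(drawOps, 50, fn => setTimeout(fn, 0), () => console.log("page drawn"))`, where `drawOps` is an assumed array of drawing closures. Passing the scheduler in as a parameter keeps the helper easy to test outside a browser.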

We can keep moving fast towards rendering the PDF spec because we’re not worried about regressions, thanks mostly to sayrer’s work on testing.

Contribute!

We want pdf.js to be a community-driven and community-governed open-source project. We want to use it for Firefox, but we think there are many cool applications for it. We would love to see it embedded in other browsers or web applications; because it’s written only in standards-compliant web technologies, the code will run in any compliant browser. pdf.js is licensed under a very liberal 3-clause BSD license and we welcome external contributors. We are looking forward to your ideas or code to make pdf.js better! Take a look at our github and our wiki, talk to us on IRC in #pdfjs, and sign up for our mailing list.

Andreas Gal and Chris Jones (and the pdf.js team)

Comments (39)

  1. I really like this web-rendering idea for PDF and it’s really cool that in the end, it’s gonna improve web technologies!

    Thanks.

    Monday, July 4, 2011 at 00:21 #
  2. Abraxas wrote::

    Hello. And about the ability to modify the fields of a PDF document, like an administrative form? I know Acrobat Reader is able to do that, because some websites disable “print” and “local save”, so the only function is to fill the PDF and save it online.

    Monday, July 4, 2011 at 02:22 #
  3. pd wrote::

    Absolutely amazing. Brilliant!

    Keep up the great work. Cannot wait to see this in a Firefox release. It’s the sort of innovative kick-arse feature that Firefox should be doing in order to improve the web and differentiate itself.

    Monday, July 4, 2011 at 03:15 #
  4. poulpillusion wrote::

    Will we be able to embed PDF elements natively, just as what we can do with audio or video html5 elements ? That would be a great feature.

    Monday, July 4, 2011 at 03:23 #
  5. Robert O'Callahan wrote::

    Nice.

    The Chrome bug you linked to (http://code.google.com/p/chromium/issues/detail?id=82402) is a fairly deep known Webkit bug.

    Monday, July 4, 2011 at 04:46 #
  6. Joao Menezes wrote::

    Good! Keep it up!

    Monday, July 4, 2011 at 05:21 #
  7. Ventron wrote::

    Your milestone after that could be to render musical notation. If you print-to-file from Garageband, you’ll see what I mean :)

    Also, any plans to make printing work nice? Considering you can display:none the UI and change margins with media queries, it shouldn’t be too hard (especially since printing is one of the most common things you do with a PDF).

    Monday, July 4, 2011 at 06:10 #
  8. zob wrote::

    It flies in FF Aurora but it’s quite slow in Chrome. Wondering if it’s because of specific technical aspects or just because FF is actually faster overall.

    Works well otherwise – great progress

    Monday, July 4, 2011 at 06:20 #
  9. cjones wrote::

    Abraxas: It’s possible, but low on our list of priorities right now.

    poulpillusion: We’re planning on releasing a Firefox extension that will allow that, and (hopefully) eventually shipping that in Firefox by default.

    Ventron: Do you know about http://0xfe.blogspot.com/2010/05/music-notation-with-html5-canvas.html ? Yep, we want printing to work, but we’re not entirely sure how to go about it yet. Maybe we should talk :).

    zob: We’re using the same APIs in both Firefox and Chrome, so Firefox probably just has a faster implementation of those APIs. We haven’t done a detailed performance analysis in either browser though, so I can’t say much more than that.

    Monday, July 4, 2011 at 07:09 #
  10. Tobu wrote::

    Insta bug report, feel free to forward to GitHub: pdf.js, and particularly labels in figures, do not work too well with a minimum font size of 12pt.

    Monday, July 4, 2011 at 07:29 #
  11. Albert Astals Cid wrote::

    Are you planning to support text selection? Wonder how you plan on doing it since you seem to be exposing pages as png images (or that is what my firefox 5 reports here)

    Monday, July 4, 2011 at 07:54 #
  12. vpslist wrote::

    Once this is done, think of what else can be accomplished.

    Monday, July 4, 2011 at 08:06 #
  13. cjones wrote::

    Tobu: Which figures are displaying badly for you? pdf.js just uses the font information encoded in the PDF, it doesn’t take artistic liberty. There are some browser/platform bugs in font rendering though, which you might be seeing.

    Albert: Yes; https://wiki.mozilla.org/PDF.js#Big_project:_Text_selection will have our most up-to-date plans.

    Monday, July 4, 2011 at 08:24 #
  14. The goal is nice. It will make PDF on the web more accessible. I think Adobe needs to sponsor this project.

    Monday, July 4, 2011 at 13:40 #
  15. Markus wrote::

    Pdf.js on my FF nightly on Mac OS X 10.5 does not work at all. It displays either garbage or the result is almost completely white. What’s the reason for this?

    Monday, July 4, 2011 at 15:24 #
  16. J. McNair wrote::

    Ventron and cjones:
    I think you both want http://vexflow.com/ for PURE HTML5+JS music notation software.

    I think Ventron’s REAL problem is with the PDFs generated by Garageband. I am guessing PDF.js does not render the musical notation in those PDFs properly, yet.

    Monday, July 4, 2011 at 19:31 #
  17. cjones wrote::

    Markus: no idea. We haven’t tested in 10.5 yet; we probably should.

    Monday, July 4, 2011 at 19:43 #
  18. Alex wrote::

    I’ve been following along development of this, and it’s awesome to see what work you guys have done in such a short time.

    The test PDF renders fine on my Windows 7 box, but on my 10.7 box it doesn’t render embedded fonts (at least the monospace fonts)

    It completely falls apart on other, stranger PDF files (producing uncaught exceptions), but that’s to be expected.

    Monday, July 4, 2011 at 20:05 #
  19. F1LT3R wrote::

    Epic!

    Monday, July 4, 2011 at 22:21 #
  20. anon wrote::

    It failed to show any content of PDF of unicode charts from http://www.unicode.org/charts/ .

    The pages load, but they are blank.

    Tuesday, July 5, 2011 at 00:14 #
  21. ZeroCube wrote::

    If you set a minimum font size in the Firefox prefs (e.g. 14), the output is completely messed up, both in the old version and in 0.2 .
    I hope this can be fixed.

    Tuesday, July 5, 2011 at 02:10 #
  22. Casper Hornstrup wrote::

    Awesome. Can I donate bitcoins to this project?

    Tuesday, July 5, 2011 at 02:59 #
  23. Tobu wrote::

    @cjones, here is an example of what I meant. It is not the fonts themselves, it is the firefox feature of setting a minimum point size for text. My link explains.

    http://i.imgur.com/7Fv83.png

    Tuesday, July 5, 2011 at 04:51 #
  24. Tobu wrote::

    Link truncated, here:

    https://support.mozilla.com/en-US/kb/Accessibility#w_setting-a-minimum-font-size
    https://support.mozilla.com/en-US/kb/Text%20Zoom#w_minimum-text-size

    Tuesday, July 5, 2011 at 04:53 #
  25. David Ford wrote::

    Genius! Well done, and many thanks from those of us world-wide!

    Tuesday, July 5, 2011 at 05:47 #
  26. voracity wrote::

    I think it goes without saying that this is awesome. But I’ll say it anyway: Awesome!

    Anyway, on Windows 7, with no Direct2D/DirectWrite, things are not exactly pixel perfect and I imagine other platforms are still less up to scratch. Is this something you think can be solved in the next (say) half-year or so?

    Tuesday, July 5, 2011 at 05:52 #
  27. cjones wrote::

    anon: Thanks. I filed https://github.com/andreasgal/pdf.js/issues/186 .

    ZeroCube/Tobu: Thanks, I understand now. I filed https://github.com/andreasgal/pdf.js/issues/187 . It’s a pretty hard problem :/.

    Casper: The Mozilla Foundation (http://www.mozilla.org/) welcomes donations. But I’m not sure if it can accept bitcoins; let me ask around.

    Tuesday, July 5, 2011 at 06:22 #
  28. cjones wrote::

    voracity: Yes.

    Tuesday, July 5, 2011 at 07:18 #
  29. partypooper wrote::

    Can one edit a pdf file with this, like highlighting, underlining etc?

    Tuesday, July 5, 2011 at 15:05 #
  30. cjones wrote::

    partypooper: No, and that’s not likely something we would add to pdf.js. That ought to be easy to layer on top of pdf.js.

    Tuesday, July 5, 2011 at 16:30 #
  31. Yunier J wrote::

    Bravo Mozilla, I’m eagerly awaiting this PDF reader in HTML5 and JS. If it gets done (and I’m sure it will), it’s going to make a lot of noise. Congratulations to all the developers involved in this project.
    An open web: that is the future of the Internet.
    Thank you, Mozilla!!!

    Tuesday, July 5, 2011 at 18:27 #
  32. Guppy wrote::

    Great idea!
    Although when I try to open a local pdf (created via FF: print to pdf), only a blank page appears. Error message in error console:

    Error: Illegal character in hex string
    Source File: http://andreasgal.github.com/pdf.js/pdf.js
    Line: 22

    Wednesday, July 6, 2011 at 05:45 #
  33. Anton wrote::

    AWESOME! Thank you very much for the hard work!

    Btw., there is a really good test suite to check whether you render various properties of a PDF correctly (yes, the whole 46 MB package of “patches”):

    http://www.gwg.org/ghentoutputsuite.phtml

    Thursday, July 7, 2011 at 03:10 #
  34. Grant Galitz wrote::

    Now we wait for someone to make a flash engine (JIT one too (obligatory, yo dawg, we heard you like JITs, so we put a JIT in your JIT)) in JavaScript that does AS3, not just basic animations like Gordon.

    Friday, July 8, 2011 at 12:43 #
  35. Jon Bizri wrote::

    Hooray! The wicked witch is >almost< dead!

    Friday, July 8, 2011 at 20:44 #
  36. cjones wrote::

    Guppy: Thanks. What you’re seeing is obviously a bug; I filed https://github.com/andreasgal/pdf.js/issues/228 to investigate.

    Anton: Thanks for the link.

    Grant: You’re not the only one interested in that. Stay tuned ;).

    Saturday, July 9, 2011 at 13:39 #
  37. s-p-ripper wrote::

    Hey, how is it possible to embed a font?

    E.g. doc.text(20, 20, “Hi man”, “Consolas”); prints “Hi man” to the PDF file. Well, only if the user has the font “Consolas” installed the text is shown…

    That’s why I want to embed this font. Any ideas?

    Thursday, July 21, 2011 at 05:37 #
  38. TopIn24 wrote::

    It will improve the web technologies as well !

    Regards, Alina Erin (:

    Tuesday, September 27, 2011 at 01:51 #
  39. cjones wrote::

    s-p-ripper: Your PDF authoring tool has to embed the font into the PDF it generates.

    Wednesday, November 23, 2011 at 09:29 #