What do you get when you mix one part automation, one part natural language interpretation, two parts programming by demonstration, and three parts online collaboration? If you stir all of these research areas together and toss in some XUL, you get one of the most innovative extensions for Firefox: CoScripter.
CoScripter was created by a research team at IBM led by Allen Cypher, and it allows you to record your actions on the Web, play them back, and share them with others. For instance, one popular script quickly automates the process of adding your phone number to the National Do Not Call Registry.
A video demonstrating how CoScripter works is available on IBM’s alphaWorks site. In the video, they automate the process of searching for houses in Palo Alto. Instead of bookmarking individual static pages on the Web, this is like bookmarking a series of actions, with online sharing and tagging built in.
In addition to being really useful, CoScripter is also very interesting from a research perspective. One of its most innovative aspects is that actions are represented as human-readable, editable text. In their CHI 2007 paper, the CoScripter team puts the tool into the context of previous Firefox extensions:
Chickenfoot eases client-side customization by providing a higher-level API for accessing and manipulating common web page elements, using information in the rendered DOM. For example, the Chickenfoot instruction click('search button') will click a button with the text “search” on it. However, the Chickenfoot interface is still very much a programming interface, in which users write syntactically correct statements in the Chickenfoot programming language.
Unlike Greasemonkey and Chickenfoot, CoScripter does not require users to know how to program. CoScripter expresses commands in natural language, as opposed to a formal scripting syntax. This means that you can literally edit the textual instructions and play the script again, or even drop in instructions written by hand and see if CoScripter is able to execute them. Because CoScripter’s interpreter is extremely flexible, this actually works surprisingly well. They call this approach Sloppy Programming:
…Koala [former name] leverages the sloppy programming approach in the web domain by taking advantage of the fact that most web commands are flat: there is one verb, and one or two arguments. This assumption dramatically simplifies the algorithm, and makes it more robust to extraneous words. It can handle long expressions originally intended for humans.
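To make the "one verb, one or two arguments" idea concrete, here is a toy sketch in Python. It is not CoScripter's actual interpreter: the verb table, the Element type, the scoring rule, and the convention that a double-quoted string is the value to type are all assumptions made up for illustration. It only shows why a flat command structure lets an interpreter ignore extraneous words.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical verb vocabulary: each verb maps to the element kinds it can act on.
VERBS = {
    "click": {"button", "link"},
    "enter": {"textbox"},
    "select": {"listbox"},
}


@dataclass(frozen=True)
class Element:
    """A simplified stand-in for a labeled control on a web page."""
    kind: str
    label: str


def words(text: str) -> list:
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def interpret(instruction: str, elements: list) -> Optional[tuple]:
    """Map a loosely worded instruction to (verb, element, value), or None."""
    # Assumption: a double-quoted string, if present, is the value to type.
    quoted = re.search(r'"([^"]*)"', instruction)
    value = quoted.group(1) if quoted else None
    tokens = words(re.sub(r'"[^"]*"', " ", instruction))

    # Flat command: the first known verb wins, other words are just hints.
    verb = next((t for t in tokens if t in VERBS), None)
    if verb is None:
        return None

    token_set = set(tokens)
    best, best_score = None, 0
    for element in elements:
        if element.kind not in VERBS[verb]:
            continue
        overlap = len(token_set & set(words(element.label)))
        if overlap == 0:
            continue
        # Label overlap dominates; naming the element kind breaks ties.
        score = 2 * overlap + (1 if element.kind in token_set else 0)
        if score > best_score:
            best, best_score = element, score

    return (verb, best, value) if best is not None else None
```

With a hypothetical page containing a "Search" button, a "First name" textbox, and an "Advanced search options" link, the instruction "Now please click the big Search button" resolves to a click on the Search button, and 'please enter "Ada" into the first name box' resolves to typing Ada into the First name textbox. Words such as "now", "please", and "big" match nothing and are simply ignored, which is the robustness the quoted passage describes.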
Some of the code that drives CoScripter is also interesting from an accessibility perspective. Imagine commanding your browser using only your voice, or tabbing through form fields on a Web page and having a screen reader accurately tell you what each element is by analyzing the surrounding text. Since the CoScripter team plans to open source their code, Mozilla’s accessibility team will be looking into leveraging their work.
In the future CoScripter might also impact how we test Firefox. Ray Kiddy recently wrote a post proposing that we allow beta testers to attach a log of their actions to a bug report, instead of having to manually write a list of steps explaining how to recreate the issue. Ray notes that in addition to helping our testers quickly communicate the steps to recreate a bug, these scripts could also eventually be used for automated testing.
Quick note about phishing: since scripts are shared between users, be careful what you run. Hopefully the social nature of CoScripter will result in the community quickly flagging and removing any malicious scripts that get submitted.
To everyone at IBM who worked on building CoScripter, congratulations on setting a new bar for the state of the art in Web browser automation.