What should we teach about publishing on the web?

There’s a discussion going on among a number of Software Carpentry instructors over on GitHub about scientific communication in the digital age. The question posed was, “What should we teach about writing/publishing papers in a webby world?”

A bit of background for those new to the blog: Software Carpentry is a program of the Science Lab, the aim of which is to teach researchers the computing skills that most of them don’t get as undergrads. The program currently has over a hundred volunteer instructors, almost all of whom are working scientists. These are the folks leading the way in changing practice on the ground level in universities, introducing students all around the world to efficiency, reproducible research, and what it means to work in the open.

Currently, Software Carpentry’s two-day bootcamps focus on skills like working with the shell, version control, programming in Python or R, and managing data with SQL. These serve as the jumping-off point for bigger ideas like automating repetitive tasks, collaborating through the web, sharing code that might actually be doing the right thing, and creating data that other people can query and use.

But what about the last ten yards? What can we do to make the link between these tools and ideas on one hand, and advances in scholarly publishing and communication on the other? The main themes of the thread so far are:

– Discussion around the published work as an end product itself (citation tools, authoring, licensing)

– Discussion around the process (reproducibility, what “working in the open” means when it comes to publishing)

I’d love to hear more from those who have been engaged in the discussions around open access/content, scholarly publishing, and new forms of communication on this issue. There has been a tremendous amount of work done in recent years to explore new modes of publication (e.g., epub, micropublication, data papers), authoring tools, workflow tools, and how we package the final product of research to be maximally reusable. But those discussions and approaches aren’t yet joined up in something that someone new to the subject can understand and start using with just an hour’s training.

What can we teach that ties these components together and lends itself to research that’s easier to access, build upon, and reuse? We’d love to hear your thoughts on the thread.