Launching a second pilot study of code review in science

We are launching a second pilot study of code review in science in February, and invite you to take part. The main findings of our first study were:

  1. everyone involved thought it could be useful, but
  2. a drive-by review after work is “finished” isn’t.

This time, we’re pairing experienced mentors with small groups of scientists who are ready to start giving code review a try. The mentor will do a few reviews at the outset to get things rolling, but after that will act as a coach to help the scientists learn how to do reviews themselves.

Our goal is to find out whether code review will have the same benefits in science as it does in open source and commercial software development. More importantly, though, we want to see whether integrating code review into the research cycle will spur scientists to work in more open, more collaborative ways in general. Once researchers are used to reading one another’s code, will they be more likely to re-use it as well? Will adoption of code review encourage them to use more open tools in writing their papers, and help them see how to make data more reusable as well?

If you would like to take part (either as a mentor or as a team being mentored), please get in touch – we’d be grateful for your help. You can also catch up on our first pilot in this wrap-up post.

Software Carpentry Week in Review 6-12 January 2014

The purpose of computing is insight, not numbers (1961). 

The purpose of computing numbers is not yet in sight (1973).

– Richard Hamming

Bootcamps

Julia Gustavsen and Ross Dickson instructed a two-day Python-focused bootcamp at the University of New Brunswick. There are some excellent BioPython and Git notes on the Etherpad.

On the other side of the Atlantic, Stephen Eglen, Laurent Gatto, and the tireless Aleksandra Pawlik instructed a two-day R-focused bootcamp at the Centre for Mathematical Sciences, University of Cambridge. The bootcamp repository is here.

Not to be outdone, Jonah Duckles, Molly Gibson, Adina Howe, and William Trimble instructed a double-language bootcamp at Iowa State that featured both Python and R training.

We have several bootcamps in progress this week and next, but we’d like to note that the January 27-28 bootcamp at Indiana University is being pushed back several months due to a scheduling conflict.

Lesson Material

Justin Kitzes added some clarifications on how we’re using the master and gh-pages branches. Thanks to Trevor King for his help in getting this merged. You can see the updated CONTRIBUTING guidelines here.

Thanks to Raniere Silva, who has made a number of improvements and contributions across the repository.

Conversations

The conversation about writing and publishing papers in a webby world continued last week, with thoughtful insights from a number of contributors. We’ll be wrapping up that discussion with a summary blog post this week.

Blog

On the blog, Paul Wilson posted that Philip Guo’s Two Cultures conversation is about more than just tools. In refocusing the discussion on the users, he stated:

We need to understand that today’s new graduates students have always lived in a world in which computers were tools for the masses and not specialized tools for science and engineering. Many of us, lived through a time when nearly every use of a computer required some understanding of how it worked.

Amy Brown noted that a position is available with the University College London Research Software Development Team. The recruit will design, extend, refactor and maintain scientific software across all subject areas. Check out the advertisement if this sounds interesting to you.

Greg Wilson shared one of his favorite examples of a concept map, “Interaction of Patterns and Antipatterns”, from Release It!, by Michael Nygard.

Release It

Greg also provided a detailed answer in response to a request for more information about mental models in computer science. Included as well is a reference to the vicious circle in computer science education research:

  • K-12 schools don’t offer programming classes.
  • So there’s no incentive for teachers to specialize in computing.
  • So there aren’t programs to train people how to teach computing.
  • So there aren’t enough teachers to reliably staff classes.
  • So schools don’t offer classes.

Last, but not least, the Software Carpentry paper, Best Practices for Scientific Computing, has been published in PLOS Biology. If you have not had a chance to read any of the preprints, now is your chance to view the manuscript in its final form. Congratulations to Dhavide Aruliah, Titus Brown, Neil Chue Hong, Matt Davis, Tommy Guy, Steve Haddock, Katy Huff, Ian Mitchell, Mark Plumbley, Ben Waugh, Ethan White, Greg Wilson, and Paul Wilson.

Have a good week!

Send bootcamp reports, questions, suggestions for quotes, and other updates to aron@ahmadia.net.

The latest from the open science community

We held our monthly community call last week, and as always we asked our community members to post links to their latest projects, events, and discoveries to the call’s etherpad. Thanks to everyone who shared! Here’s what you wanted the world to know about:

Upcoming Events

On 16th Jan 2014 WriteLaTeX is organising an event to celebrate the New Year and the exciting changes in science and science publishing at the British Library, London. Ping @writelatex or @drhammersley if you’d like to find out more. The event also marks the launch of the Rich Text version of writeLaTeX. (From John Hammersley)

Announcements and Updates (and Pre-Announcements!)

The British Library and Cottage Labs are running a competition called Visualising Research: bringing public data to life. Entrants must use publicly accessible data from UK research council projects to “produce images that will show how … public funding contributes to research in the UK”.

A 2015 workshop on “Python in Astronomy” is in the works. We would like to figure out how to include participation from non-astro Open Sci/etc: contact P. Barmby.

Several members of our community wrote a paper entitled “Best Practices for Scientific Computing”. The paper discusses research-based best practices for scientific coding, which are key to doing science on the web. The paper was recently published in PLOS Biology.

To be formally announced soon: F1000Research is now accepting papers about science communication (in addition to our existing coverage of the life sciences), which includes any papers/commentaries/reviews related to Open Science or Open Access. Throughout 2014, papers on scicomm topics will not be charged an article-processing charge, so we welcome any formal write-ups of any of these projects. (From Eva Amsen)

Tools and Projects

ScienceGist: Science gists are simplified summaries of scientific papers. Our goal is to bring science closer to everyone. Lots of new things are coming to ScienceGist in 2014: a browser extension so that simple summaries are accessible from anywhere, support for the Open Annotation standard so that you can export them to any other annotation service through our API, and alt metrics for written gists so that our contributors know what their impact is. We’re also looking for brilliant developers (yes, you!) who are passionate about scientific communication: Contact info@sciencegist.com.

InterdisciplinaryProgramming.com, a project by Bill Mills (TRIUMF) and Angelina Fabbro (Mozilla), is a service to match professional software developers with scientists seeking code mentorship. Since the last call, backend development has been completed and they are on track for a Q1 launch of the full service. Sign up for the news mailing list at the website.

CANARIE provides a repository of software services for research that are tested and monitored for reliability. Most are open source and community contributions are welcome.

CodersCrowd is a site aimed at making code review a concrete issue for scientists, and is soliciting advice/commentary.

OpenArticleGauge is a service by Cottage Labs which determines the license for journal articles. (Development of the tool is supported by PLoS.) OpenArticleGauge is getting a polish this month — if anybody wants to have a chat about scholarly licensing, just mail Emanuil Tolev (OAG dev). Any feedback will really help to make this into a useful tool for the open science community! All OpenArticleGauge code is on Github (good place for feedback too).

Articles and other reading matter

The AAS Working Group on Astronomical Software (WGAS) and the ASCL held a session at AAS earlier this week called “Astrophysics Code Sharing II: The Sequel”. It was standing room only and included some Python folks. (From Daniel S. Katz.)

With respect to the discussion about versioning in the last call, here is a description of how a common resource versioning pattern used on the web works with the Memento “web time travel” protocol from Herbert Van de Sompel. He also shared slides from a presentation on Hiberlink he did at CNI in December. The presentation is about reference rot in scholarly communication, and is related to linking to dynamic scholarly resources.

Nature Genetics published an editorial statement on code sharing: Credit for Code.

Many thanks to everyone who contributed, and we hope you’ll join us for our next community call on February 13!

What’s in store for 2014

As we dive into planning for the new year, I wanted to take a moment to reflect on how far we’ve come over the last few months, and give you a peek at our plans for 2014.

Since our launch in mid-June, we’ve been busy building a vision of how Mozilla can help support the open research community by addressing gaps where they exist, growing the community of practitioners, and helping projects that show what science on the web is and can do. Back in October, we stopped to reflect on what we’d been up to since our launch. In that post, we described how we had spoken with over 3,000 educators, developers, startups, and publishers around the world to find out how we could help. We also started to test our model through community building, technical work to test new models for doing research on the web, and extending programs like Software Carpentry to teach researchers the technical skills they need to do more open, collaborative, efficient science.

Here’s a look at where we’ve come, and where we’re heading in 2014.

1. We’ve taught the web to over 4,000 researchers (and counting).

Since November 2012, we’ve run over a hundred bootcamps in North America, Europe, Australia, the Middle East, and Africa. We’ve reached over 4,200 students, and we’re on track to beat those numbers in 2014. (We’ve even had some instructors battle the polar vortex last week – now *that’s* dedication!)

We graduated a whole new crop of volunteer instructors, bringing the total who are badged and certified to 103. To help with handling the logistics, we recently added another bootcamp facilitator to help Amy (welcome, Arliss!).

Moving forward: This spring, we’ll run our largest bootcamp yet at PyCon 2014 (for over 250 participants), hold our second bootcamp for women in science and engineering, and make a concerted effort to reach into Africa and South America.

We’re also working to make it even easier to join the community. Our biggest bottleneck now is the number of instructors, so we will run our first in-person “train the trainer” event in Toronto this April to explore ways to accelerate instructor training. We’ll continue running the 12-week online course as well, but we hope the “crash course” will be more accessible, and help strengthen ties between instructors.

Finally, we’re continuing work on our basic and intermediate curriculum and exploring new topic areas, such as data science and web programming. We’ll also launch an “affiliates” program for people who want to build their instructional material on top of ours — please keep an eye on the blog for announcements.

2. We’ve launched (and tested) new projects.

Systemic change is hard, especially when you’re working within a system that’s been in place since the 1600s. But that doesn’t mean we shouldn’t try, and we’re working on the problem from a few different angles. From our first code review pilot with PLOS to our recently announced “code as a research object” project with Github and figshare, we’re building and testing new approaches to some of the thorniest issues plaguing research. Our primary aim is to connect the pieces we already have, so that open science looks less like an archipelago and more like a continent.

Moving forward: We’ll continue our work with Github, figshare and the community on citing and preserving code as a research object. As a part of that, we’ll look at best practice in open research for code, and test our work with publishers and the research community.

We’ll also continue the conversation about code review in science, digging deeper into the issues raised in our research about the way scientists engage with one another (and their code) on the web. In particular, we want to know what we can learn from how open source projects do code review, and how we can use that as a starting point for a deeper discussion about scientific collaboration in a web-based world.

3. We’ve started a global conversation.

Our main aim for the Science Lab is to serve as connective tissue for a number of communities that are involved in science on the web. We’ve broken conference call technology (a metric of success when it comes to Mozilla) for each of our community calls, and had incredible turnout for our online discussions, monthly community calls, email lists, and in-person events (like MozFest).

But there are still groups around the world that should be part of the discussion but are not for various reasons. We want to do a better job of extending our discussions to new disciplines, new audiences, and new locations.

Moving forward: Over the course of the next year, we’ll be testing out ways to better engage (and in some cases, catalyze) these communities. We will explicitly link the open science community with Software Carpentry’s instructors and participants, and model ways for bootcamp participants to both continue learning new skills, and also contribute their thoughts and work more widely.

If you’d like to get involved, please get in touch. In particular, if you think you have a piece that we could help connect to the larger puzzle, please let us know — we’re here to help.

We couldn’t be more excited about our plans for the new year, made possible in part thanks to our team (Amy, Arliss and Greg), our contributors and instructors, and the community. Many thanks to all who’ve helped us teach, learn, and build these last few months. Here’s to even more fun in 2014!

Building better user testing for Webmaker

TLDR version:

  • We want to work with you on user testing for Webmaker. Building a more regular, agile and community-powered process to gather and act on feedback from users.
  • The goal: make Webmaker.org work better for the people we most want to serve. Especially lead users: teachers, informal educators / Hive members, and techies interested in teaching.
  • It’s more than just “user testing.” It’s about bringing our community closer into the design and build process — to co-build, share ownership, and take Webmaker’s user experience to the next level.

How to get involved

  1. Attend our prototype user testing event in Toronto. We’re hosting an experimental user testing event on Jan 24 in the Mozilla Toronto community space. If you’re a teacher, informal educator or Hive member, please come and test, hack and play with us!
    • More events will follow. This one’s just a prototype we can learn from, document, and hopefully spread everywhere.
  2. Help build a new Webmaker User Testing Kit. So that other community members can host their own user testing sessions.
  3. Sign up for updates and discussion on the Webmaker newsgroup. To discuss next steps, share thoughts, help edit and localize the new kit, etc.

Accessing content on the web: extending the Open Access Button

The following is a guest post by Victor Ng, a services engineer at Mozilla, on his work with a project called the “Open Access Button”. At the Science Lab, we’re keen to see how we can move science forward by building and extending existing open tools and projects on the web. Separate from his work at Mozilla, he became interested in the open science space after being diagnosed with a rare medical condition, quickly becoming frustrated by his inability to access the scholarly literature he needed to understand his condition. Here’s the story of his work taking an open source project and adapting it to give others better access to content on the web. You can follow him on Twitter at @crankycoder or check out his blog for more.

—-

The Open Access Button

The Open Access Button project started in June 2013 to help people see who was being denied access to scientific publications because of paywalls. The idea was simple: every day, people who are looking for important scientific information can’t get it because there’s no publicly-available version. Instead, the search for an article ends all too often in a paywall, even if the work itself was paid for by public money.

With OAButton, a person who runs into a paywall can report the problem by clicking on the bookmarklet. The OAButton then tries to find relevant papers, and also collects information from the user so that we can map how often this frustration is happening globally. By making the problem visible (see image above), we wanted to spark a discussion about how to give the general public better access to cutting-edge research. We also wanted to show that this isn’t just an academic problem.

The clusters above are people looking for medical research for themselves, or their friends and family. They’re students looking for research when they are in school, parents concerned about pollution, or simply people who are curious and would like to learn more. In all those cases, there is no obvious way for them to find the information they want.

Today’s OAButton is a good start. But we can do better. Although the final copy of a paper may not be available, there are many other places where draft or preprint copies of the paper can be found. Sites like arxiv.org house hundreds of thousands of preprints, and at many institutions like Harvard, there is a mandate to self-archive the final version of a paper in a university repository. Perhaps most importantly, many universities have ‘green’ open access policies under which researchers may freely distribute their manuscripts and published works to their colleagues, separate from the journal.

Right now, there is no fabric that connects all these pieces together. I’ve therefore started on a little hack on top of the OAButton to do so.

Every paper that is published is associated with a unique digital object identifier, or DOI. Every DOI is also associated with a webpage, and that page contains an author email address. My hack allows authors to email the OAButton with a DOI and a URL to a publicly available version of their paper. This may be a link to the library where the paper has been self-deposited, or a direct link to the author’s own personal copy of the paper.

The next time a person goes to access the blocked paper, the OAButton can display the author-submitted version of the paper. This gets more science into more people’s hands, and gives authors more readers than they’d otherwise have. It’s a win for everyone.
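The core of that idea can be sketched in a few lines of Python. This is only an illustration and not the OAButton's actual code: the `OpenCopyRegistry` class, the `normalize_doi` helper, and the example DOI and URL are all hypothetical names made up for this sketch, and it assumes the registry is a simple in-memory mapping from a DOI to one author-submitted link.

```python
from typing import Dict, Optional


def normalize_doi(raw: str) -> str:
    """Strip common prefixes and lowercase the DOI (DOI names are case-insensitive)."""
    doi = raw.strip()
    for prefix in ("https://doi.org/", "http://dx.doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    return doi.lower()


class OpenCopyRegistry:
    """Hypothetical in-memory store of author-submitted open copies, keyed by DOI."""

    def __init__(self) -> None:
        self._links: Dict[str, str] = {}

    def register(self, doi: str, url: str) -> None:
        """Record the publicly available URL an author sent in for a DOI."""
        self._links[normalize_doi(doi)] = url

    def lookup(self, doi: str) -> Optional[str]:
        """Return the author-submitted URL for a DOI, or None if nobody has sent one."""
        return self._links.get(normalize_doi(doi))


# Made-up DOI and repository URL, for illustration only.
registry = OpenCopyRegistry()
registry.register("doi:10.1234/Example.5678", "https://repository.example.edu/paper.pdf")
print(registry.lookup("https://doi.org/10.1234/example.5678"))
```

Normalizing the DOI before storing and before looking it up means the same paper is found whether the reader arrives with a bare DOI, a `doi:` prefix, or a full resolver link. A real service would also need to verify that the email really came from the paper's author, which this sketch leaves out.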

My hack isn’t complete: we need some help to get it finished, and even more than that, feedback on how we can use the OAButton to open up science a little more. You can follow along and hack with us here in the Mozilla Science repository, or get in touch with the Science Lab directly.

Victor will be joining us on our next community call on January 9 (call-in details here). Join us to hear more about this work, ask questions, and find out how you can get involved. We’d love to hear your thoughts. Have a question? Add it to the etherpad!

Software Carpentry Weeks in Review 16 December 2013 – 5 January 2014

This review post covers the last three weeks of activity for Software Carpentry.

One shot of Software Carpentry is not necessarily enough… –Phillip Fowler

Lesson Material

Christina Koch has been reorganizing the default install instructions for Software Carpentry bootcamps to make them clearer. Take a look at the proposed changes here, and drop by https://github.com/swcarpentry/bc/pull/211 with any suggestions for improvement.

Ethan White has continued his work on intermediate lesson material, focusing on Python content. Here’s a preview of one of the new lessons! Please drop by https://github.com/swcarpentry/bc/pull/209 to leave feedback.

Karthik Ram, John Blischak, and Gavin Simpson have been working on a complete set of R materials for an R-themed bootcamp, based on original work by Karthik. Please direct any editing feedback to Karthik’s Issue Tracker.

Website

Abigail Cabunoc redesigned the Software Carpentry website over the last several weeks—check out the new front page.

Bernhard Konrad has added an independent instructor map in https://github.com/swcarpentry/site/pull/274, and Greg has added some other enhancements, including country flag images. You can see the results online, but the resulting page is too pretty not to share:

Instructors and Countries

Blog

Phillip Fowler posted a study of the impact of the Software Carpentry bootcamp held at Oxford University one year ago. In addition to some clear visualizations, Phillip looks at what tools and techniques students are still using, and what they think now of what they learned then.

We’re very pleased to announce that Andromeda Yelton will be coming to Toronto in mid-January to help teach a bootcamp for librarians. Her advice on how to do this is online, along with her reflections on what she’s learned herself. There’s lots of good stuff in both, and we’re looking forward to lots of new ideas.

Now that Software Carpentry has wrapped up its sixth round of instructor training, we’ve written a summary of enrolment and completion rates. If you’re looking to take part in instructor training, please mail us.

On his blog, Mark Guzdial considers how to gain interest and hold attention in computer science education. Greg Wilson adds Software Carpentry’s perspective in Catch and Hold. How do we catch the attention of scientists by showing them we can help them solve their problems? And how do we continue to reach out to underrepresented groups within science and computing?

Software Carpentry now has a CafePress store with lots of ways to show your support, including Software Carpentry-themed glassware.

We are excited to welcome Arliss Collins to the team. Based in Toronto, she will be working on bootcamp management, infrastructure development and maintenance, and communications.

We’ll return to our normal weekly schedule next week. Welcome to SWC 2014!

Join us for our first community call of 2014! (January 9, 11 ET)

Our first community call of 2014 will take place this Thursday, January 9. The call is open to the public and will start at 11 am ET. Call in details can be found on the call etherpad (where you can also find notes and the agenda) and on the wiki. (If you have trouble with the toll-free number, try one of the numbers at the bottom of this post.)

The Science Lab meeting is our community call, taking place on the second Thursday of every month, highlighting what we’re up to as well as work of the community relevant to science and the web. Join us to hear more about current projects, ways you can get involved, and hear from others about their work in and around open science.

This month, we’ll be following up on one of our recent guest posts by Sophie (Kershaw) Kay from the Open Science Training Initiative on teaching reproducible science. She’ll be walking us through some of her work in introducing graduate students to the many facets of reproducible science, and telling us about their plans moving forward. It’s a fascinating course – do tune in.

We’ll also be hearing from one of our Mozilla colleagues, Victor Ng, who’s been hacking on the Open Access Button. Stay tuned for more this week on his work in extending the project to help surface archived copies of otherwise paywalled research.

Also have a look at some of our wrap-up posts from last month’s information-rich call. Check out recent news in the community, refresh your memory about some of Ed Lazowska’s talk on the new Moore/Sloan funded data science centers or delve into some of the issues for our new project with Github and figshare around code as a research object. (Many thanks to Amy Brown for helping us distill those!)

Have a project, idea or blog post you’d like to share relevant to open science? Add it to the etherpad (see line 86). It’s a great way to share what you’re working on and/or interested in with the community. Don’t be shy. Have a look at last month’s notes for an idea of what others contributed to the conversation.

Mark your calendars, and help us spread the word. Our first two calls hit record participation (and stretched the limits of open software solutions). Let’s see if we can drum up the same turnout, and be sure to join us a few minutes before 11 ET to secure a spot on the line. For call-in details and links to the etherpad, visit our wiki page. We hope you’ll join us.


Note: Our last couple of community calls have been so well-attended that we overstretched the capabilities of our conference call system! If you have trouble accessing the toll-free conference call number, try one of these numbers. (Note that they are toll calls and you’ll be charged by your telephone company if the number is long-distance.)

After you enter the extension, you’ll be asked for the conference ID, which is 7677.

  • US/California/Mountain View: +1 650 903 0800, extension 92
  • US/California/San Francisco: +1 415 762 5700, extension 92
  • US/Oregon/Portland: +1 971 544 8000, extension 92
  • CA/Vancouver: +1 778 785 1540, extension 92
  • CA/Toronto: +1 416 848 3114, extension 92
  • UK/London: +44 (0)207 855 3000, extension 92
  • FR/Paris: +33 1 44 79 34 80, extension 92

What should we teach about publishing on the web?

There’s a discussion going on among a number of Software Carpentry instructors over on GitHub about scientific communication in the digital age. The question posed was, “What should we teach about writing/publishing papers in a webby world?”

A bit of background for those new to the blog: Software Carpentry is a program of the Science Lab, the aim of which is to teach researchers the computing skills that most of them don’t get as undergrads. The program currently has over a hundred volunteer instructors, almost all of whom are working scientists. These are the folks leading the way in changing practice on the ground level in universities, introducing efficiency, reproducible research and what it means to work in the open to students all around the world.

Currently, Software Carpentry’s two-day bootcamps focus on skills like working with the shell, version control, programming in Python or R, and managing data with SQL. These serve as the jumping-off point for bigger ideas like automating repetitive tasks, collaborating through the web, sharing code that might actually be doing the right thing, and creating data that other people can query and use.

But what about the last ten yards?  What can we do to make the link between these tools and ideas on one hand, and advances in scholarly publishing and communication on the other?  Delving into the meat of the thread, the main themes are:

– Discussion around the published work as an end product itself (citation tools, authoring, licensing)

– Discussion around the process (reproducibility, what “working in the open” means when it comes to publishing)

I’d love to hear more from those who have been engaged in the discussions around open access/content, scholarly publishing, and new forms of communication on this issue. There has been a tremendous amount of work done in recent years to explore new modes of publication (e.g., epub, micropublication, data papers), authoring tools, workflow tools, and how we package the final product of research to be maximally reusable. But those discussions and approaches aren’t yet joined up in something that someone new to the subject can understand and start using with just an hour’s training.

What can we teach that ties these components together and lends itself to research that’s easier to access, build upon, and reuse? We’d love to hear your thoughts on the thread.

Update From the Home Stretch

It’s been just over a month since Mozilla launched our year-end fundraising campaign, and we only have four days before it comes to an end.

U.S. non-profits in general should see a gradual increase in daily revenue as the deadline (December 31st) gets closer. The reason? U.S. donors have a growing sense of urgency to donate before December 31st in order to receive the tax deduction for charitable gifts. To date, about 50% of donations to Mozilla are coming from the U.S. Like many non-profits, we are leveraging this sense of urgency among donors by emphasizing the deadline across channels in the last days of the campaign.

We want to see daily revenue trend lines going up and to the right. Let’s look at our channels.

Trends - homepage 20131227

Mozilla.org homepage. Average daily revenue through Mozilla.org is increasing as expected. There are details in my earlier blog post about homepage donors and optimizing revenue from that channel. They’re our most engaged donors and are most likely to be experiencing the campaign through multiple channels (email, snippet, social). Creating this “echo chamber” is very strategic. The repetition of our message increases the likelihood a potential donor will become an actual donor. We’re purposefully turning up the volume across all our channels the closer we get to December 31st.

Trends - social 20131227

Social networks. Though a relatively small share of our total, daily revenue through Twitter and Facebook is also trending higher in recent days. (Check out that outlier donation in early December.)

Trends - Email 2-131227

Email. By design, we timed our email appeals to send as close to December 31st as possible. We are sending a total of six emails between December 16th and December 31st:

  1. 12/16: Message #1 to Mozilla email list (820,000 recipients)
  2. 12/23: Message #2 to Mozilla email list (non-donors only)
  3. 12/26: Message #3 to Mozilla email list (non-donors)
  4. 12/30: Message #4 to Mozilla email list (non-donors)
  5. 12/30: Message to Firefox & You newsletter (3.7 million recipients; localized English, German, French, Portuguese)
  6. 12/31: Message #5 to Mozilla email list (non-donors)

You can see revenue spikes on days emails launch, as you’d expect. Generally the bulk of donations will come in within the first 24 hours of sending an email appeal, and a trickle of late donations will continue beyond the 24-hr mark.

Overall, Mozilla’s fundraising campaign is progressing as expected, with upward-rightward trends… that is, with one big exception: the snippet. No other organization has the snippet; it is its own unique animal. It is also the source of 86% of the $750,000 raised to date.
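To put that share in dollar terms, here is a rough back-of-the-envelope calculation. It treats the 86% and $750,000 figures quoted above as exact, which is an assumption (both are rounded, and the total is only "to date"), so the results are approximate.

```python
from typing import Tuple

# Figures quoted in the post: total raised to date and the snippet's share.
TOTAL_RAISED_USD = 750_000
SNIPPET_SHARE_PERCENT = 86


def split_revenue(total_usd: int, share_percent: int) -> Tuple[int, int]:
    """Return (snippet_usd, other_usd), using integer arithmetic to avoid float rounding."""
    snippet_usd = total_usd * share_percent // 100
    return snippet_usd, total_usd - snippet_usd


snippet_usd, other_usd = split_revenue(TOTAL_RAISED_USD, SNIPPET_SHARE_PERCENT)
print(f"snippet: ${snippet_usd:,}")        # snippet: $645,000
print(f"other channels: ${other_usd:,}")   # other channels: $105,000
```

In other words, roughly $645,000 has come through the snippet and roughly $105,000 through the homepage, social, and email channels combined.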

snippet - all locales20131227

I’ll write a future post about the peaks and valleys we see in this graph, mostly due to our snippet testing. We’ve gained a lot of new knowledge about the snippet during the campaign.

Excluding snippet revenue, trends are going in the right direction:

Trends - nonSnippet 20131227