Reproducible research – teaching the “how”

The following is a guest post by Sophie Kay, a computational biologist at the University of Oxford. She is the founding director of the Open Science Training Initiative and also promotes open science, open data and open education at a range of conferences and speaker events. You can follow the progress of her work via her blog, The Stilettoed Mathematician, and her Twitter feed.

Here’s a question for you: what does a research user look like? In truth, it’s almost impossible to say. The highly connected, interdisciplinary nature of modern science can result in our work being applied to diverse, sometimes unexpected, areas.

It’s therefore vital that we maximise the utility of our work through adept communication of what I call a coherent research story: that is, code, figures, data and written reports, together creating a unified whole. Coupled with the open science movement, this holds vast potential for tackling the problem of non-reproducibility, a hot topic in scientific research. Begley and Ellis highlighted this very issue back in March 2012 and called for a change in working culture, wider release of data and greater publication of negative results (I’ve spoken about this before on opensource.com).

That non-reproducibility is thwarting progress should not come as a surprise. The metrics we use to evaluate our researchers emphasise what, where and how much has been published, creating a working culture which heavily incentivises the role of research producer, with little or no requirement for the researcher to account for the utility of their work.

My own approach to tackling the problem centres on the fact that we simply don’t train our researchers to consciously identify with the role of the research user. Much of our education culture is highly results-driven and focuses almost exclusively on student-as-producer. Although this has logical roots in the need to test students’ skills and understanding, it leaves young academics ill-prepared for the user-producer symbiosis of the research world.

There’s no quick fix and a remedy will, most likely, take several forms. We need to: develop a community-wide understanding of non-reproducibility, its causes and effects; nurture a working culture that rewards reproducible research, whether through extending the scope of our metrics or via validation studies, as in the Reproducibility Initiative; and finally, train our researchers to deliver reproducible research.

Much of my own work in science education aims to address this final objective. Young scientists don’t just need to acquire the technical skills and knowledge required to deliver that coherent research story; they must actively learn to identify and answer the needs of the end user throughout their research.

My recent workshop at SpotOn London 2013 provided a quick-fire and enjoyable introduction to this idea. Groups were challenged to recreate a microscope model in Lego, in only 35 minutes and working from deliberately flawed instructions. By the end of the session, they had to identify exactly how those instructions failed to account for the needs of the end user. More details are available via the wrap-up blog and session footage.

The Lego-based workshop is actually an offshoot of a broader scheme called the Open Science Training Initiative (OSTI). Originally piloted at the University of Oxford in January 2013, OSTI provides a teaching pattern, mini-lectures and workshop exercises which convert an existing subject-specific course into one which also fosters reproducibility in scientific research. A novel teaching structure called rotation-based learning helps participants appreciate the balance between the roles of the user and producer. Each group works on a given problem, before passing their research story on to a successor group who critique and develop the inherited work. By assessing students on the utility as well as the quality of their work, we can shift the goalposts in favour of reproducibility. OSTI students don’t just learn about the ethos of open science: they’re taught how to deliver open outputs, with skills ranging from code, content and data licensing to version control, and from data management to open access publication. And they do this through hands-on experience within the context of their own research.

The pilot scheme proved an encouraging whirlwind of development, discussion and creativity on the part of the students. The rotation structure was met with enthusiasm; students produced higher-quality work and benefited from feedback from users of their work – a rare opportunity within academia. I was delighted when the University of Oxford recognised OSTI with an OxTalent Award in June 2013 (see their analysis on the LTG blog) and I hope to see the initiative go from strength to strength.

Right now I’m keen to establish an ongoing efficacy study at one or more host institutions: it’s really important to determine the long-term benefits of the OSTI approach and assess its adaptability for other settings. In fact the Lego incarnation of OSTI proved so successful that I’ve been asked to run a longer session at the upcoming SpotOn hackday (details TBA) – and this one will use the full OSTI rotation structure. I have high hopes that it’ll prove to be a fun, dynamic and memorable experience!

Whatever part OSTI has to play in the bigger scheme of things, I can’t wait to see us establish a culture where the research user is every bit as familiar a face as the research producer.