Open Science goes to University

Almost 200 years ago, an old favorite of mine, Michael Faraday, started the Royal Institution Christmas Lectures as a way to introduce a novel topic in science to a public underserved by science education: a very early swing at opening up the academy a little. The open science movement faces a similar problem today: how can we equip undergrads and grad students with the code and data skills they need to do open research, skills that are often missing from their curricula?

Shauna Gordon-McKeon and I have been discussing lately what would be involved in introducing the ideas, skills and practices of open science to university students in an event like her successful Open Source Comes to Campus series. More on this forthcoming event as it develops; thinking along those lines today, I started wondering what open science education would look like as part of university curriculum and life.

There are plenty of questions to answer here, not the least of which is, ‘exactly what do we mean when we say we want to teach open science?’. This bears plenty of discussion, but we can at least start from the simple goal of enabling students to create reusable research objects, meant to be shared publicly and collaborated on, supporting automated and reproducible analyses.

These are pretty bedrock open science values, but calling out even these limited basics begins to illustrate the constellation of things I’d like us to bring to students in a course offering:

  • Basic coding, database & version control skills, like those taught in the array of coding workshops currently out there, are the foundations of automated and reproducible analyses.
  • High-level coding skills are needed to augment basic programming in order to build a community of collaboration and sharing. For example: collaboration requires trust, and strong trust is based on proof. How do you prove code is trustworthy? Write unit tests. Or another: as collaborators come to a project, how do you maintain a high standard of quality? A system of formal code review needs to be implemented. The list goes on; these are things that must be taught in order to actualize the collaborative goals of open science.
  • Communication skills are the unsung heroes of open science. Code and data distributed freely on the web that no one can make any sense of are open in name only; all you can do is touch them, and not much else. Newcomers have to be able to get a working understanding of the research object in a reasonable (i.e., very short) amount of time, or it isn’t actually going to get reused. Skills like writing good documentation, clearly planning project scope and structure, and data hygiene are key to open science working in practice.
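
To make the unit-testing point above a little more concrete, here is a minimal sketch in Python of what "proof" can look like for a collaborator. The function `fraction_above` and its tests are hypothetical, invented purely for illustration; they are not taken from any particular course or project.

```python
import unittest


def fraction_above(values, threshold):
    """Return the fraction of measurements strictly above `threshold`.

    Hypothetical analysis helper, used only to illustrate unit testing.
    """
    if not values:
        raise ValueError("values must not be empty")
    return sum(1 for v in values if v > threshold) / len(values)


class TestFractionAbove(unittest.TestCase):
    def test_typical_values(self):
        # Two of the four measurements (3.0 and 4.0) exceed 2.5.
        self.assertAlmostEqual(fraction_above([1.0, 2.0, 3.0, 4.0], 2.5), 0.5)

    def test_threshold_is_exclusive(self):
        # Values equal to the threshold do not count as "above" it.
        self.assertEqual(fraction_above([2.0, 2.0], 2.0), 0.0)

    def test_empty_input_is_rejected(self):
        # An empty data set is an error, not a silent zero.
        with self.assertRaises(ValueError):
            fraction_above([], 1.0)
```

Saved in a file, these tests can be run with `python -m unittest`. A newcomer who sees them pass has some evidence the function does what its documentation says, and the tests themselves double as small usage examples.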

Once we have a curriculum sorted out, how are we going to deliver it?

  • In a for-credit course? This gives us lots of time to discuss everything we ever wanted, but can be nearly impossible to wedge into already bursting degree programs.
  • As part of an existing course? This can be a lot easier to pull off with the help of a professor who’s on board, but risks having the new ideas marginalized by the traditional content of the course.
  • As a directed studies course? Many programs have some form of supervised, for-credit project course; this could be a fantastic vehicle for students to gain experience and practice skills, but may be a bit light on instructional time, depending on how the course is designed.

There are tons more details to both these areas of consideration, and a whole bunch more things I haven’t even touched on (for instance: what can we do outside the classroom but within the university community to foster open science practices?). But before I continue much further down my merry path of musings, I’d like to turn it over to you: how would you design and deliver a course or unit on open science to undergrads or grad students? Join the brainstorm in this etherpad!

A number of you have contacted me recently about upcoming plans to teach open science in your university courses – here’s an opportunity to share our plans and envision what this could look like together. I hope you’ll join us!

Image Credit: By Alexander Blaikley (1816 – 1903) [Public domain], via Wikimedia Commons

6 responses

  1. Radhouane Aniba wrote on :

    This is a subject that is very close to my heart; thanks, Bill, for starting it.

    I think undergrads are the deep layer that you want to reach, but first you must start with the upper layers: the people who teach programming at universities and schools.

    I spent some time in academia, and I can certify that code quality and reproducibility are, absolutely, things that do not matter at all, unfortunately.

    Besides CS classes, students are offered rotation projects in order, in theory, to improve their skills. I say “in theory” because I’ve seen so many undergrads turned into extra hands for the PIs in order to get things ‘done’. I have personally heard, so many times, something like: “No, I don’t care what programming language you’re using, and I don’t care if it is OO, and I don’t care if you share it or not; I don’t have enough time to think about that because I am too busy writing a grant right now.”

    This is reality.

    On a more philosophical note, the likelihood that the idea presented in this article succeeds is, in my opinion, higher in countries where education costs less. I would expect greater success for this idea in Cape Town or Tunis than in Boston or California, because if you think about it, it is like teaching people who pay > $40K a year that science and information should be free, shared and easily reproducible; it is a bit contradictory on a certain level (my 2 cents).

    1. Bill Mills wrote on :

      Yes! I had those experiences in academia, too. But I also encountered, from time to time, faculty and staff scientists that genuinely wanted to try something different (like, for example, yourself); university libraries actively looking to offer cutting-edge skills and services to users; graduate students driven mad by the hurdles bad computing put in their way; and legions of undergrads genuinely eager and receptive to learn something new. There is great energy for change out there, particularly among graduate students; these people could have considerable impact in a first cut.

      The plenty who don’t care won’t get in the way, for just that reason – they’re too busy with the next funding cycle to concern themselves. And the power structures that leverage those high tuition fees are so far removed from a unit on data hygiene (or whatever), I think the determining factors will have more to do with enthusiasm and daring. It won’t work everywhere all at once, and there will be detractors; but all we need now is to see what we can make fly in a few places, then start expanding from there.

      If you have professor friends in Cape Town and Tunis, please – make the introductions!

      1. Radhouane Aniba wrote on :

        Thanks Bill, I totally agree; I just wanted to account for those people because it is important to see the big picture with all the ‘actors’. I like your enthusiasm. I believe this is something that has to be done in a bootstrap mode, through hacking events, competitions, guided projects... spreading the culture of OS and reproducibility among students so that it becomes a standard. I think this is something that goes along the same way as our previous discussions, and I really like this.

  2. Kubke wrote on :

    Thanks Bill – great summary.

    We just had a short session at University of Auckland to look at how to embed some of this literacy right before students go into joining research groups. Although we did hand pick the participants, it was nice to see big names within the faculty keen to see this happen and willing to put their time into it. There were a few interesting topics that emerged during the session.

    One to highlight was some reflection over what was already in place (and perhaps not working as well as we’d like) – which is the “computer science for biologists” type of thing. My impression from the discussion is that CS teachers don’t really care about the biology that much and the biologists do not want to become computer scientists. There seems to be a mismatch in motivation which is exacerbated by a lack of common language. I think there is an argument that what we need to do is to teach the basic principles that students will need in order to know when something is good and when something is not – and either be able to learn how to fix it, or at least be able to identify where they can find a solution to the problem (i.e., be able to talk to the CS people in their language). The resources for building the technical skills are there – what is lacking is a framework of principles that go beyond the specific tools.

    I think we will make a lot of progress as we move to the second round of workshops. We are trying at this point in time to be “tool-agnostic” but to think a bit more about what a student who is literate in open science would look like – or rather, what behaviours we expect them to exhibit. These, in essence, we should be able to define quite clearly – we can then move to asking what specific activity is best suited to provide the learning/assessment.

    The skills that you list above (1. Basic coding, database & version control skills, 2. High-level coding skills, 3. Communication skills) could be rethought as “what does having those good skills look like” (i.e., the principles), and then we could use student activities to “practice and become familiar” with those principles in action (i.e., acquire some skill level). Competency in those skills takes time to develop. One thing that was discussed in the workshop here was the need to think of all of this as a continuum. There will always be people better and worse than us – so what is the minimum point on the continuum we need our students to be at before they join a lab, and how do we point them to the right opportunities to continue to advance along that continuum as their careers progress? I think this is essential – and it is where there may be more difficulty in reaching consensus when you bring people from different disciplines into the conversation – what a CS person sees as a very, very basic skill, an anatomist like me might see as a “way more than you’ll ever need” skill level.

    For a tertiary course that consensus is essential (it was nice to see the group discussing this) because it provides the needed justification to get the course approved and funded and the ability to map it against the graduate profile. MozFest (and Billy Meinke) were a great motivator and provided a great starting point to get this started at UoA. I think this is as possible in Auckland and Boston as it is in Cape Town or Tunis.

    1. Bill Mills wrote on :

      Thanks, @kubke!

      The mismatch you identify between CS departments and everyone else is spot on and a very widespread challenge (actually, it echoes exactly the problems physics and engineering departments face when they rely on math departments for math education). The CS departments contain the purists, with a view of their field that focuses on the depth and sophistication necessary to push at its edges – but what most scientists could benefit from is aptitude nearer the base. That’s why Software Carpentry pursues scientists as instructors, rather than developers (though all are welcome) – we need people to carry these ideas to their own communities so they emphasize and represent the priorities of those communities first.

      Fundamentally, I believe we face at least as big a communication challenge as we do a technical one. In order for these ideas to be meaningful and compelling, we must create opportunities for students to see these ideas through the lens of their own field of study. Sending them to be lectured to on a technical topic they didn’t realize was important to them, by someone outside their field who will struggle to help them relate to it, suffers from a huge communication barrier brought about by mismatched priorities and an unclear value proposition. There is an identity problem here – presenting ideas in a way that feels external or alien is a very hard place to make a compelling argument from; I would rather find a way to present skills and principles from a place internal to the learner’s existing identity. I agree that focusing on the framework of principles you mention is absolutely the right first step here, since those principles can transcend fields. My current thinking is that the next step is to deliver those principles from the inside of the fields we’re speaking to – introduce them to ecologists (or whoever) as practical principles for doing ecology, for example. The work you and Billy Meinke did at MozFest was a great example of starting this process of identifying how people are engaging with these principles and ideas, from perspectives that were natural to them and resonant with their own identities.

      Developing competency over time in these skills is again exactly right. No skill is acquired without practice – the trick is to incentivize that practice, and make that effort feasible in the lives of students who already have many responsibilities. Integrating these ideas into the coursework they have already committed to makes it much less of an additional demand on students. The other, more bottom-up part of this strategy is to offer students opportunities to work on projects over time in social settings that (hopefully) will be appealing – study groups that support each other as they work towards a common goal, or informal meet-ups where students can bring questions and share experiences.

      1. Kubke wrote on :

        Geez – you can really articulate the problem – I wish I had that skill :)
        I think there is an opportunity at UoA to begin to solve the language problem, at least for biology/biomedical (which I think may be a tough crowd compared to physics). There is a critical mass of senior people who use and depend on digital literacy, so the incentive is there and we should be able to bridge the communication gap. There are also enough CS-y people who are keen on helping, and a strong drive to see the sciences exploit the infrastructure that was built by government. I don’t think I would have started thinking about this the way I do had I not had the opportunity of attending a Software Carpentry bootcamp, so tick that box as objective achieved :)

        The way I see the development of this literacy moving forward at U Auckland is to provide a “software carpentry-inspired” set of lectures (with the “principles” meat that we need to map against the graduate profile) and an assessment structure that is based on specific projects that are led by “biologists” in conjunction with CS-types. Because the students would by then have a clear idea of what they’d like to do for research, these projects can provide the “authentic learning space” for both the students and supervisors, while also helping bridge the language/communication barrier between CS and other disciplines that can exploit CS expertise. In essence, it is your 3rd option (directed studies course, which each supervisor can claim for their department) but with a critical number of them so that we can all then collaborate on delivering lectures on themes that go across disciplines/projects.

        An issue with integrating this too early is the overhead of professional development of the current teaching staff and the way that $ is allocated across departments within an institution to justify staff dedication to teach for other departments. This is one reason why that middle point at the end of undergrad, beginning of postgrad is easier to tackle because there is more flexibility around course building exploiting expertise from different sectors within the institution. Once the value of these competencies can be demonstrated, then the barriers for the professional development or re-design of the curriculum will be lower (I hope).

        Cameron McLean has been putting up the UoAuckland stuff within the MozFest GitHub repo – and I am now about to go in there to try to integrate with the MozSciLab Open Research Map.

        As an aside – I am really grateful to you guys for helping me get to London – it was a great motivator and inspiration to work with you all and Billy Meinke to get things rolling here at UoAuckland.