openSNP brings genetic data to everyone | #mozsprint 2016

openSNP (pronounced “open snip”) lets you share your personal genetic data from DNA testing services like 23andMe, so scientists can discover genetic links to traits like diseases and you can read the latest research to better understand your genome and how it affects you.

Bastian Greshake, Helge Rausch and Philipp Bayer make up the core team building openSNP. This diverse group includes a PhD student in Frankfurt, a web developer in Berlin and a researcher in Australia. They came together through a shared interest in advancing open science by making data freely available to both researchers and laypeople.

We interviewed the team behind openSNP to learn more about the project and how you can help Bastian, Helge, Philipp and many others during our upcoming Global Sprint, June 2-3.


What is openSNP and why did you start it?

Generating a personal genetic report (aka getting genotyped) became available to the public around 2011. Genotyping looks at large parts of your genome for genetic variation. Small changes at a single point are called Single Nucleotide Polymorphisms, SNPs for short (pronounced “snips”). These tests look at what’s different in your genome compared to the standard human genome. Some of these changes, or SNPs, may cause differences in your phenotypes, meaning any “observable characteristic” like your blood pressure, your hair color, or your medical conditions. This makes genotypings medically useful: some variations in your genome may be linked to an increased risk of conditions such as heart disease.
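To make the idea of comparing your genome against a reference concrete, here is a minimal Python sketch. It assumes a raw export laid out as tab-separated lines of rsid, chromosome, position and genotype, with `#` comment lines and `--` for positions that could not be read. The rsids, the `REFERENCE` table and the function names are all made up for illustration; this is not openSNP's own code.

```python
import io

# Hypothetical raw export (layout is an assumption): '#' comment lines,
# then tab-separated rsid, chromosome, position, genotype.
RAW = """# rsid\tchromosome\tposition\tgenotype
rs0000001\t1\t1000\tAA
rs0000002\t1\t2000\tAG
rs0000003\t2\t3000\tCC
rs0000004\tX\t4000\t--
"""

# Made-up reference alleles, for illustration only.
REFERENCE = {"rs0000001": "A", "rs0000002": "A", "rs0000003": "T"}


def parse_genotypes(handle):
    """Return {rsid: genotype}, skipping comments, blanks and no-calls."""
    calls = {}
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        rsid, _chrom, _pos, genotype = line.split("\t")
        if genotype != "--":  # '--' marks a position that was not called
            calls[rsid] = genotype
    return calls


def variant_snps(calls, reference):
    """Return sorted rsids where at least one allele differs from the reference."""
    return sorted(
        rsid
        for rsid, genotype in calls.items()
        if rsid in reference and any(a != reference[rsid] for a in genotype)
    )


calls = parse_genotypes(io.StringIO(RAW))
print(variant_snps(calls, REFERENCE))  # ['rs0000002', 'rs0000003']
```

Real exports contain hundreds of thousands of SNPs, and a difference from the reference says nothing on its own about what the variant does.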

23andMe is one of the biggest companies selling genotypings to consumers and holds a wealth of medical information, but you can’t directly access this data since it’s not publicly available. While 23andMe does some research with users who explicitly opt in, the closed nature of the data still hinders research.

To alleviate this situation, some customers of 23andMe and similar companies made their genotypings publicly accessible. For example, Bastian uploaded his 23andMe data to GitHub, but there was no nice way to annotate the data with phenotypes at the time, and open genotypings from others were scattered all over the Internet with no central place collecting them.

Without phenotypes the genotyping data isn’t very useful: you have the genetic variations, but you don’t know what they do in the body. Researchers who want to link genetic variations to various traits need to know the phenotypes – you can’t link the moon to the tides if you’ve never measured the tides.
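As a rough illustration of why the phenotypes matter, the Python sketch below cross-tabulates genotypes at a single hypothetical SNP against a self-reported trait. The users, the `eye_color` field and the `tabulate` helper are invented for this example. Real association studies need far more samples and proper statistical tests, so treat this as a toy, not a method.

```python
from collections import Counter

# Made-up users: a genotype at one hypothetical SNP plus a self-reported
# phenotype. None means the user never entered the phenotype.
USERS = [
    {"genotype": "AA", "eye_color": "brown"},
    {"genotype": "AG", "eye_color": "brown"},
    {"genotype": "GG", "eye_color": "blue"},
    {"genotype": "GG", "eye_color": "blue"},
    {"genotype": "AG", "eye_color": None},
]


def tabulate(users, phenotype):
    """Count (genotype, phenotype value) pairs, skipping missing phenotypes."""
    table = Counter()
    for user in users:
        value = user.get(phenotype)
        if value is None:
            continue  # no phenotype means nothing to link the genotype to
        table[(user["genotype"], value)] += 1
    return table


table = tabulate(USERS, "eye_color")
for (genotype, value), count in sorted(table.items()):
    print(genotype, value, count)
```

The fifth user has a genotype but no phenotype, so that record drops out of the table entirely, which is exactly the problem with genotype-only uploads.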

In 2011 Mendeley and PLoS started a competition called Binary Battle to promote their APIs providing access to the latest research around the individual SNPs. We made a new website that combined the user-supplied genotypings with information about SNPs: openSNP was born.

Initially we thought maybe a few hundred people would upload their genotyping, but we’re now at more than 2500 genotypings!

How does openSNP help individuals?

openSNP provides the latest research around each SNP to help you understand what the changes in your genome could mean in terms of symptoms or other characteristics.

Many people use the data-mining side of openSNP to learn more about their genomes and how their SNPs are linked to certain symptoms, especially those who are either already familiar with human genetics or potentially have a genetic disease. The latter group tries to find others with similar genetic variations and symptoms, which can be useful for rare symptoms. Sometimes volunteers run analyses on our users’ data, providing further insight to our users on what they can learn from their genome.

How has the scientific community responded to openSNP? How is it helping them?

We’ve been contacted by medical researchers who wanted to add their phenotypes of interest to our system, or by social scientists who wanted to do surveys and interviews, researching the effects of genomic sharing. We’ve usually added new phenotypes for them, or included them in our newsletter so that users could reach out to them. The user feedback to those surveys is usually pretty good.

openSNP data is also used in teaching. There are bio-ethics courses discussing openSNP, hands-on human genetics classes using the data as well as MOOCs that put the data to good use. We’re aware of at least one project that used openSNP data in experimental music.

Our publication in PLOS about the platform is getting cited more and more over time.

What problems have you run into while making openSNP?

When we started working on openSNP the biggest problem was getting into programming in the first place. We had no real background in programming prior to the project, so that was a bit hard. Having a concrete goal we wanted to achieve really helped. Another problem we face to this day is how a community-driven project located completely outside traditional academia can still play in the scientific space. It’s not only the culture that’s different but also the policies surrounding publications, which can make it hard to join forces.

What kind of skills do I need to help you build openSNP?

Thanks for the interest! If you want to join us for a chat to find out what’s going on behind the scenes of openSNP, you can find us most of the time in our Gitter channel. We’re located in different time zones so there’s usually someone online.

We can use all sorts of help, whether you’re a programmer, a designer or just someone with a better command of English than we have (none of us are native speakers!).

Our user interface could benefit from lots of love to make it responsive and more intuitive, allowing a more diverse crowd to use openSNP easily. Our site could also benefit from copy-editing. Of course there are always new features that could be developed.

For the programmers: The code hosted on GitHub is a Ruby on Rails application that talks to a Redis and a PostgreSQL server, so it’s primarily web design, JavaScript and Ruby (on Rails), with a bit of “big data” database design due to the large number of SNPs that come with each genotyping.

Can I contribute financially to openSNP?

Yes. Running the platform requires quite a bit of computing power by now. It’s distributed over a set of machines to keep everything at acceptable speeds. If you want to help us pay for that, you can either tip us through Gratipay or become a contributor through Patreon.

What are you hoping to do at the Mozilla Science Global Sprint, June 2-3? Can others help you here?

In general, if you are looking to help us out (thanks for that!): we have an introduction for people new to the project, along with some project ideas, a code of conduct, and a long-term roadmap.

We have several open issues on GitHub that we hope to tackle during the Global Sprint:

There are a few big topics right now. First of all, our user interface is broken in some places, so people with CSS/JavaScript skills are much appreciated, as we’re kind of amateurs when it comes to UI in general. ;-)

Secondly, there are our commenting and messaging systems. These go way back to 2011 when we started the website, and they haven’t aged well. Unifying the commenting and messaging systems and making them more accessible are high on our priority list.

Bonus Question: Where is your Research Fox sticker?

Philipp: In the mail! [Editor’s note: yes, it’s on its way!]

Bastian: Mine is on my paper notebook which I carry everywhere.


Come join us wherever you are June 2-3 at the Mozilla Science Global Sprint to work on openSNP and pick up your own Research Fox! Have your own project or want to host a site? Submissions are open for projects and site hosts.