Now how about sharing that data with the rest of us so we can compare and improve health outcomes? What about your Google Maps data check-ins so we can expedite routing? How about the results of your recently sequenced DNA so we can compare molecular variant issues with like-bodied fellows?
A group of scientists within the Openhumans.org community has been sharing many of the above datasets since its launch in 2015. PCMag spoke with Executive Director Dr. Mad Price Ball and Research Chief Dr. Bastian Greshake Tzovaras about what they hope to achieve, and how we can all get involved, if we’re feeling brave and transparent.
Dr. Ball, firstly tell us what Open Humans is and how it works.
[MPB] It’s a platform that allows you to upload, connect, and privately store your personal data, such as genetic, activity, or social media data. Once you’ve added data, you can donate it: you might choose to share some publicly, and you can join and contribute to diverse research projects, turning the traditional research pipeline on its head. You are at the center and in control of when you share your data.
How are you funded?
[MPB] Open Humans is a program of the 501(c)(3) nonprofit Open Humans Foundation and has been funded by the Robert Wood Johnson Foundation, Knight Foundation, and Shuttleworth Foundation, who all see an opportunity to connect communities to health data and research. I was awarded a Shuttleworth Foundation Fellowship to allow me to embark on this work with Open Humans. Mark Shuttleworth is a big believer in new types of “open,” wanting to apply that ethos to other fields, hence his foundation’s support of our work here.
Dr. Tzovaras, what can people do with their pooled data once it’s inside your platform?
[BGT] They can analyze their genome or Twitter data. For researchers and citizen scientists, we offer a toolbox to easily create new projects which can efficiently ask an engaged audience of participants to join and contribute, plus engage in research projects. We have 19 different tools to enable data import from external sources at the touch of a button, as well as numerous projects that help members explore data and invite them to contribute to research.
How many people have signed up so far?
[MPB] We have over 5,000 people from all over the world sharing everything from GPS, genetic data, and ancestry interpretation to continuous glucose monitor data, activity and step data, weight logs, and social media (Twitter).
What is the overall goal for sharing personal data as a society?
[MPB] We all have an increasing amount of personal data and there’s a lot of power in it! Obviously commercial entities see that already. What’s missing is the ability for individuals to be empowered, and unify as communities around our own data, to use our data to make our lives better. I’m especially inspired by use cases where patient communities pool their data amongst themselves, so they can analyze it and find new solutions in clinical care.
Can you share an example?
[MPB] Dana Lewis has written about her experience as a principal investigator within Open Humans, where she created an OpenAPS Data Commons so that people with Type 1 diabetes can share information from Data Selfie and Nightscout. They face not just a shorter life span but also the risk of dying in their sleep because of their condition. By having control over their data and doing medical research about themselves, as a community, they are able to take action.
How has obtaining and sharing your own personal data changed your behavior?
[MPB] It’s been less about behavior modification than a search for meaning. I have two uncles with a genetic disease: microcephaly, which causes intellectual disability. As a prospective parent, I worried about carrying an X-linked disease, and I was able to use my 23andMe data to figure out I was “safe.” Later, we learned it was actually on chromosome 1. When I was studying for my PhD at Harvard, I was in George Church’s lab, so I volunteered for his Personal Genome Project and found out that I inherited a copy—I’m a carrier for it—something that has a lot of meaning for me given my family history.
[BGT] After I had my DNA sequenced, I found I was at elevated risk from prostate cancer. I told my father about it and he got screened—at which point they found he already had a tumor. Thankfully, he’s fine now. This is why it’s important to not just share data, but analyze and understand its implications, which I addressed in a co-authored paper. We’re also passionate about opening up data sharing beyond the currently predominantly white European ancestry population. We feel Open Humans is turning the traditional research community on its head in so many ways.
Let’s get geeky: what tools do you use to sift, analyze, interpret and share data?
[MPB] Lots of genomic databases, like Illumina and ClinVar, which contains public domain data and aggregates molecular variant effects that go beyond the current peer-reviewed academic literature. We allow anyone to set up projects and parse data within member accounts, using OAuth2 and APIs to work with files in S3 so they can deposit, control, and manage data.
Something in the works, that I’m really excited about, uses the Jupyter Notebook project, a way of building data analyses that grew out of Python, R, and Julia and now supports hundreds of languages. We can then share the notebook via Kubernetes and spin it up in virtual machines, allowing people to run it within their browser but still be able to edit and share the improved analysis. We’ve got it working, but it’s still full of bugs, so it’s a work in progress.
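For readers curious about what the OAuth2 step mentioned above involves, here is a minimal sketch of the first leg of the standard authorization-code flow: building the link a member clicks to grant a project access, then reading the code that comes back. The endpoint URL, client ID, and function names are hypothetical placeholders for illustration, not Open Humans’ actual API.

```python
import secrets
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical endpoint, for illustration only (not a real Open Humans URL).
AUTHORIZE_URL = "https://example.org/oauth2/authorize/"


def build_authorization_url(client_id, redirect_uri, state=None):
    """Return (url, state) for the link a member visits to grant access."""
    # A random `state` value lets the project detect forged callbacks.
    state = state or secrets.token_urlsafe(16)
    params = {
        "response_type": "code",  # ask for an authorization code
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "state": state,
    }
    return f"{AUTHORIZE_URL}?{urlencode(params)}", state


def extract_code(callback_url, expected_state):
    """Pull the authorization code out of the redirect after consent."""
    query = parse_qs(urlparse(callback_url).query)
    if query.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch: possible forged callback")
    return query["code"][0]


if __name__ == "__main__":
    url, state = build_authorization_url("my-project-id", "https://app.example/cb")
    print(url)
```

In a real project, the code returned by `extract_code` would then be exchanged for an access token at the platform’s token endpoint; that step is left out here because its details depend on the provider.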
Looking around the site, I felt odd knowing I could (and did) download your data. Knowledge is power? But…
[MPB] Yes. There are risks. Which is why we all have different preferences for our level of sharing. For example, I’m sharing a lot genomically, but I’m not sharing my GPS data. It’s very contextual, and we’re trying to address this by creating an ecosystem where people can make more nuanced decisions about exploring and sharing it.
Tell us about your first group of mini grant beneficiaries.
[BGT] I’m most excited about Kevin Arvai’s project, which brings genotype imputation to improve the quality of DNA-sequenced results by filling in the gaps using statistical methods, to get a more complete genome construction this way. It’s a great project because it makes the existing data more valuable.
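To give a feel for what “filling in the gaps” means, here is a deliberately simplified toy sketch. It is not the method used in the project described above: real imputation tools rely on far more sophisticated statistical models and large reference panels. This version simply finds the reference sequence that best matches a sample at its known positions and copies that sequence’s values into the missing ones. The function name and the tiny 0/1 data are made up for illustration.

```python
def impute(sample, reference_panel):
    """Fill None entries in `sample` using the closest reference sequence.

    `sample` is a list of 0/1 variant calls with None where data is missing.
    `reference_panel` is a list of fully observed sequences of the same length.
    """
    def mismatches(ref):
        # Count disagreements only at positions the sample actually observed.
        return sum(1 for s, r in zip(sample, ref) if s is not None and s != r)

    best = min(reference_panel, key=mismatches)
    # Keep observed values; borrow the best match's value where data is missing.
    return [r if s is None else s for s, r in zip(sample, best)]


if __name__ == "__main__":
    panel = [
        [0, 0, 1, 1, 0],
        [1, 1, 0, 0, 1],
        [0, 1, 1, 0, 0],
    ]
    sample = [1, None, 0, None, 1]
    print(impute(sample, panel))  # → [1, 1, 0, 0, 1]
```

The second reference sequence agrees with the sample at every observed position, so its values fill the two gaps.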
Final thoughts to share with us?
[MPB] At Open Humans we see ourselves as the “stewards of the data” so others can come in and build on it with their new projects. When someone gets data it’s essentially “dead” or fixed. But within Open Humans it remains “alive” and they can do things with it, ask new questions using technology, within the community, getting new and exciting outcomes as a result of being open and imaginative with their information for the personal and general good.
Dr. Mad Price Ball will be speaking at MyData 2018 in Helsinki in August. In the meantime, if anyone reading this is feeling brave and transparent with an urge to use your data for good: sign up here.