Commit 12c10ac7 authored by taco@waag.org

minor changes

parent e1ec1828
......@@ -13,7 +13,7 @@ Last year as participant of the [NGI-Ledger](https://www.ngi.eu/ngi-projects/led
This article is a reflection on things we learned during the development of the demonstrator that might apply to future Datacommons projects at Waag.
To begin we'll dive straight into the *use case* that we tried to tackle, because it will give insight into the complexity and sensitivity of many the things that can happen to your data in a very concrete way. This is followed by a reflection on the nature of *genetic data and ownership*. The underlying vision for a distributed *datacommons for genetic data* is laid out, and how the *verifiable credentials* technology could play a key role in this. We then zoom in on the idea of *dynamic informed consent*, and how we fit everything together in *the Geneconsent demonstrator*. We finish with questions and challenges that lay ahead.
To begin we'll dive straight into the *case* that we tried to tackle, because it will give insight into the complexity and sensitivity of many of the things that can happen to your data in a very concrete way. This is followed by a reflection on the nature of *genetic data and ownership*. The underlying vision for a distributed *datacommons for genetic data* is laid out, along with how the *verifiable credentials* technology could play a key role in it. We then zoom in on the idea of *dynamic informed consent*, and how we fit everything together in *the Geneconsent demonstrator*. We finish with questions and challenges that lie ahead.
## Case: Eye melanoma research
......@@ -71,9 +71,9 @@ We consider GeneConsent to be one of the steps in the creation of a Datacommons,
Right now valuable research data is stored under the control of various academic research institutions, commercial consumer businesses, cloud services and other related parties. But actually using this data for research is legally and practically problematic or even impossible.
Depending on the conditions of consent, legally speaking much of this data can not be reused for other purposes than the original. Commercial companies often address this by reserving the rights in their agreements to do whatever they want with the data, including selling it to the highest bidder. Which means you will be effectively unable to know where your data ends up. Then collected data is saved in unlimited different file formats, on various aggregation levels, levels of sensitivity, research subjects etc. Even if we could, we definitely don't want to have everything stored in one place. but failing some sort of registry, it's nearly impossible to find relevant existing datasets.
Depending on the conditions of consent, legally speaking much of this data cannot be reused for purposes other than the original one. Commercial companies often address this by reserving the right in their agreements to do whatever they want with the data, including selling it to the highest bidder, which means you will be effectively unable to know where your data ends up. On top of that, collected data is saved in countless different file formats, at various aggregation levels, levels of sensitivity, research subjects etc. Even if we could, we definitely don't want to have everything stored in one place. But failing some sort of registry, it's nearly impossible to find relevant existing datasets.
What we aim for is to create a distributed network of datasources where all parties involved can responsibly share, create, find and reuse data, knowing that they will stay appropriately informed every step of the way. We see a consent service as one of the components in such a network where responsible sharing of (research) data according to the FAIR principles is enabled. Fair stands for Findability, Accessibility, Interoperability and Re-use. For more information read the [FAIR principles](https://www.go-fair.org/fair-principles). Geneconsent aims to manage consent between individuals and researchers for genomic data.
What we aim for is to create a distributed network of data sources where all parties involved can responsibly share, create, find and reuse data, knowing that they will stay appropriately informed every step of the way. We see a consent service as one of the components in such a network where responsible sharing of (research) data according to the FAIR principles is enabled. FAIR stands for Findability, Accessibility, Interoperability and Re-use. For more information read the [FAIR principles](https://www.go-fair.org/fair-principles). Geneconsent aims to manage consent between individuals and researchers for genomic data in that network.
Other components in such a network could be:
......@@ -87,7 +87,7 @@ The first reason is that the service doesn't need to know _anything_ about the d
The second reason is that, in order to inform the donor in an impartial way, the service should keep some distance from the other components in terms of identity. We feel that this would increase the trustworthiness of the consent service.
In order to carry out its main responsibility of complete and clear information, the consent service needs to know enough about all parties and steps involved in such a way that both can be verified. So the third reason for isolating the consent service, is to take on the role of a *Verifiable Data Registry* (technically speaking this could be a separate component, but we think \ this role naturally fits well with the responsibility of informing).
In order to carry out its main responsibility of complete and clear information, the consent service needs to know enough about all parties and steps involved in such a way that they can be verified. So the third reason for isolating the consent service is to let it take on the role of a *Verifiable Data Registry* (technically speaking this could be a separate component, but we think this role naturally fits well with the responsibility of informing).
## Verifiable Credentials
......@@ -107,7 +107,7 @@ A Verifiable Data Registry is a concept used in [Verifiable Credentials](https:/
In the case of our demonstrator, the consent service is the Issuer that issues a credential (on behalf of the donor) that contains the exact information that was seen by the donor giving consent, but in a machine-readable way. The researcher who originally sent the invitation can, as the Holder, put this credential in their wallet. With this credential the researcher can ask the Verifier (a separate data vault) for permission to access or write data according to the terms set in the agreement. The data vault can check with the consent service to validate the claims made in the agreement.
Instead of using VC, we considered using [Attribute Based Credentials](https://privacybydesign.foundation/irma-explanation/#why) (e.g. as in [IRMA](https://irma.app/docs/what-is-irma/ "irma.app")). Although very similar, attribute based credentials are a specific flavour of verifiable credentials where you can get access without giving up your entire identity. For example; I as the Holder can prove with the 18+ credential (Issued by the city) to the Store owner (Verifier) that i'm old enough to buy a bottle of alcohol. So you protect the privacy of the Holder by selectively disclosing only the attributes needed for the interaction.
Instead of using VC, we considered using [Attribute Based Credentials](https://privacybydesign.foundation/irma-explanation/#why) (e.g. as in [IRMA](https://irma.app/docs/what-is-irma/ "irma.app")). Although very similar, Attribute Based Credentials are a specific flavour of verifiable credentials where you can get access to something without giving up your entire identity. For example: I, as the Holder, can prove with the 18+ credential (issued by the city) to the store owner (Verifier) that I'm old enough to buy a bottle of alcohol. So you protect the privacy of the Holder by selectively disclosing only the attributes needed for the interaction.
Since in our case the Holder is the researcher, there is no need to protect their privacy. It's the other way around: we want to provide complete verifiable information about the researcher, it's mainly the donor whose privacy we want to protect. So in this case we felt Verifiable Credentials were more appropriate.
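The Issuer, Holder and Verifier roles described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the demonstrator's implementation: the class names, the field names and the `allowed_actions` term are hypothetical, and an HMAC with a secret held by the consent service stands in for the public-key proofs that real Verifiable Credentials use. Because only the consent service can check the proof here, the vault has to ask it for validation, which mirrors the "check with the consent service" step.

```python
import hashlib
import hmac
import json


class ConsentService:
    """Hypothetical Issuer that also answers validation requests."""

    def __init__(self, secret: bytes):
        self._secret = secret  # assumption: stand-in for a real signing key

    def _sign(self, claims: dict) -> str:
        payload = json.dumps(claims, sort_keys=True).encode()
        return hmac.new(self._secret, payload, hashlib.sha256).hexdigest()

    def issue(self, donor_pseudonym: str, researcher: str, terms: dict) -> dict:
        # The claims hold what the donor agreed to, in machine-readable form.
        claims = {"donor": donor_pseudonym, "holder": researcher, "terms": terms}
        return {"claims": claims, "proof": self._sign(claims)}

    def validate(self, credential: dict) -> bool:
        expected = self._sign(credential["claims"])
        return hmac.compare_digest(expected, credential["proof"])


class DataVault:
    """Hypothetical Verifier guarding the actual data."""

    def __init__(self, consent_service: ConsentService):
        self._consent_service = consent_service

    def request(self, credential: dict, holder: str, action: str) -> bool:
        # First check with the consent service that the claims are genuine.
        if not self._consent_service.validate(credential):
            return False
        claims = credential["claims"]
        return claims["holder"] == holder and action in claims["terms"]["allowed_actions"]


service = ConsentService(secret=b"demo-only-secret")
vault = DataVault(service)
credential = service.issue("donor-123", "researcher-a", {"allowed_actions": ["read"]})
read_allowed = vault.request(credential, "researcher-a", "read")
write_allowed = vault.request(credential, "researcher-a", "write")
```

Note that the credential carries only a pseudonym for the donor, while the researcher is named in full, in line with the choice to protect the donor's privacy rather than the Holder's.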
......@@ -121,7 +121,7 @@ First we need to understand 'informed consent'. The [iConsent guidelines](https:
To paraphrase the checklist, it seems wise to view informed consent as a co-creation process. The consent procedure is designed together with a group of people itself representative of the people who need to understand everything involved to a degree where the consent can be considered informed. We think that in the case of a data commons the 'I' in the acronym also means being complete and clear about what the consequences of consent mean in terms of data storage, access, handling, sharing, trading/selling and so forth.
Adding the word dynamic gives us 'dynamic-informed' consent as described in [Dynamic-informed consent: A potential solution for ethical dilemmas in population sequencing initiatives](https://www.sciencedirect.com/science/article/pii/S2001037019304969). A dynamic processallows to keep participants informed before, during and after the research is conducted. The linked article is an overview of possible requirements that can help design such a process that are classified in three categories (dynamic permissions, dynamic education and dynamic preferences).
Adding the word dynamic gives us 'dynamic-informed' consent as described in [Dynamic-informed consent: A potential solution for ethical dilemmas in population sequencing initiatives](https://www.sciencedirect.com/science/article/pii/S2001037019304969). A dynamic process makes it possible to keep participants informed before, during and after the research is conducted. The linked article gives an overview of possible requirements that can help design such a process, classified in three categories (dynamic permissions, dynamic education and dynamic preferences).
Our research focused on the parts needed to enable 'dynamic permissions' in the consent service. But we always had in the back of our minds that this would also enable responsible reuse of existing data, by keeping the subjects in the loop after they give their consent. If a researcher discovers data collected as part of earlier studies, the mechanism can facilitate a recurring informed consent before the researcher is provided access to the data. The identity of a subject remains unknown to the researcher at least until they give their consent.
......@@ -207,7 +207,7 @@ But in the end for our actual demonstrator we settled on our relational database
## Do we really want this level of control?
A question that came up repeatedly during our research was: why try to control and register everything at all? Technically it doesn't offer protection, since once you share data, it's no longer under your control by definition. And indeed, it's quite hard to prove someone broke your trust and did something that wasn't agreed purely from an audit log.
A question that came up repeatedly during our research was: why try to control and register everything at all? Technically it doesn't offer protection, since once you share data, it's no longer under your control by definition. And indeed, it's quite hard to prove purely from an audit log that someone broke your trust and did something that wasn't agreed.
However, a verifiable log of consent does allow you to turn around the _burden of proof_: data handling parties are required to prove they rightfully received access to the data according to some agreement under consent. We think an audit log can help with that. Also we think that clear and verifiable agreements leave less room for negligence, mistakes and misunderstandings, thereby increasing trust between parties.
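One way to make such a consent log verifiable is to chain the entries with hashes, so that any later change to an earlier entry is detectable. The sketch below is an assumption for illustration only: the text does not state how the demonstrator's log is made verifiable, and the function names, the `GENESIS` marker and the event fields are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical marker for the start of the chain


def entry_hash(prev_hash: str, event: dict) -> str:
    # Each hash covers the previous hash plus the canonical event payload.
    payload = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode()).hexdigest()


def append(log: list, event: dict) -> None:
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "hash": entry_hash(prev, event)})


def verify(log: list) -> bool:
    # Recompute the chain; editing any entry breaks every hash after it.
    prev = GENESIS
    for entry in log:
        if entry["hash"] != entry_hash(prev, entry["event"]):
            return False
        prev = entry["hash"]
    return True


log = []
append(log, {"type": "consent_given", "donor": "donor-123", "study": "study-1"})
append(log, {"type": "access_granted", "holder": "researcher-a", "study": "study-1"})
intact = verify(log)
```

A data handling party that wants to show it received access rightfully could point to the matching `access_granted` entry, and anyone holding a copy of the log can recompute the chain to check that the entry was not added or altered afterwards.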
......