Handling Permissions
As per the Beacon specification there are three types of permissions:
- PUBLIC - data available for anyone;
- REGISTERED - data available for users registered on a service for special credentials, e.g. ELIXIR bona_fide or researcher status. Requires a JWT Token;
- CONTROLLED - data available for users that have been granted access to a protected resource by a Data Access Committee (DAC).
Note
In this page we are illustrating permissions according to: GA4GH Authentication and Authorization Infrastructure (AAI) OpenID Connect Profile.
Registered Data
To retrieve REGISTERED permissions, the function below forwards the TOKEN to another server (e.g. the ELIXIR userinfo endpoint), which validates that the token belongs to a registered user and returns a JSON message containing data on the Bona Fide status. Custom servers can be set up to mimic this functionality.
    {
        "ga4gh_visa_v1": {
            "type": "ResearcherStatus",
            "value": "https://doi.org/10.1038/s41431-018-0219-y",
            "source": "https://ga4gh.org/duri/no_org",
            "by": "peer",
            "asserted": 1539017776,
            "expires": 1593165413
        }
    }
The function below then checks for the ga4gh.AcceptedTermsAndPolicies and ga4gh.ResearcherStatus keys, which indicate that the user has agreed to follow ethical researcher practices and has been recognised by another esteemed researcher.
    """Retrieve Bona Fide status from GA4GH JWT claim."""
    LOG.info("Parsing GA4GH bona fide claims.")
    # User must have agreed to terms, and been recognized by a peer to be granted Bona Fide status
    terms = False
    status = False
    for passport in passports:
        # Check for the `type` of visa to determine if to look for `terms` or `status`
        #
        # CHECK FOR TERMS
        passport_type = passport[2].get("ga4gh_visa_v1", {}).get("type")
        passport_value = passport[2].get("ga4gh_visa_v1", {}).get("value")
        # Compare with `==`: `in` would be a substring test and fails when the type is missing (None)
        if passport_type == "AcceptedTermsAndPolicies" and passport_value == OAUTH2_CONFIG.bona_fide_value:
            # This passport has the correct type and value, next step is to validate it
            #
            # Decode passport and validate its contents
            # If the validation passes, terms will be set to True
            # If the validation fails, an exception will be raised
            # (and ignored since it's not fatal), and terms will remain False
            await validate_passport(passport)
            # The token is validated, therefore the terms are accepted
            terms = True
        #
        # CHECK FOR STATUS
        if passport_value == OAUTH2_CONFIG.bona_fide_value and passport_type == "ResearcherStatus":
            # Check if the visa contains a bona fide value
            # This passport has the correct type and value, next step is to validate it
            #
            # Decode passport and validate its contents
            # If the validation passes, status will be set to True
            # If the validation fails, an exception will be raised
            # (and ignored since it's not fatal), and status will remain False
            await validate_passport(passport)
            # The token is validated, therefore the status is accepted
            status = True
    # User has agreed to terms and has been recognized by a peer, return True for Bona Fide status
    return terms and status
Note
The values of the ga4gh.AcceptedTermsAndPolicies and ga4gh.ResearcherStatus keys must be equal to those mandated by GA4GH.
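The Bona Fide rule described above can be sketched on already-decoded visa payloads. This is a minimal illustration, not the module's implementation: the function name `has_bona_fide_status` is hypothetical, the expected value is assumed to be the DOI URL from the example visa, and signature validation of the passports is left out.

```python
from typing import Any, Dict, List

# Assumption: the expected value is the DOI URL shown in the example visa;
# a real deployment would read it from its OAuth2 configuration.
BONA_FIDE_VALUE = "https://doi.org/10.1038/s41431-018-0219-y"

# Both visa types must be present for Bona Fide status.
REQUIRED_TYPES = {"AcceptedTermsAndPolicies", "ResearcherStatus"}


def has_bona_fide_status(visas: List[Dict[str, Any]]) -> bool:
    """Return True when both required visa types carry the expected value.

    `visas` holds decoded visa payloads; validating their signatures is
    outside the scope of this sketch.
    """
    found = set()
    for payload in visas:
        visa = payload.get("ga4gh_visa_v1", {})
        if visa.get("value") == BONA_FIDE_VALUE:
            found.add(visa.get("type"))
    return REQUIRED_TYPES <= found
```

A user holding only one of the two visas is not granted Bona Fide status, which mirrors the `terms and status` return value of the function above.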
Controlled Data
Note
See https://tools.ietf.org/html/rfc7519 for more information on claims and JWT. A short introduction to JSON Web Tokens is available at: https://jwt.io/introduction/
In order to retrieve permissions for the CONTROLLED datasets via a JWT token, we added a permissions module beacon_api.permissions() that aims to act as a platform where add-ons are placed for processing different styles of permissions claims.
The main reason for handling dataset permissions this way is that there is no standard way of delivering access to datasets via JWT Tokens, and each AAI authority provides different claims with different structures.
By default we include the beacon_api.permissions.ga4gh() add-on, which offers the means to retrieve permissions following the GA4GH format via a token provided by ELIXIR AAI.
If a token contains the ga4gh_userinfo_claims JWT claim with ga4gh.ControlledAccessGrants, these are parsed and retrieved as illustrated in:
    """Retrieve dataset permissions from GA4GH passport visas."""
    # We only want to get datasets once, thus the set which prevents duplicates
    LOG.info("Parsing GA4GH dataset permissions.")
    datasets = set()
    for passport in passports:
        # Decode passport and validate its contents
        validated_passport = await validate_passport(passport)
        # Extract dataset id from validated passport
        # The dataset value will be of form `https://institution.org/urn:dataset:1000`
        # the extracted dataset will always be the last list element when split with `/`
        dataset = validated_passport.get("ga4gh_visa_v1", {}).get("value").split("/")[-1]
        # Add dataset to set
        datasets.add(dataset)
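The dataset ID extraction described in the comments above can be tried in isolation. The sketch below works on already-decoded visa payloads; the function name `controlled_datasets_from_visas`, the filter on the visa type and the handling of missing values are assumptions made for illustration, not part of the add-on.

```python
from typing import Any, Dict, List, Set


def controlled_datasets_from_visas(visas: List[Dict[str, Any]]) -> Set[str]:
    """Collect dataset IDs from decoded ControlledAccessGrants visas."""
    datasets: Set[str] = set()
    for payload in visas:
        visa = payload.get("ga4gh_visa_v1", {})
        value = visa.get("value")
        # Assumption: only ControlledAccessGrants visas with a value are relevant
        if visa.get("type") != "ControlledAccessGrants" or not value:
            continue
        # The dataset ID is the last element when the value is split on `/`
        datasets.add(value.rstrip("/").split("/")[-1])
    return datasets


visas = [
    {"ga4gh_visa_v1": {"type": "ControlledAccessGrants",
                       "value": "https://institution.org/urn:dataset:1000"}},
    {"ga4gh_visa_v1": {"type": "ResearcherStatus",
                       "value": "https://doi.org/10.1038/s41431-018-0219-y"}},
]
print(controlled_datasets_from_visas(visas))  # {'urn:dataset:1000'}
```

Using a set means a dataset granted by several visas is only reported once.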
The permissions are then passed in beacon_api.utils.validate_jwt() as illustrated below:
    # for now the permissions just reflects that the data can be decoded from token
    # the bona fide status is checked against ELIXIR AAI by default or the URL from config
    # the bona_fide_status is specific to ELIXIR Tokens
    # Retrieve GA4GH Passports from /userinfo and process them into dataset permissions and bona fide status
    dataset_permissions: Set[str] = set()
    bona_fide_status: bool = False
    dataset_permissions, bona_fide_status = await check_ga4gh_token(decoded_data, token, bona_fide_status, dataset_permissions)
    # currently we offer module for parsing GA4GH permissions, but multiple claims and providers can be utilised
    # by updating the set, meaning replicating the line below with the permissions function and its associated claim
    # For GA4GH DURI permissions (ELIXIR Permissions API 2.0)
    controlled_datasets: Set[str] = set()
    controlled_datasets.update(dataset_permissions)
    all_controlled = list(controlled_datasets) if bool(controlled_datasets) else None
    request["token"] = {
        "bona_fide_status": bona_fide_status,
        # permissions key will hold the actual permissions found in the token/userinfo e.g. GA4GH permissions
        "permissions": all_controlled,
        # additional checks can be performed against this authenticated key
        # currently if a token is valid that means request is authenticated
        "authenticated": True,
    }
If there is no claim for GA4GH permissions as illustrated above, no datasets will be added to controlled_datasets.
More datasets can be added to the controlled_datasets set() by updating:
    controlled_datasets.update(custom_add_on())
where custom_add_on() is a function one could add in beacon_api.permissions(). An example of such a function, together with the specific JWT claim it parses, is beacon_api.permissions.ga4gh().
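As an illustration of the add-on mechanism, the sketch below shows what a custom add-on could look like. Everything specific to it is hypothetical: the `datasets` claim name, its shape (a list of dataset ID strings) and the choice to pass the decoded token as an argument are assumptions, since each AAI authority structures its claims differently.

```python
from typing import Any, Dict, Set


def custom_add_on(decoded_token: Dict[str, Any]) -> Set[str]:
    """Collect dataset IDs from a hypothetical `datasets` claim.

    The claim is assumed to hold a list of dataset ID strings; anything
    else is ignored so that a malformed token grants no access.
    """
    claim = decoded_token.get("datasets")
    if not isinstance(claim, list):
        return set()
    return {item for item in claim if isinstance(item, str)}


# Permissions already found by another add-on (for example the GA4GH one)
controlled_datasets: Set[str] = {"urn:dataset:1000"}
# Merge the permissions found by the custom add-on into the same set
controlled_datasets.update(custom_add_on({"datasets": ["urn:dataset:2000", "urn:dataset:1000"]}))
```

Because the permissions are merged into a set, a dataset reported by several add-ons appears only once.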
Attention
The JWT is validated against an AAI OAuth2 signing authority using its public key.
This public key can be provided either by a JWK server or through the environment variable PUBLIC_KEY. See also: OAuth2 Configuration.
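The two key sources mentioned in the note can be combined as in the sketch below. The function name `resolve_public_key` and the injected `fetch_from_jwk_server` callable are hypothetical, and preferring the environment variable over the JWK server is an assumption of this sketch; the actual precedence is set by the server's OAuth2 configuration.

```python
import os
from typing import Callable, Mapping, Optional


def resolve_public_key(
    fetch_from_jwk_server: Callable[[], str],
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the public key used to validate the JWT signature."""
    environ = os.environ if environ is None else environ
    key = environ.get("PUBLIC_KEY")
    # Assumption: a key set in the environment takes precedence over the JWK server
    return key if key else fetch_from_jwk_server()
```

Passing the fetcher in as a callable keeps the sketch free of network access, so the selection logic can be exercised without a running JWK server.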
Access Resolution
In the tables below we illustrate how the beacon server handles access to datasets. We have integration tests for these use cases that can be found at: beacon-python GitHub deploy tests.
Table Legend
- colours:
  - green is for PUBLIC datasets;
  - orange is for REGISTERED datasets;
  - red is for CONTROLLED datasets;
  - blue is for errors in retrieving datasets, currently done via HTTP error statuses;
- [] - all available datasets are requested; if a cell is empty it means no datasets are requested;
- ✓ - is used to represent that:
  - a JWT TOKEN is present in the request - used for retrieving CONTROLLED datasets from JWT claim;
  - a user’s BONA FIDE status can be retrieved - used for REGISTERED datasets;
  - if the ✓ is not present that means (depending on the column) there is no TOKEN or BONA FIDE is not provided;
- the PERMISSIONS column reflects the dataset permissions found in the JWT TOKEN claim; if the column is empty, no datasets are in that specific claim.
Default cases (no dataset IDs specified)
Most queries to the beacon do not specify dataset IDs, meaning the request does not contain the datasetIds parameter.
For such cases we handle permissions as illustrated below.
DB: 1, 2, 3, 4, 5, 6. The PUBLIC, REGISTERED and CONTROLLED columns hold the requested datasets.

| PUBLIC | REGISTERED | CONTROLLED | TOKEN | PERMISSIONS | BONA FIDE | RESPONSE |
|---|---|---|---|---|---|---|
| [] | [] | [] | | | | 1, 2 |
| [] | [] | [] | ✓ | | | 1, 2 |
| [] | [] | [] | ✓ | | ✓ | 1, 2, 3, 4 |
| [] | [] | [] | ✓ | 5, 6 | | 1, 2, 5, 6 |
| [] | [] | [] | ✓ | 5, 6 | ✓ | 1, 2, 3, 4, 5, 6 |
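The default cases in the table can be reproduced with a few lines of set arithmetic. This is a sketch of the rule the table expresses, not the server's implementation: the function name `resolve_default` is hypothetical, and the split of the database into PUBLIC (1, 2), REGISTERED (3, 4) and CONTROLLED (5, 6) datasets is an assumption inferred from the responses in the table.

```python
from typing import Iterable, List

# Assumed database layout, inferred from the responses in the table
PUBLIC = {1, 2}
REGISTERED = {3, 4}
CONTROLLED = {5, 6}


def resolve_default(
    token: bool = False,
    permissions: Iterable[int] = (),
    bona_fide: bool = False,
) -> List[int]:
    """Return the datasets served when the request names no dataset IDs."""
    # PUBLIC datasets are always served
    accessible = set(PUBLIC)
    # REGISTERED datasets need a token with Bona Fide status
    if token and bona_fide:
        accessible |= REGISTERED
    # CONTROLLED datasets need a token whose permissions name them
    if token:
        accessible |= CONTROLLED & set(permissions)
    return sorted(accessible)
```

For example, `resolve_default(token=True, permissions=[5, 6])` corresponds to the fourth row of the table.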
Specific cases (dataset IDs specified)
For cases in which the dataset IDs are specified we handle permissions as in the table below.
DB: 1, 2, 3, 4, 5, 6. The PUBLIC, REGISTERED and CONTROLLED columns hold the requested datasets.

| PUBLIC | REGISTERED | CONTROLLED | TOKEN | PERMISSIONS | BONA FIDE | RESPONSE |
|---|---|---|---|---|---|---|
| | | 5, 6 | ✓ | 5 | | 5 |
| 1 | | 5 | | | | 1 |
| | 4 | 7 | ✓ | | ✓ | 4 |
| | 3 | | | | | 401 Unauthorized |
| | | 5 | | | | 401 Unauthorized |
| | 4 | | ✓ | | | 403 Forbidden |
| | | 6 | ✓ | 7 | | 403 Forbidden |
| 2 | | 6 | ✓ | 7 | | 2 |
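The rows above suggest a simple rule: serve whatever subset of the requested datasets is accessible, and only answer with an error when nothing is accessible (401 without a token, 403 with one). The sketch below encodes that reading of the table. It is not the server's implementation: the function name `resolve_requested`, the returned `(status, datasets)` pair and the database layout (PUBLIC 1, 2; REGISTERED 3, 4; CONTROLLED 5, 6) are assumptions made for illustration.

```python
from typing import Iterable, List, Tuple

# Assumed database layout, inferred from the tables on this page
PUBLIC = {1, 2}
REGISTERED = {3, 4}
CONTROLLED = {5, 6}


def resolve_requested(
    requested: Iterable[int],
    token: bool = False,
    permissions: Iterable[int] = (),
    bona_fide: bool = False,
) -> Tuple[int, List[int]]:
    """Return an HTTP status and the accessible subset of the requested datasets."""
    wanted = set(requested)
    # Requested PUBLIC datasets are always served
    accessible = wanted & PUBLIC
    # Requested REGISTERED datasets need a token with Bona Fide status
    if token and bona_fide:
        accessible |= wanted & REGISTERED
    # Requested CONTROLLED datasets must be named in the token permissions
    if token:
        accessible |= wanted & CONTROLLED & set(permissions)
    if accessible:
        return 200, sorted(accessible)
    # Nothing accessible: 401 without a token, 403 with a token lacking access
    return (403 if token else 401), []
```

Under this rule a request mixing an accessible dataset with an inaccessible one still succeeds, as in the second and last rows of the table, while a dataset ID that is absent from the database (7 in the third row) is silently dropped.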