Module: LocusZoom_DataFunctions

"Data operation" functions, with call signature ({plot_state, data_layer}, [recordsetA, recordsetB...], ...params) => combined_results

After data is retrieved from adapters, data operations are run on the resulting records. The most common operation is a "join", such as combining association and LD data into a single set of records for plotting. Several join functions (which operate by analogy to SQL) are built in.

Other use cases (the built-in code contains no examples of these; see the unit tests for what is possible):

  1. Grouping or filtering records. Data operations can consider dynamic properties stored in plot.state. (In the future, adapters may cache more aggressively; if you want to provide your own code for filtering returned data, this is the recommended way to do so.)
  2. Modifying the layout when new data is received, without having to create a custom data layer class. This is possible because the context argument also contains a reference to the data layer instance (and thus the parent panel and plot). E.g., for datasets where the categories are not known before first render, a data operation could generate automatic x-axis ticks (PheWAS), automatic panel legends or color schemes (BED tracks), etc.
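
The first use case can be sketched as a plain function that follows the call signature given at the top of this page. This is a minimal illustration, not code from the library: the function name filterByStateThreshold, the state key min_log_pvalue, and the field names are all hypothetical, and registering the function with LocusZoom is omitted.

```javascript
// Hypothetical custom data operation: keep only the records whose value in
// `fieldName` is at least the threshold currently stored in plot state.
// The first argument mirrors the documented context object
// ({plot_state, data_layer}); data_layer is unused in this sketch.
function filterByStateThreshold({ plot_state, data_layer }, [records], fieldName, stateKey) {
    const threshold = plot_state[stateKey];
    if (threshold === undefined || threshold === null) {
        // No threshold set in plot state: return the records unchanged
        return records;
    }
    return records.filter((record) => record[fieldName] >= threshold);
}

// Example call with made-up data. `min_log_pvalue` is an assumed state key.
const filterContext = { plot_state: { min_log_pvalue: 5 }, data_layer: null };
const filterInput = [
    { 'assoc:variant': '1:100_A/C', 'assoc:log_pvalue': 7.2 },
    { 'assoc:variant': '1:200_G/T', 'assoc:log_pvalue': 1.3 },
];
const filterOutput = filterByStateThreshold(
    filterContext,
    [filterInput],
    'assoc:log_pvalue',
    'min_log_pvalue'
);
console.log(filterOutput.length); // 1
```

Because the threshold is read from plot state on every call, changing that state value and re-rendering would change which records reach the data layer, without a new network request.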

Usually, a data operation receives two recordsets (the left and right members of the join, like "assoc" and "ld"), although any number of recordsets can be passed to one join function. Because making too many network requests when rendering a web page carries a performance penalty, joining many distinct data entities in this fashion is uncommon. (If possible, try to provide your data with fewer adapters/network requests!)

In a few cases, the rules for combining datasets are very specific to the two types of data involved. Some functions, particularly those for advanced features, may carry assumptions about field names or formatting. (Example: choosing the best EBI GWAS catalog entry for a variant may look for a field called log_pvalue instead of pvalue, or it may match two datasets based on a specific way of identifying the variant.)

Methods

(inner) assoc_to_gwas_catalog(plot_state, recordsets, assoc_key, catalog_key, catalog_log_p_name)

A single-purpose join function that combines GWAS data with the best claim from the EBI GWAS catalog. Essentially, this is a left join modified to make further decisions about which records to use.

Parameters:
Name                Type           Description
plot_state          Object
recordsets          Array.<Array>  An array with two items: assoc records, then catalog records
assoc_key           String         The name of the key field in association data, e.g. variant ID
catalog_key         String         The name of the key field in GWAS catalog data, e.g. variant ID
catalog_log_p_name  String         The name of the "log_pvalue" field in GWAS catalog data, used to choose the most significant claim for a given variant
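
The "left join that picks the best claim" idea can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: the function name assocToBestCatalogClaim and the field names are hypothetical, and the sketch assumes that the log p-value field holds -log10(p), so that a larger value means a more significant claim.

```javascript
// Hypothetical sketch: annotate each association record with the single most
// significant catalog claim for the same variant (if any claim exists).
// Assumption: larger values of `catalogLogPName` are more significant.
function assocToBestCatalogClaim(assocRecords, catalogRecords, assocKey, catalogKey, catalogLogPName) {
    // Index the catalog: for each key value, remember only the best claim
    const bestClaimByKey = new Map();
    for (const claim of catalogRecords) {
        const key = claim[catalogKey];
        const current = bestClaimByKey.get(key);
        if (!current || claim[catalogLogPName] > current[catalogLogPName]) {
            bestClaimByKey.set(key, claim);
        }
    }
    // Left join: every association record is kept, with or without a claim
    return assocRecords.map((record) => {
        const claim = bestClaimByKey.get(record[assocKey]);
        return claim ? { ...claim, ...record } : { ...record };
    });
}

// Example call with made-up records
const catalogAssoc = [
    { 'assoc:variant': 'rs1', 'assoc:log_pvalue': 9.1 },
    { 'assoc:variant': 'rs2', 'assoc:log_pvalue': 2.0 },
];
const catalogClaims = [
    { 'catalog:variant': 'rs1', 'catalog:log_pvalue': 6.0, 'catalog:trait': 'trait A' },
    { 'catalog:variant': 'rs1', 'catalog:log_pvalue': 12.5, 'catalog:trait': 'trait B' },
];
const catalogJoined = assocToBestCatalogClaim(
    catalogAssoc,
    catalogClaims,
    'assoc:variant',
    'catalog:variant',
    'catalog:log_pvalue'
);
console.log(catalogJoined[0]['catalog:trait']); // trait B
```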

(inner) full_outer_match(plot_state, recordsets, left_key)

Perform a full outer join, based on records where the field values at left_key and right_key are identical.

By analogy with SQL, the result will include all records from both the left and right recordsets. If there are matching records, then the relevant items will include fields from both records combined into one.
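
The behavior described above can be sketched in plain JavaScript. This is a minimal illustration of full outer join semantics, not the library's code: the function name fullOuterJoinSketch is hypothetical, the recordsets are passed directly rather than through the data operation signature, and the right key is taken as an explicit fourth argument (an assumption, since only left_key appears in the parameter list).

```javascript
// Hypothetical sketch of a full outer join on two arrays of plain objects.
function fullOuterJoinSketch(left, right, leftKey, rightKey) {
    // Group right-hand records by their key value
    const rightIndex = new Map();
    for (const item of right) {
        const key = item[rightKey];
        if (!rightIndex.has(key)) {
            rightIndex.set(key, []);
        }
        rightIndex.get(key).push(item);
    }

    const results = [];
    const matchedKeys = new Set();
    for (const item of left) {
        const matches = rightIndex.get(item[leftKey]) || [];
        if (matches.length) {
            matchedKeys.add(item[leftKey]);
            // One combined record per matching pair
            for (const match of matches) {
                results.push({ ...match, ...item });
            }
        } else {
            // Left record with no partner is still kept
            results.push({ ...item });
        }
    }
    // Finally, add right records that never matched anything on the left
    for (const [key, items] of rightIndex) {
        if (!matchedKeys.has(key)) {
            for (const item of items) {
                results.push({ ...item });
            }
        }
    }
    return results;
}

const outerLeft = [
    { 'assoc:variant': 'rs1', 'assoc:log_pvalue': 8 },
    { 'assoc:variant': 'rs2', 'assoc:log_pvalue': 4 },
];
const outerRight = [
    { 'ld:variant': 'rs2', 'ld:correlation': 0.9 },
    { 'ld:variant': 'rs3', 'ld:correlation': 0.2 },
];
const outerResult = fullOuterJoinSketch(outerLeft, outerRight, 'assoc:variant', 'ld:variant');
console.log(outerResult.length); // 3
```

In the example, rs1 appears only on the left, rs3 only on the right, and rs2 on both sides, so the result has three records and only the rs2 record carries fields from both recordsets.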

Parameters:
Name Type Description
plot_state Object
recordsets Array.<Array>
left_key String

(inner) genes_to_gnomad_constraint(plot_state, recordsets)

A single-purpose join function that combines gene data (UM Portaldev API format) with gene constraint data (gnomAD API format).

This acts as a left join that has to perform custom operations to parse two very unusual recordset formats.

Parameters:
Name        Type           Description
plot_state  Object
recordsets  Array.<Array>  An array with two items: UM Portaldev API gene records, then gnomAD gene constraint data

(inner) inner_match(plot_state, recordsets, left_key)

Perform an inner join, based on records where the field values at left_key and right_key are identical.

By analogy with SQL, the result will include all fields from both recordsets, but only for records where both the left and right keys are defined and equal. A record that has no match in the other recordset is excluded from the result.
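
A minimal sketch of these semantics follows. It is an illustration only, not the library's implementation: the name innerJoinSketch is hypothetical, the two recordsets are passed directly, and the right key is an explicit argument (an assumption, since only left_key appears in the parameter list).

```javascript
// Hypothetical sketch of an inner join on two arrays of plain objects.
// Records whose key field is missing are skipped on both sides, so two
// "undefined" keys are never treated as a match.
function innerJoinSketch(left, right, leftKey, rightKey) {
    const rightIndex = new Map();
    for (const item of right) {
        const key = item[rightKey];
        if (key === undefined || key === null) {
            continue;
        }
        if (!rightIndex.has(key)) {
            rightIndex.set(key, []);
        }
        rightIndex.get(key).push(item);
    }

    const results = [];
    for (const item of left) {
        const key = item[leftKey];
        if (key === undefined || key === null) {
            continue;
        }
        // Only matched pairs are emitted; unmatched records are dropped
        for (const match of rightIndex.get(key) || []) {
            results.push({ ...match, ...item });
        }
    }
    return results;
}

const innerLeft = [
    { 'assoc:variant': 'rs1', 'assoc:log_pvalue': 8 },
    { 'assoc:variant': 'rs2', 'assoc:log_pvalue': 4 },
    { 'assoc:log_pvalue': 2 }, // no key field at all
];
const innerRight = [
    { 'ld:variant': 'rs2', 'ld:correlation': 0.9 },
    { 'ld:correlation': 0.1 }, // no key field at all
];
const innerResult = innerJoinSketch(innerLeft, innerRight, 'assoc:variant', 'ld:variant');
console.log(innerResult.length); // 1
```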

Parameters:
Name Type Description
plot_state Object
recordsets Array.<Array>
left_key String

(inner) left_match(plot_state, recordsets, left_key)

Perform a left outer join, based on records where the field values at left_key and right_key are identical.

By analogy with SQL, the result will include all records in the left recordset, annotated (where applicable) with all fields from matching records in the right recordset.
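
As a rough sketch, a left outer join with the same argument shape as the functions on this page might look like the code below. It is not the library's implementation: the name leftMatchSketch is hypothetical, the plotState argument is accepted but unused, and the right key is taken as an explicit extra argument (an assumption, since only left_key appears in the parameter list).

```javascript
// Hypothetical sketch of a left outer join, written with the
// (plot_state, recordsets, ...params) shape used by the methods on this page.
function leftMatchSketch(plotState, recordsets, leftKey, rightKey) {
    const [left, right] = recordsets;

    // Group right-hand records by their key value
    const rightIndex = new Map();
    for (const item of right) {
        const key = item[rightKey];
        if (!rightIndex.has(key)) {
            rightIndex.set(key, []);
        }
        rightIndex.get(key).push(item);
    }

    const results = [];
    for (const item of left) {
        const matches = rightIndex.get(item[leftKey]) || [];
        if (matches.length) {
            // Annotate the left record with fields from each matching right record
            for (const match of matches) {
                results.push({ ...match, ...item });
            }
        } else {
            // Keep the left record even when nothing matches
            results.push({ ...item });
        }
    }
    return results;
}

const leftAssoc = [
    { 'assoc:variant': 'rs1', 'assoc:log_pvalue': 8 },
    { 'assoc:variant': 'rs2', 'assoc:log_pvalue': 4 },
];
const leftLd = [
    { 'ld:variant': 'rs2', 'ld:correlation': 0.9 },
    { 'ld:variant': 'rs3', 'ld:correlation': 0.2 },
];
const leftResult = leftMatchSketch({}, [leftAssoc, leftLd], 'assoc:variant', 'ld:variant');
console.log(leftResult.length); // 2
```

Here the rs3 LD record is dropped because no association record refers to it, while rs1 is kept without any LD fields.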

Parameters:
Name Type Description
plot_state Object
recordsets Array.<Array>
left_key String