UK Biobank Pharma Proteomics Project

ID: ukb_ppp

URL: https://doi.org/10.7303/syn51364943

License: CC BY

Citation:

Sun BB, Chiou J, Traylor M, et al. Plasma proteomic associations with genetics and health in the UK Biobank. Nature 622, 329-338 (2023).

  • AFR = African
  • AMR = Admixed American
  • ALL = All ancestries
  • CSA = Central-South Asian
  • EAS = East Asian
  • EUR = European
  • MID = Middle Eastern

| is_palindromic | in_dbsnp | reported_strand | was_flipped | Outcome |
| --- | --- | --- | --- | --- |
| False | True | 'forward' | False | Non-palindromic, effect_allele == alt_allele and other_allele == ref_allele -> no changes. |
| False | True | 'forward' | True | Non-palindromic, effect_allele == ref_allele and other_allele == alt_allele -> swap alleles, flip beta and effect_allele_frequency. |
| False | True | 'reverse' | False | Non-palindromic, reverse_complement(effect_allele) == alt_allele and reverse_complement(other_allele) == ref_allele -> take reverse complement of alleles, beta and effect_allele_frequency unchanged. |
| False | True | 'reverse' | True | Non-palindromic, reverse_complement(effect_allele) == ref_allele and reverse_complement(other_allele) == alt_allele -> swap and take reverse complement of alleles, flip beta and effect_allele_frequency. |
| False | False | NULL | NULL | Non-palindromic, no variant match in reference -> unharmonized. Alleles, beta, and effect_allele_frequency unchanged. Proceed with caution. |
| True | True | 'forward' (inferred from non-palindromic consensus) | False | Palindromic, effect_allele == alt_allele and other_allele == ref_allele -> no changes. Strand is inferred rather than known definitively, proceed with slight caution. |
| True | True | 'forward' (inferred from non-palindromic consensus) | True | Palindromic, effect_allele == ref_allele and other_allele == alt_allele -> swap alleles, flip beta and effect_allele_frequency. Strand is inferred rather than known definitively, proceed with slight caution. |
| True | True | 'reverse' (inferred from non-palindromic consensus) | False | Palindromic, reverse_complement(effect_allele) == alt_allele and reverse_complement(other_allele) == ref_allele -> take reverse complement of alleles, beta and effect_allele_frequency unchanged. Strand is inferred rather than known definitively, proceed with slight caution. |
| True | True | 'reverse' (inferred from non-palindromic consensus) | True | Palindromic, reverse_complement(effect_allele) == ref_allele and reverse_complement(other_allele) == alt_allele -> swap and take reverse complement of alleles, flip beta and effect_allele_frequency. Strand is inferred rather than known definitively, proceed with slight caution. |
| True | False | NULL | NULL | Palindromic, no variant match in reference -> unharmonized. Alleles, beta, and effect_allele_frequency unchanged. Proceed with caution. |
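
The outcomes in the harmonization table can be sketched in a few lines of Python. This is an illustrative reading of the table, not the pipeline's actual code: the `Association` class and the `harmonize` function are hypothetical names, and the sketch assumes that "flip beta" means negating beta and that "flip effect_allele_frequency" means replacing it with 1 minus its value.

```python
from dataclasses import dataclass
from typing import Optional

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(allele: str) -> str:
    """Complement each base, then reverse the sequence (handles indels too)."""
    return allele.translate(_COMPLEMENT)[::-1]


@dataclass(frozen=True)
class Association:
    """Hypothetical container for the fields touched by harmonization."""
    effect_allele: str
    other_allele: str
    beta: float
    effect_allele_frequency: float


def harmonize(
    assoc: Association,
    in_dbsnp: bool,
    reported_strand: Optional[str],
    was_flipped: Optional[bool],
) -> Association:
    """Apply the table's outcome for one source association."""
    if not in_dbsnp:
        # No variant match in the reference: leave everything unchanged.
        return assoc

    effect, other = assoc.effect_allele, assoc.other_allele
    beta, eaf = assoc.beta, assoc.effect_allele_frequency

    if reported_strand == "reverse":
        # Move both alleles onto the forward strand.
        effect, other = reverse_complement(effect), reverse_complement(other)

    if was_flipped:
        # Swap alleles so that effect_allele equals the dbSNP alt_allele.
        effect, other = other, effect
        beta = -beta      # assumption: "flip beta" means sign change
        eaf = 1.0 - eaf   # assumption: "flip frequency" means 1 - frequency

    return Association(effect, other, beta, eaf)
```

For example, a source row with alleles A/G reported on the forward strand and `was_flipped = True` would come out as G/A with the sign of beta reversed. The published tables appear to hold values that have already been harmonized, so a function like this is mainly useful for understanding how they relate to the source UKB-PPP files.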

| Column | Type | Description |
| --- | --- | --- |
| partition | INT | Dummy partition column to enable compatibility with Cloudflare R2 SQL API. Ignore. |
| ukb_ppp_id | TEXT | Unique assay identifier defined by UKB-PPP. |
| olink_id | TEXT | Olink platform assay identifier (e.g. OID30809). |
| gene_symbol | TEXT | HGNC gene symbol of the gene coding the target protein (e.g. GLP1R, APOE, TNF). |
| panel | TEXT | Olink panel containing the assay. |
| panel_lot | TEXT | Olink lot identifier of the panel. |
| dilution_factor | INT | Dilution factor of the assay. |
| block | TEXT | Identifier of the 96-well block within the 384-plex Olink panel. |
| in_expansion_set | BOOLEAN | Whether the assay was part of the original set (false) or added in the expansion set (true). |
| protein_id | TEXT | UniProt accession (e.g. P43220, P02649), or multiple accessions joined by underscore for multi-chain protein complexes. |
| filename | TEXT | Name of the tar archive file containing pQTL results for the assay. |

| Column | Type | Description |
| --- | --- | --- |
| partition | INT | Dummy partition column to enable compatibility with Cloudflare R2 SQL API. Ignore. |
| gene_id | TEXT | Ensembl gene identifier (e.g. ENSG00000112164). |
| gene_symbol | TEXT | HGNC gene symbol (e.g. GLP1R, APOE). |
| chromosome | TEXT | Chromosome on which the gene is located. |
| start_position | INT | Gene start position in assembly GRCh38 coordinates. |
| end_position | INT | Gene end position in assembly GRCh38 coordinates. |
| strand | TEXT | Strand of the gene. |

| Column | Type | Description |
| --- | --- | --- |
| ancestry | TEXT | Population ancestry code. Partition column. |
| protein_id | TEXT | UniProt accession. Partition column. |
| panel | TEXT | Olink panel containing the assay. Partition column. |
| chromosome | TEXT | Chromosome on which the variant is located. |
| position | INT | Variant position in assembly GRCh38 coordinates. |
| effect_allele | TEXT | Effect allele. Harmonized to equal the dbSNP alt_allele. |
| other_allele | TEXT | Non-effect allele. Harmonized to equal the dbSNP ref_allele. |
| beta | FLOAT | Effect size estimate. |
| standard_error | FLOAT | Standard error of the effect size estimate. |
| effect_allele_frequency | FLOAT | Frequency of the effect allele in the ancestry group. |
| neg_log_10_p_value | FLOAT | -log10(p-value). Higher = more significant. Genome-wide significance threshold is ~7.3. |
| variant_id | TEXT | Unique UKB-PPP variant identifier (e.g. 4:180574458:A:G:imp:v1). |
| info | FLOAT | Imputation quality score (0-1, higher is better). |
| n | INT | Sample size. |
| chi_squared | FLOAT | Chi-squared test statistic. |
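
The `neg_log_10_p_value` column can be converted back to a p-value or compared with a threshold, as in the sketch below. The function names are hypothetical. The ~7.3 threshold quoted in the table corresponds to the conventional genome-wide cutoff p = 5e-8, since -log10(5e-8) is about 7.30.

```python
import math

# Conventional genome-wide significance cutoff, p = 5e-8 (about 7.30).
GENOME_WIDE_THRESHOLD = -math.log10(5e-8)


def p_value(neg_log_10_p_value: float) -> float:
    """Convert -log10(p) back to p.

    Very strong associations underflow to 0.0 as a float, which is one
    reason to filter on the -log10 scale instead.
    """
    return 10.0 ** (-neg_log_10_p_value)


def is_genome_wide_significant(
    neg_log_10_p_value: float,
    threshold: float = GENOME_WIDE_THRESHOLD,
) -> bool:
    """True when the association meets or exceeds the threshold."""
    return neg_log_10_p_value >= threshold
```

Comparing on the -log10 scale avoids the underflow, so `is_genome_wide_significant` is the safer choice for filtering rows. A stricter threshold can be passed when correcting for the number of proteins tested.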

| Column | Type | Description |
| --- | --- | --- |
| partition | INT | Dummy partition column to enable compatibility with Cloudflare R2 SQL API. Ignore. |
| protein_id | TEXT | UniProt accession (e.g. P43220, P02649), or multiple accessions joined by underscore for multi-chain protein complexes. |
| protein_name | TEXT | Full protein name (e.g. Glucagon-like peptide 1 receptor). NOT a gene symbol; do not search for gene names here. |
| uniprot_accessions | TEXT[] | List of constituent UniProt accessions. A single-element list, or a multi-element list for multi-chain protein complexes. Cannot be filtered in queries (array type). |
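
Because `uniprot_accessions` cannot be filtered in queries, one workaround is to select `protein_id` and split it on the underscore client-side. The sketch below assumes that the underscore-joined `protein_id` lists the same accessions as `uniprot_accessions`. The helper names are hypothetical, and the two-accession identifier in the usage note is only an illustration of the format.

```python
def split_protein_id(protein_id: str) -> list[str]:
    """Split a protein_id into its constituent UniProt accessions."""
    return protein_id.split("_")


def contains_accession(protein_id: str, accession: str) -> bool:
    """True when the accession is one of the chains in protein_id."""
    return accession in split_protein_id(protein_id)
```

With these helpers, `split_protein_id("P43220")` yields a single-element list, while an identifier shaped like `"P29459_P29460"` yields one entry per chain.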

| Column | Type | Description |
| --- | --- | --- |
| variant_id | TEXT | Unique UKB-PPP variant identifier (e.g. 4:180574458:A:G:imp:v1). |
| rsid | TEXT, nullable | dbSNP rsID (e.g. rs123456). May be null for novel variants. |
| chromosome | TEXT | Chromosome on which the variant is located. Partition column. |
| position_grch37 | INT | Variant position in assembly GRCh37 coordinates. |
| position_grch38 | INT | Variant position in assembly GRCh38 coordinates. |
| effect_allele | TEXT | Effect allele of pQTL results. Harmonized to equal the dbSNP alt_allele. |
| other_allele | TEXT | Non-effect allele of pQTL results. Harmonized to equal the dbSNP ref_allele. |
| strand | TEXT, nullable | Strand of the variant. |
| is_palindromic | BOOLEAN | Variant alleles are A/T or C/G. These require specific treatment during harmonization. |
| dbsnp_build | TEXT | Version of dbSNP used to harmonize variant alleles (e.g. b156). |
| in_dbsnp | BOOLEAN | Variant has a match in dbSNP (allowing for swapping of alleles). |
| was_flipped | BOOLEAN, nullable | Variant was flipped relative to source UKB-PPP pQTL files during harmonization. |
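
The `is_palindromic` definition and the shape of `variant_id` can be illustrated with a short sketch. Both helpers are hypothetical. The parser assumes that the identifier is colon-separated as in the example `4:180574458:A:G:imp:v1`; the source does not say which genome build the embedded position uses or which of the two alleles is the effect allele, so the field names are deliberately neutral.

```python
from typing import NamedTuple


def is_palindromic(effect_allele: str, other_allele: str) -> bool:
    """True for A/T or C/G pairs, which read the same on both strands."""
    pair = {effect_allele.upper(), other_allele.upper()}
    return pair in ({"A", "T"}, {"C", "G"})


class ParsedVariantId(NamedTuple):
    chromosome: str
    position: int        # genome build not stated in the source
    first_allele: str    # allele order semantics not stated in the source
    second_allele: str
    suffix: str          # trailing fields, e.g. "imp:v1"


def parse_variant_id(variant_id: str) -> ParsedVariantId:
    """Split a colon-separated UKB-PPP variant identifier into fields."""
    chromosome, position, first, second, *rest = variant_id.split(":")
    return ParsedVariantId(chromosome, int(position), first, second, ":".join(rest))
```

For allele and position lookups, the dedicated columns (`effect_allele`, `other_allele`, `position_grch37`, `position_grch38`) are the safer source; parsing the identifier is mainly useful for joining back to the source files.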