Polygenic Scores

Despite past and present misuses of genomic data, the WLS recognizes the promise that such data have for advancing scientific inquiry into the multiple, complex factors that shape health and well-being across the life course. We are therefore pleased to offer qualified researchers access to a rich collection of polygenic scores.

To guard against overly reductive or deterministic uses of the genomic data, all requests must include the following three documents:

A brief research proposal that specifies which polygenic scores you are requesting (version 2.0, version 1.1, or Legacy Scores)
A copy of your CV
A brief statement acknowledging the potential pitfalls of genomics research and summarizing the researchers’ plan to avoid them in the design and conduct of the study and in the interpretation and reporting of findings

Here is an example of a good statement:

We intend to use the WLS data for a project studying social and genetic factors influencing life quality in old age. We plan on using polygenic scores as simple measures of genetic predispositions to (mental) health outcomes, avoiding any interpretations about genetic reductivism and determinism. We intend to study the correlations between the polygenic scores and the outcomes, and how social factors (such as educational level) may mediate or moderate (reinforce or compensate for) genetic predispositions. We are well aware that we cannot claim anything about the underlying genetic mechanisms, and we will avoid any deterministic language in the interpretation and dissemination of results. We are aware of the misuse of genetic information to justify inequality and even mistreatment of certain groups. Our goal is, in contrast, to enrich social stratification studies on health with genetic information in order to reach a more nuanced understanding of inequality in health.

Please email your request and documents to wls@ssc.wisc.edu

What is a polygenic score?

A polygenic score collapses the effects of genetic variants across the entire genome into a single quantitative measure of genetic risk for a chosen phenotype. Polygenic scores use effect sizes from genome-wide association studies (GWAS) for that phenotype as weights. The predictive power of polygenic scores increases with the sample size of the underlying GWAS.
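The weighted-sum definition above can be sketched in a few lines of Python. This is a toy illustration only, not the procedure used to construct the WLS scores (the user guides listed on this page document the actual methods). The function names polygenic_score and standardize, the genotype matrix, and the weights are all hypothetical and chosen for the example.

```python
import numpy as np

def polygenic_score(genotypes, weights):
    """Return one raw score per individual: score_i = sum_j weights[j] * genotypes[i, j].

    genotypes: (n_individuals, n_variants) effect-allele counts (0, 1, or 2)
    weights:   (n_variants,) GWAS effect-size estimates for the chosen phenotype
    """
    genotypes = np.asarray(genotypes, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if genotypes.shape[1] != weights.shape[0]:
        raise ValueError("need exactly one weight per variant")
    return genotypes @ weights

def standardize(scores):
    """Rescale scores to mean 0 and standard deviation 1 within the sample."""
    scores = np.asarray(scores, dtype=float)
    return (scores - scores.mean()) / scores.std()

# Hypothetical toy data: 4 individuals, 3 variants.
genotypes = np.array([
    [0, 1, 2],
    [2, 0, 1],
    [1, 1, 0],
    [2, 2, 2],
])
weights = np.array([0.10, -0.05, 0.02])  # made-up effect sizes

raw = polygenic_score(genotypes, weights)
z = standardize(raw)
print(raw)
print(z)
```

Scores are commonly standardized in this way before analysis, so that a regression coefficient can be read as the change in the outcome per standard deviation of the score.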

Population stratification can bias the estimated association between the outcome of interest and the polygenic score. This can happen when the distribution of the score differs across ancestry groups. A common way to account for population stratification is to control for the top 5 or top 10 principal components of the covariance matrix of the individuals’ genotypic data. For that reason, all scores found on this page are accompanied by principal components. However, since principal components can reveal fine-grained ancestry, they have been randomly shuffled within sets of 5. Users must include either principal components 1-5 or principal components 1-10 in their analysis to control for population stratification.
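A minimal sketch of this adjustment, using simulated data rather than WLS data, is shown here. It assumes that shuffling means the order of the components is permuted, so a full set must be entered together as covariates. All variable names, effect sizes, and the helper ols_coefficients are hypothetical; the actual variable names are given in the documentation for each set of scores.

```python
import numpy as np

def ols_coefficients(y, X):
    """Ordinary least squares with an intercept; returns [intercept, slopes...]."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta

rng = np.random.default_rng(0)
n = 5000

# Simulated ancestry principal components (10 columns).
pcs = rng.normal(size=(n, 10))

# The score and the outcome both depend on the first component,
# which mimics population stratification.
pgs = 0.8 * pcs[:, 0] + rng.normal(size=n)
outcome = 0.2 * pgs + 1.0 * pcs[:, 0] + rng.normal(size=n)  # assumed true effect: 0.2

# Without the components, the score coefficient absorbs the ancestry effect.
naive = ols_coefficients(outcome, pgs)[1]

# With all 10 components as covariates, the coefficient is close to 0.2.
adjusted = ols_coefficients(outcome, np.column_stack([pgs, pcs]))[1]

# Reordering the components does not change the score coefficient,
# as long as the whole set is included.
shuffled = pcs[:, rng.permutation(10)]
adjusted_shuffled = ols_coefficients(outcome, np.column_stack([pgs, shuffled]))[1]

print(f"naive: {naive:.3f}  adjusted: {adjusted:.3f}  shuffled: {adjusted_shuffled:.3f}")
```

Because the fitted coefficient on the score depends only on which components are included and not on their order, the shuffling does not affect the adjustment when a complete set is used.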

Polygenic Index Repository (version 2.0)

New: Includes polygenic indices for parents, based on simulated parental genotypes.
Note: Version 2.0 is not a simple update of the repository. Version 1.1 indices were constructed using a larger sample and contain some discontinued phenotypes.

Source study: “An Updated Polygenic Index Repository: Expanded Phenotypes, New Cohorts, and Improved Causal Inference” LINK
Documentation (PDF) — Polygenic Index Repository User Guide (updated)
Documentation (PDF) — Phenotypes in Repository (updated)

Polygenic Index Repository (version 1.1)

Source study: “Resource Profile and User Guide of the Polygenic Index Repository” LINK
Documentation (PDF) — Polygenic Index Repository User Guide
Documentation (TXT) — Phenotypes in Repository

Legacy Polygenic Scores

Educational attainment, cognitive performance and math-related scores

Source study: “Gene discovery and polygenic prediction from a genome-wide association study of educational attainment in 1.1 million individuals” LINK
Documentation (PDF) — Lee_et_al_(2018)_PGS_WLS.pdf

Depression, subjective well-being and neuroticism scores

Source study: “Multi-trait analysis of genome-wide association summary statistics using MTAG” LINK
Documentation (PDF) — Turley_et_al_(2018)_PGS_WLS.pdf