Polygenic Risk Scores for Depression

Polygenic Risk Scores (PRS)… for depression

·      Polygenic risk scores (PRS) can provide a measure of your lifetime genetic susceptibility to disease;

·      Genetic risk combines with other risk factors (such as environmental ones) to affect overall disease risk;

·      The value of PRS rests on observing inherited DNA differences across unrelated individuals in the general population to predict complex traits and common disorders such as depression.

ABOUT Polygenic Risk Scores (PRS)… preamble

‘The diversity of differences in our genomes and their complex relationship with health & disease’ NIH

Depression (also called Major Depressive Disorder or MDD) is a common but serious disorder that is growing in prevalence, now estimated to affect over 280 million people worldwide (WHO). Depression is different from the usual mood fluctuations in response to everyday challenges, during which you may experience related depressive symptoms. Formal diagnosis of depression relies on the depressive symptoms being present for most of the day, every day, for at least two weeks. There are different types of depression, and the illness can be lifelong or situational (linked with an event, in which case the symptoms may resolve with time without leading to clinical depression). Clinical assessment is carried out by healthcare professionals using a number of different instruments.

A promising avenue for screening and early detection of depression lies in genomics: quantifying our genetic susceptibility to depression-related phenotypes using Polygenic Risk Scores (PRS). PRS for depression is a susceptibility biomarker based on the understanding that depression is a polygenic trait. That is, depression has a heritable component that is polygenic in nature; ‘poly’ meaning many and ‘genic’ meaning genes. PRS for depression captures the many small-effect genetic variations spread throughout a person’s genome; epigenetic signals (gene x environment) can also contribute to depression susceptibility. Genetic susceptibility does not necessarily mean a person will develop depression or associated diseases in their lifetime; rather, susceptibility is about likelihood.

In the standard approach, the goal of PRS is to provide a measure of disease risk due to our genes, using an individual’s genetic data and the effect sizes of the variations found in their DNA to estimate relative risk and absolute risk. Researchers may look at particular genes of interest for depression-related traits, one example being the dopamine receptor D2 gene (DRD2). Effect sizes are estimated for single nucleotide polymorphisms (SNPs, the most common type of genetic variation, each present in a sufficiently large fraction of the general population) in genes such as DRD2 and used as weights on an individual’s genotypes to estimate the cumulative effect of their genetic variants. But what does this actually mean for you and me?
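As a rough illustration of the weighted-sum idea described above, the following Python sketch adds up risk-allele counts multiplied by effect sizes. The variant names, effect sizes and genotypes are invented for illustration and do not come from any real study; real PRS pipelines involve many more steps (quality control, variant selection, ancestry adjustment).

```python
from typing import Dict


def polygenic_risk_score(dosages: Dict[str, int], weights: Dict[str, float]) -> float:
    """Sum of (risk-allele count x effect size) over variants found in both inputs."""
    return sum(weights[snp] * dosages[snp] for snp in weights if snp in dosages)


# Hypothetical effect sizes (for example, log odds ratios from a GWAS).
weights = {"snp_a": 0.25, "snp_b": -0.25, "snp_c": 0.125}

# Number of risk alleles (0, 1 or 2) that one made-up person carries at each variant.
dosages = {"snp_a": 2, "snp_b": 1, "snp_c": 0}

score = polygenic_risk_score(dosages, weights)
print(score)  # prints 0.25
```

On its own the number means little; in practice a score is usually interpreted by comparing it with the distribution of scores in a reference population.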

PRS for depression is an example of the progress that has been made in (neuro)science to understand the heterogeneity of genetic risk factors for depression. Genomics study in this context factors in how these genes interact with each other and their environment. Depression can be co-morbid (meaning simultaneously present) with other conditions such as chronic stress, anxiety disorder and obesity, where one can lead to the other (depression can lead to obesity or vice versa). This implies a shared genetic architecture between depression and many other traits and their environment. So, PRS has potential clinical utility for providing the relative risk and absolute risk of depression, and the genetic associations among a range of traits that can be associated with depression.

How does the accuracy of PRS for depression compare with current clinical models? Genome-wide Association Studies (GWAS) are increasing in size and power as larger datasets become available, while costs are dropping as sequencing technology becomes faster and cheaper. Large epidemiological GWAS and other data sets (such as 23andMe) make it possible to map the common genetic variants that are spread throughout individual genomes across a population, so that these variants can be used as clues to disease risk. A disparity exists in GWAS, however: European ancestry is over-represented, while non-European populations are heavily under-represented globally. Nevertheless, studies using larger GWAS sample sizes do give PRS increased predictive power, despite the highly polygenic and multi-factorial nature of the condition.

Meanwhile, PRS continues to attract considerable interest as a susceptibility biomarker for early detection of increased risk of depression, prior to the manifestation of traditional indicators of mood disorders. PRS has also brought awareness to the inter-relatedness of complex psychiatric traits, such as depression and anxiety disorder or even obesity. This could allow health professionals to incorporate PRS into critical medical decisions, improving clinical care and treatment design that also considers patients’ environments. That said, in providing a ‘risk score’, PRS cannot provide a baseline or the timing of disease progression. PRS can only establish an underlying genetic susceptibility to a phenotype, for example, current depression, lifetime depression, or quantitative traits such as a depression symptom profile.

So, how does a modern-day geneticist whose genomics research focuses on PRS for depression grapple with such challenges, including identifying candidate genes for depression and dealing with some of the issues in translating genetic data from the lab to real-world settings? We decided to ask one!

Introducing Dr David M Howard whose research subject area is Genetics and Psychiatry.

See the Glossary of Genetic Terms below for easy reference.

Translating from the LAB to Real-World Settings… interview

‘Personalised Psychiatry in the Genomic Era’

Dr David M Howard is a Sir Henry Wellcome Postdoctoral Research Fellow based at King’s College London. Prior to this he worked as a Research Fellow in the Division of Psychiatry at the University of Edinburgh. His main area of research is focused on improving our understanding of the genetics of depression. I had the pleasure of meeting David at the King’s College London 7th Maudsley Mediterranean Forum, where I appealed for some of his precious time to bring insights from his lab to your couch!

What is the most important element for conducting research on polygenic risk scores?

The availability of large-scale genetic and epidemiological studies is vitally important for research using PRS. Within these studies there are often various ways of assessing whether someone has depression, such as self-reporting by participants or the completion of questionnaires examining depressive symptoms. There might also be data collected at different time points. However, this requires thousands of individuals to achieve a baseline of information, and entails regularly re-contacting those taking part, which is normally very time consuming, and typically there is natural attrition (participants choosing to exit the research project for reasons of their own).

The gathered data is mainly focused on individuals who live in Europe and the U.S. (such as that collected by UK Biobank and 23andMe) and who predominantly have a European ancestry. This means research often lacks representation from Asian, African, and Latino ancestries, and so the genetic results found just in a European ancestry may not be as relevant for the global population. The Martin et al. (2019) paper titled “Clinical use of current polygenic risk scores may exacerbate health disparities” showed that PRS prediction between ancestries is not great. So, when we use results from a European study and try to predict depression (or examine a depression phenotype across a range of traits) for different ancestries, there is a reduction in effectiveness.

Why are there differences in prediction between ancestries when using polygenic scores?

There are 3 billion bases in the human genome, but when we conduct genetic research, we examine a relatively small subset consisting of several million genetic variants on a genotyping array. As large stretches of DNA are inherited together, the information on the genotyping array captures a lot of the surrounding region too. We have found many associations between the genotypes on the array and depression in Europeans, and using these genotypes to predict depression in a cohort of similar ancestry works well. However, across different ancestries the genotyping array may not be capturing the same surrounding regions, so using the array in an East Asian cohort may not capture the same genetic variation, causing a reduction in effectiveness across ancestries.

What are the potential benefits of using polygenic scores in a clinical setting?

Selective serotonin reuptake inhibitors (SSRIs) and serotonin-norepinephrine reuptake inhibitors (SNRIs) are classes of medications proven effective in treating depression. However, for a third of patients, they do not work. And despite their widespread use, efficacy or improvement rates tend to use response [generally defined as a 50% decrease in scores on depression scales] rather than full remission [rates ranging between 50-67%] as the criterion for improvement. So, the patient may have symptomatically improved [by the response criterion], but still does not feel well.

Identifying how individuals are likely to respond to both pharmaceutical and psychological treatments and the likelihood of remission from depression is vital. The use of data obtained from epidemiological studies, such as biological markers, PRS, as well as genotyping and epigenetic signals [or gene expression], enables us to ultimately develop tools with clinical utility for treating depression. This has the benefit of reducing the number of different treatments an individual may need to undergo. It can also support considerations of other interventions such as the benefit of cognitive behavioural therapy or CBT only, or CBT plus particular prescribed medications, like SSRIs.

Can you tell us more about epigenetic signals?

Epigenetic signals tell us more about what is going on with our genome, such as increased or decreased gene expression, which reflects which genes are turned on or off at a particular point in time. It has been shown that including gene expression data can also increase how well we can predict depression, particularly changes in gene expression that precede the incidence or onset of a depressive episode.

Do you think we can get to a point where people can use an App to check changes in their epigenome in the future (i.e., putting you on alert)?

There is still quite a way to go before this could become a reality. But certainly, if the research of today could deliver something preemptive for picking up biological changes relevant to a whole range of different illnesses and disorders that would be fantastic.

What have you been working on recently?

I am currently developing a method and software package that improves prediction of depression across different ancestries by considering the genetic structure between the ancestries being analysed. Depending on how similar the structure is, it will increase or decrease the confidence that we have in each variant’s effect on depression for that person. For example, if we are using results from Europeans to make predictions in an East Asian cohort and the genetic structure is the same in an Asian individual as in Europeans, we can assume it continues to hold the same ‘weighting’. This works on an individual basis across a whole population, looking at each person in turn and asking: is that genetic structure similar to European? If yes, the weighting remains the same, and if not, the weighting is reduced. The software package is still in development and it does seem to work!
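The down-weighting idea described in this answer can be sketched loosely in Python. This is not Dr Howard's method or software; it is only a toy illustration under stated assumptions. The per-variant `similarity` values (0 to 1), the `threshold` and the `shrink` factor are all hypothetical, as are the variant names and effect sizes.

```python
from typing import Dict


def ancestry_adjusted_score(
    dosages: Dict[str, int],
    weights: Dict[str, float],
    similarity: Dict[str, float],
    threshold: float = 0.8,  # hypothetical cut-off for "similar enough" structure
    shrink: float = 0.5,     # hypothetical factor applied when structure differs
) -> float:
    """Keep a variant's weight if local genetic structure is similar, else reduce it."""
    total = 0.0
    for snp, weight in weights.items():
        if snp not in dosages:
            continue
        # Variants with no similarity estimate are treated as dissimilar.
        similar = similarity.get(snp, 0.0) >= threshold
        factor = 1.0 if similar else shrink
        total += factor * weight * dosages[snp]
    return total


# Made-up effect sizes from a European study and one person's genotypes.
weights = {"snp_a": 0.5, "snp_b": 0.25}
dosages = {"snp_a": 1, "snp_b": 2}
# Made-up measure of how closely this person's local structure matches Europeans.
similarity = {"snp_a": 0.9, "snp_b": 0.4}

adjusted = ancestry_adjusted_score(dosages, weights, similarity)
unadjusted = sum(weights[s] * dosages[s] for s in weights)
print(unadjusted, adjusted)  # prints 1.0 0.75
```

Here the second variant sits in a region whose structure differs, so its contribution is halved while the first variant keeps its full weight.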

Others are also developing similar tools as awareness has increased of the issues that arise if we only focus on European individuals. For example, if we only learn about the biological pathways for depression and other diseases from Europeans, then the potential treatments may only be applicable to European individuals. We already have massive inequalities in global health and we need more non-European ancestry cohorts to ensure that research benefits as many people as possible.

Last year I published a study on DNA methylation, early age risk factors and mental health (Howard et al., 2021). This highlights that early life events have an association with long-term changes to regions of the methylome. Additionally, I analysed depression to see if there were any overlaps between parts of the methylome related to early life events and mental health. The study was likely underpowered, but it was a useful demonstration of further work that should be conducted.

Some of my previous work (Howard et al., 2019) has shown multiple brain regions have altered gene expression associated with depression, including the prefrontal cortex. As more and more individuals are included in a study it increases the power to detect associations, so we can also pick up smaller effect sizes in different regions, such as the amygdala and hippocampus. The Psychiatric Genomics Consortium approach of combining many smaller studies will likely aid in the detection of genes and biological pathways for depression.

What other exciting genetics research is going on related to depression?

Currently, I’m involved with an ongoing analysis led by the Psychiatric Genomics Consortium. The analysis will examine genetic data from around 500,000 depression cases and 3,000,000 controls (due to come out 2023) for an association with depression. This work should greatly improve on what we already know about depression and genetics. Meanwhile, Levey et al. (2021) recently conducted a genome-wide association study of depression, using an African American cohort of 30,000 and a European cohort of 500,000. It’s still early days in the analysis of non-European cohorts, but the study was well powered and a big step forward, which is super positive!

Talking Glossary of Genetic Terms

Genomics vs. Genetics: Genomics is the study of all of an individual’s genome, the genetic material made up of DNA, which includes genes and other components that control the activity of those genes. Genomics study encompasses how those genes interact with each other and their environment. Genetics is the study of genes and their inheritance, identifying variations or mutations in a single gene or multiple genes that are scattered throughout the genome. Inheritance refers to the passing on of traits from parents to their offspring, which can carry genetic susceptibility that may increase or decrease the relative risk of disease, although some genetic variations can occur randomly and so are not inherited;

Genetic Variants are slight differences in near-identical DNA sequences that occur at specific positions along the DNA and that make each of us unique across populations. Some variants increase the likelihood of developing diseases, while others may reduce risk or have no biological effect. In some cases, genetic variations are due to the environment;

Environment or Environmental Factors: exposure to factors that can increase risk susceptibility (so-called gene x environment interaction), such as drugs, poor nutrition, mental stress, trauma from adverse childhood experiences (ACEs) and war, pollution, cigarette smoke, and status (e.g., socioeconomic & educational status);

Genetic susceptibility is associated with an increased, or in some cases decreased, chance of an individual developing a disease or disorder based on the presence of one or more (possibly thousands of) individual genetic variants and family history (heritable risk). From a clinical standpoint, that individual does not yet have that disease, and susceptibility does not mean they will develop it;

Relative risk compares the likelihood of an event, such as developing a disorder, between two groups, for example people with high and low genetic susceptibility. Thus, relative risk expresses an increase or decrease in the likelihood of an event based on some exposure;

Absolute risk is the probability that an individual will develop a disorder over a given period of time. An individual’s absolute risk can be higher because they inherit a genetic variant in one gene (monogenic disorders) or a combination of many variants in different genes (multifactorial) that elevates susceptibility risk or overtly causes a disorder;

Deoxyribonucleic acid (DNA) is the molecule that carries genetic information or instructions for the development and functioning of an organism from birth, including humans;

Single nucleotide polymorphisms (SNPs) are genomic variants at a single base position in the DNA, which are studied to find out if and how those variations can influence disease risk;

Genotype is a scoring of the type of variant present at a given location in a person’s genome, or can also be represented by the actual DNA sequence at a specific location. DNA sequencing and other methods such as genotyping arrays can be used to determine the genotypes at millions of locations in a genome. Some genotypes contribute to an individual’s phenotype;

Phenotype: observable characteristics or traits, like height and eye colour, or biochemical and/or developmental traits of a disease state (e.g., depressive symptomology; recurrent depression with onset before age 30), determined by genomic or genetic makeup (genotype) and environment;

Polygenic Traits: polygenic means two or more genes, and traits refer to phenotypic characteristics such as baseline depressive symptoms. Environmental factors such as socio-economic status can influence many polygenic traits;

Polygenic risk scores or PRS are constructed as the weighted sum of the genetic variants a person carries that are related to a particular disease phenotype, with each variant weighted by its effect size. The resulting risk can be moderated by environmental factors. PRS do not provide a baseline or time-frame for disease progression or causation;

Biomarker refers to a broad subcategory of medical signs or objective markers of an individual’s medical state, which can be measured accurately and reproducibly. PRS for depression is a susceptibility/risk biomarker, while blood pressure is a physiological biomarker. LDL cholesterol level and epigenetic signal are molecular biomarkers, while a genetic biomarker like the BRCA1 gene can indicate the likelihood of developing breast cancer later in life.

Epigenetics refers to chemical modifications to the DNA and histones that do not alter the DNA sequence but regulate gene expression, moderated by environmental factors. Such alterations in gene expression can be passed on during cell division and from one generation to the next, and can affect risk susceptibility;

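Genome-wide Association Studies (abbreviated GWAS) scan common DNA sequence variations across the genomes of many individuals to find variants associated with a trait or disease. They build on public databases of common human genome sequence variation, such as the one started by the International HapMap Project, launched in October 2002, which map variations in nucleotide bases in individuals’ DNA. GWAS of mutations or copy number variations have been used to build PRS models, as well as gene expression data;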

Epidemiology is the study of how often diseases occur in different population groups and why (i.e., what are the factors that determine the presence or absence of diseases) asking: 1) how many people have a disease, 2) if those numbers are changing, and 3) how the disease affects society and the economy.
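To make the glossary's distinction between absolute risk and relative risk concrete, here is a small Python sketch. The counts are invented purely for illustration and do not describe any real cohort; the group labels are likewise assumptions.

```python
def absolute_risk(cases: int, total: int) -> float:
    """Proportion of a group that develops the disorder over the study period."""
    return cases / total


def relative_risk(risk_exposed: float, risk_unexposed: float) -> float:
    """How many times more likely the event is in one group than in the other."""
    return risk_exposed / risk_unexposed


# Made-up cohort: people in a high-PRS group versus everyone else.
risk_high_prs = absolute_risk(cases=50, total=200)  # 0.25
risk_others = absolute_risk(cases=100, total=800)   # 0.125

rr = relative_risk(risk_high_prs, risk_others)
print(risk_high_prs, risk_others, rr)  # prints 0.25 0.125 2.0
```

In this toy example the high-PRS group is twice as likely to develop the disorder (relative risk of 2), yet three in four people in that group still do not develop it (absolute risk of 25%).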


Martin, A.R., Kanai, M., Kamatani, Y. et al. Clinical use of current polygenic risk scores may exacerbate health disparities. Nat Genet 51, 584–591 (2019). Retrieved from https://www.nature.com/articles/s41588-019-0379-x

Howard, D.M., Pain, O. et al. Methylome-wide association study of early life stressors and adult mental health. Hum Mol Genet 31(4), 651–664 (2022). Retrieved from https://academic.oup.com/hmg/article/31/4/651/6370548

Howard, D.M., Adams, M.J., Clarke, TK. et al. Genome-wide meta-analysis of depression identifies 102 independent variants and highlights the importance of the prefrontal brain regions. Nat Neurosci 22, 343–352 (2019). Retrieved from https://www.nature.com/articles/s41593-018-0326-7

Levey, D.F., Stein, M.B., Wendt, F.R. et al. Bi-ancestral depression GWAS in the Million Veteran Program and meta-analysis in >1.2 million individuals highlight new therapeutic directions. Nat Neurosci 24, 954–963 (2021). Retrieved from https://www.nature.com/articles/s41593-021-00860-2

Interviewer and Writer: Treesje Verlinden

Interview Edits: Dr David Howard
