Research Projects Directory

14,139 active projects

This information was updated 11/4/2024

The Research Projects Directory includes information about all projects that currently exist in the Researcher Workbench to help provide transparency about how the Workbench is being used. Each project specifies whether Registered Tier or Controlled Tier data are used.

Note: Researcher Workbench users provide information about their research projects independently. Views expressed in the Research Projects Directory belong to the relevant users and do not necessarily represent those of the All of Us Research Program. Information in the Research Projects Directory is also cross-posted on AllofUs.nih.gov in compliance with the 21st Century Cures Act.

Environment x CAD

Scientific Questions Being Studied

This project aims to investigate whether a novel risk score can enhance prediction of cardiovascular disease risk. The goal is to assess the effectiveness of this approach compared to existing models and explore its potential for personalized health applications. This research is relevant to public health because it addresses the need for improved methods to identify individuals at higher risk, which could support earlier interventions and better health outcomes.

Project Purpose(s)

  • Methods Development

Scientific Approaches

This study will use advanced statistical and machine learning methods to develop and evaluate a predictive model. The project will utilize a large, de-identified dataset with multiple health-related variables, applying techniques for data preprocessing, model training, and validation. The analysis will include machine learning algorithms, feature selection, and interpretability methods to assess and refine the model’s performance and reliability.

Anticipated Findings

The study is anticipated to yield a predictive model that accurately estimates the risk of a specific health outcome. Expected findings may highlight key risk factors and provide insights into complex interactions between variables relevant to this outcome. This research will contribute to the field by advancing our understanding of the predictive potential of non-traditional risk factors, enhancing the accuracy of risk stratification models, and demonstrating the applicability of machine learning techniques in clinical prediction. Ultimately, this could inform targeted interventions and personalized risk assessments, benefiting patient care and public health.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

  • Kee Park - Graduate Trainee, Icahn School of Medicine at Mount Sinai

EXT_GWAS

Scientific Questions Being Studied

Our analyses seek to understand the genetic factors associated with externalizing behaviors and disorders.

Project Purpose(s)

  • Disease Focused Research (Externalizing)
  • Ancestry

Scientific Approaches

We will perform large-scale genetic association studies to identify variants associated with externalizing traits.

Anticipated Findings

We will identify and replicate variants identified in other studies to improve our understanding of the genetic basis of externalizing.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

[GY]Demonstration Project

Scientific Questions Being Studied

The aim of the study is to identify clinical, environmental, and genetic risk factors for disease and treatment outcomes and to develop precision medicine strategies. We are interested in demographics (sex, race/ethnicity, etc.), socioeconomic factors, environmental factors, clinical factors, and genomic data (WGS/array).

Project Purpose(s)

  • Educational

Scientific Approaches

We plan to identify patients using ICD-9/10 and SNOMED codes. Chi-squared tests and t-tests will be used to compare cases and controls to identify risk factors. Logistic regression analysis and Cox proportional hazards models will be employed to develop prediction models.
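
As an illustration of the case-control comparison described above, here is a minimal sketch of a Pearson chi-squared test on a 2x2 table. It is not the project's code: the function name and the counts are hypothetical, and it uses only the Python standard library.

```python
def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared statistic (no continuity correction) for the table
    [[a, b], [c, d]], where rows are cases/controls and columns are
    exposed/unexposed to a candidate risk factor."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Hypothetical counts, for illustration only.
stat = chi_squared_2x2(30, 70, 15, 85)
# 3.841 is the critical value for 1 degree of freedom at alpha = 0.05.
significant = stat > 3.841
```

A statistic above the critical value suggests that the factor is distributed differently between cases and controls. A real analysis would more likely rely on an established statistics package.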

Anticipated Findings

We anticipate that these studies will greatly enhance our knowledge of treatment risk factors and contribute to the development of treatment strategies in diverse U.S. populations.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Jeong Yee - Teacher/Instructor/Professor, Sungkyunkwan University, School of Pharmacy
  • gayeong seo - Graduate Trainee, Sungkyunkwan University, School of Pharmacy

Exploring Dataset

Scientific Questions Being Studied

Identify the biological mechanisms that influence brain structure and the risk of mental health conditions. Understanding which genes are involved in these variations is crucial for developing targeted therapies and personalised treatments.

Project Purpose(s)

  • Disease Focused Research (Neuropsychiatric disorders)

Scientific Approaches

I will apply advanced statistical modelling and machine learning to large genetic, brain imaging, sleep, and mental health datasets.

Anticipated Findings

By understanding how our genes impact brain structure and mental health, we can improve prevention, screening, diagnosis, and treatment for neurological and psychological conditions, ultimately enhancing patient care for people living with these disorders.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

(YH) Demonstration project - controlled tier

Scientific Questions Being Studied

The aim of the study is to identify clinical, environmental, and genetic risk factors for disease and treatment outcomes and to develop precision medicine strategies. We are interested in demographics (sex, race/ethnicity, etc.), socioeconomic factors, environmental factors, clinical factors, and genomic data (WGS/array).

Project Purpose(s)

  • Educational

Scientific Approaches

We plan to identify patients using ICD-9/10 and SNOMED codes. Chi-squared tests and t-tests will be used to compare cases and controls to identify risk factors. Logistic regression analysis and Cox proportional hazards models will be employed to develop prediction models.

Anticipated Findings

We anticipate that these studies will greatly enhance our knowledge of treatment risk factors and contribute to the development of treatment strategies in diverse U.S. populations.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Jeong Yee - Teacher/Instructor/Professor, Sungkyunkwan University, School of Pharmacy
  • YUNHA KIM - Graduate Trainee, Sungkyunkwan University, School of Pharmacy

KN Duplicate of 2024 ASHG workshop: Type 2 diabetes genomic analyses

Scientific Questions Being Studied

This workspace is intended for educational purposes for a 2024 ASHG workshop training on getting started with biomedical and genomic data in the All of Us Researcher Workbench. This workspace demonstrates preparing a phenotype file and running a genome-wide association study (GWAS) with Hail for type 2 diabetes as well as querying the All by All tables. This workspace and its workshop are developed by Baylor College of Medicine’s All of Us Evenings with Genetics Research Program (https://bcm.edu/allofuseveningswithgenetics). Any questions about the workspace can be directed to Julie Coleman (julie.coleman@bcm.edu).

Project Purpose(s)

  • Educational

Scientific Approaches

In the Cohort Builder Tool and Dataset Builder Tool, we will create a cohort for type 2 diabetes (cases) and use this cohort to create a dataset. In the Cohort Builder Tool and Dataset Builder Tool, we will also create a cohort for having short read whole-genome sequencing (srWGS) data (cases and controls) and use this cohort to create a dataset that includes demographic concepts. In the first Jupyter Notebook, we will form the two datasets created into one dataset of phenotypic data in the proper format for GWAS analysis with Hail. In the second Jupyter Notebook, we will retrieve desired genomic datasets as a Hail MatrixTable (MT), downsample the genomic data for preliminary runs before a final GWAS, annotate the phenotypic data onto the genomic data, perform quality control (QC), view allele frequencies (AFs), run GWAS, and query All by All tables.
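
The first notebook step, combining the two datasets into one phenotype table, can be sketched as follows. This is an assumption-laden illustration rather than the workshop's code: the function name, the field names (`person_id`, `age`, `sex_at_birth`), and the `t2d` column are hypothetical, and the participants are made up.

```python
def build_phenotype_rows(case_ids, wgs_rows):
    """Label each srWGS participant as a case (1) or a control (0).

    case_ids: participant IDs from the type 2 diabetes cohort.
    wgs_rows: list of dicts from the srWGS dataset, each with a 'person_id'
              key plus demographic fields (field names are hypothetical).
    """
    cases = set(case_ids)
    rows = []
    for row in wgs_rows:
        labelled = dict(row)  # copy so the input rows are left unchanged
        labelled["t2d"] = 1 if row["person_id"] in cases else 0
        rows.append(labelled)
    return rows


t2d_cohort = [101, 103, 999]  # 999 has no srWGS data, so it drops out
srwgs_dataset = [
    {"person_id": 101, "age": 61, "sex_at_birth": "Female"},
    {"person_id": 102, "age": 48, "sex_at_birth": "Male"},
    {"person_id": 103, "age": 55, "sex_at_birth": "Male"},
]
phenotypes = build_phenotype_rows(t2d_cohort, srwgs_dataset)
```

Only participants with srWGS data are kept, since only they can be joined to the genomic data later.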

Anticipated Findings

There are not any anticipated findings as these notebooks are for demonstration purposes to educate others on the Researcher Workbench.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Kisung Nam - Graduate Trainee, Seoul National University, Graduate School of Data Science

Duplicate of PGS for Breast Carcinoma in Diverse Populations

Scientific Questions Being Studied

In the United States breast cancer is a top contributor to cancer-related deaths among women. Polygenic scores (PGS) are constructed from single nucleotide polymorphisms (SNPs) identified in genome-wide association studies (GWAS). Unfortunately, GWAS have been conducted disproportionately in people with European ancestry, limiting PGS efficacy in genetically diverse and admixed populations. To address these disparities, we want to develop a PGS for breast cancer aggressiveness based on associations identified in a diverse cohort.

Project Purpose(s)

  • Disease Focused Research (breast carcinoma)
  • Population Health
  • Ancestry

Scientific Approaches

To address these disparities, we will develop a PGS for breast cancer aggressiveness based on associations identified in a diverse cohort. To construct the PGS, we will leverage socioeconomic and environmental data. Integration of this diverse dataset will increase the predictive accuracy of the PGS among both European and non-European patients.

Anticipated Findings

This multifaceted approach seeks to bolster the predictive precision and underscore the critical role of environmental determinants on disease risk. Taken together, these findings will emphasize the necessity of inclusive genomic research for elucidating disease architecture and developing predictive models to better serve diverse patient populations.

Demographic Categories of Interest

  • Race / Ethnicity
  • Access to Care

Data Set Used

Controlled Tier

Research Team

Owner:

Celiac Disease

Scientific Questions Being Studied

The goal of this experiment is to find what causes symptoms in patients who have celiac disease and how those symptoms can be minimized. This will help people get diagnosed at the youngest age possible and adjust to a way of living that helps prevent symptoms of celiac disease. The disease is incurable, but following a certain lifestyle can reduce the symptoms and protect a patient from developing other autoimmune issues.

Project Purpose(s)

  • Educational

Scientific Approaches

The approach for the study is to look at specific genes that are present in people with celiac disease and compare these people to those who have the same genes but no symptoms of celiac disease. The demographic information of people with celiac disease will also be studied to look for correlations between the genes and the symptoms of celiac disease.

Anticipated Findings

The anticipated finding of the experiment is a pattern within celiac disease that helps the general population be diagnosed as early as possible, since late diagnosis can result in other autoimmune issues. Knowing the pattern of who suffers from this disease will help the scientific community get people diagnosed earlier and faster.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

Nick's Duplicate of 2024 ASHG workshop: Type 2 diabetes genomic analyses

Scientific Questions Being Studied

This workspace is intended for educational purposes for a 2024 ASHG workshop training on getting started with biomedical and genomic data in the All of Us Researcher Workbench. This workspace demonstrates preparing a phenotype file and running a genome-wide association study (GWAS) with Hail for type 2 diabetes as well as querying the All by All tables. This workspace and its workshop are developed by Baylor College of Medicine’s All of Us Evenings with Genetics Research Program (https://bcm.edu/allofuseveningswithgenetics). Any questions about the workspace can be directed to Julie Coleman (julie.coleman@bcm.edu).

Project Purpose(s)

  • Educational

Scientific Approaches

In the Cohort Builder Tool and Dataset Builder Tool, we will create a cohort for type 2 diabetes (cases) and use this cohort to create a dataset. In the Cohort Builder Tool and Dataset Builder Tool, we will also create a cohort for having short read whole-genome sequencing (srWGS) data (cases and controls) and use this cohort to create a dataset that includes demographic concepts. In the first Jupyter Notebook, we will form the two datasets created into one dataset of phenotypic data in the proper format for GWAS analysis with Hail. In the second Jupyter Notebook, we will retrieve desired genomic datasets as a Hail MatrixTable (MT), downsample the genomic data for preliminary runs before a final GWAS, annotate the phenotypic data onto the genomic data, perform quality control (QC), view allele frequencies (AFs), run GWAS, and query All by All tables.
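
To make the allele frequency (AF) step concrete, here is a small sketch of how an alternate-allele frequency is computed at a single biallelic site. It is plain Python for illustration and does not use Hail. The function name and the genotype encoding (0, 1, or 2 alternate alleles per diploid sample, `None` for a missing call) are assumptions.

```python
def alt_allele_frequency(genotypes):
    """Alternate-allele frequency at one biallelic site.

    genotypes: alternate-allele counts per diploid sample (0, 1, or 2),
               with None for a missing call.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return None  # no usable calls at this site
    # Each called diploid sample contributes two alleles to the denominator.
    return sum(called) / (2 * len(called))


af = alt_allele_frequency([0, 1, 2, None, 1, 0])
```

Missing calls are excluded from both the numerator and the denominator, so the frequency reflects only the samples that were actually genotyped at the site.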

Anticipated Findings

There are not any anticipated findings as these notebooks are for demonstration purposes to educate others on the Researcher Workbench.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Nick Shrine - Research Fellow, University of Leicester

Pulmonary Fibrosis

Scientific Questions Being Studied

Our study aims to address critical scientific questions regarding pulmonary fibrosis (PF): What genetic variations contribute to the onset and progression of PF, and how do comorbidities interact with these genetic factors? While genetic associations, such as those involving the MUC5B gene on chromosome 11, have been identified, a comprehensive understanding of genetic diversity influencing PF remains limited. By investigating these variations, we aim to enhance early detection and targeted treatment. Additionally, we seek to determine how PF interacts genetically with comorbidities like coronary artery disease and diabetes to identify shared pathways that could magnify disease risk. The answers could revolutionize PF management through improved diagnostic models and personalized interventions, significantly impacting public health by facilitating earlier, more effective care and reducing healthcare burdens.

Project Purpose(s)

  • Disease Focused Research (pulmonary fibrosis)
  • Population Health
  • Social / Behavioral
  • Methods Development
  • Control Set
  • Ancestry
  • Ethical, Legal, and Social Implications (ELSI)

Scientific Approaches

We will use an integrative research approach combining genomic, clinical, and statistical analyses. Our primary datasets include electronic health records and genetic data from the eMERGE consortium and the All of Us registry, which provide extensive biorepository samples linked to detailed clinical information. We will employ Genome-Wide Association Studies (GWAS) to identify significant genetic variants linked to PF and utilize principal component analysis (PCA) and Phenome-Wide Association Studies (PheWAS) to examine genetic overlaps with comorbidities. To handle missing data and ensure robust findings, imputation methods like multiple imputation by chained equations (MICE) will be applied. The development of a polygenic risk score (PRS) model will incorporate genetic and phenotypic data. These methods aim to create a predictive tool that integrates genetic, environmental, and comorbidity data for enhanced PF risk assessment.
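
The genetic part of a polygenic risk score is commonly computed as a weighted sum of effect-allele dosages. The sketch below shows that calculation only. It is not the project's model: the function name, variant identifiers, and weights are hypothetical, and combining the score with phenotypic or comorbidity data is out of scope here.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of allele dosages over variants present in both maps.

    dosages: {variant_id: dosage of the effect allele, from 0 to 2}
    weights: {variant_id: per-allele effect size, e.g. a log odds ratio}
    """
    shared = dosages.keys() & weights.keys()
    return sum(dosages[v] * weights[v] for v in shared)


# Hypothetical variants and effect sizes, for illustration only.
weights = {"var_a": 0.5, "var_b": -0.25, "var_c": 0.125}
person = {"var_a": 2, "var_b": 1, "var_d": 1}
score = polygenic_score(person, weights)
```

Variants without a weight (`var_d`) or without a genotype (`var_c`) are skipped in this sketch. A real pipeline would need an explicit policy for missing genotypes, such as imputation.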

Anticipated Findings

Our study will reveal novel genetic variants associated with both the onset and progression of pulmonary fibrosis (PF), enhancing the current understanding of the disease’s genetic basis. We expect to confirm the significance of known genetic markers, such as the MUC5B gene variant on chromosome 11, and identify additional loci contributing to PF risk. The analysis of comorbidities using PCA and PheWAS will likely uncover shared genetic pathways between PF and diseases like coronary artery disease and diabetes, suggesting potential mechanisms of shared pathophysiology.

These findings will contribute to the body of scientific knowledge by providing a comprehensive view of the multifactorial nature of PF, integrating genetic, environmental, and comorbidity data. This integrated approach could pave the way for the development of a predictive polygenic risk score (PRS) model, enabling earlier identification of high-risk individuals and informing personalized management strategies.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Geography
  • Disability Status
  • Access to Care

Data Set Used

Registered Tier

Research Team

Owner:

Duration of RSV from FCγRIIA mutations receptor in immunocompromised individuals

Scientific Questions Being Studied

The purpose of our study is to examine the relationship between the duration of infection and the level of mutation of the FCγRIIA immune receptor in immunocompromised individuals.

Project Purpose(s)

  • Educational

Scientific Approaches

We will use the genomic data of immunocompromised individuals to determine the level of mutation in the FCγRIIA immune receptor, and then correlate that level with the time it takes for these individuals to overcome infection.

Anticipated Findings

We anticipate finding that immunocompromised individuals with FCγRIIA mutations will have a prolonged RSV infection. The results of this experiment could contribute to a greater understanding of host and pathogen interaction. It could also become an area of interest for genetics-based treatment in the future to help immunocompromised individuals.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Arsh Khan - Undergraduate Student, Arizona State University

Collaborators:

  • Mariam Khan - Undergraduate Student, Arizona State University
  • Kennedy Patton - Undergraduate Student, Arizona State University

Biostats Final Presentation

Scientific Questions Being Studied

Assessing the racial and ethnic disparities of opioid overdose deaths in the United States

Project Purpose(s)

  • Educational

Scientific Approaches

I am unsure at this time -- Still learning

Anticipated Findings

There will be a difference in opioid mortality rates between various ethnic/racial groups

Demographic Categories of Interest

  • Race / Ethnicity

Data Set Used

Registered Tier

Research Team

Owner:

  • Maggie Dries - Graduate Trainee, University of Texas Medical Branch (UTMB) at Galveston

Duplicate of All by All - Drug Phenotypes Curation

Scientific Questions Being Studied

This Featured Workspace provides details about how drug exposure phenotypes were curated for downstream genome- and phenome-wide analysis in All by All. The All by All tables encompass about 3,400 phenotypes with gene-based and single-variant associations across nearly 250,000 whole genome sequences, with drug exposures as an included phenotype category. More details about the All by All tables can be found in the User Support Hub Article: https://support.researchallofus.org/hc/en-us/articles/27049847988884-Overview-of-the-All-by-All-tables-available-on-the-All-of-Us-Researcher-Workbench.

Within the Featured Workspace, a ReadMe file provides more information about the drug exposure phenotypes. Each phenotype is included as a separate notebook, which includes a graphical summary and descriptive statistics of the data. The ReadMe file includes an index of all the phenotypes and notebooks included in the Featured Workspace.

Project Purpose(s)

  • Educational

Scientific Approaches

Briefly, drug exposure concept IDs were queried using ATC vocabulary and cases (participants who reported drug exposure) were assigned a value of True, while control participants were assigned a value of False. The resultant participant level summaries for each drug exposure phenotype were then used in downstream genome- and phenome-wide analysis.
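
The case and control assignment described above amounts to one boolean flag per participant. A minimal sketch, assuming the exposed participants have already been identified from the drug exposure query (the function name and participant IDs are hypothetical):

```python
def drug_exposure_phenotype(all_participants, exposed_participants):
    """Map each participant ID to True (case: drug exposure recorded)
    or False (control: no recorded exposure)."""
    exposed = set(exposed_participants)
    return {pid: pid in exposed for pid in all_participants}


# Hypothetical participant IDs, for illustration only.
phenotype = drug_exposure_phenotype([1, 2, 3, 4], [2, 4])
n_cases = sum(phenotype.values())
```

One such mapping per drug exposure phenotype gives the participant-level summary used in the downstream analysis.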

Anticipated Findings

The All by All tables leverage the genomic data and rich array of phenotypic data available from All of Us participants. The billions of association testing results available in the All by All data will enable many types of research studies geared towards understanding the genetic contribution to a variety of phenotypes, including drug exposures.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Josiah Perry - Graduate Trainee, University of Alabama at Birmingham

(Duplicate) Allele Combinations for apoE and T2DM/Hypertension

Scientific Questions Being Studied

How do apoE allele combinations increase the risk of Type 2 Diabetes Mellitus and Hypertension for individuals aged 50 and up? Relevance-- Determining correlation of apoE allele frequencies with the presence of T2DM/Hypertension (for an older demographic).

Project Purpose(s)

  • Educational

Scientific Approaches

The research will be used to find specific patterns of allele combinations of the apoE gene that increase the risk of Type 2 Diabetes. Datasets will be used to organize data on individuals aged 50+ who do/don't have T2DM, compared with data on the presence of frequent allele combinations found within each category. With this information, graphs will be created for visual comparison of these data, allowing for the analysis of this research and for conclusions pertaining to the research question.

Anticipated Findings

The anticipated finding from the study is that apoE e3/e4 combinations, or the presence of e4 alleles, will correlate more strongly with the presence of T2DM/Hypertension in adults aged 50 and up. This information would expand knowledge about how specific allele combinations (hereditary genetics) might contribute to the presence of T2DM/Hypertension in individuals. Conclusions can then be applied to real-world scenarios and solutions; for example, the risk of T2DM/Hypertension could be better detected in individuals.

Demographic Categories of Interest

  • Age

Data Set Used

Registered Tier

Research Team

Owner:

Duplicate of How to Work with All of Us Survey Data (v7)

Scientific Questions Being Studied

We recommend that all researchers explore the notebooks in this workspace to learn the basics of All of Us Program Data.

What should you expect?
By running the notebooks in this workspace, you should become familiar with how to query PPI questions/surveys and how to find the frequencies of answers for each question in each PPI module.

Project Purpose(s)

  • Educational
  • Methods Development
  • Other Purpose (This is an All of Us Tutorial Workspace created by the Researcher Workbench Support team. It is meant to provide instruction for key Researcher Workbench components and All of Us data representation.)

Scientific Approaches

By running the notebooks in this workspace, you should become familiar with how to query PPI questions/surveys and how to find the frequencies of answers for each question in each PPI module.

Anticipated Findings

By reading and running the notebooks in this Tutorial Workspace, researchers will learn the following:
- how to query the survey data,
- how to summarize PPI modules and questions.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

  • Rami Namou - Graduate Trainee, University of California, Riverside

Duplicate of Phenotype - Type 2 Diabetes (v7)

Scientific Questions Being Studied

The Notebooks in this Workspace can be used to implement well-known phenotype algorithms in one’s own research, using the Controlled Tier Curated Data Repository (CDR).

Project Purpose(s)

  • Educational
  • Methods Development
  • Other Purpose (This is an All of Us Phenotype Library Workspace created by the Researcher Workbench Support team. It is meant to demonstrate the implementation of key phenotype algorithms within the All of Us Research Program cohort, using the Controlled Tier Curated Data Repository (CDR).)

Scientific Approaches

Controlled-tier All of Us cohort data; Jupyter Notebooks, Cohort Builder, Concept Set Selector, Dataset Selector

Anticipated Findings

By reading and running the Notebooks in this Phenotype Library Workspace, researchers can implement the following phenotype algorithm: Jennifer Pacheco and Will Thompson. Northwestern University. Type 2 Diabetes Mellitus. PheKB; 2012. Available from: https://phekb.org/phenotype/18

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Rami Namou - Graduate Trainee, University of California, Riverside

Phenotypic correlates to non-coding variation surrounding the SCN5A-SCN10A locus

Scientific Questions Being Studied

I aim to study how genetic variation within the non-coding genome affects cardiovascular disease. The majority of genetic variation occurs in non-coding regions, offering many more opportunities to identify genetic risk than in coding sequence alone. The non-coding genome is also incompletely understood, offering us the opportunity to indirectly learn relationships between promoters/enhancers and genes by studying how variation surrounding disease-specific genes leads to different phenotypes. For example, Brugada Syndrome is an arrhythmia disorder that increases the risk for sudden cardiac death. Lead variants identified by GWAS are in non-coding regions surrounding the SCN5A locus. I will first determine whether each of these variants is associated with the development of cardiovascular-related phenotypes. Then I will ask whether variants surrounding the same locus lead to different types of cardiovascular phenotypes. Finally, I will apply this model to other disease-associated genes.

Project Purpose(s)

  • Disease Focused Research (Cardiovascular disease)
  • Ancestry

Scientific Approaches

I plan to study the population of individuals who have full-genome sequencing data. I will identify those individuals with particular genetic variants within regulatory elements surrounding the SCN5A locus and then search for cardiovascular phenotypes enriched in these individuals versus those with wild-type alleles at the same sites. Some of these cardiovascular phenotypes may include development of disease, use of cardiospecific pharmacologics, or need for cardiovascular surgical intervention. I will also compare the phenotypes associated with the various genotypes surrounding the same locus, asking whether they are all associated with the same phenotype or have unique profiles. I can apply this same method to other disease-specific loci.
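
One simple way to quantify enrichment of a phenotype in variant carriers versus non-carriers is an odds ratio from a 2x2 table. The sketch below is an illustration under stated assumptions, not the author's method. The function name and counts are hypothetical, and the choice to add 0.5 to every cell when any cell is zero (the Haldane-Anscombe correction) is mine.

```python
def odds_ratio(carriers_with, carriers_without,
               noncarriers_with, noncarriers_without):
    """Odds ratio for a phenotype in carriers versus non-carriers.

    If any cell is zero, 0.5 is added to every cell (Haldane-Anscombe
    correction) so that the ratio stays finite.
    """
    cells = [carriers_with, carriers_without,
             noncarriers_with, noncarriers_without]
    if any(c < 0 for c in cells):
        raise ValueError("counts must be non-negative")
    if 0 in cells:
        cells = [c + 0.5 for c in cells]
    a, b, c, d = cells
    return (a * d) / (b * c)


# Hypothetical counts, for illustration only.
enrichment = odds_ratio(12, 88, 30, 870)
```

An odds ratio above 1 indicates that the phenotype is more common among carriers. A full analysis would also report a confidence interval and adjust for covariates such as ancestry.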

Anticipated Findings

I hope to both better characterize the genetic architecture of the SCN5A locus and also establish a model for studying how non-coding variation can impact disease. One possible outcome is that all non-coding variants around this locus are associated with arrhythmia-associated disease phenotypes (diagnosed with arrhythmia disorders, prescribed anti-arrhythmics) given that these variants have all previously been identified in GWAS for Brugada Syndrome. Another possibility is that no variant is significantly associated with cardiovascular phenotypes. Each regulatory element may contribute minimal risk on its own and therefore risk might be better captured by combining variation across a larger locus. Either finding would help us better characterize non-coding risk factors for cardiovascular disease and in doing so indirectly uncover relationships between the regulatory and coding sequence at a disease-specific locus.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Jeff Hansen - Research Fellow, Northwestern University

Genomic Data

Interested in exploring the racial diversity of genomic data for potential precision medicine applications.

Scientific Questions Being Studied

Interested in exploring the racial diversity of genomic data for potential precision medicine applications.

Project Purpose(s)

  • Disease Focused Research (cancer)
  • Educational

Scientific Approaches

Starting off by creating datasets for various types of cancer and analyzing genomic data to explore ML project feasibility.

Anticipated Findings

We have yet to determine precisely what studies to pursue, although the broad goal is to use ML for precision medicine. The particular projects will rely on the diversity and size of the data.

Demographic Categories of Interest

  • Race / Ethnicity

Data Set Used

Controlled Tier

Research Team

Owner:

  • Anisha Reddy - Undergraduate Student, University of Southern California

YFHDuplicate of 2024 ASHG workshop: Type 2 diabetes genomic analyses

This workspace is intended for educational purposes for a 2024 ASHG workshop training on getting started with biomedical and genomic data in the All of Us Researcher Workbench. This workspace demonstrates preparing a phenotype file and running a genome-wide association…

Scientific Questions Being Studied

This workspace is intended for educational purposes for a 2024 ASHG workshop training on getting started with biomedical and genomic data in the All of Us Researcher Workbench. This workspace demonstrates preparing a phenotype file and running a genome-wide association study (GWAS) with Hail for type 2 diabetes as well as querying the All by All tables. This workspace and its workshop are developed by Baylor College of Medicine’s All of Us Evenings with Genetics Research Program (https://bcm.edu/allofuseveningswithgenetics). Any questions about the workspace can be directed to Julie Coleman (julie.coleman@bcm.edu).

Project Purpose(s)

  • Educational

Scientific Approaches

In the Cohort Builder Tool and Dataset Builder Tool, we will create a cohort for type 2 diabetes (cases) and use this cohort to create a dataset. Using the same tools, we will also create a cohort of participants with short read whole-genome sequencing (srWGS) data (cases and controls) and use this cohort to create a dataset that includes demographic concepts. In the first Jupyter Notebook, we will combine the two datasets into one dataset of phenotypic data in the proper format for GWAS analysis with Hail. In the second Jupyter Notebook, we will retrieve the desired genomic datasets as a Hail MatrixTable (MT), downsample the genomic data for preliminary runs before a final GWAS, annotate the genomic data with the phenotypic data, perform quality control (QC), view allele frequencies (AFs), run the GWAS, and query the All by All tables.
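The first-notebook step, combining the case cohort with the srWGS cohort into a single phenotype table, might look roughly like the following. This is a minimal standard-library sketch, not the workshop's notebook code: the column names (`person_id`, `age`, `sex_at_birth`, `is_case`) and both helper functions are hypothetical, and it assumes that only participants with srWGS data are kept, since only they can enter the GWAS.

```python
import csv
import io


def build_phenotype_rows(case_ids, wgs_demographics):
    """Label every srWGS participant as a case (1) or control (0).

    case_ids: person IDs from the type 2 diabetes cohort (hypothetical input).
    wgs_demographics: list of dicts, one per srWGS participant, each with a
        'person_id' key plus demographic columns (hypothetical layout).
    Participants in case_ids but absent from wgs_demographics are dropped.
    """
    cases = set(case_ids)
    rows = []
    for person in wgs_demographics:
        row = dict(person)  # copy so the input is not modified
        row["is_case"] = 1 if person["person_id"] in cases else 0
        rows.append(row)
    return rows


def to_tsv(rows, columns):
    """Serialize the rows as tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up participants, for illustration only.
demographics = [
    {"person_id": 1, "age": 54, "sex_at_birth": "Female"},
    {"person_id": 2, "age": 61, "sex_at_birth": "Male"},
]
phenotypes = build_phenotype_rows([1, 3], demographics)
print(to_tsv(phenotypes, ["person_id", "age", "sex_at_birth", "is_case"]))
```

A tab-separated file keyed by participant ID is one format that can later be joined onto genomic data; the exact format the workshop uses is not stated in the description.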

Anticipated Findings

There are not any anticipated findings as these notebooks are for demonstration purposes to educate others on the Researcher Workbench.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Yun Freudenberg-Hua - Early Career Tenure-track Researcher, Feinstein Institute for Medical Research

KJ Duplicate of 2024 ASHG workshop: Type 2 diabetes genomic analyses

This workspace is intended for educational purposes for a 2024 ASHG workshop training on getting started with biomedical and genomic data in the All of Us Researcher Workbench. This workspace demonstrates preparing a phenotype file and running a genome-wide association…

Scientific Questions Being Studied

This workspace is intended for educational purposes for a 2024 ASHG workshop training on getting started with biomedical and genomic data in the All of Us Researcher Workbench. This workspace demonstrates preparing a phenotype file and running a genome-wide association study (GWAS) with Hail for type 2 diabetes as well as querying the All by All tables. This workspace and its workshop are developed by Baylor College of Medicine’s All of Us Evenings with Genetics Research Program (https://bcm.edu/allofuseveningswithgenetics). Any questions about the workspace can be directed to Julie Coleman (julie.coleman@bcm.edu).

Project Purpose(s)

  • Educational

Scientific Approaches

In the Cohort Builder Tool and Dataset Builder Tool, we will create a cohort for type 2 diabetes (cases) and use this cohort to create a dataset. Using the same tools, we will also create a cohort of participants with short read whole-genome sequencing (srWGS) data (cases and controls) and use this cohort to create a dataset that includes demographic concepts. In the first Jupyter Notebook, we will combine the two datasets into one dataset of phenotypic data in the proper format for GWAS analysis with Hail. In the second Jupyter Notebook, we will retrieve the desired genomic datasets as a Hail MatrixTable (MT), downsample the genomic data for preliminary runs before a final GWAS, annotate the genomic data with the phenotypic data, perform quality control (QC), view allele frequencies (AFs), run the GWAS, and query the All by All tables.

Anticipated Findings

There are not any anticipated findings as these notebooks are for demonstration purposes to educate others on the Researcher Workbench.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

2024 ASHG workshop: Type 2 diabetes genomic analyses

This workspace is intended for educational purposes for a 2024 ASHG workshop training on getting started with biomedical and genomic data in the All of Us Researcher Workbench. This workspace demonstrates preparing a phenotype file and running a genome-wide association…

Scientific Questions Being Studied

This workspace is intended for educational purposes for a 2024 ASHG workshop training on getting started with biomedical and genomic data in the All of Us Researcher Workbench. This workspace demonstrates preparing a phenotype file and running a genome-wide association study (GWAS) with Hail for type 2 diabetes as well as querying the All by All tables. This workspace and its workshop are developed by Baylor College of Medicine’s All of Us Evenings with Genetics Research Program (https://bcm.edu/allofuseveningswithgenetics). Any questions about the workspace can be directed to Julie Coleman (julie.coleman@bcm.edu).

Project Purpose(s)

  • Educational

Scientific Approaches

In the Cohort Builder Tool and Dataset Builder Tool, we will create a cohort for type 2 diabetes (cases) and use this cohort to create a dataset. Using the same tools, we will also create a cohort of participants with short read whole-genome sequencing (srWGS) data (cases and controls) and use this cohort to create a dataset that includes demographic concepts. In the first Jupyter Notebook, we will combine the two datasets into one dataset of phenotypic data in the proper format for GWAS analysis with Hail. In the second Jupyter Notebook, we will retrieve the desired genomic datasets as a Hail MatrixTable (MT), downsample the genomic data for preliminary runs before a final GWAS, annotate the genomic data with the phenotypic data, perform quality control (QC), view allele frequencies (AFs), run the GWAS, and query the All by All tables.

Anticipated Findings

There are not any anticipated findings as these notebooks are for demonstration purposes to educate others on the Researcher Workbench.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

Collaborators:

  • Shamika Ketkar - Other, Baylor College of Medicine
  • Jinyoung Byun - Other, Baylor College of Medicine
  • Keira Johnston - Other, Yale University

Duplicate of Workshop: Intro to AoU Gen Data for analysis of AGS variants

We will use this workspace to explore genetic variation of proteins involved in G-protein signaling with a focus on the family of Activators of G-protein signaling (AGS). AGS proteins provide unexpected insight into signal processing by the classical G-protein coupled…

Scientific Questions Being Studied

We will use this workspace to explore genetic variation of proteins involved in G-protein signaling with a focus on the family of Activators of G-protein signaling (AGS). AGS proteins provide unexpected insight into signal processing by the classical G-protein coupled receptor pathway and have been biochemically characterized and associated with a range of functions that touch upon a broad disease base. However, their roles in specific diseases and their candidacy as therapeutic targets remain elusive and poorly defined. This project will test the global hypothesis that AGS genetic variants lead to dysfunction of the G-protein signaling network and that this results in defined phenotypes and/or functional system adaptations that are therapeutically malleable.
Exercise 2: Looking at the genomic data
Exercise 3: GWAS - extracting phenotypic data
Exercise 4: GWAS - running Hail GWAS
Exercise 5: Advanced GWAS
Exercise 7: Exploring long read data
Exercise 8: Explore structural variant data

Project Purpose(s)

  • Drug Development

Scientific Approaches

The primary approach is to use the developing platforms for long read and structural variant data to define genetic architecture for members of the Activator of G-protein signaling family of proteins and to subsequently evaluate association of the variants with defined phenotypes.

Anticipated Findings

It is anticipated that these studies will a) eventually lead to the determination of phenotype characteristics associated with variants in specific members of the Activators of G-protein Signaling family of regulatory proteins, b) provide insight into signal processing dynamics with this family of proteins, and c) provide a platform for targeted therapeutics.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Stephen Lanier - Late Career Tenured Researcher, Wayne State University

test

Incidence and prevalence of comorbid conditions affecting citizens of the United States. I would like to see how the disease progresses over time and how the presentation varies.

Scientific Questions Being Studied

Incidence and prevalence of comorbid conditions affecting citizens of the United States. I would like to see how the disease progresses over time and how the presentation varies.

Project Purpose(s)

  • Disease Focused Research (cardiovascular cancer)
  • Educational

Scientific Approaches

Improve patient care. Incidence and prevalence of comorbid conditions affecting citizens of the United States. I would like to see how the disease progresses over time and how the presentation varies.

Anticipated Findings

Learn and improve my knowledge and practice. Incidence and prevalence of comorbid conditions affecting citizens of the United States. I would like to see how the disease progresses over time and how the presentation varies.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

  • Moiuz Chaudhri - Research Fellow, Hackensack University Medical Center

CVD Risk Uncovered

What is the prevalence of hypertension among young African American women (YAAW, ages 20-39) in Miami-Dade County, Florida, and how do socioeconomic, lifestyle, and environmental factors contribute to hypertension risk in this population? This study aims to provide a comprehensive…

Scientific Questions Being Studied

What is the prevalence of hypertension among young African American women (YAAW, ages 20-39) in Miami-Dade County, Florida, and how do socioeconomic, lifestyle, and environmental factors contribute to hypertension risk in this population?

This study aims to provide a comprehensive understanding of hypertension risk in YAAW in Miami-Dade County. This knowledge can guide the development of targeted, culturally relevant interventions, inform public health policies, and contribute to reducing health disparities in this population.

Project Purpose(s)

  • Disease Focused Research (Hypertension Risk Uncovered)
  • Population Health
  • Social / Behavioral

Scientific Approaches

I will use the All of Us Research Program dataset, focusing on YAAW (ages 20-39) in Miami-Dade County, Florida, and drawing on electronic health records (EHRs), survey data, and physical measurements data. This will be a cross-sectional observational study.
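In a cross-sectional design, prevalence is the number of participants with the condition divided by the number of participants assessed. The sketch below shows that arithmetic together with a Wilson score confidence interval. It is an assumption-laden illustration: the project description does not specify an interval method, the function name is hypothetical, and the counts are made up.

```python
from math import sqrt


def prevalence_with_wilson_ci(cases, n, z=1.96):
    """Return (prevalence, lower, upper) using the Wilson score interval.

    cases: participants with the condition (e.g. hypertension).
    n: participants assessed in the cross-sectional sample.
    z: normal quantile; 1.96 corresponds to roughly 95% confidence.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= cases <= n:
        raise ValueError("cases must be between 0 and n")
    p = cases / n
    denominator = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denominator
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denominator
    # Clamp to [0, 1] to absorb floating-point error at the boundaries.
    return p, max(0.0, center - half_width), min(1.0, center + half_width)


# Made-up counts, for illustration only (not real data).
print(prevalence_with_wilson_ci(50, 200))
```

The Wilson interval is used here because it behaves reasonably for small subgroups and for proportions near 0 or 1, which can matter when a cohort is restricted by age, sex, race, and county.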

Anticipated Findings

Young African American women in Miami-Dade may face disproportionately high rates of hypertension, often developing it before age 35—a factor linked to increased stroke risk. Elevated obesity rates and unhealthy lifestyle habits like frequent fast food and high salt intake are expected to be prevalent, with many in this group experiencing multiple cardiovascular risk factors simultaneously. Socioeconomic factors, such as lower education and income levels, may also play a role in hypertension risk, while racial discrimination and stress, particularly among college-educated women, could exacerbate this risk. These findings emphasize the need for early, culturally tailored interventions and policies addressing social and medical health determinants for African American women.

Demographic Categories of Interest

  • Race / Ethnicity
  • Geography
  • Access to Care
  • Education Level
  • Income Level

Data Set Used

Controlled Tier

Research Team

Owner:

mak_Duplicate of 2024 ASHG workshop: Type 2 diabetes genomic analyses

This workspace is intended for educational purposes for a 2024 ASHG workshop training on getting started with biomedical and genomic data in the All of Us Researcher Workbench. This workspace demonstrates preparing a phenotype file and running a genome-wide association…

Scientific Questions Being Studied

This workspace is intended for educational purposes for a 2024 ASHG workshop training on getting started with biomedical and genomic data in the All of Us Researcher Workbench. This workspace demonstrates preparing a phenotype file and running a genome-wide association study (GWAS) with Hail for type 2 diabetes as well as querying the All by All tables. This workspace and its workshop are developed by Baylor College of Medicine’s All of Us Evenings with Genetics Research Program (https://bcm.edu/allofuseveningswithgenetics). Any questions about the workspace can be directed to Julie Coleman (julie.coleman@bcm.edu).

Project Purpose(s)

  • Educational

Scientific Approaches

In the Cohort Builder Tool and Dataset Builder Tool, we will create a cohort for type 2 diabetes (cases) and use this cohort to create a dataset. Using the same tools, we will also create a cohort of participants with short read whole-genome sequencing (srWGS) data (cases and controls) and use this cohort to create a dataset that includes demographic concepts. In the first Jupyter Notebook, we will combine the two datasets into one dataset of phenotypic data in the proper format for GWAS analysis with Hail. In the second Jupyter Notebook, we will retrieve the desired genomic datasets as a Hail MatrixTable (MT), downsample the genomic data for preliminary runs before a final GWAS, annotate the genomic data with the phenotypic data, perform quality control (QC), view allele frequencies (AFs), run the GWAS, and query the All by All tables.

Anticipated Findings

There are not any anticipated findings as these notebooks are for demonstration purposes to educate others on the Researcher Workbench.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Marissa Kellogg - Mid-career Tenured Researcher, Oregon Health & Science University