Research Projects Directory

17,073 active projects

This information was updated 3/26/2025

The Research Projects Directory includes information about all projects that currently exist in the Researcher Workbench to help provide transparency about how the Workbench is being used. Each project specifies whether Registered Tier or Controlled Tier data are used.

Note: Researcher Workbench users provide information about their research projects independently. Views expressed in the Research Projects Directory belong to the relevant users and do not necessarily represent those of the All of Us Research Program. Information in the Research Projects Directory is also cross-posted on AllofUs.nih.gov in compliance with the 21st Century Cures Act.

Pulmonary Hypertension and immune checkpoint

Scientific Questions Being Studied

Does immune activation by immune checkpoint inhibitors (ICIs) contribute to vascular remodeling or endothelial dysfunction in the pulmonary vasculature?
What is the prevalence of pulmonary hypertension (PH) among cancer patients treated with ICIs?
Are there specific cancer types or patient subgroups at higher risk of developing PH after ICI therapy?

ICIs have revolutionized cancer therapy, and their use continues to expand. Understanding their rare but severe complications, such as PH, is critical for improving patient outcomes. Additionally, understanding how PH influences survival in patients treated with ICIs could improve prognostication and clinical outcomes.

Project Purpose(s)

  • Disease Focused Research (Cancer)

Scientific Approaches

Data from cancer patients treated with ICIs, including demographics, cancer type, treatment regimens, and outcomes.
Electronic Health Records (EHRs) capturing PH diagnosis, echocardiographic data, hemodynamic measurements, and biomarkers (e.g., NT-proBNP, cytokines).
Databases:
The All of Us Research Program for patient-level data with detailed clinical histories

Anticipated Findings

The study may find a higher prevalence of PH among patients treated with immune checkpoint inhibitors (ICIs) than among cancer patients not receiving ICIs or the general population.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

Collaborators:

  • Yong Eun - Research Associate, New York City Health & Hospitals

Depression & Anxiety PRS

Scientific Questions Being Studied

Calculate the genetic risk for depression and anxiety using the latest GWAS summary statistics. The results will be used in follow-up analysis to investigate its effects on depressive and anxiety symptoms.

Project Purpose(s)

  • Ancestry

Scientific Approaches

I will use SBayesRC to estimate individuals' genetic risk (i.e., polygenic risk scores, PRS) for depression and anxiety. The MDD GWAS is from https://www.cell.com/cell/fulltext/S0092-8674(24)01415-6; the anxiety GWAS is from https://www.nature.com/articles/s41588-024-01908-2
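
Whatever tool estimates the per-variant weights from GWAS summary statistics, the final scoring step of a PRS is commonly a weighted sum of effect-allele dosages. The sketch below shows only that step. The variant IDs, weights, and dosages are made up for illustration, and the function name is hypothetical; it is not the project's actual pipeline.

```python
from typing import Dict


def polygenic_score(dosages: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of effect-allele dosages (0 to 2) over variants present in both inputs."""
    shared = dosages.keys() & weights.keys()
    return sum(dosages[variant] * weights[variant] for variant in shared)


# Hypothetical per-variant weights (e.g., posterior effect sizes from a PRS tool).
example_weights = {"rs1": 0.10, "rs2": -0.05, "rs3": 0.20}
# Hypothetical dosages for one individual; rs4 has no weight and is ignored.
example_person = {"rs1": 2, "rs2": 1, "rs3": 0, "rs4": 1}

example_score = polygenic_score(example_person, example_weights)
print(round(example_score, 4))
```

Variants without a weight are skipped rather than treated as zero-dosage, so the score depends only on the overlap between the genotype data and the weight file.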

Anticipated Findings

I expect to quantify the genetic risk for depression and anxiety using sumscores (i.e., PRS). The results will be used in follow-up analysis to investigate its effects on depressive and anxiety symptoms.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Tingyan Yang - Graduate Trainee, The Council of The Queensland Institute of Medical Research

Data Exploration

Scientific Questions Being Studied

I intend to study whether the severity of discoid lupus erythematosus (DLE) correlates with an increased risk of substance use disorders (SUDs), particularly focusing on opioid and cannabis use. This question is important because while DLE's psychosocial burden is known, few studies have examined the impact of disease severity on SUDs. Understanding this correlation could reveal if more severe cases lead patients to self-medicate, thus informing public health strategies aimed at providing early mental health and pain management support for DLE patients. Exploring this data could yield insights into whether clinical interventions for severe DLE should prioritize addiction screening and management. Additionally, it could prompt dermatologists to consider mental health factors more routinely, potentially influencing treatment guidelines to include integrated care for patients at higher risk for SUDs.

Project Purpose(s)

  • Disease Focused Research (Dermatologic Conditions)
  • Social / Behavioral

Scientific Approaches

To investigate the link between discoid lupus erythematosus (DLE) severity and substance use disorders (SUDs), I plan to conduct a case-control study using the NIH's All of Us database, which provides a diverse patient cohort and detailed electronic health records (EHRs). I will identify DLE patients by disease severity, categorizing cases based on ICD-10 codes and any clinical notes indicating symptom extent. Using EHR data, I will extract substance use history, focusing on opioids, cannabis, and other SUDs. Statistical analyses, including logistic regression models, will examine whether increased DLE severity correlates with higher SUD risk, adjusting for confounders such as age, sex, and coexisting mental health conditions. Tools such as R or Python will facilitate data cleaning and analysis, while the All of Us Researcher Workbench provides secure access. This method allows for a robust, data-driven approach to determine whether DLE severity predicts SUD risk, supporting potential interventions.
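
As a simpler stand-in for the regression adjustment described above, the following sketch computes a Mantel-Haenszel odds ratio, which adjusts an exposure-outcome association for a single categorical confounder by pooling 2x2 tables across strata. It is not logistic regression, and all counts are invented for illustration (exposure = severe DLE, outcome = SUD, strata = a hypothetical confounder such as sex).

```python
from typing import Iterable, Tuple

# Each stratum is (a, b, c, d):
#   a = exposed with outcome,   b = exposed without outcome,
#   c = unexposed with outcome, d = unexposed without outcome.
Stratum = Tuple[int, int, int, int]


def mantel_haenszel_or(strata: Iterable[Stratum]) -> float:
    """Pooled odds ratio across strata: sum(a*d/n) / sum(b*c/n)."""
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum contributes nothing
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("odds ratio is undefined for these counts")
    return numerator / denominator


# Hypothetical counts for two strata of the confounder.
example_strata = [(10, 20, 5, 40), (20, 30, 10, 60)]
print(mantel_haenszel_or(example_strata))
```

A full analysis with several continuous and categorical confounders would normally use a regression library instead; this arithmetic only illustrates what "adjusting" means for one stratifying variable.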

Anticipated Findings

I anticipate finding that higher severity in discoid lupus erythematosus (DLE) is associated with increased risk of substance use disorders (SUDs), particularly for opioids and cannabis, possibly due to unmanaged pain or psychosocial distress in severe cases. If the data support this hypothesis, it would underscore the need for comprehensive care models that integrate dermatologic, pain management, and mental health services for patients with DLE. Such findings would contribute to scientific knowledge by highlighting the psychosocial impacts of dermatologic conditions, encouraging a more holistic approach to DLE treatment. This research could prompt further studies on mechanisms linking chronic inflammatory skin disorders and addiction, guiding clinicians to incorporate addiction screenings and early mental health interventions into routine care for high-risk DLE patients. In turn, these findings could reduce SUD rates in this population, improving overall patient outcomes and quality of life.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

  • James Madsen - Graduate Trainee, Baylor College of Medicine

v7 Dermatology Research

Scientific Questions Being Studied

Here, we seek to study the association of psoriasis with various co-morbidities. These include autoimmune diseases (e.g. SLE, hypothyroidism, vitiligo), psychiatric illnesses (e.g. depression, anxiety), and vascular disease (e.g. PAD/PVD, MI, stroke/TIA). By understanding the associated co-morbidities, we hope to personalize care for patients with psoriasis.

Project Purpose(s)

  • Educational

Scientific Approaches

We will perform a nested cross-sectional, case-control analysis in the All of Us database. We will extract psoriasis patients (each matched to one control using nearest neighbor propensity score matching). We will then collect data on co-morbidities. Cases will be compared to controls using Pearson’s Chi-squared test or Fisher exact test for categorical variables and unpaired t-test for continuous variables. Multivariable models will be built using logistic regression, taking into account universal confounders (e.g. age, sex), a priori associations, and co-variates with significance of P < 0.1 in univariate analysis, followed by backward elimination of co-variates with a significance of P > 0.1 or with evidence of collinearity.
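
The 1:1 nearest neighbor matching step can be sketched as below, assuming propensity scores have already been estimated for every participant. Greedy matching without replacement and the optional caliper are assumptions, since the description does not specify them; the IDs and scores are invented.

```python
from typing import Dict, Optional


def match_nearest(
    cases: Dict[str, float],
    controls: Dict[str, float],
    caliper: Optional[float] = None,
) -> Dict[str, str]:
    """Greedy 1:1 nearest neighbor matching on propensity score, without replacement.

    cases / controls map participant ID -> propensity score.
    Returns a dict mapping each matched case ID to its control ID.
    """
    available = dict(controls)  # copy so the caller's dict is not modified
    matches: Dict[str, str] = {}
    # Process cases in score order so the result is deterministic.
    for case_id, score in sorted(cases.items(), key=lambda item: item[1]):
        if not available:
            break
        best = min(available, key=lambda cid: abs(available[cid] - score))
        if caliper is not None and abs(available[best] - score) > caliper:
            continue  # no control is close enough; leave this case unmatched
        matches[case_id] = best
        del available[best]  # each control is used at most once
    return matches


# Hypothetical scores.
example_cases = {"p1": 0.30, "p2": 0.62}
example_controls = {"c1": 0.10, "c2": 0.33, "c3": 0.60, "c4": 0.90}
print(match_nearest(example_cases, example_controls))
```

Greedy matching is order-dependent, so optimal matching or a different processing order can give different pairs on the same data.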

Anticipated Findings

We hope to elucidate the burden of comorbidities in psoriasis patients, through the lens of a diverse, national cohort including communities historically underrepresented in research. We hypothesize significant associations of psoriasis with cardiovascular/cerebrovascular risk factors/events, as well as psychiatric distress. Our findings will help clarify the burden of comorbidities in underserved psoriasis patients.

Demographic Categories of Interest

  • Race / Ethnicity
  • Access to Care
  • Education Level
  • Income Level

Data Set Used

Registered Tier

Research Team

Owner:

Collaborators:

  • Vikram Shaw - Graduate Trainee, Baylor College of Medicine

Method development for genetic susceptibility

Scientific Questions Being Studied

Currently, I am exploring WGS/genotype array datasets for evidence that pairs or sets of variants have a significant effect on particular traits such as height. If this is observed, I believe a new approach to analyzing genetic variants and their susceptibility can be developed with stronger power than existing methods such as GWAS.

Project Purpose(s)

  • Methods Development
  • Ancestry

Scientific Approaches

As an exploratory step, I will use WGS or genotyping array data together with height measurements from participants of a particular genetic ancestry (e.g., European ancestry). I will test whether there are segments of variants that have high coefficients while accounting for linkage disequilibrium. The study will be expanded to multiple ancestries if evidence is found.
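
One common way to quantify linkage disequilibrium between two variants from unphased data is the squared Pearson correlation of their genotype dosages. The sketch below shows that calculation; it is an illustrative assumption about how LD might be measured, not the method this project uses, and the dosage vectors are invented.

```python
from typing import Sequence


def genotype_r2(g1: Sequence[float], g2: Sequence[float]) -> float:
    """Squared Pearson correlation between two dosage vectors (values 0, 1, or 2)."""
    if len(g1) != len(g2) or len(g1) < 2:
        raise ValueError("need two equal-length vectors with at least 2 samples")
    n = len(g1)
    mean1 = sum(g1) / n
    mean2 = sum(g2) / n
    cov = sum((a - mean1) * (b - mean2) for a, b in zip(g1, g2))
    var1 = sum((a - mean1) ** 2 for a in g1)
    var2 = sum((b - mean2) ** 2 for b in g2)
    if var1 == 0 or var2 == 0:
        raise ValueError("r2 is undefined for a monomorphic variant")
    return cov * cov / (var1 * var2)


# Hypothetical dosages for six individuals at two variants in perfect LD.
print(genotype_r2([0, 1, 2, 1, 0, 2], [0, 1, 2, 1, 0, 2]))
```

Pairs with high r2 carry largely redundant signal, which is why an analysis of variant sets usually has to account for LD before interpreting joint effects.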

Anticipated Findings

I expect to find interactive effects of multiple variants that are overlooked when variants are studied individually. This could further help in understanding the overall mechanisms by which genotypes and mutations affect various traits.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Ji-Eun Park - Research Fellow, Dana-Farber Cancer Institute

Atopic dermatitis and temperature

Scientific Questions Being Studied

Our specific scientific question is: how is ambient temperature associated with the frequency and severity of exacerbations of atopic dermatitis (AD), also known as eczema? This question is important because climate and environmental factors are increasingly recognized as significant contributors to AD symptoms. However, additional evidence is needed on how specific weather conditions, such as temperature and humidity, impact the disease. Addressing this gap in knowledge may be critical for guiding clinical and public health strategies to mitigate the effects of climate-related environmental changes for individuals with AD.

Project Purpose(s)

  • Disease Focused Research (atopic dermatitis)

Scientific Approaches

This will be a retrospective case-crossover study using electronic health records from the All of Us Research Program. Meteorologic data will be extracted from publicly available databases collected by the National Oceanic and Atmospheric Administration (NOAA). R and RStudio will be used for statistical analyses.
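
In a case-crossover design each participant serves as their own control: exposure on the event day is compared with exposure on referent days for the same person. The sketch below picks referent days with a time-stratified scheme (same year, month, and weekday as the event). That scheme is an assumption, since the description does not say how referents will be chosen, and the event date is invented. Python is used here only for illustration; the project itself names R.

```python
from datetime import date, timedelta
from typing import List


def referent_days(event_day: date) -> List[date]:
    """Days in the event's calendar month that share its weekday, excluding the event day."""
    days: List[date] = []
    current = date(event_day.year, event_day.month, 1)
    while current.month == event_day.month:
        if current.weekday() == event_day.weekday() and current != event_day:
            days.append(current)
        current += timedelta(days=1)
    return days


# Hypothetical exacerbation recorded on Wednesday, 17 July 2024.
for day in referent_days(date(2024, 7, 17)):
    print(day.isoformat())
```

Ambient temperature on the event day would then be compared with temperature on these referent days, typically with conditional logistic regression.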

Anticipated Findings

We anticipate finding that atopic dermatitis exacerbation frequency and severity will increase at the extremes of ambient temperature for a given location (i.e., the coldest and hottest days will be associated with the most exacerbation events).

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Tran Dang - Graduate Trainee, Johns Hopkins University

BIPOLAR

Scientific Questions Being Studied

To check the effect of medication on bipolar disorders.

Project Purpose(s)

  • Educational

Scientific Approaches

I will use SQL and Python for data analysis.

Anticipated Findings

To check the effect of medication on bipolar disorders.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

Collaborators:

  • Vladimir Cardenas - Graduate Trainee, George Mason University

Duplicate of How to Work With Wearable Device Data (v8)

Scientific Questions Being Studied

We recommend that all researchers explore the notebooks in this workspace to learn the basics of how to work with Fitbit data, which is the first pilot of wearable device data currently available within the All of Us Registered Tier dataset. What should you expect? This notebook will give an overview characterization of the Fitbit data elements currently available in the current Curated Data Repository (CDR) and provide best practices and tips for how to retrieve them.

Project Purpose(s)

  • Educational
  • Other Purpose (This is an All of Us Tutorial Workspace. It is meant to provide instruction for key Researcher Workbench components and All of Us data representation.)

Scientific Approaches

This Tutorial Workspace contains one Jupyter Notebook written in Python. The notebook contains information on how to extract and work with the current set of All of Us Fitbit data. By reading and running the notebook in this Tutorial Workspace, researchers will learn how to query information about steps, heart rate, and daily activity summary.

Anticipated Findings

By reading and running the notebook in this Tutorial Workspace, researchers will understand how to work with Fitbit CDR data from the workbench. They will learn how to query information about steps, heart rate, and daily activity summary.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

Duplicate of Phenotype - Dementia (v8)

Scientific Questions Being Studied

The Notebooks in this Workspace can be used to implement well-known phenotype algorithms in one’s own research.

Project Purpose(s)

  • Educational
  • Methods Development
  • Other Purpose (This is an All of Us Phenotype Library Workspace created by the Researcher Workbench Support team. It is meant to demonstrate the implementation of key phenotype algorithms within the All of Us Research Program cohort.)

Scientific Approaches

Not Applicable

Anticipated Findings

By reading and running the Notebooks in this Phenotype Library Workspace, researchers can implement the following phenotype algorithms:

Ritchie, M., Denny, J., Crawford, D., Ramirez, A., Weiner, J., … Roden, D. (2010). Robust replication of genotype-phenotype associations across multiple diseases in an electronic medical record. American Journal of Human Genetics. 87(2):310 doi: 10.1016/j.ajhg.2010.03.003

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

How SDOH with race impact healthcare access and disease management

Scientific Questions Being Studied

When considering barriers to health equity, we are exploring potential correlations with known barriers to disease management that include:

1) How are perceptions of experiences with the healthcare system serving as facilitators or barriers to engagement in disease management and routine engagement in maintaining appointment attendance?
2) What is the correlation between medication adherence and satisfaction with the healthcare received on a consistent basis?
3) Is there a relationship between perceptions of quality of life, experiences with the healthcare system, and fidelity to provider's guidance and instruction?

Project Purpose(s)

  • Population Health
  • Social / Behavioral
  • Ethical, Legal, and Social Implications (ELSI)
  • Other Purpose (To inform abstract, manuscript, and grant development)

Scientific Approaches

Frequency statistics
Chi-square analysis
Multi-logistic regression
Get guidance from Statistician

Anticipated Findings

Racial/ethnic differences exist and help to explain how experiences with the healthcare system work together to thwart fidelity to provider recommendations and prevent the attainment of health equity and equitable chronic disease management.

A better understanding of how social factors and lived experiences with discrimination and structural racism impact measurable and quantifiable factors, like chronic disease management, is needed to validate the focus and financial investment in mitigating these factors with grants and funding priorities.

Demographic Categories of Interest

  • Race / Ethnicity
  • Sex at Birth
  • Gender Identity
  • Sexual Orientation
  • Geography
  • Access to Care
  • Education Level
  • Income Level

Data Set Used

Registered Tier

Research Team

Owner:

  • Mandy Hill - Teacher/Instructor/Professor, University of Texas Medical Branch (UTMB) at Galveston

Collaborators:

  • Nashika Jackson Ogilvie - Graduate Trainee, Nova Southeastern University
  • Nelson Lemieux - Senior Researcher, Seven Star Academy
  • Lakeshia Cousin - Early Career Tenure-track Researcher, University of Florida
  • Keesha Roach - Early Career Tenure-track Researcher, University of Tennessee Health Science Center, Memphis
  • Jaelyn Bivens - Project Personnel, Delta Research and Educational Foundation
  • Arnethea Sutton - Early Career Tenure-track Researcher, Virginia Commonwealth University

CUSON_N9555 v8

Scientific Questions Being Studied

We will be using All of Us workbench for N9555 - Research synthesis through visualization of health data course at Columbia University School of Nursing. The course is intended to provide a hands‐on introduction to delivering data visualizations to serve as a critical lens through which individual and population level health can be examined. The course combines concepts and theory in data visualization and exploration and practice to prepare the learner to begin using graphics and statistics to explore data, find and construct a narrative, and share findings in ways colleagues and decision-makers can readily understand and act upon.

Project Purpose(s)

  • Educational

Scientific Approaches

Data will be used for educational purposes related to the Reducing Health Disparities Through Informatics T32 Pre- and Post-doctoral Training Program and CUSON Center for Community-Engaged Health Informatics and Data Science (CCHIDS). Students will use R and Jupyter Notebook to build visualizations for their final course project. Students will investigate the following questions:

-How do immunization rates compare between survivors of childhood cancer and their adolescent peers?

-What factors are associated with cognitive decline in older adults?

-What factors are associated with mental health conditions in adolescents?

Anticipated Findings

We expect to see different immunization rates, and be able to describe predictors of cognitive decline in older adults and mental health conditions in adolescents.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

Collaborators:

  • Xuefan Ji - Graduate Trainee, Columbia University
  • Victoria Winogora - Graduate Trainee, Columbia University
  • Susan Maloney - Graduate Trainee, Columbia University
  • Sophie Junak - Graduate Trainee, Columbia University
  • Marcela Algave - Graduate Trainee, Columbia University
  • Lenka Hellerova - Graduate Trainee, Columbia University
  • Helen Dinh - Graduate Trainee, Columbia University

Genetic Ancestry_BRCA1

Scientific Questions Being Studied

This study will integrate BRCA1 variant data with clinical, lifestyle, and environmental factors to enhance breast cancer risk prediction. We will assess how BRCA1 interacts with non-genomic factors and explore ML models for personalized outcome forecasting.

Project Purpose(s)

  • Population Health

Scientific Approaches

This study will integrate BRCA1 variant data, including pathogenic mutations and polygenic risk scores, with longitudinal clinical, lifestyle, and environmental data from All of Us. The goal is to enhance breast cancer risk prediction through multi-modal machine learning models for personalized risk stratification.

Anticipated Findings

This research aims to improve disease risk prediction, treatment targeting, and outcomes by integrating genomic data with multi-modal health information, enabling earlier identification of high-risk individuals and personalized care strategies. The findings will advance precision medicine, uncover novel drug targets, and enhance public health efforts to prevent and manage chronic diseases.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Ankita Roy - Other, Icahn School of Medicine at Mount Sinai

Medical Phenome of Speech and Language Problems and Disorders - Dataset v8

Scientific Questions Being Studied

The central aim of this work is to characterize the medical phenome of speech-language traits, difficulties, and disorders, to understand risk and resilience factors, comorbidities, and identify opportunities for interventions. We employ an unbiased (phenotype-agnostic), data-driven (hypothesis generating) approach using large-scale EHR data from a diverse group of participants from the NIH All of Us Research Program. We draw on our multidisciplinary expertise in epidemiology and speech-language development and pathology to conduct a strategic set of Phenome-Wide Association Studies (PheWAS) towards the first major effort to map the medical phenome of language at population health scales. PheWAS approaches can answer questions about health and well-being that we have not yet thought to ask, generating novel testable hypotheses about mechanisms, risks, and predispositions to therapies.

Project Purpose(s)

  • Disease Focused Research (speech language problems and disorders)
  • Population Health
  • Social / Behavioral

Scientific Approaches

To characterize the clinical phenome of speech-language difficulties and disorders coded in EHRs, we will conduct Phenome-Wide Association Studies (PheWAS) of speech-language phenotypes available in All of Us EHRs, in participants who have been assigned speech-language related ICD-9 codes. That is, with a series of regression analyses, we will assess how speech and language variables in EHRs are associated with all other health variables. A subset of analyses will focus on African American individuals, who are underrepresented in speech-language science and clinical practice for speech-language disorders.
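
Because a PheWAS runs one association test per phenotype, the results are usually filtered with a multiple-testing correction. The sketch below applies a Bonferroni threshold to a set of per-phenotype p-values. The choice of Bonferroni, the phenotype labels, and the p-values are all assumptions for illustration; the description does not state which correction will be used.

```python
from typing import Dict, List


def bonferroni_hits(p_values: Dict[str, float], alpha: float = 0.05) -> List[str]:
    """Return phenotype labels whose p-value is below alpha divided by the number of tests."""
    if not p_values:
        return []
    threshold = alpha / len(p_values)
    return sorted(label for label, p in p_values.items() if p < threshold)


# Hypothetical p-values from three per-phenotype regressions.
example_results = {"phenotype_A": 1e-6, "phenotype_B": 0.02, "phenotype_C": 0.0004}
print(bonferroni_hits(example_results))
```

With three tests the threshold is 0.05 / 3, so phenotype_B at 0.02 is not flagged even though it would pass an uncorrected 0.05 cutoff.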

Anticipated Findings

This project uses a data-driven, hypothesis-generating approach towards understanding speech and language disorders. Our approach, which is agnostic to specific clinical phenotypes, symptoms, diagnoses, or disorders, will reveal population-level health and disease outcomes associated with speech-language traits in populations that are usually underrepresented in biomedical research. This project makes discoveries using high-quality All of Us data and advances the mission to improve the lives of people with communication disorders. By combining precision medicine techniques with a health equity and community-engaged focus, findings from this project will address critical health needs for subsets of individuals with certain clinical risk factors or comorbidities related to speech and language, as well as entire communities in whom communication traits/disorders have been understudied (e.g., African Americans).

Demographic Categories of Interest

  • Race / Ethnicity
  • Disability Status

Data Set Used

Registered Tier

Research Team

Owner:

  • Srishti Nayak - Research Fellow, Vanderbilt University Medical Center
  • Grace Schlicht - Project Personnel, Vanderbilt University Medical Center

Collaborators:

  • Tanguy Rubat du Mérac - Graduate Trainee, Vanderbilt University
  • Alex Petty - Project Personnel, Vanderbilt University Medical Center
  • Rachana Nitin - Research Fellow, Vanderbilt University Medical Center

Duplicate of Remove Algorithm Bias

Scientific Questions Being Studied

This is a project for my course in removing algorithm bias at GMU. I will look into identifying algorithm bias through hierarchical analysis in the LGBTQ/Transgender community.

Project Purpose(s)

  • Educational

Scientific Approaches

I will use SQL to review and filter the data. I will also use R for regression modeling to assess the likelihood of patients in the LGBTQ community experiencing depression.

Anticipated Findings

I aim to identify the existence of bias affecting the LGBTQ/Transgender community and to provide remedies. I also think these findings will help to better understand and reduce bias by assessing the response to antidepressant treatment.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Sex at Birth
  • Gender Identity
  • Sexual Orientation
  • Geography
  • Others

Data Set Used

Registered Tier

Research Team

Owner:

Social Support and the Hispanic Paradox

Scientific Questions Being Studied

I intend to study the Hispanic paradox in survival outcomes for Hispanic patients with head and neck cancer (HNC) compared to other racial and ethnic groups. This research is important because despite facing significant socioeconomic disadvantages and barriers in healthcare access, Hispanic individuals paradoxically exhibit better survival rates for certain health conditions, including head and neck cancer. Limited research exists on the underlying mechanisms driving this paradox, such as social and cultural factors ("Barrio advantage"), immigration-related behaviors ("Salmon Bias," "Healthy Migrant Effect"), and lifestyle influences. Understanding these mechanisms is crucial for developing culturally tailored interventions, reducing healthcare disparities, and improving cancer care outcomes for Hispanic populations.

Project Purpose(s)

  • Population Health

Scientific Approaches

The study will use the All of Us Research Program database to analyze de-identified health and survey data from Hispanic and non-Hispanic individuals with head and neck cancer. A retrospective cohort design will be used to compare outcomes such as treatment efficacy, complication rates, and long-term health effects. Statistical analysis methods will include multivariable regression to adjust for confounding variables (e.g., age, comorbidities, cancer stage). Subgroup analyses will evaluate how factors such as implant type and radiation dose influence outcomes. Data cleaning, management, and statistical analysis will be conducted using supported tools such as R or Python.

Anticipated Findings

The anticipated findings from this study include identifying whether Hispanic individuals with head and neck cancer experience different outcomes with social support. These differences may include variations in treatment efficacy, rates of adverse events, and long-term health outcomes. The results could provide insights into mechanisms that potentially improve cancer care outcomes for Hispanic populations.

Demographic Categories of Interest

  • Race / Ethnicity

Data Set Used

Controlled Tier

Research Team

Owner:

  • Kevin Vargas - Graduate Trainee, Baylor College of Medicine

Duplicate of How to Work with Genomics Data (CRAM_Processing and IGV)_v7HC

Scientific Questions Being Studied

This workspace and its notebooks neither ask nor answer any scientific questions. The purpose of this workspace is to serve as a tutorial which shows how to localize the All of Us (AoU) CRAM files individually or in groups via the CRAM manifest in addition to showing how to render the Integrated Genome Viewer (IGV) on the AoU workbench to explore the CRAM files.

Project Purpose(s)

  • Methods Development

Scientific Approaches

This workspace conducts no study and applies no scientific approaches. This workspace and its notebooks are tutorials for localizing AoU CRAM files with R commands and using IGV to explore their contents. The methods and tools employed include R system commands for localizing individual CRAM files, an R for loop for localizing multiple CRAM files by referencing the manifest, and the commands for importing and rendering IGV to view the localized CRAM files.

Anticipated Findings

There will be no findings or contribution to scientific knowledge as there is no study being conducted nor questions asked. Informal 'findings' include the usability of the aforementioned tools and AoU CRAM files on the All of Us workbench.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

Duplicate of AIM-AHEAD - Comorbid Osteoporosis in Breast Cancer -FOR ME

We will use the breast cancer phenotype as an example for studying certain survey variables using the Controlled Tier Curated Data Repository (CDR). We will investigate if survey factors are associated with comorbid osteoporosis in breast cancer.

Scientific Questions Being Studied

We will use the breast cancer phenotype as an example for studying certain survey variables using the Controlled Tier Curated Data Repository (CDR). We will investigate if survey factors are associated with comorbid osteoporosis in breast cancer.

Project Purpose(s)

  • Educational
  • Other Purpose (This workspace will be used as part of the AIM-AHEAD initiative as one example of analysis that can be conducted within All of Us. It is intended to demonstrate how to build a cohort with disparate data sources that can be used to check for associations between outcomes (e.g., comorbid osteoporosis) and survey variables. This specific workspace will be a recreation of the original workspace, in R.)

Scientific Approaches

We will start with a cohort of breast cancer patients with certain survey information available. We will characterize the survey factors and compare them to breast cancer and osteoporosis comorbidity. We will quantify the associations through statistical methods while controlling for demographics and other factors.

Anticipated Findings

This workspace should provide an example of a complete analysis, from cohort identification through insight generation, within All of Us. The breast cancer phenotype is based on work done by Ning Shang, George Hripcsak, Chunhua Weng, Wendy K. Chung, & Katherine Crew (originally retrieved from https://phekb.org/phenotype/breast-cancer).

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Analiz Rodriguez - Mid-career Tenured Researcher, University of Arkansas for Medical Sciences

Social Determinants of Health Factors in Ovarian Cancer Survivors

Scientific Question: Are there specific social determinants of health factors that are associated with overall perceptions of health among ovarian cancer survivors from underrepresented populations? Women who are Black and/or Hispanic, and women residing in rural communities continue to have…

Scientific Questions Being Studied

Scientific Question: Are there specific social determinants of health factors that are associated with overall perceptions of health among ovarian cancer survivors from underrepresented populations?

Women who are Black and/or Hispanic, and women residing in rural communities, continue to have lower ovarian cancer survival rates and higher rates of recurrence compared with non-Hispanic white women and those residing in metropolitan regions. These vulnerable groups of women are more likely to be diagnosed with aggressive ovarian cancers at a late stage of disease. While causes for the observed cancer disparities are multidimensional, social determinants of health are deemed to be contributors. Among ovarian cancer survivors from these underrepresented groups, little is known about how social determinants of health factors may influence their overall health outlook.

Project Purpose(s)

  • Social / Behavioral

Scientific Approaches

(a) Use the data from the EHR domain to identify women with a history of ovarian cancer (including demographic survey questions) as the cohort of interest.
(b) Select social determinants of health variables as the independent variables using data from the social determinants of health survey questions.
(c) Select overall health variables as the dependent variables using data from the Overall Health survey questions.
(d) Analyze the data generated.
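
Steps (a) through (d) can be sketched roughly as follows. This is a minimal illustration on made-up records, not the project's actual code: the field names (`ovarian_cancer`, `food_insecurity`, `overall_health`) and all values are hypothetical, and a simple cross-tabulation stands in for whatever analysis the team performs.

```python
from collections import defaultdict

# Hypothetical participant records; real data would come from the EHR and survey domains.
participants = [
    {"id": 1, "ovarian_cancer": True, "food_insecurity": "yes", "overall_health": "fair"},
    {"id": 2, "ovarian_cancer": True, "food_insecurity": "yes", "overall_health": "poor"},
    {"id": 3, "ovarian_cancer": True, "food_insecurity": "no", "overall_health": "good"},
    {"id": 4, "ovarian_cancer": False, "food_insecurity": "no", "overall_health": "good"},
    {"id": 5, "ovarian_cancer": True, "food_insecurity": "no", "overall_health": "good"},
    {"id": 6, "ovarian_cancer": True, "food_insecurity": "yes", "overall_health": "fair"},
]


def build_cohort(records):
    """Step (a): keep participants with a history of ovarian cancer."""
    return [r for r in records if r["ovarian_cancer"]]


def cross_tabulate(cohort, independent, dependent):
    """Steps (b)-(d): count dependent-variable answers within each independent-variable level."""
    table = defaultdict(lambda: defaultdict(int))
    for r in cohort:
        table[r[independent]][r[dependent]] += 1
    return {level: dict(counts) for level, counts in table.items()}


cohort = build_cohort(participants)
table = cross_tabulate(cohort, "food_insecurity", "overall_health")
print(table)
```

A table like this is only a descriptive first look; testing for association would require a formal statistical method chosen by the study team.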

Anticipated Findings

We anticipate that specific social determinants of health factors will be associated with overall perceptions of health among ovarian cancer survivors from underrepresented groups. The findings of this study will help provide scientific knowledge on important factors that exacerbate health inequalities and help scientists partner with local communities to develop and implement interventions that optimize quality of life during ovarian cancer survivorship in vulnerable populations and could prevent disease recurrence.

Demographic Categories of Interest

  • Race / Ethnicity
  • Geography
  • Access to Care

Data Set Used

Registered Tier

Research Team

Owner:

  • Rishav Mukherjee - Research Assistant, University of Kansas Medical Center
  • Diane Mahoney - Research Associate, University of Kansas Medical Center

Collaborators:

  • Diego Mazzotti - Senior Researcher, University of Kansas Medical Center

AD patients latent construct_v8

The current diagnostic criteria for mental health are based on symptoms or clinical variables and face significant challenges due to the substantial heterogeneity of these disorders and a lack of objective biomarkers. The National Institute of Mental Health (NIMH)-led Research Domain…

Scientific Questions Being Studied

The current diagnostic criteria for mental health are based on symptoms or clinical variables and face significant challenges due to the substantial heterogeneity of these disorders and a lack of objective biomarkers.
The National Institute of Mental Health (NIMH)-led Research Domain Criteria (RDoC) initiative proposes to integrate biological and behavioral measures from various sources of analysis and different domains of functioning for disease classifications.
In this sense, mental disorders will be reorganized by measures collected across multiple domains (e.g., clinical, genomics, brain, and behavioral) at multiple levels and thus more closely align with the underlying biology.

Project Purpose(s)

  • Methods Development

Scientific Approaches

We will develop novel methods to subtype mental disorder patients by integrating measures across multiple modalities with different data types (e.g., categorical clinical measures and continuous neuro-imaging measures).

Anticipated Findings

We anticipate offering novel insights into the subtyping of mental disorder patients and into meaningful latent constructs.

Demographic Categories of Interest

  • Age

Data Set Used

Controlled Tier

Research Team

Owner:

PRS_admix

Development and evaluation of polygenic risk score methods for ancestrally diverse populations in All of Us

Scientific Questions Being Studied

Development and evaluation of polygenic risk score methods for ancestrally diverse populations in All of Us

Project Purpose(s)

  • Methods Development
  • Ancestry

Scientific Approaches

We plan to develop a statistical method, SDPR_admix, that leverages local ancestry and cross-ancestry genetic architecture to improve polygenic prediction in admixed populations.
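
To make the idea of using local ancestry in a polygenic score concrete, here is a minimal sketch. It is not the SDPR_admix method, whose details the description does not give; it only shows one simple way a score could apply ancestry-specific effect sizes to each haplotype according to the local ancestry at each variant. The function name, ancestry labels, weights, and genotypes are all hypothetical.

```python
def ancestry_aware_score(haplotypes, local_ancestry, weights):
    """Sum effect sizes over both haplotypes, picking each weight by local ancestry.

    haplotypes:     two lists of 0/1 allele indicators, one entry per variant
    local_ancestry: two lists of ancestry labels, aligned with haplotypes
    weights:        dict mapping ancestry label -> list of per-variant effect sizes
    """
    score = 0.0
    for hap_alleles, hap_ancestry in zip(haplotypes, local_ancestry):
        for j, (allele, ancestry) in enumerate(zip(hap_alleles, hap_ancestry)):
            score += weights[ancestry][j] * allele
    return score


# Hypothetical individual with three variants and two ancestry labels, "A" and "B".
haplotypes = [[1, 0, 1], [0, 1, 1]]
local_ancestry = [["A", "A", "B"], ["B", "B", "B"]]
weights = {"A": [0.5, 0.25, 1.0], "B": [0.25, 0.75, 0.5]}

score = ancestry_aware_score(haplotypes, local_ancestry, weights)
print(score)
```

With a single set of weights shared across ancestries, this reduces to the usual weighted sum of allele counts.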

Anticipated Findings

We expect our method to improve polygenic prediction in admixed populations and thus promote health equity in clinical practice.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Sex at Birth
  • Gender Identity
  • Sexual Orientation
  • Geography
  • Disability Status
  • Access to Care
  • Education Level
  • Income Level

Data Set Used

Controlled Tier

Research Team

Owner:

  • Yuhan Xie - Research Fellow, Yale University
  • Leqi Xu - Graduate Trainee, Yale University
  • Jiaqi Hu - Graduate Trainee, Yale University
  • Jiangnan Shen - Graduate Trainee, Yale University
  • Geyu Zhou - Research Fellow, Yale University

Duplicate of Novel genes for sudden cardiac arrest/death

Sudden cardiac arrest/death is a problem that affects about 1 in 1000 Americans every year. There are around 60 genes that are well-established genetic causes of this condition. We seek to use All of Us data to identify novel…

Scientific Questions Being Studied

Sudden cardiac arrest/death is a problem that affects about 1 in 1000 Americans every year. There are around 60 genes that are well-established genetic causes of this condition. We seek to use All of Us data to identify novel genes that are associated with an increased risk of sudden cardiac arrest/death.

Project Purpose(s)

  • Disease Focused Research (Sudden cardiac arrest/death)
  • Ancestry

Scientific Approaches

We will identify All of Us participants with the phenotype of sudden cardiac arrest/death or with subphenotypes that can cause sudden cardiac arrest/death, which include arrhythmic diseases (Brugada syndrome, long QT syndrome, short QT syndrome, CPVT, ventricular tachycardia, ventricular fibrillation) and cardiomyopathic diseases (dilated cardiomyopathy, hypertrophic cardiomyopathy, arrhythmogenic cardiomyopathy). In this cohort we will look for truncating variants in genes that could contribute to the phenotype.
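
The variant search described above could look roughly like the sketch below, which keeps only truncating variants and counts distinct carriers per gene within a cohort. The consequence labels (`frameshift`, `stop_gain`, `canonical_splice`), gene names, and participant IDs are hypothetical placeholders; real annotation tools use their own vocabularies, and the project's actual pipeline is not described.

```python
from collections import defaultdict

# Hypothetical labels for truncating consequences.
TRUNCATING = {"frameshift", "stop_gain", "canonical_splice"}

# Hypothetical annotated variant calls for cohort members.
variants = [
    {"participant_id": 101, "gene": "GENE_A", "consequence": "frameshift"},
    {"participant_id": 101, "gene": "GENE_A", "consequence": "stop_gain"},
    {"participant_id": 102, "gene": "GENE_A", "consequence": "missense"},
    {"participant_id": 103, "gene": "GENE_A", "consequence": "canonical_splice"},
    {"participant_id": 104, "gene": "GENE_B", "consequence": "synonymous"},
    {"participant_id": 105, "gene": "GENE_B", "consequence": "stop_gain"},
]


def truncating_carriers_by_gene(variant_calls):
    """Count distinct participants carrying at least one truncating variant per gene."""
    carriers = defaultdict(set)
    for v in variant_calls:
        if v["consequence"] in TRUNCATING:
            carriers[v["gene"]].add(v["participant_id"])
    return {gene: len(ids) for gene, ids in carriers.items()}


carrier_counts = truncating_carriers_by_gene(variants)
print(carrier_counts)
```

Carrier counts alone do not establish association; they would need to be compared against a control group with an appropriate statistical test.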

Anticipated Findings

We expect to identify new genes that, when harboring truncating variants (frameshift, stop-gain, canonical splice), lead to sudden cardiac arrest/death.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Geography

Data Set Used

Controlled Tier

Research Team

Owner:

  • Marco Mathias - Research Assistant, Cedars-Sinai Medical Center

Duplicate of How to Work with All of Us Survey Data (v8)

The data will be used for education purposes (e.g. for a college research methods course, to educate students on population-based research approaches). We recommend that all researchers explore the notebooks in this workspace to learn the basics of All of…

Scientific Questions Being Studied

The data will be used for education purposes (e.g. for a college research methods course, to educate students on population-based research approaches).

We recommend that all researchers explore the notebooks in this workspace to learn the basics of All of Us Program Data. What should you expect? By running the notebooks in this workspace, you should become familiar with how to query PPI questions/surveys and what the frequencies of answers are for each question in each PPI module.

Project Purpose(s)

  • Educational
  • Methods Development
  • Other Purpose (The primary purpose of the use of All of Us data is to develop and/or validate specific methods/tools for analyzing or interpreting data (e.g. statistical methods for describing data trends, developing more powerful methods to detect gene-environment, or other types of interactions in genome-wide association studies). It also provides instruction for key Researcher Workbench components and All of Us data representation.)

Scientific Approaches

By running the notebooks in this workspace, you should become familiar with how to query PPI questions/surveys and what the frequencies of answers are for each question in each PPI module.
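
As a small illustration of what tabulating answer frequencies per question and module can look like, here is a sketch on made-up survey rows. The module names, questions, and answers are hypothetical, and the workspace's own notebooks query the real survey tables rather than an in-memory list.

```python
from collections import Counter, defaultdict

# Hypothetical survey responses: one row per participant answer.
responses = [
    {"module": "Lifestyle", "question": "Smoked 100 cigarettes?", "answer": "Yes"},
    {"module": "Lifestyle", "question": "Smoked 100 cigarettes?", "answer": "No"},
    {"module": "Lifestyle", "question": "Smoked 100 cigarettes?", "answer": "No"},
    {"module": "Overall Health", "question": "General health?", "answer": "Good"},
    {"module": "Overall Health", "question": "General health?", "answer": "Fair"},
    {"module": "Overall Health", "question": "General health?", "answer": "Good"},
]


def answer_frequencies(rows):
    """Count how often each answer occurs for every (module, question) pair."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[(row["module"], row["question"])][row["answer"]] += 1
    return counts


frequencies = answer_frequencies(responses)
for (module, question), counter in frequencies.items():
    print(module, "|", question, "|", dict(counter))
```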

Anticipated Findings

By reading and running the notebooks in this Tutorial Workspace, researchers will learn the following:
- how to query the survey data,
- how to summarize PPI modules and questions.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

  • Ke Wang - Research Associate, Columbia University

SUD Psychedelic-Assisted Therapies

This project will describe and categorize the demographics, social determinants, cultural correlates, mental health status, healthcare utilization, and alternative healing practices among survey respondents who report psychedelic use (psilocybin, MDMA, ketamine) in the past 30 days. This exploratory investigation of…

Scientific Questions Being Studied

This project will describe and categorize the demographics, social determinants, cultural correlates, mental health status, healthcare utilization, and alternative healing practices among survey respondents who report psychedelic use (psilocybin, MDMA, ketamine) in the past 30 days. This exploratory investigation of the data will be used to understand profiles of patients who use alternative psychiatric approaches. The project will also serve as a research training opportunity for medical students.

Project Purpose(s)

  • Disease Focused Research (psychiatric disorders)
  • Educational

Scientific Approaches

A cohort dataset will be created for patients with ICD codes indicating treatment with psychedelic-assisted therapies as well as patients who indicate recreational use of psychedelic substances in the lifestyles survey. Initially, descriptive tools will be used to create aggregate summaries. Inferential statistics will then be used to establish associations between treatment and outcomes, as well as associations with recreational use.

Anticipated Findings

We anticipate being able to add to the literature on medically assisted treatment for substance use, and to contribute to evidence regarding whether and to what degree psychedelic-assisted therapies are effective for psychiatric disorders.

Demographic Categories of Interest

  • Age
  • Geography
  • Access to Care

Data Set Used

Registered Tier

Research Team

Owner:

Duplicate of How to Work with All of Us Genomic Data (Hail - Plink)(v8)

Not applicable - these notebooks demonstrate, through example analyses, how to use Hail and PLINK to perform genome-wide association studies using the All of Us genomic data and phenotypic data.

Scientific Questions Being Studied

Not applicable - these notebooks demonstrate, through example analyses, how to use Hail and PLINK to perform genome-wide association studies using the All of Us genomic data and phenotypic data.

Project Purpose(s)

  • Other Purpose (Demonstrate to the All of Us Researcher Workbench users how to get started with the All of Us genomic data and tools. It includes an overview of all the All of Us genomic data and shows some simple examples on how to use these data.)

Scientific Approaches

Not applicable - these notebooks demonstrate, through example analyses, how to use Hail and PLINK to perform genome-wide association studies using the All of Us genomic data and phenotypic data.

Anticipated Findings

Not applicable - these notebooks demonstrate, through example analyses, how to use Hail and PLINK to perform genome-wide association studies using the All of Us genomic data and phenotypic data.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Jeremy McRae - Mid-career Tenured Researcher, Illumina Inc

Ritchie Lab Common Files v8

This workspace contains files that we wish to make commonly available to members of the Ritchie Lab at the University of Pennsylvania.

Scientific Questions Being Studied

This workspace contains files that we wish to make commonly available to members of the Ritchie Lab at the University of Pennsylvania.

Project Purpose(s)

  • Ancestry

Scientific Approaches

The project will contain notebooks and code for general use within our research lab.

Anticipated Findings

The project will be used as a repository for notebooks and code that can be useful to all members of our research lab.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

Collaborators:

  • Anurag Verma - Early Career Tenure-track Researcher, University of Pennsylvania
  • Tess Cherlin - Research Fellow, University of Pennsylvania
  • Stephanie Mohammed - Research Fellow, University of Pennsylvania
  • Nikki Palmiero - Project Personnel, University of Pennsylvania
  • Manu Shivakumar - Graduate Trainee, University of Pennsylvania
  • Lindsay Guare - Graduate Trainee, University of Pennsylvania
  • Christopher Carson - Research Assistant, University of Pennsylvania
  • Alexis Garofalo - Graduate Trainee, University of Pennsylvania
  • David Zhang - Graduate Trainee, University of Pennsylvania
  • David Lee - Research Fellow, Northwestern University
Request a Review of this Research Project

You can request that the All of Us Resource Access Board (RAB) review a research purpose description if you have concerns that this research project may stigmatize All of Us participants or violate the Data User Code of Conduct in some other way. To request a review, you must fill in a form, which you can access by selecting ‘request a review’ below.