Research Projects Directory

16,647 active projects

This information was updated 3/7/2025

The Research Projects Directory includes information about all projects that currently exist in the Researcher Workbench to help provide transparency about how the Workbench is being used. Each project specifies whether Registered Tier or Controlled Tier data are used.

Note: Researcher Workbench users provide information about their research projects independently. Views expressed in the Research Projects Directory belong to the relevant users and do not necessarily represent those of the All of Us Research Program. Information in the Research Projects Directory is also cross-posted on AllofUs.nih.gov in compliance with the 21st Century Cures Act.

421 projects have 'COVID' in the scientific questions being studied description

Covid + DM + Scleroderma

Scientific Questions Being Studied

We are hoping to see if there is any association between COVID infection and/or vaccination and autoimmune connective tissue diseases such as scleroderma and dermatomyositis.

Project Purpose(s)

  • Disease Focused Research (Dermatomyositis, scleroderma)

Scientific Approaches

We plan to search the database for people with COVID infection and/or vaccination and evaluate if there are any associations with dermatomyositis and scleroderma.

Anticipated Findings

We think there may be a positive correlation. This would be helpful to know for patients with these diseases.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Sophia Manduca - Graduate Trainee, New York University, Grossman School of Medicine

DB8 of CRS study

Scientific Questions Being Studied

What are some of the significant characteristics of COVID-19 patients who lost their sense of smell?
Why important: to understand the potential cause of the loss of smell in COVID-19 patients.

Project Purpose(s)

  • Disease Focused Research (covid 19)
  • Methods Development

Scientific Approaches

Build ML models to discover potential patterns among COVID-19 patients who experienced smell loss.

Anticipated Findings

Find significant features that can predict smell loss in COVID-19 patients and potentially guide the patients' recovery process.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Renjie Hu - Early Career Tenure-track Researcher, University of Houston

Collaborators:

  • Zain Mehdi - Graduate Trainee, Houston Methodist Research Institute
  • Tania Banerjee - Early Career Tenure-track Researcher, University of Houston
  • Roshan Dongre - Graduate Trainee, Houston Methodist Research Institute
  • Khoa Nguyen - Student, University of Houston
  • Natalia Freire - Undergraduate Student, University of Houston
  • Najm Khan - Graduate Trainee, Rutgers, The State University of New Jersey
  • Meher Gajula - Graduate Trainee, University of Houston
  • Likhitha Reddy Kesara - Graduate Trainee, University of Houston
  • Koyal Ansingkar - Graduate Trainee, Houston Methodist Research Institute
  • Jagan Mohan Reddy Dwarampudi - Graduate Trainee, University of Houston
  • Faizaan Khan - Graduate Trainee, Houston Methodist Research Institute
  • Ethan Hoang - Undergraduate Student, University of Houston
  • Ying Lin - Early Career Tenure-track Researcher, University of Houston
  • Sicong Chang - Graduate Trainee, University of Houston
  • Aatin Dhanda - Graduate Trainee, Rutgers, The State University of New Jersey
  • Thamer Alnazzal - Graduate Trainee, University of Houston

DB7 of CRS study

Scientific Questions Being Studied

What are some of the significant characteristics of COVID-19 patients who lost their sense of smell?
Why important: to understand the potential cause of the loss of smell in COVID-19 patients.

Project Purpose(s)

  • Disease Focused Research (covid 19)
  • Methods Development

Scientific Approaches

Build ML models to discover potential patterns among COVID-19 patients who experienced smell loss.

Anticipated Findings

Find significant features that can predict smell loss in COVID-19 patients and potentially guide the patients' recovery process.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Renjie Hu - Early Career Tenure-track Researcher, University of Houston
  • Meher Gajula - Graduate Trainee, University of Houston

Collaborators:

  • Zain Mehdi - Graduate Trainee, Houston Methodist Research Institute
  • Tania Banerjee - Early Career Tenure-track Researcher, University of Houston
  • Roshan Dongre - Graduate Trainee, Houston Methodist Research Institute
  • Khoa Nguyen - Student, University of Houston
  • Natalia Freire - Undergraduate Student, University of Houston
  • Najm Khan - Graduate Trainee, Rutgers, The State University of New Jersey
  • Likhitha Reddy Kesara - Graduate Trainee, University of Houston
  • Koyal Ansingkar - Graduate Trainee, Houston Methodist Research Institute
  • Jagan Mohan Reddy Dwarampudi - Graduate Trainee, University of Houston
  • Faizaan Khan - Graduate Trainee, Houston Methodist Research Institute
  • Ethan Hoang - Undergraduate Student, University of Houston
  • Ying Lin - Early Career Tenure-track Researcher, University of Houston
  • Sicong Chang - Graduate Trainee, University of Houston
  • Aatin Dhanda - Graduate Trainee, Rutgers, The State University of New Jersey
  • Thamer Alnazzal - Graduate Trainee, University of Houston

HAP464 Antidepressants Analysis

Scientific Questions Being Studied

The specific scientific question I aim to study is how certain environmental factors influence the spread of respiratory infections in urban populations. This question is important because understanding how air pollution, temperature variations, and other environmental variables affect the transmission of diseases like the flu or COVID-19 can help public health authorities create more effective prevention strategies. By studying these factors, we can gain insights into how to reduce the burden of infectious diseases, particularly in high-density areas. Additionally, this research will provide valuable data that could inform policies on improving urban environments for better public health outcomes. Through this study, I hope to identify patterns that can be used to predict and control outbreaks more efficiently.

Project Purpose(s)

  • Educational

Scientific Approaches

To answer my scientific question, I plan to use a combination of epidemiological analysis and environmental data modeling. Specifically, I will use publicly available datasets on air quality, weather patterns, and reported cases of respiratory infections, such as flu and COVID-19, from health organizations and government sources. I will apply statistical methods, including regression analysis, to examine the relationships between environmental factors (like air pollution levels, temperature, and humidity) and the spread of infections over time. Additionally, I will use geographic information system (GIS) tools to visualize the spatial distribution of infections and environmental conditions across urban areas. This will help identify patterns and high-risk zones for outbreaks.

Anticipated Findings

I anticipate that the study will reveal significant correlations between specific environmental factors—such as air pollution levels, temperature fluctuations, and humidity—and the spread of respiratory infections in urban populations. For example, I might find that higher levels of air pollution or certain temperature ranges are associated with an increased risk of infection transmission. These findings could also suggest that certain urban areas with more exposure to environmental stressors are more vulnerable to outbreaks.

The contribution to scientific knowledge will be twofold. First, this research could provide a clearer understanding of how environmental conditions directly influence the dynamics of disease transmission, which is still an area with many unknowns. Second, it could offer actionable insights for public health officials, helping them develop more effective preventative strategies tailored to specific environmental conditions.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

HAP464 - Antidepressant Analysis

Scientific Questions Being Studied

The specific scientific question I aim to study is how certain environmental factors influence the spread of respiratory infections in urban populations. This question is important because understanding how air pollution, temperature variations, and other environmental variables affect the transmission of diseases like the flu or COVID-19 can help public health authorities create more effective prevention strategies. By studying these factors, we can gain insights into how to reduce the burden of infectious diseases, particularly in high-density areas. Additionally, this research will provide valuable data that could inform policies on improving urban environments for better public health outcomes. Through this study, I hope to identify patterns that can be used to predict and control outbreaks more efficiently.

Project Purpose(s)

  • Educational

Scientific Approaches

To answer my scientific question, I plan to use a combination of epidemiological analysis and environmental data modeling. Specifically, I will use publicly available datasets on air quality, weather patterns, and reported cases of respiratory infections, such as flu and COVID-19, from health organizations and government sources. I will apply statistical methods, including regression analysis, to examine the relationships between environmental factors (like air pollution levels, temperature, and humidity) and the spread of infections over time. Additionally, I will use geographic information system (GIS) tools to visualize the spatial distribution of infections and environmental conditions across urban areas. This will help identify patterns and high-risk zones for outbreaks.

Anticipated Findings

I anticipate that the study will reveal significant correlations between specific environmental factors—such as air pollution levels, temperature fluctuations, and humidity—and the spread of respiratory infections in urban populations. For example, I might find that higher levels of air pollution or certain temperature ranges are associated with an increased risk of infection transmission. These findings could also suggest that certain urban areas with more exposure to environmental stressors are more vulnerable to outbreaks.

The contribution to scientific knowledge will be twofold. First, this research could provide a clearer understanding of how environmental conditions directly influence the dynamics of disease transmission, which is still an area with many unknowns. Second, it could offer actionable insights for public health officials, helping them develop more effective preventative strategies tailored to specific environmental conditions.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

Duplicate of AOU_Recover_Long_Covid_v6

Scientific Questions Being Studied

The purpose of this workspace was to implement the published XGBoost machine learning (ML) model, developed using the National COVID Cohort Collaborative’s (N3C) EHR repository, to identify potential patients with PASC/Long COVID in the All of Us Research Program.

Project Purpose(s)

  • Disease Focused Research (Long COVID)

Scientific Approaches

To achieve this objective, data science workflows were used to apply ML algorithms on the Researcher Workbench. This effort expanded the number of participants used to evaluate the ML models that identify risk of PASC/Long COVID. It also served to validate the efforts of one team and to provide insight to other teams. These models were implemented within the All of Us Controlled Tier data (C2022Q2R2), which was last refreshed on June 22, 2022. We intend to provide a step-by-step guide for the implementation of N3C's ML Model for identification of PASC/Long COVID Phenotype in the All of Us dataset.

Anticipated Findings

We intend to provide a step-by-step guide for the implementation of N3C's ML Model for identification of PASC/Long COVID Phenotype in the All of Us dataset.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

Long COVID

Scientific Questions Being Studied

We are interested in identifying effective treatments for Long COVID in order to develop a personalized recommendation algorithm that can be used as a Clinical Decision Support Tool in the future to improve patient outcomes.

Project Purpose(s)

  • Disease Focused Research (Long COVID)
  • Methods Development
  • Commercial

Scientific Approaches

We will filter by patients diagnosed with Long COVID, identify trends in patient outcome with different treatments and other variables, and apply machine learning models to identify predictors of outcome using demographic and clinical data.

Anticipated Findings

We will identify important demographic and clinical factors that impact treatment efficacy for Long COVID, and use that information to optimize treatment plans for individual patients.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

Duplicate of Longitudinal serum cytokines and health outcomes COVID-19

Scientific Questions Being Studied

How are longitudinal serum cytokines (IL-6, IL-10, CRP, etc.) associated with intubation among COVID-19 patients?
Understanding how longitudinal serum cytokines like IL-6, IL-10, and CRP correlate with severe outcomes in COVID-19 patients is critical. These cytokines are pivotal in immune response and their levels can indicate cytokine storm, which worsens inflammation and tissue damage. Tracking these markers over time helps predict disease severity and outcomes such as respiratory failure or death. This knowledge aids in timely intervention and personalized treatment, potentially improving patient outcomes amid the pandemic.

Project Purpose(s)

  • Disease Focused Research (COVID-19)
  • Population Health

Scientific Approaches

We are using a retrospective cohort study design to examine factors associated with intubation in COVID-19 patients.
Dependent Variable: Binary variable indicating whether intubation occurred (1) or did not occur (0).
Independent Variables: Demographics: Age, sex, race, and ethnicity.
Physical Measurements: BMI and pregnancy status.
Biomarkers: D-dimer, Interleukin-6 (IL-6), IL-10, CRP. Collect biomarker data at specific times relative to the diagnosis of COVID-19.
COVID Vaccine Status: Document vaccination status and dates.
Drug: Record medications administered and their timing relative to COVID-19 diagnosis.

Repeated measures logistic regression will be performed to assess the relationship between the biomarkers and intubation status. Adjusted odds ratios (aOR) will be reported with their 95% confidence intervals. We will also examine potential interaction effects between independent variables (e.g., biomarkers and demographics).
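
As a minimal sketch of the reporting step, assuming a fitted logistic regression has already produced a coefficient and its standard error for one biomarker, an adjusted odds ratio and a Wald 95% confidence interval could be derived as below. The function name and the example values are hypothetical and are not taken from the project.

```python
import math

def adjusted_odds_ratio(beta, std_err, z=1.96):
    """Return (aOR, lower, upper) for a logistic-regression coefficient.

    beta and std_err are on the log-odds scale; z=1.96 gives a 95% Wald interval.
    """
    return (
        math.exp(beta),
        math.exp(beta - z * std_err),
        math.exp(beta + z * std_err),
    )

# Hypothetical coefficient for illustration only (not a study result).
aor, lower, upper = adjusted_odds_ratio(beta=math.log(2.0), std_err=0.25)
print(f"aOR={aor:.2f}, 95% CI=({lower:.2f}, {upper:.2f})")
```

For repeated measures, the standard error would need to come from a model that accounts for within-patient correlation; this sketch only covers the conversion from the log-odds scale.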

Anticipated Findings

We anticipate that elevated levels of biomarkers such as D-dimer, IL-6, IL-10, and CRP will be significantly associated with increased odds of intubation. This could underscore the role of systemic inflammation and coagulopathy in disease severity. We would also like to assess the impact of demographic factors, namely whether older age, male sex, and specific racial or ethnic groups are more prone to requiring intubation, which would highlight demographic disparities in COVID-19 outcomes.

Demographic Categories of Interest

  • Race / Ethnicity

Data Set Used

Controlled Tier

Research Team

Owner:

  • Weize Wang - Project Personnel, Florida International University

Final work

Scientific Questions Being Studied

We wanted to study the association between pneumonia and hemolytic anemia, as well as COVID and other associations. Different associations have been found so far, and we wanted to see whether the gathered data also support our hypothesis.

Project Purpose(s)

  • Population Health
  • Educational

Scientific Approaches

We will use the All of Us research dataset. The research method will mostly be a cross-sectional study. We will use the AI code generator and our in-house analyst to oversee the statistical part.

Anticipated Findings

We are working to see whether our hypothesis also applies to the All of Us research dataset. If we see any association, given that this will be a cross-sectional study, we can propose clinical trials and develop diagnostic or treatment guidelines.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

  • Pooja Roy - Other, New York City Health & Hospitals

Housing Insecurity and Mental Wellbeing v6 CT

Scientific Questions Being Studied

This exploratory analysis will examine the association between housing insecurity and the impact of COVID-19 on participant health and mental health as measured in the COPE surveys. The initial exploration will assess whether the sample sizes and cross-tabulations are sufficient to proceed with a research project examining the impact of the COVID-19 pandemic on housing insecure individuals, as compared to securely-housed individuals.

Project Purpose(s)

  • Social / Behavioral

Scientific Approaches

This analysis will pull data from the Basics survey and the COPE surveys to examine whether answers in the COPE surveys can be broken down by differential housing circumstances. This will include summary and bivariate analyses.

Anticipated Findings

We hypothesize that housing insecurity will be associated with enduring worse health and mental health outcomes as a result of the COVID-19 pandemic.
This research project seeks to reduce health disparities and improve health equity in underrepresented in biomedical research (UBR) populations.

Demographic Categories of Interest

  • Race / Ethnicity
  • Income Level

Data Set Used

Controlled Tier

Research Team

Owner:

  • Catherine Xin - Graduate Trainee, New York University
  • Stephanie Cook - Early Career Tenure-track Researcher, New York University
  • Andrea Titus - Other, New York University, Grossman School of Medicine

Collaborators:

  • Giselle Routhier - Research Fellow, New York University, Grossman School of Medicine
  • Binyu Cui - Graduate Trainee, New York University
  • Chenziheng Weng - Graduate Trainee, New York University

COVID EHR Exploration

Scientific Questions Being Studied

We intend to explore several aspects of the EHR to elucidate health patterns in hospital settings during COVID-19. Right now, we intend to build a set of COVID positive patients and determine burden of disease longitudinally. In the future, once genetic information is released, we hope to use genetic information to explain differences in COVID severity.

Project Purpose(s)

  • Disease Focused Research (Coronavirus)
  • Population Health
  • Ancestry

Scientific Approaches

We will use the EHR, COPE survey, and in the future genetic information to explore associations between genetics and COVID as well as burden of disease.

Anticipated Findings

Any discovered associations between genetics and COVID severity can help inform clinical practitioners about potential increased risk of severe illness and death that is attributable to genetic predisposition.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

  • Tracey Ferrara - Project Personnel, National Human Genome Research Institute (NIH - NHGRI)
  • David Schlueter - Research Fellow, National Human Genome Research Institute (NIH - NHGRI)
  • David Schlueter - Early Career Tenure-track Researcher, University of Toronto

Metformin Association with PASC

Scientific Questions Being Studied

The overall goal of this research is to evaluate the association between use of metformin prior to COVID-19 illness and subsequent incidence of PASC compared to patients who were prevalent users of other diabetes medications.

Project Purpose(s)

  • Disease Focused Research (Postacute sequelae of SARS-CoV-2 infection (PASC))

Scientific Approaches

Using condition and medication information in the Controlled Tier dataset, we will compare patients who used different diabetes medications prior to a COVID-19 infection to quantify their risk of developing PASC. An analytic fact table will be developed and data will be analyzed using Python and SQL. The study design is a retrospective cohort analysis using trial emulation techniques in adults with documented SARS-CoV-2 infection. The index date will be the date of first documented SARS-CoV-2 infection, and the exposure of interest will be an existing metformin or other diabetes medication prescription. The outcome of interest is a subsequent diagnosis of PASC.
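
The index-date and exposure definitions can be sketched as follows. This is a simplified illustration, assuming each patient's infection and prescription dates are already available as Python `date` objects. The function names and sample dates are hypothetical, and the project's actual analytic fact table and trial emulation steps are not shown.

```python
from datetime import date

def index_date(infection_dates):
    """Index date: the first documented SARS-CoV-2 infection."""
    return min(infection_dates)

def exposed_at_index(prescription_dates, index):
    """True if any prescription was recorded before the index date.

    Simplifying assumption: a prior prescription counts as an existing one;
    a real analysis would likely also check that it was still active.
    """
    return any(d < index for d in prescription_dates)

# Made-up dates for illustration only.
infections = [date(2021, 3, 10), date(2022, 1, 5)]
metformin_rx = [date(2020, 11, 1), date(2021, 6, 15)]

idx = index_date(infections)
print(idx, exposed_at_index(metformin_rx, idx))
```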

Anticipated Findings

In vitro data show metformin inhibits severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) virus and pathogenic inflammatory responses to the virus. Clinical trial data show metformin prevents severe Covid-19 and Long Covid. We anticipate seeing an association between metformin use and the risk of developing PASC.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Steve Johnson - Early Career Tenure-track Researcher, University of Minnesota
  • Lisiane Pruinelli - Mid-career Tenured Researcher, University of Minnesota

Collaborators:

  • Tim Meyer - Project Personnel, University of Minnesota
  • Ragnhildur Bjarnadottir - Early Career Tenure-track Researcher, University of Florida
  • Marisa Sileo - Project Personnel, Georgia Institute of Technology

Impact of Covid19

Scientific Questions Being Studied

My plan is to focus on understanding the impact of COVID-19 vaccination progress and its relationship with public health, economic recovery, and demographic factors.
How did vaccination rates correlate with COVID-19 case and mortality trends over time?
Were there disparities in vaccination distribution across different regions, age groups, or socioeconomic backgrounds?
Did higher vaccination rates lead to a significant reduction in hospitalizations and severe cases?
What factors contributed to vaccine hesitancy, and how did misinformation impact vaccination uptake?
Did social media influence public perception of vaccines, and what patterns can be identified from sentiment analysis?

Project Purpose(s)

  • Educational

Scientific Approaches

I plan to utilize available datasets related to COVID-19 vaccination, public health outcomes, and socioeconomic factors on the All of Us platform.

Research Methods and Analytical Approaches:

(a) Exploratory Data Analysis (EDA)

Descriptive Statistics:
  • Compute means, medians, and standard deviations for vaccination rates, case fatality rates, and economic indicators.
  • Identify missing data and perform necessary cleaning (handling null values, duplicates).

Data Visualization:
  • Create time-series plots to observe vaccination progress vs. case and death rates.
  • Heatmaps to detect correlations between vaccination rates and socioeconomic factors.

(b) Statistical and Machine Learning Approaches

Correlation and Regression Analysis:
  • Pearson and Spearman correlation to assess relationships between vaccination rates, public health outcomes, and economic recovery.
  • Multiple Linear Regression to analyze the impact of vaccination rates on COVID-19 fatalities and economic growth.
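
The two correlation measures named above can be illustrated with a small dependency-free sketch. The helper names and the toy numbers are hypothetical and stand in for whatever vaccination and outcome variables the project actually uses. Spearman correlation is computed here as the Pearson correlation of average ranks.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Toy values for illustration only (not real data).
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(pearson(x, y), spearman(x, y))
```

Because `y` increases monotonically but not linearly with `x` in this toy example, the rank-based measure reaches its maximum while the linear measure does not, which is the usual reason for reporting both.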

Anticipated Findings

Policymakers can use findings to optimize vaccine distribution and public health campaigns for future pandemics.
Governments and businesses can develop economic recovery strategies informed by vaccination data.
Social media platforms can use insights to combat misinformation and promote credible health information more effectively.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

Replication and validation of combinatorial genetic risk factors for long COVID

Scientific Questions Being Studied

Long COVID is a debilitating chronic condition that has affected over 100 million people globally. Despite considerable global research, traditional genetic studies have identified a single gene linked to long COVID, with little insight into the mechanisms underlying this complex heterogeneous disease. Using PrecisionLife’s unique combinatorial approach to analyzing complex, chronic diseases, Taylor et al. (2023) identified 73 genetic associations with long COVID, including mechanistic differences between different patient subgroups. These genetic associations are reflected in combinatorial disease signatures, i.e., combinations of SNP genotypes that are significantly over- or under-enriched in long COVID patients. This study aims to replicate and validate those signatures in a diverse patient population. Validated signatures will then be used as the basis for a clinical decision support tool that can be used to stratify patients based on genetic risk and mechanistic subcategorization.

Project Purpose(s)

  • Disease Focused Research (Long COVID)
  • Methods Development
  • Ancestry

Scientific Approaches

For each Long COVID disease signature from Taylor et al. (2023), we will generate summary statistics (e.g., # cases & controls, odds ratio, p-value) to evaluate the overall degree of replication in a patient cohort comprised of long COVID patients and healthy controls. Signatures with odds ratio <1 will be flagged as non-replicating. We will also test whether the count of disease signatures possessed by each patient is significantly associated with case-control status. This test will be repeated in ancestry-specific cohorts to identify potential challenges for health equity.
For each signature, we will evaluate the contribution of each component SNP to disease risk by comparing the odds ratio for patients with the full signature to the odds ratio for patients with the broader signature excluding the focal SNP. SNPs will be removed from the signature when the odds ratio of the latter exceeds the former. This refinement process will be repeated using a 5-fold cross validation approach.
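
The replication and refinement rules described above can be sketched roughly as follows. All names and counts are hypothetical, and the 0.5 continuity correction (Haldane-Anscombe) is an assumption added here to avoid division by zero. The source does not say how empty cells are handled.

```python
from dataclasses import dataclass

@dataclass
class SignatureCounts:
    name: str
    cases_with: int        # cases carrying the signature
    cases_without: int     # cases not carrying it
    controls_with: int     # controls carrying the signature
    controls_without: int  # controls not carrying it

def odds_ratio(c, correction=0.5):
    """Odds ratio from a 2x2 table, with an assumed continuity correction."""
    a = c.cases_with + correction
    b = c.cases_without + correction
    cc = c.controls_with + correction
    d = c.controls_without + correction
    return (a * d) / (b * cc)

def non_replicating(signatures):
    """Names of signatures with odds ratio below 1."""
    return [s.name for s in signatures if odds_ratio(s) < 1.0]

def should_drop_snp(full_or, reduced_or):
    """Drop the focal SNP when the signature without it has the higher odds ratio."""
    return reduced_or > full_or

# Made-up counts for illustration only.
sig_a = SignatureCounts("sig_a", 30, 70, 10, 90)
sig_b = SignatureCounts("sig_b", 5, 95, 20, 80)
print(non_replicating([sig_a, sig_b]))
```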

Anticipated Findings

The main output of this study will be a set of combinatorial disease signatures that are associated with elevated risk of Long COVID in multiple datasets. Each signature will be paired with summary statistics (e.g., odds ratio, p-value), allowing us to identify and annotate signatures that are individually significant. We expect to further demonstrate that a risk score based on the cumulative effects of refined signatures is significantly correlated with prevalence of long COVID and that this correlation is significant in all broad ancestry groups and not just patients with European ancestry.

Validated signatures will be further clustered based on shared mechanistic hypotheses as identified in the Taylor et al. (2023) manuscript. We expect to demonstrate that these signatures can be used to stratify the population, opening potential for precision medicine-based treatment of long COVID.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

Social connection and suicide risk (v8)

Scientific Questions Being Studied

Social capital is a major determinant of suicide risk. However, it remains unknown whether this association differs by the type of social capital. In this study, we examine associations between neighborhood bonding, bridging, and linking social capital and suicidal ideation during the COVID-19 pandemic.

Project Purpose(s)

  • Social / Behavioral

Scientific Approaches

We will use data from the COPE surveys to assess suicidal ideation during the pandemic. We will link external data capturing neighborhood social capital. We will then perform regression analysis, adjusting for individual- and neighborhood-level characteristics as potential confounders.

Anticipated Findings

This study will provide new insights into which specific type of social capital is more effective in preventing suicide.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Sex at Birth
  • Gender Identity
  • Sexual Orientation
  • Geography
  • Disability Status
  • Education Level
  • Income Level

Data Set Used

Controlled Tier

Research Team

Owner:

  • Koichiro Shiba - Early Career Tenure-track Researcher, Boston University

Collaborators:

  • Haku Chao - Undergraduate Student, Boston University
  • Azuna Sawada - Graduate Trainee, Columbia University

Duplicate of All of Us chronic conditions Fitbit analysis

Objective: To assess differences in Fitbit measures across various chronic conditions, such as diabetes, Covid and long Covid, hypertension, heart diseases, and others. Our hypothesis is that individuals with chronic conditions will have poorer Fitbit measure health outcomes than those…

Scientific Questions Being Studied

Objective: To assess differences in Fitbit measures across various chronic conditions, such as diabetes, Covid and long Covid, hypertension, heart diseases, and others. Our hypothesis is that individuals with chronic conditions will have poorer Fitbit measure health outcomes than those without chronic conditions.

Project Purpose(s)

  • Disease Focused Research (chronic conditions)
  • Population Health
  • Social / Behavioral

Scientific Approaches

Dataset: We will develop a dataset of Fitbit users with and without certain chronic conditions.
We will describe the sample in terms of sociodemographics and use a combination of feature engineering and machine learning techniques to assess differences between groups.

Anticipated Findings

We expect to find differences in heart rate, activity levels, and sleep across different disease groups, as well as heterogeneity across sociodemographic groups. The findings will help develop passive characterization and predictive models of chronic conditions.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age
  • Geography
  • Disability Status
  • Access to Care
  • Education Level
  • Income Level

Data Set Used

Controlled Tier

Research Team

Owner:

  • Citina Liang - Graduate Trainee, University of Southern California

Duplicate of COVID 19 and Pulmonary Hypertension

Does COVID-19 predispose individuals to pulmonary hypertension? If so, is there any variability across age, gender, and ethnic groups?

Scientific Questions Being Studied

Does COVID-19 predispose individuals to pulmonary hypertension? If so, is there any variability across age, gender, and ethnic groups?

Project Purpose(s)

  • Disease Focused Research (COVID 19)
  • Population Health
  • Methods Development
  • Control Set
  • Ethical, Legal, and Social Implications (ELSI)

Scientific Approaches

Perform a retrospective analysis of COVID-19 and assess pulmonary hypertension as the outcome. We would like to see whether there is a correlation between these diseases. If there is, we would like to explore the outcome across gender, ethnicity, and age groups.

Anticipated Findings

We anticipate a direct correlation between COVID-19 and pulmonary hypertension. If there is a correlation, we would like to explore the relationship across age, gender, and ethnicity.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

ICU ADMISSIONS

The COVID-19 pandemic put immense pressure on ICUs. Data-driven strategies are needed to improve patient management, yet challenges remain in predicting who will require ICU care, who may not survive despite admission, and how long survivors will stay. This study…

Scientific Questions Being Studied

The COVID-19 pandemic put immense pressure on ICUs. Data-driven strategies are needed to improve patient management, yet challenges remain in predicting who will require ICU care, who may not survive despite admission, and how long survivors will stay. This study aims to build on prior work by identifying key clinical markers and evaluating the accuracy of predictive models for ICU admission, mortality, and length of stay before patients require critical care.

Expanding on a previous study that used 733 hospitalized COVID-19 patients from a single institution, this research incorporates a larger and more diverse dataset from multiple hospitals to improve generalizability. Demographic, clinical, and laboratory data will be analyzed to enhance model reliability, addressing past concerns about sample size limitations, data imbalance, and the lack of external validation.

Project Purpose(s)

  • Disease Focused Research (RESPIRATORY INFECTIONS)
  • Educational

Scientific Approaches

Machine learning models will assess ICU risk using collected clinical data, allowing for early identification of high-risk patients. To improve fairness and accuracy, strategies will be applied to balance the dataset so that predictions remain reliable across different patient groups. The goal is to develop an interpretable and practical decision-support tool for ICU planning and resource allocation.

Mathematical component: within-host or transmission models of the SIR type, formulated and analyzed as ordinary differential equations, applied to SARS-CoV-2 and seasonal influenza variants.
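As an illustration of the SIR-type transmission models mentioned above, the following is a minimal sketch of the classic three-compartment model, dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - gamma*I, dR/dt = gamma*I, integrated with a simple forward Euler step. The function name, parameter values, and integration scheme are assumptions chosen for the example; they are not taken from the project.

```python
from typing import List, Tuple


def simulate_sir(
    beta: float,   # transmission rate per day (hypothetical value supplied by caller)
    gamma: float,  # recovery rate per day (hypothetical value supplied by caller)
    s0: float,
    i0: float,
    r0: float,     # initial number recovered (not the basic reproduction number)
    days: float,
    dt: float = 0.1,
) -> List[Tuple[float, float, float, float]]:
    """Integrate the SIR equations with forward Euler; returns (t, S, I, R) tuples."""
    s, i, r = s0, i0, r0
    n = s0 + i0 + r0  # closed population: no births, deaths, or migration
    trajectory = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for k in range(1, steps + 1):
        new_infections = beta * s * i / n * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        trajectory.append((k * dt, s, i, r))
    return trajectory
```

For example, `simulate_sir(beta=0.3, gamma=0.1, s0=999.0, i0=1.0, r0=0.0, days=160)` simulates an outbreak in a population of 1,000 with illustrative rates. A production analysis would more likely use an adaptive ODE solver than a fixed Euler step.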

Anticipated Findings

By refining prediction methods and incorporating broader hospital data, this study aims to strengthen ICU management strategies. Future directions will focus on validating model performance across healthcare systems, addressing ethical considerations in ICU prioritization, and integrating predictions into clinical decision-making to support proactive patient care and resource efficiency.

Demographic Categories of Interest

  • Race / Ethnicity
  • Age

Data Set Used

Registered Tier

Research Team

Owner:

Version 8 COVID Arrhythmia 2025

This project seeks to understand whether the established relationship between COVID-19 infection and new diagnosis of cardiac arrhythmia is reflected in the All of Us dataset. As COVID-19 and its sequelae are still important public health concerns, our analysis may…

Scientific Questions Being Studied

This project seeks to understand whether the established relationship between COVID-19 infection and new diagnosis of cardiac arrhythmia is reflected in the All of Us dataset. As COVID-19 and its sequelae are still important public health concerns, our analysis may contribute to a growing body of research that characterizes the long-term effects of this novel disease.

Project Purpose(s)

  • Disease Focused Research (COVID-19, cardiac arrhythmia)

Scientific Approaches

We will assemble a dataset of participants with and without cardiac arrhythmias and COVID-19, as well as relevant covariates. We will conduct logistic regression using both a matched case design and a case-crossover design in order to understand whether a significant relationship exists between COVID-19 infection and new-onset cardiac arrhythmia. Data cleaning will be conducted in Python, and analysis will be conducted in R.

Anticipated Findings

We anticipate a statistically significant relationship between COVID-19 infection and new-onset cardiac arrhythmia; however, we are unsure which study design will be more appropriate. We also wonder whether COVID-19 cases not recorded in the EHR will constitute a significant confounder.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

Duplicate COVID19 Vaccine and T2D

The aim of my research is to determine the correlation between blood glucose, hemoglobin A1c, and SARS-CoV-2 antibody among immunized type 2 diabetic patients who received different doses and types of COVID-19 vaccines.

Scientific Questions Being Studied

The aim of my research is to determine the correlation between blood glucose, hemoglobin A1c, and SARS-CoV-2 antibody among immunized type 2 diabetic patients who received different doses and types of COVID-19 vaccines.

Project Purpose(s)

  • Disease Focused Research (Diabetes and COVID19)
  • Population Health
  • Social / Behavioral
  • Educational

Scientific Approaches

The aim of my research is to determine the correlation between blood glucose, hemoglobin A1c, and SARS-CoV-2 antibody among immunized type 2 diabetic patients who received different doses and types of COVID-19 vaccines. I am planning on creating a dataset consisting of type 2 diabetic patients who received their COVID-19 vaccines and have their blood glucose, HbA1c, and SARS-CoV-2 antibody measurements available. I am planning on using SAS and various statistical methods to answer my research questions.

Anticipated Findings

Participants who received the highest number of vaccine doses may have better blood glucose and HbA1c levels. Vaccine hesitancy may correlate with poor glycemic control test results. The findings of the study can help raise awareness of the role of immunization in diabetes outcomes. Furthermore, they can improve public health interventions for the management of diabetes.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

Duplicate of COVID 19 and Pulmonary Hypertension

Does COVID-19 predispose individuals to pulmonary hypertension? If so, is there any variability across age, gender, and ethnic groups?

Scientific Questions Being Studied

Does COVID-19 predispose individuals to pulmonary hypertension? If so, is there any variability across age, gender, and ethnic groups?

Project Purpose(s)

  • Disease Focused Research (COVID 19)
  • Population Health
  • Methods Development
  • Control Set
  • Ethical, Legal, and Social Implications (ELSI)

Scientific Approaches

Perform a retrospective analysis of COVID-19 and assess pulmonary hypertension as the outcome. We would like to see whether there is a correlation between these diseases. If there is, we would like to explore the outcome across gender, ethnicity, and age groups.

Anticipated Findings

We anticipate a direct correlation between COVID-19 and pulmonary hypertension. If there is a correlation, we would like to explore the relationship across age, gender, and ethnicity.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

  • Pooja Roy - Other, New York City Health & Hospitals

Covid and Time

A well-known challenge for machine learning in healthcare is that patients with different comorbidities induce different distributions over outcomes and the pattern changes across time. We will study two questions related to this heterogeneity, with the goal of producing more…

Scientific Questions Being Studied

A well-known challenge for machine learning in healthcare is that patients with different comorbidities induce different distributions over outcomes and the pattern changes across time. We will study two questions related to this heterogeneity, with the goal of producing more effective characterizations of patients at risk of severe outcomes due to SARS-CoV-2 infection as well as more effective population-level tracking of COVID-19 outbreaks and disease burden. First, we will study how temporal variation impacts the risk factors and how to detect the shifts in distribution. Second, we will study how multiple sites could pool surveillance-related data together in a manner that preserves patient privacy and allows more rapid detection of new outbreaks or changes in the distribution of disease burden.

Project Purpose(s)

  • Methods Development

Scientific Approaches

A well-known challenge for machine learning in healthcare is that patients with different comorbidities induce different distributions over outcomes and the pattern changes across time. We will study two questions related to this heterogeneity, with the goal of producing more effective characterizations of patients at risk of severe outcomes due to SARS-CoV-2 infection as well as more effective population-level tracking of COVID-19 outbreaks and disease burden. First, we will study how temporal variation impacts the risk factors and how to detect the shifts in distribution. Second, we will study how multiple sites could pool surveillance-related data together in a manner that preserves patient privacy and allows more rapid detection of new outbreaks or changes in the distribution of disease burden.

Anticipated Findings

A well-known challenge for machine learning in healthcare is that patients with different comorbidities induce different distributions over outcomes and the pattern changes across time. We will study two questions related to this heterogeneity, with the goal of producing more effective characterizations of patients at risk of severe outcomes due to SARS-CoV-2 infection as well as more effective population-level tracking of COVID-19 outbreaks and disease burden. First, we will study how temporal variation impacts the risk factors and how to detect the shifts in distribution. Second, we will study how multiple sites could pool surveillance-related data together in a manner that preserves patient privacy and allows more rapid detection of new outbreaks or changes in the distribution of disease burden.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Ruiqi Lyu - Graduate Trainee, Carnegie Mellon University

Investigation on Suicide in the COVID-19 pandemic Phase 2

Outbreak of Coronavirus Disease 2019 (COVID-19) has caused a new psychological burden. Patient Health Questionnaire (PHQ-9) can be used to evaluate mood status, monitor changes in signs/symptoms of suicide, and assess suicidal ideation. Here our study aims to describe the…

Scientific Questions Being Studied

Outbreak of Coronavirus Disease 2019 (COVID-19) has caused a new psychological burden. Patient Health Questionnaire (PHQ-9) can be used to evaluate mood status, monitor changes in signs/symptoms of suicide, and assess suicidal ideation. Here our study aims to describe the basic statistics of PHQ-9 scores and the inferred depression or suicide risk for all participants in the All of Us COPE survey.

Project Purpose(s)

  • Disease Focused Research (Suicidal behaviors/thoughts)

Scientific Approaches

PHQ-9 questions and answers will be retrieved for participants at six different time points. The response to each question will be converted to a numeric score (0, 1, 2, or 3), and the scores will be summed to derive the PHQ-9 total score. Participants missing any individual score will not be included in this study. Binary suicidal ideation status will be defined using the item 9 answer (i.e., yes for a score >0). Distributions of the PHQ-9 total score and suicidal ideation status at each time point will be reported using descriptive statistics stratified by age, sex, and ancestry. Changes across time points will be tested with the Kruskal-Wallis (KW) test, Friedman test, or chi-square test. Multivariable analyses will be conducted using generalized linear mixed models.
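The scoring rule described above can be sketched as follows. This is a minimal illustration under stated assumptions: the answer labels follow the standard PHQ-9 wording, which may not match the exact answer strings stored for the COPE survey, and the function name and return shape are hypothetical.

```python
from typing import Optional, Sequence, Tuple

# Assumption: standard PHQ-9 answer labels; the actual survey export may use
# different strings, which would need to be mapped to these scores.
RESPONSE_SCORES = {
    "Not at all": 0,
    "Several days": 1,
    "More than half the days": 2,
    "Nearly every day": 3,
}


def score_phq9(responses: Sequence[Optional[str]]) -> Optional[Tuple[int, bool]]:
    """Return (total score, suicidal ideation flag), or None if any item is missing."""
    if len(responses) != 9:
        raise ValueError("PHQ-9 has exactly nine items")
    scores = []
    for answer in responses:
        if answer not in RESPONSE_SCORES:
            return None  # exclude participants with any missing or unmapped item
        scores.append(RESPONSE_SCORES[answer])
    total = sum(scores)
    suicidal_ideation = scores[8] > 0  # item 9 is the last of the nine items
    return total, suicidal_ideation
```

A participant answering "Nearly every day" to items 1 through 8 and "Several days" to item 9 would receive a total of 25 with the ideation flag set.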

Anticipated Findings

We anticipate that the descriptive statistics, pairwise correlations, and multivariable model fitting results will reveal the trajectories of suicidal thoughts and behaviors during the COVID-19 pandemic. They will not only help to verify the known relationship between suicide and gender or age, but will also provide new evidence of mood status changes over the course of the COVID-19 pandemic at both the population and individual level.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

  • Hongsheng Gui - Early Career Tenure-track Researcher, Henry Ford Health System

Collaborators:

  • Hsueh-Han Yeh - Research Associate, Henry Ford Health System

PASC AI Derived Phenotypes for GWAS V8

We will develop models to predict COVID-19 patients' risk for PASC, including neurological complications. In particular, we will use the predicted scores derived from an RNN-based model trained on N3C as the phenotype for PASC that will be used…

Scientific Questions Being Studied

We will develop models to predict COVID-19 patients' risk for PASC, including neurological complications. In particular, we will use the predicted scores derived from an RNN-based model trained on N3C as the phenotype for PASC that will be used for further GWAS analysis.

Project Purpose(s)

  • Disease Focused Research (COVID-19, PASC and Alzheimer's disease)
  • Ancestry

Scientific Approaches

Datasets: N3C is one of the richest data sources, including electronic health record data for more than 5 million confirmed COVID-19 patients from 74 sites across the United States. All of Us is a unique source where we can access genetic and clinical data for 100,000 US patients, with higher representation of minority groups.

Research Methods: We will train a deep learning-based model on COVID-19 patients’ data available through the N3C initiative. As an outcome, the model will learn a phenotypic representation that consists of the patient's risk of developing post-COVID complications, including neuropsychiatric complications. Afterwards, we will transfer the model to the All of Us researcher platform and apply it to create the phenotypic representation for the 11,767 COVID-19 patients using their EHR data. Then, for the GWAS study, we will use the genotypic data for the 3,653 COVID-19 patients who have both whole genome sequencing (WGS) and EHR data.

Anticipated Findings

The goal is to bring breakthroughs in AI/ML to expedite discovery of the genetic basis of Alzheimer’s disease (AD). We expect to find associations between endophenotypes and SNPs related to long COVID and Alzheimer's disease.

The innovations of our project are as follows:
1. Using a transfer learning approach to leverage the large N3C data for phenotyping All of Us data is new.
2. We will be the first to leverage the All of Us platform to study the genetic factors for neuro-PASC and ADRD-PASC.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Controlled Tier

Research Team

Owner:

  • Laila Rasmy Bekhet - Graduate Trainee, University of Texas Health Science Center, Houston
  • Keith Sanders - Graduate Trainee, University of Texas Health Science Center, Houston
  • Hao Yan - Graduate Trainee, University of Texas Health Science Center, Houston
  • Degui Zhi - Mid-career Tenured Researcher, University of Texas Health Science Center, Houston
  • Ardalan Naseri - Other, University of Texas Health Science Center, Houston

V8 ARI Workspace - 4-21-23

We now have 4 goals in our research. This workspace is for goals 1 through 3. We have created a new workspace for Goal #4. 1. Determine prevalence of autoimmune diseases, individually and as a class of disease, in the…

Scientific Questions Being Studied

We now have 4 goals in our research. This workspace is for goals 1 through 3. We have created a new workspace for Goal #4.

1. Determine prevalence of autoimmune diseases, individually and as a class of disease, in the US.

2. Determine comorbidity of autoimmune diseases, including statistics on comorbidity of other autoimmune diseases and non-autoimmune diseases for each autoimmune disease.

3. Determine the impact of COVID-19 on the autoimmune and autoinflammatory disease population. This work will be conducted in parallel with work we are doing at University of Southern California under an IRB there.

4. Explore the genomic component of autoimmune diseases, particularly among patients with more than one autoimmune disease, so that the underlying mechanisms of disease among these diseases can be better understood.

Project Purpose(s)

  • Disease Focused Research (Autoimmune diseases)
  • Population Health
  • Ancestry

Scientific Approaches

We will create three data sets for analysis:

1. A list of diseases rated in the following ways:

a. Evidence Class
i. Strong evidence it is autoimmune
ii. Moderate evidence it is autoimmune
iii. Weak evidence for autoimmunity
iv. A comorbidity of autoimmune disease
v. Symptom or symptom set with no known mechanism

b. Autoinflammatory versus autoimmune flag

c. “Not always autoimmune” flag – to indicate diseases that could have alternative mechanisms of cause

2. A list of patients, anonymized, with socioeconomic, geographic and other data that would be of interest to patients and public health officials to understand which communities are affected by these diseases
3. Outcomes data for patients over time assessing quality of life using PROMIS metrics
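The disease rating scheme in item 1 above could be represented as a small record type. The sketch below is only one possible encoding; the class and field names are hypothetical, and the example entry is a placeholder rather than a rating of any real disease.

```python
from dataclasses import dataclass
from enum import Enum


class EvidenceClass(Enum):
    """Evidence classes i through v from the rating scheme."""
    STRONG = "Strong evidence it is autoimmune"
    MODERATE = "Moderate evidence it is autoimmune"
    WEAK = "Weak evidence for autoimmunity"
    COMORBIDITY = "A comorbidity of autoimmune disease"
    SYMPTOM = "Symptom or symptom set with no known mechanism"


@dataclass(frozen=True)
class DiseaseRating:
    name: str
    evidence_class: EvidenceClass
    autoinflammatory: bool       # True = autoinflammatory, False = autoimmune
    not_always_autoimmune: bool  # True if alternative mechanisms are possible


# Placeholder entry for illustration only (not a real classification).
example = DiseaseRating(
    name="Example disease",
    evidence_class=EvidenceClass.MODERATE,
    autoinflammatory=False,
    not_always_autoimmune=True,
)
```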

Anticipated Findings

The current NIH estimate of 23.5 million people with autoimmune disease was a guess by a knowledgeable clinician, but has no scientific support. As a consequence, there are numerous figures in the public sphere and nobody knows which one is correct.

Many reports say autoimmune diseases are on the increase, but since the number is unknown, it is impossible to say whether this is a public health issue or not. Having a methodology that can be used to recompute the number of people with autoimmune disease will help us understand if these reports are true.

Demographic Categories of Interest

This study will not center on underrepresented populations.

Data Set Used

Registered Tier

Research Team

Owner:

Collaborators:

  • Francis Ratsimbazafy - Other, All of Us Program Operational Use
  • Stephen Kocsis - Project Personnel, Mayo Clinic
  • Jun Qian - Other, All of Us Program Operational Use
  • Jeremy Harper - Senior Researcher, Autoimmune Registry
  • Jeffrey Green - Project Personnel, Autoimmune Registry
  • Ingrid He - Project Personnel, Autoimmune Registry
  • Emily Holladay - Project Personnel, Autoimmune Registry
  • Chenchal Subraveti - Other, Vanderbilt University Medical Center
  • Boyd Ingalls - Project Personnel, Autoimmune Registry
  • Adnaan Jhetam - Project Personnel, Autoimmune Registry
  • Alexander Burrows - Research Assistant, Autoimmune Registry
  • Jagannadha Avasarala - Other, University of Kentucky
Request a Review of this Research Project

You can request that the All of Us Resource Access Board (RAB) review a research purpose description if you have concerns that this research project may stigmatize All of Us participants or violate the Data User Code of Conduct in some other way. To request a review, you must fill in a form, which you can access by selecting ‘request a review’ below.