What resources are available for researchers interested in survey data?

Participants in the All of Us Research Program respond to surveys spanning a variety of topics, including demographics, health care, and lifestyle. Each survey has been tested for readability and accessibility through cognitive interviews and quantitative testing. This testing process included populations from different educational backgrounds and geographic locations to capture a sample reflective of the U.S. population. You can preview the survey questions on the Survey Explorer. Previewing the available questions can help you prepare your research questions and approach. The All of Us Researcher Workbench provides researchers with a variety of supportive materials for conducting survey research with the All of Us dataset.

  • The All of Us Research Program is very careful to protect the privacy of our participants. We follow privacy and data security rules to ensure the protection of participant data. This includes removing all personally identifying information (PII) from participant records as well as withholding and/or generalizing data that might be considered “at risk” for participant re-identification. Because these methods affect what data are available for analysis, we provide all registered researchers with multiple sources of documentation on our participant privacy protection methodology. These documents outline the data removals, transformations, and generalizations that were made. This information can be found within our “Documentation” category in the User Support Hub (under “Resources for Survey Data Research”).
  • Within the Researcher Workbench, three resources are available to help you search for and understand variables of interest: survey codebooks (PDFs), links to the All of Us Registered Tier CDR Data Dictionary (online spreadsheet), and Athena (a searchable database). Athena links survey questions and answers to their corresponding source as well as “standard concept IDs.”

For example, suppose you want to include data on how often the participants in your cohort smoke cigarettes. The “Lifestyle Survey” includes questions about participants’ cigarette smoking habits (e.g., “Do you now smoke cigarettes every day, some days, or not at all?”). If you are interested in including only those participants who smoke every day, you can look up the concept ID (SmokeFrequency_EveryDay) and the standard concept ID (45881677) for that specific answer in our survey codebook. Then, when you are ready to analyze your data, you can make sure to extract the records that include that concept ID. You could also log in to Athena and search for that information by typing the concept ID in the search bar (make sure to check “PPI” under the vocabulary drop-down menu). Athena provides the concept ID as well as additional contextual information that you might find useful (e.g., it will show that the concept ID is the answer to the question “Do you now smoke cigarettes every day, some days, or not at all?”, which falls under the parent code of “Smoking Frequency”).

  • The All of Us Survey Codebook and Frequency Distribution Guide is a featured workspace in the Researcher Workbench that all users can access immediately. This workspace provides detailed instructions on how to extract survey data from the All of Us data repository and visualize the data in both tables and graphs. Researchers can copy this workspace and use it as a template.
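
To make the cigarette smoking example more concrete, here is a minimal Python sketch of the filtering step, followed by a simple frequency count. It is an illustration under stated assumptions, not the code from the featured workspace. The field names (person_id, answer_concept_id, answer), the helper functions, and the toy rows are hypothetical, and the negative concept IDs are placeholders; only the standard concept ID 45881677 comes from the example above. In the Researcher Workbench you would apply the same idea to the survey data you have extracted for your cohort.

```python
from collections import Counter

# Standard concept ID for the "every day" answer, as quoted in the example above.
EVERY_DAY_CONCEPT_ID = 45881677


def select_by_answer_concept(rows, answer_concept_id):
    """Keep only the survey rows whose answer matches the given concept ID."""
    return [row for row in rows if row["answer_concept_id"] == answer_concept_id]


def answer_frequencies(rows):
    """Count distinct participants for each answer text."""
    # De-duplicate (participant, answer) pairs so repeated rows are counted once.
    seen = {(row["person_id"], row["answer"]) for row in rows}
    return Counter(answer for _, answer in seen)


# Toy rows standing in for extracted survey data (hypothetical field names).
# The negative concept IDs are placeholders, not real concept IDs.
toy_rows = [
    {"person_id": 1, "answer_concept_id": EVERY_DAY_CONCEPT_ID, "answer": "Every day"},
    {"person_id": 2, "answer_concept_id": -1, "answer": "Some days"},
    {"person_id": 3, "answer_concept_id": EVERY_DAY_CONCEPT_ID, "answer": "Every day"},
    {"person_id": 4, "answer_concept_id": -2, "answer": "Not at all"},
]

# Participants who gave the "every day" answer.
daily_smokers = select_by_answer_concept(toy_rows, EVERY_DAY_CONCEPT_ID)
daily_ids = sorted(row["person_id"] for row in daily_smokers)

# Frequency distribution of all answers in the toy data.
frequencies = answer_frequencies(toy_rows)
```

Filtering on the standard concept ID rather than on the answer text avoids mismatches caused by small differences in wording or capitalization. Before relying on any ID or field name, confirm it in the survey codebook or in Athena.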