Public Data Sets

Public Use Data Sets

The following data sets are listed as possible sources of thesis data for MPH Epidemiology and Biostatistics students.

BRFSS (Behavioral Risk Factor Surveillance System)
1984-1995 BRFSS Survey Data
1996-1997 BRFSS Survey Data
1998 BRFSS Survey Data

  • Nationwide cross-sectional telephone survey of health behaviors and their relation to illness

Dartmouth Atlas of Health Care:

  • Focuses on geographic variation in the use of Medicare resources across the country. May not offer enough accessible data for a thesis.

DRC (Data Resource Center for Child and Adolescent Health)
2001 National Survey of Children with Special Health Care Needs (NS-CSHCN)
2003 National Survey of Children's Health (NSCH)

FARS - The Fatality Analysis Reporting System

Drawn from a nationwide census of Police Accident Reports (PARs) from all fatal traffic accidents in the U.S.

Web Interface:


FIPP is a statewide registry of postpartum women reporting incontinence and/or pelvic support disorders after childbirth. The registry is funded by the NICHD, has been ongoing since 2000, and maintains over 20,000 women in the main registry, with ongoing quality-of-life surveys and clinical examinations for specified subsets. It is the largest population-based cohort on childbirth-related incontinence in the nation. Sponsored by the NIH, the study is a four-phase statewide examination of the risk factors for the onset, progression, and/or remission of postpartum fecal incontinence. The team has a long history of success in maintaining registries and in combining the clinical studies, surveys, and epidemiological data required for the proposed research.

Jeanne-Marie Guise, MD, MPH

Metropolitan Atlanta Developmental Disabilities Surveillance Program (MADDSP), CDC:

  • Survey of medical and school records to describe the prevalence of mental retardation, cerebral palsy, hearing/vision loss, epilepsy, and autism in children

National Automotive Sampling System (NASS), US Department of Transportation:

  • Extensive data about crash scenarios, including injuries sustained, damage to car, vehicle speed, etc.

National Survey of American Families
See the survey's website for data sets and recent reports using these data.

National Survey of Children with Special Health Care Needs (NS-CSHCN), Maternal and Child Health Bureau of the US DHHS:

  • Survey of child needs, functionality, access to care, unmet need, and more

Nationwide Inpatient Sample (NIS), the Agency for Healthcare Research and Quality:

  • Includes extensive data from inpatient discharges: diagnoses, procedures, length of stay, cost, and more.

National Health and Nutrition Examination Survey (NHANES), CDC:

  • Cross-sectional, nationwide survey including interview and health examination data (i.e., clinical data such as blood pressure and cholesterol). Also includes a longitudinal subset ("NHANES I Epidemiologic Followup Study").

Nurses' Health Study, Harvard:

  • Longitudinal questionnaire data on reproductive history, screening history, disease outcome, and more

OECD (Organisation for Economic Co-operation and Development)

Pregnancy Risk Assessment Monitoring System (PRAMS), CDC with state health departments:

  • Cross-sectional survey of mothers of recent live births, gathering pre- through post-partum data on risk factors, experiences, infant health, etc.

The State Obstetric and Pediatric Research Collaborative (STORC), OHSU:

  • Includes demographic and clinical characteristics of women who deliver at OHSU and their clinical findings and infant outcomes

Surveillance, Epidemiology, and End Results (SEER), National Cancer Institute:

  • US data on cancer incidence, survival, type/staging, and more

STORC (State Obstetric and Pediatric Research Collaborative)
The State Obstetric and Pediatric Research Collaborative (STORC) is a consortium of investigators from different disciplines (such as epidemiology, medical informatics, basic science, obstetrics, nursing, clinical research, and pediatrics) who focus their research on improving birth outcomes and improving the safety of childbirth for mother and child. Led by Jeanne-Marie Guise, MD, MPH, this diverse team works together to develop and synthesize sound scientific evidence and develop innovative tools to apply the best evidence in everyday clinical care.

Dr. Guise and her group developed e-STORC, a comprehensive electronic obstetric record used in all outpatient and inpatient OB sites at OHSU. It is designed to capture clinical data for women and their infants who receive prenatal care at OHSU satellite clinics and for women evaluated and admitted to OHSU for labor and delivery. This data system is all-inclusive of prenatal, perinatal, and antenatal patient encounters at OHSU and satellites. At the core of STORC is a custom-written software package, e-STORC, that allows direct data entry by health care providers at the point of care. Data collection begins at prenatal clinic visits at both OHSU and non-OHSU campus sites, continues through patient triage and admission to OHSU for Labor and Delivery, and follows the mother and baby through discharge and follow-up care. The collected data can be viewed online within the application in "form" or "summary" modes and can be printed from the computer desktop, so that the data may be placed in the paper OHSU medical record. The graphical user interface, customized to capture obstetric data in a user-friendly fashion at the point of care, is combined with an MS SQL database back end that manages and stores the patient data. Data entry for e-STORC began at OHSU in March 2005, and the system contains data for approximately 2,500 deliveries each year.

The STORC database contains more than 3,460 variables describing demographic and clinical characteristics of women who deliver at OHSU, and more than 200 data fields describing clinical findings and outcomes of the infant.

Jeanne-Marie Guise, MD, MPH

YRBS (Youth Risk Behavior Survey)
Youth '97:
1990 through 1997 National YRBS
1995, 1997 Oregon YRBS

Virginia Commonwealth University's Dataset Compilation
R.M. Jones, MPH, PhD has compiled a list of well-established datasets for thesis work.

CDs Currently Available in the Department:
1984-95 BRFSS Survey Data
1997 Natality Data Set (Issued July 1999)
1997 National Ambulatory Medical Care Survey (Issued July 1999)
1995 National Health Interview Survey (Issued September 1998): ASCII Version
1994 National Health Interview Survey on Disability Phase I (Issued September 1996)
1994 National Health Interview Survey on Disability Phase I and II (Issued July 1998): ASCII Version
National Health and Nutrition Examination Survey III 1988-94 (Revised October 1997)
National Health and Nutrition Examination Survey III 1988-94 (Issued April 1998): ASCII Version
NHANES III Reference Manuals and Reports (Issued October 1996): No data. Includes .PDF files.
1996 National Home and Hospice Care Survey (Issued June 1999)
1996 National Hospital Ambulatory Medical Care Survey (Issued August 1998)

Data Sets Available Outside the Department
The University of Michigan has a comprehensive collection of 50-60 databases on its website.
NAACCR (North American Association of Central Cancer Registries)
AHRQ (formerly AHCPR) Hospital Patient Survey
Medical Expenditure Panel Survey
HCFA (Health Care Financing Administration)

Faculty Interests and Database Availability

The following information has been compiled to assist Epidemiology and Biostatistics track students in finding a thesis topic and a thesis advisor. Additional faculty may become available and currently listed faculty may obtain students or no longer be available after this list is published.

Mathew Carlson, PhD
**David Mosen, PhD
Data sets: CareOregon data
Topics: Racial disparities in access, quality of care & outcomes; patterns in utilization of care & meds.; quality of care; smoking prevention; receipt of substance abuse treatment; impact of managed care on quality of care and health outcomes

Jeanne-Marie Guise, MD, MPH

Data sets: FIPP (statewide survey of postpartum women), e-STORC (OHSU pregnancy & birth database)
Topics: Maternal and child health, obstetric safety, childbirth decision making, obstetric outcomes

Katrina Hedberg, MD, MPH
Data sets: multiple for Oregon
Topics: Various (will link with other potential advisors)

Ken James, PhD

Topics: Clinical trials

Jodi Lapidus, PhD
Data sets: Disability, DUII, NCHS data, YRBS, BRFSS, SEER
Topics: Various, especially biostat topics

Lynn Marshall, PhD
Data sets: MrOS, OR Hospital Discharge
Topics: Hip fracture rates in OR

Dennis McCarty, PhD
Data sets: Client Process Monitoring System
Topics: Substance abuse

Bentson McFarland, MD, PhD

Data sets: Canadian National Pop. Health Survey, US Healthcare Cost and Utilization Project, Nationwide Inpatient Sample
Topics: Substance abuse, mental health services research, psychiatric epidemiology

Alan Melnick, MD, MPH
Data sets: Clackamas County - birth data, communicable disease data, STD, immunizations, cancer, WIC, primary care clinic, Medicaid
Topics: Various

Cynthia Morris, PhD

Topics: Maternal and child health, nutritional epidemiology, cardiovascular disease, clinical epidemiology

Gary Rischitelli, MD, JD, MPH
Data sets: Oregon Hospital Discharge data, Workers' Comp. Claims data and medical payment data
Topics: Environmental epidemiology, occupational epidemiology

Ken Rosenberg, MD, MPH, State MCH epidemiologist
Data sets: Oregon PRAMS - ongoing survey of Oregon women 3 months after a live birth; oversamples minority women, about 1800 women per year
Topics: Breastfeeding, prenatal care, domestic violence, alcohol and tobacco use, HIV testing. For more information see web site:

Jackilen Shannon, PhD
503-220-8262 x57285
Data sets: Diet and cervical cancer (Thailand), diet and breast cancer (Shanghai), environ. & health conditions in Texas
Topics: Nutritional epidemiology, diet and gene interactions in cancer prevention

*Michael Skeels, PhD, MPH
Data sets: Some state data
Topics: Newborn metabolic disorders, efficacy of screening for some new disorders, communicable disease

Atif Zaman, MD, MPH
Data sets: Chronic liver disease, surveillance, hepatitis C registry study
Topics: Liver disease, hepatitis C

*Willing to serve on committees and supervise internships but not available to chair theses.
**Not able to serve on committees but able to supervise internships.