Comparing two large data repositories to understand the differences in demographics, health history, and behavioral attributes in populations

Front Oral Health. 2024 Dec 4:5:1427109. doi: 10.3389/froh.2024.1427109. eCollection 2024.

Abstract

Introduction: This study compared two large data repositories: the All of Us (AoU) medical data repository and the BigMouth dental data repository.

Methods: The comparison included variables related to behavioral and systemic health, health literacy, and overall health status across race, ethnicity, and gender. The analytic approach used descriptive statistics, Chi-square tests, odds ratios, and 95% confidence intervals; effect sizes for significant comparisons were measured with Cohen's d.
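
The statistics named in the Methods can be illustrated for a single 2x2 comparison (one binary attribute, two repositories). The following is a minimal sketch, not the authors' analysis code: the function name `two_by_two_summary` and the counts are hypothetical, the confidence interval is the usual Wald interval on the log odds ratio, and the conversion from odds ratio to Cohen's d (ln(OR) × √3 / π) is one common convention that the abstract does not confirm was used.

```python
import math


def two_by_two_summary(a, b, c, d):
    """Summarize a 2x2 table.

    a, b: counts with / without the attribute in repository 1
    c, d: counts with / without the attribute in repository 2
    All four counts must be positive for the odds ratio to be defined.
    """
    n = a + b + c + d
    # Pearson chi-square statistic (1 degree of freedom, no continuity correction)
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail p-value for chi-square with 1 df, via the normal distribution
    p_value = math.erfc(math.sqrt(chi2 / 2))
    # Odds ratio and Wald 95% confidence interval on the log scale
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci_low = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    ci_high = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    # Assumed conversion from odds ratio to Cohen's d (logit method)
    cohens_d = math.log(odds_ratio) * math.sqrt(3) / math.pi
    return {
        "chi2": chi2,
        "p_value": p_value,
        "odds_ratio": odds_ratio,
        "ci_95": (ci_low, ci_high),
        "cohens_d": cohens_d,
    }


# Hypothetical counts for illustration only (not taken from AoU or BigMouth):
# 80 of 100 participants report the attribute in repository 1, 20 of 100 in repository 2.
result = two_by_two_summary(80, 20, 20, 80)
print(result)
```

With real data, the four counts would come from cross-tabulating one attribute (for example, alcohol use) by repository within a single demographic group.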

Results: In the AoU dataset, 80.6% of Hispanic or Latino participants reported alcohol use compared to 16.8% in the BigMouth data repository. Among females, 87.9% in AoU reported alcohol use, compared to 26.0% in BigMouth. Additionally, the diabetes prevalence among females was 8.8% in AoU vs. 21.6% in BigMouth. Differences in health literacy were observed, with 49.2% among Hispanic or Latino participants in AoU, in contrast to BigMouth's 3.2%. Despite this, 70.1% of Hispanic or Latino respondents in AoU reported satisfactory health status, while BigMouth indicated a much higher figure at 98.3%.

Discussion: These variations highlight the importance of targeted health interventions addressing racial/ethnic and gender influences. Differences may arise from recruitment approaches, participant demographics, and healthcare access. There is a need for collaboration, standardized data collection, and inclusive recruitment to remedy these discrepancies. Further research is imperative to understand the underlying causes, facilitate interventions that address the disparities, and advocate for a more inclusive healthcare system.

Keywords: behavioral health; big data; electronic health record; health literacy; systemic health.

Grants and funding

The author(s) declare financial support was received for the research, authorship, and/or publication of this article. This project was supported by the National Institutes of Health–National Institute of Dental and Craniofacial Research (NIH-NIDCR), grant NIDCR 1R03DE029809.