A Longitudinal Big Data Approach for Precision Health

08 Jan 2026 · RNA

Authors

Sophia Miryam Schüssler-Fiorenza Rose1,2,3*, Kévin Contrepois1*, Kegan J. Moneghetti4,5,6, Wenyu Zhou1, Tejaswini Mishra1, Samson Mataraso7,8, Orit Dagan Rosenfeld1, Ariel B. Ganz1, Jessilyn Dunn1,9, Daniel Hornburg1, Shannon Rego1, Dalia Perelman1, Sara Ahadi1, M. Reza Sailani1, Yanjiao Zhou10,11, Shana R. Leopold10, Jieming Chen12, Melanie Ashland1, Jeffrey W. Christle4,5, Monika Avina1, Pats Limcaoco1, Camilo Ruiz13, Marilyn Tan14, Atul J. Butte12, George M. Weinstock10, George M. Slavich15, Erica Sodergren10, Tracey L. McLaughlin14, Francois Haddad4,5**, Michael P. Snyder1,4**

* These authors contributed equally.    ** Corresponding authors.

Affiliations

  • 1 Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
  • 2 Spinal Cord Injury Service, Veteran Affairs Palo Alto Health Care System, Palo Alto, CA 94304, USA
  • 3 Department of Neurosurgery, Stanford University School of Medicine, Stanford, CA 94305, USA
  • 4 Stanford Cardiovascular Institute, Stanford University, Stanford, CA 94305, USA
  • 5 Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA 94305, USA
  • 6 Department of Medicine, St Vincent's Hospital, University of Melbourne, Melbourne, Australia
  • 7 Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, CA 94720, USA
  • 8 Department of Bioengineering, University of California, Berkeley, CA 94720, USA
  • 9 Mobilize Center, Stanford University, Stanford, CA 94305, USA
  • 10 The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
  • 11 University of Connecticut Health, Farmington, CT 06030, USA
  • 12 UCSF Bakar Computational Health Sciences Institute, San Francisco, CA 94143, USA
  • 13 Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
  • 14 Division of Endocrinology, Stanford University School of Medicine, Stanford, CA 94305, USA
  • 15 Cousins Center for Psychoneuroimmunology, UCLA, Los Angeles, CA 90095, USA

Correspondence: mpsnyder@stanford.edu or fhaddad@stanford.edu

Author Contributions

S.M.S.-F.R., M.P.S., F.H., K.C., K.M., T.M., and W.Z. contributed to conceptualization. S.M.S.-F.R., K.C., F.H., M.P.S., T.M., K.M., S.M., W.Z., and S.R. contributed to methodology. K.C. (ASCVD biomarkers), D.H. (lipidomics), A.B.G. (microbiome DADA2 processing), T.M., M.A., and W.Z. (OGTT c-peptide and insulin) contributed to omics generation and processing.

S.M.S.-F.R., K.C., T.M., W.Z., J.D., M.A., J.W.C., E.S., and P.L. contributed to data curation. K.C., S.M.S.-F.R., T.M., K.M., F.H., and M.P.S. contributed to visualization. S.M.S.-F.R., K.C., T.M., S.M., K.M., O.D.-R., S.R., J.C., and C.R. contributed to formal analysis.

S.M.S.-F.R., K.C., and M.P.S. contributed to project administration. M.P.S. and F.H. contributed to supervision. S.M.S.-F.R., F.H., K.C., K.M., and M.P.S. contributed to writing the original draft. All authors contributed to review and editing.

Competing Interests

M.P.S. is a cofounder of Personalis, SensOmics, January, Filtricine, Qbio, and Akna, and an inventor on provisional patent number 62/814,746 related to glycemic dysregulation. S.M.S.-F.R., K.C., W.Z., T.M., and S.M. are also listed as inventors.

A.J.B. reports grants, personal fees, and non-financial support from multiple academic, industry, and healthcare organizations during the conduct of the study. Stanford University receives royalties on licensed intellectual property.

Abstract

Precision health relies on the ability to assess disease risk at an individual level, detect early preclinical conditions, and initiate preventive strategies. Recent advances in omics technologies and wearable monitoring enable deep molecular and physiological profiling, providing powerful tools for precision health.

We investigated deep longitudinal profiling in a prospective cohort of 109 individuals enriched for type 2 diabetes risk. Participants underwent integrative Personalized Omics Profiling (iPOP) with quarterly sampling over a median of 2.8 years, incorporating genomic, immunomic, transcriptomic, proteomic, metabolomic, microbiomic, and wearable data.

We identified over 67 clinically actionable health discoveries and multiple molecular pathways associated with metabolic, cardiovascular, and oncologic processes. Predictive models for insulin resistance demonstrated the potential of omics-based measurements to replace traditional clinical tests.

Participation in the study also led most individuals to adopt positive lifestyle changes. Together, these findings demonstrate that deep longitudinal profiling enables actionable health insights and supports the advancement of precision health.

Introduction

Precision health and medicine are entering a new era where wearable sensors, omics technologies, and computational methods have the potential to improve health and lead to mechanistic discoveries. Emerging technologies such as longitudinal multi-omics profiling combined with clinical measures can comprehensively assess health and identify deviations from healthy baselines, potentially improving disease risk prediction and early detection. Connecting longitudinal multi-omics profiling with clinical assessment is also important in developing a new taxonomy of disease based on molecular measures.

Despite this promise, few studies have leveraged emerging technologies and longitudinal profiling to manage health and identify disease markers. Previous efforts included our study of a single individual in which longitudinal multi-omics profiling over 14 months captured the individual’s transition to diabetes on a deep molecular level. A recent study of 108 individuals followed for 9 months using various omic technologies revealed several health-related findings. A cross-sectional study used genome sequencing, metabolomics, and advanced imaging to identify individuals at risk for age-related chronic disease. These studies either had limited sample size, lacked meaningful longitudinal profiling, or performed only limited analysis of health information. We have also demonstrated utility in using wearable devices to detect infections and identify early glucose dysregulation, and population-based studies are underway to potentially detect arrhythmias.

In this study, we longitudinally profiled 109 participants at risk for type 2 diabetes mellitus, performing quarterly clinical laboratory tests and multi-omics assessments. In addition, individuals underwent exercise testing, enhanced cardiovascular imaging and physiological testing, wearable sensor monitoring, and completed various surveys.

The study objectives were threefold. We first evaluated the usefulness of emerging technologies in combination with standard and enhanced clinical tests to detect diseases early. We then characterized multi-omics associations with clinical pathophysiologies including glucose and insulin dysregulation, inflammation, and cardiovascular risk, and evaluated the ability of multi-omics measures to predict insulin resistance and response to glucose load. Lastly, we examined how participation affected health habits.

Results

Summary of Research Design & Cohort

A 109-person cohort enriched for individuals at risk for type 2 diabetes mellitus (DM) underwent quarterly longitudinal profiling for up to eight years (median 2.8 years) using standard and enhanced clinical measures and emerging assays. Emerging tests included molecular profiling of the genome, gene expression (transcriptome), proteins (proteome), immune proteins (immunome), small molecules (metabolome) and gut microbes (microbiome), along with wearable monitoring including continuous glucose monitoring (CGM). The study aimed to capture transitions from normoglycemic to preDM and from preDM to DM, using both standard and enhanced measures such as fasting plasma glucose, glycated hemoglobin, oral glucose tolerance test, insulin secretion assessment, and the modified insulin suppression test (SSPG). Enhanced cardiovascular profiling included vascular ultrasound, echocardiography, cardiopulmonary exercise testing, and cardiovascular disease protein markers.

The study was approved by the Stanford University Institutional Review Board (IRB 23602) and all participants consented. The mean age at enrollment was 53.4 ± 9.2 years. Genetic ancestry analysis indicated individuals mapped to expected ancestral populations. Over the study, 67 major clinically actionable health discoveries spanning metabolism, cardiovascular disease, oncology, hematology, and infectious disease were observed.

Metabolic Health Profiling

At entry, participants self-reported their DM status. Among the 86 participants who did not report preDM or DM, one had a DM diagnosis, one had DM-range HbA1C, and 43 had labs in the preDM range. During the study, eight more individuals converted to DM. Some participants exhibited glucose dysregulation detectable only via CGM. Exome sequencing provided actionable insights, including discovery of a hepatocyte nuclear factor 1A (HNF1A) mutation (MODY) affecting medication and family testing decisions.

Enhanced metabolic profiling highlighted heterogeneity in DM pathophysiology. FPG, HbA1C, and OGTT measurements were often discordant, and insulin sensitivity assessment indicated that 55% of participants were insulin resistant. OGTT-based insulin secretion analysis identified four clusters: early, intermediate, late, and very late. Multi-omics correlations revealed associations between the disposition index, leptin, GM-CSF, body mass index, inflammation markers, and lipid metabolism.

Longitudinal trajectories of HbA1C demonstrated transitions between normal, preDM, and DM ranges. Individual trajectory analysis revealed multiple pathways to diabetes, influenced by weight gain, microbiome diversity, and delayed insulin secretion. Multi-omics models predicted SSPG and OGTT with higher accuracy than clinical measures alone, illustrating the value of multi-omics data.

Other metabolic disorders included elevated liver enzymes (ALT), microalbuminuria, and macroalbuminuria. Screening suggested nonalcoholic fatty liver disease in a majority of participants. Multi-omics and clinical measures identified subtle hepatic abnormalities not always captured by standard labs.

Cardiovascular Health Profiling

Atherosclerotic cardiovascular disease (ASCVD) risk was assessed in all participants. Enhanced cardiovascular profiling for 43 participants included vascular ultrasound, echocardiography, and biomarkers of oxidative stress, inflammation, immune regulation, myocardial injury, and stress.

At study entry, 24 participants had ASCVD risk scores ≥ 7.5%. Wearable heart rate monitoring identified arrhythmias, leading to diagnoses of sleep apnea and atrial fibrillation. Subclinical atherosclerosis and reduced exercise capacity were observed, resulting in clinical recommendations. Five participants experienced cardiovascular events during the study, with pharmacogenomic variants influencing therapy response. Multi-omics correlations revealed interactions between ASCVD risk, inflammatory cytokines, lipid measures, and exercise capacity, highlighting opportunities for personalized risk stratification.

Oncological, Hematological & Immune Profiling

Exome sequencing revealed actionable variants associated with cancer risk (APC, SDHB, BRCA1, MUTYH, CHEK2) and hematologic disorders (PROS1). Early detection enabled interventions, such as thyroid-preserving surgery for papillary thyroid cancer. B-cell lymphoma was detected via imaging and longitudinal molecular outlier analysis, showing early cytokine (MIG) elevation as a potential biomarker.

Hematologic profiling revealed undiagnosed anemia, alpha thalassemia trait, MGUS, and smoldering myeloma. Wearable monitoring identified temperature and heart rate abnormalities linked to inflammation, leading to a Lyme disease diagnosis. These findings illustrate the value of multi-modal data for detecting hematologic, immune, and infectious conditions.

Effect of iPOP Participation on Participants

Participation influenced behavior by encouraging risk-based screening, facilitating diagnoses, informing therapeutic choices, and increasing awareness of diet and exercise. Eighty-two percent reported changes in diet and/or exercise, and nearly half improved other health behaviors including sleep, stress reduction, and self-monitoring. Most participants discussed results with family and physicians, leading to follow-up tests and additional screening. CGM and SSPG monitoring motivated positive lifestyle modifications and informed decision-making regarding diet and activity.

Discussion

Precision Health Insights

Our study demonstrates that combining untargeted multi-omics and physiological longitudinal profiling with targeted metabolic and cardiovascular assessments leads to actionable health discoveries and physiological insights. Targeted profiling connected longitudinal glucose metabolism with multi-omics, facilitating the precision medicine goal of defining diseases based on molecular mechanisms and pathophysiology. Untargeted, longitudinal big data enabled discoveries across cardiology, oncology, hematology, and infectious disease, showing broad profiling can detect disease in multiple domains.

Impact on Participants & Personalized Medicine

Over half of participants were informed of their preDM, DM, dyslipidemia, and hypertension status, prompting lifestyle changes such as diet and physical activity improvements. Enhanced clinical assays, including OGTT, beta-cell function assessment, insulin resistance, and CGM combined with standard tests, improved characterization of preDM and DM. Individual mechanisms of glucose dysregulation were identified, which has implications for personalized treatment. Multi-omics data improved prediction of SSPG compared to standard measures, supporting their use for molecular disease taxonomy and for replacing expensive insulin resistance tests with a simple blood test.

Microbiome diversity was inversely associated with SSPG, highlighting the link between gut microbes and insulin resistance. Exome sequencing revealed actionable findings, including a MODY mutation, RBM20 mutation associated with dilated cardiomyopathy, and pharmacogenomic variants affecting treatment decisions. Two participants experienced vascular events that could have been influenced by unrecognized pharmacogenomic risks.

Role of Imaging and Wearables

Imaging enabled early detection of systemic disease, including dilated cardiomyopathy, atherosclerotic disease, and asymptomatic lymphoma. Wearable sensors identified atrial fibrillation, sleep apnea, and Lyme disease, highlighting their transformative potential in precision health. CGM provided opportunities for diabetes prevention by detecting unrecognized glucose dysregulation and guiding personalized dietary responses.

Multi-Omics Insights into Cardiovascular Risk

Multi-omics analysis highlighted systemic inflammation as a key contributor to ASCVD risk. All five participants with incident cardiovascular events had subclinical inflammation. Correlation network analysis revealed roles for monocytes, HGF, IL-2, MCP-3, and interferon-gamma cytokines including MIG and IP10, illustrating molecular connections to cardiovascular health and emerging risk markers.

Outlier Analysis & Early Detection

Untargeted longitudinal outlier analysis prior to lymphoma diagnosis illustrated the power of multi-omics to identify early biomarkers and pathway changes. Elevation of MIG and shifts in the microbiome were detected up to one year before diagnosis. Similar analyses identified MGUS, where early detection can improve outcomes. While omics outliers can reveal important health signals, interpretation remains challenging in some cases, though no undue anxiety or overtesting was observed.

Methodological Considerations

Our cohort comprised highly educated volunteers, introducing self-selection bias, which may affect behavioral change generalizability but is less likely to impact biological associations. The study was ethnically diverse relative to other longitudinal multi-omics studies. Intensive molecular and physiological phenotyping demonstrated that small, longitudinal cohorts can generate important health and discovery insights. Personalized testing programs based on disease risk and marker trajectories could optimize healthcare value in the future.

Data Availability

Raw omics data (transcriptome, immunome, proteome, metabolome, microbiome) are hosted on the NIH Human Microbiome 2 project site (https://portal.hmpdacc.org/) under the T2D project along with clinical laboratory data through 2016. Data from participants without consent for public release are available on dbGAP (accession phs001719.v1.p1). Additional data unique to this manuscript are provided in supplemental data files.

Online Methods

Participant Consent and Accrual

Participants were recruited from the Stanford University surrounding community with an emphasis on individuals at risk for Type 2 diabetes. Enrollment was part of Stanford’s iPOP (Integrated Personal Omics Profiling) research study (IRB 23602), a longitudinal multi-omics study of adult volunteers enriched for pre-diabetes. No monetary compensation was provided. The study is part of the NIH integrated Human Microbiome Project (iHMP).

Design, Setting, and Participants

iPOP is a longitudinal prospective cohort study of 109 individuals (Extended Data Figure S1a). Inclusion criteria: age 25–75 years, BMI 25–40 kg/m², 2-hour OGTT < 200 mg/dL. Exclusion criteria included active eating disorder, hypertriglyceridemia > 400 mg/dL, uncontrolled hypertension, heavy alcohol use, pregnancy/lactation, prior bariatric surgery, or active psychiatric disease. Later, the study expanded to include participants with diabetes and normal BMI. Median participation duration was 2.8 years, with standard and enhanced clinical data available through June 2018. Most analyses used healthy time points only.

Measurements

All blood samples were collected after an overnight fast for standard and enhanced clinical tests. Standard tests: FPG, HbA1C, fasted insulin, lipid panel, metabolic panel, CBC with differential. Enhanced tests: OGTT, SSPG, beta-cell function, hsCRP, IgM, cardiovascular imaging (echocardiography, vascular ultrasound), cardiopulmonary exercise, CVD markers, wearable devices, and continuous glucose monitoring (CGM). Multi-omics profiling included genome, transcriptome, immunome, proteome, metabolome, lipidome, and microbiome.

Modified Insulin Suppression Test

Sixty-nine participants underwent a 180-minute insulin suppression test to determine SSPG after an overnight fast, with blood draws at minutes 150, 160, 170, and 180. Reasons for non-participation included medical contraindications (n=9), refusal (n=5), dropouts (n=11), or tests pending (n=15).
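As an illustration of how a steady-state value could be derived from the four late blood draws, the following minimal Python sketch averages the glucose readings taken at minutes 150 to 180. Averaging these draws is the conventional definition of SSPG and is an assumption here, since the text above does not state the formula; the function name and the example readings are hypothetical.

```python
from statistics import mean

# Sampling times (minutes) of the steady-state blood draws named in the text.
STEADY_STATE_MINUTES = (150, 160, 170, 180)


def steady_state_plasma_glucose(glucose_by_minute):
    """Return SSPG (mg/dL) as the mean glucose over the steady-state draws.

    Assumption: SSPG is the arithmetic mean of the four late readings.
    """
    missing = [m for m in STEADY_STATE_MINUTES if m not in glucose_by_minute]
    if missing:
        raise ValueError(f"missing steady-state draws at minutes: {missing}")
    return mean(glucose_by_minute[m] for m in STEADY_STATE_MINUTES)


# Hypothetical readings for one participant (mg/dL), keyed by minute.
example_draws = {150: 148.0, 160: 152.0, 170: 150.0, 180: 154.0}
sspg = steady_state_plasma_glucose(example_draws)
```

Raising on a missing draw, rather than averaging whatever is present, keeps incomplete tests from silently producing a value.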

Multi-omics Measures

Genomics: Whole Exome Sequencing (n=88) analyzed using the HugeSeq pipeline, with pathogenic variants assessed per ACMG guidelines.
RNA Sequencing: PBMC RNA-seq performed with Illumina TruSeq and HiSeq 2000, aligned with TopHat, quantified with HTSeq and DESeq2.
Proteomics: Plasma SWATH-MS using NanoLC 425 and TripleTOF 6600, top3 peptide quantification, batch correction via Perseus.
Immune Proteins: 62-plex Luminex assay (Affymetrix) at Stanford HIMC.
Metabolomics: Untargeted plasma LC-MS using RPLC and HILIC separations on Q Exactive instruments, annotated against standards and public databases.
Lipidomics: Lipids extracted from plasma and analyzed on Lipidyzer DMS-QTRAP platform.
Microbiome: 16S rRNA sequencing (V1–V3 regions), clustered into OTUs with Usearch and taxonomically assigned via RDP classifier against Greengenes.

ASCVD Circulating Markers

Millipore immunoassays (HCVD1-4MAG) characterized ASCVD blood markers, performed at the Stanford Human Immune Monitoring Center.

Wearable Physiology and CGM

Participants wore Basis or Fitbit devices. The "Change of Heart" algorithm detected abnormal heart rates relative to individual baselines. CGM used the Dexcom G4 system, recording glucose every 5 minutes for 2–4 weeks, with finger-stick calibration.

Echocardiography and Vascular Ultrasound

Baseline and post-stress echocardiography used Philips iE33 systems, with LVEF, LV GLS, and tissue Doppler imaging metrics calculated per guidelines. Vascular ultrasound assessed carotid and femoral arteries and central pulse wave velocity.

Cardiopulmonary Exercise Testing

Symptom-limited treadmill CPX with breath-by-breath analysis measured VO2, VCO2, VE, and RER. VE/VCO2 slope was calculated using linear regression, and percent predicted VO2 derived from FRIEND registry equations.
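The VE/VCO2 slope described above is an ordinary least-squares slope of ventilation regressed on CO2 output. A minimal sketch follows, assuming paired VE and VCO2 values in L/min; the function name and the example series are hypothetical, and any averaging or exclusion of breaths is not shown.

```python
def linear_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values are all identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Hypothetical exercise data (L/min), constructed so that VE = 28 * VCO2 + 4.
vco2 = [0.5, 1.0, 1.5, 2.0, 2.5]
ve = [18.0, 32.0, 46.0, 60.0, 74.0]
ve_vco2_slope = linear_slope(vco2, ve)
```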

iPOP Participant Surveys

Surveys captured changes in health behaviors, understanding of personal health, medical follow-ups, and data sharing. Surveys were initially anonymous, later identified by participant ID.

Calculation of Insulin Secretion Rate and Disposition Index

ISR was calculated via deconvolution of C-peptide during OGTT, reported in pmol/kg/min every 15 minutes. Disposition Index (DI) = ISR30 × Matsuda index.
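To make the DI formula concrete, here is a minimal Python sketch. It assumes the standard Matsuda index definition, 10,000 / sqrt(fasting glucose × fasting insulin × mean OGTT glucose × mean OGTT insulin), with glucose in mg/dL and insulin in µU/mL. The ISR at 30 minutes is taken as an input because the C-peptide deconvolution is not reproduced here; function names and example values are hypothetical.

```python
from math import sqrt
from statistics import mean


def matsuda_index(glucose_mg_dl, insulin_uu_ml):
    """Matsuda insulin sensitivity index from OGTT series (fasting value first)."""
    if len(glucose_mg_dl) != len(insulin_uu_ml) or not glucose_mg_dl:
        raise ValueError("glucose and insulin series must be non-empty and aligned")
    fasting_glucose = glucose_mg_dl[0]
    fasting_insulin = insulin_uu_ml[0]
    product = (fasting_glucose * fasting_insulin
               * mean(glucose_mg_dl) * mean(insulin_uu_ml))
    return 10000.0 / sqrt(product)


def disposition_index(isr30, matsuda):
    """DI = ISR at 30 minutes multiplied by the Matsuda index, as in the text."""
    return isr30 * matsuda


# Hypothetical OGTT series for one participant (fasting sample first).
glucose = [100.0, 120.0, 100.0, 80.0]   # mg/dL, mean 100
insulin = [4.0, 40.0, 36.0, 20.0]       # µU/mL, mean 25
matsuda = matsuda_index(glucose, insulin)
di = disposition_index(isr30=5.0, matsuda=matsuda)  # isr30 in pmol/kg/min
```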

Multi-omics Feature Analysis and Modeling

K-means clustering (k=4) associated OGTT insulin secretion rates with multi-omics analytes. Spearman correlations identified associations between adjusted ASCVD risk and analytes. Linear mixed models analyzed FPG, HbA1C, hsCRP over time, controlling for age and sex. Multi-omics outliers were defined as Z-scores >95th percentile.
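The outlier rule can be sketched as follows. This is one possible reading, not the authors' code: Z-scores are computed for a single analyte across samples, and values above the 95th percentile of those Z-scores are flagged. Whether the percentile is taken per analyte, per participant, or on absolute Z-scores is not specified above, so the per-analyte, one-sided choice is an assumption, as are the function names.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize values to zero mean and unit sample standard deviation."""
    mu = mean(values)
    sd = stdev(values)
    if sd == 0:
        raise ValueError("cannot standardize a constant series")
    return [(v - mu) / sd for v in values]


def percentile(sorted_values, q):
    """Percentile by linear interpolation between closest ranks (q in 0..100)."""
    position = (len(sorted_values) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


def flag_outliers(values, q=95.0):
    """Return one boolean per value: True if its Z-score exceeds the q-th percentile."""
    z = z_scores(values)
    cutoff = percentile(sorted(z), q)
    return [zi > cutoff for zi in z]


# Hypothetical analyte levels: 20 ordinary samples and one extreme sample.
levels = [float(v) for v in range(1, 21)] + [100.0]
flags = flag_outliers(levels)
```

A percentile cutoff flags a fixed fraction of samples by construction, so flagged values are candidates for review rather than confirmed abnormalities.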

Microbiome Diversity Modeling

Shannon diversity (H') was modeled in univariate and multivariate SAS PROC MIXED models with repeated measures. Individual trajectories were modeled with generalized additive models (PROC GAM).
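For reference, Shannon diversity is H' = -Σ p_i ln p_i, where p_i is the relative abundance of taxon i. The sketch below computes it from raw counts; the natural logarithm is assumed (some tools use log base 2), and the function name and example counts are hypothetical.

```python
from math import log


def shannon_diversity(counts):
    """Shannon index H' = -sum(p_i * ln(p_i)) over taxa with non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * log(p) for p in proportions)


# Hypothetical taxon counts for two samples.
even_sample = [25, 25, 25, 25]      # four equally abundant taxa
dominated_sample = [97, 1, 1, 1]    # one dominant taxon
h_even = shannon_diversity(even_sample)
h_dominated = shannon_diversity(dominated_sample)
```

An even community of four taxa reaches the maximum ln(4), while a community dominated by one taxon scores much lower.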

SSPG and OGTT Prediction Models

Microbiome 16S data were reprocessed using QIIME2 and DADA2. Features were standardized and selected via the MMPC algorithm in the MXM R package. Ridge regression with leave-one-out cross-validation predicted SSPG/OGTT values and assessed model performance (MSE, R²).
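The evaluation loop can be illustrated with a small, dependency-free sketch of ridge regression under leave-one-out cross-validation. It is not the authors' pipeline: feature standardization and MMPC selection are assumed to have happened upstream, the penalty value is arbitrary, and all names and data are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_ridge(X, y, alpha):
    """Return (intercept, coefficients); data are centered so the intercept is unpenalized."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [yi - y_mean for yi in y]
    gram = [[sum(r[i] * r[j] for r in Xc) + (alpha if i == j else 0.0)
             for j in range(p)] for i in range(p)]
    rhs = [sum(r[i] * t for r, t in zip(Xc, yc)) for i in range(p)]
    coefs = solve(gram, rhs)
    intercept = y_mean - sum(c * mu for c, mu in zip(coefs, x_mean))
    return intercept, coefs


def loocv_ridge(X, y, alpha):
    """Leave-one-out CV: refit without each sample, predict it, return (MSE, R^2, predictions)."""
    preds = []
    for i in range(len(y)):
        intercept, coefs = fit_ridge(X[:i] + X[i + 1:], y[:i] + y[i + 1:], alpha)
        preds.append(intercept + sum(c * v for c, v in zip(coefs, X[i])))
    ss_res = sum((p - t) ** 2 for p, t in zip(preds, y))
    y_mean = sum(y) / len(y)
    ss_tot = sum((t - y_mean) ** 2 for t in y)
    return ss_res / len(y), 1.0 - ss_res / ss_tot, preds


# Hypothetical toy data: target = 2*x1 - x2 + 3, standing in for SSPG.
features = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 6.0], [6.0, 5.0]]
target = [3.0, 6.0, 5.0, 8.0, 7.0, 10.0]
mse, r2, predictions = loocv_ridge(features, target, alpha=1.0)
```

Each held-out sample is predicted by a model that never saw it, so the MSE and R² estimate out-of-sample performance rather than goodness of fit.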

Ethnicity PCA Plot

Ethnicity information for 72 participants was classified using the 1000 Genomes Project (1000GP) super-population definitions: African (AFR), East Asian (EAS), European (EUR), South Asian (SAS), and Admixed American (AMR). Participants self-identifying as Indians were categorized as SAS (n = 7), Hispanics and Latinos as AMR (n = 3), East Asians as EAS (n = 8), Caucasians as EUR (n = 50), and African Americans as AFR (n = 4). Ethnicity data for the 2,504 1000GP samples, along with population and super-population definitions, were obtained from ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ (downloaded April 2017).

The following filters were applied to each individual genome in the study:

  • Removed indels, retaining only single nucleotide variants (SNVs).
  • Removed SNVs without the “PASS” quality tag.
  • Kept SNVs with a minimum read depth of 1.
  • Removed SNVs with missing genotypes.

Genetic loci from the 72 participants were intersected with the 1000GP samples, resulting in 6,953 SNVs common to both datasets. To reduce linkage disequilibrium and dependency between closely located SNVs, every third SNV was selected, yielding a final set of 2,576 samples and 2,318 SNVs for principal component analysis (PCA). PCA was performed using the smartpca tool in the PLINK2 suite.
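The filtering, intersection, and thinning steps above can be sketched in a few lines of Python. The record layout (dictionaries with `chrom`, `pos`, `ref`, `alt`, `filter`, `depth`, `genotype` keys) is a hypothetical stand-in for parsed VCF rows, and the function names are illustrative; this is not the pipeline used in the study.

```python
MISSING_GENOTYPES = (None, ".", "./.", ".|.")


def is_snv(ref, alt):
    """True for single-base substitutions; indels and multi-allelic sites fail."""
    return len(ref) == 1 and len(alt) == 1


def passes_filters(variant, min_depth=1):
    """Apply the four per-genome filters listed in the text."""
    return (is_snv(variant["ref"], variant["alt"])
            and variant["filter"] == "PASS"
            and variant["depth"] >= min_depth
            and variant["genotype"] not in MISSING_GENOTYPES)


def select_pca_loci(variants, reference_loci, step=3):
    """Keep filtered loci shared with the reference panel, then take every `step`-th one."""
    kept = {(v["chrom"], v["pos"]) for v in variants if passes_filters(v)}
    shared = sorted(kept & reference_loci)
    return shared[::step]


def make_variant(pos, ref="A", alt="G", flt="PASS", depth=10, genotype="0/1"):
    """Build a hypothetical variant record on chr1 for the example below."""
    return {"chrom": "chr1", "pos": pos, "ref": ref, "alt": alt,
            "filter": flt, "depth": depth, "genotype": genotype}


# Hypothetical variants from one genome.
variants = [
    make_variant(100),
    make_variant(200, ref="AT", alt="A"),      # indel, removed
    make_variant(300, flt="LowQual"),          # not PASS, removed
    make_variant(400, depth=0),                # below minimum depth, removed
    make_variant(500, genotype="./."),         # missing genotype, removed
    make_variant(600),
    make_variant(700),
    make_variant(800),
    make_variant(900),                         # absent from the reference panel
]
reference_loci = {("chr1", p) for p in (100, 200, 300, 600, 700, 800)}
pca_loci = select_pca_loci(variants, reference_loci)
```

Thinning by position index is a crude way to reduce linkage disequilibrium; it does not measure correlation between SNVs.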

Extended Data Figures

Extended Data Fig 1. Cohort Flow Chart and Genetic Ancestry

(a) Flow chart demonstrates recruitment and enrollment of the iPOP cohort.

(b) PCA plot showing ancestries of 72 participants. Reference includes 2,504 samples from the 1000 Genomes Project. Each filled circle is a 1000GP sample, colored by super-population: African (AFR; red), Admixed American (AMR; purple), East Asian (EAS; green), European (EUR; cyan), South Asian (SAS; orange). Each black symbol represents a study participant categorized by self-reported ethnicity consistent with 1000GP super-population definitions: AFR (black filled circle), AMR (black filled triangle), EAS (black filled square), EUR (black plus sign), SAS (checked box). Participants generally cluster within the reference super-populations.

Extended Data Fig 2. Comparison of Diabetic Metrics and HbA1C Trajectories

(a) Overlap of Fasting Plasma Glucose (FPG) and Hemoglobin A1C (HbA1C) categories measured simultaneously. FPG impaired: 100–126 mg/dL; diabetic: ≥126 mg/dL. HbA1C impaired: 5.7–6.5%; diabetic: ≥6.5%.

(b) Overlap of FPG and 2-Hour Oral Glucose Tolerance Test (OGTT). OGTT impaired: 140–200 mg/dL; diabetic: ≥200 mg/dL.

(c) Longitudinal HbA1C patterns categorized into six groups: Group 1 (n = 51): normal range throughout, Group 2 (n = 5): normal → prediabetic, Group 3 (n = 10): prediabetic → normal, Group 4 (n = 21): fluctuating normal/prediabetic, Group 5 (n = 14): predominantly prediabetic, Group 6 (n = 8): crossed into diabetic range. Red lines indicate overall penalized B-spline fit for each group.
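The category thresholds given in panels (a) and (b) can be written as a small lookup, which also shows how two tests can place the same person in different categories. In this sketch the lower bound of each range is treated as inclusive and the upper bound as exclusive, which is an assumption; the names are hypothetical.

```python
# (impaired lower bound, diabetic lower bound) per test, from the caption above.
THRESHOLDS = {
    "FPG": (100.0, 126.0),    # mg/dL
    "HbA1C": (5.7, 6.5),      # percent
    "OGTT": (140.0, 200.0),   # mg/dL, 2-hour value
}


def classify(metric, value):
    """Return 'normal', 'impaired', or 'diabetic' for one measurement."""
    impaired_min, diabetic_min = THRESHOLDS[metric]
    if value >= diabetic_min:
        return "diabetic"
    if value >= impaired_min:
        return "impaired"
    return "normal"


# Hypothetical same-day measurements for one participant.
visit = {"FPG": 98.0, "HbA1C": 5.8, "OGTT": 165.0}
categories = {metric: classify(metric, value) for metric, value in visit.items()}
discordant = len(set(categories.values())) > 1
```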

Extended Data Fig 3. Individual Longitudinal Trajectories for Diabetic Measures

Diabetic-range metrics are shown in red. Panels include: (a) OGTT, (b,c) FPG, (d) undiagnosed diabetes at study entry (HbA1C), (e) initial abnormal HbA1C, (f) bouncer with diabetic-range HbA1C and OGTT, (g) SSPG decrease with lifestyle change.

Extended Data Fig 4. Longitudinal Microbiome Trajectories in Diabetes

Longitudinal changes in weight, gut microbial Shannon diversity, phylum proportions, and genus proportions for participants ZNDMXI3 and ZNED4XZ. Microbiome outliers (95th percentile) are indicated at the latest sample time point. Microbial abundance is row-scaled (low = blue, high = red).

Extended Data Fig 5. Multi-omics of Glucose Metabolism and Inflammation

(a) Proteins and metabolites associated with HbA1C, FPG, and hsCRP using healthy-baseline and dynamic linear mixed models. Healthy-baseline models: HbA1C n = 101, samples 560; FPG n = 101, samples 563; hsCRP n = 98, samples 518. Dynamic models normalized analytes to first measurement across all time points. Significance: BH FDR < 0.2.

(b) Integrative pathway analysis using IMPaLA of proteins and metabolites associated with HbA1C, FPG, and hsCRP. Significance determined by hypergeometric test followed by Fisher’s combined probability test (BH FDR < 0.05). See Tables S9, S11, S13 for protein and metabolite counts per pathway.

Extended Data Fig 6. Outlier Analysis of RNA-seq Data

(a) Number of outlier RNA molecules (95th percentile) per participant. Analysis performed on Z-scores of median expression at healthy visits (≥3 visits; n = 63). Boxplot: box spans the 25th–75th percentiles, whiskers 1.5× IQR, horizontal bar = median.

(b) Selected clinical lab and metabolite trajectories (7 time points) for participant ZJTKAE3 showing concomitant increases of bile acids and glutamyl dipeptides with ALT and AST.

Extended Data Fig 7. Multidimensional Cardiac Risk Assessment

(a) Distribution of ASCVD risk scores (n = 35, 36 measurements) and cardiovascular imaging/physiology markers: RWT, LV GLS, E/e’, PWV (age-adjusted). Box plots: Q1, median, Q3; whiskers Q3 + 1.5×IQR and Q1 – 1.5×IQR.

(b) Ultrasound detection of carotid plaque (6/36 participants) with corresponding ASCVD risk score, HbA1C, and LV GLS distributions. Differences evaluated by two-sided Student’s t-test; error bars = ±1 SD.

(c) Correlation network of metrics significantly associated with ASCVD risk score (Spearman q-value < 0.2; n = 35 participants, 36 measurements).

(d) Composite Z-scores for ZOBX723 (unstable angina with stent placement) and ZNED4XZ (mild stroke transitioning to diabetes). Gray dots represent Z-scores of other participants (n = 101, 859 samples).

(e) Violin plot of (d) data. Boxplot shows 1st quartile, median, 3rd quartile; whiskers Q3 + 1.5×IQR and minimum value.

Supplementary Material

Refer to Web version on PubMed Central for supplementary material.

Acknowledgments

Our work was supported by grants from the National Institutes of Health (NIH) Human Microbiome Project (HMP) 1U54DE02378901 (G.M.W. and M.P.S.), NIH grant no. R01 DK110186-03 (T.L.M.), and an NIH National Center for Advancing Translational Science Clinical and Translational Science Award (no. UL1TR001085). This work used the Genome Sequencing Service Center by the Stanford Center for Genomics and Personalized Medicine Sequencing Center (supported by NIH grant no. S10OD020141), the Diabetes Genomics Analysis Core and the Clinical and Translational Core of the Stanford Diabetes Research Center (NIH grant no. P30DK116074). SMS-FR was supported by a Department of Veteran Affairs Office of Academic Affiliations Advanced Fellowship in Spinal Cord Injury Medicine and an NIH Career Development Award K08 ES028825. GMS was supported by NIH grant K08 MH103443. DH was supported by a Stanford School of Medicine Dean’s Postdoctoral Fellowship and a Stanford Center for Computational, Evolutionary and Human Genomics Fellowship. MRS was supported by grants P300PA_161005 and P2GEP3_151825 from the Swiss National Science Foundation (SNSF). The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH, the Department of Veteran Affairs, or the SNSF. We thank Songjie Chen and Brittany Lee for their work in metabolomics data production. Alessandra Breschi generously shared her code for the insulin secretion rate calculations. Finally, we thank the iPOP participants who generously gave their time and biological samples.

  26. Lagani V, Athineou G, Farcomeni A, Tsagris M & Tsamardinos I. Feature Selection with the R Package MXM: Discovering Statistically Equivalent Feature Subsets. Journal of Statistical Software, Articles 80, 1–25 (2017).
  27. McLaughlin T et al. Use of metabolic markers to identify overweight individuals who are insulin resistant. Ann. Intern. Med. 139, 802–809 (2003). PubMed: 14623617
  28. Nowak C et al. Protein Biomarkers for Insulin Resistance and Type 2 Diabetes Risk in Two Large Community Cohorts. Diabetes 65, 276–284 (2016). PubMed: 26420861
  29. Apostolopoulou M et al. Specific Hepatic Sphingolipids Relate to Insulin Resistance, Oxidative Stress, and Inflammation in Nonalcoholic Steatohepatitis. Diabetes Care 41, 1235–1243 (2018). PubMed: 29602794
  30. Gomez-Arango LF et al. Connections Between the Gut Microbiome and Metabolic Hormones in Early Pregnancy in Overweight and Obese Women. Diabetes 65, 2214–2223 (2016). PubMed: 27217482
  31. Kwo PY, Cohen SM & Lim JK. ACG Clinical Guideline: Evaluation of Abnormal Liver Chemistries. Am. J. Gastroenterol. 112, 18–35 (2017). PubMed: 27995906
  32. Hu FB et al. Elevated risk of cardiovascular disease prior to clinical diagnosis of type 2 diabetes. Diabetes Care 25, 1129–1134 (2002). PubMed: 12087009
  33. Goff DC Jr et al. 2013 ACC/AHA guideline on the assessment of cardiovascular risk: a report of the American College of Cardiology/American Heart Association Task Force on Practice Guidelines. Circulation 129, S49–73 (2014). PubMed: 24222018
  34. Kuznetsova T et al. Additive Prognostic Value of Left Ventricular Systolic Dysfunction in a Population-Based Cohort. Circ. Cardiovasc. Imaging 9, e004661 (2016). PubMed: 27329778
  35. Wang TJ et al. Carotid intima-media thickness is associated with premature parental coronary heart disease: the Framingham Heart Study. Circulation 108, 572–576 (2003). PubMed: 12874190
  36. Mitchell GF et al. Arterial stiffness and cardiovascular events: the Framingham Heart Study. Circulation 121, 505–511 (2010). PubMed: 20083680
  37. Moneghetti KJ et al. Applying current normative data to prognosis in heart failure: The Fitness Registry and the Importance of Exercise National Database (FRIEND). Int. J. Cardiol 263, 75–79 (2018). PubMed: 29525067
  38. Hall KT et al. Polymorphisms in catechol-O-methyltransferase modify treatment effects of aspirin on risk of cardiovascular disease. Arterioscler. Thromb. Vasc. Biol. 34, 2160–2167 (2014). PubMed: 25035343
  39. Malik R et al. Multiancestry genome-wide association study of 520,000 subjects identifies 32 loci associated with stroke and stroke subtypes. Nat. Genet. (2018). doi:10.1038/s41588-018-0058-3
  40. Cross DS et al. Coronary risk assessment among intermediate risk patients using a clinical and biomarker based algorithm developed and validated in two population cohorts. Curr. Med. Res. Opin. 28, 1819–1830 (2012). PubMed: 23092312
  41. Ma H, Calderon TM, Fallon JT & Berman JW. Hepatocyte growth factor is a survival factor for endothelial cells and is expressed in human atherosclerotic plaques. Atherosclerosis 164, 79–87 (2002). PubMed: 12119196
  42. Bell EJ et al. Hepatocyte Growth Factor Is Positively Associated With Risk of Stroke: The MESA (Multi-Ethnic Study of Atherosclerosis). Stroke 47, 2689–2694 (2016). PubMed: 27729582
  43. Chen X & Devaraj S. Monocytes from metabolic syndrome subjects exhibit a proinflammatory M1 phenotype. Metab. Syndr. Relat. Disord. 12, 362–366 (2014). PubMed: 24847781
  44. Elkind MS et al. Interleukin-2 levels are associated with carotid artery intima-media thickness. Atherosclerosis 180, 181–187 (2005). PubMed: 15823291
  45. Porez G, Prawitt J, Gross B & Staels B. Bile acid receptors as targets for the treatment of dyslipidemia and cardiovascular disease. J. Lipid Res. 53, 1723–1737 (2012). PubMed: 22550135
  46. Berry CE & Hare JM. Xanthine oxidoreductase and cardiovascular disease: molecular mechanisms and pathophysiological implications. J. Physiol. 555, 589–606 (2004). PubMed: 14694147
  47. Sane DC, Kontos JL & Greenberg CS. Roles of transglutaminases in cardiac and vascular diseases. Front. Biosci. 12, 2530–2545 (2007). PubMed: 17127261
  48. Wollert KC, Kempf T & Wallentin L. Growth Differentiation Factor 15 as a Biomarker in Cardiovascular Disease. Clin. Chem. 63, 140–151 (2017). PubMed: 27913719
  49. Ridker PM et al. C-reactive protein and other markers of inflammation in the prediction of cardiovascular disease in women. N. Engl. J. Med. 342, 836–843 (2000). PubMed: 10733371
  50. Wang TJ et al. Multiple biomarkers for the prediction of first major cardiovascular events and death. N. Engl. J. Med. 355, 2631–2639 (2006). PubMed: 17101633
  51. Ridker PM et al. Rosuvastatin to prevent vascular events in men and women with elevated C-reactive protein. N. Engl. J. Med. 359, 2195–2207 (2008). PubMed: 18997196
  52. Salem RM et al. Metabolite profiling of diabetes and obesity in two population-based cohorts. Diabetologia 61, 1728–1739 (2018). PubMed: 29628851
  53. Wang TJ et al. Metabolite profiles and the risk of developing diabetes. Nat. Med. 17, 448–453 (2011). PubMed: 21399648
  54. Newgard CB et al. A branched-chain amino acid-related metabolic signature that differentiates obese and lean humans and contributes to insulin resistance. Cell Metab. 9, 311–326 (2009). PubMed: 19356713
  55. Shaham O et al. Metabolic profiling of the human response to a glucose challenge reveals distinct axes of insulin sensitivity. Mol. Syst. Biol. 6, 364 (2010). PubMed: 20664669
  56. Würtz P et al. Metabolite profiling and cardiovascular event risk: a prospective study of 3 population-based cohorts. Circulation 131, 774–785 (2015). PubMed: 25605761
  57. Otvos JD et al. GlycA: a composite nuclear magnetic resonance biomarker of systemic inflammation. Clin. Chem. 61, 714–723 (2015). PubMed: 25614311
  58. Fiehn O. Metabolomics—The link between genotypes and phenotypes. Plant Mol. Biol. 48, 155–171 (2002). PubMed: 11886458
  59. Shah SH et al. A multi-platform metabolomics approach identifies novel biomarkers associated with incident type 2 diabetes. Diabetes 61, 3380–3388 (2012). PubMed: 22973045
  60. Ganna A et al. Large-scale metabolomic profiling identifies novel biomarkers for incident coronary heart disease. PLoS Genet. 10, e1004801 (2014). PubMed: 25232712
  61. Wang TJ et al. Metabolite profiles and the risk of developing diabetes. Nat. Med. 17, 448–453 (2011). PubMed: 21399648
  62. Lotta LA et al. Genetic predisposition to an impaired metabolism of branched-chain amino acids and risk of type 2 diabetes: A Mendelian randomisation analysis. PLoS Med. 13, e1002179 (2016). PubMed: 26871002
  63. Rhee EP et al. Metabolite profiling identifies markers of uremia. J. Am. Soc. Nephrol. 22, 2141–2151 (2011). PubMed: 21817030
  64. Ahola-Olli AV et al. Circulating metabolites and risk of type 2 diabetes: a prospective study in Finnish men. Diabetologia 61, 2127–2139 (2018). PubMed: 29929514
  65. Yuan M et al. Metabolomics reveals insulin resistance-associated metabolic signatures of obesity and type 2 diabetes. Diabetes 62, 2685–2696 (2013). PubMed: 23823402
  66. Floegel A et al. Identification of serum metabolites associated with risk of type 2 diabetes using a targeted metabolomic approach. Diabetes 62, 639–648 (2013). PubMed: 23047863
  67. Wang-Sattler R et al. Novel biomarkers for pre-diabetes identified by metabolomics. Mol. Syst. Biol. 6, 441 (2010). PubMed: 21045812
  68. Suhre K et al. Human metabolic individuality in biomedical and pharmaceutical research. Nature 477, 54–60 (2011). PubMed: 21886133

Figures

Study design and data collection
Figure 1. Study design and data collection. Overview of the in-depth longitudinal phenotyping used to determine health risk and status. Data types were categorized as: Standard (Blue), Enhanced (Purple) and Emerging (Red) tests. PBMCs: peripheral blood mononuclear cells; HbA1C: glycated hemoglobin; OGTT: oral glucose tolerance test; SSPG: steady-state plasma glucose; CBC: complete blood count; hsCRP: high sensitivity C-reactive protein; CVD: cardiovascular disease.
Clinical and enhanced phenotyping of glucose metabolism
Figure 2. Clinical and enhanced phenotyping of glucose metabolism, insulin production and resistance.
  • (a) Transitions in diabetes mellitus (DM) status (n = 109). Self-reported vs clinically determined DM status including FPG, HbA1C, OGTT; prediabetic and diabetic ranges defined.
  • (b) Overlap of diabetic range labs by participants over the course of the study.
  • (c) Violin plots showing insulin levels during OGTT at 0, 30 and 120 minutes, SSPG (n = 43) and glucose disposition index (n = 89 samples from 61 participants) by glycemic status. SSPG measured by modified insulin suppression test. Disposition index = insulin secretion rate at 30 min × Matsuda index. Two-sided Wilcoxon rank-sum test used. Kernel density shows data proportion; horizontal bar = median.
  • (d) Heatmap of insulin secretion rates, row-standardized and clustered using k-means (n = 89 samples from 61 participants). OGTT status, disposition index (DI), SSPG and insulin secretion rate max (ISR) shown.
  • (e) Correlation network of multi-omics measures associated with glucose disposition index (n = 89; BH FDR 0.1). Correlations between molecules computed with Spearman correlation; Bonferroni-corrected p-value 0.05. Only networks with ≥3 molecules plotted.
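The disposition index defined in panel (c) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: it assumes the standard Matsuda formula with glucose in mg/dL and insulin in µU/mL, uses a simple mean over the sampled OGTT time points, and takes the 30-minute insulin secretion rate as a precomputed input. The function names and example values are hypothetical.

```python
import math


def matsuda_index(glucose_mg_dl, insulin_uu_ml):
    """Matsuda insulin sensitivity index from paired OGTT samples.

    Both lists are ordered by time and start with the fasting (0 min) sample.
    Assumes glucose in mg/dL and insulin in microU/mL.
    """
    if len(glucose_mg_dl) != len(insulin_uu_ml) or not glucose_mg_dl:
        raise ValueError("need equal-length, non-empty glucose and insulin series")
    fasting_glucose = glucose_mg_dl[0]
    fasting_insulin = insulin_uu_ml[0]
    # Simple arithmetic mean over the sampled time points (an assumption here).
    mean_glucose = sum(glucose_mg_dl) / len(glucose_mg_dl)
    mean_insulin = sum(insulin_uu_ml) / len(insulin_uu_ml)
    return 10000.0 / math.sqrt(
        fasting_glucose * fasting_insulin * mean_glucose * mean_insulin
    )


def disposition_index(isr_30min, matsuda):
    """Disposition index = insulin secretion rate at 30 min x Matsuda index."""
    return isr_30min * matsuda


if __name__ == "__main__":
    # Hypothetical OGTT values at 0, 30 and 120 minutes.
    m = matsuda_index([90, 150, 110], [8, 60, 30])
    print(disposition_index(isr_30min=5.0, matsuda=m))
```

A lower disposition index indicates that insulin secretion is not keeping pace with insulin resistance, which is why the panel compares it across glycemic status groups.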
Longitudinal individual phenotyping and multi-omics of glucose metabolism and inflammation
Figure 3. Longitudinal individual phenotyping and multi-omics of glucose metabolism and inflammation.
  • (a–c) Longitudinal diabetic measures demonstrating different DM onset and progression patterns: OGTT, FPG, and initial improvement followed by progression. Diabetic-range metrics in red.
  • (d) Clinical markers and immune proteins associated with HbA1C, FPG, and hsCRP using healthy-baseline (HbA1C n = 101, samples 560; FPG n = 101, samples 563; hsCRP n = 98, samples 518) and dynamic models (HbA1C n = 94, samples 836; FPG n = 94, samples 843; hsCRP n = 92, samples 777). Two-sided t-test; BH FDR 0.2.
  • (e) Integrative pathway analysis using IMPaLA of proteins and metabolites associated with HbA1C, FPG, hsCRP (BH FDR 0.2). Pathway enrichment assessed with a hypergeometric test and combined across data types with Fisher’s combined probability test; BH FDR 0.05. Molecule counts in Tables S15, S17, S19.
  • (f) Molecules selected in SSPG and OGTT prediction models with associated coefficients. Lipidomics included for SSPG prediction. MSE = mean square error.
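The two tests named in panel (e) can be illustrated with a small standard-library sketch. It is not IMPaLA's implementation: the over-representation p-value is taken as the upper tail of a hypergeometric distribution, and Fisher's method combines the per-omics p-values (for example, one from proteins and one from metabolites) using the closed-form chi-square tail for even degrees of freedom. All function names and counts are hypothetical.

```python
import math


def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) for X ~ Hypergeometric(population, successes, draws).

    population: all measured molecules; successes: molecules in the pathway;
    draws: significant molecules; k: significant molecules in the pathway.
    """
    upper = min(successes, draws)
    total = math.comb(population, draws)
    hits = sum(
        math.comb(successes, i) * math.comb(population - successes, draws - i)
        for i in range(max(k, 0), upper + 1)
    )
    return hits / total


def fisher_combined(p_values):
    """Fisher's combined probability test for independent p-values.

    The statistic -2 * sum(ln p) follows a chi-square distribution with
    2k degrees of freedom, whose tail has a closed form for even df.
    """
    if not p_values or any(p <= 0.0 or p > 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    half_stat = -sum(math.log(p) for p in p_values)  # equals statistic / 2
    term, series = 1.0, 1.0
    for i in range(1, len(p_values)):
        term *= half_stat / i
        series += term
    return math.exp(-half_stat) * series


if __name__ == "__main__":
    # Hypothetical pathway: 8 of 40 significant proteins, 5 of 30 metabolites.
    p_prot = hypergeom_upper_tail(8, population=500, successes=25, draws=40)
    p_metab = hypergeom_upper_tail(5, population=700, successes=20, draws=30)
    print(fisher_combined([p_prot, p_metab]))
```

The combined p-values would then be adjusted for multiple testing across pathways, which the caption reports at BH FDR 0.05.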
Clinical longitudinal cardiovascular health profiling and multi-omics correlation network of adjusted ASCVD risk
Figure 4. Clinical longitudinal cardiovascular health profiling and multi-omics correlation network of adjusted ASCVD risk.
  • (a) Distribution of ASCVD and adjusted ASCVD risk scores (n = 108). Boxplot shows Q1, median, Q3; whiskers = 1.5×IQR.
  • (b) Self-reported cholesterol vs measured total cholesterol at study entry and longitudinally (n = 108).
  • (c) Multi-omics correlation network of molecules associated with adjusted ASCVD risk score (n = 77) using Spearman correlation; q-value 0.2; Bonferroni p 0.1. Only main network molecules plotted.
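As a rough illustration of the FDR thresholding in panel (c), the sketch below computes Benjamini-Hochberg adjusted p-values in plain Python. It assumes the q-values are BH-adjusted p-values, which the caption does not state; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical per-molecule p-values from Spearman correlations.
    raw = [0.001, 0.20, 0.03, 0.04, 0.50]
    adj = benjamini_hochberg(raw)
    kept = [i for i, q in enumerate(adj) if q < 0.2]
    print(adj, kept)
```

Molecules whose adjusted value falls under the chosen threshold (0.2 in this panel) would be retained as associated with the adjusted ASCVD risk score.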
Oncologic discoveries
Figure 5. Oncologic discoveries.
  • (a) Abdominal ultrasound: mildly enlarged spleen ~13 cm craniocaudal.
  • (b) PET imaging: large retroperitoneal mass with high FDG uptake, occupying most of the spleen.
  • (c) LDH levels at imaging and post-chemotherapy.
  • (d) MIG (CXCL9) levels increase one year prior to diagnosis, peak at diagnosis, and return to baseline after treatment (n = 11). Two-sided BH-adjusted p-value computed on Z-scores.
  • (e) Functional association network of outlier proteins (95th percentile) at diagnosis using STRING. Edges = known/predicted/other interactions.
  • (f) Shannon diversity of gut microbiome decreases before diagnosis, reaches a minimum at diagnosis, and returns to baseline post-treatment (n = 11). Generalized additive model separates linear (β = −0.197, p = 0.002) and non-linear (df = 3, p = 0.0112) components; F-test one-sided vs null.
  • (g) IgM levels across cohort (n = 109, samples 1,111). BH p-value on Z-scores; outlier visits = participant with MGUS. Boxplot shows Q1, median, Q3, whiskers = 1.5×IQR; diamond = mean.
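The Shannon diversity tracked in panel (f) follows the standard definition H = −Σ pᵢ ln pᵢ over taxon proportions. The sketch below is a minimal illustration of that definition. It assumes the natural logarithm and raw per-taxon counts as input; the caption does not state the log base or the taxonomic level. The function name and counts are hypothetical.

```python
import math


def shannon_diversity(counts):
    """Shannon diversity index (natural log) from per-taxon counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive value")
    # Taxa with zero counts contribute nothing to the index.
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


if __name__ == "__main__":
    # Hypothetical gut microbiome samples: an even community and one
    # dominated by a single taxon, which scores lower.
    even = [25, 25, 25, 25]
    dominated = [85, 5, 5, 5]
    print(shannon_diversity(even), shannon_diversity(dominated))
```

A drop in this index, as reported before diagnosis, reflects a community that has become less rich, less even, or both.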
Summary of major clinically actionable health discoveries and participant health behavior change
Figure 6. Summary of major clinically actionable health discoveries and participant health behavior change.
  • (a) Summary of clinically relevant health discoveries: 67 major discoveries; 55 PreDM results not included.
  • (b) Diet and physical activity modifications by participants.
  • (c) Amount of change in diet and exercise on 5-point scale (1 = no change, 5 = significant change). Abbreviations: MODY = maturity onset diabetes of the young; DM = diabetes mellitus; PreDM = prediabetes mellitus; afib = atrial fibrillation; SVT = supraventricular tachycardia; CV = cardiovascular; MGUS = monoclonal gammopathy of undetermined significance.