15  References

Alper, B. S. 2023. “Reflections on Defining a Standard for Computable Expression of Scientific Knowledge: What Teach Us Yoda Can.” Journal Article. Learn Health Syst 7 (1): e10312. https://doi.org/10.1002/lrh2.10312.
Andersen, K. M., B. A. Bates, E. S. Rashidi, A. L. Olex, R. B. Mannon, R. C. Patel, J. Singh, et al. 2022. “Long-Term Use of Immunosuppressive Medicines and in-Hospital COVID-19 Outcomes: A Retrospective Cohort Study Using Data from the National COVID Cohort Collaborative.” Journal Article. Lancet Rheumatol 4 (1): e33–41. https://doi.org/10.1016/S2665-9913(21)00325-8.
Ankan, A., I. M. N. Wortel, and J. Textor. 2021. “Testing Graphical Causal Models Using the R Package ‘Dagitty’.” Journal Article. Curr Protoc 1 (2): e45. https://doi.org/10.1002/cpz1.45.
Anzalone, Alfred Jerrod, Ronald Horswell, Brian M Hendricks, San Chu, William B Hillegass, William H Beasley, Jeremy R Harper, et al. 2023. “Higher Hospitalization and Mortality Rates Among SARS-CoV-2-Infected Persons in Rural America.” The Journal of Rural Health 39 (1): 39–54. https://doi.org/10.1111/jrh.12689.
Benchimol, E. I., L. Smeeth, A. Guttmann, K. Harron, D. Moher, I. Petersen, H. T. Sorensen, E. von Elm, S. M. Langan, and RECORD Working Committee. 2015. “The REporting of Studies Conducted Using Observational Routinely-Collected Health Data (RECORD) Statement.” Journal Article. PLoS Med 12 (10): e1001885. https://doi.org/10.1371/journal.pmed.1001885.
Bradwell, Katie R, Jacob T Wooldridge, Benjamin Amor, Tellen D Bennett, Adit Anand, Carolyn Bremer, Yun Jae Yoo, et al. 2022. “Harmonizing Units and Values of Quantitative Data Elements in a Very Large Nationally Pooled Electronic Health Record (EHR) Dataset.” Journal of the American Medical Informatics Association 29 (7): 1172–82. https://doi.org/10.1093/jamia/ocac054.
Casiraghi, Elena, Dario Malchiodi, Gabriella Trucco, Marco Frasca, Luca Cappelletti, Tommaso Fontana, Alessandro Andrea Esposito, et al. 2020. “Explainable Machine Learning for Early Assessment of COVID-19 Risk Prediction in Emergency Departments.” IEEE Access 8: 196299–325. https://doi.org/10.1109/access.2020.3034032.
Casiraghi, Elena, Rachel Wong, Margaret Hall, Ben Coleman, Marco Notaro, Michael D. Evans, Jena S. Tronieri, et al. 2023. “A Method for Comparing Multiple Imputation Techniques: A Case Study on the U.S. National COVID Cohort Collaborative.” Journal of Biomedical Informatics 139 (March): 104295. https://doi.org/10.1016/j.jbi.2023.104295.
Caton, S, and S Haas. 2020. “Fairness in Machine Learning: A Survey.” Journal Article. arXiv. https://doi.org/10.48550/arXiv.2010.0405.
Charlson, Mary E., Peter Pompei, Kathy L. Ales, and C. Ronald MacKenzie. 1987. “A New Method of Classifying Prognostic Comorbidity in Longitudinal Studies: Development and Validation.” Journal of Chronic Diseases 40 (5): 373–83. https://doi.org/10.1016/0021-9681(87)90171-8.
Chollet, Francois. 2021. Deep Learning with Python. Simon and Schuster.
Cutter, SL, KD Ash, and CT Emrich. 2014. “The Geographies of Community Disaster Resilience.” Journal Article. Global Environmental Change 29 (Nov 1): 65–77. https://doi.org/10.1016/j.gloenvcha.2014.08.005.
Dong, Xiao, Jianfu Li, Ekin Soysal, Jiang Bian, Scott L DuVall, Elizabeth Hanchrow, Hongfang Liu, et al. 2020. “COVID-19 TestNorm: A Tool to Normalize COVID-19 Testing Names to LOINC Codes.” Journal of the American Medical Informatics Association 27 (9): 1437–42. https://doi.org/10.1093/jamia/ocaa145.
Elm, E. von, D. G. Altman, M. Egger, S. J. Pocock, P. C. Gotzsche, J. P. Vandenbroucke, and STROBE Initiative. 2014. “The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) Statement: Guidelines for Reporting Observational Studies.” Journal Article. Int J Surg 12 (12): 1495–99. https://doi.org/10.1016/j.ijsu.2014.07.013.
Franklin, J. M., K. J. Lin, N. M. Gatto, J. A. Rassen, R. J. Glynn, and S. Schneeweiss. 2021. “Real-World Evidence for Assessing Pharmaceutical Treatments in the Context of COVID-19.” Journal Article. Clin Pharmacol Ther 109 (4): 816–28. https://doi.org/10.1002/cpt.2185.
Franklin, J. M., R. Platt, N. A. Dreyer, A. J. London, G. E. Simon, J. H. Watanabe, M. Horberg, A. Hernandez, and R. M. Califf. 2022. “When Can Nonrandomized Studies Support Valid Inference Regarding Effectiveness or Safety of New Medical Treatments?” Journal Article. Clin Pharmacol Ther 111 (1): 108–15. https://doi.org/10.1002/cpt.2255.
Fu, Sunyang, Lester Y. Leung, Anne-Olivia Raulli, David F. Kallmes, Kristin A. Kinsman, Kristoff B. Nelson, Michael S. Clark, et al. 2020. “Assessment of the Impact of EHR Heterogeneity for Clinical Research Through a Case Study of Silent Brain Infarction.” BMC Medical Informatics and Decision Making 20 (1). https://doi.org/10.1186/s12911-020-1072-9.
Géron, Aurélien. 2022. Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow. O’Reilly Media, Inc.
Gold, Sigfried, Andrea Batch, Robert McClure, Guoqian Jiang, Hadi Kharrazi, Rishi Saripalle, Vojtech Huser, et al. 2018. “Clinical Concept Value Sets and Interoperability in Health Data Analytics.” In AMIA Annual Symposium Proceedings, 2018:480. American Medical Informatics Association. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6371254.
Gold, Sigfried, Harold Lehmann, Lisa Schilling, and Wayne Lutters. 2021. “Practices, Norms, and Aspirations Regarding the Construction, Validation, and Reuse of Code Sets in the Analysis of Real-World Data.” medRxiv, 2021–10. https://doi.org/10.1101/2021.10.14.21264917.
Griffith, G. J., T. T. Morris, M. J. Tudball, A. Herbert, G. Mancano, L. Pike, G. C. Sharp, et al. 2020. “Collider Bias Undermines Our Understanding of COVID-19 Disease Risk and Severity.” Nature Communications 11 (1): 5749. https://doi.org/10.1038/s41467-020-19478-2.
Haendel, Melissa A, Christopher G Chute, Tellen D Bennett, David A Eichmann, Justin Guinney, Warren A Kibbe, Philip R O Payne, et al. 2020. “The National COVID Cohort Collaborative (N3C): Rationale, Design, Infrastructure, and Deployment.” Journal of the American Medical Informatics Association 28 (3): 427–43. https://doi.org/10.1093/jamia/ocaa196.
Hastie, Trevor, Robert Tibshirani, and Jerome H Friedman. 2009. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Vol. 2. Springer.
Hernan, M. A., and J. M. Robins. 2016. “Using Big Data to Emulate a Target Trial When a Randomized Trial Is Not Available.” Journal Article. Am J Epidemiol 183 (8): 758–64. https://doi.org/10.1093/aje/kwv254.
Islam, J. Y., V. Madhira, J. Sun, A. Olex, N. Franceschini, G. Kirk, and R. Patel. 2022. “Racial Disparities in COVID-19 Test Positivity Among People Living with HIV in the United States.” Journal Article. Int J STD AIDS 33 (5): 462–66. https://doi.org/10.1177/09564624221074468.
Kharrazi, Hadi, Winnie Chi, Hsien-Yen Chang, Thomas M Richards, Jason M Gallagher, Susan M Knudson, and Jonathan P Weiner. 2017. “Comparing Population-Based Risk-Stratification Model Performance Using Demographic, Diagnosis and Medication Data Extracted from Outpatient Electronic Health Records Versus Administrative Claims.” Medical Care 55 (8): 789–96. https://doi.org/10.1097/MLR.0000000000000754.
Klein, Julie Thompson. 1996. Crossing Boundaries: Knowledge, Disciplinarities, and Interdisciplinarities. Book. Knowledge: Disciplinarity and Beyond. Charlottesville; London: University Press of Virginia. https://www.google.com/books/edition/Crossing_Boundaries/bNJvYf3ROPAC.
Kleinberg, Jon M., Sendhil Mullainathan, and Manish Raghavan. 2016. “Inherent Trade-Offs in the Fair Determination of Risk Scores.” CoRR abs/1609.05807. https://doi.org/10.48550/arXiv.1609.05807.
Kuehne, F., B. Jahn, A. Conrads-Frank, M. Bundo, M. Arvandi, F. Endel, N. Popper, et al. 2019. “Guidance for a Causal Comparative Effectiveness Analysis Emulating a Target Trial Based on Big Real World Evidence: When to Start Statin Treatment.” Journal Article. J Comp Eff Res 8 (12): 1013–25. https://doi.org/10.2217/cer-2018-0103.
Li, Chenyu, Abdulrahman M. Alsheikh, Karen A. Robinson, and Harold P. Lehmann. 2023. “Use of Recommended Real-World Methods for Electronic Health Record Data Analysis Has Not Improved over 10 Years.” medRxiv. https://doi.org/10.1101/2023.06.21.23291706.
Lundberg, Scott M., Gabriel G. Erion, and Su-In Lee. 2018. “Consistent Individualized Feature Attribution for Tree Ensembles.” arXiv. https://doi.org/10.48550/ARXIV.1802.03888.
Madlock-Brown, C., K. Wilkens, N. Weiskopf, N. Cesare, S. Bhattacharyya, N. O. Riches, J. Espinoza, et al. 2022a. “Clinical, Social, and Policy Factors in COVID-19 Cases and Deaths: Methodological Considerations for Feature Selection and Modeling in County-Level Analyses.” Journal Article. BMC Public Health 22 (1): 747. https://doi.org/10.1186/s12889-022-13168-y.
———, et al. 2022b. “Correction: Clinical, Social, and Policy Factors in COVID-19 Cases and Deaths: Methodological Considerations for Feature Selection and Modeling in County-Level Analyses.” Journal Article. BMC Public Health 22 (1): 1250. https://doi.org/10.1186/s12889-022-13562-6.
Mehta, Hemalkumar B., Huijun An, Kathleen M. Andersen, Omar Mansour, Vithal Madhira, Emaan S. Rashidi, Benjamin Bates, et al. 2021. “Use of Hydroxychloroquine, Remdesivir, and Dexamethasone Among Adults Hospitalized with Covid-19 in the United States: A Retrospective Cohort Study.” Annals of Internal Medicine 174 (10): 1395–1403. https://doi.org/10.7326/M21-0857.
Mitra, Robin, Sarah F McGough, Tapabrata Chakraborti, Chris Holmes, Ryan Copping, Niels Hagenbuch, Stefanie Biedermann, et al. 2023. “Learning from Data with Structured Missingness.” Nature Machine Intelligence 5 (1): 13–23. https://doi.org/10.1038/s42256-022-00596-z.
Morgan, R. L., P. Whaley, K. A. Thayer, and H. J. Schunemann. 2018. “Identifying the PECO: A Framework for Formulating Good Questions to Explore the Association of Environmental and Other Exposures with Health Outcomes.” Journal Article. Environ Int 121 (Pt 1): 1027–31. https://doi.org/10.1016/j.envint.2018.07.015.
Narrett, J. A., I. Mallawaarachchi, C. M. Aldridge, E. D. Assefa, A. Patel, J. J. Loomba, S. Ratcliffe, et al. 2023. “Increased Stroke Severity and Mortality in Patients with SARS-CoV-2 Infection: An Analysis from the N3C Database.” Journal Article. J Stroke Cerebrovasc Dis 32 (3): 106987. https://doi.org/10.1016/j.jstrokecerebrovasdis.2023.106987.
OHDSI. 2019. The Book of OHDSI: Observational Health Data Sciences and Informatics. United States: OHDSI. https://ohdsi.github.io/TheBookOfOhdsi/.
Palantir. 2023. “Documentation: Code Repositories Overview.” https://www.palantir.com/docs/foundry/code-repositories/overview/.
Peshawa J Muhammad Ali, and Rezhna Hassan Faraj. 2014. “Data Normalization and Standardization: A Technical Report.” https://doi.org/10.13140/RG.2.2.28948.04489.
Pfaff, E. R., A. T. Girvin, T. D. Bennett, A. Bhatia, I. M. Brooks, R. R. Deer, J. P. Dekermanjian, et al. 2022. “Identifying Who Has Long COVID in the USA: A Machine Learning Approach Using N3C Data.” Lancet Digit Health 4 (7): e532–41. https://doi.org/10.1016/S2589-7500(22)00048-6.
Pfaff, Emily R, Andrew T Girvin, Davera L Gabriel, Kristin Kostka, Michele Morris, Matvey B Palchuk, Harold P Lehmann, et al. 2022. “Synergies Between Centralized and Federated Approaches to Data Quality: A Report from the National COVID Cohort Collaborative.” Journal of the American Medical Informatics Association 29 (4): 609–18. https://doi.org/10.1093/jamia/ocab217.
Pfaff, Emily R, Charisse Madlock-Brown, John M Baratta, Abhishek Bhatia, Hannah Davis, Andrew Girvin, Elaine Hill, et al. 2023. “Coding Long COVID: Characterizing a New Disease Through an ICD-10 Lens.” BMC Medicine 21 (1): 1–13. https://doi.org/10.1186/s12916-023-02737-6.
Redelmeier, D. A., J. Wang, and D. Thiruchelvam. 2023. “COVID Vaccine Hesitancy and Risk of a Traffic Crash.” Journal Article. Am J Med 136 (2): 153–162.e5. https://doi.org/10.1016/j.amjmed.2022.11.002.
Reese, Justin T, Hannah Blau, Elena Casiraghi, Timothy Bergquist, Johanna J Loomba, Tiffany J Callahan, Bryan Laraway, et al. 2023. “Generalisable Long COVID Subtypes: Findings from the NIH N3C and RECOVER Programmes.” EBioMedicine 87. https://doi.org/10.1016/j.ebiom.2022.104413.
Richesson, Rachel L, W Ed Hammond, Meredith Nahm, Douglas Wixted, Gregory E Simon, Jennifer G Robinson, Alan E Bauck, et al. 2013. “Electronic Health Records Based Phenotyping in Next-Generation Clinical Trials: A Perspective from the NIH Health Care Systems Collaboratory.” Journal of the American Medical Informatics Association 20 (e2): e226–31. https://doi.org/10.1136/amiajnl-2013-001926.
Roberts, Michael, Derek Driggs, Matthew Thorpe, Julian Gilbey, Michael Yeung, Stephan Ursprung, Angelica I. Aviles-Rivero, et al. 2021. “Common Pitfalls and Recommendations for Using Machine Learning to Detect and Prognosticate for COVID-19 Using Chest Radiographs and CT Scans.” Nature Machine Intelligence 3 (3): 199–217. https://doi.org/10.1038/s42256-021-00307-0.
Sahner, David, and David C. Spellmeyer. 2020. “Artificial Intelligence: Emerging Applications in Biotechnology and Pharma.” In Biotechnology Entrepreneurship, 399–417. Elsevier. https://doi.org/10.1016/b978-0-12-815585-1.00028-0.
Schneeweiss, S., J. A. Rassen, J. S. Brown, K. J. Rothman, L. Happe, P. Arlett, G. Dal Pan, W. Goettsch, W. Murk, and S. V. Wang. 2019. “Graphical Depiction of Longitudinal Study Designs in Health Care Databases.” Journal Article. Ann Intern Med 170 (6): 398–406. https://doi.org/10.7326/M18-3079.
Schuemie, M. J., P. B. Ryan, G. Hripcsak, D. Madigan, and M. A. Suchard. 2018. “Improving Reproducibility by Using High-Throughput Observational Studies with Empirical Calibration.” Journal Article. Philos Trans A Math Phys Eng Sci 376 (2128). https://doi.org/10.1098/rsta.2017.0356.
Schuemie, M. J., P. B. Ryan, N. Pratt, R. Chen, S. C. You, H. M. Krumholz, D. Madigan, G. Hripcsak, and M. A. Suchard. 2020. “Large-Scale Evidence Generation and Evaluation Across a Network of Databases (LEGEND): Assessing Validity Using Hypertension as a Case Study.” Journal Article. J Am Med Inform Assoc 27 (8): 1268–77. https://doi.org/10.1093/jamia/ocaa124.
Shapley, L. S. 1953. “A Value for n-Person Games.” In Contributions to the Theory of Games (AM-28), Volume II, 307–18. Princeton University Press. https://doi.org/10.1515/9781400881970-018.
Sharafeldin, Noha, Benjamin Bates, Qianqian Song, Vithal Madhira, Yao Yan, Sharlene Dong, Eileen Lee, et al. 2021. “Outcomes of COVID-19 in Patients with Cancer: Report from the National COVID Cohort Collaborative (N3C).” Journal of Clinical Oncology 39 (20): 2232–46. https://doi.org/10.1200/JCO.21.01074.
Sidky, H., J. C. Young, A. T. Girvin, E. Lee, Y. R. Shao, N. Hotaling, S. Michael, et al. 2023. “Data Quality Considerations for Evaluating COVID-19 Treatments Using Real World Data: Learnings from the National COVID Cohort Collaborative (N3C).” Journal Article. BMC Med Res Methodol 23 (1): 46. https://doi.org/10.1186/s12874-023-01839-2.
Stoudt, S., V. N. Vasquez, and C. C. Martinez. 2021. “Principles for Data Analysis Workflows.” Journal Article. PLoS Comput Biol 17 (3): e1008770. https://doi.org/10.1371/journal.pcbi.1008770.
Sun, Jing, Qulu Zheng, Vithal Madhira, Amy L. Olex, Alfred J. Anzalone, Amanda Vinson, Jasvinder A. Singh, et al. 2022. “Association Between Immune Dysfunction and COVID-19 Breakthrough Infection After SARS-CoV-2 Vaccination in the US.” JAMA Internal Medicine 182 (2): 153–62. https://doi.org/10.1001/jamainternmed.2021.7024.
Tan, A. L. M., E. J. Getzen, M. R. Hutch, Z. H. Strasser, A. Gutierrez-Sacristan, T. T. Le, A. Dagliati, et al. 2023. “Informative Missingness: What Can We Learn from Patterns in Missing Laboratory Data in the Electronic Health Record?” Journal Article. J Biomed Inform 139: 104306. https://doi.org/10.1016/j.jbi.2023.104306.
U.S. Food and Drug Administration. 2017. “Software as a Medical Device (SAMD): Clinical Evaluation/Guidance for Industry and Food and Drug Administration Staff.” Web Page. FDA. https://www.fda.gov/media/100714/download.
———. 2023. “Considerations for the Design and Conduct of Externally Controlled Trials for Drug and Biological Products Guidance for Industry.” Report. Food and Drug Administration. https://www.fda.gov/media/164960/download.
U.S. Food and Drug Administration and the Duke-Margolis Center for Health Policy. 2019. “Developing Real-World Data and Evidence to Support Regulatory Decision-Making.” Online Multimedia. https://www.youtube.com/watch?v=-G6ltatA71I.
U.S. Food and Drug Administration, Health Canada, and the United Kingdom’s Medicines and Healthcare products Regulatory Agency (MHRA). 2021. “Good Machine Learning Practice for Medical Device Development: Guiding Principles.” Web Page. https://www.fda.gov/medical-devices/software-medical-device-samd/good-machine-learning-practice-medical-device-development-guiding-principles.
Walonoski, Jason, Sybil Klaus, Eldesia Granger, Dylan Hall, Andrew Gregorowicz, George Neyarapally, Abigail Watson, and Jeff Eastman. 2020. “Synthea™ Novel Coronavirus (COVID-19) Model and Synthetic Data Set.” Intelligence-Based Medicine 1-2: 100007. https://doi.org/10.1016/j.ibmed.2020.100007.
Wang, S. V., S. Pinheiro, W. Hua, P. Arlett, Y. Uyama, J. A. Berlin, D. B. Bartels, K. H. Kahler, L. G. Bessette, and S. Schneeweiss. 2021. “STaRT-RWE: Structured Template for Planning and Reporting on the Implementation of Real World Evidence Studies.” Journal Article. BMJ 372: m4856. https://doi.org/10.1136/bmj.m4856.
Weiskopf, N. G., D. A. Dorr, C. Jackson, H. P. Lehmann, and C. A. Thompson. 2023. “Healthcare Utilization Is a Collider: An Introduction to Collider Bias in EHR Data Reuse.” Journal Article. J Am Med Inform Assoc. https://doi.org/10.1093/jamia/ocad013.
Wilkinson, Mark D, Michel Dumontier, IJsbrand Jan Aalbersberg, Gabrielle Appleton, Myles Axton, Arie Baak, Niklas Blomberg, et al. 2016. “The FAIR Guiding Principles for Scientific Data Management and Stewardship.” Scientific Data 3. https://doi.org/10.1038/sdata.2016.18.
Yang, Xueying, Jing Sun, Rena C Patel, Jiajia Zhang, Siyuan Guo, Qulu Zheng, Amy L Olex, et al. 2021. “Associations Between HIV Infection and Clinical Spectrum of COVID-19: A Population Level Analysis Based on US National COVID Cohort Collaborative (N3C) Data.” The Lancet HIV 8 (11): 690–700. https://doi.org/10.1016/S2352-3018(21)00239-3.
Zhou, R., K. E. Johnson, J. F. Rousseau, P. J. Rathouz, and N3C Consortium. 2022. “Comparative Effectiveness of Dexamethasone in Treatment of Hospitalized COVID-19 Patients During the First Year of the Pandemic: The N3C Data Repository.” Journal Article. medRxiv. https://doi.org/10.1101/2022.10.22.22281373.