
Thoughts on Machine Learning and AI from ACoP: Part Two

Posted by Matthew Wiens, M.A. on Dec 9, 2022 2:50:31 PM

Welcome to the second post of a three-part blog series reflecting on the ACoP (American Conference on Pharmacometrics) 13 meeting held October 30th - November 2nd, 2022.

In the first post we discussed the realistic goals, reasonable scope, and perspective on the effort required to integrate artificial intelligence/machine learning (AI/ML) in pharmacometrics. If you missed the first post of this series or need a refresher, check it out here.

ML methodologies to improve modeling: many posters incorporated ML ideas into their analysis workflow.

Many posters used Shapley values to aid in interpreting ML models. Shapley values are a method for interpreting the covariate effects of an arbitrary model by making predictions under different permutations of the covariates in the model (see https://christophm.github.io/interpretable-ml-book/shapley.html for a more thorough, intuitive explanation). The application of Shapley values in several projects demonstrated how these models can improve scientific understanding across a range of analyses. A logical question to ask, then, is: should and will Shapley value plots eventually replace conventional forest plots in pharmacometric analyses, in particular because of the desire to identify subgroups and covariate interactions?

In addition, I presented a poster (Wiens, et al. A Machine Learning and Statistical Meta-Analysis for Trial Simulations Predicting Transitions from Relapsing-Remitting to Secondary Progressive Multiple Sclerosis. Poster M-055.) that used XGBoost, a tree-based machine learning algorithm, together with Shapley values to characterize covariate relationships (more than just screening!). This work was exploratory in nature, as part of the learning phase of the learn-confirm paradigm coined by Lewis B. Sheiner [1].

Other discussion revolved around ML and causal inference, where multiple models are often used (e.g., an outcome model and a propensity score model). In those cases, an ML model is easy to implement and can accurately describe a complicated process that lacks a solid scientific understanding (e.g., propensity scores).
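To make the Shapley value idea above more concrete, here is a minimal sketch of an exact calculation for a single subject. It is not the method from any of the posters: the model `toy_model`, its coefficients, and the covariate names (`wt`, `renal`, `age`) are hypothetical, and covariates that have not yet been "added" are simply held at a reference subject's values, which is only one of several possible conventions. In practice a dedicated library would be used rather than brute-force enumeration, which scales factorially with the number of covariates.

```python
from itertools import permutations


def shapley_values(predict, subject, baseline):
    """Exact Shapley values for one subject, averaged over all covariate orderings.

    Covariates not yet switched on are held at the baseline (reference) value.
    """
    names = list(subject)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)           # start from the reference subject
        previous = predict(current)
        for name in order:
            current[name] = subject[name]  # switch this covariate to the subject's value
            updated = predict(current)
            totals[name] += updated - previous  # marginal contribution in this ordering
            previous = updated
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_model(cov):
    """Hypothetical clearance model with a weight-by-renal-impairment interaction.

    Age is deliberately given no effect.
    """
    centered_wt = cov["wt"] - 70
    return 5.0 + 0.05 * centered_wt - 1.5 * cov["renal"] - 0.02 * cov["renal"] * centered_wt


baseline = {"wt": 70, "renal": 0, "age": 40}  # hypothetical reference subject
subject = {"wt": 90, "renal": 1, "age": 60}   # hypothetical subject of interest

phi = shapley_values(toy_model, subject, baseline)
for name, value in phi.items():
    print(f"{name}: {value:+.3f}")
```

The contributions sum to the difference between the subject's prediction and the reference prediction, the interaction term is split evenly between weight and renal impairment, and a covariate with no effect (age here) receives a value of zero.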

In addition to models, various parts of the ML workflow are being adopted by pharmacometricians. Several posters and talks presenting non-ML projects specifically referenced out-of-sample prediction and cross-validation. Although these projects did not use ML models, bringing ML concepts to pharmacometrics continues to be valuable. In cases where data are not extremely limited, using cross-validation increases confidence in the generalizability of the model. This approach moves model selection away from potentially questionable asymptotic approximations and focuses instead on the scientific quantities of interest and the models’ ability to predict new data. Here, general awareness of ML methodologies brings a new perspective to pharmacometrics. Ideas and best practices from the ML community might not be applicable to every model, but they also shouldn’t be ignored. Another methodological tool in ML is the concept of latent spaces as a way to summarize data (beyond PCA). This leads into my third theme, which will be discussed in the next blog post: pharmacometrics-specific ML approaches for classically challenging models.
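As an illustration of the cross-validation point above, the sketch below compares two candidate models by their out-of-sample prediction error using k-fold cross-validation. Everything in it is an assumption made for illustration: the simulated weight and clearance data, the two candidate models (intercept-only versus linear in weight), and the helper names `fit_mean`, `fit_linear`, and `kfold_rmse`. A real pharmacometric analysis would typically hold out whole subjects rather than individual observations and would use a proper modeling tool.

```python
import random
from statistics import mean


def fit_mean(xs, ys):
    """Intercept-only model: always predicts the training mean."""
    center = mean(ys)
    return lambda x: center


def fit_linear(xs, ys):
    """Ordinary least squares for a single covariate."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def kfold_rmse(fit, xs, ys, k=5, seed=0):
    """Root mean squared error of predictions made only on held-out folds."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    squared_errors = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        squared_errors.extend((ys[i] - model(xs[i])) ** 2 for i in held_out)
    return mean(squared_errors) ** 0.5


# Simulated (hypothetical) data: clearance increases with body weight plus noise
rng = random.Random(1)
weights = [rng.uniform(50, 110) for _ in range(60)]
clearance = [2.0 + 0.05 * w + rng.gauss(0, 0.3) for w in weights]

rmse_mean = kfold_rmse(fit_mean, weights, clearance)
rmse_linear = kfold_rmse(fit_linear, weights, clearance)
print(f"intercept-only RMSE: {rmse_mean:.3f}")
print(f"linear-in-weight RMSE: {rmse_linear:.3f}")
```

Because each prediction is made by a model that never saw that observation, the comparison rests directly on predictive performance for new data rather than on an asymptotic approximation.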

 

1. Sheiner, L. B. (1997). Learning versus confirming in clinical drug development. Clinical Pharmacology and Therapeutics, 61(3), 275–291.