Machine learning methods are becoming integral to scientific inquiry in numerous disciplines. We demonstrated that machine learning can be used to predict the performance of a synthetic reaction in multidimensional chemical space using data obtained via high-throughput experimentation. We created scripts to compute and extract atomic, molecular, and vibrational descriptors for the components of a palladium-catalyzed Buchwald-Hartwig cross-coupling of aryl halides with 4-methylaniline in the presence of various potentially inhibitory additives. Using these descriptors as inputs and reaction yield as output, we showed that a random forest algorithm provides significantly improved predictive performance over linear regression analysis. The random forest model was also successfully applied to sparse training sets and out-of-sample prediction, suggesting its value in facilitating adoption of synthetic methodology.
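
To make the comparison in the abstract concrete, here is a minimal sketch of fitting a linear regression and a small random forest to the same (descriptors, yield) table and scoring both on held-out rows. Everything in it is an assumption for illustration: the data are synthetic stand-ins rather than the paper's descriptors or measured yields, the function names (`make_data`, `fit_linear`, `fit_forest`, `r2`) are hypothetical, and the forest is a simplified pure-Python version (bootstrap samples, a random feature subset per split, a handful of sampled thresholds), not the authors' implementation.

```python
import random
from statistics import mean

def make_data(n, rng):
    """Synthetic (descriptor vector, yield) rows. NOT real reaction data."""
    rows = []
    for _ in range(n):
        x = [rng.random() for _ in range(3)]  # three made-up descriptors
        # Assumed toy rule: yield is high only when two descriptors line up.
        y = 90.0 if (x[0] > 0.5 and x[1] < 0.5) else 10.0
        rows.append((x, y + rng.gauss(0, 2)))
    return rows

def fit_linear(rows):
    """Ordinary least squares via the normal equations (Gauss-Jordan solve)."""
    A = [[1.0] + x for x, _ in rows]  # prepend an intercept column
    k = len(A[0])
    M = [[sum(a[i] * a[j] for a in A) for j in range(k)]
         + [sum(a[i] * y for a, (_, y) in zip(A, rows))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(M[r][c]))  # partial pivoting
        M[c], M[p] = M[p], M[c]
        for r in range(k):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    w = [M[i][k] / M[i][i] for i in range(k)]
    return lambda x: w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))

def sse(ys):
    """Sum of squared deviations from the mean."""
    m = mean(ys)
    return sum((y - m) ** 2 for y in ys)

def build_tree(rows, rng, depth=0, max_depth=4, min_leaf=5):
    """Regression tree; leaves are floats, internal nodes are tuples."""
    ys = [y for _, y in rows]
    if depth >= max_depth or len(rows) < 2 * min_leaf:
        return mean(ys)
    n_feat = len(rows[0][0])
    feats = rng.sample(range(n_feat), max(1, n_feat - 1))  # random feature subset
    best = None
    for f in feats:
        # Simplification: try 10 thresholds sampled from the data, not all of them.
        for t in [rng.choice(rows)[0][f] for _ in range(10)]:
            left = [r for r in rows if r[0][f] <= t]
            right = [r for r in rows if r[0][f] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            score = sse([y for _, y in left]) + sse([y for _, y in right])
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:
        return mean(ys)
    _, f, t, left, right = best
    return (f, t,
            build_tree(left, rng, depth + 1, max_depth, min_leaf),
            build_tree(right, rng, depth + 1, max_depth, min_leaf))

def predict_tree(node, x):
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if x[f] <= t else right
    return node

def fit_forest(rows, rng, n_trees=25):
    """Each tree sees a bootstrap resample; predictions are averaged."""
    trees = [build_tree([rng.choice(rows) for _ in rows], rng)
             for _ in range(n_trees)]
    return lambda x: mean(predict_tree(t, x) for t in trees)

def r2(model, rows):
    """Coefficient of determination on the given rows."""
    ys = [y for _, y in rows]
    ybar = mean(ys)
    ss_res = sum((y - model(x)) ** 2 for x, y in rows)
    ss_tot = sum((y - ybar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

rng = random.Random(0)
train, test = make_data(300, rng), make_data(200, rng)
linear = fit_linear(train)
forest = fit_forest(train, rng)
r2_linear, r2_forest = r2(linear, test), r2(forest, test)
print(f"held-out R^2  linear: {r2_linear:.2f}  forest: {r2_forest:.2f}")
```

The toy yield rule is deliberately an interaction between two descriptors, which a straight-line fit cannot represent but a tree can split on. That is the kind of structure where a forest would be expected to pull ahead. It says nothing about how large the gap is on real reaction data.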

Posts

I wrote a little while back about a brute-force approach to finding metal-catalyzed coupling conditions. These reactions have a lot of variables in them and can be notoriously finicky about what combination of these will actually give decent amounts of product....