Abstract
Small molecules exhibiting desirable property profiles are often discovered through an iterative process of designing, synthesizing and testing sets of molecules. The selection of molecules to synthesize from all possible candidates is a complex decision-making process that typically relies on expert chemist intuition. Here we propose a quantitative decision-making framework, SPARROW, that prioritizes molecules for evaluation by balancing expected information gain and synthetic cost. SPARROW integrates molecular design, property prediction and retrosynthetic planning to balance the utility of testing a molecule with the cost of batch synthesis. We demonstrate, through three case studies, that the developed algorithm captures the non-additive costs inherent to batch synthesis, leverages common reaction steps and intermediates, and scales to hundreds of molecules.
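The non-additive cost of batch synthesis described above can be illustrated with a toy sketch. This is not the SPARROW formulation or its code: the candidate names, rewards, reaction-step labels, `STEP_COST` and `LAMBDA` are all hypothetical, and brute-force enumeration stands in for a proper optimization solver. The only point is that a reaction step shared by two selected molecules is paid for once, so the cost of a batch is not the sum of the individual route costs.

```python
from itertools import combinations

# Hypothetical candidates: a reward (utility of testing the molecule)
# and the set of reaction steps in one assumed synthetic route.
CANDIDATES = {
    "mol_A": {"reward": 0.9, "steps": {"r1", "r2"}},
    "mol_B": {"reward": 0.7, "steps": {"r1", "r3"}},  # shares step r1 with mol_A
    "mol_C": {"reward": 0.8, "steps": {"r4", "r5", "r6"}},
}
STEP_COST = 1.0  # assumed uniform cost per distinct reaction step
LAMBDA = 0.3     # assumed weight trading reward against cost


def batch_cost(selected):
    """Cost of a batch: each distinct reaction step is counted once."""
    steps = set().union(*(CANDIDATES[m]["steps"] for m in selected))
    return STEP_COST * len(steps)


def objective(selected):
    """Total reward of the batch minus its weighted synthesis cost."""
    reward = sum(CANDIDATES[m]["reward"] for m in selected)
    return reward - LAMBDA * batch_cost(selected)


def best_batch():
    """Enumerate every subset of candidates and return the best-scoring one."""
    names = sorted(CANDIDATES)
    subsets = (
        combo
        for size in range(len(names) + 1)
        for combo in combinations(names, size)
    )
    return max(subsets, key=objective)


if __name__ == "__main__":
    chosen = best_batch()
    print(chosen, round(objective(chosen), 3), batch_cost(chosen))
```

With these made-up numbers, the pair that shares step `r1` is selected: together the two molecules need three distinct steps rather than four, which is enough to outscore every other subset. Enumeration grows exponentially with the number of candidates, so it is only useful for illustration at this size.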
Data availability
SMILES and rewards used for all case studies (refs. 32, 33, 37) can be found at github.com/coleygroup/sparrow/tree/main/examples. All results can be reproduced using the configuration files included in the same repository (ref. 52). Source data are provided with this paper.
Code availability
SPARROW is open source and can be found at github.com/coleygroup/sparrow (ref. 52). All code and retrosynthetic routes from ASKCOS used to generate the described results can be found at github.com/coleygroup/sparrow/tree/main/examples. Full candidate sets with configuration files are included in this repository both for reproducibility and as examples for use of SPARROW.
References
1. Gao, W. & Coley, C. W. The synthesizability of molecules proposed by generative models. J. Chem. Inf. Model. 60, 5714–5723 (2020).
2. Méndez-Lucio, O., Baillif, B., Clevert, D.-A., Rouquié, D. & Wichard, J. De novo generation of hit-like molecules from gene expression signatures using artificial intelligence. Nat. Commun. 11, 10 (2020).
3. Ertl, P. & Schuffenhauer, A. Estimation of synthetic accessibility score of drug-like molecules based on molecular complexity and fragment contributions. J. Cheminform. 1, 8 (2009).
4. Coley, C. W., Rogers, L., Green, W. H. & Jensen, K. F. SCScore: synthetic complexity learned from a reaction corpus. J. Chem. Inf. Model. 58, 252–261 (2018).
5. Thakkar, A., Chadimová, V., Bjerrum, E. J., Engkvist, O. & Reymond, J.-L. Retrosynthetic Accessibility Score (RAscore)—rapid machine learned synthesizability classification from AI driven retrosynthetic planning. Chem. Sci. 12, 3339–3349 (2021).
6. Liu, C.-H. et al. RetroGNN: fast estimation of synthesizability for virtual screening and de novo design by learning from slow retrosynthesis software. J. Chem. Inf. Model. 62, 2293–2300 (2022).
7. Andersson, S. et al. Making medicinal chemistry more effective—application of Lean Sigma to improve processes, speed and quality. Drug Discov. Today 14, 598–604 (2009).
8. Segler, M. H. S., Preuss, M. & Waller, M. P. Planning chemical syntheses with deep neural networks and symbolic AI. Nature 555, 604–610 (2018).
9. Coley, C. W. et al. A robotic platform for flow synthesis of organic compounds informed by AI planning. Science 365, eaax1566 (2019).
10. Genheden, S. et al. AiZynthFinder: a fast, robust and flexible open-source software for retrosynthetic planning. J. Cheminform. 12, 70 (2020).
11. Badowski, T., Molga, K. A. & Grzybowski, B. Selection of cost-effective yet chemically diverse pathways from the networks of computer-generated retrosynthetic plans. Chem. Sci. 10, 4640–4651 (2019).
12. Gao, W., Mercado, R. & Coley, C. W. Amortized tree generation for bottom-up synthesis planning and synthesizable molecular design. In International Conference on Learning Representations https://openreview.net/forum?id=FRxhHdnxt1 (OpenReview.net, 2022).
13. Zhang, Q., Liu, C., Wu, S., Hayashi, Y. & Yoshida, R. A Bayesian method for concurrently designing molecules and synthetic reaction networks. Sci. Technol. Adv. Mater. Methods 3, 2204994 (2023).
14. Breznik, M. et al. Prioritizing small sets of molecules for synthesis through in-silico tools: a comparison of common ranking methods. ChemMedChem 18, e202200425 (2023).
15. Frazier, P. I. Bayesian Optimization. INFORMS TutORials in Operations Research https://doi.org/10.1287/educ.2018.0188 (2018).
16. Shahriari, B., Swersky, K., Wang, Z., Adams, R. P. & de Freitas, N. Taking the human out of the loop: a review of Bayesian optimization. Proc. IEEE 104, 148–175 (2016).
17. Korovina, K. et al. ChemBO: Bayesian optimization of small organic molecules with synthesizable recommendations. In Proc. Twenty Third International Conference on Artificial Intelligence and Statistics (eds Chiappa, S. & Calandra, R.) 3393–3403 (PMLR, 2020).
18. Pyzer-Knapp, E. O. Bayesian optimization for accelerated drug discovery. IBM J. Res. Dev. 62, 2:1–2:7 (2018).
19. Sasena, M. J. Flexibility and Efficiency Enhancements for Constrained Global Design Optimization with Kriging Approximations. PhD Thesis, Univ. of Michigan (2002).
20. Huang, D., Allen, T. T., Notz, W. I. & Miller, R. A. Sequential Kriging optimization using multiple-fidelity evaluations. Struct. Multidiscip. Optim. 32, 369–382 (2006).
21. Palizhati, A. et al. Agents for sequential learning using multiple-fidelity data. Sci. Rep. 12, 4694 (2022).
22. Zanjani Foumani, Z., Shishehbor, M., Yousefpour, A. & Bostanabad, R. Multi-fidelity cost-aware Bayesian optimization. Comput. Methods Appl. Mech. Eng. 407, 115937 (2023).
23. Molga, K., Dittwald, P. & Grzybowski, B. A. Computational design of syntheses leading to compound libraries or isotopically labelled targets. Chem. Sci. 10, 9219–9232 (2019).
24. Gao, H., Pauphilet, J., Struble, T. J., Coley, C. W. & Jensen, K. F. Direct optimization across computer-generated reaction networks balances materials use and feasibility of synthesis plans for molecule libraries. J. Chem. Inf. Model. 61, 493–504 (2021).
25. Gao, H. et al. Combining retrosynthesis and mixed-integer optimization for minimizing the chemical inventory needed to realize a WHO essential medicines list. Reaction Chem. Eng. 5, 367–376 (2020).
26. Marvin, W. A., Rangarajan, S. & Daoutidis, P. Automated generation and optimal selection of biofuel-gasoline blends and their synthesis routes. Energy Fuels 27, 3585–3594 (2013).
27. Dahmen, M. & Marquardt, W. Model-based formulation of biofuel blends by simultaneous product and pathway design. Energy Fuels 31, 4096–4121 (2017).
28. König, A., Neidhardt, L., Viell, J., Mitsos, A. & Dahmen, M. Integrated design of processes and products: optimal renewable fuels. Comput. Chem. Eng. 134, 106712 (2020).
29. Adjiman, C. S. et al. Process systems engineering perspective on the design of materials and molecules. Ind. Eng. Chem. Res. 60, 5194–5206 (2021).
30. Coley, C. W., Barzilay, R., Jaakkola, T. S., Green, W. H. & Jensen, K. F. Prediction of organic reaction outcomes using machine learning. ACS Central Sci. 3, 434–443 (2017).
31. Chemspace Services: Compound Sourcing and Procurement, Hit Discovery, Molecular Docking, Custom Synt; https://chem-space.com/services (accessed October 2023).
32. Garibsingh, R.-A. A. et al. Rational design of ASCT2 inhibitors using an integrated experimental-computational approach. Proc. Natl Acad. Sci. USA 118, e2104093118 (2021).
33. Koscher, B. A. et al. Autonomous, multiproperty-driven molecular discovery: from predictions to measurements and back. Science 382, eadi1407 (2023).
34. Barry, C. E. Lessons from seven decades of antituberculosis drug discovery. Curr. Topics Med. Chem. 11, 1216–1225 (2011).
35. Wesolowski, S. S. & Brown, D. G. Lead Generation 487–512 (John Wiley & Sons, 2016).
36. Brown, D. G. & Boström, J. Analysis of past and present synthetic methodologies on medicinal chemistry: where have all the new reactions gone? J. Med. Chem. 59, 4443–4458 (2016).
37. Button, A., Merk, D., Hiss, J. A. & Schneider, G. Automated de novo molecular design by hybrid machine intelligence and rule-driven chemical synthesis. Nat. Mach. Intell. 1, 307–315 (2019).
38. Dunning, I., Mitchell, S. & O’Sullivan, M. PuLP: A Linear Programming Toolkit for Python (Univ. Auckland, 2011).
39. Forrest, J. et al. coin-or/Cbc: release releases/2.10.11 (2023); https://zenodo.org/doi/10.5281/zenodo.2720283 (accessed October 2023).
40. Klotz, E. & Newman, A. M. Practical guidelines for solving difficult linear programs. Surveys Oper. Res. Manag. Sci. 18, 1–17 (2013).
41. Klotz, E. in Bridging Data and Decisions, INFORMS TutORials in Operations Research (eds Newman, A. & Leung, J.) 54–108 (INFORMS, 2014).
42. Benders, J. F. Partitioning procedures for solving mixed-variables programming problems. Numer. Math. 4, 238–252 (1962).
43. Grzybowski, B. A., Badowski, T., Molga, K. & Szymkuć, S. Network search algorithms and scoring functions for advanced-level computerized synthesis planning. WIREs Comput. Mol. Sci. 13, e1630 (2023).
44. Wen, M. et al. Chemical reaction networks and opportunities for machine learning. Nat. Comput. Sci. 3, 12–24 (2023).
45. Levin, I., Fortunato, M. E., Tan, K. L. & Coley, C. W. Computer-aided evaluation and exploration of chemical spaces constrained by reaction pathways. AIChE J. 69, e18234 (2023).
46. Götz, J. et al. High-throughput synthesis provides data for predicting molecular properties and reaction success. Sci. Adv. 9, eadj2314 (2023).
47. Casetti, N., Alfonso-Ramos, J. E., Coley, C. W. & Stuyver, T. Combining molecular quantum mechanical modeling and machine learning for accelerated reaction screening and discovery. Chem. A Eur. J. 29, e202301957 (2023).
48. Pasquini, M. & Stenta, M. LinChemIn: Syngraph—a data model and a toolkit to analyze and compare synthetic routes. J. Cheminform. 15, 41 (2023).
49. Pasquini, M. & Stenta, M. LinChemIn: route arithmetic-operations on digital synthetic routes. J. Chem. Inf. Model. 64, 1765–1771 (2024).
50. Gao, H. et al. Using machine learning to predict suitable conditions for organic reactions. ACS Central Sci. 4, 1465–1476 (2018).
51. Coley, C. et al. A graph-convolutional neural network model for the prediction of chemical reactivity. Chem. Sci. 10, 370–377 (2019).
52. Fromer, J. & Coley, C. coleygroup/sparrow: v1.0.0 (2024); https://zenodo.org/doi/10.5281/zenodo.11068069
Acknowledgements
This work was supported by the DARPA Accelerated Molecular Discovery program (contract no. HR00111920025) and the Office of Naval Research (grant no. N00014-21-1-2195). J.C.F. received additional support from the National Science Foundation Graduate Research Fellowship (grant no. 2141064). We are grateful to M. Stenta, M. Pasquini, D. Jimenez and T. Ziegler for participating in discussions that guided the development of SPARROW. We are also grateful to M. A. McDonald, B. Koscher, R. Canty and the remaining authors of ref. 33 for providing the candidate set for case 2. Finally, we thank B. Mahjour and A. Zhang for providing insight into the validity of reactions and conditions proposed by retrosynthetic software.
Author information
Contributions
C.W.C. and J.C.F. conceptualized the project, validated the method, analyzed results and wrote the paper. J.C.F. curated the data and wrote the software. C.W.C. supervised the work.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Computational Science thanks Mingyue Zheng and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Kaitlin McCardle, in collaboration with the Nature Computational Science team.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Information
Supplementary Fig. 1 and Tables 1–4.
Supplementary Data 1
Starting material prices from the ChemSpace API in October 2023 and March 2024 used in the first case study, plotted in Supplementary Fig. 1a.
Supplementary Data 2
Starting material prices from the ChemSpace API in October 2023 and March 2024 used in the second case study, plotted in Supplementary Fig. 1b.
Source data
Source Data Fig. 3
Numerical source data; reaction SMILES, scores and conditions
Source Data Fig. 4
Numerical source data for a–d
Source Data Fig. 5
Reaction SMILES, scores and conditions
Source Data Fig. 6
Reaction SMILES, scores and conditions
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Fromer, J.C., Coley, C.W. An algorithmic framework for synthetic cost-aware decision making in molecular design. Nat Comput Sci (2024). https://doi.org/10.1038/s43588-024-00639-y
- Received: 20 December 2023
- Accepted: 07 May 2024
- Published: 17 June 2024
- DOI: https://doi.org/10.1038/s43588-024-00639-y