References
Adilov, Sanjar. 2021. “Generative
Pre-Training from Molecules.” ChemRxiv Preprint,
September. https://doi.org/10.26434/chemrxiv-2021-5fwjd.
Ahmad, Walid, Elana Simon, Seyone Chithrananda, Gabriel Grand, and
Bharath Ramsundar. 2022. “ChemBERTa-2:
Towards Chemical Foundation Models.” arXiv
Preprint. https://doi.org/10.48550/arXiv.2209.01712.
Ahn, Michael, Anthony Brohan, Noah Brown, Yevgen Chebotar, Omar Cortes,
Byron David, Chelsea Finn, et al. 2022. “Do As I Can, Not As I
Say: Grounding Language in Robotic Affordances.” arXiv
Preprint. https://doi.org/10.48550/arXiv.2204.01691.
Ai, Qianxiang, Fanwang Meng, Jiale Shi, Brenden Pelkie, and Connor W
Coley. 2024. “Extracting Structured Data from Organic Synthesis
Procedures Using a Fine-Tuned Large Language Model.” Digital
Discovery 3 (9): 1822–31. https://doi.org/10.1039/d4dd00091a.
Alampara, Nawaf, Santiago Miret, and Kevin Maik Jablonka. 2024.
“MatText: Do Language Models Need More Than
Text & Scale for Materials Modeling?” arXiv
Preprint. https://doi.org/10.48550/arXiv.2406.17295.
Alampara, Nawaf, Mara Schilling-Wilhelmi, and Kevin Maik Jablonka. 2025.
“Lessons from the Trenches on Evaluating
Machine-Learning Systems in Materials Science.” arXiv
Preprint. https://doi.org/10.48550/arXiv.2503.10837.
Alampara, Nawaf, Mara Schilling-Wilhelmi, Martiño Rı́os-Garcı́a, Indrajeet
Mandal, Pranav Khetarpal, Hargun Singh Grover, NM Krishnan, and Kevin
Maik Jablonka. 2024. “Probing the Limitations
of Multimodal Language Models for Chemistry and Materials
Research.” arXiv Preprint. https://doi.org/10.48550/arXiv.2411.16955.
Alberts, Bruce. 2002. Molecular Biology of the Cell. 4th ed.
Garland Science.
Alberts, Marvin, Oliver Schilter, Federico Zipoli, Nina Hartrampf, and
Teodoro Laino. 2024. “Unraveling Molecular Structure: A Multimodal
Spectroscopic Dataset for Chemistry.” arXiv Preprint. https://doi.org/10.48550/arXiv.2407.17492.
Altmäe, Signe, Alberto Sola-Leyva, and Andres Salumets. 2023.
“Artificial Intelligence in Scientific
Writing: A Friend or a Foe?” Reproductive BioMedicine
Online 47 (1): 3–9. https://doi.org/10.1016/j.rbmo.2023.04.009.
Amin, Ishan, Sanjeev Raja, and Aditi Krishnapriyan. 2025. “Towards Fast, Specialized Machine Learning Force Fields:
Distilling Foundation Models via Energy Hessians.”
arXiv Preprint. https://doi.org/10.48550/arXiv.2501.09009.
Ananthanarayanan, Vaishnav, and William Thies. 2010. “BioCoder: A
Programming Language for Standardizing and Automating Biology
Protocols.” Journal of Biological Engineering 4: 13. https://doi.org/10.1186/1754-1611-4-13.
Aneesh, Anagha, Nawaf Alampara, José A. Márquez, and Kevin Maik
Jablonka. 2025. “Semantic Device Graphs for Perovskite Solar Cell
Design.” The Thirteenth International Conference on Learning
Representations Workshop on AI for Materials Science,
ICLR-AI4MAT. https://openreview.net/forum?id=AGCClISEXL.
Ansari, Mehrad, and Seyed Mohamad Moosavi. 2024. “Agent-Based
Learning of Materials Datasets from the Scientific Literature.”
Digital Discovery 3 (12): 2607–17. https://doi.org/10.1039/D4DD00252K.
Ansari, Mehrad, Jeffrey Watchorn, Carla E. Brown, and Joseph S. Brown.
2024. “dZiner: Rational
Inverse Design of Materials with
AI Agents.” arXiv Preprint,
October. https://doi.org/10.48550/arXiv.2410.03963.
Anthropic. 2025a. “Claude for Education |
Partnering with Universities on Responsible AI.” https://www.anthropic.com/education.
———. 2025b. “System Card: Claude
Opus 4 & Claude Sonnet
4.” Anthropic. https://www-cdn.anthropic.com/6be99a52cb68eb70eb9572b4cafad13df32ed995.pdf.
Antunes, Luis M., Keith T. Butler, and Ricardo Grau-Crespo. 2024.
“Crystal Structure Generation with Autoregressive Large Language
Modeling.” Nature Communications 15 (1). https://doi.org/10.1038/s41467-024-54639-7.
Arlt, Sören, Haonan Duan, Felix Li, Sang Michael Xie, Yuhuai Wu, and
Mario Krenn. 2024. “Meta-Designing Quantum Experiments with
Language Models.” arXiv Preprint arXiv: 2406.02470. https://doi.org/10.48550/arXiv.2406.02470.
Arús-Pous, Josep, Simon Viet Johansson, Oleksii Prykhodko, Esben Jannik
Bjerrum, Christian Tyrchan, Jean-Louis Reymond, Hongming Chen, and Ola
Engkvist. 2019. “Randomized SMILES Strings Improve the Quality of
Molecular Generative Models.” Journal of Cheminformatics
11: 1–13. https://doi.org/10.1186/s13321-019-0393-0.
Atz, Kenneth, Leandro Cotos, Clemens Isert, Maria Håkansson, Dorota
Focht, Mattis Hilleke, David F Nippa, et al. 2024. “Prospective de
Novo Drug Design with Deep Interactome Learning.” Nature
Communications 15 (1): 3408. https://doi.org/10.1038/s41467-024-47613-w.
Bai, Yuntao, Saurav Kadavath, Sandipan Kundu, Amanda Askell, Jackson
Kernion, Andy Jones, Anna Chen, et al. 2022. “Constitutional
AI: Harmlessness from AI
Feedback.” arXiv Preprint, December. https://doi.org/10.48550/arXiv.2212.08073.
Baillargeon, Jean-Thomas, and Luc Lamontagne. 2022. “Assessing the
Impact of Sequence Length Learning on Classification Tasks for
Transformer Encoder Models.” The Florida AI Research
Society. https://doi.org/10.32473/flairs.37.1.135283.
Balaji, Suryanarayanan, Rishikesh Magar, Yayati Jadhav, and Amir Barati
Farimani. 2023. “GPT-MolBERTa:
GPT Molecular Features
Language Model for Molecular Property
Prediction.” arXiv Preprint arXiv:2310.03030, October.
https://doi.org/10.48550/arXiv.2310.03030.
Baral, Sami, Li Lucy, Ryan Knight, Alice Ng, Luca Soldaini, Neil T.
Heffernan, and Kyle Lo. 2025. “DrawEduMath:
Evaluating Vision Language Models with Expert-Annotated Students’
Hand-Drawn Math Images.” arXiv Preprint arXiv:
2501.14877. https://doi.org/10.48550/arXiv.2501.14877.
Barez, Fazl, Tingchen Fu, Ameya Prabhu, Stephen Casper, Amartya Sanyal,
Adel Bibi, Aidan O’Gara, et al. 2025. “Open Problems in Machine
Unlearning for AI Safety.” arXiv Preprint arXiv:
2501.04952.
Batatia, Ilyes, Philipp Benner, Yuan Chiang, Alin M Elena, Dávid P
Kovács, Janosh Riebesell, Xavier R Advincula, et al. 2023. “A
Foundation Model for Atomistic Materials Chemistry.” arXiv
Preprint arXiv:2401.00096. https://doi.org/10.48550/arXiv.2401.00096.
Batatia, Ilyes, Dávid P. Kovács, Gregor N. C. Simm, Christoph Ortner, and Gábor Csányi. 2022.
“MACE: Higher Order Equivariant Message Passing Neural Networks
for Fast and Accurate Force Fields.” Neural Information
Processing Systems. https://doi.org/10.48550/arXiv.2206.07697.
Batzner, Simon, Albert Musaelian, Lixin Sun, Mario Geiger, Jonathan P.
Mailoa, Mordechai Kornbluth, Nicola Molinari, Tess E. Smidt, and Boris
Kozinsky. 2022. “E(3)-Equivariant Graph
Neural Networks for Data-Efficient and Accurate Interatomic
Potentials.” Nature Communications 13 (1). https://doi.org/10.1038/s41467-022-29939-5.
Beltagy, Iz, Kyle Lo, and Arman Cohan. 2019. “SciBERT: A
Pretrained Language Model for Scientific Text.” Conference on
Empirical Methods in Natural Language Processing. https://doi.org/10.18653/v1/D19-1371.
Bender, Emily M, Timnit Gebru, Angelina McMillan-Major, and Shmargaret
Shmitchell. 2021. “On the Dangers of Stochastic Parrots: Can
Language Models Be Too Big?” Proceedings of the 2021 ACM
Conference on Fairness, Accountability, and Transparency, 610–23.
https://doi.org/10.1145/3442188.3445922.
Bengio, Yoshua, Sören Mindermann, Daniel Privitera, Tamay Besiroglu,
Rishi Bommasani, Stephen Casper, Yejin Choi, et al. 2025.
“International AI Safety Report.” arXiv Preprint arXiv:
2501.17805. https://doi.org/10.48550/arXiv.2501.17805.
Bengio, Yoshua, Li Yao, Guillaume Alain, and Pascal Vincent. 2013.
“Generalized Denoising Auto-Encoders as Generative Models.”
Advances in Neural Information Processing Systems 26. https://doi.org/10.48550/arXiv.1305.6663.
Bhattacharya, Debjyoti, Harrison J. Cassady, Michael A. Hickner, and
Wesley F. Reinhart. 2024. “Large Language
Models as Molecular Design
Engines.” Journal of Chemical Information and
Modeling 64 (18): 7086–96. https://doi.org/10.1021/acs.jcim.4c01396.
Bhuiyan, Johana. 2025. “Google Undercounts Its Carbon Emissions,
Report Finds.” https://www.theguardian.com/technology/2025/jul/02/google-carbon-emissions-report.
Bjerrum, Esben Jannik. 2017. “SMILES Enumeration as Data
Augmentation for Neural Network Modeling of Molecules.” arXiv
Preprint arXiv:1703.07076. https://doi.org/10.48550/arXiv.1703.07076.
Bloomfield, Doni, Jaspreet Pannu, Alex W. Zhu, Madelena Y. Ng, Ashley
Lewis, Eran Bendavid, Steven M. Asch, Tina Hernandez-Boussard, Anita
Cicero, and Tom Inglesby. 2024. “AI and Biosecurity:
The Need for Governance.” Science 385
(6711): 831–33. https://doi.org/10.1126/science.adq1977.
Nature Computational Science Editorial Board. 2023. “The Carbon
Footprint of Computational Research.” Nature Computational
Science 3 (8): 659. https://doi.org/10.1038/s43588-023-00506-2.
Boiko, Daniil A, Robert MacKnight, Ben Kline, and Gabe Gomes. 2023.
“Autonomous Chemical Research with Large
Language Models.” Nature 624 (7992): 570–78. https://doi.org/10.1038/s41586-023-06792-0.
Bojanowski, Piotr, Edouard Grave, Armand Joulin, and Tomas Mikolov.
2017. “Enriching Word Vectors with Subword Information.”
Transactions of the Association for Computational Linguistics
5: 135–46. https://doi.org/10.1162/tacl_a_00051.
Bommasani, Rishi, Drew A Hudson, Ehsan Adeli, Russ Altman, Simran Arora,
Sydney von Arx, Michael S Bernstein, et al. 2021. “On the
Opportunities and Risks of Foundation Models.” arXiv Preprint
arXiv:2108.07258. https://doi.org/10.48550/arXiv.2108.07258.
Bonet, Blai, and Hector Geffner. 2012. “Action Selection for MDPs:
Anytime AO* Versus UCT.” Proceedings of the AAAI Conference on
Artificial Intelligence 26 (1): 1749–55. https://doi.org/10.1609/aaai.v26i1.8369.
Born, Jannis, Greta Markert, Nikita Janakarajan, Talia B Kimber, Andrea
Volkamer, Marı́a Rodrı́guez Martı́nez, and Matteo Manica. 2023.
“Chemical Representation Learning for Toxicity Prediction.”
Digital Discovery 2 (3): 674–91. https://doi.org/10.1039/d2dd00099g.
Bouritsas, Giorgos, Fabrizio Frasca, Stefanos Zafeiriou, and Michael M
Bronstein. 2022. “Improving Graph Neural Network Expressivity via
Subgraph Isomorphism Counting.” IEEE Transactions on Pattern
Analysis and Machine Intelligence 45 (1): 657–68. https://doi.org/10.1109/TPAMI.2022.3154319.
Bran, Andres M., Sam Cox, Oliver Schilter, Carlo Baldassari, Andrew D.
White, and Philippe Schwaller. 2024. “Augmenting Large Language
Models with Chemistry Tools.” Nature Machine
Intelligence 6 (5). https://doi.org/10.1038/s42256-024-00832-8.
Breunig, Drew. 2025. “How to Fix Your Context.” https://www.dbreunig.com/2025/06/26/how-to-fix-your-context.html.
Brinkhaus, Henning Otto, Kohulan Rajan, Achim Zielesny, and Christoph
Steinbeck. 2022. “RanDepict: Random Chemical
Structure Depiction Generator.” Journal of
Cheminformatics 14 (1): 31. https://doi.org/10.1186/s13321-022-00609-4.
Brown, Nathan, Marco Fiscato, Marwin H. S. Segler, and Alain C. Vaucher.
2019. “GuacaMol: Benchmarking Models for de Novo Molecular
Design.” Journal of Chemical Information and Modeling 59
(3): 1096–1108. https://doi.org/10.1021/acs.jcim.8b00839.
Brown, Tom, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan,
Prafulla Dhariwal, Arvind Neelakantan, et al. 2020. “Language Models Are Few-Shot Learners.”
Advances in Neural Information Processing Systems 33:
1877–1901. https://doi.org/10.48550/arXiv.2005.14165.
Bucior, Benjamin J., Andrew S. Rosen, Maciej Haranczyk, Zhenpeng Yao,
Michael E. Ziebel, Omar K. Farha, Joseph T. Hupp, J. Ilja Siepmann, Alán
Aspuru-Guzik, and Randall Q. Snurr. 2019. “Identification Schemes for Metal-Organic Frameworks To
Enable Rapid Search and Cheminformatics Analysis.”
Crystal Growth & Design 19 (11): 6682–97. https://doi.org/10.1021/acs.cgd.9b01050.
Butler, Keith T., Daniel W. Davies, Hugh Cartwright, Olexandr Isayev,
and Aron Walsh. 2018. “Machine Learning for Molecular and
Materials Science.” Nature 559 (7715): 547–55. https://doi.org/10.1038/s41586-018-0337-2.
Cai, Feiyang, Jiahui Bai, Tao Tang, Joshua Luo, Tianyu Zhu, Ling Liu,
and Feng Luo. 2025. “MolLangBench: A Comprehensive Benchmark for
Language-Prompted Molecular Structure Recognition, Editing, and
Generation.” arXiv Preprint. https://doi.org/10.48550/arxiv.2505.15054.
Cai, Hengxing, Xiaochen Cai, Junhan Chang, Sihang Li, Lin Yao, Changxin
Wang, Zhifeng Gao, et al. 2024. “SciAssess:
Benchmarking LLM Proficiency in Scientific Literature
Analysis.” arXiv Preprint arXiv: 2403.01976. https://doi.org/10.48550/arXiv.2403.01976.
Calanzone, Diego, Pierluca D’Oro, and Pierre-Luc Bacon. 2025.
“Mol-MoE: Training
Preference-Guided Routers for
Molecule Generation.” arXiv
Preprint arXiv:2502.05633, February. https://doi.org/10.48550/arXiv.2502.05633.
Campbell, Quintina, Sam Cox, Jorge Medina, Brittany Watterson, and
Andrew D. White. 2025. “MDCrow: Automating Molecular Dynamics
Workflows with Large Language Models.” arXiv Preprint
arXiv:2502.09565. https://doi.org/10.48550/arXiv.2502.09565.
Cao, He, Zijing Liu, Xingyu Lu, Yuan Yao, and Yu Li. 2023.
“InstructMol: Multi-Modal Integration for Building a Versatile and
Reliable Molecular Assistant in Drug Discovery.” arXiv
Preprint arXiv: 2311.16208. https://doi.org/10.48550/arXiv.2311.16208.
Cao, Shuxiang, Zijian Zhang, Mohammed Alghadeer, Simone D Fasciati,
Michele Piscitelli, Mustafa Bakr, Peter Leek, and Alán Aspuru-Guzik.
2024. “Agents for Self-Driving Laboratories
Applied to Quantum Computing.” arXiv Preprint. https://doi.org/10.48550/arXiv.2412.07978.
Cao, Zhendong, Xiaoshan Luo, Jian Lv, and Lei Wang. 2024. “Space
Group Informed Transformer for Crystalline Materials Generation.”
arXiv Preprint arXiv: 2403.15734. https://doi.org/10.48550/arXiv.2403.15734.
Cao, Zhendong, and Lei Wang. 2025.
“CrystalFormer-RL:
Reinforcement Fine-Tuning for
Materials Design.” arXiv Preprint
arXiv:2504.02367, April. https://doi.org/10.48550/arXiv.2504.02367.
“Career Update: Google DeepMind ->
Anthropic.” 2025. https://nicholas.carlini.com/writing/2025/career-update.html.
Carlson, James, Arthur Jaffe, and Andrew Wiles, eds. 2006. The
Millennium Prize Problems. Providence, RI: American Mathematical
Society & Clay Mathematics Institute.
Caron, Mathilde, Piotr Bojanowski, Armand Joulin, and Matthijs Douze.
2018. “Deep Clustering for Unsupervised
Learning of Visual Features.” arXiv Preprint arXiv:
1807.05520. https://doi.org/10.48550/arXiv.1807.05520.
Cassani, Andrea, Alessandro Monteverde, and Marco Piumetti. 2021.
“Belousov–Zhabotinsky Type Reactions: The Non-Linear Behavior of
Chemical Systems.” Journal of Mathematical Chemistry 59
(3): 792–826. https://doi.org/10.1007/s10910-021-01223-9.
Cavanagh, Joseph M., Kunyang Sun, Andrew Gritsevskiy, Dorian Bagni,
Thomas D. Bannister, and Teresa Head-Gordon. 2024. “SmileyLlama:
Modifying Large Language Models for Directed Chemical Space
Exploration.” arXiv Preprint arXiv: 2409.02231. https://doi.org/10.48550/arXiv.2409.02231.
CERN. 2024. “CERN Publishes Its First Nuclear
Safeguards Policy.” Official News Release. https://home.cern/news/official-news/cern/cern-publishes-its-first-nuclear-safeguards-policy.
Chacko, Edwin, Rudra Sondhi, Arnav Praveen, Kylie L Luska, and Rodrigo
Alejandro Vargas Hernandez. 2024. “Spectro: A Multi-Modal Approach
for Molecule Elucidation Using IR and NMR Data.” ChemRxiv
Preprint. https://doi.org/10.26434/chemrxiv-2024-37v2j.
Chan, Jun Shern, Neil Chowdhury, Oliver Jaffe, James Aung, Dane
Sherburn, Evan Mays, Giulio Starace, et al. 2024. “MLE-Bench:
Evaluating Machine Learning Agents on Machine Learning
Engineering.” arXiv Preprint arXiv:2410.07095. https://doi.org/10.48550/arXiv.2410.07095.
Charalambous, Charithea, Elias Moubarak, Johannes Schilling, Eva Sanchez
Fernandez, Jin-Yu Wang, Laura Herraiz, Fergus Mcilwaine, et al. 2024.
“A Holistic Platform for Accelerating
Sorbent-Based Carbon Capture.” Nature 632 (8023):
89–94. https://doi.org/10.1038/s41586-024-07683-8.
Chen, Chi, and Shyue Ping Ong. 2022. “A Universal Graph Deep
Learning Interatomic Potential for the Periodic Table.”
Nature Computational Science 2 (11): 718–28. https://doi.org/10.1038/s43588-022-00349-3.
Chen, Kexin, Hanqun Cao, Junyou Li, Yuyang Du, Menghao Guo, Xin Zeng,
Lanqing Li, Jiezhong Qiu, Pheng Ann Heng, and Guangyong Chen. 2024.
“An Autonomous Large Language Model Agent for Chemical Literature
Data Mining.” arXiv Preprint arXiv: 2402.12993. https://doi.org/10.48550/arXiv.2402.12993.
Chen, Kexin, Junyou Li, Kunyi Wang, Yuyang Du, Jiahui Yu, Jiamin Lu,
Lanqing Li, et al. 2023. “Chemist-X: Large Language
Model-Empowered Agent for Reaction Condition Recommendation in Chemical
Synthesis.” arXiv Preprint arXiv:2311.10776. https://doi.org/10.48550/arXiv.2311.10776.
Chen, Lichang, Jiuhai Chen, Tom Goldstein, Heng Huang, and Tianyi Zhou.
2024. “InstructZero: Efficient Instruction Optimization for
Black-Box Large Language Models.” Forty-First International
Conference on Machine Learning, ICML 2024. https://openreview.net/forum?id=rADFNrIss3.
Chen, Pengzhan, Jiean Pei, Weiqing Lu, and Mingzhen Li. 2022.
“A Deep Reinforcement Learning Based Method
for Real-Time Path Planning and Dynamic Obstacle
Avoidance.” Neurocomputing 497: 64–75. https://doi.org/10.1016/j.neucom.2022.05.006.
Chen, Richard J., Judy J. Wang, Drew F. K. Williamson, Tiffany Y. Chen,
Jana Lipkova, Ming Y. Lu, Sharifa Sahai, and Faisal Mahmood. 2023.
“Algorithmic Fairness in Artificial Intelligence for Medicine and
Healthcare.” Nature Biomedical Engineering. https://doi.org/10.1038/s41551-023-01056-8.
Chen, Weize, Yusheng Su, Jingwei Zuo, Cheng Yang, Chenfei Yuan, Chi-Min
Chan, Heyang Yu, et al. 2023. “AgentVerse: Facilitating
Multi-Agent Collaboration and Exploring Emergent Behaviors.”
arXiv Preprint. https://doi.org/10.48550/arXiv.2308.10848.
Cheng, Austin H, Andy Cai, Santiago Miret, Gustavo Malkomes, Mariano
Phielipp, and Alán Aspuru-Guzik. 2023. “Group SELFIES: A Robust
Fragment-Based Molecular String Representation.” Digital
Discovery 2 (3): 748–58. https://doi.org/10.1039/D3DD00012E.
Chennakesavalu, Shriram, Frank Hu, Sebastian Ibarraran, and Grant M.
Rotskoff. 2025. “Aligning Transformers with
Continuous Feedback via Energy
Rank Alignment.” arXiv Preprint
arXiv:2405.12961, May. https://doi.org/10.48550/arXiv.2405.12961.
Chiang, Yuan, Elvis Hsieh, Chia-Hong Chou, and Janosh Riebesell. 2024.
“LLaMP: Large
Language Model Made
Powerful for High-fidelity
Materials Knowledge Retrieval and
Distillation.” arXiv Preprint, October. https://doi.org/10.48550/arXiv.2401.17244.
Chirkova, Nadezhda, Thibault Formal, Vassilina Nikoulina, and Stéphane
Clinchant. 2025. “Provence: Efficient and Robust Context Pruning
for Retrieval-Augmented Generation.” arXiv Preprint. https://doi.org/10.48550/arXiv.2501.16214.
Chithrananda, Seyone, Gabriel Grand, and Bharath Ramsundar. 2020.
“ChemBERTa:
Large-Scale
Self-Supervised Pretraining for
Molecular Property
Prediction.” arXiv Preprint, October. https://doi.org/10.48550/arXiv.2010.09885.
Choi, Jae-Woo, Youngwoo Yoon, Hyobin Ong, Jaehong Kim, and Minsu Jang.
2024. “LoTa-Bench: Benchmarking Language-Oriented Task Planners
for Embodied Agents.” arXiv Preprint arXiv:2402.08178.
https://doi.org/10.48550/arXiv.2402.08178.
Chowdhery, Aakanksha, Sharan Narang, Jacob Devlin, Maarten Bosma, Gaurav
Mishra, Adam Roberts, Paul Barham, et al. 2023. “PaLM: Scaling
Language Modeling with Pathways.” Journal of Machine Learning
Research 24 (240): 1–113. https://doi.org/10.48550/arXiv.2204.02311.
Christofidellis, Dimitrios, Giorgio Giannone, Jannis Born, Ole Winther,
Teodoro Laino, and Matteo Manica. 2023. “Unifying Molecular and
Textual Representations via Multi-Task Language Modelling.”
International Conference on Machine Learning, ICML
2023, Proceedings of machine learning research, 202: 6140–57. https://doi.org/10.48550/arXiv.2301.12586.
Chu, Johan S. G., and James A. Evans. 2021. “Slowed Canonical
Progress in Large Fields of Science.” Proceedings of the
National Academy of Sciences 118 (41). https://doi.org/10.1073/pnas.2021636118.
Chuang, Kangway V, and Michael J Keiser. 2018. “Comment on
‘Predicting Reaction Performance in C–N Cross-Coupling Using
Machine Learning’.” Science 362 (6416): eaat8603.
https://doi.org/10.1126/science.aat8603.
Cissé, Abdoulatif, Xenophon Evangelopoulos, Vladimir V. Gusev, and
Andrew I. Cooper. 2025. “Language-Based Bayesian Optimization
Research Assistant (BORA).” arXiv Preprint arXiv:
2501.16224. https://doi.org/10.48550/arXiv.2501.16224.
Clune, Jeff. 2019. “AI-GAs: AI-Generating Algorithms, an Alternate
Paradigm for Producing General Artificial Intelligence.”
arXiv Preprint arXiv: 1905.10985. https://doi.org/10.48550/arXiv.1905.10985.
Coley, Connor W, Natalie S Eyke, and Klavs F Jensen. 2020.
“Autonomous Discovery in the Chemical Sciences Part I:
Progress.” Angewandte Chemie International Edition 59
(51): 22858–93. https://doi.org/10.1002/anie.201909987.
Coley, Connor W, Dale A Thomas III, Justin AM Lummiss, Jonathan N
Jaworski, Christopher P Breen, Victor Schultz, Travis Hart, et al. 2019.
“A Robotic Platform for Flow Synthesis of
Organic Compounds Informed by AI Planning.”
Science 365 (6453): eaax1566. https://doi.org/10.1126/science.aax1566.
“Common Crawl.” 2024. https://commoncrawl.org.
Conrad, Stefan, Philipp Auth, Tom Masselter, and Thomas Speck. 2025.
“Lowering the Entrance Hurdle for Lab Automation: An Artificial
Intelligence‐supported, Interactive Robotic Arm for Automated, Repeated
Testing Procedures.” Advanced Intelligent Systems. https://doi.org/10.1002/aisy.202401086.
Corey, Elias J, Richard D Cramer III, and W Jeffrey Howe. 1972.
“Computer-Assisted Synthetic Analysis for
Complex Molecules. Methods and Procedures for Machine Generation of
Synthetic Intermediates.” Journal of the American
Chemical Society 94 (2): 440–59. https://doi.org/10.1021/ja00757a022.
Crawford, Kate. 2021. The Atlas of AI: Power, Politics,
and the Planetary Costs of Artificial Intelligence. Yale University
Press. https://books.google.de/books?id=KfodEAAAQBAJ.
Criado-Perez, Caroline. 2019. Invisible Women: Exposing Data Bias in
a World Designed for Men. Chatto & Windus.
Cunningham, Hoagy, Aidan Ewart, Logan Riggs, Robert Huben, and Lee
Sharkey. 2023. “Sparse Autoencoders Find Highly Interpretable
Features in Language Models.” arXiv Preprint arXiv:
2309.08600. https://doi.org/10.48550/arXiv.2309.08600.
Curtò, J. de, I. de Zarzà, Gemma Roig, and Carlos T. Calafate. 2024.
“Large Language Model-Informed x-Ray Photoelectron Spectroscopy
Data Analysis.” Signals 5 (2): 181–201. https://doi.org/10.3390/signals5020010.
Dagan, Gautier, Frank Keller, and Alex Lascarides. 2023. “Dynamic
Planning with a LLM.” arXiv Preprint arXiv:2308.06391.
https://doi.org/10.48550/arXiv.2308.06391.
Dagdelen, John, Alexander Dunn, Sanghoon Lee, Nicholas Walker, Andrew S
Rosen, Gerbrand Ceder, Kristin A Persson, and Anubhav Jain. 2024.
“Structured Information Extraction from Scientific Text with Large
Language Models.” Nature Communications 15 (1): 1418. https://doi.org/10.1038/s41467-024-45563-x.
Dann, Christoph, and Emma Brunskill. 2015. “Sample Complexity of
Episodic Fixed-Horizon Reinforcement Learning.” Advances in
Neural Information Processing Systems 28. https://doi.org/10.48550/arXiv.1510.08906.
Darvish, Kourosh, Marta Skreta, Yuchi Zhao, Naruki Yoshikawa, Sagnik
Som, Miroslav Bogdanovic, Yang Cao, et al. 2025. “ORGANA: A
Robotic Assistant for Automated Chemistry Experimentation and
Characterization.” Matter 8 (2). https://doi.org/10.1016/j.matt.2024.10.015.
De Luna, Phil, Jennifer Wei, Yoshua Bengio, Alán Aspuru-Guzik, and
Edward Sargent. 2017. “Use Machine Learning to Find Energy
Materials.” Nature 552 (7683): 23–27. https://doi.org/10.1038/d41586-017-07820-6.
De Moura, Leonardo, Soonho Kong, Jeremy Avigad, Floris Van Doorn, and
Jakob von Raumer. 2015. “The Lean Theorem
Prover (System Description).” Automated
Deduction-CADE-25: 25th International Conference on Automated
Deduction, 378–88. https://doi.org/10.1007/978-3-319-21401-6_26.
Dean, Romeo. 2025. “Security Forecast –
AI 2027.” AI 2027. https://ai-2027.com/research/security-forecast.
Deringer, Volker L., Noam Bernstein, Gábor Csányi, Chiheb Ben Mahmoud,
Michele Ceriotti, Mark Wilson, David A. Drabold, and Stephen R. Elliott.
2021. “Origins of Structural and Electronic Transitions in
Disordered Silicon.” Nature 589 (7840): 59–64. https://doi.org/10.1038/s41586-020-03072-z.
Dettmers, Tim, Mike Lewis, Younes Belkada, and Luke Zettlemoyer. 2022.
“GPT3.int8(): 8-Bit Matrix Multiplication for Transformers at
Scale.” Advances in Neural Information Processing
Systems 35: 30318–32. https://doi.org/10.48550/arXiv.2208.07339.
Dettmers, Tim, Artidoro Pagnoni, Ari Holtzman, and Luke Zettlemoyer.
2023. “QLoRA: Efficient Finetuning of Quantized LLMs.”
Advances in Neural Information Processing Systems 36:
10088–115. https://doi.org/10.48550/arXiv.2305.14314.
Devlin, Jacob, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018.
“BERT: Pre-training of Deep Bidirectional
Transformers for Language Understanding.” arXiv
Preprint arXiv: 1810.04805. https://doi.org/10.48550/arXiv.1810.04805.
Dinh, Tuan, Yuchen Zeng, Ruisu Zhang, Ziqian Lin, Michael Gira, Shashank
Rajput, Jy-yong Sohn, Dimitris Papailiopoulos, and Kangwook Lee. 2022.
“LIFT:
Language-Interfaced
Fine-Tuning for Non-Language
Machine Learning
Tasks.” Advances in Neural
Information Processing
Systems 35: 11763–84. https://doi.org/10.48550/arXiv.2206.06565.
Donker, Tjibbe. 2023. “The Dangers of Using Large Language Models
for Peer Review.” The Lancet Infectious Diseases 23 (7):
781. https://doi.org/10.1016/s1473-3099(23)00290-6.
Dotan, Ravit, and Smitha Milli. 2019. “Value-Laden Disciplinary Shifts
in Machine Learning.” FAT*. https://doi.org/10.1145/3351095.3373157.
Du, Yilun, Shuang Li, Antonio Torralba, Joshua B Tenenbaum, and Igor
Mordatch. 2023. “Improving Factuality and Reasoning in Language
Models Through Multiagent Debate.” Forty-First International
Conference on Machine Learning. https://doi.org/10.48550/arXiv.2305.14325.
Du, Yuanqi, Chenru Duan, Andres Bran, Anna Sotnikova, Yi Qu, Heather
Kulik, Antoine Bosselut, Jinjia Xu, and Philippe Schwaller. 2024.
“Large Language Models Are Catalyzing
Chemistry Education.” ChemRxiv Preprint, June. https://doi.org/10.26434/chemrxiv-2024-h722v.
Dung, Leonard, and Dominik Balg. 2025. “Learning Alone: Language Models, Overreliance, and the
Goals of Education.” https://philpapers.org/rec/DUNLAL-3.
Edunov, Sergey, Myle Ott, Michael Auli, and David Grangier. 2018.
“Understanding Back-Translation at Scale.” arXiv
Preprint. https://doi.org/10.48550/arXiv.1808.09381.
Edwards, Carl, Tuan Lai, Kevin Ros, Garrett Honke, Kyunghyun Cho, and
Heng Ji. 2022. “Translation Between Molecules and Natural
Language.” arXiv Preprint. https://doi.org/10.48550/arXiv.2204.11817.
Edwards, Carl, ChengXiang Zhai, and Heng Ji. 2021.
“Text2Mol: Cross-Modal Molecule
Retrieval with Natural Language Queries.” Proceedings of the
2021 Conference on Empirical Methods in Natural Language
Processing, November, 595–607. https://doi.org/10.18653/v1/2021.emnlp-main.47.
EleutherAI. 2024. “Third Party Model Evaluations.” https://blog.eleuther.ai/third-party-evals/.
Elnaggar, Ahmed, Michael Heinzinger, Christian Dallago, Ghalia Rehawi,
Yu Wang, Llion Jones, Tom Gibbs, et al. 2022. “ProtTrans: Toward Understanding the Language of Life
Through Self-Supervised Learning.” IEEE Transactions
on Pattern Analysis and Machine Intelligence 44 (10): 7112–27. https://doi.org/10.1109/tpami.2021.3095381.
Eppel, Sagi, Haoping Xu, Mor Bismuth, and Alan Aspuru-Guzik. 2020.
“Computer Vision for Recognition of Materials and Vessels in
Chemistry Lab Settings and the Vector-LabPics Data Set.” ACS
Central Science 6 (10): 1743–52. https://doi.org/10.1021/acscentsci.0c00460.
EU. 2024. “Regulation (EU) 2024/1689 of the
European Parliament and of the
Council of 13 June 2024 Laying down Harmonised
Rules on Artificial Intelligence and Amending Regulations
(EC) No 300/2008, (EU)
No 167/2013, (EU) No 168/2013,
(EU) 2018/858, (EU) 2018/1139 and
(EU) 2019/2144 and Directives
2014/90/EU, (EU) 2016/797 and
(EU) 2020/1828 (Artificial
Intelligence Act) (Text with
EEA Relevance).” http://data.europa.eu/eli/reg/2024/1689/oj/eng.
Fedus, William, Barret Zoph, and Noam Shazeer. 2022. “Switch
Transformers: Scaling to Trillion Parameter Models with Simple and
Efficient Sparsity.” Journal of Machine Learning
Research 23 (120): 1–39. https://doi.org/10.48550/arXiv.2101.03961.
Feng, Kehua, Keyan Ding, Weijie Wang, Xiang Zhuang, Zeyuan Wang, Ming
Qin, Yu Zhao, Jianhua Yao, Qiang Zhang, and Huajun Chen. 2024.
“SciKnowEval: Evaluating Multi-level
Scientific Knowledge of Large Language Models.” arXiv
Preprint arXiv: 2406.09098. https://doi.org/10.48550/arXiv.2406.09098.
Fernando, Chrisantha, Dylan Banarse, H. Michalewski, Simon Osindero, and
Tim Rocktäschel. 2023. “Promptbreeder: Self-Referential
Self-Improvement via Prompt Evolution.” International
Conference on Machine Learning. https://doi.org/10.48550/arXiv.2309.16797.
Fifty, Christopher, Jure Leskovec, and Sebastian Thrun. 2023.
“In-Context Learning for Few-Shot Molecular
Property Prediction.” arXiv Preprint arXiv:
2310.08863. https://doi.org/10.48550/arXiv.2310.08863.
Fleming, Alexander. 1929. “On the Antibacterial Action of Cultures
of a Penicillium, with Special Reference to Their Use in the
Isolation of B. influenzae.” British Journal of
Experimental Pathology 10 (3): 226–36. https://www.jstor.org/stable/4452419.
———. 1964. “Penicillin.” In Nobel Lectures, Physiology
or Medicine 1942–1962, 83–93. Amsterdam: Elsevier. https://www.nobelprize.org/uploads/2018/06/fleming-lecture.pdf.
Flöge, Klemens, Srisruthi Udayakumar, Johanna Sommer, Marie Piraud,
Stefan Kesselheim, Vincent Fortuin, Stephan Günnemann, et al. 2024.
“OneProt: Towards Multi-Modal Protein Foundation
Models.” arXiv Preprint arXiv:2411.04863. https://doi.org/10.48550/arXiv.2411.04863.
Frey, Nathan C., Ryan Soklaski, Simon Axelrod, Siddharth Samsi, Rafael
Gómez-Bombarelli, Connor W. Coley, and Vijay Gadepally. 2023.
“Neural Scaling of Deep Chemical Models.” Nature
Machine Intelligence 5 (11): 1297–1305. https://doi.org/10.1038/s42256-023-00740-3.
Fu, Li, Qingwei Zhou, Meiqing Jin, and Weihong Wu. 2025. “Large
Language Models as Spectrographic Assistants: Opportunities and
Challenges in Laboratory Data Analysis.” Environmental
Chemistry and Safety, April. https://doi.org/10.26599/ecs.2025.9600002.
Fujinuma, Naohiro, Brian DeCost, Jason Hattrick-Simpers, and Samuel E.
Lofland. 2022. “Why Big Data and Compute Are Not Necessarily the
Path to Big Materials Science.” Communications Materials
3 (1). https://doi.org/10.1038/s43246-022-00283-x.
Gadde, Rohit S. K., Sreelaya Devaguptam, Fangning Ren, Rajat Mittal,
Lechen Dong, Yao Wang, and Fang Liu. 2025. “Chatbot-Assisted
Quantum Chemistry for Explicitly Solvated Molecules.”
Chemical Science 16 (9): 3852–64. https://doi.org/10.1039/D4SC08677E.
Ganguli, Deep, Liane Lovitt, Jackson Kernion, Amanda Askell, Yuntao Bai,
Saurav Kadavath, Ben Mann, et al. 2022. “Red
Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and
Lessons Learned.” arXiv Preprint arXiv:
2209.07858. https://doi.org/10.48550/arXiv.2209.07858.
Ganose, Alex M, and Anubhav Jain. 2019. “Robocrystallographer:
Automated Crystal Structure Text Descriptions and Analysis.”
MRS Communications 9 (3): 874–81. https://doi.org/10.1557/mrc.2019.94.
Gao, Peng, Jun Zhang, Qian Peng, Jie Zhang, and Vassiliki-Alexandra
Glezakou. 2020. “General Protocol for the Accurate Prediction of
Molecular 13C/1H NMR Chemical Shifts via Machine Learning Augmented
DFT.” Journal of Chemical Information and Modeling 60
(8): 3746–54.
Gao, Rujun, Xiaosu Guo, Xiaodi Li, Arun Balajiee Lekshmi Narayanan,
Naveen Thomas, and Arun R. Srinivasa. 2024. “Towards Scalable Automated Grading: Leveraging Large
Language Models for Conceptual Question Evaluation in
Engineering.” arXiv Preprint arXiv: 2411.03659.
https://doi.org/10.48550/arXiv.2411.03659.
Gao, Wenhao, Tianfan Fu, Jimeng Sun, and Connor W. Coley. 2022.
“Sample Efficiency Matters: A Benchmark for Practical Molecular
Optimization.” Neural Information Processing Systems. https://doi.org/10.48550/arXiv.2206.12411.
Gao, Yunfan, Yun Xiong, Yijie Zhong, Yuxi Bi, Ming Xue, and Haofen Wang.
2025. “Synergizing RAG and Reasoning: A Systematic Review.”
arXiv Preprint arXiv:2504.15909. https://doi.org/10.48550/arXiv.2504.15909.
Ge, Suyu, Chunting Zhou, Rui Hou, Madian Khabsa, Yi-Chia Wang, Qifan
Wang, Jiawei Han, and Yuning Mao. 2023. “MART: Improving LLM Safety with Multi-round Automatic
Red-Teaming.” arXiv Preprint arXiv: 2311.07689.
https://doi.org/10.48550/arXiv.2311.07689.
Ghafarollahi, Alireza, and Markus J. Buehler. 2024. “SciAgents:
Automating Scientific Discovery Through Bioinspired Multi-Agent
Intelligent Graph Reasoning.” Advanced Materials,
December. https://doi.org/10.1002/adma.202413523.
Ghareeb, Ali Essam, Benjamin Chang, Ludovico Mitchener, Angela Yiu,
Caralyn J. Szostkiewicz, Jon M. Laurent, Muhammed T. Razzak, Andrew D.
White, Michaela M. Hinks, and Samuel G. Rodriques. 2025. “Robin: A
Multi-Agent System for Automating Scientific Discovery.”
arXiv Preprint arXiv: 2505.13400. https://doi.org/10.48550/arXiv.2505.13400.
Giglio, Auro Del, and Mateus Uerlei Pereira da Costa. 2023. “The
Use of Artificial Intelligence to Improve the Scientific Writing of
Non-Native English Speakers.” Revista Da
Associação Médica Brasileira
69 (9): e20230560. https://doi.org/10.1590/1806-9282.20230560.
Girdhar, Rohit, Alaaeldin El-Nouby, Zhuang Liu, Mannat Singh, Kalyan
Vasudev Alwala, Armand Joulin, and Ishan Misra. 2023. “ImageBind:
One Embedding Space to Bind Them All.” arXiv Preprint arXiv:
2305.05665. https://doi.org/10.48550/arXiv.2305.05665.
Goldberg, Alexander, Ihsan Ullah, Thanh Gia Hieu Khuong, Benedictus Kent
Rachmat, Zhen Xu, Isabelle Guyon, and Nihar B. Shah. 2024.
“Usefulness of LLMs as an Author Checklist Assistant for
Scientific Papers: NeurIPS’24 Experiment.” arXiv Preprint
arXiv: 2411.03417. https://doi.org/10.48550/arXiv.2411.03417.
Goldstein, Josh A., Girish Sastry, Micah Musser, Renee DiResta, Matthew
Gentzel, and Katerina Sedova. 2023. “Generative Language Models
and Automated Influence Operations: Emerging Threats and Potential
Mitigations.” arXiv Preprint. https://doi.org/10.48550/arxiv.2301.04246.
Gonzales, Carmelo, Michael Martin Pieler, Kevin Maik Jablonka, and
Santiago Miret. 2024. “Evaluating Chemistry
Prompts for Large-Language Model Fine-Tuning.” AI for
Accelerated Materials Design - NeurIPS 2024. https://openreview.net/forum?id=cEkUia8neA.
Gottweis, Juraj, Wei-Hung Weng, Alexander Daryin, Tao Tu, Anil Palepu,
Petar Sirkovic, Artiom Myaskovsky, et al. 2025. “Towards an
AI Co-Scientist.” arXiv Preprint
arXiv:2502.18864, February. https://doi.org/10.48550/arXiv.2502.18864.
Grattafiori, Aaron, Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey,
Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, et al. 2024.
“The Llama 3 Herd of Models.”
arXiv Preprint arXiv: 2407.21783. https://doi.org/10.48550/arXiv.2407.21783.
Griffiths, Ryan-Rhys, and José Miguel Hernández-Lobato. 2020.
“Constrained Bayesian Optimization for Automatic Chemical Design
Using Variational Autoencoders.” Chemical Science 11
(2): 577–86. https://doi.org/10.1039/c9sc04026a.
Cronin Group. 2023. “XDL 2.0 Standard Specification.” https://gitlab.com/croningroup/chi-dl-specification.
Gruver, Nate, Marc Anton Finzi, Dylan Sam, J. Zico Kolter, Ben
Athiwaratkun, and Andrew Gordon Wilson. 2024. “The Promises and
Pitfalls of Language Models for Structured Numerical Data.”
OpenReview.net, October. https://openreview.net/forum?id=SZpygmv3G1.
Gruver, Nate, Anuroop Sriram, Andrea Madotto, Andrew Gordon Wilson, C.
Lawrence Zitnick, and Zachary Ulissi. 2024.
“Fine-Tuned Language Models
Generate Stable Inorganic
Materials as Text.” arXiv Preprint
arXiv: 2402.04379, February. https://doi.org/10.48550/arXiv.2402.04379.
Grzybowski, Bartosz A, Sara Szymkuć, Ewa P Gajewska, Karol Molga, Piotr
Dittwald, Agnieszka Wołos, and Tomasz Klucznik. 2018. “Chematica: A
Story of Computer Code That Started to Think Like a Chemist.”
Chem 4 (3): 390–98. https://doi.org/10.1016/j.chempr.2018.02.024.
Gu, Albert, and Tri Dao. 2023. “Mamba: Linear-Time Sequence
Modeling with Selective State Spaces.” arXiv Preprint arXiv:
2312.00752. https://doi.org/10.48550/arXiv.2312.00752.
Gu, Xuemei, and Mario Krenn. 2024. “Interesting Scientific Idea
Generation Using Knowledge Graphs and LLMs: Evaluations with 100
Research Group Leaders.” arXiv Preprint arXiv:
2405.17044. https://doi.org/10.48550/arXiv.2405.17044.
———. 2025. “Forecasting High-Impact Research Topics via Machine
Learning on Evolving Knowledge Graphs.” Machine Learning:
Science and Technology 6 (2): 025041. https://doi.org/10.1088/2632-2153/add6ef.
Gunasekar, Suriya, Yi Zhang, Jyoti Aneja, Caio César Teodoro Mendes,
Allie Del Giorno, Sivakanth Gopi, Mojan Javaheripi, et al. 2023.
“Textbooks Are All You Need.” arXiv Preprint arXiv:
2306.11644. https://doi.org/10.48550/arXiv.2306.11644.
Guo, Daya, Dejian Yang, Haowei Zhang, Junxiao Song, Ruoyu Zhang, Runxin
Xu, Qihao Zhu, et al. 2025. “DeepSeek-R1:
Incentivizing Reasoning Capability in LLMs via Reinforcement
Learning.” arXiv Preprint arXiv:2501.12948. https://doi.org/10.48550/arXiv.2501.12948.
Guo, Jiang, A. Santiago Ibanez-Lopez, Hanyu Gao, Victor Quach, Connor W.
Coley, Klavs F. Jensen, and Regina Barzilay. 2021. “Automated
Chemical Reaction Extraction from Scientific Literature.”
Journal of Chemical Information and Modeling 62 (9): 2035–45.
https://doi.org/10.1021/acs.jcim.1c00284.
Guo, Kehan, Bozhao Nan, Yujun Zhou, Taicheng Guo, Zhichun Guo, Mihir
Surve, Zhenwen Liang, Nitesh V Chawla, Olaf Wiest, and Xiangliang Zhang.
2024. “Can LLMs Solve Molecule
Puzzles? A Multimodal Benchmark for Molecular Structure
Elucidation.” The Thirty-Eighth Conference on Neural
Information Processing Systems Datasets and Benchmarks Track. https://openreview.net/forum?id=t1mAXb4Cop.
Guo, Taicheng, Kehan Guo, Bozhao Nan, Zhenwen Liang, Zhichun Guo,
Nitesh V. Chawla, Olaf Wiest, and Xiangliang Zhang. 2023. “What Can
Large Language Models Do in Chemistry? A Comprehensive Benchmark on
Eight Tasks.” Neural
Information Processing Systems. https://doi.org/10.48550/arXiv.2305.18365.
Gupta, Sonakshi, Akhlak Mahmood, Pranav Shetty, Aishat Adeboye, and
Rampi Ramprasad. 2024. “Data Extraction from Polymer Literature
Using Large Language Models.” Communications Materials 5
(1): 269. https://doi.org/10.1038/s43246-024-00708-9.
Hadsell, Raia, Sumit Chopra, and Yann LeCun. 2006. “Dimensionality
Reduction by Learning an Invariant Mapping.” 2006 IEEE
Computer Society Conference on Computer Vision and Pattern Recognition
(CVPR’06) 2: 1735–42. https://doi.org/10.1109/CVPR.2006.100.
Hall, S. R., F. H. Allen, and I. D. Brown. 1991. “The
Crystallographic Information File (CIF): A New Standard
Archive File for Crystallography.” Acta Crystallographica
Section A 47 (6): 655–85. https://doi.org/10.1107/S010876739101067X.
Hammer, Alexander J. S., Andrei I. Leonov, Nicholas L. Bell, and Leroy
Cronin. 2021. “Chemputation and the Standardization of Chemical
Informatics.” JACS Au 1 (10): 1572–87. https://doi.org/10.1021/jacsau.1c00303.
Handa, Kunal, Drew Bent, Alex Tamkin, Miles McCain, Esin Durmus, Michael
Stern, Mike Schiraldi, et al. 2025. “Anthropic Education
Report: How University Students Use Claude.” https://www.anthropic.com/news/anthropic-education-report-how-university-students-use-claude.
Hao, Shibo, Yi Gu, Haodi Ma, Joshua Jiahua Hong, Zhen Wang, Daisy Zhe
Wang, and Zhiting Hu. 2023. “Reasoning with
Language Model Is Planning with World Model.” arXiv
Preprint arXiv:2305.14992. https://doi.org/10.48550/arXiv.2305.14992.
Häse, Florian, Matteo Aldeghi, Riley J. Hickman, Loı̈c M. Roch, and Alán
Aspuru-Guzik. 2021. “Gryffin: An Algorithm
for Bayesian Optimization of Categorical Variables Informed by Expert
Knowledge.” Applied Physics Reviews 8 (3). https://doi.org/10.1063/5.0048164.
He, Jiyan, Weitao Feng, Yaosen Min, Jingwei Yi, Kunsheng Tang, Shuai Li,
Jie Zhang, et al. 2023. “Control Risk for
Potential Misuse of Artificial
Intelligence in Science.” arXiv
Preprint arXiv:2312.06632, December. https://doi.org/10.48550/arXiv.2312.06632.
He, Mingguang, Zhixi Li, Chi Liu, Danli Shi, and Zachary Tan. 2020.
“Deployment of Artificial Intelligence in Real-World Practice:
Opportunity and Challenge.” Asia-Pacific Journal of
Ophthalmology 9 (4): 299–307. https://doi.org/10.1097/apo.0000000000000301.
Heidorn, P Bryan. 2008. “Shedding Light on
the Dark Data in the Long Tail of Science.” Library
Trends 57 (2): 280–99. https://doi.org/10.1353/lib.0.0036.
Hinton, Geoffrey, Oriol Vinyals, and Jeff Dean. 2015. “Distilling the
Knowledge in a Neural Network.” arXiv Preprint arXiv:1503.02531. https://doi.org/10.48550/arXiv.1503.02531.
Hira, Kausik, Mohd Zaki, Dhruvil Sheth, NM Anoop Krishnan, et al. 2024.
“Reconstructing the Materials Tetrahedron: Challenges in Materials
Information Extraction.” Digital Discovery 3 (5):
1021–37. https://doi.org/10.1039/d4dd00032c.
Hochreiter, Sepp, and Jürgen Schmidhuber. 1997. “Long Short-Term
Memory.” Neural Computation 9 (8): 1735–80. https://doi.org/10.1162/neco.1997.9.8.1735.
Hollmann, Noah, Samuel Müller, Lennart Purucker, Arjun Krishnakumar, Max
Körfer, Shi Bin Hoo, Robin Tibor Schirrmeister, and Frank Hutter. 2025.
“Accurate Predictions on Small Data with a Tabular Foundation
Model.” Nature 637 (8045): 319–26. https://doi.org/10.1038/s41586-024-08328-6.
Hong, Kung Yin, Lifeng Han, Riza Batista-Navarro, and Goran Nenadic.
2024. “CantonMT: Cantonese to English NMT
Platform with Fine-Tuned Models Using Synthetic Back-Translation
Data.” arXiv Preprint arXiv: 2403.11346. https://doi.org/10.48550/arXiv.2403.11346.
Hooker, Sara. 2020. “The Hardware Lottery.”
Communications of the ACM. https://doi.org/10.1145/3467017.
Howard, Jeremy, and Sebastian Ruder. 2018. “Universal Language Model
Fine-Tuning for Text Classification.” arXiv Preprint arXiv:1801.06146.
https://doi.org/10.48550/arXiv.1801.06146.
Hsu, Ting-Yao, C Lee Giles, and Ting-Hao “Kenneth” Huang. 2021.
“SciCap: Generating Captions for Scientific
Figures.” arXiv Preprint arXiv:2110.11624. https://doi.org/10.48550/arXiv.2110.11624.
Hsu, Ting-Yao, Chieh-Yang Huang, Ryan Rossi, Sungchul Kim, C. Lee Giles,
and Ting-Hao K. Huang. 2023. “GPT-4 as an
Effective Zero-Shot Evaluator for Scientific Figure
Captions.” arXiv Preprint arXiv: 2310.15405. https://doi.org/10.48550/arXiv.2310.15405.
Hu, Edward J, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li,
Shean Wang, Lu Wang, Weizhu Chen, et al. 2022. “LoRA: Low-Rank
Adaptation of Large Language Models.” ICLR 1 (2): 3. https://doi.org/10.48550/arXiv.2106.09685.
Hu, Shengran, Cong Lu, and Jeff Clune. 2024. “Automated Design of
Agentic Systems.” arXiv Preprint arXiv: 2408.08435. https://doi.org/10.48550/arXiv.2408.08435.
Huan, Maggie, Yuetai Li, Tuney Zheng, Xiaoyu Xu, Seungone Kim, Minxin
Du, Radha Poovendran, Graham Neubig, and Xiang Yue. 2025. “Does
Math Reasoning Improve General LLM Capabilities? Understanding
Transferability of LLM Reasoning.” arXiv Preprint, July.
https://doi.org/10.48550/arXiv.2507.00432.
Huang, Bing, and O. Anatole von Lilienfeld. 2016. “Understanding
Molecular Representations in Machine Learning: The Role of Uniqueness
and Target Similarity.” arXiv Preprint arXiv:
1608.06194. https://doi.org/10.48550/arXiv.1608.06194.
Huang, Qian, Jian Vora, Percy Liang, and Jure Leskovec. 2023.
“MLAgentBench: Evaluating Language Agents on Machine Learning
Experimentation.” International Conference on Machine
Learning. https://doi.org/10.48550/arXiv.2310.03302.
Huang, Shu, and Jacqueline M Cole. 2022. “BatteryBERT: A
Pretrained Language Model for Battery Database Enhancement.”
Journal of Chemical Information and Modeling 62 (24): 6365–77.
Huang, Wenlong, Pieter Abbeel, Deepak Pathak, and Igor Mordatch. 2022.
“Language Models as Zero-Shot Planners: Extracting Actionable
Knowledge for Embodied Agents.” Proceedings of the 39th
International Conference on Machine Learning (ICML). https://doi.org/10.48550/arXiv.2201.07207.
HyMARC. 2019. “Hydrogen Storage Materials
Database.” https://www.hymarc.org/home.
Inagaki, Takashi, Akari Kato, Koichi Takahashi, Haruka Ozaki, and Genki
N. Kanda. 2023. “LLMs Can Generate Robotic Scripts from
Goal-Oriented Instructions in Biological Laboratory Automation.”
arXiv Preprint arXiv:2304.10267, April. https://doi.org/10.48550/arXiv.2304.10267.
Intology.ai. 2025. “Zochi Publishes A* Paper.” https://www.intology.ai/blog/zochi-acl.
Isert, Clemens, Kenneth Atz, José Jiménez-Luna, and Gisbert Schneider.
2022. “QMugs, Quantum Mechanical
Properties of Drug-Like Molecules.” Scientific
Data 9 (1). https://doi.org/10.1038/s41597-022-01390-7.
Jablonka, Kevin Maik, Qianxiang Ai, Alexander Al-Feghali, Shruti
Badhwar, Joshua D. Bocarsly, Andres M. Bran, Stefan Bringuier, et al.
2023. “14 Examples of How LLMs Can Transform
Materials Science and Chemistry: A Reflection on a Large Language Model
Hackathon.” Digital Discovery 2 (5): 1233–50. https://doi.org/10.1039/d3dd00113j.
Jablonka, Kevin Maik, Charithea Charalambous, Eva Sanchez Fernandez,
Georg Wiechers, Juliana Monteiro, Peter Moser, Berend Smit, and Susana
Garcia. 2023. “Machine Learning for
Industrial Processes: Forecasting Amine Emissions from a Carbon Capture
Plant.” Science Advances 9 (1): eadc9576. https://doi.org/10.1126/sciadv.adc9576.
Jablonka, Kevin Maik, Daniele Ongari, Seyed Mohamad Moosavi, and Berend
Smit. 2020. “Big-Data Science in Porous
Materials: Materials Genomics and Machine Learning.”
Chemical Reviews 120 (16): 8066–8129. https://doi.org/10.1021/acs.chemrev.0c00004.
Jablonka, Kevin Maik, Luc Patiny, and Berend Smit. 2022. “Making the
Collective Knowledge of Chemistry Open and Machine Actionable.”
Nature Chemistry 14 (4): 365–76. https://doi.org/10.1038/s41557-022-00910-7.
Jablonka, Kevin Maik, Philippe Schwaller, Andres Ortega-Guerrero, and
Berend Smit. 2024. “Leveraging Large Language
Models for Predictive Chemistry.” Nature Machine
Intelligence 6 (2): 161–69. https://doi.org/10.1038/s42256-023-00788-1.
Jacobs, Pieter Floris, and Robert Pollice. 2025. “Developing Large
Language Models for Quantum Chemistry Simulation Input
Generation.” Digital Discovery 4 (3): 762–75. https://doi.org/10.1039/D4DD00366G.
Jang, Hyosoon, Yunhui Jang, Jaehyung Kim, and Sungsoo Ahn. 2025.
“Can LLMs Generate Diverse
Molecules? Towards Alignment with
Structural Diversity.” arXiv
Preprint arXiv:2410.03138, February. https://doi.org/10.48550/arXiv.2410.03138.
Jansen, Peter, Oyvind Tafjord, Marissa Radensky, Pao Siangliulue, Tom
Hope, Bhavana Dalvi Mishra, Bodhisattwa Prasad Majumder, Daniel S. Weld,
and Peter Clark. 2025. “CodeScientist: End-to-End Semi-Automated
Scientific Discovery with Code-Based Experimentation.” arXiv
Preprint arXiv: 2503.22708. https://doi.org/10.48550/arXiv.2503.22708.
Jha, Dipendra, Logan Ward, Arindam Paul, Wei-keng Liao, Alok Choudhary,
Chris Wolverton, and Ankit Agrawal. 2018. “ElemNet: Deep Learning the Chemistry of Materials From
Only Elemental Composition.” Scientific Reports 8
(1). https://doi.org/10.1038/s41598-018-35934-y.
Ji, Yixin, Juntao Li, Hai Ye, Kaixin Wu, Kai Yao, Jia Xu, Linjian Mo,
and Min Zhang. 2025. “A Survey of Test-Time Compute: From
Intuitive Inference to Deliberate Reasoning.” arXiv
Preprint. https://doi.org/10.48550/arXiv.2501.02497.
Ji, Ziwei, Nayeon Lee, Rita Frieske, Tiezheng Yu, Dan Su, Yan Xu, Etsuko
Ishii, Ye Jin Bang, Andrea Madotto, and Pascale Fung. 2023.
“Survey of Hallucination in Natural
Language Generation.” ACM Computing
Surveys 55 (12): 248:1–38. https://doi.org/10.1145/3571730.
Jia, Xiwen, Allyson Lynch, Yuheng Huang, Matthew Danielson, Immaculate
Lang’at, Alexander Milder, Aaron E. Ruby, et al. 2019.
“Anthropogenic Biases in Chemical Reaction Data Hinder Exploratory
Inorganic Synthesis.” Nature 573 (7773): 251–55. https://doi.org/10.1038/s41586-019-1540-5.
Jiang, Shuo, Daniel Evans-Yamamoto, Dennis Bersenev, Sucheendra K
Palaniappan, and Ayako Yachie-Kinoshita. 2024. “ProtoCode:
Leveraging Large Language Models (LLMs) for Automated Generation of
Machine-Readable PCR Protocols from Scientific Publications.”
SLAS Technology 29 (3): 100134. https://doi.org/10.1016/j.slast.2024.100134.
Jimenez, Carlos E., John Yang, Alexander Wettig, Shunyu Yao, Kexin Pei,
Ofir Press, and Karthik Narasimhan. 2023. “SWE-Bench: Can Language
Models Resolve Real-World GitHub Issues?” arXiv
Preprint. https://doi.org/10.48550/arxiv.2310.06770.
Jing, Xia, Vimla L Patel, James J Cimino, Jay H Shubrook, Yuchun Zhou,
Chang Liu, and Sonsoles De Lacalle. 2022. “The Roles of a
Secondary Data Analytics Tool and Experience in Scientific Hypothesis
Generation in Clinical Research: Protocol for a Mixed Methods
Study.” JMIR Research Protocols 11 (7): e39414. https://doi.org/10.2196/39414.
Joshi, Chaitanya K. 2025. “Transformers Are Graph Neural
Networks.” arXiv Preprint. https://doi.org/10.48550/arXiv.2506.22084.
Jung, Son Gyo, Guwon Jung, and Jacqueline M Cole. 2024. “Automatic Prediction of Molecular Properties Using
Substructure Vector Embeddings within a Feature Selection
Workflow.” Journal of Chemical Information and
Modeling 65 (1): 133–52. https://doi.org/10.1021/acs.jcim.4c01862.
Kahneman, Daniel. 2011. Thinking, Fast and Slow. New York:
Farrar, Straus and Giroux.
Kambhampati, Subbarao, Karthik Valmeekam, Miquel Marquez, and Luyang
Guan. 2023. “On the Role of Large Language
Models in Planning.” Tutorial presented at the
International Conference on Automated Planning and Scheduling (ICAPS).
https://yochan-lab.github.io/tutorial/ICAPS-2023/.
Kang, Yeonghun, and Jihan Kim. 2024. “ChatMOF: An Artificial
Intelligence System for Predicting and Generating Metal-Organic
Frameworks Using Large Language Models.” Nature
Communications 15 (1): 4705. https://doi.org/10.1038/s41467-024-48998-4.
Kaplan, Jared, Sam McCandlish, Tom Henighan, Tom B. Brown, Benjamin
Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, and Dario
Amodei. 2020. “Scaling Laws for Neural
Language Models.” arXiv Preprint arXiv:
2001.08361. https://doi.org/10.48550/arXiv.2001.08361.
Kaur, Harveen, Flaviano Della Pia, Ilyes Batatia, Xavier R Advincula,
Benjamin X Shi, Jinggang Lan, Gábor Csányi, Angelos Michaelides, and
Venkat Kapil. 2025. “Data-Efficient Fine-Tuning of Foundational
Models for First-Principles Quality Sublimation Enthalpies.”
Faraday Discussions 256: 120–38. https://doi.org/10.1039/d4fd00107a.
Kawchak, Kevin. 2024. “High Dimensional and Complex Spectrometric
Data Analysis of an Organic Compound Using Large Multimodal Models and
Chained Outputs.” ChemRxiv Preprint, September. https://doi.org/10.26434/chemrxiv-2024-06gf1.
Kayali, Moe, Anton Lykov, Ilias Fountalis, Nikolaos Vasiloglou, Dan
Olteanu, and Dan Suciu. 2024. “CHORUS: Foundation
Models for Unified Data Discovery and Exploration.” Proceedings
of the VLDB Endowment 17 (8): 2104–14. https://doi.org/10.14778/3659437.3659461.
Kazdan, Joshua, Rylan Schaeffer, Apratim Dey, Matthias Gerstgrasser,
Rafael Rafailov, David L. Donoho, and Sanmi Koyejo. 2024. “Collapse or Thrive? Perils and Promises of Synthetic Data
in a Self-Generating World.” arXiv Preprint arXiv:
2410.16713. https://doi.org/10.48550/arXiv.2410.16713.
Ke, T-W, Aaron S Brewster, Stella X Yu, Daniela Ushizima, Chao Yang, and
Nicholas K Sauter. 2018. “A Convolutional Neural Network-Based
Screening Tool for X-Ray Serial Crystallography.” Journal of
Synchrotron Radiation 25 (3): 655–70.
Kearnes, Steven M., Michael R. Maser, Michael Wleklinski, Anton Kast,
Abigail G. Doyle, Spencer D. Dreher, Joel M. Hawkins, Klavs F. Jensen,
and Connor W. Coley. 2021. “The Open Reaction
Database.” Journal of the American Chemical Society 143 (45): 18820–26.
https://doi.org/10.1021/jacs.1c09820.
Keith, John A., Valentin Vassilev-Galindo, Bingqing Cheng, Stefan
Chmiela, Michael Gastegger, Klaus-Robert Müller, and Alexandre
Tkatchenko. 2021. “Combining Machine Learning and Computational
Chemistry for Predictive Insights into Chemical Systems.”
Chemical Reviews 121 (16): 9816–72. https://doi.org/10.1021/acs.chemrev.1c00107.
Khalifa, Mohamed, and Mona Albadawy. 2024. “Using Artificial
Intelligence in Academic Writing and Research: An Essential Productivity
Tool.” Computer Methods and Programs in Biomedicine Update,
100145. https://doi.org/10.1016/j.cmpbup.2024.100145.
Kharchenko, Yuliia V, and Olena M Babenko. 2024. “Advantages and
Limitations of Large Language Models in Chemistry Education: A
Comparative Analysis of ChatGPT, Gemini and
Copilot.” CEUR Workshop Proceedings 3781:
42–59. https://ceur-ws.org/Vol-3781/paper03.pdf.
Kim, Seongmin, Yousung Jung, and Joshua Schrier. 2024. “Large
Language Models for Inorganic Synthesis Predictions.” Journal
of the American Chemical Society.
Kim, Seongmin, Joshua Schrier, and Yousung Jung. 2025.
“Explainable Synthesizability Prediction of Inorganic Crystal
Polymorphs Using Large Language Models.” Angewandte Chemie
International Edition. https://doi.org/10.1002/anie.202423950.
Kimber, Talia B, Maxime Gagnebin, and Andrea Volkamer. 2021.
“Maxsmi: Maximizing Molecular Property Prediction Performance with
Confidence Estimation Using SMILES Augmentation and Deep
Learning.” Artificial Intelligence in the Life Sciences
1: 100014. https://doi.org/10.1016/j.ailsci.2021.100014.
Kingsbury, Ryan S., Andrew S. Rosen, Ayush S. Gupta, Jason M. Munro,
Shyue Ping Ong, Anubhav Jain, Shyam Dwaraknath, Matthew K. Horton, and
Kristin A. Persson. 2022. “A Flexible and Scalable Scheme for
Mixing Computed Formation Energies from Different Levels of
Theory.” Npj Computational Materials. https://doi.org/10.1038/s41524-022-00881-w.
Kinney, Rodney, Chloe Anastasiades, Russell Authur, Iz Beltagy, Jonathan
Bragg, Alexandra Buraczynski, Isabel Cachola, et al. 2023. “The
Semantic Scholar Open Data Platform.” arXiv Preprint arXiv:
2301.10140. https://doi.org/10.48550/arXiv.2301.10140.
Kirchhübel, Christin, and Georgina Brown. 2024. “Intellectual
Property Rights at the Training, Development and Generation Stages of
Large Language Models.” Edited by Ingo Siegert and Khalid
Choukri. Proceedings of the Workshop on Legal and Ethical Issues in
Human Language Technologies @ LREC-COLING, May. https://aclanthology.org/2024.legal-1.3/.
Klein, Ezra, and Rebecca Winthrop. 2025. “We Have to Really
Rethink the Purpose of Education.” https://www.youtube.com/watch?v=HQQtaWgIQmE.
Kobayashi, Sosuke. 2018. “Contextual
Augmentation: Data Augmentation by Words with Paradigmatic
Relations.” arXiv Preprint arXiv: 1805.06201. https://doi.org/10.48550/arXiv.1805.06201.
Kolbert, Elizabeth. 2024. “The Obscene Energy Demands of
A.I.” https://www.newyorker.com/news/daily-comment/the-obscene-energy-demands-of-ai.
Kon, Patrick Tser Jern, Jiachen Liu, Xinyi Zhu, Qiuyi Ding, Jingjia
Peng, Jiarong Xing, Yibo Huang, et al. 2025. “EXP-Bench: Can AI
Conduct AI Research Experiments?” arXiv Preprint arXiv:
2505.24785. https://doi.org/10.48550/arXiv.2505.24785.
Kortemeyer, Gerd, Julian Nöhl, and Daria Onishchuk. 2024. “Grading
Assistance for a Handwritten Thermodynamics Exam Using Artificial
Intelligence: An Exploratory Study.” Physical Review Physics
Education Research 20 (2). https://doi.org/10.1103/physrevphyseducres.20.020144.
Kosmyna, Nataliya, Eugene Hauptmann, Ye Tong Yuan, Jessica Situ,
Xian-Hao Liao, Ashly Vivian Beresnitzky, Iris Braunstein, and Pattie
Maes. 2025. “Your Brain on ChatGPT: Accumulation of Cognitive Debt
When Using an AI Assistant for Essay Writing Task.” arXiv
Preprint. https://doi.org/10.48550/arxiv.2506.08872.
Kosso, Peter. 2017. What Goes Up... Gravity and Scientific
Method. Cambridge: Cambridge University Press. https://doi.org/10.1017/9781316417003.
Koziarski, Michał, Andrei Rekesh, Dmytro Shevchuk, Almer van der
Sloot, Piotr Gaiński, Yoshua Bengio, Chenghao Liu, Mike Tyers, and
Robert Batey. 2024. “RGFN: Synthesizable Molecular
Generation Using GFlowNets.” Advances in Neural
Information Processing Systems 37: 46908–55. https://doi.org/10.48550/arXiv.2406.08506.
Krenn, Mario, Florian Häse, AkshatKumar Nigam, Pascal Friederich, and
Alan Aspuru-Guzik. 2020. “Self-Referencing
Embedded Strings (SELFIES): A 100% Robust Molecular String
Representation.” Machine Learning: Science and
Technology 1 (4): 045024. https://doi.org/10.1088/2632-2153/aba947.
Kristiadi, Agustinus, Felix Strieth-Kalthoff, Marta Skreta, Pascal
Poupart, Alán Aspuru-Guzik, and Geoff Pleiss. 2024. “A Sober Look
at LLMs for Material Discovery: Are They Actually Good for Bayesian
Optimization over Molecules?” Forty-First International
Conference on Machine Learning, ICML 2024. https://doi.org/10.48550/arXiv.2402.05015.
Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E Hinton. 2012.
“Imagenet Classification with Deep Convolutional Neural
Networks.” Advances in Neural Information Processing
Systems 25. https://doi.org/10.1145/3065386.
Krzyzanowski, Adrian, Stephen D. Pickett, and Peter Pogány. 2025.
“Exploring BERT for Reaction Yield Prediction:
Evaluating the Impact of Tokenization, Molecular Representation, and
Pretraining Data Augmentation.” Journal of Chemical
Information and Modeling 65 (9): 4381–4402. https://doi.org/10.1021/acs.jcim.5c00359.
Kuhn, Michael, Ivica Letunic, Lars Juhl Jensen, and Peer Bork. 2016.
“The SIDER database of drugs and
side effects.” Nucleic Acids Research 44 (D1):
D1075–79. https://doi.org/10.1093/nar/gkv1075.
Kuhn, Thomas S. 1962. The Structure of Scientific Revolutions.
Vol. 2. International Encyclopedia of Unified Science 2. Chicago:
University of Chicago Press.
Kumar, Aounon, Chirag Agarwal, Suraj Srinivas, Aaron Jiaxun Li, Soheil
Feizi, and Himabindu Lakkaraju. 2023. “Certifying LLM Safety
Against Adversarial Prompting.” arXiv Preprint. https://doi.org/10.48550/arxiv.2309.02705.
Kumar, Pankaj, Saurabh Kabra, and Jacqueline M Cole. 2025.
“MechBERT: Language Models for Extracting Chemical and Property
Relationships about Mechanical Stress and Strain.” Journal of
Chemical Information and Modeling.
Kumbhar, Shrinidhi, Venkatesh Mishra, Kevin Coutinho, Divij Handa, Ashif
Iquebal, and Chitta Baral. 2025. “Hypothesis Generation for
Materials Discovery and Design Using Goal-Driven and Constraint-Guided
LLM Agents.” North American Chapter of the Association for
Computational Linguistics. https://doi.org/10.48550/arXiv.2501.13299.
Kuntz, Thomas, Agatha Duzan, Hao Zhao, Francesco Croce, Zico Kolter,
Nicolas Flammarion, and Maksym Andriushchenko. 2025. “OS-Harm: A
Benchmark for Measuring Safety of Computer Use Agents.” arXiv
Preprint arXiv: 2506.14866. https://doi.org/10.48550/arXiv.2506.14866.
Lakatos, Imre. 1970. “Falsification and the Methodology of
Scientific Research Programmes.” In Criticism and the Growth
of Knowledge, edited by Imre Lakatos and Alan Musgrave, 91–196.
Cambridge: Cambridge University Press.
Langer, Marcel F., Alex Goeßmann, and Matthias Rupp. 2022. “Representations of molecules and materials for
interpolation of quantum-mechanical simulations via machine
learning.” Npj Computational Materials 8 (1). https://doi.org/10.1038/s41524-022-00721-x.
Laurent, Jon M., Joseph D. Janizek, Michael Ruzo, Michaela M. Hinks,
Michael J. Hammerling, Siddharth Narayanan, Manvitha Ponnapati, Andrew
D. White, and Samuel G. Rodriques. 2024. “LAB-Bench: Measuring Capabilities of Language Models for
Biology Research.” arXiv Preprint arXiv:
2407.10362. https://doi.org/10.48550/arXiv.2407.10362.
Lazaridou, Angeliki, and Marco Baroni. 2020. “Emergent Multi-Agent
Communication in the Deep Learning Era.” arXiv Preprint
arXiv:2006.02419. https://doi.org/10.48550/arXiv.2006.02419.
Lee, Daeseok, and Yongjun Cho. 2024. “Fine-Tuning Pocket-Conditioned
3D Molecule Generation via Reinforcement Learning.” The
Twelfth International Conference on Learning Representations Workshop on
Generative and Experimental Perspectives for Biomolecular Design,
ICLR-GEM. https://openreview.net/forum?id=hlzRzr9ksu.
Lee, Jinhyuk, Anthony Chen, Zhuyun Dai, Dheeru Dua, Devendra Singh
Sachan, Michael Boratko, Yi Luan, et al. 2024. “Can Long-Context
Language Models Subsume Retrieval, RAG, SQL, and More?” arXiv
Preprint. https://doi.org/10.48550/arXiv.2406.13121.
Lee, Namkyeong, Edward De Brouwer, Ehsan Hajiramezanali, Tommaso
Biancalani, Chanyoung Park, and Gabriele Scalia. 2025.
“RAG-Enhanced Collaborative LLM Agents for Drug Discovery.”
arXiv Preprint arXiv: 2502.17506. https://doi.org/10.48550/arXiv.2502.17506.
Leonov, Artem I., Alexander J. S. Hammer, Sławomir Lach, S. Hessam M.
Mehr, Dario Caramelli, Davide Angelone, Aamir Khan, et al. 2024.
“An Integrated Self-Optimizing Programmable Chemical Synthesis and
Reaction Engine.” Nature Communications 15 (1): 4544. https://doi.org/10.1038/s41467-024-45444-3.
Lewis, Patrick, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir
Karpukhin, Naman Goyal, Heinrich Küttler, et al. 2020.
“Retrieval-Augmented Generation for Knowledge-Intensive NLP
Tasks.” Advances in Neural Information Processing
Systems 33: 9459–74. https://doi.org/10.48550/arXiv.2005.11401.
Li, Cheng, Mingyang Zhang, Qiaozhu Mei, Yaqing Wang, Spurthi Amba
Hombaiah, Yi Liang, and Michael Bendersky. 2023. “Teach LLMs to
Personalize - an Approach Inspired by Writing Education.”
arXiv Preprint arXiv: 2308.07968. https://doi.org/10.48550/arXiv.2308.07968.
Li, Guohao, Hasan Abed Al Kader Hammoud, Hani Itani, Dmitrii Khizbullin,
and Bernard Ghanem. 2023. “CAMEL:
Communicative Agents for "Mind" Exploration of Large Language Model
Society.” arXiv Preprint arXiv: 2303.17760. https://doi.org/10.48550/arXiv.2303.17760.
Li, Jiatong, Wei Liu, Zhihao Ding, Wenqi Fan, Yuqiang Li, and Qing Li.
2025. “Large Language Models Are in-Context Molecule
Learners.” IEEE Transactions on Knowledge and
Data Engineering 37 (7). https://doi.org/10.1109/TKDE.2025.3557697.
Li, Jiatong, Yunqing Liu, Wei Liu, Jingdi Le, Di Zhang, Wenqi Fan,
Dongzhan Zhou, Yuqiang Li, and Qing Li. 2024.
“MolReFlect: Towards In-Context Fine-Grained Alignments Between
Molecules and Texts.” arXiv Preprint arXiv:2411.14721,
November. https://doi.org/10.48550/arXiv.2411.14721.
Li, Junxian, Di Zhang, Xunzhi Wang, Zeying Hao, Jingdi Lei, Qian Tan,
Cai Zhou, et al. 2024. “Seeing and Understanding: Bridging Vision
with Chemical Knowledge via ChemVLM.” arXiv Preprint arXiv:
2408.07246. https://doi.org/10.48550/arXiv.2408.07246.
Li, Xiaobo, Yu Che, Linjiang Chen, Tao Liu, Kewei Wang, Lunjie Liu,
Haofan Yang, Edward O. Pyzer-Knapp, and Andrew I. Cooper. 2024.
“Sequential Closed-Loop Bayesian Optimization as a Guide for
Organic Molecular Metallophotocatalyst Formulation Discovery.”
Nature Chemistry 16 (8): 1286–94. https://doi.org/10.1038/s41557-024-01546-5.
Li, Zhaoxing, Vahid Yazdanpanah, Jindi Wang, Wen Gu, Lei Shi, Alexandra
I. Cristea, Sarah Kiden, and Sebastian Stein. 2025. “TutorLLM: Customizing Learning Recommendations with
Knowledge Tracing and Retrieval-Augmented Generation.”
arXiv Preprint arXiv: 2502.15709. https://doi.org/10.48550/arXiv.2502.15709.
Li, Zhuoran, Xu Sun, Wanyu Lin, and Jiannong Cao. 2024. “Unveiling Molecular Secrets: An LLM-Augmented Linear
Model for Explainable and Calibratable Molecular Property
Prediction.” arXiv Preprint arXiv: 2410.08829. https://doi.org/10.48550/arXiv.2410.08829.
Liang, Tian, Zhiwei He, Wenxiang Jiao, Xing Wang, Yan Wang, Rui Wang,
Yujiu Yang, Shuming Shi, and Zhaopeng Tu. 2024. “Encouraging
Divergent Thinking in Large Language Models Through Multi-Agent
Debate.” arXiv Preprint. https://doi.org/10.48550/arXiv.2305.19118.
Lim, Sangrak, and Yong Oh Lee. 2020. “Predicting Chemical
Properties Using Self-Attention Multi-Task Learning Based on
SMILES Representation.” 25th International
Conference on Pattern Recognition, ICPR 2020, Virtual Event
/ Milan, Italy, January 10-15, 2021, 3146–53. https://doi.org/10.1109/ICPR48806.2021.9412555.
Lin, Li-Chiang, Adam H. Berger, Richard L. Martin, Jihan Kim, Joseph A.
Swisher, Kuldeep Jariwala, Chris H. Rycroft, et al. 2012. “In silico screening of carbon-capture
materials.” Nature Materials 11 (7): 633–41. https://doi.org/10.1038/nmat3336.
Lin, Xuan, Long Chen, Yile Wang, Xiangxiang Zeng, and Philip S. Yu.
2025. “Property Enhanced Instruction Tuning for Multi-Task Molecule
Generation with Large Language Models.” arXiv Preprint
arXiv:2412.18084, May. https://doi.org/10.48550/arXiv.2412.18084.
Listgarten, Jennifer. 2024. “The Perpetual Motion Machine of
AI-Generated Data and the Distraction of ChatGPT as a
‘Scientist’.” Nature Biotechnology 42 (3):
371–73. https://doi.org/10.1038/s41587-023-02103-0.
Liu, Bo, Yuqian Jiang, Xiaohan Zhang, Qiang Liu, Shiqi Zhang, Joydeep
Biswas, and Peter Stone. 2023. “LLM+P:
Empowering large language models with optimal planning
proficiency.” arXiv Preprint arXiv:2304.11477. https://doi.org/10.48550/arXiv.2304.11477.
Liu, Gang, Michael Sun, Wojciech Matusik, Meng Jiang, and Jie Chen.
2024. “Multimodal Large Language Models for Inverse Molecular Design
with Retrosynthetic Planning.” arXiv Preprint arXiv:
2410.04223, October. https://doi.org/10.48550/arXiv.2410.04223.
Liu, Gang, Jiaxin Xu, Eric Inae, Yihan Zhu, Ying Li, Tengfei Luo, Meng
Jiang, et al. 2025. “NeurIPS - Open Polymer Prediction
2025.” https://kaggle.com/competitions/neurips-open-polymer-prediction-2025.
Liu, Hongxuan, Haoyu Yin, Zhiyao Luo, and Xiaonan Wang. 2025.
“Integrating Chemistry Knowledge in Large Language Models via
Prompt Engineering.” Synthetic and Systems Biotechnology
10 (1): 23–38. https://doi.org/10.1016/j.synbio.2024.07.004.
Liu, Shengchao, Weili Nie, Chengpeng Wang, Jiarui Lu, Zhuoran Qiao, Ling
Liu, Jian Tang, Chaowei Xiao, and Animashree Anandkumar. 2023.
“Multi-Modal Molecule Structure-Text Model for Text-Based
Retrieval and Editing.” Nature Machine Intelligence. https://doi.org/10.1038/s42256-023-00759-6.
Liu, Yuyan, Sirui Ding, Sheng Zhou, Wenqi Fan, and Qiaoyu Tan. 2024.
“MolecularGPT: Open Large Language Model (LLM) for Few-Shot
Molecular Property Prediction.” arXiv Preprint
arXiv:2406.12950, October. https://doi.org/10.48550/arXiv.2406.12950.
Liu, Zequn, Wei Zhang, Yingce Xia, Lijun Wu, Shufang Xie, Tao Qin, Ming
Zhang, and Tie-Yan Liu. 2023. “MolXPT: Wrapping Molecules with
Text for Generative Pre-Training.” arXiv Preprint arXiv:
2305.10688. https://doi.org/10.48550/arXiv.2305.10688.
Liu, Zhihan, Yubo Chai, and Jianfeng Li. 2025. “Toward Automated
Simulation Research Workflow Through LLM Prompt Engineering
Design.” Journal of Chemical Information and Modeling 65
(1): 114–24. https://doi.org/10.1021/acs.jcim.4c01653.
Liu, Zhiyuan, Sihang Li, Yanchen Luo, Hao Fei, Yixin Cao, Kenji
Kawaguchi, Xiang Wang, and Tat-Seng Chua. 2023.
“MolCA: Molecular Graph-Language Modeling with Cross-Modal
Projector and Uni-Modal Adapter.” arXiv Preprint
arXiv:2310.12798v4, October. https://doi.org/10.48550/arXiv.2310.12798.
Liu, Zichang, Qingyun Liu, Yuening Li, Liang Liu, Anshumali Shrivastava,
Shuchao Bi, Lichan Hong, Ed H Chi, and Zhe Zhao. 2024. “Wisdom of
Committee: Distilling from Foundation Model to Specialized Application
Model.” arXiv Preprint arXiv:2402.14035. https://doi.org/10.48550/arXiv.2402.14035.
Livne, Micha, Zulfat Miftahutdinov, Elena Tutubalina, Maksim Kuznetsov,
Daniil Polykovskiy, Annika Brundyn, Aastha Jhunjhunwala, et al. 2024.
“nach0: Multimodal natural and chemical
languages foundation model.” Chemical Science 15
(22): 8380–89. https://doi.org/10.1039/d4sc00966e.
Lommerse, Jos P. M., W. D. Sam Motherwell, Herman L. Ammon, Jack D.
Dunitz, Angelo Gavezzotti, Detlef W. M. Hofmann, Frank J. J. Leusen, et
al. 2000. “A test of crystal structure
prediction of small organic molecules.” Acta
Crystallographica Section B Structural Science 56 (4): 697–714. https://doi.org/10.1107/s0108768100004584.
Lu, Jieyu, Zhangde Song, Qiyuan Zhao, Yuanqi Du, Yirui Cao, Haojun Jia,
and Chenru Duan. 2025. “Generative Design of Functional Metal
Complexes Utilizing the Internal Knowledge and Reasoning Capability of
Large Language Models.” Journal of the American Chemical
Society, July. https://doi.org/10.1021/jacs.5c02097.
Lu, Zimu, Aojun Zhou, Houxing Ren, Ke Wang, Weikang Shi, Junting Pan,
Mingjie Zhan, and Hongsheng Li. 2024. “MathGenie: Generating Synthetic Data with Question
Back-translation for Enhancing Mathematical Reasoning of
LLMs.” Annual Meeting of the Association for
Computational Linguistics. https://doi.org/10.48550/arXiv.2402.16352.
Lynch, Aengus, Benjamin Wright, Caleb Larson, Kevin K. Troy, Stuart J.
Ritchie, Sören Mindermann, Ethan Perez, and Evan Hubinger. 2025.
“Agentic Misalignment: How LLMs Could Be an Insider
Threat.” Anthropic Research.
Mehr, S. Hessam M., Dario Caramelli, and Leroy Cronin. 2023.
“Digitizing Chemical Discovery with a Bayesian Explorer for
Interpreting Reactivity Data.” Proceedings of the National
Academy of Sciences 120 (17): e2220045120. https://doi.org/10.1073/pnas.2220045120.
Mahmood, Omar, Elman Mansimov, Richard Bonneau, and Kyunghyun Cho. 2021.
“Masked Graph Modeling for Molecule Generation.” Nature
Communications 12 (1): 3156. https://doi.org/10.1038/s41467-021-23415-2.
Maini, Pratyush, Skyler Seto, He Bai, David Grangier, Yizhe Zhang, and
Navdeep Jaitly. 2024. “Rephrasing the Web: A Recipe for Compute
and Data-Efficient Language Modeling.” arXiv Preprint arXiv:
2401.16380. https://doi.org/10.48550/arXiv.2401.16380.
Makelov, Aleksandar, Georg Lange, and Neel Nanda. 2023. “Is This
the Subspace You Are Looking for? An Interpretability Illusion for
Subspace Activation Patching.” arXiv Preprint arXiv:
2311.17030. https://doi.org/10.48550/arXiv.2311.17030.
Malkov, Yu A, and Dmitry A Yashunin. 2018. “Efficient and Robust
Approximate Nearest Neighbor Search Using Hierarchical Navigable Small
World Graphs.” IEEE Transactions on Pattern Analysis and
Machine Intelligence 42 (4): 824–36. https://doi.org/10.1109/tpami.2018.2889473.
Mandal, Indrajeet, Jitendra Soni, Mohd Zaki, Morten M. Smedskjaer,
Katrin Wondraczek, Lothar Wondraczek, Nitya Nand Gosvami, and N. M.
Anoop Krishnan. 2024. “Autonomous Microscopy
Experiments through Large Language Model Agents.”
arXiv Preprint arXiv: 2501.10385. https://doi.org/10.48550/arXiv.2501.10385.
Marcus, Gary. 2020. “The Next Decade in AI: Four Steps Towards
Robust Artificial Intelligence.” arXiv Preprint
arXiv:2002.06177. https://doi.org/10.48550/arXiv.2002.06177.
Marcus, Greil. 2025. “Will the Humanities Survive Artificial
Intelligence?” The New Yorker, April. https://www.newyorker.com/culture/the-weekend-essay/will-the-humanities-survive-artificial-intelligence.
Marion, Max, Ahmet Üstün, Luiza Pozzobon, Alex Wang, Marzieh Fadaee, and
Sara Hooker. 2023. “When Less Is More: Investigating Data Pruning
for Pretraining LLMs at Scale.” arXiv Preprint arXiv:
2309.04564. https://doi.org/10.48550/arXiv.2309.04564.
Martin, Stephen F. 2022. “Bridging known and
unknown unknowns: From natural products and their mimics to unmet needs
in neuroscience.” Accounts of Chemical Research
55 (17): 2397–2408. https://doi.org/10.1021/acs.accounts.1c00773.
McDonald, Robert S., and Paul A. Wilks. 1988. “JCAMP-DX: A
Standard Form for Exchange of Infrared Spectra in Computer Readable
Form.” Applied Spectroscopy 42 (1): 151–62. https://doi.org/10.1366/0003702884428734.
Mehr, Saman H. M., Mark Craven, Andrei I. Leonov, Graham Keenan, and
Leroy Cronin. 2020. “A Universal System for Digitization and
Automatic Execution of the Chemical Synthesis Literature.”
Science 370 (6512): 101–8. https://doi.org/10.1126/science.abc2986.
Mendible-Barreto, Orlando A., Misael Díaz-Maldonado, Fernando J. Carmona
Esteva, J. Emmanuel Torres, Ubaldo M. Córdova-Figueroa, and Yamil J.
Colón. 2025. “DynaMate: Leveraging AI-Agents for Customized
Research Workflows.” Molecular Systems Design &
Engineering 10: 585–98. https://doi.org/10.1039/D5ME00062A.
Micikevicius, Paulius, Sharan Narang, Jonah Alben, Gregory Diamos, Erich
Elsen, David Garcia, Boris Ginsburg, et al. 2017. “Mixed Precision
Training.” arXiv Preprint arXiv:1710.03740. https://doi.org/10.48550/arXiv.1710.03740.
Mikolov, Tomas, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013.
“Efficient Estimation of Word Representations
in Vector Space.” arXiv Preprint arXiv:
1301.3781. https://doi.org/10.48550/arXiv.1301.3781.
Mikolov, Tomas, Ilya Sutskever, Kai Chen, Greg Corrado, and Jeffrey
Dean. 2013. “Distributed Representations of Words and Phrases and
Their Compositionality.” NeurIPS. https://doi.org/10.48550/arXiv.1310.4546.
Miret, Santiago, and N M Anoop Krishnan. 2024. “Are LLMs Ready for
Real-World Materials Discovery?” arXiv Preprint arXiv:
2402.05200. https://doi.org/10.48550/arXiv.2402.05200.
Mirza, Adrian, Nawaf Alampara, Sreekanth Kunchapu, Martiño Rı́os-Garcı́a,
Benedict Emoekabu, Aswanth Krishnan, Tanya Gupta, et al. 2025. “A
Framework for Evaluating the Chemical Knowledge and Reasoning Abilities
of Large Language Models Against the Expertise of Chemists.”
Nature Chemistry, 1–8. https://doi.org/10.1038/s41557-025-01815-x.
Mirza, Adrian, Nawaf Alampara, Martiño Rı́os-Garcı́a, Mohamed Abdelalim,
Jack Butler, Bethany Connolly, Tunca Dogan, et al. 2025.
“ChemPile: A 250GB Diverse and Curated Dataset for Chemical
Foundation Models.” arXiv Preprint arXiv: 2505.12534. https://doi.org/10.48550/arXiv.2505.12534.
Mirza, Adrian, and Kevin Maik Jablonka. 2024. “Elucidating Structures from Spectra Using Multimodal
Embeddings and Discrete Optimization.” ChemRxiv
Preprint. https://doi.org/10.26434/chemrxiv-2024-f3b18-v2.
Mishra, Vaibhav, Somaditya Singh, Dhruv Ahlawat, Mohd Zaki, Vaibhav
Bihani, Hargun Singh Grover, Biswajit Mishra, Santiago Miret, Mausam,
and N. M. Anoop Krishnan. 2024. “Foundational Large Language
Models for Materials Research.” arXiv Preprint arXiv:
2412.09560. https://doi.org/10.48550/arXiv.2412.09560.
Mitchell, John B. O. 2017. “DLS-100
Solubility Dataset.” https://doi.org/10.17630/3A3A5ABC-8458-4924-8E6C-B804347605E8.
Mitchener, Ludovico, Jon M Laurent, Benjamin Tenmann, Siddharth
Narayanan, Geemi P Wellawatte, Andrew White, Lorenzo Sani, and Samuel G
Rodriques. 2025. “BixBench: a Comprehensive
Benchmark for LLM-based Agents in Computational Biology.”
arXiv Preprint arXiv: 2503.00096. https://doi.org/10.48550/arXiv.2503.00096.
Mittermaier, Mirja, Marium M. Raza, and Joseph C. Kvedar. 2023.
“Bias in AI-Based Models for Medical Applications: Challenges and
Mitigation Strategies.” Npj Digital Medicine. https://doi.org/10.1038/s41746-023-00858-z.
Mobley, David L., and J. Peter Guthrie. 2014. “FreeSolv: a database of experimental and
calculated hydration free energies, with input files.”
Journal of Computer-Aided Molecular Design 28 (7). https://doi.org/10.1007/s10822-014-9747-x.
Mollick, Ethan R., Lilach Mollick, Natalie Bach, LJ Ciccarelli, Ben
Przystanski, and Daniel Ravipinto. 2024. “AI
Agents and Education: Simulated Practice at Scale.”
The Wharton School Research Paper. https://doi.org/10.2139/ssrn.4871171.
Mollick, Ethan, and Lilach Mollick. 2024. “Instructors as Innovators: A future-focused approach to
new AI learning opportunities, with prompts.” arXiv
Preprint arXiv: 2407.05181. https://doi.org/10.48550/arXiv.2407.05181.
Moreno-Barea, Francisco J, Leonardo Franco, David Elizondo, and Martin
Grootveld. 2022. “Application of Data Augmentation Techniques
Towards Metabolomics.” Computers in Biology and Medicine
148: 105916.
Morris, Meredith Ringel, Jascha Sohl-dickstein, Noah Fiedel, Tris
Warkentin, Allan Dafoe, Aleksandra Faust, Clement Farabet, and Shane
Legg. 2023. “Levels of AGI for Operationalizing Progress on the
Path to AGI.” arXiv Preprint arXiv: 2311.02462. https://doi.org/10.48550/arXiv.2311.02462.
Moult, John. 2005. “A decade of CASP:
progress, bottlenecks and prognosis in protein structure
prediction.” Current Opinion in Structural
Biology 15 (3): 285–89. https://doi.org/10.1016/j.sbi.2005.05.011.
Musil, Felix, Andrea Grisafi, Albert P. Bartók, Christoph Ortner, Gábor
Csányi, and Michele Ceriotti. 2021. “Physics-Inspired Structural Representations for Molecules
and Materials.” Chemical Reviews 121 (16):
9759–9815. https://doi.org/10.1021/acs.chemrev.1c00021.
Mytton, David. 2021. “Data Centre Water Consumption.”
Npj Clean Water. https://doi.org/10.1038/s41545-021-00101-w.
Narayan, Avanika, Ines Chami, Laurel Orr, Simran Arora, and Christopher
Ré. 2022. “Can Foundation Models Wrangle Your Data?”
arXiv Preprint arXiv:2205.09911. https://doi.org/10.48550/arXiv.2205.09911.
Narayanan, Arvind, and Sayash Kapoor. 2025. “Why an Overreliance
on AI-Driven Modelling Is Bad for Science.”
Nature 640 (8058): 312–14. https://doi.org/10.1038/d41586-025-01067-2.
Narayanan, Siddharth M., James D. Braza, Ryan-Rhys Griffiths, Albert
Bou, Geemi Wellawatte, Mayk Caldas Ramos, Ludovico Mitchener, Samuel G.
Rodriques, and Andrew D. White. 2025. “Training a Scientific
Reasoning Model for Chemistry.” arXiv Preprint arXiv:
2506.17238. https://doi.org/10.48550/arXiv.2506.17238.
Naumov, Vladimir, Diana Zagirova, Sha Lin, Yupeng Xie, Wenhao Gou,
Anatoly Urban, Nina Tikhonova, et al. 2025. “DORA AI Scientist:
Multi-Agent Virtual Research Team for Scientific Exploration Discovery
and Automated Report Generation.” bioRxiv, March. https://doi.org/10.1101/2025.03.06.641840.
Neese, Frank. 2022. “Software Update: The ORCA Program System,
Version 5.0.” Wiley Interdisciplinary Reviews: Computational
Molecular Science 12 (1): e1606. https://doi.org/10.1002/wcms.1606.
Nega, Philip W., Zhi Li, Victor Ghosh, Janak Thapa, Shijing Sun, Noor
Titan Putri Hartono, Mansoor Ani Najeeb Nellikkal, et al. 2021.
“Using Automated Serendipity to Discover How Trace Water Promotes
and Inhibits Lead Halide Perovskite Crystal Formation.”
Applied Physics Letters 119 (4). https://doi.org/10.1063/5.0059767.
Newton, Isaac. 1999. The Principia: Mathematical Principles of
Natural Philosophy. Translated by I. Bernard Cohen and Anne
Whitman. Berkeley: University of California Press.
Ni, Yuyan, Shikun Feng, Xin Hong, Yuancheng Sun, Wei-Ying Ma, Zhi-Ming
Ma, Qiwei Ye, and Yanyan Lan. 2024. “Pre-Training with Fractional
Denoising to Enhance Molecular Property Prediction.” Nature
Machine Intelligence 6 (10): 1169–78. https://doi.org/10.1038/s42256-024-00900-z.
NIST. 2024. “Safety Considerations for
Chemical and/or Biological AI
Models.” Federal Register. https://www.federalregister.gov/documents/2024/10/04/2024-22974/safety-considerations-for-chemical-andor-biological-ai-models.
Novikov, Alexander, Ngân Vũ, Marvin Eisenberger, Emilien Dupont, Po-Sen
Huang, Adam Zsolt Wagner, Sergey Shirobokov, et al. 2025.
“AlphaEvolve: A Coding Agent for Scientific and
Algorithmic Discovery.” Google DeepMind. https://storage.googleapis.com/deepmind-media/DeepMind.com/Blog/alphaevolve-a-gemini-powered-coding-agent-for-designing-advanced-algorithms/AlphaEvolve.pdf.
O’Donoghue, Odhran, Aleksandar Shtedritski, John Ginger, Ralph Abboud,
Ali Essa Ghareeb, Justin Booth, and Samuel G Rodriques. 2023.
“BioPlanner: Automatic Evaluation of LLMs on Protocol Planning in
Biology.” arXiv Preprint arXiv:2310.10632. https://doi.org/10.48550/arXiv.2310.10632.
O’Neill, Charles, Tirthankar Ghosal, Roberta Răileanu, Mike Walmsley,
Thang Bui, Kevin Schawinski, and Ioana Ciucă. 2025. “Sparks of
Science: Hypothesis Generation Using Structured Paper Data.”
arXiv Preprint arXiv: 2504.12976. https://doi.org/10.48550/arXiv.2504.12976.
Ollion, Étienne, Rubing Shen, Ana Macanovic, and Arnault Chatelain.
2024. “The Dangers of Using Proprietary LLMs for Research.”
Nature Machine Intelligence 6 (1): 4–5. https://doi.org/10.1038/s42256-023-00783-6.
Omiye, Jesutofunmi A., Jenna C. Lester, Simon Spichak, Veronica
Rotemberg, and Roxana Daneshjou. 2023. “Large Language Models
Propagate Race-Based Medicine.” Npj Digital Medicine 6
(1): 1–4. https://doi.org/10.1038/s41746-023-00939-z.
Oord, Aaron van den, Yazhe Li, and Oriol Vinyals. 2018. “Representation Learning with Contrastive Predictive
Coding.” arXiv Preprint arXiv: 1807.03748. https://doi.org/10.48550/arXiv.1807.03748.
OpenAI. 2023. “Written Evidence to [Committee Name].” UK
Parliament; Written Evidence. https://committees.parliament.uk/writtenevidence/126981/pdf/.
———. 2024. “Building an Early Warning System for
LLM-Aided Biological Threat Creation.” https://openai.com/index/building-an-early-warning-system-for-llm-aided-biological-threat-creation/.
OpenAI, Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge
Akkaya, Florencia Leoni Aleman, et al. 2023. “GPT-4
Technical Report.” arXiv Preprint arXiv:
2303.08774. https://doi.org/10.48550/arXiv.2303.08774.
Ouyang, Long, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright,
Pamela Mishkin, Chong Zhang, et al. 2022. “Training Language
Models to Follow Instructions with Human Feedback.” arXiv
Preprint. https://doi.org/10.48550/arXiv.2203.02155.
Oviedo, Felipe, Zekun Ren, Shijing Sun, Charles Settens, Zhe Liu, Noor
Titan Putri Hartono, Savitha Ramasamy, et al. 2019. “Fast and
Interpretable Classification of Small X-Ray Diffraction Datasets Using
Data Augmentation and Deep Neural Networks.” Npj
Computational Materials 5 (1): 60.
Pagel, Sebastian, Michal Jirásek, and Leroy Cronin. 2024.
“Validation of the Scientific Literature via Chemputation
Augmented by Large Language Models.” arXiv Preprint
arXiv:2410.06384, October. https://doi.org/10.48550/arXiv.2410.06384.
Pantha, Nishan, Muthukumaran Ramasubramanian, Iksha Gurung, Manil
Maskey, and Rahul Ramachandran. 2024. “Challenges in Guardrailing
Large Language Models for Science.” arXiv Preprint
arXiv: 2411.08181, December. https://doi.org/10.48550/arXiv.2411.08181.
Parisi, Aaron, Yao Zhao, and Noah Fiedel. 2022. “TALM: Tool
Augmented Language Models.” arXiv Preprint
arXiv:2205.12255. https://doi.org/10.48550/arXiv.2205.12255.
Park, Nathaniel H., Matteo Manica, Jannis Born, James L. Hedrick, Tim
Erdmann, Dmitry Yu. Zubarev, Nil Adell-Mill, Pedro L. Arrechea, et al.
2023. “Artificial Intelligence Driven Design of Catalysts and
Materials for Ring Opening Polymerization Using a Domain-Specific
Language.” Nature Communications 14 (1). https://doi.org/10.1038/s41467-023-39396-3.
Patiny, Luc, and Guillaume Godin. 2023. “Automatic Extraction of
FAIR Data from Publications Using LLM.” ChemRxiv
Preprint. https://doi.org/10.26434/chemrxiv-2023-05v1b-v2.
Penedo, Guilherme, Hynek Kydlı́ček, Anton Lozhkov, Margaret Mitchell,
Colin A Raffel, Leandro Von Werra, Thomas Wolf, et al. 2024.
“The FineWeb datasets: Decanting the web for
the finest text data at scale.” Advances in Neural
Information Processing Systems 37: 30811–49. https://doi.org/10.48550/arXiv.2406.17557.
Penedo, Guilherme, Quentin Malartic, Daniel Hesslow, Ruxandra Cojocaru,
Hamza Alobeidli, Alessandro Cappelli, Baptiste Pannier, Ebtesam
Almazrouei, and Julien Launay. 2023. “The RefinedWeb Dataset for
Falcon LLM: Outperforming Curated Corpora with Web Data Only.”
Advances in Neural Information Processing Systems 36: 79155–72.
https://doi.org/10.48550/arXiv.2306.01116.
Peng, Ji-Lun, Sijia Cheng, Egil Diau, Yung-Yu Shih, Po-Heng Chen,
Yen-Ting Lin, and Yun-Nung Chen. 2024. “A
Survey of Useful LLM Evaluation.” arXiv Preprint
arXiv: 2406.00936. https://doi.org/10.48550/arXiv.2406.00936.
Peppin, Aidan, Anka Reuel, Stephen Casper, Elliot Jones, Andrew Strait,
Usman Anwar, Anurag Agrawal, et al. 2024. “The Reality of AI and Biorisk.” arXiv
Preprint arXiv: 2412.01946. https://doi.org/10.48550/arXiv.2412.01946.
Perez, Ethan, Saffron Huang, Francis Song, Trevor Cai, Roman Ring, John
Aslanides, Amelia Glaese, Nat McAleese, and Geoffrey Irving. 2022.
“Red Teaming Language Models with Language
Models.” arXiv Preprint arXiv: 2202.03286. https://doi.org/10.48550/arXiv.2202.03286.
Perez, Ryann M., Marie Shimogawa, Yanan Chang, Hoang Anh T. Phan, Jason
G. Marmorstein, Evan S. K. Yanagawa, and E. James Petersson. 2025.
“Large Language Models for Education:
ChemTAsk - An Open-Source Paradigm for Automated Q&A in the Graduate
Classroom.” arXiv Preprint arXiv: 2502.00016. https://doi.org/10.48550/arXiv.2502.00016.
Pieler, Michael, Marco Bellagente, Hannah Teufel, Duy Phung, Nathan
Cooper, Jonathan Tow, Paulo Rocha, et al. 2024. “Rephrasing
Natural Text Data with Different Languages and Quality Levels for Large
Language Model Pre-Training.” arXiv Preprint
arXiv:2410.20796. https://doi.org/10.48550/arXiv.2410.20796.
Pietsch, Wolfgang, and Jörg Wernecke. 2017. “Introduction: Ten
Theses on Big Data and Computability.” In Berechenbarkeit Der
Welt?, 37–57. Springer Fachmedien Wiesbaden. https://doi.org/10.1007/978-3-658-12153-2_2.
Pistono, Federico, and Roman V. Yampolskiy. 2016. “Unethical
Research: How to Create a Malevolent Artificial
Intelligence.” arXiv Preprint
arXiv:1605.02817, September. https://doi.org/10.48550/arXiv.1605.02817.
Polak, Maciej P, and Dane Morgan. 2024. “Extracting Accurate
Materials Data from Research Papers with Conversational Language Models
and Prompt Engineering.” Nature Communications 15 (1):
1569. https://doi.org/10.1038/s41467-024-45914-8.
Polanyi, Michael. 2009. The Tacit Dimension. Facsimile
reproduction. Chicago: University of Chicago Press.
Popper, Karl R. 1959. The Logic of Scientific Discovery.
London: Routledge.
Preuer, Kristina, Philipp Renz, Thomas Unterthiner, Sepp Hochreiter, and
Günter Klambauer. 2018. “Fréchet ChemNet Distance: A Metric for
Generative Models for Molecules in Drug Discovery.” Journal
of Chemical Information and Modeling 58 (9): 1736–41. https://doi.org/10.1021/acs.jcim.8b00234.
Qian, Chen, Wei Liu, Hongzhang Liu, Nuo Chen, Yufan Dang, Jiahao Li,
Cheng Yang, et al. 2024. “ChatDev: Communicative Agents for
Software Development.” arXiv Preprint. https://doi.org/10.48550/arXiv.2307.07924.
Qu, Jiaxing, Yuxuan Richard Xie, Kamil M. Ciesielski, Claire E. Porter,
Eric S. Toberer, and Elif Ertekin. 2023. “Leveraging Language
Representation for Material Recommendation, Ranking, and
Exploration.” arXiv Preprint arXiv:
2305.01101, May. https://doi.org/10.48550/arXiv.2305.01101.
Radford, Alec, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and
Ilya Sutskever. 2019. “Language Models Are Unsupervised Multitask
Learners.” Technical Report TR-2019-1. San Francisco, CA: OpenAI.
https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf.
Raffel, Colin, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang,
Michael Matena, Yanqi Zhou, Wei Li, and Peter J Liu. 2020. “Exploring the limits of transfer learning with a unified
text-to-text transformer.” Journal of Machine Learning
Research 21 (140): 1–67. https://www.jmlr.org/papers/v21/20-074.html.
Rajabi-Kochi, Mahyar, Negareh Mahboubi, Aseem Partap Singh Gill, and
Seyed Mohamad Moosavi. 2025. “Adaptive Representation of Molecules
and Materials in Bayesian Optimization.” Chemical
Science 16 (13): 5464–74. https://doi.org/10.1039/d5sc00200a.
Ramakrishnan, Raghunathan, Pavlo O Dral, Matthias Rupp, and O Anatole
Von Lilienfeld. 2014. “Quantum chemistry
structures and properties of 134 kilo molecules.”
Scientific Data 1 (1): 1–7. https://doi.org/10.1038/sdata.2014.22.
Ramé, Alexandre, Guillaume Couairon, Mustafa Shukor, Corentin Dancette,
Jean-Baptiste Gaya, Laure Soulier, and Matthieu Cord. 2023.
“Rewarded Soups: Towards Pareto-Optimal Alignment by
Interpolating Weights Fine-Tuned on Diverse Rewards.” arXiv
Preprint arXiv:2306.04488, October. https://doi.org/10.48550/arXiv.2306.04488.
Ramos, Mayk Caldas, Shane S. Michtavy, Marc D. Porosoff, and Andrew D.
White. 2023. “Bayesian Optimization of Catalysis with in-Context
Learning.” arXiv Preprint arXiv: 2304.05341. https://doi.org/10.48550/arXiv.2304.05341.
Ranković, Bojana, and Philippe Schwaller. 2023. “BoChemian: Large
Language Model Embeddings for Bayesian Optimization of Chemical
Reactions.” NeurIPS 2023 Workshop on Adaptive Experimental
Design and Active Learning in the Real World. https://openreview.net/forum?id=A1RVn1m3J3.
———. 2025. “GOLLuM: Gaussian Process Optimized LLMs - Reframing
LLM Finetuning Through Bayesian Optimization.” arXiv Preprint
arXiv: 2504.06265. https://doi.org/10.48550/arXiv.2504.06265.
Raschka, Sebastian. 2018. “Model Evaluation,
Model Selection, and Algorithm Selection in Machine
Learning.” arXiv Preprint arXiv: 1811.12808. https://doi.org/10.48550/arXiv.1811.12808.
Rauschen, Robert, Mason Guy, Jason E. Hein, and Leroy Cronin. 2024.
“Universal Chemical Programming Language for Robotic Synthesis
Repeatability.” Nature Synthesis 3 (4). https://doi.org/10.1038/s44160-023-00473-6.
Reiser, Patrick, Marlen Neubert, André Eberhard, Luca Torresi, Chen
Zhou, Chen Shao, Houssam Metni, et al. 2022. “Graph Neural
Networks for Materials Science and Chemistry.” Communications
Materials 3 (1): 93. https://doi.org/10.48550/arXiv.2208.09481.
Renze, Matthew, and Erhan Guven. 2024. “Self-Reflection in LLM
Agents: Effects on Problem-Solving Performance.” arXiv
Preprint arXiv: 2405.06682. https://doi.org/10.48550/arXiv.2405.06682.
Richard, Ann M., Ruili Huang, Suramya Waidyanatha, Paul Shinn, Bradley
J. Collins, Inthirany Thillainadarajah, Christopher M. Grulke, et al.
2021. “The Tox21 10K Compound Library: Collaborative
Chemistry Advancing Toxicology.” Chemical Research in
Toxicology 34 (2): 189–216. https://doi.org/10.1021/acs.chemrestox.0c00264.
Riebesell, Janosh, Rhys E. A. Goodall, Philipp Benner, Yuan Chiang,
Bowen Deng, Gerbrand Ceder, Mark Asta, Alpha A. Lee, Anubhav Jain, and
Kristin A. Persson. 2025. “A Framework to Evaluate Machine
Learning Crystal Stability Predictions.” Nature Machine
Intelligence. https://doi.org/10.1038/s42256-025-01055-1.
Rives, Alexander, Joshua Meier, Tom Sercu, Siddharth Goyal, Zeming Lin,
Jason Liu, Demi Guo, et al. 2021. “Biological
structure and function emerge from scaling unsupervised learning to 250
million protein sequences.” Proceedings of the
National Academy of Sciences 118 (15). https://doi.org/10.1073/pnas.2016239118.
Rı́os-Garcı́a, Martiño, and Kevin Maik Jablonka. 2025.
“LLM-as-Judge Meets LLM-as-Optimizer:
Enhancing Organic Data Extraction Evaluations Through Dual
LLM Approaches.” AI for Accelerated Materials
Design - ICLR. https://openreview.net/forum?id=MjQml5U1Xq.
Rock, Charles. 2018. “A Hypothesis Can’t Be Right Unless It Can Be
Proven Wrong.” https://www.stjude.org/research/progress/2018/hypothesis-must-be-falsifiable.html.
Rouleau, Nicolas, and Nirosha J. Murugan. 2025. “The Risks and
Rewards of Embodying Artificial Intelligence with Cloud-Based
Laboratories.” Advanced Intelligent Systems 7 (1): 2400193. https://doi.org/10.1002/aisy.202400193.
Rubungo, Andre Niyongabo, Craig Arnold, Barry P. Rand, and Adji Bousso
Dieng. 2023. “LLM-Prop: Predicting Physical and Electronic
Properties of Crystalline Solids from Their Text
Descriptions.” arXiv Preprint arXiv: 2310.14029.
https://doi.org/10.48550/arXiv.2310.14029.
Ruffolo, Jeffrey A., and Ali Madani. 2024. “Designing proteins with language models.”
Nature Biotechnology 42 (2): 200–202. https://doi.org/10.1038/s41587-024-02123-4.
Rulev, Alexander Yu. 2017. “Serendipity or
the art of making discoveries.” New Journal of
Chemistry 41 (11): 4262–68. https://doi.org/10.1039/c7nj00182g.
Runcie, Nicholas T., Charlotte M. Deane, and Fergus Imrie. 2025.
“Assessing the Chemical Intelligence of Large Language
Models.” arXiv Preprint. https://doi.org/10.48550/arxiv.2505.07735.
Rupp, Matthias, Alexandre Tkatchenko, Klaus-Robert Müller, and O.
Anatole von Lilienfeld. 2012. “Fast and Accurate Modeling of
Molecular Atomization Energies with Machine Learning.”
Physical Review Letters 108 (5). https://doi.org/10.1103/physrevlett.108.058301.
Sakiyama, Hiroshi, Motohisa Fukuda, and Takashi Okuno. 2021.
“Prediction of Blood-Brain Barrier Penetration (BBBP) Based on
Molecular Descriptors of the Free-Form and In-Blood-Form
Datasets.” Molecules 26 (24). https://doi.org/10.3390/molecules26247428.
Sanchez-Fernandez, Ana, Elisabeth Rumetshofer, Sepp Hochreiter, and
Günter Klambauer. 2023. “CLOOME: Contrastive Learning Unlocks
Bioimaging Databases for Queries with Chemical Structures.”
Nature Communications 14 (1): 7339. https://doi.org/10.1038/s41467-023-42328-w.
Sandbrink, Jonas B. 2023. “Artificial Intelligence and Biological
Misuse: Differentiating Risks of Language Models and
Biological Design Tools.” arXiv Preprint
arXiv:2306.13952, December. https://doi.org/10.48550/arXiv.2306.13952.
Sanh, Victor, Lysandre Debut, Julien Chaumond, and Thomas Wolf. 2019.
“DistilBERT, a Distilled Version of BERT: Smaller, Faster, Cheaper
and Lighter.” arXiv Preprint arXiv:1910.01108. https://doi.org/10.48550/arXiv.1910.01108.
Sardiña, Víctor Juan Lamas, Daniel García-González, and Miguel Rodríguez
Luaces. 2024. “DSL-Xpert: LLM-Driven Generic DSL Code
Generation.” Proceedings of the 27th ACM/IEEE International
Conference on Model Driven Engineering Languages and Systems Companion
(MODELS Companion ’24), September. https://doi.org/10.1145/3652620.3687782.
Satariano, Adam, and Paul Mozur. 2025. “The A.I. Race Is Splitting
the World into Haves and Have-Nots.” https://www.nytimes.com/interactive/2025/06/23/technology/ai-computing-global-divide.html.
Satorras, Víctor Garcia, Emiel Hoogeboom, and Max Welling. 2021.
“E(n) equivariant graph neural
networks.” International Conference on Machine
Learning, 9323–32. https://doi.org/10.48550/arXiv.2102.09844.
Savitsky, Zack. 2025. “Exclusive: Start-up FutureHouse Debuts
Powerful AI ‘Reasoning Model’ for Science.”
Nature 642 (8068): 552–53. https://doi.org/10.1038/d41586-025-01753-1.
Scheidgen, Markus, Lauri Himanen, Alvin Noe Ladines, David Sikter,
Mohammad Nakhaee, Ádám Fekete, Theodore Chang, et al. 2023. “NOMAD: A distributed web-based platform for managing
materials science research data.” Journal of Open
Source Software 8 (90): 5388. https://doi.org/10.21105/joss.05388.
Schick, Timo, Jane Dwivedi-Yu, Roberto Dessì, Roberta Raileanu, Maria
Lomeli, Eric Hambro, Luke Zettlemoyer, Nicola Cancedda, and Thomas
Scialom. 2023. “Toolformer: Language Models Can Teach Themselves
to Use Tools.” Advances in Neural Information Processing
Systems 36: 68539–51. https://doi.org/10.48550/arXiv.2302.04761.
Schilling-Wilhelmi, Mara, Nawaf Alampara, and Kevin Maik Jablonka. 2025.
“Lifting the Benchmark Iceberg with Item-Response Theory.”
OpenReview. https://openreview.net/forum?id=ZyVQqK7mcP.
Schilling-Wilhelmi, Mara, and Kevin Maik Jablonka. 2024. “Using
Machine-Learning and Large-Language-Model Extracted Data to Predict
Copolymerizations.” AI for Accelerated Materials Design.
https://openreview.net/forum?id=zlutCyZ12H.
Schilling-Wilhelmi, Mara, Martiño Rı́os-Garcı́a, Sherjeel Shabih, Marı́a
Victoria Gil, Santiago Miret, Christoph T Koch, José A Márquez, and
Kevin Maik Jablonka. 2025. “From text to
insight: large language models for chemical data
extraction.” Chemical Society Reviews. https://doi.org/10.1039/d4cs00913d.
Schmidgall, Samuel, and Michael Moor. 2025. “AgentRxiv: Towards
Collaborative Autonomous Research.” arXiv Preprint arXiv:
2503.18102. https://doi.org/10.48550/arXiv.2503.18102.
Schmidgall, Samuel, Yusheng Su, Ze Wang, Ximeng Sun, Jialian Wu,
Xiaodong Yu, Jiang Liu, Michael Moor, Zicheng Liu, and Emad Barsoum.
2025. “Agent Laboratory: Using LLM Agents as Research
Assistants.” arXiv Preprint arXiv: 2501.04227. https://doi.org/10.48550/arXiv.2501.04227.
Schmidinger, Niklas, Lisa Schneckenreiter, Philipp Seidl, Johannes
Schimunek, Pieter-Jan Hoedt, Johannes Brandstetter, Andreas Mayr, Sohvi
Luukkonen, Sepp Hochreiter, and Günter Klambauer. 2025.
“Bio-xLSTM: Generative Modeling, Representation and in-Context
Learning of Biological and Chemical Sequences.” The
Thirteenth International Conference on Learning Representations,
ICLR. https://doi.org/10.48550/arXiv.2411.04165.
Schulman, John, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg
Klimov. 2017. “Proximal Policy Optimization Algorithms.”
arXiv Preprint arXiv: 1707.06347. https://doi.org/10.48550/arXiv.1707.06347.
Schwaller, Philippe, Teodoro Laino, Théophile Gaudin, Peter Bolgar,
Christopher A Hunter, Costas Bekas, and Alpha A Lee. 2019.
“Molecular Transformer: A Model for Uncertainty-Calibrated
Chemical Reaction Prediction.” ACS Central Science 5
(9): 1572–83. https://doi.org/10.1021/acscentsci.9b00576.
Schwaller, Philippe, Daniel Probst, Alain C. Vaucher, Vishnu H. Nair,
David Kreutter, Teodoro Laino, and Jean-Louis Reymond. 2021.
“Mapping the Space of Chemical Reactions Using Attention-Based
Neural Networks.” Nature Machine Intelligence 3 (2):
144–52. https://doi.org/10.1038/s42256-020-00284-w.
Sculley, David, Gary Holt, Daniel Golovin, Eugene Davydov, Todd
Phillips, Dietmar Ebner, Vinay Chaudhary, and Michael Young. 2014.
“Machine Learning: The High Interest Credit Card of Technical
Debt.” SE4ML: Software Engineering for Machine Learning (NIPS
2014 Workshop) 8. https://research.google/pubs/machine-learning-the-high-interest-credit-card-of-technical-debt/.
Segler, Marwin, Mike Preuß, and Mark P Waller. 2017. “Towards
‘AlphaChem’: Chemical Synthesis Planning with Tree Search and Deep Neural
Network Policies.” arXiv Preprint arXiv:1702.00020. https://doi.org/10.48550/arXiv.1702.00020.
Seifrid, Martin, Robert Pollice, Andrés Aguilar-Granda, Zamyla Morgan
Chan, Kazuhiro Hotta, Cher Tian Ser, Jenya Vestfrid, Tony C. Wu, and
Alán Aspuru-Guzik. 2022. “Autonomous Chemical Experiments:
Challenges and Perspectives on Establishing a Self-Driving Lab.”
Accounts of Chemical Research 55 (17): 2454–66. https://doi.org/10.1021/acs.accounts.2c00220.
Selivanov, Alexander, Oleg Y Rogov, Daniil Chesakov, Artem Shelmanov,
Irina Fedulova, and Dmitry V Dylov. 2023. “Medical image captioning via generative pretrained
transformers.” Scientific Reports 13 (1): 4171.
https://doi.org/10.1038/s41598-023-31223-5.
Shabih, Sherjeel, Christoph T Koch, Kevin Maik Jablonka, and José A.
Márquez. 2025. “Automated Data Extraction from Solar Cell
Literature Using Large Language Models.” AI for Accelerated
Materials Design - ICLR. https://openreview.net/forum?id=gwLX7cdESk.
Shao, Zekai, Siyu Yuan, Lin Gao, Yixuan He, Deqing Yang, and Siming
Chen. 2025. “Unlocking Scientific Concepts:
How Effective Are LLM-Generated Analogies for Student Understanding and
Classroom Practice?” arXiv Preprint arXiv:
2502.16895. https://doi.org/10.48550/arXiv.2502.16895.
Shao, Zhihong, Peiyi Wang, Qihao Zhu, Runxin Xu, Junxiao Song, Xiao Bi,
Haowei Zhang, et al. 2024. “DeepSeekMath:
Pushing the Limits of Mathematical Reasoning in Open Language
Models.” arXiv Preprint arXiv: 2402.03300. https://doi.org/10.48550/arXiv.2402.03300.
Sharma, Sahil, Puneet Mittal, Mukesh Kumar, and Vivek Bhardwaj. 2025.
“The role of large language models in
personalized learning: a systematic review of educational
impact.” Discover Sustainability 6 (1). https://doi.org/10.1007/s43621-025-01094-z.
Shazeer, Noam, Azalia Mirhoseini, Krzysztof Maziarz, Andy Davis, Quoc
Le, Geoffrey Hinton, and Jeff Dean. 2017. “Outrageously Large
Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer.”
arXiv Preprint arXiv:1701.06538. https://doi.org/10.48550/arXiv.1701.06538.
Shields, Benjamin J., Jason Stevens, Jun Li, Marvin Parasram, Farhan
Damani, Jesus I. Martinez Alvarado, Jacob M. Janey, Ryan P. Adams, and
Abigail G. Doyle. 2021. “Bayesian Reaction Optimization as a Tool
for Chemical Synthesis.” Nature 590 (7844): 89–96. https://doi.org/10.1038/s41586-021-03213-y.
Shoghi, Nima, Adeesh Kolluru, John R. Kitchin, Zachary W. Ulissi, C. L.
Zitnick, and Brandon M. Wood. 2023. “From Molecules to Materials:
Pre-Training Large Generalizable Models for Atomic Property
Prediction.” International Conference on Learning
Representations. https://doi.org/10.48550/arXiv.2310.16802.
Shorten, Connor, and Taghi M Khoshgoftaar. 2019. “A survey on image data augmentation for deep
learning.” Journal of Big Data 6 (1): 1–48. https://doi.org/10.1186/s40537-019-0197-0.
Shorten, Connor, Taghi M Khoshgoftaar, and Borko Furht. 2021.
“Text data augmentation for deep
learning.” Journal of Big Data 8 (1): 101. https://doi.org/10.1186/s40537-021-00492-0.
Shumailov, Ilia, Zakhar Shumaylov, Yiren Zhao, Nicolas Papernot, Ross
Anderson, and Yarin Gal. 2024. “AI Models Collapse When Trained on
Recursively Generated Data.” Nature 631 (8022): 755–59.
https://doi.org/10.1038/s41586-024-07566-y.
Si, Chenglei, Tatsunori Hashimoto, and Diyi Yang. 2025. “The
Ideation-Execution Gap: Execution Outcomes of LLM-Generated Versus Human
Research Ideas.” arXiv Preprint arXiv: 2506.20803. https://doi.org/10.48550/arXiv.2506.20803.
Si, Chenglei, Diyi Yang, and Tatsunori Hashimoto. 2025. “Can LLMs
Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP
Researchers.” International Conference on Learning
Representations. https://doi.org/10.48550/arXiv.2409.04109.
Silver, David, and Richard S Sutton. 2025. “Welcome to the Era of
Experience.” Google AI 1.
Singh, Nikhil, Lucy Lu Wang, and Jonathan Bragg. 2024. “FigurA11y: AI assistance for writing scientific alt
text.” Proceedings of the 29th International
Conference on Intelligent User Interfaces, 886–906. https://doi.org/10.1145/3640543.3645212.
Siska, Charlotte, Katerina Marazopoulou, Melissa Ailem, and James Bono.
2024. “Examining the robustness of LLM
evaluation to the distributional assumptions of
benchmarks.” Proceedings of the 62nd Annual Meeting of
the Association for Computational Linguistics (Volume 1: Long
Papers), 10406–21. https://doi.org/10.18653/v1/2024.acl-long.560.
Skalse, Joar, Nikolaus Howe, Dmitrii Krasheninnikov, and David Krueger.
2022. “Defining and Characterizing Reward Hacking.”
Advances in Neural Information Processing Systems 35. https://doi.org/10.48550/arXiv.2209.13085.
Skarlinski, Michael D, Sam Cox, Jon M Laurent, James D Braza, Michaela
Hinks, Michael J Hammerling, Manvitha Ponnapati, Samuel G Rodriques, and
Andrew D White. 2024. “Language Agents Achieve Superhuman
Synthesis of Scientific Knowledge.” arXiv Preprint
arXiv:2409.13740. https://doi.org/10.48550/arXiv.2409.13740.
Skinnider, Michael A. 2024. “Invalid SMILES
are beneficial rather than detrimental to chemical language
models.” Nature Machine Intelligence 6 (4):
437–48. https://doi.org/10.1038/s42256-024-00821-x.
Soares, Eduardo, Victor Yukio Shirasuna, Emilio Vital Brazil, Indra
Priyadarsini, and Seiji Takeda. 2025. “Multi-View
Mixture-of-Experts for Predicting Molecular Properties Using SMILES,
SELFIES, and Graph-Based Representations.” Machine Learning:
Science and Technology 6 (June): 025070. https://doi.org/10.1088/2632-2153/ade4ef.
Soares, Eduardo, Emilio Vital Brazil, Victor Shirasuna, Dmitry Zubarev,
Renato Cerqueira, and Kristin Schmidt. 2025. “A Mamba-Based
Foundation Model for Materials.” Npj Artificial
Intelligence 1 (1): 1–8. https://doi.org/10.1038/s44387-025-00009-7.
Son, Guijin, Jiwoo Hong, Honglu Fan, Heejeong Nam, Hyunwoo Ko, Seungwon
Lim, Jinyeop Song, et al. 2025. “When AI Co-Scientists Fail:
SPOT: A Benchmark for Automated Verification of Scientific
Research.” arXiv Preprint arXiv: 2505.11855. https://doi.org/10.48550/arXiv.2505.11855.
Song, Chan Hee, Jiaman Wu, Clayton Washington, Brian M Sadler, Wei-Lun
Chao, and Yu Su. 2023. “LLM-Planner: Few-Shot Grounded Planning
for Embodied Agents with Large Language Models.” Proceedings
of the IEEE/CVF International Conference on Computer Vision,
2998–3009. https://doi.org/10.1109/ICCV51070.2023.00280.
Spotte-Smith, Evan Walter Clark. 2025. “Considering the Ethics of
Large Machine Learning Models in the Chemical
Sciences.” ChemRxiv Preprint, March. https://doi.org/10.26434/chemrxiv-2025-ct5k8.
Srinivas, Sakhinana Sagar, and Venkataramana Runkana. 2024a.
“Crossing New Frontiers: Knowledge-Augmented Large Language Model
Prompting for Zero-Shot Text-Based de Novo Molecule Design.”
arXiv Preprint arXiv: 2408.11866. https://doi.org/10.48550/arXiv.2408.11866.
———. 2024b. “Cross-Modal Learning for Chemistry Property
Prediction: Large Language Models Meet Graph Machine
Learning.” arXiv Preprint arXiv:
2408.14964, August. https://doi.org/10.48550/arXiv.2408.14964.
Sriram, Anuroop, Benjamin Kurt Miller, Ricky T. Q. Chen, and Brandon M.
Wood. 2024. “FlowLLM: Flow Matching for Material Generation with
Large Language Models as Base Distributions.” arXiv Preprint
arXiv:2410.23405, October. https://doi.org/10.48550/arXiv.2410.23405.
Stanley, Kenneth O., and Joel Lehman. 2015. Why Greatness Cannot Be
Planned: The Myth of the Objective. Cham, Switzerland: Springer. https://doi.org/10.1007/978-3-319-15524-1.
Stanley, Kenneth O., Joel Lehman, and Lisa Soros. 2017.
“Open-Endedness: The Last Grand Challenge You’ve Never Heard
Of.” https://www.oreilly.com/radar/open-endedness-the-last-grand-challenge-youve-never-heard-of/.
Starace, Giulio, Oliver Jaffe, Dane Sherburn, James Aung, Jun Shern
Chan, Leon Maksin, Rachel Dias, et al. 2025. “PaperBench:
Evaluating AI’s Ability to Replicate AI Research.” arXiv
Preprint arXiv: 2504.01848. https://doi.org/10.48550/arXiv.2504.01848.
“Statement on AI Risk | CAIS.” n.d. Accessed May 24, 2025. https://www.safe.ai/work/statement-on-ai-risk.
Stechly, Kaya, Karthik Valmeekam, and Subbarao Kambhampati. 2024.
“Chain of Thoughtlessness? An Analysis of CoT in Planning.”
The Thirty-Eighth Annual Conference on Neural Information Processing
Systems. https://doi.org/10.48550/arXiv.2405.04776.
Steiner, Sebastian, Jakob Wolf, Stefan Glatzel, Anna Andreou, Jarosław
M. Granda, Graham Keenan, Trevor Hinkley, et al. 2019. “Organic
Synthesis in a Modular Robotic System Driven by a Chemical Programming
Language.” Science 363 (6423): eaav2211. https://doi.org/10.1126/science.aav2211.
Strateos. 2023. “Autoprotocol Specification.” https://autoprotocol.org/specification/.
Strieth-Kalthoff, Felix, Han Hao, Vandana Rathore, Joshua Derasp,
Théophile Gaudin, Nicholas H. Angello, Martin Seifrid, et al. 2024.
“Delocalized, Asynchronous, Closed-Loop Discovery of Organic Laser
Emitters.” Science 384 (6697): eadk9227. https://doi.org/10.1126/science.adk9227.
Strubell, Emma, Ananya Ganesh, and Andrew McCallum. 2019. “Energy
and Policy Considerations for Deep Learning in NLP.” arXiv
Preprint arXiv: 1906.02243. https://doi.org/10.48550/arXiv.1906.02243.
Subasinghe, S. M. Supundrika, Simon G. Gersib, and Neal P. Mankad. 2025.
“Large Language Models (LLMs) as Graphing
Tools for Advanced Chemistry Education and Research.”
Journal of Chemical Education, March. https://doi.org/10.1021/acs.jchemed.4c01498.
Sun, Kunyang, Dorian Bagni, Joseph M. Cavanagh, Yingze Wang, Jacob M.
Sawyer, Andrew Gritsevskiy, Oufan Zhang, and Teresa Head-Gordon. 2025.
“SynLlama: Generating Synthesizable Molecules and Their Analogs
with Large Language Models.” arXiv Preprint arXiv: 2503.12602,
April. https://doi.org/10.48550/arXiv.2503.12602.
Sun, Liangtai, Danyu Luo, Da Ma, Zihan Zhao, Baocai Chen, Zhennan Shen,
Su Zhu, Lu Chen, Xin Chen, and Kai Yu. 2024. “SciDFM: A Large
Language Model with Mixture-of-Experts for Science.” arXiv
Preprint arXiv:2409.18412. https://doi.org/10.48550/arXiv.2409.18412.
Sypetkowski, Maciej, Frederik Wenkel, Farimah Poursafaei, Nia Dickson,
Karush Suri, Philip Fradkin, and Dominique Beaini. 2024. “On the
Scalability of GNNs for Molecular Graphs.” Advances in Neural
Information Processing Systems 37: 19870–906. https://doi.org/10.48550/arXiv.2404.11568.
Taber, Keith S. 2014. “The Significance of Implicit Knowledge for
Learning and Teaching Chemistry.” Chem. Educ. Res.
Pract. 15 (4): 447–61. https://doi.org/10.1039/c4rp00124a.
Takeda, Seiji, Indra Priyadarsini, Akihiro Kishimoto, Hajime Shinohara,
Lisa Hamada, Hirose Masataka, Junta Fuchiwaki, and Daiju Nakano. 2023.
“Multi-Modal Foundation Model for Material Design.” AI
for Accelerated Materials Design-NeurIPS 2023 Workshop. https://openreview.net/forum?id=EiT2bLsfM9.
Tang, Xiangru, Qiao Jin, Kunlun Zhu, Tongxin Yuan, Yichi Zhang,
Wangchunshu Zhou, Meng Qu, et al. 2024. “Prioritizing
Safeguarding Over Autonomy: Risks of LLM Agents for
Science.” arXiv Preprint arXiv: 2402.04247,
June. https://doi.org/10.48550/arXiv.2402.04247.
Taylor, Connor J., Alexander Pomberger, Kobi C. Felton, Rachel Grainger,
Magda Barecka, Thomas W. Chamberlain, Richard A. Bourne, Christopher N.
Johnson, and Alexei A. Lapkin. 2023. “A Brief Introduction to
Chemical Reaction Optimization.” Chemical Reviews 123
(6): 3089–3126. https://doi.org/10.1021/acs.chemrev.2c00798.
Taylor, Ross, Marcin Kardas, Guillem Cucurull, Thomas Scialom, Anthony
Hartshorn, Elvis Saravia, Andrew Poulton, Viktor Kerkez, and Robert
Stojnic. 2022. “Galactica: A Large Language Model for
Science.” arXiv Preprint arXiv:2211.09085. https://doi.org/10.48550/arXiv.2211.09085.
The Danish National Committee on Health Research Ethics. 2024.
“Hypothesis-Generating Research.” https://researchethics.dk/guidelines/-guidance-on-hypothesis-generating-research.
Thompson, Derek. 2025. “Why Chatbots Keep
Beating the Tests.” The Atlantic, March. https://www.theatlantic.com/technology/archive/2025/03/chatbots-benchmark-tests/681929/.
Thrush, Tristan, Christopher Potts, and Tatsunori Hashimoto. 2024.
“Improving Pretraining Data Using Perplexity Correlations.”
arXiv Preprint arXiv:2409.05816. https://doi.org/10.48550/arXiv.2409.05816.
Tian, Minyang, Luyu Gao, Shizhuo Zhang, Xinan Chen, Cunwei Fan, Xuefei
Guo, Roland Haas, et al. 2024. “SciCode: A Research Coding
Benchmark Curated by Scientists.” Advances in Neural
Information Processing Systems 37: 30624–50. https://doi.org/10.48550/arXiv.2407.13168.
Tian, Siyu Isaac Parker, Aron Walsh, Zekun Ren, Qianxiao Li, and Tonio
Buonassisi. 2022. “What Information is
Necessary and Sufficient to Predict Materials Properties using Machine
Learning?” arXiv Preprint. https://doi.org/10.48550/arXiv.2206.04968.
Tikhonov, Alexey, and Ivan P. Yamshchikov. 2023. “Post Turing: Mapping the landscape of LLM
Evaluation.” arXiv Preprint arXiv: 2311.02049. https://doi.org/10.48550/arXiv.2311.02049.
Tom, Gary, Stefan P. Schmid, Sterling G. Baird, Yang Cao, Kourosh
Darvish, Han Hao, Stanley Lo, et al. 2024. “Self-Driving
Laboratories for Chemistry and Materials Science.” Chemical
Reviews 124 (16): 9633–732. https://doi.org/10.1021/acs.chemrev.4c00055.
Trager, Robert, Ben Harack, Anka Reuel, Allison Carnegie, Lennart Heim,
Lewis Ho, Sarah Kreps, et al. 2023. “International Governance
of Civilian AI: A Jurisdictional Certification
Approach.” arXiv Preprint arXiv:
2308.15514, September. https://doi.org/10.48550/arXiv.2308.15514.
Trewartha, Amalie, Nicholas Walker, Haoyan Huo, Sanghoon Lee, Kevin
Cruse, John Dagdelen, Alexander Dunn, Kristin A Persson, Gerbrand Ceder,
and Anubhav Jain. 2022. “Quantifying the Advantage of
Domain-Specific Pre-Training on Named Entity Recognition Tasks in
Materials Science.” Patterns 3 (4).
Trinh, Trieu H, Yuhuai Wu, Quoc V Le, He He, and Thang Luong. 2024.
“Solving olympiad geometry without human
demonstrations.” Nature 625 (7995): 476–82. https://doi.org/10.1038/s41586-023-06747-5.
Tsai, Meng-Lin, Chong Wei Ong, and Cheng-Liang Chen. 2023. “Exploring the use of large language models (LLMs) in
chemical engineering education: Building core course problem models with
Chat-GPT.” Education for Chemical Engineers 44
(July): 71–95. https://doi.org/10.1016/j.ece.2023.05.001.
Tshitoyan, Vahe, John Dagdelen, Leigh Weston, Alexander Dunn, Ziqin
Rong, Olga Kononova, Kristin A. Persson, Gerbrand Ceder, and Anubhav
Jain. 2019. “Unsupervised Word Embeddings Capture Latent Knowledge
from Materials Science Literature.” Nature 571 (7763):
95–98. https://doi.org/10.1038/s41586-019-1335-8.
Tu, Zhengkai, Sourabh J Choure, Mun Hong Fong, Jihye Roh, Itai Levin,
Kevin Yu, Joonyoung F Joung, et al. 2025. “ASKCOS: an open source software suite for synthesis
planning.” arXiv Preprint arXiv:2501.01835. https://doi.org/10.48550/arXiv.2501.01835.
Unke, Oliver T, Stefan Chmiela, Huziel E Sauceda, Michael Gastegger,
Igor Poltavsky, Kristof T Schütt, Alexandre Tkatchenko, and Klaus-Robert
Müller. 2021. “Machine learning force
fields.” Chemical Reviews 121 (16): 10142–86. https://doi.org/10.1021/acs.chemrev.0c01111.
Urbina, Fabio, Filippa Lentzos, Cedric Invernizzi, and Sean Ekins. 2022.
“Dual use of artificial-intelligence-powered
drug discovery.” Nature Machine Intelligence 4
(3): 189–91. https://doi.org/10.1038/s42256-022-00465-9.
Van Herck, Joren, Marı́a Victoria Gil, Kevin Maik Jablonka, Alex Abrudan,
Andy S. Anker, Mehrdad Asgari, Ben Blaiszik, et al. 2025. “Assessment of fine-tuned large language models for
real-world chemistry and material science applications.”
Chemical Science 16 (2): 670–84. https://doi.org/10.1039/D4SC04401K.
Vangala, Sarveswara Rao, Sowmya Ramaswamy Krishnan, Navneet Bung,
Dhandapani Nandagopal, Gomathi Ramasamy, Satyam Kumar, Sridharan
Sankaran, Rajgopal Srinivasan, and Arijit Roy. 2024. “Suitability
of Large Language Models for Extraction of High-Quality Chemical
Reaction Dataset from Patent Literature.” Journal of
Cheminformatics 16 (1): 131. https://doi.org/10.1186/s13321-024-00928-8.
Vaswani, Ashish, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion
Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017.
“Attention Is All You Need.” NeurIPS. https://doi.org/10.48550/arXiv.1706.03762.
Vaucher, Alain C., Federico Zipoli, Joppe Geluykens, Vishnu H. Nair,
Philippe Schwaller, Teodoro Laino, et al. 2020. “Automated
Extraction of Chemical Synthesis Actions from Experimental
Procedures.” Nature Communications 11 (1). https://doi.org/10.1038/s41467-020-17266-6.
Veličković, Petar. 2023. “Everything Is Connected: Graph Neural
Networks.” Current Opinion in Structural Biology 79:
102538. https://doi.org/10.1016/j.sbi.2023.102538.
Vincent, Pascal, Hugo Larochelle, Yoshua Bengio, and Pierre-Antoine
Manzagol. 2008. “Extracting and Composing Robust Features with
Denoising Autoencoders.” Proceedings of the 25th
International Conference on Machine Learning, 1096–1103. https://doi.org/10.1145/1390156.1390294.
Vincent, Pascal, Hugo Larochelle, Isabelle Lajoie, Yoshua Bengio,
Pierre-Antoine Manzagol, and Léon Bottou. 2010. “Stacked denoising autoencoders: Learning useful
representations in a deep network with a local denoising
criterion.” Journal of Machine Learning Research
11 (12). https://jmlr.org/papers/v11/vincent10a.html.
Von Oswald, Johannes, Eyvind Niklasson, Ettore Randazzo, João
Sacramento, Alexander Mordvintsev, Andrey Zhmoginov, and Max Vladymyrov.
2023. “Transformers Learn in-Context by Gradient Descent.”
International Conference on Machine Learning, 35151–74. https://doi.org/10.48550/arXiv.2212.07677.
Vriza, Aikaterini, Henry C. Chan, Jie Xu, Keith L. Barnett, Ian
Staffell, Oleksandr Stanevich, Siqi Du, et al. 2023. “Self-Driving
Laboratory for Polymer Electronics.” Chemistry of
Materials 35 (8): 3046–56. https://doi.org/10.1021/acs.chemmater.2c03593.
Wan, Yuwei, Tong Xie, Nan Wu, Wenjie Zhang, Chunyu Kit, and Bram Hoex.
2024. “From Tokens to Materials: Leveraging Language Models for
Scientific Discovery.” arXiv Preprint arXiv: 2410.16165.
https://doi.org/10.48550/arXiv.2410.16165.
Wang, Anthony Yu-Tung, Steven K. Kauwe, Ryan J. Murdock, and Taylor D.
Sparks. 2021. “Compositionally restricted
attention-based network for materials property
predictions.” Npj Computational Materials 7 (1).
https://doi.org/10.1038/s41524-021-00545-1.
Wang, Chengshi, Yeon-Ju Kim, Aikaterini Vriza, Rohit Batra, Arun
Baskaran, Naisong Shan, Nan Li, et al. 2025. “Autonomous Platform
for Solution Processing of Electronic Polymers.” Nature
Communications 16 (1): 1498. https://doi.org/10.1038/s41467-024-55655-3.
Wang, Evan, Federico Cassano, Catherine Wu, Yunfeng Bai, Will Song,
Vaskar Nath, Ziwen Han, Sean Hendryx, Summer Yue, and Hugh Zhang. 2024.
“Planning in Natural Language Improves LLM Search for Code
Generation.” arXiv Preprint arXiv:2409.03733. https://doi.org/10.48550/arXiv.2409.03733.
Wang, Hanchen, Tianfan Fu, Yuanqi Du, Wenhao Gao, Kexin Huang, Ziming
Liu, Payal Chandak, et al. 2023. “Scientific Discovery in the Age
of Artificial Intelligence.” Nature 620 (7972): 47–60.
https://doi.org/10.1038/s41586-023-06221-2.
Wang, Haorui, Jeff Guo, Lingkai Kong, Rampi Ramprasad, Philippe
Schwaller, Yuanqi Du, and Chao Zhang. 2025. “LLM-Augmented
Chemical Synthesis and Design Decision Programs.” arXiv
Preprint arXiv: 2505.07027. https://doi.org/10.48550/arXiv.2505.07027.
Wang, Haorui, Marta Skreta, Cher Tian Ser, Wenhao Gao, Lingkai Kong,
Felix Strieth-Kalthoff, Chenru Duan, et al. 2025. “Efficient
Evolutionary Search over Chemical Space with Large Language
Models.” The Thirteenth International Conference on Learning
Representations, ICLR 2025, Singapore, April 24-28,
2025. https://doi.org/10.48550/arXiv.2406.16976.
Wang, Jin, and Wenxiang Fan. 2025. “The Effect of ChatGPT on
Students’ Learning Performance, Learning Perception, and Higher-Order
Thinking: Insights from a Meta-Analysis.” Humanities and
Social Sciences Communications 12 (1). https://doi.org/10.1057/s41599-025-04787-y.
Wang, Qingyun, Doug Downey, Heng Ji, and Tom Hope. 2023. “SciMON:
Scientific Inspiration Machines Optimized for Novelty.” arXiv
Preprint arXiv: 2305.14259. https://doi.org/10.48550/arXiv.2305.14259.
Wang, Xinyu Jessica, Christine Lee, and Bilge Mutlu. 2025. “LearnMate: Enhancing Online Education with LLM-Powered
Personalized Learning Plans and Support.” CHI Extended
Abstracts. https://doi.org/10.1145/3706599.3719857.
Wang, Ye, Honggang Zhao, Simone Sciabola, and Wenlu Wang. 2023.
“cMolGPT: A Conditional Generative Pre-Trained Transformer for
Target-Specific de Novo Molecular Generation.” Molecules
28 (11): 4430. https://doi.org/10.3390/molecules28114430.
Wang, Yuyang, Jianren Wang, Zhonglin Cao, and Amir Barati Farimani.
2022. “Molecular Contrastive Learning of Representations via Graph
Neural Networks.” Nature Machine Intelligence 4 (3):
279–87. https://doi.org/10.1038/s42256-022-00447-x.
Wang, Yuyang, Changwen Xu, Zijie Li, and Amir Barati Farimani. 2023.
“Denoise Pretraining on Nonequilibrium Molecules for Accurate and
Transferable Neural Potentials.” Journal of Chemical Theory
and Computation 19 (15): 5077–87. https://doi.org/10.1021/acs.jctc.3c00289.
Wang, Zhenbin, Kevin Cruse, Yifei Fei, Aaron Chia, Yihuang Zeng, Haozhe
Huo, Tianxiao He, Bowen Deng, Olga Kononova, and Gerbrand Ceder. 2022.
“ULSA: Unified Language of Synthesis Actions for the
Representation of Inorganic Synthesis Protocols.” Digital
Discovery 1 (3): 313–24. https://doi.org/10.1039/D2DD00049D.
Warr, Wendy A. 2014. “A short review of
chemical reaction database systems, computer-aided synthesis design,
reaction prediction and synthetic feasibility.”
Molecular Informatics 33 (6-7): 469–76. https://doi.org/10.1002/minf.201400052.
Wei, Jason, Zhiqing Sun, Spencer Papay, Scott McKinney, Jeffrey Han, Isa
Fulford, Hyung Won Chung, Alex Tachard Passos, William Fedus, and Amelia
Glaese. 2025. “BrowseComp: A Simple yet Challenging Benchmark for
Browsing Agents.” arXiv Preprint arXiv:2504.12516. https://doi.org/10.48550/arXiv.2504.12516.
Wei, Jason, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Fei Xia, Ed
Chi, Quoc V Le, Denny Zhou, et al. 2022. “Chain-of-Thought
Prompting Elicits Reasoning in Large Language Models.”
Advances in Neural Information Processing Systems 35: 24824–37.
https://doi.org/10.48550/arXiv.2201.11903.
Wei, Jason, and Kai Zou. 2019. “EDA: Easy
Data Augmentation Techniques for Boosting Performance on Text
Classification Tasks.” arXiv Preprint. https://doi.org/10.48550/arXiv.1901.11196.
Weininger, David. 1988. “SMILES, a Chemical Language
and Information System. 1. Introduction to Methodology and
Encoding Rules.” Journal of Chemical Information and Computer
Sciences 28 (1). https://doi.org/10.1021/ci00057a005.
Wellawatte, Geemi P, and Philippe Schwaller. 2025. “Human interpretable structure-property relationships in
chemistry using explainable machine learning and large language
models.” Communications Chemistry 8 (1): 11. https://doi.org/10.1038/s42004-024-01393-y.
Wellawatte, Geemi P, Aditi Seshadri, and Andrew D White. 2022.
“Model Agnostic Generation of Counterfactual Explanations for
Molecules.” Chemical Science 13 (13): 3697–3705. https://doi.org/10.1039/d1sc05259d.
Weng, Lilian. 2022. “Generalized Visual Language Models.”
Lil’Log, June. https://lilianweng.github.io/posts/2022-06-09-vlm/.
Wenzel, Makarius, Lawrence C Paulson, and Tobias Nipkow. 2008.
“The Isabelle framework.”
International Conference on Theorem Proving in Higher Order
Logics, 33–38. https://doi.org/10.1007/978-3-540-71067-7_7.
White, Andrew D. 2023. “The future of
chemistry is language.” Nature Reviews Chemistry
7 (7): 457–58. https://doi.org/10.1038/s41570-023-00502-0.
Wierenga, Rick P., Stefan M. Golas, Wilson Ho, Connor W. Coley, and
Kevin M. Esvelt. 2023. “PyLabRobot: An Open-Source,
Hardware-Agnostic Interface for Liquid-Handling Robots and
Accessories.” Device 1 (4): 100111. https://doi.org/10.1016/j.device.2023.100111.
Wilbraham, Liam, S. Hessam M. Mehr, and Leroy Cronin. 2021.
“Digitizing Chemistry Using the Chemical Processing Unit: From
Synthesis to Discovery.” Accounts of Chemical Research
54 (2): 253–62. https://doi.org/10.1021/acs.accounts.0c00674.
Wilson, Andrew Gordon. 2025. “Deep Learning
is Not So Mysterious or Different.” arXiv Preprint
arXiv: 2503.02113. https://doi.org/10.48550/arXiv.2503.02113.
Wood, Brandon M., Misko Dzamba, Xiang Fu, Meng Gao, Muhammed Shuaibi,
Luis Barroso-Luque, Kareem Abdelmaqsoud, et al. 2025. “UMA: A
Family of Universal Models for Atoms.” arXiv Preprint.
https://doi.org/10.48550/arXiv.2506.23971.
Wu, Jianchang, Luca Torresi, ManMan Hu, Patrick Reiser, Jiyun Zhang,
Juan S. Rocha-Ortiz, Luyao Wang, et al. 2024. “Inverse Design
Workflow Discovers Hole-Transport Materials Tailored for Perovskite
Solar Cells.” Science 386 (6727): 1256–64. https://doi.org/10.1126/science.ads0901.
Wu, Juan-Ni, Tong Wang, Yue Chen, Li-Juan Tang, Hai-Long Wu, and Ru-Qin
Yu. 2024. “t-SMILES: a fragment-based
molecular representation framework for de novo ligand
design.” Nature Communications 15 (1): 4993. https://doi.org/10.1038/s41467-024-49388-6.
Wu, Qingyun, Gagan Bansal, Jieyu Zhang, Yiran Wu, Beibin Li, Erkang Zhu,
Li Jiang, et al. 2023. “AutoGen: Enabling Next-Gen LLM
Applications via Multi-Agent Conversation.” arXiv Preprint
arXiv:2308.08155. https://doi.org/10.48550/arXiv.2308.08155.
Wu, Tongwei, Yao Sun, Xiaoxi Guo, Lin Tian, Yanning Zhang, Haitao Zhao,
and Yuen Wu. 2025. “A Large Language Models-Guided Grand Canonical
DFT Framework for Accelerating the Discovery of Efficient
Electrocatalysts.”
Wu, Yuhuai, Albert Qiaochu Jiang, Wenda Li, Markus Rabe, Charles Staats,
Mateja Jamnik, and Christian Szegedy. 2022. “Autoformalization with Large Language
Models.” Advances in Neural Information Processing
Systems 35: 32353–68. https://doi.org/10.48550/arXiv.2205.12615.
Wu, Zhenqin, Bharath Ramsundar, Evan N. Feinberg, Joseph Gomes, Caleb
Geniesse, Aneesh S. Pappu, Karl Leswing, and Vijay Pande. 2018.
“MoleculeNet: a benchmark for molecular
machine learning.” Chemical Science 9 (2):
513–30. https://doi.org/10.1039/c7sc02664a.
Xiao, Hang, Rong Li, Xiaoyang Shi, Yan Chen, Liangliang Zhu, Xi Chen,
and Lei Wang. 2023. “An invertible, invariant
crystal representation for inverse design of solid-state materials using
generative deep learning.” Nature Communications
14 (1). https://doi.org/10.1038/s41467-023-42870-7.
Xie, Tong, Yuwei Wan, Wei Huang, Zhenyu Yin, Yixuan Liu, Shaozhou Wang,
Qingyuan Linghu, et al. 2023. “Darwin series:
Domain specific large language models for natural
science.” arXiv Preprint arXiv:2308.13565. https://doi.org/10.48550/arXiv.2308.13565.
Xie, Tong, Yuwei Wan, Yixuan Liu, Yuchen Zeng, Shaozhou Wang, Wenjie
Zhang, Clara Grazian, et al. 2025. “DARWIN 1.5: Large Language Models
as Materials Science Adapted Learners.” arXiv Preprint
arXiv:2412.11970, January. https://doi.org/10.48550/arXiv.2412.11970.
Xin, Huajian, Daya Guo, Zhihong Shao, Zhizhou Ren, Qihao Zhu, Bo Liu,
Chong Ruan, Wenda Li, and Xiaodan Liang. 2024. “DeepSeek-Prover: Advancing theorem proving in LLMs
through large-scale synthetic data.” arXiv Preprint
arXiv:2405.14333. https://doi.org/10.48550/arXiv.2405.14333.
Xu, Fengli, Qianyue Hao, Zefang Zong, Jingwei Wang, Yunke Zhang, Jingyi
Wang, Xiaochong Lan, et al. 2025. “Towards Large Reasoning Models:
A Survey of Reinforced Reasoning with Large Language Models.”
arXiv Preprint. https://doi.org/10.48550/arXiv.2501.09686.
Yamada, Yutaro, Robert Tjarko Lange, Cong Lu, Shengran Hu, Chris Lu,
Jakob Foerster, Jeff Clune, and David Ha. 2025. “The AI
Scientist-V2: Workshop-Level Automated Scientific Discovery via Agentic
Tree Search.” arXiv Preprint arXiv: 2504.08066. https://doi.org/10.48550/arXiv.2504.08066.
Yan, Cong, and Yeye He. 2020. “Auto-Suggest: Learning-to-Recommend
Data Preparation Steps Using Data Science Notebooks.”
Proceedings of the 2020 ACM SIGMOD International Conference on
Management of Data, SIGMOD/PODS ’20, May. https://doi.org/10.1145/3318464.3389738.
Yang, Chengrun, Xuezhi Wang, Yifeng Lu, Hanxiao Liu, Quoc V. Le, Denny
Zhou, and Xinyun Chen. 2023. “Large Language Models as
Optimizers.” arXiv Preprint arXiv: 2309.03409. https://doi.org/10.48550/arXiv.2309.03409.
Yang, Wuyue, Liangrong Peng, Yi Zhu, and Liu Hong. 2020. “When machine learning meets multiscale modeling in
chemical reactions.” The Journal of Chemical
Physics 153 (9). https://doi.org/10.1063/5.0015779.
Yang, Yuzhe, Yujia Liu, Xin Liu, Avanti Gulhane, Domenico Mastrodicasa,
Wei Wu, Edward J. Wang, Dushyant W. Sahani, and Shwetak Patel. 2024.
“Demographic Bias of Expert-Level Vision-Language Foundation Models in
Medical Imaging.” arXiv Preprint arXiv:2402.14815, February.
https://doi.org/10.48550/arXiv.2402.14815.
Yang, Zonglin, Wanhao Liu, Ben Gao, Yujie Liu, Wei Li, Tong Xie, Lidong
Bing, Wanli Ouyang, Erik Cambria, and Dongzhan Zhou. 2025.
“MOOSE-Chem2: Exploring LLM Limits in Fine-Grained Scientific
Hypothesis Discovery via Hierarchical Search.” arXiv Preprint
arXiv: 2505.19209. https://doi.org/10.48550/arXiv.2505.19209.
Yang, Zonglin, Wanhao Liu, Ben Gao, Tong Xie, Yuqiang Li, Wanli Ouyang,
Soujanya Poria, Erik Cambria, and Dongzhan Zhou. 2025.
“MOOSE-Chem: Large Language Models for Rediscovering Unseen
Chemistry Scientific Hypotheses.” The Thirteenth
International Conference on Learning Representations,
ICLR. https://doi.org/10.48550/arXiv.2410.07076.
Yano, Junko, Kelly J Gaffney, John Gregoire, Linda Hung, Abbas Ourmazd,
Joshua Schrier, James A Sethian, and Francesca M Toma. 2022.
“The case for data science in experimental
chemistry: examples and recommendations.” Nature
Reviews Chemistry 6 (5): 357–70. https://doi.org/10.1038/s41570-022-00382-w.
Yao, Shunyu, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik
Narasimhan, and Yuan Cao. 2023. “ReAct: Synergizing Reasoning and
Acting in Language Models.” International Conference on
Learning Representations (ICLR). https://doi.org/10.48550/arXiv.2210.03629.
Yao, Zhenpeng, Yanwei Lum, Andrew Johnston, Luis Martin Mejia-Mendoza,
Xin Zhou, Yonggang Wen, Alán Aspuru-Guzik, Edward H. Sargent, and Zhi
Wei Seh. 2022. “Machine Learning for a Sustainable Energy
Future.” Nature Reviews Materials 8 (3): 202–15. https://doi.org/10.1038/s41578-022-00490-5.
Yona, Itay, Ilia Shumailov, Jamie Hayes, and Nicholas Carlini. 2024.
“Stealing User Prompts from Mixture of Experts.” arXiv Preprint
arXiv:2410.22884, October. https://doi.org/10.48550/arXiv.2410.22884.
Yoshikai, Yasuhiro, Tadahaya Mizuno, Shumpei Nemoto, and Hiroyuki
Kusuhara. 2024. “A Novel Molecule Generative Model of VAE Combined
with Transformer for Unseen Structure Generation.” arXiv
Preprint arXiv: 2402.11950. https://doi.org/10.48550/arXiv.2402.11950.
Yoshikawa, Naruki, Marta Skreta, Kourosh Darvish, Sebastian
Arellano-Rubach, Zhi Ji, Lasse Bjørn Kristensen, Andrew Zou Li, et al.
2023. “Large Language Models for Chemistry Robotics.”
Autonomous Robots 47 (8): 1057–86. https://doi.org/10.1007/s10514-023-10136-2.
Yu, Botao, Frazier N. Baker, Ziqi Chen, Xia Ning, and Huan Sun. 2024.
“LlaSMol: Advancing Large Language Models for
Chemistry with a Large-Scale, Comprehensive, High-Quality Instruction
Tuning Dataset.” arXiv Preprint arXiv:
2402.09391. https://doi.org/10.48550/arXiv.2402.09391.
Yu, Jiajun, Yizhen Zheng, Huan Yee Koh, Shirui Pan, Tianyue Wang, and
Haishuai Wang. 2025. “Collaborative Expert LLMs Guided
Multi-Objective Molecular Optimization.” arXiv Preprint.
https://doi.org/10.48550/arXiv.2503.03503.
Zaki, Mohd, Jayadeva, Mausam, and N. M. Anoop Krishnan. 2023.
“MaScQA: A Question Answering Dataset for
Investigating Materials Science Knowledge of Large Language
Models.” arXiv Preprint arXiv: 2308.09115. https://doi.org/10.48550/arXiv.2308.09115.
Zhang, Di, Wei Liu, Qian Tan, Jingdan Chen, Hang Yan, Yuliang Yan,
Jiatong Li, et al. 2024. “ChemLLM: A chemical
large language model.” arXiv Preprint. https://doi.org/10.48550/arXiv.2402.06852.
Zhang, Jenny, Shengran Hu, Cong Lu, Robert Lange, and Jeff Clune. 2025.
“Darwin Gödel Machine: Open-Ended Evolution of Self-Improving
Agents.” arXiv Preprint. https://doi.org/10.48550/arXiv.2505.22954.
Zhang, Jenny, Joel Lehman, Kenneth O. Stanley, and Jeff Clune. 2024.
“OMNI: Open-Endedness via Models of Human Notions of
Interestingness.” International Conference on Learning
Representations. https://doi.org/10.48550/arXiv.2306.01711.
Zhang, Qiang, Keyan Ding, Tianwen Lv, Xinda Wang, Qingyu Yin, Yiwen
Zhang, Jing Yu, et al. 2025. “Scientific Large Language Models: A
Survey on Biological & Chemical Domains.” ACM Computing
Surveys 57 (6): 1–38. https://doi.org/10.1145/3715318.
Zhang, Wei, Qinggong Wang, Xiangtai Kong, Jiacheng Xiong, Shengkun Ni,
Duanhua Cao, Buying Niu, et al. 2024. “Fine-Tuning Large Language
Models for Chemical Text Mining.” Chemical Science 15
(27): 10600–10611. https://doi.org/10.1039/D4SC00924J.
Zhang, Yu, Yang Han, Shuai Chen, Ruijie Yu, Xin Zhao, Xianbin Liu,
Kaipeng Zeng, et al. 2025. “Large Language Models to Accelerate
Organic Chemistry Synthesis.” Nature Machine
Intelligence. https://doi.org/10.1038/s42256-025-01066-y.
Zhao, Zihan, Da Ma, Lu Chen, Liangtai Sun, Zihao Li, Yi Xia, Bo Chen, et
al. 2024. “ChemDFM: A Large Language Foundation Model for
Chemistry.” arXiv Preprint. https://doi.org/10.48550/arXiv.2401.14818.
Zheng, Yizhen, Huan Yee Koh, Jiaxin Ju, Anh T. N. Nguyen, Lauren T. May,
Geoffrey I. Webb, and Shirui Pan. 2025. “Large language models for scientific discovery in
molecular property prediction.” Nature Machine
Intelligence 7 (3): 437–47. https://doi.org/10.1038/s42256-025-00994-z.
Zheng, Zhiling, Zhiguo He, Omar Khattab, Nakul Rampal, Matei A. Zaharia,
Christian Borgs, Jennifer T. Chayes, and Omar M. Yaghi. 2024.
“Image and Data Mining in Reticular Chemistry Powered by
GPT-4V.” Digital Discovery 3 (3): 491–501. https://doi.org/10.1039/d3dd00239j.
Zheng, Zhiling, Oufan Zhang, Christian Borgs, Jennifer T. Chayes, and Omar M. Yaghi. 2023.
“ChatGPT Chemistry Assistant for Text Mining and Prediction of MOF
Synthesis.” Journal of the American Chemical Society. https://doi.org/10.1021/jacs.3c05819.
Zhou, Andy, and Ron Arel. 2025. “Tempest: Autonomous Multi-Turn
Jailbreaking of Large Language Models with Tree Search.”
arXiv Preprint. https://doi.org/10.48550/arXiv.2503.10619.
Zhou, Hattie, Arwen Bradley, Etai Littwin, Noam Razin, Omid Saremi, Josh
Susskind, Samy Bengio, and Preetum Nakkiran. 2023. “What
Algorithms Can Transformers Learn? A Study in Length
Generalization.” arXiv Preprint. https://doi.org/10.48550/arXiv.2310.16028.
Zhou, Yujun, Jingdong Yang, Yue Huang, Kehan Guo, Zoe Emory, Bikram
Ghosh, Amita Bedar, et al. 2024. “LabSafety
Bench: Benchmarking LLMs on Safety Issues in Scientific
Labs.” arXiv Preprint. https://doi.org/10.48550/arXiv.2410.14182.
Zhou, Zhanhui, Jie Liu, Jing Shao, Xiangyu Yue, Chao Yang, Wanli Ouyang,
and Yu Qiao. 2024. “Beyond One-Preference-Fits-All Alignment:
Multi-Objective Direct Preference Optimization.” arXiv Preprint.
https://doi.org/10.48550/arXiv.2310.03708.
Zhu, Huaisheng, Teng Xiao, and Vasant G. Honavar. 2024.
“3M-Diffusion: Latent Multi-Modal Diffusion for Language-Guided
Molecular Structure Generation.” arXiv Preprint, October.
https://doi.org/10.48550/arXiv.2403.07179.
Zhu, Kaijie, Jindong Wang, Jiaheng Zhou, Zichen Wang, Hao Chen, Yidong
Wang, Linyi Yang, et al. 2023. “PromptRobust: Towards Evaluating
the Robustness of Large Language Models on Adversarial Prompts.”
arXiv Preprint. https://doi.org/10.48550/arXiv.2306.04528.
Zou, Yunheng, Austin H. Cheng, Abdulrahman Aldossary, Jiaru Bai, Shi
Xuan Leong, Jorge Arturo Campos-Gonzalez-Angulo, Changhyeok Choi, et al.
2025. “El Agente: An Autonomous Agent for Quantum
Chemistry.” Matter 8 (7): 102263. https://doi.org/10.1016/j.matt.2025.102263.
Zunger, Alex. 2019. “Beware of plausible
predictions of fantasy materials.” Nature 566
(7745): 447–49. https://doi.org/10.1038/d41586-019-00676-y.