Cardille Computational Landscape Ecology Lab
  • Home
  • Research
    • Remote Sensing & Change Detection
    • Geo-AI
    • Aquatic
    • Landscape Ecology
    • Books
  • Team
    • Current lab members
    • Past lab members
    • Invitation To Students
    • Funding
  • Courses
  • Publications
  • Service
  • Contact

AI-Accelerated Scientific Discovery

An AI system to help scientists write expert-level empirical software

Figure 1. Schematic and performance of our method. a, Schematic of the algorithm. b, Performance of code generation methods on the Kaggle Playground benchmark. c, Mechanisms used to create initial research ideas for solving a scientific problem.
Supplementary Figure 14. Example output segmenting DLRSD image pixels from our method Solution 1 (U-Net++).
Supplementary Figure 22. Schematic of the algorithm, which consists of a code mutation system whose prompt is augmented with research ideas. Research ideas can be sourced from the primary literature or from a search algorithm.
Background

The cycle of scientific discovery is frequently stalled by the slow, manual creation of software to support computational experiments. In nearly every field, researchers rely on "empirical software": code designed to maximize a specific quality score, such as how well a model fits observed data. Historically, building this software has required years of specialized labor and has often depended on human intuition rather than a systematic search for the best approach. This slow development process severely limits the range of complex hypotheses that can be practically explored. To address this bottleneck, we present an AI system that creates expert-level scientific software whose goal is to maximize a quality metric.

Approach

We developed an AI system that combines a Large Language Model (LLM) with a Tree Search algorithm to systematically navigate the massive space of potential code solutions. To refine our methodology, we used a benchmark of 16 data science competitions on the Kaggle platform, which allowed us to calibrate the AI’s performance against thousands of human participants in a fast-paced environment. The system works by rewriting candidate code and evaluating each version against a specific quality metric in a secure "sandbox", so that the search can build on the versions that score best. To reach expert-level results, we provide the AI with complex research ideas from scientific papers and textbooks, which it then recombines into novel, high-performing software.
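The rewrite-and-score loop described above can be illustrated with a minimal sketch. It is not the published implementation: the UCB1-style selection rule, the `Node` class, and the toy `toy_rewrite` and `toy_evaluate` functions are all assumptions made for illustration. In the real system the rewrite step would be an LLM call and the evaluation would run the candidate program in a sandbox.

```python
import math
import random
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass(eq=False)
class Node:
    """One candidate program in the search tree (hypothetical structure)."""
    code: str
    score: float
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)
    visits: int = 1


def select(nodes: List[Node], c: float) -> Node:
    """Pick the node to expand with a UCB1-style rule (an assumed choice)."""
    total = sum(n.visits for n in nodes)
    return max(
        nodes,
        key=lambda n: n.score + c * math.sqrt(math.log(total) / n.visits),
    )


def tree_search(
    root_code: str,
    rewrite: Callable[[str], str],     # stand-in for the LLM rewrite step
    evaluate: Callable[[str], float],  # stand-in for sandboxed scoring
    budget: int,
    c: float = 1.0,
) -> Tuple[Node, List[Node]]:
    """Expand `budget` candidates and return the best node plus all nodes."""
    root = Node(root_code, evaluate(root_code))
    nodes = [root]
    for _ in range(budget):
        parent = select(nodes, c)
        child_code = rewrite(parent.code)
        child = Node(child_code, evaluate(child_code), parent)
        parent.children.append(child)
        nodes.append(child)
        # Record the visit on the expanded node and all of its ancestors.
        node: Optional[Node] = parent
        while node is not None:
            node.visits += 1
            node = node.parent
    return max(nodes, key=lambda n: n.score), nodes


def toy_rewrite(code: str, rng: random.Random) -> str:
    """Toy mutation: nudge the constant in a string such as 'x = 0.5'."""
    x = float(code.split("=")[1])
    return f"x = {x + rng.uniform(-1.0, 1.0)}"


def toy_evaluate(code: str) -> float:
    """Toy quality metric: higher is better, with the optimum at x = 2."""
    x = float(code.split("=")[1])
    return -((x - 2.0) ** 2)


if __name__ == "__main__":
    rng = random.Random(0)
    best, nodes = tree_search(
        "x = 0.0", lambda code: toy_rewrite(code, rng), toy_evaluate, budget=50
    )
    print(len(nodes), best.code, best.score)
```

The exploration constant `c` trades off revisiting high-scoring candidates against trying less-explored ones. A UCB-style rule normally assumes scores on a comparable scale, so a real metric would likely need normalizing first.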

Key Findings

Across several disciplines, our system identified high-quality code solutions that often surpass those of human experts, a level of performance we describe as "superhuman".
  • Genomics and Epidemiology: In single-cell data analysis, our system discovered 40 novel methods that outperformed the top human-developed techniques on an active global leaderboard. In public health, the AI generated 14 forecasting models that beat the official ensemble used by the Centers for Disease Control and Prevention (CDC) for predicting COVID-19 hospitalizations.
  • Diverse Scientific Utility: We successfully applied the system to segment satellite imagery, predict brain-wide neural activity in zebrafish, and solve difficult mathematical integrals where standard industry libraries typically fail.
  • Innovation Through Recombination: A key finding was the system's ability to recombine ideas from multiple research papers. For instance, it created superior hybrid strategies by fusing modeling paradigms that human researchers had not previously combined.
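
As a rough illustration of the recombination step, the sketch below builds one prompt per pair of research ideas, which a code-rewriting model could then be asked to implement. The function name `recombination_prompts`, the prompt wording, and the placeholder idea names are all hypothetical; the source does not describe the actual prompt format.

```python
from itertools import combinations
from typing import Dict, List


def recombination_prompts(task: str, ideas: Dict[str, str], k: int = 2) -> List[str]:
    """Build one prompt for every k-sized combination of idea summaries.

    `ideas` maps a short idea name to a one-line summary. Both the names
    and the prompt template are illustrative assumptions.
    """
    prompts = []
    for names in combinations(sorted(ideas), k):
        bullets = "\n".join(f"- {name}: {ideas[name]}" for name in names)
        prompts.append(
            f"Task: {task}\n"
            f"Combine the following ideas into one solution:\n"
            f"{bullets}\n"
            f"Return complete, runnable code."
        )
    return prompts


if __name__ == "__main__":
    # Placeholder ideas; in practice these would be summaries of papers.
    example_ideas = {
        "idea_a": "summary of the first method",
        "idea_b": "summary of the second method",
        "idea_c": "summary of the third method",
    }
    for prompt in recombination_prompts("maximize the quality metric", example_ideas):
        print(prompt)
        print("---")
```

Enumerating every pair is only practical for a small pool of ideas, since the number of combinations grows quickly with pool size.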

Impact

This research represents a major step toward accelerating scientific progress by automating the most tedious aspects of research software creation. By reducing the time required to test new research ideas from months to just hours or days, we enable a significantly faster cycle of discovery. We believe this capability will transform fields where solutions can be numerically scored, such as drug discovery, climate modeling, and environmental monitoring. Ultimately, this technology puts scientific advancement on the cusp of a substantial acceleration.

Resources

Published Paper: Aygün E, Belyaeva A, Comanici G, Coram M, Cui H, Garrison J, Johnston R, Kast A, McLean CY, Norgaard P, Shamsi Z, Smalling D, Thompson J, Venugopalan S, Williams BP, He C, Martinson S, Plomecka M, Wei L, Zhou Y, Zhu Q-Z, Abraham M, Brand E, Bulanova A, Cardille JA, Co C, Ellsworth S, Joseph G, Kane M, Krueger R, Kartiwa J, Liebling D, Lueckmann J-M, Raccuglia P, Wang X(J), Chou K, Manyika J, Matias Y, Platt JC, Dorfman L, Mourad S, Brenner MP. An AI system to help scientists write expert-level empirical software. arXiv:2509.06503 [cs.AI]. 2025. DOI: https://doi.org/10.48550/arXiv.2509.06503.

Source Code Repository: github.com/google-research/s

Data Repositories:
  • Kaggle Playground Series
  • OpenProblems Single-Cell Batch Integration Benchmark
  • CDC COVID-19 Forecast Hub
  • GIFT-Eval Time Series Benchmark
