Models at the speed of thought: How AI coding is reshaping theoretical neuroscience – The Transmitter

Agentic coding makes it possible to specify a neuroscience model in hours instead of months. Seven neuroscientists weigh in on what that tectonic change may bring to the field.
Almost two decades ago, Larry Abbott laid out principles and foundations for the coalescing field of theoretical neuroscience in a perspective in Neuron titled “Theoretical neuroscience rising.” One idea that stuck with me was that rigorous models—expressed in equations rather than words—can be formulated, explored and often rejected at a pace that no experimental program can match. Equations, Abbott argued, “force a model to be precise, complete, and self-consistent,” and this precision, combined with speed, acts as an intellectual filter, winnowing possibilities before expensive experiments begin. Abbott painted a picture in which modelers rapidly cycle through phases of model exploration and rejection. 
In practice, this doesn’t quite happen. Translating a word model into equations, translating equations into code and integrating a model with data all represent engineering work that is invisible to the scientific narrative. These costs create a bottleneck that shapes what models are explored, adopted and published, ultimately throttling the pace of theoretical exploration, which Abbott identified as theory’s core advantage.
Agentic coding frameworks—systems that can write, debug and integrate code through natural language interaction—are changing that story, fast. Not by replacing the human work that goes into conceiving of models and weighing their merits but by eliminating the engineering scaffolding that has, until now, limited what theories we fully explore, which models get tested and how fast we can reject them. A theoretical neuroscientist can now specify a model in conversation—sketch out the assumptions, describe the data, outline the inference procedure—and have working code in days instead of months. This is exactly what the field should want. 
How is this reshaping the landscape of theoretical and computational neuroscience? I outline below four main directions along which I think tectonic change is coming. To get a sense of how others in the field view this change, I also asked seven neuroscientists to weigh in. 
Second, theorists can finally accelerate their own exploratory work at the pace Abbott envisioned. Abbott argued that theorists could “formulate, explore and often reject models” faster than experimentalists could test them. But that advantage was always throttled by the time required to translate word models into equations and equations into code. In the former era, ideas that might bud remained kernels jotted in notebooks, simply because the cost of implementation was not worth the payoff, assuming most ideas were dead ends. Alternatively, this friction encouraged researchers to build simple models that could be analyzed and coded quickly. They were less likely to develop and explore complex models, which require more engineering overhead. Sometimes this simplicity offered deeper insight, and sometimes it failed to capture critical biological realism. In an agentic-framework era, a theorist can sketch a bold new theoretical idea—a modified recurrent neural network, a novel learning rule, a different population coding scheme—and quickly have it implemented, tested on synthetic data and refined. The theoretical frontier expands significantly, enabling exploration of more models, and more complex models, from seed to fruit. Many ideas will fail, which is a good thing. Abbott’s vision—rapid exploration and rejection of models—finally becomes operational. 
Third, and most disorienting, is what agentic systems can discover on their own. Evolutionary algorithms guided by large language models (LLMs), such as Google’s AlphaEvolve, can explore model space in ways human intuition might not naturally traverse, and at a scale impossible for even the most caffeinated graduate student. We might not think to try, or have the infinite bandwidth to exhaustively attempt, particular combinations of nonlinearities, or a specific coupling between neural populations, even if the precursors to those combinations were sitting on the shelf waiting to be combined. Through sheer brute force, AI systems can explore these combinations, creating millions of potential models that include long-established ideas and unexplored new directions.
History offers insight into the potential pitfalls of rapid, technology-driven scientific discovery: The arrival of the spectroscope, for example, introduced a wave of new hypothetical chemical elements in the 1860s that included both real discoveries, such as helium, and phantoms, such as coronium—a hypothetical element rooted in a real observation, which took decades of theoretical work in physics and chemistry to dissolve. For AI-guided search, some discoveries will be scientifically meaningful; others will be artifacts of the search process. We’ll have to develop new skills for understanding models we didn’t directly build and continue to lean on and develop models based on first principles to separate the heliums from the coroniums. 
Fourth, agentic coding frameworks enable more mathematically sophisticated models. Some of the most acclaimed and mathematically rigorous models in computational neuroscience, such as mean-field theories from statistical physics, employ advanced techniques that require years of focused training to wield. As we move into a “vibe proving” era, the outlook for equation-based, AI-assisted theoretical work is changing. A system that can manipulate symbolic mathematics and executable code simultaneously opens new research directions in mathematically sophisticated models by supporting researchers who are very comfortable at the computer terminal but less comfortable at the blackboard. 
It’s worth acknowledging that this acceleration and expansion of theoretical possibilities comes with a real risk, particularly for trainees. For theorists, the struggle to translate an idea into equations and then into code isn’t merely friction; it’s a form of disciplined thinking. When a theorist builds a model piece by piece, they develop an intimacy with the model that forces understanding. The slow, painstaking construction is how the modeler comes to see gaps in their own reasoning, where vague intuitions confront reality. Automating this struggle away risks creating a world in which theorists never develop that intimate understanding of their own models. For trainees, this is especially dangerous. Advisers face a heavy responsibility to decide which struggles are worth preserving to protect the conditions under which learning occurs.
But there’s a deeper risk still. It’s precisely this struggle that often sparks genuine theoretical insight. Nights spent staring at a model behavior you can’t explain are when theorists make conceptual leaps that lead to genuine discovery. If we lose the struggle, we may lose not just understanding but inspiration itself. The field risks becoming prolific but shallow, generating models faster than we can generate insights. As Tim Requarth recently argued regarding AI-assisted writing, “if the struggle to articulate an idea is part of how you come to understand it, then tools that bypass that struggle might degrade … the kind of thinking that matters most for actual discovery.”
Agentic coding frameworks help us realize Abbott’s vision: the freedom to explore and reject ideas rapidly, the removal of technical barriers that reduce model adoption, and the ability for more researchers to explore complex and mathematically rigorous models. A North Star as we navigate this era is remembering that most ideas should fail, and, to develop and maintain this discretion, we may need to periodically step away from the AI and return to the time-tested resource of our own minds.  

AI tools have come along at a very fortuitous time for neuroscientists: the era of big data. Recordings from many thousands of neurons, long lists of cell types and huge connectome datasets provide more information than a human brain can take in, but not more than a human brain and an AI assistant can handle together. Major new discoveries are likely to come from humans generating insights, just as they have always done, but collaborating with an agent that can quickly provide them with intelligent access to data too vast for previous researchers, no matter how brilliant, to ponder.

I find one aspect of the new AI era particularly exciting: the possibility of deeper, agent-assisted research synthesis. This will still have to be guided by humans with real expertise in a field, both to avoid hallucinations and to recognize when an AI system is invoking concepts that do not actually exist. But when used critically, these tools can help us rediscover older papers, identify convergent findings across studies, increase the value of results that have been independently reproduced and generate new hypotheses by linking published works that would otherwise remain disconnected. They can also help us dismiss ideas that are inconsistent with the broader experimental work.
For theoretical neuroscience, this is really exciting. Theorists no longer have to rely only on a close experimental collaborator or hope that the dataset from their favorite paper has been made public, although thankfully that is becoming more common. Instead, we can begin to build and evaluate models against a much broader experimental landscape: not just a single dataset, but a searchable, extensive body of experimental evidence. In that sense, AI agents may help theoretical models become more tightly constrained by the full richness of neuroscience rather than by the small subset of experiments any one modeler happens to know well.

It’s worth distinguishing between two separate questions. How should we think about “vibe modeling” based on the current generation of LLMs versus future AI models? For the current generation, there is a real risk that isn’t acknowledged in the article: that an LLM will produce code that appears to test the idea of the modeler but in fact contains errors or hidden assumptions that aren’t visible in the output. This isn’t a theoretical risk; it’s something that seems to crop up very regularly in practice. This danger isn’t unique to LLMs. It happens every time we use the wrong statistical test because the software package makes it so easy. The risk is that vibe modeling will make this much more frequent, and it would be exacerbated by the suggestion in the article to get an LLM to iterate without human intervention. Some of this may be addressed by future improvements in LLMs or a switch to a different type of AI model, although this might not come as soon as we think.
Even if we’re optimistic in anticipating AI progress, the question remains whether vibe modeling is a good idea. I agree that we need to develop understanding as well as produce code, but I don’t think this is only an issue for trainees. I would put it differently: Our aim should not be the production of models. The fact that our community implicitly values the production of models over understanding is a byproduct of perverse career incentives to publish a lot of papers, and there’s a risk that vibe modeling will make this even worse. I suppose one positive outcome could be that if it becomes trivially easy to generate a model in support of any idea, it might force us to think a little harder about what we’re doing and why.
I don’t want to be only negative though: I do think there’s a possible place for AI in research. I would love to have the ability to describe a vague idea and have it instantly converted into a concrete implementation, as long as we recognize that this output is only the starting point for a process that should end in a human-comprehensible nugget of “understanding.” That nugget might be code or an equation that contains the core idea. We shouldn’t care if this nugget was the result of traditional modeling, AI or indeed the result of a feverish night after eating too many hallucinogenic mushrooms. If it can be understood and verified by a human, and it stands alone from the process that produced it, then it has a place in science.

Computational neuroscientists have often posed the question, “What would you do if you had all the data about the brain you wanted?” In a way, this is a very practical question: As our experimental methods grow in capacity, we need to have plans and pipelines in place to handle their growing outputs. But it also functions as a rhetorical device: In contemplating the answer, we may come to realize that data isn’t our only—or even our main—bottleneck after all. Neuroscience needs better theories, models and frameworks just as much as it needs more tools, equipment and datasets. 
But now, as DePasquale’s discussion of AI-assisted modeling lays out, we’re faced with a new question: “What would you do if you had all the models of the brain you wanted?” Removing the barrier to implementing and evaluating models should have a similar effect on the field as increasing data collection, which is to say that its benefits will depend on our ability to use all these outputs intelligently and at this new scale. One effect of this may be that the questions of science increasingly become questions of metascience. Not just “is this model useful?” but “how do we decide if models are useful?” And the underlying tensions that have always existed in our heavily interdisciplinary field (questions regarding the purpose of neuroscience and how to define success) will be on larger display when everyone has their own set of models to choose from. The way to buttress ourselves against this potentially overwhelming onslaught is to sharpen our view, becoming clearer about what our individual questions are and stricter with the objective means we will use to evaluate our answers. And, of course, we need to know why we are asking these questions in the first place.

Let me preface this by stating unequivocally that I am by no means a theoretical neuroscientist. Computational, maybe, but mostly I am a software and machine-learning engineer who likes to think about brains, behavior and biology more broadly. To that end, we have honed our craft of building scientific tools through advanced software engineering practices. Yet, as this piece articulates, the emergence of agentic engineering has radically transformed the way scientific software is built by virtually removing the need to code at all. 
It might seem reasonable to see this as an existential threat to our research program and the value we provide to the scientific community (and there have been some nights when I certainly have). In the same way a theorist might value the struggle of structuring equations into code, we value engineering with elegant design patterns that reflect structured thinking about algorithmic and system architectures. If my agent builds the functions, their docstrings and the unit tests for me, have we lost the art and deeper craft of scientific software engineering? 
Ultimately, however, I have found this new mode of operation indescribably liberating. All my lab members—even my administrative assistant—now have a Claude Max subscription. Leaning into it has brought about new forms of excitement by allowing us to think bigger about what’s possible. Transcending the shackles of deep technical debugging means that my advising meetings with trainees are less about troubleshooting and more about ideation on models and experiments. Turnaround times are faster, and the quality of the work is higher. Even if the AI makes a mistake, it now becomes an opportunity for my trainees to learn how to evaluate science more critically. Finally, for me personally, it has delivered much-needed soul-nourishing freedom to do intellectually stimulating work in between the Kafkaesque tedium of faculty responsibilities.

A sentence in this article stuck out to me: LLMs “can explore model space in ways human intuition doesn’t naturally traverse.” I wonder! Is the exploration strategy implemented by LLMs profoundly different, or is it just more patient? In some ways, LLM strategy is similar to ours: simply trying probable or plausible combinations of code, according to a loose grammar articulated by the statistics of the training data. Because these attempts are derived from human-authored text and code, LLMs can leverage much of our existing knowledge and communicate in terms of shared vocabulary and conventions. However, they likely inherit the same “theory-induced blindness” that we do, and an idea that seems really wild and out there to most humans is unlikely to occur to an LLM. Lacking boredom and exhaustion, they can keep up this search automatically and indefinitely, allowing a pretty thorough exploration of the space of not-too-crazy solutions. At present, I suspect they work best when you can point them in vaguely the right conceptual direction. Like a flashlight as opposed to a laser pointer, they can illuminate much in the vicinity of what you’re aiming at, but they won’t show you what is behind you.

AI is obviously going to have a massive impact on how we do science, and it’s hard to know what the long run looks like. LLMs are like genies: They can grant you almost any wish, but you have to be very careful about what you ask for. But in the short run, four things stand out.
An editorially independent publication supported by the Simons Foundation