Tuesday, April 7, 2026

Could A.I. be good for scientists but bad for science?

There has been recent attention to using LLMs to generate novel (and often correct) mathematical proofs from plain-English prompts.

A recent Amazon blog post by Michael Kearns and Aaron Roth recounts how they have collaborated with an LLM to produce increasingly sophisticated proofs of new results. They anticipate that this development will only continue to grow in usefulness. At the same time, they worry about its impact both on the training of new mathematical scientists and on the peer review process (as the cost of writing polished and correct papers falls faster than the cost of evaluating them for importance). Perhaps unsurprisingly, one of the first fields to feel the strain of this imbalance has been theoretical research into machine learning models.

How AI is changing the nature of mathematical research  
What machine learning theorists learned using AI agents to generate proofs — and what comes next.



“Specifically, how can intuition and ‘good taste’ in scientific research be developed when AI automates many of the steps that have historically been used to train young researchers? Peer review is another challenge: AI-generated research papers, quickly churned out at scale, highlight the limitations of peer review and modern-day publishing structures, and exacerbate already-emerging challenges to the incentives for scientific success. Without claiming to have answers or solutions to these concerns, we are personally living through them and will discuss each in turn.

“Historically, people earn expertise in the mathematical sciences through struggle as junior researchers. PhD students spend years working through the details of technical arguments to gain hard-won intuitions about when a proof approach is promising, when they are being led astray by a problem, or what constitutes a novel and interesting research direction.

“But these aspects of being a researcher are exactly what AI tools are “giving away”. If doctoral students can simply ask AI for proofs — which is extremely tempting, especially when it is in service of advancing research — how do they develop the experience and skill that, for now at least, are required to use AI tools productively in the first place?

“Breaking and remaking peer review
“From our perspective, peer review is not only, or even primarily, a process to verify the correctness and quality of research. Rather, its purpose is to focus a scarce resource — the attention of the research community — in the right places. Science progresses as researchers build on each other’s work, but there is already too much work out there for anyone to keep up with. The publication process should help identify the most interesting and promising directions, so they can be more efficiently and thoroughly developed.

“AI tools make it much easier to produce work that looks polished and correct, dramatically lowering the barrier to generating “papers” that can be submitted to journals and conferences. Many of these papers are neither interesting nor actually correct — but discovering this requires significant effort from reviewers.

“This is straining an already overburdened machine learning publishing ecosystem, one struggling with tens of thousands of submissions per venue. We have seen that reducing the time and effort needed to produce "a paper" — not necessarily a good paper — is beginning to destabilize our existing institutions for peer review. The most recent iterations of AI and ML conferences have seen submissions grow by large multiples, with a significant number of AI-polished but ultimately low-quality papers making it surprisingly far through the review process before being noticed and called out.
“This is a problem across research fields, partly because it is creating a market for AI-generated papers. This has in turn engendered a countermarket for AI-assisted detection of AI-generated papers — much like the familiar technological arms races around things like spam and its detection, but with the integrity of scientific publication at stake, not just the filtering of annoying or fraudulent e-mail.


“Without a serious, community-wide re-evaluation of peer review, AI threatens to arrest scientific progress at the community level even as it accelerates it at the level of individual researchers.”
 
