Towards Autonomous Mathematics Research

Original link: https://arxiv.org/abs/2602.10177

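## Aletheia: AI for Autonomous Mathematics Research

This paper introduces Aletheia, a new AI agent designed to go beyond solving competition-level math problems and actively *do* mathematical research. Aletheia is powered by an advanced version of Gemini Deep Think and a novel inference-time scaling law for complex reasoning. It generates, verifies, and revises solutions in natural language and uses a range of tools to navigate the mathematical literature.

The researchers demonstrate Aletheia's capabilities through several milestones: a research paper in arithmetic geometry generated entirely by AI (calculating eigenweights), a human-AI collaborative proof concerning systems of interacting particles, and, following an evaluation of 700 problems, autonomous solutions to four open questions in Bloom's Erdos Conjectures database.

To promote understanding and transparency, the authors propose standardized levels for quantifying the autonomy and novelty of AI-assisted research, along with "human-AI interaction cards". All prompts and model outputs are publicly available, marking an important step in the development of human-AI collaboration in mathematics.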

Hacker News discussion

Towards Autonomous Mathematics Research (arxiv.org)
20 points, posted by gmays 54 minutes ago | 2 comments

amiune, 24 minutes ago:
A perfect match for this test: https://arxiv.org/abs/2602.05192

measurablefunc, 30 minutes ago:
I still don't understand how scoring 96% on some benchmark means it is a super genius while the remaining 4% somehow stays out of reach. The people who keep comparing bots to humans should really think about how a person who scores 90% on some advanced math benchmark would still somehow miss the last 10%.

Original text

Towards Autonomous Mathematics Research, by Tony Feng and 27 other authors

Abstract: Recent advances in foundational models have yielded reasoning systems capable of achieving a gold-medal standard at the International Mathematical Olympiad. The transition from competition-level problem-solving to professional research, however, requires navigating vast literature and constructing long-horizon proofs. In this work, we introduce Aletheia, a math research agent that iteratively generates, verifies, and revises solutions end-to-end in natural language. Specifically, Aletheia is powered by an advanced version of Gemini Deep Think for challenging reasoning problems, a novel inference-time scaling law that extends beyond Olympiad-level problems, and intensive tool use to navigate the complexities of mathematical research. We demonstrate the capability of Aletheia from Olympiad problems to PhD-level exercises and most notably, through several distinct milestones in AI-assisted mathematics research: (a) a research paper (Feng26) generated by AI without any human intervention in calculating certain structure constants in arithmetic geometry called eigenweights; (b) a research paper (LeeSeo26) demonstrating human-AI collaboration in proving bounds on systems of interacting particles called independent sets; and (c) an extensive semi-autonomous evaluation (Feng et al., 2026a) of 700 open problems on Bloom's Erdos Conjectures database, including autonomous solutions to four open questions. In order to help the public better understand the developments pertaining to AI and mathematics, we suggest quantifying standard levels of autonomy and novelty of AI-assisted results, as well as propose a novel concept of human-AI interaction cards for transparency. We conclude with reflections on human-AI collaboration in mathematics and share all prompts as well as model outputs at this https URL.
From: Thang Luong
[v1] Tue, 10 Feb 2026 18:50:15 UTC (2,611 KB)
[v2] Thu, 12 Feb 2026 18:27:29 UTC (2,612 KB)