ATLAS:大规模自动形式化教材库
ATLAS: Autoformalized Textbook Library At Scale

原始链接: https://github.com/facebookresearch/atlas-lean

ATLAS 是一个使用 Lean 4 语言编写的大规模自动形式化数学库。该项目由“AutoformBot”流水线生成,涵盖了代数、分析、几何和理论计算机科学等多个学科,翻译了 26 本本科及研究生水平教科书中的定义、陈述和证明。 作为一项持续的研究工作,ATLAS 提供了一个可重用的形式化构建块存储库,旨在加速人类和机器驱动的形式化进程。目前的库包含超过 63 万行代码,拥有 46,203 个声明,自动证明成功率达 92.7%。 用户可以通过在线可视化工具访问该库,进行浏览、对比非形式化教科书陈述与 Lean 对应项,并检查逻辑依赖关系。每个书籍目录都包含源文件、目标陈述以及针对准确性和代码质量的自动评估指标。由 Ahmad Rammal 等人领导的 ATLAS 团队正持续扩展该语料库,提升可维护性,并使代码与标准的 Mathlib 规范保持一致。他们欢迎外部贡献,以帮助扩展和完善这一日益增长的形式化数学知识集合。

对不起。
相关文章

原文

ATLAS Logo

ATLAS is a Lean 4 library of textbook mathematics autoformalized by LLMs: informal statements and proofs translated into Lean code. It draws from undergraduate and graduate textbooks across analysis, algebra, geometry, topology, combinatorics, probability, statistics, PDEs, number theory, and theoretical computer science.

The goal of ATLAS is to provide reusable formal building blocks for future human- and machine-driven Lean formalization. This is an active effort: we are continuing to scale to more sources, curate the generated material, improve coverage and maintainability, and move the library closer to Mathlib conventions.

ATLAS was generated with AutoformBot, our autoformalization pipeline.

Each book directory under Atlas/ contains:

  • Lean source files for the generated definitions, statements, and proofs.
  • targets.yaml, listing the textbook statements selected for formalization.
  • report.json, containing automated evaluation results for matched Lean declarations, including faithfulness, proof-integrity, and code-quality scores.

A visualizer is provided at https://rammalahmad.github.io/atlas/. It allows users to browse ATLAS, compare informal statements with their Lean formalizations, inspect logical dependency graphs between results, and extract the Lean code needed to state a selected theorem.

ATLAS visualizer

ATLAS is an ongoing, machine-generated extension effort rather than a finished product. We are actively working on scaling the corpus, curating the generated code, formalizing remaining statements, and improving idiomatic Mathlib reuse. External contributions are welcome!

To build the full library with the pinned Lean and Mathlib versions, run:

26 books · 630,999 lines of code · 483,917 lines of Lean code (excl. comments/blanks) · 46,203 declarations · 42,837 proved (92.7%) · 2,855 / 4,007 statements formalized (71.3%) · 183,157M tokens

Book Target Statements Formalized % Formalized Lines of Code Lines of Lean Declarations Proved % Proved Tokens (M)
AlgebraNotes 176 151 85.8% 5,037 4,409 274 261 95.3% 1,962.99
AlgebraicCombinatorics 39 37 94.9% 10,695 9,343 737 734 99.6% 1,440.73
AlgebraicGeometryI 186 112 60.2% 40,678 27,393 4,499 4,210 93.6% 7,629.26
AlgebraicTopologyI 171 110 64.3% 29,154 20,142 2,416 2,063 85.4% 10,323.27
AnAlgorithmistsToolkit 158 131 82.9% 9,656 8,234 712 668 93.8% 2,004.00
ArithmeticGeometry 335 266 79.4% 39,257 29,573 3,047 2,861 93.9% 11,100.62
BooleanFunctions 108 44 40.7% 9,516 7,949 667 614 92.1% 2,327.49
Buildings 74 44 59.5% 64,383 48,809 4,345 4,247 97.7% 20,442.93
CombinatorialOptimization 36 22 61.1% 8,908 7,934 428 414 96.7% 2,475.65
ComplexVariables 38 37 97.4% 7,231 6,225 285 280 98.2% 1,250.91
DifferentialAnalysis 113 88 77.9% 31,302 23,713 1,634 1,506 92.2% 11,743.27
DifferentialGeometry 147 112 76.2% 10,592 8,942 888 781 88.0% 1,933.97
EllipticCurves 360 212 58.9% 32,819 22,316 3,483 2,981 85.6% 11,058.00
FourierAnalysis 38 34 89.5% 7,943 6,671 373 359 96.2% 1,185.90
GeometryOfManifolds 72 40 55.6% 22,686 16,408 3,251 3,098 95.3% 6,864.93
HighDimensionalStatistics 73 65 89.0% 39,656 31,715 1,564 1,518 97.1% 975.36
IntroductionToFunctionalAnalysis 72 68 94.4% 2,709 2,006 113 109 96.5% 553.64
IntroductionToPartialDifferentialEquations 105 86 81.9% 27,666 20,740 1,585 1,414 89.2% 2,972.23
LieGroups 185 74 40.0% 60,285 50,594 4,219 3,814 90.4% 45,384.33
NumberTheoryI 576 460 79.9% 64,958 54,760 3,764 3,591 95.4% 15,424.36
ProbabilisticMethodsInCombinatorics 210 109 51.9% 20,555 15,604 1,272 1,089 85.6% 2,720.15
ProjectionTheory 111 73 65.8% 13,357 9,672 979 871 89.0% 2,678.00
RealAnalysis 177 175 98.9% 2,886 2,224 149 147 98.7% 585.64
TensorCategories 229 137 59.8% 42,812 29,729 3,373 3,176 94.2% 11,338.45
TheoryOfComputation 118 84 71.2% 15,094 10,581 1,553 1,482 95.4% 3,580.36
TheoryOfProbability 100 84 84.0% 11,164 8,231 593 549 92.6% 3,200.61
Total 4,007 2,855 71.3% 630,999 483,917 46,203 42,837 92.7% 183,157

The initial ATLAS effort was led by Ahmad Rammal, Niket Patel, Fabian Gloeckle, Amaury Hayat, Julia Kempe, Remi Munos, Charles Arnal, and Vivien Cabannes.

If you find this work useful, please cite our paper:

@misc{rammal2026formalizingmathematicsscale,
      title={Formalizing Mathematics at Scale}, 
      author={Ahmad Rammal and Niket Patel and Fabian Gloeckle and Amaury Hayat and Julia Kempe and Remi Munos and Charles Arnal and Vivien Cabannes},
      year={2026},
      eprint={2605.29955},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2605.29955}, 
}
联系我们 contact @ memedata.com