Sup AI achieves 52.15% accuracy on Humanity's Last Exam, 7+ percentage points ahead of every model in the ensemble (p<0.001).
If you need accurate answers, fewer hallucinations, or research-grade work that must be correct, Sup AI is your only option.
Disclaimer: These results are from an independent evaluation conducted by Sup AI (Dec 2025) and are not officially endorsed by the Center for AI Safety or Scale AI. Accuracy scores were calculated on a random sample of 1,369 questions from Humanity's Last Exam. All models, including competitors, were evaluated using enhanced settings (custom instructions and web search) to maximize performance. Comparisons reflect the model versions available at the time of testing, including "Preview" builds, which are subject to change.
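
For readers who want a sense of the sampling uncertainty behind a score measured on 1,369 questions, here is a minimal sketch of a 95% interval using the normal approximation. The count of 714 correct answers is an assumption inferred from 52.15% of 1,369; the raw count is not stated above. The function name `accuracy_interval` is hypothetical, and this does not reproduce the reported p-value, since the statistical test behind it is not specified.

```python
import math


def accuracy_interval(correct: int, total: int, z: float = 1.96) -> tuple[float, float, float]:
    """Return (accuracy, lower, upper) using the normal approximation to the binomial."""
    p = correct / total
    # Standard error of a sample proportion
    se = math.sqrt(p * (1 - p) / total)
    return p, p - z * se, p + z * se


# Assumption: 714 correct answers, inferred from 52.15% of 1,369 questions.
acc, lo, hi = accuracy_interval(714, 1369)
print(f"accuracy={acc:.2%}, 95% interval=({lo:.2%}, {hi:.2%})")
```

The interval covers only random sampling error on this one score; it says nothing about differences between models, which would need a paired comparison on the same questions.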