Show HN: Magnitude – open-source, AI-native test framework for web apps

Original link: https://github.com/magnitudedev/magnitude

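Magnitude is an end-to-end testing framework driven by visual AI agents that adapts to UI changes. Test cases are written in natural language. A strong reasoning agent plans and adjusts tests, while a fast visual agent executes them reliably, either locally or in CI/CD pipelines. To get started, install `magnitude-test`, set up the configuration file, and define tests in natural language using the `.step`, `.data`, and `.check` functions. Magnitude requires two LLM clients: a strong multi-modal LLM (for example Gemini) for planning, and Moondream for visual execution. You need API keys for both. Once configured, run the tests with the provided command. Magnitude's architecture separates the planning and execution models, which gives it speed, reliability, and cost advantages over general-purpose tools. For support, contact the founders at [email protected] or join the Discord community.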

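Magnitude is a new open-source, AI-native end-to-end testing framework for web apps. Its creators, Anders and Tom, aim to replace traditional web testing with visual LLM agents and to address the slowness, high cost, and inconsistency of existing browser agents. Magnitude uses a vision-based approach (Moondream) to avoid error-prone "set-of-marks" systems, and it uses two agents: one to plan and adjust tests, the other to execute them quickly. The "planner" agent produces a natural-language "plan" of web actions, which the "executor" then runs. Plans can be cached and re-run efficiently, and the planner re-adjusts the test when needed. Users can choose the planner LLM; Moondream is currently the only supported executor, though the team is considering expanding that list. The team welcomes contributions that add accessibility testing, data extraction, an LLM-free executor mode, and integrations with other testing tools.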

Original text

End-to-end testing framework powered by visual AI agents that see your interface and adapt to any changes in it.

  • ✍️ Build test cases easily with natural language
  • 🧠 Strong reasoning agent to plan and adjust tests
  • 👁️ Fast visual agent to reliably execute runs
  • 📄 Plan is saved to execute runs the same way
  • 🛠 Reasoning agent steps in if there is a problem
  • 🏃‍♂️ Run tests locally or in CI/CD pipelines
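
The plan-then-execute behavior in the list above could be sketched roughly as follows. This is a minimal synchronous illustration, not Magnitude's actual implementation: `Planner`, `Executor`, `runStep`, and the idea of a plan cache keyed by step text are all hypothetical names and design choices introduced here.

```typescript
// Hypothetical types: none of these names come from Magnitude's codebase.
type Action = { description: string };
type Plan = Action[];

interface Planner {
  // Slow, strong model: turns a natural-language step into concrete actions.
  plan(step: string): Plan;
}

interface Executor {
  // Fast vision model: returns false if the action could not be carried out.
  execute(action: Action): boolean;
}

function runStep(
  step: string,
  cache: Map<string, Plan>,
  planner: Planner,
  executor: Executor,
): { replanned: boolean; ok: boolean } {
  // Fast path: replay the saved plan so the run behaves the same way as before.
  const cached = cache.get(step);
  if (cached !== undefined && cached.every((a) => executor.execute(a))) {
    return { replanned: false, ok: true };
  }
  // No saved plan, or it failed (for example the UI changed):
  // the reasoning agent steps in and produces a fresh plan.
  const fresh = planner.plan(step);
  const ok = fresh.every((a) => executor.execute(a));
  if (ok) cache.set(step, fresh); // Save only plans that actually worked.
  return { replanned: true, ok };
}
```

The point of the split is that the expensive planner is only consulted when the cached plan is missing or stops working; every other run goes through the cheaper executor alone.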

Video showing Magnitude tests running in a terminal and agent taking actions in the browser

↕️ Magnitude test case in action! ↕️

test('can add and complete todos', { url: 'https://magnitodo.com' })
    .step('create 3 todos')
        .data('Take out the trash, Buy groceries, Build more test cases with Magnitude')
        .check('should see all 3 todos')
    .step('mark each todo complete')
        .check('says 0 items left')

1. Install our test runner in the node project you want to test (or see our demo repo if you don't have a project to try it on)

npm install --save-dev magnitude-test

2. Set up Magnitude in your project by running:

This will create a basic tests directory tests/magnitude with:

  • magnitude.config.ts: Magnitude test configuration file
  • example.mag.ts: An example test file

Magnitude requires setting up two LLM clients:

  1. A strong general multi-modal LLM (the "planner")
  2. A fast vision LLM with pixel-precision (the "executor")

For the planner, you can use any multi-modal LLM, but we recommend Gemini 2.5 Pro. You can use Gemini via Google AI Studio or Vertex AI. If you don't have either set up, you can create an API key in Google AI Studio (requires billing) and export it as GOOGLE_API_KEY.

If no GOOGLE_API_KEY is found, Magnitude will fall back to other common providers (ANTHROPIC_API_KEY / OPENAI_API_KEY).
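
As an illustration of that fallback, here is a minimal sketch of environment-based provider selection. It is not Magnitude's code: `pickPlannerProvider` and the provider labels are hypothetical, and the relative order of Anthropic and OpenAI is an assumption, since the text only says that both are fallbacks after GOOGLE_API_KEY.

```typescript
// Hypothetical provider labels, used only for this sketch.
type PlannerProvider = 'google-ai' | 'anthropic' | 'openai';

// GOOGLE_API_KEY is checked first, as described in the text.
// The order of the two fallbacks is an assumption.
const PROVIDER_KEYS: Array<[string, PlannerProvider]> = [
  ['GOOGLE_API_KEY', 'google-ai'],
  ['ANTHROPIC_API_KEY', 'anthropic'],
  ['OPENAI_API_KEY', 'openai'],
];

function pickPlannerProvider(
  env: Record<string, string | undefined>,
): PlannerProvider | null {
  for (const [key, provider] of PROVIDER_KEYS) {
    const value = env[key];
    // Treat unset and blank variables the same way.
    if (value !== undefined && value.trim() !== '') return provider;
  }
  return null; // No usable credentials found.
}
```

In a Node project this could be called as `pickPlannerProvider(process.env)`; a `null` result would mean no planner credentials are configured.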

To explicitly select a specific provider and model, see configuration docs. Currently we support Google AI Studio, Google Vertex AI, Anthropic, AWS Bedrock, OpenAI, and OpenAI-compatible providers.

Currently for the executor model, we only support Moondream, which is a fast vision model that Magnitude uses for precise UI interactions.

To configure Moondream, sign up and create an API key with Moondream, then add it to your environment as MOONDREAM_API_KEY. This will use the cloud version, which includes 5,000 free requests per day (roughly a few hundred test cases in Magnitude). Moondream is fully open source and self-hostable as well.
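
To see how 5,000 requests translate into "a few hundred test cases", a rough back-of-the-envelope calculation helps. The per-test request counts below (10 and 25) are illustrative assumptions, not measurements from Magnitude; only the 5,000 figure comes from the text.

```typescript
// Estimate how many test cases fit in a daily request quota.
function testCasesPerDay(dailyRequests: number, requestsPerTest: number): number {
  if (requestsPerTest <= 0) {
    throw new RangeError('requestsPerTest must be positive');
  }
  return Math.floor(dailyRequests / requestsPerTest);
}

const FREE_DAILY_REQUESTS = 5000; // Free-tier figure quoted in the text.

// Assumed executor requests per test case (illustrative values only).
const lightSuite = testCasesPerDay(FREE_DAILY_REQUESTS, 10); // 500
const heavySuite = testCasesPerDay(FREE_DAILY_REQUESTS, 25); // 200
```

Under those assumptions the free tier covers somewhere between 200 and 500 test cases per day, which is consistent with the estimate above.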

🚀 Once you've got your LLMs set up, you're ready to run tests!

Run your Magnitude tests with:

This will run all Magnitude test files discovered with the *.mag.ts pattern. If the agent finds a problem with your app, it will tell you what happened and describe the bug!
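
For a sense of what the *.mag.ts pattern selects, the sketch below filters a list of paths by that suffix. It only illustrates the naming convention; `filterMagnitudeTests` is a hypothetical helper and says nothing about how the real runner walks directories.

```typescript
// Matches file names of the form <something>.mag.ts.
// The character class requires at least one non-separator character
// before ".mag.ts", so a bare ".mag.ts" file is not treated as a test.
const MAG_TEST_FILE = /[^/\\]\.mag\.ts$/;

function filterMagnitudeTests(paths: string[]): string[] {
  return paths.filter((p) => MAG_TEST_FILE.test(p));
}

const discovered = filterMagnitudeTests([
  'tests/magnitude/example.mag.ts',
  'tests/magnitude/magnitude.config.ts',
  'src/login.test.ts',
  'tests/magnitude/checkout.mag.ts',
]);
// discovered contains only example.mag.ts and checkout.mag.ts
```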

To run many tests in parallel, add -w <workers>
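
Conceptually, a worker count caps how many tests are in flight at once. The sketch below shows one common way to do that with a fixed-size pool of async workers pulling from a shared queue. It is an assumption about the general technique, not a description of what -w does internally, and `runWithWorkers` is a hypothetical name.

```typescript
// Run async tasks with at most `workers` of them in flight at a time.
// Results are returned in the same order as the input tasks.
async function runWithWorkers<T>(
  tasks: Array<() => Promise<T>>,
  workers: number,
): Promise<T[]> {
  if (workers < 1 || Math.floor(workers) !== workers) {
    throw new RangeError('workers must be a positive integer');
  }
  const results: T[] = new Array<T>(tasks.length);
  let next = 0; // Index of the next task nobody has claimed yet.

  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const index = next++; // Claim a task; safe because JS is single-threaded.
      const task = tasks[index];
      if (task) results[index] = await task();
    }
  }

  const pool: Promise<void>[] = [];
  for (let i = 0; i < Math.min(workers, tasks.length); i++) {
    pool.push(worker());
  }
  await Promise.all(pool);
  return results;
}
```

Each worker picks up the next unclaimed test as soon as it finishes its current one, so a slow test does not hold up the rest of the queue.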

Now that you've got Magnitude set up, you can create real test cases for your app. Here's an example for a general idea:

import { test } from 'magnitude-test';

test('can log in and create company')
    .step('Log in to the app')
        .data({ username: '[email protected]', password: 'test' }) // any key/values
        .check('Can see dashboard') // natural language assertion
    .step('Create a new company')
        .data('Make up the first 2 values and use defaults for the rest')
        .check('Company added successfully');

Steps, checks, and data are all natural language. Think of it like you're describing how to test a particular flow to a co-worker - what steps they need to take, what they should check for, and what test data to use.
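
To make the structure of such a test case concrete, here is a minimal sketch of a chainable builder that records steps, data, and checks. It is not Magnitude's implementation: `TestCaseSketch` and `StepSpec` are hypothetical, and the assumption that `.data` and `.check` attach to the most recent `.step` is inferred from the indentation of the examples above.

```typescript
// Hypothetical data model for one natural-language step.
interface StepSpec {
  description: string;
  data?: string | Record<string, string>;
  checks: string[];
}

class TestCaseSketch {
  readonly steps: StepSpec[] = [];

  constructor(readonly title: string) {}

  // Start a new step; later .data() and .check() calls attach to it.
  step(description: string): this {
    this.steps.push({ description, checks: [] });
    return this;
  }

  // Attach test data (free text or key/value pairs) to the current step.
  data(value: string | Record<string, string>): this {
    this.current().data = value;
    return this;
  }

  // Attach a natural-language assertion to the current step.
  check(assertion: string): this {
    this.current().checks.push(assertion);
    return this;
  }

  private current(): StepSpec {
    const last = this.steps[this.steps.length - 1];
    if (!last) throw new Error('call .step() before .data() or .check()');
    return last;
  }
}

const loginFlow = new TestCaseSketch('can log in and create company')
  .step('Log in to the app')
    .data({ username: 'user', password: 'test' }) // placeholder credentials
    .check('Can see dashboard')
  .step('Create a new company')
    .check('Company added successfully');
```

Returning `this` from every method is what allows the calls to be chained, and the resulting `steps` array is the kind of plain description a planner could then turn into browser actions.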

For more information on how to build test cases see our docs.

You can run Magnitude tests in CI anywhere that you could run Playwright tests; just include the LLM client credentials. For instructions on running test cases on GitHub Actions, see here.

Why not OpenAI Operator / Claude Computer Use?

We use separate planning / execution models in order to plan effective tests while executing them quickly and reliably. OpenAI or Anthropic's Computer Use APIs are better suited to general purpose desktop/web tasks but lack the speed, reliability, and cost-effectiveness for running test cases. Magnitude's agent is designed from the ground up to plan and execute test cases, and provides a native test runner purpose-built for designing and running these tests.

To get a personalized demo or see how Magnitude can help your company, feel free to reach out to us at [email protected]

You can also join our Discord community for help or any suggestions!
