Show HN: Index – New Open Source browser agent

Original link: https://github.com/lmnr-ai/index

Index is an open-source browser agent for autonomously executing web tasks. It offers an interactive command-line interface (CLI) with browser state persistence and real-time updates, and it can also be used through a serverless API (Laminar). To use Index, install the `lmnr-index` package, set your model API keys (Anthropic, Gemini, OpenAI) in your `.env` file, and optionally initialize Laminar with your project API key to trace agent behavior and record browser sessions. You run the agent with a prompt specifying the task, for example: "Navigate to news.ycombinator.com, find a post about AI, and summarize it." The browser can be customized via configuration, such as connecting to an existing Chrome DevTools Protocol endpoint or setting the viewport size. You can stream the agent's output or run it asynchronously. The documentation provides examples using the Anthropic and OpenAI providers.

Laminar AI has released Index, a new open-source browser agent with state-of-the-art performance (92% accuracy on WebVoyager using Claude 3.7). Key features include browser-agent observability via a patched Playwright, recorded browser sessions, traced agent steps and LLM calls, and everything synced in a debugging UI. It uses a simple JavaScript script combined with CV and OCR for element detection, driven by a `while` loop and carefully designed prompts. Index is a Python package with a CLI and supports Gemini 2.5 Pro and Flash. It can be used via a serverless API or a chat UI. Complex tasks such as research and UI interaction are good use cases, avoiding hard-coded scripts. The demo and code are on GitHub. The team is also working on MCP server integration to support agentic IDEs.
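The "`while` loop plus carefully designed prompts" control flow described above can be sketched as follows. This is purely illustrative, not Index's actual implementation: `Action`, `ScriptedLLM`, `observe`, and `execute` are hypothetical stand-ins for the real model and browser.

```python
# Illustrative sketch of an observe -> decide -> act agent loop.
# All names here are hypothetical, not Index's API.
from dataclasses import dataclass

@dataclass
class Action:
    name: str
    result: str = ""

class ScriptedLLM:
    """Stand-in for the real model: replays a fixed plan of actions."""
    def __init__(self, plan):
        self.plan = list(plan)

    def next_action(self, task, state, history):
        return self.plan.pop(0)

def run_agent(task, llm, observe, execute, max_steps=20):
    history = []
    for _ in range(max_steps):
        state = observe()  # e.g. screenshot + CV/OCR-detected elements
        action = llm.next_action(task, state, history)
        if action.name == "done":
            return action.result
        execute(action)    # e.g. click, type, scroll via the browser
        history.append(action)
    return None            # step budget exhausted
```

The loop terminates either when the model emits a "done" action or when the step budget runs out, which is the usual safeguard against an agent looping forever.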

Original text


Index is the SOTA open-source browser agent for autonomously executing complex tasks on the web.

prompt: go to ycombinator.com. summarize first 3 companies in the W25 batch and make new spreadsheet in google sheets.

(Demo video: local_agent_spreadsheet_demo.mp4)

Check out the full documentation here

The easiest way to use Index in production is via the serverless API. The Index API manages remote browser sessions, agent infrastructure, and browser observability. To get started, sign up and create a project API key. Read the docs to learn more.

from lmnr import Laminar, AsyncLaminarClient
import asyncio
# you can also set LMNR_PROJECT_API_KEY environment variable

# Initialize tracing
Laminar.initialize(project_api_key="your_api_key")

# Initialize the client
client = AsyncLaminarClient(project_api_key="your_api_key")

async def main():

    response = await client.agent.run(
        prompt="Navigate to news.ycombinator.com, find a post about AI, and summarize it"
    )

    print(response.result)
    
if __name__ == "__main__":
    asyncio.run(main())
pip install lmnr-index

# Install playwright
playwright install chromium

Set up your model API keys in a .env file in your project root:

ANTHROPIC_API_KEY=
GEMINI_API_KEY=
OPENAI_API_KEY=
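To show the expected file format, here is a minimal stdlib-only sketch of how KEY=VALUE lines from a .env file end up in the process environment. (Index presumably uses a dotenv-style loader internally; `load_env` is a hypothetical helper, not part of the package.)

```python
import os

def load_env(path=".env"):
    """Parse KEY=VALUE lines from a .env file into os.environ."""
    with open(path) as f:
        for raw in f:
            line = raw.strip()
            # Skip blank lines, comments, and malformed lines
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            if value.strip():  # keys left blank (e.g. OPENAI_API_KEY=) are skipped
                os.environ.setdefault(key.strip(), value.strip())
```

Note that only the keys you actually fill in need values; a blank line like `OPENAI_API_KEY=` is simply ignored.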

You can run Index via interactive CLI. It features:

  • Browser state persistence between sessions
  • Follow-up messages with support for "give human control" action
  • Real-time streaming updates
  • Beautiful terminal UI using Textual
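Browser state persistence between sessions can be pictured as saving and restoring session data (such as cookies) to disk between CLI runs. The sketch below uses a hypothetical JSON file format, not Index's actual storage scheme:

```python
# Illustrative sketch of session-state persistence between CLI runs.
# The file format and function names are hypothetical.
import json
import os

STATE_FILE = "browser_state.json"

def save_state(cookies, path=STATE_FILE):
    """Persist browser session data (here: cookies) to disk."""
    with open(path, "w") as f:
        json.dump({"cookies": cookies}, f)

def load_state(path=STATE_FILE):
    """Restore persisted session data, or start fresh if none exists."""
    if not os.path.exists(path):
        return {"cookies": []}
    with open(path) as f:
        return json.load(f)
```

On startup the CLI would call something like `load_state()` (hence the "Loaded existing browser state" line in the sample output below), and save again on exit.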

You can run the agent from the interactive CLI. Remember to set the API key for the selected model in the .env file.

Output will look like this:

Loaded existing browser state
╭───────────────────── Interactive Mode ─────────────────────╮
│ Index Browser Agent Interactive Mode                       │
│ Type your message and press Enter. The agent will respond. │
│ Press Ctrl+C to exit.                                      │
╰────────────────────────────────────────────────────────────╯

Choose an LLM model:
1. Gemini 2.5 Flash
2. Claude 3.7 Sonnet
3. OpenAI o4-mini
Select model [1/2/3] (1): 3
Using OpenAI model: o4-mini
Loaded existing browser state

Your message: go to lmnr.ai, summarize pricing page

Agent is working...
Step 1: Opening lmnr.ai
Step 2: Opening Pricing page
Step 3: Scrolling for more pricing details
Step 4: Scrolling back up to view pricing tiers
Step 5: Provided concise summary of the three pricing tiers
import asyncio
from index import Agent, AnthropicProvider

async def main():

    llm = AnthropicProvider(
            model="claude-3-7-sonnet-20250219",
            enable_thinking=True, 
            thinking_token_budget=2048)
    # llm = OpenAIProvider(model="o4-mini")  # or use an OpenAI model (import OpenAIProvider from index)

    agent = Agent(llm=llm)

    output = await agent.run(
        prompt="Navigate to news.ycombinator.com, find a post about AI, and summarize it"
    )
    
    print(output.result)
    
if __name__ == "__main__":
    asyncio.run(main())

Stream the agent's output

# `agent` is an Agent instance constructed as in the example above
async for chunk in agent.run_stream(
    prompt="Navigate to news.ycombinator.com, find a post about AI, and summarize it"
):
    print(chunk)
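Note that `async for` has to run inside a coroutine. A stdlib-only sketch of the same consumption pattern, with a fake async generator standing in for `agent.run_stream`:

```python
import asyncio

async def fake_stream(prompt):
    # Stand-in for agent.run_stream: yields chunks as they arrive.
    for chunk in ["Step 1: opening page", "Step 2: reading post", "Done"]:
        await asyncio.sleep(0)  # simulate waiting on the agent
        yield chunk

async def main():
    chunks = []
    async for chunk in fake_stream("summarize a post"):
        print(chunk)
        chunks.append(chunk)
    return chunks

if __name__ == "__main__":
    asyncio.run(main())
```

This prints each chunk as it is produced rather than waiting for the full result, which is the point of the streaming API.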

Enable browser agent observability

To trace the Index agent's actions and record the browser session, simply initialize Laminar tracing before running the agent.

from lmnr import Laminar

Laminar.initialize(project_api_key="your_api_key")

You then get full observability into the agent's actions, synced with the browser session in the Laminar platform.

(Screenshot: Index observability)

import asyncio
from index import Agent, AnthropicProvider, BrowserConfig

async def main():
    # Configure browser to connect to an existing Chrome DevTools Protocol endpoint
    browser_config = BrowserConfig(
        cdp_url="<cdp_url>"
    )
    
    llm = AnthropicProvider(model="claude-3-7-sonnet-20250219", enable_thinking=True, thinking_token_budget=2048)
    
    agent = Agent(llm=llm, browser_config=browser_config)
    
    output = await agent.run(
        prompt="Navigate to news.ycombinator.com and find the top story"
    )
    
    print(output.result)
    
if __name__ == "__main__":
    asyncio.run(main())

Customize browser window size

import asyncio
from index import Agent, AnthropicProvider, BrowserConfig

async def main():
    # Configure browser with custom viewport size
    browser_config = BrowserConfig(
        viewport_size={"width": 1200, "height": 900}
    )
    
    llm = AnthropicProvider(model="claude-3-7-sonnet-20250219")
    
    agent = Agent(llm=llm, browser_config=browser_config)
    
    output = await agent.run(
        "Navigate to a responsive website and capture how it looks in full HD resolution"
    )
    
    print(output.result)
    
if __name__ == "__main__":
    asyncio.run(main())

Made with ❤️ by the Laminar team

Contact us: contact @ memedata.com