Build powerful AI locally, extend anywhere.
LlamaFarm is an open-source framework for building retrieval-augmented and agentic AI applications. It ships with opinionated defaults (Ollama for local models, Chroma for vector storage) while staying 100% extendable—swap in vLLM, remote OpenAI-compatible hosts, new parsers, or custom stores without rewriting your app.
- Local-first developer experience with a single CLI (`lf`) that manages projects, datasets, and chat sessions.
- Production-ready architecture that mirrors server endpoints and enforces schema-based configuration.
- Composable RAG pipelines you can tailor through YAML, not bespoke code.
- Extendable everything: runtimes, embedders, databases, extractors, and CLI tooling.
📺 Video demo (90 seconds): https://youtu.be/W7MHGyN0MdQ
Prerequisites:

- Install the CLI

  macOS / Linux:

  ```bash
  curl -fsSL https://raw.githubusercontent.com/llama-farm/llamafarm/main/install.sh | bash
  ```

  Windows (via winget):

  ```powershell
  winget install LlamaFarm.CLI
  ```

- Adjust the Ollama context window
  - Open the Ollama app, go to Settings → Advanced, and set the context window to match production (e.g., 100K tokens).
  - Larger context windows improve RAG answers when long documents are ingested.

- Create and run a project

  ```bash
  lf init my-project   # Generates llamafarm.yaml using the server template
  lf start             # Spins up Docker services & opens the dev chat UI
  ```

- Start an interactive project chat or send a one-off message

  ```bash
  # Interactive project chat (auto-detects namespace/project from llamafarm.yaml)
  lf chat

  # One-off message
  lf chat "Hello, LlamaFarm!"
  ```
Need the full walkthrough with dataset ingestion and troubleshooting tips? Jump to the Quickstart guide.
Prefer building from source? Clone the repo and follow the steps in Development & Testing.
Run services manually (without Docker auto-start):
```bash
git clone https://github.com/llama-farm/llamafarm.git
cd llamafarm

# Install Nx globally and bootstrap the workspace
npm install -g nx
nx init --useDotNxInstallation --interactive=false

# Option 1: start both server and RAG worker with one command
nx dev

# Option 2: start services in separate terminals
# Terminal 1
nx start rag

# Terminal 2
nx start server
```
Open another terminal to run `lf` commands (installed or built from source). This is equivalent to what `lf start` orchestrates automatically.
- Own your stack – Run small local models today and swap to hosted vLLM, Together, or custom APIs tomorrow by changing `llamafarm.yaml`.
- Battle-tested RAG – Configure parsers, extractors, embedding strategies, and databases without touching orchestration code.
- Config over code – Every project is defined by YAML schemas that are validated at runtime and easy to version control.
- Friendly CLI – `lf` handles project bootstrapping, dataset lifecycle, RAG queries, and non-interactive chats.
- Built to extend – Add a new provider or vector store by registering a backend and regenerating schema types.
| Task | Command | Notes |
|---|---|---|
| Initialize a project | `lf init my-project` | Creates `llamafarm.yaml` from the server template. |
| Start dev stack + chat TUI | `lf start` | Spins up the server and RAG worker, monitors Ollama/vLLM. |
| Interactive project chat | `lf chat` | Opens the TUI using the project from `llamafarm.yaml`. |
| Send a single prompt | `lf chat "Explain retrieval augmented generation"` | Uses RAG by default; add `--no-rag` for pure LLM. |
| Preview REST call | `lf chat --curl "What models are configured?"` | Prints a sanitized curl command. |
| Create a dataset | `lf datasets create -s pdf_ingest -b main_db research-notes` | Validates strategy/database against project config. |
| Upload files | `lf datasets upload research-notes ./docs/*.pdf` | Supports globs and directories. |
| Process a dataset | `lf datasets process research-notes` | Streams heartbeat dots during long processing. |
| Semantic query | `lf rag query --database main_db "What did the 2024 FDA letters require?"` | Use `--filter`, `--include-metadata`, etc. |
See the CLI reference for full command details and troubleshooting advice.
LlamaFarm provides a comprehensive REST API (compatible with OpenAI's format) for integrating with your applications. The API runs at `http://localhost:8000`.
Chat Completions (OpenAI-compatible)
```bash
curl -X POST http://localhost:8000/v1/projects/{namespace}/{project}/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "What are the FDA requirements?"}
    ],
    "stream": false,
    "rag_enabled": true,
    "database": "main_db"
  }'
```
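Because the endpoint speaks the OpenAI chat-completions format, the official `openai` Python client can target it directly. A minimal sketch, assuming the default local port; the namespace, project, and model names are placeholders, and the LlamaFarm-specific `rag_enabled`/`database` fields ride along via `extra_body`:

```python
from openai import OpenAI

# Point the standard OpenAI client at a LlamaFarm project endpoint.
# "my-org"/"my-project" are placeholders; substitute your own.
client = OpenAI(
    base_url="http://localhost:8000/v1/projects/my-org/my-project",
    api_key="sk-local-placeholder",  # local servers typically ignore the key
)

response = client.chat.completions.create(
    model="qwen2.5:7b",  # may be ignored in favor of the project's configured runtime
    messages=[{"role": "user", "content": "What are the FDA requirements?"}],
    extra_body={"rag_enabled": True, "database": "main_db"},  # LlamaFarm extensions
)
print(response.choices[0].message.content)
```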
RAG Query
```bash
curl -X POST http://localhost:8000/v1/projects/{namespace}/{project}/rag/query \
  -H "Content-Type: application/json" \
  -d '{
    "query": "clinical trial requirements",
    "database": "main_db",
    "top_k": 5
  }'
```
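The same query from Python, as a minimal sketch with `requests` (the namespace/project placeholders are hypothetical; inspect the raw JSON rather than assuming field names):

```python
import requests

url = "http://localhost:8000/v1/projects/my-org/my-project/rag/query"
payload = {
    "query": "clinical trial requirements",
    "database": "main_db",
    "top_k": 5,
}

# POST the query and fail loudly on HTTP errors.
resp = requests.post(url, json=payload, timeout=30)
resp.raise_for_status()
print(resp.json())  # response schema is documented in the API Reference
```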
Dataset Management
```bash
# Upload file
curl -X POST http://localhost:8000/v1/projects/{namespace}/{project}/datasets/{dataset}/data \
  -F "file=@document.pdf"

# Process dataset
curl -X POST http://localhost:8000/v1/projects/{namespace}/{project}/datasets/{dataset}/process
```
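And the two dataset calls from Python, a minimal sketch with `requests`; the multipart field name `file`, the file name, and the dataset path are assumptions based on the curl example above:

```python
import requests

base = "http://localhost:8000/v1/projects/my-org/my-project/datasets/research-notes"

# Upload a local PDF as multipart form data (field name "file" is assumed).
with open("document.pdf", "rb") as f:
    upload = requests.post(f"{base}/data", files={"file": f}, timeout=120)
upload.raise_for_status()

# Trigger processing for everything uploaded to the dataset.
process = requests.post(f"{base}/process", timeout=120)
process.raise_for_status()
print("dataset queued for processing")
```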
Check your `llamafarm.yaml`:

```yaml
name: my-project # Your project name
namespace: my-org # Your namespace
```

Or inspect the file system: `~/.llamafarm/projects/{namespace}/{project}/`
See the complete API Reference for all endpoints, request/response formats, Python/TypeScript clients, and examples.
`llamafarm.yaml` is the source of truth for each project. The schema enforces required fields and documents every extension point.
```yaml
version: v1
name: fda-assistant
namespace: default

runtime:
  provider: openai # "openai" for any OpenAI-compatible host, "ollama" for local Ollama
  model: qwen2.5:7b
  base_url: http://localhost:8000/v1 # Point to vLLM, Together, etc.
  api_key: sk-local-placeholder
  instructor_mode: tools # Optional: json, md_json, tools, etc.

prompts:
  - role: system
    content: >-
      You are an FDA specialist. Answer using short paragraphs and cite document titles when available.

rag:
  databases:
    - name: main_db
      type: ChromaStore
      default_embedding_strategy: default_embeddings
      default_retrieval_strategy: filtered_search
  embedding_strategies:
    - name: default_embeddings
      type: OllamaEmbedder
      config:
        model: nomic-embed-text:latest
  retrieval_strategies:
    - name: filtered_search
      type: MetadataFilteredStrategy
      config:
        top_k: 5
  data_processing_strategies:
    - name: pdf_ingest
      parsers:
        - type: PDFParser_LlamaIndex
          config:
            chunk_size: 1500
            chunk_overlap: 200
      extractors:
        - type: HeadingExtractor
        - type: ContentStatisticsExtractor

datasets:
  - name: research-notes
    data_processing_strategy: pdf_ingest
    database: main_db
```
Configuration reference: Configuration Guide • Extending LlamaFarm
- Swap runtimes by pointing to any OpenAI-compatible endpoint (vLLM, Mistral, Anyscale). Update `runtime.provider`, `base_url`, and `api_key`; regenerate schema types if you add a new provider enum (see the sketch after this list).
- Bring your own vector store by implementing a store backend, adding it to `rag/schema.yaml`, and updating the server service registry.
- Add parsers/extractors to support new file formats or metadata pipelines. Register implementations and extend the schema definitions.
- Extend the CLI with new Cobra commands under `cli/cmd`; the docs include guidance on adding dataset utilities or project tooling.
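For instance, a minimal sketch of the first bullet, swapping the runtime to a self-hosted vLLM endpoint; the host, port, and model name below are illustrative placeholders, not tested values:

```yaml
# llamafarm.yaml excerpt: point the runtime at an OpenAI-compatible vLLM server.
runtime:
  provider: openai # vLLM exposes an OpenAI-compatible API
  model: meta-llama/Llama-3.1-8B-Instruct # placeholder model id
  base_url: http://my-vllm-host:8000/v1 # placeholder host/port
  api_key: sk-placeholder # or your provider's key
```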
Check the Extending guide for step-by-step instructions.
| Example | What it Shows | Location |
|---|---|---|
| FDA Letters Assistant | Multi-document PDF ingestion, RAG queries, reference-style prompts | `examples/fda_rag/` & Docs |
| Raleigh UDO Planning Helper | Large ordinance ingestion, long-running processing tips, geospatial queries | `examples/gov_rag/` & Docs |
Run `lf datasets` and `lf rag query` commands from each example folder to reproduce the flows demonstrated in the docs.
```bash
# Python server + RAG tests
cd server
uv sync
uv run --group test python -m pytest

# CLI tests
cd ../cli
go test ./...

# RAG tooling smoke tests
cd ../rag
uv sync
uv run python cli.py test

# Docs build (ensures navigation/link integrity)
cd ..
nx build docs
```
Linting: `uv run ruff check --fix .` (Python), `go fmt ./...` and `go vet ./...` (Go).
- Discord – chat with the team, share feedback, find collaborators.
- GitHub Issues – bug reports and feature requests.
- Discussions – ideas, RFCs, roadmap proposals.
- Contributing Guide – code style, testing expectations, doc updates, schema regeneration steps.
Want to add a new provider, parser, or example? Start a discussion or open a draft PR—we love extensions!
- Licensed under the Apache 2.0 License.
- Built by the LlamaFarm community and inspired by the broader open-source AI ecosystem. See CREDITS for detailed acknowledgments.
Build locally. Deploy anywhere. Own your AI.