Show HN: Why write code if the LLM can just do the thing? (web app experiment)

原始链接: https://github.com/samrolken/nokode

This project explores a future in which applications run without traditional code, relying entirely on a large language model (LLM) with access to tools. The author built a functional web server — a contact manager with CRUD operations — containing *no* application code. Instead, an LLM handles every HTTP request dynamically, deciding what to do based on the URL and user input. The LLM uses three tools: a database for SQL queries, a web-response tool for generating HTML/JSON, and a memory tool for storing user feedback. Notably, the AI designed the database schema on its own, implemented a REST-like API, and even acted on user feedback such as requests for UI changes. The system is not yet practical, however: responses take 30-60 seconds (300-6000x slower than a traditional app), and API token usage makes each request 100-1000x more expensive. It also suffers from consistency issues and occasional hallucinations. Despite these limitations, its success shows that LLMs have the *capability* to handle application logic, hinting at a future where computers respond directly to intent, bypassing the need for code and complex infrastructure. The author argues that improvements in inference speed, cost, context windows, and error rates could close the gap to that vision.


Original article

A web server with no application logic. Just an LLM with three tools.

One day we won't need code. LLMs will output video at 120fps, sample inputs in realtime, and just... be our computers. No apps, no code, just intent and execution.

That's science fiction.

But I got curious: with a few hours this weekend and today's level of tech, how far can we get?

I expected this to fail spectacularly.

Everyone's focused on AI that writes code. You know the usual suspects, Claude Code, Cursor, Copilot, all that. But that felt like missing the bigger picture. So I built something to test a different question: what if you skip code generation entirely? A web server with zero application code. No routes, no controllers, no business logic. Just an HTTP server that asks an LLM "what should I do?" for every request.

The goal: prove how far away we really are from that future.

Contact manager. Basic CRUD: forms, database, list views, persistence.

Why? Because most software is just CRUD dressed up differently. If this works at all, it would be something.

// The entire backend
const result = await generateText({
  model,
  tools: {
    database,      // Run SQL queries
    webResponse,   // Return HTML/JSON
    updateMemory   // Save user feedback
  },
  prompt: `Handle this HTTP request: ${method} ${path}`,
});

Three tools:

  • database - Execute SQL on SQLite. AI designs the schema.
  • webResponse - Return any HTTP response. AI generates the HTML, JavaScript, JSON or whatever fits.
  • updateMemory - Persist feedback to markdown. AI reads it on next request.
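Stripped of any particular SDK, each tool reduces to a description the model reads plus an execute function. A sketch of the three shapes — the names match the article, but the bodies are in-memory stand-ins, not the project's implementation (the real database tool runs SQL on SQLite, and memory is a markdown file):

```javascript
// Sketch of the three tool shapes with stubbed internals.
const rows = [];          // stand-in for the SQLite database
const memoryNotes = [];   // stand-in for the markdown memory file

const tools = {
  database: {
    description: "Execute SQL against the app database. You design the schema.",
    execute({ sql, params = [] }) {
      // Real tool: parameterized SQL on SQLite. Stub: insert/select only.
      if (/^insert/i.test(sql.trim())) { rows.push(params); return { changes: 1 }; }
      if (/^select/i.test(sql.trim())) return { rows };
      return { error: "unsupported in this stub" };
    },
  },
  webResponse: {
    description: "Return any HTTP response: HTML, JavaScript, JSON, whatever fits.",
    execute({ status, contentType, body }) {
      return { status, headers: { "Content-Type": contentType }, body };
    },
  },
  updateMemory: {
    description: "Persist user feedback; it is re-read on the next request.",
    execute({ note }) {
      memoryNotes.push(`- ${note}`);
      return { saved: true };
    },
  },
};
```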

The AI infers what to return from the path alone. Hit /contacts and you get an HTML page. Hit /api/contacts and you get JSON:

// What the AI generates for /api/contacts
{
  "contacts": [
    { "id": 1, "name": "Alice", "email": "[email protected]" },
    { "id": 2, "name": "Bob", "email": "[email protected]" }
  ]
}

Every page has a feedback widget. Users type "make buttons bigger" or "use dark theme" and the AI implements it.

It works. That's annoying.

Every click or form submission took 30-60 seconds. Traditional web apps respond in 10-100 milliseconds. That's 300-6000x slower. Each request cost $0.01-0.05 in API tokens—100-1000x more expensive than traditional compute. The AI spent 75-85% of its time reasoning, forgot what UI it generated 5 seconds ago, and when it hallucinated broken SQL, that meant an immediate 500 error. Colors drifted between requests. Layouts changed. I tried prompt engineering tricks like "⚡ THINK QUICKLY" and it made things slower, because the model spent more time reasoning about how to be fast.
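Those slowdown factors follow directly from the two latency ranges:

```javascript
// Sanity check on the slowdown factors quoted above.
const llmMs = { min: 30_000, max: 60_000 };  // 30-60 s per LLM-driven request
const appMs = { min: 10, max: 100 };         // 10-100 ms for a traditional app

const bestCase = llmMs.min / appMs.max;      // fastest LLM vs slowest app: 300x
const worstCase = llmMs.max / appMs.min;     // slowest LLM vs fastest app: 6000x
```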

But despite all that, forms actually submitted correctly. Data persisted across restarts. The UI was usable. APIs returned valid JSON. User feedback got implemented. The AI invented, without any examples, sensible database schemas with proper types and indexes, parameterized SQL queries that were safe from injection, REST-ish API conventions, responsive Bootstrap layouts, form validation, and error handling for edge cases. All emergent behavior from giving it three tools and a prompt.

So yes, the capability exists. The AI can handle application logic. It's just catastrophically slow, absurdly expensive, and has the memory of a goldfish.


The problems are all performance: speed (300-6000x slower), cost (100-1000x more expensive), consistency (no design memory), reliability (hallucinations → errors).

But these feel like problems of degree, not kind:

  • Inference: improving ~10x/year
  • Cost: heading toward zero
  • Context: growing (eventual design memory?)
  • Errors: dropping

The fact that I built a working CRUD app with zero application code, slow and expensive as it was, suggests we might be closer to "AI just does the thing" than to "AI helps write code."

In this project, what's left is infrastructure: HTTP setup, tool definitions, database connections. The application logic is gone. But the real vision? 120 inferences per second rendering displays with constant realtime input sampling. That becomes the computer. No HTTP servers, no databases, no infrastructure layer at all. Just intent and execution.

I think we don't realize how much code, as a thing, is mostly transitional.


.env:

LLM_PROVIDER=anthropic
ANTHROPIC_API_KEY=sk-ant-...
ANTHROPIC_MODEL=claude-3-haiku-20240307

Visit http://localhost:3001. First request: 30-60s.

What to try:

Check out prompt.md and customize it. Change what app it builds, add features, modify the behavior. That's the whole interface.

Out of the box it builds a contact manager. But try:

  • /game - Maybe you get a game?
  • /dashboard - Could be anything
  • /api/stats - Might invent an API
  • Type feedback: "make this purple" or "add a search box"

⚠️ Cost warning: Each request costs $0.001-0.05 depending on model. Budget accordingly.

MIT License
