Better Than JSON

Original link: https://aloisdeniel.com/blog/better-than-json

## Beyond JSON: Why Choose Protocol Buffers (Protobuf) for Your APIs?

For years, JSON has been the dominant data format for APIs thanks to its human readability, flexibility, and broad tooling support. Yet many developers, the author included, have opted for a more efficient alternative: Protocol Buffers (Protobuf). Developed at Google and made public in 2008, Protobuf offers significant advantages in performance and maintainability, especially in modern architectures such as microservices.

Protobuf relies on a `.proto` file that establishes a strict contract for data structures, eliminating the ambiguity and potential errors common with JSON's flexible nature. That file is used to generate strongly typed code in many languages (Dart, TypeScript, Go, and more), enabling automatic validation and reducing parsing overhead.

Crucially, Protobuf is a *binary* format, yielding significantly smaller messages (roughly 3x smaller in the article's example) and therefore faster transfers and lower bandwidth usage. JSON is easy to inspect, whereas Protobuf's binary nature requires specialized tooling for debugging.

Although commonly associated with gRPC, Protobuf can be used independently over traditional HTTP APIs. The author advocates Protobuf for its superior performance, robustness, and developer experience, and encourages others to consider it for their next project.

A Hacker News thread around the "Better Than JSON" article sparked a debate over data-transfer formats. While Protobuf (Protocol Buffers) offers advantages such as schema enforcement and efficiency, commenters questioned its practicality. One user recounted a painful experience implementing Protobuf in a production system, arguing that JSON, despite its flaws, may be simpler. Others noted that Protobuf is not a universal solution for distributed systems, particularly with regard to version compatibility.

The conversation highlighted how Protobuf's strict schema is both a strength (no external schema-validation tooling needed) and a weakness, since JSON's flexibility suits many use cases better. CBOR (Concise Binary Object Representation) was suggested as a potentially stronger alternative to JSON because of its concise encoding. Finally, some users debated whether the original article was AI-generated, citing its confusing writing style, while others felt that focusing on the quality of the content was more productive than speculating about the writing tools.

Original article

Or why I stopped using JSON for my APIs

If you develop or use an API, there’s a 99% chance it exchanges data encoded in JSON. It has become the de facto standard for the modern web. And yet, for almost ten years, whenever I develop servers—whether for personal or professional projects—I do not use JSON.

And I find it surprising that JSON is so omnipresent when there are far more efficient alternatives, sometimes better suited to a truly modern development experience. Among them: Protocol Buffers, or Protobuf.

In this article, I’d like to explain why.

Before going any further, let’s put the topic back into context.

An API (Application Programming Interface) is a set of rules that allow two systems to communicate. In the web world, REST APIs—those using the HTTP protocol and its methods (GET, POST, PUT, DELETE…)—are by far the most widespread.

When a client sends a request to a server, the two exchange messages containing:

  • headers, including the well-known Content-Type, which indicates the message format (JSON, XML, Protobuf, etc.);
  • a body (payload), which contains the data itself;
  • and, in the server's reply, a response status.

Serialization is the process of turning a data structure into a sequence of bytes that can be transmitted. JSON, for example, serializes data as human-readable text.
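As a minimal illustration (in Python here rather than the article's Dart, purely for brevity), JSON serialization is just "data structure to text to bytes":

```python
import json

user = {"id": 42, "name": "Alice", "isActive": True}

# Serialize: data structure -> human-readable text -> bytes on the wire.
payload = json.dumps(user).encode("utf-8")
print(payload)

# Deserialize: bytes -> text -> data structure.
restored = json.loads(payload.decode("utf-8"))
assert restored == user
```

The round trip is lossless here, but notice that the keys travel as text on every single message.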

There are many reasons for its popularity:

Human-readable

JSON is easy to understand, even for non-developers. A simple console.log() is often enough to inspect most data.

Perfectly integrated into the web

It was propelled by JavaScript, then massively adopted by backend frameworks.

Flexible

You can add a field, remove one, or change a type “on the fly.” Useful… sometimes too much.

Tools everywhere

Need to inspect JSON? Any text editor will do. Need to send a request? Curl is enough. Result: massive adoption, rich ecosystem.


However, despite these advantages, another format offers me better efficiency—for both developers and end users.

There’s a strong chance you’ve never really worked with Protobuf. Yet this format was created as early as 2001 at Google and made public in 2008.

It’s heavily used inside Google and in many modern infrastructures—especially for inter-service communication in microservice architectures.

So why is it so discreet in public API development?

Perhaps because Protobuf is often associated with gRPC, and developers think they must use both together (which is false). Maybe also because it’s a binary format, making it feel less “comfortable” at first glance.

But here’s why I personally use it almost everywhere.

With JSON, you often send ambiguous or non-guaranteed data. You may encounter:

  • a missing field,
  • an incorrect type,
  • a typo in a key,
  • or simply an undocumented structure.

With Protobuf, that’s impossible. Everything starts with a .proto file that defines the structure of messages precisely.

Example of a Proto3 file

syntax = "proto3";

message User {
  int32 id = 1;
  string name = 2;
  string email = 3;
  bool isActive = 4;
}

Each field has:

  • a strict type (string, int32, bool…)
  • a numeric identifier (1, 2, 3…)
  • a stable name (name, email…)

This file is then used to automatically generate code in your preferred language.

Code generation

You use protoc (with the Dart code generator, protoc_plugin, installed):

protoc --dart_out=lib user.proto

and the generated user.pb.dart gives you a strongly typed User class you can use directly:

final user = User()
  ..id = 42
  ..name = "Alice"
  ..email = "[email protected]"
  ..isActive = true;

final bytes = user.writeToBuffer();       // serialize to Protobuf binary
final sameUser = User.fromBuffer(bytes);  // deserialize back to a User

No manual validation. No JSON parsing. No risk of type errors.

And this mechanism works with:

  • Dart
  • TypeScript
  • Kotlin
  • Swift
  • C#
  • Go
  • Rust
  • and many more…

It represents a huge time saver and brings exceptional maintainability comfort.

Another major strength of Protobuf: it’s a binary format, designed to be compact and fast.

Let’s compare with JSON.

Example JSON message

{
  "id": 42,
  "name": "Alice",
  "email": "[email protected]",
  "isActive": true
}

Size: 78 bytes (depending on whitespace).

The same message in Protobuf binary

→ About 23 bytes. Roughly 3× more compact, and often much more depending on structure.
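You can reproduce the JSON side of this comparison yourself (Python for brevity; the email below is a placeholder, since the original value is redacted above, so the exact byte count is illustrative):

```python
import json

# Placeholder email -- the article's actual value is redacted.
user = {"id": 42, "name": "Alice", "email": "alice@example.com", "isActive": True}

# Even with all whitespace stripped, JSON must repeat every key as text.
compact = json.dumps(user, separators=(",", ":")).encode("utf-8")
print(len(compact))

# Quotes + colon around each key are pure overhead Protobuf doesn't pay.
key_overhead = sum(len(k) + 3 for k in user)
print(key_overhead)
```

Roughly half the payload here is keys and punctuation rather than data, which is exactly what Protobuf's numeric tags eliminate.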

Why? Because Protobuf uses:

  • compact “varint” encoding for numbers
  • no textual keys (they’re replaced by numeric tags)
  • no spaces, no JSON overhead
  • optimized optional fields
  • a very efficient internal structure
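The "varint" encoding mentioned above can be sketched in a few lines (Python for brevity): each byte carries 7 bits of the number, and the high bit signals whether more bytes follow, so small values fit in a single byte.

```python
def encode_varint(n: int) -> bytes:
    """Protobuf varint: 7 payload bits per byte, low-order group first;
    the high bit is set on every byte except the last."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

print(encode_varint(42).hex())   # 2a   -> our id fits in one byte
print(encode_varint(300).hex())  # ac02 -> two bytes
```

Compare with JSON, where `42` costs two text bytes before you even count the quoted key in front of it.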

Results:

  • less bandwidth
  • faster response times
  • savings on mobile data
  • direct impact on user experience

To make things more concrete, let’s build a minimal HTTP server in Dart using the shelf package, and return our User object serialized as Protobuf, with the correct Content-Type.

We’ll assume you already have the previously generated code for the User type.

Create a simple Shelf server

Create a file bin/server.dart:

import 'dart:io';

import 'package:shelf/shelf.dart';
import 'package:shelf/shelf_io.dart' as shelf_io;
import 'package:shelf_router/shelf_router.dart';

import 'package:your_package_name/user.pb.dart'; 

void main(List<String> args) async {
  final router = Router()
    ..get('/user', _getUserHandler);

  final handler = const Pipeline()
      .addMiddleware(logRequests())
      .addHandler(router);

  final server = await shelf_io.serve(handler, InternetAddress.anyIPv4, 8080);
  print('Server listening on http://${server.address.host}:${server.port}');
}

Response _getUserHandler(Request request) {
  final user = User()
    ..id = 42
    ..name = 'Alice'
    ..email = '[email protected]'
    ..isActive = true;

  final bytes = user.writeToBuffer();

  return Response.ok(
    bytes,
    headers: {
      'content-type': 'application/protobuf',
    },
  );
}

Key points:

  • User() comes from the generated Protobuf code.
  • writeToBuffer() serializes the object into Protobuf binary.
  • The Content-Type header is set to application/protobuf, allowing clients to know they must decode Protobuf instead of JSON.

Calling the Protobuf API from Dart (using http)

Once your server returns a Protobuf-encoded User, you can retrieve and decode it directly from Dart. All you need is:

  • the http package
  • the generated Protobuf classes (user.pb.dart)

Create a Dart file (e.g. bin/client.dart):

import 'package:http/http.dart' as http;

import 'package:your_package_name/user.pb.dart'; 

Future<void> main() async {
  final uri = Uri.parse('http://localhost:8080/user');

  final response = await http.get(
    uri,
    headers: {
      'Accept': 'application/protobuf',
    },
  );

  if (response.statusCode == 200) {
    // Decode the binary body with the generated Protobuf class.
    final user = User.fromBuffer(response.bodyBytes);

    print('User received:');
    print('  id       : ${user.id}');
    print('  name     : ${user.name}');
    print('  email    : ${user.email}');
    print('  isActive : ${user.isActive}');
  } else {
    print('Request failed: ${response.statusCode}');
  }
}

With this setup, both the server and the client rely on the same Protobuf definition, ensuring that data structures stay perfectly aligned without manual validation or JSON parsing. The same .proto file generates strongly typed code on both sides, making it impossible for the client and server to “disagree” about the shape or type of the data.

And this is not limited to Dart: the exact same approach works seamlessly if your server is written in Go, Rust, Kotlin, Swift, C#, TypeScript, or any language supported by the Protobuf compiler. Protobuf acts as a shared contract, giving you end-to-end type safety and consistent, compact data serialization across your entire stack.

You can decode Protobuf messages, of course—but unlike JSON, you don’t see human-readable field names. Instead, you see numeric field identifiers and wire types. The data is meaningful, but without the corresponding .proto schema you can only interpret it at a structural level, not semantically. You can see the fields, but you don’t know what they represent.
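That structural-only view is easy to demonstrate with a tiny raw decoder (a Python sketch that handles only the two wire types our User message uses; real tooling such as `protoc --decode_raw` does this properly):

```python
def read_varint(data: bytes, i: int):
    """Read one varint at offset i; return (value, next offset)."""
    shift = result = 0
    while True:
        b = data[i]
        i += 1
        result |= (b & 0x7F) << shift
        if not b & 0x80:
            return result, i
        shift += 7

def decode_raw(data: bytes):
    """Structural decode: (field_number, value) pairs, no field names."""
    fields = []
    i = 0
    while i < len(data):
        key, i = read_varint(data, i)
        field_number, wire_type = key >> 3, key & 7
        if wire_type == 0:            # varint (ints, bools)
            value, i = read_varint(data, i)
        elif wire_type == 2:          # length-delimited (strings, bytes)
            length, i = read_varint(data, i)
            value, i = data[i:i + length], i + length
        else:
            raise ValueError("wire type not handled in this sketch")
        fields.append((field_number, value))
    return fields

# Hand-encoded User {id: 42, name: "Alice"}:
buf = bytes([0x08, 42, 0x12, 5]) + b"Alice"
print(decode_raw(buf))  # [(1, 42), (2, b'Alice')] -- numbers, not names
```

Without the .proto file, field 1 is just "some varint" and field 2 "some bytes"; the schema is what turns them back into `id` and `name`.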

Human-friendly debugging

JSON can be read and understood immediately.

{
  "id": 42,
  "name": "Alice",
  "email": "[email protected]",
  "isActive": true
}

A Protobuf payload, being binary, can’t be interpreted in a meaningful, human-readable way without knowing the schema behind it.

1: 42
2: "Alice"
3: "[email protected]"
4: true

This doesn’t prevent you from working with Protobuf, but it does add some complexity:

  • requires specialized tooling
  • schemas must be maintained and versioned
  • decoding tools are essential

For me, the trade-off is well worth it given the performance and efficiency benefits Protobuf provides.

I hope this article makes you want to try Protobuf. It’s an incredibly mature, extremely performant tool, but still too invisible in the world of public APIs.

And even though Protobuf is often associated with gRPC, nothing forces you to use both. Protobuf can work independently, on any traditional HTTP API.

If you’re looking for:

  • more performance,
  • more robustness,
  • fewer errors,
  • and a genuinely enjoyable development experience,

then I strongly encourage you to try Protobuf on your next project.
