为什么GADTs对性能很重要（2015）

为什么GADTs对性能很重要（2015）
Why GADTs matter for performance (2015)

原始链接: https://blog.janestreet.com/why-gadts-matter-for-performance/

在GADTs出现之前，OCaml对多态类型的统一内存表示导致空间效率低下，尤其是在数组方面。虽然`bytes`为字符提供了更紧凑的表示，但将其集成到一般数据结构中却很笨拙。尝试创建`Compact_array`（作为标准变体或简易对象），使用字节和标准数组表示，都会面临类型推断问题或产生闭包分配开销。 GADTs通过允许精确指定变体构造函数的类型来解决这个问题。通过使用GADTs定义`Compact_array`（`Array : 'a array -> 'a t` 和 `Bytes : bytes -> char t`），`Bytes`构造函数与`char`显式关联，从而实现正确的类型推断。为了获得所需的多态函数类型（例如，`length : 'a t -> int`），使用了局部抽象类型和注解，帮助编译器处理匹配分支间的类型变化。这允许构建具有细粒度内存表示控制的高效抽象，这对于系统编程中的性能至关重要。事实证明，GADTs不仅仅是特定领域的语言，它实际上是提高内存表示和性能的一种强大方法。

这个Hacker News讨论串探讨了使用广义代数数据类型 (GADTs) 进行性能优化的优势，并提到了Jane Street的一篇相关博文。一位用户请求一些非编译器的实际例子。其他人则提供了见解，包括GADTs如何强制执行允许状态的编译时约束，例如在FPGA处理器设计中，减少错误并启用编译器优化。讨论还涉及到Jane Street使用的OCaml语言，以及它是否赋予了他们竞争优势，尽管它并不主流。用户们讨论了使用特定类型（如字节数组）的性能优势以及与更通用类型相比的权衡。一些人争论了该语言的开销和好处。英特尔Skylake漏洞及其与不同编译器和语言的关系也被提及。最终，该讨论串探讨了GADTs在启用类型安全优化方面的强大功能，以及在性能关键型领域中使用OCaml的策略选择。

（评论） 2023-11-06

(评论) 2025-03-20

(评论) 2025-03-08

（评论） 2025-05-09

Haskell中的打包数据支持 2025-04-28

原文

When GADTs (Generalized Algebraic Data Types) landed in OCaml, I wasn’t particularly happy about it. I assumed that it was the kind of nonsense you get when you let compiler writers design your programming language.

Which is to say that the standard GADT examples all seem to be about the kinds of things that compiler writers do, like embed domain-specific languages or build typed abstract-syntax trees. But it didn’t seem particularly relevant for the kind of systems programming that I think about.

But it became apparent pretty quickly that I was wrong. In particular, since GADTs have landed, at Jane Street we’ve found lots of examples where GADTs are important for performance, of all things. The theme in these examples is that GADTs enable you to tweak your memory representations in ways that would otherwise be painful or impossible to do safely in OCaml.

The Problem of Polymorphism

I’d like to walk through a simple example that illustrates this aspect of GADTs, but first, a few words about OCaml’s memory representation. OCaml’s polymorphism is in an important way backed on that memory representation. In particular, consider a simple polymorphic function like List.iter, which has the following type:

val iter: 'a list -> f:('a -> unit) -> unit

The polymorphic type tells you that List.iter can operate on lists of any type, and in OCaml, this is achieved with a single compiled version of the code. This is possible because the memory representation of the elements of a list are uniform: you can always refer to an OCaml value in a single word, either as a pointer to a heap-allocated value, or as an immediate that fits inside that word.

That means that some OCaml datatypes are less efficient space-wise than you might imagine. Arrays, for example, take the same amount of space per element whether those elements are bytes, 32-bit ints, or 64-bit ints. (There’s actually some special magic in the compiler for float arrays, though this is probably more trouble than it’s worth, as described by Alain Frisch here. But let’s ignore float arrays for now.)

OCaml does have a tighter representations for byte arrays, called bytes. But it’s a completely different type, and so building a general purpose data structure that uses bytes when it would make sense, and ordinary arrays otherwise, is a little awkward.

Controlling memory representation without GADTs

Let’s see what happens if we try to design (without GADTs) an array type that sometimes uses the general array representation and sometimes uses bytes.

You could imagine representing such a value using an ordinary variant.

type 'a t = | Array of 'a array
            | Bytes of bytes

We could then implement each of the operations we want on our new array type, implementing each operation differently depending on the particular representation. Let’s see what happens if we just take this idea and run with it, implementing all the required functions in the most straightforward way.

> module Compact_array = struct

    type 'a t = | Array of 'a array
                | Bytes of bytes

    let of_bytes x : char t = Bytes x
    let of_array x = Array x

    let length = function
      | Array a -> Array.length a
      | Bytes s -> Bytes.length s

    let get t i =
      match t with
      | Array a -> a.(i)
      | Bytes s -> s.[i]

    let set t i v =
      match t with
      | Array a -> a.(i) <- v
      | Bytes s -> s.[i] <- v

  end;;

module Compact_array :
  sig
    type 'a t = Array of 'a array | Bytes of bytes
    val of_bytes : bytes -> char t
    val of_array : 'a array -> 'a t
    val length : 'a t -> int
    val get : char t -> int -> char
    val set : char t -> int -> char -> unit
  end

This seems pretty good at first glance, but the inferred types aren’t quite what we want. In particular, get and set only work with Compact_arrays containing characters. If you think about how type inference works, it’s not really all that surprising. If you think about the code for get:

let get t i =
  match t with
  | Array  a -> Array.get  a i
  | String s -> String.get s i

The OCaml compiler is looking for a single type to assign to the return value for all the cases of the match. Given that String.get always returns a char, then Compact_array.get will be restricted to only returning a char.

One way to work around this problem is to essentially implement what we want as a poor-man’s object. Here, we just write the code separately for the different cases, and stuff those functions into a record full of closures. Here’s how that looks.

> module Compact_array = struct

  type 'a t = { len: unit -> int
              ; get: int -> 'a
              ; set: int -> 'a -> unit
              }

  let of_string s =
    { len = (fun () -> String.length s)
    ; get = (fun i -> String.get s i)
    ; set = (fun i x -> String.set s i x)
    }

  let of_array a =
    { len = (fun () -> Array.length a)
    ; get = (fun i -> Array.get a i)
    ; set = (fun i x -> Array.set a i x)
    }

  let length t = t.len ()
  let get t i = t.get i
  let set t i x = t.set i x

end;;
module Compact_array :
  sig
    type 'a t = {
      len : unit -> int;
      get : int -> 'a;
      set : int -> 'a -> unit;
    }
    val of_string : bytes -> char t
    val of_array : 'a array -> 'a t
    val length : 'a t -> int
    val get : 'a t -> int -> 'a
    val set : 'a t -> int -> 'a -> unit
  end

This more or less solves the problem, but it’s still not really the memory representation we want. In particular, we have to allocate three closures for each Compact_array.t, and this number of closures will only go up as we add more functions whose behavior depends on the underlying array.

GADTs to the rescue

Let’s go back to our failed variant-based implementation, but rewrite it using the GADT syntax. Note that we’re not trying to change the types at all this time, just rewriting the same type we had before in the language of GADTs.

type 'a t = | Array : 'a array -> 'a t
            | Bytes : bytes -> 'a t

The syntax of this declaration suggests thinking about variant constructor like Array or Bytes as functions from the constructor arguments to the type of the resulting value, with the thing to the right of the : roughly corresponding to the type signature of the constructor.

Note that for the Array constructor, the type value of 'a depends on the type of the argument:

> Array [|1;2;3|];;
- : int t = Array [|1; 2; 3|]
> Array [|"one";"two";"three"|];;
- : bytes t = Array [|"one"; "two"; "three"|]

But for the Bytes constructor, the type 'a in the type is still free.

> Bytes "foo";;
- : 'a t = Bytes "foo"

This is really the problematic case, because we’d like for Bytes "foo" for the parameter 'a to by char, since in the Bytes case, that’s what the element type of our array is.

Because GADTs give us the ability to specify the type on the right-hand side of the arrow, we can get that.

type 'a t = | Array : 'a array -> 'a t
            | Bytes : bytes -> char t

Now, the Bytes constructor behaves as we’d like it too.

> Bytes "foo";;
- : char t = Bytes "foo"

Now let’s see what happens when we try to write the length function.

> let length t = 
     match t with
     | Bytes b -> Bytes.length b
     | Array a -> Array.length a
  ;;
val length : char t -> int = <fun>

Disappointingly, we’re again stuck with a function that doesn’t have the right type. In particular, the compiler has decided that this function can only operate on char t, when we want it to work for arrays of any type.

But the problem now is that type inference in the presence of GADTs is difficult, and the compiler needs a little help. Roughly speaking, without some hints, OCaml’s type system will try to identify all types as having a single value within a given function. But in this case, we need a type variable which might have different values in different branches of a match statement.

We can do this by creating a locally-abstract type el to represent the type parameter of t (and the element type), and annotating t accordingly.

> let length (type el) (t:el t) = 
     match t with
     | Bytes b -> Bytes.length b
     | Array a -> Array.length a
  ;;
val length : 'a t -> int = <fun>

Now we see that we get the right type. We can push this approach through to get a complete implementation.

> module Compact_array = struct

    type 'a t = | Array  : 'a array -> 'a t
                | Bytes : bytes -> char t

    let of_bytes x = Bytes x
    let of_array x = Array x

    let length (type el) (t:el t) =
      match t with
      | Array a -> Array.length a
      | Bytes s -> Bytes.length s

    let get (type el) (t:el t) i : el =
      match t with
      | Array a -> Array.get a i
      | Bytes s -> Bytes.get s i

    let set (type el) (t:el t) i (v:el) =
      match t with
      | Array a -> Array.set a i v
      | Bytes s -> Bytes.set s i v

  end;;
module Compact_array :
  sig
    type 'a t = Array : 'a array -> 'a t | Bytes : bytes -> char t
    val of_bytes : bytes -> char t
    val of_array : 'a array -> 'a t
    val length : 'a t -> int
    val get : 'a t -> int -> 'a
    val set : 'a t -> int -> 'a -> unit
  end

As I said at the beginning, this is really just an example of the more general theme. GADTs are about more than clever typed interpreters; they’re a powerful mechanism for building easy to use abstractions that give you more precise control of your memory representation. And getting the right memory representation is often critical for building high performance applications.