WebMonkeys:JavaScript 的并行 GPU 编程
WebMonkeys: parallel GPU programming in JavaScript (2016)

原始链接: https://github.com/VictorTaelin/WebMonkeys

WebMonkeys 使用简洁的 API 简化了浏览器和 Node.js 中的 GPU 并行处理。它利用 WebGL,无需扩展,即可使用 GLSL 1.0(以及用于数组访问和设置器的扩展)实现数千个并行任务。 核心函数包括 `set`(将数据发送到 GPU)、`get`(检索数据)和 `work`(执行并行任务)。`work` 接受一定数量的并行任务和一段 GLSL 代码片段,其中每个“猴子”(任务)都使用其索引 `i` 来操作数据。WebMonkeys 处理了 WebGL 中固有的数据编码、纹理映射和精度问题等复杂性。 性能提示包括:最小化 `set`/`get` 调用以提高速度;通过重用代码片段来缓存着色器;当算术运算强度较低时,使用原始缓冲区来优化数据处理。`fill` 和 `clear` 等特殊函数可以快速初始化数组。

Hacker News 的讨论集中在 WebMonkeys 上,这是一个 2016 年的 JavaScript 并行 GPU 编程库。评论者指出它缺乏近期的维护,并探讨了在浏览器中进行 GPU 加速的替代方案。WebGPU 被强调为一个很有前景的技术,尽管它属于底层技术,但它可以将计算着色器引入 Web。 一些用户讨论了 GPU 使用的更广泛的影响以及抽象 GPU 架构复杂性的挑战。一位评论者表达了对 JavaScript 到 GLSL 编译器的兴趣,以便利用 GPU 的强大功能来执行通用计算和 AI 任务。其他主题包括使用 raylib 或 SDL 进行 3D 图形处理,以及使用 QuickJS 等 JavaScript 解释器,以及 Deno 或 Bun 在浏览器之外访问 GPU 的潜力。这场讨论反映了人们对 Web 上易于访问的 GPU 编程的渴望。
相关文章

原文

Allows you to spawn thousands of parallel tasks on the GPU with the simplest, dumbest API possible. It works on the browser (with browserify) and on Node.js. It is ES5-compatible and doesn't require any WebGL extension.

On the browser, add <script src="WebMonkeys.js"><script> to your HTML. On Node.js, install it from npm:

npm install webmonkeys --save

The example below uses the GPU to square all numbers in an array in parallel:

// Creates a WebMonkeys object
const monkeys = require("WebMonkeys")(); // on the browser, call WebMonkeys() instead

// Sends an array of numbers to the GPU
monkeys.set("nums", [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]);

// Employs 16 monkeys to work in parallel on the task of squaring each number
monkeys.work(16, "nums(i) := nums(i) * nums(i);");

// Receives the result back
console.log(monkeys.get("nums"));

// output: [ 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196, 225, 256 ]

set/get allow you to send/receive data from the GPU, and work creates a number of parallel tasks (monkeys) that can read, process and rewrite that data. The language used is GLSL 1.0, extended array access (foo(index), usable anywhere on the source), setters (foo(index) := value, usable on the end only), and int i, a global variable with the index of the monkey.

More elaborate algorithms can be developed with GLSL.

  • Vector multiplication:

    monkeys.set("a", [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]);
    monkeys.set("b", [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4]);
    monkeys.set("c", 16); // use a number to just alloc an array
    
    monkeys.work(16, "c(i) := a(i) * b(i);");
    
    console.log(monkeys.get("c"));
  • Crypto-currency mining:

    monkeys.set("blockhash", [blockhash]);
    monkeys.set("monkeyNonce", monkeyNonce);
    monkeys.set("result", [0]);
    monkeys.work(totalMonkeys, `
      const float attempts = ${attemptsPerMonkey.toFixed(1)};
      float bhash = blockhash(0);
      float startNonce = monkeyNonce(i);
      float mined = 0.0;
      for (float nonce = startNonce; nonce < startNonce+attempts; ++nonce){
        // Yes, this hash function is stupid
        float hash = mod(bhash * (nonce+1.0), pow(2.0,31.0) - 1.0);
        if (hash >= 0.0 && hash <= 3000.0)
          mined = nonce;
      };
      result(mined > 0.0 ? 0 : 1) := mined;
    `);
    // Will be set if mined a block
    console.log(monkeys.get("result"));

You can also define libs, write to many indices in a single call, and work with raw Uint32 buffers if you wish to. For more details, please check the examples directory.

The only reliable way to access the GPU on the browser is by using WebGL. Since it wasn't designed for general programming, doing it is very tricky. For one, the only way to upload data is as 2D textures of pixels. Even worse, your shaders (programs) can't write directly to them; you need to, instead, render the result using geometrical primitives. You're, thus, in charge of converting JS numbers (IEEE 754 floats) to pixels, projecting them to/from 2D textures and using proper geometries to render the results on the right places. You must also deal with aliasing/blurring, rounding, and loss of precision. It is a very delicate job with many small details that could go wrong and no satisfactory way of debugging. WebMonkeys takes care of all that for you, abstracting the overcomplication away and making the power of the GPU as easily accessible as possible, with a very simple API based on array reads and writes.

Performance and debugging tips

  • A single monkey can write to multiple places. If you need to fill an array of 100 numbers, you could use 100 monkeys writing to 1 index each, or 10 monkeys writing to 10 indices each. What is faster will depend on your application.

  • While CPU/GPU bandwidth is huge these days, it still takes time to communicate data between them. Whenever possible, reduce your calls to set/get, and keep things internal to the GPU. For example, if you need to move data between two arrays, this: monkeys.work(16, "target(i) := source(i);") - is much faster than this: monkeys.set("target", monkeys.get("source")).

  • The first call to monkeys.work(count, someTask) is slow due to program compilation, but every call after that is fast. That is for two reasons: 1. WebMonkeys caches shaders so that, when you call task with a repeated source code, it just recovers the previously compiled program; 2. JS engines keep strings hashed, which means that retrieval can be done in O(1). In other words, it is perfectly reasonable to call monkeys.work(n, bigSourceCode) inside your animation loop (as long as bigSourceCode doesn't change).

  • Since WebMonkeys stores numbers as WebGL textures, writing/reading to/from arrays has an encode/decode overhead. If your application spends much more time doing arithmetics than writing/reading data, that is acceptable. If not, use raw buffers and do your own packing/unpacking.

  • Remember you can't have setters (foo(i) := v;) in the middle of your program. They must be at the end. If you're having weird WebGL errors, it could be WebMonkeys's fault: its very simple parser sometimes fails to separate the program's body from the list of setters. Usually, just adding an extra line with a commented semicolon (//;) between your program and your setters solves it.

  • Use monkeys.fill("nums", 0) rather than monkeys.work(numsLength, "nums(i) := 0.0;") (and clear, its equivalent for raw Uint32s).

联系我们 contact @ memedata.com