The ISA Doesn't Matter Where It Counts

Original link: https://www.chipstrat.com/p/the-isa-doesnt-matter-where-it-counts

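In the race for dominance in AI infrastructure, the debate over whether x86 or Arm is the better architecture is largely overblown. Although x86 has historically dominated the server market, the choice of instruction set architecture (ISA) is becoming increasingly irrelevant in the most critical sockets around the GPU.

In the high-value "coherent host" socket, the main moat is not the ISA but the high-bandwidth proprietary interconnect (such as NVLink or Infinity Fabric) that lets the CPU and GPU share a memory space. Nvidia's upcoming NVLink Fusion program reinforces the point: it aims to open this socket to multiple architectures, including x86, Arm, and eventually RISC-V.

Likewise, in the "standard host" socket that feeds data to the GPU, hyperscalers are already migrating to custom Arm silicon without performance problems. The ISA remains an important factor only in traditional enterprise environments, where the host CPU must do double duty, handling both GPU-side tasks and traditional application-tier workloads. Ultimately, the ISA is increasingly a secondary detail determined by the chosen accelerator platform rather than an independent strategic advantage.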

Hacker News discussion: "The ISA Doesn't Matter Where It Counts" (chipstrat.com), 3 points, submitted by ksec, 2 comments.

imtringued: This article is very disappointing; nobody should pay to read the rest of it.

ahartmetz: I'm not sure whether the paid part is any better, but the free part states the obvious. Of course the ISA doesn't matter when you run custom software on servers.

Original article

AMD, Intel, Nvidia, Arm, and Qualcomm are all selling datacenter CPUs into the AI buildout. The previous piece mapped them across five sockets orbiting the GPU and ranked those sockets by value: coherent host, standard host, thinker, doer, traditional cloud.

The coherent host is the most valuable. The traditional cloud CPU is the least.

Many readers asked if it matters whether the CPU is x86 or Arm.

Honestly, not as much as it’s made out to be. But let’s go socket by socket.

The ISA is the language a CPU speaks. Software gets compiled into that language, and a chip can only run code written for its dialect.
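As a small illustration of the "dialect" idea, here is a minimal Python sketch of how host software might check which ISA family it is running on. `platform.machine()` is a standard-library call; the `ISA_FAMILIES` table and the `isa_family` helper are hypothetical names introduced here, and the list of machine strings is an assumption that is far from exhaustive.

```python
import platform

# Hypothetical mapping from common platform.machine() strings to ISA families.
# Assumption: this list is illustrative only; real systems report more variants.
ISA_FAMILIES = {
    "x86_64": "x86",
    "amd64": "x86",
    "i386": "x86",
    "i686": "x86",
    "aarch64": "arm",
    "arm64": "arm",
    "riscv64": "riscv",
}


def isa_family(machine=None):
    """Return a coarse ISA family name for a machine string.

    Falls back to the current interpreter's platform.machine() when no
    string is given, and to "unknown" for anything not in the table.
    """
    name = (machine or platform.machine()).lower()
    return ISA_FAMILIES.get(name, "unknown")


print(isa_family())
```

Most high-level host code never needs a check like this. It tends to show up only where prebuilt binaries or hand-tuned kernels are selected per architecture.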

x86 has been the server default for decades. Yet Arm has been gaining in servers, first slowly, then quickly as Graviton, Axion, and Cobalt took hold in cloud, and now inside AI infrastructure as hyperscalers build Arm into their GPU server stacks.

Naturally, everyone asks which ISA is “better” for agentic AI; they’re both just fine.

The more interesting question at each socket is whether the software running there cares which ISA it runs on. Specifically, is the ISA a “moat” at any of the agentic sockets? Let’s see:

The coherent host’s moat is the coherent link to the GPU, not its ISA.

NVLink-C2C connects Nvidia’s Grace CPU to the Blackwell GPU at 900 GB/s, providing a shared address space in which the GPU reads CPU DRAM as if it were local. Vera doubles that to 1.8 TB/s with Rubin. Infinity Fabric ties AMD’s EPYC to the Instinct MI455X at comparable bandwidth. The coherent link is what makes this socket valuable. It’s what no other CPU can replicate without a bilateral design agreement with the GPU vendor... like NVLink Fusion...
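To put those bandwidth figures in perspective, here is a rough back-of-envelope sketch in Python. The 900 GB/s and 1.8 TB/s numbers come from the text above. The PCIe figure is an assumption added for comparison (roughly what a PCIe 5.0 x16 link offers counting both directions), and the 100 GB payload is hypothetical. The result is idealized peak-rate arithmetic, not a benchmark.

```python
# Link bandwidths in GB/s.
# The NVLink-C2C figures are taken from the article; the PCIe figure is an
# assumption (approximately PCIe 5.0 x16, summed over both directions).
LINKS_GBPS = {
    "PCIe 5.0 x16 (assumed ~128 GB/s)": 128,
    "NVLink-C2C with Grace (900 GB/s)": 900,
    "NVLink-C2C with Vera (1.8 TB/s)": 1800,
}


def transfer_ms(size_gb, bandwidth_gbps):
    """Idealized time in milliseconds to move size_gb at peak bandwidth."""
    return size_gb / bandwidth_gbps * 1000


PAYLOAD_GB = 100  # hypothetical payload size, chosen only for illustration

for name, bandwidth in LINKS_GBPS.items():
    print(f"{name}: {transfer_ms(PAYLOAD_GB, bandwidth):.1f} ms")
```

Real transfers will be slower than these ideal figures, but the ratio between the links is the point: the coherent link is several times faster than the assumed PCIe path, regardless of which ISA sits on the CPU side.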

Before Grace, Nvidia GPU servers shipped with standard x86 hosts (Intel Xeon or AMD EPYC) connected over PCIe. Grace Hopper (2023) was Nvidia’s first coherent superchip: Grace CPU (Arm, Neoverse V2) connected to the Hopper GPU via NVLink-C2C at 900 GB/s — and Nvidia’s first deployment of the full datacenter CUDA stack on an Arm server CPU. CUDA already ran on Arm through the Jetson embedded line, but this was the server-grade debut.

Grace Blackwell carried that forward; Vera Rubin extends it with a custom Arm CPU (88 Nvidia-designed cores) at 1.8 TB/s to Rubin.

So clearly, ISA isn’t a differentiator for the 800-lb gorilla. Host software runs on either.

What about AMD? ROCm is effectively x86-native. AMD’s coherent platform is built around EPYC, so an Arm port has naturally never been a priority.

The main takeaway is that the ISA is baked into the accelerator platform choice.

NVLink Fusion is Nvidia’s move to open the coherent-host socket to third-party CPUs. Previously, the only CPU that could claim a coherent seat on Nvidia’s backend was the one Nvidia built (Grace/Vera). NVLink Fusion allows other vendors to couple their processors to Blackwell GPUs over the same high-bandwidth coherent link Grace uses. Note that no NVLink Fusion product has actually shipped yet; these are simply announced partnerships. But the partner list includes Qualcomm (Arm), Fujitsu, Intel (x86), and SiFive (RISC-V).

If and when these ship, the coherent-host socket will be accessible to any ISA, so the moat is most definitely not the ISA. That includes even RISC-V, although it would require a lot of software porting.

The standard host’s job is to keep the GPU fed: tokenize inputs, batch requests, stage data over PCIe, manage memory. The CPU needs to work as fast as possible and also move a lot of data. PCIe can become a bottleneck here… hence the coherent host.
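The first two of those jobs can be sketched in a few lines of Python. This is a toy under stated assumptions, not how any real serving stack works: the `tokenize` and `batch` helpers are hypothetical, the "tokenizer" just hashes words into integer IDs, and staging over PCIe and memory management are left out entirely. What it shows is that nothing in this kind of host logic refers to the ISA.

```python
from typing import Iterator, List


def tokenize(text: str) -> List[int]:
    """Toy tokenizer: map each whitespace-separated word to an integer ID."""
    return [sum(word.encode("utf-8")) % 50_000 for word in text.split()]


def batch(
    requests: List[List[int]], batch_size: int, pad_id: int = 0
) -> Iterator[List[List[int]]]:
    """Group tokenized requests into batches, padding rows to equal length."""
    for start in range(0, len(requests), batch_size):
        chunk = requests[start:start + batch_size]
        width = max(len(row) for row in chunk)
        yield [row + [pad_id] * (width - len(row)) for row in chunk]


# Hypothetical incoming requests.
prompts = ["hello world", "the ISA does not matter here", "feed the GPU"]
batches = list(batch([tokenize(p) for p in prompts], batch_size=2))

for index, rows in enumerate(batches):
    print(f"batch {index}: {len(rows)} requests, width {len(rows[0])}")
```

The same source runs unchanged on an x86 or an Arm host. Any ISA dependence would come from compiled libraries underneath, not from the feeding logic itself.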

The hyperscalers started with x86 standard hosts paired with their XPUs, but that has moved toward Arm. AWS pairs Graviton with Trainium. Google pairs Axion with its gen 8 TPUs.

The feed-the-XPU stack runs on x86 or Arm interchangeably; ISA is not the moat.

Note that there is still an x86 standard-host business in smaller deployments, specifically enterprises and small neoclouds running DGX, Instinct MI355X, RTX Pro 6000 servers, and so on.

In these setups, the host often runs double duty with GPU feeding and application-tier workloads on the same box. That brings legacy x86 software dependencies back into the picture, and ISA does matter. This segment is lower volume, but it will grow.

Takeaway: if the host is doing double duty as application processor, then ISA matters. Otherwise, nope.

The two orbits closest to the GPU give the same answer: ISA does not matter there. The three that remain do not all agree. One has a real x86 lock-in story. One has a wrinkle. One… not so much.
