Cloudspecs: Cloud Hardware Evolution Through the Looking Glass

原始链接: http://muratbuffalo.blogspot.com/2026/01/cloudspecs-cloud-hardware-evolution.html

## Cloud Hardware Trends (2015-2025): Summary

A recent CIDR'26 paper analyzes cloud hardware performance on AWS and compares it with on-premise systems. The main findings? While network bandwidth per dollar *improved tenfold*, CPU and DRAM performance per dollar grew much more modestly (roughly 2-3x), and, surprisingly, **NVMe storage performance in the cloud has stagnated since 2016.** CPU core counts have surged, but this has not translated into proportional performance gains, with AWS Graviton instances a notable exception. Although DDR5 raised bandwidth, DRAM capacity per dollar has stayed flat, and recent AI demand is *driving prices up*. The NVMe stagnation is especially striking and contrasts sharply with on-premise progress. It is fueling interest in disaggregated storage solutions, since remote storage is becoming more cost-effective than underperforming local NVMe. The study highlights the shift from uniform hardware scaling to **specialization** (networking via Nitro cards and custom silicon via Graviton) as the main driver of performance gains. Making full use of ever-growing core counts is critical, because parallel programming challenges limit scalability. An interactive tool, Cloudspecs, accompanies the study and lets users explore the data directly.

Here is a short summary of the Hacker News discussion: The blog post "Cloudspecs: Cloud Hardware Evolution" sparked a discussion about cloud instance storage, particularly NVMe SSD performance and pricing on AWS. Users noted that the i3 instance family, launched in 2016 as the first NVMe-backed offering, still delivers the best I/O performance per dollar even after 36 new NVMe families were introduced through 2025. The conversation highlighted the distinction between directly attached NVMe (as on i3) and "Nitro NVMe" (m6id and similar), where NVMe is emulated through an embedded card. Many commenters believe AWS prioritizes the profitability of networked storage services like EBS over instance-attached storage, and may be capping SSD speeds or limiting optimization work. Other discussion points included the growing trend of cloud repatriation as on-premise solutions become viable, the often-underestimated cost of maintaining professional-grade on-premise infrastructure, and the value of ephemeral storage for use cases such as caching. Ultimately, demand for high-performance local SSDs appears relatively low compared to broader cloud storage needs.

Original post

This paper (CIDR'26) presents a comprehensive analysis of cloud hardware trends from 2015 to 2025, focusing on AWS and comparing it with other clouds and on-premise hardware.

TL;DR: While network bandwidth per dollar improved by one order of magnitude (10x), CPU and DRAM gains (again in performance per dollar terms) have been much more modest. Most surprisingly, NVMe storage performance in the cloud has stagnated since 2016. Check out the NVMe SSD discussion below for data on this anomaly.

CPU Trends

Multi-core parallelism has skyrocketed in the cloud. Maximum core counts have increased by an order of magnitude over the last decade. The largest AWS instance u7in now boasts 448 cores. However, simply adding cores hasn't translated linearly into value. To measure real evolution, the authors normalized benchmarks (SPECint, TPC-H, TPC-C) by instance cost. SPECint benchmarking shows that cost-performance improved roughly 3x over ten years. A huge chunk of that gain comes from AWS Graviton. Without Graviton, the gain drops to roughly 2x. For in-memory database benchmarks, gains were even lower (2x–2.5x), likely due to memory and cache latency bottlenecks.
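As a minimal sketch of what this cost normalization looks like, the snippet below divides a benchmark score by the instance's hourly price; the instance names, prices, and scores are illustrative placeholders, not values from the paper.

```python
# Sketch of benchmark-per-dollar normalization. All prices and scores below are
# illustrative placeholders, not figures from the paper or AWS pricing pages.

instances = [
    # (instance, on-demand $/hour, benchmark score)
    ("x86 instance, 2015-era", 3.20, 1000),
    ("Graviton instance, 2024-era", 2.60, 2400),
]

for name, usd_per_hour, score in instances:
    score_per_dollar_hour = score / usd_per_hour  # higher is better
    print(f"{name:28s} {score_per_dollar_hour:7.1f} score per dollar-hour")
```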

On-prem hardware comparison shows that this stagnation is not cloud price gouging. Historically, Moore's Law and Dennard scaling doubled cost-performance every two years (which would have compounded to a 32x gain over a decade). However, an analysis of on-premise AMD server CPUs reveals a similar slump: only a 1.7x gain from 2017 to 2025.
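For reference, the 32x figure is simple compounding, and the observed 1.7x gain over 2017-2025 implies a much slower effective doubling period:

$$
2^{10/2} = 2^{5} = 32\times, \qquad\text{vs.}\qquad 1.7\times \text{ over 8 years} \;\Rightarrow\; \text{doubling every } \frac{8}{\log_2 1.7} \approx 10.5 \text{ years}.
$$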

Memory Trends

DRAM capacity per dollar has effectively flatlined. The only significant improvement was the 2016 introduction of memory-optimized x instances, which offered ~3.3x more GiB-hours/$ than compute-optimized peers. While absolute single-socket bandwidth jumped ~5x (93 GiB/s to 492 GiB/s) as servers moved from DDR3 to DDR5, the cost-normalized gain is only 2x.


Historical data suggests commodity DRAM prices dropped 3x over the decade. But in the last three months, due to AI-driven demand, DDR5 prices rose sharply, further limiting effective memory gains.

Network Trends

We have good news here, finally. Network bandwidth per dollar exploded by 10x. And absolute speeds went from 10 Gbit/s to 600 Gbit/s (60x).


These gains were not universal though. Generic instances saw little change. The gains were driven by network-optimized n instances (starting with the c5n in 2018) powered by proprietary Nitro cards.

NVMe Trends

NVMe SSDs are the biggest surprise. Unlike CPUs and memory, where cloud trends mirror on-prem hardware, NVMe performance in AWS has largely stagnated. The first NVMe-backed instance family, i3, appeared in 2016. As of 2025, AWS offers 36 NVMe instance families. Yet the i3 still delivers the best I/O performance per dollar by nearly 2x. 


SSD capacity has stagnated since 2019 and I/O throughput since 2016. This sharply contrasts with on-prem hardware, where SSD performance doubled twice (PCIe 4 and PCIe 5) in the same timeframe. The gap between cloud and on-premise NVMe is widening rapidly.

This price/performance gap likely explains the accelerating push toward disaggregated storage. When local NVMe is expensive and underperforming, remote storage starts to look attractive. The paper speculates that with network speeds exploding and NVMe stagnating, architectures may shift further. For systems like Snowflake, using local NVMe for caching might no longer be worth the complexity compared to reading directly from S3 with fast networks.
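As a back-of-envelope sketch of that trade-off, a simple scan-time model already shows how fast networks erode the case for a local NVMe cache. Only the 600 Gbit/s network figure comes from the post; the NVMe and S3 throughput values are assumed placeholders.

```python
# Back-of-envelope scan-time model: local NVMe cache vs. remote object storage.
# Only the 600 Gbit/s network figure comes from the post; the NVMe and S3
# throughput values below are assumed placeholders for illustration.

scan_gib = 100.0                  # size of a hypothetical table scan, in GiB

local_nvme_gibps = 2.0            # assumed per-volume local NVMe throughput
network_ceiling_gibps = 600 / 8   # 600 Gbit/s instance networking = 75 GiB/s
s3_effective_gibps = 10.0         # assumed achievable S3 throughput with parallel GETs

print(f"local NVMe cache scan: {scan_gib / local_nvme_gibps:6.1f} s")
print(f"remote S3 scan:        {scan_gib / s3_effective_gibps:6.1f} s")
print(f"network ceiling:       {scan_gib / network_ceiling_gibps:6.1f} s")
```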

Discussion

I think the main takeaway is that uniform hardware scaling in the cloud is over. Moore's Law no longer lifts all boats. Performance gains now come from specialization: networking (Nitro), custom silicon (Graviton), and accelerators.

In my HPTS 2024 review, I noted that contrary to the deafening AI hype, the real excitement in the hallways was about hardware/software codesign. This paper validates that sentiment. With general-purpose CPU and memory cost-performance stagnating, future databases must be tightly integrated with specialized hardware and software capabilities to provide value. I think the findings here will refuel that trend.

A key open question is why massive core counts deliver so little value. Where is the performance lost? Possible explanations include memory bandwidth limits, poor core-to-memory balance, or configuration mismatches. But I think the most likely culprit is software. Parallel programming remains hard, synchronization is expensive, and many systems fail to scale beyond a modest number of cores. We may be leaving significant performance on the table simply because our software cannot effectively utilize the massive parallelism now available.
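A rough way to see how quickly serial sections and synchronization erase core-count gains is Amdahl's Law: with an (arbitrarily chosen) 5% serial fraction, even 448 cores top out below a 20x speedup.

$$
S(n) = \frac{1}{s + \frac{1-s}{n}}, \qquad S(448)\Big|_{s=0.05} = \frac{1}{0.05 + \frac{0.95}{448}} \approx 19.2.
$$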

The paper comes with an interactive tool, Cloudspecs, built on DuckDB-WASM (yay!). This allows you to run SQL queries over the dataset directly in the browser to visualize these trends. The figures in the PDF actually contain clickable link symbols that take you to the specific query used to generate that chart. Awesome reproducibility!
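To give a flavor of the kind of query the tool supports, here is a sketch using the duckdb Python package against a hypothetical CSV export of the dataset; the file name and column names are assumptions, since the actual Cloudspecs schema may differ.

```python
# Sketch: the kind of cost-normalized trend query Cloudspecs enables, run locally
# with the duckdb Python package instead of DuckDB-WASM in the browser.
# 'cloudspecs_instances.csv' and its column names are hypothetical.
import duckdb

query = """
    SELECT release_year,
           max(network_gbit_s / price_usd_hour) AS best_gbit_per_dollar_hour
    FROM read_csv_auto('cloudspecs_instances.csv')
    GROUP BY release_year
    ORDER BY release_year
"""
print(duckdb.sql(query))
```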

Aleksey and I did a live-reading of the paper. As usual, we had a lot to argue about. I'll add a recording of our discussion on YouTube when it becomes available, and here is a link to my annotated paper.
