AWS engineer reports PostgreSQL performance halved by Linux 7.0, fix may not be easy

Original link: https://www.phoronix.com/news/Linux-7.0-AWS-PostgreSQL-Drop

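A performance regression in the upcoming Linux 7.0 kernel cuts PostgreSQL database throughput roughly in half, notably on Graviton4 servers. The problem stems from a change restricting the kernel's preemption modes, which increases the time spent in a user-space spinlock. A patch has been proposed to revert the change, but it is unlikely to be accepted. Kernel developer Peter Zijlstra suggests the solution is for PostgreSQL to adopt the Restartable Sequences (RSEQ) time slice extension, which is also introduced in Linux 7.0. That means PostgreSQL may need an update to regain the performance seen on earlier kernels. If the issue is left unresolved, Linux 7.0, expected in about two weeks, and Ubuntu 26.04 LTS, which will ship with it, could be released with noticeably lower PostgreSQL performance until the database server is updated.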

Original article
An Amazon/AWS engineer raised the alarm on Friday: on the current Linux 7.0 development kernel, throughput for the PostgreSQL database server is around half that of prior kernel versions. The culprit is known, but a revert looks like it may not happen, and the current suggestion is that PostgreSQL may need to be adapted instead.

Salvatore Dipietro of Amazon/AWS reported a throughput and latency regression for PostgreSQL. They found Linux 7.0 in its near-final form delivering around 0.51x the throughput of prior kernels on a Graviton4 server, because much more time is now being spent in a user-space spinlock.
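To put the 0.51x figure in perspective, here is a minimal arithmetic sketch. The throughput ratio comes from the report; the latency factor is only an assumption that holds for a closed-loop benchmark with a fixed number of clients (where average latency is roughly clients divided by throughput), and it is not a number from the report. The function name `regression_summary` is hypothetical.

```python
def regression_summary(throughput_ratio: float) -> dict:
    """Summarize a throughput ratio (new kernel / old kernel).

    Assumption: a closed-loop benchmark with a fixed client count, so
    average latency scales as the inverse of throughput.
    """
    if throughput_ratio <= 0:
        raise ValueError("throughput_ratio must be positive")
    return {
        # Percentage of throughput lost relative to the old kernel.
        "throughput_drop_pct": round((1.0 - throughput_ratio) * 100.0, 1),
        # Implied multiplier on average latency under the assumption above.
        "latency_factor": round(1.0 / throughput_ratio, 2),
    }


summary = regression_summary(0.51)
print(summary)  # {'throughput_drop_pct': 49.0, 'latency_factor': 1.96}
```

Under that assumption, a 0.51x throughput ratio corresponds to a 49% drop and roughly double the average per-transaction latency.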

Bisection traced the regression back to the Linux 7.0 change restricting the available preemption modes for the kernel. That change was previously covered on Phoronix in "Linux 7.0 To Focus Just On Full & Lazy Preemption Models For Up-To-Date CPU Archs" and was upstreamed with the Linux 7.0 scheduler updates.

As a result, a patch was posted to the Linux kernel mailing list yesterday to restore PREEMPT_NONE as the default, given the severity of the reported regression.
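For readers who want to see which preemption options a given kernel was built with, the following is a rough sketch that scans kernel configuration text for the upstream `CONFIG_PREEMPT*` Kconfig symbols. The helper name `enabled_preempt_options` and the sample text are hypothetical, and where a distribution stores its kernel config (often `/boot/config-<release>`) is an assumption that varies by system. On kernels built with `CONFIG_PREEMPT_DYNAMIC`, the build-time default may also be overridden at boot, so this shows only what was compiled in.

```python
import re

# Upstream Kconfig symbols that select or describe the preemption model.
PREEMPT_OPTIONS = (
    "CONFIG_PREEMPT_NONE",
    "CONFIG_PREEMPT_VOLUNTARY",
    "CONFIG_PREEMPT",
    "CONFIG_PREEMPT_LAZY",
    "CONFIG_PREEMPT_RT",
    "CONFIG_PREEMPT_DYNAMIC",
)


def enabled_preempt_options(config_text: str) -> list:
    """Return the preemption-related options set to 'y' in kernel config text."""
    enabled = []
    for line in config_text.splitlines():
        # Match whole lines only, so helper symbols such as
        # CONFIG_PREEMPT_BUILD are not mistaken for CONFIG_PREEMPT.
        match = re.fullmatch(r"(CONFIG_PREEMPT\w*)=y", line.strip())
        if match and match.group(1) in PREEMPT_OPTIONS:
            enabled.append(match.group(1))
    return enabled


# Hypothetical sample, not taken from a real Linux 7.0 build.
SAMPLE_CONFIG = """\
# CONFIG_PREEMPT_NONE is not set
CONFIG_PREEMPT_BUILD=y
CONFIG_PREEMPT_LAZY=y
CONFIG_PREEMPT_DYNAMIC=y
"""

print(enabled_preempt_options(SAMPLE_CONFIG))
# ['CONFIG_PREEMPT_LAZY', 'CONFIG_PREEMPT_DYNAMIC']
```

On a real system, the same function could be fed the contents of the distribution's kernel config file, if one is shipped.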

[Figure: pgbench regression benchmark]


Although it would fix an active performance regression, this change to restore PREEMPT_NONE as the default preemption model looks like it might not be picked up. Peter Zijlstra, who authored the original code simplifying the preemption modes, has responded that the "fix" is to make PostgreSQL use the Restartable Sequences (RSEQ) time slice extension. That time slice extension support was also upstreamed for Linux 7.0.
"The fix here is to make PostgreSQL make use of rseq slice extension:

https://lkml.kernel.org/r/[email protected]

That should limit the exposure to lock holder preemption (unless PostgreSQL is doing seriously egregious things)."


So if that position stands, with the blame shifted to PostgreSQL, Linux 7.0 stable could bring a significant drop in PostgreSQL performance in some scenarios until that popular database server is updated.

Linux 7.0 stable is due out in about two weeks. It is also the kernel version powering Ubuntu 26.04 LTS, which is to be released later in April.
