Precompiled headers and why Squid won't be using them (2023)

Original link: https://squidproxy.wordpress.com/2023/10/10/precompiled-headers-and-why-squid-wont-be-using-them/

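## Precompiled headers and Squid: a failed optimisation

Precompiled headers aim to speed up C++ compilation by saving parsed header state, so the same work is not repeated across many source files. GCC, Clang and MSVC all support the feature, each with its own implementation details, including the file format (.gch for GCC, .pch for Clang) and the build procedure.

The Squid project investigated precompiled headers, in particular for "squid.h", which pulls in a large portability layer: 206 header files parsed for each of its more than 800 source files. GCC showed a modest 5% (about 30 seconds) compile-time improvement, but integration was problematic. With GCC, the implementation required changes to the Autotools build system, including modifying files in the include directory. Clang posed a bigger challenge: it needs the `-include-pch` flag *during* compilation and fails if the precompiled header is missing, which made it impossible to control the build order within the existing Autotools framework. In the end the gains did not justify the complexity and modifications required, and the optimisation effort was abandoned. A key improvement for Clang would be to behave like GCC and treat the precompiled header as an optional optimisation rather than a build requirement.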

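## Precompiled headers and Squid's decision

A recent blog post prompted a Hacker News discussion about precompiled headers (PCH) and why the Squid proxy server does not use them. While PCH *can* speed up builds, commenters pointed out potential drawbacks. One user recalled a past project where heavy use of PCH increased memory consumption and obscured dependency chains, making the code harder to understand. Another pointed out inaccuracies in the blog post and strongly advised against using Clang in production because of its perceived limitations, while acknowledging that GCC has similar problems. Solutions were proposed, such as using an overlay include directory to keep the build clean. Others shared positive experiences with CMake, describing a successful PCH setup for a large project (a Godot 4 plugin) with little effort. Overall, the conversation suggests that although PCH offers performance benefits, it needs careful consideration and is not a universally applicable solution, especially as alternatives such as modules and header units evolve.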

Original text

First thing, what are precompiled headers?

Once an entire dependency tree is exploded, a single c++ include file can become huge, and easily span tens if not hundreds of source files; these will need to be parsed for each compilation unit (c++ file), resulting in a large amount of duplicate work.

So compiler writers came up with the clever idea of optionally saving an intermediate state of key headers, to reduce the amount of duplicate work. gcc, clang, and msvc all support some variant of the precompiled headers idea.

How do they work in practice?

Each compiler has its own quirks.

On GCC

A precompiled header has the same name as the header it accelerates with an additional .gch suffix, placed in the same directory as the header file it refers to. It is generated by calling the compiler with exactly the same command line arguments as used to build the code, plus the additional switch -x c++-header. If a precompiled header is present, it will be used automatically.
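As a minimal sketch of the gcc workflow just described, the script below builds a precompiled header for a throwaway header and then compiles a file that includes it. The file names (common.h, main.cpp), the scratch directory and the -std/-O flags are assumptions made for illustration, not Squid's real setup. -Winvalid-pch is added so that gcc warns if it finds the .gch file but cannot use it.

```shell
#!/bin/sh
# Sketch only: common.h and main.cpp are hypothetical stand-ins for squid.h and a Squid source file.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

cat > common.h <<'EOF'
#include <string>
#include <vector>
EOF

cat > main.cpp <<'EOF'
#include "common.h"
int main() {
    std::vector<std::string> v{"a"};
    return v.size() == 1 ? 0 : 1;
}
EOF

gcc_pch_demo=skipped
if command -v g++ >/dev/null 2>&1; then
    gcc_pch_demo=failed
    # Same flags as the normal build, plus -x c++-header; the output sits next to the header.
    g++ -std=c++17 -O2 -x c++-header common.h -o common.h.gch
    # No extra switch here: gcc looks for common.h.gch when it sees #include "common.h".
    g++ -std=c++17 -O2 -Winvalid-pch main.cpp -o main
    ./main && gcc_pch_demo=ok
fi
echo "gcc pch demo: $gcc_pch_demo"
```

Deleting common.h.gch and rerunning the second g++ command should still build, only without the shortcut, which is the property the rest of this post relies on.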

On clang

A precompiled header has the same name as the header it accelerates with an additional .pch suffix. It is generated by calling the compiler with the same arguments as used to build the code, with the additional switches -x c++-header -emit-pch (the latter might be implicit if the former is supplied). To use it, it is not enough that it be present; the compiler switch -include-pch <pch-file-path> must be used.

On top of this, clang’s internal documentation highlights that there can only be one precompiled header, and it must be included at the beginning of the translation unit.
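The clang variant can be sketched the same way. Again the file names, the scratch directory and the -std flag are assumptions for illustration. The last step points -include-pch at a file that does not exist, to show that clang treats this as an error rather than falling back to the plain header.

```shell
#!/bin/sh
# Sketch only: common.h and main.cpp are hypothetical stand-ins for squid.h and a Squid source file.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

cat > common.h <<'EOF'
#ifndef COMMON_H
#define COMMON_H
#include <string>
#include <vector>
#endif
EOF

cat > main.cpp <<'EOF'
#include "common.h"
int main() {
    std::vector<std::string> v{"a"};
    return v.size() == 1 ? 0 : 1;
}
EOF

clang_pch_demo=skipped
missing_pch=skipped
if command -v clang++ >/dev/null 2>&1; then
    clang_pch_demo=failed
    # Step 1: generate the precompiled header.
    clang++ -std=c++17 -x c++-header common.h -o common.h.pch
    # Step 2: the pch is only used because -include-pch names it explicitly.
    clang++ -std=c++17 -include-pch common.h.pch main.cpp -o main
    ./main && clang_pch_demo=ok
    # Step 3: naming a pch that does not exist makes the compile fail.
    if clang++ -std=c++17 -include-pch missing.pch main.cpp -o main2 2>/dev/null; then
        missing_pch=accepted
    else
        missing_pch=rejected
    fi
fi
echo "clang pch demo: $clang_pch_demo, missing pch: $missing_pch"
```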

On MSVC

This is not yet a specific target for squid

Could it work for squid?

Yes, in theory. Our coding guidelines mandate that each c++ file start by including “squid.h”, which in turn includes our whole portability abstraction layer, which in turn pulls in several system files. On my Ubuntu Linux system, a total of 206 header files have to be read and parsed just for this purpose for each of the over 800 files that make up squid. Sounds promising!
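The post does not say how the 206 figure was obtained. One way to get a comparable number, sketched here as an assumption rather than the author's method, is to ask gcc for the dependency list of a file and count the absolute paths in it. The probe file and the choice of <vector> are placeholders; for Squid one would include “squid.h” and pass the project's real include flags.

```shell
#!/bin/sh
# Sketch only: probe.cpp is a hypothetical file; swap <vector> for the header of interest.
set -eu
workdir=$(mktemp -d)
cd "$workdir"
printf '#include <vector>\n' > probe.cpp

header_count=skipped
if command -v g++ >/dev/null 2>&1; then
    # -M prints a make-style rule listing every file read while preprocessing probe.cpp,
    # system headers included. Split it into one path per line and count absolute paths.
    header_count=$(g++ -std=c++17 -M probe.cpp | tr -s ' \\' '\n\n' | grep -c '^/' || true)
fi
echo "headers read for probe.cpp: $header_count"
```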

Does it work for squid?

In short, unfortunately not. I have experimented with a feature branch, and the results are not what I was hoping for, on several fronts.

The good: performance gains

I ran some checks on a NUC6i7KYB (Intel(R) Core(TM) i7-6770HQ CPU @ 2.60GHz, with 16 GiB of RAM and an SSD). The test command was

git clean -fdx && ./bootstrap.sh && ./configure && time (make -s -j12 all && make -s -j12 check)

Over 6 attempts, wall clock time averaged 10 minutes and 18 seconds without precompiled headers, and 9 minutes and 48 seconds with them: an improvement of roughly 30 seconds (about 5%) in compile time with gcc. Good, but not earth-shaking.
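For reference, the arithmetic behind those figures can be checked with a few lines of shell. The variable names are mine; the two timings are the averages quoted above.

```shell
#!/bin/sh
set -eu
without_pch=$((10 * 60 + 18))       # 10m18s -> 618 seconds
with_pch=$((9 * 60 + 48))           # 9m48s  -> 588 seconds
saved=$((without_pch - with_pch))   # 30 seconds
# Shell arithmetic is integer-only, so use awk for the percentage.
percent=$(LC_ALL=C awk -v s="$saved" -v t="$without_pch" 'BEGIN { printf "%.1f", 100 * s / t }')
echo "saved ${saved}s of ${without_pch}s (${percent}%)"
# prints: saved 30s of 618s (4.9%)
```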

The bad: poor integration with the autotools toolchain (gcc edition)

Autotools’ stance on precompiled headers is pretty clear:

This is how I’ve done it. It’s hacky, and some parts of it may not apply to other projects’ setup.

In configure.ac, define a user argument --enable-precompiled-headers, and react to it with an automake conditional ENABLE_PCH_GCC.

In src/Makefile.am, define a custom Makefile rule that builds the precompiled header (the recipe line must start with a tab):

$(top_srcdir)/include/squid.h.gch: $(top_srcdir)/include/squid.h
	$(CXXCOMPILE) -x c++-header -o $@ $<

What’s wrong with this:

- src/Makefile.am is touching files in include/. This is necessary because include/ doesn’t have a Makefile.am of its own, and the top level Makefile.am doesn’t have access to the CXXCOMPILE variable.
- srcdir shouldn’t be mucked about at build time; that’s what builddir is for.

Then, add a section

if ENABLE_PCH_GCC
PCH_FILE=$(top_srcdir)/include/squid.h.gch
endif

which is then referenced in

BUILT_SOURCES = \
   dnl ... \
   $(PCH_FILE)

This will pull in the precompiled header in the list of dependencies of squid, unit tests, and files to clean up. We can’t really control the order this gets built in, but it isn’t a big deal: if we need to compile anything before the precompiled header is built, everything will still work, just without the speed boost.

The worse: poor integration with the autotools toolchain (clang edition)

Clang has one extra problem compared to gcc: to actually use the precompiled header, it needs the -include-pch <file> option. If the option is used, the file needs to be there, or the build will fail.

This makes being unable to control the build order a showstopper. We would need to build the precompiled header file without that flag before we do anything else. But the generated src/Makefile contains:

all: $(BUILT_SOURCES)
	$(MAKE) $(AM_MAKEFLAGS) all-recursive

One way to make sure that we only add the -include-pch option once the precompiled header has been built would be to pass it down the recursive make invocation, except we don’t control that.

That’s it, I give up

The benefit is just not worth the number of hacks and complexity.

What could make it work?

gcc gets this behaviour right; it would be great if clang took inspiration from it. At the very least, it should not fail the build if the included pch is missing. This would enable treating it for what it is: an optimisation.

This entry was posted on October 10, 2023 at 07:52.