(comments)

Original link: https://news.ycombinator.com/item?id=43999492

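A Hacker News discussion centers on the quirks of C++ initialization, particularly the undefined behavior (UB) that uninitialized variables can cause. Many commenters consider C++'s default-initialization behavior problematic: it can lead to hard-to-debug bugs, especially for newcomers. Some propose fixes, such as zero-initializing by default or requiring an explicit keyword to mark intentional non-initialization, prioritizing safety over small performance gains. Backward compatibility is raised as a concern, but some argue that old code with UB was arguably already broken. Others resist the idea of "training wheels" and prefer that the programmer keep the final say, even at the risk of UB; they also suggest explicit initialization as the remedy. The discussion touches on the tension between optimization and safety, the complexity of C++'s initialization rules, and the language's historical focus on the low level. Some point out that other languages, such as Rust, handle initialization more safely by default. C++'s role in performance-critical environments (embedded systems) is also mentioned.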


  • Original text
    Initialization in C++ is bonkers (2017) (tartanllama.xyz)
    85 points by todsacerdoti 4 hours ago | 62 comments

    Heh, low comments on C++ posts now. A sign of the times. My two cents anyway.

    I've been using C++ for a decade. Of all the warts, they all pale in comparison to the default initialization behavior. After seeing thousands of bugs, the worst have essentially been caused by cascading surprises from initialization UB from newbies. The easiest, simplest fix is simply to default initialize with a value. That's what everyone expects anyway. Use Python mentality here. Make UB initialization an EXPLICIT choice with a keyword. If you want garbage in your variable and you think that's okay for a tiny performance improvement, then you should have to say it with a keyword. Don't just leave it up to some tiny invisible visual detail no one looks at when they skim code (the missing parens). It really is that easy for the language designers. When thinking about backward compatibility... keep in mind that the old code was arguably already broken. There's not a good reason to keep letting it compile. Add a flag for --unsafe-initialization-i-cause-trouble if you really want to keep it.

    C++, I still love you. We're still friends.
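
    To make the "missing parens" point concrete, here is a minimal sketch of default-initialization versus value-initialization. `Point` and the function names are made up for illustration. Only objects that have been given a value are ever read, because reading the others would be UB.

```cpp
#include <cassert>

// Hypothetical aggregate used only for this illustration.
struct Point {
    int x;
    int y;
};

// Value-initialization: the empty braces zero every member.
Point make_value_initialized() {
    Point p{};
    return p;
}

// Default-initialization: no braces, so x and y are indeterminate until
// assigned. Both members are written here before anything reads them.
Point make_default_then_assign(int x, int y) {
    Point p;  // members are indeterminate at this point
    p.x = x;
    p.y = y;
    return p;
}

// The same split exists for new-expressions: `new int` leaves the value
// indeterminate, while `new int()` zero-initializes it.
int heap_value_initialized() {
    int* q = new int();  // the parentheses request zero-initialization
    const int v = *q;
    delete q;
    return v;
}

void demo_point_init() {
    assert(make_value_initialized().x == 0);
    assert(make_default_then_assign(3, 4).y == 4);
    assert(heap_value_initialized() == 0);
}
```

    The only visible difference between the safe and unsafe forms is a pair of braces or parentheses, which is the detail the comment says nobody notices when skimming.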



    > When thinking about backward compatibility... keep in mind that the old code was arguably already broken. There's not a good reason to keep letting it compile.

    Oh how I wish the C++ committee and compiler authors would adopt this way of thinking... Sadly we're dealing with an ecosystem where you have to curate your compiler options and also use clang-tidy to avoid even the simplest mistakes :/

    Like, it's insane to me that -Wconversion is not the default behavior.
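
    For context, this is the kind of silent value change that -Wconversion flags. The sketch below also shows one way to check a conversion at run time. `checked_narrow` is a hypothetical helper, loosely modeled on the `narrow` idea from the C++ Core Guidelines support library, and is not a standard function.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical helper (not part of the standard library): converts and
// throws if the converted value no longer equals the original.
template <class To, class From>
To checked_narrow(From value) {
    const To converted = static_cast<To>(value);
    // The round-trip check catches truncation; the sign check catches
    // negative values that wrap around to large unsigned ones.
    const bool round_trips = static_cast<From>(converted) == value;
    const bool same_sign = (converted < To{}) == (value < From{});
    if (!round_trips || !same_sign) {
        throw std::range_error("narrowing conversion changed the value");
    }
    return converted;
}

void demo_checked_narrow() {
    // The implicit conversion compiles without complaint by default.
    int big = 300;
    unsigned char truncated = big;  // keeps only the low 8 bits
    assert(truncated == 44);        // assuming an 8-bit unsigned char

    assert(checked_narrow<unsigned char>(200) == 200);

    bool threw = false;
    try {
        checked_narrow<unsigned char>(big);
    } catch (const std::range_error&) {
        threw = true;
    }
    assert(threw);
}
```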



    Compilers should add this as a non-standard extension, right? -ftrivial-auto-var-init=zero is a partial solution to a related problem, but it seems like they could just... not have UB here. It can't be that helpful for optimization.


    Yes but it’s not portable. If zero initialization were the default and you had to opt-in with [[uninitialized]] for each declaration it’d be a lot safer. Unfortunately I don’t think that will happen any time soon.


    I don't really care if it isn't portable. I only have to work with Clang, personally.

    > If zero initialization were the default and you had to opt-in with [[uninitialized]] for each declaration it’d be a lot safer.

    I support that, too. Just seems harder than getting a flag into Clang or GCC.



    Portability is always for the other guy’s sake, not your own. That’s why so many people don’t care about it.


    Not to worry, there is a 278 page book about initialization in C++!

    https://leanpub.com/cppinitbook

    (I don't know whether it's good or not, I just find it fascinating that it exists)



    Wow! Exhibit 1 for the prosecution.


    Well, authors are incentivized into writing long books. Having said that it obviously doesn't take away the fact that C++ init is indeed bonkers.


    What would be the incentive for making this a long book? Couldn't be money.


    It is actually. It's been shown that longer books make more sales as they are considered more trustworthy, so authors are incentivized to artificially drag them longer than they actually require


    I imagine if I'd managed to actually memorize all of C++'s initialization rules, I'd probably have to write a book too just to get it all out, or I'd lose my sanity.




    This is a specialization of the general statement that C++ is bonkers.


    Small discussion at the time (42 points, 6 comments) https://news.ycombinator.com/item?id=14532478

    Related: Initialization in C++ is Seriously Bonkers (166 points, 2019, 126 points) https://news.ycombinator.com/item?id=18832311



    Aside, but the author of this blog is the author of https://nostarch.com/building-a-debugger

    A wonderful exploration of an underexplored topic--I've pre-ordered the hard copy and have been following along with the e-book in the interim.



    This idea that everything must be initialized (i.e. no undefined or non-deterministic behavior) should never be forced upon a language like C++ which rightly assumes the programmer should have the final say. I don't want training wheels put on C++ -- I want C++ to do exactly and only what the programmer specifies and no more. If the programmer wants to have uninitialized memory -- that is her business.


    It's so ironic hearing a comment like this. If what you really want is for C++ to do only what you strictly specified, then you'd always release your software with all optimizations disabled.

    But I'm going to go out on a limb here and guess you don't do that. You actually do allow the C++ compiler to make assumptions that are not explicitly in your code, like reorder instructions, hoist invariants, eliminate redundant loads and stores, vectorize loops, inline functions, etc...

    All of these things I listed are based on the compiler not doing strictly what you specified but rather reinterpreting the source code in service of speed... but when it comes to the compiler reinterpreting the source code in service of safety.... oh no... that's not allowed, those are training wheels that real programmers don't want...

    Here's the deal... if you want uninitialized variables, then explicitly have a way to declare a variable to be uninitialized, like:

        int x = void;
    
    This way for the very very rare cases where it makes a performance difference, you can explicitly specify that you want this behavior... and for the overwhelming majority of cases where it makes no performance impact, we get the safe and well specified behavior.
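
    `int x = void;` is not valid C++, but the opt-out idea can be approximated in a library today. A rough sketch, assuming a made-up tag `no_init` and a made-up wrapper `safe_scalar` (neither exists in the standard library):

```cpp
#include <cassert>

// Hypothetical tag type: spelling it out is the explicit opt-out.
struct no_init_t {
    explicit no_init_t() = default;
};
constexpr no_init_t no_init{};

// Hypothetical wrapper: zero by default, indeterminate only on request.
template <class T>
class safe_scalar {
public:
    safe_scalar() : value_{} {}         // default: value-initialized (zero)
    explicit safe_scalar(no_init_t) {}  // opt-out: value_ left indeterminate
    safe_scalar(T v) : value_(v) {}

    safe_scalar& operator=(T v) {
        value_ = v;
        return *this;
    }
    T get() const { return value_; }

private:
    T value_;
};

void demo_safe_scalar() {
    safe_scalar<int> a;  // no initializer, still reads as 0
    assert(a.get() == 0);

    safe_scalar<int> b(no_init);  // deliberately uninitialized
    b = 7;                        // must be written before it is read
    assert(b.get() == 7);
}
```

    The design choice is the one argued for above: the safe behavior needs no syntax, and the risky one has to be named at the declaration.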


    How about int x = 0 if you want 0. Just `int x;` doesn't make it clear that you want 0.


    That's the inherent tension, though, isn't it?

    A programmer wants the compiler to accept code that looks like a stupid mistake when he knows it's not.

    But he also wants to have the compiler make sure he isn't making stupid mistakes by accident.

    How can it do both? They're at odds.



    The discussion about what should be the default behavior and of what should be the opt-in behavior is very different from what should be possible. It is definitely clear that in C++, it must be possible to not initialize variables.

    Would it really be that unreasonable to have initialisation be opt-out instead of opt-in? You'd still have just as much control, but it would be harder to shoot yourself in the foot by mistake. Instead it would be slightly easier to get programs that can be optimised.



    C++ is supposed to be an extension of C, so I wouldn't expect things to be initialized by default, even though personally I'm using C++ for things where it'd be nice.

    I'm more annoyed that C++ has some way to default-zero-init but it's so confusing that you can accidentally do it wrong. There should be only one very clear way to do this, like you have to put "= 0" if you want an int member to init to 0. If you're still concerned about safety, enable warnings for uninitialized members.



    As someone who has to work in C++ day in and day out: please, give me the fucking training wheels. I don't want UB if I declare an object `A a;` instead of `A a{};`. At least make it a compiler error I can enable!


    Ideally, there would be a keyword for it. So ‘A a;’ would not compile. You’d need to do ‘A a{};’ or something like ‘noinit A a;’ to tell the compiler you’re sure you know what you are doing!


    By that logic, you'd have to dislike the situations where C++ does already initialize variables to defined values, like `int i;`, because they're removing your control and forcing training wheels upon you.

    So, do you?



        int i;
    
    does not initialize the value.


    It's a gotcha to be sure. Sometimes it does, sometimes it doesn't. From a reference[0]:

      #include <string>
      
      struct T1 { int mem; };
      
      struct T2
      {
          int mem;
          T2() {} // “mem” is not in the initializer list
      };
      
      int n; // static non-class, a two-phase initialization is done:
      // 1) zero-initialization initializes n to zero
      // 2) default-initialization does nothing, leaving n being zero
      
      int main()
      {
          [[maybe_unused]]
          int n;            // non-class, the value is indeterminate
          std::string s;    // class, calls default constructor, the value is ""
          std::string a[2]; // array, default-initializes the elements, the value is {"", ""}
          //  int& r;           // Error: a reference
          //  const int n;      // Error: a const non-class
          //  const T1 t1;      // Error: const class with implicit default constructor
          [[maybe_unused]]
          T1 t1;            // class, calls implicit default constructor
          const T2 t2;      // const class, calls the user-provided default constructor
          // t2.mem is default-initialized
      }
    
    That `int n;` on the 11th line is initialized to 0 per standard. `int n;` on line 18, inside a function, is not. And `struct T1 { int mem; };` on line 3 will have `mem` initialized to 0 if `T1` is instantiated like `T1 t1{};`, but not if it's instantiated like `T1 t1;`. There's no way to tell from looking at `struct T1{...}` how the members will be initialized without knowing how they'll be called.

    C++ is fun!

    [0]https://en.cppreference.com/w/cpp/language/default_initializ...
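
    The well-defined cases from that reference can be checked directly. A small sketch, with helper function names invented for the purpose. The indeterminate cases are deliberately never read, since doing so would be UB.

```cpp
#include <cassert>
#include <string>

struct T1 { int mem; };          // aggregate, no constructor
struct T2 { int mem; T2() {} };  // user-provided constructor that skips mem

int global_n;  // static storage duration: zero-initialized first

int read_global() { return global_n; }

int read_function_static() {
    static int counter;  // also static storage duration: starts at zero
    return counter;
}

int read_value_initialized_member() {
    T1 t1{};  // empty braces zero the member
    return t1.mem;
}

std::string read_default_string() {
    std::string s;  // class type: the default constructor runs
    return s;
}

// Not read anywhere: `T1 t1;` at block scope and `T2 t2{};` both leave
// `mem` indeterminate. In the T2 case the braces only call T2(), which
// never assigns mem.

void demo_initialization_contexts() {
    assert(read_global() == 0);
    assert(read_function_static() == 0);
    assert(read_value_initialized_member() == 0);
    assert(read_default_string().empty());
}
```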



    Stroustrup once said

    > "There's a great language somewhere deep inside of C++"

    or something to that effect.



    Unless `i` is global…


    The entire problem is that what the programmer wants to do and what the program actually does isn't always clear to the programmer.


    The problem is that the initialization semantics are so complex in C++ that almost no programmer is actually empowered to exercise their final say, and no programmer without significant effort.

    And that's not just said out of unfamiliarity. I'm a professional C++ developer, and I often find I'm more familiar with C++'s more arcane semantics than many of my professional C++ developer co-workers.



    "If the programmer wants to have uninitialized memory -- that is her business."

    idk, seems like years of academic effort and research wasted if we do it the way C++ does it



    The dev should have the option to turn it off but I think that removing a lot of undefined and non deterministic behavior would be a good thing. When I did C++ I initialized everything and when there was a bug it could usually be reproduced. There are a few cases where it makes sense performance wise to not initialize but those cases are very small compared to most other code where undefined behavior causes a ton of intermittent bugs.


    If they want the program to do exactly what is told they won't get to have optimization.


    Most of that actually just makes sense if you approach it from the historic, low-level, minimalist direction. But maybe if you're coming from some other, higher-comfort language...


    Coming from C, none of this made sense to me. Wut is `foo() = default;`? If you want a default value of 0, why isn't it just

      struct foo {
        int a = 0;
      };
    
    In Python, which is higher-level ofc, I still have to do `foo = 0`, nice and clear.


    > If you want a default value of 0, why isn't it ...

    It is.
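
    A short sketch of both pieces, with an extra member and constructor added purely for illustration. The `int a = 0;` form is a default member initializer. `foo() = default;` asks the compiler to generate the default constructor, which matters once another constructor is declared, because that suppresses the implicit one.

```cpp
#include <cassert>

struct foo {
    int a = 0;   // default member initializer
    int b = 42;  // illustrative second member

    foo() = default;  // compiler-generated; applies the initializers above

    // Illustrative extra constructor: overrides a, b still picks up 42.
    explicit foo(int a_value) : a(a_value) {}
};

void demo_member_initializers() {
    foo f;  // no braces needed: a and b are still initialized
    assert(f.a == 0 && f.b == 42);

    foo g(5);
    assert(g.a == 5 && g.b == 42);

    foo h{};
    assert(h.a == 0 && h.b == 42);
}
```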



    And for the most part it does what you expect.


    I largely prefer modern C++ as systems languages go but there is no getting around the fact that the initialization story in C++ is a hot mess. Fortunately, it mostly does what you need it to even if you don't understand it.


    And sometimes it doesn’t do what you think it does.


    Let the language die, hope it goes quicker than cobol.


    https://www.phoronix.com/news/GCC-15-Merges-COBOL

    "COBOL Language Frontend Merged For GCC 15 Compiler", written by Michael Larabel in GNU on 11 March 2025 at 06:22 AM EDT.



    “quicker than cobol” means it will die in the next 100 years (maybe) :)


    COBOL is alive and well. Why would a company rewrite a codebase that has decades of error free functionality? What do they get?


    > Why would a company rewrite a codebase that has decades of error free functionality? What do they get?

    All well and good if it is something you do not have to modify/maintain on a regular basis. But, if you do, then the ROI on replacing it might be high, depending on how much pain it is to keep maintaining it.

    We have an old web app written in asp.net web forms. It mostly works. But we have to maintain it and add functionality to it. And that is where the pain is. We've been doing it for a few years but the amount of pain it is to work on it is quite high. So we are slowly replacing it. One page at a time.



    the insurance companies running COBOL don't care. it's cheaper to pay a cowboy $X00,000/yr to keep the gravy dispenser running than trying to modify it. by definition, this is code that's been in use for decades. Why change it?


    I suspect the committee agrees with you. I think they’ve anticipated a competitor coming to kill C++ for two decades now and see themselves as keeping C++ on life support for those who need it.

    It’s shameful that there’s no good successor to C++ outside of C# and Java (and those really aren’t successors). Carbon was the closest we came and Google seems to have preemptively dropped it.



    The latest Carbon newsletter is here, from March: https://github.com/carbon-language/carbon-lang/discussions/5...


    Carbon is still quite active.


    It's fun to cross the streams of HN catnip.

    C++ sucks, it's too hard to use, the compiler should generate stores all over the place to preemptively initialize everything!

    Software is too bloated, if we optimized more we could use old hardware!



    I'm not familiar with programming languages that generate redundant stores in order to initialize anything.

    Usually what happens is the language requires you to initialize the variable before it's read for the first time, but this doesn't have to be at the point of declaration. Like in Java you can declare a variable, do other stuff, and then initialize it later... so long as you initialize it before reading from it.

    Note that in C++, reading from a variable before writing to it is undefined behavior, so it's not particularly clear what benefit you're getting from this.
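
    C++ has no definite-assignment rule like Java's, so a variable that is declared first and assigned later stays indeterminate in between. One common idiom for getting a similar effect is to compute the value in an immediately invoked lambda, so the variable is initialized on every path and can even be `const`. A minimal sketch with an invented function name:

```cpp
#include <cassert>
#include <string>

std::string describe(int n) {
    // The lambda runs right away. Every branch returns a value, so `label`
    // never exists in an uninitialized state.
    const std::string label = [&]() -> std::string {
        if (n < 0) return "negative";
        if (n == 0) return "zero";
        return "positive";
    }();
    return label + " number";
}

void demo_describe() {
    assert(describe(-3) == "negative number");
    assert(describe(0) == "zero number");
    assert(describe(9) == "positive number");
}
```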



    > Explicitly initialize your variables, and if you ever fall in to the trap of thinking C++ is a sane language, remember this

    It's a systems language. Systems are not sane. They are dominated by nuance. In any case the language gives you a choice in what you pay for. It's nice to be able to allocate something like a copy or network buffer without having to pay for initialization that I don't need.
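
    The buffer case looks roughly like this. It is a sketch with invented helper names, where `fill_from_source` stands in for whatever actually overwrites the buffer (a network read, a copy). The only difference between the two allocations is the trailing `()`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

using byte_buffer = std::unique_ptr<unsigned char[]>;

// Zero-filled: the trailing () value-initializes every element.
byte_buffer make_zeroed(std::size_t n) {
    return byte_buffer(new unsigned char[n]());
}

// Not filled: elements are indeterminate until written.
// (C++20 spells this std::make_unique_for_overwrite<unsigned char[]>(n).)
byte_buffer make_for_overwrite(std::size_t n) {
    return byte_buffer(new unsigned char[n]);
}

// Hypothetical stand-in for a network read: overwrites the first n bytes.
void fill_from_source(unsigned char* dst, const char* src, std::size_t n) {
    std::memcpy(dst, src, n);
}

void demo_buffers() {
    byte_buffer zeroed = make_zeroed(4);
    for (std::size_t i = 0; i < 4; ++i) {
        assert(zeroed[i] == 0);
    }

    byte_buffer scratch = make_for_overwrite(4);
    fill_from_source(scratch.get(), "abcd", 4);  // every byte written first
    assert(scratch[0] == 'a' && scratch[3] == 'd');
}
```

    Skipping the zero fill is only safe because the whole buffer is written before any byte is read.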



    >> Systems are not sane.

    “The systems programmer has seen the terrors of the world and understood the intrinsic horror of existence.”

    https://www.usenix.org/system/files/1311_05-08_mickens.pdf



    I think in this case it's not amiss to mention Rust. Rust gives a compile error if it's not certain a variable is initialized. Option is the standard dynamic representation of this, and works nicely in the context of all Rust code. MaybeUninit is the `unsafe` variant that is offered for performance-critical situations.
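
    C++ has no compile-time check comparable to Rust's, but `std::optional` (C++17) is a rough counterpart to `Option` for "there may be no value yet". A small sketch with an invented `parse_port` function. Note that `.value()` throws on an empty optional, while `*` on an empty optional is UB.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical parser: returns an empty optional instead of leaving a
// result variable unset when the input is not a valid port number.
std::optional<int> parse_port(const std::string& text) {
    if (text.empty()) return std::nullopt;
    int port = 0;
    for (char c : text) {
        if (c < '0' || c > '9') return std::nullopt;
        port = port * 10 + (c - '0');
        if (port > 65535) return std::nullopt;  // out of range
    }
    return port;
}

void demo_parse_port() {
    assert(parse_port("8080").value() == 8080);
    assert(!parse_port("").has_value());
    assert(!parse_port("80a").has_value());
    assert(parse_port("70000").value_or(-1) == -1);
}
```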


    That may have made sense in the days of < 100 MHz CPUs but today I wish they would amend the standard to reduce UB by default and only add risky optimizations with specific flags, after the programmer has analyzed them for each file.


    > That may have made sense in the days of < 100 MHz CPUs

    you don't know how much C++ code is being written for 100-200MHz CPUs everyday

    https://github.com/search?q=esp8266+language%3AC%2B%2B&type=...

    I have a codebase that is right now C++23 and soon I hope C++26 targeting from Teensy 3.2 (72 MHz) to ESP32 (240 MHz). Let me tell you, I'm fighting for microseconds every time I work with this.



    I bet even there you have only a few spots where it really makes a difference. It’s good to have the option but I think the default behavior should be safer.


    I don't know, way too often my perf traces are evenly distributed across a few hundred functions (at best), without any clear outlier.


    "how much code" =/= how many developers.

    the people who care about clock ticks should be the ones inconvenienced, not ordinary joes who are maintaining a FOSS package that is ultimately struck by a 0-day. It still takes a swiss-cheese lineup to get there, for sure. but one of the holes in the cheese is C++'s default behavior, trying to optimize like it's 1994.



    > the people who care about clock ticks

    I mean that's pretty much the main reason for using C++ isn't it? Video games, real-time media processing, CPU AI inference, network middleware, embedded, desktop apps where you don't want startup time to take more than a few milliseconds...



    No, it's not a dichotomy of having uninitialized data and fast startup or wait several milliseconds for a jvm or interpreter to load a gigabyte of heap allocated crap.


    CPU speed is not memory bandwidth. Latency and contention always exist. Long lived processes are not always the norm.

    In another era we would have just called this optimal. https://x.com/ID_AA_Carmack/status/1922100771392520710






