(comments)

Original link: https://news.ycombinator.com/item?id=39318867

Another GitHub repository that uses a similar pattern is https://github.com/prometheus-monitoring-stack/processmanager. Although it is written in Rust rather than Go, it follows similar dependency-injection principles and the same separation of concerns between core logic and configuration/setup steps. Its architecture includes a centralized "agent" process that registers and monitors managed resources through the Prometheus stack, providing detailed monitoring and automatic scaling based on predefined thresholds. Overall, it demonstrates similar programming concepts and philosophies applied in practice across different popular backend technologies such as Rust and Prometheus.

Related articles

Original article
How I write HTTP services in Go after 13 years (grafana.com)
607 points by matryer 15 hours ago | 162 comments


>The Valid method takes a context (which is optional but has been useful for me in the past) and returns a map. If there is a problem with a field, its name is used as the key, and a human-readable explanation of the issue is set as the value.

I used to do this, but ever since reading Lexi Lambda's "Parse, Don't Validate," [0] I've found validators to be much more error-prone than leveraging Go's built-in type checker.

For example, imagine you wanted to defend against the user picking an illegal username. Like you want to make sure the user can't ever specify a username with angle brackets in it.

With the Validator approach, you have to remember to call the validator on 100% of code paths where the username value comes from an untrusted source.

Instead of using a validator, you can do this:

    type Username struct {
      value string
    }

    func NewUsername(username string) (Username, error) {
      // Validate the username adheres to our schema,
      // e.g. reject angle brackets.
      if strings.ContainsAny(username, "<>") {
        return Username{}, errors.New("username contains illegal characters")
      }

      return Username{username}, nil
    }
That guarantees that you can never forget to validate the username through any codepath. If you have a Username object, you know that it was validated because there was no other way to create the object.

[0] https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-va...



Crazy that actually using your type system leads to better code. Stop passing everything around as `string`. Parse them, and type them.


There's a name for this anti-pattern: "Stringly typed"


I've also seen it called primitive obsession, which is also applicable to other primitive types like using an integer in situations where an enum would be better.


Definitely used to fall for primitive obsession. It seemed so silly to wrap objects in an intermediary type.

After playing with Rust, I changed my tune. The type system just forces you onto the correct path, so much so that a lot of code becomes boring because you no longer have to second-guess what-if scenarios.



> Definitely used to fall for primitive obsession. It seemed so silly to wrap objects in an intermediary type.

A lot of languages certainly don't make it easy. You shouldn't have to make a Username struct/class with a string field to have a typed username. You should be able to declare a type Username which is just a string under the hood, but with different associated functions.
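
Go does support this via defined types. A minimal sketch (with the caveat, raised further down the thread, that a bare conversion like Username(s) can bypass any validation):

    // A defined type: still a string under the hood, but a
    // distinct type with its own methods.
    type Username string

    // Normalized returns the username in its canonical lowercase form.
    func (u Username) Normalized() Username {
      return Username(strings.ToLower(string(u)))
    }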



Yeah, modern type systems are game changers. I've soured on Rust, but if Go had the full OCaml type system with match statements I think it would be the perfect language.


Sadly enums are too advanced of a concept to be included in Go.




Bash :(


JSON


TCL


This term is typically used to refer to things like data structures and numerical values all being passed as strings. I don't think a reasonable person would consider storing a username in a string to be "stringly typed".


It definitely is stringly typed. It's just that it's a very normalized example of it, that people don't think of as being an antipattern.

If you want to implement what Yaron Minsky described as "make illegal states unrepresentable", then you use a username type, not a string. That rules out multiple entire classes of illegal states.

If you do that, then when you compile your program, the typechecker can provide a much stronger correctness proof, for more properties. It allows you to do "static debugging" effectively, where you debug your code before it ever even runs.



The One True Wiki[0] says "Used to describe an implementation that needlessly relies on strings when programmer & refactor friendly options are available."

Which is exactly what's going on here. A username has a string as a payload, but that payload has restrictions (not every string will do) and methods which expect a username should get a username, not any old string.

[0]: https://wiki.c2.com/?StringlyTyped



I don't agree that this example is more "programmer friendly". Anything you want to do with the username other than null-checking it or passing it as an argument is going to be based directly on the string representation. Insert into a database? String. Display in a UI? String. Compare? String comparison. Sort? String sort. Is it really more "programmer friendly" to create wrapper types for individual strings all over your codebase that need passthrough methods for all the common string methods? One could argue that it's worth the tradeoff, but this C2 definition is far from helpful in setting a clear boundary.

Meanwhile the real world usages of this term I've seen in the past have all been things like enums as strings, lists as strings, numbers as strings, etc... Not arbitrary textual inputs from the user.



You inherit some code. Is that string a username or a phone number? Who knows. Someone accidentally swapped two parameter values. Now the phone number is a username and you’ve got a headache of trying to figure out what’s wrong.

By having stronger types this won’t come up as a problem. You don’t have to rely on having the best programmers in the world that never make mistakes (tm) to be on your team and instead rely on the computer making guard rails for you so you can’t screw up minor things like that.
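
A quick illustration of that failure mode in Go (hypothetical names):

    type Username string
    type PhoneNumber string

    func register(u Username, p PhoneNumber) { /* ... */ }

    // With plain strings, register(phone, username) compiles even when
    // the arguments are swapped. With distinct types, the compiler
    // rejects it:
    //
    //   register(p, u) // compile error: cannot use p (PhoneNumber) as Username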



As a PHP developer I am frankly disappointed you think that we only do that with strings. I've got an array[1] full of other tools.

1. Or maybe a map? Those keys might have significance I didn't tell you about.



I originally typed out `int` and wanted to do more, but I try to keep my comments as targeted as possible to avoid the common reply pattern of derailing a topic by commenting on the smallest and least important part of it. If I type `string`, `int`, `arrays`, `maps`, `enums`... someone will write 3 paragraphs about how enums are actually an adequate use of the type system, and everyone will focus on that instead of the overarching message.


This is a good design pattern, but be wary of doing validation too early. The design pattern allows you to do it as early or late as you like, but doesn't tell you when to do it. Often it's best to do it as part of parsing/validating some larger object.

See Steven Wittens' "I is for Intent" [1] for some ideas about the use of unvalidated data in a UI context.

[1] https://acko.net/blog/i-is-for-intent/



I read through that piece and strongly disagree with the premise that their insight is somehow at odds with leaning into the type system for correctness.

The legitimate insight that they have is that anchoring the state as close as possible to the user input is valuable—I think that that is a great insight with a lot of good applications.

However, there's nothing that says you can't take that user-centric state and put it in a strongly typed data structure as soon as possible, with a set of clearly defined and well-typed transitions mapping the user-centric state to the derived states.

Edit: looks like there was discussion on this the other day, with a number of people making similar observations—https://news.ycombinator.com/item?id=39269886



A text file and an abstract syntax tree can both be rigorously represented using types, but one is before parsing and the other is after parsing. The question is which one is more suitable for editing.

Text has more possible states than the equivalent AST, many of which are useful when you haven't typed in all the code yet. Incomplete code usually doesn't parse.

This suggests that drafts should be represented as text, not an AST.

And maybe similarly for drafts of other things? Drafts will have some representation that follows some rules, but maybe they shouldn't have to follow all the rules. You may still want to save drafts and collaborate on them even though they break some rules.

In a system that's not an editor, though, maybe it makes sense to validate early. For a command-line utility, the editor is external, provided by the environment (a shell or the editor for a shell script) so you don't need to be concerned with that.



Conceptually equivalent to the ancient arts of private constructors and factory methods.


Which (in Java) were then abstracted away in... interesting annotations.


I’ve found it hard to apply this pattern in Go since, if Username is embedded in a struct, and you forget to set it, you’ll get Username’s zero value, which may violate your constraints.
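
Concretely, a minimal sketch of the pitfall, reusing the Username type from above with a hypothetical Account struct:

    type Account struct {
      Name Username
    }

    // Compiles fine, but NewUsername is never called: a.Name holds
    // Username's zero value, which may violate the schema the
    // constructor was supposed to enforce.
    var a Account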


But if you then create a constructor / factory method for that struct, not setting it would trigger an error. But this is one of the problems with Go and other languages that have nil or no "you have to set this" built into their type system: it relies on people's self-discipline, checked by the author, reviewer, and unit tests, and ensuring there isn't a problem like you describe takes a lot of diligence.


Why? You can easily call NewUsername inside NewAccount, for example, and just return the error. Or did I misunderstand?


Because Go doesn’t have exhaustiveness checking when initialising structs. Instead it encourages "make the zero value meaningful", which is not always possible nor desirable. I usually use a linter to catch this kind of problem: https://github.com/GaijinEntertainment/go-exhaustruct


This is a variation on one of my favorite software design principles: Make illegal states unrepresentable. I first learned about it through Scott Wlaschin[1].

[1]: https://fsharpforfunandprofit.com/posts/designing-with-types...



The issue is DRY often comes to wreck this sort of thing. Some devs will see "Hmm, Username is exactly the same as just a string so let's just use a string as Username is just added complexity".

I've tried it with constructs like `Data` and `ValidatedData` and it definitely works, but you do end up with duplicate fields between the two objects, or worse, an ever-growing inheritance tree and fields unrelated to either object shared by both.

For example, consider data looking like

    type Data struct {
      value string
    }
and ValidatedData looking like

    type ValidatedData struct {
      value int
    }
There's a mighty temptation for some devs to want to apply DRY and zip these two things together. Unfortunately, that can get really messy with these sorts of type changes, and where validation needs to happen gets muddled.


Except Username is not exactly the same as string, and that's important. Username is a subset of string. If they were equivalent, we wouldn't need to parse/validate.

The often misinterpreted part of DRY is conflating "these are the same words, so they are the same", with "these are the same concept, so they are the same". A Username and a String are conceptually different.



DRY is just "Do not repeat yourself". And a LOT of devs take that literally. It's not "Do not repeat concepts" (which is what it SHOULD be but DRC isn't a fun acronym).

Unfortunately "This is the same character string" is all a DRY purist needs to start messing up the code base.

I honestly believe that "DRY" is an anti-pattern because of how often I see this exact behavior trotted out or espoused. It's a cargo cult thing to some devs.



That's why I like to tell people to always remember to stay MOIST - the Most Optimal is Implicitly the Simplest Thing.

When you add complexity to DRY out your code, you're adding a readability regression. DRY matters in very few contexts beyond readability, and simplicity and low cognitive load need to be in charge. Everything else you do code-style-wise should be in service of those two things.



DRY has nothing to do with readability. The fact that it might help with it is purely coincidental.

DRY is about maintainability - if you repeat rules (behavior) around the system and someone comes along and changes one of them, how can you be sure the change affected the whole system coherently?

I've seen this in practice: we get a demand from the PO, a more recent hire goes to make the change, the use case of interest to the PO gets accepted. A week later we have a bug on production because a different code path is still relying on the old rule.



This seems less about DRY and more a story about a hypothetical junior dev making a dumb mistake masquerading as commentary about “DRY purism”.


Man I wish it was just jr devs. I cut jrs a ton of slack, they don't know any better. However, it's the seniors with the quick quips that are the biggest issue I run into. Or perhaps senior devs with jr mentalities


most srs are just jrs with inflated egos and titles


Like everything, it depends is the right answer.


One of the major issues with a lot of the outdated concepts in programming is that we still teach them to young people. I work a side gig as an external examiner for CS students. Especially in the early years they are taught the same OOP content that I was taught some decades ago, stuff that I haven’t used (also) for some decades. Because while a lot of the concepts may work well in theory, they never work out in a world where programmers have to write code on a Thursday afternoon after a terrible week.

It’s almost always better to repeat code. It’s obviously not something that is completely black and white, even if I prefer to never really do any form of inheritance or mutability, it’s not like I wouldn’t want you to create a “base” class with “created by” “updated by” and so on for your data classes and if you have some functions that do universal stuff for you and never change, then by all means use them in different places. But for the most part, repeating code will keep your code much cleaner. Maybe not today or the next month, but five years down the line nobody is going to want to touch that shared code which is now so complicated you may as well close your business before you let anyone touch it. Again, not because the theoretical concepts that lead to this are necessarily flawed, but because they require too much “correctness” to be useful.

Academia hasn’t really caught on though. I still grade first-semester students who have the whole "Animal" -> "duck", "dog", "cat" hierarchy, or whatever they use, drilled into their heads as the "correct way" to do things. Similar to how they are often taught other processes than agile, but are taught that agile is the "only" way, even though we’ve seen just how wrong that is.

I’m not sure what we can really do about it. I’ve always championed strongly opinionated dev setups where I work. Some of the things we’ve done, and are going to do, aren’t going to be great, but what we try to do is to build an environment where it’s as easy as possible for every developer to build code the most maintainable way. We want to help them get there, even when it’s 15:45 on a Thursday that has been full of shit meetings in a week that’s been full of screaming children and an angry spouse and a car that exploded. And things like DRY just aren’t useful.



Yeah, no. Not at all. I imagine that you are taking DRY quite literally and critiquing its most stupid use cases, like DRYing calls to Split with spaces into a SplitBySpace helper.

DRY's goal is to avoid defining behaviors in duplicate, which results in having multiple points in the code to change when you need to modify said behavior. Code needs to be coherent to be "good", for a number of different quality indicators.

I'm doing a "side project" right now where I'm using a newcomer payment gateway. They certainly don't DRY stuff. The same field gets serialized with camel case and snake case in different APIs, and whole structures that represent the same concept are duplicated with slightly different fields. This probably means that at 15.25 on a Thursday the dev checked in her code happy because the reviewer never cared about DRY, and now I'm paying the price of maintaining four types of addresses in my code base.



It’s a balancing act, but deletable code is often preferable to purely-DRY-for-the-sake-of-DRY, overly abstracted code.


> It’s almost always better to repeat code.

God no. Stop the copy pasta disease! It's horrible, mindless programming.

When reviewing code, I'm astonished anything was accomplished by copy pasting so much old code (complete with bugs and comment typos).

Incidentally, OOP encourages you to copy a lot. It's just an engine for generating code bloat. Want to serialize some objects? Here's your Object serializer and your overloaded Car serializer and your overloaded Boat serializer, with only a few different fields to justify the difference!

OOP is bad. Copy pasta is bad. DRY is good. All hail DRY, forever, at any cost.



Countless man-centuries have been lost looking for the perfect abstraction to cover two (or an imagined future with two) cases which look deceptively similar, then teasing them apart again.


OOP and Dry are compatible! I’ve actually done the thing that the above commenter suggests - create a base object with created on/by so that I never have to think about it. Whether or not you actually care about that, if you implement a descended of that object you’re going to get some stuff for free, and you’re gonna like it!


For what it's worth, I've always had an easier time combining WET code than untangling the knot that is too-DRY code. Too little abstraction and you might have to read some extra code to understand it. Too much abstraction and no one other than the writer, and sometimes not even the writer, may ever understand it.


There's a mistake many junior devs (and sometimes mid and senior devs) make where they confuse hiding complexity with simplicity - using a string instead of a well defined domain type is a good example, there is a certain complexity of the domain expressed by the type that they don't want to think about too deeply so they replace with a string which superficially looks simpler but in fact hides all of the inherent complexity and nuance.

It causes what I call the lumpy carpet syndrome - sweeping the complexity under the carpet causes bumps to randomly appear that when squashed tend to cause other bumps to pop up rather than actually solving the problem.



Go now has generics, so I'm confident some smart fellow will apply DRY and make it a generic ValidatedData[type, validator] struct, with a ValidatedDataFactory that applies the correct validator callback, and a ValidatorFactory that instantiates the validators based on a new validation-rule DSL written in JSON or XML.

...Easy!



Related:

Parse, don't validate (2019) - https://news.ycombinator.com/item?id=35053118 - March 2023 (219 comments)

Parse, Don't Validate (2019) - https://news.ycombinator.com/item?id=27639890 - June 2021 (270 comments)

Parse, Don’t Validate - https://news.ycombinator.com/item?id=21476261 - Nov 2019 (230 comments)

Parse, Don't Validate - https://news.ycombinator.com/item?id=21471753 - Nov 2019 (4 comments)



> you have to remember to call the validator on 100% of code paths

But copy-pasting the same lines of code in literally every function is the Golang Way.

It makes code "simpler".



So far I like the commonly used approach in the TypeScript community best:

1. Create your Schema using https://zod.dev or https://github.com/sinclairzx81/typebox or one of the other many libs.

2. Generate your types from the schema. It's very simple to create partial or composite types, e.g. UpdateModel, InsertModels, Arrays of them, etc.

3. Most modern Frameworks have first class support for validation, like Fastify (with typebox). Just reuse your schema definition.

That is very easy, obvious and effective.



It's not guaranteed at all; that's where Go's zero values come in. E.g. nested structs, Un/MarshalJSON magic methods, etc. How do you deal with that?


Every struct requiring its zero value to be meaningful is probably one of the worst design flaws in the language.


This is where we arrive at my conclusion that go is not well-suited to implementing business logic!


There is no such requirement. Common wisdom suggests that you should ensure zero values are useful, but that isn't about every random struct field – only the values you actually give others. Initialize your struct fields and you won't have to consider their zero state. They will never be zero.

It's funny seeing this beside the DRY thread. Seems programmers taking things a bit too literally is a common theme.



> Initialize your struct fields and you won't have to consider their zero state.

“Just do the right thing everywhere and you don’t have to worry!”

You can’t stop consumers of your libraries from creating zero-valued instances.



Then the zero value is their problem, not yours. You have no reason to be worried about that any more than you are worried about them not getting enough sleep, or eating unhealthy food. What are you doing to stop them from doing that? Nothing, of course. Not your problem.

Coq exists if you really feel you need a complete type system. But there is probably good reason why almost nobody uses it.



> Then the zero value is their problem, not yours.

Except for all those times you're the consumer of someone else's library and there's no way for them to indicate that creating a zero-valued struct is a bug.

Again, it's the philosophy of "Just do the right thing everywhere and you don’t have to worry!" Sometimes it's nice to work with a type system where designers of libraries can actually prevent you from writing bugs.



> Except for all those times you're the consumer of someone else's library and there's no way for them to indicate that creating a zero-valued struct is a bug.

Nonsense. Go has a built-in facility for documentation to communicate these things to other developers. Idiomatic Go strongly encourages you to use it. Consumers of the libraries expect it.

> Sometimes it's nice to work with a type system where designers of libraries can actually prevent you from writing bugs.

Well, sure. But, like I said, almost nobody uses Coq. The vast, vast, vast majority of projects – and I expect 100% of web projects – use languages with incomplete type systems, making what you seek impossible.

And there's probably a good reason for that. While complete type systems sound nice in theory, practice isn't so kind. There are tradeoffs abound. There is no free lunch in life. Sorry.



> The vast, vast, vast majority of projects – and I expect 100% of web projects – use languages with incomplete type systems, making what you seek impossible.

…where, "what GP seeks" is…

> way for [library authors] to indicate that creating a zero-valued struct is a bug

I'd say that's a really low and practical bar, you really don't need Coq for that. Good old Python is enough, even without linters and type hints.

Of course it's very easy to create an equivalent of zero struct (object without __init__ called), but do you think it's possible to do it while not noticing that you are doing something unusual?



> Good old Python is enough

No, Python is not enough to "...work with a type system where designers of libraries can actually prevent you from writing bugs." Not even typed Python is going to enable that. Only with a complete type system can the types prevent you from writing those bugs. And I expect exactly nobody is writing HTTP services with a language that has a complete type system – for good reason.

> Of course it's very easy to create an equivalent of zero struct

Yes, you are quite right that you, the library consumer, can Foo.__new__(Foo) and get an object that hasn't had its members initialized just like you can in Go. But unless the library author has specifically called attention to you to initialize the value this way, that little tingling sensation should be telling you that you're doing something wrong. It is not conventional for libraries to have those semantics. Not in Python, not in Go.

Just because you can doesn't mean you should.



C++ constructors actually make the guarantee, but it comes with other pains


Lots of languages handle it just fine and don’t need the mess of C++ ctors.

GP is pointing out that go specifically makes it an issue.



I always understood "parse don't validate" a bit differently. If you are doing the validation inside of a constructor, you are still doing validation instead of parsing. It is safer to do the validation in one place you know the execution will go through, of course, but not the idea I understand "parse don't validate" to mean. I understand it to mean: "write an actual parser, whatever passes the parser can be used in the rest of the program", where a parser is a set of grammar rules for example, or PEG.


I'm not a Haskell developer, so it's possible that I misunderstood the original "Parse, Don't Validate" post.

>If you are doing the validation inside of a constructor, you are still doing validation instead of parsing.

Why that would be considered validation rather than parsing?

From the original post:

>Consider: what is a parser? Really, a parser is just a function that consumes less-structured input and produces more-structured output.

That's the key idea to me.

A parser enforces checks on an input and produces an output. And if you define an output type that's distinct from the input type, you allow the type system "preserve" the fact that the data passed a parser at some point in its life.

But again, I don't know Haskell, so I'm interested to know if I'm misunderstanding Lexi Lambda's post.



Parse don't validate means that if you want a function that converts an IP address string to a struct IpAddress{ address: string } you don't validate that the input string is a valid IP address then return a struct with that string inside. Instead you parse that IP into raw integers, then join those back into an IP string.

The idea is that your parsed representation and serializer are likely to produce a much smaller and more predictable set of values than what may pass the validator.

As an example there was a network control plane outage in GCP because the Java frontend validated an IP address then stored it (as a string) in the database. The C++ network control plane then crashed because the IP address actually contained non-ASCII "digits" that Java with its Unicode support accepted.

If instead the address was parsed into 4 or 8 integers and was reserialized before being written to the DB this outage wouldn't have happened. The parsing was still probably more lax than it should have been, but at least the value written to the DB was valid.

In this case it was funny Unicode, but it could be as simple as 1.2.3.04 vs 1.2.3.4. By parsing then re-serializing you are going to produce the more canonical and expected form.
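
As a sketch of that parse-then-reserialize step in Go, assuming the net/netip package and IPv4 for brevity:

    // CanonicalIPv4 parses the input into a structured address and
    // re-serializes it, so only the canonical form is ever stored.
    func CanonicalIPv4(s string) (string, error) {
      addr, err := netip.ParseAddr(s) // rejects non-ASCII digits, leading zeros, etc.
      if err != nil || !addr.Is4() {
        return "", fmt.Errorf("invalid IPv4 address: %q", s)
      }
      return addr.String(), nil // canonical dotted-quad form
    }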



Perhaps "normalize" or "canonicalize" is more appropriate. A parser can liberally interpret but I don't take it to imply some destructured form necessarily. There are countless scenarios where you want to be able to reproduce the exact input, and often preserving the input is the simplest solution.

But yes, usually you do want to split something into its elemental components, should it have any.



Thanks for that explanation! I hadn't appreciated that aspect of "parse, don't validate," before.

But even with that understanding and from re-reading the post, that seems to be an extra safety measure rather than the essence of the idea.

Going back to my original example of parsing a Username and verifying that it doesn't contain any illegal characters, how does a parser convert a string into a more direct representation of a username without using a string internally? Or if you're parsing an uint8 into a type that logically must be between 1 and 100, what's the internal type that you parse it into that isn't a uint8?



> Or if you're parsing an uint8 into a type that logically must be between 1 and 100, what's the internal type that you parse it into that isn't a uint8?

Just for the sake of example, your internal representation might start from 0, and you just add 1 whenever you output it.

Your internal type might also not be a uint8. Eg in Python you would probably just use their default type for integers, which supports arbitrarily big numbers. (Not because you need arbitrarily big numbers, but just because that's the default.)
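
A sketch of that 1-to-100 case in Go, with a shifted internal representation (names hypothetical):

    // Score is stored 0-based internally (invariant: v <= 99) and
    // shifted back to the 1..100 range on the way out.
    type Score struct{ v uint8 }

    func NewScore(n uint8) (Score, error) {
      if n < 1 || n > 100 {
        return Score{}, fmt.Errorf("score must be between 1 and 100, got %d", n)
      }
      return Score{v: n - 1}, nil
    }

    func (s Score) Value() uint8 { return s.v + 1 }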



The fact that this is some special "technique" really shows how far behind Go's type system and the community around typing are.


> and a human-readable explanation of the issue is set as the value.

This is annoying to translate later. At least also include some error code string that is documented somewhere and isn't prone to change randomly.



I mean, you may end up just wanting something like,

    type UsernameError struct {
      name   string
      reason string
    }
    func (e *UsernameError) Error() string {
      return fmt.Sprintf("invalid username %q: %s", e.name, e.reason)
    }
And reason can be "username cannot be empty" or "username may not contain '<'".

This is fine for lots of different cases, because it’s likely that your code wants to know how to handle “username is invalid”, but only humans care about why.

I have personally never seen a Go codebase where you parse error strings. I know that people keep complaining about it so it must be happening out there—but every codebase I’ve worked with either has error constants (an exported var set to some errors.New() value) or some kind of custom error type you can check. Or if it doesn’t have those things, I had no interest in parsing the errors.



I write mostly frontends. Sometimes the APIs I talk to give back beautiful English error messages - that I can't just show to the user, because they are using a different language most of the time. And I don't want to write logic that depends on that sentence, far too brittle.


Right—I think the “error code” here is going to be the error type, i.e., UsernameError, or some qualified version of that.

It’s not perfect, but software evolves through many imperfect stages as it gets better, and this is one such imperfect stage that your software may evolve through.

Including a human-readable version of the error is useful because the developers / operators will want to read through the logs for it. Sometimes that is where you stop, because not all errors from all backends will need to be localized.



You can use new types with validation too. In fact the approaches seem to be duals.

Parse, don't validate:

                    string          ParsedString
  untrusted source -------> parse --------------> rest of system
Validate, don't parse:

                    UnvalidatedString            string
  untrusted source ------------------> validate -------> rest of system


The problem is that pattern "fails open." If anyone on the team forgets to define an untrusted string as UnvalidatedString, the data skips validation.

If you default to treating primitive types as untrusted, it's hard for someone to accidentally convert an untrusted type to a trusted type without using the correct parse method.



The dual problem would be any function which forgets to accept a ParsedString instead of a string can skip parsing.

Both cases appear to depend on there being a "checkpoint" all data must go through to cross over to the rest of the system, either at parsing or at UnvalidatedString construction.



>The dual problem would be any function which forgets to accept a ParsedString instead of a string can skip parsing.

>Both cases appear to depend on there being a "checkpoint" all data must go through to cross over to the rest of the system, either at parsing or at UnvalidatedString construction.

The difference is that if string is the trusted type, then it's easy to miss a spot and use the trusted string type for an untrusted value. The mistake will be subtle because the rest of your app uses a string type as well.

The converse is not true. If string is an untrusted type and ParsedString is a trusted type, if you miss a spot and forget to convert an untrusted string into a ParsedString, that function can't interact with any other part of your codebase that expects a ParsedString. The error would be much more visible and the damage more contained.

I think UnvalidatedString -> string also kind of misses the point of the type system in general. To parse a string into some other type, you're asserting something about the value it stores. It's not just a string with a blessing that says it's okay. It's a subset of the string type that can contain a more limited set of values than the built-in string type.

For example, parsing a string into a Username, I'm asserting things about the string (e.g., that it contains no illegal characters like angle brackets).



The example also assumes that everything is like a 'ParsedString' that contains a copy of the original untrusted value inside.


Just do

    type Username string
And replace

      return Username{username}
with

      return Username(username)


The problem there is that you lose the guarantee that the parser validated the string value.

A caller can just say:

    // This is returning an error for some reason, so let's do it directly.
    // username, err := parsers.NewUsername(raw)
    username := parsers.Username(raw)
You also get implicit conversions in ways you probably don't want:

    var u Username
    u = "" // Implicitly converts from string to Username


That's true I did not think of that.


If you do that, people outside the package can also do Username(x) conversions instead of calling NewUsername. Making value package private means that you can only set it from outside the package using provided functionality.


My Go is rusty; do you mean not exporting the type "Username" (i.e. username) to avoid default constructor usage?


In Go, capitalized identifiers are exported, whereas lowercase identifiers are not.

In the example I gave above, clients outside of the package can instantiate Username, but they can't access its "value" member, so the only way they could get a populated Username instance is by calling NewUsername.



Encapsulation saves lives.


AKA 'Value Object' from DDD or a similar 'Quantity' accounting pattern. Another angle is that this fixes the 'Primitive Obsession' code smell.


Now what? The username is in an unexported field and unusable? I can kind of see what it's going for, but it seems like a way just to add another layer of wrapping and indirection.


It would need a getter here. Probably good to keep it immutable, if you want guarantees that it will never be changed to something that violates the username rules.


> need a getter

Yeah, that's what I figured. I'm not sure if I want the tradeoff of calling .GetValue in multiple places just to save calling validate in maybe 2 or 3 places.

Not to mention I can't easily marshal/unmarshal into it, and next week a valid username is a username that doesn't already exist in the database.

Maybe this approach appeals to people, and I'm hesitant to say "that's not how Go is supposed to be written", but for me this feels like "clever over clear".



> Yeah, that's what I figured. I'm not sure if I want the tradeoff of calling .GetValue in multiple places just to save calling validate in maybe 2 or 3 places.

The tradeoff is not that you save calling validate, it’s that you avoid forgetting to call validate in the first place, because when you forget to validate, you get a type error.

IMO it’s a little more clear this way:

    type Ticket struct {
      requestor Username
      assignee  Username
    }
It lets you write code that is a little more obvious.


I’m not sure I understand. In your example you’ve grouped related data in a struct and are validating that it matches your system’s invariants; that feels good to me.

The original example was more "wrap a simple type in an object so it's always validated when set", which looks beautiful when you don't show the needed getters in the example, nor all the Get call sites as opposed to the 1 or 2 New call sites. All in the name of "we don't want to set the username without validation", but without private constructors Username{"invalid"} can be invoked and the validation circumvented, and I'm not convinced the overhead we paid was worth it.



The countless bugs I've had to deal with, and all the time I've lost fixing bugs caused by people who forgot to validate data in a certain place or didn't realize they had to, prove to me that the overhead of calling a get on a wrapper type is totally worth it.

I value the hours wasted on diagnosing a bug far more than the extra keystrokes and couple of seconds required to avoid it in the first place.



No, you’ve achieved an illusion of that, as now you’re wasting hours discovering where a developer forgot to call NewUsername and instead called Username{“broken”}. I can’t see the value in this abstraction in Go.


They can’t because value is not exported. They must use the NewUsername function, which forces the validation.

In my opinion, this pattern breaks when the validation must return an error and everything becomes very verbose.



Oh, that’s true about it being unexported. I hadn’t considered that.


But surely this is just another way of doing validation and not fundamentally "parsing"? If at the end you've just stored the input exactly as you got it, the only parsing you're potentially doing is in the validation step and then it gets thrown away.


The validation is not completely thrown away, since the type indicates that the data has been validated. I understand "parsing" as applying more structure to a piece of data. Going from a String to an IP or a Username fits the definition.

I push my team to use this pattern in our (mostly Scala) codebase. We have too many instances of useless validations, because the fact that a piece of data has been "parsed"/validated is not reflected in its type using simple validation.

For example using String, a function might validate the String as a Username. Lower in the call stack, a function ends up taking this String as an arg. It has no way of knowing if it has been validated or not and has to re-validate it. If the first validation gets a Username as a result, other functions down the call stack can take a Username as an argument and know for sure it's been validated / "parsed".



Implementation-wise, yes, but the interface you're exposing is indistinguishable from that of a parser. For all your consumers know, you could be storing the username as a sequence of a 254-valued enum (one for each byte, except the angle brackets) and reconstructing the string on each "get" call. For more complex data you would certainly be storing it piecewise; the only reasons this example gets a pass are 1) because it is so low in surface area that a human can reasonably validate the implementation as bug-free without further aid from the type checker, and 2) because Go's type system is so inexpressive that you can't encode complex requirements with it anyway.




> func NewServer(... config *Config ...) http.Handler

one of my biggest pet peeves is when people take a Config object, which represents the configuration of an entire system, and pass it around mutably. When you do that, you're coupling everything together through the config object. I've worked on systems where you had to configure the parts in a specific order in order for things to work, because someone decided to write back to the config object when it was passed to them. Or another case was where I've seen it such that you couldn't disable a portion of the system because it wrote data into the config object that was read by some other subsystem later. The pattern of "your configuration is one big value, which is mutable" is one of the more annoying patterns that I've seen before, both in Go and in other languages.



The keyword here is "mutable" config object, not config data objects in general. I use an immutable config dataclass liberally in one of my Python projects and pass it around in all modules. Many functions rely on multiple values, and instead of passing all of them as function parameters (which require their own function typings), the dataclass has all variables with typing definitions in one place; it's a pretty handy design pattern.


My favorite way to prevent this is to make the config truly immutable, but still configurable with something like this:

  package config

  type options struct {
    name string
  }

  type Option func(o *options)

  func Name(name string) Option {
    return func(o *options) {
      o.name = name
    }
  }

  type Config struct {
    opts *options
  }

  func New(opts ...Option) *Config {
    o := &options{}
    for _, option := range opts {
      option(o)
    }
    return &Config{opts: o}
  }

  func (c *Config) Name() string {
    return c.opts.name
  }
Use it with:

  cfg := config.New(config.Name("Emanon"))
  fmt.Println(cfg.Name())


I used that pattern for a while but stopped using it. I first encountered it from this blog post: https://commandcenter.blogspot.com/2014/01/self-referential-...

It's a lot of boilerplate to create something that's not actually immutable. It also makes it harder to figure out which options are available, since now you can't just look at the documentation of the type; you have to look at the whole package to figure out what the various options are. If one of the fields is a slice or map, you can just mutate that slice or map in place, so it's not really immutable. The pattern as Pike describes it has the benefit that supplying an option returns an option that reverses its effect, so you can use the options somewhat like Python context objects that have enter and exit semantics, but in practice I've found that to be useful in a small portion of situations.



I've tended to create a Config struct for each package and then a configs.Config struct that's just made up of each package's Config. It might not be a Go best practice but I like that I can setup the entire system's configuration on startup as one entity but then I only pass in the minimally required dependencies for each package. It also makes testing a little easier because I don't have to fake out the entire configuration for testing one package.


I think that's a valid criticism. What do you think would be a more ergonomic pattern?


Not the OP, but I mitigate the issue rather than use a different pattern. Like so:

    type Server struct{ val bool }

    type Config struct{ Val bool }

    func NewServer(... config *Config ...) http.Handler {
        if config == nil {
            config = &Config{}
        }
        return &Server{val: config.Val}
    }

It took me a long time to settle on this pattern, and I admit it's tedious to copy configuration over to the server struct, but I've found that it ends up being the least verbose and most maintainable option long term, while making sure callers can't mutate config after the fact.

I can pass nil to NewServer to say "just the usual, please", customize everything, or surgically change a single option.

It's also useful for maintaining backwards compatibility. I'm free to refactor config on my server struct and "upgrade" deprecated config arguments inside my NewServer function.



I just use a struct literal, and then I have the type define a `func (t *Thing) ready() error { ... }` method and call the ready method to check that its valid. I prefer this over self-referential options, the builder pattern, supplying a secondary config object as a parameter to a constructor, etc.
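
A minimal sketch of that approach (field names hypothetical):

    type Thing struct {
        Logger *slog.Logger
        DB     *sql.DB
    }

    // ready reports whether the struct literal was filled in completely.
    func (t *Thing) ready() error {
        if t.Logger == nil {
            return errors.New("Logger is required")
        }
        if t.DB == nil {
            return errors.New("DB is required")
        }
        return nil
    }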


I wrote a static config class that reads configuration for the entire app / server from a JSON or YAML file ( https://github.com/uber/zanzibar/blob/master/runtime/static_... ).

Once you've loaded it and mutated it for testing purposes or for copying from ENV vars into the config, you can then freeze it before passing it down to all your app level code.

Having this wrapper object that can be frozen and has a `get()` method to read JSON-like data makes it effectively immutable.



I use a similar pattern myself. I was curious if the OP is using some other one, like splitting the struct into two (im/mutable) and then passing them around, or what.

BTW, kudos on zanzibar. Love the tech and the code.



I agree. We ran into a sev by changing the top-level config object before. You DO NOT want to modify it. The wasted man-hours are not worth it. You will never know where or how it gets used. If you make changes, it's better to derive from it instead.

Update: What's funny was, in our design the config object was kinda immutable. You had to use the WARNING_DO_NOT_USE api to make modifications. We did mutate the object, and we caused a sev.



I really like Mat Ryer's work, and I've applied most of the ideas in the 2018 version of this article to all of my Go projects since then.

The one weak spot for me is this aspect:

>NewServer is a big constructor that takes in all dependencies as arguments... In test cases that don’t need all of the dependencies, I pass in nil as a signal that it won’t be used.

This has always felt wrong to me, but I've never been able to figure out a better solution.

It means that a huge chunk of your code has a huge amount of unnecessary shared state.

I often end up writing HTTP handlers that only need access to a tiny amount of the shared state. Like the HTTP handler needs to check if the requesting user has access to a resource, and then it needs to call one function on the datastore.

I'd love to write tests where I only mock out those two methods, but I can't write simple tests because the handler is part of this giant glob where it has access to all of the datastore and every object the parent server has access to because it's all one giant object.

Nothing against Mat Ryer, as his pattern is the best I've found, but I still feel like there's some better solution out there.



I've become increasingly sensitive to these high afferent coupling points in the repos I work on, especially the deeper I embed into the world of bazel and how dependency management and physical design influence the code I author.

Where possible, plugins are a great strategy to lay down code seam points that don't force all possibilities upon some body of code, because fundamentally with plugin architectures you pick and choose what you want. Plugins are opted out by default; you must explicitly opt into a plugin for it to manifest. I've been calling software that has this quality "a la carte" style.

But in general you do what you need to do to avoid "doing everything so you can do anything".



It means the object created by NewServer is dealing with too much. Probably has too many data types coupled to it and too much behavior.

A simple example is adding a logger. If you add it as a dependency to the constructor, the object starts doing a bit more than the initial simple implementation. It's fine to do it, but a shame not to figure out how to log without editing the implementation of a simple thing.

Higher-order functions (a logger decorator) get you there by allowing composition, but even they have their drawbacks.

It's still some form of structure that you can deal with, not a mistake.



As you say, having a logger attached is one of those pragmatic and acceptable exceptions to the rule. In a perfect world we'd have the time to go to the trouble of implementing loggable types and data flows and associated higher-order functions; in practice, taking the compromise means getting the real business-valuable work completed while still having the necessary (but usually "low priority") non-functional requirements like logging and metrics implemented.


I agree that too many arguments to the constructor may have the smell of too much coupling.

But if I really feel I can't avoid the need to pass a good amount of external context, I create a dedicated "options" struct and pass that into the constructor as a pointer. The purpose of the pointer (rather than pass by value) is if I want default arguments, I can pass nil.

    type ServerOptions struct {
        Logger    *magic.Logger
        SecretKey string
    }

    func NewServer(options *ServerOptions) (*Server, error) {
        ...
    }


You can use Dependency Injection to solve this issue but in my view the added complexity is not really worth it.


I tend to write most of my logic in packages... so a "users" package or a "comments" package (if we were building HN). These have NO http interface! They do however each have their own "main" and some sort of CLI interface: "//go:build ignore" in the comment of that file is your friend.


> It means that a huge chunk of your code has a huge amount of unnecessary shared state.

Can you explain that a little more?

Which chunk of code has what shared state, and why is it unnecessary?



Basically, not all the handlers will use every dependency the server (which is the entire program in this pattern) has. Not every handler will use a database, for example.

While I may prefer a struct for this instead of separate arguments, I do agree it's useful to capture "the world" as the set of all dependencies, even if some handlers don't use them (yet).





I agree with a lot of this, I'll add my own opinions:

* I would pass a waitgroup with the app context to service structs. This way the interrupt can trigger the app shutdown via the context and the main goroutine can wait on the waitgroup before actually killing the app (see the sketch after this list).

* If writing a CLI program, then testing stdout, stdin, stderr, args, env, etc. is useful. But for an http server, this is less true. I would pass structured config to the run function to let those tests be more focused.

* I disagree with parsing templates using sync.Once in a handler because I don't think handlers should do template parsing at all. I would do this when the app starts: if the template cannot be parsed, the app should not become ready to receive any requests and should rather exit with a non-zero exit code.
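
A minimal sketch of the first point, assuming a service svc with a blocking Run(ctx) method:

    ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt)
    defer stop()

    var wg sync.WaitGroup
    wg.Add(1)
    go func() {
        defer wg.Done()
        svc.Run(ctx) // returns once ctx is cancelled
    }()

    <-ctx.Done() // interrupt received: services begin shutting down
    wg.Wait()    // block until every service has finished cleanup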



I found fx(https://github.com/uber-go/fx) to be a super simple yet versatile tool to design my application around.

All the advice in the article is still helpful, but it takes the "how do I make sure X is initialized when Y needs it" part completely out of the equation and reduces it from an N*M problem to an N problem, ie I only have to worry about how to initialize individual pieces, not about how to synchronize initialization between them.

I've used quite a few dependency injection libraries in various languages over the years (and implemented a couple myself) and the simplicity and versatility of fx makes it my favorite so far.



>All the advice in the article is still helpful, but it takes the "how do I make sure X is initialized when Y needs it" part completely out of the equation and reduces it from an N*M problem to an N problem, ie I only have to worry about how to initialize individual pieces, not about how to synchronize initialization between them.

I gotta say, I hate these dependency injection frameworks.

In a well designed system this should be trivial. Making sure something is initialised when you want to use it is just a matter of it being available to pass in a constructor as a parameter.

  stockService := NewStockService()
  orderService := NewOrderService()
  orderProcessor := NewOrderProcessor(stockService, orderService)
There shouldn't be any sort of "synchronisation" of initialisation needed because your code won't compile if you do something wrong. If you add a cyclic dependency you will clearly see that because you won't be able to construct things in the right order without an obvious workaround.


If you have ever topologically sorted 100 components connected in a complex graph by hand or found the right spot to insert the 101st, you'd quickly appreciate more help than a compiler check.


Your dependency structure should just be a tree.

It should be inserted literally right next to its first use case. Your IDE will literally point it out to you with red squigglies, because the places where you've added a dependency will be missing a parameter. Go to the highest one and add it on the line above.



I’m sure there’s a place for them.

But when micro-services are so common, it seems like people use them (Spring) because everyone else does, not because they actually provide needed value.



I've recently been playing with ogen: https://github.com/ogen-go/ogen

Write openapi definition, it'll do routing, definition of structs, validation of JSON schemas, etc.

All I need to do is implement the service.

Validating an integer range for a querystring parameter is just too boring. And too easy to mistype when writing it manually.

Anyways, so far only been playing, so haven't found the bad parts yet.



The problem with this approach is that writing OpenAPI by hand from scratch is an incredibly tedious process. Writing Protobufs, capnproto, or any similar IDL feels much more productive.


It's a bit icky, but LLMs / Copilot can speed up the creation of OpenAPI specs a lot.

Agree it doesn't fix the "root" problem that the overall syntax is not ergonomic.



Or, if you're more into publishing an Openapi spec from your Go code, I do like danielgtaylor/huma[1] and swaggest/rest[2].

[1] https://github.com/danielgtaylor/huma

[2] https://github.com/swaggest/rest



I like a lot of what they've done here. My testing looks a bit different however.

    srv, err := newTestServer()
    require.NoError(t, err)
    defer srv.Close()

    resp, err := http.Post(
        fmt.Sprintf("http://localhost:%d/signup/json", srv.Port()),
        "application/json",
        strings.NewReader(`{"email":"[email protected]", "password": "p@55Word", "password_copy": "p@55Word"}`),
    )

In my newTestServer, I spin up a server with fakes for my dependencies. If I want to test a dependency error, I replace that property with a fake that will return an error. I can validate my error paths. I can validate my log entries. I can validate my metric emission. I can validate timeouts and graceful shutdowns.

After the server starts, I inspect it to determine which port it is running on (the default is :0, so I have to wait to see what it got bound to).

My "unit" tests can test at the handler level or the http level, making sure that I can fully test the code as the users of my system will see it, exercising all middleware or none. I can spin up N instances and run my tests in parallel.



Great article with lots of interesting ideas. Can't believe I didn't know about signal.NotifyContext. Finally I'll be able to actually remember how to respond to signals instead of copy-pasting that between projects.


I don't write go, but I like these patterns. Feels fairly universal for testable code.

I never want to see another (esp. Python) Quick Start guide that treats dependencies as implicit/static/untestable.



I just run Go servers under fcgi. You get orchestration and crash recovery with a very simple interface. Fcgi will launch server processes as needed, feed them events, and shut it down when there's no traffic. Performance is good, and you can run on cheap hosting.


Which hosting do you use? I use fastcgi with python on Dreamhost and it works fine, but I’m sorta worried that they’ll turn it off because it seems kind of niche and under-documented


Dreamhost too. Dreamhost will let you run a continuously running process.

The amount of work you can get done on low-end shared hosting is really quite impressive.



What is the value making main.go as small as possible?

Whose dreams come true in this scenario?



The author goes on to explain a few scenarios where the pattern is helpful. It's not to keep main.go as small as possible; it's so that you can test parts of your main.go file properly. In my experience, if all of my logic is stuffed into `func main() {}`, then I can't actually test it. If I have a helper method (like run in this case), I can test specific scenarios and ensure the application handles them properly. Some of the examples Mat gave involved handling context cancellations.


“Whose dreams come true in this scenario?”

I love this! I will use this as well.

There are so many situations where I have a feeling that people are solving problems that don’t exist. In code I run into at work, code and projects I see online, etc

The “whose dreams are you making come true” really applies here, because dreams are exactly what they are.

I spent quite some time writing an automatic image resizer and optimiser for my blog. Does it matter? No! Should I have spent that time writing blog posts instead? Yes! Still I was chasing some dream.

Thanks for this image



If you do, you can use the application as a library and most of your code will also be easier to test.


I've never been a fan of making main.go one line. I create the logger, parse the flags, create objects from the flags, and call Run() or something. In the tests, you aren't ever going to do those things in the same way, so there is really no point in putting them in some other file.


Usually your main function can't be used by any other part of your program. You should move all component implementations into packages so they can be re-used elsewhere.


Not main.go, but func main. This allows your run function to return an error, and you only need to deal with the abruptness of os.Exit once.


The initialization has to be done in a separate function that you call from the setup code for your end-to-end tests.


The idea is to keep the untestable code as small as possible but in practice you just add a layer of indirection and all of your untestable init code is in a different castle.


For bespoke internal services, I like to keep main.go as flat as reasonable, like a "script". Handlers can have their own files but the bulk of the control flow and moving parts should be apparent from reading the main file.

Abstracting things away from main makes it less readable and is generally pointless for bespoke services that will be deployed in exactly one configuration.



That's a nice way of putting it. When exploring a new codebase for the first time it can be very helpful to have main.go give you a high level idea about the overall structure of the program.


The validator should return map[string][]string so that a request can have multiple problems with one field.
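
Something like this, where each field accumulates every problem found (SignupRequest is a made-up example type):

    func (r SignupRequest) Valid(ctx context.Context) map[string][]string {
        problems := make(map[string][]string)
        if r.Email == "" {
            problems["email"] = append(problems["email"], "must not be empty")
        }
        if !strings.Contains(r.Email, "@") {
            problems["email"] = append(problems["email"], "must contain an @")
        }
        if len(r.Password) < 8 {
            problems["password"] = append(problems["password"], "must be at least 8 characters")
        }
        return problems
    }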


The sync.Once should be sync.OnceValues instead.
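
sync.OnceValues (Go 1.21+) lets the one-time initializer return both a value and an error, which plain sync.Once can't. A sketch, with loadTemplates as a hypothetical initializer:

    var loadTemplates = sync.OnceValues(func() (*template.Template, error) {
        // Runs at most once; every later call returns the same cached pair.
        return template.ParseGlob("templates/*.html")
    })

    func handleHome(w http.ResponseWriter, r *http.Request) {
        tmpl, err := loadTemplates()
        if err != nil {
            http.Error(w, "template error", http.StatusInternalServerError)
            return
        }
        tmpl.ExecuteTemplate(w, "home.html", nil)
    }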


The encode example contains a bug and a lint issue. Firstly, calling w.Header().Set after w.WriteHeader is likely a bug, as the w.WriteHeader method call should occur after setting the headers.

The second issue involves passing an unused *http.Request, which will likely cause the linter to flag it.
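
A sketch with both issues addressed (headers set before WriteHeader, unused request parameter dropped); this is my reading of the fix, not necessarily the author's:

    func encode[T any](w http.ResponseWriter, status int, v T) error {
        // Headers must be set before WriteHeader, which flushes them.
        w.Header().Set("Content-Type", "application/json")
        w.WriteHeader(status)
        if err := json.NewEncoder(w).Encode(v); err != nil {
            return fmt.Errorf("encode json: %w", err)
        }
        return nil
    }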



I want to see a greater acceptance of this idea:

> My handlers used to be methods hanging off a server struct, but I no longer do this. If a handler function wants a dependency, it can bloody well ask for it as an argument. No more surprise dependencies when you’re just trying to test a single handler.

For HTTP services in any language, your handlers will usually end up with a lot of business logic, logic which probably has many dependencies. I see single handlers using all of the following on a regular basis: DB, cache, blob storage, some kind of special authz thing specific to your endpoints, maybe some fancy licensing checker, a queue or two, a specialized logger, and a specialized metrics client. Many of those (metrics, request/response logging) can live in middlewares most of the time, but in every code base there will be times where you need to do something custom with one or the other. The more time passes, the more I wonder: "why aren't these all just function parameters?"

Yes, that would be a lot of function parameters (9+ for a single handler, before even getting into the request or custom params themselves), and we all have many rules of thumb and linter rules which try to keep us from having lots of function parameters. But it's not like we're not writing code which depends on all those dependencies; instead we're just sticking them on the "server" class/struct and pretending that because the method signature is shorter, we have fewer dependencies!

As time passes, I find myself wishing more and more for code that takes all its dependencies in the function/method signature, even if there are 20 of them; at least then we wouldn't be lying about how complex the code is getting...
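
Concretely, that's the closure-constructor style with every dependency spelled out; the interface names here are placeholders, not a real API:

    func handleCreateOrder(
        logger *slog.Logger,
        db *sql.DB,
        cache Cache,
        blobs BlobStore,
        authz Authorizer,
        queue Publisher,
    ) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            // The handler can only touch what was passed in; the signature
            // is long, but it tells the truth about the dependencies.
        })
    }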



I've always had my handlers individually set up as structs, each with a method to handle the route/request.

    type CreateUser struct {
        store  storage.Store
        cache  caching.Cache
        logger logging.Logger
        pub    events.Publisher
        // etc
    }

    func (op CreateUser) ServeHTTP(w http.ResponseWriter, r *http.Request) {}

    // or, if you use a custom handler interface:
    func (op CreateUser) Handle(ctx context.Context, in CreateUserInput) (CreateUserOutput, error)

And in my main.go, or where I set up my dependencies, I create each operation, passing it its specific dependencies. I love that because I can keep all the helper methods for that specific operation/handler on that specific struct as private methods.

It does get tedious when you have one operation needing another, as you might start passing these around or you extract that into its own package/service.



This is kinda missing the point; each handler needs a lot of deps to do its job, and the most obvious place to put them is in the parameters of the function. That is what I want. I do not want more indirection for aesthetics; I want clarity, even if it's brutal clarity.

Whether all the deps are in the method receiver (the parent struct) or in a struct that's a param; it's all just more indirection to hide all the "stuff" that we need cause we think it's ugly. I dream of a world where we don't do that.



You do have to instantiate that struct, and you can do it with... a beautiful NewCreateUser(dep1, dep2, dep3, ..., dep20) *CreateUser {...}. This is essentially what he recommends with his "func newMiddleware() func(h http.Handler) http.Handler".


It doesn't have to be 9+ separate arguments, in some languages it can be a single 'context' or 'env' object that contains just what the handler needs, something like `handleHello({ db, cache, blobStore, authz }, req, res)`. That way, if two handlers use the exact same context you can reuse, but it's also easy enough to declare a per-handler context at the call site.
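
In Go, the closest equivalent is a small per-handler deps struct rather than the whole server; helloDeps is a made-up example:

    type helloDeps struct {
        DB    *sql.DB
        Cache Cache
    }

    func handleHello(d helloDeps) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            // uses only d.DB and d.Cache
        })
    }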


I really like the patterns in this post; pretty much what I've also settled on after much experimentation with different styles.


Is there a git repo with example code?


Not OP, but I design my Go projects with a very similar pattern that I learned from OP's 2018 post.

I think this is a pretty good example of a real-world implementation:

https://github.com/mtlynch/picoshare

Particularly these files:

https://github.com/mtlynch/picoshare/blob/2cd9979dab084ca781...

https://github.com/mtlynch/picoshare/blob/2cd9979dab084ca781...



out of curiosity, why no sort-of-established pkg and internal dirs? What do you think of https://github.com/photoprism/photoprism structure?


I'm not familiar with that package structure, unfortunately. It might be good, but I'm not sure what the reasons are for structuring the project that way.


I did write my own HTTP stuff in C (and more generally internet stuff) on Linux (sometimes without a libc, using direct syscalls), running on ARM64 and x86_64.

And I plan to move to rv64 assembly once I can get reasonably performant hardware (it already exists, but it's extremely hard to get given where I am from and how I operate). I don't know yet whether it will be bare metal or on a Linux kernel first (because a minimal TCP stack is already a big piece of work).



TLDR: optimize for unit tests and do DI with explicit function arguments. Looks kind of similar to Dropwizard.





