I am still perplexed by how people judge a language by its declaration syntax alone, and will decide whether or not to use the language purely based on whether they like that one aspect.
The general categories of declarations can be classified as the following:
- `type name = value` (type-focused)
- `name: type = value` (name-focused)
- `var name type = value` (qualifier-focused)
When designing a language, if your semantics are pretty clear, you can trivially change this declaration syntax and the semantics of the language will be mostly the same (if not identical). People seriously think the declaration syntax is what gives a language its “character”. I do not get this train of thought in the slightest. (I probably do have a bias, in that I know how compilers work, but even before I understood them, I chose languages based on my needs, not on whether they “looked good”.)
Syntax Doesn’t Matter Until it Does
A programming language is not merely its syntax. Semantics actually exist, be that denotational semantics (I’ve always found that focusing on the denotational semantics of a language is more important than focusing on the operational semantics because, for me at least, the operational semantics are “obvious” once the denotational semantics are decided upon), operational semantics, or algebraic semantics. The issue is that many inexperienced programmers don’t have this mental distinction and think all languages are mostly the same but with just differing “syntax”. Just wait until they are exposed to a functional programming language or a database language, or even find out that a spreadsheet is a language.
I wrote an article in 2018, On the Aesthetics of the Syntax of Declarations, about the different families of declaration syntax, but it is still weird to me how people view things when deciding whether or not to use a language. To use my language Odin as an example, do people honestly think the semantics would substantially change if its declaration syntax went from this:
// Actual Odin
x: i32 = 123
y := 123 // inferred type
FOO :: "some constant"
bar :: proc() -> i32 {
    return 123
}
to this:
// Qualifier focused
var x i32 = 123
var y = 123 // inferred type
const FOO = "some constant"
proc bar() -> i32 {
    return 123
}
At best the difference here is going to be slightly more typing needed for `var` and `const`, and thus it just becomes a question of ergonomics or “optimizing for typing” (which is never the bottleneck). I’d argue most of the compiler would effectively be the same for the latter approach, since it already has to disambiguate between the different kinds of declarations. (Internally within the compiler, these are called entities; some other compilers use the term symbol, among other terms.) However, each syntax family does have its own trade-offs which allow or prevent certain possibilities.
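To make that point a little more concrete, here is a purely hypothetical sketch, written in Odin for illustration; the names are made up and do not reflect the actual Odin compiler’s internals. Whichever declaration syntax family is parsed, the compiler ends up with roughly the same record per declared name:

package compiler_sketch // hypothetical package, for illustration only

Entity_Kind :: enum {
    Variable,  // x: i32 = 123         or  var x i32 = 123
    Constant,  // FOO :: "..."         or  const FOO = "..."
    Procedure, // bar :: proc() {...}  or  proc bar() {...}
    Type_Name, // Foo :: struct {...}  or  type Foo struct {...}
}

// A hypothetical entity record; real compilers track far more than this.
Entity :: struct {
    kind: Entity_Kind,
    name: string,
    // type information, scope, flags, etc. would also live here
}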
Syntax restricts what semantics are possible.
Semicolons? What is this, 1990?
The other, similar thing I’ve seen numerous times before, across many languages, is this sentiment:
Semicolons in <insert-current-year>? Why do you still have them? Haven’t you learnt that we don’t need them any more?
From what I gather, this sentiment boils down to people simply not understanding why many “modern” languages would still use semicolons.
When I first created Odin, semicolons were mostly required and inferred in many places but I eventually made semicolons fully optional as statement terminators. There were two reasons I made them optional:
- To make the grammar consistent, coherent, and simpler
- To honestly shut up these kinds of bizarre people
That second point might sound silly, but it really was a thing where people were put off even trying Odin due to it having semicolons. The first point was the main reason I did it though, especially since even at work many of my colleagues still use semicolons in the codebase purely out of habit from programming in C/C++ for decades. (We will eventually clean this up, even if removing the unnecessary semicolons from our codebase is as simple as calling `odin strip-semicolon`.)
However, making semicolons optional in a language can come with a few compromises. One option is to design the grammar such that where a semicolon is needed is “obvious” to infer. Lua is an example of such a language: the only time a semicolon is necessary is when you have something that could be misconstrued as being a call:
(function() print("Test1") end)(); -- That semicolon is required
(function() print("Test2") end)()
Another option is to do something like automatic semicolon insertion (ASI) based on a set of rules. Unfortunately, a lot of people’s first experience with this kind of approach is JavaScript and its really poor implementation of it, which means people usually just write semicolons regardless to avoid possible mistakes. However, there are languages that have relatively sane approaches to ASI, such as Go, Python, and Odin.
Go’s approach is purely a lexical rule, which does mean you are forced to do things like use trailing commas in lists that span multiple lines. However, this is probably done not just for simplicity but also to enforce a particular code style.
Python’s and Odin’s approaches are both a lexical rule plus a syntactical rule. Odin’s lexical rule is very similar to Go’s, but with the added syntactical rules it is a lot less annoying to use and allows for more code styling options. Odin’s rules, which are very similar to Python’s, are to ignore newline-based “semicolons” within brackets (`( )` and `[ ]`, and `{ }` used as an expression or record block).
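As a small illustration of that rule (a minimal sketch; the sum procedure and My_Struct type are made up for this example), newlines within ( ) or a record literal’s { } do not act as statement terminators, so calls and literals can span multiple lines freely:

package main

My_Struct :: struct {
    a: int,
    b: string,
}

sum :: proc(values: ..int) -> (total: int) {
    for v in values {
        total += v
    }
    return
}

main :: proc() {
    // Newlines within ( ) are not treated as statement terminators,
    // so a call can span multiple lines without a trailing comma.
    x := sum(
        1, 2, 3,
        4, 5, 6
    )

    // The same applies to { } used as a record literal.
    s := My_Struct{
        a = x,
        b = "twenty-one",
    }
    _ = s
}

This is the trailing-comma situation mentioned above: Go’s purely lexical rule would require a comma after the final 6 (or the closing parenthesis on the same line), whereas Odin’s combined rule does not.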
To allow for things like Allman braces, Odin allows for an extra single newline in many places in its grammar, but only a single extra newline. This is to get around certain ambiguities between declaring a procedure type and a procedure literal:
a_type :: proc()
a_procedure_declaration :: proc() {
}
another_procedure_declaration :: proc()
{
}
another_type :: proc() // note the extra newline separating the signature from a `{`

{ // this is just a block
}
First Exposure Bias
How is it that people literally choose a language based purely on the most minute syntax issues rather than on the (denotational or operational) semantics? Or do most people not actually “program” but just “pattern match” syntax together and hope it works?
Maybe I don’t need to be as cynical, and it is a lot simpler than all of that: first exposure bias. It’s the tendency for an individual to develop a preference simply because they became familiar with it first, rather than it being a rational choice from a plethora of options. People keep to what they are familiar with, which can be rational. But saying they don’t like something without even trying it is a bit irrational.
However, I do think there are rational reasons people do not like the syntax of a language and thus do not use it. Sometimes that syntax is just too incoherent or inconsistent with the semantics of the language. Sometimes it is just too dense and full of sigils (Perl is a language that literally gives me headaches when scanning/reading the code; I am not exaggerating), making it very hard to scan (I am making a distinction between scanning and reading here, which I don’t think some people do; however, that distinction is for another article) and find the patterns within the code. Sometimes it is just so foreign that the time it takes to learn it would be a lot longer than for another alternative.
Syntax Decisions
I’ve written about how C’s declarations match usage, which I’d argue most people don’t realize unless they have written a C parser/compiler (or read the C spec). Most people think C’s declaration syntax is either just type-first or something arcane that you just randomly guess at. For Odin, I designed the syntax for types to be more Pascal-style to improve reading, parsing, and comprehension:
x: [3]^int // array 3 of pointer to int
y: ^[3]int // pointer to array 3 of int
The unfortunate equivalent of this in C would be:
int *x[3]; // array 3 of pointer to int
int (*y)[3]; // pointer to array 3 of int
Instead of following C’s approach of “declarations match usage”, Odin’s approach is “types on the left, usage on the right”:
x: [3]int // type on the LHS
x[1] = 123 // usage on the RHS
y: ^int = ...
y^ = 123
z: [6]^int = ...
z[3]^ = 123
Coupled with Odin’s very strong and orthogonal type system, things just work™ as expected and are easy to comprehend for mere mortals like myself.
I have seen many criticisms of Odin’s usage of a caret ^ for pointers due to it being a dead key on many keyboard layouts. (These criticisms usually come from people who have never tried the language and are only thinking of hypothetical issues. And even on such keyboards, symbols like {} and [] are probably just as hard to type, but they are rarely, if ever, complained about by the very same people.) However, due to Odin’s type inference through :=, auto-dereferencing (x^.y becomes x.y), a much better type system (e.g. actual arrays), and other semantics in general, the need to explicitly write out pointer types is a heck of a lot less common than in C. But that’s the problem when people complain about syntax between languages without even trying them or thinking through the actual consequences of such decisions.
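To make that concrete, here is a small, self-contained sketch (the Vector2 type is made up purely for this example) of how pointers tend to appear in practice: the pointer type is inferred through :=, and field access auto-dereferences, so a ^ rarely needs to be typed at all:

package main

Vector2 :: struct {
    x, y: f32,
}

main :: proc() {
    v: Vector2
    p := &v     // p is inferred as ^Vector2; no pointer type is written out
    p.x = 3     // auto-dereference: no need to write p^.x
    p.y = 4
    length_sq := p.x*p.x + p.y*p.y
    assert(length_sq == 25)
}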
I know I’ve spent a lot of time on Odin’s syntax so that it is as consistent as possible (and “inconsistent” where it is absolutely necessary) and very ergonomic with respect to the semantics of the language, but I don’t think the declaration syntax or optional semicolons of Odin “make or break” the language. For me, the essence of Odin is to feel like a modern C alternative with the joy of programming, rather than to look like C. Even languages which keep as close as possible to the type-focused declaration syntax of C, such as C3, still depart from the rest of C’s syntax for other features and constructs. This is because C’s syntax has a lot of issues which should not be repeated, especially its actual declaration syntax.
Sometimes tiny syntax decisions do add friction, and they do add up. One of the approaches to designing Odin has been to nudge people in a direction which is usually a better way of doing things, rather than cause direct friction. However, if friction is needed, what is usually needed is the equivalent of a brick wall, not sandpaper. But the syntax in this case is a reflection of the semantics of the language itself, and that’s what many people seem to misunderstand. Syntax is not everything; semantics are the actual foundation of a language.
Ignoring Such Opinions
If you’re a fellow language designer, honestly: ignore these people. Everyone has an opinion, but that opinion might not be of value to anyone, even the person who holds it.
If a person complains about the general category (not the specifics) of a syntax decision in your language, such as the declaration syntax, the use of semicolons or not, whether the core/standard library uses snake_case or camelCase for procedure names, or some other asinine position: just ignore them.
Conclusion
I think a lot of the reason people judge languages based on such “minor” syntactic decisions is that they don’t have much experience with other programming languages. I’ve found that as people become more experienced with programming and other programming languages, this sentiment disappears entirely and people just focus on programming. The syntax is just there for reading, not for “appreciating”.
Look for the opinions of people whom you value and deem to be of worth, not those of some rando off the internet.
Please don’t choose a language solely for its syntax. Consider the actual language semantics since they will be the things that affect you the most down the line.