Choosing a Language Based on Its Syntax?

Original link: https://www.gingerbill.org/article/2026/02/19/choosing-a-language-based-on-syntax/

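## Syntax Should Not Define a Language

The author argues that judging a programming language by its syntax alone, and by its declaration style in particular, is a mistake. Syntax *can* affect usability, but it can easily be changed without fundamentally altering the language's core *semantics* (its meaning and behavior). The different declaration styles (type-focused, name-focused, qualifier-focused) are largely ergonomic choices rather than defining characteristics.

The author stresses that understanding a language's underlying semantics, how it *actually* works, matters more than its surface appearance. Inexperienced programmers often confuse syntax with substance and miss the deeper distinctions.

Small syntactic choices, such as the use of semicolons, often provoke disproportionate debate. The author describes how semicolons became optional in Odin, both for the sake of grammatical consistency and to avoid putting off potential users. Ultimately, good syntax should reflect and support the language's semantics rather than get in the way of understanding.

The core message: experienced programmers focus on what a language *can do*, not merely on how it *looks*. Language designers should prioritize clear semantics and consistent, ergonomic design, and should not be swayed by superficial stylistic preferences.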

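A Hacker News discussion revolves around the importance of language syntax. One commenter compares good syntax to good weather: when it works well, people do not notice it; it becomes transparent and lets them focus on the *meaning* (semantics) of the code.

Another commenter argues that syntax reflects the language designer's skill and awareness of established best practice, suggesting that certain choices (such as `type name` versus `name: type`) are red flags. Poor syntax causes programmers constant friction.

The conversation also touches on implicit syntax rules, using Python's newline-based "semicolons" as an example. The point is that overly clever or easily forgotten rules should be avoided in favor of explicit, if possibly more verbose, syntax, for the sake of overall clarity and usability. Ultimately, syntax is the primary interface to a language, so its quality matters a great deal.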

Original article

I am still perplexed by how people judge a language purely by its declaration syntax, and will decide whether to use the language purely based on whether they like that aspect or not.

The general categories of declarations can be classified as the following:

  • type name = value—type-focused
  • name: type = value—name-focused
  • var name type = value—qualifier-focused

When designing a language, if your semantics are pretty clear you can trivially change this declaration syntax and the semantics of the language will be mostly the same (if not identical). People seriously think the declaration syntax is what gives a language its “character”. I do not get this train of thought in the slightest 

Syntax Doesn’t Matter Until it Does

A programming language is not merely its syntax. Semantics actually exist, be that denotation semantics 

I wrote an article in 2018 

// Actual Odin
x: i32 = 123
y := 123 // inferred type
FOO :: "some constant"
bar :: proc() -> i32 {
    return 123
}

to this:

// Qualifier focused
var x i32 = 123
var y = 123 // inferred type
const FOO = "some constant"
proc bar() -> i32 {
    return 123
}

At best the difference here is going to be slightly more typing needed for var and const, and thus just becomes a question of ergonomics or “optimizing for typing” (which is never the bottleneck). I’d argue most of the compiler would effectively be the same for the latter approach, since it already has to disambiguate between the different kinds of declarations 
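As a toy illustration of that point (a sketch only, not how the Odin compiler is structured), the C code below parses the three declaration spellings into one shared `Decl` record. The `Decl` struct and the three `parse_*` functions are hypothetical names invented for this example, and the `sscanf`-based parsing only handles single-token names, types, and values. What it shows is that once the front end has produced a `Decl`, nothing downstream can tell which spelling was used.

```c
#include <stdio.h>
#include <string.h>

/* One declaration after parsing, independent of its surface spelling. */
typedef struct {
    char name[32];
    char type[32];
    char value[32];
} Decl;

/* Hypothetical front end: `name: type = value` (name-focused). */
int parse_name_focused(const char *src, Decl *out) {
    return sscanf(src, " %31[^: ] : %31s = %31s",
                  out->name, out->type, out->value) == 3;
}

/* Hypothetical front end: `var name type = value` (qualifier-focused). */
int parse_qualifier_focused(const char *src, Decl *out) {
    return sscanf(src, " var %31s %31s = %31s",
                  out->name, out->type, out->value) == 3;
}

/* Hypothetical front end: `type name = value` (type-focused). */
int parse_type_focused(const char *src, Decl *out) {
    return sscanf(src, " %31s %31s = %31s",
                  out->type, out->name, out->value) == 3;
}

/* Later compiler stages would only ever see a Decl, so two declarations
   are "the same" when these three fields match. */
int decl_equal(const Decl *a, const Decl *b) {
    return strcmp(a->name, b->name) == 0 &&
           strcmp(a->type, b->type) == 0 &&
           strcmp(a->value, b->value) == 0;
}
```

Feeding `x: i32 = 123`, `var x i32 = 123`, and `i32 x = 123` to their respective parsers yields three records for which `decl_equal` returns 1.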

Syntax restricts the possibilities of what semantics are possible.

Semicolons? What is this, 1990?

The other similar thing I’ve seen numerous times before across numerous languages:

Semicolons in <insert-current-year>? Why do you still have them, haven’t you learnt that we don’t need them any more?

Some Rando

From what I gather, this sentiment of not understanding why many “modern” languages still use semicolons is either:

When I first created Odin, semicolons were mostly required and inferred in many places but I eventually made semicolons fully optional as statement terminators. There were two reasons I made them optional:

  • To make the grammar consistent, coherent, and simpler
  • To honestly shut up these kinds of bizarre people

That second point might sound silly but it really was a thing where people were put off even trying Odin due to it having semicolons. The first point was the main reason I did it though, especially since even at work, many of my colleagues still use semicolons in the codebase purely out of habit from programming in C/C++ for decades 

However, making semicolons optional in a language can come with a few compromises. One option is to design the grammar such that it is “obvious” where semicolons are inferred. Lua is an example of such a language, where a semicolon is only necessary when something could otherwise be misconstrued as a call:

(function() print("Test1") end)(); -- That semicolon is required
(function() print("Test2") end)()

Another option is to do something like automatic semicolon insertion (ASI) based on a set of rules. Unfortunately, a lot of people’s first experience with this kind of approach is JavaScript and its really poor implementation of it, which means people usually just write semicolons regardless to remove the possible mistakes. However there are languages that have relatively sane approaches to ASI such as Go, Python, and Odin.

Go’s approach is purely a lexical rule, which does mean you are forced to do things like trailing commas in lists that span multiple lines. However this is probably not just done for simplicity but also to enforce a code styling.
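A purely lexical rule of this kind can be sketched in a few lines of C. The function below follows the rule as stated in the Go specification: a semicolon is inserted after a line's final token if that token is an identifier, a literal, one of the keywords `break`, `continue`, `fallthrough`, `return`, or one of `++`, `--`, `)`, `]`, `}`. The function name `go_inserts_semicolon_after` is made up for this example, and the token classification is deliberately simplified (ASCII identifiers only, literals recognised by their first character), so treat it as an approximation rather than Go's actual lexer.

```c
#include <ctype.h>
#include <string.h>

/* Returns 1 if `tok` appears in the NULL-terminated list `set`. */
static int in_set(const char *tok, const char *const *set) {
    for (; *set != NULL; ++set)
        if (strcmp(tok, *set) == 0) return 1;
    return 0;
}

/* Returns 1 if a Go-style lexer would insert ';' after a line whose
   final token is `tok`, and 0 otherwise. */
int go_inserts_semicolon_after(const char *tok) {
    static const char *const terminating[] = {
        "break", "continue", "fallthrough", "return",
        "++", "--", ")", "]", "}", NULL
    };
    /* The remaining Go keywords never trigger insertion. */
    static const char *const other_keywords[] = {
        "case", "chan", "const", "default", "defer", "else", "for", "func",
        "go", "goto", "if", "import", "interface", "map", "package", "range",
        "select", "struct", "switch", "type", "var", NULL
    };
    unsigned char first = (unsigned char)tok[0];

    if (in_set(tok, terminating)) return 1;
    if (in_set(tok, other_keywords)) return 0;
    if (isalpha(first) || first == '_') return 1;      /* identifier */
    if (isdigit(first) || first == '"' || first == '\'' || first == '`')
        return 1;                                      /* literal */
    if (first == '.' && isdigit((unsigned char)tok[1]))
        return 1;                                      /* literal like .5 */
    return 0;  /* operators and punctuation such as , { ( + */
}
```

This is why the trailing comma is forced: a line ending in the literal `3` gets a semicolon, which would land in the middle of a multi-line list, whereas a line ending in `,` does not.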

Python and Odin’s approach is both a lexical rule + syntactical rule. Odin’s lexical rule is very similar to Go’s but with the added syntactical rules, it makes it a lot less annoying to use and allows for more code styling options. Odin’s rules, which are very similar to Python’s, are to ignore newline-based “semicolons” within brackets (( ) and [ ], and { } used as an expression or record block).
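The "ignore newlines within brackets" part can be approximated by tracking bracket depth. The C sketch below counts how many newline-terminated statements a piece of source text contains, treating a newline as a terminator only at depth zero. The function name `count_statements` is hypothetical, and the sketch makes several simplifying assumptions: it tracks only `( )` and `[ ]` (deciding whether a `{ }` is an expression needs syntactic context, which is exactly the non-lexical part), and it does not understand string literals or comments.

```c
#include <ctype.h>

/* Counts statements in `src`, where a newline ends a statement only when
   it occurs outside all ( ) and [ ] pairs. Blank lines are ignored. */
int count_statements(const char *src) {
    int depth = 0;      /* current nesting of ( and [ */
    int count = 0;      /* statements terminated so far */
    int has_token = 0;  /* has the current statement any content yet? */

    for (const char *p = src; *p != '\0'; ++p) {
        switch (*p) {
        case '(': case '[':
            depth++;
            has_token = 1;
            break;
        case ')': case ']':
            if (depth > 0) depth--;
            has_token = 1;
            break;
        case '\n':
            /* Inside brackets the newline is not a "semicolon". */
            if (depth == 0 && has_token) {
                count++;
                has_token = 0;
            }
            break;
        default:
            if (!isspace((unsigned char)*p)) has_token = 1;
            break;
        }
    }
    if (has_token) count++;  /* last statement without a trailing newline */
    return count;
}
```

With this rule, a call whose arguments are spread over three lines still counts as one statement, with no trailing-comma requirement.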

To allow for things like Allman braces, Odin allows for an extra single newline in many places in its grammar, but only a single extra newline. This is to get around certain ambiguities between declaring a procedure type and a procedure literal:

a_type :: proc()

a_procedure_declaration :: proc() {

}

another_procedure_declaration :: proc()
{

}

another_type :: proc() // note the extra newline separating the signature from a `{`

{ // this is just a block

}

First Exposure Bias

How is it that people literally choose a language purely on the most minute syntax issues rather than on the (denotational or operational) semantics? Or do most people not actually “program” but just “pattern match” syntax together and hope it works?

Maybe I don’t need to be as cynical and it is a lot simpler than all of that: first exposure bias. It’s the tendency for an individual to develop a preference for something simply because they became familiar with it first, rather than through a rational choice from a plethora of options. People keep to what they are familiar with, which can be rational. But saying they don’t like something without even trying it is a bit irrational.

However I do think there are rational reasons people do not like a syntax of a language and thus do not use it. Sometimes that syntax is just too incoherent or inconsistent with the semantics of the language. Sometimes it is just too dense and full of sigils 

Syntax Decisions

I’ve written about how C’s declarations match usage, which I’d argue most people don’t realize unless they have made a C parser/compiler 

x: [3]^int // array 3 of pointer to int
y: ^[3]int // pointer to array 3 of int

The unfortunate equivalent of this in C would be:

int *x[3];   // array 3 of pointer to int
int (*y)[3]; // pointer to array 3 of int
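To make "declarations match usage" concrete, the short C sketch below uses both declarations and reads an `int` back out of each. The function names and the sample values are made up for the example. The point is that the expression needed to reach an `int` has the same shape as the declarator: `*x[i]` for `int *x[3]`, and `(*y)[i]` for `int (*y)[3]`.

```c
/* `int *x[3]`: x is an array of 3 pointers to int.
   Reaching an int is spelled `*x[i]`, the same shape as the declarator. */
int sum_array_of_pointers(void) {
    int a = 1, b = 2, c = 3;
    int *x[3] = { &a, &b, &c };
    return *x[0] + *x[1] + *x[2];
}

/* `int (*y)[3]`: y is a pointer to an array of 3 ints.
   Reaching an int is spelled `(*y)[i]`, again mirroring the declarator. */
int sum_pointer_to_array(void) {
    int arr[3] = { 1, 2, 3 };
    int (*y)[3] = &arr;
    return (*y)[0] + (*y)[1] + (*y)[2];
}
```

Both functions return the same sum; only the declarator, and therefore the access expression, differs.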

Instead of following C’s approach of “declarations match usage”, Odin’s approach is “types on the left, usage on the right”:

x: [3]int  // type on the LHS
x[1] = 123 // usage on the RHS

y: ^int = ...
y^ = 123

z: [6]^int = ...
z[3]^ = 123

Coupled with Odin’s very strong and orthogonal type system, things just work™ as expected and are easy to comprehend for mere mortals like myself.

I have seen many criticisms of Odin’s usage of a caret ^ for pointers 

I know I’ve spent a lot of time on Odin’s syntax so that it is as consistent as possible 

Sometimes tiny syntax decisions do add friction, and they do add up. One of the approaches to designing Odin has been to nudge people in a direction which is usually a better way of doing things, rather than cause direct friction. However if friction is needed, usually what is needed is the equivalent of a brick wall, not sandpaper. But the syntax in this case is a reflection of the semantics of the language itself, and that’s what many people seem to misunderstand. Syntax is not everything, semantics are the actual foundation of a language.

Ignoring Such Opinions

If you’re a fellow language designer, honestly: ignore these people. Everyone has an opinion, but that opinion might not be of value to anyone, even the person who holds it.

If a person complains about the general category (not the specifics) of a syntax decision in your language, such as the declaration syntax, the use of semicolons or not, whether the core/standard library uses snake_case or camelCase for procedure names, or some other asinine position: just ignore them.

Conclusion

I think a lot of the reason people judge languages based on such “minor” syntactic decisions is probably that they don’t have much experience with other programming languages. I’ve found that as people become more experienced with programming and other programming languages, this sentiment disappears entirely and people just focus on programming. The syntax is just there for reading, not for “appreciating”.

Look for the opinions of people that you do value and deem to be of worth, not some rando’s off the internet.

Please don’t choose a language solely for its syntax. Consider the actual language semantics since they will be the things that affect you the most down the line.
