Parse, Don't Validate (C)
Parse, Don’t Validate – Some C Safety Tips

Original link: https://www.lelanthran.com/chap13/content.html

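Lelanthran's "Parse, Don't Validate" approach improves C safety by converting data early, at the system boundary. Instead of repeatedly validating untrusted input throughout the code, it parses that input on entry into strongly typed opaque structures (for example, `email_t` and `name_t`). This ensures that internal functions only receive validated, type-safe data, which minimizes vulnerabilities. The approach offers three key advantages: encapsulation, which confines raw `char *` to the edges of the system and so prevents its misuse internally; a reduced attack surface, because core functions no longer handle unvalidated data; and compiler-enforced type safety, where accidentally swapped parameters become compile-time errors. Destructor functions should also set the pointers they free to NULL, which makes the code more robust. By prioritizing parsing and strong typing, developers can use C's existing type system to build more robust and secure applications.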
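The ideas in this summary can be sketched roughly as follows. This is a minimal illustration, not the article's own code: the function names (`email_parse`, `name_parse`, `email_del`, `name_del`, `greeting_length`) and the parsing rules are assumptions chosen for brevity, and the struct bodies are shown inline only to keep the sketch in one file.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper types. In a real project the struct bodies would
 * live in a .c file so that callers only ever see opaque pointers. */
typedef struct email_t { char *value; } email_t;
typedef struct name_t  { char *value; } name_t;

/* Copy a string onto the heap; returns NULL on allocation failure. */
static char *dup_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/* Parse untrusted text into an email_t. The rule used here (exactly one
 * '@' with text on both sides) is a deliberately simplistic placeholder. */
email_t *email_parse(const char *untrusted)
{
    if (untrusted == NULL)
        return NULL;
    const char *at = strchr(untrusted, '@');
    if (at == NULL || at == untrusted || at[1] == '\0'
        || strchr(at + 1, '@') != NULL)
        return NULL;

    email_t *e = malloc(sizeof *e);
    if (e == NULL)
        return NULL;
    e->value = dup_string(untrusted);
    if (e->value == NULL) {
        free(e);
        return NULL;
    }
    return e;
}

/* Parse untrusted text into a name_t. Placeholder rule: non-empty. */
name_t *name_parse(const char *untrusted)
{
    if (untrusted == NULL || untrusted[0] == '\0')
        return NULL;

    name_t *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->value = dup_string(untrusted);
    if (n->value == NULL) {
        free(n);
        return NULL;
    }
    return n;
}

/* Destructors take the address of the caller's pointer so that they can
 * reset it to NULL after freeing. Calling them twice is harmless. */
void email_del(email_t **e)
{
    if (e == NULL || *e == NULL)
        return;
    free((*e)->value);
    free(*e);
    *e = NULL;
}

void name_del(name_t **n)
{
    if (n == NULL || *n == NULL)
        return;
    free((*n)->value);
    free(*n);
    *n = NULL;
}

/* Internal code accepts only parsed types, never a raw char *. */
size_t greeting_length(const name_t *name, const email_t *email)
{
    assert(name != NULL && email != NULL);
    return strlen(name->value) + strlen(email->value);
}
```

Because `name_t *` and `email_t *` are incompatible pointer types, a call with swapped arguments such as `greeting_length(email, name)` is diagnosed by the compiler (as a warning or an error, depending on the compiler and flags), whereas two swapped `char *` arguments would compile silently.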

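This Hacker News thread discusses a blog post advocating "Parse, Don't Validate" (PDV) in C to improve safety and maintainability. The core idea is to parse raw input into strongly typed data structures at the edge of the program, minimizing validation checks throughout the codebase. Commenters debate the practicality and nuances of this approach in C. Some agree that early parsing and the use of newtypes improve code quality and prevent bugs. Others raise concerns about boilerplate, potential performance overhead, and an explosion of types. The use of the `_t` suffix for custom types is also discussed, with some warnings because POSIX reserves it. Some commenters consider error handling via NULL pointers to be a form of validation, and there is concern that the article's example code is better described as "parse, then validate". The key takeaway is the trade-off between type safety and flexibility: encoding more constraints in the type system prevents errors, but it can also make code more rigid and complex. Most commenters appear to agree with the case for early parsing and type safety.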

Original text
Parse, Don’t Validate AKA Some C Safety Tips

“A good programmer is someone who looks both ways before crossing a one-way street.” – Doug Linder


Posted by Lelanthran

2025-03-27

If you’ve read the original post on “Parse, Don’t Validate” you may have noticed that it focuses primarily on conceptual correctness. Here, I’ll build on that by showing how this technique can be used outside of niche academic languages by demonstrating it in a language that is as practical as it is dangerous - C.

In this blog post you will see three techniques for reducing the risk of exploitable errors in C.

The basic idea is this:

  1. Data Comes Into Your System.
  2. Your System Processes It.

Your first instinct, when your system receives as input an email address (for example), is to perform validateEmail(untrustedInput) and then pass the validated string further into the depths of the system for usage.

The problem is that other code deep within the rest of the system is going to also do some sort of validation on the string they just got. Every single function deep within the bowels of the system will still need to validate the input before processing it.

I’ll bet good money that the processing functions will attempt to validate their input. Because they’re logically far away from the boundary, they’ll either do it a different way or fail to do it altogether.
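To make that drift concrete, here is a small sketch of how it tends to look. Every function name and rule below is hypothetical and intentionally simplistic; the point is only that three pieces of code handling the same raw string end up with three different ideas of "valid".

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Boundary check (hypothetical): requires an '@' with text on both sides. */
bool validate_email(const char *s)
{
    if (s == NULL)
        return false;
    const char *at = strchr(s, '@');
    return at != NULL && at != s && at[1] != '\0';
}

/* Deep inside the system (hypothetical): written later, re-checks the
 * same raw string but only looks for an '@' anywhere in it. */
bool send_welcome_mail(const char *email)
{
    if (email == NULL || strchr(email, '@') == NULL)
        return false;   /* rejected */
    /* ... the mail would be sent here ... */
    return true;        /* accepted */
}

/* Another internal function (hypothetical) that does no validation of its
 * own and merely assumes a caller did it. The assert disappears entirely
 * when the code is compiled with NDEBUG. */
size_t local_part_length(const char *email)
{
    const char *at = strchr(email, '@');
    assert(at != NULL);
    return (size_t)(at - email);
}
```

With these definitions, the string "@example.com" is rejected by `validate_email` but accepted by `send_welcome_mail`, and `local_part_length` is only safe if every caller remembered to check first.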

So instead of this:

read this post.

When you create the correct types for data entering the system, you can then do this: