You Don't Need Kafka: Building a Message Queue with Unix Signals
Building a message queue with only two UNIX signals

Original link: https://leandronsp.com/articles/you-dont-need-kafka-building-a-message-queue-with-only-two-unix-signals

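## UNIX signal message broker: summary

This article walks through building a functional message broker using only two UNIX signals, SIGUSR1 and SIGUSR2. The author shows how to “hack” these signals, traditionally used for process control, to transmit data by encoding messages as binary sequences.

The core idea is to represent bits as signals (SIGUSR1 for 0, SIGUSR2 for 1). The sender encodes a message into bytes, then into individual bits, and sends the corresponding signal to the receiver. The receiver accumulates the bits, reconstructs the bytes, and decodes them back into characters.

The project extends this concept into a basic broker system with producers, a broker, and consumers. The broker receives the bit stream, queues messages, and forwards them to registered consumers.

Although not intended for production use, the experiment highlights fundamental concepts such as binary operations, inter-process communication (IPC), and the power of understanding system primitives. It is a fun exploration showing that even seemingly limited tools can achieve surprisingly sophisticated results with creativity and a deep understanding of how things work. The author encourages experimentation and “useless” projects as valuable learning experiences.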

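## Building a message queue with UNIX signals: a fun experiment

A recent Hacker News post details a project in which the author built a message broker using only two UNIX signals and a few Ruby scripts. Although explicitly *not* intended for production use, and certainly not a Kafka replacement, the project is a fun exploration of fundamental operating system concepts.

The author and commenters emphasize the educational value: understanding binary operations, UNIX signals, and inter-process communication (IPC) through hands-on experimentation. The core idea is to show that complex systems can be built from basic building blocks, even when doing so is impractical.

The discussion revolves around the limitations of using signals for queuing (potential signal loss), the availability of real-time signals for more reliable delivery, and the accuracy of the somewhat clickbait title. Many commenters appreciate the playful hacker spirit and encourage similar experiments, stressing the importance of learning for fun. Ultimately, the project demonstrates the power of hands-on practice and is a reminder that understanding core computing principles does not always require complex tools.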

Original article

Have you ever asked yourself what if we could replace any message broker with a very simple one using only two UNIX signals? Well, I’m not surprised if you didn’t. But I did. And I want to share my journey of how I achieved it.

If you want to learn about UNIX signals, binary operations the easy way, how a message broker works under the hood, and a bit of Ruby, this post is for you.

And if you came here just because of the clickbait title, I apologize and invite you to keep reading. It’ll be fun, I promise.


Wikipedia:

A UNIX signal is a standardized message sent to a program to trigger specific behaviour, such as quitting or error handling

There are many signals we can send to a process, including:

  • SIGTERM - sends a notification to the process to terminate. It can be “trapped,” which means the process can do some cleanup work before termination, like releasing OS resources and closing file descriptors
  • SIGKILL - sends a termination signal that cannot be trapped or ignored, forcing immediate termination
  • SIGINT - the interrupt signal, typically sent when you press Ctrl+C in the terminal. It can be trapped, allowing the process to perform cleanup before exiting gracefully
  • SIGHUP - the hangup signal, originally sent when a terminal connection was lost. Modern applications often use it to reload configuration files without restarting the process
  • SIGQUIT - similar to SIGINT but also generates a core dump for debugging
  • SIGSTOP - pauses (suspends) a process. Cannot be trapped or ignored
  • SIGCONT - resumes a process that was paused by SIGSTOP
  • SIGCHLD - sent to a parent process when a child process terminates or stops
  • SIGUSR1 and SIGUSR2 - user-defined signals that applications can use for custom purposes
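To make “trapping” a bit more concrete, here is a minimal sketch in Ruby. The helper name `trap_and_signal_self` and the timeout are made up for illustration; it installs a handler for SIGUSR1, sends that signal to the current process, and waits briefly for the handler to run.

```ruby
# Hypothetical helper (not from the article): trap a signal, send it to
# ourselves, and return what the handler recorded.
def trap_and_signal_self(signal = 'USR1', timeout: 2.0)
  received = []
  # Signal.trap returns the previous handler so we can restore it later.
  previous = Signal.trap(signal) { received << signal }

  Process.kill(signal, Process.pid)

  # Handlers run asynchronously, so wait (with a deadline) until it fired.
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  sleep 0.01 while received.empty? &&
                   Process.clock_gettime(Process::CLOCK_MONOTONIC) < deadline

  received
ensure
  Signal.trap(signal, previous) if previous
end

puts "Trapped: #{trap_and_signal_self.inspect}"
```

The same pattern would not work for SIGKILL or SIGSTOP, since those cannot be trapped.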

simulated OOP in Bash a couple of years ago (it was fun though).

To understand how we can “hack” UNIX signals and send messages between processes, let’s first talk a bit about binary operations. Yes, those “zeros” and “ones” you were scared of when you saw them for the first time. But they don’t bite (🥁 LOL), I promise.

According to ASCII, we know that the letter “h” has the following codes:

  • 104 in decimal
  • 0x68 in hexadecimal
  • 01101000 in binary
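These three representations can be checked quickly in Ruby. The snippet below is only a sketch; the method name `representations` is an assumption introduced here for illustration.

```ruby
# Hypothetical helper: return the decimal, hexadecimal, and binary
# representations of a single ASCII character.
def representations(char)
  code = char.ord
  {
    decimal: code,
    hex: format('0x%02x', code),   # two hex digits
    binary: format('%08b', code)   # padded to a full byte (8 bits)
  }
end

representations('h').each { |name, value| puts "#{name}: #{value}" }
# decimal: 104
# hex: 0x68
# binary: 01101000
```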

Binary-wise, what if we represented each “0” with a specific signal and each “1” with another? We know that some signals such as SIGTERM, SIGINT, and SIGCONT can be trapped, but intercepting them would harm their original purpose.

But thankfully, UNIX provides two user-defined signals that are perfect for our hacking experiment.

According to the ASCII table, the remaining letters of “hello” should be the following:

  • e in decimal is 101
  • l in decimal is 108
  • o in decimal is 111

Let’s check if Ruby knows that:

104.chr # "h"
101.chr # "e"
108.chr # "l"
111.chr # "o"

We can even “decode” the word to the decimal representation in ASCII:

irb> "hello".bytes

=> [104, 101, 108, 108, 111]
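To connect the decimal codes back to bits, the following sketch prints each byte of “hello” next to its 8-bit binary form and then rebuilds the word from the bytes. The helper name `byte_table` is made up for this example.

```ruby
# Hypothetical helper: for each byte of the word, return
# [character, decimal code, 8-bit binary string].
def byte_table(word)
  word.bytes.map { |byte| [byte.chr, byte, format('%08b', byte)] }
end

byte_table('hello').each do |char, decimal, binary|
  puts "#{char}  #{decimal}  #{binary}"
end
# h  104  01101000
# e  101  01100101
# l  108  01101100
# l  108  01101100
# o  111  01101111

# Going the other way: bytes back to the original word.
puts 'hello'.bytes.map(&:chr).join # hello
```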

Now, time to finish our receiver implementation to properly print the letter “h”:

@position = 0 # start with the LSB
@accumulator = 0

trap('SIGUSR1') { decode_signal(0) }
trap('SIGUSR2') { decode_signal(1) }

def decode_signal(bit)
  accumulate_bit(bit)
  return unless @position == 8 # if not yet accumulated a byte, keep accumulating

  print "Received byte: #{@accumulator} (#{@accumulator.chr})\n"
  @accumulator = 0 # reset the accumulator
  @position = 0 # reset position for the next byte
end

def accumulate_bit(bit)
  # The left shift operator (<<) is used to
  # shift the bits of the number to the left.
  #
  # This is equivalent to: (2 ** @position) * bit
  @accumulator += (bit << @position)
  @position += 1 # move to the next bit position: 0 becomes 1, 1 becomes 2, etc.
end

puts "Process ID: #{Process.pid}"
sleep

Read that code and its comments. It’s very important. Do not continue reading until you really get what’s happening here.

  • Whenever we get SIGUSR1, we accumulate the bit 0
  • Whenever we get SIGUSR2, we accumulate the bit 1
  • When the position reaches 8, it means we have a full byte. At this moment we print the ASCII representation using the .chr we saw earlier, then reset the bit position and the accumulator
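If the shift-and-add step still feels abstract, it can be tried without any signals at all. The sketch below applies the same accumulation to a plain array of bits; `decode_bits` is a hypothetical name, and the bits are assumed to arrive least significant bit first, as in the receiver.

```ruby
# Hypothetical helper: rebuild a byte from bits given LSB first,
# using the same shift-and-add step as the receiver.
def decode_bits(bits)
  accumulator = 0
  bits.each_with_index do |bit, position|
    # (bit << position) is the same as (2 ** position) * bit
    accumulator += (bit << position)
  end
  accumulator
end

bits = [0, 0, 0, 1, 0, 1, 1, 0] # "h", least significant bit first
byte = decode_bits(bits)
puts "#{byte} (#{byte.chr})" # 104 (h)
```

Only positions 3, 5, and 6 hold a 1, so the sum is 8 + 32 + 64 = 104.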

Let’s see our receiver in action! Start the receiver in one terminal:

$ ruby receiver.rb

Process ID: 58219

Great! Now the receiver is listening for signals. In another terminal, let’s manually send signals to form the letter “h” (which is 01101000 in binary, remember?):

# Sending from LSB to MSB: 0, 0, 0, 1, 0, 1, 1, 0
$ kill -SIGUSR1 58219 # 0
$ kill -SIGUSR1 58219 # 0
$ kill -SIGUSR1 58219 # 0
$ kill -SIGUSR2 58219 # 1
$ kill -SIGUSR1 58219 # 0
$ kill -SIGUSR2 58219 # 1
$ kill -SIGUSR2 58219 # 1
$ kill -SIGUSR1 58219 # 0

And in the receiver terminal, we should see:

Received byte: 104 (h)

How amazing is that? We just sent the letter “h” using only two UNIX signals!
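As a sanity check on the sequence we just typed, here is a small sketch that derives the same eight `kill` commands from the character itself. It only prints the commands and sends nothing; the names `SIGNAL_FOR_BIT`, `bits_lsb_first`, and `kill_commands` are assumptions made up for this example, and the PID is the one from the run above.

```ruby
# Bit-to-signal mapping used by the receiver: 0 is SIGUSR1, 1 is SIGUSR2.
SIGNAL_FOR_BIT = { 0 => 'SIGUSR1', 1 => 'SIGUSR2' }.freeze

# Hypothetical helper: the 8 bits of a byte, least significant bit first.
def bits_lsb_first(byte)
  (0...8).map { |position| (byte >> position) & 1 }
end

# Hypothetical helper: the shell commands that would send one character.
def kill_commands(char, pid)
  bits_lsb_first(char.ord).map do |bit|
    "kill -#{SIGNAL_FOR_BIT.fetch(bit)} #{pid} # #{bit}"
  end
end

puts kill_commands('h', 58219)
```

The printed lines should match the commands above, in the same order.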

But wait. Manually sending 8 signals for each character? That’s tedious and error-prone. What if we wanted to send the word “hello”? That’s 5 characters × 8 bits = 40 signals to send manually. No way.

We need a sender.