02 WHIP
Witty House Infrastructure Processor
PV — Perl Integration
First tools: Victron Modbus + ECS BMS — all in Perl
$ ecs_bms_tool -range 1-16 # query all battery modules
$ ecs_bms_tool -get cell_voltage -get cell_temperature
$ ecs_bms_tool -otype json # JSON for pipeline integration
$ Wmodbus discover 192.168.2.0/24 # find Modbus devices on network
$ Wmodbus --host 192.168.2.201 --unit 2 read holding 0-10
$ Wmodbus --host 192.168.2.201 --profile vents-dbe900l monitor
ecs_bms_tool — ECS LiPro BMS management (SoC, cell voltage, balancing)
Wmodbus — Modbus TCP/RTU: discovery, read/write, device profiles, monitoring
Wcli — solar irradiance & PV power calculator
Wthermal — physics-based house thermal model
Scripts work. But a house is more than solar panels.
Looking for a Smarthome
Preferably in Perl, obviously.
WHIP — "I'll build my own."
The FHEM Experience
"We do not like CPAN" — dependencies create problems. So we reimplement everything ourselves. But worse.
"We do not like PBP" — contributions are done by amateurs. Too high expectations would kill contribution.
"Efficient algorithms are overrated" — "So what? That's 0.1s faster?"
"Tests? TDD? That's superfluous work!"
"I don't like you, you cannot use my GPL code"
(FHEM people: no offense. Well, maybe a little.)
Nice idea. Wrong execution.
✓ Open-source, DIY, community-driven
✓ Tree topology with auto-routing & self-healing
✓ Up to 254 nodes × 254 sensors — decent scale
✗ Arduino — ATmega328? For a house? In 2020?
✗ Arduino software model — Endless loop with stuff in it.
✗ RS485 / nRF24L01+ — Master-Slave Architecture
✗ Text protocol — semicolon-delimited ASCII over serial. In 2020.
✗ No autonomous operation — nodes depend on gateway/controller
I felt there had to be something better.
Birth of a Node
Multi-master · Inherent collision resolution · Resilient
Good enough for cars for decades. Industrial standard.
Good for 20–30m runs. Plenty for a house.
72 MHz ARM Cortex-M3 · 7× faster than Arduino
CAN peripheral built-in · $1.50 in 2020
Real tasks · Priorities · Preemption · Hardware abstraction
RobotDyn Black Pill · STM32F103C8T6
Birth of a Node
Multi-master · Inherent collision resolution · Resilient
Good enough for cars for decades. Industrial standard.
Good for 20–30m runs. Plenty for a house.
72 MHz ARM Cortex-M3 · 7× faster than Arduino
CAN peripheral built-in · $15 in 2020 (COVID!)
Real tasks · Priorities · Preemption · Hardware abstraction
Blue Pill · STM32F103C8T6
Birth of a Node
Multi-master · Inherent collision resolution · Resilient
Good enough for cars for decades. Industrial standard.
Good for 20–30m runs. Plenty for a house.
72 MHz ARM Cortex-M3 · 7× faster than Arduino
CAN peripheral built-in · no stock! → DIY
Real tasks · Priorities · Preemption · Hardware abstraction
Green Pill · STM32F103C8T6 · DIY
Why CAN?
Why CAN? Hardware arbitration (CSMA/CR) · true multi-master · 1 Mbit/s · differential · industrial grade
Why not WiFi/Zigbee? No batteries to die. No mesh to collapse. Building for 50 years, not 5.
Why not RS485? No arbitration. Master-slave only. Two nodes transmit = garbage.
Why not KNX? 9600 baud (1990s design). Expensive. Closed ecosystem.
Birth of a Hub
So you have 20 nodes — now what?
Hub assembly · DIN rail mount
Waveshare 2-CH CAN HAT
RasPi 4B/5 · 2-ch CAN HAT · Relay Board · DIN Rail Mount · CAN/IP & CAN/CAN Gateway
WHIP Architecture
Nodes — STM32 MCUs · FreeRTOS · 1MBit CAN bus · Autonomous C / Embedded
Hubs — RasPi · CAN/IP gateway · Hub Aggregation · Protocol bridges · Mojolicious Perl
Server — Orchestration · External connectivity Perl
Higher layers are always a supplement, never a requirement.
Nodes — STM32 + FreeRTOS
Hardware
STM32F103 (Cortex-M3, 72 MHz)
STM32F303 (Cortex-M4F, FPU)
Native bxCAN controller
Software
FreeRTOS · libopencm3
No vendor HAL lock-in
One YAML = one firmware
115+
sensor/actuator modules
🌡️ BME280 · DS18x20 · SHT3x · NTC
⚡ INA219 · INA226 · ACS712 · ADC
💡 DALI · WS281x · PWM dimmer · SSR
🔌 PCF857x · MCP23017 · relay · GPIO
📡 LoRa · Modbus RTU · 1-Wire · SPI
🖥️ SSD1306 · ST7735 · status LEDs
🍃 SCD4x · SGP4x · PMS5003 · SEN5x
🛸 AS3935 (Franklin lightning) · MLX90614 · VL53L0X · HX711
Dependency resolver inspired by Linux Kconfig
~5 modules per node → 153,476,148 combinations
Ganglion
Ganglion = GANG of Lightweight I/O Nodes — insect-brain model.
IF-THEN rules, timers, local variables — compiled to bytecode on the MCU.
Nodes operate autonomously even when hub/server are down.
Ganglion — In Action
DEF LightTimeout = 300 # 5 minutes
# Motion detected: light on, start timer
IF motion:detected THEN lights:on; SET $T_0 = LightTimeout
# Timer expired: light off
IF !$T_0 THEN lights:off
# Cross-node: kitchen smoke → alarm everywhere
DEF Kitchen = 42
IF Kitchen:smoke:detected THEN buzzer:alarm(1)
Toolchain:
Wgc — Compiler (Perl)
.tgc source → .bgc bytecode (10–50 bytes)
Wgi — Interpreter
Disassembly + execution trace
Same C source as on STM32
Wgs — Simulator
Perl reference impl · full memory model
Mock sensors · timer simulation · 170 tests
Hubs — a Pantheon
Specialized RasPi hubs. Named by function, not by accident.
Raijin
⚡ Thunder god — energy: Victron, BMS, MPPT, 120 kWh batteries
Lucifer
💡 Light bearer — DALI lighting: 4 buses, scenes, presence simulation
Bragi
🎵 Norse god of poetry — multiroom audio, voice, AI assist
Gaia
🌿 Earth goddess — greenhouse, garden, pond, irrigation
Tyr
⚔️ God of war — ...you can guess.
No hub is a single point of failure. Each domain runs independently.
SELV-DALI — Lighting without mains
SELV = Safety Extra Low Voltage. Under 60V DC. Safe to touch.
The trick: Entire lighting chain runs from battery storage. 48V → 24V DC/DC → LED. No 230V AC anywhere.
DALI controls at 16V. Switches, sensors, dimmers — all SELV.
Inverters fail? Lights stay on — they bypass AC entirely.
Switch next to the bathtub? No problem. No electrician needed.
WHIP — Protocols & Integrations
Protocols
CAN bus 1Mbit · Modbus TCP/RTU · DALI · MQTT · SNMP · I2C · 1-Wire
Modbus: 17 of 21 function codes · 869 tests · 91% coverage
30+ external integrations
Victron VRM · MasterTherm · PVGIS · Discord · Nextcloud · Proxmox · UniFi · ...
All protocol handlers in Perl · Mojolicious async I/O
WHIP — In production
Villa-A (Prague) — completely off-grid
- 40 kWp solar · 120 kWh LiFePO4 · 3× Multiplus-II 10kVA
- MasterTherm heat pump · capillary ceiling heating/cooling
- DALI lighting across 4 buses · distributed CAN nodes
Villa-B (Germany) — same concept, different config
Two deployments = real generalization, not "works on my machine"
Invisible when it works. Competent when it matters. Built for decades, not warranties.
04 AI does Perl
Turning the predicate around.
pperl
PetaPerl / ParallelPerl
A Perl 5 interpreter — designed by humans.
Written in Rust — by many AI agents.
Serious — no toy or academic exercise.
pperl
PetaPerl / ParallelPerl
A Perl 5 interpreter Platform — designed by humans.
Written in Rust — by many AI agents.
Serious — no toy or academic exercise.
pperl — Not the first attempt
Topaz
1999 · C++ rewrite · Chip Salzenberg · abandoned
B::C / perlcc
1996–2016 · Perl-to-C compiler · dead
cperl
2015–2020 · Perl 5 fork · Reini Urban · dormant
RPerl
Restricted Perl → C++ · Will Braswell · dormant
WebPerl
Perl 5 → WebAssembly · runs in browser · semi-active
PerlOnJava
Perl 5 on JVM · Flavio Glock · active — talk at this GPW!
Common failure mode: underestimating Perl 5's complexity
pperl — Scope
Perl 5.42 — ish
Compatibility: strive for maximum Perl 5 compliance, currently 5.42
Performance: strive for V8 levels
XS: no, but yes
Native Rust implementations, integral to the interpreter
Linux only — all architectures
We really don't care about use v5.xx
pperl — Status
22,000+
tests total
~61–400 failures — give or take
Performance: good, bad and ugly
Quotes from the AI
13095 pass (+25 from previous 13070), 31 fail (down from 46!). The File::Path native implementation not only works, it unblocked 15 previously-failing tests that depended on File::Path. Zero regressions.
pperl — Benchmarks
| Benchmark | perl5 | pperl | ratio |
|---|---|---|---|
| list_util::sum | 191.8K | 372.8K | 1.9x |
| list_util::min | 199.8K | 772.9K | 3.9x |
| list_util::max | 201.3K | 673.7K | 3.3x |
| list_util::product | 2.7M | 4.0M | 1.5x |
Native Rust implementations — not XS, not C
pperl — Beyond Perl5
Maximum compatibility. But more.
✓ Autoparallelization — for/map/grep via Rayon · transparent · no threads pragma
✓ JIT Compilation — Cranelift · hot codepath detection · native code at runtime
✓ Auto-FFI — call any C library · no XS · no compilation · Peta::FFI namespace
✓ Pre-Compile — .plc blobs · skip parsing · near-instant startup
✓ Daemonize — emacs-style daemon/client · shared memory · zero cold start
Autoparallelization
Powered by Rayon — Rust's data-parallelism library
Work-stealing scheduler
Divides work into tasks, idle threads steal from busy ones — automatic load balancing
One-line change in Rust: .iter() → .par_iter() — same code, parallel execution
Guaranteed data-race freedom
If it compiles, it's safe. Rust's type system enforces this at compile time.
# This just works. In parallel.
my @results = map { expensive_computation($_) } @large_list;
# No threads. No MCE. No forks.
# pperl detects safe loops → Rayon handles the rest.
--parallel flag · list ≥ 1000 items · no shared mutation
JIT Compilation
Just-In-Time — compile to machine code while running
How it works in pperl:
- Interpreter runs normally — profiling hot paths
- Hot loop detected → lower to Cranelift IR
- Cranelift compiles IR → native machine code
- Next iteration runs as native code — zero dispatch overhead
Cranelift — the compiler backend behind Wasmtime and Rust's alternative codegen.
Production-proven. Targets: x86-64 · AArch64 · s390x · RISC-V
# pperl detects this as a hot loop pattern
my $sum = 0;
for my $i (1 .. 1_000_000) {
$sum += $i;
}
# → Cranelift compiles to native machine code
JIT — First Win
Inner loop JIT — single hot loop compiled to native code
| Benchmark | perl5 | pperl interpreted | pperl JIT | vs perl5 |
|---|---|---|---|---|
| Mandelbrot | 133ms | 1493ms | 41ms | 3.2× faster |
| Ackermann | 13ms | 630ms | 12ms | 1.1× faster |
The JIT fired and the test passes! The answer is correct (500000500000).
Good. But only the innermost loop is compiled. What about nested loops?
$py = 0;
while ($py < $height) {
    $y0 = $y_min + $py * $y_step;
    $row_off = $py * $width;
    $px = 0;
    while ($px < $width) {
        $x0 = $x_min + $px * $x_step;
        $zr = 0.0; $zi = 0.0; $iter = 0;
        while ($iter < $max_iter) {
            $r2 = $zr * $zr;
            $i2 = $zi * $zi;
            last if ($r2 + $i2 > 4.0);
            $zi = 2.0 * $zr * $zi + $y0;
            $zr = $r2 - $i2 + $x0;
            $iter++;
        }
        $frame[$row_off + $px] = $color_lut[$iter];
        $px++;
    }
    $py++;
}
JIT — The Code
Mandelbrot set
Triple-nested while loop
19 variables · float arithmetic
Pure Perl.
No XS. No Inline::C.
No tricks.
JIT — Full Nested
All 3 loop levels compiled as one native function
| Mandelbrot 1000×1000 | perl5 | pperl interpreted | pperl JIT | vs perl5 |
|---|---|---|---|---|
| Wall time | 12,514ms | — | 163ms | 76× faster |
200 million escape iterations of float arithmetic.
19 variables, 3 loop levels — Cranelift register-allocates across all of them.
Perl. With JIT. That's a sentence nobody expected.
Autoparallel JIT — Full Win
JIT + Rayon: compile to native, then split across cores
| Mandelbrot | perl5 | pperl JIT | pperl JIT + 8 threads | vs perl5 |
|---|---|---|---|---|
| 1000×1000 | 12,514ms | 163ms | 29ms | 431× faster |
| 4000×4000 | ~200s | 2,304ms | 342ms | ~580× faster |
JIT alone: 76×. Adding 8 threads: another ~7× on top.
user 2.6s vs real 0.34s — near-linear scaling across cores.
Demo Time!
✓ Auto-FFI
No XS. No Inline::C. No compilation. Just call C.
# Layer 0 — Raw: any library, you provide type signatures
use Peta::FFI qw(dlopen call);
my $lib = dlopen("libz.so.1");
my $ver = call($lib, "zlibVersion", "()p");
say "zlib: $ver"; # 1.3.1
# Layer 1 — Pre-baked: curated signatures, zero ceremony
use Peta::FFI::Libc qw(getpid strlen strerror uname);
say strlen("hello"); # 5
my @info = uname();
say "$info[0] $info[2]"; # Linux 6.18.6-arch1-1
Pack-style type codes: (p)L = strlen(const char*) → size_t
50+ native Rust modules already built in — Auto-FFI extends to everything else
Auto-FFI — Details
Powered by libffi — any signature works, no pre-generated stubs
| Layer | Scope | Mechanism |
|---|---|---|
| Raw (Layer 0) | Any .so on the system | dlopen + dlsym + libffi call frame |
| Pre-baked (Layer 1) | libc, libuuid, ... | Direct Rust libc::* calls — zero overhead |
| Discovery (Layer 2) | System-wide scan | scan() → hashref of { soname => path } |
# Layer 2 — What's on this system?
use Peta::FFI qw(scan dlopen call);
my $libs = scan();
say scalar(keys %$libs), " libraries found";
if (exists $libs->{"libz.so.1"}) {
my $z = dlopen("libz.so.1");
say "zlib: ", call($z, "zlibVersion", "()p");
}
Libc: ~30 functions (process, strings, env, math, file, time)
UUID: 6 functions via dlopen — dies with install hint if missing
✓ Bytecode Cache (.plc)
Like Python's .pyc — but for Perl. Opt-in.
# Default: no caching (safe for development)
$ pperl script.pl
# Enable: compile once, load from cache on subsequent runs
$ pperl --cache script.pl
# Invalidate all caches
$ pperl --flush
First run: parse → codegen → execute → save .plc
Second run: load .plc → execute (no parsing, no codegen)
Bytecode Cache — Details
Storable-model: bincode deserializes directly to final runtime types. Zero intermediate conversion.
| Benchmark | perl5 | pperl | pperl --cache |
|---|---|---|---|
| three_modules | 22.3ms | 12.6ms | 9.9ms |
| mixed_native_fallback | 26.3ms | 13.0ms | 10.0ms |
| deep_deps | 18.1ms | 13.1ms | 9.9ms |
Net module-loading cost: 33–37% faster with cache. Biggest win on fallback modules. Native Rust modules already near-zero cost.
SHA-256 keyed · mtime + version validation · aggressive format versioning
✓ Daemonize
Emacs-style daemon/client model
$ pperl --daemon script.pl # compile, warm up, listen
$ pperl --client script.pl # connect → fork → run → respond
$ pperl --stop script.pl # clean shutdown
First run: parse → codegen → execute (warm-up) → listen
Client request: connect → fork() → child inherits arenas → execute → respond
fork() gives each client a fresh address space
with all arenas already mapped — zero I/O, zero parsing, zero deserialization
Daemonize — Details
| Benchmark | perl5 | pperl | --cache | --daemon |
|---|---|---|---|---|
| 5 native modules | 15.0ms | 4.3ms | 4.3ms | 4.6ms |
| fallback + native mix | 23.5ms | 15.8ms | ~10ms | 5.0ms (3.2×) |
Eliminates both startup costs: process creation (~3–4ms) + module compilation (0–15ms)
Faster than bytecode cache — no deserialization, arenas are already in memory
Unix domain socket · JSON wire protocol · copy-on-write pages via fork()
Daemonize — Prior Art
| Solution | Scope | Isolation | State leakage | Status |
|---|---|---|---|---|
| PPerl | General CLI | None | Yes | Dead (2004) |
| SpeedyCGI | CGI | None | Yes | Dead (2003) |
| mod_perl | Apache | Per-child | Per-request | Maintained |
| Starman | PSGI | Per-worker | Per-request | Maintained |
| FastCGI | Web | Per-process | Per-request | Maintained |
| pperl daemon | General CLI | Per-request (fork) | None | Active |
All prior solutions: same interpreter across requests — state leakage by design
pperl: fresh child per request via fork() — compiled arenas via COW, clean runtime state
Future pperl
⏳ Seamless GPU — restricted Perl → OpenCL/HIP/Vulkan/CUDA kernel · same code, GPU execution
⏳ pperl-mini — tailored, scaled-down builds. Maybe on a Raspberry Pi Pico one day?
⏳ pperl-compiler — Maybe code running on an STM32 one day?
When to use pperl
Good fit:
- Workloads that benefit from JIT and/or autoparallelization
- Scripts using native builtins (50+ Rust modules, fast)
- Fast startup — inherently ~2× faster than perl5, plus --cache
- pperl-specific features: Auto-FFI, Daemonize, Bytecode Cache
- Security: different codebase — unlikely to share CVEs with perl5
- Smaller, less complex scripts
Not yet:
- Large, complex codebases — edge cases where pperl differs from perl5
- We strive for maximum compatibility, but we're not 100% there yet
Rule of thumb: the longer and more complex the script,
the more likely you hit a corner case. If you don't want to touch the code — use perl5.
Correctness Case Study
How serious is "maximum compatibility"?
The bug: $, (OFS) vs $\ (ORS) in print
pperl checked both with the same flag mask. Perl5 doesn't.
perl5 — $, (OFS)
if (SvGMAGICAL(ofs) || SvOK(ofs))
Checks get-magic AND ok-flags
perl5 — $\ (ORS)
if (PL_ors_sv && SvOK(PL_ors_sv))
Checks ok-flags only. No get-magic.
pperl had:
// Same mask for both — SVS_GMG included for ORS. Wrong.
if flags & (SVF_IOK | SVF_NOK | SVF_POK | SVF_ROK | SVS_GMG) != 0
Practical impact: near zero.
To trigger this, you'd need a tie on $\ whose FETCH returns undef, while the underlying SV has get-magic set but none of IOK/NOK/POK/ROK — and then call print. Nobody writes this. Nobody has ever written this.
We fixed it anyway.
The depth of compatibility is the product's guarantee.
