RCU has a way of popping up unexpectedly. In the words of Fedor Pikus, “In fact, you may already be using the RCU approach in your program without realizing it!” This post will describe uses of a couple of highly unorthodox (some might say “completely irresponsible”) types of RCU implementations, both accidental and otherwise.
Timed-Wait RCU
One very simple class of RCU implementations uses a fixed period of time for the grace period. This can clearly work well in hard real-time kernels and applications, though it has also been used in prototype non-real-time kernels. In an early 1990s verbal discussion, none other than Van Jacobson pointed out that a 15-second delay would suffice in the research version of a proprietary-UNIX OS that he was working with. I responded (also verbally) that in DYNIX/ptx interrupt handlers sometimes executed for more than a minute (as in more than 15 seconds), but that Jack Slingwine and I had a way to get the same low-overhead/high-scalability effect without the need for hard real-time constraints on readers. Van expressed interest, so I sent him an early draft of the first RCU conference paper. Some years later, I had the privilege of hearing Van say nice things about Linux-kernel RCU.
For the benefit of any long-time RCU users who (like me) have a “make it make sense” filter in their heads, here is a Linux-kernel implementation of a synchronize_rcu() for Van's RCU:
void synchronize_rcu(void)
{
	schedule_timeout_uninterruptible(15 * HZ);
}
In other words, synchronize_rcu() does a fixed wait of 15 seconds, after which it is assumed, without proof, that all pre-existing readers have completed.
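For concreteness, here is how an updater might use such a timed-wait grace period, in the form of a single-threaded user-space sketch. All names here are hypothetical, and the grace period is parameterized so that the danger is explicit: any reader still holding the old pointer when the sleep expires is a use-after-free waiting to happen.

```c
#include <unistd.h>

static int *global_ptr;   /* pointer "protected" by timed-wait RCU */

/* Readers pay nothing at all: no locks, no counters, no barriers.
 * The entire correctness argument rests on the assumption that no
 * reader runs longer than the fixed grace period. */
static inline int timed_rcu_read(void)
{
	return *global_ptr;   /* a real sketch would use an acquire load */
}

/* The updater publishes the new version, sleeps out the fixed grace
 * period, and then assumes, without proof, that all pre-existing
 * readers have finished with the old version, which may then be freed. */
static int *timed_rcu_replace(int *newp, unsigned int grace_seconds)
{
	int *oldp = global_ptr;

	global_ptr = newp;     /* publish (needs a release store in practice) */
	sleep(grace_seconds);  /* the entire "grace-period machinery" */
	return oldp;           /* presumed safe to free */
}
```

Note that there is no synchronization whatsoever on the reader side, which is exactly the attraction, and exactly the hazard.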
In the mid-1990s, Aju John wrote the USENIX paper Dynamic vnodes — Design and Implementation, which proposed a fixed 10-minute wait time for reclaiming vnodes in a proprietary UNIX system, DEC OSF/1 Version 3.0.
This approach might actually make sense in a hard-real-time environment, but would of course be extremely dangerous even there. On the other hand, you have to admit that Van and Aju were taking a no-holds-barred approach to performance and scalability, and doing so very early in the game! And Paul Khuong reports that this technique was recently used in production for an extended period, until a more principled and sufficiently performant technique was put in its place.
As far as I know, none of these timed-wait RCU use cases remains in production use, but the fact remains that timed-wait RCU really has been used in production. And maybe it is still being used, dangerous though use of timed-wait RCU might be outside of hard-real-time environments! On the other hand, this is a rare instance of hard real-time making something way simpler, at least as long as non-real-time threads are prohibited from using RCU read-side critical sections.
Fixed-Buffer RCU
The previous section covered RCU grace-period detection based purely on time. But a recent discussion with Hui Xie led me to wonder about instead basing it purely on space, as in address space. Is this possible?
The Linux-kernel address sanitizer (KASAN) keeps newly freed blocks of memory in a fixed-size quarantine. This is done to increase the probability that a use-after-free access occurs while the corresponding block of memory has not yet been reallocated, thus reducing the number of false negatives that could otherwise occur if freed memory were instead immediately reallocated.
However, this can be thought of as another corner-case RCU implementation, where the “readers” are assumed to complete before the corresponding block of memory is reallocated.
And rcutorture does something similar by deferring freeing of RCU_TORTURE_PIPE_LEN (AKA 10) of the blocks previously referenced by rcu_torture_current. In addition, there are 10 times that many elements total (AKA 100 of them). As with KASAN, the idea is to reduce the incidence of false negatives, so that an extremely long RCU reader combined with a series of extremely short RCU grace periods would still have roughly a 99% chance of each reader issuing a diagnostic for the too-short grace periods.
Although in these two examples, the intent of the fixed buffers is bug detection, they could instead be used to create an extremely limited and very sharp-edged RCU grace-period detection mechanism. So the answer to the question raised by the discussion with Hui Xie is “yes”!
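As a concrete (and heavily hedged) illustration, here is a single-threaded user-space sketch of such a fixed-buffer scheme, loosely patterned after KASAN's quarantine. All names are hypothetical, and the sharp edges are on full display: a reader still running after QUARANTINE_SLOTS subsequent frees is accessing freed (or reallocated) memory.

```c
#include <stdlib.h>

#define QUARANTINE_SLOTS 10

static void *quarantine[QUARANTINE_SLOTS];
static unsigned int quarantine_idx;

/* "Free" a block by parking it in the quarantine, really freeing the
 * block that this eviction pushes out.  A block is thus reallocated
 * only after QUARANTINE_SLOTS subsequent frees, which serves as a
 * purely space-based stand-in for a grace period. */
static void deferred_free(void *ptr)
{
	void *victim = quarantine[quarantine_idx];

	quarantine[quarantine_idx] = ptr;
	quarantine_idx = (quarantine_idx + 1) % QUARANTINE_SLOTS;
	free(victim);  /* free(NULL) is a no-op, so startup is safe */
}
```

Because the “grace period” here is measured in frees rather than in reader completions, bounding both reader duration and update rate is not optional: either bound fails, and memory corruption follows.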
Time vs. Space and Implications
In short, it is possible (though risky!) to implement RCU grace-period detection purely in terms of time on the one hand and purely in terms of space on the other. This provides a nice counterpoint to RCU using both time and space for its underlying synchronization, with rcu_read_lock(), rcu_read_unlock(), synchronize_rcu(), and call_rcu() handling time and rcu_assign_pointer() and rcu_dereference() handling space.
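That division of labor can be made explicit with trivial single-threaded stand-ins for the kernel primitives. The stubs below are of course nothing like the real implementations, which involve memory barriers and actual grace-period machinery, but they do show which primitive handles which dimension:

```c
#include <stddef.h>

/* Single-threaded stand-ins, labeled by the dimension each handles. */
#define rcu_read_lock()          do { } while (0)  /* time: reader begins    */
#define rcu_read_unlock()        do { } while (0)  /* time: reader ends      */
#define synchronize_rcu()        do { } while (0)  /* time: wait for readers */
#define rcu_assign_pointer(p, v) ((p) = (v))       /* space: publish         */
#define rcu_dereference(p)       (p)               /* space: subscribe       */

struct foo { int value; };
static struct foo *gp;

/* Reader: the time primitives bound the critical section, while the
 * space primitive fetches the current version of the data. */
static int reader(void)
{
	int v;

	rcu_read_lock();
	v = rcu_dereference(gp)->value;
	rcu_read_unlock();
	return v;
}

/* Updater: the space primitive publishes the new version, then the
 * time primitive waits until the old version can have no readers. */
static struct foo *updater(struct foo *newp)
{
	struct foo *oldp = gp;

	rcu_assign_pointer(gp, newp);
	synchronize_rcu();
	return oldp;   /* now safe to free */
}
```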
And who knows? Perhaps someone is using fixed-buffer RCU in production. But if you are that someone, please understand that you need to sharply and deterministically bound both reader duration and update rate. Either that, or you (and your users!) need to tolerate some probability of memory corruption.
Some historical perspective is provided by the rumored 1980s DEC VAX system running BSD UNIX whose systems administrator chose to use known-buggy patches that were said to provide about a 15% performance boost. And about one extra crash per week. At the time, some felt that this was the acme of systems-administration excellence, while others felt that this was completely foolhardy.
A less intentional example is provided by none other than Linux-kernel RCU shortly after suspend and hibernation were accepted into the kernel. These features forced CPU hotplug, which silently negated my attempts to test RCU in non-CPU-hotplug kernels. This meant that rcutorture failed to spot a stupid bug that I injected into RCU that, in non-CPU-hotplug kernels, caused RCU grace periods to be a fixed few-millisecond delay.
And nobody noticed.
Aside from a developer who tested hundreds of concurrent kernel builds on a two-CPU system, who reported the bug, but disappeared when I asked for his .config. And a few months after that, Nick Piggin, whose dentry cache testing tickled this bug. Nick found and fixed my stupid bug. I then changed rcutorture to pay more attention to the .config file for the kernel under test, and to apply stress in a more targeted manner. And given today's much more aggressive Linux-kernel testing, I suspect that a similar bug today would be noticed much more quickly.
So, as always, choose wisely! ;–)
History
- Add example synchronize_rcu() timed-wait implementation to reinforce the unorthodox nature of these RCU implementation approaches.
- Add 1980s rumors of DEC VAX system choosing performance over reliability.