Notes on Waveguide Synthesis (2018)

Original link: https://www.osar.fr/notes/waveguides/

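## Waveguide Synthesis: Summary

Waveguide synthesis is a powerful technique for creating physically realistic sounds, using delayed signals and feedback loops to model acoustic phenomena. The basic principle is that a delay line with feedback produces resonant frequencies, effectively generating a pitch. Negative feedback cancels the even harmonics, which is useful for modeling instruments such as pan flutes.

However, simple models sound unrealistic because they lack frequency absorption. Adding a filter *inside* the feedback loop models this absorption and shapes the timbre. Realism can be improved further by incorporating a *nonlinearity*, a function that models louder sounds being absorbed more strongly, which prevents runaway amplitude and allows sustained notes.

Controlling the pitch requires understanding how filters introduce phase delay, which has to be compensated with an equation relating the filter cutoff frequency to the desired pitch. Smooth transitions between notes require cross-fading between multiple delay lines.

Advanced techniques explore complex filtering and spectral shaping, which can produce organic sounds such as animal calls and nuanced instrument simulations. Key implementation details include choosing a suitable interpolation method for the delay line and using a high-pass filter to deal with DC offset. Waveguide synthesis offers a flexible, physically based approach to sound design.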


Original text

Waveguide Synthesis is one of the most effective approaches to generating sounds with physically realistic traits. If you're not convinced of this, perhaps the polished waveguide models of Chet Singer will change your mind. When researching the topic, I've found most material on waveguide synthesis to be heavy on the theory and yet lacking when it comes to application and implementation. This is why I decided to assemble my notes on the implementation and applied exploration of waveguides, using the occasion to play with interactive in-browser synthesis. This text assumes some knowledge of audio DSP.

A Delay Line with Feedback

We begin with a minimal, special case of a waveguide more commonly known as a feedback comb filter. Take a source signal (the excitation), and delay it by a short duration (the delay length). Then take the delayed signal, attenuate it by a certain factor (the feedback), and feed it back into the delay along with the source signal. As the delay receives its own output in a loop, some frequencies will begin to emerge, creating a perceivable pitch, which in simple cases is the inverse of the delay length.

If you play with the model below, you'll notice the pitch seems lower by an octave when feedback is negative. This naturally arises from the fact that a number is repeated after its sign is inverted twice, effectively doubling the delay length. Negative feedback also changes the timbre, because all even harmonics cancel themselves out. This is useful in wind instrument waveguides, where closed bores (such as pan flutes) are often modeled with negative feedback.

A delay line with feedback.

Another way of representing this type of model is with a difference equation, which is useful as it can be directly translated to an algorithm. In this case the difference equation is:

$$ y(n) = x(n-d) + ay(n-d) $$

where:

  • $y$ is the output signal, e.g. $y(t)$ is the output sample at time $t$
  • $x$ is the input (exciter) signal, e.g. $x(t)$ is the input sample at time $t$
  • $n$ is the current time, counted in samples
  • $d$ is the delay length
  • $a$ is the feedback
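
The difference equation above translates almost line for line into code. The following is a minimal sketch rather than the code behind the interactive models: the function name `feedback_comb` is hypothetical, and the delay length `d` is assumed to be a whole number of samples.

```python
def feedback_comb(x, d, a):
    """Apply y(n) = x(n-d) + a*y(n-d) to a list of samples x.

    d is the delay length in whole samples (an assumption of this sketch),
    a is the feedback factor.
    """
    y = [0.0] * len(x)
    for n in range(d, len(x)):
        y[n] = x[n - d] + a * y[n - d]
    return y


# Excite the loop with a single unit impulse and watch it recirculate.
impulse = [1.0] + [0.0] * 11
print(feedback_comb(impulse, d=3, a=0.5))   # echoes keep the same sign
print(feedback_comb(impulse, d=3, a=-0.5))  # echoes alternate in sign
```

With positive feedback the echoes repeat every `d` samples. With negative feedback the sign flips on every pass, so the pattern only repeats every `2 * d` samples, which is the octave drop described earlier.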

Filters in the Waveguide

Note how the previous model has very sustained high frequencies, giving it an unrealistic timbre. In reality, a physical medium will absorb higher frequencies more heavily than lower frequencies. This can be roughly modeled with a filter inside the loop. It's also important to feel the difference between filtering inside the loop vs. filtering only the excitation. The following model has Butterworth lowpass filters in both positions:

A basic waveguide model with two filters.
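
To make "a filter inside the loop" concrete, here is a hedged sketch that puts a lowpass on the fed-back signal. It swaps in a simple one-pole lowpass instead of the Butterworth filter of the model above, purely to keep the example short; `one_pole_coeff` and `filtered_waveguide` are hypothetical names, and the delay is again a whole number of samples.

```python
import math


def one_pole_coeff(cutoff_hz, sample_rate):
    """Smoothing coefficient g for the update lp += g * (x - lp)."""
    return 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)


def filtered_waveguide(x, d, a, g):
    """Delay loop with a one-pole lowpass applied to the fed-back signal."""
    y = [0.0] * len(x)
    lp = 0.0  # lowpass state, lives inside the loop
    for n in range(d, len(x)):
        lp += g * (y[n - d] - lp)  # filter what comes out of the delay
        y[n] = x[n - d] + a * lp   # attenuate it and feed it back
    return y


impulse = [1.0] + [0.0] * 11
print(filtered_waveguide(impulse, d=3, a=0.5, g=0.5))
```

Because the filter sits in the loop, the signal is filtered again on every pass, so high frequencies decay faster than low ones. Filtering only the excitation would apply the filter once and leave the decay rates unchanged.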

Nonlinearities

Note how the previous model, when filtered, produces only short notes, even with maximum feedback. Can this be remedied? Raising the feedback above 1 would cause the amplitude to keep increasing theoretically forever (this would be instability). We've been simply multiplying the signal $x$ by the feedback value $a$, in other words, applying a linear function $\mathrm{L}(x) = ax$. In real acoustic systems, louder sounds are more heavily absorbed by the medium, and the simplest way of modeling this is to use a sigmoid non-linear function such as $\mathrm{NL}(x) = \tanh(ax)$ that gives us lower feedback for high values. This will also allow sustained notes without instability.

A waveguide model with filters and a nonlinearity.
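
The effect of the nonlinearity can be seen in isolation with a small sketch. Filters are left out here on purpose, and `nonlinear_waveguide` is a hypothetical name; the only change from the plain delay loop is that $\tanh(ax)$ replaces $ax$ in the feedback path.

```python
import math


def nonlinear_waveguide(x, d, a):
    """Delay loop where the feedback a*y is replaced by tanh(a*y)."""
    y = [0.0] * len(x)
    for n in range(d, len(x)):
        y[n] = x[n - d] + math.tanh(a * y[n - d])
    return y


# A feedback value above 1 would blow up in the linear model.
impulse = [1.0] + [0.0] * 299
out = nonlinear_waveguide(impulse, d=3, a=1.5)
print(max(abs(v) for v in out))  # stays bounded
print(out[297])                  # still ringing long after the impulse
```

With `a` above 1 the recirculating pulse neither dies out nor grows without bound: it settles where $v = \tanh(av)$, which is what allows sustained notes.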

Controlling the Pitch

The first model's pitch was simply determined by the delay length. You may have noticed that's not the case anymore: change the inner filter's frequency, and the pitch also changes. This is because digital filters are based on delay, so by introducing a filter, we're changing the total length of the cumulative delay line. In fact, the filters we're using have a phase delay that's different for every frequency. This makes it more difficult to find an exact solution to compensate for this delay, as the filter introduces slight inharmonicity, and it affects the pitch in a way that depends on the pitch. Thankfully we don't actually need to use our brains: we can use regression instead! Just measure how the filter affects the pitch and fit an equation onto it. For simple lowpass filters, we obtain something in the form:

$$ K = \frac{c_2}{f + \frac{c_1 f^2}{f_c}} $$

where:

  • $K$ is the delay length
  • $f$ is the desired pitch
  • $f_c$ is the cutoff frequency of the inner filter
  • $c_1$ and $c_2$ are constants tied to the inner filter algorithm being used

A waveguide model playing a melody. For the simple 1-pole filter used here, we obtain the constants $c_1 \approx 0.1786$ and $c_2 \approx 1.011$.
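
The fitted formula is easy to evaluate in code. The sketch below uses the two constants quoted for the 1-pole filter; the function name `delay_length` is hypothetical, and the units are an assumption: with $f$ and $f_c$ in Hz, $K$ is taken to be in seconds, so it is multiplied by the sample rate to get a length in samples.

```python
def delay_length(f, fc, c1=0.1786, c2=1.011):
    """Delay length K = c2 / (f + c1 * f^2 / fc).

    f is the desired pitch and fc the inner filter cutoff. Both are
    assumed to be in Hz, giving K in seconds (an assumption of this sketch).
    """
    return c2 / (f + c1 * f * f / fc)


sample_rate = 44100.0
k = delay_length(440.0, 4000.0)
print(k * sample_rate)  # delay length in samples for A4 with a 4 kHz cutoff
```

Lowering the cutoff makes the filter contribute more delay, so the formula returns a shorter delay line for the same target pitch.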

Note Transitions

Notice that the model above has staccato notes. Why? Because I was too lazy to implement proper note transitions. If we attempt to change the length of the delay line while a note is playing, the output does not remotely resemble a legato sound (at best, we can change it slowly and obtain a slide whistle sound). In the legato transition of a real flute, a tone hole is opened or closed, and the bore effectively has a Y-junction during the transition. This can be better modeled by cross-fading between two fixed-length delay lines in the loop. In fact, two delay lines will suffice for any sequence of notes: one of them can always sound while the other secretly changes length.

A waveguide model playing controllable legato notes. Move the "note" slider to control the notes.
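
One way to sketch the cross-fade is to read the loop at two fixed delay lengths and blend the two reads. This is an illustrative simplification, not the code behind the model above: `crossfaded_loop` is a hypothetical name, the fade is linear, both delays are whole numbers of samples, and filters and nonlinearities are omitted.

```python
def crossfaded_loop(x, d_a, d_b, a, fade_start, fade_len):
    """Feedback loop read at two fixed delays, cross-fading from d_a to d_b."""
    y = [0.0] * len(x)
    for n in range(len(x)):
        # Cross-fade position: 0 means only delay A, 1 means only delay B.
        t = min(1.0, max(0.0, (n - fade_start) / fade_len))
        tap_a = x[n - d_a] + a * y[n - d_a] if n >= d_a else 0.0
        tap_b = x[n - d_b] + a * y[n - d_b] if n >= d_b else 0.0
        y[n] = (1.0 - t) * tap_a + t * tap_b
    return y


impulse = [1.0] + [0.0] * 199
out = crossfaded_loop(impulse, d_a=30, d_b=40, a=0.9,
                      fade_start=90, fade_len=60)
```

Neither delay length changes while it is audible. Once the fade completes, delay A is silent and can be retuned for the next note, which is why two delay lines are enough for any sequence.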

Experiment: Wildlife

When exploring waveguide synthesis, it's useful to have mental models and simplifications to work with. One such insight is to imagine that the looped delay line provides an infinite number of harmonics of its base frequency, some having more weight than others. Without any filters, the weights of harmonics are based on their harmonic distance: the simplest fractions, such as the fundamental, are the heaviest. This weight determines which harmonics are more likely to resonate, with the lighter ones generally being softer or absent (this is probabilistic when the exciter is based on noise). By adding filters into the loop, we shape the spectrum of those weights. When crossfading multiple delays in the loop, we intersect the spectra of their weighted harmonics.

However, complex filtering in the loop means the final pitch of the model will be much harder to control. One practical solution is to play along with it, since these pitch fluctuations can be satisfyingly organic, especially when the context is not musical. This is, I believe, how pitch is handled in some xoxos instruments such as Fauna and Elder Thing. In the following example, the same approach is used to create organic animal calls. There are two parallel bandpass filters in the loop, one of which has three times the frequency of the other.

What animal is this?

Experiment: Harmonic Flute

The following example simulates the higher registers of a flute by means of a bandpass crossfaded with a lowpass in the loop, making the pitch harder to predict. Despite this, an attempt is made to keep the final pitch in tune with the original intended pitch. We use the same formula introduced in the above section Controlling the Pitch, with one modification: $f_c$ is now the sum of the frequencies of both filters.

Flute model with exaggerated higher registers.

Implementation Details

Commonly used waveguide forms.

Form A above is commonly found in educational material, because it's the one that directly matches the physical model. However, it should generally not be used in practice. Instead, we've been using the simplified form B, which is obtained by changing the point at which the delay loop is sampled, then combining delays inside the loop. More generally, the input and output points as well as the order of elements inside the delay loop can often be modified with no audible difference, even if the model is not exactly equivalent. Form C is particularly useful when low latency is required.

Delay Line Sampling and Interpolation

Digital delay lines are made of discrete samples, but we need to read them at arbitrary time points between samples. There are a few possible solutions to this:

  • Rounding the delay length to the nearest exact sample. This will cause the model to be detuned, especially at common sampling rates and higher pitches, where a difference of half a sample is easily perceptible.
  • Sampling the delay line with linear interpolation. This will cause undesirably different timbres depending on which note is being played. Delay lengths that are far from integer values will cause audible filtering and a shorter decay.
  • Polynomial interpolation. This is the simplest solution to give an acceptable result, and is used in all the examples above. Interpolating over four samples tends to be sufficient for musical purposes.
  • Resampling the delay loop. A more unusual solution, where we use a fixed or rounded delay length, but we process the entire delay loop at a different sample rate than the final output. This way, interpolation artifacts are not amplified by the delay loop. The ratio between the sample rates is sometimes called the time step.
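
As an illustration of the polynomial option, here is a sketch of a fractional read from a circular buffer using 4-point (third-order) Lagrange interpolation. The article does not say which four-sample polynomial its examples use, so the choice of Lagrange is an assumption, and `read_delay` is a hypothetical helper.

```python
import math


def read_delay(buf, write_pos, delay):
    """Read `delay` samples behind write_pos from the circular buffer buf.

    Uses 4-point Lagrange interpolation. This sketch assumes
    3 <= delay <= len(buf) - 2 so that all four neighbours are valid.
    """
    size = len(buf)
    pos = write_pos - delay  # fractional read position
    i = math.floor(pos)      # index of the sample at or just before pos
    t = pos - i              # fractional part, in [0, 1)
    y0 = buf[(i - 1) % size]
    y1 = buf[i % size]
    y2 = buf[(i + 1) % size]
    y3 = buf[(i + 2) % size]
    # Lagrange weights for nodes at -1, 0, 1, 2 evaluated at t.
    w0 = -t * (t - 1.0) * (t - 2.0) / 6.0
    w1 = (t + 1.0) * (t - 1.0) * (t - 2.0) / 2.0
    w2 = -(t + 1.0) * t * (t - 2.0) / 2.0
    w3 = (t + 1.0) * t * (t - 1.0) / 6.0
    return w0 * y0 + w1 * y1 + w2 * y2 + w3 * y3


# A cubic ramp is reproduced exactly by a third-order interpolator.
buf = [float(k) ** 3 for k in range(16)]
print(read_delay(buf, write_pos=16, delay=6.5))  # value at position 9.5
```

At integer delays the weights collapse to a single sample, so the read is exact there and varies smoothly in between.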

DC in the loop

When experimenting with looped delay lines, the output will sometimes seem to paradoxically vanish at high feedback values. This is usually due to DC offset: any slight positive or negative average offset gets amplified inside the loop until the signal gets squashed. This arises when using non-linearities with too much slope at the origin, that is $|\mathrm{NL}'(0)| > 1$, assuming $\mathrm{NL}$ includes the feedback. The simplest solution is to use a high-pass filter in the loop. In the above examples, a one-pole 30 Hz high-pass filter is used.
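
A one-pole high-pass can be sketched as the input minus a one-pole lowpass of itself. The exact filter design used in the examples is not given, so this form is an assumption, as are the name `dc_block` and the 44.1 kHz default sample rate; it is shown here on a standalone signal, whereas in a waveguide it would process the signal inside the loop.

```python
import math


def dc_block(x, cutoff_hz=30.0, sample_rate=44100.0):
    """One-pole high-pass: subtract a slow lowpass estimate of the offset."""
    g = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    lp = 0.0
    out = []
    for s in x:
        lp += g * (s - lp)   # lp tracks the slowly varying (DC) part
        out.append(s - lp)   # what remains is the high-passed signal
    return out


# A constant offset is let through at first, then decays towards zero.
out = dc_block([1.0] * 44100)
print(out[0], out[-1])
```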



