排序、清理和修剪：碰撞检测算法 (2023)

排序、清理和修剪：碰撞检测算法 (2023)
Sort, sweep, and prune: Collision detection algorithms (2023)

原始链接: https://leanrada.com/notes/sweep-and-prune/

这篇文章讨论了视频游戏中碰撞检测的各种方法，并介绍了扫描和修剪算法的概念。作者使用刚体球模拟来说明这些概念。简单的碰撞检测方法涉及检查每对对象是否发生碰撞，从而导致大量对象的性能较差。为了改进这一点，作者建议在执行碰撞检查之前按最小 x 值对对象进行排序。通过这样做，需要更少的计算工作，从而提高性能。提出的最终解决方案将这种排序方法与扫描和剪枝算法相结合。这产生了一种高效的碰撞检测算法，该算法仅对必要的对执行相交测试，与简单的方法相比，显着减少了计算时间。作者提供了视觉演示并比较了不同方法的性能。

所提供的文本讨论了 Big-O 表示法应用于实际计算机算法时的局限性。它认为 Big-O 模型忽略了机器寄存器大小、内存容量、存储访问速度和传输速率等因素，导致了不切实际的结果。给出的示例是 O(1) 算法需要数小时才能完成，而 O(n) 竞争对手每秒处理 4 GB，一旦处理超出可用系统资源的大量数据，就变得没有竞争力。作者还提到，虽然 Big-O 复杂性提供了见解，但它们缺乏特异性，就像在不知道商店商品数量或质量的情况下只知道平均价格一样。复杂的空间划分方法或树细分会增加不必要的复杂性，特别是在处理大量动态对象时。从历史上看，复杂的分区系统对于《DOOM》或《Quake》等游戏至关重要，因为它们依赖于原始计算性能。然而，现代 CPU 擅长处理有序的项目数组，由于流水线技术，可以提供比链表/树更快的性能。这意味着用于管理实体列表的 CPU 时间分配更少，而更多地关注 AI、渲染等方面。在考虑胶囊的精细碰撞时，他们将胶囊视为具有厚度的三维片段，根据片段距离确定它们是否发生碰撞。可以在此处找到一个演示分段之间计算的有用网站：[链接]。另一种计算距离的方法是将其视为最小化问题，在每个对象内查找点并最小化它们之间的平方距离，同时确保约束成立。在胶囊的背景下，这创建了一个“凸”优化问题，可以通过牛顿方法使用不到十个迭代步骤来解决，然后投影到相应的凸对象上。几何经常将对象分解为简单的凸形状的联合，以实现高效的计算，一个例子是球体树覆盖具有重叠球体的对象，以便在胶囊之间快速进行粗略碰撞检测。

5 Aug 2023 · 13 min read

tags: algo games

Sweep-and-prune is my go-to algorithm when I want to quickly implement collision detection for a game. I think it’s an awesome and elegant algorithm, so I wrote a post about it.

This post is lengthy with many examples and explanations, thus split into two parts. You can jump to specific bits using this special springboard:

As for the rest of the post, I try to paint a picture of what I think are first principles and show it with interactive demos! Let’s go!

Collision detection

As you may know, the problem of collision detection is pretty common in video game programming. It’s a prerequisite to the implementation of certain game mechanics or simulations.

Some of these mechanics include: preventing characters from passing through each other, goombas turning around when bumping into another, big cells eating smaller cells in agar.io, or just about any game physics. All of these need some kind of collision detection.

Here I’ll cover several related approaches, starting with the simplest and building up to the sweep-and-prune algorithm. I won’t cover other approaches, such as space partitioning or spatial tree subdivision.

Balls.

I’ll use this rigid-body ball simulation as a recurring example to demonstrate the algorithms throughout the post:

Alright, let’s dive in! How do we detect these collisions?

Naive approach 🐥

The straightforward solution is to test every potential pair of objects for collision. That is, check every ball against every other ball.


for (let i = 0; i < balls.length; i++) {
  const ball1 = balls[i];
  
  for (let j = i + 1; j < balls.length; j++) {
    const ball2 = balls[j];
    
    if (intersects(ball1, ball2)) {
      bounce(ball1, ball2);
    }
  }
}

Note in the above code that the inner loop starts at i + 1 to prevent duplicate pairs from being counted (A-B vs B-A). Other than that, it’s a pretty simple solution.

These checks are done on every time step, ensuring that balls will bounce exactly when they collide.

Here’s a slowed-down, highlighted simulation, showing pairs being tested for intersection per time step:

Pairs are highlighted when being tested via intersects().

And it works. But if we had more than just a handful of balls we would start seeing performance issues.

Performance, or lack thereof

This naive algorithm runs in O(n²) time in Big O terms. That is, for an input of n balls, the algorithm’s running time grows proportionally to the square of the input n. That’s a lot! 📈

This is because for n balls, there are around (n * (n-1))/2 pairs to test, or 0.5n² - 0.5n. For example, if n = 5 there would be a total of 10 pairs. For n = 10, there would be 45 pairs. For n = 15, 105 pairs (!). And so on… Using Big O notation, we can simplify this information into a compact expression “O(n²)”

To (painfully) demonstrate how the performance scales badly for bigger inputs, here’s a simulation with n = 20:

20 balls = 190 pairs to test

That’s a lot of tests per frame! Clearly, the naive solution does not scale well for large numbers of objects.

How can we improve this solution?

The worst case running time for any collision detection algorithm is always O(n²). That’s when all objects intersect simultaneously and you have no choice but to process each of the n² collisions.

Thus, it’s more practical to compare the average and best cases.

Having said that, the naive algorithm is still Θ(n²) for any case, no matter the number of actual collisions. A lot of room for improvement!

Prologue: Improving the solution

Usually when optimising algorithms, you wanna find redundant or unnecessary work. Then find a way to consolidate that redundancy. (That sounded corporate-ish.)

A good place to start would be the intersects() function since it is called for every candidate pair. If we take the typical object intersection test to be its implementation, we get a bunch of these inequality checks:

function intersects(object1, object2) {
  
  return object1.left < object2.right
      && object1.right > object2.left
      && object1.top < object2.bottom
      && object1.bottom > object2.top;
}

In the above code, the intersects() function checks if two objects intersect by comparing their opposing bounds for each direction. (Refer to this MDN article for a better explanation.)

We can break the test down to its constituent checks:

object1.left < object2.right
object1.right > object2.left
object1.top < object2.bottom
object1.bottom > object2.top

Each check is solely concerned with one particular axis in a specific direction.

Here’s the key thing: Due to the && operator’s short-circuit evaluation, if any one of these checks turns out to be false, then the overall test will immediately evaluate to false.

Our goal then is to generalise the case where at least one of these checks is false across many tests as possible.

It’s the same idea as the separating axis theorem, which implies that two objects can’t be colliding if there’s at least one axis where their shadows don’t overlap.

Let’s say we focus only on the second check - object1.right > object2.left. Don’t worry about the rest of the checks. As hinted above, optimising in just one axis can still make a big difference later, so we’ll focus on this single check for now.

Let’s look at it in the context of multiple objects. Consider three objects - A, B, and C - in this horizontal configuration:

There are three potential pairs to be checked here: A-B, B-C, and A-C. Remember, we’re trying to find redundant work. Pretend we’re running all the pairs through the check, like so:

A.right > B.left 
B.right > C.left 
A.right > C.left

See any redundant work? Maybe abstractify it a little…

A > B 
B > C 
A > C

Voilà. Due to the transitive property of inequality, realise that we don’t need to run the third test! If we know that A > B and B > C are both false, then we would know that A > C is false as well.

“If a ≤ b and b ≤ c, then a ≤ c.” the transitive property of inequality

So in this example, we don’t really need to run intersects(A, C).


intersects(A, B) 


intersects(B, C)

We’ve skipped one intersects() call for free! ✨

I’m handwaving the fact that P.left ≤ P.right is implied for any object P. Nevertheless, working those details out would just mean more transitivity.

You might be wondering how this contrived example could apply to general n-body collision detection. A smart reader such as you might also have realised that this skip only works if A, B, and C are in a particular order.

What particular order? Try dragging the balls below to see when the optimisation applies and when it does not:


intersects(A, B) 
intersects(B, C) 
intersects(A, C)

Tip: Drag the balls so that they’re horizontally spaced out in this order: A‑B‑C

While it’s true that this skip only works when A, B, and C are ordered, remember that these labels are arbitrary! What if we simply decided to always call the leftmost ball A, the middle ball B, and the rightmost C? Then the optimisation would always be applicable! 🌌🧠

But wait… labeling objects according to some logical ordering is essentially ✨sorting✨! What if we sorted the list of objects every time? Would the number of skipped tests be worth the cost of sorting?

Chapter 1. Sorting

Sorting, inequalities, and optimisation go hand in hand in hand. A sorted list allows us to exploit the transitive property of inequality en masse.

Even if we had to sort the list of objects every frame, the quickest general sorting algorithm runs in O(n log n) time which is certainly lower than O(n²).

As shown by the tri-object example above, to achieve the power to skip tests we need to sort the list of objects by x position.

However, objects aren’t zero-width points. They’re widthy, by which I mean having a size thus occupying an interval in the x-axis, also known as “width”. How can one unambiguously sort by x position if objects span intervals in the x-axis?

Sort by min x

A solution to sorting widthy objects is to sort them by their minimum x (their left edge’s x-coordinate). This technique can be applied to improve the naive approach.

It involves minimal modifications to the O(n²) solution. But it will result in a good chunk of tests skipped. I’ll explain later.

First, the modified code:

+ 
+ sortByLeft(balls);
+ 
  
  for (let i = 0; i < balls.length; i++) {
    const ball1 = balls[i];
    
    for (let j = i + 1; j < balls.length; j++) {
      const ball2 = balls[j];
+ 
+     
+     if (ball2.left > ball1.right) break;
+ 
      
      if (intersects(ball1, ball2)) {
        bounce(ball1, ball2);
      }
    }
  }

It’s mostly the same as the naive solution, differing only in two extra lines of code.

The first line sortByLeft(balls) simply sorts the list, with ranking based on the balls’ left edge x-coords.

function sortByLeft(balls) {
  balls.sort((a,b) => a.left - b.left);
}

And in the inner loop, there is now this break:

if (ball2.left > ball1.right) break;

Let’s break that down.

First, we know that the list is sorted, so the following statement holds true for any positive integer c:

balls[j + c].left >= balls[j].left

The break condition, which is derived from the first operand of the intersection test, if true indicates early that the current pair being tested for intersection would fail:

balls2.left > ball1.right
or balls[j].left > ball1.right

But there are more implications. If it was true, then by combining the above two inequations…

balls[j + c].left >= balls[j].left > ball1.right

And by transitive property, the following statement would also be true!

balls[j + c].left > ball1.right

Which means the intersection tests of balls at balls[j + c] would also fail. We know this without needing to test those balls individually. A range of balls have been eliminated from testing!

In conclusion, when the current ball2 balls[j] stops overlapping with the current ball1, then any further ball2s in the iteration balls[j + c] would be guaranteed to not overlap ball1 as well. In other words, we stop the inner loop when it gets too far away.

Finally, here’s a demo:

Pairs highlighted when tested by intersects().

Pretty cool, right! It’s much faster now.

Some observations:

Since the list is sorted, the tests are performed from left to right.
More importantly, it visibly does fewer tests than the naive approach. 📉 This is due the above optimisation which effectively limits pairs to those that overlap in the x-axis!

Let’s analyse the time complexity. 👓

The sort - if we take the "fastest" sorting algorithm, like mergesort or quicksort - would add an O(n log n) term.

The two-level loop, now with an early break, would average out to O(n + m) where m is the total number of x-overlaps. This could degenerate into n² but as mentioned above, it’s more useful to look at the average and best cases. At best, the loop would be O(n), wasting no excess processing when there are no overlaps. On average it’s O(n + m).

The average case refers to a world where objects are mostly evenly distributed and only a couple intersections per object is happening. I think this is a reasonable assumption for a relatively simple video game like a platformer or side-scroller.

Here’s the code with running time annotations:


sortByLeft(balls);


for (let i = 0; i < balls.length; i++) {
  const ball1 = balls[i];
  
  for (let j = i + 1; j < balls.length; j++) {
    const ball2 = balls[j];
    if (ball2.left > ball1.right) break;
    if (intersects(ball1, ball2)) {
      bounce(ball1, ball2);
    }
  }
}

Adding those together we get O(n log n + m).

This is a super good improvement over the naive approach’s O(n²), because [1] n log n is much smaller than n² and [2] it is partially output-based - depending on the number of overlaps, it does not process more than necessary.

Furthermore, the choice of sorting algorithm could be improved. We’ll look into that in the next part (somehow better than n log n!).

If you got this far trying to find a decent collision detection algorithm, then you can stop reading and take the above design! It’s the perfect balance between programming effort and running time performance. If you are curious how this develops or just want to see more interactive demos, read on to the next part.

Visual comparison

Here’s a side-by-side comparison of the strategies we’ve covered so far! Observe the amount of intersection tests required per frame. 🔍 n = 10

(Not shown: the cost of sorting. Let’s just say the intersection test is sufficiently expensive.)

Aaand that concludes the first part. Those two lines of code definitely were the MVPs.

How will it compare to the more advanced versions?

Continued in part 2.