# An Analysis of Depth Precision Issues

Depth precision is a pain in the ass that every graphics programmer has to struggle with sooner or later. Many articles and papers have been written on the topic, and a variety of different depth buffer formats and setups are found across different games, engines, and devices.

Because of the way it interacts with perspective projection, GPU hardware depth mapping is a little recondite, and studying the equations may not make things immediately obvious. To get an intuition for how it works, it's helpful to draw some pictures.

This article has three main parts. In the first part, I try to provide some motivation for nonlinear depth mapping. Second, I present some diagrams to help understand how nonlinear depth mapping works in different situations, intuitively and visually. The third part is a discussion and reproduction of the main results of *Tightening the Precision of Perspective Rendering* by Paul Upchurch and Mathieu Desbrun (2012), concerning the effects of floating-point roundoff error on depth precision.

### Why 1/z

GPU hardware depth buffers don't typically store a linear representation of the distance an object lies in front of the camera, contrary to what one might naïvely expect when encountering this for the first time. Instead, the depth buffer stores a value proportional to the reciprocal of world-space depth. I want to briefly motivate this convention.

In this article, I'll use d to represent the value stored in the depth buffer (in [0, 1]), and z to represent world-space depth, i.e. distance along the view axis, in world units such as meters. In general, the relationship between them is of the form

$$d = a\,\frac{1}{z} + b,$$

where a and b are constants related to the near and far plane settings. In other words, d is always some linear remapping of 1/z.
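
As a concrete instance of this form, the sketch below solves for a and b from a near and far plane, assuming the D3D-style convention that the near plane maps to d = 0 and the far plane to d = 1. The function names and the sample plane distances are illustrative and do not come from the original article.

```python
def depth_constants(near, far):
    """Return (a, b) such that d = a / z + b maps near -> 0 and far -> 1."""
    a = -far * near / (far - near)
    b = far / (far - near)
    return a, b


def depth_from_z(z, near, far):
    """Map world-space depth z to a depth-buffer value d in [0, 1]."""
    a, b = depth_constants(near, far)
    return a / z + b


if __name__ == "__main__":
    near, far = 0.1, 100.0  # hypothetical plane distances, in meters
    for z in (0.1, 0.2, 1.0, 10.0, 100.0):
        print(f"z = {z:6.1f}  ->  d = {depth_from_z(z, near, far):.6f}")
```

With these planes, d has already passed 0.5 by z = 0.2, twice the near distance, which hints at how unevenly the values are distributed.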

On the face of it, you can imagine taking d to be any function of z you like. So why this particular choice? There are two main reasons.

First, 1/z fits naturally into the framework of perspective projections. This is the most general class of transformation that is guaranteed to preserve straight lines, which makes it convenient for hardware rasterization, since straight edges of triangles stay straight in screen space. We can generate linear remappings of 1/z by taking advantage of the perspective divide that the hardware already performs:
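
A sketch of how this works out, showing only the part of the projection matrix that touches depth, with a and b the constants of the linear remapping of 1/z. The exact row and column layout depends on the API's matrix conventions, so the placement here is an assumption:

$$
\begin{bmatrix} b & a \\ 1 & 0 \end{bmatrix}
\begin{bmatrix} z \\ 1 \end{bmatrix}
=
\begin{bmatrix} bz + a \\ z \end{bmatrix}
\;\xrightarrow{\text{perspective divide}}\;
\frac{bz + a}{z} = a\,\frac{1}{z} + b = d
$$

The matrix only ever produces quantities linear in z. The reciprocal comes entirely from the divide by the last component, which the hardware performs anyway.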

The real power in this approach, of course, is that the projection matrix can be multiplied with other matrices, allowing you to combine many transformation stages together in one.

The second reason is that 1/z is linear in screen space, as noted by Emil Persson. So it's easy to interpolate d across a triangle while rasterizing, and things like hierarchical Z-buffers, early Z-culling, and depth buffer compression are all a lot easier to do.
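
The screen-space linearity can be checked numerically. The sketch below works in a simplified 2D setting (x and z only, projection onto the plane z = 1) and compares the exact 1/z along an edge against a plain linear interpolation of the endpoint values in screen space. The helper names and the edge endpoints are made up for illustration.

```python
def project_x(x, z):
    """Perspective-project a view-space point onto the plane z = 1."""
    return x / z


def inv_z_on_edge(p0, p1, s):
    """Exact 1/z at the point of edge p0-p1 that lands a fraction s of the
    way along the edge in screen space. Points are (x, z) pairs."""
    (x0, z0), (x1, z1) = p0, p1
    xs = (1 - s) * project_x(x0, z0) + s * project_x(x1, z1)
    # Solve (x0 + t*dx) / (z0 + t*dz) = xs for the view-space parameter t.
    # (Degenerate if the edge points straight at the eye; not handled here.)
    dx, dz = x1 - x0, z1 - z0
    t = (x0 - xs * z0) / (xs * dz - dx)
    return 1.0 / (z0 + t * dz)


def lerp(u, v, s):
    """Plain linear interpolation between u and v."""
    return (1 - s) * u + s * v


if __name__ == "__main__":
    p0, p1 = (-1.0, 2.0), (3.0, 10.0)  # hypothetical edge endpoints
    for s in (0.25, 0.5, 0.75):
        exact = inv_z_on_edge(p0, p1, s)
        print(s, exact, lerp(1 / p0[1], 1 / p1[1], s), 1 / lerp(p0[1], p1[1], s))
```

The second and third printed columns agree, while the last column (interpolating z itself) does not. Since d is a linear remapping of 1/z, it inherits the same property.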

### Plotting the Depth Mapping

Equations are hard; let's look at some pictures!

The way to read these graphs is left to right, then down to the bottom. Start with d, plotted on the left axis. Because d can be an arbitrary linear remapping of 1/z, we can place 0 and 1 wherever we wish on this axis. The tick marks indicate distinct depth buffer values. For illustrative purposes, I'm simulating a 4-bit normalized integer depth buffer, so there are 16 evenly-spaced tick marks.

Trace the tick marks horizontally to where they hit the 1/z curve, then down to the bottom axis. That's where the distinct values fall in the world-space depth range.
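
The same tracing can be done numerically. This is a minimal sketch assuming the standard mapping (near plane to d = 0, far plane to d = 1) and hypothetical plane distances; it lists the world-space depth of each of the 16 codes of a 4-bit normalized integer buffer.

```python
def z_from_depth(d, near, far):
    """Invert d = a / z + b for the standard mapping (near -> 0, far -> 1)."""
    return far * near / (far - d * (far - near))


def depth_ticks(bits, near, far):
    """World-space depths of every code in a normalized-integer depth buffer."""
    levels = (1 << bits) - 1
    return [z_from_depth(k / levels, near, far) for k in range(levels + 1)]


if __name__ == "__main__":
    near, far = 1.0, 100.0  # hypothetical planes, chosen for readability
    for k, z in enumerate(depth_ticks(4, near, far)):
        print(f"code {k:2d}  ->  z = {z:8.3f}")
```

With these planes, half of the 16 codes land between z = 1 and z = 2, and the final step alone jumps from roughly z = 13 to z = 100.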

The graph above shows the “standard”, vanilla depth mapping used in D3D and similar APIs. You can immediately see how the 1/z curve leads to bunching up values close to the near plane, and the values close to the far plane are quite spread out.

It's also easy to see why the near plane has such a profound effect on depth precision. Pulling in the near plane will make the d range skyrocket up toward the asymptote of the 1/z curve, leading to an even more lopsided distribution of values:

Similarly, it's easy to see in this context why pushing the far plane all the way out to infinity doesn't have that much effect. It just means extending the d range slightly down to 1/z=0:
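
One rough way to put numbers on both observations is to ask where the midpoint of the depth range, d = 0.5, falls in world space. The sketch below assumes the standard mapping and treats an infinite far plane as the limit d = 1 - near/z; the plane distances are hypothetical.

```python
import math


def z_at_depth(d, near, far=math.inf):
    """World-space depth that maps to buffer value d (near -> 0, far -> 1).

    With far = inf the mapping reduces to d = 1 - near / z.
    """
    if math.isinf(far):
        return near / (1.0 - d)
    return far * near / (far - d * (far - near))


if __name__ == "__main__":
    # (near, far) pairs are hypothetical; compare where d = 0.5 falls.
    for near, far in ((1.0, 100.0), (0.1, 100.0), (1.0, math.inf)):
        z_half = z_at_depth(0.5, near, far)
        print(f"near = {near}, far = {far}: d = 0.5 at z = {z_half:.4f}")
```

In each case half of the depth range is spent before about twice the near distance. Moving the near plane moves that point proportionally, while moving the far plane out to infinity barely shifts it.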

What about floating-point depth? The following graph adds tick marks corresponding to a simulated float format with 3 exponent bits and 3 mantissa bits:

There are now 40 distinct values in [0, 1], quite a bit more than the 16 values previously, but most of them are uselessly bunched up at the near plane where we didn't really need more precision.

A now-widely-known trick is to reverse the depth range, mapping the near plane to d=1 and the far plane to d=0:

Much better! Now the quasi-logarithmic distribution of floating-point somewhat cancels the 1/z nonlinearity, giving us similar precision at the near plane to an integer depth buffer, and vastly improved precision everywhere else. The precision worsens only very slowly as you move farther out.
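
To get a feel for the size of the effect with a real 32-bit float rather than the toy format, the sketch below estimates the world-space gap between one stored depth value and the next representable one, for the standard and the reversed mapping. It only models quantization of the stored value, not arithmetic error in the transform, and it emulates float32 with Python's `struct` module. The function names and plane distances are assumptions for illustration.

```python
import struct


def to_f32(x):
    """Round a Python float (double) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def next_f32_up(x):
    """Next representable float32 above a non-negative float32 value x."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits + 1))[0]


def constants(near, far, reversed_z):
    """(a, b) for d = a / z + b; reversed_z maps near -> 1 and far -> 0."""
    if reversed_z:
        return far * near / (far - near), -near / (far - near)
    return -far * near / (far - near), far / (far - near)


def z_resolution(z, near, far, reversed_z):
    """World-space gap between the float32 depth value at z and its neighbor."""
    a, b = constants(near, far, reversed_z)
    d = to_f32(a / z + b)
    d_next = next_f32_up(d)
    return abs(a / (d_next - b) - a / (d - b))


if __name__ == "__main__":
    near, far = 0.1, 10000.0  # hypothetical planes
    for z in (1.0, 10.0, 100.0, 1000.0):  # strictly between the planes
        std = z_resolution(z, near, far, reversed_z=False)
        rev = z_resolution(z, near, far, reversed_z=True)
        print(f"z = {z:7.1f}: standard {std:.3e}, reversed {rev:.3e}")
```

Close to the near plane the two mappings come out within about an order of magnitude of each other, but farther out the reversed mapping resolves depth orders of magnitude more finely.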

The reversed-Z trick has probably been independently reinvented several times, but goes at least as far back as a SIGGRAPH ’99 paper by Eugene Lapidous and Guofang Jiao (no open-access link available, unfortunately). It was more recently re-popularized in blog posts by Matt Pettineo and Brano Kemen, and by Emil Persson's *Creating Vast Game Worlds* SIGGRAPH 2012 talk.

All the previous diagrams assumed [0, 1] as the post-projection depth range, which is the D3D convention. What about OpenGL?

OpenGL by default assumes a [-1, 1] post-projection depth range. This doesn't make a difference for integer formats, but with floating-point, all the precision is stuck uselessly in the middle. (The value gets mapped into [0, 1] for storage in the depth buffer later, but that doesn't help, since the initial mapping to [-1, 1] has already destroyed all the precision in the far half of the range.) And by symmetry, the reversed-Z trick will not do anything here.
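
The loss is easy to reproduce in isolation. The sketch below takes two depth values near 0 that are distinct as 32-bit floats, pushes them through a [-1, 1] range and back, and checks whether they are still distinct afterwards. It models only the rounding of the intermediate value, not a full GL pipeline, and the sample values are hypothetical.

```python
import struct


def to_f32(x):
    """Round a Python float (double) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def through_ndc(d):
    """Send a [0, 1] depth through a float32 [-1, 1] range and back again."""
    ndc = to_f32(2.0 * d - 1.0)     # value as it would sit in [-1, 1]
    return to_f32(0.5 * ndc + 0.5)  # remap to [0, 1] for storage


if __name__ == "__main__":
    d1, d2 = to_f32(1.0e-6), to_f32(1.0001e-6)  # hypothetical nearby depths
    print("distinct directly:", d1 != d2)
    print("distinct via NDC: ", through_ndc(d1) != through_ndc(d2))
```

The two values differ when stored directly but collapse to the same float once they have been shifted next to -1, where the float spacing is much coarser than it is next to 0.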

Fortunately, in desktop OpenGL you can fix this with the widely-supported ARB_clip_control extension (now also core in OpenGL 4.5 as glClipControl). Unfortunately, in GL ES you're out of luck.

### The Effects of Roundoff Error

The 1/z mapping and the choice of float versus integer depth buffer are a big part of the precision story, but not all of it. Even if you have enough depth precision to represent the scene you're trying to render, it's easy to end up with your precision controlled by error in the arithmetic of the vertex transformation process.

As mentioned earlier, Upchurch and Desbrun studied this and came up with two main recommendations to minimize roundoff error:

1. Use an infinite far plane.
2. Keep the projection matrix separate from other matrices, and apply it in a separate operation in the vertex shader, rather than composing it into the view matrix.

Upchurch and Desbrun came up with these recommendations through an analytical technique, based on treating roundoff errors as small random perturbations introduced at each arithmetic operation, and keeping track of them to first order through the transformation process. I decided to check the results using direct simulation.

My source code is here (Python 3.4 with numpy). It works by generating a sequence of random points, ordered by depth, spaced either linearly or logarithmically between the near and far planes. Then it passes the points through view and projection matrices and the perspective divide, using 32-bit float precision throughout, and optionally quantizes the final result to 24-bit integer. Finally, it runs through the sequence and counts how many times two adjacent points (which originally had distinct depths) have either become indistinguishable because they mapped to the same depth value, or have actually swapped order. In other words, it measures the rate at which depth comparison errors occur, which corresponds to issues like Z-fighting, under different scenarios.
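
The following is a much-reduced sketch of that kind of experiment, not the author's script. It assumes evenly spaced rather than random depths, skips the view matrix and the 24-bit quantization, keeps only the depth-related part of the projection, and emulates float32 arithmetic with Python's `struct` module instead of numpy. All names are hypothetical.

```python
import struct


def f32(x):
    """Round a Python float (double) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def projected_depth(z, a, b):
    """Depth after a float32 projection row and perspective divide.

    The projection produces clip_z = b*z + a and clip_w = z, so the exact
    result would be d = a/z + b; every operation here rounds to float32.
    """
    z = f32(z)
    clip_z = f32(f32(f32(b) * z) + f32(a))
    return f32(clip_z / z)


def count_errors(near, far, count, reversed_z):
    """Count adjacent depth pairs that come out indistinguishable or swapped."""
    scale = far - near
    if reversed_z:
        a, b, sign = far * near / scale, -near / scale, -1.0
    else:
        a, b, sign = -far * near / scale, far / scale, 1.0
    depths = [near + scale * i / (count - 1) for i in range(count)]
    # Flip the sign for reversed-Z so "farther" always means "larger".
    ds = [sign * projected_depth(z, a, b) for z in depths]
    indist = sum(1 for p, q in zip(ds, ds[1:]) if q == p)
    swap = sum(1 for p, q in zip(ds, ds[1:]) if q < p)
    return indist, swap


if __name__ == "__main__":
    for reversed_z in (False, True):
        indist, swap = count_errors(0.1, 10000.0, 10000, reversed_z)
        print(f"reversed_z={reversed_z}: indist={indist}, swap={swap}")
```

Because this setup differs from the original experiment, its counts are not comparable to the article's figures; it is only meant to show the shape of the measurement.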

Here are the results obtained for near = 0.1, far = 10K, with 10K linearly spaced depths. (I tried logarithmic depth spacing and other near/far ratios as well, and while the detailed numbers varied, the general trends in the results were the same.)

In the table, “indist” means indistinguishable (two nearby depths mapped to the same final depth buffer value), and “swap” means that two nearby depths swapped order.

Apologies for not graphing these, but there are too many dimensions to make it easy to graph! In any case, looking at the numbers, a few general results are clear.

- There is no difference between float and integer depth buffers in most setups. The arithmetic error swamps the quantization error. In part this is because float32 and int24 have almost the same-sized ulp in [0.5, 1] (because float32 has a 23-bit mantissa), so there actually is almost no additional quantization error over the vast majority of the depth range.
- In many cases, separating the view and projection matrices (following Upchurch and Desbrun’s recommendation) does make some improvement. While it doesn't lower the overall error rate, it does seem to turn swaps into indistinguishables, which is a step in the right direction.
- An infinite far plane makes only a minuscule difference in error rates. Upchurch and Desbrun predicted a 25% reduction in absolute numerical error, but it doesn't seem to translate into a reduced rate of comparison errors.

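
The claim about ulp sizes in the first point can be checked directly. This sketch compares the spacing of 32-bit floats at a few sample values with the step of a 24-bit normalized integer buffer, taken here to be 1/(2^24 - 1); the helper name is made up for illustration.

```python
import struct


def f32_spacing(x):
    """Gap between the float32 nearest to positive x and the next one up."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    lo = struct.unpack("<f", struct.pack("<I", bits))[0]
    hi = struct.unpack("<f", struct.pack("<I", bits + 1))[0]
    return hi - lo


INT24_STEP = 1.0 / (2**24 - 1)  # step of a 24-bit normalized-integer buffer

if __name__ == "__main__":
    for x in (0.75, 0.5, 0.25, 0.001):
        ratio = f32_spacing(x) / INT24_STEP
        print(f"x = {x}: float32 spacing / int24 step = {ratio:.6f}")
```

In [0.5, 1) the ratio is essentially 1, and it only drops below that for values under 0.5, which under the standard mapping cover just the sliver of the scene right next to the near plane.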

The above points are practically irrelevant, though, because the real result that matters here is: the reversed-Z mapping is basically magic. Check it out:

- Reversed-Z with a float depth buffer gives a zero error rate in this test. Now, of course you can make it generate some errors if you keep tightening the spacing of the input depth values. Still, reversed-Z with float is ridiculously more accurate than any of the other options.
- Reversed-Z with an integer depth buffer is as good as any of the other integer options.
- Reversed-Z erases the distinctions between precomposed versus separate view/projection matrices, and finite versus infinite far planes. In other words, with reversed-Z you can compose your projection matrix with other matrices, and you can use whichever far plane you like, without affecting precision at all.

I think the conclusion here is clear. In any perspective projection situation, just use a floating-point depth buffer with reversed-Z! And if you can't use a floating-point depth buffer, you should still use reversed-Z. It isn't a panacea for all precision woes, especially if you're building an open-world environment that contains extreme depth ranges. But it's a great start.

Nathan is a Graphics Programmer, currently working at NVIDIA on the DevTech software team. You can read more on his blog.