Gradient Domain Path Tracing
Dario Seyb
January 22, 2017
Abstract
In recent years the popularity of the path tracing rendering algorithm has risen dramatically. Especially
the movie industry appreciates the accuracy and relative simplicity of physically based rendering. Un-
fortunately the drawbacks of path tracing are still holding back artistic expression. For complex scenes
rendering times can be prohibitive. Hence extensions of path tracing which address these issues have to be
found. In this report we will present Gradient Domain Path Tracing”, one of the more recent extensions.
We will describe the algorithm, its advantages over classical path tracing and recent work in gradient
domain rendering. Additionally we will provide an implementation of the algorithm in the rendering
framework zaphod” [Sey17].
Keywords Path Tracing, Physically Based Rendering, Gradient Domain Light Transport
Contents
1 Introduction 2
2 A Brief Introduction to Path Tracing 2
3 Introduction to Gradient Domain Light Transport 3
3.1 Symmetric Gradient Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
4 Shift Mappings 6
4.1 Construction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
4.1.1 Shift via Manifold Exploration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
4.1.2 Improved Shift via Manifold Exploration . . . . . . . . . . . . . . . . . . . . . . . . 7
4.1.3 Half Vector Replication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
4.1.4 Discussion of Shift Mappings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
4.2 Computing the Jacobian . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
4.3 Multiple Importance Sampling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
5 Solving Screened Poisson Equations 10
6 A Theoretical Analysis of Gradient Domain Light Transport 11
6.1 A Math Primer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
6.1.1 The Discrete Fourier Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
6.1.2 Power Spectra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
6.1.3 Dirac Delta Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
6.2 Analyzing the Components of Gradient Domain Methods . . . . . . . . . . . . . . . . . . 13
6.2.1 The Gradient Operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
6.2.2 Stochastic Sampling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
6.2.3 Integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
6.2.4 Poisson Reconstruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1
7 Recent Work 16
7.1 Gradient Domain Bidirectional Path Tracing . . . . . . . . . . . . . . . . . . . . . . . . . 16
7.1.1 Shift Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
7.1.2 Connection Strategy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
7.2 Temporal Gradient-Domain Path Tracing . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
7.2.1 Shift Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
7.2.2 Spatio-temporal Reconstruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
8 Discussion 18
8.1 Equal Time Comparisons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
8.2 Analysing Shift Mappings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
9 Acknowledgements 20
1 Introduction
Gradient domain methods are a rather recent research area in light transport simulation. Especially
in the last two years there have been multiple extensions to the original algorithm, applying gradient
rendering to existing light transport strategies. This report intends to be a survey of the state of the
art in gradient domain rendering. We will start by describing the basic algorithm and the components
needed to implement it. Next we will present a theoretical analysis of gradient rendering which provides
a mathematical explanation for the empirical results. Finally we will compare the various gradient
domain adaptions. We are especially interested in the difference between Gradient Domain Metropolis
Light Transport (MLT) and Gradient Domain Bidirectional Path Tracing (BDPT) [Vea98]. We focus
on these algorithms because their non-gradient domain versions are popular in research and production.
Additionally both MLT and BDPT are unbiased, sampling-based algorithms that handle a large variety
of scenes. Even traditionally hard to render scenes are handled robustly. As such, the scenes chosen
for analysis should be handled similarly well between algorithms, and the comparison can focus on the
influence of gradient rendering on their results.
2 A Brief Introduction to Path Tracing
Path tracing is a technique used for light transport simulation which uses Monte Carlo sampling to
evaluate the integral in the rendering equation by [Kaj86]
I(x, x
0
) =
Geometry Term
z }| {
g(x, x
0
)[(x, x
0
)
| {z }
Emission
+
Z
S
BRDF
z }| {
ρ(x, x
0
, x
00
)
Incoming light
z }| {
I(x
0
, x
00
) dx
00
| {z }
Light reflected at x
0
in direction
x
0
x
] (1)
I(x, x
0
) is the radiance arriving from point x
0
at x. It depends on the light that is being reflected at
x
0
and it is defined recursively. In modern literature the prevalent mathematical framework to describe
light transport problems is the path integral formulation introduced by [Vea98].
I
j
=
Z
f
j
(¯x)(¯x) (2)
I
j
is the amount of radiance arriving at a pixel j on the sensor and f
j
is the measurement contribution
function which assigns an amount of transported energy to each path ¯x. The main difference between
Equations (1) and (2) is that instead of describing a measurement I by a local recursive integral we
describe it by globally integrating over Path Space Ω, that is, the set of all paths ¯x = x
1
x
2
. . . x
n
that are
sampled by a path tracer.
In this context Monte Carlo simulation corresponds to drawing random samples in path space. To
compute a pixel value we randomly generate paths that connect a point on the sensor to a point on a
light source. This is usually done by recursively tracing a ray through the scene and calculating a new
outgoing direction at each intersection point. Once a path is sampled we can calculate its contribution
to the image. Figure 1 illustrates this process.
x
1
x
0
x
2
x
4
x
3
Figure 1: Sampling a ray that contributes to a certain pixel on the sensor.
While the sampling process is identical whether we are using the path integral or the recursive scat-
tering formulation, working in path space allows us to develop algorithms that work on complete paths
and are decoupled from the path generation strategy.
The main issue with the Monte Carlo simulation approach is that naively sampling the high di-
mensional path space results in high variance between samples. This is visible in the final image as a
characteristic high frequency noise. Because of the high number of samples required for a convergent
image, path tracing was prohibitively expensive for many applications. The high variance is in part due
to the fact that many paths never hit a light source. Figure 2 a) shows the distribution of light paths in
a scene with diffuse materials. Blue paths never hit a light; yellow paths do. As you can see, since the
light source is small and out of the way, very few paths hit it randomly. The result is a noisy image as
shown in Figure 2 b).
1
(a) Path Density (b) Resulting Noisy Image
Figure 2: A scene illuminated by a small light source rendered with standard
path tracing. A high sample count is necessary because very few of the paths
contribute to the image.
3 Introduction to Gradient Domain Light Transport
The goal of almost any new light transport method is to reduce the variance in the evaluated samples
since this leads to faster converging images which need less time to render. While there are many ways
to reduce the variance of a given sampler, for example via importance sampling [Vea98], these are often
1
The 2D figures in this report are created with a tool we created specifically for this purpose. You can find it
at https://darioseyb.com/pathgraph. It is inspired by the paper ”Theory, Analysis and Applications of 2D Global
Illumination” [Jar+12].
quickly exhausted or provide only marginal improvements. Gradient Domain Path Tracing (G-PT) does
not change the base sampling distribution at all, instead it relies on the spectral properties of gradients
and screened Poisson reconstruction to reduce variance.
In natural images and many images we want to render with a path tracer, the energy is concentrated
in the low frequencies [Rud94]. In a traditional path tracer we are sampling the low frequencies directly.
Since the energy is the high, variance is high, too. This results in a noisy image. The idea of gradient
domain methods is to instead sample the high frequencies, i.e. gradients, directly. These are a lot lower
energy and thus can be sampled with less variance. The final image can be reconstructed from the
gradients and a (noisy) base image using Screened Poisson Reconstruction as described in Section 5. A
detailed analysis of these claims will be provided in Section 6.
The idea of direct gradient sampling was first introduced in the context of MLT [Leh+13]. Since
MLT is notoriously difficult to implement correctly, [Ket+15] applied the same concepts to classical path
tracing with great success. The resulting algorithm ”Gradient Domain Path Tracing” is easy to implement
and provides greatly improved results compared to regular path tracing.
(a) Path Tracing (b) Gradient-Domain Path Tracing
Figure 3: Comparing the results of standard path tracing (left) and G-PT
(right) at the same number of samples (10).
Instead of sampling the value I
j
in Equation (2) directly we want to sample
ij
, the finite difference
between two pixels i and j.
ij
= I
i
I
j
(3)
Naively computing finite differences is not enough to ensure low variance since Var(I
i
I
j
) = Var(I
i
)+
Var(I
j
) if I
i
and I
j
are uncorrelated. Hence we need to find a way to sample I
i
and I
j
in a strongly
correlated fashion. Recall that a single measurement of I
j
only depends on the value of the measurement
contribution function f
j
(¯x). Thus, if we manage to generate a path ¯x contributing to pixel i and a path
¯y contributing a similar amount of radiance to pixel j, the variance of
ij
will be low.
In all the gradient domain methods proposed so far this is achieved by deterministically shifting a
base path ¯x through pixel i to an offset path ¯y through pixel j. We call this shift mapping T
ij
and write
T
ij
(¯x) = ¯y. With T
ij
defined we can rewrite Equation (3) in terms of base and offset paths.
ij
= I
i
I
j
=
Z
f
i
(¯x)(¯x)
Z
f
j
(¯x)(¯x)
=
Z
f
i
(¯x) f
j
(T
ij
(¯x))|T
0
ij
|(¯x)
(4)
The transformation is straightforward except for the introduction of the Jacobian determinant |T
0
ij
|.
This is necessary because we are changing the integration variable of f
j
from ¯x to T
ij
(¯x). The integration
domain is not adjusted because we assume T to be bijective on and thus T (Ω) = Ω. Thanks to the
Naive Gradient Sampling
Highly-Correlated Gradient Sampling
Figure 4: Comparing naive gradients with carefully sampled gradients. Both
images are rendered with 100 SPP. The images on the right are the results of
Poisson reconstruction based on the gradient images on the left.
formulation of the shift as a change in integration variables we have a lot of freedom in constructing a
shift mapping. As long as we can compute the corresponding Jacobian the results will be unbiased.
3.1 Symmetric Gradient Formulation
Unfortunately it is impractical to require a shift mapping to be a bijection on the complete path space.
[Man+14] introduce a Symmetric Gradient formulation which does not pose this requirement on T . Let
us first investigate reasons why a shift may not be bijective. The first case is that there might be paths ¯x
for which T (¯x) is not defined. This can happen when the shift procedure fails, e.g. for numerical reasons.
We call these shifts non-invertible. The second case is that T (¯x) might map ¯x to a path ¯y / Ω, which
means the offset path ¯y would never be generated during regular path sampling. These shifts are called
non-symmetric. Thirdly, the first two cases might apply to the inverse shift T
1
. Bijectivity requires
T
1
(T (¯x)) = ¯x, but if the inverse shift is not well defined this requirement is violated. In the ideal
case all shifts would be symmetric (a non-invertible shift can not be symmetric, hence shifts which are
non-invertible as well as symmetric do not exist). If any of these cases apply, the integral in Equation (4)
will not return the correct value and as a result the final image will be biased.
[Man+14] formulate a different way to express
ij
which can handle cases of bijectivity violation
robustly. This formulation is extended by [Ket+15] to allow for multiple importance sampling. Instead
of just sampling the forward finite difference between pixels i and j we also sample the backwards finite
difference and combine them using weights.
ij
=
Forward Difference
z }| {
Z
w
ij
(¯x)(f
i
(¯x) f
j
(T
ij
(¯x))|T
0
ij
|)(¯x) +
Backward Difference
z }| {
Z
w
ji
(¯x)(f
j
(¯x) f
i
(T
ji
(¯x))|T
0
ji
|)(¯x) (5)
Non-symmetric and non-invertible shifts are handled via the weights w
ij
and w
ji
. In both cases
the difference in the other direction will not be sampled and we set the weight of the current sampling
strategy to 1. If the shift is symmetric the difference will be sampled twice, once via forward and once
via backward differences, so we weight it by 1/2.
Since the definition of bijectivity violations is fairly abstract we want to give practical examples for
each type of shift. Figure 5 shows the three ways shift computations can proceed.
h
x
n
x
c
a) Non-Symmetric b) Non-Invertible c) Non-Invertible
Figure 5: Shown are three examples for how a shift can fail. The shift in
a) is non-symmetric because the connection to vertex x
c
is blocked and a
blocked path will never be sampled by the path tracer. In Figure b) the offset
path hits a material with a different classification, and hence the shift is non-
invertible. Figure c) shows another case of non-invertibility. The primary ray
gets refracted, but the offset path encounters total reflection.
4 Shift Mappings
The shift mapping, that is the way we compute offset paths, is at the heart of G-PT. A good shift mapping
produces offset paths which have a similar throughput to the corresponding base path. This is not always
easy. Even small changes in the position of a path vertex can completely change the throughput of the
path if specular materials are involved. The first two vertices of an offset path do not depend on the
mapping. The first vertex is fixed on the point on the sensor that corresponds to the pixel shift from i
to j. The second vertex is the primary scene intersection of the ray defined by the camera position and
the first vertex. In this section well will take a look how to continue the offset path after the primary
intersection. In all so far proposed shift mappings the aim is to connect the offset path back to the
base path as soon as possible. This technique has proved successful, since if two paths share segments
their throughputs are likely to be similar. The points at which we connect the paths have to be chosen
carefully.
4.1 Construction
To construct an efficient shift mapping we need to classify vertices into diffuse and specular. Diffuse
vertices are easier to handle and allow more flexibility in how to connect paths. This is because their
BRDF is smooth and the directions to the next and previous path vertices don’t change the throughput
much. Specular vertices need to be treated with more care. Assume a vertex lies on the surface of a
perfect mirror. Moving either of the adjacent vertices on the path will completely nullify any throughput
through the whole path since the BRDF is only non zero for a very specific direction. To describe the
various shift mappings we introduce a general description of an offset path.
T
ij
(¯x) = ¯y = (s, h, o
0
, . . . , o
m
, x
c
, . . . , x
n
)
s = shift
ij
(x
0
)
h = R(s, ω
s
)
(6)
where shift
ij
(x
0
) denotes shifting the vertex x
0
in primary sample space from pixel i to pixel j, R
is the ray casting function and R(s, ω
s
) the primary hit of the ray starting at position s in direction
ω
s
which is determined by the orientation of the sensor. x
c
is the connection vertex. After connection,
the vertices of ¯x and ¯y coincide. In this framework the shift mapping decides the length m of the offset
vertex chain ¯o by picking a connection vertex x
c
and chooses the vertices in ¯o such that the change in
throughput is minimal.
4.1.1 Shift via Manifold Exploration The first shift mapping, introduced in [Leh+13], uses tech-
niques from Manifold Exploration [JM12] to construct parts of ¯o (Figure 6). This is a natural choice since
Manifold Exploration was developed in the context of MLT and solves the problem of connecting two
diffuse vertices via a chain of specular vertices. Here ¯o is split into three parts. x
c
is chosen as the second
diffuse vertex after h. A specular chain is traced from h (like in standard path tracing) until a diffuse
vertex o
b
is hit. A second specular chain connecting o
b
and x
c
is then constructed using the optimization
technique from [JM12]. Note that this second chain can be empty if x
c
follows directly after o
b
. In this
case they are immediately connected and no Manifold Exploration is performed.
Base Path ¯x
Offset Path T (¯x)
Diffuse Surface
Specular Surface
h
o
b
x
n
o
1
x
c
Figure 6: Shifting a path using manifold exploration. Since the primary hit h
on the offset path lies on a diffuse surface we continue the path like in standard
path tracing. The next hit o
0
lies on a diffuse surface. We connect the offset
path to the base path at the next diffuse base path vertex x
c
via manifold
exploration.
4.1.2 Improved Shift via Manifold Exploration [Man+14] contribute a new shift mapping which
reduces variance in the case where the second specular chain of the shift by [Leh+13] is empty (Figure
7). This can be an issue because the direct connection can result in a low geometry term. If the second
chain is empty the improved shift treats x
c
as specular and finds the next diffuse vertex in ¯x. It then
uses the found vertex as the new connection vertex and performs Manifold exploration for connection as
before. This completely eliminates the case of empty second specular chains.
Base Path ¯x
Offset Path T (¯x)
Diffuse Surface
Specular Surface
h
o
0
x
c
x
n
o
1
x
c
Figure 7: Shifting a path using the improved manifold exploration shift by
[Man+14]. The first hit h is on a specular surface and we copy the half vector
of the corresponding base path vertex. The second vertex o
0
is on a diffuse
surface and we could connect to the next diffuse vertex on the base path. Since
this would leave the second specular chain empty we use manifold exploration
to construct a path connecting to x
c
.
4.1.3 Half Vector Replication Finally, [Ket+15] introduce a shift not based on Manifold Explo-
ration. x
c
is chosen as the first vertex in ¯x where both x
c
as well as x
c+1
are classified as diffuse (Figure
8). This eliminates the need for connecting diffuse vertices through specular chains. For the vertices in
between h and x
c
a path is traced by replicating the projected half vectors of the base path vertices at
each step. Replicating half vectors of specular vertices proves to result in similar values for the BRDF
at each vertex, since specular BRDFs are usually well defined by the half vector between incoming and
outgoing directions.
Base Path ¯x
Offset Path T (¯x)
Diffuse Surface
Specular Surface
h
o
0
x
c
x
n
Figure 8: Shifting a path using Half Vector Replication. At the first hit h we
need to replicate the half vector of the corresponding base path vertex because
the surface is specular. As a result we hit o
0
which is on a diffuse surface. The
next base path vertex x
c
is also on a diffuse surface and we can connect the
offset path back to the base path.
4.1.4 Discussion of Shift Mappings The shift mappings based on Manifold Exploration produce
excellent results, but are hard to implement and require a costly optimization process. This is why
[Ket+15] prefer their novel shift mapping. It results in a slightly higher amount of artifacts caused by
singularities that [Man+14] avoids, but those are canceled by using the multiple importance sampling
techniques also introduced in [Ket+15]. Overall [Ket+15] is the most promising. It requires about 30
lines of C++ code to implement correctly (including Jacobian calculation) and produces great results at
little cost.
4.2 Computing the Jacobian
To account for the change in integration variables in Equation (4) we need to compute the determinant of
the Jacobian |T
0
ij
(¯x)| = |dT
ij
(¯x)/d¯x| for the used shift mapping. This is important to make sure that the
results of the integration are mathematically correct, otherwise we would introduce bias. In this section
we aim to given an intuition for what this value denotes, but we will not derive the Jacobian for any
of the presented shift mappings. The value of |T
0
ij
(¯x)| gives us a measure for how much T stretches or
squishes the path space around ¯x. Figure 9 shows how path density can be a lot lower after shifting,
resulting in high Jacobian values.
Computing this value can be challenging. In [Ket+15] the derivations are made easier by two insights.
First, the authors show that the calculation can be split into three steps. Namely, reparametrizing the
path in a way so the shift itself has a trivial to compute Jacobian, computing the Jacobian of the shifted
path in the reparameterized space and transforming the path back into the original space. Mathematically
this is expressed as multiplying the Jacobian determinants of the three steps.
|T
0
ij
(¯x)| = |
d¯y
d¯x
| =
Reparametrization
z}|{
|
d¯y
dˆy
| |
dˆy
dˆx
|
|{z}
Trivial Shift
Back Transform
z}|{
|
dˆx
d¯x
| (7)
The second insight is that computing the reparametrization at each path vertex separately and mul-
tiplying them together results in the final Jacobian for the whole shift. This allows us to combine ideas
from different shift mappings and reuse the derived Jacobians.
Base Path ¯x
Offset Path T (¯x)
Figure 9: Visualizing the path density changes caused by T
ij
. Blue paths
are sampled via regular path tracing, red paths via the strategy by [Ket+15].
In the bottom right corner the density of red paths is much lower than the
density of blue paths, leading to high Jacobian values.
4.3 Multiple Importance Sampling
As shown in the previous section the Jacobian values can get very large. These high values can be an
issue, because they cause outliers in the gradients which manifest themselves after reconstruction as very
obvious artifacts. As shown in Figure 10 they mostly appear in corners and are rather unpleasant as far as
rendering artifacts go. The typical way to deal with outliers like these is to employ multiple importance
sampling ([Vea98]). Due to the Symmetric Gradient Formulation we already weight the forward and
backward gradient sampling strategies. If we extend these weights to be continuous we can weight the
strategies by their importance in the symmetric shift case, where they would otherwise be 1/2 for both
strategies.
Figure 10: Even with multiple importance sampling, artifacts appear in
corners and on edges at low sample counts.
To fit the difference sampling into the framework needed for importance sampling, we interpret the
base path ¯x and the inverse shift T
ji
(¯y) = ¯x as two strategies to sample the same distribution. The
weights are then computed as follows when using the balance heuristic.
w
ij
(¯x) =
p(¯x)
p(¯x) + p(T
ij
(¯x))|T
0
ij
(¯x)|
(8)
The intuition here is that if the Jacobian |T
0
ij
(¯x)| is large the contribution of the sample will be
weighted by approximately 1/|T
0
ij
(¯x)|. Hence outliers will be suppressed.
5 Solving Screened Poisson Equations
The outputs of a gradient domain path tracer are a primal image I
g
and a set of gradient images I
d
.
With G-PT the gradient images are the spatial differences I
dx
and I
dy
. We want to reconstruct the final
image I from these values. That is, we want to estimate the image function I from its gradients. It is
easy to see that I is optimal if it minimizes the constraint in Equation (9), because that means that the
gradient I is as close as possible to our computed gradient images. [Bha+08]
Z Z
||I
I
dx
I
dy
||
2
dxdy (9)
Reconstructing an image based on this constraint introduces issues if there is error in the gradients.
Especially low frequencies are not reconstructed well, because small gradient error accumulates over longer
distances. To improve the reconstruction we use the primal image I
g
. The primal image is noisy in the
high frequencies, but represents low frequencies fairly well, even at low sample counts. [Bha+08] add an
additional term to the integrand in Equation (9).
Z Z
α(I I
g
)
2
+ ||I
I
dx
I
dy
||
2
dxdy (10)
Here α is a factor that controls the influence of the primal image on the solution. Note that setting
α to 0 reduces Equation (10) to Equation (9). Expressing the final image as in Equation (11) allows us
to use standard numerical solvers.
I = argmin
¯
I
(||α(
¯
I I
g
)||
p
+ ||
¯
I
I
dx
I
dy
||
p
) (11)
The variable p allows us to change the norm used for minimization. Only the L2 norm results in
bias-free images, but [Ket+15] illustrate that using the L1 norm, while introducing bias, results in images
with less artifacts and converges to the same results when the sample count is high. Artifacts in L2
reconstructions stem from the outliers in the gradient sampling result discussed in the previous section.
The reconstruction spreads out the error to minimize it. L1 reconstruction does not spread out error as
much and as such outliers generally only influence one pixel and manifest as fireflies which are a common
artifact in path tracing and hence do not stand out as much. While multiple importance sampling reduces
the gradient outliers and makes L2 reconstruction more practical, L1 reconstruction still produces cleaner
images.
L1-Reconstrction L2-Reconstrction
Figure 11: Reconstruction using different norms. The base gradient and
primal images are exactly the same.
6 A Theoretical Analysis of Gradient Domain Light Transport
Early papers on gradient domain methods provided little in the way of rigorous mathematical analysis.
[Leh+13] discuss the frequency characteristics of the gradient operator and Poisson reconstruction, but do
not go into any further depth. Particularly the factor α in Equation (11) is chosen empirically. The goal
of analysing the frequency characteristics of light transport is to gain a deeper insight into where error is
coming from and why gradient methods are successful. In the following we will introduce commonly used
techniques and motivate gradient based light transport by analyzing the frequency content of natural
images and their gradients.
6.1 A Math Primer
Many of the techniques used in signal processing and image analysis are not commonly taught in un-
dergraduate courses. For the benefit of the reader we will give a short overview over the mathematical
concepts and notation used in this section.
6.1.1 The Discrete Fourier Transform Our aim is to analyze the frequency content of a discretely
sampled signal f (e.g an image). For now we can assume f to be a continuous function defined over the
real numbers. The frequency content of a continuous function, also called its Fourier spectrum can be
found by projecting it onto a basis of complex exponential functions E
ω
(x) = e
i2πωx
. This makes sense
if you recall that a complex exponential function can be decomposed into sine and cosine as follows
e
i2πωx
= cos(2πωx) + i sin(2πωx). Hence, once we can express a function f in terms of a set of periodic
parts E
ω
and thus know the factor for each frequency ω we know how much that frequency influences
the function f. The forward and backwards Fourier transforms are given by [SSJ16] as follows.
2
F (ω) =
Z
−∞
f(x)e
i2πωx
dx (12)
f(x) =
Z
−∞
F (ω)e
i2πωx
(13)
F (ω) is the factor for the frequency ω we are looking for. Equation (13) shows how to reconstruct
a function from its periodic parts. This definition only works with continuous functions, but not with a
set of discrete samples which we are typically trying to analyze. To deal with this we need two further
facts about the Fourier transform. First, if f is periodic, its Fourier spectrum is non-zero only at discrete
points. Secondly, if f is discrete, its Fourier spectrum will be periodic [SSJ16]. Generally we sample f
over a finite range and can assume that the sampled values repeat outside that range. Hence we can
assume f to be periodic and thus f has a discrete Fourier spectrum. Also, as already said, f is sampled
at discrete points and hence its Fourier spectrum is periodic. Combining these two properties we can see
that we can represent the Fourier spectrum of a discrete, periodic function with a finite amount of values
(since it itself is discrete and periodic).
In practice we need exactly as many samples of the Fourier spectrum as we have samples of the original
function. From those spectrum samples the original samples can be reconstructed perfectly.
Another important concept is that a multiplication f × g in the spatial domain corresponds to a
a convolution F G in the frequency domain. Correspondingly, a convolution in the spatial domain
corresponds to a multiplication in the frequency domain. This is easy to prove using the definitions of
the Fourier transform and its inverse.
Furthermore, evaluating F (ω) at 0 equals the integral of f over its complete domain as shown in
Equation (14).
F (0) =
Z
−∞
f(x)e
i2πw×0
dx
=
Z
−∞
f(x)dx
(14)
Photograph
High-Gradient Image
Figure 12: Left: Images with different frequency content, Right: Their power
spectra. Note the vertical spike in the power spectrum of the photograph,
indicating the existence of higher frequencies in the y direction of the image.
This corresponds to the clear separation between sky, mountains, trees and
the lake in the image.
6.1.2 Power Spectra A power spectrum is the square of a Fourier transform |F (ω)|
2
. It is often used
to represent the results of analysis since it presents the frequency content f in a very visual way.
Figure 12 shows a natural image and an image constructed to have high gradients on the left and
their power spectra on the right. In a power spectrum the values in the centre of the image correspond
to low frequencies. As you can see, the power of the natural image is focused in the low frequencies while
the high-gradient image has a spread out power spectrum.
6.1.3 Dirac Delta Functions A common problem in light transport is representing and analyzing
functions which are 0 everywhere except for one specific point. This comes up for example in the BRDF
of a perfectly specular material. It reflects light into only one very specific direction. Unfortunately it
is hard to work with such a functions since they are not integrable via usual methods. To solve this we
introduce the Dirac Delta Function δ(x). Function” is in this case a bit of a misnomer, because it is
technically not a function, but a limit of a series of functions defined as follows. [Bra00]
g(x, t) =
1
2πt
e
x
2
/2t
δ(x) = lim
t0
g(x, t)
(15)
g(x, t) is a Gaussian centered around the origin, which gets sharper as t decreases. Figure 13 shows
how g approaches to have one of the properties we want, namely, vanishing everywhere except for a
shrinking footprint around 0. We call δ a function, because we can work with it like it is one in the
context of integration. Specifically, multiplying it with a function f inside of an integral, like in Equation
(1), results in the following:
hδ(x), f (x)i = lim
t0
Z
−∞
g(x, t)f (x)dx = f (0)
(16)
2
From here on out a Fourier transform will always be denoted by a capital letter.
An important property for the next section is its Fourier transform, given by ∆(ω) = 1. This follows
trivially from the definition of the Fourier transform in Equation (12) and the identity in Equation (16).
3 2 1 1 2 3
1
2
3
4
x
g(x, t)
Figure 13: Plotting g for multiple values of t. Note how it approaches a spike
at 0.
6.2 Analyzing the Components of Gradient Domain Methods
This section is a summary of the analysis performed in [Ket+15]. At the heart of the analysis is the result
that the efficiency of gradient methods depends on the power spectrum of the image function. Images
with sparse gradients (which means power is concentrated in the low frequencies) benefit the most from
gradient rendering. Luckily this is a property that most natural images have [Rud94]. Since we mostly
want to depict natural scenes, this is also the case for the subjects of light transport simulation. Note
that natural in this context doesn’t refer to scenes in nature”, but to scenes that depict reality in some
(even stylized) way.
The analysis is done in parts and makes assumptions and simplifications to reduce its verbosity. The
main simplification is that we are only investigating two dimensional scenes and hence one dimensional
sensors. We also limit ourselves to one bounce. That way path space can be represented in two dimensions.
The primary sample space dimension x and the secondary sample space at the first bounds p. Figure 14
shows the 2D scene we are using for the analysis.
x
p
Figure 14: A two dimensional scene with a camera and a path starting at
the camera, bouncing once and then being connected to the light.
Figure 15 is meant as a visual guide through this analysis. Sections advance from left to right. The
top row shows the analysis performed on the primal image, the bottom row on the gradient image. We
advice to come back to the figure after each section.
6.2.1 The Gradient Operator First we investigate the power spectrum of the gradient image G.
We use a simplified shift mapping T (x, ¯p) = (x + 1, ¯p) which only changes the primary sample space
location of the path (x, ¯p). The gradient function is then defined as g(x, ¯p) = f(x, ¯p) f(x + 1, ¯p).
For convenience we can express g as a convolution g(x, ¯p) = (d f)(x, ¯p) between the finite difference
operator d(x) = δ(x) δ(x + 1) and f where δ is the Dirac delta function. As noted in section 6.1.1 this
0 50 100 150 200 250 300 350 400 450 500
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
a) Image contribution f b) Power Spectrum |F |
2
c) PSD |S|
2
|F |
2
d) Integrated Image
0 50 100 150 200 250 300 350 400 450 500
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
e) Gradient g = d f f) Power Spectrum |G|
2
g) PSD |S|
2
|G|
2
h) Integrated Gradients
Path Dimension ¯p
Image Dimension x
Path Freq. Dimension ω
¯p
Image Freq. Dimension ω
x
Gradient Operator
Fourier Transform
Sampling
Integration
Reconstruction
0 50 100 150 200 250 300 350 400 450 500
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
i) Result
Figure 15: This figure shows the steps of the analysis. The top and the
bottom row show the same transformations applied to the primal image f and
the gradient image g respectively. On the left (a) and e)) we show the reference
images with no sampling error. The next column shows the respective Fourier
transforms. Note that that b) contains a lot more power than f). This is the
result of Equation (17) and the fact that all of the power of F is concentrated
in the low frequencies. Next we show the power spectral densities of f and g
after sampling. The result from Equation (18) is visually represented by the
lighter constant background of c). This is because sampling adds noise to all
frequencies equally and the power of the added noise is proportional to the
power of the sampled function. Next we integrated of the path dimension,
which corresponds to slicing along the red line in c) and g). The results of the
integration are shown in d) and h). Not the considerably higher base noise
in d). Poisson reconstruction corresponds to combining the results in d) and
h) while low-pass filtering d) and high-pass filtering h). Figure i) shows the
result of the reconstruction in the frequency domain. It retains the low level
of noise of h) while representing low frequencies accurately.
convolution corresponds to a multiplication in the Fourier domain and hence the Fourier transform of g
is G(ω
x
, ω
¯p
) = D(ω
x
, ω
¯p
)F (ω
x
, ω
¯p
). We can not give F directly since it depends on the scene, but we can
investigate d. Its Fourier transform is given by D(ω
x
, ω
¯p
) = δ(ω
x
, ω
¯p
) δ(ω
x+1
, ω
¯p
) = 1 e
i2πω
x
. This
follows trivially from the identities given in section 6.1.3. An important property is the power spectrum
of D, |D|
2
= |1 e
i2πω
x
|
2
= 2 2 cos(2πω
x
)
3
. In Equation (17) we summarize the spectral properties of
the gradient function.
G(ω
x
, ω
¯p
) = (1 e
i2πω
x
)F (ω
x
, ω
¯p
)
|G(ω
x
, ω
¯p
)|
2
= (2 2 cos(2πω
x
))|F (ω
x
, ω
¯p
)|
2
(17)
Especially the second result is interesting. Figure 16 shows how the finite difference operator attenu-
ates low frequencies ω
x
<
1
6
.
6.2.2 Stochastic Sampling The complete analysis of stochastic sampling is fairly lengthy and would
be out of scope for this report. We limit ourselves to presenting the assumptions and results. For a more
detailed description we refer to the supplemental materials of [Ket+15]. The sampling process is modeled
as a multiplication s × f where s is the sampling function. Its power spectral density ([MC12]) is given
by Equation (18). It can be interpreted as the expected value of a power spectrum.
3
This could also be simplified to 4 sin
2
(πω
x
). This is not done in the original paper and we keep it to avoid
confusion.
0.1 0.2 0.3 0.4 0.5
1
2
3
4
ω
x
|G|
2
Figure 16: The power spectrum of G assuming F is constant. Areas shaded
green correspond to a attenuation of power in the corresponding frequencies.
While the power of frequencies approaching the Nyquist limit is amplified
significantly, most images do not contain these frequencies as discussed in
section 6.1.2.
(|S|
2
|F |
2
)(ω
x
, ω
¯p
) = n
2
|F |
2
(ω
x
, ω
¯p
) +
Frequency independent term
z }| {
n||F ||
2
(18)
where n is the number samples drawn. The important result is that sampling adds power to all
frequencies equally and the added power is proportional to the energy of the signal.
6.2.3 Integration Finally we need to compute the power spectrum of the Monte Carlo integration.
We integrate f over all paths ¯p to obtain a pixel value I as in Equation (2). The Fourier transform
of that integral corresponds to slicing the Fourier transform of f at w
¯p
= 0 (see Equation (14)). Since we
are taking n discrete samples and simply adding them up we need to multiply the output of the sampling
process by 1/n (and the power spectrum by 1/n
2
).
4
1
n
2
(|F |
2
|S|
2
)(ω
x
, 0)) = |F (ω
x
, 0)|
2
+
1
n
||F ||
2
(19)
Note that this is just the power spectrum of the converged image with an additional error term
|
F
(ω
x
)|
2
= ||F ||
2
/n which gets smaller when the number of samples increases. It is proportional to the
energy of the image and constant over all frequencies.
Having derived the error term |
F
(ω
x
)|
2
for the image, we can similarly give the error term for gradient
sampling as |
G
(ω
x
)|
2
= ||G||
2
/n. We see in Equation (20) that the relative error between gradient and
direct sampling only depends on the relative power of the gradient image versus the primal image.
|
G
(ω
x
)|
2
|
F
(ω
x
)|
2
=
||G||
2
||F ||
2
(20)
Recalling the result from section 6.2.1 this means that, since the energy of the gradient image will be
lower than the energy of the primal image for most natural images, the error introduced by sampling will
be lower as well. This is the first important result of the analysis.
6.2.4 Poisson Reconstruction As a last step the analysis turns to the screened Poisson reconstruc-
tion. Again, we will only present results since the derivations are fairly lengthy. First we find the error
term (analogous to Equation (20)) for the image error after reconstruction. This is done by transforming
the Poisson reconstruction problem into the Fourier domain [Bha+08] and plugging in the definitions of
F and G.
|
R
α
|
2
=
α
4
|
F
|
2
+ |D|
2
|
G
|
2
(α
2
+ |D|
2
)
2
(21)
Here α is the same values as in Equation (11) and sets the weight of the primal image during the
reconstruction. Next we derive an optimal value α
for the factor α in Equation (21).
4
We omit the pixel filter here since it is non essential to the result of the analysis.
α
2
=
||G||
2
||F ||
2
(22)
In practice the optimal value is unknown since we need the converged images to calculate it, but a
value of 0.2 seems to be good for many different scenes and is used for all the reconstructions in this
report. Substituting α
for α in Equation (21) gives us an expression returning the minimal error after
reconstruction for a given image.
|
R
α
|
2
=
|
G
|
2
|
F
|
2
|
F
|
2
|D|
2
+ |
G
|
2
(23)
For comparison purposes we’d like to know the factor by which gradient sampling reduces the error
versus direct sampling. For this we rearrange Equation (23) and use the result in Equation (20) to
simplify, giving us:
r =
|
F
(ω
x
))|
2
|
R
α
(ω
x
))|
2
=
||G||
2
/||F ||
2
+ 2 2 cos(2πω
x
)
||G||
2
/||F ||
2
(24)
This is the central result of the analysis. If r has a high value the error is reduced by a large margin.
Reconstruction effectively combines the high frequencies of the gradient images with the low frequencies of
the primal images. In the process it filters the low frequency error from gradients and the high frequency
error in the primal image and thus reduces the overall error.
7 Recent Work
Because of the promising results provided by G-PT there has been additional research conducted in this
area of light transport simulation. A focus has been extending G-PT to more robust algorithms than
path tracing and exploiting temporal coherence in animated image sequences.
7.1 Gradient Domain Bidirectional Path Tracing
While G-PT does considerably better than traditional path tracing in most scenes it is still subject to the
same limitations path tracing suffers from. Rendering scenes with difficult light transport characteristics
like the one in Figure 17 still results in high variance, even at high sample counts. To address these issues
[Man+15] introduces Gradient Domain Bidirectional Path Tracing (G-BDPT). G-BDPT is the natural
extension of the Bidirectional Path Tracing Algorithm by [Vea98] into the gradient domain.
The main contributions are an adaption of the shift mapping by [Man+14] to the bidirectional path
construction algorithm, an efficient BDPT connection strategy and a multiple importance sampling tech-
nique which reduces gradient sampling artifacts drastically.
7.1.1 Shift Mapping The shift mapping used in [Man+15] is identical to the one proposed by
[Man+14]. It is similar to the G-PT shift described earlier in this paper. The main difference is that
instead of requiring two consecutive diffuse vertices to connect to the base path this shift always connects
at the third diffuse vertex. The number of specular vertices between the second and the third diffuse
vertex is not considered. Hence the variance of the throughput of the resulting shift paths is potentially
higher, but the shift mapping retains properties that can be exploited to make gradient sampling drasti-
cally more efficient in the context of BDPT. We are basically trading higher variance for more samples
which in this case proves to result in lower error at equal render time.
7.1.2 Connection Strategy The cost of Bidirectional Path Tracing over traditional Path Tracing is
not negligible. Hence a practical shift mapping most be efficient. Naively shifting every path generated
by the many connection strategies would be prohibitively expensive.
[Man+15] propose to drop connection strategies where the connecting path vertices are classified as
specular. They reason that these strategies don’t contribute much energy anyway since the BRDF at the
connecting vertices is likely to have small values when evaluated into essentially random directions.
Ignoring these connection strategies allows us to shift the light and eye paths once and reuse large
parts of the original shifts (including their Jacobian values).
Figure 17: A scene which is difficult to render with standard path tracing,
because the light sources are small and covered. Even after rendering 20 000
SPP it did not converge to a clear image.
7.2 Temporal Gradient-Domain Path Tracing
Temporal methods have become popular in computer graphics in recent years. Most of them focus on
exploiting temporal coherence through filtering rendered images in a post processing step. Especially
Temporal Anti Aliasing” is widely used in computer games and there’s been a lot of research on how
to deal with the artifacts introduced by temporal filtering. There have been attempts to apply temporal
filtering to path tracing, but the common post processing based methods have trouble dealing with very
noisy images and view dependent lighting effects [Sey16].
[Man+16] introduce Temporal Gradient Domain Path Tracing (T-GPT), a technique which extends
gradient domain path tracing into the spatial domain. Instead of just computing gradients between pixels
in space, they also compute differences over time. Additionally, instead of applying a post processing filter
they take advantage of temporal coherence at a sample level. Similarly to how G-PT computes offset paths
T
ij
(¯x) between pixels in space, T-GPT adds offset paths between pixels in time as well. Additionally,
second order differentials (differences in time and space) are computed. The resulting images I
dx
, I
dy
,
I
dt
, I
dxdt
and I
dydt
are combined in a 3D Poisson reconstruction step.
7.2.1 Shift Mapping The shift mapping for spatial derivatives is the same as the one used for G-PT.
For time derivatives two mappings are proposed. The first one is straight forward. The primary sample
space (pixel) position of a path is kept constant between frames and paths are reconnected as usual. This
can result in large path space differences if objects or the camera move between frames since the chance
of an object moving away from under a ray is quite large. The second proposed shift mapping takes
motion vectors, that is object and camera movement projected into primary sample space, into account.
Using the motion vectors pixels positions are reprojected between frames. Instead of naively using the
same primary sample space position for the temporal shift the reprojected one is used.
In practice the shifts with a temporal mapping are not computed explicitly. Instead we render two
consecutive frames independently from each other. We ensure coherence between frames by starting
off the random number generator with the same seed both times. In multi-threaded or multi-machine
environments additional care has to be taken to ensure that rendering process is deterministic.
7.2.2 Spatio-temporal Reconstruction To ensure temporal coherence in the output image sequence
(resulting in more pleasing, flicker-less animations) we need to solve for image sequences instead of single
images. This is achieved by solving the following minimization problem. Note that all the symbols in the
equation denote image sequences.
I = argmin
¯
I
(||α(
¯
I I
g
)||
p
+ ||
H
dx
¯
I
H
dy
¯
I
H
dt
¯
I
H
dxdt
¯
I
H
dytdt
¯
I
I
dx
I
dy
I
dt
I
dxdt
I
dydt
||
p
) (25)
Where H
denotes the finite difference operator and I
g
is the sequence of primal images. This
is basically identical to the reconstruction in Equation (11) except for the fact that we reconstruct a
sequence of images. To achieve arbitrary length reconstruction [Man+16] use a sliding window of 20
frames.
8 Discussion
In this report we presented the state of the art in gradient domain rendering. We have provided an
analysis that shows, under certain simplifications, how the process of sampling gradients followed by
Poisson reconstruction suppresses high frequency noise and reduces the overall error. The papers which
describe the individual techniques already have detailed analysis comparing their work to the non-gradient
domain counterparts. Also, a direct comparison between gradient domain methods is not very insightful.
For example, it is obvious that gradient domain path tracing is going to perform worse than gradient
domain bidirectional path tracing in scenes with difficult light transport. Gradient domain rendering does
not change the characteristics of the underlying technique, but provides a constant error reduction factor
(see Equation (20)). Even with a very bad shift mapping, which always produces non-symmetric paths,
we will not do worse than the base algorithm at equal sample counts as long as we handle non-symmetric
shifts correctly. Something that is worth investigating is the run time overhead of gradient sampling,
which is dominated by the time taken to construct offset paths. This is why equal time comparisons tend
to be more meaningful than equal sample count comparisons. Additionally, it would be interesting to get
statistics about how scene geometry influences the effectiveness of shift mappings. Getting insights into
where the problem areas lie is helpful in the construction of new shift mappings.
8.1 Equal Time Comparisons
We want to compare the effectiveness of gradient sampling in various scenarios. For this we measure
the relative mean squared error (relMSE) of the generated images after letting them render for fixed
amount of time. The relMSE of a sampled image I is defined as
rel
= ||I
ref
I||
2
/||I
ref
||
2
where
I
ref
is a converged reference image, usually obtained via path tracing a large number of samples. For
each technique we render two images, one using the classical version of the techniques and one using
gradient sampling. We measure the relative improvement by dividing their relMSEs. We also compute
the predicted improvement by computing the factor given by Equation (20) for the reference image.
Figure 18 shows the result of this investigation. We use three scenes with different light transport
characteristics. The scene a) has little power in the gradients but highly specular materials. We predict
an improvement do to gradient sampling by 28x over classical sampling, but neither G-PT nor G-BDPT
achieve factors close to this. Scene b) is the classic Sponza” scene. It contains only diffuse materials
and is illuminated by a large light, ideal conditions for both path tracing as well as the shift mapping
employed for G-PT. In fact, we see that G-PT almost achieves the ideal improvement over PT. On the
other hand, G-BDPT only reduces relMSE by 8 times, half of the theoretically predicted value. At last,
the scene c) presents a difficult light transport scenario for PT. This is the only scene were G-BDPT
outperforms PT in terms of relative error reduction.
8.2 Analysing Shift Mappings
To evaluate the performance of shift mappings we compute the percentage of failed paths per pixel. For
this we modify the implementation by [Ket+15] and add an additional output. In Figure 19 we display
the final image on the left and a heat map of failed samples on the right. Generally, we can see that shift
failures concentrate on edges, though the specularity of the interacting materials also seems to have an
influence.
To further investigate this we study a scene with materials of varying reflectivity and otherwise fairly
simple light transport in Figure 20. Of interest here is that the shift does not fail frequently when hitting
a)
PT BDPT MLT
|F |
2
0.3526 Classical 0.0090 0.0123 -
|G|
2
0.0122 Gradient 0.0012 0.0021 -
Predicted 28.86 Improvement 7.6683 5.8167 -
b)
PT BDPT MLT
|F |
2
0.2814 Classical 0.0027 0.0068 -
|G|
2
0.0184 Gradient 0.0002 0.0008 -
Predicted 15.27 Improvement 14.063 8.6862 -
c)
PT BDPT MLT
|F |
2
0.3003 Classical 0.0092 0.0086 -
|G|
2
0.0175 Gradient 0.0043 0.0016 -
Predicted 17.17 Improvement 2.1458 5.4328 -
Figure 18
Reference Image Failed Shift Mappings
Figure 19
perfectly specular materials. It does have an issue with materials that are glossy but classified as diffuse
(the second bar from the top).
Figure 21 serves as an example of a scene with difficult light transport. Again, we can see that most
shifts fail on edges of objects. The complex light transport does not seem to reduce the effectiveness of
the shift mapping much.
Concluding we want to note that, since many of the failing shifts lie on edges, edges will have higher
variance in the gradients, because we discard failing shifts and revert back to the naive gradient sampling
illustrated in Figure 4. This is unfortunate since these edges also tend to be the parts of the image where
the gradients are high. Hence the effectiveness of gradient sampling is reduced and we conjecture that a
shift mapping which can deal with shifts across object edges would improve results significantly. Even
just sampling a gradient multiple times in case a shift fails might reduce overall error effectively. This
would be a variant of adaptive sampling.
Reference Image Failed Shift Mappings
Figure 20
Reference Image Failed Shift Mappings
Figure 21
9 Acknowledgements
The scene files in this report have been mostly provided by Benedikt Bitterli ([Bit16]). staircase” is
made by the Blend Swap user Wig42”. three spheres” and glass egg” are modeled after scenes from
Eric Veach. The photograph in Figure 12 is in the public domain. The renders used in Figure 18 are
supplemental images provided as part of [Man+15]. The Matlab code used for analysis is ours and
available on request.
References
[Bha+08] Pravin Bhat et al. “Fourier Analysis of the 2D Screened Poisson Equation for Gradient Do-
main Problems”. In: Proceedings of the 10th European Conference on Computer Vision: Part
II. ECCV ’08. Marseille, France: Springer-Verlag, 2008, pp. 114–128. isbn: 978-3-540-88685-3.
doi: 10.1007/978-3-540- 88688- 4_9. url: http://dx.doi.org/10.1007/978- 3-540-
88688-4_9.
[Bit16] Benedikt Bitterli. Rendering resources. https://benedikt-bitterli.me/resources/. 2016.
[Bra00] R.N. Bracewell. The Fourier Transform and Its Applications. Electrical engineering series.
McGraw Hill, 2000. isbn: 9780073039381. url: https://books . google.de/ books ?id=
ZNQQAQAAIAAJ.
[JM12] Wenzel Jakob and Steve Marschner. “Manifold Exploration: A Markov Chain Monte Carlo
Technique for Rendering Scenes with Difficult Specular Transport”. In: ACM Trans. Graph.
31.4 (July 2012), 58:1–58:13. issn: 0730-0301. doi: 10.1145/2185520.2185554. url: http:
//doi.acm.org/10.1145/2185520.2185554.
[Jar+12] Wojciech Jarosz et al. “Theory, Analysis and Applications of 2D Global Illumination”. In:
ACM Transactions on Graphics (Presented at SIGGRAPH) 31.5 (Sept. 2012), 125:1–125:21.
doi: 10.1145/2231816.2231823.
[Kaj86] James T. Kajiya. “The Rendering Equation”. In: SIGGRAPH Comput. Graph. 20.4 (Aug.
1986), pp. 143–150. issn: 0097-8930. doi: 10.1145/15886.15902. url: http://doi.acm.
org/10.1145/15886.15902.
[Ket+15] Markus Kettunen et al.“Gradient-Domain Path Tracing”. In: ACM Trans. Graph. 34.4 (2015).
[Leh+13] Jaakko Lehtinen et al. “Gradient-domain Metropolis Light Transport”. In: ACM Trans.
Graph. 32.4 (July 2013), 95:1–95:12. issn: 0730-0301. doi: 10 . 1145 / 2461912 . 2461943.
url: http://doi.acm.org/10.1145/2461912.2461943.
[Man+15] Marco Manzi et al. “Gradient-Domain Bidirectional Path Tracing”. In: Eurographics Sympo-
sium on Rendering - Experimental Ideas and Implementations. Ed. by Jaakko Lehtinen and
Derek Nowrouzezahrai. The Eurographics Association, 2015. doi: 10.2312/sre.20151168.
[Man+14] Marco Manzi et al. “Improved Sampling for Gradient-domain Metropolis Light Transport”.
In: ACM Trans. Graph. 33.6 (Nov. 2014), 178:1–178:12. issn: 0730-0301. doi: 10 . 1145 /
2661229.2661291. url: http://doi.acm.org/10.1145/2661229.2661291.
[Man+16] Marco Manzi et al. “Temporal Gradient-domain Path Tracing”. In: ACM Trans. Graph. 35.6
(Nov. 2016), 246:1–246:9. issn: 0730-0301. doi: 10 .1145/2980179.2980256. url: http:
//doi.acm.org/10.1145/2980179.2980256.
[MC12] S.L. Miller and D.G. Childers. Probability and Random Processes: With Applications to Signal
Processing and Communications. Academic Press, 2012. isbn: 9780123869814. url: https:
//books.google.de/books?id=1rTcR3nIkJcC.
[Rud94] Daniel L Ruderman. “The statistics of natural images”. In: Network: computation in neural
systems 5.4 (1994), pp. 517–548.
[Sey16] Dario Seyb. “Efficient Ray Tracing Techniques”. Pro Seminar Report Selected Topics in
Computer Graphics” at RWTH Aachen. 2016.
[Sey17] Dario Seyb. Zaphod Raytracer Code. 2017. url: http://github.com/daseyb/zaphod (visited
on 01/20/2017).
[SSJ16] Kartic Subr, Gurprit Singh, and Wojciech Jarosz. “Fourier Analysis of Numerical Integration
in Monte Carlo Rendering: Theory and Practice: Understanding Estimation Error in Monte
Carlo Image Synthesis”. In: ACM SIGGRAPH 2016 Courses. SIGGRAPH ’16. Anaheim, Cal-
ifornia: ACM, 2016, 10:1–10:40. isbn: 978-1-4503-4289-6. doi: 10.1145/2897826.2927356.
url: http://doi.acm.org/10.1145/2897826.2927356.
[Vea98] Eric Veach. “Robust Monte Carlo Methods for Light Transport Simulation”. AAI9837162.
PhD thesis. Stanford, CA, USA, 1998. isbn: 0-591-90780-1.