Life-long Map Learning for Graph-based Simultaneous Localization
Institut für Informatik
Autonome Intelligente Systeme
Prof. Dr. Wolfram Burgard
Life-long Map Learning
for Graph-based
Simultaneous Localization and Mapping
Diplomarbeit
Henrik Kretzschmar
Juli 2009
Betreuer:
Prof. Dr. Wolfram Burgard
Dr. Cyrill Stachniss
Dr. Giorgio Grisetti
Declaration
I hereby declare that I have written this thesis independently and without the use of any aids other than those stated, and that all passages taken verbatim or in substance from published or unpublished sources have been marked as such. I furthermore declare that this thesis has not been used, in whole or in part, for any other examination.
(Henrik Kretzschmar)
Freiburg, July 22, 2009
Summary
Maps of the environment are essential for numerous applications in mobile robotics; delivery services, automated inspection of industrial plants, and deployment in disaster areas are just a few examples. The ability to build maps autonomously is therefore a central component of any autonomous mobile robot. A state-of-the-art approach to simultaneous localization and mapping (SLAM) models the robot's belief about the state of the world as a graph. The nodes of the graph correspond to poses of the robot, while the edges represent spatial constraints between the nodes that arise from the robot's measurements. During map building, however, the graph grows steadily, entailing a computational complexity that rules out long-term operation.
This thesis presents an approach that addresses the scalability problem of graph-based SLAM methods. We describe an information-theoretic technique that yields a substantial speed-up of the mapping process, in particular for robots that operate in the same environment for extended periods of time. Furthermore, our approach produces grid maps of higher quality than those built with previous methods. The fewer nodes the graph modeling the environment contains, the faster the required computations can be carried out and the lower the memory consumption. In addition, a sparse graph structure is desirable for efficient graph optimization. Our algorithm removes redundant information from the graph and thus prevents it from growing while the robot travels through already visited areas of the environment. Exactly eliminating redundant nodes from the graph, however, introduces many new edges and thus leads to a dense graph structure, preventing the algorithm from operating efficiently. We describe a new approximate method for removing nodes from the graph that keeps the number of edges from growing while losing as little as possible of the information encoded in the edges.
The experimental evaluation based on real robot data shows that our approach yields a substantial speed-up of the mapping process and, compared to previous approaches, produces more accurate grid maps.
Abstract
Maps of the environment are needed for a wide range of robotic applications, such
as delivery, surveillance, and pursuit-evasion. The ability to generate a map of the
environment is therefore a prerequisite for any truly autonomous mobile robot. A state-of-the-art approach to simultaneous localization and mapping (SLAM) uses a graph to
model the belief of the robot. The nodes in the graph correspond to positions of the
robot, while the edges in the graph represent spatial constraints between the nodes which
arise from the measurements obtained by the robot. While mapping, this graph grows
constantly as the robot travels, resulting in a computational complexity and memory
requirements which impede long-term mapping applications.
This thesis formulates an approach for graph-based simultaneous localization and
mapping using grid maps that addresses the problem of scalability in long-term applications. We describe an information-theoretic algorithm that achieves significant gains in
computational efficiency for robots that frequently re-traverse previously visited areas. In addition, our method produces grid maps of higher quality than standard grid-based mapping approaches. The fewer nodes in the graph used to model the belief of the robot, the faster the computations are. Furthermore, a sparse graph structure is
desirable for efficient graph optimization. Our algorithm removes redundant information
from the belief of the robot and thus prevents the graph from growing as long as the robot
is moving through already visited territory. Exact elimination of information, however,
introduces a large number of new constraints into the belief, destroying the sparsity of the
graph and thus the efficiency of the algorithm. We therefore describe an approximation
technique to remove nodes from the graph while maintaining its sparsity and minimizing
the involved loss of information, culminating in a highly efficient approach to graph
sparsification.
The experimental evaluation using real robot data shows that our method yields grid
maps which exhibit less blur compared to standard grid-based approaches while requiring
significantly less computational resources.
Acknowledgments
I am very grateful to the Autonomous Intelligent Systems group for providing a highly
stimulating research environment. In particular, I would like to express my thanks to my
supervisors Wolfram Burgard, Cyrill Stachniss, and Giorgio Grisetti for their excellent
support, interest in my research, and feedback on academic writing.
Additionally, I would like to thank Christian Plagemann, Kai O. Arras, Kai Wurm,
and Slawomir Grzonka.
Furthermore, I would like to thank the people who recorded and published the mobile
robot datasets used for the experimental evaluation of this work. I thank Dirk Hähnel for
providing the Intel Research Lab dataset and the FHW Museum dataset, and Mike Bosse
for providing the Killian Court Infinite Corridor dataset.
Many people external to the university have contributed with ideas, inspiration, and
ceaseless moral support over the last months while I was writing this thesis. I owe a
special debt of gratitude to Alex, Dominic, Jakob, Javier, Laura, Moritz, Silva, and Sonja.
Thank you.
Finally, I thank my family for supporting my educational endeavors.
Contents
1 Introduction
  1.1 Contribution of this Thesis
  1.2 Outline
  1.3 Notation

2 Related Work
  2.1 Gaussian Filter Approaches
  2.2 Particle Filter Approaches
  2.3 Graph-based Approaches
  2.4 Graph Pruning
  2.5 Chow-Liu Tree Approximation
  2.6 Summary

3 Background
  3.1 Robot Motion
  3.2 Robot Perception
  3.3 Grid Maps
  3.4 Mapping with Known Poses
    3.4.1 Occupancy Probability Mapping
    3.4.2 Reflection Probability Mapping
  3.5 Graphs

4 The Graph-based Formulation of the SLAM Problem
  4.1 Simultaneous Localization and Mapping
  4.2 Computing the Full SLAM Posterior
  4.3 Loop Closing
  4.4 The Pose Graph Metaphor
    4.4.1 Pose Graphs
    4.4.2 Pose/Feature Graphs
  4.5 SLAM as a Least Squares Problem
  4.6 Pose Graph Optimization

5 Our Approach to Life-long Map Learning
  5.1 Information-Theoretic Node Reduction
  5.2 Pose Graphs as Gaussian Markov Random Fields
  5.3 Marginalization Ceases Sparsity
  5.4 Approximate Marginalization
  5.5 Chow-Liu Tree Approximation

6 Experimental Evaluation
  6.1 Runtime and Memory Requirements
  6.2 Approximate Marginalization
  6.3 Improved Grid Map Quality
  6.4 Standard Datasets
    6.4.1 Intel Research Lab
    6.4.2 Killian Court
    6.4.3 FHW Museum

7 Discussion
  7.1 Conclusion
  7.2 Future Work

List of Figures

Bibliography
1 Introduction
Mobile robots can be reliable helpers in scenarios too dangerous for humans since they
are disposable and unaffected by fumes, smoke, fear, or fatigue. For instance, search-and-rescue robots scour the wreckage of collapsed buildings for disaster survivors. Deep-sea diving robots are successfully engaged in the search for the flight data recorders of aircraft that crashed offshore. In addition to these comparatively extreme applications, service
robots are expected to increasingly pervade our daily lives. Applications range from
robotic caregivers which deliver pharmaceuticals within a hospital to service robots which
vacuum clean entire buildings. So far, many of these robots follow trivial and highly
suboptimal navigation policies or are even remotely controlled by human operators due
to the lack of reliable means of autonomous navigation. However, there are many reasons
to operate a mobile robot without human guidance. Applications such as surveillance, transportation, or planetary exploration may require the robot to persistently operate
for an extended period of time, making it difficult and costly to have a human constantly
in the loop. For example, the goal of the ongoing Mars Exploration Rover mission is to
explore the surface of Mars using two vehicles. Due to one-way communication delays of
several minutes, remotely controlling the two vehicles from Earth is highly error-prone
and cumbersome. Hence, as robotics advances, researchers strive to improve autonomous
capabilities of mobile robots.
To operate efficiently, autonomous mobile robots depend upon maps of the environment
to accomplish their tasks. However, for many applications it is not feasible to provide the
robots with accurate handcrafted models of their surroundings in advance. To allow for
flexible operation, mobile robots themselves must be able to infer and maintain a model of
their environment from their own sensor measurements. However, the process of turning
local perception into a globally consistent map is a challenging problem which requires
the robot to build a map of the environment, and, at the very same time, estimate its
location within that map. This problem is known as the simultaneous localization and
mapping problem.
The pivotal importance of maps to mobile robotics led to the development of a large
diversity of approaches to the simultaneous localization and mapping problem. A state-of-the-art approach uses a graph to model the belief of the robot. The nodes in the
graph correspond to positions of the robot, whereas the edges in the graph represent
spatial constraints between the nodes which arise from the measurements obtained by the
robot. This graph-based formulation exhibits several key advantages over other available
methods. In particular, as opposed to many other techniques, such as the extended
Kalman filter, graph-based approaches do not suffer from accumulating linearization
errors, which can have serious detrimental effects on the quality of the map. Graph-based
approaches are becoming increasingly widely used since they are less sensitive to
linearization errors and offer a high degree of flexibility. A disadvantage of graph-based
techniques, however, is the fact that the graph grows constantly as the robot travels. This
results in a computational complexity and memory requirements which impede long-term
mapping applications. This thesis presents an approach to overcome this problem and is
therefore an important step towards life-long simultaneous localization and mapping for
graph-based approaches.
1.1 Contribution of this Thesis
In this thesis, we introduce a novel approach for mobile robots using the graph-based
formulation of the simultaneous localization and mapping (SLAM) problem to allow for
life-long map learning using grid maps. We describe an information-theoretic algorithm
which seeks to exhibit a computational complexity and memory requirements that scale
with the size of the mapped area, and not with the length of the robot’s trajectory,
as traditional graph-based methods do. To this end, the key idea of our approach is to
remove redundant information from the graph such that its size stays constant as long
as the robot moves through previously visited places. Our method explicitly considers
the information gain of observations and removes those nodes from the graph which do
not contain new information with respect to the model of the environment constructed
so far. A major contribution of this thesis is an approximate method that is based on
Chow-Liu trees which enables us to remove nodes from the graph while preserving its
sparsity, which is important for efficient optimization, while at the same time minimizing
the loss of information encoded in the constraints. As a result, our approach is highly
efficient and scales with the size of the mapped environment.
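As a preview of the technique, a Chow-Liu tree of a set of jointly Gaussian variables is the maximum spanning tree of the graph whose edge (i, j) carries the pairwise mutual information I(i, j) = -1/2 log(1 - rho_ij^2), where rho_ij is the correlation coefficient. The sketch below illustrates this generic construction; it is written for this text (the covariance matrix and all values are made-up examples), not the thesis implementation:

```python
import math

def chow_liu_tree(cov):
    """Return the edges of the Chow-Liu tree for jointly Gaussian
    variables with covariance matrix `cov` (list of lists).

    The tree is the maximum spanning tree of the complete graph whose
    edge (i, j) is weighted by the Gaussian mutual information
    I(i, j) = -0.5 * log(1 - rho_ij^2)."""
    n = len(cov)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            rho2 = cov[i][j] ** 2 / (cov[i][i] * cov[j][j])
            weight = -0.5 * math.log(1.0 - rho2)
            edges.append((weight, i, j))
    # Kruskal's algorithm on edges sorted by decreasing mutual information.
    edges.sort(reverse=True)
    parent = list(range(n))
    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a
    tree = []
    for weight, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.append((i, j))
    return tree

# Toy example: three variables where 0-1 and 1-2 are strongly correlated.
cov = [[1.0, 0.9, 0.5],
       [0.9, 1.0, 0.8],
       [0.5, 0.8, 1.0]]
print(chow_liu_tree(cov))  # keeps the two strongest dependencies
```

The tree keeps the most informative pairwise dependencies and drops the weakest edge, which is the kind of sparsification effect exploited by the approach.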
Note that a short version of this diploma thesis has been peer reviewed and accepted for
publication as a journal article in “Zeitschrift KI – Künstliche Intelligenz”,
special issue on simultaneous localization and mapping (Kretzschmar et al., 2009).
1.2 Outline
The remainder of this thesis is structured as follows. In Chapter 2, we review the current
state of research in the field of simultaneous localization and mapping. We discuss the most
influential approaches and point out their advantages and disadvantages. We furthermore situate the contribution of this thesis in the light of recent research. Chapter 3 aims
to convey some background material necessary to understand the techniques presented
in the subsequent chapters. Chapter 4 is devoted to a general introduction to the
family of graph-based approaches to simultaneous localization and mapping. Chapter 5
presents our novel approach to life-long map learning. In Chapter 6, we illustrate the
efficacy of the presented approach. To this end, we carried out experiments at the
University of Freiburg using a real robot. Additionally, we applied our method to several
real-world datasets which are widely-used as benchmarks in the research community.
Finally, Chapter 7 concludes this thesis and discusses future work.
1.3 Notation
Throughout this thesis, we adhere to the following notation. Time is considered to be discrete, and the time index is labeled t. The pose of a mobile robot operating in a planar environment comprises its two-dimensional location (x, y)^T along with its angular orientation θ, relative to an external coordinate frame. The pose of the robot at time t is denoted x_t. The map of the environment is denoted m, and the i-th feature in the map is denoted m_i. In the case of grid maps, c refers to the grid cells. To perceive its environment, the robot is equipped with sensors. The observation obtained at time t is denoted z_t. Observations often comprise several individual measurements, e.g., laser beams; we use z_t^k to refer to the individual measurements obtained at time t. The robot changes its pose with respect to the map by issuing control commands. The control command u_t thereby provides information about the pose x_t relative to the pose x_{t-1}. To deal with the data association problem, we use d_t to represent the identities of the features perceived in observation z_t. The sets of robot poses, observations, control commands, and data associations obtained between time 1 and time t are denoted x_{1:t}, z_{1:t}, u_{1:t}, and d_{1:t}, respectively. Table 1.1 summarizes the notation used in this thesis.
variable   description
---------  ------------------------------------------------------------------
x_t        pose of the robot at time step t
x_{1:t}    sequence of poses of the robot from time step 1 to time step t
m          map of the environment
c          grid cell
z_t        sensor observation obtained at time step t
z_t^k      individual measurement obtained at time step t
u_t        control command providing information about pose x_t relative to pose x_{t-1}
d_t        identities of the features perceived in observation z_t

Table 1.1: Notation.
2 Related Work
The tremendous importance of the simultaneous localization and mapping (SLAM)
problem to the field of mobile robotics has given rise to a myriad of different approaches to
SLAM. Most techniques account for the inherent uncertainty present in the measurements
by formulating the problem in a probabilistic manner. More precisely, they aim to compute
the joint posterior probability distribution over the poses of the robot and the map of
the environment. Common approaches apply Bayes filter implementations, such as the
extended Kalman filter, the extended information filter, or the particle filter, to estimate
the posterior or perform smoothing. In contrast to that, most graph-based approaches
to SLAM compute the most likely map based on the full path of the robot and all
measurements.
In this chapter, we will present the main ideas of the most popular SLAM approaches
and discuss their advantages and disadvantages. We will thereby motivate the graph-based formulation of the SLAM problem and, in particular, the contribution of this thesis
in the context of related work.
2.1 Gaussian Filter Approaches
Gaussian filters model the posterior distribution as a multivariate Gaussian probability
density function. Using Gaussians to model the posterior is convenient because computations can be carried out in closed form. Applying a linear transformation to a Gaussian
random variable yields another Gaussian random variable. Gaussians are unimodal, which
means that they possess a single peak. Consequently, they cannot model the probability
distribution of global estimation problems in which multiple distinct hypotheses exist.
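The closure of Gaussians under linear transformations can be checked numerically. The following sketch (all values are arbitrary examples chosen for illustration) draws samples from x ~ N(µ, Σ), applies y = Ax + b, and compares the sample statistics with the analytic result y ~ N(Aµ + b, AΣA^T):

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary example Gaussian x ~ N(mu, Sigma) and linear map y = A x + b.
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
A = np.array([[1.0, 2.0],
              [0.0, 3.0]])
b = np.array([0.5, -1.0])

# Analytic result: y ~ N(A mu + b, A Sigma A^T).
mu_y = A @ mu + b
Sigma_y = A @ Sigma @ A.T

# Monte Carlo check against samples pushed through the linear map.
x = rng.multivariate_normal(mu, Sigma, size=200_000)
y = x @ A.T + b
print(np.allclose(y.mean(axis=0), mu_y, atol=0.05))   # True
print(np.allclose(np.cov(y.T), Sigma_y, atol=0.1))    # True
```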
The earliest approaches to SLAM (Cheeseman and Smith, 1986; Leonard et al., 1992;
Smith et al., 1990) are based on the extended Kalman filter (Kalman, 1960). An extended
Kalman filter (EKF) estimates the state of a discrete-time controlled dynamic process
from a series of noisy measurements. EKF-based SLAM approaches model both the
posterior of the robot’s pose and the positions of all landmarks in the map by a single
multivariate Gaussian represented by its mean µ and its covariance matrix Σ. The
approaches apply an extended Kalman filter to recursively estimate this posterior. Being
recursive filters, they incorporate the measurements into the posterior without having to
store them afterwards.
Most EKF-based approaches to SLAM rely on a greedy maximum likelihood estimator to determine the correspondences, which means that they never amend past data associations.

                      incorporate measurement   recover posterior mean/covariance
  Kalman filter       O(N^2)                    O(1)
  information filter  O(1)                      O(N^2)

Table 2.1: Computational complexity of the Kalman filter and the information filter.

Choosing the wrong association, however, can make a filter diverge. This
is why landmarks are required to be uniquely identifiable. In other words, these SLAM
approaches depend on a flawless feature detector. Neira and Tardós (2001) presented
techniques that deal with unknown data associations.
If the motion model and the sensor model of the robot were Gaussian and linear,
the solution computed by EKF-based SLAM approaches would be optimal. The real
world, however, is neither exactly Gaussian nor linear. For example, the motion of a
robot that travels on a circular trajectory cannot be described by linear state transitions.
The EKF therefore applies a first-order Taylor expansion to linearize both the motion model
and the sensor model at the current state estimate. Linearization approximates a
nonlinear function by a linear function. Consequently, the EKF irrevocably induces
linearization errors in every update. These errors accumulate over time, especially when
the linearization point, i. e., the current state estimate, is poor (Bailey et al., 2006; Frese
and Hirzinger, 2001; Julier and Uhlmann, 2001). If the nonlinearities are too large, the
approach tends to fail (Julier et al., 1995; Uhlmann, 1995). The unscented Kalman filter (UKF), proposed by Julier et al. (1995), achieves a more accurate approximation by propagating a set of deterministically chosen sigma points through the nonlinear functions, but does not provide a general solution to the problem of nonlinearities.
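To make the linearization step concrete, the following sketch (an illustration with made-up noise values, not code from the thesis) computes the Jacobian of a simple unicycle motion model at the current state estimate and uses it to propagate the pose covariance, Σ' ≈ FΣF^T + R:

```python
import numpy as np

def motion_model(x, u, dt=1.0):
    """Unicycle motion model; u = (v, w) is translational/rotational velocity."""
    px, py, theta = x
    v, w = u
    return np.array([px + v * dt * np.cos(theta),
                     py + v * dt * np.sin(theta),
                     theta + w * dt])

def motion_jacobian(x, u, dt=1.0):
    """First-order Taylor expansion: Jacobian of the motion model w.r.t. the
    state, evaluated at the current state estimate (the linearization point)."""
    _, _, theta = x
    v, _ = u
    return np.array([[1.0, 0.0, -v * dt * np.sin(theta)],
                     [0.0, 1.0,  v * dt * np.cos(theta)],
                     [0.0, 0.0,  1.0]])

# EKF prediction step for the pose covariance (example values).
x = np.array([0.0, 0.0, np.pi / 4])     # current state estimate
u = (1.0, 0.1)                          # control command (v, w)
Sigma = np.diag([0.01, 0.01, 0.005])    # current pose covariance
R = np.diag([0.02, 0.02, 0.01])        # motion noise

F = motion_jacobian(x, u)
x_pred = motion_model(x, u)
Sigma_pred = F @ Sigma @ F.T + R        # linearized covariance propagation
print(Sigma_pred)
```

Because F is evaluated at the current estimate, an inaccurate estimate yields an inaccurate Jacobian, which is exactly how the linearization errors discussed above enter the filter.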
While the Kalman prediction of the robot motion is efficient, both the Kalman update
of the measurement and the size of the belief state, i. e., the covariance matrix, are
quadratic in the number of landmarks in the map. This effectively limits the application
of EKF-based SLAM approaches to small environments (Eustice et al., 2005b).
The dual form of the posterior distribution expressed by mean and covariance is called
canonical form or information form. It describes the Gaussian using the information
vector ξ and the information matrix Ω, where ξ = Σ^{-1}µ and Ω = Σ^{-1}. The counterpart of the EKF in information form is the extended information filter (EIF), introduced by Maybeck (1979). In contrast to the EKF, its update step is efficient, but its prediction step is quadratic in the number of landmarks. Furthermore, since µ = Ω^{-1}ξ, recovering the state estimate µ requires inverting Ω, which takes time cubic in the size of the map.
Table 2.1 compares the computational complexity of the Kalman filter to the one of the
information filter.
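The relationship between the two parameterizations can be summarized in a few lines (an illustrative sketch with made-up numbers): converting to the information form is a matrix inversion, and recovering the mean from (ξ, Ω) again requires solving a system in Ω, which is the cubic-time operation mentioned above.

```python
import numpy as np

# Moment parameterization of a Gaussian (example values).
mu = np.array([2.0, -1.0])
Sigma = np.array([[4.0, 1.0],
                  [1.0, 3.0]])

# Canonical (information) form: Omega = Sigma^-1, xi = Sigma^-1 mu.
Omega = np.linalg.inv(Sigma)
xi = Omega @ mu

# Recovering the state estimate means solving Omega mu = xi,
# which takes O(n^3) time in the dimension of the state.
mu_recovered = np.linalg.solve(Omega, xi)
print(np.allclose(mu_recovered, mu))  # True
```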
Whereas the covariance matrix encodes all indirect relations between poses and/or
landmarks explicitly, the information matrix only represents direct relations, i. e., relations
which are involved in a common measurement. Sensors like laser range finders and sonar
sensors have strictly limited ranges of measurement. Features that are located farther
apart from each other than this range cannot be simultaneously observed. Marginalizing
out past robot poses, however, causes the information matrix in feature-based SLAM to
become dense (McLauchlan and Murray, 1995). The new constraints which are added
to the information matrix due to marginalization are often referred to as fill-in. Thrun
et al. (2002) empirically observed, and Frese (2005) later proved that the information
matrix in feature-based SLAM is approximately sparse. More precisely, Frese showed
that off-diagonal entries corresponding to two landmarks decay exponentially with the
distance the robot traveled between observing the two landmarks one after the other.
Consequently, many of the off-diagonal values of the information matrix are zero or close
to zero when properly normalized.
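The fill-in caused by marginalization can be seen directly in a small example (the numbers below are made up for illustration). Marginalizing out a variable from a Gaussian in information form corresponds to taking the Schur complement, which directly connects all variables that were previously linked only through the eliminated one:

```python
import numpy as np

# Information matrix of a chain x0 - x1 - x2: x0 and x2 are only
# connected through x1, so the (0, 2) entry is zero (sparse).
Omega = np.array([[ 2.0, -1.0,  0.0],
                  [-1.0,  3.0, -1.0],
                  [ 0.0, -1.0,  2.0]])

# Marginalize out x1 (index 1) via the Schur complement:
# Omega' = Omega_aa - Omega_ab Omega_bb^-1 Omega_ba
keep = [0, 2]
drop = [1]
O_aa = Omega[np.ix_(keep, keep)]
O_ab = Omega[np.ix_(keep, drop)]
O_bb = Omega[np.ix_(drop, drop)]
Omega_marg = O_aa - O_ab @ np.linalg.inv(O_bb) @ O_ab.T

print(Omega_marg)
# The formerly zero (x0, x2) entry is now nonzero: marginalization
# introduced a direct constraint (fill-in) between x0 and x2.
```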
The insight that the information matrix in feature-based SLAM is approximately sparse
has spawned the development of scalable approaches. The key idea at the bottom of most
efficient algorithms is to prune weak constraints, i. e., approximate the dense information
matrix by a sparse matrix. The sparse extended information filter (SEIF), proposed by
Thrun et al. (2004), selectively prunes small entries in the information matrix setting
them to zero. Given the sparsified information matrix, it can then carry out the prediction
step and the update step in nearly constant time, irrespective of the number of landmarks
in the map. One year later, Walter et al. (2005) pointed out that the error estimates
computed by the sparse extended information filter are overconfident. In the same
paper, they presented the so-called exactly sparse extended information filter (ESEIF).
It maintains sparsity while preserving consistency by periodically relocalizing the robot.
Similarly, the thin junction tree filter (TJTF), presented by Paskin (2003), represents
the belief state using a junction tree. It prunes the junction tree such that the resulting
approximation error is minimized.
The filter techniques mentioned above marginalize out all poses except for the current
one. In contrast, the approach presented in this thesis marginalizes out only those poses
whose observations do not reduce the uncertainty in the belief of the robot; all other poses
remain in the state. It furthermore approximates the elimination cliques, which result
from the marginalization of redundant poses, by their optimal tree-shaped approximations.
Our approach thereby removes edges from the graph, which is equivalent to setting
the corresponding entries of the information matrix to zero. Hence, our approximation
approach is a technique to sparsify the information matrix.
Eustice et al. (2005a) showed that the information matrix is exactly sparse in a delayed-state framework. As mentioned above, marginalizing out poses of the robot makes the
information matrix become dense. A delayed-state filter maintains the exact sparsity
of the information matrix by not marginalizing out robot poses. The sparsity of the
information matrix in a delayed-state framework directly stems from the first-order
Markov assumption in the state transition model.
The approach presented in this thesis can also be regarded as a delayed-state framework,
albeit with reservations. As long as the robot does not remove any nodes from the graph,
the representation contains all poses and is thus naturally sparse. However, as soon as
the robot removes a redundant node, the marginalization causes fill-in. As a result, the
information matrix becomes dense. Our approach therefore approximates the resulting fill-in to recover a sparse representation. Furthermore, as opposed to the filtering techniques
mentioned above, our approach computes the state by means of global optimization in
order to avoid accumulating linearization errors.
Strasdat et al. (2009) recently presented a feature-based approach to SLAM which
aims to reduce the computational burden by only selectively incorporating landmarks.
Their approach uses an unscented Kalman filter to estimate the posterior and only adds
those landmarks to the state which actually help the robot navigate. The key idea
of our approach is related to that. However, we aim to discard observations without
having to rely on predefined features. Consequently, a feature-based representation, as
used in the work presented by Strasdat et al., is not an option. Instead, we use grid
maps to model the environment. Hence, we are not dealing with feature observations,
but with raw laser scans. We discard the observations which do not contribute to the
belief of the robot in terms of uncertainty reduction in the grid map. In contrast to
Strasdat et al., we even consider former observations for removal. Furthermore, whereas
the approach presented by Strasdat et al. is based on the unscented Kalman filter,
which linearizes the measurements at the current state estimate, our approach uses a
graph-based representation and applies nonlinear optimization techniques to compute
the most likely map.
2.2 Particle Filter Approaches
Particle filters approximate a posterior by a set of hypotheses, represented by a set of
weighted random samples called particles. Each particle represents a potential state of
the system. One of the key advantages of particle filters over approaches which model
the posterior as a Gaussian, such as methods based on the extended Kalman filter or
the extended information filter, is their ability to represent multi-modal probability
distributions. They are thus capable of solving global estimation problems with multiple
modes. Doucet et al. (1998) provide an overview and a discussion of the properties of
particle filters in general.
In the context of mobile robot localization and mobile robot map learning, the motion
model of the robot is typically used as the proposal distribution to draw the next
generation of particles. The importance weight of the particles is, for the most part,
computed based on the observation likelihood of the most recent sensor observation
according to the sensor model of the robot. Monte Carlo localization (MCL), introduced
by Dellaert et al. (1999), first applied particle filters in the context of mobile robot
localization.
Murphy (1999) and Doucet et al. (2000) were the first to use Rao-Blackwellized particle
filters (RBPF) for map learning. The idea of the RBPF is to separate the estimate of
the trajectory of the robot from the map of the environment by an exact factorization.
Each particle represents a possible robot trajectory and a map.
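The factorization underlying the Rao-Blackwellized particle filter can be written explicitly: the joint posterior over the trajectory x_{1:t} and the map m is split into a trajectory posterior, estimated by the particle filter, and a map posterior, which can be computed analytically once the trajectory is given ("mapping with known poses"):

```latex
p(x_{1:t}, m \mid z_{1:t}, u_{1:t})
  = \underbrace{p(x_{1:t} \mid z_{1:t}, u_{1:t})}_{\text{particle filter}}
    \cdot
    \underbrace{p(m \mid x_{1:t}, z_{1:t})}_{\text{mapping with known poses}}
```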
The FastSLAM algorithm (Montemerlo et al., 2002, 2003) applies the idea of the
Rao-Blackwellized particle filter to landmark-based SLAM tracking each landmark by
an extended Kalman filter. Interestingly, the FastSLAM algorithm deals with the data
association problem by also sampling over potential correspondences (Montemerlo and
Thrun, 2003). Every particle maintains its own associations and the filter can undo
incorrect associations by removing the corresponding inconsistent particles. Hähnel et al.
(2003) presented a FastSLAM approach for occupancy grid mapping by combining a Rao-Blackwellized particle filter and scan matching. Grisetti et al. (2007b) presented a highly
efficient approach that is capable of reusing already computed proposal distributions.
Furthermore, it enables the individual particles to share parts of the model of the
environment. Stachniss et al. (2007) presented a novel sampling technique which is
able to deal with multi-modal distributions while maintaining the same efficiency as the
Gaussian proposal.
DP-SLAM, presented by Eliazar and Parr (2003), uses a particle filter to represent
both the robot poses and possible map configurations. The approach is capable of dealing
with a large number of particles thanks to its map representation which allows for highly
efficient updates. Instead of maintaining a map for each particle, it maintains a single
occupancy grid map. It stores in each cell a tree containing links to the particles that
have made changes to that particular cell.
The solution reported by particle filters converges to the true Bayesian posterior as
the number of particles goes to infinity (Verma et al., 2003). In practice, however,
maintaining an infinite number of samples is computationally not feasible. It is desirable
to approximate the posterior as well as possible given the limited number of particles
that can be maintained. Particle filters therefore remove particles with a low importance
weight and replace them by particles with a high importance weight in the so-called
resampling step. In doing so, they have to accept the risk of irrevocably eliminating a
particle even if it is very close to the true state. The effect that discarded hypotheses are
lost and cannot be recovered is known as the particle depletion problem (Doucet, 1998;
Doucet et al., 2001; van der Merwe et al., 2000). There are several approaches to reduce
its detrimental effects (Grisetti et al., 2005, 2006; Stachniss et al., 2005, 2007). In short,
they seek to maintain a reasonable variety of particles by only selectively triggering the
resampling step. Furthermore, they compute more accurate proposal distributions by
taking into account both the movement of the robot and the most recent observation.
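To make the selective resampling idea concrete, the following is a minimal sketch in Python. The threshold ratio of 0.5 and the low-variance resampling scheme are common illustrative choices, not the exact parameters of the cited implementations.

```python
import numpy as np

def effective_sample_size(weights):
    """N_eff = 1 / sum(w_i^2) for normalized importance weights."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return 1.0 / np.sum(w ** 2)

def maybe_resample(particles, weights, threshold_ratio=0.5, rng=None):
    """Resample only when N_eff drops below a fraction of the particle count,
    which preserves particle diversity between resampling steps."""
    rng = np.random.default_rng(rng)
    n = len(particles)
    if effective_sample_size(weights) >= threshold_ratio * n:
        return particles, weights  # keep the current particle set
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    # low-variance (systematic) resampling
    positions = (rng.random() + np.arange(n)) / n
    indices = np.searchsorted(np.cumsum(w), positions)
    resampled = [particles[i] for i in indices]
    uniform = [1.0 / n] * n
    return resampled, uniform
```

With uniform weights, N_eff equals the particle count and no resampling is triggered; only a strongly peaked weight distribution causes a resampling step.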
2 Related Work
2.3 Graph-based Approaches
A key disadvantage of all filter techniques is that they process the measurements and
then discard them (Frese and Hirzinger, 2001). This gives rise to yet another approach
to the SLAM problem. Lu and Milios (1997) were the first to globally optimize the poses
of the robot based on a set of constraints between them. The set of constraints naturally
emerges from observations and motion commands. Such a network of constraints is
usually modeled as a graph, thus the term graph-based SLAM. The nodes in the graph
represent poses of the robot and features, and the edges encode constraints between them.
Graph-based approaches to SLAM aim to find the map that is most likely given a set
of observations. The framework can be nicely illustrated by means of the spring-mass
model (Golfarelli et al., 1998).
Whereas the filter techniques mentioned above marginalize out past robot poses, graph-based SLAM approaches preserve them. Graph-based methods solve the so-called full
SLAM problem, estimating all poses of the robot as well as the map.
Methods based on the extended Kalman filter and the extended information filter
perform linearization of each nonlinear term, such as the motion and the measurement
model, only once. The point of linearization is chosen to be the state estimate at the time
of linearization. Suboptimal points of linearization, resulting from poor state estimates,
cannot be revised. Filtering techniques therefore suffer from accumulating linearization
errors (Bailey et al., 2006; Julier and Uhlmann, 2001). In contrast to that, graph-based
SLAM approaches avoid inducing these accumulating linearization errors by continuously
relinearizing around the current state estimate (Olson et al., 2006). That is to say,
these approaches consider the SLAM problem to be a nonlinear optimization problem.
The constraints in the graph are typically nonlinear because they include the rotational
component of the pose of the robot, i. e., the orientation of the robot. The resulting
optimization problem is therefore nonlinear and difficult to solve.
In graph-based SLAM, the aforesaid information matrix has a natural interpretation.
It is the adjacency matrix of the graph, where non-zero off-diagonal elements are edges,
and zero-valued off-diagonal elements indicate that there is no edge connecting the
corresponding nodes (Olson et al., 2006; Thrun and Montemerlo, 2006).
Graph-like constraints were first explicitly formulated by Cheeseman and Smith (1986)
and Durrant-Whyte (1987). Lu and Milios (1997) were the first to refine a map by
globally optimizing such a system of equations. Since then, a large variety of approaches
to minimizing the error in constraint networks have been proposed. Duckett et al. (2002)
use Gauss-Seidel relaxation. Frese’s multi-level relaxation technique (Frese et al., 2005)
applies relaxation at different resolutions. Given a good initial guess, it yields accurate
maps particularly in flat environments. Folkesson and Christensen (2004a) define an
energy function for each node and minimize it. Thrun and Montemerlo (2006) apply
variable elimination techniques to reduce the dimensionality of the optimization problem.
A promising approach to SLAM is incremental smoothing and mapping (iSAM), proposed
by Kaess et al. (2007). It ensures efficiency by updating a QR factorization of the sparse
information matrix, even in case of many re-traversals. It therefore only recomputes
those matrix entries that actually change. Furthermore, iSAM allows for efficient, and
exact computation of the marginal covariances for fast online data association. Olson
et al. (2006) presented a fast and accurate optimization approach which is based on the
stochastic gradient descent (SGD) (Robbins and Monro, 1951). In contrast to multi-level
relaxation, proposed by Frese et al., it can better recover from poor initial guesses.
Based on Olson’s optimization algorithm, Grisetti et al. (2007a) proposed a different
parameterization of the nodes in the graph. The tree-based parameterization yields
a significant boost in performance. In addition to that, the approach can deal with
arbitrary graph topologies. This property is crucial for the approach presented in this
thesis.
2.4 Graph Pruning
Most graph-based approaches available today do not provide means to efficiently prune
the pose graph that has to be corrected by the underlying optimization framework. Most
approaches add a new node to the pose graph for every single observation. There are
some notable exceptions, though.
A simple method to reduce the number of nodes in the pose graph is to sample the
trajectory of the robot at an appropriate spatial decimation (Eustice et al., 2006). A
similar approach is to only add a new node to the graph if it is not spatially close to any
existing node (Konolige and Agrawal, 2008). These sparsification techniques, however,
ignore the observations themselves and therefore run the risk of discarding nodes which
capture important aspects of the environment. Suppose a robot is close to an obstacle
and records two observations from almost the same pose such that the first observation
captures the area left to the obstacle, and the second observation covers the area right to
the obstacle. Since both observations are recorded from two poses which are spatially very
close to each other, the approaches mentioned above may not add the second observation.
Hence, they may discard precious information about the environment which may not
be covered by subsequent observations. In contrast to that, our approach explicitly
computes the information gain of observations. It is thus able to detect whether an
observation contributes to the belief of the robot or not, even in case the observations
are recorded from almost the same pose. In the example above, our approach would add
both observations to the graph and would thus not neglect the opportunity to capture
the area right to the obstacle.
Folkesson and Christensen (2004b) and Folkesson et al. (2005) combine nodes into
so-called star nodes, yielding rigid local submaps. In contrast to filter based approaches,
which perform linearization at the time of observation, their methods apply delayed
linearization of local subsets of the graph permanently gluing together a set of nodes in a
relative frame. Konolige (2004) proposed to reduce the graph such that it only contains
those nodes which have a loop constraint attached, computing the other nodes separately.
This can lead to a substantial speed-up in case the connectivity of the graph is low.
These methods, however, do not actually discard nodes and therefore do not prevent the
graph from growing when the robot travels through already mapped terrain. They are
thus not suited for life-long map learning. In contrast to that, the approach presented
in this thesis actually discards those observations that do not capture new aspects of
the environment freeing allocated resources. Consequently, once an environment is well
explored, the resources needed by our approach do not further increase when the robot
comes back to re-traverse the same region.
2.5 Chow-Liu Tree Approximation
In our approach, we use Chow-Liu trees to obtain optimal tree-shaped approximations of
the probability distributions which arise from the elimination cliques due to marginalization. Other researchers within the field of robotics recently used Chow-Liu trees to
approximate multi-dimensional probability distributions. For instance, Cummins and
Newman (2008) extended the well-known bag-of-words approach (Sivic and Zisserman,
2003) by learning a model for the sensory data in the form of a Chow-Liu tree. Furthermore, Chli and Davison (2009) used a Chow-Liu tree approximation to divide large
visual maps into a fully hierarchical correlation and clustering structure. However, to
the best of our knowledge, Chow-Liu trees have not been used previously to sparsify the
elimination cliques in graph-based approaches to SLAM.
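The Chow-Liu construction itself is simple: build the maximum spanning tree of the complete graph whose edge weights are the pairwise mutual information values. A minimal sketch using Prim's algorithm, assuming the mutual information values have already been estimated (the values below are purely illustrative):

```python
def chow_liu_tree(mutual_info):
    """Return the edges of a Chow-Liu tree: the maximum spanning tree of the
    complete graph whose edge weights are pairwise mutual information values.
    `mutual_info` maps frozenset({i, j}) -> I(X_i; X_j)."""
    nodes = set()
    for pair in mutual_info:
        nodes |= pair
    nodes = sorted(nodes)
    in_tree = {nodes[0]}
    edges = []
    while len(in_tree) < len(nodes):
        # greedily add the highest-information edge leaving the current tree
        best = max(
            ((u, v) for u in in_tree for v in nodes if v not in in_tree),
            key=lambda e: mutual_info[frozenset(e)],
        )
        edges.append(best)
        in_tree.add(best[1])
    return edges
```

For three variables with I(X0;X1) = 0.9, I(X1;X2) = 0.5, and I(X0;X2) = 0.1, the tree keeps the two strongest dependencies and drops the weakest edge.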
2.6 Summary
Compared to all previous work, this thesis presents a novel approach that makes
graph-based SLAM applicable in the context of life-long map learning by explicitly
considering the information gain of observations. It removes nodes from the
pose graph if they do not contain new information with respect to the model of the
environment constructed so far. It is highly efficient because it removes the nodes from
the graph such that the corresponding information matrix remains sparse. It does so by
locally approximating the belief state with optimal, tree-shaped probability distributions.
As a result, the method presented in this thesis is an important step to allow for life-long
map learning in mobile robotics.
3 Background
This chapter briefly introduces some background material which is needed to understand
the techniques yet to come. Section 3.1 and Section 3.2 deal with the motion and the
perception of the robot, respectively. In Section 3.3, we introduce the grid map as a
flexible representation of an environment. Section 3.4 analyzes the mapping problem
under the assumption that the poses of the robot are exactly known. In Section 3.5, we
introduce the notion of graphs and the corresponding terminology used in this thesis.
The chapter is based on Thrun et al. (2005), Stachniss (2009), Hähnel (2005), and
Diestel (2005), which can be consulted for more in-depth treatment.
3.1 Robot Motion
The robot changes its state in the environment by issuing control commands. The motion
model thereby describes how the state of the robot evolves.
The state of the robot typically corresponds to its pose in the environment. In more
complex settings, however, the state can additionally comprise the dynamics of the robot
or other relevant information.
In probabilistic robotics, motion models explicitly account for the fact that the outcome
of a control command is uncertain, due to various types of noise. Motion models therefore
describe the transition probability of the robot’s state. They are given by the posterior
probability density distribution
p(xt | ut , xt−1 ),
(3.1)
where xt−1 is the state of the robot at time t − 1, ut is the issued control command, and
xt is the resulting state of the robot at time t.
Motion models typically assume that the state evolves according to a first-order
Markov process. Hence, the state xt is only conditionally dependent upon its immediately
preceding state xt−1 and the issued control command ut .
Figure 3.1 illustrates the motion of the robot in terms of a Bayesian network representation. A Bayesian network is a directed acyclic graph which encodes the dependence
properties of a joint density.
Actual implementations of Eq. (3.1) typically assume a Gaussian distribution (Cheeseman and Smith, 1986) or a “banana-shaped” distribution (Dellaert et al., 1999). As we
will discuss in the next chapter, the work presented in this thesis assumes a Gaussian
distribution.
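A sketch of sampling from such a motion model, following the odometry motion model described in Thrun et al. (2005); the noise parameters `alphas` are robot-specific and purely illustrative here:

```python
import math
import random

def sample_odometry_motion(x_prev, u, alphas, rng=random):
    """Sample x_t ~ p(x_t | u_t, x_{t-1}) for the odometry motion model with
    Gaussian noise. x_prev = (x, y, theta); u = (rot1, trans, rot2) is the
    relative motion reported by odometry; alphas are noise parameters."""
    rot1, trans, rot2 = u
    a1, a2, a3, a4 = alphas
    # perturb each motion component with zero-mean Gaussian noise
    r1 = rot1 - rng.gauss(0.0, math.sqrt(a1 * rot1**2 + a2 * trans**2))
    tr = trans - rng.gauss(0.0, math.sqrt(a3 * trans**2 + a4 * (rot1**2 + rot2**2)))
    r2 = rot2 - rng.gauss(0.0, math.sqrt(a1 * rot2**2 + a2 * trans**2))
    x, y, theta = x_prev
    return (x + tr * math.cos(theta + r1),
            y + tr * math.sin(theta + r1),
            theta + r1 + r2)
```

With all noise parameters set to zero, the sample reduces to the deterministic odometry prediction.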
Figure 3.1: Bayesian network representation of the robot’s motion. The motion model assumes
that the state of the robot evolves according to a first-order Markov process.
Hence, the state xt is only conditionally dependent upon its immediately preceding
state xt−1 and the issued control command ut .
Figure 3.2: Bayesian network representation of the robot’s perception. (a) An observation zt
is conditionally dependent upon the pose of the robot xt and the map m of the
environment. (b) An observation zt at time t often comprises several individual
measurements, giving zt = {zt1 , . . . , ztK }. These individual measurements are
assumed to be independent.
3.2 Robot Perception
To perceive its environment, the robot is equipped with sensors. The sensors are typically
mounted on the robot. Therefore, they sense the environment relative to the current pose
of the robot. Sensor models describe the formation process by which sensor measurements
are generated in the physical world.
All sensor measurements are subject to noise. In probabilistic robotics, sensor models
therefore explicitly model the uncertainty in the sensors. Hence, sensor models are defined
as the conditional probability distribution
p(zt | xt , m),
(3.2)
where zt is the measurement, xt is the pose of the robot, and m is the map of the
environment. An observation zt at time t often comprises several individual measurements,
Figure 3.3: An occupancy grid map. Gray shades indicate the probability that a cell c is
occupied, where white means p(c) = 0, and black means p(c) = 1.
e. g., several laser beams. Hence we have

zt = {zt1, . . . , ztK},    (3.3)

where zt1, . . . , ztK refer to the individual measurements. We assume that the individual
measurements ztk are independent, allowing us to write

p(zt | xt, m) = ∏_{k=1}^{K} p(ztk | xt, m).    (3.4)
Figure 3.2 illustrates the perception of the robot in terms of two Bayesian networks.
Again, the concrete realization of p(ztk | xt , m) depends on the sensor which is used.
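As a concrete example, the following combines a simple two-component beam model with the independence assumption of Eq. (3.4). The mixture weights, the standard deviation, and the maximum range below are illustrative values, not calibrated sensor parameters:

```python
import math

def beam_likelihood(z, z_expected, sigma=0.2, z_max=30.0, w_hit=0.9, w_rand=0.1):
    """A simplified two-component beam model: a Gaussian around the expected
    range plus a uniform term for random measurements."""
    gauss = math.exp(-0.5 * ((z - z_expected) / sigma) ** 2) \
        / (sigma * math.sqrt(2 * math.pi))
    return w_hit * gauss + w_rand / z_max

def scan_log_likelihood(ranges, expected_ranges):
    """log p(zt | xt, m) = sum_k log p(ztk | xt, m), i.e. Eq. (3.4)
    evaluated in log space to avoid underflow for scans with many beams."""
    return sum(math.log(beam_likelihood(z, ze))
               for z, ze in zip(ranges, expected_ranges))
```

A scan matching the expected ranges scores a higher log-likelihood than one that deviates from them.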
3.3 Grid Maps
In this thesis, we use grid maps to model the environment. Grid maps partition the
continuous space of locations in the environment into a grid of rectangular cells, the
so-called grid cells. Each grid cell contains information about the corresponding area
in the environment. The most frequently used grid maps are occupancy grid maps and
reflection probability grid maps. Occupancy grid maps assume that each grid cell is
either occupied by an obstacle or free. Each cell thereby stores the probability that the
particular cell is occupied. Reflection probability grid maps store the probability that a
beam is reflected by an obstacle within this particular cell. Figure 3.3 depicts a simple
occupancy grid map.
The occupancy grid map and the reflection probability grid map are efficient approaches
for representing uncertainty. Grid maps allow for detailed representations of an environment without the need for a predefined feature extractor. As they have the ability
Figure 3.4: Bayesian network representation of the problem of mapping with known poses. The
measurements z1:t as well as the robot trajectory x1:t are known. The goal is to
compute the map m.
to represent occupied space, empty space, and unknown (unobserved) space, grid maps
are well suited for tasks such as path-planning, obstacle avoidance, and exploration. In
contrast to most representations based on features, grid maps offer constant time access
to cells. However, grid maps suffer from discretization errors. Furthermore, they require
a significant amount of memory resources.
3.4 Mapping with Known Poses
The term mapping with known poses refers to the problem of producing maps from noisy
measurements z1:t , under the assumption that the poses of the robot are known at all
times t. In practice, the true trajectory of the robot x1:t cannot be inferred directly from
the odometry measurements u1:t because of the inherent noise present in all measurements.
Nevertheless, the problem of mapping with known poses is of vital importance in the
context of SLAM. This is due to the fact that most SLAM approaches do not output
the map itself. Instead, they aim to recover the true robot trajectory. In short, mapping
with known poses can be used to process the trajectory estimate of the SLAM approach
to produce the final map.
The problem comprises the dependencies illustrated by the Bayesian network depicted
in Figure 3.4. In general terms, mapping with known poses can be expressed as computing
the posterior distribution over the maps m given the measurements z1:t and the robot
poses x1:t , i. e.,
p(m | z1:t , x1:t ).
(3.5)
For reasons of simplicity, it is common to assume that the grid cells are independent
of each other. The probability of the entire map is then given by the product of the
probabilities of the individual cells. Hence, we have
p(m) = ∏_{c∈m} p(c),    (3.6)

and, in particular,

p(m | z1:t, x1:t) = ∏_{c∈m} p(c | z1:t, x1:t).    (3.7)

Solving the estimation problem given in Eq. (3.5) turns into solving a collection of several
separate binary estimation problems, i. e.,

p(c | z1:t, x1:t)    (3.8)
for all grid cells c ∈ m. Although Eq. (3.6) expresses a strong assumption, the resulting
maps typically are of reasonable quality. Only very few approaches exist that consider
such dependencies. Notable exceptions are Thrun (2003) and O’Callaghan et al. (2009).
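The factorization of Eq. (3.7) can be sketched directly; evaluating the product in log space keeps the computation numerically stable when the map has many cells:

```python
import math

def map_posterior(cell_posteriors):
    """Under the cell independence assumption of Eq. (3.6)/(3.7), the map
    posterior is the product of the per-cell posteriors p(c | z_{1:t}, x_{1:t}).
    Computed as exp(sum of logs) to avoid numerical underflow."""
    return math.exp(sum(math.log(p) for p in cell_posteriors.values()))
```

For example, two cells each with posterior 0.5 yield a map posterior of 0.25.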
3.4.1 Occupancy Probability Mapping
The standard occupancy grid mapping approach, presented by Moravec and Elfes (1985),
solves the estimation problem for each grid cell, as given in Eq. (3.8), using a Bayes filter.
The corresponding recursive occupancy update formula is given by
p(c | z1:t, x1:t) = [1 + ((1 − p(c | zt, xt)) / p(c | zt, xt)) · (p(c) / (1 − p(c))) · ((1 − p(c | z1:t−1, x1:t−1)) / p(c | z1:t−1, x1:t−1))]^(−1).    (3.9)
The term p(c) describes the prior probability that a cell is occupied by an obstacle.
Typically, a prior of p(c) = 1/2 is assumed. The actual implementation of p(c | zt, xt)
depends on the sensor which is used.
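The update of Eq. (3.9) is usually implemented in its algebraically equivalent log-odds form, which turns the product into a sum. A minimal sketch:

```python
import math

def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))

def update_occupancy(p_prev, p_inv_sensor, p_prior=0.5):
    """One step of the recursive update in Eq. (3.9), carried out in the
    algebraically equivalent log-odds form:
        l_t = l_{t-1} + logit(p(c | z_t, x_t)) - logit(p(c))."""
    l = logit(p_prev) + logit(p_inv_sensor) - logit(p_prior)
    # convert the log-odds back to a probability
    return 1.0 - 1.0 / (1.0 + math.exp(l))
```

Starting from the prior 0.5, a single observation with inverse sensor model 0.7 yields an occupancy probability of 0.7, and repeated consistent observations push the estimate further toward 1.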
3.4.2 Reflection Probability Mapping
In contrast to occupancy grid maps, which store the probability that a cell is occupied
by an obstacle, reflection probability maps capture the probability that a laser beam is
reflected by an object within a particular grid cell. Reflection probability maps can be
computed by means of the so-called simple counting model. It states that the reflection
probability of a cell c is given by
p(c | z1:t, x1:t) = hits(c, z1:t, x1:t) / (hits(c, z1:t, x1:t) + misses(c, z1:t, x1:t)),    (3.10)
where hits(c, z1:t , x1:t ) refers to the number of times a beam zik measured from xi ∈ x1:t
ended in cell c, and misses(c, z1:t , x1:t ) refers to the number of times a beam has passed
through cell c.
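A minimal sketch of the counting model; the ray casting that produces `traversed_cells` (e.g. by a Bresenham-style grid traversal) is assumed to be done elsewhere:

```python
from collections import defaultdict

class ReflectionMap:
    """Reflection probability map via the simple counting model, Eq. (3.10):
    p(c) = hits(c) / (hits(c) + misses(c))."""

    def __init__(self):
        self.hits = defaultdict(int)
        self.misses = defaultdict(int)

    def integrate_beam(self, traversed_cells, end_cell):
        for c in traversed_cells:
            self.misses[c] += 1   # the beam passed through these cells
        self.hits[end_cell] += 1  # the beam was reflected in this cell

    def reflection_probability(self, cell, prior=0.5):
        h, m = self.hits[cell], self.misses[cell]
        return prior if h + m == 0 else h / (h + m)
```

Cells never observed fall back to the prior; a cell hit once and passed through once gets reflection probability 0.5.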
Figure 3.5: Drawings of two graphs. The blue circles and the lines represent the nodes and
the edges in the graphs, respectively. (a) A drawing of a disconnected graph. A
disconnected graph contains a pair of nodes such that there exists no path in the
graph which links the two nodes. (b) A complete graph. All nodes in a complete
graph are pairwise adjacent.
3.5 Graphs
In this thesis, we use a graph-based approach to address the SLAM problem. A graph
is a pair G = (V, E) of sets such that E ⊆ V × V . The elements of V are the nodes or
vertices of the graph, the elements of E are the edges of the graph.
In the remainder of this document, we will stick to the following basic terminology of
graph theory:
• The nodes belonging to an edge are called incident with that edge.
• The two nodes incident with an edge are its ends.
• If v is incident with an edge e, we say that edge e is an edge at v.
• The set of all edges in E at v is denoted by E(v).
• An edge joins its ends.
• Two edges are adjacent if they have an end in common.
• Two nodes x, y of G are adjacent, or neighbors, if {x, y} is an edge of G.
• If all nodes of G are pairwise adjacent, then G is called complete.
• The set of neighbors of a node v in G is denoted by NG (v), or briefly by N (v).
• The degree of a node v is the number of edges at v, written as dG (v), or briefly as
d(v).
• The number of nodes of a graph G is its order, written as |G|.
• The number of edges of a graph G is its size, denoted by ||G||.
• A graph G0 = (V 0 , E 0 ) is a subgraph of G = (V, E) if V 0 ⊆ V and E 0 ⊆ E. In this
case, G is a supergraph of G0 .
• A clique in a graph G = (V, E) is a non-empty set of nodes C ⊆ V such that all
nodes in C are pairwise adjacent in G.
• A path is a non-empty graph P = (V, E) of the form V = {x0, x1, . . . , xk} and
E = {{x0, x1}, {x1, x2}, . . . , {xk−1, xk}}, where i ≠ j ⇒ xi ≠ xj. In this case, P
links its ends x0 and xk.
• A non-empty graph G is called connected if any two of its nodes are linked by a
path in G. A non-empty graph which is not connected is called disconnected.
• The adjacency matrix A = (aij)n×n of a graph G = (V, E) is defined by aij = 1 if
{vi, vj} ∈ E, and aij = 0 otherwise.    (3.11)
The usual way to visualize a graph is by drawing a symbol for each node, and by
drawing a straight or curved line between two nodes if and only if there is an edge joining
them. Figure 3.5 exemplifies a disconnected graph and a complete graph.
See, for instance, Diestel (2005) for a thorough introduction to graph theory.
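Several of the above definitions translate directly into code. A small sketch covering the adjacency matrix of Eq. (3.11), completeness, and connectedness:

```python
import numpy as np

def adjacency_matrix(n, edges):
    """Adjacency matrix of Eq. (3.11): a_ij = 1 iff {v_i, v_j} is an edge."""
    A = np.zeros((n, n), dtype=int)
    for i, j in edges:
        A[i, j] = A[j, i] = 1
    return A

def is_complete(A):
    """All nodes are pairwise adjacent: ones everywhere off the diagonal."""
    n = len(A)
    return np.array_equal(A, np.ones((n, n), dtype=int) - np.eye(n, dtype=int))

def is_connected(A):
    """Graph search from node 0: connected iff every node is reachable."""
    n = len(A)
    seen, stack = {0}, [0]
    while stack:
        v = stack.pop()
        for u in np.flatnonzero(A[v]):
            if u not in seen:
                seen.add(int(u))
                stack.append(int(u))
    return len(seen) == n
```

The triangle on three nodes is both complete and connected, whereas a graph on three nodes with a single edge is disconnected.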
4 The Graph-based Formulation of the
SLAM Problem
In this thesis, we use a graph-based approach to address the simultaneous localization
and mapping (SLAM) problem. Graph-based approaches to SLAM cast map building as
an optimization problem. They model the poses of the robot and the map features as
nodes in a graph. The edges of the graph model spatial constraints between the nodes.
These constraints naturally arise from odometry measurements and from observations.
The spatial configuration of the nodes which best satisfies the constraints encoded in the
edges corresponds to the most likely map given the measurements. Graph-based SLAM
approaches apply optimization techniques to determine this configuration.
This chapter provides a general introduction to the graph-based formulation of the
SLAM problem. Section 4.1 first introduces the SLAM problem from a formal, probabilistic point of view. It draws a distinction between the online and the full SLAM
problem. Since graph-based approaches to SLAM aim to solve the latter, Section 4.2
derives a closed form expression to calculate the corresponding posterior distribution.
Section 4.3 discusses the importance of a robust approach to loop closing, i. e., the ability
to recognize places that the robot previously visited. Section 4.4 introduces a graphical,
intuitive representation of the full SLAM posterior, namely the concept of pose graphs.
Assuming independent Gaussian noise, Section 4.5 formulates the SLAM problem as a
least squares problem, which can be solved efficiently. Finally, Section 4.6 discusses the
task of computing the most likely configuration of the nodes in the pose graph, yielding
the maximum likelihood map.
This introduction to the graph-based formulation of the SLAM problem is mainly
based on Olson (2008), Thrun et al. (2005), and Kaess et al. (2008).
4.1 Simultaneous Localization and Mapping
A robot that has to acquire a map of its initially unknown environment has to solve the
so-called simultaneous localization and mapping (SLAM) problem. SLAM is challenging
because it is a “chicken-and-egg” problem. On the one hand, an accurate estimate of the
pose of the robot is needed while building the map. Estimating the pose of the robot
is called localization. On the other hand, localization needs a map to correct the pose
uncertainty which arises from the odometry measurements due to noise. Localization
and map building are therefore inherently coupled. SLAM is regarded as one of the
most fundamental problems in mobile robotics because the ability to build maps is a key
precondition for truly autonomous mobile robots.
The remainder of this section defines the SLAM problem from a formal, probabilistic
point of view. The subsequent sections will be concerned with an approach to solve the
problem.
In probabilistic robotics, the SLAM problem is described as finding the joint posterior
distribution of the map m and the poses of the robot x1:t given a set of observations z1:t
and a set of robot motion commands u1:t . A distinction is drawn between the full SLAM
and the online SLAM problem.
The full SLAM problem refers to estimating the joint posterior probability distribution
over the entire robot path x1:t along with the map m given the observations z1:t and the
control commands u1:t . The full SLAM posterior is thus given by
p(x1:t , m | z1:t , u1:t ).
(4.1)
Integrating out all past robot poses x1:t−1 yields the online SLAM posterior. Hence, the
online SLAM posterior is
p(xt, m | z1:t, u1:t) = ∫ · · · ∫ p(x1:t, m | z1:t, u1:t) dx1 dx2 . . . dxt−1.    (4.2)
Figure 4.1 illustrates the structure of the SLAM problem in general and emphasizes the
differences between the online SLAM problem and the full SLAM problem.
In online SLAM, a robot pose is typically integrated out immediately once a new pose
comes into play only keeping track of the current pose of the robot. As already mentioned
in Chapter 2, integrating out past robot poses substantially changes the dependency
structures. These structural changes will be of great interest for the work presented in
this thesis.
The distributions above assume that the so-called data association problem has been
solved. Data association refers to the problem of identifying a feature in one observation
as the very same feature found in another observation. The data association problem is
crucial to SLAM and thus must not be neglected. Let di represent the identities of the
features perceived in observation zi . Considering the data association problem, the full
SLAM posterior turns into
p(x1:t , m, d1:t | z1:t , u1:t ).
(4.3)
Integrating out all past robot poses x1:t−1 and summing over all past data associations d1:t−1 yields the online SLAM posterior
p(xt, m, dt | z1:t, u1:t) = ∫ · · · ∫ Σ_{d1} Σ_{d2} · · · Σ_{dt−1} p(x1:t, m, d1:t | z1:t, u1:t) dx1 dx2 . . . dxt−1.    (4.4)
Figure 4.1: Bayesian network representation of the SLAM problem. In SLAM, the control
commands u1:t and the observations z1:t are given, the robot poses x1:t and the
map m are unknown. The goal is to compute the map m. (a) Online SLAM aims to
recover the current robot pose xt only. (b) In contrast, full SLAM aims to recover
the entire trajectory of the robot, i. e., all robot poses x1:t .
4.2 Computing the Full SLAM Posterior
In Section 4.1, we described the SLAM problem as finding the posterior probability
distribution of a map. We furthermore introduced the online and the full SLAM problem.
Graph-based SLAM aims to solve the latter. Hence, in this section, we will derive a
closed form expression to calculate the corresponding posterior distribution.
According to Eq. (4.3), the full SLAM posterior assuming known data associations is
given by
p(x1:t , m | z1:t , u1:t , d1:t ).
(4.5)
Using Bayes’ theorem, we obtain
p(x1:t , m | z1:t , u1:t , d1:t )
= η p(zt | x1:t , m, z1:t−1 , u1:t , d1:t ) p(x1:t , m | z1:t−1 , u1:t , d1:t ),
(4.6)
where η is the normalizer resulting from Bayes’ theorem, which ensures that the final
distribution is normalized to 1. We proceed by simplifying the two probabilities on the right.
By exploiting the assumption that our state is complete, we have
p(zt | x1:t , m, z1:t−1 , u1:t , d1:t ) = p(zt | x1:t , m, dt ).
(4.7)
Partitioning x1:t , m into xt and x1:t−1 , m, and then removing conditionals which are
irrelevant under the Markov assumption leads to
p(x1:t , m | z1:t−1 , u1:t , d1:t )
= p(xt | x1:t−1 , m, z1:t−1 , u1:t , d1:t ) p(x1:t−1 , m | z1:t−1 , u1:t , d1:t )
= p(xt | xt−1 , ut ) p(x1:t−1 , m | z1:t−1 , u1:t−1 , d1:t−1 ).
(4.8)
Combining the two intermediate results from Eq. (4.7) and Eq. (4.8) with Eq. (4.6), we
obtain the recursive formula
p(x1:t , m | z1:t , u1:t , d1:t )
= η p(zt | xt , m, dt ) p(xt | xt−1 , ut ) p(x1:t−1 , m | z1:t−1 , u1:t−1 , d1:t−1 ).
(4.9)
Induction over time t yields the closed form expression
p(x1:t, m | z1:t, u1:t, d1:t) = η p(x1, m) ∏_t p(xt | ut, xt−1) p(zt | xt, m, dt).    (4.10)
Eq. (3.4) from Section 3.2 states that the individual measurements are independent of
each other, allowing us to write
p(x1:t, m | z1:t, u1:t, d1:t) = η p(x1, m) ∏_t [ p(xt | ut, xt−1) ∏_k p(ztk | xt, m, dtk) ].    (4.11)
And finally, we drop the joint prior on the map m and on the initial state x1 for simplicity,
which results in
p(x1:t, m | z1:t, u1:t, d1:t) = η ∏_t [ p(xt | ut, xt−1) ∏_k p(ztk | xt, m, dtk) ].    (4.12)
Note that the expression in Eq. (4.12) corresponds to the Bayesian network representations
depicted in Figure 4.1. The term p(xt | ut, xt−1) refers to the motion model, discussed
in Section 3.1. The term p(ztk | xt, m, dtk) corresponds to the measurement model
from Section 3.2, assuming known correspondences. In Section 4.4, we will introduce an
intuitive representation of the posterior distribution given in Eq. (4.12), namely the pose
graph.
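Anticipating the least squares formulation of Section 4.5: under independent Gaussian noise, each factor in Eq. (4.12) contributes a quadratic term to the negative log posterior (up to constants). A minimal sketch, where the constraint residuals and their information matrices are assumed to be given:

```python
import numpy as np

def neg_log_posterior(errors, information_matrices):
    """Negative log of Eq. (4.12) under independent Gaussian noise: each
    motion or measurement factor contributes a term e^T Omega e, so
    maximizing the posterior means minimizing the summed squared errors
    (the normalizer eta is constant and can be ignored during optimization).
    `errors` are constraint residuals, `information_matrices` their
    inverse covariances."""
    return sum(float(e @ omega @ e)
               for e, omega in zip(errors, information_matrices))
```

A single unit residual with identity information matrix contributes exactly 1 to the objective.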
4.3 Loop Closing
As already mentioned in Section 4.1, an accurate estimate of the current pose of the
robot is required while building a map. Better pose estimates result in better maps. It
is therefore desirable for the robot to maintain an accurate estimate about its current
pose during the entire mapping process. However, as the robot travels through unknown
terrain, it basically has to rely on dead reckoning or odometry measurements to update
its pose estimate. The odometry estimate is typically affected by significant noise due to
unpredictable wheel slippage, and the like. Techniques which improve upon the estimate
provided by odometry, such as incremental scan matching, can only mitigate the error
induced with every pose update, but cannot avoid that the error accumulates along
the path. Even a slight error in orientation can thereby lead to a vast position error
later on. As a consequence, the robot becomes more and more uncertain about its pose.
Fortunately, once the robot comes back to a previously observed place, whose position
is known to the robot with relative certainty, it has the chance to compensate the pose
error. That is to say, if the robot is now able to recognize the place as the very same
place it was before, it can refine its current pose estimate with respect to that place,
creating a geometric relationship between the corresponding poses. In the context of the
SLAM problem, place recognition is therefore referred to as loop closing. As we will see
in Section 4.6, each loop closure allows the robot to retrospectively refine its trajectory
estimate, and hence the map.
The difficulty in closing loops is to find the correct data associations when re-entering
known terrain. As already mentioned in Section 4.1, data association refers to the
problem of identifying a feature in one observation as the very same feature found in
another observation. In feature-based approaches, closing loops therefore corresponds to
associating landmarks with previously observed landmarks. In grid-based approaches, a
loop closure typically results from successfully matching two or more sensor scans. Scan
matching is the task of finding the transformation that best aligns two or more scans.
We refer the reader to Censi (2006) and Olson (2009) for more details on scan matching.
On the one hand, if the robot fails to detect a loop closure, it potentially re-maps
previously visited areas in the wrong global location, leading to an inconsistent map. On
the other hand, if the robot closes an incorrect loop, the resulting wrong refinement of
the robot trajectory typically has catastrophic effects on the map.
Hence, a robust approach to loop closing is an essential component of any successful
mapping system. However, loop closing is challenging. Sensors such as laser scanners provide a rather limited view of the environment; as a consequence, entirely different places may look alike in the scans. Loop closing algorithms therefore need to discriminate between distinct places which appear very similar. Furthermore, environments that have slightly changed due to moving objects should still be recognized.
In addition to that, algorithms need to be robust to spurious measurements, which are
caused by noise.
Unfortunately, for the reasons mentioned above, feature matching is typically slow
and error prone. Most approaches therefore do not perform feature matching against
all observations. They rather restrict the set of observations to those for which feature
matching seems likely. Hence, the algorithms only consider those observations which
were recorded close to the current pose. They therefore identify all former poses which
are within the uncertainty ellipsoid of the current pose estimate (e.g., its 95 % confidence region).
Figure 4.2: A pose graph. The triangles are the nodes in the pose graph and represent the robot poses discretized over time. The edges in the pose graph represent spatial constraints between the robot poses arising from odometry readings (shown in blue) and from scan matching (shown in red). Each edge describes a likelihood as a function of the deviation from the preferred mutual arrangement of the two nodes it joins. The most likely positions of the nodes are therefore implicitly given by the edges.
In graph-based approaches to SLAM, this set of observations can be found by performing
a Dijkstra projection (Dijkstra, 1959) of the node covariances beginning at the current
node. Tipaldi et al. (2007) describe how to efficiently compute the required marginal covariances.
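The candidate search can be sketched as a Dijkstra expansion over the pose graph. In the following sketch, a single scalar uncertainty per edge stands in for the projected covariances of Tipaldi et al. (2007), and all names are ours:

```python
import heapq

def loop_closure_candidates(edges, start, max_uncertainty):
    """Dijkstra expansion from the current node.

    `edges` maps a node id to a list of (neighbor, uncertainty) pairs;
    a scalar edge uncertainty is a simplified stand-in for projecting
    the full pose covariances along the graph. Returns all nodes whose
    accumulated uncertainty stays below `max_uncertainty`.
    """
    dist = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist[node]:
            continue  # stale queue entry
        for neighbor, u in edges.get(node, []):
            nd = d + u
            if nd <= max_uncertainty and nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(queue, (nd, neighbor))
    return dist
```

Only the poses returned by such an expansion would then be considered for scan matching against the current observation.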
4.4 The Pose Graph Metaphor
In Section 3.5, we introduced the graph as an abstract representation of a set of objects
where some pairs of objects are linked by edges. Graph-based SLAM models the mapping
problem by means of a graph, a so-called pose graph. This section therefore introduces
the concept of pose graphs. It draws a distinction between pose graphs and pose/feature
graphs.
4.4.1 Pose Graphs
Graph-based SLAM encodes the experience of a mobile robot in a graph, a so-called
pose graph. The nodes in a pose graph represent poses of the robot along its trajectory,
discretized over time. The edges in a pose graph represent information about the
geometric relation of the corresponding robot poses. Together with the sensor model,
each edge describes a likelihood as a function of the deviation from the preferred mutual
arrangement of the two nodes it joins. The negative log likelihood is often referred to as
the cost of an edge. The most likely positions of the nodes are therefore implicitly given
by the edges. We will be concerned with determining these positions in the following
sections. The spatial constraints in a pose graph arise from odometry measurements
(see Section 3.1) and from loop closures (see Section 4.3). Constraints can express any
Figure 4.3: A pose/feature graph. The triangles and the stars are both nodes in the pose/feature graph. The triangles represent the robot poses discretized over time. The stars represent map features. The edges in the graph represent spatial constraints between the nodes arising from odometry readings (shown in blue) and from feature observations (shown in red). Each edge describes a likelihood as a function of the deviation from the preferred mutual arrangement of the two nodes it joins. The most likely positions of the nodes are therefore implicitly given by the edges and their cost functions.
type of information such as a range constraint, a bearing constraint, or a full rigid-body transformation. The set of spatial constraints constitutes the full SLAM posterior probability distribution. Figure 4.2 exemplifies a simple pose graph. The edges arising from odometry measurements and from loop closures are depicted in blue and red, respectively.
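A pose graph can thus be represented by little more than a pose table and an edge list; a minimal sketch (the names are ours, and a scalar information weight stands in for the full information matrix of each constraint):

```python
from dataclasses import dataclass, field

@dataclass
class Edge:
    """Spatial constraint between nodes i and j: a preferred relative
    pose `mean` and an information (inverse covariance) weight."""
    i: int
    j: int
    mean: tuple          # preferred (dx, dy, dtheta) of j as seen from i
    information: float   # scalar stand-in for the 3x3 information matrix

@dataclass
class PoseGraph:
    poses: dict = field(default_factory=dict)  # node id -> (x, y, theta)
    edges: list = field(default_factory=list)

    def add_pose(self, node_id, pose):
        self.poses[node_id] = pose

    def add_edge(self, i, j, mean, information):
        self.edges.append(Edge(i, j, mean, information))
```

Odometry edges connect consecutive poses, while loop closure edges connect arbitrary pose pairs and make the graph cyclic.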
As shown by Montemerlo et al. (2002), once the poses of the robot have been recovered,
the map can be computed easily based on the original feature-to-pose information via
techniques known as mapping with known poses (see Section 3.4). Hence, the pose graph
is an implicit representation of the map itself.
4.4.2 Pose/Feature Graphs
According to Olson (2008), a graph whose nodes represent not only robot poses but
also map features is called a pose/feature graph. Hence, the spatial constraints in a
pose/feature graph arise from odometry measurements and from feature observations
(see Section 3.2). Pose graphs and pose/feature graphs are closely related to each other
since marginalizing out the features of a pose/feature graph yields its equivalent pose
graph. Figure 4.3 exemplifies a simple pose/feature graph. The edges arising from
odometry measurements and from feature observations are depicted in blue and red,
respectively. In this thesis, we use plain pose graphs since our approach does not rely on
features in the environment, but directly computes grid maps from the laser range data.
4.5 SLAM as a Least Squares Problem
In Section 4.2, we derived a closed form expression of the full SLAM posterior probability
distribution function. Our derivation led to Eq. (4.12). In practice, however, computing
the entire posterior distribution is not feasible when dealing with high-dimensional maps.
Graph-based SLAM approaches therefore only compute the mode of this posterior. In
other words, they compute the most likely map given the measurements. The maximum
likelihood (ML) estimate is given by
\[
x_{1:t}^{*}, m^{*} = \operatorname*{argmax}_{x_{1:t},\, m} p(x_{1:t}, m \mid z_{1:t}, u_{1:t}).
\tag{4.13}
\]
Unfortunately, computing the solution of Eq. (4.13) is still not tractable in general.
Therefore, graph-based approaches to SLAM further reduce the computational burden
by assuming independent Gaussian noise. More precisely, the approaches assume the
motion and the measurement noise to be independent and normally distributed. As we
will see shortly, the independent Gaussian noise assumption leads to a convenient set of
quadratic equations, which can be solved efficiently.
A Gaussian motion model is given by
\[
x_t = g(u_t, x_{t-1}) + w_t,
\tag{4.14}
\]
where the deterministic function g describes the kinematic properties of the robot, and the Gaussian random variable \(w_t \sim \mathcal{N}(0, R_t)\) models the white motion noise with zero mean and covariance matrix \(R_t\). This leads to the posterior probability distribution
\[
p(x_t \mid u_t, x_{t-1}) \propto \exp\left( -\frac{1}{2} \big( x_t - g(u_t, x_{t-1}) \big)^{T} R_t^{-1} \big( x_t - g(u_t, x_{t-1}) \big) \right).
\tag{4.15}
\]
Similarly, the measurement model is given by
\[
z_t^k = h(x_t, m, d_t^k, k) + v_t^k,
\tag{4.16}
\]
where h is the measurement function, and \(v_t^k \sim \mathcal{N}(0, Q_t^k)\) is a Gaussian random variable modeling the white measurement noise with zero mean and covariance \(Q_t^k\). This leads to the posterior
\[
p(z_t^k \mid x_t, m, d_t^k) \propto \exp\left( -\frac{1}{2} \big( z_t^k - h(x_t, m, d_t^k, k) \big)^{T} \big( Q_t^k \big)^{-1} \big( z_t^k - h(x_t, m, d_t^k, k) \big) \right).
\tag{4.17}
\]
Representing probabilities in logarithmic form, the maximum likelihood estimate according to Eq. (4.13) becomes
\[
\begin{aligned}
x_{1:t}^{*}, m^{*} &= \operatorname*{argmax}_{x_{1:t},\, m} p(x_{1:t}, m \mid z_{1:t}, u_{1:t}) \\
&= \operatorname*{argmin}_{x_{1:t},\, m} \left( -\log p(x_{1:t}, m \mid z_{1:t}, u_{1:t}) \right) \\
&= \operatorname*{argmin}_{x_{1:t},\, m} \left( -\log \prod_{t} p(x_t \mid u_t, x_{t-1}) \prod_{t} \prod_{k} p(z_t^k \mid x_t, m, d_t^k) \right) \\
&= \operatorname*{argmin}_{x_{1:t},\, m} \left[ \sum_{t} -\log p(x_t \mid u_t, x_{t-1}) + \sum_{t} \sum_{k} -\log p(z_t^k \mid x_t, m, d_t^k) \right].
\end{aligned}
\tag{4.18}
\]
Assuming the Gaussian motion model according to Eq. (4.15) and the Gaussian measurement model according to Eq. (4.17), the maximum likelihood estimate finally turns into the nonlinear least squares problem
\[
\begin{aligned}
x_{1:t}^{*}, m^{*} = \operatorname*{argmin}_{x_{1:t},\, m} \bigg[ &\sum_{t} \big( x_t - g(u_t, x_{t-1}) \big)^{T} R_t^{-1} \big( x_t - g(u_t, x_{t-1}) \big) \\
{} + {} &\sum_{t} \sum_{k} \big( z_t^k - h(x_t, m_i, k) \big)^{T} \big( Q_t^k \big)^{-1} \big( z_t^k - h(x_t, m_i, k) \big) \bigg].
\end{aligned}
\tag{4.19}
\]
Estimating the most likely robot trajectory and the most likely map corresponds to
solving Eq. (4.19). In the next section, we will discuss the task of solving this optimization
problem.
Note that the motion model expressed in Eq. (4.15) and the measurement model
expressed in Eq. (4.17) form the edges in the pose graph, which we introduced in the
last section. The cost functions associated with the edges are expressed in terms of the
squared Mahalanobis distance given the covariances Rt and Qkt , respectively. The costs
are thus quadratic in the deviation from the preferred configuration of the corresponding
two nodes. Furthermore, since all measurements are considered independent, multiple
edges linking the same pair of nodes can be fused.
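As an illustration of such a quadratic edge cost, a sketch restricted to a diagonal information matrix (the names are ours):

```python
def edge_cost(error, information):
    """Cost of an edge: squared Mahalanobis distance of the deviation
    `error` from the preferred mutual arrangement of the two nodes.
    `information` is the diagonal of the information matrix, so the
    cost is a weighted sum of squared error components."""
    return sum(w * e * e for e, w in zip(error, information))
```

The cost is zero exactly when the two nodes sit in the arrangement the edge prefers, and grows quadratically with any deviation.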
4.6 Pose Graph Optimization
In Section 4.4, we introduced the pose graph and the pose/feature graph. They both
intuitively encode the full SLAM posterior distribution. Assuming independent Gaussian
noise, we formulated the mapping problem as a least squares problem in Section 4.5.
Pose graph optimization is the task of computing the positions of all nodes in the pose
graph such that the edges have minimum cost. In other words, pose graph optimization
solves the least squares problem. Pose graph optimization is also referred to as the SLAM
back-end.
Each edge in the pose graph describes a cost as a quadratic function of the deviation
from the preferred relative arrangement of the two nodes it joins. In the context of
graph-based SLAM, a loop closure results in an edge in the pose graph which relates the
current robot pose with a former robot pose, making the pose graph cyclic. However, in
general, these loop closure edges contradict the pose estimates resulting from plain dead
reckoning. The reason is that dead reckoning estimates are typically rather poor due to
measurement errors like wheel slippage, accumulated along the path. The optimization
aims to arrange the nodes in the pose graph such that the total costs associated with all
edges are minimized. Figure 4.4 illustrates the effect of the pose graph optimization after
a successful loop closure.
Optimizing the pose graph is important because the maximum likelihood configuration
of the nodes corresponds to the best map of the environment. However, there are four
main reasons why pose graph optimization is a challenging endeavor:
• The state space can be very large.
• The constraints in the pose graph are typically nonlinear.
• The likelihood function can exhibit many local minima.
• The initial guess can be very poor.
The state space can be very large since it typically grows with the length of the robot
trajectory. The constraints in the pose graph are typically nonlinear and “wrap-around”
at 360◦ because they include the rotational component of the robot pose, i. e., the
orientation of the robot. Hence, the resulting cost surface is complicated and rarely
convex. Having many local minima, the corresponding optimization problem is difficult
to solve. Furthermore, the initial guess tends to be rather poor. To obtain a first initial
guess of the robot poses, algorithms typically set the first pose to zero and compute the
subsequent poses by simply concatenating the motion model p(xt | ut , xt−1 ). As a result,
the optimal configuration of the pose graph can substantially disagree with its initial
configuration. After all, this discrepancy is the main motivation for optimizing the pose
graph.
Existing approaches to tackle the least squares optimization problem formulated in
Eq. (4.19) are based on
• LU decomposition (applied by Lu and Milios, 1997)
• Gauss-Seidel relaxation (applied by Duckett et al., 2000)
• conjugate gradients (applied by Konolige, 2004; Montemerlo and Thrun, 2004)
(a) The robot travels in a loop and returns to a previously visited place. Its dead reckoning pose estimate is rather poor. (b) The robot recognizes the previously seen place and adds a loop closure constraint to the pose graph. (c) The high error in the loop closure constraint is reduced by adding a small amount of error to other constraints. (d) The pose graph optimization has minimized the total error in the pose graph, and the loop has been closed.
Figure 4.4: Pose graph optimization after a successful loop closure.
• multi-level relaxation (applied by Frese et al., 2005)
• stochastic gradient descent (applied by Grisetti et al., 2007a; Olson et al., 2006)
• sparse Cholesky decomposition (applied by Dellaert and Kaess, 2006)
• QR matrix factorization (applied by Kaess et al., 2007).
In this thesis, we use the approach proposed by Grisetti et al. (2007a). It is a refinement of Olson's algorithm (Olson et al., 2006), which is based on stochastic gradient descent (Robbins and Monro, 1951). It differs from Olson's approach in an improved node parameterization as well as in an optimized update rule within the stochastic gradient descent framework.
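To illustrate what the optimization does, consider a 1-D toy graph: four poses chained by unit odometry constraints, plus a loop closure asserting that the last pose coincides with the first. Plain (non-stochastic) gradient descent on the quadratic cost serves here as a didactic stand-in for the SGD-based method of Grisetti et al.:

```python
def optimize(poses, constraints, iters=500, step=0.1):
    """Minimize sum over constraints of w * (x_j - x_i - d)^2 by
    gradient descent on 1-D poses. `constraints` is a list of
    (i, j, d, w) tuples; pose 0 is held fixed as the anchor."""
    x = list(poses)
    for _ in range(iters):
        grad = [0.0] * len(x)
        for i, j, d, w in constraints:
            e = x[j] - x[i] - d      # deviation from preferred offset d
            grad[j] += 2 * w * e
            grad[i] -= 2 * w * e
        for k in range(1, len(x)):   # skip the anchored first pose
            x[k] -= step * grad[k]
    return x
```

With equal weights, the accumulated loop error of 3 is spread evenly over the four constraints, so each residual settles at 0.75 and the optimized poses are 0, 0.25, 0.5, 0.75.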
5 Our Approach to Life-long Map Learning
The preceding chapter provided an introduction to the graph-based formulation of the
simultaneous localization and mapping problem. Most graph-based approaches constantly
add new nodes to the pose graph even when the robot is moving through already mapped
terrain. Hence, the complexity of these approaches scales with the length of the robot’s
trajectory. A robot which has to constantly update its belief about the environment
throughout its entire life cycle, however, will inevitably run out of resources in this way.
This chapter is devoted to presenting our approach to life-long map learning using the
graph-based formulation of the simultaneous localization and mapping problem. Our
approach to life-long map learning aims to remove those nodes from the pose graph
whose observations do not reduce the uncertainty in the belief of the robot. As a result,
our mapping approach scales with the size of the environment, and not with the length
of the robot’s trajectory.
5.1 Information-Theoretic Node Reduction
The key idea of our sparsification approach is to discard those observations which do
not reduce the uncertainty in the belief of the robot. The belief reflects the internal
knowledge of the robot about the state of the environment and is thus given by
\[
\mathrm{bel}(x_{1:t}, m) = p(x_{1:t}, m \mid z_{1:t}, u_{1:t}, d_{1:t}).
\tag{5.1}
\]
We can factor the belief and obtain
\[
\begin{aligned}
\mathrm{bel}(x_{1:t}, m) &= p(x_{1:t}, m \mid z_{1:t}, u_{1:t}, d_{1:t}) \\
&= p(x_{1:t} \mid z_{1:t}, u_{1:t}, d_{1:t}) \, p(m \mid x_{1:t}, z_{1:t}, u_{1:t}, d_{1:t}).
\end{aligned}
\tag{5.2}
\]
In Section 4.5, we already mentioned that graph-based approaches to SLAM only compute
the maximum likelihood estimate of the robot’s poses. It is given by
\[
x_{1:t}^{*} = \operatorname*{argmax}_{x_{1:t}} p(x_{1:t} \mid z_{1:t}, u_{1:t}, d_{1:t}).
\tag{5.3}
\]
Thus, p(x1:t | z1:t , u1:t , d1:t ) from Eq. (5.2) turns into a Dirac delta distribution at x∗1:t .
In our case, we therefore only need to consider the belief about the map. It is given by
\[
\mathrm{bel}(m) = p(m \mid x_{1:t}^{*}, z_{1:t}, u_{1:t}, d_{1:t}).
\tag{5.4}
\]
Figure 5.1: Binary entropy function. The plot shows the entropy H of a grid cell c as a function
of the probability p(c) that the cell is occupied. Entropy is a measure of the
uncertainty associated with a random variable. The entropy of a grid cell attains
its maximum when the state of the cell is entirely unknown. The probability of an
entirely unknown cell is p(c) = 0.5.
For simplicity of notation, we will drop the conditionals in the remainder of this chapter
simply writing p(m), instead of p(m | x∗1:t , z1:t , u1:t , d1:t ).
We can measure the uncertainty of the robot’s belief in terms of information entropy
(Shannon, 1948).
Definition 1 The information entropy H of a probability distribution p(x) is the expectation of its information content in bits. It can be defined as
\[
H(p(x)) = \mathbb{E}\left[ -\log_2 p(x) \right].
\tag{5.5}
\]
Hence, in case of a discrete probability distribution, entropy is given by
X
H (p(x)) = −
p(x) log2 p(x).
(5.6)
p(x) log2 p(x) dx.
(5.7)
x
In the continuous case, entropy turns into
H (p(x)) = −
Z
x
In the following, we will discuss how to compute H(p(m)). Our approach uses a grid
map m to model the environment. We already introduced grid maps in Section 3.3.
Remember that grid maps partition the environment into a grid of rectangular cells, the
so-called grid cells.
According to the definition of entropy given in Eq. (5.5), the entropy of a single grid cell c is given by
\[
H(p(c)) = -\big( p(c) \log_2 p(c) + p(\neg c) \log_2 p(\neg c) \big),
\tag{5.8}
\]
where p(c) is the probability that cell c is occupied, and p(¬c) = 1 − p(c) is the probability
that cell c is free. Consider Figure 5.1. It depicts the entropy of a grid cell as a function
of the probability that the cell is occupied. Note that the entropy attains its maximum
when the state of the cell is entirely unknown. The probability of an entirely unknown
cell is p(c) = 0.5.
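Eq. (5.8) is straightforward to implement; a minimal sketch (the function name is ours):

```python
import math

def binary_entropy(p: float) -> float:
    """Entropy (in bits) of a grid cell with occupancy probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a fully known cell carries no uncertainty
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))
```

The function peaks at 1 bit for p = 0.5 and is symmetric around it, matching the plot in Figure 5.1.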
In Section 3.4, we stated that the grid cells of a grid map are independent of each other, allowing us to write
\[
p(m) = \prod_{c \in m} p(c).
\tag{5.9}
\]
Hence, for a grid map, the entropy is computed by summing over the entropy values of the individual grid cells, giving
\[
H(p(m)) = \sum_{c \in m} H(p(c)) = -\sum_{c \in m} \big( p(c) \log_2 p(c) + p(\neg c) \log_2 p(\neg c) \big).
\tag{5.10}
\]
Definition 2 The information gain I of an observation zi is the reduction of entropy in the belief of the robot caused by the observation. It can be defined as
\[
I(z_i) = H\big( p(m \mid z_{1:t} \setminus \{ z_i \}) \big) - H\big( p(m \mid z_{1:t}) \big),
\tag{5.11}
\]
where z1:t refers to the set of all observations in the graph.
According to this definition, the information gain of an observation has the following
properties:
• Observations which explore unknown territory decrease the uncertainty in the belief
of the robot. Hence, they will exhibit a positive information gain.
• Observations which agree with former measurements do not change the map and thus do not change the uncertainty in the belief of the robot. Their information gain will therefore be zero or close to zero.
• Observations which disagree with former measurements increase the uncertainty in
the belief of the robot. Consequently, they will exhibit a negative information gain.
Hence, if I(zi ) > 0, observation zi contains valuable, new information about the
environment and should be preserved. If I(zi ) ≤ 0, observation zi does not reduce or
even increases the uncertainty in the belief of the robot and should thus be discarded.
As already mentioned above, the goal of our approach is to discard all observations zi for which I(zi) ≤ 0. Note that discarding an observation typically affects the information gain of other observations, in particular of those observations which were recorded nearby. To identify all observations which can be discarded, our sparsification approach therefore iterates the following steps:
1. Compute the grid map m given all observations z1:t in the graph.
2. Compute the information gain I(zi ) for all observations zi ∈ z1:t .
3. Determine the observation zk with the lowest information gain I(zk ).
4. If I(zk) > 0, terminate.
5. If I(zk) ≤ 0, discard zk and proceed with step 1.
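The iteration above can be sketched compactly. In the following sketch, `grid_map_from` and `entropy_of` are hypothetical stand-ins for the grid mapping and entropy computation described in this section:

```python
def sparsify(observations, grid_map_from, entropy_of):
    """Greedy node reduction: repeatedly drop the observation with the
    lowest information gain until every remaining observation has
    positive gain. `grid_map_from` builds a map from a set of
    observations; `entropy_of` returns the entropy of a map."""
    obs = set(observations)
    while obs:
        h_all = entropy_of(grid_map_from(obs))
        # information gain of z: H(map without z) - H(map with z)
        gains = {z: entropy_of(grid_map_from(obs - {z})) - h_all
                 for z in obs}
        z_min = min(gains, key=gains.get)
        if gains[z_min] > 0:
            break  # every remaining observation reduces uncertainty
        obs.discard(z_min)
    return obs
```

In a toy model where each observation simply marks a set of cells as known, a redundant observation (one whose cells are covered by others) has zero gain and is discarded.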
Our approach typically operates online, which means that it prunes the pose graph
while the SLAM approach is constructing it. Since the perception of the robot is typically
limited to its vicinity, our implementation only considers those observations which were
recorded close to the current location of the robot, and not all observations z1:t . These
observations can quickly be determined by performing a Dijkstra expansion (Dijkstra,
1959) on the pose graph, starting from the current pose of the robot. Furthermore, our
implementation does not regenerate the entire grid map every time the graph changes.
Instead, the grid map is implemented such that it maintains counters in each grid cell to
keep track of its occupancy probability resulting from the observations. In this way, the
grid map can efficiently reflect changes in the graph, such as the removal of a node or a
change in position of a node.
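The counter idea can be illustrated with a minimal cell class (a simplification of ours, not the thesis implementation; a real cell would be updated per laser beam):

```python
class CountingCell:
    """Grid cell that counts hits (beam endpoints) and misses (beam
    pass-throughs) so that an observation's contribution can later be
    subtracted again when its node is removed from the graph."""
    def __init__(self):
        self.hits = 0
        self.misses = 0

    def add(self, hit: bool, sign: int = 1):
        if hit:
            self.hits += sign
        else:
            self.misses += sign

    def remove(self, hit: bool):
        self.add(hit, sign=-1)  # undo a previously added measurement

    def p_occupied(self) -> float:
        n = self.hits + self.misses
        return 0.5 if n == 0 else self.hits / n
```

Because removal is the exact inverse of addition, the map stays consistent with the current set of observations without ever being rebuilt from scratch.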
In this section, we described how to identify observations which can be discarded
without losing relevant information about the environment. However, removing the
corresponding nodes from the pose graph is challenging. A naïve technique introduces a
large number of new constraints and thus destroys the sparsity pattern of the pose graph.
In the following sections, we will introduce an efficient approach to removing nodes from
the pose graph, while preserving sparsity and most of the information encoded in the
incident edges.
5.2 Pose Graphs as Gaussian Markov Random Fields
In Section 4.4, we introduced the pose graph as an intuitive representation of the robot’s
belief represented in information form. An advantageous property of the pose graph
is its interpretation as a Gaussian Markov random field (GMRF). Markov random
fields (Jordan, 1998; Kindermann and Snell, 1980) are useful to analyze the conditional
independence structure of a probability distribution. Markov random fields will help us
understand the structural changes in the pose graph that take place when we remove
nodes. This section therefore briefly describes the Markov random field framework.
Definition 3 A potential function is any function whose range is the nonnegative real
numbers.
Definition 4 A Markov random field is given by a graph G = (V, Ψ), where V is a set
of random variables x1 , . . . , xn , and Ψ is a set of potential functions over subsets of V .
Figure 5.2: Conditional independence properties in a Markov random field. Every path from
any node in set A to any node in set B passes through at least one node in set S.
As a result, the conditional independence property A ⊥⊥ B | S is valid for any
probability distribution described by this graph.
The joint distribution represented by G is
\[
p_G(V) = \frac{1}{Z} \prod_{\psi \in \Psi} \psi(V_\psi),
\tag{5.12}
\]
where \(V_\psi \subseteq V = \{x_1, \ldots, x_n\}\) is the domain of \(\psi\), and
\[
Z = \int_{x_1} \cdots \int_{x_n} \prod_{\psi \in \Psi} \psi(V_\psi) \, dx_1 \cdots dx_n.
\tag{5.13}
\]
Consider Figure 5.2. In a Markov random field, any two subsets A and B of nodes are
conditionally independent given a separating subset S of nodes if every path from any
node in set A to any node in set B passes through at least one node in set S. Hence, any
two variables which are not directly connected by a link are conditionally independent
given all other variables. Furthermore, a variable is conditionally independent of all other
variables given its Markov blanket, i. e., the set of its neighbors.
Gaussian Markov random fields are a subset of Markov random fields suitable for
expressing multivariate Gaussian distributions, like the belief of the robot. Translating
a pose graph into a Gaussian Markov random field given by the graph G = (V, Ψ) is
straightforward. The nodes in the pose graph correspond to the random variables in V .
All the potential functions in Ψ are either unary or binary. The unary potentials are
typically set to unity, except the one for the initial pose, which is bound to the origin. The
likelihood functions associated with the edges of the pose graph correspond to the binary
potentials in Ψ. A missing edge between xi and xj implies conditional independence
between xi and xj . In this case, ψij (xi , xj ) is unity. Hence, the belief of the robot is
given by
\[
p(x_1, \ldots, x_t) = \frac{1}{Z} \prod_{i=1}^{t} \psi_i(x_i) \prod_{j=i+1}^{t} \psi_{i,j}(x_i, x_j).
\tag{5.14}
\]
5.3 Marginalization Ceases Sparsity
Representing the belief of the robot using a Gaussian Markov random field helps us
understand the structural changes in the pose graph that take place when we remove
a node. In the Markov random field framework, removing a node from the pose graph
translates to eliminating a variable from the corresponding probability distribution. The
process of eliminating a variable from a probability distribution is called marginalization.
Suppose the belief of the robot is given by the joint Gaussian probability distribution
p(α, β) over two stacked vectors α and β which together comprise the poses of the robot.
The density p(α, β) can be expressed in covariance form as
\[
p(\alpha, \beta) = \mathcal{N}\!\left( \begin{bmatrix} \mu_\alpha \\ \mu_\beta \end{bmatrix}, \begin{bmatrix} \Sigma_{\alpha\alpha} & \Sigma_{\alpha\beta} \\ \Sigma_{\beta\alpha} & \Sigma_{\beta\beta} \end{bmatrix} \right),
\tag{5.15}
\]
and in information form as
\[
p(\alpha, \beta) = \mathcal{N}^{-1}\!\left( \begin{bmatrix} \xi_\alpha \\ \xi_\beta \end{bmatrix}, \begin{bmatrix} \Omega_{\alpha\alpha} & \Omega_{\alpha\beta} \\ \Omega_{\beta\alpha} & \Omega_{\beta\beta} \end{bmatrix} \right),
\tag{5.16}
\]
where
\[
\begin{bmatrix} \xi_\alpha \\ \xi_\beta \end{bmatrix} = \begin{bmatrix} \Sigma_{\alpha\alpha} & \Sigma_{\alpha\beta} \\ \Sigma_{\beta\alpha} & \Sigma_{\beta\beta} \end{bmatrix}^{-1} \begin{bmatrix} \mu_\alpha \\ \mu_\beta \end{bmatrix}
\tag{5.17}
\]
and
\[
\begin{bmatrix} \Omega_{\alpha\alpha} & \Omega_{\alpha\beta} \\ \Omega_{\beta\alpha} & \Omega_{\beta\beta} \end{bmatrix} = \begin{bmatrix} \Sigma_{\alpha\alpha} & \Sigma_{\alpha\beta} \\ \Sigma_{\beta\alpha} & \Sigma_{\beta\beta} \end{bmatrix}^{-1}.
\tag{5.18}
\]
The main difference between the covariance form and the information form is the way
they represent indirect relations between variables in a probability distribution. Whereas
the covariance matrix encodes all indirect relations explicitly, the information matrix
only represents direct relations.
Marginalizing out β yields the marginal density
\[
p(\alpha) = \int p(\alpha, \beta) \, d\beta.
\tag{5.19}
\]
Marginalization in covariance form corresponds to extracting the appropriate sub-block from the covariance matrix and dropping the other variables. As a result, in covariance form, p(α) is given by
\[
p(\alpha) = \mathcal{N}(\mu_\alpha, \Sigma_{\alpha\alpha}).
\tag{5.20}
\]
The pose graph, however, represents the belief of the robot in information form. In information form, marginalization is more complex because it requires computing the Schur complement over the remaining variables. In information form, p(α) is given by
\[
p(\alpha) = \mathcal{N}^{-1}(\xi, \Omega),
\tag{5.21}
\]
Figure 5.3: Eliminating a pose from the belief of the robot. (a) Markov random field according
to the measurements, where the bold, orange node x9 is slated for removal. (b) The
corresponding information matrix. Gray shades indicate nonzero values, and white
indicates exactly zero. (c) Resulting Markov random field after marginalizing out
node x9 . The former neighbors of x9 form an elimination clique in the new graph.
(d) Marginalization causes the information matrix to become dense.
where
\[
\xi = \xi_\alpha - \Omega_{\alpha\beta} \Omega_{\beta\beta}^{-1} \xi_\beta
\tag{5.22}
\]
and
\[
\Omega = \Omega_{\alpha\alpha} - \Omega_{\alpha\beta} \Omega_{\beta\beta}^{-1} \Omega_{\beta\alpha}.
\tag{5.23}
\]
The Schur complement in Eq. (5.23) triggers interesting changes in the conditional independence structure of the probability distribution. All entries of the matrix outer product \(\Omega_{\alpha\beta} \Omega_{\beta\beta}^{-1} \Omega_{\beta\alpha}\) which are associated with variables directly related to β are nonzero. As a result, the information matrix Ω of the marginal density p(α) is filled in at the corresponding entries and becomes dense.
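The fill-in effect is easy to reproduce numerically. The following sketch (the names are ours) eliminates one variable from the information matrix of a small chain x1–x2–x3 via the Schur complement of Eq. (5.23); eliminating the middle variable couples its former neighbors:

```python
def marginalize(omega, k):
    """Schur complement that eliminates variable k from an information
    matrix given as a list of lists. The eliminated block is scalar
    here, which keeps the arithmetic elementary."""
    n = len(omega)
    idx = [i for i in range(n) if i != k]
    okk = omega[k][k]
    return [[omega[i][j] - omega[i][k] * omega[k][j] / okk
             for j in idx] for i in idx]
```

Before elimination, the (x1, x3) entry is exactly zero; after eliminating x2, it becomes nonzero, which is precisely the fill-in between the neighbors of the removed node.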
We can intuitively understand these changes in the dependency structure of the
robot’s belief by visual inspection of the associated Gaussian Markov random field.
Consider the example depicted in Figure 5.3. The belief of the robot is represented as
a Markov random field and its corresponding information matrix in Figure 5.3 (a) and
Figure 5.3 (b), respectively. Each measurement imposes an uncertain constraint between
the corresponding poses of the robot, represented by an edge in the graph, and by a
nonzero entry in the information matrix. The reader should take a moment to study the
correspondence between the graph and its information matrix.
To remove a node from the graph, we must change the structure of the graph such
that the resulting probability distribution is not altered with respect to the remaining
nodes. In the example, we remove node x9 . All pairs of nodes in the Markov blanket
of node x9 are indirectly coupled via node x9 . Hence, when node x9 is removed, these
indirect relationships vanish and must therefore be explicitly represented by direct edges.
Combining the information encoded in the edges can be done in a straightforward manner since the edges encode Gaussian constraints. Thus, concatenating two constraints with means \(\delta_{ji}\) and \(\delta_{kj}\) and information matrices \(\Omega_{ji}\) and \(\Omega_{kj}\) is done by
\[
\delta_{ki} = \delta_{kj} \oplus \delta_{ji}
\tag{5.24}
\]
\[
\Omega_{ki} = \big( \Omega_{ji}^{-1} + \Omega_{kj}^{-1} \big)^{-1}.
\tag{5.25}
\]
Similarly, merging two constraints with means \(\delta_{ji}^{(1)}\) and \(\delta_{ji}^{(2)}\) and information matrices \(\Omega_{ji}^{(1)}\) and \(\Omega_{ji}^{(2)}\) can be achieved by
\[
\delta_{ji} = \Omega_{ji}^{-1} \big( \Omega_{ji}^{(1)} \delta_{ji}^{(1)} + \Omega_{ji}^{(2)} \delta_{ji}^{(2)} \big)
\tag{5.26}
\]
\[
\Omega_{ji} = \Omega_{ji}^{(1)} + \Omega_{ji}^{(2)}.
\tag{5.27}
\]
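For scalar (1-D) constraints, Eqs. (5.24)–(5.27) reduce to elementary arithmetic; a sketch with names of ours, where the pose composition operator ⊕ becomes plain addition:

```python
def concatenate(d_ji, w_ji, d_kj, w_kj):
    """Chain two scalar constraints i->j and j->k (Eqs. 5.24-5.25);
    uncertainties add, so the information is the harmonic combination."""
    return d_ji + d_kj, 1.0 / (1.0 / w_ji + 1.0 / w_kj)

def merge_constraints(d1, w1, d2, w2):
    """Fuse two scalar constraints between the same node pair
    (Eqs. 5.26-5.27): an information-weighted mean, with the
    information values simply adding up."""
    w = w1 + w2
    return (w1 * d1 + w2 * d2) / w, w
```

Concatenation makes a chained constraint weaker than either link, while merging makes the fused constraint stronger than each of its parts.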
The belief of the robot after eliminating node x9 is illustrated in Figure 5.3 (c) and
Figure 5.3 (d). Note that the nodes which were adjacent to node x9 form an elimination
clique in the resulting graph. Likewise, the information matrix fills in and becomes dense.
Eliminating a pose variable from the belief of the robot has important ramifications
with respect to the independence structure of the belief. In particular, marginalization
destroys some of the conditional independencies by introducing new constraints between
all pairs of variables which are directly related to the eliminated variable. These new
constraints change the corresponding entries in the information matrix from potentially
zero to a nonzero value. We say that the information matrix fills in. Unfortunately, the
fill-in destroys the natural sparsity pattern of the information matrix. Representing a
dense matrix typically requires significantly more memory resources than representing
a sparse matrix. Furthermore, the density adversely affects the computational costs of
subsequent marginalizations and of the underlying pose graph optimization.
In summary, while it is possible to eliminate nodes from the pose graph, in the worst
case, the number of edges increases quadratically with the number of nodes. Hence,
marginalization ceases the sparsity of the pose graph and thus introduces a complexity
which is not suited for life-long map learning.
5.4 Approximate Marginalization
In the last section, we analyzed the changes in the independence structure of the robot’s
belief caused by node marginalization. In particular, we found that marginalization in
information form requires computing the Schur complement over the remaining variables.
The Schur complement introduces new constraints between all pairs of variables which
are directly related to the marginalized variables. As a result, these variables form an
elimination clique.
For simplicity of notation, let \(x = (x_1, \ldots, x_k)^T\) be the stacked vector of the set of variables in the elimination clique. Furthermore, let p(x) be the probability distribution which arises from the constraints within the clique. Finally, let Σ and Ω be the corresponding covariance and information matrix, respectively.
Within the elimination clique, every variable is conditioned upon all other variables. Hence, the number of constraints grows quadratically with the number of variables. As a result, the complexity becomes intractable as the connectivity of the graph, and thus the number of neighbors of the nodes to be eliminated, increases.
A solution to this problem is to approximate the probability distribution p(x) by a probability distribution q(x) which has fewer conditional dependencies. This implies treating some pairs of variables which are dependent upon each other as if they were independent of each other. We can use the Kullback-Leibler divergence (Kullback and Leibler, 1951) to measure the “quality” of an approximation.
Definition 5 The Kullback-Leibler divergence is a non-symmetric measure of the difference between two probability distributions p and q. It is defined as
D_KL(p, q) = ∫ p(x) log [ p(x) / q(x) ] dx .                        (5.28)
The Kullback-Leibler divergence is zero in case the distributions p and q are identical,
and strictly positive otherwise:
D_KL(p, q) = 0 if p = q, and D_KL(p, q) > 0 otherwise.              (5.29)
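As a small numerical sketch (ours, not part of the thesis), the defining properties of the Kullback-Leibler divergence can be checked directly for discrete distributions:

```python
import numpy as np

def kl_divergence(p, q):
    """D_KL(p, q) for discrete distributions given as probability vectors."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0.0                      # terms with p(x) = 0 contribute zero
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

p = [0.5, 0.3, 0.2]
q = [0.6, 0.3, 0.1]
assert kl_divergence(p, p) == 0.0       # identical distributions
assert kl_divergence(p, q) > 0.0        # strictly positive otherwise
# The divergence is non-symmetric in general:
print(kl_divergence(p, q), kl_divergence(q, p))
```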
We reduce the number of edges in the elimination clique by approximating the probability distribution p(x) using a probability distribution q(x) such that each variable is
conditioned upon only one of the other variables. Without loss of generality, we obtain
p(x) = p(x_k) ∏_{i=1}^{k−1} p(x_i | x_{i+1}, …, x_k)                (5.30)
     ≈ p(x_k) ∏_{i=1}^{k−1} p(x_i | x_{i+1}) = q(x).                (5.31)
Eq. (5.31) requires the graphical model of the probability distribution q(x) to be a tree. We
therefore say that such a probability distribution is of first-order tree dependence. There
are as many as k^(k−2) differently structured tree-shaped graphical models, and we have to
commit to one of them.
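The chain approximation of Eq. (5.31) can be illustrated numerically (our own sketch, not from the thesis) for an arbitrary joint distribution over three binary variables:

```python
import numpy as np

rng = np.random.default_rng(0)
# Arbitrary joint distribution p(x1, x2, x3) over three binary variables.
p = rng.random((2, 2, 2))
p /= p.sum()

# Marginals and conditionals needed by the chain approximation (5.31):
p3 = p.sum(axis=(0, 1))          # p(x3)
p23 = p.sum(axis=0)              # p(x2, x3)
p12 = p.sum(axis=2)              # p(x1, x2)
p2 = p12.sum(axis=0)             # p(x2)
p2_given_3 = p23 / p3            # p(x2 | x3)
p1_given_2 = p12 / p2            # p(x1 | x2)

# q(x1, x2, x3) = p(x3) p(x2 | x3) p(x1 | x2), cf. Eq. (5.31)
q = np.einsum('k,jk,ij->ijk', p3, p2_given_3, p1_given_2)

assert np.isclose(q.sum(), 1.0)  # q is a valid probability distribution
# q reproduces the pairwise marginal it was built from ...
assert np.allclose(q.sum(axis=0), p23)
# ... but in general differs from p, at a non-negative KL cost:
dkl = float(np.sum(p * np.log(p / q)))
print(dkl)
```

Dropping the dependence of x1 on x3 is exactly the kind of approximation the text describes; the resulting divergence quantifies the information lost.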
Fortunately, Chow and Liu (1968) presented an efficient algorithm to find the optimal
distribution among them. This optimal distribution is therefore also referred to as the
Chow-Liu tree approximation.
5.5 Chow-Liu Tree Approximation
The Chow-Liu algorithm (Chow and Liu, 1968) approximates a multidimensional probability distribution p(x) by a distribution q_opt(x) in which each variable is conditioned upon
at most one of the other variables such that the Kullback-Leibler divergence DKL (p, q) is
minimized.
Definition 6 The mutual information I of two random variables xi and xj is defined as
the average expected reduction in entropy of one of them upon learning the precise value
of the other. It is thus given by
I(x_i, x_j) = H(x_i) − H(x_i | x_j)                                 (5.32)
            = E[ log_2 ( p(x_i | x_j) / p(x_i) ) ].                 (5.33)
In case the random variables are continuous, their mutual information is given by
I(x_i, x_j) = ∫∫ p(x_i, x_j) log_2 [ p(x_i, x_j) / ( p(x_i) p(x_j) ) ] dx_i dx_j.   (5.34)
Given the probability distribution p(x), the Chow-Liu algorithm constructs a complete
mutual information graph G, where each edge joining xi and xj has weight equal to the
mutual information I(xi , xj ).
While Eq. (5.34) allows for computing the mutual information of two random variables
in arbitrary probability distributions, it requires integrating over all states of the two
random variables. Fortunately, the probability distribution p(x) is Gaussian and therefore
Figure 5.4: Building the Chow-Liu tree. (a) Graphical model of the probability density distribution p(x), where thickness of the edges corresponds to magnitude of mutual
information. Additionally, the mutual information values of particular interest
are written next to the edges. (b) – (f) The Chow-Liu algorithm computes the
maximum-weight spanning tree of the mutual information graph G using Kruskal’s
algorithm. Intuitively, the Chow-Liu algorithm discards those dependencies which
have as little mutual information as possible. (f) Chow-Liu tree approximation of
p(x).
the mutual information of two variables xi and xj can be computed efficiently. According
to Davison (2005), in this case, it is given by
I(x_i, x_j) = (1/2) log_2 ( |Σ_{x_i x_i}| / |Σ_{x_i x_i} − Σ_{x_i x_j} Σ_{x_j x_j}^{−1} Σ_{x_j x_i}| ),   (5.35)
where Σxi xj refers to the entry in the covariance matrix Σ of probability distribution p(x)
which relates xi and xj . The pose graph, however, encodes the probability distribution in
information form. Hence, we have to perform a matrix inversion to recover Σ. Since the
approximate graph sparsification performed by our approach ensures a low connectivity
of the graph, the probability distribution p(x) is typically low dimensional, independent
Figure 5.5: Approximate marginalization. (a) Possible graph resulting from Chow-Liu tree
approximation of the clique in the graph shown in Figure 5.3. (b) The corresponding
information matrix remains sparse.
of the size of the graph. As a result, the matrix inversion can be carried out in essentially
constant time.
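For the scalar case, Eq. (5.35) can be sketched as follows (our own illustration; the function name is ours, and the covariance is recovered from the information matrix by inversion, as described above):

```python
import numpy as np

def gaussian_mutual_information(omega, i, j):
    """I(x_i, x_j) of a Gaussian given in information form, cf. Eq. (5.35),
    for scalar variables i and j."""
    sigma = np.linalg.inv(omega)          # recover the covariance matrix
    s_ii, s_jj = sigma[i, i], sigma[j, j]
    s_ij = sigma[i, j]
    # Conditional variance of x_i given x_j (scalar Schur complement):
    cond = s_ii - s_ij * s_ij / s_jj
    return 0.5 * np.log2(s_ii / cond)

# Independent variables carry zero mutual information:
omega_diag = np.diag([2.0, 3.0, 4.0])
print(gaussian_mutual_information(omega_diag, 0, 1))

# A chain-structured information matrix couples adjacent variables:
omega_chain = np.array([[ 2.0, -1.0,  0.0],
                        [-1.0,  2.0, -1.0],
                        [ 0.0, -1.0,  2.0]])
print(gaussian_mutual_information(omega_chain, 0, 1))
```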
Chow and Liu proved that the probability distribution which arises from the maximum-weight spanning tree of the mutual information graph G is the optimal approximation of
tree dependence. We can compute the maximum-weight spanning tree using Kruskal’s
algorithm (Kruskal Jr, 1956). Figure 5.4 visualizes the computation of the Chow-Liu
tree. In short, in each step, the algorithm adds the edge of maximum mutual information
to the tree which does not cause the resulting graph to become cyclic. Figure 5.5 shows
a possible Chow-Liu tree approximation of the elimination clique that arose in the example
depicted in Figure 5.3.
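The greedy construction can be sketched as follows (our own minimal Python illustration; the edge weights are hypothetical, loosely modeled on the mutual information values shown in Figure 5.4):

```python
def max_spanning_tree(num_nodes, weighted_edges):
    """Kruskal's algorithm for the maximum-weight spanning tree: greedily add
    the edge of highest weight (here, mutual information) that does not close
    a cycle, tracked via a union-find structure."""
    parent = list(range(num_nodes))

    def find(a):                          # union-find with path compression
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    tree = []
    for w, a, b in sorted(weighted_edges, reverse=True):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:              # adding (a, b) keeps the graph acyclic
            parent[root_a] = root_b
            tree.append((a, b))
    return tree

# Hypothetical mutual information graph over six nodes, (weight, node, node):
edges = [(6, 0, 1), (7, 0, 5), (8, 1, 5), (5, 2, 4), (10, 3, 4), (9, 3, 5)]
tree = max_spanning_tree(6, edges)
print(tree)   # a spanning tree of six nodes always has five edges
```

Note how the weight-6 edge (0, 1) is discarded: by the time it is considered, its endpoints are already connected through higher-weight edges, so keeping it would create a cycle.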
To summarize, we use Chow-Liu trees to sparsify the elimination cliques which emerge
when we remove a node from the pose graph. Our method locally approximates these
elimination cliques with optimal, tree-shaped distributions. Furthermore, our technique
is guaranteed to not split up the pose graph into several disconnected subgraphs. Finally,
our approach only locally restricts the pose graph to be a tree. Therefore, our technique
preserves the global topology of the pose graph and, in particular, vital global loop
closures.
6 Experimental Evaluation
We carried out a series of experiments to analyze our graph-based approach to simultaneous localization and mapping. We therefore implemented the presented method based
on the SLAM back-end described in Grisetti et al. (2007a) and a re-implementation of
the SLAM front-end described in Olson (2008). We carried out several experiments at
the University of Freiburg using a real ActivMedia Pioneer-2 robot. The mobile robot
was equipped with a SICK laser range finder to perceive the environment. We used the
Carnegie Mellon Robot Navigation Toolkit (Carmen) to control the robot. In addition to
the experiments carried out at the University of Freiburg, we evaluated the performance of
our approach based on numerous datasets frequently used as benchmarks in the robotics
community. These datasets contain laser data and odometry measurements recorded by
other real robots as they were moving through buildings with interesting structure. We
evaluate our graph sparsification technique by comparing the performance of our SLAM
approach when using the presented graph sparsification to the performance when the
graph sparsification is disabled. For simplicity, we refer to the approach without graph
pruning as the “standard approach”.
The experiments are designed to show that our approach to informed graph pruning
is well suited for life-long mobile robot mapping and standard SLAM problems. In
Section 6.1, we show that our graph sparsification technique significantly reduces the
computational requirements of graph-based SLAM approaches, particularly in cases in
which the robot frequently re-traverses previously visited terrain. The speed-up is achieved
by removing redundant information from the belief of the robot. As a consequence, the
graph stays small which allows for efficient optimization and fast, reliable loop closing.
In Section 6.2, we show that our approximate marginalization technique preserves the
sparsity of the pose graph. In Section 6.3, we highlight the positive influence of our
pruning technique on the resulting grid maps. Since our approach discards observations
which do not align well with the map of the environment, it produces crisp grid maps
which exhibit less blur, compared to standard grid-based mapping approaches. Finally, in
Section 6.4, we evaluate our graph sparsification approach on several standard datasets
often used in the robotics community to compare SLAM algorithms.
6.1 Runtime and Memory Requirements
In this section, we analyze the runtime of our approach and its memory requirements in
terms of the size of the resulting pose graph. We show that the computational overhead
[Plots: time per observation [s], number of nodes, and number of edges, each versus observation index, for the standard approach and our approach.]
Figure 6.1: Typical results of an experiment exhibiting a large number of re-traversals in
which the robot traveled through already visited terrain for an extended period of
time. The results clearly show that the behavior leads to an explosion in terms of
computing and memory requirements when using the standard approach (blue). In
contrast to that, our approach (red) keeps the number of nodes and edges in the
graph low and is highly efficient in this way.
of our approach is only marginal, leading to significantly reduced total processing times.
In a first experiment, we exposed both mapping approaches to extreme conditions. We
therefore made the robot travel back and forth 100 times in the corridor of our lab
environment. The corridor is approximately 20 meters long and 3 meters wide.
Figure 6.1 shows three plots illustrating the results of the experiment. The plots clearly
indicate that the behavior pattern executed by the robot leads to an explosion in terms of
computing and memory requirements when using the standard approach. The slowdown
is mainly due to the loop closing component of the SLAM front-end. As discussed in
Section 4.3, the scan matcher endeavors to find constraints between the current scan and
all former scans which were recorded in the vicinity of the robot. Since the standard
approach constantly adds new nodes to the graph, the set of scans for which feature
matching seems likely grows without bound during the experiment. Consequently, the
robot has to perform scan matching against an increasing number of scans. This is
disadvantageous since the process of scan matching is typically slow and prone to errors.
Additionally, the computational burden increases due to the underlying optimization
framework. The results of the experiment clearly show that the number of nodes and
edges in the graph rapidly grows when using the standard approach. As the size of
the graph grows, pose graph optimization becomes increasingly more challenging and
thus more time-consuming. Furthermore, the large number of nodes and edges in the
graph requires huge amounts of memory. All this is problematic since mobile robots are
typically equipped with rather limited computing devices. In contrast to the standard
approach, our sparsification technique removes those nodes from the graph which do
not reduce the uncertainty in the belief of the robot. As a result, the robot prunes the
[Plots: time per observation [s], number of nodes, and number of edges, each versus observation index, for the standard approach and our approach.]
Figure 6.2: Results of an experiment in which the robot moved around in an office environment
for an extended period of time. The results show that the computational and
memory complexity of the standard approach (blue) increases without bound during
the experiment. Our approach (red) is highly efficient since it keeps the number
of nodes and edges in the graph low. The standard approach required a total of
2 hours and 9 minutes to process the dataset, whereas our approach processed the
entire dataset in 4 minutes and 32 seconds.
pose graph such that the number of nodes in the graph remains more or less constant
as long as the robot does not explore new territory. Furthermore, our approximate
marginalization technique ensures that the set of edges in the graph remains small, and
that only informative constraints remain in the graph. As a result, our graph pruning
technique avoids both the runtime and the memory explosion altogether.
Note that the visible fluctuations in the time needed to incorporate an observation
have two main reasons. First, our implementation of the mapping system is optimized for
speed such that it suppresses certain computations in cases in which the graph changes
only marginally. Second, our loop closing component does not perform feature matching
against all observations. It rather restricts the set of observations to those for which
feature matching seems likely. As a consequence, some observations are incorporated
significantly faster than others.
We carried out a second experiment to further evaluate the effects of our graph pruning
technique. The second experiment exhibits typical conditions which a robot faces when
carrying out a longer mission in a closed office environment. In this experiment, the
robot moved around in our lab environment for an extended period of time. It thereby
visited the individual rooms and the corridor many times. The robot was constantly
moving for almost an hour.
Figure 6.2 illustrates the results of this experiment. The results leave no doubt that the
standard approach is not equipped for such a task. The time required by the standard
approach to incorporate an observation increases throughout the entire mission, leading
to a total of 2 hours and 9 minutes to process the dataset. Our approach was already
[Plots: time per observation [s], nodes, and edges versus observation index.]
Figure 6.3: Intel Research lab dataset. Performance of the standard approach (blue) versus our
approach (red).
[Plots: time per observation [s], nodes, and edges versus observation index.]
Figure 6.4: FHW Museum dataset. Performance of the standard approach (blue) versus our
approach (red).
done after 4 minutes and 32 seconds. To allow for useful navigation tasks, the robot has
to be capable of incorporating measurements in real-time. The experiments show that the
average time needed by the standard approach to incorporate an observation increases
without bound. In contrast to the standard approach, our graph sparsification technique
ensures that the average time to incorporate measurements remains low throughout the
entire mission.
We furthermore evaluated the runtime and the memory requirements of our approach on
standard benchmark datasets. Figure 6.3 and Figure 6.4 show the results obtained when
processing the Intel Research Lab dataset and the FHW Museum dataset, respectively.
The plots indicate results similar to the results of the experiments mentioned above. The
experiments clearly illustrate that our approach leads to a significant performance gain
even when processing standard datasets which are not explicitly designed for testing
long-term simultaneous localization and mapping.
[Plots: number of edges and gamma index (γ) versus observation index, for exact marginalization and our approach.]
Figure 6.5: Exact marginalization versus our approximate marginalization technique when
processing the Intel Research Lab dataset. Node elimination by means of exact
marginalization adds a large number of edges to the graph. In contrast to that, our
approach ensures that the number of edges in the graph stays small. The gamma
index (γ) of a graph is defined as the ratio between the number of existing edges
in the graph and the maximum number of possible edges in the graph. Exact
marginalization yields a high gamma index, indicating that the graph is dense. Our
approach leads to a small and more or less constant gamma index, indicating that
the graph is sparse. A sparse graph is highly advantageous since it can be stored
and optimized much more efficiently than a dense graph.
6.2 Approximate Marginalization
In Section 5.3, we asserted that eliminating nodes from the belief of the robot by means
of exact marginalization causes the pose graph to become dense. In this section, we
show that our approximate marginalization technique preserves the sparsity of the pose
graph. A sparse pose graph is highly advantageous since it can be stored and optimized
efficiently.
To evaluate how densely connected a pruned pose graph is in practice, we use the
so-called gamma index (Garrison and Marble, 1965). The gamma index (γ) of a graph is
defined as the ratio between the number of existing edges in the graph and the maximum
number of possible edges in the graph. For undirected graphs, it is given by
γ = e / ( (1/2) v (v − 1) ),                                        (6.1)
where e is the number of edges and v is the number of nodes in the graph. The gamma
index varies from 0 to 1, where 0 means that the nodes are not connected at all, and 1
means that the graph is complete.
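A minimal implementation of Eq. (6.1) (ours, for illustration):

```python
def gamma_index(num_edges, num_nodes):
    """Gamma index of an undirected graph, cf. Eq. (6.1)."""
    return num_edges / (0.5 * num_nodes * (num_nodes - 1))

assert gamma_index(0, 5) == 0.0      # no edges at all
assert gamma_index(10, 5) == 1.0     # complete graph on five nodes
assert gamma_index(4, 5) == 0.4      # sparse graph, e.g. a tree on five nodes
```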
Figure 6.5 and Figure 6.6 show a comparison of the graphs resulting from exact
marginalization and our approximate marginalization technique when processing the
Intel Research lab dataset. The figures show that eliminating nodes from the pose graph
(a) Exact marginalization.
(b) Our approach.
Figure 6.6: Final pose graphs resulting from exact marginalization and our approximate
marginalization technique when processing the Intel Research Lab dataset. Eliminating nodes from the pose graph by means of exact marginalization causes the
pose graph to become dense. Our approach preserves the sparsity of the pose graph
by using Chow-Liu trees to locally approximate the elimination cliques which result
from marginalization. As a consequence, only the most informative constraints
remain in the graph. In this experiment, the final pose graph which resulted from
exact marginalization possesses 13052 edges, whereas the pose graph produced by
our approach only has 559 edges. Nevertheless, our approach produced a high
quality grid map.
by means of exact marginalization causes the pose graph to become dense. Our approach
preserves the sparsity of the pose graph by using Chow-Liu trees to locally approximate
the elimination cliques which result from marginalization.
6.3 Improved Grid Map Quality
We furthermore evaluated the effects of our information-driven graph pruning technique
on the quality of the resulting grid maps. In feature-based approaches, the correlations
between landmark estimates increase monotonically with the number of observations
(Durrant-Whyte and Bailey, 2006). Grid maps, however, have a disadvantage when
it comes to life-long map learning. Whenever the robot obtains a measurement, the
scan matcher aims to align the new scan against existing scans in order to solve the
data association problem. The probability that the scan matcher thereby makes a small
(a) Standard approach.
(b) Our approach.
Figure 6.7: In this experiment, the robot moved back and forth 100 times in a corridor. (a) Standard grid-based mapping approaches tend to produce thick and blurred walls due to
accumulating scan matching errors. (b) Our graph sparsification approach discards
scans which do not align well with existing scans and thus produces crisp maps
even in cases of many re-traversals.
alignment error is nonzero. A scan which is incorporated at a slightly wrong position
blurs the map. As a result, the probability that the scan matcher misaligns subsequent
scans increases since scan matching is performed against misaligned scans. Hence, the
probability of making alignment errors increases with the number of incorporated scans.
In the long run, the map tends to become increasingly blurred. In the worst case, the
mapping approach diverges. The problem is inherent in grid maps due to the discretization
of space. Improving the scan matcher can only mitigate the detrimental effects of the
discretization, but cannot avoid that the alignment errors accumulate as more scans are
incorporated.
Figure 6.7 and Figure 6.8 show the maps corresponding to the first two experiments
discussed in Section 6.1. The grid maps generated by the standard approach exhibit
clearly visible blur and artificially thick walls. In general, the more often the robot
re-traverses already visited terrain, the more blur is added to the map. As opposed to
the standard approach, our graph pruning approach discards scans which do not align
well with existing scans and thus produces crisp maps even in cases in which the robot
(a) Standard approach.
(b) Our approach.
Figure 6.8: In this experiment, the robot moved around in an office environment at the University
of Freiburg for an extended period of time. The robot was constantly moving for
almost an hour thereby visiting the individual rooms and the corridor many times.
Our method removed most of the nodes, yielding a graph which only contains highly
informative edges and nodes.
frequently re-traverses already visited places. Figure 6.8 shows the graphs resulting from
the two methods. The graph produced by the standard approach is dense and contains a
large number of nodes, whereas the graph generated by our method is sparse and only
consists of a fraction of the nodes and only highly informative links.
(a) Standard approach.
(b) Our approach.
Figure 6.9: Intel Research Lab dataset.
6.4 Standard Datasets
We furthermore evaluated our graph sparsification approach on several standard datasets.
These datasets are often used in the robotics community to compare algorithms which
address the simultaneous localization and mapping problem.
6.4.1 Intel Research Lab
Since Dirk Hähnel recorded the Intel Research Lab dataset, it has become a popular
benchmark dataset frequently used in the community. The Intel Research Lab provides
an interesting environment due to its circular structure. The robot travels multiple times
around the circular corridor repeatedly visiting the individual rooms. The behavior
pattern allows for closing a large number of loops.
Figure 6.9 shows the results obtained from the Intel Research Lab dataset. The map
generated by our approach exhibits areas with significantly less blur compared to the
map produced by the standard approach. Some of these areas are highlighted by red
arrows in the figures. Our algorithm thereby produced the map by only maintaining 29 %
of the scans obtained by the robot. Furthermore, our approach was significantly faster
than the standard approach. The standard approach required a total of 537 seconds to
process the dataset, whereas the processing time of our approach was only 314 seconds.
Figure 6.10: Killian Court dataset.
6.4.2 Killian Court
The MIT Killian Court infinite corridor dataset, first recorded by Bosse et al. (2004),
is a challenging benchmark for SLAM approaches. It is a large dataset with a path
length of 1450 meters and two hours of robot travel. The dataset is interesting for the
work presented in this thesis because it poses a challenge to the loop closing capabilities
due to several long loops. Closing long loops is difficult since the dead reckoning error
accumulates along the path, and, as a result, the number of scans for which feature
matching seems likely grows. The approach is likely to generate an inconsistent map in
cases in which the scan matcher fails to close the loops. Hence, this experiment serves
well to analyze the impact of our approach on the performance of the scan matcher.
Figure 6.10 depicts the map generated by our approach. Although it prunes the pose
graph such that the environment is represented by a small number of scans, the scan
matcher still managed to close all loops, leading to a consistent map.
(a) Standard approach.
(b) Our approach.
Figure 6.11: FHW Museum dataset.
6.4.3 FHW Museum
The FHW Museum dataset, recorded by Dirk Hähnel, is reasonably large, spanning
almost two hours of robot driving. The dataset is interesting for the work presented
in this thesis because the robot spends most of the time re-traversing already visited
terrain.
Figure 6.11 depicts the maps and the graphs obtained from the FHW Museum dataset.
The figures show that the graph output by our method contains far fewer nodes. Moreover, although based on only 21 % of the scans, the map generated by our approach
is of higher quality than the map produced by the standard approach. Furthermore, our
approach led to a significantly reduced runtime: the standard approach required a total of
32 minutes and 45 seconds to process the dataset, whereas our approach processed it in
only 5 minutes and 19 seconds.
7 Discussion
7.1 Conclusion
In this thesis, we presented a novel approach for mobile robots using the graph-based
formulation of the simultaneous localization and mapping problem to allow for life-long
map learning using grid maps. The key idea of our approach is to remove redundant
information from the graph such that its size stays constant as long as the robot moves
through previously visited areas. Hence, in contrast to traditional graph-based approaches,
the complexity of our new approach does not scale with the length of the robot’s trajectory,
but with the size of the mapped area. As a result, a major bottleneck in the mapping
component of current navigation systems vanishes, and long-term applications of mobile
robots become feasible.
We use grid maps to model the environment since they do not require pre-defined
feature extractors. Grid maps, however, suffer from accumulating discretization errors,
which is why standard grid-based mapping approaches tend to produce blurred maps.
We use the concept of information entropy to explicitly measure the information gain of
observations. In this way, we are able to identify redundant observations which do not
reduce the uncertainty in the map of the environment. Discarding redundant observations
from the belief of the robot has two major advantages. First, it allows for more sustainable
use of limited resources. Second, discarding unnecessary and slightly misaligned scans
increases the quality of the maps.
Removing observations from the graph while changing the belief of the robot as
little as possible, however, is challenging. An exact elimination technique introduces a
large number of new constraints into the belief of the robot due to the marginalization.
The resulting graph becomes dense and prevents the robot from efficiently storing and
processing data. Therefore, preserving sparsity is a key prerequisite for life-long map
learning.
We introduced an efficient approach to removing nodes from the graph which fulfills this
requirement while performing as few approximations as possible. Our technique locally
approximates the constraint network using optimal, tree-shaped probability distributions,
which are computed by means of Chow-Liu trees. As a result, our method preserves
sparsity while minimizing the loss of information encoded in the constraints.
We implemented our approach and illustrated its efficacy in a series of experiments
with a real robot. We furthermore considered standard benchmark datasets widely used
in the research community. The key improvements of our new approach are significantly
reduced computational requirements and less blur in the grid maps. The results of
the experimental evaluation emphasize that the approach presented in this thesis is an
important step to allow for life-long map learning in mobile robotics.
7.2 Future Work
Despite the encouraging results presented in this thesis, there is still room for improvement. Although grid maps are widely used, they have disadvantages. For simplicity,
grid maps assume that the individual grid cells are independent of each other. As a
result, correlations between grid cells are neglected in the computation of the information
entropy. Even though this is a common assumption with known problems, no promising
approach to efficiently considering these dependencies has been presented. We expect
that sound consideration of such spatial dependencies will lead to a less approximate
computation of the uncertainty and thus to an improvement of our node reduction
approach.
Furthermore, while our approximate marginalization technique is directly applicable
to any constraint network, the presented method to compute the information gain of
observations is rather specific to 2D-mapping scenarios using laser scanners. Further
research could transfer the concept to other sensing modalities and other application
scenarios.
It would be interesting to explore how to extend our approach towards highly dynamic
environments. Although other researchers are investigating this topic, it is still unclear
how to robustly cope with moving objects and environments that significantly change
over time.
Related to that, we plan on investigating possible representations of dynamic aspects
of the environment in order to provide useful information for mobile robot navigation.
List of Figures
3.1  Robot motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.2  Robot perception . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.3  An occupancy grid map . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.4  Mapping with known poses . . . . . . . . . . . . . . . . . . . . . . . . 28
3.5  Drawings of graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

4.1  The SLAM problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.2  A pose graph . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
4.3  A pose/feature graph . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
4.4  Pose graph optimization . . . . . . . . . . . . . . . . . . . . . . . . . 43

5.1  Binary entropy function . . . . . . . . . . . . . . . . . . . . . . . . . . 46
5.2  Conditional independence properties in a Markov random field . . . . . 49
5.3  Eliminating a pose from the belief of the robot . . . . . . . . . . . . . . 51
5.4  Chow-Liu tree approximation . . . . . . . . . . . . . . . . . . . . . . . 55
5.5  Approximate marginalization . . . . . . . . . . . . . . . . . . . . . . . 56

6.1  Experiment with many re-traversals: plots . . . . . . . . . . . . . . . . 58
6.2  Experiment in an office environment: plots . . . . . . . . . . . . . . . . 59
6.3  Intel Research lab dataset: plots . . . . . . . . . . . . . . . . . . . . . 60
6.4  FHW Museum dataset: plots . . . . . . . . . . . . . . . . . . . . . . . 60
6.5  Exact marginalization versus approximate marginalization: plots . . . . 61
6.6  Exact marginalization versus approximate marginalization: graphs . . . 62
6.7  Experiment with many re-traversals: maps and graphs . . . . . . . . . 63
6.8  Experiment in an office environment: maps and graphs . . . . . . . . . 64
6.9  Intel Research Lab dataset: maps and graphs . . . . . . . . . . . . . . 65
6.10 Killian Court dataset: map . . . . . . . . . . . . . . . . . . . . . . . . 66
6.11 FHW Museum dataset: maps and graphs . . . . . . . . . . . . . . . . . 67
Bibliography
T. Bailey, J. Nieto, J. Guivant, M. Stevens, and E. Nebot. Consistency of the EKF-SLAM
Algorithm. In Proc. of the IEEE/RSJ International Conference on Intelligent Robots
and Systems, pages 3562–3568, Beijing, China, 2006.
M. Bosse, P. Newman, J. Leonard, and S. Teller. Simultaneous Localization and Map
Building in Large-Scale Cyclic Environments Using the Atlas Framework. Int. Journal
of Robotics Research, 23(12):1113, 2004.
A. Censi. Scan Matching in a Probabilistic Framework. In Proc. of the IEEE Int. Conf. on
Robotics & Automation (ICRA), pages 2291–2296, Orlando, Florida, 2006.
P. Cheeseman and P. Smith. On the Representation and Estimation of Spatial Uncertainty.
Int. Journal of Robotics Research, 5:56–68, 1986. ISSN 0278-3649.
M. Chli and A. J. Davison. Automatically and Efficiently Inferring the Hierarchical
Structure of Visual Maps. In Proc. of the IEEE Int. Conf. on Robotics & Automation
(ICRA), Kobe, Japan, June 2009.
C. Chow and C. Liu. Approximating Discrete Probability Distributions with Dependence
Trees. IEEE Transactions on Information Theory, 14(3):462–467, 1968.
M. Cummins and P. Newman. FAB-MAP: Probabilistic Localization and Mapping in
the Space of Appearance. Int. Journal of Robotics Research, 27(6):647–665, 2008.
A. J. Davison. Active Search for Real-Time Vision. In Proc. of the Int. Conf. on Computer
Vision (ICCV), volume 1, 2005.
F. Dellaert and M. Kaess. Square Root SAM: Simultaneous Localization and Mapping via
Square Root Information Smoothing. Int. Journal of Robotics Research, 25(12):1181–1204,
December 2006.
F. Dellaert, D. Fox, W. Burgard, and S. Thrun. Monte Carlo Localization for Mobile
Robots. In Proc. of the IEEE Int. Conf. on Robotics & Automation (ICRA), pages
1322–1328, Leuven, Belgium, May 1999.
R. Diestel. Graph Theory. Springer Verlag, third edition, August 2005. ISBN 3540261826.
E. W. Dijkstra. A Note on Two Problems in Connexion With Graphs. Numerische
Mathematik, 1(1):269–271, 1959.
A. Doucet. On Sequential Simulation-Based Methods for Bayesian Filtering. Technical
report, Signal Processing Group, Department of Engineering, University of Cambridge, 1998.
A. Doucet, N. de Freitas, K. Murphy, and S. Russell. Rao-Blackwellized Particle Filtering
for Dynamic Bayesian Networks. In Proc. of the Conf. on Uncertainty in Artificial
Intelligence (UAI), pages 176–183, Stanford, CA, USA, 2000.
A. Doucet, N. de Freitas, and N. Gordon, editors. Sequential Monte Carlo Methods in
Practice. Springer Verlag, 2001.
T. Duckett, S. Marsland, and J. Shapiro. Learning Globally Consistent Maps by Relaxation. In Proc. of the IEEE Int. Conf. on Robotics & Automation (ICRA), 2000.
T. Duckett, S. Marsland, and J. Shapiro. Fast, On-line Learning of Globally Consistent
Maps. Autonomous Robots, 12(3):287 – 300, 2002.
H. Durrant-Whyte. Uncertain Geometry in Robotics. In Proc. of the IEEE Conf. on
Robotics and Automation, 1987.
H. Durrant-Whyte and T. Bailey. Simultaneous Localization and Mapping (SLAM): Part
I: The Essential Algorithms. IEEE Robotics & Automation Magazine, 13(2):99–110,
2006.
A. Eliazar and R. Parr. DP-SLAM: Fast, Robust Simultaneous Localization and Mapping
Without Predetermined Landmarks. In Proc. of the Int. Conf. on Artificial Intelligence
(IJCAI), pages 1135–1142, Acapulco, Mexico, 2003.
R. Eustice, H. Singh, and J. Leonard. Exactly Sparse Delayed-State Filters. In Proc. of
the IEEE Int. Conf. on Robotics & Automation (ICRA), pages 2428–2435, Barcelona,
Spain, April 2005a.
R. Eustice, M. Walter, and J. Leonard. Sparse Extended Information Filters: Insights
into Sparsification. In Proc. of the IEEE/RSJ Int. Conf. on Intelligent Robots and
Systems (IROS), pages 641–648, Edmonton, Canada, 2005b.
R. Eustice, H. Singh, and J. Leonard. Exactly Sparse Delayed-State Filters for View-Based
SLAM. IEEE Transactions on Robotics, 22(6):1100–1114, 2006.
J. Folkesson and H. Christensen. Graphical SLAM - A Self-Correcting Map. In Proc. of
the IEEE Int. Conf. on Robotics & Automation (ICRA), Orlando, FL, USA, 2004a.
J. Folkesson and H. Christensen. Robust SLAM. In Proc. of the IFAC Symposium on
Intelligent Autonomous Vehicles, 2004b.
J. Folkesson, P. Jensfelt, and H. Christensen. Vision SLAM in the Measurement Subspace.
In Proc. of the IEEE Int. Conf. on Robotics & Automation (ICRA), pages 325–330,
April 2005.
U. Frese. A Proof for the Approximate Sparsity of SLAM Information Matrices. In Proc.
of the IEEE Int. Conf. on Robotics and Automation, pages 331–337, 2005.
U. Frese and G. Hirzinger. Simultaneous Localization and Mapping - A Discussion. In
Proceedings of the IJCAI Workshop on Reasoning with Uncertainty in Robotics, Seattle,
2001.
U. Frese, P. Larsson, and T. Duckett. A Multilevel Relaxation Algorithm for Simultaneous
Localization and Mapping. IEEE Transactions on Robotics, 21(2):196–207, 2005.
W. L. Garrison and D. F. Marble. A Prolegomenon to the Forecasting of Transportation
Development, 1965.
M. Golfarelli, D. Maio, and S. Rizzi. Elastic Correction of Dead-reckoning Errors in
Map Building. In Proc. of the IEEE/RSJ Int. Conf. on Intelligent Robots and Systems,
pages 905–911, 1998.
G. Grisetti, C. Stachniss, and W. Burgard. Improving Grid-based SLAM with Rao-Blackwellized Particle Filters by Adaptive Proposals and Selective Resampling. In
Proc. of the IEEE Int. Conf. on Robotics & Automation (ICRA), pages 2443–2448,
Barcelona, Spain, April 2005.
G. Grisetti, G. D. Tipaldi, C. Stachniss, W. Burgard, and D. Nardi. Speeding-Up
Rao-Blackwellized SLAM. In Proc. of the IEEE Int. Conf. on Robotics & Automation
(ICRA), pages 442–447, Orlando, FL, USA, 2006.
G. Grisetti, C. Stachniss, S. Grzonka, and W. Burgard. A Tree Parameterization for
Efficiently Computing Maximum Likelihood Maps using Gradient Descent. In Proc. of
Robotics: Science and Systems (RSS), Atlanta, GA, USA, 2007a.
G. Grisetti, G. D. Tipaldi, C. Stachniss, W. Burgard, and D. Nardi. Fast and Accurate
SLAM with Rao-Blackwellized Particle Filters. Journal of Robotics & Autonomous
Systems, 55(1):30–38, 2007b.
D. Hähnel. Mapping with Mobile Robots. PhD thesis, University of Freiburg, Department
of Computer Science, 2005.
D. Hähnel, W. Burgard, D. Fox, and S. Thrun. An Efficient FastSLAM Algorithm
for Generating Maps of Large-Scale Cyclic Environments from Raw Laser Range
Measurements. In Proc. of the IEEE/RSJ Int. Conf. on Intelligent Robots and Systems
(IROS), pages 206–211, Las Vegas, NV, USA, 2003.
M. Jordan, editor. Learning in Graphical Models. Kluwer Academic Publishers, 1998.
S. Julier and J. Uhlmann. A Counter Example to the Theory of Simultaneous Localization
and Map Building. In Proc. of the IEEE Int. Conf. on Robotics and Automation, pages
4238–4243, 2001.
S. Julier, J. Uhlmann, and H. Durrant-Whyte. A New Approach for Filtering Nonlinear
Systems. In Proc. of the American Control Conference, pages 1628–1632, Seattle, WA,
USA, 1995.
M. Kaess, A. Ranganathan, and F. Dellaert. iSAM: Fast Incremental Smoothing and
Mapping with Efficient Data Association. In Proc. of the IEEE Int. Conf. on Robotics
& Automation (ICRA), Rome, Italy, 2007.
M. Kaess, A. Ranganathan, and F. Dellaert. iSAM: Incremental Smoothing and Mapping.
IEEE Trans. on Robotics, 24(6):1365–1378, December 2008.
R. Kalman. A New Approach to Linear Filtering and Prediction Problems. Transactions of the ASME – Journal of Basic Engineering, 82:35–45, March 1960.
R. Kindermann and J. L. Snell. Markov Random Fields and Their Applications. American
Mathematical Society, 1980.
K. Konolige. Large-Scale Map-Making. In Proc. of the National Conference on Artificial
Intelligence (AAAI), pages 457–463, 2004.
K. Konolige and M. Agrawal. FrameSLAM: From Bundle Adjustment to Real-Time Visual
Mapping. IEEE Transactions on Robotics, 24(5):1066–1077, 2008.
H. Kretzschmar, G. Grisetti, and C. Stachniss. Life-long Map Learning for Graph-based
Simultaneous Localization and Mapping. Zeitschrift KI – Künstliche Intelligenz, 2009.
Accepted for publication.
J. B. Kruskal Jr. On the Shortest Spanning Subtree of a Graph and the Traveling
Salesman Problem. Proceedings of the American Mathematical Society, pages 48–50,
1956.
S. Kullback and R. A. Leibler. On Information and Sufficiency. The Annals of Mathematical Statistics, pages 79–86, 1951.
J. Leonard, H. Durrant-Whyte, and I. Cox. Dynamic Map Building for an Autonomous
Mobile Robot. Int. Journal of Robotics Research, 11(4):286–298, 1992.
F. Lu and E. Milios. Globally Consistent Range Scan Alignment for Environment
Mapping. Autonomous Robots, 4:333–349, 1997.
P. S. Maybeck. Stochastic Models, Estimation, and Control, volume 141 of Mathematics
in Science and Engineering. Academic Press, 1979.
P. McLauchlan and D. Murray. A Unifying Framework for Structure and Motion Recovery
from Image Sequences. In Proc. of the Int. Conf. on Computer Vision (ICCV), pages
314–320, Boston, MA, USA, 1995.
M. Montemerlo and S. Thrun. Simultaneous Localization and Mapping with Unknown
Data Association using FastSLAM. In Proc. of the IEEE Int. Conf. on Robotics &
Automation (ICRA), pages 1985–1991, Taipei, Taiwan, 2003.
M. Montemerlo and S. Thrun. Large-scale Robotic 3-D Mapping of Urban Structures.
In Proc. of the Int. Symposium on Experimental Robotics (ISER), 2004.
M. Montemerlo, S. Thrun, D. Koller, and B. Wegbreit. FastSLAM: A Factored Solution
to Simultaneous Localization and Mapping. In Proc. of the National Conference on
Artificial Intelligence (AAAI), pages 593–598, Edmonton, Canada, 2002.
M. Montemerlo, S. Thrun, D. Koller, and B. Wegbreit. FastSLAM 2.0: An Improved
Particle Filtering Algorithm for Simultaneous Localization and Mapping that Provably
Converges. In Proc. of the Int. Conf. on Artificial Intelligence (IJCAI), pages 1151–1156,
Acapulco, Mexico, 2003.
H. Moravec and A. Elfes. High Resolution Maps from Wide Angle Sonar. In Proc. of the
IEEE Int. Conf. on Robotics & Automation (ICRA), pages 116–121, St. Louis, MO,
USA, 1985.
K. Murphy. Bayesian Map Learning in Dynamic Environments. In Proc. of the Conf. on
Neural Information Processing Systems (NIPS), pages 1015–1021, Denver, CO, USA,
1999.
J. Neira and J. Tardós. Data Association in Stochastic Mapping Using the Joint
Compatibility Test. IEEE Transactions on Robotics and Automation, 17(6):890–897,
2001.
E. Olson. Robust and Efficient Robotic Mapping. PhD thesis, Massachusetts Institute of
Technology, Cambridge, MA, USA, June 2008.
E. Olson. Real-Time Correlative Scan Matching. In Proc. of the IEEE Int. Conf. on
Robotics & Automation (ICRA), Kobe, Japan, June 2009.
E. Olson, J. Leonard, and S. Teller. Fast Iterative Optimization of Pose Graphs with
Poor Initial Estimates. In Proc. of the IEEE Int. Conf. on Robotics & Automation (ICRA), pages 2262–2269, Orlando, FL, USA, 2006.
S. O’Callaghan, F. Ramos, and H. Durrant-Whyte. Contextual Occupancy Maps using
Gaussian Processes. In Proc. of the IEEE Int. Conf. on Robotics & Automation (ICRA),
Kobe, Japan, June 2009.
M. Paskin. Thin Junction Tree Filters for Simultaneous Localization and Mapping. In
Proc. of the Int. Conf. on Artificial Intelligence (IJCAI), pages 1157–1164, Acapulco,
Mexico, 2003.
H. Robbins and S. Monro. A Stochastic Approximation Method. The Annals of Mathematical Statistics, pages 400–407, 1951.
C. Shannon. A Mathematical Theory of Communication. The Bell System Technical
Journal, 27:379–423, 1948.
J. Sivic and A. Zisserman. Video Google: A Text Retrieval Approach to Object Matching
in Videos. In Proc. of the Int. Conf. on Computer Vision (ICCV), pages 1470–1477,
2003.
R. Smith, M. Self, and P. Cheeseman. Estimating Uncertain Spatial Relationships in
Robotics. In I. Cox and G. Wilfong, editors, Autonomous Robot Vehicles, pages 167–193.
Springer Verlag, 1990.
C. Stachniss. Robotic Mapping and Exploration, volume 55 of Springer Tracts in Advanced
Robotics. Springer Verlag, 2009.
C. Stachniss, G. Grisetti, and W. Burgard. Recovering Particle Diversity in a Rao-Blackwellized Particle Filter for SLAM after Actively Closing Loops. In Proc. of the
IEEE Int. Conf. on Robotics & Automation (ICRA), pages 667–672, Barcelona, Spain,
April 2005.
C. Stachniss, G. Grisetti, W. Burgard, and N. Roy. Evaluation of Gaussian Proposal
Distributions for Mapping with Rao-Blackwellized Particle Filters. In Proc. of the
IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS), San Diego, CA, USA,
2007.
H. Strasdat, C. Stachniss, and W. Burgard. Which Landmark is Useful? Learning
Selection Policies for Navigation in Unknown Environments. In Proc. of the IEEE
Int. Conf. on Robotics & Automation (ICRA), Kobe, Japan, June 2009.
S. Thrun. Learning Occupancy Grids With Forward Sensor Models. Autonomous Robots,
15:111–127, 2003.
S. Thrun and M. Montemerlo. The GraphSLAM Algorithm with Applications to Large-Scale Mapping of Urban Structures. Int. Journal of Robotics Research, 25(5-6):403–429,
2006.
S. Thrun, D. Koller, Z. Ghahramani, and H. Durrant-Whyte. SLAM Updates Require
Constant Time. School of Computer Science, Carnegie Mellon University, Pittsburgh,
Tech. Rep, 2002.
S. Thrun, Y. Liu, D. Koller, A. Ng, Z. Ghahramani, and H. Durrant-Whyte. Simultaneous
Localization and Mapping With Sparse Extended Information Filters. Int. Journal of
Robotics Research, 23(7/8):693–716, 2004.
S. Thrun, W. Burgard, and D. Fox. Probabilistic Robotics. MIT Press, Cambridge, MA,
USA, 2005.
G. D. Tipaldi, G. Grisetti, and W. Burgard. Approximate Covariance Estimation in
Graphical Approaches to SLAM. In Proc. of the IEEE/RSJ Int. Conf. on Intelligent
Robots and Systems (IROS), San Diego, CA, USA, 2007.
J. Uhlmann. Dynamic Map Building and Localization: New Theoretical Foundations.
PhD thesis, University of Oxford, 1995.
R. van der Merwe, N. de Freitas, A. Doucet, and E. Wan. The Unscented Particle
Filter. Technical Report CUED/F-INFENG/TR380, Cambridge University Engineering
Department, August 2000.
V. Verma, S. Thrun, and R. Simmons. Variable Resolution Particle Filter. In Int. Joint
Conf. on Artificial Intelligence, volume 18, pages 976–984, 2003.
M. Walter, R. Eustice, and J. Leonard. A Provably Consistent Method for Imposing
Sparsity in Feature-based SLAM information Filters. In Proc. of the Int. Symposium
of Robotics Research (ISRR), San Francisco, CA, 2005.
