Life-long Map Learning for Graph-based Simultaneous Localization and Mapping
Institut für Informatik
Autonome Intelligente Systeme
Prof. Dr. Wolfram Burgard

Life-long Map Learning for Graph-based Simultaneous Localization and Mapping

Diplomarbeit
Henrik Kretzschmar
Juli 2009

Betreuer: Prof. Dr. Wolfram Burgard, Dr. Cyrill Stachniss, Dr. Giorgio Grisetti

Statutory Declaration (Eidesstattliche Erklärung)

I hereby declare that I wrote this thesis independently and without the use of any aids other than those stated, and that all passages taken verbatim or in substance from published or unpublished writings are marked as such. I furthermore declare that this thesis has not been submitted, in whole or in part, for any other examination.

(Henrik Kretzschmar)
Freiburg, July 22, 2009

Zusammenfassung (German abstract, translated)

Maps of the environment are indispensable for numerous applications in mobile robotics; delivery services, automated inspection of industrial plants, and deployment in disaster areas are only a few examples. The ability to build maps autonomously is therefore a central component of any autonomous mobile robot. A modern approach to simultaneous localization and mapping (SLAM) models the robot's estimate of the state of the world as a graph. The nodes of the graph correspond to poses of the robot, while the edges represent spatial constraints between the nodes that arise from the robot's measurements. While the map is being built, however, the graph grows continuously. This entails a computational complexity that rules out long-term operation. This thesis presents an approach that addresses the scalability problem of graph-based SLAM methods. We describe an information-theoretic technique that substantially speeds up the mapping process, in particular for robots that operate in the same environment for extended periods of time. Furthermore, our approach produces grid maps of higher quality than maps built with previous methods. The fewer nodes the graph modeling the environment contains, the faster the necessary computations can be carried out and the lower the memory consumption is. In addition, a sparse graph structure is desirable for efficient graph optimization. Our algorithm removes redundant information from the graph and thus prevents it from growing further while the robot travels through already visited parts of the environment. Exactly removing redundant nodes from the graph, however, creates many new edges and thus a dense graph structure, so that the algorithm can no longer operate efficiently. We describe a novel approximate technique for removing nodes from the graph that prevents the number of edges from increasing, while taking care that as little as possible of the information encoded in the edges is lost. The experimental evaluation based on data from real robots shows that our approach yields a considerable speed-up of the mapping process and, compared to previous approaches, produces more accurate grid maps.

Abstract

Maps of the environment are needed for a wide range of robotic applications, such as delivery, surveillance, and pursuit-evasion. The ability to generate a map of the environment is therefore a prerequisite for any truly autonomous mobile robot. A state-of-the-art approach to simultaneous localization and mapping (SLAM) uses a graph to model the belief of the robot.
The nodes in the graph correspond to positions of the robot, while the edges in the graph represent spatial constraints between the nodes which arise from the measurements obtained by the robot. While mapping, this graph grows constantly as the robot travels, resulting in a computational complexity and memory requirements which impede long-term mapping applications. This thesis formulates an approach for graph-based simultaneous localization and mapping using grid maps that addresses the problem of scalability in long-term applications. We describe an information-theoretic algorithm that achieves significant gains in computational efficiency for robots that frequently re-traverse previously visited areas. In addition to that, our method produces grid maps of higher quality compared to standard grid-based mapping approaches. The fewer nodes the graph used to model the belief of the robot contains, the faster the computations are. Furthermore, a sparse graph structure is desirable for efficient graph optimization. Our algorithm removes redundant information from the belief of the robot and thus prevents the graph from growing as long as the robot is moving through already visited territory. Exact elimination of information, however, introduces a large number of new constraints into the belief, destroying the sparsity of the graph and thus the efficiency of the algorithm. We therefore describe an approximation technique to remove nodes from the graph while maintaining its sparsity and minimizing the involved loss of information, culminating in a highly efficient approach to graph sparsification. The experimental evaluation using real robot data shows that our method yields grid maps which exhibit less blur compared to standard grid-based approaches while requiring significantly less computational resources.

Acknowledgments

I am very grateful to the Autonomous Intelligent Systems group for providing a highly stimulating research environment.
In particular, I would like to express my thanks to my supervisors Wolfram Burgard, Cyrill Stachniss, and Giorgio Grisetti for their excellent support, interest in my research, and feedback on academic writing. Additionally, I would like to thank Christian Plagemann, Kai O. Arras, Kai Wurm, and Slawomir Grzonka. Furthermore, I would like to thank the people who recorded and published the mobile robot datasets used for the experimental evaluation of this work. I thank Dirk Hähnel for providing the Intel Research Lab dataset and the FHW Museum dataset, and Mike Bosse for providing the Killian Court Infinite Corridor dataset. Many people external to the university have contributed with ideas, inspiration, and ceaseless moral support over the last months while I was writing this thesis. I owe a special debt of gratitude to Alex, Dominic, Jakob, Javier, Laura, Moritz, Silva, and Sonja. Thank you. Finally, I thank my family for supporting my educational endeavors.

Contents

1 Introduction
  1.1 Contribution of this Thesis
  1.2 Outline
  1.3 Notation
2 Related Work
  2.1 Gaussian Filter Approaches
  2.2 Particle Filter Approaches
  2.3 Graph-based Approaches
  2.4 Graph Pruning
  2.5 Chow-Liu Tree Approximation
  2.6 Summary
3 Background
  3.1 Robot Motion
  3.2 Robot Perception
  3.3 Grid Maps
  3.4 Mapping with Known Poses
    3.4.1 Occupancy Probability Mapping
    3.4.2 Reflection Probability Mapping
  3.5 Graphs
4 The Graph-based Formulation of the SLAM Problem
  4.1 Simultaneous Localization and Mapping
  4.2 Computing the Full SLAM Posterior
  4.3 Loop Closing
  4.4 The Pose Graph Metaphor
    4.4.1 Pose Graphs
    4.4.2 Pose/Feature Graphs
  4.5 SLAM as a Least Squares Problem
  4.6 Pose Graph Optimization
5 Our Approach to Life-long Map Learning
  5.1 Information-Theoretic Node Reduction
  5.2 Pose Graphs as Gaussian Markov Random Fields
  5.3 Marginalization Ceases Sparsity
  5.4 Approximate Marginalization
  5.5 Chow-Liu Tree Approximation
6 Experimental Evaluation
  6.1 Runtime and Memory Requirements
  6.2 Approximate Marginalization
  6.3 Improved Grid Map Quality
  6.4 Standard Datasets
    6.4.1 Intel Research Lab
    6.4.2 Killian Court
    6.4.3 FHW Museum
7 Discussion
  7.1 Conclusion
  7.2 Future Work
List of Figures
Bibliography

1 Introduction

Mobile robots can be reliable helpers in scenarios too dangerous for humans since they are disposable and unaffected by fumes, smoke, fear, or fatigue. For instance, search-and-rescue robots scour the wreckage of collapsed buildings for disaster survivors. Deep-sea diving robots are successfully engaged in the search for the flight data recorders of aircraft that crashed offshore. In addition to these comparatively extreme applications, service robots are expected to pervade our daily lives more and more. Applications range from robotic caregivers which deliver pharmaceuticals within a hospital to service robots which vacuum clean entire buildings. So far, many of these robots follow trivial and highly suboptimal navigation policies or are even remotely controlled by human operators due to the lack of reliable means of autonomous navigation. However, there are many reasons to operate a mobile robot without human guidance. Applications such as surveillance, transportation, or planetary exploration may require the robot to persistently operate for an extended period of time, making it difficult and costly to have a human constantly in the loop. For example, the goal of the ongoing Mars Exploration Rover mission is to explore the surface of Mars using two vehicles. Due to one-way communication delays of several minutes, remotely controlling the two vehicles from Earth is highly error-prone and cumbersome.
Hence, as robotics advances, researchers strive to improve the autonomous capabilities of mobile robots. Efficiently operating autonomous mobile robots depend upon maps of the environment to accomplish their tasks. However, for many applications it is not feasible to provide the robots with accurate handcrafted models of their surroundings in advance. To allow for flexible operation, mobile robots themselves must be able to infer and maintain a model of their environment from their own sensor measurements. However, the process of turning local perception into a globally consistent map is a challenging problem which requires the robot to build a map of the environment and, at the very same time, estimate its location within that map. This problem is known as the simultaneous localization and mapping problem. The pivotal importance of maps to mobile robotics has led to the development of a large diversity of approaches to the simultaneous localization and mapping problem. A state-of-the-art approach uses a graph to model the belief of the robot. The nodes in the graph correspond to positions of the robot, whereas the edges in the graph represent spatial constraints between the nodes which arise from the measurements obtained by the robot. This graph-based formulation exhibits several key advantages over other available methods. In particular, as opposed to many other techniques, such as the extended Kalman filter, graph-based approaches do not suffer from accumulating linearization errors, which can have serious detrimental effects on the quality of the map. Graph-based approaches are becoming more and more widely used since they are less sensitive to linearization errors and offer a high degree of flexibility. A disadvantage of graph-based techniques, however, is the fact that the graph grows constantly as the robot travels. This results in a computational complexity and memory requirements which impede long-term mapping applications.
This thesis presents an approach to overcome this problem and is therefore an important step towards life-long simultaneous localization and mapping for graph-based approaches.

1.1 Contribution of this Thesis

In this thesis, we introduce a novel approach for mobile robots using the graph-based formulation of the simultaneous localization and mapping (SLAM) problem to allow for life-long map learning using grid maps. We describe an information-theoretic algorithm whose computational complexity and memory requirements scale with the size of the mapped area rather than with the length of the robot's trajectory, as is the case for traditional graph-based methods. The key idea of our approach is to remove redundant information from the graph such that its size stays constant as long as the robot moves through previously visited places. Our method explicitly considers the information gain of observations and removes those nodes from the graph which do not contain new information with respect to the model of the environment constructed so far. A major contribution of this thesis is an approximate method based on Chow-Liu trees which enables us to remove nodes from the graph while preserving its sparsity, which is important for efficient optimization, while at the same time minimizing the loss of information encoded in the constraints. As a result, our approach is highly efficient and scales with the size of the mapped environment. Note that a short version of this diploma thesis has been peer-reviewed and accepted for publication as a journal article in "Zeitschrift KI – Künstliche Intelligenz", special issue on simultaneous localization and mapping (Kretzschmar et al., 2009).

1.2 Outline

The remainder of this thesis is structured as follows. In Chapter 2, we review the current state of research in the field of simultaneous localization and mapping.
We discuss the most influential approaches and allude to their advantages and disadvantages. We furthermore contemplate the contribution of this thesis in the light of the latest research. Chapter 3 aims to convey some background material necessary to understand the techniques presented in the subsequent chapters. Chapter 4 is devoted to a general introduction to the family of graph-based approaches to simultaneous localization and mapping. Chapter 5 presents our novel approach to life-long map learning. In Chapter 6, we illustrate the efficacy of the presented approach. To this end, we carried out experiments at the University of Freiburg using a real robot. Additionally, we applied our method to several real-world datasets which are widely used as benchmarks in the research community. Finally, Chapter 7 concludes this thesis and discusses future work.

1.3 Notation

Throughout this thesis, we adhere to the following notation. Time is considered to be discrete, and the time index is labeled t. The pose of a mobile robot operating in a planar environment comprises its two-dimensional location (x, y)^T along with its angular orientation θ, relative to an external coordinate frame. The pose of the robot at time t is denoted x_t. The map of the environment is denoted m. We denote the i-th feature in the map by m_i. In the case of grid maps, c refers to the grid cells. To perceive its environment, the robot is equipped with sensors. The observation obtained at time t is denoted z_t. Observations often comprise several individual measurements, e.g., laser beams. We use z_t^k to refer to the individual measurements obtained at time t. The robot changes its pose with respect to the map by issuing control commands. The control command u_t thereby provides information about the pose x_t relative to the pose x_{t-1}. To deal with the data association problem, we use d_t to represent the identities of the features perceived in observation z_t.
The sets of robot poses, observations, control commands, and data associations obtained between time 1 and time t are denoted x_{1:t}, z_{1:t}, u_{1:t}, and d_{1:t}, respectively. Table 1.1 summarizes the notation used in this thesis.

Table 1.1: Notation.

  x_t       pose of the robot at time step t
  x_{1:t}   sequence of poses of the robot from time step 1 to time step t
  m         map of the environment
  c         grid cell
  z_t       sensor observation obtained at time step t
  z_t^k     individual measurement obtained at time step t
  u_t       control command providing information about pose x_t relative to pose x_{t-1}
  d_t       identities of the features perceived in observation z_t

2 Related Work

The tremendous importance of the simultaneous localization and mapping (SLAM) problem to the field of mobile robotics has given rise to a myriad of different approaches to SLAM. Most techniques account for the inherent uncertainty present in the measurements by formulating the problem in a probabilistic manner. More precisely, they aim to compute the joint posterior probability distribution over the poses of the robot and the map of the environment. Common approaches apply Bayes filter implementations, such as the extended Kalman filter, the extended information filter, or the particle filter, to estimate the posterior or perform smoothing. In contrast to that, most graph-based approaches to SLAM compute the most likely map based on the full path of the robot and all measurements. In this chapter, we will present the main ideas of the most popular SLAM approaches and discuss their advantages and disadvantages. We will thereby motivate the graph-based formulation of the SLAM problem and, in particular, the contribution of this thesis in the context of related work.

2.1 Gaussian Filter Approaches

Gaussian filters model the posterior distribution as a multivariate Gaussian probability density function. Using Gaussians to model the posterior is convenient because computations can be carried out in closed form.
Applying a linear transformation to a Gaussian random variable yields another Gaussian random variable. Gaussians are unimodal, which means that they possess a single peak. Consequently, they cannot model the probability distribution of global estimation problems in which multiple distinct hypotheses exist. The earliest approaches to SLAM (Cheeseman and Smith, 1986; Leonard et al., 1992; Smith et al., 1990) are based on the extended Kalman filter (Kalman, 1960). An extended Kalman filter (EKF) estimates the state of a discrete-time controlled dynamic process from a series of noisy measurements. EKF-based SLAM approaches model both the posterior of the robot's pose and the positions of all landmarks in the map by a single multivariate Gaussian represented by its mean µ and its covariance matrix Σ. The approaches apply an extended Kalman filter to recursively estimate this posterior. Being recursive filters, they incorporate the measurements into the posterior without having to store them afterwards. Most EKF-based approaches to SLAM rely on a greedy maximum likelihood estimator to determine the correspondences, which means that they never amend past data associations. Choosing the wrong association, however, can make a filter diverge. This is why landmarks are required to be uniquely identifiable. In other words, these SLAM approaches depend on a flawless feature detector. Neira and Tardós (2001) presented techniques that deal with unknown data associations. If the motion model and the sensor model of the robot were Gaussian and linear, the solution computed by EKF-based SLAM approaches would be optimal. The real world, however, is neither exactly Gaussian nor linear. For example, the motion of a robot that travels on a circular trajectory cannot be described by linear state transitions. The EKF therefore applies a first-order Taylor expansion to linearize both the motion model and the sensor model at the current state estimate. Linearization approximates a nonlinear function by a linear function. Consequently, the EKF irrevocably induces linearization errors in every update. These errors accumulate over time, especially when the linearization point, i.e., the current state estimate, is poor (Bailey et al., 2006; Frese and Hirzinger, 2001; Julier and Uhlmann, 2001). If the nonlinearities are too large, the approach tends to fail (Julier et al., 1995; Uhlmann, 1995). The unscented Kalman filter (UKF), proposed by Julier et al. (1995), performs a more accurate linearization by propagating a set of deterministically chosen sigma points through the nonlinear models, but does not provide a general solution to the problem of nonlinearities. While the Kalman prediction of the robot motion is efficient, both the Kalman update of the measurement and the size of the belief state, i.e., the covariance matrix, are quadratic in the number of landmarks in the map. This effectively limits the application of EKF-based SLAM approaches to small environments (Eustice et al., 2005b). The dual form of the posterior distribution expressed by mean and covariance is called canonical form or information form. It describes the Gaussian using the information vector ξ and the information matrix Ω, where ξ = Σ^{-1} µ and Ω = Σ^{-1}. The counterpart of the EKF in information form is the extended information filter (EIF), introduced by Maybeck (1979). Conversely to the EKF, its update step is efficient, but its prediction step is quadratic in the number of landmarks. Furthermore, due to µ = Ω^{-1} ξ, recovering the state estimate µ requires inverting Ω, which takes time cubic in the size of the map. Table 2.1 compares the computational complexity of the Kalman filter to that of the information filter.

Table 2.1: Computational complexity of the Kalman filter and the information filter.

                      incorporate measurement   recover posterior mean/covariance
  Kalman filter       O(N^2)                    O(1)
  information filter  O(1)                      O(N^2)
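The relation between the two parameterizations can be sketched in a few lines (a minimal illustration with made-up example values, not code from the thesis):

```python
import numpy as np

# Moment form: mean mu and covariance Sigma (example values).
mu = np.array([1.0, 2.0])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])

# Canonical (information) form: Omega = Sigma^-1, xi = Sigma^-1 mu.
Omega = np.linalg.inv(Sigma)
xi = Omega @ mu

# Recovering the state estimate requires solving with Omega,
# which costs O(N^3) for an N-dimensional state.
mu_recovered = np.linalg.solve(Omega, xi)

assert np.allclose(mu_recovered, mu)
```

The round trip is exact up to numerical precision; the cubic cost of the recovery step is precisely the drawback of the information form mentioned above.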
Whereas the covariance matrix encodes all indirect relations between poses and/or landmarks explicitly, the information matrix only represents direct relations, i.e., relations which are involved in a common measurement. Sensors like laser range finders and sonar sensors have strictly limited measurement ranges. Features that are located farther apart from each other than this range cannot be simultaneously observed. Marginalizing out past robot poses, however, causes the information matrix in feature-based SLAM to become dense (McLauchlan and Murray, 1995). The new constraints which are added to the information matrix due to marginalization are often referred to as fill-in. Thrun et al. (2002) empirically observed, and Frese (2005) later proved, that the information matrix in feature-based SLAM is approximately sparse. More precisely, Frese showed that off-diagonal entries corresponding to two landmarks decay exponentially with the distance the robot traveled between observing the two landmarks one after the other. Consequently, many of the off-diagonal values of the information matrix are zero or close to zero when properly normalized. The insight that the information matrix in feature-based SLAM is approximately sparse has spawned the development of scalable approaches. The key idea underlying most efficient algorithms is to prune weak constraints, i.e., to approximate the dense information matrix by a sparse matrix. The sparse extended information filter (SEIF), proposed by Thrun et al. (2004), selectively prunes small entries in the information matrix by setting them to zero. Given the sparsified information matrix, it can then carry out the prediction step and the update step in nearly constant time, irrespective of the number of landmarks in the map. A year later, Walter et al. (2005) pointed out that the error estimates computed by the sparse extended information filter are overconfident.
In the same paper, they presented the so-called exactly sparse extended information filter (ESEIF). It maintains sparsity while preserving consistency by periodically relocalizing the robot. Similarly, the thin junction tree filter (TJTF), presented by Paskin (2003), represents the belief state using a junction tree. It prunes the junction tree such that the resulting approximation error is minimized. The filter techniques mentioned above marginalize out all poses except for the current one. In contrast, the approach presented in this thesis marginalizes out only those poses whose observations do not reduce the uncertainty in the belief of the robot; all other poses remain in the state. It furthermore approximates the elimination cliques, which result from the marginalization of redundant poses, by their optimal tree-shaped approximations. Our approach thereby removes edges from the graph, which is equivalent to setting the corresponding entries of the information matrix to zero. Hence, our approximation approach is a technique to sparsify the information matrix. Eustice et al. (2005a) showed that the information matrix is exactly sparse in a delayed-state framework. As mentioned above, marginalizing out poses of the robot makes the information matrix become dense. A delayed-state filter maintains the exact sparsity of the information matrix by not marginalizing out robot poses. The sparsity of the information matrix in a delayed-state framework directly stems from the first-order Markov assumption in the state transition model. The approach presented in this thesis can also be regarded as a delayed-state framework, albeit with reservations. As long as the robot does not remove any nodes from the graph, the representation contains all poses and is thus naturally sparse. However, as soon as the robot removes a redundant node, the marginalization causes fill-in. As a result, the information matrix becomes dense.
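The fill-in caused by marginalization can be made concrete with a small sketch (an illustration with made-up values, not code from the thesis). Consider a chain of three poses x1 - x2 - x3; marginalizing out the middle pose via the Schur complement directly connects its neighbors:

```python
import numpy as np

# Information matrix of a chain x1 - x2 - x3: x1 and x3 are not
# directly constrained, so their off-diagonal entry is zero (sparse).
Omega = np.array([[ 2.0, -1.0,  0.0],
                  [-1.0,  3.0, -1.0],
                  [ 0.0, -1.0,  2.0]])

keep = [0, 2]   # indices of x1 and x3
drop = [1]      # marginalize out x2

# Schur complement: Omega_kk - Omega_kd Omega_dd^-1 Omega_dk
Okk = Omega[np.ix_(keep, keep)]
Okd = Omega[np.ix_(keep, drop)]
Odd = Omega[np.ix_(drop, drop)]
Omega_marg = Okk - Okd @ np.linalg.inv(Odd) @ Okd.T

# Fill-in: x1 and x3 are now linked by a non-zero entry,
# i.e., marginalization has added an edge between them.
assert Omega_marg[0, 1] != 0.0
```

In a graph with many neighbors per eliminated node, every pair of those neighbors becomes connected, which is exactly why exact marginalization densifies the information matrix.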
Our approach therefore approximates the resulting fill-in to recover a sparse representation. Furthermore, as opposed to the filtering techniques mentioned above, our approach computes the state by means of global optimization in order to avoid accumulating linearization errors. Strasdat et al. (2009) recently presented a feature-based approach to SLAM which aims to reduce the computational burden by only selectively incorporating landmarks. Their approach uses an unscented Kalman filter to estimate the posterior and only adds those landmarks to the state which actually help the robot navigate. The key idea of our approach is related to that. However, we aim to discard observations without having to rely on predefined features. Consequently, a feature-based representation, as used in the work presented by Strasdat et al., is not an option. Instead, we use grid maps to model the environment. Hence, we are not dealing with feature observations, but with raw laser scans. We discard those observations which do not contribute to the belief of the robot in terms of uncertainty reduction in the grid map. In contrast to Strasdat et al., we even consider former observations for removal. Furthermore, whereas the approach presented by Strasdat et al. is based on the unscented Kalman filter, which linearizes the measurements at the current state estimate, our approach uses a graph-based representation and applies nonlinear optimization techniques to compute the most likely map.

2.2 Particle Filter Approaches

Particle filters approximate a posterior by a set of hypotheses, represented by a set of weighted random samples called particles. Each particle represents a potential state of the system. One of the key advantages of particle filters over approaches which model the posterior as a Gaussian, such as methods based on the extended Kalman filter or the extended information filter, is their ability to represent multi-modal probability distributions.
They are thus capable of solving global estimation problems with multiple modes. Doucet et al. (1998) provide an overview and a discussion of the properties of particle filters in general. In the context of mobile robot localization and mobile robot map learning, the motion model of the robot is typically used as the proposal distribution to draw the next generation of particles. The importance weight of the particles is, for the most part, computed based on the observation likelihood of the most recent sensor observation according to the sensor model of the robot. Monte Carlo localization (MCL), introduced by Dellaert et al. (1999), first applied particle filters in the context of mobile robot localization. Murphy (1999) and Doucet et al. (2000) were the first to use Rao-Blackwellized particle filters (RBPF) for map learning. The idea of the RBPF is to separate the estimate of the trajectory of the robot from the map of the environment by an exact factorization. Each particle represents a possible robot trajectory and a map. The FastSLAM algorithm (Montemerlo et al., 2002, 2003) applies the idea of the Rao-Blackwellized particle filter to landmark-based SLAM, tracking each landmark by an extended Kalman filter. Interestingly, the FastSLAM algorithm deals with the data association problem by also sampling over potential correspondences (Montemerlo and Thrun, 2003). Every particle maintains its own associations, and the filter can undo incorrect associations by removing the corresponding inconsistent particles. Hähnel et al. (2003) presented a FastSLAM approach for occupancy grid mapping by combining a Rao-Blackwellized particle filter and scan matching. Grisetti et al. (2007b) presented a highly efficient approach that is capable of reusing already computed proposal distributions. Furthermore, it enables the individual particles to share parts of the model of the environment. Stachniss et al.
(2007) presented a novel sampling technique which is able to deal with multi-modal distributions while maintaining the same efficiency as the Gaussian proposal. DP-SLAM, presented by Eliazar and Parr (2003), uses a particle filter to represent both the robot poses and possible map configurations. The approach is capable of dealing with a large number of particles thanks to its map representation, which allows for highly efficient updates. Instead of maintaining a map for each particle, it maintains a single occupancy grid map. It stores in each cell a tree containing links to the particles that have made changes to that particular cell. The solution reported by particle filters converges to the true Bayesian posterior as the number of particles goes to infinity (Verma et al., 2003). In practice, however, maintaining an infinite number of samples is computationally not feasible. It is desirable to approximate the posterior as well as possible given the limited number of particles which can be maintained. Particle filters therefore remove particles with a low importance weight and replace them by particles with a high importance weight in the so-called resampling step. In doing so, they have to accept the risk of irrevocably eliminating a particle even if it is very close to the true state. The effect that discarded hypotheses are lost and cannot be recovered is known as the particle depletion problem (Doucet, 1998; Doucet et al., 2001; van der Merwe et al., 2000). There are several approaches to reduce its detrimental effects (Grisetti et al., 2005, 2006; Stachniss et al., 2005, 2007). In short, they seek to maintain a reasonable variety of particles by only selectively triggering the resampling step. Furthermore, they compute more accurate proposal distributions by taking into account both the movement of the robot and the most recent observation.
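As an illustration of the resampling step, the widely used low-variance (systematic) scheme can be sketched as follows (a generic textbook variant with made-up example weights, not the specific procedure of any of the cited approaches):

```python
import random

def low_variance_resample(particles, weights):
    """Draw len(particles) samples with probability proportional to
    the weights, using a single random offset (low-variance sampling)."""
    n = len(particles)
    step = sum(weights) / n
    r = random.uniform(0.0, step)   # single random start point
    resampled = []
    c = weights[0]                  # running cumulative weight
    i = 0
    for k in range(n):
        u = r + k * step            # equally spaced sampling positions
        while u > c:
            i += 1
            c += weights[i]
        resampled.append(particles[i])
    return resampled

# A dominant hypothesis gets duplicated; weak ones tend to vanish.
particles = ["a", "b", "c", "d"]
weights = [0.7, 0.1, 0.1, 0.1]
print(low_variance_resample(particles, weights))
```

Because the sampling positions are equally spaced, a particle with weight w is guaranteed roughly w * n copies, which reduces the sampling variance compared to drawing n independent samples.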
2.3 Graph-based Approaches A key disadvantage of all filter techniques is that they process the measurements and then discard them (Frese and Hirzinger, 2001). This gives rise to yet another approach to the SLAM problem. Lu and Milios (1997) were the first to globally optimize the poses of the robot based on a set of constraints between them. The set of constraints naturally emerges from observations and motion commands. Such a network of constraints is usually modeled as a graph, hence the term graph-based SLAM. The nodes in the graph represent poses of the robot and features, and the edges encode constraints between them. Graph-based approaches to SLAM aim to find the map that is most likely given a set of observations. The framework can be nicely illustrated by means of the spring-mass model (Golfarelli et al., 1998). Whereas the filter techniques mentioned above marginalize out past robot poses, graph-based SLAM approaches preserve them. Graph-based methods solve the so-called full SLAM problem, estimating all poses of the robot as well as the map. Methods based on the extended Kalman filter and the extended information filter perform linearization of each nonlinear term, such as the motion and the measurement model, only once. The point of linearization is chosen to be the state estimate at the time of linearization. Suboptimal points of linearization, resulting from poor state estimates, cannot be revised. Filtering techniques therefore suffer from accumulating linearization errors (Bailey et al., 2006; Julier and Uhlmann, 2001). In contrast to that, graph-based SLAM approaches avoid inducing these accumulating linearization errors by continuously relinearizing around the current state estimate (Olson et al., 2006). That is to say, these approaches consider the SLAM problem to be a nonlinear optimization problem. The constraints in the graph are typically nonlinear because they include the rotational component of the pose of the robot, i.
e., the orientation of the robot. The resulting optimization problem is therefore nonlinear and difficult to solve. In graph-based SLAM, the aforesaid information matrix has a natural interpretation. It is the adjacency matrix of the graph, where non-zero off-diagonal elements are edges, and zero-valued off-diagonal elements indicate that there is no edge connecting the corresponding nodes (Olson et al., 2006; Thrun and Montemerlo, 2006). Graph-like constraints were first explicitly formulated by Cheeseman and Smith (1986) and Durrant-Whyte (1987). Lu and Milios (1997) were the first to refine a map by globally optimizing such a system of equations. Since then, a large variety of approaches to minimizing the error in constraint networks have been proposed. Duckett et al. (2002) use Gauss-Seidel relaxation. Frese’s multi-level relaxation technique (Frese et al., 2005) applies relaxation at different resolutions. Given a good initial guess, it yields accurate maps, particularly in flat environments. Folkesson and Christensen (2004a) define an energy function for each node and minimize it. Thrun and Montemerlo (2006) apply variable elimination techniques to reduce the dimensionality of the optimization problem. A promising approach to SLAM is incremental smoothing and mapping (iSAM), proposed by Kaess et al. (2007). It ensures efficiency by updating a QR factorization of the sparse information matrix, even in the case of many re-traversals. It therefore only recomputes those matrix entries that actually change. Furthermore, iSAM allows for efficient and exact computation of the marginal covariances for fast online data association. Olson et al. (2006) presented a fast and accurate optimization approach which is based on stochastic gradient descent (SGD) (Robbins and Monro, 1951). In contrast to multi-level relaxation, proposed by Frese et al., it can better recover from poor initial guesses. Based on Olson’s optimization algorithm, Grisetti et al.
(2007a) proposed a different parameterization of the nodes in the graph. The tree-based parameterization yields a significant boost in performance. In addition to that, the approach can deal with arbitrary graph topologies. This property is crucial for the approach presented in this thesis. 2.4 Graph Pruning Most graph-based approaches available today do not provide means to efficiently prune the pose graph that has to be corrected by the underlying optimization framework. Most approaches add a new node to the pose graph for every single observation. There are some notable exceptions, though. A simple method to reduce the number of nodes in the pose graph is to sample the trajectory of the robot at an appropriate spatial decimation (Eustice et al., 2006). A similar approach is to only add a new node to the graph if it is not spatially close to any existing node (Konolige and Agrawal, 2008). These sparsification techniques, however, ignore the observations themselves and therefore run the risk of discarding nodes which capture important aspects of the environment. Suppose a robot is close to an obstacle and records two observations from almost the same pose such that the first observation captures the area to the left of the obstacle, and the second observation covers the area to the right of the obstacle. Since both observations are recorded from two poses which are spatially very close to each other, the approaches mentioned above may not add the second observation. Hence, they may discard precious information about the environment which may not be covered by subsequent observations. In contrast to that, our approach explicitly computes the information gain of observations. It is thus able to detect whether an observation contributes to the belief of the robot or not, even when the observations are recorded from almost the same pose.
In the example above, our approach would add both observations to the graph and would thus not neglect the opportunity to capture the area to the right of the obstacle. Folkesson and Christensen (2004b) and Folkesson et al. (2005) combine nodes into so-called star nodes, yielding rigid local submaps. In contrast to filter-based approaches, which perform linearization at the time of observation, their methods apply delayed linearization of local subsets of the graph, permanently gluing together a set of nodes in a relative frame. Konolige (2004) proposed to reduce the graph such that it only contains those nodes which have a loop constraint attached, computing the other nodes separately. This can lead to a substantial speed-up in case the connectivity of the graph is low. These methods, however, do not actually discard nodes and therefore do not prevent the graph from growing when the robot travels through already mapped terrain. They are thus not suited for life-long map learning. In contrast to that, the approach presented in this thesis actually discards those observations that do not capture new aspects of the environment, freeing allocated resources. Consequently, once an environment is well explored, the resources needed by our approach do not further increase when the robot comes back to re-traverse the same region. 2.5 Chow-Liu Tree Approximation In our approach, we use Chow-Liu trees to obtain optimal tree-shaped approximations of the probability distributions which arise from the elimination cliques due to marginalization. Other researchers within the field of robotics recently used Chow-Liu trees to approximate multi-dimensional probability distributions. For instance, Cummins and Newman (2008) extended the well-known bag-of-words approach (Sivic and Zisserman, 2003) by learning a model for the sensory data in the form of a Chow-Liu tree.
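The generic Chow-Liu construction referred to here is compact: estimate the pairwise mutual information between the variables and keep a maximum-weight spanning tree over it. The following sketch over samples of discrete variables is purely illustrative; it shows the general construction, not the sparsification procedure developed later in this thesis:

```python
import math
from collections import Counter
from itertools import combinations

def mutual_information(samples, i, j):
    """Estimate I(X_i; X_j) from joint samples of discrete variables."""
    n = len(samples)
    pij = Counter((s[i], s[j]) for s in samples)
    pi = Counter(s[i] for s in samples)
    pj = Counter(s[j] for s in samples)
    mi = 0.0
    for (a, b), c in pij.items():
        # p(a,b) * log( p(a,b) / (p(a) p(b)) ), with counts c, pi, pj
        mi += (c / n) * math.log(c * n / (pi[a] * pj[b]))
    return mi

def chow_liu_tree(samples, num_vars):
    """Maximum-weight spanning tree over pairwise mutual information
    (Kruskal's algorithm with a union-find structure)."""
    edges = sorted(((mutual_information(samples, i, j), i, j)
                    for i, j in combinations(range(num_vars), 2)),
                   reverse=True)
    parent = list(range(num_vars))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    tree = []
    for _, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:        # adding (i, j) does not create a cycle
            parent[ri] = rj
            tree.append((i, j))
    return tree
```

By the Chow-Liu result, the tree-shaped distribution built on these edges is the optimal tree approximation of the joint distribution in the Kullback-Leibler sense.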
Furthermore, Chli and Davison (2009) used a Chow-Liu tree approximation to divide large visual maps into a fully hierarchical correlation and clustering structure. However, to the best of our knowledge, Chow-Liu trees have not been used previously to sparsify the elimination cliques in graph-based approaches to SLAM. 2.6 Summary Compared to all previous work, this thesis presents a novel approach that explicitly considers the information gain of observations and thereby allows for applying graph-based SLAM in the context of life-long map learning. It removes nodes from the pose graph if they do not contain new information with respect to the model of the environment constructed so far. It is highly efficient because it removes the nodes from the graph such that the corresponding information matrix remains sparse. It does so by locally approximating the belief state with optimal, tree-shaped probability distributions. As a result, the method presented in this thesis is an important step towards life-long map learning in mobile robotics. 3 Background This chapter briefly introduces some background material which is needed to understand the techniques yet to come. Section 3.1 and Section 3.2 deal with the motion and the perception of the robot, respectively. In Section 3.3, we introduce the grid map as a flexible representation of an environment. Section 3.4 analyzes the mapping problem under the assumption that the poses of the robot are exactly known. In Section 3.5, we introduce the notion of graphs and the corresponding terminology used in this thesis. The chapter is based on Thrun et al. (2005), Stachniss (2009), Hähnel (2005), and Diestel (2005), which can be consulted for a more in-depth treatment. 3.1 Robot Motion The robot changes its state in the environment by issuing control commands. The motion model thereby describes how the state of the robot evolves. The state of the robot typically corresponds to its pose in the environment.
In more complex settings, however, the state can additionally comprise the dynamics of the robot or other relevant information. In probabilistic robotics, motion models explicitly account for the fact that the outcome of a control command is uncertain, due to various types of noise. Motion models therefore describe the transition probability of the robot’s state. They are given by the posterior probability density distribution

p(x_t \mid u_t, x_{t-1}),   (3.1)

where x_{t-1} is the state of the robot at time t-1, u_t is the issued control command, and x_t is the resulting state of the robot at time t. Motion models typically assume that the state evolves according to a first-order Markov process. Hence, the state x_t is only conditionally dependent upon its immediately preceding state x_{t-1} and the issued control command u_t. Figure 3.1 illustrates the motion of the robot in terms of a Bayesian network representation. A Bayesian network is a directed acyclic graph which encodes the dependence properties of a joint density. Actual implementations of Eq. (3.1) typically assume a Gaussian distribution (Cheeseman and Smith, 1986) or a “banana-shaped” distribution (Dellaert et al., 1999). As we will discuss in the next chapter, the work presented in this thesis assumes a Gaussian distribution. Figure 3.1: Bayesian network representation of the robot’s motion. The motion model assumes that the state of the robot evolves according to a first-order Markov process. Hence, the state x_t is only conditionally dependent upon its immediately preceding state x_{t-1} and the issued control command u_t. Figure 3.2: Bayesian network representation of the robot’s perception. (a) An observation z_t is conditionally dependent upon the pose of the robot x_t and the map m of the environment. (b) An observation z_t at time t often comprises several individual measurements, giving z_t = \{z_t^1, \ldots, z_t^K\}.
These individual measurements are assumed to be independent. 3.2 Robot Perception To perceive its environment, the robot is equipped with sensors. The sensors are typically mounted on the robot. Therefore, they sense the environment relative to the current pose of the robot. Sensor models describe the formation process by which sensor measurements are generated in the physical world. All sensor measurements are subject to noise. In probabilistic robotics, sensor models therefore explicitly model the uncertainty in the sensors. Hence, sensor models are defined as the conditional probability distribution

p(z_t \mid x_t, m),   (3.2)

where z_t is the measurement, x_t is the pose of the robot, and m is the map of the environment. An observation z_t at time t often comprises several individual measurements, e. g., several laser beams. Hence we have

z_t = \{z_t^1, \ldots, z_t^K\},   (3.3)

where z_t^1, \ldots, z_t^K refer to the individual measurements. We assume that the individual measurements z_t^k are independent, allowing us to write

p(z_t \mid x_t, m) = \prod_{k=1}^{K} p(z_t^k \mid x_t, m).   (3.4)

Figure 3.2 illustrates the perception of the robot in terms of two Bayesian networks. Again, the concrete realization of p(z_t^k \mid x_t, m) depends on the sensor which is used. 3.3 Grid Maps In this thesis, we use grid maps to model the environment. Grid maps partition the continuous space of locations in the environment into a grid of rectangular cells, the so-called grid cells. Each grid cell contains information about the corresponding area in the environment. Figure 3.3: An occupancy grid map. Gray shades indicate the probability that a cell c is occupied, where white means p(c) = 0, and black means p(c) = 1. The most frequently used grid maps are occupancy grid maps and reflection probability grid maps. Occupancy grid maps assume that each grid cell is either occupied by an obstacle or free. Each cell thereby stores the probability that the particular cell is occupied.
Reflection probability grid maps store the probability that a beam is reflected by an obstacle within this particular cell. Figure 3.3 depicts a simple occupancy grid map. The occupancy grid map and the reflection probability grid map are efficient approaches for representing uncertainty. Grid maps allow for detailed representations of an environment without the need of a predefined feature extractor. As they have the ability to represent occupied space, empty space, and unknown (unobserved) space, grid maps are well suited for tasks such as path planning, obstacle avoidance, and exploration. In contrast to most representations based on features, grid maps offer constant time access to cells. However, grid maps suffer from discretization errors. Furthermore, they require a significant amount of memory resources. 3.4 Mapping with Known Poses The term mapping with known poses refers to the problem of producing maps from noisy measurements z_{1:t}, under the assumption that the poses of the robot are known at all times t. In practice, the true trajectory of the robot x_{1:t} cannot be inferred directly from the odometry measurements u_{1:t} because of the inherent noise present in all measurements. Nevertheless, the problem of mapping with known poses is of vital importance in the context of SLAM. This is due to the fact that most SLAM approaches do not output the map itself. Instead, they aim to recover the true robot trajectory. In short, mapping with known poses can be used to process the trajectory estimate of the SLAM approach to produce the final map. The problem comprises the dependencies illustrated by the Bayesian network depicted in Figure 3.4. Figure 3.4: Bayesian network representation of the problem of mapping with known poses. The measurements z_{1:t} as well as the robot trajectory x_{1:t} are known. The goal is to compute the map m.
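To make the grid map representation concrete, the following sketch shows a minimal occupancy grid with constant-time cell access; its update method implements the standard binary Bayes filter in log-odds form. The class layout, cell indexing, and inverse sensor model values are illustrative assumptions, not the data structures used in this thesis:

```python
import math

class OccupancyGrid:
    """Minimal occupancy grid: fixed resolution, constant-time cell
    access, and a per-cell binary Bayes update in log-odds form."""

    def __init__(self, width, height, resolution, prior=0.5):
        self.width = width
        self.height = height
        self.resolution = resolution
        self.log_prior = math.log(prior / (1.0 - prior))
        self.log_odds = [self.log_prior] * (width * height)

    def cell_index(self, x, y):
        """World coordinates -> flat cell index (constant time)."""
        cx = int(x / self.resolution)
        cy = int(y / self.resolution)
        return cy * self.width + cx

    def update(self, x, y, p_occ_given_z):
        """Binary Bayes filter step: add the log odds of the inverse
        sensor model and subtract the prior log odds."""
        idx = self.cell_index(x, y)
        self.log_odds[idx] += (math.log(p_occ_given_z / (1.0 - p_occ_given_z))
                               - self.log_prior)

    def occupancy(self, x, y):
        """Recover the occupancy probability from the accumulated log odds."""
        l = self.log_odds[self.cell_index(x, y)]
        return 1.0 - 1.0 / (1.0 + math.exp(l))
```

The log-odds form is algebraically equivalent to the recursive occupancy update of the standard approach but replaces the products of probability ratios with sums, which is numerically more convenient.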
In general terms, mapping with known poses can be expressed as computing the posterior distribution over the maps m given the measurements z_{1:t} and the robot poses x_{1:t}, i. e.,

p(m \mid z_{1:t}, x_{1:t}).   (3.5)

For reasons of simplicity, it is common to assume that the grid cells are independent of each other. The probability of the entire map is then given by the product of the probabilities of the individual cells. Hence, we have

p(m) = \prod_{c \in m} p(c),   (3.6)

and, in particular,

p(m \mid z_{1:t}, x_{1:t}) = \prod_{c \in m} p(c \mid z_{1:t}, x_{1:t}).   (3.7)

Solving the estimation problem given in Eq. (3.5) turns into solving a collection of several separate binary estimation problems, i. e.,

p(c \mid z_{1:t}, x_{1:t})   (3.8)

for all grid cells c \in m. Although Eq. (3.6) expresses a strong assumption, the resulting maps typically are of reasonable quality. Only very few approaches exist that consider such dependencies. Notable exceptions are Thrun (2003) and O’Callaghan et al. (2009). 3.4.1 Occupancy Probability Mapping The standard occupancy grid mapping approach, presented by Moravec and Elfes (1985), solves the estimation problem for each grid cell, as given in Eq. (3.8), using a Bayes filter. The corresponding recursive occupancy update formula is given by

p(c \mid z_{1:t}, x_{1:t}) = \left[ 1 + \frac{1 - p(c \mid z_t, x_t)}{p(c \mid z_t, x_t)} \cdot \frac{p(c)}{1 - p(c)} \cdot \frac{1 - p(c \mid z_{1:t-1}, x_{1:t-1})}{p(c \mid z_{1:t-1}, x_{1:t-1})} \right]^{-1}.   (3.9)

The term p(c) describes the prior probability that a cell is occupied by an obstacle. Typically, a prior of p(c) = 1/2 is assumed. The actual implementation of p(c \mid z_t, x_t) depends on the sensor which is used. 3.4.2 Reflection Probability Mapping In contrast to occupancy grid maps, which store the probability that a cell is occupied by an obstacle, reflection probability maps capture the probability that a laser beam is reflected by an object within a particular grid cell. Reflection probability maps can be computed by means of the so-called simple counting model.
It states that the reflection probability of a cell c is given by

p(c \mid z_{1:t}, x_{1:t}) = \frac{\mathrm{hits}(c, z_{1:t}, x_{1:t})}{\mathrm{hits}(c, z_{1:t}, x_{1:t}) + \mathrm{misses}(c, z_{1:t}, x_{1:t})},   (3.10)

where hits(c, z_{1:t}, x_{1:t}) refers to the number of times a beam z_i^k measured from x_i \in x_{1:t} ended in cell c, and misses(c, z_{1:t}, x_{1:t}) refers to the number of times a beam has passed through cell c. Figure 3.5: Drawings of two graphs. The blue circles and the lines represent the nodes and the edges in the graphs, respectively. (a) A drawing of a disconnected graph. A disconnected graph contains a pair of nodes such that there exists no path in the graph which links the two nodes. (b) A complete graph. All nodes in a complete graph are pairwise adjacent. 3.5 Graphs In this thesis, we use a graph-based approach to address the SLAM problem. A graph is a pair G = (V, E) of sets such that E ⊆ V × V. The elements of V are the nodes or vertices of the graph, the elements of E are the edges of the graph. In the remainder of this document, we will stick to the following basic terminology of graph theory:
• The nodes belonging to an edge are called incident with that edge.
• The two nodes incident with an edge are its ends.
• If v is incident with an edge e, we say that edge e is an edge at v.
• The set of all edges in E at v is denoted by E(v).
• An edge joins its ends.
• Two edges are adjacent if they have an end in common.
• Two nodes x, y of G are adjacent, or neighbors, if {x, y} is an edge of G.
• If all nodes of G are pairwise adjacent, then G is called complete.
• The set of neighbors of a node v in G is denoted by N_G(v), or briefly by N(v).
• The degree of a node v is the number of edges at v, written as d_G(v), or briefly as d(v).
• The number of nodes of a graph G is its order, written as |G|.
• The number of edges of a graph G is its size, denoted by ||G||.
• A graph G′ = (V′, E′) is a subgraph of G = (V, E) if V′ ⊆ V and E′ ⊆ E. In this case, G is a supergraph of G′.
• A clique in a graph G = (V, E) is a non-empty set of nodes C ⊆ V such that all nodes in C are pairwise adjacent in G.
• A path is a non-empty graph P = (V, E) of the form V = \{x_0, x_1, \ldots, x_k\} and E = \{\{x_0, x_1\}, \{x_1, x_2\}, \ldots, \{x_{k-1}, x_k\}\}, where i ≠ j ⇒ x_i ≠ x_j. In this case, P links its ends x_0 and x_k.
• A non-empty graph G is called connected if any two of its nodes are linked by a path in G. A non-empty graph which is not connected is called disconnected.
• The adjacency matrix A = (a_{ij})_{n \times n} of a graph G = (V, E) is defined by

a_{ij} = \begin{cases} 1 & \text{if } v_i v_j \in E, \\ 0 & \text{otherwise.} \end{cases}   (3.11)

The usual way to visualize a graph is by drawing a symbol for each node, and by drawing a straight or curved line between two nodes if and only if there is an edge joining them. Figure 3.5 exemplifies a disconnected graph and a complete graph. See, for instance, Diestel (2005) for a thorough introduction to graph theory. 4 The Graph-based Formulation of the SLAM Problem In this thesis, we use a graph-based approach to address the simultaneous localization and mapping (SLAM) problem. Graph-based approaches to SLAM cast map building as an optimization problem. They model the poses of the robot and the map features as nodes in a graph. The edges of the graph model spatial constraints between the nodes. These constraints naturally arise from odometry measurements and from observations. The spatial configuration of the nodes which best satisfies the constraints encoded in the edges corresponds to the most likely map given the measurements. Graph-based SLAM approaches apply optimization techniques to determine this configuration. This chapter provides a general introduction to the graph-based formulation of the SLAM problem. Section 4.1 first introduces the SLAM problem from a formal, probabilistic point of view.
It draws a distinction between the online and the full SLAM problem. Since graph-based approaches to SLAM aim to solve the latter, Section 4.2 derives a closed form expression to calculate the corresponding posterior distribution. Section 4.3 discusses the importance of a robust approach to loop closing, i. e., the ability to recognize places that the robot previously visited. Section 4.4 introduces a graphical, intuitive representation of the full SLAM posterior, namely the concept of pose graphs. Assuming independent Gaussian noise, Section 4.5 formulates the SLAM problem as a least squares problem, which can be solved efficiently. Finally, Section 4.6 discusses the task of computing the most likely configuration of the nodes in the pose graph, yielding the maximum likelihood map. This introduction to the graph-based formulation of the SLAM problem is mainly based on Olson (2008), Thrun et al. (2005), and Kaess et al. (2008). 4.1 Simultaneous Localization and Mapping A robot that has to acquire a map of its initially unknown environment has to solve the so-called simultaneous localization and mapping (SLAM) problem. SLAM is challenging because it is a “chicken-and-egg” problem. On the one hand, an accurate estimate of the pose of the robot is needed while building the map. Estimating the pose of the robot is called localization. On the other hand, localization needs a map to correct the pose uncertainty which arises from the odometry measurements due to noise. Localization and map building are therefore inherently coupled. SLAM is regarded as one of the most fundamental problems in mobile robotics because the ability to build maps is a key precondition for truly autonomous mobile robots. The remainder of this section defines the SLAM problem from a formal, probabilistic point of view. The subsequent sections will be concerned with an approach to solve the problem.
In probabilistic robotics, the SLAM problem is described as finding the joint posterior distribution of the map m and the poses of the robot x_{1:t} given a set of observations z_{1:t} and a set of robot motion commands u_{1:t}. A distinction is drawn between the full SLAM and the online SLAM problem. The full SLAM problem refers to estimating the joint posterior probability distribution over the entire robot path x_{1:t} along with the map m given the observations z_{1:t} and the control commands u_{1:t}. The full SLAM posterior is thus given by

p(x_{1:t}, m \mid z_{1:t}, u_{1:t}).   (4.1)

Integrating out all past robot poses x_{1:t-1} yields the online SLAM posterior. Hence, the online SLAM posterior is

p(x_t, m \mid z_{1:t}, u_{1:t}) = \int \int \cdots \int p(x_{1:t}, m \mid z_{1:t}, u_{1:t}) \, dx_1 \, dx_2 \cdots dx_{t-1}.   (4.2)

Figure 4.1 illustrates the structure of the SLAM problem in general and emphasizes the differences between the online SLAM problem and the full SLAM problem. In online SLAM, a robot pose is typically integrated out immediately once a new pose comes into play, only keeping track of the current pose of the robot. As already mentioned in Chapter 2, integrating out past robot poses substantially changes the dependency structures. These structural changes will be of great interest for the work presented in this thesis. The distributions above assume that the so-called data association problem has been solved. Data association refers to the problem of identifying a feature in one observation as the very same feature found in another observation. The data association problem is crucial to SLAM and thus must not be neglected. Let d_i represent the identities of the features perceived in observation z_i. Considering the data association problem, the full SLAM posterior turns into

p(x_{1:t}, m, d_{1:t} \mid z_{1:t}, u_{1:t}).   (4.3)

Integrating out all past robot poses x_{1:t-1} and summing over all past data associations d_{1:t-1} yields the online SLAM posterior

p(x_t, m, d_t \mid z_{1:t}, u_{1:t}, d_{1:t}) = \int \int \cdots \int \sum_{d_1} \sum_{d_2} \cdots \sum_{d_{t-1}} p(x_{1:t}, m, d_{1:t} \mid z_{1:t}, u_{1:t}) \, dx_1 \, dx_2 \cdots dx_{t-1}.   (4.4)

Figure 4.1: Bayesian network representation of the SLAM problem. In SLAM, the control commands u_{1:t} and the observations z_{1:t} are given, the robot poses x_{1:t} and the map m are unknown. The goal is to compute the map m. (a) Online SLAM aims to recover the current robot pose x_t only. (b) In contrast, full SLAM aims to recover the entire trajectory of the robot, i. e., all robot poses x_{1:t}. 4.2 Computing the Full SLAM Posterior In Section 4.1, we described the SLAM problem as finding the posterior probability distribution of a map. We furthermore introduced the online and the full SLAM problem. Graph-based SLAM aims to solve the latter. Hence, in this section, we will derive a closed form expression to calculate the corresponding posterior distribution. According to Eq. (4.3), the full SLAM posterior assuming known data associations is given by

p(x_{1:t}, m \mid z_{1:t}, u_{1:t}, d_{1:t}).   (4.5)

Using Bayes’ theorem, we obtain

p(x_{1:t}, m \mid z_{1:t}, u_{1:t}, d_{1:t}) = \eta \, p(z_t \mid x_{1:t}, m, z_{1:t-1}, u_{1:t}, d_{1:t}) \, p(x_{1:t}, m \mid z_{1:t-1}, u_{1:t}, d_{1:t}),   (4.6)

where \eta is the normalizer resulting from Bayes’ theorem, which indicates that the final result is normalized to 1. We proceed by simplifying the two probabilities on the right. By exploiting the assumption that our state is complete, we have

p(z_t \mid x_{1:t}, m, z_{1:t-1}, u_{1:t}, d_{1:t}) = p(z_t \mid x_{1:t}, m, d_t).   (4.7)

Partitioning x_{1:t}, m into x_t and x_{1:t-1}, m, and then removing conditionals which are irrelevant under the Markov assumption leads to

p(x_{1:t}, m \mid z_{1:t-1}, u_{1:t}, d_{1:t}) = p(x_t \mid x_{1:t-1}, m, z_{1:t-1}, u_{1:t}, d_{1:t}) \, p(x_{1:t-1}, m \mid z_{1:t-1}, u_{1:t}, d_{1:t}) = p(x_t \mid x_{t-1}, u_t) \, p(x_{1:t-1}, m \mid z_{1:t-1}, u_{1:t-1}, d_{1:t-1}).   (4.8)

Combining the two intermediate results from Eq. (4.7) and Eq. (4.8) with Eq. (4.6), we obtain the recursive formula

p(x_{1:t}, m \mid z_{1:t}, u_{1:t}, d_{1:t}) = \eta \, p(z_t \mid x_t, m, d_t) \, p(x_t \mid x_{t-1}, u_t) \, p(x_{1:t-1}, m \mid z_{1:t-1}, u_{1:t-1}, d_{1:t-1}).   (4.9)

Induction over time t yields the closed form expression

p(x_{1:t}, m \mid z_{1:t}, u_{1:t}, d_{1:t}) = \eta \, p(x_1, m) \prod_t \left[ p(x_t \mid u_t, x_{t-1}) \, p(z_t \mid x_t, m, d_t) \right].   (4.10)

Eq. (3.4) from Section 3.2 states that the individual measurements are independent of each other, allowing us to write

p(x_{1:t}, m \mid z_{1:t}, u_{1:t}, d_{1:t}) = \eta \, p(x_1, m) \prod_t \left[ p(x_t \mid u_t, x_{t-1}) \prod_k p(z_t^k \mid x_t, m, d_t^k) \right].   (4.11)

And finally, we drop the joint prior on the map m and on the initial state x_1 for simplicity, which results in

p(x_{1:t}, m \mid z_{1:t}, u_{1:t}, d_{1:t}) = \eta \prod_t \left[ p(x_t \mid u_t, x_{t-1}) \prod_k p(z_t^k \mid x_t, m, d_t^k) \right].   (4.12)

Note that the expression in Eq. (4.12) corresponds to the Bayesian network representations depicted in Figure 4.1. The term p(x_t \mid u_t, x_{t-1}) refers to the motion model, discussed in Section 3.1. The term p(z_t^k \mid x_t, m, d_t^k) corresponds to the measurement model from Section 3.2, assuming known correspondences. In Section 4.4, we will introduce an intuitive representation of the posterior distribution given in Eq. (4.12), namely the pose graph. 4.3 Loop Closing As already mentioned in Section 4.1, an accurate estimate of the current pose of the robot is required while building a map. Better pose estimates result in better maps. It is therefore desirable for the robot to maintain an accurate estimate about its current pose during the entire mapping process. However, as the robot travels through unknown terrain, it basically has to rely on dead reckoning or odometry measurements to update its pose estimate. The odometry estimate is typically affected by significant noise due to unpredictable wheel slippage, and the like.
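The way odometry noise compounds along the path can be illustrated by a short simulation in which a constant heading bias (a deliberately simplified, illustrative noise model) is injected at every pose update:

```python
import math

def dead_reckon(steps, step_length, heading_bias):
    """Integrate forward motion commands with a constant per-step
    heading bias (illustrative noise model) and return the final
    position error relative to the bias-free trajectory, which is a
    straight line along the x-axis."""
    x = y = theta = 0.0
    for _ in range(steps):
        theta += heading_bias          # error injected at every update
        x += step_length * math.cos(theta)
        y += step_length * math.sin(theta)
    true_x = steps * step_length       # ground truth: straight ahead
    return math.hypot(x - true_x, y)

# A bias of half a degree per step is barely visible locally, yet after
# 200 one-meter steps the accumulated position error is on the order of
# a hundred meters.
error = dead_reckon(steps=200, step_length=1.0, heading_bias=math.radians(0.5))
```

The simulation mirrors the statement above: the per-step orientation error is tiny, but because each subsequent translation is integrated in the wrong direction, the position error grows far faster than the heading error itself.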
Techniques which improve upon the estimate provided by odometry, such as incremental scan matching, can only mitigate the error induced with every pose update, but cannot prevent the error from accumulating along the path. Even a slight error in orientation can thereby lead to a vast position error later on. As a consequence, the robot becomes more and more uncertain about its pose. Fortunately, once the robot comes back to a previously observed place, whose position is known to the robot with relative certainty, it has the chance to compensate for the pose error. That is to say, if the robot is now able to recognize the place as the very same place it visited before, it can refine its current pose estimate with respect to that place, creating a geometric relationship between the corresponding poses. In the context of the SLAM problem, place recognition is therefore referred to as loop closing. As we will see in Section 4.6, each loop closure allows the robot to retrospectively refine its trajectory estimate, and hence the map. The difficulty in closing loops is to find the correct data associations when re-entering known terrain. As already mentioned in Section 4.1, data association refers to the problem of identifying a feature in one observation as the very same feature found in another observation. In feature-based approaches, closing loops therefore corresponds to associating landmarks with previously observed landmarks. In grid-based approaches, a loop closure typically results from successfully matching two or more sensor scans. Scan matching is the task of finding the transformation that best aligns two or more scans. We refer the reader to Censi (2006) and Olson (2009) for more details on scan matching. On the one hand, if the robot fails to detect a loop closure, it potentially re-maps previously visited areas in the wrong global location, leading to an inconsistent map.
On the other hand, if the robot closes an incorrect loop, the resulting wrong refinement of the robot trajectory typically has catastrophic effects on the map. Hence, a robust approach to loop closing is an essential component of any successful mapping system. However, loop closing is challenging. On the one hand, sensors such as laser scanners provide a rather limited view of the environment. As a consequence, entirely different places may look alike in the scans. On the other hand, algorithms need to discriminate between different places which actually are very similar. Furthermore, environments that have slightly changed due to moving objects should still be recognized. In addition to that, algorithms need to be robust to spurious measurements, which are caused by noise. Unfortunately, for the reasons mentioned above, feature matching is typically slow and error prone. Most approaches therefore do not perform feature matching against all observations. They rather restrict the set of observations to those for which feature matching seems likely to succeed. Hence, the algorithms only consider those observations which were recorded close to the current pose. They therefore identify all former poses which are within the ellipsoid of the current pose uncertainty (e. g., the 95 % confidence interval). Figure 4.2: A pose graph. The triangles are the nodes in the pose graph and represent the robot poses discretized over time. The edges in the pose graph represent spatial constraints between the robot poses arising from odometry readings (shown in blue) and from scan matching (shown in red). Each edge describes a likelihood as a function of the deviation from the preferred mutual arrangement of the two nodes it joins. The most likely positions of the nodes are therefore implicitly given by the edges.
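The edge likelihood just described can be made concrete for planar poses: the cost of an edge is the squared Mahalanobis distance between the measured relative pose and the relative pose predicted from the current node positions. The representation below (tuples (x, y, θ) and a 3×3 information matrix given as nested lists) is an illustrative sketch, not the data structures used in this thesis:

```python
import math

def pose_diff(xi, xj):
    """Relative pose of xj expressed in the frame of xi (2D: x, y, theta)."""
    dx, dy = xj[0] - xi[0], xj[1] - xi[1]
    c, s = math.cos(xi[2]), math.sin(xi[2])
    dtheta = math.atan2(math.sin(xj[2] - xi[2]), math.cos(xj[2] - xi[2]))
    return (c * dx + s * dy, -s * dx + c * dy, dtheta)

def edge_cost(xi, xj, z, info):
    """Cost (negative log likelihood up to a constant) of an edge:
    squared Mahalanobis distance between predicted and measured
    relative pose. `info` is the 3x3 information matrix of the edge."""
    pred = pose_diff(xi, xj)
    e = [pred[k] - z[k] for k in range(3)]
    e[2] = math.atan2(math.sin(e[2]), math.cos(e[2]))  # normalize angle
    return sum(e[r] * info[r][c] * e[c] for r in range(3) for c in range(3))
```

An edge whose two nodes sit exactly in the measured mutual arrangement has cost zero; any deviation is penalized quadratically, weighted by the information matrix of the measurement.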
In graph-based approaches to SLAM, this set of observations can be found by performing a Dijkstra projection (Dijkstra, 1959) of the node covariances beginning at the current node. Tipaldi et al. (2007) describe how to efficiently compute the marginal covariances in this way.

4.4 The Pose Graph Metaphor

In Section 3.5, we introduced the graph as an abstract representation of a set of objects where some pairs of objects are linked by edges. Graph-based SLAM models the mapping problem by means of a graph, a so-called pose graph. This section therefore introduces the concept of pose graphs. It draws a distinction between pose graphs and pose/feature graphs.

4.4.1 Pose Graphs

Graph-based SLAM encodes the experience of a mobile robot in a graph, a so-called pose graph. The nodes in a pose graph represent poses of the robot along its trajectory, discretized over time. The edges in a pose graph represent information about the geometric relation of the corresponding robot poses. Together with the sensor model, each edge describes a likelihood as a function of the deviation from the preferred mutual arrangement of the two nodes it joins. The negative log likelihood is often referred to as the cost of an edge. The most likely positions of the nodes are therefore implicitly given by the edges. We will be concerned with determining these positions in the following sections.

The spatial constraints in a pose graph arise from odometry measurements (see Section 3.1) and from loop closures (see Section 4.3).

Figure 4.3: A pose/feature graph. The triangles and the stars are both nodes in the pose/feature graph. The triangles represent the robot poses discretized over time. The stars represent map features. The edges in the graph represent spatial constraints between the nodes arising from odometry readings (shown in blue) and from feature observations (shown in red).
Each edge describes a likelihood as a function of the deviation from the preferred mutual arrangement of the two nodes it joins. The most likely positions of the nodes are therefore implicitly given by the edges and their cost functions.

Constraints can express any type of information, such as a range constraint, a bearing constraint, or a full rigid-body transformation. The set of spatial constraints constitutes the full SLAM posterior probability distribution. Figure 4.2 exemplifies a simple pose graph. The edges arising from odometry measurements and from loop closures are depicted in blue and red, respectively.

As shown by Montemerlo et al. (2002), once the poses of the robot have been recovered, the map can be computed easily based on the original feature-to-pose information via techniques known as mapping with known poses (see Section 3.4). Hence, the pose graph is an implicit representation of the map itself.

4.4.2 Pose/Feature Graphs

According to Olson (2008), a graph whose nodes represent not only robot poses but also map features is called a pose/feature graph. Hence, the spatial constraints in a pose/feature graph arise from odometry measurements and from feature observations (see Section 3.2). Pose graphs and pose/feature graphs are closely related to each other since marginalizing out the features of a pose/feature graph yields its equivalent pose graph. Figure 4.3 exemplifies a simple pose/feature graph. The edges arising from odometry measurements and from feature observations are depicted in blue and red, respectively. In this thesis, we use plain pose graphs since our approach does not rely on features in the environment, but directly computes grid maps from the laser range data.

4.5 SLAM as a Least Squares Problem

In Section 4.2, we derived a closed-form expression of the full SLAM posterior probability distribution function. Our derivation led to Eq. (4.12).
In practice, however, computing the entire posterior distribution is not feasible when dealing with high-dimensional maps. Graph-based SLAM approaches therefore only compute the mode of this posterior. In other words, they compute the most likely map given the measurements. The maximum likelihood (ML) estimate is given by

\[
x_{1:t}^{*}, m^{*} = \operatorname*{argmax}_{x_{1:t},\, m} \; p(x_{1:t}, m \mid z_{1:t}, u_{1:t}). \tag{4.13}
\]

Unfortunately, computing the solution of Eq. (4.13) is still not tractable in general. Therefore, graph-based approaches to SLAM further reduce the computational burden by assuming independent Gaussian noise. More precisely, the approaches assume the motion and the measurement noise to be independent and normally distributed. As we will see shortly, the independent Gaussian noise assumption leads to a convenient set of quadratic equations, which can be solved efficiently. A Gaussian motion model is given by

\[
x_t = g(u_t, x_{t-1}) + w_t, \tag{4.14}
\]

where the deterministic function g describes the kinematic properties of the robot, and the Gaussian random variable w_t \sim \mathcal{N}(0, R_t) models the white motion noise with zero mean and covariance matrix R_t. This leads to the posterior probability distribution

\[
p(x_t \mid u_t, x_{t-1}) \propto \exp\Bigl( -\tfrac{1}{2} \bigl( x_t - g(u_t, x_{t-1}) \bigr)^T R_t^{-1} \bigl( x_t - g(u_t, x_{t-1}) \bigr) \Bigr). \tag{4.15}
\]

Similarly, the measurement model is given by

\[
z_t^k = h(x_t, m, d_t^k, k) + v_t^k, \tag{4.16}
\]

where h is the measurement function, and v_t^k \sim \mathcal{N}(0, Q_t^k) is a Gaussian random variable modeling the white measurement noise with zero mean and covariance Q_t^k. This leads to the posterior

\[
p(z_t^k \mid x_t, m, d_t^k) \propto \exp\Bigl( -\tfrac{1}{2} \bigl( z_t^k - h(x_t, m, d_t^k, k) \bigr)^T (Q_t^k)^{-1} \bigl( z_t^k - h(x_t, m, d_t^k, k) \bigr) \Bigr). \tag{4.17}
\]

Representing probabilities in logarithmic form, the maximum likelihood estimate according to Eq.
(4.13) becomes

\[
\begin{aligned}
x_{1:t}^{*}, m^{*} &= \operatorname*{argmax}_{x_{1:t},\, m} \; p(x_{1:t}, m \mid z_{1:t}, u_{1:t}) \\
&= \operatorname*{argmin}_{x_{1:t},\, m} \; -\log p(x_{1:t}, m \mid z_{1:t}, u_{1:t}) \\
&= \operatorname*{argmin}_{x_{1:t},\, m} \; -\log \Bigl[ \prod_t p(x_t \mid u_t, x_{t-1}) \prod_t \prod_k p(z_t^k \mid x_t, m, d_t^k) \Bigr] \\
&= \operatorname*{argmin}_{x_{1:t},\, m} \Bigl[ \sum_t -\log p(x_t \mid u_t, x_{t-1}) + \sum_t \sum_k -\log p(z_t^k \mid x_t, m, d_t^k) \Bigr]. \tag{4.18}
\end{aligned}
\]

Assuming the Gaussian motion model according to Eq. (4.15) and the Gaussian measurement model according to Eq. (4.17), the maximum likelihood estimate finally turns into the nonlinear least squares problem

\[
\begin{aligned}
x_{1:t}^{*}, m^{*} = \operatorname*{argmin}_{x_{1:t},\, m} \Bigl[ &\sum_t \bigl( x_t - g(u_t, x_{t-1}) \bigr)^T R_t^{-1} \bigl( x_t - g(u_t, x_{t-1}) \bigr) \\
+ &\sum_t \sum_k \bigl( z_t^k - h(x_t, m, d_t^k, k) \bigr)^T (Q_t^k)^{-1} \bigl( z_t^k - h(x_t, m, d_t^k, k) \bigr) \Bigr]. \tag{4.19}
\end{aligned}
\]

Estimating the most likely robot trajectory and the most likely map corresponds to solving Eq. (4.19). In the next section, we will discuss the task of solving this optimization problem. Note that the motion model expressed in Eq. (4.15) and the measurement model expressed in Eq. (4.17) form the edges in the pose graph, which we introduced in the last section. The cost functions associated with the edges are expressed in terms of the squared Mahalanobis distance given the covariances R_t and Q_t^k, respectively. The costs are thus quadratic in the deviation from the preferred configuration of the corresponding two nodes. Furthermore, since all measurements are considered independent, multiple edges linking the same pair of nodes can be fused.

4.6 Pose Graph Optimization

In Section 4.4, we introduced the pose graph and the pose/feature graph. They both intuitively encode the full SLAM posterior distribution. Assuming independent Gaussian noise, we formulated the mapping problem as a least squares problem in Section 4.5. Pose graph optimization is the task of computing the positions of all nodes in the pose graph such that the edges have minimum cost.
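Since the cost of an edge is a squared Mahalanobis distance, evaluating it is a one-liner. The following minimal sketch (the function name `edge_cost` and the numeric values are ours, not from the thesis) evaluates the cost of a single motion edge for a given residual and noise covariance:

```python
import numpy as np

# Illustrative sketch: the cost of a single pose graph edge is the squared
# Mahalanobis distance e^T * Omega * e, where e is the deviation from the
# preferred arrangement and Omega = R^{-1} is the information matrix.
def edge_cost(e, Omega):
    e = np.asarray(e, dtype=float)
    return float(e @ Omega @ e)

R = np.diag([0.1, 0.4])               # assumed motion noise covariance R_t
e = np.array([0.2, -0.1])             # residual x_t - g(u_t, x_{t-1})
cost = edge_cost(e, np.linalg.inv(R))
```

A perfect match (`e = 0`) has zero cost, and the cost grows quadratically with the deviation, exactly as in Eq. (4.19).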
In other words, pose graph optimization solves the least squares problem. Pose graph optimization is also referred to as the SLAM back-end. Each edge in the pose graph describes a cost as a quadratic function of the deviation from the preferred relative arrangement of the two nodes it joins.

In the context of graph-based SLAM, a loop closure results in an edge in the pose graph which relates the current robot pose to a former robot pose, making the pose graph cyclic. In general, however, these loop closure edges contradict the pose estimates resulting from plain dead reckoning. The reason is that dead reckoning estimates are typically rather poor due to measurement errors like wheel slippage, which accumulate along the path. The optimization aims to arrange the nodes in the pose graph such that the total cost associated with all edges is minimized. Figure 4.4 illustrates the effect of the pose graph optimization after a successful loop closure.

Optimizing the pose graph is important because the maximum likelihood configuration of the nodes corresponds to the best map of the environment. However, there are four main reasons why pose graph optimization is a challenging endeavor:

• The state space can be very large.
• The constraints in the pose graph are typically nonlinear.
• The likelihood function has many local minima.
• The initial guess can be very poor.

The state space can be very large since it typically grows with the length of the robot trajectory. The constraints in the pose graph are typically nonlinear and "wrap around" at 360° because they include the rotational component of the robot pose, i.e., the orientation of the robot. Hence, the resulting cost surface is complicated and rarely convex. With its many local minima, the corresponding optimization problem is difficult to solve. Furthermore, the initial guess tends to be rather poor.
To obtain a first initial guess of the robot poses, algorithms typically set the first pose to zero and compute the subsequent poses by simply concatenating the motion model p(x_t | u_t, x_{t-1}). As a result, the optimal configuration of the pose graph can substantially disagree with its initial configuration. After all, this discrepancy is the main motivation for optimizing the pose graph.

Existing approaches to tackle the least squares optimization problem formulated in Eq. (4.19) are based on

• LU decomposition (applied by Lu and Milios, 1997),
• Gauss-Seidel relaxation (applied by Duckett et al., 2000),
• conjugate gradients (applied by Konolige, 2004; Montemerlo and Thrun, 2004),
• multi-level relaxation (applied by Frese et al., 2005),
• stochastic gradient descent (applied by Grisetti et al., 2007a; Olson et al., 2006),
• sparse Cholesky decomposition (applied by Dellaert and Kaess, 2006),
• QR matrix factorization (applied by Kaess et al., 2007).

In this thesis, we use the approach proposed by Grisetti et al. (2007a). It is a refinement of Olson's algorithm (Olson et al., 2006), which is based on stochastic gradient descent (Robbins and Monro, 1951). It differs from Olson's approach in an improved node parameterization, as well as in an optimized update rule in the stochastic gradient descent framework.

Figure 4.4: Pose graph optimization after a successful loop closure. (a) The robot travels in a loop and returns to a previously visited place. Its dead reckoning pose estimate is rather poor. (b) The robot recognizes the previously seen place and adds a loop closure constraint to the pose graph. (c) The high error in the loop closure constraint is reduced by adding a small amount of error to other constraints. (d) The pose graph optimization has minimized the total error in the pose graph, and the loop has been closed.
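The effect of a loop closure on the least squares solution can be illustrated with a deliberately tiny, linear one-dimensional example. This is only a sketch of the generic least squares formulation, not the stochastic gradient descent back-end used in this thesis; all numeric values are invented for illustration:

```python
import numpy as np

# Minimal 1D pose-graph least squares problem. Pose x1 is anchored at 0;
# x2 and x3 are free. Constraints: odometry x2 - x1 = 1.0 and x3 - x2 = 1.0,
# plus a loop-closure-style constraint x3 - x1 = 1.8 that contradicts
# dead reckoning (which would put x3 at 2.0).
A = np.array([[1.0, 0.0],    # x2 - x1 = 1.0  (x1 fixed to 0)
              [-1.0, 1.0],   # x3 - x2 = 1.0
              [0.0, 1.0]])   # x3 - x1 = 1.8
b = np.array([1.0, 1.0, 1.8])
x, *_ = np.linalg.lstsq(A, b, rcond=None)
# The optimizer spreads the 0.2 discrepancy over all three constraints
# instead of blaming it on the loop closure alone.
```

Solving the normal equations by hand gives x2 = 2.8/3 ≈ 0.933 and x3 = 5.6/3 ≈ 1.867: every constraint absorbs a small share of the error, which is precisely the behavior shown in Figure 4.4 (c) and (d).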
5 Our Approach to Life-long Map Learning

The preceding chapter provided an introduction to the graph-based formulation of the simultaneous localization and mapping problem. Most graph-based approaches constantly add new nodes to the pose graph even when the robot is moving through already mapped terrain. Hence, the complexity of these approaches scales with the length of the robot's trajectory. A robot which has to constantly update its belief about the environment throughout its entire life cycle, however, will inevitably run out of resources in this way. This chapter is devoted to presenting our approach to life-long map learning using the graph-based formulation of the simultaneous localization and mapping problem. Our approach aims to remove those nodes from the pose graph whose observations do not reduce the uncertainty in the belief of the robot. As a result, our mapping approach scales with the size of the environment, and not with the length of the robot's trajectory.

5.1 Information-Theoretic Node Reduction

The key idea of our sparsification approach is to discard those observations which do not reduce the uncertainty in the belief of the robot. The belief reflects the internal knowledge of the robot about the state of the environment and is thus given by

\[
\mathrm{bel}(x_{1:t}, m) = p(x_{1:t}, m \mid z_{1:t}, u_{1:t}, d_{1:t}). \tag{5.1}
\]

We can factor the belief and obtain

\[
\begin{aligned}
\mathrm{bel}(x_{1:t}, m) &= p(x_{1:t}, m \mid z_{1:t}, u_{1:t}, d_{1:t}) \\
&= p(x_{1:t} \mid z_{1:t}, u_{1:t}, d_{1:t}) \; p(m \mid x_{1:t}, z_{1:t}, u_{1:t}, d_{1:t}). \tag{5.2}
\end{aligned}
\]

In Section 4.5, we already mentioned that graph-based approaches to SLAM only compute the maximum likelihood estimate of the robot's poses. It is given by

\[
x_{1:t}^{*} = \operatorname*{argmax}_{x_{1:t}} \; p(x_{1:t} \mid z_{1:t}, u_{1:t}, d_{1:t}). \tag{5.3}
\]

Thus, p(x_{1:t} \mid z_{1:t}, u_{1:t}, d_{1:t}) from Eq. (5.2) turns into a Dirac delta distribution at x_{1:t}^{*}. In our case, we therefore only need to consider the belief about the map. It is given by

\[
\mathrm{bel}(m) = p(m \mid x_{1:t}^{*}, z_{1:t}, u_{1:t}, d_{1:t}). \tag{5.4}
\]
Figure 5.1: Binary entropy function. The plot shows the entropy H of a grid cell c as a function of the probability p(c) that the cell is occupied. Entropy is a measure of the uncertainty associated with a random variable. The entropy of a grid cell attains its maximum when the state of the cell is entirely unknown. The probability of an entirely unknown cell is p(c) = 0.5.

For simplicity of notation, we will drop the conditionals in the remainder of this chapter, simply writing p(m) instead of p(m \mid x_{1:t}^{*}, z_{1:t}, u_{1:t}, d_{1:t}). We can measure the uncertainty of the robot's belief in terms of information entropy (Shannon, 1948).

Definition 1 The information entropy H of a probability distribution p(x) is the expectation of its information content in bits. It can be defined as

\[
H(p(x)) = E[-\log_2 p(x)]. \tag{5.5}
\]

Hence, in the case of a discrete probability distribution, the entropy is given by

\[
H(p(x)) = -\sum_x p(x) \log_2 p(x). \tag{5.6}
\]

In the continuous case, the entropy turns into

\[
H(p(x)) = -\int_x p(x) \log_2 p(x) \, dx. \tag{5.7}
\]

In the following, we will discuss how to compute H(p(m)). Our approach uses a grid map m to model the environment. We already introduced grid maps in Section 3.3. Remember that grid maps partition the environment into a grid of rectangular cells, the so-called grid cells. According to the definition of entropy given in Eq. (5.5), the entropy of a single grid cell c is given by

\[
H(p(c)) = -\bigl( p(c) \log p(c) + p(\neg c) \log p(\neg c) \bigr), \tag{5.8}
\]

where p(c) is the probability that cell c is occupied, and p(\neg c) = 1 - p(c) is the probability that cell c is free. Consider Figure 5.1. It depicts the entropy of a grid cell as a function of the probability that the cell is occupied. Note that the entropy attains its maximum when the state of the cell is entirely unknown. The probability of an entirely unknown cell is p(c) = 0.5.
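The binary entropy of a grid cell from Eq. (5.8) can be written down directly. A minimal sketch (the function name `cell_entropy` is ours), taking 0 log 0 as 0 by convention:

```python
import math

# Binary entropy of a grid cell c with occupancy probability p, as in
# Eq. (5.8); 0 * log2(0) is taken to be 0 by the usual convention.
def cell_entropy(p):
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))
```

As Figure 5.1 shows, the function is symmetric about p(c) = 0.5, where an entirely unknown cell contributes the maximum entropy of one bit, and it drops to zero for cells known to be free or occupied.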
In Section 3.4, we stated that the grid cells of a grid map are independent of each other, allowing us to write

\[
p(m) = \prod_{c \in m} p(c). \tag{5.9}
\]

Hence, for a grid map, the entropy is computed by summing over the entropy values of the individual grid cells, giving

\[
\begin{aligned}
H(p(m)) &= \sum_{c \in m} H(p(c)) \\
&= -\sum_{c \in m} \bigl( p(c) \log p(c) + p(\neg c) \log p(\neg c) \bigr). \tag{5.10}
\end{aligned}
\]

Definition 2 The information gain I of an observation z_i is the reduction of entropy in the belief of the robot caused by the observation. It can be defined as

\[
I(z_i) = H\bigl( p(m \mid z_{1:t} \setminus \{z_i\}) \bigr) - H\bigl( p(m \mid z_{1:t}) \bigr), \tag{5.11}
\]

where z_{1:t} refers to the set of all observations in the graph. According to this definition, the information gain of an observation has the following properties:

• Observations which explore unknown territory decrease the uncertainty in the belief of the robot. Hence, they will exhibit a positive information gain.
• Observations which agree with former measurements do not change the map and thus do not change the uncertainty in the belief of the robot. Their information gain will therefore be zero or close to zero.
• Observations which disagree with former measurements increase the uncertainty in the belief of the robot. Consequently, they will exhibit a negative information gain.

Hence, if I(z_i) > 0, observation z_i contains valuable, new information about the environment and should be preserved. If I(z_i) ≤ 0, observation z_i does not reduce or even increases the uncertainty in the belief of the robot and should thus be discarded. As already mentioned above, the goal of our approach is to discard all observations z_i for which I(z_i) ≤ 0. Note that discarding an observation typically affects the information gain values of other observations, particularly of those observations which were recorded nearby. To identify all observations which can be discarded, our sparsification approach therefore iterates the following steps:

1. Compute the grid map m given all observations z_{1:t} in the graph.
2.
Compute the information gain I(z_i) for all observations z_i ∈ z_{1:t}.
3. Determine the observation z_k with the lowest information gain I(z_k).
4. If I(z_k) > 0, terminate.
5. If I(z_k) ≤ 0, discard z_k and proceed with step 1.

Our approach typically operates online, which means that it prunes the pose graph while the SLAM approach is constructing it. Since the perception of the robot is typically limited to its vicinity, our implementation only considers those observations which were recorded close to the current location of the robot, and not all observations z_{1:t}. These observations can quickly be determined by performing a Dijkstra expansion (Dijkstra, 1959) on the pose graph, starting from the current pose of the robot. Furthermore, our implementation does not regenerate the entire grid map every time the graph changes. Instead, the grid map is implemented such that it maintains counters in each grid cell to keep track of the occupancy probability resulting from the observations. In this way, the grid map can efficiently reflect changes in the graph, such as the removal of a node or a change in the position of a node.

In this section, we described how to identify observations which can be discarded without losing relevant information about the environment. However, removing the corresponding nodes from the pose graph is challenging. A naïve technique introduces a large number of new constraints and thus destroys the sparsity pattern of the pose graph. In the following sections, we will introduce an efficient approach to removing nodes from the pose graph, while preserving sparsity and most of the information encoded in the incident edges.

5.2 Pose Graphs as Gaussian Markov Random Fields

In Section 4.4, we introduced the pose graph as an intuitive representation of the robot's belief represented in information form. An advantageous property of the pose graph is its interpretation as a Gaussian Markov random field (GMRF).
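The greedy loop above can be sketched in a few lines. The following is a toy illustration only: `build_map` is a hypothetical stand-in for the counter-based grid map of our implementation, each "observation" is simply the set of cells it marks as occupied with p = 0.9, and unobserved cells stay unknown at p = 0.5.

```python
import math

def map_entropy(grid):
    """Grid map entropy as in Eq. (5.10); grid maps cell -> p(occupied)."""
    h = 0.0
    for p in grid.values():
        if 0.0 < p < 1.0:
            h -= p * math.log2(p) + (1 - p) * math.log2(1 - p)
    return h

def information_gain(z, observations, build_map):
    """Eq. (5.11): entropy without z minus entropy with all observations."""
    reduced = build_map([o for o in observations if o is not z])
    return map_entropy(reduced) - map_entropy(build_map(observations))

def prune(observations, build_map):
    """Greedy node reduction, steps 1-5 above."""
    obs = list(observations)
    while obs:
        gains = [(information_gain(z, obs, build_map), i)
                 for i, z in enumerate(obs)]
        gain, idx = min(gains)
        if gain > 0.0:        # every remaining observation is informative
            break
        del obs[idx]          # discard the least informative observation
    return obs

# Hypothetical 4-cell world: two observations cover the same cells {0, 1},
# a third covers cell {2}; cell 3 remains unobserved.
def build_map(obs_list):
    covered = set().union(*obs_list) if obs_list else set()
    return {c: (0.9 if c in covered else 0.5) for c in range(4)}

observations = [frozenset({0, 1}), frozenset({0, 1}), frozenset({2})]
kept = prune(observations, build_map)   # one redundant observation is dropped
```

One of the two redundant observations has zero information gain and is discarded; the remaining observations still cover every observed cell, mirroring the termination condition in step 4.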
Markov random fields (Jordan, 1998; Kindermann and Snell, 1980) are useful to analyze the conditional independence structure of a probability distribution. Markov random fields will help us understand the structural changes in the pose graph that take place when we remove nodes. This section therefore briefly describes the Markov random field framework.

Definition 3 A potential function is any function whose range is the nonnegative real numbers.

Definition 4 A Markov random field is given by a graph G = (V, Ψ), where V is a set of random variables x_1, \dots, x_n, and Ψ is a set of potential functions over subsets of V. The joint distribution represented by G is

\[
p_G(V) = \frac{1}{Z} \prod_{\psi \in \Psi} \psi(V_\psi), \tag{5.12}
\]

where V_\psi \subseteq V = \{x_1, \dots, x_n\} is the domain of \psi, and

\[
Z = \int_{x_1} \dots \int_{x_n} \prod_{\psi \in \Psi} \psi(V_\psi) \, dx_1 \dots dx_n. \tag{5.13}
\]

Figure 5.2: Conditional independence properties in a Markov random field. Every path from any node in set A to any node in set B passes through at least one node in set S. As a result, the conditional independence property A ⊥⊥ B | S is valid for any probability distribution described by this graph.

Consider Figure 5.2. In a Markov random field, any two subsets A and B of nodes are conditionally independent given a separating subset S of nodes if every path from any node in set A to any node in set B passes through at least one node in set S. Hence, any two variables which are not directly connected by a link are conditionally independent given all other variables. Furthermore, a variable is conditionally independent of all other variables given its Markov blanket, i.e., the set of its neighbors.

Gaussian Markov random fields are a subset of Markov random fields suitable for expressing multivariate Gaussian distributions, like the belief of the robot. Translating a pose graph into a Gaussian Markov random field given by the graph G = (V, Ψ) is straightforward.
The nodes in the pose graph correspond to the random variables in V. All the potential functions in Ψ are either unary or binary. The unary potentials are typically set to unity, except the one for the initial pose, which is bound to the origin. The likelihood functions associated with the edges of the pose graph correspond to the binary potentials in Ψ. A missing edge between x_i and x_j implies conditional independence between x_i and x_j. In this case, \psi_{ij}(x_i, x_j) is unity. Hence, the belief of the robot is given by

\[
p(x_1, \dots, x_t) = \frac{1}{Z} \prod_{i=1}^{t} \psi_i(x_i) \prod_{j=i+1}^{t} \psi_{i,j}(x_i, x_j). \tag{5.14}
\]

5.3 Marginalization Ceases Sparsity

Representing the belief of the robot using a Gaussian Markov random field helps us understand the structural changes in the pose graph that take place when we remove a node. In the Markov random field framework, removing a node from the pose graph translates to eliminating a variable from the corresponding probability distribution. The process of eliminating a variable from a probability distribution is called marginalization.

Suppose the belief of the robot is given by the joint Gaussian probability distribution p(α, β) over two stacked vectors α and β which together comprise the poses of the robot. The density p(α, β) can be expressed in covariance form as

\[
p(\alpha, \beta) = \mathcal{N}\left( \begin{bmatrix} \mu_\alpha \\ \mu_\beta \end{bmatrix}, \begin{bmatrix} \Sigma_{\alpha\alpha} & \Sigma_{\alpha\beta} \\ \Sigma_{\beta\alpha} & \Sigma_{\beta\beta} \end{bmatrix} \right), \tag{5.15}
\]

and in information form as

\[
p(\alpha, \beta) = \mathcal{N}^{-1}\left( \begin{bmatrix} \xi_\alpha \\ \xi_\beta \end{bmatrix}, \begin{bmatrix} \Omega_{\alpha\alpha} & \Omega_{\alpha\beta} \\ \Omega_{\beta\alpha} & \Omega_{\beta\beta} \end{bmatrix} \right), \tag{5.16}
\]

where

\[
\begin{bmatrix} \xi_\alpha \\ \xi_\beta \end{bmatrix} = \begin{bmatrix} \Sigma_{\alpha\alpha} & \Sigma_{\alpha\beta} \\ \Sigma_{\beta\alpha} & \Sigma_{\beta\beta} \end{bmatrix}^{-1} \begin{bmatrix} \mu_\alpha \\ \mu_\beta \end{bmatrix} \tag{5.17}
\]

and

\[
\begin{bmatrix} \Omega_{\alpha\alpha} & \Omega_{\alpha\beta} \\ \Omega_{\beta\alpha} & \Omega_{\beta\beta} \end{bmatrix} = \begin{bmatrix} \Sigma_{\alpha\alpha} & \Sigma_{\alpha\beta} \\ \Sigma_{\beta\alpha} & \Sigma_{\beta\beta} \end{bmatrix}^{-1}. \tag{5.18}
\]

The main difference between the covariance form and the information form is the way they represent indirect relations between variables in a probability distribution. Whereas the covariance matrix encodes all indirect relations explicitly, the information matrix only represents direct relations. Marginalizing out β yields the marginal density

\[
p(\alpha) = \int p(\alpha, \beta) \, d\beta. \tag{5.19}
\]
Marginalization in covariance form corresponds to extracting the appropriate sub-block from the covariance matrix and dropping the other variables. As a result, in covariance form, p(α) is given by

\[
p(\alpha) = \mathcal{N}(\mu_\alpha, \Sigma_{\alpha\alpha}). \tag{5.20}
\]

The pose graph, however, represents the belief of the robot in information form. In information form, marginalization is more complex because it requires computing the Schur complement over the remaining variables. In information form, p(α) is given by

\[
p(\alpha) = \mathcal{N}^{-1}(\xi, \Omega), \tag{5.21}
\]

where

\[
\xi = \xi_\alpha - \Omega_{\alpha\beta} \Omega_{\beta\beta}^{-1} \xi_\beta \tag{5.22}
\]

and

\[
\Omega = \Omega_{\alpha\alpha} - \Omega_{\alpha\beta} \Omega_{\beta\beta}^{-1} \Omega_{\beta\alpha}. \tag{5.23}
\]

Figure 5.3: Eliminating a pose from the belief of the robot. (a) Markov random field according to the measurements, where the bold, orange node x_9 is slated for removal. (b) The corresponding information matrix. Gray shades indicate nonzero values, and white indicates exactly zero. (c) Resulting Markov random field after marginalizing out node x_9. The former neighbors of x_9 form an elimination clique in the new graph. (d) Marginalization causes the information matrix to become dense.

The Schur complement in Eq. (5.23) triggers interesting changes in the conditional independence structure of the probability distribution. All entries of the matrix outer product \Omega_{\alpha\beta} \Omega_{\beta\beta}^{-1} \Omega_{\beta\alpha} which are associated with variables directly related to β are nonzero. As a result, the information matrix Ω of the marginal density p(α) is filled in at the corresponding entries and becomes dense.

We can intuitively understand these changes in the dependency structure of the robot's belief by visual inspection of the associated Gaussian Markov random field. Consider the example depicted in Figure 5.3.
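The fill-in caused by Eqs. (5.22)-(5.23) is easy to reproduce numerically. The following sketch (all matrix values are invented for illustration) marginalizes the middle variable out of a three-variable chain in information form: x_1 and x_3 are not directly linked before elimination, but the Schur complement creates a direct link between them.

```python
import numpy as np

# Marginalize the variables indexed by `drop` out of an information-form
# Gaussian (Omega, xi) via the Schur complement, Eqs. (5.22)-(5.23).
def marginalize(Omega, xi, keep, drop):
    O_aa = Omega[np.ix_(keep, keep)]
    O_ab = Omega[np.ix_(keep, drop)]
    O_ba = Omega[np.ix_(drop, keep)]
    O_bb_inv = np.linalg.inv(Omega[np.ix_(drop, drop)])
    Omega_new = O_aa - O_ab @ O_bb_inv @ O_ba      # Eq. (5.23)
    xi_new = xi[keep] - O_ab @ O_bb_inv @ xi[drop]  # Eq. (5.22)
    return Omega_new, xi_new

# Chain x1 - x2 - x3: Omega[0, 2] == 0, i.e. x1 and x3 are conditionally
# independent given x2.
Omega = np.array([[2.0, -1.0, 0.0],
                  [-1.0, 2.0, -1.0],
                  [0.0, -1.0, 2.0]])
xi = np.array([0.5, 0.0, 0.5])
Om, xin = marginalize(Omega, xi, keep=[0, 2], drop=[1])
# Om[0, 1] is now nonzero: the fill-in created by eliminating x2.
```

This is exactly the mechanism behind Figure 5.3: the formerly zero entry between the neighbors of the eliminated node becomes nonzero, so the neighbors form an elimination clique.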
The belief of the robot is represented as a Markov random field and its corresponding information matrix in Figure 5.3 (a) and Figure 5.3 (b), respectively. Each measurement imposes an uncertain constraint between the corresponding poses of the robot, represented by an edge in the graph and by a nonzero entry in the information matrix. The reader should take a moment to study the correspondence between the graph and its information matrix.

To remove a node from the graph, we must change the structure of the graph such that the resulting probability distribution is not altered with respect to the remaining nodes. In the example, we remove node x_9. All pairs of nodes in the Markov blanket of node x_9 are indirectly coupled via node x_9. Hence, when node x_9 is removed, these indirect relationships vanish and must therefore be explicitly represented by direct edges. Combining the information encoded in the edges can be done in a straightforward manner since the edges encode Gaussian constraints. Concatenating two constraints with means δ_{ji} and δ_{kj} and information matrices Ω_{ji} and Ω_{kj} is done by

\[
\delta_{ki} = \delta_{kj} \oplus \delta_{ji}, \tag{5.24}
\]
\[
\Omega_{ki} = \left( \Omega_{ji}^{-1} + \Omega_{kj}^{-1} \right)^{-1}. \tag{5.25}
\]

Similarly, merging two constraints with means \delta_{ji}^{(1)} and \delta_{ji}^{(2)} and information matrices \Omega_{ji}^{(1)} and \Omega_{ji}^{(2)} can be achieved by

\[
\delta_{ji} = \Omega_{ji}^{-1} \left( \Omega_{ji}^{(1)} \delta_{ji}^{(1)} + \Omega_{ji}^{(2)} \delta_{ji}^{(2)} \right), \tag{5.26}
\]
\[
\Omega_{ji} = \Omega_{ji}^{(1)} + \Omega_{ji}^{(2)}. \tag{5.27}
\]

The belief of the robot after eliminating node x_9 is illustrated in Figure 5.3 (c) and Figure 5.3 (d). Note that the nodes which were adjacent to node x_9 form an elimination clique in the resulting graph. Likewise, the information matrix fills in and becomes dense.

Eliminating a pose variable from the belief of the robot has important ramifications with respect to the independence structure of the belief. In particular, marginalization destroys some of the conditional independencies by introducing new constraints between all pairs of variables which are directly related to the eliminated variable.
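The constraint operations of Eqs. (5.24)-(5.27) can be sketched compactly. In this illustrative one-dimensional/linear sketch the pose composition operator ⊕ reduces to plain addition; the function names and numeric values are ours:

```python
import numpy as np

# Eq. (5.24)-(5.25): chaining two Gaussian constraints j->i and k->j.
# In this linear sketch, the composition operator "⊕" is plain addition;
# uncertainties add, so the information matrices combine harmonically.
def concatenate(delta_ji, Omega_ji, delta_kj, Omega_kj):
    delta_ki = delta_kj + delta_ji
    Omega_ki = np.linalg.inv(np.linalg.inv(Omega_ji) + np.linalg.inv(Omega_kj))
    return delta_ki, Omega_ki

# Eq. (5.26)-(5.27): fusing two parallel constraints between the same pair
# of nodes; information adds, and the means are fused information-weighted.
def merge(delta1, Omega1, delta2, Omega2):
    Omega = Omega1 + Omega2
    delta = np.linalg.solve(Omega, Omega1 @ delta1 + Omega2 @ delta2)
    return delta, Omega

delta_c, Omega_c = concatenate(np.array([1.0]), np.array([[2.0]]),
                               np.array([3.0]), np.array([[2.0]]))
delta_m, Omega_m = merge(np.array([1.0]), np.array([[2.0]]),
                         np.array([3.0]), np.array([[2.0]]))
```

Note how concatenation makes the result less certain (the information decreases), while merging makes it more certain (the information increases), which matches the intuition behind the two operations.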
These new constraints change the corresponding entries in the information matrix from potentially zero to a nonzero value. We say that the information matrix fills in. Unfortunately, the fill-in destroys the natural sparsity pattern of the information matrix. Representing a dense matrix typically requires significantly more memory than representing a sparse matrix. Furthermore, the density adversely affects the computational cost of subsequent marginalizations and of the underlying pose graph optimization.

In summary, while it is possible to eliminate nodes from the pose graph, in the worst case, the number of edges increases quadratically with the number of nodes. Hence, marginalization destroys the sparsity of the pose graph and thus introduces a complexity which is not suited for life-long map learning.

5.4 Approximate Marginalization

In the last section, we analyzed the changes in the independence structure of the robot's belief caused by node marginalization. In particular, we found that marginalization in information form requires computing the Schur complement over the remaining variables. The Schur complement introduces new constraints between all pairs of variables which are directly related to the marginalized variables. As a result, these variables form an elimination clique. For simplicity of notation, let x = (x_1, \dots, x_k)^T be the stacked vector of the set of variables in the elimination clique. Furthermore, let p(x) be the probability distribution which arises from the constraints within the clique. Finally, let Σ and Ω be the corresponding covariance and information matrix, respectively. Within the elimination clique, every variable is conditioned upon all other variables. Hence, the number of constraints grows quadratically with the number of variables.
As a result, the complexity becomes intractable as the connectivity of the graph, and thus the number of neighbors of the nodes which are eliminated, increases. A solution to this problem is to approximate the probability distribution p(x) by a probability distribution q(x) which has fewer conditional dependencies. This implies treating some pairs of variables which are dependent upon each other as if they were independent of each other. We can use the Kullback-Leibler divergence (Kullback and Leibler, 1951) to measure the "quality" of an approximation.

Definition 5 The Kullback-Leibler divergence is a non-symmetric measure of the difference between two probability distributions p and q. It is defined as

\[
D_{KL}(p, q) = \int_x p(x) \log \frac{p(x)}{q(x)} \, dx. \tag{5.28}
\]

The Kullback-Leibler divergence is zero in case the distributions p and q are identical, and strictly positive otherwise:

\[
D_{KL}(p, q) \begin{cases} = 0 & \text{if } p = q, \\ > 0 & \text{otherwise.} \end{cases} \tag{5.29}
\]

We reduce the number of edges in the elimination clique by approximating the probability distribution p(x) by a probability distribution q(x) such that each variable is conditioned upon only one of the other variables. Without loss of generality, we obtain

\[
p(x) = p(x_k) \prod_{i=1}^{k-1} p(x_i \mid x_{i+1}, \dots, x_k) \tag{5.30}
\]
\[
\approx p(x_k) \prod_{i=1}^{k-1} p(x_i \mid x_{i+1}) = q(x). \tag{5.31}
\]

Eq. (5.31) requires the graphical model of the probability distribution q(x) to be a tree. We therefore say that such a probability distribution is of first-order tree dependence. There are as many as k^{k-2} differently structured tree-shaped graphical models, but we have to commit to one of them. Fortunately, Chow and Liu (1968) presented an efficient algorithm to find the optimal distribution among them. This optimal distribution is therefore also referred to as the Chow-Liu tree approximation.
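For intuition, the discrete form of Eq. (5.28) is a one-liner. A small sketch (the distributions are invented examples), assuming both distributions are strictly positive over the same finite support:

```python
import math

# Discrete Kullback-Leibler divergence, Eq. (5.28); assumes p and q are
# strictly positive distributions over the same finite support.
def kl_divergence(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

p = [0.5, 0.5]
q = [0.9, 0.1]
d_pq = kl_divergence(p, q)   # strictly positive: q approximates p poorly
d_pp = kl_divergence(p, p)   # zero: identical distributions, Eq. (5.29)
```

The two properties stated in Eq. (5.29) are immediate: the divergence vanishes only for identical distributions, and any mismatch yields a strictly positive value.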
5.5 Chow-Liu Tree Approximation

The Chow-Liu algorithm (Chow and Liu, 1968) approximates a multidimensional probability distribution p(x) by a distribution q_{opt}(x) in which each variable is conditioned upon at most one of the other variables such that the Kullback-Leibler divergence D_{KL}(p, q) is minimized.

Definition 6 The mutual information I of two random variables x_i and x_j is defined as the average expected reduction in entropy of one of them upon learning the precise value of the other. It is thus given by

\[
I(x_i, x_j) = H(x_i) - H(x_i \mid x_j) \tag{5.32}
\]
\[
= E\left[ \log_2 \frac{p(x_i \mid x_j)}{p(x_i)} \right]. \tag{5.33}
\]

In case the random variables are continuous, their mutual information is given by

\[
I(x_i, x_j) = \int_{x_i, x_j} p(x_i, x_j) \log_2 \frac{p(x_i, x_j)}{p(x_i) p(x_j)} \, dx_i \, dx_j. \tag{5.34}
\]

Given the probability distribution p(x), the Chow-Liu algorithm constructs a complete mutual information graph G, where each edge joining x_i and x_j has a weight equal to the mutual information I(x_i, x_j). While Eq. (5.34) allows for computing the mutual information of two random variables in arbitrary probability distributions, it requires integrating over all states of the two random variables.

Figure 5.4: Building the Chow-Liu tree. (a) Graphical model of the probability density distribution p(x), where the thickness of the edges corresponds to the magnitude of the mutual information. Additionally, the mutual information values of particular interest are written next to the edges. (b) – (f) The Chow-Liu algorithm computes the maximum-weight spanning tree of the mutual information graph G using Kruskal's algorithm. Intuitively, the Chow-Liu algorithm discards those dependencies which have as little mutual information as possible. (f) Chow-Liu tree approximation of p(x).

Fortunately, the probability distribution p(x) is Gaussian and therefore
the mutual information of two variables x_i and x_j can be computed efficiently. According to Davison (2005), in this case it is given by

    I(x_i, x_j) = \frac{1}{2} \log_2 \left( \frac{|\Sigma_{x_i x_i}|}{|\Sigma_{x_i x_i} - \Sigma_{x_i x_j} \Sigma_{x_j x_j}^{-1} \Sigma_{x_j x_i}|} \right),    (5.35)

where \Sigma_{x_i x_j} refers to the block of the covariance matrix \Sigma of the probability distribution p(x) which relates x_i and x_j. The pose graph, however, encodes the probability distribution in information form. Hence, we have to perform a matrix inversion to recover \Sigma. Since the approximate graph sparsification performed by our approach ensures a low connectivity of the graph, the probability distribution p(x) is typically low-dimensional, independent of the size of the graph. As a result, the matrix inversion can be carried out in more or less constant time.

Figure 5.5: Approximate marginalization. (a) Possible graph resulting from the Chow-Liu tree approximation of the clique in the graph shown in Figure 5.3. (b) The corresponding information matrix remains sparse.

Chow and Liu proved that the probability distribution which arises from the maximum-weight spanning tree of the mutual information graph G is the optimal approximation of tree dependence. We can compute the maximum-weight spanning tree using Kruskal's algorithm (Kruskal Jr, 1956). Figure 5.4 visualizes the computation of the Chow-Liu tree. In short, in each step, the algorithm adds to the tree the edge of maximum mutual information which does not cause the resulting graph to become cyclic. Figure 5.5 shows a possible Chow-Liu tree approximation of the elimination clique that arose in the example depicted in Figure 5.3. To summarize, we use Chow-Liu trees to sparsify the elimination cliques which emerge when we remove a node from the pose graph. Our method locally approximates these elimination cliques with optimal, tree-shaped distributions.
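To make the procedure concrete, the following Python sketch (an illustrative aside, not the thesis implementation; the covariance values are invented for the example) combines Eq. (5.35) specialized to scalar variables with Kruskal's algorithm to build the Chow-Liu tree of a small Gaussian clique:

```python
import math
from itertools import combinations

def gaussian_mi(cov, i, j):
    """Mutual information (in bits) of two scalar jointly Gaussian
    variables, Eq. (5.35) specialized to 1x1 covariance blocks."""
    s_ii, s_jj, s_ij = cov[i][i], cov[j][j], cov[i][j]
    return 0.5 * math.log2(s_ii / (s_ii - s_ij * s_ij / s_jj))

def chow_liu_tree(num_vars, mutual_info):
    """Maximum-weight spanning tree of the mutual information graph,
    computed with Kruskal's algorithm and a union-find structure."""
    parent = list(range(num_vars))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path compression
            a = parent[a]
        return a

    tree = []
    # Greedily add edges in order of decreasing mutual information,
    # skipping any edge that would close a cycle.
    for (i, j), w in sorted(mutual_info.items(), key=lambda e: -e[1]):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.append((i, j))
    return tree

# Hypothetical 3x3 covariance matrix of an elimination clique.
cov = [[2.0, 0.8, 0.1],
       [0.8, 1.5, 0.6],
       [0.1, 0.6, 1.0]]

mi = {(i, j): gaussian_mi(cov, i, j) for i, j in combinations(range(3), 2)}
tree = chow_liu_tree(3, mi)
print(tree)  # two edges joining the three variables
```

The weakest dependency (here between x_0 and x_2) is discarded, exactly as in the intuition behind Figure 5.4.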
Furthermore, our technique is guaranteed not to split the pose graph into several disconnected subgraphs. Finally, our approach only locally restricts the pose graph to be a tree. Therefore, our technique preserves the global topology of the pose graph and, in particular, vital global loop closures.

6 Experimental Evaluation

We carried out a series of experiments to analyze our graph-based approach to simultaneous localization and mapping. To this end, we implemented the presented method based on the SLAM back-end described in Grisetti et al. (2007a) and a re-implementation of the SLAM front-end described in Olson (2008). We carried out several experiments at the University of Freiburg using a real ActivMedia Pioneer-2 robot. The mobile robot was equipped with a SICK laser range finder to perceive the environment. We used the Carnegie Mellon Robot Navigation Toolkit (Carmen) to control the robot. In addition to the experiments carried out at the University of Freiburg, we evaluated the performance of our approach on numerous datasets frequently used as benchmarks in the robotics community. These datasets contain laser data and odometry measurements recorded by other real robots as they were moving through buildings with interesting structure. We evaluate our graph sparsification technique by comparing the performance of our SLAM approach with the presented graph sparsification enabled to its performance with the graph sparsification disabled. For simplicity, we refer to the approach without graph pruning as the "standard approach". The experiments are designed to show that our approach to informed graph pruning is well suited both for life-long mobile robot mapping and for standard SLAM problems. In Section 6.1, we show that our graph sparsification technique significantly reduces the computational requirements of graph-based SLAM approaches, particularly in cases in which the robot frequently re-traverses previously visited terrain.
The speed-up is achieved by removing redundant information from the belief of the robot. As a consequence, the graph stays small, which allows for efficient optimization and fast, reliable loop closing. In Section 6.2, we show that our approximate marginalization technique preserves the sparsity of the pose graph. In Section 6.3, we highlight the positive influence of our pruning technique on the resulting grid maps. Since our approach discards observations which do not align well with the map of the environment, it produces crisp grid maps which exhibit less blur compared to standard grid-based mapping approaches. Finally, in Section 6.4, we evaluate our graph sparsification approach on several standard datasets often used in the robotics community to compare SLAM algorithms.

6.1 Runtime and Memory Requirements

In this section, we analyze the runtime of our approach and its memory requirements in terms of the size of the resulting pose graph. We show that the computational overhead of our approach is only marginal, leading to significantly reduced total processing times.

Figure 6.1: Typical results of an experiment exhibiting a large number of re-traversals, in which the robot traveled through already visited terrain for an extended period of time. The plots show the time per observation, the number of nodes, and the number of edges over the observations. The results clearly show that this behavior leads to an explosion in terms of computing and memory requirements when using the standard approach (blue). In contrast, our approach (red) keeps the number of nodes and edges in the graph low and is thus highly efficient.

In a first experiment, we exposed both mapping approaches to extreme conditions.
We therefore made the robot travel back and forth 100 times in the corridor of our lab environment. The corridor is approximately 20 meters long and 3 meters wide. Figure 6.1 shows three plots illustrating the results of the experiment. The plots clearly indicate that the behavior pattern executed by the robot leads to an explosion in terms of computing and memory requirements when using the standard approach. The slowdown is mainly due to the loop closing component of the SLAM front-end. As discussed in Section 4.3, the scan matcher tries to find constraints between the current scan and all former scans which were recorded in the vicinity of the robot. Since the standard approach constantly adds new nodes to the graph, the set of scans for which feature matching seems likely grows without bound during the experiment. Consequently, the robot has to perform scan matching against an increasing number of scans. This is disadvantageous since the process of scan matching is typically slow and prone to errors. Additionally, the computational burden increases due to the underlying optimization framework. The results of the experiment clearly show that the number of nodes and edges in the graph grows rapidly when using the standard approach. As the size of the graph grows, pose graph optimization becomes increasingly challenging and thus more time-consuming. Furthermore, the large number of nodes and edges in the graph requires huge amounts of memory. All this is problematic since mobile robots are typically equipped with rather limited computing devices. In contrast to the standard approach, our sparsification technique removes those nodes from the graph which do not reduce the uncertainty in the belief of the robot.
As a result, the robot prunes the pose graph such that the number of nodes in the graph remains more or less constant as long as the robot does not explore new territory. Furthermore, our approximate marginalization technique ensures that the set of edges in the graph remains small and that only informative constraints remain in the graph. Consequently, our graph pruning technique avoids both the runtime and the memory explosion altogether.

Figure 6.2: Results of an experiment in which the robot moved around in an office environment for an extended period of time. The plots show the time per observation, the number of nodes, and the number of edges over the observations. The results show that the computational and memory complexity of the standard approach (blue) increases without bound during the experiment. Our approach (red) is highly efficient since it keeps the number of nodes and edges in the graph low. The standard approach required a total of 2 hours and 9 minutes to process the dataset, whereas our approach processed the entire dataset in 4 minutes and 32 seconds.

Note that the visible fluctuations in the time needed to incorporate an observation have two main reasons. First, our implementation of the mapping system is optimized for speed such that it suppresses certain computations in cases in which the graph changes only marginally. Second, our loop closing component does not perform feature matching against all observations. It rather restricts the set of observations to those for which feature matching seems likely. As a consequence, some observations are incorporated significantly faster than others. We carried out a second experiment to further evaluate the effects of our graph pruning technique.
The second experiment exhibits typical conditions which a robot faces when carrying out a longer mission in a closed office environment. In this experiment, the robot moved around in our lab environment for an extended period of time. It thereby visited the individual rooms and the corridor many times. The robot was constantly moving for almost an hour. Figure 6.2 illustrates the results of this experiment. The results leave no doubt that the standard approach is not equipped for such a task. The time required by the standard approach to incorporate an observation increases throughout the entire mission, leading to a total of 2 hours and 9 minutes to process the dataset. Our approach was already done after 4 minutes and 32 seconds.

Figure 6.3: Intel Research Lab dataset. Performance of the standard approach (blue) versus our approach (red) in terms of time per observation, number of nodes, and number of edges.

Figure 6.4: FHW Museum dataset. Performance of the standard approach (blue) versus our approach (red) in terms of time per observation, number of nodes, and number of edges.

To allow for useful navigation tasks, the robot has to be capable of incorporating measurements in real time. The experiments show that the average time needed by the standard approach to incorporate an observation increases without bound. In contrast to the standard approach, our graph sparsification technique ensures that the average time to incorporate measurements remains low throughout the entire mission.
We furthermore evaluated the runtime and the memory requirements of our approach on standard benchmark datasets. Figure 6.3 and Figure 6.4 show the results obtained when processing the Intel Research Lab dataset and the FHW Museum dataset, respectively. The plots indicate results similar to those of the experiments described above. The experiments clearly illustrate that our approach leads to a significant performance gain even when processing standard datasets which are not explicitly designed for testing long-term simultaneous localization and mapping.

Figure 6.5: Exact marginalization versus our approximate marginalization technique when processing the Intel Research Lab dataset. The plots show the number of edges and the gamma index (γ) over the observations. Node elimination by means of exact marginalization adds a large number of edges to the graph. In contrast, our approach ensures that the number of edges in the graph stays small. The gamma index (γ) of a graph is defined as the ratio between the number of existing edges in the graph and the maximum number of possible edges in the graph. Exact marginalization yields a high gamma index, indicating that the graph is dense. Our approach leads to a small and more or less constant gamma index, indicating that the graph is sparse. A sparse graph is highly advantageous since it can be stored and optimized much more efficiently than a dense graph.

6.2 Approximate Marginalization

In Section 5.3, we asserted that eliminating nodes from the belief of the robot by means of exact marginalization causes the pose graph to become dense. In this section, we show that our approximate marginalization technique preserves the sparsity of the pose graph. A sparse pose graph is highly advantageous since it can be stored and optimized efficiently.
To evaluate how densely connected a pruned pose graph is in practice, we use the so-called gamma index (Garrison and Marble, 1965). The gamma index (γ) of a graph is defined as the ratio between the number of existing edges in the graph and the maximum number of possible edges in the graph. For undirected graphs, it is given by

    \gamma = \frac{e}{\frac{1}{2} v (v - 1)},    (6.1)

where e is the number of edges and v is the number of nodes in the graph. The gamma index varies from 0 to 1, where 0 means that the nodes are not connected at all, and 1 means that the graph is complete.

Figure 6.5 and Figure 6.6 show a comparison of the graphs resulting from exact marginalization and our approximate marginalization technique when processing the Intel Research Lab dataset. The figures show that eliminating nodes from the pose graph by means of exact marginalization causes the pose graph to become dense. Our approach preserves the sparsity of the pose graph by using Chow-Liu trees to locally approximate the elimination cliques which result from marginalization.

Figure 6.6: Final pose graphs resulting from (a) exact marginalization and (b) our approximate marginalization technique when processing the Intel Research Lab dataset. Eliminating nodes from the pose graph by means of exact marginalization causes the pose graph to become dense. Our approach preserves the sparsity of the pose graph by using Chow-Liu trees to locally approximate the elimination cliques which result from marginalization. As a consequence, only the most informative constraints remain in the graph. In this experiment, the final pose graph which resulted from exact marginalization possesses 13052 edges, whereas the pose graph produced by our approach has only 559 edges. Nevertheless, our approach produced a high-quality grid map.
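As a small illustration not part of the thesis code, Eq. (6.1) can be evaluated directly; the two example graphs below (a complete graph and a spanning tree on four nodes) mark the two ends of the density spectrum:

```python
def gamma_index(num_edges, num_nodes):
    """Gamma index of an undirected graph, Eq. (6.1): the ratio of
    existing edges to the maximum possible number of edges."""
    return num_edges / (0.5 * num_nodes * (num_nodes - 1))

print(gamma_index(6, 4))  # complete graph on 4 nodes -> 1.0
print(gamma_index(3, 4))  # spanning tree on 4 nodes  -> 0.5
```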
6.3 Improved Grid Map Quality

We furthermore evaluated the effects of our information-driven graph pruning technique on the quality of the resulting grid maps. In feature-based approaches, the correlations between landmark estimates increase monotonically with the number of observations (Durrant-Whyte and Bailey, 2006). Grid maps, however, have a disadvantage when it comes to life-long map learning. Whenever the robot obtains a measurement, the scan matcher aims to align the new scan with existing scans in order to solve the data association problem. The probability that the scan matcher thereby makes a small alignment error is nonzero. A scan which is incorporated at a slightly wrong position blurs the map. As a result, the probability that the scan matcher misaligns subsequent scans increases, since scan matching is performed against misaligned scans. Hence, the probability of making alignment errors increases with the number of incorporated scans. In the long run, the map tends to become increasingly blurred. In the worst case, the mapping approach diverges. The problem is inherent in grid maps due to the discretization of space. Improving the scan matcher can only mitigate the detrimental effects of the discretization, but cannot prevent the alignment errors from accumulating as more scans are incorporated.

Figure 6.7: In this experiment, the robot moved back and forth 100 times in a corridor. (a) Standard grid-based mapping approaches tend to produce thick and blurred walls due to accumulating scan matching errors. (b) Our graph sparsification approach discards scans which do not align well with existing scans and thus produces crisp maps even in cases of many re-traversals.

Figure 6.7 and Figure 6.8 show the maps corresponding to the first two experiments discussed in Section 6.1. The grid maps generated by the standard approach exhibit clearly visible blur and artificially thick walls.
In general, the more often the robot re-traverses already visited terrain, the more blur is added to the map. As opposed to the standard approach, our graph pruning approach discards scans which do not align well with existing scans and thus produces crisp maps even in cases in which the robot frequently re-traverses already visited places.

Figure 6.8: In this experiment, the robot moved around in an office environment at the University of Freiburg for an extended period of time. The robot was constantly moving for almost an hour, thereby visiting the individual rooms and the corridor many times. (a) Standard approach. (b) Our approach. Our method removed most of the nodes, yielding a graph which only contains highly informative edges and nodes.

Figure 6.8 shows the graphs resulting from the two methods. The graph produced by the standard approach is dense and contains a large number of nodes, whereas the graph generated by our method is sparse and consists of only a fraction of the nodes and only highly informative links.

Figure 6.9: Intel Research Lab dataset. (a) Standard approach. (b) Our approach.

6.4 Standard Datasets

We furthermore evaluated our graph sparsification approach on several standard datasets. These datasets are often used in the robotics community to compare algorithms which address the simultaneous localization and mapping problem.

6.4.1 Intel Research Lab

Since Dirk Hähnel recorded the Intel Research Lab dataset, it has become a popular benchmark frequently used in the community. The Intel Research Lab provides an interesting environment due to its circular structure. The robot travels multiple times around the circular corridor, repeatedly visiting the individual rooms. This behavior pattern allows for closing a large number of loops. Figure 6.9 shows the results obtained from the Intel Research Lab dataset.
The map generated by our approach exhibits areas with significantly less blur compared to the map produced by the standard approach. Some of these areas are highlighted by red arrows in the figures. Our algorithm produced this map while maintaining only 29 % of the scans obtained by the robot. Furthermore, our approach was significantly faster than the standard approach. The standard approach required a total of 537 seconds to process the dataset, whereas the processing time of our approach was only 314 seconds.

Figure 6.10: Killian Court dataset.

6.4.2 Killian Court

The MIT Killian Court infinite corridor dataset, first recorded by Bosse et al. (2004), is a challenging benchmark for SLAM approaches. It is a large dataset with a path length of 1450 meters and two hours of robot travel. The dataset is interesting for the work presented in this thesis because it poses a challenge to the loop closing capabilities due to several long loops. Closing long loops is difficult since the dead reckoning error accumulates along the path, and, as a result, the number of scans for which feature matching seems likely grows. The approach is likely to generate an inconsistent map in cases in which the scan matcher fails to close the loops. Hence, this experiment serves well to analyze the impact of our approach on the performance of the scan matcher. Figure 6.10 depicts the map generated by our approach. Although our method prunes the pose graph such that the environment is represented by a small number of scans, the scan matcher still managed to close all loops, leading to a consistent map.

Figure 6.11: FHW Museum dataset. (a) Standard approach. (b) Our approach.

6.4.3 FHW Museum

The FHW Museum dataset, recorded by Dirk Hähnel, is reasonably large, spanning almost two hours of robot driving.
The dataset is interesting for the work presented in this thesis because the robot spends most of the time re-traversing already visited terrain. Figure 6.11 depicts the maps and the graphs obtained from the FHW Museum dataset. The figures show that the graph output by our method contains far fewer nodes. Furthermore, although based on only 21 % of the scans, the map generated by our approach is of higher quality than the map produced by the standard approach. Our approach also led to a significantly reduced runtime: the standard approach required a total of 32 minutes and 45 seconds to process the dataset, whereas the processing time of our approach was only 5 minutes and 19 seconds.

7 Discussion

7.1 Conclusion

In this thesis, we presented a novel approach for mobile robots using the graph-based formulation of the simultaneous localization and mapping problem to allow for life-long map learning using grid maps. The key idea of our approach is to remove redundant information from the graph such that its size stays constant as long as the robot moves through previously visited areas. Hence, in contrast to traditional graph-based approaches, the complexity of our new approach does not scale with the length of the robot's trajectory, but with the size of the mapped area. As a result, a major bottleneck in the mapping component of current navigation systems vanishes, and long-term applications of mobile robots become feasible. We use grid maps to model the environment since they do not require pre-defined feature extractors. Grid maps, however, suffer from accumulating discretization errors, which is why standard grid-based mapping approaches tend to produce blurred maps. We use the concept of information entropy to explicitly measure the information gain of observations. In this way, we are able to identify redundant observations which do not reduce the uncertainty in the map of the environment.
Discarding redundant observations from the belief of the robot has two major advantages. First, it allows for a more sustainable use of limited resources. Second, discarding unnecessary and slightly misaligned scans increases the quality of the maps. Removing observations from the graph while changing the belief of the robot as little as possible, however, is challenging. An exact elimination technique introduces a large number of new constraints into the belief of the robot due to the marginalization. The resulting graph becomes dense and prevents the robot from efficiently storing and processing data. Therefore, preserving sparsity is a key prerequisite for life-long map learning. We introduced an efficient approach to removing nodes from the graph which fulfills this requirement while performing as few approximations as possible. Our technique locally approximates the constraint network using optimal, tree-shaped probability distributions, which are computed by means of Chow-Liu trees. As a result, our method preserves sparsity while minimizing the loss of information encoded in the constraints. We implemented our approach and illustrated its efficacy in a series of experiments with a real robot. We furthermore considered standard benchmark datasets widely used in the research community. The key improvements of our new approach are significantly reduced computational requirements and less blur in the grid maps. The results of the experimental evaluation emphasize that the approach presented in this thesis is an important step towards life-long map learning in mobile robotics.

7.2 Future Work

Despite the encouraging results presented in this thesis, there is still room for improvement. Although grid maps are widely used, they have disadvantages. For simplicity, grid maps assume that the individual grid cells are independent of each other. As a result, correlations between grid cells are neglected in the computation of the information entropy.
Even though this is a common assumption with known problems, no promising approach to efficiently considering these dependencies has been presented. We expect that a sound treatment of such spatial dependencies will lead to a less approximate computation of the uncertainty and thus to an improvement of our node reduction approach. Furthermore, while our approximate marginalization technique is directly applicable to any constraint network, the presented method to compute the information gain of observations is rather specific to 2D mapping scenarios using laser scanners. Further research could transfer the concept to other sensing modalities and other application scenarios. It would be interesting to explore how to extend our approach towards highly dynamic environments. Although other researchers are investigating this topic, it is still unclear how to robustly cope with moving objects and environments that significantly change over time. Related to that, we plan to investigate possible representations of dynamic aspects of the environment in order to provide useful information for mobile robot navigation.

List of Figures

3.1 Robot motion
3.2 Robot perception
3.3 An occupancy grid map
3.4 Mapping with known poses
3.5 Drawings of graphs

4.1 The SLAM problem
4.2 A pose graph
4.3 A pose/feature graph
4.4 Pose graph optimization

5.1 Binary entropy function
5.2 Conditional independence properties in a Markov random field
5.3 Eliminating a pose from the belief of the robot
5.4 Chow-Liu tree approximation
5.5 Approximate marginalization

6.1 Experiment with many re-traversals: plots
6.2 Experiment in an office environment: plots
6.3 Intel Research Lab dataset: plots
6.4 FHW Museum dataset: plots
6.5 Exact marginalization versus approximate marginalization: plots
6.6 Exact marginalization versus approximate marginalization: graphs
6.7 Experiment with many re-traversals: maps and graphs
6.8 Experiment in an office environment: maps and graphs
6.9 Intel Research Lab dataset: maps and graphs
6.10 Killian Court dataset: map
6.11 FHW Museum dataset: maps and graphs

Bibliography

T. Bailey, J. Nieto, J. Guivant, M. Stevens, and E. Nebot. Consistency of the EKF-SLAM Algorithm. In Proc. of the IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS), pages 3562–3568, Beijing, China, 2006.

M. Bosse, P. Newman, J. Leonard, and S. Teller. Simultaneous Localization and Map Building in Large-Scale Cyclic Environments Using the Atlas Framework. Int. Journal of Robotics Research, 23(12):1113, 2004.

A. Censi. Scan Matching in a Probabilistic Framework. In Proc. of the IEEE Int. Conf. on Robotics & Automation (ICRA), pages 2291–2296, Orlando, Florida, 2006.

P. Cheeseman and P. Smith. On the Representation and Estimation of Spatial Uncertainty. Int.
Journal of Robotics Research, 5:56–68, 1986.

M. Chli and A. J. Davison. Automatically and Efficiently Inferring the Hierarchical Structure of Visual Maps. In Proc. of the IEEE Int. Conf. on Robotics & Automation (ICRA), Kobe, Japan, June 2009.

C. Chow and C. Liu. Approximating Discrete Probability Distributions with Dependence Trees. IEEE Transactions on Information Theory, 14(3):462–467, 1968.

M. Cummins and P. Newman. FAB-MAP: Probabilistic Localization and Mapping in the Space of Appearance. Int. Journal of Robotics Research, 27(6):647–665, 2008.

A. J. Davison. Active Search for Real-Time Vision. In Proc. of the Int. Conf. on Computer Vision (ICCV), volume 1, 2005.

F. Dellaert and M. Kaess. Square Root SAM: Simultaneous Localization and Mapping via Square Root Information Smoothing. Int. Journal of Robotics Research, 25(12):1181–1204, December 2006.

F. Dellaert, D. Fox, W. Burgard, and S. Thrun. Monte Carlo Localization for Mobile Robots. In Proc. of the IEEE Int. Conf. on Robotics & Automation (ICRA), pages 99–141, Leuven, Belgium, May 1999.

R. Diestel. Graph Theory. Springer Verlag, third edition, August 2005.

E. W. Dijkstra. A Note on Two Problems in Connexion With Graphs. Numerische Mathematik, 1(1):269–271, 1959.

A. Doucet. On Sequential Simulation-Based Methods for Bayesian Filtering. Technical report, Signal Processing Group, Dept. of Engineering, University of Cambridge, 1998.

A. Doucet, N. de Freitas, K. Murphy, and S. Russell. Rao-Blackwellized Particle Filtering for Dynamic Bayesian Networks. In Proc. of the Conf. on Uncertainty in Artificial Intelligence (UAI), pages 176–183, Stanford, CA, USA, 2000.

A. Doucet, N. de Freitas, and N. Gordon, editors. Sequential Monte-Carlo Methods in Practice. Springer Verlag, 2001.

T. Duckett, S. Marsland, and J.
Shapiro. Learning Globally Consistent Maps by Relaxation. In Proc. of the IEEE Int. Conf. on Robotics & Automation (ICRA), 2000.

T. Duckett, S. Marsland, and J. Shapiro. Fast, On-line Learning of Globally Consistent Maps. Autonomous Robots, 12(3):287–300, 2002.

H. Durrant-Whyte. Uncertain Geometry in Robotics. In Proc. of the IEEE Int. Conf. on Robotics & Automation (ICRA), 1987.

H. Durrant-Whyte and T. Bailey. Simultaneous Localization and Mapping (SLAM): Part I The Essential Algorithms. IEEE Robotics & Automation Magazine, 13(2):99–110, 2006.

A. Eliazar and R. Parr. DP-SLAM: Fast, Robust Simultaneous Localization and Mapping Without Predetermined Landmarks. In Proc. of the Int. Joint Conf. on Artificial Intelligence (IJCAI), pages 1135–1142, Acapulco, Mexico, 2003.

R. Eustice, H. Singh, and J. Leonard. Exactly Sparse Delayed-State Filters. In Proc. of the IEEE Int. Conf. on Robotics & Automation (ICRA), pages 2428–2435, Barcelona, Spain, April 2005a.

R. Eustice, M. Walter, and J. Leonard. Sparse Extended Information Filters: Insights into Sparsification. In Proc. of the IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS), pages 641–648, Edmonton, Canada, 2005b.

R. Eustice, H. Singh, and J. Leonard. Exactly Sparse Delayed-State Filters for View-Based SLAM. IEEE Transactions on Robotics, 22(6):1100–1114, 2006.

J. Folkesson and H. Christensen. Graphical SLAM - A Self-Correcting Map. In Proc. of the IEEE Int. Conf. on Robotics & Automation (ICRA), Orlando, FL, USA, 2004a.

J. Folkesson and H. Christensen. Robust SLAM. In Proc. of the IFAC Symposium on Intelligent Autonomous Vehicles, 2004b.

J. Folkesson, P. Jensfelt, and H. Christensen. Vision SLAM in the Measurement Subspace. In Proc. of the IEEE Int. Conf. on Robotics & Automation (ICRA), pages 325–330, April 2005.

U. Frese. A Proof for the Approximate Sparsity of SLAM Information Matrices. In Proc. of the IEEE Int. Conf. on Robotics & Automation (ICRA), pages 331–337, 2005.

U. Frese and G. Hirzinger.
Simultaneous Localization and Mapping - A Discussion. In Proceedings of the IJCAI Workshop on Reasoning with Uncertainty in Robotics, Seattle, 2001. U. Frese, P. Larsson, and T. Duckett. A Multilevel Relaxation Algorithm for Simultaneous Localization and Mapping. IEEE Transactions on Robotics, 21(2):1–12, 2005. W. L. Garrison and D. F. Marble. A Prolegomenon to the Forecasting of Transportation Development, 1965. M. Golfarelli, D. Mairo, and S. Rizzi. Elastic Correction of Dead-reckoning Errors in Map Building. In Proc. of the IEEE/RSJ Int. Conf. on Intelligent Robots and Systems, pages 905–911, 1998. G. Grisetti, C. Stachniss, and W. Burgard. Improving Grid-based SLAM with RaoBlackwellized Particle Filters by Adaptive Proposals and Selective Resampling. In Proc. of the IEEE Int. Conf. on Robotics & Automation (ICRA), pages 2443–2448, Barcelona, Spain, April 2005. G. Grisetti, G. D. Tipaldi, C. Stachniss, W. Burgard, and D. Nardi. Speeding-Up Rao-Blackwellized SLAM. In Proc. of the IEEE Int. Conf. on Robotics & Automation (ICRA), pages 442–447, Orlando, FL, USA, 2006. G. Grisetti, C. Stachniss, S. Grzonka, and W. Burgard. A Tree Parameterization for Efficiently Computing Maximum Likelihood Maps using Gradient Descent. In Proc. of Robotics: Science and Systems (RSS), Atlanta, GA, USA, 2007a. G. Grisetti, G. D. Tipaldi, C. Stachniss, W. Burgard, and D. Nardi. Fast and Accurate SLAM with Rao-Blackwellized Particle Filters. Journal of Robotics & Autonomous Systems, 55(1):30–38, 2007b. D. Hähnel. Mapping with Mobile Robots. PhD thesis, University of Freiburg, Department of Computer Science, 2005. 76 Bibliography D. Hähnel, W. Burgard, D. Fox, and S. Thrun. An Efficient FastSLAM Algorithm for Generating Maps of Large-Scale Cyclic Environments from Raw Laser Range Measurements. In Proc. of the IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS), pages 206–211, Las Vegas, NV, USA, 2003. M. Jordan. Learning in Graphical Models. Kluwer Academic Publishers, 1998. S. 
Julier and J. Uhlmann. A Counter Example to the Theory of Simultaneous Localization and Map Building. In Proc. of the IEEE Int. Conf. on Robotics and Automation, pages 4238–4243, 2001. S. Julier, J. Uhlmann, and H. Durrant-Whyte. A New Approach for Filtering Nonlinear Systems. In Proc. of the American Control Conference, pages 1628–1632, Seattle, WA, USA, 1995. M. Kaess, A. Ranganathan, and F. Dellaert. iSAM: Fast Incremental Smoothing and Mapping with Efficient Data Association. In Proc. of the IEEE Int. Conf. on Robotics & Automation (ICRA), Rome, Italy, 2007. M. Kaess, A. Ranganathan, and F. Dellaert. iSAM: Incremental Smoothing and Mapping. IEEE Trans. on Robotics, 24(6):1365–1378, December 2008. R. Kalman. A New Approach To Linear Filtering and Prediction Problems. ASMEJournal of Basic Engineering, March:35–45, 1960. R. Kindermann and J. L. Snell. Markov Random Fields and Their Applications. American Mathematical Society, 1980. K. Konolige. Large-Scale Map-Making. In Proc. of the National Conference on Artificial Intelligence (AAAI), pages 457–463, 2004. K. Konolige and M. Agrawal. FrameSLAM: From Bundle Adjustment to Realtime Visual Mappping. IEEE Transactions on Robotics, 24(5):1066–1077, 2008. H. Kretzschmar, G. Grisetti, and C. Stachniss. Life-long Map Learning for Graph-based Simultaneous Localization and Mapping. Zeitschrift KI – Künstliche Intelligenz, 2009. Accepted for publication. J. B. Kruskal Jr. On the Shortest Spanning Subtree of a Graph and the Traveling Salesman Problem. Proceedings of the American Mathematical Society, pages 48–50, 1956. S. Kullback and R. A. Leibler. On Information and Sufficiency. The Annals of Mathematical Statistics, pages 79–86, 1951. Bibliography 77 J. Leonard, H. Durrant-Whyte, and I. Cox. Dynamic Map Building for an Autonomous Mobile Robot. Int. Journal of Robotics Research, 11(4):286, 1992. F. Lu and E. Milios. Globally Consistent Range Scan Alignment for Environment Mapping. Autonomous Robots, 4:333–349, 1997. P. S. 
Maybeck. Stochastic Models, Estimation, and Control, volume 141 of Mathematics in Science and Engineering. 1979. P. McLauchlan and D. Murray. A Unifying Framework for Structure and Motion Recovery from Image Sequences. In Proc. of the Int. Conf. on Computer Vision (ICCV), pages 314–320, Boston, MA, USA, 1995. M. Montemerlo and S. Thrun. Simultaneous Localization and Mapping with Unknown Data Association using FastSLAM. In Proc. of the IEEE Int. Conf. on Robotics & Automation (ICRA), pages 1985–1991, Taipei, Taiwan, 2003. M. Montemerlo and S. Thrun. Large-scale Robotic 3-D Mapping of Urban Structures. In Proc. of the Int. Symposium on Experimental Robotics (ISER), 2004. M. Montemerlo, S. Thrun, D. Koller, and B. Wegbreit. FastSLAM: A Factored Solution to Simultaneous Localization and Mapping. In Proc. of the National Conference on Artificial Intelligence (AAAI), pages 593–598, Edmonton, Canada, 2002. M. Montemerlo, S. T. D. Koller, and B. Wegbreit. FastSLAM 2.0: An Improved Particle Filtering Algorithm for Simultaneous Localization and Mapping that Provably Converges. In Proc. of the Int. Conf. on Artificial Intelligence (IJCAI), pages 1151–1156, Acapulco, Mexico, 2003. H. Moravec and A. Elfes. High Resolution Maps from Wide Angle Sonar. In Proc. of the IEEE Int. Conf. on Robotics & Automation (ICRA), pages 116–121, St. Louis, MO, USA, 1985. K. Murphy. Bayesian Map Learning in Dynamic Environments. In Proc. of the Conf. on Neural Information Processing Systems (NIPS), pages 1015–1021, Denver, CO, USA, 1999. J. Neira and J. Tardós. Data Association in Stochastic Mapping Using the Joint Compatibility Test. IEEE Transactions on Robotics and Automation, 17(6):890–897, 2001. E. Olson. Robust and Efficient Robotic Mapping. PhD thesis, Massachusetts Institute of Technology, Cambridge, MA, USA, June 2008. E. Olson. Real-Time Correlative Scan Matching. In Proc. of the IEEE Int. Conf. on Robotics & Automation (ICRA), Kobe, Japan, June 2009. 78 Bibliography E. Olson, J. 
Leonard, and S. Teller. Fast Iterative Optimization of Pose Graphs with Poor Initial Estimates. pages 2262–2269, 2006. S. O’Callaghan, F. Ramos, and H. Durrant-Whyte. Contextual Occupancy Maps using Gaussian Processes. In Proc. of the IEEE Int. Conf. on Robotics & Automation (ICRA), Kobe, Japan, June 2009. M. Paskin. Thin Junction Tree Filters for Simultaneous Localization and Mapping. In Proc. of the Int. Conf. on Artificial Intelligence (IJCAI), pages 1157–1164, Acapulco, Mexico, 2003. H. Robbins and S. Monro. A Stochastic Approximation Method. The Annals of Mathematical Statistics, pages 400–407, 1951. C. Shannon. A Mathematical Theory of Communication. The Bell System Technical Journal, 15:379–423, 1948. J. Sivic and A. Zisserman. Video Google: A Text Retrieval Approach to Object Matching in Videos. In Proc. of the Int. Conf. on Computer Vision (ICCV), pages 1470–1477, 2003. R. Smith, M. Self, and P. Cheeseman. Estimating Uncertain Spatial Realtionships in Robotics. In I. Cox and G. Wilfong, editors, Autonomous Robot Vehicles, pages 167–193. Springer Verlag, 1990. C. Stachniss. Robotic Mapping and Exploration, volume 55 of Springer Tracts in Advanced Robotics. Springer Verlag, 2009. C. Stachniss, G. Grisetti, and W. Burgard. Recovering Particle Diversity in a RaoBlackwellized Particle Filter for SLAM after Actively Closing Loops. In Proc. of the IEEE Int. Conf. on Robotics & Automation (ICRA), pages 667–672, Barcelona, Spain, April 2005. C. Stachniss, G. Grisetti, W. Burgard, and N. Roy. Evaluation of Gaussian Proposal Distributions for Mapping with Rao-Blackwellized Particle Filters. In Proc. of the IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS), San Diego, CA, USA, 2007. H. Strasdat, C. Stachniss, and W. Burgard. Which Landmark is Useful? Learning Selection Policies for Navigation in Unknown Environments. In Proc. of the IEEE Int. Conf. on Robotics & Automation (ICRA), Kobe, Japan, June 2009. S. Thrun. 
Learning Occupancy Grids With Forward Sensor Models. Autonomous Robots, 15:111–127, 2003. Bibliography 79 S. Thrun and M. Montemerlo. The Graph SLAM Algorithm with Applications to LargeScale Mapping of Urban Structures. Int. Journal of Robotics Research, 25(5-6):403, 2006. S. Thrun, D. Koller, Z. Ghahmarani, and H. Durrant-Whyte. SLAM Updates Require Constant Time. School of Computer Science, Carnegie Mellon University, Pittsburgh, Tech. Rep, 2002. S. Thrun, Y. Liu, D. Koller, A. Ng, Z. Ghahramani, and H. Durrant-Whyte. Simultaneous Localization and Mapping With Sparse Extended Information Filters. Int. Journal of Robotics Research, 23(7/8):693–716, 2004. S. Thrun, W. Burgard, and D. Fox. Probabilistic Robotics. MIT Press, Cambridge, MA, USA, 2005. G. D. Tipaldi, G. Grisetti, and W. Burgard. Approximate Covariance Estimation in Graphical Approaches to SLAM. In Proc. of the IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS), San Diego, CA, USA, 2007. J. Uhlmann. Dynamic Map Building and Localization: New Theoretical Foundations. PhD thesis, University of Oxford, 1995. R. van der Merwe, N. de Freitas, A. Doucet, and E. Wan. The Unscented Particle Filter. Technical Report CUED/F-INFENG/TR380, Cambridge University Engineering Department, August 2000. V. Verma, S. Thrun, and R. Simmons. Variable Resolution Particle Filter. In Int. Joint Conf. on Artificial Intelligence, volume 18, pages 976–984, 2003. M. Walter, R. Eustice, and J. Leonard. A Provably Consistent Method for Imposing Sparsity in Feature-based SLAM information Filters. In Proc. of the Int. Symposium of Robotics Research (ISRR), San Francisco, CA, 2005.