Optimization Algorithms on Matrix Manifolds

ISBN-10: 0691132984

ISBN-13: 9780691132983

Many problems in the sciences and engineering can be rephrased as optimization problems on matrix search spaces endowed with a so-called manifold structure. This book shows how to exploit the special structure of such problems to develop efficient numerical algorithms. It places careful emphasis on both the numerical formulation of the algorithm and its differential geometric abstraction—illustrating how good algorithms draw equally from the insights of differential geometry, optimization,...

"The treatment strikes an appropriate balance between mathematical, numerical, and algorithmic points of view. The quality of the writing is quite high and very readable. The topic is very timely and is certainly of interest to myself and my students."--Kyle A. Gallivan, Florida State University

Anders Linnér - Mathematical Reviews
This book is succinct but essentially self-contained; it includes an appendix with background material as well as an extensive bibliography. The algorithmic techniques developed may be useful anytime a model leads to a mathematical optimization problem where the domain naturally is a manifold, particularly if the manifold is a matrix manifold. The book follows the usual definition-theorem-proof style but it is not intended for traditional course work so there are no exercises. A reader with limited exposure to manifold theory and differential geometry most likely will benefit from consulting standard texts on those subjects first.

Optimization Algorithms on Matrix Manifolds

By P.-A. Absil, R. Mahony, R. Sepulchre
Princeton University Press
Copyright © 2007 Princeton University Press
All rights reserved.
ISBN: 978-0-691-13298-3

Chapter One
Introduction

This book is about the design of numerical algorithms for computational problems posed on smooth search spaces. The work is motivated by matrix optimization problems characterized by symmetry or invariance properties in the cost function or constraints. Such problems abound in algorithmic questions pertaining to linear algebra, signal processing, data mining, and statistical analysis. The approach taken here is to exploit the special structure of these problems to develop efficient numerical procedures.

An illustrative example is the eigenvalue problem. Because of their scale invariance, eigenvectors are not isolated in vector spaces. Instead, each eigendirection defines a linear subspace of eigenvectors. For numerical computation, however, it is desirable that the solution set consist only of isolated points in the search space. An obvious remedy is to impose a norm equality constraint on iterates of the algorithm. The resulting spherical search space is an embedded submanifold of the original vector space. An alternative approach is to "factor" the vector space by the scale-invariant symmetry operation such that any subspace becomes a single point. The resulting search space is a quotient manifold of the original vector space. These two approaches provide prototype structures for the problems considered in this book.

Scale invariance is just one of several symmetry properties regularly encountered in computational problems. In many cases, the underlying symmetry property can be exploited to reformulate the problem as a nondegenerate optimization problem on an embedded or quotient manifold associated with the original matrix representation of the search space.
These constraint sets carry the structure of nonlinear matrix manifolds. This book provides the tools to exploit such structure in order to develop efficient matrix algorithms in the underlying total vector space.

Working with a search space that carries the structure of a nonlinear manifold introduces certain challenges in the algorithm implementation. In their classical formulation, iterative optimization algorithms rely heavily on the Euclidean vector space structure of the search space; a new iterate is generated by adding an update increment to the previous iterate in order to reduce the cost function. The update direction and step size are generally computed using a local model of the cost function, typically based on (approximate) first and second derivatives of the cost function, at each step. In order to define algorithms on manifolds, these operations must be translated into the language of differential geometry. This process is a significant research program that builds upon solid mathematical foundations. Advances in that direction have been dramatic over the last two decades and have led to a solid conceptual framework. However, generalizing a given optimization algorithm on an abstract manifold is only the first step towards the objective of this book. Turning the algorithm into an efficient numerical procedure is a second step that ultimately justifies or invalidates the first part of the effort. At the time of publishing this book, the second step is more an art than a theory.

Good algorithms result from the combination of insight from differential geometry, optimization, and numerical analysis. A distinctive feature of this book is that as much attention is paid to the practical implementation of the algorithm as to its geometric formulation.
In particular, the concrete aspects of algorithm design are formalized with the help of the concepts of retraction and vector transport, which are relaxations of the classical geometric concepts of motion along geodesics and parallel transport. The proposed approach provides a framework to optimize the efficiency of the numerical algorithms while retaining the convergence properties of their abstract geometric counterparts.

The geometric material in the book is mostly confined to Chapters 3 and 5. Chapter 3 presents an introduction to Riemannian manifolds and tangent spaces that provides the necessary tools to tackle simple gradient-descent optimization algorithms on matrix manifolds. Chapter 5 covers the advanced material needed to define higher-order derivatives on manifolds and to build the analog of first- and second-order local models required in most optimization algorithms. The development provided in these chapters ranges from the foundations of differential geometry to advanced material relevant to our applications. The selected material focuses on those geometric concepts that are particular to the development of numerical algorithms on embedded and quotient manifolds. Not all aspects of classical differential geometry are covered, and some emphasis is placed on material that is nonstandard or difficult to find in the established literature. A newcomer to the field of differential geometry may wish to supplement this material with a classical text. Suggestions for excellent texts are provided in the references.

A fundamental, but deliberate, omission in the book is a treatment of the geometric structure of Lie groups and homogeneous spaces. Lie theory is derived from the concepts of symmetry and seems to be a natural part of a treatise such as this. However, with the purpose of reaching a community without an extensive background in geometry, we have omitted this material in the present book.
Occasionally the Lie-theoretic approach provides an elegant shortcut or interpretation for the problems considered. An effort is made throughout the book to refer the reader to the relevant literature whenever appropriate.

The algorithmic material of the book is interlaced with the geometric material. Chapter 4 considers gradient-descent line-search algorithms. These simple optimization algorithms provide an excellent framework within which to study the important issues associated with the implementation of practical algorithms. The concept of retraction is introduced in Chapter 4 as a key step in developing efficient numerical algorithms on matrix manifolds. The later chapters on algorithms provide the core results of the book: the development of Newton-based methods in Chapter 6 and of trust-region methods in Chapter 7, and a survey of other superlinear methods such as conjugate gradients in Chapter 8. We attempt to provide a generic development of each of these methods, building upon the material of the geometric chapters. The methodology is then developed into concrete numerical algorithms on specific examples. In the analysis of superlinear and second-order methods, the concept of vector transport (introduced in Chapter 8) is used to provide an efficient implementation of methods such as conjugate gradient and other quasi-Newton methods. The algorithms obtained in these sections of the book are competitive with state-of-the-art numerical linear algebra algorithms for certain problems.

The running example used throughout the book is the calculation of invariant subspaces of a matrix (and the many variants of this problem). This example is by far, for variants of algorithms developed within the proposed framework, the problem with the broadest scope of applications and the highest degree of achievement to date.
Numerical algorithms, based on a geometric formulation, have been developed that compete with the best available algorithms for certain classes of invariant subspace problems. These algorithms are explicitly described in the later chapters of the book and, in part, motivate the whole project. Because of the important role of this class of problems within the book, the first part of Chapter 2 provides a detailed description of the invariant subspace problem, explaining why and how this problem leads naturally to an optimization problem on a matrix manifold. The second part of Chapter 2 presents other applications that can be recast as problems of the same nature. These problems are the subject of ongoing research, and the brief exposition given is primarily an invitation for interested researchers to join with us in investigating these problems and expanding the range of applications considered.

The book should primarily be considered a research monograph, as it reports on recently published results in an active research area that is expected to develop significantly beyond the material presented here. At the same time, every possible effort has been made to make the book accessible to the broadest audience, including applied mathematicians, engineers, and computer scientists with little or no background in differential geometry. It could equally well qualify as a graduate textbook for a one-semester course in advanced optimization. More advanced sections that can be readily skipped at a first reading are indicated with a star. Moreover, readers are encouraged to visit the book home page where supplementary material is available.

The book is an extension of the first author's Ph.D. thesis [Abs03], itself a project that drew heavily on the material of the second author's Ph.D. thesis [Mah94]. It would not have been possible without the many contributions of a quickly expanding research community that has been working in the area over the last decade.
The Notes and References section at the end of each chapter is an attempt to give proper credit to the many contributors, even though this task becomes increasingly difficult for recent contributions. The authors apologize for any omission or error in these notes. In addition, we wish to conclude this introductory chapter with special acknowledgements to people without whom this project would have been impossible. The 1994 monograph [HM94] by Uwe Helmke and John Moore is a milestone in the formulation of computational problems as optimization algorithms on manifolds and has had a profound influence on the authors. On the numerical side, the constant encouragement of Paul Van Dooren and Kyle Gallivan has provided tremendous support to our efforts to reconcile the perspectives of differential geometry and numerical linear algebra. We are also grateful to all our colleagues and friends over the last ten years who have crossed paths as coauthors, reviewers, and critics of our work. Special thanks to Ben Andrews, Chris Baker, Alan Edelman, Michiel Hochstenbach, Knut Hüper, Jonathan Manton, Robert Orsi, and Jochen Trumpf. Finally, we acknowledge the useful feedback of many students on preliminary versions of the book, in particular, Mariya Ishteva, Michel Journée, and Alain Sarlette.

(Continues...)

Excerpted from Optimization Algorithms on Matrix Manifolds by P.-A. Absil, R. Mahony, R. Sepulchre
Copyright © 2007 by Princeton University Press. Excerpted by permission.
All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.
Excerpts are provided by Dial-A-Book Inc. solely for the personal use of visitors to this web site.

List of Algorithms
Foreword, by Paul Van Dooren
Notation Conventions
Introduction
Motivation and Applications
  A case study: the eigenvalue problem
    The eigenvalue problem as an optimization problem
    Some benefits of an optimization framework
  Research problems
    Singular value problem
    Matrix approximations
    Independent component analysis
    Pose estimation and motion recovery
  Notes and references
Matrix Manifolds: First-Order Geometry
  Manifolds
    Definitions: charts, atlases, manifolds
    The topology of a manifold
    How to recognize a manifold
    Vector spaces as manifolds
    The manifolds R^{n×p} and R_*^{n×p}
    Product manifolds
  Differentiable functions
    Immersions and submersions
  Embedded submanifolds
    General theory
    The Stiefel manifold
  Quotient manifolds
    Theory of quotient manifolds
    Functions on quotient manifolds
    The real projective space RP^{n-1}
    The Grassmann manifold Grass(p, n)
  Tangent vectors and differential maps
    Tangent vectors
    Tangent vectors to a vector space
    Tangent bundle
    Vector fields
    Tangent vectors as derivations
    Differential of a mapping
    Tangent vectors to embedded submanifolds
    Tangent vectors to quotient manifolds
  Riemannian metric, distance, and gradients
    Riemannian submanifolds
    Riemannian quotient manifolds
  Notes and references
Line-Search Algorithms on Manifolds
  Retractions
    Retractions on embedded submanifolds
    Retractions on quotient manifolds
    Retractions and local coordinates
  Line-search methods
  Convergence analysis
    Convergence on manifolds
    A topological curiosity
    Convergence of line-search methods
  Stability of fixed points
  Speed of convergence
    Order of convergence
    Rate of convergence of line-search methods
  Rayleigh quotient minimization on the sphere
    Cost function and gradient calculation
    Critical points of the Rayleigh quotient
    Armijo line search
    Exact line search
    Accelerated line search: locally optimal conjugate gradient
    Links with the power method and inverse iteration
  Refining eigenvector estimates
  Brockett cost function on the Stiefel manifold
    Cost function and search direction
    Critical points
  Rayleigh quotient minimization on the Grassmann manifold
    Cost function and gradient calculation
    Line-search algorithm
  Notes and references
Matrix Manifolds: Second-Order Geometry
  Newton's method in R^n
  Affine connections
  Riemannian connection
    Symmetric connections
    Definition of the Riemannian connection
    Riemannian connection on Riemannian submanifolds
    Riemannian connection on quotient manifolds
  Geodesics, exponential mapping, and parallel translation
  Riemannian Hessian operator
  Second covariant derivative
  Notes and references
Newton's Method
  Newton's method on manifolds
  Riemannian Newton method for real-valued functions
  Local convergence
    Calculus approach to local convergence analysis
  Rayleigh quotient algorithms
    Rayleigh quotient on the sphere
    Rayleigh quotient on the Grassmann manifold
    Generalized eigenvalue problem
    The nonsymmetric eigenvalue problem
    Newton with subspace acceleration: Jacobi-Davidson
  Analysis of Rayleigh quotient algorithms
    Convergence analysis
    Numerical implementation
  Notes and references
Trust-Region Methods
  Models
    Models in R^n
    Models in general Euclidean spaces
    Models on Riemannian manifolds
  Trust-region methods
    Trust-region methods in R^n
    Trust-region methods on Riemannian manifolds
  Computing a trust-region step
    Computing a nearly exact solution
    Improving on the Cauchy point
  Convergence analysis
    Global convergence
    Local convergence
    Discussion
  Applications
    Checklist
    Symmetric eigenvalue decomposition
    Computing an extreme eigenspace
  Notes and references
A Constellation of Superlinear Algorithms
  Vector transport
    Vector transport and affine connections
    Vector transport by differentiated retraction
    Vector transport on Riemannian submanifolds
    Vector transport on quotient manifolds
  Approximate Newton methods
    Finite difference approximations
    Secant methods
  Conjugate gradients
    Application: Rayleigh quotient minimization
  Least-square methods
    Gauss-Newton methods
    Levenberg-Marquardt methods
  Notes and references
Elements of Linear Algebra, Topology, and Calculus
  Linear algebra
  Topology
  Functions
  Asymptotic notation
  Derivatives
  Taylor's formula
Bibliography
Index

Nickolay T. Trendafilov - Foundations of Computational Mathematics
The book is very well and carefully written. Every chapter starts with a page-long introduction clearly outlining its goals and how they are achieved together with possible relations to other chapters. I find the material very well explained and supported with appropriate examples. It is a pleasure to work with such a book.