Optimization Algorithms on Matrix Manifolds

Hardcover

Authors: P.-A. Absil, R. Mahony, R. Sepulchre

ISBN-10: 0691132984

ISBN-13: 9780691132983

Category: Algorithms

Many problems in the sciences and engineering can be rephrased as optimization problems on matrix search spaces endowed with a so-called manifold structure. This book shows how to exploit the special structure of such problems to develop efficient numerical algorithms. It places careful emphasis on both the numerical formulation of the algorithm and its differential geometric abstraction—illustrating how good algorithms draw equally from the insights of differential geometry, optimization,...


"The treatment strikes an appropriate balance between mathematical, numerical, and algorithmic points of view. The quality of the writing is quite high and very readable. The topic is very timely and is certainly of interest to myself and my students."--Kyle A. Gallivan, Florida State University

"This book is succinct but essentially self-contained; it includes an appendix with background material as well as an extensive bibliography. The algorithmic techniques developed may be useful anytime a model leads to a mathematical optimization problem where the domain naturally is a manifold, particularly if the manifold is a matrix manifold. The book follows the usual definition-theorem-proof style but it is not intended for traditional course work so there are no exercises. A reader with limited exposure to manifold theory and differential geometry most likely will benefit from consulting standard texts on those subjects first."--Anders Linnér, Mathematical Reviews

Optimization Algorithms on Matrix Manifolds

By P.-A. Absil, R. Mahony, and R. Sepulchre
Princeton University Press
Copyright © 2007 Princeton University Press
All rights reserved.
ISBN: 978-0-691-13298-3

Chapter One: Introduction

This book is about the design of numerical algorithms for computational problems posed on smooth search spaces. The work is motivated by matrix optimization problems characterized by symmetry or invariance properties in the cost function or constraints. Such problems abound in algorithmic questions pertaining to linear algebra, signal processing, data mining, and statistical analysis. The approach taken here is to exploit the special structure of these problems to develop efficient numerical procedures.

An illustrative example is the eigenvalue problem. Because of their scale invariance, eigenvectors are not isolated in vector spaces. Instead, each eigendirection defines a linear subspace of eigenvectors. For numerical computation, however, it is desirable that the solution set consist only of isolated points in the search space. An obvious remedy is to impose a norm equality constraint on iterates of the algorithm. The resulting spherical search space is an embedded submanifold of the original vector space. An alternative approach is to "factor" the vector space by the scale-invariant symmetry operation such that any subspace becomes a single point. The resulting search space is a quotient manifold of the original vector space. These two approaches provide prototype structures for the problems considered in this book.

Scale invariance is just one of several symmetry properties regularly encountered in computational problems. In many cases, the underlying symmetry property can be exploited to reformulate the problem as a nondegenerate optimization problem on an embedded or quotient manifold associated with the original matrix representation of the search space.
These constraint sets carry the structure of nonlinear matrix manifolds. This book provides the tools to exploit such structure in order to develop efficient matrix algorithms in the underlying total vector space.

Working with a search space that carries the structure of a nonlinear manifold introduces certain challenges in the algorithm implementation. In their classical formulation, iterative optimization algorithms rely heavily on the Euclidean vector space structure of the search space; a new iterate is generated by adding an update increment to the previous iterate in order to reduce the cost function. The update direction and step size are generally computed using a local model of the cost function, typically based on (approximate) first and second derivatives of the cost function, at each step. In order to define algorithms on manifolds, these operations must be translated into the language of differential geometry. This process is a significant research program that builds upon solid mathematical foundations. Advances in that direction have been dramatic over the last two decades and have led to a solid conceptual framework. However, generalizing a given optimization algorithm on an abstract manifold is only the first step towards the objective of this book. Turning the algorithm into an efficient numerical procedure is a second step that ultimately justifies or invalidates the first part of the effort. At the time of publishing this book, the second step is more an art than a theory.

Good algorithms result from the combination of insight from differential geometry, optimization, and numerical analysis. A distinctive feature of this book is that as much attention is paid to the practical implementation of the algorithm as to its geometric formulation.
In particular, the concrete aspects of algorithm design are formalized with the help of the concepts of retraction and vector transport, which are relaxations of the classical geometric concepts of motion along geodesics and parallel transport. The proposed approach provides a framework to optimize the efficiency of the numerical algorithms while retaining the convergence properties of their abstract geometric counterparts.

The geometric material in the book is mostly confined to Chapters 3 and 5. Chapter 3 presents an introduction to Riemannian manifolds and tangent spaces that provides the necessary tools to tackle simple gradient-descent optimization algorithms on matrix manifolds. Chapter 5 covers the advanced material needed to define higher-order derivatives on manifolds and to build the analog of first- and second-order local models required in most optimization algorithms. The development provided in these chapters ranges from the foundations of differential geometry to advanced material relevant to our applications. The selected material focuses on those geometric concepts that are particular to the development of numerical algorithms on embedded and quotient manifolds. Not all aspects of classical differential geometry are covered, and some emphasis is placed on material that is nonstandard or difficult to find in the established literature. A newcomer to the field of differential geometry may wish to supplement this material with a classical text. Suggestions for excellent texts are provided in the references.

A fundamental, but deliberate, omission in the book is a treatment of the geometric structure of Lie groups and homogeneous spaces. Lie theory is derived from the concepts of symmetry and seems to be a natural part of a treatise such as this. However, with the purpose of reaching a community without an extensive background in geometry, we have omitted this material in the present book.
Occasionally the Lie-theoretic approach provides an elegant shortcut or interpretation for the problems considered. An effort is made throughout the book to refer the reader to the relevant literature whenever appropriate.

The algorithmic material of the book is interlaced with the geometric material. Chapter 4 considers gradient-descent line-search algorithms. These simple optimization algorithms provide an excellent framework within which to study the important issues associated with the implementation of practical algorithms. The concept of retraction is introduced in Chapter 4 as a key step in developing efficient numerical algorithms on matrix manifolds. The later chapters on algorithms provide the core results of the book: the development of Newton-based methods in Chapter 6 and of trust-region methods in Chapter 7, and a survey of other superlinear methods such as conjugate gradients in Chapter 8. We attempt to provide a generic development of each of these methods, building upon the material of the geometric chapters. The methodology is then developed into concrete numerical algorithms on specific examples. In the analysis of superlinear and second-order methods, the concept of vector transport (introduced in Chapter 8) is used to provide an efficient implementation of methods such as conjugate gradient and other quasi-Newton methods. The algorithms obtained in these sections of the book are competitive with state-of-the-art numerical linear algebra algorithms for certain problems.

The running example used throughout the book is the calculation of invariant subspaces of a matrix (and the many variants of this problem). Among the problems addressed by algorithms developed within the proposed framework, this example is by far the one with the broadest scope of applications and the highest degree of achievement to date.
Numerical algorithms, based on a geometric formulation, have been developed that compete with the best available algorithms for certain classes of invariant subspace problems. These algorithms are explicitly described in the later chapters of the book and, in part, motivate the whole project. Because of the important role of this class of problems within the book, the first part of Chapter 2 provides a detailed description of the invariant subspace problem, explaining why and how this problem leads naturally to an optimization problem on a matrix manifold. The second part of Chapter 2 presents other applications that can be recast as problems of the same nature. These problems are the subject of ongoing research, and the brief exposition given is primarily an invitation for interested researchers to join with us in investigating these problems and expanding the range of applications considered.

The book should primarily be considered a research monograph, as it reports on recently published results in an active research area that is expected to develop significantly beyond the material presented here. At the same time, every possible effort has been made to make the book accessible to the broadest audience, including applied mathematicians, engineers, and computer scientists with little or no background in differential geometry. It could equally well qualify as a graduate textbook for a one-semester course in advanced optimization. More advanced sections that can be readily skipped at a first reading are indicated with a star. Moreover, readers are encouraged to visit the book home page where supplementary material is available.

The book is an extension of the first author's Ph.D. thesis [Abs03], itself a project that drew heavily on the material of the second author's Ph.D. thesis [Mah94]. It would not have been possible without the many contributions of a quickly expanding research community that has been working in the area over the last decade.
The Notes and References section at the end of each chapter is an attempt to give proper credit to the many contributors, even though this task becomes increasingly difficult for recent contributions. The authors apologize for any omission or error in these notes. In addition, we wish to conclude this introductory chapter with special acknowledgements to people without whom this project would have been impossible. The 1994 monograph [HM94] by Uwe Helmke and John Moore is a milestone in the formulation of computational problems as optimization algorithms on manifolds and has had a profound influence on the authors. On the numerical side, the constant encouragement of Paul Van Dooren and Kyle Gallivan has provided tremendous support to our efforts to reconcile the perspectives of differential geometry and numerical linear algebra. We are also grateful to all our colleagues and friends over the last ten years who have crossed paths as coauthors, reviewers, and critics of our work. Special thanks to Ben Andrews, Chris Baker, Alan Edelman, Michiel Hochstenbach, Knut Hüper, Jonathan Manton, Robert Orsi, and Jochen Trumpf. Finally, we acknowledge the useful feedback of many students on preliminary versions of the book, in particular, Mariya Ishteva, Michel Journée, and Alain Sarlette.

(Continues...)

Excerpted from Optimization Algorithms on Matrix Manifolds by P.-A. Absil, R. Mahony, and R. Sepulchre
Copyright © 2007 by Princeton University Press. Excerpted by permission.
All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.
Excerpts are provided by Dial-A-Book Inc. solely for the personal use of visitors to this web site.
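
The eigenvalue example from the excerpt can be made concrete with a small sketch. It is not taken from the book: it is a minimal illustration, under stated assumptions, of the embedded-submanifold approach, where the Rayleigh quotient x^T A x is minimized over the unit sphere and each update is mapped back onto the sphere by normalization (a simple retraction). The function names, the fixed step size, and the 3 x 3 test matrix are all hypothetical choices made for illustration; the book's table of contents lists Armijo and exact line searches, which this sketch replaces with a constant step.

```python
import math


def matvec(A, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(x):
    n = math.sqrt(dot(x, x))
    return [xi / n for xi in x]


def rayleigh(A, x):
    """Cost function f(x) = x^T A x, evaluated at a unit vector x."""
    return dot(x, matvec(A, x))


def riemannian_gradient(A, x):
    """Project the Euclidean gradient 2Ax onto the tangent space of the
    sphere at x, giving 2(Ax - (x^T A x) x)."""
    Ax = matvec(A, x)
    r = dot(x, Ax)
    return [2.0 * (axi - r * xi) for axi, xi in zip(Ax, x)]


def retract(x, xi):
    """Map the tangent step xi at x back onto the sphere by normalizing."""
    return normalize([a + b for a, b in zip(x, xi)])


def minimize_rayleigh(A, x0, step=0.1, tol=1e-10, max_iter=10000):
    """Fixed-step gradient descent on the unit sphere (assumed step size;
    a line search would normally choose it)."""
    x = normalize(x0)
    for _ in range(max_iter):
        g = riemannian_gradient(A, x)
        if math.sqrt(dot(g, g)) < tol:
            break
        x = retract(x, [-step * gi for gi in g])
    return x, rayleigh(A, x)


if __name__ == "__main__":
    # Symmetric test matrix chosen for illustration only.
    A = [[2.0, 1.0, 0.0],
         [1.0, 2.0, 0.0],
         [0.0, 0.0, 3.0]]
    x, value = minimize_rayleigh(A, [1.0, 0.0, 0.5])
    print("approximate eigenvector:", x)
    print("approximate smallest eigenvalue:", value)
```

Because every iterate stays on the sphere, the minimizers are isolated points (up to sign), which is the property the excerpt asks of the search space. The Euclidean update "add an increment" is replaced by "add a tangent increment, then retract".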

List of Algorithms
Foreword, by Paul Van Dooren
Notation Conventions
Introduction
Motivation and Applications
    A case study: the eigenvalue problem
    The eigenvalue problem as an optimization problem
    Some benefits of an optimization framework
    Research problems
    Singular value problem
    Matrix approximations
    Independent component analysis
    Pose estimation and motion recovery
    Notes and references
Matrix Manifolds: First-Order Geometry
    Manifolds
    Definitions: charts, atlases, manifolds
    The topology of a manifold
    How to recognize a manifold
    Vector spaces as manifolds
    The manifolds R^(n x p) and R_*^(n x p)
    Product manifolds
    Differentiable functions
    Immersions and submersions
    Embedded submanifolds
    General theory
    The Stiefel manifold
    Quotient manifolds
    Theory of quotient manifolds
    Functions on quotient manifolds
    The real projective space RP^(n-1)
    The Grassmann manifold Grass(p, n)
    Tangent vectors and differential maps
    Tangent vectors
    Tangent vectors to a vector space
    Tangent bundle
    Vector fields
    Tangent vectors as derivations
    Differential of a mapping
    Tangent vectors to embedded submanifolds
    Tangent vectors to quotient manifolds
    Riemannian metric, distance, and gradients
    Riemannian submanifolds
    Riemannian quotient manifolds
    Notes and references
Line-Search Algorithms on Manifolds
    Retractions
    Retractions on embedded submanifolds
    Retractions on quotient manifolds
    Retractions and local coordinates
    Line-search methods
    Convergence analysis
    Convergence on manifolds
    A topological curiosity
    Convergence of line-search methods
    Stability of fixed points
    Speed of convergence
    Order of convergence
    Rate of convergence of line-search methods
    Rayleigh quotient minimization on the sphere
    Cost function and gradient calculation
    Critical points of the Rayleigh quotient
    Armijo line search
    Exact line search
    Accelerated line search: locally optimal conjugate gradient
    Links with the power method and inverse iteration
    Refining eigenvector estimates
    Brockett cost function on the Stiefel manifold
    Cost function and search direction
    Critical points
    Rayleigh quotient minimization on the Grassmann manifold
    Cost function and gradient calculation
    Line-search algorithm
    Notes and references
Matrix Manifolds: Second-Order Geometry
    Newton's method in R^n
    Affine connections
    Riemannian connection
    Symmetric connections
    Definition of the Riemannian connection
    Riemannian connection on Riemannian submanifolds
    Riemannian connection on quotient manifolds
    Geodesics, exponential mapping, and parallel translation
    Riemannian Hessian operator
    Second covariant derivative
    Notes and references
Newton's Method
    Newton's method on manifolds
    Riemannian Newton method for real-valued functions
    Local convergence
    Calculus approach to local convergence analysis
    Rayleigh quotient algorithms
    Rayleigh quotient on the sphere
    Rayleigh quotient on the Grassmann manifold
    Generalized eigenvalue problem
    The nonsymmetric eigenvalue problem
    Newton with subspace acceleration: Jacobi-Davidson
    Analysis of Rayleigh quotient algorithms
    Convergence analysis
    Numerical implementation
    Notes and references
Trust-Region Methods
    Models
    Models in R^n
    Models in general Euclidean spaces
    Models on Riemannian manifolds
    Trust-region methods
    Trust-region methods in R^n
    Trust-region methods on Riemannian manifolds
    Computing a trust-region step
    Computing a nearly exact solution
    Improving on the Cauchy point
    Convergence analysis
    Global convergence
    Local convergence
    Discussion
    Applications
    Checklist
    Symmetric eigenvalue decomposition
    Computing an extreme eigenspace
    Notes and references
A Constellation of Superlinear Algorithms
    Vector transport
    Vector transport and affine connections
    Vector transport by differentiated retraction
    Vector transport on Riemannian submanifolds
    Vector transport on quotient manifolds
    Approximate Newton methods
    Finite difference approximations
    Secant methods
    Conjugate gradients
    Application: Rayleigh quotient minimization
    Least-square methods
    Gauss-Newton methods
    Levenberg-Marquardt methods
    Notes and references
Elements of Linear Algebra, Topology, and Calculus
    Linear algebra
    Topology
    Functions
    Asymptotic notation
    Derivatives
    Taylor's formula
Bibliography
Index

"The book is very well and carefully written. Every chapter starts with a page-long introduction clearly outlining its goals and how they are achieved together with possible relations to other chapters. I find the material very well explained and supported with appropriate examples. It is a pleasure to work with such a book."--Nickolay T. Trendafilov, Foundations of Computational Mathematics