CMA-ES with optimal covariance update and storage complexity

Oswin Krause, Dídac Rodríguez Arbonès, Christian Igel

18 Citations (Scopus)

Abstract

The covariance matrix adaptation evolution strategy (CMA-ES) is arguably one of the most powerful real-valued derivative-free optimization algorithms, finding many applications in machine learning. The CMA-ES is a Monte Carlo method, sampling from a sequence of multi-variate Gaussian distributions. Given the function values at the sampled points, updating and storing the covariance matrix dominates the time and space complexity in each iteration of the algorithm. We propose a numerically stable quadratic-time covariance matrix update scheme with minimal memory requirements based on maintaining triangular Cholesky factors. This requires a modification of the cumulative step-size adaption (CSA) mechanism in the CMA-ES, in which we replace the inverse of the square root of the covariance matrix by the inverse of the triangular Cholesky factor. Because the triangular Cholesky factor changes smoothly with the matrix square root, this modification does not change the behavior of the CMA-ES in terms of required objective function evaluations as verified empirically. Thus, the described algorithm can and should replace the standard CMA-ES if updating and storing the covariance matrix matters.

Original languageEnglish
Title of host publicationAdvances in Neural Information Processing Systems 29 (NIPS 2016)
EditorsD. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, R. Garnett
Number of pages9
PublisherCurran Associates, Inc.
Publication date2016
Pages370-378
Publication statusPublished - 2016
Event30th Annual Conference on Neural Information Processing Systems - Barcelona, Spain
Duration: 5 Dec 201610 Dec 2016
Conference number: 30

Conference

Conference30th Annual Conference on Neural Information Processing Systems
Number30
Country/TerritorySpain
CityBarcelona
Period05/12/201610/12/2016

Fingerprint

Dive into the research topics of 'CMA-ES with optimal covariance update and storage complexity'. Together they form a unique fingerprint.

Cite this