
Matrix Computations
Matrix computations are some of the core sources of computational expense in machine learning.



QUICSVD: Fast SVD Using Cosine Trees
Michael Holmes, Alexander Gray, and Charles Isbell
Neural Information Processing Systems (NIPS) 2008, appeared 2009
Fast singular value decomposition (SVD) using multiresolution Monte Carlo based on cosine trees, a new data structure.
Unlike previous Monte Carlo linear algebra approaches, it is not limited to a fixedrank approximation and is thus automatically
as accurate as the user requires, without the need for trying many different ranks or other parameters. It is also more efficient for the same number of samples than previous sampling schemes.
[pdf]
Abstract:
The Singular Value Decomposition is a key operation in many machine learning
methods. Its computational cost, however, makes it unscalable and impractical for
the massivesized datasets becoming common in applications. We present a new
method, QUICSVD, for fast approximation of the full SVD with automatic
sample size minimization and empirical relative error control. Previous Monte Carlo
approaches have not addressed the full SVD nor benefited from the efficiency of
automatic, empiricallydriven sample sizing. Our empirical tests show speedups
of several orders of magnitude over exact SVD. Such scalability should enable
QUICSVD to meet the needs of a wide array of methods and applications.
@Inproceedings{holmes2009quicsvd,
author = "Michael Holmes and Alexander Gray and Charles Isbell",
Title = "{QUICSVD: Fast SVD Using Cosine Trees}",
booktitle = "Advances in Neural Information Processing Systems 21 (Dec 2008)",
Year = "2009",
publisher = {MIT Press}
}

See also
Methods for Kernel MatrixVector Multiplications
Kernel matrixvector multiplications can be done quickly using efficient kernel summation methods.
[see webpage here]
In preparation
Kernel PCA and Gaussian Process Regression
While our fast PCA/SVD method can be used for kernel PCA and Gaussian process regression,
we are developing a methodology which does not have to explicitly form the
N x N kernel matrix, circumventing significant computational and storage costs.


Fast Kernel MatrixVector Multiplication with Application to Gaussian Process Learning
Alexander G. Gray
Carnegie Mellon University Technical Report, 2004
An example of using our dualtree, automaticallyerrorcontrolled kernel summation methods for fast matrixvector multiplication with kernel matrices, as arises in Gaussian process learning. The idea of using kernel summation methods inside linear algebraic methods has been used in some computational physics problems, but was introduced for statistical problems for the first time here.
[pdf]
Abstract:
A number of core computational problems in machine learning, both old and new, can be cast as a matrixvector multiplication between a kernel matrix or classprobability matrix and a vector of weights. This arises prominently, for example, in the kernel estimation methods of nonparametric statistics, many common probabilistic graphical models, and the more recent kernel machines. After highlighting the existence of this computational problem in several wellknown machine learning methods, we focus on a solution for one specific example for clarity, Gaussian process (GP) prediction  one whose applicability has been particularly hindered by this computational barrier. We demonstrate the application of a recent Nbody approach developed specically for statistical problems, employing adaptive computational geometry and finitedifference approximation. This core algorithm reduces the O(N^2) matrixvector multiplications within GP learning to O(N), making the resulting overall learning algorithm O(N). GP learning for N = 1 million points is demonstrated.
@techreport{gray2004kermatvec,
title = "{Fast Kernel MatrixVector Multiplication with Application to Gaussian Process Regression}",
author = "Alexander G. Gray",
institution = "{Carnegie Mellon University}",
series = "{Computer Science Department Technical Report, CMUCS04110}",
year = "2004"
}

