Abstract
Dimension reduction is often needed in data mining. The goal of dimension
reduction methods is to map the given high-dimensional data into a
low-dimensional space while preserving certain properties of the initial data.
There are two kinds of
techniques for this purpose. The first kind, projective methods, build an
explicit linear projection from the high-dimensional space to the
low-dimensional one. The second kind, nonlinear methods, use a nonlinear and
implicit mapping between the two spaces. In both cases, the methods considered in
the literature have usually relied on computationally intensive matrix
factorizations, frequently the Singular Value Decomposition (SVD). The
computational burden of the SVD quickly renders these dimension reduction
methods infeasible as practical datasets keep growing in size.
In this paper, we present a new decomposition strategy, Reduced Basis
Decomposition (RBD), which is inspired by the Reduced Basis Method (RBM). Given
the high-dimensional data $X$, the method approximates it by $Y \, T (\approx
X)$, with $Y$ being the low-dimensional surrogate and $T$ the transformation
matrix. $Y$ is obtained through a greedy algorithm and is thus extremely
efficient to compute. In fact, RBD is significantly faster than the SVD with
comparable accuracy. $T$ can be
computed on the fly. Moreover, unlike many compression algorithms, RBD easily
finds the mapping for an arbitrary ``out-of-sample'' vector, and it comes with
an ``error indicator'' certifying the accuracy of the compression. Numerical
results validating these claims are presented.
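
To make the factorization $X \approx Y \, T$ concrete, the following is a
minimal sketch of one possible greedy construction. It is an illustration under
stated assumptions, not the RBD algorithm of this paper: the function name
`greedy_basis_decomposition`, the parameters `d` and `tol`, the choice of the
worst-approximated column at each step, and the Gram-Schmidt-style update are
all hypothetical. The returned `err` is the exactly computed maximum column
error, used here only as a stand-in for an error indicator.

```python
import numpy as np


def greedy_basis_decomposition(X, d, tol=1e-10):
    """Hypothetical greedy sketch: return Y (m x k), T (k x n), err with k <= d.

    Y has orthonormal columns and T = Y^T X, so that X is approximated by Y @ T.
    """
    X = np.asarray(X, dtype=float)
    m, _ = X.shape
    Y = np.zeros((m, 0))
    residual = X.copy()  # part of each column not yet captured by Y
    for _ in range(d):
        errs = np.linalg.norm(residual, axis=0)
        j = int(np.argmax(errs))  # worst-approximated column so far
        if errs[j] <= tol:
            break  # every column is already represented to tolerance
        q = residual[:, j] / errs[j]  # new unit basis vector
        Y = np.column_stack([Y, q])
        residual -= np.outer(q, q @ residual)  # remove the new direction
    T = Y.T @ X
    # Assumption: the true maximum column error stands in for an indicator.
    err = float(np.linalg.norm(X - Y @ T, axis=0).max())
    return Y, T, err


# Usage on synthetic data that has rank 3 by construction.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3)) @ rng.standard_normal((3, 20))
Y, T, err = greedy_basis_decomposition(X, d=3)

# An "out-of-sample" vector in the span of the data maps to coordinates Y^T v.
v = X @ rng.standard_normal(20)
t = Y.T @ v
```

Because `Y` has orthonormal columns in this sketch, mapping a new vector costs
a single matrix-vector product, which is one simple way to read the
``out-of-sample'' property mentioned above.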