Utils code documentation¶

An introductory descrption…then doxygen inputs Here is how we can link one of the classes below template<class TKernel, typename T>IncCholesky bla lab..

Incomplete Cholesky Decomposition¶

Here is how we can cite one of the details documents Incomplete Cholesky Decomposition of The Kernel Matrix

template<class TKernel, typename T, typename TInputD> class IncCholesky¶

Incomplete (pivoted) Cholesky decomposition of the kernel matrix.

Description:

Given a set of input data in the input space \( \{ \mathbf{x}_i \}_{i=1}^{N} \mathbf{x}_i \in \mathbb{R}^d \) and a non- linear mapping \( \varphi{(\cdot)}: \mathbb{R}^d\to\mathbb{R}^{n_h}\) through the corresponding kernel function \( \kappa(\mathbf{x}_i,\mathbf{x}_j) = \varphi{(\mathbf{x}_i)}^T\varphi{(\mathbf{x}_i)}\) to evaluate the inner products in the feature space, the Gram (or kernel) matrix \( \Omega_{ij} = \varphi{(\mathbf{x}_i)}^T\varphi{(\mathbf{x}_i)} = \kappa(\mathbf{x}_i,\mathbf{x}_j), \in \mathbb{R}^{(N\times N)} \). (The crrent implementation assumes a normalised kernel i.e. \( \kappa(\varphi(\mathbf{x})_i, \varphi(\mathbf{x})_i) = \varphi(\mathbf{x})_i^T\varphi(\mathbf{x})_i =1 \) which is used only in the initialisation of the diagonals and the error computation.)

The incomplete Cholesky decomposition generates the \( G \in \mathbb{R}^{(R\times N)}, R \leq N\) upper triangualr matrix (or optionally its transpose) such that \( \Omega \approx \tilde{\Omega} = G^TG \). Actually, the transpose of the feature map matrix \( \Phi = [\varphi(\mathbf{x})_1^T,\varphi(\mathbf{x})_2^T,\dots,\varphi(\mathbf{x})_N^T] \in \mathbb{R}^{(N \times n_h)} \) is orthogonalised by geedily slecting the feature map at each step with the highest residual norm and normalising to the previously selected and orthogonalised vectors. During this QR decomposition of \( \Phi^T \), an orthonormal basis is greedily generated and at the \( k\)-th steps the \( k\) columns of the \( Q \in \mathbb{R}^{(n_h \times k)} \) contains these orthonormal basis vectors while the columns of the matrix \( G \in \mathbb{R}^{(k \times N)}\) contains the projections of the \( \{\varphi(\mathbf{x})_1,\dots,\varphi(\mathbf{x})_N^T \) feature maps onto these basis. Therefore, the matrix \( Q^T\Phi^T \approx G \in \mathbb{R}^{(R \times N)}, R \leq N\) can be interpreted as a low dimensional representation of the feature map of the input data and \( \Omega = \Phi\Phi^T \approx [QG]^T[QG] = G^TG \) gives a low rank approximation of the kernel matrix.

See more details at the Incomplete Cholesky Decomposition of The Kernel Matrix section of the documentation.

How to use:

The pointer to the input data matrix, the kernel and its parameters needs to be set before the decompositon:

kernel function: the class is templated on the kernel function class so the user needs to chose the kernel when instantiate the class object. The IncCholesky::SetKernelParameters method (that will invoke the corresponding interface method of the kernel function object) is provided by the class to set the paraneters of the kernel.
input data: the user needs to set the pointer to the input data matrix before invoking the decomposition. The input data matrix is assumed to store the input data vectors (in the input space) in row-major format (i.e. each individual data vector is memory continuos). The IncCholesky::SetInputDataMatrix method is by the class to set the input data matrix pointer.
decompositon: the IncCholesky::Decompose method can be invoked by the user to perform the incomplete Cholesky decomposition after completing all these above required settings. See more details on the termination criteria at the documentation of the IncCholesky::Decompose method.

Result:

The incomplete Cholesky decompositon i.e. the \( G \in mathbb{R}^{(R\times N)}\) (or its traspose if it was required) upper triangualr matrix such that \( \Omega = \Phi\Phi^T \approx [QG]^T[QG] = G^TG \) is available after the decomposition (IncCholesky::Decompose) is completed. A pointer to the matrix \( G \) can be obtained by the IncCholesky::GetICholMatrix method while a reference to the vector containing the input data indices, as the result of the permutations done during the decomposition i.e. according to the columns of \( G\), can be obtained by the IncCholesky::GetPermutationVector method.

Author: M. Novak
Date: December 2020

Data members

TKernel *fKernel¶: Pointer to the kernel object that implements the kernel function.

Public Functions

template<typename ...Args> void SetKernelParameters(Args... args)¶

Public method to set the parameters of the kernel function object.

The type of the kernel function object is selected at instantiation of an IncCholesky object since the class is templated on this type. This public method can be used to set the parameters of the kernel function object. Note, that the method will invoke the corresponding \( \texttt{SetKernelParameters}\) method of the kernel function obejct through the base class KernelBase::SetParameters interface method by passing all provided parameters as arguments.

Parameters

[in] args: input argument list.

void SetInputDataMatrix(Matrix<TInputD, false> *inDataM)¶

Public method to set the input pointer to the input data matrix.

Parameters

[in] inDataM: pointer to the input data matrix that stores the input data vector in row-major (memory continous) order.

void Decompose(double tolError, size_t maxIter, bool transpose = true)¶

Public method to perform the incomplete Cholesky decomposition of the kernel matrix.

Note, that the pointer to the input data matrix as well as the kernel functon parameters needs to be set before invoking this method.

The decomposition will be terminated when either the proided approximation error or maximum number of iteration ( i.e. selected data vectors or rank of the approximation) is reached. The incomplete Cholesky decomposition can be interpreted as a trace optimisation \( \| \Omega - \tilde{\Omega}\|_1 \) (where \( \| \|_1\) denotes the sum of singular values which in turn is eaual to the sum of eigenvalues and that is equal to the trace of the matrix since the matrices involved are symmetric, positive semi-definite) or in orther words \( \texttt{max} [ \texttt{trace} \{\tilde{\Omega}\} ]\) . Therefore, the approximaton error will be computed at each step as \( \eta = \texttt{trace} [ \Omega - \tilde{\Omega} ]/N \) i.e. normalised by \( \texttt{trace} \{ \Omega \} \) assuming normalised kernel i.e. \( \Omega_{ii}=1 \).

Parameters

[in] tolError: maximum tolerated error value (see above)
[in] maxIter: maximum number of iteration (or maximum dimension of the underlying sub-space, or maximum rank of the approximation, maximum number of data vectors orthogonalised)
[in] transpose: the algorithm will compute the \( G \in \mathbb{R}^{(R\times N)}\) upper triangualr matrix. The transpose of this can be requested by the user if this parameter is set to be \(\texttt{true}\) (default).

Matrix<T> *GetICholMatrix() const¶: Public method to obtain a pointer to the resulted incomplete Cholesky factor.

void SetNullICholMatrixPrt()¶: Removes the ownership of the memory related to the incomplete Cholesky matrix.

const std::vector<size_t> &GetPermutationVector() const¶: Public method to obtain a reference to the vector containing the input data indices as the result of the permutations done during the decompositon.

template<class TDerived, typename TReturnType> class KernelBase¶

CRTP base class for Kernels.

Since kernel function evaluations are called frequently, the cost of virtual calls i.e. Dynamic Polymorphism is avoided by using Static Polymorphism based on Curiously Recurring Template Pattern (CRTP). This base class provides the SetParameters and Evaluate interface methods in which the concrete base class implementations of the corresponding methods will be invoked for setting the kernel function parameter(s) and for evaluating the kernel function. The KernelRBF class, that implements the Radial Basis Function (RBF) kernel, is provided as an example for kernel implementations.

Author: M. Novak
Date: December 2020

Public Functions

template<typename ...Args> void SetParameters(Args... args)¶

Interface method to set the kernel function parameters.

The derived class \( \texttt{SetKernelParameters}\) method will be invoked within this method to set the kernel function parameters.

Parameters

[in] args: input argument list.

template<typename ...Args> TReturnType Evaluate(Args... args)¶

Interface method to evaluate the kernel function with the given input arguments.

The derived class \( \texttt{EvaluateKernel} \) method will be invoked within this method to evaluate the kernel function with the given input arguments.

Parameters

[in] args: input argument list.

template<typename T, typename TInpD> class KernelRBF : public KernelBase<KernelRBF<T, TInpD>, T>¶

Radial Bases Function (RBF) Kernel implementation.

The class implements the RBF kernel function in the form of

\[ \exp\left[ - \frac{ ||\mathbf{x}_i - \mathbf{x}_j ||^2 }{\gamma} \right] = \exp\left[ -\frac{\sum_{k=1}^{d} (x_k^i-x_k^j)(x_k^i-x_k^j)}{\gamma} \right] \]

where \( \mathbf{x}_i,\mathbf{x}_j \in \mathcal{D}=\{\mathbf{x}_l\}_{l=1}^{N}, \mathbf{x}_l \in \mathrm{R}^d\) is the input data set and \(\gamma\) is the (bandwidth) kernel parameter.

The two interface methods of the KernelBase base class have been implemented:

KernelBase::SetParameters will invoke the KernelRBF::SetKernelParameters method to set the kernel function parameter (i.e. the \(\gamma\) bandwidth) either as a scalar or vector (the corresponding KernelRBF::SetBandWidth method will be invoked).
KernelBase::Evaluate will invoke the KernelRBF::EvaluateKernel method to evaluate the kernel function (the KernelRBF:EvaluateKernelRBF method will be invoked)

Note

The return type of the kernel evaluation (T) can be differeent from the type of the kernel parameter and input data (TInpD) in the case of this implementatin of the RBF kernel. Any pattern can be implemented if this is not suitable here with the only restriction, that the return type must be either double or float (since those will populate the Kernel matrix).

Also note, that the CTR is assumed to have an empty parameter list!

Author: M. Novak
Date: December 2020

Data members

std::size_t fDim¶: Dimension of the RBF kernel bandwidth parameter.

std::vector<TInpD> fInvBandWV¶: The RBF kernel bandwidth parameter value(s).

Public Functions

KernelRBF()¶: Constructor (must be without arguments).

~KernelRBF()¶: Destructor (nothing to do).

template<typename ...Args> void SetKernelParameters(Args... args)¶

Main method to set the kernel function parameters.

The base class KernelBase::SetParameters interface method invokes this method by passing the argument(s). The appropriate SetBandWidt method will be invoked within this call by selecting the implementation that matches the argument list (scalar or vector in this case).

Parameters

[in] args: input argument list.

void SetBandWidth(TInpD bw)¶

The concrete implementation for setting the scalar kernel bandwidth.

Parameters

[in] bw: the scalar RBF kernel bandwidth \(\gamma\)

void SetBandWidth(std::vector<TInpD> &bw)¶

The concrete implementation for setting the vector kernel bandwidth.

Parameters

[in] bw: the RBF kernel bandwidth vector \(\boldsymbol{\gamma}\)

template<typename ...Args> T EvaluateKernel(Args... args)¶

Main method to evaluate the kernel function.

The base class KernelBase::Evaluate interface method invokes this method by passing the argument(s). The appropriate EvaluateKernelRBF method will be invoked within this call by selecting the implementation that matches the argument list (only one version is implemented).

Parameters

[in] args: input argument list (two input data).

T EvaluateKernelRBF(const TInpD *a, const TInpD *b, std::size_t num)¶

The concrete implementation for evaluating the kernel function (both with scalar or vector bandwidth).

Parameters

[in] a: pointer to the first input data array
[in] b: pointer to the second input data array
[in] num: dimension of the input data

Private Functions

T EvaluateKernelRBF1D(const TInpD *a, const TInpD *b, std::size_t num)¶: Evaluation of the kernel function when scalar bandwidth is used.

T EvaluateKernelRBFMultiD(const TInpD *a, const TInpD *b, std::size_t num)¶: Evaluation of the kernel function when vector bandwidth is used.