A practical approach to Kernel PCA

Lorenzo D'Isidoro
4 min read · Dec 2, 2020

A practical approach to Kernel Principal Component Analysis using Python.

Introduction

In this article, we discuss Kernel Principal Component Analysis (Kernel PCA) and how it can be implemented using Python.

First of all, I suggest reading about PCA; you will also find useful background in my previous article.
Basically, the aim of PCA is to reduce the dimensionality of a dataset by compressing it into a new feature subspace, keeping only the eigenvectors, called principal components, that carry most of the information. Kernel PCA extends this to data that cannot be linearly separated: it maps the samples into a higher-dimensional space where they become separable, and then projects them into a smaller subspace that can be used by linear classifiers and other algorithms that assume linearly separable data.

Here you can find the repository containing the LaTeX file in which the formulas are reported.

Overview

Many machine learning algorithms assume that the training dataset is linearly separable; when it is not, Kernel PCA can be used to transform it into a smaller, linearly separable one.

The idea is to perform a non-linear mapping of the n dataset features, which cannot be linearly separated, into a new, higher-dimensional space, and then apply standard PCA there to project the samples into a smaller subspace in which they are linearly separable.

Figure 1: A plot showing that Kernel PCA is able to find a projection of the data that makes it linearly separable. Image source: scikit-learn.org.

Kernel Function

The first step is to build the covariance matrix and decompose it into k eigenvectors, called principal components, where k is the dimensionality of the new feature subspace.

As shown in the previous article, the eigenvalues and eigenvectors can be calculated starting from the equation Σv = λv. In the kernel case, this equation is rewritten in terms of a kernel function κ, so that we never have to explicitly compute the mapping φ into the higher-dimensional space.
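Concretely, the derivation runs as follows (a standard kernel-PCA argument; the notation below is mine and may differ from the repository's LaTeX file):

% Covariance matrix of the mapped samples, assumed centered in feature space
C = \frac{1}{n}\sum_{i=1}^{n} \phi(x_i)\,\phi(x_i)^{\top}, \qquad C v = \lambda v
% Every eigenvector lies in the span of the mapped samples,
% v = \sum_{i} \alpha_i \phi(x_i), so the eigenproblem reduces to
K \alpha = n \lambda \alpha, \qquad K_{ij} = \kappa(x_i, x_j) = \phi(x_i)^{\top} \phi(x_j)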

There are several kernel functions that can be used (here you can find a deeper treatment of this topic), but in this case we will use the Gaussian form of the kernel function, also called RBF (Radial Basis Function), defined as follows:

κ(xᵢ, xⱼ) = exp(−γ ‖xᵢ − xⱼ‖²)

It’s important to keep in mind that the gamma (γ) value in the kernel function is a free hyperparameter chosen empirically, so it requires a tuning phase.
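As a minimal sketch of this step (the helper name rbf_kernel_matrix is my own, not taken from the article's repository), the kernel matrix can be computed with scipy's pairwise-distance utilities:

import numpy as np
from scipy.spatial.distance import pdist, squareform

def rbf_kernel_matrix(X, gamma):
    # Pairwise squared Euclidean distances, returned as a condensed vector
    sq_dists = pdist(X, 'sqeuclidean')
    # Convert to a symmetric n x n matrix of distances
    mat_sq_dists = squareform(sq_dists)
    # Apply the Gaussian (RBF) kernel elementwise
    return np.exp(-gamma * mat_sq_dists)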

Eigenvectors

Before selecting the eigenvectors of K, the kernel matrix must be centered, since we cannot assume that the mapped samples have zero mean in the new feature space:

K' = K − 1ₙK − K1ₙ + 1ₙK1ₙ

where 1ₙ is an n × n matrix in which every entry equals 1/n.

The k eigenvectors and eigenvalues can then be obtained from the centered kernel matrix K' using the homogeneous linear system and the characteristic polynomial, as shown in this article.
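In practice it is easier to let scipy solve the symmetric eigenproblem directly; a minimal sketch of the centering and decomposition steps (project_kernel_matrix is a name of my choosing):

import numpy as np
from scipy.linalg import eigh

def project_kernel_matrix(K, n_components):
    n = K.shape[0]
    one_n = np.ones((n, n)) / n
    # Center the kernel matrix: K' = K - 1n.K - K.1n + 1n.K.1n
    K_centered = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    # eigh returns the eigenvalues of a symmetric matrix in ascending order
    eigvals, eigvecs = eigh(K_centered)
    # Stack the eigenvectors of the k largest eigenvalues as columns
    return np.column_stack([eigvecs[:, -i] for i in range(1, n_components + 1)])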

Python Implementation

The sklearn.decomposition module of the Python library scikit-learn includes matrix decomposition algorithms, among them Kernel PCA; here you can find an example which shows that Kernel PCA is able to find a projection of the data that makes it linearly separable.
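For reference, the scikit-learn version takes only a few lines (the parameter values below are illustrative choices of mine, not taken from the article):

from sklearn.datasets import make_circles
from sklearn.decomposition import KernelPCA

# Two concentric circles: not linearly separable in the input space
X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)

# Kernel PCA with an RBF kernel; gamma would normally be tuned
kpca = KernelPCA(n_components=2, kernel='rbf', gamma=10)
X_kpca = kpca.fit_transform(X)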

Below, the same algorithm is implemented using the scipy and numpy libraries; scikit-learn is used only to initialize the dataset, in this case with make_circles.
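A minimal, self-contained sketch of such an implementation (function names and parameter values are my own choices; the repository version may differ):

import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial.distance import pdist, squareform
from scipy.linalg import eigh
from sklearn.datasets import make_circles

def rbf_kernel_pca(X, gamma, n_components):
    # 1. RBF kernel matrix from pairwise squared Euclidean distances
    K = np.exp(-gamma * squareform(pdist(X, 'sqeuclidean')))
    # 2. Center the kernel matrix
    n = K.shape[0]
    one_n = np.ones((n, n)) / n
    K = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    # 3. Eigendecomposition; keep eigenvectors of the k largest eigenvalues
    eigvals, eigvecs = eigh(K)
    return np.column_stack([eigvecs[:, -i] for i in range(1, n_components + 1)])

if __name__ == '__main__':
    # Two concentric circles, not linearly separable in the input space
    X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)
    X_kpca = rbf_kernel_pca(X, gamma=15.0, n_components=2)
    plt.scatter(X_kpca[y == 0, 0], X_kpca[y == 0, 1], color='red', alpha=0.5)
    plt.scatter(X_kpca[y == 1, 0], X_kpca[y == 1, 1], color='blue', alpha=0.5)
    plt.xlabel('PC 1')
    plt.ylabel('PC 2')
    plt.show()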

The entire code is available here.

To use this script, download it, install the dependencies, and run it from the command line as shown below. Python 3 and pip are required.

$ pip install -U numpy scipy matplotlib pandas sympy scikit-learn
$ python3 kernel_pca_test.py

Bibliography

  • “Radial basis function kernel”, wikipedia.org
  • “Kernel principal component analysis”, wikipedia.org
  • Python Machine Learning, Sebastian Raschka, 2015
  • “scikit-learn documentation”, scikit-learn.org
