Kernels, kernel density estimation, kernel regression

This page presents the kernel density estimation (KDE) and kernel logistic regression (KLR) tools provided by Nuklei. More...

## Classes

class  nuklei::kernel::base
Polymorphic kernel class. More...

class  nuklei::kernel::se3

class  nuklei::kernel::r3xs2_base< OriGrp >

class  nuklei::kernel::r3

class  nuklei::KernelCollection
This class acts as a vector-like container for kernels. It also provides methods related to kernel density estimation. More...

struct  nuklei::KernelLogisticRegressor
Implements kernel logistic regression. More...

## Typedefs

typedef r3xs2_base< groupS::s2nuklei::kernel::r3xs2
$$\mathbb R^3 \times S^2$$

typedef r3xs2_base< groupS::s2pnuklei::kernel::r3xs2p
$$\mathbb R^3 \times S^2_p$$

## Detailed Description

This page presents the kernel density estimation (KDE) and kernel logistic regression (KLR) tools provided by Nuklei.

The KDE and KLR tools provided by Nuklei work in combination with the kernels defined in the nuklei::kernel namespace. These kernels are described below. If one wishes to work with different kernels, there are two options:

1. Change the PositionKernel and OrientationKernel typedefs in one of the kernels (e.g., in kernel::se3),
2. Re-implement KDE or KLR with Generic Kernels.

# Kernels

The nuklei::kernel namespace provides kernels for elements that belong to

• the Special Euclidean Group $$SE(3) = \mathbb R^3 \times SO(3)$$ (i.e., 3D rigid body transformations),
• $$\mathbb R^3 \times S^2$$ (i.e., the product of $$\mathbb R^3$$ and the space of 2DOF orientations).
• $$\mathbb R^3$$.

These kernels are defined as the product of a position kernel and an orientation kernel (except for the third one which is only a position kernel). An important difference between the kernels provided by nuklei::kernel and the kernels discussed in Background, is that the kernels provided by nuklei::kernel are unnormalized. Their value at their center point is equal to 1, and their integral varies with the bandwidth. The motivation behind this choice is that normalization factors are often expensive to compute, and often unnecessary (many algorithms will be happy with a KDE computed up to a multiplicative constant).

The class kernel::se3 implements an $$SE(3)$$ kernel. The method kernel::se3::eval() returns an evaluation of

$\mathcal K_{SE(3)}\left(\lambda, \theta ; \mu_t, \mu_r, \sigma_t, \sigma_r\right) = \mathcal T\left(\lambda ; \mu_t, \sigma_t\right) e^{f_{SE(3)}(\sigma_r) \; (|\mu_r^T \theta|-1)}$

where $$\mathcal T$$ is a triangular position kernel, $$|\cdot|$$ denotes an absolute value, and $$f_{SE(3)}(\sigma_r) = \frac{1}{1-\cos(0.5*\sigma_r)}$$ is a function which allows $$\sigma_r$$ to be expressed in radians, and translates it to the von Mises-Fisher bandwidth parameter. The factor $$e^{f_{SE(3)}(\sigma_r) \; (|\mu_r^T \theta|-1)}$$ efficiently approximates a pair of antipodal von Mises-Fisher distributions (thanks to the absolute value), and returns 1 when evaluated at $$\mu_r$$. The position kernel $$\mathcal T$$ is given by

$\mathcal T\left(\lambda ; \mu_t, \sigma_t\right) = \begin{cases} 1 - \frac{\sqrt{(\lambda-\mu_t)^2}}{2\sigma_t} &\textrm{if } \sqrt{(\lambda-\mu_t)^2} \leq 2\sigma_t \\ 0 &\textrm{if } \sqrt{(\lambda-\mu_t)^2} > 2\sigma_t\end{cases}$

The method kernel::se3::sample() returns samples $$(\lambda, \theta)$$ that follow

$\mathcal K'_{SE(3)}\left(\lambda, \theta ; \mu_t, \mu_r, \sigma_t, \sigma_r\right) = \mathcal T\left(\lambda ; \mu_t, \sigma_t\right) \frac {\mathcal F_4\left( \theta ; \mu_r, f_{SE(3)}(\sigma_r) \right) + \mathcal F_4 \left(\theta; -\mu_r, f_{SE(3)}(\sigma_r)\right)}2.$

We note that this expression uses a pair of von Mises-Fisher distributions instead of the approximation introduced in $$\mathcal K_{SE(3)}$$. The reason behind this difference is that $$K_{SE(3)}$$ is fast to evaluate, but I do not know of an algorithm to generate samples from $$e^{f_{SE(3)}(\sigma_r) \; (|\mu_r^T \theta|-1)}$$. An algorithm for sampling from a von Mises-Fisher distribution exists, which is why $$\mathcal K'_{SE(3)}$$ has that form.

The class kernel::r3xs2p implements an $$\mathbb R^3 \times S^2$$ kernel for axial orientations. The method kernel::r3xs2p::eval() returns an evaluation of

$\mathcal K_{RSA}\left(\lambda, \theta ; \mu_t, \mu_r, \sigma_t, \sigma_r\right) = \mathcal T\left(\lambda ; \mu_t, \sigma_t\right) e^{f_{RSA}(\sigma_r) \; (|\mu_r^T \theta|-1)}$

where $$f_{RSA}(\sigma_r) = \frac{1}{1-\cos(\sigma_r)}$$ is a function which allows $$\sigma_r$$ to be expressed in radians, and translates it to the von Mises-Fisher bandwidth parameter. The factor $$e^{f_{RSA}(\sigma_r) \; (|\mu_r^T \theta|-1)}$$ efficiently approximates a pair of antipodal von Mises-Fisher distributions (thanks to the absolute value), and returns 1 when evaluated at $$\mu_r$$. Similar to the $$SE(3)$$ kernel, the method kernel::r3xs2p::sample() returns samples $$(\lambda, \theta)$$ that follow

$\mathcal K'_{RSA}\left(\lambda, \theta ; \mu_t, \mu_r, \sigma_t, \sigma_r\right) = \mathcal T\left(\lambda ; \mu_t, \sigma_t\right) \frac {\mathcal F_3\left( \theta ; \mu_r, f_{RSA}(\sigma_r) \right) + \mathcal F_3 \left(\theta; -\mu_r, f_{RSA}(\sigma_r)\right)}2.$

The class kernel::r3 implements an $$\mathbb R^3$$ kernel. The method kernel::r3::eval() returns an evaluation of

$\mathcal K_{\mathbb R^3}\left(\lambda ; \mu_t, \sigma_t\right) = \mathcal T\left(\lambda ; \mu_t, \sigma_t\right).$

The method kernel::r3::sample() returns samples $$\lambda$$ that follow $$\mathcal K_{\mathbb R^3}$$.

# Kernel Density Estimation

The KernelCollection class provides access to a kernel density estimate of the density modeled by the kernels it contains. The KDE defined by KernelCollection is expressed as

$f(x) = \sum_{i=1}^n w_i \mathcal K\left(x ; \hat x_i, \sigma\right),$

where $$n$$ is the number of kernels held in a KernelCollection object, $$w_i$$ and $$x_i$$ denote the weight and center point of the $$i^\mathrm{th}$$ kernel, respectively, $$\mathcal K$$ denotes the kernel function implemented by the kernels, and $$\sigma$$ denotes kernel bandwidths, which may correspond to a pair of values if the kernels have both a location and orientation component.

KernelCollection must contain kernels of identical domain of definition (for instance, $$SE(3)$$ kernels cannot be mixed with $$\mathbb R^3$$ kernels). The methods of KernelCollection enforce this constraint (exceptions will be thrown if it is attempted to mix different kernels). KernelCollection holds kernel objects of a class that inherits from kernel::base. Those kernels are by default unnormalized, which impacts KDE in two ways:

## Evaluation

Evaluation is provided by KernelCollection::evaluationAt(). In effect, this method computes

using namespace nuklei; // [for proper linking by doxygen]
{
double value = 0;
for (KernelCollection::const_iterator i = begin(); i != end(); i++)
value += i->getWeight() * i->polyEval(k);
return value;
}

For computational efficiency, the implementation of this function differs in two ways from this piece of code:

• If the number of kernels is larger than 1000, a $$kd$$-tree is used to discard the kernels that are too far from k to be relevant (see below).
• Instead of calling polyEval, evaluationAt first figures out the type of the kernels in this KernelCollection, and calls the static equivalent of polyEval. This trick avoids repeated VTABLE lookups and allows the compiler to inline the kernel evaluation function.

The Nuklei implementation follows the description given in Background and uses a $$kd$$-tree to quickly access kernels that are near the evaluation point. The definition of near is given by the cutPoint method that kernels of nuklei::kernel must implement. This method gives the distance to the kernel's origin at which the kernel value can be assume to be zero. For triangular kernels, this distance is $$2\sigma_t$$.

Calls to KernelCollection::evaluationAt() should be preceded a call to KernelCollection::computeKernelStatistics() (or KernelCollection::normalizeWeights()), which compute statistics necessary for evaluation, and a call to KernelCollection::buildKdTree(). Read Intermediary Results for an explanation of how to use these functions, and kde_evaluate.cpp for an example.

## Sampling

Sampling is provided by KernelCollection::sample(). Random variates from the density are generated as follows:

• First, a kernel index $$i$$ is selected by drawing $$i$$ from

$P(i = \ell) \propto w_{\ell},$

where $$w_{\ell}$$ is the weight of the $$\ell^\mathrm{th}$$ kernel. (This effectively gives a higher chance to kernels with a larger weight.)
• Then, a random variate $$x$$ is generated by sampling from the kernel $$\mathcal K\left(x ; \hat x_i, \sigma\right)$$.

Sampling is provided by KernelCollection::sample(). This method works as follows:

using namespace nuklei; // [for proper linking by doxygen]
{
for (const_sample_iterator i = sampleBegin(sampleSize); i != i.end(); ++i)
{
k->setWeight( 1.0/sampleSize );
}
return s;
}

The loop selects sampleSize kernels with probability $$P(i = \ell) \propto w_{\ell}$$. Calling polySample then draws a sample from each of the selected kernels.

If a single sample is needed, kernelCollection.randomKernel().polySample() can be used. Note that this expression is linear in the number of kernels in kernelCollection.

# Kernel Logistic Regression

KLR is implemented in KernelLogisticRegressor.

nuklei_trsl::ppfilter_iterator< is_picked, const_iterator > const_sample_iterator
Sample Iterator type.
Public namespace.
Definition: Color.cpp:9
Container::const_iterator const_iterator
KernelCollection iterator.
This class acts as a vector-like container for kernels. It also provides methods related to kernel de...
Polymorphic kernel class.
Definition: Kernel.h:45
virtual base::ptr polySample() const =0
Get a sample from the kernel.
NUKLEI_UNIQUE_PTR< kernel::base > ptr
NUKLEI_UNIQUE_PTR for kernel::base.
Definition: Kernel.h:50
Container::iterator end()
Returns an iterator pointing to the last kernel.
KernelCollection sample(int sampleSize) const
Returns sampleSize samples from the density modeled by *this.
Container::iterator begin()
Returns an iterator pointing to the first kernel.
sample_iterator sampleBegin(size_t sampleSize)
Returns an iterator that iterates through sampleSize kernels selected randomly.
weight_t evaluationAt(const kernel::base &f, const EvaluationStrategy strategy=WEIGHTED_SUM_EVAL) const
Evaluates the density represented by *this at f.