This page presents the kernel density estimation (KDE) and kernel logistic regression (KLR) tools provided by Nuklei.
Classes

- class nuklei::kernel::base: polymorphic kernel class.
- class nuklei::kernel::se3
- class nuklei::kernel::r3xs2_base< OriGrp >
- class nuklei::kernel::r3
- class nuklei::KernelCollection: a vector-like container for kernels, which also provides methods related to kernel density estimation.
- struct nuklei::KernelLogisticRegressor: implements kernel logistic regression.

Typedefs

- typedef r3xs2_base< groupS::s2 > nuklei::kernel::r3xs2: \( \mathbb R^3 \times S^2 \)
- typedef r3xs2_base< groupS::s2p > nuklei::kernel::r3xs2p: \( \mathbb R^3 \times S^2_p \)
Before reading this page, make sure you are familiar with the material discussed in Background.
The KDE and KLR tools provided by Nuklei work in combination with the kernels defined in the nuklei::kernel namespace. These kernels are described below. If one wishes to work with different kernels, there are two options: changing the PositionKernel and OrientationKernel typedefs in one of the kernels (e.g., in kernel::se3), or implementing a new kernel class.

The nuklei::kernel namespace provides kernels for elements that belong to \( SE(3) \), \( \mathbb R^3 \times S^2 \) (or \( \mathbb R^3 \times S^2_p \)), and \( \mathbb R^3 \).
These kernels are defined as the product of a position kernel and an orientation kernel (except for the third one, which is a position kernel only). An important difference between the kernels provided by nuklei::kernel and the kernels discussed in Background is that the kernels provided by nuklei::kernel are unnormalized: their value at their center point is equal to 1, and their integral varies with the bandwidth. The motivation behind this choice is that normalization factors are often expensive to compute, and often unnecessary (many algorithms are happy with a KDE computed up to a multiplicative constant).
The class kernel::se3 implements an \( SE(3) \) kernel. The method kernel::se3::eval() returns an evaluation of
\[ \mathcal K_{SE(3)}\left(\lambda, \theta ; \mu_t, \mu_r, \sigma_t, \sigma_r\right) = \mathcal T\left(\lambda ; \mu_t, \sigma_t\right) e^{f_{SE(3)}(\sigma_r) \; (|\mu_r^T \theta|-1)} \]
where \( \mathcal T \) is a triangular position kernel, \( |\cdot| \) denotes the absolute value, and \( f_{SE(3)}(\sigma_r) = \frac{1}{1-\cos(\sigma_r/2)}\) is a function which allows \( \sigma_r \) to be expressed in radians, and translates it to the von Mises-Fisher bandwidth parameter. The factor \( e^{f_{SE(3)}(\sigma_r) \; (|\mu_r^T \theta|-1)} \) efficiently approximates a pair of antipodal von Mises-Fisher distributions (thanks to the absolute value), and returns 1 when evaluated at \( \mu_r \). The position kernel \( \mathcal T \) is given by
\[ \mathcal T\left(\lambda ; \mu_t, \sigma_t\right) = \begin{cases} 1 - \frac{\lVert\lambda-\mu_t\rVert}{2\sigma_t} &\textrm{if } \lVert\lambda-\mu_t\rVert \leq 2\sigma_t \\ 0 &\textrm{if } \lVert\lambda-\mu_t\rVert > 2\sigma_t\end{cases} \]
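For concreteness, here is a scalar sketch of \( \mathcal T \). The actual kernel operates on \( \mathbb R^3 \), with a norm in place of the absolute value; this helper is purely illustrative and is not part of the library:

```cpp
#include <cmath>

// Illustrative scalar version of the triangular kernel T(lambda; mu_t, sigma_t):
// equal to 1 at the center, decaying linearly to 0 at distance 2*sigma_t.
double triangularKernel(double lambda, double mu_t, double sigma_t)
{
  double d = std::fabs(lambda - mu_t);
  return (d <= 2 * sigma_t) ? 1 - d / (2 * sigma_t) : 0.0;
}
```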
The method kernel::se3::sample() returns samples \( (\lambda, \theta) \) that follow
\[ \mathcal K'_{SE(3)}\left(\lambda, \theta ; \mu_t, \mu_r, \sigma_t, \sigma_r\right) = \mathcal T\left(\lambda ; \mu_t, \sigma_t\right) \frac {\mathcal F_4\left( \theta ; \mu_r, f_{SE(3)}(\sigma_r) \right) + \mathcal F_4 \left(\theta; -\mu_r, f_{SE(3)}(\sigma_r)\right)}2. \]
We note that this expression uses a pair of von Mises-Fisher distributions instead of the approximation introduced in \( \mathcal K_{SE(3)} \). The reason behind this difference is that \( \mathcal K_{SE(3)} \) is fast to evaluate, but I do not know of an algorithm for generating samples from \( e^{f_{SE(3)}(\sigma_r) \; (|\mu_r^T \theta|-1)} \). By contrast, an algorithm for sampling from a von Mises-Fisher distribution does exist, which is why \( \mathcal K'_{SE(3)} \) has the form above.
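A short usage sketch follows. The public field names loc_, ori_, loc_h_ and ori_h_ for the kernel's center and bandwidths, as well as the Vector3 and Quaternion types and the return type of sample(), are assumptions here; check Kernel.h for the exact interface:

```cpp
#include <nuklei/Kernel.h>

// Hedged sketch: build an SE(3) kernel, evaluate it, and sample from it.
// Field and type names are assumptions, not confirmed library API.
void se3KernelSketch()
{
  nuklei::kernel::se3 k;
  k.loc_ = nuklei::Vector3(0, 0, 0);       // position mean (mu_t)
  k.ori_ = nuklei::Quaternion(1, 0, 0, 0); // orientation mean (mu_r)
  k.loc_h_ = 0.01;                         // position bandwidth (sigma_t)
  k.ori_h_ = 0.2;                          // orientation bandwidth (sigma_r, radians)

  nuklei::kernel::se3 s = k.sample();      // a sample following K'_{SE(3)}
  double v = k.eval(k);                    // 1: the kernel is unnormalized
  (void)s; (void)v;
}
```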
The class kernel::r3xs2p implements an \( \mathbb R^3 \times S^2_p \) kernel, i.e., a kernel for positions paired with axial orientations. The method kernel::r3xs2p::eval() returns an evaluation of
\[ \mathcal K_{RSA}\left(\lambda, \theta ; \mu_t, \mu_r, \sigma_t, \sigma_r\right) = \mathcal T\left(\lambda ; \mu_t, \sigma_t\right) e^{f_{RSA}(\sigma_r) \; (|\mu_r^T \theta|-1)} \]
where \( f_{RSA}(\sigma_r) = \frac{1}{1-\cos(\sigma_r)}\) is a function which allows \( \sigma_r \) to be expressed in radians, and translates it to the von Mises-Fisher bandwidth parameter. The factor \( e^{f_{RSA}(\sigma_r) \; (|\mu_r^T \theta|-1)} \) efficiently approximates a pair of antipodal von Mises-Fisher distributions (thanks to the absolute value), and returns 1 when evaluated at \( \mu_r \). As with the \( SE(3) \) kernel, the method kernel::r3xs2p::sample() returns samples \( (\lambda, \theta) \) that follow
\[ \mathcal K'_{RSA}\left(\lambda, \theta ; \mu_t, \mu_r, \sigma_t, \sigma_r\right) = \mathcal T\left(\lambda ; \mu_t, \sigma_t\right) \frac {\mathcal F_3\left( \theta ; \mu_r, f_{RSA}(\sigma_r) \right) + \mathcal F_3 \left(\theta; -\mu_r, f_{RSA}(\sigma_r)\right)}2. \]
The class kernel::r3 implements an \( \mathbb R^3 \) kernel. The method kernel::r3::eval() returns an evaluation of
\[ \mathcal K_{\mathbb R^3}\left(\lambda ; \mu_t, \sigma_t\right) = \mathcal T\left(\lambda ; \mu_t, \sigma_t\right). \]
The method kernel::r3::sample() returns samples \( \lambda \) that follow \( \mathcal K_{\mathbb R^3} \).
The KernelCollection class provides access to a kernel density estimate built from the kernels it contains. The KDE defined by KernelCollection is expressed as
\[ f(x) = \sum_{i=1}^n w_i \mathcal K\left(x ; \hat x_i, \sigma\right), \]
where \( n \) is the number of kernels held in the KernelCollection object, \( w_i \) and \( \hat x_i \) denote the weight and center point of the \( i^\mathrm{th} \) kernel, respectively, \( \mathcal K \) denotes the kernel function implemented by the kernels, and \( \sigma \) denotes the kernel bandwidth, which may correspond to a pair of values if the kernels have both a position and an orientation component.
A KernelCollection must contain kernels with the same domain of definition (for instance, \( SE(3) \) kernels cannot be mixed with \( \mathbb R^3 \) kernels). The methods of KernelCollection enforce this constraint (an exception is thrown on any attempt to mix different kernel types). KernelCollection holds kernel objects of a class that inherits from kernel::base. Those kernels are by default unnormalized, which impacts KDE in two ways. First, the estimate does not integrate to one; instead,

\[ \int f(x) \mathrm{d}x = C(\sigma), \]

where \( C \) is positive and depends on \( \sigma \). Second, evaluations of \( f \) are only meaningful up to this multiplicative constant.

Evaluation is provided by KernelCollection::evaluationAt(). In effect, this method computes the sum \( f(x) \) defined above.
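Conceptually, the computation boils down to the following naive loop. This is a hypothetical sketch, not the library source; polyEval follows the name used elsewhere on this page, while getWeight() and the iterator interface are assumptions:

```cpp
#include <nuklei/KernelCollection.h>

// Naive evaluation of f(x): a weighted sum of kernel evaluations.
// Illustrative sketch, not the implementation of evaluationAt().
double naiveEvaluationAt(const nuklei::KernelCollection &kc,
                         const nuklei::kernel::base &x)
{
  double value = 0;
  for (nuklei::KernelCollection::const_iterator i = kc.begin();
       i != kc.end(); ++i)
    value += i->getWeight() * i->polyEval(x); // w_i * K(x; x_i, sigma)
  return value;
}
```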
For computational efficiency, the implementation of evaluationAt() differs in two ways from this piece of code:

- Instead of iterating over all kernels, evaluationAt() only visits the kernels that are close enough to the evaluation point to be relevant (see below).
- Instead of calling the virtual method polyEval, evaluationAt() first figures out the type of the kernels held in the KernelCollection, and calls the static equivalent of polyEval. This trick avoids repeated VTABLE lookups and allows the compiler to inline the kernel evaluation function.

The Nuklei implementation follows the description given in Background and uses a \( kd \)-tree to quickly access the kernels that are near the evaluation point. The definition of near is given by the cutPoint method that kernels of nuklei::kernel must implement. This method gives the distance from the kernel's origin at which the kernel value can be assumed to be zero. For triangular kernels, this distance is \( 2\sigma_t \).
Calls to KernelCollection::evaluationAt() should be preceded by a call to KernelCollection::computeKernelStatistics() (or KernelCollection::normalizeWeights()), which computes statistics necessary for evaluation, and by a call to KernelCollection::buildKdTree(). Read Intermediary Results for an explanation of how to use these functions, and see kde_evaluate.cpp for an example.
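As a rough usage sketch, under the assumption that readObservations() loads kernels from a file (the file name is a placeholder; kde_evaluate.cpp remains the authoritative example):

```cpp
#include <nuklei/KernelCollection.h>
#include <nuklei/ObservationIO.h>

// Hedged sketch of KDE evaluation at a query point x.
double evaluateDensityAt(const nuklei::kernel::se3 &x)
{
  nuklei::KernelCollection kc;
  nuklei::readObservations("density.txt", kc); // assumed loading helper

  kc.computeKernelStatistics(); // statistics required by evaluationAt()
  kc.buildKdTree();             // kd-tree for fast access to nearby kernels

  return kc.evaluationAt(x);
}
```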
Sampling is provided by KernelCollection::sample(). This method generates random variates from the density by first selecting kernels at random, where the \( \ell^\mathrm{th} \) kernel is selected with probability

\[ P(i = \ell) \propto w_{\ell}, \]

where \( w_{\ell} \) is the weight of the \( \ell^\mathrm{th} \) kernel (kernels with a larger weight thus have a higher chance of being selected). The method selects sampleSize kernels in this fashion; calling polySample then draws one sample from each of the selected kernels, as illustrated in the sketch below.
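A minimal sketch of this procedure, assuming that randomKernel() performs the weighted selection (as suggested by the expression below), that polySample() returns a pointer-like handle to a freshly drawn kernel, and that add() appends a kernel to a collection:

```cpp
#include <nuklei/KernelCollection.h>

// Hypothetical sketch of what sample() does internally; not library source.
void sampleFrom(const nuklei::KernelCollection &kc, int sampleSize,
                nuklei::KernelCollection &samples)
{
  for (int i = 0; i < sampleSize; ++i)
  {
    // Pick a kernel with probability proportional to its weight,
    // then draw a single sample from it.
    samples.add(*kc.randomKernel().polySample());
  }
}
```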
If a single sample is needed, kernelCollection.randomKernel().polySample() can be used. Note that this expression is linear in the number of kernels in kernelCollection.
KLR is implemented in KernelLogisticRegressor.