Uncertainty propagation with and without Lie groups and algebras Link to heading

Correct estimation of the pose uncertainty is crucial for the performance of any navigation or positioning algorithm. For me, one of the most natural ways of representing this uncertainty is the confidence ellipse.

In the following post I would like to show the effect of uncertainty propagation using various algorithms. More importantly, I would like to show the Gaussian representations in both Cartesian and exponential coordinates.

Small Lie notes Link to heading

These notes are made for myself and summarize material from Joan Solà's paper "A Micro Lie Theory"1 and Barfoot's "State Estimation for Robotics"2.

Notation Link to heading

  • Lie group $\mathcal{M}$ - a smooth manifold whose elements satisfy the four group axioms: closure, identity, inverse and associativity. The manifold looks the same at every point (no spikes, edges, etc.), thus the tangent spaces at all points have the same structure.
  • Group compositions and actions.
    • Given group elements $\mathcal{X}, \mathcal{Y} \in \mathcal{M}$, there is a composition operation $\circ$ whose result $\mathcal{X} \circ \mathcal{Y}$ also lies on the manifold. There is one special element of the group, called the identity element $\mathcal{E}$. The composition of any group element $\mathcal{Z}$ with $\mathcal{E}$ produces $\mathcal{Z}$: $\forall \mathcal{Z} \in \mathcal{M} : \mathcal{E} \circ \mathcal{Z} = \mathcal{Z} \circ \mathcal{E} = \mathcal{Z}$.
    • Given a group element $\mathcal{X}$ and an element $v$ of some other set, $\mathcal{X}\cdot v$ is the action of $\mathcal{X}$ on $v$. For example, the action of the $SO(3)$ group on a vector is rotation: $\mathbf{R}\cdot\mathbf{x} = \mathbf{R}\mathbf{x}$; for $SE(3)$ the action is rotation plus translation: $\mathbf{H}\cdot\mathbf{x} = \mathbf{R}\mathbf{x} + \mathbf{t}$.
  • Lie algebra $\mathfrak{m} = \mathcal{T}_{\mathcal{E}} \mathcal{M}$ - the tangent space to $\mathcal{M}$ at the identity element. The beauty of $\mathfrak{m}$ is that it is a vector space, so the standard linear-algebra machinery applies to it. A tangent space $\mathcal{T}_{\mathcal{X}}\mathcal{M}$ can be attached to any group element $\mathcal{X}$; its vectors are expressed in local coordinates.
  • Lie algebra elements $\tau$:
    • Can be identified with vectors in $\mathbf{R}^n$, where $n$ is the number of degrees of freedom of the group $\mathcal{M}$.
    • The structured algebra element is denoted with a hat decorator: $\wedge: \mathbf{R}^n \to \mathfrak{m}$.
    • The inverse mapping to $n$-length vectors is usually encoded with the $\vee: \mathfrak{m} \to \mathbf{R}^n$ symbol.
  • The exponential map $\exp: \mathfrak{m} \to \mathcal{M}$ (an instance of a retraction) converts elements of the Lie algebra into elements of the Lie group. It is an exact conversion, not an approximation.
    The capitalized $Exp$ denotes the composed $\mathbf{R}^n \to \mathcal{M}$ conversion.
  • The logarithm map $\log: \mathcal{M} \to \mathfrak{m}$ goes the other way; the capitalized $Log$ denotes the $\mathcal{M} \to \mathbf{R}^n$ conversion.
  • Adjoint of $\mathfrak{m}$. Vectors of the tangent space at an element $\mathcal{X}$ can be transformed to the tangent space at $\mathcal{E}$ through the adjoint transform, which is linear: $ad: \mathcal{T}_{\mathcal{X}}\mathcal{M} \to \mathcal{T}_{\mathcal{E}}\mathcal{M}$.
  • Adjoint of $\mathcal{M}$. It expresses the tangent element of a given group element $\mathcal{X}$ in the tangent space $\mathcal{T}_{\mathcal{E}}\mathcal{M}$:
    $Ad_{\mathcal{X}}({}^\mathcal{X}\tau^{\wedge}) = {}^\mathcal{E}\tau^{\wedge} = \mathcal{X}\,{}^\mathcal{X}\tau^{\wedge}\,\mathcal{X}^{-1}.$
    The adjoint is linear and homomorphic. There is also the adjoint matrix:
    $\mathbf{Ad}_\mathcal{X}: \mathbf{R}^n \to \mathbf{R}^n;~{}^\mathcal{X}\tau \mapsto {}^\mathcal{E}\tau = \mathbf{Ad}_\mathcal{X}\,{}^\mathcal{X}\tau.$
  • Plus $\oplus$ and minus $\ominus$ operators: the combination of the $Exp$ or $Log$ operator with one composition (a small numeric sketch follows this list).
    • There is a right operator, where ${}^\mathcal{X}\tau$ belongs to the tangent space $\mathcal{T}_{\mathcal{X}}\mathcal{M}$. We say that ${}^\mathcal{X}\tau$ is expressed in the local frame at $\mathcal{X}$:
      $\mathcal{Y} = \mathcal{X} \oplus {}^\mathcal{X}\tau = \mathcal{X} \circ Exp({}^\mathcal{X}\tau) \in \mathcal{M}$
      ${}^\mathcal{X}\tau = \mathcal{Y} \ominus \mathcal{X} = Log(\mathcal{X}^{-1}\circ\mathcal{Y}) \in \mathcal{T}_{\mathcal{X}}\mathcal{M}$
    • In the left operator $Exp({}^\mathcal{E}\tau)$ is on the left, meaning that ${}^\mathcal{E}\tau$ is expressed in the global frame: $\mathcal{Y} = {}^\mathcal{E}\tau \oplus \mathcal{X} = Exp({}^\mathcal{E}\tau) \circ \mathcal{X} \in \mathcal{M}$
      ${}^\mathcal{E}\tau = \mathcal{Y} \ominus \mathcal{X} = Log(\mathcal{Y}\circ\mathcal{X}^{-1}) \in \mathcal{T}_{\mathcal{E}}\mathcal{M}$
      The choice of the left or right operator specifies the perturbation frame: perturbations can be defined globally or locally.
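
A quick numeric illustration of the right operators (a minimal sketch of mine; it uses the SO3 helper class defined later in this post, and the perturbation values are arbitrary):

import numpy as np

# Right-plus and right-minus on SO(3); SO3 is the helper class defined
# in the "Lie algebras and groups operations" section below.
X = SO3.exp(np.array([0., 0., 0.3]))          # group element (rotation matrix)
tau = np.array([0.05, -0.02, 0.1])            # local perturbation in T_X M
Y = X @ SO3.exp(tau)                          # Y = X (+) tau   (right-plus)
tau_back = SO3.log(SO3.inv(X) @ Y)            # tau = Y (-) X   (right-minus)
print(np.allclose(tau_back.flatten(), tau))   # True, up to numeric precision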

Uncertainty on manifolds Link to heading

A local perturbation $\tau$ around the operating point $\bar{\mathcal{X}} \in \mathcal{M}$ lives in the tangent vector space $\mathcal{T}_{\bar{\mathcal{X}}}\mathcal{M}$ and can be defined as:

$$ \mathcal{X} = \bar{\mathcal{X}}\oplus\tau,~~~ \tau = \mathcal{X}\ominus\bar{\mathcal{X}} \in \mathcal{T}_{\bar{\mathcal{X}}}\mathcal{M}, $$

Then the covariance matrices can be properly defined on the tangent space through the standard expectation:

$$ \Sigma_{\mathcal{X}} = \mathbf{E}\left[\tau \tau ^T \right] \in \mathbf{R}^{n\times n} $$

If we work with perturbations expressed in the global reference frame, then:

$$ \mathcal{X} = \tau\oplus\bar{\mathcal{X}},~~~ \tau = \mathcal{X}\ominus\bar{\mathcal{X}} \in \mathcal{T}_{\mathcal{E}}\mathcal{M} $$

If we want to propagate the covariance through a function $f : \mathcal{M} \to \mathcal{N};~\mathcal{X}\mapsto\mathcal{Y} = f(\mathcal{X})$, we have to linearize it:

$$ \Sigma_{\mathcal{Y}} \approx \frac{Df}{D\mathcal{X}}\Sigma_{\mathcal{X}}\frac{Df}{D\mathcal{X}}^T. $$
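
As a concrete example (a sketch of mine, not taken from the references): for pose inversion $f(\mathcal{X}) = \mathcal{X}^{-1}$ with left perturbations $\mathcal{X} = Exp(\tau)\bar{\mathcal{X}}$, one can show that $\frac{Df}{D\mathcal{X}} = -\mathbf{Ad}_{\bar{\mathcal{X}}^{-1}}$, and the sign cancels in the quadratic form:

# Covariance of an inverted pose, using the SE3 helpers defined later
# in this post; the pose and covariance values are arbitrary examples.
X_bar = SE3.exp(np.array([1., 2., 0., 0., 0., 0.5]))
Sigma_X = np.diag([0.01, 0.01, 0., 0., 0., 0.02])
Ad_inv = SE3.Ad(SE3.inv(X_bar))            # adjoint matrix of X_bar^{-1}
Sigma_X_inv = Ad_inv @ Sigma_X @ Ad_inv.T  # covariance propagated through inversion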

SO(3) and SE(3) Link to heading

The Lie group composition operator for SO(3) and SE(3) is just matrix multiplication.
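
For instance, with the SO3 helper defined below, two rotations about the z-axis compose by a plain matrix product (a minimal sketch; the angles are arbitrary):

R1 = SO3.exp(np.array([0., 0., np.pi / 6]))   # rotation by 30 degrees about z
R2 = SO3.exp(np.array([0., 0., np.pi / 3]))   # rotation by 60 degrees about z
R = R1 @ R2                                   # composition = matrix multiplication
print(np.allclose(SO3.log(R).flatten(), [0., 0., np.pi / 2]))  # True: 90 degrees about z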

Plotting utilities Link to heading

Spoiler

import numpy as np
import matplotlib.pylab as plt
np.set_printoptions(precision=3, suppress=True, linewidth=120)

%matplotlib inline
%config InlineBackend.figure_format='retina'

def density_scatter(x, y, ax=None, fig=None, sort=True, bins=20, cbar=True, **kwargs):
    """
    Scatter plot colored by 2d histogram
    """
    import logging
    from matplotlib import cm
    from matplotlib.colors import Normalize
    from scipy.interpolate import interpn
    if ax is None :
        fig , ax = plt.subplots()
    data, x_e, y_e = np.histogram2d(x, y, bins = bins, density = False )
    z = interpn((0.5*(x_e[1:] + x_e[:-1]), 0.5*(y_e[1:]+y_e[:-1]) ), data ,np.vstack([x,y]).T , method = "splinef2d", bounds_error = False)

    #To be sure to plot all data
    z[np.where(np.isnan(z))] = 0.0
    z = z / np.sum(z)
    
    # Sort the points by density, so that the densest points are plotted last
    if sort :
        idx = z.argsort()
        x, y, z = x[idx], y[idx], z[idx]
        
    sc = ax.scatter( x, y, c=z, marker='.', s=1.0, **kwargs )

    if cbar:
        norm = Normalize(vmin = np.min(z), vmax = np.max(z))
        cbar = fig.colorbar(cm.ScalarMappable(norm = norm, **kwargs), ax=ax)
        cbar.ax.set_ylabel('Density')
    
    levels = [0.5, 0.75, 0.9]
    H, X, Y = data, x_e, y_e
    if True:
        # Compute the density levels.
        Hflat = H.flatten()
        inds = np.argsort(Hflat)[::-1]
        Hflat = Hflat[inds]
        sm = np.cumsum(Hflat)
        sm /= sm[-1]
        V = np.empty(len(levels))
        for i, v0 in enumerate(levels):
            try:
                V[i] = Hflat[sm <= v0][-1]
            except IndexError:
                V[i] = Hflat[0]
        V.sort()
        m = np.diff(V) == 0
        if np.any(m):
            logging.warning("Too few points to create valid contours")
        while np.any(m):
            V[np.where(m)[0][0]] *= 1.0 - 1e-4
            m = np.diff(V) == 0
        V.sort()

        # Compute the bin centers.
        X1, Y1 = 0.5 * (X[1:] + X[:-1]), 0.5 * (Y[1:] + Y[:-1])

        # Extend the array for the sake of the contours at the plot edges.
        H2 = H.min() + np.zeros((H.shape[0] + 4, H.shape[1] + 4))
        H2[2:-2, 2:-2] = H
        H2[2:-2, 1] = H[:, 0]
        H2[2:-2, -2] = H[:, -1]
        H2[1, 2:-2] = H[0]
        H2[-2, 2:-2] = H[-1]
        H2[1, 1] = H[0, 0]
        H2[1, -2] = H[0, -1]
        H2[-2, 1] = H[-1, 0]
        H2[-2, -2] = H[-1, -1]
        X2 = np.concatenate(
            [
                X1[0] + np.array([-2, -1]) * np.diff(X1[:2]),
                X1,
                X1[-1] + np.array([1, 2]) * np.diff(X1[-2:]),
            ]
        )
        Y2 = np.concatenate(
            [
                Y1[0] + np.array([-2, -1]) * np.diff(Y1[:2]),
                Y1,
                Y1[-1] + np.array([1, 2]) * np.diff(Y1[-2:]),
            ]
        )
        
        ax.contour(X2, Y2, H2.T, V, alpha = 0.5)

    return ax

def vis_distribution(fig, ax, mean, cov, transform, N = 10000):
    from collections.abc import Iterable

    samples = np.random.multivariate_normal(mean=mean, cov=cov, size=N).T
    samples_se3 = np.zeros((N, 2))
    for i, s in enumerate(samples.T):
        sample_se3 = transform @ SE3.exp(s)
        samples_se3[i, 0] = sample_se3[0, 3]
        samples_se3[i, 1] = sample_se3[1, 3]
    
    if isinstance(ax, Iterable):
        ax1, ax2 = ax
        density_scatter(samples[0, :], samples[1, :], ax1, fig, cmap=plt.cm.cool, cbar = False)
        ax1.set(xlabel = 'x, meters', ylabel = 'y, meters')
        ax1.axis('equal')
        ax2.set(xlabel = 'x, meters', ylabel = 'y, meters')
        ax1.set_title(r'Uncertainty at tangent space, $\mathfrak{g}$')
    else:
        ax2 = ax
    density_scatter(samples_se3[:,0], samples_se3[:,1], ax2, fig, cmap=plt.cm.cool, cbar = False)
    meanse3 = np.mean(samples_se3, axis = 0)
    ax2.scatter([meanse3[0]], [meanse3[1]], marker = 'x', s = 150, c = 'black')
    ax2.set(xlabel = 'x, meters', ylabel = 'y, meters')
    ax2.axis('equal')
    ax2.set_title(r'Uncertainty at manifold, $\mathcal{G}$')

Lie algebras and groups operations Link to heading

First, let’s define all the functions needed to operate on the SO(3) and SE(3) groups using Lie algebras. In the code I explicitly reference the implemented equations' numbers from Barfoot’s book on robotics. Here we will use the left-hand convention, which means that the state is expressed with respect to the moving, or robot, perspective. Therefore the increments are represented in the global coordinate frame and applied from the left side. Our $\mathbf{T}_{next} = \mathbf{T}_{bw}$: this matrix represents the transformation of the world w.r.t. the body. To plot the robot pose in the global coordinate frame we have to invert it, obtaining the transformation of the body w.r.t. the world, $\mathbf{T}_{wb}$.

Written using the matrices operations we have:

$$ \mathbf{T}_{b_{i+1}w} = \mathbf{T}_{b_{i+1}b_{i}} \mathbf{T}_{b_{i}w}. $$

It means that if we want to obtain the transformation of the world w.r.t. the body at time step $i+1$, we have to multiply:

  • the transformation of the world w.r.t. the body at time step $i$ $-$ $\mathbf{T}_{b_{i}w}$;
  • by the transformation increment of the body from time step $i$ to time step $i+1$ $-$ $\mathbf{T}_{b_{i+1}b_{i}}$.

In this case our $\mathbf{T}_{b_{i+1}b_{i}}$ has a trivial form, but the important note is that its translational component equals $[-1, 0, 0]^T$ if we move to the right, since it describes the pose at time $i$ w.r.t. the pose at time $i+1$ (initially I found this $-1$ counter-intuitive).

Of course, this notation is much more intuitive if we work with the transform matrix $\mathbf{T}_{wb}$, which represents the transform of the body w.r.t. the world. In such a case we have:

$$ \mathbf{T}_{w b_{i+1}} = \mathbf{T}_{w b_{i}}\mathbf{T}_{b_{i}b_{i+1}}, $$

and our increment is equal to $1$ instead of $-1$. These differences are very important when we work with transformations, and mixing up the order can lead to dramatic consequences. The latter (right-increment) approach is typically used in robotics and when working with an IMU, the former (left) one - in computer vision 3, 4.
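
A one-line numeric check of this sign flip (a sketch using the SE3 helpers defined just below):

# The same 1-meter move to the right in the two conventions: as
# T_{b_{i+1} b_i} the x-translation is -1, as T_{b_i b_{i+1}} it is +1.
T_move_bw = SE3.exp(np.array([-1., 0., 0., 0., 0., 0.]))
T_move_wb = SE3.inv(T_move_bw)
print(T_move_bw[0, 3], T_move_wb[0, 3])  # -1.0 1.0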

Spoiler
from abc import ABC, abstractmethod
import numpy as np

class LieGroup(ABC):
    @staticmethod
    @abstractmethod
    def wedge(tau):
        pass
    
    @staticmethod
    @abstractmethod
    def exp(tau):
        pass
    
    @staticmethod
    @abstractmethod
    def log(T):
        pass
    
    @staticmethod
    @abstractmethod
    def Ad(tau):
        pass
    
class SO3(LieGroup):
    @staticmethod
    def wedge(phi):
        '''B. eq. (7.10)'''
        e1, e2, e3 = np.squeeze(phi)
        return np.array([[0., -e3, e2],
                         [e3, 0., -e1],
                         [-e2, e1, 0.]])
    
    @staticmethod
    def exp(phi):
        '''B. eq. (7.23)'''
        if np.linalg.norm(phi) < 1e-8:
            return np.eye(3) + SO3.wedge(phi)
        else:
            norm = np.linalg.norm(phi)
            a = phi / norm
            return np.cos(norm) * np.eye(3) + (1.0 - np.cos(norm)) * np.outer(a, a.T) + np.sin(norm) * SO3.wedge(a)
        
    @staticmethod
    def Ad(phi):
        return SO3.exp(phi)
    
    @staticmethod
    def log(R):
        '''B. eq. (7.28)'''
        phi = np.arccos((np.trace(R) - 1.0) / 2.0)
        if phi ** 2 < 1e-8:
            return np.zeros(3)
        else:
            w = (phi / (2. * np.sin(phi))) * (R - R.T)
            return np.array([w[2, 1], w[0, 2], w[1, 0]])
        
    @staticmethod
    def inv(R):
        return R.T.copy()
        
    @staticmethod
    def J(phi):
        '''B. eq. (7.37a), left Jacobian of SO3'''
        if np.linalg.norm(phi) < 1e-8:
            return np.eye(3) + 0.5 * SO3.wedge(phi)
        else:
            norm = np.linalg.norm(phi)
            a = phi / norm
            return np.sin(norm) / norm * np.eye(3) + (1.0 - np.sin(norm)/norm) * np.outer(a, a.T)\
                + (1.0 - np.cos(norm)) / norm * SO3.wedge(a)
      
    @staticmethod
    def J_inv(phi):
        '''B. eq. (7.37b), J^{-1}'''
        if np.linalg.norm(phi) < 1e-8:
            return np.eye(3) - 0.5 * SO3.wedge(phi)
        else:
            norm = np.linalg.norm(phi)
            a = phi / norm
            cotHalfP = 1. / np.tan(norm/2)
            return norm / 2 * cotHalfP * np.eye(3) + (1.0 - norm/2 * cotHalfP) * np.outer(a, a.T)\
                - norm/2 * SO3.wedge(a)
        
    
class SE3(LieGroup):
    @staticmethod
    def split_xi(xi):
        '''return translational and rotational component, B. eq. (7.14)'''
        xi = np.squeeze(xi)
        return xi[0:3], xi[3:6]
    
    @staticmethod
    def wedge(xi):
        '''B. eq. (7.14)'''
        t, phi = SE3.split_xi(xi)
        wedge = np.zeros((4,4))
        wedge[0:3,0:3] = SO3.wedge(phi)
        wedge[0:3, 3] = t
        return wedge
    
    @staticmethod
    def exp(xi):
        '''B. eq. (7.44)'''
        t, phi = SE3.split_xi(xi)
        if np.linalg.norm(phi) < 1e-8:
            return np.eye(4) + SE3.wedge(xi)
        else:
            norm = np.linalg.norm(phi)
            se3_wedge = SE3.wedge(xi)
            return np.eye(4) + se3_wedge + (1.0 - np.cos(norm)) / norm ** 2 * se3_wedge @ se3_wedge\
                + (norm - np.sin(norm)) / norm ** 3 * se3_wedge @ se3_wedge @ se3_wedge
    
    @staticmethod
    def Ad(xi):
        '''B. eq. (7.45)'''
        if xi.shape == (4, 4):
            xi = SE3.log(xi)
        
        t, phi = SE3.split_xi(xi)

        C = SO3.exp(phi)

        Tau = np.zeros((6, 6))
        Tau[0:3,0:3] = Tau[3:6,3:6] = C
        Tau[0:3,3:6] = SO3.wedge(SO3.J(phi) @ t) @ C

        return Tau
    
    @staticmethod
    def log(T):
        '''B. eq. (7.35)'''
        t = T[0:3, 3]
        w = SO3.log(T[0:3, 0:3])

        # 7.35 p = invJ @ t
        p = SO3.J_inv(w) @ t

        return np.hstack((p.flatten(), w.flatten()))
    
    @staticmethod
    def inv(T):
        T_inv = np.eye(4)
        T_inv[0:3, 0:3] = T[0:3, 0:3].T
        T_inv[0:3, 3] = - T_inv[0:3, 0:3] @ T[0:3, 3]
        return T_inv
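
A quick sanity check of these helpers (my addition): $\exp$ and $\log$ should invert each other, and composing a transform with its inverse should give the identity.

xi = np.array([1., 2., 3., 0.1, -0.2, 0.3])    # arbitrary twist vector
T = SE3.exp(xi)
print(np.allclose(SE3.log(T), xi))             # True: log(exp(xi)) == xi
print(np.allclose(T @ SE3.inv(T), np.eye(4)))  # True: T @ T^{-1} == I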

Pose compound experiments Link to heading

Now assume that we have some initial pose and we are absolutely confident about it. Then we make 500 moves forward (to the right), and on each move we are absolutely confident about the travelled distance (it is equal to 1 meter), but not so certain about the direction: its associated uncertainty is $\sigma = 2^\circ$.

Monte-carlo Link to heading

At first, let’s directly simulate it. To do so we generate 1000 sequences of 500 moves each. On each move we are uncertain about our direction and therefore randomly draw the heading perturbation from a normal distribution.

import matplotlib.pylab as plt

N_steps = 500
step_directed, sigma_direction_rad = -1.0, np.deg2rad(2.0)

P = 1000

coords = np.zeros((P, 2, N_steps))

for p in range(0, P):
    T_next = np.eye(4)
    for i in range(0, N_steps):
        # perturb our motion
        T_move = SE3.exp([step_directed, 0., 0., 0., 0., np.random.normal(0.0, sigma_direction_rad)])
        # move the particle
        T_next = T_move @ T_next
        # save the state
        coords[p, :,i] = np.linalg.inv(T_next)[0:2, 3]

fig, ax = plt.subplots(1, 1, figsize = (5, 5))
for p in range(0, P):
    ax.plot(coords[p, 0,:], coords[p, 1,:], c='b', alpha=0.1)
ax.scatter(coords[:, 0, -1], coords[:, 1, -1], c='r', s = 2.0)
ax.axis("equal")
ax.set(xlabel = 'x, meters', ylabel = 'y, meters')
ax.set_title("Monte-carlo simulation")

print("Mean:\n", np.mean(coords[:, :, -1], axis = 0))
print("Covariance:\n", np.cov(coords[:,:,-1].T))
Mean:
 [430.06    6.476]
Covariance:
 [[ 4857.184 -1585.499]
 [-1585.499 36559.492]]

(figure: Monte-Carlo simulation - 1000 trajectories with final positions in red)

After 500 steps we see the famous banana-shaped distribution, which cannot be represented by a Gaussian in Cartesian coordinates! But wait - in the next subsection we will see that this “banana” can be described by a Gaussian in the tangent space $\mathcal{T}_{\mathcal{X}}\mathcal{M}$ and conveniently mapped from the vector tangent space onto the Lie group.

First-order approximation Link to heading

Now let’s show the results of the covariance estimation using the first-order method, where we throw away all Taylor expansion terms except the first one.
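
With left perturbations, each step maps the motion covariance into the global tangent frame with the adjoint and accumulates it; this is exactly what the loop below implements:

$$ \Sigma_{i+1} \approx \Sigma_{i} + \mathbf{Ad}_{\mathbf{T}_{i+1}} \Sigma_{move} \mathbf{Ad}_{\mathbf{T}_{i+1}}^T. $$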

# True pose on the first iteration
T_0_true = np.eye(4)

T_move = np.array([[1., 0., 0.,  step_directed],
                    [0., 1., 0., 0.],
                    [0., 0., 1., 0.],
                    [0., 0., 0., 1.]])
Sigma_move = np.diag([0., 0., 0., 0., 0., sigma_direction_rad**2])

T_next = T_0_true
Sigma_next = np.zeros((6, 6))
for _ in range(0, N_steps):
    T_next = T_move @ T_next 
    
    # motion covariance mapped into the global tangent frame via the adjoint
    Ad_T1 = SE3.Ad(T_next)
    Sigma_move_dash = Ad_T1 @ Sigma_move @ Ad_T1.T
    
    Sigma_next = Sigma_next + Sigma_move_dash
    
fig, ax = plt.subplots(1, 2, figsize = (10, 5))
vis_distribution(fig, ax, np.zeros(6), Sigma_next, np.linalg.inv(T_next), N = 10000)
print("Covariance:\n", Sigma_next[0:2, 0:2])
Covariance:
 [[    0.       0.  ]
 [    0.   50921.98]]

(figure: first-order propagation - uncertainty in the tangent space and on the manifold)

When we look closely at these samples, we immediately understand why it is beneficial to keep the Gaussian in the tangent space (Lie algebra elements) and then retract it onto the manifold (Lie group elements - rotation matrices!). Doing so, we see that the banana-shaped distribution is actually a Gaussian 5, and we can correctly represent the uncertainties and manipulate them!

However, we still observe inaccuracies in our representation and, what is more important, our uncertainty is underestimated: the variance along the direction of travel (the X axis) is exactly zero. To improve the representation, let’s leverage the Taylor series even more and take higher-order terms into account.

Unscented transform Link to heading

Now let’s compound the poses, i.e. propagate from the first pose to the last, using the unscented transform. This technique is often used when a nonlinear transformation of a probability distribution is required.

Its underlying idea is the following:

  • Take a set of particles, or sigma points, from our prior distribution.
  • Apply the transform function $f$ to each point.
  • Calculate the mean and covariance of the transformed points statistically.
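
Applied to our compounding problem, we stack the pose and motion uncertainties into $L = 12$ degrees of freedom; with the Cholesky factor $\mathbf{S}\mathbf{S}^T = \Sigma$, the sigma points and weights used in the code below are:

$$ \tau_0 = \mathbf{0}, \quad \tau_{\pm i} = \pm\sqrt{L+\kappa}\,\operatorname{col}_i \mathbf{S}, \quad w_0 = \frac{\kappa}{L+\kappa}, \quad w_{\pm i} = \frac{1}{2(L+\kappa)}. $$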

Do we have any improvement? Let’s check!

T_next = T_0_true
Sigma_next = np.zeros((6, 6))

L = 12
kappa = 1.0

sigma_weights = np.zeros(2 * L + 1)
sigma_weights[0] = kappa / (kappa + L)
for i in range(1, len(sigma_weights)):
    sigma_weights[i] = 1.0 / (2 * kappa + 2 * L)

for _ in range(0, N_steps):
    Sigma_total = np.zeros((12, 12)) + np.eye(12) * 1e-9
    Sigma_total[0:6, 0:6] += Sigma_next
    Sigma_total[6:12, 6:12] += Sigma_move
    
    S = np.linalg.cholesky(Sigma_total)

    sigma_points = np.zeros((L, 2 * L + 1))
    sigma_points[:, [0]] = np.zeros((12, 1))
    for i in range(0, (sigma_points.shape[1] - 1) // 2):
        sigma_points[:, [1 + i]] = np.sqrt(L + kappa) * S[:,[i]]
        sigma_points[:, [1 + i + L]] = -np.sqrt(L + kappa) * S[:,[i]]
        
    T_new = T_move @ T_next
        
    xis = np.zeros((6, 2 * L + 1))
    for i in range(0, 2 * L + 1):
        xi_next = sigma_points[0:6, [i]]
        xi_move = sigma_points[6:12, [i]]
        
        xis[:, i] = SE3.log(SE3.exp(xi_next) @ T_next @ SE3.exp(xi_move) @ T_move @ np.linalg.inv(T_new))
        
    xis_mean = (sigma_weights @ xis.T).reshape((6, 1))
    xis_sigma = np.zeros((6, 6))
    for i in range(0, 2 * L + 1):
        cov = (xis[:,[i]] - xis_mean) @ (xis[:,[i]] - xis_mean).T
        xis_sigma += sigma_weights[i] * cov
        
    T_next = T_new
    Sigma_next = xis_sigma
    
fig, ax = plt.subplots(1, 2, figsize = (10, 5))
vis_distribution(fig, ax, np.zeros(6), Sigma_next, np.linalg.inv(T_next), N = 10000)
print("Covariance:\n", Sigma_next[0:2, 0:2])
Covariance:
 [[    0.        0.   ]
 [    0.    50617.404]]

(figure: unscented-transform propagation - uncertainty in the tangent space and on the manifold)

Unfortunately, no: the covariance we end up with is practically the same as for the previous method. As noted in 6, 7, such a representation of the mean is valid up to the second order, so there is no need to apply the unscented transform - we won’t be able to get better results with it.

Considering 4th-order terms Link to heading

The derivation of the higher-order terms of the covariance update is presented in paper 8. Here I will just propagate the covariance using their derivations.
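
In the code below, linOp and linOpB implement the linear operators from that derivation (my transcription of the code, following Barfoot's notation):

$$ \langle\langle \mathbf{A} \rangle\rangle = -\operatorname{tr}(\mathbf{A})\,\mathbf{1} + \mathbf{A}, \qquad \langle\langle \mathbf{A}, \mathbf{B} \rangle\rangle = \langle\langle \mathbf{A} \rangle\rangle \langle\langle \mathbf{B} \rangle\rangle + \langle\langle \mathbf{B}\mathbf{A} \rangle\rangle. $$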

def linOp(A):
    return -np.trace(A) * np.eye(3) + A

def linOpB(A, B):
    return linOp(A) @ linOp(B) + linOp(B @ A)

T_next = T_0_true
Sigma_next = np.zeros((6, 6))
for _ in range(0, N_steps):
    T_next = T_move @ T_next
    
    # motion covariance mapped into the global tangent frame via the adjoint
    Ad_T1 = SE3.Ad(T_next)
    Sigma_move_dash = Ad_T1 @ Sigma_move @ Ad_T1.T
    
    # Temp variables
    Sigma1_PosPos = Sigma_next[0:3,0:3]
    Sigma1_PosPhi = Sigma_next[0:3,3:6]
    Sigma1_PhiPhi = Sigma_next[3:6,3:6]
    Sigma2_PosPos = Sigma_move_dash[0:3,0:3]
    Sigma2_PosPhi = Sigma_move_dash[0:3,3:6]
    Sigma2_PhiPhi = Sigma_move_dash[3:6,3:6]
    
    # third and fourth terms
    A1 = np.zeros((6,6))
    A1[0:3,0:3] = A1[3:6,3:6] = linOp(Sigma1_PhiPhi)
    A1[0:3,3:6] = linOp(Sigma1_PosPhi + Sigma1_PosPhi.T)
    A2_dashed = np.zeros((6,6))
    A2_dashed[0:3,0:3] = A2_dashed[3:6,3:6] = linOp(Sigma2_PhiPhi)
    A2_dashed[0:3,3:6] = linOp(Sigma2_PosPhi + Sigma2_PosPhi.T)
    
    B_PosPos = linOpB(Sigma1_PhiPhi, Sigma2_PosPos) + linOpB(Sigma1_PosPhi.T, Sigma2_PosPhi)\
        + linOpB(Sigma1_PosPhi, Sigma2_PosPhi.T) + linOpB(Sigma1_PosPos, Sigma2_PhiPhi)
    B_PosPhi = linOpB(Sigma1_PhiPhi, Sigma2_PosPhi.T) + linOpB(Sigma1_PosPhi.T, Sigma2_PhiPhi)
    B_PhiPhi = linOpB(Sigma1_PhiPhi, Sigma2_PhiPhi)
    B = np.zeros((6, 6))
    B[0:3,0:3] = B_PosPos
    B[0:3,3:6] = B_PosPhi
    B[3:6,0:3] = B_PosPhi.T
    B[3:6,3:6] = B_PhiPhi
    

    Sigma_next = Sigma_next + Sigma_move_dash\
        + 1/12.0 * (A1 @ Sigma_move_dash + Sigma_move_dash @ A1.T 
                    + A2_dashed @ Sigma_next + Sigma_next @ A2_dashed.T)\
        + 1/4.0 * B
    
fig, ax = plt.subplots(1, 2, figsize = (10, 5))
vis_distribution(fig, ax, np.zeros(6), Sigma_next, np.linalg.inv(T_next), N = 10000)
print("Covariance:\n", Sigma_next[0:2, 0:2])
Covariance:
 [[ 1858.357     0.   ]
 [    0.    49714.573]]

(figure: fourth-order propagation - uncertainty in the tangent space and on the manifold)

Bingo: now our uncertainty has a much better shape, and we see that the higher-order terms definitely helped us end up with a decent covariance that represents our true uncertainty more accurately and precisely.

Using higher-order terms is not always an option: first, the derivations are usually quite tricky, and second, the computational complexity also increases.

Bonus Link to heading

In this last section I would like to compare the performance of the good old Extended Kalman Filter on our toy problem. Let’s reformulate the problem: we have a robot that moves in one direction with perfectly known speed (1 m/s) and slightly uncertain steering direction (as before, $2^\circ$). If our state $\mathbf{x}$ includes position and orientation, and our controls $\mathbf{u}$ are the robot’s linear and angular speeds, we have the following state and control vectors:

$$ \mathbf{x} = \displaystyle \left[\begin{matrix}x & y & z & \phi & \theta & \psi\end{matrix}\right], $$ $$ \mathbf{u} = \displaystyle \left[\begin{matrix}v_{x} & v_{y} & v_{z} & w_{x} & w_{y} & w_{z}\end{matrix}\right]. $$

and their dynamics have the following form:

$$ \displaystyle \begin{bmatrix}\Delta{t} \left(v_{x} \cos{\left(\psi \right)} \cos{\left(\theta \right)} + v_{y} \left(\sin{\left(\phi \right)} \sin{\left(\theta \right)} \cos{\left(\psi \right)} - \sin{\left(\psi \right)} \cos{\left(\phi \right)}\right) + v_{z} \left(\sin{\left(\phi \right)} \sin{\left(\psi \right)} + \sin{\left(\theta \right)} \cos{\left(\phi \right)} \cos{\left(\psi \right)}\right)\right) + x \\\\ \Delta{t} \left(v_{x} \sin{\left(\psi \right)} \cos{\left(\theta \right)} + v_{y} \left(\sin{\left(\phi \right)} \sin{\left(\psi \right)} \sin{\left(\theta \right)} + \cos{\left(\phi \right)} \cos{\left(\psi \right)}\right) - v_{z} \left(\sin{\left(\phi \right)} \cos{\left(\psi \right)} - \sin{\left(\psi \right)} \sin{\left(\theta \right)} \cos{\left(\phi \right)}\right)\right) + y\\\\ \Delta{t} \left(- v_{x} \sin{\left(\theta \right)} + v_{y} \sin{\left(\phi \right)} \cos{\left(\theta \right)} + v_{z} \cos{\left(\phi \right)} \cos{\left(\theta \right)}\right) + z\\\\ \Delta{t} \left(w_{x} + w_{y} \sin{\left(\phi \right)} \tan{\left(\theta \right)} + w_{z} \cos{\left(\phi \right)} \tan{\left(\theta \right)}\right) + \phi\\\\ \Delta{t} \left(w_{y} \cos{\left(\phi \right)} - w_{z} \sin{\left(\phi \right)}\right) + \theta\\\\ \frac{\Delta{t} \left(w_{y} \sin{\left(\phi \right)} + w_{z} \cos{\left(\phi \right)}\right) + \psi \cos{\left(\theta \right)}}{\cos{\left(\theta \right)}}\end{bmatrix} $$

Then we can find the Jacobians of the state equations w.r.t. the state vector and the controls, and use the standard EKF equation for covariance propagation:

$$ \mathbf{P}_{i+1} = \mathbf{F}_i \mathbf{P}_i \mathbf{F}_{i}^T + \mathbf{G}_i \mathbf{U}_i \mathbf{G}_i^T. $$
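
where $\mathbf{F}_i = \left.\frac{\partial f}{\partial \mathbf{x}}\right|_{\mathbf{x}_i, \mathbf{u}_i}$ is the Jacobian of the dynamics w.r.t. the state, $\mathbf{G}_i = \left.\frac{\partial f}{\partial \mathbf{u}}\right|_{\mathbf{x}_i, \mathbf{u}_i}$ is the Jacobian w.r.t. the controls, and $\mathbf{U}_i$ is the covariance of the control noise.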

Now let’s write the code and assess the covariance we end up with.

Spoiler
dt = 1.0
sin, cos, tan  = np.sin, np.cos, np.tan
v_x, v_y, v_z = 1., 0., 0.
w_x, w_y, w_z = 0.0, 0., 0.0

P = np.zeros((6, 6))
x, y, z = 0., 0., 0.
phi, theta, psi = 0., 0., 0.
for _ in range(0, N_steps):
    x = dt*(v_x*cos(psi)*cos(theta) + v_y*(sin(phi)*sin(theta)*cos(psi) - sin(psi)*cos(phi)) + v_z*(sin(phi)*sin(psi) + sin(theta)*cos(phi)*cos(psi))) + x
    y = dt*(v_x*sin(psi)*cos(theta) + v_y*(sin(phi)*sin(psi)*sin(theta) + cos(phi)*cos(psi)) - v_z*(sin(phi)*cos(psi) - sin(psi)*sin(theta)*cos(phi))) + y
    z = dt*(-v_x*sin(theta) + v_y*sin(phi)*cos(theta) + v_z*cos(phi)*cos(theta)) + z
    phi = dt*(w_x + w_y*sin(phi)*tan(theta) + w_z*cos(phi)*tan(theta)) + phi
    theta = dt*(w_y*cos(phi) - w_z*sin(phi)) + theta
    psi = (dt*(w_y*sin(phi) + w_z*cos(phi)) + psi*cos(theta))/cos(theta)
    
    # Jacobian of state dynamics w.r.t. the state
    F_x = np.eye(6)
    F_x[0, 3] = dt*(v_y*(sin(phi)*sin(psi) + sin(theta)*cos(phi)*cos(psi)) - v_z*(sin(phi)*sin(theta)*cos(psi) - sin(psi)*cos(phi)))
    F_x[0, 4] = dt*(-v_x*sin(theta) + v_y*sin(phi)*cos(theta) + v_z*cos(phi)*cos(theta))*cos(psi)
    F_x[0, 5] = -dt*(v_x*sin(psi)*cos(theta) + v_y*(sin(phi)*sin(psi)*sin(theta) + cos(phi)*cos(psi)) - v_z*(sin(phi)*cos(psi) - sin(psi)*sin(theta)*cos(phi)))
    F_x[1, 3] = -dt*(v_y*(sin(phi)*cos(psi) - sin(psi)*sin(theta)*cos(phi)) + v_z*(sin(phi)*sin(psi)*sin(theta) + cos(phi)*cos(psi)))
    F_x[1, 4] = dt*(-v_x*sin(theta) + v_y*sin(phi)*cos(theta) + v_z*cos(phi)*cos(theta))*sin(psi)
    F_x[1, 5] = dt*(v_x*cos(psi)*cos(theta) + v_y*(sin(phi)*sin(theta)*cos(psi) - sin(psi)*cos(phi)) + v_z*(sin(phi)*sin(psi) + sin(theta)*cos(phi)*cos(psi)))
    F_x[2, 3] = dt*(v_y*cos(phi) - v_z*sin(phi))*cos(theta)
    F_x[2, 4] = -dt*(v_x*cos(theta) + v_y*sin(phi)*sin(theta) + v_z*sin(theta)*cos(phi))
    F_x[3, 3] = dt*(w_y*cos(phi) - w_z*sin(phi))*tan(theta) + 1
    F_x[3, 4] = dt*(w_y*sin(phi) + w_z*cos(phi))/cos(theta)**2
    F_x[4, 3] = -dt*(w_y*sin(phi) + w_z*cos(phi))
    F_x[5, 3] = dt*(w_y*cos(phi) - w_z*sin(phi))/cos(theta)
    F_x[5, 4] = dt*(w_y*sin(phi) + w_z*cos(phi))*sin(theta)/cos(theta)**2

    # Jacobian of state w.r.t. the controls
    G_i = np.zeros((6, 6))
    G_i[0, 0] = dt*cos(psi)*cos(theta)
    G_i[0, 1] = dt*(sin(phi)*sin(theta)*cos(psi) - sin(psi)*cos(phi))
    G_i[0, 2] = dt*(sin(phi)*sin(psi) + sin(theta)*cos(phi)*cos(psi))
    G_i[1, 0] = dt*sin(psi)*cos(theta)
    G_i[1, 1] = dt*(sin(phi)*sin(psi)*sin(theta) + cos(phi)*cos(psi))
    G_i[1, 2] = dt*(-sin(phi)*cos(psi) + sin(psi)*sin(theta)*cos(phi))
    G_i[2, 0] = -dt*sin(theta)
    G_i[2, 1] = dt*sin(phi)*cos(theta)
    G_i[2, 2] = dt*cos(phi)*cos(theta)
    G_i[3, 3] = dt
    G_i[3, 4] = dt*sin(phi)*tan(theta)
    G_i[3, 5] = dt*cos(phi)*tan(theta)
    G_i[4, 4] = dt*cos(phi)
    G_i[4, 5] = -dt*sin(phi)
    G_i[5, 4] = dt*sin(phi)/cos(theta)
    G_i[5, 5] = dt*cos(phi)/cos(theta)

    Uc = np.diag([0., 0., 0.0,
                  0., 0., sigma_direction_rad**2])

    P = F_x @ P @ F_x.T + G_i @ Uc @ G_i.T
print("Covariance:\n", Sigma_next[0:2, 0:2])
Covariance:
 [[ 1858.357     0.   ]
 [    0.    49714.573]]

We see that the result is essentially the same as the one given by the first-order and unscented methods above: the EKF also underestimates the along-track variance. Do we have to bother to study and use Lie algebras if the result is the same? It is an interesting question, and I will try to answer it in my next posts.

References Link to heading