Sunday, 8 November 2015

RANSAC Line Feature Extraction from 2D Point Cloud (C#)

I have recently been researching several types of laser range finder, with the goal of developing an effective robot navigation capability for indoor spaces.

There are so many laser range finders on the market that it can be difficult to know which to choose. Just browsing Google Shopping turns up a huge list, including:

Parallax 15-122cm Laser Rangefinder (£86)
Hokuyo URG-04LX-UG01 Scanning Laser Rangefinder (£880)
SF10-C Laser Rangefinder (£430)
LIDAR-Lite 2 Laser Rangefinder (£90)
RoboPeak RPLIDAR 360° Laser Scanner (£250)

I really like the RPLIDAR 360° laser scanner – it sits at the lower end of the price range and is a full 360-degree 2D scanning (LIDAR) solution with an effective detection range of more than 6 metres. The 2D point cloud data it creates can easily be consumed by a simultaneous localisation and mapping (SLAM) algorithm.

Incidentally, you’ll find a very similar laser range finder in a Neato XV robot vacuum cleaner – so if you’re a hacker at heart, you could tear one down, get the sensor, plus harvest a host of other useful parts.

Due to environmental conditions, a point cloud generated by any laser scanning device will contain noise - so how can we filter the data and extract reliable landmarks? Landmarks are features that can easily be distinguished from the environment and repeatedly reacquired, allowing the robot to localise itself.

I decided to implement a derivative of the RANSAC algorithm. Random sample consensus (RANSAC) is an iterative method for estimating the parameters of a mathematical model from a set of observed data that contains outliers. For the point cloud data I’ve been looking at, I decided to drop the random sampling element and instead seed each candidate line from consecutive scan angles, which gives me more consistent results.

Given a dataset whose elements contain both inliers and outliers, RANSAC uses a voting scheme to find the optimal fitting result. Data elements in the dataset are used to vote for one or more models. This voting scheme rests on two assumptions: that the noisy features will not vote consistently for any single model, and that there are enough features to agree on a good model.

The code below first creates a noisy data set, simulating a laser scan point cloud for a square room. It then applies my hacked implementation of the RANSAC algorithm to correctly identify the walls from the noisy data.

Code Snippet
using System;
using System.Drawing;
using System.Windows.Forms;

public class Ransac : Form
{
    const int MAX_LINES = 4;                 // Maximum number of lines to extract.
    const int SAMPLE_SIZE = 10;              // Consecutive scan points used to seed a candidate line.
    const int RANSAC_CONSENSUS = 50;         // Minimum number of points required to accept a line.
    const double RANSAC_TOLERANCE = 7.5;     // Maximum distance from a candidate line for a point to vote for it.
    const double D2R = Math.PI / 180.0;      // Degrees to radians.

    double[] LinesA = new double[MAX_LINES]; // Gradient of each extracted line (y = ax + b).
    double[] LinesB = new double[MAX_LINES]; // Intercept of each extracted line.
    double[] PointCloud = new double[360];   // Simulated laser scan: one range reading per degree.
    bool Draw = false;
    int LineCount = 0;

    [STAThread]
    static void Main()
    {
        Application.Run(new Ransac());
    }

    public Ransac()
    {
        Width = 500;
        Height = 525;
        Text = "RANSAC Demo";

        CreateNoisyPointCloud();
        ExtractLines();
        Draw = true;
    }

    // Simulate a noisy 360 degree scan of a square room, 300 units across.
    private void CreateNoisyPointCloud()
    {
        Random rnd = new Random();
        for (int i = 0; i < 90; i++)
        {
            double length = 150 / Math.Cos((i - 45) * D2R);
            PointCloud[i] = length + rnd.NextDouble() * 16 - 8;
            PointCloud[i + 90] = length + rnd.NextDouble() * 16 - 8;
            PointCloud[i + 180] = length + rnd.NextDouble() * 16 - 8;
            PointCloud[i + 270] = length + rnd.NextDouble() * 16 - 8;
        }
    }

    // Sequential (non-random) RANSAC: seed a candidate line from each run of
    // consecutive scan angles, then let the unclaimed points vote for it.
    private void ExtractLines()
    {
        int[] pointStatus = new int[360];
        for (int seed = 0; seed < 360; seed++)
        {
            int[] sample = new int[SAMPLE_SIZE];
            for (int i = 0; i < SAMPLE_SIZE; i++) sample[i] = (seed + i) % 360;

            int[] fitPoints = new int[360];
            int fitCount = 0;

            // Fit a trial line to the seed sample.
            double a = 0;
            double b = 0;
            LeastSquaresFit(PointCloud, sample, SAMPLE_SIZE, ref a, ref b);

            for (int i = 0; i < 360; i++)
            {
                if (pointStatus[i] == 0)
                {
                    // Convert scan vectors to cartesian coordinates.
                    double x = Math.Cos(i * D2R) * PointCloud[i];
                    double y = Math.Sin(i * D2R) * PointCloud[i];

                    // Claim points close to sample line.
                    if (DistanceToLine(x, y, a, b) < RANSAC_TOLERANCE)
                        fitPoints[fitCount++] = i;
                }
            }

            if (fitCount > RANSAC_CONSENSUS && LineCount < MAX_LINES)
            {
                // Refresh line and add to collection.
                LeastSquaresFit(PointCloud, fitPoints, fitCount, ref a, ref b);
                LinesA[LineCount] = a;
                LinesB[LineCount] = b;
                LineCount++;

                // Update point cloud status.
                for (int i = 0; i < fitCount; i++) pointStatus[fitPoints[i]] = LineCount;
            }
        }
    }

    // Ordinary least squares fit of y = ax + b to the selected scan points.
    private void LeastSquaresFit(double[] pointCloud, int[] selection, int count, ref double a, ref double b)
    {
        double sumX = 0;
        double sumY = 0;
        double sumXX = 0;
        double sumYY = 0;
        double sumXY = 0;

        for (int i = 0; i < count; i++)
        {
            double x = Math.Cos(selection[i] * D2R) * pointCloud[selection[i]];
            double y = Math.Sin(selection[i] * D2R) * pointCloud[selection[i]];
            sumX += x;
            sumXX += Math.Pow(x, 2);
            sumY += y;
            sumYY += Math.Pow(y, 2);
            sumXY += x * y;
        }

        a = (count * sumXY - sumX * sumY) / (count * sumXX - Math.Pow(sumX, 2));
        b = (sumY * sumXX - sumX * sumXY) / (count * sumXX - Math.Pow(sumX, 2));
    }

    // Perpendicular distance from point (x, y) to the line y = ax + b, found by
    // intersecting the line with the perpendicular that passes through the point.
    private double DistanceToLine(double x, double y, double a, double b)
    {
        double ao = -1.0 / a;
        double bo = y - ao * x;
        double px = (b - bo) / (ao - a);
        double py = ((ao * (b - bo)) / (ao - a)) + bo;

        return Math.Sqrt(Math.Pow(x - px, 2) + Math.Pow(y - py, 2));
    }

    protected override void OnPaint(PaintEventArgs e)
    {
        if (Draw)
        {
            Graphics g = e.Graphics;

            // Draw the extracted lines (screen origin shifted to the centre of the form).
            Pen pen = new Pen(Color.Gray, 4);
            for (int i = 0; i < LineCount; i++)
            {
                int x1 = 0;
                int y1 = 250 + (int)(LinesA[i] * -250 + LinesB[i]);
                int x2 = 500;
                int y2 = 250 + (int)(LinesA[i] * 250 + LinesB[i]);
                g.DrawLine(pen, x1, y1, x2, y2);
            }

            // Draw the raw point cloud.
            pen = new Pen(Color.Red, 4);
            for (int i = 0; i < 360; i++)
            {
                int x = 250 + (int)(Math.Cos(i * D2R) * PointCloud[i]);
                int y = 250 + (int)(Math.Sin(i * D2R) * PointCloud[i]);
                g.DrawEllipse(pen, x, y, 1, 1);
            }
        }
    }
}
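If you want to try it yourself, the listing above is a single self-contained file. On a Windows machine with the .NET Framework it should build with something like csc Ransac.cs (you may need to add references to System.Drawing.dll and System.Windows.Forms.dll depending on your compiler setup). Running the resulting executable pops up a form showing the noisy point cloud in red with the extracted walls drawn in grey.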

Wednesday, 6 May 2015

From Deep Blue to Deep Trouble?

On 11th May 1997, Deep Blue, a chess-playing computer developed by IBM, beat the then world champion Garry Kasparov at chess, winning the six-game match 3½–2½ with two wins, three draws and one loss. Kasparov accused IBM of cheating and demanded a rematch, but IBM refused and dismantled Deep Blue. Kasparov had beaten a previous version of Deep Blue in 1996.

By 2011, IBM had built an even more intelligent computer called Watson – named after IBM founder Thomas Watson. Watson is an artificial intelligence computer system capable of answering questions posed in natural language. As a test of its abilities, Watson competed on the TV quiz show Jeopardy!, in the show’s only human-versus-machine match to date. Watson managed to beat Brad Rutter and Ken Jennings, the biggest all-time money winner and the longest championship streak record holder respectively. Behind the scenes, Watson had access to 200 million pages of structured and unstructured data, consuming four terabytes of disk storage and including the full text of Wikipedia. However, it was not connected to the Internet during the game.

Today, Watson is marketed as a tool for people to explore and use. Watson is not alone: Microsoft have launched Azure ML, their machine learning platform, and every day new companies are opening for business, promising to provide the answers to humanity’s toughest problems.

Computer scientists like Geoffrey Hinton, Yann LeCun and Andrew Ng are leading the way with improved machine learning techniques that have recently led to great advances in deep learning systems.

Software advances are being matched in hardware by the unstoppable Moore's law, which is the observation that over the history of computing hardware, the number of transistors on integrated circuits doubles approximately every two years. The law is named after Intel co-founder Gordon Moore, who described the trend in his 1965 paper. His prediction has proven to be most accurate - in part because the law is now used in the semiconductor industry to guide long-term planning and to set targets for research and development. The capabilities of many digital electronic devices are strongly linked to Moore's law: processing speed, memory capacity, sensors and even the number and size of pixels in digital cameras. All of these are improving at exponential rates as well.

Where will it all end? Each stage of technical development and each computerised victory brings us inevitably closer to the day that machines will outsmart humans…

There are those who call themselves Singularitarians who believe that the creation of a super intelligence, the Singularity, will happen in the near future and that deliberate action ought to be taken to ensure that this intelligence benefits humans. Singularitarians are distinguished from other futurists who speculate on a technological singularity by their belief that the Singularity is not only possible, but desirable if guided prudently.

On the flip side, there are some prominent figures, including Elon Musk and Stephen Hawking, who warn against major advances in artificial intelligence. In a recent interview with the BBC, Hawking stated:

“The primitive forms of artificial intelligence we already have, have proved very useful. But I think the development of full artificial intelligence could spell the end of the human race. Once humans develop artificial intelligence it would take off on its own and redesign itself at an ever-increasing rate. Humans, who are limited by slow biological evolution, couldn’t compete and would be superseded.”

I’m certainly not up there on the intelligence scales with Stephen Hawking, but I do have a view. We are undoubtedly developing computers that are becoming more intelligent. The problems these computers solve are very useful: self-driving cars and speech recognition – where would I be without Siri?!

However, these computers are in no way sentient – they are merely very good at recognising patterns - they have no personal goals or desires. Animals made this jump with the evolution of the neocortex. In many ways this is what allows mammals to learn new behaviours and for humans to develop conscious thought and language.

To match human-level intelligence, with goals and desires, we must make monumental advances in learning algorithms and develop fundamentally new approaches. We must learn to create the equivalent of a neocortex that sits over lower-level learning algorithms.

That’s not to say we won’t get there one day – I’m certain we will - but we’re a long way from that just yet and have plenty of time to think about necessary safety concerns.

I, for one, welcome our new machine overlords..!

Friday, 27 February 2015

From Neural Networks to Deep Learning

A few years ago, I began blogging about Neural Networks. I have had an interest in this side of machine learning for longer than I can remember. However, even though these amazingly useful constructs have been used to solve many real-world problems, they have never really delivered on the dream of a true artificial intelligence – until now. With the advent of Deep Learning algorithms, this is all about to change…

Neural Networks began as single-layer networks that could be used to solve “linearly separable” classification problems. This type of network was known as the perceptron.
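
As a minimal sketch of that idea, the snippet below trains a perceptron on logical AND – a linearly separable problem – using the classic learning rule: whenever the thresholded output is wrong, nudge the weights towards the correct answer. The training values and learning rate here are purely illustrative.

using System;

public class PerceptronDemo
{
    static void Main()
    {
        // Training set for logical AND - a linearly separable problem.
        double[][] inputs = { new double[] { 0, 0 }, new double[] { 0, 1 },
                              new double[] { 1, 0 }, new double[] { 1, 1 } };
        int[] targets = { 0, 0, 0, 1 };

        double[] weights = { 0.0, 0.0 };
        double bias = 0.0;
        double learningRate = 0.1;

        // Repeatedly sweep the training set, correcting the weights on every error.
        for (int epoch = 0; epoch < 20; epoch++)
        {
            for (int i = 0; i < inputs.Length; i++)
            {
                double sum = bias + weights[0] * inputs[i][0] + weights[1] * inputs[i][1];
                int output = sum > 0 ? 1 : 0;      // Hard threshold activation.
                int error = targets[i] - output;   // +1, 0 or -1.

                weights[0] += learningRate * error * inputs[i][0];
                weights[1] += learningRate * error * inputs[i][1];
                bias += learningRate * error;
            }
        }

        Console.WriteLine("w0={0:F2} w1={1:F2} bias={2:F2}", weights[0], weights[1], bias);
    }
}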

Some very bright people then discovered how to do “back propagation”, which allowed (in theory) multi-layer networks to solve any type of classification problem. The back propagation algorithm is so called because of the way it works: it compares the output of a network with the desired value and feeds tiny amounts of the error back through the network to modify the weights.
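
As a rough illustration (and keeping things deliberately tiny), the sketch below shows repeated back propagation steps for a toy network with a single sigmoid hidden unit and a single sigmoid output: the output error is turned into a delta, fed back through the connecting weight to give a hidden-layer delta, and both weights are nudged by a small learning rate. The network shape and numbers are made up for illustration only.

using System;

public class BackPropStep
{
    static double Sigmoid(double z) { return 1.0 / (1.0 + Math.Exp(-z)); }

    static void Main()
    {
        // A toy network: one input -> one hidden sigmoid unit -> one sigmoid output.
        double input = 1.0, target = 0.0;
        double wInToHidden = 0.5, wHiddenToOut = 0.5;
        double learningRate = 0.5;

        for (int step = 0; step < 100; step++)
        {
            // Forward pass.
            double hidden = Sigmoid(input * wInToHidden);
            double output = Sigmoid(hidden * wHiddenToOut);

            // Backward pass: compare the output with the desired value...
            double deltaOut = (target - output) * output * (1 - output);
            // ...and feed a little of that error back through the hidden layer.
            double deltaHidden = deltaOut * wHiddenToOut * hidden * (1 - hidden);

            // Nudge the weights in the direction that reduces the error.
            wHiddenToOut += learningRate * deltaOut * hidden;
            wInToHidden += learningRate * deltaHidden * input;

            if (step % 20 == 0)
                Console.WriteLine("step {0,3}: output = {1:F4}", step, output);
        }
    }
}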

If you wanted to do something useful with a Neural Network, such as pattern recognition – identifying images that contain a car, say – you would start by converting raw pixel inputs into feature activations. These feature activations were often hand crafted, designed to pick out something like an individual wheel or grill. The network would then learn how to weight the feature activations and decide what it was seeing in the image.

However, using back propagation to solve these problems really did not work for a number of reasons:
  • It’s really hard to hand craft feature detectors
  • It requires pre-classified (labelled) training data - almost all real world data is unlabelled
  • The learning time does not scale well - especially with really large networks
  • The network can often get stuck in “local optima” – it will stop learning before arriving at the correct solution
These serious limitations rendered neural networks nothing more than a computer scientist’s plaything for several decades.

But then, with the passage of time, the story slowly changed. The rise of the Internet and Big Data brought with it huge amounts of labelled data. Computers also got a lot faster – by orders of magnitude – especially with the arrival of Graphics Processing Units (GPUs). And, most importantly, we learnt new and better techniques to initialise the networks.

The key difference between the techniques used in modern deep learning algorithms and the neural networks of old is that the network creates its own feature detectors – they are not hand crafted. Therefore, the only limitation is computing power – and we have plenty of that!

Deep networks learn one layer at a time, using a generative model of the input (visible) data that connects to a layer of latent (hidden) nodes. The hidden layer is then used to train a second generative model against the next hidden layer, and so on. One technique used to achieve this is a restricted Boltzmann Machine (I’ll post some code next time).
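
As a flavour of what that layer-wise training step involves, here is a very rough sketch of the contrastive divergence (CD-1) weight update for a single RBM with binary units, leaving out bias terms and other practical details for brevity – it is only meant to illustrate the idea, not to be the code I’ll be posting.

using System;

public class RbmSketch
{
    static Random rnd = new Random();

    static double Sigmoid(double z) { return 1.0 / (1.0 + Math.Exp(-z)); }

    // Sample a layer of binary units given the states of the layer feeding into it.
    static int[] SampleLayer(int[] from, double[,] weights, bool visibleToHidden)
    {
        int count = visibleToHidden ? weights.GetLength(1) : weights.GetLength(0);
        int[] result = new int[count];
        for (int j = 0; j < count; j++)
        {
            double sum = 0;
            for (int i = 0; i < from.Length; i++)
                sum += visibleToHidden ? from[i] * weights[i, j] : from[i] * weights[j, i];
            result[j] = rnd.NextDouble() < Sigmoid(sum) ? 1 : 0;
        }
        return result;
    }

    static void Main()
    {
        int visibleCount = 6, hiddenCount = 3;
        double learningRate = 0.1;
        double[,] weights = new double[visibleCount, hiddenCount];

        // A tiny "dataset": the visible pattern we want the RBM to model.
        int[] v0 = { 1, 1, 1, 0, 0, 0 };

        for (int epoch = 0; epoch < 1000; epoch++)
        {
            // Positive phase: drive the hidden layer from the data.
            int[] h0 = SampleLayer(v0, weights, true);

            // Negative phase: reconstruct the visible layer, then resample the hidden layer.
            int[] v1 = SampleLayer(h0, weights, false);
            int[] h1 = SampleLayer(v1, weights, true);

            // CD-1 update: strengthen data correlations, weaken reconstruction correlations.
            for (int i = 0; i < visibleCount; i++)
                for (int j = 0; j < hiddenCount; j++)
                    weights[i, j] += learningRate * (v0[i] * h0[j] - v1[i] * h1[j]);
        }
    }
}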

Just like the human visual system, deep learning systems for image recognition process information in layers. For example, the first layer may learn correlations between pixels, beginning to form tiny edge detectors. By the time you reach the third or fourth layer, the activations could represent complete wheels, hands, faces, etc.

Fast forward to today – Google scientists have developed a computer program capable of learning a wide variety of tasks independently, in what has been hailed as “a significant step towards true artificial intelligence”.

The program learnt to play 49 different retro computer games, and came up with its own strategies for winning. The research was carried out by DeepMind, the British company bought by Google last year for £400m, whose stated aim is to build “smart machines”.

Likewise, Microsoft believes that too much of the world’s Big Data is going to waste and has just launched a new initiative to help organisations process it all, build APIs and finally make some sense of it. The technology, called Azure Machine Learning (ML), is a new cloud-based service that can be accessed via any web browser. It is simple to use, featuring a drag-and-drop interface aimed at data scientists and developers. The main aim of Azure ML is to reduce the amount of work needed for organisations to deploy machine learning.

Not to be left behind, Facebook has a project known as DeepFace that can pick out your true identity from almost any picture of you. The DeepFace AI system is now powerful enough to spot individual users among the 400 million photos uploaded to the social network every single day.

In the future, deep learning systems could be used to power self-driving cars and personal assistants in smartphones, or to conduct scientific research in fields from climate change to cosmology.

Exciting times..!