
Recognizing Colours of Lego Bricks with OpenCV

  • Writer: Eva Martens
  • Jan 26
  • 4 min read

Updated: Feb 18

In the rapidly evolving world of gaming and software development, computer vision has emerged as a pivotal technology. It enables machines to interpret and understand visual information from the world, allowing for innovative applications that enhance user experiences.


Let's imagine we could design a 3D world just by placing Lego bricks on a plate, with the computer doing the rest. Sounds simple, right? Well, there are a lot of things to consider.


What's the problem?


First things first: there are tons of different methods to detect colour in an image. In order to understand what a good approach for this specific problem is, we need to first examine two things:

1) How computers store images and what "colour" spaces are

2) What problems photographing Lego has and what specific colour spaces are good at dealing with those



As usual, the problems are easier to identify than the solution. In the image above you can see a prototype plate for this project, photographed on a simple table with a phone camera using a B-Y-Y-R sensor. If we were to photograph the same plate in the same lighting conditions with different cameras, we would get vastly different results: how realistic the colours appear, how the resolution affects edge detection, and how shadows and glare skew the perceived information.

For a purpose-built system, then, not only do we need to consider the colour space, but also the camera choice and how we capture the image.


Note: This could be simplified by choosing only high-contrast brick colours and controlling the light conditions by placing the plate in a box, but if the end goal is live-updating 3D worlds, we should explore the capabilities and limitations of the computer vision model first.

Summary of colour spaces

Computers store images as matrices of pixel values, where each value corresponds to information for that pixel. Greyscale images store a single 2D matrix holding a lightness value for each individual pixel. Colour images store multiple matrices, each corresponding to a channel. What each channel represents differs from colour model to colour model. The reason we have different models is that there are different ways to represent the colour spectrum and interpret this information. Every colour, when it is dark enough, converges to black to the human eye. Do we store every colour's darkness/lightness value separately, or do we use a compound value to infer this information? This is what the colour models manage.
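To make this concrete, here is a minimal sketch (the image sizes are illustrative) of how a greyscale and a colour image differ when stored as matrices:

```python
import numpy as np

# A greyscale image: a single 2D matrix, one lightness value per pixel.
grey = np.zeros((480, 640), dtype=np.uint8)

# A colour image: three stacked matrices, one per channel (e.g. BGR).
colour = np.zeros((480, 640, 3), dtype=np.uint8)
```

Each channel is itself a 2D matrix of the same height and width; the colour model defines what those channels mean.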


While this list is by no means exhaustive, the four most commonly considered colour spaces are RGB, HSV, L*a*b, and L*u*v.


RGB (and BGR)

Typical cameras actually use RGB (or RGGB) sensors, and store images in the same way. The OpenCV library conveniently converts most common image types to the RGB (or, in its case, BGR) colour space automatically.

Colour spaces that split the pixel information into Red, Green and Blue have one channel for each of these values. Every value is a number between 0 and 255, making for 256^3 (over 16 million) different colour combinations.


This is a lot of data and a lot to process. Not to mention, matrix operations are expensive. A high quality camera with high resolution will result in any operations in this colour space using considerable computing power. If we want to combine this with a game engine to create 3D worlds, and not end up having to spend a few grand on our PC to even run our application, then we likely want to choose a different colour space for the majority of our operations.


That is not to say this colour space can't be useful!

Let's imagine we have a green base plate, instead of grey. Then we can isolate the red colour channel and at the very least get a high contrast image for our initial edge detection.



HSV


The HSV colour space splits a colour image into its Hue, Saturation and Value.

The Hue channel defines the colour, storing a value between 0 and 179 (in OpenCV's 8-bit representation), vastly reducing the amount of information we need to process to label a colour.

Saturation describes how intense the colour is, with 0 corresponding to an absence of colour (grey or white), and 255 to the fully saturated colour.

Value describes how dark the pixel is, with 0 corresponding to black, and 255 to full brightness.



It is an easy-to-understand model, which vastly reduces the complexity of a colour image's information. Individual colours can be detected simply by defining ranges for their Hue, for example.


This is also where the model has a pretty significant problem, however. Depending on the camera and how the image is taken, this model is very susceptible to misclassifying pixels where there is a certain amount of glare or shadow. These are the two primary issues this project deals with.


The saturation channel however does allow us to identify areas with excessive glare, and possibly choose to ignore them, or apply further pre-processing if necessary.


LAB and LUV

The L*a*b and L*u*v colour spaces are perceptually uniform, device-independent colour spaces designed to match human vision, making them ideal for precise colour correction and measurement. While computationally more expensive than HSV, both of these colour spaces are considered better when dealing with reflective surfaces.

The L channel in both models stands for lightness, and separates out how close to black or white a pixel gets. These models more closely reflect the changes to colours we perceive in our vision and how lighting may affect the hue.

These models also excel at letting us calculate the exact distance between colours when evaluating and classifying them.


The only challenge is coming up with a way of accommodating this more expensive colour model, especially when it comes to live updating data through a camera feed.



Next time we will talk about how to use this information to implement some basic colour recognition, and how to choose a camera for this type of project.


