July 30, 2020

End-to-end Object Detection with Template Matching using Python

How to implement custom object detection with template matching. No annotated data needed!

Object detection with Template matching to detect components

Today, state-of-the-art object detection algorithms (algorithms that detect objects in pictures) use neural networks such as YOLOv4.

An object detection output

Template matching is a digital image processing technique for finding the small parts of an image that match a template image. It is a much simpler approach to object detection than a neural network, and it comes with the following benefits:

  • no need to annotate data (a time-consuming and mandatory task to train neural networks)

  • bounding boxes are more accurate

  • no need for GPU

In my experience, combining a neural network like YOLOv4 with object detection based on template matching is a good way to considerably improve your neural network's performance!

What is template matching?

When you use OpenCV template matching, your template slides pixel by pixel over your image. For each position, a similarity metric is computed between the template image and the part of the image it covers:

Using template matching to detect French ID in scanned documents

If the similarity metric is high enough for one pixel, then this pixel is probably the top-left corner of an object matching your template!
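
To make the sliding-window idea concrete, here is a minimal pure-Python sketch of a normalized correlation coefficient score, the kind of metric OpenCV computes with TM_CCOEFF_NORMED. It is an illustration on tiny nested lists, not OpenCV's implementation: the function name `match_template_ccoeff_normed` and the toy image are my own assumptions.

```python
import math


def match_template_ccoeff_normed(image, template):
    """Slide `template` over `image` (2-D lists of numbers) and return a score map.

    Each score is the correlation between the mean-centered template and the
    mean-centered image patch it covers, so it lies between -1 and 1.
    """
    img_h, img_w = len(image), len(image[0])
    tpl_h, tpl_w = len(template), len(template[0])

    tpl_values = [v for row in template for v in row]
    tpl_mean = sum(tpl_values) / len(tpl_values)
    tpl_centered = [v - tpl_mean for v in tpl_values]
    tpl_norm = math.sqrt(sum(v * v for v in tpl_centered))

    scores = []
    for y in range(img_h - tpl_h + 1):
        row_scores = []
        for x in range(img_w - tpl_w + 1):
            # Part of the image currently covered by the template.
            patch = [image[y + dy][x + dx] for dy in range(tpl_h) for dx in range(tpl_w)]
            patch_mean = sum(patch) / len(patch)
            patch_centered = [v - patch_mean for v in patch]
            patch_norm = math.sqrt(sum(v * v for v in patch_centered))
            denom = tpl_norm * patch_norm
            if denom == 0:
                # Flat patch or flat template: correlation is undefined, use 0.
                row_scores.append(0.0)
            else:
                num = sum(t * p for t, p in zip(tpl_centered, patch_centered))
                row_scores.append(num / denom)
        scores.append(row_scores)
    return scores


image = [
    [0, 0, 0, 0, 0],
    [0, 0, 9, 1, 0],
    [0, 0, 1, 9, 0],
    [0, 0, 0, 0, 0],
]
template = [[9, 1], [1, 9]]

scores = match_template_ccoeff_normed(image, template)
best_y, best_x = max(
    ((y, x) for y in range(len(scores)) for x in range(len(scores[0]))),
    key=lambda pos: scores[pos[0]][pos[1]],
)
print(best_x, best_y)  # prints: 2 1 (top-left corner of the best match)
```

The score map has one cell per position the template can take, which is why the highest-scoring cell gives the top-left corner of the match.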

Consequently, object detection with template matching only works if the objects you are trying to detect are similar enough (almost identical) within a class. You can include more templates to handle object variations (size, color, orientation), but this increases the prediction time.

At first glance, this seems very restrictive. But many object detection use cases can be tackled with template matching:

  • ID in scanned documents

  • empty parking space from a stationary camera

  • components on an assembly line...

A Practical Example

A good use case for object detection using template matching is to detect components on printed circuits, such as this one:

A printed circuit - Photo by Umberto on Unsplash

Imagine an assembly line producing such circuits, where some of the manufactured circuits are missing components and are thus defective. We could install a camera at the end of the line and photograph each circuit in order to filter out defective products. We can achieve this with object detection with template matching!

For the sake of simplicity, we will focus on the detection of a few components.

A first component appearing twice:

Component 1

This one appearing four times:

Component 2

And this third one appearing six times:

Component 3

We choose these three images as templates. This reduces the complexity of the use case: at the very least, we will easily detect the objects chosen as templates.

Basic object detection with template matching

Defining template

Firstly, we define templates from:

  • an image path,

  • a label,

  • a color (for result visualization: the color of the bounding boxes and labels),

  • and a matching threshold.

Secondly, we consider that all pixels having a similarity metric above this threshold indicate a detection for this template.

Here is the code defining templates:

Defining templates
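
A minimal sketch of what such a template definition could look like. The class name `Template`, the file names, the labels, the colors, and the threshold values are all placeholders I chose for illustration; only the four fields themselves come from the list above.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Template:
    """One template: where its image lives, how to label and draw it, and
    the similarity score above which a pixel counts as a detection."""

    image_path: str
    label: str
    color: Tuple[int, int, int]  # BGR order, as OpenCV drawing functions expect
    matching_threshold: float = 0.5  # hypothetical default


# Hypothetical file names, labels, colors, and thresholds.
templates = [
    Template(image_path="component1.jpg", label="1", color=(0, 0, 255)),
    Template(image_path="component2.jpg", label="2", color=(0, 255, 0)),
    Template(image_path="component3.jpg", label="3", color=(0, 255, 255)),
]
```

Keeping the threshold on the template, rather than as one global value, makes it possible to tune each component separately later on.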

Detecting object with template matching

Then, we loop over the templates to perform object detection with template matching for each one. Because we are using a threshold, we select a normalized similarity metric (TM_CCOEFF_NORMED) when applying template matching. Its scores lie between -1 and 1, with 1 being a perfect match, so we can pick a threshold between 0 and 1:

Object detection with template matching

We consider that each pixel having a similarity score above the template threshold is the top-left corner of an object (with the template’s height, width, and label).

Visualize detected objects

Then, we plot the predicted bounding boxes of this object detection with template matching on the input image:

Display object detection results

Finally, we obtain the following results:

Duplicated detected objects

As indicated by the thickness of boxes (in green, yellow, and red), each object has been detected several times.

Remove duplicates

Why did we obtain duplicated detections? OpenCV template matching returns a 2-D matrix of similarity scores, with one cell for each position the template can take on the input image. For an input image of size W × H and a template of size w × h, this matrix has size (W − w + 1) × (H − h + 1), so it is almost as large as the input image.

Therefore, if an object is detected at one location, the surrounding pixels will most likely have a similar score, and will thus also be considered top-left corners of objects.

To tackle this issue, we will sort all detections by decreasing matching values. Then, we will choose whether or not to validate each detection. We validate the detection if it is not overlapping too much with any of the already validated detections. Finally, we determine that two detections are overlapping if the Intersection over Union of their bounding boxes is above a given threshold. This process is called Non-Maximum Suppression.

Here is a visual explanation of what Intersection over Union (IoU) is:

Intersection over Union
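
In other words, IoU is the area of the intersection of two boxes divided by the area of their union. A minimal sketch, assuming boxes are given as `(x1, y1, x2, y2)` tuples with the top-left corner first (the function name and box layout are my own choices here):

```python
def compute_iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlapping rectangle (0 if the boxes are disjoint).
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0


# Two 10x10 boxes offset by 5 pixels: intersection 25, union 175.
print(compute_iou((0, 0, 10, 10), (5, 5, 15, 15)))
```

Identical boxes give an IoU of 1, and boxes that do not touch give 0.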

Here is how I implemented it (compute IoU method along with more explanations can be found here):

Non-Maximum Suppression

And then, I just added these two lines after the detection loop:

Apply NMS

As a result, we obtain:

Deduplicated detected objects

Much cleaner! We now clearly see that every instance of components 1 and 3 is detected, without any false positives (precision and recall of 1).

We now want to reduce the number of false positives for component 2. The easiest way is to increase the matching threshold for the template used for this label.

Choosing hyperparameters

It is of course better to compute object detection metrics on various images to choose hyperparameters (template matching threshold and Non-Maximum Suppression threshold). For now, we can simply increase the threshold for component 2:

Choose template matching thresholds
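
Since a detection is simply a position whose score exceeds the threshold, raising a threshold is equivalent to discarding the raw detections (before Non-Maximum Suppression) whose score falls below the new value. A minimal sketch of that idea, with a hypothetical helper name, hypothetical dictionary keys, and made-up threshold values:

```python
def filter_by_threshold(detections, thresholds, default_threshold=0.5):
    """Keep detections whose score exceeds the threshold set for their label."""
    return [
        det for det in detections
        if det["match_value"] > thresholds.get(det["label"], default_threshold)
    ]


# Hypothetical values: only component 2 gets a stricter threshold.
thresholds = {"2": 0.7}

candidates = [
    {"label": "1", "match_value": 0.60},
    {"label": "2", "match_value": 0.65},  # dropped by the stricter threshold
    {"label": "2", "match_value": 0.85},
]
selected = filter_by_threshold(candidates, thresholds)
```

This makes it cheap to try several thresholds without running template matching again.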

And we obtain:

Final output

We now have only two false positives for component 2, instead of dozens (precision of 2/3, recall of 1)! In addition, components 1 and 3 are still perfectly detected (precision and recall of 1).
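
As a quick check of these figures for component 2: its four real instances are all detected (4 true positives, 0 false negatives) and there are 2 false positives. The helper below is a small sketch with a name of my own choosing.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Precision = TP / (TP + FP), recall = TP / (TP + FN)."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Component 2: 4 instances found, 2 wrong boxes, nothing missed.
precision, recall = precision_recall(true_positives=4, false_positives=2, false_negatives=0)
print(precision, recall)  # precision is 4/6 = 2/3, recall is 4/4 = 1
```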

To improve our results we could:

  • include templates for the components mistaken with component 2.

  • try several similarity metrics

  • annotate a few images to compute detection metrics, and perform a grid search on these parameters.

Summary

We have achieved object detection with template matching by:

  • defining at least one template for each object (the more templates you have for one object, the higher your recall, and the lower your precision)

  • using OpenCV template matching method on the image for each template

  • considering that each pixel having a similarity score above a template threshold is the top-left corner of an object (with this template’s height, width, and label)

  • applying Non-Maximum Suppression of the detections obtained

  • choosing template thresholds to improve detection accuracy!

That’s it!
