Character Segmentation and Recognition for Vehicle License Plates

August 19, 2020

Following my previous post (http://dangminhthang.com/computer-vision/character-recognition-using-alexnet/), some readers wondered what can be done with the model once it has been trained. In this post I will share how the pre-trained model can be used in a case study. Specifically, we will use image processing and deep learning techniques for character segmentation and recognition on a vehicle license plate.

Given a vehicle license plate image in Fig. 1 below, let’s see how we can “teach” the computer to “read” it.

Figure 1. A vehicle license plate

There are many methods for character segmentation and recognition, including advanced and complex deep learning algorithms. In this post, however, we will use a simpler approach. Reading the license plate takes two stages: the first segments the characters, and the second recognises them.

Let’s get started by performing character segmentation on the license plate in Fig. 1.

The following code uses Python, OpenCV (a powerful library for image processing and computer vision), and Keras (with a TensorFlow backend).

Character Segmentation

First we read the image and convert it into grayscale:

# Import the necessary packages
import argparse
import cv2
import numpy as np
from keras.models import load_model
from keras.preprocessing.image import img_to_array
import functools
 
# Construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True, help="Path to the image")
ap.add_argument("-m", "--model", required=True, help="Path to the pre-trained model")
args = vars(ap.parse_args())

# Read the image and convert to grayscale
image = cv2.imread(args["image"])
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

Then we apply Gaussian blurring and thresholding to reveal the characters on the license plate:

# Apply Gaussian blurring and thresholding
# to reveal the characters on the license plate
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
# blockSize=45 and C=15 are heuristic parameters, tuned for this image
thresh = cv2.adaptiveThreshold(blurred, 255,
    cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY_INV, 45, 15)

The result is a binary image as follows:

Figure 2. Vehicle license plate binary image

There are many white “blobs” in the binary image. We need to determine which white blobs are license plate characters. This can be done by applying an algorithm called connected-component analysis.

Let’s see the code:

# Perform connected components analysis on the thresholded image and
# initialize the mask to hold only the components we are interested in
_, labels = cv2.connectedComponents(thresh)
mask = np.zeros(thresh.shape, dtype="uint8")

The connectedComponents method returns labels, a NumPy array with the same dimensions as our thresh image. Each element of labels is 0 if it belongs to the background, or a positive integer if it belongs to a connected component. Each connected component corresponds to a white blob in Fig. 2 and has a unique label.
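As an aside, we can quickly inspect the size of each component with NumPy (a sanity check only, not part of the final script):

# Optional: print the pixel count of every connected component
unique_labels, counts = np.unique(labels, return_counts=True)
for label, count in zip(unique_labels, counts):
    if label == 0:
        continue  # label 0 is the background
    print("Component {}: {} pixels".format(label, count))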

So how can we decide whether a white blob is a character? A heuristic approach is used here. From the binary image it can be seen that the number of pixels of every character falls within a certain range. Therefore we set a lower bound and an upper bound (here 1/70 and 1/20 of the total number of image pixels, chosen empirically) on the number of pixels each connected component may have:

# Set lower bound and upper bound criteria for characters
total_pixels = image.shape[0] * image.shape[1]
lower = total_pixels // 70 # heuristic param, can be fine tuned if necessary
upper = total_pixels // 20 # heuristic param, can be fine tuned if necessary
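To make the heuristic concrete, here is a quick worked example with a hypothetical 700x400 input image:

# Worked example for a hypothetical 700x400 image
example_total = 700 * 400        # 280,000 pixels
print(example_total // 70)       # lower bound: 4000 pixels
print(example_total // 20)       # upper bound: 14000 pixels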

Next we can loop over each connected-component and determine if it is a license plate character:

# Loop over the unique component labels
for label in np.unique(labels):
    # If this is the background label, ignore it
    if label == 0:
        continue

    # Otherwise, construct the label mask to display only the
    # connected component for the current label
    labelMask = np.zeros(thresh.shape, dtype="uint8")
    labelMask[labels == label] = 255
    numPixels = cv2.countNonZero(labelMask)

    # If the number of pixels in the component is between the lower
    # and upper bounds, add it to our mask
    if lower < numPixels < upper:
        mask = cv2.add(mask, labelMask)

The mask now contains only the white blobs of characters. Other white blobs have been discarded. If we display the mask it will look like Fig. 3 below:

Figure 3. White blobs of characters
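As a side note, the mask can be displayed with a quick OpenCV HighGUI call (for inspection only; the window name is arbitrary):

# Show the mask in a window until a key is pressed
cv2.imshow("Mask", mask)
cv2.waitKey(0)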

By finding contours we can get the bounding boxes of the license plate characters:

# Find contours and get a bounding box for each contour
# (note: in OpenCV 3.x, findContours returns three values instead of two)
cnts, _ = cv2.findContours(mask.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
boundingBoxes = [cv2.boundingRect(c) for c in cnts]

The contours returned by findContours are not in any particular order. Hence, before we proceed further, we need to sort the bounding boxes from left to right and top to bottom so that the characters are read in the correct order:

# Sort the bounding boxes from left to right, top to bottom
# sort by Y first, and then sort by X if Ys are similar
def compare(rect1, rect2):
    if abs(rect1[1] - rect2[1]) > 10:
        return rect1[1] - rect2[1]
    else:
        return rect1[0] - rect2[0]
boundingBoxes = sorted(boundingBoxes, key=functools.cmp_to_key(compare))
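As a side note, a comparator-free sketch would quantise y into rows and sort by a tuple key. This is only a rough alternative, since fixed 10-pixel buckets can split a row that straddles a bucket boundary, which is why the pairwise comparator above is more robust:

# Hypothetical alternative: bucket y into ~10-pixel rows, then sort by x
boundingBoxesAlt = sorted(boundingBoxes, key=lambda b: (b[1] // 10, b[0]))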

If we display the bounding boxes on the mask, we get Fig. 4; a quick way to do this is sketched below.
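A minimal visualisation aside (the window name is arbitrary):

# Draw each bounding box on a colour copy of the mask for inspection
vis = cv2.cvtColor(mask, cv2.COLOR_GRAY2BGR)
for (x, y, w, h) in boundingBoxes:
    cv2.rectangle(vis, (x, y), (x + w, y + h), (0, 255, 0), 1)
cv2.imshow("Bounding boxes", vis)
cv2.waitKey(0)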

Figure 4. Character segmentation with bounding boxes

Now that the characters have been segmented into bounding boxes, it is time to use our pre-trained model from the previous post to recognise them.

Character Recognition

Before we start recognising the characters, let’s define some constants: the model input’s width and height, and the list of possible characters (0-9, A-Z). The width and height are both 128 because that is the size of the input images in our training data.

# Define constants
TARGET_WIDTH = 128
TARGET_HEIGHT = 128

chars = [
    '0','1','2','3','4','5','6','7','8','9','A','B','C','D','E','F','G',
    'H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z'
    ]

Next we load the pre-trained model:

# Load the pre-trained convolutional neural network
model = load_model(args["model"], compile=False)
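Optionally, we can check that the loaded model's input shape matches our constants; for the AlexNet trained in the previous post this should be 128×128 with 3 channels:

# Optional sanity check: expect (None, 128, 128, 3)
print(model.input_shape)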

Then we loop over the bounding boxes and use the model to recognise characters:

vehicle_plate = ""
# Loop over the bounding boxes
for rect in boundingBoxes:

    # Get the coordinates from the bounding box
    x,y,w,h = rect

    # Crop the character from the mask
    # and apply bitwise_not because in the training data for our
    # pre-trained model the characters are black on a white background
    crop = mask[y:y+h, x:x+w]
    crop = cv2.bitwise_not(crop)

    # Get the number of rows and columns for each cropped image
    # and calculate the padding to match the image input of pre-trained model
    rows = crop.shape[0]
    columns = crop.shape[1]
    paddingY = (TARGET_HEIGHT - rows) // 2 if rows < TARGET_HEIGHT else int(0.17 * rows)
    paddingX = (TARGET_WIDTH - columns) // 2 if columns < TARGET_WIDTH else int(0.45 * columns)
    
    # Apply padding so that the image fits the input of the neural network model
    crop = cv2.copyMakeBorder(crop, paddingY, paddingY, paddingX, paddingX, cv2.BORDER_CONSTANT, None, 255)

    # Convert and resize image
    crop = cv2.cvtColor(crop, cv2.COLOR_GRAY2RGB)     
    crop = cv2.resize(crop, (TARGET_WIDTH, TARGET_HEIGHT))

    # Prepare data for prediction
    crop = crop.astype("float") / 255.0
    crop = img_to_array(crop)
    crop = np.expand_dims(crop, axis=0)

    # Make prediction
    prob = model.predict(crop)[0]
    idx = np.argmax(prob)
    vehicle_plate += chars[idx]

    # Show bounding box and prediction on image
    cv2.rectangle(image, (x,y), (x+w,y+h), (0, 255, 0), 2)
    cv2.putText(image, chars[idx], (x,y+15), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 0, 255), 2)

# Show final image
cv2.imshow('Final', image)
print("Vehicle plate: " + vehicle_plate)
cv2.waitKey(0)

Inside the loop, we first get the top-left coordinates (x, y) and the width and height of the bounding box. We then crop the character from the mask and apply bitwise_not so that the character is black on a white background, as in our training data. Next we calculate the extra padding needed for the cropped character to have a similar format to the training images, and apply it with copyMakeBorder. A few pre-processing steps follow (colour conversion, resizing, scaling to [0, 1] and adding a batch dimension) so that the data is ready to feed into our neural network, which then predicts the character. Finally, we draw the bounding box and the predicted character on the original image.
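Since the inputs are read via argparse, the script can be run from the command line; assuming it is saved as segmentation_recognition.py and the model as model.h5 (both file names are placeholders):

python segmentation_recognition.py --image plate.jpg --model model.h5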

The final result can be seen in Fig. 5 below:

Figure 5. Character segmentation and recognition for a vehicle license plate

Some other examples can be viewed in the following video:

Limitations

This case study is by no means a solution that can be used in real-world products. It has certain limitations:

  • It depends on many hand-picked parameters, such as those used in thresholding or in finding character blobs. These require fine-tuning and do not work universally in all cases.
  • The input image used in this post is a top-down, bird's-eye view with good lighting. In the real world, input images may have different viewing angles, bad lighting conditions and/or noise.

Nevertheless, it is unrealistic to expect 100% accuracy in all cases from any approach. I hope this article has given you a useful introduction to OCR (Optical Character Recognition).

Full Source Code

Source code is available at https://github.com/minhthangdang/CharactersSegmentationRecognition
