A circle detection method proposal for image processing based on Hu invariant moments, Hough transform, and template matching


UNIVERSIDADE ESTADUAL DE CAMPINAS Faculdade de Engenharia Elétrica e de Computação

ABEL ALEJANDRO DUEÑAS RODRIGUEZ

A CIRCLE DETECTION METHOD PROPOSAL FOR IMAGE PROCESSING BASED ON HU INVARIANT MOMENTS, HOUGH TRANSFORM, AND TEMPLATE MATCHING

PROPOSTA DE UM MÉTODO DE DETECÇÃO DE CÍRCULOS POR PROCESSAMENTO DE IMAGENS BASEADO EM MOMENTOS DE HU, TRANSFORMADA DE HOUGH E

TEMPLATE MATCHING

CAMPINAS 2016


Thesis presented to the School of Electrical Engineering of the University of Campinas in partial fulfillment of the requirements for the degree of Master in Electrical Engineering, in the area of Telecommunications and Telematics

Advisor: Prof. Dr. Yuzo Iano

THIS COPY CORRESPONDS TO THE FINAL VERSION OF THE DISSERTATION DEFENDED BY THE STUDENT ABEL ALEJANDRO DUEÑAS RODRIGUEZ AND SUPERVISED BY PROF. DR. YUZO IANO


Funding agency(ies) and process number(s): CAPES

Cataloging record

Universidade Estadual de Campinas, Biblioteca da Área de Engenharia e Arquitetura
Elizangela Aparecida dos Santos Souza - CRB 8/8098

D868c Dueñas Rodriguez, Abel Alejandro, 1988-

A circle detection method proposal for image processing based on Hu invariant moments, Hough transform, and template matching / Abel Alejandro Dueñas Rodriguez. – Campinas, SP : [s.n.], 2016.

Advisor: Yuzo Iano.

Dissertation (master's) – Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação.

1. Computer vision. 2. Image processing – Digital techniques. 3. Pattern recognition. I. Iano, Yuzo, 1950-. II. Universidade Estadual de Campinas. Faculdade de Engenharia Elétrica e de Computação. III. Title.

Information for the Digital Library

Title in another language: Proposta de um método de detecção de círculos por processamento de imagens baseado em momentos de Hu, transformada de Hough e template matching

Keywords in English: Computer vision; Image processing - Digital techniques; Pattern recognition

Area of concentration: Telecommunications and Telematics

Degree: Master in Electrical Engineering

Examining committee: Yuzo Iano [Advisor]; Ricardo Barroso Leite; Rangel Arthur

Defense date: 29-01-2016


EXAMINING COMMITTEE – MASTER'S THESIS

Candidate: Abel Alejandro Dueñas Rodriguez

Defense date: January 29, 2016

Thesis title: “A circle detection method proposal for image processing based on Hu invariant moments, Hough transform, and template matching” (Proposta de um método de detecção de círculos por processamento de imagens baseado em momentos de Hu, transformada de Hough e template matching)

Prof. Dr. Yuzo Iano (Chair)
Prof. Dr. Ricardo Barroso Leite
Prof. Dr. Rangel Arthur

The defense minutes, with the respective signatures of the members of the examining committee, are filed in the student's academic record.


ACKNOWLEDGEMENTS

I would like to thank my family, who gave me all their time, love, and support, which are essential to me to accomplish this and all my achievements.

I am grateful to Prof. Yuzo Iano whose patience, guidance and wisdom enabled me to carry out this work.

I must express my gratitude to the CAPES (Coordenação de Aperfeiçoamento de Pessoal de Nível Superior) program for the financial support and the academic incentive that enabled the realization of this thesis.


ABSTRACT

Circle detection has become an important tool for many software applications based on computer vision, in fields such as industry and medicine. Given its importance, it is natural that, over the years, much work has been done seeking improvements in both effectiveness and performance. In the present work, a detection method is presented, based on the combination of three widely used techniques: invariant Hu moments, the Hough transform, and template matching. A comparative study of their performance and effectiveness is presented before the proposed method is implemented, in order to identify the advantages and disadvantages of each one. A suitable solution is then presented that draws on the strengths found in the study while trying to minimize the weaknesses, improving the detection rate by 1.6% and the processing time by 311%.

Keywords: Computer vision, Image processing – Digital techniques, Pattern recognition.

LIST OF FIGURES

Figure 2.1 – (a) Google Goggles; (b) Microsoft Kinect; (c) Instagram service; (d) Facebook
Figure 2.2 – (a) Continuous image; (b) Result of sampling and quantization processes [5]
Figure 2.3 – (a) Original image; (b) Set of seeds; (c) Region growing – Threshold: 225~250; (d) Region growing – Threshold: 190~250
Figure 2.4 – (a) Mask; (b) Image ROI
Figure 2.5 – Mask used to detect isolated points
Figure 2.6 – (a) Horizontal; (b) +45°; (c) Vertical; (d) -45°
Figure 2.7 – (a) Row differential mask; (b) Column differential mask
Figure 2.8 – Catchment basins
Figure 2.9 – (a) Grayscale image; (b) Topographic representation
Figure 2.10 – (a) Minima of a function; (b) Maxima of a function
Figure 2.11 – (a) Original image; (b) Distance transform
Figure 2.12 – Dams construction process
Figure 2.13 – (a) Original image; (b) Distance transform; (c) Watershed transform; (d) Segmentation
Figure 2.14 – (a) Original image; (b) Final segmentation
Figure 2.15 – (a) Red blood cell; (b) Eye detections; (c) Steel bars counting
Figure 2.16 – Hu moments [4]
Figure 2.17 – (a) Image space; (b) Parameter space; (c) Cone generated in space a-b-r
Figure 2.18 – (a) Template; (b) Search area; (c) Sliding process; (d) Correlation values
Figure 3.1 – (a) Binary image; (b) Connected component labeling [4]; (c) Neighborhood selection; (d) Neighborhood segmentation
Figure 3.2 – (a) True positive case; (b) False positive case
Figure 3.3 – (a) Sample image; (b) Edge-detected image
Figure 3.4 – (a) Synthetic sample; (b) Edge-detected image; (c) Voting map in XY plane; (d) Orthogonal view of voting map
Figure 3.5 – (a) Original kernel of 9×9 pixels; (b) Synthetic template of 29×29 pixels
Figure 3.6 – (a) Original sliding window; (b) Correlation map without application criteria; (c) Sliding windows considering contribution neighborhood; (d) Correlation map with application criteria
Figure 3.7 – Error count histogram of the Hu moments method
Figure 3.8 – Error count histogram of the Hough transform method
Figure 3.9 – Error count histogram of the template matching method
Figure 3.10 – Performance comparison
Figure 4.1 – Circle detector algorithm proposed in [4], [32]
Figure 4.2 – Circle detector proposed in the present work
Figure 4.3 – Comparison of the first method proposed in [33] and the new proposed method
Figure 4.4 – Original image
Figure 4.5 – Binary image
Figure 4.6 – Region of interest
Figure 4.7 – Segmentation process: (a) Distance transform; (b) Locating markers; (c) Watershed transform; (d) Segmented image
Figure 4.8 – First counter layer based on invariant Hu moments: (a) First invariant moment calculation; (b) Area threshold; (c) Result of the counting
Figure 4.9 – Second counter layer based on Hough transform: (a) Canny's edge detection; (b) Cones in the space a-b-r; (c) Voting map; (d) Result of the counting
Figure 4.10 – Third counter layer based on template matching: (a) Correlation map; (b, c, d, e, f, g) Circle detection by locating peak; (g) Result of the counting
Figure 4.11 – Processed image
Figure 5.1 – Cumulative processing time
Figure 5.2 – (a) Time processing PDF in algorithm 1; (b) Time processing PDF in algorithm 2

LIST OF TABLES

Table 2.1 – Invariant Hu moments
Table 3.1 – Test computer hardware specifications
Table 3.2 – Detection rates
Table 3.3 – Total time required to process the 340 image samples
Table 3.4 – Feature comparison
Table 5.1 – Results comparison between the three methods, the first algorithm proposed and the new algorithm proposed

LIST OF ABBREVIATIONS

AI – Artificial Intelligence
CV – Computer Vision
HM – Hu Moments
HT – Hough Transform
TM – Template Matching
ROI – Region of Interest

1 INTRODUCTION

Nowadays, machines can meet or exceed human capacity for some tasks in daily life. Given how strongly humans rely on the sense of sight, the importance of image analysis in any environment is clear: with it, useful information that is explicit in a scene can be extracted from an image. This is why image analysis is one of the most studied fields in the area of artificial intelligence, widely used in everyday life and in industry [1]. A few years ago, using a machine for this purpose seemed absurd, given the low computational power of the computers of the time. Since then, thanks to technological advances, it has become both possible and convenient, given the number of cameras present in nearly every scenario today.

Image analysis can be divided into three general types of tasks in a scene. The first is to discover whether the visual appearance of objects is as it should be, in other words, an inspection task. The implicit assumption here is that the object and its location are known; if the latter is unknown, the second task is required: location. Finally, if the object itself is the unknown factor, the third task is useful: identification. In this work, considering the extent of the research area, only the last type of analysis is covered: identification in a scene. To achieve this, different characteristics such as color, texture, or shape are taken into account. A feature widely used to characterize an object is its geometry, from which it can be determined whether the object is a triangle, a square, a circle, etc.

Since the 1970s, many techniques for object identification have been developed using different characteristics of the target image, such as area, perimeter, and circularity. Some are more complex than others, generally with better functionality. For example, the template matching paradigm (TM) is generally used in inspection tasks. In location tasks, both the Hough transform (HT) and template matching are usually used. For the third kind of task, depending on the complexity of the problem, statistical Hu moments (HM), the Hough transform, or template matching can be used. These are not the only techniques, but they are probably the most used [2].

One particular case study is the detection or identification of circles. At first glance it may seem a superfluous task; however, it is important because it is used in different fields such as face recognition, the food industry, optical studies, and industrial production [3]. For this reason, the following sections address each of these methods (HM, HT, and TM), giving a brief introduction to each one and identifying which of them offers a high-quality and efficient solution to the problem presented.

Furthermore, a comparison is made between them, analyzing their advantages and disadvantages and proposing some possible uses for each one. Doing this without an image database is impossible, so 340 images of steel bar packages, used in our previous work [4], are employed. In addition, based on the test results, the HT method is applied to a previous algorithm to verify whether it improves the operation of that algorithm [5].

Finally, as can be seen in Chapter 5, the inclusion of the new detection layer based on HT, following the result of the preliminary test, allows the simplification of the old algorithm presented in [4], which has many detection layers based only on HM and TM. In this way, with an optimized algorithm, improvements are achieved in terms of detection rate and processing time.


2 COMPUTER VISION FUNDAMENTALS

Computer Vision (CV), also known as Artificial Vision, is a field of Artificial Intelligence (AI) that seeks to give computers the capability of human vision. This matters because all major intelligent life forms are able to interact with and manipulate their environment, and visual perception enables this interaction [2].

Nowadays, many companies use CV in their services. As can be seen in Figure 2.1, Google used CV to develop an image search service named Google Goggles, Microsoft is working on the new version of Kinect, Instagram bases its services on image processing, and Facebook uses CV to improve its social network with automatic tags.

Figure 2.1 – (a) Google Goggles; (b) Microsoft Kinect; (c) Instagram service; (d) Facebook

Image analysis is a Computer Vision subfield that, through image processing tools, extracts useful information from digital images or scenes. This statement is important because it shows that image analysis is not the same as image processing. Image processing is defined as any type of transformation of an image; its goal is to produce a modified (enhanced) output image through smoothing, sharpening, or any other process applied to the image [5]. Image analysis, on the other hand, is the transformation of an image into something other than an image, such as data or information that is critical for decision making. Moreover, while image processing techniques are usually applied to an image that needs to be treated, image analysis uses the enhanced image to obtain more information. Having explained this, and following a logical order of concepts, this chapter introduces some image processing notions.

2.1 Image Acquisition and Representation

The digital camera is one of the most popular devices used to produce digital images of a scene. An imaging process consists of two stages: a capture stage, which obtains the scene information, and a digitalization stage, which transforms this (continuous) information into a (discrete) digital image.

Roughly, the capture stage is the process of capturing light; it can be either passive (e.g., cameras) or active (e.g., scanner devices). The digitalization stage, in turn, comprises two processes: sampling and quantization. The first samples a continuous image to produce a discrete 𝑀 × 𝑁 matrix, where the number of samples introduces the concept of pixel resolution, obtained by multiplying the number of pixel columns by the number of pixel rows, which represents the level of detail in a digital image.

The second process is the quantization of the signal which consists of the discretization of the continuous values (sampled in the first process) for each pixel. The results of the two processes can be seen in Figure 2.2.

Figure 2.2 – (a) Continuous image; (b) Result of sampling and quantization processes [5]

When an image has only one channel (luminance), it is called a grayscale image; when quantized with 8 bits, it has 256 possible values representing the intermediate tones from black (0) to white (255). If the image has only two levels (black and white), it is called a binary image. Finally, if the image has three channels, each one representing a color channel (red, green, and blue), it is called a true color image. Usually, 256 possible values are used for each channel, resulting in about 16.7 million (256³) color combinations.
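The sampling and quantization steps of this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function `scene` is a hypothetical stand-in for a continuous image, and the 4 × 6 grid is arbitrary.

```python
import math

def scene(u, v):
    """Hypothetical continuous image: intensity in [0, 1] for u, v in [0, 1)."""
    return 0.5 + 0.5 * math.sin(2 * math.pi * u) * math.cos(2 * math.pi * v)

def sample_and_quantize(f, rows, cols, bits=8):
    """Sample f on a rows x cols grid and quantize each sample to 2**bits levels."""
    levels = 2 ** bits
    image = []
    for r in range(rows):
        row = []
        for c in range(cols):
            value = f(r / rows, c / cols)            # sampling
            q = int(value * levels)                  # quantization
            row.append(max(0, min(q, levels - 1)))   # clamp to the valid range
        image.append(row)
    return image

gray = sample_and_quantize(scene, rows=4, cols=6)
pixel_resolution = len(gray) * len(gray[0])   # rows times columns
true_color_combinations = 256 ** 3            # three 8-bit channels
```

With 8 bits per channel, `true_color_combinations` evaluates to 16,777,216, which is the "16 million" figure quoted for true color images.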

2.2 Segmentation

Image segmentation is defined as the partition of an image into significant units, regions, or objects that share similar features, such as brightness levels (luminance) or textures, in order to simplify the image or change its representation into one that is easier to analyze. In general terms, the solution consists of assigning a label to every pixel in the image, giving the same label to pixels that share visual characteristics such as color, intensity, or texture. The segmentation process involves complex logic, usually because there is not enough information about the objects to be extracted from the image. Segmentation algorithms are based on two criteria. The first, discontinuity, relies on abrupt changes such as edges; the second, similarity, relies on partitioning the image into regions that are similar according to a predefined characteristic, using techniques such as thresholding, region growing, or splitting and merging [6]. Some uses for image segmentation include:

• Medical image processing (cell counting)
• Topographic map analysis
• Biometrics (recognition of faces, irises, or fingerprints)
• Pedestrian detection
• Vehicle counting systems

2.2.1 Segmentation based on similarity

Segmentation based on similarity, complementary to segmentation based on discontinuities, labels regions directly by finding neighboring pixels with similar characteristics, not necessarily with the same pixel value.

2.2.1.1 Thresholding

Thresholding is the simplest method of image segmentation as it is based only on the pixel value to label.

To start the labeling process over the image, it is important to find a good threshold value that achieves a good separation between the regions by (2.1), where 𝑓(𝑥, 𝑦) is the pixel in row 𝑥 and column 𝑦, 𝑔(𝑥, 𝑦) is the label assigned to it, and 𝑇 is an arbitrary threshold.

$$ g(x, y) = \begin{cases} 1, & f(x, y) > T \\ 0, & f(x, y) \le T \end{cases} \tag{2.1} $$

There are many methods to find the optimum value 𝑇 in (2.1), but the two most used are histogram analysis and the Otsu method. The first takes as the optimum value the valley between the histogram peaks; the second tests all the possible values and takes as the optimum the one that meets certain statistical conditions.
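As an illustration of the exhaustive search performed by the Otsu method, the sketch below tries every gray level and keeps the one that maximizes the between-class variance. That criterion is the usual formulation of Otsu's method and is assumed here, since the text above does not spell out the statistical condition. The small test image and the function names are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level T that maximizes the between-class variance."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    total_sum = sum(i * h for i, h in enumerate(hist))

    best_t, best_score = 0, -1.0
    w0 = 0      # number of pixels with value <= t
    sum0 = 0    # sum of those pixel values
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so this t separates nothing
        mean0 = sum0 / w0
        mean1 = (total_sum - sum0) / w1
        score = w0 * w1 * (mean0 - mean1) ** 2  # proportional to between-class variance
        if score > best_score:
            best_score, best_t = score, t
    return best_t

def apply_threshold(image, t):
    """Label 1 where the pixel value is greater than t, 0 otherwise."""
    return [[1 if p > t else 0 for p in row] for row in image]

image = [[10, 12, 11, 200],
         [13, 10, 210, 205],
         [11, 198, 202, 207]]
t = otsu_threshold([p for row in image for p in row])
binary = apply_threshold(image, t)
```

On this toy image the dark pixels (10 to 13) and the bright pixels (198 to 210) form two well-separated histogram peaks, so any threshold in the valley between them yields the same labeling.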

2.2.1.2 Region growing

The region growing method is based on pixel similarity and connectivity rules. The basic principle is that a region is formed by connected pixels that share some characteristic and differ from the rest of the pixels in the image.

To segment an image, the region growing method takes as input a set of seeds, which can be chosen manually. Then, iteratively, after a first seed is chosen, its neighbors are analyzed to check whether their characteristics are similar to those of the original seed. The neighbors that meet the condition join the region, and the same process is applied to them, growing the region each time, until no similar neighbor is found.

As can be seen in Figure 2.3, the success of the process depends on the seed initially chosen and on the similarity criterion used to analyze the neighbors.


Figure 2.3 – (a) Original image; (b) Set of seed; (c) Region growing – Threshold: 225~250 ; (d) Region growing – Threshold: 190~250 [7]
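The iterative procedure described above can be sketched as a breadth-first search. The version below is a minimal illustration that assumes 4-connectivity and a gray-level interval as the similarity criterion, in the spirit of the threshold ranges of Figure 2.3; the toy image is hypothetical.

```python
from collections import deque

def region_growing(image, seed, low, high):
    """Grow a region from `seed` over 4-connected pixels with low <= value <= high."""
    rows, cols = len(image), len(image[0])
    region = [[0] * cols for _ in range(rows)]
    r0, c0 = seed
    if not (low <= image[r0][c0] <= high):
        return region  # the seed itself fails the criterion
    region[r0][c0] = 1
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and not region[nr][nc]
                    and low <= image[nr][nc] <= high):
                region[nr][nc] = 1        # neighbor joins the region
                queue.append((nr, nc))    # and is examined in turn
    return region

image = [[240, 235, 100, 245],
         [230, 200,  90, 250],
         [ 50, 195,  80, 240]]
narrow = region_growing(image, seed=(0, 0), low=225, high=250)
wide = region_growing(image, seed=(0, 0), low=190, high=250)
```

The right-hand column satisfies both intervals but is never reached, because it is not connected to the seed: this is the difference between region growing and plain thresholding. Widening the interval lets the region absorb the pixels with values 200 and 195.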

2.2.2 Segmentation based on discontinuities

There are three types of discontinuities in a digital image: points, lines, and edges. To understand them, it is first important to see how a convolution mask works over an image.

As an example, as shown in Figure 2.4, a 3 × 3 mask is used to calculate, by (2.2), the response (𝑅) of the mask; this procedure involves the sum of products of the mask coefficients 𝑤_𝑖 with the gray values 𝑧_𝑖 of the image region under the mask [6].

$$ R = \sum_{i=1}^{9} w_i z_i \tag{2.2} $$

Figure 2.4 – (a) Mask; (b) Image ROI


2.2.2.1 Point detection

To detect isolated points in an image, it is necessary to use a mask whose center weight differs from the weights of its neighbors. Then, applying a simple threshold (2.3) to the response values obtained with (2.2), it is possible to detect these points.

Figure 2.5 – Mask used to detect isolated points

$$ P = \begin{cases} 1, & |R| \ge T \\ 0, & \text{otherwise} \end{cases} \tag{2.3} $$
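The mask response and the threshold test (2.3) can be sketched as follows. The coefficients of the mask in Figure 2.5 are not reproduced in this text, so the commonly used point mask (8 at the center, -1 elsewhere) is assumed; the image and the threshold value are hypothetical.

```python
def mask_response(image, mask):
    """Sum of products of a 3x3 mask with every 3x3 neighborhood (border left at 0)."""
    rows, cols = len(image), len(image[0])
    response = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            response[r][c] = sum(
                mask[i][j] * image[r - 1 + i][c - 1 + j]
                for i in range(3) for j in range(3)
            )
    return response

def detect_points(image, mask, t):
    """Apply the threshold rule: 1 where |R| >= t, 0 otherwise."""
    return [[1 if abs(v) >= t else 0 for v in row]
            for row in mask_response(image, mask)]

# Assumed point-detection mask: center weight differs from its neighbors.
POINT_MASK = [[-1, -1, -1],
              [-1,  8, -1],
              [-1, -1, -1]]

image = [[10, 10, 10, 10, 10],
         [10, 10, 10, 10, 10],
         [10, 10, 90, 10, 10],
         [10, 10, 10, 10, 10],
         [10, 10, 10, 10, 10]]
response = mask_response(image, POINT_MASK)
points = detect_points(image, POINT_MASK, t=400)
```

Because the mask coefficients sum to zero, the response is 0 over constant regions, 640 at the isolated bright pixel, and only -80 at its neighbors, so a threshold of 400 keeps the isolated point alone.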

2.2.2.2 Line detection

As in the point detection case, a particular mask is used to detect lines in an image. In this case, the mask depends on the angle of the line. As can be seen in Figure 2.6, the mask coefficients depend on the angle sought; the image is then labeled with a threshold process (2.3).


Figure 2.6 – (a) Horizontal; (b) +45°; (c) Vertical; (d) -45°
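A small sketch can show how direction-specific masks respond to a line. The coefficients below are the commonly used 3 × 3 line masks and are an assumption, since the contents of Figure 2.6 are not reproduced in this text. The two diagonal masks are named by their orientation in the array rather than by ±45°, because that sign depends on the axis convention.

```python
# Assumed line-detection masks (weights of 2 along the preferred direction).
LINE_MASKS = {
    "horizontal":    [[-1, -1, -1], [ 2,  2,  2], [-1, -1, -1]],
    "vertical":      [[-1,  2, -1], [-1,  2, -1], [-1,  2, -1]],
    "diagonal_down": [[ 2, -1, -1], [-1,  2, -1], [-1, -1,  2]],
    "diagonal_up":   [[-1, -1,  2], [-1,  2, -1], [ 2, -1, -1]],
}

def response_at(image, mask, r, c):
    """Sum of products of a 3x3 mask with the neighborhood centered at (r, c)."""
    return sum(mask[i][j] * image[r - 1 + i][c - 1 + j]
               for i in range(3) for j in range(3))

def strongest_direction(image, r, c):
    """Return the mask name with the largest |R| at (r, c) and all responses."""
    responses = {name: abs(response_at(image, mask, r, c))
                 for name, mask in LINE_MASKS.items()}
    return max(responses, key=responses.get), responses

# One-pixel-wide horizontal line on a dark background.
image = [[0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0],
         [1, 1, 1, 1, 1],
         [0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0]]
direction, responses = strongest_direction(image, 2, 2)
```

At the center pixel only the horizontal mask produces a non-zero response (6), which is what allows a threshold on |R| to label lines of one orientation at a time.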

2.2.2.3 Edge detection

In general terms, an edge in an image is a transition between two regions of significantly different intensities or, according to Gonzalez in [5], a set of connected pixels that lie on the boundary between two regions. Edge detection is probably the most common way to detect discontinuities, and there is a large body of literature about it.

One of the best-known approaches to the edge detection problem is based on gradient operators. Here, edges are detected from spatial derivatives of the image, which may be calculated through convolution operations.

First of all, a derivative measures the local variation of 𝑓(𝑥, 𝑦) with respect to a variable. The gradient is the vector that points in the direction of maximum variation of 𝑓(𝑥, 𝑦), and its magnitude is proportional to this variation. The gradient, its magnitude, and its direction are defined by (2.4), (2.5) and (2.6).

$$ \nabla f(x, y) = \begin{bmatrix} \dfrac{\partial f(x, y)}{\partial x} \\[2mm] \dfrac{\partial f(x, y)}{\partial y} \end{bmatrix} \tag{2.4} $$

$$ \mathrm{Mag}[\nabla f(x, y)] = \sqrt{\left(\frac{\partial f(x, y)}{\partial x}\right)^2 + \left(\frac{\partial f(x, y)}{\partial y}\right)^2} \tag{2.5} $$

$$ \theta = \arctan\left(\frac{\partial f(x, y) / \partial y}{\partial f(x, y) / \partial x}\right) \tag{2.6} $$


For an image, the gradient operator is based on differences between the gray levels in the image. The row gradient ∂𝑓(𝑥, 𝑦)/∂𝑥, in (2.7), and the column gradient ∂𝑓(𝑥, 𝑦)/∂𝑦, in (2.8), can be represented by masks, as shown in Figure 2.7.

$$ \frac{\partial f(x, y)}{\partial x} \approx \nabla_x f(x, y) = f(x, y) - f(x - 1, y) \tag{2.7} $$

$$ \frac{\partial f(x, y)}{\partial y} \approx \nabla_y f(x, y) = f(x, y) - f(x, y - 1) \tag{2.8} $$

Figure 2.7 - (a) Row differential mask; (b) Column differential mask
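The difference approximations can be illustrated with a short sketch. It follows the convention used earlier in this chapter (𝑥 indexes rows, 𝑦 indexes columns) and leaves the first row and first column at zero, where the backward difference is undefined. Computing the direction with `atan2` of the column difference over the row difference is a common convention and an assumption here; the step-edge image is hypothetical.

```python
import math

def gradient(image):
    """Backward differences: gx along rows (x), gy along columns (y)."""
    rows, cols = len(image), len(image[0])
    gx = [[0] * cols for _ in range(rows)]
    gy = [[0] * cols for _ in range(rows)]
    for x in range(rows):
        for y in range(cols):
            if x > 0:
                gx[x][y] = image[x][y] - image[x - 1][y]
            if y > 0:
                gy[x][y] = image[x][y] - image[x][y - 1]
    return gx, gy

def magnitude_and_direction(gx, gy):
    """Gradient magnitude and direction (radians) for every pixel."""
    mag = [[math.hypot(a, b) for a, b in zip(row_x, row_y)]
           for row_x, row_y in zip(gx, gy)]
    theta = [[math.atan2(b, a) for a, b in zip(row_x, row_y)]
             for row_x, row_y in zip(gx, gy)]
    return mag, theta

# Vertical step edge between columns 1 and 2.
image = [[0, 0, 100, 100],
         [0, 0, 100, 100],
         [0, 0, 100, 100]]
gx, gy = gradient(image)
mag, theta = magnitude_and_direction(gx, gy)
```

Only the column difference is non-zero at the step, so the magnitude peaks along column 2 and the gradient points along the column axis, perpendicular to the edge.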

2.2.2.4 Watershed segmentation

Watershed segmentation, introduced by Beucher and Lantuéjoul in [8], is a powerful tool for image analysis that belongs to the second segmentation class (based on discontinuities). The term watershed refers to a ridge that divides areas drained by different river systems or lakes (Figure 2.8).


Figure 2.8 – Catchment basins [9]

A catchment basin is the geographical area draining into a river or reservoir [10]. A grayscale image can be interpreted in the same way when it is visualized in three dimensions: two spatial coordinates versus gray level. As can be seen in Figure 2.9b, under this topographic interpretation the image has bright (high) areas and dark (low) areas that form the catchment basins and the watershed lines.

Figure 2.9 – (a) Grayscale Image; (b) Topographic representation [10]

To continue with the introduction to this segmentation method, it is necessary to briefly define the minima and maxima of a function and the distance transform.

2.2.2.4.1 Minima, maxima of a function

Minima and maxima can be features of primary importance in an image. To illustrate their definitions, it is necessary to consider the image as a topographic surface. Consider two points 𝑠1 and 𝑠2 of the surface 𝑆 and a path between them, defined as any sequence {𝑠𝑖} where 𝑠𝑖 is adjacent to 𝑠𝑖+1, with each 𝑠𝑖 belonging to ℤ² × ℤ as (𝑥𝑖, 𝑓(𝑥𝑖)). A path is non-ascending if (2.9) is satisfied.

$$ \forall\, s_i, s_j: \quad i \ge j \Rightarrow f(x_i) \le f(x_j) \tag{2.9} $$

Thus, a point 𝑠 belongs to a minimum if no non-ascending path starting from it reaches a lower point. A similar idea is used to define maxima. These concepts are illustrated in Figure 2.10.

Figure 2.10 – (a) Minima of a function; (b) Maxima of a function [11]

2.2.2.4.2 Distance transform

Distance transform is an operator applied to binary images that is useful in many fields of computer vision. As can be seen in (2.10), a distance transform 𝑇 applied to an object 𝑂 calculates a scalar field that represents the minimum distance between each pixel 𝑝𝑖 of the object and the background pixels 𝑝.

$$ T(p_i) = \min_{p \notin O} \operatorname{dist}(p, p_i), \quad p_i \in O \tag{2.10} $$

Graphically, the result (Figure 2.11b) is a grayscale image of the same size as the original image (Figure 2.11a), where the gray-level values show the minimum distance from each pixel to the closest background pixel.

Figure 2.11 – (a) Original image; (b) Distance transform [11]
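A brute-force sketch makes the definition of the distance transform concrete. It assumes the Euclidean distance (the text does not fix a metric) and is meant only for small images, since it compares every object pixel with every background pixel; practical implementations use much faster algorithms.

```python
import math

def distance_transform(binary):
    """Euclidean distance from each object pixel (1) to the nearest background pixel (0)."""
    rows, cols = len(binary), len(binary[0])
    background = [(r, c) for r in range(rows) for c in range(cols)
                  if binary[r][c] == 0]
    if not background:
        raise ValueError("the image has no background pixels")
    result = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if binary[r][c]:
                result[r][c] = min(math.hypot(r - br, c - bc)
                                   for br, bc in background)
    return result

# 3x3 square object centered in a 5x5 image.
binary = [[0, 0, 0, 0, 0],
          [0, 1, 1, 1, 0],
          [0, 1, 1, 1, 0],
          [0, 1, 1, 1, 0],
          [0, 0, 0, 0, 0]]
dist = distance_transform(binary)
```

The result is 0 on the background, 1 on the border pixels of the square, and 2 at its center, which is the "peak" that the topographic interpretation relies on.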

Now, considering the image again as a topographic surface, suppose that all the minima on the surface are pierced and that the surface is flooded through them at a constant vertical water speed, forming basins. During this flooding process, two or more basins coming from different minima may merge. To avoid this merging, as can be seen in Figure 2.12, dams are constructed on the surface, so that each basin contains only one minimum.

Figure 2.12 – Dams construction process [11]

In Figure 2.13, an example of the application of the watershed transform in a segmentation process is shown. As stated in [11], this tool is really useful if it is used in the right way. In [12], Eddins gives a good piece of advice: “Change the image into another image whose catchment basins are object you want to identify”. That is why the transform is applied to a distance transform (Figure 2.13b) instead of the original image (Figure 2.13a).

Figure 2.13 – (a) Original image; (b) Distance transform; (c) Watershed transform; (d) Segmentation

Unfortunately, the direct application of the watershed algorithm leads to over-segmentation. This happens because the transform yields one catchment basin for each local minimum in the image and, in many cases, image noise generates spurious local minima, so the image needs to be filtered. To solve this problem, as can be seen in Figure 2.14, the approach of Beucher in [11] is applied: markers are used to identify good minima, and the flooding process starts only from them. To find these markers, the image must be analyzed by intensity, size, shape, or texture. In this case, as Eddins suggests in [12], considering the minimum connected component size is a good alternative.

Figure 2.14 – (a) Original image; (b) Final segmentation
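The marker-driven flooding idea can be sketched with a priority queue. This is a simplified illustration, not the algorithm of [11]: it assumes 4-connectivity, floods pixels in order of increasing height starting only from the markers, and does not draw explicit dam pixels, so the boundary between two labels plays the role of the watershed line. The surface and markers are hypothetical.

```python
import heapq

def marker_watershed(surface, markers):
    """Flood `surface` from labeled markers; each pixel takes the first label to reach it."""
    rows, cols = len(surface), len(surface[0])
    labels = [row[:] for row in markers]
    heap = []
    order = 0  # tie-breaker: equal heights are processed in insertion order
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                heapq.heappush(heap, (surface[r][c], order, r, c))
                order += 1
    while heap:
        level, _, r, c = heapq.heappop(heap)
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and labels[nr][nc] == 0:
                labels[nr][nc] = labels[r][c]
                # The water level never drops below the level it arrived from.
                heapq.heappush(heap, (max(level, surface[nr][nc]), order, nr, nc))
                order += 1
    return labels

# One-row "landscape": two basins separated by a ridge of height 5.
surface = [[0, 1, 2, 5, 3, 1, 0]]
markers = [[1, 0, 0, 0, 0, 0, 2]]   # one marker per basin
segmented = marker_watershed(surface, markers)
```

Because flooding starts only at the two markers, exactly two regions are produced regardless of how many small local minima the surface contains, which is how markers avoid over-segmentation.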

2.3 Object Recognition

Humans, by nature, can recognize many types of objects with little effort, even when the objects vary slightly in appearance, size, or scale, and even when they are translated or rotated. In computer vision, the field of object recognition tries to give this capability to machines; its main motivation is the strong support it provides to other sub-fields, such as:

• Face detection
• Optical character recognition
• Scene recognition
• Object counting

An important tool in many tasks is the ability to detect whether a circle is present in the scene, so that further processing can be applied to it. For this reason, this work presents a comparative study of circle detection methods.

2.3.1 Circle detection

As seen above, circle detection algorithms are important in many applications [13]–[17]. This explains why many researchers address the issue, seeking to improve detection rates, optimize algorithms, or find new uses for known techniques. For example, in [18], Žunić proposed the invariant Hu moments as a circularity measure. Regarding the Hough transform, in [19], Yuen compared different Hough transform implementations, aiming to optimize the amount of memory needed to achieve detection. Other authors, such as Wu [20], achieved good results in circle detection using the same method. Finally, Zhu [21] and Sintorn [22] implicitly use template matching for circle detection in different applications, namely eye position calculation and human cytomegalovirus capsid classification, respectively. Some examples can be seen in Figure 2.15.

Figure 2.15 – (a) Red blood cell; (b) Eye detections; (c) Steel bars counting

2.3.1.1 Invariant Hu moments

The Hu moments, proposed by M. K. Hu [23], are a set of seven 2-D invariant moments that, as can be seen in Figure 2.16, are invariant to translation, scale change, mirroring, and rotation [6].

Figure 2.16 – Hu moments [4]

The 2-D moments of order (𝑝 + 𝑞) of a grayscale image 𝑓(𝑥, 𝑦) of size 𝑀 × 𝑁 are defined by (2.11):

$$ m_{pq} = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} x^p y^q f(x, y) \tag{2.11} $$

where 𝑝, 𝑞 ∈ ℕ. The central moment 𝜇_𝑝𝑞 is defined by (2.12):

$$ \mu_{pq} = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} (x - \bar{x})^p (y - \bar{y})^q f(x, y) \tag{2.12} $$

for 𝑝, 𝑞 = 0, 1, 2, …, where the gray centroid of the region of interest (ROI) is defined in (2.13) and (2.14):

$$ \bar{x} = \frac{m_{10}}{m_{00}} \tag{2.13} $$

$$ \bar{y} = \frac{m_{01}}{m_{00}} \tag{2.14} $$

The central moments 𝜇_𝑝𝑞 are normalized to obtain the moments 𝜂_𝑝𝑞 of order (𝑝 + 𝑞), given in (2.15):

$$ \eta_{pq} = \frac{\mu_{pq}}{\mu_{00}^{\gamma}} \tag{2.15} $$

where 𝛾 = (𝑝 + 𝑞)/2 + 1, for 𝑝 + 𝑞 = 2, 3, …
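The chain from raw moments to normalized central moments can be sketched directly from these definitions. As an illustration, the sketch also evaluates the first two Hu invariants, 𝜙1 = 𝜂20 + 𝜂02 and 𝜙2 = (𝜂20 − 𝜂02)² + 4𝜂11², and checks that they do not change when a shape is translated. The function names and the test images are hypothetical.

```python
def raw_moment(image, p, q):
    """m_pq: sum of x^p * y^q * f(x, y), with x indexing rows and y columns."""
    return sum((x ** p) * (y ** q) * image[x][y]
               for x in range(len(image)) for y in range(len(image[0])))

def central_moment(image, p, q):
    """mu_pq: moment taken about the gray centroid (x_bar, y_bar)."""
    m00 = raw_moment(image, 0, 0)
    x_bar = raw_moment(image, 1, 0) / m00
    y_bar = raw_moment(image, 0, 1) / m00
    return sum(((x - x_bar) ** p) * ((y - y_bar) ** q) * image[x][y]
               for x in range(len(image)) for y in range(len(image[0])))

def normalized_moment(image, p, q):
    """eta_pq = mu_pq / mu_00 ** gamma, with gamma = (p + q) / 2 + 1."""
    gamma = (p + q) / 2 + 1
    return central_moment(image, p, q) / central_moment(image, 0, 0) ** gamma

def first_two_hu(image):
    """First two Hu invariants (phi_1, phi_2)."""
    n20 = normalized_moment(image, 2, 0)
    n02 = normalized_moment(image, 0, 2)
    n11 = normalized_moment(image, 1, 1)
    return n20 + n02, (n20 - n02) ** 2 + 4 * n11 ** 2

square = [[1, 1, 0, 0],
          [1, 1, 0, 0],
          [0, 0, 0, 0],
          [0, 0, 0, 0]]
shifted = [[0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0],
           [0, 0, 0, 0, 0],
           [0, 0, 0, 1, 1],
           [0, 0, 0, 1, 1]]
phi_square = first_two_hu(square)
phi_shifted = first_two_hu(shifted)
```

Both images give 𝜙1 = 0.125 and 𝜙2 = 0: subtracting the centroid removes the effect of position, and the symmetry of the square makes 𝜂20 equal to 𝜂02. On a pixel grid, invariance to scale and rotation holds only approximately because of discretization.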

The seven invariant moments presented by M. K. Hu [23] are listed in Table 2.1.

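Table 2.1 – Invariant Hu Moments

| Moment order | Equation | Related to |
|---|---|---|
| 1 | $\phi_1 = \eta_{20} + \eta_{02}$ | Distribution in horizontal and vertical axis |
| 2 | $\phi_2 = (\eta_{20} - \eta_{02})^2 + 4\eta_{11}^2$ | Similarity in horizontal variance |
| 3 | $\phi_3 = (\eta_{30} - 3\eta_{12})^2 + (3\eta_{21} - \eta_{03})^2$ | Tilt up or down |
| 4 | $\phi_4 = (\eta_{30} + \eta_{12})^2 + (\eta_{21} + \eta_{03})^2$ | Tilt left or right |
| 5 | $\phi_5 = (\eta_{30} - 3\eta_{12})(\eta_{30} + \eta_{12})[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2] + (3\eta_{21} - \eta_{03})(\eta_{21} + \eta_{03})[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2]$ | Size, rotation and translation |
| 6 | $\phi_6 = (\eta_{20} - \eta_{02})[(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2] + 4\eta_{11}(\eta_{30} + \eta_{12})(\eta_{21} + \eta_{03})$ | |
| 7 | $\phi_7 = (3\eta_{21} - \eta_{03})(\eta_{30} + \eta_{12})[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2] + (3\eta_{12} - \eta_{30})(\eta_{21} + \eta_{03})[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2]$ | Image skew |

2.3.1.2 Hough transform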

The Hough Transform is one of the most widely used procedures in computer vision, with over 87,000 citations in Google Scholar. It is commonly used to detect straight lines and curves in images, and requires an accumulator array whose dimension corresponds to the number of unknown parameters in the equation of the family of curves being sought [24].

In general terms, this procedure detects a shape by mapping it from image space to parameter space using accumulator cells, so that the difficult global problem in image space is transformed into a simple local peak detection problem in parameter space [3].

An easy way to understand this algorithm is to start with the simplest example, a line detector. The goal is to detect a set of points that lie on a straight line. A line 𝑦 = 𝑎𝑥 + 𝑏 that passes through a point (𝑥_𝑖, 𝑦_𝑖) satisfies (2.16):

$$b = -x_i a + y_i \tag{2.16}$$

Infinitely many lines pass through (𝑥_𝑖, 𝑦_𝑖), one for each pair (𝑎, 𝑏) that satisfies (2.16). In the 𝑎𝑏-plane, the fixed pair (𝑥_𝑖, 𝑦_𝑖) therefore defines a single line. A different point (𝑥_𝑗, 𝑦_𝑗) defines another line in the 𝑎𝑏-plane, and the two intersect at the pair (𝑎, 𝑏) shared by both points. That pair defines the line in image space that contains (𝑥_𝑖, 𝑦_𝑖), (𝑥_𝑗, 𝑦_𝑗), and every other point whose parameter-space line passes through the same intersection.

These candidate values (𝑎, 𝑏) are collected in a voting map, which is a discrete representation of the continuous multidimensional space that spans all the possible parameter values [19]. In general terms, the peaks in this map represent the lines found.
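The voting idea for lines can be sketched in a few lines of Python. This is a toy illustration, assuming a deliberately coarse grid of integer slopes and a hypothetical helper name `hough_lines_ab`. Practical implementations usually prefer the (ρ, θ) parameterization, because the slope-intercept form cannot represent vertical lines.

```python
from collections import Counter


def hough_lines_ab(points, slopes):
    """Vote in the discrete (a, b) plane using b = -x*a + y, as in (2.16)."""
    votes = Counter()
    for x, y in points:
        for a in slopes:
            b = -x * a + y          # intercept of the line of slope a through (x, y)
            votes[(a, b)] += 1
    return votes


# Six points on y = 2x + 1 plus two outliers (synthetic data).
points = [(x, 2 * x + 1) for x in range(6)] + [(3, 0), (5, 2)]
slopes = range(-3, 4)               # assumed grid: integer slopes only
votes = hough_lines_ab(points, slopes)
(a, b), count = votes.most_common(1)[0]
```

The cell with the most votes is (𝑎, 𝑏) = (2, 1), with one vote from each of the six collinear points. The outliers spread their votes over other cells.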


For circle detection, the Hough Transform uses the constraint equation given in (2.17):

$$(x - a)^2 + (y - b)^2 = r^2 \tag{2.17}$$

where (𝑎, 𝑏) are the circle center coordinates and 𝑟 is the radius.

In the parameter space, the coordinate pair (𝑥, 𝑦) is known, but the center coordinates and the radius are unknown. Thus, for each feature point (𝑥, 𝑦), votes are accumulated in the parameter space 𝑎 − 𝑏 − 𝑟 for all combinations that satisfy the constraint.

For circles parameterized by (2.17), any point (𝑥, 𝑦) could lie on any circle whose parameters lie on the surface of a right circular cone in the 𝑎 − 𝑏 − 𝑟 space, as can be seen in Figure 2.17c. Conversely, the cones generated by the points of one circle intersect at a single point of the parameter space 𝑎 − 𝑏 − 𝑟, which corresponds to exactly one circle in image space. Each level of the parameter space corresponds to a given radius, as can be seen in Figure 2.17b.


At the end of the accumulation process, some voting map cells may contain many more votes than the others (peaks), indicating strong evidence for the presence of the shape with the corresponding parameters. The next goal is then to count how many peaks are in the accumulator. Different approaches can be used for this, but a threshold is commonly applied to detect them [19].
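The accumulation and thresholding steps can be sketched as follows. This is a minimal illustration rather than the implementation evaluated later: it assumes synthetic edge points, an angular step of 5 degrees, and hypothetical helper names (`circle_points`, `hough_circles`, `peaks`).

```python
import math
from collections import Counter

ANGLES = [2 * math.pi * k / 72 for k in range(72)]   # 5-degree steps (assumed)


def circle_points(a, b, r):
    """Synthetic edge pixels of a circle with centre (a, b) and radius r."""
    return {(round(a + r * math.cos(t)), round(b + r * math.sin(t)))
            for t in ANGLES}


def hough_circles(edge_points, radii):
    """Vote in the (a, b, r) space: per (2.17), every centre that lies at
    distance r from an edge point receives one vote from that point."""
    acc = Counter()
    for x, y in edge_points:
        for r in radii:
            cells = {(round(x - r * math.cos(t)), round(y - r * math.sin(t)), r)
                     for t in ANGLES}
            acc.update(cells)            # at most one vote per cell and point
    return acc


def peaks(acc, threshold):
    """Cells of the voting map whose vote count reaches the threshold."""
    return {cell for cell, votes in acc.items() if votes >= threshold}


first = circle_points(30, 40, 10)
second = circle_points(60, 25, 12)
acc = hough_circles(first | second, radii=[10, 11, 12])
found = peaks(acc, threshold=0.9 * min(len(first), len(second)))
```

Each true cell, (30, 40, 10) and (60, 25, 12), collects one vote from every edge point of its circle, so both pass the threshold. A practical detector would typically add non-maximum suppression, since neighboring cells can also collect a sizeable share of votes.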

2.3.1.3 Template matching

A computer vision application often needs to know whether an image contains certain objects, or to locate a subimage, called a template, inside a scene [25]. The template matching paradigm is well suited to this task.

This procedure consists of sliding the template over every possible position in the image and evaluating the match between the template and the image at the current position. As Vernon notes in [2], two paradigms can be distinguished by the nature of the template: global template matching and local template matching. In the first, the template represents the entire object; in the second, several templates of local features are used to represent the object.

There are several ways to measure the similarity between the image and the template. Some are based on the summation of differences, and others on cross-correlation techniques such as the Pearson coefficient [26], given in (2.18):

$$\rho = \frac{\sum_{m=0}^{M-1} \sum_{n=0}^{N-1} (A_{mn} - \bar{A})(B_{mn} - \bar{B})}{\sqrt{\left(\sum_{m=0}^{M-1} \sum_{n=0}^{N-1} (A_{mn} - \bar{A})^2\right) \left(\sum_{m=0}^{M-1} \sum_{n=0}^{N-1} (B_{mn} - \bar{B})^2\right)}} \tag{2.18}$$

where 𝐴 is the image, 𝐵 is the template, and 𝐴̅ and 𝐵̅ are the means of 𝐴 and 𝐵, respectively.

As can be seen in Figure 2.18, after calculating the set of coefficients, it can be assumed that if the similarity measure is large enough, the object exists at the peak position.
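The sliding process and the Pearson coefficient of (2.18) can be sketched in plain Python. The example is only illustrative: it assumes a small synthetic binary image, a 9×9 disk template, and hypothetical helper names (`pearson`, `correlation_map`, `disk`). Flat windows, whose variance is zero, are assigned a coefficient of 0 by convention in this sketch.

```python
import math


def pearson(a, b):
    """Correlation coefficient (2.18) between two equally sized 2-D lists."""
    fa = [v for row in a for v in row]
    fb = [v for row in b for v in row]
    mean_a, mean_b = sum(fa) / len(fa), sum(fb) / len(fb)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(fa, fb))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in fa)
                    * sum((y - mean_b) ** 2 for y in fb))
    return num / den if den else 0.0     # flat window: coefficient set to 0


def correlation_map(image, template):
    """Slide the template over every position where it fits entirely."""
    th, tw = len(template), len(template[0])
    rows, cols = len(image) - th + 1, len(image[0]) - tw + 1
    return [[pearson([r[j:j + tw] for r in image[i:i + th]], template)
             for j in range(cols)] for i in range(rows)]


def disk(size, r):
    """Binary disk of radius r centred in a size x size window."""
    c = (size - 1) / 2
    return [[1 if (i - c) ** 2 + (j - c) ** 2 <= r * r else 0
             for j in range(size)] for i in range(size)]


template = disk(9, 3.5)
image = [[0] * 30 for _ in range(20)]
for i in range(9):
    for j in range(9):
        image[5 + i][12 + j] = template[i][j]   # paste one circle at (5, 12)

cmap = correlation_map(image, template)
peak_value, peak_row, peak_col = max(
    (v, i, j) for i, row in enumerate(cmap) for j, v in enumerate(row))
```

The maximum of the correlation map lies at the top-left corner of the window where the circle was pasted, with a coefficient of 1 up to rounding.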


Figure 2.18 – (a) Template; (b) Search area; (c) Sliding process; (d) Correlation values

As seen in this chapter, object recognition is an important field in computer vision. Three methods were reviewed (HM, HT, and TM), together with the characteristics of each. Some are algorithmically more complex, such as the method based on the Hough Transform, and others are simpler, such as the method based on Hu moments. In the next chapter, these methods are tested under the same conditions (computer and database) to find the most appropriate one in terms of effectiveness and performance.


3 EVALUATION

3.1 Experiments

As seen in the previous chapter, each method has a particular way of detecting circles in an image. To evaluate the performance of each, a proprietary database from a previous work is used [4]. To ensure the same test conditions for all three methods (HM, HT, and TM), the images have already been pre-processed and converted to black and white.

The image database 𝐼 consists of 340 binary images of 1024×768 pixels, which are the first input for each of the algorithms. The other input is an average radius vector 𝑅, where each element is the average radius of the circles in the corresponding image in 𝐼, as defined in (3.1) and (3.2). Note that each element of the radius vector is only the mean of the radii of all the circles in the corresponding image.

$$I = \{I_1, I_2, I_3, \ldots, I_N\} \tag{3.1}$$

$$R = \{\bar{r}_{I_1}, \bar{r}_{I_2}, \bar{r}_{I_3}, \ldots, \bar{r}_{I_N}\} \tag{3.2}$$

where N = 340.

To evaluate each method impartially, the 340 images were processed on the same personal computer, with the specifications shown in Table 3.1.

Table 3.1 – Test Computer Hardware Specifications

| Component | Specification |
|---|---|
| Processor | Intel Core i3 2100 3.1 GHz |
| Memory | 2x4GB Corsair Vengeance 1600 MHz |
| OS | Windows 7 Pro X64 SP1 |
| Software | MATLAB 2014a |
| Toolboxes | Matlab Image Processing Toolbox; Matlab Parallel Computing Toolbox; Matlab Report Generator; Matlab Symbolic Math Toolbox |

3.2 Hu invariant moments

As mentioned in the previous chapter, the Hu invariant moments are a set of values that can describe any object. Although the moments can be calculated for any object or for a complete image, such as the original image in Figure 3.1a, in this case it is convenient to calculate them for each neighborhood in the image (Figure 3.1b), as shown in Figure 3.1c. The next step is to calculate the first Hu moment for each neighborhood. A preliminary study showed that, with slightly irregular circles, only the first moment is required to achieve circle detection, which makes the calculation of the other six moments unnecessary.

To calculate the first Hu moment, the first equation in Table 2.1 is applied to each neighborhood (Figure 3.1d). From the initial tests, it was determined that Φ₁ = 0.16125 ± 2.5% is a good threshold to decide whether the neighborhood is a circle or not.
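As a plausibility check on this threshold (an added derivation for an idealized case, not part of the original tests), consider a continuous disk of radius 𝑟 with 𝑓(𝑥, 𝑦) = 1 inside and 0 outside. Using (2.12) and (2.15) with 𝛾 = 2:

$$\mu_{00} = \pi r^2, \qquad \mu_{20} = \mu_{02} = \frac{\pi r^4}{4}$$

$$\eta_{20} = \eta_{02} = \frac{\pi r^4 / 4}{(\pi r^2)^2} = \frac{1}{4\pi}$$

$$\phi_1 = \eta_{20} + \eta_{02} = \frac{1}{2\pi} \approx 0.1592$$

The value does not depend on 𝑟 and lies inside the band 0.16125 ± 2.5%, that is, approximately [0.1572, 0.1653]. For comparison, a filled square gives $\phi_1 = 1/6 \approx 0.1667$, which falls just outside the band.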

Figure 3.1 – (a) Binary image; (b) Connected component labeling [4]; (c) Neighborhood selection; (d) Neighborhood segmentation

The final step is to filter the detected circles by an area threshold because, in some cases, as shown in Figure 3.2, a noise region can have an acceptable Hu moment, leading to false positives.

Figure 3.2 – (a) True positive case; (b) False positive case. HM denotes the value of the first calculated Hu invariant moment.

To filter the noise, an area comparison between the neighborhood and the mean area calculated from the given radius (3.3) was sufficient. For this, it is necessary to define an area tolerance range of 120 white pixels.

$$A = \pi R_i^2 \tag{3.3}$$
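The two-stage decision described in this section (first-moment threshold, then area filter) can be sketched as follows. The sketch assumes that a neighborhood is given as a list of white pixel coordinates and that the 120-pixel tolerance is applied symmetrically around the area of (3.3). The function names `phi1`, `is_circle`, `disk`, and `square` are hypothetical.

```python
import math

PHI1_REF, PHI1_TOL = 0.16125, 0.025   # threshold from the text: 0.16125 +/- 2.5 %
AREA_TOL = 120                         # area tolerance in white pixels


def phi1(pixels):
    """First Hu moment of a binary region given as a list of (x, y) pixels."""
    n = len(pixels)                    # mu_00 of a binary region is its area
    x_bar = sum(x for x, _ in pixels) / n
    y_bar = sum(y for _, y in pixels) / n
    mu20 = sum((x - x_bar) ** 2 for x, _ in pixels)
    mu02 = sum((y - y_bar) ** 2 for _, y in pixels)
    return (mu20 + mu02) / n ** 2      # eta_20 + eta_02, with gamma = 2


def is_circle(pixels, radius):
    """Shape test on phi_1 followed by the area filter of (3.3)."""
    shape_ok = abs(phi1(pixels) - PHI1_REF) <= PHI1_TOL * PHI1_REF
    area_ok = abs(len(pixels) - math.pi * radius ** 2) <= AREA_TOL
    return shape_ok and area_ok


def disk(cx, cy, r):
    """Synthetic filled circle of integer radius r centred at (cx, cy)."""
    return [(x, y) for x in range(cx - r, cx + r + 1)
            for y in range(cy - r, cy + r + 1)
            if (x - cx) ** 2 + (y - cy) ** 2 <= r * r]


def square(x0, y0, side):
    """Synthetic filled square with top-left corner (x0, y0)."""
    return [(x, y) for x in range(x0, x0 + side) for y in range(y0, y0 + side)]


full_circle = is_circle(disk(100, 80, 20), radius=20)   # expected circle
small_blob = is_circle(disk(10, 10, 5), radius=20)      # round but too small
box = is_circle(square(0, 0, 40), radius=20)            # wrong shape
```

With an input radius of 20 pixels, the full disk is accepted. The small round blob has an acceptable first moment but is rejected by the area filter, which mirrors the false positive case of Figure 3.2. The square is rejected by the moment test.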

3.3 Hough transform

The Hough Transform method is slightly more complex than the Hu moments method; it has more steps and therefore needs more development time. One of its advantages is that processing occurs only over the edges and not over all the pixels in the image, which suggests that it may require less time than other methods. First, the Canny edge detection method [27] was used to find the edges of the image (Figure 3.3a). From then on, all processing, based on the implementation in [28], is done on the edge-detected image (Figure 3.3b).


Figure 3.3 – (a) Sample image; (b) Edge-detected image

Before searching for possible circles, a starting radius value is needed. In the initial tests, inspection of the samples showed that more than a single radius is required because of the imperfections of the circles. To solve this, a set of radii within ±2 pixels of the input radius of the image was used. For example, if the input radius is 20 pixels, the candidate radii are {18, 19, 20, 21, 22}. Figure 3.4 illustrates an example of the HT algorithm applied to a synthetic image with two circles. The final step is to search for the maximum values in the generated voting map to locate the centers of the detected circles.


3.4 Template Matching

The template matching algorithm is already used in many computer vision applications. In this work, as mentioned in the previous chapter, a template must be generated to calculate the similarity between it and the search window in the sample. For this, as can be seen in Figure 3.5, the inputs are the radius value 𝑅_𝑖 and an original kernel that is dilated by a morphological operation.

The implementation of this algorithm requires previous knowledge of the average radius 𝑅_𝑖 in order to create a template that maximizes the correlation coefficient. The procedure dilates the seed kernel, using a morphological operation, until it becomes a synthetic template of radius 𝑟 = 𝑅_𝑖.


Figure 3.5 – (a) Original kernel of 9×9 pixels; (b) Synthetic template of 29×29 pixels

With the template generated, the next step is to create a correlation map between the template and the image sample. For that, equation (2.18) was used with a small difference in the application criteria.

The idea is to reduce noise in the correlation map by addressing the case where the sliding window overlaps two circles (Figure 3.6a). In that case, with the normal correlation operation, the resulting correlation map (Figure 3.6b) shows a smoothing effect between the peaks and the valleys, making it difficult to detect the circles.

The proposed workaround considers the biggest neighborhood contribution to the window prior to the correlation calculation. For instance, if the sliding window selects an area as in Figure 3.6a, it detects the possible neighborhoods and weighs their contribution to the image (area). The subimage is then masked, keeping only the biggest contributing neighborhood, as shown in Figure 3.6c. The correlation map resulting from this preprocessing is much cleaner and more effective for circle detection, as seen in Figure 3.6d.
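The masking step can be sketched as a connected component search inside the window that keeps only the component with the largest area. The sketch assumes 4-connectivity and a binary window stored as a nested list; the function name `largest_component_mask` is hypothetical.

```python
from collections import deque


def largest_component_mask(window):
    """Copy of a binary window that keeps only its largest
    4-connected white region (assumed connectivity)."""
    rows, cols = len(window), len(window[0])
    seen = [[False] * cols for _ in range(rows)]
    best = []
    for i in range(rows):
        for j in range(cols):
            if window[i][j] and not seen[i][j]:
                # Breadth-first search over one white region.
                component, queue = [], deque([(i, j)])
                seen[i][j] = True
                while queue:
                    a, b = queue.popleft()
                    component.append((a, b))
                    for na, nb in ((a + 1, b), (a - 1, b), (a, b + 1), (a, b - 1)):
                        if (0 <= na < rows and 0 <= nb < cols
                                and window[na][nb] and not seen[na][nb]):
                            seen[na][nb] = True
                            queue.append((na, nb))
                if len(component) > len(best):   # contribution measured by area
                    best = component
    masked = [[0] * cols for _ in range(rows)]
    for a, b in best:
        masked[a][b] = 1
    return masked


window = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
]
masked = largest_component_mask(window)
```

In this toy window, the two-pixel fragment on the right, which stands for part of a neighboring circle, is removed before the correlation is computed.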

With the correlation map shown in Figure 3.6d ready, the next step is peak detection to find all the possible circles, as in the other methods at this stage.

Figure 3.6 – (a) Original sliding window; (b) Correlation map without application criteria; (c) Sliding window considering the biggest contributing neighborhood; (d) Correlation map with application criteria.

3.5 Results of experiments

Figures 3.7 to 3.9 show the probability density functions (PDF) of the counting errors for each of the three methods. The first PDF (Figure 3.7), corresponding to the Hu Moments, shows a wide distribution with mean μ = 12.1324¹ and variance σ² = 140.9057. This is critical, as it states that this method will present an average error of 12 circles. Considering that the average number of circles per image sample is 90, this represents a 13.33% average error.

¹ The present work uses the decimal point instead of the decimal comma, for consistency with American English.

Figure 3.7 – Error count histogram of Hu moments method

Figure 3.8 and Figure 3.9 show the PDFs for the Hough Transform and Template Matching methods, with μ = 0.84118, σ² = 3.2903 and μ = 0.082353, σ² = 0.099393, respectively. These represent a much more reliable detection rate for both methods; Template Matching appears to be the more reliable, with a mean close to zero and very low variance.

Figure 3.8 – Error count histogram of Hough transform method

Table 3.2 shows the successful detection rate for each of the methods, calculated using (3.4):

$$DR = \left(1 - \frac{1}{N} \sum_{i=1}^{N} \frac{|x_i - \hat{x}_i|}{x_i}\right) \times 100\% \tag{3.4}$$

where DR is the detection rate, 𝑥_𝑖 is the correct count (control), 𝑥̂_𝑖 is the count output by the corresponding method, and N = 340.
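The detection rate can be computed with a few lines of Python. The sketch assumes that the detection rate is one minus the mean relative counting error, which is the reading consistent with the values near 100% reported in Table 3.2. The function name `detection_rate` and the sample counts are hypothetical.

```python
def detection_rate(true_counts, detected_counts):
    """Detection rate in percent: 100 * (1 - mean relative counting error)."""
    n = len(true_counts)
    mean_relative_error = sum(
        abs(x - x_hat) / x for x, x_hat in zip(true_counts, detected_counts)) / n
    return (1 - mean_relative_error) * 100


# Two hypothetical images: one counted exactly, one with 10 of 100 circles missed.
rate = detection_rate([90, 100], [90, 90])
```

For these two hypothetical images the relative errors are 0 and 0.1, so the detection rate is 95%.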

Table 3.2 – Detection rates

| Method | Detection Rate |
|---|---|
| Hu Invariant Moments | 86.611% |
| Hough Transform | 99.886% |
| Template Matching | 99.981% |

As can be observed in Table 3.2, the most reliable method was Template Matching, followed closely by the Hough Transform method.

Computational performance was another important aspect to test. Figure 3.10 shows the cumulative processing time over the 340 sample images for each of the three methods. The vertical axis (time) uses a logarithmic scale because the Template Matching processing time showed an exponential growth rate, while the Hu Invariant Moments and Hough Transform methods showed more linear growth. The total times required to process the 340 samples are shown in Table 3.3.

Table 3.3 – Total time required to process the 340 image samples

| Method | Time (s) |
|---|---|
| Hu Invariant Moments | 174 |
| Hough Transform | 695 |
| Template Matching | 6341 |

Figure 3.10 – Performance Comparison

3.6 Conclusions

After testing the behavior of the algorithms on the images of the database, a set of conclusions can be drawn. According to the results, in terms of computation time the best of the three methods tested was the one based on Hu Moments, which was 36 times faster than the method based on Template Matching and 4 times faster than the method based on the Hough Transform.

In terms of reliability, the best of the three methods was the one based on Template Matching, which was 15% more accurate than the method based on Hu Moments and 0.1% more accurate than the method based on the Hough Transform.

With this in mind, a feature comparison is shown in Table 3.4. A qualitative evaluation of the performance of the methods was considered necessary, in addition to the quantitative measures already presented, to give a more holistic perspective. Tolerance to circle overlapping is an important feature because several real-life applications do not necessarily present well-segmented images.

For the same reason, as can be seen in Table 3.4, the ability of a method to detect circles with different radii given an approximate input radius makes it versatile for certain applications (such as counting circular elements of different or random sizes).

Table 3.4 – Feature Comparison

| Method | Tolerance to circle overlapping | Tolerance to different radii | Detection Rate | Processing Time | Possible Scenarios |
|---|---|---|---|---|---|
| Hu Invariant Moments | Bad | Good | High | Very Fast | Blister Packages Inspection; Rotifers Classification [29]; Road Sign Recognition [17] |
| Hough Transform | Good | Good | Very High | Fast | Red Blood Cell Counting [13], [14], [30]; Mechanical Circular Parts Detection [31] |
| Template Matching | Good | Medium | Very High | Slow | Steel Bar Packages |

4 COMBINED APPROACH PROPOSAL BASED ON THE TESTED METHODS

As seen in the previous chapter, there are many ways to detect circles in an image. Some methods work on the overall pixel distribution, while others work on the edges detected by an operator such as Canny [27], which implies advantages and disadvantages for each one. For example, the Hu Invariant Moments method is fast, but its low tolerance to overlapping decreases its detection rate. On the other hand, methods such as Template Matching handle overlapping cases well but need more time for detection.

At this point it is worth clarifying that the algorithm is divided into processing blocks or layers, such that the output of layer 𝑛 is the input of the next layer 𝑛 + 1. In this chapter, in order to validate the results of the previous tests, a new counter layer that uses the Hough transform for circle detection is added to the algorithm presented in [4], [32] to improve the detection rate. This change also makes some counter layers of the original algorithm (based on template matching through nested loops) unnecessary, improving the processing time of the counting.

4.1 Methodology

First, the overview presents a brief block-level comparison of the two methods. Then, the second section describes the proposed method in more detail, showing its operation step by step.

4.1.1 Overview

Broadly speaking, as can be seen in Figure 4.1 and Figure 4.2, the main change made to the base work to improve circle detection is the application of the Hough transform algorithm as a new counter layer. In addition, segmentation was improved by applying a method based on the watershed transform.


Figure 4.1 – Circle detector algorithm proposed in [4], [32]


Figure 4.3 shows a graphical comparison of the processing blocks of the two methods and their equivalences. The two blocks highlighted in brown represent the additions that make it possible to avoid the repeated use of the method based on Template Matching.

Figure 4.3 – Comparison between the first method proposed in [33] and the new proposed method.

The decision to place the Hough transform method between the other two was not arbitrary; it follows from the results in chapter 3. That chapter showed that, in terms of time, HM is better than the others but has a low tolerance to overlaps in the scene. On the other hand, TM showed good tolerance to overlapping circles but is problematic when processing time is an important factor. Finally, HT showed acceptable behavior on the overlap problem with better performance than TM.

Given this, it was logical to place HT in the second counter layer and keep the other methods in the first and third layers. In the first layer, the fast HM method quickly detects most of the circles without overlap. In the second layer, the HT method detects, in a reasonable processing time, the circles (with or without overlap) that the first method did not detect. In the third counter layer, the TM method completes the detection of the remaining circles. In this case, the TM method works quickly because it does not have to search the entire image, since most of the circles have already been detected.
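The layered arrangement can be sketched as a simple cascade in which each layer only receives what the previous layers left undetected. This is a strong simplification for illustration: the real layers operate on the image, whereas this sketch assumes a list of already separated candidate objects with hypothetical `overlap` and `deformed` flags, and toy detectors standing in for HM, HT, and TM.

```python
def run_cascade(objects, layers):
    """Pass the objects left undetected by layer n on to layer n + 1."""
    remaining, found_by = list(objects), {}
    for name, detector in layers:
        detected = [obj for obj in remaining if detector(obj)]
        found_by[name] = detected
        remaining = [obj for obj in remaining if obj not in detected]
    return found_by, remaining


# Hypothetical candidates; the flags are illustrative, not measured features.
objects = [
    {"id": 1, "overlap": False, "deformed": False},
    {"id": 2, "overlap": True, "deformed": False},
    {"id": 3, "overlap": True, "deformed": True},
]
# Toy stand-ins for the three counter layers, ordered fast to slow.
layers = [
    ("HM", lambda o: not o["overlap"] and not o["deformed"]),
    ("HT", lambda o: not o["deformed"]),
    ("TM", lambda o: True),
]
found_by, remaining = run_cascade(objects, layers)
ids = {name: [o["id"] for o in objs] for name, objs in found_by.items()}
```

In this toy run, the isolated circle is caught by the first layer, the overlapping one by the second, and only the deformed one reaches the slow third layer, which is the reason the ordering reduces total processing time.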


4.1.2 Operation mode

The pictures to be processed are RGB images in XGA resolution (1024×768) stored in BMP format. As can be seen in Figure 4.4, they have a red band around them, used for a logistic operation in the factory where the database was made.

Figure 4.4 – Original image

A color pre-processing step [4] is applied to the original image, resulting in a binary image (Figure 4.5). From this point on, all processing is done on the binary image.


A useful and common strategy to achieve a better processing time in this field is to work only on the region of interest of the image (Figure 4.6). This avoids calculations in unimportant areas; in this case, the presence of white pixels is used as the decision threshold.

Figure 4.6 – Region of interest

As shown in Figure 4.6, some objects are not completely separated from others. Although the image could be processed at this point, it is segmented with a watershed transform in order to obtain a better detection rate than [4]. As mentioned in the previous chapter, the distance transform (Figure 4.7a) is a useful tool to achieve this. If the segmentation is applied directly, the common problem of over-segmentation appears; to fix this, a marker-based watershed transform is applied (Figure 4.7b). The markers are selected by detecting the valleys in the result of the distance transform. Finally, as shown in Figure 4.7d, most of the objects are separated.

Figure 4.7 – Segmentation process: (a) Distance transform; (b) Locating markers; (c) Watershed transform; (d) Image segmented

With the image segmented, the first counter layer, based on Hu invariant moments, starts to detect the first circles and the possible radii. To do that, the first moment, shown in Figure 4.8a, is calculated for all objects (neighborhoods). If the moment is inside the preset threshold, the object is considered a circle. After the moment has been calculated for all the objects and the possible circles have been detected, an area threshold is applied to separate the circles from the noise. For this last filtering, as shown in Figure 4.8b, a preset tolerance is used. Finally, the mean radius of the first circles detected is the radius used by the next counter layers. Figure 4.8c shows the result of the first count; some circles have been removed because they were already counted.

Figure 4.8 – First counter layer based on invariant Hu moments: (a) First invariant moment calculation; (b) Area threshold; (c) Result of the counting

The next counter layer, based on the Hough transform, works over edges (Figure 4.9a). Considering five possible radii starting from the radius calculated by the first layer (Figure 4.9b), the voting map is generated to detect the peaks (Figure 4.9c), which indicate the presence of a circle. Figure 4.9d shows the result of the second count.

Figure 4.9 – Second counter layer based on the Hough transform: (a) Canny edge detection; (b) Cones in the space a-b-r; (c) Voting map; (d) Result of the counting

The last counter layer is based on template matching. As mentioned in a previous chapter, generally only the circles that could not be detected by the previous layers, because of their shape imperfections, reach this layer. To detect them, a correlation map is generated (Figure 4.10a), where each pixel holds the value of the Pearson coefficient between the template and the area of the image centered on the coordinates (𝑥, 𝑦) of that pixel. Figure 4.10b to Figure 4.10f show how the peaks are removed one by one as each indicates the presence of a circle. Finally, Figure 4.10g shows the result of the detection process. All the circles were detected, indicating that all the layers work successfully.

Figure 4.10 – Third counter layer based on template matching: (a) Correlation map; (b, c, d, e, f) Circle detection by locating peaks; (g) Result of the counting

Figure 4.11 shows the result of the detection process. To differentiate the circles detected by each layer, a colored point was painted at the center of each circle. The circles detected by the first layer, based on Hu invariant moments, have red centers. The circles detected by the second layer, based on the Hough transform, have brown centers. Finally, the circles detected by the third layer, based on template matching, have blue centers.


5 RESULTS

After testing the addition of the Hough transform to the algorithm of the previous work, the results were convincing. The addition brought an acceptable improvement in the detection rate and a radical improvement in performance. As can be seen in Table 5.1, the previous work (Algorithm 1) has a detection rate of 99.97%; the present work (Algorithm 2) improves the detection rate to 99.99%.

In terms of cumulative computation time, as can be seen in Figure 5.1 and in Appendix A, the new algorithm took only 27% of the time that the previous work took to process the entire database.

On the other hand, if the results obtained using each of the three methods alone (HM, HT, TM) are included in the comparison, interesting conclusions can be drawn. As can be seen in Table 5.1, the best detection rate is again achieved by the new proposed algorithm (99.9935%). In terms of time, the method based on Hu Moments again achieved the best processing time; however, compared with the mean of the three pure methods HM, HT, and TM (2403 s), the new proposal achieves a better processing time, with 1176 s.

Table 5.1 – Comparison of results between the three methods, the first proposed algorithm, and the new proposed algorithm

| Method | T (s) | DR |
|---|---|---|
| HM | 174 | 86.6110% |
| HT | 695 | 99.8860% |
| TM | 6341 | 99.9810% |
| First algorithm proposed | 3658 | 99.9773% |
| New algorithm proposed | 1176 | 99.9935% |

Figure 5.1 – Cumulative processing time (horizontal axis: sample; vertical axis: processing time)

Still in terms of time, the performance advantage of the new algorithm over the previous work is easy to see. In addition to the difference in performance, another important factor is the predictability of the processing time. As can be seen in Figure 5.2a and Figure 5.2b, which compare the standard deviation of the processing time of each algorithm, the new proposal allows better control of time, a very important advantage if the algorithm is applied in an industrial process.


6 CONCLUSIONS

In the development of this research, a critical point was the search for a good image database containing circles. Fortunately, for the development of this work, a proprietary database with the required shapes had been created. On the other hand, no additional database was found, which became a critical difficulty.

In this work, a comparison was made between three methods to detect circles: invariant Hu moments, Hough transform, and template matching. According to the test results, the Hu invariant moments method offers good performance but low effectiveness in detecting imperfect circles. On the other hand, the method based on template matching offers a good detection rate but takes a long time to finish the task.

The Hough transform, in turn, is a good method for detecting circles, with acceptable performance and effectiveness. Of the three methods used, it is the one that achieves the best balance between detection rate and processing time.

In order to apply these results, a previous circle detection work, in which two methods work together (Hu moments and template matching), was used to analyze how the results improve. For this, an additional counter layer based on the Hough transform was implemented. These changes, described in chapter 4, improved the detection rate and time performance of the algorithm, supporting the preliminary results of chapter 3.

Indeed, the Hough transform is very useful for detecting circles, as in this work, with an acceptable tolerance to overlap and to imperfections in the shape, as can be seen in some images of the database. According to the literature, it is also useful for other types of shapes, such as squares, ellipses, or rectangles.

Finally, the new proposed algorithm, which represents an optimized use of the three methods mentioned (Hu moments, Hough transform, and template matching), achieved a better detection rate than the other methods, leveraging the strengths of each individual method without compromising the processing time. In this way, combining slow and fast methods, and robust and non-robust methods, yields a fast and robust solution.


7 FUTURE WORK

Test the algorithms with a larger database, to retest reliability. Unfortunately, the image database used in this work contains only 340 samples. The next step is the search for other databases with other types of geometric shapes.

Develop metrics for measuring the quality of detection that consider whether the shape was found at the correct pixel (for the HM and TM cases), rather than considering only success or failure.

As seen in this work, adopting the Hough transform to detect circles brought great advantages in the detection rate and time performance of the algorithm. Further improvement is possible, considering that the algorithm is highly parallelizable using the many tools available today.

In terms of hardware, a next development step could be the use of an FPGA or GPU technologies to achieve better time performance than the present work.

In terms of detection methods, other techniques, such as Gradient Pair Vectors or Random Sample Consensus, could be analyzed to determine whether they improve on the present work.


BIBLIOGRAPHY

[1] Lei Zhang, Jiexin Pu, and Jia Yu, “Object recognition based on modified invariant moments,” in 2009 International Conference on Mechatronics and Automation, 2009, pp. 2542–2547.

[2] D. Vernon, Machine Vision - Automated Visual Inspection and Robot Vision, 1st ed. Hertfordshire: Prentice Hall, 1991.

[3] X. Chen, L. Lu, and Y. Gao, “A new concentric circle detection method based on Hough transform,” in 2012 7th International Conference on Computer Science & Education (ICCSE), 2012, pp. 753–758.

[4] A. Dueñas and C. Vadillo, “Steel Bars Counting Using Image Processing,” UPC - Peruvian University of Applied Sciences, 2013.

[5] R. C. Gonzalez, Digital Image Processing, vol. 14, no. 3. 2002.

[6] R. C. Gonzalez, R. E. Woods, and S. L. Eddins, Digital Image Processing using MATLAB, 2nd ed. Gatesmark Publishing, 2009.

[7] “Region growing,” 2008. [Online]. Available: https://en.wikipedia.org/wiki/Region_growing.

[8] S. Beucher and C. Lantuejoul, “Use of Watersheds in Contour Detection,” International Workshop on Image Processing: Real-time Edge and Motion Detection/Estimation. pp. 12–21, 1979.

[9] E. Betteridge, “Washed Out Plans.” [Online]. Available: http://the-session.info/2013/01/washed-out-plans/.

[10] S. L. Eddins, “The Watershed Transform: Strategies for Image Segmentation,” Mathworks - Technical Articles and Newsletters, 2002. [Online]. Available: http://www.mathworks.com/company/newsletters/articles/the-watershed-transform-strategies-for-image-segmentation.html?refresh=true. [Accessed: 31-Aug-2015].
