
A Hybrid Rao-NM Algorithm for Image Template Matching.

Xinran Liu1,2,3, Zhongju Wang1,2,3, Long Wang1,2,3, Chao Huang1,2,3, Xiong Luo1,2,3.   

Abstract

This paper proposes a hybrid Rao-Nelder-Mead (Rao-NM) algorithm for image template matching. The developed algorithm applies the Rao-1 algorithm and the NM algorithm serially, so that the global search capability of the Rao-1 algorithm and the local search capability of the NM algorithm are both exploited. It can quickly and accurately search for a high-quality solution while preserving global convergence. The computing time is greatly reduced, while the matching accuracy is significantly improved. Four commonly applied optimization problems and three image datasets are employed to assess the performance of the proposed method. Three commonly used algorithms, namely the generic Rao-1 algorithm, particle swarm optimization (PSO), and the genetic algorithm (GA), are considered as benchmarking algorithms. The experimental results demonstrate that the proposed method is effective and efficient in solving image matching problems.


Keywords:  Rao algorithm; computational intelligence; image matching; optimization

Year:  2021        PMID: 34072269      PMCID: PMC8229128          DOI: 10.3390/e23060678

Source DB:  PubMed          Journal:  Entropy (Basel)        ISSN: 1099-4300            Impact factor:   2.524


1. Introduction

Image matching is an important topic in image processing, and it has broad application prospects in the field of computer vision. Image matching typically includes Template Matching (TM), Feature Matching, and Dynamic Pattern Matching, among which TM is the most commonly used approach. TM measures whether an image patch matches a small area of the source image by sliding the template over the source image, and then uses the coordinates of the upper-left corner of the corresponding window to determine the matching position [1]. TM is a fundamental problem of pattern recognition and has a wide range of applications in image processing and computer vision, such as image recognition [2,3,4,5], remote sensing [6,7], social media analytics [8,9], medical image processing [10,11,12], and biometric recognition [13,14,15]. In image analysis, matching technologies play an important role in image understanding and retrieval [16]. TM typically involves two main operations, similarity measurement and best-match search [17,18]. Various metrics are utilized to measure the similarity of two grayscale images, including Mean Absolute Differences (MAD), Sum of Absolute Differences (SAD), Sum of Squared Differences (SSD), and Mean Square Differences (MSD). In addition to these measures, the normalized cross correlation (NCC) is commonly used for image matching, due to its robustness to illumination variance and noise [19,20,21]. The NCC effectively reduces the influence of illumination on image comparison results, and it is well suited to images with slightly deformed objects, blurred or unclear images, and textured images. The full, exhaustive search algorithm [22] is the simplest TM approach. It checks every candidate pixel position and therefore has extremely high accuracy.
However, exhaustive search is computationally very expensive because the NCC value has to be computed at every pixel of the source image, which severely limits its use in image processing applications [23]. To reduce the time of NCC computation and speed up image matching, TM algorithms based on computational intelligence have been proposed in the literature. Computational intelligence algorithms have been extensively used for different optimization problems in previous studies. He et al. [24] developed a robust fuzzy programming approach to solve multiple response optimization problems. Chen et al. [25] proposed an adaptive gradient method to ensure both the convergence and the communication efficiency of federated learning. Tang et al. [26] proposed an improvement in the stochastic optimization of imaging inverse problems. Recently, hybrid computational intelligence algorithms have been developed and applied in various domains [27,28,29]. Computational intelligence-based algorithms have also been employed in the area of image matching. Yan et al. [30] introduced the isolation niche technology into the traditional Cultural Algorithm (CA) and applied it to the image matching problem to improve stability and convergence precision. Liu et al. [31] proposed a Chaotic Quantum-behaved Particle Swarm Optimization Based on Lateral Inhibition (LI-CQPSO), which utilizes chaos theory to keep PSO from converging prematurely. Luo et al. [32] proposed a hybrid spotted hyena optimizer based on lateral inhibition (LI), in which LI is applied in image pre-processing to enhance the contrast of intensity gradients and the features of the image. Huang et al. [33] discussed a hybrid bio-inspired evolutionary optimization approach incorporating the lateral inhibition mechanism and the Imperialist Competitive Algorithm (ICA), addressing the limitation that the traditional ICA may become trapped in a local minimum.
The above-mentioned methods often include algorithm-specific parameters, such as the cognitive and social factors in PSO, and tuning these parameters introduces additional computational cost. Meanwhile, they typically employ the correlation value as the fitness function to find the best matching point in the image through multiple iterations, thereby reducing the number of explorations and shortening the search time. However, these methods cannot search the entire solution space efficiently and are prone to premature convergence. Therefore, they often fall into a local optimum and miss the accurate position, resulting in low search precision and accuracy. To address these limitations, a hybrid Rao-NM algorithm that combines the Rao-1 algorithm and the Nelder–Mead algorithm is proposed for the TM problem in this paper. The proposed method contains two search processes, a global search and a local search. The Rao-1 algorithm is employed for the global search. It is a metaphor-less swarm intelligence method introduced by Rao [34] in 2019. The main idea of the Rao-1 algorithm is to iteratively update candidate solutions so that, with high probability, they approach the global best solution and move away from the worst solution. The optimal solution is obtained through the random interaction between the best and worst solutions. The Rao-1 algorithm does not contain any algorithm-specific parameters and involves only simple arithmetic operations (addition and multiplication), so the computational cost of parameter tuning is avoided. Recent research has demonstrated its capability in solving different unconstrained and constrained optimization problems. During the local search process, the NM algorithm is utilized to further improve the search results of the Rao-1 algorithm.
The NM algorithm is a popular derivative-free nonlinear optimization search method introduced by Nelder and Mead [35,36]. It considers only function values to minimize a scalar-valued nonlinear function, without any derivative information [22]. It rescales the simplex of vertices according to the local behavior of the function through four basic processes: reflection, expansion, contraction, and shrinkage. Through these steps, the simplex improves itself and gradually approaches the optimal solution. The rest of this paper is organized as follows. In Section 2, an optimization problem for TM is formulated. Section 3 presents the proposed hybrid Rao-NM algorithm. In Section 4, experiments and analyses are presented. Finally, the conclusion of this paper is provided in Section 5.

2. Problem Formulation

Image matching technologies are important in airplane or missile map matching and positioning, medical image processing, and other related fields. The image matching process uses two sensors to acquire two images of different sizes from the same area. The image obtained in advance is called the source image, and the image obtained in real time or online during the matching process is called the template image. In this study, we use the NCC model [16] as the fitness function to compute the degree of matching between the template image and the source image and then determine the search position. Guided by the fitness value, i.e., the NCC coefficient, the hybrid algorithm searches the source image until the area with the best similarity is found. Image TM aims to locate a small area of the source image that is similar to the template image by sliding the template over the source image, as shown in Figure 1. To facilitate computation, both the template image and the source image are converted to grayscale images. Let the matrices $T$ and $S$ represent the grayscale template and source images, respectively, where $m$ and $n$ denote the height and width of the template, and $T(x, y)$ and $S(x, y)$ represent the gray values of a pixel in the respective images; the template is no larger than the source image in either dimension.
Figure 1

Template matching geometry.

The TM problem is to search for a point $(i, j)$ in $S$ such that the similarity between $T$ and the window of $S$ whose top-left corner is $(i, j)$ is maximal over the feasible search space. The NCC metric uses the grayscale matrices of the two images to compute the degree of matching between them through a normalized correlation measure. Therefore, the TM problem can be presented as the optimization problem in (1):

$$\max_{(i, j)} \ \mathrm{NCC}(i, j) \qquad (1)$$

where $(i, j)$ is the pixel position of the top-left corner of the matching window in the grayscale source image. When the window of the source image at $(i^*, j^*)$ is identical to the template image, $\mathrm{NCC}(i^*, j^*) = 1$.
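To make the formulation concrete, the following is a minimal pure-Python sketch of an NCC fitness function and the exhaustive search it would replace. The exact NCC expression is not recoverable from the text here, so the code assumes the common non-mean-subtracted form (sum of products divided by the square root of the product of the two energies); the function names `ncc` and `exhaustive_match` and the toy images are illustrative only.

```python
import math

def ncc(source, template, i, j):
    """NCC between `template` and the window of `source` with top-left corner (i, j).

    Assumed form: sum(S*T) / sqrt(sum(S^2) * sum(T^2)), no mean subtraction.
    """
    m, n = len(template), len(template[0])
    num = s_energy = t_energy = 0.0
    for x in range(m):
        for y in range(n):
            s = source[i + x][j + y]
            t = template[x][y]
            num += s * t
            s_energy += s * s
            t_energy += t * t
    denom = math.sqrt(s_energy * t_energy)
    return num / denom if denom > 0 else 0.0

def exhaustive_match(source, template):
    """Score every feasible top-left corner and return the best (i, j) and its NCC."""
    m, n = len(template), len(template[0])
    rows, cols = len(source), len(source[0])
    best_pos, best_score = None, -1.0
    for i in range(rows - m + 1):
        for j in range(cols - n + 1):
            score = ncc(source, template, i, j)
            if score > best_score:
                best_pos, best_score = (i, j), score
    return best_pos, best_score

# Toy grayscale images: the template is cut from the source at (1, 1).
source = [
    [1, 2, 3, 4],
    [5, 9, 1, 6],
    [7, 2, 8, 3],
    [4, 6, 5, 2],
]
template = [[9, 1], [2, 8]]
pos, score = exhaustive_match(source, template)
print(pos, score)
```

An optimizer such as the one proposed in this paper would call `ncc` only at the candidate positions it visits instead of at every pixel, which is where the time saving comes from.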

3. The Proposed Hybrid Rao-NM Algorithm

In this paper, we combine the Rao-1 algorithm and the Nelder–Mead simplex method to efficiently obtain the optimal solution. In the proposed algorithm, the Rao-1 algorithm is employed for the global search, while the NM algorithm is utilized to conduct the local search. In this section, the Rao-1 and NM algorithms are introduced separately, and then the hybrid Rao-NM algorithm is described in detail.

3.1. Rao-1 Algorithm

The Rao-1 algorithm is a metaphor-less swarm intelligence-based optimization method that contains no algorithm-specific parameters [34]. Only two common controlling parameters, the population size and the number of iterations, need to be determined for the Rao-1 algorithm. The solution updating procedure of the Rao-1 algorithm is given in (2) and (3):

$$X'_{j,k,i} = X_{j,k,i} + r_{1,j,i}\,\left(X_{j,\mathrm{best},i} - X_{j,\mathrm{worst},i}\right) \qquad (2)$$

$$X_{j,k,i+1} = \begin{cases} X'_{j,k,i}, & \text{if } f(X'_{k,i}) \text{ is better than } f(X_{k,i}) \\ X_{j,k,i}, & \text{otherwise} \end{cases} \qquad (3)$$

where $X_{j,\mathrm{best},i}$ is the value of variable $j$ for the best candidate and $X_{j,\mathrm{worst},i}$ is the value of variable $j$ for the worst candidate during the $i$th iteration. $X'_{j,k,i}$ is the updated value of $X_{j,k,i}$ for the $k$th candidate. The random numbers $r_{1,j,i}$ and $r_{2,j,i}$ of the $j$th variable during the $i$th iteration take values in [0, 1]; the Rao-1 update in (2) uses only $r_{1,j,i}$. Based on the updating rule, the optimization process of the Rao-1 algorithm is summarized as follows:

Step 1: Initialize the common controlling parameters: population size, number of design variables, and termination criteria.
Step 2: Determine the best and worst solutions in the population.
Step 3: Update each current solution through the random interaction between the best, worst, and candidate solutions according to (2).
Step 4: Compute the objective function value for every updated solution, and select between the updated and the previous solution according to (3).
Step 5: If the termination conditions are satisfied, stop the optimization process. Otherwise, return to Step 2.
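The five steps above can be sketched in a few lines of Python. This is a minimal illustration for a minimization problem, assuming the published Rao-1 update with a single random number per variable; the function name `rao1`, the bound clipping, the fixed seed, and the `sphere` test function are assumptions made for the example and are not taken from the paper.

```python
import random

def sphere(x):
    """Simple test objective with its minimum of 0 at the origin (illustrative)."""
    return sum(v * v for v in x)

def rao1(objective, bounds, pop_size=20, iterations=100, seed=0):
    """Minimize `objective` with the Rao-1 update and greedy selection."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Step 1: initialize the population uniformly inside the bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fit = [objective(x) for x in pop]
    for _ in range(iterations):
        # Step 2: best and worst candidates of the current population.
        best = pop[fit.index(min(fit))]
        worst = pop[fit.index(max(fit))]
        for k in range(pop_size):
            # Step 3: move along (best - worst), scaled by a random number in [0, 1].
            cand = []
            for j in range(dim):
                r1 = rng.random()
                v = pop[k][j] + r1 * (best[j] - worst[j])
                lo, hi = bounds[j]
                cand.append(min(max(v, lo), hi))  # clipping is an added assumption
            # Step 4: keep the update only if it improves the objective.
            f_cand = objective(cand)
            if f_cand < fit[k]:
                pop[k], fit[k] = cand, f_cand
    # Step 5: a fixed iteration budget serves as the termination criterion.
    k_best = fit.index(min(fit))
    return pop[k_best], fit[k_best]

bounds = [(-5.0, 5.0), (-5.0, 5.0)]
x_best, f_best = rao1(sphere, bounds)
print(x_best, f_best)
```

Because of the greedy selection in Step 4, the fitness of each population member never gets worse from one iteration to the next.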

3.2. NM Method

The NM method is a local search method for unconstrained optimization that relies only on function values, without using gradient information. The simplex moves toward the optimal value by adapting to the local landscape of the objective function. Since the TM problem can be regarded as a two-dimensional optimization problem, the simplex is a triangle composed of three vertices. In general, if one point is taken as the origin of a non-degenerate simplex, the other $n$ points define vector directions that span the $n$-dimensional vector space [37]. The NM method uses four basic steps to readjust the scale of the simplex according to the local behavior of the function: reflection, expansion, contraction, and shrinkage [38]. The simplex approaches the optimal value continuously through these procedures. A complete definition of the NM method requires four scaling parameters: the coefficients of reflection ($\rho$), contraction ($\gamma$), expansion ($\chi$), and shrinkage ($\sigma$). According to the definition of the NM method, these parameters should satisfy (4):

$$\rho > 0, \quad \chi > 1, \quad \chi > \rho, \quad 0 < \gamma < 1, \quad 0 < \sigma < 1 \qquad (4)$$

As image TM is a two-dimensional optimization problem, the parameters are set to the standard case according to (5):

$$\rho = 1, \quad \chi = 2, \quad \gamma = \tfrac{1}{2}, \quad \sigma = \tfrac{1}{2} \qquad (5)$$

The specific steps of the NM method are described as follows:

Step 1 (Initialization): Randomly generate $n + 1$ initial vertices within their respective search ranges. Compute the objective function value and the simplex constraint of each vertex, and then order the vertices so that $f(x_1) \le f(x_2) \le \dots \le f(x_{n+1})$.

Step 2 (Reflection): Calculate the reflection point $r$ according to (6):

$$r = c + \rho\,(c - h) \qquad (6)$$

where $h$ and $l$ are the vertices with the highest and lowest function values, respectively, $s$ is the vertex with the second-highest function value, and $f(\cdot)$ represents the value of the objective function. The point $c$ is the center of the simplex without $h$ in the minimization case. If $f(r) < f(l)$, go to Step 3; if $f(r) \ge f(s)$, go to Step 4; otherwise, $f(r)$ lies between $f(l)$ and $f(s)$, so $h$ is replaced by $r$ and the method goes to Step 6.

Step 3 (Expansion): To extend the search in the same direction, the expansion point $e$ is computed as (7):

$$e = c + \chi\,(r - c) \qquad (7)$$

If $f(e) < f(r)$, $h$ is replaced by $e$; if $f(e) \ge f(r)$, $h$ is replaced by $r$. Go to Step 6.

Step 4 (Contraction): When $f(r)$ lies between $f(s)$ and $f(h)$, $h$ is replaced by $r$ and contraction is performed. When $f(r) > f(h)$, contraction is performed directly without any replacement. The contraction vertex $x_c$ is computed as (8):

$$x_c = c + \gamma\,(h - c) \qquad (8)$$

If $f(x_c) < f(h)$, $h$ is replaced by $x_c$ and the method goes to Step 6. Otherwise, shrinkage is performed in Step 5.

Step 5 (Shrinkage): When the contraction fails, shrinkage is applied to all vertices of the simplex except $l$ according to (9):

$$x_i \leftarrow l + \sigma\,(x_i - l) \qquad (9)$$

Then go to Step 6.

Step 6 (Termination): If the termination condition is met, the computation stops. Otherwise, return to the ordering in Step 1 to start a new iteration.
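As an illustration of the reflection, expansion, contraction, and shrinkage steps, here is a minimal pure-Python sketch of the NM method with the standard coefficients (1, 2, 1/2, 1/2). The parameter names, the stopping rule based on the spread of function values, and the quadratic test function are assumptions made for the example; the paper's own implementation may differ in these details.

```python
def quadratic(p):
    """Convex test objective with its minimum of 0 at (1, -2) (illustrative)."""
    x, y = p
    return (x - 1.0) ** 2 + (y + 2.0) ** 2

def nelder_mead(f, simplex, rho=1.0, chi=2.0, gamma=0.5, sigma=0.5,
                max_iter=200, tol=1e-10):
    """Minimize f starting from `simplex`, a list of n + 1 points of dimension n."""
    pts = [list(p) for p in simplex]
    n = len(pts) - 1
    for _ in range(max_iter):
        pts.sort(key=f)                      # pts[0] = l (lowest), pts[-1] = h (highest)
        l, s, h = pts[0], pts[-2], pts[-1]
        if abs(f(h) - f(l)) < tol:           # assumed termination condition
            break
        # Center of the simplex without h.
        c = [sum(p[d] for p in pts[:-1]) / n for d in range(n)]
        r = [c[d] + rho * (c[d] - h[d]) for d in range(n)]       # reflection
        if f(r) < f(l):
            e = [c[d] + chi * (r[d] - c[d]) for d in range(n)]   # expansion
            pts[-1] = e if f(e) < f(r) else r
        elif f(r) < f(s):
            pts[-1] = r                      # accept the reflected point
        else:
            if f(r) < f(h):
                h = pts[-1] = r              # replace h by r before contracting
            xc = [c[d] + gamma * (h[d] - c[d]) for d in range(n)]  # contraction
            if f(xc) < f(h):
                pts[-1] = xc
            else:                            # shrink every vertex toward l
                pts = [l] + [[l[d] + sigma * (p[d] - l[d]) for d in range(n)]
                             for p in pts[1:]]
    pts.sort(key=f)
    return pts[0], f(pts[0])

x_min, f_min = nelder_mead(quadratic, [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)])
print(x_min, f_min)
```

The best vertex is never discarded, so the returned value cannot be worse than the best vertex of the starting simplex. This property is what makes the method suitable as a refinement stage after a global search.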

3.3. The Hybrid Rao-NM Algorithm

The Rao-NM algorithm combines the adaptive Rao-1 algorithm and the NM method to balance the efficiency and accuracy of the optimization process, with a higher probability of obtaining the optimal solution within a limited number of iterations. In the optimization process, the Rao-1 algorithm [34] is first applied to find a relatively good solution, which reduces the search space for the continued search. Next, starting from the solution obtained by the Rao-1 algorithm, the NM method [35] is utilized to search for the best local solution near this initial solution. Compared with the generic Rao-1 algorithm, the proposed hybrid algorithm can offer better solutions thanks to the NM method. Meanwhile, the Rao-NM algorithm converges quickly, inheriting the advantage of the Rao-1 algorithm. The main optimization process of the proposed Rao-NM algorithm is described in Algorithm 1. As shown in Algorithm 1, considering the multiplication operation of the NCC computation as the basic operation, the time complexity of the proposed algorithm is O(M·N·w·h), where w and h are the width and height of the template image, respectively. Thus, it is independent of the size of the source image.
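The serial structure described above (a Rao-1 global stage followed by an NM local stage) can be sketched as follows. This is an illustrative outline only: Algorithm 1 is not reproduced in the text, so the way the initial simplex is built (the Rao-1 solution plus one small offset of size `step` per axis), the iteration budgets, and the use of the six-hump form of the camel function as the objective are all assumptions.

```python
import random

def camel(p):
    """Six-hump camel function (assumed form of the 'Camel function' benchmark)."""
    x, y = p
    return (4 - 2.1 * x**2 + x**4 / 3) * x**2 + x * y + (-4 + 4 * y**2) * y**2

def rao1(f, bounds, pop_size, iterations, rng):
    """Global stage: Rao-1 update with greedy selection (minimization)."""
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fit = [f(x) for x in pop]
    for _ in range(iterations):
        best = pop[fit.index(min(fit))]
        worst = pop[fit.index(max(fit))]
        for k in range(pop_size):
            cand = [min(max(x + rng.random() * (b - w), lo), hi)
                    for x, b, w, (lo, hi) in zip(pop[k], best, worst, bounds)]
            f_cand = f(cand)
            if f_cand < fit[k]:
                pop[k], fit[k] = cand, f_cand
    k = fit.index(min(fit))
    return pop[k], fit[k]

def nelder_mead(f, simplex, max_iter=200, tol=1e-12):
    """Local stage: NM with coefficients 1 (reflect), 2 (expand), 0.5, 0.5."""
    pts = [list(p) for p in simplex]
    n = len(pts) - 1
    for _ in range(max_iter):
        pts.sort(key=f)
        l, s, h = pts[0], pts[-2], pts[-1]
        if abs(f(h) - f(l)) < tol:
            break
        c = [sum(p[d] for p in pts[:-1]) / n for d in range(n)]
        r = [2 * c[d] - h[d] for d in range(n)]
        if f(r) < f(l):
            e = [c[d] + 2 * (r[d] - c[d]) for d in range(n)]
            pts[-1] = e if f(e) < f(r) else r
        elif f(r) < f(s):
            pts[-1] = r
        else:
            if f(r) < f(h):
                h = pts[-1] = r
            xc = [c[d] + 0.5 * (h[d] - c[d]) for d in range(n)]
            if f(xc) < f(h):
                pts[-1] = xc
            else:
                pts = [l] + [[l[d] + 0.5 * (p[d] - l[d]) for d in range(n)]
                             for p in pts[1:]]
    pts.sort(key=f)
    return pts[0], f(pts[0])

def rao_nm(f, bounds, pop_size=20, iterations=30, step=0.1, seed=0):
    """Run Rao-1 first, then refine its best solution with NM."""
    rng = random.Random(seed)
    x0, f0 = rao1(f, bounds, pop_size, iterations, rng)
    dim = len(x0)
    # Assumed simplex construction: x0 plus one offset vertex per axis.
    simplex = [x0] + [[x0[d] + (step if d == k else 0.0) for d in range(dim)]
                      for k in range(dim)]
    x1, f1 = nelder_mead(f, simplex)
    return (x0, f0), (x1, f1)

(x_global, f_global), (x_local, f_local) = rao_nm(camel, [(-3, 3), (-2, 2)])
print(f_global, f_local)
```

Since the Rao-1 solution is one vertex of the starting simplex and NM never discards its best vertex, the refined value `f_local` is never worse than `f_global`. For template matching, the objective would be the negative NCC evaluated at rounded pixel coordinates rather than a continuous test function.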

4. Experiment and Analysis

4.1. Benchmarking Test Functions

To assess the performance of the proposed algorithm, four benchmarking test functions, as shown in (10)–(13), are utilized, and their images are shown in Figures 2–5. The test functions include unimodal functions and multimodal functions with numerous local optima. Meanwhile, three algorithms (Rao-1, PSO, and GA) are used as benchmarks to assess the performance of the proposed method.
Figure 2

Image of Function 1.

Figure 3

Image of Function 2.

Figure 4

Image of Function 3.

Figure 5

Image of Function 4.

Function 1: Schaffer function
Function 2: Camel function
Function 3:
Function 4:

For the above four benchmark functions, each of the four algorithms is run 50 times. According to the results presented in Table 1, the proposed hybrid Rao-NM algorithm achieves the best overall performance in terms of efficiency and precision among all considered methods. Moreover, although both the Rao-1 algorithm and the proposed hybrid Rao-NM algorithm converge quickly, the Rao-NM algorithm is more accurate; for the F2 function in particular, it converges precisely to the optimal value. For the F3 function, whose solution space contains many local optima, the proposed hybrid algorithm still finds the optimal value accurately and efficiently. Therefore, the proposed algorithm outperforms the other algorithms in searching for the optimal solution.
Table 1

Results comparisons of the benchmark.

| Algorithm | Metric | F1 | F2 | F3 | F4 |
|---|---|---|---|---|---|
| Theoretical optimal value | | 0.0 | −1.0316 | −39.9450 | 0.0 |
| Rao-1 | Average time | 6.9750 × 10^−6 | 8.8250 × 10^−6 | 0.0032 | 0.0033 |
| Rao-1 | Actual optimal | 0.0610 | −0.1943 | −39.8498 | 0.0003 |
| PSO | Average time | 0.2421 | 0.2767 | 0.3727 | 0.2229 |
| PSO | Actual optimal | 0.0048 | 57.6269 | −39.0897 | 2.6623 |
| GA | Average time | 1.2741 | 1.2757 | 1.2739 | 1.2990 |
| GA | Actual optimal | 0.0024 | −0.9549 | −39.4269 | 0.0032 |
| Rao-NM | Average time | 4.1650 × 10^−5 | 3.7300 × 10^−5 | 0.0032 | 0.0033 |
| Rao-NM | Actual optimal | 0.0025 | −1.0316 | −39.8500 | 5.2560 × 10^−6 |

The bold indicates the best results.

4.2. Sensitivity Analysis on Controlling Parameters

Since the two controlling parameters, the population size and the number of iterations, are shared by all considered TM algorithms, they are tuned based on 368 images selected from the Oxford-IIIT Pet Dataset [39]. Nine parameter configurations are employed, and a grid search is utilized. All considered algorithms are implemented on a PC with an AMD Ryzen 9 3950X CPU and 32 GB RAM. The programs are written in Python 3 and executed on Windows 10. The algorithm-specific parameters of PSO and GA are set as follows: (1) PSO parameter settings [40]: cognitive and social acceleration constants C1 = 1.8, C2 = 1.8, self-weighting factor = 1.0, and independent random numbers r1 and r2 distributed in the range of [0, 1]. (2) GA parameter settings [41]: mutation probability = 0.05, elite ratio = 0.01, crossover probability = 0.75, and parent portion = 0.1. The matching accuracy and execution time of the different algorithms are shown in Tables 2, 3 and 4. In this study, the success rate defined in (14) is used to describe the accuracy:

$$R = \frac{N_s}{N_t} \times 100\% \qquad (14)$$

where $R$ is the success rate, $N_s$ is the number of experiments in which the matched pixel position coincides with the position of the template image, and $N_t$ is the total number of experiments.
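As a small worked example of the success rate and of the nine-point parameter grid, the sketch below counts exact position matches and enumerates the population size and iteration combinations that appear in Tables 2, 3 and 4. The function name `success_rate` and the toy positions are hypothetical and serve only to illustrate the computation.

```python
def success_rate(matched, truth):
    """Percentage of experiments whose matched position equals the true position."""
    hits = sum(1 for p, q in zip(matched, truth) if p == q)
    return 100.0 * hits / len(truth)

# Toy data: four experiments, one of which misses the true position by one pixel.
truth = [(10, 12), (40, 7), (3, 3), (25, 30)]
matched = [(10, 12), (41, 7), (3, 3), (25, 30)]
rate = success_rate(matched, truth)
print(rate)  # 75.0

# The nine (population size, number of iterations) configurations of the grid.
configs = [(pop, it) for pop in (50, 100, 200) for it in (50, 100, 200)]
print(len(configs))  # 9
```

In a tuning run, each configuration would be passed to the matching algorithm and its success rate and execution time recorded.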
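Table 2

Test results of TM using Rao-NM algorithm.

| Population Size | No. of Iterations | R | Time (s) |
|---|---|---|---|
| 50 | 50 | 77.71% | 71.42 |
| 50 | 100 | 80.70% | 140.23 |
| 50 | 200 | 84.51% | 277.56 |
| 100 | 50 | 85.32% | 138.61 |
| 100 | 100 | 89.13% | 274.53 |
| 100 | 200 | 87.77% | 546.94 |
| 200 | 50 | 88.58% | 273.82 |
| 200 | 100 | 91.84% | 544.45 |
| 200 | 200 | 95.10% | 1085.16 |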

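Table 3

Test results of TM using PSO.

| Population Size | No. of Iterations | R | Time (s) |
|---|---|---|---|
| 50 | 50 | 24.45% | 99.38 |
| 50 | 100 | 29.89% | 196.78 |
| 50 | 200 | 41.57% | 371.76 |
| 100 | 50 | 32.06% | 197.40 |
| 100 | 100 | 48.36% | 311.62 |
| 100 | 200 | 61.68% | 622.31 |
| 200 | 50 | 52.44% | 372.62 |
| 200 | 100 | 66.03% | 624.34 |
| 200 | 200 | 77.44% | 1194.61 |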
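Table 4

Test results of TM using GA.

| Population Size | No. of Iterations | R | Time (s) |
|---|---|---|---|
| 50 | 50 | 15.48% | 196.89 |
| 50 | 100 | 34.51% | 394.18 |
| 50 | 200 | 58.96% | 776.82 |
| 100 | 50 | 35.05% | 399.27 |
| 100 | 100 | 66.03% | 792.42 |
| 100 | 200 | 82.06% | 1567.83 |
| 200 | 50 | 67.39% | 798.36 |
| 200 | 100 | 88.31% | 1570.10 |
| 200 | 200 | 94.29% | 2951.44 |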

4.3. Template Matching Results

The Oxford Pets Dataset of 2580 images [39] is utilized to compare the performance of different algorithms based on the optimized parameters. Each algorithm is executed ten times, and the accuracy and execution time are presented in Table 5.
Table 5

Performance of different methods on the Oxford Pets Dataset.

| Model | R (%) | Time (s) |
|---|---|---|
| PSO | 49.76 ± 0.84 | 2616.38 ± 9.29 |
| GA | 70.17 ± 0.82 | 4345.63 ± 151.69 |
| Rao-1 | 54.17 ± 0.59 | 1666.08 ± 25.15 |
| Proposed | 88.94 ± 0.64 | 1807.25 ± 30.69 |
According to Table 5, the proposed method achieves the highest accuracy among all benchmarking methods and a shorter computing time than PSO and GA, while its execution time is only slightly longer than that of the Rao-1 algorithm. Thus, it is more practical to apply the proposed method in real applications. To assess the performance of the proposed method on real biometric recognition tasks, 94 images collected from the V47 dataset [42] and 100 images selected from the WIDER FACE dataset [43] are employed to evaluate the proposed method on person re-identification and face detection problems. The images from the WIDER FACE dataset have a high degree of variability in scale, pose, and occlusion, as shown in Figure 6. The image matching results are shown in Table 6 and Table 7.
Figure 6

Example images of the WIDER FACE dataset.

Table 6

Performance of different methods for Person Re-identification.

| Model | R (%) | Time (s) |
|---|---|---|
| PSO | 16.91 ± 3.34 | 97.067 ± 0.61 |
| GA | 48.19 ± 3.54 | 151.736 ± 4.06 |
| Rao-1 | 19.68 ± 1.60 | 86.23 ± 0.55 |
| Proposed | 56.70 ± 3.13 | 89.11 ± 1.09 |
Table 7

Performance of different methods for Face Detection.

| Model | R (%) | Time (s) |
|---|---|---|
| PSO | 15.2 ± 2.03 | 126.533 ± 1.51 |
| GA | 44.3 ± 4.59 | 189.719 ± 2.26 |
| Rao-1 | 19.5 ± 1.50 | 116.022 ± 0.54 |
| Proposed | 67.1 ± 3.95 | 120.916 ± 0.67 |
According to the results presented in Table 6 and Table 7, the proposed method achieves the highest accuracy of all methods on both datasets. Therefore, the proposed method is applicable to face detection and person re-identification tasks. Since different scenes are included in these images, TM using the Rao-NM method offers more robust results. Three large images of different sizes are employed to validate the actual performance of all considered algorithms, as shown in Figures 7, 8 and 9. Each algorithm is executed 50 times independently on each of the three images to validate its average performance.
Figure 7

(a) Predefined template image (160 × 70); (b) source image (960 × 540); (c) TM result.

Figure 8

(a) Predefined template image (150 × 150); (b) source image (1000 × 1500); (c) TM result.

Figure 9

(a) Predefined template image (100 × 150); (b) source image (1080 × 720); (c) TM result.

Table 8 shows that the proposed hybrid Rao-NM algorithm achieves the highest success rate of all the compared algorithms. Although the Rao-1 algorithm requires the least execution time, it performs poorly in TM on these three images. In particular, the success rate of the Rao-1 algorithm on Image 2 is only 2%. Thus, it is not suitable to directly apply the Rao-1 algorithm to TM problems. Compared with the PSO and GA algorithms, the search efficiency and accuracy of the proposed algorithm are greatly improved on all three images. As shown in Table 8, the matching accuracy of the hybrid Rao-NM algorithm exceeds 85% on every image, while the PSO and GA algorithms only offer success rates of less than 85%. These comparison results show that it is more practical to apply the proposed hybrid algorithm to TM problems.
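Table 8

Results comparisons of TM.

| Algorithm | Metric | Image 1 | Image 2 | Image 3 |
|---|---|---|---|---|
| PSO | Average time | 5.64 | 24.44 | 27.86 |
| PSO | Accuracy | 34% | 28% | 50% |
| GA | Average time | 9.33 | 14.75 | 24.59 |
| GA | Accuracy | 82% | 84% | 50% |
| Rao-1 | Average time | 4.27 | 13.86 | 11.72 |
| Rao-1 | Accuracy | 34% | 2% | 52% |
| Rao-NM | Average time | 4.28 | 13.87 | 11.73 |
| Rao-NM | Accuracy | 96% | 86% | 86% |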

5. Conclusions and Future Work

In this paper, a novel hybrid optimization algorithm, combining the Rao-1 algorithm and the NM method, is proposed to address the image matching problem in an effective and efficient way. The proposed algorithm incorporates the powerful large-scale global search ability of the Rao-1 algorithm and the thorough local search capability of the NM method. Thus, the Rao-NM algorithm can accurately search for high-quality solutions. To verify the robustness and the efficiency of the proposed Rao-NM algorithm, four commonly applied test functions and three image datasets are utilized, and three benchmarking algorithms are considered. The experimental results demonstrate that the proposed algorithm is more accurate than the benchmarking algorithms and takes less time to converge than PSO and GA. Considering its higher accuracy and short execution time, the proposed algorithm is practical for image matching problems. The proposed method is currently implemented serially on the CPU. Since current image processing and computer vision algorithms can run on modern GPUs, a parallel version of the proposed method will be investigated, so that multi-core CPUs and many-core GPUs can be employed to speed up the image matching task. Meanwhile, an elite mechanism can be incorporated into the Rao-1 algorithm to improve its global search ability.
  6 in total

1.  Template matching in rotated images.

Authors:  A Goshtasby
Journal:  IEEE Trans Pattern Anal Mach Intell       Date:  1985-03       Impact factor: 6.226

2.  Robust Semantic Template Matching Using A Superpixel Region Binary Descriptor.

Authors:  Hua Yang; Chenghui Huang; Feiyue Wang; Kaiyou Song; Zhouping Yin
Journal:  IEEE Trans Image Process       Date:  2019-01-17       Impact factor: 10.856

3.  Improved medical image modality classification using a combination of visual and textual features.

Authors:  Ivica Dimitrovski; Dragi Kocev; Ivan Kitanovski; Suzana Loskovska; Sašo Džeroski
Journal:  Comput Med Imaging Graph       Date:  2014-06-19       Impact factor: 4.790

Review 4.  Evaluating performance of biomedical image retrieval systems--an overview of the medical image retrieval task at ImageCLEF 2004-2013.

Authors:  Jayashree Kalpathy-Cramer; Alba García Seco de Herrera; Dina Demner-Fushman; Sameer Antani; Steven Bedrick; Henning Müller
Journal:  Comput Med Imaging Graph       Date:  2014-03-27       Impact factor: 4.790

5.  A novel artificial bee colony algorithm based on internal-feedback strategy for image template matching.

Authors:  Bai Li; Li-Gang Gong; Ya Li
Journal:  ScientificWorldJournal       Date:  2014-04-29

6.  A hybrid cuckoo search algorithm with Nelder Mead method for solving global optimization problems.

Authors:  Ahmed F Ali; Mohamed A Tawhid
Journal:  Springerplus       Date:  2016-04-18
