Literature DB >> 35684834

Using Conventional Cameras as Sensors for Estimating Confidence Intervals for the Speed of Vessels from Single Images.

Jose L Huillca¹, Leandro A F Fernandes¹.

Abstract

In this paper, we describe an image-based approach for estimating the speed of a moving vessel using the wakes that remain on the surface of water after the vessel has passed. The proposed method calculates the speed of the vessel using only one RGB image. In this study, we used the vanishing line of the mean water plane, the camera height concerning the level of the tide, and the intrinsic parameters of the camera to perform geometric rectification on the surface plane of the water. We detected the location of troughs on one of the wake arms and computed the distance between them in the rectified image to estimate the speed of the vessel as a so-called inverse ship wake problem. We used a radar that was designed to monitor ships to validate the proposed method. We used statistical studies to determine the reliability and error propagation of the estimated values throughout the calculation process. The experiments showed that the proposed method produced precise and accurate results that agreed with the actual radar data when using a simple capture device, such as a conventional camera.

Entities: Chemical

Keywords: Kelvin wake; error propagation; metrology; object detection; planar homography; ship wake; vessel speed

Mesh：

Substances：
Water

Year: 2022 PMID： 35684834 PMCID： PMC9185343 DOI： 10.3390/s22114213

Source DB: PubMed Journal: Sensors (Basel) ISSN： 1424-8220 Impact factor: 3.847

1. Introduction

Radar-based electronic navigation systems (ENSs) that keep the pilot informed about the location and speed of nearby vessels have been a significant technological advance for the maritime industry. Unfortunately, radars and other sensors may fail to detect stealth ships and non-metallic targets because they reflect a low amount of radiation. Additionally, many vessels are not equipped with an ENS. As a result, naked eye visibility still plays a key role in making decisions, especially at close range. Going further, projects have been carried out on the creation of autonomous vessels [1,2], in which autonomous navigation makes use of motion flow techniques, such as FeatFlow [3], as auxiliary sources of information. Vessel detection and tracking using computer vision-based systems are convenient methods for measuring vessel speed [4]. A large number of algorithms estimate the vessel speed by analyzing the traces that are left by the vessel using synthetic-aperture radar (SAR) imagery [5,6]. Techniques, such as the CRSN [7], have been used to improve the quality of SAR images and CenterNet++ [8] performs vessel detection. In the work developed by Liu et al. [9], ship wakes were used to detect vessels and to identify the position and direction of the vessel using optical images. Although these techniques provide good results, the SAR and optical images that were considered in the above-mentioned studies were not captured at close range; they were generated by airborne sensors or satellites and could not be obtained from inside an autonomous vessel. Other techniques perform tracking and estimations of speed using image sequences that were taken with a digital camera [10,11,12]. However, video-based techniques require fixed cameras to estimate speed from the relative motion. Therefore, this type of method is limited to applications in which the camera is onshore, since cameras that are placed on autonomous vessels and drones produce additional movement in the optical flow. In a recent study, Huillca and Fernandes [13] presented a semi-automatic method to calculate the speed of a vessel directly from one image that was acquired by a conventional camera. The method is based on projective geometry and estimates vessel speed from an analysis of the Kelvin wake pattern. The key observation is that naval objects leave traces of their movements on the surface of the water and the appearance of these traces is related to speed. Actually, as Lord Kelvin demonstrated in 1887, the wakes that are left by vessels maintaining a constant course and speed can be modeled as a function of speed [14]. The inverse ship wake problem consists of estimating parameters, such as speed, from the observation of wake patterns [15]. The approach that was proposed by Huillca and Fernandes [13] detects features (crests or troughs) of the extracted wave arm and applies the inverse ship wake problem to estimate the speed of the corresponding vessel. However, the technique requires the horizon line to be visible in the image. The curves that are adjusted to the wave arms are also sensitive to noise and the analysis of the results of the study was limited to a few test cases using ground truth that was based on rough estimates of the maximum and minimum speeds that were reached by the observed vessels. This paper presents an extended version of the approach that was proposed by Huillca and Fernandes [13]. The extension included the automatic estimation of the vanishing line, a different approach to identifying the wave arms, the use of the troughs in the closest wave arm to estimate the wavelength, the use of radar to validate the results, the estimation of confidence intervals for the measurements, and the analysis of error propagation throughout the computational chain. Unlike previous work, the use of a more sophisticated vanishing line detection method allowed for the application of the new method even when the line between the sky and the water was not visible. Our new clustering strategy for identifying the two wave arms made our approach less prone to misidentification. In the previous study, the validation was performed using only the speed information that was presented on the Rio de Janeiro Ferry Services website. In this work, we verified the accuracy and reliability of the measurements that were obtained through our method using ground truth data that were collected from a radar that was designed to monitor vessels, including ferries and other types of vessels. We also used a sampling-based method and first-order error propagation [16] to estimate the uncertainty of the estimations that were produced using our technique. We showed that the confidence intervals that were obtained with error propagation were equivalent to those that were computed by sampling, but also had the advantage of allowing for uncertainty estimation when using one image. Our results were consistent and agreed with the values that were obtained by the radar and did not present the limitations of video-based approaches, since all estimates were performed using a single image.

2. Computing Vessel Speed

For the proposed method, it is assumed that the camera is mounted at a given height h above sea level in such a way that the target ship and the traces that are left by the target ship can be observed (see Figure 1). The camera could be onshore, aboard a vessel or on a moving drone. The orientation of the camera in world space is estimated during the process. The speed is computed using the wavelength of the trace information. For each acquired image, the processing steps include: (i) the estimation of the vanishing line of the mean water plane; (ii) the definition of the corners of the region of interest (ROI), including the ship wake; (iii) the identification of the wave arms and troughs that are present in the ROI; and (iv) the estimation of the speed of the vessel using the wavelength.

Figure 1

The pipeline followed to estimate the vessel speed. We intentionally flipped the rectified region of interest (ROI) image to make the vessel travel to the left.

The notation adopted in this article was inspired by Hartley and Zisserman [17] and is shown in Table 1.

Table 1

Notation convention that is used in this article.

Notation	Meaning
I,B	Input and edge image, respectively
Π	Water plane
xi	The i-th point in image space under homogeneous coordinates
Xi	The i-th point in world space under homogeneous coordinates
xxi,yxi	Coordinates of point xi in image space
XXi,YXi,ZXi	Coordinates of point Xi in world space
l	Vector encoding the vanishing line in image space
v→	Vector in image space under homogeneous coordinates
V→	Vector in world space under homogeneous coordinates
M	An m×n matrix
M−1	Inverse of a matrix M
MT	Transpose of a matrix M

2.1. Vanishing Line Estimation

We used the Horizon Line in the Wild (HLW) algorithm [18] to automatically detect the vanishing line of the water body. The HLW algorithm estimated the left- and right-hand endpoints of the vanishing line of the most prominent plane that was observed in the input color image . In our equations, we represented the endpoints using homogeneous coordinates as vectors and , where W is the width of and and are the vertical coordinates of the points that were returned by the HLW algorithm. The vector that encoded the vanishing line of the mean water plane in homogeneous coordinates could be computed as the cross product of vectors and [17]: where A, B, and C are the coefficients of the general equation of the line . It is important to emphasize that the horizon line did not need to be visible for the HLW algorithm to estimate the endpoints of the vanishing line of the mean water plane. For instance, notice in Figure 1 that the horizon line would be behind the mountains. For the RANSAC-based technique that was used by Huillca and Fernandes [13], on the other hand, the horizon line between the sky and the water must be visible because it is detected as the most apparent straight line in the edge image.

2.2. Definition of the Corners of the ROI

The ROI had to include the ship wake. It was defined in as the quadrilateral that resulted from the projection of a rectangular region onto the water surface (see Figure 1). We found the set of corners of the ROI in using the reference corner , the direction of the vessel in the image, the vanishing line of the water (1), the camera calibration matrix , the ROI size (in meters) in 3D space, and the camera height h above sea level. Using techniques such as YOLO [19] and the training model that was used by Breitinger et al. [20] to detect the vessels is straightforward. However, as this was a separate problem that was subject to the quality of the detection technique without the loss of generality, we chose to manually set and in image space in our experiments. The selection condition for was to place this point in image space to the left of the vessel in 3D space. It could be placed either near the bow or near the stern. We set close to the bow for all images in our experiments. The definition of direction in image space was straightforward. It could be defined by tracing a line segment from the stern to the bow or it could be calculated from the edge pixels of the wake as the eigenvector with the largest eigenvalue pointing to the bow. In this work, we extracted metadata from the image files to compute . and were constant values that were defined by the user. The camera height h was given by the construction and was defined as the distance between the camera and the mean water plane in meters. We assumed that the origin of the world coordinate system (see Figure 1) lay on the orthogonal projection of the camera center to the mean water plane , that the X and Y axes spanned , and that the Z axis was parallel to the vector that was normal to . We set and computed its orientation with respect to the world’s frame from the vanishing line . Therefore, even though the camera could move with six degrees of freedom, only one (the height) had to be known a priori. When a camera is mounted on a vessel’s mast, its height h can be calculated from the mast’s height and its relative orientation to the normal vector (computed from the vanishing line). For drones, the vehicle needs to be able to determine its altitude. In our experiments, the camera was mounted on a tripod that was inside a building. In this case, the camera height above the sea level was calculated as the sum of the heights , , , and representing the ground height, floor height, tripod height, and tide height, respectively: where is the floor number on which the camera was mounted. We let be the set of corners of the ROI lying on plane . By tracing a ray from the camera to through point , we computed: where is the direction of the traced ray, , and is a rotation matrix whose columns correspond to the X, Y, and Z axes of the camera in the world’s frame of reference (the red, green, and blue segments leaving in Figure 1, respectively). The columns of were , , and . The unit function normalized the vector to unit length and the up function changed the orientation of when its coordinate was negative. It was necessary to correct the hand of the camera coordinate system by forcing the vector to point upward, as with the normal vector of the mean water plane and the Z axis of the world’s frame of reference (see Figure 1), while and pointed to the right and the front, respectively. , , and were computed by translating by and in directions and on plane : where is the direction of the vessel in world space and is the perpendicular direction that was computed by rotating in through an angle of . Thus, was a constant rotation matrix. Here, was the back projection of (an improper point) onto the world coordinate system and encoded the orthogonal projection onto . Finally, the corners of the ROI in were computed as: where , is the camera matrix, is a identity matrix, and .

2.3. Finding the Wave Arms

We used the edge image of the color image to find the wave arms. To avoid processing the whole image or working with a non-rectangular ROI, the procedure for finding the wave arms considered a small portion of that was defined as the axis-aligned bounding box of the ROI. Thus, in the remainder of this section, all described processing was restricted to that portion of . In this work, we used the Richer Convolutional Features (RCF) algorithm [21] to compute the edge image from . RCF helped to detect the wakes that were left by the vessels by making the waves more visible. Among the edge detection strategies that we tested, RCF was less sensitive to weather conditions and poor natural light. The edge image that was produced by RCF was an intensity image, in which 1 indicated that the pixel had a high chance of being an edge while 0 meant the opposite. We used the Otsu algorithm to find an automatic threshold t to separate the pixels into the two classes, i.e., edge and non-edge, and compute the binary image . We used the k-means algorithm [22] with to differentiate the two wave arms that were present in the wakes that were left by the ships. The image coordinates of the edge pixels were taken as the inputs for the algorithm. As illustrated in Figure 2a, the k-means algorithm was not directly applied to the entire binary image. We divided the portion of that was inside the bounding box of the ROI into vertical partitions , each with a width of pixels and height that was equal to the height of the bounding box. In each partition , we applied the k-means algorithm to obtain the centroid of each wake arm. The k-means points in partition were equal to the centroids that were calculated in , except for the first partition () or when no centroid was detected in . For those cases, the two points were taken as random. The points that were obtained as the centroids defined the samples for the curves that described each wake arm. Figure 2a highlights three consecutive partitions, while Figure 2b shows the detected centroids using green and red for the most distant and closest wake arms, respectively. In our experiments, we set to 5 pixels. According to our experience, the results of the curve fitting process that follows did not differ when a value between 3 and 9 pixels was chosen.

Figure 2

Finding the wave arms: (a) we applied the k-means algorithm to the edge pixels that were included in each partition of the ROI’s bounding box and the partitions each had a width of pixels; (b) the detected wave arms.

2.4. Wavelength and Speed Estimation

To solve the inverse Kelvin wake problem, we needed to calculate the Euclidean distance between at least two consecutive troughs or crests in the world coordinate system. A key observation is that the image of the wave arm that was closest to the camera was the least affected by the turbulence of the wake. In addition, its troughs were less affected by errors when we performed the rectification of the ROI. The steps for performing wavelength and speed estimation are described below.

2.4.1. ROI Image Rectification

The objective of the ROI rectification was to eliminate the projective distortion that was introduced by the camera from the image of the mean water plane, thereby simulating an aerial view of the ROI that was similar to the sketch that is presented in Figure 3. We used the line at infinity of the water plane to remove affine distortion and the camera height above sea level to eliminate projective ambiguity. The line at infinity allowed for the recovery of the related properties of image elements, such as parallelism and the proportion of areas [17].

Figure 3

The Kelvin wake structure, indicating the transverse and divergent components as well as the crests and troughs of the wave arms.

For convenience, we used canonical coefficients to define the general equation of the line at infinity, i.e., in homogeneous coordinates. By taking the vanishing line from (1), the projective transformation that mapped onto was given by: Using , we were able to rectify each point on the mean water plane in the image. In (6), was used to offset the affine matrix , in which , , , and defined a non-singular matrix and was a translation vector. A planar affine transformation has six degrees of freedom, which can be computed using three-point correspondences. We used the correspondence between the corners of the rectified ROI using homogeneous coordinates , , and and points for , where are the corners of the ROI in image space (5). From the correspondences, we defined a linear system of the equation in matrix form: where is a matrix of known values and is the vector of variables. The coefficients of were computed as after solving the system for as the right-hand null space of . We used the singular value decomposition to solve the system by taking (i.e., the vector that was associated with the zero singular value) as the last column of matrix . By construction, the wave arm that was closest to the camera was always as shown below in the ROI image. Therefore, the rectification only needed to be applied to the points that were obtained in Section 2.3 for that wave arm, since we were interested in the red line that is presented in Figure 3. The black dots in Figure 4 represent the rectified set of points that was used to fit the red curve in Figure 2.

Figure 4

The black points are the curve samples and the red line is the smooth curve that was computed by the LOWESS algorithm [23]. Axes in centimeters.

2.4.2. Curve Fitting

As can be seen in the black points in Figure 4, the discrete set of points that was obtained in Section 2.3 could be corrupted by noise, which made it challenging to find the troughs of the wake. We used the locally weighted scatterplot smoothing (LOWESS) algorithm [23] to smooth the digital curve that was extracted from the rectified ROI image. The black points in Figure 4 correspond to the input curve samples in this example, while the red line is the resulting smooth curve that was used to find the troughs. Huillca and Fernandes [13] took the point samples for one of the wave arms and followed a naive approach that used a Savitzky–Golay filter [24] for curve fitting. They performed both procedures in the rectified image of the ROI. According to our experience, the use of the k-means algorithm on the input image domain to obtain the point samples followed by the application of the LOWESS algorithm to the points that were mapped onto the domain of the rectified ROI was much less sensitive to noise.

2.4.3. Wavelength Estimation

The wavelength of the transverse components of the Kelvin wake pattern (see Figure 3) can be estimated using the distance between the successive crests or troughs. From those wavelengths, the speed of a vessel can then be estimated [5]. One finds a set of crests/troughs as the curve maximum or minimum, depending on the direction of the vessel with respect to the camera. Since we had a curve that corresponded to the arm that was closest in the V-shaped pattern, we could find the crests and troughs as follows: When the vessel went to the right in the input image , the curve maximum and minimum corresponded to the troughs and crests of the wave arm; When the vessel went to the left in the input image , the curve maximum and minimum corresponded to the crests and troughs of the wave arm. We avoided the identification of noisy troughs by imposing a minimum horizontal distance of m between valid consecutive troughs and by only extracting two minimums and maximums (depending on the case). The value of was empirically defined based on the observation that the speed is close to 10 knots for a wavelength of approximately 20 m and, as discussed in Section 4, most vessels that leave visible tracks move at higher speeds. We computed the wavelength by replacing D in: where is the Euclidean distance (in meters) between the location of the crests (or troughs) and .

2.4.4. Vessel Speed Estimation

Finally, the speed of the vessel was: were is given by (7) and 9.80665 m/s2 is the acceleration of gravity. Since is the unit of measurement that is used for speed in maritime navigation, the values had to be multiplied by to be converted into .

3. First-Order Error Propagation

The theory of errors [16] provides the expressions that were needed to estimate the standard uncertainty of a measurement U using the standard uncertainties of the experimental values in the dataset . In matrix form, the first-order error propagation of such uncertainties was expressed by: where is the Jacobian matrix of the function that calculated the speed U of a vessel using the method that was described in Section 2 and is the covariance matrix that encoded the uncertainty of the input variables that were used to compute U. In this paper, we assumed the independence of the input variables. Thus, their covariances were zero, with being a diagonal matrix and being the standard deviation of the input variable . The partial derivatives in (10) were taken with respect to the nine variables in . Appendix A presents the expressions that were used to compute those partial derivatives. The computational flow of the U function is illustrated in Figure 5, in which the circles represent the input variables with uncertain values, pentagons represent the input variables that we assumed to have no uncertainty, rectangles represent the intermediary variables, and the rhombus is the estimated speed of the vessel. Altogether, the proposed method has 20 input variables:

Figure 5

Computational chain for estimating vessel speed (rhombus) using experimental variables with (circles) and without (pentagons) uncertainty.

, : The y axis coordinates of the endpoints and of the vanishing line that are estimated for the mean water plane, respectively (Section 2.1); , : The coordinates of the corner of the ROI in the input image ; : The angle that defines the direction of the vessel in the input image ; , , , : The set of heights that is used in (2) to calculate the camera height h above the sea level in meters; , , , , , : The homogeneous coordinates of two adjacent troughs of the wave arm that is closest to the camera, i.e., points and that were used in (7), but represented by pixel coordinates in . These variables are not taken as sources of uncertainty because their rectified counterparts ( and ) naturally include the uncertainty that is propagated from other variables (see Appendix A for details); , , , , : The intrinsic parameters that define the camera calibration matrix . They are the focal length in terms of pixel dimensions in the x and y directions, skew, and the coordinates of the principal point in terms of pixel dimensions, respectively [17]. Recall from Section 2.2 that we extracted metadata from the input image file to compute . In this work, we assumed that the intrinsic parameters of the camera were constant values since it was observed that they do not usually have much influence on the uncertainty of image-based measurements [25]. Section 4 describes how to estimate the uncertainty in .

4. Experiments and Results

The procedures that were described in Section 2 were implemented in Python 2.7.0. Speeds were calculated on a computer that had an Intel Xeon CPU E-2698 v4 with GHz and a Tesla P100-SXM2 video card with 16 GB of VRAM. In the experiments, the images of moving vessels were acquired under natural lighting and different weather conditions. The images were taken using a Nikon D3300 camera with megapixels and were then encoded in JPG format file (any other image format can be used without affecting the results). The lens model that we used was an AF-S DX NIKKOR, with an 18∼55 mm focal length and vibration reduction (VR II) [26]. The resolution of the captured images was pixels. The ROI size lying on plane was intentionally set to m in order to cover the wave arms of vessels that were traveling at more than 10 knots, as the speed of the vessels was rarely less than this threshold within the field of view that was used for the experiments. The camera was mounted in two places. The camera height was approximately and m for images i1–i17 and i18–i23, respectively. The camera height varied according to the tide height at the time the image was taken and the floor of the building. A total of 40 images was obtained, of which 23 were used to analyze the results that are presented in this section. Table 2 describes the conditions under which the images that we used were acquired and Figure 6 shows some of the cropped versions. Special attention was paid to the noisy inputs that we included in our experiments, such as images i18 to i23 (see Figure 6g,h), because those cases included natural noise as they were acquired during a scattered storm or the partially cloudy and windy weather that followed the storm. The remaining 17 of the original 40 images were not considered because they were taken during unfavorable weather conditions (Figure 7a) and low natural lighting (Figure 7b), which prevented the edge detection approach from being successful in the detection of the troughs. Additionally, some of the vessels in those images were merchant ships (Figure 7c). As such, their speed had to be low because they were close to a port area. In all of those cases, at least two troughs of the traces that were left by the vessels could not be distinguished, even by human observers. In Table 2, the tide height was obtained from webpages that freely provide sea conditions [27,28,29]. The noise that was introduced by the wind speed was not considered in these experiments.

Table 2

Information regarding each captured image, where U denotes the speed that was measured by a radar (ground truth), is the speed that was estimated by our approach using the troughs of the wave arms, and and are the absolute and relative errors of estimations, respectively. The subtable summarizes the results of Huillca and Fernandes [13].

Image	Vessel		Time (hh:mm)	Weather	Tide (Meters)	Speed (knots)		Error		Results of [13]
Image	Model	Name	Time (hh:mm)	Weather	Tide (Meters)	U	U^	εa	εr	U^	εr
i1	HSC	Fenix	10:09	Cloudy	0.30	18.2	18.339	0.139	0.008	17.234	0.053
i2	MC25	Apolo	10:12	Cloudy	0.30	20.6	21.533	0.933	0.045	25.467	0.236
i3	MC25	Apolo	10:13	Cloudy	0.30	20.5	20.609	0.109	0.005	26.629	0.299
i4	MC25	Neptuno	10:19	Cloudy	0.50	19.2	19.434	0.234	0.012	21.788	0.135
i5	MC25	Neptuno	10:21	Cloudy	0.50	17.3	16.683	0.617	0.036	20.346	0.176
i6	MC25	Neptuno	10:22	Cloudy	0.50	17.1	16.530	0.570	0.033	27.785	0.625
i7	Other	Escander Amazonas	10:28	Cloudy	0.60	09.2	19.262	10.062	1.094	17.326	0.883
i8	MC25	Missing	10:39	Cloudy	0.70	19.5	19.288	0.212	0.011	26.036	0.335
i9	HSC	Fenix	10:41	Cloudy	0.70	15.6	15.984	0.384	0.025	18.231	0.169
i10	HSC	Fenix	10:43	Cloudy	0.70	16.5	16.611	0.111	0.007	20.133	0.220
i11	MC25	Missing	11:08	Cloudy	0.50	20.4	20.771	0.371	0.018	26.436	0.296
i12	MC25	Missing	11:09	Cloudy	0.50	20.4	20.543	0.143	0.007	23.863	0.170
i13	MC25	Missing	11:12	Cloudy	0.50	17.3	16.846	0.454	0.026	23.259	0.344
i14	MC25	Zeus	11:42	Cloudy	0.70	19.6	20.533	0.933	0.048	26.982	0.377
i15	MC25	Neptuno	11:43	Cloudy	0.70	19.1	19.861	0.761	0.040	19.407	0.016
i16	MC25	Neptuno	12:10	Cloudy	0.90	20.2	22.120	1.920	0.095	25.951	0.285
i17	MC25	Missing	12:13	Cloudy	0.90	20.3	20.638	0.338	0.017	22.254	0.096
i18	MC25	Zeus	16:12	Scattered storm	1.10	18.9	19.617	0.717	0.038	22.086	0.169
i19	MC25	Zeus	16:50	Partly cloudy	0.70	17.5	17.342	0.158	0.009	22.249	0.271
i20	MC25	Zeus	16:51	Partly cloudy	0.70	17.6	17.808	0.208	0.012	19.947	0.133
i21	MC25	Zeus	16:51	Partly cloudy	0.70	17.8	18.947	1.147	0.064	25.107	0.410
i22	MC25	Missing	17:01	Partly cloudy	0.70	20.3	16.490	3.810	0.188	23.448	0.155
i23	MC25	Zeus	17:21	Partly cloudy	0.50	16.8	16.084	0.716	0.043	21.008	0.251

Figure 6

Cropped versions of some of the images that were used in the experiments: images i1, i9, and i10 show HSC passenger vessels; images i10, i13, i18, and i22 show MC25 passenger vessels; image i7 shows another vessel model. (a) Image i1; (b) Image i2; (c) Image i5; (d) Image i7; (e) Image i13; (f) Image i16; (g) Image i18; (h) Image i22; and (i) Image i17.

Figure 7

Cropped versions of some of the images in which our approach could not detect wave arms that had at least two well-defined troughs in the edge image.

A radar that was designed to monitor vessels was used as a resource to validate the proposed method. The radar was a series of X- and S-bands with a 19-inch LCD screen [30]. The radar screen information was captured using a smartphone camera. The radar screen included the name and speed of the tracked vessel. The images that were taken of the radar screen and the moving vessel were acquired at approximately the same time. Table 2 shows the identification of the vessels (columns “Model” and “Name”), the time at which both pictures were taken (column “Time”), and the speed that was measured by the radar (column “U”). Of the 23 images that were used, 22 were of six different passenger vessels (models HSC and MC25) and one was a tugboat (image i7, Figure 6d). As we had limited access to the radar (upon authorization), it was necessary to restrict the image sections and only take 40 images. Additionally, later access was not allowed due to the COVID-19 pandemic. Even so, the set of images that was captured led to exciting results that prove the technique to be promising. Considering that the data that were used as the input (e.g., the y coordinate of the endpoints of the vanishing line, the camera height, the reference corner of the ROI, and direction of the vessel in image space) were subject to errors, it was expected that the estimated speeds would also have uncertainties. By comparing the computed values to the speeds that were measured by the radar, it was possible to develop an idea of the accuracy and precision of the proposed technique. In Section 4.1, we analyze the relative error of our estimations and compare the quality of the estimates that were made using our technique to the approach that was presented by Huillca and Fernandes [13]. Section 4.2 and Section 4.3 present an analysis of the confidence intervals that were computed using sampling and first-order error propagation. In Section 4.4, Section 4.5 and Section 4.6, we discuss the influence of each experimental variable on the uncertainty of the estimated speed, the resilience to changes in the resolution of the input image, and the variations in JPG compression rate, respectively. For Section 4.2, Section 4.3 and Section 4.4, the input uncertainties were estimated as follows: , : The standard deviations of and were estimated using: where is the mean value of the differences between the y coordinates that were observed on endpoints that were returned by HLW and the endpoints of vanishing lines that were manually identified by us in a dataset comprising images, which led to and pixels; , : We also used the experimental samples to set and pixels. The samples were obtained by repeatedly selecting the first corner of the ROI in the chosen image to serve as a reference. The standard deviations for the and coordinates were calculated using (12) for the coordinates of the selected points; : The same reference image was used to indicate the orientation of a vessel, which produced angular samples that were used to compute radians; , , , : The standard deviations of the tripod height and the floor height were empirically set to and m, respectively, by assuming a conservative uncertainty for the measurement of those input variables. We used Google Maps to measure the ground height and set its uncertainty to m based on the variations we observed in this tool. We estimated m by applying (12) to a set of average tidal heights that were computed from observations in [27,28,29]. We assumed that the locations and of the troughs that were detected in the rectified ROI images carried the uncertainties that were introduced by the input variables, but the detection process itself did not introduce any new uncertainties (see Figure 5 and Appendix A).

4.1. Analysis of Relative Error

Relative error indicates the proportion of the absolute error of an estimated value with regard to the true value U. We used to determine the accuracy of our approach. In Table 2, the absolute error is given in . was calculated by applying the proposed method, while U was measured by the radar. We used the troughs of the closest wave arms to estimate the values that are presented in Table 2. Those troughs were the least affected by the noise that was introduced by the vessel’s turbulence and the distortion of the rectified elements that were not in the actual mean water plane. According to Table 2, the relative error was below for ten images, within the interval in six cases, within the range in three images, and between and in three cases. Only image i7 (Figure 6d) had a relative error that surpassed the true measure (). The explanation for this behavior is that the trace that was left by the tugboat was weak because it was traveling at and we set the parameter for the minimum horizontal distance between valid consecutive troughs as 20 m, which limited the estimated speeds to a minimum of . The mean and median relative errors that are presented in Table 2 were and , respectively. The mean relative error was clearly affected by the result in image i7. Observing the robust statistics that were provided by the median, we could conclude that the proposed method was accurate. To the best of our knowledge, the only work that has presented an approach for the estimation of the speed of moving vessels using single color images was developed by Huillca and Fernandes [13]. The subtable in Table 2 summarizes the results that were obtained by their approach after we replaced their RANSAC-based vanishing line detection scheme with the HLW algorithm. Otherwise, their method would not have been applicable to most of the images in Table 2. Unfortunately, it was not possible to perform a comparison between our approach and techniques that use image sequences, such as [10,11], because their implementation was not available and the descriptions that are presented in the articles proved to be insufficient for proper reproduction. In any case, such techniques cannot be used for video from cameras that are onboard vessels, which limits the scope of their application. Observing in Table 2, our new technique outperformed the previous approach for 20 out of the 23 images. For two cases in which our relative errors were higher, the difference in the errors was only on image i15 and on image i22. The third case was image i7, which, as previously mentioned, was not correctly handled by our approach because of the parameter that was set.

4.2. Analysis of Confidence Intervals Estimated Using the Samples

In practice, each image that is presented in Table 2 provided one sample for which we could estimate speed. One way to assess the accuracy of the technique was to analyze the variations in the speed estimates that were obtained for the same vessel when considering a set of images that were captured under similar conditions as the input. To simulate several image captures of the same ship, we introduced small variations in the input variables that were considered to be sources of uncertainty for each captured image. We used the standard deviations that were assumed in the error propagation model (11) to produce Gaussian-distributed variations in the original set of input values for each image in Table 2, thereby generating samples from which we could compute speeds and the corresponding confidence intervals: where is the mean speed of the sample, is the standard deviation of the sample, is a t-Student variable with degrees of freedom, and is the confidence level. Figure 8a shows the confidence intervals with that were calculated for the vessels using sampling (Table 2). The narrowest confidence intervals were for images i10 and i3, which were and wide, respectively. Notice that the mean speed was close to the true speed U in most cases and was included within the confidence interval in 16 out of the 23 images. Images i7, i16, and i22 were the cases with the most considerable distances between U and , whose confidence intervals did not include the true speed. The problem with image i7 was discussed in Section 4.1. For image i16 (Figure 6f), the location of the troughs was affected by the weather conditions. Notice the presence of more capillary wakes that were due to wind in this image than in the other images in Figure 6. For image i22 (Figure 6h), low natural lighting made the trail of the vessel very blurred. For the remaining four cases in which the true speed was outside of the confidence interval (images i6, i11, i14, and i21), the distance to the limits of the interval was negligible and ranged from to .

Figure 8

Confidence intervals () that were computed using (a) sampling and (b) first-order error propagation. Images sorted by vessel speed.

Including all cases that are presented in Figure 8a, the largest confidence intervals were and for images i1 and i22, respectively. The median interval was only . The variations in the values that were reported by the radar for the three consecutive speed measurements of vessels Zeus (images i19 to i21), Neptune (images i4 to i6), and Missing (images i11 to i13) were , , and , respectively. Thus, we could conclude that the proposed approach was accurate. However, we cannot make a strong statement in this regard because each interval was calculated using samples that were generated from one image.

4.3. Analysis of Confidence Intervals Estimated Using Error Propagation

In this section, we analyze the confidence intervals that were produced by the first-order error propagation approach, as discussed in Section 3, and compare them to the intervals that were produced using sampling. First-order error propagation may provide the correct Gaussian uncertainty for the resulting estimations when the uncertainty of the input variables follows Gaussian distribution and the process for computing the resulting values is linear. Otherwise, it provides a first-order approximation of the error [16]. To verify which is the case for our approach, we used the Shapiro and Wilk [31] test to check whether the resulting samples that were produced in Section 4.2 fit Gaussian distribution. The null hypothesis of this test was: the data are normally distributed when . For , images i7, i16, i19, i22, and i23 had -values that suggested evidence of non-normality. Therefore, we could only expect an approximation from the first-order error propagation of these cases. The ratio between the standard deviations that were computed using sampling and propagation showed that the first-order error propagation approach was equivalent to and slightly more conservative than the sampling-based approach. The only exceptions were for images i1, i22, and i23. In 16 cases, . This was reflected in the results that are presented in Figure 8b, in which only two confidence intervals clearly did not include the true speeds (images i7 and i22) and four almost included them (images i2, i6, i9, and i21). Notice that images i7, i22, and i23 did not pass the normality test. The narrowest intervals in Figure 8b were and in width, while the widest were (image i17) and knots (image i15). Using first-order error propagation, the mean width of the confidence intervals was and the median was . With error propagation, it was easy to detect cases that had more considerable uncertainty for the calculated speeds because this approach does not require several samples.

4.4. Impact of the Uncertainty of Each Input Variable

The impact of each input variable on the uncertainty of the estimated speed could be assessed using the error propagation model (9). The absolute contribution of any input variable was obtained using a covariance matrix (11), for which the only non-zero elements were those related to in the main diagonal. The relative impact of was obtained by dividing its absolute impact by the sum of the absolute impacts of all input variables. For this analysis, we grouped the input variables into four groups (see the columns of the heatmap tables in Figure 9): the first group included and ; the second group represented and included and ; the third group was and only included the angle ; and the last group included the variables , , , and , which were used to compute the camera height h.

Figure 9

The relative impact of the input parameters on the computed speeds, assuming that (left) all input variables were independent and had the same uncertainty or (right) had the uncertainty that was estimated for the experiments: represents the vanishing line; is the reference corner of the ROI; is the direction of the vessel; and h is the camera height. Stronger shades of blue indicate greater relative impact.

Figure 9 (left) shows the relative impact of each group of input variables, assuming that they all had the same uncertainty ( for all ) and were independent. Taking this as a premise, the largest sources of uncertainty for the estimated speeds were the parameters of and . One possible explanation is that and were the variables that were used to calculate the first point in the world coordinate system and was used to calculate the remaining corners. Furthermore, the ROI coordinates in image space came from the world coordinate system and they directly fed the uncertainties into the matrix . Figure 9 (right) shows the scenario illustrating the results that were obtained by the estimated uncertainties of the input variables for the proposed method. It can be seen that the most significant source of uncertainty was the camera height h, which demonstrates the importance of this parameter in breaking projective ambiguity and, consequently, the correct scale of the estimates that are produced. In the particular case of our experiments, it was necessary to assume since we observed inconsistent height readings at close points on the map.

4.5. Resilience to Variations in Resolution

In this experiment, the tests were carried out with scaled versions of the original image. This experiment simulated the use of lower resolution cameras and the speed estimation of more distant vessels. Here, we discuss the results that assumed scaling factors of , , and . In each case, the camera calibration matrix and the location of the reference corner of the ROI were transformed accordingly. Furthermore, the corresponding edge image and the new y coordinates for the endpoints of the vanishing line were obtained by applying the RCF and HLW algorithms to the scaled versions of the input images. The camera height h and the direction of the vessel did not change. Table 3 summarizes the speed and relative error that were computed for each input image, in which denotes the scaling factor, is the original scale, and U is the true speed value that was measured by the radar.

Table 3

Variation in the estimated speeds as a function of image resolution: U is the speed that was measured by the radar; and are the speed that was estimated by our approach and the relative error that was obtained using an input image at a scale of s, respectively; and for the original images.

Image	Speed (knots)				Relative Error
Image	U	U^1.00	U^0.50	U^0.25	εr1.00	εr0.50	εr0.25
i1	18.2	18.34	18.75	–	0.01	0.03	–
i2	20.6	21.53	22.39	21.82	0.05	0.09	0.06
i3	20.5	20.61	23.53	13.64	0.01	0.15	0.33
i4	19.2	19.43	20.71	19.36	0.01	0.08	0.01
i5	17.3	16.68	19.10	13.96	0.04	0.10	0.19
i6	17.1	16.53	17.93	17.86	0.03	0.05	0.04
i7	09.2	19.26	–	–	1.09	–	–
i8	19.5	19.29	19.30	21.75	0.01	0.01	0.12
i9	15.6	15.98	19.38	–	0.02	0.24	–
i10	16.5	16.61	16.15	15.38	0.01	0.02	0.07
i11	20.4	20.77	22.24	6.40	0.02	0.09	0.69
i12	20.4	20.54	13.32	10.45	0.01	0.35	0.49
i13	17.3	16.85	18.78	–	0.03	0.09	–
i14	19.6	20.53	19.50	–	0.05	0.01	–
i15	19.1	19.86	11.08	–	0.04	0.42	–
i16	20.2	22.12	20.09	–	0.10	0.01	–
i17	20.3	20.64	18.95	–	0.02	0.07	–
i18	18.9	19.62	22.07	14.48	0.04	0.17	0.23
i19	17.5	17.34	21.46	18.58	0.01	0.23	0.06
i20	17.6	17.81	19.24	10.54	0.01	0.09	0.40
i21	17.8	18.95	–	–	0.06	–	–
i22	20.3	16.49	10.22	14.99	0.19	0.50	0.26
i23	16.8	16.08	17.09	–	0.04	0.02	–

Recall that the resolution of the original images () was pixels. However, it is important to notice that, except for the detection of the vanishing line, all visual information that was used by our approach was within the ROI in image space. The average resolution of the axis-aligned bounding box of the ROI at was only pixels. For the , , and scaled images, the average resolutions of the ROI were , , and pixels, respectively. According to Table 3, most of the speeds that were calculated with a scale of were close to the true speed, the relative errors were below for 13 images, and there were eight cases between and . Image i16 was the least affected by the change in resolution, which presented a degradation of only of the relative error compared to the original image. It was only not possible to estimate speeds for images i7 and i21. Among those that resulted in measurements, the worst case was image i12 with a degradation in the relative error. The mean and median degradations were and , respectively. In total, it was not possible to estimate speeds using of the images with a scale of . The relative errors for the remaining cases were below for seven images and there were six cases between and . The mean relative error for images with a scale of was , while the mean and median degradations of the relative error concerning the original images were and , respectively. These results showed that the resolution of the ROI and, hence, the amount of information that was available to produce a quality edge image was critical to the performance of the proposed approach, since the edge image was used to provide the visual clues for the identification of the wave arms. In the practical use of this technique, automatic zooming could be used to increase the resolution of the ROI.

4.6. Resilience to Variations in JPG Compression Rate

In the last experiment, we analyzed the variation in estimated speeds as we changed the JPG image compression level. For each original image that was stored at 100% quality, we created copies at 90%, 75%, and 50% quality and applied the approach that was described in Section 2. In Table 4 it is possible to observe that the estimated speed did not change much compared to the lower resolution counterparts of the same image. Surprisingly, in images i1, i7, i14, i15, i16, i21, and i22, there was a decrease of 1 to 4% in the relative error of the estimates for images with 90%, 75%, or 50% quality. The biggest increase in relative error was 6% for image i18 at 75% quality. As mentioned before, image i18 (Figure 6g) was one of the noisy images that was captured under a scattered storm. In another 20 cases (images i3, i4, i5, i7, i13, i15, i17, i18, i20, and i22), the increase ranged from 1 to 5%. In 32 out of the 69 cases, the increase or decrease in the relative error was less than 1%.

Table 4

Variation in estimated speeds as a function of image compression: U is the speed that was measured by the radar; and are the speed that was estimated by our approach and the relative error of using an image at quality compression q, respectively; and for the original images.

Image	Speed (knots)					Relative Error
Image	U	U^100%	U^90%	U^75%	U^50%	εq100%	εq90%	εq75%	εq50%
i1	18.2	18.34	18.21	18.21	18.21	0.0077	0.0005	0.0005	0.0005
i2	20.6	21.52	21.48	21.48	21.48	0.0447	0.0427	0.0427	0.0427
i3	20.5	20.55	20.52	20.42	20.00	0.0024	0.0010	0.0039	0.0244
i4	19.2	19.63	19.93	19.93	20.27	0.0224	0.0380	0.0380	0.0557
i5	17.3	16.68	16.34	16.34	16.34	0.0358	0.0555	0.0555	0.0555
i6	17.1	16.53	16.54	16.54	16.54	0.0333	0.0327	0.0327	0.0327
i7	9.2	19.26	19.32	19.32	19.12	1.0935	1.1000	1.1000	1.0783
i8	19.5	19.29	19.28	19.28	19.28	0.0108	0.0113	0.0113	0.0113
i9	15.6	16.07	16.09	16.09	16.09	0.0301	0.0314	0.0314	0.0314
i10	16.5	16.61	16.55	16.55	16.55	0.0067	0.0030	0.0030	0.0030
i11	20.4	20.77	20.83	20.83	20.83	0.0181	0.0211	0.0211	0.0211
i12	20.4	21.04	21.10	21.10	21.03	0.0314	0.0343	0.0343	0.0309
i13	17.3	16.85	16.82	16.00	15.99	0.0260	0.0277	0.0751	0.0757
i14	19.6	20.38	20.17	20.21	20.21	0.0398	0.0291	0.0311	0.0311
i15	19.1	19.86	19.70	19.73	19.70	0.0398	0.0314	0.0330	0.0314
i16	20.2	21.42	21.22	21.76	21.22	0.0604	0.0505	0.0772	0.0505
i17	20.3	20.64	20.62	20.54	20.82	0.0167	0.0158	0.0118	0.0256
i18	18.9	19.62	20.29	20.80	20.61	0.0381	0.0735	0.1005	0.0905
i19	17.5	17.34	17.40	17.40	17.40	0.0091	0.0057	0.0057	0.0057
i20	17.6	17.72	16.95	17.87	17.12	0.0068	0.0369	0.0153	0.0273
i21	17.8	18.99	18.36	18.36	18.36	0.0669	0.0315	0.0315	0.0315
i22	20.3	17.02	17.25	17.25	6.80	0.1616	0.1502	0.1502	0.6650
i23	16.8	16.96	16.99	16.99	16.99	0.0095	0.0113	0.0113	0.0113

The results that were presented in this section suggested that the proposed approach had a good resilience to the compression of the input image. We believe that the reason for this is the robustness of the edge detection technique that was used in our implementation.

5. Conclusions and Future Works

We presented a method for the estimation of vessel speed using single perspective projection images. The approach uses geometric constraints to remove perspective distortion from the images of traces that were left by a moving vessel and uses curve fitting and peak detection to identify troughs in the wave arms and natural constraints in components of Kelvin wakes in order to compute vessel speed. We validated the measurements that were produced by our approach using the speeds that were obtained by a radar. The quality of the results was verified by the application of a statistical analysis on the estimated speeds. We also used error propagation along the computational chain to provide reliable confidence intervals that provided a notion of the quality of the speeds that were estimated from a single image and presented a study that could identify the set of input parameters that had more impact on the uncertainty of the estimated speeds. The statistical analysis revealed that the estimated speeds were accurate and precise. We believe that our algorithm could be used by autonomous vessels and for maritime surveillance using drones and smart lighthouses. In order to consider the use of our technique in real situations, it is necessary to draw some recommendations: Lighting conditions affect edge detection and the detection of wave arms. In our experiments, we had no problems in daylight, but it was not always possible to process images that were captured at dawn or dusk and our solution cannot be applied at night. The same applies to rain and fog; Due to geographical restrictions in our experiments, we used images of the port and starboard of vessels that were traveling in the left and right directions in front of the camera and moving along a linear course at a (supposedly) constant speed. However, we believe that our approach is robust to variations in camera orientation since it was possible to see the troughs in the wake, even at the grazing angle; As demonstrated in our experiments, well-defined capillary wakes due to wind and, possibly, those generated by nearby vessels may affect the Kelvin wake pattern. However, we believe that this is a problem that could be overcome by the detection of crossing wakes; Since this method is to be applied to single images, the use of video could provide dozens of independent measurements per second, which could be combined to reduce error or eliminate spurious estimates; Although we did not try this in our experiments, pre-processing the images to increase contrast could help in the detection of the wakes of slower vessels. Unfortunately, the radar that we used could not automatically display information on the speed of small vessels, e.g., sailboats and fishing boats. This is because small vessels do not have to be equipped with an automatic identification system (AIS), which allows sensors to display the ship’s speed information. In addition, the social isolation that was imposed by the COVID-19 pandemic prevented us from having access to the radar to expand data acquisition. Even so, we were able to obtain speed information from several medium-sized vessels. The results showed that our approach was robust. It was validated by considering the measurements from standard nautical equipment and we believe that it could be applied to any vessel that leaves distinguishable wake patterns. As a direction for future work, we point to the investigation of the influence of climatic conditions to better analyze the few cases in which the results were not consistent. For this analysis, images must be systematically captured at different times of day, under various lighting conditions, in different weather conditions, at different wind speeds, and in all four seasons of the year. So, ideally, systematic captures need to take place over at least one year in order to obtain a wide range of image conditions. Another direction for future work is to extend the analyses that were presented in this paper through the application of the ISO Guide to the Expression of Uncertainty of Measurement (GUM) [32]. The testbed implementation of our approach took approximately 12 s to process each image. Most of that time was used to compute the edge image using the RCF algorithm. Faster methods for calculating the edge images proved ineffective in enhancing the ship wakes. So, we aim to optimize this stage to calculate the vessel speed in real time. We are also exploring ways to work with super-resolution images in order to improve the quality of the information being included in the ROI and ways to use a sequence of images to calculate the vessel speed in each image and analyze the resulting speeds, although the central idea of this work was to use only one image. Our implementation will be made available after the publication of this paper.

3 in total

1. Richer Convolutional Features for Edge Detection.

Authors: Yun Liu; Ming-Ming Cheng; Xiaowei Hu; Jia-Wang Bian; Le Zhang; Xiang Bai; Jinhui Tang
Journal: IEEE Trans Pattern Anal Mach Intell Date: 2018-10-31 Impact factor: 6.226

2. Vessel Detection and Tracking Method Based on Video Surveillance.

Authors: Natalia Wawrzyniak; Tomasz Hyla; Adrian Popik
Journal: Sensors (Basel) Date: 2019-11-28 Impact factor: 3.576

3 in total