We describe a method for 3D reconstruction of plant seed surfaces, focusing on small seeds with diameters as small as 200 μm. The method considers robotized systems allowing single seed handling in order to rotate a single seed in front of a camera. Even though such systems feature high position repeatability, at sub-millimeter object scales, camera pose variations have to be compensated. We do this by robustly estimating the tool center point from each acquired image. 3D reconstruction can then be performed by a simple shape-from-silhouette approach. In experiments we investigate runtimes, theoretically achievable accuracy, experimentally achieved accuracy, and show as a proof of principle that the proposed method is well sufficient for 3D seed phenotyping purposes.
We describe a method for 3D reconstruction of plant seed surfaces, focusing on small seeds with diameters as small as 200 μm. The method considers robotized systems allowing single seed handling in order to rotate a single seed in front of a camera. Even though such systems feature high position repeatability, at sub-millimeter object scales, camera pose variations have to be compensated. We do this by robustly estimating the tool center point from each acquired image. 3D reconstruction can then be performed by a simple shape-from-silhouette approach. In experiments we investigate runtimes, theoretically achievable accuracy, experimentally achieved accuracy, and show as a proof of principle that the proposed method is well sufficient for 3D seed phenotyping purposes.
Making image analysis methods available for plant phenotyping applications is currently a driving force in plant sciences (Spalding and Miller, 2013). In many such applications the absence of suitable image processing is even a bottleneck (Minervini et al., 2015). More than 100 specialized methods (Lobet et al., 2013) and software packages are available for image-based analysis of different plant parts, e.g., fruit shape (Brewer et al., 2006), single or multiple leaves (Bylesjö et al., 2008; Weight et al., 2008; Alenya et al., 2011; De Vylder et al., 2011; Wallenberg et al., 2011; Dellen et al., 2015; Müller-Linow et al., 2015; Pape and Klukas, 2015), hypocotyl and seedlings (Koenderink et al., 2009; Wang et al., 2009; Silva et al., 2013; Golbach et al., 2015), shoot (Augustin et al., 2015; Santos and Rodrigues, 2015; Pound et al., 2016), rosettes (Arvidsson et al., 2011; Aksoy et al., 2015) and many more. Such analysis tools are needed in robotic imaging platforms for high-throughput plant phenotyping (Granier et al., 2006; Jansen et al., 2009; Hartmann et al., 2011; Nagel et al., 2012; van der Heijden et al., 2012; Fahlgren et al., 2015), but also in affordable systems (Tsaftaris and Noutsos, 2009; Minervini et al., 2014; Santos and Rodrigues, 2015).Plant seed phenotyping is needed by seed banks for quality management e.g., concerning breeding purposes, linking to germination rate or plant growth. For this, 2D scanning is a popular, affordable technique (Herridge et al., 2011; Tanabata et al., 2012; Moore et al., 2013; Whan et al., 2014). Several commercial software packages are available for seed investigations using flat-bed scanners (e.g., Regent Instruments, 2000; Next Instruments, 2015). It has been applied to different seed types, like Arabidopsis, soybean, barley, or rice. Typically parameters like width, length, or area are calculated from the 2D images, but also more complex shape measures like Fourier descriptors (Iwata and Ukai, 2002; Iwata et al., 2010).However, to the best of our knowledge, no affordable 3D imaging technique has been presented so far designed for seed measurements. Correspondence-based techniques (Quan et al., 2006; Paproki et al., 2012; Pound et al., 2014, 2016; Santos and Rodrigues, 2015) reconstructing 3D models from multiple images, or other low-cost techniques like laser scanning or the Kinect can be used for 3D whole plant reconstruction (Paulus et al., 2014) or root systems in transparent gel (Fang et al., 2009). However, such techniques are not suitable for much smaller objects like seeds of rapeseed plants (~2 mm diameter) or even Arabidopsis seeds (~0.2–0.4 mm length).Here we investigate volume carving, a well-known shape-from-silhouette technique (Martin and Aggarwal, 1983; Potmesil, 1987; Laurentini, 1994), for 3D seed shape reconstruction. It is a fast, reliable, and simple but robust method, having been used in plant phenotyping before, e.g., 3D seedling reconstruction (Koenderink et al., 2009; Golbach et al., 2015) or root system investigations (Clark et al., 2011; Zheng et al., 2011; Topp et al., 2013). Depending on the selected viewpoints it approximates the convex hull of an object or reconstructs even valleys and saddle-points, but cannot reconstruct true concavities. Most seeds are, however, relatively smooth, convex objects. For the seed types investigated here (Arabidopsis, barley, and maize, see Figure 1), true concavities seem to be of low relevance for volume estimation. For non-smooth seeds, like e.g., seeds of the plant parasites Phelipanche aegyptiaca, or Orobanche cernua the proposed method may be less suitable.
Figure 1
Overview of the reconstruction method. (A) Image acquisition from multiple viewing angles. (B) One of the acquired gray value images. (C) Mask image. (D) Estimation of tool center point (TCP). (E) Estimate shape from silhouette by volume carving. (F) Surface of reconstructed volume. (G) Tool removed from volume: seed red, tool blue.
Overview of the reconstruction method. (A) Image acquisition from multiple viewing angles. (B) One of the acquired gray value images. (C) Mask image. (D) Estimation of tool center point (TCP). (E) Estimate shape from silhouette by volume carving. (F) Surface of reconstructed volume. (G) Tool removed from volume: seed red, tool blue.This paper is an extension of our conference publication (Roussel et al., 2015), thus theory (Section 2 and 3), and some experiments from Section 4 are mainly repeated from there. We extend the theory by an accuracy check and iterative camera position correction procedure, and the experiments by a numerical and experimental investigation of achievable accuracy vs. number of images in Sections 4.2 and 4.3. Further we updated references and discussion.
2. Reconstructing seed shape from silhouettes
Aiming at relatively simple, mostly convex seed shapes, target voxel resolutions needed to describe such shapes are comparably low—as we will show in the experiments below, see Section 4. Therefore, for this study, it is sufficient to apply one of the most basic volume carving approaches.We get the intrinsic camera matrix K (Hartley and Zisserman, 2004) and the distance between the origin of our working volume and the camera center from calibration (cmp. Section 3.2). The origin of the working volume is selected to be the tool center point (TCP) of the robot system handling the seeds, i.e., rotating them in front of the camera for imaging (cmp. Section 3.1).We acquire N images, showing a seed under (equidistantly spaced) rotation angles α where i ∈ {1, …, N}, see Figure 1. Rotation is around the vertical axis through the TCP, being parallel to the y-axis of the camera. We segment by gray-value thresholding each image into a binary mask M being one at the foreground, i.e., seed and tool tip, and zero at background locations. Small objects like noise are removed and small holes (e.g., the reflection of the tool) filled.For each image and thus segmentation mask we calculate the homogeneous camera projection matrix P, from the rotation angle α by
where R is the rotation matrix corresponding to the given angle α, and translation vector is calculated using the distance of the world origin to the camera center, also known from calibration (see e.g., Hartley and Zisserman, 2004). By this, the world coordinate frame rotates with the object, i.e., the seed.We define an equidistantly spaced, cubic voxel grid around the world origin, being large enough to contain the seed. The thus defined working volume depends on the seed type. For Arabidopsis we use (1 mm)3, for rapeseed (2.9 mm)3, and for barley and maize (13 mm)3.Each voxel center with homogeneous world coordinates is projected to a point in each mask M by
If is projected to the background region of at least one of the N masks M, then this voxel does not belong to the foreground object and its value is set to 0, i.e.,
Thus, if a voxel belongs to the foreground object, its value is set to 1.When higher voxel resolution is desired, and thus runtimes increase, parallelization of the carving algorithm (Brenscheidt, 2014) is feasible (see Section 4.1). Even higher resolutions become available on current desktop computer hardware, when hierarchically representing the voxel grid, e.g., as an octree (Szeliski, 1993; Klodt and Cremers, 2015).One of the main drawbacks of this simple carving algorithm is its sensitivity to imprecise external camera calibration. When a mask M is misaligned and thus does not well overlap with the “true” object volume, the non-overlapping parts are deleted from the volume without further testing or corrections. We therefore apply an image-based camera pose calibration step, as described next.
2.1. Correcting camera pose
Methods not adapting camera pose by estimating extrinsic parameters from the acquired images are known to be particularly sensitive to (extrinsic) calibration errors, thereby requiring precise positioning of the cameras (see e.g., Yezzi and Soatto, 2003). For relatively large objects in the multiple centimeter-range, say 20 cm long and filling most of an image, and typical pixel resolutions, say 2000 × 2000, a pixel covers an object area 0.1 × 0.1 mm2. Position repeatability of industrial-grade robotic systems, typically ≤ 20 μm and ≤ 0.05° (Denso Robotics Europe, 2015), is therefore high enough for precise reconstruction. However, for objects being few millimeter in size or even in the sub-millimeter range additional care has to be taken. The mathematical TCP coordinates known to the robot control software may not coincide precisely with the physical TCP at the tool tip, due to mechanical calibration inaccuracies, wear and tear, or small deformations of the tool. In our case, instead of being at a fixed location in the camera images, the TCP moves on a more or less reproducible, elliptic trajectory of up to 200 μm diameter, varying with room temperature.Before projecting the voxels to the mask images, we therefore adapt projection matrices P. If a suitable non-changing target moving with the TCP is visible in all images, image registration can be done using simple normalized cross-correlation (see Figure 2).
Figure 2
Example provided together with source code as supplemental material. Imaging is done using affordable hardware, i.e., a usual SLR camera (Nikon D7000, AF-S Nikkor 16–85 mm 1:3.5–5.6 GED lens at f = 85 mm) and a motorized turntable (Steinmeyer DT130-360°-SM01) for rotating the seed. As tool keeping the seed above the turntable's plane, we use a cut-off ball-pen-tip. A smaller tool, like in our robotic setup, allows for better seed visibility and reconstruction, however it is not easy to build. Even though the SLR is mounted on a sturdy tripod and released by a remote control, the camera center moves from image to image. In this setup, the base of the tool is visible in each image and can be used as target for correlation-based image registration. From the calculated image shifts, projection matrices P are adapted accordingly. See provided source code for more details. (A) SLR image showing a maize seed, (B,C) cropped 760×564-image with 293×100 target for cross correlation indicated in red, (D,E) mask image with borders of the user-defined reconstruction volume projected back into the image in blue, (F) reconstructed seed and tool, (G) reconstructed seed with tool removed, (H,I), reconstructed seed from a more bottom and more top view.
Example provided together with source code as supplemental material. Imaging is done using affordable hardware, i.e., a usual SLR camera (Nikon D7000, AF-S Nikkor 16–85 mm 1:3.5–5.6 GED lens at f = 85 mm) and a motorized turntable (Steinmeyer DT130-360°-SM01) for rotating the seed. As tool keeping the seed above the turntable's plane, we use a cut-off ball-pen-tip. A smaller tool, like in our robotic setup, allows for better seed visibility and reconstruction, however it is not easy to build. Even though the SLR is mounted on a sturdy tripod and released by a remote control, the camera center moves from image to image. In this setup, the base of the tool is visible in each image and can be used as target for correlation-based image registration. From the calculated image shifts, projection matrices P are adapted accordingly. See provided source code for more details. (A) SLR image showing a maize seed, (B,C) cropped 760×564-image with 293×100 target for cross correlation indicated in red, (D,E) mask image with borders of the user-defined reconstruction volume projected back into the image in blue, (F) reconstructed seed and tool, (G) reconstructed seed with tool removed, (H,I), reconstructed seed from a more bottom and more top view.In our robotic application, in order to adapt projection matrices P, the truncated cone shape of the gripping tool has to be found, see Figure 3. As larger seeds may partly occlude the tool tip, we search for a region of the tool being reliably visible in the images. The tool enters the image vertically from below and becomes smaller in diameter toward the true TCP, being the center point of the very tip of the tool. As we can robustly find the tool's left and right edges, we apply a simple and very fast procedure. We calculate the visible width of the tool line by line starting at the bottom of the image, moving upwards, i.e., in negative y-direction. We iterate while the width decreases and is larger than the minimum tool width (being at the tip). The thus reached y-coordinate is taken as first estimate of the TCP y-coordinate yTCP. A reliable estimate of the TCP's x-coordinate xTCP is established as the mean of all found left and right edge x-positions. As the tool tip may be partly occluded by the seed, yTCP needs refinement. For this the left and right tool edges are independently tracked further until the narrowest point is reached, i.e., the rightmost point of the left edge, and the leftmost point of the right edge. The smallest y-value (highest point) of the two points is taken as new yTCP.
Figure 3
Illustration of the extrinsic camera calibration correction. (A–C) Images of the same barley seed taken from different angles. (D) Mask image generated from (B). (E) Steps to find the TCP: (1) find edges of gripping tool (red lines), stop when lines diverge (blue dotted line). (2) xTCP is average of middle between found edge positions (red dotted line). (3) Trace edges further as long as they come closer to xTCP (yellow lines). (4) Top most position is yTCP. The found TCP is indicated by a yellow cross.
Illustration of the extrinsic camera calibration correction. (A–C) Images of the same barley seed taken from different angles. (D) Mask image generated from (B). (E) Steps to find the TCP: (1) find edges of gripping tool (red lines), stop when lines diverge (blue dotted line). (2) xTCP is average of middle between found edge positions (red dotted line). (3) Trace edges further as long as they come closer to xTCP (yellow lines). (4) Top most position is yTCP. The found TCP is indicated by a yellow cross.For small seeds like Arabidopsis this procedure works reliably, as the seeds are too small to occlude the whole tool tip in an image. For larger seeds, we use the observation that the TCP's elliptic trajectory results in its y-coordinates to describe a sinusoidal curve over the rotation angle. We therefore robustly fit a sin-curve to the y-coordinates and correct outliers according to the fit result.For such small objects, the optical lens setup (cmp. Section 3) features a narrow opening angle (i.e., large zoom), like a microscope at 1-to-1 magnification. This means lines of sight are almost parallel and thus depth effects are negligible. This allows to update P with by simply setting the principal point (Hartley and Zisserman, 2004) to .In our experiments we observed that can be estimated reliably with pixel accuracy, when no disturbances like small dust particles are present. Maximum offset in locating TCP from an unoccluded tool tip was 2 pixel.In situations, where larger inaccuracies in locating occur, testing consistency of results and correcting is recommended. Back-projection of the reconstructed 3d object is a simple procedure allowing to test whether or not P is correct and the selected segmentation procedure is suitable. For this test, each surface voxel of the found 3d object is projected to a mask image , initially filled with zeros. A voxel is projected to by projecting its corners to using P and filling the respective convex hull with ones. If no errors occurred, the thus generated foreground mask should be identical to the segmentation mask M (up to ignored subpixel effects when filling the convex hull, leading to a potentially slightly dilated mask ). Measuring overlap between the two masks can be done using well established measures, e.g., the Overlap Ratio Criterion
(see e.g., Everingham et al., 2010), or Dice Similarity Coefficient
established by Dice (1945) and Sørensen (1948), where | · | denotes set cardinality. These measures are used throughout image analysis and are common in plant imaging as well (Minervini et al., 2014).In case the achieved mask overlap is too small but larger than zero, iterative procedures can be applied to increase accuracy. Straight forward is to (1) shift the principal point in each projection matrix P such that the center of mass of coincides with the center of mass of M and (2) recarve, and iterate both steps until convergence or suitably large overlap. Alternatively, gradient descent-based algorithms optimizing camera pose may be applied as a refinement step (e.g., Yezzi and Soatto, 2003).
2.2. Removing the tool from the seed
For small seeds not overlapping with the tool, the TCP lies precisely in the world origin, i.e., the origin of the reconstructed voxel block. Thus, voxel above the TCP contain the seed, voxel below (which in that case we do not reconstruct) contain the tool. In cases where seed and tool may overlap (see e.g., Figure 3), the tool tip is also reconstructed. It can be removed from the volume data using its known position, orientation, and physical size by deleting the corresponding voxel volume.Alternatively, at high voxel resolutions, where the reconstructed volume covered by the tool may be affected by noise, one can estimate the tool position from the reconstruction. Summing up voxel values of horizontal planes in the bottom region of the volume gives reliable estimates of the area of horizontal cuts through the tool. While the areas decrease when summing over higher and higher planes, the planes are deleted from the data. Then, when areas no longer decrease, using these areas, we estimate the y-position of the truncated cone using a least squares fit and remove the thus covered volume.
3. Materials and methods
3.1. Imaging
Depending on seed size for 3D reconstruction we use two different setups for image acquisition. Both setups consist of an industrial-grade c-mount camera (PointGrey Grasshopper, GRAS-50S5M-C, Mono, 5.0 MP, Sony ICX625 CCD, 2/3", square pixels of size μ = 3.45 μm, global shutter, 2448 × 2048, 15 FPS), 35 mm high precision lens (Schneider KMP APO-XENOPLAN 35/1,9) and a white LED ring with diffusor (CCS LDR2-70-SW2) shown in Figure 4. For small seeds (e.g., Arabidopsis, tobacco, rapeseed) a 36 mm spacer is mounted between camera and lens. For larger seeds (e.g., barley, maize) only a 15 mm spacer is needed. Spacer reduce the minimum working distance of the lens (d = 69.9 mm for the 36 mm spacer, 128.0 mm for the other) and thus are responsible for suitable magnification. This allows to measure seeds in a range between ≈0.2 and 12 mm. White paper is used as background.
Figure 4
Camera setup for 3D imaging.
Camera setup for 3D imaging.For image acquisition seeds are picked by a cone-shaped vacuum nozzle and held in front of the camera at optimal working distance using a robotic system to be described elsewhere. The robot rotates the seed in configurable angles and triggers the camera. We use 10° steps and take 36 images, if not stated differently. Image acquisition times are mainly limited by the robot's rotation typically ≳ 2.7 − 5s per 360°, depending on the motion type. We perform 36 stop-and-go steps resulting in an overall acquisition time of ≈6 − 7 s.
3.2. Camera calibration
We use the OpenCV implementation (Bradski and Kaehler, 2008) of Bouguet's calibration method (Bouguet, 1999) and an asymmetric 4 × 11 dot-pattern target with a total size of 5.8 × 4.3 mm. It was printed using a professional, high resolution film recorder, as usual office printers even on good paper do not achieve a printing precision suitable for camera calibration at such small spatial scales.Using this toolbox, estimation of the focal length f is not precise enough for our purposes. We therefore use a ball-bearing ball (steel, precision according to DIN5401 G20) with r0 = 1.50 mm ±0.25 μm radius as calibration object, in order to estimate the working distance d (or equivalently focal length f from working distance d) of our system precisely. From a mask image of the ball acquired with our system, we estimate its area A in pixel. This allows to estimate its radius r in the image by , where μ is the pixel size. From basic geometric reasoning working distance d can be derived as .
3.3. Software implementation
The software framework is implemented in C++ on a Windows 7 operating system with Visual Studio 2013. The application programming interface Open Graphics Library (OpenGL OpenGL.org, 2015) was used for the GPU implementation.As supplemental material (Roussel et al., 2016) we provide both, a suite of Matlab (Mathworks, 2015) routines as well as a Python implementation suitable for volume carving of not too small seeds using an affordable imaging setup. Such setups may e.g., use a turntable and a consumer SLR camera. In our example (see Figure 2) we use normalized cross-correlation for image registration, as the bottom of the tool is always visible as a suitable registration target.
4. Experiments
4.1. Resolution and runtime
The complexity of the volume carving algorithm is proportional to the number N of voxels and number N of images acquired. For our equidistantly spaced cubic R × R × R grids the voxel number is and thus complexity is O(R3N). In addition time for loading (or acquiring) the images (with N pixels) and, for the GPU implementation, transferring the data to and from the graphics card is needed. Complexity of this data transfer and preprocessing of the images is O(NN), or for the GPU implementation.Runtimes shown in Figure 5D have been measured on a PC with Intel Core i5-3470 CPU, 8GB DDR3 RAM and an NVIDIA GeForce GTX 580 GPU with 4047MB GDDR5 RAM (cmp. Brenscheidt, 2014 for further details). We observe that for low resolutions R of the voxel grid, runtime contributions by the O(NN) components dominate, as no dependence on R is visible. For increasing R, these parts become negligible. While for the CPU implementation a significant increase of the runtime vs. the 2 s runtime for smallest voxel resolutions can be noticed at R = 256 (4s), the parallel GPU implementation stays at comparable runtimes even at R = 512.
Figure 5
Performance of the proposed method. (A) Original image of a barley seed. (B) Reconstructions of the seed at different grid resolutions. (C) Reconstructed volume vs. resolution of the voxel grid. (D) Runtimes in seconds of serial CPU and parallel GPU implementations (reproduced from Brenscheidt, 2014). (E) Reconstructions of the seed using different numbers of images. (F) Reconstructed volume vs. number of images used.
Performance of the proposed method. (A) Original image of a barley seed. (B) Reconstructions of the seed at different grid resolutions. (C) Reconstructed volume vs. resolution of the voxel grid. (D) Runtimes in seconds of serial CPU and parallel GPU implementations (reproduced from Brenscheidt, 2014). (E) Reconstructions of the seed using different numbers of images. (F) Reconstructed volume vs. number of images used.When interested in a seed's volume as a trait used for high throughput phenotyping, rather than in subtle surface details, voxel resolution can be selected comparably low. In Figures 5A–C we show a barley seed and its reconstructions together with its derived volume for different grid resolutions R. We observe that above R = 256 the estimated volume is approximately constant. Thus, for this phenotyping task, runtime is limited mainly by file-IO, transfer and preprocessing. Sophisticated speed-up mechanisms for the carving provide rather low benefits in this application, as their main potential lies in higher achievable volume resolutions.Speedup using fewer images may be paid by lower accuracy (see Section 4.2). We show reconstructions of the same barley seed in Figure 5E and the corresponding volumes in Figure 5F. Images are selected equidistantly. We observe that reducing image number rapidly reduces reconstruction quality. Interesting to note is that the reconstruction using N = 9 images is more accurate than with N = 12 images. This is due to the fact that for N = 12 the selected angle between images is α = 30°, thus 180° is a multiple of α (the same is true for N ∈ {4, 6, 12, 18, 36}, cmp. also Figure 8 and Section 4.2). However, as the opening angle of our lens is small, complementary information content in masks coming from cameras looking in opposite direction is low. We conclude that for shorter runtimes with comparable or even higher reconstruction accuracy investigating alternative viewing directions is promising. We do this in the next section for the restricted possibilities of our robotic, turntable-like, single camera acquisition system.
4.2. Accuracy vs. number of images
We numerically investigate the influence of the number N of equidistantly acquired images on the accuracy of volume estimation in an ideal turntable setting. To do so, we calculate the volume Vnum of a sphere with radius r0 derived as cut of tangent cones, cmp. Figures 6A,B,D: Each ideal camera is represented by its camera center C, known from the selected working distance and rotation angle. A sphere projects to a circle on the sensor plane. A volume carving step for each image of this sphere thus corresponds to testing for each point of the working volume, if it lies inside or outside a cone spanned by C and the outline of the projected sphere (transparent cones in Figure 6B). The cone is independent of the focal length f of the ideal camera, but depends on the working distance d, i.e., the distance between each camera center and the center of the sphere. We select r0 = 1.5 mm, as we use a highly accurate, spherical bearing ball of this size as ground truth object in real experiments (see Section 4.4).
Figure 6
Setup for the numerical accuracy analysis. (A) Exemplary geometrical setup for five cameras. (B) Same configuration with transparent projection cones touching the green sphere and colored inlier sampling points. (D) Close up of the sphere (green) and a visualization of the sampled inlier points. Red, yellow, black, and white points indicate the four different types if surplus volumes not taken away by carving. Black lines indicate where cones touch the sphere (C) Symmetry planes (black) for each camera and section of a spherical shell indicating the geometry of the sampling region (yellow). (E) Close up of the sampling region (yellow). The inner sphere (green) represents the inner border of the volume, the meshed sphere the outer border. Black stripes indicate symmetry planes.
Setup for the numerical accuracy analysis. (A) Exemplary geometrical setup for five cameras. (B) Same configuration with transparent projection cones touching the green sphere and colored inlier sampling points. (D) Close up of the sphere (green) and a visualization of the sampled inlier points. Red, yellow, black, and white points indicate the four different types if surplus volumes not taken away by carving. Black lines indicate where cones touch the sphere (C) Symmetry planes (black) for each camera and section of a spherical shell indicating the geometry of the sampling region (yellow). (E) Close up of the sampling region (yellow). The inner sphere (green) represents the inner border of the volume, the meshed sphere the outer border. Black stripes indicate symmetry planes.As we did not find a closed form solution for the volume of an object derived by a cut of N cones for arbitrary N, we numerically integrate the volume by a Monte-Carlo method:We randomly select K points in a region with known volume Vreg including the complete test volume.For each point we test, if it lies in all cones spanned by the cameras. If yes, the point lies in the volume, if no, not. The number of all inliers is Kin.The sought-for volume Vnum is then Vnum ≈ Vreg
*
Kin∕K.The smaller Vreg can be selected, the more accurate Vnum can be approximated with a fixed number K of sampling points. We observe that the selected turntable camera configuration (Figures 6A,C) is symmetric with respect tothe plane spanned by the cone centers i.e., the equator of the sphere,each plane spanned by the rotation axis and a camera center,each plane spanned by the rotation axis and cutting the rotation angle between two adjacent camera centers in half, i.e., the vertical plane between two cameras.Furthermore, we observe thatthe inner of the sphere lies completely in the carved volume, i.e., we carve the sphere from the outside; and finally thatthe sought-for volume lies in a concentric sphere with a somewhat larger radius than the carved sphere.This allows to restrict numerical calculations to a region with known volume Vreg being box interval in spherical coordinates (cmp. Figure 6E, yellow region), in order to benefit optimally from symmetries in the problem. We restrict the altitude angle to θ ∈ [0, π∕2], azimuth to ϕ ∈ [0, π∕(2N)], and the radius to r ∈ [r0, r1]. Radius r1 is calculated by (1) intersecting all cones with the plane spanned by the camera centers, yielding a pair of lines for each cone, (2) selecting the right lines of two adjacent cones (cmp. example red lines in Figures 6C,E) (3) calculating the intersection between these lines, and (4) selecting r1 as radius of the point given in polar coordinates.This geometry is depicted in Figures 6C,E. The selection correspond to one section of a spherical shell, cut in half by the plane spanned by the cone centers (i.e., at the equator of the sphere) and cut in 2N sections by the half-planes starting at the axis of rotation symmetry (i.e., the axis through the poles of the sphere) and each including one cone center; as well as their angle bisector planes.Randomly sampling points in spherical coordinates produces higher point densities toward the origin in Euclidean coordinates and toward the north-south-axis of the sphere. We correct for these density differences combining two approaches. To understand this we need to calculate the Jacobian of the spherical coordinate transformation from Euclidean coordinates. Using the convention
we get the Jacobian J
As long as r0 and r1 do not differ too much, as in our case, it is sufficient to consider the radial part of J by weighing sampling points p by their radius value r
where and . P is the set of all sampling points and Pin⊂P the set of all inliers.Density variations due to altitude θ are compensated by transforming the sampling probability density from the uniform distribution p(χ) (with χ ∈ [0, 1]) a random number generator delivers to a sin(θ)-shaped distribution pθ(θ). This is achieved by the transform θ = g(χ), where g(χ) = arccos(1 − χ). This can be easily verified using to the transform law for densities , where g−1 denotes the inverse function of g.In Figure 7 relative error E = (Vnum − V0)∕V0 with is depicted vs. the number N of images used for reconstruction of Vnum. We observe that for parallel projection, using an even number of cameras (or images) yields the same result as using half the number of images. This makes sense, as for an even number of cameras, pairs of cameras are in an 180°-configuration, looking at the same object contour from different sides. This does not add additional information to the reconstruction. However, when considering central projection also images from an 180°-configuration add additional information, as cameras do not look at the same contour.
Figure 7
Error of volume vs. number . Top left: error when using parallel projection (telecentric lens). The other plots show errors for different working distances d and central projection. In real experiments we use working distance d = 69.9 mm.
Error of volume vs. number . Top left: error when using parallel projection (telecentric lens). The other plots show errors for different working distances d and central projection. In real experiments we use working distance d = 69.9 mm.Figure 8 shows volumes reconstructed using different camera configurations. We observe that using N = 5 or N = 10 camera positions yields identical results for parallel projection in Figure 8, left. However, for central projection N = 10 camera positions yields a much better reconstruction than using N = 5 images. This is in full agreement with the plot given in Figure 7, bottom right, where we also observe, that using odd N is not always better than using even N with one additional camera position. It depends on object size in relation to working distance, which camera position configuration yields better results in theory. We observe, that in the cases tested here, a minimum of N = 11 or N = 12 images is needed to stay below 1% relative error. A limit of less than 0.1% error is reached using N = 33, N = 36, or N = 37 images. In practice, however, given a sufficiently high N, other error sources may dominate (see next Sections).
Figure 8
Reconstructed volumes for different numbers . Numbers in the upper left corner of each subimage are N. Inlier sampling points are color-coded, where hue indicates ϕ and brightness θ. The underlying ground truth sphere is white. Left: reconstruction using parallel projection (telecentric lens). Right: reconstruction using central projection with working distance d = 20mm and sphere radius r0 = 1.5 mm.
Reconstructed volumes for different numbers . Numbers in the upper left corner of each subimage are N. Inlier sampling points are color-coded, where hue indicates ϕ and brightness θ. The underlying ground truth sphere is white. Left: reconstruction using parallel projection (telecentric lens). Right: reconstruction using central projection with working distance d = 20mm and sphere radius r0 = 1.5 mm.
4.3. Accuracy loss due to position inaccuracies
We performed experiments using synthetic images in order to test the different error sources in our method. The images showed a perfect sphere as projection of a synthetic ball of same size as our bearing-ball. Geometrical setup was as for the experiments in Section 4.2, with working distance d = 69.9 mm. We reconstructed the ball using volume carving, a voxel grid resolution of 2563 and (12 μm)3 voxel size. Results were very similar to the ones shown in Figure 7 for the same d, e.g., relative error at 36 images was approximately 0.14% instead of 0.1%. We do not show this plot. We conclude, that volume carving on a fine enough grid comes close to the theoretical performance of the simulation from Section 4.2, in agreement with our observations in Figure 5C.Estimating TCP locations in the images is critical for reconstruction accuracy. To test the influence of a TCP wrongly located in an image, we add a relatively large offset of 7 pixel in y-direction to in the first image and leave the TCP locations in all other images untouched. The carving result is shown in Figure 9. The volume loss due to the misaligned TCP is in the order of 0.001 to 0.002 mm3, and thus reduces the positive volume error due to non-carved regions (as described in Section 4.2). For image configurations with relatively low positive initial error, i.e., here N > 36, the error due to the TCP offset dominates, such that overall relative error is negative. Please note that Figure 9 shows absolute values of the relative errors due to the log-scale.
Figure 9
Absolute value of relative error of volume vs. number . Configuration is identical to the simulation results shown in Figure 7. Yellow and dark green points have been repeated from there.
Absolute value of relative error of volume vs. number . Configuration is identical to the simulation results shown in Figure 7. Yellow and dark green points have been repeated from there.
4.4. Seed types and overall accuracy
Using the proposed method we reconstructed different seed types, namely Arabidopsis (length, i.e., longest dimension ≲ 0.5 mm), rapeseed (≈ 2 mm), barley (≈ 8 mm), and maize (≈ 11 mm). See Figure 10 to get an impression of the usually achieved reconstruction accuracy.
Figure 10
Reconstructed seeds shown from different angles, side-by-side with the original images (. (A)
Arabidopsis. (B) Barley. (C) Maize. Please note the different scalings.
Reconstructed seeds shown from different angles, side-by-side with the original images (. (A)
Arabidopsis. (B) Barley. (C) Maize. Please note the different scalings.Absolute accuracy is validated experimentally using two different test objects. The first one is the ball-bearing ball we used for working distance calibration with 3.00 mm ±0.5 μm diameter, i.e., 0.02% diameter tolerance, and thus a precisely known volume of 14.137 mm3 ±0.007 mm3, i.e., 0.05% volume tolerance. Clearly, as we used this object for calibration, scaling of the mask images exactly fit to the respective projection matrices P. However, the overall performance of the system for volume reconstruction can still be evaluated using this object, as the volume derived has not been used for calibration and still accumulates all errors and imperfections the method has. The second object is an ink cartridge ball with 2.45 mm ±0.02 mm diameter, i.e., 0.4% diameter measurement error, measured with a digital sliding caliper, and thus a volume of 7.70 mm3± 0.19 mm3, i.e., 2.5% volume error.In Figure 11 volume error vs. number N ∈ {27, …, 36} of imaging positions is shown for the ball-bearing ball experiment. Comparing results to Figure 7, lower left (values repeated in Figure 11 for easy reference), we observe that negative reconstruction errors due to positioning inaccuracies are in the same range as the positive theoretical errors and increase with increasing number of images. Please note that the plot shows absolute values of relative errors, as negative errors can not be displayed in log-scale.
Figure 11
Error of volume vs. number . Configuration is identical to the simulation results shown in Figure 7, repeated in the plot for reference.
Error of volume vs. number . Configuration is identical to the simulation results shown in Figure 7, repeated in the plot for reference.Reconstructing the ball using 36 images as used for seed reconstruction, a voxel grid resolution of 2563 and (12 μm)3 voxel size, i.e., a volume quite tightly surrounding the object, yields a volume of 14.11 mm3 and thus a mean diameter of 2.998 mm for the bearing ball. This is a relative error of −0.19% wrt. the specified volume, and of −0.06% wrt. the diameter, when calculating the diameter from the measured volume, assuming a perfect sphere. For the ink cartridge ball we measure 7.83 mm3 corresponding to a mean diameter of 2.46 mm, being well within the measuring error of our caliper measurement.We conclude that the overall accuracy of our method, including camera calibration error, mechanical imperfections, TCP finding error, imprecision due to the simple carving approach etc. is high enough to compete with or even beat a precise slide caliper for length measurements. Absolute values of measurement errors of volume and lengths are in the range of few per mill.
5. Conclusion and outlook
Simple volume carving combined with a method for extrinsic camera pose estimation from images is sufficiently accurate for size measurements of even tiny seeds. To optimize our system for runtime and accuracy, we investigated its performance using different parameter settings. Surprisingly, the main performance gain potential does currently not lie in using more sophisticated reconstruction methods allowing for higher achievable voxel resolutions R, e.g., achievable by the highly accurate method presented by Klodt and Cremers (2015) and necessary for reconstruction of more complex surfaces. Our findings allow reducing preprocessing and transfer times by selecting a suitable image number N and comparably low voxel resolution of 2563.The optimal number N of acquisition positions used in a turntable setting depends on the selected projection geometry. In our case using N = 36 images yields a theoretical overestimation of the reconstructed volume of a sphere by +0.1% relative error.The method's achievable accuracy has been tested experimentally using a highly accurate spherical object. Systematic errors are much lower than we expected, between +0.06% for N = 27 and −0.19% for N = 36. This means that volume losses due to inaccurate positioning of the object are truly negligible for our purposes. Clearly, as seeds are not well represented by a ball, such accuracy studies give insight in the accuracy potential of the method—if it fails on a ball, it will also fail on more complex shapes.Seed-shape-specific errors are not well captured by a ball and may vary from seed type to seed type. Alternative simple volume measurement methods for ground truthing, e.g., Archimedes' principle, are not accurate enough for such small objects, but high-resolution CT may be an option.Many factors influence the accuracy, e.g., segmentation errors, small dust particles or camera pose errors. Most critical are inaccuracies of in the image, leading to parts of a seed being erroneously carved away. To detect such errors, suitable error estimation methods can be implemented to complement the method proposed here, e.g., summing back-projection error. When positioning accuracy is an issue, more elaborated, iterative, optimization-based but also more costly to calculate methods may be applied. However, as we have seen in our experiments, achieved accuracies are well high enough for seed phenotyping even without such corrections.Overall we conclude that the presented method yields highly accurate seed reconstructions being accurate enough when interested in seed volume.
Author contributions
SJ and HS designed the robotic system used for seed handling and image acquisition. JR and AF implemented robot and acquisition hard and software. JR performed all lab experiments. FG performed all numerical experiments and implemented the supplemental Python code. HS implemented the baseline method, the supplemental Matlab code, and drafted the manuscript. All authors contributed to text and figures of the manuscript and approved the final manuscript.
Conflict of interest statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Authors: Liya Wang; Ioan Vlad Uilecan; Amir H Assadi; Christine A Kozmik; Edgar P Spalding Journal: Plant Physiol Date: 2009-02-11 Impact factor: 8.340
Authors: Alex P Whan; Alison B Smith; Colin R Cavanagh; Jean-Philippe F Ral; Lindsay M Shaw; Crispin A Howitt; Leanne Bischof Journal: Plant Methods Date: 2014-07-08 Impact factor: 4.993
Authors: Siegfried Jahnke; Johanna Roussel; Thomas Hombach; Johannes Kochs; Andreas Fischbach; Gregor Huber; Hanno Scharr Journal: Plant Physiol Date: 2016-09-23 Impact factor: 8.340
Authors: Hanno Scharr; Christoph Briese; Patrick Embgenbroich; Andreas Fischbach; Fabio Fiorani; Mark Müller-Linow Journal: Front Plant Sci Date: 2017-09-28 Impact factor: 5.753
Authors: Meriem Aoun; William Stafstrom; Paige Priest; John Fuchs; Gary L Windham; W Paul Williams; Rebecca J Nelson Journal: Food Control Date: 2020-12 Impact factor: 5.548
Authors: Viktor A Sydoruk; Johannes Kochs; Dagmar van Dusschoten; Gregor Huber; Siegfried Jahnke Journal: Sensors (Basel) Date: 2020-01-30 Impact factor: 3.576
Authors: Johanna Roussel; Felix Geiger; Andreas Fischbach; Siegfried Jahnke; Hanno Scharr Journal: Front Plant Sci Date: 2016-06-07 Impact factor: 5.753