The present disclosure generally relates to aerial imaging systems, and in particular to a system and method for plant sensing using aerial RGB imagery.
This section introduces aspects that may help facilitate a better understanding of the disclosure. Accordingly, these statements are to be read in this light and are not to be understood as admissions about what is or is not prior art.
In many agricultural applications one wants to characterize physical properties of plants and use the measurements to predict, for example, biomass and environmental influence. This process is known as phenotyping. Obtaining high-quality phenotypic information is of crucial importance for plant breeders in order to study a crop's performance. Some phenotypic traits, such as leaf area, have been shown to be correlated with above-ground biomass. Collecting phenotypic data is often done manually and in a destructive manner. Therefore, improvements are needed in the field.
The above and other objects, features, and advantages of the present invention will become more apparent when taken in conjunction with the following description and drawings wherein identical reference numerals have been used, where possible, to designate identical features that are common to the figures, and wherein:
For the purposes of promoting an understanding of the principles of the present disclosure, reference will now be made to the embodiments illustrated in the drawings, and specific language will be used to describe the same. It will nevertheless be understood that no limitation of the scope of this disclosure is thereby intended.
Current UAV-based mapping is usually executed according to a mission plan while relying on a consumer-grade navigation sensor within the platform's autopilot. Therefore, prior information, which describes the trajectory of the platform including the involved cameras, can be utilized to facilitate ROP recovery for stereo-images. We assume that such prior information can be either derived from a specific flight configuration (e.g., a UAV platform moving at a constant flying height while operating a nadir-looking camera) or established by a low-cost Micro-Electro-Mechanical-System (MEMS) integrated with a GPS unit onboard the UAV.
The Essential matrix E describes the epipolar geometry relating two corresponding points in a stereo-pair. The Essential matrix is based on the co-planarity constraint, which is given in Equation 1:
$p_1^T \cdot (\vec{T} \times R\,p_2) = 0$  (1)
In this equation, $p_1$ and $p_2$ are two corresponding points, where $p = (x, y, -c)^T$ represents the image coordinates corrected for the principal point offset and camera-specific distortions. Here, we consider the case where the stereo-images are acquired by two different cameras. The rotation matrix R, which is defined by three rotation angles ω, ϕ, and κ, describes the relative rotation relating overlapping images.
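In the standard formulation, which corresponds to the substitution into Equation 3 used in the planar-motion derivation below, the Essential matrix is composed from the ROPs as the product of the skew-symmetric matrix of the translation vector and the rotation matrix (up to sign conventions):

$E = [\vec{T}]_\times R, \qquad [\vec{T}]_\times = \begin{pmatrix} 0 & -T_Z & T_Y \\ T_Z & 0 & -T_X \\ -T_Y & T_X & 0 \end{pmatrix}$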
The nine elements of the Essential matrix are defined by the five elements of the ROPs (three rotation angles and two translation components). Therefore, there must be four additional constraints that can be imposed on the nine elements of the Essential matrix E. Such constraints are explained as follows:
The Essential matrix has rank two. Therefore, its determinant has to be zero as shown in Equation 4, which leads to what is known as the cubic constraint on the nine unknown parameters of the Essential matrix.
det(E)=0 (4)
The Essential matrix has two equal non-zero singular values, which leads to the trace constraint represented by Equation 5. Given the rank of this matrix equation, two independent equations on the nine unknown parameters can be deduced from the equality in Equation 5.
$EE^T E - \tfrac{1}{2}\,\mathrm{trace}(EE^T)\,E = 0$  (5)
The nine elements of the Essential matrix can be only determined up to a scale, which provides the fourth constraint.
Considering these four constraints, one can conclude that a minimum of five point correspondences is sufficient to estimate the nine unknown parameters of the Essential matrix. An efficient solution of the Essential matrix using five feature correspondences, while considering the above constraints, is provided; the Essential matrix can then be used to derive the rotation matrix R and the translation vector $\vec{T}$ relating the stereo-images. Using five point correspondences, a system of five linear equations of the form in Equation 6 can be established:
$x_1 x_2 e_{11} + x_1 y_2 e_{12} - x_1 c_2 e_{13} + y_1 x_2 e_{21} + y_1 y_2 e_{22} - y_1 c_2 e_{23} - c_1 x_2 e_{31} - c_1 y_2 e_{32} + c_1 c_2 e_{33} = 0$  (6)
Derivation of the Two-Point Approach
This approach, according to one embodiment, is based on imagery acquired from a nadir-looking camera onboard a UAV platform moving at a constant flying height. Within the robotics research community, this flight configuration is known as "planar motion," where the platform's motion is constrained to a horizontal plane, and the rotation of the image plane is constrained to an axis orthogonal to the horizontal plane (i.e., with a reference direction that coincides with the normal to the plane of motion). As illustrated in the drawings, the UAV-based planar motion leads to two geometric constraints that can be used to reduce and simplify the elements of the Essential matrix.
For a nadir-looking camera, the rotation angles ω and ϕ are assumed to be zero. Therefore, the relative rotation of the camera between the images of a stereo-pair is constrained to the rotation angle κ (i.e., heading). For a planar motion along the horizontal plane, the TZ translation component is assumed to be zero.
Therefore, for a planar motion, the rotation matrix R and translation vector $\vec{T}$ relating the stereo-images can be expressed according to Equation 7, where κ is the rotation angle, and $T_X$ and $T_Y$ are the translation components describing the horizontal planar motion of the UAV platform. The expressions for R and $\vec{T}$ can be substituted into Equation 3. This substitution leads to the simplified Essential matrix in Equation 8, where $L_1$, $L_2$, $L_3$, and $L_4$ denote the four unknown parameters of the Essential matrix E. As can be seen in Equation 8, $L_1$, $L_2$, $L_3$, and $L_4$ are derived from three independent parameters ($T_X$, $T_Y$, and κ). Therefore, there should be one constraint relating the four elements of the Essential matrix. A closer inspection of the relationships between ($L_1$, $L_2$, $L_3$, $L_4$) and ($T_X$, $T_Y$, κ) leads to the constraint in Equation 9.
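Under the stated planar-motion assumptions, Equations 7 through 9 can be reconstructed as follows (one consistent choice of sign conventions; other derivations may differ by signs):

$R = \begin{pmatrix} \cos\kappa & -\sin\kappa & 0 \\ \sin\kappa & \cos\kappa & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad \vec{T} = (T_X, T_Y, 0)^T \qquad (7)$

$E = [\vec{T}]_\times R = \begin{pmatrix} 0 & 0 & L_1 \\ 0 & 0 & L_2 \\ L_3 & L_4 & 0 \end{pmatrix}, \quad L_1 = T_Y, \; L_2 = -T_X, \; L_3 = T_X \sin\kappa - T_Y \cos\kappa, \; L_4 = T_X \cos\kappa + T_Y \sin\kappa \qquad (8)$

$L_1^2 + L_2^2 = L_3^2 + L_4^2 \qquad (9)$

The constraint in Equation 9 follows directly, since both sides equal $T_X^2 + T_Y^2$.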
Given the simplified form for the Essential matrix that describes the relationship between conjugate points in stereo-images captured under horizontal planar motion constraints, we will focus on establishing the closed-form solution of the proposed two-point approach. Using the simplified Essential matrix, one can expand the relationship between the image coordinates of conjugate points to the form in Equation 10, where the image coordinates of the conjugate points p1 and p2 are represented by (x1, y1, −c1) and (x2, y2, −c2) after correcting for the principal point offsets and camera-specific distortions.
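With the simplified Essential matrix above, the expansion of the co-planarity constraint $p_1^T E p_2 = 0$ yields a form consistent with Equation 10 (again, up to the assumed sign conventions):

$x_1 c_2 L_1 + y_1 c_2 L_2 + c_1 x_2 L_3 + c_1 y_2 L_4 = 0 \qquad (10)$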
At this stage, the ROP recovery is concerned with estimating the four parameters ($L_1$, $L_2$, $L_3$, $L_4$) while observing the inherent constraint among these parameters, which is represented by Equation 9, and the fact that these parameters can be determined only up to an arbitrary scale. Therefore, two conjugate point pairs should be sufficient for deriving the simplified Essential matrix. The presently disclosed closed-form solution for deriving the elements of the Essential matrix, starting from Equation 10, can be carried out as follows:
Given two conjugate point pairs, two linear equations, which can be represented using the matrix form in Equation 11, can be established:
where $(x_{11}, y_{11}, -c_1)$ and $(x_{21}, y_{21}, -c_2)$ are the image coordinates for the first conjugate point pair, and $(x_{12}, y_{12}, -c_1)$ and $(x_{22}, y_{22}, -c_2)$ are the image coordinates for the second conjugate point pair.
As can be seen in Equation 11, the $L_{4\times1}$ vector belongs to the null space of the $A_{2\times4}$ matrix, which is comprised of the image coordinates of the available conjugate point pairs. Therefore, the $L_{4\times1}$ vector can be derived through a linear combination of the basis vectors $\tilde{X}$ and $\tilde{Y}$ spanning the right null space of the matrix $A_{2\times4}$. Assuming that the basis vectors $\tilde{X}$ and $\tilde{Y}$ are represented by the elements in Equation 12, $L_1$, $L_2$, $L_3$, and $L_4$ can be derived through the linear combination in Equation 13, where α and b are two arbitrary scale factors. Since the Essential matrix is determined only up to an arbitrary scale, one can choose a value of 1 for the scale factor b. Thus, the Essential matrix can be expressed by the form in Equation 14 using the basis vectors spanning the right null space of the $A_{2\times4}$ matrix.
Using the expressions for $L_1$, $L_2$, $L_3$, and $L_4$ in Equation 13 and the inherent constraint in Equation 9, one can derive the second-order polynomial in the unknown scale factor α in Equation 15. The second-order polynomial in Equation 15 provides up to two estimates for the scale factor α; thus, two Essential matrices might be derived. For a given Essential matrix, R and $\vec{T}$ can be recovered through either the introduced singular value decomposition (SVD) approach or the closed-form solution. A total of four possible solutions of the rotation matrix R and translation vector $\vec{T}$ can be recovered from a single Essential matrix. Therefore, up to eight solutions for R and $\vec{T}$ can be derived from this approach. In order to identify the valid Essential matrix among the available solutions, two additional constraints can be utilized as follows:
The light rays connecting a derived object point and perspective centers should be on the same side of the baseline. The derived object points should be below the camera stations.
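For illustration, a minimal Python sketch of this closed form is given below. The sign convention for Equation 10, the (x, y, c) point format, and the simplified structure of E are assumptions carried over from the reconstructions above; the final disambiguation of the up-to-eight R and $\vec{T}$ candidates via the two constraints just described is omitted.

```python
import numpy as np

def two_point_essential(p1s, p2s):
    """Closed-form two-point sketch under planar motion.

    p1s, p2s: two corresponding points per image, each given as (x, y, c)
    with coordinates already corrected for the principal point offset and
    camera-specific distortions. Returns candidate Essential matrices.
    """
    # Each correspondence gives one linear equation in (L1, L2, L3, L4),
    # assuming Equation 10 has the form
    # x1*c2*L1 + y1*c2*L2 + c1*x2*L3 + c1*y2*L4 = 0.
    A = np.array([[x1 * c2, y1 * c2, c1 * x2, c1 * y2]
                  for (x1, y1, c1), (x2, y2, c2) in zip(p1s, p2s)])
    # Basis vectors spanning the right null space of the 2x4 matrix A
    # (Equations 11 and 12); the last two right-singular vectors span it
    # when A has rank 2.
    _, _, Vt = np.linalg.svd(A)
    Xb, Yb = Vt[2], Vt[3]
    # L = alpha * Xb + Yb, with the scale b fixed to 1 (Equation 13); the
    # constraint L1^2 + L2^2 = L3^2 + L4^2 (Equation 9) is then quadratic
    # in alpha (Equation 15).
    def q(v, w):
        return v[0] * w[0] + v[1] * w[1] - v[2] * w[2] - v[3] * w[3]
    coeffs = [q(Xb, Xb), 2.0 * q(Xb, Yb), q(Yb, Yb)]
    candidates = []
    for alpha in np.roots(coeffs):
        if abs(alpha.imag) > 1e-9:
            continue  # keep only real roots of the quadratic
        L1, L2, L3, L4 = alpha.real * Xb + Yb
        candidates.append(np.array([[0.0, 0.0, L1],
                                    [0.0, 0.0, L2],
                                    [L3,  L4,  0.0]]))
    return candidates
```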
The above two-point approach assumes that the involved images are acquired from a nadir-looking camera onboard a UAV platform moving at a constant flying height. Therefore, the ω and ϕ rotation angles and the $T_Z$ translation component are assumed to be zero. Such prior flight information means that a minimum of two conjugate point pairs can be used to derive the Essential matrix relating a stereo-pair through a closed form.
The above two-point approach can be integrated within a RANSAC framework for outlier detection and removal. More specifically, a random sample comprised of two conjugate point pairs is first drawn from potential matches and used to derive the Essential matrix E according to the procedure proposed in the previous section. To derive other matches that are compatible with such an estimate of the Essential matrix (i.e., derive the corresponding inliers), the Sampson distance (Hartley and Zisserman, 2003), which is the first-order approximation of the normal distances between a given conjugate point pair and the respective corresponding epipolar lines, is evaluated for the remaining potential matches. Such a sampling-and-testing procedure is repeated until a required number of trials/draws is achieved. The required number of trials, which is based on an assumed percentage of inliers, is derived on a probabilistic basis to ensure that at least one correct draw has been executed, as seen in Equation 16. Finally, the random sample with the highest number of inliers is selected and used for ROP estimation. In order to evaluate the computational efficiency of the proposed two-point approach when coupled with RANSAC for outlier removal, in contrast to three-, five-, and eight-point algorithms, the required number of trials N is presented in Table 1 (where ε is assumed to be 0.5 and the probability of having at least a single sample of two correct conjugate point pairs within these trials is set to 0.99). As can be seen in Table 1, the required number of RANSAC trials is significantly reduced by using a smaller number of conjugate point pairs. Therefore, the proposed two-point approach with a built-in RANSAC outlier detection/removal process is advantageous when compared with other approaches that require a larger number of conjugate point pairs.
where s is the minimum number of conjugate point pairs required for ROP estimation; ε is the probability of choosing an incorrect conjugate point pair, which is equivalent to the ratio between the assumed number of incorrect conjugate point pairs and the total number of potential conjugate point pairs; $(1-\varepsilon)^s$ is the probability of having only correct conjugate point pairs in a single draw; and p is the probability of having at least a single sample comprised of correct conjugate point pairs.
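The standard trial-count bound consistent with the quantities just defined is $N = \log(1-p)/\log(1-(1-\varepsilon)^s)$. Assuming that is the form of Equation 16, a short Python check reproduces the kind of comparison summarized in Table 1 (exact entries may differ by rounding convention):

```python
import math

def ransac_trials(s, eps=0.5, p=0.99):
    # N = log(1 - p) / log(1 - (1 - eps)^s): number of random draws needed
    # so that, with probability p, at least one draw contains only correct
    # conjugate point pairs.
    return math.ceil(math.log(1 - p) / math.log(1 - (1 - eps) ** s))

for s in (2, 3, 5, 8):
    print(s, ransac_trials(s))
# Ceilings of the bound: s=2 -> 17, s=3 -> 35, s=5 -> 146, s=8 -> 1177
```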
Leaf Counting:
In one embodiment, a method for leaf counting using aerial images of a planted field is provided. First, the uncorrected image or orthomosaic is converted from RGB to HSV color space. From the image in the HSV color space, a segmentation mask Y is generated. Each pixel $Y_m$ in this mask is obtained by thresholding $H_m$, $S_m$, and $V_m$,
which are the hue, saturation, and value of the pixel m. The thresholds τ1, τ2, τ3, and τ4 are determined experimentally: τ1 and τ2 select the characteristic green color of the leaves, while τ3 and τ4 prevent misclassifying some soil pixels as leaves. Then, the number α of pixels classified as sorghum leaves is determined as $\alpha = \sum_{m=1}^{M} Y_m$,
where M is the number of pixels in the image. This pixel-wise segmentation makes use of the strong color difference between the sorghum leaves, which are generally green or yellow, and the soil and panicles, which are usually brown. An example of the segmentation result is shown in the drawings.
In order to calibrate ρ, a small region of the image is selected. The number of leaves in this region is manually counted and denoted as λ0. The number of pixels classified as sorghum is denoted as α0. Finally, ρ is estimated by ρ=α0/λ0, and the final leaf count can be obtained as λ=α/ρ. We experimentally determined that the relationship between the number of leaves and the number of sorghum pixels is approximately linear at a given growth stage. This method assumes that all leaves are at the same distance from the camera. This condition is fulfilled when using the orthorectified mosaic. Also, only leaves that are visible can be counted.
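A minimal Python/OpenCV sketch of this counting pipeline follows. Since the exact mask equation is not reproduced above, treating τ3 and τ4 as lower bounds on saturation and value (to reject soil) is an assumption; τ1 and τ2 bound the green hue as described.

```python
import cv2

def count_leaves(rgb, tau1, tau2, tau3, tau4, alpha0, lambda0):
    """Color-based leaf counting sketch.

    alpha0, lambda0: pixel count and manual leaf count for the small
    calibration region described above.
    """
    hsv = cv2.cvtColor(rgb, cv2.COLOR_RGB2HSV)  # 8-bit hue lies in [0, 179]
    h, s, v = cv2.split(hsv)
    # Y_m = 1 when the hue is leaf-green and saturation/value rule out soil
    # (assumed form of the thresholding).
    mask = (h >= tau1) & (h <= tau2) & (s >= tau3) & (v >= tau4)
    alpha = int(mask.sum())      # number of pixels classified as leaf
    rho = alpha0 / lambda0       # pixels per leaf, from the calibration region
    return alpha / rho           # estimated leaf count, lambda = alpha / rho
```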
Plant leaves have similar morphology but different traits such as leaf length, width and green color hue. From the segmentation of each individual leaf, these traits can be estimated. In this section, we present a shape-based approach to leaf segmentation.
A and B are two pixels located at two opposite edges of a leaf. A pixel-wide line connecting A and B is defined as the leaf slice SAB.
The gradient angles at A and B are denoted as $G_A$ and $G_B$, respectively. The angle 0 is defined in the direction of the x axis. $G_A$ and $G_B$ are expected to be opposite to each other, within some bias $T_a$, i.e.,
$|G_A - G_B + \pi| \bmod 2\pi < T_a$  (6)
Leaf slices are obtained by using the Stroke Width Transform. The slice angle $\theta_{AB}$ of $S_{AB}$ is defined as the direction of the normal of the slice, in radians.
In order to reconstruct leaves from leaf slices, adjacent leaf slices $S_{AB}$ and $S_{CD}$ are compared. If their angles $\theta_{AB}$ and $\theta_{CD}$ differ by less than a constant $T_b$, i.e.,
$|\theta_{AB} - \theta_{CD}| < T_b$  (8)
then slices SAB and SCD are merged.
Plants with high leaf density can cause leaves to overlap with each other. In the case of leaf overlap, there may be a discontinuity between leaf slices, as shown in the drawings.
In this case, the two leaf segments will be merged by the following search method. Let a leaf be split into two leaf segments L1 and L2, separated by a discontinuity, as shown in the drawings. If a slice $S_{AB}$ of L1 and a slice $S_{EF}$ of L2 satisfy
$|\theta_{AB} - \theta_{EF}| < T_c$  (9)
then leaf segments L1 and L2 are considered the same leaf.
Note that the thresholds Ta, Tb, Tc are determined experimentally. This method addresses leaf discontinuity due to leaf crossing, but requires that leaf edges are clearly distinguishable. Overlap of parallel leaves may lead to undersegmentation.
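A small Python sketch of these angle tests follows. The handling of the modulo in Equation 6 (taking the smallest angular difference) is one reading of the notation, and the slice extraction itself, via the Stroke Width Transform, is not shown.

```python
import math

TWO_PI = 2.0 * math.pi

def angular_diff(a, b):
    """Smallest absolute difference between two angles, in radians."""
    d = abs(a - b) % TWO_PI
    return min(d, TWO_PI - d)

def edges_opposite(g_a, g_b, t_a):
    # Equation 6: the edge gradients of one leaf slice should point in
    # roughly opposite directions (pi apart), within the bias T_a.
    return angular_diff(g_a, g_b + math.pi) < t_a

def merge_slices(theta_ab, theta_cd, t_b):
    # Equation 8: adjacent slices with similar normals belong to one leaf.
    return angular_diff(theta_ab, theta_cd) < t_b

def bridge_discontinuity(theta_ab, theta_ef, t_c):
    # Equation 9: segments split by a crossing leaf are re-joined when
    # their slice angles still agree within T_c.
    return angular_diff(theta_ab, theta_ef) < t_c
```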
A statistical model according to one embodiment is now presented for segmenting each plant and determining its location. The location of each plant is defined as the pixel coordinates where the stalk intersects the ground plane and can be used to automatically obtain the inter-row and intra-row spacing. A row of plants is defined as all the plants that are aligned together. Inter-row spacing is defined as the distance between rows. Intra-row spacing is defined as the distance between plants within the same row.
The number of plants in an image is denoted as the constant P and assumed to be known a priori. The positions of the plants are modeled as a random vector X, i.e,
$X = [X_0, X_1, \ldots, X_{P-1}]$  (10)
where $X_p$, $p = 0, \ldots, P-1$, contains the (i, j) coordinates of the p-th plant.
The goal is to estimate X from Y, where Y is the color-based segmentation mask. The 2D coordinates of the pixel m are denoted as K(m). A vector Z is constructed as
$Z = [Z_0, Z_1, \ldots, Z_{N-1}]$  (12)
where each element Zn=K(n), n=0, . . . , N−1, is included if Yn=1. N is the number of pixels classified as leaf pixels. Notice that N<M.
The plant p that is closest to the pixel n is denoted as C(n). The Euclidean distance from the pixel n to the plant C(n) is denoted as Dn and is computed as in Equation 13.
$D_n$ is modeled as a random variable with an exponential conditional probability density with mean and standard deviation σ. This determines the probability density function for a leaf pixel n at distance $D_n = d_n$ from C(n).
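Under this exponential model, a consistent form for the density is

$p_{D_n}(d_n) = \frac{1}{\sigma}\exp\!\left(-\frac{d_n}{\sigma}\right), \quad d_n \geq 0,$

for which both the mean and the standard deviation equal σ.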
σ can be interpreted as the average radius of a plant.
Then, the conditional distribution of a single point $Z_n$ depends only on its closest plant.
From our assumptions above, we have conditional independence, and the joint conditional density of Z factors over the leaf pixels.
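Under the stated conditional-independence assumption, the factorization takes the standard product form

$p_{Z|X}(z \mid x) = \prod_{n=0}^{N-1} p\big(z_n \mid x_{C(n)}\big),$

with each factor depending only on the plant closest to the corresponding leaf pixel.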
This model assumes that the leaf distribution does not have any direction preference, i.e. the leaves grow uniformly in all directions. In some situations, however, the plant is tilted, and the stalk is not completely at the center of the plant.
As seen in the drawings, the conditional probability density of the position of one plant $X_p$, given the remaining plants, is assumed to be normal,
where $\mu_p$ denotes the coordinates of the vertical and horizontal plant lines of which $X_p$ is a member, and the covariance matrix is that of the positions of the plants aligned with plant p, either vertically or horizontally; $\sigma^2_{p,i}$ and $\sigma^2_{p,j}$ are the vertical and horizontal variances of $X_p$ (see the drawings).
From Equation 16, we can obtain the MAP estimate of X.
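In its standard form, the MAP estimate is

$\hat{X}_{\mathrm{MAP}}(Z) = \arg\max_{x}\big[\log p_{Z|X}(z \mid x) + \log p_X(x)\big],$

which the iterative scheme below approximates plant by plant.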
Obtaining a closed form for $p_X(x)$ is complicated by dependencies between plant positions, because the plant positions are not mutually independent. We therefore iteratively obtain the MAP estimate of each plant position $X_p$ separately:
For the special case in which the prior term is not used, the estimate $\hat{X}_p(Z)$ in Equation 20 reduces to a simpler form.
This corresponds to the cost function of the k-means clustering technique. In this case, $\hat{X}_p(Z)$ has a closed-form solution, shown in Equation 21, which is the average of the points in the cluster formed by plant p. Equation 21 involves a membership function that indicates whether the pixel n corresponds to plant $X_p$ or not.
Another special case occurs when the prior distribution on the intra-row spacing is not used. When σ → ∞, Equation 20 becomes:
Since σ is not known, we estimate it after every Iterative Coordinate Descent (ICD) iteration using the maximum likelihood estimator, which can be obtained in closed form:
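For an exponential density with mean σ, the closed-form ML estimate is simply the sample mean of the distances, $\hat{\sigma} = \frac{1}{N}\sum_{n} d_n$. A minimal Python sketch of the no-prior special case (the k-means-like ICD loop noted above) follows; the initialization X0 and iteration count are illustrative assumptions.

```python
import numpy as np

def estimate_plant_positions(Z, X0, n_iters=50):
    """Sketch of the no-prior special case (a k-means-like ICD loop).

    Z: (N, 2) array of leaf-pixel coordinates; X0: (P, 2) initial plant
    positions. Without the alignment prior, each update reduces to
    Equation 21: a plant position becomes the mean of its member pixels.
    """
    X = X0.astype(float).copy()
    sigma = np.inf
    for _ in range(n_iters):
        # Membership: pixel n belongs to its closest plant C(n),
        # under Euclidean distance (Equation 13).
        d = np.linalg.norm(Z[:, None, :] - X[None, :, :], axis=2)
        c = d.argmin(axis=1)
        for p in range(len(X)):
            members = Z[c == p]
            if len(members) > 0:
                X[p] = members.mean(axis=0)
        # ML estimate of sigma for the exponential model: the sample mean
        # of distances to the (updated) closest plants.
        d = np.linalg.norm(Z[:, None, :] - X[None, :, :], axis=2)
        sigma = d.min(axis=1).mean()
    return X, sigma
```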
The methods and processes described above can be embodied in, and fully automated via, software code modules executed by one or more general purpose computers or processors. The code modules can be stored in any type of computer-readable storage medium or other computer storage medium. Some or all of the methods can alternatively be embodied in specialized computer hardware. For example, various aspects herein may take the form of an entirely hardware aspect, an entirely software aspect (including firmware, resident software, micro-code, etc.), or an aspect combining software and hardware aspects. These aspects can all generally be referred to herein as a "service," "circuit," "circuitry," "module," or "system."
Those skilled in the art will recognize that numerous modifications can be made to the specific implementations described above. The implementations should not be limited to the particular limitations described. Other implementations may be possible.
The present patent application is related to and claims the priority benefit of U.S. Provisional Patent Application Ser. No. 62/545,461, filed Aug. 14, 2017, the contents of which is hereby incorporated by reference in its entirety into the present disclosure.
This invention was made with government support under grant No. DE-AR0000593, awarded by the Department of Energy. The United States government has certain rights in the invention.