This application is a U.S. National Stage Application under 35 U.S.C. §371 of International Application Number PCT/US2011/000465, entitled “COMPUTER VISION AND MACHINE LEARNING SOFTWARE FOR GRADING AND SORTING PLANTS,” filed on Mar. 14, 2011, which is a non-provisional application of U.S. Provisional Application No. 61/340,091, titled COMPUTER VISION AND MACHINE LEARNING SOFTWARE FOR GRADING AND SORTING PLANTS, filed on Mar. 13, 2010, both of which are herein incorporated by reference.
The strawberry industry presently uses manual labor to sort several hundred million plants every year into good and bad categories, a tedious and costly step in the process of bringing fruit to market. Plants raised by nursery farms are cultivated in large fields, grown densely like grass. The plants are harvested at night in the fall and winter when they are dormant and can be moved to their final locations for berry production. During the nursery farm harvest, the quality of the plants coming from the field is highly variable. Only about half of the harvested plants are of sufficient quality to be sold to the berry farms. It is these plants that ultimately yield the berries seen in supermarkets and roadside fruit stands. The present invention provides new sorting technologies that will fill a valuable role by standardizing plant quality and reducing the amount of time that plants are out of the ground between the nursery farms and berry farms.
Present operations to sort plants are done completely manually with hundreds of migrant workers. A typical farm employs 500-1000 laborers for a 6-8 week period each year during the plant harvest. The present invention is novel both in its application of advanced computer vision to the automated plant-sorting task, and in the specific design of the computer vision algorithms. One embodiment of the present invention applies to strawberry nursery farms. However, other embodiments apply the software engine to many different types of plants that require sophisticated quality sorting.
The software described in the present invention is a core component for a system that can take plants from a transport bin, separate them into single streams, inspect them, and move them into segregated bins that correspond to sale quality. Although automated sorting systems exist in other applications, this is the first application to strawberry nursery sorting, and the first such system to involve extensive processing and computer vision for bare-root crops.
This invention is a novel combination and sequence of computer vision and machine learning algorithms that performs a highly complex plant evaluation and sorting task. The described software performs with accuracy matching or exceeding human operations, at speeds exceeding 100 times that of human sorters. The software is adaptable to changing crop conditions, and until now there have been no automated sorting systems that can compare to human quality and speed for bare-root plant sorting.
Specific details of each embodiment of the system as shown in
One embodiment of system 10 of the present invention includes two-dimensional camera images for classification. The imagery can be grayscale or color, but color images add extra information that assists in higher-accuracy pixel classification. No specific resolution is required for operation, and system performance degrades gracefully with decreasing image resolution. The image resolution that provides the most effective classification of individual pixels and overall plants depends on the application.
One embodiment of the present invention that generates the two-dimensional camera images (step i of
The present invention can include two operational algorithms for determining region of interest for extraction:
A first algorithm can count foreground pixels along a first axis, per row. When the pixel count is higher than a threshold, the algorithm is tracking a plant. This threshold is pre-determined based on the size of the smallest plant to be detected for a given application. When the pixel count falls back below the threshold, a plant has been captured along one axis. For the second axis, the foreground pixels are summed per column, starting at the column with the most foreground pixels and walking left and right until the count falls below the threshold (marking the edges of the plant). This algorithm is fast enough to keep up with real-time data and is good at chopping off extraneous runners and debris at the edges of the plant due to the pixel-count thresholding. The result of this processing is a set of images cropped around the region of interest with the background masked, as in
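The row- and column-thresholding pass described above can be sketched as follows. This is a simplified, batch version of the streaming algorithm (it takes the first and last rows above threshold rather than tracking runs), and `min_count` is a hypothetical name for the application-specific smallest-plant threshold:

```python
import numpy as np

def crop_plant_region(mask, min_count=20):
    """Locate a plant in a binary foreground mask by thresholded
    pixel counts along each axis, returning a bounding box."""
    # First axis: rows whose foreground count exceeds the threshold
    row_counts = mask.sum(axis=1)
    rows = np.where(row_counts > min_count)[0]
    if rows.size == 0:
        return None  # no plant present
    top, bottom = rows[0], rows[-1]

    # Second axis: start at the densest column and walk left/right
    # until the per-column count falls below the threshold
    col_counts = mask[top:bottom + 1].sum(axis=0)
    peak = int(np.argmax(col_counts))
    left = peak
    while left > 0 and col_counts[left - 1] > min_count:
        left -= 1
    right = peak
    while right < col_counts.size - 1 and col_counts[right + 1] > min_count:
        right += 1

    return top, bottom, left, right
```

The walk outward from the densest column is what trims thin runners and debris at the plant edges: their per-column counts fall below the threshold and terminate the walk.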
Step iii is a second algorithm that can use a modified connected components algorithm to track ‘blobs’ and count foreground pixel volume per blob during processing. Per line, the connected components algorithm is run, joining foreground pixels with their adjacent neighbors into blobs with unique indices. When the algorithm determines that no more connectivity exists to a particular blob, that blob is tested for minimum size and extracted for plant classification. This threshold is pre-determined based on the size of the smallest plant to be detected for a given application. If the completed blob is below this threshold it is ignored, making this algorithm able to ignore dirt and small debris without requiring them to be fully processed by later stages of the system. The result of this processing is a set of images cropped around the region of interest that encompasses each blob, with the background masked, as in
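The blob pass above can be illustrated with a small flood-fill labeling sketch. This is a stand-in for the streaming, per-line connected-components implementation described in the text (a BFS over the whole mask rather than a line-by-line merge), and `min_pixels` is a hypothetical name for the smallest-plant threshold:

```python
import numpy as np
from collections import deque

def extract_blobs(mask, min_pixels=30):
    """Label 4-connected foreground blobs and keep only those at or
    above the minimum size; smaller dirt and debris are discarded."""
    visited = np.zeros_like(mask, dtype=bool)
    blobs = []
    h, w = mask.shape
    for sr in range(h):
        for sc in range(w):
            if mask[sr, sc] and not visited[sr, sc]:
                # Flood-fill one blob, collecting its pixel coordinates
                queue = deque([(sr, sc)])
                visited[sr, sc] = True
                pixels = []
                while queue:
                    r, c = queue.popleft()
                    pixels.append((r, c))
                    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        nr, nc = r + dr, c + dc
                        if (0 <= nr < h and 0 <= nc < w
                                and mask[nr, nc] and not visited[nr, nc]):
                            visited[nr, nc] = True
                            queue.append((nr, nc))
                # Size test: small blobs are ignored entirely
                if len(pixels) >= min_pixels:
                    blobs.append(pixels)
    return blobs
```

Each surviving blob's pixel list can then be turned into a cropped, background-masked image for the later classification stages.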
It is possible that the cropped image containing the region of interest may contain foreground pixels that are not part of the item of interest, possibly due to debris, dirt, or nearby plants that partially lie within this region. Pixels that are not part of this plant are masked and thus ignored by later processing, reducing the overall number of pixels that require processing and reducing errors that might otherwise be introduced by these pixels.
One embodiment of the present invention includes an algorithm for feature calculation for use in pixel classification (step iv of
(i) Grayscale intensity;
(ii) Red, Green, Blue (RGB) color information;
(iii) Hue, Saturation, Value (HSV) color information;
(iv) YIQ color information;
(v) Edge information (grayscale, binary, eroded binary);
(vi) Root finder: the algorithm developed is a custom filter that looks for pixels with adjacent high and low intensity patterns that match those expected for roots (top and bottom are high, left and right are lower). The algorithm also intensifies scores where the root match occurs in linear groups; and
(vii) FFT information: the algorithm developed uses a standard 2D FFT but collapses the information into a spectrum analysis (1D vector) for each pixel block of the image. This calculates a frequency response for each pixel block, which is very useful for differentiating texture in the image; (viii) gradient information; and (ix) entropy of the gradient information.
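A few of the listed per-pixel features can be sketched as follows. This computes only the grayscale, gradient, and YIQ-chrominance channels (standard conversion coefficients) as an illustration of the feature-vector stage, not the full feature bank:

```python
import numpy as np

def pixel_features(rgb):
    """Compute several per-pixel features for an RGB image with
    float channels in [0, 1], stacked as an (H, W, n_features)
    feature-vector array."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    gray = 0.299 * r + 0.587 * g + 0.114 * b      # grayscale intensity
    gy, gx = np.gradient(gray)                    # edge/gradient information
    grad_mag = np.hypot(gx, gy)
    i_chan = 0.596 * r - 0.274 * g - 0.322 * b    # YIQ I (chrominance)
    q_chan = 0.211 * r - 0.523 * g + 0.312 * b    # YIQ Q (chrominance)
    return np.stack([gray, grad_mag, i_chan, q_chan], axis=-1)
```

Additional channels (HSV, the root-finder filter, block-FFT spectra) would be appended as further planes of the same stack.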
After the features are computed, the neighborhood mean and variance are then calculated for each pixel over many different neighborhood sizes, as it was found that the accuracy of classifying a particular pixel can be improved by using information from the surrounding pixels. The neighborhood sizes used depend upon the parameters of the application, and are typically a function of plant size, plant characteristics, and camera parameters.
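The multi-scale neighborhood statistics can be sketched as follows; the window sizes here are illustrative only, since the text notes they are chosen per application from plant size and camera parameters:

```python
import numpy as np

def neighborhood_stats(score, sizes=(3, 7, 15)):
    """Mean and variance of one per-pixel feature over several
    square neighborhood sizes, stacked as extra feature planes."""
    feats = []
    for k in sizes:
        pad = k // 2
        padded = np.pad(score, pad, mode='edge')
        # All k-by-k windows over the padded image, one per pixel
        windows = np.lib.stride_tricks.sliding_window_view(padded, (k, k))
        feats.append(windows.mean(axis=(-2, -1)))
        feats.append(windows.var(axis=(-2, -1)))
    return np.stack(feats, axis=-1)
```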
The system is designed to be capable of generating several hundred scores per pixel to use for classification, but the configuration of features is dependent upon computational constraints and desired accuracy. The exemplary system 10 utilizes a set of 37 features that were a subset of the 308 tested features (element vectors). This subset was determined through a down-select process, explained in a later section, which determined the optimal and minimum combination of features to achieve desired minimum accuracy. This process can be performed for each plant variety as well as plant species.
Machine learning techniques are used to classify pixels into high level groups (step v of
Another aspect of the present invention uses the classified pixel image (
One particular algorithm that helps to standardize the plant classification application is the concept of virtual cropping. Even though different farmers chop their plants, such as strawberries, using different machinery, the plant evaluation of the present invention can be standardized by measuring attributes only within a fixed distance from the crown. This allows for some farmers to crop short while others crop longer, and makes the plant classification methods above more robust to these variations. This step is optional and can improve the consistency of classification between different farm products.
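Virtual cropping can be sketched as a simple mask operation. This sketch assumes roots extend downward (increasing row index) from the crown, and the distance parameter name is hypothetical:

```python
import numpy as np

def virtual_crop(mask, crown_row, max_dist_rows):
    """Ignore foreground pixels more than a fixed distance below the
    crown, so evaluation is standardized across farms that chop
    roots to different lengths."""
    cropped = mask.copy()
    cropped[crown_row + max_dist_rows:, :] = 0  # mask beyond the cut line
    return cropped
```

Because all measurements downstream see only the fixed-length region, a short-cropping farm and a long-cropping farm produce comparable attribute values for the same plant quality.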
Depending on the technology utilized for plant singulation 18 (
Another aspect of the present invention mentioned above is plant classification (step ix of
Another aspect of the present invention involves training procedures and tools for the system software (step x in
The first algorithm, shown in
A second algorithm, shown in
The second algorithm is a more manually intensive method, but is able to achieve higher accuracy in some situations. When there are cases of overlapping regions of different classes, this method is better able to assess them correctly.
Once training images have been collected, machine learning software is applied to train a model for the classifier. This first training stage produces a model containing parameters used to classify each pixel into a configurable category, such as root, stem, leaf, etc.
The second training stage involves creating a model for classifying plants 36 (
The first method involves sorting plants manually. Once the plants are separated into the various categories, an image of each plant from each category is captured and given a label. This is a time-intensive task, as the plants must be sorted manually, but it allows careful visual inspection of each plant.
The second method uses unsorted plants that are provided to the system, capturing an image of each plant. These images are transmitted to a custom user interface that displays the image as well as the pixel classified image. This allows an operator to evaluate how to categorize a plant using the same information that the system will be using during training, which has the benefit of greater consistency. The operator then selects a category from the interface and the next image is displayed. An example interface is shown in
After the required training images have been acquired, machine learning software (e.g., an SVM algorithm) is applied to build the associations of score vectors to plant categories. The final result is a model containing parameters used to determine the category of a pixel-classified image.
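The score-vector-to-category association can be illustrated with a minimal stand-in classifier. The actual system uses machine learning software such as an SVM; this sketch substitutes a nearest-centroid rule purely to show the fit/predict shape of the plant-classification model, and all names are hypothetical:

```python
import numpy as np

class CentroidPlantModel:
    """Learn one centroid of score vectors per plant category and
    classify new plants by the nearest centroid (a stand-in for the
    SVM training stage described in the text)."""

    def fit(self, scores, labels):
        self.labels_ = sorted(set(labels))
        # One mean score vector per category
        self.centroids_ = np.stack(
            [np.mean([s for s, l in zip(scores, labels) if l == lab], axis=0)
             for lab in self.labels_])
        return self

    def predict(self, score):
        dists = np.linalg.norm(self.centroids_ - score, axis=1)
        return self.labels_[int(np.argmin(dists))]
```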
Both of these training operations 32, 36 share some common algorithms to analyze, configure, and enhance accuracy. Randomization, class training distributions, penalty matrices, and exit criteria are all configurable in this implementation of the learning engine. These settings are independent of the actual machine learning engine software and enable the software to attain accuracy equivalent to or beyond human levels. Additional features have been added to the training system to allow the user to predict expected accuracy and control error rates by using margins. These margins are computed from the classifier confidence level, which represents how certain the classifier is that an item belongs to a particular category relative to the other categories. If the machine learning software is unsure about an answer for a pixel or a plant, the user can configure a specific margin to ensure certainty; if an answer does not have enough margin, it is marked ambiguous instead.
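The margin test can be sketched as follows, assuming the classifier exposes a per-category confidence score (the function name and the margin value are illustrative):

```python
def apply_margin(scores, margin=0.2):
    """Return the top category only when its confidence exceeds the
    runner-up by at least the configured margin; otherwise mark the
    answer ambiguous rather than risk a low-confidence error."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    (best, s1), (_, s2) = ranked[0], ranked[1]
    return best if s1 - s2 >= margin else "ambiguous"
```

Raising the margin trades throughput (more items marked ambiguous) for a lower error rate on the items that do receive a category.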
Another concept of the present invention is a method to make real-time adjustments to plant classification 37 during system operation (
Another aspect of the present invention is Automated Feature Down-selection to Aid in Reaching Real-time Implementations (steps iv and vi of
The first algorithm begins by utilizing all of the feature calculations that have been implemented and calculating a model of training parameters using the machine learning software. One feature calculation is then ignored and the training process is repeated, creating a new model for this feature combination. This process is repeated, each time ignoring a different feature calculation, until a model has been created for each combination. Once this step is complete the combination with the highest accuracy is chosen for the next cycle. Each cycle repeats these steps using the final combination from the previous cycle, with each cycle providing the optimal combination of that number of features. The overall accuracy can be graphed and examined to determine when the accuracy of this sort falls below acceptable levels. The model with the least number of features above required accuracy is chosen to be the real-time implementation.
The second algorithm functions similarly to the first, but starts by using only one feature at a time and increases the number of features each cycle. The feature that results in the highest accuracy is accepted permanently and the next cycle is started (looking for a second, third, fourth, etc., feature to use). This algorithm is much faster and is also successful at identifying which features to use for a real-time implementation.
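The forward variant can be sketched in the same style, again with `accuracy_fn` standing in for training and evaluating a model on a candidate feature subset, and a hypothetical `target_accuracy` stopping criterion:

```python
def forward_select(features, accuracy_fn, target_accuracy):
    """Each cycle permanently accepts the single additional feature
    that yields the highest accuracy, until the target accuracy is
    reached (or all features have been used)."""
    chosen = []
    remaining = list(features)
    while remaining:
        best = max(remaining, key=lambda f: accuracy_fn(chosen + [f]))
        chosen.append(best)
        remaining.remove(best)
        if accuracy_fn(chosen) >= target_accuracy:
            break
    return chosen
```

This evaluates far fewer subsets than the backward method when the final feature set is small relative to the full bank, which is why it runs much faster.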
While the disclosure has been described in detail and with reference to specific embodiments thereof, it will be apparent to one skilled in the art that various changes and modifications can be made therein without departing from the spirit and scope of the embodiments. Thus, it is intended that the present disclosure cover the modifications and variations of this disclosure provided they come within the scope of the appended claims and their equivalents.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/US2011/000465 | 3/14/2011 | WO | 00 | 10/22/2012 |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2011/115666 | 9/22/2011 | WO | A |
Number | Name | Date | Kind |
---|---|---|---|
5253302 | Massen | Oct 1993 | A |
5864984 | McNertney | Feb 1999 | A |
5926555 | Ort et al. | Jul 1999 | A |
6882740 | McDonald, Jr. et al. | Apr 2005 | B1 |
7123750 | Lu et al. | Oct 2006 | B2 |
7218775 | Kokko et al. | May 2007 | B2 |
7367155 | Kotyk et al. | May 2008 | B2 |
20030142852 | Lu et al. | Jul 2003 | A1 |
20050157926 | Moravec | Jul 2005 | A1 |
20050180627 | Yang et al. | Aug 2005 | A1 |
20050192760 | Dunlap | Sep 2005 | A1 |
20070044445 | Spicer | Mar 2007 | A1 |
20070119518 | Carman et al. | May 2007 | A1 |
20080084508 | Cole et al. | Apr 2008 | A1 |
20080166023 | Wang | Jul 2008 | A1 |
20090060330 | Liu | Mar 2009 | A1 |
20100086215 | Bartlett et al. | Apr 2010 | A1 |
20100254588 | Cualing et al. | Oct 2010 | A1 |
20110175984 | Tolstaya | Jul 2011 | A1 |
Number | Date | Country |
---|---|---|
0353800 | Feb 1990 | EP |
1564542 | Aug 2005 | EP |
08-190573 | Jul 1996 | JP |
2004040274 | May 2004 | WO |
Entry |
---|
International Preliminary Report on Patentability and Written Opinion of the ISA for PCT Application No. PCT/US2011/000465 dated Sep. 18, 2012. |
International Search Report and Written Opinion of the ISA for PCT Application No. PCT/US2011/000465 dated Oct. 31, 2011. |
Davis, D.B. et al., “Machine Vision Development and Use in Seedling Quality Monitoring Inspection” In: Landis, T.D.; Cregg, B., tech. coords. National Proceedings, Forest and Conservation Nursery Associations. Gen. Tech. Rep. PNW-GTR-365. Portland, OR: U.S. Department of Agriculture, Forest Service, Pacific Northwest Research Station: 75-79. Available at: http://www.fcnanet.org/proceedings/1995/davis.pdf, 1995. |
European Search Report for European Application No. 11756656.2 dated Jan. 28, 2014. |
Number | Date | Country | |
---|---|---|---|
20130028487 A1 | Jan 2013 | US |
Number | Date | Country | |
---|---|---|---|
61340091 | Mar 2010 | US |