This invention generally relates to systems and processes for detecting, segmenting, and classifying poultry carcasses using machine learning and computer vision in a smart-automated poultry plant.
Numerous studies have demonstrated increasing annual poultry consumption rates, mainly due to relatively low prices, nutritional value, and health benefits. With annualized increases in broiler production, concomitant increases in labor are necessary for meat production supply chain efficiency. Conventional chicken processing plants rely heavily on manual labor throughout the hanging, processing, deboning, and packaging of components at the plant. Employees are involved in visually inspecting chickens, weighing carcasses, and determining the size and weight of chicken parts. The reliance on human labor is expensive and prone to error. In addition to the costs of increased workforce labor and workforce development, many poultry companies are suffering from labor shortages.
Another drawback of relying on human labor for poultry processing is inconsistency in carcass evaluation. Many companies use assembly lines stationed by employees to inspect the quality of chicken carcasses, which leaves room for human error and can result in miscategorized carcass defects. For example, after the first stage of broiler processing, i.e., evisceration, not every chicken is a quality carcass. Imperfections may arise due to the stunner, scalder, picker, and evisceration processes, and detecting these imperfections can be challenging, particularly in high-speed assembly line production operations.
Similarly, if accurate yields are to be measured, multiple people are typically engaged in the weighing process. For example, one employee places an item (e.g., chicken body, fat, wing, leg, tender, breast) onto a scale, and another identifies the chicken part and confirms that the scale reading has stabilized by pressing a button. Rapidly weighing the chicken parts and correctly associating an accurate weight with each part is also challenging for workers.
The invention relates to a smart-automated poultry plant system and process based on machine learning and computer vision. The smart-automated system and process predict the quality of poultry carcasses (e.g., chicken broilers) and analyze them for any imperfections resulting from production and transport welfare issues, as well as from processing plant stunner, scalder, picker, and other equipment malfunctions. Depending on the carcass detection result, the system and process can designate the carcass to stay in the processing line or to be redirected if any rework is necessary based on the automated visual examination at the first critical control point.
Accordingly, it is an object of this invention to provide a new and improved system and process for identifying imperfections in chickens and accurately weighing chickens and chicken parts during processing.
Another object of this invention is to provide automated computer vision-based smart chicken plant systems and processes to automate data collection and implement vision-based smart technology that is more versatile, economical, and inclusive than current technology and methodologies.
A further object of this invention is to provide smart-automated poultry plant systems and processes that use machine learning and computer vision for detecting, segmenting, and classifying the quality of poultry carcasses.
The above and other objects and advantages of this invention may be more clearly seen when viewed in conjunction with the accompanying drawings, wherein:
While this invention is susceptible to embodiment in many different forms, there are shown in the drawings, and will hereinafter be described in detail, some specific embodiments of the invention. It should be understood, however, that the present disclosure is to be considered an exemplification of the principles of the invention and is not intended to limit the invention to the specific embodiments so described.
The invention relates to systems and processes for implementing computer vision and machine learning in a poultry processing plant. The invention can also be applied to other meat processing facilities and similar assembly or disassembly systems. Visual inspection is one of the most basic but essential steps in controlling meat quality before the product is prepared, packaged, and distributed to the market. The smart-automated systems and processes that are disclosed herein improve poultry processing and food safety by using an automated detection model to classify normal or defective (contaminated, mutilated, or skin lacerated) carcasses.
Referring to the drawings in detail,
Turning to
The backbone or image input module 302 of the system 100 is a convolutional neural network (CNN) that takes an input image 310 of size H×W and generates a set of four low-resolution feature maps 312:

F1 ∈ ℝ^(C_F1 × H/4 × W/4), F2 ∈ ℝ^(C_F2 × H/8 × W/8), F3 ∈ ℝ^(C_F3 × H/16 × W/16), F4 ∈ ℝ^(C_F4 × H/32 × W/32),

where C_F1, C_F2, C_F3, and C_F4 are the numbers of channels.
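By way of illustration only, the sketch below shows how such a backbone might be realized with a standard ResNet-50 in PyTorch; the choice of ResNet-50, the stage names, and the example input size are assumptions for demonstration, not the claimed implementation.

```python
# Illustrative sketch only: a ResNet-50 backbone emitting four multi-scale
# feature maps F1..F4 at 1/4, 1/8, 1/16, and 1/32 of the input resolution.
import torch
from torchvision.models import resnet50
from torchvision.models.feature_extraction import create_feature_extractor

backbone = resnet50(weights=None)
extractor = create_feature_extractor(
    backbone,
    return_nodes={"layer1": "F1", "layer2": "F2", "layer3": "F3", "layer4": "F4"},
)

image = torch.randn(1, 3, 256, 256)      # an H x W = 256 x 256 input image
features = extractor(image)
for name, fmap in features.items():
    print(name, tuple(fmap.shape))       # F1: (1, 256, 64, 64) ... F4: (1, 2048, 8, 8)
```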
The pixel decoder module 304 of the system 100 takes the features from the backbone 302 and produces the pyramid of feature maps 316 with resolutions of 1/32, 1/16, 1/8, and 1/4 of the input image 310 so that both high and low resolutions can be utilized. To get the first feature map 316, the pixel decoder 304 takes F4 and performs a 1×1 convolution (to decrease the channel size to Cp). This first feature map 316 is upsampled by a factor of 2 and then merged with the backbone feature of the same spatial size, i.e., F3, by element-wise summation. A 3×3 convolution then follows on the merged map to get a final feature map 316. This procedure is repeated up to the highest-resolution feature map 316, after which the pixel decoder module 304 has produced the feature pyramid network (FPN) with at least four feature maps 316:

D1 ∈ ℝ^(Cp × H/4 × W/4), D2 ∈ ℝ^(Cp × H/8 × W/8), D3 ∈ ℝ^(Cp × H/16 × W/16), D4 ∈ ℝ^(Cp × H/32 × W/32).
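As a non-limiting sketch, one merge step of such a pixel decoder could be written as follows in PyTorch; the channel sizes and the nearest-neighbor upsampling mode are assumptions.

```python
# Sketch of a single pixel-decoder merge step: 1x1 lateral convolution,
# 2x upsampling, element-wise summation, then a 3x3 convolution.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FPNMergeStep(nn.Module):
    def __init__(self, c_backbone: int, c_p: int):
        super().__init__()
        self.lateral = nn.Conv2d(c_backbone, c_p, kernel_size=1)     # reduce to Cp channels
        self.smooth = nn.Conv2d(c_p, c_p, kernel_size=3, padding=1)  # 3x3 conv on merged map

    def forward(self, coarser: torch.Tensor, skip: torch.Tensor) -> torch.Tensor:
        up = F.interpolate(coarser, scale_factor=2, mode="nearest")  # upsample by 2
        return self.smooth(up + self.lateral(skip))                  # merge and smooth

# Example: merge a 1/32-resolution map (Cp = 256 channels) with backbone feature F3.
step = FPNMergeStep(c_backbone=1024, c_p=256)
d3 = step(torch.randn(1, 256, 8, 8), torch.randn(1, 1024, 16, 16))   # -> (1, 256, 16, 16)
```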
The multi-scale transformer encoder module 306 of the system 100 takes as input the first three feature maps 312 from the backbone 302, from low to high resolution, i.e., F4, F3, and F2, each followed by a 1×1 convolution to obtain the same channel size Ce. Each feature map 312 is flattened and augmented with a positional embedding e_position ∈ ℝ^(H_l·W_l × C_e) and a learnable scale-level embedding e_level ∈ ℝ^(1 × C_e) before being processed by the transformer encoder to produce the scale feature maps 314.
The inputs of the mask-attention transformer decoder module 308 are the scale feature maps 314 from the transformer encoder module 306 and N learnable positional embeddings that act as object queries. The decoder module 308 has three layers 322 and two types of attention submodules in each layer: a mask-attention submodule 320 and a self-attention submodule 318. Object queries interact with one another in the self-attention submodule 318 to identify their relationships; here, both the query and key elements are object queries. The mask-attention submodule 320, for each query, extracts features by restricting cross-attention to the foreground region of the predicted mask. The query elements come from the object queries, while the key elements come from the feature maps 314 from the transformer encoder 306. The mask-attention submodule 320 calculates the attention matrix via:

A_l = softmax(Mask_{l−1} + Q_l K_l^T) V_l + A_{l−1},

while the attention mask Mask_{l−1} at pixel (x, y) is defined as:

Mask_{l−1}(x, y) = 0 if B_{l−1}(x, y) = 1, and −∞ otherwise,

where A_l ∈ ℝ^(N×C) is the set of N query features at the l-th layer, and B_{l−1}, with size N×H_lW_l, is the binarized output of the preceding (l−1)-th decoder layer's resized mask prediction. A_0 denotes the input query features, and B_0 is obtained from A_0. The decoder module 308 uses the first three features from the lowest resolutions generated by the pixel decoder module 304, i.e., D4, D3, and D2. Each feature is augmented with a positional embedding e_position ∈ ℝ^(H_l·W_l × C_e) and a learnable scale-level embedding e_level ∈ ℝ^(1 × C_e).
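A minimal sketch of this mask-attention computation, assuming PyTorch and the shapes given above (N queries, C channels, H_l·W_l key locations), is shown below; the fallback to full attention when a predicted mask is empty is an implementation detail assumed here to avoid degenerate softmax rows.

```python
# Sketch of mask-attention: cross-attention restricted to the foreground of
# the previous layer's binarized mask prediction B_{l-1}.
import torch

def mask_attention(q, k, v, b_prev, a_prev):
    # q: (N, C) object queries; k, v: (HW, C) encoder features;
    # b_prev: (N, HW) binarized mask B_{l-1}; a_prev: (N, C) query features A_{l-1}.
    fg = b_prev.bool()
    attn_mask = torch.zeros_like(b_prev, dtype=q.dtype)
    attn_mask = attn_mask.masked_fill(~fg, float("-inf"))     # Mask_{l-1}(x, y)
    # Assumed safeguard: a query with an empty mask attends everywhere instead,
    # so its softmax row is not all -inf.
    attn_mask = attn_mask.masked_fill(~fg.any(dim=-1, keepdim=True), 0.0)
    logits = q @ k.T + attn_mask                              # (N, HW)
    weights = torch.softmax(logits, dim=-1)                   # foreground-restricted attention
    return weights @ v + a_prev                               # A_l = softmax(Mask + QK^T)V + A_{l-1}
```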
A linear classifier with softmax activation can then be applied to generate N class predictions, one for each segment. To predict the masks, a two-layer multi-layer perceptron transforms the N per-segment embeddings into N mask embeddings e_mask ∈ ℝ^(C_e×N). To further increase the detail of the mask prediction, the last pyramid feature from the pixel decoder 304, which has a resolution 1/4 the size of the original image, is upsampled two times to get per-pixel embeddings e_pixel ∈ ℝ^(C_e×H×W) before taking the dot product with the mask embeddings e_mask from the transformer decoder module 308. A sigmoid activation can follow the dot product to obtain the N mask predictions.
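A minimal sketch of these prediction heads in PyTorch follows; the number of queries, channel size, and the extra "no object" class are illustrative assumptions rather than values from this disclosure.

```python
# Sketch of the prediction heads: a linear classifier with softmax over the
# per-segment embeddings, and mask prediction as a dot product between mask
# embeddings and per-pixel embeddings followed by a sigmoid.
import torch
import torch.nn as nn

N, Ce, H, W, num_classes = 100, 256, 256, 256, 2        # e.g., normal vs. defective

per_segment = torch.randn(N, Ce)                        # transformer decoder output
classifier = nn.Linear(Ce, num_classes + 1)             # +1 "no object" class (assumed)
class_probs = classifier(per_segment).softmax(dim=-1)   # N class predictions

mask_mlp = nn.Sequential(nn.Linear(Ce, Ce), nn.ReLU(), nn.Linear(Ce, Ce))
e_mask = mask_mlp(per_segment)                          # (N, Ce) mask embeddings
e_pixel = torch.randn(Ce, H, W)                         # upsampled per-pixel embeddings
masks = torch.einsum("nc,chw->nhw", e_mask, e_pixel).sigmoid()  # N H x W mask predictions
```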
The systems and processes for implementing computer vision and machine learning in a poultry processing plant are further illustrated by the following examples, which are provided for the purpose of demonstration rather than limitation.
Camera equipment 102 was used to collect the photographs and videos, as shown in
To further improve mask quality, the pre-processing procedure 400 performs binary thresholding on all three red-green-blue (RGB) channels to capture all needed information (step 408). By choosing a threshold number t for each image, the thresholded image produced by the threshold function f_threshold can be calculated for each channel as:

f_threshold(I(x, y); t) = 255 if I(x, y) > t, and 0 otherwise,

where I(x, y) is the channel intensity at pixel (x, y). The procedure 400 then combines the three thresholded channels into the final mask (step 410). Any remaining undesirable spots from thresholding are cleaned using an opening morphological transformation (step 412). The final step of the pre-processing procedure 400 is computing the area of any remaining contours and discarding any excess contours so that only the main object remains (step 414). This step 414 results in a set of RGB images and corresponding mask annotation images.
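The steps above might be implemented with OpenCV roughly as sketched below; the per-channel threshold values and kernel size are placeholders, since the actual thresholds are chosen per image as described.

```python
# Sketch of the mask pre-processing steps 408-414, assuming OpenCV and NumPy.
import cv2
import numpy as np

def carcass_mask(image_bgr: np.ndarray, thresholds=(60, 60, 60)) -> np.ndarray:
    # Step 408: binary-threshold each of the three color channels.
    channels = cv2.split(image_bgr)
    masks = [cv2.threshold(ch, t, 255, cv2.THRESH_BINARY)[1]
             for ch, t in zip(channels, thresholds)]
    # Step 410: combine the per-channel masks into the final mask.
    mask = cv2.bitwise_or(cv2.bitwise_or(masks[0], masks[1]), masks[2])
    # Step 412: opening (erosion then dilation) removes small thresholding spots.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    # Step 414: keep only the largest contour so the main object remains.
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    clean = np.zeros_like(mask)
    if contours:
        biggest = max(contours, key=cv2.contourArea)
        cv2.drawContours(clean, [biggest], -1, 255, thickness=cv2.FILLED)
    return clean
```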
Each video is a compilation of many continuous frames, so even a single broiler has an excess of corresponding image frames. For more straightforward labeling of the images, the system 100 automatically counted the birds, which also helped track which bird was connected to which image. This counting algorithm, shown in Algorithm 1 below, helped to more accurately note whether a carcass was defective or normal while tracking the index of carcasses.
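Algorithm 1 itself is not reproduced here. Purely for illustration, a hypothetical frame-by-frame counter of the kind described, which counts a bird when its tracked centroid crosses a fixed line, might look like the following; the centroid input and line position are assumptions.

```python
# Hypothetical line-crossing bird counter (not Algorithm 1 itself): a bird is
# counted when its centroid crosses a fixed vertical line between frames.
def count_birds(centroid_xs, line_x=320):
    """Count birds whose centroid crosses a vertical line between frames.

    centroid_xs: per-frame x-coordinates of the tracked carcass centroid.
    """
    count = 0
    for prev, cur in zip(centroid_xs, centroid_xs[1:]):
        if prev < line_x <= cur:          # crossing left-to-right
            count += 1
    return count

# Example: one carcass moves across the frame, then another follows.
print(count_birds([100, 250, 330, 400, 120, 300, 350]))  # -> 2
```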
After the pre-processing procedure 400, the poultry carcass 104 must meet the criteria and be within the region of interest (ROI) without any excess pieces touching the image border.
Table 1 below shows the number of segments in the dataset. Since there is only one carcass per image, the number of segments equals the number of images.
The system 100 was compared with Mask R-CNN, an instance segmentation network, as a baseline for validation on the single dataset in Table 1 and the synthetic dataset in Table 2. Input resolutions were both resized to 256×256, and AP at IoU = 0.95 (AP@95) was used as the default metric for instance segmentation. FLOPs were calculated over 100 images in the test set of each dataset. When computing frames per second (fps), the average runtime on a single NVIDIA RTX A6000 GPU with a batch size of 1 over the complete test set was used. The system 100 was demonstrated to have better AP on both datasets than Mask R-CNN with all backbones by a large margin. The Mask R-CNN models could not perform detection at IoU = 0.95, while the system 100 still provided reasonable AP scores, indicating that the system 100 can provide high-resolution masks.
In another aspect, the system and process 100 are configured to analyze images of processed poultry carcass parts being weighed on a scale. In this mode of operation, the system and process are configured to automatically classify the processed poultry carcass parts being weighed. The system and process also confirm the weight displayed for the processed poultry part by conducting a time series analysis that compares the weight shown on the scale display with the weight output directly by the scale to the computer system.
Turning now to
To carry out this functionality, the computer system 108 of the automated broiler processing system 100 first detects the scale 118 and display screen 120 to recognize whether the scale 118 is vacant or whether a carcass 114 or fat pad 116 has been deposited on the scale 118. Once the computer system 108 detects the presence of an object of interest (either the chicken carcass 114 or the abdominal fat pad 116), the automated broiler processing system 100 employs a recognizer module to identify whether the object on the scale 118 is a chicken carcass 114, an abdominal fat pad 116, or both. At the same time, a digit detector module and a digit recognizer module within the computer system 108 of the automated broiler processing system 100 are configured to read the digits from the display 120 on the scale 118 and apply a time series analysis to correlate the reading from the display 120 with the reading that is transmitted directly from the scale 118 to the computer system 108. Both the digit recognizer module and the time series analysis are used to estimate the most stable weight on the scale 118. The computer system 108 also includes an action recognizer to identify when a carcass 114 or fat pad 116 is placed on the scale 118. This allows the automated broiler processing system 100 to track the same carcass 114 and fat pad 116 throughout the processing system.
Furthermore, the digit recognizer module of the automated broiler processing system 100 contains five sub-modules as follows: (i) digital scale detection to localize the scale; (ii) digital scale registration to align the scale by computing a homography matrix; (iii) digit separation to partition the sequence of digits on the scale screen into a set of individual digits; (iv) image enhancement and denoising by generative adversarial networks (GANs); and (v) digit classification by an advanced machine learning technique, e.g., deep learning with a convolutional neural network (CNN) in which 12 classes are defined at the last fully connected layer.
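As one hedged example, sub-module (ii) could be realized with OpenCV's homography estimation as sketched below; the corner ordering and output size are assumptions, and the four display corners are presumed to come from the scale-detection sub-module (i).

```python
# Sketch of sub-module (ii), digital scale registration, assuming OpenCV.
import cv2
import numpy as np

def register_scale_display(frame: np.ndarray, corners: np.ndarray,
                           out_w: int = 200, out_h: int = 80) -> np.ndarray:
    # Destination rectangle for the rectified display, ordered to match the
    # detected corners (top-left, top-right, bottom-right, bottom-left).
    dst = np.float32([[0, 0], [out_w, 0], [out_w, out_h], [0, out_h]])
    homography, _ = cv2.findHomography(corners.astype(np.float32), dst)
    # Warp the frame so the display appears axis-aligned for digit separation.
    return cv2.warpPerspective(frame, homography, (out_w, out_h))
```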
Thus, in this mode of operation, the automated broiler processing system 100 is configured to: (i) automatically discriminate between a chicken carcass 114 and an abdominal fat pad 116 on the scale 118; (ii) confirm the most stable weight for the object on the scale 118; (iii) identify the weights of the carcass 114 and fat pad 116 from the same chicken 104; and (iv) input into the computer system 108 the type of product (i.e., chicken carcass 114 or abdominal fat pad 116) and the weight of the product by using visual confirmation of the scale measurements sent directly to the computer system 108.
Turning to
To measure the weight of carcass parts 124 (i.e., breast fillets, breast tenders, thighs, drumsticks, wings, and other parts), the automated broiler processing system 100 is configured to monitor the carcass parts 124 as they are automatically placed on the scale 118. The scale camera 122 is installed at the scale 118 to identify the carcass parts 124 on the scale 118 and capture the corresponding weight from the scale display 120. The computer system 108 of the automated broiler processing system 100 includes a carcass parts recognizer module to distinguish between different parts of the chicken 104 at the scale 118; notably, this carcass parts recognizer also includes token identification. In this mode of operation, the computer system 108 of the automated broiler processing system 100 also contains a digit recognizer focused on the display 120 of the scale 118 to read the digits, execute a machine learning algorithm to estimate the stability of the visual signal from the scale 118, and perform a time series analysis. Both the digit recognizer module and the time series analysis are used to estimate the most stable weight for the item on the scale 118.
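For illustration only, one simple form such a time series stability estimate could take is sketched below; the window length and tolerance are assumptions, not values from this disclosure.

```python
# Hedged sketch: estimate the "most stable weight" as the mean of the first
# window of consecutive readings whose spread is within a tolerance.
def stable_weight(readings, window=10, tolerance=0.005):
    """Return the mean of the first `window` consecutive readings whose
    max-min spread is within `tolerance` (same units as the readings)."""
    for i in range(len(readings) - window + 1):
        chunk = readings[i:i + window]
        if max(chunk) - min(chunk) <= tolerance:
            return sum(chunk) / window
    return None  # the scale never settled

# Example: readings jitter while the part is placed, then settle at ~1.52 kg.
print(stable_weight([0.0, 0.8, 1.4, 1.55, 1.51] + [1.52] * 10))  # -> 1.52
```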
As noted above, the system and process may be implemented in a computer system using hardware, software, firmware, tangible computer-readable media having instructions stored thereon, or a combination thereof and may be implemented in one or more computer systems or other processing systems.
If programmable logic is used, such logic may execute on a commercially available processing platform or a special purpose device. One of ordinary skill in the art may appreciate that embodiments of the disclosed subject matter can be practiced with various computer system configurations, including multi-core multi-processor systems, minicomputers, mainframe computers, computers linked or clustered with distributed functions, as well as pervasive or miniature computers that may be embedded into virtually any device.
For instance, at least one processor device and a memory may be used to implement the above-described embodiments. A processor device may be a single processor, a plurality of processors, or combinations thereof. Processor devices may have one or more processor “cores.”
Various embodiments of the inventions may be implemented in terms of this example computer system. After reading this description, it will become apparent to a person skilled in the relevant art how to implement one or more of the inventions using other computer systems and/or computer architectures. Although operations may be described as a sequential process, some of the operations may be performed in parallel, concurrently, and/or in a distributed environment and with program code stored locally or remotely for access by single or multi-processor machines. In addition, in some embodiments, the order of operations may be rearranged without departing from the spirit of the disclosed subject matter.
The processor device may be a special purpose or a general-purpose processor device, or may be a cloud service wherein the processor device resides in the cloud. As will be appreciated by persons skilled in the relevant art, the processor device may also be a single processor in a multi-core/multi-processor system, such a system operating alone or in a cluster of computing devices, such as a server farm. The processor device is connected to a communication infrastructure, for example, a bus, message queue, network, or multi-core message-passing scheme.
The computer system also includes a main memory, for example, random access memory (RAM), and may also include a secondary memory. The secondary memory may include, for example, a hard disk drive or a removable storage drive. The removable storage drive may include a floppy disk drive, a magnetic tape drive, an optical disk drive, a flash memory, a Universal Serial Bus (USB) drive, or the like. The removable storage drive reads from and/or writes to a removable storage unit in a well-known manner. The removable storage unit may include a floppy disk, magnetic tape, optical disk, etc., which is read by and written to by the removable storage drive. As will be appreciated by persons skilled in the relevant art, the removable storage unit includes a computer usable storage medium having stored therein computer software and/or data.
The computer system (optionally) includes a display interface (which can include input and output devices such as keyboards, mice, etc.) that forwards graphics, text, and other data from communication infrastructure (or from a frame buffer not shown) for display on a display unit.
In alternative implementations, the secondary memory may include other similar means for allowing computer programs or other instructions to be loaded into the computer system. Such means may include, for example, the removable storage unit and an interface. Examples of such means may include a program cartridge and cartridge interface (such as that found in video game devices), a removable memory chip (such as an EPROM, PROM, or Flash memory) and associated socket, and other removable storage units and interfaces which allow software and data to be transferred from the removable storage unit to the computer system.
The computer system may also include a communication interface. The communication interface allows software and data to be transferred between the computer system and external devices. The communication interface may include a modem, a network interface (such as an Ethernet card), a communication port, a PCMCIA slot and card, or the like. Software and data transferred via the communication interface may be in the form of signals, which may be electronic, electromagnetic, optical, or other signals capable of being received by the communication interface. These signals may be provided to the communication interface via a communication path. The communication path carries signals, such as over a network in a distributed computing environment, for example, an intranet or the Internet, and may be implemented using wire or cable, fiber optics, a phone line, a cellular phone link, an RF link, or other communication channels.
In this document, the terms “computer program medium” and “computer usable medium” are used to generally refer to media such as the removable storage unit and a hard disk installed in the hard disk drive. The computer program medium and computer usable medium may also refer to memories, such as main memory and secondary memory, which may be memory semiconductors (e.g., DRAMs, etc.) or cloud computing.
Computer programs (also called computer control logic) are stored in the main memory and/or the secondary memory. The computer programs may also be received via the communication interface. Such computer programs, when executed, enable the computer system to implement the embodiments as discussed herein, including but not limited to machine learning and advanced artificial intelligence. In particular, the computer programs, when executed, enable the processor device to implement the processes of the embodiments discussed here. Accordingly, such computer programs represent controllers of the computer system. Where the embodiments are implemented using software, the software may be stored in a computer program product and loaded into the computer system using the removable storage drive, the interface, the hard disk drive, or the communication interface.
Moreover, embodiments of the disclosure may be practiced with other computer system configurations, including hand-held devices, multi-processor systems, microprocessor-based or programmable consumer electronics, minicomputers, mainframe computers, and the like. Embodiments of the disclosure may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote memory storage devices.
Embodiments of the inventions also may be directed to computer program products comprising software stored on any computer useable medium. Such software, when executed in one or more data processing devices, causes the data processing device(s) to operate as described herein. Embodiments of the inventions may employ any computer-useable or readable medium. Examples of computer useable mediums include, but are not limited to, primary storage devices (e.g., any type of random access memory) and secondary storage devices (e.g., hard drives, floppy disks, CD-ROMs, ZIP disks, tapes, magnetic storage devices, optical storage devices, MEMS, nanotechnological storage devices, etc.).
The benefits and advantages described above may relate to one embodiment or may relate to several embodiments. The embodiments are not limited to those that solve any or all of the stated problems or those that have any or all of the stated benefits and advantages. The operations of the methods described herein may be carried out in any suitable order or simultaneously where appropriate. Additionally, individual blocks may be added or deleted from any of the methods without departing from the spirit and scope of the subject matter described herein. Aspects of any of the examples described above may be combined with aspects of any of the other examples described to form further examples without losing the effect sought.
The above description is given by way of example only, and various modifications may be made by those skilled in the art. The above specification, examples, and data provide a complete description of the structure and use of exemplary embodiments. Although various embodiments have been described above with a certain degree of particularity or with reference to one or more individual embodiments, those skilled in the art could make numerous alterations to the disclosed embodiments without departing from the spirit or scope of this specification.
Benefits, other advantages, and solutions to problems have been described above with regard to specific embodiments. However, the benefits, advantages, solutions to problems, and any element(s) that may cause any benefit, advantage, or solution to occur or become more pronounced are not to be construed as critical, required, or essential features or elements of any or all the claims. As used herein, the terms “comprises,” “comprising,” or any other variations thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Further, no element described herein is required for the practice of the invention unless expressly described as “essential” or “critical.”
The preceding detailed description of exemplary embodiments of the invention makes reference to the accompanying drawings, which show the exemplary embodiment by way of illustration. While these exemplary embodiments are described in sufficient detail to enable those skilled in the art to practice the invention, it should be understood that other embodiments may be realized and that logical and mechanical changes may be made without departing from the spirit and scope of the invention. For example, the steps recited in any of the method or process claims may be executed in any order and are not limited to the order presented. Thus, the preceding detailed description is presented for purposes of illustration only and not of limitation, and the scope of the invention is defined by the preceding description and with respect to the attached claims.
This application claims the benefit of U.S. Provisional Patent Application No. 63/243,247 filed on Sep. 13, 2021, and incorporates said provisional application by reference in its entirety into this document as if fully set out at this point.
Filing Document | Filing Date | Country | Kind
--- | --- | --- | ---
PCT/US2022/076377 | 9/13/2022 | WO |

Number | Date | Country
--- | --- | ---
63/243,247 | Sep. 13, 2021 | US