This disclosure relates generally to managing inventory at product storage facilities, and in particular, to updating templates for use in recognizing products in images captured at a product storage facility.
A typical product storage facility (e.g., a retail store, a product distribution center, a warehouse, etc.) may have hundreds of shelves and thousands of products stored on the shelves and/or on pallets. Individual products offered for sale to consumers are typically stocked on shelves, pallets, and/or each other in a product storage space having a price tag label assigned thereto. It is common for workers of such product storage facilities to manually (e.g., visually) inspect product display shelves and other product storage spaces to verify which of the on-shelf price tag labels match which of the on-shelf products, and whether the shelves storing the on-shelf products are correctly labeled with appropriate price tag labels.
Given the large number of product storage areas such as shelves, pallets, and other product displays at product storage facilities of large retailers, and the even larger number of products stored in the product storage areas, manual inspection of the price tag labels and the products on the product storage structures at the product storage facilities by the workers is very time consuming and significantly increases the operations cost for a retailer, since these workers could be performing other tasks if they were not involved in manually inspecting the product storage structures, price tag labels, and products.
On the other hand, optical character-based recognition of on-shelf product labels and on-shelf products based on hundreds or thousands of images captured at hundreds or thousands of product storage facilities, each of the images depicting a distinct on-shelf product label or on-shelf product, requires significant system resources and/or high processing costs for large retailers, some of these costs being associated with the training and retraining of image recognition models for recognizing products in images of product structures captured at the product storage facilities of the retailers by image capture devices.
Disclosed herein are embodiments of systems and methods of updating templates for use in recognizing products in images captured at a product storage facility. This description includes drawings.
Elements in the figures are illustrated for simplicity and clarity and have not necessarily been drawn to scale. For example, the dimensions and/or relative positioning of some of the elements in the figures may be exaggerated relative to other elements to help to improve understanding of various embodiments of the present invention. Also, common but well-understood elements that are useful or necessary in a commercially feasible embodiment are often not depicted in order to facilitate a less obstructed view of these various embodiments of the present invention. Certain actions and/or steps may be described or depicted in a particular order of occurrence while those skilled in the art will understand that such specificity with respect to sequence is not actually required.
The terms and expressions used herein have the ordinary technical meaning as is accorded to such terms and expressions by persons skilled in the technical field as set forth above except where different specific meanings have otherwise been set forth herein.
The following description is not to be taken in a limiting sense, but is made merely for the purpose of describing the general principles of exemplary embodiments. Reference throughout this specification to “one embodiment,” “an embodiment,” or similar language means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, appearances of the phrases “in one embodiment,” “in an embodiment,” and similar language throughout this specification may, but do not necessarily, all refer to the same embodiment.
Systems and methods of updating templates for use in recognizing products in images captured at a product storage facility include an image capture device that captures one or more images of a product storage structure at a product storage facility, a computing device in communication with the image capture device, and an electronic database that stores keyword model templates and feature model templates associated with images of previously recognized individual products detected at the product storage facility. The computing device obtains the keyword and feature model templates associated with a recognized product from the electronic database, extracts keywords from the products depicted in the images associated with the obtained keyword model templates, identifies products that are similar to the recognized product, and updates the keyword model template for each of the products to include must keywords and negative keywords, facilitating recognition of products in subsequent images captured by the image capture device.
In some embodiments, systems of updating templates for use in recognizing individual products in images captured at a product storage facility include an image capture device having a field of view that includes at least a portion of a product storage structure of the product storage facility, the product storage structure having the individual products arranged thereon, wherein the image capture device is configured to capture one or more images of the product storage structure; a computing device including a control circuit, the computing device being communicatively coupled to the image capture device; and an electronic database configured to store keyword model templates and feature model templates associated with the images of previously recognized individual products stored at the product storage facility, wherein the keyword model templates include an image of a recognized individual product and metadata associated with the recognized individual product, and wherein the feature model templates include the image of the recognized product in association with visual features of the recognized product. The control circuit of the computing device is configured to: obtain at least one of the keyword model templates and feature model templates stored in the electronic database; extract one or more keywords from each of the individual recognized products depicted in the captured images associated with the obtained keyword model templates; correlate the keywords extracted from each of the individual recognized products depicted in the images associated with the obtained keyword model templates to identify similar products, where the similar products share some keywords with each other and do not share other keywords with each other; update a keyword model template for each of the similar products to set the keywords that are unique to each of the similar recognized products as must keywords and to set the must keywords that are not shared between the similar recognized products as negative keywords; and transmit the updated keyword model template including the must keywords and the negative keywords for each of the similar recognized products to the electronic database for storage, to be used for analysis of subsequent images captured by the image capture device and recognition of the products in the subsequent images.
In some embodiments, a method of updating templates for use in recognizing individual products in images captured at a product storage facility includes: capturing one or more images of a product storage structure of the product storage facility by an image capture device having a field of view that includes at least a portion of the product storage structure, the product storage structure having the individual products arranged thereon; storing, in an electronic database, keyword model templates and feature model templates associated with the images of previously recognized individual products stored at the product storage facility, wherein the keyword model templates include an image of a recognized individual product and metadata associated with the recognized individual product, and wherein the feature model templates include the image of the recognized product in association with visual features of the recognized product; and, by a computing device including a control circuit and communicatively coupled to the image capture device: obtaining at least one of the keyword model templates and feature model templates stored in the electronic database; extracting one or more keywords from each of the individual recognized products depicted in the captured images associated with the obtained keyword model templates; correlating the keywords extracted from each of the individual recognized products depicted in the images associated with the obtained keyword model templates to identify similar products, where the similar products share some keywords with each other and do not share other keywords with each other; updating a keyword model template for each of the similar products to set the keywords that are unique to each of the similar recognized products as must keywords and to set the must keywords that are not shared between the similar recognized products as negative keywords; and transmitting the updated keyword model template including the must keywords and the negative keywords for each of the similar recognized products to the electronic database for storage, to be used for analysis of subsequent images captured by the image capture device and recognition of the products in the subsequent images.
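For purposes of illustration only, the correlation of extracted keywords and the derivation of the must keywords and negative keywords described above may be sketched in Python along the following lines, in which the function names, variable names, and product identifiers are hypothetical and non-limiting:

```python
# Non-limiting sketch: derive "must" and "negative" keywords for the
# keyword model template of each product from keywords extracted
# (e.g., via optical character recognition) from images of the products.

def update_keyword_templates(product_keywords):
    """product_keywords maps a hypothetical product identifier (e.g., a
    UPC) to the set of keywords extracted from that product's images."""
    # Pass 1: a product's "must" keywords are the keywords unique to it
    # relative to similar products (products sharing at least one keyword).
    must = {}
    for upc, kws in product_keywords.items():
        shared = set()
        for other_upc, other_kws in product_keywords.items():
            if other_upc != upc and kws & other_kws:
                shared |= other_kws
        must[upc] = kws - shared
    # Pass 2: the must keywords of similar products that a product does
    # not itself carry become that product's "negative" keywords.
    templates = {}
    for upc, kws in product_keywords.items():
        negative = set()
        for other_upc, other_kws in product_keywords.items():
            if other_upc != upc and kws & other_kws:
                negative |= must[other_upc] - kws
        templates[upc] = {"must": sorted(must[upc]),
                          "negative": sorted(negative)}
    return templates

# Hypothetical example: two similar ink cartridges differing by model number.
templates = update_keyword_templates({
    "upc-702": {"epson", "ink", "702"},
    "upc-220": {"epson", "ink", "220"},
})
# templates["upc-702"] -> {"must": ["702"], "negative": ["220"]}
```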
It is understood that the direction and type of movement of the image capture device 120 about the product storage area 110 of the product storage facility 105 may depend on the physical arrangement of the product storage area 110 and/or the size and shape of the product storage structure 115. For example, the image capture device 120 may move linearly down an aisle alongside a product storage structure 115 (e.g., a shelving unit) located in a product storage area 110 of a product storage facility 105, or may move in a circular fashion around a table having curved/multiple sides. Notably, the term “product storage structure” as used herein generally refers to a structure on which products 190a-190f are stored, and may include a pallet, a shelf cabinet, a single shelf, a table, a rack, a refrigerator, a freezer, a display, a bin, a gondola, a case, a countertop, or another product display. Likewise, it will be appreciated that the number of individual products 190a-190f, representing six individual units of each of six different exemplary products (generically labeled as “Brand 1,” “Brand 2,” “Brand 3,” “Brand 4,” “Brand 5,” and “Brand 6”), is chosen for simplicity and by way of example only, and that the product storage structure 115 may store more or fewer than six units of each of the products 190a-190f. Further, the size and shape of the products 190a-190f in
Notably, the term “products” may refer to individual products 190a-190f (some of which may be single-piece/single-component products and some of which may be multi-piece/multi-component products), as well as to packages or containers of products 190a-190f, which may be plastic- or paper-based packaging that includes multiple units of a given product 190a-190f (e.g., a plastic wrap that includes 36 rolls of identical paper towels, a paper box that includes 10 packs of identical diapers, etc.). Alternatively, the packaging of the individual products 190a-190f may be a plastic- or paper-based container that encloses one individual product 190a-190f (e.g., a box of cereal, a bottle of shampoo, etc.).
Notably, while the product labels 192a-192f may be referred to herein as “on-shelf product labels” or “on-shelf price tag labels,” it will be appreciated that the product labels 192a-192f do not necessarily have to be affixed to horizontal support members 119a or 119b (which may be shelves, etc.) of the product storage structure 115 as shown in
The image capture device 120 (also referred to as an image capture unit or a motorized robotic unit) of the exemplary system 100 depicted in
In some embodiments, as will be described in more detail below, the images 180 of the product storage area 110 captured by the image capture device 120 while moving about the product storage area 110 are transmitted by the image capture device 120 over a network 130 to an electronic database 140 and/or to a computing device 150. In some aspects, the computing device 150 (or a separate image processing internet based/cloud-based service module) may be configured to process such images as will be described in more detail below.
The exemplary system 100 includes an electronic database 140. Generally, the exemplary electronic database 140 of
The system 100 of
The computing device 150 may be a stationary or portable electronic device, for example, a desktop computer, a laptop computer, a single server or a series of communicatively connected servers, a tablet, a mobile phone, or any other electronic device including a control circuit (i.e., control unit) that includes a programmable processor. The computing device 150 may be configured for data entry and processing as well as for communication with other devices of system 100 via the network 130. As mentioned above, the computing device 150 may be located at the same physical location as the electronic database 140, or may be located at a remote physical location relative to the electronic database 140.
The control circuit 206 of the exemplary motorized image capture device 120 of
The motorized wheel system 210 may also include a steering mechanism of choice. One simple example may comprise one or more wheels that can swivel about a vertical axis to thereby cause the moving image capture device 120 to turn as well. It should be appreciated that the motorized wheel system 210 may be any suitable motorized wheel and track system known in the art capable of permitting the image capture device 120 to move within the product storage facility 105. Further elaboration in these regards is not provided here for the sake of brevity save to note that the aforementioned control circuit 206 may be configured to control the various operating states of the motorized wheel system 210 to thereby control when and how the motorized wheel system 210 operates.
In the exemplary embodiment of
In the embodiment illustrated in
By one optional approach, an audio input 216 (such as a microphone) and/or an audio output 218 (such as a speaker) can also operably couple to the control circuit 206. So configured, the control circuit 206 can provide a variety of audible sounds to thereby communicate with workers at the product storage facility 105 or other motorized image capture devices 120 moving about the product storage facility 105. These audible sounds can include any of a variety of tones and other non-verbal sounds. Such audible sounds can also include, in lieu of the foregoing or in combination therewith, pre-recorded or synthesized speech.
The audio input 216, in turn, provides a mechanism whereby, for example, a user (e.g., a worker at the product storage facility 105) provides verbal input to the control circuit 206. That verbal input can comprise, for example, instructions, inquiries, or information. So configured, a user can provide, for example, an instruction and/or query (e.g., “where is product storage structure number so-and-so?”, “how many products are stocked on product storage structure so-and-so?”, etc.) to the control circuit 206 via the audio input 216.
In the embodiment illustrated in
In some embodiments, the motorized image capture device 120 includes an input/output (I/O) device 224 that is coupled to the control circuit 206. The I/O device 224 allows an external device to couple to the control unit 204. The function and purpose of connecting devices will depend on the application. In some examples, devices connecting to the I/O device 224 may add functionality to the control unit 204, allow the exporting of data from the control unit 204, allow the diagnosing of the motorized image capture device 120, and so on.
In some embodiments, the motorized image capture device 120 includes a user interface 226 including, for example, user inputs and/or user outputs or displays, depending on the intended interaction with the user (e.g., worker at the product storage facility 105). For example, user inputs could include any input device such as buttons, knobs, switches, touch sensitive surfaces or display screens, and so on. Example user outputs include lights, display screens, and so on. The user interface 226 may work together with or separate from any user interface implemented at an optional user interface unit or user device 160 (such as a smart phone or tablet device) usable by a worker at the product storage facility 105. In some embodiments, the user interface 226 is separate from the image capture device 120, e.g., in a separate housing or device wired or wirelessly coupled to the image capture device 120. In some embodiments, the user interface 226 may be implemented in a mobile user device 160 carried by a person (e.g., worker at product storage facility 105) and configured for communication over the network 130 with the image capture device 120.
In some embodiments, the motorized image capture device 120 may be controlled by the computing device 150 or a user (e.g., by driving or pushing the image capture device 120 or sending control signals to the image capture device 120 via the user device 160) on-site at the product storage facility 105 or off-site. This is due to the architecture of some embodiments where the computing device 150 and/or user device 160 outputs the control signals to the motorized image capture device 120. These control signals can originate at any electronic device in communication with the computing device 150 and/or motorized image capture device 120. For example, the movement signals sent to the motorized image capture device 120 may be movement instructions determined by the computing device 150, commands received at the user device 160 from a user, or commands received at the computing device 150 from a remote user not located at the product storage facility 105.
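By way of a non-limiting example, a control signal output to the motorized image capture device 120 may be structured along the lines of the following sketch, in which every field name and value is a hypothetical assumption:

```python
import json

# Hypothetical movement command sent from the computing device 150 (or a
# user device 160) over the network 130 to the image capture device 120.
command = {
    "device_id": "image-capture-120",
    "action": "move",
    "waypoints": [
        {"aisle": 12, "section": 3},
        {"aisle": 12, "section": 4},
    ],
    "capture_images": True,  # capture images 180 while moving
    "origin": "computing-device-150",
}
payload = json.dumps(command)  # serialized for transmission
```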
In the embodiment illustrated in
In some embodiments, the control circuit 206 may be communicatively coupled to one or more trained computer vision/machine learning/neural network modules/models 222 to perform at least some of the functions described herein. For example, the control circuit 206 may be trained to process one or more images 180 of product storage areas 110 at the product storage facility 105 to detect and/or recognize one or more products 190 using one or more machine learning algorithms, including but not limited to Linear Regression, Logistic Regression, Decision Tree, SVM, Naïve Bayes, kNN, K-Means, Random Forest, Dimensionality Reduction Algorithms, and Gradient Boosting Algorithms. In some embodiments, the trained machine learning module/model 222 includes computer program code stored in a memory 208 and/or executed by the control circuit 206 to process one or more images 180, as described in more detail below.
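As a non-limiting illustration, and assuming (for the sake of the sketch only) a generic pretrained object-detection network standing in for the trained module/model 222, the detection of products in an image 180 may resemble the following; the library, architecture, file name, and confidence threshold are all illustrative assumptions:

```python
import torch
from torchvision.models import detection
from torchvision.transforms.functional import to_tensor
from PIL import Image

# Generic pretrained detector used purely for illustration; the
# embodiments described herein do not prescribe a specific model.
model = detection.fasterrcnn_resnet50_fpn(weights="DEFAULT").eval()

image = Image.open("image_180.jpg").convert("RGB")  # hypothetical file
with torch.no_grad():
    predictions = model([to_tensor(image)])[0]

# Each detection includes a bounding box and a confidence score.
for box, score in zip(predictions["boxes"], predictions["scores"]):
    if score > 0.8:  # arbitrary confidence threshold
        print(box.tolist(), float(score))
```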
It is noted that not all components illustrated in
With reference to
The control circuit 310 can be configured (for example, by using corresponding programming stored in the memory 320 as will be well understood by those skilled in the art) to carry out one or more of the steps, actions, and/or functions described herein. In some embodiments, the memory 320 may be integral to the processor-based control circuit 310 or can be physically discrete (in whole or in part) from the control circuit 310 and is configured to non-transitorily store the computer instructions that, when executed by the control circuit 310, cause the control circuit 310 to behave as described herein. (As used herein, this reference to “non-transitorily” will be understood to refer to a non-ephemeral state for the stored contents (and hence excludes when the stored contents merely constitute signals or waves) rather than volatility of the storage media itself and hence includes both non-volatile memory (such as read-only memory (ROM)) as well as volatile memory (such as an erasable programmable read-only memory (EPROM))). Accordingly, the memory and/or the control unit may be referred to as a non-transitory medium or non-transitory computer readable medium.
The control circuit 310 of the computing device 150 may be also electrically coupled via a connection 335 to an input/output 340 that can receive signals, for example, from the image capture device 120, the electronic database 140, an internet-based service 170 (e.g., one or more of an image processing service, computer vision service, neural network service, etc.), and/or from another electronic device (e.g., an electronic device or user device 160 of a worker tasked with physically inspecting the product storage area 110 and/or the product storage structure 115 and observing the individual products 190 stocked thereon). The input/output 340 of the computing device 150 can also send signals to other devices, for example, a signal to the electronic database 140 including a raw image 180 of a product storage structure 115 as shown in
The processor-based control circuit 310 of the computing device 150 shown in
In some embodiments, the user interface 350 of the computing device 150 may also include a speaker 380 that provides audible feedback (e.g., alerts) to the operator of the computing device 150. It will be appreciated that the performance of such functions by the processor-based control circuit 310 of the computing device 150 may not be dependent on a human operator, and that the control circuit 310 of the computing device 150 may be programmed to perform such functions without a human operator.
As pointed out above, in some embodiments, the image capture device 120 moves about the product storage facility 105 while being controlled remotely by the computing device 150 (or another remote device such as one or more user devices 160), while being controlled autonomously by the control circuit 206 of the image capture device 120, or while being manually driven or pushed by a worker of the product storage facility 105. When the image capture device 120 moves about the product storage area 110 as shown in
In some aspects, the control circuit 310 of the computing device 150 obtains (e.g., from the electronic database 140, or from an image-processing internet-based service 170, or directly from the image capture device 120) one or more images 180 of the product storage area 110 captured by the image capture device 120 while moving about the product storage area 110. In particular, in some embodiments, the control circuit 310 of the computing device 150 is programmed to process a raw image 180 shown in
In some embodiments, the metadata extracted from the image 180 captured by the image capture device 120, when processed by the control circuit 310 of the computing device 150, enables the control circuit 310 of the computing device 150 to detect the physical location of the portion of the product storage area 110 and/or product storage structure 115 depicted in the image 180 and/or the physical locations and characteristics (e.g., size, shape, etc.) of the individual products 190a-190f and the price tag labels 192a-192f depicted in the image 180.
With reference to
In some embodiments, the control circuit 310 may be trained to process one or more images 180 of product storage areas 110 at the product storage facility 105 to detect and/or recognize one or more products 190 using one or more computer vision/machine learning algorithms, including but not limited to Linear Regression, Logistic Regression, Decision Tree, SVM, Naïve Bayes, kNN, K-Means, Random Forest, Dimensionality Reduction Algorithms, and Gradient Boosting Algorithms. In some embodiments, the trained machine learning/neural network module/model 322 includes computer program code stored in a memory 320 and/or executed by the control circuit 310 to process one or more images 180, as described herein. It will be appreciated that, in some embodiments, the control circuit 310 does not process the raw image 180 shown in
In some embodiments, the control circuit 310 is configured to process the data extracted from the image 180 via computer vision and one or more trained neural networks to detect each of the individual products 190a-190f and each of the individual price tag labels 192a-192f located on the product storage structure 115 in the image 180, and to generate virtual boundary lines 195a-195f (as seen in image 182 in
It is understood that as used herein, the term “bounding box” is intended to be any shape that surrounds or defines boundaries about a detected object in an image. That is, a bounding box may be in the shape of a square, rectangle, circle, oval, triangle, and so on, or may be any irregular shape having curved, angled, straight and/or irregular sections within which the object is located; the irregular shape may or may not loosely conform to the shape of the object. Further, a bounding box may not be complete in that it could include open sections (such that the bounding box is formed by connecting the dots). In any event, embodiments of a bounding box can be defined as a shape that surrounds or defines boundaries about a detected object. And generally, to illustrate examples of some embodiments in one or more figures, bounding boxes are illustrated in square or rectangular form.
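By way of a non-limiting example, a rectangular bounding box of the kind generally illustrated in the figures, and the cropping it enables, may be represented along the lines of the following sketch (the type and function names are hypothetical):

```python
from dataclasses import dataclass
from PIL import Image

@dataclass
class BoundingBox:
    """Axis-aligned rectangle; one of many shapes a bounding box may take."""
    x_min: int
    y_min: int
    x_max: int
    y_max: int

def crop(image: Image.Image, box: BoundingBox) -> Image.Image:
    """Crop the region that a bounding box defines about a detected object."""
    return image.crop((box.x_min, box.y_min, box.x_max, box.y_max))
```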
As seen in the image 182 in
In some embodiments, after generating the virtual boundary lines 195a-195f around the products 190 and the virtual boundary lines 197a-197f around the price tag labels 192a-192f, the control circuit 310 of the computing device 150 is programmed to cause the computing device 150 to transmit a signal including the processed image 182 over the network 130 to the electronic database 140 for storage. In one aspect, this image 182 may be used by the control circuit 310 in subsequent image detection operations and/or training or retraining a neural network model as a reference model of a visual representation of the product storage structure 115 and/or products 190a-190f and/or price tag labels 192a-192f.
More specifically, in some implementations, the control circuit 310 is programmed to perform object detection analysis with respect to images subsequently captured by the image capture device 120 by utilizing machine learning/computer vision modules/models 322 that may include one or more neural network models trained using the image data stored in the electronic database 140. Notably, in certain aspects, the machine learning/neural network modules/models 322 may be retrained based on physical inspection of the product storage structure 115 and/or products 190a-190f and/or price tag labels 192a-192f by a worker of the product storage facility 105, and in response to an input received from an electronic user device 160 of the worker.
In certain embodiments, as will be discussed in more detail below with reference to
In some implementations, after the image 180 obtained by the computing device 150 is processed by the control circuit 310 as described above to generate the image 182 of
Then, the control circuit 310 further processes the cropped images 184a-184f depicting the products 190a-190f (or pixel data representing the product 190), as discussed in more detail below, to create one or more reference model templates based on the processed images 184a-184f that are stored in the electronic database 140 to facilitate recognition/identification of products 190a-190f subsequently captured on the product storage structure 115 by the image capture device 120. In particular, in some embodiments, the control circuit 310 creates a cluster of the cropped images (see
For example,
In the exemplary method 700, after the cropped images 186a-186y are obtained in step 710, the control circuit 310 passes the cropped images 186a-186y through a neural network 196, which may be a convolutional neural network (CNN). In one aspect, the CNN may be pretrained to extract predetermined features from the cropped images 186a-186y (step 720) and, based on the features extracted from each of the cropped images 186a-186y, the CNN may be pretrained to generate lower dimensional representations for each of the cropped images 186a-186y (step 730). For example, step 730 of the method 700 may include the CNN converting the features extracted from each of the cropped images 186a-186y into dense vector representations, also known as embeddings 187, for each of the cropped images 186a-186y.
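For purposes of illustration only, steps 720 and 730 may be sketched as follows, assuming a generic pretrained CNN backbone with its classification head removed; the architecture, library, input size, and embedding dimensionality (512 here rather than 128) are illustrative assumptions:

```python
import torch
from torchvision import models, transforms
from PIL import Image

# Pretrained CNN backbone (chosen arbitrarily for this sketch) whose
# final classification layer is replaced so that it outputs a dense
# feature vector (embedding 187) for each cropped image 186.
backbone = models.resnet18(weights="DEFAULT")
backbone.fc = torch.nn.Identity()  # keep the 512-dim feature vector
backbone.eval()

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

def embed(cropped_image: Image.Image) -> torch.Tensor:
    """Convert one cropped image into its embedding (step 730)."""
    with torch.no_grad():
        return backbone(preprocess(cropped_image).unsqueeze(0)).squeeze(0)
```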
In the illustrated embodiment, each of the dense vector representations or embeddings 187 is a numerical representation (i.e., represented by a set of numbers), which may be representative of 128 (or fewer or more) dimensions. These numerical representations or embeddings 187 reflect the visual information (i.e., predetermined features) extracted from the cropped images 186a-186y. As such, embeddings 187 having similar numerical inputs/values are indicative of cropped images 186 having similar products 190 depicted therein, and the control circuit 310 may be programmed to place embeddings 187 having similar numerical inputs/values close together in an embedding space (e.g., a cluster, as will be discussed in more detail below with reference to
In some implementations, the control circuit 310 is programmed to use the embeddings 187 of the cropped images 186 to create an image cluster graph 825 as shown in
In some aspects, the step 820 of generating the image cluster graph 825 includes the control circuit 310 using an appropriate threshold (e.g., a predetermined threshold) for distances to create edges between the nodes 189a-189y, and positioning the nodes 189a-189y into clusters using the Louvain method for community detection. As such, each cluster of nodes 189a-189y generated in the image cluster graph 825 represents a particular unique set of cropped images 186 having similar facings, lighting patterns, etc. In other words, based on the similarity of the embeddings 187a-187y generated for the cropped images 186a-186y, the nodes 189a-189g in
In some embodiments, after the image cluster graph 825 is generated, the control circuit 310 is programmed to analyze the image cluster graph 825 and the nodes 189a-189y located in the image cluster graph 825 to select one of the cropped images that is most representative of the cluster with respect to providing an optimal visual representation of the product depicted in the cropped images represented by the clustered nodes 189, making this selected cropped image the keyword template reference image for the selected product. To that end, in the embodiment of
In the exemplary method 900 shown in
By the same token, in the example illustrated in
Similarly, in the example illustrated in
In some embodiments, after identifying the centroid node (i.e., 189d, 189n, and 189v) for each of the three node clusters and marking the corresponding cropped images (i.e., 186d, 186n, and 186v) as the keyword template reference images to facilitate future recognition/identification of products 190 in images 180 captured by the image capture device 120, the control circuit 310 is also programmed to further process the image cluster graph 825 to generate a feature model template reference image for each of the individual products 190a-190c named BRAND 1, BRAND 2, and BRAND 3. In one aspect, after identifying the centroid node (i.e., 189d, 189n, and 189v) for each of the three node clusters, the control circuit 310 is programmed to resample a number of the cropped images in each one of the respective clusters that are located closest to the centroid nodes 189d, 189n, and 189v.
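By way of a non-limiting illustration, the generation of the image cluster graph 825 and the selection of each cluster's centroid node described above may be sketched as follows; the distance threshold and the particular graph and clustering libraries are illustrative assumptions:

```python
import networkx as nx
import numpy as np
from sklearn.metrics.pairwise import cosine_distances

def cluster_and_pick_centroids(embeddings, threshold=0.3):
    """embeddings: (n, d) array, one row per cropped image 186."""
    dist = cosine_distances(embeddings)
    graph = nx.Graph()
    graph.add_nodes_from(range(len(embeddings)))
    # Create edges only between nodes whose embeddings are closer
    # than the chosen (predetermined) threshold.
    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            if dist[i, j] < threshold:
                graph.add_edge(i, j, weight=1.0 - dist[i, j])
    # Louvain community detection positions the nodes into clusters.
    clusters = nx.community.louvain_communities(graph, weight="weight")
    centroids = []
    for cluster in clusters:
        members = sorted(cluster)
        # Centroid node: smallest total distance to its fellow members.
        totals = dist[np.ix_(members, members)].sum(axis=1)
        centroids.append(members[int(np.argmin(totals))])
    return clusters, centroids
```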
In some implementations, the control circuit 310 is programmed to select a number (e.g., a predetermined number, such as 3, 5, 10, 15, 20, etc.) of nodes 189 of a cluster that are located most proximally to their respective centroid nodes 189d, 189n, and 189v, and sample the cropped images corresponding to the selected nodes 189 such that the centroid images 186d, 186n, and 186v of each cluster, and a predetermined number of the selected resampled cropped images (located in their respective cluster most proximally to their respective centroid image), are marked as a feature model template for the respective one of the products BRAND 1, BRAND 2, and BRAND 3 associated with the respective ones of the cropped images 186a-186y. Such feature model templates, which include not only the centroid image of each cluster but also multiple images located in the cluster most proximally to the centroid image, are highly representative of the cluster features and facilitate a more accurate prediction by the control circuit 310 of whether a given product detected in one or more images 180 subsequently captured by the image capture device 120 corresponds to any one of the products 190a-190c. In some aspects, the control circuit 310 may send a signal to the electronic database 140 to update the electronic database 140 to mark each centroid node 189d, 189n, and 189v of each cluster, in combination with the cropped images 186a-186y located most proximally to their respective centroid nodes 189d, 189n, and 189v in the cluster, as a feature model template to facilitate recognition/identification of the products 190 subsequently captured on the product storage structure 115 by the image capture device 120.
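Similarly, the resampling of the cropped images located closest to a centroid node to form a feature model template may be sketched as follows (the function and parameter names are hypothetical):

```python
import numpy as np

def build_feature_template(embeddings, members, centroid, k=5):
    """Return the centroid image plus the k cluster images nearest to it
    (k being a predetermined number such as 3, 5, 10, etc.), which
    together serve as the feature model template for the product."""
    others = [m for m in members if m != centroid]
    dists = np.linalg.norm(embeddings[others] - embeddings[centroid], axis=1)
    nearest = [others[i] for i in np.argsort(dists)[:k]]
    return [centroid] + nearest  # indices of the template reference images
```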
In some embodiments, as described in more detail below with reference to
In the embodiment illustrated in
In some embodiments, the feature model templates associated with the products 190a-190c that were generated on a given day based on a first batch of images 180 of the products 190a-190c captured by the image capture device 120 are updated in view of a second batch of images 180 of the products 190a-190c captured by the image capture device 120. In one implementation, if any of the cropped images 186a-186y used by the control circuit 310 to generate the keyword model templates and/or feature model templates as described above are marked incorrect (e.g., by the worker), the control circuit 310 is programmed to remove such cropped images 186a-186y (and their associated embeddings) from the feature model template.
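For purposes of illustration only, the removal of cropped images marked incorrect (together with their associated embeddings) from a feature model template may be sketched as follows; the template data structure shown is a hypothetical one:

```python
def prune_feature_template(template, flagged_image_ids):
    """Drop images a worker marked incorrect, and their embeddings,
    from a (hypothetical) feature model template record."""
    template["image_ids"] = [
        i for i in template["image_ids"] if i not in flagged_image_ids
    ]
    template["embeddings"] = {
        i: e for i, e in template["embeddings"].items()
        if i not in flagged_image_ids
    }
    return template
```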
In the embodiment shown in
After the cropped images 184a-184f (which may be referred to herein as “training images”) are obtained by the control circuit 310 in step 1015, in the embodiment illustrated in
In the example shown in
In the illustrated embodiment, after the characters on the portions of the cropped images 184g-184j corresponding to the products 190g-190j are detected and converted to keyword instances, the exemplary method 1000 of
In the embodiment illustrated in
For example, in instances such as illustrated in
In the example illustrated in
In some embodiments, the control circuit 310 is programmed to determine a list of “must keywords” 195g-195j for the keyword model template that each of the products 190g-190j must contain in order to be properly recognized as being associated with a respective one of its UPCs. For example, in the illustrated embodiment, the product 190g is an Epson ink cartridge, model 702, which means that all products that are identical to the product 190g must have the characters 702 on them (otherwise, they will not be an Epson model 702 ink cartridge, but will instead be a similar but different Epson model ink cartridge, e.g., model 220, 288, 288XL, 252, 252XL, etc.), and which also means that the control circuit 310 would set the keyword “702” as a “must keyword” 195g (see
In the illustrated embodiment, after the control circuit 310 determines the “must keywords” 195g-195j for the keyword model templates associated with the products 190g-190j, the control circuit 310 is programmed to generate, based on the “must keywords” 195g-195j, the “negative keywords” 197g-197j for the keyword model templates associated with the products 190g-190j. In particular, as shown in the embodiment of
In particular, as can be seen in
In the illustrated embodiment, after the control circuit 310 determines and sets “must keywords” 195g-195j and “negative keywords” 197g-197j for the keyword model templates associated with the products 190g-190j, the exemplary method 1000 of
The above-described embodiments advantageously provide for inventory management systems and methods in which the individual products on the product storage structures of a product storage facility can be efficiently detected and identified. As such, the systems and methods described herein provide for efficient and precise recognition of products on product storage structures of a product storage facility, and also provide an efficient and precise way to update the model templates used to recognize the products, thereby providing a significant cost savings to the retailer by saving thousands of worker hours that would otherwise be spent manually monitoring the on-shelf products.
This application is related to the following applications, each of which is incorporated herein by reference in its entirety: entitled SYSTEMS AND METHODS OF SELECTING AN IMAGE FROM A GROUP OF IMAGES OF A RETAIL PRODUCT STORAGE AREA filed on Oct. 11, 2022, application Ser. No. 17/963,787; entitled SYSTEMS AND METHODS OF IDENTIFYING INDIVIDUAL RETAIL PRODUCTS IN A PRODUCT STORAGE AREA BASED ON AN IMAGE OF THE PRODUCT STORAGE AREA filed on Oct. 11, 2022, application Ser. No. 17/963,802; entitled CLUSTERING OF ITEMS WITH HETEROGENEOUS DATA POINTS filed on Oct. 11, 2022, application Ser. No. 17/963,903; entitled SYSTEMS AND METHODS OF TRANSFORMING IMAGE DATA TO PRODUCT STORAGE FACILITY LOCATION INFORMATION filed on Oct. 11, 2022, application Ser. No. 17/963,751; entitled SYSTEMS AND METHODS OF MAPPING AN INTERIOR SPACE OF A PRODUCT STORAGE FACILITY filed on Oct. 14, 2022, application Ser. No. 17/966,580; entitled SYSTEMS AND METHODS OF DETECTING PRICE TAGS AND ASSOCIATING THE PRICE TAGS WITH PRODUCTS filed on Oct. 21, 2022, application Ser. No. 17/971,350; entitled SYSTEMS AND METHODS OF VERIFYING PRICE TAG LABEL-PRODUCT PAIRINGS filed on Nov. 9, 2022, application Ser. No. 17/983,773; entitled SYSTEMS AND METHODS OF USING CACHED IMAGES TO DETERMINE PRODUCT COUNTS ON PRODUCT STORAGE STRUCTURES OF A PRODUCT STORAGE FACILITY filed Jan. 24, 2023, application Ser. No. 18/158,969; entitled METHODS AND SYSTEMS FOR CREATING REFERENCE IMAGE TEMPLATES FOR IDENTIFICATION OF PRODUCTS ON PRODUCT STORAGE STRUCTURES OF A RETAIL FACILITY filed Jan. 24, 2023, application Ser. No. 18/158,983; entitled SYSTEMS AND METHODS FOR PROCESSING IMAGES CAPTURED AT A PRODUCT STORAGE FACILTY filed Jan. 24, 2023, application Ser. No. 18/158,925; and entitled SYSTEMS AND METHODS FOR PROCESSING IMAGES CAPTURED AT A PRODUCT STORAGE FACILTY filed Jan. 24, 2023, application Ser. No. 18/158,950; entitled SYSTEMS AND METHODS FOR ANALYZING AND LABELING IMAGES IN A RETAIL FACILITY filed January, 2023, Application No.; entitled SYSTEMS AND METHODS FOR ANALYZING DEPTH IN IMAGES OBTAINED IN PRODUCT STORAGE FACILITIES TO DETECT OUTLIER ITEMS filed January, 2023, Application No.; entitled SYSTEMS AND METHODS FOR REDUCING FALSE IDENTIFICATIONS OF PRODUCTS HAVING SIMILAR APPEARANCES IN IMAGES OBTAINED IN PRODUCT STORAGE FACILITIES filed January, 2023, Application No.; entitled SYSTEMS AND METHODS FOR IDENTIFYING DIFFERENT PRODUCT IDENTIFIERS THAT CORRESPOND TO THE SAME PRODUCT filed January, 2023, Application No.; entitled SYSTEMS AND METHODS FOR RECOGNIZING PRODUCT LABELS AND PRODUCTS LOCATED ON PRODUCT STORAGE STRUCTURES OF PRODUCT STORAGE FACILITIES filed, 2023, Application No.; and entitled SYSTEMS AND METHODS FOR DETECTING SUPPORT MEMBERS OF PRODUCT STORAGE STRUCTURES AT PRODUCT STORAGE FACILITIES, filed Jan. 30, 2023, Application No.
Those skilled in the art will recognize that a wide variety of other modifications, alterations, and combinations can also be made with respect to the above-described embodiments without departing from the scope of the invention, and that such modifications, alterations, and combinations are to be viewed as being within the ambit of the inventive concept.
| Entry |
|---|
| U.S. Appl. No. 17/963,751, filed Oct. 11, 2022, Yilun Chen. |
| U.S. Appl. No. 17/963,787, filed Oct. 11, 2022, Lingfeng Zhang. |
| U.S. Appl. No. 17/963,802, filed Oct. 11, 2022, Lingfeng Zhang. |
| U.S. Appl. No. 17/963,903, filed Oct. 11, 2022, Raghava Balusu. |
| U.S. Appl. No. 17/966,580, filed Oct. 14, 2022, Paarvendhan Puviyarasu. |
| U.S. Appl. No. 17/971,350, filed Oct. 21, 2022, Jing Wang. |
| U.S. Appl. No. 17/983,773, filed Nov. 9, 2022, Lingfeng Zhang. |
| U.S. Appl. No. 18/103,338, filed Jan. 30, 2023, Wei Wang. |
| U.S. Appl. No. 18/106,269, filed Feb. 6, 2023, Zhaoliang Duan. |
| U.S. Appl. No. 18/158,925, filed Jan. 24, 2023, Raghava Balusu. |
| U.S. Appl. No. 18/158,950, filed Jan. 24, 2023, Ishan Arora. |
| U.S. Appl. No. 18/158,969, filed Jan. 24, 2023, Zhaoliang Duan. |
| U.S. Appl. No. 18/158,983, filed Jan. 24, 2023, Ashlin Ghosh. |
| U.S. Appl. No. 18/161,788, filed Jan. 30, 2023, Raghava Balusu. |
| U.S. Appl. No. 18/165,152, filed Feb. 6, 2023, Han Zhang. |
| U.S. Appl. No. 18/168,174, filed Feb. 13, 2023, Abhinav Pachauri. |
| U.S. Appl. No. 18/168,198, filed Feb. 13, 2023, Ashlin Ghosh. |
| Chaudhuri, Abon et al.; “A Smart System for Selection of Optimal Product Images in E-Commerce”; 2018 IEEE Conference on Big Data (Big Data); Dec. 10-13, 2018; IEEE; <https://ieeexplore.ieee.org/document/8622259>; pp. 1728-1736. |
| Chenze, Brandon et al.; “Iterative Approach for Novel Entity Recognition of Foods in Social Media Messages”; 2022 IEEE 23rd International Conference on Information Reuse and Integration for Data Science (IRI); Aug. 9-11, 2022; IEEE; <https://ieeexplore.ieee.org/document/9874231>; pp. 1-6. |
| Kaur, Ramanpreet et al.; “A Brief Review on Image Stitching and Panorama Creation Methods”; International Journal of Control Theory and Applications; 2017; vol. 10, No. 28; International Science Press; Gurgaon, India; <https://www.researchgate.net/publication/348232877>; pp. 1-11. |
| Naver Engineering Team; “Auto-classification of NAVER Shopping Product Categories using TensorFlow”; <https://blog.tensorflow.org/2019/05/auto-classification-of-naver-shopping.html>; May 20, 2019; pp. 1-15. |
| Paolanti, Marina et al.; “Mobile robot for retail surveying and inventory using visual and textual analysis of monocular pictures based on deep learning”; European Conference on Mobile Robots; Sep. 2017; 6 pages. |
| Refills; “Final 3D object perception and localization”; European Commission; Dec. 31, 2016; 16 pages. |
| Retech Labs; “Storx | RetechLabs”; <https://retechlabs.com/storx/>; available at least as early as Jun. 22, 2019; retrieved from Internet Archive Wayback Machine <https://web.archive.org/web/20190622012152/https://retechlabs.com/storx/> on Dec. 1, 2022; pp. 1-4. |
| Schroff, Florian et al.; “Facenet: a unified embedding for face recognition and clustering”; 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); Jun. 7-12, 2015; IEEE; <https://ieeexplore.ieee.org/document/7298682>; pp. 815-823. |
| Singh, Ankit; “Automated Retail Shelf Monitoring Using AI”; <https://blog.paralleldots.com/shelf-monitoring/automated-retail-shelf-monitoring-using-ai/>; Sep. 20, 2019; pp. 1-10. |
| Singh, Ankit; “Image Recognition and Object Detection in Retail”; <https://blog.paralleldots.com/featured/image-recognition-and-object-detection-in-retail/>; Sep. 26, 2019; pp. 1-11. |
| Tan, Mingxing et al.; “EfficientDet: Scalable and Efficient Object Detection”; 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); Jun. 13-19, 2020; IEEE; <https://ieeexplore.ieee.org/document/9156454>; 6 pages. |
| Tan, Mingxing et al.; “EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks”; Proceedings of the 36th International Conference on Machine Learning; 2019; vol. 97; PLMR; <http://proceedings.mlr.press/v97/tan19a.html>; pp. 6105-6114. |
| Technology Robotix Society; “Colour Detection”; <https://medium.com/image-processing-in-robotics/colour-detection-e15bc03b3f61>; Jul. 2, 2019; pp. 1-6. |
| Tonioni, Alessio et al.; “A deep learning pipeline for product recognition on store shelves”; 2018 IEEE International Conference on Image Processing, Applications and Systems (IPAS); Dec. 12-14, 2018; IEEE; <https://ieeexplore.ieee.org/document/8708890>; pp. 25-31. |
| Trax Retail; “Image Recognition Technology for Retail | Trax”; <https://traxretail.com/retail/>; available at least as early as Apr. 20, 2021; retrieved from Internet Wayback Machine <https://web.archive.org/web/20210420132348/https://traxretail.com/retail/> on Dec. 1, 2022; pp. 1-19. |
| Verma, Nishchal, et al.; “Object identification for inventory management using convolutional neural network”; IEEE Applied Imagery Pattern Recognition Workshop (AIPR); Oct. 2016, 6 pages. |
| Zhang, Jicun, et al.; “An Improved Louvain Algorithm for Community Detection”; Advanced Pattern and Structure Discovery from Complex Multimedia Data Environments 2021; Nov. 23, 2021; Mathematical Problems in Engineering; Hindawi; <https://www.hindawi.com/journals/mpe/2021/1485592/>; pp. 1-27. |
| Rodriquez, Kari, “International Search Report & Written Opinion”, International Application No. PCT/US24/12335, mailed Apr. 30, 2024, 9 pages. |