Businesses may rely on different mechanisms for managing inventory in a retail environment. The inventory management may include product identification for tracking inventory of the identified product. There may be different ways to identify products including the use of computer technology.
This disclosure relates generally to a model for visual product identification (VPI) using object detection with machine learning. Image acquisition with augmented reality can improve the model's identification, which may include classifying the acquired images, with that classification further verified with machine learning. Usage of the visual product identification data can further improve the model.
In one embodiment, a method for visual product identification includes providing augmented reality guides for image collection of a product, receiving metadata related to the product from the image collection, generating a model for checking and monitoring the received metadata, providing the model for usage for product identification, and improving the model based on data gathered from the usage of the model.
The system and method may be better understood with reference to the following drawings and description. Non-limiting and non-exhaustive embodiments are described with reference to the following drawings. The components in the drawings are not necessarily to scale, emphasis instead being placed upon illustrating the principles of the invention. In the drawings, like referenced numerals designate corresponding parts throughout the different views.
By way of introduction, the disclosed embodiments relate to systems and methods for creating and utilizing a model for visual product identification (VPI) using object detection with machine learning. Image acquisition with augmented reality can improve the model's identification, which may include classifying the acquired images, with that classification further verified with machine learning. Usage of the visual product identification data can further improve the model.
Reference will now be made in detail to exemplary embodiments of the invention, examples of which are illustrated in the accompanying drawings. When appropriate, the same reference numbers are used throughout the drawings to refer to the same or like parts. The numerous innovative teachings of the present application will be described with particular reference to presently preferred embodiments (by way of example, and not of limitation). The present application describes several inventions, and none of the statements below should be taken as limiting the claims generally.
For simplicity and clarity of illustration, the drawing figures illustrate the general manner of construction, and description and details of well-known features and techniques may be omitted to avoid unnecessarily obscuring the invention. Additionally, elements in the drawing figures are not necessarily drawn to scale; some areas or elements may be expanded to help improve understanding of embodiments of the invention.
The word ‘couple’ and similar terms do not necessarily denote direct and immediate connections, but also include connections through intermediate elements or devices. For purposes of convenience and clarity only, directional (up/down, etc.) or motional (forward/back, etc.) terms may be used with respect to the drawings. These and similar directional terms should not be construed to limit the scope in any manner. It will also be understood that other embodiments may be utilized without departing from the scope of the present disclosure, and that the detailed description is not to be taken in a limiting sense, and that elements may be differently positioned, or otherwise noted as in the appended claims without requirements of the written description being required thereto.
The terms “first,” “second,” “third,” “fourth,” and the like in the description and the claims, if any, may be used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the terms so used are interchangeable. Furthermore, the terms “comprise,” “include,” “have,” and any variations thereof, are intended to cover non-exclusive inclusions, such that a process, method, article, apparatus, or composition that comprises a list of elements is not necessarily limited to those elements, but may include other elements not expressly listed or inherent to such process, method, article, apparatus, or composition.
The aspects of the present disclosure may be described herein in terms of functional block components and various processing steps. It should be appreciated that such functional blocks may be realized by any number of hardware and/or software components configured to perform the specified functions. For example, these aspects may employ various integrated circuit components, e.g., memory elements, processing elements, logic elements, look-up tables, and the like, which may carry out a variety of functions under the control of one or more microprocessors or other control devices.
Similarly, the software elements of the present disclosure may be implemented with any programming or scripting languages with the various algorithms being implemented with any combination of data structures, objects, processes, routines, or other programming elements. Further, it should be noted that the present disclosure may employ any number of conventional techniques for data transmission, signaling, data processing, network control, and the like.
The particular implementations shown and described herein are for explanatory purposes and are not intended to otherwise be limiting in any way. Furthermore, the connecting lines shown in the various figures contained herein are intended to represent exemplary functional relationships and/or physical couplings between the various elements. It should be noted that many alternative or additional functional relationships or physical connections may be present in a practical system implemented in accordance with the disclosure.
As will be appreciated by one of ordinary skill in the art, aspects of the present disclosure may be embodied as a method or a system. Furthermore, these aspects of the present disclosure may take the form of a computer program product on a tangible computer-readable storage medium having computer-readable program-code embodied in the storage medium. Any suitable computer-readable storage medium may be utilized, including hard disks, CD-ROM, optical storage devices, magnetic storage devices, and/or the like. These computer program instructions may be loaded onto a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions which execute on the computer or other programmable data processing apparatus create means for implementing the functions specified in the flowchart block or blocks. These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, which implement the function, specified in the flowchart block or blocks. The computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer-implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart block or blocks.
The products 106 may be eyewear (e.g. glasses, or sunglasses) in one embodiment. In alternative embodiments, the products 106 may include other types of products other than eyewear.
The VPI 112 may include or be part of a computing device. In a retail embodiment, there may be at least one VPI 112 for each retail location for inventory management at that location. In other embodiments, employees may be able to utilize their own mobile device as the VPI 112 by running an application on the mobile device that performs the functions described below. In these embodiments, the VPI 112 may be implemented in software that is run by a computing device, such as an application or app that is run on a mobile computing device. In other embodiments, the VPI may be any hardware or software used for performing the functions described herein. In an example, where the VPI 112 is the computing device rather than just the software, the VPI 112 may include a processor 120, a memory 118, software 116 and a user interface 114. In alternative embodiments, the VPI 112 may be multiple devices to provide different functions and it may or may not include all of the user interface 114, the software 116, the memory 118, and/or the processor 120.
The user interface 114 may be a user input device, a display, or a camera. The user interface 114 may include a keyboard, keypad or a cursor control device, such as a mouse, or a joystick, touch screen display, remote control or any other device operative to allow a user or administrator to interact with the VPI 112. The user interface 114 may include a user interface configured to allow a user and/or an administrator to interact with any of the components of the VPI 112. The user interface 114 may include a display coupled with the processor 120 and configured to display an output from the processor 120. The display (not shown) may be a liquid crystal display (LCD), an organic light emitting diode (OLED), a flat panel display, a solid state display, a cathode ray tube (CRT), a projector, a printer or other now known or later developed display device for outputting determined information. The display may act as an interface for the user to see the functioning of the processor 120, or as an interface with the software 116 for providing data.
The user interface 114 of the VPI 112 may be a camera or other image acquisition component for acquiring images of the products 106. As described below, a user may utilize the VPI 112 for acquiring product images. This implementation of the user interface 114 as a camera may be in addition to the embodiments described above for receiving user input.
The processor 120 in the VPI 112 may include a central processing unit (CPU), a graphics processing unit (GPU), a digital signal processor (DSP) or other type of processing device. The processor 120 may be a component in any one of a variety of systems. For example, the processor 120 may be part of a standard personal computer or a workstation. The processor 120 may be one or more general processors, digital signal processors, application specific integrated circuits, field programmable gate arrays, servers, networks, digital circuits, analog circuits, combinations thereof, or other now known or later developed devices for analyzing and processing data. The processor 120 may operate in conjunction with a software program (i.e. software 116), such as code generated manually (i.e., programmed). The software 116 may include the functions described below for classifying images, machine learning, integrating the model, utilization of the model, and improvement of the model. The functions described below for the VPI 112 may be implemented at least partially in software (e.g. software 116) in some embodiments.
The processor 120 may be coupled with the memory 118, or the memory 118 may be a separate component. The software 116 may be stored in the memory 118. The memory 118 may include, but is not limited to, computer readable storage media such as various types of volatile and non-volatile storage media, including random access memory, read-only memory, programmable read-only memory, electrically programmable read-only memory, electrically erasable read-only memory, flash memory, magnetic tape or disk, optical media and the like. The memory 118 may include a random access memory for the processor 120. Alternatively, the memory 118 may be separate from the processor 120, such as a cache memory of a processor, the system memory, or other memory. The memory 118 may be an external storage device or database for storing recorded tracking data, or an analysis of the data. Examples include a hard drive, compact disc (“CD”), digital video disc (“DVD”), memory card, memory stick, floppy disc, universal serial bus (“USB”) memory device, or any other device operative to store data. The memory 118 is operable to store instructions executable by the processor 120.
The functions, acts or tasks illustrated in the figures or described herein may be performed by the programmed processor executing the instructions stored in the software 116 or the memory 118. The functions, acts or tasks are independent of the particular type of instruction set, storage media, processor or processing strategy and may be performed by software, hardware, integrated circuits, firm-ware, micro-code and the like, operating alone or in combination. Likewise, processing strategies may include multiprocessing, multitasking, parallel processing and the like. The processor 120 is configured to execute the software 116.
The present disclosure contemplates a computer-readable medium that includes instructions or receives and executes instructions responsive to a propagated signal, so that a device connected to a network can communicate voice, video, audio, images or any other data over a network. The user interface 114 may be used to provide the instructions over the network via a communication port. The communication port may be created in software or may be a physical connection in hardware. The communication port may be configured to connect with a network, external media, display, or any other components in system 100, or combinations thereof. The connection with the network may be a physical connection, such as a wired Ethernet connection or may be established wirelessly as discussed below. Likewise, the connections with other components of the system 100 may be physical connections or may be established wirelessly.
Although not shown, data used by the VPI 112 may be stored in locations other than the memory 118, such as a database connected through the network 104. For example, the images that are acquired may be stored in the memory 118 and/or stored in a database accessible via the network 104. Likewise, the machine learning model may be operated by the VPI 112 but may include functionality stored in the memory 118 and/or stored in a database accessible via the network 104. The VPI 112 may include communication ports configured to connect with a network. The network or networks that may connect any of the components in the system 100 to enable communication of data between the devices may include wired networks, wireless networks, or combinations thereof. The wireless network may be a cellular telephone network, a network operating according to a standardized protocol such as IEEE 802.11, 802.16, 802.20, published by the Institute of Electrical and Electronics Engineers, Inc., or WiMax network. Further, the network(s) may be a public network, such as the Internet, a private network, such as an intranet, or combinations thereof, and may utilize a variety of networking protocols now available or later developed including, but not limited to TCP/IP based networking protocols. The network(s) may include one or more of a local area network (LAN), a wide area network (WAN), a direct connection such as through a Universal Serial Bus (USB) port, and the like, and may include the set of interconnected networks that make up the Internet. The network(s) may include any communication method or employ any form of machine-readable media for communicating information from one device to another.
The VPI 112 performs the operations described in the embodiments below. For example,
The example embodiment of
For each image captured, the AR guides change and move to direct the user to capture images from different angles. The image history ensures that the proper product is not “lost” as the user moves around the display. The perspective polygon updates in real time as the user progresses through the capture process. Images in the history can be deleted and retaken if needed.
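The capture flow described above, in which guides advance after each image and images in the history can be deleted and retaken, can be illustrated with a minimal sketch. The class name `CaptureSession`, the angle names, and the image identifiers are hypothetical and are used only for illustration; the disclosure does not specify how the guides or the history are represented.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CaptureSession:
    """Hypothetical bookkeeping for a guided multi-angle capture."""

    target_angles: List[str]  # assumed guide positions, e.g. "front"
    history: Dict[str, str] = field(default_factory=dict)  # angle -> image id

    def next_guide(self) -> Optional[str]:
        """Return the first target angle that has no image yet, or None."""
        for angle in self.target_angles:
            if angle not in self.history:
                return angle
        return None

    def capture(self, image_id: str) -> str:
        """Record an image for the currently guided angle and return that angle."""
        angle = self.next_guide()
        if angle is None:
            raise RuntimeError("all target angles already captured")
        self.history[angle] = image_id
        return angle

    def delete(self, angle: str) -> None:
        """Remove an image from the history so that the angle can be retaken."""
        self.history.pop(angle, None)
```

In this sketch, deleting an image causes the guide to return to that angle before the session is considered complete.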
The dataset may be filtered to ensure that all labels are equally represented and to avoid bias. Standard image transformations may be applied, or unique image transformations may vary the color temperature and place marketing materials over the product.
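As a non-limiting illustration, the filtering and one of the transformations might be sketched as follows. Downsampling every label to the size of the rarest label is only one possible way to equalize representation, and the per-pixel red/blue scaling is a simplified stand-in for a color temperature shift; the function names, the `(path, label)` sample layout, and the `warmth` parameter are assumptions made for this sketch.

```python
import random
from collections import defaultdict


def balance_by_label(samples, seed=0):
    """Downsample so each label keeps as many samples as the rarest label.

    `samples` is an assumed list of (image_path, label) pairs.
    """
    by_label = defaultdict(list)
    for path, label in samples:
        by_label[label].append((path, label))
    if not by_label:
        return []
    target = min(len(items) for items in by_label.values())
    rng = random.Random(seed)  # fixed seed keeps the selection reproducible
    balanced = []
    for label in sorted(by_label):
        balanced.extend(rng.sample(by_label[label], target))
    return balanced


def shift_color_temperature(pixel, warmth):
    """Warm (warmth > 0) or cool (warmth < 0) a single 8-bit RGB pixel.

    Red is scaled by (1 + warmth) and blue by (1 - warmth); green is unchanged.
    """
    r, g, b = pixel

    def clamp(value):
        return max(0, min(255, round(value)))

    return (clamp(r * (1 + warmth)), g, clamp(b * (1 - warmth)))
```

A production pipeline would more likely apply such transformations to whole image arrays with an imaging library; the per-pixel form is used here only to keep the sketch self-contained.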
The model training may track accuracy of the product identification. There may also be a retraining process that may be part of the training. This training may compare accuracy before and after retraining the model with field data. Using field data to retrain the model improves accuracy by exposing the model to real-world conditions, but biases the model to be more likely to guess those products that are the most common in the field. This bias can be compensated for using statistical methods.
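The disclosure does not name a particular statistical method. One commonly used option, shown here purely as an assumed example, is prior correction: each predicted class probability is multiplied by the ratio of a desired class prior to the class prior seen in training, and the results are renormalized. The function name and the dictionary-based inputs are hypothetical.

```python
def reweight_for_priors(probs, train_prior, target_prior):
    """Rescale class probabilities from the training prior to a target prior.

    All three arguments are assumed to be dicts keyed by product label.
    Each probability is multiplied by target_prior / train_prior and the
    results are renormalized so that they sum to 1.
    """
    adjusted = {
        label: p * target_prior[label] / train_prior[label]
        for label, p in probs.items()
    }
    total = sum(adjusted.values())
    if total == 0:
        raise ValueError("adjusted probabilities sum to zero")
    return {label: value / total for label, value in adjusted.items()}
```

For instance, if product "a" made up 75% of the field data and the model outputs 0.6 for "a" and 0.4 for "b", correcting toward a uniform prior yields roughly 0.33 for "a" and 0.67 for "b", which reduces the pull toward the product that was simply more common in the field.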
As described, the VPI 112 may be implemented in an application (app) for a mobile device. In one example, there may be a dedicated iOS framework that eases integration into mobile solutions. Models can be hosted on any server over the network 104 and downloaded or accessed when needed. Cloud configuration ensures model updates can be rolled out gradually. The model can be delivered in any format ideal for the target device.
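A gradual rollout of the kind mentioned above could, for example, assign each device to a stable percentage bucket and serve the newer model only to buckets below a configured threshold. The following is a minimal sketch under that assumption; the function names, the device identifier, and the version strings are hypothetical and do not reflect any particular cloud configuration service.

```python
import hashlib


def rollout_bucket(device_id: str) -> int:
    """Map a device identifier to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(device_id.encode("utf-8")).hexdigest()
    return int(digest[:8], 16) % 100


def select_model_version(device_id, stable_version, candidate_version, rollout_percent):
    """Return the candidate version for roughly `rollout_percent` of devices."""
    if not 0 <= rollout_percent <= 100:
        raise ValueError("rollout_percent must be between 0 and 100")
    if rollout_bucket(device_id) < rollout_percent:
        return candidate_version
    return stable_version
```

Because the bucket is derived from a hash of the device identifier, a given device receives the same answer on every request, and raising the percentage only adds devices to the candidate group.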
Retraining models allows them to learn from new data captured by users. All images captured by users may be stored in a web database over the network 104. These images augment the existing dataset to fine-tune the model for real-world performance. Accuracy between versions can be tracked and compared. As new models are trained with field data, they can migrate from beta to release status.
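The comparison between versions might be sketched as follows. The promotion rule, under which a beta model is released only when its accuracy on a shared evaluation set is at least that of the release model plus a margin, is an assumption made for illustration; the function names and the `min_gain` parameter are hypothetical.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the ground-truth labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("at least one labeled example is required")
    correct = sum(1 for p, t in zip(predictions, labels) if p == t)
    return correct / len(labels)


def should_promote(beta_accuracy, release_accuracy, min_gain=0.0):
    """Return True when the beta model beats the release model by the margin."""
    return beta_accuracy >= release_accuracy + min_gain
```

Both models would be evaluated on the same held-out images so that the two accuracy values are directly comparable.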
The process described herein may be subject to different embodiments than the example embodiments described. For example, the inventory system may not be a solely self-contained software tool or process. The image acquisition and usage process may improve the utility and ease of use of connected or otherwise available business systems. The results of the classification and confirmation data can be sent to a Supply Chain Management (SCM) system for use in completing orders for novice users who may not otherwise be able to make product identification decisions efficiently, accurately, or quickly. The results could also be used with other business systems that interact with real-world objects, settings and beings. In another embodiment, Enterprise Resource Planning (ERP) systems could leverage the classifications for asset management or more efficient facility operations and logistics support. In another embodiment, Customer Relationship Management (CRM) systems could use this system for contextual customer information delivery, personnel or facial recognition, or other marketing automation features. In another embodiment, Learning Management Systems (LMS), or Knowledge Management and Learning Experience Platforms (LXP) could use this system for provision of relevant training or performance support materials at time-of-need. In another embodiment, Business Process Management, Task Management or Work Order systems could use this system for provision of repair records, instructions or checklists delivered via recognized objects.
The system and process described above may be encoded in a signal bearing medium, a computer readable medium such as a memory, programmed within a device such as one or more integrated circuits, one or more processors or processed by a controller or a computer. That data may be analyzed in a computer system and used to generate a spectrum. If the methods are performed by software, the software may reside in a memory resident to or interfaced to a storage device, synchronizer, a communication interface, or non-volatile or volatile memory in communication with a transmitter, that is, a circuit or electronic device designed to send data to another location. The memory may include an ordered listing of executable instructions for implementing logical functions. A logical function or any system element described may be implemented through optic circuitry, digital circuitry, through source code, through analog circuitry, through an analog source such as an analog electrical, audio, or video signal or a combination. The software may be embodied in any computer-readable or signal-bearing medium, for use by, or in connection with an instruction executable system, apparatus, or device. Such a system may include a computer-based system, a processor-containing system, or another system that may selectively fetch instructions from an instruction executable system, apparatus, or device that may also execute instructions.
A “computer-readable medium,” “machine readable medium,” “propagated-signal” medium, and/or “signal-bearing medium” may comprise any device that includes, stores, communicates, propagates, or transports software for use by or in connection with an instruction executable system, apparatus, or device. The machine-readable medium may selectively be, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, device, or propagation medium. A non-exhaustive list of examples of a machine-readable medium would include: an electrical connection “electronic” having one or more wires, a portable magnetic or optical disk, a volatile memory such as a Random Access Memory “RAM”, a Read-Only Memory “ROM”, an Erasable Programmable Read-Only Memory (EPROM or Flash memory), or an optical fiber. A machine-readable medium may also include a tangible medium upon which software is printed, as the software may be electronically stored as an image or in another format (e.g., through an optical scan), then compiled, and/or interpreted or otherwise processed. The processed medium may then be stored in a computer and/or machine memory.
The illustrations of the embodiments described herein are intended to provide a general understanding of the structure of the various embodiments. The illustrations are not intended to serve as a complete description of all of the elements and features of apparatus and systems that utilize the structures or methods described herein. Many other embodiments may be apparent to those of skill in the art upon reviewing the disclosure. Other embodiments may be utilized and derived from the disclosure, such that structural and logical substitutions and changes may be made without departing from the scope of the disclosure. Additionally, the illustrations are merely representational and may not be drawn to scale. Certain proportions within the illustrations may be exaggerated, while other proportions may be minimized. Accordingly, the disclosure and the figures are to be regarded as illustrative rather than restrictive.
One or more embodiments of the disclosure may be referred to herein, individually and/or collectively, by the term “invention” merely for convenience and without intending to voluntarily limit the scope of this application to any particular invention or inventive concept. Moreover, although specific embodiments have been illustrated and described herein, it should be appreciated that any subsequent arrangement designed to achieve the same or similar purpose may be substituted for the specific embodiments shown. This disclosure is intended to cover any and all subsequent adaptations or variations of various embodiments. Combinations of the above embodiments, and other embodiments not specifically described herein, will be apparent to those of skill in the art upon reviewing the description.
The phrase “coupled with” is defined to mean directly connected to or indirectly connected through one or more intermediate components. Such intermediate components may include both hardware and software based components. Variations in the arrangement and type of the components may be made without departing from the spirit or scope of the claims as set forth herein. Additional, different or fewer components may be provided.
The above disclosed subject matter is to be considered illustrative, and not restrictive, and the appended claims are intended to cover all such modifications, enhancements, and other embodiments, which fall within the true spirit and scope of the present invention. Thus, to the maximum extent allowed by law, the scope of the present invention is to be determined by the broadest permissible interpretation of the following claims and their equivalents, and shall not be restricted or limited by the foregoing detailed description. While various embodiments of the invention have been described, it will be apparent to those of ordinary skill in the art that many more embodiments and implementations are possible within the scope of the invention. Accordingly, the invention is not to be restricted except in light of the attached claims and their equivalents.
This application claims priority to U.S. Provisional Patent App. 63/043,248, entitled “Visual Product Identification,” filed on Jun. 24, 2020, the entire disclosure of which is herein incorporated by reference.
Number | Date | Country
---|---|---
63043248 | Jun 2020 | US