This application is based upon and claims the benefit of priority from Japanese Patent Application No. 2012-049464, filed Mar. 6, 2012, the entire contents of which are incorporated herein by reference.
Embodiments described herein relate generally to an object identification system which uses standard images of an object.
Recognition technology that identifies commodities from image data is known. This recognition technology compares dictionary data with feature values of the image data using algorithms such as the pattern matching method, the minutiae method, and frequency analysis. In recent years, these recognition technologies have been considered for use in scanners in supermarkets.
According to embodiments, an object identification system is disclosed. The object identification system comprises a dictionary file comprising multiple records, each record including an object identification code and one or more standard images, wherein each standard image is related to one of the object identification codes. The object identification system further comprises a computation module configured to calculate a similarity by comparing image data produced by an image sensor with the standard images in each record, and an identification module configured to identify one or more of the object identification codes based on the calculated similarity. The object identification system further comprises a production module configured to produce a graphical user interface that displays each of one or more standard images that are related to one of the object identification codes specified by a user.
According to additional embodiments, an object identification method is disclosed. The object identification method comprises: receiving image data produced by an image sensor, and comparing the received image data with a plurality of standard images each related to an object identification code, wherein the plurality of standard images and the object identification codes are stored in a dictionary file. The object identification method further comprises calculating a similarity between the received image data and the standard images related to each object identification code, and identifying one or more of the object identification codes based on the calculated similarity. The object identification method further comprises accepting a user's selection of one of the object identification codes, and producing a graphical user interface that displays the one or more standard images that are related to the selected object identification code.
Hereinafter, further embodiments will be described with reference to the drawings. In the drawings, the same reference numerals denote the same or similar portions, respectively.
An embodiment will be explained with reference to
The POS terminal 11 includes a keyboard 22, a display device 23, and a display device 24. The keyboard 22 is an input device for receiving input from the operator. The display device 23 displays information for the operator. The display device 23 includes a touch panel 26 on its surface 23a. This touch panel 26 detects the location where the operator's hand has contacted. The display device 24 displays information for the customers. The display device 24 may also have a touch panel 24a on its surface. The POS terminal 11 supports the ability of the display device 24 to turn, so that the operator is able to turn the display device 24 in the desired direction.
The product readout device 101 may be placed on the top of the counter 151. The product readout device 101 transmits data to and receives data from the POS terminal 11. The counter 151 is arranged parallel with the customer aisle, and the customer is likely to move along the counter 151. A checkout stand 51 is placed next to the counter 151, downstream in the customer movement direction. The operators may operate the product readout device 101 and the POS terminal 11 in a space surrounding the counter 151 and the checkout stand 51.
The product readout device 101 includes a housing 102. The housing 102 includes a readout window 103 at the front, a product readout section 110 internally, and an illumination source 166 that is not shown in
The housing 102 includes space for an input/output section 104 at the top. The input/output section 104 includes a display device 106, a keyboard 107, a slot 108, and a display device 109. The display device 106 includes a touch panel 105 on its surface. The display device 106 displays information for the operator. The slot 108 includes a card reader to read the magnetic stripe on the back of credit and debit cards. The display device 109 displays information for customers.
Customers may place a shopping cart 153a on the side of the product readout device 101 and also on the top 152 of the counter 151. The operator has an empty shopping cart 153b ready next to the other side of the product readout device 101. The operator removes product G from shopping cart 153a and moves product G to the front of the readout window 103. The image sensor 164 receives image data of product G through the readout window 103. After the image sensor 164 receives the image data, the operator places product G into the shopping cart 153b provided in advance. In this way, the readout device 101 receives product image data from the operator's actions.
The microcomputer 60 is electronically connected to the drawer 21, the keyboard 22, the display device 23, the display device 24, the touch panel 26, a Hard Disk Drive (HDD) 64, a printer 66, a communication interface 25, and an external interface 65.
The keyboard 22 includes, at a minimum, a numerical key section 22d, a #1 function key 22e, and a #2 function key 22f. The numerical key section 22d has multiple numerical keys and operator keys. The printer 66 prints customer receipt information on roll paper.
The hard disk drive (HDD) 64 stores application programs PR and data including PLU file F1, image file F2, dictionary F3, and transaction file F4. The CPU 61 copies an application program PR to the RAM 63 at the time of launching the POS terminal 11, and executes the application program PR stored in the RAM 63. The CPU 61 reads out data stored in the HDD 64 as needed, based on demand from the application program PR.
The external interface 65 is connected to the product readout device 101. The communication interface 25 is connected to a server CS via a network. The server CS has an HDD which stores the master file for PLU file F1. The POS terminal 11 may periodically synchronize its files F1, F2, F3, and F4 with this master file.
The product readout device 101 includes the product readout section 110 and the input/output section 104. The product readout section 110 includes a microcomputer 160, the image sensor 164, a sound output section 165, the illumination source 166, and an external interface 175. The microcomputer 160 controls the image sensor 164, the sound output section 165, and the external interface 175. The microcomputer 160 includes the CPU 161, the ROM 162, and the RAM 163. A signal line mutually connects the CPU 161, the ROM 162, and the RAM 163. The RAM 163 stores the programs that the CPU 161 executes.
A color CCD or CMOS type sensor module may be used for the image sensor 164. This image sensor 164 creates image data consecutively at a frame rate of 30 frames per second. The image data is stored in the RAM 163. Hereafter, the frame images will be expressed in the order they are created as FI(n), where n is an integer; FI(2) may be assumed to be the frame image created after FI(1).
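As a non-limiting illustration, the consecutive acquisition and buffering of frame images FI(n) can be sketched as follows; the placeholder sensor class and the bounded ring buffer modeling the RAM 163 are assumptions made for the sketch only, not part of the embodiment.

```python
from collections import deque

class ImageSensor:
    """Hypothetical stand-in for the image sensor 164: it creates
    frame images FI(n) consecutively, numbered in creation order."""
    def __init__(self, frame_rate=30):
        self.frame_rate = frame_rate  # frames per second
        self.n = 0

    def capture(self):
        # A real CCD/CMOS module would return pixel data; a labeled
        # placeholder frame is returned here instead.
        self.n += 1
        return {"index": self.n, "pixels": [[0] * 4 for _ in range(4)]}

# Frames are stored in the RAM (163); a bounded deque models the buffer.
ram_buffer = deque(maxlen=8)
sensor = ImageSensor(frame_rate=30)
for _ in range(3):
    ram_buffer.append(sensor.capture())

print([frame["index"] for frame in ram_buffer])  # FI(1), FI(2), FI(3) in order
```

The bounded buffer reflects that later modules read frames from the RAM 163 in creation order, with older frames eventually overwritten.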
The sound output section 165 includes a sound circuit, a speaker, and the like. The sound circuit converts warning sounds and voice messages stored in the RAM 163 beforehand into analog audio signals. The speaker outputs the analog signal created in the sound circuit as sound.
The input/output section 104 includes the touch panel 105, the display device 106, the display device 109, the keyboard 107, and an external interface 176. The external interface 176 connects the input/output section 104 to the product readout section 110 and the POS terminal 11.
The product category indicates categories such as fruits and vegetables. The feature value is data calculated by the CPU 161 based on multiple standard images mentioned later. The threshold is a lower limit of similarity. The CPU 161 may remove from candidacy for identification any product whose similarity is below this threshold value. For example, the CPU 161 can determine, based on this threshold value, when a product has lost some of its freshness and its surface color has changed as time progressed. In short, when the similarity is below the threshold value, it is determined that the product in the frame image is not in a proper state.
The standard image is image data taken (for example, by a digital camera) of a product and is the standard data used to determine product G. The standard image field stores the address and file name of the corresponding image data in image file F2, rather than the image data of the standard image itself.
The image file F2 stores multiple standard images produced in, for example, Joint Photographic Experts Group (JPEG) format. The standard images include multiple image data of the same product, taken under different conditions. Different conditions means, for example, differing camera directions and differing brightness. It cannot be identified from a frame image input by the image sensor 164 which part of the product is included. For that reason, multiple image data taken under various conditions are used as standard data.
The individual feature value is data collected from product G's surface unevenness, pattern, shade, and the like, and is formatted based on the standard images of each record.
The maximum value is the maximum similarity computed by comparing the frame image against the individual feature value of every record of dictionary F3. The maximum value field of a record may be modified based on the product code computed by this system from the frame image. For example, when this system determines that the product code is “0000000101”, only the records containing “0000000101” within dictionary F3 will be updated; this system changes the maximum value field of each such record.
The feature value of PLU file F1 is calculated based on the multiple standard images whose addresses are described in dictionary F3. For example, a case in which the feature value for product code “0000020100” is determined will be explained. This system chooses the records with product code “0000020100” from dictionary F3 and calculates the feature value from the standard images referenced by those records. The calculated feature value will be stored in the feature value field of PLU file F1. The feature value may also be calculated using individual feature values instead of standard images.
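For illustration, the calculation of one product code's feature value from its multiple standard images can be sketched as follows; the list-of-pairs layout of dictionary F3 records and the simple averaging of feature vectors are assumptions made for this sketch.

```python
def feature_from_standard_images(records, product_code):
    """Average the feature vectors of all standard images whose record
    carries the given product code; the mean becomes the PLU file F1
    feature value for that product."""
    vectors = [vec for code, vec in records if code == product_code]
    return [sum(component) / len(vectors) for component in zip(*vectors)]

# Hypothetical dictionary F3 records: (product code, standard image feature).
dictionary_f3 = [
    ("0000020100", [0.2, 0.4]),
    ("0000020100", [0.4, 0.6]),
    ("0000000101", [0.9, 0.1]),
]
feature = feature_from_standard_images(dictionary_f3, "0000020100")
```

Only the records with the selected product code contribute, mirroring the selection of “0000020100” records described above.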
The image acquisition module 1611 includes the ability to gain frame images FI by controlling the image sensor 164. The image acquisition module 1611 outputs signals to the image sensor 164, and the image sensor 164 starts recording after receiving these signals. The image sensor 164 sends out frame images FI to the RAM 163. The image acquisition module 1611 accepts the frame images stored in the RAM 163 in order, for example, frame image FI(1), frame image FI(2), etc.
The detection module 1612 detects product G in the frame image using pattern matching technology and extracts an outline from frame image (m) binary data. Next, the detection module 1612 extracts an outline from frame image (m-g) binary data. By comparing these outlines, the detection module 1612 detects product G from frame image (m), where m and g are integers.
Frame image (m-g) is a background image that can be obtained by the image acquisition module 1611 at a time when product G is not included in the frame image.
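A minimal sketch of this comparison against the background frame follows, assuming grayscale frames represented as 2-D lists; the embodiment compares extracted outlines, whereas simple pixel differencing is used here for brevity.

```python
def detect_product(frame_m, frame_m_minus_g, diff_threshold=50, min_changed=3):
    """Binarize the difference between frame image (m) and the background
    frame image (m-g); when enough pixels differ, an object (product G)
    is judged to be present in frame (m)."""
    changed = sum(
        1
        for row_m, row_bg in zip(frame_m, frame_m_minus_g)
        for p_m, p_bg in zip(row_m, row_bg)
        if abs(p_m - p_bg) > diff_threshold
    )
    return changed >= min_changed

background = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]        # frame (m-g)
with_product = [[10, 200, 200], [10, 200, 200], [10, 10, 10]]  # frame (m)
```

The `diff_threshold` and `min_changed` parameters are hypothetical tuning values; in the embodiment, the decision is made from the outline comparison instead.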
The detection module 1612 may also be able to detect a product from the skin tone of an operator's hand. When the detection module 1612 detects a skin tone region in the frame image, the detection module 1612 extracts an outline by binarization of the skin tone region of the frame image and its surroundings. From this outline, the detection module 1612 separately detects the hand outline and another object outline. The detection module 1612 may determine this other object to be product G.
The computation module 1613 calculates similarity by comparing feature values of product G included in the frame image with the feature values of PLU file F1. In concrete terms, the computation module 1613 obtains a partial or whole image of product G. The computation module 1613 may also obtain image data from inside the outline extracted by the detection module 1612. Based on the image obtained, the computation module 1613 computes feature value data A. Feature value data A is calculated without consideration of factors such as outline and size, to reduce the burden on the CPU 161.
The computation module 1613 calculates similarity by comparing feature value data A and feature value data B stored in PLU file F1. It is defined that, when comparing, identical feature values have a similarity of 100% or 1.0. The computation module 1613 may calculate the similarity by weighting such factors as color tone, surface unevenness, and surface pattern. This similarity is an absolute evaluation.
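The weighted, absolute similarity described above can be sketched as follows; treating the weighted factors as per-component weights over feature vectors normalized to [0, 1] is an assumption made for this sketch.

```python
def similarity(feature_a, feature_b, weights=None):
    """Absolute similarity between feature value data A (from the frame
    image) and feature value data B (from PLU file F1): identical
    features score 1.0, and each component may be weighted."""
    if weights is None:
        weights = [1.0] * len(feature_a)
    score = sum(
        w * (1.0 - abs(a - b))   # per-component agreement in [0, 1]
        for w, a, b in zip(weights, feature_a, feature_b)
    )
    return score / sum(weights)

# Identical feature values: similarity of 1.0 (i.e. 100%).
s_identical = similarity([0.3, 0.7, 0.5], [0.3, 0.7, 0.5])
# First component (e.g. color tone) weighted more heavily than the second.
s_weighted = similarity([1.0, 0.0], [0.0, 0.0], weights=[3.0, 1.0])
```

Because the score depends only on A and B, and not on the other candidates, it is an absolute evaluation in the sense used above.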
Also, the computation module 1613 may calculate similarity by comparing feature value data A and the individual feature values of the dictionary file F3. Furthermore, the computation module 1613 may calculate similarity by comparing the frame image with the standard images of dictionary F3.
This technology for recognizing objects from image data is called “Generic Object Recognition”, as explained in:
Keiji Yanai, “The Current State and Future Directions of Generic Object Recognition”, Transactions of Information Processing Society of Japan, Vol. 48, No. SIG16 (accessed Aug. 10, 2010 at http://mm.cs.uec.ac.jp/IPSJ-TCVIM-Yanai.pdf); and
Jamie Shotton et al., “Semantic Texton Forests for Image Categorization and Segmentation” (accessed Aug. 10, 2010 at http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.145.336&rep=rep1&type=pdf). Each of these references is hereby incorporated by reference as if set forth herein in its entirety.
This embodiment may use a relative evaluation as the similarity. When PLU file F1 has five different product records, the computation module 1613 calculates the similarity between product G and each of the five records as an absolute evaluation. These similarities are referred to as GA, GB, GC, GD, and GE. A relative assessment of each similarity can then be calculated, for example, as GA/(GA+GB+GC+GD+GE).
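The relative assessment can be sketched directly from the formula above; representing the candidates as a mapping from label to absolute similarity is an assumption for the sketch.

```python
def relative_similarities(absolute):
    """Convert absolute similarities into relative ones so that each
    candidate's score is its share of the total, e.g.
    GA / (GA + GB + GC + GD + GE)."""
    total = sum(absolute.values())
    return {code: value / total for code, value in absolute.items()}

absolute = {"GA": 0.8, "GB": 0.4, "GC": 0.4, "GD": 0.2, "GE": 0.2}
relative = relative_similarities(absolute)  # GA becomes 0.8 / 2.0 = 0.4
```

The relative scores sum to 1, so each candidate's score expresses its similarity as a proportion of all five candidates.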
The communication module 1614 sends a product code selected based on similarity, and the sales number, to the POS terminal 11. In concrete terms, the communication module 1614 extracts product codes of high similarity from PLU file F1 based on the similarity calculated by the computation module 1613. Also, the communication module 1614 determines whether the product in the frame image is in a proper state or not by comparing the product code similarity with the threshold of PLU file F1. When the product in the frame image is in a proper state, the communication module 1614 sends the product code to the POS terminal 11.
The GUI production module 1616 produces a graphical user interface to send to the display device 106. This GUI contains a selection GUI (described later), a verification GUI, and a list GUI.
The verification module 1615 sums up the number of records of each product code from dictionary F3. The verification module 1615 obtains the necessary information from PLU file F1 to produce the verification GUI and sends it to the GUI production module 1616.
The establishing module 1617 produces a standard image list for every product code. When the operator selects a product code in the verification GUI, the establishing module 1617 collects the necessary information from image file F2 and dictionary F3, and sends it to the GUI production module 1616.
A sales registration section 611 executes the process for payment by recording the transaction to transaction file F4 based on the product code and sales number received from the communication module 1614. A record of the transaction is printed on a receipt by the printer 66.
BT 23 is a button to register the frame image displayed in area R to dictionary F3. After the operator selects BT 23 and then selects BT 20, BT 21, or BT 22, the product code and the standard image will be registered to dictionary F3. It is also possible for the operator to input the product code using the keyboard 107 instead of selecting BT 20, BT 21, or BT 22.
The product name is obtained from PLU file F1. The standard image number is the number of records of each product code counted by the verification module 1615. BT 1 to BT 6 access the list GUI. BT 7 completes the verification in GUI G2. When the operator selects BT 7, the verification module 1615 will execute the dictionary F3 correction.
List GUI G3 contains illustrations, product names, a thumbnail area G10, maximum values, BT 7, BT 11, BT 12, and BT 13 to BT 16. The illustrations and product names are the same as in the verification GUI. The thumbnail area G10 is an area that arranges standard image thumbnails and a maximum value list.
Each standard image contains information such as the capture date, the person who took the picture, and the illumination degree in its header data. When the operator selects BT 13 to BT 15, the GUI production module 1616 executes a sorting procedure according to the selected terms. When the operator selects BT 16, the list is sorted by the maximum value field of dictionary F3.
The operator can match the cursor C1 to the intended thumbnail image by touching the screen. BT 11 is a button to display the standard image header data. After the operator selects one of the standard images using the cursor and presses BT 11, a window containing the header data will be displayed at the top of GUI list G3.
BT 12 is a button to erase a standard image. After the operator selects thumbnail G11, which corresponds to one of the standard images, with the cursor and then presses BT 12, the establishing module 1617 will erase the standard image corresponding to thumbnail G11 from image file F2. BT 7 saves any changes and exits the list GUI G3 display.
In
The detection module 1612 checks whether a product can be detected from frame image (m) by comparing frame image (m-g) and frame image (m) (Act 13). When frame image (m) contains no product, the detection module 1612 executes the same process on frame image (m+1).
When frame image (m) contains a product, the computation module 1613 calculates feature value data A from frame image (m) (Act 14). The computation module 1613 calculates similarity by comparing feature value data A and the feature values of PLU file F1 (Act 15).
The computation module 1613 checks whether the similarity has been calculated against all of the PLU file F1 records (Act 16). When all similarities have been calculated, the communication module 1614 extracts product codes with high similarity values and compares each similarity value to the threshold of that product code in PLU file F1. The communication module 1614 picks up the product codes whose similarity is equal to or greater than the threshold (Act 17).
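Acts 16 and 17 can be sketched as a filtering step; the mapping layouts for the similarities and per-product thresholds, and the fixed number of extracted candidates, are assumptions for this sketch.

```python
def pick_candidates(similarities, thresholds, top_n=3):
    """Keep product codes whose similarity is equal to or greater than
    the per-product threshold from PLU file F1, ordered from highest
    similarity down; at most top_n candidates are returned."""
    passed = [
        (code, sim)
        for code, sim in similarities.items()
        if sim >= thresholds[code]
    ]
    passed.sort(key=lambda item: item[1], reverse=True)
    return [code for code, _ in passed[:top_n]]

similarities = {"0000000101": 0.92, "0000020100": 0.75, "0000030200": 0.40}
thresholds = {"0000000101": 0.60, "0000020100": 0.60, "0000030200": 0.60}
candidates = pick_candidates(similarities, thresholds)
```

Codes below their threshold are removed from candidacy, consistent with the threshold being a lower limit of similarity.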
The GUI production module 1616 produces the selection GUI indicated in
When the button chosen by the operator is BT 23, the GUI production module 1616 records the product code and frame image displayed in area R to dictionary F3. This frame image will be stored in image file F2.
The CPU 161 checks whether all of the readout processes are completed (Act 22). When all of the readout processes are not completed, the image acquisition module 1611 will acquire a new frame image (Act 12). When all of the readout processes are completed, the image acquisition module 1611 sends an OFF signal to the image sensor 164.
The GUI production module 1616 produces the verification GUI indicated in
The GUI production module 1616 arranges data received from the establishing module 1617 according to a template file determined in advance and produces verification list GUI indicated in
In Act 33, the GUI production module 1616 waits for the selection of BT 1 to BT 6, or of BT 7 which indicates completion (Act 39). When the operator selects BT 7, if a record has been erased from dictionary F3, the verification module 1615 will confirm this deletion. The verification module 1615 will then update dictionary F3, including any other changes (Act 40).
In the first embodiment, the operator should be able to easily maintain the standard images. There is a possibility that a user will register an incorrect standard image into the hardware; in the first embodiment this problem can be easily eliminated. By resolving this problem, hardware processing speed and recognition accuracy should improve.
In the first embodiment, the POS terminal 11 and the product readout device 101 are composed of separate hardware systems, but the disclosure is not limited to these. There can be one hardware system, or a system with multiple servers. PLU file F1, image file F2, and dictionary F3 can be stored in a server other than the POS terminal 11. This server can be arranged in a cloud network composed of multiple servers. The databases for PLU file F1, image file F2, and dictionary file F3 may be modified appropriately according to the system, and are not limited to those of the first embodiment.
While certain embodiments have been described, these embodiments have been presented by way of example only, and are not intended to limit the scope of the invention. Indeed, the novel embodiments described herein may be embodied in a variety of other forms. Furthermore, various omissions, substitutions and changes in the form of the embodiments described herein may be made without departing from the spirit of the invention. The accompanying claims and their equivalents are intended to cover such forms or modifications as would fall within the scope and spirit of the invention.
Number | Date | Country | Kind |
---|---|---|---|
2012-049464 | Mar 2012 | JP | national |