This application is based upon and claims the benefit of priority of the prior Japanese Patent Application No. 2021-161970, filed on Sep. 30, 2021, the entire contents of which are incorporated herein by reference.
The embodiments discussed herein are related to a non-transitory computer-readable recording medium, an information processing method, and an information processing apparatus.
In stores such as supermarkets and convenience stores, self-service checkout systems have become popular. A self-service checkout system is a POS checkout system (POS stands for Point Of Sale) with which a user who is purchasing articles reads the barcodes of the articles and settles the bill. For example, introducing a self-service checkout system makes it possible to alleviate the manpower shortage attributed to the declining population, and to hold down manpower expenses.
According to an aspect of an embodiment, a non-transitory computer-readable recording medium stores therein an information processing program that causes a computer to execute a process including obtaining image data in which a predetermined area in front of an accounting machine, which is used by a user to register an article and pay a bill, is captured, obtaining an output result by inputting the image data into a machine learning model that is trained to identify an article and a storage for an article, and identifying, by referring to the article and the storage specified in the output result, an action taken by the user with respect to the article.
The object and advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the claims.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention.
With the technology mentioned above, it is difficult to detect the occurrence of an unfair action. For example, in the self-service checkout system, there are times when the user either makes an inadvertent mistake or cheats intentionally, which results in non-payment.
As far as inadvertent mistakes are concerned, omission of scanning can occur when an article is moved from the basket into a shopping bag without being scanned. For example, in a beer case containing six cans of beer, a barcode is attached to the beer case as well as to each beer can. In that case, a reading mistake can occur when the barcode of a beer can is mistakenly presented instead of the barcode of the beer case. As far as intentional cheating is concerned, there is barcode hiding, in which the user pretends to scan an article while hiding the barcode with his or her fingers.
In that regard, it is conceivable to install a weight sensor in the self-service checkout machine, so as to enable automatic counting of the number of articles and detect an unfair action. However, the implementation cost is too high, and it is not a practical solution, particularly for large stores or for store chains spread throughout the country.
Preferred embodiments will be explained with reference to accompanying drawings. However, the present invention is not limited by the embodiments described below. Meanwhile, the embodiments can be combined without causing any contradictions.
The information processing apparatus 100 represents an example of a computer connected to the camera 30 and the self-service checkout machine 50. Moreover, the information processing apparatus 100 is connected to the administrator terminal 60 via a network 3 that is configurable with various communication networks of the wired type as well as the wireless type. The camera 30 and the self-service checkout machine 50 can also be connected to the information processing apparatus 100 via the network 3.
The camera 30 represents an example of a camera that takes a video of the area in which the self-service checkout machine 50 is installed. The camera 30 sends the data of a video to the information processing apparatus 100. In the following explanation, the data of a video is sometimes referred to as “video data”.
The video data contains a plurality of image frames in chronological order. Each image frame is assigned with a frame number in ascending chronological order. A single image frame represents image data of a still image that is taken at a particular timing by the camera 30.
The self-service checkout machine 50 represents an example of a POS checkout system or an accounting machine using which a user 2, who is purchasing articles, performs the operations of reading the barcodes of the articles and settling the bill. For example, when the user 2 moves a target article for purchase within the scanning area of the self-service checkout machine 50, the self-service checkout machine 50 scans the barcode of that article and registers it as a target article for purchase.
The user 2 repeatedly performs the operation of article registration and, once the scanning of the articles is completed, operates the touch-sensitive panel of the self-service checkout machine 50 and requests for bill settlement. Upon receiving the request for bill settlement, the self-service checkout machine 50 presents the number of target articles for purchase and the purchase price, and performs a bill settlement operation. The self-service checkout machine 50 stores, in a memory unit, the information about the articles that were scanned since the start of the scanning performed by the user 2 till the request for bill settlement issued by the user 2; and sends the stored information as self-service checkout data (article information) to the information processing apparatus 100.
The administrator terminal 60 represents an example of a terminal device used by the administrator of the store. The administrator terminal 60 receives, from the information processing apparatus 100, a notification of an alert indicating that there was an unfair action in regard to the purchase of articles.
In this configuration, the information processing apparatus 100 obtains the image data of a predetermined area in front of the self-service checkout machine 50 using which the user 2 registers articles and pays the bill. Then, the information processing apparatus 100 inputs the image data in a machine learning model that is trained to identify the articles and the storage for the articles (such as a shopping bag), and obtains the output result. Subsequently, the information processing apparatus 100 refers to the articles and the storage specified in the output result, and identifies the action of the user with respect to the articles.
That is, the information processing apparatus 100 detects an article and the interaction of the user 2 with respect to the article (for example, the action of holding); and counts the number of articles taken out from the shopping basket, the number of articles that passed through the scanning position in the self-service checkout machine 50, and the number of articles put in the shopping bag. Then, the information processing apparatus 100 compares the number of counted articles with the number of articles scanned by the self-service checkout machine 50, and performs unfair action detection in regard to the purchase of the articles.
In this way, since the information processing apparatus 100 requires no weight sensor to be installed in the self-service checkout machine 50, not only can the implementation cost be held down, but unfair actions at the self-service checkout machine 50 can also be detected.
Functional Configuration
The communication unit 101 is a processing unit that controls the communication with other devices and is implemented using, for example, a communication interface. For example, the communication unit 101 receives video data from the camera 30; as well as sends the processing result obtained by the control unit 110 to the administrator terminal 60.
The memory unit 102 stores a variety of data as well as the computer programs to be executed by the control unit 110, and is implemented using, for example, a memory or a hard disk. The memory unit 102 is used to store a training data DB 103, a machine learning model 104, a video data DB 105, and a self-service checkout data DB 106.
The training data DB 103 is a database for storing the data that is used in the training of the machine learning model 104. For example, explained below with reference to
In the correct-answer information, the following information is set: a class for a person and an object that are the detection targets; a class indicating the interaction between a person and an object; and Bbox indicating the area of each class (Bbox stands for Bounding box, and represents the area information of an object). For example, the following information is set as the correct-answer information: area information of a “something” class indicating an object such as an article excluding a shopping bag; area information of a “person” class indicating the user who purchases an article; and the relationship (a “holding” class) indicating the interaction between the “something” class and the “person” class. That is, as the correct-answer information, the information related to an object held by a person is set. Meanwhile, the “person” class represents an example of a first-type class; the “something” class represents an example of a second-type class; the area information of the “person” class represents an example of first-type area information; the area information of the “something” class represents an example of second-type area information; and the interaction between a person and an object represents an example of interaction.
Moreover, the following information is also set as the correct-answer information: area information of a “shopping bag” class indicating a shopping bag; area information of a “person” class indicating the user of the shopping bag; and the relationship (a “holding” class) indicating the interaction between the “shopping bag” class and the “person” class. That is, as the correct-answer information, the information related to the shopping bag held by a person is set.
Generally, if the "something" class is created using normal object identification (object recognition), all objects unrelated to the task, such as the background, clothing, and small articles, end up being detected. Moreover, since every such object merely implies "something", nothing becomes clear except the fact that a large number of Bboxes are identified in the image data. In the case of HOID (Human-Object Interaction Detection), since the particular relationship of an object being held by a person is understood (other relationships, such as sitting or operating, are sometimes available too), that information can be used as meaningful information in a task (for example, an unfair action detection task in the self-service checkout machine). After an object is detected as "something", the shopping bag is identified as a unique class called "bag (shopping bag)". The shopping bag represents information that is valuable in the unfair action detection task in the self-service checkout machine, although it is not important in other tasks. Hence, the value of that information lies in its use based on the task-specific knowledge that an article is taken out from a basket (a shopping basket) and put in a bag. Thus, in this case, a useful effect can be achieved.
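The structure of a single HOID detection described above (a person Bbox, an object Bbox, the object class, and the interaction class) can be sketched as follows. This is a minimal illustrative sketch, not the actual data format of the machine learning model 104; the names `BBox`, `HoidDetection`, and the field names are assumptions introduced for illustration.

```python
from dataclasses import dataclass

@dataclass
class BBox:
    """Bounding box (Bbox): the area information of a detected class."""
    x: float
    y: float
    width: float
    height: float

@dataclass
class HoidDetection:
    """One person-object pair from a HOID output (hypothetical layout)."""
    person_box: BBox    # area information of the "person" class
    object_box: BBox    # area information of the "something" or "bag" class
    object_class: str   # "something" or "bag" (shopping bag)
    interaction: str    # e.g. "holding"

# Example: a person holding an unidentified article ("something")
det = HoidDetection(
    person_box=BBox(100, 50, 80, 200),
    object_box=BBox(150, 120, 30, 30),
    object_class="something",
    interaction="holding",
)
print(det.object_class)  # something
```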
Returning to the explanation with reference to
The video data DB 105 is a database for storing the video data that is obtained by imaging by the camera 30 installed in the self-service checkout machine 50. For example, in the video data DB 105, the video data is stored corresponding to each self-service checkout machine 50 or corresponding to each camera 30.
The self-service checkout data DB 106 is a database for storing a variety of data obtained from the self-service checkout machine 50. For example, the self-service checkout data DB 106 is used to store, for each self-service checkout machine 50, the number of articles registered as the target articles for purchase and the billing amount representing the total of the prices of all target articles for purchase.
The control unit 110 is a processing unit that controls the entire information processing apparatus 100 and is implemented using, for example, a processor. The control unit 110 includes a machine learning unit 111, a video obtaining unit 112, an unfair action detecting unit 113, and a warning unit 114. The machine learning unit 111, the video obtaining unit 112, the unfair action detecting unit 113, and the warning unit 114 are implemented using electronic circuits included in the processor or using the processes executed by the processor.
The machine learning unit 111 is a processing unit that uses a variety of training data stored in the training data DB 103, and performs machine learning of the machine learning model 104.
The video obtaining unit 112 is a processing unit that obtains the video data from the camera 30. For example, the video obtaining unit 112 obtains, as needed, the video data from the camera 30 that is installed in the self-service checkout machine 50, and stores the video data in the video data DB 105.
The unfair action detecting unit 113 is a processing unit that, based on the video data obtained by imaging the surroundings of the self-service checkout machine 50, detects an unfair action such as forgetting to scan an article. More particularly, the unfair action detecting unit 113 obtains image data of a predetermined area in front of the self-service checkout machine 50 using which the user 2 registers articles and pays the bill. Then, the unfair action detecting unit 113 inputs the image data into the machine learning model 104; obtains the output result; refers to the articles and the shopping bags specified in the output result; and identifies the actions of the user 2 with respect to the articles.
For example, the unfair action detecting unit 113 obtains the following from the output result of the HOID: the “person” class and the area information, the “article” (object) class and the area information, and the interaction between the person and the article. Then, the unfair action detecting unit 113 counts the number of articles with respect to which the user 2 (person) took a specific action such as holding (interaction).
Subsequently, the unfair action detecting unit 113 compares the counted number of articles with the scanning count indicating the number of articles that are scanned and registered in the self-service checkout machine 50 (i.e., the registered article count). If there is a difference in the two counts, then the unfair action detecting unit 113 detects an unfair action and notifies the warning unit 114.
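The comparison performed by the unfair action detecting unit 113 reduces to a single check: the counted number of articles against the scanning count registered in the self-service checkout machine 50. A minimal sketch of that check, with assumed function and parameter names, could look like this:

```python
def detect_unfair_action(counted_articles: int, scanned_articles: int) -> bool:
    """Return True when the number of articles the user handled exceeds
    the number registered by scanning (a possible scan omission)."""
    return counted_articles > scanned_articles

# Three articles were handled but only two were scanned: an alert case.
print(detect_unfair_action(3, 2))  # True
# Counts agree: no unfair action is detected.
print(detect_unfair_action(2, 2))  # False
```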
For example, as illustrated in (a) in
As illustrated in (b) in
As illustrated in (c) in
Herein, the explanation is given for a case in which the unfair action detecting unit 113 refers to the video data containing a plurality of sets of image data (frames) and counts the articles that the user 2 is purchasing.
As illustrated in
Subsequently, the unfair action detecting unit 113 obtains the image data 3 in which a person is captured who is taking out an article from a shopping basket; inputs the image data 3 in the HOID; and obtains the output result. Accordingly, the unfair action detecting unit 113 detects the action that is taken by the user 2 and that indicates moving the held article to the upper side of the shopping basket. Herein, since the detection result corresponds to (a) illustrated in
Then, the unfair action detecting unit 113 obtains the image data 4 in which a person is captured who is scanning an article; inputs the image data 4 in the HOID; and obtains the output result. Accordingly, the unfair action detecting unit 113 detects the action that is taken by the user 2 and that indicates moving the held article to the scanning position. Herein, since the detection result corresponds to (b) illustrated in
Subsequently, the unfair action detecting unit 113 obtains the image data 5 in which a person is captured who is putting an article into a shopping bag; inputs the image data 5 in the HOID; and obtains the output result. Accordingly, the unfair action detecting unit 113 detects the action that is taken by the user 2 and that indicates putting the held article in the held shopping bag. Herein, since the detection result corresponds to (c) illustrated in
Then, the unfair action detecting unit 113 obtains the image data 6 in which a person is captured who is taking out an article from a shopping basket; inputs the image data 6 in the HOID; and obtains the output result. Accordingly, the unfair action detecting unit 113 detects the action that is taken by the user 2 and that indicates moving the held article to the upper side of the shopping basket. Herein, since the detection result corresponds to (a) illustrated in
Subsequently, the unfair action detecting unit 113 obtains the image data 7 in which a person is captured who is scanning an article; inputs the image data 7 in the HOID; and obtains the output result. Accordingly, the unfair action detecting unit 113 detects the action taken by the user 2 for moving the held article to the scanning position. Herein, since the detection result corresponds to (b) illustrated in
As explained above, the unfair action detecting unit 113 inputs, in the HOID, the frames of the video data obtained as a result of imaging performed from the time when the user 2 brings the shopping basket to the position of the self-service checkout machine 50 to the time when the user 2 pays the bill; obtains the output result (the detection result); and accordingly performs action identification (action recognition) of the target for count-up. As a result, the unfair action detecting unit 113 can count the number of articles that the user 2 intends to purchase. Meanwhile, as far as ending the counting of the articles is concerned, for example, the counting can be ended when the article registration count is notified from the self-service checkout machine 50 or when the completion of article registration is notified from the self-service checkout machine 50.
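The frame-by-frame processing above, in which each frame is input to the HOID and the resulting detection is classified into one of the three actions (a), (b), or (c), can be sketched as follows. The functions `run_hoid` and `classify_action` are stand-ins for the actual model inference and action recognition; their names and the action labels are assumptions for illustration.

```python
def count_purchased_articles(frames, run_hoid, classify_action):
    """For each frame, run the HOID and classify the detected person-object
    interaction into (a) taking out of the basket, (b) passing the scanning
    position, or (c) putting into the shopping bag; tally each separately."""
    counts = {"from_basket": 0, "at_scan_position": 0, "into_bag": 0}
    for frame in frames:
        detections = run_hoid(frame)           # person/object pairs + interactions
        action = classify_action(detections)   # one of the keys above, or None
        if action in counts:
            counts[action] += 1
    return counts

# Stubbed demonstration: each "frame" is pre-labeled with its action.
frames = ["take", "scan", "bag", "take", "scan"]
label = {"take": "from_basket", "scan": "at_scan_position", "bag": "into_bag"}
counts = count_purchased_articles(frames, run_hoid=lambda f: f,
                                  classify_action=lambda d: label.get(d))
print(counts)  # {'from_basket': 2, 'at_scan_position': 2, 'into_bag': 1}
```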
Then, the unfair action detecting unit 113 obtains, from the self-service checkout machine 50, the article registration count registered in the self-service checkout machine 50; compares the obtained article registration count with the counted number of articles; and detects any unfair action taken by the user 2.
Returning to the explanation with reference to
Flow of Operations
On the other hand, when the article count is not greater than the scanning count (No at S103), and when the billing operation has not been selected by an operation of the user 2 and the self-service checkout machine 50 continues the scanning operation (No at S105), the operations from S101 onward are performed again.
On the other hand, when the billing operation is selected as a result of an operation performed by the user 2 (Yes at S105), the self-service checkout machine 50 performs the billing operation (S106). Then, the information processing apparatus 100 compares the article count, which is counted till the billing operation is performed, and the scanning count, which is registered in the self-service checkout machine 50 till the billing operation is performed (S107).
If the article count is greater than the scanning count (Yes at S107), then the information processing apparatus 100 detects an unfair action; takes measures to issue a warning (S108); and requests for a correction operation (S109). On the other hand, if the article count is not greater than the scanning count (No at S107), then the information processing apparatus 100 ends the operations.
Operation for Counting Number of Articles
Given below is the explanation of an operation for counting the number of articles. Herein, the explanation is given about an example in which the information processing apparatus 100 counts the number of articles taken out from the shopping basket and the number of articles put in the shopping bag.
Once a person and an object are detected (Yes at S202), the information processing apparatus 100 determines whether or not the action indicates taking out an object from the shopping basket (S203). If the action indicates taking out an object from the shopping basket (Yes at S203), then the information processing apparatus 100 increments a PickUpFromBasket count (S204).
Then, the information processing apparatus 100 determines whether or not the action indicates putting the object in the shopping bag (S205). If the action indicates putting the object in the shopping bag (Yes at S205), then the information processing apparatus 100 increments a PutInBag count (S206).
Subsequently, if the scanning is to be continued (No at S207), then the information processing apparatus 100 again performs the operations from S201 onward. When all of the scanning has ended (Yes at S207), the information processing apparatus 100 sets the greater count between the PickUpFromBasket count and the PutInBag count as the article count (S208).
Meanwhile, at S202, if no person or object is detected (No at S202), the information processing apparatus 100 performs the operation at S207 without performing the operations from S203 to S206. At S203, if the action of taking out an object from the shopping basket is not performed (No at S203), then the information processing apparatus 100 performs the operation at S205 without performing the operation at S204. Moreover, at S205, if the action of putting an article in the shopping bag is not performed (No at S205), then the information processing apparatus 100 performs the operation at S207 without performing the operation at S206.
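The counting flow of S201 to S208 can be condensed into a short sketch: each recognized action increments its own counter, and the greater of the two counters is reported as the article count. The function and label names below are assumptions chosen to mirror the flowchart.

```python
def count_articles(actions):
    """Sketch of S201-S208: increment PickUpFromBasket for each take-out
    action, PutInBag for each bag-in action, and report the greater of
    the two counts as the article count (S208)."""
    pick_up_from_basket = 0
    put_in_bag = 0
    for action in actions:
        if action == "take_out_from_basket":   # Yes at S203 -> S204
            pick_up_from_basket += 1
        if action == "put_in_bag":             # Yes at S205 -> S206
            put_in_bag += 1
    return max(pick_up_from_basket, put_in_bag)

# The user took three articles out of the basket but only two were seen
# going into the bag: the greater count, 3, is used as the article count.
print(count_articles(["take_out_from_basket", "put_in_bag",
                      "take_out_from_basket", "put_in_bag",
                      "take_out_from_basket"]))  # 3
```

Taking the maximum makes the count robust against either action being missed in a frame, which matches the choice made at S208.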
Effects
As explained above, as a result of using the HOID, the information processing apparatus 100 becomes able to detect an object such as an article or a shopping bag having interaction with the user (person). At that time, in order to count the number of articles that the user brings to the self-service checkout machine 50 with the intention of purchasing, the information processing apparatus 100 detects the shopping bag or the shopping basket that enables confirmation of bringing the articles or taking out the articles, and can accurately count the number of articles that the user intends to purchase. As a result, the information processing apparatus 100 becomes able to detect any unfair action taken at the self-service checkout machine 50. Meanwhile, an unfair action not only includes intentionally skipping the scanning of an article, but also includes forgetting to scan an article.
Meanwhile, in commonly-used object identification, identification is difficult unless a large volume of training data is available for each article; moreover, objects having no interaction with a person, such as objects in the background, also get identified. In contrast, since the information processing apparatus 100 uses the HOID, which identifies only the objects having interaction with a person, an arbitrary object can be identified as "something" regardless of its external appearance, and the corresponding object area (Bbox) can be estimated. Moreover, in the image data of the self-service checkout machine 50, the shopping bag and the shopping basket appear frequently and change little in appearance compared to the articles. Hence, the cost of collecting the training data used for training the HOID can be reduced.
In the first embodiment, the explanation is given about the action identification performed using the HOID. However, that is not the only possible case. Alternatively, instead of using the HOID, it is also possible to use a machine learning model in which a neural network is used, or to use a machine learning model that is generated based on deep learning.
The information processing apparatus 100 inputs image data A1 (a frame) in the machine learning model, and obtains an output result S1 in which the position information of a person is detected. Then, the information processing apparatus 100 inputs subsequently-obtained image data A2 in the machine learning model, and obtains an output result S2 in which the position information of a person and the position information of an article are detected.
Subsequently, the information processing apparatus 100 calculates the difference between the output result S1 and the output result S2, and performs action identification. For example, when a “person” is detected in both output results, if an “article” is detected as the difference therebetween, then the information processing apparatus 100 counts the article as the target article for purchase.
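The inter-frame difference between the output results S1 and S2 can be sketched as follows. The dictionary layout and the function name are assumptions for illustration; the idea is only that an article appearing in the later output, while a person is present in both, is counted as a target article for purchase.

```python
def count_new_articles(prev_result, next_result):
    """Inter-frame difference: when a person is detected in both outputs,
    count the articles that appear only in the later output."""
    if not (prev_result["person_detected"] and next_result["person_detected"]):
        return 0
    return max(0, next_result["article_count"] - prev_result["article_count"])

s1 = {"person_detected": True, "article_count": 0}  # output for image data A1
s2 = {"person_detected": True, "article_count": 1}  # output for image data A2
print(count_new_articles(s1, s2))  # 1
```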
In this way, the information processing apparatus 100 can perform action identification based on the inter-frame difference, and count the number of articles. Meanwhile, the subsequent method for unfair action detection is identical to the first embodiment. Hence, the detailed explanation is not given again. As a result, the information processing apparatus 100 becomes able to provide a simple system in which a machine learning model is used.
Moreover, the information processing apparatus 100 can perform action identification by combining the HOID and a machine learning model.
As illustrated in
Then, by combining the HOID and the machine learning model, the information processing apparatus 100 can identify the shopping basket or the shopping bag in the output result of the HOID, and can perform operations identical to the first embodiment. As a result, the information processing apparatus 100 can accurately detect the position of the shopping basket or the shopping bag, thereby enhancing the detection accuracy of the article and the interaction as well as the accuracy of unfair action detection.
The embodiments of the present invention have been described above. However, besides the embodiments described above, the present invention can be implemented in various other forms.
Numerical Values
In the embodiments described above, the number of self-service checkout machines, the number of cameras, the numerical values, the training data, the number of sets of training data, the machine learning models, the class names, the number of classes, and the data formats are only exemplary; and can be changed in an arbitrary manner. Moreover, the flow of operations explained with reference to each flowchart can also be changed without causing any contradictions. Furthermore, regarding various models, it is possible to use models generated according to various algorithms of neural networks.
Moreover, regarding the scanning position and the position of the shopping basket, the information processing apparatus 100 can use a different machine learning model meant for detecting positions, or can use a known technology such as an object detection technology or a position detection technology. For example, based on the differences among the frames (image data) and the chronological variation in the frames, the information processing apparatus 100 can detect the position of the shopping basket. Thus, using that information, the information processing apparatus 100 can either perform detection or generate a different model. Furthermore, the information processing apparatus 100 can specify in advance the size of the shopping basket and, when an object of the specified size is detected from the image data, can identify that position as the position of the shopping basket. Meanwhile, since the scanning position is a fixed position to a certain extent, the information processing apparatus 100 can identify, as the scanning position, the position specified by the administrator.
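The size-based identification of the shopping basket mentioned above, in which an object whose bounding box matches a pre-specified basket size is treated as the basket, could be sketched as follows. The function name, parameters, and tolerance value are assumptions for illustration.

```python
def matches_basket_size(box_w, box_h, expected_w, expected_h, tolerance=0.2):
    """Identify a detection as the shopping basket when its bounding box
    is within a fractional tolerance of the pre-specified basket size."""
    return (abs(box_w - expected_w) <= tolerance * expected_w
            and abs(box_h - expected_h) <= tolerance * expected_h)

# A 200x150 detection is close enough to the specified 210x140 basket.
print(matches_basket_size(200, 150, 210, 140))  # True
# A 60x40 detection (e.g. a small article) is not.
print(matches_basket_size(60, 40, 210, 140))    # False
```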
System
The processing procedure, the control procedure, the specific names, and the information including various kinds of data and parameters described in the above document and the drawings can be changed arbitrarily, unless otherwise specified.
Moreover, the specific forms of distribution or integration of the constituent elements of the devices are not limited to the examples illustrated in the drawings. For example, the video obtaining unit 112 and the unfair action detecting unit 113 can be integrated. That is, the constituent elements, as a whole or in part, can be separated or integrated either functionally or physically based on various types of loads or use conditions. Furthermore, the process functions implemented in the device can be entirely or partially implemented by a CPU and a computer program that is analyzed and executed by the CPU, or can be implemented as hardware by wired logic.
Hardware
The communication device 100a is a network interface card that performs communication with other devices. The HDD 100b is used to store a computer program meant for implementing the functions illustrated in
The processor 100d reads, from the HDD 100b, the computer program meant for implementing operations identical to the processing units illustrated in
In this way, as a result of reading and executing the computer program, the information processing apparatus 100 operates as an information processing apparatus that implements an information processing method. Alternatively, the information processing apparatus 100 can make a medium reading device read the computer program from a recording medium, and can execute the read computer program so as to implement functions identical to the embodiments described above. Meanwhile, the computer program is not limited to being executed by the information processing apparatus 100. For example, even when the computer program is executed by another computer, by another server, or by a computer and a server in cooperation, the embodiments described above can be implemented in an identical manner.
The computer program can be distributed via a network such as the Internet. Alternatively, the computer program can be recorded in a computer-readable recording medium such as a flexible disk (FD), a compact disc read only memory (CD-ROM), a magneto-optical disk (MO), or a digital versatile disc (DVD). Then, a computer can read the computer program from the recording medium and execute it.
The communication interface 400a is a network interface card that performs communication with other information processing apparatuses. The HDD 400b is used to store computer programs meant for implementing the functions of the self-service checkout machine 50, and to store databases.
The processor 400d is a hardware circuit that reads, from the HDD 400b, a computer program meant for implementing the functions of the self-service checkout machine 50; loads the computer program in the memory 400c; and runs a process for implementing the functions of the self-service checkout machine 50. That is, the process implements functions identical to the processing units of the self-service checkout machine 50.
In this way, as a result of reading and executing the computer program, the self-service checkout machine 50 operates as an information processing apparatus that implements an operation control method. Alternatively, the self-service checkout machine 50 can make a medium reading device read the computer program from a recording medium, and can execute the read computer program so as to implement functions identical to the self-service checkout machine 50. Meanwhile, the computer program is not limited to being executed by the self-service checkout machine 50. For example, even when the computer program is executed by another computer, by another server, or by a computer and a server in cooperation, the embodiments described above can be implemented in an identical manner. The computer program meant for implementing the functions of the self-service checkout machine 50 can be distributed via a network such as the Internet.
Alternatively, the computer program can be recorded in a computer-readable recording medium such as an FD, a CD-ROM, an MO, or a DVD. Then, a computer can read the computer program from the recording medium and execute it.
The input device 400e detects various input operations performed by the user, such as an input operation performed with respect to the computer program executed by the processor 400d. Examples of the input operation include a touch operation. In the case of enabling touch operations, the self-service checkout machine 50 further includes a display unit, and the input operation detected by the input device 400e can be a touch operation performed on the display unit. The input device 400e can be, for example, a set of buttons, a touch-sensitive panel, or a proximity sensor. Moreover, the input device 400e reads barcodes; for example, the input device 400e is a barcode reader. The barcode reader includes a light source and an optical sensor, and scans barcodes.
The output device 400f outputs data, which is output from the computer program executed by the processor 400d, via an external device connected to the self-service checkout machine 50, such as an external display device. Meanwhile, when a display unit is included therein, the self-service checkout machine 50 need not include the output device 400f.
According to an aspect of the present invention, it becomes possible to detect an unfair action taken at a self-service checkout machine.
All examples and conditional language recited herein are intended for pedagogical purposes of aiding the reader in understanding the invention and the concepts contributed by the inventor to further the art, and are not to be construed as limitations to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of the superiority and inferiority of the invention. Although the embodiments of the present invention have been described in detail, it should be understood that the various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.
Number | Date | Country | Kind |
---|---|---|---|
2021-161970 | Sep 2021 | JP | national |