The present invention relates to an information processing apparatus capable of using an artificial intelligence function, a control method thereof, and a storage medium.
Recently, keywords are increasingly being associated with data saved in an information processing apparatus (a process which will be called “tagging” hereinafter) and used to execute searches. For example, some image forming apparatuses, which are a type of information processing apparatus, have a “box” function for reading documents and storing the resulting data in storage in a variety of formats. Japanese Patent Laid-Open No. 2009-32186 discloses a technique in which, when data has been stored in a box, tags are added as information associated with the data, folders are created to hold the data, and so on, to make it easier to find desired data when searching for that data later on.
However, this conventional technique has the following issue. The information added as a tag is mainly used to search for data, but the information tagged to saved data could also be used to improve convenience for a user when they use the various functions provided by the information processing apparatus. For example, in addition to a printing function for printing images and the above-described box function, image forming apparatuses may have a “send” function, a fax function, or the like for transmitting image data read by the image forming apparatus to the exterior. There is demand, for example, for the ability to set a destination for the send function, the fax function, or the like by using an email address, a telephone number, or the like associated with a person who can be identified from an image included in saved image data. Doing so eliminates the burden of the user manually setting the destination when transmitting image data, which improves convenience for the user.
The present invention enables the realization of a technique in which, when transmitting image data to the exterior, a transmission destination is set in accordance with an image included in the image data.
One aspect of the present invention provides an information processing apparatus comprising: an obtaining unit that obtains image data; an extracting unit that extracts a feature amount of a predetermined object included in the image data; a determining unit that, based on the feature amount extracted by the extracting unit, determines whether a specific object is included in an image expressed by the image data; and a destination setting unit that, in a case where the determining unit has determined that the specific object is included in the image expressed by the image data, sets, as a transmission destination of the image data, contact information stored in association with the specific object stored in a storage unit in advance.
Another aspect of the present invention provides a control method for an information processing apparatus, the method comprising: obtaining image data; extracting a feature amount of a predetermined object included in the image data; determining, based on the feature amount extracted in the extracting, whether a specific object is included in an image expressed by the image data; and in a case where it has been determined in the determining that the specific object is included in the image expressed by the image data, setting, as a transmission destination of the image data, contact information stored in association with the specific object stored in a storage unit in advance.
Still another aspect of the present invention provides a non-transitory computer-readable storage medium storing a program for causing a computer to execute each step of a control method for an information processing apparatus, the method comprising: obtaining image data; extracting a feature amount of a predetermined object included in the image data; determining, based on the feature amount extracted in the extracting, whether a specific object is included in an image expressed by the image data; and in a case where it has been determined in the determining that the specific object is included in the image expressed by the image data, setting, as a transmission destination of the image data, contact information stored in association with the specific object stored in a storage unit in advance.
Further features of the present invention will become apparent from the following description of exemplary embodiments (with reference to the attached drawings).
Hereinafter, embodiments will be described in detail with reference to the attached drawings. Note that the following embodiments are not intended to limit the scope of the claimed invention. Multiple features are described in the embodiments, but the invention is not limited to one requiring all such features, and multiple such features may be combined as appropriate. Furthermore, in the attached drawings, the same reference numerals are given to the same or similar configurations, and redundant description thereof is omitted.
Note that a multifunction peripheral (digital multifunction peripheral; MFP), which is an image forming apparatus, will be described as an example of an information processing apparatus according to the embodiment. However, the applicable scope is not limited to a multifunction peripheral, and any information processing apparatus which has or can use the artificial intelligence function pertaining to image processing, which will be described below, can be used.
Configuration of Information Processing Apparatus
An embodiment of the present invention will be described hereinafter. First, an example of the configuration of an image forming apparatus 10, serving as the information processing apparatus according to the present embodiment, will be described with reference to
The image forming apparatus 10 includes an operating unit 150, a fax unit 160, a controller unit 100, a printer unit 120, a scanner unit 130, a power source unit 200, switches 142 to 145, and a power source switch 148. The controller unit 100, which is a CPU system, includes a CPU 204, ROM 103, RAM 104, an HDD 502, a network interface 106, and a BIOS 209.
The CPU 204 executes software programs stored in the RAM 104, the HDD 502, and the like, and controls the apparatus as a whole. The ROM 103 stores, for example, a startup program for the controller unit 100, programs and fixed parameters used when executing image processing, and so on. The RAM 104 is used to store programs, temporary data, and the like when the CPU 204 controls the image forming apparatus 10. Note that the programs, the temporary data, and the like stored in the RAM 104 are read out from the ROM 103, the HDD 502 (described below), or the like. The HDD 502 serves as main storage for storing programs executed by the CPU 204, program management tables, various types of data, and so on. The executed programs are, for example, boot programs executed by the CPU 204 in order to launch an OS when the information processing apparatus is started up (a boot loader 302 and a kernel 301). Although an HDD is described as being used as the storage here, an SSD, eMMC, NAND flash memory, NOR flash memory, or the like may be used instead.
The network interface 106 is connected to a network 118, and transmits and receives data to and from one or more external apparatuses which can be communicated with over the network 118. Specifically, the network interface 106 receives data sent over the network 118, transmits image data read by the scanner unit 130, data saved in the HDD 502, and the like to prescribed destinations over the network 118, and so on. The power source unit 200 supplies power to the image forming apparatus 10. When the power is off, the AC power source is isolated by the power source switch 148, and when the power source switch 148 is turned on, AC power is supplied to an AC-DC converter 141 to create a DC power source.
The AC power source (a power source device) can control three independent power systems of the overall apparatus in response to instructions from the CPU 204. The supply of power to the controller unit 100 can be controlled by a switch 142. The supply of power to the printer unit 120 can be controlled by a switch 143. The supply of power to the scanner unit 130 can be controlled by a switch 144.
A learning processing unit 105 carries out deep learning on images read by the scanner unit 130. Although the learning processing unit 105 is provided in the image forming apparatus 10 in the present embodiment, the configuration may be such that a learning server is provided outside the image forming apparatus 10 and used by being connected over a network. The functions of the learning processing unit 105 will be described in detail later with reference to
The scanner unit 130 is an example of a reading unit that reads a document and generates black-and-white image data, color image data, and the like. The scanner unit 130 is connected to the CPU 204 by a scanner control interface (not shown). The CPU 204 controls image signals input from the scanner unit 130 via the scanner control interface.
The printer unit 120 prints image data converted from PDL data accepted by the network interface 106, image data generated by the scanner unit 130, and the like onto paper (a sheet). The printer unit 120 includes a CPU 161 and a fixing unit 162, for example. The fixing unit 162 uses heat and pressure to fuse a toner image, which has been transferred to the paper, onto the paper. In
The power source switch 148 switches between supplying power and not supplying power to the image forming apparatus 10 by switching on and off. Whether the switch is on or off is determined based on a seesaw signal connected between the power source switch 148 and the CPU 204. When the seesaw signal is high, the power source switch 148 is on, whereas when the seesaw signal is low, the power source switch 148 is off.
The BIOS 209 is non-volatile memory storing a boot program (a BIOS). An image processing unit 208 is connected to the CPU 204, the printer unit 120, and the scanner unit 130. The image processing unit 208 performs image processing such as color space conversion on a digital image output from the scanner unit 130, and outputs the post-image processing data to the CPU 204. The image processing unit 208 also performs image processing such as color space conversion based on image data read by the scanner unit 130, converts the image data into bitmap data, and outputs the bitmap data to the printer unit 120.
The fax unit 160 can transmit and receive digital images to and from a telephone line or the like. In addition to the copy function, the image forming apparatus 10 can save data read by the scanner unit 130 in the HDD 502, and can execute a send function, a fax function, and the like for transmitting data over the network 118 or a fax line. With the copy function, data read by the scanner unit 130, image data received from an external apparatus such as a PC (not shown) connected over the network 118, image data received by the fax unit 160, and the like can be printed. With a save function of the image forming apparatus 10, data read by the scanner unit 130 is saved in the HDD 502. The saved data can be printed using the copy function, transmitted, using the send function or the fax function (described below), to an external apparatus connected over the network 118, and so on. The send function is a function for transmitting image data saved in the HDD 502, data read by the scanner unit 130, and so on over the network 118 to a designated destination. This will be described in greater detail later. The fax function is a function for transmitting image data saved in the HDD 502, data read by the scanner unit 130, and so on over a fax line.
Operating Unit
The operating unit 150 according to the present embodiment will be described next with reference to
The liquid-crystal operating panel 11 is a combination of a liquid-crystal display and a touch panel. The liquid-crystal operating panel 11 includes a display unit that displays operating screens, and when displayed keys are operated by a user, information corresponding thereto is sent to the controller unit 100. The start key 12 is used when starting operations for reading and printing a document image, and when instructing other functions to start. Two color LEDs, namely green and red, are incorporated into the start key 12. The green light being lit indicates that operations can start, whereas the red light being lit indicates that operations cannot start. The stop key 13 serves to stop operations which are underway. The physical key group 14 is provided with a numerical keypad, a clear key, a reset key, a guide key, and a user mode key. The power save key 15 is used when transitioning the image forming apparatus 10 from a normal mode, in which all functions can be used, to a sleep mode, in which only the minimum required operations are performed, and when transitioning back to the normal mode. The image forming apparatus 10 transitions to the sleep mode when the user operates the power save key 15 while in the normal mode, and transitions to the normal mode when the user operates the power save key 15 while in the sleep mode. Information required for creating job information, such as a username, a number of copies, and output attribute information, which are input by the user using the liquid-crystal operating panel 11, is transmitted to the controller unit 100.
Learning Processing Unit
The learning processing unit 105 according to the present embodiment will be described in detail next with reference to
The image obtaining unit 1051 passes an image read by the scanner unit 130, image data saved in the HDD 502, or the like to the image analyzing unit 1052 (described below). The image analyzing unit 1052 raster-scans (see
The registration DB 1054 stores the feature amount of the image data in association with an email address, a telephone number, or the like as contact information of the individual corresponding to that feature amount. Although the present embodiment describes an example in which this information is stored within the learning processing unit 105, the present invention is not intended to be limited thereto, and the information may be stored in the HDD 502, external memory (not shown), a server, or the like. When the determining unit 1053 has detected the face of an individual registered in the registration DB 1054 in the image data, the output unit 1055 outputs, to the CPU 204, the contact information, such as an email address or a telephone number, associated as a tag with the feature amount of the face of that individual in the registration DB 1054. The CPU 204 then sets that email address or telephone number as a destination. This will be described in greater detail later.
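By way of illustration only, the following is a minimal sketch of how an entry in the registration DB 1054 might associate the feature amount of a registered face with that individual's contact information; the names used here (FaceRecord, RegistrationDB) are illustrative and do not appear in the embodiment.

```python
# Minimal sketch of a registration DB entry: a face feature amount stored
# together with the contact information of the corresponding individual.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class FaceRecord:
    name: str                     # label of the registered individual
    feature: List[float]          # feature amount of the registered face image
    email: Optional[str] = None   # contact information used by the send function
    phone: Optional[str] = None   # contact information used by the fax function

@dataclass
class RegistrationDB:
    records: List[FaceRecord] = field(default_factory=list)

    def register(self, record: FaceRecord) -> None:
        self.records.append(record)

# Example registration (values are placeholders)
db = RegistrationDB()
db.register(FaceRecord(name="person A", feature=[0.12, 0.83, 0.44],
                       email="person.a@example.com", phone="+81-3-0000-0000"))
```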
A method used in the learning phase of the machine learning by the learning processing unit 105 will be described next. In the learning phase, the image forming apparatus itself is caused to create a determination standard by training the apparatus on a large number of face and non-face images. The images used for this training may be images read by the scanner unit 130, data saved in the HDD 502, or image data located on an external server. The images used for the training are transmitted to the image analyzing unit 1052 via the image obtaining unit 1051. The image analyzing unit 1052 calculates and analyzes a feature amount valid for facial recognition (e.g., a gradient histogram) for the image data which has been transmitted. The determining unit 1053 uses an output value found from the analyzed feature amount to set the determination standard for determining whether or not the face of an individual is present. Specifically, a face is determined to be present when the output value found from the gradient histogram of the input image data is greater than or equal to a set value.
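As a simplified, non-limiting sketch of this determination, the following code computes a gradient (orientation) histogram as the feature amount and compares an output value derived from it against a set value. The weights and the set value stand in for the determination standard that, in the embodiment, would be obtained through training; they are assumptions for illustration.

```python
# Sketch of the face/non-face determination: a normalized gradient-orientation
# histogram is the feature amount, and a weighted score of its bins is compared
# against a set value (threshold).
import numpy as np

def gradient_histogram(gray: np.ndarray, bins: int = 9) -> np.ndarray:
    gy, gx = np.gradient(gray.astype(np.float32))
    magnitude = np.hypot(gx, gy)
    orientation = (np.rad2deg(np.arctan2(gy, gx)) + 180.0) % 180.0
    hist, _ = np.histogram(orientation, bins=bins, range=(0.0, 180.0),
                           weights=magnitude)
    total = hist.sum()
    return hist / total if total > 0 else hist   # normalized feature amount

def face_present(feature: np.ndarray, learned_weights: np.ndarray,
                 set_value: float = 0.5) -> bool:
    # Output value: weighted score of the histogram bins; the weights would be
    # obtained in the learning phase from face and non-face training images.
    output_value = float(np.dot(learned_weights, feature))
    return output_value >= set_value
```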
Facial Recognition
A method for identifying the face of an individual from image data read by the scanner unit 130 and associating contact information of the individual, such as an email address or a telephone number, as a tag with that face, will be described next with reference to
A method used in the estimation phase of the machine learning by the learning processing unit 105 will be described next. First, the image data read by the scanner unit 130 is sent to the image analyzing unit 1052 via the image obtaining unit 1051. The image analyzing unit 1052 calculates a feature amount in the image data sent from the image obtaining unit 1051. The determining unit 1053 compares the calculated feature amount with the feature amounts saved in the registration DB 1054, and if there is an image having a feature amount within a set range, the contact information associated with the feature amount of that individual is output to the output unit 1055. Note that if there are a plurality of images having feature amounts within the set range, the contact information associated with the closest feature amount is output to the output unit 1055.
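A hedged sketch of this estimation-phase matching is shown below, building on the RegistrationDB sketch given earlier; the “set range” is modeled here, as an assumption, by a Euclidean distance threshold between feature amounts.

```python
# Sketch of the estimation phase: compare the calculated feature amount with
# every registered feature amount, and output the contact information of the
# closest registered face that falls within the set range.
import numpy as np

def match_contact(feature, db, set_range: float = 0.25):
    best_record, best_distance = None, None
    for record in db.records:   # RegistrationDB from the earlier sketch
        distance = float(np.linalg.norm(np.asarray(feature, dtype=float)
                                        - np.asarray(record.feature, dtype=float)))
        if distance <= set_range and (best_distance is None or distance < best_distance):
            best_record, best_distance = record, distance
    if best_record is None:
        return None             # no registered individual detected in the image
    return {"email": best_record.email, "phone": best_record.phone}
```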
Send Function Using Image Recognition AI
The send function of the image forming apparatus 10 according to the present embodiment, which uses image recognition AI, will be described next with reference to
When using the send function, the user can set whether or not to use the image recognition AI function.
A case where a group photograph has been read, as illustrated in
Here, the feature amounts of the faces of person A and person B, as well as the email addresses associated with those individuals, are registered in the registration DB 1054, and thus the email addresses of person A and person B are output to the CPU 204 from the output unit 1055. The CPU 204 automatically sets the email addresses of person A and person B, which have been output from the output unit 1055, as transmission destinations. When transmitting data, the user may transmit the data using the email addresses which have been automatically set, or may transmit the data after adding, deleting, or otherwise modifying the addresses which have been set.
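For illustration, the destination setting described here could be sketched as follows; the function name and data layout are hypothetical, and the example merely shows the email addresses output for the detected faces (person A and person B) being collected, deduplicated, and set as the transmission destinations, which the user may still add to, delete, or otherwise modify.

```python
# Sketch of the CPU 204 setting the output email addresses as send destinations.
def set_send_destinations(detected_contacts):
    destinations = []
    for contact in detected_contacts:
        email = contact.get("email")
        if email and email not in destinations:   # avoid duplicate destinations
            destinations.append(email)
    return destinations

# e.g. two registered faces matched in the group photograph:
contacts = [{"email": "person.a@example.com"}, {"email": "person.b@example.com"}]
print(set_send_destinations(contacts))  # the user may still edit these addresses
```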
The user interface in
As illustrated in
Note that the operations described above are not limited to the send function, and a similar function can be implemented when using the fax function as well, by employing telephone numbers instead of email addresses.
Processing Sequence
A sequence of processing for setting a destination when image data has been received using the send function, according to the present embodiment, will be described next with reference to
First, in step S1101, the CPU 204 causes a document such as a photograph to be read by the scanner unit 130, and image data is generated as a result. Then, in step S1102, the CPU 204 receives the generated image data through the image obtaining unit 1051 of the learning processing unit 105, and in step S1103, the image analyzing unit 1052 raster-scans the obtained image data.
Next, in step S1104, the CPU 204 uses the determining unit 1053 to determine whether or not there is a face image of an individual registered in the registration DB 1054. This determination is carried out through the above-described facial recognition and human specifying methods. If there is no face image of an individual, the sequence moves to step S1106, where the CPU 204 does not cause the output unit 1055 to output an email address; the automatic destination setting is not performed, and the sequence moves to step S1107. In step S1107, the CPU 204 sets the destination in response to user input, and the sequence then moves to step S1108.
On the other hand, if there is a face image of an individual registered in the registration DB 1054, the sequence moves to step S1105, where the CPU 204 causes the output unit 1055 to output the registered email address; that address is automatically set as the destination and displayed in the liquid-crystal operating panel 11, and the sequence then moves to step S1108. Here, even if the email address has been set automatically, the user can correct the destination as necessary by selecting the button 805 illustrated in
Then, in step S1108, the CPU 204 transmits image data to the destination which has been set, in response to the user operating the start key 12. Although the send function has been described as an example here, the present invention is not intended to be limited thereto, and a similar function can be implemented when using the fax function as well, by employing telephone numbers instead of email addresses.
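For illustration, the following sketch ties steps S1101 to S1108 together. The callables scan_document, detect_face_features, prompt_user_for_destination, and transmit are hypothetical stand-ins for the scanner unit 130, the learning processing unit 105, the operating unit 150, and the transmission itself; match_contact refers to the estimation-phase sketch given earlier.

```python
# End-to-end sketch of the S1101-S1108 sequence with automatic destination setting.
def send_with_auto_destination(db, scan_document, detect_face_features,
                               prompt_user_for_destination, transmit,
                               use_fax: bool = False):
    image = scan_document()                                  # S1101: read the document
    features = detect_face_features(image)                   # S1102-S1103: obtain and raster-scan
    key = "phone" if use_fax else "email"                    # fax uses telephone numbers instead
    destinations = []
    for feature in features:                                 # S1104: registered face present?
        contact = match_contact(feature, db)
        if contact and contact.get(key) and contact[key] not in destinations:
            destinations.append(contact[key])                # S1105: auto-set the destination
    if not destinations:                                     # S1106: nothing output
        destinations = prompt_user_for_destination()         # S1107: user sets the destination
    transmit(image, destinations)                            # S1108: transmit to the set destination
```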
As described above, the information processing apparatus according to the present embodiment obtains image data, extracts a feature amount of a predetermined object included in the obtained image data, and based on the extracted feature amount, determines whether or not a specific object is included in an image expressed by the image data. Furthermore, if it is determined that the specific object is included in the image expressed by the image data, the information processing apparatus sets, as a transmission destination of the image data, contact information stored in association with the specific object in memory or the like in advance. Thus, according to the present embodiment, when using a send function, a fax function, or the like, a feature amount is extracted from a read image using image recognition AI, the face of an individual having the same feature amount is specified, and a destination is set using the contact information, such as an email address, that has been tagged to that face. This eliminates the burden of the user setting the destination, which improves convenience for the user.
Variations
Note that the present invention is not limited to the aforementioned embodiment, and many variations can be carried out thereon. For example, although an image forming apparatus is described as an example of the information processing apparatus in the present embodiment, the present invention can also be applied in a mobile terminal such as a smartphone. In this case, the present invention can be applied in a method in which a photograph shot by the mobile terminal such as a smartphone is selected, contact information is automatically set for an individual appearing in that photograph, and the photograph is transmitted.
Specifically, by the mobile terminal executing an application for managing/displaying image data such as photographs stored in the mobile terminal (a photo app), a plurality of photographs stored in the mobile terminal are displayed in a display unit, such as a touch panel, of the mobile terminal. Then, when the mobile terminal accepts the selection of a photograph (image) from among the plurality of photographs from the user, the selected image is displayed in an enlarged manner. Additionally, upon an image being selected by the user, the above-described image analysis processing and determination processing are executed for the selected image, and it is determined whether or not a face similar to the face of a human registered in advance in a DB of the mobile terminal has been detected in the image.
If a similar face has been detected, the mobile terminal displays an object (a pop-up or a notification) for selecting whether or not to transmit that image data to the transmission destination corresponding to the face stored in the DB. If a selection is made to transmit the image data, an object for allowing the user to select a transmission method is displayed. The transmission method can be selected from among email, P2P communication using Bluetooth (registered trademark), Wi-Fi, or the like, uploading to an SNS, and so on.
If “email” is selected, the selected image data is transmitted by email to the email address stored in association with the detected face of the person.
If “P2P communication” is selected, the mobile terminal searches for a nearby terminal. Specifically, it is determined whether an advertising packet from Bluetooth LE (Low Energy) or the like has been received. Terminal information (a name or the like of the terminal) is displayed in the mobile terminal based on the received advertising packet, and when the user selects that terminal information, the mobile terminal establishes a Bluetooth LE connection with the terminal corresponding to the selected terminal information. The image data may be transmitted through that Bluetooth LE communication, or may be handed over to a Wi-Fi Direct connection through that Bluetooth LE communication, and transmitted through Wi-Fi Direct communication.
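As a rough, platform-agnostic sketch of this variation, the following dispatcher routes the selected image according to the transmission method chosen by the user. All handler callables are hypothetical placeholders rather than real platform APIs; the P2P handler is assumed to perform the BLE discovery and optional Wi-Fi Direct handover described above.

```python
# Sketch of routing the selected image by the chosen transmission method.
def dispatch_transmission(method: str, image_path: str, contact: dict,
                          handlers: dict) -> None:
    # handlers maps "email" / "discover" / "p2p" / "sns" to callables
    # supplied by the platform layer (hypothetical placeholders here).
    if method == "email":
        handlers["email"](image_path, contact.get("email"))  # send to the stored address
    elif method == "p2p":
        peer = handlers["discover"]()          # e.g. listen for BLE advertising packets
        handlers["p2p"](image_path, peer)      # BLE transfer or Wi-Fi Direct handover
    elif method == "sns":
        handlers["sns"](image_path)            # upload to the selected SNS
    else:
        raise ValueError(f"unsupported transmission method: {method}")
```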
Furthermore, the present invention can be applied in any information processing apparatus, as opposed to only image forming apparatuses, as long as the apparatus has a function for reading an image, using image recognition AI to determine the face of a person, automatically setting contact information, and transmitting data.
According to the present invention, when transmitting image data to the exterior, a transmission destination can be set in accordance with an image included in the image data.
Embodiment(s) of the present invention can also be realized by a computer of a system or apparatus that reads out and executes computer executable instructions (e.g., one or more programs) recorded on a storage medium (which may also be referred to more fully as a ‘non-transitory computer-readable storage medium’) to perform the functions of one or more of the above-described embodiment(s) and/or that includes one or more circuits (e.g., application specific integrated circuit (ASIC)) for performing the functions of one or more of the above-described embodiment(s), and by a method performed by the computer of the system or apparatus by, for example, reading out and executing the computer executable instructions from the storage medium to perform the functions of one or more of the above-described embodiment(s) and/or controlling the one or more circuits to perform the functions of one or more of the above-described embodiment(s). The computer may comprise one or more processors (e.g., central processing unit (CPU), micro processing unit (MPU)) and may include a network of separate computers or separate processors to read out and execute the computer executable instructions. The computer executable instructions may be provided to the computer, for example, from a network or the storage medium. The storage medium may include, for example, one or more of a hard disk, a random-access memory (RAM), a read only memory (ROM), a storage of distributed computing systems, an optical disk (such as a compact disc (CD), digital versatile disc (DVD), or Blu-ray Disc (BD)™), a flash memory device, a memory card, and the like.
While the present invention has been described with reference to exemplary embodiments, it is to be understood that the invention is not limited to the disclosed exemplary embodiments. The scope of the following claims is to be accorded the broadest interpretation so as to encompass all such modifications and equivalent structures and functions.
This application claims the benefit of Japanese Patent Application No. 2019-163845 filed on Sep. 9, 2019, which is hereby incorporated by reference herein in its entirety.