1. Field
Aspects of the present invention generally relate to an information processing apparatus, an information processing method, and a storage medium.
2. Description of the Related Art
In recent years, portable terminals with advanced information processing functions, such as smartphones and tablet personal computers (PCs), have become widespread. These portable terminals each include a camera and thus have an imaging function (a camera function). With this type of portable terminal, a document on a paper medium is imaged using the camera function and stored as image data in a memory of the portable terminal.
For example, Japanese Patent Application Laid-Open No. 2014-36323 discusses a technique for setting an operation mode according to the imaging direction at the time of imaging, to be used when displaying a photographic image. In this technique, image display is performed according to an association between direction information indicating the imaging direction and the photographic image obtained by imaging a subject. Further, this technique provides a function of setting an operation mode concerning a display changing operation for a displayed image, so that the display is changed in accordance with the changing operation according to the set operation mode.
As described above, Japanese Patent Application Laid-Open No. 2014-36323 discusses a technique for changing the window presented to an operator according to the orientation of the camera when the operator operates the camera. This technique improves the operability of an operation window when the operator moves and operates the camera: when an imaging orientation is established, operation display suitable for that orientation is provided, and the operation mode is determined according to the orientation. However, in Japanese Patent Application Laid-Open No. 2014-36323, when a mode (for display and processing) is changed only using the orientation of the camera (three directions, namely, an upward direction, a downward direction, and a sideward direction), processing not suitable for the characteristics of an image may be performed.
Aspects of the present invention are generally directed to a technique for processing a photographic image more appropriately, according to a posture of a portable terminal when an imaging object is imaged.
According to an aspect of the present invention, an information processing apparatus includes an acquisition unit configured to acquire posture information of an imaging unit, an extraction unit configured to extract characteristic amount information of an image imaged by the imaging unit, an identification unit configured to identify a subject included in the image imaged by the imaging unit using the posture information acquired by the acquisition unit and the characteristic amount information extracted by the extraction unit, a recognition unit configured to recognize a subject area in the image in a recognition mode according to the subject identified by the identification unit, and a correction unit configured to correct the image including the subject area by clipping, from the image, the subject area recognized by the recognition unit.
Further features of aspects of the present invention will become apparent from the following description of exemplary embodiments with reference to the attached drawings.
Exemplary embodiments of the present invention will be described below using the drawings.
The storage unit 304 is, for example, a flash memory, and stores photographic images and various programs to be executed by the CPU 301. A data transmitting/receiving unit 305 has a wireless LAN controller, and implements transmission and reception of data to and from the printer 103 and the server 121 via the wireless router 102. An imaging unit 306 corresponds to the camera 203 described above; it images a document serving as a subject and transmits the photographic image obtained thereby to each unit. As will be described below in detail, the imaging objects here include not only a paper medium but also a board medium such as a blackboard or a whiteboard, and each of these is referred to as a "document".
A display unit 307 corresponds to the touch panel display 201, and displays a live view when a document is imaged using the camera function, or displays various kinds of information such as a document-area recognition result to be described below. An operation unit 308 corresponds to the touch panel and the operation button 202 of the touch panel display 201 described above. The operation unit 308 receives an operation from a user and transmits the operation information to each unit. A motion sensor (a movement detection unit) 310 includes a three-axis acceleration sensor, an electromagnetic compass, and a three-axis angular velocity sensor, and can therefore detect the posture and the movement of the portable terminal 101.
A terminal posture detection unit 602 analyzes the sensor information acquired by the motion sensor 310, thereby detecting the posture of the portable terminal 101. This will be described below in detail with reference to the attached drawings.
The document area recognition unit 603 performs edge detection in order to detect an edge (a boundary) between the document and a support where the document is placed, i.e., an edge between the subject area and a background area. The document area recognition unit 603 uses, for example, a scheme such as Canny edge detection for the edge detection. As to a threshold for the edge detection, a fixed threshold may be used, or the threshold may be automatically calculated by the document area recognition unit 603 based on luminance of the image.
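As a non-limiting illustration, the following Python sketch shows one possible realization of this edge detection using OpenCV; the median-based automatic threshold rule is an assumption, since the description above only states that the threshold may be fixed or calculated from the luminance of the image.

```python
import cv2
import numpy as np

def detect_document_edges(image_bgr, sigma=0.33):
    """Detect the edge (boundary) between the document area and the background.

    The median-based automatic threshold is an assumption for illustration;
    the embodiment leaves the calculation method open.
    """
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    median = float(np.median(gray))
    lower = int(max(0, (1.0 - sigma) * median))
    upper = int(min(255, (1.0 + sigma) * median))
    return cv2.Canny(gray, lower, upper)
```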
An image correction unit 604 performs processing such as density nonuniformity correction, sharpness correction, and color correction on the photographic image. An image storage unit 605 transfers the photographic image to the RAM 302 while performing compression and formatting. The image storage unit 605 may transfer the photographic image to the RAM 302 after storing the photographic image in an external storage device. A user interface (UI) display processing unit 606 displays, on the touch panel display 201, information related to the imaging, the document-area recognition result, the imaging-object (subject) recognition result in the photographic image to be described below, and the like. An imaging object recognition unit 607 recognizes an imaging object in the photographic image. In the present exemplary embodiment, the imaging object recognition unit 607 is described using an example of recognizing each of a whiteboard, a blackboard, and paper as the imaging object, but the imaging object is not limited to this example.
In step S401, the imaging processing unit 601 images a document when an instruction for imaging is input from a user via the operation unit 308. The document described here is the above-described imaging object, such as the whiteboard, the blackboard, or the paper. In step S402, the terminal posture detection unit 602 acquires posture information of the portable terminal 101 based on sensor information of the motion sensor 310. The sensor information of the motion sensor 310 includes information about an angle and an orientation of the portable terminal 101. The processing in step S402 will be described below in detail; it is an example of acquisition processing of acquiring the posture information of the portable terminal 101. In step S403, the imaging object recognition unit 607 performs recognition processing of recognizing the imaging object in the photographic image, i.e., imaging object recognition processing. The imaging object recognition processing is, for example, processing of recognizing whether the imaging object is a whiteboard, a blackboard, paper, an object onto which a presentation is projected, or the like. The processing in step S403 will be described below in detail; it is an example of identification processing of identifying an imaging object in a photographic image.
In step S404, the document area recognition unit 603 performs document area recognition processing. The photographic image includes a background area besides the imaging object. Therefore, the document area recognition unit 603 recognizes the document area, by recognizing an edge (a boundary) between the document area of the imaging object and the background area. Further, the document area recognition unit 603 recognizes the document area by changing a processing mode (a recognition mode) in the document area recognition processing, according to the imaging object recognized in step S403. The processing in step S404 will be described below in detail. The processing in step S404 is an example of recognition processing of recognizing a document area in a photographic image. In step S405, the document area recognition unit 603 clips an image according to the document area recognized in step S404.
In step S406, the image correction unit 604 performs keystone correction of the clipped image. The keystone correction may be implemented by using conventional projective transformation. The scaling parameter is a projective transformation matrix that takes the occurrence of trapezoidal distortion into account. The image correction unit 604 can calculate the projective transformation matrix from the apex information (apexes 1103, 1104, 1105, and 1106) of the four points of the document area in the photographic image.
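A minimal sketch of such keystone correction with OpenCV follows; the corner ordering (top-left, top-right, bottom-right, bottom-left) is an assumed convention, and the four corners play the role of the apexes 1103 to 1106 mentioned above.

```python
import cv2
import numpy as np

def keystone_correct(image, apexes, out_w, out_h):
    """Warp the quadrilateral document area onto an upright rectangle.

    `apexes` holds the four detected corners (cf. apexes 1103 to 1106),
    assumed to be ordered top-left, top-right, bottom-right, bottom-left.
    """
    src = np.asarray(apexes, dtype=np.float32)
    dst = np.array([[0, 0], [out_w - 1, 0],
                    [out_w - 1, out_h - 1], [0, out_h - 1]], dtype=np.float32)
    matrix = cv2.getPerspectiveTransform(src, dst)  # 3x3 projective matrix
    return cv2.warpPerspective(image, matrix, (out_w, out_h))
```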
The user can specify an output image size via a window displayed on the operation unit 308. Output image sizes that can be specified include, for example, A4, A5, B4, and letter size, and can be set beforehand in the ROM 303. For example, in a case where the document corresponds to the A4 paper size, the output image size is A4, and the scaling parameter scales the image of the document area to A4 serving as the output image size. Similarly, in a case where the document is half the A4 paper size, the output image size is A5, and the scaling parameter scales the image of the document area to A5 serving as the output image size. For the output image size, the paper type series may be switched among A-series, B-series, and inch-series paper based on the aspect ratio of the document area.
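The series selection could, for instance, compare the aspect ratio of the document area with the known paper ratios, as in the following sketch; the preset pixel sizes and the assumed 300 dpi resolution are illustrative placeholders for the setting information held in the ROM 303.

```python
import math

# Hypothetical preset sizes in pixels at an assumed 300 dpi; the
# embodiment stores the selectable sizes beforehand in the ROM 303.
PRESETS = {
    "ISO":  {"A4": (2480, 3508), "A5": (1748, 2480), "B4": (3032, 4299)},
    "inch": {"letter": (2550, 3300)},
}

def pick_series(doc_w, doc_h):
    """Pick the paper series whose aspect ratio best matches the document area.

    ISO A/B paper has a 1:sqrt(2) aspect ratio; US letter is 8.5:11.
    The switching rule is a sketch of the behavior described in the text.
    """
    ratio = max(doc_w, doc_h) / min(doc_w, doc_h)
    return "ISO" if abs(ratio - math.sqrt(2)) <= abs(ratio - 11 / 8.5) else "inch"
```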
In step S407, the image correction unit 604 corrects density nonuniformity of the photographic image, thereby correcting color irregularity due to the light source and shading at the time of imaging the document. First, the image correction unit 604 performs filter processing to remove noise introduced at the time of imaging, and then performs tone correction that allows reproduction of the white of the paper by removing color appearing on the background. In step S408, the image correction unit 604 sharpens and smooths the image (performs sharpness correction and smoothing). This can be implemented by conventional filter processing or the like. In step S409, the image correction unit 604 converts the photographic image (color correction) from a color space specific to the camera to a common RGB color space. The conversion used here is assumed to be a conversion from the camera color space of the portable terminal 101 to a colorimetric common RGB color space such as sRGB, performed by a 3×3 matrix operation defined beforehand. The processing in each of steps S406 to S409 corresponds to an example of image correction processing.
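As an illustration of the color conversion in step S409, the sketch below applies a predefined 3×3 matrix to each pixel; the matrix values are hypothetical placeholders, since the actual matrix is specific to the camera of the portable terminal 101 and is defined beforehand.

```python
import numpy as np

# Hypothetical camera-to-sRGB matrix (rows sum to 1 to preserve white);
# the real matrix is device specific and defined beforehand.
CAMERA_TO_SRGB = np.array([
    [ 1.20, -0.15, -0.05],
    [-0.10,  1.15, -0.05],
    [ 0.00, -0.20,  1.20],
])

def to_srgb(image_rgb):
    """Apply the 3x3 matrix operation to an HxWx3 float image in [0, 1]."""
    out = image_rgb @ CAMERA_TO_SRGB.T
    return np.clip(out, 0.0, 1.0)
```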
In step S410, the image storage unit 605 performs compression and formatting of the corrected photographic image. More specifically, the image storage unit 605 compresses the corrected photographic image to, for example, the Joint Photographic Experts Group (JPEG) format, and then converts the resulting JPEG data into a file format (e.g., the Portable Document Format (PDF) or the XML Paper Specification (XPS) format). In step S411, the UI display processing unit 606 displays a UI for receiving an instruction to store the file generated in step S410, for example, a UI prompting the user to determine whether to store the file. In step S412, the image storage unit 605 stores the file in a set area. This area may be an area designated beforehand by an application, or an area for storing images held by the operating system (OS) of the portable terminal 101.
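A minimal sketch of the compression and formatting in step S410, using the Pillow library as one possible choice (the library, the quality value, and the file names are assumptions, not part of the embodiment):

```python
from PIL import Image  # Pillow is an assumed choice; any JPEG/PDF library works

def compress_and_format(corrected_image_path):
    """Compress the corrected image to JPEG, then wrap it in a PDF file."""
    img = Image.open(corrected_image_path).convert("RGB")
    img.save("document.jpg", "JPEG", quality=85)     # hypothetical quality
    img.save("document.pdf", "PDF", resolution=300)  # single-page PDF
```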
This ends the description of the outline of the series of processing steps performed by the portable terminal 101 in the present exemplary embodiment. Details of the processing in each of steps S402 to S404 will be described below.
Details of step S402 will now be described.
Here, the upward direction of the portable terminal 101 will be described. Top, bottom, right, and left of the portable terminal 101 correspond to top, bottom, right, and left defined in the motion sensor 310, respectively. The motion sensor 310 has three axes, namely, the vertical direction, the lateral direction, and the orthogonal-to-screen direction of the portable terminal 101, and can detect acceleration for each axis (direction). In the first exemplary embodiment, a vector directed from a lower side 506 to an upper side 505 of the window is treated as the upward direction of the portable terminal 101.
The rotation angle 504 of the window 501 will now be described.
Now, a definition of the tilt 1603 of the portable terminal 101 will be described.
The rotation about the X-axis is referred to as a "tilt angle", and takes a value in a range from −180 degrees to 179 degrees. With reference to the touch panel display 201 of the portable terminal 101, an upward horizontal state is 0 degrees, an upright state is 90 degrees, a downward horizontal state is −180 degrees or 179 degrees, and an inverted state is −90 degrees. The rotation about the Y-axis is referred to as a "rotation angle", and takes a value in a range from −90 degrees to 90 degrees. With reference to the touch panel display 201 of the portable terminal 101, a horizontal state is 0 degrees, a right-shoulder upward state is 90 degrees, and a left-shoulder upward state is −90 degrees. The rotation about the Z-axis is referred to as an "azimuth angle", and takes a value in a range from 0 degrees to 359 degrees. With reference to a short-side top surface of the portable terminal 101, a top-surface northward state is 0 degrees, a top-surface eastward state is 90 degrees, a top-surface southward state is 180 degrees, and a top-surface westward state is 270 degrees.
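For illustration, the tilt angle and rotation angle could be derived from the gravity components reported by the three-axis acceleration sensor roughly as follows; the formulas are assumptions chosen to match the conventions above (screen-up horizontal state at 0 degrees tilt, upright state at 90 degrees), not the terminal's actual implementation.

```python
import math

def tilt_and_rotation(ax, ay, az):
    """Estimate tilt (about the X-axis) and rotation (about the Y-axis)
    from the measured gravity components (m/s^2) of the motion sensor 310.
    """
    tilt = math.degrees(math.atan2(ay, az))                       # approx. -180..180
    rotation = math.degrees(math.atan2(-ax, math.hypot(ay, az)))  # -90..90
    return tilt, rotation

# Example: a terminal lying flat with the screen facing up reads roughly
# (ax, ay, az) = (0, 0, 9.8), giving tilt = 0 and rotation = 0.
print(tilt_and_rotation(0.0, 0.0, 9.8))
```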
In step S403, the imaging object recognition unit 607 performs the recognition processing of the imaging object in the photographic image. Here, a case in which printed paper, a whiteboard, and a blackboard each serve as the imaging object is described as an example.
As an example of processing of recognizing an imaging object as being a whiteboard, the imaging object recognition unit 607 acquires color distribution information. Specifically, the imaging object recognition unit 607 acquires the color distribution information of the image in a range set beforehand from a center 704 of the imaging area 701. The color distribution information of the image is an example of characteristic amount information of an image, and the processing of acquiring the color distribution information from the photographic image is an example of extraction processing of extracting characteristic amount information from a photographic image. This processing can be performed by conventional processing of acquiring a histogram. On a whiteboard, the background plane is white, and the written text 703 is limited in terms of color as well; for example, the color of the text 703 can be limited to black, red, blue, and green. When white, black, red, blue, and green are stronger than other colors, the imaging object recognition unit 607 can recognize the imaging object as being the whiteboard based on the histogram. As another applicable method, a whiteboard image may be registered beforehand, and the imaging object recognition unit 607 recognizes the imaging object as being the whiteboard by image recognition.
As an example of processing of recognizing an imaging object as being a blackboard, the imaging object recognition unit 607 likewise acquires color distribution information. Specifically, it acquires the color distribution information of the image in a range set beforehand from a center 804 of the imaging area 801. This processing can be performed by conventional processing of acquiring a histogram. On a blackboard, the background plane is black or dark green, and the written text 803 is limited in terms of color as well; for example, the color of the text 803 can be limited to white, red, blue, and yellow. When black, white, red, blue, and yellow are stronger than other colors, the imaging object recognition unit 607 recognizes the imaging object as being the blackboard based on the histogram. As another applicable method, a blackboard image may be registered beforehand, and the imaging object recognition unit 607 recognizes the imaging object as being the blackboard by image recognition.
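A sketch of such histogram-based classification is shown below; the central-region size and the brightness thresholds are assumptions, since the description above does not specify the exact histogram criterion, and the fallback to paper anticipates the assumption stated in the next paragraph.

```python
import cv2
import numpy as np

def classify_board(image_bgr, region_frac=0.5):
    """Classify the central region as whiteboard, blackboard, or paper.

    A simplified criterion: a mostly bright center suggests a whiteboard,
    a mostly dark center a blackboard; otherwise paper is assumed. The
    thresholds are illustrative, not taken from the embodiment.
    """
    h, w = image_bgr.shape[:2]
    dh, dw = int(h * region_frac / 2), int(w * region_frac / 2)
    center = image_bgr[h // 2 - dh:h // 2 + dh, w // 2 - dw:w // 2 + dw]
    gray = cv2.cvtColor(center, cv2.COLOR_BGR2GRAY)
    hist = cv2.calcHist([gray], [0], None, [256], [0, 256]).ravel()
    bright = hist[200:].sum() / hist.sum()
    dark = hist[:56].sum() / hist.sum()
    if bright > 0.6:
        return "whiteboard"
    if dark > 0.6:
        return "blackboard"
    return "paper"
```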
In the present exemplary embodiment, an imaging object recognized as neither the whiteboard nor the blackboard is assumed to be the paper. Further, the imaging object recognition unit 607 can narrow down the imaging object candidates by using the posture information of the portable terminal 101 acquired in step S402. This will be described below in more detail.
For example, assume that, in the posture information of the portable terminal 101 acquired in step S402, the value of the X-axis is 0, the value of the Y-axis is 0, and the value of the Z-axis is 10. In this posture, the acceleration of gravity appears almost entirely on the Z-axis (the orthogonal-to-screen direction), i.e., the portable terminal 101 is held roughly horizontally with the camera facing downward.
If the angle with respect to the plane of the portable terminal 101 is 45 degrees or less (YES in step S1002), then in step S1003, the imaging object recognition unit 607 recognizes the imaging object as being the paper. On the other hand, when the angle with respect to the plane of the portable terminal 101 is not 45 degrees or less (NO in step S1002), then in step S1004, the imaging object recognition unit 607 recognizes the imaging object as being the whiteboard, the blackboard, or paper put on a wall. The value of the angle serving as the threshold is freely settable and is not limited to 45 degrees. In addition, although the case where the imaging object is the whiteboard, the blackboard, or paper put on a wall is described here as an example, the imaging object may be another type of object; for example, it may be a presentation window projected onto a wall by a liquid crystal projector.
In this way, the imaging object can be recognized in a simpler and faster manner by narrowing the candidates to the imaging objects conceivable for the case where the angle is greater than the set value. Further, when two or more imaging object candidates remain after the narrowing based on the angle of the portable terminal 101, the imaging object recognition unit 607 can identify one of them based on the color distribution information acquired from the photographic image in the manner described above.
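Combining the two stages, a sketch of the flow of steps S1002 to S1004 could look as follows, reusing the classify_board sketch above; the gravity-based angle formula is an assumption, while the 45-degree threshold comes from the description above.

```python
import math

def recognize_imaging_object(ax, ay, az, image_bgr):
    """Narrow the candidates by the terminal angle, then refine by color."""
    # Angle between the terminal's plane and the horizontal, from gravity.
    angle = abs(math.degrees(math.atan2(math.hypot(ax, ay), abs(az))))
    if angle <= 45:
        return "paper"  # step S1003: camera pointed down at a desk
    # Step S1004: whiteboard, blackboard, or paper put on a wall;
    # refine with the color-distribution sketch shown earlier.
    return classify_board(image_bgr)
```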
In step S404, the document area recognition unit 603 performs processing of recognizing the outer frame (here, described as four sides) of the document area. The document area recognition unit 603 finds edges and then deletes or connects them, thereby detecting the sides. In this process, the document area recognition unit 603 changes the processing mode of the edge detection according to the type of the imaging object recognized in step S403. The four sides can be detected more accurately by changing the edge intensity or luminance difference used for the edge detection according to the imaging object. For example, when the imaging object is recognized as the whiteboard, the document area recognition unit 603 performs the four-side detection for the whiteboard; when it is recognized as the blackboard, the four-side detection for the blackboard; and when it is recognized as the paper, the four-side detection for the paper. Setting information for the four-side detection according to each imaging object is assumed to have been stored beforehand in the storage unit 304 or the like.
In step S904, the document area recognition unit 603 determines whether the imaging object recognized in step S403 is the blackboard. When the document area recognition unit 603 determines that the imaging object is the blackboard (YES in step S904), the processing proceeds to step S905. When the document area recognition unit 603 determines that the imaging object is not the blackboard (NO in step S904), the processing proceeds to step S906. In step S905, the document area recognition unit 603 performs the above-described detection processing for the blackboard. In step S906, the document area recognition unit 603 performs the above-described detection processing for the paper.
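The per-object switching of the detection parameters in steps S904 to S906 could be organized as in the following sketch; the threshold values are hypothetical placeholders for the setting information stored beforehand in the storage unit 304.

```python
import cv2

# Hypothetical per-object edge-detection settings; real values would come
# from the setting information stored beforehand in the storage unit 304.
FOUR_SIDE_SETTINGS = {
    "whiteboard": {"canny_lo": 30, "canny_hi": 90},   # low-contrast frame
    "blackboard": {"canny_lo": 20, "canny_hi": 60},   # dark background
    "paper":      {"canny_lo": 50, "canny_hi": 150},  # sharp paper boundary
}

def detect_four_sides(gray, imaging_object):
    """Run edge detection with thresholds tuned to the recognized object."""
    s = FOUR_SIDE_SETTINGS[imaging_object]
    return cv2.Canny(gray, s["canny_lo"], s["canny_hi"])
```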
When the document is imaged by the imaging processing unit 601, the document area recognition unit 603 extracts the apexes of the document area in the photographic image.
As described above, according to the present exemplary embodiment, the photographic image can be processed more appropriately according to the posture of the portable terminal 101 when the imaging object is imaged. More specifically, document area recognition and image processing suitable for the photographic image can be performed by identifying the imaging object in the photographic image based on the posture information of the portable terminal 101. In addition, according to the present exemplary embodiment, each kind of processing described above can be implemented without complicated instructions from the user.
In the above-described first exemplary embodiment, an example is described in which the imaging object recognition can be implemented in an easier and faster manner by recognizing the imaging object in the photographic image based on the posture information of the portable terminal 101. In a second exemplary embodiment, an example will be described in which the imaging object is recognized faster than in the first exemplary embodiment by considering the priority of the imaging object recognition processing. In the second exemplary embodiment, points different from the first exemplary embodiment will be mainly described.
When the angle with respect to the lateral direction of the portable terminal 101 is in the range from 70 degrees to 90 degrees, the posture of the portable terminal 101 is a landscape state.
As for the orientation of the portable terminal 101, even when the angle with respect to the orthogonal-to-screen direction is the same, the landscape state and the portrait state can be distinguished by the angle with respect to the lateral direction, and the priority of the imaging object recognition processing can be changed accordingly.
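A sketch of how such a priority could be derived is shown below; the 70-to-90-degree band for the landscape state follows the description above, while the priority orders themselves are assumptions for illustration.

```python
import math

def recognition_priority(ax, ay, az):
    """Order the recognition attempts according to the terminal posture."""
    # Angle with respect to the lateral direction, from gravity components.
    lateral = abs(math.degrees(math.atan2(ax, math.hypot(ay, az))))
    if 70 <= lateral <= 90:                       # landscape state
        return ["whiteboard", "blackboard", "paper"]
    return ["paper", "whiteboard", "blackboard"]  # portrait state assumed
```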
As described above, according to the present exemplary embodiment, in addition to the angle with respect to the orthogonal-to-screen direction of the portable terminal 101, the angle with respect to the lateral direction is considered, so that the priority of the imaging object recognition processing can be changed. Therefore, the recognition processing can be performed faster.
In the above-described second exemplary embodiment, imaging object recognition processing is described in which the imaging object is recognized by reflecting a priority in the recognition, using two angles, i.e., the angle with respect to the orthogonal-to-screen direction of the portable terminal 101 and the angle with respect to the lateral direction. In a third exemplary embodiment, an example will be described in which a setting window is provided to present the imaging-object recognition result to the user and to receive a user instruction as to whether to change the recognition result, so that processing is performed according to the instruction. In the third exemplary embodiment, points different from the first exemplary embodiment will be mainly described.
By viewing this displayed window, the user can not only confirm the result of detecting the document area, but also recognize that the processing being performed is for the whiteboard serving as the imaging object and that the subsequent image correction processing will use a parameter for the whiteboard. When the UI display processing unit 606 detects a press of a button 1402, the processing in and after step S405 is performed.
In the present exemplary embodiment, the UI display processing unit 606 displays the imaging-object recognition result in the window displaying the document-area recognition result obtained in step S404. However, the display timing and display form are not limited to this example. For example, the UI display processing unit 606 may display the recognition result at the time when the recognition of the imaging object is performed, or may display a window that can receive an instruction for changing the imaging object.
As described above, according to the above-described third exemplary embodiment, a confirmation or an instruction from the user can easily be received for the processing performed based on the posture information of the portable terminal 101 and the imaging-object recognition result. Therefore, processing suitable for the purpose of the user can be provided without any complicated UI operation.
In the first exemplary embodiment, as in the flowchart described above, the imaging object is recognized automatically based on the posture information of the portable terminal 101 and the characteristic amount information of the photographic image. In a fourth exemplary embodiment, the candidates are first narrowed by using the posture information, and a window for allowing the user to select between the remaining candidates (here, the whiteboard and the blackboard) is then displayed in step S1503.
In step S1504, the imaging object recognition unit 607 determines whether the whiteboard has been selected as the imaging object by a user operation via the window displayed in step S1503. When the imaging object recognition unit 607 determines that the whiteboard has been selected (YES in step S1504), the processing in and after step S1505 is performed with the whiteboard recognized as the imaging object. On the other hand, when the imaging object recognition unit 607 determines that the whiteboard has not been selected (NO in step S1504), the processing in and after step S1506 is performed with the blackboard recognized as the imaging object.
In step S1505, the imaging object recognition unit 607 recognizes the imaging object as the whiteboard.
On the other hand, in step S1506, the imaging object recognition unit 607 recognizes the imaging object as the blackboard.
As described above, according to the present exemplary embodiment, the processing suitable for the imaging object captured by the user can be provided without requiring a complicated operation from the user, by performing display for allowing the user to confirm the imaging object by using the posture information of the portable terminal 101.
An exemplary embodiment of the present invention is also achievable by processing in which a program that implements one or more functions of the above-described exemplary embodiments is supplied to a system or apparatus via a network or storage medium, and one or more processors in a computer of the system or apparatus read and execute the program. Moreover, an exemplary embodiment of the present invention is also achievable by a circuit (e.g., an application-specific integrated circuit (ASIC)) that implements one or more functions.
In addition, in the first to fourth exemplary embodiments described above, the portable terminal 101 is described as processing the photographic image by itself, but the processing form is not limited to this example. For example, the portable terminal 101 may transmit a photographic image to the server 121, and the server 121 may perform image processing on the photographic image. In this case, the server 121 is an example of the information processing apparatus. The server 121 has at least a CPU and a memory as its hardware configuration, and the CPU of the server 121 executes processing based on a program stored in the memory, thereby implementing the processing in each of the exemplary embodiments described above.
As described above, each of the exemplary embodiments described above can provide a technique capable of processing a photographic image more appropriately according to the posture of a portable terminal when an imaging object is imaged.
Exemplary embodiments have been described in detail. However, these specific exemplary embodiments are not seen to be limiting, and may be variously altered and/or modified within the scope and the essence of aspects of the present invention as claimed.
Embodiments of the present invention can also be realized by a computer of a system or apparatus that reads out and executes computer executable instructions recorded on a storage medium (e.g., non-transitory computer-readable storage medium) to perform the functions of one or more of the above-described embodiment(s) of the present invention, and by a method performed by the computer of the system or apparatus by, for example, reading out and executing the computer executable instructions from the storage medium to perform the functions of one or more of the above-described embodiment(s). The computer may comprise one or more of a central processing unit (CPU), micro processing unit (MPU), or other circuitry, and may include a network of separate computers or separate computer processors. The computer executable instructions may be provided to the computer, for example, from a network or the storage medium. The storage medium may include, for example, one or more of a hard disk, a random-access memory (RAM), a read only memory (ROM), a storage of distributed computing systems, an optical disk (such as a compact disc (CD), digital versatile disc (DVD), or Blu-ray Disc (BD™)), a flash memory device, a memory card, and the like.
While aspects of the present invention have been described with reference to exemplary embodiments, it is to be understood that the aspects of the invention are not limited to the disclosed exemplary embodiments. The scope of the following claims is to be accorded the broadest interpretation so as to encompass all such modifications and equivalent structures and functions.
This application claims the benefit of Japanese Patent Application No. 2015-017098, filed Jan. 30, 2015, which is hereby incorporated by reference herein in its entirety.