This application claims the benefit under 35 U.S.C. §119(a) of a Korean Patent Application No. 10-2012-0020558, filed on Feb. 28, 2012, the entire disclosure of which is incorporated herein by reference for all purposes.
1. Field
The following description relates to image processing technology, and more particularly, to an apparatus and a method for recognizing an object included in an image using a feature descriptor extracted from a specific region of the image.
2. Description of the Related Art
Methods of describing the features of an image include a global descriptor, which represents all characteristics of an image using one vector, and a local descriptor, which compares different regions of an image to extract a plurality of regions having distinct characteristics from the image and represents all characteristics of the image using a plurality of vectors for the respective regions.
The local descriptor is based on a local description, and thus is capable of generating the same description for the same region in spite of geometric changes in an image. Therefore, when the local description is used, an object included in an image can be recognized and extracted without preprocessing such as image segmentation, and in particular, the local descriptor remains robust in representing the feature of the image even when a portion of the image is covered.
Due to such advantages, the local descriptor is being widely used in pattern recognition, computer vision, and computer graphic fields, including, for example, object recognition, image retrieval, panorama generation, etc.
An operation of calculating the local descriptor is largely divided into two stages. The first stage extracts, as a feature point, a point having characteristics that differ from those of peripheral pixels. The second stage calculates a descriptor using the extracted feature point and peripheral pixel values.
Technology for generating a feature descriptor on the basis of the above-described local region information and matching the feature descriptor with a local feature descriptor of a different image is applied to various computer vision fields such as content-based image/video retrieval, object recognition and detection, video tracking, and augmented reality.
Recently, due to the introduction of mobile devices, the amount of distributed multimedia content is explosively increasing, and it is becoming easier to obtain content. Therefore, the demand for computer vision-related technology associated with object recognition for effectively retrieving the content is increasing. Especially, due to the characteristics of smart phones in which it is inconvenient to input letters, the necessity of content-based image retrieval technology that performs retrieval by inputting an image is increasing, and a retrieval application using the existing feature-based image processing technology is being actively created.
Representative examples of the local feature-based image processing technology using the feature point include SIFT and SURF. Such technology extracts, as a feature point from a scale space, a point at which the change in a pixel statistical value is large, such as a corner, and extracts a feature descriptor using a relationship between the extracted point and a peripheral region.
However, since the size of a local feature descriptor is very large, the total descriptor size for an entire image very frequently exceeds the compressed size of the image. As a result, a large-capacity descriptor is extracted even when only a simple feature descriptor is required, and thus a large-capacity memory is needed to store the descriptor.
The following description relates to an apparatus and a method for extracting and matching a scalable feature descriptor having scalability according to a purpose and an environment to which technology of extracting a feature descriptor is applied.
In one general aspect, an image recognition apparatus includes: a feature descriptor generator configured to extract scalable compact local feature descriptor information for recognizing an object from input image information; a database configured to include information on a plurality of feature descriptors; and a descriptor matcher configured to compare a feature descriptor output from the feature descriptor generator with a plurality of feature descriptors stored in the database to recognize an object included in an image.
In another general aspect, an image recognition method using a scalable local feature descriptor in an image recognition apparatus includes: extracting a scalable compact local feature descriptor from an input image; and retrieving a feature descriptor similar to the extracted feature descriptor to match the feature descriptors.
Other features and aspects will be apparent from the following detailed description, the drawings, and the claims.
Throughout the drawings and the detailed description, unless otherwise described, the same drawing reference numerals will be understood to refer to the same elements, features, and structures. The relative size and depiction of these elements may be exaggerated for clarity, illustration, and convenience.
The following description is provided to assist the reader in gaining a comprehensive understanding of the methods, apparatuses, and/or systems described herein. Accordingly, various changes, modifications, and equivalents of the methods, apparatuses, and/or systems described herein will be suggested to those of ordinary skill in the art. Also, descriptions of well-known functions and constructions may be omitted for increased clarity and conciseness.
Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings.
The present invention relates to image recognition technology for detecting which object is included in an image, and particularly, provides an object recognition apparatus and method using a scalable compact local feature descriptor. Also, in the present invention, the image recognition apparatus should be construed as being applicable to all devices that recognize an object included in an image and output information on what the recognized object is, such as mobile communication terminals including personal digital assistants (PDAs), smart phones, navigation terminals, etc., as well as personal computers (PCs) including desktop computers, notebook computers, etc.
Referring to
The image obtainer 110 is a means of obtaining an image and outputting the image to the feature descriptor generator 120, and for example, may be a camera or an image sensor. Also, in an additional aspect of the present invention, the image obtainer 110 may be a camera that enlarges or reduces an image, and is capable of rotating automatically or manually. Moreover, the image obtainer 110 may obtain and output an image that has been previously captured through a communication interface, or obtain and output an image that is stored in a memory.
The feature descriptor generator 120 extracts feature information for recognizing an object from an image that is input through the image obtainer 110. The feature descriptor generator 120 will be described below with reference to
The feature descriptor matcher 130 compares a feature descriptor that is output from the feature descriptor generator 120 with feature descriptors that are previously stored in the database 140, and matches the compared feature descriptors. The feature descriptor matcher 130 determines what an object included in an image is through the matching.
The database 140 stores feature descriptor information of pre-designated objects for determining what an object recognized from image information is. For example, when a feature descriptor of an object called “Mega Box” is previously stored and is retrieved as a feature descriptor similar to a feature descriptor of an object included in an image, the feature descriptor matcher 130 may determine the object included in the image to be “Mega Box” if the feature descriptors can be matched.
The feature descriptor matcher 130 retrieves a feature descriptor similar to feature descriptors output from the feature descriptor generator 120 from the database 140, compares the feature descriptors, and outputs matching result information that is obtained by matching the feature descriptors according to the compared result. The feature descriptor matcher 130 will be described below with reference to
Referring to
The feature point extraction unit 121 extracts, as a feature point from a scale space of an image that is input through the image obtainer 110, a point at which the change in a pixel statistical value is large, such as a corner. The feature point extraction unit 121 calculates the scale of the extracted feature point to extract a local region. In this case, the local region is extracted in consideration of orientation, and may have various shapes such as a tetragon, a circle, etc. According to an embodiment of the present invention, a fast-Hessian detector may be used to calculate the scale and the orientation angle.
The local region feature calculation unit 122 extracts information for a feature description of the local region that is extracted by the feature point extraction unit 121. The information is extracted by segmenting the local region into specific shapes such as a tetragon, a circle, etc. The statistical values calculated in each region include a one-dimensional statistical value such as an average and a variance, a two-dimensional statistical value, and a high-dimensional statistical value such as a saliency map and the number of corners that are extracted from each region.
The feature comparison unit 123 compares features calculated by the local region feature calculation unit 122 for each region, and generates a bit stream that is used in an actual feature descriptor. In this case, a method of binarizing a feature value by comparing the sizes of feature values between different blocks, and a method of quantizing a feature value by aligning a plurality of feature values may be used for the comparison.
Referring to
As another method, a method of storing ranking of the sizes of values of a block “F1” to a block “F16” may be used. That is, by comparing the sizes of feature values of the block “F1” to the block “F16,” the method includes designating values of 1 to 16 in the order of size, and storing a designated value for each of the blocks.
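The two comparison methods described above, binarization by pairwise size comparison and storage of a size ranking, may be sketched as follows. This is only an illustrative sketch and not the implementation of the embodiment: the function names, the choice of comparing every pair of blocks, and the convention that rank 1 denotes the largest value are assumptions made for the example.

```python
from itertools import combinations


def binarize_by_comparison(values):
    """Emit one bit per block pair (i, j) with i < j: 1 if values[i] > values[j], else 0.

    Comparing every pair is an assumption; an embodiment may compare only selected pairs.
    """
    return [1 if values[i] > values[j] else 0
            for i, j in combinations(range(len(values)), 2)]


def rank_blocks(values):
    """Assign each block a rank from 1 to N in order of size.

    Rank 1 is given to the largest value (assumed convention); ties keep block order.
    """
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


# Hypothetical feature values for four blocks (sixteen blocks work the same way).
features = [12.0, 3.5, 7.1, 9.9]
bits = binarize_by_comparison(features)   # [1, 1, 1, 0, 0, 0]
ranks = rank_blocks(features)             # [1, 4, 3, 2]
```

With sixteen blocks “F1” to “F16,” the all-pairs binarization in this sketch yields 120 bits, whereas the ranking method stores sixteen values of 1 to 16.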
The feature descriptor extraction unit 124 generates a descriptor using a local region feature result value that is obtained from the feature comparison unit 123. The generated descriptor includes information on a position, scale, and angle of the extracted region, and is configured by adding a region feature comparison value to this information. In this case, as needed, the feature descriptor extraction unit 124 may adjust the size of the descriptor by cutting a portion of the comparison bit stream of the descriptor.
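A descriptor of this kind, holding the position, scale, and angle of the region together with a comparison bit stream that can be cut to a shorter length, may be sketched as follows. The class name, the field names, and the choice of keeping the leading bits when cutting are hypothetical and serve only to illustrate the idea.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class ScalableDescriptor:
    """Illustrative container; field names are assumptions, not terms of the embodiment."""
    x: float                 # position of the extracted region
    y: float
    scale: float             # scale of the extracted region
    angle: float             # orientation angle of the extracted region
    bits: Tuple[int, ...]    # region feature comparison bit stream

    def truncated(self, n_bits):
        """Return a smaller descriptor that keeps only the first n_bits comparison bits."""
        if not 0 <= n_bits <= len(self.bits):
            raise ValueError("n_bits must be between 0 and the current bit-stream length")
        return replace(self, bits=self.bits[:n_bits])


# Hypothetical values for illustration only.
full = ScalableDescriptor(x=10.0, y=20.0, scale=1.5, angle=0.3,
                          bits=(1, 0, 1, 1, 0, 0, 1, 0))
small = full.truncated(4)    # keeps position, scale, and angle; bits become (1, 0, 1, 1)
```

Keeping the leading bits means that a shortened descriptor remains a prefix of the full one, which is one simple way to let descriptors of different sizes be compared.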
Referring to
The DB retrieval unit 131 searches the database 140 when a feature descriptor is input from the feature descriptor generator 120. That is, the DB retrieval unit 131 retrieves one or more feature descriptors similar to the input feature descriptor from the database 140.
The similarity comparison unit 132 compares similarities between the one or more feature descriptors retrieved by the DB retrieval unit 131 and the feature descriptors input from the feature descriptor generator 120.
When the similarities compared by the similarity comparison unit 132 satisfy a predetermined threshold value and other conditions, the matching unit 133 determines two feature descriptors as matching. The number of similarities is plural according to the number and statistical values of block-converted patches included in a feature descriptor, and thus, matching can be efficiently performed based on various combined similarities. That is, the matching unit 133 determines what a corresponding object is.
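The retrieval, similarity comparison, and threshold-based matching described above may be sketched as follows. The similarity measure used here (the fraction of agreeing bits over the common leading portion of two bit streams), the threshold value of 0.8, and the function names are assumptions chosen for illustration; the embodiment may combine several similarities and other conditions.

```python
def similarity(bits_a, bits_b):
    """Fraction of agreeing bits over the common prefix of two bit streams (assumed measure)."""
    n = min(len(bits_a), len(bits_b))
    if n == 0:
        return 0.0
    same = sum(1 for a, b in zip(bits_a, bits_b) if a == b)  # zip stops at the shorter stream
    return same / n


def match(query_bits, database, threshold=0.8):
    """Return (label, score) of the most similar stored descriptor, or (None, score)
    when the best score does not satisfy the threshold."""
    best_label, best_score = None, 0.0
    for label, stored_bits in database.items():
        score = similarity(query_bits, stored_bits)
        if score > best_score:
            best_label, best_score = label, score
    if best_score >= threshold:
        return best_label, best_score
    return None, best_score


# Hypothetical database contents for illustration only.
database = {
    "Mega Box": [1, 0, 1, 1, 0, 0, 1, 0],
    "other":    [0, 1, 0, 0, 1, 1, 0, 1],
}
hit = match([1, 0, 1, 1], database)                  # shorter query: ("Mega Box", 1.0)
miss = match([1, 0, 1, 0, 0, 0, 1, 1], database)     # 6 of 8 bits agree: (None, 0.75)
```

Because only the common prefix is compared, a shortened query descriptor can still be matched against full-length descriptors stored in the database.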
Next, an image recognition method using a scalable compact region feature descriptor will be described.
The image recognition method according to an embodiment of the present invention includes an operation of extracting a scalable compact region feature descriptor from an input image, and an operation of retrieving a feature descriptor similar to the extracted scalable compact region feature descriptor and matching the retrieved feature descriptor with the extracted feature descriptor.
Referring to
The feature descriptor generator 120 extracts information for a feature description of the extracted local region in operation 530. This will be described in detail with reference to
Referring to
In statistical values calculated in each block, the feature descriptor generator 120 calculates a one-dimensional statistical value such as an average and a variance in operation 532, and calculates a two-dimensional statistical value and a high-dimensional statistical value such as a saliency map and the number of corners that are extracted from each region in operation 533.
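The calculation of one-dimensional statistical values for each block may be sketched as follows. The square 4×4 block grid, the use of the population variance, and the function name are assumptions for illustration; two-dimensional and high-dimensional statistics such as a saliency map or a corner count are not covered by this sketch.

```python
from statistics import fmean, pvariance


def block_statistics(patch, grid=4):
    """Split a square patch (list of rows of pixel values) into grid x grid blocks and
    return a list of (average, variance) pairs in row-major block order."""
    size = len(patch)
    if size == 0 or size % grid != 0 or any(len(row) != size for row in patch):
        raise ValueError("patch must be square with a side length divisible by grid")
    step = size // grid
    stats = []
    for by in range(grid):
        for bx in range(grid):
            pixels = [patch[y][x]
                      for y in range(by * step, (by + 1) * step)
                      for x in range(bx * step, (bx + 1) * step)]
            stats.append((fmean(pixels), pvariance(pixels)))
    return stats


# Hypothetical 8x8 patch in which every 2x2 block is filled with its own block index.
patch = [[float((y // 2) * 4 + (x // 2)) for x in range(8)] for y in range(8)]
stats = block_statistics(patch)   # 16 pairs; block k has average k and variance 0
```

The sixteen averages obtained in this way could serve as the per-block feature values that are subsequently compared to generate the bit stream.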
The feature descriptor generator 120 compares features calculated by the local region feature calculation unit 122 for each region, and generates a bit stream that is used in an actual feature descriptor in operation 540. In this case, a method of binarizing a feature value by comparing the sizes of feature values between different blocks, and a method of quantizing a feature value by aligning a plurality of feature values may be used for the comparison.
The feature descriptor generator 120 generates a descriptor using a local region feature result value in operation 550. The generated descriptor includes information on a position, scale, and angle of the extracted region, and is configured by adding a region feature comparison value to this information. In this case, as needed, the feature descriptor generator 120 may adjust the size of the descriptor by cutting a portion of the comparison bit stream of the descriptor.
Referring to
The feature descriptor matcher 130 compares similarities between the retrieved one or more feature descriptors and the input feature descriptors in operation 720.
When the compared similarities satisfy a predetermined threshold value and other conditions, the feature descriptor matcher 130 determines two feature descriptors as matching in operation 730. The number of similarities is plural according to the number and statistical values of block-converted patches included in a feature descriptor, and thus, matching can be efficiently performed based on various combined similarities. That is, the feature descriptor matcher 130 determines what a corresponding object is.
According to the present invention, a scalable feature descriptor that changes the size of a descriptor and a processing speed according to an applied purpose can be generated.
Accordingly, with the present invention, different descriptors can be extracted according to the descriptor storage space and the performance of an extractor, and moreover, the extracted descriptors having different sizes can be matched with one another.
A number of examples have been described above. Nevertheless, it will be understood that various modifications may be made. For example, suitable results may be achieved if the described techniques are performed in a different order and/or if components in a described system, architecture, device, or circuit are combined in a different manner and/or replaced or supplemented by other components or their equivalents. Accordingly, other implementations are within the scope of the following claims.
Number | Date | Country | Kind |
---|---|---|---|
10-2012-0020558 | Feb 2012 | KR | national |