The technology discussed below relates generally to wireless communications, and more specifically to methods and devices for facilitating detection of text and computer generated graphics within digital imagery, and the encoding of such imagery based at least in part on such detections.
Digital imagery, as the term is used throughout this disclosure, can include still images and moving images (e.g., digital video). As technology advances, digital imagery is becoming more and more detailed, resulting in increasing numbers of pixels associated with a single still image or video frame, and an increasing number of bits to represent those pixels.
In order to reduce the resources needed to store and/or transmit such digital imagery, digital encoding is often employed. Encoding generally involves operations such as compression, encryption, quantization, etc. Depending on the specific encoding employed, some detail from the data source may be lost. Often, the perceptual quality of encoded digital imagery is affected by the loss of such data. Perceptual quality refers generally to a subjective measure of how clearly the details of the imagery can be visually perceived by viewers. Research on how viewers perceive the data in question may be used to determine what specific data the encoding will allow to be lost. Often, a person viewing encoded digital imagery is more sensitive to the perceptual quality as it applies to certain components of an image than others. For example, it has been determined that viewers are relatively more sensitive to artifacts associated with computer generated graphic components and/or text components than they may be to natural image components.
It may be desirable to provide encoding processes that are capable of detecting text, computer generated graphics, and the like in a manner to enable sufficiently high quality for such content.
The following summarizes some aspects of the present disclosure to provide a basic understanding of the discussed technology. This summary is not an extensive overview of all contemplated features of the disclosure, and is intended neither to identify key or critical elements of all aspects of the disclosure nor to delineate the scope of any or all aspects of the disclosure. Its sole purpose is to present some concepts of one or more aspects of the disclosure in summary form as a prelude to the more detailed description that is presented later.
Various examples and implementations of the present disclosure facilitate low complexity text detection and/or computer generated graphics detection in digital imagery for encoding such digital imagery. According to at least one aspect of the disclosure, electronic devices may be configured for processing digital imagery. In at least one example, an electronic device may include a storage medium including digital imagery stored thereon. A processing circuit may be coupled to the storage medium, and may be adapted to mathematically combine pixel values for each pixel in a group of pixels from the digital imagery according to a plurality of predefined patterns. Each predefined pattern is different from the other predefined patterns and designates whether to add or subtract each pixel value. The processing circuit may be adapted to determine that the group of pixels includes at least one of text or computer generated graphics when the result from the mathematical combination is greater than or equal to a predefined threshold, or that the group of pixels does not include text or computer generated graphics when the result from the mathematical combination is less than the predefined threshold.
Further aspects provide methods operational on electronic devices and/or electronic devices including means to perform such methods. One or more examples of such methods may include mathematically combining pixel values associated with respective pixels from a macroblock of a digital image according to a plurality of predefined patterns, where each predefined pattern is different from the other predefined patterns. A determination may be made that the macroblock includes content comprising at least one of text or computer generated graphics when the result from the mathematical combination is greater than or equal to a predefined threshold. Further, a determination may be made that the macroblock does not include content comprising at least one of text or computer generated graphics when the result from the mathematical combination is less than the predefined threshold.
Still further aspects include processor-readable storage mediums comprising programming executable by a processing circuit. According to one or more examples, such programming may be adapted for causing the processing circuit to calculate a macroblock value by adding together pixel values for each pixel of a macroblock from a digital image. The pixel values can be added together according to a plurality of predefined patterns designating whether a pixel value is added as a positive or a negative value. The programming may further be adapted for causing the processing circuit to determine that the macroblock includes at least one of text or computer generated graphics when the macroblock value is greater than or equal to a predefined threshold, or to determine that the macroblock does not include text or computer generated graphics when the macroblock value is less than the predefined threshold.
Other aspects, features, and embodiments associated with the present disclosure will become apparent to those of ordinary skill in the art upon reviewing the following description in conjunction with the accompanying figures.
The description set forth below in connection with the appended drawings is intended as a description of various configurations and is not intended to represent the only configurations in which the concepts and features described herein may be practiced. The following description includes specific details for the purpose of providing a thorough understanding of various concepts. However, it will be apparent to those skilled in the art that these concepts may be practiced without these specific details. In some instances, well known circuits, structures, techniques and components are shown in block diagram form to avoid obscuring the described concepts and features.
Aspects of the present disclosure relate to text and/or computer generated graphic detection in digital imagery. Referring now to
The digital image 100 is typically made up of millions of pixels, each defining a color for the respective location within the image. For example,
According to an aspect of the present disclosure, the text content 104 and the computer generated graphic content 106 in the digital image 100 may be detected. The detection of text content and/or computer generated graphic content can be employed in encoding the digital imagery in a manner to maintain perceptual quality of this content.
Turning to
The processing circuit 302 is arranged to obtain, process and/or send data, control data access and storage, issue commands, and control other desired operations. The processing circuit 302 may include circuitry adapted to implement desired programming provided by appropriate media, and/or circuitry adapted to perform one or more functions described in this disclosure. For example, the processing circuit 302 may be implemented as one or more processors, one or more controllers, and/or other structure configured to execute executable programming. Examples of the processing circuit 302 may include a general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic component, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general purpose processor may include a microprocessor, as well as any conventional processor, controller, microcontroller, or state machine. The processing circuit 302 may also be implemented as a combination of computing components, such as a combination of a DSP and a microprocessor, a number of microprocessors, one or more microprocessors in conjunction with a DSP core, an ASIC and a microprocessor, or any other number of varying configurations. These examples of the processing circuit 302 are for illustration and other suitable configurations within the scope of the present disclosure are also contemplated.
The processing circuit 302 is adapted for processing, including the execution of programming, which may be stored on the storage medium 304. As used herein, the term “programming” shall be construed broadly to include without limitation instructions, instruction sets, code, code segments, program code, programs, subprograms, software modules, applications, software applications, software packages, routines, subroutines, objects, executables, threads of execution, procedures, functions, etc., whether referred to as software, firmware, middleware, microcode, hardware description language, or otherwise.
In some instances, the processing circuit 302 may include a text/graphic detector 310 and an imagery encoder 312. The text/graphic detector 310 may include circuitry and/or programming (e.g., programming stored on the storage medium 304) adapted to detect text content and computer generated graphical content within imagery. The imagery encoder 312 may include circuitry and/or programming (e.g., programming stored on the storage medium 304) adapted to employ information relating to detected text and/or computer generated graphics to encode imagery according to one or more digital imagery encoding algorithms.
The storage medium 304 may represent one or more processor-readable devices for storing programming, such as processor executable code or instructions (e.g., software, firmware), electronic data, databases, or other digital information. The storage medium 304 may also be used for storing data that is manipulated by the processing circuit 302 when executing programming. The storage medium 304 may be any available media that can be accessed by a general purpose or special purpose processor, including portable or fixed storage devices, optical storage devices, and various other mediums capable of storing, containing and/or carrying programming. By way of example and not limitation, the storage medium 304 may include a processor-readable storage medium such as a magnetic storage device (e.g., hard disk, floppy disk, magnetic strip), an optical storage medium (e.g., compact disk (CD), digital versatile disk (DVD)), a smart card, a flash memory device (e.g., card, stick, key drive), random access memory (RAM), read only memory (ROM), programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), a register, a removable disk, and/or other mediums for storing programming, as well as any combination thereof.
The storage medium 304 may be coupled to the processing circuit 302 such that the processing circuit 302 can read information from, and write information to, the storage medium 304. That is, the storage medium 304 can be coupled to the processing circuit 302 so that the storage medium 304 is at least accessible by the processing circuit 302, including examples where the storage medium 304 is integral to the processing circuit 302 and/or examples where the storage medium 304 is separate from the processing circuit 302 (e.g., resident in the electronic device 300, external to the electronic device 300, distributed across multiple entities).
Programming stored by the storage medium 304, when executed by the processing circuit 302, causes the processing circuit 302 to perform one or more of the various functions and/or process steps described herein. In at least some examples, the storage medium 304 may include text/graphic detection operations 314 and imagery encoding operations 316. The text/graphic detection operations 314 are adapted to cause the processing circuit 302 to detect text content and computer generated graphical content within imagery, as described herein. The imagery encoding operations 316 are adapted to cause the processing circuit 302 to encode imagery according to one or more digital imagery encoding algorithms and in response to the information relating to any detected text and/or computer generated graphics.
Thus, according to one or more aspects of the present disclosure, the processing circuit 302 is adapted to perform (in conjunction with the storage medium 304) any or all of the processes, functions, steps and/or routines for any or all of the electronic devices described herein. As used herein, the term “adapted” in relation to the processing circuit 302 may refer to the processing circuit 302 being one or more of configured, employed, implemented, and/or programmed (in conjunction with the storage medium 304) to perform a particular process, function, step and/or routine according to various features described herein.
The communications interface 306 is configured to facilitate wireless and/or wired communications of the electronic device 300. For example, the communications interface 306 may include circuitry and/or programming adapted to facilitate the communication of information bi-directionally with respect to one or more other devices.
The imagery capture device 308 is configured to facilitate the capture and digitization of imagery. The imagery capture device 308 may include a camera and related hardware and programming capable of obtaining still and/or video imagery, as is generally known to those of ordinary skill in the art.
In operation, the electronic device 300 is adapted to analyze digital imagery data to detect text and/or computer generated graphics using relatively low complexity operations. In general, the electronic device 300 is adapted to evaluate a group of pixels (e.g., a 16×16 macroblock) of the digital imagery by mathematically combining the value of each pixel within the group according to a predefined pattern. For example, the electronic device 300 may combine the numeric value for each pixel in a group by either addition or subtraction, depending on the location of the pixel within the group of pixels. In some embodiments, a plurality of predefined patterns may be used on each group of pixels, and the results from each of the patterns may be combined to obtain a total value. Based on the resulting value, the electronic device 300 can predict whether the group of pixels includes text and/or computer generated graphic components.
A more specific example will be described.
According to an aspect of the present disclosure, each of the different patterns may be adapted to determine whether a pixel will be added to the total or whether it will be subtracted from the total when the pixel values are all mathematically combined. Another way of thinking about it is that the patterns can indicate whether a pixel value will be positive or negative.
In
Turning to
Each macroblock can be analyzed independently by the electronic device 300. In one example, the text/graphic detector 310 can initially add up the pixel values for each pixel within the macroblock according to the top pattern 602 to obtain SUM 1. In some examples, the pixel values associated with the shaded regions in the pattern 602 can be multiplied by a negative one (−1) and then all of the pixel values can be added together to obtain the value of SUM 1. In other examples, the pixel values associated with the shaded regions in the pattern 602 can be subtracted from the total, while the pixel values associated with the white regions in the pattern 602 can be added to the total to obtain SUM 1. The resulting value for SUM 1 may be an absolute value of the resulting sum from adding/subtracting each of the pixel values according to the predefined pattern so that SUM 1 is not a negative value.
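By way of illustration only, the calculation of a single pattern total such as SUM 1 can be sketched as follows. This is a minimal sketch under stated assumptions and is not the specific implementation of the disclosure: the function name pattern_sum, the representation of a pattern as a grid of +1 (white region, added) and -1 (shaded region, subtracted), and the small 4×4 sample block are all hypothetical and chosen for brevity.

```python
def pattern_sum(pixels, pattern):
    """Combine pixel values according to one pattern and return the absolute total.

    pixels  : 2-D list of numeric pixel values, one inner list per row
    pattern : 2-D list of the same shape; +1 means add, -1 means subtract
              (assumed representation, not taken from the disclosure)
    """
    total = 0
    for pixel_row, sign_row in zip(pixels, pattern):
        for value, sign in zip(pixel_row, sign_row):
            # Multiplying by -1 and adding is equivalent to subtracting.
            total += sign * value
    # The absolute value keeps the pattern total from being negative.
    return abs(total)


# Hypothetical 4x4 block with a sharp vertical edge, as text or graphics might produce.
block = [[10, 10, 200, 200]] * 4
# Hypothetical pattern: left half added, right half subtracted.
left_right = [[+1, +1, -1, -1]] * 4

print(pattern_sum(block, left_right))  # 1520
```

In this sketch a block of uniform pixel values yields a total of zero for the same pattern, whereas the sharp edge yields a large total, which is the property the detection relies on.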
Similarly, the text/graphic detector 310 can apply other patterns, such as patterns 604, 606, 608, 610, 612, etc. to obtain respective SUM 2, SUM 3, SUM 4, SUM 5, SUM 6, etc. In some examples, all of the patterns ‘A’-‘S’ in
Once all of the total values (SUMs) are obtained for each predefined pattern, those total values can be added together to obtain the total value for the macroblock, identified as MacroBlock value 614. The text/graphic detector 310 then compares the MacroBlock value 614 to a predefined threshold. If the MacroBlock value 614 is greater than or equal to the predefined threshold, then the text/graphic detector 310 can determine that the macroblock 202 includes text and/or computer generated graphics content. If the MacroBlock value 614 is below the predefined threshold, then the text/graphic detector 310 can determine that the macroblock 202 does not include text and/or computer generated graphics content.
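The combination of the pattern totals and the comparison against the threshold may be sketched as shown below. The sketch is illustrative only: the function names, the two sample patterns, the 4×4 block size and the threshold value of 1000 are hypothetical assumptions, and an actual implementation may use a different number of patterns and a threshold selected for the application.

```python
def pattern_total(pixels, pattern):
    """Absolute value of the signed sum of pixels for one pattern (+1 add, -1 subtract)."""
    total = 0
    for pixel_row, sign_row in zip(pixels, pattern):
        for value, sign in zip(pixel_row, sign_row):
            total += sign * value
    return abs(total)


def macroblock_value(pixels, patterns):
    """Add together the pattern totals (SUM 1, SUM 2, ...) for one macroblock."""
    return sum(pattern_total(pixels, pattern) for pattern in patterns)


def has_text_or_graphics(pixels, patterns, threshold):
    """Return True when the macroblock value is greater than or equal to the threshold."""
    return macroblock_value(pixels, patterns) >= threshold


# Hypothetical patterns for a 4x4 block (assumed shapes, not those of the figures).
patterns = [
    [[+1, +1, -1, -1]] * 4,                     # left half added, right half subtracted
    [[+1] * 4, [+1] * 4, [-1] * 4, [-1] * 4],   # top half added, bottom half subtracted
]
THRESHOLD = 1000  # hypothetical value

edge_block = [[10, 10, 200, 200]] * 4   # sharp edge
flat_block = [[50, 50, 50, 50]] * 4     # uniform region

print(macroblock_value(edge_block, patterns))                 # 1520
print(has_text_or_graphics(edge_block, patterns, THRESHOLD))  # True
print(has_text_or_graphics(flat_block, patterns, THRESHOLD))  # False
```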
In some implementations, the electronic device 300 may employ features to further reduce the number of computations for obtaining a MacroBlock value. For instance, the text/graphic detector 310 may calculate values for a smaller portion of the macroblock, and then use those values for calculations in all of the different patterns.
To calculate a MacroBlock value in this example, the macroblock 202 can first be divided into the smaller groups of pixels I-XVI as depicted at 702. The text/graphic detector 310 can then calculate a group value for each group I-XVI by adding the pixel values for each pixel within each group I-XVI to obtain a respective value. The group values for each respective group I-XVI are depicted as SUM I, SUM II, etc. in box 704.
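The calculation of the group values may be sketched as follows. This is a hedged illustration only: it assumes a 16×16 macroblock divided into sixteen square groups of 4×4 pixels arranged in row-major order, and the function name group_sums and the sample macroblock are hypothetical.

```python
def group_sums(pixels, group_size=4):
    """Return a grid of group values, one per group_size x group_size group of pixels.

    pixels     : 2-D list of pixel values (assumed 16x16 here)
    group_size : edge length of each square group (assumed 4, giving 16 groups)
    """
    rows, cols = len(pixels), len(pixels[0])
    sums = []
    for top in range(0, rows, group_size):
        row_of_sums = []
        for left in range(0, cols, group_size):
            # Add every pixel value inside the current group.
            group_total = sum(
                pixels[r][c]
                for r in range(top, top + group_size)
                for c in range(left, left + group_size)
            )
            row_of_sums.append(group_total)
        sums.append(row_of_sums)
    return sums


# Hypothetical 16x16 macroblock whose pixel value equals its column index.
macroblock = [[col for col in range(16)] for _ in range(16)]
sums = group_sums(macroblock)

print(len(sums), len(sums[0]))  # 4 4
print(sums[0])                  # [24, 88, 152, 216]
```

The group values are computed once per macroblock and can then be reused for every pattern.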
Instead of adding or subtracting each pixel value according to the various patterns, the text/graphic detector 310 can add or subtract each group according to the various patterns. To facilitate such pixel groupings, the text/graphic detector 310 can apply patterns where the pixels within a group are either all added or all subtracted. For instance, in the example in
To further illustrate this feature, the first depicted pattern in
Similar to the example described above with reference to
In response to a determination by the text/graphic detector 310 that a macroblock contains text and/or computer generated graphics content, the imagery encoder 312 can encode the macroblock in a manner to better preserve the details (e.g., employ a low quantization parameter). The specific imagery encoding algorithm employed may vary according to applications. By way of example and not limitation, the imagery encoder 312 may employ the JPEG image coding standard, the H.264 or HEVC video coding standards, or any other known encoding algorithm.
Various aspects of the present disclosure relate to methods operational on an electronic device for encoding digital imagery. Turning to
In order to determine the degree to which details within portions of the digital imagery are to be preserved, the electronic device 300 may determine whether the imagery includes any text or computer generated graphics. At 804, the electronic device 300 may analyze each macroblock for text and/or computer generated graphics by mathematically combining pixel values for the pixels within a macroblock according to a plurality of predefined patterns. For example, the processing circuit 302 (e.g., the text/graphic detector 310) executing the text/graphic detection operations 314 may mathematically combine pixel values for each pixel of a macroblock according to a plurality of predefined patterns.
Referring to
The processing circuit 302 (e.g., the text/graphic detector 310) executing the text/graphic detection operations 314 then mathematically combines each of the pixel values by adding or subtracting the pixel values for each pixel of the macroblock according to this first predefined pattern. For instance, at operation 904, the next pixel (e.g., initial pixel if starting) is evaluated. At decision diamond 906, the processing circuit 302 (e.g., the text/graphic detector 310) executing the text/graphic detection operations 314 determines whether the pattern indicates the pixel is to be added or subtracted.
If the pattern indicates the pixel is to be added, then it is added to a total at operation 908. On the other hand, if the pattern indicates the pixel is to be subtracted, then it is subtracted from the total at operation 910. The initial total will start at 0, and each pixel value is combined as defined by the pattern.
At decision diamond 912, a determination is made whether there are further pixels in the macroblock. If yes, then the process returns to operation 904, where the next pixel is evaluated. If no, then the process continues to operation 914 where the total is saved. As noted above, the saved pattern total may be an absolute value of the calculated sum, to ensure it is a positive value.
After the total value of the predefined pattern is saved for the macroblock, a determination is made whether there is another pattern at decision diamond 916. If so, the process returns to operation 902 and applies the next pattern. If not, then the saved pattern total values for all of the evaluated patterns are combined at operation 918 to obtain a macroblock value (e.g., macroblock value 614 in
Referring to
At operation 1004, the processing circuit 302 (e.g., the text/graphic detector 310) executing the text/graphic detection operations 314 may apply a first predefined pattern (e.g., pattern A in
The processing circuit 302 (e.g., the text/graphic detector 310) executing the text/graphic detection operations 314 then mathematically combines the group values by adding or subtracting the group value for each pixel group of the macroblock according to this first predefined pattern. For instance, at operation 1006, the next pixel group (e.g., initial pixel group if starting) is evaluated. At decision diamond 1008, the processing circuit 302 (e.g., the text/graphic detector 310) executing the text/graphic detection operations 314 determines whether the pattern indicates that the pixels in the pixel group are to be added or subtracted.
If the pattern indicates the pixels in the pixel group are to be added, then the group value associated with the pixel group is added to a total at operation 1010. On the other hand, if the pattern indicates the pixels in the pixel group are to be subtracted, then the group value is subtracted from the total at operation 1012. The initial total will start at 0, and each group value is combined as defined by the pattern.
At decision diamond 1014, a determination is made whether there are further pixel groups in the macroblock. If yes, then the process returns to operation 1006, where the next pixel group is evaluated. If no, then the process continues to operation 1016 where the total is saved. As noted above, the saved pattern total may be an absolute value of the calculated sum, to ensure it is a positive value.
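The loop over pixel groups described in operations 1006 through 1016 may be sketched as follows. The sketch is illustrative and rests on assumptions: the function name, the representation of a pattern as a grid of +1 and -1 with one entry per pixel group, and the sample group values are hypothetical.

```python
def pattern_total_from_groups(group_values, group_pattern):
    """Apply one pattern to precomputed group values and return the saved total.

    group_values  : 2-D list of group values, one per pixel group
    group_pattern : 2-D list of the same shape; +1 means the whole group is added,
                    -1 means the whole group is subtracted (assumed representation)
    """
    total = 0  # the initial total starts at 0
    for value_row, sign_row in zip(group_values, group_pattern):
        for group_value, sign in zip(value_row, sign_row):  # operation 1006: next group
            if sign > 0:                                     # decision diamond 1008
                total += group_value                         # operation 1010
            else:
                total -= group_value                         # operation 1012
    # Operation 1016: save the total, here as an absolute value.
    return abs(total)


# Hypothetical group values for a macroblock divided into sixteen pixel groups.
group_values = [[24, 88, 152, 216]] * 4
# Hypothetical pattern: the two left columns of groups added, the two right subtracted.
group_pattern = [[+1, +1, -1, -1]] * 4

print(pattern_total_from_groups(group_values, group_pattern))  # 1024
```

With sixteen pixel groups per macroblock, each pattern in this sketch combines 16 group values rather than the 256 individual pixel values of a 16×16 macroblock.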
After the total value of the predefined pattern is saved for the macroblock, a determination is made whether there is another pattern at decision diamond 1018. If so, the process returns to operation 1004 and applies the next pattern. If not, then the saved pattern total values for all of the evaluated patterns are combined at operation 1020 to obtain a macroblock value (e.g., macroblock value 706 in
Returning to
At 808, the electronic device 300 can encode the macroblock based at least in part on whether the macroblock has been determined to include text and/or computer generated graphic content. For example, the processing circuit 302 (e.g., the imagery encoder 312) executing the imagery encoding operations 316 may encode the macroblock. If the mathematical combination for the macroblock is greater than or equal to the predefined threshold, the processing circuit 302 (e.g., the imagery encoder 312) executing the imagery encoding operations 316 may encode the macroblock in a manner to preserve details within the macroblock (e.g., employ a lower quantization parameter compared to the quantization parameter that would be employed when the mathematical combination is less than the predefined threshold). If the mathematical combination for the macroblock is less than the predefined threshold, the processing circuit 302 (e.g., the imagery encoder 312) executing the imagery encoding operations 316 may encode the macroblock without regard for text and/or computer generated graphics (e.g., employ a higher quantization parameter compared to the quantization parameter that would be employed when the mathematical combination is greater than or equal to the predefined threshold).
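The selection of a quantization parameter at 808 may be sketched as shown below. This is only a minimal sketch under stated assumptions: the function name select_qp and the particular quantization parameter values 22 and 32 are hypothetical and are not taken from the disclosure or from any particular encoding standard; the only property relied upon is that the value used for detected text or computer generated graphics is lower than the value used otherwise.

```python
def select_qp(macroblock_value, threshold, detail_qp=22, default_qp=32):
    """Choose a quantization parameter for a macroblock.

    detail_qp  : hypothetical lower QP used when text/graphics content is detected
    default_qp : hypothetical higher QP used otherwise
    """
    if detail_qp >= default_qp:
        raise ValueError("detail_qp must be lower than default_qp")
    if macroblock_value >= threshold:
        # Text and/or computer generated graphics detected: preserve more detail.
        return detail_qp
    # No such content detected: encode without special regard for detail.
    return default_qp


print(select_qp(1520, 1000))  # 22
print(select_qp(0, 1000))     # 32
```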
While the above discussed aspects, arrangements, and embodiments are discussed with specific details and particularity, one or more of the components, steps, features and/or functions illustrated in
While features of the present disclosure may have been discussed relative to certain embodiments and figures, all embodiments of the present disclosure can include one or more of the advantageous features discussed herein. In other words, while one or more embodiments may have been discussed as having certain advantageous features, one or more of such features may also be used in accordance with any of the various embodiments discussed herein. In similar fashion, while exemplary embodiments may have been discussed herein as device, system, or method embodiments, it should be understood that such exemplary embodiments can be implemented in various devices, systems, and methods.
Also, it is noted that at least some implementations have been described as a process that is depicted as a flowchart, a flow diagram, a structure diagram, or a block diagram. Although a flowchart may describe the operations as a sequential process, many of the operations can be performed in parallel or concurrently. In addition, the order of the operations may be re-arranged. A process is terminated when its operations are completed. A process may correspond to a method, a function, a procedure, a subroutine, a subprogram, etc. When a process corresponds to a function, its termination corresponds to a return of the function to the calling function or the main function. The various methods described herein may be partially or fully implemented by programming (e.g., instructions and/or data) that may be stored in a processor-readable storage medium, and executed by one or more processors, machines and/or devices.
Those of skill in the art would further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the embodiments disclosed herein may be implemented as hardware, software, firmware, middleware, microcode, or any combination thereof. To clearly illustrate this interchangeability, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the overall system.
The various features associated with the examples described herein and shown in the accompanying drawings can be implemented in different examples and implementations without departing from the scope of the present disclosure. Therefore, although certain specific constructions and arrangements have been described and shown in the accompanying drawings, such embodiments are merely illustrative and not restrictive of the scope of the disclosure, since various other additions and modifications to, and deletions from, the described embodiments will be apparent to one of ordinary skill in the art. Thus, the scope of the disclosure is only determined by the literal language, and legal equivalents, of the claims which follow.