As observed in FIG. 1, a traditional computing system performs various image processing tasks 100 with its general purpose processing core 102 and/or its image signal processor (ISP) 103.
A problem with performing all such tasks 100 within these components 102, 103 is the amount of power consumed moving image data within the system. Specifically, entire images of data typically need to be forwarded 106 from the camera 101 directly to the ISP 103 or into system memory 104. The movement of such large quantities of data within the system consumes large amounts of power which, in the case of battery operated devices, can dramatically reduce the battery life of the device.
Compounding the inefficiency is that, oftentimes, much of the image data is of little importance or value. For example, consider an imaging task that seeks to analyze a small area of the image. Here, although just a small area of the image is of interest to the processing task, the entire image will be forwarded through the system. The small region of interest is effectively parsed from the larger image only after the system has expended significant power moving large amounts of useless data outside the region.
Another example is the initial identification of a “looked for” feature within an image (e.g., the initial identification of the region of interest in the example discussed immediately above). Here, if the looked for feature is apt to be present in the imagery taken by the camera only infrequently, continuous streams of entire images without the feature will be forwarded through the system before the feature ultimately presents itself. As such, again, large amounts of data that are of no use or value are being moved through the system, which can dramatically reduce the power efficiency of the device.
Additionally, all camera control decisions, such as whether to enter a camera into a particular mode, have traditionally been made by the general purpose processing core 102. As such, highly adaptive camera control functions (e.g., in which a camera switches between various modes frequently) can generate heavy camera control traffic 107 that is directed through the system toward the camera 101. Such highly adaptive functions may even be infeasible because of the substantial delay between the recognition of an event that should cause the camera to change modes and the moment when any new command is ultimately received by the camera 101.
An integrated stacked and/or abutted sensor, memory and processing hardware camera solution is described. The sensor is to receive light from an image and generate electronic pixels from the light. The processing hardware is to process the electronic pixels to: a) recognize a scene from the image in a lower quality image mode; b) trigger actions by the camera solution in response to the recognition of the scene, the actions including: i) transitioning the camera solution from the lower quality image mode to a higher quality image mode to capture a higher quality version of the image; and, ii) forwarding from the camera solution important imagery and not forwarding from the camera solution unimportant imagery.
An apparatus is described that comprises means for receiving light from an image and generating electronic pixels from the light. The apparatus also includes means for processing the electronic pixels, the means for processing including means for recognizing a scene from the image in a lower quality image mode and means for triggering actions in response to the recognizing. The actions include: i) transitioning from the lower quality image mode to a higher quality image mode to capture a higher quality version of the image; and, ii) forwarding important imagery and not forwarding unimportant imagery. The means for receiving light, the means for processing and a memory are stacked and/or abutted into an integrated camera solution.
The following description and accompanying drawings are used to illustrate embodiments of the invention.
One such task is the ability to identify “looked-for” image features within the imagery being taken by the integrated solution 201. Another task is the ability to determine specific operating modes “on the fly” from analysis of imagery that has just been taken by the integrated solution 201. Each of these is discussed at length below.
With the ability to identify looked for image features with the integrated solution 201, image data that is of no interest or importance can be discarded by the integrated solution 201, thereby preventing it from being forwarded elsewhere through the system.
For example, recalling the problematic examples discussed just above in the Background section, if an image's region of interest can be identified by the integrated solution 201, the area of the image that is outside the region of interest can be completely discarded by the integrated solution 201—leaving only the region of interest to be forwarded to other components within the system for further processing. Likewise, entire images that do not have any content of importance can also be discarded in their entirety by the integrated solution 201.
As another example, entire frames can be passed or discarded based on whether or not their content has any features of interest. As such, frames having pertinent information are passed from the integrated solution 201 to other components of the system (e.g., system memory 204, a display, etc.). Frames deemed not to contain any pertinent information are discarded.
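For illustration purposes only, the following Python sketch loosely models this camera-side filtering and parsing behavior. The detector, its brightness threshold and all function names are hypothetical stand-ins rather than any described embodiment; the point is simply that only the region of interest, or nothing at all, ever leaves the camera.

```python
import numpy as np

def detect_features(frame):
    # Hypothetical looked-for feature detector: returns a list of
    # (row, col, height, width) bounding boxes, or an empty list.
    # As a placeholder, unusually bright areas are treated as "features".
    mask = frame.mean(axis=-1) > 200
    if not mask.any():
        return []
    rows, cols = np.where(mask)
    return [(rows.min(), cols.min(),
             rows.max() - rows.min() + 1, cols.max() - cols.min() + 1)]

def filter_and_parse(frame):
    # Forward only the region of interest; discard everything else
    # before it ever leaves the camera.
    boxes = detect_features(frame)
    if not boxes:
        return None  # entire frame discarded, nothing is forwarded
    r, c, h, w = boxes[0]
    return frame[r:r + h, c:c + w].copy()

frame = (np.random.rand(480, 640, 3) * 255).astype(np.uint8)
roi = filter_and_parse(frame)
print("bytes forwarded:", 0 if roi is None else roi.nbytes, "of", frame.nbytes)
```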
As such, the ability to identify looked-for features with the integrated solution 201 provides for a system that, ideally, only forwards data having some importance or value elsewhere through the system. By preventing the forwarding of data having no importance or value through the system, the efficiency of the system is greatly improved as compared to traditional prior art systems.
The functionality of identifying looked for features with the integrated solution 201 may also be extended, at least in some cases, to perform any associated follow-on image processing tasks with the integrated solution 201. One particularly pertinent follow-on processing task may be compression. Here, once pertinent image information has been identified by the integrated solution 201, the information may be further compressed by the integrated solution 201 to reduce its total data size in preparation for its forwarding to other components within the system. Thus, efficiencies may be realized not only by eliminating the forwarding of unimportant information, but also by reducing the size of the pertinent information that is forwarded.
Further still, different parts of a feature of interest may be compressed at different compression ratios (e.g., sections of the image that are more quality sensitive may be compressed at a lower compression ratio while other sections of the image that are less quality sensitive may be compressed at a higher compression ratio). Generally, images (e.g., entire frames or portions thereof) that are more sensitive to quality may be compressed with lower compression ratios while images (e.g., frames or portions thereof) that are less sensitive to quality may be compressed with greater compression ratios.
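A minimal sketch of such region-dependent compression follows. Here zlib compression levels merely stand in for the quality/ratio settings of a real image codec (e.g., JPEG quality factors), and the function and box layout are hypothetical illustrations:

```python
import zlib
import numpy as np

def compress_frame(frame, roi_box):
    # The quality-sensitive region of interest is compressed lightly
    # (lower compression ratio); the remainder of the frame is compressed
    # aggressively (higher compression ratio).
    r, c, h, w = roi_box
    roi = frame[r:r + h, c:c + w]
    background = frame.copy()
    background[r:r + h, c:c + w] = 0  # the ROI is carried separately
    return (zlib.compress(roi.tobytes(), level=1),        # quality sensitive
            zlib.compress(background.tobytes(), level=9))  # less sensitive

frame = (np.random.rand(480, 640, 3) * 255).astype(np.uint8)
roi_blob, bg_blob = compress_frame(frame, (100, 100, 80, 80))
print(len(roi_blob), len(bg_blob))
```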
In yet other cases, all of the image processing intelligence for a particular function may be performed by the integrated solution 201. For instance, not only may a region of interest be identified by the integrated solution 201, but also, whatever analysis of the region of interest that is to take place once it has been identified is also performed by the integrated solution 201. In this case, little or no image information at all (important or otherwise) is forwarded through the system because the entire task has been performed by the integrated solution 201. In this respect, power reduction efficiency is practically ideal as compared to the prior art approaches described in the Background.
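This end-to-end case can be pictured as follows: the analysis runs entirely on the integrated solution and only a small result record is forwarded. The count_faces detector and record format below are hypothetical stand-ins for whatever on-camera analysis a given deployment performs.

```python
import json

def count_faces(frame):
    # Hypothetical stand-in for the full on-camera analysis task.
    return 2

def analyze_on_camera(frame):
    # The entire task runs on the integrated solution; only a small
    # result record, never the imagery itself, leaves the camera.
    result = {"faces": count_faces(frame)}
    return json.dumps(result).encode()

print(analyze_on_camera(frame=None))  # a few bytes instead of megabytes
```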
In order to identify looked for features within an image (or perform other extended image processing functions) with the integrated solution 201, some degree of processing intelligence/sophistication is integrated into the integrated solution 201.
As observed in FIG. 3, in an embodiment the integrated solution includes an image sensor 302, a memory 303 and processing intelligence hardware 304 that are stacked with one another into a single integrated solution.
In other embodiments, the image sensor, memory and processing intelligence hardware may instead be abutted to one another (e.g., placed side by side within a same package) rather than stacked.
The processing intelligence hardware 304 can take on various different forms depending on implementation. At one extreme, the processing intelligence hardware 304 includes one or more processors and/or controllers that execute program code (e.g., that is also stored in memory 303 and/or in a non-volatile memory, e.g., within the camera (not shown)). Here, software and/or firmware routines written to perform various complex tasks are stored in memory 303 and are executed by the processor/controller in order to perform a specific complex function.
At the other extreme, the processing intelligence hardware 304 is implemented with dedicated (e.g., custom) hardware logic circuitry such as application specific integrated circuit (ASIC) custom hardware logic and/or programmable hardware logic (e.g., field programmable gate array (FPGA) logic, programmable logic device (PLD) logic and/or programmable logic array (PLA) logic).
In yet other implementations, some combination of these two extremes (processor(s) that execute program code vs. dedicated hardware logic circuitry) can be used to implement the processing intelligence hardware component 304.
As alluded to above, various looked-for features may be found by the integrated solution. The associated looked-for feature processes 401 may include, e.g., face detection (detecting the presence of any face), face recognition (detecting the presence of a specific face), facial expression recognition (detecting a particular facial expression), object detection or recognition (detecting the presence of a generic or specific object), motion detection or recognition (detecting a general or specific kind of motion), event detection or recognition (detecting a general or specific kind of event), and image quality detection or recognition (detecting a general or specific level of image quality).
The looked for feature processes 401 may be performed, e.g., concurrently and/or serially, and may be dependent on various conditions (e.g., a facial recognition function may only be performed if specifically requested by a processing core, application and/or user).
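As a loose illustration of such gating and scheduling, the sketch below runs a configurable set of detector processes either serially or concurrently. The detectors themselves (toy string-matching lambdas) and all names are hypothetical stand-ins for the looked for feature processes 401:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical detectors standing in for the looked-for feature
# processes 401; each takes a frame and returns True/False.
FEATURE_PROCESSES = {
    "face_detection":   lambda frame: "face" in frame,
    "motion_detection": lambda frame: "motion" in frame,
}

def run_feature_processes(frame, requested=None, parallel=False):
    # Processes may be gated on an explicit request (e.g., facial
    # recognition only when asked for) and run serially or concurrently.
    names = list(requested) if requested is not None else list(FEATURE_PROCESSES)
    if parallel:
        with ThreadPoolExecutor() as pool:
            results = pool.map(lambda n: FEATURE_PROCESSES[n](frame), names)
            return dict(zip(names, results))
    return {n: FEATURE_PROCESSES[n](frame) for n in names}

print(run_feature_processes("frame with face and motion", parallel=True))
print(run_feature_processes("frame with face", requested=["face_detection"]))
```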
As observed in FIG. 4, prior to the recognition of any looked for scene, the images that feed the looked for feature processes 401 may be captured in a low quality image mode 410.
Consider as an example a system that has been configured to identify various looked for features within a sequence of images being captured by the integrated solution, but where no such features have currently been found. In this mode, the integrated solution may continually capture images to feed the looked for feature processes 401 with the expectation that a looked for feature may eventually present itself.
The capturing of these images, however, is deliberately performed in a low picture quality mode to consume less power, since there is a likelihood that a number of the captured images will not contain any looked for feature. Because it does not make sense to consume significant power capturing images whose content has no value, the low quality mode is used prior to the discovery of a looked for feature to conserve power. Here, in many cases, various kinds of looked for features can be identified from a low quality image.
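A minimal sketch of this low-power watch loop follows, assuming hypothetical capture_frame and feature_present stand-ins for the sensor interface and detector of the integrated solution:

```python
def capture_frame(mode, index):
    # Hypothetical sensor read-out; the "low" quality mode trades
    # resolution, frame rate and bit depth for reduced power per frame.
    return {"mode": mode, "index": index}

def feature_present(frame):
    # Hypothetical detector; pretend a looked-for feature appears in
    # the fourth frame of the sequence.
    return frame["index"] == 3

def watch_in_low_quality_mode(max_frames=10):
    # Poll cheaply until something worth the power shows up.
    for i in range(max_frames):
        frame = capture_frame("low", i)
        if feature_present(frame):
            return frame  # hand off for follow-up actions
    return None

print(watch_in_low_quality_mode())
```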
The outputs from one or more of the looked-for feature processes 401 are provided to an aggregation layer 403 that combines outputs from various ones of the looked for feature processes 401 to enable a more comprehensive looked for scene (or “scene analysis”) function 404. For instance, consider a system that is designed to start streaming video if two particular people are identified in an image. Here, a first of the looked for feature processes 401 will identify the first person and a second of the looked for feature processes will identify the second person.
The outputs of both processes are aggregated 403 to enable a scene analysis function 404 that will raise a flag if both looked for features are found (i.e., both people have been identified in the image). Here, various ones of the looked for feature processes can be aggregated 403 to enable one or more scene analysis configurations (e.g., a first scene analysis that looks for two particular people and a particular object within an image, a second scene analysis that looks for three specific people, etc.).
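The two-person example can be sketched as follows. The per-person recognizers and all names are hypothetical stand-ins; the structure simply mirrors the feature processes 401 feeding the aggregation layer 403 and a configured scene analysis function 404:

```python
# Hypothetical per-person recognizers standing in for two of the
# looked-for feature processes 401.
FEATURES = {
    "person_a": lambda frame: "alice" in frame,
    "person_b": lambda frame: "bob" in frame,
}

def scene_two_people(outputs):
    # Scene analysis 404: flag only when both people are in the image.
    return outputs["person_a"] and outputs["person_b"]

def aggregate_and_analyze(frame, scenes):
    # Aggregation layer 403: fan every feature output into every
    # configured scene analysis function.
    outputs = {name: fn(frame) for name, fn in FEATURES.items()}
    return {name: scene(outputs) for name, scene in scenes.items()}

flags = aggregate_and_analyze("frame with alice and bob",
                              {"start_streaming": scene_two_people})
print(flags)  # {'start_streaming': True}
```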
Upon the scene analysis function 404 recognizing that a looked for scene has been found, the scene analysis will “trigger” the start of one or more additional follow-up actions 405. For instance, recall the example above where the integrated solution is to begin streaming video if two people are identified in the images being analyzed. Here, the follow-up action corresponds to the streaming of the video.
In many cases, as indicated in FIG. 4, the follow-up actions 405 that are triggered include transitioning the integrated solution from the low quality image mode 410 to a higher quality image mode.
Here, recall that low quality mode 410 may be used to analyze images for looked for features before any looked for scenes are found, because such images are apt to not contain looked for information and it therefore does not make sense to consume large amounts of power taking such images. After a looked for scene has been found, however, the images being taken by the integrated solution are potentially important and it is therefore justifiable to consume more power to take later images at higher quality. Transitioning to a higher quality image mode may include, for instance, any one or more of increasing the frame rate, increasing the image resolution and/or increasing the bit depth. In one embodiment, e.g., to conserve power in the high quality mode, the only pixel areas of the image sensor that are enabled during a capture mode are the pixel areas where a feature of interest is expected to impinge upon the surface area of the image sensor. Again, the image sensor 302 is presumed to include various configuration settings to enable rapid transition of such parameters. Note that making the decision to transition the integrated solution between low quality and high quality image modes corresponds to localized, adaptive imaging control, which is a significant improvement over the prior art approaches described in the Background.
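One way to picture this transition is as a write to a set of sensor configuration parameters. In the sketch below, the SensorConfig fields, the specific mode values and the pixel-windowing scheme are all hypothetical illustrations of such settings, not the register map of any actual sensor:

```python
from dataclasses import dataclass, field

@dataclass
class SensorConfig:
    # Hypothetical configuration registers of the image sensor 302.
    frame_rate_fps: int = 15
    resolution: tuple = (640, 480)
    bit_depth: int = 8
    enabled_windows: list = field(default_factory=list)  # [] => full array

def enter_high_quality_mode(cfg, roi_windows):
    # On a scene trigger: raise frame rate, resolution and bit depth,
    # and enable only the pixel areas where the feature of interest is
    # expected to land, to limit power in the high quality mode.
    cfg.frame_rate_fps = 60
    cfg.resolution = (4000, 3000)
    cfg.bit_depth = 12
    cfg.enabled_windows = list(roi_windows)
    return cfg

print(enter_high_quality_mode(SensorConfig(), [(1200, 900, 400, 400)]))
```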
Some examples of the additional actions 504 that may take place in response to a particular scene being identified include any one or more of the following: 1) identifying an area of interest within an image (e.g., the immediate area surrounding one or more looked for features within the image); 2) parsing an area of interest within an image and forwarding it to other (e.g., higher performance) processing components within the system; 3) discarding the area within an image that is not of interest; 4) compressing an image or portion of an image before it is forwarded to other components within the system; 5) taking a particular kind of image (e.g., a snapshot, a series of snapshots, a video stream); and, 6) changing one or more camera settings (e.g., changing the settings on the servo motors that are coupled to the optics to zoom-in, zoom-out or otherwise adjust the focusing/optics of the camera; changing an exposure setting; triggering a flash). Again, all of these actions can be taken under the control of the processing intelligence that exists at the camera-level.
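Such scene-to-action wiring can be loosely pictured as a lookup table consulted by the camera-level intelligence. The scene names and action names below are purely illustrative:

```python
# Hypothetical mapping from recognized scenes to follow-up actions 504;
# every action runs under camera-level processing intelligence.
ACTIONS = {
    "two_people_found": ["parse_roi", "compress_roi", "stream_video"],
    "low_light_scene":  ["raise_exposure", "trigger_flash", "take_snapshot"],
}

def on_scene_recognized(scene, perform):
    # Dispatch each configured follow-up action for the scene.
    for action in ACTIONS.get(scene, []):
        perform(action)

on_scene_recognized("two_people_found", perform=print)
```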
Although the embodiments above have stressed the entering of a high quality image capture mode after a looked for scene has been identified, various embodiments may not require such a transition, and various ones of the follow up actions 504 can take place while images are still being captured in a lower quality image capture mode.
Note also that the integrated solution may be a stand-alone device that is not itself integrated into a computer system. For example, the integrated solution may have, e.g., a wireless I/O interface that forwards image content consistent with the teachings above directly to a stand-alone display device.
As observed in FIG. 7, the basic computing system 700 may include a central processing unit 701 (which may include, e.g., a plurality of general purpose processing cores 715 and a main memory controller 717 disposed on an applications processor or multi-core processor 750), system memory 702, a display 703 (e.g., touchscreen, flat-panel), a local wired point-to-point link (e.g., USB) interface 704, various network I/O functions 705 (such as an Ethernet interface and/or cellular modem subsystem), a wireless local area network (e.g., WiFi) interface 706, a wireless point-to-point link (e.g., Bluetooth) interface 707, a Global Positioning System interface 708, various sensors 709, one or more cameras 710, a speaker/microphone codec 713, 714 and a power management control unit 724.
An applications processor or multi-core processor 750 may include one or more general purpose processing cores 715 within its CPU 701, one or more graphics processing units 716, a memory management function 717 (e.g., a memory controller), an I/O control function 718 and an image processing unit 719. The general purpose processing cores 715 typically execute the operating system and application software of the computing system. The graphics processing units 716 typically execute graphics intensive functions to, e.g., generate graphics information that is presented on the display 703. The memory control function 717 interfaces with the system memory 702 to write/read data to/from system memory 702. The power management control unit 724 generally controls the power consumption of the system 700.
The camera 710 may be implemented as an integrated stacked and/or abutted sensor, memory and processing hardware solution as described at length above.
Each of the touchscreen display 703, the communication interfaces 704-707, the GPS interface 708, the sensors 709, the camera 710, and the speaker/microphone codec 713, 714 can be viewed as various forms of I/O (input and/or output) relative to the overall computing system, including, where appropriate, an integrated peripheral device as well (e.g., the one or more cameras 710). Depending on implementation, various ones of these I/O components may be integrated on the applications processor/multi-core processor 750 or may be located off the die or outside the package of the applications processor/multi-core processor 750.
In an embodiment, one or more cameras 710 include a depth camera capable of measuring depth between the camera and an object in its field of view. Application software, operating system software, device driver software and/or firmware executing on a general purpose CPU core (or other functional block having an instruction execution pipeline to execute program code) of an applications processor or other processor may perform any of the functions described above.
Embodiments of the invention may include various processes as set forth above. The processes may be embodied in machine-executable instructions. The instructions can be used to cause a general-purpose or special-purpose processor to perform certain processes. Alternatively, these processes may be performed by specific hardware components that contain hardwired logic for performing the processes, or by any combination of programmed computer components and custom hardware components.
Elements of the present invention may also be provided as a machine-readable medium for storing the machine-executable instructions. The machine-readable medium may include, but is not limited to, floppy diskettes, optical disks, CD-ROMs, and magneto-optical disks, FLASH memory, ROMs, RAMs, EPROMs, EEPROMs, magnetic or optical cards, propagation media or other type of media/machine-readable medium suitable for storing electronic instructions. For example, the present invention may be downloaded as a computer program which may be transferred from a remote computer (e.g., a server) to a requesting computer (e.g., a client) by way of data signals embodied in a carrier wave or other propagation medium via a communication link (e.g., a modem or network connection).
In the foregoing specification, the invention has been described with reference to specific exemplary embodiments thereof. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope of the invention as set forth in the appended claims. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.
This application is a continuation of U.S. application Ser. No. 15/273,427, filed Sep. 22, 2016, which claims the benefit of U.S. Provisional Application No. 62/234,010, filed Sep. 28, 2015, the contents of each of which are hereby incorporated by reference.