A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
The present disclosure relates generally to image processing techniques and, more particularly, to methods and systems for partially blurring an image.
In contemporary photography, techniques exist for capturing and processing images in which certain objects in the image are presented clearly (e.g., in focus) while other objects in the background and/or foreground appear blurry (e.g., out-of-focus). In particular, a target portion of the image is presented clearly while other portions in the image are softened using an optical and/or digital blurring process. These techniques result in an image with an emphasized or highlighted point of interest (e.g., a target), which is set apart from other features in the image because it is in focus while the other portions of the image appear out-of-focus.
Producing this type of image generally requires a particular type of camera or specific post-processing algorithms to generate the sharp and blurry portions. Particular cameras that may be used for this technique include those that can optically focus on a target. These include single-lens reflex (SLR) cameras with aperture control, cameras with perspective control lenses (e.g., tilt-and-shift lenses), and light field cameras. Some camera systems, including those in certain smartphones, may also accomplish this blurring effect by capturing digital images with an array of pixels and re-rendering the images using post-processing algorithms to blur pixels outside a desired depth of focus. The algorithms may, for example, capitalize on a smartphone's dual-camera system to capture stereoscopic images from which a depth of focus can be identified for the post-processing blurring algorithms, or alternatively, other camera systems may employ a manual pixel-blurring algorithm applied to a single image. The latter approach, however, results in unnatural blurring that is not aesthetically appealing.
Processing multiple images from different perspectives yields the best results, producing a target in clear resolution while surrounding objects are softened by blurring in a manner that appears more natural to the human eye. However, existing techniques for rendering such partially-blurred images are unsuitable for cameras that move significantly between captured images, such as cameras mounted to movable objects, and particularly for non-stereoscopic cameras (e.g., 2D cameras) mounted to movable objects. Therefore, an improved method of processing an image is desired, where a partially-blurred image can be generated from images captured by cameras mounted on a moving object, such as an unmanned aerial vehicle (UAV).
The present disclosure relates to systems and methods for processing an image having a first set of pixels. In the disclosed embodiments, the method may include generating a depth map of the image, the depth map including a second set of pixel values representative of distances of objects in the image. The method may further include identifying a plurality of different depths at which objects are located in the image based on the depth map, and using the depth map to determine a relative distance between one identified depth (e.g., a “target” depth) in the plurality of different depths and each of the other identified depths in the plurality of different depths. Furthermore, the method may include blurring pixels in the first set of pixels based on each determined relative distance. In some embodiments, at least some pixels in the first set of pixels may be blurred in a descending order from pixels at an identified depth corresponding to the farthest relative distance from the target depth to pixels at an identified depth corresponding to the closest relative distance to the target depth.
Further to the disclosed embodiments, systems are provided for processing an image having a first set of pixels. The system may include a memory having instructions stored therein and one or more processors configured to execute the instructions. The one or more processors may be configured to execute the instructions to generate a depth map of the image, the depth map including a second set of pixel values representative of distances of objects in the image. The one or more processors may further be configured to identify a plurality of different depths at which objects are located in the image based on the depth map, and use the depth map to determine a relative distance between one identified depth in the plurality of different depths and each of the other identified depths in the plurality of different depths. Furthermore, the one or more processors may be configured to blur pixels in the first set of pixels based on each determined relative distance.
In some disclosed embodiments, the present disclosure also relates to a UAV. The UAV may include a propulsion device, a communication device, at least one image capture device, a memory storing instructions, and one or more processors in communication with the communication device and configured to control the UAV. The one or more processors may be configured to execute the instructions to generate a depth map of an image having a first set of pixels, the depth map including a second set of pixel values representative of distances of objects in the image. The one or more processors may further be configured to identify a plurality of different depths at which objects are located in the image based on the depth map, and use the depth map to determine a relative distance between one identified depth in the plurality of different depths and each of the other identified depths in the plurality of different depths. Furthermore, the one or more processors may be configured to blur pixels in the first set of pixels based on each determined relative distance.
In still other disclosed embodiments, the present disclosure relates to a non-transitory computer readable medium storing instructions that, when executed by at least one processor, perform a method for processing an image having a first set of pixels. In the disclosed embodiments, the method may include generating a depth map of the image, the depth map including a second set of pixel values representative of distances of objects in the image. The method may further include identifying a plurality of different depths at which objects are located in the image based on the depth map, and using the depth map to determine a relative distance between one identified depth in the plurality of different depths and each of the other identified depths in the plurality of different depths. Furthermore, the method may include blurring pixels in the first set of pixels based on each determined relative distance.
The following detailed description refers to the accompanying drawings. Wherever possible, the same reference numbers are used in the drawings and the following description to refer to the same or similar parts. While several illustrative embodiments are described herein, modifications, adaptations and other implementations are possible. For example, substitutions, additions or modifications may be made to the components illustrated in the drawings, and the illustrative methods described herein may be modified by substituting, reordering, removing, or adding steps to the disclosed methods. Accordingly, the following detailed description is not limited to the disclosed embodiments and examples. Instead, the proper scope is defined by the appended claims.
Referring to
Movable object 110 may include one or more (e.g., 1, 2, 3, 4, 5, 10, 15, 20, etc.) propulsion devices, such as one or more propulsion assemblies 112 positioned at various locations (for example, top, sides, front, rear, and/or bottom of movable object 110) for propelling and steering movable object 110. Propulsion assemblies 112 may be devices or systems operable to generate forces for sustaining controlled flight. Propulsion assemblies 112 may share or may each separately include or be operatively connected to a power source 115, such as a motor M (e.g., an electric motor, hydraulic motor, pneumatic motor, etc.) or an engine (e.g., an internal combustion engine, a turbine engine, etc.). A power storage device 117 (
Propulsion assemblies 112 and/or rotary components 124 may be adjustable (e.g., tiltable) with respect to each other and/or with respect to movable object 110. Alternatively, propulsion assemblies 112 and rotary components 124 may have a fixed orientation with respect to each other and/or movable object 110. In some embodiments, each propulsion assembly 112 may be of the same type. In other embodiments, propulsion assemblies 112 may be of different types. In some embodiments, all propulsion assemblies 112 may be controlled in concert (e.g., at the same speed and/or angle). In other embodiments, one or more propulsion devices may be independently controlled with respect to, e.g., speed and/or angle.
Propulsion assemblies 112 may be configured to propel movable object 110 in one or more vertical and horizontal directions and to allow movable object 110 to rotate about one or more axes. That is, propulsion assemblies 112 may be configured to provide lift and/or thrust for creating and maintaining translational and rotational movements of movable object 110. For instance, propulsion assemblies 112 may be configured to enable movable object 110 to achieve and maintain desired altitudes, provide thrust for movement in all directions, and provide for steering of movable object 110. In some embodiments, propulsion assemblies 112 may enable movable object 110 to perform vertical takeoffs and landings (i.e., takeoff and landing without horizontal thrust). In other embodiments, movable object 110 may require constant minimum horizontal thrust to achieve and sustain flight. Propulsion assemblies 112 may be configured to enable movement of movable object 110 along and/or about multiple axes.
Payload 114 may include one or more sensors 118. Sensors 118 may include devices for collecting or generating data or information, such as surveying, tracking, and capturing images or video of targets (e.g., objects, landscapes, subjects of photo or video shoots, etc.). Sensors 118 may include one or more image capture devices 113 configured to gather data that may be used to generate images. For example, image capture devices 113 may include photographic cameras, video cameras, infrared imaging devices, ultraviolet imaging devices, x-ray devices, ultrasonic imaging devices, radar devices, etc. Sensors 118 may also or alternatively include sensor devices 119 for range-finding or for capturing visual, audio, and/or electromagnetic signals.
Sensor devices 119 may also or alternatively include devices for measuring, calculating, or otherwise determining the position or location of movable object 110. For instance, sensor devices 119 may include devices for determining the height (i.e., distance above the ground) of movable object 110 and/or the altitude (i.e., with respect to sea level) of movable object 110. Sensor devices 119 may include optical sensors, ultrasonic sensors, barometers, radar systems (e.g., millimeter wave radar), laser systems (e.g., LIDAR, etc.), etc. In some embodiments, movable object 110 may be equipped with multiple sensor devices 119, each operable to generate a different measurement signal. Sensor devices 119 may also or alternatively be or include devices for determining the movements, orientation, and/or location of movable object 110, such as a positioning device 146 for a positioning system (e.g., GPS, GLONASS, Galileo, Beidou, GAGAN, etc.), motion sensors, inertial sensors (e.g., IMU sensors), proximity sensors, image sensors, etc. Sensor devices 119 may also include devices configured to provide data or information relating to the surrounding environment, such as weather information (e.g., temperature, pressure, humidity, etc.), lighting conditions, air constituents, or nearby obstacles (e.g., objects, structures, people, other vehicles, etc.).
Carrier 116 may include one or more devices configured to hold the payload 114 and/or allow the payload 114 to be adjusted (e.g., rotated) with respect to movable object 110. For example, carrier 116 may be a gimbal. Carrier 116 may be configured to allow payload 114 to be rotated about one or more axes, as described below. In some embodiments, carrier 116 may be configured to allow 360° of rotation about each axis to allow for greater control of the perspective of the payload 114. In other embodiments, carrier 116 may limit the range of rotation of payload 114 to less than 360° (e.g., ≤270°, ≤210°, ≤180°, ≤120°, ≤90°, ≤45°, ≤30°, ≤15°, etc.) about one or more of its axes.
In addition to sensors 118 on payload 114, movable object 110 and/or carrier 116 may also include one or more sensors that are not located on payload 114. Alternatively, movable object 110 may include sensors 118 on both payload 114 and other elements of movable object 110.
Information and data obtained from the sensor devices 119 and/or image capture device 113 may be communicated to and stored in non-transitory computer-readable media of memory 136. Non-transitory computer-readable media associated with memory 136 may also be configured to store logic, code and/or program instructions executable by processor 137 or any other processor to perform embodiments of the methods described herein. For example, non-transitory computer-readable media associated with memory 136 may be configured to store computer-readable instructions that, when executed by processor 137, cause the processor to perform a method comprising one or more steps. The method performed by the processor based on the instructions stored in the non-transitory computer readable media may involve processing inputs, such as inputs of data or information stored in the non-transitory computer-readable media of memory 136, inputs received from an external terminal 163, inputs received from sensor devices 119 and/or image capture devices 113 (e.g., received directly or retrieved from memory), and/or other inputs received via communication device 120. The non-transitory computer-readable media may be configured to store sensor data from sensor device 119 and images from image capture devices 113 to be processed by processor 137. The non-transitory computer-readable media may also be configured to transmit sensor data from sensor device 119 and images from image capture devices 113 to the terminal 163 for processing. In some embodiments, the non-transitory computer-readable media can be used to store the processing results produced by processor 137.
Processor 137 may include one or more processors and may embody a programmable processor, e.g., a central processing unit (CPU). Processor 137 may be operatively coupled to memory 136 or another memory device configured to store programs or instructions executable by processor 137 for performing one or more method steps. It is noted that the method steps described herein may be stored as instructions in memory 136 and carried out by processor 137.
In some embodiments, processor 137 may include and/or alternatively be operatively coupled to one or more modules, such as a flight control module 140 and an image processing module 148. Flight control module 140 may be configured to control propulsion assemblies 112 of movable object 110 to adjust the spatial disposition, velocity, and/or acceleration of the movable object 110 with respect to six degrees of freedom (e.g., three translational directions along its coordinate axes and three rotational directions about its coordinate axes). Image processing module 148 may be configured to receive and process images captured from the one or more image capture devices 113 before transmitting processed images to off-board entities (e.g., to terminal 163). Flight control module 140 and image processing module 148 may be implemented in software for execution on processor 137, or may be implemented in hardware and/or software components separate from processor 137 (not shown in the figure). For example, software for implementing at least a portion of the flight control module 140 or image processing module 148 may be stored in memory 136.
Processor 137 may be operatively coupled to communication device 120 and configured to transmit data to and/or receive data from one or more external devices (e.g., terminal 163, other movable objects, and/or other remote controllers). Communication device 120 may be configured to enable communications of data, information, commands, and/or other types of signals between controller 122 and off-board entities. Communication device 120 may include one or more components configured to send and/or receive signals, such as receiver 134, transmitter 132, or transceivers that are configured to carry out one- or two-way communication. Components of communication device 120 may be configured to communicate with off-board entities via one or more communication networks, such as radio, cellular, Bluetooth, Wi-Fi, RFID, wireless local area networks (WLAN), wide area networks (WAN), infrared, point-to-point (P2P) networks, cloud communication, particular wireless protocols, such as, for example, IEEE 802.15.1, IEEE 802.11, and/or other types of communication networks usable to transmit signals indicative of data, information, commands, and/or other signals. For example, communication device 120 may be configured to enable communications with user input devices for providing input for controlling movable object 110 during flight, such as a remote terminal 163. Communication device 120 may also be configured to enable communications with other movable objects.
The components of controller 122 may be arranged in any suitable configuration. For example, one or more of the components of the controller 122 may be located on the movable object 110, carrier 116, payload 114, terminal 163, sensors 118, or an additional external device in communication with one or more of the above. In some embodiments, one or more processors or memory devices may be situated at different locations. These include on movable object 110, carrier 116, payload 114, terminal 163, sensor devices 119, or on another off-board device in communication with one or more of the above, including any suitable combinations thereof, such that any aspect of the processing and/or memory functions performed by the system may occur at one or more of the aforementioned locations. For example, and as shown in
Referring again to
The image 50 may include one or more target objects or features 55 and one or more background objects or features 53. In the context of the disclosed embodiments, an object may be a physical object (e.g., a flower such as in
In the exemplary image 50 of
Further to the disclosed embodiments, at least one target 55 (e.g., flower) may be in the foreground of image 50, while background objects 53 (e.g., mountain 53C, hill 53B, and ground 53A) may be in the background 52 of image 50. To generate a partially-blurred image that highlights target 55 of image 50, the target 55 may be designated prior to capturing the image 50. Alternatively, image 50 may be obtained and the target 55 may be identified afterwards (e.g., from post-processing a previously captured image). The target 55 may be an object or feature in image 50 that will be highlighted or otherwise be presented in-focus, while other objects or features 53, including those in the background 52, are blurred. This results in a partially-blurred image with the target 55 presented in sharp resolution and other features presented as blurred, out-of-focus features. An exemplary process of partially blurring an image 50 that may be used with the disclosed embodiments is described below.
After obtaining the plurality of images to be processed, a depth map 60 is generated (Step 700) from the images. Using the first set of pixel values in each of the plurality of images, a depth map 60 is created including a second set of pixel values representing distances of objects in the plurality of images. In some embodiments, each of the second set of pixels 61 in the depth map 60 corresponds to a respective pixel at the same position in the first set of pixels 51 in image 50. An exemplary depth map 60 is shown in
After the depth map 60 is generated (Step 700), the depths of objects in the image may be identified (Step 315). In one embodiment, the depth map may be analyzed for pixels 61 having similar depths, and pixels 61 with shared or similar depths may be identified as objects in the image. Analysis of the depths may include a statistical analysis of the pixel values, and/or a graphical or numerical representation of the pixel values. In an exemplary embodiment, the pixel values of the depth map may be grouped into a plurality of groups. The groups may include pixel values having shared or similar values. In some embodiments, the groups may be selected based on a statistical analysis of the pixel values, a limit for the total number of groups to be generated, and/or a predetermined sensitivity or range for determining shared or similar pixel values. The pixel values may also be grouped using any suitable process recognized by those skilled in the art.
In one embodiment, the groups may be used to generate a histogram 500, charting the second set of pixel values in the depth map. An exemplary histogram 500 is shown in
Having identified different local peaks in the pixel value data, each peak may be grouped in a different group. The groups may then be used to determine the relative distance between one identified depth in the plurality of different depths (e.g., a target group) and each of the other identified depths in the plurality of different depths (e.g., the other groups). The relative distance between the groups may then be used to carry out a blurring process (Step 600). Represented graphically in the histogram 500, each group may be assigned a different bin 505. Each bin 505 may correspond to a cluster of pixels at a particular depth and representative of at least one object in the image at that particular depth. The bin 505, like the group, may correspond to a single depth or a range of depths. For example, if a peak in the histogram 500 is identified at a depth of 8 meters, then a corresponding bin 505 may be defined as a group of pixels having a depth of 8 meters plus or minus a predetermined amount or percentage. The histogram 500 may then be used to identify different depths for objects in the image 50, where the objects are identified from local maxima in the pixel value data and grouped into different depth groups. For example, where the ground, flower, hill, and mountain in the exemplary image 50 of
In some embodiments, the depths of objects may be identified (Step 315) using the groups generated from the pixel value data of the depth map. One or more groups may be generated before they are identified as an object in Step 315, and one or more objects may correspond to each group. The number of groups identified and used to identify objects will depend on the data presented in the pixel value data (e.g., the number of peaks in data line 501 of histogram 500) and how many different groups the user elects to use to discretize the data. In some instances, objects sharing a common depth may be grouped into a single depth, despite being discrete objects in the image. As will be described below, the number of groups, and correspondingly or alternatively the number of identified depths identified as objects in Step 315, will be used to determine the number of blurring iterations carried out in method 600. Therefore, the more objects identified in Step 315, the greater the computation time. However, more objects will account for more depths and may result in a more realistic blurring.
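By way of illustration only, the grouping of depth-map pixel values around histogram peaks might be sketched as follows. The function names `find_depth_peaks` and `assign_groups`, the 1 meter bin width, the minimum count, and the tolerance are all assumptions introduced for this sketch and are not taken from the disclosure; any suitable statistical method may be substituted.

```python
from collections import Counter

def find_depth_peaks(depths, bin_width=1.0, min_count=2):
    """Return bin centers that are local maxima of a depth histogram.

    bin_width and min_count are illustrative (assumed) parameters.
    """
    # Quantize each depth value to an integer histogram bin index.
    counts = Counter(round(d / bin_width) for d in depths)
    peaks = []
    for idx, count in sorted(counts.items()):
        left = counts.get(idx - 1, 0)
        right = counts.get(idx + 1, 0)
        # Keep bins that are tall enough and not lower than their neighbors.
        if count >= min_count and count > left and count >= right:
            peaks.append(idx * bin_width)
    return peaks

def assign_groups(depths, peaks, tolerance=1.0):
    """Map each depth to its nearest peak, or None if outside the tolerance."""
    if not peaks:
        return [None] * len(depths)
    groups = []
    for d in depths:
        nearest = min(peaks, key=lambda p: abs(p - d))
        groups.append(nearest if abs(nearest - d) <= tolerance else None)
    return groups

# Hypothetical depth-map values (meters) clustered near 5, 8, 10, and 12 m.
sample_depths = [5.0, 5.1, 4.9, 8.0, 8.2, 7.9, 8.1, 10.0, 10.1, 12.0, 12.1, 11.9]
sample_peaks = find_depth_peaks(sample_depths)
sample_groups = assign_groups(sample_depths, sample_peaks)
```

In this sketch, a coarser `bin_width` yields fewer groups and therefore fewer blurring iterations, which mirrors the trade-off between computation time and realism noted above.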
After the depths of different objects in the image have been identified (Step 315) from the pixel values of the depth map using a suitable statistical method and/or graphical or numerical representation of the pixel values, method 300 may proceed by either obtaining a target object (Step 330) or by identifying one particular “target” depth at which objects should remain in focus. Then, the relative distances between each of the other identified depths and the target depth are determined (Step 320). To illustrate, assume the flower is the target object 55 corresponding to a target depth of 8 meters in the 8 meter group of histogram 500, and the other background objects of the ground, hill, and mountain correspond to groups at identified depths of 5, 10, and 12 meters, respectively. In this example, the relative distances, measured as an absolute value of the difference between identified depths, between the target depth of the flower (8 meters) and each of the other identified depths are: 3 meters for the ground (8 meters minus 5 meters), 2 meters for the hill (10 meters minus 8 meters), and 4 meters for the mountain (12 meters minus 8 meters).
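The relative-distance determination of Step 320 may be illustrated with a minimal sketch that uses the flower example, with the flower as the target at 8 meters. The function name `relative_distances` and the dictionary layout are assumptions made for illustration only.

```python
def relative_distances(identified_depths, target_depth):
    """Absolute difference between the target depth and every other depth.

    identified_depths maps an illustrative label to a depth in meters.
    """
    return {
        label: abs(depth - target_depth)
        for label, depth in identified_depths.items()
        if depth != target_depth
    }

# Depths from the example: ground 5 m, flower (target) 8 m, hill 10 m, mountain 12 m.
scene = {"ground": 5, "flower": 8, "hill": 10, "mountain": 12}
distances = relative_distances(scene, target_depth=scene["flower"])
# distances == {"ground": 3, "hill": 2, "mountain": 4}
```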
At Step 330, a target object optionally may be obtained from an external source or from identifying a target in at least one image stored in memory. The target object, for instance, may be target 55 of image 50. Once obtained, the target is associated (Step 335) with an identified depth from Step 315 (e.g., associated with a particular group). This optional branch of method 300 therefore relies on a stored target object or selected target object, either selected prior to obtaining or capturing the plurality of images, or selected after the plurality of images are captured and a target is designated during post-processing. A user from terminal 163 may select the target object using input devices 169, either before or after the images have been captured and/or obtained. Alternatively, rather than obtaining a target object in Step 330, the method 300 may proceed by automatically identifying the depths of objects in Step 315 to determine relative distances between one identified depth in the plurality of different depths and each of the other identified depths. That is, an identified depth may be automatically selected from the pixel value data of the depth map. In either case (e.g., obtaining a target object to associate with a target depth or automatically identifying a target depth), the method 300 proceeds by determining relative distances between the target depth and the other identified depths in the pixel value data (Step 320).
Using the relative distances between identified depths (e.g., between a target and other features), the method 300 proceeds to blur pixels (Step 600) in the first set of pixels based on the relative distances determined in Step 320. For example, assume that pixels at an identified depth of 12 meters are to be blurred in accordance with the disclosed embodiments. In this case, pixel positions having pixel values corresponding to 12 meters (i.e., pixel values in bin 505 for 12 meters) are identified in the depth map 60, and those same pixel positions are blurred in the image 50, e.g., by adding a desired amount of random or pseudo-random noise to the pixel values being blurred in image 50. This blurring process can be repeated for blurring pixels in image 50 corresponding to different depths identified from histogram 500.
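As a hedged illustration of blurring the pixel positions associated with one identified depth, the following sketch adds bounded pseudo-random noise to pixels whose depth-map value falls within an assumed tolerance of a bin center. The function name `blur_depth_bin`, the grayscale list-of-lists image layout, the 0 to 255 value range, and the `tolerance`, `noise`, and `seed` parameters are assumptions for this example only.

```python
import random

def blur_depth_bin(image, depth_map, bin_center, tolerance=0.5, noise=10, seed=0):
    """Return a copy of image with noise added where depth is near bin_center.

    image and depth_map are same-sized 2D lists; values are assumed to be
    8-bit grayscale intensities and depths in meters, respectively.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    out = [row[:] for row in image]
    for r, depth_row in enumerate(depth_map):
        for c, depth in enumerate(depth_row):
            if abs(depth - bin_center) <= tolerance:
                noisy = out[r][c] + rng.randint(-noise, noise)
                out[r][c] = max(0, min(255, noisy))  # clamp to valid range
    return out

# Hypothetical 2x2 image and matching depth map (meters).
example_image = [[100, 120], [140, 160]]
example_depths = [[8.0, 12.0], [12.1, 5.0]]
blurred_example = blur_depth_bin(example_image, example_depths, bin_center=12.0)
```

Only the two positions near 12 meters are modified; the positions at 8 and 5 meters are left unchanged by this pass.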
By way of example and not limitation, consider again the example of the target depth of a flower at 8 meters, and the relative distances between the ground (3 meters away from the flower), hill (2 meters away from the flower), and mountain (4 meters away from the flower). In this example, the mountain's identified depth has the greatest relative distance from the flower's target depth, so pixels in image 50 corresponding to the mountain may be blurred first (as described below). The ground's identified depth is the next-farthest away from the flower, and pixels corresponding to the ground may be blurred second per the blurring process below. Finally, the hill has the closest relative distance to the flower and pixels corresponding to the hill may be blurred after blurring pixels for the mountain and ground in image 50.
In some embodiments, there may be different identified depths that are the same relative distance to the target depth. For example, if the target depth is 8 meters, then bins 505 corresponding to 5 meters and 11 meters would both have a relative distance of 3 meters. In such embodiments, the relative order of these different identified depths relative to each other may be arbitrarily chosen, e.g., such that pixels corresponding to 11 meters are blurred before pixels corresponding to 5 meters, or vice versa. In other embodiments where multiple identified depths have the same relative distance, there may be a preference to blur pixels in a specified order, e.g., from the farthest to nearest depths, as among these identified depths.
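The ordering described above, including one possible tie-breaking preference, might be sketched as follows. The function name `blur_order` is hypothetical, and the choice to break ties from the farthest to the nearest depth is only one of the options contemplated above.

```python
def blur_order(identified_depths, target_depth):
    """Order non-target depths by descending relative distance from the target.

    Ties in relative distance are broken by blurring the deeper
    (farther from the camera) depth first.
    """
    others = [d for d in identified_depths if d != target_depth]
    return sorted(others, key=lambda d: (abs(d - target_depth), d), reverse=True)

# Flower example: mountain (12 m), then ground (5 m), then hill (10 m).
order_example = blur_order([5, 8, 10, 12], target_depth=8)

# Tie example: 5 m and 11 m are both 3 m from the target; 11 m goes first.
order_tie = blur_order([5, 8, 11], target_depth=8)
```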
Once the identified distances are ordered in descending order, pixels in the first set of pixels are blurred iteratively and in the descending order from farthest relative distance to closest relative distance. Further to the disclosed embodiments, blurring occurs in pixels associated with the identified relative distances, as well as in pixels having a farther relative distance, for each iteration. More specifically, pixels in the first set of pixels associated with the first identified relative distance in the order are blurred first (Step 610). During this first blurring iteration (e.g., while blurring pixels at the first identified relative distance in the order), pixels with greater relative distances than the first identified depth in the order are also blurred (Step 615). Therefore, pixels at the relative distance in the order and those having greater relative distance from the target are blurred with each blurring iteration. The pixels having a greater relative distance may include pixels in the background (e.g., having a greater depth than the depth at the given iteration), and/or may include pixels in the foreground (e.g., having a shallower depth but greater relative distance from the depth at the given iteration).
After the first iteration is completed, the blurring process continues by blurring pixels in the next identified relative distance in the order (Step 620). Again, pixels with relative distances greater than the next identified depth in the order are blurred (Step 625) as pixels at the next identified relative distance are blurred. If all identified relative distances in the order have been blurred (Step 630, Yes), the blurring process ends (Step 635). That is, if the blurring iterations have completed, the process may end. If pixels corresponding to a relative distance in the order have not yet been blurred (Step 630, No), the process continues until all identified relative distances in the order are blurred and pixels with greater relative distances than each identified relative distance are blurred (e.g., the process continues until all iterations are complete). This process therefore blurs pixels according to their groupings, and blurs pixels of greater relative distance than the identified relative distance in each iteration. As a result, pixels in the first set of pixels in image 50 may be blurred multiple times using this illustrative iterative process, where the number of times may be determined based on the number of iterations (e.g., number of identified depths) and the relative distance of each pixel compared to the relative distance used in each iteration of the blurring process. This results in a more realistic and progressive blurring of objects in the image, and thus a more realistic representation of the target surrounded by blurred (e.g., out-of-focus) objects. Moreover, by using the groups to identify finite ranges of pixels to blur in an iterative approach, the blurring process repeats a finite number of times and the number of iterations can be controlled using more groups or fewer groups. More groups in the histogram create more blurring iterations, and vice versa.
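The iterative process of Steps 610 through 635 can be sketched, under stated assumptions, as a loop over the identified relative distances in descending order. The function name `iterative_blur` and the per-pixel `blur_pixel` callback are assumptions of this sketch; the callback is a stand-in for whatever blurring operation an embodiment actually applies, and the example callback below merely pulls a value toward mid-gray.

```python
def iterative_blur(image, depth_map, identified_depths, target_depth, blur_pixel):
    """Apply blur_pixel once per iteration to every pixel whose relative
    distance from the target is at least that iteration's relative distance.
    """
    # One iteration per non-target identified depth, farthest first.
    thresholds = sorted(
        (abs(d - target_depth) for d in identified_depths if d != target_depth),
        reverse=True,
    )
    out = [row[:] for row in image]
    for threshold in thresholds:
        for r, depth_row in enumerate(depth_map):
            for c, depth in enumerate(depth_row):
                if abs(depth - target_depth) >= threshold:
                    out[r][c] = blur_pixel(out[r][c])
    return out

# Stand-in blur operation (assumed): move the value halfway toward 128.
def toward_gray(value):
    return (value + 128) // 2

# One pixel per object: ground 5 m, flower 8 m, hill 10 m, mountain 12 m.
demo_image = [[0, 0], [0, 0]]
demo_depths = [[5, 8], [10, 12]]
demo_result = iterative_blur(demo_image, demo_depths, [5, 8, 10, 12], 8, toward_gray)
```

In this sketch the mountain pixel is modified in all three iterations, the ground pixel in two, the hill pixel in one, and the flower pixel not at all.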
In one representative example, and referring to
Next, the determined relative distances may be ordered in descending order, from farthest distance to shortest distance, and pixels corresponding to those relative distances are blurred in that order, together with pixels having greater relative distances with each iteration. Moreover, because three objects and one target were identified from the grouping of pixel value data, the blurring algorithm will proceed with three blurring iterations. Blurring proceeds for pixels having relative distances greater than or equal to each distance in the order. First, blurring is applied to pixels at a relative distance of 4 meters (|12 m−8 m|=4 m). In one embodiment, pixels corresponding to this relative distance may be both behind and in front of the target object (e.g., those at relative distances of 4 meters or greater from the target). This embodiment blurs objects in front of and behind the target object. In other embodiments, the blurring may proceed only for those pixels having greater depths (e.g., pixels behind the target object). This embodiment blurs objects only behind the target object.
Once this iteration is complete, the blurring proceeds through the remaining relative distances in descending order. In the example, pixels associated with relative distances of 3 meters or greater are blurred next (|5 m−8 m|=3 m). Then pixels associated with relative distances of 2 meters or greater are blurred (|10 m−8 m|=2 m). Therefore, the farther an object is relative to the target, the more times its pixels are blurred and therefore the less in focus (e.g., more blurry) the object may appear in the processed image. This is compared to objects closer to the target, which are blurred fewer times and thus are more in focus (e.g., less blurry) than objects farther from the target object. The pixels are blurred iteratively and consecutively to produce a gradual, progressive blurring from the target (presented in sharp focus) to the background and foreground elements (presented as blurred and out-of-focus).
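The arithmetic of this example may be checked with a short sketch. The depths of 12, 5, and 10 meters and the target depth of 8 meters are read from the expressions above. The function name `blur_counts` and the `behind_only` flag, which models the embodiment that blurs only objects behind the target, are hypothetical.

```python
def blur_counts(target_depth, object_depths, behind_only=False):
    """Return the iteration thresholds and the number of blur passes per object."""
    # Relative distances in descending order, one blurring iteration for each.
    thresholds = sorted(
        {abs(depth - target_depth) for depth in object_depths}, reverse=True
    )
    counts = {}
    for depth in object_depths:
        distance = abs(depth - target_depth)
        if behind_only and depth <= target_depth:
            # Objects in front of the target are left sharp in this embodiment.
            counts[depth] = 0
        else:
            # An object is blurred in every iteration whose distance it meets.
            counts[depth] = sum(1 for t in thresholds if distance >= t)
    return thresholds, counts


thresholds, counts = blur_counts(8, [12, 5, 10])
print(thresholds)  # [4, 3, 2]
print(counts)      # {12: 3, 5: 2, 10: 1}
```

Under these assumptions the object at 12 meters is blurred three times, the object at 5 meters twice, and the object at 10 meters once, which matches the progressive effect described above.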
The pixels blurred in each iteration may be blurred to the same degree, or may be blurred progressively based on the relative distance of each pixel. For each iteration of blurring, a progressive blurring may apply greater blurring to pixels with greater relative distances from the target object as compared to those having closer relative distances. This blurs objects farther from the target object more than objects closer to the target for each blurring iteration. Alternatively, for each pixel in the identified relative distance in the order and pixels of greater relative distance, each pixel may be blurred equally and to the same degree in each iteration. The progressive approach requires greater computation time, and may proceed by applying blurring linearly as the pixel distances increase relative to the identified target.
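The difference between uniform and progressive blurring within a single iteration may be sketched as below. The function name `blur_radius`, the uniform radius, and the linear gain `radius_per_meter` are hypothetical values chosen only to illustrate a linear relationship; the disclosure does not specify them.

```python
def blur_radius(relative_distance, iteration_distance, progressive=True,
                uniform_radius=1.0, radius_per_meter=0.5):
    """Blur radius applied to one pixel during one blurring iteration."""
    if relative_distance < iteration_distance:
        # Pixels closer to the target than this iteration's distance are skipped.
        return 0.0
    if progressive:
        # Assumed linear rule: the radius grows with distance from the target.
        return radius_per_meter * relative_distance
    # Uniform rule: every pixel blurred in this iteration gets the same radius.
    return uniform_radius
```

The progressive branch evaluates a per-pixel radius, which is one reason it may take longer to compute than the uniform branch.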
In one embodiment, the pixels corresponding to each identified relative distance and those of greater relative distance (e.g., those pixels for each iteration of blurring) may be grouped as layers in the image. The layers may then be blurred consecutively based on their respective identified relative distance and the identified order. Pixels may be grouped into more than one layer, and the blurring may be applied equally or progressively when blurring each layer. The blurring applied to each layer or each pixel in the first set of pixels may include any known blurring or filtering function in image processing, such as Gaussian blurring, etc.
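As one example of such a filtering function, a one-dimensional Gaussian blur may be sketched as follows. The names `gaussian_kernel` and `blur_row`, the kernel radius, and the edge-clamping rule are assumptions for illustration. A two-dimensional Gaussian blur can be obtained by applying the same kernel along rows and then along columns.

```python
import math


def gaussian_kernel(radius, sigma):
    """Normalized 1D Gaussian weights for offsets -radius..radius."""
    weights = [
        math.exp(-(i * i) / (2.0 * sigma * sigma))
        for i in range(-radius, radius + 1)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def blur_row(values, kernel):
    """Convolve one row of pixel values with the kernel, clamping at the edges."""
    radius = len(kernel) // 2
    last = len(values) - 1
    blurred = []
    for i in range(len(values)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = min(max(i + k - radius, 0), last)  # repeat the edge pixel
            acc += weight * values[j]
        blurred.append(acc)
    return blurred
```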
Referring now to
Once the plurality of images are obtained in Step 705, the method proceeds by extracting pixels corresponding to features in the plurality of images (Step 710). Features may be extracted using a corner detection technique, such as FAST (Features from Accelerated Segment Test), SUSAN, Harris, etc. In one embodiment, features may be extracted using Harris corner detection. Using this technique and as presented in the equation below, a matrix (A) is defined as a structure tensor, where Ix and Iy are the image gradients at each point of an image in the x and y directions, respectively. In the disclosed embodiments, the extracted features may correspond to at least one of an inflection point, one or more points of an object contour, or any other features that may be extracted from images as would be recognized by those skilled in the art.
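The equation for the structure tensor is not reproduced in this text. For reference, the form commonly used in Harris corner detection is given below, where the window weighting function w(u, v) and the summation over window offsets (u, v) are assumptions not stated in this description.

```latex
A = \sum_{u}\sum_{v} w(u,v)
\begin{bmatrix}
I_x^2 & I_x I_y \\
I_x I_y & I_y^2
\end{bmatrix}
```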
Using a response function (Mc), presented below, and a defined threshold Mn, a feature point is determined in the image when Mc is greater than Mn. For the function Mc, λ1 and λ2 are the eigenvalues of matrix A, det(A) is the determinant of matrix A, trace(A) is the trace of matrix A, and κ (kappa) is a tunable sensitivity parameter.
$$M_c = \lambda_1 \lambda_2 - \kappa\,(\lambda_1 + \lambda_2)^2 = \det(A) - \kappa\,\operatorname{trace}^2(A)$$
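A minimal sketch of evaluating Mc for one window of gradient samples is given below. It assumes uniform window weights, and the default value of kappa is an assumption rather than a value taken from the disclosure. The function names are hypothetical.

```python
def harris_response(ix_values, iy_values, kappa=0.05):
    """Mc = det(A) - kappa * trace(A)^2 for one window of gradient samples."""
    # Entries of the structure tensor A, summed over the window.
    sxx = sum(ix * ix for ix in ix_values)
    syy = sum(iy * iy for iy in iy_values)
    sxy = sum(ix * iy for ix, iy in zip(ix_values, iy_values))
    det_a = sxx * syy - sxy * sxy
    trace_a = sxx + syy
    return det_a - kappa * trace_a * trace_a


def is_feature_point(ix_values, iy_values, threshold, kappa=0.05):
    """A feature point is declared when the response exceeds the threshold Mn."""
    return harris_response(ix_values, iy_values, kappa) > threshold
```

With gradients that vary in both directions the response is positive, whereas a window with gradients in only one direction (an edge) gives a negative response and a flat window gives zero.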
Once feature points are extracted for each of the plurality of images in Step 710, the relative positions of the extracted features can be tracked in the plurality of images in Step 715. However, before tracking feature points, the feature points extracted in Step 710 may be filtered in Step 735. If too many feature points are extracted from the images, a filter may be applied to reduce the feature point count. This may be accomplished by adjusting the sensitivity parameter to reduce the number of features extracted. Filtering the extracted features to reduce the count reduces computation time. Once filtered, or if filtering is not elected, like-features that have been extracted are tracked across the plurality of images in Step 715.
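One way the optional filtering of Step 735 might be realized is sketched below: the sensitivity parameter is raised in small steps until the number of retained feature points falls to a chosen limit. The function name `filter_features`, the step size, the upper bound on kappa, and the representation of each candidate as a (det(A), trace(A)) pair are all assumptions for illustration.

```python
def filter_features(candidates, max_count, kappa=0.04, kappa_step=0.02,
                    kappa_max=0.25, threshold=0.0):
    """Raise kappa until at most max_count candidates pass the threshold.

    candidates: list of (det_A, trace_A) pairs, one per candidate point.
    Returns the indices of the retained candidates and the kappa that was used.
    """
    while True:
        kept = [
            index
            for index, (det_a, trace_a) in enumerate(candidates)
            if det_a - kappa * trace_a * trace_a > threshold
        ]
        # Stop when few enough points remain or kappa cannot usefully grow.
        if len(kept) <= max_count or kappa >= kappa_max:
            return kept, kappa
        kappa += kappa_step
```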
The features are tracked in order to calculate movement or optical flow across the plurality of images. Since the plurality of images may be taken or captured from different angles and positions, the features may present differently in each image. Tracking extracted features allows like-features to be identified in different images, from which the depth of each feature in each image can be calculated. When tracking features, KLT (Kanade-Lucas-Tomasi) is one exemplary algorithm that may be employed. For each extracted feature point, the following formula may be used, in which (h) is the displacement of a feature's position F(x) between two images, where the feature's second position is G(x)=F(x+h).
In the embodiment using KLT, the above equation is used in an iterative process to obtain displacement (h) of feature points between consecutive images. The displacement calculated can be confirmed by reversing the calculation once the displacement is known. That is, the displacement may be applied to the position G(x) in the second image to verify that the feature point returns to its original position F(x) in the first image. By satisfying this relationship, the displacement is confirmed and the tracking is verified.
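The tracking formula is not reproduced in this text. The sketch below therefore assumes the standard one-dimensional Lucas-Kanade update, in which h is refined by adding the sum of F'(x+h)[G(x) − F(x+h)] divided by the sum of F'(x+h)², followed by a forward and backward consistency check in the spirit of the verification described above. The function names, the numerical derivative, and the tolerance values are hypothetical.

```python
import math


def estimate_displacement(f, g, xs, iterations=20, eps=1e-4):
    """Iteratively estimate h such that g(x) is approximately f(x + h)."""
    h = 0.0
    for _ in range(iterations):
        numerator = 0.0
        denominator = 0.0
        for x in xs:
            # Central-difference estimate of F'(x + h).
            slope = (f(x + h + eps) - f(x + h - eps)) / (2.0 * eps)
            numerator += slope * (g(x) - f(x + h))
            denominator += slope * slope
        if denominator == 0.0:
            break  # no gradient information in this window
        step = numerator / denominator
        h += step
        if abs(step) < 1e-9:
            break
    return h


def verify_tracking(f, g, xs, tolerance=1e-3):
    """Track forward (F to G) and backward (G to F) and compare the results."""
    forward = estimate_displacement(f, g, xs)
    backward = estimate_displacement(g, f, xs)
    # A consistent track returns the feature to where it started.
    return forward, abs(forward + backward) < tolerance


# Usage with a smooth synthetic signal shifted by a known amount.
signal = lambda x: math.exp(-x * x / 8.0)
shifted = lambda x: signal(x + 0.3)
samples = [i * 0.25 for i in range(-20, 21)]
displacement, confirmed = verify_tracking(signal, shifted, samples)
```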
Once the features are extracted in Step 710 and tracked in Step 715, the method may continue by determining camera pose information for each of the plurality of images (Step 720). That is, the relative 3D position of each extracted feature point is determined in the plurality of images. In one embodiment, bundle adjustment may be employed to determine camera pose for each of the images. Camera pose is determined for each image based on position data associated with each image. In one embodiment, positioning device 146 and sensors 119 provide position and attitude information as image capture device 113 captures each image and moves between each image. The position and attitude of the image capture device 113 (e.g., the camera) are therefore associated with each captured image. With this information in conjunction with bundle adjustment, camera pose information for each of the plurality of images can be determined. In the formula below for bundle adjustment, (n) is the number of 3D feature points in (m) number of images (e.g., (n) number of 3D points capable of being continuously tracked in (m) number of consecutive images).
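The bundle adjustment formula is not reproduced in this text. For reference, a commonly cited form of the reprojection-error objective is shown below. Apart from n and m, every symbol is an assumption: a_j denotes the parameters of camera j, b_i denotes 3D point i, Q(a_j, b_i) is the predicted projection of point i in image j, x_ij is the observed image position, v_ij equals 1 when point i is visible in image j and 0 otherwise, and d is the Euclidean distance.

```latex
\min_{a_j,\, b_i} \; \sum_{i=1}^{n} \sum_{j=1}^{m} v_{ij}\, d\big(Q(a_j, b_i),\, x_{ij}\big)^2
```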
Once camera pose information is determined, the relative distances between the like-features in the plurality of images (e.g., the same features in the plurality of images) can be calculated (Step 725), and relative depth information can be obtained for each extracted pixel to form the depth map in Step 730. In one embodiment, a plane sweeping algorithm can be applied to calculate the depth of each feature. The 3D relative position of each feature is calculated using the bundle adjustment applied in Step 720. Using a plane sweeping algorithm and transforming each feature point onto a projection, relative distances can be calculated for each feature and determined for each pixel. The relative distances may then be calculated using any number of matching measures, including Mean Absolute Differences, Sum of Squared Differences, Sum of Absolute Differences, Normalized Cross Correlation, Sequential Similarity Detection Algorithm, or Sum of Absolute Transformed Differences.
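Several of the listed measures may be sketched for small pixel patches as follows. The helper `best_depth` is a hypothetical simplification of plane sweeping: it assumes that, for each candidate depth, a patch from another image has already been warped onto the corresponding plane, and it selects the depth whose patch best matches the reference patch.

```python
import math


def sad(a, b):
    """Sum of Absolute Differences between two equally sized patches."""
    return sum(abs(x - y) for x, y in zip(a, b))


def mad(a, b):
    """Mean Absolute Differences."""
    return sad(a, b) / len(a)


def ssd(a, b):
    """Sum of Squared Differences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def ncc(a, b):
    """Normalized Cross Correlation; 1.0 indicates a perfect linear match."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    numerator = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    denominator = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    return numerator / denominator if denominator else 0.0


def best_depth(reference_patch, candidates, cost=ssd):
    """Pick the candidate depth whose warped patch has the lowest matching cost.

    candidates: dict mapping a depth hypothesis to its warped patch.
    """
    return min(candidates, key=lambda depth: cost(reference_patch, candidates[depth]))
```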
Once the relative distances are calculated and the relative depths are determined for each pixel, the depth map 60 can be generated (Step 700). From the depth map 60, groups of pixels corresponding to objects may be identified (Step 315) using a statistical process that may include grouping the pixel value data of the depth map. Alternatively, the depth map 60 may be refined (Step 740) prior to identifying the depths of objects in the image (Step 315). The depth map may be refined in Step 740 to sharpen features detected in Step 710 using Kruskal's algorithm, a minimum-spanning-tree algorithm. In one embodiment, a cost function according to the following equation may be utilized when defining the minimum-spanning-tree.
Using the minimum-spanning-tree algorithm, pixels of similar depths can be determined and the depth map may be refined to sharpen each feature such that features are more pronounced in the pixel value data of the depth map.
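Kruskal's algorithm itself may be sketched as below using a union-find structure. Because the cost function of the disclosure is not reproduced in this text, the sketch assumes, purely for illustration, that each edge between 4-connected neighboring pixels is weighted by the absolute difference of their values. The function names are hypothetical.

```python
def kruskal_mst(num_nodes, edges):
    """Return the minimum-spanning-tree edges from a list of (weight, u, v)."""
    parent = list(range(num_nodes))

    def find(node):
        # Follow parent links to the set representative, halving the path.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    tree = []
    for weight, u, v in sorted(edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # the edge joins two separate components
            parent[root_u] = root_v
            tree.append((weight, u, v))
    return tree


def grid_edges(values):
    """4-connected edges of a 2D grid, weighted by absolute value difference."""
    rows, cols = len(values), len(values[0])
    edges = []
    for r in range(rows):
        for c in range(cols):
            node = r * cols + c
            if c + 1 < cols:
                edges.append((abs(values[r][c] - values[r][c + 1]), node, node + 1))
            if r + 1 < rows:
                edges.append((abs(values[r][c] - values[r + 1][c]), node, node + cols))
    return edges
```

In such a tree, pixels of similar value are joined by low-cost edges, and the few high-cost edges that remain tend to fall along object boundaries.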
In addition to sharpening features, the depth map may be refined by removing distortion from the depth map. Based on the 3D information of each pixel of the plurality of images, transformations can be applied to each point in the depth map to eliminate unnatural perspective or other distortion created when capturing images using a 2D camera. The transformation adjusts the depths of the pixels in the depth map according to a selected focal point and a projection plane. This may be accomplished by converting the depth map into a 3D point cloud, applying a transformation matrix to the point cloud (e.g., performing one or more rotations to every point in the point cloud), and projecting the point cloud to remove distortion. This allows the blurring to be performed from a selected perspective among the plurality of images, and eliminates unnatural distortion of objects in the blurred image.
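The three operations described above (conversion to a point cloud, application of a transformation, and projection) may be sketched as follows. The sketch assumes a pinhole camera model with a focal length `focal` and principal point (`cx`, `cy`) expressed in pixels, and it uses a single rotation about the vertical axis as the transformation. These parameters and function names are assumptions and do not appear in the disclosure.

```python
import math


def depth_map_to_points(depth_map, focal, cx, cy):
    """Back-project each pixel (u, v) with depth z to a 3D point (x, y, z)."""
    points = []
    for v, row in enumerate(depth_map):
        for u, z in enumerate(row):
            points.append(((u - cx) * z / focal, (v - cy) * z / focal, z))
    return points


def rotate_about_y(points, angle_rad):
    """Apply a rotation about the vertical (y) axis to every point."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [(c * x + s * z, y, -s * x + c * z) for x, y, z in points]


def project(points, focal, cx, cy):
    """Project 3D points to (u, v, depth); points not in front of the camera are dropped."""
    return [
        (focal * x / z + cx, focal * y / z + cy, z)
        for x, y, z in points
        if z > 0
    ]
```

The depth component returned by `project` gives the adjusted depth of each point as seen from the selected perspective, which may then be used for the relative-distance calculations.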
The distorted images here may include images captured using tilt-shift photography, which is often used for simulating a miniature scene. Removing distortion includes rotation and transformation of the pixels of the distorted images.
Once the pixels have been blurred based on the pixel value data of the depth map and the relative distances from the target object, a processed image 70 may be generated. Referring to
In addition, the processed image 70 may include a partially-blurred 2D image from a selected perspective, or the processed image 70 may include a partially-blurred 3D image that blurs features based on a selectable perspective that can be chosen and altered by a viewing user. 3D images provide a movable view of the target 55 within a selected range. The target can be viewed from different perspectives or vantage points and the image is dynamically rotated about the target 55. The partially-blurred portions of the 3D processed image update with each perspective available to the user to correspond with the changing feature depths as the user's perspective changes relative to the target.
Referring now to
After a target is obtained in Step 905, the image capture device 113 tracks the target (Step 910). Target tracking may be accomplished by an image capture device 113 on movable object 110, where the flight control module 140 or 182 controls movable object 110 and image capture device 113 maintains the obtained target in its field of view. The target may be tracked as movable object 110 and/or image capture device 113 are moved. Therefore, image capture device 113 views the obtained target from one or more perspectives as it moves.
As the target is tracked in Step 910, image capture device 113 obtains images of the target in Step 915. The image capture device 113 may obtain several images and continue capturing images as it moves relative to and tracks the target. With each image, position information is obtained in Step 920 and is associated with each image. This allows camera pose data to be determined during post-processing. Position data may be obtained using sensors 119, positioning device 146, or any other means of obtaining the position and attitude of image capture device 113 as it obtains images of the target.
In order to generate the partially-blurred image and/or 3D image, images from more than one position may be necessary if image capture device 113 includes a 2D camera or monocular lens. Therefore, after each image is captured in Step 915 and its position information is obtained in Step 920, the image capture device 113 moves (Step 925) before capturing a subsequent image. The image capture device 113 may move on its gimbal, or movable object 110 carrying image capture device 113 may move between image captures. This allows a first image to be captured at a first location, and a second image captured at a second location, and so on. The plurality of images are then post-processed according to the method 300 outlined in
Computer programs based on the written description and methods of this specification are within the skill of a software developer. The various programs or program modules may be created using a variety of programming techniques. For example, program sections or program modules may be designed in or by means of Java, C, C++, assembly language, or any such programming languages. One or more of such software sections or modules may be integrated into a computer system, non-transitory computer readable media, or existing communications software.
While illustrative embodiments have been described herein, the scope includes any and all embodiments having equivalent elements, modifications, omissions, combinations (e.g., of aspects across various embodiments), adaptations or alterations based on the present disclosure. The elements in the claims are to be interpreted broadly based on the language employed in the claims and not limited to examples described in the present specification or during the prosecution of the application, which examples are to be construed as non-exclusive. Further, the steps of the disclosed methods may be modified in any manner, including by reordering steps or inserting or deleting steps. It is intended, therefore, that the specification and examples be considered as exemplary only, with the true scope and spirit being indicated by the following claims and their full scope of equivalents.
This application is a continuation application of International Application No. PCT/CN2017/085757, filed on May 24, 2017, the entire contents of which are incorporated herein by reference.
Relationship | Number | Date | Country
---|---|---|---
Parent | PCT/CN2017/085757 | May 2017 | US
Child | 16678395 | | US