Various implementations relate generally to a method, an apparatus, and a computer program product for processing of multimedia content.
The rapid advancement in technology related to capturing multimedia content, such as images and videos, has resulted in an exponential increase in the creation of image content. Various devices like mobile phones and personal digital assistants (PDAs) are being configured with image/video capture capabilities, thereby facilitating easy capture of panorama images/videos. The captured images may be subjected to processing based on various user needs. For example, images corresponding to a scene captured from various viewpoints and angles may have a high amount of overlapping image portions. Such images may be processed for a variety of applications, for example, generating a panorama image, generating video content, and the like.
Various aspects of example embodiments are set out in the claims.
In a first aspect, there is provided a method comprising: determining an angle of rotation between a first image and a second image associated with a multimedia content; determining one or more intermediate planes in an overlap region between a first image plane associated with the first image and a second image plane associated with the second image; and projecting at least a portion of the first image associated with the overlap region and at least a portion of the second image associated with the overlap region onto the one or more intermediate planes for processing the first image and the second image.
In a second aspect, there is provided an apparatus comprising at least one processor; and at least one memory comprising computer program code, the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus to at least perform: determine an angle of rotation between a first image and a second image associated with a multimedia content; determine one or more intermediate planes in an overlap region between a first image plane associated with the first image and a second image plane associated with the second image; and project at least a portion of the first image associated with the overlap region and at least a portion of the second image associated with the overlap region onto the one or more intermediate planes for processing the first image and the second image.
In a third aspect, there is provided a computer program product comprising at least one computer-readable storage medium, the computer-readable storage medium comprising a set of instructions, which, when executed by one or more processors, cause an apparatus to at least perform: determine an angle of rotation between a first image and a second image associated with a multimedia content; determine one or more intermediate planes in an overlap region between a first image plane associated with the first image and a second image plane associated with the second image; and project at least a portion of the first image associated with the overlap region and at least a portion of the second image associated with the overlap region onto the one or more intermediate planes for processing the first image and the second image.
In a fourth aspect, there is provided an apparatus comprising: means for determining an angle of rotation between a first image and a second image associated with a multimedia content; means for determining one or more intermediate planes in an overlap region between a first image plane associated with the first image and a second image plane associated with the second image; and means for projecting at least a portion of the first image associated with the overlap region and at least a portion of the second image associated with the overlap region onto the one or more intermediate planes for processing the first image and the second image.
In a fifth aspect, there is provided a computer program comprising program instructions which when executed by an apparatus, cause the apparatus to: determine an angle of rotation between a first image and a second image associated with a multimedia content; determine one or more intermediate planes in an overlap region between a first image plane associated with the first image and a second image plane associated with the second image; and project at least a portion of the first image associated with the overlap region and at least a portion of the second image associated with the overlap region onto the one or more intermediate planes for processing of the first image and the second image.
Various embodiments are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings in which:
Example embodiments and their potential effects are understood by referring to
In an example embodiment, a panorama image may be generated by utilizing a plurality of images associated with a sequence of images. As described herein, the term ‘panorama image’ refers to an image associated with a wider or elongated field of view. A panorama image may include a two-dimensional construction of a three-dimensional (3-D) scene. In some embodiments, the panorama image may provide about a 360-degree view of the scene. The panorama image can be generated by capturing video footage or multiple still images of the scene, as a media capturing device (for example, a camera) is spanned through a range of angles. For example, as illustrated in
For processing of the multimedia content, a multimedia capturing device may be rotated or spanned through angles along a single direction, for example, along a horizontal direction. For example, the multimedia capturing device may be spanned by an angle, such as an angle theta, about the y-axis to capture a plurality of images associated with a sequence of images. As used herein, the ‘sequence of images’ may refer to images that are a part of a scene or a video, and that may be combined together to generate a complete image (or a panorama image) of the scene. In an embodiment, the angle of rotation may be the angle through which the media capturing device is rotated about an axis to capture adjacent images of the plurality of images. The rotation of the multimedia capturing device is illustrated by utilizing
In an embodiment, the angle of rotation may be an angle enclosed between the projections of the two images, for example, the images 102, 104, at a reference point. In an embodiment, the reference point may be referred to as a center of projection (COP). For example, as illustrated in
In an embodiment, if the adjacent images, for example, the images 102 and 104, are captured by a rotation or spanning of the multimedia capturing device by an angle (for example, angle theta) around the optical axis (y-axis), then the images may differ by an angle ‘theta’ at the center of projection. In an embodiment, the angle between the images may be the same as the angle between the projections of the images at the center of projection. For example, the angle between the images 102, 104 is the same as the angle between the projections 108, 110 of the first image 102 and the second image 104 at the COP. In an embodiment, the adjacent images may be stitched in a manner that the angle theta between the images may be close to zero. An advantage of causing the angle theta to be close to zero is that the curving of lines associated with the adjacent images, for example, the images 102, 104, appearing in the processed multimedia content, for example, a panorama image, may be avoided.
In an embodiment, the curving of lines in the processed image may be avoided by determining at least one intermediate plane or view in an overlap region of the adjacent images such that all the combined views are separated by an angle of theta/(n+1) at the COP around the y-axis, where n is the number of intermediate planes, for enabling a smooth transition between the adjacent images. The smooth transition may prevent the curving of lines in the component images of the processed multimedia content, and retain the straight lines. For example, if the number of intermediate planes is two (n=2), and the angle of rotation between the two images associated with the multimedia content is about 30 degrees, then two intermediate planes may be generated such that successive views differ by about 10 degrees, thereby rendering the lines in the processed multimedia content relatively straight. Various example embodiments describing the devices, apparatus and method for generation of processed multimedia content are explained in detail with reference to
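By way of illustration and not limitation, the division of the angle theta among n intermediate planes described above may be sketched as follows; the function name is illustrative only and does not appear in the disclosure:

```python
def intermediate_view_angles(theta_deg, n_planes):
    """Return the angles (relative to the first image plane, in degrees)
    of the n intermediate planes that divide the rotation angle theta
    into n+1 equal portions of theta/(n+1) each at the COP."""
    step = theta_deg / (n_planes + 1)
    return [step * (i + 1) for i in range(n_planes)]

# Example from the text: theta = 30 degrees and n = 2 intermediate
# planes yield successive views separated by 10 degrees.
print(intermediate_view_angles(30.0, 2))  # [10.0, 20.0]
```

With n=3 planes and theta=30 degrees, the same sketch yields views at 7.5, 15.0 and 22.5 degrees, i.e. four portions of theta/4 each.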
The device 200 may include an antenna 202 (or multiple antennas) in operable communication with a transmitter 204 and a receiver 206. The device 200 may further include an apparatus, such as a controller 208 or other processing device that provides signals to and receives signals from the transmitter 204 and receiver 206, respectively. The signals may include signaling information in accordance with the air interface standard of the applicable cellular system, and/or may also include data corresponding to user speech, received data and/or user generated data. In this regard, the device 200 may be capable of operating with one or more air interface standards, communication protocols, modulation types, and access types. By way of illustration, the device 200 may be capable of operating in accordance with any of a number of first, second, third and/or fourth-generation communication protocols or the like. For example, the device 200 may be capable of operating in accordance with second-generation (2G) wireless communication protocols IS-136 (time division multiple access (TDMA)), GSM (global system for mobile communication), and IS-95 (code division multiple access (CDMA)), or with third-generation (3G) wireless communication protocols, such as Universal Mobile Telecommunications System (UMTS), CDMA2000, wideband CDMA (WCDMA) and time division-synchronous CDMA (TD-SCDMA), with 3.9G wireless communication protocol such as evolved-universal terrestrial radio access network (E-UTRAN), with fourth-generation (4G) wireless communication protocols, or the like. As an alternative (or additionally), the device 200 may be capable of operating in accordance with non-cellular communication mechanisms. 
Examples of such non-cellular communication mechanisms include computer networks such as the Internet, local area networks, wide area networks, and the like; short-range wireless communication networks such as Bluetooth® networks, Zigbee® networks, Institute of Electrical and Electronics Engineers (IEEE) 802.11x networks, and the like; and wireline telecommunication networks such as the public switched telephone network (PSTN).
The controller 208 may include circuitry implementing, among others, audio and logic functions of the device 200. For example, the controller 208 may include, but is not limited to, one or more digital signal processor devices, one or more microprocessor devices, one or more processor(s) with accompanying digital signal processor(s), one or more processor(s) without accompanying digital signal processor(s), one or more special-purpose computer chips, one or more field-programmable gate arrays (FPGAs), one or more controllers, one or more application-specific integrated circuits (ASICs), one or more computer(s), various analog-to-digital converters, digital-to-analog converters, and/or other support circuits. Control and signal processing functions of the device 200 are allocated between these devices according to their respective capabilities. The controller 208 thus may also include the functionality to convolutionally encode and interleave messages and data prior to modulation and transmission. The controller 208 may additionally include an internal voice coder, and may include an internal data modem. Further, the controller 208 may include functionality to operate one or more software programs, which may be stored in a memory. For example, the controller 208 may be capable of operating a connectivity program, such as a conventional Web browser. The connectivity program may then allow the device 200 to transmit and receive Web content, such as location-based content and/or other web page content, according to a Wireless Application Protocol (WAP), Hypertext Transfer Protocol (HTTP) and/or the like. In an example embodiment, the controller 208 may be embodied as a multi-core processor such as a dual or quad core processor. However, any number of processors may be included in the controller 208.
The device 200 may also comprise a user interface including an output device such as a ringer 210, an earphone or speaker 212, a microphone 214, a display 216, and a user input interface, which may be coupled to the controller 208. The user input interface, which allows the device 200 to receive data, may include any of a number of devices, such as a keypad 218, a touch display, a microphone or other input device. In embodiments including the keypad 218, the keypad 218 may include numeric (0-9) and related keys (#, *), and other hard and soft keys used for operating the device 200. Alternatively or additionally, the keypad 218 may include a conventional QWERTY keypad arrangement. The keypad 218 may also include various soft keys with associated functions. In addition, or alternatively, the device 200 may include an interface device such as a joystick or other user input interface. The device 200 further includes a battery 220, such as a vibrating battery pack, for powering various circuits that are used to operate the device 200, as well as optionally providing mechanical vibration as a detectable output.
In an example embodiment, the device 200 includes a media capturing element, such as a camera, video and/or audio module, in communication with the controller 208. The media capturing element may be any means for capturing an image, video and/or audio for storage, display or transmission. In an example embodiment, the media capturing element is a camera module 222 which may include a digital camera capable of forming a digital image file from a captured image. As such, the camera module 222 includes all hardware, such as a lens or other optical component(s), and software for creating a digital image file from a captured image. Alternatively or additionally, the camera module 222 may include the hardware needed to view an image, while a memory device of the device 200 stores instructions for execution by the controller 208 in the form of software to create a digital image file from a captured image. In an example embodiment, the camera module 222 may further include a processing element such as a co-processor, which assists the controller 208 in processing image data, and an encoder and/or decoder for compressing and/or decompressing image data. In an embodiment, the controller 208 may be configured to perform the processing of the co-processor. For example, the controller 208 may facilitate the co-processor in processing the image data and in operating the encoder and/or the decoder. The encoder and/or decoder may encode and/or decode according to a JPEG standard format or another like format. For video, the encoder and/or decoder may employ any of a plurality of standard formats such as, for example, standards associated with H.261, H.262/MPEG-2, H.263, H.264, H.264/MPEG-4, MPEG-4, and the like. In some cases, the camera module 222 may provide live image data to the display 216.
In an example embodiment, the display 216 may be located on one side of the device 200 and the camera module 222 may include a lens positioned on the opposite side of the device 200 with respect to the display 216 to enable the camera module 222 to capture images on one side of the device 200 and present a view of such images to the user positioned on the other side of the device 200.
The device 200 may further include a user identity module (UIM) 224. The UIM 224 may be a memory device having a processor built in. The UIM 224 may include, for example, a subscriber identity module (SIM), a universal integrated circuit card (UICC), a universal subscriber identity module (USIM), a removable user identity module (R-UIM), or any other smart card. The UIM 224 typically stores information elements related to a mobile subscriber. In addition to the UIM 224, the device 200 may be equipped with memory. For example, the device 200 may include volatile memory 226, such as volatile random access memory (RAM) including a cache area for the temporary storage of data. The device 200 may also include other non-volatile memory 228, which may be embedded and/or may be removable. The non-volatile memory 228 may additionally or alternatively comprise an electrically erasable programmable read only memory (EEPROM), flash memory, hard drive, or the like. The memories may store any number of pieces of information, and data, used by the device 200 to implement the functions of the device 200.
The apparatus 300 includes or otherwise is in communication with at least one processor 302 and at least one memory 304. Examples of the at least one memory 304 include, but are not limited to, volatile and/or non-volatile memories. Some examples of the volatile memory include, but are not limited to, random access memory, dynamic random access memory, static random access memory, and the like. Some examples of the non-volatile memory include, but are not limited to, hard disks, magnetic tapes, optical disks, programmable read only memory, erasable programmable read only memory, electrically erasable programmable read only memory, flash memory, and the like. The memory 304 may be configured to store information, data, applications, instructions or the like for enabling the apparatus 300 to carry out various functions in accordance with various example embodiments. For example, the memory 304 may be configured to buffer input data comprising multimedia content for processing by the processor 302. Additionally or alternatively, the memory 304 may be configured to store instructions for execution by the processor 302.
An example of the processor 302 may include the controller 208. The processor 302 may be embodied in a number of different ways. The processor 302 may be embodied as a multi-core processor, a single core processor, or a combination of multi-core processors and single core processors. For example, the processor 302 may be embodied as one or more of various processing means such as a coprocessor, a microprocessor, a controller, a digital signal processor (DSP), processing circuitry with or without an accompanying DSP, or various other processing devices including integrated circuits such as, for example, an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), a microcontroller unit (MCU), a hardware accelerator, a special-purpose computer chip, or the like. In an example embodiment, the multi-core processor may be configured to execute instructions stored in the memory 304 or otherwise accessible to the processor 302. Alternatively or additionally, the processor 302 may be configured to execute hard coded functionality. As such, whether configured by hardware or software methods, or by a combination thereof, the processor 302 may represent an entity, for example, physically embodied in circuitry, capable of performing operations according to various embodiments while configured accordingly. For example, if the processor 302 is embodied as two or more of an ASIC, FPGA or the like, the processor 302 may be specifically configured hardware for conducting the operations described herein. Alternatively, as another example, if the processor 302 is embodied as an executor of software instructions, the instructions may specifically configure the processor 302 to perform the algorithms and/or operations described herein when the instructions are executed.
However, in some cases, the processor 302 may be a processor of a specific device, for example, a mobile terminal or network device adapted for employing embodiments by further configuration of the processor 302 by instructions for performing the algorithms and/or operations described herein. The processor 302 may include, among other things, a clock, an arithmetic logic unit (ALU) and logic gates configured to support operation of the processor 302.
A user interface 306 may be in communication with the processor 302. Examples of the user interface 306 include, but are not limited to, an input interface and/or an output user interface. The input interface is configured to receive an indication of a user input. The output user interface provides an audible, visual, mechanical or other output and/or feedback to the user. Examples of the input interface may include, but are not limited to, a keyboard, a mouse, a joystick, a keypad, a touch screen, soft keys, and the like. Examples of the output interface may include, but are not limited to, a display such as a light emitting diode (LED) display, a thin-film transistor (TFT) display, a liquid crystal display (LCD), an active-matrix organic light-emitting diode (AMOLED) display, a microphone, a speaker, ringers, vibrators, and the like. In an example embodiment, the user interface 306 may include, among other devices or elements, any or all of a speaker, a microphone, a display, and a keyboard, touch screen, or the like. In this regard, for example, the processor 302 may comprise user interface circuitry configured to control at least some functions of one or more elements of the user interface 306, such as, for example, a speaker, ringer, microphone, display, and/or the like. The processor 302 and/or user interface circuitry comprising the processor 302 may be configured to control one or more functions of one or more elements of the user interface 306 through computer program instructions, for example, software and/or firmware, stored on a memory, for example, the at least one memory 304, and/or the like, accessible to the processor 302.
In an example embodiment, the apparatus 300 may include an electronic device. Some examples of the electronic device include a communication device, a media capturing device with communication capabilities, a computing device, and the like. Some examples of the communication device may include a mobile phone, a personal digital assistant (PDA), and the like. Some examples of the computing device may include a laptop, a personal computer, and the like. In an example embodiment, the communication device may include a user interface, for example, the UI 306, having user interface circuitry and user interface software configured to facilitate a user to control at least one function of the communication device through use of a display and further configured to respond to user inputs. In an example embodiment, the communication device may include display circuitry configured to display at least a portion of the user interface of the communication device. The display and display circuitry may be configured to facilitate the user to control at least one function of the communication device.
In an example embodiment, the communication device may be embodied as to include a transceiver. The transceiver may be any device or circuitry operating in accordance with software, or otherwise embodied in hardware or a combination of hardware and software. For example, the processor 302 operating under software control, or the processor 302 embodied as an ASIC or FPGA specifically configured to perform the operations described herein, or a combination thereof, thereby configures the apparatus or circuitry to perform the functions of the transceiver. The transceiver may be configured to receive multimedia content. Examples of multimedia content may include audio content, video content, data, and a combination thereof.
In an example embodiment, the communication device may be embodied as to include an image sensor, such as an image sensor 308. The image sensor 308 may be in communication with the processor 302 and/or other components of the apparatus 300. The image sensor 308 may be in communication with other imaging circuitries and/or software, and is configured to capture digital images or to make a video or other graphic media files. The image sensor 308 and other circuitries, in combination, may be an example of the camera module 222 of the device 200.
The components 302-308 may communicate with each other via a centralized circuit system 310 to perform generation of the processed multimedia content. The centralized circuit system 310 may be various devices configured to, among other things, provide or enable communication between the components 302-308 of the apparatus 300. In certain embodiments, the centralized circuit system 310 may be a central printed circuit board (PCB) such as a motherboard, main board, system board, or logic board. The centralized circuit system 310 may also, or alternatively, include other printed circuit assemblies (PCAs) or communication channel media.
In an example embodiment, the processor 302 is configured to, with the content of the memory 304, and optionally with other components described herein, cause the apparatus 300 to process the multimedia content. In an embodiment, the multimedia content may be pre-recorded and stored in the apparatus 300. In another embodiment, the multimedia content may be captured by utilizing the camera module 222 of the device 200, and stored in the memory of the device 200. In yet another embodiment, the apparatus 300 may receive the multimedia content from internal memory such as a hard drive or random access memory (RAM) of the apparatus 300, or from an external storage medium such as a digital versatile disk, compact disk, flash drive, or memory card, or from external storage locations through the Internet, Bluetooth®, and the like. The apparatus 300 may also receive the multimedia content from the memory 304.
In an embodiment, the multimedia content may comprise a plurality of multimedia frames. In an embodiment, the plurality of multimedia frames may include a sequence of image frames. In an embodiment, the plurality of multimedia frames comprises a sequence of video frames. The sequence of multimedia frames may correspond to a single scene of the multimedia content. In an embodiment, the plurality of multimedia frames may correspond to video content captured by the image sensor 308 and stored in the memory 304. It is noted that the terms ‘images’, ‘multimedia frames’ and ‘frames’ are used interchangeably herein and refer to the same entity. In an embodiment, the multimedia content may comprise a plurality of images associated with a sequence of images associated with a wider or elongated field of view of a scene. In another embodiment, the multimedia content may comprise a plurality of frames associated with a video. It will be noted that adjacent video frames may include a very small angular shift/rotation, and accordingly, in an embodiment, the plurality of frames may not be adjacent video frames. For example, the plurality of video frames may comprise video frames having a gap of about 20 video frames therebetween so that the video frames are associated with a substantial angular rotation.
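By way of illustration only, such frame sampling may be sketched as follows; the policy of selecting every 20th frame is an assumption based on the example above, and the function name does not appear in the disclosure:

```python
def sample_frames(frames, gap=20):
    """Select frames separated by `gap` positions from a video so that
    successive selections carry a substantial angular rotation between
    them, as suggested in the text for video input."""
    return frames[::gap]

# Selecting from 100 video frames with a gap of about 20 frames:
print(sample_frames(list(range(100))))  # [0, 20, 40, 60, 80]
```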
In an embodiment, the multimedia content may be captured by a multimedia capturing device, for example, a camera, by spanning the multimedia capturing device in at least one direction. For example, the multimedia capturing device may be spanned in a horizontal direction around an axis, for example, the y-axis, to capture the plurality of images. In an embodiment, the adjacent images, for example a first image and a second image, of the plurality of images may include at least an overlap region, as discussed with reference to
In an example embodiment, the processor 302 is configured to, with the content of the memory 304, and optionally with other components described herein, cause the apparatus 300 to determine an angle of rotation between the first image plane and the second image plane associated with the first image and the second image, respectively, of a sequence of images. As discussed with reference to
In an embodiment, the image features associated with the first image and the second image may be computed based on a corner detection method. As used herein, the corner detection method includes extraction of features associated with the first image and the second image, and inferring contents of the images based on the extracted image features. As used herein, the term ‘corner’ may be defined as an intersection of two edges that may define the boundary between two different objects or the parts of a same object, for example, in an image. In an example embodiment, the corner detection method may include Harris corners method for computing corners. In this embodiment, the image features associated with the first image and the second image may be Harris corner features, which may be computed and arranged in a collection of local feature vectors associated with the first image and the second image, respectively. Each of the feature vectors may be distinctive and invariant to any scaling, rotation and translation of the image. The feature vectors may be utilized for determining distinctive objects in different frames, associated with the first image and the second image.
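By way of illustration and not limitation, a simplified Harris corner response (image gradients, windowed structure tensor, det(M) − k·trace(M)²) may be sketched as follows; the 3x3 box window, the constant k=0.04, and all names are illustrative choices, not details of the disclosure:

```python
import numpy as np

def harris_response(img, k=0.04):
    """Compute a Harris corner response map for a grayscale image:
    high positive values indicate corners (intersections of edges)."""
    img = img.astype(float)
    # Image gradients via central differences (axis 0 = rows, axis 1 = cols).
    iy, ix = np.gradient(img)
    ixx, iyy, ixy = ix * ix, iy * iy, ix * iy

    # 3x3 box filter as a stand-in for the usual Gaussian window.
    def box3(a):
        out = np.zeros_like(a)
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                out += np.roll(np.roll(a, dr, axis=0), dc, axis=1)
        return out / 9.0

    sxx, syy, sxy = box3(ixx), box3(iyy), box3(ixy)
    det = sxx * syy - sxy * sxy
    trace = sxx + syy
    return det - k * trace * trace

# A white square on a black background: the strongest responses
# appear near the four corners of the square.
img = np.zeros((16, 16))
img[4:12, 4:12] = 1.0
r = harris_response(img)
print(r.shape)  # (16, 16)
```

The per-corner responses could then be thresholded and collected into the feature vectors described above.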
In an embodiment, if the multimedia capturing device is spanned in one direction only, for example, around the y-axis, the angles of spanning around the other directions, for example, the angle alpha and the angle beta around the x-axis and z-axis, respectively, are zero. In the present embodiment, the first projection angle between the first image and the second image is determined based at least on the image features associated with the first image and the second image. In particular, the image features of the second image may be projected onto the first image such that the projected image features (represented as x2′, y2′) of the second image are as close as possible to the image features (represented as x1, y1) of the first image. In an embodiment, determination of the first projection angle between the first image and the second image may be performed by minimizing the following equation (1):
E(th) = Σ [(x1i − x2′i)^2 + (y1i − y2′i)^2] (1)
The equation (1) computes the value of the angle ‘theta’ between the first image and the second image which minimizes the difference, in the overlap region, between the image features of the first image and the projected image features of the second image. Herein, the projected image features of the second image, x2′ and y2′, may be computed by using 3-D line equations. For example, the following equation (2) may be utilized for computing the values of the projected image features of the second image:
x2′ = f*(dx − r*cos(th))/(f + r*sin(th));
y2′ = f*y2/(f + r*sin(th)); (2)
In an embodiment, a range of values of dx may be considered (this is the same as considering a range of values of theta), and searched for the minimum of equation (1), which gives an optimal solution for the angle theta. In an embodiment, various minimization techniques may be utilized for minimizing equation (1) instead of exhaustive searching.
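By way of illustration and not limitation, the exhaustive search over theta described above may be sketched as follows. Treating a feature's horizontal coordinate x2 as the ‘dx’ term of equation (2), and the values chosen for the focal length f, the radius r and the search range, are assumptions of this sketch, not details of the disclosure:

```python
import math

def project_feature(x2, y2, th, f, r):
    """Project a second-image feature onto the first image plane using
    the form of equation (2); f is the focal length and r the radius
    from the center of projection. Using x2 as the 'dx' term is an
    assumption made for this illustration."""
    denom = f + r * math.sin(th)
    return (f * (x2 - r * math.cos(th)) / denom, f * y2 / denom)

def E(th, feats1, feats2, f, r):
    """Equation (1): sum of squared differences between first-image
    features and the projected second-image features."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(feats1, feats2):
        xp, yp = project_feature(x2, y2, th, f, r)
        total += (x1 - xp) ** 2 + (y1 - yp) ** 2
    return total

def best_theta(feats1, feats2, f=500.0, r=500.0,
               lo=0.0, hi=math.radians(45.0), steps=451):
    """Exhaustive grid search for the theta minimizing E(th), as the
    text describes; other minimization techniques could be used."""
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return min(grid, key=lambda th: E(th, feats1, feats2, f, r))

# Synthetic check: generate second-image features from a known theta
# (by inverting equation (2)) and verify the search recovers it.
f, r, true_th = 500.0, 500.0, math.radians(10.0)
feats1 = [(10.0, 5.0), (20.0, -7.0), (-15.0, 12.0)]
scale = (f + r * math.sin(true_th)) / f
feats2 = [(x1 * scale + r * math.cos(true_th), y1 * scale)
          for (x1, y1) in feats1]
print(abs(best_theta(feats1, feats2) - true_th) < 1e-6)  # True
```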
In an embodiment, if the multimedia capturing device is spanned in more than one direction, for example, around the x-axis, y-axis and z-axis, then the angles alpha (α) and beta (β) associated with the rotation about the x-axis and z-axis may be compensated. In an embodiment, the angle beta (β) represents an in-plane rotation between the images (or frames) around the z-axis.
In the first image, a plurality of vertical lines is computed by joining image features from the first set of image features. Also, in the second image, a plurality of straight lines may be determined from a matching second set of image features corresponding to the image features of the first set of image features. For example, the combinations of vertical lines (having slopes thereof in the range of 90+/−2 degrees) may be considered in the first image, and the corresponding straight lines in the second image may be determined. In an embodiment, a deviation of the slope of the plurality of straight lines from the corresponding plurality of vertical lines is computed. In an embodiment, an average of the computed deviations is determined to compute the angle beta (β). It will be noted that the vertical straight lines are preserved in a specific type of projective transformation, where the projective transformation is caused only by rotation around the y-axis, and hence the angle beta (β) is computed by using vertical straight lines. The angle alpha (α) around the x-axis may be approximated as a translation along the vertical direction in the spatial domain of the image, and may be computed by exhaustively searching in a specified range. In the present embodiment, −Th<dα<Th, where dα is the vertical translation used instead of the angle α, and Th=H/20, wherein H is the height of the first image.
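By way of illustration and not limitation, the estimation of the angle beta (β) from the slope deviations of near-vertical lines may be sketched as follows; representing each line by its slope angle in degrees is an assumption of this sketch:

```python
def angle_beta(slopes1, slopes2):
    """Estimate the in-plane rotation beta (in degrees) as the average
    deviation of the second image's line slopes from the matching
    near-vertical lines of the first image. Only first-image lines
    with slopes in the range 90 +/- 2 degrees are considered, per the
    procedure described in the text."""
    deviations = [s2 - s1 for s1, s2 in zip(slopes1, slopes2)
                  if abs(s1 - 90.0) <= 2.0]
    return sum(deviations) / len(deviations)

# Two near-vertical lines rotated by 1.5 degrees in the second image;
# the 45-degree line is ignored because it is not near-vertical.
print(angle_beta([89.0, 91.0, 45.0], [90.5, 92.5, 46.0]))  # 1.5
```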
In an example embodiment, the processor 302 is configured to, with the content of the memory 304, and optionally with other components described herein, cause the apparatus 300 to determine one or more intermediate planes between the first image plane and the second image plane in the overlap region for interfacing the first image plane and the second image plane. In an embodiment, the one or more intermediate planes are configured to be separated from the first image plane and the second image plane by an intermediate projection angle that is a function of the first projection angle, theta. For example, the one or more intermediate planes may include three intermediate planes between the first image plane and the second image plane, so that the three intermediate planes divide the angle theta into four portions, each portion having an angular span of theta/4 degrees at the COP. An example embodiment illustrating multiple intermediate image planes is described with reference to
In an embodiment, the at least one intermediate plane comprises at least a portion of the overlap region between the first image and the second image. For example, the at least one intermediate plane may comprise a single intermediate plane separating the first image plane and the second image plane into two halves having an angular span of theta/2 each. In another embodiment, the at least one intermediate plane may comprise multiple intermediate planes, for example, three intermediate planes separating the first image plane and the second image plane into four portions of equal angular spans (for example, theta/4 degrees each). In an embodiment, such multiple intermediate image planes may be considered to be within the overlap region only. In another embodiment, at least one of the multiple intermediate image planes may be considered to be outside the overlap region. For example, the first intermediate image plane of the multiple intermediate image planes may be positioned to extend from the first image plane towards the overlap region. In an embodiment, determining multiple intermediate image planes between the first image plane and the second image plane facilitates reducing the curved lines and effectively avoiding distortion in the processed image.
In an example embodiment, the processor 302 is caused to, with the content of the memory 304, and optionally with other components described herein, to cause the apparatus 300 to project at least a portion of the first image associated with the overlap region and at least a portion of the second image associated with the overlap region onto the one or more intermediate planes. In an embodiment, the projection of at least the portion of the first image and at least the portion of the second image onto the one or more intermediate planes facilitates in processing the first image and the second image. For example, in an embodiment, the projection of at least the portion of the first image and at least the portion of the second image facilitates in determining one or more seams in the one or more intermediate planes. The one or more seams may be utilized in stitching the first image and the second image for generation of the processed image. Alternatively, the first image and the second image may be stitched by blending the projection of at least the portion of the first image and at least the portion of the second image onto the one or more intermediate planes to thereby process the first image and the second image. In an example embodiment, the first image and the second image may be processed for generating a panorama image. In another example embodiment, the first image and the second image may be processed for generating a video content. In an embodiment, the processor 302 is caused to, with the content of the memory 304, and optionally with other components described herein, to cause the apparatus 300 to reduce the perspective distortion occurring in the processed images by making use of the ‘distance from centre of the image’ criterion and appropriately stitching the images. A detailed explanation of utilizing the ‘distance from centre of the image’ criterion is explained with reference to
In an embodiment, the first image and the second image may be warped based on the projection. In an example embodiment, the apparatus 300 is caused to stitch the warped images to generate the processed image. For instance, in an example embodiment, the warped images may be stitched by computing a seam between the images and blending the images across the seam. In an example embodiment, the seam between the images may be determined by a dynamic programming based seam computation method.
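A dynamic-programming based seam computation of the kind mentioned above may be sketched as follows. The helper name `find_seam` is a hypothetical assumption; it operates on a 2-D grid of per-pixel difference costs in the overlap region of the warped images:

```python
def find_seam(cost):
    """Return one column index per row forming the minimal-cost
    top-to-bottom path, moving at most one column between rows."""
    rows, cols = len(cost), len(cost[0])
    acc = [cost[0][:]]  # cumulative cost table
    for r in range(1, rows):
        row = []
        for c in range(cols):
            # Cheapest predecessor among the three cells above.
            row.append(cost[r][c] + min(acc[r - 1][max(0, c - 1):min(cols, c + 2)]))
        acc.append(row)
    # Backtrack from the cheapest cell in the last row.
    seam = [min(range(cols), key=lambda c: acc[-1][c])]
    for r in range(rows - 2, -1, -1):
        lo = max(0, seam[-1] - 1)
        window = acc[r][lo:min(cols, seam[-1] + 2)]
        seam.append(lo + window.index(min(window)))
    seam.reverse()
    return seam
```

The two warped images may then be blended across the returned seam to generate the processed image.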
In some example embodiments, an apparatus such as the apparatus 200 may comprise various components such as means for determining an angle of rotation between a first image and a second image, means for determining one or more intermediate planes in the overlap region between the first image plane and the second image plane; and means for projecting at least a portion of the first image associated with the overlap region and at least a portion of the second image associated with the overlap region onto the one or more intermediate planes to determine one or more seams in the one or more intermediate planes. Such components may be configured by utilizing hardware, firmware and software components. Examples of such means may include, but are not limited to, the processor 302 along with the memory 304, the UI 306, and the image sensor 308.
In an example embodiment, the means for determining the angle of rotation comprises: means for determining the first set of image features associated with the first image and a second set of image features associated with the second image; means for projecting the second set of image features onto the first image plane; and means for minimizing a distance between the projected second set of image features and the first set of image features. Examples of such means may include, but are not limited to, the processor 302 along with the memory 304, the UI 306, and the image sensor 308.
In an example embodiment, the means for determining the one or more intermediate planes comprises: means for computing projections of points in the vicinity of a periphery of the overlap region; and means for joining the points in the vicinity of the periphery in 3-D. In an embodiment, the points in the vicinity of a periphery of the overlap region may comprise extreme points of the overlap region. Examples of means for determining the one or more intermediate planes may include, but are not limited to, the processor 302 along with the memory 304, the UI 306, and the image sensor 308. Some embodiments of panorama generation are further described in
As explained with reference to
In an embodiment, the processed image is generated by combining the plurality of intermediate planes. In particular, the portions of the first image and the second image may be projected onto the intermediate planes. In an embodiment, a seam is determined between the projected portions of the first image and the second image, and the seam may be utilized for stitching the first image and the second image together to thereby form a final processed image. In another embodiment, the portions of the first image and the second image may be projected onto the intermediate planes and stitched together by blending the respective projections of the first image and the second image, to thereby generate the processed image. In an embodiment, the processed image is formed by combining a sequence of intermediate planes between the first image and the second image. For example, the processed image may be generated by combining the non-overlap portions of the first image and the second image with the overlap portion projected onto the intermediate plane. As illustrated in
It is noted that in the present embodiment, the rotation/spanning of the multimedia capturing device in one direction is considered. However, in certain scenarios, the multimedia capturing device may be spanned in more than one direction, for example, around x-axis and z-axis. The rotation around x-axis may be alpha degrees and the rotation about the z-axis may be beta degrees. In such a scenario, the angles alpha and beta may be computed, for example, by utilizing similarity transformations, and compensated by utilizing inverse transformations so that in the panorama image only the angle theta may be utilized for the determination of the at least one intermediate plane and stitching of the intermediate image planes.
In an example embodiment, the plurality of images may be associated with a scene, such that each of the plurality of images may correspond to at least a portion of the scene. As disclosed herein, the plurality of images may include adjacent images such that any two adjacent images of the plurality of images may include a common portion or an overlapping region. For example, the plurality of images may include a first image and a second image having an overlapping region between them. As disclosed herein, the terms ‘first image’ and ‘second image’ refer to successive (or adjacent) images associated with a scene, such that the first image and the second image comprise at least an overlapping region. An example illustrating the first image and the second image is illustrated and described in detail with reference to
In some embodiments, the plurality of images, for example the first image and the second image, may be associated with image features, for example, a first set of image features and a second set of image features, respectively. At block 602, the method 600 includes determining an angle of rotation between a first image and a second image. In an embodiment, the angle of rotation is indicative of an overlap region between the first image and the second image. The angle of rotation is determined based on a threshold distortion of the second set of image features associated with the second image upon being projected onto the first image. In an embodiment, the projections of the second set of image features having minimum distortion with respect to the first set of image features may be utilized for computing the angle of rotation. The angle of rotation is explained in detail with reference to
At block 604, one or more intermediate planes are determined in the overlap region between a first image plane associated with the first image and a second image plane associated with the second image. In an embodiment, the one or more intermediate planes may include a single intermediate plane. In an embodiment, the one or more intermediate planes are configured to be separated from the first image plane and the second image plane by an intermediate projection angle that is a function of the first projection angle, theta. For example, the one or more intermediate planes may include four intermediate planes between the first image plane and the second image plane, so that the four intermediate planes divide the angle theta into five portions, each portion having an angular span of theta/5 degrees at the COP. Various example embodiments illustrating single and multiple intermediate image planes are explained with reference to
In an embodiment, the one or more intermediate planes may be determined by computing projections of points in the vicinity of a periphery of the overlap region. The points may be joined by, for example, a line in 3-D to determine an intermediate plane. In an embodiment, a plurality of intermediate planes may be determined between the first image plane and the second image plane in a manner that the projections of the plurality of intermediate planes at the reference point partition the angle of rotation (for example, the angle theta) between the first image plane and the second image plane into equal angular spans. Moreover, the number of partitions of the angle of rotation is one more than the number of intermediate planes. For example, for five intermediate planes, the number of partitions of the angle of rotation (for example, the angle theta) is equal to six. In another embodiment, the plurality of intermediate planes may be determined between the first image plane and the second image plane by dividing the angle of rotation by an integer indicative of the number of intermediate planes. For example, for ‘n’ intermediate planes between the first image plane and the second image plane, the angle theta may be divided by (n+1) such that each successive intermediate plane may be separated from an adjacent image plane and/or intermediate plane by an angle of theta/(n+1) degrees. Accordingly, a first intermediate plane may be positioned at an angle of theta/(n+1) degrees from the first image plane, a second intermediate plane may be positioned at an angle of theta/(n+1) degrees from the first intermediate plane, and so on.
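The equal partitioning described above can be sketched as a short helper; the name `intermediate_plane_angles` is an illustrative assumption:

```python
def intermediate_plane_angles(theta, n):
    """Angles, measured from the first image plane, of n intermediate
    planes splitting the rotation angle theta into n+1 equal spans."""
    step = theta / (n + 1)
    return [step * (k + 1) for k in range(n)]

# For theta = 30 degrees and five intermediate planes, the angle is
# split into six spans of 5 degrees each.
angles = intermediate_plane_angles(30.0, 5)
```

Each returned angle gives the orientation of one intermediate plane relative to the first image plane.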
At block 606, at least a portion of the first image associated with the overlap region, and at least a portion of the second image associated with the overlap region are projected onto the one or more intermediate planes. In an embodiment, at least the portion associated with the overlap region between the first image and the second image may be determined based on a distance of the portion of the image from a reference point (for example, the center of the respective image). In an embodiment, the portion of the image closer to the center is utilized for projecting onto the intermediate plane. For example, an overlap portion of the first image and the second image may include a vehicle, such that if the image of the rear wheels of the vehicle is taken from the second image, then it is closer to the center of the second image as compared to the image of the rear wheels taken from the first image with respect to the center of the first image. In a similar manner, if the image of the front wheels of the vehicle is taken from the first image, then it is closer to the center of the first image as compared to the image of the front wheels taken from the second image with respect to the center of the second image. While projecting the image of the vehicle onto the intermediate plane, the projection of the rear wheels may be taken from the second image while the projection of the front wheels may be taken from the first image. In an embodiment, the projections of the portions of the image in the overlap region onto the one or more intermediate planes may be stitched together by utilizing one or more seams or by utilizing image blending techniques for generating a processed image, for example, a panorama image of the first image and the second image.
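The ‘distance from centre of the image’ criterion in the vehicle example may be sketched as below; `pick_source` is a hypothetical helper that chooses, for one scene point visible in both images, the image in which the point lies closer to the image centre:

```python
def pick_source(point1, point2, center1, center2):
    """Return which image ("first" or "second") should supply a scene
    point, based on squared distance from each image's centre.
    point1/point2 are the point's pixel coordinates in each image."""
    d1 = (point1[0] - center1[0]) ** 2 + (point1[1] - center1[1]) ** 2
    d2 = (point2[0] - center2[0]) ** 2 + (point2[1] - center2[1]) ** 2
    return "first" if d1 <= d2 else "second"
```

In the vehicle example, the rear wheels lie near the second image's centre and the front wheels near the first image's centre, so each region is sourced accordingly before projection onto the intermediate plane.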
In another embodiment, the projections of the portions of the image in the overlap region onto the one or more intermediate planes may be stitched together by utilizing one or more seams or by utilizing image blending techniques for generating a processed video content.
As disclosed herein with reference to
In an example embodiment, a processing means may be configured to perform some or all of: determining an angle of rotation between a first image and a second image associated with a multimedia content; determining one or more intermediate planes in the overlap region between a first image plane associated with the first image and a second image plane associated with the second image; and projecting at least a portion of the first image associated with the overlap region and at least a portion of the second image associated with the overlap region onto the one or more intermediate planes for processing of the first image and the second image. An example of the processing means may include the processor 302, which may be an example of the controller 208. Another method for generating a processed image is explained in detail with reference to
Referring now to
In an embodiment, the plurality of images may be captured by utilizing a multimedia capturing device. In an embodiment, the multimedia capturing device may be rotated in a direction for capturing images. For example, the multimedia capturing device may capture a first image, and may thereafter be rotated by an angle, also known as the angle of rotation, for capturing the second image. At block 704, the rotation of the second image is determined. In an embodiment, the angle of rotation is determined in 3-D. In an embodiment, the angle of rotation in 3-D is indicative of an overlap region between the first image and the second image.
At block 706, it is determined whether the rotation of the multimedia capturing device (or the second image) in a first direction is non-zero, and the rotation in each of a second direction and a third direction is zero. For example, the multimedia capturing device may be rotated around the y-axis; the rotation of the multimedia capturing device around the y-axis is then non-zero, while the rotation around the x-axis and the z-axis may or may not be zero. In another embodiment, the media capturing device may be rotated in a vertical direction for facilitating the generation of a panorama image. For example, the media capturing device may be moved in a top-to-bottom or a bottom-to-top direction. In the present embodiment, the first direction is about the y-axis, the second direction is about the x-axis and the third direction is about the z-axis. If it is determined at block 706 that the rotation in the first direction is non-zero and the rotation in each of the second direction and the third direction is zero, then the rotation about the y-axis is computed, at block 708. The determination of the angle of rotation between the first image and the second image at block 708 is performed by executing blocks 710-714.
At block 710, a first set of image features associated with the first image, and a second set of image features associated with the second image are determined. In an embodiment, the first set of image features and the second set of image features may be determined based on a corner detection method, for example, the Harris corner detection method. At block 712, the second set of image features may be projected onto a first image plane associated with the first image. At block 714, a distance between the projected second set of image features and the first set of image features may be minimized for determining a minimum angle of rotation between the first image and the second image. In an example embodiment, the distance between the projected second set of image features and the first set of image features may be minimized based on the following equation:
E(θ)=Σ_i [(x_1i−x′_2i)²+(y_1i−y′_2i)²]   (1)
In an example embodiment, the following equation may be utilized for computing the values of the projected image features of the second image:
x′_2=f*(dx−r*cos(θ))/(f+r*sin(θ));
y′_2=f*y_2/(f+r*sin(θ));
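The projection equations and the error E(θ) above may be sketched in Python as follows. The function names are illustrative, and f (focal length), r and dx are parameters assumed from the notation above; note that, per the equations as stated, x′_2 does not depend on x_2:

```python
import math

def project_feature(y2, theta, f, r, dx):
    """Project a second-image feature onto the first image plane for a
    candidate rotation theta (radians), per the equations above."""
    denom = f + r * math.sin(theta)
    x2p = f * (dx - r * math.cos(theta)) / denom
    y2p = f * y2 / denom
    return x2p, y2p

def alignment_error(matches, theta, f, r, dx):
    """E(theta): sum of squared distances between first-image features
    (x1, y1) and projected second-image features."""
    err = 0.0
    for (x1, y1), (_x2, y2) in matches:
        x2p, y2p = project_feature(y2, theta, f, r, dx)
        err += (x1 - x2p) ** 2 + (y1 - y2p) ** 2
    return err
```

Evaluating `alignment_error` over a grid of candidate angles, as in the exhaustive search described earlier, yields the optimal theta.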
In an embodiment, if it is determined at block 716 that the rotation in the first direction (for example, about the y-axis) is non-zero, and the rotation in at least one of the second direction (for example, about the x-axis) and the third direction (for example, about the z-axis) is non-zero, then at block 718, the rotation in the second direction and the third direction is computed and compensated. In an embodiment, the angle beta (β) represents an in-plane rotation between the images (for example, the first image and the second image) or frames around the z-axis.
In the first image, a plurality of vertical lines is computed by joining image features from the first set of image features. Also, in the second image, a plurality of straight lines are determined from a matching second set of image features corresponding to the image features of the first set of image features. For example, the combinations of vertical lines (having slopes thereof in the range of 90+/−2 degrees) are considered in the first image and the corresponding straight lines in the second image may be determined. In an embodiment, a deviation of the slope of each of the plurality of straight lines from the corresponding vertical line is computed. In an embodiment, an average of the computed deviations is determined to compute the angle beta (β). It will be noted that vertical straight lines are preserved in a specific type of projective transformation, namely a projective transformation caused only by rotation around the y-axis, and hence the angle beta (β) is computed by using vertical straight lines. The angle alpha (α) around the x-axis can be approximated as a translation along the vertical direction in a spatial domain in the image and is computed by exhaustively searching in a specified range. In the present embodiment, −Th<dα<Th, where dα is the vertical translation used instead of the angle α, and Th=H/20, wherein H is the height of the first image.
At block 720, one or more intermediate planes may be determined in the overlap region between the first image plane associated with the first image, and the second image plane associated with the second image. In an embodiment, the one or more intermediate image planes may include a single intermediate plane. In an embodiment, the one or more intermediate image planes may include multiple intermediate planes. In an embodiment, the one or more intermediate image planes are configured to divide the angle theta in a manner that the projections of the plurality of intermediate planes at the reference point partition the angle of rotation (for example, the angle theta) between the first image plane and the second image plane into equal angular spans. Moreover, the number of partitions of the angle of rotation is one more than the number of intermediate planes. For example, for five intermediate planes, the number of partitions of the angle of rotation (for example, the angle theta) is equal to six.
At block 722, at least a portion of the first image and at least a portion of the second image associated with the overlap region are projected onto the one or more intermediate planes. In an embodiment, at least the portion associated with the overlap region between the first image and the second image may be determined based on a distance of the portion of the image from a reference point (for example, the center of the image). In an embodiment, the portion of the image closer to the center is utilized for projecting onto the intermediate plane. In an embodiment, the projections of the portions of the image in the overlap region onto the one or more intermediate planes may be stitched together at block 724, for processing the first image and the second image. For example, one or more seams in the one or more intermediate planes may be determined for stitching the portions of the images in the overlap region, and the first image and the second image may be stitched together by utilizing the one or more seams to generate a panorama image.
To facilitate discussion of the method 700 of
Without in any way limiting the scope, interpretation, or application of the claims appearing below, a technical effect of one or more of the example embodiments disclosed herein is to generate panorama images. The disclosed embodiments facilitate generating a panorama image with minimum distortion of straight lines in the component images thereof. Various embodiments facilitate the determination of one or more intermediate image planes in an overlap region of the component images, and the generation of the panorama image by combining the one or more intermediate planes. In an embodiment, the generation of the panorama image facilitates reducing the perspective distortion in the images. In various embodiments disclosed herein, post-processing of the panorama image is precluded since the distortion therein is minimized during the generation of the panorama image. Moreover, various method steps are performed at least in part, or under certain circumstances, automatically, thereby precluding the need for user intervention in generating the panorama image.
Various embodiments described above may be implemented in software, hardware, application logic or a combination of software, hardware and application logic. The software, application logic and/or hardware may reside on at least one memory, at least one processor, an apparatus, or a computer program product. In an example embodiment, the application logic, software or an instruction set is maintained on any one of various conventional computer-readable media. In the context of this document, a “computer-readable medium” may be any media or means that can contain, store, communicate, propagate or transport the instructions for use by or in connection with an instruction execution system, apparatus, or device, such as a computer, with one example of an apparatus described and depicted in
If desired, the different functions discussed herein may be performed in a different order and/or concurrently with each other. Furthermore, if desired, one or more of the above-described functions may be optional or may be combined.
Although various aspects of the embodiments are set out in the independent claims, other aspects comprise other combinations of features from the described embodiments and/or the dependent claims with the features of the independent claims, and not solely the combinations explicitly set out in the claims.
It is also noted herein that while the above describes example embodiments of the invention, these descriptions should not be viewed in a limiting sense. Rather, there are several variations and modifications, which may be made without departing from the scope of the present disclosure as defined in the appended claims.
Number | Date | Country | Kind |
---|---|---|---|
2467/CHE/2012 | Jun 2012 | IN | national |