This invention relates to a method and apparatus for providing a combined image and refers particularly, though not exclusively, to such a method and apparatus for providing a combined image from a plurality of images.
Throughout this specification the use of “combined” is to be taken as including a reference to the creation of a panoramic image, as well as a stereoscopic image, lenticular stereoscopic image/video, and video post-production to merge two or more video image streams into a single video stream.
Panoramic images are images covering a wide angle. In conventional photography, panoramic images are usually produced by taking a sequence of successive images that are subsequently joined, or stitched together, to form the combined image. When the images are taken simultaneously using a plurality of cameras, the images are normally displayed separately. For video security cameras, video conferencing, and other similar applications, this means multiple cameras, and multiple displays, must be used for continuous panoramic imaging.
Alternatively or additionally, one or more of the cameras may be a pan/tilt camera. Each pan/tilt camera requires either an operator to move the camera's field of vision, or a servomotor to move the camera. The servomotor may be operated remotely and/or automatically. However, when such a system is used, the camera is covering only a part of its maximum field of view at any one time. The consequence is that the remainder of its maximum field of view is not covered at that time. This is unsatisfactory.
Although wide-angle lenses may be used to reduce the impact of the loss of coverage, the distortion introduced, particularly at higher off-axis angles, is also unsatisfactory. A wide-angle lens also requires a higher resolution image sensor to maintain the same resolution.
In accordance with one aspect of the present invention there is provided a method for providing a combined image from a plurality of images each produced by one of a plurality of cameras each having an image system for taking an image of the plurality of images, the method comprising:
According to another aspect of the invention there is provided a method for providing a combined image from a plurality of images each produced by one of a plurality of cameras each having an image system for taking an image of the plurality of images, the method comprising:
According to a further aspect of the invention there is provided a method for providing a combined image from a plurality of images each produced by one of a plurality of cameras each having an image system for taking an image of the plurality of images, the method comprising:
In accordance with yet another aspect of the invention there is provided a method for providing a combined image from a plurality of images each produced by one of a plurality of cameras, each of the plurality of cameras having an image system for taking an image of the plurality of images, the method comprising:
In accordance with an additional aspect of the invention there is provided a method for providing a combined image from a plurality of images each produced by one of a plurality of cameras each having an image system for taking an image of the plurality of images, the method comprising:
In accordance with a further additional aspect of the invention there is provided a method of producing a combined video image from a plurality of video images each produced by one of a plurality of video cameras each having an image system for taking an image of the plurality of images, the method comprising:
A penultimate aspect of the invention provides a method for providing a combined image from a plurality of images each produced by one of a plurality of cameras each having an image system for taking an image of the plurality of images, the method comprising the steps:
A final aspect of the invention provides apparatus for providing a combined image, the apparatus comprising
Each camera may have a buffer, and they may be in a common body, or may be separate.
In order that the invention may be fully understood and readily put into practical effect, there shall now be described by way of non-limitative example only preferred embodiments of the present invention, the description being with reference to the accompanying illustrative drawings in which:
As shown in
The image sensors 12 in a multiple-camera can either be separate entities as shown in
As compared to a single camera with a mechanical pan/tilt motor, the multiple-camera configuration has the advantage of having no moving parts, which avoids the mechanical failures to which a pan/tilt mechanism is subject. It has the additional benefit of capturing the entire scene all the time, behaving like a wide-angle lens camera, but without the associated distortion and loss of image data, particularly at wide, off-axis angles. Unlike a single wide-angle lens camera, which has a single image sensor, the multiple-camera configuration is scalable to a wider view, and provides higher resolution due to the use of multiple image sensors.
A multiple-camera system can be used with existing cameras and existing video applications, such as video conferencing and web casting applications, on a standard computer. One way for it to work with existing video applications is to disguise a stitcher as a virtual camera (
Most computer operating systems (OS) provide a standard method for their applications to access an attached camera. Typically, every camera has a custom “device driver”, which provides a common interface with which the OS can communicate. In turn, the OS provides a common interface to its applications for them to send queries and commands to the camera. Such a layered architecture provides a standard way for the applications to access the cameras. Using a common driver interface is important for these applications to work independently of the camera vendor. It also enables these applications to continue to function with future cameras, as long as the cameras respect the common driver interface.
The virtual camera 32 does not exist in a physical sense. Instead of providing a video stream from an image sensor, which it lacks, the virtual camera 32 obtains the video streams 34 from other real cameras 30, 31 directly from their device drivers 33 or by using the common driver interface. It then combines and repackages these video streams into a single video stream, which it offers through its own common driver interface 33. A combined camera 32 is a virtual camera, which stitches the input video streams 34 into a combined video stream. As such the virtual camera 32 is a video processor capable of processing one or more input video streams, and of outputting a single video stream.
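By way of illustration only, the behaviour of the virtual camera described above may be sketched as follows. The names `VirtualCamera`, `read_frame`, `left_camera` and `right_camera` are hypothetical and do not appear in this specification, frames are modelled as plain lists of pixel rows rather than driver buffers, and the simple side-by-side join stands in for the stitching, which in practice would also handle the overlapping region.

```python
from typing import Callable, List

# Hypothetical minimal frame type: a list of rows of grey-level pixels.
Frame = List[List[int]]


class VirtualCamera:
    """Presents several frame sources as a single camera (illustrative only)."""

    def __init__(self, sources: List[Callable[[], Frame]]) -> None:
        if not sources:
            raise ValueError("at least one input stream is required")
        self.sources = sources

    def read_frame(self) -> Frame:
        # Pull one frame from every real camera.
        frames = [read() for read in self.sources]
        height = len(frames[0])
        if any(len(f) != height for f in frames):
            raise ValueError("all input frames must have the same height")
        # Naive side-by-side join, row by row; overlap handling is omitted.
        return [sum((f[y] for f in frames), []) for y in range(height)]


def left_camera() -> Frame:   # stand-in for a real camera's driver
    return [[1, 2], [3, 4]]


def right_camera() -> Frame:  # stand-in for a second real camera's driver
    return [[5, 6], [7, 8]]


combined = VirtualCamera([left_camera, right_camera]).read_frame()
```

The calling application needs only `read_frame`, so it has no need to know how many real cameras lie behind the single stream.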
From a video application's 35 perspective, the virtual camera 32 appears as a regular camera, with a wide viewing angle. In this way, the image data from more than one camera 30, 31 can be processed by the virtual camera 32 such that the computer's video application 35 sees it as a single camera. The number of cameras involved is not limited and may be two, three, four, five, six, and so forth. The panorama captured by their combined field of view is not limited and may extend to 360°, and even to a sphere.
As shown in
To achieve real-time performance, the combined virtual camera performs the overlap calculation (45) only once, and assumes that the camera positions remain the same throughout the session.
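The once-only overlap calculation may be illustrated by the following minimal sketch. It assumes two cameras side by side with a purely horizontal overlap, and uses a mean absolute difference search as a stand-in for whatever overlap calculation is actually employed; `estimate_overlap` and `CachedStitcher` are hypothetical names introduced for illustration.

```python
from typing import List, Optional

Frame = List[List[int]]  # rows of grey-level pixels (hypothetical frame type)


def estimate_overlap(left: Frame, right: Frame) -> int:
    """Return the overlap width, in columns, at which the frames match best."""
    width = len(left[0])
    best_width, best_error = 0, float("inf")
    for overlap in range(1, width + 1):
        # Mean absolute difference between the right edge of the left frame
        # and the left edge of the right frame for this candidate width.
        error = sum(
            abs(a - b)
            for l_row, r_row in zip(left, right)
            for a, b in zip(l_row[width - overlap:], r_row[:overlap])
        ) / (overlap * len(left))
        if error < best_error:
            best_width, best_error = overlap, error
    return best_width


class CachedStitcher:
    """Computes the overlap on the first frame pair and reuses it afterwards."""

    def __init__(self) -> None:
        self.overlap: Optional[int] = None
        self.calculations = 0  # counts how often the costly step has run

    def stitch(self, left: Frame, right: Frame) -> Frame:
        if self.overlap is None:  # first frame of the session only
            self.overlap = estimate_overlap(left, right)
            self.calculations += 1
        # Drop the duplicated columns from the right frame and join the rows.
        return [l + r[self.overlap:] for l, r in zip(left, right)]


left = [[10, 20, 30, 40], [11, 21, 31, 41]]
right = [[30, 40, 50, 60], [31, 41, 51, 61]]
stitcher = CachedStitcher()
first = stitcher.stitch(left, right)
second = stitcher.stitch(left, right)  # reuses the cached overlap
```

If a camera is moved during the session the cached value becomes stale, which is why the camera positions are assumed to remain the same.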
Some video applications have format restrictions. For example, H.261-based video conferencing applications accept only CIF and QCIF resolutions. The size and aspect ratio of the resulting combined image are likely to differ from the standard video formats. An additional stage to transform the image to the required format may also be performed, which typically involves scaling and panning.
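As a worked illustration of the scaling step, the sketch below computes how a combined image could be fitted inside CIF (352 x 288) or QCIF (176 x 144) while preserving its aspect ratio. Centring the scaled image in the target frame is an assumption made here for illustration, as are the name `fit_to_format` and the 1280 x 480 panorama size.

```python
from typing import Tuple

CIF = (352, 288)    # width, height in pixels
QCIF = (176, 144)


def fit_to_format(src_w: int, src_h: int,
                  dst_w: int, dst_h: int) -> Tuple[int, int, int, int]:
    """Return (scaled_w, scaled_h, x_offset, y_offset) for an aspect-preserving fit."""
    if min(src_w, src_h, dst_w, dst_h) <= 0:
        raise ValueError("all dimensions must be positive")
    if src_w * dst_h >= dst_w * src_h:
        # Source is relatively wider than the target: width is the limit.
        out_w = dst_w
        out_h = max(1, round(src_h * dst_w / src_w))
    else:
        # Source is relatively taller than the target: height is the limit.
        out_h = dst_h
        out_w = max(1, round(src_w * dst_h / src_h))
    # Centre the scaled image; the remaining border would be left blank.
    return out_w, out_h, (dst_w - out_w) // 2, (dst_h - out_h) // 2


# Hypothetical 1280 x 480 panorama fitted to each format.
cif_placement = fit_to_format(1280, 480, *CIF)
qcif_placement = fit_to_format(1280, 480, *QCIF)
```

For the assumed panorama, the CIF placement is 352 x 132 pixels with a vertical offset of 78 pixels, so more than half of the CIF frame carries no image data; this is one reason alternative presentation styles may be preferred.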
A separate user interface may be provided to the user to enable the selection of different presentation styles. For pan & scan (48), the user can interactively pan the panorama to select a region of interest. Alternatively, automatic panning and switching between styles can be employed at pre-set time intervals. Multiple styles can also be created simultaneously. For example, the horizontal compressed style may be used for recording the video, while the pan & scan may be used for display.
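The pan & scan style may be sketched, under simplifying assumptions, as the extraction of a fixed-width window from the panorama. The function name `pan_and_scan` and its parameters are hypothetical, and the window is clamped so that it never extends beyond the edges of the panorama.

```python
from typing import List

Frame = List[List[int]]  # rows of grey-level pixels (hypothetical frame type)


def pan_and_scan(panorama: Frame, pan_x: int, view_width: int) -> Frame:
    """Return the region of interest that starts at column pan_x."""
    pano_width = len(panorama[0])
    view_width = min(view_width, pano_width)
    # Clamp the requested position so the window stays inside the panorama.
    left = max(0, min(pan_x, pano_width - view_width))
    return [row[left:left + view_width] for row in panorama]


panorama = [list(range(10)), list(range(10, 20))]
view = pan_and_scan(panorama, 3, 4)          # window inside the panorama
clamped = pan_and_scan(panorama, 8, 4)       # request runs past the right edge
```

Automatic panning could then be obtained by calling the same function with a value of `pan_x` that changes at each pre-set time interval.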
By having multiple viewpoints, a perfect stitch may be possible. However, at the overlapping region, double or missing images may result. The problem may be more serious for near objects than for distant objects. For surveillance applications, which mostly involve distant objects, the problems may be reduced. For close-up applications such as, for example, video conferencing, three cameras may be used so that the centre camera has the full picture of the human head and shoulders. Each camera should preferably send thirty frames per second.
For real-time stereoscopy, the virtual camera may perform the stereoscopic image formation such as, for example, by interlacing odd and even rows, and stacking the images for a top-to-bottom stereoscopy. For post-processing of video, the virtual camera may be used to combine or merge video from different cameras; and it may be used for the generation of lenticular stereoscopic image/video.
The virtual camera 32 is able to convert multiple video streams into a single stream in a stereo format by performing interlacing, resizing, and translation. Resizing is preferably performed with proper filtering such as, for example, “Cubic” and “Lanczos” interpolations for upsizing, and “Box” or “Area Filter” for downsizing. The row-interlace stereoscopy format interlaces the stereo pair, with odd rows representing the left eye and even rows representing the right eye. This can be viewed using de-multiplexing equipment that is compatible with standard video signals such as, for example, “Stereographic's SimulEyes”. The virtual camera 32 performs the interlacing, which involves copying pixels, and possibly resizing each line:
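As a minimal sketch only, row interlacing of a stereo pair may be expressed as follows. It assumes that both images already have the same size, so no per-line resizing is shown, and that rows are counted from one, so that the first row is an odd row taken from the left-eye image. The name `row_interlace` is hypothetical.

```python
from typing import List

Frame = List[List[int]]  # rows of grey-level pixels (hypothetical frame type)


def row_interlace(left: Frame, right: Frame) -> Frame:
    """Interlace a stereo pair: odd rows from the left eye, even from the right."""
    if len(left) != len(right):
        raise ValueError("left and right images must have the same height")
    # Index 0 is row 1 (odd, left eye); index 1 is row 2 (even, right eye).
    return [
        list(left[y]) if y % 2 == 0 else list(right[y])  # copy the pixels
        for y in range(len(left))
    ]


left_eye = [[1, 1], [2, 2], [3, 3], [4, 4]]
right_eye = [[10, 10], [20, 20], [30, 30], [40, 40]]
interlaced = row_interlace(left_eye, right_eye)
```

Each eye contributes only half of its rows, so the vertical resolution per eye is halved in exchange for a single stream of standard size.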
The Above-Below stereoscopy format requires vertical resizing and translation of the source images, with the top for the left eye and the bottom for the right eye. In the same way, the Side-by-Side format can also be used. In these cases, the virtual camera 32 performs scaling and translation to combine the two video streams into a single stereo video stream. At the receiving end, a device capable of decoding the selected format can be used to view the stereo pair using stereo glasses.
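The Above-Below combination may be illustrated by the sketch below, which halves the height of each source image and then places the left-eye image above the right-eye image. Averaging each pair of rows is used here as a simple box filter; the function names `halve_height` and `above_below` are hypothetical, and the factor of two is an assumption chosen so that the output has the same height as one source image.

```python
from typing import List

Frame = List[List[int]]  # rows of grey-level pixels (hypothetical frame type)


def halve_height(frame: Frame) -> Frame:
    """Vertically downsize by two, averaging each pair of rows (box filter)."""
    # With an odd number of rows the final, unpaired row is dropped.
    return [
        [(a + b) // 2 for a, b in zip(frame[y], frame[y + 1])]
        for y in range(0, len(frame) - 1, 2)
    ]


def above_below(left: Frame, right: Frame) -> Frame:
    """Stack the resized left-eye image above the resized right-eye image."""
    if len(left) != len(right):
        raise ValueError("left and right images must have the same height")
    # The translation step: the right-eye rows follow the left-eye rows.
    return halve_height(left) + halve_height(right)


left_eye = [[0, 0], [2, 2], [4, 4], [6, 6]]
right_eye = [[10, 10], [20, 20], [30, 30], [40, 40]]
stereo_frame = above_below(left_eye, right_eye)
```

The Side-by-Side format would follow the same pattern with horizontal resizing and a horizontal rather than vertical translation.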
The cameras 10 may be digital still cameras, or digital motion picture cameras.
Whilst there has been described in the foregoing description a preferred embodiment of the present invention, it will be understood by those skilled in the technology that many variations or modifications in details of one or more of design, construction and operation may be made without departing from the present invention.