As the technology of handheld electronic devices improves, various types of functionality are being combined into a single device, and the form factor of these devices is becoming smaller. These devices may have extensive processing power, virtual keyboards, wireless connectivity for cell phone and Internet service, and cameras, among other things. Cameras in particular have become popular additions, but the cameras included in these devices are typically limited to low resolution snapshots and short video sequences. The small size, small weight, and portability requirements of these devices prevent many of the more sophisticated uses for cameras from being included. For example, 3D photography can be enabled by taking two pictures of the same object from physically separated locations, thus giving a slightly different visual perspective of the same scene. Stereo imaging algorithms of this kind typically require accurate knowledge of the relative geometry of the two positions from which the two pictures are taken. In particular, the distance separating the two camera positions and the convergence angle of the optical axes are essential for extracting depth information from the images. Conventional techniques typically require two cameras taking simultaneous pictures from rigidly fixed positions with respect to each other, which can require a costly and cumbersome setup. This approach is impractical for small and relatively inexpensive handheld devices.
Some embodiments of the invention may be understood by referring to the following description and accompanying drawings that are used to illustrate embodiments of the invention. In the drawings:
In the following description, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In other instances, well-known circuits, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
References to “one embodiment”, “an embodiment”, “example embodiment”, “various embodiments”, etc., indicate that the embodiment(s) of the invention so described may include particular features, structures, or characteristics, but not every embodiment necessarily includes the particular features, structures, or characteristics. Further, some embodiments may have some, all, or none of the features described for other embodiments.
In the following description and claims, the terms “coupled” and “connected,” along with their derivatives, may be used. It should be understood that these terms are not intended as synonyms for each other. Rather, in particular embodiments, “connected” is used to indicate that two or more elements are in direct physical or electrical contact with each other. “Coupled” is used to indicate that two or more elements co-operate or interact with each other, but they may or may not be in direct physical or electrical contact.
As used in the claims, unless otherwise specified, the use of the ordinal adjectives “first”, “second”, “third”, etc., to describe a common element merely indicates that different instances of like elements are being referred to, and is not intended to imply that the elements so described must be in a given sequence, either temporally, spatially, in ranking, or in any other manner.
Various embodiments of the invention may be implemented in one or any combination of hardware, firmware, and software. The invention may also be implemented as instructions contained in or on a computer-readable medium, which may be read and executed by one or more processors to enable performance of the operations described herein. A computer-readable medium may include any mechanism for storing information in a form readable by one or more computers. For example, a computer-readable medium may include a tangible storage medium, such as but not limited to read only memory (ROM); random access memory (RAM); magnetic disk storage media; optical storage media; a flash memory device, etc.
Various embodiments of the invention enable a single camera to derive three dimensional (3D) information for one or more objects by taking two pictures of the same general scene from different locations at different times, moving the camera to a different location between pictures. Linear motion sensors may be used to determine how far the camera has moved between pictures, thus providing a baseline for the separation distance. Angular motion sensors may be used to determine the change in direction of the camera, thus providing the needed convergence angle. While such position and angular information may not be as accurate as what is possible with two rigidly mounted cameras, the accuracy may be sufficient for many applications, and the reduction in cost and size over that more cumbersome approach can be substantial.
Motion sensors may be available in various forms. For example, three linear motion accelerometers, at orthogonal angles to each other, may provide acceleration information in three dimensional space, which may be converted to linear motion information in three dimensional space, and that in turn may be converted to positional information in three dimensional space. Similarly, angular motion accelerometers may provide rotational acceleration information about three orthogonal axes, which can be converted into a change in angular direction in three dimensional space. Accelerometers with reasonable accuracy may be made fairly inexpensively and in compact form factors, especially if they only have to provide measurements over short periods of time.
Information derived from the two pictures may be used in various ways, such as but not limited to:
1) Camera-to-object distance for one or more objects in the scene may be determined.
2) The camera-to-object distance for multiple objects may be used to derive a layered description of the relative distances of the objects from the camera and/or from each other.
3) By taking a series of pictures of the surrounding area, a 3D map of the entire area may be constructed automatically. Depending on the long-term accuracy of the linear and angular measurement devices, this might enable a map of a geographically large area to be produced simply by moving through the area and taking pictures, provided each picture has at least one object in common with at least one other picture, so that the appropriate triangulation calculations may be made.
One technique for measuring motion is to use accelerometers coupled to the camera in a fixed orientation with respect to the camera. Three linear accelerometers, each with its measurement axis in parallel with a different one of the three axes X, Y, and Z, can detect linear acceleration in the three dimensions as the camera is moved from one location to another. Assuming the initial velocity and position of the camera are known (such as starting from a standstill at a known location), the acceleration detected by the accelerometers can be used to calculate velocity along each axis, which can in turn be used to calculate a change in location at a given point in time. Because the force of gravity may be detected as acceleration in the vertical direction, this may be subtracted out of the calculations. If the camera is not in a level position during a measurement, the X and/or Y accelerometer may detect a component of gravity, and this may also be subtracted out of the calculations.
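By way of illustration only, the conversion from acceleration to velocity to change in location described above might be sketched as follows. The function name `integrate_displacement`, the fixed sample interval `dt`, and the `rest_reading` parameter (the reading the three sensors report while the camera is stationary, which carries the gravity component on each axis) are assumptions introduced for this sketch and are not part of the described embodiments. The sketch also assumes the camera keeps the same orientation throughout the move.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def integrate_displacement(
    samples: Sequence[Vec3],
    dt: float,
    rest_reading: Vec3 = (0.0, 0.0, 9.81),
) -> Vec3:
    """Return the (x, y, z) displacement in metres after the last sample.

    Assumes the camera starts from a standstill and keeps a fixed
    orientation, so the gravity contribution on each axis equals
    rest_reading for every sample (an assumption of this sketch).
    """
    velocity = [0.0, 0.0, 0.0]
    position = [0.0, 0.0, 0.0]
    for sample in samples:
        for axis in range(3):
            # Subtract the gravity component seen on this axis.
            accel = sample[axis] - rest_reading[axis]
            # Constant-acceleration update over one sample interval.
            position[axis] += velocity[axis] * dt + 0.5 * accel * dt * dt
            velocity[axis] += accel * dt
    return (position[0], position[1], position[2])


# Hypothetical usage: 1 m/s^2 along X for one second, sampled at 100 Hz.
samples = [(1.0, 0.0, 9.81)] * 100
displacement = integrate_displacement(samples, dt=0.01)
```

In this usage the X displacement comes out at approximately 0.5 m, which matches the closed-form result for constant acceleration from rest.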
Similarly, three angular accelerometers, each with its rotational axis in parallel with a different one of the three axes X, Y, and Z, can be used to detect rotational acceleration of the camera in three dimensions (i.e., the camera can be rotated to point in any direction), independently of the linear motion. This can be converted to angular velocity and then angular position.
Because a slight error in measuring acceleration may result in a continuously increasing error in velocity and position, periodic calibration of the accelerometers may be necessary. For example, if the camera is assumed to be stationary when the first picture is taken, the accelerometer readings at that point in time may be assumed to represent a stationary camera, and only changes from those readings will be interpreted as an indication of motion.
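One possible form of this calibration, offered only as a sketch, is to average the readings captured while the first picture is taken and to treat that average as the stationary reference, so that only departures from it are interpreted as motion. The names `stationary_offset` and `motion_only` are hypothetical, and averaging several readings rather than using a single one is an assumption of the sketch.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def stationary_offset(readings: Sequence[Vec3]) -> Vec3:
    """Average the readings taken while the camera is assumed stationary."""
    if not readings:
        raise ValueError("at least one stationary reading is required")
    count = len(readings)
    return (
        sum(r[0] for r in readings) / count,
        sum(r[1] for r in readings) / count,
        sum(r[2] for r in readings) / count,
    )


def motion_only(sample: Vec3, offset: Vec3) -> Vec3:
    """Return only the change from the stationary reference reading."""
    return (
        sample[0] - offset[0],
        sample[1] - offset[1],
        sample[2] - offset[2],
    )


# Hypothetical usage: two readings captured while the first picture is taken.
offset = stationary_offset([(0.1, 0.0, 9.8), (0.3, 0.0, 9.8)])
corrected = motion_only((1.2, 0.0, 9.8), offset)
```

Because the stationary reference also contains whatever gravity component each axis sees at that moment, subtracting it removes both sensor offset and gravity, provided the camera orientation does not change appreciably afterward.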
Other techniques may be used to detect movement. For example, a global positioning system (GPS) may be used to locate the camera at any given time, with respect to earth coordinates, and location information for different pictures may therefore be determined directly. An electronic compass may be used to determine the direction in which the camera is pointed at any given time, also with respect to earth coordinates, and the directional information of the optical axis for different pictures may be determined directly from the compass. In some embodiments, the user may be required to level the camera to the best of his/her ability when taking pictures (for example, a bubble level or an indication from an electronic tilt sensor may be provided on the camera), to reduce the number of linear sensors down to two (X and Y horizontal sensors) and reduce the number of directional sensors down to one (around the vertical Z axis). If an electronic tilt sensor is used, it may provide leveling information to the camera to prevent a picture from being taken if the camera is not level, or provide correction information to compensate for a non-level camera when the picture is taken. In some embodiments, positional and/or directional information may be entered into the camera from external sources, such as by the user or by a local locator system that determines this information by methods outside the scope of this document, and wirelessly transmits that information to the camera's motion detection system. In some embodiments, visual indicators may be provided to assist the user in rotating the camera in the right direction. For example, an indicator in the view screen (e.g., arrow, circle, skewed box, etc.) may show the user which direction to rotate the camera (left/right and/or up/down) to visually acquire the desired object in the second picture. 
In some embodiments, combinations of these various techniques may be used (e.g., GPS coordinates for linear movement and angular accelerometer for rotational movement). In some embodiments, the camera may have multiple ones of these techniques available to it, and the user or the camera may select from the available techniques and/or may combine multiple techniques in various ways, either automatically or through manual selection.
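Where GPS coordinates and compass headings are used as described above, the separation between the two camera locations and the change in pointing direction might be obtained along the following lines. This is a sketch only. The function names, the spherical-earth radius constant, and the flat-earth (equirectangular) approximation for short baselines are all assumptions introduced here, and the sketch does not address whether a given GPS receiver is accurate enough for a particular baseline.

```python
import math
from typing import Tuple

EARTH_RADIUS_M = 6371000.0  # mean earth radius; an assumption of this sketch


def baseline_from_gps(
    lat1_deg: float, lon1_deg: float, lat2_deg: float, lon2_deg: float
) -> Tuple[float, float]:
    """Return the (east, north) displacement in metres between two fixes.

    Uses an equirectangular approximation, which is reasonable only for
    short distances between the two camera locations.
    """
    lat1 = math.radians(lat1_deg)
    lat2 = math.radians(lat2_deg)
    mean_lat = (lat1 + lat2) / 2.0
    east = math.radians(lon2_deg - lon1_deg) * math.cos(mean_lat) * EARTH_RADIUS_M
    north = (lat2 - lat1) * EARTH_RADIUS_M
    return (east, north)


def heading_change_deg(heading1_deg: float, heading2_deg: float) -> float:
    """Return the signed change in compass heading, wrapped to [-180, 180)."""
    return (heading2_deg - heading1_deg + 180.0) % 360.0 - 180.0


# Hypothetical usage: the camera swings from 350 degrees to 10 degrees.
turn = heading_change_deg(350.0, 10.0)
```

Wrapping the heading difference keeps a small rotation across north from being reported as a rotation of nearly a full circle.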
As can be seen, in this example neither of the objects is directly in the center of either picture, but the direction of each object from the camera may be calculated, based on the camera's optical axis and where the object appears in the picture with respect to that optical axis.
Thus, the direction of each object from each camera location may be calculated, by taking the direction the camera is pointing and adjusting that direction based on the placement of the object in the picture. It is assumed in this description that the camera uses the same field of view for both pictures (e.g., no zooming between the first and second pictures) so that an identical position in the images of both pictures will provide the same angular difference. If different fields of view are used, it may be necessary to use different conversion values to calculate the angular difference for each picture. But if the object is aligned with the optical axis in both pictures, no off-center calculations may be necessary. In such cases, an optical zoom between the first and second pictures may be acceptable, since the optical axis will be the same regardless of the field of view.
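As one illustration of adjusting the camera direction by the placement of the object in the picture, the sketch below converts a horizontal pixel position into an angular offset from the optical axis. It assumes an ideal pinhole camera with no lens distortion, a known horizontal field of view, and compass-style headings in which an object to the right of center has a larger heading. These assumptions, and the name `object_direction_deg`, are introduced for the sketch only.

```python
import math


def object_direction_deg(
    camera_heading_deg: float,
    pixel_x: float,
    image_width: float,
    horizontal_fov_deg: float,
) -> float:
    """Return the heading of an object seen at pixel_x, in degrees [0, 360).

    Assumes an ideal pinhole camera whose optical axis passes through the
    horizontal center of the image.
    """
    half_width = image_width / 2.0
    # Focal length expressed in pixels, derived from the field of view.
    focal_px = half_width / math.tan(math.radians(horizontal_fov_deg) / 2.0)
    # Positive offset means the object is to the right of the optical axis.
    offset_px = pixel_x - half_width
    offset_deg = math.degrees(math.atan2(offset_px, focal_px))
    return (camera_heading_deg + offset_deg) % 360.0


# Hypothetical usage: object at the right edge of a 1000-pixel-wide image
# taken with a 60 degree field of view while the camera points at 90 degrees.
direction = object_direction_deg(90.0, 1000.0, 1000.0, 60.0)
```

If the field of view differs between the two pictures, the same function may be called with a different `horizontal_fov_deg` for each picture, which corresponds to the different conversion values mentioned above.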
Various embodiments may also have other features, instead of or in addition to the features described elsewhere in this document. For example, in some embodiments, the camera may not enable a picture to be taken unless the camera is level and/or steady. In some embodiments, the camera may automatically take the second picture once the user moves the camera to a nearby second location and the camera is level and steady. In some embodiments, several different pictures may be taken at each location, each one centered on a different object, before moving to the second location and taking object-centered pictures of the same objects. Each pair of pictures of the same object may be treated in the same manner as described for two pictures.
Based on the change of location and the change of direction from the camera to each object, various 3D information may be calculated for each of objects A and B. In the illustration, the second camera position is closer to the objects than the first position, and that difference may also be calculated. In some embodiments, if an object appears to be a different size in one picture than another, the relative sizes may help to calculate the distance information, or at least relative distance information. Other geometric relationships may also be calculated, based on the available information.
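The distance calculation referred to above may be illustrated, in a simplified plan view, by intersecting the two lines of sight to an object. The sketch below assumes the camera is level, works in two dimensions, and measures each direction in radians counterclockwise from the +X axis. The function name `triangulate` and these conventions are assumptions of the sketch rather than features of any particular embodiment.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def triangulate(
    p1: Point, angle1_rad: float, p2: Point, angle2_rad: float
) -> Tuple[Point, float, float]:
    """Return (object position, distance from p1, distance from p2).

    p1 and p2 are the two camera locations; angle1_rad and angle2_rad are
    the directions from each location to the same object.
    """
    d1 = (math.cos(angle1_rad), math.sin(angle1_rad))
    d2 = (math.cos(angle2_rad), math.sin(angle2_rad))
    # 2D cross product of the two sight directions.
    denom = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(denom) < 1e-12:
        raise ValueError("lines of sight are parallel")
    bx, by = p2[0] - p1[0], p2[1] - p1[1]  # baseline between the locations
    t1 = (bx * d2[1] - by * d2[0]) / denom
    t2 = (bx * d1[1] - by * d1[0]) / denom
    if t1 <= 0.0 or t2 <= 0.0:
        raise ValueError("lines of sight do not meet in front of both locations")
    obj = (p1[0] + t1 * d1[0], p1[1] + t1 * d1[1])
    return obj, t1, t2


# Hypothetical usage: locations 2 m apart, object seen at 45 and 135 degrees.
obj, dist1, dist2 = triangulate(
    (0.0, 0.0), math.radians(45.0), (2.0, 0.0), math.radians(135.0)
)
```

In this usage the object is located at approximately (1, 1), about 1.41 m from each camera location. Small errors in the measured directions produce larger distance errors as the two lines of sight approach parallel, which is why the sketch rejects that case.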
The foregoing description is intended to be illustrative and not limiting. Variations will occur to those of skill in the art. Those variations are intended to be included in the various embodiments of the invention, which are limited only by the scope of the following claims.
This application is derived from U.S. provisional patent application Ser. No. 61/187,520, filed Jun. 16, 2009, and claims priority to that filing date for all applicable subject matter.
Number | Date | Country
---|---|---
61187520 | Jun 2009 | US