This application claims priority from U.S. Provisional Application, filed Mar. 15, 2016, which is incorporated herein by reference.
The present invention relates generally to remote imaging and to geographic information systems, and more particularly, to a system for generating georeferenced, geo-oriented realtime video streams.
Definitions
“Include” or “including” means “including but not limited to.” “For example” refers to one possible example and is not meant to limit or exclude others.
“Georeferencing” generally means associating an object with coordinates in a reference system, for example latitude, longitude, and elevation, also referred to as location metadata.
“Geo-orienting” generally refers to the action of orienting an object relative to the points of a magnetic or digital compass or other specified positions. In digital media, i.e. GIS computer program applications, geo-orientation refers to the process of displaying the object as a separate layer on a computer-generated map with specific compass bearing, roll and pitch angles so as to superimpose its exact attitude with regard to the geographic environment, and permitting a user to control the viewpoint from which the combined image is viewed.
Global Positioning Systems (“GPS”) are available and permit establishing the location of an object. Geographical Information Systems (“GIS”) are available and permit displays representing the physical appearance of locations on a virtual map of the world. One widely available GIS, Google® Earth, allows rendering a map of a given location and also allows display of icons or images representing structures at the displayed location. The images used by Google® Earth are historical, representing the appearance as of the last time that particular location was captured for the Google database.
Remote video acquisition may be accomplished using various platforms, including satellites, drones, unmanned aerial vehicles, remote-controlled cameras or cell phones.
It would be desirable to be able to place remotely acquired video in context by creating a hybrid video stream, combining the remotely acquired video with pre-stored data representing the geography of the location at which the video was acquired, particularly if the hybrid video could be updated in realtime, and even more so if a user could control the viewpoint from which the hybrid video would be displayed. Many U.S. patents have been directed to visualization of remotely acquired images or geospatial information. For example, U.S. Pat. No. 6,484,101 generates geo-spatial objects which are assigned location data and represented on a map. It does not, however, disclose or suggest projecting realtime geo-referenced video feeds on the map nor does it disclose or suggest 3-Dimensional representations. U.S. Pat. No. 8,997,521 discloses 3-Dimensional models on a map. It does not, however, disclose or suggest realtime projection or video projection.
U.S. Pat. No. 8,942,483 compares still, non-georeferenced, images against georeferenced images included in a database of georeferenced imagery. If a match is found, then it outputs a correlation identifier stating that a match has been found and that the location of the non-georeferenced imagery has been resolved; i.e. it is used to identify the location at which a particular image has been taken if that location's imagery is in a pre-existing database. It does not, however, disclose or suggest realtime projection or video projection.
U.S. Pat. No. 9,091,547 teaches simulation of the view of the ground an airborne observer will have at a specific location and orientation. It does not, however, disclose or suggest realtime projection or video projection nor does it teach realtime sensor data fusion.
U.S. Pat. No. 9,188,444 teaches a system for improving the location accuracy of an object that appears on a georeferenced image. It does not, however, disclose or suggest projecting realtime 3-Dimensional geo-referenced video feeds on a map.
U.S. Pat. No. 9,218,682 uses a database of georeferenced objects and embeds geo-location information from matching images. It does not, however, disclose or suggest projecting 3-Dimensional realtime geo-referenced video feeds on a map.
The invention comprises a system for placing remotely acquired video in context by creating a hybrid video, combining the realtime remotely acquired video with realtime sensor data representing the geography of the location and the 3-Dimensional attitude at which the video was acquired, and allowing a user to control the viewpoint from which the hybrid image is displayed, thereby generating 3-Dimensional, georeferenced, geo-oriented realtime imagery, including video imagery.
The system comprises means for acquiring a remote image, for example a video camera, which captures a real time data feed (which may be single frames or continuous and which may be visible or in a portion of the electromagnetic spectrum not visible to the human eye); a global positioning system (“GPS”) receiver, which reports location metadata associated with the video camera at each instant during image capture; means for determining the orientation of the video camera, for example a 3-Axis compass, which captures the orientation metadata (for example, heading, roll and pitch angles) associated with the video camera at each instant while the video feed is being captured; a computer system on which is stored a database of geographic location metadata and associated imagery for a region of interest and software for fusing images from the realtime video feed with the geographic location metadata and associated imagery and generating a signal which may be translated into a hybrid image for visual display, and a network which connects the system components. The system generates geo-referenced, geo-oriented live footage.
There are two principal kinds of “images” involved in the creation of the hybrid image of the invention. A “topographical image” comprises data describing historical or fixed information concerning the general location under consideration and may include data in the form of imagery (which may include images in spectral ranges beyond human vision), geolocation tags (for example, latitude, longitude and elevation) and would typically be acquired ahead of time and stored on a storage medium and organized as a database accessible by a computer.
A “realtime image” comprises data acquired in real time (or at a specific time) and may, in addition to visual images, include images in spectral ranges beyond human vision, geolocation tags describing the location of the image being captured and/or the device capturing the image. Examples of geolocation tags would include latitude, longitude and altitude or elevation of either the device capturing the realtime image or of components of the image, and attitude of the device with respect to a specific plane (for example, a real or artificial horizon) and orientation with respect to a reference (for example, geographic north).
The system of the invention generates georeferenced, geo-oriented live imagery using a video camera, which is used for capturing a real time video feed; a global positioning system (“GPS”) receiver, which is used to establish and update the location metadata associated with the video camera at each instant during the time the video feed is being captured; a 3-Axis compass, which is used to capture the orientation metadata (for example, heading, roll and pitch angles) associated with the video camera at each instant while the video feed is being captured; a computer system on which is stored a database of geographic location metadata and associated imagery for a region of interest and software for fusing images from the realtime video feed with the geographic location metadata and associated imagery and generating a signal which may be translated into a hybrid image for visual display; and a network which connects the video camera, the GPS, the 3-axis compass and the computer system. The network may be wired (for example, a router connected with the video camera, the 3-axis compass and the computer system) or may be wireless (for example, a cellular modem connected with the video camera, the 3-axis compass and the computer system). The system generates geo-referenced, geo-oriented live footage.
Several of the components may in one embodiment be integrated into a single device, for example, into a smartphone that carries a camera, a cellular modem, a GPS and a compass. In an alternative embodiment, the components are connected by a network, for example the internet.
An example of a prototype constructed embodying the invention follows. Referring to
The remote computer receives the aforementioned data, and is programmed to generate the 3-Dimensional imagery and display it, under software control, as a separate layer on top of prestored digital maps at the true location and orientation at which the imagery was acquired. Writing the software for calculation, display and coordination of the 3-dimensional imagery and prestored digital maps is a time-consuming task, but requires no more than receiving the data and using it as the input to trigonometry calculations which are within the skill of those of ordinary skill in the art. Additionally, the software is responsive to user control specifying which images are of interest (location and viewpoint). Again, creation of such software is within the skill of those of ordinary skill in the art. The software should allow a user to obtain a topographical image of an area of interest and provide a computer with indexed access to that image. The indexing should be designed so as to enable the computer to access a particular subset of the topographical image in response to an input from the user, using a database management system.
The topographical image may, for example, be Google® Earth and the computer access may be through the internet using a browser, for example, Firefox.
In operation, the user determines a specific realtime image of interest and deploys a remote-controlled video camera system, illustrated in
Referring to
1. A remote package, comprising:
2. A transmission system, comprising:
3. A computer processing system, comprising:
The user provides the GIS data in a form readable by the computer and deploys the remote package to an area of interest. The remote package acquires realtime video of the area of interest and associated location and orientation information and transmits it to the computer. In response to user input specifying the area and viewpoint of interest, the computer executes software which calculates and displays a fused image incorporating the GIS data and the realtime video.
A suitable projection may be based on ray tracing. It computes both the ground coordinates and the above-ground bearing vectors that correspond to any image point. First, a 3D look vector is computed that joins the image point and the camera optical centre; this is known as the camera internal orientation, and the projection is calculated as a function of the camera parameters. Next, a 3D rotation matrix is computed that takes into account the rotation of the mount (base roll, pitch and yaw) as well as the camera orientation (pan and tilt).
This 3D rotation matrix is combined with the look vector to orient the look vector in real-earth coordinates. The result is a ray, lookVector1, that emerges from the image location, passes through the optical center and hits the earth. This vector is oriented in real-earth coordinates according to the orientation of the UAV and camera. The look vector is then projected to its target object or to the earth. Knowing the distance of the UAV or camera from the earth, it is possible to calculate the exact distance along this line at which the ray hits the ground.
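The ray-tracing projection described above can be illustrated with a minimal sketch. It is not the implementation of the prototype: it assumes an ideal pinhole camera, flat ground at height zero, a local East-North-Up frame, and a Z-Y-X (yaw, pitch, roll) rotation convention. The names `rotation_ned`, `mat_vec` and `pixel_to_ground` and all of their parameters are hypothetical.

```python
import math

def rotation_ned(roll, pitch, yaw):
    """Rotation matrix from (forward, right, down) axes to North-East-Down.

    Assumed convention: Z-Y-X order, angles in radians, yaw clockwise
    from north, positive pitch raising the forward axis.
    """
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    return [
        [cp * cy, sr * sp * cy - cr * sy, cr * sp * cy + sr * sy],
        [cp * sy, sr * sp * sy + cr * cy, cr * sp * sy - sr * cy],
        [-sp, sr * cp, cr * cp],
    ]

def mat_vec(m, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

def pixel_to_ground(u, v, focal_px, cx, cy, mount_rpy, pan, tilt, cam_enu):
    """Project image point (u, v) onto flat ground (assumed at up = 0).

    Returns (east, north, slant_range), or None if the ray never
    reaches the ground. cam_enu is the camera position (east, north, up).
    """
    # Internal orientation: look vector through the optical centre,
    # in camera axes (x right, y down, z forward), reordered to
    # (forward, right, down).
    look = [focal_px, u - cx, v - cy]
    # Camera pan/tilt first, then mount roll/pitch/yaw.
    look = mat_vec(rotation_ned(0.0, tilt, pan), look)
    look = mat_vec(rotation_ned(*mount_rpy), look)
    # Convert North-East-Down to East-North-Up and normalise.
    d = (look[1], look[0], -look[2])
    norm = math.sqrt(sum(c * c for c in d))
    d = tuple(c / norm for c in d)
    if d[2] >= 0:
        return None  # ray is level or points upward
    slant = cam_enu[2] / -d[2]  # distance along the ray to the ground
    return (cam_enu[0] + slant * d[0], cam_enu[1] + slant * d[1], slant)
```

Under these assumptions, a camera 100 m above the ground, on a level mount facing north and tilted 45 degrees downward, maps its centre pixel to a ground point roughly 100 m to the north. Real terrain would replace the flat-ground intersection with an intersection against an elevation model.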
Conceptually, in overview, the process of creating what is in effect an embodiment of video information and associated environmental information in a display operates as follows. A device, for example a smartphone or a camera with suitable environmental sensors, is used to simultaneously acquire a stream of video information and an associated stream of environmental information (for example, location, altitude, attitude, and other desired information). The two streams are multiplexed so as to create a multiplexed stream of information, which is transmitted (for example, using wifi or the cellular modem of a smartphone) to a remote location where a user has a computer with a receiver capable of receiving said multiplexed stream and a processor capable of demultiplexing said multiplexed stream back into the original stream of video information and stream of environmental information and converting them into a visual display. Upon receipt, the multiplexed stream is separated into a stream of video information and a stream of environmental information. The computer is provided with memory and software capable of providing access to a pre-stored database of topographical information and database management software. The computer is instructed to access the topographical information associated with the location from which the stream of video information was captured as identified by the stream of environmental information. This allows the computer to determine the location from which the stream of video information was captured and the point of view of the device which captured it (location in space, attitude and any other desired information) and to construct a virtual map fusing the topographical information with the stream of video information so as to allow a user to view the stream of video information in the context of, and from the desired point of view of, a virtual observer located at a selected point and viewpoint on the virtual map.
The resulting virtual image may then be displayed in any fashion which is suitable, for example, on a monitor.
The system may utilize the live video streaming capability of a smartphone coupled with a “geo-registration” component so as to create a streamed video that has location and orientation data embedded in it as video metadata. The location and orientation data may be extracted from an embedded GPS (for location) and from the compass, accelerometers and gyros (for orientation) if the smartphone is so equipped, or may be acquired from external equipment.
On the receiving side, a pairing application separates the received location/orientation data from the video and uses them to render an invisible frame onto a digital map that follows, in all three axes, the attitude of the smartphone used to generate the video. The GPS information is also used so that this invisible frame is placed at the exact location where the smartphone resides.
At the same time, the application textures the invisible frame with the received video frames so that the video is geo-registered and geo-oriented by presenting it on a digital map at the location of and oriented from the perspective of the smartphone which was used to generate the video.
Conceptually, this is accomplished by four elements: data (including video data) acquisition, telemetry, processing and display.
Data acquisition is carried out by using the smartphone's camera to acquire the desired video and the smartphone's positioning features (to the extent present or as supplemented by additional hardware) to acquire environmental data (for example, GPS positioning, altitude, attitude (pan, tilt, roll), acceleration or other data of interest to the user).
Telemetry begins with multiplexing the acquired data with the video stream. This means that the RTSP H.264 video stream contains a second metadata track that includes the environmental data. The multiplexed data is then transmitted, for example using wifi or cellular, as a transmission stream to a client location.
At the client location the transmission stream is de-multiplexed so as to separate it into a video stream and an environmental stream.
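The multiplexing and de-multiplexing steps can be illustrated with a simplified sketch. It does not implement RTSP or an H.264 metadata track. It assumes a hypothetical packet layout (a one-byte type tag, a four-byte length, then the payload) and JSON-encoded environmental samples, purely to show how two streams can share one transmission stream and be separated again. The names `mux`, `demux`, `HEADER`, `VIDEO` and `ENV` are illustrative only.

```python
import json
import struct

# Hypothetical packet header: 1-byte type tag + 4-byte big-endian length.
HEADER = struct.Struct("!BI")
VIDEO, ENV = 0, 1

def mux(video_frames, env_samples):
    """Interleave encoded video frames with their environmental samples."""
    out = bytearray()
    for frame, env in zip(video_frames, env_samples):
        meta = json.dumps(env).encode("utf-8")
        out += HEADER.pack(ENV, len(meta)) + meta      # metadata for this frame
        out += HEADER.pack(VIDEO, len(frame)) + frame  # the frame itself
    return bytes(out)

def demux(stream):
    """Split a multiplexed byte stream back into video and environmental lists."""
    video, env = [], []
    offset = 0
    while offset < len(stream):
        kind, length = HEADER.unpack_from(stream, offset)
        offset += HEADER.size
        payload = stream[offset:offset + length]
        offset += length
        if kind == VIDEO:
            video.append(payload)
        elif kind == ENV:
            env.append(json.loads(payload.decode("utf-8")))
    return video, env
```

Sending each environmental sample immediately before its frame keeps the two streams aligned, so the client can apply the location and attitude that were current when the frame was captured.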
The environmental data is parsed and processed so as to display the smartphone on a map at the location and with the orientation corresponding to the environmental data, and also to display the 3D projection of the live video stream, either flat video (FOV less than 180 degrees) or spherical video (FOV greater than 180 degrees), at the correct “projected” location and FOV from the location of the phone's camera on the 3D map. All this information is streamed and updated in real time so that the user of the client application can “follow” the smartphone it is connected to and see the changes in location, attitude and video on the map.
Display may be controlled by the user extracting each video frame as it arrives in real time and selecting a display mode. If the video is flat (less than 180 degrees field of view) then the video is shown on the map as a flat rectangular frame. The location and “attitude” of this frame on the 3D map are calculated using the environmental data from the phone. Knowing the phone location and the camera FOV angles, the flat rectangular frame can be drawn at the appropriate location on the map. This location/attitude changes in real time with the location/attitude of the phone. The user may also change his or her point of view around the map and move to the phone location so as to view the video from the point of view of the phone itself.
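The placement of the flat rectangular frame can be sketched as follows. This is an illustration under stated assumptions, not the method of the prototype: a local East-North-Up frame, a Z-Y-X (heading, pitch, roll) convention, and a frame drawn at an arbitrary display distance in front of the camera. The function `frame_corners` and its parameters are hypothetical.

```python
import math

def frame_corners(cam_enu, heading, pitch, roll, hfov, vfov, distance):
    """Corners (east, north, up) of the rectangle on which flat video is drawn.

    Angles are in radians; hfov and vfov must be below 180 degrees.
    Corners are returned in the order top-left, top-right,
    bottom-right, bottom-left as seen from the camera.
    """
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(heading), math.sin(heading)
    # Camera axes expressed in North-East-Down coordinates.
    fwd = (cp * cy, cp * sy, -sp)
    right = (sr * sp * cy - cr * sy, sr * sp * sy + cr * cy, sr * cp)
    down = (cr * sp * cy + sr * sy, cr * sp * sy - sr * cy, cr * cp)
    # Half extents of the frame at the chosen display distance.
    half_w = distance * math.tan(hfov / 2)
    half_h = distance * math.tan(vfov / 2)
    corners = []
    for sx, sdown in ((-1, -1), (1, -1), (1, 1), (-1, 1)):
        ned = [distance * fwd[i] + sx * half_w * right[i] + sdown * half_h * down[i]
               for i in range(3)]
        # Convert the North-East-Down offset to East-North-Up and translate.
        corners.append((cam_enu[0] + ned[1], cam_enu[1] + ned[0], cam_enu[2] - ned[2]))
    return corners
```

Recomputing the four corners whenever a new environmental sample arrives makes the frame follow the phone. Each incoming video frame would then be applied to this rectangle as a texture by the map renderer.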
If the video is spherical (a fisheye frame with a field of view greater than 180 degrees), the processing is more complicated. Normally the field of view is 360 degrees horizontally by some value greater than 180 and less than 360 degrees vertically, for example 360H×240V. In this case, the basic process is to create a virtual 3D “hemisphere” centered at the location of the phone and taking into consideration the phone's attitude and camera FOV angles. This hemisphere is a “wire frame” and its surface consists of many vertices/triangles. The higher the number of vertices and triangles, the smoother the appearance of the sphere on the display (the “wire frame” is not shown on the display). The hemisphere location/attitude is, once again, updated in real time from the phone data.
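One way to build such a wire frame is sketched below. It is a minimal illustration, assuming a dome generated in the camera's own axes (optical axis along +z) out of concentric rings of vertices. The function `build_dome` and its parameters are hypothetical, and the resulting mesh would still have to be rotated by the phone's attitude and translated to its location.

```python
import math

def build_dome(radius, fov_deg, rings=8, segments=16):
    """Vertices and triangles of a spherical cap covering the given field of view.

    The cap is centred on the +z axis (assumed optical axis). A field of
    view of 180 degrees gives a hemisphere; larger values extend past it.
    """
    max_polar = math.radians(fov_deg) / 2  # largest angle from the optical axis
    vertices = [(0.0, 0.0, radius)]        # pole vertex on the optical axis
    for r in range(1, rings + 1):
        theta = max_polar * r / rings
        for s in range(segments):
            phi = 2 * math.pi * s / segments
            vertices.append((radius * math.sin(theta) * math.cos(phi),
                             radius * math.sin(theta) * math.sin(phi),
                             radius * math.cos(theta)))
    triangles = []
    # Fan of triangles joining the pole to the first ring.
    for s in range(segments):
        triangles.append((0, 1 + s, 1 + (s + 1) % segments))
    # Two triangles per quad between consecutive rings.
    for r in range(1, rings):
        inner = 1 + (r - 1) * segments
        outer = 1 + r * segments
        for s in range(segments):
            n = (s + 1) % segments
            triangles.append((inner + s, outer + s, outer + n))
            triangles.append((inner + s, outer + n, inner + n))
    return vertices, triangles
```

Raising `rings` and `segments` increases the vertex and triangle counts and therefore the smoothness of the displayed surface, at the cost of more work per update.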
Once the hemisphere wire frame is calculated then the fisheye video frames that are being received real time from the phone's camera are “textured/draped” over the hemisphere wire frame so that the user sees the live video as a 3D sphere on the map. The fisheye video frames from the camera cannot be applied directly as a texture to the hemisphere but need to be “de-warped/stretched” over the hemisphere wire frame. All this occurs real time on the client device (which may be a PC, another cell phone, a tablet, a server, a desktop computer or other device).
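The “de-warping” step amounts to assigning each wire-frame vertex the point of the fisheye image that looks in the same direction. The source does not state the lens model, so the sketch below assumes an equidistant fisheye (image radius proportional to the angle from the optical axis) whose image circle fills a square texture. The function `fisheye_uv` is hypothetical.

```python
import math

def fisheye_uv(direction, fov_deg):
    """Texture coordinate (u, v) in [0, 1] for a viewing direction.

    direction is a 3-vector in camera axes with +z along the optical axis.
    Assumes an equidistant fisheye centred in the texture. Returns None
    when the direction lies outside the lens field of view.
    """
    x, y, z = direction
    norm = math.sqrt(x * x + y * y + z * z)
    # Angle between the direction and the optical axis.
    theta = math.acos(max(-1.0, min(1.0, z / norm)))
    max_polar = math.radians(fov_deg) / 2
    if theta > max_polar + 1e-12:
        return None
    r = theta / max_polar      # normalised radius in the image circle
    phi = math.atan2(y, x)     # azimuth around the optical axis
    return (0.5 + 0.5 * r * math.cos(phi), 0.5 + 0.5 * r * math.sin(phi))
```

Because the mapping depends only on the lens model and the mesh, the texture coordinates can be computed once per vertex and reused for every incoming frame; only the texture image changes in real time. A lens with a different projection would need a different relation between `theta` and `r`.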
As with the flat view the client user may change the point of view from “outside” the sphere (i.e. viewing the map, phone location and video sphere from above) to inside the sphere as the point of view of the camera, allowing the user to “look around” the sphere without the distortion of the original fisheye video frame.
Number | Date | Country | |
---|---|---|---|
20180262789 A1 | Sep 2018 | US |
Number | Date | Country | |
---|---|---|---|
62390021 | Mar 2016 | US |