The present invention relates to an augmented reality based system and method for generating and displaying a continuous and real-time augmented reality view corresponding to a current orientation of a user.
Poor visibility conditions, which can occur in both land and water environments, can make navigation very difficult and can affect the performance of activities in these conditions.
A night vision camera, which amplifies light reflected from the moon and/or stars, or a thermal camera, which detects the thermal energy emitted by objects, can be used in otherwise clear weather to improve or enable visual perception of the environment when visibility is poor due to lighting conditions. If, in addition to poor light conditions, rain, fog, dust, smoke or other environmental conditions affecting visibility reduce visibility further, it is worth using devices other than a conventional camera, such as sonars, to enable visual perception of the environment. Sonars, which are essentially sound-based sensors, can be used effectively both in water and in air, where sound waves can propagate.
In poor visibility conditions, mapping of a given land or water environment is now typically carried out using fixed sensors, or sensors such as sonars mounted on mobile devices such as drones or land or water vehicles. Although these methods are capable of providing the user with 2-dimensional or 3-dimensional depth/relief images, these images can essentially only be used as a standard map.
For example, in rainstorms, heavy rain, strong wind or sandstorms on dry land, or in strong flows and currents in the water, the user cannot look at the previously generated 2-dimensional or 3-dimensional image, but can only rely on it from memory, so areas or objects with unknown geometry are essentially approached blindly, by groping. This can make navigation and movement in the environment dangerous, slow and difficult. In poor visibility conditions, the user would need, even while moving, continuous and up-to-date visual information about the desired area and any objects that may be located therein, according to the current orientation of the user.
WO2017131838A2 relates to an apparatus and method for generating a 2-dimensional or 3-dimensional representation of the underwater environment based on a fusion of data from multiple deployed static and deployed movable sensors, which representation can be displayed on a movable device by tracking the user's movement. The device principally provides data on the underwater environment, such as the bottom topography, to aid navigation of a mobile structure, such as a boat. The displayed image is generated based on data fusion on an external device and is entirely dependent on the actual image detected by the sensors, which is limited by the location of the sensors and the current visibility. Although the displayed image tracks the user's movement, it is not capable of representing the user's orientation or a realistic sense of distance; only the orientation of the generated image is adjusted. The data fusion of the detected sensor data is also not performed locally, but by an external unit. Furthermore, the invention is not capable of improving or correcting missing or poor quality images, and it is also unable to match a specific object to generate an augmented reality view.
WO2013049248A2 relates to a near field communication (NFC) system comprising deployed sensors and a head-mounted device with sensors capable of displaying a 2-dimensional video recording of the external environment on its display. One embodiment of the device allows representation of object locations in low visibility conditions by using overlapping map data to determine the exact location of each object. Essentially, the system can improve the displayed video recording by fusing the overlapping image data, but in this case the data quality also depends on the currently recorded images, which are uniformly of poor quality in low visibility conditions. The invention is not suitable for fitting an image of an object into an augmented reality view, so the image provided to the user cannot track the user's current position and orientation.
The present invention provides an augmented reality based system and method for generating and displaying a continuous and real-time augmented reality view corresponding to a current orientation of a user, wherein the augmented reality view is generated by fusing data detected by sensors and matching image data of reference data according to a given spatial orientation. By using the reference data, the image quality can be improved such that the images can be used to accurately match the virtual space and the objects therein, thereby representing them together on the image, thus creating an augmented reality view. The augmented reality view created in this way is suitable for providing a real-time environmental representation corresponding to realistic spatial orientation and a sense of distance by measuring and determining the user's current position and orientation.
The theoretical background of the feasibility of the solution according to the present invention is described in detail below.
With the rise of virtual reality (VR) and augmented reality (AR) applications, various tools have appeared which can be used to display virtual objects in 3 dimensions, so that the user perceives the virtual environment around them as real. When reality is augmented with virtual objects or other visual elements, it is called augmented reality.
In order to create an augmented reality view, it is necessary to accurately match 3-dimensional reality and the virtual space, so that the connection and interaction of the objects in the common representation can be properly realised. When navigating in an environment with poor visibility, for example where visibility is practically zero, the visual information cannot be matched, or can only partially be matched, with the virtual information due to the poor quality of the image data; for orientation, however, the user needs the virtual 3-dimensional map to be spatially matched with the objects, topography or other environmental features located in reality. When using a 3-dimensional map, the user can be expected to need to see the virtual view according to his/her sense of distance and orientation, and if he/she changes position and/or orientation (e.g. due to head movement, drifting, etc.), whether intentionally or unintentionally, the virtual view should change accordingly.
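By way of illustration only, the following minimal Python sketch shows one way the displayed virtual view could be recomputed whenever the user's measured position and orientation change; the function names and the yaw/pitch/roll convention are assumptions for the example and are not taken from the invention itself.

```python
# Illustrative sketch: recompute the virtual view transform from the user's
# currently measured position and orientation so the view follows head movement.
import numpy as np

def yaw_pitch_roll_to_matrix(yaw, pitch, roll):
    """Build a 3x3 rotation matrix from yaw/pitch/roll angles given in radians."""
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cr, sr = np.cos(roll), np.sin(roll)
    rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])
    ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
    return rz @ ry @ rx

def world_to_view(points_world, user_position, user_orientation):
    """Transform world-space map points (rows) into the user's view frame."""
    rotation = yaw_pitch_roll_to_matrix(*user_orientation)
    # Row-vector form of R^T @ (p - t): the view changes with every new pose.
    return (np.asarray(points_world) - np.asarray(user_position)) @ rotation
```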
In the field of robotics, as a result of the DARPA competition for autonomous cars, so-called SLAM (simultaneous localization and mapping) algorithms are used. The idea is that the process builds and updates a map at the same time as performing localisation. In the classic case, so-called encoders in the wheels of the mobile robot measure the rotation of the axles, and from these measurements the rotation of the wheels, as well as the distance travelled, can be calculated. However, due to a number of other parameters and environmental effects (e.g. friction, different wheel sizes, load and centre of gravity position, battery charge level, etc.), the measurement can be subject to errors which, even if individually negligible, are integrated over time, so the method cannot be used for location determination alone. For this reason, a sensor capable of measuring the relative position and distance of the external environment is usually also used, typically a 1-dimensional, 2-dimensional or 3-dimensional rangefinder such as a LIDAR (Light Detection and Ranging) device or a camera. Based on the measurement of this sensor, the measurement of the internal sensor is refined by correcting its errors, and only then does the system create the map together with the determined current position of the mobile robot.
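A minimal sketch of the correction idea described above is given below, assuming a differential-drive robot; the dead-reckoning update from wheel encoders drifts, and its pose estimate is blended with a pose estimate derived from an external range sensor. The simple weighted blend stands in for the filter or graph optimiser a real SLAM system would use, and all parameter values are illustrative.

```python
# Illustrative sketch only: encoder odometry accumulates error, an external
# range sensor (e.g. LIDAR scan matching) provides a pose used to correct it.
import numpy as np

def odometry_step(pose, left_ticks, right_ticks, ticks_per_m, wheel_base):
    """Dead-reckoning update from wheel encoder ticks; errors accumulate here."""
    x, y, heading = pose
    d_left = left_ticks / ticks_per_m
    d_right = right_ticks / ticks_per_m
    d_center = 0.5 * (d_left + d_right)
    d_heading = (d_right - d_left) / wheel_base
    return np.array([x + d_center * np.cos(heading),
                     y + d_center * np.sin(heading),
                     heading + d_heading])

def fuse_with_range_sensor(odom_pose, range_pose, range_weight=0.8):
    """Correct the drifting odometry pose with the externally measured pose.
    A real SLAM system would use a Kalman filter or pose-graph optimisation."""
    return (1.0 - range_weight) * np.asarray(odom_pose) + range_weight * np.asarray(range_pose)
```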
However, the sensor devices used in conventional cases are not capable of providing satisfactory visual quality at poor or practically zero visibility and cannot be used to correct errors or improve the quality of the images in such circumstances.
The solution according to the invention solves this problem by representing augmented reality in an environment with particularly poor visibility, where the system and method simultaneously use sensors that provide the data required to properly present the environment according to the user's position and orientation, and in addition the solution of the present invention also uses a reference database to create the augmented reality experience.
The object of the invention is to implement a system for generating and displaying a continuous and real-time augmented reality view corresponding to a current position and orientation of a user,
wherein said system comprises:
Preferably the data storage unit comprises a sub-unit for storing the processed sensor data and/or the reference data.
The head-mountable device can be a goggle-like or helmet-like device with a graphical display.
The head-mountable device preferably comprises one or more sensors selected from the group comprising:
The sonar can be arranged substantially at the centre of the head-mountable device.
The inclination sensor can comprise an IMU sensor.
The reference data can comprise depth camera images.
Preferably, the system further comprises a deployed static sensor assembly and/or a deployed movable sensor assembly for detecting physical characteristics of the environment, in particular of an external object, wherein the sensor assembly comprises a plurality of sensors having partially overlapping acoustic fields of view and wherein the sensor assembly is in communication connection with the integrated computing and communication unit.
Preferably, the reference data are continuously determined or predetermined data in a given spatial orientation, which are detected by the deployed static sensor assembly and/or the deployed movable sensor assembly; and/or continuously determined data in a given spatial orientation detected by sensors.
The system can further comprise an external processing unit comprising:
Preferably, the system further comprises an external processing unit further comprising:
The system preferably comprises additional position sensors that can be mounted on at least one arm of the user for determining the position and spatial orientation of at least one arm, wherein the additional position sensors are in communication with the integrated computing and communication unit.
The system can comprise a camera, which is arranged on the head-mountable device and which is in communication connection with the integrated computing and communication unit.
The system can comprise a plurality of reference databases, wherein each reference database comprises different types of reference data, wherein the different types of reference data are combined with each other to correspond to data detected by the sensors based on the spatial orientation.
Another object of the invention is to implement a method for generating and displaying a continuous and real-time augmented reality view corresponding to a current position and orientation of a user, wherein the method comprises:
Preferably, the sonar is arranged at the centre of the head-mountable device.
Preferably, the inclination sensor is an IMU sensor.
Reference data may comprise depth camera images.
Preferably, independently and simultaneously generating the augmented reality views of the plurality of head-mountable devices.
Preferably, using an external computing and communication unit in addition to the integrated computing and communication unit, wherein the external computing and communication unit has a higher computing capacity than the integrated computing and communication unit of the head-mountable device.
Preferably, the data processing further comprises fusing data detected by the sensors with data from an additional deployed static sensor assembly and/or an additional deployed mobile sensor assembly.
Preferably, collecting data by an additional position sensor mounted on the at least one arm, and the data processing comprises fusing data detected by the sensors with the data from the additional position sensor mounted on the at least one arm.
Preferably, the data processing comprises fusing data detected by the sensors with a 2-dimensional image data detected by a camera.
Preferably, based on the spatial orientation, matching a given spatially oriented, fused sensor data in combination with different types of reference data from a plurality of reference databases by the integrated computing and communication unit.
Preferably, based on the matched sensor and reference image pairs, automatically generating augmented reality view and displaying on a display of the head-mountable device by the integrated computing and communication unit.
Preferably, the step of matching fused sensor data with reference data in the reference database is implemented by aligning sensor data and reference data according to spatial orientation, or by transformation using a classifier trained with matched sensor and reference image pairs.
In the following, the system and process according to the invention are described in detail on the basis of the drawing, in which:
The head-mountable device 10 of
The system 1 further comprises a reference database 20 comprising reference data, wherein the reference data comprise sonar images and/or 2-dimensional camera images, preferably depth camera images, which show the target area to be displayed as augmented reality according to different spatial orientations. The system may comprise a plurality of reference databases 20 containing different types of reference data; for example, a given reference database 20 may comprise only sonar images having different spatial orientations or only 2-dimensional camera images having different spatial orientations. The data detected by the sensors 101, according to a given spatial orientation, can be matched with reference data of the corresponding spatial orientation. The different types of reference data can also be matched in combination with each other based on a given spatial direction.
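Purely as an illustrative sketch, and not as a definition of the reference database 20 itself, the structure described above could be modelled as a store keyed by quantised spatial orientation from which the entry closest to the currently measured orientation is retrieved; the class name, angular step and key layout are assumptions made for the example.

```python
# Hypothetical sketch of a reference database keyed by spatial orientation:
# reference images (sonar and/or depth-camera) are stored per quantised
# orientation and looked up for the orientation currently measured by the sensors.
class ReferenceDatabase:
    def __init__(self, angular_step_deg=5.0):
        self.step = angular_step_deg
        self.entries = {}  # quantised (yaw, pitch) -> reference image data

    def _key(self, yaw_deg, pitch_deg):
        return (round(yaw_deg / self.step), round(pitch_deg / self.step))

    def add(self, yaw_deg, pitch_deg, reference_image):
        self.entries[self._key(yaw_deg, pitch_deg)] = reference_image

    def lookup(self, yaw_deg, pitch_deg):
        """Return the stored reference data closest to the requested orientation."""
        key = self._key(yaw_deg, pitch_deg)
        if key in self.entries:
            return self.entries[key]
        # fall back to the nearest stored orientation
        return min(self.entries.items(),
                   key=lambda kv: (kv[0][0] - key[0]) ** 2 + (kv[0][1] - key[1]) ** 2)[1]
```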
The system 1 comprises a data storage unit 30 storing at least the spatial-orientation-matched sensor and reference image pairs, and preferably also storing processed sensor data and/or reference data in a storage sub-unit.
The integrated computing and communication unit 103 of the system 1 is in communication connection with the sensors 101, one or more reference databases 20, data storage unit 30 and display 102.
In one embodiment of the system 1, the system 1 further comprises an external processing unit 60 comprising an external computing and communication unit 600 having a greater computing capacity than the integrated computing and communication unit 103 of the head-mountable device 10. Similarly to the integrated computing and communication unit 103, the external computing and communication unit 600 is in communication connection with the sensors 101, the reference database 20, the data storage unit 30 and the display 102. The external computing and communication unit 600 is preferably connected to a plurality of head-mountable devices 10. The use of multiple head-mountable devices 10 is advantageous if the target area is larger or the augmented reality view must be created in a short period of time, since with multiple sensors 101 all the input data that enable the creation of the augmented reality view can be collected sooner. The external processing unit 60 is also capable of communicating with one or more deployed static sensor assemblies 40 and/or one or more deployed movable sensor assemblies 50, which are described in more detail below. The external processing unit 60 may further comprise a display unit 601 for displaying an augmented reality view and/or a user's absolute position, wherein the display unit 601 is in communication connection with one or more integrated computing and communication units 103 of the one or more head-mountable devices 10 and/or with the external computing and communication unit 600.
The system 1 in
The reference data in the reference database or databases 20 can be collected using the sensors 101 on the head-mountable device 10. In a preferred embodiment of the system 1, the head-mountable device 10 includes a camera 1015 in addition to the other sensors 101, which is in communication connection with the integrated computing and communication unit 103. The camera 1015 is preferably suitable for recording 2-dimensional images. The reference data comprise data detected by all sensors 101, which constitute different types of reference data. When the head-mountable device 10 is used in this way, the data are continuously collected and stored, i.e. by continuously moving the device in different directions. The reference data can also be collected by storing data of a given spatial orientation detected by the deployed static sensor assembly 40 and/or the deployed movable sensor assembly 50. The data can be collected before using the head-mountable device 10, in which case predetermined reference data are stored in the data storage unit 30. However, the data can also be collected and stored simultaneously and continuously with the use of the head-mountable device 10. Such data can be considered as continuously collected reference data.
The processing steps of the method for the operation of the system 1 are described in detail below. The operating principle of the system 1 is to generate a suitable, continuous and real-time augmented reality view based on continuous measurements from all sensors 101 of the system 1 (in the case of the basic solution, only based on the data of the sensors 101 of the head-mountable device 10) and spatial directional matching via the reference database.
The method comprises:
As a result of the method, an augmented reality view is generated that changes continuously in real time in accordance with the changes in the measurement data, based on which the user receives a realistic representation of the environment to be examined.
In a preferred embodiment of the method, fused sensor data having a given spatial orientation are matched, based on spatial orientation, in combination with different types of reference data from a plurality of reference databases 20 by the integrated computing and communication unit 103. This means that a reference database 20 contains only one type of reference data, for example only 2-dimensional camera images or only sonar images, which can be considered as input data, and these input data are matched with the fused sensor data. By combining several different types of reference data, a higher quality augmented reality view can be created.
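As a short illustrative sketch only, the combination step described above could be expressed as collecting one reference item per database for the same spatial orientation before matching with the fused sensor data; it assumes database objects exposing a `lookup` method like the earlier sketch, and the database names are hypothetical.

```python
# Illustrative sketch: for a given orientation, reference data of different types
# (e.g. a sonar reference and a 2-D camera reference) are looked up from separate
# databases and combined before matching with the fused sensor data.
def combined_reference(databases, yaw_deg, pitch_deg):
    """Collect one reference item per database for the same spatial orientation."""
    return {name: db.lookup(yaw_deg, pitch_deg) for name, db in databases.items()}

# usage sketch (names are hypothetical):
# reference = combined_reference({"sonar": sonar_db, "camera": camera_db},
#                                yaw_deg=30.0, pitch_deg=-10.0)
```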
The method can involve creating the augmented reality view of multiple head-mountable devices 10 independently and simultaneously, so that the given augmented reality view can be displayed separately on the display 102 of a head-mountable device 10. The use of multiple sensors 101, in particular multiple sonars 1010, is advantageous because measuring from a single point, for example a depth image, is not expected to result in complete image information with the adequate, detailed resolution from all sides, because there will be parts that cannot be measured from that point due to shadow effects.
The sensors 101 of the head-mountable device 10 may be supplemented with additional position sensors 1014 mounted on at least one arm, with which data are also collected, and during the data processing the data detected by the sensors 101 are also fused with the data from the additional position sensors 1014 mounted on the at least one arm. Preferably, a plurality of additional position sensors 1014 are mounted on the user's arms, particularly preferably on different segments of the arms, where each sensor can be, for example, an IMU sensor. By tracking the movement of the arm, the augmented reality view can be supplemented with the movement of the user's arms, and this mapping can facilitate the user's distance perception.
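A hypothetical sketch of how the arm could be drawn into the view from such measurements follows; it assumes each additional position sensor 1014 reports an absolute orientation for its arm segment as a rotation matrix, and segment lengths are known. None of these assumptions come from the patent text.

```python
# Hypothetical sketch: derive joint positions of two arm segments from the
# orientations reported by the additional position sensors (e.g. IMUs), so the
# arm can be rendered into the augmented reality view.
import numpy as np

def arm_segment_positions(shoulder_pos, segment_lengths, segment_rotations):
    """Use each segment's measured absolute orientation to chain joint positions."""
    positions = [np.asarray(shoulder_pos, dtype=float)]
    reference_direction = np.array([1.0, 0.0, 0.0])  # arm direction in the neutral pose
    for length, rotation in zip(segment_lengths, segment_rotations):
        positions.append(positions[-1] + length * (rotation @ reference_direction))
    return positions  # e.g. shoulder, elbow, wrist positions in the user frame
```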
During the data processing, the data detected by the sensors 101 are also fused with the 2-dimensional image data detected by the camera 1015. The matching of the 2-dimensional image provided by the camera 1015 with the fused data from the sensors 101 enables the transformation of a 3-dimensional image into a 2-dimensional image during the process. Both the 2-dimensional and the 3-dimensional image can be displayed to the user as an augmented reality view.
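The 3-dimensional to 2-dimensional transformation mentioned above can be illustrated with a simple pinhole projection; this sketch assumes the fused 3-D points are already expressed in the camera frame and that the intrinsic parameters fx, fy, cx, cy of camera 1015 are known, which are assumptions made for the example.

```python
# Illustrative sketch: project fused 3-D points onto the 2-D image plane of the
# camera with a simple pinhole model.
import numpy as np

def project_points(points_cam, fx, fy, cx, cy):
    """Project 3-D points given in the camera frame onto 2-D pixel coordinates."""
    points_cam = np.asarray(points_cam, dtype=float)
    in_front = points_cam[:, 2] > 0  # keep only points in front of the camera
    x, y, z = points_cam[in_front].T
    u = fx * x / z + cx
    v = fy * y / z + cy
    return np.stack([u, v], axis=1)
```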
Alternatively, it is possible to use one or more, preferably more head-mountable devices 10 with multiple sonars 1010. One or more head-mountable devices 10, and one or more of deployed static 40 and/or deployed mobile sensor assemblies 50 can be used simultaneously.
When using multiple measuring devices, it is important to ensure that they do not interfere with each other, which can be achieved by standard procedures in the field, such as time-division or frequency-division measurement. These procedures are well known in the field, so they are not described further in this description. By fusing the measurement data provided by multiple devices and then matching them with reference data based on spatial direction, i.e. orientation, a common augmented reality view is mapped.
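For illustration of the time-division variant named above, the following sketch assigns each sonar 1010 its own slot in a repeating measurement frame so that only one device emits at a time; the slot length and device identifiers are illustrative assumptions.

```python
# Sketch of a simple time-division scheme: one emission slot per sonar per frame.
def time_division_schedule(device_ids, slot_ms=50):
    """Return (device_id, start_ms, end_ms) triples for one measurement frame."""
    return [(device_id, i * slot_ms, (i + 1) * slot_ms)
            for i, device_id in enumerate(device_ids)]

# usage sketch (device names are hypothetical):
# for device_id, start, end in time_division_schedule(["sonar_A", "sonar_B", "sonar_C"]):
#     trigger the emission of device_id between start and end milliseconds of the frame
```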
The method may involve the use of an external computing and communication unit 600 in addition to the integrated computing and communication unit 103, where the external computing and communication unit 600 has a greater computing capacity than the integrated computing and communication unit 103 of the head-mountable device 10. The external computing and communication unit 600 is preferably used when a larger amount of data needs to be processed. For example, this can be achieved by predetermining a data volume during the process, and when the data volume is exceeded, the external computing and communication unit 600 performs the required computational tasks. The increased computational capacity of the external computing and communication unit 600 allows the necessary operations of the method to be performed more quickly, resulting in the augmented reality view.
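The offloading rule described above can be sketched as a simple threshold check; the threshold value below is an assumption for the example and is not specified by the invention.

```python
# Illustrative sketch of the offloading rule: above a predetermined data volume
# the processing is sent to the external computing and communication unit 600,
# otherwise it stays on the integrated computing and communication unit 103.
def select_processing_unit(data_volume_bytes, threshold_bytes=50_000_000):
    """Return which unit should process the current batch of sensor data."""
    if data_volume_bytes > threshold_bytes:
        return "external_unit_600"
    return "integrated_unit_103"
```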
In a particularly advantageous embodiment of the method according to the invention, based on the stored, matched sensor and reference image pairs, the integrated computing and communication unit 103 automatically generates and displays the augmented reality view on a display 102 of the head-mountable device 10. The automatic execution of the operations can be implemented, for example, by algorithms, preferably by a machine learning algorithm, or by one or more neural networks, preferably by a series of neural networks.
In another advantageous embodiment of the method according to the invention, the step of matching the fused sensor data with the reference data in the reference database 20 can be performed in two different ways. First, by spatially matching fused sensor data with reference data in a reference database 20 according to orientation. This means that there are two data sets, preferably two image data sets, which can be considered as spatial point clouds whose points can be matched, so that the image data can be aligned according to a given spatial direction. Thus, lower quality fused sensor data can be improved, i.e. corrected and/or supplemented, by matching them with higher quality reference data. Alternatively, a classifier, which is preferably an algorithm or a neural network, can be trained using the already available, matched sensor-reference image pairs. After the training, the classifier is capable of performing a transformation to create the augmented reality view. The transformation involves improving the quality of the fused sensor data, i.e. supplementing and/or correcting the input image data. The transformation operation of the classifier is faster than the matching operation, since in this case there is no need for matching according to the spatial direction, as this has already been performed beforehand for the matched image pairs. Thus, the classifier can perform the transformation using the previously performed matching data.
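As a minimal sketch of the first (point-cloud alignment) variant, and assuming that both data sets are available as 3-D point clouds with known point correspondences, the rigid transform aligning the fused sensor cloud to the higher quality reference cloud can be estimated with an SVD; this is the core step of ICP-style spatial matching, and a full implementation would iterate and re-establish correspondences, which is omitted here.

```python
# Illustrative sketch: estimate the rigid transform (rotation R, translation t)
# so that R @ sensor + t approximately equals reference (Kabsch algorithm).
import numpy as np

def rigid_align(sensor_pts, reference_pts):
    """Return rotation R and translation t aligning corresponding point sets."""
    sensor_pts = np.asarray(sensor_pts, dtype=float)
    reference_pts = np.asarray(reference_pts, dtype=float)
    sensor_centroid = sensor_pts.mean(axis=0)
    reference_centroid = reference_pts.mean(axis=0)
    covariance = (sensor_pts - sensor_centroid).T @ (reference_pts - reference_centroid)
    u, _, vt = np.linalg.svd(covariance)
    rotation = vt.T @ u.T
    if np.linalg.det(rotation) < 0:  # avoid reflections
        vt[-1, :] *= -1
        rotation = vt.T @ u.T
    translation = reference_centroid - rotation @ sensor_centroid
    return rotation, translation
```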
For the application of neural networks, they need to be trained with so-called training databases to enable automatic operation later on. In the present problem, one or more neural networks can be trained with databases representing good visibility, i.e. training databases, which essentially correspond to one or more of the reference databases 20 of the system 1 according to the invention. As described above, the training databases can use high-quality sonar images and camera images, preferably depth camera images, preferably in combination. A high-quality image is defined as an image dataset that provides a clear visual recognition of the environment and/or objects to be presented, i.e. it has appropriate edges, resolution, etc. The training database or databases can be used to train a neural network that can supplement a sonar point cloud-based image and/or transform it into a 2-dimensional camera image, even in low visibility environments.
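A hedged sketch of such a training step is given below, assuming the training database provides matched pairs of low-quality fused sonar images and high-quality reference images as single-channel tensors. The small convolutional network, loss and hyper-parameters are placeholders for the example, not the networks of the invention.

```python
# Illustrative sketch: train an image-to-image network on matched
# (sensor image, reference image) pairs from the training database.
import torch
import torch.nn as nn

class SonarToReferenceNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, kernel_size=3, padding=1),
        )

    def forward(self, x):
        return self.layers(x)

def train(model, paired_loader, epochs=10, lr=1e-3):
    """paired_loader yields (sensor_image, reference_image) batches."""
    optimiser = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for sensor_image, reference_image in paired_loader:
            optimiser.zero_grad()
            loss = loss_fn(model(sensor_image), reference_image)
            loss.backward()
            optimiser.step()
    return model
```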
The application of the automatic operation of neural networks to the system and method according to the invention is illustrated below with detailed schematic descriptions. For a better understanding of the figures, they are provided with captions.
The training schemes in
The cases shown in
Neural network learning schemes, such as the cases (a)-(c) in
The first step is to create the experimental arrangement. The next necessary step is the calibration and synchronisation of the sensors. Then, the relative location (position and orientation) of each device, i.e. the head-mountable device(s) 10, the deployed static sensor assembly 40 and/or the deployed movable sensor assembly 50, is measured and, if there are any additional sensors, their calibration and the synchronisation of their communication are also carried out. The data collection is an iterative process, taking place in a known environment with different parameters. Thus, the artificially designed environment can be changed by changing or adjusting the parameters, the examined object, or even the lighting conditions. In each examined environment, as many measurements as possible are performed, and the sensors 1010, or more precisely the movable sensors, are then moved to a given position and orientation. This should be done in such a way that the entire surface of the examined object is measured, with overlaps, from different distances. One possible method is to take measurements with the highest possible orientation resolution from points selected at equal distances on a bounding circle, considering the object as the centre. From each of these positions, a sensor measurement (multimodal measurement and data acquisition) is performed, which includes a synchronised sonar image and/or a 2-dimensional camera image, and possibly a depth camera image. In addition, if the known measurement parameters and an existing model of the object, i.e. appropriate data, are available, the reference data can be calculated based on these. The data of each synchronised measurement are stored in pairs with the reference data, on the basis of which the neural networks are later trained.
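The sampling strategy on the bounding circle described above can be illustrated with the following sketch, which places equally spaced measurement positions on a circle around the examined object and points each measurement towards the object centre; the number of positions and the planar (2-D) simplification are assumptions for the example.

```python
# Illustrative sketch: generate measurement poses on a bounding circle around
# the examined object, each oriented towards the object centre.
import numpy as np

def measurement_poses(object_center, radius, n_positions=36):
    """Return (position, yaw_towards_object) pairs on a circle around the object."""
    cx, cy = object_center
    poses = []
    for angle in np.linspace(0.0, 2.0 * np.pi, n_positions, endpoint=False):
        position = np.array([cx + radius * np.cos(angle), cy + radius * np.sin(angle)])
        yaw_towards_object = np.arctan2(cy - position[1], cx - position[0])
        poses.append((position, yaw_towards_object))
    return poses
```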
The system and method according to the invention can be used in poor visibility conditions, such as extreme weather conditions, for example in disaster situations, typically where the observation of a target object with conventional camera sensors is not possible or is only possible to a very limited extent. Environmental conditions that create such poor visibility on land can be, for example, smoke from combustion or another chemical reaction, desert sandstorms, Martian dust storms, or poor visibility in rain or snowfall. The solution according to the invention can also resolve rescue difficulties in underwater disaster situations, where darkness due to the depth or impurities in the water prevents the use of a camera.
The system has the advantage of being able to provide an augmented reality view even when visibility is poor due to the lighting conditions and/or conditions such as contamination in water. The system is capable of providing an augmented reality view even at zero visibility.
The advantage of the solution according to the invention is that different sensors can be used depending on the target area to be monitored, for example underwater flow sensors. During an underwater search or rescue operation, it may happen that even objects thought to be static, such as the hull of a boat, are displaced by the current. By measuring the current with sensors and determining the position of the target object, the solution also allows continuous real-time monitoring of such displacement. With special equipment, the user can navigate in a water environment with sufficient accuracy even in poor visibility conditions. The augmented reality visualisation gives the user a visual representation of his/her surroundings and enables him/her to detect effectively in difficult environmental conditions.
Another advantage is that, with the help of a suitable machine learning-based neural network, the diver can automatically visualise his/her environment. Neural networks can also be trained with data collected in artificial environments. For example, data for an aquatic environment can be collected in a pool that has been made suitable for generating water flow. This is possible, for example, by introducing n×m flow channels per side (typically spaced every metre), where each flow channel can be switched on independently. Using this approach, it is possible to create either a stronger, river-like drift or, by inserting a mechanical module that moves back and forth, a sea-wave-like motion. The neural network can thus be trained under the desired flow conditions, thereby also best approximating natural situations.
A further advantage of the inventive solution is that by using a camera, the 3-dimensional image can be transformed into a 2-dimensional image, so that the system and the method can be used to display the augmented reality view as a conventional map-like image.
Number | Date | Country | Kind
---|---|---|---
P2100311 | Aug 2021 | HU | national

Filing Document | Filing Date | Country | Kind
---|---|---|---
PCT/IB2022/060384 | 10/28/2022 | WO |