The present disclosure generally relates to generating a 3D model of an area. In particular, the disclosure relates to the generation of a live 3D Tour of an area which enables a user to immersively and interactively experience the area virtually as if the user were physically there in real-time. More particularly, the area is an indoor area.
3D modeling of an area generally includes scanning the area to create a point cloud, transforming the point cloud into a mesh model, texturing the mesh model, and rendering the textured mesh model to generate the 3D model. Conventionally, 3D models are static models. For example, the 3D models are based on what was originally scanned. Although the 3D model can be manipulated, it nevertheless remains based on what was originally scanned. In other words, the 3D models do not depict the modeled area in real-time; the model is not a live 3D model, but instead is a static 3D model.
From the foregoing discussion, there is a desire to provide real-time live 3D models of an area.
Real-time live 3D models are disclosed. In one embodiment, a system for displaying an immersive interactive live digital twin of an indoor area is disclosed. The system includes a static digital twin module. The static digital twin module includes a static digital twin model of the indoor area. The system also includes an input module. The input module is configured to receive live feeds of the indoor area from a plurality of cameras located in the indoor area. The system also includes an integration module. The integration module is configured to stitch the live feeds of the indoor area to the static digital twin model to create the immersive live digital twin. The system also includes an interactive area disposed in the digital twin model. The interactive area includes an action button for controlling an interactive component. The live feeds capture the interactive component being controlled by a user when the action button is selected. The system also includes a display module for displaying the immersive live digital twin to provide a user with the perception that the user is immersed within the immersive live digital twin. The system further includes a navigation module to navigate to different parts of the immersive live digital twin.
In another embodiment, a method for generating an immersive interactive live digital twin of an indoor area is disclosed. The method includes scanning the indoor area to create a static digital twin model of the indoor area. The static digital twin includes an interactive area with an action button for controlling an interactive component. The method also includes obtaining live feeds of the indoor area from a plurality of cameras located in the indoor area. The method also includes stitching the live feeds to the static digital twin model to create the immersive live digital twin of the indoor area. The method also includes displaying the immersive live digital twin where a user is immersed within the immersive live digital twin. The method also includes navigating through the immersive live digital twin to different parts of the indoor area. The method further includes selecting the action button to control the interactive component. The live feeds capture the interactive component being controlled.
These and other advantages and features of the embodiments herein disclosed will become apparent through reference to the following description and the accompanying drawings. Furthermore, it is to be understood that the features of the various embodiments described herein are not mutually exclusive and can exist in various combinations and permutations.
In the drawings, like reference characters generally refer to the same parts throughout the different views. Also, the drawings are not necessarily to scale, emphasis instead generally being placed upon illustrating the principles of various embodiments. In the following description, various embodiments of the present disclosure are described with reference to the following, in which:
Embodiments relate to generating a 3D digital replica or twin of a location or area. The location, in one embodiment, is an indoor area such as an office or building. The building may have a single floor or multiple floors. Other types of indoor areas may also be useful. In other embodiments, the location may be an outdoor area.
In one embodiment, video feeds of the area are merged or stitched into a digital model to produce a unified 3D reconstruction of the area, resulting in a 3D digital replica. A user can virtually tour the digital replica. For example, the user can navigate to any position within the digital replica and view that position of the digital replica as if the user were standing there. The user may turn and view the area from the position. This gives the user a 3D immersive experience from a specific position of the digital replica. For example, the user may navigate to any position within the digital replica and see everything surrounding the user animated. This produces a realistic 3D immersive tour of the area.
In another embodiment, the video feeds are live video feeds which are stitched to the digital model. This produces a real-time immersive 3D virtual tour of the location. With real-time video feeds stitched, the user can see exactly what is going on in real-time, as if the user were physically there at that time. For example, the user may navigate to any position within the digital replica and see everything surrounding the user animated and in real-time. This includes people who are at the location and what they are doing in real-time. Of course, when we refer to real-time, there may be delays related to processing (e.g., merging the video feeds) and data transmission. Other delays may also be involved. However, such delays may be considered negligible. For example, the 3D real-time virtual tour of the replica may still be considered real-time or essentially real-time. Real-time may refer to real-time or near real-time, for example, accounting for delays due to processing. The digital replica with the live video feeds may be referred to as a live digital twin, giving the experience of being physically there at that moment in time.
In another embodiment, the real-time virtual 3D tour is a real-time interactive virtual 3D tour. For example, a user may interact with elements or components of the area. This, for example, may include a robot, a television, a door or other components of the area. Hot or web buttons may be included in the tour for a user to interact with a component associated with a web button.
The immersive experience not only blurs the lines between virtual and physical realms but also enhances user engagement, making one feel genuinely present in the explored location. Furthermore, the immersive, immersive real-time, and immersive real-time interactive experiences represent a groundbreaking shift in how a user interacts with and experiences a virtual space, offering a unique and immersive way to explore the world without ever leaving the comfort of the user's device.
To create the 3D digital replica, a 3D digital model of the indoor area is created. The digital model may be created by scanning the indoor area completely. For example, the digital model is a 3D digital model of the complete area. Video feeds are merged with the digital model to create a static 3D digital replica of the area. This, for example, produces a static tour model through which the user can navigate. Live video feeds may be merged with the digital model to create a real-time 3D digital replica of the area. The real-time digital replica depicts what is occurring in the area in real-time. The stitching may be continuously performed in the background as the live feeds are received.
After scanning is completed, a 3D reconstruction of the scan data is performed at 120. The reconstruction, for example, utilizes artificial intelligence (AI) to reconstruct the scan data. For example, the reconstruction process includes processing the data to generate point clouds. The point clouds are used to create 3D models of the scanned space or spaces. The different scans, for example, generate data for the different spaces and are processed using AI to form an overall reconstruction of the area of interest. The area of interest may be from a single scan or multiple scans. For example, the area of interest may be a room, an office space, a floor of a building, floors of a building or all floors of a building. Other areas of interest may also be useful. The point clouds are used to generate a 3D BIM model at 130, a 3D mesh model at 132 and a 3D tour model at 134. The various models are different 3D digital representations or replicas of the scanned space or area of interest. For example, different texturing is employed to generate the different 3D models.
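By way of a non-limiting illustration, the following is a minimal sketch of how a point cloud generated during reconstruction could be converted into a mesh model using the open-source Open3D library; the file names and parameter values (voxel size, Poisson depth, density cutoff) are assumptions for the example and are not the disclosed AI-based reconstruction.

# Illustrative sketch only: converts a scanned point cloud into a triangle mesh
# using the open-source Open3D library. File names and parameter values are
# assumptions for the example.
import numpy as np
import open3d as o3d

# Load the point cloud produced by the scan-reconstruction step (path is hypothetical).
pcd = o3d.io.read_point_cloud("scan_area_of_interest.ply")

# Downsample and estimate normals, which Poisson surface reconstruction requires.
pcd = pcd.voxel_down_sample(voxel_size=0.02)
pcd.estimate_normals(
    search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=0.1, max_nn=30))

# Reconstruct a triangle mesh from the point cloud (one of several possible methods).
mesh, densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(pcd, depth=9)

# Remove low-density vertices that typically correspond to reconstruction noise.
dens = np.asarray(densities)
mesh.remove_vertices_by_mask(dens < np.quantile(dens, 0.05))

o3d.io.write_triangle_mesh("mesh_model.ply", mesh)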
In one embodiment, the BIM model is a basic 3D model of the scanned area of interest.
In one embodiment, a user may navigate to different parts of the area of interest using any of the models at 140. For example, each model may include a control panel to switch from one model to another as well as to migrate from one position to another. In addition, a user may zoom in or zoom out (e.g., moving closer or farther away). Migrating from one position to another may also be achieved using the pointer or mouse to move and select the desired position.
In one embodiment, immersive locations are dispersed within the indoor area to generate the real-time 3D tour model. The immersive locations include cameras for providing live video feeds for processing to generate the real-time 3D tour model. A user may navigate to an immersive location at which the user would view the area as if the user were standing there. For example, a user may view 360° outwardly from the immersive location. Live video feeds from the immersive camera are merged with the 3D model. The 3D model merged with the live video feeds may be referred to as a real-time or live immersive digital twin of the area.
In one embodiment, video fusion is employed to merge multiple video feeds from the immersive locations with the 3D model to produce a coherent 3D scene. A video fusion module of the system is configured to fuse the multiple video feeds with the 3D model. The video fusion module, in one embodiment, includes an end-to-end deep learning architecture that fuses dynamic 3D geometry and video texture for real-time navigation. The deep learning architecture is configured to predict depth from video or imagery and create 3D spatial geometry therefrom.
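By way of a non-limiting illustration, the following sketch shows one way a depth-prediction step could be performed on a single video frame using a publicly available monocular depth network (MiDaS, loaded through torch.hub); the specific model is an assumption for the example and is not the disclosed end-to-end architecture.

# Illustrative sketch only: predicting per-pixel depth from a single video frame
# with a publicly available monocular depth network (MiDaS). The model choice is
# an assumption, not the disclosed deep learning architecture.
import cv2
import torch

midas = torch.hub.load("intel-isl/MiDaS", "MiDaS_small")
midas.eval()
transform = torch.hub.load("intel-isl/MiDaS", "transforms").small_transform

frame = cv2.imread("camera_frame.jpg")            # hypothetical input frame
rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

with torch.no_grad():
    prediction = midas(transform(rgb))            # inverse-depth prediction
    depth = torch.nn.functional.interpolate(
        prediction.unsqueeze(1), size=rgb.shape[:2],
        mode="bicubic", align_corners=False).squeeze()

# The depth map can then be back-projected into 3D spatial geometry using the
# intrinsics of the immersive-location camera.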
In one embodiment, the video fusion module merges video data captured from the different immersive locations, which is then streamed in real-time into a 3D scene. This innovative approach merges real-time video footage with the 3D digital twin, allowing for dynamic and immersive visualizations. As the 360-degree camera captures a complete panoramic view, it provides a comprehensive perspective of the surroundings. This footage is seamlessly integrated into the 3D scene, for example, at 60 FPS, ensuring that viewers can experience a real-time and realistic depiction of the environment.
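By way of a non-limiting illustration, the following is a minimal sketch of how such a continuous texture-update loop might be organized; the stream addresses, the 60 FPS pacing constant, and the update_scene_texture() hook are hypothetical stand-ins for whatever renderer and camera infrastructure is actually used.

# Illustrative sketch only: a high-level loop that keeps the 3D scene's video
# textures in sync with incoming camera frames. update_scene_texture() is a
# hypothetical hook standing in for the actual renderer.
import time
import cv2

TARGET_FPS = 60                          # assumed target frame rate

def update_scene_texture(scene, camera_id, frame):
    """Hypothetical placeholder: upload the frame as the texture assigned to
    the region of the 3D scene covered by this camera."""
    scene[camera_id] = frame             # stand-in for a real texture upload

def run_fusion_loop(scene, streams):
    captures = {cid: cv2.VideoCapture(url) for cid, url in streams.items()}
    frame_interval = 1.0 / TARGET_FPS
    while True:
        start = time.time()
        for cid, cap in captures.items():
            ok, frame = cap.read()
            if ok:
                update_scene_texture(scene, cid, frame)
        # Pace the loop so texture updates track the target frame rate.
        time.sleep(max(0.0, frame_interval - (time.time() - start)))

# Example with two hypothetical 360-degree camera streams:
# run_fusion_loop({}, {"cam_lobby": "rtsp://cam1/stream", "cam_cafe": "rtsp://cam2/stream"})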
Furthermore, the present technology may be applied to other applications, such as real-time surveillance and augmented reality systems. The real-time streaming capability means that changes in the physical environment are immediately reflected in the 3D scene, providing up-to-date information that facilitates decision-making in various fields, including urban planning, security, and entertainment.
In one embodiment, processing of the image data includes the techniques used to transform raw video feeds from, for example, a 360-degree camera into a structured 3D scene, which involves dynamically updating the textures with video frames. For example, 3D video fusion is employed. Video fusion is high-performance adaptive video texturing and rendering of a large 3D scene. For example, video fusion seamlessly integrates video content into a 3D model utilizing depth estimation, advanced spatial analysis such as segmentation, and motion tracking. The 3D model, for example, may be the mesh model. Other types of 3D models may also be useful.
Regarding depth estimation, this process involves determining the distance between the camera and objects in the video by calculating depth and detecting shapes. This is achieved by comparing differences between views from two cameras. For example, stereo vision is employed. By analyzing these differences or disparities, the system can construct a depth map that represents the distance of objects from the camera.
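By way of a non-limiting illustration, the following sketch computes a disparity-based depth map from two views using OpenCV's semi-global block matching; the calibration values (focal length, baseline) and matcher parameters are assumed for the example.

# Illustrative sketch only: disparity-based depth estimation from two camera views.
import cv2
import numpy as np

left = cv2.imread("view_left.png", cv2.IMREAD_GRAYSCALE)     # hypothetical inputs
right = cv2.imread("view_right.png", cv2.IMREAD_GRAYSCALE)

# Compute the disparity map by comparing differences between the two views.
matcher = cv2.StereoSGBM_create(minDisparity=0, numDisparities=128, blockSize=5)
disparity = matcher.compute(left, right).astype(np.float32) / 16.0

# Depth is inversely proportional to disparity: depth = focal_length * baseline / disparity.
FOCAL_PX, BASELINE_M = 700.0, 0.12       # assumed calibration values
depth_map = np.where(disparity > 0, FOCAL_PX * BASELINE_M / disparity, 0.0)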
As for segmentation, it is performed on the depth map to distinguish and isolate various shapes and objects within the scene. Segmentation techniques classify different regions of the depth map, enabling the identification of distinct objects and their boundaries. This comprehensive approach not only enhances the accuracy of depth perception but also improves the recognition and understanding of complex scenes by dividing a video feed into segments that represent different objects or regions. Advanced machine learning models, such as convolutional neural networks (CNNs), are employed to recognize and categorize different areas of the image according to their characteristics.
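By way of a non-limiting illustration, the following sketch applies a pre-trained convolutional segmentation network (DeepLabV3 from torchvision) to a video frame to obtain per-pixel class labels; the particular model is an assumption, as the description only calls for CNN-based segmentation.

# Illustrative sketch only: semantic segmentation of a frame with a pre-trained CNN.
import torch
from torchvision import transforms
from torchvision.models.segmentation import deeplabv3_resnet50
from PIL import Image

model = deeplabv3_resnet50(weights="DEFAULT").eval()
preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

frame = Image.open("camera_frame.jpg").convert("RGB")   # hypothetical frame
with torch.no_grad():
    output = model(preprocess(frame).unsqueeze(0))["out"]

# Per-pixel class labels identifying distinct objects/regions and their boundaries.
labels = output.argmax(dim=1).squeeze(0)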
Motion tracking involves monitoring the movement of objects over a sequence of video frames. Motion tracking may employ techniques designed to detect and track various points or features as they traverse through both space and time. For example, optical flow algorithms may be used to estimate the motion between two frames by observing the changes in pixel intensity, while feature-based tracking methods, such as the Kanade-Lucas-Tomasi (KLT) tracker, identify and follow key points through a video sequence. Motion tracking enables the system to understand the dynamics of moving objects so that they can be superimposed onto the dynamically generated animated 3D scene.
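By way of a non-limiting illustration, the following sketch tracks feature points between two consecutive frames in the Kanade-Lucas-Tomasi manner using OpenCV's pyramidal Lucas-Kanade optical flow; the parameter values and file names are assumptions for the example.

# Illustrative sketch only: KLT-style feature tracking between two consecutive frames.
import cv2
import numpy as np

prev = cv2.imread("frame_t0.png", cv2.IMREAD_GRAYSCALE)   # hypothetical frames
curr = cv2.imread("frame_t1.png", cv2.IMREAD_GRAYSCALE)

# Detect good features to track in the first frame.
points = cv2.goodFeaturesToTrack(prev, maxCorners=200, qualityLevel=0.01, minDistance=7)

# Track those points into the next frame with pyramidal Lucas-Kanade optical flow.
next_points, status, _ = cv2.calcOpticalFlowPyrLK(prev, curr, points, None,
                                                  winSize=(21, 21), maxLevel=3)

# Keep only successfully tracked points; their displacement describes object motion.
tracked_prev = points[status.flatten() == 1]
tracked_next = next_points[status.flatten() == 1]
motion_vectors = tracked_next - tracked_prev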
By leveraging these AI tracking techniques on video streams, the system can generate a comprehensive map of object movements, meshing these animated objects into the 3D scene, and also enhancing its ability to predict future positions and interactions within the environment. For example, a 3D real-time environment can be created by mapping video textures onto a 3D mesh and point clouds.
Stitching video textures into 3D virtual tours involves the process of integrating video content onto 3D models to create immersive experiences. The use of 3D point clouds, geometry and panoramic videos in virtual tours allows for a more realistic and compelling presentation of properties, products, or environments, enhancing the overall user experience and conveying a more visceral sense of "being there" while navigating the 3D tour. This represents a groundbreaking advancement, unmatched by any current virtual tour technology available.
As discussed, in the live 3D tour model, live video feeds are stitched to the static 3D tour model. For example, the live 3D tour will show in real-time what is occurring in the selected area. For example, a user taking the live 3D tour may navigate the area. Wherever the user navigates to, the tour will show what is occurring in real-time at that location. For example, if there are people, it will show the people. The stitching will also be able to provide the perspective, as selected by the user, of what is occurring in the selected area. Stitching, for example, may be performed using artificial intelligence (AI). The selected area will be displayed with the appropriate perspective as the user navigates the area using the live 3D tour.
In one embodiment, reduced processing by the AI can be achieved by selectively stitching the differential between the real-time and non-real-time videos. For example, only what differs between the non-real-time and real-time videos is stitched to the static 3D tour model. In another embodiment, only certain types of differentials are stitched. For example, differentials which are identified as people, animate objects, mirrors, and windows are stitched. Other configurations of stitching the live video feeds to the static 3D model may also be useful.
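By way of a non-limiting illustration, the following sketch shows one way the differential between a live frame and the corresponding pre-recorded reference frame could be isolated so that only changed regions are passed to stitching; the thresholds and the changed_regions() helper are hypothetical.

# Illustrative sketch only: isolating regions where the live feed differs from the
# pre-recorded reference frame, so only the differential needs to be stitched.
import cv2

def changed_regions(live_frame, reference_frame, threshold=30, min_area=500):
    """Return bounding boxes of regions where the live frame differs from the reference."""
    diff = cv2.absdiff(cv2.cvtColor(live_frame, cv2.COLOR_BGR2GRAY),
                       cv2.cvtColor(reference_frame, cv2.COLOR_BGR2GRAY))
    _, mask = cv2.threshold(diff, threshold, 255, cv2.THRESH_BINARY)
    mask = cv2.dilate(mask, None, iterations=2)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    return [cv2.boundingRect(c) for c in contours if cv2.contourArea(c) >= min_area]

# Only the returned regions (e.g., those classified as people or other animate
# objects) would then be stitched to the static 3D tour model.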
The user may select a live or non-live tour. When a non-live tour is selected, the static 3D tour model is used. The user can navigate the static tour model. When a live tour is selected, the user navigates the live 3D tour model. The user can navigate to the desired position of the area of interest at 121. For example, the user can select a position within a room to view. For non-live, a pre-recorded video feed is stitched to the tour model at 131. The user can view the area of interest for the selected position. In the case of a live 3D tour, a live video feed of the area of interest is stitched to the 3D tour model. This produces a live immersive experience for the user from the selected position.
At the selected position, the user may move around and view the surrounding area. For example, the user may navigate closer to an object at the selected position or turn to view a different direction of the surrounding area at the selected position. Video cameras are strategically located in the room to enable capturing the whole room from any position within the room and in any direction. For example, video cameras may be mounted on different parts of the walls of the room and/or on objects, such as lamps. The camera, for example, is a 360° camera. The cameras may be mounted on swivel or rotatable mounts and equipped with telescopic or telephoto lenses for zooming in and out. In some cases, the cameras may be multi-directional cameras. Other configurations of cameras may also be useful.
Based on the location and direction which the user has selected, the system selects the appropriate video feeds to stitch. For example, if the user moves towards an object, the camera may zoom in as the user gets closer. If the user rotates and moves in another direction, the video feed or feeds capturing the view from the other direction are stitched. For example, feeds from more than one camera can be stitched. For example, the system is configured to stitch video feeds from any position and in any direction of the room or area. For example, as the user navigates from one position to another, the video feeds from the appropriate camera or cameras are stitched. The user may also continue to navigate within the room or to another area, such as another room at 141. The 3D tour continues until terminated by the user.
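By way of a non-limiting illustration, the following sketch shows one way the system could rank cameras against the user's virtual position and viewing direction when deciding which feeds to stitch; the select_feeds() helper and its scoring rule are hypothetical.

# Illustrative sketch only: choosing which camera feeds to stitch based on the
# user's virtual position and viewing direction. The scoring rule is an assumption.
import numpy as np

def select_feeds(user_pos, user_dir, camera_positions, max_feeds=2):
    """Rank cameras by how well they cover the user's current view.

    camera_positions: dict mapping camera_id -> 3D position of the camera.
    Returns the ids of the best-scoring cameras, whose live feeds are then stitched.
    """
    user_dir = np.asarray(user_dir, dtype=float)
    user_dir /= np.linalg.norm(user_dir)
    scores = {}
    for cam_id, cam_pos in camera_positions.items():
        to_cam = np.asarray(cam_pos, dtype=float) - np.asarray(user_pos, dtype=float)
        distance = np.linalg.norm(to_cam)
        alignment = float(np.dot(user_dir, to_cam / distance)) if distance > 0 else 1.0
        # Favor cameras roughly along the viewing direction; penalize far-away ones.
        scores[cam_id] = alignment - 0.1 * distance
    return sorted(scores, key=scores.get, reverse=True)[:max_feeds]

# Example with two hypothetical wall-mounted cameras:
# select_feeds([0, 0, 1.6], [1, 0, 0], {"cam_a": [4, 0, 2.5], "cam_b": [-4, 1, 2.5]})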
Referring to
A user may navigate to different areas or positions within the BIM model. To navigate to different areas or positions, a mouse may be employed. For example, the user may move the pointer using the mouse to the desired position and click to select. The selected area is displayed on the user interface in greater detail. The user may further look at different parts of the selected area using the mouse.
In addition, the BIM model may include action buttons or hot spots 335. A user may select a button using the mouse. For example, the user may navigate the cursor to the desired action button and click to select it. Selecting a button may generate a desired action.
The user interface includes a control panel 310. The control panel includes various control buttons for the user to select.
Referring to
In
As shown, the mesh model is generated from scanned data. The mesh model includes walls defining the interior space of the building. The building includes various rooms or areas, such as a lobby, conference rooms, office rooms and a cafeteria. The building may include other areas. The areas, for example, depend on the actual building from which the scanned data is based or obtained. In one embodiment, the mesh model includes an interactive area 446 in which a user can interact with elements. A user, for example, navigates to an area 442 outside of the interactive area using, for example, the mouse. The user selects the area by clicking the mouse. The user may switch between different modes. For example, the user may switch from the BIM mode to the 3D tour mode or the live 3D tour mode using the control panel.
Referring to
The user may navigate closer to the interactive area 446, as shown in
The example, as described, relates to controlling a robot to view an item and, if desired, purchase the item. The action buttons, as described, can be applied to other purposes, such as opening or closing a door, controlling what is displayed on a video screen, as well as controlling other types of components.
The present disclosure may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. The foregoing embodiments, therefore, are to be considered in all respects illustrative rather than limiting the invention described herein. The scope of the invention is thus indicated by the appended claims, rather than by the foregoing description, and all changes that come within the meaning and range of equivalency of the claims are intended to be embraced therein.
This application claims the benefit of provisional application, Titled “INTERACTIVE LIVE DIGITAL TWIN OF AN INDOOR AREA”, having Application No. 63/512,663 (Attorney Docket No. VTPLP2022PRO23US0), which was filed on Jul. 10, 2023. This application is also a continuation-in-part of co-pending application, Titled “IMMERSIVE LIVE DIGITAL TWIN OF AN INDOOR AREA”, having application Ser. No. 18/646,820 (Attorney Docket No. VTPLP2023NAT03US0), which was filed on Apr. 26, 2024. The disclosures of the above-referenced applications are incorporated herein by reference in their entireties for all purposes.
Provisional Application: No. 63/512,663, filed Jul. 2023 (US)
Parent Application: Ser. No. 18/646,820, filed Apr. 2024 (US)
Child Application: Ser. No. 18/769,378 (US)