The present invention relates to a computer implemented method, device and system for processing 3D media data, in particular a computer implemented method, device and system for locating a virtual 3D location within a master 3D digital asset, such as for locating an auxiliary digital asset within the master 3D digital asset.
Three-dimensional (3D) digital content, such as 360/180 degree, immersive digital content or virtual reality (VR) videos, is becoming ever more commonplace. In recent years, online services such as YouTube™ have enabled 360 digital content to be uploaded, accessed and streamed by anyone with an internet connection and an internet connected user device. The user devices which can access and play 3D digital content can vary from conventional computers, smartphones, and tablet devices to virtual reality (VR) headsets, with each type of device giving the user some form of VR experience of the 3D video.
A hurdle to the availability of 3D digital content is the complexity involved in generating it. In particular, hotspot type content can be particularly problematic in terms of its generation because it contains multiple sources of digital media located within different virtual positions of the 3D environment; these sources have been obtained from different physical locations in a real world scene. The individual sources of the digital media content are often recorded independently and then have to be stitched together in a laborious editing process which involves significant user input and manipulation to ensure that the different sources of digital content are synchronised.
For example, a master 3D digital asset may contain various hotspot locations, i.e. positional locations within the master asset from which auxiliary digital assets, such as 2D or further 360/180 degree or 3D content can be accessed and viewed by a user. These auxiliary digital assets are obtained from different auxiliary content generation devices located at different positional locations and having different recording start times within the overall real world scene. When the digital assets are being edited prior to distribution, a content editor will manipulate the various digital assets and locate them virtually to different positional locations within the master 3D digital asset; the different positional locations corresponding to the real world location of the content generation device from which the auxiliary digital content was recorded. The content editor interfaces provided to generate and manipulate the various digital assets and in particular to locate them within the virtual scene of the master 3D digital asset are not well configured to enable accurate placement efficiently and easily within the virtual scene. Moreover, the underlying master 3D digital asset has to be rendered within its corresponding 3D player during the content editing process, such that there is an additional technical constraint on how a content editor is implemented.
It is an aim of the present invention to solve the aforementioned problems and other problems associated with the processing of 3D digital media.
In a first aspect of the invention, there is provided a computer implemented method for processing digital data including a master 3D digital asset and an auxiliary digital asset, the method comprising:
By providing a positional mesh on a 3D digital asset player, improved location determination of user input within the virtual scene currently being displayed by the 3D asset player takes place. In particular, the invention is particularly advantageous for identifying user input for associating auxiliary digital assets at positional locations within the master 3D digital scene, for example in a hotspot generation process.
For the purposes of the present disclosure, the master 3D digital asset may comprise 360, VR or 3D video data recorded from a real world scene. The auxiliary digital assets may comprise 2D, 360, VR or 3D video data of the real world scene, or audio data of the real world scene. The master 3D digital asset may be generated via a master content generation device typically comprising 360/3D/VR video capture devices (including associated audio capture capability). The auxiliary digital assets may be generated from one or more auxiliary content generation devices. The auxiliary content generation devices for the auxiliary digital assets typically comprise 2D or 360/3D/VR video capture devices (including associated audio capture capability) or merely an audio capture device.
The step of identifying a display screen position may comprise enabling a marker for the auxiliary digital asset to be dragged and dropped via the user input device from a first area of a display screen to a second area on the display screen (comprising the 3D digital asset player and a displayed current view of the master 3D digital asset) onto which the first positional mesh is provided. The drop point of the marker is at an X-Y position on the display screen. Alternatively, the step of identifying a display screen position may comprise enabling a marker for the auxiliary digital asset to be placed, by clicking with a pointing device or by touching a touch-sensitive display, at an X-Y position on the display screen in the second area overlaying the 3D digital asset player and current view of the master 3D digital asset. With the mesh provided over the 3D player, this process provides an effective way of interacting with an existing 3D player to determine hotspot placement at virtual 3D positional locations in the current view displayed by that player in a content editor mode.
The step of determining may comprise:
The first positional mesh may assign a plurality of display screen positions within the display screen with corresponding virtual 3D positional locations at a first depth position within the master 3D digital asset.
The computer-implemented method may further comprise:
The computer-implemented method may further comprise:
The computer-implemented method may further comprise:
The computer-implemented method may further comprise switching from playing the master 3D digital asset to playing of the auxiliary digital asset upon activation of the hotspot marker.
The computer-implemented method may further comprise:
The master 3D digital asset may comprise a 3D digital scene, for example a 3D video stream or 360 degree video content.
The auxiliary digital asset may comprise an auxiliary 3D digital scene, for example an auxiliary 3D video stream.
In a second aspect of the present invention, there is provided a processing device configured to perform the aforementioned method.
The processing device may be a distributed processing system including a user device for performing some of the functionality of the processing device in a distributed fashion along with an associated media processing device (to which the user device is connected). The media processing device may operate in content processing or content generation mode based on content stored in a media server. The user device may be one or more of a personal computer, e.g. desktop or laptop computer, a tablet device, a mobile device, e.g. a smartphone, and a virtual reality (VR) device, such as a VR headset. The user device may comprise a display, e.g. a display screen, such as a touch sensitive display screen, configured for display of the master 3D digital asset and auxiliary digital asset. The user device, media server and media processing device are each configured to be in communication with each other for transmitting requests for and transmitting and receiving the master 3D digital asset and auxiliary digital asset. Communication between the media processing device, user device and media server may take place via one or more communication links, such as the internet, and the communication links may be wired or wireless, or a combination of the two based on any known network communication protocol.
The present invention will now be described purely by way of example with reference to the accompanying drawings in which:
Referring to
Referring to
Referring to
Each of the remaining plurality of content generation devices 301b . . . 301n may be an auxiliary content generation device each configured to generate auxiliary digital media assets of video and audio, and may each comprise one or more of: a further 3D video capture device, a 2D video capture device, an audio capture device. The 3D/2D video capture devices are configured additionally to generate audio data alongside the video data. The audio capture device is configured to generate just audio data. Each auxiliary content generation device is configured to generate its auxiliary digital asset from a fixed or moving auxiliary physical location within the real world scene.
The digital media assets thus generated by the content generation devices 301 comprise at least one master 3D digital asset 304a comprising 360 video and audio data of the real world scene, and one or more auxiliary digital assets 304b . . . 304n comprising 3D, 2D and/or audio data of the real world scene. The master 3D digital asset 304a and one or more auxiliary digital assets 304b . . . 304n thus acquired are transmitted to the media server 104 and stored therein as digital asset files. Storage in the media server 104 of each digital asset can take place in real time, e.g. during capture, or can take place after acquisition, possibly even after a significant delay. For example, a user of an auxiliary content generation device may upload the auxiliary digital asset of the scene during capture, or after some time, for example many days after acquisition. Each digital asset 304 stored in the media server 104 comprises or has associated metadata identifying the real world scene or event captured, along with the time and, optionally, physical location data of the content generation device 301 within the real world scene during capture of the asset. The physical location data may be assigned automatically, for example based on an automatic location determination device within the content generation device, or may be assigned later by the user upon upload to the media server 104.
The times of capture of the master 3D digital asset 304a and auxiliary digital assets 304b . . . 304n of the scene may overlap at least in part, but typically the auxiliary digital assets 304b . . . 304n would be timed such that they have been acquired wholly within the capture period of the master 3D digital asset 304a. The start time of each auxiliary digital asset may vary, and since the auxiliary content generation devices 301b . . . 301n are independent of each other (possibly acquired completely independently via different users of each auxiliary generation device), there is typically no synchronous time stamp available across each digital asset in relation to when it was acquired with respect to one or more of the other digital assets. In particular, there is no information available concerning the start time of each auxiliary digital asset with respect to a playback time of the master 3D digital asset 304a. In prior art systems, the time synchronisation data between digital assets is assigned by a 360 content editor who manually reviews each digital asset within a content editor and places each asset on a common timeline for all acquired digital assets for the real world scene.
The media server 104 stores each digital asset 304a . . . 304n upon receipt and in one embodiment associates the individual assets within the media database 306 to a corresponding virtual scene for which there is at least one corresponding master 3D digital asset 304a. As explained above, metadata is generated including data corresponding to the scene. This data is stored in media server 104 such that a scene identifier for each auxiliary digital asset 304b . . . 304n links it to a corresponding master 3D digital asset 304a. This scene data can be stored separately as depicted in media database 306,
In an alternative embodiment the scene data is stored within an asset bundle 304, such that an asset bundle 304 is generated for each master 3D digital asset 304a comprising the master 3D digital asset 304a itself along with its associated auxiliary assets 304b . . . 304n and scene data, including the data linking the auxiliary assets to their corresponding master 3D digital asset, virtual 3D positional information of the auxiliary assets within the master 3D digital asset and time synchronisation data for each auxiliary digital asset within the master 3D digital asset.
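Purely by way of non-limiting illustration, the contents of such an asset bundle may be sketched as a simple data structure. The class and field names below (AssetBundle, AuxiliaryAssetEntry, start_offset_s and so on) are assumptions made for this sketch and are not defined by the embodiments; the representation of positions as (x, y, z) tuples and of time synchronisation data as an offset in seconds is likewise assumed.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class AuxiliaryAssetEntry:
    """Scene data held for one auxiliary digital asset (hypothetical layout)."""
    asset_id: str
    # Virtual 3D positional information within the master asset;
    # None until a hotspot has been placed for this asset.
    virtual_position: Optional[Tuple[float, float, float]] = None
    # Assumed form of the time synchronisation data: start of the auxiliary
    # asset on the master timeline, in seconds; None until determined.
    start_offset_s: Optional[float] = None


@dataclass
class AssetBundle:
    """One bundle per master 3D digital asset (hypothetical layout)."""
    scene_id: str
    master_asset_id: str
    auxiliary: List[AuxiliaryAssetEntry] = field(default_factory=list)

    def unplaced(self) -> List[AuxiliaryAssetEntry]:
        """Auxiliary assets that have no virtual 3D position yet."""
        return [a for a in self.auxiliary if a.virtual_position is None]

    def unsynchronised(self) -> List[AuxiliaryAssetEntry]:
        """Auxiliary assets that have no time synchronisation data yet."""
        return [a for a in self.auxiliary if a.start_offset_s is None]
```

In such a sketch, a content editor could list the entries returned by unplaced() as the markers still awaiting placement.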
Media processing device 102 accesses the media server 104, acquires each auxiliary digital asset 304b . . . 304n for a given scene identifier, processes each auxiliary digital asset 304b . . . 304n to determine its temporal location within its corresponding master 3D digital asset 304a, and stores corresponding time synchronisation data for each auxiliary digital asset 304b . . . 304n within the media database 306.
The media processing device 102 can be configured to process each auxiliary digital asset 304b . . . 304n for temporal information in real time as it is uploaded to media server 104. Alternatively, the media processing device 102 can be configured to process each auxiliary digital asset 304b . . . 304n only upon instigation by a content editor. Either way, a master 3D digital asset 304a must first have been identified and associated based on its corresponding scene identifier to one or more corresponding auxiliary digital assets 304b . . . 304n.
Referring to
The rendered view may be depicted on the display 106a of the user device 106 which has acquired the master 3D digital asset 304a from media server 104. The master 3D digital asset 304a comprises video and audio data of the real world scene. In addition, the master 3D digital asset 304a includes auxiliary asset location identifiers 403 (403b . . . 403n) (“hotspots”) of the locations within the 360 virtual scene of one or more auxiliary digital assets 304b . . . 304n each acquired from one or more of the auxiliary content generation devices 301b . . . 301n when they were positioned within the real world scene during acquisition of the master 3D digital asset 304a. As explained above, each auxiliary digital asset has associated metadata including location data indicative of the physical location within the real world scene, and thus correspondingly location data of its virtual location within the master 3D digital asset 304a, such that the master 3D digital asset 304a includes such location data for displaying the corresponding location identifier 403 for each digital asset at its virtual location during playback. Each auxiliary asset location identifier 403 can be activated during playback upon user input via input device 106b to cause the user device 106 to start playback of the auxiliary digital asset corresponding to the location identifier selected. Each location identifier 403 may be displayed (or made available for selection) within the master 3D digital asset scene only for the time period during which it exists within the master 3D digital asset scene. Thus, if an auxiliary digital asset is only available for a portion of the time (such that it starts part way through the master 3D digital asset 304a and/or finishes before the end of the master 3D digital asset 304a), its corresponding location identifier 403 will only be displayed or made available for that corresponding period of time.
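As a non-limiting sketch of the time-limited availability described above, a player might filter the location identifiers against the current playback time of the master 3D digital asset. The Hotspot type, its field names and the use of seconds on the master timeline are assumptions made for this example only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hotspot:
    """Hypothetical record for one auxiliary asset location identifier."""
    asset_id: str
    start_s: float  # start of the auxiliary asset on the master timeline
    end_s: float    # end of the auxiliary asset on the master timeline


def active_hotspots(hotspots: List[Hotspot], playback_time_s: float) -> List[Hotspot]:
    """Return only the hotspots whose auxiliary asset exists at this playback time."""
    return [h for h in hotspots if h.start_s <= playback_time_s <= h.end_s]
```

Under these assumptions, a hotspot whose auxiliary asset starts part way through the master asset is simply absent from the returned list until the playback time reaches its start.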
Referring to
The editor layer 502 comprises one or more mesh layers 502a . . . 502n, with each mesh layer 502a . . . 502n providing X-Y positional locations for user input at a given depth (Z) position in the current view of the virtual scene of the master 3D digital asset 304a being displayed.
Auxiliary asset location markers 503b . . . 503n (hotspot markers) previously set are displayed by the editor layer 502 in the display 106a. The virtual 3D positional information for each marker within the master 3D digital asset 304a is obtained from media server 104 and rendered as an X-Y position on the current 2D view of the 3D scene based on a transformation of the virtual 3D positional information for each marker and the current viewpoint position within the master 3D digital asset 304a as currently displayed in the player 501.
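The transformation from virtual 3D positional information to an X-Y display position is not specified in detail above. One possible form, given purely as a hedged sketch, is a perspective projection. The sketch assumes a coordinate system with x to the right, y up and z forward, a view described by yaw and pitch angles in radians, and a horizontal field of view hfov; the function names camera_basis and project_marker are hypothetical.

```python
import math
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


def camera_basis(yaw: float, pitch: float) -> Tuple[Vec3, Vec3, Vec3]:
    """Return (right, up, forward) unit vectors for the assumed view angles."""
    sy, cy = math.sin(yaw), math.cos(yaw)
    sp, cp = math.sin(pitch), math.cos(pitch)
    forward = (sy * cp, sp, cy * cp)
    right = (cy, 0.0, -sy)
    up = (-sy * sp, cp, -cy * sp)
    return right, up, forward


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def project_marker(position: Vec3, viewpoint: Vec3, yaw: float, pitch: float,
                   width: int, height: int, hfov: float) -> Optional[Tuple[float, float]]:
    """Map a virtual 3D marker position to an X-Y display position.

    Returns None when the marker lies behind the current viewpoint.
    """
    rel = (position[0] - viewpoint[0],
           position[1] - viewpoint[1],
           position[2] - viewpoint[2])
    right, up, forward = camera_basis(yaw, pitch)
    depth = dot(rel, forward)
    if depth <= 0.0:
        return None
    # Focal length in pixels derived from the assumed horizontal field of view.
    focal = (width / 2.0) / math.tan(hfov / 2.0)
    x = width / 2.0 + focal * dot(rel, right) / depth
    y = height / 2.0 - focal * dot(rel, up) / depth  # display Y grows downwards
    return (x, y)
```

A marker directly ahead of the viewpoint maps to the centre of the display under these assumptions. The sketch does not check whether the result falls outside the visible area.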
Each auxiliary asset location marker 503b . . . 503n can be selected via input device 106b and moved, e.g. via “dragging” with user input device 106b and then “dropping”. Alternative means of positional placement of the auxiliary asset location markers 503b . . . 503n are also contemplated, including touch selection and then touch placement within the mesh layer. The placement at a given X-Y position of a given auxiliary asset location marker 503b . . . 503n within a given mesh layer 502 constitutes the user input which is detected and from which the X-Y positional location is identified, with the given mesh layer 502 providing depth (Z) positional information. Each auxiliary asset location marker 503b . . . 503n corresponds to an auxiliary digital asset 304b . . . 304n, and thus the positional placement of an asset location marker 503b . . . 503n as described above enables positional data within the master 3D digital asset 304a for the corresponding auxiliary digital asset 304b . . . 304n to be generated and stored in the media database 306 or within the master 3D digital asset 304a itself.
Auxiliary asset location markers 503b . . . 503n can be retained in an unused area 504 of the editor layer within the display 106a where their positional information is not used to identify positional locations within the master 3D digital asset 304a. When selected and moved via user input, a given auxiliary asset location marker 503b . . . 503n can be placed within a given mesh layer for determination of desired positional location within the master 3D digital asset 304a as explained above. Moreover, any “placed” auxiliary asset location marker 503b . . . 503n can be removed from its mesh layer, and thus moved back to the unused area 504 so that it becomes available again for placement. As depicted in
User input via input device 106b can select the current mesh layer for receiving user input and thus set the current depth (Z) position for user input detection and positional determination. For example, the input device 106b can select one or more pre-set depth positions via a content editor accessed via user device 106. The pre-set depth positions associated with the master 3D digital asset may have been set by a user when the master 3D digital asset was initially uploaded to the media server 104. The pre-set depth positions are stored in media database 306 and then displayed to the user for selection via the editor layer 502.
Referring to
In step 601, a 3D digital asset player 501 is displayed within the display 106a.
In step 602, a first positional mesh 502a is overlaid on at least a portion of the 3D digital asset player 501 via an editor layer 502. The first positional mesh assigns a plurality of display X-Y positions within the display 106a with corresponding virtual 3D positional locations within the master 3D digital asset 304a, based on the current viewpoint position of the 3D player 501 within the master 3D digital asset and the active depth position for the current active mesh.
In step 603, an X-Y position within the display 106a is identified via a user input being provided via the editor layer and user input device 106b. Typically, this would be by dragging and dropping to or providing input at the X-Y position within the display 106a via the user input device 106b. The Z position is set by the currently selected mesh layer being the active mesh layer.
In step 604, a virtual 3D positional location within the master 3D digital asset 304a is determined via the first positional mesh 502a from the identified display screen X-Y position, and further from the active mesh layer currently selected (since each mesh layer defines a corresponding depth position).
Additional mesh layers 502b . . . 502n may be displayed in step 601 for receiving user input and determining X-Y positional locations. Each mesh layer 502a . . . 502n corresponds to a set virtual depth position within the master 3D digital asset 304a based on the current virtual view displayed. Each mesh layer 502a . . . 502n can be selected individually via user input device 106b based on pre-set defined depths, so as to be set as the active mesh layer for setting the X-Y positional locations at the given Z depth corresponding to the active mesh layer within the master 3D digital asset 304a.
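By way of illustration only, the selection of an active mesh layer and the resulting X, Y, Z output for a placement may be sketched as follows. The EditorLayer and MeshLayer classes and their methods are hypothetical names chosen for this example; the depths are assumed to be supplied as a list of pre-set values.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class MeshLayer:
    """One mesh layer at a pre-set virtual depth (Z) position."""
    depth: float


class EditorLayer:
    """Hypothetical editor layer holding mesh layers, one of which is active."""

    def __init__(self, preset_depths: List[float]) -> None:
        if not preset_depths:
            raise ValueError("at least one pre-set depth is required")
        self.layers = [MeshLayer(d) for d in preset_depths]
        self.active_index = 0  # the first mesh layer is active by default

    def select_layer(self, index: int) -> None:
        """Set the active mesh layer, and hence the depth for later placements."""
        if not 0 <= index < len(self.layers):
            raise IndexError("no mesh layer at index %d" % index)
        self.active_index = index

    def on_drop(self, x: float, y: float) -> Tuple[float, float, float]:
        """Combine the identified display X-Y position with the active layer depth."""
        return (x, y, self.layers[self.active_index].depth)
```

In this sketch the same drop position yields a different Z value once a different mesh layer has been selected, which mirrors the role of the active mesh layer in steps 603 and 604.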
Step 603 may comprise the positional placement of a location by detecting the drop location on the display 106a of an auxiliary asset location marker 503b . . . 503n as dragged via the input device 106b from one position on the display 106a to another position, e.g. from the unused area 504 to a given mesh location.
Based on the identified X, Y, Z positional information determined for the placement location and the current viewpoint position in the master 3D digital asset as determined from the 3D player, the media processing device 102 determines the virtual 3D positional location within the master 3D digital asset.
The editor layer 502 implementing mesh layers 502a . . . 502n as overlaid on the playback representation 200 is implemented as an executable module separate from the 3D player. The media processing device 102 receives the X, Y, Z positional information from the editor layer 502 along with the current viewpoint position in the master 3D digital asset 304a from the 3D player 501, and transforms the received X, Y, Z positional information into a virtual 3D positional location within the master 3D asset 304a using the current viewpoint position as a reference point to generate the virtual 3D positional location. The virtual 3D positional location is stored in media server 104, within media database 306, within a digital asset bundle (as described above) or within the master 3D digital asset 304a itself.
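The exact transformation is not set out above. As a hedged sketch only, the received X, Y, Z positional information could be converted by inverting a perspective projection. The sketch assumes that the active mesh layer is a plane at distance Z along the viewing direction, that the view is described by yaw and pitch angles in radians with a horizontal field of view hfov, and that coordinates are x right, y up, z forward. The function names camera_basis and screen_to_virtual are hypothetical.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def camera_basis(yaw: float, pitch: float) -> Tuple[Vec3, Vec3, Vec3]:
    """Return (right, up, forward) unit vectors for the assumed view angles."""
    sy, cy = math.sin(yaw), math.cos(yaw)
    sp, cp = math.sin(pitch), math.cos(pitch)
    forward = (sy * cp, sp, cy * cp)
    right = (cy, 0.0, -sy)
    up = (-sy * sp, cp, -cy * sp)
    return right, up, forward


def screen_to_virtual(x: float, y: float, depth: float, viewpoint: Vec3,
                      yaw: float, pitch: float,
                      width: int, height: int, hfov: float) -> Vec3:
    """Convert a display X-Y position plus mesh depth into a virtual 3D location.

    The current viewpoint position is used as the reference point.
    """
    right, up, forward = camera_basis(yaw, pitch)
    # Focal length in pixels derived from the assumed horizontal field of view.
    focal = (width / 2.0) / math.tan(hfov / 2.0)
    # Offsets in the view plane at the given depth; display Y grows downwards.
    xc = (x - width / 2.0) * depth / focal
    yc = (height / 2.0 - y) * depth / focal
    return (
        viewpoint[0] + xc * right[0] + yc * up[0] + depth * forward[0],
        viewpoint[1] + xc * right[1] + yc * up[1] + depth * forward[1],
        viewpoint[2] + xc * right[2] + yc * up[2] + depth * forward[2],
    )
```

Under these assumptions, a drop at the centre of the display on a mesh layer of depth 5 yields a location 5 units directly ahead of the current viewpoint. A spherical mesh at a fixed radial distance would be an equally plausible alternative and would need a different formula.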
The 3D digital asset player 501 can subsequently receive the stored virtual 3D positional location for each auxiliary asset location marker 503b . . . 503n from the media server 104, and render each marker as a hotspot at an X-Y position on the current 2D view of the 3D scene based on a transformation of the virtual 3D positional information for each marker and the current viewpoint position within the master 3D digital asset 304a as currently displayed in the player 501.
The present invention has been described above by way of example only. It will be appreciated that modifications are possible within the scope of the appended claims.
Number | Date | Country | Kind |
---|---|---|---|
21151021.9 | Jan 2021 | EP | regional |

Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2022/050444 | 1/11/2022 | WO |