This disclosure relates to video-based coding of dynamic meshes.
In the realm of computer graphics and virtual reality, meshes are the building blocks that may be used to represent three dimensional (3D) objects. Meshes are the digital equivalent of a wireframe model, defining the shape and surface of an object. A mesh is a collection of vertices, edges, and faces that define the shape of a 3D object. Vertices are the points that define the corners of the object. Edges are the lines connecting the vertices. Faces are the polygons (usually triangles or quadrilaterals) formed by the edges, which define the surface of the object. Meshes are versatile and have a wide range of applications, including, but not limited to: 3D modeling and animation, virtual and augmented reality, 3D printing, medical imaging, and architectural design. As 3D models become increasingly complex, file sizes of such models may grow significantly. Mesh compression techniques may be used to reduce the size of mesh files.
This disclosure describes example motion vector prediction techniques that may be employed to efficiently encode and decode meshes. In 3D video and virtual reality, accurate motion vector prediction may be important for efficient compression and transmission of 3D mesh sequences. As described in more detail, motion vector prediction is a technique used to determine a motion vector for a current vertex. The motion vector for the current vertex may be an estimate of the movement of the current vertex in a 3D mesh from one frame to the next.
By predicting the motion vector for a current vertex, an encoder may reduce the amount of data that needs to be transmitted to determine the motion vector for the current vertex. The disclosed techniques may utilize weighted averaging to improve the accuracy of motion vector prediction. An encoder or decoder may consider a set of motion vectors from neighboring vertices. The encoder or decoder may calculate the distance between the current vertex and the vertex positions corresponding to each candidate motion vector. The candidate motion vectors may be weighted based on their respective distances. Closer vertices may have a higher weight, contributing more to the final predicted motion vector. This better ensures that the prediction is heavily influenced by the most relevant neighbors. The encoder or decoder may use the weighted average of the candidate motion vectors as the predicted motion vector for the current vertex. By calculating the distance between the current vertex and potential candidate vertices, the encoder or decoder may prioritize vertices that are spatially closer. By utilizing motion vectors from a previously encoded or decoded reference mesh, the encoder or decoder may incorporate historical information into the prediction process.
By considering the relative distances between vertices, the weighted averaging techniques may provide more accurate motion vector predictions, especially in complex motion scenarios. More accurate motion vector prediction may lead to better compression efficiency, resulting in smaller file sizes and lower bandwidth requirements. Accurate motion vector prediction may help maintain the visual quality of 3D video and virtual reality content, even at lower bitrates.
The disclosed techniques may help to capture long-term motion patterns and may be particularly effective for complex, dynamic scenes. More accurate predictions may lead to fewer bits representing the residual error between the predicted and actual motion vectors. This may lead to a smaller bitrate and reduced transmission costs. Precise motion vector prediction may enable more efficient compression techniques, such as motion compensation. Enhanced coding efficiency may result in higher compression ratios and improved quality of the reconstructed 3D mesh.
In an example, a method of encoding or decoding mesh data includes: for a current vertex of mesh vertices of the mesh data, determining a motion vector predictor based on respective weighted averages of respective motion vectors in a candidate list for the current vertex; and encoding or decoding the current vertex based on the motion vector predictor.
In an example, a device for generating mesh data includes: a memory configured to store the mesh data; and one or more processors coupled to the memory, implemented in circuitry, and configured to: for a current vertex of mesh vertices of the mesh data, determine a motion vector predictor based on respective weighted averages of respective motion vectors in a candidate list for the current vertex; and encode or decode the current vertex based on the motion vector predictor.
In an example, non-transitory computer-readable storage media have instructions encoded thereon, the instructions configured to cause processing circuitry to: for a current vertex of mesh vertices of mesh data, determine a motion vector predictor based on respective weighted averages of respective motion vectors in a candidate list for the current vertex; and encode or decode the current vertex based on the motion vector predictor.
The details of one or more examples are set forth in the accompanying drawings and the description below. Other features, objects, and advantages will be apparent from the description, drawings, and claims.
This disclosure describes techniques that may enhance the efficiency of video compression, such as for video-based dynamic mesh coding (V-DMC). Base mesh motion field coding is a component of V-DMC and may include representing the movement of the vertices of the mesh over time. One way to indicate the movement of vertices is using a motion vector. The motion vector of a vertex may indicate the position change of that vertex from frame to frame.
In some techniques, an encoder may signal coordinate information of the motion vector to a decoder. However, to reduce the amount of information that needs to be signaled, the encoder and decoder may use motion vector prediction to predict the motion vector. For instance, the encoder and decoder may utilize motion vectors of previously encoded or decoded vertices to generate motion vector predictors that predict the actual motion vector for a current vertex. This way, rather than signaling the coordinate information of the actual motion vector, the encoder may signal information for selecting a motion vector predictor and possibly a difference between the motion vector predictor and the actual motion vector. Signaling information for selecting a motion vector predictor and possibly a difference between the motion vector predictor and the actual motion vector may require less signaling as compared to signaling the coordinates of the motion vector. Accordingly, by utilizing motion vector prediction, significant compression gains may be achieved.
Motion vector predictors may be used to predict the motion of a particular mesh vertex based on the motion of its neighboring vertices. In one or more examples described in this disclosure, examples of neighboring vertices include vertices in a current mesh that includes the current vertex for which the motion vector is being determined. In some examples, a neighboring vertex may be a vertex in another mesh, referred to as a reference mesh.
As an example, the encoder and decoder may utilize the same techniques to generate a candidate list of motion vectors for a current vertex. In some examples, the candidate list of motion vectors may include motion vectors of vertices in a current mesh that includes the current vertex, and/or motion vectors of one or more vertices in a reference mesh that was previously encoded or decoded. Because the encoder and decoder use the same techniques to generate the candidate list of motion vectors for the current vertex, the candidate list may be the same at the encoder side and decoder side.
The encoder and decoder may utilize the same techniques to determine a motion vector predictor based on the motion vectors in the candidate list, such as by weighted averaging, as an example. Accordingly, the motion vector predictor determined on the encoder side and decoder side may be the same motion vector predictor. The encoder and decoder may determine a motion vector for the current vertex using the motion vector predictor.
With the example techniques, the motion vector predictor may be a better predictor of the actual motion vector as compared to other techniques. That is, the motion vector predictor may be closer in value to the actual motion vector than the predictors produced by other techniques that use motion vector prediction. As noted above, by accurately predicting the motion vector (e.g., determining a motion vector predictor that better predicts the actual motion vector), an encoder may reduce the amount of information that needs to be signaled to represent the actual motion vector. The process of constructing a candidate motion vector list may include creating a list of potential motion vectors for a given vertex. A good candidate list may significantly improve the efficiency of the motion estimation and coding process. Enhancing motion vector prediction may include exploring more sophisticated prediction models or incorporating additional contextual information. Optimizing candidate motion vector list construction may include developing strategies to prioritize the most likely motion vectors, reducing the search space and improving coding efficiency.
As shown in
In the example of
System 100 as shown in
In general, data source 104 represents a source of data (i.e., raw, unencoded data) and may provide a sequential series of “frames” of the data to V-DMC encoder 200, which encodes data for the frames. Data source 104 of source device 102 may include a mesh capture device, such as any of a variety of cameras or sensors, e.g., a 3D scanner or a light detection and ranging (LIDAR) device, one or more video cameras, an archive containing previously captured data, and/or a data feed interface to receive data from a data content provider. Alternatively or additionally, mesh data may be computer-generated from scanner, camera, sensor or other data. For example, data source 104 may generate computer graphics-based data as the source data, or produce a combination of live data, archived data, and computer-generated data. In each case, V-DMC encoder 200 encodes the captured, pre-captured, or computer-generated data. V-DMC encoder 200 may rearrange the frames from the received order (sometimes referred to as “display order”) into a coding order for coding. V-DMC encoder 200 may generate one or more bitstreams including encoded data. Source device 102 may then output the encoded data via output interface 108 onto computer-readable medium 110 for reception and/or retrieval by, e.g., input interface 122 of destination device 116.
Memory 106 of source device 102 and memory 120 of destination device 116 may represent general purpose memories. In some examples, memory 106 and memory 120 may store raw data, e.g., raw data from data source 104 and raw, decoded data from V-DMC decoder 300. Additionally or alternatively, memory 106 and memory 120 may store software instructions executable by, e.g., V-DMC encoder 200 and V-DMC decoder 300, respectively. Although memory 106 and memory 120 are shown separately from V-DMC encoder 200 and V-DMC decoder 300 in this example, it should be understood that V-DMC encoder 200 and V-DMC decoder 300 may also include internal memories for functionally similar or equivalent purposes. Furthermore, memory 106 and memory 120 may store encoded data, e.g., output from V-DMC encoder 200 and input to V-DMC decoder 300. In some examples, portions of memory 106 and memory 120 may be allocated as one or more buffers, e.g., to store raw, decoded, and/or encoded data. For instance, memory 106 and memory 120 may store data representing a mesh.
Computer-readable medium 110 may represent any type of medium or device capable of transporting the encoded data from source device 102 to destination device 116. In one example, computer-readable medium 110 represents a communication medium to enable source device 102 to transmit encoded data directly to destination device 116 in real-time, e.g., via a radio frequency network or computer-based network. Output interface 108 may modulate a transmission signal including the encoded data, and input interface 122 may demodulate the received transmission signal, according to a communication standard, such as a wireless communication protocol. The communication medium may comprise any wireless or wired communication medium, such as a radio frequency (RF) spectrum or one or more physical transmission lines. The communication medium may form part of a packet-based network, such as a local area network, a wide-area network, or a global network such as the Internet. The communication medium may include routers, switches, base stations, or any other equipment that may be useful to facilitate communication from source device 102 to destination device 116.
In some examples, source device 102 may output encoded data from output interface 108 to storage device 112. Similarly, destination device 116 may access encoded data from storage device 112 via input interface 122. Storage device 112 may include any of a variety of distributed or locally accessed data storage media such as a hard drive, Blu-ray discs, DVDs, CD-ROMs, flash memory, volatile or non-volatile memory, or any other suitable digital storage media for storing encoded data.
In some examples, source device 102 may output encoded data to file server 114 or another intermediate storage device that may store the encoded data generated by source device 102. Destination device 116 may access stored data from file server 114 via streaming or download. File server 114 may be any type of server device capable of storing encoded data and transmitting that encoded data to the destination device 116. File server 114 may represent a web server (e.g., for a website), a File Transfer Protocol (FTP) server, a content delivery network device, or a network attached storage (NAS) device. Destination device 116 may access encoded data from file server 114 through any standard data connection, including an Internet connection. This may include a wireless channel (e.g., a Wi-Fi connection), a wired connection (e.g., digital subscriber line (DSL), cable modem, etc.), or a combination of both that is suitable for accessing encoded data stored on file server 114. File server 114 and input interface 122 may be configured to operate according to a streaming transmission protocol, a download transmission protocol, or a combination thereof.
Output interface 108 and input interface 122 may represent wireless transmitters/receivers, modems, wired networking components (e.g., Ethernet cards), wireless communication components that operate according to any of a variety of IEEE 802.11 standards, or other physical components. In examples where output interface 108 and input interface 122 comprise wireless components, output interface 108 and input interface 122 may be configured to transfer data, such as encoded data, according to a cellular communication standard, such as 4G, 4G-LTE (Long-Term Evolution), LTE Advanced, 5G, or the like. In some examples where output interface 108 comprises a wireless transmitter, output interface 108 and input interface 122 may be configured to transfer data, such as encoded data, according to other wireless standards, such as an IEEE 802.11 specification, an IEEE 802.15 specification (e.g., ZigBee™), a Bluetooth™ standard, or the like. In some examples, source device 102 and/or destination device 116 may include respective system-on-a-chip (SoC) devices. For example, source device 102 may include an SoC device to perform the functionality attributed to V-DMC encoder 200 and/or output interface 108, and destination device 116 may include an SoC device to perform the functionality attributed to V-DMC decoder 300 and/or input interface 122.
The techniques of this disclosure may be applied to encoding and decoding in support of any of a variety of applications, such as communication between autonomous vehicles, communication between scanners, cameras, sensors and processing devices such as local or remote servers, geographic mapping, or other applications.
Input interface 122 of destination device 116 receives an encoded bitstream from computer-readable medium 110 (e.g., a communication medium, storage device 112, file server 114, or the like). The encoded bitstream may include signaling information defined by V-DMC encoder 200, which is also used by V-DMC decoder 300, such as syntax elements having values that describe characteristics and/or processing of coded units (e.g., slices, pictures, groups of pictures, sequences, or the like). Data consumer 118 uses the decoded data. For example, data consumer 118 may use the decoded data to determine the locations of physical objects. In some examples, data consumer 118 may comprise a display to present imagery based on meshes.
V-DMC encoder 200 and V-DMC decoder 300 each may be implemented as any of a variety of suitable encoder and/or decoder circuitry, such as one or more microprocessors, digital signal processors (DSPs), application specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), discrete logic, software, hardware, firmware or any combinations thereof. When the techniques are implemented partially in software, a device may store instructions for the software in a suitable, non-transitory computer-readable medium and execute the instructions in hardware using one or more processors to perform the techniques of this disclosure. Each of V-DMC encoder 200 and V-DMC decoder 300 may be included in one or more encoders or decoders, either of which may be integrated as part of a combined encoder/decoder (CODEC) in a respective device. A device including V-DMC encoder 200 and/or V-DMC decoder 300 may comprise one or more integrated circuits, microprocessors, and/or other types of devices.
V-DMC encoder 200 and V-DMC decoder 300 may operate according to a coding standard. This disclosure may generally refer to coding (e.g., encoding and decoding) of pictures to include the process of encoding or decoding data. An encoded bitstream generally includes a series of values for syntax elements representative of coding decisions (e.g., coding modes).
This disclosure may generally refer to “signaling” certain information, such as syntax elements. The term “signaling” may generally refer to the communication of values for syntax elements and/or other data used to decode encoded data. That is, V-DMC encoder 200 may signal values for syntax elements in the bitstream. In general, signaling refers to generating a value in the bitstream. As noted above, source device 102 may transport the bitstream to destination device 116 substantially in real time, or not in real time, such as might occur when storing syntax elements to storage device 112 for later retrieval by destination device 116.
The MPEG working group 7 (WG7), also known as the 3D graphics and haptics coding group (3DGH), is currently standardizing the video-based coding of dynamic mesh representations (V-DMC) targeting XR use cases. The current test model is based on the call for proposals result, Khaled Mammou, Jungsun Kim, Alexandros Tourapis, Dimitri Podborski, Krasimir Kolarov, [V-CG] Apple's Dynamic Mesh Coding CfP Response, ISO/IEC JTC1/SC29/WG7, m59281, April 2022, and encompasses the pre-processing of the input meshes into approximated meshes with typically fewer vertices, referred to as base meshes, which are coded with a static mesh coder (e.g., Google Draco, MPEG's Edgebreaker implementation, etc.). In addition, V-DMC encoder 200 may estimate the motion of the base mesh vertices and code the motion vectors into the bitstream. The reconstructed base meshes may be subdivided into finer meshes with additional vertices and, hence, additional triangles. The V-DMC encoder 200 may refine the positions of the subdivided mesh vertices to approximate the original mesh. The refinements or vertex displacement vectors may be coded into the bitstream. In the current test model, the displacement vectors are wavelet transformed, quantized, and the coefficients are packed into a 2D frame. The sequence of frames is coded with a typical video coder, for example, HEVC or VVC, into the bitstream. In addition, the sequence of texture frames is coded with a video coder. The architecture of the V-DMC decoder is illustrated in
A detailed description of the proposal that was selected as the starting point for the V-DMC standardization can be found in m59281. The following description details the displacement vector coding in the current V-DMC test model and working draft (WD), such as WD 5.0 of V-DMC, ISO/IEC JTC1/SC29/WG7, N00744, October 2023. Additionally, the coding of the base mesh motion field is described.
V-DMC encoder 200 and V-DMC decoder 300 may be configured to perform preprocessing.
In
In some examples, the techniques are independent of the chosen subdivision scheme and could be combined with other subdivision schemes. The subdivided polyline is then deformed to get a better approximation of the original curve 304. For example, a displacement vector is computed for each vertex of the subdivided mesh (arrows 302 in
The displaced curve 310 is generated by decoding the displacement vectors associated with the subdivided curve 308 vertices. Besides allowing for spatial/quality scalability, the subdivision structure enables efficient transforms such as wavelet decomposition, which can offer high compression performance.
The mesh decimation module uses a simplification technique to decimate the input mesh 508 M(i) and produce the decimated mesh 510 dm(i). The decimated mesh 510 dm(i) is then re-parameterized using the UVAtlas tool (e.g., https://docs.microsoft.com/en-us/windows/win32/direct3d9/using-uvatlas). The generated mesh 514 is denoted as pm(i). The UVAtlas tool considers only the geometry information of the decimated mesh 510 dm(i) when computing the atlas parameterization, which is likely sub-optimal for compression purposes. Other parameterization schemes or tools could also be considered with the proposed framework.
As illustrated in
For the Random Access (RA) condition, a temporally consistent re-meshing could be computed by considering the base mesh m(j) of a reference frame with index j as the input for the subdivision surface fitting module. This makes it possible to produce the same subdivision structure for the current mesh M′(i) as the one computed for the reference mesh M′(j). Such a re-meshing process makes it possible to skip the encoding of the base mesh m(i) and re-use the base mesh m(j) associated with the reference frame M(j). This could also enable better temporal prediction for both the attribute and geometry information. In some examples, a motion field f(i) describing how to move the vertices of m(j) to match the positions of m(i) is computed and encoded. Such time-consistent re-meshing may not always be possible. Some example techniques compare the distortion obtained with and without the temporal consistency constraint and choose the mode that offers the best RD compromise.
The pre-processing module may not be normative and could be replaced by any other system that produces displaced subdivision surfaces. A possible efficient implementation would constrain the 3D reconstruction module to directly generate displaced subdivision surfaces, avoiding the need for such pre-processing.
V-DMC encoder 200 and V-DMC decoder 300 may be configured to perform displacements coding. Depending on the application and the targeted bitrate/visual quality, the V-DMC encoder 200 could optionally encode a set of displacement vectors associated with the subdivided mesh vertices, referred to as the displacement field d(i). The intra encoding process, which may be performed by V-DMC encoder 200, is illustrated in
First, the reconstructed quantized base mesh 902 m′(i) is used to update the displacement field 516 d(i) to generate an updated displacement field d′(i). This process considers the differences between the reconstructed base mesh 902 m′(i) and the original base mesh 934 m(i). By exploiting the subdivision surface mesh structure, a wavelet transform 908 is then applied to d′(i) and a set of wavelet coefficients 912 is generated. The scheme is agnostic of the transform applied and could leverage any other transform, including the identity transform. The wavelet coefficients 912 are then quantized 914, packed 944 into a 2D image/video, and can be compressed 936 by using a traditional image/video encoder 916 (e.g., V-PCC). The reconstructed version of the wavelet coefficients 920 is obtained by applying image unpacking and inverse quantization 926 to the reconstructed wavelet coefficient video 920 generated during the video encoding process. Reconstructed displacements d″(i) are then computed by applying the inverse wavelet transform 918 to the reconstructed wavelet coefficients 920. A reconstructed base mesh m″(i) is obtained by applying inverse quantization 926 to the reconstructed quantized base mesh m′(i). The reconstructed deformed mesh 928 DM(i) is obtained by subdividing m″(i) and applying the reconstructed displacements d″(i) to its vertices.
The mesh sub-stream is fed to the mesh decoder to generate the reconstructed quantized base mesh m′(i). The decoded base mesh 1018 m″(i) is then obtained by applying inverse quantization 1004 to m′(i). The displacement sub-stream could be decoded by a video/image decoder 1006. The generated image/video is then un-packed 1008 and inverse quantization 1004 is applied to the transformed (e.g., wavelet) coefficients. The decoded displacement field 1012 d″(i) is then generated by applying the inverse transform 1010 to the unquantized coefficients. The final decoded mesh is generated by applying the reconstruction process to the decoded base mesh m″(i) and by adding the decoded displacement field 1012 d″(i). The attribute sub-stream is directly decoded by the video decoder 1014 and the decoded attribute map A″(i) is generated as output 1016.
The following describes arithmetic coding of displacements. As an alternative to packing the quantized wavelet coefficients in frames and coding as images or video, a scheme was utilized that directly codes the quantized wavelet coefficients with a block-based arithmetic coder. This scheme is illustrated in
V-DMC encoder 200 and V-DMC decoder 300 may be configured to implement a subdivision scheme. Various subdivision schemes could be considered (e.g., https://www.cs.utexas.edu/users/fussell/courses/cs384g-fall2011/lectures/lecture17-Subdivision_curves.pdf). A possible solution is the mid-point subdivision scheme, which at each subdivision iteration subdivides each triangle into 4 sub-triangles as described in
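A sketch of one mid-point subdivision iteration is shown below, in Python, with an edge cache so that an edge shared by two triangles creates a single new vertex; the data structures and function name are illustrative, not from the specification:

```python
def midpoint_subdivide(vertices, triangles):
    # Split each triangle into 4 by inserting a vertex at each edge
    # mid-point; a cache ensures shared edges reuse the same new vertex.
    midpoint_cache = {}
    new_tris = []

    def mid(a, b):
        key = (min(a, b), max(a, b))
        if key not in midpoint_cache:
            va, vb = vertices[a], vertices[b]
            vertices.append([(va[k] + vb[k]) / 2.0 for k in range(3)])
            midpoint_cache[key] = len(vertices) - 1
        return midpoint_cache[key]

    for a, b, c in triangles:
        ab, bc, ca = mid(a, b), mid(b, c), mid(c, a)
        new_tris += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
    return vertices, new_tris
```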
The same process is used to compute the texture coordinates of the newly created vertex. For normal vectors, an extra normalization step is applied as follows:
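The normalization formula itself is not reproduced above. A plausible form, assuming the newly created vertex's normal is the renormalized average of the two edge-endpoint normals, is:

$$\hat{n}_{\mathrm{new}} = \frac{n_1 + n_2}{\lVert n_1 + n_2 \rVert}$$

where $n_1$ and $n_2$ are the normal vectors of the two endpoints of the subdivided edge.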
V-DMC encoder 200 and V-DMC decoder 300 may be configured to apply wavelet transforms 1010. Various wavelet transforms may be applied (e.g., Kolarov, K. and Lynch, W., “Wavelet Compression for 3D and Higher-Dimensional Objects,” Proc. of SPIE Conference on Applications of Digital Image Processing, Volume 3164, San Diego, California, pp. 247-260, July 1997). The results reported for the CfP are based on a linear wavelet transform.
The prediction process is defined as follows:
where
The update process is as follows:
The scheme may allow the update process to be skipped. The wavelet coefficients could be quantized, e.g., by using a uniform quantizer with a dead zone.
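To illustrate, the following is a minimal sketch of a uniform quantizer with a dead zone; the function names, the dead-zone width, and the mid-interval reconstruction are illustrative assumptions rather than the test model's exact implementation:

```python
def deadzone_quantize(coeff: float, step: float, deadzone: float = 0.0) -> int:
    # Magnitudes inside the widened zero bin quantize to level 0;
    # otherwise the magnitude is uniformly quantized and the sign restored.
    sign = -1 if coeff < 0 else 1
    mag = abs(coeff)
    if mag <= deadzone:
        return 0
    return sign * int((mag - deadzone) / step)

def deadzone_dequantize(level: int, step: float, deadzone: float = 0.0) -> float:
    # Reconstruct at the center of the quantization interval.
    if level == 0:
        return 0.0
    sign = -1 if level < 0 else 1
    return sign * (deadzone + (abs(level) + 0.5) * step)
```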
Local vs. Canonical Coordinate System for Displacements will now be discussed. The displacement field d(i) is defined in the same Cartesian coordinate system as the input mesh. A possible optimization is to transform d(i) from this canonical coordinate system to a local coordinate system, which is defined by the normal to the subdivided mesh at each vertex.
A potential advantage of considering a local coordinate system for the displacements is the possibility to quantize more heavily the tangential components of the displacements compared to the normal component. The normal component of the displacement may have more significant impact on the reconstructed mesh quality than the two tangential components.
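One way to realize such a local coordinate system is to build an orthonormal frame from each vertex normal and project the displacement onto it, so that component 0 is the normal component and components 1 and 2 are the tangential components. A sketch, assuming one particular (non-normative) choice of tangent vectors:

```python
import numpy as np

def local_frame(normal: np.ndarray) -> np.ndarray:
    # Build an orthonormal basis (n, t, b) from a vertex normal.
    n = normal / np.linalg.norm(normal)
    # Pick the canonical axis least aligned with n to avoid degeneracy.
    helper = np.eye(3)[np.argmin(np.abs(n))]
    t = np.cross(n, helper)
    t /= np.linalg.norm(t)
    b = np.cross(n, t)
    return np.stack([n, t, b])  # rows: normal, tangent, bitangent

def to_local(displacement: np.ndarray, normal: np.ndarray) -> np.ndarray:
    # Component 0 is the normal component (quantized lightly);
    # components 1 and 2 are tangential (may be quantized more heavily).
    return local_frame(normal) @ displacement
```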
V-DMC encoder 200 and V-DMC decoder 300 may be configured to implement packing of wavelet coefficients. The following scheme is used to pack the wavelet coefficients into a 2D image:
Other packing schemes could be used (e.g., zigzag order, raster order). The V-DMC encoder 200 could explicitly signal in the bitstream the used packing scheme (e.g., atlas sequence parameters). This could be done at patch, patch group, tile, or sequence level.
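As an illustration of how interchangeable packing schemes could be realized, the following sketch expresses the raster and zigzag alternatives mentioned above as pluggable traversal orders; the function names are illustrative and not from the specification:

```python
def raster_order(n: int, w: int, h: int):
    # Left-to-right, top-to-bottom traversal.
    return [(i % w, i // w) for i in range(n)]

def zigzag_order(n: int, w: int, h: int):
    # Alternate the horizontal direction on each row.
    coords = []
    for y in range(h):
        xs = range(w) if y % 2 == 0 else range(w - 1, -1, -1)
        coords.extend((x, y) for x in xs)
    return coords[:n]

def pack(coeffs, w: int, h: int, order=raster_order):
    # Place the i-th quantized coefficient at the i-th pixel of the
    # chosen traversal; the traversal used would be signaled in the
    # bitstream (e.g., in the atlas sequence parameters).
    image = [[0] * w for _ in range(h)]
    for c, (x, y) in zip(coeffs, order(len(coeffs), w, h)):
        image[y][x] = c
    return image
```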
V-DMC encoder 200 may be configured for displacement video encoding. The techniques may be agnostic of which video coding technology is used. When coding the displacement wavelet coefficients, a lossless approach may be used since the quantization is applied in a separate module. Another approach is to rely on the video encoder to compress the coefficients in a lossy manner and apply a quantization either in the original or transform domain.
The following describes base mesh motion field coding. In the current V-DMC (WD 5.0 of V-DMC, ISO/IEC JTC1/SC29/WG7, N00744, October 2023) and the software TMM v6.0, the reference and current base meshes share the same topology. This means that there may be a one-to-one correspondence between triangles of the reference and current base meshes (order of vertices and connectivity). Therefore, motion vectors may be determined by subtracting corresponding 3D vertex positions. The motion vectors would be added to the reference base mesh vertex positions to obtain the current base mesh vertex positions. The coding of this motion field is performed as follows.
Firstly, the construction of the motion field candidate list is illustrated in
Subsequently, the motion vectors may be predicted, and a predictor index and residuals may be coded in the bitstream. In the current version, motion vectors 302 are grouped into ‘motion blocks’, typically of size 16 (the size can be a parameter), and one predictor is selected per block. In addition, instead of coding the motion vectors, a skip mode can be signaled. In case of skip mode, the coding of the motion vectors in the motion block is skipped (reference positions are copied, and no residual is coded). A skip flag is signaled in the bitstream. If skip is disabled, then one predictor is chosen (and a residual is coded) from the following three options:
V-DMC encoder 200 and V-DMC decoder 300 may be configured to process a lifting transform parameter set and associated semantics, an example of which is shown in TABLE 1 below.
syntax_element[i][ltpIndex] with i equal to 0 may be applied to the displacement. syntax_element[i][ltpIndex] with i equal to non-zero may be applied to the (i−1)-th attribute, where ltpIndex is the index of the lifting transform parameter set list.
vmc_transform_lifting_skip_update_flag[i][ltpIndex] equal to 1 indicates the update step of the lifting transform applied to the displacement is skipped in the vmc_lifting_transform_parameters(index, ltpIndex) syntax structure, where ltpIndex is the index of the lifting transform parameter set list.
vmc_transform_lifting_skip_update_flag[i][ltpIndex] with i equal to 0 may be applied to the displacement. vmc_transform_lifting_skip_update_flag[i][ltpIndex] with i equal to non-zero may be applied to the (i−1)-th attribute.
vmc_transform_lifting_quantization_parameters_x[i][ltpIndex] indicates the quantization parameter to be used for the inverse quantization of the x-component of the displacement wavelet coefficients. The value of vmc_transform_lifting_quantization_parameters_x[i][ltpIndex] shall be in the range of 0 to 51, inclusive.
vmc_transform_lifting_quantization_parameters_y[i][ltpIndex] indicates the quantization parameter to be used for the inverse quantization of the y-component of the displacement wavelet coefficients. The value of vmc_transform_lifting_quantization_parameters_y[i][ltpIndex] shall be in the range of 0 to 51, inclusive.
vmc_transform_lifting_quantization_parameters_z[i][ltpIndex] indicates the quantization parameter to be used for the inverse quantization of the z-component of the displacement wavelet coefficients. The value of vmc_transform_lifting_quantization_parameters_z[i][ltpIndex] shall be in the range of 0 to 51, inclusive.
vmc_transform_log2_lifting_lod_inverse_scale_x[i][ltpIndex] indicates the scaling factor applied to the x-component of the displacement wavelet coefficients for each level of detail.
vmc_transform_log2_lifting_lod_inverse_scale_y[i][ltpIndex] indicates the scaling factor applied to the y-component of the displacement wavelet coefficients for each level of detail.
vmc_transform_log2_lifting_lod_inverse_scale_z[i][ltpIndex] indicates the scaling factor applied to the z-component of the displacement wavelet coefficients for each level of detail.
vmc_transform_log2_lifting_update_weight[i][ltpIndex] indicates the weighting coefficients used for the update filter of the wavelet transform. vmc_transform_log2_lifting_prediction_weight[i][ltpIndex] indicates the weighting coefficients used for the prediction filter of the wavelet transform.
V-DMC decoder 300 may be configured to perform inverse image packing of wavelet coefficients. Inputs to this process are:
The output of this process is dispQuantCoeffArray, which is a 2D array of size positionCount×3 indicating the quantized displacement wavelet coefficients.
Let the function extracOddBits(x) be defined as follows:
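The definition is not reproduced above. Compacting the odd-positioned bits of an index is the standard operation for decoding one coordinate of a Morton (Z-order) traversal, so a definition consistent with the function's name is sketched below; the bit width is an assumption:

```python
def extracOddBits(x: int) -> int:
    # Gather the bits of x at odd positions (1, 3, 5, ...) and pack
    # them contiguously into the low-order bits of the result.
    out = 0
    for i in range(16):  # sufficient for 32-bit inputs (assumption)
        out |= ((x >> (2 * i + 1)) & 1) << i
    return out

# Example: for a Morton (Z-order) index i, extracOddBits(i) yields one
# pixel coordinate and extracOddBits(i << 1) yields the other.
```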
V-DMC decoder 300 may be configured to perform inverse quantization of wavelet coefficients. Inputs to this process are:
The output of this process is dispCoeffArray, which is a 2D array of size positionCount×3 indicating the dequantized displacement wavelet coefficients.
The wavelet coefficients inverse quantization process proceeds as follows:
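A minimal sketch of a possible inverse quantization follows, assuming an HEVC-style exponential mapping from quantization parameter to step size and a per-level-of-detail scale signaled as a log2 value; both assumptions are illustrative, as the draft's exact derivation is not reproduced here:

```python
def dequantize_coeff(level: int, qp: int, lod: int, log2_lod_scale: int) -> float:
    # Illustrative QP-to-step mapping: the step size doubles every
    # 6 QP, as in HEVC/VVC; qp is constrained to the range 0..51.
    step = 2.0 ** ((qp - 4) / 6.0)
    # Per-level-of-detail scaling, signaled as a log2 factor.
    scale = 2.0 ** (log2_lod_scale * lod)
    return level * step * scale
```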
V-DMC decoder 300 may be configured to apply an inverse linear wavelet transform. Inputs to this process are:
The output of this process is dispArray, which is a 2D array of size positionCount×3 indicating the displacements to be applied to the mesh positions.
The inverse wavelet transform process proceeds as follows:
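A sketch of the prediction-only inverse lifting (update step skipped) is given below, assuming the linear mid-point prediction implied by the mid-point subdivision scheme; the data structures are illustrative:

```python
def inverse_linear_wavelet(disp, edges_per_lod):
    # disp: per-vertex 3-component signal, holding wavelet coefficients
    # on entry and reconstructed displacements on exit.
    # edges_per_lod: for each subdivision level (coarse to fine), the
    # (child, parent_a, parent_b) triples created by mid-point subdivision.
    for triples in edges_per_lod:
        # The update step, when not skipped, would be undone here first.
        for child, a, b in triples:
            # Undo the prediction: each child coefficient is a residual
            # against the mid-point of its two parent vertices.
            for k in range(3):
                disp[child][k] += (disp[a][k] + disp[b][k]) / 2.0
    return disp
```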
V-DMC decoder 300 may be configured to perform positions displacement. The inputs of this process are:
The output of this process is positionsDisplaced, which is a 2D array of size positionCount×3 indicating the positions of the displaced subdivided submesh.
The positions displacement process proceeds as follows:
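A minimal sketch of the displacement step, assuming displacements expressed in the canonical coordinate system:

```python
def displace_positions(positions, disp):
    # positionsDisplaced[v][k] = positions[v][k] + disp[v][k]; with a
    # local coordinate system, disp would first be rotated back into
    # the canonical frame.
    return [[positions[v][k] + disp[v][k] for k in range(3)]
            for v in range(len(positions))]
```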
The base mesh working draft description is included in Annex H of WD 5.0 of V-DMC, ISO/IEC JTC1/SC29/WG7, N00744, Oct. 2023. The following reproduces some sections that are relevant to motion field coding in the base mesh.
. . . ”
. . . ”
sismu_derived_mv_present_flag[subMeshID] indicates sismu_mv_signalled_flag is present in the bitstream. If sismu_derived_mv_present_flag[subMeshID] is 0, sismu_mv_signalled_flag[subMeshID][v] is always inferred as 1.
It is a requirement of bitstream conformance that if sismu_derived_mv_present_flag[subMeshID] is equal to 1 for a submesh with submesh ID equal to subMeshID, mesh_position_deduplicate_method, if present in the corresponding intra submesh data unit, shall be equal to MESH_POSITION_DEDUP_NONE.
sismu_mv_signalled_flag[subMeshID][v] indicates a motion vector for the vertex with index v is present in the bitstream. When sismu_mv_signalled_flag[subMeshID][v] is not present in the bitstream, sismu_mv_signalled_flag[subMeshID][v] is inferred as 1.
sismu_skip_group_flag[subMeshID][g] indicates a motion vector associated with vertices in the group with index g of the current submesh, with submesh ID equal to subMeshID, is inferred as 0.
sismu_mv_residual_abs_gt[subMeshID][v][k] indicates whether the k-th component of the motion vector prediction residual associated with the vertex with index v of the current submesh, with submesh ID equal to subMeshID, has an absolute value higher than zero (when 1), or not (when 0).
sismu_mv_residual_sign[subMeshID][v][k] indicates whether the k-th component of the motion vector prediction residual associated with the vertex with index v of the current submesh, with submesh ID equal to subMeshID, has a positive sign (when 1), or not (when 0). If sismu_mv_residual_sign[v][k] is not present, it shall be inferred to be equal to 1.
sismu_mv_residual_abs_gt1[subMeshID][v][k] indicates whether the k-th component of the motion vector prediction residual associated with the vertex with index v of the current submesh, with submesh ID equal to subMeshID, has an absolute value higher than one (when 1), or not (when 0). If sismu_mv_residual_abs_gt1[v][k] is not present, it shall be inferred to be equal to 0.
sismu_mv_residual_abs_rem[subMeshID][v][k] indicates the absolute value of the k-th component of the motion vector prediction residual associated with the vertex with index v of the current submesh, with submesh ID equal to subMeshID, minus 2. If sismu_mv_residual_abs_rem[v][k] is not present, it shall be inferred to be equal to 0.
The k-th component of the motion vector prediction residual VertexMotionVectorResiduals[v][k] associated with the vertex with index v of the current submesh, with submesh ID equal to subMeshID is computed as follows:
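A sketch of a reconstruction that follows directly from the flag semantics above (gt indicates a nonzero magnitude, gt1 a magnitude above one, rem the magnitude minus 2, and sign the polarity); the function name is illustrative:

```python
def mv_residual_component(gt0: int, sign: int, gt1: int, rem: int) -> int:
    # gt0: absolute value greater than zero; gt1: greater than one;
    # rem: absolute value minus 2; sign: 1 = positive (inferred as 1
    # when absent).
    if gt0 == 0:
        return 0
    magnitude = 1 if gt1 == 0 else rem + 2
    return magnitude if sign == 1 else -magnitude
```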
Inputs to this process are:
The output of this process is currentSubmeshVertexPositions, which is a 2D array of size submeshVertexCount by 3 indicating the positions of the current frame submesh.
The following arrays are derived during the submesh positions reconstruction process:
The k-th component of the position of the vertex with index v currentSubmeshVertexPositions[v][k] is derived as follows:
The k-th component of the motion vector associated with the vertex with index v, currentSubmeshMotionVectors[v][k] is derived as follows:
The group index g of the vertex with index v is derived as follows:
If sismu_skip_group_flag[subMeshID][g] is equal to 1, then currentSubmeshMotionVectors[v][k]=0
The prediction mode of the vertex with index v, MvPredMode[subMeshID][v], is derived as follows:
Otherwise
If the prediction mode, MvPredMode[subMeshID][v], is equal to MV_DERIVED, then currentSubmeshMotionVectors[v][k]=currentSubmeshMotionVectors[vRef][k]
vRef is derived as follows:
The function find_if(vertexPositions, v) returns an index in vertexPositions that satisfies VertexPositions[i]==v, or −1 if no such element is found.
If vRef is equal to −1, currentSubmeshMotionVectors[v][k] is set equal to 0.
If the prediction mode, MvPredMode[subMeshID][v], is equal to 0, then currentSubmeshMotionVectors[v][k]=VertexMotionVectorResiduals[v][k]
Otherwise, when sismu_mv_pred_mode[v] is greater than 0, then currentSubmeshMotionVectors[v][k]=VertexMotionVectorResiduals[v][k]+currentSubmeshPredictedMotionVectors[v][k]
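Collecting the cases above, the per-component derivation could be sketched as follows; the names and the encoding of the derived mode are illustrative:

```python
def reconstruct_mv_component(v, k, skip_group, pred_mode, residuals,
                             predicted, mvs, v_ref):
    # mvs: motion vectors already reconstructed for this submesh.
    if skip_group:                 # sismu_skip_group_flag == 1
        return 0
    if pred_mode == "MV_DERIVED":  # copy from the duplicate vertex vRef
        return mvs[v_ref][k] if v_ref != -1 else 0
    if pred_mode == 0:             # no predictor: residual only
        return residuals[v][k]
    # pred_mode > 0: residual plus the predicted motion vector.
    return residuals[v][k] + predicted[v][k]
```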
The predicted motion vector currentSubmeshPredictedMotionVectors[v] is derived by applying the following process:
Inputs to this process are:
The outputs of this process are:
The maximum number of neighbours maxVertexNeighbourCount is set equal to bmsps_inter_mesh_max_num_neighbours_minus1+1.
The disclosed techniques may improve the accuracy of motion estimation by computing a weighted average of multiple candidate MVs 302. These techniques may leverage a combination of multiple MVs 302, each weighted according to its relevance, and may provide a more accurate prediction than a single MV 302. By considering multiple MVs 302 and their relative importance, the weighted average may provide a more accurate prediction of the true motion. The weighted average may help to mitigate the impact of errors or outliers in the candidate MVs 302. V-DMC encoder 200 and V-DMC decoder 300 may compute a weighted average of the MVs 302 in the candidate list for current vertex v 1302 within a 3D submesh as described below.
As described above, in the current V-DMC two motion vector predictors for the current motion vector (vertex index v) are computed by a simple average of the MV candidates in the list for vertex v, with and without a rounding bias/offset, as follows:
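A sketch of these two simple-average predictors; the integer arithmetic and the exact rounding offset are assumptions:

```python
def simple_average(cands, k: int, rounding: bool) -> int:
    # Average of the k-th MV component over the candidate list; the
    # rounded variant adds a bias of half the divisor before the
    # integer division.
    bias = len(cands) // 2 if rounding else 0
    return (sum(mv[k] for mv in cands) + bias) // len(cands)
```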
In accordance with one or more examples described in this disclosure, V-DMC encoder 200 and V-DMC decoder 300 may compute a weighted average of the MVs in the candidate list for current vertex v. The weights are determined based on a distance metric that computes a distance value between the current vertex v and the vertex position corresponding with each of the MVs in the candidate list as follows:
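The formula itself is not reproduced above. A sketch of one realization, assuming weights inversely proportional to the Euclidean distance (with a small epsilon guarding against division by zero):

```python
import math

def distance_weighted_mv(current_pos, cand_mvs, cand_positions):
    # Weight each candidate MV by the inverse of the Euclidean distance
    # between its vertex and the current vertex, so that nearer
    # neighbours contribute more to the predictor.
    eps = 1e-6  # guards against division by zero for coincident vertices
    weights = [1.0 / (math.dist(current_pos, p) + eps)
               for p in cand_positions]
    total = sum(weights)
    return [sum(w * mv[k] for w, mv in zip(weights, cand_mvs)) / total
            for k in range(3)]
```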
V-DMC encoder 200 and V-DMC decoder 300 may compute a weighted average of the MVs 302 in the candidate list for current vertex v 1302 shown in
The distance weight may be a factor that determines the influence of a particular MV 302 on the final weighted average. By making the distance weight inversely proportional to the distance between the vertices, V-DMC encoder 200 may ensure that closer vertices have a greater impact on the motion vector predictor. There are various techniques that may be used to compute the distance between two vertices. Euclidean distance is the straight-line distance between two points in Euclidean space. The Euclidean distance may be calculated as the square root of the sum of the squared differences between corresponding coordinates. The Manhattan distance may be calculated as the sum of the absolute differences of their Cartesian coordinates. The Manhattan distance is often used in scenarios where movement is restricted to grid-like patterns. The maximum difference may be calculated as the largest absolute difference between corresponding coordinates. This metric may be useful when the focus is on the largest discrepancy between dimensions. In a 3D space, Euclidean distance may be preferred to calculate the direct distance between two points.
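The three metrics just described can be stated compactly; a minimal sketch:

```python
def euclidean(p, q):
    # Straight-line distance between two points.
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

def manhattan(p, q):
    # Sum of absolute coordinate differences.
    return sum(abs(a - b) for a, b in zip(p, q))

def max_difference(p, q):
    # Largest absolute coordinate difference (Chebyshev distance).
    return max(abs(a - b) for a, b in zip(p, q))
```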
In an aspect, multiple predictors may be used by V-DMC encoder 200 to estimate the MV 302 (e.g., to determine a motion vector predictor for MV 302) of the current vertex 1302. The distance-weighted average predictor may assign weights to neighboring vertices (e.g., vertices 1306-1310) based on their distance from the current vertex 1302. Closer vertices may have higher weights, and their MVs 302 may contribute more to the final prediction. The simple average predictor may calculate the average MV 302 of neighboring vertices without considering their distances. By using multiple predictors, V-DMC encoder 200 may select the best predictor for each vertex, potentially improving coding efficiency. In an example, the best predictor may be selected based on a cost function. The cost function may be used to evaluate the quality of each predictor. This cost function may consider factors such as, but not limited to: bits required to encode the predictor mode (i.e., which predictor is used), bits required to encode the residual error between the predicted MV and the actual MV, and a distortion metric (e.g., rate-distortion optimization). The skip mode may be used by V-DMC encoder 200 to skip motion compensation for certain vertices, further reducing the bitrate. V-DMC encoder 200 may group vertices together to share the same motion vector, reducing the number of MVs 302 that need to be encoded.
In an aspect, V-DMC encoder 200 may calculate the residual after the best predictor has been selected. The residual may be the difference between the actual motion vector (MV) 302 of the current vertex (v) 1302 and the predicted motion vector obtained from the best predictor. V-DMC encoder 200 may encode the calculated residual into the bitstream 930 for transmission or storage.
In some examples, MV predictors may rely on the MVs 302 of other vertices (e.g., vertices 1304-1310) in the current frame. To better ensure that these MVs 302 are available for prediction, the vertices should be decoded in a specific order.
In accordance with one or more examples described in this disclosure, V-DMC encoder 200 may employ inter-frame prediction. The inter-frame prediction techniques may leverage the temporal redundancy between consecutive frames to improve coding efficiency. By utilizing motion vectors from previously decoded frames (reference frames or meshes), V-DMC encoder 200 may predict the motion vector of a current vertex of the current frame more accurately.
A variety of predictors may be employed by V-DMC encoder 200 to estimate the motion vector 302 of the current vertex 1302. The list of potential predictors may include, but is not limited to: zero MV, simple average, distance-weighted average, and reference frame MV 302. The zero MV may be a simple predictor that assumes no motion. The simple average predictor may determine the average of the motion vectors 302 of neighboring vertices (e.g., vertices 1306-1310). The distance-weighted average predictor may be a weighted average of the motion vectors of neighboring vertices, where the weights may be inversely proportional to the distance from the current vertex 1302. The reference frame MV predictor may comprise the motion vector 302 of the corresponding vertex in a previously decoded reference frame. If the topology of the current and reference meshes is the same, the motion vector 302 of the corresponding vertex in the reference frame may be directly used as a predictor. In cases where the topologies differ, index remapping or nearest-neighbor techniques may be employed to find the suitable corresponding vertex in the reference frame.
In the conventional V-DMC, when the candidate list for motion vector predictors reaches its maximum capacity, a simple strategy may be employed: the last element in the list is replaced with the next candidate MV. This technique, while straightforward, may not always yield the most accurate set of predictors. To enhance the quality of MV prediction, V-DMC encoder 200 may implement the disclosed techniques to calculate the distance between the current vertex (v) 1302 and the vertex (w) corresponding to the next candidate MV 302. V-DMC encoder 200 may employ the distance metrics described above, such as, but not limited to, Euclidean distance, Manhattan distance, or maximum difference for this calculation. If the list is not full, the new candidate MV 302 may be directly added to the list. If the list is full, the candidate MV 302 with the largest distance to the current vertex may be replaced with the new candidate MV. By prioritizing candidates that are closer to the current vertex 1302, the V-DMC encoder 200 may improve the accuracy of the average-based predictors because nearby vertices may be more likely to have similar motion characteristics. By selecting more relevant candidates, the average-based predictors may produce more accurate MV estimates. More accurate MV predictions may lead to smaller residual errors, resulting in lower bitrates.
In one example, when the candidate list is full, the distance value of the next candidate MV 302 may be compared to the distance values of the existing candidates in the list, one by one. The first candidate in the candidate list with a larger distance value may be replaced with the new candidate. This technique may better ensure that the candidate list always contains the closest vertices to the current vertex 1302, potentially leading to more accurate motion vector prediction.
In an example, when the candidate list is full, similar to the previous technique, the V-DMC encoder 200 may compare the distance value of the next candidate MV to the distance values of the existing candidates. The first candidate with a larger distance value may be removed from the list, and the new candidate may be added to the end or a specific position in the list. This technique allows for a more dynamic update of the candidate list, potentially incorporating more diverse information from neighboring vertices. The choice of distance metric (e.g., Euclidean, Manhattan, Chebyshev) may significantly impact the performance of these techniques. The size of the candidate list may influence the trade-off between computational complexity and prediction accuracy.
In one example, when the list is full, the V-DMC encoder 200 may determine the maximum distance value among the current candidates in the list. V-DMC encoder 200 may compare the distance of the next candidate MV 302 to this maximum distance. If the distance of the next candidate is smaller than the maximum distance, V-DMC encoder 200 may replace the candidate with the largest distance with the new candidate. By replacing the candidate with the largest distance, this technique may better ensure that the candidate list always contains the closest vertices to the current vertex 1302. This may lead to more accurate motion vector prediction. This technique may be more efficient than comparing the next candidate to each existing candidate individually. By prioritizing closer vertices, the predictor may better capture local motion patterns.
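A sketch of this replacement rule; pairing each candidate MV with its distance to the current vertex is an illustrative bookkeeping choice:

```python
import math

def add_candidate(cand_list, cand_mv, cand_pos, current_pos, max_size):
    # cand_list holds (mv, distance-to-current-vertex) pairs.
    d = math.dist(current_pos, cand_pos)
    if len(cand_list) < max_size:
        cand_list.append((cand_mv, d))
        return
    # Full list: find the farthest existing candidate and replace it
    # only if the new candidate is closer.
    i_max = max(range(len(cand_list)), key=lambda i: cand_list[i][1])
    if d < cand_list[i_max][1]:
        cand_list[i_max] = (cand_mv, d)
```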
Many alternative substitutions may be employed to construct the candidate list.
In V-DMC and in the examples described above, it is assumed that the motion vector candidates originate from neighboring vertices (e.g., vertices 1306-1310) that are in the same plane as the current vertex 1302, because the 3D positions of the vertices corresponding with the candidate MVs relative to the 3D position of the current vertex are not considered.
In accordance with one or more examples described in this disclosure, it is proposed to take the relative 3D positions into consideration to correct or weight the candidate MVs in the list.
In this example, for a current vertex of mesh vertices of the mesh data, V-DMC encoder 200 or V-DMC decoder 300 may determine a motion vector predictor based on respective weighted averages of respective motion vectors in a candidate list for the current vertex (1402). Motion vector predictors may be used to predict the motion of a particular mesh vertex based on the motion of its neighboring vertices. By accurately predicting the motion, V-DMC encoder 200 and/or V-DMC decoder 300 may reduce the amount of information needed to represent the actual motion. V-DMC encoder 200 may encode and/or V-DMC decoder 300 may decode the current vertex based on the motion vector predictor (1404). By using multiple predictors, V-DMC encoder 200 may select the best predictor for each vertex, potentially improving coding efficiency.
In one example, the current vertex may be in a current mesh, and at least one of the motion vectors in the candidate list for the current vertex may be based on a motion vector of a reference vertex in a reference mesh.
In one example, an index identifying the reference vertex may be the same as an index identifying the current vertex in a condition where a topology of the current mesh and the reference mesh is the same.
In one example, the reference vertex may be identified based on index remapping or based on nearest vertex position in reference mesh to position of current vertex in current mesh in a condition where a topology of the current mesh and the reference mesh is different.
In one example, V-DMC encoder 200 or V-DMC decoder 300 may determine the respective weighted averages based on a respective distance between the current vertex and respective vertex position corresponding to each of the motion vectors.
In one example, the respective distance may be determined based on at least one of a Euclidean distance, Manhattan distance, maximum difference between position components, or a combination thereof.
In one example, the motion vector predictor may be a first motion vector predictor. The respective weighted averages may include first respective weighted averages. V-DMC encoder 200 or V-DMC decoder 300 may, for the current vertex of the mesh vertices of the mesh data, determine a second motion vector predictor based on second respective weighted averages of respective motion vectors in the candidate list for the current vertex. Encoding or decoding the current vertex may include encoding or decoding the current vertex based on the first motion vector predictor and the second motion vector predictor.
In one example, V-DMC encoder 200 or V-DMC decoder 300 may determine an additional motion vector predictor based on at least one of: zero motion vector, simple average without rounding, simple average with rounding, distance-weighted average without rounding, or distance-weighted average with rounding. Encoding or decoding the current vertex may include encoding or decoding the current vertex based additionally on the additional motion vector predictor.
In one example, V-DMC encoder 200 or V-DMC decoder 300 may construct the candidate list for the current vertex. Constructing the candidate list may include at least one of: adding motion vectors of vertices in a current mesh that includes the current vertex, adding motion vectors of vertices in a reference mesh that was previously encoded or decoded, or adding motion vectors of vertices in the current mesh and motion vectors of vertices in the reference mesh.
In one example, constructing the candidate list may include, in a condition where the candidate list is full: determining a distance value between a first position corresponding to a candidate motion vector and a second position of the current vertex; comparing the distance value to respective distance values of motion vectors in the candidate list; and based on the comparison, removing a motion vector from the candidate list, and adding the candidate motion vector.
In one example, the removed motion vector has the largest distance in the candidate list.
In one example, V-DMC encoder 200 or V-DMC decoder 300 may generate the mesh data.
The following describes example techniques that may be performed together or separately.
Clause 1. A method of encoding or decoding mesh data, the method comprising: for a current vertex of mesh vertices of the mesh data, determining a motion vector predictor based on respective weighted averages of respective motion vectors in a candidate list for the current vertex; and encoding or decoding the current vertex based on the motion vector predictor.
Clause 2. The method of clause 1, wherein the current vertex is in a current mesh, and wherein at least one of the motion vectors in the candidate list for the current vertex is based on a motion vector of a reference vertex in a reference mesh.
Clause 3. The method of clause 2, wherein an index identifying the reference vertex is the same as an index identifying the current vertex in a condition where a topology of the current mesh and the reference mesh is the same.
Clause 4. The method of clause 2, wherein the reference vertex is identified based on index remapping or based on nearest vertex position in reference mesh to position of current vertex in current mesh in a condition where a topology of the current mesh and the reference mesh is different.
Clause 5. The method of clause 1, further comprising: determining the respective weighted averages based on a respective distance between the current vertex and respective vertex position corresponding to each of the motion vectors.
Clause 6. The method of clause 5, wherein the respective distance is determined based on at least one of a Euclidean distance, Manhattan distance, maximum difference between position components, or a combination thereof.
Clause 7. The method of any of clauses 1-6, wherein the motion vector predictor is a first motion vector predictor, wherein the respective weighted averages comprise first respective weighted averages, the method further comprising: for the current vertex of the mesh vertices of the mesh data, determining a second motion vector predictor based on second respective weighted averages of respective motion vectors in the candidate list for the current vertex, wherein encoding or decoding the current vertex comprises encoding or decoding the current vertex based on the first motion vector predictor and the second motion vector predictor.
Clause 8. The method of any of clauses 1-7, further comprising: determining an additional motion vector predictor based on at least one of: zero motion vector, simple average without rounding, simple average with rounding, distance-weighted average without rounding, or distance-weighted average with rounding, wherein encoding or decoding the current vertex comprises encoding or decoding the current vertex based additionally on the additional motion vector predictor.
Clause 9. The method of any of clauses 1-8, further comprising: constructing the candidate list for the current vertex, wherein constructing the candidate list comprises at least one of: adding motion vectors of vertices in a current mesh that includes the current vertex, adding motion vectors of vertices in a reference mesh that was previously encoded or decoded, or adding motion vectors of vertices in the current mesh and motion vectors of vertices in the reference mesh.
Clause 10. The method of clause 9, wherein constructing the candidate list comprises, in a condition where the candidate list is full: determining a distance value between a first position corresponding to a candidate motion vector and a position of the current vertex; comparing the distance value to respective distance values of motion vectors in the candidate list; and based on the comparison, removing a motion vector from the candidate list, and adding the candidate motion vector.
Clause 11. The method of clause 10, wherein the removed motion vector has the largest distance in the candidate list.
Clause 12. The method of any of clauses 1-11, further comprising generating the mesh data.
Clause 13. A device for generating mesh data, the device comprising: a memory configured to store the mesh data; and one or more processors coupled to the memory, implemented in circuitry, and configured to: for a current vertex of mesh vertices of the mesh data, determine a motion vector predictor based on respective weighted averages of respective motion vectors in a candidate list for the current vertex; and encode or decode the current vertex based on the motion vector predictor.
Clause 14. The device of clause 13, wherein the current vertex is in a current mesh, and wherein at least one of the motion vectors in the candidate list for the current vertex is based on a motion vector of a reference vertex in a reference mesh.
Clause 15. The device of clause 14, wherein an index identifying the reference vertex is the same as an index identifying the current vertex in a condition where a topology of the current mesh and the reference mesh is the same.
Clause 16. The device of clause 14, wherein the reference vertex is identified based on index remapping or based on a nearest vertex position in the reference mesh to a position of the current vertex in the current mesh in a condition where a topology of the current mesh and the reference mesh is different.
Clause 17. The device of clause 13, wherein the one or more processors are further configured to: determine the respective weighted averages based on a respective distance between the current vertex and a respective vertex position corresponding to each of the motion vectors.
Clause 18. The device of clause 17, wherein the respective distance is determined based on at least one of a Euclidean distance, Manhattan distance, maximum difference between position components, or a combination thereof.
Clause 19. The device of any of clauses 13-18, wherein the motion vector predictor is a first motion vector predictor, wherein the respective weighted averages comprise first respective weighted averages, and wherein the one or more processors are further configured to: for the current vertex of the mesh vertices of the mesh data, determine a second motion vector predictor based on second respective weighted averages of respective motion vectors in the candidate list for the current vertex, wherein the one or more processors configured to encode or decode the current vertex are configured to encode or decode the current vertex based on the first motion vector predictor and the second motion vector predictor.
Clause 20. The device of any of clauses 13-19, wherein the one or more processors are further configured to: determine an additional motion vector predictor based on at least one of: zero motion vector, simple average without rounding, simple average with rounding, distance-weighted average without rounding, or distance-weighted average with rounding, wherein the one or more processors configured to encode or decode the current vertex are configured to encode or decode the current vertex based additionally on the additional motion vector predictor.
Clause 21. The device of any of clauses 13-20, wherein the one or more processors are further configured to: construct the candidate list for the current vertex, and wherein the one or more processors configured to construct the candidate list are configured to at least one of: add motion vectors of vertices in a current mesh that includes the current vertex, add motion vectors of vertices in a reference mesh that was previously encoded or decoded, or add motion vectors of vertices in the current mesh and motion vectors of vertices in the reference mesh.
Clause 22. The device of clause 21, wherein the one or more processors configured to construct the candidate list are configured to, in a condition where the candidate list is full: determine a distance value between a first position corresponding to a candidate motion vector and a position of the current vertex; compare the distance value to respective distance values of motion vectors in the candidate list; and based on the comparison, remove a motion vector from the candidate list, and add the candidate motion vector.
Clause 23. The device of clause 22, wherein the removed motion vector has the largest distance in the candidate list.
Clause 24. The device of any of clauses 13-23, wherein the one or more processors are further configured to generate the mesh data.
Clause 25. Non-transitory computer-readable storage media having instructions encoded thereon, the instructions configured to cause processing circuitry to: for a current vertex of mesh vertices of mesh data, determine a motion vector predictor based on respective weighted averages of respective motion vectors in a candidate list for the current vertex; and encode or decode the current vertex based on the motion vector predictor.
Clause 26. The storage media of clause 25, wherein the current vertex is in a current mesh, and wherein at least one of the motion vectors in the candidate list for the current vertex is based on a motion vector of a reference vertex in a reference mesh.
Clause 27. The storage media of clause 26, wherein an index identifying the reference vertex is the same as an index identifying the current vertex in a condition where a topology of the current mesh and the reference mesh is the same.
Clause 28. The storage media of clause 26, wherein the reference vertex is identified based on index remapping or based on a nearest vertex position in the reference mesh to a position of the current vertex in the current mesh in a condition where a topology of the current mesh and the reference mesh is different.
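To make the reference-vertex identification recited in Clauses 3, 4, 15, 16, 27, and 28 concrete, the following is a minimal Python sketch; the brute-force nearest-position search and the optional index map are illustrative assumptions rather than a normative process.

import numpy as np

def find_reference_vertex(current_index, current_pos, ref_positions,
                          same_topology, index_map=None):
    # Same topology: the reference vertex shares the index of the
    # current vertex.
    if same_topology:
        return current_index
    # Different topology: use an index remapping if one is available.
    if index_map is not None:
        return index_map[current_index]
    # Otherwise, pick the reference-mesh vertex whose position is nearest
    # to the position of the current vertex in the current mesh.
    d = np.linalg.norm(np.asarray(ref_positions, dtype=np.float64) -
                       np.asarray(current_pos, dtype=np.float64), axis=1)
    return int(np.argmin(d))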
It is to be recognized that depending on the example, certain acts or events of any of the techniques described herein can be performed in a different sequence, may be added, merged, or left out altogether (e.g., not all described acts or events are necessary for the practice of the techniques). Moreover, in certain examples, acts or events may be performed concurrently, e.g., through multi-threaded processing, interrupt processing, or multiple processors, rather than sequentially.
In one or more examples, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium and executed by a hardware-based processing unit.
Computer-readable media may include computer-readable storage media, which corresponds to a tangible medium such as data storage media, or communication media including any medium that facilitates transfer of a computer program from one place to another, e.g., according to a communication protocol. In this manner, computer-readable media generally may correspond to (1) tangible computer-readable storage media which is non-transitory or (2) a communication medium such as a signal or carrier wave. Data storage media may be any available media that can be accessed by one or more computers or one or more processors to retrieve instructions, code and/or data structures for implementation of the techniques described in this disclosure. A computer program product may include a computer-readable medium.
By way of example, and not limitation, such computer-readable storage media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage, or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if instructions are transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. It should be understood, however, that computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other transitory media, but are instead directed to non-transitory, tangible storage media. Disk and disc, as used herein, includes compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
Instructions may be executed by one or more processors, such as one or more digital signal processors (DSPs), general purpose microprocessors, application specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), or other equivalent integrated or discrete logic circuitry. Accordingly, the terms “processor” and “processing circuitry,” as used herein may refer to any of the foregoing structures or any other structure suitable for implementation of the techniques described herein. In addition, in some aspects, the functionality described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated in a combined codec. Also, the techniques could be fully implemented in one or more circuits or logic elements.
The techniques of this disclosure may be implemented in a wide variety of devices or apparatuses, including a wireless handset, an integrated circuit (IC) or a set of ICs (e.g., a chip set). Various components, modules, or units are described in this disclosure to emphasize functional aspects of devices configured to perform the disclosed techniques, but do not necessarily require realization by different hardware units. Rather, as described above, various units may be combined in a codec hardware unit or provided by a collection of interoperative hardware units, including one or more processors as described above, in conjunction with suitable software and/or firmware.
Various examples have been described. These and other examples are within the scope of the following claims.
This application claims the benefit of U.S. Provisional Patent Application 63/624,699, filed Jan. 24, 2024, the entire content of which is incorporated by reference.