The present application relates, generally, to computer peripheral devices that communicate with a host computer and, more specifically, to such devices that provide video codec functionalities and to the host computer to which the devices provide the functionalities.
The quantity of known video applications executed by network-connected devices may be seen to be continuously on the rise. Correspondingly, network video traffic continues to grow. Accordingly, a demand for real-time video encoding and decoding also grows. An entity that carries out video encoding (co-) and decoding (dec) may be called a video “codec” and may be implemented as a software-only solution. Alternatively, a video codec may be implemented as a solution that combines software and hardware, in a so-called hardware-assisted solution.
It may be shown that, in many cases, software-only solutions cannot cope with the inherent computational complexity of the large number of tasks to be carried out in a video codec. In such cases, hardware-assisted solutions are known to be more suitable. One example of a hardware-assisted video codec solution is the known Quick Sync Video (QSV) technology marketed by Intel™ of Santa Clara, Calif. In QSV technology, video codec modules are built into Central Processing Units (CPUs). Two further examples of hardware-assisted video codec solutions are the known NVENC technology and NVDEC technology, marketed by Nvidia™ of Santa Clara, Calif. In NVENC technology and NVDEC technology, video codec modules are integrated into Graphics Processing Units (GPUs). There are also known to be hardware-assisted video codec solutions that are based on Field Programmable Gate Arrays (FPGAs) and hardware-assisted video codec solutions that are based on Application Specific Integrated Circuits (ASICs). Each of these latter examples is known to use a bus that allows for a connection with a host machine according to the known Peripheral Component Interconnect express (PCIe) standard.
Reference will now be made, by way of example, to the accompanying drawings which show example implementations; and in which:
Scalability can be a problem when using video codec modules built into CPUs or integrated into GPUs. The video codec modules built into CPUs and integrated into GPUs are separate from regular computational units. Scaling up the video codec capability requires more CPUs or GPUs even when the regular computational capability of the existing CPUs or GPUs is more than enough to handle non-video-codec tasks. This may be seen to result in wasting computational power of CPUs or GPUs. Furthermore, it is not usually feasible to increase the number of CPUs in an existing computer or server. While it is feasible to increase the number of GPUs in an existing computer or server, GPUs can be expensive and power-hungry, can take up a lot of space in a computer and can oftentimes be installed in a computer only in very limited numbers. As a result, the CPU-based video codec solutions and GPU-based video codec solutions may be understood to lack an ability to achieve a density of video codec modules that is sufficiently high for some applications.
The FPGA-based PCIe devices and ASIC-based PCIe devices may also be recognized as having a scalability problem, since these devices are usually provided in the form of an add-in card (AIC) that is designed to be received in a PCIe expansion slot in a computer. The number of expansion slots in a computer that can hold such cards can be very limited. In addition, in particular for a Linux system, these devices usually require special or proprietary device drivers. A requirement for special or proprietary device drivers may be seen to be problematic when upgrading an existing Linux-based computer or server to support FPGA-based PCIe devices incorporating video codec modules or ASIC-based PCIe devices incorporating video codec modules.
NVM Express® is an open collection of standards and information to fully expose the benefits of non-volatile memory in all types of computing environments from mobile to data center. This collection of standards is often referenced by the acronym NVMe™.
Aspects of the present application relate to devices with video codec functionality. Devices in aspects of the present application connect to a host computer through a PCIe bus and employ the known NVMe™ collection of standards. A device in one aspect of the present application is an NVMe-based storage device (such as a Solid State Drive) with embedded video codec functionality. A device in another aspect of the present application has video codec as its only or main functionality.
An aspect of the present application relates to a device housing a video codec module and having a form factor of a Solid State Drive (SSD). Such a device can, with such a profile, be fit into a computer or a server that has drive bays specifically suited to SSDs. There exist NVMe-based storage systems known as “Just a Bunch Of Flash” or “JBOF.” A device that houses video codec modules and has an SSD form factor could be fit into an NVMe™ JBOF storage system. The NVMe™ JBOF storage system would then be connected to one or more host computers through a PCIe extension cable or an NVMe™ over Fabrics (NVMf or NVMoF) interface.
Conveniently, when a host computer system is equipped with a device configured, according to aspects of the present application, to house video codec modules, scaling up video codec capability in the host computer system is as simple as adding storage capability to the device. As of 2018, there have been servers on the market that have 24 U.2 NVMe™ drive bays and NVMe™ JBOFs on the market that have 32 NVMe™ drive bays. With devices housing video codec modules, according to aspects of the present application, the video codec capabilities of such servers and JBOFs may be seen to be able to achieve very high density.
With an NVMe™ video codec device that houses video codec modules according to aspects of the present application, it is not necessary to install special or proprietary drivers to add video codec module support to an existing host computer executing a Linux operating system. The existing host computer may simply employ the video codec functionality in the NVMe™ video codec device through vendor-specific NVMe™ commands. As of 2018, the NVMe™ device driver is readily available for a wide variety of Linux operating systems.
According to an aspect of the present disclosure, there is provided a computer peripheral video codec device. The computer peripheral video codec device includes a peripheral component interconnect express (PCIe) bus physical interface and a Non-Volatile Memory express (NVMe) module in communication with the physical interface, wherein the NVMe module is configured to execute an NVMe interface. The computer peripheral video codec device also includes a video encoder module in communication with the NVMe module, wherein the video encoder module is configured to initialize, in response to receiving a command from a host computer through the NVMe module, a video encoder instance.
According to another aspect of the present disclosure, there is provided a computer peripheral video codec device. The computer peripheral video codec device includes a peripheral component interconnect express (PCIe) bus physical interface and a Non-Volatile Memory express (NVMe) module in communication with the physical interface, wherein the NVMe module is configured to execute an NVMe interface. The computer peripheral video codec device also includes a video decoder module in communication with the NVMe module, the video decoder module configured to receive, from a host computer through the NVMe module, a command and initialize, in response to the receiving, a video decoder instance.
According to a further aspect of the present disclosure, there is provided a method of encoding a frame of video. The method includes selecting, at a host computer and from a set of Non-Volatile Memory express (NVMe) commands, a write command, thereby producing a selected write command, and issuing, to a video encoder module, the selected write command, thereby causing the video encoder module to receive a raw video frame from the host computer. The method further includes selecting, at the host computer and from the set of NVMe commands, a read command, thereby producing a selected read command, issuing, to the video encoder module, the selected read command, thereby causing the video encoder module to transfer an encoded frame to the host computer, and receiving, from the video encoder module, the encoded frame. Additionally, there is provided a host computer comprising a processor configured to implement a driver operable to carry out this method.
According to a still further aspect of the present disclosure, there is provided a method of decoding a frame of video. The method includes selecting, at a host computer and from a set of Non-Volatile Memory express (NVMe) commands, a write command, thereby producing a selected write command, and issuing, to a video decoder module, the selected write command, thereby causing the video decoder module to receive an encoded video frame from the host computer. The method further includes selecting, at the host computer and from the set of NVMe commands, a read command, thereby producing a selected read command, issuing, to the video decoder module, the selected read command, thereby causing the video decoder module to transfer a decoded video frame to the host computer and receiving, from the video decoder module, the decoded video frame. Additionally, there is provided a host computer comprising a processor configured to implement a driver operable to carry out this method.
According to an even further aspect of the present disclosure, there is provided an implementation of a software stack that provides an NVMe-based video codec service to applications in Linux systems using FFMPEG.
Other aspects and features of the present disclosure will become apparent to those of ordinary skill in the art upon review of the following description of specific implementations of the disclosure in conjunction with the accompanying figures.
It is known to be difficult to achieve high-density video codec functionality with current hardware-assisted video codec solutions, such as video codec modules built into CPUs or integrated into GPUs. This is important because the efficiency of operation of cloud-based video services may be directly dependent upon the density of available video codec functionality.
It may be considered that the storage industry has been dealing with scalability problems since its very beginning and, furthermore, that the industry has come up with good solutions. Aspects of the present application may be seen to mitigate the video codec scalability problem by enabling the video codec devices to use the known NVMe™ interface. Conveniently, the NVMe™ interface was originally defined for storage devices, such as SSDs. Accordingly, aspects of the present application may be seen to take advantage of existing NVMe-based storage scalability solutions.
Scaling inside an existing computer is known as “vertical scaling.” The devices that house one or more video codec modules have, in accordance with aspects of the present application, an NVMe™ interface. It is known that the NVMe™ interface was originally defined for storage devices such as SSDs. It follows that devices that house one or more video codec modules and have a form factor of an SSD can be fit into storage servers that are typically designed to have plenty of NVMe™ SSD drive bays. Consequently, the devices that house one or more video codec modules in accordance with aspects of the present application allow for significant vertical scaling of video codec functionality.
Scaling outside an existing computer is known as “horizontal scaling.” The devices that house one or more video codec modules and have an NVMe™ interface, in accordance with aspects of the present application, can be installed into storage devices, such as NVMe™ JBOFs, that connect with a host computer with PCIe extension cables or with an NVMe-over-fabrics interface.
In a manner consistent with a traditional NVMe™ SSD device, the first type of device 101 has an NVMe™ module 103 and a plurality of NAND flash devices 107 (only one is illustrated) as storage media. NAND flash memory, such as the NAND flash devices 107, is a type of non-volatile storage technology that does not require power to retain data. The NVMe™ module 103 is configured to execute an NVMe™ interface.
Between the NVMe™ module 103 and the plurality of NAND flash devices 107, a Flash Translation Layer 105 connects to the NVMe™ module 103 and to a NAND flash device interface 106. The NAND flash device interface 106 connects to the NAND flash device 107. As is known, the Flash Translation Layer 105 is usually implemented as firmware and is sometimes implemented with some hardware acceleration. The NAND flash device interface 106 typically includes a Physical layer (PHY), a micro sequencer and some firmware.
In a manner inconsistent with traditional NVMe™ SSD devices, the NVMe™ module 103 of the first type of device 101 also maintains a connection with a plurality of video codec modules 202. Each video codec module 202 may include a hardware accelerator (not shown), a processor (not shown) and a memory (not shown) that stores firmware for execution on the processor.
A representative one of the video codec modules 202 among the plurality of video codec modules 202 is illustrated as hosting a plurality of video codec instances 203. The NVMe™ module 103 is defined by NVM Express, Inc. of Beaverton, Oreg. and implements a collection of communication standards between an NVMe™ driver 104 and the plurality of NAND flash devices 107 through the PCIe bus 102. The host computer 100 includes a processor (not shown) in communication with memory (not shown), with the memory storing instructions that allow the processor to implement the NVMe™ driver 104.
Notably, the NVMe™ collection of standards allows for vendor-specific fields within standard NVMe™ messages. Additionally, the NVMe™ collection of standards allows for vendor-specific commands. Some aspects of the present application make use of the NVMe™ vendor-specific fields to enable bidirectional communication between the host computer 100 and the video codec modules 202. Other aspects of the present application make use of the NVMe™ vendor-specific commands to enable communication between the host computer 100 and the video codec modules 202.
In a manner consistent with the first type of device 101, the second type of device 201 has the NVMe™ module 103 and the plurality of video codec modules 202. The representative one of the video codec modules 202 among the plurality of video codec modules 202 is illustrated as hosting the plurality of video codec instances 203. The NVMe™ module 103 facilitates communication with the NVMe™ driver 104 on the host computer 100 through the PCIe bus 102.
It is known that the NVMe™ collection of standards defines a data structure for an “Identify Controller” command and a “namespace” data structure. It is proposed herein that, when queried by the NVMe™ driver 104, the NVMe™ module 103 will report that the second type of device 201 has one namespace and that the one namespace has a size of zero. It should then be clear, to the host computer 100, that the second type of device 201 cannot be used as a storage device. The host computer 100 may, however, be configured to issue vendor-specific commands to communicate with the video codec modules 202 inside the second type of device 201.
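By way of a hedged illustration, the zero-size namespace check described above can be sketched on the host side as follows. The sketch assumes that the host has already obtained the raw Identify Namespace data structure from the NVMe™ driver 104 and relies on the Namespace Size (NSZE) field occupying the first eight bytes of that structure in little-endian order. The function names `namespace_size_blocks` and `is_codec_only_device` are hypothetical and do not appear in the present application.

```python
def namespace_size_blocks(identify_ns: bytes) -> int:
    """Return the Namespace Size (NSZE) field, in logical blocks.

    NSZE occupies bytes 0-7 of the Identify Namespace data structure
    and is stored little-endian.
    """
    if len(identify_ns) < 8:
        raise ValueError("Identify Namespace data is too short")
    return int.from_bytes(identify_ns[0:8], "little")


def is_codec_only_device(identify_ns: bytes) -> bool:
    """Hypothetical helper: a namespace of size zero marks a device that
    cannot be used for storage (the second type of device in the text)."""
    return namespace_size_blocks(identify_ns) == 0
```

A host driver might use such a check to decide whether to expose the device as a block device or to reserve the device for vendor-specific codec commands only.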
NVM Express Inc. published the 1.0 version of the NVMe over Fabrics (NVMe-oF) specification on June 5, 2016. NVMe-oF is designed to extend the high-performance and low-latency benefits of NVMe™ across network fabrics that connect servers and storage systems, such as Fibre Channel, Ethernet and InfiniBand.
To communicate over the network fabrics 302, the host computer 300 has an NVMe-oF host driver 301 and the example NVMe™ JBOF 310 has an NVMe-oF target driver 313. The example NVMe™ JBOF 310 is illustrated as housing a plurality of devices, which may be devices of the first type 101 or devices of the second type 201.
The devices 101/201 each maintain a connection to a PCIe switch 315. The PCIe switch 315 communicates with the NVMe-oF target driver 313 via an NVMe™ driver 314.
For the purposes of aspects of the present application, the vendor-specific fields may be defined to allow for communication between the host computer 100 and the video codec modules 202.
In
It is known that the NVMe™ collection of standards defines a collection of Opcodes associated with various commands related to such tasks as creating and deleting queues, identifying controllers and getting or setting various features. Conveniently, there are some Opcodes that have been left for vendor-specific uses. Commands that may be issued generally fall into one of two categories: admin commands; and NVM commands.
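As a minimal sketch of the distinction between the two command categories, the following example checks whether a given Opcode falls within the range left for vendor-specific use. The ranges used (C0h to FFh for admin commands and 80h to FFh for NVM commands) are taken from the NVMe™ base specification rather than from the present application, and the function name `is_vendor_specific` is hypothetical.

```python
# Vendor-specific Opcode ranges as laid out in the NVMe base specification.
ADMIN_VENDOR_RANGE = range(0xC0, 0x100)  # admin commands: C0h-FFh
NVM_VENDOR_RANGE = range(0x80, 0x100)    # NVM (I/O) commands: 80h-FFh


def is_vendor_specific(opcode: int, admin: bool) -> bool:
    """Return True if `opcode` lies in the vendor-specific range of its category."""
    if not 0 <= opcode <= 0xFF:
        raise ValueError("NVMe opcodes are a single byte")
    return opcode in (ADMIN_VENDOR_RANGE if admin else NVM_VENDOR_RANGE)
```

For instance, `is_vendor_specific(0xC1, admin=True)` evaluates to `True`, whereas the standard Identify admin Opcode 06h does not fall in the vendor-specific range.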
Opcode C1h is associated, in the vendor-specific admin commands table 602, with a command name “Xcoder-Open.” The Xcoder-Open command may be defined to allow the host computer 100 to allocate and initialize a video codec instance 203 within a specified video codec module 202.
Opcode C2h is associated, in the vendor-specific admin commands table 602, with a command name “Xcoder-Close.” The Xcoder-Close command may be defined to allow the host computer 100 to close one of the video codec instances 203 within a specified video codec module 202.
Opcode C3h is associated, in the vendor-specific admin commands table 602, with a command name “Xcoder-Query.” The Xcoder-Query command may be defined to allow the host computer 100 to check the status of each video codec module 202.
Opcode C4h is associated, in the vendor-specific admin commands table 602, with a command name “Xcoder-Reconfigure.” The Xcoder-Reconfigure command may be defined to allow the host computer 100 to reconfigure one of the video codec instances 203.
Opcode 83h is associated, in the vendor-specific NVM commands table 604, with a command name “Xcoder-Write.” The Xcoder-Write command may be defined to allow the host computer 100 to pass a video frame to one of the video codec instances 203.
Opcode 84h is associated, in the vendor-specific NVM commands table 604, with a command name “Xcoder-Read.” The Xcoder-Read command may be defined to allow the host computer 100 to read a video frame from one of the video codec instances 203.
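The Opcode assignments discussed hereinbefore may be collected, by way of a hedged example, into enumerations on the host side. The example uses 84h for the Xcoder-Read command, in agreement with the decoding and encoding flows described hereinafter. The example also packs a 64-byte submission queue entry whose general layout (Opcode and command identifier in the first dword, namespace identifier in the second dword, command dwords 10 through 15 at the end) follows the NVMe™ base specification. The class names, the function `build_sqe` and any meaning attached to command dwords 10 through 15 are assumptions made for illustration only; the data pointer is left as zero because, in practice, a driver would be expected to fill it in.

```python
import struct
from enum import IntEnum


class XcoderAdminOpcode(IntEnum):
    """Vendor-specific admin commands (names from the text, class hypothetical)."""
    OPEN = 0xC1
    CLOSE = 0xC2
    QUERY = 0xC3
    RECONFIGURE = 0xC4


class XcoderNvmOpcode(IntEnum):
    """Vendor-specific NVM commands (names from the text, class hypothetical)."""
    WRITE = 0x83
    READ = 0x84


def build_sqe(opcode: int, command_id: int, nsid: int = 1,
              cdw10_15=(0, 0, 0, 0, 0, 0)) -> bytes:
    """Pack a 64-byte submission queue entry.

    The contents of CDW10-CDW15 are vendor-defined; the text does not
    specify them, so they are passed through untouched here.
    """
    if len(cdw10_15) != 6:
        raise ValueError("exactly six command dwords (CDW10-CDW15) are required")
    # CDW0: Opcode in bits 7:0, command identifier in bits 31:16.
    cdw0 = (int(opcode) & 0xFF) | ((command_id & 0xFFFF) << 16)
    # Layout: CDW0, NSID, 8 reserved bytes, MPTR (8 bytes), DPTR (16 bytes), CDW10-15.
    return struct.pack("<II8xQ16s6I", cdw0, nsid, 0, bytes(16), *cdw10_15)
```

An entry built with `build_sqe(XcoderNvmOpcode.WRITE, command_id=7)` is 64 bytes long and carries 83h in its first byte.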
Initially, the host computer 100 selects the Xcoder-Open command (e.g., Opcode C1h) from among the commands in the vendor-specific admin commands table 602 (
The host computer 100 then determines (step 704) whether there are frames remaining to be decoded.
Upon determining (step 704) that there are frames remaining to be decoded, the host computer 100 determines (step 706) whether the current frame is the last frame remaining to be decoded.
Upon determining (step 706) that the current frame is the last frame remaining to be decoded, the host computer 100 sets (step 708) an end-of-stream flag.
Upon determining (step 706) that the current frame is not the last frame remaining to be decoded or upon setting (step 708) the end-of-stream flag, the host computer 100 selects the Xcoder-Write command (e.g., Opcode 83h) from among the commands in the vendor-specific NVM commands table 604 (
The video decoder module 202 then receives (step 808) the Xcoder-Write command. Responsive to receiving (step 808) the Xcoder-Write command, the video decoder module 202 causes (step 810) the video decoder instance 203 to receive the current frame of compressed (encoded) video from the host computer 100.
After having issued the Xcoder-Write command (step 710) or upon determining (step 704) that there are no more frames remaining to be decoded, the host computer 100 selects the Xcoder-Read command (e.g., Opcode 84h) from among the commands in the vendor-specific NVM commands table 604 (
The video decoder module 202 receives (step 812) the Xcoder-Read command. Responsive to receiving (step 812) the Xcoder-Read command, the video decoder module 202 causes (step 814) the video decoder instance 203 to transfer a decoded YUV frame to the host computer 100.
The host computer 100 then determines (step 714) whether the end-of-stream flag is set.
Upon determining (step 714) that the end-of-stream flag is set, the host computer 100 selects the Xcoder-Close command (e.g., Opcode C2h) from among the commands in the vendor-specific admin commands table 602 (see
The video decoder module 202 receives (step 816) the Xcoder-Close command. Responsive to receiving (step 816) the Xcoder-Close command, the video decoder module 202 closes (step 818) the video decoder instance 203.
Upon determining (step 714) that the end-of-stream flag is not set, the host computer 100 returns to determining (step 704) whether there are frames remaining to be decoded.
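The host-side decoding flow described above can be sketched, under stated assumptions, as a simple loop. In the sketch, `FakeDecoderDevice` is a hypothetical in-memory stand-in for the video decoder module 202 that merely records the commands it receives and prefixes each frame with `YUV:` in place of real decoding. The sketch further assumes that the stream contains at least one frame, since the flow as described sets the end-of-stream flag only upon reaching the last frame.

```python
from collections import deque


class FakeDecoderDevice:
    """Hypothetical in-memory stand-in for the decoder module; records commands."""

    def __init__(self):
        self.log = []
        self._pending = deque()

    def xcoder_open(self):
        self.log.append("open")

    def xcoder_write(self, frame: bytes):
        self.log.append("write")
        self._pending.append(b"YUV:" + frame)  # placeholder for real decoding

    def xcoder_read(self) -> bytes:
        self.log.append("read")
        return self._pending.popleft() if self._pending else b""

    def xcoder_close(self):
        self.log.append("close")


def decode_stream(device, encoded_frames):
    """Drive the open / write / read / close flow from the host side."""
    frames = deque(encoded_frames)
    if not frames:
        # Assumption: the described flow presumes at least one frame.
        raise ValueError("at least one encoded frame is required")
    decoded = []
    end_of_stream = False
    device.xcoder_open()                      # Xcoder-Open
    while True:
        if frames:                            # step 704: frames remaining?
            frame = frames.popleft()
            if not frames:                    # step 706: last frame?
                end_of_stream = True          # step 708: set end-of-stream flag
            device.xcoder_write(frame)        # step 710: Xcoder-Write
        decoded.append(device.xcoder_read())  # Xcoder-Read
        if end_of_stream:                     # step 714: flag set?
            device.xcoder_close()             # Xcoder-Close
            return decoded
```

With two input frames, the stand-in device records the sequence open, write, read, write, read, close, which mirrors the order of commands in the described flow.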
As will be well understood by a person of skill in the relevant art, a set of steps to be carried out at the video codec instance 203 is specific to the type of codec that is implemented by the video codec instance 203. Notably, the steps carried out by a codec implementing the known H.264/AVC standard are distinct from the steps carried out by a codec implementing the known H.265/HEVC standard.
Initially, the host computer 100 selects the Xcoder-Open command (e.g., Opcode C1h) from among the commands in the vendor-specific admin commands table 602 (
The host computer 100 then determines (step 904) whether there are frames remaining to be encoded.
Upon determining (step 904) that there are frames remaining to be encoded, the host computer 100 determines (step 906) whether the current frame is the last frame remaining to be encoded.
Upon determining (step 906) that the current frame is the last frame remaining to be encoded, the host computer 100 sets (step 908) an end-of-stream flag.
Upon determining (step 906) that the current frame is not the last frame remaining to be encoded or upon setting (step 908) the end-of-stream flag, the host computer 100 selects the Xcoder-Write command (e.g., Opcode 83h) from among the commands in the vendor-specific NVM commands table 604 (
The video encoder module 202 then receives (step 1008) the Xcoder-Write command. Responsive to receiving (step 1008) the Xcoder-Write command, the video encoder module 202 causes (step 1010) the video encoder instance 203 to receive the current frame of YUV video from the host computer 100.
After having issued the Xcoder-Write command (step 910) or upon determining (step 904) that there are no more frames remaining to be encoded, the host computer 100 selects the Xcoder-Read command (e.g., Opcode 84h) from among the commands in the vendor-specific NVM commands table 604 (
The video encoder module 202 receives (step 1012) the Xcoder-Read command. Responsive to receiving (step 1012) the Xcoder-Read command, the video encoder module 202 causes (step 1014) the video encoder instance 203 to transfer an encoded video frame to the host computer 100.
The host computer 100 then determines (step 914) whether the end-of-stream flag is set.
Upon determining (step 914) that the end-of-stream flag is set, the host computer 100 selects the Xcoder-Close command (e.g., Opcode C2h) from among the commands in the vendor-specific admin commands table 602 (see
The video encoder module 202 receives (step 1016) the Xcoder-Close command. Responsive to receiving (step 1016) the Xcoder-Close command, the video encoder module 202 closes (step 1018) the video encoder instance 203.
Upon determining (step 914) that the end-of-stream flag is not set, the host computer 100 returns to determining (step 904) whether there are frames remaining to be encoded.
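The device-side handling of the encoding flow can be sketched in a similarly hedged manner. The class `EncoderModule` below is a hypothetical model of the video encoder module 202: the module allocates an encoder instance in response to the Xcoder-Open command, accepts a raw frame in response to the Xcoder-Write command, returns an encoded frame in response to the Xcoder-Read command and releases the instance in response to the Xcoder-Close command. The actual encoding is codec-specific and is therefore supplied by the caller as the `encode` callable, which is an assumption of the sketch.

```python
class EncoderModule:
    """Hypothetical model of a video encoder module holding encoder instances."""

    def __init__(self, encode):
        self._encode = encode   # codec-specific callable supplied by the caller
        self._instances = {}    # instance id -> encoded frames awaiting a read
        self._next_id = 0

    def handle(self, command: str, instance_id=None, payload: bytes = b""):
        if command == "Xcoder-Open":
            # Allocate and initialize a new encoder instance.
            instance_id = self._next_id
            self._next_id += 1
            self._instances[instance_id] = []
            return instance_id
        if instance_id not in self._instances:
            raise KeyError(f"no open encoder instance {instance_id!r}")
        if command == "Xcoder-Write":    # steps 1008 and 1010
            self._instances[instance_id].append(self._encode(payload))
            return None
        if command == "Xcoder-Read":     # steps 1012 and 1014
            queue = self._instances[instance_id]
            return queue.pop(0) if queue else b""
        if command == "Xcoder-Close":    # steps 1016 and 1018
            del self._instances[instance_id]
            return None
        raise ValueError(f"unsupported command: {command}")
```

Keeping a separate queue per instance reflects the statement that a single video codec module may host a plurality of video codec instances; returning an empty frame when nothing is ready is a simplification chosen for the sketch.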
The first type of device 101 and the second type of device 201 may be formed to have a so-called “U.2” form factor. The U.2 form factor is a standard enterprise SSD form factor. Conveniently, the U.2 form factor is known to enable improved vertical scaling.
The software stack includes the NVMe Video Codec Device 101/201 familiar from
In the Kernel Space 1202, the NVMe™ driver 104 communicates with the NVMe Video Codec Device 101/201. A user library of NVMe-based codec Application Programming Interfaces (APIs) 1204 interfaces with the NVMe™ driver 104.
The User Space 1203 includes an FFMPEG application 1208. FFMPEG is a free software project, the product of which is a vast software suite of libraries and programs for handling video, audio and other multimedia files and streams. At the core of the project is the FFMPEG application 1208 itself, designed for command-line-based processing of video and audio files. The FFMPEG application 1208 is widely used for format transcoding, basic editing (trimming and concatenation), video scaling, video post-production effects and standards compliance. The FFMPEG project has developed “libavcodec,” an audio/video codec library used by many commercial and free software products. The User Space 1203 includes an implementation 1206 of the FFMPEG libavcodec.
In operation, the FFMPEG application 1208 may invoke video codec functions through APIs provided by the libavcodec 1206.
The User Space 1203 also includes another application 1210. The other application 1210 may select to work with the FFMPEG application 1208, with the libavcodec 1206 or directly with the NVMe-based codec APIs 1204.
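As a hedged sketch of the first of these options, the other application 1210 may assemble a command line for the FFMPEG application 1208. The `-i` and `-c:v` options are standard FFMPEG options for naming the input and selecting the video codec. The encoder name `h264_nvme_xcoder` is purely hypothetical, since the present application does not state the name under which the NVMe-based codec would be registered with the libavcodec 1206; the function name `build_ffmpeg_command` is likewise an assumption.

```python
def build_ffmpeg_command(src: str, dst: str,
                         encoder: str = "h264_nvme_xcoder") -> list:
    """Build an argument list for the ffmpeg command-line tool.

    `encoder` defaults to a hypothetical codec name; substitute the name
    actually registered by the NVMe-based codec library.
    """
    if not src or not dst:
        raise ValueError("source and destination paths are required")
    return ["ffmpeg", "-i", src, "-c:v", encoder, dst]
```

The resulting list may be passed to `subprocess.run(..., check=True)` on a system where the FFMPEG application and a matching codec are installed.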
In the preceding, it has been recognized that the read and write commands in the current NVMe standard are not configured for use to communicate with an NVMe video codec device of the types (the first type 101,
The above-described implementations of the present application are intended to be examples only. Alterations, modifications and variations may be effected to the particular implementations by those skilled in the art without departing from the scope of the application, which is defined by the claims appended hereto.
Number | Date | Country | Kind |
---|---|---|---|
201811478858.3 | Dec 2018 | CN | national |
201811479224.X | Dec 2018 | CN | national |
The present application is a divisional application of U.S. patent application Ser. No. 16/415,759 filed May 17, 2019, which claims priority to U.S. Provisional Patent Application No. 62/779,868, filed Dec. 14, 2018, Chinese Patent Application No. 201811479224.X, filed Dec. 5, 2018, and Chinese Patent Application No. 201811478858.3, filed Dec. 5, 2018, all the contents of which are hereby incorporated herein by reference.
Number | Date | Country |
---|---|---|
62779868 | Dec 2018 | US |
Relation | Number | Date | Country |
---|---|---|---|
Parent | 16415759 | May 2019 | US |
Child | 17100435 | | US |