The invention relates to a method and an apparatus for processing digital content, such as digital image sequences. More specifically, a method and an apparatus for processing digital content are described, which allow the digital content to be processed by one or more processing nodes with a reduced delay caused by transmission of the digital content to the one or more processing nodes.
Motion picture films, as a prominent example of digital content, are part of our cultural heritage. Unfortunately, they are often affected by undesirable objects such as scratches, dust, dirt, stains, abrasion and the like. Today a lot of effort is made to restore motion picture films. Usually, restoration is carried out digitally after scanning the motion picture films.
Manual restoration of digitized films by finding and removing each scratch and dirt object is evidently time-consuming, although there is software on the market that assists artists in many aspects of the job. In particular, manual restoration of old content with large amounts of scratch or dirt may not be financially viable. This is even more the case for large archives with footage of unknown commercial value.
The application of automatic restoration software with algorithms that try to detect and remove scratch and dirt is the only viable alternative to a manual process. At present there are a number of software and hardware products available on the market which perform detection and removal of scratch and dirt more or less automatically. Usually a manual adjustment of certain parameters is needed to fine tune detection and removal, sometimes individually for each scene.
After processing, the restored output or parts thereof have to be either accepted or rejected, with the option of rerunning restoration with different parameters. This is unsatisfactory since the adjustment takes time and the quality may not be good enough in critical scenes for which the parameters have not been specially adapted.
Detection of scratch and dirt is a nontrivial problem that is currently not totally solved. There is still a certain ratio of objects that will either not be detected or falsely detected.
Recently it has been proposed to split the restoration process of motion picture films into detection of objects, e.g. scratch and dirt objects, and removal using an automatic metadata driven workflow, and to further split the detection process and the removal process into a plurality of smaller processing tasks, which are allocated to a plurality of processing nodes.
Motion picture films are processed in either single frame file formats, such as DPX (SMPTE 268-2003) or TIFF (ISO 12639:2004), or in container stream based file formats, such as MXF (SMPTE 377M) or AVI (Video for Windows SDK, Microsoft) and MOV (QuickTime, Mac OSX SDK, Apple).
The single frame file formats are ideal for pipelining and multi-processing if the smallest work unit is a frame. The motion picture film can be split into frames and the frames can be transmitted to the processing nodes one by one. If enough frames have reached a processing node, the processing can start and continue by receiving new frames in parallel. This means that even if the number of frames of a motion picture film is very large, the processing can start immediately after receiving a few frames at the processing nodes.
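The frame-wise pipelining described above can be sketched as follows. This is a minimal illustration only, assuming a processing task that operates on a sliding window of consecutive frames; the names `process_as_frames_arrive`, `window` and `task` are hypothetical and do not appear in the description.

```python
from collections import deque
from typing import Callable, Iterable, Iterator, Tuple, TypeVar

Frame = TypeVar("Frame")
Result = TypeVar("Result")


def process_as_frames_arrive(
    arriving: Iterable[Frame],
    window: int,
    task: Callable[[Tuple[Frame, ...]], Result],
) -> Iterator[Result]:
    """Run `task` on each window of frames as soon as it is complete.

    `arriving` stands in for frames received one by one from the data
    repository (hypothetical; any iterable works here).
    """
    buffer: deque = deque(maxlen=window)
    for frame in arriving:
        buffer.append(frame)
        # Processing starts once the first `window` frames have arrived
        # and continues while later frames are still being received.
        if len(buffer) == window:
            yield task(tuple(buffer))
```

With integers standing in for frames, `list(process_as_frames_arrive(range(5), 3, sum))` produces one result per complete window, the first of which becomes available after only three frames have arrived.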
In contrast, in order to process indexed files like AVI or MOV, the whole motion picture film needs to be transmitted to the processing nodes because the index of the motion picture film is located at the end of the file. Therefore, the processing cannot start during transmission and is delayed for a significant amount of time, depending on the bandwidth of the network to the processing nodes. The same problem arises with any digital content that is transmitted to multiple processing nodes as a large file with important management data like index tables.
It is thus an object of the present invention to propose a solution for processing digital content with one or more processing nodes, which does not require transmission of the complete digital content to the multiple processing nodes before processing can be started.
According to the invention, a method for processing digital content stored in a data repository using one or more processing nodes with associated storage systems, the digital content being arranged in a container file comprising internal file management information in accordance with a container file format, comprises the steps of:
In order to facilitate the above proposed method, an apparatus for processing digital content stored in a data repository using one or more processing nodes with associated storage systems, the digital content being arranged in a container file comprising internal file management information in accordance with a container file format, comprises:
The solution according to the present invention proposes to create a container file on the storage systems of the processing nodes, e.g. an MXF file, an AVI file, or a MOV file, with the original file size but without valid data. Then the file transfer first writes the internal file management information, e.g. index tables and/or header information, at the specific file offsets specified by the container file format. Only then are the content parts of the file transmitted, e.g. frames of a digital image sequence. This creates a digital content file at the processing nodes, which looks like a valid file. In this way a processing node can read the first content elements from the file after a short time. The solution enables processing to start much earlier than with known solutions, namely before the whole file has been transmitted to the one or more processing nodes. Depending on the network bandwidth this can save several hours.
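The transfer order described above may be illustrated by the following sketch. It is not a parser for any real container format: the byte offsets of the management information and of the content elements are assumed to be known in advance and are passed in as `(offset, length)` pairs, and all function names are hypothetical.

```python
from typing import Iterator, Sequence, Tuple

Region = Tuple[int, int]  # (byte offset, length) within the container file


def create_placeholder(path: str, total_size: int) -> None:
    """Create a file of the original size that holds no valid data yet."""
    with open(path, "wb") as f:
        f.truncate(total_size)


def write_region(path: str, offset: int, payload: bytes) -> None:
    """Write `payload` at `offset` without changing the file size."""
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(payload)


def transfer_container(
    source: bytes,
    dest_path: str,
    management_regions: Sequence[Region],
    content_regions: Sequence[Region],
) -> Iterator[Region]:
    """Copy `source` to `dest_path`, management information first.

    Yields each content region once it has been written, so a caller can
    begin processing while the remaining regions are still outstanding.
    """
    create_placeholder(dest_path, len(source))
    # Header and index data go first, at the offsets the format prescribes.
    for offset, length in management_regions:
        write_region(dest_path, offset, source[offset:offset + length])
    # Only then are the content elements written, one at a time.
    for offset, length in content_regions:
        write_region(dest_path, offset, source[offset:offset + length])
        yield (offset, length)
```

In this sketch the source is held in memory for brevity; an actual transfer would read from the data repository over the interconnect. Writing the generator this way makes the intermediate state observable: after the first yielded region the destination file already has its final size and its management information in place.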
Advantageously, information about which content elements of the digital content are available at the storage systems is stored in a management file, a system memory, or other storage. For this purpose preferably a content tracker is provided. In this way it is ensured that the processing nodes do not attempt reading of invalid areas of the file.
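A content tracker of this kind could, for example, be sketched as follows. The class name `ContentTracker`, its methods, and the use of a JSON file as the management file are assumptions made for illustration; the description does not prescribe a particular data structure or file layout.

```python
import json
import threading
from typing import List, Set


class ContentTracker:
    """Records which content elements are already valid on the storage system."""

    def __init__(self) -> None:
        self._available: Set[int] = set()
        # The writer and the processing node may run in different threads.
        self._lock = threading.Lock()

    def mark_available(self, element_index: int) -> None:
        with self._lock:
            self._available.add(element_index)

    def is_available(self, element_index: int) -> bool:
        with self._lock:
            return element_index in self._available

    def available_elements(self) -> List[int]:
        with self._lock:
            return sorted(self._available)

    def save(self, path: str) -> None:
        """Persist the state to a management file (JSON is an assumption)."""
        with open(path, "w", encoding="utf-8") as f:
            json.dump(self.available_elements(), f)

    @classmethod
    def load(cls, path: str) -> "ContentTracker":
        tracker = cls()
        with open(path, "r", encoding="utf-8") as f:
            for element_index in json.load(f):
                tracker.mark_available(element_index)
        return tracker
```

A processing node would consult `is_available` before reading an element, so that areas of the placeholder file that have not yet been filled are never interpreted as content.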
Favorably, before initiating processing of the content elements it is checked whether all content elements of the digital content necessary for a specific processing task are available at the storage systems. This helps to prevent processing of the digital content from being interrupted because necessary data is missing. Such interruptions could otherwise lead to the need to process the digital content up to the interruption point once again, which could increase the processing time.
Advantageously, results of the processing of the content elements are provided to a further stage of a processing workflow. For example, the results of a dirt detection process for a movie are made available to a dirt removal process. This allows the removal process to start even though the detection process has not yet been completed for the whole movie.
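The availability check can be expressed in a few lines. The following sketch assumes that content elements are identified by integer indices and that the set of available indices is known; the function names are hypothetical.

```python
from typing import Iterable, List


def missing_elements(required: Iterable[int], available: Iterable[int]) -> List[int]:
    """Return the required element indices that are not yet available."""
    return sorted(set(required) - set(available))


def can_start(required: Iterable[int], available: Iterable[int]) -> bool:
    """A task may start only when none of its required elements is missing."""
    return not missing_elements(required, available)
```

For instance, a task that needs elements 0 to 2 may start when elements 0, 1, 2 and 5 are available, whereas a task that needs elements 4 to 6 has to wait for elements 4 and 6.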
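The hand-over of results between workflow stages may be sketched with chained generators. The toy definition of a defect used here (a negative sample value) and the names `detect` and `remove` are assumptions for illustration only and do not represent an actual detection or removal algorithm.

```python
from typing import Dict, Iterable, Iterator, List, Sequence, Tuple

Frame = List[int]
Detection = Tuple[int, Dict[str, List[int]]]  # (frame index, metadata)


def detect(frames: Sequence[Frame], log: List[str]) -> Iterator[Detection]:
    """Yield per-frame metadata as soon as each frame has been analysed."""
    for index, frame in enumerate(frames):
        log.append(f"detect {index}")
        # Toy rule (assumption): negative samples count as defects.
        defects = [pos for pos, value in enumerate(frame) if value < 0]
        yield index, {"defects": defects}


def remove(
    frames: Sequence[Frame], detections: Iterable[Detection], log: List[str]
) -> Iterator[Frame]:
    """Consume detection metadata frame by frame and repair each frame."""
    for index, metadata in detections:
        log.append(f"remove {index}")
        cleaned = list(frames[index])
        for pos in metadata["defects"]:
            cleaned[pos] = 0  # placeholder repair, not a real inpainting step
        yield cleaned
```

Because `remove` pulls one result at a time from `detect`, the log entries of the two stages interleave: removal for the first frame runs before detection for the second frame has begun.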
For a better understanding the invention shall now be explained in more detail in the following description with reference to the figures. It is understood that the invention is not limited to this exemplary embodiment and that specified features can also expediently be combined and/or modified without departing from the scope of the present invention as defined in the appended claims. In the figures:
In order to process the digital content, the required digital content first has to be transmitted and, if necessary, transcoded or transformed, from the data repository 1 to the storage systems 3. Depending on the interconnect 4 this can be a time-consuming task with a duration of the same order of magnitude as the processing time. To shorten the overall run-time a pipelined processing scheme is employed, which means that the processing does not await transmission of the whole digital content. Instead, the processing starts immediately after the minimum number of frames has been transmitted, i.e. all frames that are needed for the specific processing steps and to ensure full load of the CPUs of the processing nodes 2. Of course, the processing nodes 2 need information about the currently available frames to prevent processing of undefined data. This is easy to implement using frame based files like DPX, TIFF, PNG, as exemplarily illustrated in
However, the situation is different for container files 6 like AVI, MOV, MXF etc., as shown in
In order to allow processing of digital content that is transmitted as a container file, a workflow as illustrated in
An apparatus 30 for writing digital content to a local or network storage system 3 is schematically depicted in
The apparatus 30 includes a writer engine 31 for creating a placeholder file on the storage system 3 via an interface 32 and for filling this placeholder file with internal file management information 7, 8 and elements of the digital content retrieved via an input 33 from the data repository 1. The writer engine 31 is located either on the sender side or on the receiver side of a system for processing digital content. If the writer engine works locally, sender side and receiver side are the same. If the writer engine 31 is located on the receiver side, it preferably also comprises a content tracker 34 for tracking which elements of the digital content are already available on the storage system 3.
Number | Date | Country | Kind |
---|---|---|---|
11306591.6 | Nov 2011 | EP | regional |