SEARCHABLE MULTIMEDIA STREAM

Information

  • Patent Application
  • 20070156843
  • Publication Number
    20070156843
  • Date Filed
    November 29, 2006
  • Date Published
    July 05, 2007
Abstract
The present invention provides a system and a method for making an archived conference or presentation searchable after it has been stored in the archive server. According to the invention, one or more media streams coded according to H.323 or SIP are transmitted to a conversion engine, which converts the multimedia content into a standard streaming format. This format may be a cluster of files, each representing a certain medium (audio, video, data), and/or a structure file that synchronizes and associates the different media. When the conversion has been carried out, the structure file is copied and forwarded to a post-processing server. The post-processing server includes, among other things, a speech recognition engine that generates a text file of alphanumeric characters representing all recognized words in the audio file. The text file is then entered into the cluster of files, associating each identified word with a timing tag in the structure file. After this post-processing, a conventional search engine can readily find key words and their associated points of time in the media stream.
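The lookup that the abstract describes, from a key word to the points of time at which it was spoken, can be illustrated with a minimal Python sketch. The list of (word, seconds) pairs stands in for the text file produced by the speech recognition engine; the sample data and the names `build_index` and `find` are assumptions made for illustration and do not come from the application.

```python
from collections import defaultdict

def build_index(recognized):
    """Map each lowercased word to the sorted times (in seconds) at which it was recognized."""
    index = defaultdict(list)
    for word, seconds in recognized:
        index[word.lower()].append(seconds)
    for times in index.values():
        times.sort()
    return dict(index)

def find(index, keyword):
    """Return every time position associated with the key word, or an empty list."""
    return index.get(keyword.lower(), [])

# Hypothetical recognizer output: (recognized word, time position in seconds).
recognized = [
    ("welcome", 0.8),
    ("budget", 12.4),
    ("schedule", 47.3),
    ("Budget", 95.0),
]

index = build_index(recognized)
budget_times = find(index, "budget")
print(budget_times)
```

A player could then seek to any of the returned positions. How the application actually lays out the text file within the cluster of files is not specified here, so the in-memory dictionary is only one possible representation.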
Description

BRIEF DESCRIPTION OF THE DRAWINGS

In order to make the invention more readily understandable, the discussion that follows will refer to the accompanying drawings, in which:



FIG. 1 illustrates a state diagram for Markov modelling,



FIG. 2 shows the data structure of the ASF streaming format,



FIG. 3 is a closer view of two specific parts of the data structure of the ASF streaming format,



FIG. 4 shows a flow chart illustrating the overall steps of one embodiment of the present invention.


Claims
  • 1. A method in a streaming and archiving system for post-processing a multimedia stream converted from a conventional conference format coded data stream for the purpose of making the multimedia stream searchable, characterized in
    monitoring, in an H.323/SIP compatible conversion engine, whether an H.323 or SIP coded data stream is received, and if so, converting the conventional conference format coded data stream to a multimedia stream in a defined multimedia streaming format including timing information related to respective fragments of the multimedia stream,
    analyzing fragments of sound from an audio part of said multimedia stream in a speech recognition engine by generating a model of each respective fragment of sound or sequences of fragments of sound,
    comparing the respective model of each respective fragment of sound or sequences of fragments of sound with reference models of pronunciations of known words or phonemes stored in a database,
    assigning timing information referring to a fragment or a sequence of fragments whose model said speech recognition engine has found to match a reference model of a pronunciation of a known word in said database, and associatively storing said timing information and said word in a text file.
  • 2. A method according to claim 1, characterized in that the step of analyzing further includes:
    extracting and temporarily storing information indicating a time position within said multimedia stream of the current fragment of sound,
    if a match between a model of a current fragment of sound or a sequence of fragments of sound with said current sound included and a reference model of a pronunciation of a known word or phoneme in said database is found, then using said time position as said timing information, which is associatively stored together with said word or an input word or tag in said text file.
  • 3. A method according to claim 1 or 2, characterized in storing, in the streaming and archiving system, said text file when all fragments of sound from said audio part of said multimedia stream have been analyzed, making said text file accessible for later search in said multimedia stream.
  • 4. A method according to one of the preceding claims, characterized in that said models and reference models include Markov models.
  • 5. A method according to one of the preceding claims, characterized in that said defined multimedia streaming format is an Active Stream Format (ASF).
  • 6. A method according to claim 5, characterized in that said timing information is a time field and/or an offset field of the ASF associated with the start or the end of the matched fragment or sequence of fragments.
  • 7. A method according to one of the preceding claims, characterized in that said conventional conference format coded data stream is an H.323, H.320 or SIP coded data stream.
  • 8. A system for post-processing a multimedia stream converted from a conventional conference format coded data stream for the purpose of making the multimedia stream searchable,
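
The analyzing, comparing and storing steps recited in claims 1 and 2 can be sketched roughly as follows. This is an illustrative Python sketch only: each "model" is reduced to a small feature vector and compared by Euclidean distance, which is a simple stand-in and not the Markov models of claim 4. The reference vectors, the threshold and the function names are all hypothetical.

```python
import math

# Hypothetical reference "models": known word -> feature vector.
# A real engine would hold trained acoustic models in the database instead.
REFERENCE_MODELS = {
    "hello": (1.0, 0.0, 0.5),
    "search": (0.0, 1.0, 0.2),
}
MATCH_THRESHOLD = 0.3  # assumed maximum distance that still counts as a match

def best_match(model):
    """Return the closest known word, or None if nothing is within the threshold."""
    word, reference = min(
        REFERENCE_MODELS.items(),
        key=lambda item: math.dist(model, item[1]),
    )
    return word if math.dist(model, reference) <= MATCH_THRESHOLD else None

def transcribe(fragments):
    """Turn (time_position_seconds, model) pairs into 'time<TAB>word' lines."""
    lines = []
    for time_position, model in fragments:
        # The time position is held while the fragment is compared (claim 2).
        word = best_match(model)
        if word is not None:
            # On a match, time position and word are stored together.
            lines.append(f"{time_position:.2f}\t{word}")
    return lines

# Hypothetical fragments: the second one matches no reference model.
fragments = [
    (0.50, (0.9, 0.1, 0.5)),
    (1.20, (0.5, 0.5, 0.9)),
    (2.75, (0.1, 1.0, 0.2)),
]
text_file_lines = transcribe(fragments)
print("\n".join(text_file_lines))
```

Fragments without a match contribute no line, so the resulting text holds only recognized words paired with their time positions, which is the form a later key word search would consume.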
Priority Claims (1)
Number    Date      Country  Kind
20056255  Dec 2005  NO       national