Home networks provide users with the ability to share files, peripheral equipment such as printers and scanners, and often a high-speed Internet connection. As home networking capabilities continue to grow and evolve, digital multimedia content including music, video, pictures, games and other data is increasingly accessed and shared among a larger variety of electronic devices in the home. For example, advanced programming content such as high-definition television (“HDTV”), pay-per-view entertainment (“PPV”) and video-on-demand (“VOD”) enters the home from a broadcast source such as a cable or satellite network source and is distributed over the home network where it is stored or consumed. Consumers are thus developing expectations that they will have more control over multimedia content, both to watch and use content when they want, as well as to move content to different types of displays in various rooms in the home.
Home networks often include set top boxes (“STBs”) which enable the multimedia content to be selected and played on a television or home entertainment system that is connected to the STB. In addition, as personal computers (“PCs”) have gained larger displays and more capable audio playback, users are also relying on PCs more frequently to play multimedia content. Some PCs incorporate television tuners that allow television programming to be selected and played. However, PCs more frequently host a media player that is capable of rendering digital content. In addition to PCs, media players are commonly installed on other multimedia and portable electronic devices such as personal digital assistants (“PDAs”), mobile phones, game consoles, and multimedia players that can play video and music.
Most of the popular media players are not currently capable of displaying closed captioning that is encoded into video content according to broadcast television standards. Closed captioning is an assistive technology designed to provide access to multimedia content for persons with hearing disabilities by displaying the audio portion of the content as text on a display screen. While digital multimedia content delivery from television sources and the Internet continues to converge in many areas, a single method for delivering closed captioning to users across all multimedia platforms and devices has not emerged. As a result, closed captioning is not always available to users when watching video on PCs and other electronic multimedia devices.
Disclosed is a multimedia server and related methods for distributing closed captioning over a network to one or more client devices each running a media player that does not support standardized closed captioning. The client devices typically include PCs, multimedia players, and portable electronic devices that are coupled to a home network. The multimedia server receives a media stream including closed captioning that is encoded according to a closed captioning standard such as Consumer Electronics Association CEA-608-B, CEA-708-B, Advanced Television Systems Committee ATSC A/53 or the Society of Cable Telecommunications Engineers SCTE 20 and/or SCTE 21. The multimedia server transcodes the closed captioning into a format that is usable by the media player and transmits the transcoded closed captioning to the client device over the network so that the media player can render the closed captioning synchronously with programming content included in the media stream. Advantageously, the multimedia server enables a user to see closed captioning displayed on the client device that would otherwise be lost.
Closed captioning has historically been a way for deaf and hard-of-hearing people to read a transcript of the audio portion of a video program, film, movie or other presentation. Others benefiting from closed captioning include people learning English as an additional language and people first learning how to read. Many studies have shown that using captioned video presentations enhances retention and comprehension levels in language and literacy education.
As the video plays, words and sound effects are expressed as text that can be turned on and off at the user's discretion, so long as the user has a caption decoder. In the United States, since the passage of the Television Decoder Circuitry Act of 1990, manufacturers of most television receivers have been required to include closed captioning decoding capability. Beginning in July 1993, the Federal Communications Commission (“FCC”) required all analog television sets with screens 13 inches or larger sold or manufactured in the United States to contain built-in decoder circuitry to display closed captioning. Beginning July 1, 2002, the FCC also required that digital television (“DTV”) receivers include closed captioning display capability. In 1996, Congress required video program distributors (cable operators, broadcasters, satellite distributors, and other multi-channel video programming distributors) to close caption their television programs. Pursuant to this requirement, the FCC in 1997 set a transition schedule requiring distributors to provide an increasing amount of captioned programming.
The term “closed” in closed captioning means that not all viewers see the captions—only those who decode and activate them. This is distinguished from open captions, where the captions are permanently burned into the video and are visible to all viewers. As used in the remainder of the description that follows, the term “captions” refers to closed captions unless specifically stated otherwise.
Closed captions are further distinguished from “subtitles.” In the U.S. and Canada, subtitles assume the viewer can hear but cannot understand the language, so they only translate dialogue and some onscreen text. Closed captions, by contrast, aim to describe all significant audio content, as well as “non-speech information,” such as the identity of speakers and their manner of speaking.
For live programs in countries that use the analog NTSC (National Television System Committee) television system, like the U.S. and Canada, spoken words comprising the television program's soundtrack are transcribed by a reporter (i.e., like a stenographer/court reporter in a courtroom using stenotype or stenomask equipment). Alternatively, in some cases the transcript is available beforehand and captions are simply displayed during the program. For prerecorded programs (such as recorded video programs on television, videotapes, and DVDs), audio is transcribed and captions are prepared, positioned, and timed in advance.
For all types of NTSC programming, captions are encoded into Line 21 of the vertical blanking interval (“VBI”)—a part of the TV picture that sits just above the visible portion and is usually unseen. “Encoded,” as used in the analog case here (and in the case of digital video below) means that the captions are inserted directly into the video stream itself and are hidden from view until extracted by an appropriate decoder or decoding process.
Closed caption information is added to Line 21 of the VBI in either or both the odd and even fields of the NTSC television signal. Particularly with the availability of Field 2, the data delivery capacity (or data bandwidth) far exceeds the requirements of simple program related captioning in a single language. Therefore, the closed captioning system allows for additional “channels” of program-related information to be included in the Line 21 data stream. In addition, multiple channels of non-program related information are possible.
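The Line 21 mechanics described above can be sketched in a few lines of code. Each field carries two caption bytes per frame, and each byte holds seven data bits plus an odd-parity bit. The sketch below is a simplified illustration rather than a conforming decoder: it strips parity and distinguishes printable text pairs from control-code pairs, and it ignores the handful of accented-character substitutions the real CEA-608 character set makes in the printable range.

```python
def strip_parity(byte: int) -> int:
    # CEA-608 bytes carry seven data bits plus an odd-parity bit (MSB).
    if bin(byte).count("1") % 2 != 1:
        raise ValueError(f"parity error in byte {byte:#04x}")
    return byte & 0x7F

def decode_pair(b1: int, b2: int) -> str:
    """Decode one Line 21 byte pair. Printable pairs are caption text;
    pairs whose first byte falls below 0x20 are control codes that set
    the channel, styling, and caption mode (Pop-On, Paint-On, Roll-Up)."""
    c1, c2 = strip_parity(b1), strip_parity(b2)
    if c1 < 0x20:
        return ""  # control-code pair; a real decoder would act on it
    # The CEA-608 character set mostly tracks ASCII in this range, with
    # a few accented-letter substitutions that this sketch ignores.
    return chr(c1) + (chr(c2) if c2 >= 0x20 else "")
```

For example, the parity-encoded byte pair 0xC8, 0xE9 decodes to the caption text "Hi".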
The decoded captions are presented to the viewer in a variety of ways. In addition to various character formats such as upper/lower case, italic, and underline, the characters may “Pop-On” the screen, appear to “Paint-On” from left to right, or continuously “Roll-Up” from the bottom of the screen. Captions may appear in different colors as well. The way in which captions are presented, as well as their channel assignment, is determined by a set of overhead control codes which are transmitted along with the alphanumeric characters which form the actual caption in the VBI.
Sometimes music or sound effects are also described using words or symbols within the caption. The Consumer Electronics Association (“CEA”) defines the standard for NTSC captioning in CEA-608-B. Virtually all television equipment including videocassette players and/or recorders (collectively, “VCRs”), DVD players, DVRs (digital video recorders) and STBs with NTSC output can output captions on line 21 of the VBI in accordance with CEA-608-B.
For ATSC (Advanced Television Systems Committee) programming (i.e., digital- or high-definition television, DTV and HDTV, respectively, collectively referred to here as “DTV”), three data components are encoded in the video stream: two are backward compatible Line 21 captions, and the third is a set of up to 63 additional caption streams encoded in accordance with another standard—CEA-708-B. DTV closed captioning is covered by the ATSC A/53 standard for the carriage of line 21 VBI data which was extended under the SCTE 21 standard for CEA-608-B-compliant closed captioning to be supported by one or more VBI lines other than line 21. All DTV signals are compliant with the MPEG-2 protocol (Moving Pictures Expert Group). This protocol is commonly used in digital cable and satellite services, streaming Internet video and DVD (digital versatile disc) and defines the syntax and semantics for the movement of compressed digital content across a network.
Closed captioning in DTV is based around a caption window (i.e., like a “window” familiar to a computer user where the caption window overlays the video and closed captioning text is arranged within it). DTV closed captioning and related data is carried in three separate portions of the MPEG-2 data stream. They are the picture user data bits, the Program Map Table (PMT), and the Event Information Table (EIT). The captioning text itself and window commands are carried in the MPEG-2 Transport Channel in the picture user data bits. A captioning service directory (which shows which caption services are available) is carried in the PMT and optionally for cable, in the EIT. To ensure compatibility between analog and digital closed captioning (CEA-608-B and CEA-708-B, respectively), the MPEG-2 transport channel is designed to carry both formats.
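As a rough illustration of how caption bytes ride in the picture user data, the sketch below walks an ATSC-style user data blob: the “GA94” ATSC identifier, a user_data_type_code of 0x03 for caption data, a flags/count byte, and three-byte caption constructs whose low two bits separate CEA-608 field 1/field 2 pairs from CEA-708 packet data. This is a simplified reading of the A/53 cc_data() layout; a production parser must follow the standard itself.

```python
def parse_cc_data(user_data: bytes):
    """Pull caption byte pairs out of an ATSC picture user data blob.

    Simplified sketch: 'GA94' identifier, type 0x03, a flags/count byte
    (process_cc_data_flag in bit 6, cc_count in the low five bits), an
    em_data byte, then cc_count three-byte constructs (marker bits plus
    cc_valid and cc_type, followed by two caption data bytes).
    """
    if user_data[0:4] != b"GA94" or user_data[4] != 0x03:
        return []
    flags = user_data[5]
    if not flags & 0x40:           # process_cc_data_flag not set
        return []
    cc_count = flags & 0x1F
    pairs = []
    pos = 7                        # skip the em_data byte
    for _ in range(cc_count):
        b0, b1, b2 = user_data[pos:pos + 3]
        cc_valid = bool(b0 & 0x04)
        cc_type = b0 & 0x03        # 0/1: CEA-608 field 1/2; 2/3: CEA-708
        if cc_valid:
            pairs.append((cc_type, b1, b2))
        pos += 3
    return pairs
```

A blob carrying one valid CEA-608 field 1 pair would thus yield a single (0, byte1, byte2) tuple.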
The backwards compatible line 21 captions are important because some users want to receive DTV signals but display them on their NTSC television sets. Thus, DTV signals can deliver Line 21 caption data in a CEA-708-B format. In other words, the data does not look like Line 21 data, but once recovered by the user's decoder, it can be converted to Line 21 caption data and inserted into Line 21 of the NTSC video signal that is sent to an analog television. Thus, line 21 captions transmitted via DTV in the CEA-708-B format come out looking identical to the same captions transmitted via NTSC in the CEA-608-B format. This data has all the same features and limitations of CEA-608-B data, including the speed at which it is delivered to the user's equipment.
While U.S. law and FCC regulations cover closed captioning support in broadcast television, there is no equivalent scheme governing video delivered over the Internet. Accordingly, most of the popular media players such as Microsoft Windows Media Player, Real Networks RealPlayer, and Apple QuickTime, iTunes, and iTunes Video—which were developed primarily for streaming video over the Internet—have no native support for closed captioning that is encoded according to the standards for broadcast television, including CEA-608-B, CEA-708-B, ATSC A/53 and SCTE 20 and/or SCTE 21 (wherein a “native” format is one that the media player normally reads). That is, these media players are incapable of extracting and using the standardized closed captioning in its original form.
Turning now to
Client devices 125 are typically selected from consumer electronic devices including, for example, PCs, STBs, thin client STBs, mobile phones, music players, multimedia players, handheld game devices, laptop and notebook computers, webpads, PDAs, and the like. Client devices 125 each host a media player, which is generally implemented as a software application either on a standalone basis or built into a web browser (often as a “plug-in”); examples include Microsoft Windows Media Player, Real Networks RealPlayer, and Apple QuickTime, iTunes, and iTunes Video.
Multimedia server 105 receives a modulated A/V (audio/video) signal 130 from a media content source 135. In most applications, A/V signal 130 is a digital signal carrying multiple channels of audio, video, and closed captioning data in accordance with CEA-708-B for digital television. In alternative arrangements, A/V signal 130 is an analog signal carrying multiple channels of audio, video and closed captioning in line 21 of the VBI in accordance with CEA-608-B for NTSC television.
Media content source 135 is alternatively arranged from such sources as a satellite network source, such as one used in conjunction with a direct broadcast service, a CATV (community antenna television) source for implementing cable television and broadband Internet access services, and a telecommunications network for implementing a digital subscriber line (“DSL”) service.
Multimedia server 105 is commonly incorporated into a STB and is optionally configured with DVR capabilities and thus includes a hard disk drive or other memory (not shown). In this case, multimedia server 105 is capable of serving multimedia content to client devices 125 substantially in real time as the modulated A/V signal 130 is received. Multimedia server 105 is also capable of recording incoming multimedia content to its DVR for distribution to the client devices at a later time. Alternatively, multimedia server 105 is arranged from devices such as personal computers, media jukeboxes, audio/visual file servers, and other devices that can receive, store, and serve multimedia content over home network 127.
Transcoder module 231 transcodes the A/V channels into a format that is suitable for one or more of the client devices 125. In an illustrative example, client device 1251 is configured to host a Windows Media Player application. Windows Media Player does not include built-in MPEG-2 support. That is, Windows Media Player is not supplied by Microsoft with an MPEG-2 decoder, but rather, includes native support for its own proprietary Windows Media Video (“WMV”) formatted video streams. Additional features such as audio and video effects and new rendering types are commonly added to Windows Media Player (and the other popular media players) through the installation of “plug-ins,” each of which is a computer program that is designed to work with the media player to provide the desired feature or functionality.
Transcoder module 231 receives an MPEG-2 stream, for example on line 3051, and transcodes the received stream into a WMV formatted stream that is output on line 327 from the multimedia server 105 to the home network 127. The transcoding optionally includes security encryption or imposition of other digital rights management (“DRM”) schemes that are compliant with the Windows Media Player security feature set. The WMV formatted stream is received on line 3401 from home network 127 and stored or played by the Windows Media Player on the client device 1251.
Transcoder module 231 outputs a plurality of transcoded AV signals on lines 3121 to 312N to router module 245. Router module 245 is utilized to route the transcoded AV signals 312 to the client devices 125. In some applications, such routing is performed in response to a request to the multimedia server 105 by a client device 125 to receive multimedia content. For example, a user of a client device 125 such as a PC wishing to view programming content typically interacts with a menu application running on client device 125. The menu application enables a user to browse and select programming content that is available to be served by multimedia server 105 to thereby initiate a multimedia viewing event on the client device 125. Typically, the menu is implemented with common electronic programming guide (“EPG”) features using a standalone software application. Alternatively, the menu is implemented using HTML (Hypertext Markup Language) code readable by a web browsing application.
Router module 245, in an illustrative example, encapsulates the transcoded A/V signals 312 in an IP layer in an output stream 327 using an IP datagram addressing methodology in which the destination IP address is the IP address of the requesting client device 125. Alternatively, router module 245 uses an IEEE-1394 compliant delivery protocol where transmission of the transcoded A/V signals 312 to the client devices 125 is performed isochronously. In this case, either IEEE EUI-64 (extended unique identifier) 64 bit addressing or IEEE 802.11 48 bit addressing is usable depending upon the requirements of a specific application. The transcoded A/V signals 312 are delivered over network 127 to the client devices 125 on lines 340, as shown.
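The IP datagram addressing performed by the router module can be sketched as follows. The port number and helper names are illustrative choices, not taken from the description; the point is that each transcoded chunk is paired with the destination IP address of the requesting client device before it is placed on the network.

```python
import socket

def encapsulate(chunk: bytes, client_ip: str, port: int = 5004):
    """Pair a transcoded A/V chunk with the requesting client's address,
    mirroring the datagram addressing of router module 245. The port
    number here is an assumption for illustration only."""
    return ((client_ip, port), chunk)

def route(chunks, client_ip: str, port: int = 5004) -> int:
    """Send each chunk to the client as a UDP datagram and return the
    total number of payload bytes handed to the socket."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sent = 0
    for chunk in chunks:
        addr, payload = encapsulate(chunk, client_ip, port)
        sent += sock.sendto(payload, addr)
    sock.close()
    return sent
```

An IEEE-1394 isochronous delivery path would replace the UDP socket with a bus transaction but keep the same pairing of stream data with a per-client address.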
As noted above, the closed captioning is encoded in A/V signal 130 using standard closed captioning encoding techniques and in particular, CEA-708-B (and/or CEA-608-B). The extracted closed captioning data is output from the tuner/demodulators 207 to the closed captioning module 222 on respective lines 4051 to 405N.
Closed captioning module 222 transcodes the extracted closed captioning data into a format that is suitable for one or more of the client devices 125. For example, as with the illustrative example described above in the text accompanying
Closed captioning module 222 outputs a plurality of transcoded closed captioning data signals on lines 4121 to 412N to router module 245. In a similar manner for routing the A/V signals 312 (
A transcoded A/V signal 340 is received by the client device 125 which typically contains programming content such as a television show or movie. As noted above, the transcoded A/V signal is formatted to be usable by the media player hosted by the client device 125. For example, if media player 605 is arranged as a Windows Media Player, then the transcoded A/V signal 340 is formatted as a WMV compliant signal (i.e., a native format for Windows Media Player) which is either streamed or served from multimedia server 105 (
The transcoded A/V signal 340 is buffered or stored in memory 619. The transcoded closed captioning 440 that is associated with the television programming in the transcoded A/V signal 340 is also buffered or stored in memory 619. The transcoded A/V signal 340 and closed captioning 440 are read from memory 619 by media player 605. Media player 605 supports several processes including A/V processing 625 and closed caption processing 631. A/V processing 625 includes decoding and decrypting, as appropriate, the transcoded A/V signal 340 and outputting a corresponding video output signal to the video display processor 610 for presentation on the display 617. Closed captioning processing 631 includes parsing the transcoded closed captioning 440, synchronizing the closed captioning with the A/V signal 340 and then outputting the closed captions to the video display processor 610 so that they are rendered on the display 617.
Transcoder 231 (
SAMI is a file format developed by Microsoft that is designed to deliver synchronized text such as captions, subtitles, or audio descriptions with digital media content. The Windows Media Player includes native support for SAMI and captioning delivered in SAMI may be rendered directly by the player without the necessity for any additional plug-ins or software.
SAMI files are plaintext files that have a .smi or .sami file name extension. They contain the text strings used for synchronized closed captions, subtitles, and audio descriptions. They also specify the timing parameters used by the Windows Media Player to synchronize closed caption text with the audio portion of the programming content. When a media file reaches a time designated in the SAMI file, the captioning text changes accordingly in the closed caption display area in the media player.
SAMI and HTML share common elements, such as the <HEAD> and <BODY> tags. As in HTML, tags used in SAMI files must always be used in pairs. For example, a BODY element begins with a <BODY> tag and must always end with a </BODY> tag. A basic SAMI file requires three fundamental tags: <SAMI>, <HEAD>, and <BODY>. The <SAMI> tag identifies the document as a SAMI document so other applications can recognize its file format. Between the <HEAD> and </HEAD> tags, basic guidelines and other format information for the SAMI file, such as the document title, general information, and style properties for closed captions are defined. Like HTML, content declared within the HEAD element does not display as output. Elements and attributes defined between the <BODY> and </BODY> tags display content seen by the user. In SAMI, the BODY element contains the parameters for synchronization and the text strings used for closed captions. Defined within the HEAD element, the STYLE element provides for added functionality in SAMI. Between the <STYLE> and </STYLE> tags, several Cascading Style Sheet (“CSS”) selectors for style and layout may be defined. Style properties such as fonts, sizes, and alignments can be customized to provide a rich user experience while also promoting accessibility. For example, defining a large text font style class can improve the readability for users who have difficulty reading small text.
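Putting the pieces together, a minimal SAMI document of the kind the closed captioning module could emit might be generated as sketched below. The style class name, font rules, and timings are illustrative choices, not taken from the description.

```python
def build_sami(captions, lang_class="ENUSCC"):
    """Build a minimal SAMI document from (start_ms, text) pairs.

    Each SYNC element's Start attribute gives the time, in milliseconds,
    at which its caption replaces the previous one. The class name and
    STYLE rules here are illustrative.
    """
    syncs = "\n".join(
        f"<SYNC Start={ms}><P Class={lang_class}>{text}</P></SYNC>"
        for ms, text in captions
    )
    return f"""<SAMI>
<HEAD>
<TITLE>Transcoded captions</TITLE>
<STYLE TYPE="text/css"><!--
P {{ font-family: Arial; font-size: 14pt; text-align: center; }}
.{lang_class} {{ Name: English; lang: en-US; }}
--></STYLE>
</HEAD>
<BODY>
{syncs}
</BODY>
</SAMI>"""
```

In practice the Windows Media Player is pointed at both the WMV stream and the SAMI file (for example, through a metafile) so that it renders the two together.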
Reference numeral 904 indicates that the “ref” element has an “href” attribute value that refers to the location of the media file. In this illustrative example, the location is a Windows media server disposed in multimedia server 105 (
The transcoder 231 (
RealText is a file format developed by RealNetworks that is designed to deliver synchronized text such as captions, subtitles, or audio descriptions with digital media content. RealPlayer includes native support for RealText, and captioning delivered in RealText may be rendered directly by the player without the necessity for any additional plug-ins or software.
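A RealText stream built from caption text and timings might be generated as in the sketch below. The window attributes, timing format, and tag usage here are illustrative; RealNetworks' RealText documentation defines the full element set.

```python
def build_realtext(captions, duration_s=30):
    """Sketch a RealText stream from (start_seconds, text) pairs.

    RealText wraps timed text in a <window> root element; each <time>
    tag advances the clock for the text that follows, and <clear/>
    erases the previous caption. Attribute values are illustrative.
    """
    body = "".join(
        f'<time begin="{start}"/><clear/>{text}<br/>\n'
        for start, text in captions
    )
    return (
        f'<window type="generic" duration="{duration_s}" '
        f'width="320" height="60">\n{body}</window>'
    )
```

The resulting file would be referenced from a metafile alongside the transcoded RealMedia video so that RealPlayer plays the two in parallel.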
RealText files are plaintext files, based on XML, that have a structure similar to HTML and may use HTML tags. Like the SAMI file shown in
Reference numerals 1304 and 1306, respectively, show the “video src” value which indicates the location of the RealMedia Video file 1102 and the “text stream src” value which indicates the location of the RealText file 1113. In this illustrative example, the location is a media server disposed in multimedia server 105 (
Reference numeral 1312 in
The transcoder 231 (
The plaintext file 1413 contains information about what captions will display, when they will display, and what they will look like to thereby deliver synchronized text such as captions, subtitles, or audio descriptions with the programming content contained in the QuickTime Movie File 1402. When the plaintext file 1413 is supplied with the SMIL metafile 1418, the closed captioning included therein may be rendered directly by the Apple QuickTime player without the necessity for any additional plug-ins or software.
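The parallel playback arrangement can be illustrated with a small SMIL metafile generator. The region layout, dimensions, and URLs below are illustrative; what matters is that the video and the timed-text stream are declared inside a single par element so the player keeps them synchronized.

```python
SMIL_TEMPLATE = """<smil>
  <head>
    <layout>
      <root-layout width="320" height="260"/>
      <region id="video_region" width="320" height="240"/>
      <region id="caption_region" top="240" width="320" height="20"/>
    </layout>
  </head>
  <body>
    <par>
      <video src="{video_url}" region="video_region"/>
      <textstream src="{text_url}" region="caption_region"/>
    </par>
  </body>
</smil>"""

def build_smil(video_url: str, text_url: str) -> str:
    # The <par> element plays the movie and the timed-text stream in
    # parallel, which is what keeps the captions synchronized.
    return SMIL_TEMPLATE.format(video_url=video_url, text_url=text_url)
```

The caption region sits below the video region so the rendered text does not overlay the picture; placing it on top of the video is equally valid SMIL.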
Reference numerals 1604 and 1606, respectively, show the “video src” value which indicates the location of the QuickTime Movie File 1402 and “textstream src” value which indicates the location of the plaintext closed captioning file 1413. In this illustrative example, the location is a QuickTime media server disposed in multimedia server 105 (
A transcoded A/V signal 340 is received by the client device 125 which typically contains programming content such as a television show or movie. As noted above, the transcoded A/V signal is formatted to be usable by the media player hosted by the client device 125. For example, the transcoded A/V signal is coded in HTML with embedded video content that is either streamed or served from multimedia server 105 (
The transcoded A/V signal 340 is buffered or stored in memory 1719. The transcoded closed captioning 440 that is associated with the television programming in transcoded A/V signal 340 is also buffered or stored in memory 1719. The transcoded A/V signal 340 and closed captioning 440 are read from memory 1719 by web browser 1705. Web browser 1705 supports several processes including A/V processing 1725 and RSS (Really Simple Syndication) reader 1731. A/V processing 1725 includes decoding and decrypting the transcoded A/V signal 340, as appropriate, and then outputting a corresponding video output signal to video display processor 1710 for display on display 1717. A/V processing 1725 is generally implemented using a media player plug-in to web browser 1705. Such plug-ins are supplied by the major media player providers including Microsoft, RealNetworks, and Apple with similar features and functions as the standalone media players described above.
RSS is a file format based on XML and is commonly used as a web feed format. RSS readers are often implemented as standalone programs or incorporated into standard web browsers as a plug-in. Accordingly, in this illustrative example, transcoded closed captioning 440 is coded in XML to include the closed captions, timing, and style information. RSS reader 1731 includes functionality for parsing the transcoded closed captioning 440, synchronizing the closed captioning with the A/V signal 340 and then outputting the closed captions to the video display processor 1710 so that they are rendered on the display 1717.
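Since no particular XML schema is fixed by the description, the sketch below parses a hypothetical caption feed of the kind an RSS-reader plug-in could poll and render; every element and attribute name here is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical caption feed: element and attribute names are assumptions.
CAPTION_XML = """<captions>
  <caption start="1.0" end="3.5" style="default">Hello</caption>
  <caption start="4.0" end="6.0" style="large">[door slams]</caption>
</captions>"""

def parse_captions(xml_text: str):
    """Reduce an XML caption feed to (start, end, text) tuples that a
    reader process can compare against the playback clock to decide
    which caption to hand to the video display processor."""
    root = ET.fromstring(xml_text)
    return [
        (float(c.get("start")), float(c.get("end")), c.text)
        for c in root.findall("caption")
    ]
```

The style attribute would carry the font and layout hints mentioned above; it is parsed here but left to the renderer.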
The transcoder 231 (
The closed captioning module 222 (
Both the transcoded video file 1802 and closed captioning file 1813 are embedded or otherwise linked, in this illustrative example, in an HTML file 1818 that is served to the client device 125. In alternative arrangements, the process of embedding the video and closed captioning files into the HTML file 1818 is performed by transcoder module 231 (
A Java applet 1809 is also embedded in the HTML file 1818 that is served by multimedia server 105 to client device 125 over home network 127. Java applet 1809 is arranged as a single file (having a .class extension) or as a plurality of files packaged in an archive (having a .jar extension). Java applet 1809 is executable code that is transferred in the HTML file 1818 and run by web browser 1705 using a Java Virtual Machine plug-in. The Java Virtual Machine provides the environment that runs programs (i.e., applets) written in the Java language. Java applet 1809 provides a programmatic structure for the web browser 1705 to render the closed captioning file 1813. In particular, Java applet 1809 renders closed captioning synchronously with media content contained in the video file 1802 in a captioning region that is defined in the HTML file 1818 using captioning, style and timing data included in closed captioning file 1813. Java applet 1809 thus provides an alternative to SMIL when web browser 1705 does not support SMIL or has a media player plug-in that does not support SMIL.
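An HTML page of the kind described, embedding the video, the caption-rendering applet, and a pointer to the caption file, might be generated as sketched below. The applet class name, archive name, and parameter names are hypothetical; the description only requires that the applet be told where the caption data lives.

```python
def build_caption_page(video_src: str, caption_src: str,
                       applet_archive: str = "captions.jar") -> str:
    """Sketch an HTML page embedding a video and a caption applet.

    "CaptionRenderer.class" and the "captionFile" parameter name are
    hypothetical stand-ins for the applet described in the text.
    """
    return f"""<html>
<body>
<embed src="{video_src}" width="320" height="240"></embed>
<applet code="CaptionRenderer.class" archive="{applet_archive}"
        width="320" height="40">
  <param name="captionFile" value="{caption_src}"/>
</applet>
</body>
</html>"""
```

Serving one self-contained page this way is what lets the user view captioned video without launching a separate media player application.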
By embedding the video and closed captioning in an HTML file, the user is allowed to access the content without requiring another application to be opened which may be advantageous in some applications of closed captioning distribution over a home network. The embedding is performed using, for example, conventional HTML tags including <applet>, <object>, or <embed> tags which contain elements and attributes required to identify and locate the video file 1802 and closed captioning file 1813. Accordingly, the HTML file 1818 functions, in this illustrative example, in a similar manner as the ASX metafile 718 (
HTML file 1818 comprising the video file 1802, Java applet 1809 and closed captioning file 1813 is received by the client device 125. As described above, video file 1802 is generated during a transcoding process and typically includes programming content such as a television show or movie. Closed captioning file 1813 includes closed captioning that is associated with the programming content.
HTML file 1818 is buffered or stored in memory 1919. HTML 1818 file is read from memory 1919 by web browser 1905. Web browser 1905 supports several processes including A/V processing 1925 and applet and closed captioning processing 1931. A/V processing 1925 includes decoding and decrypting, as appropriate, the video file 1802 and outputting a corresponding video output signal to video display processor 1910 for display on display 1917. A/V processing 1925 is implemented using the media player plug-in for web browser 1905. Applet and closed captioning processing 1931 comprises executing the Java applet 1809, synchronizing the closed captioning with video file 1802 and then outputting the closed captions to the video display processor 1910 so that they are rendered on the display 1917.
PC 1251 also hosts a user interface to enable a user to browse, select and then play media content and associated closed captioning that is served from or stored on multimedia server 105. Such user interface is configured, in this illustrative example, using an EPG-like interface that enables media content to be selected, accessed and controlled. That is, the user interacts with PC 1251 to select and view media content and closed captioning as if the content and closed captioning were delivered directly to the PC 1251 and in the proper format. The transcoding of the media content and closed captioning performed at the multimedia server 105 is thus transparent to the user. The user interface is alternatively arranged as a standalone application, or more typically built into the media player or HTML pages displayed by the web browser.
A portable media player 1252 is coupled via cable 2010 to a port 2012 disposed in PC 1251. Port 2012 is arranged as a USB (Universal Serial Bus) or IEEE-1394 (sometimes referred to as a “FireWire”) port, for example, and enables portable media player 1252 to download content from home network 127 using PC 1251. PC 1251 typically is arranged to run a media content interface application to manage media content on portable media player 1252.
Portable media player 1252 is arranged to play a variety of multimedia including music, pictures, and video. Many portable media players include a media player with native support for MPEG-4 formatted video (having .mp4, .m4v, or .mp4v file extensions). As shown, portable media player 1252 displays programming content and synchronous captioning (as illustratively depicted in
Thin client STB 1253 is coupled to a television 2011 and to home network 127. As shown, STB 1253 displays programming content and synchronous captioning (as illustratively depicted in
Laptop computer 1254 is also coupled to home network 127 and typically hosts either a standalone media player or web browser, or both applications (such as media player 605 in
A wireless access point 2025 is coupled to home network 127. Wireless access point 2025 is arranged, in this illustrative example, as a Wi-Fi access point that utilizes a wireless communications protocol in accordance with IEEE 802.11x. Wireless access point 2025 enables portable electronic devices such as mobile phones, PDAs, handheld games, music players and the like, to communicate over home network 127 and receive media content from sources such as multimedia server 105.
Mobile phone 1255 is in operative communication with wireless access point 2025 to receive media content from multimedia server 105. Mobile phones commonly are configured to play a variety of multimedia types including music and video. Native video formats include MPEG-4 or the 3GP format defined by 3GPP, the 3rd Generation Partnership Project (and having a .3gp or .3g2 file extension). As shown, mobile phone 1255 displays programming content and synchronous captioning (as illustratively depicted in
A handheld game console 1256 is in operative communication with wireless access point 2025 to receive media content from multimedia server 105. Handheld game console 1256 is representative of the variety of lightweight, portable electronic machines for playing video games that are available. Such devices often include features beyond gaming such as an ability to play music and video and browse the Internet. Native video formats typically include MPEG-4, while some handheld game consoles also support DivX created by DivX Inc. (and having a .divx file extension). As shown, handheld game console 1256 displays programming content and synchronous captioning (as illustratively depicted in
At block 2121, multimedia server 105 ascertains the capabilities of client devices 125 on the network 127. In an illustrative example, capabilities are ascertained through a discovery process utilizing a command control communication protocol where each client device 125, upon connection to home network 127, publishes its capabilities to control points in the home network 127, including multimedia server 105. The description of capabilities may be formatted in any of a variety of conventional formats, for example, XML using the SOAP (Simple Object Access Protocol) or other similar protocols. Such client device capabilities include identification of the video format(s) that the client device 125 supports, a list of installed video codecs, and/or other optional data that describes the video rendering and display capabilities of the client device, including for example, display window size, resolution, and color depth. Such information enables multimedia server 105 to advantageously tailor the transcoded A/V and closed captioning to meet the specific characteristics of the media player in the client device. After multimedia server 105 receives a description of the capabilities of a client device 125, controller 215 (
A first alternative to discovery is through multimedia server 105 affirmatively querying a client device 125 to ascertain its capabilities. For example, multimedia server 105 may be arranged to periodically poll client devices 125 that connect to home network 127. A second alternative is for the capabilities description of the client device 125 to be transmitted to the multimedia server 105 during the initiation of a multimedia viewing event as described above in the text accompanying
At block 2125, the closed captioning extracted from the media stream is transcoded by closed captioning module 222 into a supported format for client device 125 responsively to the instructions of controller 215. At block 2133, the A/V programming selected by the user is transcoded by transcoder 231 into a supported video format for client device 125 responsively to the instructions of controller 215. At block 2135, the transcoded closed captioning and A/V programming is transmitted from multimedia server 105 over home network 127 to client device 125. The method ends at block 2140.
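The method of blocks 2121 through 2135 can be summarized in a short sketch: parse a published capability description, choose target formats from it, and label the transcoded outputs. The capability XML schema and all helper names are illustrative stand-ins for the controller, transcoder, and closed captioning modules, not details from the description.

```python
import xml.etree.ElementTree as ET

# Hypothetical capability description published by a client device at
# block 2121; element names are illustrative, not from any protocol.
CAPABILITY_XML = """<device>
  <friendlyName>Living-room PC</friendlyName>
  <videoFormats>wmv,mp4</videoFormats>
  <captionFormats>sami</captionFormats>
  <display width="1280" height="720" colorDepth="24"/>
</device>"""

def parse_capabilities(xml_text: str) -> dict:
    """Reduce a capability description to the fields the server needs
    to pick transcoding targets (block 2121)."""
    root = ET.fromstring(xml_text)
    display = root.find("display")
    return {
        "video": root.findtext("videoFormats").split(","),
        "captions": root.findtext("captionFormats").split(","),
        "resolution": (int(display.get("width")), int(display.get("height"))),
    }

def serve_request(capability_xml: str, media_stream: dict) -> dict:
    """Blocks 2125 through 2135 in miniature: choose target formats,
    'transcode' the A/V and captions (stubbed here as labeling), and
    return what would be transmitted over the home network."""
    caps = parse_capabilities(capability_xml)
    return {
        "av": f"{media_stream['av']} -> {caps['video'][0]}",
        "cc": f"{media_stream['cc']} -> {caps['captions'][0]}",
    }
```

The real modules would perform the actual format conversion at this step; the sketch only shows how the published capabilities drive the choice of output formats.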
Each of the processes shown in the figures and described in the accompanying text may be implemented in a general, multi-purpose or single purpose processor. Such a processor will execute instructions, either at the assembly, compiled or machine-level, to perform that process. Those instructions can be written by one of ordinary skill in the art following the description herein and stored or transmitted on a computer readable medium. The instructions may also be created using source code or any other known computer-aided design tool. A computer readable medium may be any medium capable of carrying those instructions and includes a CD-ROM, DVD, magnetic or other optical disc, tape, silicon memory (e.g., removable, non-removable, volatile or non-volatile), and packetized or non-packetized wireline or wireless transmission signals.