1. Field of the Invention
The present invention relates to processing of multimedia contents, and more particularly, to a method of and apparatus for encoding multimedia contents and a method of and system for applying encoded multimedia contents.
2. Description of the Related Art
Moving Picture Experts Group (MPEG), which is an international standardization organization related to multimedia, has been conducting standardization of MPEG-2, MPEG-4, MPEG-7 and MPEG-21, since its first standardization of MPEG-1 in 1988. As a variety of standards have been developed in this way, a need to generate one profile by combining different standard technologies has arisen. As a step responding to this need, MPEG-A (MPEG Application: ISO/IEC 23000) multimedia application standardization activities have been carried out. Application format standardization for music contents has been performed under the name of MPEG Music Player Application Format (ISO/IEC 23000-2) and at present the standardization is in its final stage. Meanwhile, application format standardization for image contents, and photo contents in particular, has entered a fledgling stage under the name of MPEG Photo Player Application Format (ISO/IEC 23000-3).
Previously, element standards required in a single standard system were grouped as a set of function tools and made into one profile to support a predetermined application service. However, this method has a problem in that it is difficult to satisfy the variety of technological requirements of industrial fields with a single standard. In the multimedia application format (MAF), for which standardization has been newly conducted, non-MPEG standards as well as the conventional MPEG standards are combined so that the utilization value of the standard can be enhanced by actively responding to the demands of industrial fields. The major purpose of the MAF standardization is to make MPEG technologies easy to use in industrial fields. In this way, already verified standard technologies can be easily combined without any further effort to set up a separate standard for application services required in the industrial fields.
At present, a music MAF is in a final draft international standard (FDIS) state and the standardization is in an almost final stage. Accordingly, the function of an MP3 player which previously performed only a playback function can be expanded, and thus the MP3 player can automatically classify music files by genre and reproduce music files, or show the lyrics or browse album jacket photos related to music while the music is reproduced. This means that a file format in which users can receive improved music services has been prepared. In particular, the MP3 player has recently been mounted on a mobile phone, a game console (e.g., Sony's PSP), or a portable multimedia player (PMP) and has gained popularity among consumers. Therefore, a music player with enhanced functions using the MAF is expected to be commercialized soon.
Meanwhile, standardization of a photo MAF is in its fledgling stage. Like the MP3 music, photo data (in general, Joint Photographic Experts Group (JPEG) data) obtained through a digital camera has been rapidly increasing with the steady growth of the digital camera market. As media (memory cards) for storing photo data have been evolving toward a smaller size and higher integration, hundreds of photos can be stored in one memory card now. However, in proportion to the increasing amount of the photos, the difficulties that users are experiencing have also been increasing.
In recent years, the MPEG has standardized element technologies required for content-based retrieval and/or indexing as descriptors and description schemes under the name of MPEG-7. A descriptor defines a method of extracting and expressing content-based feature values, such as texture, shape, and motions of an image, and a description scheme defines the relations between two or more descriptors and a description scheme in order to model digital contents, and defines how to express data. Though the usefulness of MPEG-7 has been proved through a great deal of research, the lack of an appropriate application format has prevented utilization of the MPEG-7 in the industrial fields. In order to solve this problem, the photo MAF aims to standardize a new application format which combines photo digital contents and related metadata in one file.
Also, the MPEG is standardizing a multimedia integration framework under the name of MPEG-21. That is, in order to solve potential problems, including compatibility among content expression methods, methods of network transmission, and compatibility among terminals, caused by individual fundamental structures for transmission and use of multimedia contents and individual management systems, the MPEG is suggesting a new standard enabling transparent access, use, process, and reuse of multimedia contents through a variety of networks and devices. The MPEG-21 includes declaration, adaptation, and processing of digital items (multimedia contents+metadata). However, the problem of how to interoperate the technologies of the MPEG-7 and MPEG-21 with the MAF has yet to be solved.
Additional aspects, features, and/or advantages of the invention will be set forth in part in the description which follows and, in part, will be apparent from the description, or may be learned by practice of the invention.
The present invention provides a method and apparatus for encoding multimedia contents in which in order to allow a user to effectively browse or share photos, photo data, visual feature information obtained from the contents of photo images, and a variety of hint feature information for effective indexing of photos are used as metadata and encoded into a multimedia application format (MAF) file.
The present invention also provides a method and system for applying encoded multimedia contents, in which an MAF file is processed in order to allow a user to browse or share the MAF file.
According to an aspect of the present invention, there is provided a method of encoding multimedia contents, comprising: separating media data and metadata from multimedia contents; creating metadata complying with a predetermined standard format by using the separated metadata; and encoding the media data and the metadata complying with the standard format, and thus creating a multimedia application format (MAF) file including a header containing information indicating a location of the media data, the metadata and the media data, wherein the metadata complying with the standard format includes media player metadata.
According to another aspect of the present invention, there is provided an apparatus for encoding multimedia contents, comprising: a pre-processing unit separating media data and metadata from multimedia contents; a metadata creation unit creating metadata complying with a predetermined standard format by using the separated metadata; and an encoding unit encoding the media data and the metadata complying with the standard format, and thus creating an MAF file including a header containing information indicating a location of the media data, the metadata and the media data, wherein the metadata complying with the standard format includes media player metadata.
According to another aspect of the present invention, there is provided a method of applying multimedia contents comprising: storing in a database, an MAF file, including at least one single track which includes a header containing information indicating a location of media data, media data complying with a predetermined standard format, and media metadata, and application data indicating information on an application method of the media; and browsing or sharing the MAF file stored in the database, wherein the media data complying with a predetermined standard format is at least one of media player metadata or media album metadata.
According to another aspect of the present invention, there is provided a system for applying multimedia contents, comprising: a database storing an MAF file, including at least one single track which includes a header containing information indicating a location of media data, media data, and media metadata, and application data indicating information on an application method of the media; and an application unit browsing or sharing the MAF file stored in the database, wherein media data complying with the standard format is at least one of media player metadata and media album metadata.
According to still another aspect of the present invention, there is provided a computer readable recording medium having embodied thereon a computer program for executing the methods.
These and/or other aspects, features, and advantages of the invention will become apparent and more readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
Reference will now be made in detail to exemplary embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the like elements throughout. Exemplary embodiments are described below to explain the present invention by referring to the figures.
Referring to
The pre-processing unit 130 creates media data and basic metadata of the media content from the input media content. At this time, media content may be provided from the media acquisition unit 110 or may be input from the outside other than the media acquisition unit 110. By parsing exchangeable image file format (Exif) metadata included in the media content or decoding JPEG images, the pre-processing unit 130 extracts information required to generate basic metadata of the media content, and by using the extracted information, the pre-processing unit 130 creates the basic metadata of the media content. The basic metadata includes metadata which is described when each media content is obtained or created. Examples of the basic metadata may include Exif metadata for a JPEG photo file, ID3 metadata of an MP3 music file, and compression related metadata of an MPEG video file, but the basic metadata is not limited to these examples. The media data and basic metadata created in the pre-processing unit 130 are provided to the media metadata creation unit 150.
The metadata creation unit 150 creates media metadata required for forming a single integrated MAF file from a large amount of media data. According to an embodiment, the media metadata creation unit 150 creates media metadata complying with a predetermined standard, by using basic metadata provided from the pre-processing unit 130. According to another embodiment, the media metadata creation unit 150 extracts and creates basic metadata directly from the input media content, by using an MPEG-based standardized description tool, and by using the created basic metadata, creates media metadata complying with a standard. When media metadata is created complying with a standardized format and structure, MPEG-7 and MPEG-21 may be used, but the embodiment is not limited to these.
The encoding unit 170 encodes media metadata provided from the metadata creation unit 150 together with media data, and creates a single integrated MAF file 190 as the result of the encoding.
The application method data creation unit 180 creates data on an application method of an MAF file, and provides the created application method data to the encoding unit 170.
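As a non-limiting illustration, the flow from the pre-processing unit 130 through the metadata creation unit 150 to the encoding unit 170 may be sketched as follows. The container layout (a fixed 16-byte header followed by JSON metadata), the class `MediaContent`, and all function names are assumptions made only for this sketch; an actual MAF file would follow the ISO base media file format and carry MPEG-7 or MPEG-21 metadata.

```python
import json
import struct
from dataclasses import dataclass


@dataclass
class MediaContent:
    """Hypothetical input: raw media bytes plus basic metadata (e.g., Exif-like fields)."""
    media_data: bytes
    basic_metadata: dict


def preprocess(content):
    """Separate media data from basic metadata (role of the pre-processing unit 130)."""
    return content.media_data, dict(content.basic_metadata)


def create_media_metadata(basic_metadata):
    """Map basic metadata into one fixed schema (stand-in for MPEG-7/MPEG-21 metadata)."""
    return {
        "creation": {
            "date": basic_metadata.get("DateTime"),
            "device": basic_metadata.get("Model"),
        }
    }


def encode(media_data, media_metadata, application_data):
    """Pack metadata, application method data, and media data into one byte string.

    The header stores the location (offset and size) of the media data.
    """
    meta_bytes = json.dumps(media_metadata, sort_keys=True).encode("utf-8")
    app_bytes = json.dumps(application_data, sort_keys=True).encode("utf-8")
    header_size = 16  # four unsigned 32-bit big-endian fields
    media_offset = header_size + len(meta_bytes) + len(app_bytes)
    header = struct.pack(
        ">IIII", len(meta_bytes), len(app_bytes), media_offset, len(media_data)
    )
    return header + meta_bytes + app_bytes + media_data


def read_media(maf_bytes):
    """Locate the media data using only the header fields."""
    _, _, offset, size = struct.unpack(">IIII", maf_bytes[:16])
    return maf_bytes[offset:offset + size]


def read_metadata(maf_bytes):
    """Recover the media metadata stored right after the header."""
    meta_size = struct.unpack(">I", maf_bytes[:4])[0]
    return json.loads(maf_bytes[16:16 + meta_size].decode("utf-8"))
```

In this sketch a reader needs only the header to find the media data, which mirrors the role of the header containing information indicating a location of the media data.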
Referring to
The content-based features item 3100 includes an MPEG-7 visual descriptor 3110 that is metadata for visual feature information, such as the color, texture, and shape of photo content, and an MPEG-7 audio descriptor 3120 that is metadata for audio feature information, such as voice or music related to a photo.
The photo collection information item 3200 is an item describing information on photos belonging to an identical event, an identical person, or an identical category. In an embodiment, photo collection information may be expressed using MPEG-7 multimedia description scheme (MDS) 3210 or MPEG-21 digital item declaration (DID) 3220. However, the method of expressing the photo collection information of a photo album is not limited to the MPEG-7 MDS 3210 and the MPEG-21 DID 3220. Basically, the MPEG-7 MDS 3210 includes metadata of creation information 3211, metadata of semantic information 3212, and metadata of content organization information 3213 of media content. However, the MPEG-7 MDS 3210 applied to the present invention is not limited to those metadata, and can include other metadata included in the suggested MPEG-7 MDS.
The photo processing information item 3300 is an item describing information required in the process of browsing or sharing photos based on media metadata. For this, a procedure to display a plurality of photos on a screen based on metadata is described using an MPEG-4 scene description 3310, a procedure to display a plurality of photos on a screen based on media metadata is described using an MPEG-21 digital item processing (DIP) 3320, or information to adaptively transform a multimedia application format file for a photo album with respect to the performance of a terminal or a network, is described by using an MPEG-21 digital item adaptation (DIA) 3330.
The item 3400 indicating a user right over a photo album is an item by which an owner of an MAF file for a photo album encrypts the MAF file for the photo album and controls access by others to the photo album. The item 3400 includes MPEG-21 intellectual property management and protection (IPMP) 3410, an item (view permission) 3420 to control browsing of the MAF file for the photo album by using other right expression methods, an item (print permission) 3430 to control printing of the MAF file for the photo album, and an item (editing permission) 3440 to control editing of the MAF file for the photo album. However, the item 3400 indicating a user right is not limited to these items.
The albuming hint item 3500 includes a hint item (perception hints) 3510 to express perceptional characteristics of a human being in relation to the contents of a photo, a hint item (acquisition hints) 3520 to express camera information and photographing information when a photo is taken, a hint item (view hints) 3540 to express view information of a photo, a hint item (subject hints) 3550 to express information on persons included in a photo, and a hint item (popularity) 3560 to express popularity information of a photo.
Referring to
The item (avgColorfulness) 3511 indicating the colorfulness of the color tone expression of a photo can be measured after normalizing the histogram heights of each RGB color value and the distribution value of the entire color values from a color histogram, or by using the distribution value of a color measured using a CIE L*u*v* color space. However, the method of measuring the item 3511 indicating the colorfulness is not limited to these methods.
The item (avgColorCoherence) 3512 indicating the color coherence of the entire color tone appearing in a photo can be measured by using a dominant color descriptor among the MPEG-7 visual descriptors, and can be measured by normalizing the histogram heights of each color value and the distribution value of the entire color values from a color histogram. However, the method of measuring the item 3512 indicating the color coherence of the entire color tone appearing in a photo is not limited to these methods.
The item (avgLevelOfDetail) 3513 indicating the detailedness of the contents of a photo can be measured by using an entropy measured from the pixel information of the photo, or by using an isopreference curve that is an element for determining the actual complexity of a photo, or by using a relative measurement method in which compression ratios are compared when compressions are performed under identical conditions, including the same image sizes, and quantization steps. However, the method of measuring the item 3513 indicating the detailedness of contents of a photo is not limited to these methods.
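As a non-limiting illustration of the entropy-based option, the following sketch computes the Shannon entropy of the gray-level histogram of a photo and scales it to the range of 0 to 1. The function name `level_of_detail`, the use of 8-bit gray values, and the division by 8 bits are assumptions of this sketch and are not prescribed by the description.

```python
import math
from collections import Counter


def level_of_detail(gray_pixels):
    """Return the histogram entropy of 8-bit gray values, scaled to [0, 1].

    `gray_pixels` is a flat sequence of integers in 0..255.
    """
    total = len(gray_pixels)
    if total == 0:
        return 0.0
    counts = Counter(gray_pixels)
    # Shannon entropy in bits: sum over gray levels of -p * log2(p).
    entropy = sum(-(c / total) * math.log2(c / total) for c in counts.values())
    return entropy / 8.0  # 8 bits is the maximum entropy of 256 gray levels
```

A flat image yields 0, and an image using all 256 gray levels equally often yields 1.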
The item (avgHomogenity) 3514 indicating the homogeneity of texture information of the contents of a photo can be measured by using the regularity, direction and scale of texture from feature values of a texture browsing descriptor among the MPEG-7 visual descriptors. However, the method of measuring the item 3514 indicating the homogeneity of texture information of the contents of a photo is not limited to this method.
The item (avgPowerOfEdge) 3515 indicating the robustness of edge information of the contents of a photo can be measured by extracting edge information from a photo and normalizing the extracted edge power. However, the method of measuring the item 3515 indicating the robustness of edge information of the contents of a photo is not limited to this method.
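As a non-limiting illustration, edge power may be approximated by the mean gradient magnitude of a gray image, normalized by the largest magnitude that 8-bit data allows. The forward-difference gradient, the function name `edge_power`, and the normalization constant are assumptions of this sketch; other edge extractors could equally be used.

```python
import math


def edge_power(gray):
    """Return the mean gradient magnitude of a 2-D gray image, scaled to [0, 1].

    `gray` is a list of equally long rows of 8-bit values (0..255).
    """
    height = len(gray)
    width = len(gray[0]) if height else 0
    if height < 2 or width < 2:
        return 0.0  # no neighbors to differentiate against

    max_magnitude = 255.0 * math.sqrt(2.0)  # both differences at full scale
    total = 0.0
    for y in range(height - 1):
        for x in range(width - 1):
            dx = gray[y][x + 1] - gray[y][x]  # horizontal forward difference
            dy = gray[y + 1][x] - gray[y][x]  # vertical forward difference
            total += math.hypot(dx, dy)
    count = (height - 1) * (width - 1)
    return total / (count * max_magnitude)
```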
The item (avgDepthOfField) 3516 indicating the depth of the focus of a camera in relation to the contents of a photo can be measured generally by using the focal length and diameter of a camera lens, and an iris number. However, the method of measuring the item 3516 indicating the depth of the focus of a camera in relation to the contents of a photo is not limited to this method.
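As a non-limiting illustration, the iris number (f-number) follows from the focal length and the lens aperture diameter, and the near and far limits of acceptable focus can then be estimated with the commonly used thin-lens approximation. The subject distance and the circle of confusion (`coc_mm`, 0.03 mm here) are additional inputs assumed for this sketch; they are not mentioned in the description.

```python
import math


def f_number(focal_length_mm, aperture_diameter_mm):
    """Iris number N = focal length / aperture diameter."""
    return focal_length_mm / aperture_diameter_mm


def depth_of_field(focal_length_mm, aperture_diameter_mm,
                   subject_distance_mm, coc_mm=0.03):
    """Return (near, far) limits of acceptable focus in millimetres.

    Uses the hyperfocal distance H = f^2 / (N * c) + f.
    """
    n = f_number(focal_length_mm, aperture_diameter_mm)
    f = focal_length_mm
    s = subject_distance_mm
    hyperfocal = f * f / (n * coc_mm) + f
    near = s * (hyperfocal - f) / (hyperfocal + s - 2.0 * f)
    if s >= hyperfocal:
        return near, math.inf  # everything beyond `near` is acceptably sharp
    far = s * (hyperfocal - f) / (hyperfocal - s)
    return near, far
```

For example, a 50 mm lens with a 6.25 mm aperture (N = 8) focused at 3 m gives a zone of acceptable focus of roughly 2.3 m to 4.2 m under these assumptions.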
The item (avgBlurrness) 3517 indicating the blurriness of a photo caused by shaking of a camera generally due to a slow shutter speed can be measured by using the edge power of the contents of the photo. However, the method of measuring the item 3517 indicating the blurriness of a photo caused by shaking of a camera due to a slow shutter speed is not limited to this method.
The item (avgGlareness) 3518 indicating the degree that the contents of a photo are affected by a very bright external light source is a value indicating a case where a light source having a greater amount of light than a threshold value is photographed in a part of a photo or in the entire photo, that is, a case of excessive exposure, and can be measured by using the brightness of the pixel value of the photo. However, the method of measuring the item 3518 indicating the degree that the contents of a photo are affected by a very bright external light source is not limited to this method.
The item (avgBrightness) 3519 indicating information on the brightness of an entire photo can be measured by using the brightness of the pixel value of the photo. However, the method of measuring the item 3519 indicating information on the brightness of an entire photo is not limited to this method.
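As a non-limiting illustration, both the brightness item 3519 and the glareness item 3518 may be derived from pixel brightness. In the sketch below, brightness is the mean luma of the photo and glareness is the fraction of pixels whose luma exceeds a threshold. The BT.601 luma weights, the threshold value of 240, and the function names are assumptions of this sketch.

```python
def luma(r, g, b):
    """Approximate perceived brightness of one 8-bit RGB pixel (BT.601 weights)."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def avg_brightness(pixels):
    """Mean luma of all pixels, scaled to [0, 1]."""
    if not pixels:
        return 0.0
    return sum(luma(r, g, b) for r, g, b in pixels) / (255.0 * len(pixels))


def glareness(pixels, threshold=240.0):
    """Fraction of pixels brighter than `threshold` (an assumed over-exposure limit)."""
    if not pixels:
        return 0.0
    bright = sum(1 for r, g, b in pixels if luma(r, g, b) > threshold)
    return bright / len(pixels)
```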
Referring to
The above information exists in Exif metadata, and can be used effectively for albuming of photos. If photo data includes Exif metadata, more information can be used. However, since photo data may not include Exif metadata, the important metadata is described as photo albuming hints. The description structure of the photo acquisition hint item 3520 includes the information items described above, but is not limited to these items.
Referring to
Referring to
The item 3552 indicating the position information of the face and clothes of each person included in a photo includes an ID (PersonID) 3553, the face position (facePosition) 3554, and the position of clothes (clothPosition) 3555 of the person.
The item 3556 indicating the relationship between persons included in a photo includes IDs (PersonID1, PersonID2) 3557 and 3558 indicating two persons, and an item (relation) 3559 describing the relationship between the two persons in an arbitrary format.
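As a non-limiting illustration, the position item 3552 and the relationship item 3556 may be modeled as the following data structures. The field names follow the item names given above; the bounding-box representation of a position and the helper `relations_of` are assumptions of this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# (x, y, width, height) in pixels; this box format is an assumption.
Box = Tuple[int, int, int, int]


@dataclass
class PersonPosition:
    person_id: str       # PersonID 3553
    face_position: Box   # facePosition 3554
    cloth_position: Box  # clothPosition 3555


@dataclass
class PersonRelation:
    person_id1: str  # PersonID1 3557
    person_id2: str  # PersonID2 3558
    relation: str    # relation 3559, free-form text


@dataclass
class SubjectHints:
    positions: List[PersonPosition] = field(default_factory=list)  # item 3552
    relations: List[PersonRelation] = field(default_factory=list)  # item 3556

    def relations_of(self, person_id):
        """Return every relation in which the given person takes part."""
        return [
            r for r in self.relations
            if person_id in (r.person_id1, r.person_id2)
        ]
```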
The following table 1 shows description structures, which express hint items required for photo albuming among hint items required for effective multimedia albuming, expressed in an extensible markup language (XML) format.
The following table 2 shows the description structure of the perceptional hint item 3510 indicating the perceptional characteristics of a human being in relation to the contents of a photo, among hint items required for photo albuming illustrated in table 1, expressed in an XML format.
The following table 3 shows the description structure of the photo acquisition hint item 3520 indicating camera information and photographing information when a photo is taken, among hint items required for photo albuming illustrated in table 1, expressed in an XML format.
The following table 4 shows the description structure of the photo view hint item 3540 indicating view information of a photo, among hint items required for photo albuming illustrated in table 1, expressed in an XML format.
The following table 5 shows the description structure of the subject hint item 3550 to indicate information on persons included in a photo, among hint items required for photo albuming illustrated in table 1, expressed in an XML format.
Referring to
The collection-level description metadata 18100 includes description metadata 18110 describing creation information of a corresponding metadata, creation information metadata 18120 describing creation information for a photo collection that is defined by metadata, content references metadata 18130 describing identification information about each photo in a photo collection that is defined by metadata, and a content collection metadata 18140 for a sub-level photo collection in a photo collection that is defined by metadata.
The item-level description metadata 18200 includes description metadata 18210 describing creation information of corresponding metadata, creation information metadata 18220 describing creation information for photos that are defined by metadata, content references metadata 18230 describing identification information about each photo that is defined by metadata, and visual features metadata 18240 for content-based visual features in photos that are defined by metadata.
Meanwhile, the collection-level description metadata is expressed by an ID of CreationInformation DS, or is expressed using “//Creation Information DS/Classification/Genre” description scheme with a classification scheme which is newly defined. The classification scheme may be expressed as defined in table 6.
Referring to
Referring to
Referring to
The following table 7 shows an example of mapping albuming semantics to MPEG-7 MDS.
Referring to
Referring to
The following tables 8-1 and 8-2 represent the Event collection metadata in an XML format, and
The following table 9 represents another example of the Event collection metadata in an XML format, based on the classification scheme defined in table 6, and
The following tables 10-1 and 10-2 represent the Category collection metadata in an XML format, and
The following table 11 represents another example of the Category collection metadata in an XML format, based on the classification scheme defined in table 6, and
The following tables 12-1 and 12-2 represent the Person collection metadata in an XML format, and
The following table 13 represents another example of the Person-identity collection metadata in an XML format, based on the classification scheme defined in table 6, and
Referring to
Meanwhile, an MAF file can be formed with one multiple track MAF 6100 which is composed of a plurality of single track MAFs 6300. The multiple track MAF 6100 includes one or more single track MAFs 6300, an MAF header 6110 of the multiple tracks, MPEG metadata 6600 in relation to the multiple tracks, and application method data 6500 of the MAF file. In the current embodiment, the application method data 6500 is included in the multiple tracks 6100. In another embodiment, the application method data 6500 may be input independently to an MAF file.
Referring to
Also, the part (Movie box) 1520 indicating the metadata of the entire file includes, as basic elements, the part (Meta box) 1530 indicating the metadata in relation to a collection level and a single track MAF (Track box) 1540 formed with one media content and metadata corresponding to the media content. The single track MAF 1540 includes a header (Track Header box) 1541 of the track, media data (Media box) 1542, and MPEG metadata (Meta box) 1543. MAF header information is data indicating media data, and may comply with an ISO basic media file format. The link between metadata and each corresponding internal resource can be specified using the media data 1542. If an external resource 1550 is used instead of the MAF file itself, link information to this external resource may be included in a position specified in each single track MAF 1540, for example, may be included in the media data 1542 or MPEG metadata 1543.
Also, a plurality of single track MAFs 1540 may be included in the part (Movie box) 1520 indicating the metadata of the entire file. Meanwhile, the MAF file 1500 may further include data on the application method of an MAF file as illustrated in
Also, in the MAF file 1500, descriptive metadata may be stored using metadata 1530 and 1543 included in Movie box 1520 or Track box 1540.
The metadata 1530 of Movie box 1520 can be used to define collection level information and the metadata 1543 of Track box 1540 can be used to define item level information. All descriptive metadata can be encoded using an MPEG-7 binary format for metadata (BiM), and the metadata 1530 and 1543 can have an mp7b handler type. The number of Meta boxes for collection level descriptive metadata is 1, and the number of Meta boxes for item level descriptive metadata is the same as the number of resources in the MAF file 1500.
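As a non-limiting illustration, the nesting of a Movie box, one collection level Meta box, and one Track box per resource may be sketched with the generic box layout of the ISO base media file format, in which each box begins with a 32-bit big-endian size followed by a four-character type. The sketch is simplified under stated assumptions: payloads are placeholder bytes instead of BiM data, the Meta box is written as a plain box without its version, flags, or handler sub-box, and the helper names are hypothetical.

```python
import struct


def box(box_type, payload=b""):
    """Serialize one box: 32-bit size (header included), 4-char type, payload."""
    if len(box_type) != 4:
        raise ValueError("box type must be exactly four bytes")
    return struct.pack(">I", 8 + len(payload)) + box_type + payload


def iter_boxes(data):
    """Yield (type, payload) for each box found at the top level of `data`."""
    offset = 0
    while offset + 8 <= len(data):
        size = struct.unpack(">I", data[offset:offset + 4])[0]
        if size < 8 or offset + size > len(data):
            break  # malformed or truncated box; stop instead of looping forever
        yield data[offset + 4:offset + 8], data[offset + 8:offset + size]
        offset += size


def build_movie_box(collection_metadata, items):
    """Build a Movie box with one collection Meta box and one Track box per item.

    `items` is a list of (media_locator_bytes, item_metadata_bytes) pairs.
    """
    tracks = b"".join(
        box(b"trak",
            box(b"tkhd") + box(b"mdia", locator) + box(b"meta", item_meta))
        for locator, item_meta in items
    )
    return box(b"moov", box(b"meta", collection_metadata) + tracks)


def count_meta_boxes(movie_box):
    """Return (collection level, item level) Meta box counts in a Movie box."""
    collection = 0
    item = 0
    for top_type, top_payload in iter_boxes(movie_box):
        if top_type != b"moov":
            continue
        for child_type, child_payload in iter_boxes(top_payload):
            if child_type == b"meta":
                collection += 1
            elif child_type == b"trak":
                item += sum(1 for t, _ in iter_boxes(child_payload) if t == b"meta")
    return collection, item
```

With three resources, `count_meta_boxes` reports one collection level Meta box and three item level Meta boxes, consistent with the counts stated above.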
Referring to
Metadata and application method data related to media data are transferred to the encoding unit 170 and created as one independent MAF file 190.
Referring to
Referring to
The media album database 2220 stores the MAF file created in the MAF file creation unit 2210. The MAF file stored in the media album database 2220 is provided to the browsing unit 2240 and the sharing unit 2250 according to a request from the user.
The query processing unit 2230 retrieves an MAF file which the user desires to browse or share. At this time, metadata of each MAF file stored in the media album database 2220 is parsed so that MAF files matching with the user's query are found.
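As a non-limiting illustration, the retrieval performed by the query processing unit 2230 may be sketched as a match between a user query and the parsed metadata of each stored MAF file. Representing parsed metadata as a dictionary, the field names in the usage example, and the function names are assumptions of this sketch.

```python
def matches(metadata, query):
    """Return True if every queried field equals, or is contained in, the stored value."""
    for key, wanted in query.items():
        stored = metadata.get(key)
        if isinstance(stored, (list, tuple, set)):
            if wanted not in stored:
                return False
        elif stored != wanted:
            return False
    return True


def find_maf_files(album_db, query):
    """Return the names of MAF files whose parsed metadata satisfies the query.

    `album_db` maps a file name to its already parsed metadata dictionary.
    """
    return [name for name, metadata in album_db.items() if matches(metadata, query)]
```

For example, a database holding `{"trip.maf": {"event": "travel", "persons": ["kim", "lee"]}}` returns `["trip.maf"]` for the query `{"persons": "lee"}`.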
In an embodiment of the present invention, the created photo album MAF file is transmitted to other devices through a communication channel 2260. Here, the communication channel 2260 includes wired and/or wireless Internet, a mobile communication network, and a Bluetooth channel, and also includes a physical connection, such as a universal serial bus (USB) apparatus.
An example of the device to which the MAF file is transmitted may include any one of a legacy device 2271 which can recognize the MAF but does not provide full compatibility, an MAF-aware terminal device 2273 which fully recognizes the MAF, an MAF-aware mobile device 2275 which fully recognizes the MAF, and an MAF-aware web album 2277 which fully recognizes the MAF.
Referring to
If photo data is stored in each MAF file of the MAF database 2330, redundancy of photo data occurs. Accordingly, photo data is stored in a separate photo database (Photo DB) 2340 and in each MAF file of the MAF database 2330, metadata and locators indicating the photos of the photo database 2340 are included.
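As a non-limiting illustration, storing photo data once and keeping only locators in each MAF file may be sketched as follows. Using a SHA-256 digest of the photo bytes as the locator is an assumption of this sketch, chosen so that identical photos map to the same entry; the class and function names are hypothetical.

```python
import hashlib


class PhotoStore:
    """Stand-in for the photo database 2340: each distinct photo is kept once."""

    def __init__(self):
        self._photos = {}

    def put(self, photo_bytes):
        """Store the photo if it is new and return its locator."""
        locator = hashlib.sha256(photo_bytes).hexdigest()
        self._photos.setdefault(locator, photo_bytes)
        return locator

    def get(self, locator):
        """Resolve a locator back to the photo bytes."""
        return self._photos[locator]

    def __len__(self):
        return len(self._photos)


def make_maf_record(store, photos_with_metadata):
    """Build the per-file content: metadata plus locators, without photo bytes."""
    return [
        {"locator": store.put(data), "metadata": meta}
        for data, meta in photos_with_metadata
    ]
```

Two MAF records that reference the same photo then share a single stored copy.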
Then, an input query of the user is processed in operation 2350, a new MAF file including a photo matching with the user's query is created and shared in operation 2360 or a photo collection matching with the user's query is browsed in operation 2370.
Referring to
Referring to
Referring to
According to a method of applying a photo album MAF file in a web album apparatus according to an embodiment of the present invention, when a great number of photos included in the photo album MAF are desired to be stored in the web album apparatus, one MAF file is transmitted to the web album apparatus and the web album apparatus extracts metadata from the transmitted MAF file and automatically performs categorization.
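As a non-limiting illustration, the automatic categorization performed by the web album apparatus may be sketched as a grouping of photos by a metadata field extracted from the transmitted MAF file. The dictionary representation of extracted metadata, the default field name `event`, and the label `uncategorized` are assumptions of this sketch.

```python
from collections import defaultdict


def categorize(photo_metadata, key="event"):
    """Group photo identifiers by the value of one metadata field.

    `photo_metadata` maps a photo identifier to its extracted metadata dictionary.
    Photos lacking the field are placed under "uncategorized".
    """
    albums = defaultdict(list)
    for photo_id, metadata in photo_metadata.items():
        albums[metadata.get(key, "uncategorized")].append(photo_id)
    return dict(albums)
```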
Referring to
The following table 11 shows semantic tools of collection level description metadata of
The following table 12 shows semantic tools of item level description metadata of
In addition to the above-described exemplary embodiments, exemplary embodiments of the present invention can also be implemented by executing computer readable code/instructions in/on a medium, e.g., a computer readable medium. The medium can correspond to any medium/media permitting the storing and/or transmission of the computer readable code. The computer readable code/instructions can be recorded/transferred in/on a medium in a variety of ways, with examples of the medium including magnetic storage media (e.g., floppy disks, hard disks, magnetic tapes, etc.), optical recording media (e.g., CD-ROMs, or DVDs), magneto-optical media (e.g., floptical disks), hardware storage devices (e.g., read only memory media, random access memory media, flash memories, etc.) and storage/transmission media such as carrier waves transmitting signals, which may include instructions, data structures, etc. Examples of storage/transmission media may include wired and/or wireless transmission (such as transmission through the Internet). Examples of wired storage/transmission media may include optical wires and metallic wires. The medium/media may also be a distributed network, so that the computer readable code/instructions is stored/transferred and executed in a distributed fashion. The computer readable code/instructions may be executed by one or more processors.
According to the present invention as described above, in a process of integrating digital photos and other multimedia content files into one file in the application file format MAF, visual feature information obtained from photo data and the contents of the photo images, and a variety of hint feature information for effective indexing of photos are included as metadata and content application method tools based on the metadata are included. Accordingly, even when the user does not have a specific application or a function for applying metadata, general-purpose multimedia content files can be effectively used by effectively browsing or sharing the multimedia content files.
Although a few exemplary embodiments of the present invention have been shown and described, it would be appreciated by those skilled in the art that changes may be made in these exemplary embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the claims and their equivalents.
Number | Date | Country | Kind |
---|---|---|---|
10-2006-0049126 | May 2006 | KR | national |
This application claims the priority of U.S. Provisional Application Nos. 60/700,737, filed on Jul. 20, 2005, 60/724,789, filed on Oct. 11, 2005, 60/783,067, filed on Mar. 17, 2006, and 60/786,366, filed on Mar. 28, 2006 in the United States Patent and Trademark Office, and the benefit of Korean Patent Application No. 10-2006-0049126, filed on May 31, 2006, in the Korean Intellectual Property Office, the disclosures of which are incorporated herein in their entirety by reference.
Number | Date | Country | |
---|---|---|---|
60700737 | Jul 2005 | US | |
60724789 | Oct 2005 | US | |
60783067 | Mar 2006 | US | |
60786366 | Mar 2006 | US |