The present invention relates to a program data processing technology and a reproducing technology.
In recent years, with the shift to multi-channel television broadcasting, opportunities for viewing programs suited to users' preferences or tastes have increased, and the number of programs a user wishes to view is presumed to have risen. It is, however, difficult for users to view all of their preferred programs within a limited period of time. Technologies have therefore spread that enable viewing within the limited time by utilizing a variety of reproducing techniques.
Such technologies are exemplified by a highlight reproduction function that extracts only the scenes (exciting scenes) the user is assumed to want to see, and a stretch reproduction function that adjusts the reproduction speed. The highlight reproduction function extracts highlight scenes from a video file and reproduces only the scenes exhibiting a strong degree of highlight; in short, it mechanically creates a digest version of the original program. In this case, the length of time (5 min, 10 min, an arbitrary period of time, etc) of the digest program within which the highlight scenes are reproduced can be designated.
On the other hand, the stretch reproduction function enables the reproduction speed magnification to be designated, such as ×1.0 → ×1.2 → ×1.5 → ×2.0 → and so on. The reproducing device adjusts the period of viewable time according to the designated magnification. If the magnification falls within a predetermined limit, the voices can also be reproduced.
The conventional technologies, though capable of completing the viewing by the desired time, cause an inconvenience: the scenes selected for the highlight reproduction may not coincide with the scenes the user really wants to see, with the result that the user “misses seeing” a want-to-see scene when it is not extracted. Further, in the stretch reproduction, because the reproduction must finish by the target time, the reproduction may be performed at a speed so high that the user cannot sufficiently understand the recorded content. In either case, the conventional viewing technologies are not friendly to the users. Note that a similar problem may arise in voice-only programs with no picture.
It is an aspect of the disclosed technology to provide a technology capable of adjusting the reproduction time of program data stored on a storage medium while enhancing the possibility that the parts of the program assumed to be desired by the user are provided at a reasonable reproduction speed.
According to an aspect of the embodiment, a program data processing device includes a reading unit, a feature extracting unit, a weight acquiring unit and a weighting unit. The reading unit reads a data part contained in the program data from a file storing the program data. The feature extracting unit extracts feature information for distinguishing between reproduction information to be reproduced from the data part and reproduction information to be reproduced from another data part. The weight acquiring unit acquires a weight set on the extracted feature information from a weight table storage unit in which a weight is set per item of feature information contained in the program data. The weighting unit allocates the acquired weight to the data part from which the feature information is extracted.
The object and advantages of the embodiment will be realized and attained by means of the elements and combinations particularly pointed out in the claims.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the embodiment, as claimed.
An audiovisual device according to an embodiment will hereinafter be described with reference to the drawings. A configuration in the following embodiment is an exemplification, and the present technology is not limited to the configuration in the embodiment.
<Outline of Processing>
The audiovisual device segments the video data of a video file into scene data (corresponding to a data part), each consisting of a plurality of frames (video frames), and puts a weight on each scene. Herein, the “scene” is defined as a concept for delimiting the reproduction information, such as the video picture, sound, voice and story, which is reproduced (played back) from the video file. The scene data is defined as data for reproducing a scene delimited from other scenes on the basis of features of this reproduction information. The scenes, though delimited based on differences in audiovisual effect between the video picture, the sound, the voice, the story, etc, can also simply be delimited on a time-designation basis. For example, the delimiters are time-designations such as a scene 1 ranging from the start to N1 and a scene 2 ranging from N1 to N2. Further, the scenes can also be delimited by designating frames as a concept equivalent to the time-designation. For instance, the scene 1 ranges from frame 0 to frame N1, and the scene 2 ranges from frame N1 to frame N2. Moreover, the scenes can also be delimited based on composition information that organizes the program. For example, a certain variety show program is organized into a guest feature (up to 15 min from the start), a commercial 1 (from 15 min to 16 min), a gourmet feature (from 16 min to 30 min), a commercial 2 (from 30 min to 31 min), a gift feature (from 31 min to 40 min), etc. Composition information such as this can be obtained from, e.g., an EPG (Electronic Program Guide).
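By way of illustration, the following is a minimal Python sketch of how scenes might be delimited from such composition information; the names, the frame rate and the tuple layout are assumptions introduced for this example, not part of the embodiment.

```python
from dataclasses import dataclass

FPS = 30  # assumed frame rate of the video data

@dataclass
class Scene:
    label: str         # e.g. "guest feature", "commercial 1"
    start_frame: int   # inclusive
    end_frame: int     # exclusive
    weight: float = 1.0

def scenes_from_epg(composition, fps=FPS):
    """Build scene records from (label, start_min, end_min) tuples
    obtained from the EPG composition information."""
    return [Scene(label, int(start * 60 * fps), int(end * 60 * fps))
            for label, start, end in composition]

# The variety show example from the text:
scenes = scenes_from_epg([
    ("guest feature", 0, 15), ("commercial 1", 15, 16),
    ("gourmet feature", 16, 30), ("commercial 2", 30, 31),
    ("gift feature", 31, 40),
])
```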
Then, the reproduction speed is changed on a per-scene basis according to the weight. Namely, a scene presumed to suit the user's preference or taste is reproduced at the normal speed, while a scene presumed not to suit it is reproduced at a higher speed. This type of adjustment enhances the possibility that the user can reliably view the video file within a predetermined period of reproduction time (e.g., within the time designated by the user) while still viewing the scenes the user has an interest in.
Herein, the video file is created by, e.g., recording a televised program. The video file is not, however, limited to a recorded file and may be data obtained by a variety of methods, e.g., a video file provided in the form of being stored on a storage medium. Further, the weight is set from a history of operations with respect to the programs the user viewed in the past. For instance, a program viewed in the past is segmented into a plurality of scenes, features of the respective scenes are extracted, and the history of the operations conducted by the user when reproducing these scenes is collected. When the operation history indicates fast-forwarding, the audiovisual device determines that the user shows no interest in the scene, or alternatively that the scene does not match the user's preference or taste, and decreases the weight on the feature of this scene. By contrast, when the user returns the reproduction speed from the fast-forwarding status to the normal reproduction speed, the audiovisual device determines that the scene matches the user's preference or taste, and puts a heavy weight on the feature of this scene. Herein, the “normal reproduction speed” connotes a 1× speed at which the data is reproduced without the so-called fast-forwarding.
The feature of a scene is determined by extracting items of information such as the sound volume level of each scene, a change in sound level, the characters displayed on the screen in each scene, a change or no change in those characters, the words contained in the voices uttered in each scene, the words given to the section of the program to which each scene belongs, the degree of change of the screen, and the information related to the program given in the EPG.
Herein, the “characters displayed on the screen” are exemplified by a subtitle, a score in a sports program, etc. The “change or no change in character” implies, for example, a case in which the score in a sports program changes. Moreover, the “information related to the program shown in the EPG” implies, for example, the title, performers, rough plot, etc given to each section in a case where a variety show program is a combination of a plurality of sections such as the guest feature, the gourmet feature and the gift feature. The sections organizing such a program and the broadcasting time of each section can be obtained from the EPG data. Further, the EPG can be acquired from a Web site on the Internet. The audiovisual device stores the relation between the features of the scenes and the weights in a weight table format in a storage means such as a memory or a hard disc.
Then, the audiovisual device segments the video file stored on a medium such as the hard disc into a plurality of scenes and reads the weight by searching through the weight table on the basis of the feature of each scene. Subsequently, the readout weight is set for each scene.
A reproducing device receives a designation of reproduction time from the user. Then, if the designated reproduction time is shorter than the original reproduction time of the video file, the reproducing device adjusts the reproduction time of each scene, thus controlling the reproduction time of the whole video file so as to converge on the reproduction time designated by the user.
When words such as [entry of players], [start of game], [kickoff], [play ball], [end of game], [game and set] and [hero interview (flash interview)] are detected from the uttered voices in the broadcast, the starts of the respective scenes may be presumed. Further, when a numeral in the subtitle indicating the score changes, that scene may be presumed to be a scoring scene. Still further, when words such as [goal], [safe at home plate] and [home-run] are detected in the uttered voices, the scenes before and after these words may be presumed to be scoring scenes.
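This keyword-based presumption might be sketched as follows; this is a hypothetical Python fragment, and the word lists are merely the examples quoted above, not an exhaustive dictionary.

```python
SCENE_START_WORDS = {"entry of players", "start of game", "kickoff",
                     "play ball", "end of game", "game and set",
                     "hero interview"}
SCORING_WORDS = {"goal", "safe at home plate", "home-run"}

def presume_scene(recognized_words, prev_score, cur_score):
    """Presume a label for the span around the current frames.

    recognized_words: words obtained by voice recognition of the span.
    prev_score, cur_score: numerals read from the score subtitle, if any.
    """
    words = {w.lower() for w in recognized_words}
    if words & SCORING_WORDS or (prev_score is not None
                                 and cur_score != prev_score):
        return "scoring scene"
    if words & SCENE_START_WORDS:
        return "scene start"
    return None
```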
Then, a weight as small as 0.1 or 0.2 is set for the “commercial”, while a weight as large as 0.9 is set for the “scoring scene”. Further, 0.6 or 0.7 is set for the “on-playing game” (excluding the “scoring scene”), and numerical values smaller than that of the “on-playing game” are set for the “entry of players” and the “post-game interview”.
Then, the scenes weighted 0.2 or smaller are cut so as not to be reproduced. Moreover, a scene with a weight of 0.9 or more is reproduced at the 1× speed, i.e., at the normal reproduction speed. Further, a 0.4-weighted scene is reproduced at a 4× speed. Still further, a scene with a weight ranging from 0.6 to 0.7 is played back at an intermediate speed between the 1× speed and the 4× speed, e.g., a 1.2× or 1.5× speed.
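These thresholds might be expressed as in the sketch below; the handling of weights between 0.2 and 0.6, and the choice of 1.5× as the intermediate speed, are assumptions interpolated from the sample values in the text.

```python
def playback_speed(weight):
    """Map a scene weight to a reproduction action (None means the
    scene is cut).  Thresholds follow the examples in the text."""
    if weight <= 0.2:
        return None   # commercial-like scene: cut, not reproduced
    if weight >= 0.9:
        return 1.0    # scoring scene: normal (1x) reproduction speed
    if weight >= 0.6:
        return 1.5    # on-playing game: intermediate speed (1.2x-1.5x)
    return 4.0        # remaining low-weight scenes: 4x fast-forward
```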
An audiovisual device 1 according to a first working example will hereinafter be described with reference to the drawings.
A video picture reproduced by the audiovisual device 1 is displayed on a monitor 21. The monitor is exemplified by a liquid crystal display, an electroluminescence panel, a plasma display, a CRT (Cathode Ray Tube), etc.
Moreover, an attachable/detachable storage medium drive 22 is externally connected to the audiovisual device 1 or, alternatively, built into the housing of the audiovisual device 1. The attachable/detachable storage medium is, e.g., a CD (Compact Disc), a DVD (Digital Versatile Disc), a Blu-ray disc, a flash memory, etc. The attachable/detachable storage medium drive 22 reads the video data from a medium storing a video file. Further, when installing the application 15 etc, the attachable/detachable storage medium drive 22 reads the program from the medium and downloads the program onto the hard disc.
The control unit 11 includes a CPU (Central Processing Unit) and a memory; the CPU executes a computer program deployed on the memory in a CPU-executable format. One such computer program is the application 15. Note that the application 15 is, before being deployed on the memory, stored on the hard disc drive 18 or in an unillustrated ROM (Read Only Memory). The control unit 11 accepts a user's operation via, e.g., the RC 20 and controls a recording reservation process, a receiving process based on the recording reservation, and the recording process.
Moreover, the control unit 11 accepts the user's operation via the RC 20, and executes the reproduction of the recorded TV program. On the occasion of the reproduction, the control unit 11 accepts a designation of the reproduction time or reproduction ending time from the user. Subsequently, if the reproduction time or a period of time ranging from the present time to the reproduction ending time is shorter than the recording time of the recorded program, the control unit 11 executes the highlight reproduction according to the embodiment.
The broadcast receiving device 19 demodulates the broadcast waves received by an antenna, and thus acquires the signals of the TV programs. The broadcast receiving device 19 is exemplified by a TV tuner receiving an analog broadcast, an HDTV (High Definition Television) tuner receiving a digital broadcast, or a tuner for a 1seg (one segment) broadcast, which uses one segment of the HDTV-based channels. For both the analog broadcast and the digital broadcast, the configuration of the broadcast receiving device 19 is broadly known, and hence an in-depth description thereof will be omitted.
The acquired signals of the TV program are temporarily stored on the hard disc drive 18. The decoder 12 decodes the signals of the TV program stored on the hard disc drive 18, thus generating the video data. The video data is segmented by the scene extracting unit 13 into scenes each consisting of a plurality of frames (video frames). A scene feature is extracted from each scene. The scene feature is stored, together with information for specifying each scene, in the form of a scene feature table in the memory of the control unit 11.
Furthermore, the highlight extracting unit 14 searches through the weight table based on the scene features and allocates the weights to the respective scenes. The weights are stored in the scene feature table. The scene extracting unit 13 and the highlight extracting unit 14 are realized in the form of computer programs executed by the control unit 11.
The video data generated by the decoder 12 and the scene feature table are stored on the hard disc drive 18. Note that if the video data demodulated by the broadcast receiving device 19 is not encrypted, the decoding process of the decoder 12 is omitted. Further, the processing target video data described above may be either analog data or digital data. Moreover, the broadcast receiving device 19 may capture the analog signals or the digital data of the TV program from a cable network instead of receiving the broadcast waves from the antenna.
A reproduction speed determining unit 16 is one of the computer programs executed by the control unit 11. When reproducing the video data on the hard disc, the reproduction speed determining unit 16 determines a reproduction speed on the basis of the scene feature table generated from the video data. The highlight reproducing unit 17 reproduces each scene at the reproduction speed designated by the reproduction speed determining unit 16. The highlight reproducing unit 17 may be configured as a computer program executed by the CPU of the control unit 11 or as a hardware circuit. In either case, the highlight reproducing unit 17 determines the scene to which each frame belongs according to the frame count from the start position of the TV program, and adjusts the output frame count per unit time of the scene concerned.
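Under the assumption that an N× speed is realized by outputting roughly every Nth frame, the adjustment of the output frame count might look like this minimal Python sketch (the function name is hypothetical):

```python
def frames_to_output(start_frame, end_frame, speed):
    """Yield the frame indices to emit for one scene reproduced at
    `speed`.  Roughly every `speed`-th frame is output, so the output
    frame count per unit time stays at the display rate while the
    scene consumes 1/speed of its original duration."""
    t = float(start_frame)
    while t < end_frame:
        yield int(t)
        t += speed

# A 300-frame scene at 4x speed emits 75 frames:
assert len(list(frames_to_output(0, 300, 4.0))) == 75
```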
An operational example of the audiovisual device 1 will hereinafter be discussed. To start with, the user makes a recording reservation of, e.g., a program of a soccer game by use of the RC 20 (arrows A1-A3). After the recording based on the reservation is completed, the scene extracting unit 13 and the highlight extracting unit 14 are started up under the control of the control unit 11, thereby executing the extraction of the highlight scenes and the computation of the weights on the scenes (arrows A4-A10). For example, the sound volumes of the respective scenes are compared, and an assumption such as [Scene with Loud Sounds = Exciting] is extracted as the scene feature. The audiovisual device 1, however, does not depend simply on the scene feature; it determines whether a heavy weight is put on the scene based on the operation history of how the user behaved in the past when reproducing scenes containing such a feature.
For viewing the recorded program, the user starts up the application 15 by using the RC 20 (arrow A1). At this point, the control unit 11 executing the application 15 displays a recorded program list on the monitor screen. The user selects the recorded program of the soccer game broadcast by relay, and further specifies the time by which the user wants to complete the reproduction. The application 15 accepts these operations and executes the process of reproducing the recorded program. At this time, the control unit 11 executes the reproduction speed determining unit 16 (A11) and computes the reproduction speed based on the weights so that the reproduction converges within the specified period of time. Furthermore, the control unit 11 executes the highlight reproducing unit 17, thereby performing the highlight reproduction at the determined reproduction speed (arrows A11-A13).
For instance, if the user prefers the scoring scene, i.e., a goal scene, the user views the scene containing the uttered word [Goal!] at the 1× speed in many cases, and the history of this operation is recorded often. Further, if the user is not interested in the post-game comments of a coach, the video is fast-forwarded at a 4× speed in the majority of cases, and the history of such an operation is frequently recorded.
Accordingly, it may be sufficient that the keywords characterizing the respective scenes are weighted in association with the detected user's operations (or the reproduction speed when viewing, etc). For instance, after setting an initial value “1” for each keyword, in the case of viewing at an N× speed (N-fold speed), the present weight is multiplied by 1/N, and so forth. Thereupon, starting from the initial value “1”, the weight becomes smaller as the fast-forwarding speed is accelerated and the fast-forwarding count is incremented. It is therefore feasible to distinguish, on a per-user basis according to the viewing history, between the scenes in which the user is interested and the scenes in which the user is not, and to set a proper weight on each scene.
Moreover, an available scheme is that an additional point is prescribed corresponding to the user's operation (or the reproduction speed when viewing, etc) (e.g., a point “0” for 2× speed or faster, a point “1” for between 1× and 2× speed, and a point “3” for 1× speed), and the point is added each time the individual operation is detected, thus totalizing the points for the respective keywords. Then, the points may also be normalized so that the weights of the individual keywords are distributed in a range of 0-1.
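A sketch of this point scheme and the normalization, in Python; the data structures and names are hypothetical:

```python
from collections import defaultdict

keyword_points = defaultdict(int)  # accumulated points per keyword

def points_for_speed(speed):
    """Point schedule from the text: 0 for 2x speed or faster,
    1 for between 1x and 2x, 3 for 1x viewing."""
    if speed >= 2.0:
        return 0
    if speed > 1.0:
        return 1
    return 3

def record_viewing(keywords, speed):
    """Add points to every keyword of the scene just viewed."""
    for kw in keywords:
        keyword_points[kw] += points_for_speed(speed)

def normalized_weights():
    """Distribute the totals into the range 0-1 by linear scaling."""
    lo = min(keyword_points.values())
    hi = max(keyword_points.values())
    span = (hi - lo) or 1
    return {kw: (p - lo) / span for kw, p in keyword_points.items()}
```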
The weights obtained in this manner are set in the weight table.
In the scene feature table, each scene is identified by the frame count. For example, the scenes are segmented so as to range from the start to the 300th frame, from the 301st frame to the N1-th frame (N1 is an integer equal to or larger than 301), and so on. The drawings illustrate an example of such a table.
The audiovisual device 1 searches through the weight table on the basis of the extracted scene features and sets the readout weights on the respective scenes.
Upon receiving the user's designation, the audiovisual device 1 determines whether or not the reproduction can be completed within the designated time (F3). The reproduction time required for reproducing the file can be determined from the frame count described in the reproduction file, the reproduction time described on the medium, or the recording elapsed time recorded in the reproduction file.
If the reproduction does not finish within the desired time (N at F3), the reproduction file is segmented into scenes, and the weights are set on the per-scene basis (F4). The reproduction time of each scene is then set so that the whole falls within the designated time by changing the reproducing method (e.g., the reproduction speed) based on the per-scene weight (F5-F6). For instance, the scene exhibiting a high degree of highlight, i.e., the heavy-weight scene, is reproduced at the normal reproduction speed; the scene exhibiting an intermediate degree of highlight is reproduced by fast-forwarding, e.g., at the 2× speed; and the scene exhibiting a small degree of highlight, as in the case of a commercial, is cut (the scene is eliminated). Then, the audiovisual device 1 reproduces the reproduction file at the set-up reproduction speeds (F7).
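One conceivable realization of F4-F6 is sketched below in Python, under the simplifying assumption that all intermediate-weight scenes share one fast-forward speed computed to absorb the overrun; the thresholds reuse the sample values quoted earlier.

```python
def assign_speeds(scenes, target_seconds, fps=30):
    """scenes: list of (weight, frame_count) pairs.
    Returns (weight, frame_count, speed) triples, where speed None
    means the scene is cut.  Heavy scenes stay at 1x; the remaining
    scenes are fast-forwarded just enough to fit the designated time."""
    normal_secs = sum(n for w, n in scenes if w >= 0.9) / fps
    mid_secs = sum(n for w, n in scenes if 0.2 < w < 0.9) / fps
    budget = target_seconds - normal_secs
    # Required shared speed for the intermediate scenes (never below 1x).
    speed = max(1.0, mid_secs / budget) if budget > 0 else float("inf")
    plan = []
    for w, n in scenes:
        if w <= 0.2:
            plan.append((w, n, None))   # cut (e.g. commercials)
        elif w >= 0.9:
            plan.append((w, n, 1.0))    # normal reproduction speed
        else:
            plan.append((w, n, speed))  # fast-forwarded
    return plan
```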
Thus, the reproduction time becomes variable depending on the degree of highlight, i.e., on the weight. Note that the user may configure a setting that enables a transition to the next scene at any time by pressing a [skip] button on the RC 20 etc. Similarly, during the double-speed reproduction, the setting may be such that the mode can transition to the normal reproduction at any time by pressing a [reproduction] button on the RC 20 etc. Moreover, these operations on the RC 20 may be stored beforehand and accumulated as reference information for determining the degree of highlight.
Then, the audiovisual device 1 extracts the scene feature by analyzing the scene data (F42). For example, the scene feature is determined from an uttered word (as the keyword) detected in the voice data. To be specific, the audiovisual device 1 applies voice recognition to the voice data and thus extracts the keyword. The voice recognition involves collating combinations of consonants and vowels in the voice data with a predetermined set of dictionary data. A specific process of voice recognition is already known, and therefore its detailed explanation is omitted. If the category of the TV program is known, however, no inconvenience is caused by changing the voice recognition dictionary on a category-by-category basis; this is because, for instance, the words uttered in a soccer game broadcast are limited in number to some extent. The extracted scene feature, i.e., the keyword, is stored in the scene feature table format described above.
Next, the audiovisual device 1 determines, based on the extracted keyword, the weight with reference to the weight table (F43). The CPU of the control unit 11 executing this process corresponds to the weight acquiring unit. Then, the weight is allocated to the scene (F44). The CPU of the control unit 11 executing this process corresponds to the weighting unit. Subsequently, the audiovisual device 1 determines whether the next scene data (i.e., the next frame) exists or not (F45). If the next scene data exists, the audiovisual device 1 returns the control to F41. Whereas if the processing for all of the scenes is terminated, the audiovisual device 1 finishes the scene weighting process.
Note that the scheme of the present working example is to execute the scene weighting process during the reproducing process described above.
As discussed above, the audiovisual device 1 according to the present working example enables the user to view the video within the designated time in a manner that lets the user comprehend the content, e.g., by enabling the important scenes to be viewed at the normal reproduction speed while cutting the unnecessary scenes. In this case, the determination as to which scene is cut, which scene is fast-forwarded and which scene is reproduced at the normal reproduction speed is made based on the weight allocated to the scene feature. Further, even a non-cut scene, if not matched with the user's preference or taste, can be fast-forwarded. Owing to these combinations of reproduction speeds, it is feasible to finish the reproduction by the time desired by the user and to reduce the possibility that the user misses a part (scene) in which the user is interested.
Note that if a player the user has an interest in appears in the post-game interview, the setting may be such that the normal reproduction can be done by pressing the [reproduction] button on the RC 20. Further, the setting in the case of pressing the [skip] button on the RC 20 may be such that the operation transitions to the next scene. Moreover, the reproducing method during playback may invariably be displayed so that the user does not get confused. Such a display is, e.g., [On-Highlight-Reproduction] etc.
The audiovisual device 1 according to a second working example will hereinafter be described with reference to the drawings.
Together with collecting the scene feature, the audiovisual device 1 detects the user's operation from, e.g., the RC 20 (F102, F103) (the RC 20 or an unillustrated input device corresponds to an operation detecting unit). Then, when detecting the operation, the audiovisual device 1 determines whether the detected operation indicates a skip of the scene or not (F104). If the detected operation indicates the skip, the weight on the scene feature is reduced (F105). For example, the weight is reduced by one count (or alternatively, the weight is multiplied by 1/(2M), where M is the magnification of the fastest fast-forwarding with respect to the normal reproduction speed). Then, the audiovisual device 1 loops the control back to F101.
If the operation does not indicate the skip of the scene, the audiovisual device 1 determines whether the operation indicates a change in reproduction speed or not (F107). If the detected operation indicates an increase up to an N-fold speed, the weight on the scene feature is decremented (F108). For example, the weight is decremented by 0.5 count (or alternatively the weight is multiplied by 1/N). Then, the audiovisual device 1 loops the control back to F101. Further, if the detected operation indicates a change to the normal reproduction speed, the audiovisual device 1 increments the weight on the feature of the scene concerned (F109). For instance, the weight is incremented by 1 count (or alternatively the weight is doubled). Subsequently, the audiovisual device 1 loops the control back to F101.
Moreover, upon finishing the reproduction (N in F100), the audiovisual device 1 normalizes the weights in the scene feature table to the range of 0-1 (F110). To be specific, the weight values set in the processes of F101-F109 are converted into the range from the minimum value “0” to the maximum value “1”. For the conversion, the numerical values may be converted by use of a linear function based on the computed weights. Further, if the weight characteristic is to be changed together with the conversion, the values may also be converted by use of a curvilinear function.
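The update rules of F104-F110 might be sketched as follows; both the count-based adjustments and the parenthesized multiplicative variants from the text are shown, and the function names are hypothetical.

```python
def update_weight(weight, operation, n=1.0, m=4.0):
    """Adjust a scene-feature weight for one detected operation.
    n: designated magnification on a speed change; m: magnification
    of the fastest fast-forwarding relative to the normal speed."""
    if operation == "skip":
        return weight - 1.0   # or: weight * (1.0 / (2 * m))
    if operation == "speed_up":
        return weight - 0.5   # or: weight * (1.0 / n)
    if operation == "normal":
        return weight + 1.0   # or: weight * 2.0
    return weight

def normalize(weights):
    """F110: linear conversion of the raw weights into the range 0-1."""
    lo, hi = min(weights), max(weights)
    span = (hi - lo) or 1.0
    return [(w - lo) / span for w in weights]
```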
Through the processes described above, the weight can be set on the scene feature on the per-scene basis according to the history of the user's operations. Note that the process of reproducing the video file on the basis of the set-up weights is the same as in the first working example.
In the first working example, the scene feature is extracted based on information acquired by processing the video data, such as the keywords contained in the voice data. As a substitute for this process, the scenes may be delimited based on the composition information of the program acquired from the EPG.
Subsequently, the weights may be allocated to the respective scenes with reference to the weights set according to the history acquired in the past when reproducing a program having a similar program composition. This technique enables the scenes to be segmented based on the EPG.
A processing example of the audiovisual device 1 according to a third working example will hereinafter be discussed. In the third working example, the audiovisual device 1 displays on the monitor screen a reduced image (hereinafter termed a chapter image) of a frame partly composing the video picture. In the third working example, the chapter image represents the head frame (or a representative frame) of each scene. A plurality of chapter images may also be displayed per scene. In this case, the audiovisual device 1 may select the chapter images per scene in accordance with a predetermined standard, such as the degree of change in the screen, the degree of change in the sound and a change in the characters, and may display the selected chapter images.
Then, the audiovisual device 1 accepts the user's operation about the chapter image. The audiovisual device 1 sets the weight on each chapter image, i.e., each scene according to the user's operation.
The following is a working example in a case where the user sets the weight on the scene.
The scene features are related to, e.g., whether the voice level is equal to or larger than a predetermined reference value, whether the voice level rises by a predetermined value or more, whether a numeral (a character portion indicating the score) on the screen changes, whether the image changes to a predetermined degree or greater, whether the voice belonging to the frame group contains a specified keyword (e.g., [goal], [scoring], etc), and so forth.
Then, the audiovisual device 1 determines from the collected scene features whether a new scene should be defined or not (F133). To be specific, the audiovisual device 1 determines that a new scene should be defined if any one of the criteria applies: whether the voice level is equal to or larger than the reference value, whether the voice level rises by the predetermined value or more, whether the numeral (the character portion indicating the score) on the screen changes, whether the image changes to the predetermined degree or greater, whether the voice belonging to the frame group contains the specified keyword, etc. Then, one of the images in the frame group (e.g., the head image) is stored as the chapter image on the hard disc drive 18 (F134). Further, an entry is added to the chapter image management table for managing the chapter images (F135). After the entry is added to the chapter image management table, the audiovisual device 1 advances the control to F131.
Further, when all of the frames have been processed as determined in F131, the audiovisual device 1 displays the chapter images selected in the process described above (F136). Subsequently, the audiovisual device 1 accepts the weight setting according to the user's operation (F137).
Note that the process of reproducing the video file based on the set-up weight is the same as in the first working example.
In this process, the reference frame and the target frame are each segmented into partial areas. Then, a difference in feature quantity between corresponding partial areas is computed. The feature quantity is defined as, e.g., the average color within the partial area (e.g., the frequency value of each of the RGB values, i.e., the red, green and blue values). Moreover, the feature quantity may also be defined as the color distribution, i.e., the RGB values of the respective pixels. In the former case, the total sum of the variations of the average R-value, G-value and B-value is set as the difference. In the latter case, the sum of the variations of the R-value, G-value and B-value on a per-pixel basis, integrated over all of the pixels within the partial area, is set as the difference. Then, the variation of the screen is assumed to be the total value obtained by summing the differences of the respective partial areas over all of the partial areas.
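A sketch of this block-wise average-color difference in Python with NumPy; the 4×4 grid is an assumed parameter, and the reference value compared in F155 would be tuned empirically.

```python
import numpy as np

def screen_variation(ref, tgt, grid=(4, 4)):
    """Variation between two frames given as H x W x 3 RGB arrays:
    segment each frame into grid blocks, compute each block's average
    RGB, and total the absolute changes of the R, G and B averages
    over all blocks."""
    h, w, _ = ref.shape
    bh, bw = h // grid[0], w // grid[1]
    total = 0.0
    for i in range(grid[0]):
        for j in range(grid[1]):
            a = ref[i*bh:(i+1)*bh, j*bw:(j+1)*bw].reshape(-1, 3).mean(axis=0)
            b = tgt[i*bh:(i+1)*bh, j*bw:(j+1)*bw].reshape(-1, 3).mean(axis=0)
            total += float(np.abs(a - b).sum())
    return total  # compared against the reference value in F155
```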
Then, the audiovisual device 1 determines whether or not the total sum given in F154 exceeds a reference value (F155). The reference value is, e.g., an empirically accumulated value and can be set as a system parameter. Subsequently, if the total sum exceeds the reference value, a new scene is defined (F156). Specifically, a new entry is added to the chapter image management table described above.
Then, the audiovisual device 1 determines whether the next frame remains or not (F157). If the next frame remains, the audiovisual device 1 loops the control back to F151. On the other hand, when all of the frames have been processed, the processing comes to an end.
The chapter image can be extracted through the procedures described above. Note that the processing may follow the same procedures in the case of extracting the chapter image on the basis of the other features, i.e., whether the voice level is equal to or larger than the reference value, whether the voice level rises by the predetermined value or more, whether the numeral (the character portion indicating the score) on the screen changes, and whether the voice belonging to the frame group contains the specified keyword (e.g., [goal], [scoring], etc).
Note that a numeral on the screen may be detected by pattern matching between the screen data and a numeral pattern. A keyword may also be detected by pattern matching between the screen data and a character pattern. The character size of the subtitle, the Telop (Television Opaque), the score of the sports game, etc may also be pattern-matched in a manner that narrows the character size down to a dimensional range acquired from an empirical value per program.
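Such detection might, for instance, use an off-the-shelf template-matching routine; the sketch below uses OpenCV's matchTemplate, and the matching threshold and the per-font digit templates are assumptions, not part of the embodiment.

```python
import cv2

def find_numeral(frame_gray, digit_template, threshold=0.8):
    """Locate a numeral pattern in a grayscale frame by normalized
    template matching; returns the top-left corner or None."""
    result = cv2.matchTemplate(frame_gray, digit_template,
                               cv2.TM_CCOEFF_NORMED)
    _, max_val, _, max_loc = cv2.minMaxLoc(result)
    return max_loc if max_val >= threshold else None
```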
In the first working example, the scene feature is extracted based on, e.g., a keyword in the voice. This does not, however, mean that the scene feature is limited to keywords in the voice. For example, the scenes can be categorized by use of a variety of scene features, such as the sound level, the keywords associated with the program and the degree of variation of the screen. The scene weight may then be put, through the user's operation when viewing the scene, on each of the scenes categorized based on this variety of features.
The weight may be set in the same way as by the processing in the second working example.
Then, in the reproduction of the next similar program or a program of a similar category, the respective scenes are determined from the features described above, and the weights are set on the individual scenes. Subsequently, the weights may be stored in a scene feature table similar to the one in the first working example.
Meta information of the program may be utilized as a weighting determination element. For instance, if a [news] program is recognized from the meta information of the program acquired from the EPG, an available scheme is that the weighting is determined not from the loudness of the sound of the scene but by putting the weight on a portion containing the news Telop.
Furthermore, the embodiment has exemplified adjusting the reproduction speed mainly when reproducing a TV program. This process is not, however, restricted to TV programs and can be applied similarly to a radio program using only sounds or voices. Moreover, the technology can be applied, without being confined to broadcast programs, similarly to programs stored in data files acquired from the Internet, and to movies, music, compositions, etc stored on storage mediums. Accordingly, when the present technology is applied, the terminology “program” includes the program of the TV broadcast, the program of the radio broadcast, the movie, the music, the composition, etc.
Further, in the embodiment, the weights are associated with the respective scenes in the scene feature table as described above.
<Readable-by-Computer Recording Medium>
A program for making a computer or other machines and devices (hereinafter referred to as the computer etc) realize any one of the functions described above can be recorded on a recording medium readable by the computer etc. Then, the function can be provided by making the computer etc read and execute the program on this recording medium.
Herein, the recording medium readable by the computer etc connotes a recording medium that can store information such as data and programs electrically, magnetically, optically, mechanically or by chemical action, and that can be read by the computer etc. Among such recording mediums, those removable from the computer include, for example, a flexible disc, a magneto-optic disc, a CD-ROM, a CD-R/W, a DVD, a Blu-ray disc, a DAT (Digital Audio Tape), an 8 mm tape, a memory card, etc.
Further, a hard disc, a ROM (Read-Only Memory), etc are given as the recording mediums fixed within the computer etc.
This is a continuation of Application PCT/JP2008/073694, filed on Dec. 26, 2008, now pending, the entire contents of which are incorporated herein by reference.