One or more implementations relate to the field of media players; and more specifically, to creating playlists of excerpts from audio recordings.
Some media players allow a user to create a playlist of audio or video recordings. An audio recording is audio data that has been stored on machine-readable media for later playback. Audio of an audio recording may be combined with other media (e.g., an audio recording may comprise the audio portion of a video recording). Playback is the action of causing media recordings to be heard or seen again via an electronic device. For example, playback of an audio recording is the action of causing the audio recording to be heard again (e.g., via an end user device). Playback might be performed by a media player; i.e., software that provides a graphical user interface for playing back media via an electronic device.
The following figures use like reference numbers to refer to like elements. Although the following figures depict various example implementations, alternative implementations are within the spirit and scope of the appended claims. In the drawings:
The following description describes implementations for creating, from audio recordings, a playlist of excerpts that include mentions of keywords, for playback by a media player. A playlist is data that identifies one or more audio recordings, or portions thereof, for playback. Typically, a playlist includes data that identifies more than one audio recording and a media player will play the audio recordings (or excerpts thereof) in the playlist sequentially.
A keyword is a word of particular significance in a particular context. For example, a keyword corresponding to the name of a business competitor might be of particular significance in the context of a salesperson pitching a prospect. For another example, a keyword corresponding to the name of a product or service might be of particular significance in the context of a customer service representative providing telephone support to a customer. Although reference is made herein to “a keyword,” “keywords,” “keyword of interest” and the like, implementations described herein may support key phrases (i.e., a phrase of particular significance in a particular context).
Creating a Playlist of Excerpts
Media Player
Media player 100 also shows a scrubber bar 104 that shows a timeline for the audio, running from the beginning of the audio recording on the left to the end of the audio recording on the right. The current play position in the audio recording is shown by cursor 106. In scrubber bar 104, the time that each of the participants was talking is shown using the indicators shown in the legend 107 for the respective participants. In particular, the scrubber bar 104 is divided such that it includes a section 109 for each participant (as shown, section 109A for the participant identified as “Jesse”; section 109B for the participant identified as “Kathy”). Each section 109 runs the length of scrubber bar 104. Also, each section 109 includes: 1) portions with an indicator from the legend 107 (e.g., shading) to indicate the times when the corresponding participant was talking (or providing some other kind of meaningful audio input); and 2) portions without an indicator (e.g., blanks) during the times when that participant was not talking (or not providing some other kind of meaningful audio input). In other words, the sections 109 in scrubber bar 104 show who is talking at what points in the audio recording. For a given audio recording, scrubber bar 104 might include one, two, or more sections 109 if the audio recording includes one, two, or more participants respectively.
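The per-participant talk intervals behind sections 109 might be captured as in the following sketch. This is a hypothetical Python sketch; the data shapes, names, and times are illustrative assumptions, not part of the described implementations.

```python
# A minimal sketch of the per-participant talk intervals from which a
# section 109 of scrubber bar 104 might be rendered (shapes are illustrative).

def talking_at(intervals, t):
    """True if any (start, end) interval, in seconds, covers time t."""
    return any(start <= t < end for start, end in intervals)

# One entry per participant, as in legend 107 ("Jesse", "Kathy").
sections = {
    "Jesse": [(0.0, 12.5), (30.0, 55.0)],
    "Kathy": [(12.5, 30.0), (55.0, 70.0)],
}

def speakers_at(sections, t):
    """Names of participants talking at time t (the shaded portions)."""
    return [name for name, ivals in sections.items() if talking_at(ivals, t)]
```

A renderer could shade each section 109 wherever `talking_at` is true for the corresponding participant and leave it blank elsewhere.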
Implementations might include other user interface (UI) elements in scrubber bar 104. A UI element is an element of which a user interface is comprised, such as an input element (e.g., dropdown list, toggle), navigation element (e.g., search field, icon), informational element (e.g., text label, visualization), container element (e.g., accordion), etc. An implementation might include one or more UI elements that allow a user to: 1) start, pause, and/or stop playback of an audio recording (e.g., as indicated by the play icon immediately below the center of section 109B); 2) fast forward or rewind playback (e.g., as indicated by the two icons respectively to the left and right of the play icon); 3) view a time corresponding to the current play position in the audio recording and the duration of the audio recording (e.g., the values “00:00/04:05” shown immediately below the bottom right-hand corner of section 109B); 4) adjust the volume of the playback (e.g., as indicated by the slider and the speaker icon to its right); etc.
Media player 100 includes one or more UI elements 108 for one or more keywords 110. A UI element 108 allows a user to select a keyword of interest 112 (i.e., make a selection that identifies a keyword of interest 112) and indicate mentions (i.e., instances) of that keyword of interest 112 in the audio recording via carets 116 in scrubber bar 104 (a caret is a UI element that indicates a position in the audio recording). Some implementations associate a keyword 110 with a type (e.g., a classification of that keyword). In the example shown in
Other implementations show UI element 108 and/or carets 116 differently. For example, an implementation might display one UI element 108 with a set of keywords 110 (e.g., via a drop-down list) for a user to select. Another implementation might allow a user to select more than one keyword 110 as a keyword of interest 112 and display a caret 116 for each mention of each keyword of interest 112 (e.g., carets 116 for one keyword of interest 112 in one color, carets 116 for another keyword of interest 112 in another color, etc.). When an audio recording has more than one participant, an implementation might show a caret 116 associated with the section 109 corresponding to the participant that mentioned the keyword of interest 112 to which the caret 116 corresponds. For example, in an audio recording with two participants (as
Implementations of media player 100 allow for navigation to positions in the audio recording where mentions of a keyword of interest 112 occur. For example, an implementation might position cursor 106 in scrubber bar 104 at or before a caret 116 corresponding to a first mention in the audio recording of a keyword 110 when the user selects the UI element 108 corresponding to the keyword 110. In the example shown in
Media player 100 further includes a UI element 114 (shown with text “Add to Playlist”) that a user may select to add, to a playlist, data that identifies an excerpt. An excerpt is a portion of an audio recording that is less than all of the audio recording. For example, an excerpt of an audio recording is a portion of the audio recording with a duration less than that of the audio recording. Different implementations of media player 100 may support different modes of operation for UI element 114 and/or adding data that identifies an excerpt to a playlist, and some implementations may allow a user to select a mode of operation.
One mode of operation includes, responsive to a user selecting UI element 114, adding data that identifies an excerpt to a playlist for all mentions of a keyword of interest 112 in an audio recording. Another mode of operation includes, responsive to a user selecting UI element 114, adding data that identifies an excerpt to a playlist for only some mentions of a keyword of interest 112. For example, implementations might add data that identifies an excerpt for only a first mention, or a given number of mentions, of the keyword of interest 112. An implementation might support a user specifying the number of mentions to be added to a playlist in a configuration setting for media player 100, and/or in UI element 114 (or another UI element). Yet another mode of operation might blend the modes of operation previously described. For example, an implementation might support a user selecting one or more mentions of a keyword of interest 112 in media player 100 (e.g., by selecting one or more carets 116, which may be selectable in some implementations). A blended mode of operation might 1) add data that identifies an excerpt to a playlist for all mentions of a keyword of interest 112 if a user has not selected particular mentions of a keyword of interest 112, and 2) add data that identifies an excerpt to a playlist for only mentions of a keyword of interest 112 that the user has selected, if a user has selected particular mentions.
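The modes of operation described above might be sketched as follows. This is a hypothetical Python sketch of the selection logic only; the function name and parameters are illustrative assumptions, not part of the described implementations.

```python
def mentions_to_add(all_mentions, selected_mentions, limit=None):
    """Blended mode sketch: use the user's selected mentions (e.g., selected
    carets 116) when any are present; otherwise fall back to all mentions of
    the keyword of interest 112, optionally capped at the first `limit`
    mentions (e.g., per a configuration setting for media player 100)."""
    mentions = list(selected_mentions) if selected_mentions else list(all_mentions)
    if limit is not None:
        mentions = mentions[:limit]
    return mentions
```

Selecting UI element 114 would then add, to the playlist, data that identifies an excerpt for each mention returned.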
Communication with Media Server
At time 1a (indicated with circled reference “1a”), a set of IDs for audio recordings 147 is transmitted by server 130 (i.e., by code 132) to electronic device 122. The set of IDs for audio recordings 147 is based on audio recordings 136 in datastore 134. In one implementation, the set of IDs for audio recordings 147 is transmitted by server 130 to electronic device 122 responsive to a user of electronic device 122 selecting a UI element in media player 100 to browse audio recordings 136, and code 124 transmitting a request to server 130 for a set of IDs for audio recordings 147. In another implementation, the set of IDs for audio recordings 147 is transmitted by server 130 to electronic device 122 responsive to a user performing a search with media player 100 for audio recordings 136 that include mentions of a keyword of interest 112, and code 124 transmitting a request to server 130 for a set of IDs for audio recordings 147 that include mentions of the keyword of interest 112 (which might also include code 132 submitting a query to, and receiving query results from, metadata repository 140).
At time 1b (indicated with circled reference “1b”), the set of IDs for audio recordings 147 is displayed in media player 100 such that a user may select a corresponding audio recording 136 for playback by media player 100. In one implementation, the set of IDs for audio recordings 147 are displayed in a browse file dialog box. In another implementation described later herein referring to
At time 2a (indicated with circled reference “2a”), code 124 receives a selection of an audio recording for playback 151. In one implementation, responsive to receiving the selection of an audio recording for playback 151, code 124 performs block 154 shown in
At time 2b (indicated with circled reference “2b”), code 124 transmits a request 155 for content and metadata for audio recording 136 to server 130. In one implementation, responsive to receiving the request, code 132 retrieves content for audio recording 136 from datastore 134, and metadata 142 for audio recording 136 from metadata repository 140.
At time 2c (indicated with circled reference “2c”), code 132 transmits content and metadata 159 for audio recording 136 to electronic device 122. Different implementations may handle the transmission of content and metadata 159 between server 130 and electronic device 122 in different ways. For example, implementations may support server 130 transmitting content and metadata 159 to electronic device 122 via different streaming and/or buffering techniques. Thus, implementations may involve server 130 transmitting content and metadata 159 to electronic device 122 in different parts at different times (e.g., via adaptive or multi bitrate streaming). Other implementations may support server 130 transmitting content and metadata 159 to electronic device 122 without streaming and/or buffering techniques (e.g., as a single file at one time). Implementations may transmit the metadata 142 of content and metadata 159 separately to electronic device 122 (e.g., to allow media player 100 to display some or all of the metadata while content is buffered for later playback).
At time 2d (indicated with a circled reference “2d”), responsive to receiving content and metadata 159, code 124 1) displays metadata for audio recording 136 in media player 100 (e.g., UI elements 108 for keywords 110; identifiers for participants in legend 107, sections 109 for each participant; etc.); and/or 2) begins playback of the audio recording 136 with the content, or buffers the content for future playback. Content might be buffered for future playback: 1) if playback is yet to occur, when a user of media player 100 selects a UI element that starts playback; and/or 2) if playback is occurring, when a current play position (e.g., as indicated by cursor 106) reaches the portion of the audio recording 136 that includes the content; etc. At time 3a (indicated with circled reference “3a”), code 124 receives a selection 163 that identifies a keyword of interest. In one implementation, responsive to receiving the selection 163, code 124 performs block 158 shown in
In one implementation at time 3b (indicated with circled reference “3b”), code 124 transmits a request 167 for indications of mentions 171 for audio recording 136 to server 130. In one implementation, an indication of a mention 171 includes data that indicates a position of a mention in audio; e.g., an offset relative to a beginning of an audio recording 136. In another implementation, an indication of a mention 171 includes data that identifies a participant who made the mention (i.e., said the keyword). Additionally, or alternatively, an indication of a mention 171 may include an identifier for the keyword 110 to which the mention corresponds.
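An indication of a mention 171 as described above might be represented as in the following sketch. This is a hypothetical Python sketch; the field names and example values are illustrative assumptions, not part of the described implementations.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class MentionIndication:
    """One indication of a mention 171 (field names are illustrative)."""
    keyword_id: str                    # identifier for the keyword 110
    offset_seconds: float              # offset relative to the recording start
    participant: Optional[str] = None  # who said the keyword, if known
```

A set of such records for an audio recording 136 could be stored in metadata 142 and used to place carets 116 in scrubber bar 104.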
In one implementation, responsive to receiving the request 167, code 132 retrieves indications of mentions 171 for audio recording 136 from metadata repository 140 (indications of mentions 171 may be stored in metadata 142 for audio recording 136). In another implementation, content and metadata 159 includes indications of mentions 171 for audio recording 136, and code 124 need not transmit request 167 for indications of mentions 171.
At time 3c (indicated with circled reference “3c”), code 132 transmits indications of mentions 171 for audio recording 136 to electronic device 122. Indications of mentions 171 might include indications of mentions corresponding only to the keyword of interest 112 that the user selected (i.e., in selection 163). Alternatively, indications of mentions 171 might include indications of mentions corresponding to keyword of interest 112 and other keywords 110 (e.g., keywords 110 for which indications of mentions are included in metadata 142).
At time 3d (indicated with circled reference “3d”), code 124 displays the indications of mentions 171 for the one or more keywords of interest 112 in media player 100. In one implementation, carets 116 are displayed in scrubber bar 104 of the media player 100 for one or more of the indications of mentions 171.
At time 4 (indicated with circled reference “4”), in one implementation, code 124 receives a selection of a caret 116 that indicates a mention 175. Responsive to receiving the selection of a caret 116 that indicates a mention 175, code 124 performs block 162 shown in
At time 5a (indicated with circled reference “5a”), code 124 receives a selection 181 of a UI element 114 that allows for adding data that identifies an excerpt to a playlist 144. It should be noted that a user may select UI element 114 before or after playback of an audio recording 136 has begun. In one implementation, responsive to receiving the selection 181, code 124 performs block 166 shown in
Block 170 includes adding, to a playlist 144, data that identifies an excerpt 148, from the first audio recording 136, that includes a mention of the first keyword of interest 112. Data that identifies an excerpt 148 is described in more detail later herein referring to other figures. In one implementation, data that identifies an excerpt 148 includes an identifier 102 for an audio recording 136. Block 170 includes block 172 and block 174 in one implementation. In block 172, an identifier 102 for a first audio recording 136 is added to the playlist 144. In block 174, data to locate the first excerpt in the first audio recording 136 is added to the playlist 144. Data to locate the first excerpt in the first audio recording is also described in more detail later herein referring to other figures. In implementations that support a user making a selection 163 that identifies a set of one or more keywords of interest 112, adding data that identifies a first excerpt 148 to playlist 144 may include adding data that includes a mention of at least a first keyword of interest 112 from the set of keywords of interest, which may in turn include one or both of block 172 and block 174. From block 170, flow passes to block 176.
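Blocks 172 and 174 described above might be sketched as a single append to a playlist. This is a hypothetical Python sketch; the data shapes and names are illustrative assumptions, not part of the described implementations.

```python
def add_excerpt(playlist, recording_id, locator):
    """Blocks 172 and 174 as one append: the identifier 102 for the audio
    recording 136 plus data to locate the excerpt (e.g., a keyword identifier
    and a mention index or offset; shape is illustrative)."""
    playlist.append({"recording_id": recording_id, "locate": locator})
    return playlist
```

Repeating the call for a second audio recording 136 would correspond to blocks 186 and 188.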
In block 176, a selection 151 of a second audio recording 136 for playback by the media player 100 is accepted from a user. Block 176 may be performed for a second audio recording 136 as block 154 is performed for a first audio recording 136. From block 176, flow passes to block 178.
In block 178, a selection 163 that identifies a second keyword of interest 112 is accepted from the user. The second keyword of interest 112 may be the same as, or different from, the first keyword of interest 112 (e.g., the first and second keywords of interest 112 might have a value of “Comp. 1” (i.e., the same); or the first keyword of interest 112 might have a value of “Comp. 1” and the second keyword of interest 112 might have a value of “Product 1” (i.e., different)). From block 178, flow passes to block 179.
In block 179, a selection of a set of carets 116 that indicate mentions of the second keyword of interest 112 in the first audio recording 136 is accepted from the user. From block 179, flow passes to block 180.
In block 180, another selection 181 of the user interface element 114 in the media player 100 is accepted from the user. Block 180 may be performed for the other selection 181 as block 166 is performed for a selection 181 of the user interface element 114.
Some implementations may support performing other operations before block 180. For example, implementations may support accepting, from a user, a selection 163 that identifies a second keyword of interest 112; and/or accepting, from the user, a selection of a set of carets 116 in the media player 100 that indicate mentions of the first or second keyword of interest 112 in the second audio recording 136. From block 180, flow passes to block 184.
Block 184 includes adding, to the playlist 144, data that identifies a second excerpt, from the second audio recording 136, that includes a mention of a second keyword of interest 112. Block 184 optionally includes one or both of block 186 and block 188. In block 186, an identifier 102 for the second audio recording 136 is added. In block 188, data to locate the second excerpt in the second audio recording 136 is added. In implementations that support a user making a selection 163 that identifies a set of one or more keywords of interest 112, adding data that identifies a second excerpt to playlist 144 may include adding data that includes a mention of at least a second keyword of interest 112 from the set of keywords of interest 112, which may in turn include one or both of block 186 and block 188.
Deployment
In one implementation, a playlist 144 of excerpts from audio recordings 136 is created in block 150, as shown in
In another implementation, block 170 and/or block 184 are performed by code 132 on server 130. In one such implementation, at time 5b1 (indicated by circled reference “5b1”), 1) code 124 transmits data that identifies a first excerpt 148 to server 130, responsive to which block 170 is performed on server 130 in respect of a first audio recording 136; and/or 2) code 124 transmits data that identifies a second excerpt 148 to server 130, responsive to which block 184 is performed on server 130 in respect of a second audio recording 136. Code 132 optionally causes playlist 144 to be stored (e.g., in metadata repository 140).
Implementations are described in relation to creating a playlist 144. However, creating a playlist 144 may include creating a playlist 144 from an existing playlist 144. For example, an implementation may use one or more of 1) blocks 154, block 158, block 162, block 166, block 170; or 2) the foregoing blocks and block 176, block 178, block 179, block 180, and block 184, in each case to add data that identifies an excerpt 148 to a playlist 144 that already exists (e.g., by appending the data that identifies an excerpt 148 to the existing playlist 144).
Relatedly, it should be noted that implementations support a user selecting different audio recordings 136 at different times, and the user selecting the same or different keywords of interest 112 from one or more of those different audio recordings 136 during playback thereof. For example, one implementation supports 1) accepting, from the user, a set of one or more selections 163 that identify a set of one or more keywords of interest 112; 2) accepting, from a user at different times, selections of different ones of a plurality of audio recordings 136 for playback by a media player 100 for playing audio; 3) accepting, from the user, selections of a user interface element 114 in the media player 100 during the different times each of the plurality of audio recordings 136 is selected for playback; and 4) adding to a playlist 144, responsive to the selections of the user interface element 114, data that identifies excerpts 148 from the plurality of audio recordings 136, the data including identifiers 102 for each of the plurality of audio recordings 136 and a set of data to locate the excerpts in the plurality of audio recordings 136, wherein each of the excerpts includes a mention of at least one of the set of keywords of interest 112.
It should also be noted that different implementations may support different sequences of the circled references shown in
A playlist 144 of excerpts from audio recordings 136 provides several advantages. A playlist 144 of excerpts allows for different uses of those excerpts. Notably, a media player 100 may play back only excerpts of one or more audio recordings 136 rather than the audio recordings 136. A user that creates a playlist 144 of excerpts may be more interested in playing back the excerpts of audio recordings 136 than the audio recordings 136. In turn, server 130 need not transmit content to electronic device 122 for the entire duration of audio recordings 136. Creating and playing back a playlist 144 of excerpts thus reduces the consumption of computing resources (e.g., of electronic device 122 and server 130), such as processing cycles and network traffic. A user can also play back only the excerpts of one or more audio recordings 136 in which the user is interested, and/or share the playlist 144 of those excerpts with others. Other uses of a playlist 144 of excerpts provide further advantages as discussed later herein.
Moreover, creating a playlist 144 as described herein provides several advantages. Creating a playlist 144 of excerpts as described is more efficient than other ways of creating a playlist. For example, implementations allow a user to make a selection that identifies a keyword of interest 112 and add to a playlist 144 data that identifies an excerpt that includes a mention of the keyword of interest 112. This is more efficient than the user selecting a start and end position in a scrubber bar 104 of a media player 100 to select the mention of the keyword of interest 112, not only in time but in computing resources and network traffic (e.g., because the user need not search manually to find the start and end position in the media player 100, and thus the media player 100 need not cue and recue audio for playback, etc.).
Also, a user can create a playlist 144 using a media player 100 for playing audio, which facilitates the selection of excerpts to be included in the playlist 144. Media player 100 also lends itself to creating a playlist 144 of excerpts that include mentions of keywords 110. Implementations of media player 100 allow a user to select one or more keywords of interest 112, responsive to which mentions are indicated in a scrubber bar 104 via carets 116. The user can add corresponding excerpts to a playlist 144 by selecting a UI element 114 in media player 100. Implementations may support adding one, some, or all excerpts that include mentions of the one or more keywords of interest 112, as described later herein. Thus, media player 100 provides an intuitive and useful user interface for creating a playlist 144.
Playlist Data Structures
As
Block 270 includes adding, to a playlist 144, data that identifies an excerpt 248, from an audio recording 136, that includes a mention of a keyword of interest 112. Block 270 includes block 272. In block 272, an identifier 102 for a first audio recording 136 is added to the playlist 144. In one implementation, block 270 includes block 274. From block 272, flow passes to block 274.
Block 274 includes block 276 and/or block 278. In block 276, an identifier 260 for the keyword of interest is included in the data to locate the excerpt 256 in the audio recording 136. From block 276, flow passes to block 278.
In block 278, an indication of a position of the mention 264 of the keyword of interest 112 in the audio recording 136 is included in the data to locate the excerpt 256. In one implementation, the indication of the position of the mention 264 is an index 266 of the mention of the keyword of interest 112 in the audio recording 136 (per block 280). In another implementation, the indication of the position of the mention 264 is an offset 268; e.g., an offset relative to a beginning of the audio recording 136 (per block 282).
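The two forms of the indication of position 264 described above, an index 266 or an offset 268, might be resolved to a playback position as in the following sketch. This is a hypothetical Python sketch; the data shapes are illustrative assumptions, not part of the described implementations.

```python
def resolve_offset(indication, mention_offsets):
    """Turn an indication of a position of a mention 264 into an offset in
    seconds: either it already carries an offset 268 relative to the beginning
    of the recording, or it carries an index 266 into the recording's known
    mention offsets (e.g., from metadata 142)."""
    if "offset" in indication:
        return indication["offset"]
    return mention_offsets[indication["index"]]
```

A media player could seek to the resolved offset (or slightly before it) to begin playback of the excerpt.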
Different combinations of the elements of data shown in
One Keyword
One Audio Recording: All Mentions
One way of capturing data is shown in
A playlist 144A that includes one or all mentions of a keyword of interest 112 in an audio recording 136 may be created in different situations. In one implementation of media player 100, a user may make selections: 1) of an audio recording 136 for playback by a media player 100; 2) identifying a keyword of interest 112; and 3) of a UI element 114. Referring back to
In another situation, a user may make selections: 1) of an audio recording 136 for playback by a media player 100; 2) identifying a keyword of interest 112; 3) of a caret 116 in the media player 100 from carets 116 that indicate mentions of the keyword of interest 112 in the audio recording 136; and 4) of a UI element 114. Referring back to
One Audio Recording: Less Than All Mentions
Another way of capturing data is shown in
A playlist 144B that includes multiple excerpts corresponding to multiple mentions of a keyword of interest 112 in an audio recording 136 may be created in different situations. In one situation, a playlist 144B is created when a user makes selections: 1) of an audio recording 136 for playback by a media player 100; 2) identifying a keyword of interest 112; and 3) of a UI element 114. This situation might occur when media player 100 is configured such that selection of UI element 114 creates a playlist with the first n mentions of a keyword of interest 112 from an audio recording 136 (where n is a positive integer).
Referring back to
Put differently, 1) the first audio recording 136 and the second audio recording 136 are the same (i.e., identified by the same identifier 102 with a value of “VC-00000013”); and 2) the second keyword of interest 112 and the first keyword of interest 112 are the same (i.e., have the same value of “Comp. 1”). In one implementation, block 184 includes block 186, and the value of “VC-00000013” is added to data that identifies an excerpt 248B. In another implementation, block 186 need not be executed because in playlist 144B, data that identifies an excerpt 248A and data that identifies an excerpt 248B each include the same identifier 102A for the audio recording 136. For example, playlist 144B may be stored such that both data that identifies an excerpt 248A and data that identifies an excerpt 248B are associated with the one identifier 102A for an audio recording 136. In one implementation, block 184 includes block 188. In block 188, data to locate the second excerpt (i.e., of the n excerpts) in the second audio recording (i.e., audio recording 136 with an ID with a value of “VC-00000013”) is added to playlist 144B (e.g., an index 266B; an offset 268B; etc.).
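Storing playlist 144B such that multiple excerpt locators share one identifier 102 might be sketched as follows. This is a hypothetical Python sketch; keying the playlist by recording identifier is an illustrative assumption, not part of the described implementations.

```python
def add_locator(playlist, recording_id, locator):
    """Store the playlist keyed by identifier 102 so that several excerpt
    locators (e.g., an index 266 or offset 268 each) for the same audio
    recording 136 share one identifier rather than repeating it."""
    playlist.setdefault(recording_id, []).append(locator)
    return playlist
```

With this shape, adding a second excerpt from the same recording appends a locator without re-adding the identifier, consistent with implementations in which block 186 need not be executed.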
In another situation, a playlist 144B is created when a user makes selections: 1) of an audio recording 136 for playback by a media player 100 for playing audio; 2) identifying a keyword of interest 112; 3) of a caret 116 in the media player 100 from carets 116 that indicate mentions of the keyword of interest 112 in the audio recording 136; and 4) of a UI element 114.
Additionally, or alternatively, a playlist 144B may be created when a user selects multiple carets 116 in media player 100 and selects UI element 114. Specifically, a playlist 144B may be created when a user makes selections: 1) of an audio recording 136 for playback by a media player 100; 2) identifying a keyword of interest 112; 3) of multiple carets 116A-C that indicate mentions of the keyword of interest 112 in the audio recording 136; and 4) of a UI element 114. Referring back to
Multiple Audio Recordings
In some implementations, playlist 144 includes data that identifies excerpts 248 in multiple audio recordings 136.
A playlist 144C that includes excerpts corresponding to multiple mentions of the same keyword of interest 112 in multiple audio recordings 136 may be created in different situations. For example, a user of media player 100 might cause a search to be performed for audio recordings 136 that include one or more mentions of the keyword of interest 112, and make selections of audio recordings 136 from the results of that search (e.g., from search results 520A-G shown in
Referring back to
Multiple Keywords
One Audio Recording
In some implementations, playlist 144 includes data that identifies multiple excerpts that include different keywords of interest 112.
A playlist 144D that includes excerpts corresponding to mentions of different keywords of interest 112 in an audio recording 136 may be created in different situations. For example, a user of media player 100 might make selections that identify multiple keywords of interest 112 (i.e., select multiple UI elements 108) when an audio recording 136 is selected. In one situation, a playlist 144D is created when a user makes the following selections: 1) of an audio recording 136 for playback by a media player 100; 2) identifying a first keyword of interest 112; 3) identifying a second keyword of interest 112; and 4) of the UI element 114. Referring back to
For another example, a playlist 144D may be created when a user makes selections 1) of an audio recording 136 for playback by a media player 100; 2) identifying a first keyword of interest 112; 3) of a caret 116 in the media player 100 that indicates a mention of the first keyword of interest 112; 4) identifying a second keyword of interest 112; 5) of a caret 116 in the media player 100 that indicates a mention of the second keyword of interest 112; and 6) of the UI element 114. Referring back to
Multiple Audio Recordings
In some implementations, playlist 144 includes data that identifies multiple excerpts, from different audio recordings 136, that include different keywords of interest 112. An example of such a playlist 144 can be described referring back to
An example of a user creating such a playlist 144 is when a user of media player 100 makes selections that identify multiple keywords of interest 112 (i.e., select multiple UI elements 108) when different audio recordings 136 are selected; e.g., when a user makes the following selections: 1) of an audio recording 136 for playback by a media player 100; 2) identifying a keyword of interest 112; 3) of a UI element 114; 4) of another audio recording 136; 5) identifying another keyword of interest 112; and 6) of the UI element 114. Referring back to
Use of Positions
Referring back to
An indication of a position of a mention 264 might correspond to different positions relative to an excerpt. For example, an indication of a position of a mention 264 may correspond to a starting position for an excerpt, an ending position for an excerpt, a position of a mention of a keyword of interest 112 in an excerpt, etc. In one implementation, an indication of a position of a mention 264 is a predetermined period of time (e.g., 5 s, 10 s, etc.) before the mention of the keyword of interest 112 in an audio recording 136. In another implementation, an indication of a position of a mention 264 is such that the excerpt includes a start of a sentence, a start of a paragraph, etc. that includes the mention. Such implementations provide more context around a mention of a keyword 110 in an excerpt, in turn making the excerpt more useful.
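For instance, choosing a starting position a predetermined period of time before the mention can be sketched as follows. The helper is hypothetical, and the 5-second default is one of the example values above:

```python
def excerpt_start(mention_offset_s, lead_time_s=5.0):
    """Start the excerpt a predetermined period before the mention,
    clamped so the start never precedes the start of the recording."""
    return max(0.0, mention_offset_s - lead_time_s)

excerpt_start(42.5)   # 37.5
excerpt_start(3.0)    # 0.0 (mention occurs within the first 5 seconds)
```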
Using a Playlist
Implementations may support various uses for a playlist 144, such as a media player (such as media player 100) playing back a playlist 144, creating an audio recording 136 based on a playlist 144, and/or creating a transcript 138 based on a playlist 144.
Block 405 includes retrieving an excerpt, from an audio recording 136, that includes a mention of a keyword of interest 112. Block 405 includes block 410, block 450, and block 470.
In block 410, an offset 268, in the audio recording 136, is identified for the mention of the keyword of interest 112. In some implementations, block 410 includes one or more of block 415, block 420, block 425, and block 440.
Block 415 includes determining whether data that identifies the excerpt 248 includes an indication of a position of the mention 264 of the keyword of interest 112 in the audio recording 136. Responsive to determining that the data that identifies the excerpt 248 does include an indication of a position of the mention 264, flow passes from block 415 to block 420. In contrast, responsive to determining that the data that identifies the excerpt 248 does not include an indication of a position of the mention 264, flow passes from block 415 to block 425.
In block 425, one or more offsets 268 are identified for one or more respective mentions of the keyword of interest 112. For example, a playlist 144 may store an identifier 260 for a keyword of interest and not an indication of a position of a mention 264 (e.g., as discussed referring to
Block 420 includes determining a type of the indication of the position of the mention 264. Responsive to determining that the type of the indication of the position of the mention 264 is an index 266 (i.e., of the mention of the keyword 110 in the audio recording 136), flow passes from block 420 to block 440. Responsive to determining that the type of the indication of the position of the mention 264 is an offset 268, flow passes from block 420 to block 450.
In block 440, the offset 268 for the mention of the keyword of interest 112 is identified based on the index 266 of the mention of the keyword of interest 112. In one implementation, the offset 268 is identified from metadata 142 for an audio recording 136, as previously discussed. From block 440, flow passes to block 450.
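The flow of blocks 415 through 440 can be sketched as follows. The shapes of the excerpt data and of the metadata (here, a mapping from a keyword identifier to the offsets of its mentions) are assumptions for illustration:

```python
def resolve_offsets(excerpt, metadata):
    """Resolve time offsets for a mention, mirroring blocks 415-440.
    `excerpt` and `metadata` shapes are illustrative assumptions."""
    position = excerpt.get("position")            # indication of position of the mention, if any
    if position is None:
        # Block 425: no indication stored; identify offsets for every mention of the keyword
        return metadata[excerpt["keyword_id"]]
    if position["type"] == "index":
        # Block 440: map the mention's index to an offset via recording metadata
        return [metadata[excerpt["keyword_id"]][position["value"]]]
    # Block 420 determined the indication is already an offset; pass it to block 450
    return [position["value"]]

metadata = {"kw-product": [12.0, 42.5, 90.25]}    # offsets of each mention, in seconds
resolve_offsets({"keyword_id": "kw-product", "position": {"type": "index", "value": 1}}, metadata)  # [42.5]
```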
In block 450, the offset 268 for the mention of the keyword of interest 112 is optionally adjusted. Block 450 includes block 455, in which it is determined whether the offset 268 is to be adjusted. Whether the offset 268 is to be adjusted may be determined in different ways. In one implementation, data that identifies an excerpt 248 includes a flag that indicates whether an indication of a position of the mention 264 has been adjusted (e.g., by a predetermined period of time, such as to include a sentence that includes the mention of the keyword of interest 112, etc.), and whether the offset 268 is to be adjusted may be determined based on a value of the flag (e.g., the offset 268 is not to be adjusted if the flag indicates that the offset 268 has already been adjusted, and the offset 268 is to be adjusted if the flag indicates that the offset 268 has not already been adjusted). In another implementation, whether the offset 268 is to be adjusted is based on a configuration or a default setting. For example, a configuration or default setting may indicate that an offset 268 is to be adjusted if an excerpt does not include a predetermined period of time before the mention of the keyword 110, or if an excerpt does not include a start of a sentence that includes the mention of the keyword 110. Additionally, or alternatively, an implementation may detect whether an excerpt includes a predetermined period of time before the mention of the keyword 110, or a start of a sentence that includes the mention of the keyword 110, and determine whether the offset 268 is to be adjusted accordingly. For example, an implementation may retrieve the excerpt based on an unadjusted offset 268 for the mention of the keyword of interest 112, analyze the audio to determine the position of the mention in the excerpt (e.g., at the beginning of the excerpt, after a period of time, at the start of a sentence, etc.), and then determine whether the offset 268 is to be adjusted.
In one implementation, block 450 includes block 460. In block 460, responsive to determining that the offset 268 is to be adjusted, the offset 268 for the mention of the keyword of interest 112 is adjusted by a predetermined period of time. Implementations may adjust an offset 268 by a predetermined period of time in different ways (e.g., if the offset 268 is a time offset, by subtracting the predetermined period of time from the offset 268; if the offset 268 is a data offset, by identifying an amount of data that corresponds to the predetermined period of time and subtracting that from the offset 268, etc.).
In another implementation, block 450 includes block 465. In block 465, responsive to determining that the offset 268 is to be adjusted, the offset 268 is adjusted such that the excerpt includes a start of a sentence that includes the mention of the keyword of interest 112. Implementations may adjust an offset 268 such that the excerpt includes a start of a sentence in different ways (e.g., by analyzing the audio to determine a start of the sentence and determining the offset 268 of the start of the sentence; analyzing a transcript for the audio recording 136 to determine a start of the sentence and determining the offset 268 of the start of the sentence, etc.). From block 465, flow passes to block 470.
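Blocks 460 and 465 can be sketched together as follows. The transcript representation (a list of sentence start times paired with sentences) is an assumption for illustration:

```python
def adjust_offset(offset_s, transcript=None, lead_time_s=5.0):
    """Adjust an offset per blocks 460/465: either back by a predetermined
    period of time, or back to the start of the sentence containing the
    mention. `transcript` is assumed to be (start_seconds, sentence) pairs."""
    if transcript is None:
        # Block 460: subtract a predetermined period of time
        return max(0.0, offset_s - lead_time_s)
    # Block 465: find the latest sentence that starts at or before the mention
    starts = [start for start, _ in transcript if start <= offset_s]
    return max(starts) if starts else 0.0

transcript = [(0.0, "Hello."), (30.0, "Our product beats the competitor."), (60.0, "Thanks.")]
adjust_offset(42.5)              # 37.5 (predetermined 5-second adjustment)
adjust_offset(42.5, transcript)  # 30.0 (start of the sentence containing the mention)
```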
In block 470, the excerpt is retrieved based on the offset 268 for the mention of the keyword of interest 112. In some implementations, retrieving the excerpt includes retrieving the audio recording 136 identified by the identifier 102 for the audio recording 136 (e.g., from a server 130 as shown in
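For raw audio data, converting a time offset into a data offset and slicing out the excerpt might look like the following sketch; the sample rate, sample width, and mono PCM format are illustrative assumptions:

```python
def retrieve_excerpt(audio, offset_s, duration_s, sample_rate=16000, bytes_per_sample=2):
    """Slice raw mono PCM audio bytes for an excerpt, converting the
    time offset into a data offset (parameters are illustrative)."""
    start = int(offset_s * sample_rate) * bytes_per_sample
    end = start + int(duration_s * sample_rate) * bytes_per_sample
    return audio[start:end]

audio = bytes(16000 * 2 * 10)            # 10 seconds of silent 16 kHz mono PCM
len(retrieve_excerpt(audio, 2.0, 3.0))   # 96000 bytes, i.e., 3 seconds of audio
```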
Playback of a Playlist
In one implementation, a playlist 144 can be played back in block 480 based on the excerpts retrieved in block 470. In another implementation, block 470 and block 400 are executed concurrently. For example, an implementation may play back, or buffer for later playback, audio for an excerpt after the excerpt is retrieved in block 470 and before all the excerpts of a playlist 144 are retrieved in block 400.
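This overlap of retrieval and playback can be sketched as a simple loop in which each excerpt is handed to the player as soon as it is retrieved; the `retrieve` and `play` callables are hypothetical stand-ins for the retrieval and playback machinery:

```python
def stream_excerpts(playlist, retrieve, play):
    """Play back (or buffer) each excerpt as soon as it is retrieved,
    rather than waiting for the whole playlist to be retrieved."""
    for entry in playlist:
        # Playback of this excerpt begins before later excerpts are fetched.
        play(retrieve(entry))

played = []
stream_excerpts(["a", "b"], retrieve=str.upper, play=played.append)
# played == ["A", "B"], each appended immediately after its retrieval
```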
Different implementations may include support for block 400 in a media player 100 (e.g., in code 124 as shown in
Creation of an Audio Recording
Implementations may also support a playlist 144 being stored as an audio recording 136 in block 490 based on the excerpts retrieved in block 470. Such implementations may discard the remainder of the audio recordings 136 on which the excerpts are based. Such an implementation is advantageous in that it reduces the storage used to store the excerpts from the audio recordings 136, in turn improving the performance of and/or reducing the requirements of the electronic devices (e.g., server 130) and networks used for this purpose.
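Concatenating the retrieved excerpts into a single recording can be sketched as follows. Raw byte concatenation is an illustrative simplification; a real implementation would also account for container formats and headers:

```python
def playlist_to_recording(excerpts):
    """Store a playlist as a single audio recording by concatenating the
    retrieved excerpts; the full source recordings can then be discarded."""
    return b"".join(excerpts)

# Two short excerpts retained in place of much larger source recordings:
excerpts = [bytes(2000), bytes(3000)]
recording = playlist_to_recording(excerpts)
# len(recording) == 5000
```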
Creating a Transcript
Implementations may also support a transcript being created for a playlist 144, in block 495. Referring back to
It should be noted that a portion of a transcript corresponding to an excerpt might include a sentence, a paragraph, etc. that includes the mention of a keyword of interest 112. In one implementation, this inclusion in the portion of the transcript occurs regardless of whether the excerpt includes the sentence, the paragraph, etc. Put differently, an implementation may support including fewer, more, or the same words or utterances in a portion of a transcript corresponding to an excerpt than the words or utterances spoken in the excerpt. For example, an implementation may support including, in a portion of a transcript corresponding to an excerpt, a whole sentence that includes a mention of a keyword of interest 112 regardless of whether the excerpt includes the whole sentence.
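Extracting the whole sentence that contains a mention from a transcript can be sketched as follows; the punctuation-based sentence split is an illustrative simplification:

```python
import re

def sentence_for_mention(transcript_text, keyword):
    """Return the whole sentence containing the first mention of `keyword`,
    even if the audio excerpt itself cuts the sentence short (a simple
    punctuation-based sentence split, for illustration only)."""
    for sentence in re.split(r"(?<=[.!?])\s+", transcript_text):
        if keyword.lower() in sentence.lower():
            return sentence
    return None

text = "Thanks for calling. The Acme widget ships Friday. Anything else?"
sentence_for_mention(text, "acme widget")   # "The Acme widget ships Friday."
```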
Other Functionality
Selection Via Search Results
GUI 500 includes UI element 505, UI element 510, and UI element 515. In one implementation, UI element 505 is a search bar that allows a user to perform a search for audio recordings 136 that include one or more mentions of the keyword of interest 112. As shown in
UI element 510 allows a user to filter search results 520, before or after a search is performed, such that the search results 520 include only one or more participants selected in UI element 510. For example, in one implementation, a user may select in UI element 510 an identifier for a participant (i.e., “Kathy” or “Jesse” as shown in
UI element 515 allows a user to filter search results 520, before or after a search is performed, such that the search results 520 include only one or more audio recordings 136 that are dated in a selected period of time. For example, in one implementation, a user may select in UI element 515 a period of time (e.g., 1 hour, 4 hours, 1 day, 2 days, 5 days, 1 week, etc.), such that search results 520 only include audio recordings 136 that are dated in that period of time (e.g., the audio recordings 136 are stored in that period of time, concluded in that period of time, etc.).
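Filtering search results by a selected period of time can be sketched as follows; the result shape and the `dated` key are assumptions for illustration:

```python
from datetime import datetime, timedelta

def filter_by_period(results, period, now=None):
    """Keep only search results whose audio recording is dated within the
    selected period of time (e.g., stored or concluded in that period)."""
    now = now or datetime.now()
    cutoff = now - period
    return [r for r in results if r["dated"] >= cutoff]

now = datetime(2019, 11, 19, 12, 0)
results = [
    {"id": "rec-1", "dated": now - timedelta(hours=3)},
    {"id": "rec-2", "dated": now - timedelta(days=2)},
]
filter_by_period(results, timedelta(days=1), now=now)   # only rec-1 remains
```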
Referring to
The column under the heading “Review” for search results 520A-G includes, for each search result 520, a respective one of user interface elements 525A-G. In one implementation, each of UI elements 525 allows a user to select the audio recording 136 to which that search result 520 corresponds. For example, selecting the top-most UI element 525A shown in
Referring to
Example Electronic Devices and Environments
Electronic Device and Machine-Readable Media
One or more parts of the above implementations may include software and/or a combination of software and hardware. An electronic device (also referred to as a computing device, computer, etc.) includes hardware and software, such as a set of one or more processors coupled to one or more machine-readable storage media (e.g., magnetic disks, optical disks, read only memory (ROM), Flash memory, phase change memory, solid state drives (SSDs)) to store code (which is composed of software instructions and which is sometimes referred to as computer program code or a computer program) for execution on the set of processors and/or to store data. For instance, an electronic device may include non-volatile memory (with slower read/write times, e.g., magnetic disks, optical disks, read only memory (ROM), Flash memory, phase change memory, SSDs) and volatile memory (e.g., dynamic random access memory (DRAM), static random access memory (SRAM)), where the non-volatile memory persists code/data even when the electronic device is turned off or when power is otherwise removed, and the electronic device copies that part of the code that is to be executed by the set of processors of that electronic device from the non-volatile memory into the volatile memory of that electronic device during operation because volatile memory typically has faster read/write times. As another example, an electronic device may include a non-volatile memory (e.g., phase change memory) that persists code/data when the electronic device is turned off, and that has sufficiently fast read/write times such that, rather than copying the part of the code/data to be executed into volatile memory, the code/data may be provided directly to the set of processors (e.g., loaded into a cache of the set of processors); in other words, this non-volatile memory operates as both long term storage and main memory, and thus the electronic device may have no or only a small amount of volatile memory for main memory. 
In addition to storing code and/or data on machine-readable storage media, typical electronic devices can transmit code and/or data over one or more machine-readable transmission media (also called a carrier) (e.g., electrical, optical, radio, acoustical or other form of propagated signals—such as carrier waves, infrared signals). For instance, typical electronic devices also include a set of one or more physical network interface(s) to establish network connections (to transmit and/or receive code and/or data using propagating signals) with other electronic devices. Thus, an electronic device may store and transmit (internally and/or with other electronic devices over a network) code and/or data with one or more machine-readable media (also referred to as computer-readable media).
Electronic devices (also referred to as devices) are designed for and/or used for a variety of purposes, and different terms may reflect those purposes (e.g., user devices, network devices). Some user devices are designed to mainly be operated as servers (sometimes referred to as server devices), while others are designed to mainly be operated as clients (sometimes referred to as client devices, client computing devices, client computers, or end user devices; examples of which include desktops, workstations, laptops, personal digital assistants, smartphones, wearables, augmented reality (AR) devices, virtual reality (VR) devices, etc.). The software executed to operate a user device (typically a server device) as a server may be referred to as server software or server code, while the software executed to operate a user device (typically a client device) as a client may be referred to as client software or client code. A server provides one or more services to (also referred to as serves) one or more clients.
The term “user” refers to an entity (e.g., an individual person) that uses an electronic device, and software and/or services may use credentials to distinguish different accounts associated with the same and/or different users. Users can have one or more roles, such as administrator, programmer/developer, and end user roles. As an administrator, a user typically uses electronic devices to administer them for other users, and thus an administrator often works directly and/or indirectly with server devices and client devices.
During operation an instance of the software 628 (illustrated as instance 606A and also referred to as a software instance; and in the more specific case of an application, as an application instance) is executed. In electronic devices that use compute virtualization, the set of one or more processor(s) 622 typically execute software to instantiate a virtualization layer 608 and software container(s) 604A-R (e.g., with operating system-level virtualization, the virtualization layer 608 may represent a container engine (such as Docker Engine by Docker, Inc. or rkt in Container Linux by Red Hat, Inc.) running on top of (or integrated into) an operating system, and it allows for the creation of multiple software containers 604A-R (representing separate user space instances and also called virtualization engines, virtual private servers, or jails) that may each be used to execute a set of one or more applications; with full virtualization, the virtualization layer 608 represents a hypervisor (sometimes referred to as a virtual machine monitor (VMM)) or a hypervisor executing on top of a host operating system, and the software containers 604A-R each represent a tightly isolated form of a software container called a virtual machine that is run by the hypervisor and may include a guest operating system; with para-virtualization, an operating system and/or application running with a virtual machine may be aware of the presence of virtualization for optimization purposes). Again, in electronic devices where compute virtualization is used, during operation an instance of the software 628 is executed within the software container 604A on the virtualization layer 608. In electronic devices where compute virtualization is not used, the instance 606A on top of a host operating system is executed on the “bare metal” electronic device 600. 
The instantiation of the instance 606A, as well as the virtualization layer 608 and software containers 604A-R if implemented, are collectively referred to as software instance(s) 602.
Alternative implementations of an electronic device may have numerous variations from that described above. For example, customized hardware and/or accelerators might also be used in an electronic device.
Example Environment
The system 640 is coupled to user devices 680A-S over a network 682. The service(s) 642 may be on-demand services that are made available to one or more of the users 684A-S working for one or more entities other than the entity which owns and/or operates the on-demand services (those users sometimes referred to as outside users) so that those entities need not be concerned with building and/or maintaining a system, but instead may make use of the service(s) 642 when needed (e.g., when needed by the users 684A-S). The service(s) 642 may communicate with each other and/or with one or more of the user devices 680A-S via one or more APIs (e.g., a REST API). The user devices 680A-S are operated by users 684A-S.
In some implementations the system 640 is a multi-tenant system (also known as a multi-tenant architecture). The term multi-tenant system refers to a system in which various elements of hardware and/or software of the system may be shared by one or more tenants. A multi-tenant system may be operated by a first entity (sometimes referred to as a multi-tenant system provider, operator, or vendor; or simply a provider, operator, or vendor) that provides one or more services to the tenants (in which case the tenants are customers of the operator and sometimes referred to as operator customers). A tenant includes a group of users who share a common access with specific privileges. The tenants may be different entities (e.g., different companies, different departments/divisions of a company, and/or other types of entities), and some or all of these entities may be vendors that sell or otherwise provide products and/or services to their customers (sometimes referred to as tenant customers). A multi-tenant system may allow each tenant to input tenant specific data for user management, tenant-specific functionality, configuration, customizations, non-functional properties, associated applications, etc. A tenant may have one or more roles relative to a system and/or service. For example, in the context of a CRM system or service, a tenant may be a vendor using the CRM system or service to manage information the tenant has regarding one or more customers of the vendor. As another example, in the context of Data as a Service (DAAS), one set of tenants may be vendors providing data and another set of tenants may be customers of different ones or all of the vendors' data. As another example, in the context of Platform as a Service (PAAS), one set of tenants may be third party application developers providing applications/services and another set of tenants may be customers of different ones or all of the third-party application developers.
Multi-tenancy can be implemented in different ways. In some implementations, a multi-tenant architecture may include a single software instance (e.g., a single database instance) which is shared by multiple tenants; other implementations may include a single software instance (e.g., database instance) per tenant; yet other implementations may include a mixed model; e.g., a single software instance (e.g., an application instance) per tenant and another software instance (e.g., database instance) shared by multiple tenants.
In one implementation, the system 640 is a multi-tenant cloud computing architecture supporting multiple services, such as one or more of the following:
For example, system 640 may include an application platform 644 that enables PAAS for creating, managing, and executing one or more applications developed by the provider of the application platform 644, users accessing the system 640 via one or more of user electronic devices 680A-S, or third-party application developers accessing the system 640 via one or more of user electronic devices 680A-S.
In some implementations, one or more of the service(s) 642 may use one or more multi-tenant databases 646, as well as system data storage 650 for system data 652 accessible to system 640. In certain implementations, the system 640 includes a set of one or more servers that are running on server electronic devices and that are configured to handle requests for any authorized user associated with any tenant (there is no server affinity for a user and/or tenant to a specific server). The user electronic devices 680A-S communicate with the server(s) of system 640 to request and update tenant-level data and system-level data hosted by system 640, and in response the system 640 (e.g., one or more servers in system 640) automatically may generate one or more Structured Query Language (SQL) statements (e.g., one or more SQL queries) that are designed to access the desired information from the one or more multi-tenant databases 646 and/or system data storage 650.
In some implementations, the service(s) 642 are implemented using virtual applications dynamically created at run time responsive to queries from the user electronic devices 680A-S and in accordance with metadata, including: 1) metadata that describes constructs (e.g., forms, reports, workflows, user access privileges, business logic) that are common to multiple tenants; and/or 2) metadata that is tenant specific and describes tenant specific constructs (e.g., tables, reports, dashboards, interfaces, etc.) and is stored in a multi-tenant database. To that end, the program code 660 may be a runtime engine that materializes application data from the metadata; that is, there is a clear separation of the compiled runtime engine (also known as the system kernel), tenant data, and the metadata, which makes it possible to independently update the system kernel and tenant-specific applications and schemas, with virtually no risk of one affecting the others. Further, in one implementation, the application platform 644 includes an application setup mechanism that supports application developers' creation and management of applications, which may be saved as metadata by save routines. Invocations to such applications, including the media player service, may be coded using Procedural Language/Structured Object Query Language (PL/SOQL) that provides a programming language style interface. Invocations to applications may be detected by one or more system processes, which manage retrieving application metadata for the tenant making the invocation and executing the metadata as an application in a software container (e.g., a virtual machine).
Network 682 may be any one or any combination of a LAN (local area network), WAN (wide area network), telephone network, wireless network, point-to-point network, star network, token ring network, hub network, or other appropriate configuration. The network may comply with one or more network protocols, including an Institute of Electrical and Electronics Engineers (IEEE) protocol, a 3rd Generation Partnership Project (3GPP) protocol, a 4th generation wireless protocol (4G) (e.g., the Long Term Evolution (LTE) standard, LTE Advanced, LTE Advanced Pro), a fifth generation wireless protocol (5G), and/or similar wired and/or wireless protocols, and may include one or more intermediary devices for routing data between the system 640 and the user electronic devices 680A-S.
Each user electronic device 680A-S (such as a desktop personal computer, workstation, laptop, Personal Digital Assistant (PDA), smart phone, augmented reality (AR) device, virtual reality (VR) device, etc.) typically includes one or more user interface devices, such as a keyboard, a mouse, a trackball, a touch pad, a touch screen, a pen, or the like, or video or touch-free user interfaces, for interacting with a GUI provided on a display (e.g., a monitor screen, a liquid crystal display (LCD), a head-up display, a head-mounted display, etc.) in conjunction with pages, forms, applications and other information provided by system 640. For example, the user interface device can be used to access data and applications hosted by system 640, to perform searches on stored data, and otherwise allow a user 684 to interact with various GUI pages that may be presented to a user 684. User electronic devices 680A-S might communicate with system 640 using TCP/IP (Transmission Control Protocol/Internet Protocol) and, at a higher network level, use other networking protocols to communicate, such as HyperText Transfer Protocol (HTTP), Andrew File System (AFS), Wireless Application Protocol (WAP), File Transfer Protocol (FTP), Network File System (NFS), an application program interface (API) based upon protocols such as SOAP, REST, etc. In an example where HTTP is used, one or more user electronic devices 680A-S might include an HTTP client, commonly referred to as a “browser,” for sending and receiving HTTP messages to and from server(s) of system 640, thus allowing users 684 of the user electronic devices 680A-S to access, process and view information, pages and applications available to them from system 640 over network 682.
In the above description, numerous specific details such as resource partitioning/sharing/duplication implementations, types and interrelationships of system components, and logic partitioning/integration choices are set forth in order to provide a more thorough understanding. The invention may be practiced without such specific details, however. In other instances, control structures, logic implementations, opcodes, means to specify operands, and full software instruction sequences have not been shown in detail since those of ordinary skill in the art, with the included descriptions, will be able to implement what is described without undue experimentation.
References in the specification to “one implementation,” “an implementation,” “an example implementation,” “some implementations,” “other implementations,” etc., indicate that the implementation(s) described may include a particular feature, structure, or characteristic, but every implementation may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same implementation. Further, when a particular feature, structure, and/or characteristic is described in connection with an implementation, one skilled in the art would know to effect such feature, structure, and/or characteristic in connection with other implementations whether or not explicitly described.
For example, the figure(s) illustrating flow diagrams sometimes refer to the figure(s) illustrating block diagrams, and vice versa. Whether or not explicitly described, the alternative implementations discussed with reference to the figure(s) illustrating block diagrams also apply to the implementations discussed with reference to the figure(s) illustrating flow diagrams, and vice versa. At the same time, the scope of this description includes implementations, other than those discussed with reference to the block diagrams, for performing the flow diagrams, and vice versa.
Bracketed text and blocks with dashed borders (e.g., large dashes, small dashes, dot-dash, and dots) may be used herein to illustrate optional operations and/or structures that add additional features to some implementations. However, such notation should not be taken to mean that these are the only options or optional operations, and/or that blocks with solid borders are not optional in certain implementations.
The detailed description and claims may use the term “coupled,” along with its derivatives. “Coupled” is used to indicate that two or more elements, which may or may not be in direct physical or electrical contact with each other, co-operate or interact with each other.
While the flow diagrams in the figures show a particular order of operations performed by certain implementations, such order is exemplary and not limiting (e.g., alternative implementations may perform the operations in a different order, combine certain operations, perform certain operations in parallel, overlap performance of certain operations such that they are partially in parallel, etc.).
While the above description includes several example implementations, the invention is not limited to the implementations described and can be practiced with modification and alteration within the spirit and scope of the appended claims. The description is thus illustrative instead of limiting.
This application claims the benefit of U.S. Provisional Application No. 62/937,786, filed Nov. 19, 2019, which is hereby incorporated by reference.