Contemporary mobile devices are used for many types of user applications, including running interactive applications and listening to music or other audio (e.g., broadcasts). Audio output is generally something a user often wants to have performed in the background, e.g., after setting up a playlist or other audio content, the user wants to be able to listen to the audio while still being able to use the device features and/or perform other foreground tasks.
To implement a background audio scenario, the system needs to let a process run in the background and play audio. Current solutions have one or more issues with such a scenario, including consuming too much battery and/or other system resources, providing poor (if any) integration with the system user experience/interface (UX), and/or the possibility of introducing a security threat to the system. Further, playback may stop unexpectedly due to resource depletion.
As a result, one solution is to use a “first party” application as the background audio program (where, as used herein, “first party” generally refers to trusted code such as that provided by the operating system vendor, and “third party” refers to applications from other vendors, which may or may not be trustworthy). However, this limits the device system by precluding third party applications from performing background audio playback and providing different user experiences, while consuming resources for the first party application, and so forth.
This Summary is provided to introduce a selection of representative concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used in any way that would limit the scope of the claimed subject matter.
Briefly, various aspects of the subject matter described herein are directed towards a technology by which a media service plays audio in a background process on a mobile device, as initially directed by a foreground (e.g., third party) application. The application, when in the foreground, communicates via an interface with the media service, including to provide information to the media service that corresponds to audio data (e.g., an audio track) to play. The media service plays the audio, and acts upon requests directed towards the audio playback as the media service plays the background audio.
For example, the requests directed towards the audio playback may correspond to user actions, such as play, pause, skip, stop, skip next, skip previous, seek, fast forward, rewind, rating-related actions, shuffle and/or repeat requests. The requests directed towards the audio playback may be to provide state information, such as playing, paused, stopped, fast forwarding, rewinding, buffering started, buffering stopped, track ready and/or track ended state.
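By way of a non-limiting illustration, the user actions and play states enumerated above may be modeled as two enumerations. The following Python sketch is hypothetical (the patent describes behavior, not code, and the names here are not taken from any actual implementation):

```python
from enum import Enum, auto

class UserAction(Enum):
    """Hypothetical requests a user may direct at background audio playback."""
    PLAY = auto()
    PAUSE = auto()
    SKIP = auto()
    STOP = auto()
    SKIP_NEXT = auto()
    SKIP_PREVIOUS = auto()
    SEEK = auto()
    FAST_FORWARD = auto()
    REWIND = auto()
    RATE = auto()       # rating-related actions
    SHUFFLE = auto()
    REPEAT = auto()

class PlayState(Enum):
    """Hypothetical playback states the media service may report."""
    PLAYING = auto()
    PAUSED = auto()
    STOPPED = auto()
    FAST_FORWARDING = auto()
    REWINDING = auto()
    BUFFERING_STARTED = auto()
    BUFFERING_STOPPED = auto()
    TRACK_READY = auto()
    TRACK_ENDED = auto()
```

An agent or application would receive a `UserAction` in its callbacks and observe `PlayState` transitions from the media service.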
In one aspect, the media service operates to launch an agent that provides the requests directed towards the audio playback. The foreground application may be deactivated, with the media service continuing the audio playback in the background. The media service may cause the agent to be re-launched as needed to obtain additional audio information, e.g., more tracks to play.
In one aspect, a universal volume control (e.g., system) component provides requests directed towards the audio playback. The application (when in the foreground) may provide information that determines the operation of the universal volume control component. The media service provides information that may be presented on a user interface of the universal volume control component, e.g., text (title, artist), images and so forth, such as obtained from the application and/or agent.
In one aspect, a source agent may be configured to output audio data that the media service processes into the audio playback. The source agent may use a (e.g., custom) codec, decryption mechanism, decompression mechanism, and/or a proprietary protocol to provide the audio data. The source agent may output the audio data to a shared memory for processing by the media service.
The information that corresponds to the audio data to play may be associated with a control flag that indicates via attribute settings whether the media service is allowed to take a particular action with respect to the audio playback. For example, the flag may include attributes that allow/deny skip next, skip previous, fast forward, pause, and/or rewind actions for any media item that is playing or queued to be played.
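The control flag described above may be sketched as a bitwise flag set, where a set attribute allows the corresponding action. This is a minimal, hypothetical Python illustration (the actual flag representation is an implementation detail of the media service):

```python
from enum import Flag

class MediaControlFlag(Flag):
    """Hypothetical per-item control attributes; a set bit allows the action."""
    NONE = 0
    SKIP_NEXT = 1
    SKIP_PREVIOUS = 2
    FAST_FORWARD = 4
    PAUSE = 8
    REWIND = 16
    ALL = 31  # all actions allowed

def is_allowed(flags: MediaControlFlag, action: MediaControlFlag) -> bool:
    """Return True if the given action is permitted for the media item."""
    return bool(flags & action)

# Example: an advertisement track that may be paused but not skipped.
ad_flags = MediaControlFlag.PAUSE
```

With `ad_flags` set this way, the system user interface would enable the pause control but disable skip-next and skip-previous for that item.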
Other advantages may become apparent from the following detailed description when taken in conjunction with the drawings.
The present invention is illustrated by way of example and not limitation in the accompanying figures, in which like reference numerals indicate similar elements and in which:
Various aspects of the technology described herein are generally directed towards a technology in which a mobile device or the like includes a background audio service. To play audio, an application (e.g., third party application) sends a request via the background audio service to a media service, which performs the playback. By providing a system service rather than allowing an untrusted application process to operate in the background, more security and stability are provided, with a known amount of impact on the system, while allowing third party applications to direct the background audio playback and playback-related operations.
It should be understood that any of the examples herein are non-limiting. As such, the present invention is not limited to any particular embodiments, aspects, concepts, structures, functionalities or examples described herein. Rather, any of the embodiments, aspects, concepts, structures, functionalities or examples described herein are non-limiting, and the present invention may be used in various ways that provide benefits and advantages in computing and mobile devices in general.
The background audio service 108 supports a basic playback mode and a custom codec mode. The basic playback mode can help the device save battery, and makes coding of the application relatively simple. In the custom codec mode, described below with reference to
In one implementation, the application 104 calls into the background audio service 108 with a request to play audio. The media service 102 is notified, which in turn communicates with an application instance manager 110 (a system service) to launch the player agent 106. The communication between the media service 102 and the application instance manager 110 requests that resources be reserved for the agent 106, and notifies the application instance manager 110 that the player agent 106 is a time-critical resource (so that in typical operating circumstances, the resource is not subject to interruptions that, if present, would provide a poor user experience). In general, the player agent 106 is independent of the application 104, as either one may remain operational while the other is not, both may operate at the same time, or neither may be operational at a given time.
In one implementation, the player agent 106 makes the actual playback requests and other playback-related requests (e.g., skip, rewind and so forth) on behalf of the application 104. This allows the application 104 to be removed from the foreground and so forth. In an alternative implementation, the requests may be shared between the agent and the application, with the application choosing which requests it keeps for itself and which are delegated to the agent, for example. Note that having a player agent be responsible for handling requests provides an advantage in that the requests may originate via the application or other system services, and thus there are no conflicts, and the user may use the system even after the application has been terminated or otherwise deactivated.
In general, the media service 102 via a media service proxy/translator 112 notifies the application instance manager 110 on various events, such as on a user action and any play state change; (note that the media service proxy/translator 112 alternatively may be incorporated into the media service 102). Example user actions include play, pause, skip, stop, skip next, skip previous, seek, fast forward, rewind, rating-related actions, shuffle and/or repeat. Example play states include playing, paused, stopped, fast forwarding, rewinding, buffering started, buffering stopped, track ready and/or track ended.
Each time the media service wants to communicate with the player agent 106, the media service 102 instructs the application instance manager 110 to re-launch the player agent 106 if needed; note that this allows the application instance manager 110 to leave the player agent 106 in memory if desired, to put the player agent 106 into a dormant state (retained in memory but unable to run code until activated), or fully terminate the player agent 106 and re-launch an instance as needed. Once notified, the media service and the player agent 106 may communicate (via the background audio service 108) to perform further actions, such as to request a next track, notify the agent that the user has performed some action related to audio, e.g., by interfacing with the application 104 or a system-provided UVC (universal volume control) 114, and so forth.
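The re-launch policy just described (leave in memory, make dormant, or fully terminate and re-launch) may be sketched as a small state machine. The following Python illustration is hypothetical; the class and state names are invented for clarity:

```python
from enum import Enum

class AgentState(Enum):
    NOT_LOADED = "not_loaded"  # fully terminated
    DORMANT = "dormant"        # retained in memory but unable to run code
    RUNNING = "running"

class ApplicationInstanceManager:
    """Minimal sketch of the player agent launch policy (names hypothetical)."""
    def __init__(self):
        self.state = AgentState.NOT_LOADED

    def ensure_running(self):
        # Re-launch only if needed: a dormant agent is reactivated in
        # place, while a terminated agent requires a full launch.
        if self.state is AgentState.NOT_LOADED:
            self.state = AgentState.RUNNING  # full launch
        elif self.state is AgentState.DORMANT:
            self.state = AgentState.RUNNING  # reactivate in memory

    def make_dormant(self):
        if self.state is AgentState.RUNNING:
            self.state = AgentState.DORMANT

    def terminate(self):
        self.state = AgentState.NOT_LOADED
```

Each time the media service needs the agent (e.g., for a next track), it would call `ensure_running()` before communicating.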
Also shown in
It should be noted that the mobile device may output the audio in any suitable form. For example, the mobile device may output the audio via internal speakers, via a headphone jack to a headset, via wireless or wired communication to another device (e.g., over Wi-Fi to an external sound system), and the like. The media service 102 may be considered as providing input source information to a sink.
As represented in
When launched, the control agent 206 makes requests to the media service 102 (arrow (4)), including a request for a particular source agent 222. The source agent may have been previously loaded onto the device and maintained thereon, or downloaded on demand. As represented via the arrows labeled (5) and (6), the application instance manager launches the source agent 222.
The source agent 222 requests that the media service play a track (arrows (7a) and (7b)), informing the media service 102 that the source agent 222 will provide (e.g., stream) the properly formatted content. The source agent responds to the control agent 206 that the track is ready (arrows (8a) and (8b)). The source agent 222 provides the properly formatted audio content (arrows (9a) and (9b)) to a playing component of the media service 102 (DShow 226, incorporated into or coupled to the media service 102). In one implementation, the audio content is buffered via a shared memory 228, which is efficient because further data transfer/copying is avoided.
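The source-agent-to-shared-memory path may be sketched as follows. This Python illustration is hypothetical; in particular, the XOR transform is a placeholder standing in for a custom codec or decryption mechanism, not a real codec:

```python
import io

class SharedMemoryBuffer:
    """Stand-in for the shared memory 228 (a BytesIO for this sketch)."""
    def __init__(self):
        self._buf = io.BytesIO()

    def write(self, data: bytes):
        self._buf.write(data)

    def read_all(self) -> bytes:
        return self._buf.getvalue()

class SourceAgent:
    """Sketch: a source agent applies its (custom) codec/decryption and
    writes playable bytes to shared memory for the media service."""
    def __init__(self, shared: SharedMemoryBuffer):
        self.shared = shared

    def decode(self, raw: bytes) -> bytes:
        # Placeholder transform representing proprietary decode/decrypt.
        return bytes(b ^ 0xFF for b in raw)

    def stream(self, raw: bytes):
        # Decode and hand off via shared memory; the media service reads
        # from the same region, avoiding an extra copy across processes.
        self.shared.write(self.decode(raw))
```

The media service's playing component would consume `read_all()` (or, more realistically, read incrementally) without the data passing through an additional IPC copy.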
As can be seen, in a basic playback mode as in
In a custom codec mode as in
Turning to another aspect, as described above the device system provides a UVC 114 control/user interface by which the user is able to control audio playback independent of the application 104 or 204. In general, at any desired time, the user presses a hardware button to bring up the UVC 114 control/user interface.
Note that in one implementation, when the application 104 or 204 is in the foreground, the application may disable, limit or otherwise integrate with the UVC, such as to show its controls instead of or in addition to the UVC's controls, add text or images (e.g., album art) to the UVC user interface, and so forth. The background audio service API 108 enables an application to implement its own logic when it receives user interaction. User interactions with the UVC system user experience (such as skipping a track) are handled by the application. Applications have the opportunity to disable/enable the UVC playback control button, and/or change song metadata such as title, artist, and album art.
To this end, the media service 102 sends notifications to application foreground code, first party background code 116 and the UVC 114 when the playback state changes or other pertinent events occur to ensure the components have the correct metadata and playback status. For playback position, the application is able to query the background service. This notification design enables applications, the UVC 114, and the first party player 116 to obtain correct playback status and metadata, without introducing significant, unnecessary cross-process traffic.
When the application is in the background, the user may interface with the audio via the UVC 114. For example, the user may increase or decrease the volume, pause the track, skip the track and so forth. The media service 102 is instructed by the agent (or application) via a “Nowplaying” token or the like as to what may be displayed on the UVC user interface, e.g., text (title, artist, album name and/or the like), an image, a logo, and so forth. The UVC 114 communicates (e.g., directly) with the media service 102, including to subscribe to status changes (e.g., playback changes, such as play/pause/stopped play states as well as item changed (track switching) notifications, whereby UVC pulls the relevant data, such as title/artist/album from media services for display) and the like. Via the application/agent, the media service has information on what is playing now, and when the track changes, will instruct the UVC 114 to update the title (new track name). If the media service queues multiple tracks, the application/agent need not be involved until the queue needs to be updated/changed. Note that in one alternative, instead of sending a single track to play, the application/agent may send a playlist of multiple tracks. This may be more efficient, to avoid communicating with (including possibly launching) the agent for each new track.
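The subscribe-and-pull notification design described above may be sketched as a simple publish/subscribe arrangement, in which the UVC registers for status changes and pulls the now-playing metadata on each notification. All names in this Python sketch are hypothetical:

```python
class MediaService:
    """Sketch of the notification design: subscribers (e.g., the UVC)
    register callbacks and pull now-playing metadata on each change."""
    def __init__(self):
        self._subscribers = []
        self.now_playing = {}  # e.g., {"title": ..., "artist": ...}

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def set_now_playing(self, **metadata):
        # Called (directly or indirectly) by the application/agent,
        # e.g., via a "Nowplaying" token, when the track changes.
        self.now_playing = metadata
        for callback in self._subscribers:
            callback("item_changed")  # notify; subscribers pull the data

class UniversalVolumeControl:
    """Sketch of the UVC pulling title/artist/etc. for display."""
    def __init__(self, service: MediaService):
        self.service = service
        self.display = {}
        service.subscribe(self.on_status_change)

    def on_status_change(self, event):
        # Pull model: on any notification, fetch the relevant metadata.
        self.display = dict(self.service.now_playing)
```

Because the UVC pulls only on change notifications, cross-process traffic stays proportional to actual state changes rather than polling frequency.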
Further, each media item (e.g., audio track) is associated with attributes in a media control flag that the application/agent may set, and which control what the UVC 114 is able to do with respect to a track. For example, the application may specify that a track (such as an advertisement, for example) cannot be skipped. Another application may let a user skip some limited number of tracks before having to listen to one fully, before again allowing skipping. In general, the media control flag includes attributes to allow/deny skip next, skip previous, fast forward, pause, and rewind.
Turning to integration with the device's telephone, when there is an incoming call, the media service is notified. In one implementation, the background music volume is decreased, and a ringtone sound is audibly mixed with the background music. A user may configure the relative volume levels. The mixed audio output of background music and ringtone continues until the call attempt ends in some way, e.g., is ignored (by explicit user action or when the call attempt terminates) or is answered (by the user or automatically). If the user presses ‘ignore’, the ringtone stops and the background audio continues playing, e.g., at its previous volume level. If the user answers, the audio is paused (if allowed by the media control flag, or silenced if not, for example). When a user-answered call ends, the audio resumes playing.
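The call-handling behavior just described may be sketched as a small mixer state machine. This Python illustration is hypothetical; the volume values and state names are invented for the sketch:

```python
class CallAudioMixer:
    """Sketch of telephone integration: duck the music and mix in the
    ringtone on an incoming call, pause (or silence) on answer, and
    restore on ignore or call end."""
    def __init__(self, music_volume=1.0, duck_volume=0.3, pause_allowed=True):
        self.music_volume = music_volume    # user-configured level
        self.duck_volume = duck_volume      # user-configured ducked level
        self.pause_allowed = pause_allowed  # from the media control flag
        self.state = "playing"
        self.volume = music_volume
        self.ringtone_active = False

    def on_incoming_call(self):
        # Decrease background music and mix in the ringtone.
        self.volume = self.duck_volume
        self.ringtone_active = True

    def on_ignore(self):
        # Ringtone stops; background audio resumes its previous level.
        self.ringtone_active = False
        self.volume = self.music_volume

    def on_answer(self):
        self.ringtone_active = False
        if self.pause_allowed:
            self.state = "paused"
        else:
            self.volume = 0.0  # silenced if pause is denied by the flag

    def on_call_ended(self):
        # A user-answered call ending resumes playback.
        if self.state == "paused":
            self.state = "playing"
            self.volume = self.music_volume
```

The `pause_allowed` parameter models the media control flag's pause attribute; when pausing is denied, the sketch silences rather than pauses, per the example above.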
1. Application creates service request (“play this track and call me back when you need the next one”)
2. Service starts playing the track
3. User closes application
4. Current track ends
5. Service asks system to call the agent to get the next track
6. System starts new process and invokes the agent
7. Agent performs logic and provides the next track information
8. System suspends or kills the agent
9. Current track ends . . . (back to step 5)
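The loop in steps 1 through 9 above may be sketched as follows. This Python illustration is hypothetical; in a real system the agent runs in a separate, re-launchable process, whereas here it is an in-process object for clarity:

```python
class PlayerAgent:
    """Hypothetical agent that supplies the next track when called back
    (steps 6-7 above)."""
    def __init__(self, playlist):
        self._playlist = list(playlist)
        self._index = -1

    def get_next_track(self):
        self._index += 1
        if self._index < len(self._playlist):
            return self._playlist[self._index]
        return None  # no more tracks

class PlaybackService:
    """Sketch of the service side of steps 1-9: play tracks, re-invoking
    the agent each time the current track ends (step 5)."""
    def __init__(self, agent: PlayerAgent):
        self.agent = agent
        self.played = []

    def run(self):
        track = self.agent.get_next_track()
        while track is not None:
            self.played.append(track)                # step 2: play track
            track = self.agent.get_next_track()      # steps 4-7: next one
```

Note that the application itself never appears in the loop; once the queue is seeded, playback proceeds entirely between the service and the agent, which is what allows step 3 (user closes the application) without interrupting audio.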
For third party application background audio playback, the application 304 operates to contact its servers on the web 334. This typically includes authenticating the user and retrieving any user data. For this example, assume the content to be played is represented by an HTTP URL that points to some server. When the application 304 decides that it is time to begin playing music, the application may perform the following example steps:
1. Create an instance of the background service 108 (including the API set).
2. As part of its initialization, the background service 108 calls the media service 102 to create a queue for the application 304. This call creates the queue if one does not already exist for this application.
3. The application 304 checks to see if the background audio service 108 has a current track. (For this example, assume there is not already a queue.)
4. There is not a current track, so the application 304 performs its operations to get a track from the server (which may include authenticating the user, and so forth).
5. For this example, assume the server returns an HTTP URL that points to the track to play.
6. The application 304 creates an AudioTrack object (or other suitable data structure), passing in the URL, track name, and any other metadata as desired.
7. The application 304 passes the new AudioTrack object to the background audio service 108 via an appropriate function call.
8. The background audio service 108 creates a new media item via an appropriate function call.
9. The background audio service 108 queries the given AudioTrack for the URI which it sets on the item.
10. The background audio service 108 adds the track to the media service/queue.
11. The application calls the play function of the background audio service 108.
12. The background audio service 108 initiates playback by calling the media service.
13. The application receives events it is interested in.
14. Shell tells the application it is closing.
15. The application destroys its background audio service 108 object, and the media service continues to play the track.
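Steps 1 through 12 of the sequence above may be sketched as follows. All names in this Python illustration are hypothetical (e.g., `AudioTrack` stands in for whatever data structure the platform provides), and the example URL is invented:

```python
class AudioTrack:
    """Hypothetical data structure holding the URL and metadata (step 6)."""
    def __init__(self, url, title, **metadata):
        self.url = url
        self.title = title
        self.metadata = metadata

class MediaServiceStub:
    """Minimal stand-in for the media service with per-application queues."""
    def __init__(self):
        self.queues = {}
        self.playing = None

    def ensure_queue(self, app_id):
        self.queues.setdefault(app_id, [])  # create only if absent

    def add(self, app_id, track):
        self.queues[app_id].append(track)

    def play(self, app_id):
        self.playing = self.queues[app_id][0]

class BackgroundAudioService:
    """Sketch of the background audio service API across steps 1-12."""
    def __init__(self, app_id, media_service: MediaServiceStub):
        self.app_id = app_id
        self.media_service = media_service
        media_service.ensure_queue(app_id)          # step 2: create queue

    def current_track(self):
        queue = self.media_service.queues[self.app_id]
        return queue[0] if queue else None          # step 3: check track

    def add_track(self, track: AudioTrack):
        self.media_service.add(self.app_id, track)  # steps 7-10: add item

    def play(self):
        self.media_service.play(self.app_id)        # steps 11-12: play
```

Because the queue lives in the media service rather than the application, step 15 (destroying the application's service object) leaves playback unaffected.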
Resuming playback (persisted playlist queues) is another scenario. A user may tap the power button to turn the screen on. Without unlocking the phone, the user may press the volume up key to bring up the UVC and tap the play button, whereupon the station corresponding to the application starts playing again, e.g., the same song from where it was stopped earlier. The following is an example:
1. The agent receives an “OnUserAction” callback. This callback's parameters provide a reference to the current track, with the action being “Play”. Assume that there is no current track because the queue was destroyed at some point.
2. The agent notices that there is no current track, whereby the agent goes to its persistent store to read in the title, source, and position of the last track it played.
3. The agent creates a new AudioTrack and sets the metadata accordingly.
4. The agent sets where the audio is to resume by calling a “progress” function of the background audio service 108, e.g., “BackgroundAudioPlayer.Progress.”
5. The agent passes the new AudioTrack object to the BackgroundAudioPlayer.
6. The agent calls background audio service 108.
7. Playback resumes where the user left off (assuming that the track is still available on the service).
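The resume flow in steps 1 through 7 above may be sketched as follows. This Python illustration is hypothetical: the dict-backed store stands in for the agent's isolated storage, and the `player` dict stands in for the background audio service object:

```python
class IsolatedStorage:
    """Dict-backed stand-in for the agent's persistent store."""
    def __init__(self):
        self._data = {}

    def save(self, **kw):
        self._data.update(kw)

    def load(self):
        return dict(self._data)

def on_user_action(action, current_track, store: IsolatedStorage, player: dict):
    """Sketch of steps 1-7: on 'Play' with no current track, the agent
    rebuilds the last track from its persisted title/source/position."""
    if action == "Play" and current_track is None:
        saved = store.load()                    # step 2: read persisted state
        track = {"title": saved["title"],       # step 3: new track + metadata
                 "source": saved["source"]}
        player["progress"] = saved["position"]  # step 4: set resume position
        player["track"] = track                 # step 5: hand track to player
        player["state"] = "playing"             # steps 6-7: resume playback
```

The agent would have populated the store earlier, e.g., in a shutdown callback, so that the queue can be destroyed between sessions without losing the user's place.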
Resuming applications (get queue state) is another example, where a user decides to double-check a song title that he or she cannot remember via the application. The user unlocks the device, navigates to the application list, and taps on the application icon. The application launches and displays the currently playing artist and the current song playback time counter. An “Events” label may be selected, which for example shows that a concert with the artist is upcoming. The following is an example:
1. The application creates an instance of the background audio service 108.
2. As part of its initialization, the background audio service 108 checks with the media service to see if there is already an application queue.
3. The media service reports that there is a background queue for this application.
4. The background audio service 108 retrieves the media item value for the currently playing track and creates a new AudioTrack object that contains the media item value.
5. The application checks to see if there is a current track, which in this example there is.
6. The application gets the current track and queries for the title.
7. Because this is the first metadata query, the AudioTrack object reaches the media service and obtains the available metadata. Then it returns the title.
8. The application calls a function (BackgroundAudioPlayer.PlayState) to get the current play state (Playing).
9. Because the content is playing, the application calls a function (AudioTrack.Duration) to get the track's duration.
10. The application calls a function (BackgroundAudioPlayer.Progress) to get the current position.
11. The application updates its UI with this information.
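The queue-state query in steps 1 through 11 above may be sketched as follows. The Python names here are hypothetical stand-ins for the platform calls mentioned in the steps (`PlayState`, `Duration`, `Progress`):

```python
class MediaServiceState:
    """Minimal stand-in exposing per-application queues and status."""
    def __init__(self):
        self.queues = {}         # app_id -> {"current": {...}}
        self.play_state = "stopped"
        self.progress = 0

def resume_application_view(service: MediaServiceState, app_id):
    """Sketch of steps 1-11: a relaunched application rebuilds its UI
    from the existing background queue."""
    queue = service.queues.get(app_id)
    if not queue:
        return None                          # steps 2-3: no queue exists
    track = queue["current"]                 # steps 4-7: current track/title
    return {
        "title": track["title"],
        "state": service.play_state,         # step 8: PlayState query
        "duration": track["duration"],       # step 9: Duration query
        "position": service.progress,        # step 10: Progress query
    }
```

The returned dict corresponds to the information the application would render in step 11 (artist/title and the playback time counter).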
Switching playback between applications also may be performed. While a song/station is still playing via the previous application, the user navigates into the first party media player (e.g., Zune®) application, for example, and taps on podcasts. The user taps the play button next to a new episode, causing the previous station to automatically stop playing and the new podcast playback to start. The podcast plays for a while, until the user decides to tune into other radio content. While the podcast is still playing, the user navigates to the application list and launches a different application, where the user finds a desired radio station and taps the play icon. The podcast automatically stops playing and the user now hears the desired radio station. The following is an example:
1. The original background audio agent receives the save-related (OnPlayStateChanged/Shutdown) callback.
2. The original background audio agent grabs the current track from the given background audio service instance and saves off the title, source, current position, and so forth. It saves these values in its isolated storage 330.
3. To free up resources, the original background audio agent calls a function of the background audio service (BackgroundAudioPlayer.Close) which instructs the media service to delete the application's queue. Note that in the event that the original background audio playback was stopped due to the user playing a different media, the media service frees up the resource for the original application/agents as part of the Shutdown.
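The hand-off in steps 1 through 3 above may be sketched as follows. This Python illustration is hypothetical; a plain dict stands in for the agent's isolated storage 330:

```python
class MediaServiceQueues:
    """Stand-in holding per-application queues in the media service."""
    def __init__(self):
        self.queues = {}

    def delete_queue(self, app_id):
        # Corresponds to BackgroundAudioPlayer.Close in step 3: the
        # media service frees the application's queue resources.
        self.queues.pop(app_id, None)

def on_shutdown(store: dict, current_track, position, service, app_id):
    """Sketch of steps 1-3: the outgoing agent persists its playback
    state and closes, freeing its queue for the new application."""
    store.update(title=current_track["title"],      # step 2: save state
                 source=current_track["source"],
                 position=position)
    service.delete_queue(app_id)                    # step 3: release queue
```

The persisted `title`/`source`/`position` are exactly what the resume flow described earlier reads back when the user later returns to this application's audio.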
Playlist control (skipping) is yet another desirable feature. A user begins streaming music, and then goes into a Games hub to select a game, for example. After a few minutes of playing the game, an undesirable song is played; while in the gaming application, the user taps the volume up button making the UVC controls appear. The user may then tap the skip icon to move onto another song. The following is an example:
1. The audio background agent's OnUserAction callback is fired. The UserAction enumeration is set to SkipNext.
2. The audio background agent calls an internal method or the like that determines that the user is allowed to skip the current track (e.g., based on the company's business model/rules).
3. The audio background agent queries the web servers for the URL of the next track to play.
4. The audio background agent creates a new AudioTrack object and sets the URL and other pertinent metadata.
5. The audio background agent uses the background audio service's object that was also passed in as one of OnUserAction's parameters to set the new AudioTrack as the new current track. This action causes the currently playing track to stop.
6. The audio background agent calls Play to begin playing the new track.
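The skip flow in steps 1 through 6 above may be sketched as follows. This Python illustration is hypothetical; `fetch_next_url` stands in for the agent's query to its web servers, and `SkipRules` stands in for the company's business model/rules in step 2:

```python
class SkipRules:
    """Stand-in for the application's skip business rules (step 2)."""
    def __init__(self, skips_remaining=3):
        self.skips_remaining = skips_remaining

    def may_skip(self) -> bool:
        if self.skips_remaining <= 0:
            return False
        self.skips_remaining -= 1
        return True

def on_user_action(action, player: dict, rules: SkipRules, fetch_next_url):
    """Sketch of steps 1-6 of the skip flow; returns True if the skip
    was performed."""
    if action != "SkipNext":                      # step 1: check the action
        return False
    if not rules.may_skip():                      # step 2: business rules
        return False
    url = fetch_next_url()                        # step 3: query the server
    player["track"] = {"url": url}                # steps 4-5: new current
                                                  # track; stops the old one
    player["state"] = "playing"                   # step 6: Play
    return True
```

A rules object with `skips_remaining=0` models the "cannot skip until a track plays fully" policy mentioned earlier; the agent would reset the allowance on a track-ended notification.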
With reference to
Components of the mobile device 500 may include, but are not limited to, a processing unit 505, system memory 510, and a bus 515 that couples various system components including the system memory 510 to the processing unit 505. The bus 515 may include any of several types of bus structures including a memory bus, memory controller, a peripheral bus, and a local bus using any of a variety of bus architectures, and the like. The bus 515 allows data to be transmitted between various components of the mobile device 500.
The mobile device 500 may include a variety of computer-readable media. Computer-readable media can be any available media that can be accessed by the mobile device 500 and includes both volatile and nonvolatile media, and removable and non-removable media. By way of example, and not limitation, computer-readable media may comprise computer storage media and communication media. Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules, or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by the mobile device 500.
Communication media typically embodies computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. The term “modulated data signal” means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, Bluetooth®, Wireless USB, infrared, WiFi, WiMAX, and other wireless media. Combinations of any of the above should also be included within the scope of computer-readable media.
The system memory 510 includes computer storage media in the form of volatile and/or nonvolatile memory and may include read only memory (ROM) and random access memory (RAM). On a mobile device such as a cell phone, operating system code 520 is sometimes included in ROM although, in other embodiments, this is not required. Similarly, application programs 525 are often placed in RAM although again, in other embodiments, application programs may be placed in ROM or in other computer-readable memory. The heap 530 provides memory for state associated with the operating system 520 and the application programs 525. For example, the operating system 520 and application programs 525 may store variables and data structures in the heap 530 during their operations.
The mobile device 500 may also include other removable/non-removable, volatile/nonvolatile memory. By way of example,
In some embodiments, the hard disk drive 536 may be connected in such a way as to be more permanently attached to the mobile device 500. For example, the hard disk drive 536 may be connected to an interface such as parallel advanced technology attachment (PATA), serial advanced technology attachment (SATA) or otherwise, which may be connected to the bus 515. In such embodiments, removing the hard drive may involve removing a cover of the mobile device 500 and removing screws or other fasteners that connect the hard drive 536 to support structures within the mobile device 500.
The removable memory devices 535-537 and their associated computer storage media, discussed above and illustrated in
A user may enter commands and information into the mobile device 500 through input devices such as a key pad 541 and the microphone 542. In some embodiments, the display 543 may be a touch-sensitive screen and may allow a user to enter commands and information thereon. The key pad 541 and display 543 may be connected to the processing unit 505 through a user input interface 550 that is coupled to the bus 515, but may also be connected by other interface and bus structures, such as the communications module(s) 532 and wired port(s) 540. Motion detection 552 can be used to determine gestures made with the device 500.
A user may communicate with other users by speaking into the microphone 542 and via text messages that are entered on the key pad 541 or a touch sensitive display 543, for example. The audio unit 555 may provide electrical signals to drive the speaker 544 as well as receive and digitize audio signals received from the microphone 542.
The mobile device 500 may include a video unit 560 that provides signals to drive a camera 561. The video unit 560 may also receive images obtained by the camera 561 and provide these images to the processing unit 505 and/or memory included on the mobile device 500. The images obtained by the camera 561 may comprise video, one or more images that do not form a video, or some combination thereof.
The communication module(s) 532 may provide signals to and receive signals from one or more antenna(s) 565. One of the antenna(s) 565 may transmit and receive messages for a cell phone network. Another antenna may transmit and receive Bluetooth® messages. Yet another antenna (or a shared antenna) may transmit and receive network messages via a wireless Ethernet network standard.
Still further, an antenna provides location-based information, e.g., GPS signals to a GPS interface and mechanism 572. In turn, the GPS mechanism 572 makes available the corresponding GPS data (e.g., time and coordinates) for processing.
In some embodiments, a single antenna may be used to transmit and/or receive messages for more than one type of network. For example, a single antenna may transmit and receive voice and packet messages.
When operated in a networked environment, the mobile device 500 may connect to one or more remote devices. The remote devices may include a personal computer, a server, a router, a network PC, a cell phone, a media playback device, a peer device or other common network node, and typically includes many or all of the elements described above relative to the mobile device 500.
Aspects of the subject matter described herein are operational with numerous other general purpose or special purpose computing system environments or configurations. Examples of well known computing systems, environments, and/or configurations that may be suitable for use with aspects of the subject matter described herein include, but are not limited to, personal computers, server computers, hand-held or laptop devices, multiprocessor systems, microcontroller-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
Aspects of the subject matter described herein may be described in the general context of computer-executable instructions, such as program modules, being executed by a mobile device. Generally, program modules include routines, programs, objects, components, data structures, and so forth, which perform particular tasks or implement particular abstract data types. Aspects of the subject matter described herein may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
Furthermore, although the term server may be used herein, it will be recognized that this term may also encompass a client, a set of one or more processes distributed on one or more computers, one or more stand-alone storage devices, a set of one or more other devices, a combination of one or more of the above, and the like.
While the invention is susceptible to various modifications and alternative constructions, certain illustrated embodiments thereof are shown in the drawings and have been described above in detail. It should be understood, however, that there is no intention to limit the invention to the specific forms disclosed, but on the contrary, the intention is to cover all modifications, alternative constructions, and equivalents falling within the spirit and scope of the invention.
The present application claims priority to U.S. provisional patent applications Ser. Nos. 61/442,701, 61/442,713, 61/442,735, 61/442,740 and 61/442,753, each filed Feb. 14, 2011 and hereby incorporated by reference. The present application is related to U.S. patent applications attorney docket nos. 332296.02, 332297.02, 332320.02 and 332340.02, assigned to the assignee of the present invention, and hereby incorporated by reference.