The present disclosure relates generally to public address systems, and more particularly to mobile devices receiving information from such public address systems.
Almost all mobile device users have had the experience of being distracted in a setting such as a train station, airport or public gathering and missing public announcements being made over a public address system. An additional experience that most mobile device users have had is being unable to hear such announcements in a noisy environment. In some cases, a user is focused on surrounding entertainment or in conversation with another person. Alternatively, a distracted user may be listening to music or taking a call. Because of these issues, the intended audience sometimes misses public announcements in train stations, airports and restaurants.
In other public venues, such as schools and shopping malls, emergency broadcasts may be transmitted that require immediate evacuation. A distracted user may miss the announcement and be placed in a dangerous situation.
Briefly, a disclosed mobile device listens, via at least one microphone and internal audio equipment, for audio public announcements broadcast in the surrounding area over a public address system. The mobile device is operative to send its location information to a server and, in response, obtain at least one audio template from the server. Low power operations, such as basic audio recognition, run on the mobile device while the mobile device's primary processor is in a sleep mode. A basic audio recognition engine, operating while the processor is in a sleep mode, listens for an audio trigger that matches either a predetermined audio trigger defined by the audio template or a portion of the audio template. Detection of the trigger wakes the processor in order to display text versions of the audio public announcements on the mobile device display. In some embodiments, an audio template received from the server may have an associated text version attached, which can be directly displayed on the mobile device display when the associated audio trigger is detected. Alternatively, the mobile device can store an audio file of the audio public announcement in memory and perform a voice-to-text operation to convert the audio file into a displayable text version. Examples of audio public announcements may include, but are not limited to, sounds, spoken words (either human speech or synthesized speech), music, combinations thereof, and the like.
One disclosed method includes running audio recognition on a mobile device that is operating in a low power state, waking a processor in response to detecting an audio trigger corresponding to an audio public announcement at a location of the mobile device, receiving the audio public announcement and displaying a text version of the audio public announcement on a display of the mobile device. In some embodiments, the audio templates, which define the audio trigger related to a public address system, may be obtained from a server based on location of the mobile device. An audio trigger may be a spoken word, phrase, sound or combination thereof from a public address system.
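The control flow of this method can be pictured with a brief sketch. The following Python is purely illustrative; every class, method and trigger name is an assumption rather than part of the disclosure, and a string match stands in for matching all or part of an audio template.

```python
# Illustrative sketch only: a toy model of the disclosed method's flow.
# Trigger words stand in for the audio templates obtained from a server.
from dataclasses import dataclass, field

@dataclass
class MobileDevice:
    audio_triggers: set                   # short triggers from the templates
    processor_awake: bool = False         # main processor starts in sleep mode
    display: list = field(default_factory=list)

    def low_power_listen(self, audio_frame: str) -> bool:
        """Basic audio recognition run while the main processor sleeps."""
        return any(trigger in audio_frame for trigger in self.audio_triggers)

    def handle_announcement(self, announcement: str) -> None:
        """Wake the processor on a trigger and display a text version."""
        if self.low_power_listen(announcement):
            self.processor_awake = True
            self.display.append(announcement)

device = MobileDevice(audio_triggers={"train", "flight"})
device.handle_announcement("train 42 now arriving on platform 3")
```

An actual implementation would compare audio signatures rather than strings, but the wake-then-display ordering is the same.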
A disclosed mobile device includes at least one microphone, a display and a first processor, operatively coupled to the microphone and to the display. The first processor is operative to receive an audio public announcement and display a text version of the audio public announcement on the display. The disclosed mobile device also includes a second processor, operatively coupled to the first processor, the microphone and to the display. The second processor is operative to run audio recognition while the first processor operates in a low power state, and wake the first processor from the low power state in response to an audio trigger detected using the microphone. The audio trigger corresponds to an audio public announcement at a location of the mobile device.
The first processor is further operative to obtain an audio template from a server based on location of a mobile device. The audio template is used to define the audio trigger related to a public address system at the location.
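A minimal sketch of this location-based template fetch follows, with the server modeled as a plain dictionary; the location names and template phrases are invented for illustration.

```python
# Toy "server": maps a location to the audio templates for the public
# address system at that location. All entries are illustrative.
TEMPLATES_BY_LOCATION = {
    "central-station": ["train approaching", "train departing"],
    "airport": ["now boarding", "final call"],
}

def fetch_audio_templates(location: str) -> list:
    """Send the device's location; receive templates for that location."""
    return TEMPLATES_BY_LOCATION.get(location, [])

templates = fetch_audio_templates("central-station")
```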
Turning now to the drawings wherein like numerals represent like components,
The terms “audio public announcement” and “public announcement” as used herein refer to an audio announcement broadcast openly, using a public address system having speakers at the location of the mobile device. Examples of locations having such public address systems are airports, train stations, bus stations, hotel lobbies, restaurant waiting areas, etc. Examples of public announcements may include, but are not limited to, voice announcements made in train or bus stations or airports, as well as information provided by signs, banners, sound, music, video, combinations thereof, and the like. Voice announcements may be actual human speech or may be synthesized speech. For example, train stations usually have voice public announcements when a train is approaching the station; these may inform passengers of the train number, destination, boarding and departure times, or some subset of these. Many such systems employ synthesized speech provided by the public address system rather than having a human announcer at each train station. In another example, airports usually have voice public announcements when a plane is ready for passenger boarding. In other words, an “audio public announcement” is a “public announcement” made over a public address system and broadcast over one or more speakers of the public address system. The public announcement may be made by a person speaking over the public address system or may be an automated message that is played over the public address system using a text-to-voice converter used to simulate a human speaker (i.e. synthesized speech). Text files used for automated public announcements may be stored in components of a public address system such as public address system 113 shown in
In
In the example illustrated in
Thus
In one example embodiment illustrated by
In another embodiment illustrated by
An example method of operation of the mobile device 201 shown in
The audio templates 207 are audio signatures, but the server 210 may also send audio stream files and text message files that correspond to the audio templates 207 such that the mobile device 201 may store the files in memory. In operation block 409, the mobile device 201 listens for audio public announcements by comparing received audio with all or segments of the audio templates 207. When an audio template 207 matches audio received by the mobile device 201, the mobile device 201 takes some further action.
In other words, the mobile device 201 uses the audio templates 207 as audio triggers that trigger specific actions by the mobile device 201. For example, when the mobile device 201 is placed in a sleep mode to conserve power, the audio templates 207 may be used as audio triggers that wake the mobile device 201 from the sleep mode such that it may perform some further action. In operation block 411, the mobile device 201 may detect an audio trigger where the audio trigger is defined as all, or a portion of, one of the audio templates 207 received from the server 210. At operation block 413, the mobile device 201 main processor is activated (i.e., woken from sleep mode) in response to detecting the audio trigger. In operation block 415, the mobile device 201 may display the public announcement as text, play an audio stream stored in memory, or both. The mobile device 201 may also continue to sample the audio public announcement in some embodiments. As discussed above, speech-to-text or text-to-speech conversions may be involved in these operations in some embodiments depending on the information available to the mobile device 201 in conjunction with the audio templates 207.
Another method of operation of a mobile device in accordance with an embodiment is illustrated in
Microphones and speakers 613, which include at least one speaker, and at least one microphone, are operatively coupled to audio equipment 615. The audio equipment 615 may include, among other things, signal amplification, analog-to-digital conversion/digital audio sampling, echo cancellation, other audio processing, etc., which may be applied to one or more microphones and/or one or more speakers of the mobile device 600.
All of the mobile device 600 components shown are operatively coupled to the processor 601 by one or more internal communication buses 602. In some embodiments, the separate sensor processor 619 (rather than the main processor or application processor such as processor 601) monitors sensor data from various sensors including a gyroscope 621 and an accelerometer 623 as well as other sensors 625. The gyroscope 621 and accelerometer 623 may be separate or may be combined into a single integrated unit.
The memory 603 is non-volatile and non-transitory and stores executable code for an operating system 627 that, when executed by the processor 601, provides an application layer (or user space), libraries (also referred to herein as “application programming interfaces” or “APIs”) and a kernel. The memory 603 also stores executable code for various applications 629, such as voice-to-text converter code 631, audio recognition engine code 630 and basic audio recognition engine code 632. The applications 629 may also include, but are not limited to, a web browser, email client, calendar application, etc. The memory may also store various text files and audio files, such as, but not limited to, text versions of audio public announcements 636. The memory may also store audio streams for segments of audio public announcements in some embodiments. The memory 603 also stores audio signatures such as audio triggers 633 and audio templates 635. The audio triggers 633 are abbreviated audio signatures that are used to wake the processor 601 from sleep mode, and may be portions or segments of the audio templates 635 in some embodiments.
The processor 601 is operative to execute and run a public announcement module 638. The public announcement module 638 may be one of the applications 629 stored in memory 603. The processor 601 is also operative to execute the audio recognition engine code 630 to run an audio recognition engine 637, and to execute the voice-to-text converter code 631 to run a voice-to-text converter 639. The public announcement module 638 is operative to communicate with the WAN transceivers 609 and with the WLAN baseband hardware 611 and can establish an IP connection 626 with a server using a wireless interface implemented by either the WAN transceivers 609 or the WLAN baseband hardware 611. The public announcement module 638 is operative to obtain audio signatures from the server and store these in memory 603 as audio triggers 633 and/or audio templates 635. The public announcement module 638 is also operative to receive text files and audio stream files for audio public announcements from the server along with the audio signatures and to store these files in memory 603. Therefore the memory 603 may store the text versions of audio public announcements 636 in some embodiments. In some embodiments, these text files are indexed corresponding to audio template identification information that identifies the audio triggers 633 and/or the audio templates 635 such that when the audio recognition engine 637 detects one of these audio signatures it sends the corresponding audio template identification information to the public announcement module 638.
In some embodiments, the public announcement module 638 may then look up the text version of the public announcement segment or segments in the text versions of audio public announcements 636 stored in memory, and display the public announcement on the display 605. In other embodiments, the public announcement module 638 may send the audio template identification information to the server over the IP connection 626 and receive back either a text file or an audio stream of the public announcement. The audio template identification information may be a numeric value that is included as a prefix or header of the text files that contain the text versions of audio public announcements. These prefixes may be stored in mobile device 600 memory 603 along with the audio templates 635. In other words, when an audio template 635 is detected, the public announcement module 638 may retrieve the text file prefix and send that as the audio template identification information to the server. The server may then use the prefix to retrieve the corresponding text file and send it to the mobile device 600.
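The prefix-indexed lookup described above might be sketched as follows, with the server again modeled as a dictionary; the numeric prefixes and template names are assumptions made for illustration.

```python
# Each stored template carries a numeric prefix; the server uses that
# prefix to find the matching text file. All values are illustrative.
SERVER_TEXT_FILES = {
    101: "Train 636 is now approaching the station.",
    102: "Flight 88 is now boarding at gate B4.",
}

LOCAL_TEMPLATE_PREFIXES = {"train-approach": 101, "flight-boarding": 102}

def fetch_text_for_template(template_name: str) -> str:
    """Send the template's prefix as identification; get the text back."""
    prefix = LOCAL_TEMPLATE_PREFIXES[template_name]  # stored with the template
    return SERVER_TEXT_FILES[prefix]                 # server-side lookup

announcement_text = fetch_text_for_template("train-approach")
```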
In some embodiments, the audio signatures (or portions thereof) may be used only as audio triggers 633 which wake the processor 601 from sleep mode when detected by a basic audio recognition engine 620. In this case, the audio trigger causes the processor 601 to wake and the audio recognition engine 637 listens to the entire audio public announcement (via the microphone and audio equipment 615). The voice-to-text converter 639 may then convert the recognized speech (actual human speech or synthesized speech) and display the public announcement textually on the display 605. The public announcement module 638 is also operative to communicate with GPS hardware 617, over the one or more internal communication buses 602, to obtain location information which it may then send to the server in order to obtain location related audio templates and other information such as text files and audio streams, etc.
Under certain mobile device 600 operating conditions, the processor 601 is placed in sleep mode in order to conserve battery power. One example of such operating conditions is when user activity stops or lulls for a predetermined period of time, such as a given number of minutes. When the processor 601 is operating in sleep mode, the sensor processor 619, which draws less battery current than the processor 601, monitors the various sensors in order to detect user activity. Any user activity detected using the sensors may be used as a trigger to wake the processor 601. More particularly, the sensor processor 619 is operative to detect processor 601 “wake-up” triggers and send a wake-up signal to the processor 601 over the one or more internal communication buses 602 in response to detecting one of the triggers.
One example wake-up trigger used in the various embodiments is an audio trigger. The sensor processor is operative to execute basic audio recognition engine code 632 from memory 603 to run a basic audio recognition engine 620. The basic audio recognition engine 620 is operative to detect limited audio signatures (i.e. audio template segments) such as the audio triggers 633 stored in memory 603. The audio triggers 633 may be an initial part of a sound, a single word, a simple short phrase, etc. Put another way, the basic audio recognition engine 620 is operative to recognize keywords, short phrases or sounds which may be included in, or precursory to, audio public announcements. The basic audio recognition engine 620 listens for an audio trigger 633 and, in response to detecting the audio trigger, will send a “wake up” command to wake the processor 601 from the low power state (i.e. wake from sleep mode). The processor 601 is operative to, among other things, launch and execute the audio recognition engine 637 and the voice-to-text converter 639 upon receiving the wake up command. The audio recognition engine 637 will proceed to listen to the entirety of the audio public announcement. The voice-to-text converter 639 converts any speech portion of the audio public announcement into text that can be displayed on display 605.
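One way to picture the limited matching performed by the basic audio recognition engine is a similarity-threshold comparison against short stored signatures. Real engines operate on spectral fingerprints; the plain numeric vectors and the threshold value below are stand-ins chosen for illustration.

```python
# Sketch of trigger matching with a cosine-similarity threshold.
def similarity(a, b):
    """Cosine similarity of two equal-length signature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
    return dot / norm if norm else 0.0

def basic_engine(frame, stored_triggers, threshold=0.95):
    """Return True (i.e. send the wake-up command) on any trigger match."""
    return any(similarity(frame, t) >= threshold for t in stored_triggers)

stored_triggers = [[0.1, 0.9, 0.4], [0.7, 0.2, 0.6]]
wake = basic_engine([0.1, 0.88, 0.41], stored_triggers)  # near-match of first
```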
It is to be understood that any of the above described example components, including those described as “modules” and “engines”, in the example mobile device 600, without limitation, may be implemented as software (i.e. executable instructions or executable code) or firmware (or a combination of software and firmware) executing on one or more processors, or using ASICs (application-specific-integrated-circuits), DSPs (digital signal processors), hardwired circuitry (logic circuitry), state machines, FPGAs (field programmable gate arrays) or combinations thereof. In embodiments in which one or more of these components is implemented as software, or partially in software/firmware, the executable instructions may be stored in the operatively coupled, non-volatile, non-transitory memory 603, or in flash memory or EEPROM on the processor chip, and/or on the same die, such that the software/firmware may be accessed by the processor 601, or other processors, as needed.
An example method of operation of the mobile device 600 that includes displaying a text version of an audio public announcement is illustrated in
In some embodiments, the public announcement module 638 communicates with the audio recognition engine 637 and segments any obtained audio templates 635 to generate the audio triggers 633. For example, the audio recognition engine 637 may process the audio templates 635 to extract initial keywords or short phrases and store these segments as the audio triggers 633. One example audio trigger may be the keyword “train”, which would be an appropriate audio trigger in a train station setting where audio public announcements often begin with the word “train” (for example, “Train 636 is now approaching the station”). However, the audio trigger need not be a complete word and may be only a portion of a word (such as a phoneme).
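Assuming, purely for illustration, that templates are phrases with a recognizable leading keyword, the segmentation step might look like:

```python
# Derive short wake triggers from the leading keyword of each template.
def extract_trigger(template_phrase: str) -> str:
    """Take a template's first word as its wake trigger."""
    return template_phrase.split()[0].lower()

templates = [
    "Train 636 is now approaching the station",
    "Flight 88 is now boarding at gate B4",
]
audio_triggers = {extract_trigger(t) for t in templates}
```

Real templates are audio signatures rather than strings, so the segmentation would select an initial slice of the signature rather than a word.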
In operation block 703, the mobile device 600 detects a match between a received audio public announcement and the audio template. If the processor 601 is operating in sleep mode, then the basic audio recognition engine 620 will first detect an audio trigger 633 and send a wakeup command to the processor 601. The processor 601 will then execute the audio recognition engine 637 to recognize the entire audio template by comparing the received audio with the audio templates 635 stored in memory 603. In decision block 705, mobile device settings are checked to determine how to proceed. The mobile device settings may be stored in memory 603. The mobile device settings allow the mobile device 600 user to specify whether the mobile device 600 should sample entire audio public announcements and perform speech-to-text conversion or whether the mobile device 600 should obtain text versions from the server.
In decision block 705, if the user has selected the setting for obtaining text versions from the server, then the mobile device 600 proceeds to operation block 707 and sends the audio template identification information to the server. The server then performs a database lookup operation using the audio template identification information and finds the text version of the corresponding audio public announcement. In operation block 709, the mobile device 600 obtains the text version of the audio public announcement from the server. In operation block 711, the mobile device 600 displays the text version of the audio public announcement on the display 605. The method of operation then terminates.
In decision block 705, if the user has selected the setting for sampling the entire audio public announcement, then the mobile device 600 proceeds to operation block 713 and samples the entire audio public announcement. In operation block 715, the voice-to-text converter 639 performs speech-to-text conversion of the sampled audio public announcement. In operation block 717, the mobile device 600 displays the text version of the audio public announcement on the display 605. The method of operation then terminates.
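The branch taken at decision block 705 can be sketched as follows; the server lookup, audio sampling and speech-to-text functions are stubs invented for illustration.

```python
# Settings-driven branch: fetch text from the server, or sample the
# announcement and convert it locally. All helpers are stubs.
def server_lookup(template_id: int) -> str:
    return {7: "Train 7 now departing."}[template_id]

def sample_announcement() -> bytes:
    return b"..."                          # stands in for sampled audio

def speech_to_text(audio: bytes) -> str:
    return "Train 7 now departing."        # stub conversion

def get_text_version(use_server: bool, template_id: int) -> str:
    if use_server:
        return server_lookup(template_id)             # blocks 707-709
    return speech_to_text(sample_announcement())      # blocks 713-715

text_version = get_text_version(True, 7)
```

Either path ends at the same display step, which is why the setting can be a pure user preference.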
Another example method of operation of the mobile device 600 is shown in
Another example method of operation of the example mobile device 600 shown in
In decision block 915, the public announcement module 638 will determine whether a received audio public announcement is associated with a preconfigured public announcement stored in memory 603 or on the server. If yes, then in operation block 917 the public announcement module 638 will obtain the text version of the audio public announcement either from memory 603 or from the server and will display the text version on the display 605 as shown in operation block 921. In some embodiments, the public announcement module 638 may obtain the text version of the audio public announcement from a text message sent by the server, for example from a Short-Message-Service (SMS) message or from an Internet Messaging (IM) message. If the audio template is not associated with a preconfigured message, or if the embodiment is one that does not provide for access to preconfigured text versions of audio public announcements, then at operation block 919, the audio recognition engine 637 and the voice-to-text converter 639 will generate a text version of the audio public announcement. The method of operation will then proceed to operation block 921 and display the text version on display 605.
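The fallback in decision block 915 (use a stored text version when one exists, otherwise generate one) can be sketched as follows; the preconfigured table and the voice-to-text stub are assumptions.

```python
# Preconfigured-text lookup with a local voice-to-text fallback.
PRECONFIGURED = {"train-approach": "Train approaching. Stand back."}

def voice_to_text(audio: str) -> str:
    return audio.capitalize()              # stand-in for real conversion

def text_for(template_id: str, spoken_audio: str) -> str:
    if template_id in PRECONFIGURED:       # block 917: stored text version
        return PRECONFIGURED[template_id]
    return voice_to_text(spoken_audio)     # block 919: generate it locally

stored = text_for("train-approach", "train approaching. stand back.")
generated = text_for("unknown-template", "gate change announcement")
```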
In decision block 922, the public announcement module 638 will check the mobile device 600 settings to determine if the user wants to have the audio public announcement played back over the mobile device 600 speaker or over a connected headset accessory 612. If yes, then in operation block 927 the public announcement module 638 will play the audio over either the speaker or over the headset accessory 612 if connected. The method of operation then proceeds to decision block 923. If the user does not want audio to be played in decision block 922, then the method of operation proceeds directly to decision block 923.
In decision block 923, the public announcement module 638 will determine whether the location of the mobile device 600 has changed by communicating with the GPS hardware 617. If the location has changed in decision block 923, then the method of operation will terminate as shown. However if the mobile device 600 location remains the same in decision block 923, then in operation block 925 the processor 601 will be again placed in sleep mode in order to conserve battery power, assuming that no other user activity causes the processor 601 to remain awake. In other words, operation block 925 is only implemented if user activity is at a lull, otherwise the method of operation proceeds directly to operation block 907. The method of operation therefore will loop back to operation block 907 such that the mobile device 600 will continue listening for audio public announcements until the mobile device 600 leaves the location of the public address system. If the processor 601 is placed into sleep mode, then the basic audio recognition engine 620 will listen for audio triggers 633 and wake the processor 601 if any audio triggers 633 are detected. The audio recognition engine 637 will then take over the audio detection operation. If the processor 601 is already awake, the audio detection operation will be performed directly by the audio recognition engine 637 which will listen for audio templates 635.
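The listen, sleep and location-check loop described above, condensed into a sketch with simplified sleep handling (the function name and list-based inputs are assumptions):

```python
# Keep handling announcements until the device leaves the venue;
# return to sleep mode between announcements when user activity lulls.
def announcement_loop(announcements, locations, home_location, user_active=False):
    """Return the announcements displayed before the location changed."""
    displayed = []
    asleep = not user_active
    for audio, location in zip(announcements, locations):
        if location != home_location:
            break                  # left the venue: stop listening
        if asleep:
            asleep = False         # basic engine hears a trigger, wakes CPU
        displayed.append(audio)    # full engine processes the announcement
        if not user_active:
            asleep = True          # activity lull: back to sleep mode
    return displayed

shown = announcement_loop(
    ["Train 1 arriving", "Train 2 arriving", "Bus 9 arriving"],
    ["station", "station", "airport"],
    home_location="station",
)
```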
While various embodiments have been illustrated and described, it is to be understood that the invention is not so limited. Numerous modifications, changes, variations, substitutions and equivalents will occur to those skilled in the art without departing from the scope of the present invention as defined by the appended claims.