Embodiments of the present disclosure relate generally to communicating tokens and, more particularly, to methods and apparatuses for communicating audio tokens.
Mobile devices and other computing and communication devices are becoming virtually ubiquitous in today's society. These devices, with their wireless communication capabilities, allow a user to stay connected with the world in new ways. However, there are still untapped ways to use these devices and communicate with them. Some new communication means may be less intrusive and add value to the devices as well as to systems communicating with the devices.
There is a need for methods and apparatuses that include new ways to pass information to computing devices and communication devices that do not use traditional wireless frequency spectra.
In the following description, reference is made to the accompanying drawings in which is shown, by way of illustration, specific embodiments of the present disclosure. The embodiments are intended to describe aspects of the disclosure in sufficient detail to enable those skilled in the art to practice the invention. Other embodiments may be utilized and changes may be made without departing from the scope of the disclosure. The following detailed description is not to be taken in a limiting sense, and the scope of the present invention is defined only by the appended claims.
Furthermore, specific implementations shown and described are only examples and should not be construed as the only way to implement or partition the present disclosure into functional elements unless specified otherwise herein. It will be readily apparent to one of ordinary skill in the art that the various embodiments of the present disclosure may be practiced by numerous other partitioning solutions.
In the following description, elements, circuits, and functions may be shown in block diagram form in order not to obscure the present disclosure in unnecessary detail. Additionally, block definitions and partitioning of functions between various blocks are exemplary of a specific implementation. It will be readily apparent to one of ordinary skill in the art that the present disclosure may be practiced by numerous other partitioning solutions. Those of ordinary skill in the art would understand that information and signals may be represented using any of a variety of different technologies and techniques. For example, information and signals that may be referenced throughout the description may be represented by voltages, currents, electromagnetic waves, magnetic fields, or any combination thereof. Some drawings may illustrate signals as a single signal for clarity of presentation and description. It will be understood by a person of ordinary skill in the art that the signal may represent a bus of signals, wherein the bus may have a variety of bit widths and the present disclosure may be implemented on any number of data signals including a single data signal.
Elements described herein may include multiple instances of the same element. These elements may be generically indicated by a numerical designator (e.g., 110) and specifically indicated by the numerical designator followed by an alphabetic designator (e.g., 110A) or by a numeric indicator preceded by a “dash” (e.g., 110-1). For ease of following the description, element number indicators generally begin with the number of the drawing on which the elements are introduced or most fully discussed.
It should be understood that any reference to an element herein using a designation such as “first,” “second,” and so forth does not limit the quantity or order of those elements, unless such limitation is explicitly stated. Rather, these designations may be used herein as a convenient method of distinguishing between two or more elements or instances of an element. Thus, a reference to first and second elements does not mean that only two elements may be employed or that the first element must precede the second element in some manner. In addition, unless stated otherwise, a set of elements may comprise one or more elements.
As used herein, the term “sonic range” refers to a range of acoustic frequencies that may be audible to humans, generally considered to be in the range of about 20 Hz to about 20 kHz.
As used herein, the term “infrasonic range” refers to a range of acoustic frequencies that may be inaudible to humans, but generally can be generated by acoustic transmitters (e.g., speakers) and detected by acoustic receivers (e.g., microphones) present in electronic devices. As a non-limiting example, the infrasonic range may refer to a range of about 1 Hz to about 20 Hz. However, the lower end of the range may vary depending on capabilities of the acoustic transmitters and acoustic receivers used in systems discussed herein.
As used herein, the term “ultrasonic range” refers to a range of acoustic frequencies that may be inaudible to humans, but generally can be generated by acoustic transmitters and detected by acoustic receivers present in electronic devices. As a non-limiting example, the ultrasonic range may refer to a range of about 20 kHz to about 22 kHz. However, the upper end of the range may vary depending on capabilities of the acoustic transmitters and acoustic receivers used in systems discussed herein.
Unless specifically stated otherwise, the term “audio,” as used herein, refers to a range of acoustic frequencies in a combination of the infrasonic range, sonic range, and ultrasonic range.
Embodiments of the present disclosure include various combinations of transmitters, receivers, and servers to create, communicate, and use audio tokens embedded into audio information. The audio tokens can be used in a variety of different usage models, some examples of which are discussed below.
The one or more processors 110 may be configured for executing a wide variety of operating systems and applications including computing instructions for carrying out embodiments of the present disclosure.
The memory 120 may be used to hold computing instructions, data, and other information for performing a wide variety of tasks including performing embodiments of the present disclosure. By way of example, and not limitation, the memory 120 may include Static Random Access Memory (SRAM), Dynamic RAM (DRAM), Read-Only Memory (ROM), Flash memory, and the like.
As non-limiting examples, the user interface elements 130 may include elements such as displays, keyboards, mice, joysticks, haptic devices, microphones, speakers, cameras, and touchscreens.
The communication elements 150 may be configured for communicating with other devices or communication networks. As non-limiting examples, the communication elements 150 may include elements for communicating on wired and wireless communication media, such as, for example, serial ports, parallel ports, Ethernet connections, universal serial bus (USB) connections, IEEE 1394 (“FireWire”) connections, Bluetooth wireless connections, 802.11a/b/g/n type wireless connections, cellular telephone networks, and other suitable communication interfaces and protocols.
The storage 140 may be used for storing relatively large amounts of non-volatile information for use in the computing system 100 and may be configured as one or more storage devices. By way of example, and not limitation, these storage devices may include computer-readable media (CRM). This CRM may include, but is not limited to, magnetic and optical storage devices such as disk drives, magnetic tape, CDs (compact disks), DVDs (digital versatile discs or digital video discs), and semiconductor devices such as RAM, DRAM, ROM, EPROM, and Flash memory, and other equivalent storage devices.
Software processes illustrated herein are intended to illustrate representative processes that may be performed by the systems illustrated herein. Unless specified otherwise, the order in which the process acts are described is not intended to be construed as a limitation, and acts described as occurring sequentially may occur in a different sequence, or in one or more parallel process streams. It will be appreciated by those of ordinary skill in the art that many acts and processes may occur in addition to those outlined in flow charts. Furthermore, the processes may be implemented in any suitable hardware, software, firmware, or combinations thereof. When executed as firmware or software, the instructions for performing the processes may be stored on a computer-readable medium.
By way of non-limiting example, computing instructions for performing the processes may be stored on the storage 140, transferred to the memory 120 for execution, and executed by the one or more processors 110. The one or more processors 110, when executing computing instructions configured for performing the processes, constitute structure for performing the processes and can be considered a special-purpose computer when so configured. In addition, some or all portions of the processes may be performed by hardware specifically configured for carrying out the processes.
The computing system 100 may be configured as a server to provide information and databases for embodiments of the present disclosure. The computing system 100 also may be used for generating audio tokens, embedding audio tokens into media files or audio information being transmitted, and receiving and decoding audio tokens. The computing system 100 may also be used for communicating with local databases, remote databases, or combinations thereof. An audio token is a piece of information that may be inserted into media information that includes audio information. The audio tokens may include a variety of information.
The payload 230 includes a timestamp 232 and an identifier 234. Other information may also be included within the payload as indicated by the ellipses after the identifier 234. The identifier 234 may include different information for different usage models. As non-limiting examples, the identifier 234 may include information about the media that the audio token 200 is embedded in, information about a specific location where the audio token 200 is being transmitted, information about the sender that is transmitting the audio token 200, and information about transactions that may be performed relative to the sender and/or location from which the audio token 200 is being transmitted.
Cyclic Redundancy Check (CRC) information 240, other error checking information, or other error correction information may also be included to determine or improve the integrity of any received audio tokens.
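As a non-limiting illustration, the token structure described above (a payload carrying a timestamp and an identifier, followed by CRC information) may be sketched in Python. The field widths, byte order, and choice of CRC-32 are assumptions made for the sketch; the disclosure does not fix a particular bit layout.

```python
import struct
import zlib

def build_token(timestamp_ms: int, identifier: int) -> bytes:
    """Pack a hypothetical audio-token payload: timestamp, identifier, CRC-32.

    Field widths (8-byte timestamp, 4-byte identifier) are illustrative only.
    """
    payload = struct.pack(">QI", timestamp_ms, identifier)
    crc = zlib.crc32(payload)          # integrity check computed over the payload
    return payload + struct.pack(">I", crc)

def parse_token(token: bytes):
    """Unpack and verify a token; return (timestamp_ms, identifier),
    or None if the CRC check fails."""
    payload, (crc,) = token[:-4], struct.unpack(">I", token[-4:])
    if zlib.crc32(payload) != crc:
        return None
    return struct.unpack(">QI", payload)
```

A receiver would discard any token whose CRC does not verify, which is how the error-checking information "determines the integrity" of received tokens.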
Media information 310 may include audio information, video information, combinations thereof, as well as other information. The audio token 200 is embedded into the audio portion of the media information 310. Media information 310 may be generally referred to herein as a media stream 310, and the context of whether it is streaming media, a media file, or a combination thereof will be apparent from the description.
This examination of the media stream 310 may be performed as preprocessing, or may be performed substantially real-time as the media stream 310 is to be transmitted. The synchronizer 330 determines a timestamp 232 defining a temporal position within the media stream 310, a clock time, or a combination thereof. The timestamp 232 may then be assembled with other information to create the audio token 200. The audio token 200 may then be embedded into the audio stream portion of the media stream 310 at the appropriate time and matching the timestamp 232 if needed. The devices where the timestamp 232 is determined and where the audio token 200 is assembled and embedded, may vary depending on usage models.
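As a non-limiting illustration, embedding an already-modulated token signal at the temporal position named by its timestamp may be sketched as follows. The additive mixing and the sample-rate parameter are assumptions for the sketch; masking and shaping of the token signal are discussed later.

```python
import numpy as np

def embed_at_timestamp(audio: np.ndarray, token_signal: np.ndarray,
                       timestamp_s: float, sample_rate: int = 44100) -> np.ndarray:
    """Mix a modulated token signal into an audio stream at the temporal
    position indicated by the token's timestamp (illustrative sketch)."""
    out = audio.astype(float).copy()
    start = int(timestamp_s * sample_rate)   # temporal position -> sample offset
    end = start + len(token_signal)
    if end > len(out):
        raise ValueError("token does not fit at the requested position")
    out[start:end] += token_signal           # additive mix into the audio portion
    return out
```

In a preprocessing flow this operates on a whole media file; in a real-time flow the same mixing would be applied to buffers as they are about to be transmitted.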
The audio portion of the media stream 310, along with the embedded audio token 200, is converted to sound waves and transmitted 350 by an acoustic transmitter 345 (e.g., a speaker) in the broadcaster 340.
One or more mobile devices (360A and 360B) may be configured to receive the sound waves, including the audio stream and the embedded audio token 200, using an acoustic receiver 365 (e.g., a microphone).
The mobile devices 360 include an audio token extractor, in the form of software, hardware, or a combination thereof, that receives the incoming audio stream, recognizes the embedded audio tokens 200, and extracts the embedded audio tokens 200 from the audio stream. The mobile devices 360 also include an interpreter that uses information in the audio token 200 to access a database, which may be local on the device or accessed through a communication element 150.
The audio token may be encoded into an audio stream in a number of ways. As non-limiting examples, the audio token may be modulated onto a baseband signal in the infrasonic, sonic, or ultrasonic ranges using amplitude modulation, frequency modulation, phase shifting, or other similar encoding and modulation methods. The audio token may be generated as a serial ASCII stream, or in any other digital encoding suitable for representing the audio token in a format such as the example format outlined above.
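As a non-limiting illustration of one of the modulation methods mentioned above, a binary frequency-shift-keying scheme near the top of the sonic range might look like the following sketch. The tone frequencies, bit duration, and sample rate are illustrative assumptions, not values specified by the disclosure.

```python
import numpy as np

def fsk_modulate(bits, f0=18500.0, f1=19500.0, bit_dur=0.01, sr=48000):
    """Encode token bits as bursts of two high-frequency tones (binary FSK).
    Each bit becomes bit_dur seconds of tone f0 (for 0) or f1 (for 1)."""
    n = int(bit_dur * sr)
    t = np.arange(n) / sr
    return np.concatenate(
        [np.sin(2 * np.pi * (f1 if b else f0) * t) for b in bits])

def fsk_demodulate(signal, f0=18500.0, f1=19500.0, bit_dur=0.01, sr=48000):
    """Recover bits by comparing each symbol's energy at the two tone
    frequencies (a simple correlation detector)."""
    n = int(bit_dur * sr)
    t = np.arange(n) / sr
    bits = []
    for i in range(0, len(signal) - n + 1, n):
        chunk = signal[i:i + n]
        e0 = abs(np.dot(chunk, np.exp(-2j * np.pi * f0 * t)))
        e1 = abs(np.dot(chunk, np.exp(-2j * np.pi * f1 * t)))
        bits.append(1 if e1 > e0 else 0)
    return bits
```

A token extractor on a mobile device would apply a detector of this kind to microphone input, then parse and CRC-check the recovered bits.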
The audio token 200 is generally configured to be inaudible to humans, while still being transmittable by an acoustic transmitter 345 and receivable by an acoustic receiver 365. As non-limiting examples, the audio token may be placed in the infrasonic range or the ultrasonic range. In addition, while the upper end of the sonic range is generally considered to extend to 20 kHz, most people cannot hear frequencies above 18 kHz. Thus, in some embodiments, the audio token may be placed between about 18 kHz and about 20 kHz.
In addition, audio tokens may be placed in frequency ranges of the sonic range that are normally audible to humans. In such cases, the audio token may be substantially masked from recognition by humans using a number of techniques. One such technique is amplitude adaptation. Audio token insertion runs in parallel with the master media source as it plays. Thus, the amplitude of the audio token signal may “adapt” to the amplitude of the master media source. In other words, when the media source is loud, the audio token signal will be louder, but still substantially imperceptible. If the media source becomes quieter, the audio token signal will be quieter as well. By being adaptive, audio token signals stay at a level that is proportionate to the source, avoiding any pitch that may stand out as not belonging to the original media stream.
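As a non-limiting illustration, amplitude adaptation may be sketched by scaling the token signal to track the media's short-term envelope, so the token stays proportionate to the source. The `ratio` and window-size parameters are assumed tuning values for the sketch.

```python
import numpy as np

def adapt_amplitude(token_signal, media, ratio=0.05, win=1024):
    """Scale the token signal to a fixed fraction of the media's
    short-term RMS envelope (illustrative amplitude adaptation)."""
    # Moving-average RMS envelope of the media, one value per sample.
    env = np.sqrt(np.convolve(media.astype(float) ** 2,
                              np.ones(win) / win, mode="same"))
    n = min(len(token_signal), len(media))
    # Loud media -> louder token; quiet media -> quieter token.
    return token_signal[:n] * ratio * env[:n]
```

The adapted token signal would then be mixed into the media stream, as in the embedding step described earlier.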
Another such technique may be to temporally spread the audio token signal over a longer time period such that any change in the combination of the audio token signal and the master media source is substantially imperceptible to a human.
Some non-limiting examples of uses for the audio tokens include subtitle text displaying for the deaf or for audiences who need real-time translations during movies, shows, or conferences. Another non-limiting example is for real-time mobile advertising, product/service promotions, and shopping applications targeted to users who are watching TV, listening to the radio, or walking into a store where there is an audio program playing. Another non-limiting example is for enabling users to be interactive with outdoor screens or displays that have speakers to get instant coupons, or to buy tickets for an event or for transportation.
In some master media streams, such as movies or TV, the audio tokens may be synchronized with the timestamp to specific temporal positions in the master media stream, enabling user information, such as captioning or enhanced information not available in the master media stream, to be presented on the mobile device 360.
Audio token receivers are generally described herein as mobile devices 360; however, any electronic device that includes the audio token extractor and interpreter capabilities, in software, hardware, or a combination thereof, may be used.
The synchronizer 330 (
Operation block 430 indicates that the new media stream including the embedded audio tokens is encoded and compressed if desired in any appropriate format and stored in a new media file. The new media file includes the original media stream from the original media file and the embedded audio tokens in a single media stream. Operation block 440 indicates that the new media file may be interpreted and played by any suitable media player and the audio portion may be broadcast on an acoustic transmitter 445.
The synchronizer 330 (
Many media players are capable of mixing different media streams as they are played.
This usage model relates to presenting supplemental information, such as subtitles or real-time translations, substantially synchronized to a media presentation.
For example, a foreign film may be in a language other than the one the user understands, or a hearing-impaired user may be viewing a movie. The audio tokens may be synchronized to the film or movie and transmitted with its audio. The mobile devices (760A and 760B) use the audio tokens to access a database of text 780 (e.g., a SubRip Text (SRT) file or other suitable text including the dialogue), and the text may be presented on a screen of the mobile devices (760A and 760B) at the appropriate and substantially synchronized time. Moreover, a speech synthesis tool on the mobile devices (760A and 760B) may be included to convert the text to speech, which may be presented to the user as audio through a speaker or headphones at the appropriate and substantially synchronized time.
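As a non-limiting illustration, mapping a token's timestamp to caption text may be sketched as follows. The `(start, end, text)` cue representation is an assumption; such cues could be parsed from an SRT file or other suitable text source.

```python
def caption_for(timestamp_s, cues):
    """Return the caption active at timestamp_s from a list of
    (start_s, end_s, text) cues; return None between cues."""
    for start, end, text in cues:
        if start <= timestamp_s < end:
            return text
    return None
```

A mobile device would call a lookup of this kind with the timestamp extracted from each received audio token, displaying the result substantially in sync with the presentation.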
This usage model may also be used in a setting such as a conference including presentations. The audio tokens may be synchronized to slides or other media of a presentation, and the supplemental information may be accessed from the database and presented to the user at the appropriate times. Each user of a mobile device (760A or 760B) may select a different language, and software on the mobile devices (760A and 760B) would use this information, along with the identifier and timestamp, to access a different database, or different data within a combined database, to obtain the supplemental information in an appropriate language.
Those of ordinary skill in the art will recognize that the supplemental information may include many other types of media and information other than text. As non-limiting examples, the supplemental information may include augmenting audio, video, or images. As other non-limiting examples, the supplemental information may include background information about the presenter, actors, locations, or other details about the media presentation.
As stated earlier, the database including the supplemental information may be local on the mobile devices (760A and 760B), accessed through the Internet or other remote source, or a combination thereof.
With such location, and possibly time information, online advertising 870A may be accessed and presented to users on the mobile devices (860A and 860B) that is targeted to a specific location, time, or combination thereof. Moreover, applications on the mobile devices (860A and 860B), or remotely accessed, may include information about the user such that the advertising can be even more specific to the user's background, or other personal indicators.
Online shopping 870B may be included and present information to the user, such as, for example, a menu for the restaurant broadcasting the audio tokens, or a list of styles and prices for clothing in a clothing store. Other possible models may include advertising during certain time periods. As a non-limiting example, an entertainment venue may advertise certain types of refreshments during certain times, such as during an intermission or prior to the start of the entertainment.
Online social networking 870C may be accessed to connect multiple users with mobile devices (860A and 860B) at a certain location, such as, for example, an entertainment venue. As a non-limiting example, connected users may exchange comments or other information about a sporting event at a venue that they are attending based on what is occurring as indicated by timestamps and identifiers in the audio tokens.
Other online information may also be accessed. As a non-limiting example, audio tokens at an entertainment venue presented at the end of the entertainment, or as patrons are leaving, may direct the mobile devices (860A and 860B) to real-time information regarding traffic patterns near the venue so a user can plan a route away from the venue.
As a non-limiting example, a coffee shop may be supplied with a system that may insert customizable audio tokens into background music presented at the coffee shop. Alternatively, any conventional computer, communication device, or mobile device may include appropriate software to create the audio tokens and mix them with background music or other audio streams in either a pre-processing process or a real-time process.
The audio tokens may direct the user to specific websites or present the users with information on the mobile devices (860A and 860B) such as coupons, specials, or menus. The audio tokens may also connect the user with specific data for the coffee shop to automatically update information for the user related to purchases such as awarding loyalty points and tracking purchasing history.
While the present disclosure has been described herein with respect to certain illustrated embodiments, those of ordinary skill in the art will recognize and appreciate that the present invention is not so limited. Rather, many additions, deletions, and modifications to the illustrated and described embodiments may be made without departing from the scope of the invention as hereinafter claimed along with their legal equivalents. In addition, features from one embodiment may be combined with features of another embodiment while still being encompassed within the scope of the invention as contemplated by the inventor.
This application claims priority to U.S. Provisional Patent Application Ser. No. 61/644,058, filed May 8, 2012, pending, the disclosure of which is incorporated herein by reference in its entirety.