This application generally relates to audio, and more particularly to providing audio-based notifications.
In the current application, a system and method are introduced wherein a device detects incoming audio, wherein the audio matches a voiceprint of a person, and the person, henceforth referred to as an “acquaintance”, is of some relationship to the user of the current application, henceforth referred to as the “user”. In certain embodiments, processing in the current application determines recent interactions between the user and the acquaintance, wherein it is determined whether the user may further interact with the acquaintance regarding an issue, wherein the issue is either known or not known by one or both of the user and the acquaintance.
An example operation may include a method comprising one or more of recording data, by a device, wherein the data is one or more of a location, a video, and an audio, sending the data to a server, splitting, by the server, the data into at least one participant, determining, by the server, an interaction by matching the at least one participant and a group of stored data, and notifying the device, by the server, of the match.
Another example operation may include a system comprising a device which contains a processor and memory, wherein the processor is configured to perform one or more of record data, by the device, wherein the data is one or more of a location, a video, and an audio, send the data to a server, split, by the server, the data into at least one participant, determine, by the server, an interaction by a match of the at least one participant and a group of stored data, and notify the device, by the server, of the match.
A further example operation may include a non-transitory computer readable medium comprising instructions, that when read by a processor, cause the processor to perform one or more of recording data, by a device, wherein the data is one or more of a location, a video, and an audio, sending the data to a server, splitting, by the server, the data into at least one participant, determining, by the server, an interaction by matching the at least one participant and a group of stored data, and notifying the device, by the server, of the match.
It will be readily understood that the instant components and/or steps, as generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of at least one of a method, system, component and non-transitory computer readable medium, as represented in the attached figures, is not intended to limit the scope of the application as claimed but is merely representative of selected embodiments.
The instant features, structures, or characteristics as described throughout this specification may be combined in any suitable manner in one or more embodiments. For example, the usage of the phrases “example embodiments”, “some embodiments”, or other similar language, throughout this specification refers to the fact that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment. Thus, appearances of the phrases “example embodiments”, “in some embodiments”, “in other embodiments”, or other similar language, throughout this specification do not necessarily all refer to the same group of embodiments, and the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
In addition, while the term “message” may have been used in the description of embodiments, the application may be applied to many types of network data, such as, packet, frame, datagram, etc. The term “message” also includes packet, frame, datagram, and any equivalents thereof. Furthermore, while certain types of messages and signaling may be depicted in exemplary embodiments they are not limited to a certain type of message, and the application is not limited to a certain type of signaling.
Referring to FIG. 1, a system 100 is depicted. The system 100 includes a network 104 (e.g., the Internet or a Wide Area Network (WAN)). The network may be the Internet or any other suitable network for transmitting data from a source to a destination.
A server 106 exists in the system 100, communicably coupled to the network 104, and may be implemented as multiple instances, wherein the multiple instances may be joined in a redundant network, or may be singular in nature. Furthermore, the server may be connected to a database 108, wherein tables in the database are utilized to contain the elements of the stored data in the current application and may be accessed via queries, such as Structured Query Language (SQL) queries, for example. The database may reside remotely to the server coupled to the network 104 and may be redundant in nature.
Referring to FIG. 2, a computer system 200 is depicted, which includes a bus 206 or other communication mechanism for communicating information, and a processor 205 coupled with bus 206 for processing information.
Computer system 200 may also include main memory 208, such as a random access memory (RAM) or other dynamic storage device, coupled to bus 206 for storing information and instructions to be executed by a processor 205. Main memory 208 also may be used for storing temporary variables or other intermediate information during the execution of instructions to be executed by a processor 205. Such instructions, when stored in the non-transitory storage media accessible to processor 205, may render computer system 200 into a special-purpose machine that is customized to perform the operations specified in the previously stored instructions.
Computer system 200 may also include a read only memory (ROM) 207 or other static storage device, which is coupled to bus 206 for storing static information and instructions for processor 205. A storage device 209, such as a magnetic disk or optical disk, may be provided and coupled to bus 206, which stores information and instructions.
Computer system 200 may also be coupled via bus 206 to a display 212, such as a cathode ray tube (CRT), a light-emitting diode (LED) display, etc., for displaying information to a computer user. An input device 211, such as a keyboard including alphanumeric and other keys, is coupled to bus 206, which communicates information and command selections to processor 205. Other types of user input devices may be present, including cursor control 210, such as a mouse, a trackball, or cursor direction keys, which communicates direction information and command selections to processor 205 and controls cursor movement on display 212.
According to one embodiment, the techniques herein are performed by computer system 200 in response to a processor 205 executing one or more sequences of one or more instructions which may be contained in main memory 208. These instructions may be read into main memory 208 from another storage medium, such as storage device 209. Execution of the sequences of instructions contained in main memory 208 may cause processor 205 to perform the process steps described herein. In alternative embodiments, hard-wired circuitry or embedded technology may be used in place of or in combination with software instructions.
The term “storage media” as used herein refers to any non-transitory media that may store data and/or instructions causing a machine to operate in a specific fashion. These storage media may comprise non-volatile media and/or volatile media. Non-volatile media may include, for example, optical or magnetic disks, such as storage device 209. Volatile media may include dynamic memory, such as main memory 208. Common forms of storage media include, for example, a hard disk, solid state drive, magnetic tape, or other magnetic data storage medium, a CD-ROM, any other optical data storage medium, any physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, NVRAM, or any other memory chip or cartridge.
Various forms of media may be involved in carrying one or more sequences of one or more of the instructions to processor 205 for execution. For example, the instructions may initially be carried on a magnetic disk or solid state drive of a remote computer. The remote computer may load the instructions into its dynamic memory and send the instructions over a medium such as the Internet 202.
Computer system 200 may also include a communication interface 204 coupled to bus 206. The communication interface may provide two-way data communication coupling to a network link, which is connected to a local network 201.
A network link typically provides data communication through one or more networks to other data devices. For example, the network link may provide a connection through local network 201 to data equipment operated by an Internet Service Provider (ISP) 202. ISP 202 provides data communication services through the worldwide packet data communication network now commonly referred to as the “Internet” 202. Local network 201 and Internet 202 both use electrical, electromagnetic, or optical signals that carry digital data streams. The signals through the various networks and the signals on network link and through communication interface 204, carrying the digital data to and from computer system 200, are example forms of transmission media.
Computer system 200 can send messages and receive data, including program code, through the network(s) 202, the network link, and the communication interface 204. In the Internet example, a server 203 may transmit a requested code for an application program through Internet 202, local network 201, and communication interface 204.
Processor 205 can execute the received code as it is received, and/or store it in storage device 209 or other non-volatile storage for execution at a later time.
Every action or step described herein is fully and/or partially performed by at least one of any element depicted and/or described herein.
In the current application, a device detects incoming audio wherein the audio matches a voiceprint of a person, and the person, henceforth referred to as an “acquaintance”, is of some relationship to the user of the current application, henceforth referred to as the “user”. In certain embodiments, processing in the current application determines recent interactions between the user and the acquaintance, wherein it is determined whether the user may further interact with the acquaintance regarding an issue, wherein the issue is either known or not known by one or both of the user and the acquaintance.
For example, the user of the current application may be in a conversation with an acquaintance or be near the acquaintance, wherein the acquaintance's voice is detected by the device executing the current application. The device receives the incoming audio, and analysis is performed such that the incoming audio is determined to be a voiceprint of the acquaintance. The current application, knowing the voiceprint is of the acquaintance, seeks to determine current outstanding issues between the user and the acquaintance, wherein a discussion between the two may be beneficial. The current application may then inform the user of this via a notification on the device of the current application.
As another example, a third person referred to as the “remote acquaintance” may update some data, a project plan for example, without the knowledge of the user and the acquaintance, which may have some impact to the user and the acquaintance. This update in data may trigger the current application to request a meeting between the user and the acquaintance without either of them being aware of the situation.
Technology developed in today's marketplace utilizes voiceprints, as further depicted herein. The use of voiceprints in the current application allows for the detection of incoming audio and the comparison of the received audio against a database storing recorded samples of users in the current environment.
The current application requires a voiceprint of the users in the environment, for example a business environment. The voiceprint recordings, in one embodiment, are received upon the initiation of the user into the environment, such as when employees are hired. For employees already in the environment, voiceprints may be requested, such as by having them send in a voice recording of themselves speaking a particular sentence or sentences.
These recordings are stored in a database, such as a corporate database 108 which may be queried via server 106 as needed.
To record audio from a device, the software initiates recording functionality. For example, in the Android operating system, Java code such as the following may be utilized to initiate the recording of audio on the device:
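The listing below is a representative sketch using Android's MediaRecorder API, not the original listing; it assumes the RECORD_AUDIO permission has been granted and that outputFilePath (a name assumed here for illustration) points to a writable file:

    import java.io.IOException;
    import android.media.MediaRecorder;

    public class AudioCapture {
        private MediaRecorder recorder;

        // Begin capturing audio from the built-in microphone to the given file.
        public void startRecording(String outputFilePath) throws IOException {
            recorder = new MediaRecorder();
            recorder.setAudioSource(MediaRecorder.AudioSource.MIC);
            recorder.setOutputFormat(MediaRecorder.OutputFormat.THREE_GPP);
            recorder.setAudioEncoder(MediaRecorder.AudioEncoder.AMR_NB);
            recorder.setOutputFile(outputFilePath);
            recorder.prepare(); // must be called before start()
            recorder.start();
        }

        // Stop capturing and free the recorder resources.
        public void stopRecording() {
            recorder.stop();
            recorder.release();
            recorder = null;
        }
    }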
In one embodiment, the recorded media is stored locally in the device 102 and then sent to a server 106 for processing. In another embodiment, the recorded media is stored in the device 102 and processed locally.
In one embodiment, the device 102 initiates the recording of audio in the background, without specific interaction with the device from the user. This allows the device to record audio during conversations that occur during a workday. To record audio on the device in the background, the current application may utilize the built-in microphone present on many devices, such as mobile phones.
In the mobile device-programming environment, a service is needed to allow the execution of an application in the background. As an example, one popular mobile operating system utilizes a service that runs in the background without direct interaction with the user. A service has no user interface and is not bound to the lifecycle of an activity, making a service well suited to the activity of recording audio. These services run with a higher priority than inactive or invisible activities, and therefore it is less likely that the operating system terminates them.
The use of asynchronous processing in a service allows for the execution of resource-intensive tasks in the background. A new thread is created and executed in the service, wherein the service (and the thread) may be restarted automatically if the operating system reboots.
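One hedged sketch of such a service on Android follows; the class and helper names are illustrative assumptions, not part of the original disclosure:

    import android.app.Service;
    import android.content.Intent;
    import android.os.IBinder;

    public class BackgroundRecordingService extends Service {

        @Override
        public int onStartCommand(Intent intent, int flags, int startId) {
            // Run the resource-intensive recording work on a worker thread.
            new Thread(this::recordAndSendAudio).start();
            // START_STICKY asks the operating system to recreate the service if it is killed.
            return START_STICKY;
        }

        private void recordAndSendAudio() {
            // For example: start a MediaRecorder as sketched above, cut the
            // recording at a fixed interval, and send each sample to the server.
        }

        @Override
        public IBinder onBind(Intent intent) {
            return null; // a started (not bound) service: no user interface
        }
    }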
Recorded audio samples at the client device 102a may be sent to a server, such as server 106, wherein the recorded audio is processed to compare the recorded audio to voiceprints of users in the environment to determine who is speaking. The interval at which to cut the recording and send the audio samples is hardcoded to a value, such as 10 seconds.
In one embodiment, the recorded audio is sent to the server 106 for processing. The server first splits the audio into sections wherein each portion of audio corresponds to a speaker, a process called diarization.
Speaker diarization is the process of partitioning an input audio stream into homogeneous segments according to the speaker identity. Speaker diarization is a combination of speaker segmentation and speaker clustering. The first aims at finding speaker change points in an audio stream. The second aims at grouping together speech segments on the basis of speaker characteristics.
There are many open-sourced initiatives solving the speaker diarization problem:
ALIZE Speaker Diarization—an open-source platform for speaker recognition. The purpose of this project is to provide a set of low-level and high-level frameworks that will allow anybody to develop applications handling the various tasks in the field of speaker recognition: verification/identification, segmentation, etc.
SpkDiarization—software dedicated to speaker diarization (i.e., speaker segmentation and clustering). It is written in Java and includes the most recent developments in the domain.
Audioseg—a toolkit dedicated to audio segmentation and classification of audio streams.
SHoUT—a software package developed at the University of Twente to aid speech recognition research.
pyAudioAnalysis—a Python audio analysis library covering feature extraction, classification, segmentation, and applications.
Having obtained the split of the input audio into sections containing the speakers, the server 106 then compares the audio sections against a database (such as database 108) of voiceprints to determine who the speaker is, as further disclosed below.
The identity of the speakers is determined by first splitting the incoming audio stream into the respective speakers, a process called diarization, further disclosed herein. The individual audio stream per speaker is referred to henceforth as audioPerSpeaker.
Once the audio is split, the server 106 processes each audioPerSpeaker. For example, the split audio portions are stored in an array: audioPerSpeaker[n]. The server loops through the array, comparing the stored audio in each array element against a library of voiceprints, wherein each voiceprint is an audio sample of a user associated with the organization. The comparison of audio is accomplished via commonly used logic in computer science, with an outcome assigned to the matching of the audio, for example a range of [0-10] where 0 is no match at all and 10 equates to a perfect match.
In one embodiment, if the match is 60% or greater, i.e. an outcome equal to 6 or higher, the match is validated, with any lower outcome reflecting that the audio samples do not match.
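A minimal sketch of this comparison loop follows; the AudioSegment, Voiceprint, and User types and the compareAudio helper are assumptions used for illustration:

    import java.util.List;

    // Returns the matched user for each audio section, or null when no
    // voiceprint scores 6 or higher on the [0-10] scale described above.
    public User[] identifySpeakers(AudioSegment[] audioPerSpeaker, List<Voiceprint> voiceprintLibrary) {
        User[] identities = new User[audioPerSpeaker.length];
        for (int i = 0; i < audioPerSpeaker.length; i++) {
            for (Voiceprint vp : voiceprintLibrary) {
                int score = compareAudio(audioPerSpeaker[i], vp.getSample());
                if (score >= 6) {        // 60% or better validates the match
                    identities[i] = vp.getUser();
                    break;               // stop at the first validated match
                }
            }
        }
        return identities;
    }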
Referring to FIG. 3, in one embodiment the client device 102a is placed in a mode wherein audio of the current environment is being received, recorded, and stored.
In another embodiment, the client device 102a is recording and storing video wherein the video being recorded is of the environment. For example, the device may be a wearable device, such as computer-equipped glasses, or a device on a user's clothing capturing video. The device also records audio along with the video.
In another embodiment, the recorded data is stored not in the client device 102, but in a remote server, such as server 106, and/or a database such as database 108. The data is sent via the network 104 and routed to the server, and optionally to the database.
A device of the system processes the incoming data to determine the people speaking, wherein the device may be the client device 102 or the server 106. The video is analyzed using facial recognition. Facial recognition is a commonly solved problem in today's computer science environment, and many products tackle the facial recognition problem, both open-source and proprietary solutions.
One common technique utilizes three technologies:
The Eigenface method encodes the statistical variation among face images using a form of dimensionality reduction, such as Principal Component Analysis (PCA), where the resulting characteristic differences in the feature space do not necessarily correspond to isolated facial features such as eyes, ears, and noses (the indispensable components of the feature vector are not pre-determined).
Elastic matching generates nodal graphs (i.e., wireframe models) corresponding to specific contour points of a face, such as the eyes, chin, tip of the nose, etc. Recognition is based on a comparison of image graphs against a known database. Since image graphs may be rotated during the matching process, this system tends to be more robust to large variation in the images.
Classification net recognition utilizes the same geometric characteristics as elastic matching, but fundamentally differs by being a supervised machine learning technique, often involving the use of support vector machines.
Although Eigenface detection may underperform the other methods when variation in lighting or facial alignment is large, it has many benefits, including the speed and efficiency of recognition, a compact representation of face data, and relative simplicity of implementation.
Therefore, Eigenface detection tends to be a de-facto standard. Many state-of-the-art detection techniques also rely on some form of dimensionality reduction prior to recognition, even if feature vector extraction is handled in a different manner.
An acquaintance 402 is determined by the system. This is accomplished via facial recognition from received image/video media, wherein the face data obtained in the received image/video is compared to a bank of facial images stored locally in the client device or in a remote database such as database 108, or by matching received voice audio with a voiceprint stored in the client device 102 or remotely in a database, such as database 108.
In another embodiment, previous communications between the user and other people with whom the user has interactions are recorded either locally in the client device, or in a remote location such as server 106 or database 108. The voices and/or facial recognition data of the interactions are stored as future acquaintances, wherein this data may be used to determine acquaintances in future interactions.
In another embodiment, the geographic location of individuals or groups of people with whom the user interacts is used to determine possible acquaintances. If a person is normally encountered in a similar geographic area as the user, such as in a nearby office or cube, it is then determined that this person is an acquaintance of the user, and the facial/voiceprint data of that individual is stored as a future acquaintance of the user.
An acquaintance 402 is nearby such that the voice of the acquaintance is being received at the client device 102a and recorded 404. The audio is sent to server 106 for processing 406.
The server splits the received audio into speakers 408, a process called diarization that is further discussed herein.
Once the audio is split into speakers, the speakers' received audio is compared against voiceprints 414 to obtain the identity of the speaker 410, also further disclosed herein.
Having obtained the identity of the speaker in the received audio, the server 106 then performs logic to determine any interaction(s) between the originator and the speaker 412. A database, such as database 108, may be queried, via APIs for example, to obtain possible interactions on projects, calendar events, etc. 416.
If at least one interaction is encountered, notifications 418 and 420 are sent from the server 106 to the respective client devices of the users 102a and 102b.
In another embodiment, the audio is processed at the client device 102a such that the voiceprints of the users in the environment exist on the client device. In this embodiment, the audio sample message 406 is not present (not depicted).
In another embodiment, the audio is processed at the client device 102a such that the voiceprints exist on a remote database, such as database 108, and messaging occurs between the client device 102a and the database 108 to query the stored voiceprints at the database 108. In this embodiment, the audio sample message 406 is not present (not depicted).
In another embodiment, the database as pictured in the above message flow may be multiple databases wherein the databases may also be remote databases such that interactions between the server 106 and said database(s) occur via messaging between the server and the database(s).
The current application seeks to determine interactions between users wherein the users have been detected via a voiceprint matching incoming audio. Other methods are utilized by the system to determine speakers, such as facial recognition performed on incoming data from a camera that the user is wearing, for example. The location of the speaker is also considered when determining a speaker, such as when an interaction occurs at the same or a similar geographic location, for example at a cube outside of the user's cube, or a nearby office, etc.
Below are two examples of interaction with data (calendar data and project plan data), but one versed in programming design will easily be able to use the examples and apply the methods introduced to other types of data with other types of interactions without deviating from the scope of the current application.
For example, if the user of the current application (henceforth referred to as User A) is either speaking to User B, or User B is in proximity, such that the processing of the incoming audio from the device 102a matches User B's voiceprint, the current application seeks to determine interactions between User A and User B, the client device of User B being 102b.
The current application executing on the client device 102a queries the calendar of User A via interactions with either the calendar application on the current device 102a, or a remote server, such as server 106, containing the calendar data of User A, wherein messaging occurs between said remote server and the client device 102a.
For example, to search calendar/event data, Java code such as the sketch below shows retrieving events in a user's calendar application using a popular Calendar API:
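The listing below is a hedged sketch using the Google Calendar API Java client as one example of such a Calendar API; service is assumed to be an authorized com.google.api.services.calendar.Calendar client:

    import com.google.api.client.util.DateTime;
    import com.google.api.services.calendar.model.Event;
    import com.google.api.services.calendar.model.Events;

    // List the next 20 upcoming events on the user's primary calendar.
    Events events = service.events().list("primary")
            .setTimeMin(new DateTime(System.currentTimeMillis()))
            .setOrderBy("startTime")
            .setSingleEvents(true)
            .setMaxResults(20)
            .execute();
    for (Event event : events.getItems()) {
        System.out.println(event.getSummary());
    }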
The event returned is of type “Event”, containing the specific details of the event. Included in the Event data is an attendee array:
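For example, continuing the Google Calendar API sketch above (the null check reflects that events without attendees return no array):

    import java.util.List;
    import com.google.api.services.calendar.model.EventAttendee;

    List<EventAttendee> attendees = event.getAttendees();
    if (attendees != null) {
        for (EventAttendee attendee : attendees) {
            String name = attendee.getDisplayName();  // attendee's name
            String email = attendee.getEmail();       // attendee's email
        }
    }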
The attendee array above is part of the data returned in the Event data in the Calendar API. The Event data will contain an array with all of the attendees in the event, along with details of each attendee including the email and name.
The Event data also contains the event's creator data including the creator's email and name.
The current application obtains the events of the originator and the events of the recipient(s), i.e. the event(s) that either of them has created. Using the returned Event data, it then determines if any of the events created contain the other user's name or email. If there is a match, this means that the originator (User A) and User B share an event. Furthermore, knowing the data of the event, if the event is scheduled within a particular time period (e.g. within 3 business days), a match has been found and a notification is sent to both parties.
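One possible sketch of this matching step, continuing the assumed Google Calendar API types (the method and parameter names are illustrative):

    // Does any of User A's events starting within windowMillis of now
    // include User B as an attendee?
    public boolean sharesUpcomingEvent(List<Event> eventsOfUserA, String userBEmail, long windowMillis) {
        long now = System.currentTimeMillis();
        for (Event event : eventsOfUserA) {
            if (event.getAttendees() == null || event.getStart() == null
                    || event.getStart().getDateTime() == null) {
                continue; // skip all-day events and events without attendees
            }
            long start = event.getStart().getDateTime().getValue();
            if (start < now || start > now + windowMillis) {
                continue; // outside the particular time period
            }
            for (EventAttendee attendee : event.getAttendees()) {
                if (userBEmail.equalsIgnoreCase(attendee.getEmail())) {
                    return true; // shared event found: notify both parties
                }
            }
        }
        return false;
    }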
Referring to FIG. 6, an example notification 602 presented on the display of a client device is depicted.
In this scenario, the user, Bill Dewitz, is interacting with another person who is determined to be an acquaintance (determined via the system having received data such as an audio recording of a conversation or a video recording from a camera on Bill's person). The system sends the data to the server 106, wherein the data is compared against either voiceprints or facial data to determine the identity of Jim Brisk, who is an acquaintance of Bill. The stored comparative data is optionally stored in the database 108.
The notification is received at the client device, in this scenario both 102a and 102b, but the server 106 may send different notifications according to the interaction determined 412, wherein the notification to Bill, for example, would have only Jim's name 602, and the notification to Jim would have only Bill's name.
In another embodiment, the words “that person” may be substituted with the name of the user.
In another embodiment, the meeting details may be presented in the message of the notification 602 including the title of the event, the event start time, the duration of the event, the attendees of the event, the location of the event, etc.
Referring again to the notification, a calendar button may also be presented on the notification.
When pressed, a message is sent to the calendar API of either the calendar application on the client device, or a query is made to remote calendar data on a remote database, such as database 108. The resulting action is that either a new window pops up on the display with the details of the calendar data, or the calendar data is added to the current message text.
The current application executing on the client device 102 interacts with the data in an environment, such as project plan data in an enterprise environment, via a project plan Application Programming Interface (API) to obtain data pertaining to the users of the current application who have been determined to be within audio range of each other, for example User A 102a and User B 102b.
Understanding a project associated with the user, and the role the user has in the project, may aid in determining current interactions between User A and User B. For example, having access to the project management software via an API, the current application queries the project management software to obtain possible interactions between users.
Project management software may be available online, existing entirely in the cloud, or on client machines wherein a remote database stores the live changes of the project. Many popular project management applications include an API, allowing other applications to query and retrieve project-management-specific data pertaining to stored projects.
For example, a popular project management application is the Basecamp software. Basecamp allows for simple communication and collaboration amongst the users in a project. It is also implemented as simple Extensible Markup Language (XML) over the Hypertext Transfer Protocol (HTTP).
Through the use of the Basecamp API, it is possible to return all people in a project, as well as obtain data pertaining to each person's role in a project.
For example, using the Basecamp API, it is possible to return a specified person:
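The original request is not reproduced here; a request of the following form, modeled on the XML-over-HTTP style of the Basecamp API described above (the exact path is an assumption), illustrates the idea:

    GET /people/#{person_id}.xml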
In this example, the user's data is returned, including their name, email address, current events associated with the user, and any assigned tasks, as well as other information.
Furthermore, the projects a person has access to may be obtained via the following API call:
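Again as an illustrative, assumed path in the same XML-over-HTTP style:

    GET /projects.xml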
This will return a list of all projects a person has access to, including draft, template, archived, and deleted projects, along with the date that each project was last updated. Projects that the requesting user does not have access to will not appear in the project list.
To obtain the people who have access to a particular project, the following API call may be used:
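An illustrative request of this form (path assumed) might be:

    GET /projects/#{project_id}/people.xml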
The result includes all the people with access to the project, including the data pertaining to each user.
Using these and similar API function calls, it is possible to determine the other people on a project and obtain their contact information. Through this, it is also possible to automatically determine the date that a project of which both User A and User B are members was updated.
Referring again to the notification, a similar notification may be presented when a project plan interaction is determined.
Referring further to the notification, a project plan button may also be presented.
When pressed, a message is sent to the project plan API of either the project plan application on the client device, or a query is made to remote project plan data on a remote database, such as database 108. The resulting action is that either a new window pops up on the display with the details of the project plan data, or the project plan data is added to the current message text.
In another embodiment, the current data of the project plan is displayed upon the pressing of the project plan button.
Referring to FIG. 7, a flowchart of the server-side processing of the received audio is depicted, wherein the received audio is split into an array of speakers, herein referred to as the speakers array.
A local variable i is set to zero 704. This is a counting variable for looping through the speakers array.
A check is made as to whether the array has items remaining 706. If there are remaining items, the item at index i in the speakers array is compared against the stored voiceprints. These voiceprints are a collection of audio from each of the users in the environment and may be stored locally at the server 106, or may be remotely stored, such as in a database 108. This process is further disclosed herein. If a voiceprint is found to match the speakers[i] audio, then the person from the voiceprint data is used as the current matched person 710, herein referred to as speaker[i].user.
A check is made to determine if speaker[i].user is the originator 712, wherein speaker[i].user is the person from the voiceprint data that matched the audio section speakers[i], and the originator is the owner of the initial client device 102a sending the incoming audio stream.
If speaker[i].user is the same as the originator, then that portion of recorded audio belongs to the user of client device 102a; the i variable is incremented 720 and the process loops back.
If speaker[i].user is not the same as the originator, then an attempt is made to determine whether an outstanding issue may be present between the originator and speaker[i].user 714, as further depicted herein.
If no issue is found 716, then the process loops and the i variable is incremented 720 to continue processing other speakers in the received audio stream.
If an issue is found 716, then notifications are sent 718 to the client device(s) 102a/102b of at least one of the originator and/or speaker[i].user. The process then loops and the i variable is incremented 720 to continue attempting to determine users for the remaining speakers[i] audio segments.
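A compact Java sketch of this loop follows; matchVoiceprint, findOutstandingIssue, and notify are assumed helper methods standing in for the processing described above:

    // Process each diarized audio segment from the incoming stream.
    for (int i = 0; i < speakers.length; i++) {
        User matched = matchVoiceprint(speakers[i]);   // compare against stored voiceprints (706-710)
        if (matched == null || matched.equals(originator)) {
            continue;                                  // no match, or the originator's own audio (712)
        }
        Issue issue = findOutstandingIssue(originator, matched); // query projects, calendars, etc. (714)
        if (issue != null) {
            notify(originator, matched, issue);        // send notifications to the client devices (718)
        }
    }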
Referring to FIG. 8, a message flow is depicted wherein data pertaining to both the user and the acquaintance is updated by a remote acquaintance.
A remote acquaintance 802 updates data wherein the updated data pertains to both the user 102a and the acquaintance 102b. The remote acquaintance may be another user in the environment and may not personally know the users 102a and 102b.
Data modification message 806 is sent to the server 106, and also sent 808 to database 108 where the data is updated. The data may be a project plan, code pertaining to both users 102a and 102b, or any other data that is able to be updated by various personnel within an organization.
The client device of the user 102a is placed in a mode wherein audio of the current environment is being received and recorded.
An acquaintance 810 is nearby such that the voice of the acquaintance is being received at the client device 102a and recorded 812. The audio is sent to server 106 for processing 814.
Processing continues as previously depicted, wherein notifications are presented to the client devices of the user 102a and the acquaintance 102b. The notifications contain text similar to the previously depicted notifications.
The users may not be aware of the data update by the remote acquaintance; this embodiment illustrates the intuitiveness of the current application. A user may receive a notification pertaining to data wherein the user and/or the acquaintance is not aware of the update, yet is made aware via the current application.
The notion of movement of a device, for example a mobile device, has been used to determine a user's intended action on said device. For example, smartphones in the market today have functionality that automatically answers an incoming call when the device is raised to the user's ear.
Other movements of a device are used to automatically perform particular actions on a device. In some implementations, if a device is turned over on a surface, the functionality of the device is altered according to this action: the device automatically determines that the user is in a meeting and desires to silence the device. As such, the volume of the device is silenced, and the haptic feedback (vibration feedback) is turned on. In other implementations, if the device is shaken up and down, this movement of the device allows the software of the device to perform some functionality, such as erasing data (such as in a game) or turning on another application such as a flashlight.
In many transports, speed-sensitive volume is a feature that is included. This feature modifies the volume of the speakers in the transport to be raised according to the speed-sensitive volume setting: the faster the transport is traveling, the more adjustment is made to the volume of the speakers to compensate for road noise. Many transports also allow the driver or occupants to adjust the amount of modification by setting the modification to one of three categories: Low, Medium, and High, wherein the higher the setting, the more modification of the output to the speakers is made.
For example, if an occupant's head turns toward another occupant, it may be determined that the occupant wishes to carry on a conversation with the other occupant. In this scenario, it would be beneficial for the speakers in the transport to be modified such that the speakers near the occupant's head are lowered to allow for conversation, then automatically returned to a previous level once it is determined that the conversation is complete.
The transport may be an automobile, airplane, train, bus, boat, or any type of vehicle that normally transports people from one place to another.
Referring to FIG. 9, a system is depicted including a transport system 904, a monitoring camera 907, a client device and/or an in-transport navigation system, a network 908, and a server 910.
The client device may be at least one of a mobile device, a tablet, or a laptop device. It should be noted that other types of devices might be used with the present application. For example, a PDA, an MP3 player, or any other wireless device, a gaming device (such as a handheld system or home-based system), any computer wearable device, and the like (including a personal computer or other wired device) that may transmit and receive information may be used with the present application. The client device and/or the in-transport navigation system may execute a user browser used to interface with the network 908, an email application used to send and receive emails, a text application used to send and receive text messages, and many other types of applications. Communication may occur between the client device and/or the in-transport navigation system and the network 908 via applications executing on said device, which may be applications downloaded via an application store or may reside on the client device by default. Additionally, communication may occur on the client device wherein the client device's operating system performs the logic to communicate without the use of either an inherent or downloaded application.
A server 910 exists in the system, communicably coupled to the network 908, and may be implemented as multiple instances wherein the multiple instances may be joined in a redundant network, or may be singular in nature. Furthermore, the server may be connected to a database (not depicted) wherein tables in the database are utilized to contain the elements of the system and may be accessed via queries to the database, such as Structured Query Language (SQL) queries, for example. The database may reside remotely to the server coupled to the network 908 and may be redundant in nature.
Each seat in the transport has a pair of speakers near the occupant's head, henceforth referred to as “headrest speakers”. Headrest speakers are on the left and right side of the occupant's head when the occupant is sitting in the seat.
Referring to FIG. 10, headrest speakers 1000 are depicted, positioned in the seat on the left and right sides of the occupant's head.
In another embodiment, the headrest speakers are positioned along a track 1004 wherein they may be moved vertically to accommodate shorter or taller occupants. A lever 1006 placed alongside the seat allows each headrest speaker to be moved inside the seat. The seat lining over the speaker area is made of mesh such that, regardless of where the speaker is placed along the track, it is able to produce full sound due to the construction of the mesh covering.
A problem arises when an occupant in the transport desires to interact with another occupant in the transport, or on a phone call or the like. In this scenario, to carry on a quality conversation with another party, it is necessary to lower the radio, wherein the sound from the radio is lowered for all occupants, even occupants in the rear (for example) who are not part of the conversation nor wish to be part of said conversation.
The current transport system 904 receives data from a source, such as a transport camera 907, and detects through the analysis of received images and/or video that an occupant's head has turned toward another occupant. The transport system 904 alters the headrest speakers 1000 by at least one of the following: lowering the volume of the headrest speakers, muting the headrest speakers, and/or altering the direction of the audio output of the headrest speakers.
This embodiment allows the system to adjust the volume of the headrest speakers for conversation.
In another embodiment, as the user turns the head to address another occupant, for example the driver turning to speak to a passenger, the headrest speakers 1000 are lowered temporarily for both occupants.
In another embodiment, the monitoring camera 907 tracks the conversation such that the conversation is recognized, and the speakers are returned to the previous volume from before the conversation when the transport system determines that the conversation is complete, for example after a 10-second time period expires from an occupant last speaking in the conversation.
The monitoring camera 907 utilizes tracking of the occupants' mouths to determine the presence or absence of ongoing conversation.
In another embodiment, particular functionality of the transport may allow the transport system 904 to override the headrest speaker modification for conversation. For example, if the transport system receives notification that the transport's right turn signal is on, this action will override the modification of the headrest speaker, as the driver will be looking to the right ahead of the right turn.
In yet another embodiment, other input from the transport may override the functionality of the current application such as the current speed of the transport. The current transport speed is accounted for when determining the modification of headrest speakers 1000. If the speed is below a threshold value, then modifications are not performed on the headrest speakers.
For example, if the driver of the transport is looking for a parking space, or is on a highway where an accident has occurred, the driver may turn to look around for a parking space, or "rubberneck" to view the accident. In both scenarios, the speed of the transport would most probably be below a particular value, such as below 10 miles per hour.
The transport system 904 checks the current speed of the transport before performing the functionality to alter the headrest speaker(s) 1000.
In another embodiment, the movement of the torso is examined to alter the headrest speaker(s). The monitoring camera 907 captures the movement of the occupant's torso and may determine to alter either the volume and/or the direction of the headrest speaker(s) based on the movement of the torso.
Referring to FIG. 11, a flowchart is depicted for the modification of the headrest speaker(s).
The transport system 904 receives data from a source, such as the monitoring camera 907, indicating that an occupant in the transport has made a gesture to begin a conversation with another occupant, such as a head turn 1102. The data is analyzed utilizing head-tracking software, further disclosed herein.
The current speed of the transport is determined by the transport system interacting with the transport's computer through an Application Programming Interface (API). If the current speed is below a threshold speed 1104, then the process ends, and the headrest speakers 1000 are not modified. The threshold speed is a value hardcoded in the transport system 904 and is determined to be a speed wherein the modification of the speakers is not necessary, for example 10 miles per hour.
If the current speed is above the threshold speed, a check is made as to whether a blinker is on in the transport. The transport system 904 checks this by interaction with the transport's internal computer 1106. If the blinker is on, the process ends, as speaker modifications are not needed if the user has turned a blinker on for an upcoming turn. If the transport's blinker is not on, then the speaker is modified 1108 as further depicted herein.
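A brief Java sketch of these checks follows, assuming a transportComputer object exposing the transport's API (all names are illustrative assumptions):

    // Called when the camera analysis detects a head turn toward another occupant.
    void onHeadTurnDetected() {
        double speedMph = transportComputer.getCurrentSpeed();
        if (speedMph < SPEED_THRESHOLD_MPH) {
            return; // below the threshold (e.g., 10 mph): do not modify the speakers
        }
        if (transportComputer.isBlinkerOn()) {
            return; // a turn is in progress: override the speaker modification
        }
        modifyHeadrestSpeakers(); // e.g., lower the volume for conversation
    }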
This application claims priority from U.S. Provisional Patent Application No. 62/677,144, filed May 2018.