A portion of the disclosure of this patent document contains material that is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the U.S. Patent and Trademark Office patent files or records, but otherwise reserves all copyright rights whatsoever. The following notice applies to the disclosure herein and to the drawings that form a part of this document: Copyright 2012-2014, CloudCar Inc., All Rights Reserved.
This patent document pertains generally to tools (systems, apparatuses, methodologies, computer program products etc.) for allowing electronic devices to share information with each other, and more particularly, but not by way of limitation, to a system and method for recognition and automatic correction of voice commands.
An increasing number of vehicles are being equipped with one or more independent computer and electronic processing systems. Certain of the processing systems are provided for vehicle operation or efficiency. For example, many vehicles are now equipped with computer systems or other vehicle subsystems for controlling engine parameters, brake systems, tire pressure and other vehicle operating characteristics. Additionally, other subsystems may be provided for vehicle driver or passenger comfort and/or convenience. For example, vehicles commonly include navigation and global positioning systems and services, which provide travel directions and emergency roadside assistance, often as audible instructions. Vehicles are also provided with multimedia entertainment systems that may include sound systems, e.g., satellite radio receivers, AM/FM broadcast radio receivers, compact disk (CD) players, MP3 players, video players, smartphone interfaces, and the like. These electronic in-vehicle infotainment (IVI) systems can also provide navigation, information, and entertainment to the occupants of a vehicle. The IVI systems can source navigation content, information, and entertainment content from a variety of sources, both local (e.g., within proximity of the IVI system) and remote (e.g., accessible via a data network).
Functional devices, such as navigation and global positioning receivers (GPS), wireless phones, media players, and the like, are often configured by manufacturers to produce audible instructions or information advisories for users in the form of audio streams that audibly inform and instruct a user. Increasingly, these devices are also being equipped with voice interfaces, so users can interact with the devices in a hands-free manner using voice commands. However, in an environment such as a moving vehicle, ambient noise levels can interfere with the ability of these voice interfaces to properly and efficiently receive and process voice commands from a user. As a result, voice commands can be misunderstood by the device, which can cause incorrect operation, incorrect guidance, and user frustration with devices that use such standard voice interfaces.
The various embodiments are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings in which:
In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the various embodiments. It will be evident, however, to one of ordinary skill in the art that the various embodiments may be practiced without these specific details.
As described in various example embodiments, a system and method for recognition and automatic correction of voice commands are described herein. In one example embodiment, an in-vehicle infotainment system with a voice command recognition and auto-correction module can be configured like the architecture illustrated in
Referring now to
Similarly, ecosystem 101 can include a wide area data/content network 120. The network 120 represents one or more conventional wide area data/content networks, such as a cellular telephone network, satellite network, pager network, a wireless broadcast network, gaming network, WiFi network, peer-to-peer network, Voice over IP (VoIP) network, etc. One or more of these networks 120 can be used to connect a user or client system with network resources 122, such as websites, servers, call distribution sites, headend sites, or the like. The network resources 122 can generate and/or distribute data, which can be received in vehicle 119 via one or more antennas 114. Antennas 114 can serve to connect the IVI system 150 and the voice command recognition and auto-correction module 200 with the data/content network 120 via cellular, satellite, radio, or other conventional signal reception mechanisms. Such cellular data or content networks are currently available (e.g., Verizon™, AT&T™, T-Mobile™, etc.). Such satellite-based data or content networks are also currently available (e.g., SiriusXM™, HughesNet™, etc.). The conventional broadcast networks, such as AM/FM radio networks, pager networks, UHF networks, gaming networks, WiFi networks, peer-to-peer networks, Voice over IP (VoIP) networks, and the like are also well-known. Thus, as described in more detail below, the IVI system 150 and the voice command recognition and auto-correction module 200 can receive telephone calls and/or phone-based data transmissions via an in-vehicle phone interface 162, which can be used to connect with the in-vehicle phone receiver 116 and network 120. The IVI system 150 and the voice command recognition and auto-correction module 200 can receive web-based data or content via an in-vehicle web-enabled device interface 166, which can be used to connect with the in-vehicle web-enabled device receiver 118 and network 120. 
In this manner, the IVI system 150 and the voice command recognition and auto-correction module 200 can support a variety of network-connectable in-vehicle devices and systems from within a vehicle 119.
As shown in
In various embodiments, the mobile device 130 interface and user interface between the IVI system 150 and the mobile devices 130 can be implemented in a variety of ways. For example, in one embodiment, the mobile device 130 interface between the IVI system 150 and the mobile devices 130 can be implemented using a Universal Serial Bus (USB) interface and associated connector.
In another embodiment, the interface between the IVI system 150 and the mobile devices 130 can be implemented using a wireless protocol, such as WiFi or Bluetooth® (BT). WiFi is a popular wireless technology allowing an electronic device to exchange data wirelessly over a computer network. Bluetooth® is a wireless technology standard for exchanging data over short distances.
Referring again to
Referring still to
In the example embodiment shown in
In the example embodiment shown in
Additionally, other data and/or content (denoted herein as ancillary data) can be obtained from local and/or remote sources as described above. The ancillary data can be used to augment or modify the operation of the voice command recognition and auto-correction module 200 based on a variety of factors including the identity and profile of the speaker, the context in which the utterance is spoken (e.g., the location of the vehicle, the specified destination, the time of day, the status of the vehicle, the relationship between the current utterance and a prior utterance, etc.), the context of the speaker (e.g., whether travelling for business or pleasure, whether there are events in the speaker's calendar or correspondence in their email or message queues, the status of processing of the speaker's previous utterances on other occasions, the status of processing of other speakers' related utterances, the historical behavior of the speaker while processing the speaker's utterances), and a variety of other data obtainable from a variety of sources, local and remote.
In a particular embodiment, the IVI system 150 and the voice command recognition and auto-correction module 200 can be implemented as in-vehicle components of vehicle 119. In various example embodiments, the IVI system 150 and the voice command recognition and auto-correction module 200 can be implemented as integrated components or as separate components. In an example embodiment, the software components of the IVI system 150 and/or the voice command recognition and auto-correction module 200 can be dynamically upgraded, modified, and/or augmented by use of the data connection with the mobile devices 130 and/or the network resources 122 via network 120. The IVI system 150 can periodically query a mobile device 130 or a network resource 122 for updates, or updates can be pushed to the IVI system 150.
Referring now to
In an example embodiment as shown in
The speech recognition logic module 210 of an example embodiment is responsible for performing speech or text recognition in a first-level speech recognition analysis on a received set of utterance data. As described above, the voice command recognition and auto-correction module 200 can receive a plurality of sets of utterance data from the IVI system 150 via voice interface 158. The sets of utterance data each represent a voice command, statement, or utterance spoken by a user/speaker. In a particular embodiment, the sets of utterance data correspond to a voice command or other utterance spoken by a speaker in the vehicle 119. The speech recognition logic module 210 can search database 170 and attempt to match the received set of utterance data to any of a plurality of sample voice commands stored in voice command database 172 of database 170. The sample voice commands stored in database 170 can include a typical or acceptable audio signature corresponding to a particular valid system command with an associated command code or command identifier. In this manner, the data stored in database 170 forms an association between a spoken audio signal or signature and a corresponding valid system voice command. Thus, a particular received utterance can be associated with a corresponding valid system voice command. However, it is unlikely that an utterance spoken by a particular speaker will exactly match a sample voice command stored in database 170. In most cases, a received utterance can be considered to match a sample voice command stored in database 170 if the received utterance includes a sufficient number of characteristics or indicia that match the sample voice command. The number of matching characteristics needed to be sufficient for a match can be pre-determined and pre-configured. Depending on the quality and nature of the received utterance, there may be more than one sample voice command in database 170 that matches the received utterance. 
As such, a plurality of sample voice command search results may be returned for a database 170 search performed for a given input utterance. However, the speech recognition logic module 210 can rank these search results based on the number of characteristics from the utterance that match a particular sample voice command. In other words, the speech recognition logic module 210 can use the matching characteristics of the utterance to generate a confidence value corresponding to the likelihood that (or the degree to which) a particular received utterance matches a corresponding sample voice command. The speech recognition logic module 210 can rank the search results based on the confidence value for a particular received utterance and a corresponding sample voice command. The sample voice command corresponding to the highest confidence value can be returned as the most likely voice command corresponding to the received utterance, if the highest confidence value meets or exceeds a pre-configured threshold value that defines whether a match is acceptable. If the received utterance does not match a sufficient number of characteristics from any sample voice command, the speech recognition logic module 210 can return a value indicating that no match was found. In either case, the speech recognition logic module 210 can produce a first result and a confidence value associated with the first result.
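The ranking and threshold test described above can be sketched as follows. This is a minimal illustration only, not the implementation of the speech recognition logic module 210: it assumes that an audio signature can be reduced to a set of discrete characteristics, and the names `SampleCommand`, `confidence`, and `first_level_match`, as well as the default threshold of 0.8, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SampleCommand:
    command_id: str          # hypothetical command code or identifier
    features: frozenset      # hypothetical stand-in for an audio signature


def confidence(utterance_features, sample):
    """Fraction of the sample's characteristics that appear in the utterance."""
    if not sample.features:
        return 0.0
    return len(utterance_features & sample.features) / len(sample.features)


def first_level_match(utterance_features, samples, threshold=0.8):
    """Rank samples by confidence and apply the acceptance threshold.

    Returns (command_id or None, best_confidence, ranked), where ranked is a
    list of (confidence, command_id) tuples in descending order.
    """
    ranked = sorted(
        ((confidence(utterance_features, s), s.command_id) for s in samples),
        reverse=True,
    )
    if not ranked:
        return None, 0.0, []
    best_conf, best_id = ranked[0]
    if best_conf >= threshold:
        return best_id, best_conf, ranked
    return None, best_conf, ranked   # no acceptable match was found


samples = [
    SampleCommand("NAVIGATE", frozenset("navi")),
    SampleCommand("PLAY", frozenset("play")),
]
print(first_level_match(frozenset("navix"), samples)[:2])
print(first_level_match(frozenset("ay"), samples)[:2])
```

In either case the function returns both a result and its confidence value, so a caller can decide whether further processing is warranted.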
The content of the database 170 can be dynamically updated or modified at any time from local or remote (networked) sources. For example, a user mobile device 130 can be configured to store a plurality of spoken audio signatures and corresponding system voice commands. When a user brings his/her mobile device 130 into proximity with the IVI system 150 and the voice command recognition and auto-correction module 200, the mobile device 130 can automatically pair with the IVI system 150 and the content of the mobile device 130 can be synchronized with the content of database 170. The content of the database 170 can thereby get automatically updated with the plurality of spoken audio signatures and corresponding system voice commands from the user's mobile device 130. In this manner, the content of database 170 can be automatically customized for a particular user. This customization increases the likelihood that the particular user's utterances will be matched to a voice command in database 170 and thus the user's voice commands will be more often and more quickly recognized. Similarly, a plurality of spoken audio signatures and corresponding system voice commands customized for a particular user can be downloaded to the IVI system 150 from network resources 122 via network 120. As a result, new features can be easily added to the IVI system 150 and/or the voice command recognition and auto-correction module 200 or existing features can be easily and quickly modified or replaced. Therefore the IVI system 150 and/or the voice command recognition and auto-correction module 200 are highly customizable and adaptable.
As described above, the speech recognition logic module 210 of an example embodiment can attempt to match a received set of utterance data with a corresponding voice command in database 170 to produce a first result. If a matching voice command is found and the confidence value associated with the match is high (and meets or exceeds the pre-configured threshold), the high-confidence matching result can be returned and the processing performed by the voice command recognition and auto-correction module 200 can be terminated. However, in many circumstances, the speech recognition logic module 210 may not be able to match the received utterance with a corresponding voice command or the matches found may have low associated confidence values. This situation can occur if the quality of the received set of utterance data is low. Low quality utterance data can occur if the audio sample corresponding to the utterance is taken in an environment with high volume ambient noise, poor microphone positioning relative to the speaker, ambient noise with signal frequencies similar to the speaker's vocal tone, a speaker moving while speaking, and the like. Such situations can occur frequently in a vehicle where utterances compete with other interference in the environment. The voice command recognition and auto-correction module 200 is configured to handle voice recognition and auto-correction in this challenging environment. In particular, the voice command recognition and auto-correction module 200 includes a repeat utterance correlation logic module 212 to further process a received set of utterance data in a second-level speech recognition analysis when the speech recognition logic module 210 in the first-level speech recognition analysis may not be able to match the received utterance with a corresponding voice command or the matches found may have low associated confidence values (e.g., when the speech recognition logic module 210 produces poor results).
In the example embodiment shown in
The example embodiments described herein use a different approach. In the example embodiment implemented as repeat utterance correlation logic module 212, a more rigorous attempt is made in a second-level speech recognition analysis to filter noise and perform a deeper level of voice recognition analysis and/or a different voice recognition process on the set of utterance data when the speech recognition logic module 210 initially fails to produce satisfactory results in the first-level speech recognition analysis. In other words, subsequent or repeat utterances can be processed differently relative to processing performed on an original utterance. As a result, the second-level speech recognition analysis can produce a result that is not merely the same result produced by the first-level speech recognition analysis or previous attempts at speech recognition. Thus, the results produced for a repeat utterance are not the same as the results produced for a previous or original utterance. This approach prevents the undesirable effect produced when a system repeatedly generates an incorrect response to a repeated utterance. The different processing performed on the subsequent or repeat utterance can also be customized or adapted based on a comparison of the characteristics of the original utterance and the characteristics of the subsequent or repeat utterance. For example, the tone and pace of the original utterance can be compared with the tone and pace of the repeat utterance. The tone of the utterance represents the volume and the pitch or signal frequency signature of the utterance. The pace of the utterance represents the speed at which the utterance is spoken or the audio signature of the utterance relative to a temporal component. Changes in the tone or pace of the subsequent or repeat utterance relative to the original utterance can be used to re-scale the audio signature of the repeat utterance to correspond to the scale of the original utterance. 
The re-scaled repeat utterance in combination with the audio signature of the original utterance is more likely to be matched to a voice command in the database 170. Changes in the tone or pace of the repeat utterance can also be used as an indication of an agitated speaker. Upon detection of an agitated speaker, the repeat utterance correlation logic module 212 can be configured to offer the speaker an alternative command selection method rather than merely prompting again for another repeated utterance.
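The re-scaling and agitation check described above can be illustrated with a short sketch. It assumes, purely for illustration, that an utterance is available as a list of amplitude samples, that pace can be approximated by the sample count, and that volume can be approximated by the peak amplitude. The function names and the ratio thresholds are hypothetical and are not taken from the described embodiments.

```python
def resample(samples, target_len):
    """Linearly interpolate samples to target_len points (pace re-scaling)."""
    if target_len <= 0 or not samples:
        return []
    if len(samples) == 1 or target_len == 1:
        return [samples[0]] * target_len
    step = (len(samples) - 1) / (target_len - 1)
    out = []
    for i in range(target_len):
        pos = i * step
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out


def peak(samples):
    """Largest absolute amplitude, used here as a crude volume measure."""
    return max((abs(s) for s in samples), default=0.0)


def rescale_repeat(original, repeat):
    """Match the repeat's length and peak volume to those of the original."""
    stretched = resample(repeat, len(original))
    p = peak(stretched)
    gain = peak(original) / p if p else 1.0
    return [s * gain for s in stretched]


def seems_agitated(original, repeat, volume_ratio=1.5, pace_ratio=1.3):
    """Heuristic: the repeat is much louder or much faster than the original."""
    louder = peak(original) > 0 and peak(repeat) / peak(original) >= volume_ratio
    faster = len(repeat) > 0 and len(original) / len(repeat) >= pace_ratio
    return louder or faster


original = [0.0, 0.5, 1.0, 0.5, 0.0]
repeat = [0.0, 2.0, 0.0]            # spoken faster and louder
print(rescale_repeat(original, repeat))
print(seems_agitated(original, repeat))
```

A real system would more plausibly operate on spectral features than on raw amplitudes; the sketch only shows how a comparison of the two utterances can drive both the re-scaling and the choice to offer an alternative selection method.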
In various example embodiments, the repeat utterance correlation logic module 212 can be configured to perform any of a variety of options for processing a set of utterance data for which a high-confidence matching result could not be found by the speech recognition logic module 210. In one embodiment, the repeat utterance correlation logic module 212 can be configured to present the top several matching results with the highest corresponding confidence values. For example, the speech recognition logic module 210 may have found one or more matching voice command options, none of which had confidence values that met or exceeded a pre-determined high-confidence threshold (e.g., low-confidence matching results). In this case, the repeat utterance correlation logic module 212 can be configured to present the low-confidence matching results to the user via an audio or visual interface for selection. The repeat utterance correlation logic module 212 can be configured to limit the number of low-confidence matching results presented to the user to a pre-determined maximum number of options. In this situation, the user can be prompted to explicitly select a voice command option from the presented list of options to rectify the ambiguous results produced by the speech recognition logic module 210.
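As one way to picture the option-limiting behavior described above, the following sketch selects the low-confidence matching results to present for explicit selection. The function name `options_to_present`, the threshold of 0.8, and the maximum of three options are assumptions made for the example.

```python
def options_to_present(ranked, high_threshold=0.8, max_options=3):
    """Choose which low-confidence results to offer the user.

    ranked is a list of (command_id, confidence) pairs in any order.
    Returns an empty list when a high-confidence match already exists,
    since no prompt is needed in that case.
    """
    if any(conf >= high_threshold for _, conf in ranked):
        return []
    by_conf = sorted(ranked, key=lambda r: r[1], reverse=True)
    return [cmd for cmd, _ in by_conf[:max_options]]


results = [("PLAY", 0.4), ("NAVIGATE", 0.6), ("CALL", 0.1), ("PAUSE", 0.5)]
print(options_to_present(results))
```

The returned list can then be rendered through an audio or visual interface, with the user's selection taken as the recognized voice command.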
In another example embodiment, the repeat utterance correlation logic module 212 can be configured to more rigorously process the utterance for which either no matching results were found or only low-confidence matching results were found (e.g., no high-confidence matching result was found). In this example, the repeat utterance correlation logic module 212 can submit the received set of utterance data to each of a plurality of utterance processing modules to analyze the utterance data from a plurality of perspectives. The results from each of the plurality of utterance processing modules can be compared or aggregated to produce a combined result. For example, one of the plurality of utterance processing modules can be a signal frequency analysis module that focuses on comparing the signal frequency signatures of the received set of utterance data with corresponding signal frequency signatures of sample voice commands stored in database 170. A second one of the plurality of utterance processing modules can be configured to focus on an amplitude or volume signature of the received utterance relative to the sample voice commands. A third one of the plurality of utterance processing modules can be configured to focus on the tone and/or pace of the received set of utterance data relative to a previous utterance as described above. A re-scaled or blended set of utterance data can be used to search the voice command options in database 170.
A fourth one of the plurality of utterance processing modules can be configured to focus on the specific characteristics of the particular speaker. In this case, the utterance processing module can access ancillary data, such as the identity and profile of the speaker. This information can be used to adjust speech recognition parameters to produce a speech recognition model that is more likely to match the speaker's utterances with a voice command in database 170. For example, the age, gender, and native language of the speaker can be used to tune the parameters of the speech recognition model to produce better results.
A fifth one of the plurality of utterance processing modules can be configured to focus on the context in which the utterance is spoken (e.g., the location of the vehicle, the specified destination, the time of day, the status of the vehicle, etc.). This utterance processing module can be configured to obtain ancillary data from a variety of sources described above, such as the vehicle operational subsystems 115, the in-vehicle GPS receiver 117, the in-vehicle web-enabled devices 118, and/or the user mobile devices 130. The information obtained from these sources can be used to adjust speech recognition parameters to produce a speech recognition model that is more likely to match the speaker's utterances with a voice command in database 170. For example, as described above, the utterance processing module can obtain ancillary data indicative of the current location of the vehicle as provided by a navigation subsystem or GPS device in the vehicle 119. The vehicle's current location is one factor that is indicative of the context of the utterance. Given the vehicle's current location, the utterance processing module may be better able to reconcile ambiguities in the received utterance. For example, an ambiguous utterance may be received by the voice command recognition and auto-correction module 200 as, “Navigate to 160 Maple Avenue.” In reality, the speaker may have wanted to convey, “Navigate to 116 Marble Avenue.” Using the vehicle's current location and a navigation or mapping subsystem, the utterance processing module can determine that there is no “160 Maple Avenue” in proximity to the vehicle's location or destination, but there is a “116 Marble Avenue” location. In this example, the utterance processing module can automatically match the ambiguous utterance to an appropriate voice command option. As such, an example embodiment can perform automatic correction of voice commands. 
In a similar manner, other utterance context ancillary data can be used to enhance the operation of the utterance processing module and the speech recognition process. Additionally, an example embodiment can perform automatic correction of voice commands using the utterance context ancillary data.
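The address example above can be approximated with ordinary fuzzy string matching. The sketch below uses `difflib.get_close_matches` from the Python standard library as a stand-in technique; the described embodiments do not specify a matching algorithm. The list of nearby addresses is hypothetical and would, in practice, come from a navigation or mapping subsystem given the vehicle's current location.

```python
import difflib


def correct_destination(heard, nearby_addresses, cutoff=0.6):
    """Reconcile a recognized address with addresses known to exist nearby.

    Returns the recognized text unchanged if it is a known address,
    otherwise the closest known address, or None if nothing is close.
    """
    if heard in nearby_addresses:
        return heard
    matches = difflib.get_close_matches(heard, nearby_addresses, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Hypothetical addresses in proximity to the vehicle's location or destination.
nearby = ["116 Marble Avenue", "42 Oak Street", "9 Elm Court"]
print(correct_destination("160 Maple Avenue", nearby))
```

Text similarity is only a proxy for acoustic similarity, so a production system would likely compare phonetic or audio representations instead. The point of the sketch is that location context narrows the candidate set enough for an ambiguous utterance to be corrected automatically.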
A sixth one of the plurality of utterance processing modules can be configured to focus on the context of the speaker (e.g., whether travelling for business or pleasure, whether there are events in the speaker's calendar or correspondence in their email or message queues, the status of processing of the speaker's previous utterances on other occasions, the status of processing of other speakers' related utterances, the historical behavior of the speaker while processing the speaker's utterances, and a variety of other data obtainable from a variety of sources, local and remote). This utterance processing module can be configured to obtain ancillary data from a variety of sources described above, such as the in-vehicle web-enabled devices 118, the user mobile devices 130, and/or network resources 122 via network 120. The information obtained from these sources can be used to adjust speech recognition parameters to produce a speech recognition model that is more likely to match the speaker's utterances with a voice command in database 170. For example, the utterance processing module can access the speaker's mobile device 130, web-enabled device 118, or account at a network resource 122 to obtain speaker-specific context information that can be used to rectify ambiguous utterances in a manner similar to the process described above. This speaker-specific context information can include current events listed on the speaker's calendar, the content of the speaker's address book, a log of the speaker's previous voice commands and associated audio signatures, content of recent email messages or text messages, and the like. The utterance processing module can use this speaker-specific context ancillary data to enhance the operation of the utterance processing module and the speech recognition process. Additionally, an example embodiment can perform automatic correction of voice commands using the speaker-specific context ancillary data.
It will be apparent to those of ordinary skill in the art in view of the disclosure herein that a variety of other utterance processing modules can be configured to enhance the processing accuracy of the speech recognition processes described herein. As described above, the repeat utterance correlation logic module 212 can submit the received set of utterance data to each or any one of a plurality of utterance processing modules as described above to analyze the utterance data from a plurality of perspectives. Because of the deeper level of analysis and/or the different voice recognition process provided by the repeat utterance correlation logic module 212, a greater quantity of computing resources (e.g., processing cycles, memory storage, etc.) may need to be used to effect the speech recognition analysis. As such, it is not usually feasible to perform this deep level of analysis for every received utterance. However, the embodiments described herein can selectively employ this deeper level of analysis and/or a different voice recognition process only when it is required as described above. In this manner, a more robust and effective speech recognition analysis can be provided while preserving valuable computing resources.
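The comparison or aggregation of results from a plurality of utterance processing modules can be sketched as a weighted average of per-module confidence values. This is one plausible combination rule chosen for illustration; the described embodiments do not prescribe a particular rule. The module names, the weights, and the function name `aggregate` are hypothetical.

```python
from collections import defaultdict


def aggregate(module_results, weights=None):
    """Combine per-module confidence values into a single best result.

    module_results maps a module name to {command_id: confidence}.
    weights optionally maps a module name to its relative weight (default 1.0).
    Returns (best_command_id or None, combined_confidence).
    """
    weights = weights or {}
    totals = defaultdict(float)
    weight_sum = 0.0
    for name, scores in module_results.items():
        w = weights.get(name, 1.0)
        weight_sum += w
        for cmd, conf in scores.items():
            totals[cmd] += w * conf
    if not totals or weight_sum == 0:
        return None, 0.0
    best = max(totals, key=totals.get)
    return best, totals[best] / weight_sum


module_results = {
    "frequency": {"NAV_HOME": 0.5, "PLAY": 0.25},
    "amplitude": {"NAV_HOME": 0.75},
    "context": {"NAV_HOME": 1.0, "PLAY": 0.25},
}
print(aggregate(module_results))
```

Because every module runs on the same utterance, the cost grows with the number of modules, which is consistent with reserving this deeper analysis for utterances that the first-level analysis could not resolve.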
As described above, the repeat utterance correlation logic module 212 can provide a deeper level of analysis and/or a different voice recognition process when the speech recognition logic module 210 produces poor results. Additionally, the repeat utterance correlation logic module 212 can recognize when a currently received utterance is a repeat of a prior utterance. Often, when an utterance is misunderstood, the user/speaker will repeat the same utterance and continue repeating the utterance until the system recognizes the voice command. In an example embodiment, the repeat utterance correlation logic module 212 can identify a current utterance as a repeat of a previous utterance using a variety of techniques. In one example, the repeat utterance correlation logic module 212 can compare the audio signature of a current utterance to the audio signature of a previous utterance. The repeat utterance correlation logic module 212 can also compare the tone and/or pace of a current utterance to the tone and pace of a previous utterance. The length of the time gap between the current utterance and a previous utterance can also be used to infer that a current utterance is likely a repeat of a prior utterance. Using any of these techniques, the repeat utterance correlation logic module 212 can identify a current utterance as a repeat of a previous utterance. Once it is determined that a current utterance is a repeat of a prior utterance, the repeat utterance correlation logic module 212 can determine that the speaker is trying to be recognized for the same voice command and the prior speech recognition analysis is not working. In this case, the repeat utterance correlation logic module 212 can employ the deeper level of speech recognition analysis and/or a different voice recognition process as described above. 
In this manner, the repeat utterance correlation logic module 212 can be configured to match the set of utterance data to a voice command and return information indicative of the matching voice command without returning information that is the same as previously returned information if the set of utterance data is a repeat utterance.
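Repeat detection based on signature similarity and the time gap between utterances can be sketched as follows. The sketch assumes that each audio signature is available as a fixed-length numeric feature vector and uses cosine similarity as the comparison measure; both are assumptions, as are the function names and the similarity and time-gap thresholds.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def is_repeat(prev_sig, prev_time, cur_sig, cur_time,
              min_similarity=0.9, max_gap_s=10.0):
    """Treat the current utterance as a repeat if it closely resembles the
    previous one and follows it within a short time gap (in seconds)."""
    gap = cur_time - prev_time
    if gap < 0 or gap > max_gap_s:
        return False
    return cosine_similarity(prev_sig, cur_sig) >= min_similarity


previous = [1.0, 2.0, 3.0]
current = [1.1, 2.0, 2.9]
print(is_repeat(previous, 0.0, current, 4.0))
print(is_repeat(previous, 0.0, current, 60.0))
```

When `is_repeat` returns true, a caller could route the utterance to the deeper analysis rather than repeating the first-level analysis that already failed.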
An example embodiment can also record or log parameters associated with the speech recognition analysis performed on a particular utterance. These log parameters can be stored in log database 174 of database 170 as shown in
Referring now to
At decision block 618, if the received set of utterance data is determined to be a repeat utterance as described above, processing continues at processing block 620 where a second-level speech recognition analysis is performed on the received set of utterance data using the repeat utterance correlation logic module 212 as described above. Once the second-level speech recognition analysis performed by the repeat utterance correlation logic module 212 is complete, processing can continue at processing block 612 where speech recognition analysis is again performed on the processed set of utterance data.
At decision block 618, if the received set of utterance data is determined to not be a repeat utterance as described above, processing continues at processing block 622 where the top n results produced by the speech recognition logic module 210 are presented to the user/speaker. As described above, these results can be ranked based on the corresponding confidence values for each matching result. Once the ranked results are presented to the user/speaker, the user/speaker can be prompted to select one of the presented result options. At decision block 624, if the user/speaker selects one of the presented result options, the selected result is accepted and processing terminates at bubble 626. However, if the user/speaker does not provide a valid result option selection within a pre-determined time limit, the process resets and processing continues at processing block 610 where a new set of utterance data is received.
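The branch taken at decision block 618 and the steps that follow it can be summarized in a simplified sketch. The recognition, repeat-detection, second-level, and user-prompt steps are passed in as callables so that the sketch stays self-contained. All names and the threshold value are hypothetical, and the sketch performs only a single second-level pass before falling back to user selection, which is a simplification of the flow described above.

```python
def handle_utterance(utterance, recognize, is_repeat, second_level, ask_user,
                     threshold=0.8, top_n=3):
    """Return the accepted command id, or None if the process should reset.

    recognize(utterance)    -> list of (command_id, confidence), any order
    is_repeat(utterance)    -> bool
    second_level(utterance) -> processed utterance data
    ask_user(options)       -> chosen command_id, or None on timeout
    """
    results = sorted(recognize(utterance), key=lambda r: r[1], reverse=True)
    if results and results[0][1] >= threshold:
        return results[0][0]                      # high-confidence match

    if is_repeat(utterance):
        # Repeat utterance: apply the second-level analysis, then recognize again.
        processed = second_level(utterance)
        results = sorted(recognize(processed), key=lambda r: r[1], reverse=True)
        if results and results[0][1] >= threshold:
            return results[0][0]

    # Otherwise present the top n ranked results for explicit selection.
    options = [cmd for cmd, _ in results[:top_n]]
    choice = ask_user(options) if options else None
    return choice if choice in options else None  # None means reset


table = {"noisy": [("PLAY", 0.4), ("NAV", 0.5)], "clean": [("NAV", 0.9)]}
recognize = lambda u: table[u]
print(handle_utterance("noisy", recognize, lambda u: True,
                       lambda u: "clean", lambda opts: None))
```

A return value of None corresponds to the reset path, in which a new set of utterance data is awaited.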
As used herein and unless specified otherwise, the term “mobile device” includes any computing or communications device that can communicate with the IVI system 150 and/or the voice command recognition and auto-correction module 200 described herein to obtain read or write access to data signals, messages, or content communicated via any mode of data communications. In many cases, the mobile device 130 is a handheld, portable device, such as a smart phone, mobile phone, cellular telephone, tablet computer, laptop computer, display pager, radio frequency (RF) device, infrared (IR) device, global positioning device (GPS), personal digital assistant (PDA), handheld computer, wearable computer, portable game console, other mobile communication and/or computing device, or an integrated device combining one or more of the preceding devices, and the like. Additionally, the mobile device 130 can be a computing device, personal computer (PC), multiprocessor system, microprocessor-based or programmable consumer electronic device, network PC, diagnostics equipment, a system operated by a vehicle 119 manufacturer or service technician, and the like, and is not limited to portable devices. The mobile device 130 can receive and process data in any of a variety of data formats. The data format may include or be configured to operate with any programming format, protocol, or language including, but not limited to, JavaScript, C++, iOS, Android, etc.
As used herein and unless specified otherwise, the term “network resource” includes any device, system, or service that can communicate with the IVI system 150 and/or the voice command recognition and auto-correction module 200 described herein to obtain read or write access to data signals, messages, or content communicated via any mode of inter-process or networked data communications. In many cases, the network resource 122 is a data network accessible computing platform, including client or server computers, websites, mobile devices, peer-to-peer (P2P) network nodes, and the like. Additionally, the network resource 122 can be a web appliance, a network router, switch, bridge, gateway, diagnostics equipment, a system operated by a vehicle 119 manufacturer or service technician, or any machine capable of executing a set of instructions (sequential or otherwise) that specify actions to be taken by that machine. Further, while only a single machine is illustrated, the term “machine” can also be taken to include any collection of machines that individually or jointly execute a set (or multiple sets) of instructions to perform any one or more of the methodologies discussed herein. The network resources 122 may include any of a variety of providers or processors of network transportable digital content. Typically, the file format that is employed is Extensible Markup Language (XML); however, the various embodiments are not so limited, and other file formats may be used. For example, data formats other than Hypertext Markup Language (HTML)/XML or formats other than open/standard data formats can be supported by various embodiments. Any electronic file format, such as Portable Document Format (PDF), audio (e.g., Moving Picture Experts Group Audio Layer 3 (MP3), and the like), video (e.g., MP4, and the like), and any proprietary interchange format defined by specific content sites can be supported by the various embodiments described herein.
The wide area data network 120 (also denoted the network cloud) used with the network resources 122 can be configured to couple one computing or communication device with another computing or communication device. The network may be enabled to employ any form of computer readable data or media for communicating information from one electronic device to another. The network 120 can include the Internet in addition to other wide area networks (WANs), cellular telephone networks, satellite networks, over-the-air broadcast networks, AM/FM radio networks, pager networks, UHF networks, other broadcast networks, gaming networks, WiFi networks, peer-to-peer networks, Voice Over IP (VoIP) networks, metro-area networks, local area networks (LANs), other packet-switched networks, circuit-switched networks, direct data connections, such as through a universal serial bus (USB) or Ethernet port, other forms of computer-readable media, or any combination thereof. On an interconnected set of networks, including those based on differing architectures and protocols, a router or gateway can act as a link between networks, enabling messages to be sent between computing devices on different networks.
Also, communication links within networks can typically include twisted wire pair cabling, USB, FireWire, Ethernet, or coaxial cable, while communication links between networks may utilize analog or digital telephone lines, full or fractional dedicated digital lines including T1, T2, T3, and T4, Integrated Services Digital Networks (ISDNs), Digital Subscriber Lines (DSLs), wireless links including satellite links, cellular telephone links, or other communication links known to those of ordinary skill in the art. Furthermore, remote computers and other related electronic devices can be remotely connected to the network via a modem and temporary telephone link.
The network 120 may further include any of a variety of wireless sub-networks that may further overlay stand-alone ad-hoc networks, and the like, to provide an infrastructure-oriented connection. Such sub-networks may include mesh networks, Wireless LAN (WLAN) networks, cellular networks, and the like. The network may also include an autonomous system of terminals, gateways, routers, and the like connected by wireless radio links or wireless transceivers. These connectors may be configured to move freely and randomly and organize themselves arbitrarily, such that the topology of the network may change rapidly. The network 120 may further employ one or more of a plurality of standard wireless and/or cellular protocols or access technologies including those set forth below in connection with network interface 712 and network 714 described in detail below in relation to
In a particular embodiment, a mobile device 130 and/or a network resource 122 may act as a client device enabling a user to access and use the IVI system 150 and/or the voice command recognition and auto-correction module 200 to interact with one or more components of a vehicle subsystem. These client devices 130 or 122 may include virtually any computing device that is configured to send and receive information over a network, such as network 120 as described herein. Such client devices may include mobile devices, such as cellular telephones, smart phones, tablet computers, display pagers, radio frequency (RF) devices, infrared (IR) devices, global positioning devices (GPS), Personal Digital Assistants (PDAs), handheld computers, wearable computers, game consoles, integrated devices combining one or more of the preceding devices, and the like. The client devices may also include other computing devices, such as personal computers (PCs), multiprocessor systems, microprocessor-based or programmable consumer electronics, network PCs, and the like. As such, client devices may range widely in terms of capabilities and features. For example, a client device configured as a cell phone may have a numeric keypad and a few lines of monochrome LCD display on which only text may be displayed. In another example, a web-enabled client device may have a touch sensitive screen, a stylus, and a color LCD display screen on which both text and graphics may be displayed. Moreover, the web-enabled client device may include a browser application enabled to receive and to send Wireless Application Protocol (WAP) messages, and/or wired application messages, and the like.
In one embodiment, the browser application is enabled to employ HyperText Markup Language (HTML), Dynamic HTML, Handheld Device Markup Language (HDML), Wireless Markup Language (WML), WMLScript, JavaScript, Extensible HTML (XHTML), Compact HTML (CHTML), and the like, to display and send a message with relevant information.
The client devices may also include at least one client application that is configured to receive content or messages from another computing device via a network transmission. The client application may include a capability to provide and receive textual content, graphical content, video content, audio content, alerts, messages, notifications, and the like. Moreover, the client devices may be further configured to send and/or receive a message to or from another computing device, such as through a Short Message Service (SMS), direct messaging (e.g., Twitter), email, Multimedia Message Service (MMS), instant messaging (IM), Internet relay chat (IRC), mIRC, Jabber, Enhanced Messaging Service (EMS), text messaging, Smart Messaging, Over the Air (OTA) messaging, or the like. The client devices may also include a wireless application device on which a client application is configured to enable a user of the device to send and receive information to/from network resources wirelessly via the network.
The IVI system 150 and/or the voice command recognition and auto-correction module 200 can be implemented using systems that enhance the security of the execution environment, thereby reducing the possibility that the IVI system 150 and/or the voice command recognition and auto-correction module 200 and the related services could be compromised by viruses or malware. For example, the IVI system 150 and/or the voice command recognition and auto-correction module 200 can be implemented using a Trusted Execution Environment, which can ensure that sensitive data is stored, processed, and communicated in a secure way.
The example mobile computing and/or communication system 700 can include a data processor 702 (e.g., a System-on-a-Chip (SoC), general processing core, graphics core, and optionally other processing logic) and a memory 704, which can communicate with each other via a bus or other data transfer system 706. The mobile computing and/or communication system 700 may further include various input/output (I/O) devices and/or interfaces 710, such as a touchscreen display, an audio jack, a voice interface, and optionally a network interface 712. In an example embodiment, the network interface 712 can include one or more radio transceivers configured for compatibility with any one or more standard wireless and/or cellular protocols or access technologies (e.g., 2nd (2G), 2.5G, 3rd (3G), 4th (4G) generation, and future generation radio access for cellular systems, Global System for Mobile communication (GSM), General Packet Radio Services (GPRS), Enhanced Data GSM Environment (EDGE), Wideband Code Division Multiple Access (WCDMA), LTE, CDMA2000, WLAN, Wireless Router (WR) mesh, and the like). Network interface 712 may also be configured for use with various other wired and/or wireless communication protocols, including TCP/IP, UDP, SIP, SMS, RTP, WAP, CDMA, TDMA, UMTS, UWB, WiFi, WiMax, Bluetooth®, IEEE 802.11x, and the like. In essence, network interface 712 may include or support virtually any wired and/or wireless communication and data processing mechanisms by which information/data may travel between a mobile computing and/or communication system 700 and another computing or communication system via network 714.
The memory 704 can represent a machine-readable medium on which is stored one or more sets of instructions, software, firmware, or other processing logic (e.g., logic 708) embodying any one or more of the methodologies or functions described and/or claimed herein. The logic 708, or a portion thereof, may also reside, completely or at least partially, within the processor 702 during execution thereof by the mobile computing and/or communication system 700. As such, the memory 704 and the processor 702 may also constitute machine-readable media. The logic 708, or a portion thereof, may also be configured as processing logic, at least a portion of which is implemented in hardware. The logic 708, or a portion thereof, may further be transmitted or received over a network 714 via the network interface 712. While the machine-readable medium of an example embodiment can be a single medium, the term “machine-readable medium” should be taken to include a single non-transitory medium or multiple non-transitory media (e.g., a centralized or distributed database, and/or associated caches and computing systems) that store the one or more sets of instructions. The term “machine-readable medium” can also be taken to include any non-transitory medium that is capable of storing, encoding or carrying a set of instructions for execution by the machine and that cause the machine to perform any one or more of the methodologies of the various embodiments, or that is capable of storing, encoding or carrying data structures utilized by or associated with such a set of instructions. The term “machine-readable medium” can accordingly be taken to include, but not be limited to, solid-state memories, optical media, and magnetic media.
The Abstract of the Disclosure is provided to comply with 37 C.F.R. §1.72(b), requiring an abstract that will allow the reader to quickly ascertain the nature of the technical disclosure. It is submitted with the understanding that it will not be used to interpret or limit the scope or meaning of the claims. In addition, in the foregoing Detailed Description, it can be seen that various features are grouped together in a single embodiment for the purpose of streamlining the disclosure. This method of disclosure is not to be interpreted as reflecting an intention that the claimed embodiments require more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive subject matter lies in less than all features of a single disclosed embodiment. Thus, the following claims are hereby incorporated into the Detailed Description, with each claim standing on its own as a separate embodiment.