This application claims priority to Indian Provisional Patent Application No. 202111025170, filed Jun. 7, 2021, the entire content of which is incorporated by reference herein.
The subject matter described herein relates generally to vehicle systems, and more particularly, embodiments of the subject matter relate to contextual speech recognition for interfacing with aircraft systems and related cockpit displays using air traffic control communications.
Air traffic control typically involves voice communications between air traffic control and a pilot or crewmember onboard the various aircraft within a controlled airspace. For example, an air traffic controller (ATC) may communicate an instruction or a request for pilot action by a particular aircraft using a call sign assigned to that aircraft, with a pilot or crewmember onboard that aircraft acknowledging the request (e.g., by reading back the received information) in a separate communication that also includes the call sign. As a result, the ATC can determine that the correct aircraft has acknowledged the request, that the request was correctly understood, what the pilot intends to do, etc.
Modern flight deck displays (or cockpit displays) are utilized to provide a number of different displays from which the user can obtain information or perform functions related to, for example, navigation, flight planning, guidance, and performance management. Modern displays also allow a pilot to input information, such as navigational clearances or commands issued by ATC. However, input of an incomplete and/or incorrect clearance can be consequential and antithetical to maintaining aircraft control. Accordingly, it is desirable to provide aircraft systems and methods that facilitate inputting ATC clearances or commands with improved accuracy. Other desirable features and characteristics of the methods and systems will become apparent from the subsequent detailed description and the appended claims, taken in conjunction with the accompanying drawings and the preceding background.
Methods and systems are provided for assisting operation of a vehicle, such as an aircraft, using speech recognition and related analysis of communications with respect to the vehicle. One method involves automatically identifying a parameter value for an operational subject based at least in part on an audio communication with respect to the vehicle, recognizing a received audio input as an input command with respect to the vehicle, determining a second operational subject associated with the input command, and automatically commanding a vehicle system associated with the operational subject to implement the parameter value identified from the audio communication when the second operational subject corresponds to the operational subject.
In another embodiment, a computer-readable medium having computer-executable instructions stored thereon is provided. The computer-executable instructions, when executed by a processing system, cause the processing system to automatically identify a parameter value for an operational subject based at least in part on a preceding audio communication with respect to a vehicle prior to receiving voice command audio input, recognize the voice command audio input as an input command including placeholder terminology, and automatically command a vehicle system to implement the input command utilizing the parameter value when the placeholder terminology corresponds to the operational subject.
In another embodiment, a system is provided that includes a communications system to receive an audio communication with respect to a vehicle, an audio input device to receive input voice command audio, and a processing system coupled to the communications system and the audio input device. The processing system is configurable to automatically identify a parameter value for an operational subject based at least in part on the audio communication prior to receiving the input voice command audio, recognize the input voice command audio as an input command including placeholder terminology, and automatically command a vehicle system to implement the input command using the parameter value when the placeholder terminology corresponds to the operational subject.
This summary is provided to describe select concepts in a simplified form that are further described in the detailed description. This summary is not intended to identify key or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.
Embodiments of the subject matter will hereinafter be described in conjunction with the following drawing figures, wherein like numerals denote like elements, and:
The following detailed description is merely exemplary in nature and is not intended to limit the subject matter of the application and uses thereof. Furthermore, there is no intention to be bound by any theory presented in the preceding background, brief summary, or the following detailed description.
Embodiments of the subject matter described herein generally relate to systems and methods that facilitate a vehicle operator providing an audio input to one or more displays or onboard systems using a contextual speech recognition graph that is influenced by clearance communications associated with different vehicles operating within a commonly controlled area. For purposes of explanation, the subject matter is primarily described herein in the context of aircraft operating in a controlled airspace; however, the subject matter described herein is not necessarily limited to aircraft or avionic environments, and in alternative embodiments, may be implemented in an equivalent manner for ground operations, marine operations, or otherwise in the context of other types of vehicles and travel spaces.
As described in greater detail below primarily in the context of
When an operational parameter value assigned to or defined for an operational subject is identified within an audio communication, an entry in a keyword mapping table is automatically created to establish an association or mapping between the operational parameter value and one or more words, terms and/or phrases that may be utilized to refer to the operational subject to which that parameter value pertains. Thereafter, when a received audio input command is recognized that includes a reference to the operational subject that matches or otherwise corresponds to the operational subject of the preceding audio communication from which the operational parameter value was derived, the keyword mapping table is utilized to implement the input command using the identified operational parameter value. For example, in some embodiments, the keyword mapping table may be utilized to augment the recognized audio input command by substituting the identified operational parameter value for the corresponding words, terms and/or phrases that were previously used to reference the operational subject. In yet other embodiments, the language model utilized by the speech recognition engine may be augmented or modified to include potential voice commands that utilize the different words, terms and/or phrases that may be utilized to refer to the operational subject in the input voice command rather than requiring the user to specify the parameter value, with the keyword mapping table being utilized to retrospectively interchange the oral reference to the operational subject with the previously identified parameter value.
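For purposes of illustration only, the following simplified sketch depicts one possible representation of such a keyword mapping table; the entry structure, field names, and helper function shown are hypothetical examples and not a required implementation.

```python
from dataclasses import dataclass, field

@dataclass
class KeywordMappingEntry:
    """Associates a parameter value identified from a preceding audio
    communication with the words, terms and/or phrases (placeholder
    terminology) that may be used to reference its operational subject."""
    operational_subject: str                 # e.g., "tower frequency"
    parameter_value: str                     # e.g., "112.1"
    placeholder_terms: list                  # e.g., ["ATIS frequency", "Deer Valley tower"]
    context_tags: dict = field(default_factory=dict)  # e.g., {"airspace": "KDVT"}

# Keyword mapping table keyed by normalized placeholder term.
keyword_mapping_table = {}

def add_mapping(entry):
    """Create one lookup key per placeholder term so that a recognized
    voice command containing that term can later be resolved to the
    previously identified parameter value."""
    for term in entry.placeholder_terms:
        keyword_mapping_table[term.lower()] = entry
```

In such a representation, a recognized voice command containing a mapped term (e.g., "ATIS frequency") could be resolved to the previously identified parameter value by a simple table lookup.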
For example, in one or more implementations, a speech recognition engine is implemented using two components, an acoustic model and a language model, where the language model is implemented as a finite state graph configurable to function as or otherwise support a finite state transducer. The acoustic scores from the acoustic model are utilized to compute probabilities for the different paths of the finite state graph, with the highest probability path being recognized as the desired user input that is output by the speech recognition engine to an onboard system. In this regard, the keyword mapping table may be utilized to augment the finite state graph to include different references to the operational subject, thereby accommodating a shorter, more concise voice command that can be subsequently mapped to a particular operational parameter value rather than requiring that the voice command contain the entire sequence of numbers, letters and/or other symbols to designate the operational parameter value. Thus, a contextually augmented speech recognition graph may be utilized to quickly and accurately recognize the received audio input as designating a previously defined or previously assigned operational parameter value for a particular operational subject without the delays or potential errors that could otherwise be associated with requiring longer voice commands that contain the entire sequence of numbers, letters and/or symbols to spell out the parameter value.
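For purposes of illustration only, the following simplified sketch shows how acoustic scores associated with the edges of a finite state graph could be accumulated along candidate paths, with the highest scoring path being recognized as the input voice command. The graph contents and scores are hypothetical and do not represent any particular acoustic or language model.

```python
# Hypothetical finite state graph: each node maps to outgoing edges of the
# form (next_node, word_or_phrase, acoustic_log_score). In practice the
# scores would be produced by an acoustic model evaluated against the
# received audio rather than hard-coded as shown here.
GRAPH = {
    "start": [("n1", "set", -0.2)],
    "n1":    [("n2", "com1", -0.3)],
    "n2":    [("end", "as atis frequency", -0.5),
              ("end", "one one two point one", -2.1)],
    "end":   [],
}

def best_path(graph, node="start", words=(), log_score=0.0):
    """Depth-first traversal returning the word sequence whose summed
    log score (i.e., path probability) is highest."""
    if not graph[node]:
        return words, log_score
    candidates = [
        best_path(graph, nxt, words + (word,), log_score + score)
        for nxt, word, score in graph[node]
    ]
    return max(candidates, key=lambda candidate: candidate[1])

recognized_words, recognized_score = best_path(GRAPH)
print(" ".join(recognized_words))   # set com1 as atis frequency
```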
In exemplary embodiments, the display device 102 is realized as an electronic display capable of graphically displaying flight information or other data associated with operation of the aircraft 120 under control of the display system 108 and/or processing system 106. In this regard, the display device 102 is coupled to the display system 108 and the processing system 106, and the processing system 106 and the display system 108 are cooperatively configured to display, render, or otherwise convey one or more graphical representations or images associated with operation of the aircraft 120 on the display device 102. The user input device 104 is coupled to the processing system 106, and the user input device 104 and the processing system 106 are cooperatively configured to allow a user (e.g., a pilot, co-pilot, or crew member) to interact with the display device 102 and/or other elements of the system 100, as described in greater detail below. Depending on the embodiment, the user input device(s) 104 may be realized as a keypad, touchpad, keyboard, mouse, touch panel (or touchscreen), joystick, knob, line select key or another suitable device adapted to receive input from a user. In some exemplary embodiments, the user input device 104 includes or is realized as an audio input device, such as a microphone, audio transducer, audio sensor, or the like, that is adapted to allow a user to provide audio input to the system 100 in a “hands free” manner using speech recognition.
The processing system 106 generally represents the hardware, software, and/or firmware components configured to facilitate communications and/or interaction between the elements of the system 100 and perform additional tasks and/or functions to support operation of the system 100, as described in greater detail below. Depending on the embodiment, the processing system 106 may be implemented or realized with a general purpose processor, a content addressable memory, a digital signal processor, an application specific integrated circuit, a field programmable gate array, any suitable programmable logic device, discrete gate or transistor logic, processing core, discrete hardware components, or any combination thereof, designed to perform the functions described herein. The processing system 106 may also be implemented as a combination of computing devices, e.g., a plurality of processing cores, a combination of a digital signal processor and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a digital signal processor core, or any other such configuration. In practice, the processing system 106 includes processing logic that may be configured to carry out the functions, techniques, and processing tasks associated with the operation of the system 100, as described in greater detail below. Furthermore, the steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in firmware, in a software module executed by the processing system 106, or in any practical combination thereof. For example, in one or more embodiments, the processing system 106 includes or otherwise accesses a data storage element (or memory), which may be realized as any sort of non-transitory short or long term storage media capable of storing programming instructions for execution by the processing system 106. The code or other computer-executable programming instructions, when read and executed by the processing system 106, cause the processing system 106 to support or otherwise perform certain tasks, operations, functions, and/or processes described herein.
The display system 108 generally represents the hardware, software, and/or firmware components configured to control the display and/or rendering of one or more navigational maps and/or other displays pertaining to operation of the aircraft 120 and/or onboard systems 110, 112, 114, 116 on the display device 102. In this regard, the display system 108 may access or include one or more databases suitably configured to support operations of the display system 108, such as, for example, a terrain database, an obstacle database, a navigational database, a geopolitical database, a terminal airspace database, a special use airspace database, or other information for rendering and/or displaying navigational maps and/or other content on the display device 102.
In the illustrated embodiment, the aircraft system 100 includes a data storage element 118, which contains aircraft procedure information (or instrument procedure information) for a plurality of airports and maintains an association between the aircraft procedure information and the corresponding airports. Depending on the embodiment, the data storage element 118 may be physically realized using RAM memory, ROM memory, flash memory, registers, a hard disk, or another suitable data storage medium known in the art or any suitable combination thereof. As used herein, aircraft procedure information should be understood as a set of operating parameters, constraints, or instructions associated with a particular aircraft action (e.g., approach, departure, arrival, climbing, and the like) that may be undertaken by the aircraft 120 at or in the vicinity of a particular airport. An airport should be understood as referring to any sort of location suitable for landing (or arrival) and/or takeoff (or departure) of an aircraft, such as, for example, airports, runways, landing strips, and other suitable landing and/or departure locations, and an aircraft action should be understood as referring to an approach (or landing), an arrival, a departure (or takeoff), an ascent, taxiing, or another aircraft action having associated aircraft procedure information. An airport may have one or more predefined aircraft procedures associated therewith, wherein the aircraft procedure information for each aircraft procedure at each respective airport is maintained by the data storage element 118 in association with one another.
Depending on the embodiment, the aircraft procedure information may be provided by or otherwise obtained from a governmental or regulatory organization, such as, for example, the Federal Aviation Administration in the United States. In an exemplary embodiment, the aircraft procedure information comprises instrument procedure information, such as instrument approach procedures, standard terminal arrival routes, instrument departure procedures, standard instrument departure routes, obstacle departure procedures, or the like, traditionally displayed on published charts, such as Instrument Approach Procedure (IAP) charts, Standard Terminal Arrival (STAR) charts or Terminal Arrival Area (TAA) charts, Standard Instrument Departure (SID) routes, Departure Procedures (DP), terminal procedures, approach plates, and the like. In exemplary embodiments, the data storage element 118 maintains associations between prescribed operating parameters, constraints, and the like and respective navigational reference points (e.g., waypoints, positional fixes, radio ground stations (VORs, VORTACs, TACANs, and the like), distance measuring equipment, non-directional beacons, or the like) defining the aircraft procedure, such as, for example, altitude minima or maxima, minimum and/or maximum speed constraints, RTA constraints, and the like. In this regard, although the subject matter may be described in the context of a particular procedure for purposes of explanation, the subject matter is not intended to be limited to use with any particular type of aircraft procedure and may be implemented for other aircraft procedures in an equivalent manner.
Still referring to
In exemplary embodiments, the processing system 106 is also coupled to the FMS 114, which is coupled to the navigation system 112, the communications system 110, and one or more additional avionics systems 116 to support navigation, flight planning, and other aircraft control functions in a conventional manner, as well as to provide real-time data and/or information regarding the operational status of the aircraft 120 to the processing system 106. Although
It should be understood that
The transcription system 202 generally represents the processing system or component of the contextual speech recognition system 200 that is coupled to the microphone 206 and communications system(s) 208 to receive or otherwise obtain clearance communications, analyze the audio content of the clearance communications, and transcribe the clearance communications, as described in greater detail below. The command system 204 generally represents the processing system or component of the contextual speech recognition system 200 that is coupled to the microphone 206 to receive or otherwise obtain voice commands, analyze the audio content of the voice commands, and output control signals to an appropriate onboard system 210 to effectuate the voice command, as described in greater detail below. In some embodiments, the transcription system 202 and the command system 204 are implemented separately using distinct hardware components, while in other embodiments, the features and/or functionality of the transcription system 202 and the command system 204 may be integrated and implemented using a common processing system (e.g., processing system 106). In this regard, the transcription system 202 and the command system 204 may be implemented using any sort of hardware, firmware, circuitry and/or logic components or combination thereof. In one or more exemplary embodiments, the transcription system 202 and the command system 204 are implemented as parts of the processing system 106 onboard the aircraft 120 of
The audio input device 206 generally represents any sort of microphone, audio transducer, audio sensor, or the like capable of receiving voice or speech input. In this regard, in one or more embodiments, the audio input device 206 is realized as a microphone (e.g., user input device 104) onboard the aircraft 120 to receive voice or speech annunciated by a pilot or other crewmember onboard the aircraft 120 inside the cockpit of the aircraft 120. The communications system(s) 208 (e.g., communications system 110) generally represent the avionics systems capable of receiving clearance communications from other external sources, such as, for example, other aircraft, an air traffic controller, or the like. Depending on the embodiment, the communications system(s) 208 could include one or more of a very high frequency (VHF) radio communications system, a controller-pilot data link communications (CPDLC) system, an aeronautical operational control (AOC) communications system, an aircraft communications addressing and reporting system (ACARS), and/or the like.
In exemplary embodiments, computer-executable programming instructions are executed by the processor, control module, or other hardware associated with the transcription system 202 and cause the transcription system 202 to generate, execute, or otherwise implement a clearance transcription application 220 capable of analyzing, parsing, or otherwise processing voice, speech, or other audio input received by the transcription system 202 to convert the received audio into a corresponding textual representation. In this regard, the clearance transcription application 220 may implement or otherwise support a speech recognition engine (or voice recognition engine) or other speech-to-text system. Accordingly, the transcription system 202 may also include various filters, analog-to-digital converters (ADCs), or the like, and the transcription system 202 may include or otherwise access a data storage element 224 (or memory) that stores a speech recognition vocabulary for use by the clearance transcription application 220 in converting audio inputs into transcribed textual representations. In one or more embodiments, the clearance transcription application 220 may also mark, tag, or otherwise associate a transcribed textual representation of a clearance communication with an identifier or other indicia of the source of the clearance communication (e.g., the onboard microphone 206, a radio communications system 208, or the like).
In exemplary embodiments, the computer-executable programming instructions executed by the transcription system 202 also cause the transcription system 202 to generate, execute, or otherwise implement a clearance table generation application 222 (or clearance table generator) that receives the transcribed textual clearance communications from the clearance transcription application 220 or receives clearance communications in textual form directly from a communications system 208 (e.g., a CPDLC system). The clearance table generator 222 parses or otherwise analyzes the textual representation of the received clearance communications and generates corresponding clearance communication entries in a table 226 in the memory 224. In this regard, the clearance table 226 maintains all of the clearance communications received by the transcription system 202 from either the onboard microphone 206 or an onboard communications system 208.
In exemplary embodiments, for each clearance communication received by the clearance table generator 222, the clearance table generator 222 parses or otherwise analyzes the textual content of the clearance communication using natural language processing and attempts to extract or otherwise identify, if present, one or more of an identifier contained within the clearance communication (e.g., a flight identifier, call sign, or the like), an operational subject of the clearance communication (e.g., a runway, a taxiway, a waypoint, a heading, an altitude, a flight level, or the like), an operational parameter value associated with the operational subject in the clearance communication (e.g., the runway identifier, taxiway identifier, waypoint identifier, heading angle, altitude value, or the like), and/or an action associated with the clearance communication (e.g., landing, takeoff, pushback, hold, or the like). The clearance table generator 222 also identifies the radio frequency or communications channel associated with the clearance communication and attempts to identify or otherwise determine the source of the clearance communication. The clearance table generator 222 then creates or otherwise generates an entry in the clearance table 226 that maintains an association between the textual content of the clearance communication and the identified fields associated with the clearance communication. Additionally, the clearance table generator 222 may analyze the new clearance communication entry relative to existing clearance communication entries in the clearance table 226 to identify or otherwise determine a conversational context to be assigned to the new clearance communication entry.
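As a simplified illustration of this parsing, the following sketch uses rule-based pattern matching in place of a full natural language processing pipeline; the regular-expression patterns, field names, and example communication are hypothetical and provided only to show how the fields described above could be extracted into a clearance table entry.

```python
import re

def parse_clearance(text, channel, source):
    """Extract, when present, the call sign, operational subject,
    operational parameter value, and action from the textual content of a
    clearance communication (simplified rule-based stand-in for NLP)."""
    entry = {"text": text, "channel": channel, "source": source,
             "call_sign": None, "subject": None, "value": None, "action": None}
    call_sign = re.search(r"\b([A-Z]{2,3}\d{2,4})\b", text)
    if call_sign:
        entry["call_sign"] = call_sign.group(1)
    hold = re.search(r"\bHOLD AT WAYPOINT (\w+)", text, re.IGNORECASE)
    if hold:
        entry.update(action="hold", subject="waypoint", value=hold.group(1).upper())
    runway = re.search(r"\bRUNWAY (\d{1,2}[LRC]?)\b", text, re.IGNORECASE)
    if runway:
        entry.update(subject="runway", value=runway.group(1).upper())
    return entry

clearance_table = []
clearance_table.append(
    parse_clearance("AAL123 HOLD AT WAYPOINT GILA", channel="121.9", source="ATC"))
# -> {'call_sign': 'AAL123', 'subject': 'waypoint', 'value': 'GILA', 'action': 'hold', ...}
```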
Still referring to
In exemplary embodiments, the processor, control module, or other hardware associated with the command system 204 executes computer-executable programming instructions that cause the command system 204 to generate, execute, or otherwise implement a vocabulary generation application 242 (or vocabulary generator) that is capable of dynamically adjusting the search space for the language model for the command recognition application 240 to reflect the current conversational context. In the illustrated embodiment, the vocabulary generation application 242 generates or otherwise constructs a keyword mapping table 250 based on analysis of the transcribed clearance communications associated with the aircraft in the clearance table 226. In this regard, entries in the keyword mapping table 250 are utilized to establish and maintain associations between parameter values identified within a transcription of a preceding audio communication and the corresponding words, terms and/or phrases that may be utilized to invoke a particular parameter value by reference. For example, based on the operational subject associated with an identified parameter value, the vocabulary generation application 242 may utilize a command vocabulary 246 to identify the potential words, terms and/or phrases that may be utilized to set or configure the parameter associated with the operational subject, and then update the keyword mapping table 250 to maintain associations between those words, terms and/or phrases and the parameter value identified from a preceding audio communication.
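By way of a simplified, non-limiting sketch, keyword mappings could be derived from a clearance table entry and a command vocabulary as follows; the vocabulary contents, field names, and example entry are hypothetical.

```python
# Hypothetical command vocabulary: for each operational subject, the
# candidate words, terms and/or phrases that a voice command may use to
# reference the corresponding parameter.
COMMAND_VOCABULARY = {
    "tower frequency": ["tower frequency", "ATIS frequency", "{airport} tower"],
    "waypoint":        ["ATC cleared waypoint", "ATC hold waypoint"],
}

def generate_keyword_mappings(clearance_entry):
    """Map each candidate placeholder term for the entry's operational
    subject to the parameter value identified from the preceding
    communication (e.g., 'ATIS frequency' -> '112.1')."""
    subject = clearance_entry["subject"]
    value = clearance_entry["value"]
    airport = clearance_entry.get("airport", "")
    mappings = {}
    for template in COMMAND_VOCABULARY.get(subject, []):
        term = template.format(airport=airport).strip().lower()
        mappings[term] = value
    return mappings

atis_entry = {"subject": "tower frequency", "value": "112.1",
              "airport": "Deer Valley", "source": "ATIS"}
print(generate_keyword_mappings(atis_entry))
# {'tower frequency': '112.1', 'atis frequency': '112.1', 'deer valley tower': '112.1'}
```

The resulting mappings correspond to the entries maintained in the keyword mapping table 250 and referenced during subsequent voice command recognition.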
In the illustrated embodiment, the vocabulary generation application 242 also generates or otherwise constructs a recognition graph data structure 260 from the command vocabulary 246, where a path (or sequence of nodes and edges) of the recognition graph data structure 260 corresponds to a particular voice command to be implemented by or at an onboard system 210. In some embodiments, after a received voice command audio is probabilistically mapped or recognized to a particular path of the recognition graph data structure 260 that has the highest probability of matching the voice command audio, the vocabulary generation application 242 and/or the voice command recognition application 240 analyzes the recognized voice command using the keyword mapping table 250 to identify any words, terms and/or phrases capable of invoking a particular parameter value by reference. When the recognized voice command includes reference to an operational subject having an assigned or defined parameter value derived from a preceding audio communication, the vocabulary generation application 242 and/or the voice command recognition application 240 may augment or otherwise modify the recognized voice command to include or otherwise incorporate the mapped parameter value for the operational subject (e.g., by substituting the mapped parameter value for the reference to the operational subject).
In some embodiments, the vocabulary generation application 242 may be utilized to generate or otherwise construct a contextual recognition graph data structure 260 by utilizing the keyword mapping table 250 to add and/or remove potential paths to the contextual recognition graph data structure 260 in order to support a voice command using the words, terms and/or phrases associated with a particular operational parameter to invoke the mapped parameter value rather than requiring a fixed voice command grammar that includes the desired parameter value. In this regard, in some embodiments, the entries in the keyword mapping table 250 may be tagged with contextual information to support dynamically varying the recognition graph data structure 260 based on the current operational context of the aircraft. For example, an entry in the keyword mapping table 250 may be tagged with or otherwise maintain the flight phase, airspace and/or other operational context information at the time when the respective audio communication was received, which, in turn, may be utilized by the vocabulary generation application 242 and/or the voice command recognition application 240 to limit the applicability of the entry in the keyword mapping table 250 based on the current operating context. Thus, when the aircraft changes flight phases, exits the previous airspace and/or begins operating in another airspace, and/or the like, the vocabulary generation application 242 may dynamically update the recognition graph data structure 260 to remove potential nodes, edges and/or paths that correspond to a keyword mapping that is stale or no longer relevant to the current operating context.
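The following simplified sketch illustrates one way such context tags could be used to inactivate stale mappings when the operating context changes; the tag names, entries, and function are hypothetical examples only.

```python
# Hypothetical mapping entries tagged with the operating context captured
# when the corresponding audio communication was received.
keyword_mappings = [
    {"term": "atis frequency", "value": "112.1",
     "context": {"airspace": "KDVT", "flight_phase": "taxi"}},
    {"term": "atc cleared waypoint", "value": "GILA",
     "context": {"airspace": "KDVT", "flight_phase": "departure"}},
]

def prune_stale_mappings(mappings, current_context):
    """Retain only entries whose tagged context matches the aircraft's
    current operating context, so that stale mappings can no longer be
    invoked by subsequent voice commands (and the corresponding nodes,
    edges and/or paths can be removed from the recognition graph)."""
    return [entry for entry in mappings
            if all(entry["context"].get(key) == value
                   for key, value in current_context.items())]

# After the aircraft exits the KDVT airspace, neither mapping remains active.
active_mappings = prune_stale_mappings(keyword_mappings, {"airspace": "KPHX"})
print(active_mappings)   # []
```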
To generate the keyword mapping table 250, the vocabulary generator 242 analyzes the sequence of transcribed clearance communications associated with the aircraft in the clearance table 226 (e.g., using an identifier associated with the ownship aircraft) to ascertain the operational subject and corresponding operational parameter value associated with assignments received from the ATC, broadcasts received from the automatic terminal information service (ATIS), and/or requests or acknowledgments provided by the pilot. For example, the communications system 208 may receive an ATIS broadcast identifying the tower frequency for the Phoenix Deer Valley airport as the frequency channel 112.1 (e.g., “DEER VALLEY TOWER INFORMATION DELTA ONE ONE TWO ONE”). The clearance table generator 222 creates a corresponding entry in the clearance table 226 that includes a transcription of the ATIS broadcast and identifies the parameter value for the airport operational subject as KDVT, the parameter value for the airport tower frequency operational subject as the frequency channel 112.1, and the ATIS broadcast as the source of the communication. Thereafter, the vocabulary generator 242 analyzes the entry for the transcribed ATIS broadcast communication in the clearance table 226 to identify frequency channel 112.1 as the assigned or defined value for the tower frequency and KDVT as the assigned or defined value for the airport. The vocabulary generator 242 creates or otherwise instantiates an entry in the keyword mapping table 250 for the 112.1 frequency channel and then utilizes the associated airport (KDVT) and source of the preceding communication (ATIS) in concert with the command vocabulary 246 to identify potential words, terms and/or phrases capable of functioning as keywords or placeholder terminology when referring to the assigned KDVT tower frequency (e.g., “Deer Valley tower,” “KDVT tower,” “ATIS tower,” “ATIS frequency,” and/or the like) to be associated with the 112.1 frequency channel entry. Additionally, the vocabulary generator 242 may tag or otherwise associate the 112.1 frequency channel entry in the keyword mapping table 250 with the KDVT terminal area and/or the airspace that the aircraft is operating in at the time of receipt of the ATIS broadcast, the current flight phase of the aircraft, and/or the like.
After creating the 112.1 frequency channel entry in the keyword mapping table 250, the pilot or other operator of the aircraft may utilize one of the potential combinations of words, terms and/or phrases when providing a voice command to set a radio frequency channel of a communications system 110 onboard the aircraft to the 112.1 frequency channel rather than orally spelling out or enunciating the desired radio frequency. In this regard, in some embodiments, the vocabulary generator 242 may dynamically determine a contextual recognition graph data structure 260 that leverages the keyword mapping table 250 to include nodes at different levels of the finite state directed graph data structure 260 that allow the pilot to use one of the words, terms and/or phrases in the keyword mapping table 250 (e.g., “tower” or “frequency”) as an alternative to spelling out the individual numerical values and decimal separator for the desired frequency. For example, the pilot may provide an oral voice command string of “Set COM1 as tower frequency,” “Set COM1 as ATIS tower,” or “Set COM1 as Deer Valley tower,” or the like, rather than “Set COM1 one one two point one.” After recognizing or otherwise resolving audio input from the microphone 206 to a particular input voice command that includes a reference to the assigned tower frequency provided by the preceding ATIS broadcast, the command system 204 and/or the voice command recognition application 240 utilizes the keyword mapping table 250 to substitute, incorporate, or otherwise include the assigned frequency channel of 112.1 in the control signals or other indicia of the recognized command input that are provided to the communications system 110, 210 to set the COM1 radio to the 112.1 frequency channel. In this manner, the communications system 110, 210 may be automatically commanded to tune the frequency of a communications radio to an assigned radio frequency channel previously communicated by ATC, ATIS, or the like using a voice command audio input that does not include the assigned radio frequency channel.
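As a simplified, non-limiting sketch of the foregoing example, a recognized voice command containing placeholder terminology could be resolved into control signals that carry the previously communicated frequency channel as follows; the mapping contents, function name, and output format are hypothetical.

```python
# Hypothetical keyword mappings derived from the preceding ATIS broadcast.
KEYWORD_MAPPINGS = {
    "atis tower": "112.1",
    "deer valley tower": "112.1",
    "tower frequency": "112.1",
}

def resolve_radio_command(recognized_command):
    """Substitute mapped placeholder terminology with the parameter value
    derived from the preceding audio communication, then build an
    indication of the recognized command for the destination radio."""
    text = recognized_command.lower()
    for term, value in KEYWORD_MAPPINGS.items():
        if term in text:
            text = text.replace(term, value)
            break
    radio = "COM1" if "com1" in text else "COM2"
    frequency = text.rsplit(" ", 1)[-1]        # e.g., "112.1"
    return {"system": "communications", "radio": radio, "frequency": frequency}

print(resolve_radio_command("Set COM1 as ATIS tower"))
# {'system': 'communications', 'radio': 'COM1', 'frequency': '112.1'}
```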
By virtue of the keyword mapping table 250, the voice command audio input provided by the pilot or other aircraft operator does not need to include the particular radio frequency channel assignment, thereby reducing the mental burden on the pilot of remembering and repeating the entire radio frequency string when providing the voice command, which may improve situational awareness. Additionally, the total number of nodes or levels of the graph data structure 260 that need to be searched to identify the received voice command may be reduced (e.g., from 5 nodes required to spell out “one one two point one” to 2 nodes for “ATIS frequency”), thereby improving accuracy and/or response time, which, in turn, improves the user experience and may also facilitate improved situational awareness with respect to flying the aircraft.
As another example, the ATC may communicate for an aircraft to “Hold at waypoint GILA,” which results in the clearance table generator 222 creating a corresponding entry in the clearance table 226 that includes a transcription of the ATC clearance communication and identifies the parameter value for the next hold point operational subject as the GILA waypoint and the ATC as the source of the communication. Thereafter, the vocabulary generator 242 analyzes the entry for the transcribed ATC clearance communication in the clearance table 226 to identify GILA as the defined waypoint value for the next hold point. The vocabulary generator 242 creates or otherwise instantiates an entry in the keyword mapping table 250 for the GILA waypoint and then utilizes the source of the preceding communication (ATC) in concert with the command vocabulary 246 to identify potential words, terms and/or phrases capable of functioning as keywords or placeholder terminology when referring to the waypoint specified by ATC (e.g., “ATC cleared waypoint,” “ATC hold waypoint,” and/or the like) to be associated with the GILA waypoint entry.
Thereafter, the pilot or other operator of the aircraft may utilize one of the potential placeholder terms when providing a voice command to configure the FMS 114 or other onboard system 210 to insert a hold within the flight plan at the GILA waypoint rather than articulating or enunciating the specified waypoint. For example, the pilot may provide an oral voice command string of “Hold at ATC cleared waypoint,” which may be recognized as including a reference to the designated waypoint provided by a preceding ATC clearance communication. The command system 204 and/or the voice command recognition application 240 utilizes the keyword mapping table 250 to designate, incorporate, or otherwise include the GILA waypoint in the control signals or other indicia of the recognized command input that are provided to the FMS 114 and/or other onboard system 210 to insert a temporary hold at the GILA waypoint. In this manner, the FMS 114 and/or other onboard system 210 may be automatically commanded to set a waypoint or other navigational reference point to a particular name or identifier from an assignment previously communicated by ATC, ATIS, or the like (e.g., by updating or modifying a flight plan to include or otherwise traverse the designated waypoint) using a voice command audio input that does not include the assigned waypoint name or identifier.
It should be noted that
Referring to
After identifying assigned, defined, or otherwise designated values for a particular operational parameter, the contextual mapping process 300 automatically establishes or otherwise creates one or more mappings between that operational parameter value and the keywords or placeholder terminology that may be utilized to invoke or incorporate the operational parameter value by reference (task 304). For example, as described above, based on the identified operational parameter and/or the operational subject associated therewith, a command vocabulary 246 may be analyzed to select or otherwise identify a subset of potential words, terms and/or phrases that may be utilized as a keyword or placeholder for the parameter value. An entry in a keyword mapping table 250 is then created to maintain an association between the operational parameter value derived from the preceding audio communication(s) or conversational context and the potential keywords or placeholder terms that are likely to be used by a pilot or other operator to reference that parameter.
After establishing keyword mappings for an operational parameter value, the contextual mapping process 300 continues by recognizing received voice command audio as including a keyword or other placeholder terminology that is mapped to that operational parameter value and automatically augmenting or otherwise modifying the recognized voice command to include the operational parameter value derived from preceding audio communication(s) based on the mapping before providing output signals corresponding to the augmented recognized command to the appropriate onboard system(s) (tasks 306, 308, 310). For example, when a pilot manipulates the user input device 104 to indicate a desire to provide a voice command or otherwise initiate provisioning a voice command, the command system 204 and/or the command recognition application 240 resolves or otherwise recognizes the voice command audio subsequently received via the microphone 206 to a particular path of the recognition graph data structure 260. Once the voice command audio is probabilistically mapped or recognized to a particular path of the recognition graph data structure 260 having the highest probability of matching the voice command audio (e.g., using speech-to-text recognition), the voice command recognition application 240 may utilize the keyword mapping table 250 to scan or otherwise analyze the content of the recognized voice command to identify any keywords or placeholders within the recognized voice command that have been mapped to a particular operational parameter value.
When the voice command recognition application 240 identifies placeholder terminology that corresponds to an established keyword mapping for a predefined operational parameter value derived from a preceding audio communication, the voice command recognition application 240 may automatically augment or otherwise modify the recognized voice command to include that operational parameter value as a commanded parameter value to be associated with the voice command in lieu of or in addition to the placeholder terminology. Thereafter, the voice command recognition application 240 may map, translate, or otherwise convert the augmented recognized voice command including the operational parameter value derived from a preceding audio communication into a corresponding command for one or more destination onboard system(s) 210 to implement or otherwise execute the commanded operational parameter value derived from the preceding audio communication. In this manner, a pilot or other vehicle operator may use keywords or placeholder terminology in voice commands to provide a previously communicated parameter value as a commanded value to an onboard system 210 without having to articulate, enunciate or even remember the exact value for the operational parameter that was previously communicated.
For example, in one embodiment, a template-based pattern matching approach (or template matching) is utilized to identify sets of keywords or key values that may be utilized to establish mappings using the format or syntax of commands supported by the command vocabulary 246. In this regard, natural language processing and template matching may be applied to historical ATC conversations and corresponding pilot inputs or actions or other reference data to derive key-value pairs using pattern matching by tagging parts of speech. For example, for a reference voice command of “HOLD AT WAYPOINT AFRIC,” natural language processing may identify that the intent of the command is to hold at waypoint AFRIC, where the word “HOLD” is tagged or matched to the action word in the command, “WAYPOINT” is tagged or matched to the operational subject of the command (e.g., the place where the hold action applies), and “AFRIC” is tagged or matched to the operational parameter value (e.g., the waypoint identifier) for the operational subject of the command (e.g., the name of the place where the hold action applies). Thereafter, when received voice command audio contains the phrase “HOLD AT ATC WAYPOINT,” the contextual mapping process 300 identifies the term “ATC WAYPOINT” as a placeholder for the waypoint identifier for the subject waypoint of the hold action. Natural language processing or the like may be performed on the placeholder terminology (e.g., “ATC WAYPOINT”) to identify that the value should be mapped from a preceding communication from the ATC, and based thereon, the contextual mapping process 300 may identify the specified value for the waypoint identifier (e.g., AFRIC) from the transcription of a preceding communication from the ATC and then augment the received voice command to include the specified value for the waypoint identifier in lieu of the placeholder terminology (e.g., “HOLD AT AFRIC”). Thereafter, the voice command recognition application 240 may generate a corresponding command for implementing the hold action at the specified waypoint (e.g., AFRIC) and provide the command to one or more destination onboard system(s) 210 to implement or otherwise execute the hold action using the waypoint identifier derived from a transcription of a preceding ATC audio communication.
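For purposes of illustration, the template matching and placeholder substitution described above could be sketched as follows; the regular-expression template, function name, and transcript contents are hypothetical simplifications of the natural language processing that would be applied in practice.

```python
import re

# Hypothetical template for the hold command: action word, operational
# subject (which may be placeholder terminology), and optional value slot.
HOLD_TEMPLATE = re.compile(
    r"HOLD AT (?P<subject>ATC WAYPOINT|WAYPOINT)\s*(?P<value>\w+)?", re.IGNORECASE)

def augment_hold_command(command, atc_transcripts):
    """If the recognized command uses placeholder terminology (e.g.,
    'ATC WAYPOINT'), look up the waypoint identifier from the most recent
    matching ATC transcription and substitute it for the placeholder."""
    match = HOLD_TEMPLATE.search(command)
    if not match:
        return command
    if match.group("subject").upper() == "ATC WAYPOINT":
        for transcript in reversed(atc_transcripts):       # most recent first
            assigned = re.search(r"HOLD AT WAYPOINT (\w+)", transcript, re.IGNORECASE)
            if assigned:
                return "HOLD AT " + assigned.group(1).upper()
    return command

print(augment_hold_command("HOLD AT ATC WAYPOINT",
                           ["AAL123 HOLD AT WAYPOINT AFRIC"]))
# HOLD AT AFRIC
```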
In practice, the contextual mapping process 300 may repeat throughout operation to dynamically update the keyword mappings to reflect more recent audio communications or changes to the operating context. For example, when the aircraft exits the KDVT airspace or terminal area, the vocabulary generation application 242 may dynamically update the keyword mapping table 250 to remove or inactivate the KDVT tower frequency entry for invoking the 112.1 frequency channel based on the difference between the current aircraft operating context and the operating context associated with the 112.1 frequency channel entry. In a similar manner, when a more recent communication is received that includes a different tower frequency that conflicts with an existing or previous entry in the keyword mapping table 250, the vocabulary generation application 242 may dynamically update the keyword mapping table 250 to remove or inactivate that existing entry in concert with creating a new entry in the keyword mapping table 250 that reflects the more recently communicated tower frequency.
Referring to
When the command recognition application 240 recognizes a voice command that includes a generic keyword rather than a specific value, the command recognition application 240 may then query the one-hot decoding supported by the keyword mapping table 250 to obtain the particular operational parameter value that has been previously defined, assigned or otherwise designated for that particular keyword and thereby utilize the keyword mapping table 250 to augment the voice command to include or otherwise incorporate that specific operational parameter value derived from a preceding audio communication. In this regard, the command recognition graph data structure 260 and/or the command vocabulary 246 may be configured to achieve a faster response time and/or higher accuracy using more generic keywords or placeholders rather than being constrained to commands that require orally spelling out specific parameter values. Rather than requiring acoustic or language models that are pretrained or otherwise preconfigured to map different keywords, placeholder terms, or other indirect references to specific values, which could increase the size of the models and recognition graphs and undesirably delay response time, the one-hot decoding using the keyword mapping table 250 allows the command recognition graph data structure 260 and/or the command vocabulary 246 to be designed to merely recognize simpler voice commands, which can then be augmented and mapped to specific values by querying the keyword mapping table 250 that is dynamically varied or adapted in real-time to reflect the current operating context and current conversational context.
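A minimal sketch of this one-hot style lookup, in which each generic keyword resolves to exactly one currently active parameter value, might take the following form; the table contents and function names are hypothetical.

```python
# Hypothetical keyword mapping table in which each generic keyword is
# "hot" for exactly one currently assigned parameter value at a time.
ACTIVE_MAPPINGS = {
    "tower frequency": "112.1",        # from the most recent ATIS broadcast
    "atc cleared waypoint": "GILA",    # from the most recent ATC clearance
}

def decode_keyword(keyword):
    """Return the single active parameter value designated for the keyword,
    or None if no preceding communication has assigned one."""
    return ACTIVE_MAPPINGS.get(keyword.lower())

def update_mapping(keyword, value):
    """A more recent communication replaces the previously active value,
    preserving a one-to-one association per keyword."""
    ACTIVE_MAPPINGS[keyword.lower()] = value

update_mapping("tower frequency", "118.4")   # hypothetical newer assignment
print(decode_keyword("TOWER FREQUENCY"))     # 118.4
```

Because the table itself is updated in real time as new communications are transcribed, the recognition graph and command vocabulary can remain small while still resolving generic keywords to current, context-specific values.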
To briefly summarize, by utilizing multiple speech engines (e.g., clearance transcription and command recognition) and leveraging the current context of the ATC clearance communications to establish mappings between previously communicated parameter values and potential keywords or placeholder terminology that may be utilized to refer to those parameter values, a pilot or other aircraft operator can more conveniently command onboard systems to effectuate the previously communicated parameter values using voice commands without having to remember or enunciate the exact values. This improves the user experience while also reducing workload, thereby improving situational awareness. Additionally, leveraging keywords or placeholders rather than relying on longer strings of numerical values, decimal separators and/or the like may also reduce response time and improve accuracy, thereby further improving the user experience and usability.
For the sake of brevity, conventional techniques related to user interfaces, speech recognition, avionics systems, and other functional aspects of the systems (and the individual operating components of the systems) may not be described in detail herein. Furthermore, the connecting lines shown in the various figures contained herein are intended to represent exemplary functional relationships and/or physical couplings between the various elements. It should be noted that many alternative or additional functional relationships or physical connections may be present in an embodiment of the subject matter.
The subject matter may be described herein in terms of functional and/or logical block components, and with reference to symbolic representations of operations, processing tasks, and functions that may be performed by various computing components or devices. It should be appreciated that the various block components shown in the figures may be realized by any number of hardware components configured to perform the specified functions. For example, an embodiment of a system or a component may employ various integrated circuit components, e.g., memory elements, digital signal processing elements, logic elements, look-up tables, or the like, which may carry out a variety of functions under the control of one or more microprocessors or other control devices. Furthermore, embodiments of the subject matter described herein can be stored on, encoded on, or otherwise embodied by any suitable non-transitory computer-readable medium as computer-executable instructions or data stored thereon that, when executed (e.g., by a processing system), facilitate the processes described above.
The foregoing description refers to elements or nodes or features being “coupled” together. As used herein, unless expressly stated otherwise, “coupled” means that one element/node/feature is directly or indirectly joined to (or directly or indirectly communicates with) another element/node/feature, and not necessarily mechanically. Thus, although the drawings may depict one exemplary arrangement of elements directly connected to one another, additional intervening elements, devices, features, or components may be present in an embodiment of the depicted subject matter. In addition, certain terminology may also be used herein for the purpose of reference only, and thus are not intended to be limiting.
The foregoing detailed description is merely exemplary in nature and is not intended to limit the subject matter of the application and uses thereof. Furthermore, there is no intention to be bound by any theory presented in the preceding background, brief summary, or the detailed description.
While at least one exemplary embodiment has been presented in the foregoing detailed description, it should be appreciated that a vast number of variations exist. It should also be appreciated that the exemplary embodiment or exemplary embodiments are only examples, and are not intended to limit the scope, applicability, or configuration of the subject matter in any way. Rather, the foregoing detailed description will provide those skilled in the art with a convenient road map for implementing an exemplary embodiment of the subject matter. It should be understood that various changes may be made in the function and arrangement of elements described in an exemplary embodiment without departing from the scope of the subject matter as set forth in the appended claims. Accordingly, details of the exemplary embodiments or other limitations described above should not be read into the claims absent a clear intention to the contrary.
Number | Date | Country | Kind |
---|---|---|---
202111025170 | Jun 2021 | IN | national |