Drivers and pedestrians can communicate using non-verbal methods to negotiate safe passage, for example, at a traffic junction having a pedestrian crossing. However, it can be difficult to accurately understand non-verbal communication from both pedestrians and drivers. Additionally, pedestrians lack a reliable and accurate way to interact with autonomous vehicles (AVs) or swarms of cooperative vehicles. Pedestrians can be unaware that a lack of communication has occurred despite road user detection and classification. This contributes to pedestrians' fear of AVs and impedes trust, which is one of the major hurdles to mass adoption. Reliable pedestrian assistance for safely interacting with vehicles at a traffic junction will improve pedestrian and traffic flow as well as increase trust and certainty in AVs and swarms of cooperative vehicles.
According to one aspect, a system for assisting road agents including a first road agent and a second road agent includes connected devices and a processor operably connected for computer communication to the connected devices. The connected devices are devices in proximity to a traffic junction and capture sensor data about the road agents and the traffic junction. The processor is configured to receive an invocation input including a desired action to be executed at the traffic junction. The processor is also configured to manage interactions between the road agents to coordinate execution of the desired action by converting human-readable medium to vehicle-readable medium in a back-and-forth manner. Further, the processor is configured to receive a cooperation acceptance input from the second road agent indicating an acceptance to coordinate execution of the desired action or a non-acceptance to coordinate execution of the desired action, and transmit a response output invoking the desired action based on the cooperation acceptance input.
According to another aspect, a computer-implemented method for assisting road agents at a traffic junction, where the road agents include at least a first road agent and a second road agent, includes receiving sensor data from one or more connected devices in proximity to the traffic junction. The sensor data includes an invocation input with a desired action to be executed at the traffic junction by the first road agent. The method includes managing interactions between the first road agent and the second road agent based on the sensor data and the desired action including converting interactions from human-readable medium to machine-readable medium and vice versa. The method also includes receiving a cooperation acceptance input from the second road agent indicating an agreement to execute a cooperation action thereby allowing execution of the desired action by the first road agent. Furthermore, the method includes transmitting a response output to the one or more connected devices, wherein the response output includes instructions to invoke the desired action.
According to a further aspect, a non-transitory computer-readable medium comprises computer-executable program instructions that, when executed by one or more processors, configure the one or more processors to perform operations including receiving an invocation input including a desired action to be executed by a first road agent at a traffic junction. The operations also include receiving sensor data associated with the invocation input and the desired action, and translating human-readable medium to vehicle-readable medium in a back-and-forth manner between the first road agent and a second road agent to coordinate execution of the desired action. The operations also include receiving a cooperation acceptance input from the second road agent indicating an acceptance to coordinate execution of the desired action or a non-acceptance to coordinate execution of the desired action. Further, the operations include transmitting a response output invoking the desired action based on the cooperation acceptance input.
The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate various systems, methods, devices, and other embodiments of the disclosure. It will be appreciated that the illustrated element boundaries (e.g., boxes, groups of boxes, directional lines, or other shapes) in the figures represent one embodiment of the boundaries. In some embodiments, one element may be designed as multiple elements, or multiple elements may be designed as one element. In some embodiments, an element shown as an internal component of another element may be implemented as an external component and vice versa. Furthermore, elements may not be drawn to scale.
The following includes definitions of selected terms employed herein. The definitions include various examples and/or forms of components that fall within the scope of a term and that may be used for implementation. The examples are not intended to be limiting. Further, the components discussed herein may be combined, omitted, or organized with other components or into different architectures.
“Bus,” as used herein, refers to an interconnected architecture that is operably connected to other computer components inside a computer or between computers. The bus may transfer data between the computer components. The bus may be a memory bus, a memory processor, a peripheral bus, an external bus, a crossbar switch, and/or a local bus, among others. The bus may also be a vehicle bus that interconnects components inside a vehicle using protocols such as Media Oriented Systems Transport (MOST), Controller Area Network (CAN), and Local Interconnect Network (LIN), among others.
“Component,” as used herein, refers to a computer-related entity (e.g., hardware, firmware, instructions in execution, combinations thereof). Computer components may include, for example, a process running on a processor, a processor, an object, an executable, a thread of execution, and a computer. A computer component(s) may reside within a process and/or thread. A computer component may be localized on one computer and/or may be distributed between multiple computers.
“Computer communication,” as used herein, refers to a communication between two or more computing devices (e.g., computer, personal digital assistant, cellular telephone, network device, vehicle, vehicle computing device, infrastructure device, roadside device) and may be, for example, a network transfer, a data transfer, a file transfer, an applet transfer, an email, a hypertext transfer protocol (HTTP) transfer, and so on. A computer communication may occur across any type of wired or wireless system and/or network having any type of configuration, for example, a local area network (LAN), a personal area network (PAN), a wireless personal area network (WPAN), a wireless area network, a wide area network (WAN), a metropolitan area network (MAN), a virtual private network (VPN), a cellular network, a token ring network, a point-to-point network, an ad hoc network, a mobile ad hoc network, a vehicular ad hoc network (VANET), a vehicle-to-vehicle (V2V) network, a vehicle-to-everything (V2X) network, a vehicle-to-infrastructure (V2I) network, among others. Computer communication may utilize any type of wired, wireless, or network communication protocol including, but not limited to, Ethernet (e.g., IEEE 802.3), WiFi (e.g., IEEE 802.11), communications access for land mobiles (CALM), WiMax, Bluetooth, Zigbee, ultra-wideband (UWB), multiple-input and multiple-output (MIMO), telecommunications and/or cellular network communication (e.g., SMS, MMS, 3G, 4G, LTE, 5G, GSM, CDMA, WAVE), satellite, dedicated short range communication (DSRC), among others.
“Computer-readable medium,” as used herein, refers to a non-transitory medium that stores instructions, algorithms, and/or data configured to perform one or more of the disclosed functions when executed. A computer-readable medium may take forms, including, but not limited to, non-volatile media and volatile media. Non-volatile media may include, for example, optical disks, magnetic disks, and so on. Volatile media may include, for example, semiconductor memories, dynamic memory, and so on. Computer-readable medium can include, but is not limited to, a floppy disk, a flexible disk, a hard disk, a magnetic tape, other magnetic medium, an application specific integrated circuit (ASIC), a programmable logic device, a compact disk (CD), other optical medium, a random access memory (RAM), a read only memory (ROM), a memory chip or card, a memory stick, a solid state storage device (SSD), a flash drive, and other media with which a computer, a processor, or other electronic device can interface. Computer-readable medium excludes non-tangible media and propagated data signals.
“Database,” as used herein, is used to refer to a table. In other examples, “database” may be used to refer to a set of tables. In still other examples, “database” may refer to a set of data stores and methods for accessing and/or manipulating those data stores. A database may be stored, for example, at a disk and/or a memory.
“Disk,” as used herein may be, for example, a magnetic disk drive, a solid-state disk drive, a floppy disk drive, a tape drive, a Zip drive, a flash memory card, and/or a memory stick. Furthermore, the disk may be a CD-ROM (compact disk ROM), a CD recordable drive (CD-R drive), a CD rewritable drive (CD-RW drive), and/or a digital video ROM drive (DVD ROM). The disk may store an operating system that controls or allocates resources of a computing device.
“Logic circuitry,” as used herein, includes, but is not limited to, hardware, firmware, a non-transitory computer readable medium that stores instructions, instructions in execution on a machine, and/or to cause (e.g., execute) an action(s) from another logic circuitry, module, method and/or system. Logic circuitry may include and/or be a part of a processor controlled by an algorithm, a discrete logic (e.g., ASIC), an analog circuit, a digital circuit, a programmed logic device, a memory device containing instructions, and so on. Logic may include one or more gates, combinations of gates, or other circuit components. Where multiple logics are described, it may be possible to incorporate the multiple logics into one physical logic. Similarly, where a single logic is described, it may be possible to distribute that single logic between multiple physical logics.
“Memory,” as used herein may include volatile memory and/or nonvolatile memory. Non-volatile memory may include, for example, ROM (read only memory), PROM (programmable read only memory), EPROM (erasable PROM), and EEPROM (electrically erasable PROM). Volatile memory may include, for example, RAM (random access memory), static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), and direct RAM bus RAM (DRRAM). The memory may store an operating system that controls or allocates resources of a computing device.
“Operable connection,” or a connection by which entities are “operably connected,” is one in which signals, physical communications, and/or logical communications may be sent and/or received. An operable connection may include a wireless interface, a physical interface, a data interface, and/or an electrical interface.
“Portable device,” as used herein, is a computing device typically having a display screen with user input (e.g., touch, keyboard) and a processor for computing. Portable devices include, but are not limited to, handheld devices, mobile devices, smart phones, laptops, tablets and e-readers.
“Processor,” as used herein, processes signals and performs general computing and arithmetic functions. Signals processed by the processor may include digital signals, data signals, computer instructions, processor instructions, messages, a bit, a bit stream, that may be received, transmitted and/or detected. Generally, the processor may be a variety of various processors including multiple single and multicore processors and co-processors and other multiple single and multicore processor and co-processor architectures. The processor may include logic circuitry to execute actions and/or algorithms.
“Vehicle,” as used herein, refers to any moving vehicle that is capable of carrying one or more human occupants and is powered by any form of energy. The term “vehicle” includes, but is not limited to, cars, trucks, vans, minivans, SUVs, motorcycles, scooters, boats, go-karts, amusement ride cars, rail transport, personal watercraft, and aircraft. In some cases, a motor vehicle includes one or more engines. Further, the term “vehicle” may refer to an electric vehicle (EV) that is capable of carrying one or more human occupants and is powered entirely or partially by one or more electric motors powered by an electric battery. The EV may include battery electric vehicles (BEV) and plug-in hybrid electric vehicles (PHEV). The term “vehicle” may also refer to an autonomous vehicle and/or self-driving vehicle powered by any form of energy. The autonomous vehicle may carry one or more human occupants. The autonomous vehicle can have any level or mode of driving automation ranging from, for example, fully manual to fully autonomous. Further, the term “vehicle” may include vehicles that are automated or non-automated with pre-determined paths or free-moving vehicles.
“Vehicle control system,” and/or “vehicle system,” as used herein may include, but is not limited to, any automatic or manual systems that may be used to enhance the vehicle, driving, and/or security. Exemplary vehicle systems include, but are not limited to: an electronic stability control system, an anti-lock brake system, a brake assist system, an automatic brake prefill system, a low speed follow system, a cruise control system, a collision warning system, a collision mitigation braking system, an auto cruise control system, a lane departure warning system, a blind spot indicator system, a lane keep assist system, a navigation system, a transmission system, brake pedal systems, an electronic power steering system, visual devices (e.g., camera systems, proximity sensor systems), a climate control system, an electronic pre-tensioning system, a monitoring system, a passenger detection system, a vehicle suspension system, a vehicle seat configuration system, a vehicle cabin lighting system, an audio system, a sensory system, an interior or exterior camera system among others.
The systems and methods discussed herein facilitate communication between pedestrians, vehicles, and traffic infrastructures to negotiate and execute actions thereby resolving traffic scenarios (e.g., pedestrian crossings at a traffic junction). More specifically, a smart traffic assistant is employed for interacting and managing communication between the pedestrians, vehicles, and infrastructures thereby controlling traffic actions and traffic flow. Referring now to the drawings, wherein the showings are for purposes of illustrating one or more exemplary embodiments and not for purposes of limiting same,
In
The traffic junction 110 also includes a crosswalk 116a, a crosswalk 116b, a crosswalk 116c, and a crosswalk 116d. The crosswalks 116 can be controlled or uncontrolled, for example, by a signal and/or a regulatory sign. For example, crossing the first road segment 102 via the crosswalk 116a can be controlled by a crosswalk signal device 118a and/or a crosswalk signal device 118b. Crossing the second road segment 104 via the crosswalk 116b can be controlled by the crosswalk signal device 118b and/or the crosswalk signal device 118c. In contrast, in
As mentioned above, the systems and methods described herein assist communication between vehicles 120 and pedestrians 124. In
One or more of the pedestrians 124 can desire to cross one or more road segments shown in
Referring now to
Although not shown in
The vehicle 120a includes a vehicle computing device (VCD) 212, vehicle control systems 214, and vehicle sensors 216. Generally, the VCD 212 includes a processor 218, a memory 220, a data store 222, a position determination unit 224, and a communication interface (I/F) 226, which are each operably connected for computer communication via a bus 228 and/or other wired and wireless technologies discussed herein. Referring again to the vehicle 120a, the VCD 212, can include provisions for processing, communicating and interacting with various components of the vehicle 120a and other components of the system 200, including the vehicle 120b, the traffic infrastructure computing device 202, and the assistant computing device 204.
The processor 218 can include logic circuitry with hardware, firmware, and software architecture frameworks for facilitating control of the vehicle 120a and facilitating communication between the vehicle 120a, the vehicle 120b, the traffic infrastructure computing devices 202, and the assistant computing device 204. Thus, in some embodiments, the processor 218 can store application frameworks, kernels, libraries, drivers, application program interfaces, among others, to execute and control hardware and functions discussed herein. In some embodiments, the memory 220 and/or the data store (e.g., disk) 222 can store similar components as the processor 218 for execution by the processor 218.
The position determination unit 224 can include hardware (e.g., sensors) and software to determine and/or acquire position data about the vehicle 120a and position data about other vehicles and objects in proximity to the vehicle 120a. For example, the position determination unit 224 can include a global positioning system unit (not shown) and/or an inertial measurement unit (not shown). Thus, the position determination unit 224 can provide a geoposition of the vehicle 120a based on satellite data from, for example, a global position satellite 210. Further, the position determination unit 224 can provide dead-reckoning data or motion data from, for example, a gyroscope, accelerometer, magnetometers, among other sensors (not shown). In some embodiments, the position determination unit 224 can be a navigation system that provides navigation maps, map data, and navigation information to the vehicle 120a or another component of the system 200 (e.g., the assistant computing device 204).
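As a non-limiting illustrative sketch of the dead-reckoning behavior described above, the following hypothetical function (the function name, flat-earth approximation, and parameter choices are assumptions for illustration, not part of the described system) advances a geoposition from heading and speed data during short gaps between satellite fixes:

```python
import math

def dead_reckon(lat, lon, heading_deg, speed_mps, dt_s):
    """Advance a geoposition by integrating heading and speed.

    A flat-earth approximation suitable only for short gaps between
    satellite fixes; lat/lon in degrees, heading in degrees clockwise
    from north, speed in meters per second.
    """
    meters_per_deg_lat = 111_320.0
    meters_per_deg_lon = meters_per_deg_lat * math.cos(math.radians(lat))
    dist = speed_mps * dt_s
    d_north = dist * math.cos(math.radians(heading_deg))
    d_east = dist * math.sin(math.radians(heading_deg))
    return (lat + d_north / meters_per_deg_lat,
            lon + d_east / meters_per_deg_lon)

# A vehicle heading due north at 10 m/s for 2 s: latitude increases
# slightly, longitude is unchanged.
new_lat, new_lon = dead_reckon(37.0, -122.0, 0.0, 10.0, 2.0)
```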
The communication interface (I/F) 226 can include software and hardware to facilitate data input and output between the components of the VCD 212 and other components of the system 200. Specifically, the communication I/F 226 can include network interface controllers (not shown) and other hardware and software that manages and/or monitors connections and controls bi-directional data transfer between the communication I/F 226 and other components of the system 200 using, for example, the network 206. As another example, the communication I/F 226 can facilitate communication (e.g., exchange data and/or transmit messages) with one or more of the vehicles 120.
Referring again to the vehicle 120a, the vehicle control systems 214 can include any type of vehicle system described herein to enhance the vehicle 120a and/or driving of the vehicle 120a. The vehicle sensors 216, which can be integrated with the vehicle control systems 214, can include various types of sensors for use with the vehicle 120a and/or the vehicle control systems 214 for detecting and/or sensing a parameter of the vehicle 120a, the vehicle systems 214, and/or the environment surrounding the vehicle 120a. For example, the vehicle sensors 216 can provide data about vehicles in proximity to the vehicle 120a, data about the traffic junction 110 and/or the pedestrians 124. As an illustrative example, the vehicle sensors 216 can include ranging sensors to measure distances and speed of objects surrounding the vehicle 120a (e.g., other vehicles 120, pedestrians 124). Ranging sensors and/or vision sensors can also be utilized to detect other objects or structures (e.g., the traffic junction 110, the traffic signal devices 112, the crosswalk signal devices 118, and the crosswalks 116). As will be discussed in more detail herein, data from the vehicle control systems 214 and/or the vehicle sensors 216 can be referred to as sensor data or input data and utilized for smart traffic assistance.
Referring again to
Referring again to
The sensors 240 can include various types of sensors for monitoring and/or controlling traffic flow. For example, the sensors 240 can include vision sensors (e.g., imaging devices, cameras) and/or ranging sensors (e.g., RADAR, LIDAR) for detecting and capturing data about the vehicles 120, the pedestrians 124, and the traffic junction 110. As an illustrative example with reference to
The communication I/F 242 can include software and hardware to facilitate data input and output between the components of the traffic infrastructure computing device 202 and other components of the system 200. Specifically, the communication I/F 242 can include network interface controllers (not shown) and other hardware and software that manages and/or monitors connections and controls bi-directional data transfer between the communication I/F 242 and other components of the system 200 using, for example, the network 206. Thus, the traffic infrastructure computing device 202 is able to communicate sensor data acquired by the sensors 240 and data about the operation of the traffic infrastructure computing device 202 (e.g., timing, cycles, light operation). As will be discussed in more detail herein, data from the sensors 240 can be referred to as sensor data or input data and utilized for smart traffic assistance.
Referring again to the system 200 of
Further, the communication I/F 250 can include software and hardware to facilitate data input and output between the assistant computing device 204 and other components of the system 200. Specifically, the communication I/F 250 can include network interface controllers (not shown) and other hardware and software that manages and/or monitors connections and controls bi-directional data transfer between the communication I/F 250 and other components of the system 200 using, for example, the network 206. In one embodiment, which will be described with
Referring to the block diagram 300 of
The voice data 308 can include voice and/or speech data (e.g., utterances emitted from one or more of the pedestrians 124). Thus, the voice data 308 can include an active audio input from one or more of the pedestrians 124 forming part of a conversation with the assistant computing device 204. The voice data 308 can also include any audible data detected in proximity to the traffic junction 110. As will be discussed herein, in some embodiments, the voice data 308 is captured by the traffic infrastructure computing device 202 (e.g., the sensors 240).
The context data 310 includes data associated with the traffic junction 110, the vehicles 120, and/or the pedestrians 124 that describe the environment of the traffic junction 110. For example, context data 310 can include sensor data captured by the vehicle sensors 216 and/or the sensors 240.
The external domain data 312 includes data from remote servers and/or services (not shown). In some embodiments, the vehicle 120a and/or the traffic infrastructure computing device 202 can retrieve the external domain data 312 from these remote servers and/or services and send the external domain data 312 to the assistant computing device 204 for processing by the conversation interface 304. In
Generally, the conversation I/F 304 manages communication and interaction between the components of the system 200. The input data 302, which is received from the computing devices and sensors shown in
The translation interface 330 is the hub of the smart traffic assistant described herein that combines artificial intelligence and linguistics to handle interactions and conversations between vehicles 120 and pedestrians 124. For purposes of the systems and methods described herein, a conversation can include a plurality of information and other data related to one or more exchanges between the pedestrians 124 and the vehicles 120. This information can include words and/or phrases spoken by the pedestrians 124, queries presented by the pedestrians 124, sensor data received from one or more sensors and/or systems, vehicle data from the vehicles 120, vehicle messages from the vehicles 120, and/or context data about the traffic junction 110, the pedestrians 124, and/or the vehicles 120.
Generally, the translation interface 330 includes a communication encoder/decoder 338, a conversation engine 340, conversation meta-info 342, and map data 344. The communication encoder/decoder 338 and the conversation engine 340 can: process the input data 302 into a format that is understandable by the translation interface 330, utilize Natural Language Processing (NLP) to interpret a meaning and/or a concept with the input data 302, identify or perform tasks and actions, and generate responses and/or outputs (e.g., at output interface 332) based on the input data 302. The conversation meta-info 342 can include linguistic data, NLP data, intent and/or response templates, current and/or historical conversation history, current and/or historical conversation output, among other types of static or learned data for conversation processing. The map data 344 can include map and location data, for example, map data about the traffic junction 110. As will be discussed in more detail herein, the communication encoder/decoder 338 facilitates translation from human-readable medium to vehicle-readable medium and vice versa with assistance from the conversation engine 340.
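The intent-interpretation step performed by the conversation engine 340 can be illustrated with a minimal, non-limiting sketch. A production engine would use trained NLP models; the keyword matcher, intent labels, and action names below are purely hypothetical stand-ins used only to show the mapping from a pedestrian utterance to a desired action:

```python
def parse_invocation(utterance):
    """Toy intent extraction for a smart traffic assistant.

    Maps a road user's utterance to an intent and a desired action.
    A real conversation engine would use NLP models; this keyword
    matcher only illustrates the interface.
    """
    text = utterance.lower()
    if "cross" in text or "walk" in text:
        # The pedestrian wants to cross the road segment.
        return {"intent": "request_crossing",
                "desired_action": "pedestrian_cross"}
    if "wait" in text or "how long" in text:
        # The pedestrian is asking about signal timing.
        return {"intent": "query_wait_time", "desired_action": None}
    return {"intent": "unknown", "desired_action": None}

result = parse_invocation("Hey assistant, can I cross the street?")
```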
The output interface 332 facilitates generation and output in response to the processing performed by the translation interface 330. For example, output interface 332 includes a voice interface 346 and a system command interface 348. The voice interface 346 can output speech to, for example, a connected device (e.g., the traffic infrastructure computing device 202) in proximity to the desired recipient pedestrian. The system command interface 348 can transmit a command signal to a connected device and/or a vehicle to control the connected device and/or the vehicle. The output interface 332 and the other components of the conversation interface 304 will now be described in more detail with exemplary smart assistant methods.
Initially, the invocation input triggers the assistant computing device 204 to initiate a conversation and provide smart traffic assistance. In one embodiment, the invocation input includes a desired action to be executed at the traffic junction 110 by at least one first road agent. In some embodiments, the first road agent is a road user (e.g., a pedestrian 124a) and the second road agent is a vehicle (e.g., the vehicle 120a). In this embodiment, the invocation input is a voice utterance from the first road agent, which is shown in
With reference first to
With reference to
Referring again to
The method 400 also includes at block 408 managing interactions between road agents. Generally, managing interactions between road agents includes conversation management, translation between human-readable mediums and vehicle-readable mediums, and control of the road agents with responsive outputs. The processor 244 and the translation interface 330 facilitate the processing and execution at block 408.
As mentioned above, managing the interactions between the first road agent and the second road agent can be based on at least the invocation input and the sensor data 404. As shown in
Referring again to
As shown in
Thus, in one embodiment, managing the interactions at block 408 includes translating human-readable medium to vehicle-readable medium in a back-and-forth manner between the first road agent (e.g., the pedestrian 124a) and a second road agent (e.g., the vehicle 120a) to coordinate execution of the desired action. In one embodiment, this includes processing the voice utterance (e.g., the speech input 416) and the sensor data 404 into a command signal having a vehicle-readable format with instructions to control the vehicle 120a to execute the cooperation action, and the processor 244 transmitting the command signal to the vehicle 120a to execute the cooperation action.
The vehicle-readable format can include the command signal capable of being executed by the vehicle 120a and/or a vehicle message capable of being processed by the vehicle 120a. In one embodiment, the vehicle message is in a defined message format, for example as a Basic Safety Message (BSM) under the SAE J2735 standard. Accordingly, the translation from human-readable medium to vehicle-readable medium includes converting and formatting the human-readable medium into a BSM that contains information about vehicle position, heading, speed, and other information relating to a vehicle's state and predicted path according to the desired action and the cooperative action.
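The translation from human-readable medium to a BSM-style vehicle-readable medium can be sketched as follows. Actual SAE J2735 BSMs are ASN.1-encoded binary structures; the plain dictionary here only loosely mirrors a few core-data-frame fields (position, heading, speed, predicted path), and the function and field names are illustrative assumptions:

```python
def human_request_to_bsm_like(vehicle_state, cooperation_action):
    """Map a negotiated cooperation action into a BSM-style message.

    SAE J2735 BSMs are ASN.1-encoded; this dict only mirrors a few
    core-data-frame fields to show the human-readable to
    vehicle-readable translation step. Field names are hypothetical.
    """
    stopping = cooperation_action == "stop"
    return {
        "msg_type": "BSM",
        "lat": vehicle_state["lat"],
        "lon": vehicle_state["lon"],
        "heading_deg": vehicle_state["heading_deg"],
        # A vehicle agreeing to stop reports zero speed and a halting path.
        "speed_mps": 0.0 if stopping else vehicle_state["speed_mps"],
        "predicted_path": "halt_before_crosswalk" if stopping else "proceed",
    }

state = {"lat": 37.0, "lon": -122.0, "heading_deg": 90.0, "speed_mps": 8.0}
bsm = human_request_to_bsm_like(state, "stop")
```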
In another embodiment, the command signal has a machine-readable format with instructions to control one or more of the connected devices (e.g., the traffic infrastructure computing device 202) to execute the cooperation action. Thus, managing interactions at block 408 includes converting interactions from human-readable medium to machine-readable medium and vice versa. For example, the sensor data and the invocation input are translated into a format capable of being processed by the second road agent. In the case where the invocation input includes a voice utterance, the voice utterance is translated into a command signal to control the second road agent.
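A machine-readable command signal of this kind can be sketched as a simple structure. The schema, device identifier, and action names below are hypothetical illustrations, not a standardized format:

```python
def build_command_signal(device_id, action, duration_s):
    """Encode a cooperation action as a machine-readable command signal.

    The command asks a connected device (e.g., a traffic signal) to
    hold a state for a given duration. All field and action names are
    illustrative assumptions.
    """
    allowed = {"hold_red", "hold_green", "extend_walk"}
    if action not in allowed:
        raise ValueError(f"unsupported action: {action}")
    return {
        "target": device_id,
        "command": action,
        "duration_s": duration_s,
        "format": "machine-readable",
    }

# E.g., hold a signal red for 15 seconds while a pedestrian crosses.
cmd = build_command_signal("traffic_signal_112b", "hold_red", 15)
```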
In some embodiments, managing the interactions at block 408 can include managing the interactions based on the classification of the road user determined at block 406. In one embodiment, the sensor data 404, the speech input 416, and/or the classification is used to determine conversational actions, conversational responses, desired actions and/or the cooperative action. As an illustrative example, if the pedestrian 124a is classified as having a physical disability, the timing of the cooperative action can be modified to allow the pedestrian 124a additional time to walk across the first road segment 102. Thus, the vehicle 120a must remain in a stopped state for a longer period of time and/or the timing of the traffic signal device 112b is modified to control the length of time the vehicle 120a is in a stopped state. In another example, conversational responses can be tailored based on a classification of the pedestrian 124a. For example, as will be described below in more detail with block 412, output to the pedestrian 124a can be directed specifically to the pedestrian 124a based on a classification of the pedestrian 124a (e.g., a physical characteristic of the pedestrian 124a).
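The timing modification based on road-user classification described above can be illustrated with a rough crossing-time estimate. The 1.2 m/s base walking speed and the per-class multipliers below are illustrative assumptions that a deployed system would need to calibrate:

```python
def crossing_time_s(distance_m, classification, base_speed_mps=1.2):
    """Estimate how long a pedestrian needs to cross a road segment.

    Scales a base walking-speed estimate by a per-classification
    multiplier; both the base speed and multipliers are illustrative.
    """
    multipliers = {
        "adult": 1.0,
        "child": 1.3,
        "elderly": 1.5,
        "physically_disabled": 1.8,
    }
    factor = multipliers.get(classification, 1.0)
    return distance_m / base_speed_mps * factor

# A pedestrian classified as having a physical disability is allotted
# more time to cross the same 12 m road segment than an adult.
t_adult = crossing_time_s(12.0, "adult")
t_disabled = crossing_time_s(12.0, "physically_disabled")
```

The extra time would then be reflected in the command signals that hold the vehicle and the traffic signal device in their cooperating states.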
Referring again to
In
Referring again to the method 400 of
In some embodiments, transmitting the response output at block 412 can be based on the classification determined at block 406. More specifically, the response output can be modified based on the classification of the intended recipient (e.g., road agent). This can be helpful to catch the attention of the intended recipient. For example, based on the classification determined at block 406, the pedestrian 124a is identified as wearing a red shirt. In this example, the output phrase 508 can be modified to identify the actor of the action, namely, “Okay, the pedestrian in the red shirt can go.” This provides for clear communication particularly if there are other road users in proximity to the connected device and/or the pedestrian 124a. A unique classification of the pedestrian 124a when compared to other road agents in proximity to the connected device and/or the pedestrian 124a is preferable. This type of interactive and identifying communication will also be described in more detail with
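Tailoring the response output to the classification, as in the “red shirt” example above, can be sketched with the following non-limiting illustration (the attribute keys and phrasing template are hypothetical):

```python
def address_recipient(base_phrase, attributes):
    """Direct an output phrase at a specific road user.

    Uses a distinguishing classification attribute (e.g., clothing
    color) to disambiguate the intended recipient when several road
    users are near the connected device. Attribute keys are
    illustrative assumptions.
    """
    if attributes.get("shirt_color"):
        who = f"the pedestrian in the {attributes['shirt_color']} shirt"
    elif attributes.get("accessory"):
        who = f"the pedestrian with the {attributes['accessory']}"
    else:
        who = "the pedestrian"
    return f"Okay, {who} {base_phrase}"

phrase = address_recipient("can go.", {"shirt_color": "red"})
```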
In some embodiments, the conversation interface 304 can continue to manage interactions between the first road agent and the second road agent. For example, as shown in
In the examples described above with
In the example shown in
As mentioned above with
As discussed in detail above with
Furthermore, the instructions in the phrase 602 includes classification of one or more of the vehicles 120. For example, the classification of the “red Honda Accord” identifies the vehicle 120b, which is the last vehicle to cross the traffic junction 110 (see
It will be appreciated that various embodiments of the above-disclosed and other features and functions, or alternatives or varieties thereof, may be desirably combined into many other different systems or applications. Also, various presently unforeseen or unanticipated alternatives, modifications, variations, or improvements therein may be subsequently made by those skilled in the art, which are also intended to be encompassed by the following claims.