Multimodal event detection involves the use of data from multiple modalities (e.g., video, audio) to detect events. For example, an event may be triggered if it is detected by more than one of the modalities. This approach can result in false negatives, however, as some events may not be triggered due to inconsistencies among the modalities, such as when an event is detected by some modalities but not others.
In the drawings, which are not necessarily drawn to scale, like numerals may describe similar components in different views. Like numerals having different letter suffixes may represent different instances of similar components. Some embodiments are illustrated by way of example, and not limitation, in the figures.
In a multimodal data fusion scenario for event detection, multiple sources of input data from different modalities (e.g., video, audio, and/or other sensor data) are typically fused together to provide a more comprehensive picture for detecting events of interest (EOIs). A system typically has a predefined way to detect events based on the respective modalities. In some cases, for example, an EOI may be triggered when more than one modality detects an event with the same or similar level of confidence and reliability. Current solutions suffer from false negatives, however, as an EOI may not be successfully triggered if some modalities are given more weight than others and/or the modalities are inconsistent or otherwise contradict each other.
As an example, in an audio/visual multimodal system for fight detection, audio and video modalities are fused together to detect fighting events using a sound classification module and an action recognition computer vision module. In some cases, the audio modality may detect a fighting event, while the video modality detects no activity because the fighting event is outside the current field of view of the camera. As a result, certain events may be ignored due to the uncertainty and/or inconsistency among the modalities.
Thus, while the use of multiple modalities improves complementarity and diversity, it also creates uncertainty in real-world settings, which leads to undesired data fusion quality when the underlying relationship between modalities is not properly understood and/or defined.
Accordingly, this disclosure presents embodiments of adaptive multimodal event detection, where the configuration used for event detection is dynamically adapted in real time based on external factors, such as environmental conditions or other external considerations. In some embodiments, for example, when there are inconsistencies among different modalities or sensors used for event detection (e.g., an event is detected by some sensors but not others), external factors detected by the sensors may be used to adjust various configuration parameters in real time to enhance the consistency and reliability of the event detection results among the sensors. For example, based on the external factors, the weights used to fuse the respective modalities may be adjusted, the sensors or other modalities used for data ingestion may be reconfigured, and/or the workloads used to perform event detection based on data from the respective modalities may be reconfigured.
The described solution provides various advantages. For example, the described solution intelligently combines/fuses multiple data sources or modalities to provide more accurate, consistent, and concise analytics than any individual data source.
The described solution is also beneficial for edge computing, as it enables multimodal artificial intelligence applications to be deployed and/or distributed at the edge with improved performance. In particular, the described solution works well with Internet-of-Things devices, which are often equipped with a variety of sensors that capture large amounts of data. The workloads of multiple sensors, such as deep learning inference on visual and audio data, may be consolidated on a single edge device with heterogeneous processing resources, such as a central processing unit (CPU), graphics processing unit (GPU), vision processing unit (VPU), field-programmable gate array (FPGA), and so forth. As a result, large volumes of data can be processed accurately and efficiently at the edge instead of being funneled to the cloud, which results in cost savings and also enables deployments that would otherwise be infeasible due to limited network bandwidth and real-time constraints of many applications.
The described solution is applicable to a variety of use cases that rely on data from multiple modalities for event detection or other types of data analytics, particularly when real-time corrective action must be taken in certain situations. These use cases span a variety of industries and market segments, including automotive, aerospace, manufacturing and industrial (e.g., robotic safety, quality control, anomaly detection), smart cities (e.g., public safety and surveillance), retail (e.g., personalized shopping experience, frictionless store, customer service quality evaluation, etc.), education (e.g., smart classrooms, school safety and surveillance), business (e.g., smart meeting rooms), augmented reality and/or virtual reality (AR/VR), and more.
The pipeline 100 begins with multimodal data ingestion 102, where input data is continuously captured and ingested from multiple sensors or other modalities, such as cameras, microphones, LIDAR, and so forth. In some embodiments, various types of preprocessing may be performed on the input data before further analysis, such as filtering, cleaning, denoising, reformatting, and so forth.
The (preprocessed) input data is then provided as input to the multimodal analytics workloads 104, which are used to perform analytics on the data from the respective modalities or sensors, such as event detection or any other form of analytics (e.g., classification, regression, inference, predictive analytics, etc.).
In some embodiments, for example, event detection may be performed on the data from each modality using artificial intelligence (e.g., deep learning models) and/or other forms of statistical analysis or analytics. Further, in some embodiments, data from each modality may be processed by a separate event detection workload. For example, each workload may perform event detection on data from one of the modalities, such as video/images from cameras, audio from microphones, or point clouds from LIDAR. Further, the output of the workload for each modality may indicate a confidence level or probability of an event occurring.
The output of each workload/modality (e.g., confidence/probability) is then provided to the fusion block 106, which generates a final prediction 112 by fusing the outputs from the respective workloads/modalities. For example, with respect to event detection, a weighted average of the confidence levels output by the respective modalities may be computed, and if the weighted average is above a threshold (e.g., 80%), an event may be triggered.
In some cases, however, there may be inconsistencies in the outputs from different modalities. For example, an event may be detected with high confidence by some modalities/sensors and not others, or the confidence level of a detected event may vary significantly among the modalities/sensors. As a result, the fused prediction result 112 may have a low confidence level that falls below the threshold or within certain boundary conditions, which creates uncertainty as to whether the event of interest actually occurred. In these cases, external conditions of the environment may be used to dynamically adjust the system configuration in real time to achieve greater consistency and balance among the outputs of the respective modalities, thus improving the confidence level and reliability of the fused prediction 112.
For example, external sensing 108 may be performed to detect conditions of the surrounding environment based on data from sensors or other sources, such as time of day/night, lighting, weather, visibility, noise, location of sensors, location/direction of potential events or activity, and so forth.
Based on the detected external conditions, intelligent parameter computation 110 is then performed to achieve more consistent and reliable predictions among the respective modalities, thus improving the overall accuracy of the fused prediction 112. For example, intelligent parameter computation 110 dynamically reconfigures the system in real time based on the external conditions, including the data ingestion inputs (e.g., the sensors or other modalities), the workloads used to perform event detection, and the weights used to fuse the respective modalities.
The data ingestion inputs, workloads, and/or data fusion weights will be continuously tuned in this manner until more consistency and balance is achieved among the modalities. This intelligence helps improve the overall accuracy of the fused prediction 112 for a potential event-of-interest so that a genuine event will not be missed or neglected.
In the illustrated embodiment, the respective modalities are fused using decision-level fusion, where the fusion is performed on the respective predictions/results of the multimodal workloads 104. In other embodiments, however, the modalities may be fused using other approaches, including data-level fusion (e.g., fusing the input data from each modality), feature-level fusion (e.g., extracting features from the input data of multiple modalities), and so forth.
In the illustrated embodiment, system 200 includes data ingestion inputs 202a-d, event detection workloads 204a-d, fusion logic 206, external sensing logic 208, and intelligent parameter computation logic 210.
The data ingestion inputs 202 include a camera 202a, microphone 202b, LIDAR 202c, and potentially other sensors, modalities, or data sources 202d (e.g., thermometer/temperature sensor, gas sensor, clock). Each data ingestion input 202a-d generates a stream of raw data, where Xi represents the raw data from modality i, such as video (X1) from the camera 202a, audio (X2) from the microphone 202b, point clouds (X3) from LIDAR 202c, and any other type of data (X4) from other modalities 202d in the system 200.
Each data stream Xi is analyzed separately through independent event detection workloads 204a-d, some of which may leverage artificial intelligence (AI) techniques (e.g., deep learning) and others that may leverage other forms of analytics. For example, the video data X1 may be analyzed using an AI workload 204a that performs object detection and/or action recognition, the audio data X2 may be analyzed using an AI workload 204b that performs sound classification, the LIDAR point cloud data X3 may be analyzed using an AI workload 204c that performs object detection and/or action recognition, and other data X4 may be analyzed using non-AI workloads 204d (e.g., detecting an abnormal temperature from a temperature sensor, detecting the presence of a certain type of gas from a gas sensor). Further, the output of the workload 204a-d for each modality may indicate a confidence level or probability of an event occurring, where Yi represents the output of the workload for modality i.
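The fusion logic 206 then computes a fused prediction 212 by fusing the outputs Yi of the workloads 204a-d for each modality. For example, with respect to event detection, the fused prediction 212 may be a weighted average of the confidence levels output by the respective modalities, $\sum_{i=1}^{N} W_i Y_i$, where N is the number of modalities, Wi is the weight used to fuse modality i, and Yi is the confidence level output by the workload for modality i.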
In some embodiments, an event may be formally detected or triggered if the fused confidence level 212 is above a threshold (e.g., 80% or higher).
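The weighted-average fusion and threshold test described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the equal weights, and the choice to treat the 80% threshold as inclusive are assumptions, not details taken from the embodiments.

```python
from typing import Sequence


def fuse_confidences(confidences: Sequence[float], weights: Sequence[float]) -> float:
    """Return sum(W_i * Y_i) over all modalities (confidences in [0, 1])."""
    if len(confidences) != len(weights):
        raise ValueError("one weight is required per modality")
    return sum(w * y for w, y in zip(weights, confidences))


def event_triggered(confidences: Sequence[float], weights: Sequence[float],
                    threshold: float = 0.8) -> bool:
    """Trigger when the fused confidence reaches the threshold (assumed inclusive)."""
    return fuse_confidences(confidences, weights) >= threshold


# Two modalities with equal (hypothetical) weights.
agree = event_triggered([0.9, 0.9], [0.5, 0.5])      # both modalities confident
disagree = event_triggered([0.1, 0.9], [0.5, 0.5])   # modalities contradict each other
```

With equal weights, two confident modalities trigger the event, while one confident and one unconfident modality fuse to a value well below the threshold.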
In some cases, however, there may be imbalance or inconsistency in the outputs Yi of the respective modalities. For example, an event may be detected with high confidence by some modalities/sensors and not others, or the confidence level of an event may fluctuate or vary significantly among the modalities/sensors (e.g., an audio modality has high confidence of an event while a video modality has low confidence of the event because the event is outside the field of view of the camera). As a result, the fused prediction 212 may have a low confidence level that falls below the event trigger threshold and/or within certain boundary conditions, which creates uncertainty as to whether the event of interest actually occurred.
Due to this uncertainty, external conditions of the environment may be used to dynamically reconfigure the system 200 in real time to achieve greater consistency and balance among the outputs Yi of the respective modalities, thus improving the accuracy and reliability of the fused prediction 212.
For example, external sensing 208 may be performed to detect external factors or conditions of the environment based on data from sensors or other sources 202a-d, such as time of day/night, lighting, weather, visibility, noise, location of sensors, location/direction of potential events or activity, and so forth. In particular, the accuracy of the sensing by the sensors 202a-d (e.g., cameras, microphones, LIDAR, temperature sensors, gas sensors, etc.) is highly dependent on the installation and configuration of the sensors (e.g., at the corner, wide open space, etc.) and external environmental factors (e.g., lighting conditions, weather, noisiness, etc.). The external sensing logic 208 senses the dynamic change in these external factors, such as video brightness (e.g., due to shadows, time of day/night, etc.), obstructions/occlusions in the camera field of view, noise level (e.g., based on audio input gain, audio input beaming direction), and so forth.
The installation/configuration of the sensors and the real-time changing external environmental factors are fed into the intelligent parameter computation (IPC) logic 210, along with the outputs Yi of the respective modalities and the fused output 212. In this manner, if the IPC logic 210 detects an inconsistency among the modalities/sensors (e.g., based on the modality outputs Yi and/or the fused output 212), it dynamically reconfigures the system 200 in real time based on the information from the external sensing logic 208.
For example, based on the external environmental conditions and sensor configurations, the IPC logic 210 may dynamically adjust the data ingestion inputs, computational workloads, and/or data fusion weights to achieve greater consistency and balance among the outputs Yi of the respective modalities, thus improving the accuracy and reliability of the fused prediction 212.
For example, the data ingestion inputs may be adjusted via the internal configuration parameters and/or external actuators of the sensors, such as adjusting the pan, tilt, or zoom setting(s) of a camera to redirect the field of view of the camera, adjusting the frame rate, video resolution, or lighting intensity of the camera, adjusting the sampling rate, gain/sensitivity, or beam direction of a microphone, and so forth.
Reconfiguring the workloads 204a-d may include dynamic inference model switching, workload performance adjustments, workload prioritization, and so forth. For example, with dynamic inference model switching, the inference model used to perform a particular workload 204a-d may be switched with another model that is tuned for the current combination of environmental factors (e.g., a model trained to perform action recognition at night). Similarly, the precision of the inference engine for a particular workload 204a-d may be adjusted based on the environment (e.g., to single-precision floating-point (FP32), half-precision floating-point (FP16), and/or 8-bit integer quantization) to adjust certain performance characteristics of the workload, such as prediction accuracy, power consumption, latency, throughput, and so forth. Certain workloads may also be prioritized over others with respect to precision/accuracy, bandwidth, sampling rate, inference speed, and so forth. Further, the various workload adjustments may be performed such that the overall system resource utilization remains within the ideal operating range that the system can handle (e.g., CPU utilization <90%).
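One possible way to combine model switching, precision selection, and the resource ceiling is sketched below. The `ModelVariant` record, the variant names, and the CPU cost estimates are all hypothetical; the sketch simply picks the highest-precision model tuned for the current condition that keeps total CPU utilization below the 90% example limit.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ModelVariant:
    name: str        # hypothetical label for an inference model
    condition: str   # environment the model is tuned for, e.g. "day" or "night"
    precision: str   # "FP32", "FP16", or "INT8"
    cpu_cost: float  # assumed CPU utilization added by this variant (0 to 1)


# Higher rank means higher numerical precision.
PRECISION_RANK = {"FP32": 3, "FP16": 2, "INT8": 1}


def select_variant(variants: List[ModelVariant], condition: str,
                   other_load: float, cpu_limit: float = 0.90) -> Optional[ModelVariant]:
    """Pick the highest-precision variant for `condition` that stays under the CPU limit."""
    candidates = [v for v in variants
                  if v.condition == condition and other_load + v.cpu_cost < cpu_limit]
    if not candidates:
        return None  # nothing fits; the caller must free resources first
    return max(candidates, key=lambda v: PRECISION_RANK[v.precision])


variants = [
    ModelVariant("action-night-fp32", "night", "FP32", 0.40),
    ModelVariant("action-night-fp16", "night", "FP16", 0.25),
    ModelVariant("action-night-int8", "night", "INT8", 0.15),
    ModelVariant("action-day-fp32", "day", "FP32", 0.40),
]
chosen = select_variant(variants, condition="night", other_load=0.60)
```

Here the FP32 night model would push utilization to 100%, so the FP16 night model is selected instead.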
The weights Wi used to fuse the outputs Yi of the respective modalities may also be dynamically tuned based on the current environment (e.g., lighting, ambient noise). For example, the weights may be adjusted to increase reliance on certain modalities that are more reliable than others in the current conditions (e.g., rely more on audio/LIDAR, and less on video, at night when lighting is poor). In some embodiments, for example, a set of weights can be trained using multiple datasets under several environmental conditions that are representative of a real-world deployment scenario. In this manner, the respective modalities are no longer fused using a fixed or predetermined methodology. Rather, the weighting used to fuse the modalities is flexible, as the weights are dynamically determined in real time based on the current external factors to improve the accuracy of the fused output 212. However, some or all of the weights can be predetermined or fixed according to the needs and requirements of a particular use case.
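A simple way to picture environment-dependent weights is a lookup table of weight profiles keyed by the sensed condition. The sketch below is an assumption about how such a table might look; the profile names and the `weights_for` helper are hypothetical, and the numeric values are borrowed from the two-modality camera/microphone example in this description.

```python
from typing import Dict

# Hypothetical per-environment weight profiles; each profile sums to 1.
WEIGHT_PROFILES: Dict[str, Dict[str, float]] = {
    "normal": {"camera": 0.5, "microphone": 0.5},
    "low_light": {"camera": 0.1, "microphone": 0.9},
}


def weights_for(environment: str, default: str = "normal") -> Dict[str, float]:
    """Return a copy of the weight profile for the sensed environment."""
    profile = WEIGHT_PROFILES.get(environment, WEIGHT_PROFILES[default])
    return dict(profile)


night_weights = weights_for("low_light")
fallback_weights = weights_for("fog")  # no "fog" profile, so the default is used
```

In a trained system the table entries would come from the datasets mentioned above rather than being written by hand.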
In some embodiments, the IPC logic 210 may include a set of predefined configurations 211a-e tailored to different deployment scenarios and environmental conditions, which may be used to make the appropriate configuration adjustments based on the current system configuration and environmental conditions. For example, when the IPC logic 210 detects an inconsistency among the modalities, it triggers a search for a new configuration that will potentially resolve the inconsistency to improve the accuracy of the fused prediction 212 for an event-of-interest. An appropriate course of action will be applied based on the sensed environmental factors, such as reconfiguring the data ingestion inputs 202a-d (e.g., sensors), reconfiguring the event detection workloads 204a-d, and/or tuning the weights Wi used to fuse the modalities. Further, because the decision methodology is aimed at resolving the conflict and inconsistency among the modalities, adaptation priority may be given to modalities with lower confidence levels rather than those with higher confidence levels.
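The adaptation-priority rule can be illustrated with a small lookup keyed by modality and sensed condition. This is a sketch under stated assumptions: the condition labels, the action strings, and the `choose_adaptation` helper are hypothetical stand-ins for the predefined configurations, not their actual contents.

```python
from typing import Dict, Optional, Tuple

# Hypothetical predefined adaptations, keyed by (modality, sensed condition).
PREDEFINED_ACTIONS: Dict[Tuple[str, str], str] = {
    ("camera", "low_light"): "switch to a low-light action recognition model",
    ("camera", "out_of_view"): "pan/tilt/zoom toward the detected sound",
    ("microphone", "noisy"): "adjust microphone beam direction and gain",
}


def choose_adaptation(confidences: Dict[str, float],
                      condition: str) -> Tuple[str, Optional[str]]:
    """Give adaptation priority to the modality with the lowest confidence."""
    target = min(confidences, key=confidences.get)
    return target, PREDEFINED_ACTIONS.get((target, condition))


target, action = choose_adaptation({"camera": 0.1, "microphone": 0.9}, "out_of_view")
```

If no predefined entry matches, the helper returns `None` for the action, which a caller could treat as a signal to fall back to weight tuning.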
As an example, if a microphone detects fighting noises with high confidence while a camera detects fighting activity with low or zero confidence, adaptation priority may be given to the camera. Further, it may be determined that the current field of view of the camera is pointing in a different direction than where the fighting noises detected by the microphone are coming from. As a result, the pan, tilt, and/or zoom settings of the camera may be adjusted to redirect the camera field of view to the same direction where the fighting noises are coming from.
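The camera redirection in this example reduces to a small angle computation. The following sketch assumes that the camera heading and the sound bearing are both available as compass-style angles in degrees and that the field of view is a symmetric cone; none of these representations is specified in the embodiments.

```python
def pan_adjustment(camera_heading_deg: float, sound_bearing_deg: float) -> float:
    """Smallest signed rotation, in degrees within [-180, 180), toward the sound."""
    return (sound_bearing_deg - camera_heading_deg + 180.0) % 360.0 - 180.0


def needs_redirect(camera_heading_deg: float, sound_bearing_deg: float,
                   fov_deg: float) -> bool:
    """True when the sound source lies outside the camera's horizontal field of view."""
    return abs(pan_adjustment(camera_heading_deg, sound_bearing_deg)) > fov_deg / 2.0


# Camera faces 350 degrees; the microphone localizes the noise at 10 degrees.
turn = pan_adjustment(350.0, 10.0)
outside = needs_redirect(90.0, 200.0, fov_deg=60.0)
```

The modulo arithmetic handles wrap-around at 0/360 degrees, so the camera turns 20 degrees rather than 340 degrees in the first case.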
The IPC logic 210 may continue making these dynamic configuration adjustments until more consistency is achieved among the outputs (e.g., confidence levels) of the respective modalities. For example, after adjusting the configuration, the IPC logic 210 may reevaluate the outputs Yi of the respective modalities and the fused output 212 to determine whether an event should be triggered, whether a potential event was a false alarm, or whether to continue making performance adjustments due to unresolved inconsistencies among the sensors/modalities.
To illustrate, consider an example of dynamic weight tuning for a system with two modalities: a camera and a microphone. Under normal environmental conditions (e.g., average brightness), the dynamic weights are evenly distributed across the modalities, W1=W2=0.5, where W1 is the weight of the camera (video) modality and W2 is the weight of the microphone (audio) modality.
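Each modality generates a confidence level (e.g., 0-100%) for a particular prediction, such as whether an event such as fighting occurs, where Y1 is the confidence level output by the video modality and Y2 is the confidence level output by the audio modality.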
In order to detect fighting, the probability of a fight from each modality is fused using a weighted average, W1*Y1+W2*Y2, and if the fused probability exceeds a threshold such as 80%, fighting is detected.
Table 1 shows example outputs for fight detection in different environmental conditions before and after the weights have been tuned. As shown by these examples, when the event detection outputs of the modalities are inconsistent, the weights of the modalities can be tuned based on the current environment to improve the confidence and accuracy of the fused output.
In examples 1-3 of Table 1, each modality has the same default weight (e.g., 0.5). In examples 1 and 2, under bright light conditions, the outputs of the modalities are relatively consistent (and accurate), as their respective confidence levels for fighting are either both low or both high. This is because the camera has good visibility under bright light conditions.
In example 3, under low light conditions, the outputs of the modalities are inconsistent, as the video modality has a confidence level of 10% for fighting while the audio modality has a confidence level of 90% for fighting. This is because the camera has poor visibility under low light conditions, while the microphone is unaffected by lighting. As a result, the fused output only has a confidence level of 50% for fighting, which is below the 80% threshold, and thus the system fails to detect genuine fighting.
In example 4, due to the inconsistency between the camera and microphone modalities in example 3, the weights are dynamically tuned for the low light environment. In particular, since the camera has poor visibility in low light and the microphone is unimpacted by light, the camera weight (W1) is decreased to 0.1 and the microphone weight (W2) is increased to 0.9, which places more weight or reliance on sound recognition compared to action recognition. As a result, the fused output has a confidence level of 82% for fighting, which is above the threshold, and thus the system successfully detects fighting.
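The arithmetic behind examples 3 and 4 can be written out as a short worked calculation. It uses only the values stated above (Y1 = 0.10 for video, Y2 = 0.90 for audio, and an 80% threshold) and assumes the modality outputs are unchanged between the two examples, which is consistent with the stated 82% result.

```latex
\begin{aligned}
\text{Example 3 (default weights):}\quad & W_1 Y_1 + W_2 Y_2 = (0.5)(0.10) + (0.5)(0.90) = 0.50 < 0.80 \\
\text{Example 4 (tuned weights):}\quad & W_1 Y_1 + W_2 Y_2 = (0.1)(0.10) + (0.9)(0.90) = 0.01 + 0.81 = 0.82 > 0.80
\end{aligned}
```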
In the example above, only one class of event is detected—fighting—which is reflected by the fact that Y and W are scalar values. However, this solution naturally extends to detection of multiple types of events concurrently, such as fighting, running, and so forth. When detecting multiple events, the output of each modality (Yi) can be represented as a multi-dimensional vector of confidence values. For example, Y1=[0.1, 0.4] may represent the output of action recognition on a video stream for “fighting” and “running” events, where the confidence level of “fighting” is 10% and the confidence level of “running” is 40%. The fused output may similarly be represented as a multi-dimensional vector of confidence values, where an output of [0.82, 0.43] means the overall confidence level for “fighting” is 82% and the overall confidence level for “running” is 43%. In some embodiments, the confidence levels may be sorted and the event with the highest confidence may be evaluated as the most likely event.
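The vector form of the fusion can be sketched as an element-wise weighted sum. In the sketch below, Y1 is the video output given above, while the audio output Y2 and the weights are hypothetical values chosen for illustration, so the fused vector is not meant to reproduce the [0.82, 0.43] figure exactly.

```python
from typing import List, Sequence, Tuple

EVENTS = ["fighting", "running"]


def fuse_vectors(outputs: Sequence[Sequence[float]],
                 weights: Sequence[float]) -> List[float]:
    """Element-wise weighted sum of per-modality confidence vectors."""
    n_events = len(outputs[0])
    return [sum(w * y[k] for w, y in zip(weights, outputs)) for k in range(n_events)]


def most_likely(events: Sequence[str], fused: Sequence[float]) -> Tuple[str, float]:
    """Return the event with the highest fused confidence."""
    return max(zip(events, fused), key=lambda pair: pair[1])


y_video = [0.1, 0.4]    # from the text: fighting 10%, running 40%
y_audio = [0.9, 0.45]   # hypothetical audio output
fused = fuse_vectors([y_video, y_audio], weights=[0.1, 0.9])
top_event, top_confidence = most_likely(EVENTS, fused)
```

Sorting or taking the maximum of the fused vector then identifies the most likely event, which is "fighting" for these values.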
This solution is flexible and the data fusion logic can remain very light even as more modalities are added. For example, when the confidence of a particular modality is extremely high, this solution may focus primarily or exclusively on that modality in some situations, and the other modalities may be ignored or temporarily deactivated. As a result, the other modalities do not increase the burden or load on the system.
In the illustrated example, each sensor 302a-n has an associated hash table 300a-n that stores various attributes in the form of key-value pairs, including a unique identifier (ID), sensor type (e.g., camera, LIDAR, microphone), data dimensionality (e.g., 1D, 2D, 3D), location (e.g., absolute location such as GPS coordinates, relative location such as a floor or room of a building, etc.), sampling rate, event detection flag, and default weight for fusion. It should be appreciated that these attributes are merely provided as examples, and additional or alternative attributes may be provided in other embodiments, such as sensor orientation, direction, or field of view, confidence level or probability associated with the event detection flag, and so forth.
In this manner, the metadata or attributes of each sensor 302a-n can be accessed by looking up the values for the respective keys in the hash table 300a-n associated with the particular sensor. For example, the event detection flag can be accessed to determine if an event is currently being detected based on the data captured by a particular sensor. Similarly, the default weight for each sensor can be accessed to fuse the sensor data and/or event detection results of the respective sensors.
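In Python, the per-sensor hash tables map naturally onto dictionaries. The sketch below is illustrative only: the key names, identifiers, locations, and sampling rates are hypothetical, chosen to mirror the attributes listed above.

```python
from typing import Dict, List

# One dictionary (hash table) of key-value attributes per sensor.
sensor_tables: List[Dict[str, object]] = [
    {"id": "cam-01", "type": "camera", "dimensionality": "2D",
     "location": "floor 1, room A", "sampling_rate_hz": 30,
     "event_detected": False, "default_weight": 0.5},
    {"id": "mic-01", "type": "microphone", "dimensionality": "1D",
     "location": "floor 1, room A", "sampling_rate_hz": 16000,
     "event_detected": True, "default_weight": 0.5},
]


def sensors_detecting_event(tables: List[Dict[str, object]]) -> List[object]:
    """IDs of the sensors whose event detection flag is currently set."""
    return [t["id"] for t in tables if t["event_detected"]]


def default_weights(tables: List[Dict[str, object]]) -> Dict[object, object]:
    """Map each sensor ID to its default fusion weight."""
    return {t["id"]: t["default_weight"] for t in tables}


detecting = sensors_detecting_event(sensor_tables)
weights = default_weights(sensor_tables)
```

Each lookup is a constant-time key access, which keeps the metadata queries cheap even as sensors are added.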
In the illustrated example, the decision tree 400 begins by determining the time of day 402. If it is currently daytime 404, the decision tree 400 determines whether the weather is good 406 or bad 410. If the weather is good 406, the event detection status of the camera is checked 408 since it is usually reliable when visibility is good (e.g., due to daylight and clear skies). If the weather is bad 410, the event detection status of LIDAR and the microphone are checked 412 since they are usually more reliable than the camera when visibility is poor (e.g., due to bad weather). Similarly, if it is currently nighttime 414, the event detection status of LIDAR and the microphone are checked 416 since they are usually more reliable than the camera when visibility is poor (e.g., due to the lack of light at night).
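The branches of this decision tree can be expressed as a short function. The sketch assumes that the time of day and weather have already been reduced to two boolean inputs; the function and sensor names are hypothetical, and the block numbers in the comments refer to the tree described above.

```python
from typing import List


def sensors_to_check(is_daytime: bool, weather_is_good: bool) -> List[str]:
    """Return the sensors whose event detection status should be checked."""
    if is_daytime:                          # 404: daytime
        if weather_is_good:                 # 406: good weather
            return ["camera"]               # 408: camera is reliable in good visibility
        return ["lidar", "microphone"]      # 410/412: bad weather, poor visibility
    return ["lidar", "microphone"]          # 414/416: night, poor visibility


clear_day = sensors_to_check(is_daytime=True, weather_is_good=True)
stormy_day = sensors_to_check(is_daytime=True, weather_is_good=False)
night = sensors_to_check(is_daytime=False, weather_is_good=True)
```

At night the weather input does not change the outcome, matching the tree, which does not branch on weather under the nighttime node.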
In the illustrated example, the microphone 506 at node 502b may localize the direction of sound from the fighting event 508 and coordinate with its associated camera 504. For example, various configuration parameters of the microphone 506 at node 502b may be adjusted to increase the confidence level of the sound classification model, such as the audio beam direction, the audio sampling rate, the microphone sensitivity/gain, and so forth. Further, various configuration parameters of the camera 504 at node 502b may also be adjusted to detect the fighting event with high confidence, such as adjusting the camera field of view 505 in the direction of the fighting event 508, adjusting the lighting intensity to make the video frames clearer for fighting action recognition, and so forth.
The process flow begins at block 602 by receiving, via interface circuitry, sensor data captured by multiple sensors. In various embodiments, the sensors may include at least one of a camera, a microphone, a location sensor, a radio frequency identification (RFID) sensor, a light detection and ranging (LIDAR) sensor, a radio detection and ranging (RADAR) sensor, an ultrasonic sensor, a thermal sensor, an infrared sensor, a temperature sensor, a gas sensor, or a magnetic sensor, among other examples.
The process flow then proceeds to block 604 to perform event detection on the sensor data. For example, one or more workloads may be executed to detect events based on the sensor data captured by the respective sensors, such as performing inference on the sensor data using artificial intelligence and/or machine learning models trained to detect events. In various embodiments, any type of event may be detected depending on the particular use case, such as fighting, criminal activity (e.g., theft), emergencies, manufacturing anomalies, human behavior and emotions (e.g., shopper behavior in retail stores, student behavior in schools, employee behavior at work), vehicle maneuvers (e.g., a car turning or switching lanes), and so forth.
In some embodiments, for example, event detection may be performed on visual data (e.g., images and videos captured by a camera or other vision sensor) using convolutional neural networks (CNN) (e.g., Inception/ResNet CNN architectures, fuzzy CNNs (F-CNN)), among other examples.
In some embodiments, event detection may be performed on audio (e.g., sound captured by a microphone) using transformer models, recurrent neural networks (RNN), long short-term memory (LSTM) networks, and/or CNNs, among other examples.
In some embodiments, event detection may be performed on point clouds captured by LIDAR or RADAR using PointNet architectures and/or clustering (e.g., k-nearest neighbors (kNN), Gaussian mixture models (GMM), k-means clustering, density-based spatial clustering of applications with noise (DBSCAN)), among other examples.
In various embodiments, however, any suitable type and/or combination of artificial intelligence, machine learning, and/or data analysis techniques may be used for event detection and/or other use cases, including, without limitation, artificial neural networks (ANN), deep learning, deep neural networks, convolutional neural networks (CNN) (e.g., Inception/ResNet CNN architectures, fuzzy CNNs (F-CNN)), feed-forward artificial neural networks, multilayer perceptron (MLP), pattern recognition, scale-invariant feature transforms (SIFT), principal component analysis (PCA), discrete cosine transforms (DCT), recurrent neural networks (RNN), long short-term memory (LSTM) networks, transformers, clustering (e.g., k-nearest neighbors (kNN), Gaussian mixture models (GMM), k-means clustering, density-based spatial clustering of applications with noise (DBSCAN)), support vector machines (SVM), decision tree learning (e.g., random forests, classification and regression trees (CART)), gradient boosting (e.g., gradient tree boosting, extreme gradient boosted trees), logistic regression, Bayesian networks, naive Bayes, moving average models, autoregressive moving average (ARMA) models, autoregressive integrated moving average (ARIMA) models, exponential smoothing models, regression analysis models, and/or ensembles thereof (e.g., models that combine the predictions of multiple machine learning models to improve prediction accuracy), among other examples.
The process flow then proceeds to block 606 to determine, based on performing event detection on the sensor data, whether an inconsistency is detected among the sensors. For example, the respective sensors may inconsistently detect or fail to detect an event, the confidence level of a detected event may vary significantly among the sensors, and so forth. In some cases, for example, after performing event detection on the sensor data captured by the respective sensors, an event may be detected based on the sensor data of some sensors, while the event fails to be detected based on the sensor data of other sensors.
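One way to make the inconsistency test at block 606 concrete is shown below. The two criteria follow the description above (mixed detections, or widely varying confidence levels), but the specific threshold, the `max_spread` tolerance, and the function name are assumptions made for illustration.

```python
from typing import Sequence


def is_inconsistent(confidences: Sequence[float], threshold: float = 0.8,
                    max_spread: float = 0.3) -> bool:
    """Flag sensors that disagree on detection or differ widely in confidence."""
    detections = [c >= threshold for c in confidences]
    # Some sensors detect the event while others do not.
    mixed = any(detections) and not all(detections)
    # Confidence levels vary significantly across sensors.
    spread = max(confidences) - min(confidences)
    return mixed or spread > max_spread


audio_only = is_inconsistent([0.10, 0.90])   # microphone detects, camera does not
both_detect = is_inconsistent([0.90, 0.85])  # sensors agree on a detection
both_quiet = is_inconsistent([0.10, 0.15])   # sensors agree on no detection
```

Sensors that agree, whether on a detection or on no detection, pass through to the event decision without any reconfiguration.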
The process flow then proceeds to block 608 to detect an external environment of the sensors based on the sensor data. For example, one or more conditions of the external environment may be detected based on the sensor data, such as time of day, lighting, weather, visibility, noise, location, and/or direction (e.g., direction of sound or other activity), among other examples.
The process flow then proceeds to block 610 to adjust one or more configuration parameters used for event detection based on the external environment of the sensors. In some embodiments, the configuration parameters may include sensor settings associated with the sensors, sensor fusion weights indicating the level of influence of the respective sensors for performing event detection, and/or parameters associated with event detection models trained to perform event detection based on the sensor data captured by certain sensors.
In some cases, for example, various sensor settings associated with a camera and/or a microphone may be adjusted based on the external environment. The adjusted camera settings may include pan, tilt, or zoom setting(s) associated with a field of view of the camera, a resolution of the camera, a frame rate of the camera, and/or a lighting intensity of the camera, among other examples. The adjusted microphone settings may include a sensitivity of the microphone, a beam direction of the microphone, and/or a sampling rate of the microphone, among other examples.
Additionally, or alternatively, sensor fusion weights may be adjusted based on the external environment to modify the level of influence of certain sensors when performing event detection. For example, at night, the weights for sensors that are reliable in low-light conditions may be increased (e.g., LIDAR, microphones), while the weights for sensors that are less reliable in those conditions may be decreased (e.g., cameras).
Additionally, or alternatively, certain event detection models used to perform event detection may be reconfigured based on the external environment. In some cases, for example, certain performance characteristics of the event detection models may be adjusted, such as the precision, input data resolution, power efficiency, latency, throughput, accuracy, bandwidth, sampling rate, and/or inference speed, among other examples. Similarly, certain event detection models may be replaced with alternative event detection models that have different performance characteristics. For example, at night, an event detection model for a camera may be replaced with an alternative event detection model trained to perform event detection in low-lighting conditions.
After adjusting the configuration parameters used for event detection, the process flow repeats blocks 602-610 to continue receiving sensor data, performing event detection on the sensor data based on the adjusted configuration parameters, and (re)adjusting the configuration parameters based on the external environment, until the inconsistency among the sensors is resolved at block 606.
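The repetition of blocks 602-610 can be sketched as a loop, shown here in Python. Every callable passed in is a hypothetical placeholder for the corresponding block, and the `max_rounds` cap is an added assumption to keep the sketch from looping forever; the disclosure itself does not state an iteration limit.

```python
from typing import Any, Callable, Optional, Tuple


def run_until_consistent(
    read_sensors: Callable[[], Any],              # block 602: receive sensor data
    detect: Callable[[Any, Any], Any],            # block 604: event detection
    is_inconsistent: Callable[[Any], bool],       # block 606: inconsistency check
    sense_environment: Callable[[Any], Any],      # block 608: detect environment
    adjust: Callable[[Any, Any], Any],            # block 610: adjust configuration
    config: Any,
    max_rounds: int = 10,                         # assumed safety cap
) -> Tuple[Optional[Any], Any]:
    """Repeat detection and reconfiguration until the sensors agree.

    Returns (results, config) once no inconsistency is found, or
    (None, config) if max_rounds is exhausted first.
    """
    for _ in range(max_rounds):
        data = read_sensors()
        results = detect(data, config)
        if not is_inconsistent(results):
            return results, config
        config = adjust(config, sense_environment(data))
    return None, config
```

Passing the blocks in as functions keeps the loop independent of any particular sensor type or detection model.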
Once the inconsistency among the sensors is no longer detected at block 606, the process flow proceeds to block 612 to determine whether an event is detected. In some embodiments, for example, an event may be officially detected or triggered if multiple sensors or modalities detect the event with a confidence level above a particular threshold (e.g., 80% or higher). If an event is not detected, the process flow proceeds back to block 602 to continue receiving sensor data and performing event detection. If an event is detected, however, the process flow proceeds to block 614 to trigger an appropriate action in response to the detected event, such as alerting a user or entity, logging or storing the event, gathering additional information about the event (e.g., performing face detection to identify people involved in the event), triggering a responsive action by a robot (e.g., robots on the manufacturing line, autonomous vehicles such as cars and drones), and/or performing any other responsive or remedial action based on the particular use case.
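The trigger rule at block 612 might be expressed as follows in Python. This is only a sketch: the function name and the `min_sensors` parameter are illustrative, and the 0.8 default simply mirrors the 80% example threshold mentioned above.

```python
from typing import Dict


def event_triggered(
    confidences: Dict[str, float],
    threshold: float = 0.8,
    min_sensors: int = 2,
) -> bool:
    """Return True if at least min_sensors sensors meet the threshold.

    confidences maps a sensor (or modality) name to its detection
    confidence in the range 0.0 to 1.0.
    """
    agreeing = [name for name, c in confidences.items() if c >= threshold]
    return len(agreeing) >= min_sensors
```

With these defaults, confidences of 0.9 for audio and 0.85 for video would trigger an event, whereas 0.9 for audio and 0.4 for video would not.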
At this point, the flowchart may be complete. In some embodiments, however, the flowchart may restart and/or certain blocks may be repeated. For example, in some embodiments, the flowchart may restart at block 602 to continue receiving sensor data and performing event detection.
Examples of various computing embodiments that may be used to implement the event detection solution described throughout this disclosure are described below. In particular, any aspects of the solution described in the preceding sections may be implemented using the computing embodiments described below.
Compute, memory, and storage are scarce resources, and generally decrease depending on the edge location (e.g., fewer processing resources being available at consumer endpoint devices than at a base station, and fewer at a base station than at a central office). However, the closer that the edge location is to the endpoint (e.g., user equipment (UE)), the more that space and power are often constrained. Thus, edge computing attempts to reduce the amount of resources needed for network services, through the distribution of more resources which are located closer both geographically and in network access time. In this manner, edge computing attempts to bring the compute resources to the workload data where appropriate, or bring the workload data to the compute resources.
The following describes aspects of an edge cloud architecture that covers multiple potential deployments and addresses restrictions that some network operators or service providers may have in their own infrastructures. These include variation of configurations based on the edge location (because edges at a base station level, for instance, may have more constrained performance and capabilities in a multi-tenant scenario); configurations based on the type of compute, memory, storage, fabric, acceleration, or like resources available to edge locations, tiers of locations, or groups of locations; the service, security, and management and orchestration capabilities; and related objectives to achieve usability and performance of end services. These deployments may accomplish processing in network layers that may be considered as “near edge”, “close edge”, “local edge”, “middle edge”, or “far edge” layers, depending on latency, distance, and timing characteristics.
Edge computing is a developing paradigm where computing is performed at or closer to the “edge” of a network, typically through the use of a compute platform (e.g., x86 or ARM compute hardware architecture) implemented at base stations, gateways, network routers, or other devices which are much closer to endpoint devices producing and consuming the data. For example, edge gateway servers may be equipped with pools of memory and storage resources to perform computation in real-time for low latency use-cases (e.g., autonomous driving or video surveillance) for connected client devices. Or as an example, base stations may be augmented with compute and acceleration resources to directly process service workloads for connected user equipment, without further communicating data via backhaul networks. Or as another example, central office network management hardware may be replaced with standardized compute hardware that performs virtualized network functions and offers compute resources for the execution of services and consumer functions for connected devices. Within edge computing networks, there may be scenarios in services in which the compute resource will be “moved” to the data, as well as scenarios in which the data will be “moved” to the compute resource. As a further example, base station compute, acceleration and network resources can provide services in order to scale to workload demands on an as-needed basis by activating dormant capacity (subscription, capacity on demand) in order to manage corner cases, emergencies or to provide longevity for deployed resources over a significantly longer implemented lifecycle.
Examples of latency, resulting from network communication distance and processing time constraints, may range from less than a millisecond (ms) when among the endpoint layer 800, under 5 ms at the edge devices layer 810, to even between 10 and 40 ms when communicating with nodes at the network access layer 820. Beyond the edge cloud 710 are core network 830 and cloud data center 840 layers, each with increasing latency (e.g., between 50 and 60 ms at the core network layer 830, to 100 or more ms at the cloud data center layer 840). As a result, operations at a core network data center 835 or a cloud data center 845, with latencies of at least 50 to 100 ms or more, will not be able to accomplish many time-critical functions of the use cases 805. Each of these latency values is provided for purposes of illustration and contrast; it will be understood that the use of other access network mediums and technologies may further reduce the latencies. In some examples, respective portions of the network may be categorized as “close edge”, “local edge”, “near edge”, “middle edge”, or “far edge” layers, relative to a network source and destination. For instance, from the perspective of the core network data center 835 or a cloud data center 845, a central office or content data network may be considered as being located within a “near edge” layer (“near” to the cloud, having high latency values when communicating with the devices and endpoints of the use cases 805), whereas an access point, base station, on-premise server, or network gateway may be considered as located within a “far edge” layer (“far” from the cloud, having low latency values when communicating with the devices and endpoints of the use cases 805).
It will be understood that other categorizations of a particular network layer as constituting a “close”, “local”, “near”, “middle”, or “far” edge may be based on latency, distance, number of network hops, or other measurable characteristics, as measured from a source in any of the network layers 800-840.
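To make the latency contrast concrete, the following Python sketch lists which layers could satisfy a given latency budget. The per-layer numbers are the illustrative upper bounds quoted above (with 100 ms used as the lower bound for the cloud data center layer), and the table and function names are hypothetical.

```python
from typing import List, Tuple

# Illustrative latency figures in milliseconds, taken from the example
# values quoted in the text; actual deployments will differ.
LAYER_LATENCY_MS: List[Tuple[str, float]] = [
    ("endpoint layer 800", 1.0),
    ("edge devices layer 810", 5.0),
    ("network access layer 820", 40.0),
    ("core network layer 830", 60.0),
    ("cloud data center layer 840", 100.0),
]


def layers_within_budget(budget_ms: float) -> List[str]:
    """Return the layers whose illustrative latency fits within budget_ms."""
    return [name for name, latency in LAYER_LATENCY_MS if latency <= budget_ms]
```

Under these assumed figures, a time-critical function with a 20 ms budget could be served only from the endpoint and edge devices layers.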
The various use cases 805 may access resources under usage pressure from incoming streams, due to multiple services utilizing the edge cloud. To achieve results with low latency, the services executed within the edge cloud 710 balance varying requirements in terms of: (a) Priority (throughput or latency) and Quality of Service (QoS) (e.g., traffic for an autonomous car may have higher priority than a temperature sensor in terms of response time requirement; or, a performance sensitivity/bottleneck may exist at a compute/accelerator, memory, storage, or network resource, depending on the application); (b) Reliability and Resiliency (e.g., some input streams need to be acted upon and the traffic routed with mission-critical reliability, whereas some other input streams may tolerate an occasional failure, depending on the application); and (c) Physical constraints (e.g., power, cooling and form-factor).
The end-to-end service view for these use cases involves the concept of a service-flow and is associated with a transaction. The transaction details the overall service requirement for the entity consuming the service, as well as the associated services for the resources, workloads, workflows, and business functional and business level requirements. The services executed with the “terms” described may be managed at each layer in a way to assure real-time and runtime contractual compliance for the transaction during the lifecycle of the service. When a component in the transaction is missing its agreed-to service level agreement (SLA), the system as a whole (components in the transaction) may provide the ability to (1) understand the impact of the SLA violation, (2) augment other components in the system to resume overall transaction SLA, and (3) implement steps to remediate.
Thus, with these variations and service features in mind, edge computing within the edge cloud 710 may provide the ability to serve and respond to multiple applications of the use cases 805 (e.g., object tracking, video surveillance, connected cars, etc.) in real-time or near real-time, and meet ultra-low latency requirements for these multiple applications. These advantages enable a whole new class of applications (Virtual Network Functions (VNFs), Function as a Service (FaaS), Edge as a Service (EaaS), standard processes, etc.), which cannot leverage conventional cloud computing due to latency or other limitations.
However, with the advantages of edge computing come the following caveats. The devices located at the edge are often resource constrained and therefore there is pressure on usage of edge resources. Typically, this is addressed through the pooling of memory and storage resources for use by multiple users (tenants) and devices. The edge may be power and cooling constrained and therefore the power usage needs to be accounted for by the applications that are consuming the most power. There may be inherent power-performance tradeoffs in these pooled memory resources, as many of them are likely to use emerging memory technologies, where greater memory bandwidth requires more power. Likewise, improved security of hardware and root of trust trusted functions are also required, because edge locations may be unmanned and may even need permissioned access (e.g., when housed in a third-party location). Such issues are magnified in the edge cloud 710 in a multi-tenant, multi-owner, or multi-access setting, where services and applications are requested by many users, especially as network usage dynamically fluctuates and the composition of the multiple stakeholders, use cases, and services changes.
At a more generic level, an edge computing system may be described to encompass any number of deployments at the previously discussed layers operating in the edge cloud 710 (network layers 800-840), which provide coordination from client and distributed computing devices. One or more edge gateway nodes, one or more edge aggregation nodes, and one or more core data centers may be distributed across layers of the network to provide an implementation of the edge computing system by or on behalf of a telecommunication service provider (“telco”, or “TSP”), internet-of-things service provider, cloud service provider (CSP), enterprise entity, or any other number of entities. Various implementations and configurations of the edge computing system may be provided dynamically, such as when orchestrated to meet service objectives.
Consistent with the examples provided herein, a client compute node may be embodied as any type of endpoint component, device, appliance, or other thing capable of communicating as a producer or consumer of data. Further, the label “node” or “device” as used in the edge computing system does not necessarily mean that such node or device operates in a client or agent/minion/follower role; rather, any of the nodes or devices in the edge computing system refer to individual entities, nodes, or subsystems which include discrete or connected hardware or software configurations to facilitate or use the edge cloud 710.
As such, the edge cloud 710 is formed from network components and functional features operated by and within edge gateway nodes, edge aggregation nodes, or other edge compute nodes among network layers 810-830. The edge cloud 710 thus may be embodied as any type of network that provides edge computing and/or storage resources which are proximately located to radio access network (RAN) capable endpoint devices (e.g., mobile computing devices, IoT devices, smart devices, etc.), which are discussed herein. In other words, the edge cloud 710 may be envisioned as an “edge” which connects the endpoint devices and traditional network access points that serve as an ingress point into service provider core networks, including mobile carrier networks (e.g., Global System for Mobile Communications (GSM) networks, Long-Term Evolution (LTE) networks, 5G/6G networks, etc.), while also providing storage and/or compute capabilities. Other types and forms of network access (e.g., Wi-Fi, long-range wireless, wired networks including optical networks) may also be utilized in place of or in combination with such 3GPP carrier networks.
The network components of the edge cloud 710 may be servers, multi-tenant servers, appliance computing devices, and/or any other type of computing devices. For example, the edge cloud 710 may include an appliance computing device that is a self-contained electronic device including a housing, a chassis, a case or a shell. In some circumstances, the housing may be dimensioned for portability such that it can be carried by a human and/or shipped. Example housings may include materials that form one or more exterior surfaces that partially or fully protect contents of the appliance, in which protection may include weather protection, hazardous environment protection (e.g., EMI, vibration, extreme temperatures), and/or enable submergibility. Example housings may include power circuitry to provide power for stationary and/or portable implementations, such as AC power inputs, DC power inputs, AC/DC or DC/AC converter(s), power regulators, transformers, charging circuitry, batteries, wired inputs and/or wireless power inputs. Example housings and/or surfaces thereof may include or connect to mounting hardware to enable attachment to structures such as buildings, telecommunication structures (e.g., poles, antenna structures, etc.) and/or racks (e.g., server racks, blade mounts, etc.). Example housings and/or surfaces thereof may support one or more sensors (e.g., temperature sensors, vibration sensors, light sensors, acoustic sensors, capacitive sensors, proximity sensors, etc.). One or more such sensors may be contained in, carried by, or otherwise embedded in the surface and/or mounted to the surface of the appliance. Example housings and/or surfaces thereof may support mechanical connectivity, such as propulsion hardware (e.g., wheels, propellers, etc.) and/or articulating hardware (e.g., robot arms, pivotable appendages, etc.). 
In some circumstances, the sensors may include any type of input devices such as user interface hardware (e.g., buttons, switches, dials, sliders, etc.). In some circumstances, example housings include output devices contained in, carried by, embedded therein and/or attached thereto. Output devices may include displays, touchscreens, lights, LEDs, speakers, I/O ports (e.g., USB), etc. In some circumstances, edge devices are devices presented in the network for a specific purpose (e.g., a traffic light), but may have processing and/or other capacities that may be utilized for other purposes. Such edge devices may be independent from other networked devices and may be provided with a housing having a form factor suitable for its primary purpose; yet be available for other compute tasks that do not interfere with its primary task. Edge devices include Internet of Things devices. The appliance computing device may include hardware and software components to manage local issues such as device temperature, vibration, resource utilization, updates, power issues, physical and network security, etc. Example hardware for implementing an appliance computing device is described in conjunction with
In
In further examples, any of the compute nodes or devices discussed with reference to the present edge computing systems and environment may be fulfilled based on the components depicted in
In the simplified example depicted in
The compute node 1000 may be embodied as any type of engine, device, or collection of devices capable of performing various compute functions. In some examples, the compute node 1000 may be embodied as a single device such as an integrated circuit, an embedded system, a field-programmable gate array (FPGA), a system-on-a-chip (SOC), or other integrated system or device. In the illustrative example, the compute node 1000 includes or is embodied as a processor 1004 and a memory 1006. The processor 1004 may be embodied as any type of processor capable of performing the functions described herein (e.g., executing an application). For example, the processor 1004 may be embodied as a multi-core processor(s), a microcontroller, a processing unit, a specialized or special purpose processing unit, or other processor or processing/controlling circuit.
In some examples, the processor 1004 may be embodied as, include, or be coupled to an FPGA, an application specific integrated circuit (ASIC), reconfigurable hardware or hardware circuitry, or other specialized hardware to facilitate performance of the functions described herein. Also in some examples, the processor 1004 may be embodied as a specialized x-processing unit (xPU) also known as a data processing unit (DPU), infrastructure processing unit (IPU), or network processing unit (NPU). Such an xPU may be embodied as a standalone circuit or circuit package, integrated within an SOC, or integrated with networking circuitry (e.g., in a SmartNIC, or enhanced SmartNIC), acceleration circuitry, storage devices, or AI hardware (e.g., GPUs or programmed FPGAs). Such an xPU may be designed to receive programming to process one or more data streams and perform specific tasks and actions for the data streams (such as hosting microservices, performing service management or orchestration, organizing or managing server or data center hardware, managing service meshes, or collecting and distributing telemetry), outside of the CPU or general purpose processing hardware. However, it will be understood that an xPU, a SOC, a CPU, and other variations of the processor 1004 may work in coordination with each other to execute many types of operations and instructions within and on behalf of the compute node 1000.
The memory 1006 may be embodied as any type of volatile (e.g., dynamic random access memory (DRAM), etc.) or non-volatile memory or data storage capable of performing the functions described herein. Volatile memory may be a storage medium that requires power to maintain the state of data stored by the medium. Non-limiting examples of volatile memory may include various types of random access memory (RAM), such as DRAM or static random access memory (SRAM). One particular type of DRAM that may be used in a memory module is synchronous dynamic random access memory (SDRAM).
In an example, the memory device is a block addressable memory device, such as those based on NAND or NOR technologies. A memory device may also include a three dimensional crosspoint memory device (e.g., Intel® 3D XPoint™ memory), or other byte addressable write-in-place nonvolatile memory devices. The memory device may refer to the die itself and/or to a packaged memory product. In some examples, 3D crosspoint memory (e.g., Intel® 3D XPoint™ memory) may comprise a transistor-less stackable cross point architecture in which memory cells sit at the intersection of word lines and bit lines and are individually addressable and in which bit storage is based on a change in bulk resistance. In some examples, all or a portion of the memory 1006 may be integrated into the processor 1004. The memory 1006 may store various software and data used during operation such as one or more applications, data operated on by the application(s), libraries, and drivers.
The compute circuitry 1002 is communicatively coupled to other components of the compute node 1000 via the I/O subsystem 1008, which may be embodied as circuitry and/or components to facilitate input/output operations with the compute circuitry 1002 (e.g., with the processor 1004 and/or the main memory 1006) and other components of the compute circuitry 1002. For example, the I/O subsystem 1008 may be embodied as, or otherwise include, memory controller hubs, input/output control hubs, integrated sensor hubs, firmware devices, communication links (e.g., point-to-point links, bus links, wires, cables, light guides, printed circuit board traces, etc.), and/or other components and subsystems to facilitate the input/output operations. In some examples, the I/O subsystem 1008 may form a portion of a system-on-a-chip (SoC) and be incorporated, along with one or more of the processor 1004, the memory 1006, and other components of the compute circuitry 1002, into the compute circuitry 1002.
The one or more illustrative data storage devices 1010 may be embodied as any type of devices configured for short-term or long-term storage of data such as, for example, memory devices and circuits, memory cards, hard disk drives, solid-state drives, or other data storage devices. Individual data storage devices 1010 may include a system partition that stores data and firmware code for the data storage device 1010. Individual data storage devices 1010 may also include one or more operating system partitions that store data files and executables for operating systems depending on, for example, the type of compute node 1000.
The communication circuitry 1012 may be embodied as any communication circuit, device, or collection thereof, capable of enabling communications over a network between the compute circuitry 1002 and another compute device (e.g., an edge gateway of an implementing edge computing system). The communication circuitry 1012 may be configured to use any one or more communication technologies (e.g., wired or wireless communications) and associated protocols (e.g., a cellular networking protocol such as a 3GPP 4G or 5G standard, a wireless local area network protocol such as IEEE 802.11/Wi-Fi®, a wireless wide area network protocol, Ethernet, Bluetooth®, Bluetooth Low Energy, an IoT protocol such as IEEE 802.15.4 or ZigBee®, low-power wide-area network (LPWAN) or low-power wide-area (LPWA) protocols, etc.) to effect such communication.
The illustrative communication circuitry 1012 includes a network interface controller (NIC) 1020, which may also be referred to as a host fabric interface (HFI). The NIC 1020 may be embodied as one or more add-in-boards, daughter cards, network interface cards, controller chips, chipsets, or other devices that may be used by the compute node 1000 to connect with another compute device (e.g., an edge gateway node). In some examples, the NIC 1020 may be embodied as part of a system-on-a-chip (SoC) that includes one or more processors, or included on a multichip package that also contains one or more processors. In some examples, the NIC 1020 may include a local processor (not shown) and/or a local memory (not shown) that are both local to the NIC 1020. In such examples, the local processor of the NIC 1020 may be capable of performing one or more of the functions of the compute circuitry 1002 described herein. Additionally, or alternatively, in such examples, the local memory of the NIC 1020 may be integrated into one or more components of the client compute node at the board level, socket level, chip level, and/or other levels.
Additionally, in some examples, a respective compute node 1000 may include one or more peripheral devices 1014. Such peripheral devices 1014 may include any type of peripheral device found in a compute device or server such as audio input devices, a display, other input/output devices, interface devices, and/or other peripheral devices, depending on the particular type of the compute node 1000. In further examples, the compute node 1000 may be embodied by a respective edge compute node (whether a client, gateway, or aggregation node) in an edge computing system or like forms of appliances, computers, subsystems, circuitry, or other components.
In a more detailed example,
The edge computing device 1050 may include processing circuitry in the form of a processor 1052, which may be a microprocessor, a multi-core processor, a multithreaded processor, an ultra-low voltage processor, an embedded processor, an xPU/DPU/IPU/NPU, special purpose processing unit, specialized processing unit, or other known processing elements. The processor 1052 may be a part of a system on a chip (SoC) in which the processor 1052 and other components are formed into a single integrated circuit, or a single package, such as the Edison™ or Galileo™ SoC boards from Intel Corporation, Santa Clara, Calif. As an example, the processor 1052 may include an Intel® Architecture Core™ based CPU processor, such as a Quark™, an Atom™, an i3, an i5, an i7, an i9, or an MCU-class processor, or another such processor available from Intel®. However, any number of other processors may be used, such as those available from Advanced Micro Devices, Inc. (AMD®) of Sunnyvale, Calif., a MIPS®-based design from MIPS Technologies, Inc. of Sunnyvale, Calif., an ARM®-based design licensed from ARM Holdings, Ltd. or a customer thereof, or their licensees or adopters. The processors may include units such as an A5-A13 processor from Apple® Inc., a Snapdragon™ processor from Qualcomm® Technologies, Inc., or an OMAP™ processor from Texas Instruments, Inc. The processor 1052 and accompanying circuitry may be provided in a single socket form factor, multiple socket form factor, or a variety of other formats, including in limited hardware configurations or configurations that include fewer than all elements shown in
The processor 1052 may communicate with a system memory 1054 over an interconnect 1056 (e.g., a bus). Any number of memory devices may be used to provide for a given amount of system memory. As examples, the memory 1054 may be random access memory (RAM) in accordance with a Joint Electron Devices Engineering Council (JEDEC) design such as the DDR or mobile DDR standards (e.g., LPDDR, LPDDR2, LPDDR3, or LPDDR4). In particular examples, a memory component may comply with a DRAM standard promulgated by JEDEC, such as JESD79F for DDR SDRAM, JESD79-2F for DDR2 SDRAM, JESD79-3F for DDR3 SDRAM, JESD79-4A for DDR4 SDRAM, JESD209 for Low Power DDR (LPDDR), JESD209-2 for LPDDR2, JESD209-3 for LPDDR3, and JESD209-4 for LPDDR4. Such standards (and similar standards) may be referred to as DDR-based standards and communication interfaces of the storage devices that implement such standards may be referred to as DDR-based interfaces. In various implementations, the individual memory devices may be of any number of different package types such as single die package (SDP), dual die package (DDP) or quad die package (QDP). These devices, in some examples, may be directly soldered onto a motherboard to provide a lower profile solution, while in other examples the devices are configured as one or more memory modules that in turn couple to the motherboard by a given connector. Any number of other memory implementations may be used, such as other types of memory modules, e.g., dual inline memory modules (DIMMs) of different varieties including but not limited to microDIMMs or MiniDIMMs.
To provide for persistent storage of information such as data, applications, operating systems and so forth, a storage 1058 may also couple to the processor 1052 via the interconnect 1056. In an example, the storage 1058 may be implemented via a solid-state disk drive (SSDD). Other devices that may be used for the storage 1058 include flash memory cards, such as Secure Digital (SD) cards, microSD cards, eXtreme Digital (XD) picture cards, and the like, and Universal Serial Bus (USB) flash drives. In an example, the memory device may be or may include memory devices that use chalcogenide glass, multi-threshold level NAND flash memory, NOR flash memory, single or multi-level Phase Change Memory (PCM), a resistive memory, nanowire memory, ferroelectric transistor random access memory (FeTRAM), anti-ferroelectric memory, magnetoresistive random access memory (MRAM) memory that incorporates memristor technology, resistive memory including the metal oxide base, the oxygen vacancy base and the conductive bridge Random Access Memory (CB-RAM), or spin transfer torque (STT)-MRAM, a spintronic magnetic junction memory based device, a magnetic tunneling junction (MTJ) based device, a DW (Domain Wall) and SOT (Spin Orbit Transfer) based device, a thyristor based memory device, or a combination of any of the above, or other memory.
In low power implementations, the storage 1058 may be on-die memory or registers associated with the processor 1052. However, in some examples, the storage 1058 may be implemented using a micro hard disk drive (HDD). Further, any number of new technologies may be used for the storage 1058 in addition to, or instead of, the technologies described, such as resistance change memories, phase change memories, holographic memories, or chemical memories, among others.
The components may communicate over the interconnect 1056. The interconnect 1056 may include any number of technologies, including industry standard architecture (ISA), extended ISA (EISA), peripheral component interconnect (PCI), peripheral component interconnect extended (PCIx), PCI express (PCIe), or any number of other technologies. The interconnect 1056 may be a proprietary bus, for example, used in an SoC based system. Other bus systems may be included, such as an Inter-Integrated Circuit (I2C) interface, a Serial Peripheral Interface (SPI) interface, point to point interfaces, and a power bus, among others.
The interconnect 1056 may couple the processor 1052 to a transceiver 1066, for communications with the connected edge devices 1062. The transceiver 1066 may use any number of frequencies and protocols, such as 2.4 Gigahertz (GHz) transmissions under the IEEE 802.15.4 standard, using the Bluetooth® low energy (BLE) standard, as defined by the Bluetooth® Special Interest Group, or the ZigBee® standard, among others. Any number of radios, configured for a particular wireless communication protocol, may be used for the connections to the connected edge devices 1062. For example, a wireless local area network (WLAN) unit may be used to implement Wi-Fi® communications in accordance with the Institute of Electrical and Electronics Engineers (IEEE) 802.11 standard. In addition, wireless wide area communications, e.g., according to a cellular or other wireless wide area protocol, may occur via a wireless wide area network (WWAN) unit.
The wireless network transceiver 1066 (or multiple transceivers) may communicate using multiple standards or radios for communications at a different range. For example, the edge computing node 1050 may communicate with close devices, e.g., within about 10 meters, using a local transceiver based on Bluetooth Low Energy (BLE), or another low power radio, to save power. More distant connected edge devices 1062, e.g., within about 50 meters, may be reached over ZigBee® or other intermediate power radios. Both communications techniques may take place over a single radio at different power levels or may take place over separate transceivers, for example, a local transceiver using BLE and a separate mesh transceiver using ZigBee®.
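By way of illustration only, and not limitation, the range-based choice between radios described above may be sketched as follows. The `Radio` type, the `select_radio` function, and the `relative_power` ranking are hypothetical names introduced for this sketch; the approximate 10 meter and 50 meter ranges are the ones given in the preceding paragraph.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Radio:
    name: str
    max_range_m: float   # approximate reach, from the ranges stated in the text
    relative_power: int  # hypothetical ranking: lower means less power drawn


# Assumed radio table: BLE for close devices, ZigBee for more distant ones.
RADIOS = (
    Radio("BLE", 10.0, 1),
    Radio("ZigBee", 50.0, 2),
)


def select_radio(distance_m: float,
                 radios: Sequence[Radio] = RADIOS) -> Optional[Radio]:
    """Return the lowest-power radio whose range covers distance_m, or None."""
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    candidates = [r for r in radios if distance_m <= r.max_range_m]
    # Prefer the radio that draws the least power among those in range.
    return min(candidates, key=lambda r: r.relative_power, default=None)


near = select_radio(5.0)     # within about 10 meters
far = select_radio(30.0)     # within about 50 meters
beyond = select_radio(200.0) # out of reach of both radios in this table
```

In this sketch, a device beyond the reach of both radios yields no selection, in which case a wide area transceiver or wired connection, such as those described below, may be used instead.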
A wireless network transceiver 1066 (e.g., a radio transceiver) may be included to communicate with devices or services in a cloud (e.g., an edge cloud 1095) via local or wide area network protocols. The wireless network transceiver 1066 may be a low-power wide-area (LPWA) transceiver that follows the IEEE 802.15.4, or IEEE 802.15.4g standards, among others. The edge computing node 1050 may communicate over a wide area using LoRaWAN™ (Long Range Wide Area Network) developed by Semtech and the LoRa Alliance. The techniques described herein are not limited to these technologies but may be used with any number of other cloud transceivers that implement long range, low bandwidth communications, such as Sigfox, and other technologies. Further, other communications techniques, such as time-slotted channel hopping, described in the IEEE 802.15.4e specification may be used.
Any number of other radio communications and protocols may be used in addition to the systems mentioned for the wireless network transceiver 1066, as described herein. For example, the transceiver 1066 may include a cellular transceiver that uses spread spectrum (SPA/SAS) communications for implementing high-speed communications. Further, any number of other protocols may be used, such as Wi-Fi® networks for medium speed communications and provision of network communications. The transceiver 1066 may include radios that are compatible with any number of 3GPP (Third Generation Partnership Project) specifications, such as Long Term Evolution (LTE) and 5th Generation (5G) communication systems, discussed in further detail at the end of the present disclosure.
A network interface controller (NIC) 1068 may be included to provide a wired communication to nodes of the edge cloud 1095 or to other devices, such as the connected edge devices 1062 (e.g., operating in a mesh). The wired communication may provide an Ethernet connection or may be based on other types of networks, such as Controller Area Network (CAN), Local Interconnect Network (LIN), DeviceNet, ControlNet, Data Highway+, PROFIBUS, or PROFINET, among many others. An additional NIC 1068 may be included to enable connecting to a second network, for example, a first NIC 1068 providing communications to the cloud over Ethernet, and a second NIC 1068 providing communications to other devices over another type of network.
Given the variety of types of applicable communications from the device to another component or network, applicable communications circuitry used by the device may include or be embodied by any one or more of components 1064, 1066, 1068, or 1070. Accordingly, in various examples, applicable means for communicating (e.g., receiving, transmitting, etc.) may be embodied by such communications circuitry.
The edge computing node 1050 may include or be coupled to acceleration circuitry 1064, which may be embodied by one or more artificial intelligence (AI) accelerators, a neural compute stick, neuromorphic hardware, an FPGA, an arrangement of GPUs, an arrangement of xPUs/DPUs/IPU/NPUs, one or more SoCs, one or more CPUs, one or more digital signal processors, dedicated ASICs, or other forms of specialized processors or circuitry designed to accomplish one or more specialized tasks. These tasks may include AI processing (including machine learning, training, inferencing, and classification operations), visual data processing, network data processing, object detection, rule analysis, or the like. These tasks also may include the specific edge computing tasks for service management and service operations discussed elsewhere in this document.
The interconnect 1056 may couple the processor 1052 to a sensor hub or external interface 1070 that is used to connect additional devices or subsystems. The devices may include sensors 1072, such as accelerometers, level sensors, flow sensors, optical light sensors, camera sensors, temperature sensors, global navigation system (e.g., GPS) sensors, pressure sensors, barometric pressure sensors, and the like. The hub or interface 1070 further may be used to connect the edge computing node 1050 to actuators 1074, such as power switches, valve actuators, an audible sound generator, a visual warning device, and the like.
In some optional examples, various input/output (I/O) devices may be present within, or connected to, the edge computing node 1050. For example, a display or other output device 1084 may be included to show information, such as sensor readings or actuator position. An input device 1086, such as a touch screen or keypad, may be included to accept input. An output device 1084 may include any number of forms of audio or visual display, including simple visual outputs such as binary status indicators (e.g., light-emitting diodes (LEDs)) and multi-character visual outputs, or more complex outputs such as display screens (e.g., liquid crystal display (LCD) screens), with the output of characters, graphics, multimedia objects, and the like being generated or produced from the operation of the edge computing node 1050. A display or console hardware, in the context of the present system, may be used to provide output and receive input of an edge computing system; to manage components or services of an edge computing system; to identify a state of an edge computing component or service; or to conduct any other number of management or administration functions or service use cases.
A battery 1076 may power the edge computing node 1050, although, in examples in which the edge computing node 1050 is mounted in a fixed location, it may have a power supply coupled to an electrical grid, or the battery may be used as a backup or for temporary capabilities. The battery 1076 may be a lithium ion battery, or a metal-air battery, such as a zinc-air battery, an aluminum-air battery, a lithium-air battery, and the like.
A battery monitor/charger 1078 may be included in the edge computing node 1050 to track the state of charge (SoCh) of the battery 1076, if included. The battery monitor/charger 1078 may be used to monitor other parameters of the battery 1076 to provide failure predictions, such as the state of health (SoH) and the state of function (SoF) of the battery 1076. The battery monitor/charger 1078 may include a battery monitoring integrated circuit, such as an LTC4020 or an LTC2990 from Linear Technology, an ADT7488A from ON Semiconductor of Phoenix, Ariz., or an IC from the UCD90xxx family from Texas Instruments of Dallas, Tex. The battery monitor/charger 1078 may communicate the information on the battery 1076 to the processor 1052 over the interconnect 1056. The battery monitor/charger 1078 may also include an analog-to-digital converter (ADC) that enables the processor 1052 to directly monitor the voltage of the battery 1076 or the current flow from the battery 1076. The battery parameters may be used to determine actions that the edge computing node 1050 may perform, such as transmission frequency, mesh network operation, sensing frequency, and the like.
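As one non-limiting sketch of how the battery parameters might determine such actions, the following maps a state of charge to a transmission interval, a sensing interval, and a mesh relaying decision. The `DutyCycle` type, the `plan_duty_cycle` function, and every threshold and interval value are assumptions chosen for illustration; none of them is specified by the present disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DutyCycle:
    transmit_interval_s: int   # seconds between transmissions
    sense_interval_s: int      # seconds between sensor readings
    mesh_relay_enabled: bool   # whether the node relays traffic for a mesh


def plan_duty_cycle(state_of_charge: float) -> DutyCycle:
    """Map a state of charge in [0.0, 1.0] to a duty cycle.

    All thresholds and intervals below are illustrative assumptions.
    """
    if not 0.0 <= state_of_charge <= 1.0:
        raise ValueError("state_of_charge must be between 0.0 and 1.0")
    if state_of_charge >= 0.5:
        # Ample charge: transmit and sense frequently, relay for the mesh.
        return DutyCycle(transmit_interval_s=10, sense_interval_s=1,
                         mesh_relay_enabled=True)
    if state_of_charge >= 0.2:
        # Moderate charge: slow down but keep participating in the mesh.
        return DutyCycle(transmit_interval_s=60, sense_interval_s=5,
                         mesh_relay_enabled=True)
    # Low charge: conserve power and stop relaying for other nodes.
    return DutyCycle(transmit_interval_s=600, sense_interval_s=30,
                     mesh_relay_enabled=False)


healthy = plan_duty_cycle(0.9)
depleted = plan_duty_cycle(0.1)
```

A comparable policy could also take the state of health (SoH) or the state of function (SoF) as inputs; only the state of charge is used here to keep the sketch short.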
A power block 1080, or other power supply coupled to a grid, may be coupled with the battery monitor/charger 1078 to charge the battery 1076. In some examples, the power block 1080 may be replaced with a wireless power receiver to obtain the power wirelessly, for example, through a loop antenna in the edge computing node 1050. A wireless battery charging circuit, such as an LTC4020 chip from Linear Technology of Milpitas, Calif., among others, may be included in the battery monitor/charger 1078. The specific charging circuits may be selected based on the size of the battery 1076, and thus, the current required. The charging may be performed using the Airfuel standard promulgated by the Airfuel Alliance, the Qi wireless charging standard promulgated by the Wireless Power Consortium, or the Rezence charging standard, promulgated by the Alliance for Wireless Power, among others.
The storage 1058 may include instructions 1082 in the form of software, firmware, or hardware commands to implement the techniques described herein. Although such instructions 1082 are shown as code blocks included in the memory 1054 and the storage 1058, it may be understood that any of the code blocks may be replaced with hardwired circuits, for example, built into an application specific integrated circuit (ASIC).
In an example, the instructions 1082 provided via the memory 1054, the storage 1058, or the processor 1052 may be embodied as a non-transitory, machine-readable medium 1060 including code to direct the processor 1052 to perform electronic operations in the edge computing node 1050. The processor 1052 may access the non-transitory, machine-readable medium 1060 over the interconnect 1056. For instance, the non-transitory, machine-readable medium 1060 may be embodied by devices described for the storage 1058 or may include specific storage units such as optical disks, flash drives, or any number of other hardware devices. The non-transitory, machine-readable medium 1060 may include instructions to direct the processor 1052 to perform a specific sequence or flow of actions, for example, as described with respect to the flowchart(s) and block diagram(s) of operations and functionality depicted above. As used herein, the terms “machine-readable medium” and “computer-readable medium” are interchangeable.
Also in a specific example, the instructions 1082 on the processor 1052 (separately, or in combination with the instructions 1082 of the machine-readable medium 1060) may configure execution or operation of a trusted execution environment (TEE) 1090. In an example, the TEE 1090 operates as a protected area accessible to the processor 1052 for secure execution of instructions and secure access to data. Various implementations of the TEE 1090, and an accompanying secure area in the processor 1052 or the memory 1054, may be provided, for instance, through use of Intel® Software Guard Extensions (SGX) or ARM® TrustZone® hardware security extensions, Intel® Management Engine (ME), or Intel® Converged Security Manageability Engine (CSME). Other aspects of security hardening, hardware roots-of-trust, and trusted or protected operations may be implemented in the device 1050 through the TEE 1090 and the processor 1052.
In further examples, a machine-readable medium also includes any tangible medium that is capable of storing, encoding or carrying instructions for execution by a machine and that cause the machine to perform any one or more of the methodologies of the present disclosure or that is capable of storing, encoding or carrying data structures utilized by or associated with such instructions. A “machine-readable medium” thus may include but is not limited to, solid-state memories, and optical and magnetic media. Specific examples of machine-readable media include non-volatile memory, including but not limited to, by way of example, semiconductor memory devices (e.g., electrically programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM)) and flash memory devices; magnetic disks such as internal hard disks and removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks. The instructions embodied by a machine-readable medium may further be transmitted or received over a communications network using a transmission medium via a network interface device utilizing any one of a number of transfer protocols (e.g., Hypertext Transfer Protocol (HTTP)).
A machine-readable medium may be provided by a storage device or other apparatus which is capable of hosting data in a non-transitory format. In an example, information stored or otherwise provided on a machine-readable medium may be representative of instructions, such as instructions themselves or a format from which the instructions may be derived. This format from which the instructions may be derived may include source code, encoded instructions (e.g., in compressed or encrypted form), packaged instructions (e.g., split into multiple packages), or the like. The information representative of the instructions in the machine-readable medium may be processed by processing circuitry into the instructions to implement any of the operations discussed herein. For example, deriving the instructions from the information (e.g., processing by the processing circuitry) may include: compiling (e.g., from source code, object code, etc.), interpreting, loading, organizing (e.g., dynamically or statically linking), encoding, decoding, encrypting, unencrypting, packaging, unpackaging, or otherwise manipulating the information into the instructions.
In an example, the derivation of the instructions may include assembly, compilation, or interpretation of the information (e.g., by the processing circuitry) to create the instructions from some intermediate or preprocessed format provided by the machine-readable medium. The information, when provided in multiple parts, may be combined, unpacked, and modified to create the instructions. For example, the information may be in multiple compressed source code packages (or object code, or binary executable code, etc.) on one or several remote servers. The source code packages may be encrypted when in transit over a network and decrypted, uncompressed, assembled (e.g., linked) if necessary, and compiled or interpreted (e.g., into a library, stand-alone executable, etc.) at a local machine, and executed by the local machine.
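For illustration only, the derivation of instructions from information provided in multiple compressed packages may be sketched as follows. The `package_source` and `derive_and_run` functions are hypothetical names introduced for this sketch, which uses the Python standard library `zlib` module for compression and the built-in `compile` and `exec` functions for compilation and execution. Encryption in transit is omitted for brevity and would be an additional step before the combining step.

```python
import zlib


def package_source(source: str, parts: int) -> list[bytes]:
    """Compress source code and split the result into at most `parts` chunks."""
    blob = zlib.compress(source.encode("utf-8"))
    size = -(-len(blob) // parts)  # ceiling division
    return [blob[i:i + size] for i in range(0, len(blob), size)]


def derive_and_run(packages: list[bytes]) -> dict:
    """Combine, decompress, compile, and execute the packaged source."""
    blob = b"".join(packages)                        # combine the parts
    source = zlib.decompress(blob).decode("utf-8")   # unpack
    code = compile(source, "<derived>", "exec")      # compile to instructions
    namespace: dict = {}
    exec(code, namespace)                            # execute on the local machine
    return namespace


# Hypothetical source code standing in for the information on a remote server.
packages = package_source("def double(x):\n    return 2 * x\n", parts=3)
namespace = derive_and_run(packages)
result = namespace["double"](21)
```

Because `exec` runs whatever code it is given, a deployment along these lines would typically verify the origin and integrity of the packages before the execution step.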
Illustrative examples of the technologies described throughout this disclosure are provided below. Embodiments of these technologies may include any one or more, and any combination of, the examples described below. In some embodiments, at least one of the systems or components set forth in one or more of the preceding figures may be configured to perform one or more operations, techniques, processes, and/or methods as set forth in the following examples.
Example 1 includes at least one non-transitory machine-readable storage medium having instructions stored thereon, wherein the instructions, when executed on processing circuitry, cause the processing circuitry to: receive, via interface circuitry, sensor data captured by a plurality of sensors; detect, based on performing event detection on the sensor data, an inconsistency among the sensors; detect, based on the sensor data, an external environment of the sensors; adjust, based on the external environment of the sensors, one or more configuration parameters for event detection; and perform event detection on the sensor data based on the one or more adjusted configuration parameters.
Example 2 includes the storage medium of Example 1, wherein the instructions that cause the processing circuitry to detect, based on performing event detection on the sensor data, the inconsistency among the sensors further cause the processing circuitry to: perform event detection on the sensor data captured by the plurality of sensors; detect an event based on the sensor data captured by a first subset of the sensors; and fail to detect the event based on the sensor data captured by a second subset of the sensors.
Example 3 includes the storage medium of any of Examples 1-2, wherein the instructions that cause the processing circuitry to detect, based on the sensor data, the external environment of the sensors further cause the processing circuitry to: detect, based on the sensor data, one or more conditions of the external environment, wherein the one or more conditions include at least one of lighting, weather, visibility, noise, location, or time of day.
Example 4 includes the storage medium of any of Examples 1-3, wherein: the one or more configuration parameters include one or more sensor settings associated with one or more of the sensors; and the instructions that cause the processing circuitry to adjust, based on the external environment of the sensors, the one or more configuration parameters for event detection further cause the processing circuitry to: adjust, based on the external environment of the sensors, the one or more sensor settings.
Example 5 includes the storage medium of Example 4, wherein: the plurality of sensors include a camera; and the one or more sensor settings include: one or more pan, tilt, or zoom settings associated with a field of view of the camera; a resolution of the camera; a frame rate of the camera; or a lighting intensity of the camera.
Example 6 includes the storage medium of Example 4, wherein: the plurality of sensors include a microphone; and the one or more sensor settings include: a sensitivity of the microphone; a beam direction of the microphone; or a sampling rate of the microphone.
Example 7 includes the storage medium of any of Examples 1-6, wherein: the one or more configuration parameters include one or more sensor fusion weights, wherein the one or more sensor fusion weights indicate a level of influence of one or more of the sensors for performing event detection; and the instructions that cause the processing circuitry to adjust, based on the external environment of the sensors, the one or more configuration parameters for event detection further cause the processing circuitry to: adjust, based on the external environment of the sensors, the one or more sensor fusion weights.
Example 8 includes the storage medium of any of Examples 1-7, wherein: the one or more configuration parameters are associated at least in part with one or more event detection models, wherein the one or more event detection models are trained to perform event detection based on the sensor data captured by one or more of the sensors; and the instructions that cause the processing circuitry to adjust, based on the external environment of the sensors, the one or more configuration parameters for event detection further cause the processing circuitry to: reconfigure, based on the external environment of the sensors, the one or more event detection models used to perform event detection.
Example 9 includes the storage medium of Example 8, wherein the instructions that cause the processing circuitry to reconfigure, based on the external environment of the sensors, the one or more event detection models used to perform event detection further cause the processing circuitry to: adjust, based on the external environment of the sensors, one or more performance characteristics of the one or more event detection models; or replace, based on the external environment of the sensors, the one or more event detection models with one or more alternative event detection models, wherein the one or more alternative event detection models have different performance characteristics than the one or more event detection models.
Example 10 includes the storage medium of any of Examples 1-9, wherein the plurality of sensors include at least one of a camera, a microphone, a location sensor, a radio frequency identification (RFID) sensor, a light detection and ranging (LIDAR) sensor, a radio detection and ranging (RADAR) sensor, an ultrasonic sensor, a thermal sensor, an infrared sensor, a temperature sensor, a gas sensor, or a magnetic sensor.
Example 11 includes a system, comprising: interface circuitry; and processing circuitry to: receive, via the interface circuitry, sensor data captured by a plurality of sensors; detect, based on performing event detection on the sensor data, an inconsistency among the sensors; detect, based on the sensor data, an external environment of the sensors; adjust, based on the external environment of the sensors, one or more configuration parameters for event detection; and perform event detection on the sensor data based on the one or more adjusted configuration parameters.
Example 12 includes the system of Example 11, wherein the processing circuitry to detect, based on the sensor data, the external environment of the sensors is further to: detect, based on the sensor data, one or more conditions of the external environment, wherein the one or more conditions include at least one of lighting, weather, visibility, noise, location, or time of day.
Example 13 includes the system of any of Examples 11-12, wherein: the one or more configuration parameters include one or more sensor settings associated with one or more of the sensors; and the processing circuitry to adjust, based on the external environment of the sensors, the one or more configuration parameters for event detection is further to: adjust, based on the external environment of the sensors, the one or more sensor settings.
Example 14 includes the system of Example 13, wherein: the plurality of sensors include a camera; and the one or more sensor settings include: one or more pan, tilt, or zoom settings associated with a field of view of the camera; a resolution of the camera; a frame rate of the camera; or a lighting intensity of the camera.
Example 15 includes the system of Example 13, wherein: the plurality of sensors include a microphone; and the one or more sensor settings include: a sensitivity of the microphone; a beam direction of the microphone; or a sampling rate of the microphone.
Example 16 includes the system of any of Examples 11-15, wherein: the one or more configuration parameters include one or more sensor fusion weights, wherein the one or more sensor fusion weights indicate a level of influence of one or more of the sensors for performing event detection; and the processing circuitry to adjust, based on the external environment of the sensors, the one or more configuration parameters for event detection is further to: adjust, based on the external environment of the sensors, the one or more sensor fusion weights.
Example 17 includes the system of any of Examples 11-16, wherein: the one or more configuration parameters are associated at least in part with one or more event detection models, wherein the one or more event detection models are trained to perform event detection based on the sensor data captured by one or more of the sensors; and the processing circuitry to adjust, based on the external environment of the sensors, the one or more configuration parameters for event detection is further to: reconfigure, based on the external environment of the sensors, the one or more event detection models used to perform event detection.
Example 18 includes the system of Example 17, wherein the processing circuitry to reconfigure, based on the external environment of the sensors, the one or more event detection models used to perform event detection is further to: adjust, based on the external environment of the sensors, one or more performance characteristics of the one or more event detection models; or replace, based on the external environment of the sensors, the one or more event detection models with one or more alternative event detection models, wherein the one or more alternative event detection models have different performance characteristics than the one or more event detection models.
Example 19 includes a method, comprising: receiving, via interface circuitry, sensor data captured by a plurality of sensors; detecting, based on performing event detection on the sensor data, an inconsistency among the sensors; detecting, based on the sensor data, an external environment of the sensors; adjusting, based on the external environment of the sensors, one or more configuration parameters for event detection; and performing event detection on the sensor data based on the one or more adjusted configuration parameters.
Example 20 includes the method of Example 19, wherein adjusting, based on the external environment of the sensors, the one or more configuration parameters for event detection comprises: adjusting, based on the external environment of the sensors: one or more sensor settings associated with one or more of the sensors; one or more sensor fusion weights, wherein the one or more sensor fusion weights indicate a level of influence of one or more of the sensors for performing event detection; or one or more performance characteristics of one or more event detection models, wherein the one or more event detection models are trained to perform event detection based on the sensor data captured by one or more of the sensors.
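As a non-limiting illustration of the flow recited in Examples 1, 2, and 7, the following sketch performs event detection on per-sensor scores, detects an inconsistency when some sensors detect an event and others do not, detects conditions of the external environment, adjusts the sensor fusion weights, and performs event detection again. All names (`FusionConfig`, `detect_inconsistency`, `detect_environment`, `adjust_weights`, `fuse`, `detect_event`), the weighted-sum fusion rule, the lux and decibel cutoffs, and the 0.25 scaling factor are assumptions made for this sketch and are not specified by the present disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FusionConfig:
    weights: dict            # sensor name -> fusion weight (level of influence)
    threshold: float = 0.5   # fused score required to trigger an event


def detect_inconsistency(scores: dict, per_sensor_threshold: float = 0.5) -> bool:
    """Return True when some sensors detect the event and others do not."""
    detected = [score >= per_sensor_threshold for score in scores.values()]
    return any(detected) and not all(detected)


def detect_environment(readings: dict) -> dict:
    """Classify coarse conditions from raw readings (cutoffs are assumptions)."""
    return {
        "low_light": readings.get("lux", 1000.0) < 10.0,
        "noisy": readings.get("noise_db", 0.0) > 80.0,
    }


def adjust_weights(config: FusionConfig, environment: dict) -> FusionConfig:
    """Reduce the influence of sensors the environment makes less reliable."""
    weights = dict(config.weights)
    if environment["low_light"] and "camera" in weights:
        weights["camera"] *= 0.25
    if environment["noisy"] and "microphone" in weights:
        weights["microphone"] *= 0.25
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one fusion weight must be positive")
    normalized = {name: w / total for name, w in weights.items()}
    return FusionConfig(weights=normalized, threshold=config.threshold)


def fuse(scores: dict, config: FusionConfig) -> bool:
    """Trigger an event when the weighted sum of scores meets the threshold."""
    fused = sum(config.weights[name] * score for name, score in scores.items())
    return fused >= config.threshold


def detect_event(scores: dict, readings: dict, config: FusionConfig) -> bool:
    """Detect an event, retrying with adjusted weights on inconsistency."""
    if fuse(scores, config):
        return True
    if detect_inconsistency(scores):
        adjusted = adjust_weights(config, detect_environment(readings))
        return fuse(scores, adjusted)
    return False


# Hypothetical scenario: the microphone detects an event in a dark scene.
scores = {"camera": 0.1, "microphone": 0.8}
readings = {"lux": 2.0, "noise_db": 40.0}
config = FusionConfig(weights={"camera": 0.5, "microphone": 0.5})
before = fuse(scores, config)
after = detect_event(scores, readings, config)
```

In this hypothetical scenario, equal weights leave the fused score below the threshold, so the event would be missed; once the camera weight is reduced for the low-light condition, the same sensor data triggers the event.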
While the concepts of the present disclosure are susceptible to various modifications and alternative forms, specific embodiments thereof have been shown by way of example in the drawings and are described herein in detail. It should be understood, however, that there is no intent to limit the concepts of the present disclosure to the particular forms disclosed, but on the contrary, the intention is to cover all modifications, equivalents, and alternatives consistent with the present disclosure and the appended claims.
References in the specification to “one embodiment,” “an embodiment,” “an illustrative embodiment,” etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may or may not necessarily include that particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to effect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described. Additionally, it should be appreciated that items included in a list in the form of “at least one of A, B, and C” can mean (A); (B); (C); (A and B); (A and C); (B and C); or (A, B, and C). Similarly, items listed in the form of “at least one of A, B, or C” can mean (A); (B); (C); (A and B); (A and C); (B and C); or (A, B, and C).
The disclosed embodiments may be implemented, in some cases, in hardware, firmware, software, or any combination thereof. The disclosed embodiments may also be implemented as instructions carried by or stored on one or more transitory or non-transitory machine-readable (e.g., computer-readable) storage media, which may be read and executed by one or more processors. A machine-readable storage medium may be embodied as any storage device, mechanism, or other physical structure for storing or transmitting information in a form readable by a machine (e.g., a volatile or non-volatile memory, a media disc, or other media device).
In the drawings, some structural or method features may be shown in specific arrangements and/or orderings. However, it should be appreciated that such specific arrangements and/or orderings may not be required. Rather, in some embodiments, such features may be arranged in a different manner and/or order than shown in the illustrative figures. Additionally, the inclusion of a structural or method feature in a particular figure is not meant to imply that such feature is required in all embodiments and, in some embodiments, may not be included or may be combined with other features.