The present application claims priority from Australian Provisional Patent Application No 2020904456 filed on 1 Dec. 2020, the contents of which are incorporated herein by reference in their entirety.
This disclosure relates to monitoring livestock. In particular, but not limited to, this disclosure relates to sensors and communication networks for monitoring livestock.
Trust in food products has become an important concern with the advancement of globalised food supply chains. A number of providers have gained a reputation for high-quality food products, but there are also competitors whose claims on product quality remain dubious. In particular for livestock products, such as meat and especially beef, there is a need to provide a provenance solution that enables verification of beef quality. Since that quality depends on the conditions experienced by each individual animal, it would be ideal to have a record for each individual animal.
The problem with livestock, however, is that animals graze across a large area of land, which may not have network connectivity everywhere. Especially in remote areas, such as on large Australian cattle farms, a wireless network is typically available only at or near farm buildings, while no connectivity is available at the places where the animals graze.
Further, sensors that operate under harsh conditions are prone to damage and other effects that lead to inaccurate sensor readings. It is then difficult to determine whether a particular sensor provides accurate or inaccurate data.
Therefore, there is a need for a food provenance solution that considers measurements from individual animals but can operate without permanent network connectivity between each animal and an aggregating server.
Throughout this specification the word “comprise”, or variations such as “comprises” or “comprising”, will be understood to imply the inclusion of a stated element, integer or step, or group of elements, integers or steps, but not the exclusion of any other element, integer or step, or group of elements, integers or steps.
Any discussion of documents, acts, materials, devices, articles or the like which has been included in the present specification is not to be taken as an admission that any or all of these matters form part of the prior art base or were common general knowledge in the field relevant to the present disclosure as it existed before the priority date of each of the appended claims.
A system for livestock monitoring comprises:
In some embodiments, the one or more gateway devices are configured to store the behaviour classification data and the sensor data on a first blockchain.
In some embodiments, the first blockchain is a private blockchain.
In some embodiments, calculating the score indicative of the reliability of the animal behaviours in light of the sensor data is performed by a smart contract on the first blockchain.
In some embodiments, the first blockchain is configured to store a hash value of each of multiple blocks; and the system further comprises a second blockchain that is configured to store the hash values of the first blockchain to establish a cryptographic link to the first blockchain.
In some embodiments, classifying by the processor integrated with each of the animal monitoring devices, the accelerometer data into one of multiple animal behaviours, to create behaviour classification data, is based on a linear classifier.
In some embodiments, the linear classifier comprises a soft-max classifier.
In some embodiments, classifying comprises filtering the accelerometer data using a low-pass filter.
In some embodiments, classifying is based on frequency features of the accelerometer data.
In some embodiments, the one or more gateway devices are further configured to obtain weather data; and the weather data is used as variable values in determining compliance by the rules engine.
In some embodiments, each of the multiple animal monitoring devices is further configured to determine compliance, based on data collected by that animal monitoring device, with animal rules.
In some embodiments, each of the multiple animal monitoring devices is further configured to collect sensor data from sensors integrated with that animal monitoring device.
In some embodiments, the sensor data comprises geographic location data from a satellite or terrestrial navigation system.
In some embodiments, the one or more gateway devices are further configured to calculate a score indicative of a reliability of the animal behaviour in light of the sensor data and the rules engine is configured to determine compliance such that the behaviours are related to the score.
In some embodiments, the system further comprises an aggregator configured to receive behaviour classification data for individual animals and output composite data.
In some embodiments, the composite data comprises one or more of:
In some embodiments, the rules engine is further configured to determine animal data other than behavioural classification data for individual animals, and the aggregator is further configured to receive the animal data other than behavioural classification data to determine the output composite data.
In some embodiments, the rules engine is configured to
A method for livestock monitoring comprises:
A method for livestock monitoring comprises:
As set out above, there is a need for verifiable provenance and monitoring of individual animals, such as cattle. However, devices that the animals can wear have only limited functionality in terms of wireless networking and storage space due to their small size, limited battery capacity and required robustness.
While it is possible to store the acceleration data on the monitoring device 102, the resulting amount of data may become larger than the local memory capacity. Further, transmitting this data over a low-bandwidth network connection may be impractical due to long transmission time and packet errors. Therefore, the processor on the monitoring device classifies the movement data into cattle behaviours, such as resting, walking, grazing, drinking, ruminating and being transported.
Cattle farm 100 further comprises fixed infrastructure, such as water trough 103. At that fixed infrastructure, it is feasible to install network connectivity, such as wireless access point 104, providing wireless data access within a communication range 105. As can be seen, two animals 106 are located within the communication range 105, while a further three animals 107 are outside communication range 105.
While the animals 106 are within communication range 105, their monitoring devices can transmit monitoring data to wireless access point 104. In particular, each monitoring device transmits behaviour classifications for multiple time stamps, that is, a time recording of behaviours, to the access point 104. Transmitting the behaviour data requires significantly less bandwidth than transmitting the full recorded acceleration data. Since wireless access point 104 operates as a gateway to a data network, it is also referred to as a gateway node 104.
Gateway node 104 may further collect additional data in relation to animals 106. Gateway node 104 may collect this data by way of analysing network metrics. For example, gateway node 104 determines that the monitoring devices of animals 106 are within range 105, which means the animals 106 are located nearby. The gateway node may comprise further sensors, such as a camera, infrared sensors, and the like to capture further data. In particular, gateway node 104 may collect data that is indicative of a presence of one of the animals 106. Therefore, the additional data can serve to corroborate the data from the monitoring device. More particularly, the monitoring device 102 may also record GPS data and process it by determining whether the animal is in proximity to the water trough 103, for example. Gateway node 104 can then compare the proximity data from the monitoring device 102 to the additional data captured by the gateway node 104.
Gateway node 104 can then calculate a score that indicates whether both match. This score may also be referred to as a "trust score". If the monitoring device 102 indicates that the animal 106 was in close proximity to the water trough, and the gateway node 104 registered that the monitoring device was within communication range 105, gateway node 104 assigns a high score, such as 1. If the two data points do not match, gateway node 104 assigns a low score, such as 0, to that data point.
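The following is a minimal sketch of this matching logic; the function name and inputs are illustrative rather than taken from the disclosure:

```python
def trust_score(device_reports_proximity: bool, gateway_observed_device: bool) -> float:
    """Assign 1.0 when the monitoring device's proximity claim matches the
    gateway's own observation of the device being in range, 0.0 otherwise."""
    return 1.0 if device_reports_proximity == gateway_observed_device else 0.0
```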
In one example, gateway node 104 stores the data from monitoring device 102 and the additional data on a private blockchain 113. In that case, the calculation of the trust score can be implemented using smart contracts on the private blockchain 113. As a result, the data itself is immutable and the calculation of the trust score can be verified at any time. It should also be noted that in most examples there are multiple gateway devices 104, which each have access to the private blockchain 113. As a result, the multiple gateway devices 104 store their data concurrently on blockchain 113 and each gateway device 104 has access to the data from the other gateway devices. This way, a distributed data set is created, which allows compliance checking and other calculations at any point within the network, such as anywhere on and off the farm 100. It is also possible that the actual raw data is stored in a distributed database, which is a computer network where information is stored on more than one node, potentially in a replicated fashion. Examples are Bigtable, Couchbase, Dynamo and Hypertable. The private blockchain may then only store a reference to the corresponding dataset stored on the distributed database and a hash of that data set. This has the advantage that less raw data is stored in the blockchain, which improves performance of the blockchain, while at the same time, enabling auditing at a later stage since the raw data can always be obtained. Where the data is replicated across multiple nodes, the data is also protected against loss or corruption of individual nodes, which makes the system more robust.
Cattle farm 100 further comprises a farm building 110 housing a server 111. The server 111 receives data from a second wireless access point 112, which in turn receives data from gateway node 104. This data, received by server 111, comprises behaviour data generated by the monitoring devices 102 as well as additional data generated by gateway node 104. Since the additional data generated by the gateway node is indicative of the presence of an animal, it corroborates the behaviour data. Server 111 can store this data locally or rely on private blockchain 113 for a persistent ledger of records.
In some examples, behaviour data and sensor data are secret and should be treated as confidential. As a result, cryptography can be used to prevent outside attacks from revealing the secret data. To that end, there is also a public blockchain 114, which stores hash values or other audit information. This way, a specific data value can be audited on the private blockchain 113, but the data value itself does not need to be revealed. Instead, only the hash value is revealed. While it is practically impossible to calculate the original data from the hash value, it is possible to compare the hash value against the data stored on public blockchain 114. If both values match, the data is correct. If they do not, there is a difference between the purported data and the actual data, noting that the data stored on the private blockchain is immutable. In one example, the public blockchain 114 stores the hash value of the behaviour data and/or sensor data while in other examples, the public blockchain 114 stores the hashes of the blocks of the private blockchain 113. In the latter example, with blockchain hashes from the private chain stored on public chain 114, there is a direct cryptographic link between the chains by way of their shared hashes.
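As a sketch of this audit step, assuming SHA-256 as the hash function and a JSON serialisation of the records (neither is mandated by the disclosure):

```python
import hashlib
import json

def record_hash(record: dict) -> str:
    """Deterministically serialise a behaviour/sensor record and hash it."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def audit(record: dict, published_hash: str) -> bool:
    """True if the purported record matches the hash stored on the public
    blockchain 114; the record itself is never revealed."""
    return record_hash(record) == published_hash
```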
Cattle farm 100 further comprises a logic engine 115, which obtains the data stored on private blockchain 113 or directly from server 111. In one example, the logic engine 115 comprises the SPINdle software as available from https://github.com/NICTA/SPINdle. The logic engine 115 converts the data into variable values, which may be Boolean values. The logic engine 115 further stores a logic representation of policies, regulations or guidelines, where the variables of the logic representations include the same variables for which the values have been determined using the data from server 111. Consequently, the logic engine 115 can evaluate the rules for those variable values and provide an output indicative of whether the policies, regulations or guidelines are met or violated.
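SPINdle implements defeasible logic; as a simplified propositional stand-in, the following sketch illustrates converting data into Boolean variable values and evaluating rules over them. The variable names, thresholds, and rules are purely illustrative assumptions:

```python
def to_variables(record: dict) -> dict:
    """Convert raw animal data into Boolean variable values."""
    return {
        "drank_today": record["drinking_seconds"] > 0,
        "grazed_enough": record["grazing_seconds"] >= 4 * 3600,
        "heat_stress": record["temperature_c"] > 35,
    }

RULES = {
    "water_access": lambda v: v["drank_today"],
    "adequate_feed": lambda v: v["grazed_enough"],
    "heat_management": lambda v: not v["heat_stress"] or v["drank_today"],
}

def check_compliance(record: dict) -> dict:
    """Evaluate each rule over the variable values; True means compliant."""
    v = to_variables(record)
    return {name: rule(v) for name, rule in RULES.items()}
```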
The result is a single value for each animal. Advantageously, the result can be audited at any time against the data stored on the blockchain 113 such that it is practically impossible to tamper with the output value of the rules engine 115. Since the data from monitoring devices 102 and gateway node 104 are available instantly, logic engine 115 can provide a real-time indication of compliance. In that case, the logic engine does not take into account the trust values since the calculation by way of smart contracts may add a delay, which would make real-time indication impossible.
Here, the logic engine 115 is located remote from the animal. This means that the logic engine is a separate device from the animal in the sense that the animal can move away from the logic engine 115 and can potentially move to a place that is outside range 105, so that the logic engine relies on historical, stored data from monitoring device 102. Logic engine 115 need not be located far away from the animal but could be at a distance of 1 m, for example. While the logic engine 115 for the entire farm 100 is located remotely from the animals, there may be further instances of logic engines integrated into monitoring devices 102, which perform in-situ real-time compliance checking.
While logic engine 115 is shown in
The compliance checking may be performed in real-time, such that compliance is determined as the behaviour data becomes available with minimum delay. The compliance output, such as a value of a Boolean variable for each of multiple points in time, is then provided separately from blockchains 113 and 114. Additionally, the trust score may be calculated in private blockchain 113 by execution of smart contracts, which may take longer than the compliance checking. For example, the trust calculation may take minutes while compliance checking takes seconds. However, real-time compliance is useful to give farmers and other operators the chance of early intervention. On the other hand, trust scores may only be required for retrospective auditing, so the added benefit of blockchain-based security may outweigh the disadvantage of a delay.
The computer system further obtains 202 sensor data in relation to the animals that is measured independently of the animal behaviours. As mentioned above, the sensor data may comprise camera data. The computer system may further receive or obtain weather data or other farm-related data and then calculate 203 a score indicative of a reliability of the animal behaviours in light of the sensor data. The details of this trust calculation are provided below. Finally, the computer system determines 204 compliance of the animal behaviours with livestock rules, wherein the behaviours are related to the trust score and are used as variable values in determining the compliance.
It is noted that the weather data and further farm-related data may also be used as variable values in the compliance calculation. For example, the weather data may comprise a temperature value and the rules may require certain conditions to be met above predefined temperatures. The weather data may be obtained by gateway device 104 via a satellite connection, and may be received as part of a broadcast communication from the satellite.
Further, monitoring devices 102 may comprise localisation sensors, such as GPS or terrestrial localisation sensors. Monitoring devices 102 may record the location over time and also transmit the recorded locations to the gateway device. In other examples, monitoring devices 102 only transmit the current location to the gateway device while the monitoring device 102 is within range 105.
In yet a further example, the monitoring device 102 has data connectivity through a satellite link. This may be a low bandwidth data link or an emergency link. There may also be a lightweight compliance engine embedded into the monitoring device, such that the location data can be processed in real time in-situ to determine compliance locally. This would enable making use of the location data without the need to store or transmit the location data over the low-bandwidth satellite data link. In the event of a non-compliance result, the monitoring device 102 can use the satellite data link as an emergency channel to send the current location with an emergency flag. This way, a geo-fencing functionality can be realised, where stolen or lost cattle can be identified and saved before it is too late.
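A sketch of such a lightweight geo-fence check, assuming a circular fence and a hypothetical send_emergency callback that writes to the satellite emergency channel:

```python
from math import asin, cos, radians, sin, sqrt

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    R = 6371000.0
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * R * asin(sqrt(a))

def check_geofence(lat, lon, fence_lat, fence_lon, radius_m, send_emergency):
    """Runs locally on the tag; only a non-compliant fix uses the satellite link."""
    if haversine_m(lat, lon, fence_lat, fence_lon) > radius_m:
        send_emergency({"lat": lat, "lon": lon, "flag": "OUT_OF_BOUNDS"})
        return False
    return True
```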
Method 200 may be performed by the gateway device, which may be referred to as edge processing. In other examples, method 200 is performed remotely on a server, such as a cloud application. The connection to blockchain 113 may also be realised by gateway device 104 or by a server, such that either the gateway device 104 stores the behaviour and sensor data on blockchain 113 or sends that data to server 111, which then stores the received data on blockchain 113.
While examples herein relate to cattle, this disclosure equally applies to other domesticated animals raised in an agricultural setting, such as sheep, chickens, pigs, etc., to produce labour and commodities such as meat, eggs, milk, fur, leather, and wool.
The description below provides further details on the calculations performed. It is noted that when reference is made to ‘we’ performing a particular step, this means that this step can be performed by a computer system, either on the monitoring device 102, the gateway device 104, the server 111 or elsewhere.
Monitoring and analyzing behaviors of individual livestock over large spatio-temporal scales can provide valuable information about changes in their baseline behavior. This information can in turn be used to measure animal health and performance allowing real-time management of resources in an optimal manner. The cost of gathering such information by relying solely on human labor can be prohibitive rendering it impractical or even impossible when large numbers of animals, spread over large areas, are to be monitored continuously. Wearable and networked sensor technologies offer a solution by enabling the automated collection and processing of the relevant data. Often, the sheer size and high frequency of data makes it inefficient or even infeasible to stream the data to some central storage/processing hub for analysis. Therefore, for sensor technologies to be successful at scale, the capability to extract knowledge from the data in real-time needs to be integrated into the sensor nodes, which also capture the data. This allows the high volume of raw data to be compressed into summarized and interpreted results that are more suitable for transmission. Such embedded intelligence is achievable if the associated processing can be realized on the sensor embedded systems under the constraints imposed by their restricted available computational, memory, and energy resources.
Micro-electro-mechanical accelerometer sensors can capture information related to pose and movement. They are relatively inexpensive, consume little power, and take up little space. Accelerometry data is useful for building supervised machine-learning models to classify various behavioral activities of wildlife and livestock, particularly cattle.
This disclosure provides classification models for cattle behavior that are suitable for implementation on embedded systems. To this end, a processor analyzes the tri-axial accelerometry data collected by sensor nodes fitted on collars and located on top of the neck of ten cattle. Based on visual observations of the cattle behavior, the data is labeled with six mutually-exclusive behavior classes, namely, grazing, walking, ruminating, resting, drinking, and other (a collective of all other behaviors). With the insights gained from the analysis, the processor extracts informative features from the appropriately-partitioned time segments of the accelerometer readings while keeping in mind the constraints of embedded systems. The raw labeled data is highly imbalanced. By sliding the partitioning time window with different stride lengths for different classes, we produce a balanced dataset that has roughly the same number of datapoints for each class. The resulting balanced dataset facilitates classification model learning and performance evaluation. Moreover, the less frequent but important behaviors, i.e., walking and drinking, constitute similar proportions of the dataset as the other more frequent behaviors. Thus, the datapoints of all classes participate equally in model learning and no class dominates or outweighs any other. Without balancing, the less prevalent classes may be regarded as noise or outliers.
The processor extracts features from the windowed segment of the accelerometer readings that are pertinent to the pose (pitch) of the animal's head and the intensity of its body movements. The pose-related features are means of the accelerometer readings in three orthogonal spatial axes. To remove the effect of gravity, the processor applies two first-order high-pass Butterworth filters with different cut-off frequencies to the accelerometer reading. Subsequently, the processor calculates the intensity-related features as the mean of the absolute values of the filter outputs for all three spatial axes. The extracted intensity-related features are novel, and somewhat non-conventional, yet meaningful and interpretable. They lead to good classification performance and are computationally low-cost. The use of a second high-pass filter with a different cut-off frequency enables the extraction of further discriminative information, particularly from the spectral domain, with little additional resource consumption.
The extracted features, coupled with the related behavior annotations, form a labeled dataset. In association with this dataset, we evaluate the performance of several classification algorithms whose learned models can be stored and used for prediction on embedded systems. The results are encouraging as they indicate that good in-situ cattle behavior classification, i.e., with accuracy close to 90%, is possible using linear classifiers such as logistic regression and support-vector machine.
During a trial from 31 July to 4 Sep. 2018 at the Commonwealth Scientific and Industrial Research Organisation (CSIRO) FD McMaster Laboratory Pasture Intake Facility, Armidale, NSW, Australia, we fitted ten cattle with collar tags specifically designed to collect, store, and process various types of data including inertial measurement, temperature, pressure, and geo-location (through the global navigation satellite system). The research undertaken in this work was approved by the CSIRO FD McMaster Laboratory Chiswick Animal Ethics Committee with the animal research authority number 17/20. The tag houses a sensor node (mote), a battery pack, and photovoltaic modules for harvesting solar energy. We mount the tag on top of the animal's neck and secure it with a collar belt and a counterweight placed at the bottom of the neck.
The sensor node, named Loci, has a Texas Instruments CC2650F128 system-on-chip that consists of an Arm Cortex-M3 CPU running at 48 MHz with 28 KB of random access memory (RAM), 128 KB of read-only memory (ROM), and an IEEE 802.15.4 radio module. Loci also contains an MPU9250 9-axis micro-electro-mechanical (MEMS) inertial measurement unit (IMU) including a tri-axial accelerometer sensor that measures acceleration in three orthogonal spatial directions (axes). The x axis corresponds to the antero-posterior (forward/backward) direction, the y axis to the medio-lateral (horizontal/sideways) direction, and the z axis to the dorso-ventral (upward/downward) direction. The IMU chip outputs the accelerometer readings as 12-bit signed integers at a rate set to 50 samples per second. The operating system running on Loci is Contiki 3.
Throughout the experiment, the tags recorded the tri-axial accelerometer readings on external flash memory cards. We monitored the cattle wearing the tags and manually recorded their behaviors over a total period of approximately 19.2 hours. Using these observations, we annotated the corresponding accelerometry data after retrieving the tags and uploading the data at the end of the trial. The behaviors of interest were grazing, walking, ruminating (standing or lying), resting (standing or lying), drinking, and other. Table 1 shows the total annotated periods in seconds for each animal and each behavior.
Monitoring devices 102 generate a labeled dataset for cattle behavior classification using the annotated accelerometry data. To this end, monitoring devices 102 divide the annotated accelerometer readings into overlapping sets consisting of the values for 10 consecutive seconds. Monitoring devices 102 then extract features from the values within each set to create the datapoints of the dataset. Monitoring devices 102 realize this by sliding a 10-second-wide time window over the annotated accelerometer readings. To balance the dataset, monitoring devices 102 use a different stride length (overlap ratio) for each behavior class. Monitoring devices 102 set the stride length for each class such that the number of 10-second windows are roughly the same for all classes. Table 2 shows the chosen stride length, i.e., the distance between the first samples of every two consecutive 10-second windows, and the resultant number and percentage of the overlapping 10-second windows for each class.
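The following sketch illustrates this per-class-stride windowing; the stride values below are placeholders rather than the values of Table 2, and the function name is illustrative:

```python
import numpy as np

WINDOW = 500  # 10 s x 50 samples/s
STRIDE = {"grazing": 500, "walking": 50, "ruminating": 350,
          "resting": 400, "drinking": 25, "other": 300}  # placeholder strides

def windows(readings: np.ndarray, label: str):
    """Yield overlapping (WINDOW x 3) tri-axial segments with their label,
    sliding by the class-specific stride to balance the dataset."""
    step = STRIDE[label]
    for start in range(0, len(readings) - WINDOW + 1, step):
        yield readings[start:start + WINDOW], label
```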
Note that using a different stride length for each class only serves to generate a balanced dataset and not to produce any new information. It can be viewed as being equivalent to making the maximum possible number of datapoints by sliding the partitioning window only one sample (accelerometer reading) at a time for all classes, and then subsampling the datapoints in accordance with the stride lengths in Table 2. Although balancing the dataset does not add any new information, it has two main advantages. First, most supervised machine-learning algorithms learn more accurate and reliable classification models when the training dataset is balanced. Severe imbalance in a dataset can cause the minority classes to be dominated by the majority classes in model learning, sometimes leading to minority classes being treated as noise. Second, classification performance evaluation is generally more straightforward and comprehensible with balanced datasets. In particular, most classification accuracy measures are only meaningful when the underlying dataset is balanced.
To gain some insight into the annotated accelerometry data and identify potential meaningful features to extract from the 10-second windows, we plot the normalized histogram for each class and each spatial axis averaged over all respective 10-second windows in
A key observation from
Our first three features are the means of the values of the three axes for each 10-second window, calculated as

f_{x,i} = (1/N_i) Σ_{n ∈ W_i} a_{x,n}  (1)

where f_{x,i} is the mean feature for the ith 10-second window and the x axis (similarly for the y and z axes), the second index n of the accelerometer readings is for sample number (time), W_i is the set of indexes of the accelerometer readings within the ith window, and N_i is the cardinality of W_i.
The histograms in
Removing the contribution of earth's gravity to the accelerometer readings is not straightforward. This is mainly because the exact orientation of the sensors at any given time cannot be resolved as they are not firmly attached to the animals and can drift in all directions when the animal moves its body/head. This slack is inevitable and in fact necessary to make the collar tag practically wearable. The counterweight at the bottom of the collar helps keep the tag in place and re-position it when it shifts. The looseness of the tags implies that the projection of earth's gravity over the three spatial axes of the accelerometers is not purely static hence does not manifest only at the zero frequency but it may affect higher frequencies as well. This is further exacerbated by the non-ideal performance of the low-cost MEMS accelerometers used. However, one can assume that gravity mostly influences lower frequencies and has negligible impact at higher frequencies.
To remove the effect of gravity with minimal computational or memory overhead, monitoring devices 102 apply a first-order high-pass Butterworth filter to the accelerometer readings of each 10-second window. As this filter has an infinite impulse response with a single tunable parameter, its implementation requires at most a single multiplication and two additions per sample. The transfer function of this filter is expressed in the s domain as

H(s) = s / (s + ω_c)  (2)

where ω_c = 2πf_c is the angular cut-off frequency, and in the z domain as

H(z) = ((1 + γ)/2) · (1 − z^{−1}) / (1 − γz^{−1})  (3)

where γ is the parameter of the filter that can be related to the cut-off frequency of the filter, denoted by f_c, and the sampling period, denoted by T_s, as

γ = (1 − tan(πf_cT_s)) / (1 + tan(πf_cT_s)).  (4)

Thus, ignoring the constant gain factor (1 + γ)/2, the application of the filter to the accelerometer readings in all axes can be written as
b_{x,n} = γb_{x,n−1} + a_{x,n} − a_{x,n−1}
b_{y,n} = γb_{y,n−1} + a_{y,n} − a_{y,n−1}
b_{z,n} = γb_{z,n−1} + a_{z,n} − a_{z,n−1}  (5)
where b_{x,n}, b_{y,n}, and b_{z,n} are the filter outputs at sample number (time) n.
The frequency response of the filter as in (3) depends on the value of γ and the corresponding cut-off frequency. Note that the sampling frequency is 50 Hz. We explain how we choose the value of γ later.
Therefore, to inspect the possibility of extracting useful features through second-order statistics, after the high-pass filtering and elimination of the effect of gravity, monitoring devices 102 calculate three more features from the filter outputs by averaging their absolute values. Since the mean of the filter output values for each axis is very close to zero, the mean absolute value is a good surrogate for the standard deviation. It is an equally effective measure of spread in the probability distribution while being more economical to calculate than the standard deviation. Monitoring devices 102 compute the mean-absolute features as

g_{x,i} = (1/N_i) Σ_{n ∈ W_i} |b_{x,n}|
g_{y,i} = (1/N_i) Σ_{n ∈ W_i} |b_{y,n}|
g_{z,i} = (1/N_i) Σ_{n ∈ W_i} |b_{z,n}|.  (6)
The features gx,i, gy,i, and gz,i can be seen as representatives of the intensity of the animal's body movements while the contribution of the lower parts of the frequency spectrum has been suppressed to lessen the effect of gravity. The extent of this suppression depends on the shape of the frequency response of the filter and consequently the cut-off frequency that is determined by the parameter γ. However, since gx,i, gy,i, and gz,i are single values aggregated over the whole spectrum, they may not possess sufficient discriminative power.
To provide more insight, we plot the amplitude spectral density (ASD) functions of the accelerometer readings averaged over all 10-second windows for all classes and axes in
To capture further discriminative information available within the spectral domain without incurring any substantial additional computational or memory complexity, we propose to utilize two first-order high-pass Butterworth filters with different parameters, γ1 and γ2, and obtain two sets of high-pass-filtered values as
b_{x,n} = γ_1 b_{x,n−1} + a_{x,n} − a_{x,n−1}
b_{y,n} = γ_1 b_{y,n−1} + a_{y,n} − a_{y,n−1}
b_{z,n} = γ_1 b_{z,n−1} + a_{z,n} − a_{z,n−1}
c_{x,n} = γ_2 c_{x,n−1} + a_{x,n} − a_{x,n−1}
c_{y,n} = γ_2 c_{y,n−1} + a_{y,n} − a_{y,n−1}
c_{z,n} = γ_2 c_{z,n−1} + a_{z,n} − a_{z,n−1}.  (7)
Then, monitoring devices 102 compute six activity-intensity-related features, g_{x,i}, g_{y,i}, g_{z,i}, h_{x,i}, h_{y,i}, and h_{z,i}, for each 10-second window through (6) and, analogously,

h_{x,i} = (1/N_i) Σ_{n ∈ W_i} |c_{x,n}|
h_{y,i} = (1/N_i) Σ_{n ∈ W_i} |c_{y,n}|
h_{z,i} = (1/N_i) Σ_{n ∈ W_i} |c_{z,n}|.  (8)
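A minimal numpy sketch of the feature pipeline defined by (1) and (5)-(8); the array shapes and function names are assumptions:

```python
import numpy as np

def highpass(a: np.ndarray, gamma: float) -> np.ndarray:
    """First-order high-pass filter b_n = gamma*b_{n-1} + a_n - a_{n-1},
    applied along the time axis with the constant gain factor ignored."""
    b = np.zeros_like(a, dtype=float)
    for n in range(1, len(a)):
        b[n] = gamma * b[n - 1] + a[n] - a[n - 1]
    return b

def extract_features(window: np.ndarray, gamma1: float, gamma2: float) -> np.ndarray:
    """window: (N, 3) tri-axial segment -> per-axis means as in (1), plus the
    per-axis mean absolute outputs of the two filters as in (6) and (8)."""
    f = window.mean(axis=0)                            # f_{x,i}, f_{y,i}, f_{z,i}
    g = np.abs(highpass(window, gamma1)).mean(axis=0)  # g_{x,i}, g_{y,i}, g_{z,i}
    h = np.abs(highpass(window, gamma2)).mean(axis=0)  # h_{x,i}, h_{y,i}, h_{z,i}
    return np.concatenate([f, g, h])
```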
Meanwhile, we did not find any benefit in using three or more first-order high-pass Butterworth filters, nor in using any higher-order high-pass filter. In addition, the higher-order moments (third and above) of the filtered accelerometer readings do not seem to contain any further discriminative information. Specifically, since the probability distributions of the filtered values are rather symmetric with thin tails, the odd-order moments are close to zero and the even-order moments, other than the second-order one, appear to be insignificant.
It is also worth noting that the window size of 10 seconds leads to a resolution of 0.1 Hz in the frequency spectrum, which seems to be sufficient to highlight subtle differences between the classes, especially at low frequencies. In addition, as the accelerometer readings are considerably noisy, calculating features over 10-second windows, i.e., around 500 values, helps reduce the adverse effects of noise.
To decide the values of the filter parameters, γ1 and γ2, monitoring devices 102 evaluate the behavior classification performance for different combinations of γ1 and γ2 over a two-dimensional grid of values ranging from −0.5 to 0.9 with a step of 0.1. Monitoring devices 102 use the softmax (multinomial logistic regression) algorithm without any regularization to learn a behavior classification model for each combination. Monitoring devices 102 then evaluate the overall accuracy of the learned model for each point on the grid using a 5-fold stratified cross validation without any shuffling or randomization of the datapoints. The overall accuracy is the ratio of the number of correctly classified datapoints for all classes to the number of all datapoints. Here, it is a meaningful measure of general classification accuracy since our dataset is balanced.
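A sketch of this parameter search under stated assumptions: `segments` (the list of 10-second windows) and `y` (their labels) exist, and extract_features() is as in the earlier sketch. Note that penalty=None requires scikit-learn 1.2 or later (older versions use penalty="none"):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

grid = np.round(np.arange(-0.5, 0.9 + 1e-9, 0.1), 1)   # gamma values -0.5..0.9
cv = StratifiedKFold(n_splits=5, shuffle=False)          # unshuffled 5-fold CV
best_params, best_acc = None, -np.inf
for g1 in grid:
    for g2 in grid:
        X = np.array([extract_features(w, g1, g2) for w in segments])
        clf = LogisticRegression(penalty=None, max_iter=1000)  # softmax, unregularised
        acc = cross_val_score(clf, X, y, cv=cv, scoring="accuracy").mean()
        if acc > best_acc:
            best_params, best_acc = (g1, g2), acc
```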
The points on the diagonal of the grid where γ_1 = γ_2 correspond to using only one filter and consequently only six features. It is clear from
Our choice of the softmax algorithm with no regularization for determining the filter parameters is mainly due to its simplicity and lack of any tunable hyperparameter.
To make an initial comparative assessment of the usefulness of each extracted feature, we compute the analysis of variance (ANOVA) F-value and the estimated mutual information between the features and the corresponding class labels. We also examine the importance score of the features calculated by a random forest (RF) of 1000 classification trees, each with a maximum of 10 leaf nodes. We plot the results in
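These three measures can be computed, for example, with scikit-learn; X and y are assumed to be the feature matrix and labels built in the earlier steps:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import f_classif, mutual_info_classif

f_values, _ = f_classif(X, y)           # ANOVA F-value per feature
mi = mutual_info_classif(X, y)          # estimated mutual information per feature
rf = RandomForestClassifier(n_estimators=1000, max_leaf_nodes=10).fit(X, y)
rf_importance = rf.feature_importances_  # RF importance scores
```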
Dealing with the Outliers
While resting, be it standing or lying, cattle do not generally stay completely stationary but may occasionally move their head, flick ears, twitch various muscles in the body, etc. In our annotations, such occasions lasting less than a few seconds have still been labeled as resting. To minimize inaccuracies caused by such miscellaneous behaviors during the periods labeled as resting, monitoring devices 102 identify the outliers among the 10-second windows labeled as resting and relabel them as the other behavior.
Monitoring devices 102 employ the random cut forest (RCF) algorithm to detect the outliers. Monitoring devices 102 set the contamination ratio, i.e., the ratio of expected outliers in the resting 10-second windows, to 0.1 (10%) and the number of trees in the RCF to 1000. Monitoring devices 102 also use only the activity-intensity-related features gx, gy, gz, hx, hy, and hz.
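The random cut forest algorithm is not available in scikit-learn; as a readily available stand-in, the following sketch uses IsolationForest with the same contamination ratio and tree count. X_rest is assumed to hold the six activity-intensity features of the windows labelled as resting:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

iso = IsolationForest(n_estimators=1000, contamination=0.1, random_state=0)
outlier = iso.fit_predict(X_rest) == -1          # -1 marks detected outliers
y_rest = np.where(outlier, "other", "resting")   # relabel outliers as "other"
```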
Note that the numbers and percentages given in Table 2 were calculated after eliminating the resting outliers by relabeling them as other.
We visualize our dataset consisting of the features extracted from the 10-second windows and their pertaining class labels by projecting the eight-dimensional feature space of the dataset onto two dimensions using the t-distributed stochastic neighbor embedding (tSNE) algorithm.
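A minimal sketch of this projection with scikit-learn's TSNE; X and y as before:

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.manifold import TSNE

emb = TSNE(n_components=2, random_state=0).fit_transform(X)  # 2-D embedding
for label in np.unique(y):
    m = y == label
    plt.scatter(emb[m, 0], emb[m, 1], s=2, label=label)
plt.legend()
plt.show()
```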
An important observation in
Extracting meaningful features as described in the previous section is the first step in developing a statistical model for predicting the behavior performed by an animal based on accelerometer readings. The second step is learning a function that maps the extracted features to the corresponding behavior class labels. In this section, we examine the performance of various supervised machine-learning algorithms in learning such a mapping function and building a behavior classification model. Nevertheless, our goal is to perform behavior classification on embedded systems with constrained energy, memory, and computational resources. Therefore, we only consider the algorithms whose inference procedure can be implemented on typical embedded systems, particularly on our Loci sensor node.
We consider the following algorithms: logistic regression (LR), softmax, support vector machine (SVM), decision tree (DT), linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and Gaussian naive Bayes (GNB). For the LR, SVM, and softmax algorithms, we use the l2-norm regularization and tune the regularization parameter of each algorithm to attain the best possible overall accuracy. For SVM, we use the squared hinge loss. For DT, we set the maximum number of leaf nodes to 23. We use the scikit-learn Python package to train all classification models and evaluate their predictive performance.
For the LR and SVM algorithms, we use both one-versus-one (OvO) and one-versus-rest (OvR) reduction schemes. With the OvO scheme, a set of binary classifiers is used where each one is trained on the datapoints of a pair of classes and learns to distinguish those two classes. For inference, all 6(6−1)/2=15 binary classifiers are applied to a datapoint whose class is to be predicted (test datapoint) and the class with the highest cumulative score (or number of wins) is taken as the prediction of the combined classifier. In the OvR scheme, a set of binary classifiers each trained to distinguish the datapoints of one class from the rest of the dataset is used. At inference, all six binary classifiers are used to classify a test datapoint and the class whose respective classifier yields the highest score is taken as the prediction. The considered algorithms other than LR and SVM can inherently handle multiple classes.
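These reduction schemes are available directly in scikit-learn; the hyperparameter values below are illustrative, not the tuned values referred to in the text:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import LinearSVC

lr_ovo = OneVsOneClassifier(LogisticRegression(max_iter=1000))    # 15 binary LRs
lr_ovr = OneVsRestClassifier(LogisticRegression(max_iter=1000))   # 6 binary LRs
svm_ovo = OneVsOneClassifier(LinearSVC(loss="squared_hinge"))     # squared hinge loss
svm_ovr = OneVsRestClassifier(LinearSVC(loss="squared_hinge"))
```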
Table 3 shows the inference equation/procedure as well as the model parameters for each considered algorithm. Tables 4 and 5 also present the model size and computational complexity for performing classification inference using the considered algorithms in terms of the required number of integer and floating-point parameters and operations. In the calculations leading to the results of Table 4, we assume that the feature vectors, ft in Table 3, the mean vectors, mc∈C in Table 3, and the thresholds of the learned DT have integer values. We also assume that each integer parameter takes up four bytes and each floating-point parameter eight bytes. Note that C denotes the set of indexes for all classes. The model size and required operations given for DT in Tables 4 and 5 are regarding the learned classification tree depicted in
For example, the QDA inference procedure in Table 3 selects the class c that maximises the discriminant 2 ln[ p(c) |Σ_c|^{−1/2} ] − (f_t − m_c)^T Σ_c^{−1} (f_t − m_c).
It is evident from Tables 4 and 5 that the number of computations that the considered algorithms require for inference, as well as the amount of memory they require for model storage, is well within the affordability of most embedded systems. On the other hand, more complex classification algorithms such as SVM with a radial basis function (RBF) kernel, k-nearest-neighbor, multi-layer perceptron, and random forest generally demand significantly higher resources for both performing inference and storing the pertinent model.
Table 6 presents the results of cross-validated performance evaluation of the considered algorithms with our cattle behavior dataset in terms of the F1 score for all classes. The F1 score is the harmonic mean of the precision and recall. It is a positive scalar and has the best value of one with perfect precision and recall. Table 7 shows the cross-validated overall accuracy, cross-entropy score, and Brier score for all algorithms. The cross-entropy and Brier scores are measures of the accuracy of probabilistic forecast when probabilities are assigned to the class predictions. The cross-entropy score is the negative log-likelihood of the true class averaged over all datapoints given the probabilities predicted for every class. The Brier score is the mean of squared difference between the predicted probabilities and the actual class of every datapoint. Therefore, smaller cross-entropy and Brier scores indicate better calibrated probabilistic predictions.
We use a 5-fold stratified cross-validation without shuffling the datapoints. Stratification helps protect the balance in the prevalence of the datapoints of different classes within all five cross-validation folds. We avoid shuffling the datapoints when dividing them into the folds to respect the time-series nature of the accelerometer readings and acknowledge the fact that datapoints generated from consecutive 10-second windows may be correlated and shuffling can lead to information leakage from the training set into the test set.
As seen in Tables 6 and 7, the LR, SVM, and softmax algorithms perform noticeably better than the other considered algorithms. The softmax algorithm is particularly interesting as its overall accuracy is very close to those of the best-performing algorithms, LR and SVM, while requiring appreciably less computations and model memory for inference. In addition, it has low cross-entropy and Brier scores. All algorithms except LDA exhibit good performance in recognizing the grazing, ruminating, resting, and drinking behavior classes. However, all algorithms are less accurate in distinguishing the other behavior class. The results for the walking class are somewhat mixed.
According to Table 7, the LR and SVM algorithms using the OvO reduction scheme slightly outperform their OvR-based counterparts in terms of the overall accuracy. This might be due to the fact that although the whole dataset is balanced in the multiclass sense, it is not balanced from the perspective of the binary classifiers of the OvR scheme.
To provide a qualitative assessment of broader generalizability of the knowledge gained from our annotated accelerometry data to unseen instances, monitoring devices 102 use the softmax classification model trained on our cattle behavior dataset to predict the behaviors of two cattle for the entire period of the experiment totaling 591,490 datapoints, albeit without the exact ground-truth information for the whole period.
The durations and temporal trends of various behaviors in
With a time window of 10 second and a sampling rate of 50 samples per second, the features for every datapoint are calculated using 500 accelerometer readings in each axis. To avoid the expensive division operations required to average the accelerometer readings in (1) and the absolute values of the filter outputs in (6) and (8) for calculating the features, monitoring devices 102 use 512 values instead and replace the divisions by bit shifts. Since monitoring devices 102 set the filter parameters to γ1=0 and γ2=0.5 and the multiplication by 0.5 can be implemented as a bit shift, the calculation of the features can be realized using only integer additions and bit shifts with no need for any multiplication/division or any floating-point operation. The calculated feature values are hence integers. In addition, to prevent a possible overflow when summing up the accelerometer readings or the absolute values of the filter outputs, monitoring devices 102 average every 64 consecutive values then average the last eight 64-average values to obtain the features. Thus, monitoring devices 102 only store the last 64 values and the last eight 64-average values. Accordingly, monitoring devices 102 are able to calculate the features and carry out inference as every new 64 accelerometer readings become available, i.e., every 1.28 seconds. As such, the inference time window is sliding with a width of 10.24 seconds and a stride of 1.28 seconds.
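An integer-only sketch of this on-device arithmetic; the function names are illustrative. Division by 64 and by 8 become right shifts, and γ_2 = 0.5 becomes a single right shift, while γ_1 = 0 needs no multiplication at all:

```python
def mean_of_64(samples):
    """Average 64 integer readings (or absolute filter outputs) via >> 6."""
    return sum(samples) >> 6

def feature_from_block_means(block_means):
    """Average the last eight 64-sample means via >> 3."""
    return sum(block_means) >> 3

def highpass_step(b_prev, a_curr, a_prev, gamma_shift):
    """One step of b_n = gamma*b_{n-1} + a_n - a_{n-1}; gamma_shift=None
    encodes gamma = 0, and gamma_shift=1 encodes gamma = 0.5 (b_prev >> 1)."""
    recur = 0 if gamma_shift is None else (b_prev >> gamma_shift)
    return recur + a_curr - a_prev
```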
We implement the inference procedure for behavior class prediction using the learned classification models on Loci with the aid of the sklearn-porter package. This package contains C code for implementing several trained classification models produced by scikit-learn on embedded systems.
In Table 8, we show the average number of CPU cycles and the average processing time required for performing inference via each considered algorithm on Loci. The values given in Table 8 are averaged over 200 independent trials as the complexity of floating-point operations is non-deterministic.
Besides the memory needed to store the model parameters (as in Table 4) and the inference routine, the working memory required for calculating the features is only about (64+8)×9×4=2592 bytes assuming each integer number takes up 4 bytes. Overall, in-situ behavior classification using any of the examined algorithms can be conveniently performed on Loci without imposing any strain on the available computational or memory resources.
An interesting finding is that the relatively simple linear discriminative models, LR, SVM, and softmax (also DT and QDA to some extent) can distinguish the rare behaviors, walking and drinking, rather well despite the small amount of annotated data for these classes. This is particularly important as these behaviors are often much less frequent compared to the common behaviors of grazing (feeding), ruminating, and resting. In fact, almost all cattle behavior datasets are bound to be unbalanced regardless of the quantity of annotations since, in any realistic trial, one cannot realistically expect to observe biologically significant but infrequent behaviors, such as drinking, in abundance.
Our feature extraction approach is somewhat heuristic, albeit founded on theoretical insights, domain expertise, the existing body of knowledge in the literature, and our extensive examinations and observations. Another approach for feature extraction that can make the most of the available discriminative information is to use an appropriate filter bank whose parameters are estimated alongside training the classification model in a so-called end-to-end learning manner. However, apart from the challenges associated with such concurrent representation and classification model learning, the learned end-to-end model should be fit for performing inference on embedded systems. This is a topic of our ongoing investigations.
We chose to use the softmax algorithm for predicting the class labels in
Farm 100 relies on Internet of Things (IoT) sensors to capture observations of the physical domain and record them digitally, effectively converting continuous physical signals into digital signals in the process. In other words, IoT provides observations of the true state of the physical domain. These observations may be subject to noise, bias, sensor drift, or malicious alterations. Trust in IoT systems is critical at three distinct levels: (1) the data layer that relates to sensor and other observational data; (2) the interaction layer that relates to communications among devices in the IoT network; and (3) the application layer that relates to data processing and the interactions between service providers and service users.
This disclosure provides trust mechanisms that cut across these levels to ensure the end-to-end integrity of the collected data and the associated interactions. One key to fulfilling these requirements is the transparency of data collection processes and the associated interactions, in addition to the ability to audit these processes and interactions. Both the transparency and auditability requirements motivate the consideration of blockchain to underpin trust in IoT. Some examples are presented herein in the context of rooms in a building, noting that this is for illustrative purposes only and the rooms can be substituted with paddocks, areas, or farms in other examples.
Blockchain is a distributed ledger and has been applied to non-monetary applications. Blockchain is immutable as it is jointly managed by network participants through a consensus mechanism, such as Proof-of-Work (PoW), Proof-of-Stake (PoS), or Proof-of-Elapsed-Time (PoET). Consensus delivers agreement among the network participants, which are untrusted, on the current state of the ledger. In effect, trust in the current state is decentralised due to its coupling to the outcome of distributed consensus among the participants.
In the context of IoT, blockchain provides an immutable audit trail of sensor observations by linking the hash of the sensor data to blockchain transactions. The transactions themselves record immutable records of interactions among IoT devices and other network entities. Transactions are grouped into blocks that are linked through cryptographic hash functions to previous blocks in the chain, making it virtually impossible to alter previously stored blocks without detection. Using public key cryptography, blockchain can verify the authenticity of IoT transactions and blocks, before they are added to the blockchain. Once the blocks are mined into the blockchain, we have a guarantee that the inter-node interactions recorded in the block's transactions are securely recorded and are tamper-proof. Providing a tamper-proof audit trail of inter-node interactions is a necessary but insufficient element to deliver end-to-end trust in IoT. Storing the hash of the data on the blockchain does ensure that the integrity of the stored data can be verified by comparing its hash against the blockchain-stored hash value. The authenticity of the observational data itself in the first place, however, is not guaranteed. As IoT data is an observation of the physical environment, its capture can involve noise, bias, sensor drift, or manipulation by a malicious entity. The immutability of blockchain does not protect against this risk associated with data capture, as inaccurate observational data that is secured with blockchain may not be useful to the IoT end users.
The discussion above highlights the intertwined nature of trust in IoT involving both the inter-node interactions and the data capture process. There is a clear need for an integrated architecture to deliver end-to-end trust that cuts across the data collection and blockchain node interactions in IoT. To address this problem, this disclosure provides a layered trust architecture for blockchain-based IoT applications. This architecture provides end-to-end trust from data observation to blockchain validation. To enhance trust in observational data, node computers (such as gateway device 104 or server 111) use the observer's long-term reputation, its own confidence in its data, and corroborating data from neighboring observers. Trust at the block generation level is established based on verifying transactions through our adaptive block validation mechanism.
Contributions are:
This disclosure provides a trust architecture that takes into account both the data and the blockchain layers to improve the end-to-end trust.
We propose a blockchain-based layered trust architecture for IoT as shown in
Our architecture introduces two key modules for trust management: (1) the data trust module; and (2) the gateway reputation module. The data trust module quantifies the confidence in specific observational data based on: the evidence from other nearby data sources; the reputation of the data source based on the long-term behaviour; and the confidence level of the observation reported by the data source. It uses inputs from the data layer and records the trust value of observations into their associated transactions. The reputation module tracks a blockchain network participant's long-term reliability. It inputs information from the blockchain layer on a participant's reputation history, and continuously updates the reputation to provide it to both the blockchain and application layers. The blockchain layer can use the updated reputation to dynamically adapt its transaction or block validation requirements of other participants, where blocks from more trustworthy participants receive less scrutiny. The application layer can use updated reputation scores to offer economic incentives to highly reputable nodes, such as through increased business interactions. The reputation module can also incorporate external inputs, such as a participant's reputation from external systems, referred to as reputation transfer. Next, we introduce the underlying network model for the proposed architecture before proceeding to the details of the proposed trust and reputation mechanisms.
Due to the resource constraints and the limited capabilities of IoT nodes, we consider a two-tiered IoT network model as shown in
Having defined the key components of our trust architecture, we now focus on generic mechanisms for managing trust and reputation within this framework.
Recall that trust in our architecture refers to the instantaneous confidence in observations.
Assuming that neighboring sensor nodes connected to the same gateway have correlated observations due to close proximity, the observations of neighbouring sensor nodes can be used as evidence for the trustworthiness of a sensor observation. Sensor nodes build a history of reputation based on the evidence of other sensor node observations. A sensor node whose observations are supported by evidence most of the time has a higher reputation than a sensor node whose observations are not supported.
The reputation component in our data trust mechanism represents a node's long-term behaviour and affects the trust value of the observation data it provides. While the long-term reputation of a sensor node evolves with time, the trust value of the data is instantaneous for each observation. The other element that feeds into trusting a particular observation from a sensor node, which has yet to be considered, is the node's own confidence or uncertainty in its observations. For instance, a location estimate obtained from GPS is often associated with a position uncertainty estimate, which is the GPS module's estimate of error based on the received satellite signal and algorithm features. Including this uncertainty into the trust computation for the location observation provides the observer's own account of possible inaccuracies in its measurement. As a result, node computers model the trust in an observation Observationij at the data layer as:
Trust_ij = f(Tsens_ij, Trep_ij, Tconf_ij)  (9)
where f is a function mapping the evidence from other sensor node observations Tsensij, the reputation of the sensor node Trepij, as well as the node's uncertainty in its observation Tconfij, to the trust level Trustij in this observation. All the terms refer to values at the current time t, so we omit this notation for simplicity. Evidence supporting the observation, higher reputation of the data source, as well as lower observation uncertainty should all lead to a higher trust value in the observation. The definition of the mapping f and the trust components (i.e., evidence, reputation, and confidence) are application-specific and dependent on the relevant sensing modalities. To illustrate how the gateway nodes can assign the trust values to sensor observations, let us consider the simple mapping:
Trust_ij = Tsens_ij × Trep_ij × Tconf_ij  (10)

and examine each of its three components in turn.
Confidence of the data source (Tconf_ij): The confidence of the data source represents how confident the data source is in its observation and can be modelled as a variable Tconf_ij ∈ [0,1], whose value is determined by the data source and transmitted to the gateway together with the observation. Thus, the transaction from the data source S_ij to the gateway G_i becomes:
Tx_ij = [Observation_ij | Tconf_ij | PK_ij | Sig_ij]  (11)
where PK_ij and Sig_ij are the public key and the signature of the node S_ij, respectively. The confidence of the data source depends on the application-specific confidence model. As an example, the confidence of a GPS sensor node would be high when the received GPS signal is strong, and low when the received signal is weak. Furthermore, a fixed confidence value can be assigned to nodes that do not receive any GPS fix. We present a confidence model for sensor nodes that use the Received Signal Strength Indicator (RSSI) for determining the proximity of a beacon node for the indoor target localization application. In this sense, the mere fact that data has been received from a sensor wirelessly at a gateway means that the sensor must have been in close proximity to the gateway. Therefore, relatively high trust can be placed in the locality of that sensor data. This provides an additional source of trust to the disclosed system.
Evidence from other observations (Tsens_ij): The gateway uses the correlation in sensor observations to calculate the evidence component for the trust in observations. The gateway G_i calculates the evidence Tsens_ij for the observation Observation_ij based on the data received from the neighboring sensor nodes of S_ij. The neighborhood information is recorded in the profiles of the nodes on the blockchain. If a sensor observation Observation_im supports Observation_ij, it increases Tsens_ij by a value proportional to its own observation confidence Tconf_im. Otherwise, if Observation_im does not support Observation_ij, it decreases Tsens_ij by a value proportional to Tconf_im. The proposed confidence-weighted evidence calculation is given by:

Tsens_ij = (1/|N_ij|) Σ_{m ∈ N_ij} s_im · Tconf_im  (12)

where s_im = +1 if Observation_im supports Observation_ij, s_im = −1 otherwise,
and Nij denotes the set of the neighboring sensor nodes of Sij. The support condition in Eq. 4 is application-specific. As an example, for acoustic sensor observations, the difference between the measurements can be compared to a threshold value to determine if the observations support each other or not.
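A minimal sketch of the confidence-weighted evidence calculation of Eq. 12, with the support test passed in as the application-specific condition; the 3 dB acoustic threshold in the usage example is an assumed value:

    def evidence(observation, neighbour_obs, neighbour_conf, supports):
        """Confidence-weighted evidence Tsens for one observation (Eq. 12).
        neighbour_obs and neighbour_conf list the observations and confidences
        of the neighbours in N_ij; supports(a, b) is the application-specific
        support test."""
        t_sens = 0.0
        for obs_m, conf_m in zip(neighbour_obs, neighbour_conf):
            # supporting neighbours add their confidence, refuting ones subtract it
            t_sens += conf_m if supports(observation, obs_m) else -conf_m
        return t_sens

    # e.g. an acoustic support test with an assumed 3 dB threshold:
    print(evidence(60.0, [61.0, 59.5, 80.0], [0.9, 0.8, 0.7],
                   lambda a, b: abs(a - b) <= 3.0))  # 0.9 + 0.8 - 0.7 = 1.0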
Reputation of the data source (Trep_ij): There is a clear interplay between the trust level in an observation and the data source's long-term reputation. A higher reputation of a node leads to higher trust in the node's observations. The reputation of a data source evolves over time and is updated by its responsible gateway node. The governing principle of the reputation update based on the observation confidence and the evidence of other observations is the following: the reputation reward or penalty must be proportional to the reported confidence. If node S_ij has high confidence in its observation (i.e. Tconf_ij ≥ confidence threshold) and the observation is substantiated by other nodes (i.e. Tsens_ij ≥ evidence threshold), S_ij should receive a significant increase ΔRep_H in its reputation Trep_ij. Conversely, if S_ij delivers observations with high confidence that are refuted by other nodes, its reputation should also drop significantly. Similarly, rewards and penalties ΔRep_L for observations with low confidence should be lower, i.e. ΔRep_L < ΔRep_H.
Malicious nodes may then be tempted to report erroneous values with low confidence in order to perturb the system. While such nodes will not suffer a significant drop in reputation for each observation, their actions can be countered by: (1) proper design of the function weighting high-uncertainty measurements in the trust level calculation; and (2) design of the reputation score update mechanism to penalise repetitive low-confidence observations from the same node.
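The reputation update principle above could be sketched as follows; the confidence and evidence thresholds, the step sizes ΔRep_H and ΔRep_L, and the reputation bounds are all assumed values, and the sketch omits the penalty for repetitive low-confidence reporting described in point (2):

    def update_sensor_reputation(rep, t_conf, t_sens, conf_th=0.5, evid_th=0.0,
                                 d_rep_h=1.0, d_rep_l=0.2,
                                 rep_min=0.0, rep_max=5.0):
        """Reward or penalise a sensor node in proportion to its reported
        confidence; all threshold and step values are assumed for illustration."""
        # high-confidence observations move reputation by the larger step
        step = d_rep_h if t_conf >= conf_th else d_rep_l
        # substantiated observations are rewarded, refuted ones penalised
        rep += step if t_sens >= evid_th else -step
        return min(max(rep, rep_min), rep_max)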
Note that the data trust block proposed in our architecture is modular and can be adapted for different applications. For example, depending on the spatio-temporal properties of the physical phenomenon being observed by the sensor nodes, spatial and temporal correlation of observations can be incorporated in the trust calculations. Once the Trep_ij and Trust_ij values are computed for Observation_ij by the associated gateway G_i, they can be included as part of the transaction that records the occurrence of the observation in the blockchain. This provides an auditable account of the trust estimate of the generated data and the reputation of the data source in the blockchain.
This section presents the gateway reputation module, which updates the reputations used by the adaptive block validation process, thereby integrating the data trust mechanism into the blockchain layer.
Once a gateway node generates a new block, this block should be validated by validators before being appended to the blockchain. For the proposed trust architecture, the block validation involves: (1) validating the data transactions by checking the public keys of the data sources and their signatures in the transactions; and (2) validating the trust values assigned by the gateway to the observations by recalculating the trust values with the data available in the generated block and on the blockchain. The gateway reputation module tracks the long-term behaviour of gateway nodes and adapts the block validation depending on the reputation of the current gateway node. The proposed reputation module receives frequent updates from the blockchain layer, where each node's honesty in block mining, B(G_i), is reported based on direct and indirect evidence and used to update the node's reputation score. Our reputation module further integrates the data trust mechanism into the block validation process by validating: (1) the observation trust values assigned by the gateway, and (2) the sensor transactions reported in the block, to update the reputation score of the gateway node. External sources of a node's reputation, Ext(G_i), which can be imported from other systems, can also be fed into the node's reputation score. In summary, the reputation score Rep(G_i) ∈ [Rep_min, Rep_max] of node G_i is based on a function g:
Rep(G_i) = g[T(G_i), B(G_i), Ext(G_i)] (18)

where T(G_i) captures how much other validator nodes trust G_i, based on G_i's trust value assignment to the observations.
We propose a reputation update mechanism that considers the validity of sensor transactions and the correctness of the associated trust values. The reputation of the gateway node increases if the generated block is validated, and decreases otherwise:

Rep(G_i) ← min(Rep(G_i) + ΔR, Rep_max) if the generated block is validated
Rep(G_i) ← max(Rep(G_i) − β·ΔR, Rep_min) otherwise

where ΔR is the reputation increase step and β·ΔR is the reputation reduction step. For β > 1, it is harder for the gateway nodes to build reputation than to lose it.
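A minimal sketch of this gateway reputation update, assuming example values for ΔR, β and the reputation bounds; the range [1, 5] is chosen only to match the Rep values used in the worked example later in this description:

    def update_gateway_reputation(rep, block_valid, d_r=0.5, beta=2.0,
                                  rep_min=1.0, rep_max=5.0):
        """Gateway reputation update: +dR for a validated block, -beta*dR for
        an invalidated one. With beta > 1, reputation is harder to build than
        to lose. Step sizes and bounds are assumed values."""
        rep = rep + d_r if block_valid else rep - beta * d_r
        return min(max(rep, rep_min), rep_max)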
Based on the Lightweight Scalable Blockchain (LSB), which is optimized for IoT requirements, we propose a private blockchain for our trust architecture with a lightweight block generation mechanism, reputation-based adaptive block validation, and distributed consensus among blockchain nodes.
At the blockchain layer, the gateway nodes participate in block generation, block validation, and distributed consensus in a private blockchain network. In a private blockchain, nodes have permissions to participate in the network. Since the gateway nodes are known by the network and have permission to generate blocks, they do not need to compete for block generation using computationally expensive block mining mechanisms. We propose a lightweight block generation mechanism, in which gateways generate blocks at periodic intervals. After receiving all the associated sensor transactions, the gateway validates these transactions and calculates the evidence values and the sensor reputations to assign trust values to the sensor observations. Then, it generates a block with transactions containing the observation data, the public key and the signature of the data source, the assigned trust value for the observation, and the updated reputation of the data source. The gateway node waits for its turn to multicast the block to the other blockchain nodes for validation. The block generation periods for the gateways can be adjusted based on the sensor data rate and the latency of data collection and block generation.
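For illustration, one round of this lightweight block generation could be sketched as below; the callables verify_sig, evidence_for and sensor_rep are hypothetical stand-ins for the signature check, the evidence calculation of Eq. 12 and the reputation lookup:

    from dataclasses import dataclass, field

    @dataclass
    class Block:
        transactions: list = field(default_factory=list)

    def generate_block(pending_txs, verify_sig, evidence_for, sensor_rep):
        """One block-generation round at a gateway: drop transactions with bad
        signatures, assign each remaining observation its trust value (Eq. 10),
        and collect the results into a block for multicast to the validators."""
        block = Block()
        for tx in pending_txs:
            if not verify_sig(tx):
                continue  # invalid signature: do not include the transaction
            tx["trust"] = evidence_for(tx) * sensor_rep(tx["source"]) * tx["confidence"]
            block.transactions.append(tx)
        return block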
The proposed block validation mechanism adapts the block validation scheme based on the reputation of the block generating node Rep(Gi) and the number of validator nodes Nval. The integration of trust management in the block verification mechanism improves the block validation and is managed by the gateway reputation module of our architecture.
Depending on the reputation of the block generating node, each validator randomly validates a percentage of the transactions in the block. The idea behind using reputation for adaptive block validation can be explained by
P(successful attack) = P(attack succeeds | attack) × P(attack) (16)
where a higher reputation of a node can be perceived as a lower node attack probability. For a target P(successful attack) threshold, if P(attack) is low, the system can tolerate a higher P(attack succeeds | attack). In terms of block validation, this corresponds to validating a smaller number of transactions in a block generated by a gateway with a high reputation. Node computers can model the relative effect of the reputation on the percentage of transactions to be validated with a linearly decreasing function.
The percentage of the transactions to be validated also depends on the number of validators. For a fixed target probability of invalid block detection, as the number of validators increases, the percentage of transactions required to be validated by each validator node decreases. Following the adaptive block validation logic, validator nodes validate a percentage of the transactions in a block. Consequently, there is a risk of not detecting invalid transactions in a given block. The probability of not detecting any invalid transactions by N_val validator nodes, given that there are Tx_inval invalid transactions in the block, can be calculated as follows:

P(no detection) = (C(Tx_total − Tx_inval, Tx_val) / C(Tx_total, Tx_val))^N_val

where C(n, k) denotes the binomial coefficient, Tx_total is the number of transactions in the block and Tx_val is the number of transactions to be validated by each validator node. There are C(Tx_total, Tx_val) ways to choose a subset of Tx_val transactions to be validated, out of which C(Tx_total − Tx_inval, Tx_val) do not include any invalid transactions.
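This probability can be evaluated directly with binomial coefficients; a short sketch, with a usage example consistent with the worked example later in this description:

    from math import comb

    def p_no_detection(tx_total, tx_inval, tx_val, n_val):
        """Probability that n_val validators, each checking tx_val randomly
        chosen transactions out of tx_total, all miss every one of tx_inval
        invalid transactions."""
        per_validator = comb(tx_total - tx_inval, tx_val) / comb(tx_total, tx_val)
        return per_validator ** n_val

    # e.g. 48 transactions, 1 invalid, 10 checked by each of 15 validators:
    print(p_no_detection(48, 1, 10, 15))  # ~0.03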
The minimum number of validators required to achieve a target probability threshold of not detecting any invalid transactions, for a given number of transactions validated by each validator, is shown in the accompanying figures.
Based on these observations, we consider an adaptive block validation mechanism, where the Percentage of Validated Transactions (PVT) decreases with the reputation of the block generating node (Rep) and the number of validator nodes (N_val) as:

PVT = (γ0 + γ1·Rep) × e^(−δ·N_val)

where δ is a controlling parameter determining the effect of N_val on PVT, and γ0 and γ1 are the parameters of an affine function determining the effect of Rep on PVT (with γ1 < 0, so that PVT decreases as Rep grows). For large values of δ, PVT decreases quickly with N_val, which may result in a lower probability of detecting invalid blocks. For small values of δ, increasing N_val does not decrease PVT enough, causing a higher number of transactions to be validated than needed. The proposed adaptive block validation scheme can significantly reduce the computational cost of the block validation process, and improve the scalability and latency of the proposed trust architecture.
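A sketch of the adaptive validation percentage and the resulting per-validator workload; the parameter values are supplied by the caller, and γ1 is negative so that PVT decreases with reputation:

    from math import ceil, exp

    def pvt(rep, n_val, gamma0, gamma1, delta):
        """Percentage of Validated Transactions: affine and decreasing in the
        generator's reputation (gamma1 < 0), exponentially decreasing in the
        number of validators."""
        return (gamma0 + gamma1 * rep) * exp(-delta * n_val)

    def tx_to_validate(tx_total, rep, n_val, gamma0, gamma1, delta):
        """Number of transactions each validator randomly checks."""
        return ceil(tx_total * pvt(rep, n_val, gamma0, gamma1, delta))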
As a result of block validation, a validator either multicasts a “VALID” message to confirm that the block is valid, or an “INVALID TRANSACTION ID” message to notify other nodes about an invalid transaction in the block. If all the validators multicast “VALID” messages, the block is appended to the blockchain by the nodes. However, if a blockchain node receives “INVALID TRANSACTION ID” messages for a block, it validates the transactions given by the invalid transaction IDs. If at least one transaction is found to be invalid, the block is rejected by the node. If all the flagged transactions are verified to be valid, then the block is appended to the blockchain. A malicious validator may keep broadcasting “INVALID TRANSACTION ID” messages to try to waste network resources by forcing all the transactions in the block to be validated. To mitigate such attacks, during the consensus period each validator is allowed to multicast only one message, either confirming the valid block or containing the transaction ID of only one invalid transaction.
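The consensus decision described above could be sketched as follows, with revalidate standing in for the re-checking of a flagged transaction:

    def consensus_decision(votes, revalidate):
        """Append-or-reject decision for one block, given one message per
        validator: either "VALID" or the ID of a single flagged transaction
        (the one-message limit caps resource-wasting by malicious validators).
        revalidate(tx_id) re-checks a flagged transaction and returns True
        if it is in fact valid."""
        flagged = {v for v in votes if v != "VALID"}
        if not flagged:
            return "APPEND"          # unanimous VALID votes
        if all(revalidate(tx_id) for tx_id in flagged):
            return "APPEND"          # every flagged transaction checked out
        return "REJECT"              # at least one confirmed invalid transaction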
We divide the performance analysis of the architecture into two parts: (1) the performance of the data trust module, and (2) the end-to-end performance of the proposed trust architecture. To illustrate how the proposed architecture works, we consider an indoor target localization application in a smart construction environment, where IoT devices collect data from the construction site to monitor all stages of the construction project.
This section analyzes the data trust module's ability to assign higher trust values to honest nodes and lower trust values to malicious nodes in the presence of malicious observations. Assume that an unauthorized vehicle (the target) enters a restricted construction area (ROOM1). The target periodically broadcasts beacons, and ROOM1 is monitored by K = 48 IoT sensor nodes, which can hear these beacons. The sensors report the RSSI values and the confidence of their observations to the associated gateway, appending their public keys and signing the transactions with their private keys. These RSSI values can be used for target detection and localization in the application layer. Furthermore, we consider that a sensor node may be malicious or malfunctioning, so that its observation of the target movement may deviate from the true target path. We assume that the honest sensor nodes report RSSI observations for the target track, and malicious sensor nodes report RSSI observations for the malicious track, as shown in the accompanying figures.
The RSSI(dB) value of a sensor node at an arbitrary distance d from the target can be defined with a log-normal shadowing model as:
RSSI(d) = RSSI(d_0) − 10·α·log10(d/d_0) + X_σ (19)
where RSSI(d_0) represents the received signal strength at a reference distance d_0, α is the environment-specific pathloss exponent, and X_σ ~ N(0, σ²) is a normal random variable representing the variation of the received power due to fading. The minimum RSSI value received by the sensor nodes is assumed to be −120 dB.
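A sketch of the shadowing model of Eq. 19; the reference power, pathloss exponent and shadowing deviation are assumed example values for an indoor environment:

    import random
    from math import log10

    def rssi(d_m, rssi_d0=-40.0, d0_m=1.0, alpha=3.0, sigma_db=4.0,
             floor_db=-120.0):
        """Log-normal shadowing model of Eq. 19; rssi_d0, alpha and sigma_db
        are assumed example values."""
        x_sigma = random.gauss(0.0, sigma_db)  # zero-mean normal fading term
        value = rssi_d0 - 10 * alpha * log10(d_m / d0_m) + x_sigma
        return max(value, floor_db)  # receiver sensitivity floor of -120 dB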
We model the confidence of an RSSI observation based on the intuition that sensor nodes with very high RSSI values would have the maximum confidence in the range of the target. Conversely, sensor nodes with RSSI values below the receiver sensitivity would have a lower, constant confidence value. The intuition behind setting a lower confidence value for no target detection is as follows. If honest nodes detect no target, they can report a fixed lower confidence for their observation. A malicious node falsely claiming the absence of the target can therefore not gain a disproportionate advantage over honest nodes by setting high confidence in its false observation. Conversely, setting the target-absence confidence to be non-zero avoids a minority of malicious nodes falsely claiming the presence of a target with high confidence and representing a majority view. Observations with moderate RSSI values are assigned a confidence that increases linearly with the RSSI value between these two extremes.
The specific values of the confidence function were derived empirically for our scenario. The simulated RSSI values reported by the sensor nodes are shown in the accompanying figures.
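A sketch of such a piecewise-linear confidence model; the −60 dB and −100 dB thresholds and the 0.3 floor are assumptions for illustration, since the disclosure derives its values empirically:

    def rssi_confidence(rssi_db, high_db=-60.0, sens_db=-100.0, conf_floor=0.3):
        """Piecewise-linear confidence for an RSSI observation; the thresholds
        and the floor value are assumed."""
        if rssi_db >= high_db:
            return 1.0            # strong signal: maximum confidence in range
        if rssi_db < sens_db:
            return conf_floor     # below sensitivity: fixed, non-zero confidence
        # linear interpolation between the floor and full confidence
        return conf_floor + (1.0 - conf_floor) * (rssi_db - sens_db) / (high_db - sens_db)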
Clearly, the performance of the data trust module depends on the ratio of malicious nodes to honest nodes, and on the observation confidences. Next, we investigate analytically the maximum number of malicious nodes the data trust module can tolerate as a function of the observation confidences and the total number of nodes, when all the sensor nodes associated with a gateway node are assumed to be neighbors. Consider two disjoint sets of nodes, i.e. honest nodes S_h and malicious nodes S_m, such that |S_h| + |S_m| = K and

Trep_ij = Trep_h, Tconf_ij = Tconf_h for S_ij ∈ S_h
Trep_ij = Trep_m, Tconf_ij = Tconf_m for S_ij ∈ S_m
For the worst-case scenario of colluding malicious nodes, let us assume that the members of a set share the same evidence value:

Tsens_ij = Σ_{k=1}^{|S_h|−1} Tconf_h − Σ_{k=1}^{|S_m|} Tconf_m = (|S_h| − 1)·Tconf_h − |S_m|·Tconf_m for S_ij ∈ S_h (21)
Tsens_ij = Σ_{k=1}^{|S_m|−1} Tconf_m − Σ_{k=1}^{|S_h|} Tconf_h = (|S_m| − 1)·Tconf_m − |S_h|·Tconf_h for S_ij ∈ S_m (22)
Furthermore, malicious nodes can behave like honest nodes to build similar reputations, so as not to be detected before behaving maliciously, i.e. Trep_h = Trep_m. Based on these assumptions, the tolerable region, in which the data trust module is capable of assigning higher trust values Trust_h to the honest nodes than the trust values Trust_m assigned to the malicious nodes, is given by:

Trust_h > Trust_m
Tsens_h · Trep_h · Tconf_h > Tsens_m · Trep_m · Tconf_m (23)

For Trep_h = Trep_m, substituting Eq. 21 and Eq. 22 in Eq. 23 gives:

((|S_h| − 1)·Tconf_h − |S_m|·Tconf_m)·Tconf_h > ((|S_m| − 1)·Tconf_m − |S_h|·Tconf_h)·Tconf_m (24)
While honest nodes report their true confidence levels, malicious nodes may report higher confidence levels. For Tconf_m, Tconf_h > 0, dividing both sides of Eq. 24 by Tconf_h² and solving for |S_m| yields:

|S_m| < (|S_h|·(1 + c) + c² − 1) / (c·(1 + c)) (25)

where c = Tconf_m / Tconf_h is the ratio of the confidences of malicious and honest nodes.
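The bound of Eq. 25 can be evaluated directly; the node counts and confidence ratio in the example below are illustrative assumptions:

    def max_tolerable_malicious(n_honest, c):
        """Upper bound on |S_m| from Eq. 25, with c = Tconf_m / Tconf_h."""
        return (n_honest * (1 + c) + c**2 - 1) / (c * (1 + c))

    # 30 honest nodes; malicious nodes reporting twice the honest confidence:
    print(max_tolerable_malicious(30, 2.0))  # 15.5 -> tolerates up to 15 malicious nodes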
For the end-to-end performance analysis, we used the ns-3 network simulator with our lightweight blockchain architecture. RSSI measurements were generated according to the simulation scenario shown in the accompanying figures.
The validators follow the adaptive block validation mechanism described above. When Tx_total = 48, the number of transactions to be validated by each validator can be calculated by:

PVT = ((4.7/4) − (0.7/4)·Rep) × e^(−0.03·N_val) (26)
Tx_val = ⌈Tx_total × PVT⌉ (27)
where the parameters for Eq. 26 were determined empirically (the optimization of the parameters will be considered in future work) such that the probability of not detecting any invalid transactions by N_val validators, given that there is 1 invalid transaction in the block, remains low.
Table 1 shows the number of transactions randomly validated by each validator. The last two columns show the number of transactions to be validated by each validator for a given probability threshold of not detecting an invalid transaction, given that there is 1 invalid transaction in the block. For example, when N_val = 15 and Rep = 1, each validator validates 31 transactions randomly chosen out of 48. If, instead, the gateway has reputation Rep = 5, each validator validates only 10 transactions. Note that in this case the probability of not detecting an invalid transaction, given that there is 1 invalid transaction in the block, is ≈0.03. Although this probability may seem high, it must be multiplied by the attack probability of a gateway node with high reputation to obtain the probability of a successful attack, as in Eq. 16. If the attack probability of a gateway with Rep = 5 is less than ≈0.033, the probability of a successful attack becomes less than 0.001. If there is 1 invalid transaction in the block, for N_val = 15, each validator should validate at least 18 transactions so that the probability of not detecting the invalid transaction in the block is less than 0.001. There is a tradeoff between the computational cost of block validation and the probability of a successful attack by a malicious gateway.
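The figures in this example can be re-derived from Eq. 26 and Eq. 27 with a self-contained check:

    from math import ceil, comb, exp

    # Re-deriving the worked example above (Eq. 26 parameters as stated):
    for rep in (1, 5):
        pvt = (4.7 / 4 - (0.7 / 4) * rep) * exp(-0.03 * 15)  # N_val = 15
        tx_val = ceil(48 * pvt)                              # Tx_total = 48
        miss = (comb(48 - 1, tx_val) / comb(48, tx_val)) ** 15
        print(rep, tx_val, round(miss, 4))
    # Rep=1 -> 31 transactions (miss prob ~2e-7); Rep=5 -> 10 transactions (miss prob ~0.03)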
In this section, a gateway with initial reputation Rep = 3 generates, in order, 205 valid, 105 invalid, and 105 valid blocks in a single simulation run.
The invalid block detection performance improves by increasing the number of validators. For Nval=15, when there is more than 1 invalid transaction in the invalid blocks, all invalid blocks have been detected. Increasing the reputation reduction step also improves the detection performance. However, a steep reputation reduction results in a higher number of transactions to be validated by the validators.
Note that when there are 10 invalid transactions in the invalid blocks, all invalid blocks were detected by all of the simulated schemes, as the probability of detecting at least one of the invalid transactions approaches one.
We analyze the latencies caused by the proposed trust architecture by comparing it with a baseline blockchain application without a trust architecture and adaptive block validation.
In the baseline case, a gateway node creates a block of sensor observations by verifying the signatures of the transactions. In the proposed scheme, a gateway node needs to calculate the trust values for the observations in addition to verifying the transaction signatures.
In the baseline case, blocks are validated by only verifying the signatures of all the transactions. In the proposed scheme, a percentage of the transactions are validated depending on the reputation of the gateway node. However, validation requires checking the signatures and recalculating the trust values.
The block validation, blockchain layer, and end-to-end latencies of the proposed scheme are higher than the baseline when the gateway nodes have low reputation, and lower than the baseline when the gateway nodes have high reputation. However, the difference is relatively low, with the proposed approach adding less than 0.3% end-to-end delay over the baseline, since most of the delay is common to the baseline and the proposed architecture (due to packet transfers from sensor nodes to gateways and among blockchain nodes).
This section considers attack scenarios that can be implemented by sensor nodes, blockchain nodes, or external attackers, and the response of the proposed architecture. We assume that the sensor nodes and the gateway nodes are registered to the network during initialization by a trusted entity. Their public keys are published in their profiles and they have secure mechanisms to generate and keep their private keys.
Malicious sensor nodes can try to tamper with their observations. The proposed data trust module uses the evidence of other node observations, the reputation of the sensor node, and the reported confidence levels to assign a trust value to the observation. As long as the tampered observation is not supported by other node observations, the observation will be assigned a low trust value depending on the reputation of the sensor node and the confidence of its observation. Furthermore, the reputation of the node is decreased for future observations. In order to increase the probability of a successful attack, malicious sensor nodes may collude to tamper with their observations such that the tampered observations support each other. For the collusion attack to be successful, the number of malicious nodes must exceed the bound given in Eq. 25, which depends on the number of sensor nodes and the ratio of the confidences of malicious and honest nodes.
Malicious gateways: Malicious gateways can generate invalid blocks by tampering with transactions or by assigning fake trust values to transactions. During block validation, the validators try to verify the block generated by the malicious node. If the transactions are changed by the gateway, the signatures of the sensor nodes corresponding to the tampered transactions cannot be verified. If the transaction trust values are not assigned according to the architecture, this is also detected by the validators, as they recompute the trust values for the transactions during block validation. Once the block is invalidated by the validator nodes, the reputation of the gateway node is downgraded. Since the reputation module updates the block validation process depending on the reputation of gateway nodes, the blocks generated by the malicious gateway will be subjected to a stricter validation process. If the malicious node repeatedly creates invalid blocks, it is isolated from the blockchain network and the data sources connected to that node are associated with a new gateway node.
Colluding blockchain nodes: Malicious blockchain nodes can collude to validate invalid blocks. For a blockchain network with a large number of validators, the success probability of this attack would be very low, as it would require a large number of malicious validators. If the blockchain network has a lower number of validators, the choice of block validating nodes can be randomized to mitigate the collusion of malicious block validating nodes.
Impersonation: An external attacker may try to impersonate a sensor node or a blockchain node. This attack requires the attacker to have access to the private key of the attacked node as all transactions are signed using the private keys of the nodes, whose public keys are known and used for verification of the transactions.
This disclosure provides a layered architecture for improving the end-to-end trust that can be applied to a diverse range of blockchain-based IoT applications. The proposed architecture can also be used for other applications involving physical observations being stored on blockchains (e.g. healthcare, social media analysis, etc.).
At the data layer, the gateways can calculate the trust for sensor observations based on the data they receive from neighboring sensor nodes, the reputation of the sensor node, and the observation confidence. If the neighboring sensor nodes are associated with different gateway nodes, the gateway nodes may share the evidence with their neighboring gateway nodes to calculate the observation trust values. This case will be investigated further in our future work.
In the proposed architecture, the computational complexity of calculating the trust values is O(K²), where K is the number of sensor nodes in a cluster with highly correlated observations. The number of spatially proximal nodes is finite and not large for practical sensor node densities in real deployments, which limits the computational cost of calculating trust values within a cluster. When the number of sensor nodes in a cluster is high, the nodes can be clustered further to reduce the complexity.
We have implemented the data trust and blockchain mechanisms on a custom private blockchain for end-to-end performance analysis. They can also be implemented on major public and private blockchain platforms (e.g. block validation for the Ethereum blockchain can be adapted through the Ethereum source code, and the Hyperledger block validation logic can be adapted through Hyperledger Fabric).
Data processing system 2704 integrates a reasoning engine 2705, such as the SPINdle software available from https://github.com/NICTA/SPINdle (though other engines can equally be used), and an aggregator 2706. The reasoning engine 2705 has access to animal specific rules 2707, which apply to a single animal, as well as to general rules 2708, which apply over cattle as a whole, such as a group of cattle, a herd of cattle or the cattle of an entire farm.
Aggregator 2706 has access to an aggregation specification 2709, describing how individual variables are to be aggregated (e.g. take the sum, average, etc.). In one example, these rules are defined by legislation, regulations, farm practices, accreditation requirements, etc.
The data processing system 2704 performs the following method steps:
The data processing system 2704 collects 2710 all variables from animals and collects 2711 all static data. The data processing system 2704 uses these variables and applies the animal specific rules 2707 in the reasoner 2705 to obtain derived facts 2712 (i.e. outputs from the animal specific rules).
The data processing system 2704 then takes 2713 the animal data and derived facts 2712 for each animal as input to the aggregator 2706 to obtain composite facts 2714 if required by the aggregation specification (e.g. the percentage of animals that have been handled carefully). All composite facts 2714, animal data 2710, derived facts 2712 and static data 2711 are sent to the reasoner 2705 again to be evaluated against the general rules 2708. The output is posted as the compliance result (possibly posted to the Internet of Things (IoT) Application Enablement and Data Management cloud-based platform again). The last step is the actual compliance check, where the output is compared to a specified requirement.
It is noted that the examples above relate to two layers, comprising the individual animal data 2712/2713 and the composite facts 2714. However, further layers may be included, such that the architecture is: individual animal data -> aggregated/composite facts -> further aggregated/composite facts -> final results. It is further noted that the composite facts, also referred to as “composite data”, may comprise composite data for a group of animals (such as that the herd in total has been on a paddock). In other examples, the composite data comprises composite data over a period of time for an animal, such as “an animal has been drinking at least once in the last 6 hours”, or composite data over a period of time for weather data, such as a rule that if it has been raining in the past couple of days, no drinking behaviour is necessary.
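As an illustration of this layered evaluation, the following sketch mimics the pipeline of data processing system 2704; the rule content, field names and the 80% requirement are hypothetical stand-ins for the animal specific rules 2707, the general rules 2708 and the aggregation specification 2709:

    animals = [
        {"id": "A1", "handled_carefully": True,  "drank_last_6h": True},
        {"id": "A2", "handled_carefully": True,  "drank_last_6h": False},
        {"id": "A3", "handled_carefully": False, "drank_last_6h": True},
    ]

    def derive(animal):
        """Animal specific rule: derived fact per individual animal (2712)."""
        animal["welfare_ok"] = animal["handled_carefully"] and animal["drank_last_6h"]
        return animal

    def aggregate(herd):
        """Aggregation specification: roll individual variables up into
        composite facts (2714), here simple percentages over the herd."""
        n = len(herd)
        return {
            "pct_handled_carefully": 100.0 * sum(a["handled_carefully"] for a in herd) / n,
            "pct_welfare_ok": 100.0 * sum(a["welfare_ok"] for a in herd) / n,
        }

    composite = aggregate([derive(a) for a in animals])     # composite facts
    compliant = composite["pct_handled_carefully"] >= 80.0  # general rule + compliance check
    print(composite, compliant)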
The steps before the compliance check (performed by data processing system 2704) are preprocessing steps that aggregate and prepare the raw data from individual animals, to ensure the right information is obtained as requested by the general rules.
The aggregator 2706 allows for a generic aggregation of arrays of values, as specified in the aggregation specification. As such, it is flexible and allows for a variety of aggregated outputs.
The architecture disclosed herein, for example architecture 2700 in the accompanying figures, can therefore be adapted to a wide range of compliance requirements.
It will be appreciated by persons skilled in the art that numerous variations and/or modifications may be made to the above-described embodiments, without departing from the broad general scope of the present disclosure. The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive.
Number | Date | Country | Kind
2020904456 | Dec 2020 | AU | national

Filing Document | Filing Date | Country | Kind
PCT/AU2021/051430 | 11/30/2021 | WO |