This invention relates to air quality monitoring, and in particular to the monitoring and prediction of indoor bioaerosol concentrations.
People spend more than 85% of their time indoors,1-3 which means that indoor air quality (IAQ) significantly affects human health.3 As such, poor IAQ causes building-associated illness.4 Given that ˜5%-34% of particulate matter (PM) in indoor air is in the form of bioaerosols (i.e., bacteria, fungi and pollen),5 these particles are gaining increasing research attention,5-7 especially as the coronavirus disease 2019 (COVID-19) pandemic continues.8,9
Culturing-based methods have traditionally been used to determine the concentration of bioaerosols,10,11 but as these require offline processing and a long incubation time they cannot supply real-time information. Furthermore, because many microorganisms are known to be unculturable under standard laboratory conditions,12 bioaerosol concentrations are typically underestimated.13 Alternatively, ultraviolet light/laser-induced fluorescence techniques can be used to determine the concentrations and identities of bioaerosols in real time.14-18 However, the instruments required for these analyses are large and expensive, which makes them impractical for widespread deployment in indoor environments.
Artificial intelligence (AI)-based methods, such as machine learning and deep learning models, have been developed for the prediction of IAQ and applied to predict the trends in the values of IAQ parameters using data measured by real-time sensors.19 An artificial neural network, a form of deep learning model, was used to accurately determine the future concentration of carbon dioxide (CO2) in an office from past data.20 Similarly, deep learning models based on long short-term memory (LSTM) and gated recurrent units (GRUs) were developed to forecast trends in the concentrations of CO2 and fine dust in an office based on the past data of six IAQ parameters.21 In another example, size-segregated particle concentrations, temperature and relative humidity (RH) were fitted to a multi-linear regression model, enabling it to predict the concentrations of airborne bacteria and fungi in a hospital from culture-based data.22
However, no method has been developed that can accurately determine real-time and near future bioaerosol concentrations on a continuous basis.
Each of the following references (and associated appendices and/or supplements) is expressly incorporated herein by reference in its entirety:
Accordingly, the present invention, in one aspect, is a method for predicting concentration of indoor bioaerosols. The method contains the steps of providing a plurality of AI models, evaluating a prediction accuracy of each of the plurality of AI models for a venue; choosing a best model from the plurality of AI models for the venue; inputting measured data at the venue into the best model; and generating a prediction of concentration of indoor bioaerosols by the best model for the venue.
In some embodiments, the plurality of AI models includes one or more of a linear regression model, a lasso regression model, a random forest (RF) model, an extreme gradient boosting model, a multilayer perceptron model, an LSTM model, and a recurrent neural network model.
In some embodiments, the step of evaluating a prediction accuracy of each of the plurality of AI models for a venue further includes the steps of inputting test data for the venue into each of the plurality of AI models; applying more than one pair of input and output time windows; finding, for each of the plurality of AI models, difference data between predicted test data and measured test data; and determining the one of the plurality of AI models that has the best difference data as the best model.
In some embodiments, the difference data contains one or more of a mean squared error (MSE), a root-mean-square error (RMSE) and a value on a revised version of the Willmott's index (WI).
In some embodiments, the more than one pair of input and output time windows includes a real-time window pair.
In some embodiments, the measured data contains a plurality of input features. The method further contains a step of determining which one of the plurality of input features is more important than another one by conducting a permutation importance analysis.
In some embodiments, the plurality of input features contains one or more of temperature, RH, concentrations of CO2, total volatile organic compounds (TVOCs), PM2.5 and PM10.
In some embodiments, the plurality of input features contains concentrations of more than one type of biological matter.
According to another aspect of the invention, there is provided an apparatus for predicting concentration of indoor bioaerosols. The apparatus includes one or more processors; and a memory storing computer-executable instructions that, when executed, cause the one or more processors to provide a plurality of AI models; evaluate a prediction accuracy of each of the plurality of AI models for a venue; choose a best model from the plurality of AI models for the venue; input measured data at the venue into the best model; and generate a prediction of concentration of indoor bioaerosols by the best model for the venue.
According to yet another aspect of the invention, there is provided a non-transitory computer readable medium, which contains executable instructions that, when executed by at least one processor, direct the at least one processor to perform a method. The method includes providing a plurality of AI models; evaluating a prediction accuracy of each of the plurality of AI models for a venue; choosing a best model from the plurality of AI models for the venue; inputting measured data at the venue into the best model; and generating a prediction of concentration of indoor bioaerosols by the best model for the venue.
One can see that exemplary embodiments of the invention provide a method for predicting real-time and near-future concentrations of indoor bioaerosols with AI models, which enables accurate monitoring and prediction of the indoor concentration of bioaerosols. The method may generate a suitable AI model for predicting the concentration of bioaerosols in various indoor venues by training the model with the IAQ data collected in those venues. The AI model can render predictions of the concentrations of indoor bioaerosols (such as bacteria, fungi, and pollen) using only specific IAQ sensor data (such as temperature, relative humidity, carbon dioxide, total volatile organic compounds, PM2.5 and PM10) as input features. Before the AI models are trained, the training dataset with the input features is first prepared in a data-processing step and then fed into multiple different AI models, which can produce the real-time or near-future indoor concentrations of bioaerosols as outputs. Using a specific set of evaluation metrics, the most suitable AI model is chosen for each testing location. Also, by specifying different time lengths of historical input features, the AI model can forecast the indoor concentrations of bioaerosols into the future (e.g., up to 60 minutes). The method provides a viable solution for industry and the general public to obtain information on indoor bioaerosols with commonly available IAQ sensors, and to create a better indoor environment to protect human health.
The foregoing summary is neither intended to define the invention of the application, which is measured by the claims, nor is it intended to be limiting as to the scope of the invention in any way.
The foregoing and further features of the present invention will be apparent from the following description of embodiments which are provided by way of example only in connection with the accompanying figures, of which:
1 shows linear regression (in solid line) of the measured and LSTM model-predicted values for two target features for the commercial office, for the time window of real-time prediction.
2 shows linear regression (in solid line) of the measured and LSTM model-predicted values for another two target features for the commercial office, for the time window of real-time prediction.
3 shows, for the commercial office, linear regression (in solid line) of the measured and LSTM model-predicted values for another target feature for the time window of real-time prediction, and linear regression (in solid line) of the measured and LSTM model-predicted values for a target feature for the time window of 60-60.
4 shows linear regression (in solid line) of the measured and LSTM model-predicted values for another two target features for the commercial office, for the time window of 60-60.
5 shows linear regression (in solid line) of the measured and LSTM model-predicted values for another two target features for the commercial office, for the time window of 60-60.
1 shows linear regression (in solid line) of the measured and LSTM model-predicted values for two target features for the shopping mall, for the time window of real-time prediction.
2 shows linear regression (in solid line) of the measured and LSTM model-predicted values for another two target features for the shopping mall, for the time window of real-time prediction.
3 shows, for the shopping mall, linear regression (in solid line) of the measured and LSTM model-predicted values for another target feature for the time window of real-time prediction, and linear regression (in solid line) of the measured and LSTM model-predicted values for a target feature for the time window of 60-60.
4 shows linear regression (in solid line) of the measured and LSTM model-predicted values for another two target features for the shopping mall, for the time window of 60-60.
5 shows linear regression (in solid line) of the measured and LSTM model-predicted values for another two target features for the shopping mall, for the time window of 60-60.
1 shows linear regression (in solid line) of the measured and the LSTM model-predicted values for two target features for the commercial office for the time windows of 10-5.
2 shows linear regression (in solid line) of the measured and the LSTM model-predicted values for another two target features for the commercial office for the time windows of 10-5.
3 shows, for the commercial office, linear regression (in solid line) of the measured and the LSTM model-predicted values for another target feature for the time windows of 10-5, and linear regression (in solid line) of the measured and the LSTM model-predicted values for a target feature for the time windows of 30-15.
4 shows linear regression (in solid line) of the measured and the LSTM model-predicted values for another two target features for the commercial office for the time windows of 30-15.
5 shows linear regression (in solid line) of the measured and the LSTM model-predicted values for another two target features for the commercial office for the time windows of 30-15.
6 shows linear regression (in solid line) of the measured and the LSTM model-predicted values for two target features for the commercial office for the time windows of 60-30.
7 shows linear regression (in solid line) of the measured and the LSTM model-predicted values for another two target features for the commercial office for the time windows of 60-30.
8 shows linear regression (in solid line) of the measured and the LSTM model-predicted values for another target feature for the commercial office for the time windows of 60-30.
1 shows linear regression (in solid line) of the measured and the LSTM model-predicted values for two target features for the shopping mall for the time window of 10-5.
2 shows linear regression (in solid line) of the measured and the LSTM model-predicted values for another two target features for the shopping mall for the time windows of 10-5.
3 shows, for the shopping mall, linear regression (in solid line) of the measured and the LSTM model-predicted values for another target feature for the time windows of 10-5, and linear regression (in solid line) of the measured and the LSTM model-predicted values for a target feature for the time windows of 30-15.
4 shows linear regression (in solid line) of the measured and the LSTM model-predicted values for another two target features for the shopping mall for the time windows of 30-15.
5 shows linear regression (in solid line) of the measured and the LSTM model-predicted values for another two target features for the shopping mall for the time windows of 30-15.
6 shows linear regression (in solid line) of the measured and the LSTM model-predicted values for two target features for the shopping mall for the time windows of 60-30.
7 shows linear regression (in solid line) of the measured and the LSTM model-predicted values for another two target features for the shopping mall for the time windows of 60-30.
8 shows linear regression (in solid line) of the measured and the LSTM model-predicted values for another target feature for the shopping mall for the time windows of 60-30.
1 illustrates plots of the measured and LSTM model-predicted values for three target features for the commercial office time-series dataset for the time window of real-time prediction.
2 illustrates plots of the measured and LSTM model-predicted values for another two target features for the commercial office time-series dataset for the time window of real-time prediction.
1 illustrates plots of the measured and LSTM model-predicted values for three target features for the commercial office time-series dataset for the 10-5 time windows.
2 illustrates plots of the measured and LSTM model-predicted values for another two target features for the commercial office time-series dataset for the 10-5 time windows.
1 illustrates plots of the measured and LSTM model-predicted values for three target features for the commercial office time-series dataset for the 30-15 time windows.
2 illustrates plots of the measured and LSTM model-predicted values for another two target features for the commercial office time-series dataset for the 30-15 time windows.
1 illustrates plots of the measured and LSTM model-predicted values for three target features for the commercial office time-series dataset for the 60-30 time windows.
2 illustrates plots of the measured and LSTM model-predicted values for another two target features for the commercial office time-series dataset for the 60-30 time windows.
1 illustrates plots of the measured and LSTM model-predicted values for three target features for the commercial office time-series dataset for the 60-60 time windows.
2 illustrates plots of the measured and LSTM model-predicted values for another two target features for the commercial office time-series dataset for the 60-60 time windows.
1 illustrates plots of the measured and LSTM model-predicted values for three target features for the shopping mall time-series dataset for the time windows of real-time prediction.
2 illustrates plots of the measured and LSTM model-predicted values for another two target features for the shopping mall time-series dataset for the time windows of real-time prediction.
1 illustrates plots of the measured and LSTM model-predicted values for three target features for the shopping mall time-series dataset for the 10-5 time windows.
2 illustrates plots of the measured and LSTM model-predicted values for another two target features for the shopping mall time-series dataset for the 10-5 time windows.
1 illustrates plots of the measured and LSTM model-predicted values for three target features for the shopping mall time-series dataset for the 30-15 time windows.
2 illustrates plots of the measured and LSTM model-predicted values for another two target features for the shopping mall time-series dataset for the 30-15 time windows.
1 illustrates plots of the measured and LSTM model-predicted values for three target features for the shopping mall time-series dataset for the 60-30 time windows.
2 illustrates plots of the measured and LSTM model-predicted values for another two target features for the shopping mall time-series dataset for the 60-30 time windows.
1 illustrates plots of the measured and LSTM model-predicted values for three target features for the shopping mall time-series dataset for the 60-60 time windows.
2 illustrates plots of the measured and LSTM model-predicted values for another two target features for the shopping mall time-series dataset for the 60-60 time windows.
As will be described in detail below, in a first embodiment of the invention machine learning and deep learning models are developed which can accurately predict continuous real-time concentrations of bioaerosols using input from a typical IAQ sensor that measures the physical and chemical properties of indoor air. The models are then trained and their performance tested using data that are obtained by measuring the physical, chemical and biological characteristics of the indoor air in an operating commercial office and a shopping mall. In addition, hyperparameters of the models are optimized, and the effect on the output data of using various time windows of past data as inputs is explored. In one exemplary configuration, the best model determined the per-minute concentration of bioaerosols up to 60 min into the future with ˜60%-80% accuracy. This constitutes a practical and economical strategy for assessing indoor concentrations of bioaerosols to facilitate the protection of human health.
Next,
Field Sampling
For the field sampling step 32, two different IAQ sensors are used together to collect the IAQ data necessary for the later AI model development 36. In this example, the IAQ data are collected from an operating commercial office and a shopping mall. The measured data from the IAQ sensors are processed and curated prior to input into the AI models, as will be described in more detail later. The first IAQ sensor could be the IAQ sensor 20 in
In one implementation, a real-time fluorescence-based aerosol cytometer (InstaScope, Boulder, CO, USA), serving as the second sensor, is operated at a flow rate of 0.85 L/min to identify bioaerosols based on the light-scattering and fluorescence spectra of airborne particles. Three fluorescence channels are used in tandem to classify airborne particles: channel A (excitation=280 nm, emission=310-400 nm), channel B (excitation=280 nm, emission=420-650 nm) and channel C (excitation=370 nm, emission=420-650 nm).16 A particle with a fluorescence intensity that exceeded the instrument's intrinsic noise baseline by three standard deviations was classified into one of the following four fluorescent-type categories: A, AB, BC or ABC.16 A particle was classified as either bacteria-like, fungi-like or pollen-like by analysis of its fluorescence and optical size properties, according to a previous study15 (see Table 1 below). Despite the merits of an ultraviolet light/laser-induced fluorescence instrument, there are uncertainties attached to the assignment of fluorescent particles to biological matter.18 For simplicity of description, this study refers to bacteria-like, fungi-like, and pollen-like particles as bacteria, fungi, and pollen, respectively. A custom R script was used to process the raw data generated. The physical and chemical properties of indoor air are measured using a commercial-grade IAQ sensor (Kaiterra Ltd., Mollens, Switzerland) as the first sensor. Six parameters are measured per minute: temperature, RH, and the concentrations of CO2, TVOCs, PM2.5 and PM10. All instruments are calibrated prior to use.
aCategory ABC was classified as pollen only, not both pollen and fungi as indicated in Hernandez et al. (2016)15.
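By way of illustration only, the fluorescence-type assignment described above can be sketched in Python as follows. The sketch assumes that "exceeding the baseline by three standard deviations" means an intensity above the baseline mean plus three standard deviations in a given channel; the function name, the dictionary layout and all numerical values are hypothetical. The further assignment of a fluorescence type to bacteria-like, fungi-like or pollen-like particles also depends on optical size (Table 1) and is not reproduced here.

```python
def fluorescence_type(intensity, baseline_mean, baseline_sd, n_sd=3.0):
    """Return the fluorescence type of one particle, e.g. "A", "AB" or "ABC".

    Each argument is a dict keyed by channel name ("A", "B", "C").
    An empty string means the particle is non-fluorescent in all channels.
    """
    label = ""
    for channel in ("A", "B", "C"):
        # Assumed threshold: baseline mean plus n_sd standard deviations.
        threshold = baseline_mean[channel] + n_sd * baseline_sd[channel]
        if intensity[channel] > threshold:
            label += channel
    return label


# The four categories named in the description; other labels are not assigned.
REPORTED_TYPES = {"A", "AB", "BC", "ABC"}

# Hypothetical baseline statistics (arbitrary intensity units).
baseline_mean = {"A": 10.0, "B": 12.0, "C": 8.0}
baseline_sd = {"A": 2.0, "B": 3.0, "C": 1.0}   # thresholds: A=16, B=21, C=11

particle_1 = {"A": 30.0, "B": 25.0, "C": 9.0}   # above threshold in A and B only
particle_2 = {"A": 15.0, "B": 20.0, "C": 10.0}  # below threshold everywhere

type_1 = fluorescence_type(particle_1, baseline_mean, baseline_sd)
type_2 = fluorescence_type(particle_2, baseline_mean, baseline_sd)
```

A particle whose label falls outside `REPORTED_TYPES` (for example, an empty label) would simply be left unclassified in this sketch.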
As a practical case, the indoor air of a typical commercial office (from April 26 to May 25, 2021, and from June 2 to 14, 2021) and a shopping mall (from Dec. 20, 2021, to Jan. 12, 2022) in Hong Kong were measured every minute for 24 h a day during the above-stated sampling periods. The commercial office was open-plan, ˜1,400 m2 and had ˜250 people sitting in rows of desks without partitions. The staff office hours were 08:30 to 17:30 Monday to Friday and the heating, ventilation and air conditioning (HVAC) systems operated from 07:00 to 19:00 on weekdays. Separate measurements were made in the office at each of three different locations on at least 8 consecutive weekdays. Two sampling locations were close to the middle of the office, with one adjacent to many occupants and the other adjacent to fewer occupants, while the third location was near the back of the office, away from the seating area. The data from the three locations were subsequently combined into a single representative dataset for downstream analysis. In the shopping mall, the store hours were 11:00 to 22:00 7 days a week and the HVAC systems operated from 10:00 to 22:00. Measurements were made at a single location on a floor in a ˜2,000-m2 section that housed individual shops and a children's playground for 21 consecutive days spanning weekdays and weekends. At the sampling location, ˜100 people passed by on average per hour and the occupancy on weekends was ˜50% higher than that on weekdays.
Data Processing
After the measured data from the IAQ sensors are obtained in Step 32, in the data processing step 34 the measured data are processed and curated prior to input into the AI models.
Development of AI Models
After the measured data has been processed, the method in
In more detail, in this implementation, model training and evaluation are all conducted using Python (v. 3.6.15) on a typical workstation computer (OS: Ubuntu 20.10, CPU: Intel® Xeon® E-2136 with 64 GB of memory) in a Linux environment. Four machine learning models are developed: three models (a linear regression model,23 a lasso regression model24,25 and an RF26 model) are developed using the scikit-learn32 package (v 0.24.2), while one model (an extreme gradient boosting (XgBoost)27 model) is developed using the package xgboost27 (v 1.4.0). The linear regression and lasso regression models are linear models; the lasso regression model performs L1-regularization to penalize the magnitude of coefficients.25 The RF and XgBoost models are non-linear tree-based models; every internal node of a tree is labeled with a criterion on an input feature that leads to subordinate decision nodes with further criteria, while each leaf of a tree returns a regression value as an output.
Three deep learning models (an MLP model,28 an LSTM model29,30 and an RNN model30) are developed using Keras54 (v 2.3.1) in the Python package TensorFlow55 (v 1.14) and optimized using the Adam optimizer.56 All three models are composed of three types of layers: input, hidden and output layers. The hidden layers in an MLP model consist of several nodes, with each being responsible for computing a mathematical function. The MLP model outputs are defined by a weighted calculation between all of its neural nodes. A similar hidden layer is also present in an LSTM model and an RNN model; however, an LSTM model also contains a specific layer to maintain weighted past records in memory for a long period of time, whereas an RNN model contains a specific layer to maintain the last input parameters in memory.
The sets of hyperparameters (see Table 2 below) for each model were optimized during model training by a grid search57 function with five-fold cross validation provided by the scikit-learn package. This function iterated over all combinations of the hyperparameters to configure and train each model and thus returned an optimized model with a set of hyperparameters (see
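For reference, the L1-regularized objective minimized by the `Lasso` estimator of scikit-learn can be written as below, where X is the matrix of input features, y is the vector of a target feature, w is the coefficient vector, n is the number of samples and α is the regularization strength. These symbols are introduced here for illustration only. Setting α to zero recovers ordinary least-squares linear regression, which is why the two models are grouped together as linear models.

```latex
\min_{w}\; \frac{1}{2n}\,\lVert y - Xw \rVert_2^2 \;+\; \alpha\,\lVert w \rVert_1
```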
a N/A: Not applicable.
b A hyperparameter for the Adam optimizer. This hyperparameter was optimized only in the model training for real-time prediction; the default value (0.9) was used for the other time windows.
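By way of illustration only, a grid search with five-fold cross validation can be set up with the `GridSearchCV` class of scikit-learn as sketched below. The synthetic data, the choice of an RF model, the two-by-two hyperparameter grid and the scoring metric are assumptions made for the sketch; the hyperparameter sets actually searched are those of Table 2.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for six IAQ input features and one target feature.
X, y = make_regression(n_samples=120, n_features=6, noise=0.1, random_state=0)

# Hypothetical grid, for illustration only (not the values of Table 2).
param_grid = {"n_estimators": [10, 20], "max_depth": [2, 4]}

search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=5,                              # five-fold cross validation
    scoring="neg_mean_squared_error",  # assumed metric; higher is better
)
search.fit(X, y)  # trains every combination on every fold, then refits the best

best_params = search.best_params_      # best combination of hyperparameters
best_model = search.best_estimator_    # model refit on all of X, y
n_candidates = len(search.cv_results_["params"])  # 2 x 2 combinations
```

Because every combination is trained once per fold, the cost grows with the product of the grid sizes, which is one reason to keep each grid small.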
In addition, five different combinations of time windows are investigated, each combination being a pair of an input time window of past measured data used as input features and an output time window of predicted data for the target features. The time window of past measured data constrained how far back into the past the values for input features (including the current moment) should be obtained for use in forecasting, while the time window of future data constrained how far into the future values of the target features are forecasted. To obtain accurate predictions, a time window that was longer than or the same length as the output data was adopted for the input data. The five different combinations of time windows tested are (i) real-time prediction, (ii) a 10-min input and a 5-min output (abbreviated as “10-5”), (iii) a 30-min input and a 15-min output (“30-15”), (iv) a 60-min input and a 30-min output (“60-30”) and (v) a 60-min input and a 60-min output (“60-60”). As an example, for “10-5,” the measured input features of the current minute and the preceding 9 min (i.e., 10 min in total) are used to forecast the target features in the subsequent 5 min. For real-time prediction, the measured real-time input features are used to forecast the target features at the same moment in time. Data for both the measured input and predicted target features are set to have a time interval of 1 min.
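By way of illustration only, the pairing of input and output time windows can be sketched in Python as follows. The function names are hypothetical, and the toy series, in which each value equals its minute index, is used only to make the window boundaries visible.

```python
def make_windows(inputs, targets, n_in, n_out):
    """Pair each n_in-minute input window with the next n_out-minute target window.

    inputs, targets: per-minute sequences of equal length (one row per minute).
    Returns a list of (input_window, target_window) tuples.
    """
    if len(inputs) != len(targets):
        raise ValueError("inputs and targets must cover the same minutes")
    pairs = []
    # t indexes the current minute, i.e. the last row of the input window.
    for t in range(n_in - 1, len(inputs) - n_out):
        past = inputs[t - n_in + 1 : t + 1]       # current minute and n_in - 1 before it
        future = targets[t + 1 : t + 1 + n_out]   # the subsequent n_out minutes
        pairs.append((past, future))
    return pairs


def make_realtime_pairs(inputs, targets):
    """Real-time case: inputs at a minute are paired with targets at the same minute."""
    return list(zip(inputs, targets))


minutes = list(range(20))  # toy series: value == minute index
pairs_10_5 = make_windows(minutes, minutes, n_in=10, n_out=5)
first_past, first_future = pairs_10_5[0]
```

In this toy case the first pair uses minutes 0 to 9 as input and minutes 10 to 14 as output, and a 20-minute series yields six such pairs.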
Evaluation of Predictive Accuracy of Models
In
In particular, the differences between the measured values and those predicted by each model were evaluated to determine each model's predictive accuracy, in terms of its MSE, RMSE and/or value on a revised version of the WI31. If a model has the best difference data as compared to other models, then the model is determined as the best model for the given type of indoor venue. The MSE and RMSE are computed using the package scikit-learn32 (v 0.24.2), and the WI value (between 0 and 1, with a higher value indicating a more accurate prediction) was determined, using the following equations respectively:
where n is the total number of data points, ŷi is the i-th predicted value, yi is the i-th measured value and
Unlike the original WI, which squares the errors prior to summation, the revised WI does not over-weight the influence of errors on the sum-of-squared errors31 and thus is less sensitive to errors concentrated in outliers and can better differentiate well-performing models.31 A custom script was used to calculate the revised WI.
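By way of illustration only, the three metrics can be computed as in the following Python sketch. The MSE and RMSE follow their standard definitions. The form used for the revised WI, namely one minus the ratio of the summed absolute errors to the summed absolute deviations of the predicted and measured values from the measured mean, is an assumption chosen to match the properties stated above (absolute rather than squared errors, and values between 0 and 1). The function names are hypothetical.

```python
import math


def mse(measured, predicted):
    """Mean squared error: mean of (predicted - measured)^2."""
    n = len(measured)
    return sum((p - m) ** 2 for m, p in zip(measured, predicted)) / n


def rmse(measured, predicted):
    """Root-mean-square error: square root of the MSE."""
    return math.sqrt(mse(measured, predicted))


def revised_wi(measured, predicted):
    """Assumed absolute-error form of the Willmott index (1 = perfect agreement)."""
    mean_measured = sum(measured) / len(measured)
    abs_error = sum(abs(p - m) for m, p in zip(measured, predicted))
    potential_error = sum(
        abs(p - mean_measured) + abs(m - mean_measured)
        for m, p in zip(measured, predicted)
    )
    if potential_error == 0:
        raise ValueError("index undefined when all values equal the measured mean")
    return 1.0 - abs_error / potential_error


# Toy values for illustration only.
measured = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 4.0]
scores = (mse(measured, predicted), rmse(measured, predicted),
          revised_wi(measured, predicted))
```

For these toy values the summed absolute error is 1 and the summed potential error is 5, so the index is 0.8, while the MSE is 1/3.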
In addition, all statistical analyses are performed using Python. Pairwise Pearson's correlations between IAQ parameters were calculated using the package SciPy33 (v 1.5.2). Linear regressions of the measured and predicted values were computed using the package statsmodels34 (v 0.12.0). Permutation importance,35 which represents the importance of each input feature in an AI model, was analyzed using the package scikit-learn with the default number of permutations.
The IAQ of the two venues in the example (i.e., the commercial office and the shopping mall) is now discussed. The daily and hourly average values of nine parameters were analyzed to assess the physical, chemical and biological profiles of indoor air in a commercial office (
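By way of illustration only, the two simpler statistics mentioned above, Pearson's correlation and the least-squares line relating measured and predicted values, can be written without external packages as follows. This dependency-free sketch is a stand-in for, not a reproduction of, the SciPy and statsmodels routines; the function names and the toy values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Toy values for illustration only: predictions offset from measurements by 0.5.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 3.5, 4.5]
r = pearson_r(measured, predicted)
slope, intercept = fit_line(measured, predicted)
```

A slope near 1 with an intercept near 0 indicates predictions that track the measurements without systematic bias; in the toy case the constant offset appears as an intercept of 0.5.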
Performance of the AI Models in Predicting IAQ, and Choosing a Best Model
The results of the analysis of the predictive accuracy of the models are now described. The ability of the AI models to determine the target features in various future time windows from various time windows of measured data is evaluated using testing datasets based on the WI, MSE and RMSE (see
For the commercial office, the predictive accuracy of the linear models (the linear regression and lasso regression models) was poor, regardless of the time windows of data, with an average WI consistently less than 0.62 (see Table 4 below). In contrast, all of the non-linear models exhibited superior performances, with the LSTM model having the highest average predictive accuracy for all of the target features (WI=0.75-0.76) in three of the five time windows tested (i.e., 10-5, 30-15 and 60-30) and the XgBoost and RF models generating the most accurate values for the real-time prediction (WI=0.78) and 60-60 (WI=0.75) time windows, respectively. Similarly, for all of the target features, the LSTM model had the lowest average MSE and RMSE in the 10-5, 30-15 and 60-30 time windows, and the RF model had the lowest average MSE and RMSE for the real-time prediction and 60-60 time windows (see
0.75 ± 0.06
0.78 ± 0.09
0.76 ± 0.05
0.75 ± 0.05
0.75 ± 0.06
0.75 ± 0.05
0.82 ± 0.07
0.80 ± 0.05
0.80 ± 0.05
0.82 ± 0.07
0.80 ± 0.05
0.80 ± 0.05
0.80 ± 0.04
For the shopping mall, all of the non-linear models except the XgBoost model consistently yielded a more accurate average prediction for all of the target features than the linear models in all of the time windows tested (WI>0.77). The LSTM model afforded the most accurate average prediction (WI=0.80-0.82) in three of the five time windows (i.e., real-time prediction, 60-30 and 60-60) and a similar average accuracy to the RF and RNN models in the other two time windows (see Table 4 and
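By way of illustration only, the comparison described for the two venues amounts to averaging a metric over the target features for each model and keeping the model with the best average. The Python sketch below ranks models by average WI, using the same assumed absolute-error form of the index as a stand-in; the model names, target names and all numbers are toy values, not measurements or results from either venue.

```python
def revised_wi(measured, predicted):
    """Assumed absolute-error form of the Willmott index (1 = perfect agreement)."""
    mean_measured = sum(measured) / len(measured)
    abs_error = sum(abs(p - m) for m, p in zip(measured, predicted))
    potential_error = sum(
        abs(p - mean_measured) + abs(m - mean_measured)
        for m, p in zip(measured, predicted)
    )
    return 1.0 - abs_error / potential_error


def choose_best_model(measured_by_target, predictions_by_model):
    """Return (name of best model, {model name: WI averaged over target features})."""
    averages = {}
    for model_name, predicted_by_target in predictions_by_model.items():
        scores = [
            revised_wi(measured_by_target[target], predicted_by_target[target])
            for target in measured_by_target
        ]
        averages[model_name] = sum(scores) / len(scores)
    best_name = max(averages, key=averages.get)  # higher WI is better
    return best_name, averages


# Toy test data for two target features (illustration only).
measured = {"bacteria": [10.0, 12.0, 15.0, 11.0], "fungi": [3.0, 4.0, 6.0, 5.0]}
predictions = {
    "model_a": {"bacteria": [10.5, 12.5, 14.0, 11.0], "fungi": [3.0, 4.5, 5.5, 5.0]},
    # model_b always predicts the measured mean, which scores zero on this index.
    "model_b": {"bacteria": [12.0, 12.0, 12.0, 12.0], "fungi": [4.5, 4.5, 4.5, 4.5]},
}
best_name, average_scores = choose_best_model(measured, predictions)
```

The same selection could instead minimize the average MSE or RMSE; repeating it for each pair of time windows shows whether one model is best across all of them.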
Permutation importance analysis of input features was conducted using the testing datasets to determine how important each input feature was to the ability of the LSTM model to predict the five target features. A positive feature permutation importance indicates that the feature generates a reduction in predictive accuracy after permutation, while a negative permutation feature importance indicates that the feature has no effect on the accuracy after permutation.36 In the commercial office, RH and the concentrations of PM2.5 and PM10 were the three most important features for the accuracy of real-time predictions, while temperature and the concentration of TVOCs were not important features (see Table 5 below). In addition, temperature was consistently among the top 10 most important features for the accuracy of predictions for the 10-5, 60-30 and 60-60 time windows, and the concentration of TVOCs was related to the top 10 most important features for the 30-15 time window. As in the commercial office, in the shopping mall RH and the concentrations of PM2.5 and PM10 were the top three features for real-time predictive accuracy (see Table 6 below). However, unlike in the commercial office, in the shopping mall both RH and temperature were among the top 10 most important features that contributed to the accuracy of predictions for the other time windows.
1The number at the end of each input feature indicates the time in minutes from the current time.
2The mean and standard deviation of the permutation feature importance after five rounds of permutation are shown.
1The number at the end of each input feature indicates the time in minutes from the current time.
2The mean and standard deviation of the permutation feature importance after five rounds of permutation are shown.
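By way of illustration only, the mechanism behind permutation importance can be sketched in plain Python as follows: one input feature at a time is shuffled, the error is recomputed, and the mean and standard deviation of the increase in error over five rounds are reported. The toy model, the data and the use of MSE as the error measure are assumptions made for the sketch; it is not the scikit-learn routine used in the analysis.

```python
import random


def mse(measured, predicted):
    return sum((p - m) ** 2 for m, p in zip(measured, predicted)) / len(measured)


def permutation_importance(predict, X, y, n_rounds=5, seed=0):
    """Return one (mean, standard deviation) of the error increase per feature.

    predict: function mapping one row (list of feature values) to a prediction.
    X: list of rows; y: list of measured values.
    """
    rng = random.Random(seed)
    baseline = mse(y, [predict(row) for row in X])
    results = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_rounds):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the target
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(y, [predict(row) for row in X_perm]) - baseline)
        mean = sum(increases) / n_rounds
        sd = (sum((d - mean) ** 2 for d in increases) / n_rounds) ** 0.5
        results.append((mean, sd))
    return results


# Toy model that uses feature 0 and ignores feature 1 (illustration only).
def toy_predict(row):
    return 3.0 * row[0]


X = [[float(i), float((7 * i) % 5)] for i in range(12)]
y = [toy_predict(row) for row in X]
importances = permutation_importance(toy_predict, X, y)
```

Shuffling the ignored feature leaves the error unchanged, so its importance is exactly zero, whereas shuffling the feature the toy model relies on increases the error and gives a positive importance.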
Prediction of Time Series Data
Lastly, because the trained LSTM model was determined to be the best model for both venue types (the commercial office and the shopping mall), the model is then applied in the actual monitoring and forecasting operation in Steps 41 and 43 (see
In one example, the ability of the trained LSTM model to make long-term continuous predictions for the commercial office (see
One can see that in the above embodiment, AI models are developed using physical and chemical data from an indoor air quality sensor and physical data from an ultraviolet light/laser-induced fluorescence instrument that measures bioaerosols. This enables effective determination of the concentrations of bioaerosols (bacteria, fungi and pollen) and 2.5-μm and 10-μm particulate matter (PM2.5 and PM10) on a real-time and near-future (≤60 min) basis. Seven AI models are developed and evaluated using measured data from an operating commercial office and a shopping mall. The long short-term memory model required a relatively short training time and gave the highest prediction accuracy of ˜60%-80% for bioaerosols and ˜90% for PM on the testing and time series datasets from the two venues. The AI-based method for monitoring and predicting indoor concentrations of bioaerosols provides important information to building operators in an effective and economical manner, enabling them to optimally manage indoor environmental quality.
In summary of the above embodiment, an LSTM model was developed and shown to determine real-time and near-future concentrations of indoor bioaerosols and PM with an accuracy of ˜60%-80% and ˜90%, respectively, for both testing and time series datasets. It was expected that the predictive accuracy for PM would be relatively high, as past PM data were used in the model. The ability of the model to continuously determine the real-time or near-future concentration of indoor bioaerosols relied on the deployment of an ultraviolet light/laser-induced fluorescence instrument to acquire the high temporal resolution biological data required for model training, as these data cannot be obtained by traditional culture-based methods. In comparison with the predictive accuracies that have been reported for concentrations of PM (˜80%-90%)37,38 and bacteria and fungi (based on culturing measurements; ˜50%-80%),22,39 the predictive accuracies for concentrations of these analytes generated by our LSTM model are similar or higher. As such, the embodiment above has demonstrated that AI models can use real-time IAQ sensor data to determine real-time and near-future concentrations of indoor bioaerosols and PM with relatively high accuracy. Given that preventing disease transmission in indoor environments is a top priority, there is an urgent need for low-cost technology that can accurately monitor IAQ, especially the concentrations of airborne bioaerosols.52 It has been shown that this can be achieved with a commercial IAQ sensor and an AI model, which together could form a bioaerosol monitoring and forecasting system for large indoor environments. Such forecasting acts as an early-warning system for high bioaerosol concentrations, and would enable remedial actions (e.g., increasing fresh air supply) to be taken to maintain IAQ.
The exemplary embodiments are thus fully described. Although the description referred to particular embodiments, it will be clear to one skilled in the art that the invention may be practiced with variation of these specific details. Hence this invention should not be construed as limited to the embodiments set forth herein.
While the embodiments have been illustrated and described in detail in the drawings and foregoing description, the same is to be considered as illustrative and not restrictive in character, it being understood that only exemplary embodiments have been shown and described and do not limit the scope of the invention in any manner. It can be appreciated that any of the features described herein may be used with any embodiment. The illustrative embodiments are not exclusive of each other or of other embodiments not recited herein. Accordingly, the invention also provides embodiments that comprise combinations of one or more of the illustrative embodiments described above. Modifications and variations of the invention as herein set forth can be made without departing from the spirit and scope thereof, and, therefore, only such limitations should be imposed as are indicated by the appended claims.
The functional units and modules of the systems and methods in accordance with the embodiments disclosed herein may be implemented using computing devices, computer processors, or electronic circuitries including but not limited to application-specific integrated circuits (ASIC), field programmable gate arrays (FPGA), and other programmable logic devices configured or programmed according to the teachings of the present disclosure. Computer instructions or software codes running in the computing devices, computer processors, or programmable logic devices can readily be prepared by practitioners skilled in the software or electronic art based on the teachings of the present disclosure.
All or portions of the methods in accordance with the embodiments may be executed in one or more computing devices including server computers, personal computers, laptop computers, and mobile computing devices such as smartphones and tablet computers.
The embodiments include computer storage media and transient and non-transient memory devices having computer instructions or software codes stored therein, which can be used to program computers or microprocessors to perform any of the processes of the present invention. The storage media and the transient and non-transitory computer-readable storage media can include, but are not limited to, floppy disks, optical discs, Blu-ray Discs, DVDs, CD-ROMs, magneto-optical disks, ROMs, RAMs, flash memory devices, or any type of media or devices suitable for storing instructions, codes, and/or data.
Each of the functional units and modules in accordance with various embodiments also may be implemented in distributed computing environments and/or Cloud computing environments, wherein the whole or portions of machine instructions are executed in a distributed fashion by one or more processing devices interconnected by a communication network, such as an intranet, WAN, LAN, the Internet, and other forms of data transmission medium.
In the preferred embodiments mentioned above, seven models were developed initially and the LSTM model was determined to be the best model for both commercial offices and shopping malls. However, neither the number of models nor the choice of the LSTM model is intended to be limiting. In variations of the embodiments, a different number of models and/or types of models other than the seven exemplified above can be used, and the best model can differ from the LSTM model, in particular if the venue type is not a commercial office or a shopping mall. The invention is not confined to any particular candidate models or best models; rather, the approach of providing multiple candidate models and choosing the best one for each type of venue based on prediction accuracy is one of the essential features of the invention.
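By way of a non-limiting illustration, the selection of a best model for each venue type from multiple candidate models may be sketched as follows. The function name `select_best_models`, the model labels and the accuracy values in the usage example are hypothetical placeholders introduced for illustration only; they are not results of the embodiment.

```python
from typing import Dict, Tuple


def select_best_models(
    accuracy: Dict[str, Dict[str, float]],
) -> Dict[str, Tuple[str, float]]:
    """For each venue type, pick the candidate model with the highest accuracy.

    `accuracy` maps venue type -> {candidate model name -> prediction accuracy}.
    """
    best = {}
    for venue, per_model in accuracy.items():
        # max() over the model names, ranked by their accuracy for this venue.
        name = max(per_model, key=per_model.get)
        best[venue] = (name, per_model[name])
    return best


if __name__ == "__main__":
    # Placeholder accuracies for two hypothetical candidates (not measured data).
    placeholder = {
        "commercial office": {"model_a": 0.70, "model_b": 0.80},
        "shopping mall": {"model_a": 0.90, "model_b": 0.60},
    }
    for venue, (name, acc) in select_best_models(placeholder).items():
        print(f"{venue}: {name} ({acc:.2f})")
```

Because the selection is made independently for each venue type, different venue types may end up with different best models, consistent with the variations described above.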
Also, in the preferred embodiments mentioned above, five different combinations of time windows of past measured data and predicted data are discussed in both evaluation of different AI models and actual monitoring/forecasting of indoor bioaerosol concentrations. However, those skilled in the art will realize that the number of time windows that can be used in the monitoring/forecasting operations is not limited to five, and the time length in each window may have a different value than what has been described above.
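By way of a non-limiting illustration, the construction of training samples for a given time window may be sketched as follows. The sketch assumes that a window such as 10-5 denotes 10 minutes of past measured data used to predict the value 5 minutes ahead, and that the data are sampled once per minute; both are assumptions made for illustration. The function name `make_windows` is hypothetical. Input features are named with a trailing number giving the time in minutes before the current time.

```python
from typing import Dict, List, Tuple


def make_windows(
    series: Dict[str, List[float]],
    target: str,
    past: int,
    horizon: int,
) -> List[Tuple[Dict[str, float], float]]:
    """Build (lagged input features, future target value) pairs.

    series  : parameter name -> measurements at one-minute intervals (assumed)
    target  : name of the parameter to be predicted
    past    : number of past minutes used as input (e.g., 10)
    horizon : number of minutes ahead to predict (e.g., 5)
    """
    n = len(series[target])
    samples = []
    # t is the index of the current time; it needs `past` points up to and
    # including itself, and `horizon` points after it.
    for t in range(past - 1, n - horizon):
        features = {}
        for name, values in series.items():
            for lag in range(past):
                # e.g., "RH_3" is the RH value measured 3 minutes before t.
                features[f"{name}_{lag}"] = values[t - lag]
        samples.append((features, series[target][t + horizon]))
    return samples
```

Changing `past` and `horizon` yields other time windows (for example 30-15 or 60-30) without altering the rest of the procedure.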
The invention may be applied to the monitoring/forecasting of bioaerosol concentrations in all types of indoor environments, and is not limited to the commercial office and the shopping mall described in the preferred embodiments above.