The present teaching generally relates to computers and, more specifically, to machine learning.
In recent decades, the ubiquitous presence of the Internet and data access in electronic forms have facilitated the advancement of various technologies, including big data analytics and machine learning. Artificial intelligence (AI) technologies and applications thereof usually rely on machine learning based on big data. For example, machine learning techniques have been used for learning preferences of users via content consumed and for forecasting specific behavior based on historic time series data. In recent years, time series forecasting has drawn substantial attention with a wide range of applications, such as forecasting sales volume and click traffic. The goal of time series forecasting is to predict future measurements of a target time series by leveraging temporal patterns identified from historical observations.
With the proliferation and success of artificial neural networks, recurrent neural networks (RNNs) have been widely adopted for capturing complex non-linear temporal dependencies. To further enhance relation extraction and representation, some research works focus on integrating additional modules or features, such as attention mechanisms and multi-resolution aggregation. Existing state-of-the-art works aim at improving what can be achieved using basic RNN-based methods by sharing temporal patterns globally across different time series. This is illustrated in
In this framework, training data from different time series may be used to train the model parameters, attempting to capture the characteristics of all of these time series. However, different time series, especially those collected from different data sources, likely exhibit very different temporal patterns. For example, daily sales for a store located downtown may follow a very different pattern than those of a store located in the suburbs. Thus, purely relying on pattern generalization across different time series and encoding their characteristics via global modeling does not work well.
In some applications, the desire is towards pattern specialization, which trains specialized model parameters using training data of that specific type of time series. This mode of operation, however, presents data deficiency problems. To achieve such customized treatment, a straightforward solution is to train a forecasting model for each target time series. However, a well-trained model, especially a model based on neural networks, tends to rely significantly on massive training data, which may not be available or accessible in real-world scenarios. Another issue has to do with long-range temporal patterns. A time series may start at any time and span variable time periods. Temporal patterns between existing observations and the ones in predictions may not be well captured by the learned model if such patterns are not observed in the target time series. For instance, forecasting for stores with one year of data is expected to be easier than for stores with less data, such as only a couple of months, as the more data we have, the more underlying temporal patterns can be identified. However, how to capture long-range historical temporal patterns remains a daunting task.
Thus, there is a need for methods and systems that address the deficiency of existing approaches.
The teachings disclosed herein relate to methods, systems, and programming for machine learning. More particularly, the present teaching relates to methods, systems, and programming for machine learning for time series forecasting.
In one example, a method implemented on a machine having at least one processor, storage, and a communication platform capable of connecting to a network is provided for machine learning for time series via hierarchical learning. First, global model parameters of a base model are learned via deep learning for forecasting time series measurements of a plurality of time series. Based on the learned base model, target model parameters of a target model are obtained by customizing the base model, wherein the target model corresponds to a specific target time series from the plurality of time series and is for forecasting time series measurements of the specific target time series.
In a different example, a system is disclosed for machine learning of time series forecasting, which comprises a general deep machine learning mechanism and a customized deep learning mechanism. The general deep machine learning mechanism is configured for deep learning global model parameters of a base model for forecasting time series measurements of a plurality of time series. The customized deep learning mechanism is configured for obtaining target model parameters of a target model by customizing the base model, wherein the target model corresponds to a target time series from the plurality of time series and is for forecasting time series measurements of the target time series.
Other concepts relate to software for implementing the present teaching. A software product, in accord with this concept, includes at least one machine-readable non-transitory medium and information carried by the medium. The information carried by the medium may be executable program code data, parameters in association with the executable program code, and/or information related to a user, a request, content, or other additional information.
In one example, a machine-readable, non-transitory and tangible medium having data recorded thereon for machine learning for time series via hierarchical learning is disclosed. First, global model parameters of a base model are learned via deep learning for forecasting time series measurements of a plurality of time series. Based on the learned base model, target model parameters of a target model are obtained by customizing the base model, wherein the target model corresponds to a specific target time series from the plurality of time series and is for forecasting time series measurements of the specific target time series.
Additional advantages and novel features will be set forth in part in the description which follows, and in part will become apparent to those skilled in the art upon examination of the following and the accompanying drawings or may be learned by production or operation of the examples. The advantages of the present teachings may be realized and attained by practice or use of various aspects of the methodologies, instrumentalities and combinations set forth in the detailed examples discussed below.
The methods, systems and/or programming described herein are further described in terms of exemplary embodiments. These exemplary embodiments are described in detail with reference to the drawings. These embodiments are non-limiting exemplary embodiments, in which like reference numerals represent similar structures throughout the several views of the drawings, and wherein:
In the following detailed description, numerous specific details are set forth by way of examples in order to facilitate a thorough understanding of the relevant teachings. However, it should be apparent to those skilled in the art that the present teachings may be practiced without such details. In other instances, well known methods, procedures, components, and/or circuitry have been described at a relatively high-level, without detail, in order to avoid unnecessarily obscuring aspects of the present teachings.
The present teaching aims to address the deficiencies of the traditional approaches to learning time series forecasting. The present teaching discloses a solution that overcomes the challenges and deficiencies of the traditional solutions via a customized time series forecasting (CTSF) framework that is able to enrich the training data by enhancing the expressiveness of encoded temporal patterns via historical patterns. In addition, model parameters learned from general time series data can be customized efficiently in the CTSF framework to remedy the problems associated with data deficiency and long-range pattern modeling. For enrichment of training data, historical patterns may be queried to enrich the pattern information and broaden the time span of the input sequence. With respect to customization, the CTSF framework enables explicit combination of generalization among time series and specialization of target time series forecasting. The CTSF framework as disclosed herein is configured with a bidirectional recurrent neural network (RNN) (such as a gated recurrent unit neural network or GRU) to encode a target time series.
The framework includes three components: a bi-GRU base model, a historical temporal pattern (HTP) graph, and a customized forecasting module. The base model is used to encode a time series. The HTP graph is used to enrich the representation of a time series based on historical time series. The customized forecasting module is configured to initially learn optimal globally-shared model parameters and then adjust such learned global model parameters to derive customized model parameters for each time series. First, the base model maps an observation sequence into an embedding vector, which expresses the underlying patterns, and then outputs predicted values based on the learned embedding. This is shown in
These concepts associated with the CTSF framework are illustrated schematically in
Before the detailed discussion, some definitions are provided first. Time series information from a source is defined as one time series, which is composed of a set of chronologically ordered observations that are collected at equally spaced time intervals. Suppose there are m time series data sources. X=(X_1, . . . , X_i, . . . , X_m) denotes the corresponding m time series and X_i=(x^i_1, . . . , x^i_t, . . . , x^i_{|X_i|}) denotes the ith time series, where |X_i| represents the number of measurements in the ith time series. Each measurement x^i_t is associated with a timestamp t. In general, different time series may vary in the number of involved measurements. The concept of time series forecasting is to infer or predict a future measurement based on or by leveraging temporal patterns from historical (previously occurring) observations. That is, given K previous observations (x^i_{t-K}, . . . , x^i_{t-1}) in the ith time series, the objective of forecasting is to predict the next measurement x^i_t of the time series at time t.
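By way of illustration only and not as a limitation, the sliding-window construction of forecasting samples described above may be sketched as follows in Python; the function name, the univariate assumption, and the toy data are assumptions made purely for illustration.

```python
import numpy as np

def make_forecasting_samples(series, K):
    """Turn one time series into (K previous observations, next measurement) pairs."""
    X, y = [], []
    for t in range(K, len(series)):
        X.append(series[t - K:t])   # the K observations preceding time t
        y.append(series[t])         # the measurement to forecast at time t
    return np.array(X), np.array(y)

# toy usage: a short daily-sales series with K=3 previous observations per sample
daily_sales = np.array([5.0, 7.0, 6.0, 9.0, 8.0, 10.0, 12.0])
X, y = make_forecasting_samples(daily_sales, K=3)   # X.shape == (4, 3), y.shape == (4,)
```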
Based on such definitions and notations, the problem of customized time series forecasting via knowledge transfer can be defined as follows. Given time series data X, the objective is to provide a customized time series forecasting model, which is formulated as:
where f_i represents the learned time series forecasting model for time series X_i.
To derive customized model parameters for forecasting with respect to a time series from a specific source in the CTSF framework, a base model may first be established using training time series data from multiple sources. The base model aims to forecast future observations by understanding the time-ordered pattern of an input sequence. It may be constructed using neural networks, e.g., a bi-directional GRU (bi-GRU) followed by several fully connected layers. For each time series i, the output representation of the bi-GRU may be represented as H^i=[h_1, . . . , h_K], with h_k=[GRU(x^i_k, h_{k-1}), GRU(x^i_k, h_{k+1})], where h_k is the hidden representation of the bi-GRU at time stamp k (the concatenation of the forward and backward GRU states). For each time stamp k, h_k is fed into several fully connected layers, which generate a predicted value ŷ_k, defined as ŷ_k=FC(h_k). The parameters of the base model f are optimized by minimizing some type of error, e.g., the mean-squared-error (MSE) loss as:
where x^i_t is the real observation at the t-th time of time series X_i.
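For illustration only, a minimal sketch of such a bi-GRU base model trained with an MSE loss is given below in Python/PyTorch; the layer sizes, variable names, toy data, and the choice to predict from the last time step's representation are assumptions and not a definitive implementation of the base model described above.

```python
import torch
import torch.nn as nn

class BiGRUBaseModel(nn.Module):
    """Minimal bi-GRU base model: encode K observations, predict the next value."""
    def __init__(self, input_dim=1, hidden_dim=64):
        super().__init__()
        # Bidirectional GRU produces h_k = [forward state; backward state].
        self.gru = nn.GRU(input_dim, hidden_dim, batch_first=True, bidirectional=True)
        # Several fully connected layers map a hidden representation to a predicted value.
        self.fc = nn.Sequential(
            nn.Linear(2 * hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, x):
        # x: (batch, K, input_dim) -- the K most recent observations of each series
        h, _ = self.gru(x)            # h: (batch, K, 2*hidden_dim)
        y_hat = self.fc(h[:, -1, :])  # predict from the last time step's representation
        return y_hat.squeeze(-1)

# Training step against an MSE loss (cf. Equation (2)).
model = BiGRUBaseModel()
loss_fn = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.randn(32, 24, 1)   # a batch of 32 windows of K=24 observations (toy data)
y = torch.randn(32)          # the next observation for each window (toy data)
optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
optimizer.step()
```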
Next, the enrichment via the HTP graph is disclosed. Based on the above definitions of the base model, the extracted hidden representations are encoded based on a fixed-length input. Because the information contained in a fixed-length input may be limited (in some situations severely limited), this configuration likely may not capture long-term temporal patterns in some situations. For example, in some situations, the observation time series used for prediction may be truncated to the latest K observations (input time steps K). In some situations, the target time series (e.g., a newly emerged time series) may span only a very short time frame. Such issues lead to failure to capture long-term temporal patterns and degrade prediction accuracy within a limited time scope. The hidden representation may be enriched by incorporating relevant historical information to address such issues. The purpose is to broaden the time scope of the input sequence by dynamically querying relevant historical information across all archived time series.
In querying/retrieving the relevant patterns, the goal is to enhance the expressiveness of h_k by using the hidden representations {h_1, . . . , h_K} to query the HTP graph. This involves three steps: query space projection, relevant historical information aggregation, and feature aggregation. In the first step, query space projection, the hidden representation H is projected into the query space, which is denoted as:
Q = H W_q + b_q  (3)
where W_q and b_q are learnable parameters. For illustration purposes, the query process described herein is in the forward direction. The backward query process is similar but uses a separate set of trainable parameters.
After the query space projection, the next step is to aggregate the queried historical patterns by relevance scores. If the representation of a vertex set in the HTP graph, e.g., the forward graph, is V^f={v^f_1, . . . , v^f_C}, the most intuitive way to get the relevant information from the HTP graph is to aggregate Q={q_1, . . . , q_K} into a single query vector (e.g., using mean pooling) and then query the HTP graph. C is the total number of vertices in the vertex set of the HTP graph. Both c and c' refer to a specific vertex in this set, such as V^f={v^f_1, . . . , v^f_{c'}, . . . , v^f_c, . . . , v^f_C}, where v^f_c is the cth vertex among a set of C vertices in total, and similarly v^f_{c'} refers to the c'th vertex. Pooling as discussed herein corresponds to an aggregation operator in neural networks, which may use some exemplary potential functions such as min, max, sum, or average/mean. For example, with "mean pooling," the query vectors {q_1, . . . , q_K} may be averaged into a single query vector.
The queried information may be aggregated by an attention mechanism as follows:
r^f = Σ_c α_c v^f_c  (4)
where the attention weight α_c is computed from the inner product ⟨q, v^f_c⟩, with ⟨.,.⟩ representing an inner product. Equation (4) defines the forward (hence the f) historical pattern vector r^f as a weighted summation of all v^f_c (hence the summation over c). The weight of the cth forward pattern graph vertex v^f_c is defined by its similarity to the query vector.
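For illustration only, a hedged sketch of this attention-based aggregation is given below; the variable names (q for the pooled query, V for the vertex matrix) and the softmax normalization of the inner-product scores are assumptions consistent with, but not mandated by, the description above.

```python
import torch
import torch.nn.functional as F

def aggregate_historical_patterns(q, V):
    """Weighted sum of HTP graph vertices in the spirit of Equation (4), forward direction.

    q: (d,)   pooled query vector derived from the projected hidden representations
    V: (C, d) representations of the C vertices in the forward HTP graph
    returns r_f: (d,) the forward historical pattern vector
    """
    scores = V @ q                      # inner product <q, v_c> for every vertex c
    weights = F.softmax(scores, dim=0)  # relevance of each vertex to the query (assumed softmax)
    r_f = weights @ V                   # weighted summation of all v_c
    return r_f

# toy usage
q = torch.randn(64)
V = torch.randn(128, 64)
r_f = aggregate_historical_patterns(q, V)  # (64,)
```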
Representation aggregation before the information query may fail to distill effective information without exploring the relevance between the representations at different time steps and the HTP graph. To reduce information loss, a graph query method is adopted that simultaneously considers interactions of q_k-q_{k'}, v^f_c-v^f_{c'}, and q_k-v^f_c, leading to three types of edge weights. The first edge weight corresponds to q_k-q_{k'}, which represents the interaction of different timestamps/steps within a time series, allowing the series to use its own past to enrich its recent past representations and vice versa to capture seasonality. Therefore, the query representations q_1, . . . , q_K are added into the HTP graph as vertices, with the corresponding edge weights defined as:
ε(q_k, q_{k'}) = σ(W^{qf}_e |q_k - q_{k'}| + b^{qf}_e)
where W^{qf}_e is a learnable parameter vector and b^{qf}_e is a learnable scalar (intercept).
The edge weight is higher/stronger if the query representations at the two time steps are similar to each other, lower/weaker if they are not.
The second edge weight is directed to the interaction between vertices v^f_c and v^f_{c'} in the historical pattern graph and relates to allowing the hidden representations of different time series in the dataset to enrich each other. This type of edge weight is defined in a similar manner:
ε(v^f_c, v^f_{c'}) = σ(W^{vf}_e |v^f_c - v^f_{c'}| + b^{vf}_e)
where the weight of the edge will be higher/stronger if the hidden representations of two different time series in the database are similar to each other, lower/weaker if they are not. W^{vf}_e is a simple learnable parameter vector, and b^{vf}_e is a learnable scalar (intercept).
The third edge weight is directed to the interaction between a query representation q_k and a graph vertex v^f_c, i.e., ε(q_k, v^f_c). It may also be defined the same way as the first and the second edge weight functions above as:
ε(q_k, v^f_c) = σ(W^{qvf}_e |q_k - v^f_c| + b^{qvf}_e)
where W^{qvf}_e is a learnable parameter vector and b^{qvf}_e is a learnable scalar (intercept).
The choice of an edge weight function may differ, and the specific form may not matter that much so long as it makes the edge weight higher when the two connected representations are more similar to each other and lower when they are less similar.
Based on the above defined edge weights, a new graph may be constructed with a vertex set including the following vertices (projected hidden representations of a particular time series as well as the vertices representing all other time series): V^f_0={q_1, . . . , q_K, v^f_1, . . . , v^f_C}.
V^f_{l+1}=ReLU(GNN(V^f_l, ε^f_0; W_l)),  (5)
where l is the layer index and W_l are the trainable parameters on the l-th layer. After the stacked GNN layers, we obtain the relevant historical pattern r^f from the K-th row of the final vertex set V^f_L.
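By way of a non-limiting sketch, one possible realization of the propagation in Equation (5) is a dense graph layer that mixes vertex vectors according to the edge weights; the sigmoid-of-absolute-difference edge function, the row normalization, and the single shared edge parameterization below are assumptions made for illustration based on the edge definitions discussed above.

```python
import torch
import torch.nn as nn

class HTPGraphLayer(nn.Module):
    """One illustrative GNN layer over the HTP graph: V_{l+1} = ReLU(E @ V_l @ W_l)."""
    def __init__(self, dim):
        super().__init__()
        self.W = nn.Linear(dim, dim, bias=False)   # trainable layer parameters W_l
        self.w_e = nn.Parameter(torch.randn(dim))  # edge-weight parameter vector (assumed shared)
        self.b_e = nn.Parameter(torch.zeros(1))    # edge-weight intercept

    def edge_weights(self, V):
        # epsilon(v_a, v_b) = sigmoid(w_e . |v_a - v_b| + b_e) for every vertex pair
        diff = (V.unsqueeze(1) - V.unsqueeze(0)).abs()       # (N, N, dim)
        E = torch.sigmoid(diff @ self.w_e + self.b_e)        # (N, N) edge weights
        return E / E.sum(dim=1, keepdim=True)                # normalize each row (assumption)

    def forward(self, V):
        E = self.edge_weights(V)
        return torch.relu(E @ self.W(V))                     # propagation in the spirit of Eq. (5)

# V_0: projected query representations of the target series plus all HTP graph vertices
V0 = torch.randn(10 + 128, 64)
layer = HTPGraphLayer(64)
V1 = layer(V0)   # enriched vertex vectors after one propagation step
```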
The third step is feature aggregation, during which the queried forward pattern vector r^f and backward pattern vector r^b are projected to the same feature space and concatenated with h_k as:
h_k' = (W_f r^f + b_f) ⊕ (W_b r^b + b_b) ⊕ h_k  (6)
where W_f, b_f, W_b, and b_b are learnable parameters and ⊕ is a concatenation operation. In this operation, h_k is replaced with h_k', which is ultimately fed into the fully connected layers of the base model for predictions.
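A brief sketch of this feature aggregation step follows; the use of linear projection layers and the specific dimensionalities are illustrative assumptions.

```python
import torch
import torch.nn as nn

class FeatureAggregator(nn.Module):
    """Project r_f and r_b into the feature space of h_k and concatenate (cf. Equation (6))."""
    def __init__(self, pattern_dim, hidden_dim):
        super().__init__()
        self.proj_f = nn.Linear(pattern_dim, hidden_dim)  # W_f, b_f
        self.proj_b = nn.Linear(pattern_dim, hidden_dim)  # W_b, b_b

    def forward(self, r_f, r_b, h_k):
        # h_k' = (W_f r_f + b_f) concat (W_b r_b + b_b) concat h_k
        return torch.cat([self.proj_f(r_f), self.proj_b(r_b), h_k], dim=-1)

# toy usage: enriched representation h_k' of dimension 128 + 128 + 128
agg = FeatureAggregator(pattern_dim=64, hidden_dim=128)
h_k_enriched = agg(torch.randn(64), torch.randn(64), torch.randn(128))
```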
Through such an aggregation operation, the time series forecasting is based on both the input sequence and relevant historical patterns, which broadens the time scope. Because any vertex has higher-weighted edges with other vertices that are more similar to it, this propagation scheme aggregates the most relevant historical information in the pattern graph utilizing these three types of edge definitions. Thus, each vertex vector in the initial graph's vertex set V^f_0 is utilized by the first hidden layer of the GNN to construct the new/enriched vertex vector representations of the next hidden graph layer, and the set of all vertex vectors after the first layer is V^f_1. Similarly, the vertex vectors of V^f_{l+1} are constructed as an aggregation over the vertex vectors of V^f_l, using the edge weights ε^f_0 and the corresponding GNN parameters W_l of the l-th layer. Ultimately, after the last GNN layer, each vertex derives its final enriched representation using all three types of edge definitions.
Improving the expressiveness of the hidden representation via historic data query and enrichment enables better forecasting without the model having to be trained on a massive amount of data. This addresses the challenge of inadequate training data.
The effectiveness of the HTP graph may heavily depend on how well the stored historical knowledge is extracted into the forward graph representation V^f and the backward graph representation V^b. In some situations, it may be difficult to learn these representations well by minimizing merely the MSE loss defined in Equation (2). To enhance the learning efficiency, a triplet loss function is used to optimize V^f and V^b by leveraging an intrinsic property of time series. It is based on the observation that the distance between two extracted historical patterns may depend on whether or not they come from the same time series. Such a distance may be small when the two extracted historical patterns are generated in different time periods of the same time series. That is, the intrinsic characteristics may exhibit over time in a consistent way. Conversely, such a distance may be large if the two query embeddings are derived from different time series. A triplet loss is formulated as follows:
where r^t_i=[r^{t,f}_i, r^{t,b}_i] is the extracted relevant historical pattern of time series i at time step t, and m is the margin value that controls the difference strength between the intra-distance and the inter-distance. As seen, this formulation of the triplet loss enforces that the distance between r^t_i and r^{t'}_i (extracted relevant historical patterns of time series i at different timestamps t vs. t') is small while the distance between r^t_i and r^t_j (extracted relevant historical patterns of time series i vs. time series j at the same timestamp t) is large.
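A hedged sketch of one such triplet loss is given below; the squared Euclidean distance and the particular sampling of (anchor, positive, negative) triples are assumptions, with the positive taken from the same time series at a different time step and the negative from a different time series.

```python
import torch

def triplet_graph_loss(r_anchor, r_positive, r_negative, margin=1.0):
    """Encourage patterns from the same series to be close and from different series to be far.

    r_anchor:   pattern of series i at time t
    r_positive: pattern of the same series i at a different time t'
    r_negative: pattern of a different series j
    """
    d_pos = (r_anchor - r_positive).pow(2).sum(dim=-1)  # intra-series distance (should be small)
    d_neg = (r_anchor - r_negative).pow(2).sum(dim=-1)  # inter-series distance (should be large)
    return torch.clamp(d_pos - d_neg + margin, min=0.0).mean()

# toy usage on a batch of 16 extracted pattern vectors of dimension 64
loss_graph = triplet_graph_loss(torch.randn(16, 64), torch.randn(16, 64), torch.randn(16, 64))
```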
As discussed herein, another aspect of the CTSF framework is related to customization of a globally learned base model using input data specific to a particular time series to generate model parameters that are optimized with respect to that particular time series for forecasting. As discussed herein, globally sharing a model across different time series usually fails to capture the individual temporal dependencies of each time series. Such an approach is usually not effective because different time series, e.g., X_i vs. X_j, can be quite different in nature, so globally sharing the parameters does not work well.
To enhance the expressiveness of the base model with respect to each time series, the base model may be used as a basis for customization for each time series. With the customization in accordance with the present teaching, the forecasting capability in the field of time series prediction can be significantly enhanced. Although model customization can be achieved by separately training models for different time series predictions using the corresponding time series data, this approach is impractical in real-world scenarios for various reasons. For instance, deep learning models in the forecasting field usually need a large training dataset to learn model parameters effectively. However, in reality, individual time series usually do not have sufficient data to train a deep, well-performing network.
A solution to the dilemma discussed herein is to leverage pretrained model parameters obtained based on a large shared dataset as model initialization and then adapt to specific tasks by fine-tuning parameters using smaller datasets. Specifically, the meta learning described herein proceeds in two phases: i) model initialization and ii) model customization. During the first phase for model initialization, a global deep learning based forecasting model or base model is trained using data typically from a large set of time series likely encompassing a long history. This set of time series training data is called the source time series set, denoted by S. This data set usually provides much more information, e.g., yearly seasonality, and enables the global base model to capture significant events such as Christmas, Thanksgiving, Mother's Day, July 4th, etc. The base model parameters are shared across all time series and hence represent the cross-time-series knowledge or meta-knowledge. The source dataset S is used for this task.
During the model customization phase, based on the base model, a target time series set T (usually limited, without a long history) is used to learn customized deep learning based forecasting models with respect to individual time series in the target set T. The base model parameters learned from the first phase serve as a starting point so that, although the time series in the target set T do not have enough data to train a deep learning model from scratch, they are adequate for customizing the model parameters.
Formally, in supervised machine learning, a predictive model ŷ=fθ(x) parameterized by θ can be learned via training as follows:
θ* = argmin_θ L(f_θ, S)
where L is a loss function that measures the degree of match between the true labels and those predicted by the predictive model fθ(.) based on the training dataset S. In accordance with the present teaching, two types of losses are defined: Lmse (Equation (2)) and Lgraph (Equation (7)). A combined loss is defined as:
θ* = argmin_θ L_mse(f_θ, S) + γ L_graph
γ is a weight for Lgraph, which can be learned during training. The parameter set θ* corresponds to the parameters learned during the first phase of training to obtain an initialized prediction model using the cross-time-series data set S. That is, θ* is not customized and thus will not be effective for a target time series (in the target set T).
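As a simple illustration, the combined objective may be computed as below; treating γ as a trainable parameter (here through its logarithm to keep it positive) is only one possible reading of "can be learned during training" and is an assumption for this sketch.

```python
import torch

log_gamma = torch.zeros(1, requires_grad=True)  # learn gamma > 0 via its log (assumption)

def combined_loss(loss_mse, loss_graph):
    # L = L_mse + gamma * L_graph, combining Equation (2) with a weighted Equation (7)
    return loss_mse + torch.exp(log_gamma) * loss_graph
```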
In meta-learning according to the present teaching, the source set is divided into S={S_support, S_query}, or abbreviated as S={S_s, S_q}. Similarly, the target set is divided into T={T_support, T_query}, or abbreviated as T={T_s, T_q}. Following the naming convention in machine learning, S can be described as S={S_train, S_validation} and T as T={T_train, T_test}, respectively. The subscript i denotes a specific time series in those sets, hence the notation S_i={S_is, S_iq} and T_i={T_is, T_iq} used herein.
The base model is first learned from the source time series set S (shared training data) and then the base model parameters are modulated using the target time series set T. The goal is to learn the base/global model parameters θ*_0 on the support sets S_is such that the customized time-series-specific parameters θ_i perform well on the corresponding query sets S_iq of the ith time series in the source set. Thus, it is a hierarchical (bi-level) optimization problem with outer and inner optimizations. Specifically, this hierarchical optimization problem is formulated as follows. The first level of optimization is formulated as:
θ*_0 = argmin_{θ_0} Σ_i [L_mse(f_{θ_i}, S_iq) + γ L_graph(f_{θ_i}, S_iq)]  (8)
where
θ*_i = argmin_{θ_i} L_mse(f_{θ_i}, S_is)  (9)
with γ being a weight for Lgraph, Lmse (Equation (2)) and Lgraph (Equation (7)) being the two types of losses as disclosed herein, and θ_0 being fixed in Equation (9), serving as the initialization of θ_i.
In the above formulation, Equation (8) is an outer loss function for searching for a global θ_0 that serves as a good initialization for Equation (9) on each time series support set S_is. Equation (9) is an inner loss function for searching for a time-series-specific θ_i that, in turn, minimizes the outer loss function in Equation (8) for the customized model f_{θ_i}.
Equation (9) can be re-written as
θ_i = θ_0 − α∇_θ L_mse(f_{θ_0}, S_is)
where α is the learning rate for gradient descent.
Upon meta-training being complete and θ_0 having finally converged to θ*_0 based on the source set S={S_s, S_q}, the learned base model parameter θ*_0 is used as the initialization for training on the target/test dataset T as follows:
θ_i = θ*_0 − α∇_θ L_mse(f_{θ*_0}, T_is)
Then the customized parameters θ_i can be used to make predictions for the corresponding target query set T_iq.
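For illustration only, a condensed sketch of this two-phase hierarchical optimization in the style of first-order meta-learning is given below; the first-order approximation (accumulating the query-set gradients of the adapted models directly onto θ_0), the batching of tasks, and the function names are assumptions rather than a definitive implementation of Equations (8)-(10).

```python
import copy
import torch

def meta_train_step(base_model, source_tasks, loss_fn, meta_optimizer, alpha=0.01):
    """One outer step: adapt to each source series on its support set, evaluate on its query set."""
    meta_optimizer.zero_grad()
    for (x_s, y_s), (x_q, y_q) in source_tasks:        # per-series support/query split
        adapted = copy.deepcopy(base_model)             # theta_i starts from theta_0
        inner_opt = torch.optim.SGD(adapted.parameters(), lr=alpha)
        inner_opt.zero_grad()
        loss_fn(adapted(x_s), y_s).backward()           # inner MSE loss on the support set S_is
        inner_opt.step()                                # theta_i = theta_0 - alpha * grad
        inner_opt.zero_grad()                           # keep only the query-set gradient below
        query_loss = loss_fn(adapted(x_q), y_q)         # outer loss on the query set S_iq
        query_loss.backward()
        # first-order approximation: accumulate the adapted model's gradients onto theta_0
        for p0, pi in zip(base_model.parameters(), adapted.parameters()):
            p0.grad = pi.grad.clone() if p0.grad is None else p0.grad + pi.grad
    meta_optimizer.step()                               # update the global parameters theta_0

def customize(base_model, x_support, y_support, loss_fn, alpha=0.01, steps=5):
    """Model customization phase: fine-tune the converged theta_0* into theta_i for one target series."""
    target_model = copy.deepcopy(base_model)
    opt = torch.optim.SGD(target_model.parameters(), lr=alpha)
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(target_model(x_support), y_support).backward()
        opt.step()
    return target_model
```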
In the disclosure above, the aspect of enrichment and the aspect of customization are presented separately for the ease of understanding. Either aspect provides improvement over the prior art solutions and represents advancement in the field. In some embodiments, these two aspects of the present teaching may be used individually to enhance the performance of deep learning for time series forecasting. In some embodiments, these two aspects of the present teaching may be combined in applications.
As shown, the framework 315 combines the components in both
The global optimization in the first phase of the hierarchical process is carried out at 415 and 420 to update the global model parameters 340 according to Equation (8), using the model parameters θ_i from the other phase for updating customized model parameters. Specifically, to update the global model parameters, the enriched customized deep learning mechanism 390 generates predictions based on the current global model parameters at 415 and then updates, at 420, the global model parameters 340 based on the two losses, i.e., L_mse(f_{θ_i}, S_iq) and L_graph(f_{θ_i}, S_iq), as specified in Equation (8).
Conversely, the customized optimization of the second phase of the hierarchical process is carried out at 435 and 440 to update the customized model parameters 370 according to Equation (9), using the global model parameters θ_0 from the global optimization phase. Specifically, in order to adjust the parameters for each time series, predictions are first generated at 435 based on the relevant time series input and the global model parameters θ_0 as updated above. Such generated predictions and the true labels of the time series input are used to compute the loss L_mse(f_{θ_0}, S_is), which is then used, at 440, to update the customized model parameters 370.
As discussed herein, an alternative embodiment is to establish converged model parameters in sequence.
In this illustrated embodiment, the enriched customized deep learning mechanism 390 comprises an artificial neural network 500, a relevant historic information query engine 510, a graph based historic information aggregator 520, a feature aggregator 530, a triple loss determiner 540, a global MSE loss determiner 550, a global model parameter updater 560, a time series MSE loss determiner 570, and a customized model parameter updater 580. These different components cooperate in the meta-learning framework as disclosed herein to carry out the enrichment of embedded feature vectors based on relevant historic information to improve the expressiveness of the model parameters, and to derive both enhanced global model parameters (due to enrichment) and customized model parameters specific to each time series forecasting task.
The queried relevant historic information is used by the graph based historic info aggregator 520 to aggregate, via the attention mechanism shown in Equation (4), in the forward direction. The same operation in the backward direction may also be similarly performed. Multiple types of historic information may be aggregated, at 620, by the graph based historic info aggregator 520 based on Equation (5). In addition, based on the historical information aggregation as shown in Equation (4), the triple loss determiner 540 computes, at 630, the triple loss Lgraph in accordance with Equation (7). The queried pattern vectors in both the forward and backward directions as specified in Equation (4) are then projected by the feature aggregator 530 to the same feature space as the embeddings and concatenated with the original features as specified in Equation (6). This is performed by the feature aggregator 530 at 640.
The aggregated feature vectors are then fed, at 650, to the artificial neural network 500 to generate a forecasted measurement based on the input training time series. When the forecasted measurement (prediction) is received, at 660, by the global MSE loss determiner 550, it computes, at 670, the MSE loss Lmse based on Equation (2). Such computed Lmse and Lgraph are then used by the global model parameter updater 560 to determine how to update, at 680, the global model parameters stored in 340. As discussed herein, the optimization corresponds to a hierarchical process, which may update global and customized model parameters at the same time or in a sequence. If the operational mode is the simultaneous mode, in order to update the customized model parameters, the time series MSE loss determiner 570 determines the MSE loss for each time series based on the current global model parameters as fixed values. Based on the time series specific MSE loss, the customized model parameter updater 580 may then update, at 690, the customized model parameters stored in 370 by minimizing the MSE loss specific to each time series. As discussed herein, in an alternative embodiment, the customized model parameters may not be updated until the global model parameters converge to their established form.
The process as described herein may continue to iterate over different input time series data so that the model parameters are learned via this deep learning scheme until convergence. As described herein, enrichment and customization are independent aspects, each a separate improvement under the present teaching over the current state of the art in time series forecasting. With the deep-learned model parameters, not only can cross-time-series forecasting using global model parameters be improved due to the enrichment, but the quality of customization is also enhanced because the base model derived via enrichment incorporates relevant information from queried related historic data. At the same time, the customization as described herein allows rapid adaptation of general model parameters to specific target model parameters suitable and effective for each particular time series in the absence of a large amount of training data.
To implement various modules, units, and their functionalities described in the present disclosure, computer hardware platforms may be used as the hardware platform(s) for one or more of the elements described herein. The hardware elements, operating systems and programming languages of such computers are conventional in nature, and it is presumed that those skilled in the art are adequately familiar therewith to adapt those technologies to appropriate settings as described herein. A computer with user interface elements may be used to implement a personal computer (PC) or other type of workstation or terminal device, although a computer may also act as a server if appropriately programmed. It is believed that those skilled in the art are familiar with the structure, programming, and general operation of such computer equipment and as a result the drawings should be self-explanatory.
Computer 800, for example, includes COM ports 850 connected to and from a network connected thereto to facilitate data communications. Computer 800 also includes a central processing unit (CPU) 820, in the form of one or more processors, for executing program instructions. The exemplary computer platform includes an internal communication bus 810, program storage and data storage of different forms (e.g., disk 870, read only memory (ROM) 830, or random access memory (RAM) 840), for various data files to be processed and/or communicated by computer 800, as well as possibly program instructions to be executed by CPU 820. Computer 800 also includes an I/O component 860, supporting input/output flows between the computer and other components therein such as user interface elements 880. Computer 800 may also receive programming and data via network communications.
Hence, aspects of the methods of machine learning for time series forecasting and/or other processes, as outlined above, may be embodied in programming. Program aspects of the technology may be thought of as "products" or "articles of manufacture" typically in the form of executable code and/or associated data that is carried on or embodied in a type of machine readable medium. Tangible non-transitory "storage" type media include any or all of the memory or other storage for the computers, processors or the like, or associated modules thereof, such as various semiconductor memories, tape drives, disk drives and the like, which may provide storage at any time for the software programming.
All or portions of the software may at times be communicated through a network such as the Internet or various other telecommunication networks. Such communications, for example, may enable loading of the software from one computer or processor into another, for example, in connection with machine learning for time series forecasting. Thus, another type of media that may bear the software elements includes optical, electrical, and electromagnetic waves, such as used across physical interfaces between local devices, through wired and optical landline networks and over various air-links. The physical elements that carry such waves, such as wired or wireless links, optical links, or the like, also may be considered as media bearing the software. As used herein, unless restricted to tangible "storage" media, terms such as computer or machine "readable medium" refer to any medium that participates in providing instructions to a processor for execution.
Hence, a machine-readable medium may take many forms, including but not limited to, a tangible storage medium, a carrier wave medium or physical transmission medium. Non-volatile storage media include, for example, optical or magnetic disks, such as any of the storage devices in any computer(s) or the like, which may be used to implement the system or any of its components as shown in the drawings. Volatile storage media include dynamic memory, such as a main memory of such a computer platform. Tangible transmission media include coaxial cables, copper wire and fiber optics, including the wires that form a bus within a computer system. Carrier-wave transmission media may take the form of electric or electromagnetic signals, or acoustic or light waves such as those generated during radio frequency (RF) and infrared (IR) data communications. Common forms of computer-readable media therefore include, for example: a floppy disk, a flexible disk, hard disk, magnetic tape, any other magnetic medium, a CD-ROM, DVD or DVD-ROM, any other optical medium, punch cards, paper tape, any other physical storage medium with patterns of holes, a RAM, a PROM and EPROM, a FLASH-EPROM, any other memory chip or cartridge, a carrier wave transporting data or instructions, cables or links transporting such a carrier wave, or any other medium from which a computer may read programming code and/or data. Many of these forms of computer readable media may be involved in carrying one or more sequences of one or more instructions to a physical processor for execution.
Those skilled in the art will recognize that the present teachings are amenable to a variety of modifications and/or enhancements. For example, although the implementation of various components described above may be embodied in a hardware device, it may also be implemented as a software only solution, e.g., an installation on an existing server. In addition, the machine learning techniques for time series forecasting as disclosed herein may be implemented as a firmware, firmware/software combination, firmware/hardware combination, or a hardware/firmware/software combination.
While the foregoing has described what are considered to constitute the present teachings and/or other examples, it is understood that various modifications may be made thereto and that the subject matter disclosed herein may be implemented in various forms and examples, and that the teachings may be applied in numerous applications, only some of which have been described herein. It is intended by the following claims to claim any and all applications, modifications and variations that fall within the true scope of the present teachings.
The present application is related to U.S. patent application Ser. No. 17/083,020, filed Oct. 28, 2020, which is incorporated herein by reference in its entirety.