This disclosure relates to the training of artificial intelligence and memory management associated therewith.
“Intelligence” demonstrated by machines can take the form of agents that make predictions from a stream of data values describing various phenomena. Intelligent models can, for example, attempt to predict future stock prices or outputs from a manufacturing operation given various data. Such models are often trained on large amounts of data.
A computer system includes a memory and a processor. The processor is programmed to construct and utilize a plurality of data package objects. Each of the data package objects contains signal data describing time-series values for parameters, organizes the signal data into batches having a size less than the memory, and identifies the batches according to indices. Each of the data package objects further, responsive to requests, provides output identifying the indices in randomly shuffled or arbitrary order, loads into the memory one of the batches such that features of the signal data of the one of the batches can be used to train a machine learning model to predict time-series parameter outputs from time-series parameter inputs, and removes from the memory the one of the batches to prevent the one of the batches and other of the batches from completely occupying all of the memory at a same time.
An embedded system includes a hardware registry and a microcontroller. The microcontroller is programmed to construct and utilize a plurality of data package objects. Each of the data package objects contains signal data describing time-series values for parameters, organizes the signal data into batches having a size less than the hardware registry, and identifies the batches according to indices. Each of the data package objects further, responsive to requests, provides output identifying the indices in randomly shuffled or arbitrary order, loads into the hardware registry one of the batches such that features of the signal data of the one of the batches can be used to train a machine learning model to predict time-series parameter outputs from time-series parameter inputs, and removes from the hardware registry the one of the batches to prevent the one of the batches and other of the batches from completely occupying all of the hardware registry at a same time.
Embodiments are described herein. It is to be understood, however, that the disclosed embodiments are merely examples and other embodiments may take various and alternative forms. The figure is not necessarily to scale. Some features could be exaggerated or minimized to show details of particular components. Therefore, specific structural and functional details disclosed herein are not to be interpreted as limiting, but merely as a representative basis for teaching one skilled in the art.
Various features illustrated or described with reference to any one example may be combined with features illustrated or described in one or more other examples to produce embodiments that are not explicitly illustrated or described. The combinations of features illustrated provide representative embodiments for typical applications. Various combinations and modifications of the features consistent with the teachings of this disclosure, however, could be desired for particular applications or implementations.
Training of artificial intelligence, such as machine learning models (e.g., sequence-to-sequence models, etc.), is often done in a bespoke fashion. Data scientists, for example, create custom code for loading and pre-processing data to be used in training machine learning models. Introduction of different models or different data, however, can result in the need for additional custom code. Moreover, the amount of data used to train an artificial intelligence can exceed available memory. Here, we introduce a platform that has reusable objects to reduce the need for custom code and large amounts of memory when training artificial intelligence.
Several types of objects are contemplated, including a project object, an experiment package object, a data package object, a pipeline object, a model package object, and others. When instantiated from their corresponding class, they inherit the respective functionality described below. All may be distinct and serializable, and can be saved as files and managed as versions. And these objects, when executed by a processor or microcontroller for example, cooperate to facilitate model training.
In one example, a state-based interface ("seq2seq") to a software physics package ("physics") having multiple functions is contemplated. These functions constitute a high-level project application programming interface (API) for setting up and running what will be referred to as a project. Use of the project API is generally preferred over an object API when possible.
Project API Example

# Import state-based interface; Project API is now available
from physics import seq2seq as pjt

# Extract data from a source
pjt.extract_data(source='manufacturing plant', run_id=140, search='files')
Conceptually, the state-based interface maintains relationships between its major object classes as illustrated in
The following briefly introduces various objects with reference to
In addition to the project API, various objects can be exposed directly utilizing their individual object API. This, however, may not be generally useful in a production environment. It may, however, be convenient to use object APIs for certain tasks, such as visualizing raw data in a data package object, or running speed benchmark tests on signal processing operations contained in a pipeline object.
More generally, a data package object can contain raw signal data (e.g., time-series values for parameters, such as temperature sensor and motor amperage, of manufacturing equipment during operation, etc.) and corresponding metadata (e.g., data describing control limits, such as the operating temperature range and power limits, for the manufacturing equipment, etc.). A data package object can also have several management features related to the data it contains. These features include knowing the size of the data; constructing and indexing batches (e.g., subsets) from the data, with each batch typically having a size less than the size of available memory; shuffling of the batch indices; loading a requested batch into memory; and removing the batch from memory.
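The batch-management features described above can be summarized in a minimal Python sketch. The class name, attribute names, and method names here are illustrative assumptions, not the actual package API: the sketch only shows how a data package can know its size, index batches smaller than available memory, shuffle and report those indices, and load and unload a batch on request.

```python
import random

class DataPackage:
    """Illustrative sketch: signal data served in memory-sized batches."""

    def __init__(self, signal_data, metadata, batch_size):
        self.signal_data = signal_data  # e.g., time-series rows from equipment
        self.metadata = metadata        # e.g., control limits for the equipment
        self.batch_size = batch_size    # chosen to be less than available memory
        self._loaded = {}               # batches currently resident in memory

    @property
    def size(self):
        return len(self.signal_data)

    def batch_indices(self, shuffle=True):
        """Report the indices of available batches, optionally shuffled."""
        n_batches = -(-self.size // self.batch_size)  # ceiling division
        indices = list(range(n_batches))
        if shuffle:
            random.shuffle(indices)
        return indices

    def load_batch(self, index):
        """Load one batch into memory and return it with its metadata."""
        start = index * self.batch_size
        batch = self.signal_data[start:start + self.batch_size]
        self._loaded[index] = batch
        return batch, self.metadata

    def unload_batch(self, index):
        """Remove a batch so all batches never occupy memory at once."""
        self._loaded.pop(index, None)
```

In use, a caller would request the shuffled indices, load one batch at a time for training, and unload it before requesting the next.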
A pipeline object can contain a predefined and configurable sequence of data processing operations (e.g., frequency analysis, statistical analysis, etc.) that create features of interest from the data, and can operate across an arbitrarily large number of data package objects. The operations, for example, can define which signals are targeted by each operation, and can be saved as a single object that can be called and used again. As a result, a sequence of steps for preparing data from a certain type of source (e.g., a factory) can be saved, and every time data from that source is used to update training of a corresponding model, the same sequence of steps for preparing the data can be called by loading the corresponding pipeline object. An example sequence of operations defined by a pipeline object may include generating a moving average on temperature data, scaling data so everything is zero mean, performing frequency analysis on pressure signals to create wavelets, etc.
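The reusable sequence-of-operations idea can be sketched as follows. The `Pipeline` class and `moving_average` helper are hypothetical names introduced for illustration only: each step names the signal it targets, the whole sequence is stored as one object, and the same sequence can be re-run whenever new data arrives from the same kind of source.

```python
class Pipeline:
    """Illustrative sketch: a saved, reusable sequence of processing steps."""

    def __init__(self):
        self.steps = []  # (feature_name, target_signal, function) tuples

    def add_step(self, name, signal, fn):
        """Register an operation and the signal it targets."""
        self.steps.append((name, signal, fn))
        return self  # allow chaining

    def run(self, signals):
        """Apply each step to its target signal; return derived features."""
        features = dict(signals)
        for name, signal, fn in self.steps:
            features[name] = fn(features[signal])
        return features

def moving_average(values, window=3):
    """Trailing moving average over a list of samples."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1):i + 1]
        out.append(sum(chunk) / len(chunk))
    return out
```

A saved pipeline such as `Pipeline().add_step("temp_ma", "temperature", moving_average)` could then be loaded and applied, unchanged, to every new data package from the same factory.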
A model package can contain machine learning models (e.g., a sequence to sequence model, etc.) and a taxonomy of all parameters required to reconstruct the machine learning models after training (e.g., sequence lengths, number of neural network layers, number of inputs and outputs, trained weights of neural network elements, etc.), and have the ability to serialize, save, and reload them. Once loaded, the model is reconstituted into memory and can be used or trained.
An experiment package object can provide functional tools to use the various sub-packages together. It is serializable and contains all data needed to reconstruct itself in different run times; it runs training, validation, and simulation, and can plot model performance. It is the orchestrator for model training.
The memory management features associated with the data package objects allow the platform to use an arbitrarily large number of data package objects for a particular model even if all of those data sources could not simultaneously fit in memory. As mentioned above, the data package object has knowledge of important features of the data including its size and how many batches (e.g., subsets) can be constructed from the data, and can index such batches. A corresponding experiment package can look at its data packages even though the data is not yet in memory, and generate requests for the data packages to randomly shuffle and report their indices of available batches. Across all of the data packages, the experiment package can thus select batches at random. The data package holding the selected batch will load the batch's data and corresponding metadata into memory for use by the experiment package for training of the model to, for example, predict time-series parameter outputs of manufacturing equipment from time-series parameter inputs to the manufacturing equipment subject to control limits defined by the metadata, and then remove the data from memory when training of the model on the data is finished so that the memory is not overwhelmed with data from the various batches. As such, the experiment package can iterate through all batches of the data packages randomly for multiple epochs of model training, which is advantageous for efficiently training machine learning models.
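The random iteration across all batches of all data packages can be sketched as a schedule of (package, batch index) pairs, built once per epoch before any batch is loaded. The function name and arguments below are illustrative assumptions; the sketch only shows that an orchestrator can enumerate and shuffle batch indices across packages without the underlying data being in memory.

```python
import random

def training_schedule(package_batches, epochs=2, seed=0):
    """Build a shuffled order of (package_id, batch_index) pairs per epoch.

    package_batches maps a package id to its number of available batches.
    Only the selected batch needs to be resident in memory at any time:
    the caller loads each pair, trains on it, then unloads it.
    """
    rng = random.Random(seed)
    schedule = []
    for _ in range(epochs):
        pairs = [(pkg, i)
                 for pkg, n in package_batches.items()
                 for i in range(n)]
        rng.shuffle(pairs)  # random order across all packages
        schedule.extend(pairs)
    return schedule
```

Each epoch thus visits every batch of every package exactly once, in a fresh random order, while memory holds only one batch at a time.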
The algorithms, methods, or processes disclosed herein can be deliverable to or implemented by a computer, controller, or processing device, which can include any dedicated electronic control unit or programmable electronic control unit. Similarly, the algorithms, methods, or processes can be stored as data and instructions executable by a computer or controller in many forms including, but not limited to, information permanently stored on non-writable storage media such as read only memory devices and information alterably stored on writeable storage media such as compact discs, random access memory devices, or other magnetic and optical media. The algorithms, methods, or processes can also be implemented in software executable objects. Alternatively, the algorithms, methods, or processes can be embodied in whole or in part using suitable hardware components, such as application specific integrated circuits, field-programmable gate arrays, state machines, or other hardware components or devices, or a combination of firmware, hardware, and software components.
While exemplary embodiments are described above, it is not intended that these embodiments describe all possible forms encompassed by the claims. The words used in the specification are words of description rather than limitation, and it is understood that various changes may be made without departing from the spirit and scope of the disclosure. For example, the words processor and processors may be used interchangeably, and the words microcontroller and microcontrollers may be used interchangeably.
As previously described, the features of various embodiments may be combined to form further embodiments of the invention that may not be explicitly described or illustrated. While various embodiments could have been described as providing advantages or being preferred over other embodiments or prior art implementations with respect to one or more desired characteristics, those of ordinary skill in the art recognize that one or more features or characteristics may be compromised to achieve desired overall system attributes, which depend on the specific application and implementation. These attributes may include, but are not limited to cost, strength, durability, life cycle cost, marketability, appearance, packaging, size, serviceability, weight, manufacturability, ease of assembly, etc. As such, embodiments described as less desirable than other embodiments or prior art implementations with respect to one or more characteristics are not outside the scope of the disclosure and may be desirable for particular applications.
This application claims the benefit of Provisional App. No. 63/281,433, filed Nov. 19, 2021, which is hereby incorporated by reference in its entirety.
Number | Date | Country
---|---|---
63281433 | Nov 2021 | US