This U.S. patent application claims priority under 35 U.S.C. § 119 to: Indian Patent Application number 202321087298, filed on Dec. 20, 2023. The entire contents of the aforementioned application are incorporated herein by reference.
The embodiments herein generally relate to the field of Artificial Intelligence (AI) and, more particularly, to a method and system for Human intuition based Decoder Neural Network (Theta-DecNN) for Artificial Intelligence (AI) Model Refinement.
Traditional methods in artificial intelligence (AI) model development, fine-tuning, and optimization heavily rely on human expertise and intuition. This approach, while effective, often lacks scalability and can introduce inefficiencies. An Automatic Machine Learning (AutoML) framework introduces automation and simplifies each step in the machine learning process, from handling a raw data set to deploying a practical ML model, unlike traditional machine learning, where models are developed manually and each step in the process must be handled separately. It can be understood that AutoML and similar frameworks optimize machine learning pipelines through genetic programming.
Simulating human intuition in AI model development, particularly in fine-tuning, domain alignment, and hyperparameter optimization, is hardly explored. This limitation in AI/ML development automation needs to be addressed to bring efficiency and scalability into AI model development.
Embodiments of the present disclosure present technological improvements as solutions to one or more of the above-mentioned technical problems recognized by the inventors in conventional systems.
For example, in one embodiment, a method for human intuition based Artificial Intelligence (AI) model fine-tuning is provided.
The method includes receiving, via a Human Intuition based Decoder-Neural Network (Theta-DecNN) implementing a Self-Regulated Human Intuition Emulation Mechanism (SRHIEM), an input sequence defining a task in natural language for which an Artificial Intelligence (AI) model is to be fine-tuned.
Further, the method includes deriving an initial intuition value from a reference intuition value of a human intuition factor (θ) associated with the input sequence by: processing the input sequence by the Theta-DecNN to obtain an output sequence emulating human intuition present in the input sequence, wherein a self-attention mechanism of the Theta-DecNN is modified in each iteration of a plurality of iterations (iterations) using a first iterative feedback mechanism of the SRHIEM based on a change in the human intuition factor (θ). The human intuition factor (θ) is updated from the reference intuition value in a first iteration to the initial intuition value in a final iteration based on a learning rate of the Theta-DecNN and a gradient of a feedback function. The feedback function compares the output sequence with human-like intuitive criteria.
Furthermore, the method includes determining an optimized human intuition factor (θ*) and training the AI model in accordance with the optimized human intuition factor (θ*) by applying a second iterative feedback mechanism of the SRHIEM that iteratively fine-tunes the initial intuition value based on historical tuning data H and live performance metrics P, wherein live performance of the AI model trained on the initial intuition factor is analyzed in accordance with target metrics defined for the AI model in every iteration.
Further, the method includes fine-tuning the trained AI model having an intuition-like capability obtained via the optimized intuition factor to align to a target domain using a third iterative feedback mechanism of the SRHIEM, by aligning the optimized intuition factor with the target domain to obtain a domain aligned optimized intuition factor based on a domain-specific alignment function that incorporates contextual cues into the feedback loop of the third iterative feedback mechanism.
Further, the method includes optimizing a plurality of hyperparameters (hyperparameters) of the fine-tuned AI model using a hyperparameter optimization module implementing a fourth iterative feedback mechanism that utilizes i) predictive impact analysis implementing a combination of regression analysis and decision trees to predict how changes in each of the hyperparameters affect performance of the AI model, ii) performance benchmarking for effectiveness of the configuration of the hyperparameters based on historical data, iii) a feedback-driven adjustment loop enabling the fine-tuned AI model to learn from each adjustment of the hyperparameters, and iv) historical data continuously mined for insights.
In another aspect, a system for human intuition based Artificial Intelligence (AI) model fine-tuning is provided. The system comprises a memory storing instructions; one or more Input/Output (I/O) interfaces; and one or more hardware processors coupled to the memory via the one or more I/O interfaces, wherein the one or more hardware processors are configured by the instructions to receive, via a Human Intuition based Decoder-Neural Network (Theta-DecNN) implementing a Self-Regulated Human Intuition Emulation Mechanism (SRHIEM), an input sequence defining a task in natural language for which an Artificial Intelligence (AI) model is to be fine-tuned.
Further, the one or more hardware processors are configured to derive an initial intuition value from a reference intuition value of a human intuition factor (θ) associated with the input sequence by: processing the input sequence by the Theta-DecNN to obtain an output sequence emulating human intuition present in the input sequence, wherein a self-attention mechanism of the Theta-DecNN is modified in each iteration of a plurality of iterations using a first iterative feedback mechanism of the SRHIEM based on a change in the human intuition factor (θ). The human intuition factor (θ) is updated from the reference intuition value in a first iteration to the initial intuition value in a final iteration based on a learning rate of the Theta-DecNN and a gradient of a feedback function. The feedback function compares the output sequence with human-like intuitive criteria.
Furthermore, the one or more hardware processors are configured to determine an optimized human intuition factor (θ*) and train the AI model in accordance with the optimized human intuition factor (θ*) by applying a second iterative feedback mechanism of the SRHIEM that iteratively fine-tunes the initial intuition value based on historical tuning data H and live performance metrics P, wherein live performance of the AI model trained on the initial intuition factor is analyzed in accordance with target metrics defined for the AI model in every iteration.
Further, the one or more hardware processors are configured to fine-tune the trained AI model having an intuition-like capability obtained via the optimized intuition factor to align to a target domain using a third iterative feedback mechanism of the SRHIEM, by aligning the optimized intuition factor with the target domain to obtain a domain aligned optimized intuition factor based on a domain-specific alignment function that incorporates contextual cues into the feedback loop of the third iterative feedback mechanism.
Further, the one or more hardware processors are configured to optimize a plurality of hyperparameters of the fine-tuned AI model using a hyperparameter optimization module implementing a fourth iterative feedback mechanism that utilizes i) predictive impact analysis implementing a combination of regression analysis and decision trees to predict how changes in each of the hyperparameters affect performance of the AI model, ii) performance benchmarking for effectiveness of the configuration of the hyperparameters based on historical data, iii) a feedback-driven adjustment loop enabling the fine-tuned AI model to learn from each adjustment of the hyperparameters, and iv) historical data continuously mined for insights.
In yet another aspect, there are provided one or more non-transitory machine-readable information storage mediums comprising one or more instructions, which when executed by one or more hardware processors cause a method for human intuition based Artificial Intelligence (AI) model fine-tuning to be performed.
The method includes receiving, via a Human Intuition based Decoder-Neural Network (Theta-DecNN) implementing a Self-Regulated Human Intuition Emulation Mechanism (SRHIEM), an input sequence defining a task in natural language for which an Artificial Intelligence (AI) model is to be fine-tuned.
Further, the method includes deriving an initial intuition value from a reference intuition value of a human intuition factor (θ) associated with the input sequence by: processing the input sequence by the Theta-DecNN to obtain an output sequence emulating human intuition present in the input sequence, wherein a self-attention mechanism of the Theta-DecNN is modified in each iteration of a plurality of iterations using a first iterative feedback mechanism of the SRHIEM based on a change in the human intuition factor (θ). The human intuition factor (θ) is updated from the reference intuition value in a first iteration to the initial intuition value in a final iteration based on a learning rate of the Theta-DecNN and a gradient of a feedback function. The feedback function compares the output sequence with human-like intuitive criteria.
Furthermore, the method includes determining an optimized human intuition factor (θ*) and training the AI model in accordance with the optimized human intuition factor (θ*) by applying a second iterative feedback mechanism of the SRHIEM that iteratively fine-tunes the initial intuition value based on historical tuning data H and live performance metrics P, wherein live performance of the AI model trained on the initial intuition factor is analyzed in accordance with target metrics defined for the AI model in every iteration.
Further, the method includes fine-tuning the trained AI model having an intuition-like capability obtained via the optimized intuition factor to align to a target domain using a third iterative feedback mechanism of the SRHIEM, by aligning the optimized intuition factor with the target domain to obtain a domain aligned optimized intuition factor based on a domain-specific alignment function that incorporates contextual cues into the feedback loop of the third iterative feedback mechanism.
Further, the method includes optimizing a plurality of hyperparameters, also referred to as hyperparameters hereinafter, of the fine-tuned AI model using a hyperparameter optimization module implementing a fourth iterative feedback mechanism that utilizes i) predictive impact analysis implementing a combination of regression analysis and decision trees to predict how changes in each of the hyperparameters affect performance of the AI model, ii) performance benchmarking for effectiveness of the configuration of the hyperparameters based on historical data, iii) a feedback-driven adjustment loop enabling the fine-tuned AI model to learn from each adjustment of the hyperparameters, and iv) historical data continuously mined for insights.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate exemplary embodiments and, together with the description, serve to explain the disclosed principles:
It should be appreciated by those skilled in the art that any block diagrams herein represent conceptual views of illustrative systems and devices embodying the principles of the present subject matter. Similarly, it will be appreciated that any flow charts, flow diagrams, and the like represent various processes which may be substantially represented in computer readable medium and so executed by a computer or processor, whether or not such computer or processor is explicitly shown.
Exemplary embodiments are described with reference to the accompanying drawings. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. Wherever convenient, the same reference numbers are used throughout the drawings to refer to the same or like parts. While examples and features of disclosed principles are described herein, modifications, adaptations, and other implementations are possible without departing from the scope of the disclosed embodiments.
Existing Automatic machine learning (AutoML) approaches fail to capture human intuition and domain alignment as part of automated ML model development. Embodiments of the present disclosure provide a method and system comprising a Human intuition based Decoder Neural Network (Theta-DecNN) for Artificial Intelligence (AI) Model Refinement. The Theta-DecNN disclosed herein applies a three-level process for AI model building by integrating and optimizing a human intuition factor (θ) and thereafter aligning the AI model to a domain of interest. The Theta-DecNN, which is a Decoder-only generative AI (Gen AI) model, utilizes iterative feedback mechanisms at each level to extract the human intuition from the input task, and further optimizes the human intuition factor based on historical tuning data (H) and live performance metrics (P) to obtain a trained AI model for the task. At the third level, the trained AI model is fine-tuned to align with the domain of the task based on a domain-specific function. As understood, attention drives outcomes, and by forcing the attention head of the Theta-DecNN to look at theta, the outcomes are being influenced by theta.
Example input tasks to develop an AI model aligned to the domain by understanding the human intuition in the task:
Once a domain aligned AI model that incorporates human intuition is obtained, it is further optimized by tuning hyperparameters to generate a human intuition based domain optimized AI model, enabling the model to produce outputs that are not only accurate but also contextually and intuitively aligned with complex human thought processes. The AI model development pipeline disclosed herein is used for fine-tuning and hyperparameter optimization of Generative AI models such as Generative Adversarial Networks (GANs) that can create visual and multimedia artifacts from both imagery and textual input data, Transformer-based models such as Generative Pre-trained Transformer (GPT) language models that can use information gathered and create textual content, and the like.
Referring now to the drawings, and more particularly to
Referring to the components of system 100, in an embodiment, the processor(s) 104, can be one or more hardware processors 104. In an embodiment, the one or more hardware processors 104 can be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, and/or any devices that manipulate signals based on operational instructions. Among other capabilities, the one or more hardware processors 104 are configured to fetch and execute computer-readable instructions stored in the memory 102. In an embodiment, the system 100 can be implemented in a variety of computing systems including laptop computers, notebooks, hand-held devices such as mobile phones, workstations, mainframe computers, servers, and the like.
The I/O interface(s) 106 can include a variety of software and hardware interfaces, for example, a web interface, a graphical user interface for interactive exchange and human feedback to automatically refine the human intuition factor. The user interface also enables receiving the natural language task for which the AI model has to be developed (fine-tuned). The I/O interface 106 can facilitate multiple communications within a wide variety of networks N/W and protocol types, including wired networks, for example, LAN, cable, etc., and wireless networks, such as WLAN, cellular and the like. In an embodiment, the I/O interface(s) 106 can include one or more ports for connecting to a number of external devices or to another server or devices.
The memory 102 may include any computer-readable medium known in the art including, for example, volatile memory, such as static random access memory (SRAM) and dynamic random access memory (DRAM), and/or non-volatile memory, such as read only memory (ROM), erasable programmable ROM, flash memories, hard disks, optical disks, and magnetic tapes.
In an embodiment, the memory 102 includes a plurality of modules 110 such as the Theta-DecNN and the hyperparameter optimization module.
The plurality of modules 110 include programs or coded instructions that supplement applications or functions performed by the system 100 for executing different steps involved in the process of AI model refinement using the Human intuition based Decoder Neural Network (Theta-DecNN), being performed by the system 100. The plurality of modules 110, amongst other things, can include routines, programs, objects, components, and data structures, which perform particular tasks or implement particular abstract data types. The plurality of modules 110 may also be used as signal processor(s), node machine(s), logic circuitries, and/or any other device or component that manipulates signals based on operational instructions. Further, the plurality of modules 110 can be implemented in hardware, by computer-readable instructions executed by the one or more hardware processors 104, or by a combination thereof. The plurality of modules 110 can include various sub-modules (not shown).
Further, the memory 102 may comprise information pertaining to input(s)/output(s) of each step performed by the processor(s) 104 of the system 100 and methods of the present disclosure.
Further, the memory 102 includes a database 108. The database (or repository) 108 may include a plurality of abstracted pieces of code for refinement and data that is processed, received, or generated as a result of the execution of the plurality of modules in the module(s) 110.
Although the database 108 is shown internal to the system 100, it will be noted that, in alternate embodiments, the database 108 can also be implemented external to the system 100 and communicatively coupled to the system 100. The data contained within such an external database may be periodically updated. For example, new data may be added into the database (not shown in
Further, the human intuition factor is optimized based on the historical tuning data (H) and live performance metrics (P), wherein a trained AI model is obtained in accordance with the optimized human intuition factor. This trained AI model is further trained to align with the domain in accordance with a domain-specific function. The fine-tuned AI model is a model that is well developed on the human intuition for the domain of interest. This domain aligned AI model is then subjected to an automated hyperparameter optimization approach to generate a domain optimized AI model. Furthermore, a continual learning loop is integrated into the system, which enables updating θ* based on processed structured/unstructured human feedback on the output of the domain optimized AI model.
While there exist approaches that automate hyperparameter tuning, the system extends this automation to include domain-specific alignment, which is crucial for applications requiring a deep understanding of specific industry or task domains. The domain alignment herein uses advanced techniques like natural language processing and semantic analysis to automatically align the AI model with the specific requirements of a given domain. This process ensures that the model's outputs are not only optimized in terms of performance metrics but are also contextually relevant and tailored to the domain's unique characteristics. This level of domain-specific fine-tuning and alignment is not addressed in prior work, which primarily focuses on general hyperparameter optimization without a targeted approach to domain-specific needs.
In an embodiment, the system 100 comprises one or more data storage devices or the memory 102 operatively coupled to the processor(s) 104 and is configured to store instructions for execution of steps of the method 200 by the processor(s) or one or more hardware processors 104. The steps of the method 200 of the present disclosure will now be explained with reference to the components or blocks of the system 100 as depicted in
Referring to the steps of the method 200, at step 202 of the method 200, the one or more hardware processors 104 are configured by the instructions to receive, via the Theta-DecNN implementing a Self-Regulated Human Intuition Emulation Mechanism (SRHIEM), an input sequence defining a task in natural language for which an AI model is to be developed.
An example task can be “Domain: Medical Image Analysis” and “Task: Training a model to analyze medical images, such as MRI scans or X-rays, to assist in early detection and diagnosis of diseases”.
At step 204 of the method 200, the one or more hardware processors 104 are configured by the instructions to derive an initial intuition value from a reference intuition value of the human intuition factor (θ) associated with the input sequence. The Theta-DecNN, also referred to as the decoder or decoder-only architecture hereinafter, processes the input sequence (X) to obtain an output sequence (O) emulating human intuition present in the input sequence. The self-attention mechanism of the Theta-DecNN is modified in each of the plurality of iterations (iterations) using the first iterative feedback mechanism of the SRHIEM based on the change in the human intuition factor. The human intuition factor (θ) is updated from the reference intuition value in a first iteration of the plurality of iterations to the initial intuition value in a final iteration of the plurality of iterations based on the learning rate (η) of the Theta-DecNN and a gradient of a feedback function ∇θF(H, O), where H represents the hidden state vectors generated by each layer in the decoder and O is the output generated by the model at the current state H. The feedback function compares the output sequence with the human-like intuitive criteria, i.e., quantifiable criteria that characterize human intuition for optimizing AI models. These criteria influence the hyperparameter selection for the purpose of model optimization.
At step 206 of the method 200, the one or more hardware processors 104 are configured by the instructions to determine an optimized human intuition factor (θ*) and train the AI model in accordance with the optimized human intuition factor (θ*) by applying a second iterative feedback mechanism of the SRHIEM that iteratively fine-tunes the initial intuition value based on historical tuning data H and live performance metrics P. The live performance of the AI model trained on the initial intuition factor is analyzed in accordance with target metrics defined for the AI model in every iteration.
Theta-DecNN: The steps 202 to 206 are better understood with the help of the disclosed decoder architecture and the iterative feedback mechanisms explained below in conjunction with architecture of
The SRHIEM discloses an approach to optimize a generative model using an emulation of human intuition that goes beyond conventional pattern recognition and statistical learning methods by incorporating a mechanism that dynamically adjusts the generative process, making it capable of producing outputs that are not only creative but also intuitively aligned with human-like reasoning and decision-making. The utility of the SRHIEM is evident in its application to a wide range of generative tasks where human intuition plays a pivotal role, from artistic content creation to complex problem-solving scenarios. This mechanism's ability to adapt and self-regulate based on an emulation of human intuition presents a notable advancement in the field of GenAI. The creativity embedded in the SRHIEM is mathematically represented by the intuition factor θ, which introduces a degree of human-like unpredictability and adaptability into the AI model. This not only enhances the model's ability to generate novel and relevant outputs but also reflects the inherently creative process of human intuition.
Self-Attention Mechanism: The human intuition factor is defined as θ, a dynamic variable that adjusts the attention mechanism to optimize output generation based on feedback. The modified attention mechanism is influenced by θ; for example, with θ acting as an additive bias on the attention logits, the self-attention mechanism within each layer l of the decoder can be defined as:

Attention_l(Q, K, V) = softmax((QK^T)/√(d_k) + θ)·V
wherein the query Q, key K, and value V are matrices computed from the preceding layer output of the Theta-DecNN. The θ is adjusted after each generation cycle based on the optimization feedback, emulating an iterative learning process akin to human intuition.
Key Vectors (K): These are part of the input to the self-attention mechanism alongside queries (Q) and values (V). They play a role in determining the attention each value receives. Dimensionality (dk): This is the size of the key vectors. In practice, it is a hyperparameter that defines the number of elements in each key vector.
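By way of example and not limitation, the following Python (NumPy) sketch shows one plausible form of the θ-modified self-attention, under the assumption stated above that θ enters as an additive bias on the attention logits; the function name theta_attention is illustrative only.

    import numpy as np

    def theta_attention(Q, K, V, theta):
        """One plausible form of the theta-modified self-attention.

        Q, K, V: (seq_len, d_k) matrices computed from the preceding layer's
        output. theta is the scalar human intuition factor, assumed here to
        act as an additive bias on the attention logits.
        """
        d_k = K.shape[-1]
        logits = Q @ K.T / np.sqrt(d_k) + theta           # theta biases the logits
        weights = np.exp(logits - logits.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
        return weights @ V

    # Example usage: 4 tokens, key dimensionality d_k = 8.
    rng = np.random.default_rng(0)
    Q, K, V = (rng.standard_normal((4, 8)) for _ in range(3))
    out = theta_attention(Q, K, V, theta=0.1)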
Optimization Feedback Loop (first iterative feedback mechanism): as depicted in
The update rule for θ can be represented as:

θ_new = θ_old − η·∇θF(H, O)

wherein η is the learning rate of the Theta-DecNN and ∇θF(H, O) is the gradient of the feedback function with respect to θ.
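By way of example and not limitation, a minimal Python sketch of this first iterative feedback mechanism is given below, assuming a finite-difference estimate of the gradient of the feedback function; the callables generate and feedback are illustrative placeholders, not part of the disclosure.

    def refine_theta(theta, generate, feedback, lr=0.01, steps=50, eps=1e-4):
        """First iterative feedback mechanism (illustrative sketch).

        generate(theta) runs the Theta-DecNN and returns (H, O): the hidden
        state vectors and the output sequence. feedback(H, O) scores O
        against the human-like intuitive criteria (lower is better). Both
        callables are assumed placeholders.
        """
        for _ in range(steps):
            H, O = generate(theta)
            # Finite-difference estimate of the gradient of F w.r.t. theta.
            H2, O2 = generate(theta + eps)
            grad = (feedback(H2, O2) - feedback(H, O)) / eps
            theta -= lr * grad        # theta_new = theta_old - eta * grad
        return theta                  # the derived initial intuition value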
The θ that is updated from the reference value to the initial value above is further optimized during AI model building for the input task. The system 100 uses the second iterative feedback mechanism to automatically adjust model parameters, learning from vast datasets and previous fine-tuning instances, effectively mimicking human intuition and expertise. This includes real-time monitoring and adjustment capabilities to continuously optimize the model's performance. The SRHIEM fine-tuning subsystem employs an approach that learns from extensive datasets and historical fine-tuning instances. This is designed to replicate the adaptive and anticipatory qualities of human expertise in model parameter adjustment.
Further, the optimization of the human intuition factor derived by pseudocode 1 is performed based on iterative optimization that balances exploitation of known parameter configurations with the exploration of new configurations, akin to human experts when they fine-tune based on experience and intuition.
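By way of example and not limitation, the following Python sketch illustrates how the second iterative feedback mechanism might balance exploitation of known configurations against exploration of new ones when optimizing θ from the historical tuning data H and live performance metrics P; the epsilon-greedy scheme and the helper evaluate are assumptions for illustration only.

    import random

    def optimize_theta(theta_init, history, evaluate, target, epsilon=0.2,
                       step=0.05, max_iters=100):
        """Second iterative feedback mechanism (illustrative sketch).

        history maps previously tried theta values to recorded performance
        (the historical tuning data H). evaluate(theta) trains/runs the AI
        model and returns the live performance metrics P as a single score,
        compared against the target metrics in every iteration.
        """
        theta_star, best = theta_init, evaluate(theta_init)
        history[theta_star] = best
        for _ in range(max_iters):
            if random.random() < epsilon:                 # explore a new configuration
                candidate = theta_star + random.uniform(-step, step)
            else:                                         # exploit known configurations
                candidate = max(history, key=history.get)
            score = evaluate(candidate)                   # live performance metrics P
            history[candidate] = score
            if score > best:
                theta_star, best = candidate, score
            if best >= target:                            # target metrics satisfied
                break
        return theta_star                                 # optimized intuition factor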
Standard techniques used by SRHIEM for optimization of theta:
The approach of the SRHIEM of learning from historical fine-tuning instances and live data to adjust the intuition factor θ is novel. It represents a unique blend of retrospective analysis and real-time responsiveness. The predictive modeling approach to suggesting intuition factor adjustments introduces a unique mechanism of anticipating the impact of parameter changes. The system 100 thus is self-evolving in nature, where the fine-tuning process becomes more refined with each iteration, leveraging past successes and failures to inform future adjustments, akin to an expert honing their craft. The real-time validation loop that tests and potentially reverts adjustments introduces an inventive feedback mechanism uncommon in traditional static fine-tuning methods.
The utility of the system 100 is evident in its application across various domains where model performance is critical, such as natural language processing, image recognition, and autonomous systems. Its capability to adaptively fine-tune in real-time ensures that the model remains effective under changing conditions, addressing the practical need for high-performing, dynamic AI systems.
Referring back to the steps of method 200, once the human intuition factor is optimized providing a trained AI model, at step 208 of the method 200, the one or more hardware processors 104 are configured by the instructions to fine-tune the trained AI model having an intuition-like capability obtained via the optimized intuition factor to align to a target domain using a third iterative feedback mechanism of the SRHIEM, by aligning the optimized intuition factor with the target domain to obtain a domain aligned optimized intuition factor based on a domain-specific alignment function that incorporates contextual cues into the feedback loop of the third iterative feedback mechanism.
The domain-specific alignment function ‘A’ within the Self-Regulated Human Intuition Emulation Mechanism (SRHIEM) can be tailored to a particular domain to ensure that the AI model's outputs are aligned with the unique characteristics and requirements of that domain.
Example: Healthcare Domain-Specific Alignment Function in SRHIEM
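By way of example and not limitation, a healthcare-flavored alignment function A might reward coverage of clinically relevant vocabulary and penalize terms inconsistent with medical communication norms, as in the Python sketch below; the term lists and weights are hypothetical, and a real implementation would rely on NLP and semantic analysis rather than simple term matching.

    def healthcare_alignment(output_text, medical_terms, forbidden_terms,
                             coverage_weight=0.7, penalty_weight=0.3):
        """Hypothetical domain-specific alignment function A for healthcare.

        Rewards coverage of clinically relevant vocabulary and penalizes
        terms inconsistent with medical communication norms.
        """
        tokens = output_text.lower().split()
        coverage = sum(t in tokens for t in medical_terms) / max(len(medical_terms), 1)
        penalty = sum(t in tokens for t in forbidden_terms) / max(len(tokens), 1)
        return coverage_weight * coverage - penalty_weight * penalty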
Domain Alignment Process (third iterative feedback mechanism): This automatically aligns the model with specific industry or task domains, leveraging natural language processing and semantic analysis. Domain-specific data and contextual cues are utilized by the domain alignment module to refine the model's understanding and output relevance.
Design: The Design for the integrated domain alignment process is built upon a foundation where the SRHIEM's third iterative feedback mechanism serves as the core engine for emulating human-like intuition in the Generative AI model. The SRHIEM, having been finely tuned to generalize across various data domains, possesses an adaptive quality that reflects a sophisticated level of decision-making, akin to that of a human expert.
Post the SRHIEM optimization, the domain alignment module acts as an additional layer of specialization. This module is not a simple plug-in but an intrinsic part of the SRHIEM that extends its capabilities. It uses advanced natural language processing (NLP) to interpret and integrate the subtleties of domain-specific language, terminology, and concepts. This is crucial because while SRHIEM imparts a broad intuition-like capability to the model, the domain alignment module imparts specificity, sharpening the model's outputs to conform with domain-specific expectations and nuances. The module considers not only the linguistic aspects but also the contextual and conceptual requirements of the domain. For instance, in a medical domain, the system would align its outputs to reflect the accuracy and detail-oriented nature of medical communication, whereas in a creative writing domain, it would skew towards narrative flair and stylistic coherence.
The logic that governs the domain alignment process for the trained AI model must be capable of iterative learning and real-time adjustment. It is designed to be dynamic, continuously updating the model parameters as it receives new data, feedback, and alignment targets. The following expanded pseudocode provides a step-by-step procedural implementation. The algorithm demonstrates a methodical and adaptive approach to fine-tuning the SRHIEM for domain-specific tasks.
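By way of example and not limitation, a minimal Python sketch of such an iterative domain alignment loop is given below, assuming finite-difference updates of the optimized intuition factor and illustrative helper names (generate, alignment_fn); it is a sketch of the third iterative feedback mechanism, not the definitive implementation.

    def align_to_domain(theta_star, alignment_fn, generate, domain_data,
                        lr=0.02, eps=1e-4, threshold=0.9, max_iters=200):
        """Third iterative feedback mechanism (illustrative sketch).

        generate(theta, cues) produces model output conditioned on contextual
        cues drawn from domain_data; alignment_fn is the domain-specific
        alignment function A (higher means better aligned).
        """
        for cues in domain_data[:max_iters]:
            score = alignment_fn(generate(theta_star, cues))
            if score >= threshold:              # domain alignment target met
                break
            # Finite-difference ascent on the alignment score.
            score_up = alignment_fn(generate(theta_star + eps, cues))
            grad = (score_up - score) / eps
            theta_star += lr * grad             # contextual cues enter the feedback loop
        return theta_star                       # domain aligned optimized intuition factor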
Thus, the approach disclosed herein by the system 100 is a two-tiered approach to model optimization: first, by imparting a generalized intuition-like capability via the SRHIEM, and second, by refining this capability to the specificities of a target domain through the domain alignment module. The uniqueness is in the integration of real-time feedback directly into the SRHIEM, allowing for an unprecedented level of dynamic parameter adjustment that continuously steers the model towards optimal domain-specific performance. The utility is evident as the system can be deployed across a wide range of industries and tasks, from technical and scientific fields requiring high precision to creative fields demanding stylistic adaptation, ensuring the AI's outputs are always relevant and aligned with domain expectations.
Once a domain aligned AI model is obtained, then at step 210 of the method 200, the one or more hardware processors 104 are configured by the instructions to optimize a plurality of hyperparameters, also referred to as hyperparameters hereinafter, of the fine-tuned AI model using a hyperparameter optimization module implementing a fourth iterative feedback mechanism. The module utilizes predictive impact analysis implementing a combination of regression analysis and decision trees to predict how changes in each of the hyperparameters affect performance of the AI model, performance benchmarking for effectiveness of the configuration of the hyperparameters based on historical data, a feedback-driven adjustment loop enabling the fine-tuned AI model to learn from each adjustment of the hyperparameters, and historical data continuously mined for insights.
Thus, the system 100 implements an AI-driven hyperparameter tuning mechanism that autonomously determines optimal settings, guided by historical data and performance metrics. This process is enhanced by predictive analytics, estimating the best hyperparameter configurations for various scenarios.
Design: The Design for hyperparameter optimization in this context is built around the idea of creating a self-evolving, intelligent system. This system is not only aware of its current performance metrics but also understands how different hyperparameters interact and affect overall performance. It is designed to learn from historical tuning data, making informed predictions about which hyperparameter configurations are likely to yield the best results.
Intelligent Learning from History: The system 100 with the hyperparameter optimization module analyzes historical data to understand the impact of various hyperparameter settings on model performance across different scenarios. This learning is not just about which settings worked best but also why they were effective, giving the system a form of 'intuition' about hyperparameter tuning.
Predictive Analytics for Forward-Looking Adjustments: By leveraging predictive models, the hyperparameter optimization module can forecast the potential impact of hyperparameter adjustments before they are implemented. This proactive approach allows for more strategic and less trial-and-error-based tuning.
Adaptability Across Multiple Components: The hyperparameter optimization module uniquely tunes the hyperparameters for the SRHIEM, fine-tuning processes, and domain alignment in unison. This holistic approach ensures that adjustments in one area complement rather than conflict with others, maintaining overall system harmony.
The fourth iterative feedback mechanism of the hyperparameter optimization module implements a multi-faceted optimization strategy, integrating machine learning techniques to predict hyperparameter efficacy, reinforced by real-world performance metrics. It combines methods from Bayesian optimization, reinforcement learning, and evolutionary algorithms to traverse the hyperparameter space intelligently. As understood, the hyperparameters dictate the structure and behavior of a model. They are adjusted settings to optimize the learning process. For example, this includes the learning rate, which determines how quickly a model updates its parameters in response to the training data, or the regularization term, which helps prevent overfitting.
ϕ_initial represents the starting hyperparameters which could be defaults or previously optimized values. M represents the performance metrics used to evaluate how well the model is performing with the given hyperparameters. H contains historical data which is critical for informing the optimization process. Objectives could include specific performance targets or constraints the model must adhere to.
ϕ_candidate is the set of hyperparameters that the system is currently considering, and ϕ_optimized is the final set of optimized hyperparameters that the algorithm outputs after the iterative process is complete. This process is designed to be exhaustive and dynamic, ensuring that the model is continuously refined and aligned with the desired performance objectives.
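By way of example and not limitation, the following Python sketch uses a decision-tree regressor (scikit-learn) as a surrogate for the predictive impact analysis, with simple benchmarking against the historical data; the surrogate choice, the perturbation scheme, and the helper evaluate are assumptions for illustration only.

    import random
    from sklearn.tree import DecisionTreeRegressor

    def optimize_hyperparameters(phi_initial, evaluate, history, objectives,
                                 n_candidates=20, max_iters=30):
        """Fourth iterative feedback mechanism (illustrative sketch).

        history is a list of (phi_vector, score) pairs (the historical data H);
        evaluate(phi) returns the performance metrics M as a scalar; objectives
        is the target score; phi vectors are lists of floats.
        """
        phi_optimized, best = phi_initial, evaluate(phi_initial)
        history.append((phi_initial, best))
        for _ in range(max_iters):
            # Predictive impact analysis: fit a surrogate on historical data.
            X, y = zip(*history)
            surrogate = DecisionTreeRegressor().fit(list(X), list(y))
            # Propose candidates by perturbing the incumbent configuration.
            candidates = [[v + random.gauss(0, 0.1) for v in phi_optimized]
                          for _ in range(n_candidates)]
            phi_candidate = max(candidates,
                                key=lambda c: surrogate.predict([c])[0])
            score = evaluate(phi_candidate)         # performance benchmarking
            history.append((phi_candidate, score))  # history mined for insights
            if score > best:                        # feedback-driven adjustment loop
                phi_optimized, best = phi_candidate, score
            if best >= objectives:
                break
        return phi_optimized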
As depicted in
Process for Human Feedback Integration: The process for integrating human feedback into the learning cycle of the AI system encompasses the following steps:
Correlation to all of the system's processes: The human feedback integration architecture and process are deeply intertwined with the SRHIEM, fine-tuning, domain alignment, and hyperparameter optimization processes. Each piece of feedback can lead to improvements across these components:
By systematically incorporating human expertise at multiple stages of the AI system's operation, the AI model becomes more aligned with expert expectations and industry standards, leading to outputs that are both technically sound and contextually relevant.
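By way of example and not limitation, the continual learning loop mentioned earlier could fold structured human feedback into a small correction applied to θ*, as in the Python sketch below; the feedback schema (a rating field) and the weighting are assumptions for illustration only.

    def integrate_human_feedback(theta_star, feedback_items, weight=0.1):
        """Continual learning loop (illustrative sketch).

        feedback_items is structured human feedback, assumed here to be dicts
        such as {"rating": r} with r in [-1.0, 1.0] scoring outputs of the
        domain optimized AI model; the average rating nudges theta*.
        """
        if not feedback_items:
            return theta_star
        avg = sum(f["rating"] for f in feedback_items) / len(feedback_items)
        return theta_star + weight * avg    # update theta* from human feedback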
Thus, the system 100 provides advanced parallelism by incorporating state-of-the-art parallel processing methods, enabling the system to handle large-scale computations and data processing efficiently. This includes parallel training and inference mechanisms, ensuring that the system remains scalable and responsive under heavy loads.
Technical Process Flow of the System for Automating Intuition based AI Model Fine-tuning and Optimization.
Traditional GenAI systems are often limited by their architectural constraints and struggle to generalize creative processes across different architectures. The system introduces a breakthrough in cross-architecture generative learning, enabling GenAI to apply creative intuition across a variety of model architectures, leading to more flexible and creative outputs. Existing Decoder-only GenAI models often require extensive data to generate high-quality content, which is not always available. With the addition of the Theta-DecNN's ability to generate creative content from limited data, the system enables GenAI to produce high-quality outputs in data-scarce environments, mimicking human creativity.
Further, combining different GenAI models such as GANs and GPTs to produce a singular creative output often leads to integration challenges. The system utilizes a novel approach to harmonize the strengths of various GenAI models via a human intuition based approach, leading to more nuanced and complex creative expressions.
GenAI models traditionally rely on large datasets for training, which is impractical for many real-world creative applications. The system introduces data-efficient tuning that allows GenAI to learn from limited datasets by emulating human-like inferential creativity, enabling it to generate quality content with less data.
Furthermore, optimizing hyperparameters for creativity in GenAI is often a tedious and manual process. The system automates this process using heuristics inspired by human creative experts, leading to more efficient and effective generative models. LLMs in GenAI struggle with generating content that captures deep semantic meanings and creative nuances. The system enhances LLMs with advanced semantic understanding, enabling GenAI to generate content with a level of semantic richness and creativity previously unattainable. Replicating the dynamic and complex patterns of human creativity in GenAI models is a significant challenge. The NN pipeline disclosed by the system reflects the human brain's creativity processes, allowing GenAI to generate more original and human-like creative outputs. GenAI systems typically lack the ability to adapt their creative processes to different scenarios. The system adapts its creative strategies based on the scenario, similar to human adaptive creativity, leading to more relevant and context-aware generative content.
Incorporating qualitative human-like judgment into the optimization of GenAI models is difficult due to the subjective nature of creativity. However, by quantifying qualitative creative judgments and integrating them into the training process, the system enables GenAI to refine its creative outputs in a manner akin to human evaluators.
The written description describes the subject matter herein to enable any person skilled in the art to make and use the embodiments. The scope of the subject matter embodiments is defined by the claims and may include other modifications that occur to those skilled in the art. Such other modifications are intended to be within the scope of the claims if they have similar elements that do not differ from the literal language of the claims or if they include equivalent elements with insubstantial differences from the literal language of the claims.
It is to be understood that the scope of the protection is extended to such a program and in addition to a computer-readable means having a message therein; such computer-readable storage means contain program-code means for implementation of one or more steps of the method, when the program runs on a server or mobile device or any suitable programmable device. The hardware device can be any kind of device which can be programmed including e.g. any kind of computer like a server or a personal computer, or the like, or any combination thereof. The device may also include means which could be e.g. hardware means like e.g. an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or a combination of hardware and software means, e.g. an ASIC and an FPGA, or at least one microprocessor and at least one memory with software processing components located therein. Thus, the means can include both hardware means, and software means. The method embodiments described herein could be implemented in hardware and software. The device may also include software means. Alternatively, the embodiments may be implemented on different hardware devices, e.g. using a plurality of CPUs.
The embodiments herein can comprise hardware and software elements. The embodiments that are implemented in software include but are not limited to, firmware, resident software, microcode, etc. The functions performed by various components described herein may be implemented in other components or combinations of other components. For the purposes of this description, a computer-usable or computer readable medium can be any apparatus that can comprise, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
The illustrated steps are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope of the disclosed embodiments. Also, the words “comprising,” “having,” “containing,” and “including,” and other similar forms are intended to be equivalent in meaning and be open ended in that an item or items following any one of these words is not meant to be an exhaustive listing of such item or items or meant to be limited to only the listed item or items. It must also be noted that as used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise.
Furthermore, one or more computer-readable storage media may be utilized in implementing embodiments consistent with the present disclosure. A computer-readable storage medium refers to any type of physical memory on which information or data readable by a processor may be stored. Thus, a computer-readable storage medium may store instructions for execution by one or more processors, including instructions for causing the processor(s) to perform steps or stages consistent with the embodiments described herein. The term “computer-readable medium” should be understood to include tangible items and exclude carrier waves and transient signals, i.e., be non-transitory. Examples include random access memory (RAM), read-only memory (ROM), volatile memory, nonvolatile memory, hard drives, CD ROMs, DVDs, flash drives, disks, and any other known physical storage media.
It is intended that the disclosure and examples be considered as exemplary only, with a true scope of disclosed embodiments being indicated by the following claims.