The invention involves system on chip (SoC) and network on chip (NoC) semiconductor technology. The system is a three dimensional (3D) supercomputer on a chip (SCOC) and involves multiple processors on silicon (MPSOC) and a system on a programmable chip (SOPC). Components of the present invention involve micro-electro-mechanical systems (MEMS) and nano-electro-mechanical systems (NEMS). In particular, the reconfigurable components of the SoC are adaptive and represent evolvable hardware (EHW), consisting of field programmable gate array (FPGA) and complex programmable logic device (CPLD) architectures. The system has elements of intelligent microsystems that signify bio-inspired computing behaviors, exemplified in hardware-software interactivity. Because the system is a hybrid heterostructure semiconductor device that incorporates EHW, intelligent behaviors and synthetic computer interconnect network fabrics, the system is an exemplar of polymorphous computing architecture (PCA) and cognitive computing.
The challenge of modern computing is to build economically efficient chips that incorporate ever more transistors, keeping pace with Moore's law, which projects a doubling of transistor density roughly every two years. The limits of semiconductor technology will constrain this growth in the next few years, as transistors become smaller and chips become bigger and hotter. The semiconductor industry has developed the system on a chip (SoC) as a way to continue high performance chip evolution.
So far, there have been four main ways to construct a high performance semiconductor. First, chips have multiple cores. Second, chips optimize software scheduling. Third, chips utilize efficient memory management. Fourth, chips employ polymorphic computing. To some degree, all of these models evolve from the Von Neumann computer architecture developed after WWII, in which a processor's logic component fetches instructions from memory.
The simplest model for increasing chip performance employs multiple processing cores. By combining eighty cores on a single die, Intel has created a prototype teraflop chip design. In essence, this architecture takes a parallel computing approach similar to supercomputing parallel models. Like some supercomputing applications, this approach is limited to optimizing arithmetic-intensive applications such as modeling.
The Tera-op, Reliable, Intelligently Adaptive Processing System (TRIPS), developed at the University of Texas with funding from DARPA, focuses on software scheduling optimization to produce high performance computing. This model's “push” system uses data availability to fetch instructions, thereby putting additional pressure on the compiler to organize the parallelism in the high speed operating system. There are three levels of concurrency in the TRIPS architecture, including instruction-level parallelism (ILP), thread-level parallelism (TLP) and data-level parallelism (DLP). The TRIPS processor will process numerous instructions simultaneously and map them onto a grid for execution in specific nodes. The grid of execution nodes is reconfigurable to optimize specific applications. Unlike the multi-core model, TRIPS is a uniprocessor model, yet it includes numerous components for parallelization.
The third model is represented by the Cell microprocessor architecture developed jointly by Sony, Toshiba and IBM (the STI consortium). The Cell architecture uses a novel memory “coherence” architecture in which latency is overcome with a bandwidth priority and in which power usage is balanced with peak computational usage. This model integrates a microprocessor design with eight coprocessor elements, called “synergistic processor elements” (SPEs). The Cell uses an interconnection bus with four unidirectional data flow rings to connect each of four processors with their SPEs, thereby meeting a teraflop performance objective. Each SPE is capable of 32 GFLOPS of performance in the 65 nm version, which was introduced in 2007.
The MOrphable Networked Micro-ARCHitecture (MONARCH) uses six reduced instruction set computing (RISC) microprocessors, twelve arithmetic clusters and thirty-one memory clusters to achieve a 64 GFLOPS performance with 60 gigabytes per second of memory. Designed by Raytheon and USC/ISI from DARPA funding, the MONARCH differs distinctly from other high performance SoCs in that it uses evolvable hardware (EHW) components such as field programmable compute array (FPCA) and smart memory architectures to produce an efficient polymorphic computing platform.
MONARCH combines key elements in the high performance processing system (HPPS) with Data Intensive Architecture (DIVA) Processor in Memory (PIM) technologies to create a unified, flexible, very large scale integrated (VLSI) system. The advantage of this model is that reprogrammability of hardware from one application-specific integrated circuit (ASIC) position to another produces faster response to uncertain changes in the environment. The chip is optimized to be flexible to changing conditions and to maximize power efficiency (3-6 GFLOPS per watt). Specific applications of MONARCH involve embedded computing, such as sensor networks.
These four main high performance SoC models have specific applications for which they are suited. For instance, the multi-core model is optimized for arithmetic applications, while MONARCH is optimized for sensor data analysis. However, all four also have limits.
The multi-core architecture has a problem of synchronization of the parallel micro-processors that conform to a single clocking model. This problem limits their responsiveness to specific types of applications, particularly those that require rapid environmental change. Further, the multi-core architecture requires “thread-aware” software to exploit its parallelism, which is cumbersome and produces quality of service (QoS) problems and inefficiencies.
By emphasizing its compiler, the TRIPS architecture has the problem of optimizing the coordination of scheduling. This bottleneck prevents peak performance over a prolonged period.
The Cell architecture requires constant optimization of its memory management system, which leads to QoS problems.
Finally, MONARCH depends on static intellectual property (IP) cores that are limited to combinations of specified pre-determined ASICs to program its evolvable hardware components. This restriction limits the extent of its flexibility, which was precisely its chief design advantage.
In addition to SoC models, there is a network on a chip (NoC) model, introduced by Arteris in 2007. Targeted at the communications industry, the 45 nm NoC is a form of SoC that uses IP cores in FPGAs for reprogrammable functions and that features low power consumption for embedded computing applications. The chip is optimized for on-chip communications processing. Despite this focus, particularly on wireless communications, the chip retains limits of flexibility that it was designed to overcome, primarily in its deterministic IP core application software.
Various implementations of FPGAs represent reconfigurable computing. The most prominent examples are the Xilinx Virtex-II Pro and Virtex-4 devices that combine one or more microprocessor cores in an FPGA logic fabric. Similarly, the Atmel FPSLIC processor combines an AVR processor with programmable logic architecture. The Atmel microcontroller has the FPGA fabric on the same die to produce a fine-grained reconfigurable device. These hybrid FPGAs and embedded microprocessors represent a generation of system on a programmable chip (SOPC). While these hybrids are architecturally interesting, they possess the limits of each type of design paradigm, with restricted microprocessor performance and restricted deterministic IP core application software. Though they have higher performance than a typical single core microprocessor, they are less flexible than a pure FPGA model.
All of these chip types are two dimensional planar microsystem devices. A new generation of three dimensional integrated circuits and components is emerging that is noteworthy as well. Stacking two or more 2D ICs in a single fabricated package required a solution to the problem of creating vertical connections between the layers. IBM solved this problem by developing “through silicon vias” (TSVs), vertical connections “etched through the silicon wafer and filled with metal.” Using TSVs to create 3D connections allows the addition of many more pathways between 2D layers. However, this approach of stacking existing 2D planar IC layers is generally limited to three or four layers. While TSVs substantially shorten the distance that information traverses, this stacking approach merely extends the 2D approach into a static 3D model.
In U.S. Pat. No. 5,111,278, Echelberger describes a 3D multi-chip module system in which layers in an integrated circuit are stacked by using aligned TSVs. This early 3D circuit model represents a simple stacking approach. U.S. Pat. No. 5,426,072 provides a method to manufacture a 3D IC from stacked silicon on insulation (SOI) wafers. U.S. Pat. No. 5,657,537 presents a method of stacking two dimensional circuit modules and U.S. Pat. No. 6,355,501 describes a 3D IC stacking assembly technique.
Recently, 3D stacking models have been developed on chip in which several layers are constructed on a single complementary metal oxide semiconductor (CMOS) die. Some models have combined eight or nine contiguous layers in a single CMOS chip, though this model lacks integrated vertical planes. MIT's microsystems group has created 3D ICs that contain multiple layers and TSVs on a single chip.
3D FPGAs have been created at the University of Minnesota by stacking layers of single planar FPGAs. However, these chips have only adjacent layer connectivity.
3D memory has been developed by Samsung and by BeSang. The Samsung approach stacks eight 2-Gb wafer-level processed stack packages (WSPs) using TSVs in order to minimize interconnects between layers and increase information access efficiency. The Samsung TSV method uses lasers to etch holes that are later filled with copper. BeSang combines 3D package-level stacking of memory with a logic layer of a chip device using metal bonding.
See also U.S. Pat. No. 5,915,167 for a description of a 3D DRAM stacking technique, U.S. Pat. No. 6,717,222 for a description of a 3D memory IC, U.S. Pat. No. 7,160,761 for a description of a vertically stacked field programmable nonvolatile memory and U.S. Pat. No. 6,501,111 for a description of a 3D programmable memory device.
Finally, in the supercomputing sphere, Cray developed the T3D, a three dimensional supercomputer consisting of 2048 DEC Alpha chips in a torus networking configuration.
In general, all of the 3D chip models merely combine two or more 2D layers. They all represent a simple bonding of current technologies. While planar design chips are easier to make, they are not generally high performance.
Prior systems demonstrate performance limits, programmability limits, multi-functionality limits and logic and memory bottlenecks. There are typically trade-offs of performance and power.
The present invention views the system on a chip as an ecosystem consisting of significant intelligent components. The prior art for intelligence in computing consists of two main paradigms. On the one hand, the view of evolvable hardware (EHW) uses FPGAs as examples. On the other hand, software elements consist of intelligent software agents that exhibit collective behaviors. Both of these hardware and software aspects take inspiration from biological domains.
First, the intelligent SoC borrows from biological concepts of post-initialized reprogrammability that resembles a protein network that responds to its changing environmental conditions. The interoperation of protein networks in cells is a key behavioral paradigm for the iSoC. The slowly evolving DNA root structure produces the protein network elements, yet the dynamics of the protein network are interactive with both itself and its environment.
Second, the elements of the iSoC resemble the subsystems of a human body. The circulatory system represents the routers, the endocrine system is the memory, the skeletal system is comparable to the interconnects, the nervous system is the autonomic process, the immune system provides defense and security as it does in a body, the eyes and ears are the sensor network and the muscular system is the bandwidth. In this analogy, the brain is the central controller.
For the most part, SoCs require three dimensionality in order to achieve high performance objectives. In addition, SoCs require multiple cores that are reprogrammable so as to maintain flexibility for multiple applications. Such reprogrammability allows the chip to be implemented cost effectively. Reprogrammability, moreover, allows the chip to be updatable and future-proof. In some versions, SoCs need to be power efficient for use in embedded mobile devices. Because they will be prominent in embedded devices, they also need to be fault tolerant. By combining the best aspects of deterministic microprocessor elements with indeterministic EHW elements, an intelligent SoC efficiently delivers superior performance.
While the design criteria are necessary, economic efficiency is also required. Computational economics reveals a comparative cost analysis that includes efficiency maximization of (a) power, (b) interconnect metrics, (c) transistor per memory metrics and (d) transistor per logic metrics.
Problems that the System Solves
Optimization problems that the system solves can be divided into two classes: bi-objective optimization problems (BOOPs) and multi-objective optimization problems (MOOPs).
BOOPs consist of trade-offs in semiconductor factors such as (a) energy consumption versus performance, (b) number of transistors versus heat dissipation, (c) interconnect area versus performance and (d) high performance versus low cost.
Regarding MOOPs, the multiple factors include: (a) thermal performance (energy/heat dissipation), (b) energy optimization (low power use), (c) timing performance (various metrics), (d) reconfiguration time (for FPGAs and CPLDs), (e) interconnect length optimization (for energy delay), (f) use of space, (g) bandwidth optimization and (h) cost (manufacture and usability) efficiency. The combination of solutions to trade-offs of multiple problems determines the design of specific semiconductors. The present system presents a set of solutions to these complex optimization problems.
One of the chief problems is to identify ways to limit latency. Latency represents a bottleneck in an integrated circuit when the wait to complete a task slows down the efficiency of the system. Examples of causes of latency include interconnect routing architectures, memory configuration and interface design. Limiting latency problems requires the development of methods for scheduling, anticipation, parallelization, pipeline efficiency and locality-priority processing.
The architecture of a system on a chip (SoC) provides the main structure of the circuitry, but the functioning of the chip is critical for providing operational effectiveness. There are numerous advantages to a 3D multi-functional reconfigurable SoC. First, the 3D iSoC operation is asymmetric because its various parts operate independently. Second, it is highly parallel and has multiple interoperational parts that function simultaneously. Third, it is self-regulating, with variable modulation of activities. Fourth, it is reconfigurable. Finally, it exhibits polymorphous computing behaviors for continuous reorganization plasticity.
There are two sources of polymorphous computing. One of the sources of polymorphous computing is flexible hardware reconfiguration such as in a CPLD or FPGA. The second source of polymorphous computing is based on flow control. The present invention describes the distinctive features of the 3D iSoC pertaining to functional dynamics.
The iSoC uses a multi-agent system (MAS) to coordinate the collective behaviors of software agents to perform specific actions to solve problems. This integrated software system automates numerous iSoC operations, including self-regulation and self-diagnosis of multiple autonomic regulatory functions and the activation of multiple reconfigurable, and interactive, subsystems.
Intelligent mobile software agents (IMSAs) cooperate, collaborate and compete in order to solve optimization problems in the 3D iSoC. Since the system nodes operate autonomously within the various subsystems, the IMSAs perform numerous functions from communication to decision-making in parallel.
The system uses a library of metaheuristics to perform specific actions and to solve complex MOOPs. These metaheuristics include hybrid evolutionary computation algorithms, swarm intelligence algorithms, local search algorithms and artificial immune system algorithms. These learning techniques are applied to optimization problems in the framework of a reconfigurable polymorphous computing architecture.
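By way of a hedged illustration, one of the simpler entries in such a metaheuristic library — a greedy local search — may be sketched as follows. The cost function here is a hypothetical weighted sum of two trade-off objectives, chosen only to make the sketch self-contained; it is not the claimed implementation.

```python
import random

def local_search(cost, neighbors, start, iterations=1000, seed=7):
    """Greedy local search: repeatedly sample a neighbor of the current
    candidate and move to it whenever it has lower cost."""
    rng = random.Random(seed)
    current = start
    for _ in range(iterations):
        candidate = rng.choice(neighbors(current))
        if cost(candidate) < cost(current):
            current = candidate
    return current

# Toy bi-objective trade-off, scalarized as a weighted sum (hypothetical):
# one term models a performance target, the other an energy penalty.
cost = lambda x: abs(x - 3) + 0.1 * x
neighbors = lambda x: [x - 1, x + 1]
best = local_search(cost, neighbors, start=10)
```

Hybrid evolutionary, swarm and artificial immune system algorithms would replace the neighbor-sampling step with population-based operators, but the library interface — a cost, a move generator and an iteration budget — remains the same shape.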
The operational aspects of the present invention involve self-regulating flow, variable internodal functionality, independent nodal operation, hybrid interaction of hardware and software, asynchronous clocking between clusters and spiking flows for plasticity behaviors. Novel metaheuristics are applied to solve BOOPs and MOOPs, by using modeling scenarios. The present system also employs predictive techniques for optimal routing. Finally, the system uses software agents to perform collective behaviors of automated programming.
The system uses metaheuristics to guide hardware evolution. In particular, it uses a hybrid artificial immune system to model chip optimization to solve MOOPs. The system also uses autonomic computing processes to regulate chip functions.
The chip has sufficient redundancy with multiple nodes to be fault tolerant: If one node is damaged, others remodulate the system and perform tasks.
Software agents perform numerous coordinated functions in the chip. The pro-active adaptive operation of the chip that provides its evolvable characteristics is facilitated by a combination of novel features involving software agents to solve MOOPs.
The combination of iSoCs into networks produces a flexible high performance system that exhibits self-organizing behaviors.
The system uses metaheuristic optimization algorithms for hyper-efficiency. The self-regulating aspects of network logic are applied to the unified 3D iSoC. The combination of novel features in the iSoC allows it to perform autonomous functions such as the internal autonomic computer network functions of self-diagnosis, self-regulation, self-repair and self-defense.
This 3D iSoC disclosure represents a second generation of polymorphous computing architecture (PCA).
Programmability in the present invention involves the employment of software agents, which exhibit collective behaviors of autonomous programming. This feature allows the reprogrammable nodes to be coordinated and self-organized. Further, this allows the scaling of multiple iSoCs in complex self-organizing networks.
Because of its modular characteristics, the current system is fault tolerant. If part of the system is dysfunctional, other parts of the system modulate their functionality for mission completion.
The disclosure describes solutions to problems involving operational functionality in a 3D iSoC, particularly involving parallelization, integration of multiple reprogrammable and self-assembling components, and dynamics.
(1) Independent Operation of Nodes in 3D SoC
Each node in the 3D SoC functions independently. The multiple processing nodes in a neighborhood cluster operate as “organs” performing specific functions at a particular time. There are interoperational efficiencies of using a parallel multi-node on-chip computing system, particularly massive parallelism in a single iSoC.
The use of a combination of multiple nodes in the iSoC provides overall multi-functionality, while the iSoC performs specific functions within each neighborhood cluster. The iSoC's multi-functional network fabric uses multiple computer processing capabilities that produce superior performance and superior flexibility compared to other computing architectures. These processes are optimized in the 3D environment.
(2) Variable Operation of Asymmetric Node Clusters in 3D SoC
The composition of each neighborhood cluster varies for each task. Algorithms are employed to configure the composition of neighborhood clusters from the potential node configuration in order to optimize the overall system performance of the 3D iSoC. Because the node composition of the neighborhood clusters periodically changes, each octahedron sector has jurisdictional autonomy for control of reconfigurability of its component structures. While the node clusters periodically reassemble their cluster configurations, the nodes in the neighborhood clusters are coordinated to operate together.
Whole sections of the 3D iSoC can be offline, or minimally active, and the chip fabric adjusts. These collective effects of several neighborhood sections produce aspects of plasticity behavior.
Several nodes in a neighborhood are synchronized. The neighborhood then adds nodes in adjoining regions as needed on-demand to solve MOOPs. Each neighborhood cluster continuously restructures by synchronizing the added nodes. Specific operations are emphasized at one equilibrium point in a process and then shifted at another equilibrium point.
One advantage of using this operational model is that specific dysfunctional processor nodes can be periodically taken off line and then the overall system rapidly reroutes around these nodes. The system is therefore continuously rebalancing its load, both within individual neighborhoods and across all neighborhoods in the chip. This rebalancing capability provides a critical redundancy that allows for fault tolerance in order to overcome limited damage to parts of the chip.
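The rerouting described above can be sketched as a shortest-path search that simply excludes offline nodes; the ring adjacency and node identifiers below are hypothetical placeholders for a neighborhood's interconnect.

```python
from collections import deque

def reroute(adjacency, source, target, offline):
    """Breadth-first search for a shortest path that avoids offline nodes."""
    offline = set(offline)
    if source in offline or target in offline:
        return None
    frontier = deque([[source]])
    seen = {source}
    while frontier:
        path = frontier.popleft()
        if path[-1] == target:
            return path
        for nxt in adjacency[path[-1]]:
            if nxt not in seen and nxt not in offline:
                seen.add(nxt)
                frontier.append(path + [nxt])
    return None

# Hypothetical 4-node neighborhood ring: 0-1-2-3-0.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
path = reroute(ring, 0, 2, offline={1})  # route around failed node 1
```

When node 1 is taken off line, traffic from node 0 to node 2 is redirected through node 3, preserving connectivity with the surviving redundancy.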
(3) Central Node Activation of Multiple Nodes in 3D SoC Internode Network
Though the 3D iSoC has a cubic structure with an octagonal neighborhood configuration, the central node drives internodal activation. The role of the central node is to regulate the neighborhood nodes, as a system manager. The neighborhood subsystems are autonomous clusters that interact with the central node to obtain instructions and to provide periodic operational updates.
Activation of computational processes in the central node affects the operational function of the neighborhood nodes. The central node has greater computational capacity than other individual nodes in the iSoC and controls the eight neighborhood clusters consisting of a total of 34 nodes.
The central node receives data inputs from nodes in neighborhood clusters. The central node also sends data and instructions to the nodes in the neighborhood clusters. The interactions between the cluster nodes and the central node create a dynamic process.
(4) Polymorphous Computing Using Simultaneous Multi-Functional 3D IC Operation in Reconfigurable 3D SoC
Polymorphous computing involves the operation of multiple reconfigurable circuits in a SoC fabric. Polymorphous computing allows an SoC's rapid adaptation to uncertain and changing environments.
The 3D iSoC exhibits polymorphous computing functionality because it (a) uses multiple reconfigurable hardware components in the form of multiple interacting FPGA nodes and (b) employs a control flow process that exhibits reconfigurable behaviors. The iSoC continuously reprograms multiple simultaneous operations in the various neighborhood nodes to optimize functionality. The continuous optimization and reprioritization of multiple operations enable the iSoC to engage in multi-functional behaviors and to solve multiple MOOPs concurrently.
An analogy to the iSoC multifunctional operation is a symphony, which exhibits unified coordinated operation to successfully achieve an objective. The multiple parts of the iSoC continuously harmonize by achieving multiple equilibrium points in a progression of computational stages to solve complex problems.
The subsystems in the neighborhood clusters of the iSoC engage in multiple simultaneous prototyping by continuously reconfiguring their evolvable hardware nodes. Though the overall chip fabric is seen as an integrated system, the neighborhood clusters are scalable and variable in composition, like a subdivision that builds out and then recedes, in order to modulate the circuitry work flow demands.
(5) Self-Regulating Flow Mechanisms for Polymorphous Computing in a 3D SoC
Polymorphous computing requires modulation of the work flow between multiple interoperating flexible computing nodes. The 3D iSoC network fabric constantly reprioritizes tasks to auto-restructure operational processes in order to optimize task solutions. The system continuously routes multiple tasks to various nodes for the most efficient processing of optimization solutions. Specifically, the system sorts, and resorts, problems to various nodes so as to obtain solutions. The system is constantly satisfying different optimality objectives and reassigning problems to various nodes for problem solving. At the same time, the reconfigurable nodes constantly evolve their hardware configurations in order to optimize these solutions in the most efficient ways available.
The highest priority problem is routed to the closest available node. In addition, specific problem types are matched to the closest available node that can supply a particular computing capability to optimally solve these problems.
The challenge for the central node is to efficiently route traffic flows to various parts of the iSoC. The central node constantly tracks the present configuration of the evolving nodes and routes problems to each respective node to match its configuration.
If the nodes require transformation in order to optimize the solutions, the neighborhood nodes will reconfigure. The continuous plasticity effects of the changing solution requirements for solving MOOPs in the iSoC network fabric create a complex adaptive process. Overall, the iSoC network is a self-regulating system that unifies numerous operational techniques.
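The central node's matching of problems to node configurations may be sketched as a simple dispatcher; the capability labels, node names and hop distances below are hypothetical, and a full embodiment would also queue unmatched problems or trigger reconfiguration.

```python
def dispatch(problems, node_configs, distance):
    """Assign each problem to the closest available node whose current
    configuration supplies the problem's required capability."""
    assignments = {}
    busy = set()
    for prob_id, capability in problems:
        candidates = [n for n, caps in node_configs.items()
                      if capability in caps and n not in busy]
        if not candidates:
            continue  # would queue, or request a node reconfiguration
        node = min(candidates, key=distance)  # closest capable node wins
        assignments[prob_id] = node
        busy.add(node)
    return assignments

# Hypothetical node configurations and hop distances from the central node.
configs = {"n1": {"fft"}, "n2": {"fft", "crypto"}, "n3": {"crypto"}}
hops = {"n1": 1, "n2": 2, "n3": 3}
out = dispatch([("p1", "fft"), ("p2", "crypto")], configs, hops.get)
```

As nodes evolve their hardware configurations, the `node_configs` map is refreshed and the same dispatch logic continues to match problems to the new configurations.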
(6) Variable Modulation in 3D SoC Asynchronous Clocking Architecture
Since the neighborhood clusters all operate independently, they use variable clock rates. This clock rate variability between nodes allows a tremendous benefit in voltage modulation to match the operational rate. When the work load is moderate, the clocks modulate to a minimal rate so as to save energy, while at peak work load, the clocks spike to maximum working rates. This variable modulation of clocking usefully segregates the individual neighborhood clusters as they operate autonomously.
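The per-cluster clock modulation described above can be sketched as selecting the lowest rate that still covers the offered load; the rate ladder and load figures are hypothetical illustrations, not claimed operating points.

```python
def modulate_clock(load, rates=(100, 400, 800, 1600)):
    """Pick the lowest clock rate (MHz) whose share of peak capacity
    covers the offered load, where load is a fraction of capacity at
    the top rate."""
    for rate in rates:
        if load <= rate / rates[-1]:
            return rate
    return rates[-1]

# Each neighborhood cluster modulates independently (hypothetical loads).
cluster_loads = {"A": 0.05, "B": 0.45, "C": 1.0}
settings = {name: modulate_clock(load) for name, load in cluster_loads.items()}
```

A lightly loaded cluster idles at the minimum rate to save energy, while a cluster at peak demand spikes to the maximum working rate, matching the variable modulation behavior described above.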
The linkage of the neighborhoods occurs via a globally asynchronous, locally synchronous (GALS) process. In this model, whole neighborhoods may become dormant when the load does not require their activity, while the overall system remodulates the iSoC network fabric as demand warrants. The variable clocking of each autonomous neighborhood, and node, calibrates the asynchronous components of the iSoC network system as a whole.
The GALS process connects the various neighborhoods to each other and to the central node. The overall iSoC clock speed is an aggregate of the modulating clock rates of the various neighborhoods.
(7) Hybrid Parallelization for Concurrent Operations in 3D SoC Using Task Graphs
The use of multiple computational nodes in the 3D iSoC network fabric constitutes a highly parallel system. The benefits of computational parallelism lie in the dividing of tasks into manageable units for simultaneous computability and faster results. Initially, the present invention uses global level parallelization that is coarse-grained and focuses on dividing tasks to the specific neighborhood clusters. At a more refined level, the system uses node-level parallelization that is fine-grained in order to solve MOOPs. The combination of the global and the local levels produces a hybrid parallelization for concurrent operations. In one embodiment of the invention, MOOPs are divided into multiple BOOPs, which are then allocated to specific nodes for rapid parallel problem solving.
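The embodiment just described — dividing a MOOP into multiple BOOPs for parallel solving — may be sketched as pairwise decomposition followed by allocation to nodes; the objective names, node names and round-robin allocation are hypothetical simplifications.

```python
from itertools import combinations

def decompose_moop(objectives):
    """Split a multi-objective problem into all pairwise bi-objective
    subproblems (BOOPs), one per pair of objectives."""
    return [frozenset(pair) for pair in combinations(objectives, 2)]

def allocate(boops, nodes):
    """Round-robin allocation of BOOPs across the available nodes
    (a stand-in for the fine-grained node-level assignment)."""
    return {boop: nodes[i % len(nodes)] for i, boop in enumerate(boops)}

moop = ["power", "timing", "area"]          # hypothetical objectives
boops = decompose_moop(moop)                # 3 pairwise trade-offs
plan = allocate(boops, ["n1", "n2"])
```

Coarse-grained parallelism would first split the BOOP list across neighborhood clusters; the fine-grained step above then assigns each BOOP to an individual node.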
The system uses task graphs to assign optimization problems to specific neighborhoods and nodes. The list of problem-based priorities is scheduled in the task graphs, which are then matched to a specific node in a particular configuration. As the nodes periodically reconfigure on-demand in order to more efficiently perform tasks or solve MOOPs, the task graphs are updated with new information on the availability of new node configurations. Since the process is evolutionary, the task graphs constantly change. The task graphs require continuous rescheduling in order to accommodate the changing node reconfigurations. The particular challenge of the task graph logic is to efficiently specify the concurrency of tasks across the 3D iSoC network fabric as the system continuously recalibrates.
As one neighborhood simulates the optimal operations and schedule of the performance of an operation, it updates the task graph system. This scheduling process itself stimulates the continuous transformation of the evolvable hardware in the individual nodes. This process is co-evolutionary and further adaptive to a changing environment, thereby satisfying polymorphous computing constraints.
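The task-graph scheduling described above may be sketched, under the simplifying assumptions of a static dependency graph and round-robin node assignment (the task names, graph and node list are hypothetical); rescheduling after a node reconfiguration amounts to calling the scheduler again with the updated node list.

```python
from graphlib import TopologicalSorter

def schedule(task_graph, ready_nodes):
    """Walk the task graph in dependency order, assigning each task to a
    node; graph maps each task to the set of tasks it depends on."""
    order = TopologicalSorter(task_graph)
    nodes = list(ready_nodes)
    return [(task, nodes[i % len(nodes)])
            for i, task in enumerate(order.static_order())]

# Hypothetical graph: each key depends on the listed predecessor tasks.
graph = {"fuse": {"filter", "transform"}, "filter": {"load"},
         "transform": {"load"}, "load": set()}
plan = schedule(graph, ["n1", "n2"])
# When nodes reconfigure, rebuild the plan with the new node availability:
replan = schedule(graph, ["n2"])
```

The topological order guarantees that no task is dispatched before its predecessors, while the node list captures the currently available configurations.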
(8) Accelerated Transformation of Reconfigurable Application Layer of Node in 3D SoC
Each node consists of multiple circuit layers in a 3D configuration. The application layer reflects the specific functionality of each computational node. Since nodes in the 3D iSoC are reconfigurable, the transformation of the application layer in these EHW nodes is organized so they perform their functions in an accelerated way. Specifically, the application layer is the fastest, and simplest, to structurally modify. In some rapid transformation cases, the reconfigurable application layer is the only layer that is modified. In other cases, the application layer is transformed first, and the other layers are modified later.
This accelerated transformation of the reconfigurable application layer of a node in a 3D SoC is useful for more efficiently processing specific applications. This process allows the rapid reprogrammability of nodes in the iSoC on demand.
By calibrating the transformation of the application layers of multiple reconfigurable nodes, the system further accelerates continuous sequential reprogrammable features of the 3D iSoC network.
(9) Internodal Coordination of Spiking Flows for Plasticity in 3D SoC
The 3D iSoC network fabric is constantly readjusting for optimum self-organization by coordinating program instructions with external feedback. The chip exhibits intelligence and is active rather than static and passive.
The SoC performs plasticity behaviors by continuously modulating the functionality of the reconfigurable hardware components. As the subsystems are continuously modulated, the overall system exhibits plasticity.
While the system is rarely at peak performance over a continuous period of time, network traffic flows periodically spike within specific neighborhoods at peak capacity. Though spiking occurs in key nodes at key times, the system constantly modulates load-balancing between neighborhood clusters.
In this sense, the 3D iSoC overall emulates aspects of neural network behaviors.
The present disclosure presents solutions to problems involving software components in a 3D iSoC, including instruction parallelization, metaheuristic applications and MAS automation.
(1) Multi-Agent System Applied to 3D SoC
Intelligent mobile software agents (IMSAs) are units of software code that move from one location in a network to another. IMSAs are organized in collectives in a multi-agent system (MAS) to coordinate behaviors in the 3D iSoC. Specifically, IMSAs coordinate the functions of the multiple circuit nodes in the SoC network. IMSAs guide FPGA behaviors and coordinate the functions of reconfigurable nodes. IMSAs anticipate and model problems by using stochastic processes.
Each node has its own IMSA collective that performs routine functions of negotiating with other nodes' IMSA collectives.
In most cases, cooperating IMSAs coordinate non-controversial functions of the iSoC. In solving more complex optimization problems that require decisions, competitive IMSAs negotiate solutions between nodes to satisfy goals. Competitive IMSAs use stochastic processes to negotiate using an auction model in a time-sensitive environment.
IMSAs in the MAS represent an intermediary layer of software functionality between the higher level abstract language and the application level operations. A compiler is integrated into each node to process the IMSA code as well as higher and lower level software languages. First order predicate logic is also applied to IMSAs.
(2) Coordination of Internodal Network for Compiler Architecture by Routing Code to Parallel Asynchronous Nodes in 3D SoC
The coordination of intra-nodal compilers in the parallel network provides routing solutions to software code in the SoC. While each node has the ability to compile some software code, each neighborhood uses a node compiler to coordinate the behaviors of its entire cluster. The neighborhood compiler uses asynchronous routing to continuously optimize the software routing process to multiple parallel nodes within the neighborhood cluster and between neighborhood routers. On the one hand, the neighborhood compilers divide the IMSA functions among specific nodes as code is pushed from the neighborhood router to the nodes. On the other hand, IMSAs are coordinated and unified in the neighborhood compiler as they arrive from the individual neighborhood nodes.
Each compiler uses a metaheuristics engine to generate specific hybrid learning algorithms to solve MOOPs and then to route the IMSAs with instructions to specific locations based on the metaheuristic algorithm solutions.
(3) Autonomic Computing in 3D SoC Using Software Agents
Autonomic computing emulates the human autonomic nervous system in which specific regulatory functions such as heartbeat, breathing and swallowing are automatically coordinated. Autonomic computing models that emphasize the self-diagnosis, self-repair and self-defense of network systems have been applied to the network computing environment.
Autonomic computing is applied in the present system to the network fabric of the 3D iSoC primarily for the self-diagnosis and self-regulation of multiple parallel internal SoC functions.
By using sensors that are embedded into individual node circuits, the SoC performs continuous self-assessment procedures. The individual nodes keep an assessment record of all events and record them to internal memory. This process is useful for the purpose of tracking the operational record of indeterministic FPGA performance.
Autonomic computing is particularly useful for self-defense. By employing metaheuristics that emulate the human immune system, hybrid artificial immune system (AIS) processes are able to anticipate, detect, defend and eliminate malicious code and to provide strong network security mechanisms.
The autonomic computing processes are integrated into the SoC by collectives of IMSAs to provide self-regulatory functions. These SoC regulatory processes are centered in the central master node in order to consolidate the processes of the various neighborhood clusters.
The combination of multiple autonomic computing solutions applied to the 3D iSoC represents a form of self-awareness or cognitive intelligence in a network computing fabric on a chip.
(4) Metaheuristics for Solving Multi-Objective Optimization Problems in 3D SoC
The present system addresses the challenge of solving multiple aggregate optimization problems. The 3D iSoC is useful for solving multiple parallel MOOPs. In dynamic environments, the optimal solution changes as the conditions change. The best way to solve optimization problems in this context is to employ multiple parallel reconfigurable processing elements that interact with each other and with the environment. The iSoC makes continuous attempts to find techniques that provide solutions to evolving MOOPs.
By simultaneously employing multiple hybrid metaheuristics to continually optimize operations, the iSoC is more likely to satisfy its MOOP objectives within critical timing constraints.
The iSoC employs a library of hybrid and adaptive metaheuristics algorithms to test various learning techniques to solve MOOPs on-demand. MOOP solution options are constantly pruned as the set of families of options evolve given changing conditions and objectives. The system constantly seeks the best set of options to satisfy multiple changing constraints generated both by the evolving system itself and the evolving environment.
Metaheuristics are particularly useful when applied to evolvable hardware (EHW). The FPGAs in the SoC are continuously tuned by applying metaheuristic algorithms to solve optimization problems.
Metaheuristic techniques that are employed in the iSoC include genetic algorithms, local search (tabu search, scatter search and adaptive memory programming), swarm intelligence (ant colony optimization [ACO], particle swarm optimization [PSO] and stochastic diffusion search [SDS]), and artificial immune system (AIS) algorithms. Each of these metaheuristic algorithms solves a different type of optimization problem, and each has its own strengths and weaknesses. Combining the best elements of each of these models, including hybrid configurations, with the use of the metaheuristics library allows the IMSAs in the network fabric of the 3D iSoC to accomplish a broad range of tasks and to offer solutions to many optimization problems.
Various metaheuristic algorithms are used simultaneously by the parallel processes of the 3D iSoC. Different nodes employ various metaheuristics in parallel in order to solve MOOPs within time constraints. For example, multiple FPGA nodes use metaheuristics to guide their reprogramming functions. These functions are coordinated, and the iSoC shares tasks collectively and continuously modifies its programming functions as the process continues to task conclusion. Multi-node optimization of operational functions is a specific feature of the iSoC that presents computing advantages.
The use of parallel optimization approaches is particularly useful in specific applications that involve arithmetic intensity, such as scientific and financial modeling.
(5) Reprogrammable Network Pathways in 3D SoC
Reprogrammable circuits present challenges for the development of methods and stimuli for reconfiguration, particularly in indeterministic self-organizing systems. The coordination of activities between multiple asynchronous FPGAs in an iSoC is particularly complex. Once an FPGA restructures its circuitry, its most recent EHW architecture configuration and functionality are transmitted to other circuits in its own neighborhood cluster and in the iSoC globally. Information about the reconfigurable situations of the various reprogrammable circuits at a specific time is organized by IMSAs that continuously monitor and share information between nodes.
IMSAs are useful in applying metaheuristic algorithms to train reconfigurable circuits. As FPGAs are trained, information about their configurations is transmitted by IMSAs to other reconfigurable circuits. The system shares tasks, particularly within neighborhood clusters, between multiple reconfigurable nodes that divide the tasks, reprogram multiple hardware components and continuously evolve hardware configurations in order to solve MOOPs.
In the course of this process of coordinating multiple FPGAs, IMSAs establish network pathways that are continuously optimized and shift based on the changing data traffic flows. The system globally produces plasticity behaviors by reorganizing the network paths to accommodate the continuously reconfigurable circuits. This process leads to indeterministic asynchronous reprogrammability that allows the system to solve complex problems in real time.
In particular, the process of using reprogrammable network pathways promotes multi-functional applications in which several processes are concurrently optimized. The multiple transforming circuits in the 3D iSoC present complex self-organizing dynamic processes for continuous plasticity.
(6) Predictive Elements to Anticipate Optimal Network Routing
Because the iSoC provides extremely rapid data throughput by using multiple continuously reconfigurable circuits, solving optimization problems in real time requires anticipatory behaviors. In the context of network computing, the system anticipates the most effective computing process by identifying and modeling a problem, planning the optimal routing to minimize bottlenecks and revising the schedule to actually route the data. The system anticipates problems to solve and solution options.
Anticipatory processes are developed by the iSoC via analysis and modeling of past processes. In the context of network routing, the past routing practices are assessed so they may provide the foundation for anticipating the best solution for future problems.
(7) Optimizing Traffic Flows in 3D SoC Operation
The multiple simultaneous reprogrammable features of the 3D iSoC illustrate the polymorphous computing architecture advantages of the present system. As multiple FPGAs continuously reorganize their hardware attributes, they send signals to other circuits to perform EHW functions. These processes use IMSAs to carry messages between nodes and to coordinate functions. IMSAs cooperate and compete in order to perform specific multi-functional tasks. IMSAs employ metaheuristics in order to solve parallel MOOPs on-demand.
While IMSAs are the messengers and metaheuristics are the analytical components, IP cores are the units of software that enable specific FPGAs to perform specific operations. IP core elements are accessed in the IP core library and combined in unique ways by IMSAs to solve MOOPs. The accessibility of IP core elements allows the system to autonomously coordinate reconfigurable behaviors to solve new problems in novel ways.
In the case of microprocessors, the IMSAs activate specific functions by creating short-cuts for routine tasks. The IMSAs narrow the constraints between a set of options in the microprocessor programming in order to accelerate its behaviors. While limiting the range of applications, this method allows the MPs to work with multiple FPGAs to create self-organizing processes.
In an additional embodiment of the present system, the IP cores are self-programming. After identifying application objectives, the IMSAs access the IP core library and select the most appropriate IP core from the library to solve similar problems. The closest IP core(s) are then tuned to specific application problems.
Multiple IP cores are used to control multiple EHW nodes in an iSoC simultaneously (or sequentially). Complex functions are performed by multiple asynchronous nodes. Interoperations are controlled by IP cores that are combined and recombined into an efficient processing network. By combining sequential or parallel IP cores, the present system continuously reprograms multiple FPGAs. In particular, sequential IP core use produces indeterministic behaviors for FPGAs for multi-functionality within specific contingency thresholds.
(8) Plasticity Using Reconfigurable 3D SoC for Polymorphous Computing
IP cores provide programming specifications for complex programmable logic devices such as FPGAs. IP cores are integrated into FPGAs in the present system by using IMSAs and metaheuristics that identify the specific IP core elements to be combined in unique ways for solving specific MOOPs. The IP cores activate a change of geometrical configuration of the reconfigurable intra-layer 3D IC node logic blocks in the iSoC in order to optimize their problem solving operations. As the environment changes, the constraints change, requiring reprogramming of the hardware circuits.
The internal network features of the present system provide parameters for the interaction of multiple interactive reprogrammable embedded computing components. By self-organizing multiple hardware components, i.e., by reconfiguring their application specificity in real time, the system adapts to evolving environment conditions.
The iSoC employs several parallel processes to assess the changing environment, the present capabilities of the network fabric, and the reconfigurable components of the system. The iSoC models various scenarios by using stochastic processes and analysis of past behaviors in order to develop scenario solution options to solve MOOPs. IMSAs perform the modeling processes by using adaptive modeling algorithms.
The combination of these processes presents a novel adaptive computing environment that employs polymorphous computing architectures and processes to accomplish evolutionary objectives.
(9) Environmental Interaction with Reprogrammable SoC
The 3D iSoC presents a model for a cognitive control system. The system produces co-evolution of software and hardware components in an integrated reconfigurable network fabric. This polymorphous network architecture is more reliable and far faster than previous systems.
The iSoC is auto-programming. The system is structured with elastic dynamics for both exogenous adaptation and endogenous transformation. Specifically, part of the chip may be used to engage in programming itself while it simultaneously solves a range of problems. This is performed by implementing D-EDA tools on board the chip to produce IP cores and install them with IMSAs.
The evolving environment provides specific feedback for the iSoC according to which it must reorganize in order to perform application tasks. This environmental interaction with the reprogrammable iSoC produces adaptation of the reprogrammable network components.
Although the invention has been shown and described with respect to a certain embodiment or embodiments, it is obvious that equivalent alterations and modifications will occur to others skilled in the art upon the reading and understanding of this specification and the annexed drawings. In particular regard to the various functions performed by the above described elements (components, assemblies, devices, compositions, etc.) the terms (including a reference to a “means”) used to describe such elements are intended to correspond, unless otherwise indicated, to any element that performs the specified function of the described element (i.e., that is functionally equivalent), even though not structurally equivalent to the disclosed structure that performs the function in the herein illustrated exemplary embodiment or embodiments of the invention. In addition, while a particular feature of the invention may have been described above with respect to only one or more of several illustrated embodiments, such feature may be combined with one or more other features of the other embodiments, as may be desired and advantageous for any given or particular application.
The present application claims the benefit of priority under 35 U.S.C. §119 from U.S. Provisional Patent Application Ser. No. 60/993,637, filed on Sep. 12, 2007, the disclosure of which is hereby incorporated by reference in its entirety for all purposes.
Number | Date | Country
---|---|---
60993637 | Sep 2007 | US