The present invention relates generally to integrated circuits and more particularly to a system and method for a distributive computing subsystem.
Standalone Flash and RAM Memory, and FPGA Parts
Electrically erasable programmable read-only memory (EEPROM) devices have become more widely used in the last decade. Technological advances and broad product applications have made EEPROM devices the most viable candidate for implementing system-on-chip (SOC) level component integration.
On the process and device technology side, memory development has generally focused on miniaturizing the physical size of the storage bit and on scaling down the cell operating voltages and currents, thereby lowering power consumption. Multilevel signal storage per physical cell area can thereby be implemented. In addition, chip apparatus can be built to manage per-bit, per-byte, large-array, and partial-array resource sharing schemes. The ultimate goal is to achieve the highest level of system integration, with mixed analog and logic circuits in a common chip, and therefore to improve IC devices in performance, reliability, system efficiency, capacity, etc.
Flash memory is a good choice for information storage devices because of its increasing capacity. The name “Flash memory and logic device” is adopted based upon the device's fast operation and its use in large arrays. The Flash devices are closely related to the Flash technology. The density, power, and speed capabilities of Flash arrays exceed those offered by rotating disks, so semiconductor EEPROM is replacing the mechanical disk medium in many applications. Flash memory can also replace DRAM/SRAM in certain applications if the speed and performance requirements are met. Flash memory is nonvolatile and has high density per cell for information storage.
The EEPROM device may serve as an ideal memory device, both as a standalone memory/logic part and as part of an embedded storage/logic unit in an ASIC. The Flash device has several attractive features such as compactness, low power and high speed. A Flash device could replace conventional mechanical and optical disks, controllers and microprocessors for network and communications. There is interest in extending the use of Flash devices in printed circuit board (PCB) assemblies. However, conventional PCB subsystem assemblies still use standalone logic chips, memory chips, and discrete components, interconnecting them with the PCB wiring. It is desirable for a small system such as an SD card, stick card, pen drive, PDA, or mobile phone to merge the memory capacity, processing power, and even some analog functions in a universal IC. This would be advantageous in saving space and cost and in optimizing performance.
There are numerous prior art methods and systems in Flash technology which have been utilized for information storage. The Flash transistor has been successfully developed as either a single bit or a dual bit system storage circuit element. However, the Flash transistor is typically not utilized as a logic circuit element.
Field Programmable Logic Devices represented by PLA solutions utilizing Flash devices are well known. The field programmable ICs either reconfigure prime term logic arrays or functional units with on-chip wiring switches and tracks. However, these devices are not utilized to make functional units by directly programming the threshold of the switch transistor and configuring a basic logic circuit unit. A typical FPGA contains standalone CMOS-TTL implementations with device capacity in the range of a couple hundred gates to about 10 k gates. The basic building blocks contain I/O and logic elements for the latch and the TTL hard and soft macros, RAM arrays, wiring switches and tracks. The most advanced FPGA uses a 1.8V supply. The device is highly popular for its flexibility and supporting software package. It is difficult to merge a Flash array with the CMOS-TTL logic circuit because of process and circuit compatibility issues, and there is no business advantage in merging these technologies for either the manufacturers of FPGAs or the manufacturers of standard memory parts.
Conventional integrated circuits successfully integrate billions of transistors. However, many parts that perform different functions are still difficult to integrate. One of the most obvious reasons for this difficulty is the process compatibility issue. It is difficult to merge present technologies because of different process cost objectives for volume parts such as memory and logic units. Commodity memory parts are remarkably cost sensitive, and even a minor complication would raise the cost of the standardized parts. As long as the standardized parts are selling in high volume, there is a barrier to entry for any newly emerging parts or approaches. Usually a tremendous breakthrough in speed, density, power, or capacity is required to make this change. In addition, reliability-availability-serviceability (RAS) must typically be of high quality for such a breakthrough.
Nevertheless, an opportunity to merge the FPGA and Flash technology is desired. By adding the computing power with the densest logic circuit to the densest storage devices, a universal part is provided, and great design flexibility is added to device capacity and performance options. Furthermore, logic circuits may be augmented to contain analog function and multi-valued logic, and still perform at low power.
By implementing such a system and method, most of the volume PCB subassembly products in the display, memory, and disk areas can benefit. This subsystem may support small and large machines for data access, transport and storage purposes.
Accordingly, what is needed is a system and method for addressing the need for such devices. The present invention addresses such a need.
A printed circuit board (PCB) subsystem of IC parts is disclosed. The PCB assembly contains a single chip or a plurality of chips, each containing gigabyte storage and a 10 k gate equivalent Schottky CMOS (SCMOS) based field programmable gate array (SFPGA). The process technology combines CMOS transistors, EEPROM transistors, and low barrier Schottky diodes. The circuit architecture mixes both hardwired and SCL type FPGA (SFPGA) functional units. The system component interface architecture is based on the combination of a low power, low speed host interface (˜20 MHz), a medium speed local peripheral bus (˜500 MHz), and high-speed (˜3 GHz) on-chip busses.
For example, 1.2V supply low power, high capacity, and high flexibility IC product applications are supported. Efficient system integrations prescribe chip implementations with a distributive computing power running with GHz, 100 MHz, and 10 MHz clock rates at various interfaces. Mixed serial universal chip (UIC) and OS supports high bandwidth data access, transport, and storage operations with re-configurable software/hardware including the analog logic memory (ALM) circuit units and special nets.
The present invention relates generally to integrated circuits and more particularly to a system and method for a distributive computing subsystem. The following description is presented to enable one of ordinary skill in the art to make and use the invention and is provided in the context of a patent application and its requirements. Various modifications to the preferred embodiment and the generic principles and features described herein will be readily apparent to those skilled in the art. Thus, the present invention is not intended to be limited to the embodiment shown but is to be accorded the widest scope consistent with the principles and features described herein.
Mixed Signal Circuits and Process Technology for Super IC
The present invention utilizes device and system architecture for providing an intelligent nonvolatile subsystem. The nonvolatile subsystem encompasses embedded units of Flash and memory arrays (SRAM, DRAM, ROM) and programmable logic arrays. The goal is to optimize an organization of low cost, high capacity, distributive computing and memory storage. Flash transistors and SBD-CMOS transistors are the basic circuit elements to implement the various hardware constructs. SFPGA software and transmission line signal control means are keys to ensure high performance operation.
SFPGA techniques are utilized to allocate and configure certain portions of the logic circuits of a memory intensive chip. Both circuit unit types can be mixed to form a universal programmable device with logic and storage arrays. The users can field program certain high performance critical nets, IO ports, buffers, and clocking constructs. Switching operations over a wide performance range can be supported. Prior art approaches from leading vendors to fine-tuning clocking systems, reflection containment, data transfer protocols, collision detection, and error correction are greatly improved upon by the present invention.
Prior art U.S. Pat. No. 6,590,800 entitled “Schottky diode static random access memory (DSRAM) device, a method for making same, and CFET based DTL”, issued Jul. 8, 2003, by the inventor of this application describes a process and circuit scheme to lower the logic and RAM cell supply voltage to 1.2V and lower the current down to sub-microampere levels. By lowering the current and voltage in this manner, the array peripheral organization can be revamped using low power logic circuits. Copending U.S. patent applications 3064, 3065 and 3070 also disclose these features and are incorporated by reference herein. These features therefore allow for the development of:
1. Standalone Flash memory circuits with low power peripheral logic.
2. Flash memory arrays as embedded ASIC units with other functional units on the same chip. One example of such a chip is the mix of a functional unit with low power logic gate arrays for a field reprogrammable logic gate array (FPGA) device.
A system and method in accordance with the present invention utilizes memory arrays with certain field programmable logic resources to provide circuit functionality and inter unit connectivity for the PCB. Ratios for the right mixture of the fixed and re-configurable units are at the discretion of the user as each functional part is defined.
The combined chips provide large (Gbit) storage capacity plus a large number (10-100 k) of gates, relatively smaller dedicated physical resources of processing and buffering power, re-configurable ports, and stored software constructs. Wide application chip sets can be formed from the embedded memory, processor and logic arrays in accordance with the present invention. Utilizing the system and method in accordance with the present invention, a plural number of chips can form subsystems with single to large string of super UIC chips. Finally, subsystem PCBs can provide distributive computing powers by partitioning them with various PCB arrangements and instantiated controls through reconfiguration procedures utilizing a system and method in accordance with the present invention.
One preferred embodiment shown in
It is clear that many more variations can be derived by those skilled in the art from the teachings of this invention by mixing low power SCL circuits with Flash arrays and FPGAs for other applications at system and device levels.
In contrast to traditional PCB assembly of subsystems with ASIC controllers or microprocessors and standalone memory parts (RAM, ROM etc), a high capacity small system with ASIC chips is disclosed. These chips can be made easily with a powerful design library based on existing MLC Flash storage arrays, traditional hardwired CMOS logic gates, and the proposed SFPGA with MLC transistors. More complex embedded circuits can be composed from a set of basic functional units. Certain functional units can be field programmed by the means described later in the detailed description. Typical design entries are low power, high density, high-speed logic gates, analog units, memory units, etc. One of the embodied applications is the distributive semiconductor disk system. Another subsystem can be a PCI Express card. Still another subsystem can be a networking and communication plug-in card. All comprise programmable intelligent processor chip(s) and information storage chip(s).
An N-type transistor disclosed in U.S. Pat. No. 6,590,800, “Schottky diode static random access memory (DSRAM) device, a method for making same, and CFET based DTL”, by the current inventor, can be utilized in the present invention. In addition to using this transistor as an information storage element, it can also be field adjusted during any configuration procedures with a dedicated on chip biasing facility. Therefore, selected logic gates or device functions (such as depletion or enhancement mode switches) can be activated, deactivated or reconfigured by the users on the fly. This flexibility greatly enhances the capacity and efficiency of a semiconductor intelligent device.
Each controller or memory chip contains certain portions of logic units and storage arrays. One of the preferred PCB embodiments is to enhance the presently simple logic controller chip with more code storage, error detection and correction, PLL, and processing and buffering power, so that it may buffer the local commodity memory chips. Still another embodiment is to eliminate the control and signal re-powering functions of the buffer chip and redistribute them to the present commodity memory chips. The latter system architecture creates a design paradigm forming a platform of intelligent memory chips, which support UIC with a large portion of memory intensive arrays (greater than Giga bits) while appropriating some basic processing systems (less than 10˜100 logic gate equivalents and special IO units). Simple small systems contain a single UIC chip; PC and server systems link all plug-in PCB subsystems, each housing plural UIC chips. Therefore, computing power is distributed over the entire system networked by the plugged-in PCB subsystems.
The conventional approach uses standard FPGA chips, a hardwired ASIC controller chip, and Flash chips from common CMOS process foundries. The system architecture and methods of device integration of the invention ensure a much higher level of system integration, superior system performance, and greater flexibility.
The goals of the system architecture are to optimize subsystem organizations for low cost and high capacity information systems using chips comprising distributive MLC based programmable logic and memory arrays. Flash transistors, SBD-CMOS transistors, pass transistors, capacitors and resistors are the basic circuit elements to implement hardware constructs. Conventional CMOS-TTL devices with more than two-way logic are compatible as is, but should be replaced for area and power savings and performance improvement. FPGA programming procedural software and transmission line signal connectivity control means are keys to ensure high performance operations. The combined chips offer flexible sizes of memory and logic arrays that form arbitrary units of memory and processor power. A plural number of chips can form distributive computing systems by various PCB arrangements and synchronized clocking controls. One cited embodiment shown in
Since all chips are memory intensive and intelligent, the universal chip can be reconfigured as a controller or as an intelligent memory chip, either at the factory (more secure since one can enforce licensing controls) or through field alteration or modification for any reason. The PCB subsystems may house a single chip or a string/stack of chips, with system resources adding up as the PCB is augmented and more plug-ins are networked.
High Capacity MLC Logic Cell, IO Cell Constructs and Method of Programming
Since the MLC transistor can be programmed with plural number of states (4) of thresholds (−1V, 0.7V, 1.7V, 2.7V in
If the programming engine 400 and a DAD (digital-analog-digital) converter is closely coupled with the SCL circuit, high order (beyond ternary) logic constructs can be built with this approach.
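By way of illustration only, the four-state threshold scheme described above can be sketched in software. The threshold levels are those quoted in the text; the read reference voltages (taken here as midpoints between adjacent levels), the function names, and the 2-bit labels are assumptions made for this sketch and are not taken from the disclosure.

```python
from bisect import bisect_right

# Threshold levels quoted in the text (volts), lowest to highest.
VT_LEVELS = [-1.0, 0.7, 1.7, 2.7]

# Assumed read references: midpoints between adjacent threshold levels.
READ_REFS = [(a + b) / 2 for a, b in zip(VT_LEVELS, VT_LEVELS[1:])]


def decode_state(vt):
    """Return the state index (0..3) of the nominal level nearest to vt."""
    return bisect_right(READ_REFS, vt)


def state_to_bits(state):
    """Return a 2-bit label for a state index (the encoding is hypothetical)."""
    return format(state, "02b")


# A transistor programmed near 1.7 V falls in state 2 under these assumptions.
state = decode_state(1.65)
label = state_to_bits(state)
```

Four distinguishable thresholds carry two bits per transistor, which is the sense in which one MLC transistor can stand in for more than one binary switch in a logic construct.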
Table 1 below summarizes the conditions applied during the reconfiguration.
The High Speed Transmission Line Terminator
In
This switch can serve as a daisy chain terminator. When it receives an instruction in the data protocol, it may also reconfigure a port from master driver to slave receiver for point-to-point communication, thus providing great flexibility and saving board space.
In the case of a long haul PCB transmission line, the SBD clamps may contain the overshoot or undershoot effectively for fine traces greater than 8 cm with 100 ps-class switching waveforms. The reconfigurable on-chip line termination saves PCB space and supports well behaved point-to-point data transactions on the fly.
It is another object of the present invention to lower the voltage swing for the on-chip and off-chip switching waveforms. All circuits in SCL perform with low voltage swings, which results in scaled-down voltage and current levels. The active power of nets at 1.2V is 50% less than that of the same circuits operating at a 1.8V supply. This is particularly attractive for miniaturized systems such as laptops and handheld devices. The object is especially important where space and power savings are the main thrust in providing solutions.
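The supply-scaling figure above can be checked against a first-order estimate. The sketch below assumes that active power follows the common C·V²·f switching model with capacitance and frequency held fixed; that model and the function name are assumptions for illustration and are not stated in the disclosure.

```python
def dynamic_power_ratio(v_new, v_old):
    """Ratio of switching power at v_new to that at v_old.

    Assumes P is proportional to C * V**2 * f with C and f unchanged.
    """
    return (v_new / v_old) ** 2


ratio = dynamic_power_ratio(1.2, 1.8)   # (2/3) squared
saving = 1.0 - ratio                    # fraction of active power saved
print(f"power ratio: {ratio:.3f}, saving: {saving:.1%}")
```

Under this assumption the saving comes out somewhat above one half, so the 50% figure quoted in the text appears to be a rounded, slightly conservative value.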
The MUX and the latch are basic macro functions in forming Xilinx type FPGA devices. We introduced the MUX macro (
In
Table III below summarizes the functional merits of the SCL type MUX and latch solutions, which efficiently support both generic logic and FPGA building blocks.
The MUX and NAND combination also forms an analog signal comparator to select signals in comparison to the stored Vt in the switching transistor 710. During the clocked evaluating window, the switch samples the desired input signal to the common gate of the inverter. If its value exceeds the stored Vt, the output switches from its otherwise dc static state.
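The comparator behavior described above can be summarized as a purely behavioral model. The sketch below is not a circuit simulation; the function name, the argument names, and the choice of 0 as the static output level are assumptions made for illustration.

```python
def comparator_output(v_in, stored_vt, clock_high, static_level=0):
    """Behavioral model of the clocked threshold comparison.

    The output leaves its static level only while the evaluation window is
    open (clock_high) and the sampled input exceeds the stored threshold.
    """
    if clock_high and v_in > stored_vt:
        return 1 - static_level
    return static_level


# With a stored Vt of 0.7 V, a 1.0 V input flips the output during the window.
flipped = comparator_output(1.0, 0.7, clock_high=True)
held = comparator_output(1.0, 0.7, clock_high=False)
```

Because the stored Vt is programmable, the same structure can compare against different reference levels without additional reference circuitry, which is the design point the paragraph above makes.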
Notice that the voltage references will provide supply derivatives either by diode drops from V1 if Vrc=0, or by diode offsets built up from V2 if Vrc=1.
The Densest Memory and Logic Technology
The preferred embodiments depicted in
Referring back to
In
It is still another objective that the SCL circuitry may implement other critical nets such as high speed PLL or DLL paths, ring oscillators, phase detectors, splitters, and multipliers. It can also support on-chip serial-to-parallel caching, as well as error detection and correction for the on-chip and inter-chip data transactions.
Super PLD
A conventional PLD incorporates 6T-SRAM cells as the storage elements for reconfiguration codes and data codes. It also uses conventional CMOS-TTL logic as the building blocks for computing resources. In a system and method in accordance with the present invention, the PLD comprises SBD diode trees, CMOS inverters, pass transistors, and MLC transistors. Connecting paths and switches are driven by the densest embedded memory arrays: NAND-EEPROM cells, 4T-SRAM cells and SBD ROMs. The logic is therefore delivered by highly area and power efficient circuits. This super PLD will feature the most efficient field programmable devices to support the highest capacity IC solutions with ideal hardware and software capabilities.
In the preferred embodiment shown in
The embedded circuit units depicted in
In
It is the main object of the present invention to implement PCB subsystems for low power cost-performance applications. One of the product solutions is to design chip sets for field programmable versions of memory cards, stick disk drives, etc., as outlined in Table IV below.
With the scheme provided by the present invention, the local OS may assign the mode of each port according to information received from preceding frames. The master and slave ports, and the drivers or net terminators, are identified and reconfigured prior to each local data transfer. The system OS will maintain coherence according to agreed protocols. For small or large systems, each PCB subsystem may be implemented with a single chip or a string of universal chips (8˜32). Local bus nets within the PCB subassembly may run at a comfortable lower speed (say BW=400 MB/s with a 16 bit bus width at 200 MHz). With prior art PLL and frequency multiplier schemes, higher speed nets (BW=4 GB/s with a 16 bit bus and dual phase clock at 1 GHz) are possible to process signals in parallel within each chip distributed over the entire distributive subassemblies.
The chip sets may support all memory intensive but intelligent systems. Each subsystem has independent computing power with GB storage, tens to 100 K logic gate equivalents, analog signal processing power, a simple system bus (4 digital pins and a couple of Op Amp pins), and high speed local bus interface capabilities. An FPGA facility is provided for customizing special local nets so that global synchronization and parallel processing are possible. Great flexibility is built into each subsystem such that additive computing power is distributed over the entire network with privileged clients.
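The two bandwidth figures quoted above follow from the bus width, the clock rate, and the number of transfers per clock cycle. The sketch below reproduces that arithmetic; the function and parameter names are illustrative assumptions, and the dual phase clock is interpreted here as two transfers per cycle.

```python
def bus_bandwidth_bytes_per_s(width_bits, clock_hz, transfers_per_cycle=1):
    """Peak bus bandwidth in bytes per second for a parallel bus."""
    return (width_bits // 8) * clock_hz * transfers_per_cycle


# Local bus: 16 bit width at 200 MHz, one transfer per cycle.
local_bw = bus_bandwidth_bytes_per_s(16, 200_000_000)

# High speed net: 16 bit width at 1 GHz, dual phase (assumed 2 transfers/cycle).
fast_bw = bus_bandwidth_bytes_per_s(16, 1_000_000_000, transfers_per_cycle=2)
```

Under these assumptions the results are 400 MB/s and 4 GB/s respectively, matching the figures in the text. These are peak rates that ignore protocol overhead.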
Although the present invention has been described in accordance with the embodiments shown, one of ordinary skill in the art will readily recognize that there could be variations to the embodiments and those variations would be within the spirit and scope of the present invention. Accordingly, many modifications may be made by one of ordinary skill in the art without departing from the spirit and scope of the appended claims.
The present invention is related to copending U.S. patent application entitled “3D Flash EEPROM Cell and the Methods of Implementing the Same”, Ser. No. 10/800,257, filed on Mar. 11, 2004, and assigned to the assignee of the present invention; and copending U.S. patent application entitled “Variable Threshold Transistor for the Schottky FPGA and Multilevel Storage Cell Flash Arrays”, Ser. No. 10/817,201, filed on Apr. 2, 2004, and assigned to the assignee of the present invention which is related to copending U.S. patent application entitled “SCL Type FPGA with Multi-Threshold Transistors and Method for Forming Same”, Ser. No. ______ (3070P) filed on Apr. 19, 2004, and assigned to the assignee of the present invention, all of which are incorporated by reference herein.