This invention relates to the architecture of computing systems, and in particular to an architecture in which individual instructions may be executed in parallel, as well as to methods and apparatus for accomplishing that.
A common goal in the design of computer architectures is to increase the speed of execution of a given set of instructions. One approach to increasing instruction execution rates is to issue more than one instruction per clock cycle, in other words, to issue instructions in parallel. This allows the instruction execution rate to exceed the clock rate. Computing systems that issue multiple independent instructions during each clock cycle must solve the problem of routing the individual instructions that are dispatched in parallel to their respective execution units. One mechanism used to achieve this parallel routing of instructions is generally called a “crossbar switch.”
In present state-of-the-art computers, e.g., the Digital Equipment Alpha, the Sun Microsystems SuperSparc, and the Intel Pentium, the crossbar switch is implemented as part of the instruction pipeline. In these machines the crossbar is placed between the instruction decode and instruction execute stages. This is because the conventional approach requires the instructions to be decoded before it is possible to determine the pipeline to which they should be dispatched. Unfortunately, decoding in this manner slows system speed and requires extra surface area on the integrated circuit upon which the processor is formed. These disadvantages are explained further below.
Many solutions have been proposed for increasing the speed of execution of a given set of instructions, and these solutions generally can be divided into two groups.
According to a first approach, the speed of execution of individual instructions is increased by using techniques directed to decreasing the time required to execute a group of instructions serially. Such techniques include employing simple fixed-width instructions, pipelined execution units, separate instruction and data caches, increased processor clock rates, reduced instruction sets, branch prediction, and the like. As a result, it is now possible to reduce the average number of clock cycles per instruction to approximately one. Thus, in these approaches, the instruction execution rate is limited by the clock rate of the system.
To push the limits of instruction execution to higher levels, a second approach is to issue more than one instruction per clock cycle, in other words, to issue instructions in parallel. This allows the instruction execution rate to exceed the clock rate. There are two classical approaches to parallel execution of instructions.
Computing systems that fetch several instructions simultaneously and examine them for parallelism, to determine whether any can be issued together, are known as superscalar computing systems. In a conventional superscalar system, a small number of independent instructions are issued in each clock cycle. Techniques are provided, however, to prevent more than one instruction from issuing if the instructions fetched are dependent upon each other or do not meet other special criteria. There is a high hardware overhead associated with this hardware instruction scheduling process. Typical superscalar machines include the Intel i960CA, the IBM RIOS, the Intergraph Clipper C400, the Motorola 88110, the Sun SuperSparc, the Hewlett-Packard PA-RISC 7100, the DEC Alpha, and the Intel Pentium.
Many researchers have proposed techniques for superscalar multiple instruction issue. Agerwala, T., and J. Cocke [1987], “High Performance Reduced Instruction Set Processors,” IBM Tech. Rep. (March), proposed this approach and coined the name “superscalar.” IBM described a computing system based on these ideas, and now manufactures and sells that machine as the RS/6000 system. This system is capable of issuing up to four instructions per clock and is described in “The IBM RISC System/6000 Processor,” IBM J. of Res. & Develop. (January 1990) 34:1.
The other classical approach to parallel instruction execution is to employ a “wide-word” or “very long instruction word” (VLIW) architecture. A VLIW machine requires a new instruction set architecture with a wide-word format. A VLIW format instruction is a long fixed-width instruction that encodes multiple concurrent operations. VLIW systems use multiple independent functional units. Instead of issuing multiple independent instructions to the units, a VLIW system combines the multiple operations into one very long instruction. For example, in a VLIW system, multiple integer operations, floating point operations, and memory references may be combined in a single “instruction.” Each VLIW instruction thus includes a set of fields, each of which is interpreted and supplied to an appropriate functional unit. Although the wide-word instructions are fetched and executed sequentially, because each word controls the entire breadth of the parallel execution hardware, highly parallel operation results. Wide-word machines have the advantage of scheduling parallel operation statically, when the instructions are compiled. The fixed-width instruction word and its parallel hardware, however, are designed to fit the maximum parallelism that might be available in the code, while most of the time far less parallelism is actually present. Thus, for much of the execution time, most of the instruction bandwidth and the instruction memory are unused.
There is often a very limited amount of parallelism available in a randomly chosen sequence of instructions, especially if the functional units are pipelined. When the units are pipelined, operations being issued on a given clock cycle cannot depend upon the outcome of any of the previously issued operations already in the pipeline. Thus, to efficiently employ VLIW, many more parallel operations are required than the number of functional units.
Another disadvantage of VLIW architectures, which results from the fixed number of slots in the very long instruction word for each class of instruction, is that a typical VLIW instruction will contain useful information in only a few of its fields. This is inefficient, requiring the system to be designed for a circumstance that occurs only rarely: a fully populated instruction word.
Another disadvantage of VLIW systems is the resulting increase in code size. Whenever an instruction word is not full, the unused functional units translate into wasted bits (no-ops) in the instruction encoding. Thus useful memory and/or instruction cache space is filled with useless no-op instructions. In short, VLIW machines tend to be wasteful of memory space and memory bandwidth except for only a very limited class of programs.
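To make the slot waste concrete, a hypothetical VLIW instruction word might be laid out as follows in C. The field names and the five-slot mix are illustrative assumptions only, not the encoding of any particular machine.

    #include <stdint.h>

    /* Hypothetical VLIW instruction word: one slot per functional unit.
     * The slot mix and field names are illustrative assumptions. */
    typedef struct {
        uint32_t int_op0;    /* integer unit 0      */
        uint32_t int_op1;    /* integer unit 1      */
        uint32_t fp_op;      /* floating point unit */
        uint32_t mem_op;     /* load/store unit     */
        uint32_t branch_op;  /* branch unit         */
    } VLIWWord;

    /* If only one integer operation is available in a given cycle, the
     * other four slots must still be filled, typically with no-ops, so
     * four fifths of this word's bandwidth is wasted. */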
The term VLIW was coined by J. A. Fisher and his colleagues in Fisher, J. A., J. R. Ellis, J. C. Ruttenberg, and A. Nicolau [1984], “Parallel Processing: A Smart Compiler and a Dumb Machine,” Proc. SIGPLAN Conf. on Compiler Construction (June), Palo Alto, Calif., 11-16. Such a machine was commercialized by Multiflow Corporation.
For a more detailed description of both superscalar and VLIW architectures, see Computer Architecture: A Quantitative Approach, John L. Hennessy and David A. Patterson, Morgan Kaufmann Publishers, 1990.
We have developed a computing system architecture that enables instructions to be routed to an appropriate pipeline more quickly, at lower power, and with simpler circuitry than previously possible. This invention places the crossbar switch earlier in the pipeline, making it a part of the initial instruction fetch operation. This allows the crossbar to be a part of the cache itself, rather than a stage in the instruction pipeline. It also allows the crossbar to take advantage of circuit design parameters that are typical of regular memory structures rather than random logic. Such advantages include lower switching voltages (200-300 millivolts rather than 3-5 volts), more compact design, and higher switching speeds. In addition, if the crossbar is placed in the cache, the need for many sense amplifiers is eliminated, reducing the circuitry required in the system as a whole.
To implement the crossbar switch, the instructions coming from the cache, or otherwise arriving at the switch, must be tagged or otherwise associated with a pipeline identifier to direct them to the appropriate pipeline for execution. In other words, pipeline dispatch information must be available at the crossbar switch at instruction fetch time, before conventional instruction decode has occurred. There are several ways this requirement can be met. In one embodiment this system includes a mechanism that routes each instruction in a set of instructions to be executed in parallel to an appropriate pipeline, as determined by a pipeline tag applied to each instruction during compilation, or placed in a separate identifying instruction that accompanies the original instruction. Alternatively, the pipeline affiliation can be determined after compilation, at the time instructions are fetched from memory into the cache, using a special predecoder unit.
Thus, in one implementation, this system includes a register or other means, for example, the memory cells providing for storage of a line in the cache, for holding instructions to be executed in parallel. Each instruction has associated with it a pipeline identifier indicative of the pipeline to which that instruction is to be issued. A crossbar switch is provided which has a first set of connectors coupled to receive the instructions, and a second set of connectors coupled to the processing pipelines to which the instructions are to be dispatched for execution. Means are provided which are responsive to the pipeline identifiers of the individual instructions in the group supplied to the first set of connectors for routing those individual instructions onto appropriate paths of the second set of connectors, thereby supplying each instruction in the group to be executed in parallel to the appropriate pipeline.
In a preferred embodiment of this invention the associative crossbar is implemented in the instruction cache. By placing the crossbar in the cache, all switching is done at low signal levels (approximately 200-300 millivolts). Switching at these low levels is substantially faster than switching at the higher levels (5 volts) present after the sense amplifiers. The lower power also eliminates the need for large driver circuits and for numerous sense amplifiers. Additionally, by implementing the crossbar in the cache, the layout pitch of the crossbar lines matches the pitch of the layout of the cache.
We have developed a computing system architecture, which we term software-scheduled superscalar, which enables instructions to be executed both sequentially and in parallel, yet without wasting space in the instruction cache or registers. Like a wide-word machine, we provide for static scheduling of concurrent operations at program compilation. Instructions are also stored and loaded into fixed-width frames (each equal in width to a cache line). Like a superscalar machine, however, we employ a traditional instruction set, in which each instruction encodes only one basic operation (load, store, etc.). We achieve concurrence by fetching and dispatching “groups” of simple individual instructions, arranged in any order. The architecture of our invention relies upon the compiler to assign instruction sequence codes to individual instructions at the time they are compiled. During execution these instruction sequence codes are used to sort the instructions into appropriate groups and execute them in the desired order. Thus our architecture does not suffer the high hardware overhead and run-time constraints of the superscalar strategy, nor does it suffer the wasted instruction bandwidth and memory typical of VLIW systems.
Our system includes a mechanism, an associative crossbar, which routes in parallel each instruction in an arbitrarily selected group to an appropriate pipeline, as determined by a pipeline tag applied to that instruction during compilation. Preferably, the pipeline tag will correspond to the type of functional unit required for execution of that instruction, e.g., floating point unit 1. All instructions in a selected group can be dispatched simultaneously.
Thus, in one implementation, our system includes a cache line, register, or other means for holding at least one group of instructions to be executed in parallel, each instruction in the group having associated therewith a pipeline identifier indicative of the pipeline for executing that instruction and a group identifier indicative of the group of instructions to be executed in parallel. The group identifier causes all instructions having the same group identifier to be executed simultaneously, while the pipeline identifier causes individual instructions in the group to be supplied to an appropriate pipeline.
In another embodiment the register holds multiple groups of instructions, and all of the instructions in each group having a common group identifier are placed next to each other, with the group of instructions to be executed first placed at one end of the register, and the instructions in the group to be executed last placed at the other end of the register.
In another embodiment of our invention a method of executing arbitrary numbers of instructions in a stream of instructions in parallel includes the steps of compiling the instructions to determine which instructions can be executed simultaneously, assigning group identifiers to sets of instructions that can be executed in parallel, determining a pipeline for execution of each instruction, assigning a pipeline identifier to each instruction, and placing the instructions in a cache line or register for execution by the pipelines.
a illustrates the frame structure for one maximum-sized group of eight instructions;
b illustrates the frame structure for a typical mix of three intermediate-sized groups of instructions;
c illustrates the frame structure for eight minimum-sized groups, each of one instruction;
In the preferred embodiment the instruction cache is a 16 kilobyte two-way set-associative 32 byte line cache. A set associative cache is one in which the lines (or blocks) can be placed only in a restricted set of locations. The line is first mapped into a set, but can be placed anywhere within that set. In a two-way set associative cache, two sets, or compartments, are provided, and each line can be placed in one compartment or the other.
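By way of illustration, the two-way set-associative placement rule can be modeled in C as follows. This is a sketch only: the constants reflect the 16 kilobyte, 32 byte line geometry described above (16 KB / 32 byte lines / 2 ways = 256 sets), and the names are ours, not part of the design.

    #include <stdbool.h>
    #include <stdint.h>

    #define LINE_BYTES 32
    #define NUM_SETS   256           /* 16 KB / 32-byte lines / 2 ways */
    #define NUM_WAYS   2

    typedef struct {
        bool     valid;
        uint32_t tag;
    } TagEntry;

    static TagEntry tags[NUM_SETS][NUM_WAYS];

    /* A line is first mapped to exactly one set by its address, then may
     * reside in either of the two ways (compartments) of that set. */
    static bool cache_hit(uint32_t addr)
    {
        uint32_t set = (addr / LINE_BYTES) % NUM_SETS;
        uint32_t tag = addr / (LINE_BYTES * NUM_SETS);
        for (int way = 0; way < NUM_WAYS; way++)
            if (tags[set][way].valid && tags[set][way].tag == tag)
                return true;         /* hit in one compartment */
        return false;                /* miss: line not in this set */
    }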
The system also includes a data cache chip 20 that comprises a 32 kilobyte four-way set-associative 32 byte line cache. The third chip 30 of the system includes a predecoder, a cache controller, and a memory controller. The predecoder and instruction cache are explained further below. For the purposes of this invention, the CPU, FPU, data cache, cache controller and memory controller all may be considered of conventional design.
The communication paths among the chips are illustrated by arrows in FIG. 1. As shown, the CPU/FPU and instruction cache chip communicates over a 32 bit wide bus 12* with the predecoder chip 30. The asterisk is used to indicate that these communications are multiplexed so that a 64 bit word is communicated in two cycles. Chip 10 also receives information over 64 bit wide buses 14, 16 from the data cache 20, and supplies information to the data cache 20 over three 32 bit wide buses 18.
The specific functions of the predecoder are described in much greater detail below; however, essentially it functions to decode a 32 bit instruction received from the secondary cache into a 64 bit word, and to supply that 64 bit word to the instruction cache on chip 10.
The cache controller on chip 30 is activated whenever a first level cache miss occurs. Then the cache controller either goes to main memory or to the secondary cache to fetch the needed information. In the preferred embodiment the secondary cache lines are 32 bytes and the cache has an 8 kilobyte page size.
The data cache chip 20 communicates with the cache controller chip 30 over another 32 bit wide bus. In addition, the cache controller chip 30 communicates over a 64 bit wide bus 32 with the DRAM memory, over a 128 bit wide bus 34 with a secondary cache, and over a 64 bit wide bus 36 to input/output devices.
As will be described further below, the system shown in
In this system, an arbitrary number of instructions can be executed in parallel. In one embodiment of this system the central processing unit includes eight functional units and is capable of executing eight instructions in parallel. These pipelines are designated using the digits 0 to 7. Also, for this explanation each instruction word is assumed to be 32 bits (4 bytes) long.
In this system, each group of instructions can contain an arbitrary number of instructions ordered in an arbitrary sequence. The only limitation is that all instructions in a group must be capable of simultaneous execution; for example, there cannot be data dependencies between instructions in the group. The instruction groups are collected into larger sets, organized into fixed-width “frames,” and stored. Each frame can contain a variable number of tightly packed instruction groups, depending upon the number of instructions in each group and on the width of the frame.
Below we describe this concept more fully, as well as describe a mechanism to route in parallel each instruction in an arbitrarily selected group to its appropriate pipeline, as determined by the pipeline tag of the instruction.
In the following description of the word, group, and frame concepts mentioned above, specific bit and byte widths are used for the word, group and frame. It should be appreciated that these widths are arbitrary, and can be varied as desired. None of the general mechanisms described for achieving the result of this invention depends upon the specific implementation.
As in the embodiment described above, the central processing unit includes eight functional units and is capable of executing eight instructions in parallel; these pipelines are designated using the digits 0 to 7. Each instruction word is 32 bits (4 bytes) long, with one bit, for example the high-order bit S, reserved as a flag for group identification.
When the instruction stream is compiled before execution, the compiler places the instructions of a group next to each other, in any order within the group, and then places that group in the frame. The instruction groups are ordered within the frame from left to right according to their issue sequence. That is, of the groups of instructions in the frame, the first group to issue is placed in the leftmost position, the second group to issue is placed in the next position to the right, and so on. Thus, the last group of instructions to issue within that frame will be placed in the rightmost location in the frame. As explained, the group affiliation of the instructions in a group is indicated by setting the S bit (bit 31 in
To clarify the use of a frame,
b illustrates the frame structure for a typical mixture of three intermediate-sized groups of instructions. In
c illustrates the frame structure for eight minimum sized groups, each consisting of a single instruction. Because each “group” of a single instruction must be issued before the next group, the S bits toggle in a sequence 01010101 as shown.
As briefly mentioned above, in the preferred embodiment both the pipeline identifiers and the group identifiers are associated with individual instructions during compilation. This is achieved by compiling the instructions to be executed using well-known compiler technology. During compilation, the instructions are checked for data dependencies, dependence upon previous branch instructions, or other conditions that preclude their execution in parallel with other instructions. The result of the compilation is the identification of groups of instructions that can be executed in parallel, with a group identifier associated with each instruction. It is not necessary that the group identifier be added to the instruction as a tag, as shown in the preferred embodiment and described further below. In an alternative approach, the group identifier is provided as a separate tag that is later associated with the instruction. This makes possible the execution of programs on our system without the need to revise the instruction word width. In addition, in the preferred embodiment, the compiler determines the appropriate pipeline for execution of each individual instruction. This determination is essentially a determination of the type of instruction: for example, load instructions will be sent to the load pipeline, store instructions to the store pipeline, etc. The association of the instruction with a given pipeline can be made either by the compiler, or by later examination of the instruction itself, for example, during predecoding.
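The compilation steps just described can be sketched in C as follows. The dependence test and the pipeline classification here are deliberately simplified placeholders for the compiler's actual analyses; the structure and names are illustrative assumptions.

    #include <stdbool.h>

    typedef struct {
        int opcode;
        int src1, src2, dest;   /* register numbers                    */
        int group_id;           /* group identifier, assigned below    */
        int pipe_id;            /* pipeline identifier, assigned below */
    } Instr;

    /* Placeholder dependence test: does b read or overwrite a's result? */
    static bool depends_on(const Instr *a, const Instr *b)
    {
        return b->src1 == a->dest || b->src2 == a->dest || b->dest == a->dest;
    }

    /* Placeholder classification: map an opcode to a pipeline number. */
    static int pipeline_for(int opcode) { return opcode % 8; }

    /* Greedy grouping: an instruction joins the current group unless it
     * depends on an earlier member, in which case it starts a new group. */
    static void assign_identifiers(Instr *code, int n)
    {
        int group = 0, group_start = 0;
        for (int i = 0; i < n; i++) {
            for (int j = group_start; j < i; j++) {
                if (depends_on(&code[j], &code[i])) {
                    group++;             /* dependence found: new group */
                    group_start = i;
                    break;
                }
            }
            code[i].group_id = group;
            code[i].pipe_id  = pipeline_for(code[i].opcode);
        }
    }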
Referring again to
At the time a frame is transferred into the instruction cache, the instruction words in that frame are predecoded by the predecoder 30 (FIG. 1), which, as explained below, decodes each retrieved instruction into a full 64 bit word. As part of this predecoding, the S bit of each instruction is expanded to a full 3 bit field 000, 001, . . . , 111, which provides the explicit binary group number of the instruction. In other words, by expanding the S bit to a three bit field, the predecoder makes explicit that instruction group 000 must execute before instruction group 010, even though the instructions in both groups have their S bits set to 0. Because of the frame rules for sequencing groups, these group numbers correspond to the order of issue of the groups of instructions: Group 0 (000) will be issued first; Group 1 (001), if present, will be issued second; Group 2 (010) will be issued third; and ultimately Group 7 (111), if present, will be issued last. At the time of predecoding, the S value of the last word in the frame, which belongs to the last group in the frame to issue, is stored in the tag field for that line in the cache, along with the 19 bit real address and a valid bit. The valid bit specifies whether the information in that line of the cache is valid; if the bit is not set to “valid,” there cannot be a match or “hit” on this address. The S value from the last instruction, stored in the tag field of the line, provides a “countdown” value that can be used to determine when to increment to the next cache line.
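The expansion of the S bits into explicit group numbers can be modeled as shown below. This is a behavioral sketch; the predecoder performs the equivalent operation in hardware, and the first group in a frame is assumed to carry S=0, as in the examples above.

    /* Expand the 1-bit S flags of the eight instructions in a frame into
     * explicit 3-bit group numbers.  Consecutive instructions with the
     * same S value belong to the same group; each toggle of S starts the
     * next group in issue order. */
    static void expand_group_numbers(const int s_bit[8], int group[8])
    {
        int g = 0;                       /* group 000 issues first  */
        group[0] = 0;
        for (int i = 1; i < 8; i++) {
            if (s_bit[i] != s_bit[i - 1])
                g++;                     /* S toggled: next group   */
            group[i] = g;                /* 000, 001, ... up to 111 */
        }
    }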
As another part of the predecoding process, a new 4 bit prefix field is added to each instruction, based upon a tag applied to the instruction by the compiler, giving the explicit pipe number of the pipeline to which that instruction will be routed. The use of four bits, rather than three, allows the system to be later expanded with additional pipelines. Thus, at the time an instruction is supplied from the predecoder to the instruction cache, each instruction will have the format shown in FIG. 6.
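One possible C rendering of the resulting 64 bit predecoded word is given below. The exact field positions of FIG. 6 are not reproduced in this text, so this particular packing, including the width of the reserved field, is an assumption for illustration only.

    #include <stdint.h>

    /* Assumed layout of the 64-bit predecoded instruction word. */
    typedef struct {
        uint64_t pipe_id  : 4;   /* explicit pipeline number (4 bits allow growth) */
        uint64_t group_id : 3;   /* explicit group number from the expanded S bit  */
        uint64_t insn     : 32;  /* the original 32-bit instruction                */
        uint64_t reserved : 25;  /* other predecoded information (assumed width)   */
    } PredecodedWord;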
It may be desirable to implement the system of this invention on computer systems that already are in existence and therefore have instruction structures that have already been defined, without available blank fields for the group information, the pipeline information, or both. In this case, in another embodiment of this invention, the group and pipeline identifier information is supplied on a different clock cycle and then combined with the instructions in the cache, or placed in a separate smaller cache. Such an approach can be achieved by adding a “no-op” instruction with fields that identify which instructions are in which group and the pipeline for execution of each instruction, or by supplying the information relating to the parallel instructions in another manner. It therefore should be appreciated that the manner in which the data, that is, the instruction and its pipeline identifier, arrives at the crossbar to be processed is somewhat arbitrary. We use the word “associated” herein to designate the concept that the pipeline and group identifiers are not required to have a fixed relationship to the instruction words. That is, the pipeline and group identifiers need not be embedded within the instructions themselves by the compiler as shown in FIG. 7; instead they may arrive by other means, or on a different cycle.
In the preferred embodiment the instruction cache operates as a conventional physically-addressed instruction cache. In the example depicted in
Address sources for the instruction cache arrive at a multiplexer 80 that selects the next address to be fetched. Because preferably instructions are always machine words, the low order two address bits <1:0> of the 32 bit address field supplied to multiplexer 80 are discarded; these two bits designate byte and half-word boundaries. Of the remaining 30 bits, the next three low order address bits <4:2>, which designate a particular instruction word within a frame, are sent directly via bus 81 to the associative crossbar (explained in conjunction with subsequent figures). The next eight address bits <12:5> are supplied over bus 82 to the instruction cache 70, where they are used to select one of the 256 lines in the instruction cache. Finally, the remaining 19 bits of the virtual address <31:13> are sent to the translation lookaside buffer (TLB) 90. The TLB translates these bits into the high 19 bits of the physical address and supplies them over bus 84 to the instruction cache, where they are compared with the tag of the selected line to determine whether there is a “hit” or a “miss” in the instruction cache.
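The address decomposition just described corresponds to the following bit slicing, shown here as a model of the datapath rather than of the hardware itself; the structure and function names are ours.

    #include <stdint.h>

    typedef struct {
        uint32_t word_in_frame;  /* bits <4:2>: instruction word within the frame */
        uint32_t line_index;     /* bits <12:5>: one of 256 cache lines           */
        uint32_t vpn;            /* bits <31:13>: 19 bits sent to the TLB         */
    } IFetchAddr;

    static IFetchAddr split_fetch_address(uint32_t addr)
    {
        IFetchAddr a;
        /* bits <1:0> (byte and half-word) are discarded: instructions are words */
        a.word_in_frame = (addr >> 2)  & 0x7;      /* 3 bits, to the crossbar    */
        a.line_index    = (addr >> 5)  & 0xFF;     /* 8 bits, select the line    */
        a.vpn           = (addr >> 13) & 0x7FFFF;  /* 19 bits, translated by TLB */
        return a;
    }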
If there is a hit in the instruction cache, indicating that the addressed instruction is present in the cache, then the selected set of instructions, that is, the frame containing the addressed instruction, is transferred across the 512 bit wide bus 73 into the associative crossbar 100. The associative crossbar 100 then dispatches the addressed instruction, with the other instructions in its group, if any, to the appropriate pipelines over buses 110, 111, . . . , 117. Preferably the bit lines from the memory cells storing the bits of the instruction are themselves coupled to the associative crossbar. This eliminates the need for numerous sense amplifiers, and allows the crossbar to operate directly on the low voltage swing information from the cache line, without intervening driver circuitry that would slow system operation.
As shown in
The decoders 170, 171, . . . , 177 associated with each instruction word pathway receive the 4 bit pipeline code from the instruction. Each decoder, for example decoder 170, provides eight 1 bit control lines as output. One of these control lines is associated with each pipeline pathway crossing of that instruction word pathway. Selection of a decoder as described with reference to
The pipeline processing units 200, 201, . . . , 207 shown in
The execution ports that connect to the pipelines specified by the pipeline identification bits of the enabled instructions are then selected to multiplex out the appropriate instructions from the current frame contents of the register. If one or more of the pipelines is not ready to receive a new instruction, a set of hold latches at the output of the execution ports prevents any of the enabled instructions from issuing until the “busy” pipeline is free. Otherwise the instructions pass transparently through the hold latches into their respective pipelines. Accompanying the output of each port is a “port valid” signal that indicates whether the port has valid information to issue to the hold latch.
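The dispatch behavior described in the last several paragraphs can be summarized in software form. This is a behavioral sketch only: the hold-latch handshake is reduced to a per-pipeline ready flag, and the type and constant names are assumptions of ours.

    #include <stdbool.h>
    #include <stdint.h>

    #define FRAME_WORDS 8
    #define NUM_PIPES   8

    typedef struct {
        uint32_t insn;
        int      group_id;   /* explicit 3-bit group number */
        int      pipe_id;    /* explicit pipeline number    */
    } PredecodedInsn;

    /* Route every instruction of the current group to its execution port.
     * If any required pipeline is busy, nothing issues, modeling the hold
     * latches that delay the whole group until the busy pipeline is free. */
    static bool dispatch_group(const PredecodedInsn frame[FRAME_WORDS],
                               int current_group,
                               const bool pipe_ready[NUM_PIPES],
                               uint32_t port[NUM_PIPES],
                               bool port_valid[NUM_PIPES])
    {
        for (int i = 0; i < FRAME_WORDS; i++)
            if (frame[i].group_id == current_group &&
                !pipe_ready[frame[i].pipe_id])
                return false;                      /* hold: a pipe is busy */

        for (int p = 0; p < NUM_PIPES; p++)
            port_valid[p] = false;

        for (int i = 0; i < FRAME_WORDS; i++) {
            if (frame[i].group_id == current_group) {
                port[frame[i].pipe_id]       = frame[i].insn; /* crossbar path */
                port_valid[frame[i].pipe_id] = true;          /* "port valid"  */
            }
        }
        return true;                               /* group issued in parallel */
    }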
Because the instructions that start later groups are known, the system can easily determine which instruction starts the next group. This information is used to update the PC to the address of the next group of instructions. If no instruction in the frame begins the next group, i.e., the last instruction group has been dispatched to the pipelines, a flag is set. The flag causes the next frame of instructions to be brought into the crossbar, and the PC is then reset to I0. Shown in the figure is an exemplary sequence of the values that the PC, the instruction enable bits, and the next frame flag take on over a sequence of eight clocks extending over two frames.
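Continuing the sketch above (and reusing its PredecodedInsn type and FRAME_WORDS constant), the PC update and next-frame flag might be modeled as follows; again the names are illustrative.

    /* After group current_group issues, find the instruction that starts
     * the next group.  Returns true (the next-frame flag) when no such
     * instruction exists in this frame, resetting the PC offset to I0. */
    static bool advance_pc(const PredecodedInsn frame[FRAME_WORDS],
                           int current_group, int *pc_offset)
    {
        for (int i = 0; i < FRAME_WORDS; i++) {
            if (frame[i].group_id == current_group + 1) {
                *pc_offset = i;      /* first instruction of the next group */
                return false;        /* stay within the current frame       */
            }
        }
        *pc_offset = 0;              /* reset the PC to I0                  */
        return true;                 /* flag: bring in the next frame       */
    }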
The processor architecture described above provides many unique advantages to a system using this crossbar. The system and the crossbar described are extremely flexible, enabling instructions to be executed sequentially or in parallel, depending entirely upon the “intelligence” of the compiler. As compiler technology improves, the described hardware can execute programs more rapidly, not being limited to any particular frame width, number of instructions capable of parallel execution, or other external constraints. Importantly, the associative crossbar relies upon the content of the message being decoded, not upon an external control circuit acting independently of the instructions being executed. In essence, the associative crossbar is self-directed. In the preferred embodiment the system is capable of parallel issue of up to eight operations per cycle.
Another important advantage of this system is that it allows for more intelligent compilers. Two instructions which appear to a hardware decoder (such as in the prior art described above) to be dependent upon each other can be determined by the compiler not to be interdependent. For example, a hardware decoder would not permit two instructions R1+R2=R3 and R3+R5=R6 to be executed in parallel. A compiler, however, can be “intelligent” enough to determine that the R3 in the second instruction refers to a previous value of R3, not the one calculated by R1+R2, and therefore allow both instructions to issue at the same time. This allows the software to be more flexible and faster.
Although the foregoing has been a description of the preferred embodiment of the invention, it will be apparent to those of skill in the art that numerous modifications and variations may be made to the invention without departing from the scope as described herein. For example, arbitrary numbers of pipelines, arbitrary numbers of decoders, and different architectures may be employed, yet rely upon the system we have developed.
This is a continuation of U.S. application Ser. No. 08/754,337 filed Nov. 22, 1996, now U.S. Pat. No. 5,794,003; which is a continuation of U.S. application Ser. No. 08/498,135, filed Jul. 5, 1995, now abandoned, which is a continuation of U.S. application Ser. No. 08/147,797 filed Nov. 5, 1993, now abandoned.