1. Field of the Invention
The present invention relates to the field of computers and computer processors, and more particularly to a method and means for a more efficient use of a stack within a stack computer processor.
2. Description of the Background Art
Stack machines offer processor complexity that is much lower than that of Complex Instruction Set Computers (CISCs), and overall system complexity that is lower than that of either Reduced Instruction Set Computers (RISCs) or CISC machines. They do this without requiring complicated compilers or cache control hardware for good performance. They also attain competitive raw performance, and superior performance for a given price in most programming environments. Their first successful application area has been in real time embedded control environments, where they outperform other system design approaches by a wide margin. Where previously the stacks were kept mostly in program memory, newer stack machines maintain separate memory chips or even an area of on-chip memory for the stacks. These stack machines provide extremely fast subroutine calling capability and superior performance for interrupt handling and task switching.
However, there is no hardware detection of stack overflow or underflow conditions. Stack overflow occurs when an insufficient number of registers is available and results continue to be pushed onto the stack, causing the bottom register(s) to be overwritten. Stack underflow occurs when all registers have been emptied, and continued popping of a stack produces unintentional or incorrect results. Some other stack processors use stack pointers and memory management such that an error condition is flagged when a stack pointer goes out of range of memory allocated for the stack. U.S. Pat. No. 6,367,005 issued to Zahir et al. discloses a register stack engine, which saves to memory sufficient registers of a register stack to provide more available registers in the event of stack overflow. The register stack engine also delays the microprocessor until the engine can restore an appropriate number of registers in the event of stack underflow.
U.S. Pat. No. 6,219,685 issued to Story discloses a method of comparing the results of an operation with a threshold value. However, this approach does not distinguish between results that are rounded down to the threshold value (which would raise an overflow exception) and results that just happen to equal the threshold value. Another method disclosed by Story reads and writes hardware flags to identify overflow or underflow conditions. However, the instructions must be performed sequentially, and any instructions following a register read/write cannot proceed until the read/write operation is completed, which makes for a slow process.
With a stack in memory, an overflow or underflow would overwrite a stack item or use a stack item that was not intended to be part of the stack. A need exists for an improved method of reducing or eliminating overflow and underflow within a stack.
It is an object of the present invention to provide an apparatus and method in which the data stack and return stack of a dual stack processor are not arrays in memory accessed by a stack pointer, but instead are hardwire accessed by a separate dedicated shift register.
It is another object of the present invention to reduce or eliminate overflow and underflow of a data or return stack.
It is another object of the present invention to minimize the length of electrical connections between one bit stack registers of a bi-directional stack register, and thereby minimize the required driver size and minimize buffering.
These and other objects are achieved by the presently described invention, in which a conventional stack is replaced by an array of registers which function in a circular, repeating pattern. This circular, repeating pattern is accomplished through utilization of an associated bi-directional shift register which contains a plurality of one bit shift registers electrically interconnected in an alternating pattern. This configuration prevents reading from outside of the stack, and prevents reading an unintended empty register value.
The above described dual stack processor can function as an independently functioning processor, or it can be used with several other like or different processors in an interconnected computer array.
This invention is described with reference to the Figures, in which like numbers represent the same or similar elements. While this invention is described in terms of modes for achieving this invention's objectives, it will be appreciated by those skilled in the art that variations may be accomplished in view of these teachings without deviating from the spirit or scope of the presently claimed invention.
The embodiments and variations of the invention described herein, and/or shown in the drawings, are presented by way of example only and are not limiting as to the scope of the invention. Unless otherwise specifically stated, individual aspects and components of the invention may be omitted or modified for a variety of applications while remaining within the spirit and scope of the claimed invention, since it is intended that the present invention is adaptable to many variations.
Other basic components of the computer 12 are a return stack 28, including an R register 29, an instruction area 30, an arithmetic logic unit (ALU or processor) 32, a data stack 34 and a decode logic section 36 for decoding instructions. The computer 12 is a dual stack computer having a data stack 34 and a separate return stack 28. One skilled in the art will be generally familiar with the operation of stack based computers such as the computer 12 of this example.
In the presently described embodiment, the instruction area 30 comprises a number of registers 40 including, in this example, an A register 40a, a B register 40b and a P register 40c. In this example, the A register 40a is a full eighteen-bit register, while the B register 40b and the P register 40c are nine-bit registers.
The present invention discloses a stack computer processor in which the data and return stacks comprise an array of registers, which function in a cyclical, repeating, or circular pattern. The data stack and return stack are not arrays in memory accessed by a stack pointer, as in many prior art computers.
This embodiment also comprises a bi-directional shift register which contains a plurality of one bit shift registers. The number of one bit shift registers is equal to the number of bottom stack registers, S2 through S9 located below the S register. Each one bit shift register is connected to its corresponding S2 through S9 stack register as shown in
A ten cell deep push down stack is formed by the registers T, S, and S2 through S9. Because the bottom eight registers are in a circular buffer, the hardware wraps rather than overflows or underflows. One must not expect to put more than ten items there and get them all back, but one can keep taking more copies of the last eight items taken from the bottom of the stack forever. There is no underflow in the sense of it being an error. It is the fastest way to duplicate a pattern of eight words (or four or two or one) because the bottom eight will be read over and over if a program keeps taking values from the stack.
Similarly, there is no stack overflow in the sense of stack pointers allowing the stack to step on anything else. It is finite and if more than ten items are put there, only the last ten will remain; each store after the first ten will overwrite one of the S2 through S9 registers. There is no need to ‘initialize’ the stack to a preset location; one just declares it empty by starting to use it from wherever it is.
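The wrapping behavior described above can be illustrated with a small software model. The following is only a sketch under stated assumptions, not the disclosed circuit: the class name `CircularStack`, the integer selector `sel` (standing in for the bi-directional shift register), and the exact order in which registers are copied on a push or pop are all hypothetical choices made for illustration.

```python
class CircularStack:
    """Hypothetical software model: T, S, plus a circular array for S2..S9."""

    def __init__(self, bottom_size=8):
        self.t = 0                       # top register T
        self.s = 0                       # second register S
        self.bottom = [0] * bottom_size  # models the circular bottom registers
        self.sel = 0                     # assumed selector; wraps, never goes out of range

    def push(self, value):
        # Advance the selector, spill S into the bottom array, move T into S.
        self.sel = (self.sel + 1) % len(self.bottom)
        self.bottom[self.sel] = self.s
        self.s = self.t
        self.t = value

    def pop(self):
        # Return T, refill T from S and S from the bottom array, step back.
        value = self.t
        self.t = self.s
        self.s = self.bottom[self.sel]   # the bottom register keeps its value
        self.sel = (self.sel - 1) % len(self.bottom)
        return value


# Ten items go in and all ten come back out.
stack = CircularStack()
for n in range(1, 11):
    stack.push(n)
first_ten = [stack.pop() for _ in range(10)]

# Further pops do not underflow; they repeat the bottom eight items.
wrapped = [stack.pop() for _ in range(16)]

# An eleventh push overwrites the oldest item; only the last ten remain.
overfull = CircularStack()
for n in range(1, 12):
    overfull.push(n)
after_eleven = [overfull.pop() for _ in range(10)]
```

In this model nothing has to be initialized before use: a fresh or previously used instance can simply be treated as empty, which mirrors the statement that one declares the stack empty by starting to use it from wherever it is.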
The circular register array, R1-R8 can operate in the absence of the R register. However, the presence of the R register in combination with R1-R8 registers provides faster access circuitry and an optimum for timing, and therefore provides higher operating speed of the circular register array. In addition, the R register acts as a buffer between the R1-R8 addressable registers and the rest of the processor system, which provides independence of timing between the R1-R8 registers and the rest of the processor system.
This embodiment also comprises a bi-directional shift register which contains a plurality of one bit shift registers. The number of one bit shift registers is equal to the number of additional bottom registers, R1 through R8 located below the R register. Each one bit shift register is connected to its corresponding R1 through R8 bottom stack register as shown in
In the instant invention, there is no hardware detection of stack overflow or underflow conditions. Generally, prior art processors use stack pointers and memory management, or the like, such that an error condition is flagged when a stack pointer goes out of the range of memory allocated for the stack. When the stacks are located or managed in memory, an overflow or underflow would overwrite a stack item or use a stack item for something other than its intended purpose. However, because the bottom registers of the present invention function as a circular array, the stacks cannot overflow or underflow out of the stack area. Instead, the circular arrays will merely wrap around the array of registers. Because the stacks have finite depth, pushing anything to the top of a full stack means something on the bottom is being overwritten. Pushing more than ten items to the data stack, or more than nine items to the return stack, in the given embodiments must be done with the knowledge that doing so will result in the item at the bottom of the stack being overwritten.
It is the responsibility of software to keep track of the number of items on the stacks and not try to put more items there than the respective stacks can hold. The hardware will not detect an overwriting of items at the bottom of the stack or flag it as an error. It should be noted that the software can take advantage of the circular arrays at the bottom of the stacks in several ways. As one example, the software can simply assume that a stack is ‘empty’ at any time. There is no need to clear old items from the stack as they will be pushed down towards the bottom where they will be lost as the stack fills, so there is nothing to initialize for a program to assume that the stack is empty.
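One way software might keep track of stack depth is sketched below. This is a hypothetical bookkeeping helper, not part of the disclosed hardware: the name `DepthTracker` and its methods are assumptions, and raising an error is just one conservative policy a program could choose. The capacities of ten and nine are taken from the data stack and return stack depths given in the described embodiments.

```python
class DepthTracker:
    """Hypothetical software-side counter for one hardware stack."""

    def __init__(self, capacity):
        self.capacity = capacity  # 10 for the data stack, 9 for the return stack
        self.depth = 0            # number of items the program considers live

    def note_push(self):
        # Refuse to count a push that would overwrite the bottom item.
        if self.depth == self.capacity:
            raise OverflowError("push would overwrite the bottom stack item")
        self.depth += 1

    def note_pop(self):
        # Popping below zero is a wrap read; flag it unless it is intended.
        if self.depth == 0:
            raise IndexError("pop would read wrapped (stale) stack items")
        self.depth -= 1

    def declare_empty(self):
        # Nothing is cleared in hardware; the program just resets its count.
        self.depth = 0


data_depth = DepthTracker(capacity=10)
for _ in range(10):
    data_depth.note_push()
try:
    data_depth.note_push()
    overflow_caught = False
except OverflowError:
    overflow_caught = True
```

A program that deliberately uses stack wrap would bypass `note_pop` for those reads, since in that case reading past the counted depth is intended.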
Another advantage is that register items can be reused without having to be reloaded. The bottom eight items in these stacks can also be read, or read and written, in loops that take advantage of the stack wrap. After two data stack reads, T and S will have copies of two items from the circular array of the eight stack registers below. After eight more reads, T and S will be reloaded with the same values, read again from below using stack wrap. There is no limit to how many times those eight items can be read in sequence off of the stack without having to duplicate the items or write them back to the stack. Algorithms that cycle through a set of parameters that repeats in eight, four, or two cells on the data or return stack can repeatedly read them from the stack, because the bottom registers will simply wrap; when intentional, this is not a stack error.
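The repeated reading of a parameter pattern through stack wrap can be sketched as follows. The helper `read_wrapped` is hypothetical and deliberately simplified: it reads the circular bottom array directly and omits the T and S registers, and the starting position and read direction are assumptions made for illustration.

```python
def read_wrapped(bottom, count):
    """Return `count` successive reads from a circular bottom array.

    Hypothetical helper: the selector starts at the last cell and steps
    backward, wrapping around instead of underflowing.
    """
    sel = len(bottom) - 1
    reads = []
    for _ in range(count):
        reads.append(bottom[sel])          # reading does not consume the value
        sel = (sel - 1) % len(bottom)      # wrap around the eight registers
    return reads


# A four-cell pattern stored twice in the eight bottom registers.
pattern_cells = [4, 3, 2, 1, 4, 3, 2, 1]
reads = read_wrapped(pattern_cells, 12)
```

Because the pattern length divides eight, the sequence of reads repeats with a period of four for as long as the program keeps reading, with no need to duplicate the items or write them back.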
Although the instant invention has been described in an embodiment for a data stack and return stack of a dual stack 18-bit processor, other bit size processors can be utilized with the present invention.
The above described circular register arrays were described with respect to a single dual stack processor. However, the above described circular register arrays can also be utilized in an array of several self-contained computers, such as the computer array 10 shown in
Computer 12e is an example of one of the computers 12 that is not on the periphery of the array 10. That is, computer 12e has four orthogonally adjacent computers 12a, 12b, 12c, and 12d. This grouping of computers 12a through 12e will be used, by way of example, hereinafter in relation to a more detailed discussion of the communications between the computers 12 of the array 10. As can be seen in the view of
According to the present inventive method, a computer 12, such as the computer 12e can set high one, two, three, or all four of its read lines 18 such that it is prepared to receive data from the respective one, two, three, or all four adjacent computers 12. Similarly, it is also possible for a computer 12 to set one, two, three, or all four of its write lines 20 high.
When one of the adjacent computers 12a, 12b, 12c, or 12d sets a write line 20 between itself and the computer 12e high, if the computer 12e has already set the corresponding read line 18 high, then a word is transferred from that computer 12a, 12b, 12c, or 12d to the computer 12e on the associated data lines 22. Then, the sending computer 12 will release the write line 20 and the receiving computer (12e in this example) pulls both the write line 20 and the read line 18 low. The latter action will acknowledge to the sending computer 12 that the data has been received.
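The read line and write line handshake described above can be modeled in software as a rough sketch. The class `Link` and its method names are hypothetical, and the model makes a simplifying assumption: a transfer completes only once both lines are high, and nothing is specified here about timing or about what the sender does while it waits.

```python
class Link:
    """Hypothetical model of the lines shared by two adjacent computers."""

    def __init__(self):
        self.read_line = False   # set high by the receiving computer
        self.write_line = False  # set high by the sending computer
        self.data = None         # word presented on the data lines

    def receiver_ready(self):
        # The receiver announces that it is prepared to accept a word.
        self.read_line = True

    def sender_write(self, word):
        # The sender presents a word and raises its write line.
        self.data = word
        self.write_line = True

    def try_transfer(self):
        # A word moves only when both lines are high.
        if not (self.read_line and self.write_line):
            return None
        word = self.data
        # Both lines end up low, which acknowledges receipt to the sender.
        self.write_line = False
        self.read_line = False
        return word


link = Link()
link.sender_write(0x155)
not_yet = link.try_transfer()   # receiver has not raised its read line
link.receiver_ready()
received = link.try_transfer()  # both lines high: the word is transferred
```

A computer with four neighbors would hold one such link per neighbor, which corresponds to setting one, two, three, or all four read lines high.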
As shown in
Various modifications may be made to the invention without altering its value or scope. For example, while this invention has been described herein using particular computers 12, many or all of the inventive aspects are readily adaptable to other computer designs, other computer arrays, and the like.
While the present invention has been disclosed primarily herein in relation to communications between computers 12 in an array 10 on a single die 14, the same principles and methods can be used, or modified for use, to accomplish other inter-device communications, such as communications between a computer 12 and its dedicated memory or between a computer 12 in an array 10 and an external device.
Similarly, while the present invention has been disclosed herein for a dual stack processor, the present invention can also be practiced for a single stack processor or a processor comprising more than two stacks.
All of the above are only some of the examples of available embodiments of the present invention. Those skilled in the art will readily observe that numerous other modifications and alterations may be made without departing from the spirit and scope of the invention. Accordingly, the disclosure herein is not intended as limiting and the appended claims are to be interpreted as encompassing the entire scope of the invention.
This application claims priority to provisional application No. 60/818,084, filed Jun. 30, 2006, and is a continuation-in-part of the application entitled, “Method and Apparatus for Monitoring Inputs to a Computer,” filed May 26, 2006, Ser. No. 11/441,818, which is a continuation-in-part of the application entitled, “Asynchronous Power Saving Computer,” filed Feb. 16, 2006, Ser. No. 11/355,513. This application also claims priority to provisional application No. 60/788,265, filed Mar. 31, 2006, and claims priority to provisional application No. 60/797,345, filed May 3, 2006. All of the cited applications above are incorporated herein by reference.
Number | Name | Date | Kind |
---|---|---|---|
3757306 | Boone | Sep 1973 | A |
4107773 | Gilbreath et al. | Aug 1978 | A |
4215422 | McCray et al. | Jul 1980 | A |
4298932 | Sams | Nov 1981 | A |
4462074 | Linde | Jul 1984 | A |
4589067 | Porter et al. | May 1986 | A |
4593351 | Hong et al. | Jun 1986 | A |
4821231 | Cruess et al. | Apr 1989 | A |
4984151 | Dujari | Jan 1991 | A |
5029124 | Leahy et al. | Jul 1991 | A |
5053952 | Koopman et al. | Oct 1991 | A |
5218682 | Frantz | Jun 1993 | A |
5317735 | Schomberg | May 1994 | A |
5319757 | Moore et al. | Jun 1994 | A |
5359568 | Livay et al. | Oct 1994 | A |
5375238 | Ooi | Dec 1994 | A |
5377333 | Nakagoshi et al. | Dec 1994 | A |
5390304 | Leach et al. | Feb 1995 | A |
5396609 | Schmidt et al. | Mar 1995 | A |
5410723 | Schmidt et al. | Apr 1995 | A |
5440749 | Moore et al. | Aug 1995 | A |
5485624 | Steinmetz et al. | Jan 1996 | A |
5535393 | Reeve et al. | Jul 1996 | A |
5535417 | Baji et al. | Jul 1996 | A |
5551045 | Kawamoto et al. | Aug 1996 | A |
5572698 | Yen et al. | Nov 1996 | A |
5657485 | Streitenberger et al. | Aug 1997 | A |
5692197 | Narad et al. | Nov 1997 | A |
5706491 | McMahan | Jan 1998 | A |
5717943 | Barker et al. | Feb 1998 | A |
5727194 | Shridhar et al. | Mar 1998 | A |
5740463 | Oshima et al. | Apr 1998 | A |
5752259 | Tran | May 1998 | A |
5784602 | Glass et al. | Jul 1998 | A |
5826101 | Beck et al. | Oct 1998 | A |
5893148 | Genduso et al. | Apr 1999 | A |
5911082 | Monroe et al. | Jun 1999 | A |
6003128 | Tran | Dec 1999 | A |
6023753 | Pechanek et al. | Feb 2000 | A |
6038655 | Little et al. | Mar 2000 | A |
6085304 | Morris et al. | Jul 2000 | A |
6101598 | Dokic et al. | Aug 2000 | A |
6112296 | Witt et al. | Aug 2000 | A |
6145061 | Garcia et al. | Nov 2000 | A |
6145072 | Shams et al. | Nov 2000 | A |
6148392 | Liu | Nov 2000 | A |
6154809 | Ikenaga et al. | Nov 2000 | A |
6173389 | Pechanek et al. | Jan 2001 | B1 |
6178525 | Warren | Jan 2001 | B1 |
6219685 | Story | Apr 2001 | B1 |
6223282 | Kang | Apr 2001 | B1 |
6279101 | Witt et al. | Aug 2001 | B1 |
6353880 | Cheng | Mar 2002 | B1 |
6367005 | Zahir et al. | Apr 2002 | B1 |
6381705 | Roche | Apr 2002 | B1 |
6427204 | Arimilli et al. | Jul 2002 | B1 |
6449709 | Gates | Sep 2002 | B1 |
6460128 | Baxter et al. | Oct 2002 | B1 |
6507649 | Tovander | Jan 2003 | B1 |
6598148 | Moore et al. | Jul 2003 | B1 |
6665793 | Zahir et al. | Dec 2003 | B1 |
6725361 | Rozas et al. | Apr 2004 | B1 |
6825843 | Allen et al. | Nov 2004 | B2 |
6826674 | Sato | Nov 2004 | B1 |
6845412 | Boike et al. | Jan 2005 | B1 |
6959372 | Hobson et al. | Oct 2005 | B1 |
7028163 | Kim et al. | Apr 2006 | B2 |
7079046 | Tanaka | Jul 2006 | B2 |
7089438 | Raad | Aug 2006 | B2 |
7136989 | Ishii | Nov 2006 | B2 |
7155602 | Poznanovic | Dec 2006 | B2 |
7162573 | Mehta | Jan 2007 | B2 |
7197624 | Pechanek et al. | Mar 2007 | B2 |
7263624 | Marchand et al. | Aug 2007 | B2 |
7269805 | Ansari et al. | Sep 2007 | B1 |
20020010844 | Noel et al. | Jan 2002 | A1 |
20030028750 | Hogenauer | Feb 2003 | A1 |
20030065905 | Ishii | Apr 2003 | A1 |
20030217242 | Wybenga et al. | Nov 2003 | A1 |
20040003219 | Uehara | Jan 2004 | A1 |
20040059895 | May et al. | Mar 2004 | A1 |
20040107332 | Fujii et al. | Jun 2004 | A1 |
20040143638 | Beckmann et al. | Jul 2004 | A1 |
20040250046 | Gonzalez et al. | Dec 2004 | A1 |
20050027548 | Jacobs et al. | Feb 2005 | A1 |
20050114565 | Gonzalez et al. | May 2005 | A1 |
20050149693 | Barry | Jul 2005 | A1 |
20050206648 | Perry et al. | Sep 2005 | A1 |
20050223204 | Kato | Oct 2005 | A1 |
20060101238 | Bose et al. | May 2006 | A1 |
20060149925 | Nguyen et al. | Jul 2006 | A1 |
20060248317 | Vorbach et al. | Nov 2006 | A1 |
20060259743 | Suzuoki | Nov 2006 | A1 |
20070113058 | Tran et al. | May 2007 | A1 |
20070192504 | Moore | Aug 2007 | A1 |
Number | Date | Country |
---|---|---|
0156654 | Oct 1985 | EP |
0227319 | Jul 1987 | EP |
0992896 | Apr 2000 | EP |
1182544 | Feb 2002 | EP |
1821211 | Aug 2007 | EP |
WO9715001 | Apr 1997 | WO |
WO0042506 | Jul 2000 | WO |
WO0250700 | Jun 2002 | WO |
WO02088936 | Nov 2002 | WO |
WO03019356 | Mar 2003 | WO |
WO2005091847 | Oct 2005 | WO |
Number | Date | Country |
---|---|---|
20070192576 A1 | Aug 2007 | US |
Number | Date | Country |
---|---|---|
60818084 | Jun 2006 | US |
60788265 | Mar 2006 | US |
60797345 | May 2006 | US |
Relation | Number | Date | Country |
---|---|---|---|
Parent | 11441818 | May 2006 | US |
Child | 11503372 | | US |
Parent | 11355513 | Feb 2006 | US |
Child | 11441818 | | US |