Field of the Invention
The present invention relates to multiprocessor integrated circuits, in general, and to synchronisation of multiple microprocessors implemented on an integrated circuit, in particular.
Description of Related Art
Multiple microprocessors implemented on integrated circuits typically communicate through shared memory, using memory mapped registers or general purpose input/output devices (GPIOs) connected to system interrupt signals as a mechanism for synchronising with one another.
Multi-processor systems often use sophisticated memory management systems to support synchronisation (for example, cache coherency, or locked memory blocks). Some processors allow semaphores to be implemented using “atomic” test-and-set (exchange) instructions. Semaphores are well known in the art and are used to control access to shared resources, such as memory, in multi-processor environments.
“Atomic” instructions are basic instructions which allow a semaphore to be tested and set in a single, indivisible operation.
In prior art systems, various ad-hoc schemes are implemented on an application-by-application basis in order to synchronise multiple processors. In one known scheme, illustrated in
The scheme described can also be used in a reverse fashion so that the processor PB can send data and an interrupt request to processor PA. Such bidirectional communication allows a so-called “handshake” to be performed. That is, processor PA can generate an interrupt request communicating to PB the message “the data in SMC is ready”; processor PB then generates an interrupt request to processor PA communicating back to PA the message “I have finished with the data”. This communication between processors allows processes (or threads) running on respective processors to synchronise and communicate with one another.
However, the scheme as described above relies on the integrated circuit hardware designer to construct a protocol for synchronising processes using interrupt requests (IRQs) and interrupt service routines (ISRs). If the application software is changed so that different communication patterns are required, new memory mapped registers (RD) and interrupt (IRQ) connections will have to be added and the hardware rebuilt. Such redesign and rebuild is clearly inefficient and costly.
Embodiments of the present invention provide mechanisms for allowing a plurality of application processes running on a plurality of processors to communicate and synchronise with one another. Embodiments allow application software to be rewritten without the need for redesign and rebuild of hardware. Embodiments of the present invention allow complex multi-threaded multi-processor systems to be constructed more quickly than previous design solutions.
A hardware IP block provides a group of semaphores that can be manipulated (Post, Pend, Set) by a number of microprocessors. The hardware block generates a number of interrupt signals. Mask registers, associated with respective interrupts, allow the processors to select the conditions which cause an interrupt to be generated.
According to one aspect of the present invention, there is provided an integrated circuit unit for synchronising processing threads running on respective processors, the unit including an interrupt request controller which is programmable to provide a first desired number of synchronisation objects and a second desired number of interrupt request signals for supply to such processors, wherein the controller is operable to direct interrupt request signals to a chosen processor in dependence upon data received from the processors.
According to another aspect of the present invention, there is provided a method of synchronising multiple processing threads running on respective processors, the method comprising providing a first desired number of synchronisation objects and a second desired number of interrupt request signals, receiving an input command, and outputting an interrupt request signal in dependence upon the input command and on a programmable range of parameters relating to interrupt conditions.
The register bank 30 operates to store data synchronisation objects (semaphores or mailboxes) for synchronising processing threads running on the processors 20 and 22. These semaphores can be manipulated by the processors 20 and 22 by reading (load) or writing (store) to memory mapped locations over the bus 24. The processors 20 and 22 operate to load and store data to the access logic 28. The access logic 28 converts these load/store instructions into the appropriate operations (e.g. test-and-set, semaphore Post, mailbox clear, etc.) for supply to the register bank, and hence to manipulate the semaphores. The value stored in the memory corresponds to either the semaphore value (0..N for a counting semaphore, or 0/1 for a binary semaphore) or the mailbox contents.
The access logic converts a read (i.e. a load) from a certain address into an atomic operation that retrieves the value from the memory, supplies the value to the reading processor, decrements the value, places the new decremented value back in the memory replacing the previous value, and generates any interrupt signals necessary. Other operations to load a value (set), set a value to zero (clear), or to read a value without modifying it are possible.
This use of simple load/store instructions from the processors 20 and 22 means that the software applications running the processing threads on processors 20 and 22 can be changed easily, without the need for hardware changes. Any changes in the application software running on the processors 20 and 22 need only conform to the load/store instruction set used by the synchronisation unit in order to generate the relevant interrupt request signals IRQ1 and IRQ2.
The interrupt generation logic units 32 and 34 receive outputs from the register bank 30. The outputs from the register bank 30 are the results of the synchronisation objects or semaphores being manipulated by data supplied from the processors 20 and 22. The synchronisation objects and semaphores will be described in more detail below. The interrupt generation logic units 32 and 34 also receive masked data inputs from respective mask registers 36 and 38. The mask registers 36 and 38 receive data from the bus 24, from processors 22 and 20 respectively. The values stored in the mask registers 36 and 38 set the conditions under which interrupt requests can be generated by the corresponding interrupt generation units 32 and 34 respectively. For example, processor 20 could load mask register 36 with a data value such that only under certain conditions could interrupt request IRQ1 be generated from the interrupt generation logic unit 32. In this way, the processors can control when they are able to receive interrupt requests from the synchronisation unit 26, and when such interrupt requests are forbidden. For example, if a processor is running a high priority processing thread that does not require any communication with other processors, then the mask register can be set to prevent interrupts to that processing thread being requested unless a higher priority processing thread is involved.
As described above, the register bank contains several synchronisation objects, or semaphores, and these will now be described in more detail.
Counting Semaphore Block
A counting semaphore is a synchronisation object which has a value associated with it that changes as it is manipulated. Threads can perform a number of operations on a semaphore:
In one possible implementation, the register bank implements 16 16-bit semaphores, each having a value between 0 and 65535.
The ACCESS LOGIC implements a memory mapped interface to 48 16-bit locations:
A read access (microprocessor LOAD) from locations 0 to 15 returns the current value of semaphore 0 to 15 respectively. If the value is greater than 0 the semaphore's value is decremented and the new value stored in the register bank.
A write access (microprocessor STORE) to locations 16 to 31 causes the value of semaphore 0 to 15 to be incremented.
A write access (microprocessor STORE) to locations 32 to 47 sets the value of semaphore 0 to 15 to the value written by the processor.
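The address decoding described above can be illustrated with a small behavioural model. The following is a minimal sketch only, not the hardware itself: the class name `CountingSemaphoreBlock` and its methods are hypothetical, and the saturation of a Post at 65535 is an assumption, since the overflow behaviour is not specified above.

```python
NUM_SEMAPHORES = 16
MAX_VALUE = 0xFFFF  # 16-bit semaphore values: 0..65535


class CountingSemaphoreBlock:
    """Hypothetical behavioural model of the register bank and access logic."""

    def __init__(self):
        self.values = [0] * NUM_SEMAPHORES

    def load(self, location):
        """Model a processor LOAD from locations 0..15 (a Pend access)."""
        if not 0 <= location <= 15:
            raise ValueError("this sketch only decodes reads from locations 0..15")
        current = self.values[location]
        if current > 0:
            # A non-zero semaphore is decremented as part of the same access.
            self.values[location] = current - 1
        return current

    def store(self, location, data):
        """Model a processor STORE to locations 16..47 (Post or Set)."""
        if 16 <= location <= 31:
            index = location - 16
            # Post: the written data is ignored; the semaphore is incremented.
            # Saturating at MAX_VALUE is an assumption made for this sketch.
            self.values[index] = min(self.values[index] + 1, MAX_VALUE)
        elif 32 <= location <= 47:
            # Set: the semaphore takes the value written by the processor.
            self.values[location - 32] = data & MAX_VALUE
        else:
            raise ValueError("this sketch only decodes writes to locations 16..47")


# Example: set semaphore 3 to 2, post it once, then pend on it.
block = CountingSemaphoreBlock()
block.store(32 + 3, 2)
block.store(16 + 3, 0)
value_seen = block.load(3)  # the value before the decrement is returned
```

In this model the read returns the value held before the decrement, so a returned value of zero indicates that the Pend did not succeed.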
In one implementation, the mask registers 36 and 38 have identical behaviour and are provided by respective 16-bit memory mapped registers:
Setting bit N of the mask to “1” causes an interrupt to be generated if semaphore N is non-zero. Alternatively, the system could be configured such that an interrupt is generated if semaphore N is zero.
The interface to the mask is extended slightly to allow bits to be set and cleared independently:
Writing (STOREing) a value V into memory location “irq_set_bits” causes the bits in the mask corresponding to any non-zero bits in V to be set; writing a value V into memory location “irq_clear_bits” causes the bits in the mask corresponding to any non-zero bits in V to be cleared. Independent access to the bitmask allows a number of threads to manipulate it safely.
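The mask behaviour described above might be modelled as follows. This is an illustrative sketch under stated assumptions: the class `MaskRegister`, its method names and the helper `interrupt_asserted` are hypothetical, and only the locations “irq_set_bits” and “irq_clear_bits” are taken from the description.

```python
MASK_WIDTH = 16
ALL_BITS = (1 << MASK_WIDTH) - 1


class MaskRegister:
    """Hypothetical model of one 16-bit interrupt mask register."""

    def __init__(self):
        self.mask = 0

    def store_irq_set_bits(self, value):
        # Bits that are 1 in the written value are set; others are untouched.
        self.mask |= value & ALL_BITS

    def store_irq_clear_bits(self, value):
        # Bits that are 1 in the written value are cleared; others are untouched.
        self.mask &= ~value & ALL_BITS


def interrupt_asserted(mask, semaphore_values):
    """Return True if any semaphore N with mask bit N set is non-zero."""
    return any(
        ((mask >> n) & 1) == 1 and value != 0
        for n, value in enumerate(semaphore_values)
    )


# Example: two threads enable bits 0 and 2 independently, then bit 0 is cleared.
mask_register = MaskRegister()
mask_register.store_irq_set_bits(0b0001)
mask_register.store_irq_set_bits(0b0100)
mask_register.store_irq_clear_bits(0b0001)
```

Because each write names only the bits it changes, no thread needs to read, modify and write back the whole mask, which is why several threads can manipulate it safely.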
The Pend, Post and Set operations can now be implemented in software using the synchronisation block. Operations such as query (read the current value without changing it or waiting if it is zero) can also be implemented in software:
The sem_pend operation relies on a “wait_for_interrupt” service that is provided by the processor or the operating system running on it.
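One way the software operations might look is sketched below. Only the names `sem_pend` and `wait_for_interrupt` come from the description; the `SemaphoreDevice` class, `sem_post` and `sem_set` are hypothetical stand-ins, and the device is a software model rather than real memory mapped hardware. The `wait_for_interrupt` service is passed in as a callable so that the sketch can be run without an operating system.

```python
class SemaphoreDevice:
    """Hypothetical software stand-in for the synchronisation block."""

    def __init__(self, count=16):
        self.values = [0] * count
        self.mask = 0

    def load_pend(self, n):
        # Read access: return the value and decrement it if non-zero.
        current = self.values[n]
        if current > 0:
            self.values[n] = current - 1
        return current

    def store_post(self, n):
        self.values[n] += 1

    def store_set(self, n, value):
        self.values[n] = value

    def store_irq_set_bits(self, bits):
        self.mask |= bits

    def store_irq_clear_bits(self, bits):
        self.mask &= ~bits


def sem_post(dev, n):
    """Increment semaphore n."""
    dev.store_post(n)


def sem_set(dev, n, value):
    """Force semaphore n to a given value."""
    dev.store_set(n, value)


def sem_pend(dev, n, wait_for_interrupt):
    """Block until semaphore n can be decremented."""
    while True:
        if dev.load_pend(n) > 0:
            return  # the read itself performed the decrement
        # Semaphore was zero: request an interrupt for it and sleep.
        dev.store_irq_set_bits(1 << n)
        wait_for_interrupt()
        dev.store_irq_clear_bits(1 << n)


# Example: the "interrupt" is simulated by another processor posting semaphore 2.
device = SemaphoreDevice()
sem_pend(device, 2, wait_for_interrupt=lambda: sem_post(device, 2))
```

The loop re-reads the semaphore after waking because another thread may have taken the semaphore between the interrupt and the retry.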
Binary Semaphore Block
A binary semaphore is identical to a counting semaphore except that its range is limited to the values 0 and 1. This simplifies the implementation, as only a single flip-flop is needed for each semaphore. The access logic can be simplified to support two operations:
Mailbox Block
A mailbox is a memory location into which a single “message” can be placed for collection. The mailbox can be empty or full. Typically a mailbox contains a single word of data with the value 0 signifying empty. The operations performed on a mailbox may include:
The mailbox block is similar to the semaphore block. In fact, its interrupt generation logic is identical. The only change is that the ACCESS LOGIC now allows the current value to be read without modification, and a successful pend access causes the current value to be set to zero rather than decremented:
The mailbox access functions can now be implemented in software:
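A possible shape for such mailbox access functions is sketched below. All names (`MailboxDevice`, `mbox_post`, `mbox_pend`, `mbox_is_full`) are hypothetical, the 16-bit message width is an assumption, and the device is a software model of the behaviour described above rather than the actual access logic.

```python
class MailboxDevice:
    """Hypothetical software stand-in for the mailbox block."""

    def __init__(self, count=16):
        self.contents = [0] * count  # 0 signifies an empty mailbox

    def load_read(self, n):
        # Read the current value without modifying it.
        return self.contents[n]

    def load_pend(self, n):
        # Read the current value and leave the mailbox empty.
        value = self.contents[n]
        self.contents[n] = 0
        return value

    def store_set(self, n, message):
        self.contents[n] = message


def mbox_is_full(dev, n):
    """Query the mailbox without collecting its message."""
    return dev.load_read(n) != 0


def mbox_post(dev, n, message):
    """Non-blocking post: return False if the mailbox is already full."""
    if not 1 <= message <= 0xFFFF:
        raise ValueError("message must be non-zero and fit in 16 bits")
    # The check and the write are separate accesses in this sketch, so
    # multiple posting threads would need additional synchronisation.
    if mbox_is_full(dev, n):
        return False
    dev.store_set(n, message)
    return True


def mbox_pend(dev, n):
    """Non-blocking pend: return the message, or None if the mailbox was empty."""
    value = dev.load_pend(n)
    return value if value != 0 else None


# Example: post a message to mailbox 1 and collect it.
mailboxes = MailboxDevice()
posted = mbox_post(mailboxes, 1, 0x1234)
message = mbox_pend(mailboxes, 1)
```

Reserving the value 0 for “empty” means a message of zero cannot be sent, which is why the sketch rejects it.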
A “blocking” Post operation waits for the mailbox to be empty if it is currently full. Such an operation can be constructed by associating a semaphore with the mailbox.
Mixed Function Block
The counting semaphore and mailbox functionality is quite similar and can be combined to provide a block which is able to act as a group of semaphores or as a group of mailboxes depending on which software functions are used to access it. The interface to the access logic for the mixed block is:
Embodiments of the invention allow systems which contain multiple processors running numerous software threads to be constructed quickly, as a single block supports multiple synchronisation objects.
As a mask allows numerous conditions to be specified, a single interrupt vector can be used to wait for multiple synchronisation events.
As the use of synchronisation objects and the control of where interrupts are generated are completely under software control, the software can be changed to move threads between processors at design time or run time without requiring hardware changes.
This invention allows systems containing multiple embedded software processors to be quickly constructed.
It allows a single reference design to support multiple software configurations.
It provides a basic building block that implements in hardware some of the services provided by single-processor real-time operating systems.
Although many of the components and processes are described above in the singular for convenience, it will be appreciated by one of skill in the art that multiple components and repeated processes can also be used to practice the techniques of the present invention.
While the invention has been particularly shown and described with reference to specific embodiments thereof, it will be understood by those skilled in the art that changes in the form and details of the disclosed embodiments may be made without departing from the spirit or scope of the invention. For example, the embodiments described above may be implemented using firmware, software, or hardware. Moreover, embodiments of the present invention may be employed with a variety of different file formats, languages, and communication protocols and should not be restricted to the ones mentioned above. Therefore, the scope of the invention should be determined with reference to the appended claims.