The present invention contains subject matter related to Japanese Patent Application JP 2007-002986 filed in the Japan Patent Office on Jan. 11, 2007, the entire contents of which are incorporated herein by reference.
1. Field of the Invention
The present invention relates to an information processing apparatus, an information processing method and a computer program. More specifically, the present invention relates to an information processing apparatus assuming a multiprocessor configuration to execute data processing by utilizing a plurality of processors, an information processing method to be adopted in such an information processing apparatus and a computer program, with which a decrease in the data processing efficiency attributable to, for instance, communication processing, is prevented.
2. Description of the Related Art
Today, information processing apparatuses such as PCs assuming a multiprocessor configuration with multiple processors (CPUs) installed therein are utilized widely in applications in which various types of data processing are executed by concurrently engaging the plurality of processors in operation. A multiprocessor environment in which different types of data processing are executed by allocating a specific role to one of the plurality of CPUs is called an asymmetrical multiprocessor environment.
In an asymmetrical multiprocessor environment such as that described above, a main CPU (hereafter referred to as a “PPU” (power processor unit)) and a plurality of sub-CPUs (hereafter referred to as “SPUs” (synergistic processor units)) may be installed. The individual processors may be designated to execute different types of processing as follows;
It is to be noted that the SPUs are designed to have better versatility for general-purpose program execution than standard DSPs and assure greater processing advantages than the DSPs.
The PPU 111 controls programs 130 such as a driver 131 that drives the network card and a protocol stack 132 corresponding to the communication protocol, e.g., TCP/IP in addition to executing the OS. An application 140, which issues a request for communication processing execution, is set in the highest-order layer. As shown in
It is to be noted that a descriptor such as that shown in
As explained earlier, an interrupt from the network card 101 is processed by the PPU 111, i.e., the main processor that controls the OS, in the information processing apparatus shown in
First, in response to a communication processing request originating from the application, the PPU 111 secures a memory area in preparation for packet transmission/reception and sets a descriptor corresponding to the secured memory area based upon the driver 131 in step S101. In step S102, notification processing is executed to provide information on the descriptor having been set to the network card. This notification processing is executed by, for instance, writing the information into a register for the network card 101.
Next, in step S103, data transmission/reception is executed via the network card 101 in accordance with the descriptor. Following step S103, interrupt processing for the PPU 111 occurs in step S104. Based upon the interrupt processing, the PPU 111 executes predetermined post-communication processing by, for instance, releasing the memory space.
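The sequence of steps S101 through S104 can be pictured with a minimal sketch. It is an illustration only: the `Descriptor` and `NetworkCard` classes, their fields, and the `communicate` function are hypothetical names that do not appear in the specification, the reception direction is assumed, and a Python list stands in for the register of the network card.

```python
from dataclasses import dataclass, field


@dataclass
class Descriptor:
    """Hypothetical descriptor describing one secured memory area."""
    buffer: bytearray
    length: int = 0
    status: str = "pending"


@dataclass
class NetworkCard:
    """Toy stand-in for the network card 101 (assumed structure)."""
    register: list = field(default_factory=list)  # descriptors announced by the driver
    interrupt_pending: bool = False

    def run(self, payload: bytes) -> None:
        # S103: transfer data in accordance with each announced descriptor.
        for desc in self.register:
            n = min(len(payload), len(desc.buffer))
            desc.buffer[:n] = payload[:n]
            desc.length = n
            desc.status = "done"
        # S104: notify the main processor through an interrupt.
        self.interrupt_pending = True


def communicate(card: NetworkCard, payload: bytes, size: int = 64) -> bytes:
    desc = Descriptor(buffer=bytearray(size))  # S101: secure memory, set descriptor
    card.register.append(desc)                 # S102: notify the card of the descriptor
    card.run(payload)                          # S103 and S104 happen on the card side
    data = b""
    if card.interrupt_pending:
        # Post-communication processing: collect the data, release the area.
        data = bytes(desc.buffer[:desc.length])
        card.register.remove(desc)
        card.interrupt_pending = False
    return data
```

In this sketch the payload is truncated to the size of the secured area, which is an assumption made to keep the example short.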
Different types of processing are executed for data transmission and for data reception during the data transmission/reception processing executed in step S103 via the network card 101. The data transmission processing and the data reception processing are now explained in detail in reference to
(Data Transmission Processing)
As shown in
Next, in step S125, the network card 101 decides whether or not any descriptor remains to be processed. If it is decided that there is an unprocessed descriptor, the processing is repeated starting from step S121. Once no unprocessed descriptor remains, the operation proceeds to step S126. The transmission/reception status indicating success/failure of the data transmission is then written into the register corresponding to the network card (step S126), and interrupt notification processing for the PPU is executed (step S127).
(Data Reception Processing)
The network card 101 engaged in the data reception processing first reads out the descriptor (step S131), as shown in
Next, the network card 101 writes the DMA results indicating success/failure of the DMA having been executed in step S133 and step S134, i.e., whether or not a memory access has been achieved, into the status field in the descriptor and also writes the actual packet size into the length field (step S135). In step S136, the transmission/reception status indicating success/failure of the data reception is written into the register corresponding to the network card and in step S137, interrupt notification processing for the PPU 111 is executed.
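The write-back in steps S135 through S137 can be sketched as follows. This is a hedged illustration: `RxDescriptor`, `complete_reception`, the string status values, and the rule that the DMA fails when the packet does not fit the buffer are all assumptions introduced here, not details taken from the specification.

```python
from dataclasses import dataclass


@dataclass
class RxDescriptor:
    """Hypothetical reception descriptor with the two fields named in the text."""
    capacity: int            # size of the memory area the descriptor points at
    status: str = "unused"   # DMA result written back by the card
    length: int = 0          # actual packet size written back by the card


def complete_reception(desc: RxDescriptor, packet: bytes, card_register: dict) -> bool:
    # Assumption: the DMA succeeds only if the packet fits the secured area.
    dma_ok = len(packet) <= desc.capacity
    desc.status = "ok" if dma_ok else "error"                   # S135: status field
    desc.length = len(packet)                                   # S135: length field
    card_register["status"] = "rx_done" if dma_ok else "error"  # S136: card register
    return True                                                 # S137: interrupt raised
```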
Once the data transmission/reception processing executed via the network card 101 through the DMA is completed, interrupt processing is executed at the PPU 111. The sequence of the interrupt processing executed at the PPU 111 in response to the interrupt notification from the network card is now explained in reference to the flowchart presented in
As the interrupt from the network card 101 is received at the PPU in step S141, processing for halting the process currently underway is executed by, for instance, executing a register clear (step S142). An interrupt handler is then started up (step S143).
The PPU executes processing for reading out the network card status from the register corresponding to the network card in order to determine the cause of the interrupt (step S151). This status is the data transmission/reception status indicating success or an error in the data transmission/reception, having been written via the network card 101 in step S126 in
Next, the PPU 111 decides whether or not the status indicates that the transmission has been completed (step S152). If a “no” decision is made, it proceeds directly to step S154. If, on the other hand, a “yes” decision is made, it releases the memory area where the transmission target data have been stored (step S153) and then proceeds to step S154. If it is decided in step S154 that the status indicates that the reception has been completed, the PPU 111 hands the received packet over to the protocol stack (step S155). If it is decided in step S156 that the status indicates an error, the PPU 111 resets the network card (step S157). Finally, the PPU 111 clears the interrupt status in step S158 before the interrupt handler startup processing ends.
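The decision sequence of steps S152 through S158 can be sketched as a status dispatch. The `Status` flag values, the function name, and the use of Python lists for the transmission buffers, the received packets and the protocol stack are assumptions made for illustration; the specification does not define the encoding of the status register.

```python
from enum import Flag


class Status(Flag):
    """Hypothetical bit layout of the network card status."""
    TX_DONE = 1
    RX_DONE = 2
    ERROR = 4


def handle_interrupt(status, tx_buffers, rx_packets, protocol_stack):
    """Return the list of actions taken, in the order of steps S152 to S158."""
    actions = []
    if Status.TX_DONE in status:            # S152
        tx_buffers.clear()                  # S153: release transmission memory
        actions.append("release_tx_memory")
    if Status.RX_DONE in status:            # S154
        protocol_stack.extend(rx_packets)   # S155: hand packets to the protocol stack
        rx_packets.clear()
        actions.append("deliver_rx_packet")
    if Status.ERROR in status:              # S156
        actions.append("reset_card")        # S157
    actions.append("clear_interrupt_status")  # S158
    return actions
```

The three checks are independent rather than exclusive, so a single interrupt that reports both a completed transmission and a completed reception is handled in one pass.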
After executing the processing in the flowchart presented in
In the structure described above, the PPU needs to halt the process currently underway whenever interrupt processing occurs. Accordingly, the process is bound to be halted increasingly often if interrupt processing occurs frequently during high-speed network communication. The interrupt processing, which readily leads to cache destruction or destruction of the memory access locality, is bound to take up significant processing time at the PPU, thus lowering the processing performance.
The present invention, having been completed by addressing the concerns discussed above, provides an information processing apparatus, an information processing method and a computer program, with which the data processing efficiency is sustained even when, for instance, data communication processing is executed.
According to an embodiment of the present invention, there is provided an information processing apparatus equipped with a plurality of processors, which includes a first processor that executes processing based upon an operating system, a communication unit that executes communication processing and a second processor that executes processing based upon a device driver corresponding to the communication unit.
Furthermore, according to another embodiment of the present invention, there is provided an information processing method to be adopted in an information processing apparatus equipped with a plurality of processors including: a step in which a second processor different from a first processor that executes processing based upon an operating system executes read processing to read out an interrupt status set by a communication unit and a step in which the second processor determines processing to be executed based upon the interrupt status having been read out and executes the processing thus determined.
Furthermore, according to another embodiment of the present invention, there is provided a computer program enabling an information processing apparatus equipped with a plurality of processors to execute communication processing control and including: a step in which a second processor different from a first processor that executes processing based upon an operating system executes read processing to read out an interrupt status set by a communication unit and a step in which the second processor determines processing to be executed based upon the interrupt status having been read out and executes the processing thus determined.
According to an embodiment of the present invention, a sub-processor different from the main processor that controls an operating system (OS), is engaged in control of a device driver corresponding to a communication unit and the sub-processor executes communication control based upon an interrupt originating from a network card functioning as the communication unit in an information processing apparatus equipped with a plurality of processors and engaged in communication via a network. As a result, data processing can be executed at the main processor with a high level of efficiency without a time lag.
The following is a detailed explanation of the information processing apparatus, the information processing method and the computer program according to the present invention.
(Embodiment 1)
First, in reference to
As shown in
The PPE 210 includes a two-stage cache memory constituted with an L1 cache 212 and an L2 cache 213. The SPEs 220 each include a memory space referred to as a local store (LS) 222 capable of executing operation at a speed equivalent to that of the L1 cache. An SPE 220 accesses a main memory (XDR) 232 primarily through DMA via a memory flow controller (MFC) 223. As shown in the figure, the main memory (XDR) 232 is connected via a memory interface controller (MIC) 231, and a network card 242 to function as a communication unit is connected via an input/output controller (IOC) 241.
In reference to
In
The PPU 311 executes programs 330 such as an OS 330 and a protocol stack 331 corresponding to the communication protocol, e.g., TCP/IP. The protocol stack 331 is a communication control program used to control data communication executed via the network card 301. In the highest-order layer, an application 340 that issues a request for the execution of data communication via the network card 301 is set.
In the asymmetrical multiprocessor environment in the related art, the driver-based processing is executed by the PPU as explained earlier. Thus, as packet transmission/reception takes place, interrupt processing is executed, resulting in a temporary halt in the process currently underway at the PPU. Such a halt in the processing at the PPU attributable to the interrupt lowers the performance level of the PPU unexpectedly, leading to an undesirable condition in which the capability of the processor cannot be fully utilized.
In addition, if a packet transmission/reception is executed by applying an interrupt to the PPU, a delay occurs before;
In order to absorb these delays, the buffer needs to assure an ample capacity margin necessitating wasteful utilization of buffer resources.
There is an added concern to be addressed on the network card side in that the wide network band available for use may not be effectively utilized in the communication due to the bottleneck manifesting with regard to the PPU processing capability.
Since the cause of these conditions is assumed to lie in the fact that the PPU is assigned to handle the packet transmission/reception, a driver 350 corresponding to the network card 301 is installed in one of the SPUs among SPU-1 through SPU-8 so that processing based upon the driver 350 is executed at that SPU. In this structure, an interrupt from the network card 301 is processed at one of the SPUs 1˜8, freeing the PPU from the processing that must be executed to engage the driver 350 and leaving all the programs executed on the PPU unaffected and uninterrupted by the interrupt processing. In addition, no program other than the device driver is executed in the SPU in which the device driver is installed, so no interrupt attributable to another application program occurs while interrupt processing executed to achieve DMA transfer is underway. Faster response to an interrupt from the network card is thus enabled. Namely, since the transmission target data/received data can be transferred with higher frequency, the size of the buffer used in data transfer can be reduced.
It is to be noted that while the driver 350 is installed at the sub-processor SPU-1 (321-1) in the example presented in
In order to ensure that the device driver 350 operated via the SPU (321-1) is notified of an interrupt originating from the network card 301, as in the embodiment, the network card must be able to apply an interrupt to any processor in the multiprocessor environment.
An example of a structure that may be adopted in such a network card capable of applying an interrupt to any processor is explained in reference to
An address analysis unit 352 checks the recipient IP address indicated in the data (e.g., the IP packet) received by the transmission/reception unit 351 and executes a search of a table stored in a table management unit 353. At the table management unit 353, a table such as that shown in
As shown in
The information processing apparatus achieved in this embodiment operates in an asymmetrical multiprocessor environment where a plurality of processors in the information processing apparatus each execute specific processing among various types of data processing. When data are received via the network card 301 in this apparatus, the specific processor to receive the data must first be identified before notifying the processor of the interrupt. Accordingly, a table such as that shown in
For instance, the SPU corresponding to the received data (IP packet) is identified and an interrupt is applied to the identified processor based upon the table shown in
More specifically, an interrupt generation unit 354 in the network card 301 shown in
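The lookup performed by the address analysis unit 352 against the table held in the table management unit 353 can be sketched as follows. The table layout (recipient IP address mapped to a processor ID), the example addresses, and both function names are assumptions for illustration only; the actual table contents are defined in the referenced figure.

```python
def select_interrupt_target(dest_ip, table):
    """Return the processor ID registered for dest_ip, or None if absent."""
    return table.get(dest_ip)


def generate_interrupt(dest_ip, table):
    """Build a hypothetical interrupt notice addressed to the identified processor."""
    target = select_interrupt_target(dest_ip, table)
    if target is None:
        return None  # assumption: no interrupt is raised for an unknown address
    return {"target": target, "cause": "packet_received"}


# Hypothetical table: recipient IP address -> processor to be interrupted.
EXAMPLE_TABLE = {"192.168.0.10": "SPU-1", "192.168.0.11": "PPU"}
```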
It is to be noted that when the PPU 311 or the SPU 321 executes data communication via the network card 301 in the structure shown in
Since the SPU-1 (321-1) is dedicated to executing the device driver, the embodiment achieves an added advantage in that the SPU can be continuously engaged in the execution of the device driver without affecting other programs. This means that transmission/reception completion processing can be executed through polling. The sequence of the polled data transmission/reception completion processing is now explained in reference to the flowchart presented in
The SPU controlling the device driver corresponding to the network card executes processing for reading out the status of the network card from the register corresponding to the network card in order to determine the cause of the interrupt in step S201. This status is equivalent to the data transmission/reception status indicating success or an error in the data transmission/reception, having been written via the network card in step S126 in
If it is decided in step S203 that the status indicates that the transmission has been completed, the SPU releases the corresponding memory area where the transmission target data have been stored in step S204. If it is decided in step S205 that the status indicates that the reception has been completed, the SPU proceeds to step S206 to hand the received packet over to the protocol stack. If it is decided in step S207 that the status indicates an error, the SPU proceeds to step S208 to reset the network card. Finally, in step S209, the interrupt status is cleared and the interrupt processing thus ends.
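Because the SPU running the driver has no other work, the completion processing can be driven by a polling loop rather than by an interrupt. The sketch below assumes a `read_status` callable standing in for a read of the network card register and a `handle` callable standing in for steps S203 through S209; both names, the convention that a zero status means "nothing to do", and the `max_polls` bound are hypothetical.

```python
def poll_completions(read_status, handle, max_polls=1000):
    """Poll the card status and process every non-zero status that is read.

    A dedicated processor would loop indefinitely; the bound keeps the
    sketch finite. Returns the number of statuses that were handled.
    """
    handled = 0
    for _ in range(max_polls):
        status = read_status()   # S201: read the status register
        if status:               # assumption: zero means no event is pending
            handle(status)       # S203 to S209: release, deliver, reset, clear
            handled += 1
    return handled
```

No running process is suspended in this arrangement, since the polling processor executes nothing but the driver.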
It is to be noted that if the processing shown in
It is to be noted that the protocol stack 331 such as TCP or UDP is set as an execution target to be executed by the PPU 311 and the driver 350 corresponding to the network card 301 is set as an execution target to be executed by the SPU-1 (321-1) in the layer structure shown in
The flowchart presented in
Next, in step S303, the driver 350 executed by the SPU-1 (321-1) copies the packet data in the main memory (XDR) into the local store (LS) corresponding to the SPU-1 (321-1) through DMA executed via the memory flow controller (MFC). Then, in step S304, the driver 350 notifies the protocol stack 331 of the completion of the MFC DMA. Finally, in step S305, the protocol stack 331 releases the main memory area (XDR) having been taken up by the packet data.
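Steps S303 through S305 can be sketched with dictionaries standing in for the main memory (XDR) and the local store (LS). The function names, the `key` used to identify a packet, and the callback mechanism for the completion notice are assumptions; a plain copy replaces the MFC DMA transfer.

```python
def driver_fetch_for_transmission(main_memory, key, local_store, on_dma_complete):
    """Driver side: copy the packet from main memory into the local store."""
    local_store[key] = bytes(main_memory[key])  # S303: stands in for the MFC DMA
    on_dma_complete(key)                        # S304: notify the protocol stack


def make_release_callback(main_memory):
    """Protocol stack side: release the main memory area once notified."""
    def release(key):
        del main_memory[key]                    # S305: release the XDR area
    return release
```

The main memory area is released only after the copy has completed, which mirrors the ordering of steps S304 and S305.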
Through this processing sequence, the data having been set as the transmission target by the application 340 are handed over from the protocol stack 331 to the driver 350. In addition, the data copied into the local store (LS) corresponding to the SPU-1 (321-1) are mapped into the I/O address space used as the address space for the network card 301 through the mapping processing explained earlier in reference to
Next, in reference to the flowchart presented in
Next, in step S322, the driver 350 records the received packet data having been received at the network card 301 into the local store (LS) corresponding to the SPU-1 (321-1). In step S323, the driver 350 executed by the SPU-1 (321-1) copies the packet data in the local store (LS) into the main memory (XDR) through DMA executed via the memory flow controller (MFC). Then, in step S324, the driver 350 notifies the protocol stack 331 of the completion of the MFC DMA. Finally, in step S325, the protocol stack 331 hands over the received packet data to the application program 340.
Through this processing sequence, the data to be received by the application program 340, first handed over from the driver 350 to the protocol stack 331, are provided to the application 340 from the protocol stack 331.
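The reception path of steps S322 through S325 can be sketched in the same spirit. The slot numbers, the notification queue between the driver and the protocol stack, and the function names are hypothetical; dictionaries again stand in for the local store (LS) and the main memory (XDR), and a plain copy replaces the MFC DMA.

```python
from collections import deque


def driver_receive(slot, packet, local_store, main_memory, notifications):
    """Driver side, executed by the SPU."""
    local_store[slot] = bytes(packet)       # S322: record packet in the LS
    main_memory[slot] = local_store[slot]   # S323: copy LS -> XDR (MFC DMA stand-in)
    notifications.append(slot)              # S324: notify the protocol stack


def protocol_stack_deliver(main_memory, notifications, application_inbox):
    """Protocol stack side: hand each notified packet to the application."""
    while notifications:
        slot = notifications.popleft()
        application_inbox.append(main_memory.pop(slot))  # S325
```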
(Other Embodiments)
In the layer structure explained in reference to
(Example in Which the Driver and the Protocol Stack are Executed by an SPU)
In reference to
Unlike in the structure having been explained in reference to
(Example in Which the Driver, the Protocol Stack and the Application are Executed by an SPU)
In reference to
As shown in
(Examples in Which the Driver, the Protocol Stack and the Application are Executed by Different Processors Among the PPU and the SPUs)
Next, structural examples that may be adopted to execute the driver, the protocol stack and the application via different processors among the PPU and the SPUs are explained in reference to
In the example presented in
By adopting either of these structures, the processing load is dispersed over various processors so as to prevent the processing load on any given processor from becoming excessively heavy.
(Example in Which the Driver is Divided into a Transmission Portion and a Reception Portion to be Processed by Different SPUs)
Next, a structural example in which the driver corresponding to the network card is divided into a transmission portion and a reception portion to be processed by different SPUs is explained.
(Example in Which Different SPUs are Assigned to Control Protocol Stacks each Corresponding to a Specific Protocol Type)
Next, a structural example in which different SPUs are assigned to control protocol stacks each corresponding to a specific protocol type is explained.
The structure allows different SPUs to engage in control each in correspondence to a specific protocol and, as a result, a processing structure through which a specific processor is able to execute processing customized for a specific protocol can be achieved with ease.
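The assignment of protocol stacks to different SPUs can be pictured as a per-protocol distribution of packets. The mapping, the SPU labels used in it, and the function name are assumptions for illustration; the specification does not state which SPU handles which protocol.

```python
def distribute_by_protocol(packets, assignment):
    """Sort (protocol, payload) pairs into one queue per assigned SPU.

    Packets whose protocol has no assigned SPU are returned separately.
    """
    queues = {spu: [] for spu in assignment.values()}
    unassigned = []
    for protocol, payload in packets:
        spu = assignment.get(protocol)
        if spu is None:
            unassigned.append((protocol, payload))
        else:
            queues[spu].append(payload)
    return queues, unassigned


# Hypothetical assignment of protocol stacks to SPUs.
EXAMPLE_ASSIGNMENT = {"TCP": "SPU-2", "UDP": "SPU-3"}
```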
It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and alterations may occur depending upon design requirements and other factors insofar as they are within the scope of the appended claims or the equivalents thereof.
It is to be noted that the sequences of processing described in the specification may be executed in hardware, in software or in a structure achieved by combining specific hardware and software configurations. The processing sequences may be executed based upon software by installing a program having recorded therein the processing sequences in a memory in a computer built into a dedicated hardware unit and executing this program or by installing such a program in an all-purpose computer capable of executing various types of processing and executing the program.
For instance, the program may be recorded in advance in a hard disk or a ROM (read only memory) used as a recording medium. Alternatively, the program may be temporarily or permanently stored (recorded) in a removable recording medium such as a flexible disk, a CD-ROM (compact disk read-only memory), an MO (magneto-optical) disk, a DVD (digital versatile disk), a magnetic disk or a semiconductor memory. Such a removable recording medium may be provided as a packaged software program product.
It is to be noted that instead of installing the program from a removable recording medium into a computer, it may be wirelessly transferred from a download site into a computer. Alternatively, the program may be transferred to a computer through a wired connection via a network such as a LAN (local area network) or the Internet, and the computer having received the program transferred thereto may install the program into a recording medium such as an internal hard disk. It is to be noted that while an explanation is given above in reference to the embodiment on an example in which the present invention is adopted in conjunction with the TCP/IP protocol, the present invention may instead be adopted in conjunction with another protocol such as RTP/UDP/IP.
It is to be noted that the various types of processing described in the specification may be executed in time sequence as described or they may be executed concurrently or individually depending upon the processing capability of the apparatus executing the processing or as required. In addition, the term “system” used in the specification refers to a logical aggregate structure that includes a plurality of devices assuming various structures, which are not necessarily installed in a single case.
As described above, according to the embodiments of the present invention, a sub-processor (an SPU in the CELL), different from the main processor (the PPU in the CELL) executing control in the operating system (OS), is designated to control the device driver corresponding to the communication unit. The communication control is thus executed by the sub-processor in response to an interrupt originating from a network card functioning as the communication unit in an information processing apparatus equipped with a plurality of processors and engaged in communication via a network. As a result, data processing can be executed at the main processor with a high level of efficiency without a time lag.
Number | Date | Country | Kind |
---|---|---|---|
2007-002986 | Jan 2007 | JP | national |
2007-296676 | Nov 2007 | JP | national |
Number | Date | Country | |
---|---|---|---|
20080172682 A1 | Jul 2008 | US |