1. Technical Field
This application relates to managing addressing and memory sharing between the operating system and I/O device drivers performing direct memory access to system memory.
2. Description of Related Art
Direct Memory Access (DMA) is a hardware mechanism that allows peripheral components to transfer their I/O data directly to and from main memory without the need for the system processor to be involved in the transfer. Use of this mechanism can greatly increase throughput to and from a device, because a great deal of overhead is eliminated. A device driver will set up the DMA transfer and synchronize with the hardware, which actually performs the transfer. In this process, the device driver must provide an interface between devices that use 32-bit physical addresses and system code that uses 64-bit virtual addresses. DMA operations call an address-mapping program to map device page addresses to physical memory. Table 1 below is an exemplary address-mapping table used to convert between the device address and the system memory address.
Because it is necessary to call the mapping program to map the address, undesirable latencies are introduced into the DMA process, impacting I/O throughput. At times, the latency to resolve the address can be greater than the time needed to perform the actual data transfer. Therefore, in direct memory access to the system memory, new techniques for minimizing the time for this overhead operation are needed.
The present invention takes advantage of the fact that the latency of calling the mapping program to resolve a single address is almost the same as the latency of calling it to resolve a number of addresses. For example, when a 128-byte cache line is used to send 8-byte I/O addresses, sixteen addresses are present; the addresses for all sixteen pages can be resolved with minimal additional time over the cost of resolving one of them. To exploit this, the inventive process requires that system memory, which is generally allocated in pages of 4 kilobytes, be allocated in blocks of n pages, where n is the number of device addresses that can be stored in a cache line. With larger blocks of memory being allocated, the driver can initiate the copying of n pages into system memory with a single call to the address-mapping program. With a cache line that holds sixteen addresses, memory would be allocated in 64-kilobyte blocks, and sixteen 4-kilobyte pages can be copied before another call to the address-mapping program is needed. The overall wait time for accessing the address-mapping table is thus reduced, improving I/O response time. No change is required to the pagination in the operating system, which can continue to operate on 4-kilobyte pages. All changes are made in the hardware mapping programs and in the device driver software.
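The block-size arithmetic in the preceding paragraph can be checked with a short sketch. The function and constant names below are hypothetical and chosen only for illustration; the numeric values (128-byte cache line, 8-byte I/O address, 4-kilobyte page) are the ones given in the example above.

```python
# Illustrative sketch only: names are assumptions, values come from the text.
CACHE_LINE_BYTES = 128    # cache line size used in the example
IO_ADDRESS_BYTES = 8      # size of one I/O address
PAGE_BYTES = 4 * 1024     # operating system page size


def addresses_per_cache_line(cache_line_bytes=CACHE_LINE_BYTES,
                             address_bytes=IO_ADDRESS_BYTES):
    """Return n, the number of device addresses that fit in one cache line."""
    if address_bytes <= 0 or cache_line_bytes % address_bytes != 0:
        raise ValueError("cache line must hold a whole number of addresses")
    return cache_line_bytes // address_bytes


def block_bytes(n_pages, page_bytes=PAGE_BYTES):
    """Return the size in bytes of an allocation block of n_pages pages."""
    return n_pages * page_bytes


n = addresses_per_cache_line()
print(n, block_bytes(n) // 1024)  # pages per block, block size in KB
```

Under these values n is sixteen and the block size is 64 kilobytes, matching the figures in the paragraph; other cache line or address sizes would give a different n.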
The novel features believed characteristic of the invention are set forth in the appended claims. The invention itself, however, as well as a preferred mode of use, further objectives and advantages thereof, will best be understood by reference to the following detailed description of an illustrative embodiment when read in conjunction with the accompanying drawings, wherein:
Referring now to
Peripheral component interconnect (PCI) bus bridge 114 connected to I/O bus 112 provides an interface to PCI local bus 116. A number of modems may be connected to PCI local bus 116. Typical PCI bus implementations will support four PCI expansion slots or add-in connectors.
Additional PCI bus bridges 122 and 124 provide interfaces for additional PCI local buses 126 and 128, from which additional modems or network adapters may be supported. In this manner, data processing system 100 allows connections to multiple network computers. A memory-mapped graphics adapter 130 and hard disk 132 may also be connected to I/O bus 112 as depicted, either directly or indirectly.
Those of ordinary skill in the art will appreciate that the hardware depicted in
The data processing system depicted in
With reference to
With reference now to
Once data is written into memory, the operating system is notified, so that the requesting application can access the data. The operating system software can continue to manage the data in pages, as it has done previously. When the application is through with the data, the operating system releases one page at a time to be written to the device. Because the operating system manages individual pages while the hardware allocates larger blocks, care must be taken to ensure that all pages in a block are freed before the block is released.
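One way to enforce the rule that a block is released only after all of its pages are freed is a per-block bitmask, sketched below. This is an assumption made for illustration, not the mechanism described in this specification; the class name `Block` and the method `free_page` are hypothetical.

```python
# Illustrative sketch only: a per-block bitmask is an assumed bookkeeping
# scheme, not a mechanism taken from the specification.
class Block:
    """Tracks which pages of an n-page allocation block are still in use."""

    def __init__(self, n_pages=16):
        self.n_pages = n_pages
        # One bit per page; all pages start out in use.
        self.in_use = (1 << n_pages) - 1

    def free_page(self, index):
        """Mark one page as freed.

        Returns True only when every page in the block has been freed,
        meaning the whole block may now be released.
        """
        if not 0 <= index < self.n_pages:
            raise IndexError("page index outside block")
        self.in_use &= ~(1 << index)
        return self.in_use == 0
```

The operating system would keep releasing 4-kilobyte pages as before; only the driver-side bookkeeping would consult `free_page` to decide when the larger block can be returned.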
DMA writes from system memory to a device will now be discussed with reference to
As has been shown, the innovative method calls the address-mapping program less often than prior methods, because the program is asked to resolve the addresses for all pages in a block at one time. This means that, as illustrated above, when sixteen pages are grouped into a block, fifteen calls to the address-mapping program are avoided for every 64 KB of information managed in a direct memory access.
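The saving stated above can be expressed as a simple count. The sketch below assumes one call to the address-mapping program per page in the conventional case and one call per block in the inventive case; the function names are hypothetical.

```python
# Illustrative sketch only: function names are assumptions.
def mapping_calls(total_pages, pages_per_call):
    """Number of address-mapping calls needed to resolve total_pages pages."""
    if pages_per_call <= 0:
        raise ValueError("pages_per_call must be positive")
    # Ceiling division: a partially filled block still costs one call.
    return -(-total_pages // pages_per_call)


def calls_avoided(total_pages, pages_per_block):
    """Calls saved by resolving a whole block per call instead of one page."""
    return mapping_calls(total_pages, 1) - mapping_calls(total_pages, pages_per_block)


print(calls_avoided(16, 16))  # sixteen 4 KB pages, i.e. one 64 KB block
```

For one 64 KB block of sixteen pages this gives fifteen avoided calls, in agreement with the paragraph; a transfer that does not fill its last block saves correspondingly fewer calls.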
Of course, the inventive method of managing DMA I/O is not restricted to 64 KB transfers, but would enhance the performance of all transfers needing more than one address resolution.
It is important to note that while the present invention has been described in the context of a fully functioning data processing system, those of ordinary skill in the art will appreciate that the processes of the present invention are capable of being distributed in the form of a computer readable medium of instructions and in a variety of other forms, and that the present invention applies equally regardless of the particular type of signal bearing media actually used to carry out the distribution. Examples of computer readable media include recordable-type media, such as a floppy disk, a hard disk drive, a RAM, CD-ROMs, and DVD-ROMs, and transmission-type media, such as digital and analog communications links and wired or wireless communications links using transmission forms such as radio frequency and light wave transmissions. The computer readable media may take the form of coded formats that are decoded for actual use in a particular data processing system.
The description of the present invention has been presented for purposes of illustration and description, and is not intended to be exhaustive or limited to the invention in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art. The embodiment was chosen and described in order to best explain the principles of the invention, the practical application, and to enable others of ordinary skill in the art to understand the invention for various embodiments with various modifications as are suited to the particular use contemplated.