Referring to the Drawing, wherein like numbers denote like parts throughout the several views,
Data communicated over LAN 102 and Internet 101 is sent in packets. A packet is a self-contained unit of data for transmission on a network, having embedded information necessary to route the packet via the network to its ultimate destination. A routing protocol, such as the Transmission Control Protocol/Internet Protocol (TCP/IP), specifies the format of the packet and how routing is determined.
It will be understood that
One or more communications buses 205 provide a data communication path for transferring data among CPU 201, main memory 202 and various I/O interface units 211-214, which may also be known as I/O processors (IOPs) or I/O adapters (IOAs). The I/O interface units support communication with a variety of storage and I/O devices. For example, terminal interface unit 211 supports the attachment of one or more user terminals 221-224. Storage interface unit 212 supports the attachment of one or more direct access storage devices (DASD) 225-227 (which are typically rotating magnetic disk drive storage devices, although they could alternatively be other devices, including arrays of disk drives configured to appear as a single large storage device to a host). I/O device interface unit 213 supports the attachment of any of various other types of I/O devices, such as printer 228 and fax machine 229, it being understood that other or additional types of I/O devices could be used.
Local Area Network interface (or “LAN adapter”) 214 supports a connection to one or more external networks, and particularly to LAN 102, for communication with one or more other digital devices. LAN interface 214 includes an internal processor 215 which controls the operation of the LAN interface, and a buffer 216 for temporarily storing data packets. Outbound data packets may be received from CPU 201 and/or memory 202 via communications bus 205, and stored temporarily in buffer 216, for outbound transmission to the LAN. Similarly, inbound data packets may be received from the LAN into buffer 216, and later sent via communications bus 205 to memory 202 or CPU 201. Although buffer 216 is shown as a single unitary entity, it may be partitioned into multiple storage spaces. Computer system 200 of the preferred embodiment contains at least one LAN adapter 214. It may optionally contain multiple LAN or other communications adapters. Where system 200 contains multiple adapters, one or more than one may be coupled, directly or indirectly, to the Internet, and these adapters may connect to the same or different local area networks, or the same or different routers or gateways.
It should be understood that
Although only a single CPU 201 is shown for illustrative purposes in
Computer system 200 depicted in
While various system components have been described and shown at a high level, it should be understood that a typical computer system contains many other components not shown, which are not essential to an understanding of the present invention. In the preferred embodiment, computer system 200 is a computer system based on the IBM i/Series™ architecture, it being understood that the present invention could be implemented on other computer systems.
Operating system 301 further contains one or more communications stack instances 303 (of which one is shown in
System 200 further contains one or more user applications 311-313 (of which three are represented in
In order to communicate with remote processes, applications 311-313 generate outbound data to be sent via communications stack 303 and receive inbound data from the communications stack. Communications stack 303 formats outbound data appropriately for transmission on the network, and in particular formats data into packets of an appropriate size, which are transmitted by LAN adapter driver 302 across communications bus 205 to LAN interface 214. LAN interface 214 may temporarily store packets in its buffer 216 before transmission to LAN 102. Inbound packets received from the LAN are similarly forwarded by LAN interface 214 to LAN adapter driver 302; data is then extracted from the packets and reconstituted in its original form by communications stack 303, and provided to the appropriate application.
In accordance with the preferred embodiment of the present invention, multiple outbound data packets containing outbound data generated by one or more of applications 311-313 may, in appropriate circumstances, be accumulated as a single packet train 305. All packets in the packet train are sent together to LAN interface 214 over communications buses 205. Accumulation of packets in a packet train is regulated by packet train control function 304. Accumulation of packets, or “packet training”, reduces the number of times the LAN adapter driver must be invoked to send packets, and consequently reduces the overhead burden on CPU(s) 201 and other system resources due to execution context switching, execution of the LAN adapter driver functions, bus arbitration, and so forth. The operation of the packet train control function is described in greater detail herein.
It will be understood that a typical computer system will contain many other software components (not shown), which are not essential to an understanding of the present invention. In particular, a typical operating system will contain numerous functions and state data unrelated to the transmission of data across a network, such as multi-tasking dispatch functions, memory management, interrupt handling, error recovery, and so forth.
Various software entities are represented in
While the software components of
Each packet 402 within packet train 305 contains a respective packet header 403 specifying such information as a packet destination and other required control information, and packet data 404 which was generated by the originating process.
In accordance with the preferred embodiment of the present invention, outbound data generated by a process (such as one of user application 311-313) executing in CPU(s) 201 is arranged in packets by communications stack 303, and the packets are further grouped in packet trains by packet train control 304 for transmission to LAN interface 214. Packet trains are held in memory 202 as they are being built for transmission to the LAN interface. When a train accumulates a number of packets equal to a target, identified as train_max, the train is sent to the LAN interface. In order to prevent packets from waiting an inordinately long time, a timeout mechanism interrupts the processor and causes trains to be sent after a pre-determined time has elapsed, regardless of whether the train_max has been reached.
For various reasons of system performance, particularly to accommodate packets of differing size, the timeout period before sending a packet train is relatively long. Train_max is dynamically adjustable to avoid timeouts. Specifically, train_max adjusts downward more readily than it adjusts upward, to reflect the greater relative “cost” of waiting too long for packets to accumulate vs. not waiting long enough. Train_max is adjusted downward in advance of an actual timeout by projecting whether sufficient packets will be received to satisfy train_max. Train_max is adjusted upward only if an excess of packets is received in the timeout interval, preferably for a consecutive number (tcl) of packet trains.
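The asymmetric adjustment policy described above may be sketched as follows. This is an illustrative sketch only, not part of the disclosed embodiment; the bounds train_min and train_abs_max are assumptions introduced here for completeness.

```python
# Illustrative sketch of the asymmetric train_max adjustment policy.
# train_min and train_abs_max are assumed bounds, not from the text.

def adjust_down(train_max, train_min=1):
    # Downward adjustment: halve readily, applied in advance of an
    # actual timeout when packets are projected to arrive too slowly.
    return max(train_max // 2, train_min)

def adjust_up(train_max, t_cnt, tcl, train_abs_max=64):
    # Upward adjustment: increment by 1 only after tcl consecutive
    # packet trains have filled within the timeout interval.
    if t_cnt >= tcl:
        return min(train_max + 1, train_abs_max)
    return train_max
```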
Referring to
If train_size=1 (i.e., this is the first packet of a new train), the ‘Y’ branch is taken from step 502. In this case, one or more timers for packet train timeout and timeout check are initialized, and the corresponding interrupt(s) enabled (step 503). Conceptually, there are at least two time periods, one being a packet train timeout, representing the maximum length of time a packet can wait in the packet train before being sent, and a second being a timeout check, representing the time at which packet train progress is checked and train_max adjusted downward if necessary, the second being less than the first. The timer mechanisms are described herein as two separate timers. However, it would be possible to implement these in a single timer and corresponding interrupt, which, after being triggered a first time for the packet train progress check, is reset to a time value corresponding to the remainder of the timeout interval.
The packet train control checks whether the difference between the current time and start_time (the time at which the preceding packet train began accumulating packets) is less than the timeout interval (step 504). If so, then it would have been possible for the new packet to have been appended to the previous packet train without exceeding the timeout, indicating that the train_max should possibly be adjusted upward, and the ‘Y’ branch is taken from step 504. A counter designated “t_cnt” is incremented (step 506), and compared with a t_cnt limit, designated “tcl” (step 507). If t_cnt has reached the limit tcl (the ‘Y’ branch from step 507), then the ‘Y’ branch from step 504 has been taken for the last tcl consecutive packet trains. This fact is taken as a sufficient indication that train_max is too low; accordingly, train_max is incremented at step 508 (but not past some pre-determined maximum). In the preferred embodiment, train_max is simply incremented by 1; however, it would alternatively be possible to increment train_max by some other fixed amount, or by a variable amount such as a percentage. If the increment is by more than one, then t_cnt should not be incremented until the corresponding number of packets have been received in the timeout interval of the most recently sent packet train. If the t_cnt limit has not been reached, then the ‘N’ branch is taken from step 507, by-passing step 508. If, at step 504, the ‘N’ branch is taken, then the new packet could not have been included in the previous packet train without exceeding the timeout, so t_cnt is reset to zero (step 505). The t_cnt limit can be any appropriate integer to regulate the rate of upward adjustment, and in particular could be 1, making steps 505-507 unnecessary.
After performing steps 504-508, as necessary, the start_time is reset to the current time to record the time at which the current packet train began accumulating packets (step 509). The variable start_time will not be altered again until a new packet train is begun.
The packet train control function then checks whether train_size (the number of packets in the current packet train) has reached train_max, the target maximum number of packets (step 510). If so, the ‘Y’ branch is taken from step 510. The packet train timeout and timeout check interrupts are disabled, and train_size is reset to 0 (step 511). The packet train control then calls the LAN adapter driver 302 to transmit the current packet train stored in memory 202 to the LAN adapter 214 (step 512). If train_max has not yet been reached, the ‘N’ branch is taken from step 510, and the packet train control performs no further action, allowing the current packet train to remain in memory, ready to accumulate additional packets.
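The packet-arrival logic of steps 502-512 may be modeled as in the following sketch. The names train_max, t_cnt, tcl and start_time follow the text; the clock source, the initial value of start_time, the bound train_abs_max and the send_train stand-in for the LAN adapter driver call are illustrative assumptions.

```python
import time

class PacketTrainControl:
    """Illustrative model of the packet-arrival logic of steps 502-512.
    Timer arming, clock source and the send stand-in are assumptions."""

    def __init__(self, train_max=8, tcl=3, timeout=0.010,
                 train_abs_max=64, now=time.monotonic):
        self.train = []              # current packet train held in memory
        self.train_max = train_max   # target number of packets per train
        self.t_cnt = 0               # consecutive early-completion count
        self.tcl = tcl               # t_cnt limit before raising train_max
        self.timeout = timeout       # packet train timeout interval
        self.train_abs_max = train_abs_max
        self.now = now
        # Assumed initialization so the very first train is not counted
        # as having completed early.
        self.start_time = now() - timeout

    def send_train(self):
        # Stand-in for invoking the LAN adapter driver (steps 511-512).
        sent, self.train = self.train, []
        return sent

    def add_packet(self, packet):
        self.train.append(packet)
        if len(self.train) == 1:                      # step 502
            # step 503: (re)arm timeout and timeout-check timers here.
            if self.now() - self.start_time < self.timeout:   # step 504
                self.t_cnt += 1                       # step 506
                if self.t_cnt >= self.tcl:            # step 507
                    self.train_max = min(self.train_max + 1,
                                         self.train_abs_max)  # step 508
                # Per the text, t_cnt is reset only at steps 505 and 604.
            else:
                self.t_cnt = 0                        # step 505
            self.start_time = self.now()              # step 509
        if len(self.train) >= self.train_max:         # step 510
            return self.send_train()                  # steps 511-512
        return None
```

For example, with tcl=1 a train that fills within the timeout interval of its predecessor causes train_max to be raised by one before the next train is sent.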
The packet train control function as described above with reference to
Referring to
The threshold T is preferably a value selected to project a likelihood that the packet train will be filled before reaching packet train timeout. For example, the threshold T may be derived as a product K*train_max, where K is some coefficient between 0 and 1. K is generally related to the ratio of the timeout check interval to the packet train timeout interval, although it need not be exactly equal to this ratio. For example, if the timeout check interval is exactly half as long as the packet train timeout interval, then K might be approximately 0.5, so that if at least half the required packets are already in the train, it can be projected that the train will accumulate sufficient packets before timeout. However, K could be somewhat less than 0.5 or somewhat more than 0.5, depending on how aggressively it is desired to reduce train_max. In general, it is not important that T be exact. The purpose of the check at step 601 is to provide rapid adjustment of train_max in obvious cases where packets are not accumulating sufficiently rapidly. To reduce overhead, T could be derived from a table or any of various approximations, rather than performing an actual multiplication or division operation.
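A table-driven derivation of T, as suggested above, might look as follows. This sketch assumes K = 0.5 and an upper bound on train_max; both are illustrative assumptions, and with K = 0.5 the table reduces to a shift.

```python
# Illustrative table-driven derivation of threshold T (K = 0.5 assumed),
# avoiding a runtime multiplication as the text suggests.
TRAIN_ABS_MAX = 64  # assumed upper bound on train_max
T_TABLE = [m >> 1 for m in range(TRAIN_ABS_MAX + 1)]  # T = train_max // 2

def threshold(train_max):
    # One indexed load per check; no multiply or divide at runtime.
    return T_TABLE[train_max]
```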
If the train_size does not meet the threshold T (the ‘N’ branch from step 601), the target train_max is adjusted downward accordingly (step 602). In the preferred embodiment, train_max is adjusted downward by halving, i.e., dividing the current train_max value by 2. The resultant value is rounded to an integer. Preferably, it is rounded downward (since this is easily performed in binary arithmetic by a shift operation). It will be appreciated that any of various alternative downward adjustments could be made. In one alternative variation, train_max is set to some multiple of train_size, such as (1/K)*train_size. Train_max will not be adjusted below some pre-determined minimum value, which is preferably 1.
If train_size is now greater than or equal to train_max as adjusted by step 602, the ‘Y’ branch is taken from step 603. In this case, the timeout interrupts are disabled, and train_size and t_cnt are reset to 0 (step 604). The packet train control then calls LAN adapter driver 302 to transmit the packet train in its current state to the LAN adapter (step 605). If the train_size does not meet the adjusted train_max, the ‘N’ branch is taken from step 603, by-passing steps 604 and 605.
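The timeout-check logic of steps 601-605 may be sketched as a pure function over the current train state. K = 0.5 and train_min = 1 are assumptions consistent with the text; the caller would disable interrupts and invoke the LAN adapter driver when the function indicates the train should be sent.

```python
# Illustrative sketch of the timeout-check logic of steps 601-605.
# K = 0.5 and train_min = 1 are assumed values, not from the text.

def timeout_check(train, train_max, K=0.5, train_min=1):
    """Return the (possibly reduced) train_max and whether the train
    should be sent to the LAN adapter immediately."""
    T = K * train_max                            # threshold (step 601)
    if len(train) < T:                           # 'N' branch: behind pace
        train_max = max(train_max // 2, train_min)   # step 602: halve,
                                                     # rounding downward
    send_now = 0 < len(train) and len(train) >= train_max  # step 603
    return train_max, send_now
```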
In one optional variation of the preferred embodiment, the timeout check timer is re-initialized to an appropriate timer value to cause another timeout check before expiration of the packet timeout interval, shown in
Although adjustment of train_max as described above with respect to
Referring to
In accordance with the preferred embodiment described above, outbound packets are collected in packet trains up to a target maximum size to be transmitted from main memory to a LAN adapter, ultimately from there to be transmitted by the LAN adapter on a LAN. However, the present invention is not necessarily limited to this particular environment or application, and the dynamic adjustment of packet train size by projecting a packet train in advance of timeout, as described herein, could alternatively be applied to other environments or applications. In particular, packet training as described herein could alternatively be practiced by the LAN adapter for inbound packets destined for main memory 202 or CPU 201. That is, the LAN adapter could accumulate incoming packets received over LAN 102 in packet trains in its internal buffer 216, and dynamically adjust the target size of packet trains sent to the processor or memory, using the techniques described herein. The techniques described herein could alternatively be used external to a single computer system where appropriate protocols support packet training.
In the preferred embodiment, a packet train target size is adjusted downward more readily than it is adjusted upward. This embodiment is chosen because it is believed that the relative cost of an excessively large target size (i.e., the packets wait until timeout) is greater than the relative cost of an excessively small target size (i.e., the additional overhead of sending extra packet trains). However, in an environment in which the reverse is the case or the relative costs are more nearly equal, it would alternatively or additionally be possible to adjust the packet train target size upward in a similar manner, e.g., by checking progress at an intermediate time in the time interval, or just before sending a packet train, and adjusting the target upward if it appears likely that additional packets will be received before timeout.
In the preferred embodiment, a packet train size is measured solely as a number of packets in a train, regardless of packet size. It would alternatively be possible to measure a packet train size as a number of data bytes, or some other appropriate measure of size.
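The alternative byte-based measure mentioned above might be computed as in the following sketch; the function name is a hypothetical illustration.

```python
# Illustrative alternative size measure: packet train size counted in
# data bytes rather than number of packets.
def train_size_bytes(train):
    return sum(len(pkt) for pkt in train)
```

Under this variation, train_max and the threshold T would be expressed in bytes rather than packet counts.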
In general, the routines executed to implement the illustrated embodiments of the invention, whether implemented as part of an operating system or a specific application, program, object, module or sequence of instructions, are referred to herein as “programs” or “computer programs”. The programs typically comprise instructions which, when read and executed by one or more processors in devices or systems consistent with the invention, cause those devices or systems to perform the steps necessary to execute steps or generate elements embodying the various aspects of the present invention. Moreover, while the invention has been and hereinafter will be described in the context of fully functioning computer systems, the various embodiments of the invention are capable of being distributed as a program product in a variety of forms, and the invention applies equally regardless of the particular type of signal-bearing media used to actually carry out the distribution. Examples of signal-bearing media include, but are not limited to, volatile and non-volatile memory devices, floppy disks, hard-disk drives, CD-ROMs, DVDs, magnetic tape, and so forth. Furthermore, the invention applies to any form of signal-bearing media regardless of whether data is exchanged from one form of signal-bearing media to another over a transmission network, including a wireless network. Examples of signal-bearing media are illustrated in
Although a specific embodiment of the invention has been disclosed along with certain alternatives, it will be recognized by those skilled in the art that additional variations in form and detail may be made within the scope of the following claims: