The invention relates generally to the transmission of data to multiple computers in a computer network and, more particularly, to designating the minimum-cost paths for disseminating data in a computer network.
In the field of computer networking, many efforts have been made to determine the best way for servers within a computer network to communicate with one another. In particular, the problem of which network links to use has been a challenge. While there may be a dozen paths that communication between two computers may use, only one or two of those paths may actually be the best. In making this determination, a network engineer may have a set of parameters to follow. These parameters may include: minimizing the distance that communications need to travel, maximizing the bandwidth available for each communication, or minimizing the amount of money spent creating the links between the computers. Such parameters will hereinafter be grouped under the general category of “cost.” In other words, a network engineer tries to minimize the cost of sending messages between computers in a network. The “cost” of a network link as used herein may include, but is not limited to, one or more of the following: the time it takes for data to travel over the link, the physical length of the link, or the monetary cost of the link. Thus, if travel time is being used as a parameter, then a “cheap” link is one that is relatively fast, whereas an “expensive” link is relatively slow.
Several techniques have been developed to create minimum-cost network topologies. However, many of these techniques become unworkable when the problem of intermediate servers is introduced into a network. Intermediate servers are those servers that co-exist in a network with the servers for which communication is being optimized, but are not the intended recipients of the message. Those servers that are the intended recipients will be referred to herein as “recipient servers.”
For example, servers on computer networks may share what is known as a “multi-master” or “distributed” database, in which multiple servers share responsibility for keeping the contents of the database current. An example of such a database is the MICROSOFT ACTIVE DIRECTORY SERVICE. Copies of parts or all of a shared database may be stored on several servers. When one server makes a change to a portion of the database, that change needs to be transmitted to all of the other servers that possess copies of that portion. Transmitting database changes from one server to another is also known as “replicating” the changes. Replication among the various servers of a network takes place according to an established pattern or “replication topology.” Those servers that share the responsibility for maintaining the shared database will be referred to herein as “replicating servers.” A replicating server is one implementation of a “recipient server.”
There are many situations in which a network may have both replicating servers and intermediate servers. One such situation is when a shared database is divided into several partitions, in which a server may only exchange database updates with another server in the same partition. For example, a corporate directory may be divided into sales, development and marketing partitions, such that sales servers only replicate with other sales servers, development servers only replicate with other development servers, and marketing servers only replicate with other marketing servers. In such a network, dissimilar servers would be seen as intermediate servers with respect to one another. For example, marketing servers and development servers would be seen as intermediate servers by the sales servers, since sales data would not be replicated by the other two types of servers, but would simply be passed through. Data replicated between recipient servers may have to pass through these intermediate servers, and therefore they may need to be considered when determining a minimum-cost replication scheme.
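By way of illustration only, the partition example above can be sketched in Python. The server names, the `partition_of` mapping, and the `classify_servers` helper are hypothetical and are not part of the described embodiments; the sketch merely shows that the same server may be a recipient server for one partition and an intermediate server for another.

```python
from typing import Dict, Set, Tuple


def classify_servers(partition_of: Dict[str, str], target: str) -> Tuple[Set[str], Set[str]]:
    """Split servers into recipients of `target` data and intermediates."""
    # Servers holding the target partition replicate its data with one another.
    recipients = {server for server, part in partition_of.items() if part == target}
    # Every other server can only pass that data through.
    intermediates = set(partition_of) - recipients
    return recipients, intermediates


# Hypothetical corporate directory with three partitions.
servers = {"S1": "sales", "S2": "sales", "D1": "development", "M1": "marketing"}

sales_recipients, sales_intermediates = classify_servers(servers, "sales")
dev_recipients, dev_intermediates = classify_servers(servers, "development")
```

From the point of view of the sales partition, `D1` and `M1` are intermediate servers, whereas from the point of view of the development partition, `D1` is the only recipient server.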
Thus it can be seen that there is a need for a new method for designating communication paths in a computer network.
In accordance with the foregoing, a method for designating communication paths in a computer network is provided. According to the invention, communication paths are designated for the transmission of data throughout a network that has both recipient computers, which are the intended recipients of the data, and intermediary computers, which are not the intended recipients, but merely relay the data. Each intermediary computer is grouped with the “closest” recipient computer (i.e., the recipient computer with which it is “least expensive” to communicate). Communication paths between the resulting groups are then identified. A representation of the network is then created. The representation replaces the intermediary computers with the inter-group communication paths, so that the inter-group communication paths appear to pass directly through the locations occupied by the intermediary computers. The created representation is then further processed so that the “least expensive” communication paths may be designated.
Additional features and advantages of the invention will be made apparent from the following detailed description of illustrative embodiments that proceeds with reference to the accompanying figures.
While the appended claims set forth the features of the present invention with particularity, the invention, together with its objects and advantages, may be best understood from the following detailed description taken in conjunction with the accompanying drawings of which:
FIGS. 3a-3e are an example of a procedure that may be followed in an embodiment of the invention to create a tree for a shortest-path forest;
FIGS. 4a-4f show an example of how to create a shortest-path forest from the network of
FIGS. 7a-7g show an example of how a spanning tree may be created from the modified network representation of
Although it is not required, the invention may be implemented by program modules that are executed by a computer. Generally, program modules include routines, objects, components, data structures and the like that perform particular tasks or implement particular abstract data types. A program may include one or more program modules. The invention may be implemented on a variety of types of computers, including personal computers (PCs), hand-held devices, multi-processor systems, microprocessor-based programmable consumer electronics, network PCs, minicomputers, mainframe computers and the like. The invention may also be employed in distributed computing environments, where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, modules may be located in both local and remote memory storage devices.
An example of a networked environment in which this system may be used will now be described with reference to
Referring to
Computer 100 may also contain communications connections that allow the device to communicate with other devices. A communication connection is an example of a communication medium. Communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media. The term computer readable media as used herein includes both storage media and communication media.
Computer 100 may also have input devices such as a keyboard, mouse, pen, voice input device, touch input device, etc. Output devices such as a display 116, speakers, a printer, etc. may also be included. All these devices are well known in the art and need not be discussed at length here.
According to an embodiment of the invention, the topology of a computer network having intermediate servers may be generated or reorganized so that each intermediate server is grouped with the recipient server with which it is cheapest to communicate. These groupings will be referred to herein as “trees,” with the collection of trees being referred to as a “shortest-path forest.” In a shortest-path forest, each tree has a replicating server as its “root,” and possibly one or more intermediate servers as its “branches.” Links between these groups or “trees” may then be identified and the paths between recipient servers through the branches and through the inter-tree links may be represented without the intermediate servers. As used herein, the terms “shortest path,” “closest” and the like do not necessarily equate to physical distance, but are rather meant to be expressions of cost as defined in the Background section. In other words, two servers having a direct link between them that is relatively cheap are said to be “close.” Likewise, the “shortest path” between two servers is really the “cheapest” path in terms of bandwidth, monetary cost, speed, physical distance or whatever other criterion is being used to set up the communication topology.
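The grouping described above may be sketched, purely for illustration, as a multi-source variant of Dijkstra's Algorithm in Python. The link representation, the function names, and the example network (recipient servers `R1` to `R3`, intermediate servers `I1` and `I2`) are assumptions made for this sketch and do not appear in the figures. The sketch also assumes non-negative link costs, and assumes that an inter-tree link is costed as the link itself plus the cost of reaching each of its endpoints from that endpoint's root.

```python
import heapq
from typing import Dict, Iterable, List, Tuple

Link = Tuple[str, str, float]  # (server_a, server_b, cost); hypothetical representation


def build_graph(links: Iterable[Link]) -> Dict[str, List[Tuple[str, float]]]:
    """Build an undirected adjacency list from the links."""
    graph: Dict[str, List[Tuple[str, float]]] = {}
    for a, b, cost in links:
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))
    return graph


def shortest_path_forest(links: List[Link], roots: List[str]):
    """Assign every reachable server to the root that is cheapest to reach."""
    graph = build_graph(links)
    dist = {r: 0.0 for r in roots}      # cost from each server to its root
    root_of = {r: r for r in roots}     # tree membership
    heap = [(0.0, r) for r in roots]
    heapq.heapify(heap)
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry; a cheaper path was already found
        for nbr, cost in graph.get(node, []):
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                root_of[nbr] = root_of[node]
                heapq.heappush(heap, (nd, nbr))
    return root_of, dist


def inter_tree_links(links: List[Link], root_of, dist) -> List[Link]:
    """Replace intermediate servers with direct root-to-root links."""
    best: Dict[Tuple[str, str], float] = {}
    for a, b, cost in links:
        if a not in root_of or b not in root_of:
            continue  # endpoint not reachable from any root
        ra, rb = root_of[a], root_of[b]
        if ra == rb:
            continue  # link lies inside a single tree
        key = tuple(sorted((ra, rb)))
        total = dist[a] + cost + dist[b]
        if total < best.get(key, float("inf")):
            best[key] = total
    return [(ra, rb, c) for (ra, rb), c in sorted(best.items())]


# Hypothetical network: three recipient servers and two intermediate servers.
example_links = [("R1", "I1", 1), ("I1", "I2", 2), ("I2", "R2", 1),
                 ("I2", "R3", 3), ("I1", "R3", 4)]
root_of, dist = shortest_path_forest(example_links, ["R1", "R2", "R3"])
collapsed = inter_tree_links(example_links, root_of, dist)
```

In this hypothetical network, `I1` joins the tree rooted at `R1` and `I2` joins the tree rooted at `R2`. The collapsed representation contains only links between recipient servers, which is the form on which a spanning-tree procedure may then operate.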
In one embodiment of the invention, a shortest path forest is first generated using a procedure that is based on Dijkstra's Algorithm. According to this procedure, the replicating servers are designated as the roots. Then, each intermediate server is grouped with the root having the cheapest link to it. Referring to
According to the procedure, the server 150 is designated as the root server as shown in
An example of a procedure that creates a shortest-path forest according to an embodiment of the invention will now be described. To aid in this example, a network having both recipient servers and intermediate servers is shown in
Once all of the servers of the network 180 have been grouped into trees, the shortest-path forest can be considered complete. The network 180 may then be redrawn so that the roots of the trees are at the bottom, as shown in
Now, the most efficient network links for the recipient servers of the network 180 to use for communication can be determined. Typically, determining which network links to use for sending data between recipient servers involves three goals. First, all recipient servers should be connected in the communication topology. Second, redundant communication paths should be avoided. Finally, the total cost of the network links used in the topology should be minimized. One way to fulfill these three goals is to create a so-called “minimum-cost spanning tree”—referred to herein as a “spanning tree.” Several methods exist for creating a spanning tree, one of which involves the use of Kruskal's algorithm, developed by Joseph Kruskal of BELL LABS. This method involves:
(1) Finding the cheapest link that has not yet been considered;
(2) If the link is not redundant, adding it to the tree;
(3) If there are no more links, stopping the procedure; and,
(4) If there are more links, repeating steps (1)-(3).
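Steps (1)-(4) may be sketched in Python as follows. This is a minimal illustration of Kruskal's algorithm rather than the implementation of any particular embodiment; the union-find helper, the function name `kruskal`, and the three-server example are assumptions made for the sketch. A link is treated as redundant when both of its endpoints are already connected through links previously added to the tree.

```python
from typing import Iterable, List, Tuple

Link = Tuple[str, str, float]  # (server_a, server_b, cost); hypothetical representation


def kruskal(servers: Iterable[str], links: Iterable[Link]) -> List[Link]:
    """Return the links of a minimum-cost spanning tree (or forest)."""
    parent = {s: s for s in servers}

    def find(s: str) -> str:
        # Follow parent pointers to the representative of the component,
        # shortening the path along the way.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    tree: List[Link] = []
    # Step (1): consider the links from cheapest to most expensive.
    for a, b, cost in sorted(links, key=lambda link: link[2]):
        ra, rb = find(a), find(b)
        # Step (2): a link is redundant if its endpoints are already connected.
        if ra != rb:
            parent[ra] = rb
            tree.append((a, b, cost))
    # Steps (3) and (4): the loop ends when no links remain.
    return tree


# Hypothetical links between three recipient servers.
recipient_links = [("R1", "R2", 4), ("R1", "R3", 5), ("R2", "R3", 4)]
tree = kruskal(["R1", "R2", "R3"], recipient_links)
total_cost = sum(cost for _, _, cost in tree)
```

In this example the link between `R1` and `R3` is rejected as redundant, because those two servers are already connected through `R2` by cheaper links.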
Referring to
In the previous examples, it has been assumed that there is full connectivity between the various servers of the network 180. In reality, certain links may not have full connectivity with one another, even if they have endpoints at the same server. For example, a bridge may be required to get data from one link to another. When bridges are present, the above-described procedure may have to be modified so that a shortest-path forest is generated for each bridge prior to the creation of a minimum-cost spanning tree. Also, some links may use incompatible transport protocols, or be available only at certain times. In such cases, the above-described procedure may also have to be modified so that those servers that share a transport protocol or have compatible schedules are treated separately for the purpose of generating a shortest-path forest.
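One possible way to treat links separately by transport protocol is sketched below in Python. The four-element link tuple, the transport labels, and the function name `links_by_transport` are hypothetical; the sketch shows only the partitioning of links, and leaves open how the per-transport results would subsequently be combined.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

TaggedLink = Tuple[str, str, float, str]  # (server_a, server_b, cost, transport)


def links_by_transport(links: Iterable[TaggedLink]) -> Dict[str, List[Tuple[str, str, float]]]:
    """Group links so that each transport can be processed on its own."""
    groups: Dict[str, List[Tuple[str, str, float]]] = defaultdict(list)
    for a, b, cost, transport in links:
        groups[transport].append((a, b, cost))
    return dict(groups)


# Hypothetical links labeled with illustrative transport names.
tagged = [("R1", "I1", 1, "transport_a"),
          ("I1", "R2", 2, "transport_a"),
          ("R2", "R3", 5, "transport_b")]
per_transport = links_by_transport(tagged)
```

Each resulting group contains only mutually compatible links, so a shortest-path forest could be generated for each group independently.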
In an embodiment of the invention, the network 180 (
There are many ways to implement the present invention in software. In one implementation, illustrated in
The invention described herein may be used to establish communication paths between computers located at the same site and/or between groups of computers located at different sites. Referring to
When implemented on a shared database network, it may be desirable to modify certain aspects of the invention in order to account for read-only servers. For example, in a directory service database, some servers may hold ‘writeable’ copies of a partition, while others may hold ‘read-only’ copies. In such a scenario, database replication may be set up so that changes are only replicated from writeable servers. In other words, replication between two writeable servers occurs in both directions, but if a writeable server and a read-only server are involved, then replication only occurs from the writeable server to the read-only server, and not vice versa.
According to an embodiment of the invention, additional parameters may be included in the process of designating communication links in order to account for the presence of read-only servers. These parameters include, but are not limited to:
(1) All writeable servers should be linked to one another without any intervening read-only servers;
(2) Read-only servers should be connected to the writeable servers so that they can replicate in any changed data; and,
(3) Read-only servers should not replicate in from other read-only servers, since the other read-only servers cannot possibly have any changes. However, read-only servers may replicate from other read-only servers if required by the communication links of the network. An example would be where a read-only server was connected to the rest of the network solely by a link to another read-only server that was, itself, well connected to the rest of the network.
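Parameters (1)-(3) may be illustrated with the following Python sketch. It is one possible reading of the parameters and not a definitive implementation: the function names, the server names, and the pass-by-pass attachment strategy are assumptions. Writeable servers are first joined by a spanning tree that uses only links between writeable servers. Read-only servers are then attached by one-way links, preferring a writeable source and falling back to an already attached read-only source only when no writeable server is directly linked.

```python
from typing import Dict, Iterable, List, Set, Tuple

Link = Tuple[str, str, float]  # (server_a, server_b, cost); hypothetical representation


def spanning_tree(servers: Iterable[str], links: Iterable[Link]) -> List[Link]:
    """Kruskal-style minimum-cost spanning tree over the given servers."""
    parent = {s: s for s in servers}

    def find(s: str) -> str:
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    tree: List[Link] = []
    for a, b, cost in sorted(links, key=lambda link: link[2]):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
            tree.append((a, b, cost))
    return tree


def two_tier_topology(writeable: Set[str], read_only: Set[str], links: Iterable[Link]):
    links = list(links)
    # Parameter (1): the top tier uses only writeable-to-writeable links.
    top = spanning_tree(
        writeable, [l for l in links if l[0] in writeable and l[1] in writeable])

    neighbours: Dict[str, List[Tuple[float, str]]] = {}
    for a, b, cost in links:
        neighbours.setdefault(a, []).append((cost, b))
        neighbours.setdefault(b, []).append((cost, a))

    bottom: List[Link] = []        # directed: (source, read_only_destination, cost)
    attached: Set[str] = set()
    sources = set(writeable)       # first pass: writeable sources only, parameter (2)
    progress = True
    while progress:
        progress = False
        for server in sorted(read_only - attached):
            candidates = [(c, s) for c, s in neighbours.get(server, []) if s in sources]
            if candidates:
                cost, source = min(candidates)
                bottom.append((source, server, cost))
                attached.add(server)
                progress = True
        # Parameter (3): later passes may use attached read-only servers as sources.
        sources = writeable | attached
    return top, bottom


# Hypothetical network: X2 can only be reached through read-only server X1.
top, bottom = two_tier_topology(
    {"W1", "W2"}, {"X1", "X2"},
    [("W1", "W2", 2), ("W1", "X1", 3), ("X1", "X2", 1), ("W2", "X1", 5)])
```

In this hypothetical network, `X1` replicates from `W1`, while `X2` replicates from `X1` because its only link is to that read-only server.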
To illustrate an implementation of these parameters, a simple network, generally labeled 200, is shown in
As a result, when there are both writeable and read-only servers in a shared-database network, and this embodiment of the invention is used, the replication topology ends up being a ‘two-tiered’ tree, in which the top tier includes all of the writeable servers linked in a bi-directional minimum spanning tree, and the bottom tier includes the read-only servers. The bottom tier may include several trees appended to the tier, in which replication occurs in a downward direction. In this example, “downward” means “away from the writeable servers.” An example of a two-tier tree is shown in
It can thus be seen that a new and useful method for designating communication paths in a network has been provided. In view of the many possible embodiments to which the principles of this invention may be applied, it should be recognized that the embodiments described herein with respect to the drawing figures are meant to be illustrative only and should not be taken as limiting the scope of the invention. For example, those of skill in the art will recognize that the elements of the illustrated embodiments shown in software may be implemented in hardware and vice versa or that the illustrated embodiments can be modified in arrangement and detail without departing from the spirit of the invention. Therefore, the invention as described herein contemplates all such embodiments as may come within the scope of the following claims and equivalents thereof.
This application is a continuation application of U.S. patent application Ser. No. 09/795,202, filed Feb. 28, 2001 now U.S. Pat. No. 6,879,564.
Number | Name | Date | Kind |
---|---|---|---|
4864559 | Perlman | Sep 1989 | A |
4873517 | Baratz et al. | Oct 1989 | A |
5261051 | Masden et al. | Nov 1993 | A |
5291477 | Liew | Mar 1994 | A |
5313630 | Namioka et al. | May 1994 | A |
5551027 | Choy et al. | Aug 1996 | A |
5588147 | Neeman et al. | Dec 1996 | A |
5608903 | Prasad et al. | Mar 1997 | A |
5649194 | Miller et al. | Jul 1997 | A |
5675787 | Miller et al. | Oct 1997 | A |
5698121 | Kosaka et al. | Dec 1997 | A |
5768519 | Swift et al. | Jun 1998 | A |
5774552 | Grimmer | Jun 1998 | A |
5787441 | Beckhardt | Jul 1998 | A |
5787442 | Hacherl et al. | Jul 1998 | A |
5806074 | Souder et al. | Sep 1998 | A |
5832225 | Hacherl et al. | Nov 1998 | A |
5832275 | Olds | Nov 1998 | A |
5832487 | Olds et al. | Nov 1998 | A |
5832506 | Kuzma | Nov 1998 | A |
5884322 | Sidhu et al. | Mar 1999 | A |
5926816 | Bauer et al. | Jul 1999 | A |
5968121 | Logan et al. | Oct 1999 | A |
5968131 | Mendez et al. | Oct 1999 | A |
6049809 | Raman et al. | Apr 2000 | A |
6052724 | Willie et al. | Apr 2000 | A |
6058401 | Stamos et al. | May 2000 | A |
6138124 | Beckhardt | Oct 2000 | A |
6212557 | Oran | Apr 2001 | B1 |
6247017 | Martin | Jun 2001 | B1 |
6295541 | Bodnar et al. | Sep 2001 | B1 |
6301589 | Hirashima et al. | Oct 2001 | B1 |
6324571 | Hacherl | Nov 2001 | B1 |
6343299 | Huang et al. | Jan 2002 | B1 |
6377950 | Peters et al. | Apr 2002 | B1 |
6427209 | Brezak et al. | Jul 2002 | B1 |
6446077 | Straube et al. | Sep 2002 | B2 |
6446092 | Sutter | Sep 2002 | B1 |
6457053 | Satagopan et al. | Sep 2002 | B1 |
6516327 | Zondervan et al. | Feb 2003 | B1 |
6529917 | Zoltan | Mar 2003 | B1 |
6532479 | Souder et al. | Mar 2003 | B2 |
6539381 | Prasad et al. | Mar 2003 | B1 |
6557111 | Theimer et al. | Apr 2003 | B1 |
6643670 | Parham et al. | Nov 2003 | B2 |
6647393 | Dietterich et al. | Nov 2003 | B1 |
6751634 | Judd | Jun 2004 | B1 |
6823338 | Byrne et al. | Nov 2004 | B1 |
6865576 | Gong et al. | Mar 2005 | B1 |
6879564 | Parham et al. | Apr 2005 | B2 |
6901433 | San Andres et al. | May 2005 | B2 |
7035922 | Parham | Apr 2006 | B2 |
7162499 | Lees et al. | Jan 2007 | B2 |
7184359 | Bridgewater et al. | Feb 2007 | B1 |
7185359 | Schmidt et al. | Feb 2007 | B2 |
7200847 | Straube et al. | Apr 2007 | B2 |
20060026165 | Mohamed et al. | Feb 2006 | A1 |
20060085428 | Bozeman et al. | Apr 2006 | A1 |
20060168120 | Parham | Jul 2006 | A1 |
20060184589 | Lees et al. | Aug 2006 | A1 |
20060200831 | Straube et al. | Sep 2006 | A1 |
20070162519 | Straube et al. | Jul 2007 | A1 |
Number | Date | Country | |
---|---|---|---|
20050256879 A1 | Nov 2005 | US |
Relation | Number | Date | Country |
---|---|---|---|
Parent | 09795202 | Feb 2001 | US |
Child | 11043607 | | US |