1. Technical Field
The present disclosure relates in general to the field of computers, and more particularly to computer software. Still more particularly, the present disclosure relates to distributed databases.
2. Description of the Related Art
Distributed computing allows a system to share resources, including hardware, software and data. Distributed data may be located in multiple hardware systems, including different servers. A client computer needs to be able to seamlessly locate and manage distributed data from different servers in order to effectively utilize the distributed data.
A method, system and computer program product for managing distributed data is presented. A first datum, which is represented in an upper tier of a data tree, is received from a client computer by a first upper tier partition server. The first upper tier partition server is part of a plurality of upper tier partition servers. A partition server manager in the first upper tier partition server identifies at least one other upper tier partition server that contains another datum from the upper tier of the data tree. The at least one other upper tier partition server is registered with the client, such that the client is able to manage other upper tier data stored in the plurality of other upper tier partition servers.
The above, as well as additional purposes, features, and advantages of the present invention will become apparent in the following detailed written description.
The novel features believed characteristic of the invention are set forth in the appended claims. The invention itself, however, as well as a preferred mode of use, further purposes and advantages thereof, will best be understood by reference to the following detailed description of an illustrative embodiment when read in conjunction with the accompanying drawings, where:
With reference now to
Computer 102 includes a processor unit 104 that is coupled to a system bus 106. A video adapter 108, which drives/supports a display 110, is also coupled to system bus 106. System bus 106 is coupled via a bus bridge 112 to an Input/Output (I/O) bus 114. An I/O interface 116 is coupled to I/O bus 114. I/O interface 116 affords communication with various I/O devices, including a keyboard 118, a mouse 120, a Compact Disk-Read Only Memory (CD-ROM) drive 122, and a flash drive memory 124. The format of the ports connected to I/O interface 116 may be any known to those skilled in the art of computer architecture, including but not limited to Universal Serial Bus (USB) ports.
Computer 102 is able to communicate with a software deploying server 150 via a network 128 using a network interface 130, which is coupled to system bus 106. Network 128 may be an external network such as the Internet, or an internal network such as an Ethernet or a Virtual Private Network (VPN). Note the software deploying server 150 may utilize a same or substantially similar architecture as computer 102.
A hard drive interface 132 is also coupled to system bus 106. Hard drive interface 132 interfaces with a hard drive 134. In a preferred embodiment, hard drive 134 populates a system memory 136, which is also coupled to system bus 106. System memory is defined as a lowest level of volatile memory in computer 102. This volatile memory includes additional higher levels of volatile memory (not shown), including, but not limited to, cache memory, registers and buffers. Data that populates system memory 136 includes computer 102's operating system (OS) 138 and application programs 144.
OS 138 includes a shell 140, for providing transparent user access to resources such as application programs 144. Generally, shell 140 is a program that provides an interpreter and an interface between the user and the operating system. More specifically, shell 140 executes commands that are entered into a command line user interface or from a file. Thus, shell 140 (also called a command processor) is generally the highest level of the operating system software hierarchy and serves as a command interpreter. The shell provides a system prompt, interprets commands entered by keyboard, mouse, or other user input media, and sends the interpreted command(s) to the appropriate lower levels of the operating system (e.g., a kernel 142) for processing. Note that while shell 140 is a text-based, line-oriented user interface, the present invention will equally well support other user interface modes, such as graphical, voice, gestural, etc.
As depicted, OS 138 also includes kernel 142, which includes lower levels of functionality for OS 138, including providing essential services required by other parts of OS 138 and application programs 144, including memory management, process and task management, disk management, and mouse and keyboard management.
Application programs 144 include a browser 146. Browser 146 includes program modules and instructions enabling a World Wide Web (WWW) client (i.e., computer 102) to send network messages to, and receive network messages from, the Internet using HyperText Transfer Protocol (HTTP) messaging, thus enabling communication with software deploying server 150.
Application programs 144 in computer 102's system memory (as well as software deploying server 150's system memory) also include a Partitioned Data Manager (PDM) 148, which manages data that may be organized and depicted in a data tree described by database 137. PDM 148 includes code for implementing the processes described in
The hardware elements depicted in computer 102 are not intended to be exhaustive, but rather are representative to highlight essential components required by the present invention. For instance, computer 102 may include alternate memory storage devices such as magnetic cassettes, Digital Versatile Disks (DVDs), Bernoulli cartridges, and the like. These and other variations are intended to be within the spirit and scope of the present invention.
With reference now to
PSM 202 provides software logic used to locate related distributed data. That is, assume that upper tier data (as organized and depicted in an inverted data tree) is distributed across multiple servers. PSM 202 is able to identify and locate all such upper tier data.
DRM 204 provides software logic used to locate any lower tiered data that is related to upper tier data.
DM 206 provides software logic used to control how deep (i.e., how far down the inverted data tree) a client is authorized to go when searching for secondary data.
DIM 208 provides software logic that controls how often data from multiple servers (that contain upper and lower tier data) is pushed onto a client computer.
DDM 210 provides software logic that controls the deregistration and decoupling (i.e., the deactivation) of distributed data servers from the client computer.
Referring now to
Assume that a user of client computer 302 desires to manage a piece of data from a distributed data system. For example, assume that the user wants to store names of employees of a company on one or more of the upper tier partition servers 304a-n. The client computer 302 sends a first employee name to the network 300. PDM 148, which may be stored in and function from only upper tier partition server 304a, or alternatively in any or all of the upper tier partition servers 304a-n, directs the first employee name to be stored as part of upper tier data 306a in upper tier partition server 304a. PSM 202 (which is part of PDM 148) examines the employee's name, determines that the name is for an employee of Company A, and locates and identifies all other upper tier data 306b-n stored in the other upper tier partition servers 304b-n that also have the names of employees of Company A. PDM 148 sends a message back to client computer 302 informing client computer 302 of the locations of all other upper tier partition servers 304b-n that also contain the names of other employees of Company A.
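The lookup performed by PSM 202 in this example can be illustrated with a minimal Python sketch. It is only one possible reading of the paragraph above, not the disclosed implementation: the class `UpperTierPartitionServer`, the function `store_and_locate_peers`, the in-memory dictionary standing in for upper tier data 306a-n, and the sample employee names are all assumptions introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class UpperTierPartitionServer:
    """Hypothetical stand-in for one upper tier partition server (304a-n)."""
    server_id: str
    # Employee name -> company; plays the role of upper tier data (306a-n).
    upper_tier_data: Dict[str, str] = field(default_factory=dict)


def store_and_locate_peers(
    first_server: UpperTierPartitionServer,
    all_servers: List[UpperTierPartitionServer],
    name: str,
    company: str,
) -> List[str]:
    """Store the name on first_server, then return the IDs of every other
    server that already holds at least one name for the same company."""
    first_server.upper_tier_data[name] = company
    return [
        server.server_id
        for server in all_servers
        if server is not first_server
        and company in server.upper_tier_data.values()
    ]


# Sample data (invented for this sketch).
servers = [
    UpperTierPartitionServer("304a"),
    UpperTierPartitionServer("304b", {"Bob": "Company A"}),
    UpperTierPartitionServer("304c", {"Carol": "Company B"}),
    UpperTierPartitionServer("304n", {"Dave": "Company A"}),
]

# The list returned here is what would be reported back to the client.
peers = store_and_locate_peers(servers[0], servers, "Alice", "Company A")
print(peers)  # ['304b', '304n']
```

In this sketch the returned list corresponds to the message sent back to client computer 302; how the servers actually discover one another across network 300 is not specified here and is left out.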
PDM 148, stored in client computer 302, one or more of the upper tier partition servers 304a-n, and/or one or more of the lower tier partition servers 308a-n, is also able to locate lower tier data 310a-n in one or more of the lower tier partition servers 308a-n, which are coupled together by a network 312, and which communicate with the upper tier partition servers 304a-n via a fabric 314. The lower tier data 310a-n represents data that is lower than the upper tier data 306a-n (i.e., subordinate to higher node data in a data tree).
Referring now to
Subordinate to the upper tier data 406 are the lower tier data 410, made up of data 408a-n, which are the respective titles of named employees described in upper tier data 406. Likewise, subordinate to the lower tier data 410 is a lowest tier data 412, made up of data 414a-n, which are the respective social security numbers associated with the employee titles (and their respective employee names) found in the lower tier data 410. While only three tiers of data plus an apex are depicted, it is understood that there may be additional lower layers of data tiers. In one embodiment, the lower the data tier, the higher the sensitivity of the data stored in it. That is, an employee's social security number (found in lowest tier data 412) is more sensitive than that employee's job title (found in lower tier data 410), which is more sensitive than the employee's name (upper tier data 406) or employer (apex node 402).
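The inverted data tree described above can be modeled with a small Python sketch. The `Node` class, the integer `tier` field, the helper functions, and the sample values are hypothetical and chosen only for illustration. The rule that a deeper tier is more sensitive reflects the one embodiment described above rather than a general requirement.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """One node of the inverted data tree (hypothetical model)."""
    label: str
    tier: int  # 0 = apex, 1 = upper tier, 2 = lower tier, 3 = lowest tier
    children: List["Node"] = field(default_factory=list)


def build_example_tree() -> Node:
    # Placeholder values only; no real personal data is used.
    ssn = Node("XXX-XX-XXXX", tier=3)                    # lowest tier data
    title = Node("Engineer", tier=2, children=[ssn])     # lower tier data
    name = Node("Alice", tier=1, children=[title])       # upper tier data
    return Node("Company A", tier=0, children=[name])    # apex node


def labels_by_tier(root: Node) -> Dict[int, List[str]]:
    """Breadth-first walk that groups node labels by tier number."""
    tiers: Dict[int, List[str]] = {}
    queue = [root]
    while queue:
        node = queue.pop(0)
        tiers.setdefault(node.tier, []).append(node.label)
        queue.extend(node.children)
    return tiers


def is_more_sensitive(a: Node, b: Node) -> bool:
    """Under the embodiment above, a deeper tier means higher sensitivity."""
    return a.tier > b.tier


tree = build_example_tree()
grouped = labels_by_tier(tree)
print(grouped[1], grouped[2])  # ['Alice'] ['Engineer']
```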
With reference now to
As described in block 510, the upper tier partition server(s) can also locate, using a data relation manager in one or more of the upper tier partition servers, lower tier partition servers that handle related lower tier data (such as the data described in the example shown above in
As described in block 514, the client computer can autonomously service data action in the upper and lower tier server partitions. For example, the client computer can now automatically and/or periodically poll the upper and lower tier server partitions for changes in data, etc.
The process ends at terminator block 516, when the client computer is deregistered and decoupled from the upper and lower tier server partitions. This deregistration (deregistering the ancillary locations of data in the different tiers in different servers) and decoupling of the client computer (from the different tiered server partitions) may be performed by a deregistration and decoupling mechanism that is located in any of the upper tier partition servers (in order to control the client computer's access to such servers).
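The registration, polling, and deregistration steps above can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the `ClientRegistry` class, the version-number comparison used to detect changes, and the `fetch_version` callback are all hypothetical. For brevity the sketch keeps the registry in a single object, whereas the description places the deregistration and decoupling mechanism in the upper tier partition servers.

```python
from typing import Callable, Dict, List, Optional


class ClientRegistry:
    """Hypothetical record of the partition servers registered with a client."""

    def __init__(self) -> None:
        # server_id -> last data version seen by the client (None = never polled)
        self._servers: Dict[str, Optional[int]] = {}

    def register(self, server_id: str) -> None:
        self._servers[server_id] = None

    def deregister_all(self) -> None:
        """Decouple the client from every registered partition server."""
        self._servers.clear()

    def poll(self, fetch_version: Callable[[str], int]) -> List[str]:
        """Return the IDs of registered servers whose data version changed."""
        changed = []
        for server_id, last_seen in list(self._servers.items()):
            current = fetch_version(server_id)
            if current != last_seen:
                changed.append(server_id)
                self._servers[server_id] = current
        return changed


# Simulated server state: server_id -> current data version (invented values).
versions = {"304a": 1, "308a": 1}

registry = ClientRegistry()
registry.register("304a")   # an upper tier partition server
registry.register("308a")   # a lower tier partition server

first = registry.poll(versions.get)    # everything is new on the first poll
versions["308a"] = 2                   # lower tier data changes
second = registry.poll(versions.get)   # only the changed server is reported
registry.deregister_all()
third = registry.poll(versions.get)    # nothing left to poll

print(first, second, third)  # ['304a', '308a'] ['308a'] []
```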
As described herein, relational data can be partitioned, such that different components of the data are stored and maintained in different servers. If the relational data is organized into different hierarchies (e.g., as an inverted tree), then the top level (e.g., “employee names”) can be partitioned and stored in different servers, and lower levels (e.g., social security numbers, phone numbers, job titles, etc.) can also be partitioned and stored in different servers. The present invention provides a novel process for managing such relational data.
The example described above can be summarized as follows. A group of employee names has been divided up into many units, with each unit being stored in a different server. A client may want to update or add a first employee name to a database. To do so, the client sends the first employee's name to a first server, which recognizes the employee's name as being that of an employee of Company A. Later, the client may want to add or update a second employee's information. However, the second employee's name may be in another server (which is different from the server that stored the first employee's name). So that the client can locate the second employee's name, a partition server manager finds all servers that store names of employees of Company A. This information is passed back to the client, so the client knows where to send the information for the second employee. This information also allows the client to know which servers need to be periodically polled for changes to the employee names.
Continuing with the example, a data relation manager, which is in one or more of the servers, also locates any data that is related to the employees' names (e.g., titles, phone numbers, social security numbers, etc.). The location of this related data is also sent to the client, so the client can update this related data.
As described above, a depth manager limits how far down the tree the client is authorized to look when accessing related data. Similarly, a duration and interval manager controls how often data changes are pushed onto the client, both for the primary (upper level) data and for the related (lower level) data.
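The two controls just described can be sketched together in Python. The `AccessPolicy` class, its `max_depth` and `push_interval_s` fields, and both helper functions are hypothetical names introduced for this illustration. The description does not specify how depth or intervals are represented, so depth counted in tiers and intervals counted in seconds are assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AccessPolicy:
    """Hypothetical per-client settings for the depth and interval controls."""
    max_depth: int          # deepest tier the client may read (1 = upper tier)
    push_interval_s: float  # minimum seconds between pushes to the client


def visible_tiers(record_tiers: List[str], policy: AccessPolicy) -> List[str]:
    """Depth control: keep only the tiers at or above the authorized depth.

    record_tiers is ordered from the upper tier downward.
    """
    return [
        value
        for depth, value in enumerate(record_tiers, start=1)
        if depth <= policy.max_depth
    ]


def should_push(last_push_s: float, now_s: float, policy: AccessPolicy) -> bool:
    """Interval control: push only if enough time has passed since the last push."""
    return now_s - last_push_s >= policy.push_interval_s


policy = AccessPolicy(max_depth=2, push_interval_s=60.0)

# Upper tier (name), lower tier (title), lowest tier (social security number).
record = ["Alice", "Engineer", "XXX-XX-XXXX"]

print(visible_tiers(record, policy))    # ['Alice', 'Engineer']
print(should_push(0.0, 30.0, policy))   # False
print(should_push(0.0, 60.0, policy))   # True
```

With `max_depth=2` the client sees names and titles but not the lowest tier, which matches the embodiment in which sensitivity increases with depth.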
While the present invention has been particularly shown and described with reference to a preferred embodiment, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention. For example, while the present description has been directed to a preferred embodiment in which custom software applications are developed, the invention disclosed herein is equally applicable to the development and modification of application software. Furthermore, as used in the specification and the appended claims, the term “computer” or “system” or “computer system” or “computing device” includes any data processing system including, but not limited to, personal computers, servers, workstations, network computers, mainframe computers, routers, switches, Personal Digital Assistants (PDAs), telephones, and any other system capable of processing, transmitting, receiving, capturing and/or storing data.