The present invention relates to computer software, and in particular, to client-server systems and methods for accessing metadata information across a network.
Current metadata discovery solutions involve each individual application having its own source metadata browsing and import functionality. The metadata information is generally not accessible across an enterprise network. Each application therefore implements its own solution, which is inefficient. This also results in inconsistent and/or different source metadata information being imported from a data source individually by each application.
In an enterprise environment, there are multiple sub-divisions. For example, a corporation may have different departments such as human resources, finance, legal, etc. Each division represents a different business domain. Data is typically organized in tables by each domain. These tables are typically organized either in a hierarchical or “stateful” structure, or in a flat or “stateless” structure. Currently, browsing of metadata including, e.g., attributes, data types, data sizes, indexes, primary key columns, etc., involves scouring all of these different tables that are scattered across multiple different sub-divisions.
Thus, there is a need for a more universal and efficient solution for discovering and acquiring source metadata in an enterprise environment. The present invention solves these and other problems by providing client-server systems and methods for accessing metadata information across a network.
Embodiments of the present invention include computer-implemented systems and methods for accessing metadata across a network. A metadata server receives requests to access a data source from one or more clients. The metadata server is coupled between one or more backend servers and the clients. The backend servers may be coupled to the data sources of interest. The metadata server provides a metadata service proxy for establishing communications with the backend servers and for signaling the backend servers to establish connections to data sources. Connection to the data sources may be stateful or stateless. For stateless data sources, the metadata server may dynamically create reusable metadata service provider proxies that receive metadata from metadata service providers on the backend servers. For stateful data sources, unique metadata service provider proxies may be dynamically created and used to service client requests.
In one embodiment, the present invention includes a computer-implemented method of accessing metadata across a network comprising opening first and second connection paths from first and second clients, respectively, to a data source through a metadata server, browsing metadata from the data source by the first and second clients through the first and second connection paths, respectively, dynamically generating a first proxy for the first client at the metadata server for communicating with the data source, proxying a first request from the first client for a first subset of data from the data source over the first connection path through the first proxy, receiving, at the metadata server, metadata from said data source in response to the first request, sending to the first client a first response to the first request including the metadata, during the sending of the first response to the first request, receiving a second request from the second client for a second subset of data from the data source over the second connection path, and sending to the second client a response to the second request.
In one embodiment, the first and second connection paths share a first sub-connection path between the metadata server and the data source.
In one embodiment, the first sub-connection path comprises a second sub-connection path between the first proxy and the data source.
In one embodiment, the first and second connection paths comprise stateful connections to the data source including unique proxies for establishing data source connections, the method further comprising dynamically generating a second proxy for the second client at the metadata server for communicating with the data source, and proxying the second request from the second client for a second subset of data from the data source over the second connection path through the second proxy.
In one embodiment, the method further comprises auto-expiring the first proxy that has timed out and making the first proxy available for servicing a new connection request.
In one embodiment, the first and second connection paths comprise stateless connections to the data source through the first proxy, the method further comprising dynamically assigning the first proxy to the second client at the metadata server for communicating with the data source, and proxying the second request from the second client for a second subset of data from the data source over the second connection path through the first proxy.
In one embodiment, metadata from the data source is received in the metadata server in a plurality of chunks, and wherein at least a portion of the metadata chunks are stored in a buffer.
In one embodiment, the first proxy services the second request from the second client and metadata from the buffer is sent to the first client by said metadata server.
In one embodiment, the method further comprises load balancing data source connections across a plurality of remote servers, wherein a first remote server includes a first plurality of metadata service providers holding connections to a first data source, the first plurality of metadata service providers being in communication with a corresponding first plurality of dynamically generated metadata service provider proxies in said metadata server, and wherein a second remote server includes a second plurality of metadata service providers holding connections to a second data source, the second plurality of metadata service providers being in communication with a corresponding second plurality of dynamically generated metadata service provider proxies in said metadata server, and wherein said metadata server receives a request to access the first data source, and based on the loading of said first remote server, generates a new metadata service provider proxy on the metadata server and signals the second remote server to generate a new metadata service provider for connecting to said first data source and servicing said request through the new metadata service provider proxy.
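By way of a non-limiting illustration, the load-balancing decision may be sketched as follows. The sketch assumes a simple "fewest active service providers" policy, which the description does not mandate; the function name `pick_backend` and the server labels are hypothetical.

```python
from typing import Dict


def pick_backend(active_providers: Dict[str, int]) -> str:
    """Return the backend server with the fewest active service providers.

    The policy is an assumption made for illustration; ties go to the
    server listed first.
    """
    if not active_providers:
        raise ValueError("no backend servers available")
    return min(active_providers, key=active_providers.get)


# Hypothetical load figures: number of active service providers per server.
load = {"backend-1": 5, "backend-2": 2}
target = pick_backend(load)
load[target] += 1  # the new service provider is created on the chosen server
print(target)
```

In this sketch a request for the first data source is directed to the second server because the first server is more heavily loaded, mirroring the example in which the second remote server generates the new metadata service provider.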
In one embodiment, a first connection request is received by a metadata service proxy in the metadata server, and wherein the metadata service proxy generates said first proxy in response to said request, wherein said first proxy is a first service provider proxy, and wherein a first service provider is further generated on a second server for connecting to said data source and sending metadata to said first service provider proxy.
In one embodiment, a first connection request is received by a metadata service proxy in the metadata server, and wherein the metadata service proxy sends an open connection request to a metadata service on a second server coupled to the data source if a connection to the data source has not been created, and wherein the metadata service proxy sends a request to retrieve metadata from the data source through the first proxy if the connection to the data source has been created.
In one embodiment, the present invention includes a computer system for accessing metadata across a network comprising a server computer system and a metadata server software component executing one or more of the software components described below for performing the methods described herein.
In one embodiment, one or more computer-readable media are also provided having embedded therein computer-readable code to program one or more processors to perform any of the methods described herein for accessing metadata information across a network.
The following detailed description and accompanying drawings provide a better understanding of the nature and advantages of the present invention.
Described herein are client-server systems and methods for accessing metadata over a network using proxies. The apparatuses, methods, and techniques described below may be implemented as a computer program (software) executing on one or more computers, such as a server. The computer program may further be stored on a computer readable medium. The computer readable medium may include instructions for performing the processes described below. In the following description, for purposes of explanation, numerous examples and specific details are set forth in order to provide a thorough understanding of the present invention. It will be evident, however, to one skilled in the art that the present invention as defined by the claims may include some or all of the features in these examples alone or in combination with other features described below, and may further include modifications and equivalents of the features and concepts described herein.
Client 110 may send a request 180 to browse metadata from a data source to the web service 120. The metadata service proxy 140 of the metadata server 122 of the web service 120 communicates connection information, which may include a request for creating new service providers, to metadata services 160-162 of backend servers 130-132. It also forwards requests for metadata to service provider proxy 150, gets back the metadata response, and returns it to the client. For example, client 110 may send a request to metadata server 122 to access a data source, and if no existing connection has been opened to the data source, then metadata service proxy 140 will service this request by sending a request to metadata service 160 to create service provider 170, and metadata service proxy 140 will create the metadata service provider proxy 150 for the service provider 170. Once the service provider 170 and its proxy 150 have been created, proxy 150 may send an open connection request to metadata service provider 170 to open the data source connection, for example. In certain embodiments, connection information may include a connection string for accessing a particular data source. A data source connection string may include information about a data source, such as server name, user id, and/or password, for example. When a client wants to browse metadata of a data source, the client may send data source connection string information to metadata server 122, which will then forward the connection string to metadata service proxy 140, which may generate a metadata service provider proxy 150 and the metadata service provider 170 by means of metadata service 160. Metadata service provider proxy 150 may send a command to open the data source to metadata service provider 170. Metadata service provider 170 may create a connection to the data source 180 in response to the command.
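By way of a non-limiting illustration, the open-connection sequence may be sketched as follows. All class and method names are hypothetical, in-process method calls stand in for the network channels, and no real data source is contacted.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ServiceProvider:
    """Backend-side object that would hold the physical data source connection."""
    connection_string: str = ""
    is_open: bool = False

    def open(self, connection_string: str) -> None:
        # A real provider would open a data source connection here.
        self.connection_string = connection_string
        self.is_open = True


@dataclass
class MetadataService:
    """Backend-side service that creates service providers on request."""
    providers: List[ServiceProvider] = field(default_factory=list)

    def create_provider(self) -> ServiceProvider:
        provider = ServiceProvider()
        self.providers.append(provider)
        return provider


@dataclass
class ServiceProviderProxy:
    """Metadata-server-side stand-in for one backend service provider."""
    provider: ServiceProvider

    def open_connection(self, connection_string: str) -> None:
        # Forward the open command to the paired provider.
        self.provider.open(connection_string)


@dataclass
class MetadataServiceProxy:
    """Metadata-server-side entry point for connection requests."""
    service: MetadataService

    def open_connection(self, connection_string: str) -> ServiceProviderProxy:
        provider = self.service.create_provider()  # ask the backend for a provider
        proxy = ServiceProviderProxy(provider)     # create the matching proxy
        proxy.open_connection(connection_string)   # proxy tells the provider to connect
        return proxy


service = MetadataService()
# The connection string below is a made-up example.
proxy = MetadataServiceProxy(service).open_connection("server=db1;user=sa")
print(proxy.provider.is_open)
```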
As described in more detail below, the request from client 110 may be examined by metadata server 122 to determine if the data source to be accessed is a stateful or stateless data source, and a metadata service provider and corresponding metadata service provider proxy may be shared across multiple clients if the data source is a stateless data source and multiple clients may each use a unique metadata service provider and corresponding metadata service provider proxy to access a data source if the data source is stateful. Once a connection to the data source is established, the metadata service provider proxy 150 of the metadata server 122 communicates specific metadata object requests and responses respectively to and from metadata service provider 170. The web service 120 returns a metadata response 190 to the client 110.
In
The web service 120 may delegate a client's XML requests to the metadata server 122 and/or also deliver XML responses from the metadata server 122 back to the client 110.
The metadata service proxy 140 may communicate with the metadata service 160 on the side of the backend server 130 to create metadata service providers 170-172 when required or requested.
The metadata service provider proxy 150 may forward XML requests for opening or closing data source connections, or for browsing the data sources 180-182 for metadata, to a metadata service provider on the server side. The provider proxy 150 may also return XML responses received from the server side.
The metadata services 160-162 on the server side may be responsible for managing the life cycle of the metadata service providers 170-172. It is to be understood that a metadata service on a backend server may generate multiple metadata service providers for accessing the same or different data sources and servicing one or more clients.
The metadata service providers 170-172 may hold actual physical connections to the data sources. The metadata service providers 170-172 may process requests from the client 110 and return responses in, e.g., XML.
Information that may be communicated as part of requests may include any of the following:
The information that may be communicated as part of responses to the requests may include any of the following:
In one embodiment, a service provider may service a client request for opening a connection and browsing metadata from a data source. Physical connection to a data source may be held by a service provider (e.g., providers 170-171 in
The metadata server 122 may balance the load of requests it receives for opening connections to data sources by distributing the open requests among the back end servers. To do so, it may maintain information about the number of active back end servers that have been created, the number of active service providers on each of those back end servers, and the type of data sources to which they are connected. For example referring to
A stateful connection may include a connection to a data source for which browsing requests are associated with some state information. For example, a connection to SAP may be considered to be a stateful connection, because in SAP certain data store objects are grouped based on different application domains. Browsing the data source will therefore involve some kind of state information such as which level in the hierarchy the user is browsing. Referring to
A stateless connection may include a connection to a data source for which browsing requests are not associated with any state information. For example, a connection to Oracle may be considered to be a stateless connection, because in Oracle there is generally no hierarchical grouping of data store objects. A significant distinction between stateful and stateless connections is that a stateless connection can be shared while a stateful connection may not generally be shared.
Metadata server 410 may store information for managing connection paths between one or more clients and one or more data sources through service provider proxies and service providers. For example, metadata server 410 may include a Client ID Table 424 and a connection string table 422. Client ID Table 424 may hold information about the client ID and the corresponding service provider proxy servicing this client. Connection string table 422 may be used for handling stateless connections, for example. The connection string table may contain information about the connection string of the data source and a list of service providers holding those connections so that service providers may be shared by multiple clients, for example. Additionally, as described in more detail below, metadata server 410 may include an Idle List and a BufferHolder Table. The Idle List may contain the list of service provider proxies that are in idle state and are available for servicing any new open connection requests. The BufferHolder Table may contain a Client ID and a reference to the buffer where chunked responses for a client can be found, as described in more detail below.
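By way of a non-limiting illustration, the distinction may be sketched as a lookup that decides whether a connection can be shared. The table keys, the function names, and the choice to treat unknown data source types as stateful are assumptions made for this sketch.

```python
from enum import Enum


class ConnectionKind(Enum):
    STATEFUL = "stateful"
    STATELESS = "stateless"


# Hypothetical lookup keyed by data source type; the two entries follow the
# SAP and Oracle examples in the text, and the key names are assumptions.
CONNECTION_KINDS = {
    "sap": ConnectionKind.STATEFUL,
    "oracle": ConnectionKind.STATELESS,
}


def connection_kind(source_type: str) -> ConnectionKind:
    """Return the connection kind for a data source type.

    Unknown types are treated as stateful (an assumption): never sharing
    a connection is the safer default.
    """
    return CONNECTION_KINDS.get(source_type.lower(), ConnectionKind.STATEFUL)


def can_share(source_type: str) -> bool:
    """Only stateless connections may be shared between clients."""
    return connection_kind(source_type) is ConnectionKind.STATELESS


print(can_share("Oracle"), can_share("SAP"))
```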
As mentioned above, the Client ID table 424 may associate a particular client with a particular metadata service provider proxy. The connection string table 422 maintains a mapping between particular connection strings and particular metadata service provider proxies. Using this information, each client's service provider proxy and connection may be tracked and managed.
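By way of a non-limiting illustration, the bookkeeping tables may be sketched as plain dictionaries and lists. The class and field names are hypothetical, proxies and buffers are referred to by string name for brevity, and the connection string table is simplified to map each connection string to a single proxy.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ServerTables:
    """Bookkeeping kept by the metadata server (illustrative only)."""
    client_ids: Dict[str, str] = field(default_factory=dict)          # client ID -> proxy
    connection_strings: Dict[str, str] = field(default_factory=dict)  # connection string -> proxy
    idle_list: List[str] = field(default_factory=list)                # idle proxies
    buffer_holders: Dict[str, str] = field(default_factory=dict)      # client ID -> buffer

    def proxy_for_client(self, client_id: str) -> str:
        """Route a browse request: find the proxy serving this client."""
        return self.client_ids[client_id]


tables = ServerTables()
tables.connection_strings["X"] = "Proxy 1"
tables.client_ids["C1"] = "Proxy 1"
print(tables.proxy_for_client("C1"))
```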
As there may be multiple backend servers and/or multiple data sources, it is advantageous for different clients to be able to connect to a data source through the same metadata server 410 to access many backend data services containing the data and metadata that the different clients may be looking for.
The metadata server 410 manages connections to backend data services such as 406 and perhaps others by using one or more proxies. For example, as mentioned above, metadata service proxy 416 is used by metadata server 410 to communicate with service 412 to create service provider 414, which will create a physical connection to the data source for which the client may browse metadata. Metadata service proxy 416 may also create service provider proxy 418. For the example of stateless connections illustrated at
As mentioned above, metadata server 410 determines and opens an efficient path to backend source 406 through one or more proxies. Additionally, metadata server 410 may store connection information in the connection string table 422 along with the reference to service provider proxy and store a client ID in table 424 with the reference to the associated service provider proxy. That is, by referring to tables 422 and 424, and for example buffer holder table 644 which is described below, metadata server 410 is able to more efficiently service the requests 425 of client C1 and other client requests, rather than determining and opening connections to backend data sources and potentially many others each time a new request is received with overlapping connection information. Many other criteria may be included in the set of tables that include tables 422, 424 and 644, e.g., that may be specific to certain applications, protocols, bandwidths, ISPs, securities, passwords, encryption services, selective content blockers, or government, work-related or voluntary monitoring programs, among potentially other criteria specifically being run at, for or in relation to one or more clients and/or at backend servers 406 and/or some required or otherwise used intermediate entity.
In the example of the connection path between client C1 402 and backend server 406 through metadata server 410 of
In this example, client C1 402 sends a request across communication channel 425 to the data services web app 404. The channel 425 may use simple object access protocol or SOAP over the hyper text transfer protocol or HTTP, for example. The client C1 402 then expects to receive a response to the request from the data services web app over the same channel 425. Metadata service proxy 416 of metadata server 410 communicates with metadata service 412 of backend data service 406 by connection 428 to create metadata service provider/provider proxy pairs. Metadata service proxy 416 then routes requests through service provider proxy 418, across channel 426, to metadata service provider 414 to open a connection and return metadata from the data source. The metadata server 410 may communicate with client C1 402 through web service plugin 408 over channel 432, which may involve XML strings. XML strings refer to requests and responses for XML information which are stored as strings.
The metadata service proxy 416 has advantageous functionality. For example, proxy 416 can be used to launch backend 406 to create metadata service 412 in the backend. Proxy 416 can also handle requests for opening connections to the data sources. Proxy 416 can send a message to metadata service 412 to create a new service provider 414 as described above. Proxy 416 can also create the service provider proxy 418. The service provider 414 then can open an actual physical connection to the data source. The proxy 416 can also forward metadata browsing requests to corresponding service provider proxies by accessing a Client ID table to determine the proxy. Proxy 416 may further receive connection requests from other clients and use table 422 to determine if an existing connection may be shared using an existing provider proxy/provider pair.
The metadata service provider proxy 418 also has advantageous functionality. One function of service provider proxy 418 is to act as a proxy for service provider 414 and to forward requests to service provider 414 and to return back responses received from service provider 414 to proxy 416.
The metadata service 412 is configured to service requests from proxy 416 for creating one or more metadata service providers 414. Metadata service 412 also manages terminations for service provider 414.
The metadata service provider 414 contains the actual physical connection to the data source. Metadata service provider 414 opens connections to the data source, browses the data source for data source objects, and retrieves metadata information of data source objects. Metadata service provider 414 also closes connections to the data source.
In general, communication channels 426 and 428 may involve communications of requests and responses as XML strings via TCP/IP or transmission control protocol over internet protocol. Also, channels 425 and 432, as well as a channel 434 between metadata service proxy 416 and metadata service provider proxy 418 of metadata server 410, may involve requests and responses with XML or extensible markup language.
In the example of
There may generally be no difference between the communication between 416-418 and 408-416. One difference between communications between 416-412 and 416-418 or 408-416 may be that communication between 416-412 occurs between two different processes that communicate over TCP/IP, whereas the communication between 416-418 or 408-416 occurs within a single process since they may be on the same server, for example.
With regard to channels 426 and 428, communications by channel 426 usually will contain requests for opening data source connections, browsing and importing metadata, and closing data source connections, while communications by channel 428 typically contain messages for creating a new service provider or shutting down a service provider.
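By way of a non-limiting illustration, the division of message types between the two channels may be sketched as follows. The XML element and attribute names and the operation names are hypothetical; only the grouping of operations per channel follows the description.

```python
import xml.etree.ElementTree as ET

# Operation names are hypothetical; the grouping follows the text.
PROVIDER_CHANNEL_OPS = {"open_connection", "browse", "import", "close_connection"}  # channel 426
SERVICE_CHANNEL_OPS = {"create_provider", "shutdown_provider"}                      # channel 428


def build_request(operation: str, client_id: str) -> str:
    """Serialize a request as an XML string (the element layout is an assumption)."""
    root = ET.Element("request", {"operation": operation, "client": client_id})
    return ET.tostring(root, encoding="unicode")


def channel_for(xml_request: str) -> int:
    """Pick the channel number that would carry a serialized request."""
    operation = ET.fromstring(xml_request).get("operation")
    if operation in PROVIDER_CHANNEL_OPS:
        return 426
    if operation in SERVICE_CHANNEL_OPS:
        return 428
    raise ValueError(f"unknown operation: {operation!r}")


print(channel_for(build_request("browse", "C1")))
print(channel_for(build_request("create_provider", "C1")))
```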
There is also a communication channel between 412 and 414. This channel may be used so service 412 can manage the life cycle of 414 (e.g., to create and shut down provider 414 on the back end). The channel 434 differs in that channel 434 involves forwarding XML requests and XML responses between service proxy 416 and provider proxy 418. These may be two-way communications, whereas the channel between service 412 and provider 414 may be one-way.
Metadata service provider proxy 418 of metadata server 410 communicates with metadata service provider 414 of backend 406 which responds to the request by sending the requested metadata information over channel 426.
For example, client C1 may send a request to metadata server 410 to browse metadata information for 100 objects in the data source to which it has opened a connection. While the metadata server is servicing client C1, a new client C2 may send a request to open a connection to the data source with exactly the same connection string “X” (e.g., database name:“payroll_database”, username:“sa”, password:“pass123”). Metadata server 410 may service the open connection request of client C2 while it simultaneously services client C1. First, client C1 402 may send a browse request. Since it is a request for browsing metadata, metadata service proxy 416 may directly look up the client ID in table 424 to find the service provider proxy 418 associated with client 402, which is used to service the request. The Client ID table may indicate that service provider proxy 418 (“Proxy 1”) should be used for client C1 402. Accordingly, provider proxy 418 will service the request by sending the request to service provider 414, which will process the request.
Metadata service provider 414 may retrieve data from the data source and send the data back in response packets (i.e., chunks). Creating response packets may include dividing the total data set into equal or approximately equal portions and sending the portions (chunks) from the metadata service provider 414 to the metadata service provider proxy 418. This process is also referred to as “chunking” or “packetizing” the data. The chunks may also contain XML information and perhaps not network related headers or pay-load information. Chunking may or may not be used in specific instances depending on the nature or size of the response that is to be communicated based on processing the client request. For responses that are sufficiently small in size, chunking may not take place.
Provider proxy 418 may receive an indicator in the first response from provider 414 that the data is being sent in chunks. Accordingly, when proxy 418 receives the response it will know whether the response it received is a chunked response or a complete response. When the response is chunked, the proxy 418 may receive the first chunk from the back end and send it to the client, thus making sure that the client receives the first chunk immediately. In the meantime, proxy 418 may perform a background operation which will quickly get all the chunks that belong to the response from the provider 414 and store them in a buffer 642 as they are received. The buffer that is created may be associated with the Client ID by adding the Client ID (“C1”) and a reference to the buffer in the Buffer Holder table 644.
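By way of a non-limiting illustration, chunking may be sketched as follows. The function name and the default chunk size are assumptions. The sketch splits on character offsets, so the pieces must be reassembled before the XML is parsed; an implementation could instead split at element boundaries.

```python
from typing import List


def chunk_response(payload: str, chunk_size: int = 1024) -> List[str]:
    """Split a response into pieces of at most chunk_size characters.

    A response no longer than chunk_size comes back as a single piece,
    which corresponds to the case where chunking does not take place.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if len(payload) <= chunk_size:
        return [payload]
    return [payload[i:i + chunk_size] for i in range(0, len(payload), chunk_size)]


# A made-up 90-character response split into pieces of at most 40 characters.
chunks = chunk_response("<object/>" * 10, chunk_size=40)
print(len(chunks))
```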
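By way of a non-limiting illustration, returning the first chunk immediately while buffering the remainder in the background may be sketched as follows. The class and method names are hypothetical, a thread stands in for the background operation, an iterator stands in for the backend provider, and the chunked-response indicator is omitted for brevity.

```python
import threading
from typing import Dict, Iterator, List


class ChunkingProxy:
    """Returns the first chunk at once and buffers the rest in the background."""

    def __init__(self) -> None:
        self.buffer_holder: Dict[str, List[str]] = {}  # client ID -> buffered chunks
        self._workers: Dict[str, threading.Thread] = {}

    def browse(self, client_id: str, chunks: Iterator[str]) -> str:
        first = next(chunks)  # the first chunk goes straight back to the client
        buffer: List[str] = []
        self.buffer_holder[client_id] = buffer
        # Drain the remaining chunks into the buffer on a background thread.
        worker = threading.Thread(target=buffer.extend, args=(chunks,))
        worker.start()
        self._workers[client_id] = worker
        return first

    def remaining(self, client_id: str) -> List[str]:
        self._workers.pop(client_id).join()  # wait until the buffer is complete
        return self.buffer_holder.pop(client_id)


proxy = ChunkingProxy()
first = proxy.browse("C1", iter(["chunk-1", "chunk-2", "chunk-3"]))
rest = proxy.remaining("C1")
print(first, rest)
```

Because the remaining chunks are served from the buffer rather than from the provider, the proxy in this sketch is free to forward another client's request as soon as the background transfer completes.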
Referring now to
Client C2 502 may browse metadata for the same data source that client C1 402 is browsing. Accordingly, client C2 502 sends an XML request for opening a data source connection. If client C2 502 sends the same connection string ‘X’ (e.g., database name:“payroll_database”, username:“sa”, password:“pass123”), then the provider proxy/provider connection to the data source may be reused by client C2 502. For example, metadata service proxy 416 may check the connection string table 422 to determine if there is an entry for the connection string received from client C2 502. Since there is a matching entry (i.e., connection string X) indicating that a service provider proxy is available for that data source connection, metadata service proxy 416 may not create any new service provider proxies. Rather, a unique ID for client C2 502 may be generated and an entry will be stored in Client ID table 424 that will contain the unique ID together with a reference to the service provider proxy that is present in the connection string table. For example, table 424 may store a unique ID for client C2 502, which is associated with provider proxy 418 (“Proxy 1”). Since proxy 1 is already associated with connection string X in table 422, requests and responses with client C2 502 may be serviced using metadata service provider proxy 418 and metadata service provider 414, thus re-using the existing connection.
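By way of a non-limiting illustration, reuse of a stateless connection may be sketched as a lookup in the connection string table followed by registration in the client ID table. The function name, the generated ID formats, and the use of string names in place of proxy objects are assumptions.

```python
import itertools
from typing import Dict, Tuple

connection_string_table: Dict[str, str] = {}  # connection string -> proxy name
client_id_table: Dict[str, str] = {}          # client ID -> proxy name
_client_numbers = itertools.count(1)
_proxy_numbers = itertools.count(1)


def open_stateless(connection_string: str) -> Tuple[str, str]:
    """Handle an open request and return (client ID, proxy name)."""
    proxy = connection_string_table.get(connection_string)
    if proxy is None:
        # No existing connection: a new provider/proxy pair would be created here.
        proxy = f"Proxy {next(_proxy_numbers)}"
        connection_string_table[connection_string] = proxy
    client_id = f"C{next(_client_numbers)}"  # unique ID generated per client
    client_id_table[client_id] = proxy
    return client_id, proxy


c1 = open_stateless("X")
c2 = open_stateless("X")  # same connection string: the existing proxy is reused
c3 = open_stateless("Y")  # different connection string: a new proxy is created
print(c1, c2, c3)
```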
In
The metadata server 810 maintains a client ID table 824. The client ID table 824 maintains a mapping between clients and the proxies that the metadata server uses to communicate with backend source 806.
In the example of the connection of client C1 802 to backend data service 806 through metadata server 804 of
In general, channels 826 and 828 may involve communications of requests and responses as XML strings via TCP/IP or transmission control protocol over internet protocol. Also, channels 825 and 832, as well as a channel 834 between metadata service proxy 816 and metadata service provider proxy 818 of metadata server 810, may involve requests and responses with XML or extensible markup language.
Referring now to
In the example of
In general, channel 926 may involve communications of requests and responses as XML strings via TCP/IP or transmission control protocol over internet protocol. Also, channels 925 and 932, as well as a channel 934 between metadata service proxy 816 and metadata service provider proxy 918 of metadata server 810, may involve requests and responses with XML or extensible markup language.
In this stateful example, two different proxies, proxy1 818 and proxy2 918, are used to communicate respectively over connections 826 and 926 to metadata service providers 814 and 914 of backend data service 806. Client ID table 924 is updated to reflect that proxy1 is used to service the request of client C1, while proxy2 is used to service the request of client C2.
For both clients C1 802 and C2 902, processing requests to access stateful data sources may be handled in the same manner. For example, client C1 802 may send a request to browse metadata for a stateful data source (e.g., SAP). Accordingly, client C1 802 sends an XML request for opening a data source connection. Metadata service proxy 816 identifies that the request is for a stateful connection (e.g., by identifying the database or other information in the request from the client), and since a stateful connection will not be shared, it processes the request as follows. First, metadata server 810 may look up entries in the idle list table, which may include a list of idle service provider proxies that have been created. If any service provider proxy is available in the idle list, then it may be removed from the idle list and used to process the open connection request. If no service provider proxies are in the idle list, then a new service provider 814 on the back end and a corresponding service provider proxy 818 on the front end may be generated for processing the open connection request. A unique ID for the client may be generated and stored as part of the client's response. For example, the unique ID and a reference to the service provider proxy may be stored in a client ID table as described above. The response may be sent back to the client and future requests processed by looking up entries in the client ID table.
In both the stateless and stateful connection examples, the number of connections to the data service 806 is reduced compared to a conventional system not utilizing metadata server 404, 804. In the stateless example, connections 428 and 426 are used to service the requests of both client C1 402 and client C2. In the stateful example, connection 828 is used to service the requests of both client C1 802 and client C2 902. As the costs of providing metadata service increase with the number of connections, expenses are reduced in both examples. Moreover, by using buffer1 642 in the stateless example, data retrieval is reduced, lowering expenses further.
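By way of a non-limiting illustration, handling of stateful open requests with an idle list may be sketched as follows. The class and method names are hypothetical, string names stand in for proxy objects, and the `release` step (returning a proxy to the idle list when a client is finished or has timed out) is an assumption about how proxies become idle.

```python
import itertools
from typing import Dict, List, Tuple


class StatefulConnections:
    """Gives every client its own proxy, reusing idle proxies when possible."""

    def __init__(self) -> None:
        self.idle_list: List[str] = []             # proxies available for reuse
        self.client_id_table: Dict[str, str] = {}  # client ID -> proxy name
        self._client_numbers = itertools.count(1)
        self._proxy_numbers = itertools.count(1)

    def open(self) -> Tuple[str, str]:
        if self.idle_list:
            proxy = self.idle_list.pop(0)  # reuse an idle proxy
        else:
            # A new provider/proxy pair would be generated here.
            proxy = f"proxy{next(self._proxy_numbers)}"
        client_id = f"C{next(self._client_numbers)}"
        self.client_id_table[client_id] = proxy
        return client_id, proxy

    def release(self, client_id: str) -> None:
        """Return the client's proxy to the idle list."""
        self.idle_list.append(self.client_id_table.pop(client_id))


connections = StatefulConnections()
c1, p1 = connections.open()
c2, p2 = connections.open()  # a second client gets its own proxy
connections.release(c1)
c3, p3 = connections.open()  # the idle proxy is reused for a new client
print(p1, p2, p3)
```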
Computer system 1010 may be coupled via bus 1005 to a display 1012, such as a cathode ray tube (CRT) or liquid crystal display (LCD), for displaying information to a computer user. An input device 1011 such as a keyboard and/or mouse is coupled to bus 1005 for communicating information and command selections from the user to processor 1001. The combination of these components allows the user to communicate with the system. In some systems, bus 1005 may be divided into multiple specialized buses.
Computer system 1010 also includes a network interface 1004 coupled with bus 1005. Network interface 1004 may provide two-way data communication between computer system 1010 and the local network 1020. The network interface 1004 may be a digital subscriber line (DSL) or a modem to provide data communication connection over a telephone line, for example. Another example of the network interface is a local area network (LAN) card to provide a data communication connection to a compatible LAN. Wireless links using radio frequency communications are another example. In any such implementation, network interface 1004 sends and receives electrical, electromagnetic, or optical signals that carry digital data streams representing various types of information.
Computer system 1010 can send and receive information, including messages or other interface actions, through the network interface 1004 to a local network, Intranet, or the Internet 1030. Software components or services described above may reside on multiple different computer systems 1010 or 1015 across a local network or servers 1031-1035 across the network. The processes described above may be implemented on one or more servers, for example. A server may transmit actions or messages from one component, through local network 1020 or Internet 1030, to network interface 1004 to a component on computer system 1010. Different processes may be implemented on any computer system and send and/or receive information across a network, for example. In one embodiment, the techniques described above may be implemented by software services on one or more servers 1031-1035, for example.
The above description illustrates various embodiments of the present invention along with examples of how aspects of the present invention may be implemented. The above examples and embodiments should not be deemed to be the only embodiments, and are presented to illustrate the flexibility and advantages of the present invention as defined by the following claims. Based on the above disclosure and the following claims, other arrangements, embodiments, implementations and equivalents will be evident to those skilled in the art and may be employed without departing from the spirit and scope of the invention as defined by the claims.
In addition, in methods that may be performed according to the claims below and/or embodiments described herein, the operations have been described in selected typographical sequences. However, the sequences have been selected and so ordered for typographical convenience and are not intended to imply any particular order for performing the operations, unless a particular ordering is expressly indicated or understood by those skilled in the art as being necessary.