The Tabular Data Stream (TDS) protocol is an application-level protocol used for the transfer of requests and responses between clients and database server systems, such as SQL Server®, in enterprise and cloud-based systems. In such systems, the client will typically establish a long-lived connection with the server. Once the connection is established using a transport-level protocol, TDS messages are used to communicate between the client and the server. The TDS session is directly tied to the transport-level session, meaning that a TDS session is established when the transport-level connection is established and the server receives a request to establish a TDS connection. It persists until the transport-level connection is terminated, such as when a TCP socket is closed. TDS does not make any assumption about the transport protocol used, but assumes that the transport protocol supports reliable, in-order delivery of the data.
TDS includes facilities for authentication and identification, channel encryption negotiation, issuing of SQL batches, stored procedure calls, returning data, and transaction manager requests. TDS packets are encapsulated in the packets built for the protocol stack used by the Net-Libraries. For example, if the system uses the TCP/IP Sockets Net-Library, then the TDS packets are encapsulated in the TCP/IP packets of the underlying protocol.
In current database systems, a physical connection generated on a client driver cannot be correlated to a connection on the database server. Therefore, users cannot match what happened with the connection generated at the client machine to what was processed on the server. In one solution, developers may attempt to gather this information using network monitor or “netmon” tracing. Netmon tracing can be used to make a best case guess of which packet sent by TDS from the client correlated to what happened on the server. However, this solution does not work when the packets pass through a proxy server or used named pipes to connect. In another solution, for SQL Azure™ and SQL Server® post connection analysis, developers may query for connection identification information from system management views and store this value on the client driver to correlate the database's physical connection with the client connection that was opened. However, this solution only works post-login or when a connection open does not abruptly fail.
This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
Embodiments provide correlated tracing of connections on client drivers to connections on the server. This tracing is accomplished using an optional pre-login data session that is added to the TDS protocol to enable correlated tracing of connections
In one embodiment, a sixteen-byte connection identifier is logged at each layer of the stack to enable correlation of connections. For example, the connection identifier may be logged using pre-login BID (Built-In Diagnostics) trace points on the client drivers, TDS protocol implementations, and server-side SNI (SQL Server Network Interface) layers. The connection identifier is exposed using APIs on the client connection object and exception object. The connection identifier sent through pre-login data packet logs into an Extended Event on the server to correlate a server connection to a client connection.
To further clarify the above and other advantages and features of embodiments of the present invention, a more particular description of embodiments of the present invention will be rendered by reference to the appended drawings. It is appreciated that these drawings depict only typical embodiments of the invention and are therefore not to be considered limiting of its scope. The invention will be described and explained with additional specificity and detail through the use of the accompanying drawings in which:
In operation, each client 201 opens a connection 207 or pipe to a database 205. For example, a user at client 201 a enters a user-identifier and password to set-up a connection to database 205n. Requests, such as SQL batch and RPC requests, may be sent to the database 205n over that connection 207. When an error occurs on a connection 207 it is difficult to troubleshoot these problems in existing systems. All activity for connection 207, such as error messages, tracing, and logging, is captured to operation store (OP store) 208. There are thousands of connections active at one time. This creates a problem for a troubleshooter who often cannot determine which connection 207 is relevant to a particular client 201.
TDS protocol 302 is an application-level protocol used for the transfer of requests and responses between client driver 301 and database engine 303 as described above. SQL engine 303 corresponds to SQL database 103 (
In a typical client application, the stack 300 is used by a client application to open a connection to a SQL Server and to execute requests. The client driver 301 first establishes a physical connection to the SQL server 303. The client driver 301 then sends a PRELOGIN packet to set up a context for login. An authentication mechanism is used to support user accounts on SQL server 303. The username and password of the user account are transmitted as part of the LOGIN message that the client sends to the server. The client 301 may have more than one connection to the server 303, but each one is established separately in the same way. After the server 303 has received the login record from the client 301, it will notify the client 301 that it has either accepted or rejected the connection request.
The client application will sometimes fail when it attempts to connect to the database. For example, the client application may fail if the network is down due to a bad router or a network hardware problem. Other failures may occur when the application running on the client machine is slow to send the request, the client machine crashes, or the client driver fails. Problems with the SQL engine may also cause failures, such as when the SQL Server is too busy to respond to a connection request. In existing systems, there is no way to determine what caused these failures. Instead, users could track a packet through each part of the stack in an attempt to identify how far a packet traveled through the system and thereby determine where a failure occurred. Because thousands of packets are sent thorough the system, it was very difficult to identify and track individual packets. Therefore, it is difficult to trouble shoot the system and to correlate what happened at the client driver to what happened at the SQL engine.
In one embodiment, an optional Trace Identifier may be added to the TDS protocol for the PRELOGIN message stream in a way that does not interfere with the protocol. The Trace Identifier acts as a “breadcrumb” so that PRELOGIN packets may be tracked through the system. When the Trace Identifier is present in the stack, this Trace Identifier is captured and persisted or logged at each layer 301-303. Each layer has different tracing mechanisms as illustrated in
In one embodiment, the Trace Identifier is a randomly generated sixteen-byte GUID (Globally Unique Identifier) referred to herein as a Client Connection ID. The Client Connection ID is added to the PRELOGIN message. When the Client Connection ID is sent and tracing is turned on within client driver 301, for example, the GUID will be found within the trace points for the client driver. Similarly, within TDS protocol 302 and SQL Engine 303 tracing, the GUID will also be found. As illustrated in
In another embodiment, the Trace Identifier is a Client Connection ID plus an Activity ID, which is a randomly generated sixteen-byte GUID plus a four-byte sequence number. The Activity ID GUID corresponds to a thread on the client machine, and the sequence number is assigned to a message sent by that thread.
The Trace ID—along with its Client Connection ID and Activity ID components—is logged by tracing in each layer of the stack. The PRELOGIN message can then be tracked across the system and through the stack by looking for the specific Trace ID assigned to the PRELOGIN stream. If the Trace ID does not appear in the trace for a particular layer, then the failure point of the PRELOGIN attempt can be identified. For example, if the Trace ID does not appear in TDS, then the user knows that the PRELOGIN packets did not reach the TDS protocol. Similarly, if the Trace ID is not found in the SQL Engine, then the user knows that the PRELOGIN stream from the client did not reach the server. This assists the user in identifying the source of a network problem.
In other embodiments, the Trace ID may be used to identify particular connections out of thousands of connections in the network, thereby allowing the user to focus on specific connections that have failed or are having problems. For example, in a cloud environment, a user can identify the Client Connection ID for a failed connection and a troubleshooter use trace data to analyze the connection. This allows a connection to be analyzed anywhere between the client application and a physical machine or a logical database residing on a machine. A client application may be written to store the Trace ID data for connections that generate error messages. That Trace ID can then be used for troubleshooting the connection.
The prelogin message is logged in one or more layers of a stack using a tracing operation in step 504. Logging of the prelogin message may occur in a client driver stack layer, a TDS protocol stack layer, and/or a database server stack layer. In step 505, the logged prelogin messages from the client driver stack layer, the TDS protocol stack layer, and/or the database server stack layer are correlated. The connection may be between the client machine and an on-premises database. Alternatively, the connection may be between the client machine and a virtual database or a physical machine supporting a database in a distributed computer system.
The trace identifier may comprise a client connection identifier combined with an activity identifier. In one embodiment, the client connection identifier is a sixteen-byte, randomly generated globally unique identifier, and the activity identifier is a sixteen-byte, randomly generated globally unique identifier combined with a sequence number.
It will be understood that steps 501-505 of the process illustrated in
The invention may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, and so forth, which perform particular tasks or implement particular abstract data types. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in local and/or remote computer storage media including memory storage devices.
With reference to
The computer 700 typically includes a variety of computer-readable media 704. Computer-readable media 704 may be any available media that can be accessed by the computer 700 and includes both volatile and nonvolatile media, and removable and non-removable media, but excludes propagated signals. By way of example, and not limitation, computer-readable media 704 may comprise computer storage media and communication media. Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can accessed by the computer 700. Communication media typically embodies computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. The term “modulated data signal” means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media. Combinations of the any of the above may also be included within the scope of computer-readable media. Computer-readable media may be embodied as a computer program product, such as software stored on computer storage media.
The data storage or system memory 702 includes computer storage media in the form of volatile and/or nonvolatile memory such as read only memory (ROM) and random access memory (RAM). A basic input/output system (BIOS), containing the basic routines that help to transfer information between elements within computer 700, such as during start-up, is typically stored in ROM. RAM typically contains data and/or program modules that are immediately accessible to and/or presently being operated on by processing unit 701. By way of example, and not limitation, data storage 702 holds an operating system, application programs, and other program modules and program data.
Data storage 702 may also include other removable/non-removable, volatile/nonvolatile computer storage media. By way of example only, data storage 702 may be a hard disk drive that reads from or writes to non-removable, nonvolatile magnetic media, a magnetic disk drive that reads from or writes to a removable, nonvolatile magnetic disk, and an optical disk drive that reads from or writes to a removable, nonvolatile optical disk such as a CD ROM or other optical media. Other removable/non-removable, volatile/nonvolatile computer storage media that can be used in the exemplary operating environment include, but are not limited to, magnetic tape cassettes, flash memory cards, digital versatile disks, digital video tape, solid state RAM, solid state ROM, and the like. The drives and their associated computer storage media, described above and illustrated in
A user may enter commands and information through a user interface 705 or other input devices such as a tablet, electronic digitizer, a microphone, keyboard, and/or pointing device, commonly referred to as mouse, trackball or touch pad. Other input devices may include a joystick, game pad, satellite dish, scanner, or the like. Additionally, voice inputs, gesture inputs using hands or fingers, or other natural user interface (NUI) may also be used with the appropriate input devices, such as a microphone, camera, tablet, touch pad, glove, or other sensor. These and other input devices are often connected to the processing unit 701 through a user input interface 705 that is coupled to the system bus 703, but may be connected by other interface and bus structures, such as a parallel port, game port or a universal serial bus (USB). A monitor 706 or other type of display device is also connected to the system bus 703 via an interface, such as a video interface. The monitor 706 may also be integrated with a touch-screen panel or the like. Note that the monitor and/or touch screen panel can be physically coupled to a housing in which the computing device 700 is incorporated, such as in a tablet-type personal computer. In addition, computers such as the computing device 700 may also include other peripheral output devices such as speakers and printer, which may be connected through an output peripheral interface or the like.
The computer 700 may operate in a networked or cloud-computing environment using logical connections 707 to one or more remote devices, such as a remote computer. The remote computer may be a personal computer, a server, a router, a network PC, a peer device or other common network node, and typically includes many or all of the elements described above relative to the computer 700. The logical connections depicted in
When used in a networked or cloud-computing environment, the computer 700 may be connected to a public or private network through a network interface or adapter 707. In some embodiments, a modem or other means for establishing communications over the network. The modem, which may be internal or external, may be connected to the system bus 703 via the network interface 707 or other appropriate mechanism. A wireless networking component such as comprising an interface and antenna may be coupled through a suitable device such as an access point or peer computer to a network. In a networked environment, program modules depicted relative to the computer 700, or portions thereof, may be stored in the remote memory storage device. It may be appreciated that the network connections shown are exemplary and other means of establishing a communications link between the computers may be used.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.