CONTINUOUS QUERY LANGUAGE (CQL) DEBUGGER IN COMPLEX EVENT PROCESSING (CEP)

Information

  • Patent Application
  • 20130014088
  • Publication Number
    20130014088
  • Date Filed
    July 07, 2011
  • Date Published
    January 10, 2013
Abstract
A method including receiving, at a computer system, debugging configuration information specifying a functional area of a data stream processing server to be debugged, is described. Furthermore, the method includes identifying, by the computer system, an object associated with the functional area that has been instantiated by the data stream processing server, determining, by the computer system, that tracing for the object is enabled to perform the debugging, and instantiating, by the computer system, a tracelet associated with the object. Further, the method includes stepping, by the computer system, through the tracelet associated with the object to debug the object, and displaying, by the computer system, a visual representation of debugging results associated with the object.
Description
BACKGROUND

The present disclosure relates in general to data logging, and in particular to the debugging of the logging of data pertaining to the operation of a data stream processing server.


Traditional database management systems (DBMSs) execute queries in a “request-response” fashion over finite, stored data sets. For example, a traditional DBMS can receive a request to execute a query from a client, execute the query against a stored database, and return a result set to the client.


In recent years, data stream management systems (DSMSs) have been developed that can execute queries in a continuous manner over potentially unbounded, real-time data streams. For example, a typical DSMS can receive one or more data streams, register a query against the data streams, and continuously execute the query as new data appears in the streams. Since this type of query (referred to herein as a “continuous query”) is long-running, the DSMS can provide a continuous stream of updated results to a client. Due to the continuous nature of such queries, debugging or diagnosing problems within continuous queries is extremely difficult. With a complex event processing (CEP) server, continuous query language (CQL) has been used in describing the continuous queries.


Currently, continuous queries can be diagnosed or debugged by logging at various levels, such as the input/output adapter, output bean, operator, store, synopsis, queue, or processing node level of the event processing network. However, this approach is neither simple nor flexible enough for properly debugging continuous queries. Typical problems with logging-based methods include too much logging data to analyze, no ability to change the state and continue, and no ability to trigger logging on a condition. Furthermore, some debugging cannot be done with logging alone, for example, debugging a pattern operator that involves complex state.


DSMSs are particularly suited for applications that require real-time or near real-time processing of streaming data, such as financial ticker analysis, physical probe/sensor monitoring, network traffic management, and the like. Many DSMSs include a server application (referred to herein as a “data stream processing server”) that is configured to perform the core tasks of receiving data streams and performing various operations (e.g., executing continuous queries) on the streams. It would be desirable to have a framework for logging data pertaining to the operation of such a data stream processing server to facilitate performance tuning, debugging, and other functions. Hence, improvements in the art are needed.


BRIEF SUMMARY

One embodiment of the invention includes a method which includes receiving, at a computer system, debugging configuration information specifying a functional area of a data stream processing server to be debugged. Furthermore, the method includes identifying, by the computer system, an object associated with the functional area that has been instantiated by the data stream processing server, determining, by the computer system, that tracing for the object is enabled to perform the debugging, and instantiating, by the computer system, a tracelet associated with the object. Further, the method includes stepping, by the computer system, through the tracelet associated with the object to debug the object, and displaying, by the computer system, a visual representation of debugging results associated with the object.


In another embodiment, a machine-readable medium is described. A machine-readable medium includes instructions for receiving debugging configuration information specifying a functional area of a data stream processing server to be debugged. Furthermore, the machine-readable medium includes instructions for identifying an object associated with the functional area that has been instantiated by the data stream processing server, determining that tracing for the object is enabled to perform the debugging, and instantiating a tracelet associated with the object. Further, the machine-readable medium includes instructions for stepping through the tracelet associated with the object to debug the object, and displaying a visual representation of debugging results associated with the object.


In a further embodiment, a system is described. The system includes a processing component configured to receive debugging configuration information specifying a functional area of a data stream processing server to be debugged, identify an object associated with the functional area that has been instantiated by the data stream processing server, determine that tracing for the object is enabled to perform the debugging, instantiate a tracelet associated with the object, step through the tracelet associated with the object to debug the object, and display a visual representation of debugging results associated with the object.





BRIEF DESCRIPTION OF THE DRAWINGS


FIGS. 1A and 1B are simplified block diagrams of a data stream management system according to an embodiment of the present invention.



FIG. 2 is a graphical representation of a query plan according to an embodiment of the present invention.



FIG. 3 is a flow diagram of a process for configuring logging in a data stream processing server according to an embodiment of the present invention.



FIG. 4 is a simplified diagram of a data structure for storing logging configuration information according to an embodiment of the present invention.



FIG. 5 is a flow diagram of a process for generating log records in a data stream processing server according to an embodiment of the present invention.



FIG. 6 illustrates a log record according to an embodiment of the present invention.



FIG. 7 is a flow diagram of a process for dynamically enabling or disabling logging of query plan objects according to an embodiment of the present invention.



FIG. 8 is a flow diagram of a process for visualizing log records according to an embodiment of the present invention.



FIG. 9 is a screen display of a log visualization user interface according to an embodiment of the present invention.



FIG. 10 is a flow diagram of another process for visualizing log records according to an embodiment of the present invention.



FIG. 11 is a flow diagram of a process for implementing a CQL debugger according to an embodiment of the present invention.



FIG. 12 is a flow diagram of another process for implementing a CQL debugger according to an embodiment of the present invention.



FIG. 13 is a simplified diagram of a data structure for implementing a CQL debugger according to an embodiment of the present invention.



FIG. 14 is a simplified block diagram of a system environment that may be used in accordance with an embodiment of the present invention.



FIG. 15 is a simplified block diagram of a computer system that may be used in accordance with an embodiment of the present invention.





DETAILED DESCRIPTION

In the following description, for the purposes of explanation, numerous details are set forth in order to provide an understanding of various embodiments of the present invention. It will be apparent, however, to one skilled in the art that certain embodiments can be practiced without some of these details.


Aspects of the present invention include a CQL debugger which introduces the following features: 1) stepping over operators in the query plan, 2) stepping into data structures of operations (e.g., enqueueing/dequeueing, insert/delete to store, synopsis, index), 3) setting breakpoints on operators in the query plan, 4) setting breakpoints on data structures, 5) setting conditional breakpoints on the timestamp or attributes of a tuple, and 6) inspecting and watching data structures of operators including store, synopsis, queue, index, stat, etc.


Embodiments of the present invention include the following aspects: a tracelet in a CQL processor engine, a trace/debug implementation in a diag module, and a communication channel from a debugger application to client applications that support debug sessions, including a visualizer, Eclipse tooling, a command line interface, etc. In one embodiment, a tracelet may be a small code segment in the trace target which is used in tracing/dumping and as a breakpoint. For trace targets including operators, data structures, etc., a tracelet may be placed such that the trace/debug module can intercept accordingly. For example, LogLevelManager.trace(LogArea.OPERATOR, LogEvent.OPERATOR_RUN_BEGIN, this, getOptName()); may be used. This embodiment may use a static function in implementing the tracelet, but the tracelet can also be dynamically injected on class loading using byte code manipulation, which removes the burden on programmers/developers of maintaining the tracelets.


In a further embodiment, when the trace/debug module receives a ‘trace’ invocation from a tracelet, it checks whether tracing or a breakpoint is set for the target. The check is done using a multi-dimensional array in order to minimize performance degradation. If tracing is set, the proper level of tracing is processed, and if a breakpoint is set, the module waits for a user to continue through a visual debugger console interface.
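
The array-based check can be illustrated with a minimal sketch. It assumes the array is indexed by area, target ID, and event, which the text does not specify; the class `TraceTable`, its enums, and the fixed target capacity are hypothetical names chosen for illustration only.

```java
import java.util.BitSet;

/** Hypothetical sketch of a constant-time trace/breakpoint lookup table. */
final class TraceTable {
    enum Area { OPERATOR, QUEUE, STORE, SYNOPSIS, INDEX }
    enum Event { RUN_BEGIN, RUN_END, ENQUEUE, DEQUEUE }

    // levels[area][targetId][event]: enabled trace levels, or null when tracing is off.
    private final BitSet[][][] levels;
    // breakpoints[area][targetId][event]: true when a breakpoint is set.
    private final boolean[][][] breakpoints;

    TraceTable(int maxTargets) {
        int areas = Area.values().length;
        int events = Event.values().length;
        levels = new BitSet[areas][maxTargets][events];
        breakpoints = new boolean[areas][maxTargets][events];
    }

    void enableLevel(Area area, int targetId, Event event, int level) {
        BitSet[] perEvent = levels[area.ordinal()][targetId];
        if (perEvent[event.ordinal()] == null) {
            perEvent[event.ordinal()] = new BitSet();
        }
        perEvent[event.ordinal()].set(level);
    }

    void setBreakpoint(Area area, int targetId, Event event, boolean on) {
        breakpoints[area.ordinal()][targetId][event.ordinal()] = on;
    }

    /** Returns the enabled levels, or null if tracing is not set for this target and event. */
    BitSet getLevels(Area area, int targetId, Event event) {
        return levels[area.ordinal()][targetId][event.ordinal()];
    }

    boolean hasBreakpoint(Area area, int targetId, Event event) {
        return breakpoints[area.ordinal()][targetId][event.ordinal()];
    }
}
```

Each lookup is three array index operations with no hashing or locking, which is one plausible reading of why such a check could stay cheap enough to leave enabled in production.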


The following shows a high level description of a trace/debug module's task upon receiving tracelet's invocation:

    Levels levels = loglevelManager.getLevels(area, target.getTargetId(), event);
    if (levels != null) {
        loglevelManager.traceLevels(area, event, target, levels, args);
    }
    Breakpoint bp = loglevelManager.getBreakpoint(area, target.getTargetId(), event);
    if (bp != null) {
        bp.wait(); // wait for next, continue
    }
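
The blocking step at the breakpoint can be sketched in runnable form. In Java, `Object.wait()` must be called while holding the object's monitor, so this sketch swaps in a `java.util.concurrent.Semaphore` instead; that substitution, the class name `BreakpointGate`, and its methods are assumptions for illustration, not the mechanism described in the source.

```java
import java.util.concurrent.Semaphore;

/** Hypothetical breakpoint gate: the engine thread parks until a debugger client resumes it. */
final class BreakpointGate {
    // Each permit stands for one pending "next" or "continue" command.
    private final Semaphore permits = new Semaphore(0);

    /**
     * Called on the tracelet path when a breakpoint is hit.
     * Blocks until resume() has been called; returns false if interrupted.
     */
    boolean await() {
        try {
            permits.acquire();
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }

    /** Called from the debugger console thread to let the engine continue. */
    void resume() {
        permits.release();
    }

    /** True while an engine thread is parked at this breakpoint. */
    boolean hasWaiter() {
        return permits.hasQueuedThreads();
    }
}
```

A semaphore also remembers a `resume()` that arrives before the engine reaches `await()`, which avoids the lost-wakeup problem a bare wait/notify pair would have.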










Trace targets may implement an IDump interface, which can provide tailored state information to debug clients. This may be particularly important for operators with complex state, such as a pattern operator. In one embodiment, the pattern operator may implement tailored state visualization logic in dumping the state so that customers can easily understand the state. Using combinations of trace, dump, and breakpoint, the features described above may be implemented. Because checking the tracing/breakpoint setup has minimal performance impact, the target application may not need to be started in a special mode, such as debug mode. Instead, customers can invoke the debugger at any time, including within the production platform.
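
A tailored state dump might look like the following sketch. The source names the IDump interface but not its methods, so the single `dump()` method is an assumption, as are the toy `PatternOperator`, its strict-contiguity matching rule, and the output format.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class DumpSketch {
    /** The method signature is an assumption; the source names only the interface. */
    interface IDump {
        String dump();
    }

    /** Toy pattern operator that tracks how far each partial match has advanced. */
    static final class PatternOperator implements IDump {
        private final String[] pattern;
        private List<Integer> partialMatches = new ArrayList<>(); // symbols matched so far
        private int completed = 0;

        PatternOperator(String... pattern) {
            if (pattern.length == 0) {
                throw new IllegalArgumentException("pattern must not be empty");
            }
            this.pattern = pattern.clone();
        }

        void onEvent(String symbol) {
            List<Integer> next = new ArrayList<>();
            for (int matched : partialMatches) {
                // A partial match survives only if this event is its next expected symbol.
                if (pattern[matched].equals(symbol)) {
                    advance(matched + 1, next);
                }
            }
            if (pattern[0].equals(symbol)) {
                advance(1, next); // this event may also start a new match
            }
            partialMatches = next;
        }

        private void advance(int matched, List<Integer> next) {
            if (matched == pattern.length) {
                completed++;
            } else {
                next.add(matched);
            }
        }

        int completedMatches() {
            return completed;
        }

        /** Renders the state as matched prefixes rather than raw indexes. */
        @Override
        public String dump() {
            List<String> prefixes = new ArrayList<>();
            for (int matched : partialMatches) {
                prefixes.add(String.join(" ", Arrays.copyOfRange(pattern, 0, matched)));
            }
            return "pattern=" + String.join(" ", pattern)
                    + " partial=" + prefixes
                    + " completed=" + completed;
        }
    }
}
```

For the pattern A B C, feeding the events A and then B yields the dump `pattern=A B C partial=[A B] completed=0`, which tells a user what the operator is still waiting for.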


The present invention offers at least the following advantages over other debuggers: 1) Some debuggers require the user to run the application in debug mode, whereas the present invention does not. 2) Various debuggers only provide information on tuples at the port level, whereas the present invention can visualize the internal state to solve more complex problems, such as the current state of pattern detection. 3) Certain debuggers only provide stepping at the operator level, whereas the present invention can step into more fine-grained levels, including data structure information, and provides more detailed state information for debugging. 4) Other debugger implementations only provide tuple-level information, whereas the present invention can provide more tailored state information that gives further insight into the problems.


Further embodiments of the present invention provide techniques for logging data pertaining to the operation of a data stream processing server. In one set of embodiments, logging configuration information can be received specifying a functional area of a data stream processing server to be logged. Based on the logging configuration information, logging can be dynamically enabled for objects associated with the functional area that are instantiated by the data stream processing server, and logging can be dynamically disabled for objects associated with the functional area that are discarded (or no longer used) by the data stream processing server. By dynamically enabling and disabling logging for specific objects in this manner, data regarding the operation of the data stream processing server can be logged without significantly affecting the server's runtime performance. In another set of embodiments, a tool can be provided for visualizing the data logged by the data stream processing server.


According to one embodiment of the present invention, a method for facilitating logging in a data stream processing server is provided. The method comprises receiving, at a computer system, logging configuration information specifying a functional area of a data stream processing server to be logged, and identifying, by the computer system, an object associated with the functional area that has been instantiated by the data stream processing server. The method further comprises enabling, by the computer system, logging for the object, and determining, by the computer system, if the object is no longer used by the data stream processing server. If the object is no longer used, logging is disabled by the computer system for the object.


In one embodiment, enabling logging for the object comprises storing the logging configuration information for the object and generating one or more log records for the object based on the logging configuration information stored for the object.


In one embodiment, disabling logging for the object comprises deleting the logging configuration information stored for the object.


In one embodiment, the logging configuration information includes a first parameter identifying an event upon which to generate a log record and a second parameter identifying a level of detail for the log record. In this embodiment, generating one or more log records for the object comprises, upon occurrence of a predefined event related to the object, retrieving the logging configuration information stored for the object and determining if the predefined event corresponds to the event identified by the first parameter. If the predefined event corresponds to the event identified by the first parameter, a log record is generated for the object, where the generated log record has the level of detail identified by the second parameter.
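
The two-parameter configuration and the event check can be sketched as follows. All names (`LogConfigSketch`, `Event`, the integer levels, and the record format) are hypothetical; the sketch only mirrors the steps described in the preceding paragraphs: store configuration to enable logging, look it up when an event occurs, and delete it to disable logging.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class LogConfigSketch {
    enum Event { RUN_BEGIN, ENQUEUE, DEQUEUE }

    /** First parameter: the event to log on. Second parameter: the level of detail. */
    static final class LogConfig {
        final Event event;
        final int level;

        LogConfig(Event event, int level) {
            this.event = event;
            this.level = level;
        }
    }

    // Stands in for the stored logging configuration, keyed by object ID.
    private final Map<String, LogConfig> configByObjectId = new HashMap<>();

    /** Enabling logging stores the configuration for the object. */
    void enable(String objectId, Event event, int level) {
        configByObjectId.put(objectId, new LogConfig(event, level));
    }

    /** Disabling logging deletes the stored configuration. */
    void disable(String objectId) {
        configByObjectId.remove(objectId);
    }

    /** Returns a log record only if the occurred event matches the configured one. */
    Optional<String> onEvent(String objectId, Event occurred, String detail) {
        LogConfig config = configByObjectId.get(objectId);
        if (config == null || config.event != occurred) {
            return Optional.empty();
        }
        // Illustrative rule: level 2 and above appends the detail payload.
        String record = objectId + " " + occurred;
        if (config.level >= 2) {
            record += " " + detail;
        }
        return Optional.of(record);
    }
}
```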


In one embodiment, the functional area to be logged corresponds to a type of query plan object. In this embodiment, identifying an object associated with the functional area comprises traversing a query plan generated for a continuous query, where the query plan includes a plurality of query plan objects, and identifying a query plan object in the plurality of query plan objects having the type. Further, determining if the object is no longer used comprises determining if the continuous query is dropped.


In one embodiment, the plurality of query plan objects includes an operator object and one or more data structure objects associated with the operator object. In a further embodiment, if logging is enabled for the operator object, logging is automatically enabled for the one or more data structure objects associated with the operator object.
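
Plan traversal with automatic enabling for an operator's data structures can be sketched as below. The tree shape, the `PlanObject` fields, and the method `idsToLog` are assumptions made for illustration; the source does not define how a query plan is represented in code.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class PlanTraversalSketch {
    enum Type { OPERATOR, QUEUE, STORE, SYNOPSIS }

    /** Node of a toy query plan: operators have input operators and attached data structures. */
    static final class PlanObject {
        final String id;
        final Type type;
        final List<PlanObject> inputs = new ArrayList<>();
        final List<PlanObject> dataStructures = new ArrayList<>();

        PlanObject(String id, Type type) {
            this.id = id;
            this.type = type;
        }
    }

    /** Returns the IDs for which logging would be enabled when `wanted` is the functional area. */
    static Set<String> idsToLog(PlanObject root, Type wanted) {
        Set<String> ids = new LinkedHashSet<>();
        collect(root, wanted, ids);
        return ids;
    }

    private static void collect(PlanObject node, Type wanted, Set<String> ids) {
        if (node.type == wanted) {
            ids.add(node.id);
        }
        for (PlanObject ds : node.dataStructures) {
            if (wanted == Type.OPERATOR && node.type == Type.OPERATOR) {
                ids.add(ds.id); // logging an operator also covers its data structures
            } else if (ds.type == wanted) {
                ids.add(ds.id);
            }
        }
        for (PlanObject input : node.inputs) {
            collect(input, wanted, ids);
        }
    }
}
```

Requesting the OPERATOR type therefore returns operators together with their queues, stores, and synopses, while requesting QUEUE returns only queue objects.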


In one embodiment, the method above further comprises identifying another object associated with the functional area, where the another object was instantiated by the data stream processing server subsequently to receiving the logging configuration information, and enabling logging for the another object.


In one embodiment, the logging configuration information is received from a user and is expressed as a Continuous Query Language (CQL) statement. In another embodiment, the logging configuration information is received via an invocation of a Java Management Extensions (JMX) Applications Programming Interface (API).


According to another embodiment of the present invention, a machine-readable storage medium having stored thereon program code executable by a computer system is provided. The program code includes code that causes the computer system to receive logging configuration information specifying a functional area of a data stream processing server to be logged, and code that causes the computer system to identify an object associated with the functional area that has been instantiated by the data stream processing server. The program code further comprises code that causes the computer system to enable logging for the object, code that causes the computer system to determine if the object is no longer used by the data stream processing server, and code that causes the computer system to, if the object is no longer used, disable logging for the object.


According to another embodiment of the present invention, a logging system is provided. The logging system comprises a processing component configured to receive logging configuration information specifying a functional area of a data stream processing server to be logged and to identify an object associated with the functional area that has been instantiated by the data stream processing server. The processing component is further configured to enable logging for the object and to determine if the object is no longer used by the data stream processing server. If the object is no longer used, the processing component is configured to disable logging for the object.


According to another embodiment of the present invention, a method for visualizing log records is provided. The method comprises receiving, at a computer system, a file comprising log records generated by a data stream processing server, where the log records include information pertaining to a query plan and a sequence of one or more events executed by the data stream processing server in accordance with the query plan. The method further comprises generating, by the computer system, a graphical representation of the query plan based on the log records, and displaying, by the computer system, the graphical representation.


In one embodiment, the graphical representation of the query plan comprises one or more nodes, where each node represents a query plan object in the query plan. Examples of query plan objects include operators, queues, stores, indexes, synopses, etc.


In one embodiment, the method above further comprises, in response to a user input, displaying data information for a node.


In one embodiment, the method above further comprises, in response to a first user input, visually portraying execution of the one or more events in sequence by animating the graphical representation, where visually portraying execution of the one or more events in sequence comprises visually portraying execution of the one or more events in real-time based on timestamps associated with the one or more events. In a further embodiment, the method above further comprises, in response to a second user input, pausing the animation.
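
Timestamp-driven replay with pausing can be sketched as follows. The millisecond unit, the speed factor, and the names `ReplaySketch`, `delaysMillis`, and `Player` are assumptions; the sketch computes only the schedule and leaves the drawing to a user interface.

```java
final class ReplaySketch {
    /**
     * Delay before animating each event, relative to the previous event.
     * A speed of 1.0 reproduces the recorded gaps; 2.0 plays twice as fast.
     */
    static long[] delaysMillis(long[] timestampsMillis, double speed) {
        if (speed <= 0) {
            throw new IllegalArgumentException("speed must be positive");
        }
        long[] delays = new long[timestampsMillis.length];
        for (int i = 1; i < timestampsMillis.length; i++) {
            // Out-of-order timestamps are clamped to a zero delay.
            long gap = Math.max(0, timestampsMillis[i] - timestampsMillis[i - 1]);
            delays[i] = Math.round(gap / speed);
        }
        return delays;
    }

    /** Tracks the replay position and honors a pause toggle from the user. */
    static final class Player {
        private final int eventCount;
        private int next = 0;
        private boolean paused = false;

        Player(int eventCount) {
            this.eventCount = eventCount;
        }

        void togglePause() {
            paused = !paused;
        }

        /** Returns the index of the event to animate next, or -1 when paused or finished. */
        int advance() {
            if (paused || next >= eventCount) {
                return -1;
            }
            return next++;
        }
    }
}
```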


In one embodiment, the method above further comprises, if the log records indicate that an error occurred during execution of an event in the one or more events, displaying a representation of the error in the graphical representation.


In one embodiment, the method above further comprises providing the one or more events as one or more data streams to another data stream processing server and receiving a continuous query to be executed against the one or more data streams. The continuous query can then be executed by the another data stream processing server while the graphical representation is being animated.


In one embodiment, the method above further comprises, if a result for the continuous query is received from the another data stream processing server, pausing the animation. In another embodiment, the method above further comprises, if a result for the continuous query is received from the another data stream processing server, displaying an alert.


According to another embodiment of the present invention, a machine-readable storage medium having stored thereon program code executable by a computer system is provided. The program code includes code that causes the computer system to receive a file comprising log records generated by a data stream processing server, where the log records include information pertaining to a query plan and a sequence of events executed by the data stream processing server in accordance with the query plan. The program code further comprises code that causes the computer system to generate a graphical representation of the query plan based on the log records and code that causes the computer system to display the graphical representation.


According to another embodiment of the present invention, a log visualization system is provided. The log visualization system comprises a storage component configured to store a file comprising log records generated by a data stream processing server, where the log records include information pertaining to a query plan and a sequence of events executed by the data stream processing server in accordance with the query plan. The log visualization system further comprises a processing component in communication with the storage component, where the processing component is configured to generate a graphical representation of the query plan based on the log records and display the graphical representation.


A further understanding of the nature and advantages of the embodiments disclosed herein can be realized by reference to the remaining portions of the specification and the attached drawings.


Embodiments of the present invention provide techniques for logging data pertaining to the operation of a data stream processing server. In one set of embodiments, logging configuration information can be received specifying a functional area of a data stream processing server to be logged. Based on the logging configuration information, logging can be dynamically enabled for objects associated with the functional area that are instantiated by the data stream processing server, and logging can be dynamically disabled for objects associated with the functional area that are discarded (or no longer used) by the data stream processing server. By dynamically enabling and disabling logging for specific objects in this manner, data regarding the operation of the data stream processing server can be logged without significantly affecting the server's runtime performance.


In certain embodiments, the functional area specified in the logging configuration information can correspond to a type of query plan object, where a query plan object is a component of a query plan, and where a query plan is a data structure used by the data stream processing server to execute a continuous query. Examples of query plan object types include “operator,” “queue,” “store,” “synopsis,” “index,” and the like. In these embodiments, logging can be dynamically enabled or disabled for query plan objects having the specified type based on query plan changes in the data stream processing server. For instance, in one set of embodiments, logging can be dynamically enabled for query plan objects having the specified type that are instantiated upon generation of a new query plan. In another set of embodiments, logging can be dynamically disabled for query plan objects having the specified type that are discarded upon the deletion of an existing query plan.


In one set of embodiments, a tool can be provided for visualizing log records that are generated for query plan objects according to the techniques noted above. For example, the tool can receive log records containing data regarding one or more events executed by the query plan objects in accordance with a query plan. The tool can then generate a visual representation of the query plan and animate, in real-time, the visual representation to illustrate the execution of the events. Such a tool can be useful for administrators, developers, and other users in understanding and analyzing the log records.



FIG. 1A is a simplified block diagram of a data stream management system (DSMS) 100 according to an embodiment of the present invention. DSMS 100 can be implemented in software, hardware, or a combination thereof. Unlike traditional DBMSs, DSMS 100 can process queries in a continuous manner over potentially unbounded, real-time data streams. To facilitate this processing, DSMS 100 can include a server application (e.g., data stream processing server 102) that is configured to receive one or more input data streams (e.g., streams 104, 106), execute continuous queries against the input data streams, and generate one or more output data streams of results (e.g., streams 108, 110).


In one set of embodiments, server 102 can log data pertaining to its runtime operation. For example, in particular embodiments, server 102 can log data pertaining to query plan objects that are used by the server to execute continuous queries. This logged information can then be used by, e.g., an administrator or other user of server 102 to debug errors or analyze performance problems that may have arisen during query execution. This logging capability is described in greater detail below.



FIG. 1B is a simplified block diagram illustrating a more detailed view of DSMS 100 and data stream processing server 102 according to an embodiment of the present invention. As shown, server 102 can comprise a plurality of software components including a query manager 112, a log manager 114, a plan monitor 116, and log targets 118.


In various embodiments, query manager 112 can receive continuous queries from, e.g., a client application or a user and generate query plans for executing the queries. As described above, a continuous query is a query that can be run in a continuous or persistent fashion against one or more data streams. A query plan is a data structure comprising one or more objects (referred to herein as “query plan objects”) that can be used by server 102 to execute a continuous query. In some embodiments, query manager 112 can generate a separate query plan for each received query. In other embodiments, query manager 112 can maintain a single, global query plan for multiple queries.


By way of example, FIG. 2 is a graphical representation of a query plan 200 that can be generated by query manager 112 for a continuous query. As shown, query plan 200 can include a plurality of query plan objects 202-238 arranged in a hierarchical fashion. In certain embodiments, each query plan object can correspond to a software object (e.g., a JAVA or C++ object) that can be invoked to perform one or more actions. When input data (e.g., input data streams 104, 106 of FIG. 1A) is passed through plan 200 and query plan objects 202-238 are invoked in the specified order, the continuous query associated with plan 200 can be executed.


In one set of embodiments, each query plan object can have a particular type that indicates its functional role within the plan. For example, query plan objects 202-212 are “operator” objects that are configured to carry out specific operations, or steps, in the overall execution of the continuous query. Query plan 200 can also include various other types of query plan objects such as “store” objects 214-218, “queue” objects 220-228, and “synopsis” objects 230-238. Generally speaking, store, queue, and synopsis objects are data structure objects that can be associated with one or more operator objects and can be used to maintain an operator object's state and/or manage data flow into (or out of) an operator object. For instance, in the embodiment of FIG. 2, operator object 210 can be associated with a store object 218, queue objects 224-228, and synopsis objects 234-238.


Once a query plan (such as plan 200) has been generated for a continuous query, query manager 112 (or another component of server 102) can execute the continuous query using the query plan. For example, with respect to query plan 200, query manager 112 can invoke the various query plan objects 202-238 according to the hierarchical ordering of plan 200 and thereby execute the associated query.


Returning to FIG. 1B, log manager 114 can facilitate the logging of various functional areas of server 102. In one set of embodiments, log manager 114 can receive logging configuration information specifying a particular functional area of server 102. This information can be received, for example, from a user via a user interface or from a client application via an invocation of an Application Programming Interface (API). Upon receiving the logging configuration information, log manager 114 can store (in, e.g., log configuration database 120) a copy of the logging configuration information for one or more software objects associated with the specified area that have been instantiated by server 102. This stored information can then be accessed by log manager 114 at runtime of server 102 to generate log records for each object.


For example, at runtime of server 102, the various software objects used by the server (e.g., log targets 118) can invoke log manager 114 upon the occurrence of certain predefined events. In response, log manager 114 can determine, based on the logging configuration information stored in log configuration database 120, whether logging has been enabled for those log targets. If log manager 114 determines that logging has been enabled for a particular log target 118, log manager 114 can instruct the log target to generate a log record and store the record in log record database 122.


In some embodiments, the functional area specified in the logging configuration information received by log manager 114 can correspond to a type of query plan object, such as “operator,” “queue,” “store,” “synopsis,” and so on. In these embodiments, log manager 114 can interoperate with plan monitor 116 to identify query plan objects that have been instantiated by query manager 112 (via, e.g., the generation of query plans). Specifically, log manager 114 can send the logging configuration information to plan monitor 116, which is configured to traverse the query plans generated by query manager 112 and identify query plan objects having the specified type. Plan monitor 116 can then return IDs for the identified query plan objects to log manager 114, which can store the IDs with the logging configuration information in log configuration database 120. In this manner, logging can be enabled for these specific query plan objects.


At runtime of server 102, the query plan objects used by the server (e.g., for executing continuous queries) can invoke log manager 114 upon the occurrence of certain predefined events. In response, log manager 114 can determine, based on the logging configuration information stored in log configuration database 120, whether logging has been enabled for those query plan objects. If logging has been enabled for a particular query plan object, log manager 114 can instruct the query plan object to generate a log record and store the record in log record database 122.


In one set of embodiments, plan monitor 116 can, upon receipt of the logging configuration information from log manager 114, keep track of “change management information” in change management database 124. As used herein, “change management information” refers to changes that should be made to the information stored in log configuration database 120 in the event that new query plan objects are instantiated (e.g., via the generation of new query plans) or existing query plan objects are discarded or rendered obsolete (e.g., via the deletion of existing query plans) by query manager 112.


For example, assume the logging configuration information specifies that logging should be enabled for all operator-type query plan objects, and assume that there are currently two operator objects (having IDs O1 and O2) instantiated in the server. In this case, the change management information can specify that the logging configuration information should be added to log configuration database 120 for any new operator objects subsequently instantiated by query manager 112. Further, the change management information can specify that the logging configuration information stored in log configuration database 120 for operator objects O1 and O2 should be deleted if either of these objects is discarded or rendered obsolete by query manager 112.


Once the change management information described above has been stored in change management database 124, plan monitor 116 can be automatically notified of any query plan changes by query manager 112. For example, query manager 112 can notify plan monitor 116 when a new query plan is generated, or when an existing query plan is discarded. Plan monitor 116 can then determine, based on the change management information stored in change management database 124, if any changes need to be applied to log configuration database 120. If changes need to be made (e.g., logging configuration information needs to be added or deleted for a specific query plan object), plan monitor 116 can instruct log manager 114 to apply those changes. In this manner, logging can be dynamically enabled and disabled for query plan objects in response to query plan changes.


It should be appreciated that FIGS. 1A and 1B are illustrative and not intended to limit embodiments of the present invention. For example, DSMS 100 and server 102 may each have other capabilities or include other components that are not specifically described. One of ordinary skill in the art will recognize many variations, modifications, and alternatives.



FIG. 3 is a flow diagram of a process 300 for configuring logging in a data stream processing server according to an embodiment of the present invention. In one set of embodiments, process 300 can be carried out by log manager 114, plan monitor 116, and query manager 112 of FIG. 1B to enable logging of query plan objects used by server 102. Process 300 can be implemented in hardware, software, or a combination thereof. As software, process 300 can be encoded as program code stored on a machine-readable storage medium.


At blocks 302 and 304, query manager 112 can receive a continuous query and generate a query plan for the query. As described above, a query plan is a data structure comprising one or more objects (query plan objects) that can be used (by, e.g., server 102) to execute a continuous query. In certain embodiments, the processing of blocks 302 and 304 can be repeated continuously as new queries are received.


Concurrently with blocks 302 and 304, log manager 114 can receive logging configuration information specifying a type of query plan object to be logged (block 306). In one set of embodiments, the logging configuration information can be received from a user of server 102 via, e.g., a user interface. In these embodiments, the logging configuration information can be expressed as a Continuous Query Language (CQL) statement. In other embodiments, the logging configuration information can be received from a client application or some other automated process via, e.g., an invocation of an Application Programming Interface (API) such as a Java Management Extensions (JMX) API.


In one set of embodiments, the logging configuration information received at block 306 can include at least three parameters: <AREA>, <EVENT>, and <LEVEL>. The <AREA> parameter can specify an identifier (ID) of a particular functional area of server 102 to be logged. For example, in the context of query plan objects, the <AREA> parameter can specify an ID of a particular query plan object type to be logged, such as “operator,” “store,” “queue,” “synopsis,” and the like. In some embodiments, the <AREA> parameter can also specify an ID of a “subtype,” where the subtype represents another level of granularity within the specified area. For example, if the specified area is “operator,” the <AREA> parameter can also include a subtype of “binjoin,” “timewindow,” or other subtypes of operator objects.


The <EVENT> parameter can specify an ID of an event, or operation, upon which logging should occur. In other words, the <EVENT> parameter can indicate when a log record should be generated for the specified area. In one set of embodiments, the permissible ID values for the <EVENT> parameter can vary based on the area specified via the <AREA> parameter. For example, if the specified area is “operator” (denoting the “operator” query plan object type), the permissible ID values for <EVENT> may be limited to those events that are typically carried out by operator objects, such as “begin execution” and “end execution.” As another example, if the specified area is “queue” (denoting the “queue” query plan object type), the permissible ID values for <EVENT> may be limited to those events that are typically carried out by queue objects, such as “enqueue” and “dequeue.”


The <LEVEL> parameter can specify an ID indicating the desired level of detail, or verbosity, of the generated log record. Like the <EVENT> parameter, the permissible ID values for the <LEVEL> parameter can vary based on the area specified via the <AREA> parameter. Further, the meaning of a particular level ID may be different based on the specified area. For example, a level ID of “1” may denote a certain level of detail for the “queue” object type and a different level of detail for the “operator” object type.


In some embodiments, if the area specified via the <AREA> parameter corresponds to the operator object type, certain ID values for the <LEVEL> parameter can cause the generated log record to include information about data structure objects (e.g., stores, queues, synopses, etc.) associated with the operator object. In this manner, logging can be enabled for a plurality of related query plan objects via a single configuration command.
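Merely by way of example, the three parameters described above can be sketched as a small record with a validation step. The following Python sketch is illustrative only: the names LoggingConfig, PERMISSIBLE_EVENTS, and is_valid are hypothetical, and the event names are limited to the examples given in the text rather than a complete set.

```python
from dataclasses import dataclass

# Hypothetical subset of permissible event IDs per area, limited to the
# examples named in the text; a real server would define many more.
PERMISSIBLE_EVENTS = {
    "operator": {"begin execution", "end execution"},
    "queue": {"enqueue", "dequeue"},
}


@dataclass(frozen=True)
class LoggingConfig:
    area: str   # <AREA>: functional area or query plan object type
    event: str  # <EVENT>: operation upon which a log record is generated
    level: int  # <LEVEL>: level of detail; its meaning depends on the area

    def validate(self) -> None:
        """Reject events that are not permitted for the specified area."""
        allowed = PERMISSIBLE_EVENTS.get(self.area)
        if allowed is None:
            raise ValueError(f"unknown area: {self.area!r}")
        if self.event not in allowed:
            raise ValueError(
                f"event {self.event!r} is not permitted for area {self.area!r}"
            )


def is_valid(config: LoggingConfig) -> bool:
    """Return True if the configuration passes validation."""
    try:
        config.validate()
    except ValueError:
        return False
    return True
```

In this sketch, a configuration of ("queue", "enqueue", 1) is accepted, while ("queue", "begin execution", 1) is rejected because that event belongs to the operator area.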


The following is a table of ID values for the <AREA>, <EVENT>, and <LEVEL> parameters that can be recognized by log manager 114 according to an embodiment of the present invention:















AREA ID: CEP_QUEUE
  EVENT ID and DESCRIPTION:
    21 - Queue DDL
    22 - Enqueue
    23 - Dequeue
    24 - Peek
    25 - Get
  LEVEL ID and DESCRIPTION:
    1 - Metadata information such as number of readers for a writer queue, the operators involved, etc. The exact information to be logged depends on the type of the queue.
    2 - Timestamp, element kind and tuple details (only if pinned).
    3 - Timestamp, element kind and tuple details (even if unpinned).
    4 - Queue stats
    5 - List of all elements in the queue. The exact information to be logged depends on the type of the queue.

AREA ID: CEP_STORE
  EVENT ID and DESCRIPTION:
    41 - Store DDL
    42 - Insert
    43 - Delete
    44 - Get
    45 - Scan Start
    46 - Scan
    47 - Scan Stop
  LEVEL ID and DESCRIPTION:
    1 - Metadata information like number of readers/stubs, the operators invoked, etc. The exact information to be logged depends on the type of store.
    4 - Store statistics
    5 - List of all tuples/timestamps. The exact information to be logged depends on the type of store.

AREA ID: CEP_INDEX
  EVENT ID and DESCRIPTION:
    61 - Index DDL
    62 - Insert
    63 - Delete
    64 - Scan Start
    65 - Scan
    66 - Scan Stop
  LEVEL ID and DESCRIPTION:
    1 - Tuple information (only if pinned)
    2 - Tuple information (even if unpinned)
    3 - Index statistics
    4 - List of all tuples

AREA ID: CEP_SYNOPSIS
  EVENT ID and DESCRIPTION:
    81 - Synopsis DDL
    82 - Insert
    83 - Delete
    84 - Get
    85 - Scan Start
    86 - Scan
    87 - Scan Stop
  LEVEL ID and DESCRIPTION:
    1 - Metadata information like the store identifier, stub identifier, number of scans, predicates/indexes, etc. (for a relational synopsis).
    2 - Tuple information (only if pinned)
    3 - Tuple information (even if unpinned)
    4 - Store statistics
    5 - List of all tuples/timestamps
    6 - Underlying index information
    7 - List of all tuples

AREA ID: CEP_OPERATOR
  EVENT ID and DESCRIPTION:
    101 - Operator DDL
    102 - Beginning of operator execution
    103 - End of operator execution
    104 - Underlying structures (synopsis, queues, indexes, etc.) - equivalent of CEP_QUEUE, CEP_INDEX and CEP_SYNOPSIS at insert/delete
    105 - Enqueue/dequeue performed during the execution
    106 - Peeks in the input queues performed during execution
    107 - Inserts/deletes performed on the synopsis
    108 - Underlying synopsis scan
    109 - Underlying index scan
  LEVEL ID and DESCRIPTION:
    1 - Operator metadata
    2 - Operator statistics
    3 - Underlying structure statistics (e.g., input/output queues, store, synopsis)
    4 - Underlying structures - least detail (equivalent of CEP_QUEUE, CEP_INDEX, and CEP_SYNOPSIS at level that dumps tuples at insert/delete, only if pinned)
    5 - Underlying structures - more detail (equivalent of level ID 4 plus dump stats and scan)
    6 - Underlying structures - most detail (equivalent of level ID 5 plus dump the complete list at every get in the form of a get, etc.)
    7 - Detailed operator dump (this may be operator specific. For example, binjoin may decide to dump more information than streamsource).
    8 - Extremely detailed operator dump; effectively a code walkthrough.

AREA ID: CEP_QUERY_OPERATORS
  EVENT ID and DESCRIPTION:
    1 - Log all the operators for a specific query
  LEVEL ID and DESCRIPTION:
    The level will produce the same amount of logging as the logging for all the operators under consideration. All of the operators of the query can be logged. If IDs are not specified, all queries can be used.

AREA ID: CEP_SPILL
  EVENT ID and DESCRIPTION:
    121 - Garbage collection in spilling
    122 - Eviction Begin
    123 - Eviction End
  LEVEL ID and DESCRIPTION:
    1 - Eviction information
    2 - Spilling statistics
    3 - Spilling reference map

AREA ID: CEP_STORAGE
  EVENT ID and DESCRIPTION:
    141 - DB Open
    142 - DB Close
    143 - DB Read
    144 - DB Write
    145 - DB Delete
    146 - DB Transaction Begin
    147 - DB Transaction End
    148 - DB Query Begin
    149 - DB Query End
  LEVEL ID and DESCRIPTION:
    1 - DB information
    2 - DB Statistics

AREA ID: CEP_QUERY
  EVENT ID and DESCRIPTION:
    161 - Creation of query
    162 - Modification of query
    163 - Deletion of query
    164 - Start of query
    165 - End of query
  LEVEL ID and DESCRIPTION:
    1 - Query creation text and corresponding activities (e.g., create, update, drop)
    2 - Internal query metadata like Query ID, external destinations, destination views, reference functions, and reference views along with query text.
    3 - Reference count, whether read or write locked, stack trace

AREA ID: CEP_TABLE
  EVENT ID and DESCRIPTION:
    181 - Table creation
    182 - Table update
    183 - Table deletion
  LEVEL ID and DESCRIPTION:
    1 - Table creation text and corresponding activities (creation, update, deletion)
    2 - Table ID, referenced queries, whether table is silent, push source (or not), table creation text
    3 - Reference count, whether read or write locked

AREA ID: CEP_WINDOW
  EVENT ID and DESCRIPTION:
    201 - Window creation
    202 - Window deletion
  LEVEL ID and DESCRIPTION:
    1 - Window creation/deletion activity and context
    2 - Implementation class name, destination queries along with window name
    3 - Reference count, whether read or write locked

AREA ID: CEP_USERFUNCTION
  EVENT ID and DESCRIPTION:
    221 - User function creation
    222 - User function deletion
  LEVEL ID and DESCRIPTION:
    1 - User function creation text, implementation class name
    2 - Function ID, destination queries, creation text
    3 - Reference count, whether read or write locked

AREA ID: CEP_VIEW
  EVENT ID and DESCRIPTION:
    241 - Creation of view
    242 - Deletion of view
  LEVEL ID and DESCRIPTION:
    1 - Associated query information and view creation or deletion
    2 - View ID, query ID, destination queries, query information
    3 - Reference count, whether read or write locked

AREA ID: CEP_SYSTEM
  EVENT ID and DESCRIPTION:
    261 - System state creation
    262 - System state deletion
    263 - System state updation
  LEVEL ID and DESCRIPTION:
    1 - System state, creation/updation/deletion
    2 - Reference count, whether read or write locked

AREA ID: CEP_SYSTEM_STATE
  EVENT ID and DESCRIPTION:
    N/A
  LEVEL ID and DESCRIPTION:
    1 - List of queries
    2 - List of tables
    3 - List of windows
    4 - List of user functions
    5 - List of views









Once the logging configuration information is received per block 306, log manager 114 can determine, based on the <AREA> parameter in the received information, the functional area to be logged. For the purposes of process 300, it is assumed that the functional area corresponds to a type of query plan object, such as operator, queue, or the like. Log manager 114 can then send the logging configuration information to plan monitor 116 (block 308).


At block 310, plan monitor 116 can receive the logging configuration information and determine the query plan object type specified therein. Plan monitor 116 can then traverse the query plans generated by query manager 112 (at block 312) and identify query plan objects in the query plans that have the specified type (blocks 314, 316). For example, if the logging configuration information specifies the “operator” object type, plan monitor 116 can identify all of the operator objects that have been instantiated by query manager 112 and are included in one or more query plans.


Once plan monitor 116 has identified query plan objects per block 314, plan monitor 116 can return a list of IDs for the identified query plan objects to log manager 114 (blocks 316, 318). Log manager 114 can then store the object IDs along with the logging configuration information received at block 306 in a data store, such as log configuration database 120 of FIG. 1B (block 320). At runtime of server 102, this stored information can be used to generate log records for the identified query plan objects. This runtime process is discussed in greater detail with respect to FIG. 5 below.
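Merely by way of example, the traversal performed by plan monitor 116 can be sketched as follows. This Python sketch assumes, for illustration only, that a query plan is a tree of objects with child links; the names PlanObject and find_objects_of_type are hypothetical and do not appear in the embodiments described above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlanObject:
    object_id: str
    object_type: str  # e.g. "operator", "queue", "store"
    children: List["PlanObject"] = field(default_factory=list)


def find_objects_of_type(plans, object_type):
    """Walk every query plan and collect IDs of objects of the given type."""
    found = []
    seen = set()
    stack = list(reversed(plans))
    while stack:
        node = stack.pop()
        if node.object_id in seen:  # an object may be shared between plans
            continue
        seen.add(node.object_id)
        if node.object_type == object_type:
            found.append(node.object_id)
        # Push children in reverse so they are visited left to right.
        stack.extend(reversed(node.children))
    return found
```

The returned list of IDs corresponds to the list that plan monitor 116 returns to log manager 114, which can then store it alongside the logging configuration information.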


In one set of embodiments, plan monitor 116 can also store change management information in change management database 124 at block 322. As described above, this change management information can represent changes that should be made to the logging configuration information stored in log configuration database 120 (per block 320) in the event that new query plan objects are instantiated (e.g., via the generation of new query plans) or existing query plan objects are discarded or rendered obsolete (e.g., via the deletion of existing query plans) by query manager 112. Accordingly, this change management information can be used to dynamically enable or disable logging for query plan objects as query plan changes occur.


For instance, in one set of embodiments, plan monitor 116 can be automatically notified by query manager 112 when, e.g., a new query plan is generated, or when an existing query plan is discarded. Plan monitor 116 can then determine, based on the information stored in change management database 124, if any changes need to be made to the logging configuration information stored in log configuration database 120 to enable or disable logging for a particular query plan object. If a change needs to be made (e.g., logging configuration information needs to be added or deleted for a specific object), plan monitor 116 can instruct log manager 114 to apply the change. This process is described in greater detail with respect to FIG. 7 below.


It will be appreciated that process 300 is illustrative and that variations and modifications are possible. Steps described as sequential may be executed in parallel, order of steps may be varied, and steps may be modified, combined, added, or omitted. One of ordinary skill in the art would recognize many variations, modifications, and alternatives.


In some embodiments, the logging configuration information stored at block 320 of process 300 can be stored in a particular type of data structure, such as a multi-dimensional array. An example of such a multi-dimensional array 400 is illustrated in FIG. 4. As shown, multi-dimensional array 400 can include a first array 402 that is indexed by area ID. Each area ID index can correspond to a functional area that can be logged in server 102. In one set of embodiments, array 402 can include indices for various query plan object types such as operator, queue, store, synopsis, and so on.


Each value in array 402 can be a pointer to a second array 404 that is indexed by object ID. Each object ID index can correspond to a particular object instance (associated with the selected area) that can be logged by server 102.


Each value in array 404 can be a pointer to a third array 406 that is indexed by event ID. Each event ID index can correspond to a particular event that can be logged for the selected area and object.


Finally, each value in array 406 can be a pointer to a fourth array 408 that is indexed by level ID. Each level ID index can correspond to a particular level of detail for generating a log record for the selected area, object, and event. In one set of embodiments, the values in array 408 can be binary values indicating whether logging is enabled or disabled for that particular combination of [area, object, event, level]. In alternative embodiments, the values in array 408 can be booleans, strings, or any other type of value that can indicate whether logging is enabled or disabled.



FIG. 5 is a flow diagram of a process 500 for generating log records at runtime of server 102 according to an embodiment of the present invention. In one set of embodiments, process 500 can be carried out by log manager 114 and an object being used by server 102 (i.e., log target 118) after configuration process 300 has been performed. In certain embodiments, log target 118 can correspond to a query plan object being used by server 102 to execute a continuous query. Process 500 can be implemented in hardware, software, or a combination thereof. As software, process 500 can be encoded as program code stored on a machine-readable storage medium.


At block 502, log target 118 can invoke log manager 114 upon occurrence of a predetermined event and provide log manager 114 with information pertaining to the event and itself. In various embodiments, log target 118 can be preconfigured with code for invoking log manager 114 in this manner.


In some embodiments, the “predetermined event” that triggers invocation of log manager 114 can be different based on the object type of log target 118. For example, if log target 118 is an operator object, log target 118 can be preconfigured to invoke log manager 114 upon, e.g., the occurrence of “begin execution” and “end execution” events. As another example, if log target 118 is a queue object, log target 118 can be preconfigured to invoke log manager 114 upon, e.g., the occurrence of “enqueue” and “dequeue” events.


At block 504, log manager 114 can determine, from the information received from log target 118, the area ID and object ID for log target 118, as well as the event ID for the event that occurred at block 502. The area ID, object ID, and event ID can then be compared with the logging configuration information stored in log configuration database 120 to determine whether logging has been enabled for that particular combination of [area ID, object ID, event ID] (block 506). For example, if the logging configuration information is stored in the form of multi-dimensional array 400 of FIG. 4, this process can comprise accessing array 402 using the determined area ID, accessing array 404 using the determined object ID, accessing array 406 using the determined event ID, and retrieving the appropriate array 408. In this embodiment, array 408 can identify all of the levels for which logging is enabled.


If logging is not enabled for any levels corresponding to the [area ID, object ID, event ID] determined at block 504, process 500 can end (blocks 506, 508). On the other hand, if logging is enabled for one or more levels, log manager 114 can send the IDs for those levels to log target 118 (block 510). In response, log target 118 can generate a log record based on the specified levels and store the log record in log record database 122 (block 512).
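Merely by way of example, the interaction between a log target and the log manager in process 500 can be sketched as follows. This Python sketch is a simplification under stated assumptions: the class names LogManager and QueueTarget, the method names, and the record fields are hypothetical, and an in-memory list stands in for log record database 122.

```python
class LogManager:
    def __init__(self):
        self.config = {}   # (area_id, object_id, event_id) -> set of level IDs
        self.records = []  # stand-in for log record database 122

    def enable(self, area_id, object_id, event_id, level_id):
        key = (area_id, object_id, event_id)
        self.config.setdefault(key, set()).add(level_id)

    def on_event(self, target, event_id):
        # Blocks 504-506: look up the levels enabled for this combination.
        levels = self.config.get((target.area_id, target.object_id, event_id))
        if not levels:
            return None  # block 508: logging is not enabled, so stop here
        # Blocks 510-512: the target generates the record, which is stored.
        record = target.make_record(event_id, sorted(levels))
        self.records.append(record)
        return record


class QueueTarget:
    area_id = "CEP_QUEUE"

    def __init__(self, object_id, manager):
        self.object_id = object_id
        self.manager = manager
        self.items = []

    def enqueue(self, item):
        self.items.append(item)
        # Block 502: invoke the log manager when the predetermined event occurs.
        return self.manager.on_event(self, "enqueue")

    def make_record(self, event_id, levels):
        return {
            "area": self.area_id,
            "object": self.object_id,
            "event": event_id,
            "levels": levels,
            "size": len(self.items),
        }
```

In this sketch the log target always invokes the log manager, and the decision whether to log rests entirely on the stored configuration, which mirrors the division of responsibility described above.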


It will be appreciated that process 500 is illustrative and not intended to limit embodiments of the present invention. Steps described as sequential may be executed in parallel, order of steps may be varied, and steps may be modified, combined, added, or omitted. One of ordinary skill in the art would recognize many variations, modifications, and alternatives.



FIG. 6 illustrates an example log record 600 that may be generated per block 512 of process 500 according to an embodiment of the present invention. In this particular example, log record 600 was generated upon the occurrence of an “enqueue” event related to a queue object. Accordingly, log record 600 specifies an event ID (i.e., event name) of “QUEUE ENQUEUE” and a queue object ID of “11.” Log record 600 further includes data that has been logged at a plurality of different levels (level IDs 0-6). As can be seen, the data logged at each level differs in type and detail. For example, the data logged at level ID 0 (the most detailed level) includes a stack trace of an exception that occurred during the enqueue event. The data logged at other level IDs contains various other details about the enqueue event.


Although not shown in FIG. 6, in some embodiments log record 600 can also include a timestamp indicating a time at which the log record was generated or stored. Further, log record 600 can include details about the query plan associated with this particular queue object. In various embodiments, this logged information can be used to visualize the execution of events in the query plan. This visualization technique is discussed in greater detail with respect to FIGS. 8, 9, and 10 below.


It will be appreciated that log record 600 is illustrative and not intended to limit embodiments of the present invention. For example, although log record 600 is shown as being expressed according to a particular structure and using particular naming conventions, log record 600 can also be expressed in many different ways. One of ordinary skill in the art would recognize many variations, modifications, and alternatives.


As described above, in certain embodiments logging can be dynamically enabled or disabled for query plan objects based on query plan changes in server 102. FIG. 7 is a flow diagram illustrating such a process 700 according to an embodiment of the present invention. In one set of embodiments, process 700 can be carried out by query manager 112, plan monitor 116, and log manager 114 after configuration process 300 has been performed. Process 700 can be implemented in hardware, software, or a combination thereof. As software, process 700 can be encoded as program code stored on a machine-readable storage medium.


At block 702, query manager 112 can detect a change that affects one or more query plans used by server 102. For example, query manager 112 can detect when a new query plan has been generated in response to a request to add a new continuous query. Alternatively, query manager 112 can detect when an existing query plan is discarded or rendered obsolete in response to a request to drop an existing continuous query. Upon detecting a query plan change, query manager 112 can send information regarding the change to plan monitor 116. For example, this query plan change information can include IDs of new query plan objects that have been instantiated (if, e.g., a query has been added), or IDs of query plan objects that have been discarded (if, e.g., an existing query has been dropped).


At block 704, plan monitor 116 can receive the query plan change information from query manager 112. Plan monitor 116 can then determine, based on the change management information stored in change management database 124, if any changes need to be made to the logging configuration information stored in log configuration database 120 (block 706).


For example, assume the change management information specifies that the logging configuration information stored in log configuration database 120 for two objects, O1 and O2, should be deleted if either of these objects is discarded or rendered obsolete by query manager 112. Further, assume that the query plan change information received at block 704 indicates that objects O1 and O2 have, in fact, been discarded. In this case, plan monitor 116 can create a change list specifying deletion of the logging configuration information for these specific objects. In other situations, plan monitor 116 can determine that logging configuration information should be added for certain objects to log configuration database 120, and can create a change list specifying the addition of such information accordingly.


If a change needs to be made (e.g., logging configuration information needs to be added or deleted for a specific query plan object), plan monitor 116 can send a change list to log manager 114 (blocks 708, 710). Log manager 114 can then apply the changes to log configuration database 120 (block 712). Alternatively, plan monitor 116 can directly apply the changes to log configuration database 120. By modifying the stored logging configuration information in this manner, logging can be dynamically enabled or disabled for query plan objects as query plan changes occur.
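Merely by way of example, the creation and application of a change list in process 700 can be sketched as follows. In this Python sketch, the change management information is assumed to be a mapping from object type to the logging configuration to apply to new objects of that type; the function names build_change_list and apply_change_list and the tuple layout of a change entry are hypothetical.

```python
def build_change_list(rules, log_config, added, removed):
    """Derive the changes needed after a query plan change (block 706).

    rules:      object type -> logging configuration for new objects
    log_config: object ID -> logging configuration currently stored
    added:      list of (object_id, object_type) for new plan objects
    removed:    list of object IDs that were discarded
    """
    changes = []
    for object_id, object_type in added:
        if object_type in rules:
            changes.append(("add", object_id, rules[object_type]))
    for object_id in removed:
        if object_id in log_config:
            changes.append(("delete", object_id, None))
    return changes


def apply_change_list(log_config, changes):
    """Apply the change list to the stored configuration (block 712)."""
    for action, object_id, config in changes:
        if action == "add":
            log_config[object_id] = config
        else:
            log_config.pop(object_id, None)
```

Objects whose type has no rule, and discarded objects that were never configured for logging, produce no change entries, so an empty change list corresponds to the case in which no change needs to be made.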


It will be appreciated that process 700 is illustrative and not intended to limit embodiments of the present invention. Steps described as sequential may be executed in parallel, order of steps may be varied, and steps may be modified, combined, added, or omitted. One of ordinary skill in the art would recognize many variations, modifications, and alternatives.


In some situations, the logging techniques described above can create a voluminous amount of log data pertaining to the operation of server 102 that can be difficult to interpret and/or analyze. Accordingly, embodiments of the present invention can provide techniques for visualizing log records created by server 102. In certain embodiments, these visualization techniques allow an end user to graphically view a query plan that has been executed by server 102 and see the progression of operations/events that are performed by query plan objects within the query plan.



FIG. 8 is a flow diagram of a process 800 for visualizing log records according to an embodiment of the present invention. In one set of embodiments, process 800 can be carried out by a software application (e.g., Web-based application, proprietary desktop client application, etc.) that is specifically adapted to visualize log records generated by a data stream processing server such as server 102 of FIG. 1B. As software, process 800 can be encoded as program code stored on a machine-readable storage medium.


At block 802, a file can be received comprising log records generated by a data stream processing server, where the log records contain information pertaining to a query plan and a sequence of events executed by the server in accordance with the query plan. For example, the file can contain log records generated according to process 500 of FIG. 5.


At block 804, a graphical representation of the query plan can be generated based on the log records and can be displayed to an end user. In one set of embodiments, the graphical representation can resemble a tree comprising a plurality of nodes, where each node corresponds to an object (e.g., operator, queue, store, etc.) in the query plan (such as the representation of plan 200 depicted in FIG. 2).


At block 806, the graphical representation of the query plan can be animated, thereby depicting the occurrence of logged events over the course of the query's execution. For example, if the log records received at block 802 include an enqueue event and a subsequent dequeue event for a particular queue object, the occurrence of these events can be depicted and animated accordingly. In some embodiments, this animation can occur in real-time based on timestamps associated with the events in the log records. Thus, a user can understand and analyze, in a visual manner, the flow of events and data during query execution.
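Merely by way of example, the timing of a timestamp-based animation can be sketched as a playback schedule. The following Python sketch assumes that each log record carries a numeric "timestamp" field in seconds; the function name playback_schedule and the speed parameter are hypothetical.

```python
def playback_schedule(records, speed=1.0):
    """Return (delay_seconds, record) pairs for replaying logged events.

    Each delay is the gap since the previous event divided by `speed`,
    so speed=1.0 approximates real-time playback.
    """
    ordered = sorted(records, key=lambda r: r["timestamp"])
    schedule = []
    previous = None
    for record in ordered:
        if previous is None:
            delay = 0.0
        else:
            delay = (record["timestamp"] - previous) / speed
        schedule.append((delay, record))
        previous = record["timestamp"]
    return schedule
```

A visualization application could wait for each delay before drawing the corresponding event, and a speed value above 1.0 would correspond to fast-forwarding.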


In certain embodiments, the animation described at block 806 can be initiated, stopped, paused, rewound, and/or fast-forwarded according to inputs received from a user. Further, if the animation is paused, the user can inspect data related to each query plan object in the query plan. For example, in one embodiment, the user can select a particular query plan object and view information about its state, its associated data structures, etc. at that point in the query execution.


In further embodiments, various alerts and/or messages can be displayed to the user during the animation. For example, if the log records contain information about an error (such as the stack trace depicted in log record 600 of FIG. 6), an alert can be generated and displayed advising of that error.


It will be appreciated that process 800 is illustrative and not intended to limit embodiments of the present invention. Steps described as sequential may be executed in parallel, order of steps may be varied, and steps may be modified, combined, added, or omitted. One of ordinary skill in the art would recognize many variations, modifications, and alternatives.



FIG. 9 is a screen display 900 of a visualization application configured to carry out the steps of process 800. As shown, screen display 900 includes window 902 displaying a graphical representation of one or more query plans. Screen display 900 also includes a “plan component details” section 904 for displaying details about a particular query plan object.


In certain embodiments, the visualization application shown in FIG. 9 can (in addition to visualization) allow more sophisticated analyses to be performed on log records. For example, in one embodiment, the application can treat the log records as comprising one or more data streams (e.g., stream of enqueue events, stream of dequeue events, stream of insert into index events, stream of delete from index events, etc.). Accordingly, the application can provide these log records as inputs into a data stream processing server. Queries can then be run against the data streams and the results can be used by the application for various purposes. FIG. 10 is a flow diagram of such a process 1000.


At block 1002, one or more events in the log file received at block 802 of process 800 can be provided to a data stream processing server. In one set of embodiments, the data stream processing server can be embedded into the visualization application performing the steps of process 1000. Alternatively, the data stream processing server can be running in a different address space or on a different machine.


At blocks 1004 and 1006, a continuous query to be executed against the data streams can be received, and the query can be provided to the data stream processing server for processing. Merely by way of example, one such query may relate to checking the growth of a particular queue object. Another type of query may relate to correlating the size of an index to a size of a queue. Yet another type of query may relate to correlating the contents of an index to the contents of a queue. In one set of embodiments, the server can execute this query while the graphical representation of the query plan described in the log records is being animated (per block 806 of process 800).


At block 1008, a result set for the continuous query can be received from the data stream processing server. The result set can then be used to perform a specific action. For example, if the result set contains data satisfying a particular condition, the animation of the query plan can be halted, or an alert can be displayed. In this manner, the continuous query can act as a complex breakpoint condition (e.g., break playback if this condition is satisfied). A user can then inspect the contents of various query plan objects to try to determine the cause of any problems that may have occurred during query execution.
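Merely by way of example, a query that checks the growth of a particular queue object can be sketched as an incremental computation over the stream of logged events. The following Python sketch stands in for a continuous query and is not CQL; the function name queue_growth_breakpoint and the event fields "object" and "event" are hypothetical.

```python
def queue_growth_breakpoint(events, queue_id, max_size):
    """Yield (index, size) each time the tracked queue exceeds max_size.

    Events are consumed one at a time, in the manner of a continuous
    query, so results are produced as soon as the condition is satisfied.
    """
    size = 0
    for index, event in enumerate(events):
        if event["object"] != queue_id:
            continue
        if event["event"] == "enqueue":
            size += 1
        elif event["event"] == "dequeue":
            size -= 1
        if size > max_size:
            yield index, size
```

Each yielded result could be treated as a breakpoint hit, at which point the animation is halted or an alert is displayed.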


It will be appreciated that process 1000 is illustrative and not intended to limit embodiments of the present invention. Steps described as sequential may be executed in parallel, order of steps may be varied, and steps may be modified, combined, added, or omitted. One of ordinary skill in the art would recognize many variations, modifications, and alternatives.


Turning now to FIG. 11, a process 1100 is illustrated which is one implementation of a CQL debugger in CEP. At block 1102, one or more events in the log file received at block 802 of process 800 can be provided to a data stream processing server. In one set of embodiments, the data stream processing server can be embedded into the visualization application performing the steps of process 1100. Alternatively, the data stream processing server can be running in a different address space or on a different machine.


At block 1104, a continuous query to be executed against the data streams can be received, and the query can be provided to the data stream processing server for processing. Merely by way of example, one such query may relate to checking the growth of a particular queue object. Another type of query may relate to correlating the size of an index to a size of a queue. Yet another type of query may relate to correlating the contents of an index to the contents of a queue. In one set of embodiments, the server can execute this query while the graphical representation of the query plan described in the log records is being animated (per block 806 of process 800).


At block 1106, operators in the continuous query are stepped over. This allows for debugging of the operators within the query. Accordingly, once the operators have been identified, the data structures of the operators may be stepped into (block 1108). Stepping into such data structures provides the administrator with the ability to analyze bugs and other issues with the data structures, and develop solutions for such problems.


At block 1110, breakpoints on the operators in the continuous query and the data structures are set. Furthermore, conditional breakpoints based on, for example, timestamps, tuple attributes within the data streams, etc. may also be set (block 1112). Therefore, the process will be able to stop at the hard breakpoints as well as optionally stop at the conditional breakpoints depending on the conditions being met.
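Merely by way of illustration, the distinction between hard breakpoints and conditional breakpoints can be sketched as follows. The classes StreamTuple and BreakpointTable, and the use of operator names as keys, are assumptions made for this example and are not taken from the disclosure. A hard breakpoint is modeled as a condition that always holds, while a conditional breakpoint tests the timestamp or attributes of the tuple.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical tuple: a timestamp plus named attributes.
final class StreamTuple {
    final long timestamp;
    final Map<String, Object> attributes;

    StreamTuple(long timestamp, Map<String, Object> attributes) {
        this.timestamp = timestamp;
        this.attributes = attributes;
    }
}

// Hypothetical breakpoint registry keyed by operator name.
final class BreakpointTable {
    private final Map<String, Predicate<StreamTuple>> breakpoints = new HashMap<>();

    // A hard breakpoint stops on every tuple that reaches the operator.
    void setHard(String operator) {
        breakpoints.put(operator, tuple -> true);
    }

    // A conditional breakpoint stops only when the condition holds for the tuple.
    void setConditional(String operator, Predicate<StreamTuple> condition) {
        breakpoints.put(operator, condition);
    }

    // Returns true if execution should stop at this operator for this tuple.
    boolean shouldStop(String operator, StreamTuple tuple) {
        Predicate<StreamTuple> condition = breakpoints.get(operator);
        return condition != null && condition.test(tuple);
    }
}
```

A condition such as t -> t.timestamp >= 1000L would, under these assumptions, stop the process only for tuples at or after that timestamp.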


Furthermore, at block 1114, the data structures of the operators can be inspected and watched, which is made possible by the ability to step into the data structures and by the breakpoints that have been set. In one embodiment, the data structures of the operators may include store, synopsis, queue, index, stat, etc.; however, other data structures of the operators may be included in the streaming query.


At block 1116, the steps and breakpoints may be executed and as a result a graphical representation of the query plan as the query plan is being debugged may be presented. Such a graphical representation may be presented in a user interface, a mobile interface, etc. Furthermore, the interface may be interactive and provide the administrator, tester, etc. with the ability to manipulate the debugging information. Further, upon receipt of a debugging result(s), an output log of the debugging information may be produced (block 1118).


Turning now to FIG. 12, a process 1200 is illustrated which is one implementation of a CQL debugger in CEP. At block 1202, a trace and/or breakpoint invocation is received. If it is determined that tracing has been set (decision block 1204), then a tracing level is processed (block 1206). In one embodiment, the tracing level may include normal, terse, verbose initialization, verbose data, and the like. Each of the levels may provide additional or alternative tracing information.


At decision block 1208, if it is determined that one or more breakpoints have been set, then it is determined if there is an indication to continue through to the visual debugger console interface (decision block 1210). Once there is an indication to continue to the debugger console interface, then a visual representation of the debugging results is produced (block 1212).


Referring now to FIG. 13, a system 1300 is illustrated for implementing a CQL debugger in CEP. The system 1300 includes a tracelet 1310 in a CQL processor engine 1305. The system 1300 further includes a trace/debug engine 1315 which is included in the data stream processing server 102 within the DSMS 100. In one embodiment, a communication channel provides communication from trace/debug engine 1315 to client applications supporting debug sessions, including a visualizer (display device) 1330, Eclipse tooling 1325, and a command line interface 1320.


In one embodiment, a tracelet 1310 may be a small code segment in the trace target which is used in tracing/dumping and as a breakpoint. For trace targets including operators, data structures, etc., a tracelet 1310 may be placed such that the trace/debug module can intercept accordingly. For example, LogLevelManager.trace(LogArea.OPERATOR, LogEvent.OPERATOR_RUN_BEGIN, this, getOptName()); may be used. This embodiment may use a static function in implementing the tracelet, but the tracelet can also be dynamically injected on class loading using byte code manipulation, which removes the burden on programmers/developers of maintaining the tracelets.
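Merely by way of illustration, a statically placed tracelet can be sketched as follows. Only the names LogLevelManager, LogArea.OPERATOR, LogEvent.OPERATOR_RUN_BEGIN, and getOptName come from the example call above; the signature of trace, the SelectOperator class, and the recording of invocations in a list are assumptions made so that the sketch is self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

enum LogArea { OPERATOR }
enum LogEvent { OPERATOR_RUN_BEGIN }

// Stand-in for the entry point of the trace/debug engine; it only records
// each invocation so that the interception can be observed.
final class LogLevelManager {
    static final List<String> received = new ArrayList<>();

    static void trace(LogArea area, LogEvent event, Object target, Object... args) {
        received.add(area + ":" + event + ":" + Arrays.toString(args));
    }
}

// Hypothetical operator with a tracelet placed at the start of run().
final class SelectOperator {
    String getOptName() {
        return "select#1";
    }

    void run() {
        // The tracelet: one static call that lets the engine intercept this point.
        LogLevelManager.trace(LogArea.OPERATOR, LogEvent.OPERATOR_RUN_BEGIN, this, getOptName());
        // ... the operator's actual work would follow here ...
    }
}
```

The byte code injection alternative would insert an equivalent call at class-loading time instead of requiring it in the operator's source; that variant is not shown here.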


In a further embodiment, when the trace/debug engine 1315 receives a ‘trace’ invocation from tracelets 1310, it checks whether tracing or a breakpoint is set for the target. The checking is done using a multi-dimensional array in order to minimize performance degradation. If tracing is set, the proper level of tracing is processed, and if a breakpoint is set, the engine waits for the user to continue through a visual debugger console interface.
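Merely by way of illustration, an array-based check of this kind can be sketched as follows. The dimensions chosen here (area, target identifier, event), the fixed maximum number of targets, and all class and method names are assumptions; the disclosure states only that a multi-dimensional array is used. The point of the sketch is that each check is a constant-time array read with no hashing or allocation.

```java
// Hypothetical areas and events; their ordinals serve as array indices.
enum TraceArea { OPERATOR, QUEUE, INDEX }
enum TraceEvent { RUN_BEGIN, RUN_END, ENQUEUE, DEQUEUE }

final class TraceTable {
    static final int OFF = 0;   // no tracing configured for the slot

    // levels[area][targetId][event] holds the tracing level, or OFF.
    private final int[][][] levels;
    // breakpoints[area][targetId][event] is true when a breakpoint is set.
    private final boolean[][][] breakpoints;

    TraceTable(int maxTargets) {
        int areas = TraceArea.values().length;
        int events = TraceEvent.values().length;
        levels = new int[areas][maxTargets][events];
        breakpoints = new boolean[areas][maxTargets][events];
    }

    void setLevel(TraceArea area, int targetId, TraceEvent event, int level) {
        levels[area.ordinal()][targetId][event.ordinal()] = level;
    }

    void setBreakpoint(TraceArea area, int targetId, TraceEvent event, boolean on) {
        breakpoints[area.ordinal()][targetId][event.ordinal()] = on;
    }

    // Hot-path checks: plain indexed reads.
    int levelFor(TraceArea area, int targetId, TraceEvent event) {
        return levels[area.ordinal()][targetId][event.ordinal()];
    }

    boolean hasBreakpoint(TraceArea area, int targetId, TraceEvent event) {
        return breakpoints[area.ordinal()][targetId][event.ordinal()];
    }
}
```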


The trace/debug engine 1315 includes the following tasks upon receiving tracelet 1310's invocation:

Levels levels = loglevelManager.getLevels(area, target.getTargetId(), event);
if (levels != null) {
  loglevelManager.traceLevels(area, event, target, levels, args);
}
Breakpoint bp = loglevelManager.getBreakpoint(area, target.getTargetId(), event);
if (bp != null) {
  bp.wait(); // wait for next, continue
}


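Merely by way of illustration, the "wait for next, continue" step in the listing above can be sketched as follows. In Java, Object.wait() may only be called by a thread that holds the object's monitor, so the sketch wraps the wait in synchronized methods and guards it with a flag. The class name PausePoint and its methods await and resume are assumptions for this example and do not describe the Breakpoint class of the disclosure.

```java
// Hypothetical pause point: the engine thread blocks in await() until a
// debugger client calls resume().
final class PausePoint {
    private boolean released = false;

    // Called by the engine thread when the breakpoint is hit.
    synchronized void await() throws InterruptedException {
        while (!released) {   // loop guards against spurious wakeups
            wait();
        }
        released = false;     // re-arm for the next hit
    }

    // Called by the debugger console when the user chooses "next" or "continue".
    synchronized void resume() {
        released = true;
        notifyAll();
    }
}
```

Because the flag is set before notifyAll is called, a resume that arrives slightly before the engine thread begins waiting is not lost.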
In a further embodiment, trace targets may implement an IDump interface, which can provide tailored state information to debug clients. This may be particularly important for operators pertaining to complex states, such as a pattern operator. In one embodiment, the pattern operator may implement tailored state visualization logic in dumping the state so that the customers can easily understand the state. Using combinations of trace, dump, and breakpoint, the features described above may be implemented. Because the performance impact of checking the tracing/breakpoint setup is minimized, the target application may not need to be started in a special mode, such as debug mode. Instead, customers can invoke the debugger at any time, including within the production platform.
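Merely by way of illustration, a trace target providing tailored state information can be sketched as follows. The disclosure names the IDump interface but does not specify its methods, so the single dump() method shown here is an assumption, as are the PatternOperatorState class and the notion of counting partial matches per pattern state.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// The method below is an assumed shape for the interface, not its actual definition.
interface IDump {
    String dump();
}

// Hypothetical pattern operator state: how many partial matches sit in each state.
final class PatternOperatorState implements IDump {
    private final Map<String, Integer> partialMatches = new LinkedHashMap<>();

    void recordMatch(String state) {
        partialMatches.merge(state, 1, Integer::sum);
    }

    // Tailored, human-readable summary instead of a raw dump of internal fields.
    @Override
    public String dump() {
        StringBuilder sb = new StringBuilder("pattern state:");
        for (Map.Entry<String, Integer> e : partialMatches.entrySet()) {
            sb.append(' ').append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }
}
```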



FIG. 14 is a simplified block diagram illustrating a system environment 1400 that may be used in accordance with an embodiment of the present invention. As shown, system environment 1400 includes one or more client computing devices 1402, 1404, 1406, 1408 communicatively coupled with a server computer 1410 via a network 1412. In one set of embodiments, client computing devices 1402, 1404, 1406, 1408 may be configured to run one or more client applications that interact with DSMS 100 of FIGS. 1A and 1B. Further, server computer 1410 may correspond to a machine configured to run DSMS 100. Although system environment 1400 is shown with four client computing devices and one server computer, any number of client computing devices and server computers may be supported.


Client computing devices 1402, 1404, 1406, 1408 may be general purpose personal computers (including, for example, personal computers and/or laptop computers running various versions of Microsoft Windows and/or Apple Macintosh operating systems), cell phones or PDAs (running software such as Microsoft Windows Mobile and being Internet, e-mail, SMS, Blackberry, and/or other communication protocol enabled), and/or workstation computers running any of a variety of commercially-available UNIX or UNIX-like operating systems (including without limitation the variety of GNU/Linux operating systems). Alternatively, client computing devices 1402, 1404, 1406, 1408 may be any other electronic device capable of communicating over a network (e.g., network 1412 described below) with server computer 1410.


Server computer 1410 may be a general purpose computer, specialized server computer (including, e.g., a LINUX server, UNIX server, mid-range server, mainframe computer, rack-mounted server, etc.), server farm, server cluster, or any other appropriate arrangement and/or combination. Server computer 1410 may run an operating system including any of those discussed above, as well as any commercially available server operating system. Server computer 1410 may also run any of a variety of server applications and/or mid-tier applications, including web servers, Java virtual machines, application servers, database servers, and the like. As indicated above, in one set of embodiments, server computer 1410 is adapted to run one or more server and/or middle-tier components such as data stream processing server 102 of DSMS 100.


As shown, client computing devices 1402, 1404, 1406, 1408 and server computer 1410 are communicatively coupled via network 1412. Network 1412 may be any type of network that can support data communications using any of a variety of commercially-available protocols, including without limitation TCP/IP, SNA, IPX, AppleTalk, and the like. Merely by way of example, network 1412 may be a local area network (LAN), such as an Ethernet network, a Token-Ring network and/or the like; a wide-area network; a virtual network, including without limitation a virtual private network (VPN); the Internet; an intranet; an extranet; a public switched telephone network (PSTN); an infra-red network; a wireless network (e.g., a network operating under any of the IEEE 802.11 suite of protocols, the Bluetooth protocol known in the art, and/or any other wireless protocol); and/or any combination of these and/or other networks.


System environment 1400 may also include one or more databases 1414. In one set of embodiments, database 1414 can include any other database or data storage component discussed in the foregoing disclosure, such as log configuration database 102, log record database 122, and change management database 124 of FIG. 1B. Database 1414 may reside in a variety of locations. By way of example, database 1414 may reside on a storage medium local to (and/or resident in) one or more of the computers 1402, 1404, 1406, 1408, 1410. Alternatively, database 1414 may be remote from any or all of the computers 1402, 1404, 1406, 1408, 1410 and/or in communication (e.g., via network 1412) with one or more of these. In one set of embodiments, database 1414 may reside in a storage-area network (SAN) familiar to those skilled in the art. Similarly, any necessary files for performing the functions attributed to the computers 1402, 1404, 1406, 1408, 1410 may be stored locally on the respective computer and/or remotely on database 1414, as appropriate. In one set of embodiments, database 1414 is a relational database, such as Oracle 10g available from Oracle Corporation. In a particular embodiment, database 1414 is adapted to store, update, and retrieve data streams in response to CQL-formatted commands received at server computer 1410.



FIG. 15 is a simplified block diagram illustrating physical components of a computer system 1500 that may incorporate an embodiment of the present invention. In various embodiments, computer system 1500 may be used to implement any of the computers 1402, 1404, 1406, 1408, 1410 illustrated in system environment 1400 described above. As shown in FIG. 15, computer system 1500 comprises hardware elements that may be electrically coupled via a bus 1524. The hardware elements may include one or more central processing units (CPUs) 1502, one or more input devices 1504 (e.g., a mouse, a keyboard, etc.), and one or more output devices 1506 (e.g., a display device, a printer, etc.). Computer system 1500 may also include one or more storage devices 1508. By way of example, storage device(s) 1508 may include devices such as disk drives, optical storage devices, and solid-state storage devices such as a random access memory (RAM) and/or a read-only memory (ROM), which can be programmable, flash-updateable and/or the like.


Computer system 1500 may additionally include a computer-readable storage media reader 1512, a communications subsystem 1514 (e.g., a modem, a network card (wireless or wired), an infra-red communication device, etc.), and working memory 1518, which may include RAM and ROM devices as described above. In some embodiments, computer system 1500 may also include a processing acceleration unit 1516, which can include a digital signal processor (DSP), a special-purpose processor, and/or the like.


Computer-readable storage media reader 1512 can further be connected to a computer-readable storage media 1510, together (and, optionally, in combination with storage device(s) 1508) comprehensively representing remote, local, fixed, and/or removable storage devices plus storage media for temporarily and/or more permanently containing computer-readable information. Communications subsystem 1514 may permit data to be exchanged with network 1412 of FIG. 14 and/or any other computer described above with respect to system environment 1400.


Computer system 1500 may also comprise software elements, shown as being currently located within working memory 1518, including an operating system 1520 and/or other code 1522, such as an application program (which may be a client application, Web browser, mid-tier application, RDBMS, etc.). It should be appreciated that alternative embodiments of computer system 1500 may have numerous variations from that described above. For example, customized hardware might also be used and/or particular elements might be implemented in hardware, software (including portable software, such as applets), or both. Further, connection to other computing devices such as network input/output devices may be employed.


In one set of embodiments, the techniques described herein may be implemented as program code executable by a computer system (such as a computer system 1400) and may be stored on machine-readable storage media. Machine-readable storage media may include any appropriate media known or used in the art, including storage media and communication media, such as (but not limited to) volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage and/or transmission of information such as machine-readable instructions, data structures, program modules, or other data, including RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disk (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store or transmit the desired information and which can be accessed by a computer.


Although specific embodiments of the present invention have been described, various modifications, alterations, alternative constructions, and equivalents are within the scope of the invention. For example, embodiments of the present invention are not restricted to operation within certain specific data processing environments, but are free to operate within a plurality of data processing environments. Additionally, although embodiments of the present invention have been described using a particular series of transactions and steps, it should be apparent to those skilled in the art that the scope of the present invention is not limited to the described series of transactions and steps.


Further, while embodiments of the present invention have been described using a particular combination of hardware and software, it should be recognized that other combinations of hardware and software are also within the scope of the present invention. Embodiments of the present invention may be implemented only in hardware, or only in software, or using combinations thereof.


The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense. The scope of the invention should be determined not with reference to the above description, but instead should be determined with reference to the pending claims along with their full scope or equivalents.

Claims
  • 1. A method comprising: receiving, at a computer system, debugging configuration information specifying a functional area of a data stream processing server to be debugged; identifying, by the computer system, an object associated with the functional area that has been instantiated by the data stream processing server; determining, by the computer system, that tracing for the object is enabled to perform the debugging; instantiating, by the computer system, a tracelet associated with the object; stepping, by the computer system, through the tracelet associated with the object to debug the object; and displaying, by the computer system, a visual representation of debugging results associated with the object.
  • 2. The method of claim 1, wherein enabling logging for the object comprises: storing the debugging configuration information for the object; and generating one or more debugging records for the object based on the debugging configuration information stored for the object.
  • 3. The method of claim 2, wherein disabling logging for the object comprises deleting the debugging configuration information stored for the object.
  • 4. The method of claim 1, further comprising determining that one or more breakpoints are set within the object.
  • 5. The method of claim 4, further comprising continuing through each of the one or more breakpoints to debug the object.
  • 6. The method of claim 1, further comprising in response to determining that tracing for the object is enabled, setting a tracing processing level.
  • 7. The method of claim 1, wherein the stepping further comprises: stepping over operators in the object; stepping into data structures of the operators; and inspecting and watching the data structures of the operators.
  • 8. The method of claim 6, wherein a plurality of query plan objects includes an operator object and one or more data structure objects associated with the operator object.
  • 9. The method of claim 8, wherein if logging is enabled for the operator object, logging is automatically enabled for the one or more data structure objects associated with the operator object.
  • 10. The method of claim 1, further comprising: identifying another object associated with the functional area, wherein the another object was instantiated by the data stream processing server subsequently to receiving the logging configuration information; and enabling logging for the another object.
  • 11. The method of claim 1, wherein the debugging configuration information is received from a user and is expressed as breakpoint statements.
  • 12. The method of claim 1, wherein the debugging configuration information is received via an invocation of a Java Management Extensions (JMX) Applications Programming Interface (API).
  • 13. A machine-readable storage medium having sets of instructions stored thereon which when executed by a machine, cause the machine to: receive debugging configuration information specifying a functional area of a data stream processing server to be debugged; identify an object associated with the functional area that has been instantiated by the data stream processing server; determine that tracing for the object is enabled to perform the debugging; instantiate a tracelet associated with the object; step through the tracelet associated with the object to debug the object; and display a visual representation of debugging results associated with the object.
  • 14. The machine-readable storage medium of claim 13, wherein the sets of instructions further cause the machine to enable debugging for the object to further cause the machine to: store the debugging configuration information for the object; and generate one or more debugging records for the object based on the debugging configuration information stored for the object.
  • 15. The machine-readable storage medium of claim 13, wherein the sets of instructions further cause the machine to enable debugging for the object further cause the machine to set one or more breakpoints within the object, wherein the one or more breakpoints are set within a continuous query and within data structures of the object.
  • 16. The machine-readable storage medium of claim 13, wherein the functional area to be debugged corresponds to a type of query plan object.
  • 17. The machine-readable storage medium of claim 13, wherein the sets of instructions further cause the machine to execute steps and breakpoints which cause a graphical representation of the object as the object is being debugged.
  • 18. The machine-readable storage medium of claim 13, wherein the sets of instructions further cause the machine to invoke debugging of the object within a production platform.
  • 19. A system comprising: a processing component configured to: receive debugging configuration information specifying a functional area of a data stream processing server to be debugged; identify an object associated with the functional area that has been instantiated by the data stream processing server; determine that tracing for the object is enabled to perform the debugging; instantiate a tracelet associated with the object; step through the tracelet associated with the object to debug the object; and display a visual representation of debugging results associated with the object.
  • 20. The system of claim 19, further comprising a CQL processor engine including the tracelet, wherein the CQL processor engine is in communication with the processing component.
CROSS-REFERENCES TO RELATED APPLICATIONS

The present application incorporates by reference for all purposes the entire contents of the following related application: U.S. patent application Ser. No. 12/534,384, entitled LOGGING FRAMEWORK FOR A DATA STREAM PROCESSING SERVER filed on Aug. 3, 2009.