The present disclosure generally relates to web-based network applications and, more particularly, to a mechanism for selectively logging events in a distributed network processing system.
Computer systems connected by wide area networks such as the Internet have steadily evolved into vibrant mediums for information exchange. For example, social network sites are fast growing phenomena that provide an interactive medium through which users can create a network of friends for sharing personal information, as well as for exchanging digital media such as music and videos. Social network sites have become an increasingly influential part of contemporary popular culture around the world. A social network site focuses on the building and verifying of online social networks for communities of people who share interests and activities, or who are interested in exploring the interests and activities of others. Most social network services are primarily web based and provide a collection of various ways for users to interact, such as chat, messaging, email, video, voice chat, file sharing, blogging, discussion groups, and the like.
When delivering a web-based or other network application such as a social network website or a dating website, a network application server communicates with other network nodes that perform various processes such as authenticating users, accessing web services, managing payments, storing user account data, etc. During network application development, and/or if there are any problems or failures associated with delivering the web-based application to a given user, a network administrator needs to troubleshoot various physical machines where the processes are performed to determine possible root causes of the problems or failures. Troubleshooting may be time consuming and unreliable, because not all processes have associated failure logs. Many of the processes may be performed asynchronously or offline relative to a given user session and are not involved in or relied on for page generation or other synchronous process operations. Accordingly, processing of a given event by these asynchronous processes can be difficult to trace because if a process fails, it often fails in ways that are not apparent or visible to the end-user. For other processes, servers, such as HTTP servers, often maintain a variety of log information; however, the logged information typically relates to all detected events, making it difficult to track or correlate how a given client transaction is processed. In addition, especially for large network applications having high traffic volumes, the log data is not amenable to database storage and is typically maintained in large log files, which makes it difficult to search for and correlate log messages associated with the same event. Still further, for those processes that do have failure logs, the actual logs may expire after a certain amount of time.
The present invention provides a method, apparatus, and systems directed to selectively logging events and processing of events in a distributed network processing system. In particular implementations, the present invention provides a network processing system comprising an event selection process that selects events from a plurality of events for logging, and a set of distributed process modules that log messages associated with the selected events. Logging only a subset of all events allows for the logging of detailed information about how events are processed. Furthermore, the resulting reduction in data allows the logged data to be stored in a database, which facilitates correlation of messages via associated database queries, and other analysis and debugging tasks. For a given flagged event, a log flag causes other processes in the chain to log transactions associated with the event and to send a log message to a central data store. In one implementation, an event may be initiated in response to a request associated with a web-based application, and a log message may include an event identifier (ID) for the event and a transaction or transactions associated with the event. In one implementation, the central data store stores an aggregation of log messages, which may be utilized to troubleshoot failures or to monitor the performance of a web-based application.
A.1. Example Network Environment
In particular implementations, the functional nodes may perform various processes (e.g., authentication, data retrieval, etc.) and may be hosted on one or more physical servers. For example, such servers may include one or more hypertext transfer protocol (HTTP) servers 24 for interfacing with end-user clients 22, an authentication server 26 for authenticating end-user clients 22, a module executor server 27 for executing various modules for various functions (e.g., access a web service), a payment server 28 for managing payments, a database system 30 operatively coupled to one or more databases 32, and an event log database 33, etc. In one implementation, databases 32 may store various types of information such as user account information, user profile data, addresses, preferences, financial account information, etc. Databases 32 may also store content such as digital content data objects and other media assets. A content data object or a content object, in particular implementations, is an individual item of digital information typically stored or embedded in a data file or record. Content objects may take many forms, including: text (e.g., ASCII, SGML, HTML), images (e.g., jpeg, tif and gif), graphics (vector-based or bitmap), audio, video (e.g., mpeg), or other multimedia, and combinations thereof. Content object data may also include executable code objects (e.g., games executable within a browser window or frame), podcasts, etc. Event log database 33 may store logged events as described below in connection with
In one implementation, the network application server 20 is operatively coupled to a network cloud 34 via HTTP server 24 and router 36. Network cloud 34 generally represents one or more interconnected networks, over which end-user clients 22 may communicate with the HTTP server 24 to receive the web-based application. Network cloud 34 may include packet-based wide area networks (such as the Internet), private networks, wireless networks, satellite networks, cellular networks, paging networks, and the like. End-user clients 22 are operably connected to the network environment via a network service provider or any other suitable means. End-user clients may include personal computers or cell phones, as well as other types of mobile devices 23 such as laptop computers, personal digital assistants (PDAs), etc.
A.2. Example Server System Architecture
The server host systems described herein may be implemented in a wide array of computing systems and architectures. The following describes example computing architectures for didactic, rather than limiting, purposes.
The elements of hardware system 200 are described in greater detail below. In particular, network interface 216 provides communication between hardware system 200 and any of a wide range of networks, such as an Ethernet (e.g., IEEE 802.3) network, etc. Mass storage 218 provides permanent storage for the data and programming instructions to perform the above described functions implemented in the physical servers described herein, whereas system memory 214 (e.g., DRAM) provides temporary storage for the data and programming instructions when executed by processor 202. I/O ports 220 are one or more serial and/or parallel communication ports that provide communication between additional peripheral devices, which may be coupled to hardware system 200.
Hardware system 200 may include a variety of system architectures, and various components of hardware system 200 may be rearranged. For example, cache 204 may be on-chip with processor 202. Alternatively, cache 204 and processor 202 may be packed together as a “processor module,” with processor 202 being referred to as the “processor core.” Furthermore, certain embodiments of the present invention may neither require nor include all of the above components. For example, the peripheral devices shown coupled to standard I/O bus 208 may couple to high performance I/O bus 206. In addition, in some embodiments only a single bus may exist, with the components of hardware system 200 being coupled to the single bus. Furthermore, hardware system 200 may include additional components, such as additional processors, storage devices, or memories.
As discussed below, in one implementation, the operations of one or more of the physical servers described herein are implemented as a series of software routines run by hardware system 200. These software routines comprise a plurality or series of instructions to be executed by a processor in a hardware system, such as processor 202. Initially, the series of instructions may be stored on a storage device, such as mass storage 218. However, the series of instructions can be stored on any suitable storage medium, such as a diskette, CD-ROM, ROM, EEPROM, etc. Furthermore, the series of instructions need not be stored locally, and could be received from a remote storage device, such as a server on a network, via network/communication interface 216. The instructions are copied from the storage device, such as mass storage 218, into memory 214 and then accessed and executed by processor 202.
An operating system manages and controls the operation of hardware system 200, including the input and output of data to and from software applications (not shown). The operating system provides an interface between the software applications being executed on the system and the hardware components of the system. According to one embodiment of the present invention, the operating system is the FreeBSD operating system or variants of this operating system. However, the present invention may be used with other suitable operating systems, such as the Apple Macintosh Operating System, available from Apple Computer Inc. of Cupertino, Calif., UNIX operating systems, LINUX operating systems, Windows® 95/98/NT/XP/Vista operating systems available from Microsoft Corporation of Redmond, Wash., and the like. Of course, other implementations are possible. For example, the server functionalities described herein may be implemented by a plurality of server blades communicating over a backplane.
As discussed above, an event may be initiated upon receipt of an HTTP or other client request. This event may spawn one to many messages in a message processing stream of one or more processes hosted on the same or different servers. One event may split off into multiple processing streams.
In one implementation, each of the process modules A-E may be hosted by one or more servers such as those described above in connection with
In particular implementations, each process module A-E may include an application programming interface (API) layer that provides the appropriate APIs for communicating with other functional nodes and with external nodes such as web service publishers. An application programming interface (API) is a source code interface that an application, operating system, or library of a web service publisher provides to support requests made by other programs or processes, such as for web services (e.g., functionality, information, etc.). An API defines specifically how to request particular information. For example, an API may require that a request be sent to a particular destination, that the request include arguments for the information being requested (e.g., time, stock price, etc.), etc. The API may also define the rules, syntax, order of information in the request, etc., that should be included in the request. The API may also define how the request should be sent (e.g., as an HTTP request, by e-mail, etc.).
In a particular implementation, the logging functionality can be implemented as a code library defining a logging API and associated functions. During development of the process modules (e.g., process modules A, B, etc.), the software developer may use the library to embed logging commands at one or more points in the software program code that defines a given process.
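By way of illustration only, such a logging library might resemble the following minimal Python sketch. The names Event, EventLogger, and handle, the record fields, and the in-memory list standing in for the central event log store are all hypothetical and are not taken from the disclosure.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Event:
    """Hypothetical event message passed between process modules."""
    event_id: str
    log_flag: bool = False
    payload: Dict[str, str] = field(default_factory=dict)


class EventLogger:
    """Hypothetical logging API; `sink` stands in for the central event log store."""

    def __init__(self, module_name: str, sink: Callable[[dict], None]) -> None:
        self.module_name = module_name
        self.sink = sink

    def log(self, event: Event, message: str) -> bool:
        """Send a log message only for flagged events; return True if one was sent."""
        if not event.log_flag:
            return False
        self.sink({
            "event_id": event.event_id,
            "module": self.module_name,
            "message": message,
            "timestamp": time.time(),
        })
        return True


# A developer embeds logging calls at chosen points in the process code.
records: List[dict] = []
logger = EventLogger("process_module_A", records.append)


def handle(event: Event) -> None:
    logger.log(event, "received the event")
    # ... module-specific processing would happen here ...
    logger.log(event, "finished processing")


handle(Event("evt-1", log_flag=True))   # flagged: two log messages are recorded
handle(Event("evt-2", log_flag=False))  # not flagged: nothing is recorded
```

In this sketch the embedded calls are inexpensive no-ops for unflagged events, so the same program code may serve both logged and unlogged traffic.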
While the following process flow is described from the perspective of the network application server 20, the process flow may also be performed by the other process modules A-E.
If the event selection process indicates that the event should be logged, the network application server 20 sets a log flag or bit (406) in associated event messages that the network application server 20 forwards to one or more additional process modules (408). In particular implementations, a log flag is operative to cause each subsequent process module A-E in the processing paths to log data associated with each flagged event and to send one or more log messages to an event log database 33. In some implementations, log flags may not be communicated to subsequent process modules via event messages, but may be discovered by subsequent process modules through examination of common resources that associate log flags with particular events. In such an instance, the network application server or another process module could store a log flag along with an identifier or identifiers that are associated with the event.
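The flagging step may be sketched as follows, for didactic purposes only. The selection policy shown (random sampling at a configurable rate) is an assumption, since the selection criteria are not specified in the passage above; the names EventMessage, admit, should_log, and flagged_event_ids are likewise hypothetical. The sketch covers both alternatives described above: carrying the log flag in the event message, and recording it in a common resource keyed by event identifier.

```python
import random
from dataclasses import dataclass, replace
from typing import Set


@dataclass(frozen=True)
class EventMessage:
    """Hypothetical event message forwarded between process modules."""
    event_id: str
    body: str
    log_flag: bool = False


# Hypothetical common resource that associates log flags with event IDs.
flagged_event_ids: Set[str] = set()


def select_for_logging(sample_rate: float, rng: random.Random) -> bool:
    """Assumed policy: select roughly `sample_rate` of all events."""
    return rng.random() < sample_rate


def admit(msg: EventMessage, sample_rate: float, rng: random.Random) -> EventMessage:
    """Run the selection step and, if selected, flag the event both ways."""
    if select_for_logging(sample_rate, rng):
        flagged_event_ids.add(msg.event_id)   # common-resource variant
        return replace(msg, log_flag=True)    # flag-in-message variant
    return msg


def should_log(msg: EventMessage) -> bool:
    """Check performed by a downstream process module before logging."""
    return msg.log_flag or msg.event_id in flagged_event_ids


rng = random.Random(0)
flagged = admit(EventMessage("evt-1", "GET /profile"), sample_rate=1.0, rng=rng)
unflagged = admit(EventMessage("evt-2", "GET /home"), sample_rate=0.0, rng=rng)
```

A downstream module that receives a message whose flag was lost in transit can still discover that the event is flagged by consulting the common resource, which is the purpose of the second check in should_log.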
D. Logging Data Associated with Processed Events
In particular implementations, the other process modules B-E along the processing paths also log data associated with their respective processing of flagged events and send log messages to the event log database 33. In particular implementations, a given process module A-E may log data at the beginning of its related processes (e.g., “received the event”), anytime during its related processes (e.g., “accessed a web service,” “successfully retrieved information from the web service,” etc.), and/or at the end of its related processes (e.g., “sent the event to functional node B”). In one implementation, each process module may also log transaction times (e.g., time stamps) or other parameters. Accordingly, any failure along the processing paths may be ascertained by analyzing the aggregation of log messages in the event log database 33. For example, in one scenario, it may be determined that all expected process modules A-D performed their expected operations successfully, except for process module E. An administrator or automated process may perform appropriate corrective actions.
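The failure analysis described above may be sketched as follows. This is a simplified illustration that assumes a single linear processing path and a hypothetical "status" field in each log message; neither assumption comes from the disclosure, and an event that splits into multiple processing streams would call for a check per stream.

```python
from typing import Dict, List, Optional

# Hypothetical expected processing path for one kind of event.
EXPECTED_PATH = ["A", "B", "C", "D", "E"]


def first_missing_module(
    log_messages: List[Dict[str, str]],
    event_id: str,
    expected_path: List[str],
) -> Optional[str]:
    """Return the first module with no completion message for the event, if any."""
    completed = {
        m["module"]
        for m in log_messages
        if m["event_id"] == event_id and m["status"] == "done"
    }
    for module in expected_path:
        if module not in completed:
            return module
    return None


# Aggregated log messages as they might be read back from the event log store.
logs = [
    {"event_id": "evt-7", "module": "A", "status": "done"},
    {"event_id": "evt-7", "module": "B", "status": "done"},
    {"event_id": "evt-7", "module": "C", "status": "done"},
    {"event_id": "evt-7", "module": "D", "status": "done"},
    {"event_id": "evt-7", "module": "E", "status": "received"},  # never completed
    {"event_id": "evt-8", "module": "A", "status": "done"},
]

suspect = first_missing_module(logs, "evt-7", EXPECTED_PATH)
```

Here suspect identifies process module E, which logged receipt of the event but never logged completion, matching the scenario described above.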
In one implementation, the log messages stored in the event log database 33 may be sorted in various ways (e.g., by event, by functional node, user ID, requested action, IP address where originated, edit ID from ticket server, etc.). In particular implementations, the sorted log messages may be utilized for debugging problems or for other purposes such as monitoring the overall performance of the system. Because only a small subset of the events are flagged, the overall system needs to process only a small, manageable amount of logging information.
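As a non-limiting sketch of how database storage may facilitate such sorting and correlation, the following example uses Python's standard sqlite3 module with an in-memory database. The table name event_log, its columns, and the sample rows are hypothetical and serve only to illustrate the kinds of queries involved.

```python
import sqlite3

# In-memory database standing in for the event log database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE event_log ("
    "event_id TEXT, module TEXT, user_id TEXT, message TEXT, ts REAL)"
)
rows = [
    ("evt-1", "A", "u1", "received the event", 1.0),
    ("evt-1", "B", "u1", "accessed a web service", 2.5),
    ("evt-2", "A", "u2", "received the event", 3.0),
    ("evt-1", "B", "u1", "sent the event to functional node C", 4.0),
]
conn.executemany("INSERT INTO event_log VALUES (?, ?, ?, ?, ?)", rows)


def trace(event_id: str) -> list:
    """Correlate all log messages for one event, in time order."""
    cur = conn.execute(
        "SELECT module, message, ts FROM event_log "
        "WHERE event_id = ? ORDER BY ts",
        (event_id,),
    )
    return cur.fetchall()


def messages_per_module() -> list:
    """Summarize logging volume by functional node."""
    cur = conn.execute(
        "SELECT module, COUNT(*) FROM event_log GROUP BY module ORDER BY module"
    )
    return cur.fetchall()


event_trace = trace("evt-1")
volume = messages_per_module()
```

The same table could be grouped or filtered by user ID or any other logged parameter by changing the column named in the query.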
The present invention has been explained with reference to specific embodiments. For example, while embodiments of the present invention have been described as operating in connection with IP and HTTP, the present invention can be used in connection with any suitable protocol environment. Other embodiments will be evident to those of ordinary skill in the art. It is therefore not intended that the present invention be limited, except as indicated by the appended claims.