System and method for backing up distributed controllers in a data network

Abstract
A system and method for the rapid configuration and connection of a backup controller in a distributed data network such as an automated fuel distribution network. Each service-station site in the network has a site controller that supervises operations of the site components, such as the fuel dispenser and credit-card reader, communicating with them through an on-site router, or hub. The fuel-distribution site also communicates with the central network controller through the same hub. In the event of a site-controller outage, one of several spare controllers, usually co-located with the network controller, is loaded and configured to function as the site controller. It is then placed in communication with the site components via a data-network connection, such as through the Internet. The hub switches communications protocols from serial data to packets suitable for Internet communications.
Description




BACKGROUND OF THE INVENTION




1. Technical Field of the Invention




This invention relates to distributed data networks and, more particularly, to a system and method in a distributed data network of rapidly and efficiently backing up distributed controllers in the network.




2. Description of Related Art




Data networks today may be distributed over wide areas, with a plurality of site locations being linked together over the network. Each of the distributed sites may be controlled by a site controller or central processing unit (CPU) such as a personal computer (PC). For various reasons (for example, power supply failure, hard disk crash, motherboard failure, etc.), a site controller may occasionally fail. Currently, whenever a site controller fails, a network operator must locate an available service technician (and parts) to travel to the site to repair or replace the failed controller. During this time, the site is out of business. That is, the operator of the site is unable to service his customers. Site downtime could be measured in hours or even days.




In order to overcome the disadvantage of existing solutions, it would be advantageous to have a system and method for rapidly and efficiently backing up distributed controllers in the network. The invention would enable the site to continue operations while a technician is dispatched to the site for troubleshooting and repair of the failed site controller. The present invention provides such a system and method.




SUMMARY OF THE INVENTION




In one aspect, the present invention is a system in a distributed data network, for example a network of automated fuel station controllers, for rapidly and efficiently backing up distributed controllers in the network. At each distributed site, the system includes a router, a site controller connected to the router, and a plurality of site devices connected to the site controller through the router. The router, in turn, is connected through a data network to a central controller. The central controller is connected to a database of configuration data for each distributed site, and to a plurality of backup controllers.
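As a rough illustration only, the kind of topology described above might be modeled with a few simple records. The following Python sketch is not taken from the specification; every class and field name is a placeholder.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class SiteDevice:
    name: str          # e.g. "dispenser-1" or "icr-interface"
    serial_port: str   # serial line terminating on the site router (hub)

@dataclass
class DistributedSite:
    site_id: str
    router_address: str        # on-site router reachable over the data network
    controller_address: str    # on-site controller (site PC)
    devices: List[SiteDevice] = field(default_factory=list)

@dataclass
class CentralControl:
    controller_address: str                                     # central controller
    spare_addresses: List[str] = field(default_factory=list)    # pool of backup controllers
    site_configurations: Dict[str, dict] = field(default_factory=dict)  # keyed by site_id
```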




In another aspect, the present invention is a method in a distributed data network of rapidly and efficiently backing up distributed controllers in the network. The method begins when a failure of a site controller is detected. A notice of the failure is then sent to a central controller which includes a rack of spare controllers and a database of site configurations. A spare controller is selected and configured with the configuration of the troubled site. The site router at the troubled site is then reconfigured to connect the spare controller to the troubled site through the data network. The spare controller then takes over as the site controller while the faulty controller is repaired or replaced.




In yet another aspect, the present invention is a router that connects a site controller to a data network, and connects a plurality of site devices having serial interfaces to the site controller. The router may include means for detecting a failure of the site controller, or the router may receive an indication from a central controller on the network that the site controller has failed. In the event of a failure of the site controller, the router converts the serial interface data from the plurality of site devices to Internet Protocol (IP) packets and routes the packets over the data network to the central controller.




In yet another aspect, the present invention is a method of backing up an automated fueling-station controller in communication with a data network, including the step of providing at least one spare controller that is also in communication with the data network. When station-controller failure is detected, the method continues with the steps of configuring the spare controller using controller-configuration information previously stored in a database, and routing station-controller communications through the data network to the configured spare controller until the station controller is restored to service.











BRIEF DESCRIPTION OF THE DRAWINGS




The invention will be better understood and its numerous objects and advantages will become more apparent to those skilled in the art by reference to the following drawings, in conjunction with the accompanying specification, in which:





FIG. 1 is a simplified block diagram of an embodiment of the system of the present invention;


FIG. 2 is a flow chart illustrating the steps of the method of the present invention when bringing a spare controller on line;


FIG. 3 is a flow chart illustrating the steps of a recovery process when a repaired site controller is brought back on line; and


FIG. 4 is a flow chart illustrating the steps of database population in accordance with a method of the present invention.











DETAILED DESCRIPTION OF EMBODIMENTS




The present invention is a system and method in a distributed data network of rapidly and efficiently backing up distributed controllers in the network. The invention utilizes Internet technology to reduce the site downtime by facilitating the rapid configuration and connection of a backup controller. The turnaround time is reduced to several minutes as opposed to several hours or days.




All of the distributed sites in a distributed data network are connected to a central controller via, for example, the Internet or a private IP-based intranet. The solution includes a router (or hub) at each site that preferably includes an interworking function (IWF) for interfacing non-IP site devices with the IP-based data network. The site devices are connected to the router, which in turn connects to the site controller. The router is also connected through the IP data network to the central controller. The central controller is connected to a database of configuration data for each distributed site, and to a plurality of backup controllers that may be located, for example, at a help desk.




The router may include means for detecting a failure of the site controller, or the failure may be detected by the central controller. For example, the site controller may send a periodic “heartbeat” signal to the central controller indicating that it is operating normally. If the heartbeat signal stops, the central controller sends an indication to the router that the site controller has failed. Alternatively, an operator at the site may call a central help desk and report the site controller failure.
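As a concrete, purely illustrative sketch of this detection scheme, a central controller could track heartbeats and flag silent sites roughly as follows; the interval and miss threshold are assumed values, not details from the specification.

```python
import time
from typing import Dict, List, Optional

HEARTBEAT_INTERVAL = 60.0   # seconds between expected heartbeats (assumed value)
MISSED_LIMIT = 3            # consecutive misses tolerated before a failure is declared (assumed)

last_seen: Dict[str, float] = {}   # site_id -> time the last heartbeat arrived

def record_heartbeat(site_id: str) -> None:
    """Called whenever a site controller's periodic status message arrives."""
    last_seen[site_id] = time.monotonic()

def silent_sites(now: Optional[float] = None) -> List[str]:
    """Return the sites whose heartbeat has been missing long enough to signal a failure."""
    if now is None:
        now = time.monotonic()
    return [site for site, seen in last_seen.items()
            if now - seen > HEARTBEAT_INTERVAL * MISSED_LIMIT]
```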




Upon detection of a failure of one of the site controllers, a notice is sent to a remote help desk which includes a rack of spare site controllers and a database of site configurations. A spare site controller is selected and configured with the configuration of the troubled site. The site router at the troubled site is then reconfigured to connect the spare site controller at the remote help desk to the troubled site. The spare site controller then takes over as the site controller while the faulty controller is repaired or replaced.
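The sequence just described might be orchestrated at the help desk along the lines of the following sketch, in which plain dictionaries stand in for the spare rack, the two databases, and the site router's address table; all names and values are illustrative.

```python
def fail_over(site_id, spare_pool, software_db, config_db, router_table):
    """Illustrative failover: pick a spare, configure it for the site, reroute the site."""
    spare = next(s for s in spare_pool if s["available"])   # select an idle spare controller
    spare["software"] = software_db[site_id]                # load the site's software set
    spare["config"] = config_db[site_id]                    # load the site's configuration
    spare["available"] = False
    # Reconfigure the site router: traffic for the site controller now reaches the spare.
    router_table[site_id] = spare["address"]
    return spare

# Toy example for one site and one spare.
spares = [{"address": "203.0.113.10", "available": True}]
software_db = {"site-110": ["pos", "dispenser-control"]}
config_db = {"site-110": {"dispensers": 2, "car_wash": True}}
router_table = {"site-110": "192.0.2.5"}   # currently points at the failed site controller
fail_over("site-110", spares, software_db, config_db, router_table)
```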




In the preferred embodiment, the invention is described in the context of the fueling industry, in which a distributed network controls a plurality of automated service stations. These automated ‘self-service’ stations allow customers to dispense their own fuel, but may in fact be fully or only partially automated. Each station has a PC which functions as a site controller. Other site devices with serial interfaces to the PC include gasoline dispensers, island card readers, and payment-system dial-modem interfaces. A failure in the PC causes the router to convert the serial interface data from the site devices to IP packets, and route the packets over the data network to a backup PC which has been configured by the central controller to replace the site PC while it is being repaired.





FIG. 1 is a simplified block diagram of an embodiment of the system of the present invention. In this embodiment, distributed network 100 includes distributed site 110, here an automated fueling facility, and central control site 160. While for illustration they are separated by a broken line, there is no physical or distance separation requirement. (In one alternative embodiment, for example, the central control site and one of several distributed sites in the distributed network may exist at the same location, or even use the same computer.) For clarity, only a central control site and one automated fueling facility are illustrated in FIG. 1, though there could be (and usually are) numerous distributed sites, and possibly two or more control sites. Communications are accomplished over a data-communications network 150, which is often the Internet or a wide-area network (WAN), but could be any other suitable network such as an intranet, extranet, or virtual private network (VPN).




Fueling facility 110 includes fuel dispensers 115 and 116, from which consumers can dispense their own fuel. Such fuel dispensers typically have an island card-reader (ICR) (not shown) that allows purchasers to make payment for the fuel they receive by, for example, credit or debit card. An ICR interface 118 handles communications to and from the ICRs located on dispensers 115 and 116 so that credit or debit purchases can be authorized and the appropriate account information gathered. The dispensers 115 and 116 themselves communicate through dispenser interface 120, for example, to receive authorization to dispense fuel or to report the quantity sold.




On-site primary controller 140 is a PC or other computing facility that includes operational software and data-storage capabilities in order to be able to manage site operations. Site operations may include not only fuel dispensing but related peripheral services as well, such as a robotic car wash. For illustration, car-wash controller 122 is shown communicating through peripheral interface 124. Communication with separate automated devices, such as a car wash, may be desirable, for example to allow payment to be made through an ICR at the dispenser, or to adjust the price charged based on other purchases already made. Point-of-sale (POS) terminals 125 and 126 are stations for use by a human attendant in totaling and recording sales, making change, and performing credit-card authorizations, and may be used for inventory control as well.




Each of the site components (and any others that may be present) communicates directly or indirectly with on-site primary controller 140, and with the other components, through hub 130. Hub 130 is an on-site router that directs data traffic, typically serial communications, between the various on-site components. Generally, the hub 130 will receive a communication, determine where it should be sent, and effect transmission when the addressed device is ready to receive it. In addition, hub 130 is connected to data network 150 so that the distributed site 110 can communicate with the central control site 160. Note that this connection can be permanent or ad hoc, as desired.




In this embodiment, the network operations controller (NOC) 165, located at central control site 160, manages and supervises the operations of distributed site 110 and the other distributed sites in the network 100. For example, an owner may want to centrally manage a number of distributed fueling facilities. Certain operations, such as accounting and inventory control, may be efficiently done at this control center, although the specific allocation of management functions may vary according to individual requirements.




Also in communication with data communications network 150 is a central control accounting center (CCAC) 170 that acts as a hub or router, when necessary, to effect communications in accordance with the present invention, as explained more fully below. In this capacity, CCAC 170 handles communications between network 150 and virtual spares 171, 172, 173, and 174. These virtual spares are backup controllers that can be brought into use when one of the on-site primary controllers, such as on-site controller 140, is down for maintenance. CCAC 170 may also be connected directly (as shown by the broken line) to NOC 165, which in a preferred embodiment is located at the same site as the CCAC.




The on-site controllers in distributed network 100 need not be, and very often are not, identical or identically configured. Software product database 180 is used for storing information about what software is resident on each on-site controller. Likewise, site configuration database 182 maintains a record of the configuration parameters currently in use for each on-site controller in distributed network 100. (Although two configuration-information databases are shown in this embodiment, more or fewer could be present, and the nature and quantity of the configuration information stored there may of course vary from application to application.) Databases 180 and 182 are accessible through CCAC 170, through which they are populated and through which they are used to configure a virtual spare (as explained more fully below).




Note that even though system components of FIG. 1 are illustrated as separate physical entities, they can also be combined in one machine that is logically separated into a number of components. And as long as they can be placed in communication with the other system components as contemplated by the present invention, there is no requirement that they co-occupy the same machine, physical location, or site.





FIG. 2 is a flow chart illustrating the steps of the method of the present invention when bringing up a spare controller, for example virtual spare 171 shown in FIG. 1. (Note that no exact sequence is required, and the steps of the method of the present invention, including those of the illustrated embodiment, may be performed in any logically allowed order.) The method begins with step 200, problem determination. This determination may occur in a variety of ways, two of which are shown in FIG. 2. In a first scenario, the problem determination includes the failure to receive a status message (sometimes called a ‘heartbeat’) that during normal operations is regularly transmitted by a properly functioning site controller (step 202). In a second scenario, a ‘site-down’ call is received (step 204) at the central control site 160, often from an attendant at the distributed site 110. Note that a system or method embodying the present invention need not include the capability to perform both scenarios, although in some circumstances both may be desirable.




The method then moves to step 205, where the system, and preferably NOC 165, makes a determination of which site controller is down and whether back-up or repair is required. Normally, at this point corrective action will be initiated to recover the failed site controller, which often involves dispatching repair personnel to the site (step 210). Also at this time, a target machine to provide virtual-spare functionality is selected (step 215), such as virtual spare 171 shown in FIG. 1. This selection is generally based on availability, but may be based on suitability for a particular situation or other factors as well. Reference is then made to the software product database 180 and the site configuration database 182 (step 220) to identify the software and parameters related to the down on-site controller identified in step 205. The virtual spare is then prepared (step 225). The distributed-site software set is loaded from software product database 180 (step 225a), the site configuration parameters are loaded from site configuration database 182 (step 225b), and the virtual spare is then warm-started (step 225c).
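A minimal sketch of step 225 follows, assuming the two databases behave as simple key-value stores keyed by site; the class and its attributes are illustrative placeholders rather than details from the specification.

```python
class VirtualSpare:
    """Toy stand-in for a rack-mounted backup controller (illustrative only)."""
    def __init__(self, name: str):
        self.name = name
        self.software = None
        self.parameters = None
        self.running = False

    def prepare(self, software_set, parameters):
        self.software = software_set      # step 225a: load the distributed site's software set
        self.parameters = parameters      # step 225b: load the site configuration parameters
        self.running = True               # step 225c: warm-start, ready to act as the site controller

# Example: prepare spare "171" for site "110" using dicts in place of databases 180 and 182.
software_product_db = {"110": ["pos", "dispenser-control", "car-wash"]}
site_config_db = {"110": {"dispensers": 2, "pos_terminals": 2}}
spare_171 = VirtualSpare("171")
spare_171.prepare(software_product_db["110"], site_config_db["110"])
```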




Note that in a preferred embodiment, the NOC 165, upon being notified (or otherwise determining) that a virtual spare is required, selects the appropriate spare for use according to a predetermined set of criteria, and then initiates and supervises the virtual-spare configuration process. In another embodiment, some or all of these functions may instead be performed by hub 130, or by another component (for example, one dedicated for this purpose).




In order to place the virtual spare ‘on-line’, the communication address tables in the on-site hub 130 must be updated so that the address of virtual spare 171 replaces that of on-site controller 140 (step 230). (The address of virtual spare 171 may include the address of CCAC 170, which will receive messages sent to virtual spare 171 and route them appropriately.) At this point, all communications from the components at distributed site 110 that would ordinarily be directed to the on-site controller 140 are now routed to virtual spare 171. Virtual spare 171 now functions in place of the on-site controller 140, having been configured to do so in step 225. Note that although not shown as a step in FIG. 2, it may be necessary for hub 130 to perform a protocol conversion when routing data through network 150 instead of to on-site controller 140. Typically, this means converting serial transmissions to TCP/IP format, but could involve other procedures as well. In a preferred embodiment, an interworking function is resident on hub 130 for this purpose. Finally, the configuration now in place is tested to ensure correct functionality (step 235), and any necessary adjustments are made (step not shown). The virtual spare 171 continues to function for on-site controller 140 until the necessary maintenance is completed and recovery begins. Note that the site-controller outage (whether caused by a failure or the need for system maintenance) may be total or partial. Therefore the spare controller may not be required to assume all site-controller functions in order to manage operations of the on-site equipment during the outage (either because the failure was not total or because complete assumption is not necessary or desired). Note also that as used herein, the terms “back up” and “backing up” refer to replacing some or all controller functionality according to the system and method described, and not merely to the process of making a “backup” copy of software or of database contents (although copies of software and data may certainly be useful while practicing the invention).
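From the hub's point of view, step 230 together with the protocol conversion might look roughly like the following sketch; the UDP framing, port numbers, and addresses are assumptions made purely for illustration.

```python
import socket

# Hub 130's notion of where the site controller lives; normally on-site controller 140.
controller_address = ("192.0.2.5", 9100)

def switch_to_virtual_spare(spare_address):
    """Step 230: replace the on-site controller's address with that of the virtual spare."""
    global controller_address
    controller_address = spare_address

def forward_device_frame(device_id: str, frame: bytes) -> None:
    """Interworking sketch: wrap a serial frame in an IP (UDP) packet and send it to
    whichever controller address is current, on-site or remote."""
    payload = device_id.encode("ascii") + b"|" + frame
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, controller_address)

switch_to_virtual_spare(("203.0.113.10", 9100))          # e.g. spare 171 behind CCAC 170
forward_device_frame("dispenser-1", b"\x02AUTH REQUEST\x03")
```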





FIG. 3 is a flow chart illustrating the steps of a recovery process according to an embodiment of the present invention, in which a repaired on-site controller is brought back on-line. The recovery process follows from the process of FIG. 2 (or an equivalent method), in which a virtual spare is brought in as a backup. First, the virtual system is synchronized with the third-party systems (step 310). For example, if virtual spare 171 has been functioning for on-site controller 140, virtual spare 171 performs the end-of-day (EOD) synchronization that would ordinarily have been done by controller 140, such as balancing accounts, storing data, and transmitting reports to the network operator or to third-party financial institutions. Any discrepancies found may then be addressed in the usual manner before the (now-repaired) controller 140 is brought back on-line. The repaired unit, such as on-site controller 140, is started up (step 315). Since it has been down for a time, the repaired unit's configuration files are updated (step 320), as necessary. It is then ready to be placed back into operation, so the router address tables are altered to change the routing address for relevant communications from the virtual spare 171 address back to the on-site controller 140 address (step 325).




To ensure that the repaired site controller can perform its normal function, its connectivity to the network is validated (step 330), and the functionality of the on-site controller itself is also validated (step 335). Once the results of this test are verified, the virtual spare 171 is returned to inventory (step 340), that is, made available for other tasks. The process is finished at step 350, the problem having been resolved with a minimum of interruption to normal system operations. Again, while in a preferred embodiment the NOC 165 directs the process of restoring the site controller to service, this function may also be performed by hub 130 or another system component, or shared among them.
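Continuing the toy structures used in the earlier sketches, the recovery path of FIG. 3 might be outlined as follows; the step numbers refer to FIG. 3 and every helper and field is hypothetical.

```python
def restore_site_controller(site_id, spare, router_table, controller_address, config_db):
    """Illustrative recovery sequence following FIG. 3; all structures are placeholders."""
    spare["eod_synchronized"] = True               # step 310: end-of-day sync with third parties
    repaired = {"address": controller_address,
                "config": config_db[site_id]}      # steps 315/320: start up and refresh config
    router_table[site_id] = controller_address     # step 325: route site traffic back on-site
    # Steps 330/335 (connectivity and functional validation) would be performed here.
    spare["available"] = True                      # step 340: return the spare to inventory
    return repaired

# Toy example continuing the failover sketch above.
spare = {"address": "203.0.113.10", "available": False}
router_table = {"site-110": "203.0.113.10"}
config_db = {"site-110": {"dispensers": 2}}
restore_site_controller("site-110", spare, router_table, "192.0.2.5", config_db)
```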





FIG. 4 is a flow chart illustrating the steps of database population in accordance with a method of the present invention. The system and method of the present invention depend on prior creation of the appropriate database records, since by definition the rapid-and-efficient backup will be required when the site controller is unavailable and cannot provide the information needed to correctly configure a spare. An exception occurs in the case of a planned outage. Since it is in that case known when the site controller will be taken out of service, the virtual spare can be configured from a database created especially for the planned outage, or even directly from the still-operational site controller itself. Since premature failure of a site controller cannot be completely avoided, however, the preferred method remains the population of software product database 180 and the site configuration database 182 at the time the site is installed or modified, as shown in FIG. 4.




The process of FIG. 4 begins with receiving an order for a new network of distributed sites (step 410). After the order is processed (step 415), the new site system is staged, and the software product database entry for the site is created (step 420). At site installation (step 425), the actual hardware is put into place and connected, for example as shown by the fueling facility 110 of FIG. 1. The installed site system is configured (step 430), and the site controller is then started up and registers its configuration in the site configuration database (step 435).
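Step 435 might amount to a simple registration call from the newly started site controller to the site configuration database, as in this sketch; the endpoint URL and payload fields are assumptions, not part of the specification.

```python
import json
import urllib.request

def register_site_configuration(site_id: str, config: dict,
                                registry_url: str = "http://ccac.example/site-config") -> None:
    """Step 435 (illustrative): a newly started site controller reports its configuration so
    that the site configuration database stays current. Endpoint and payload are assumed."""
    body = json.dumps({"site_id": site_id, "configuration": config}).encode("utf-8")
    request = urllib.request.Request(registry_url, data=body,
                                     headers={"Content-Type": "application/json"})
    urllib.request.urlopen(request)   # real code would handle errors, retries, authentication

# Example call at controller start-up (commented out; the endpoint above is a placeholder):
# register_site_configuration("110", {"dispensers": 2, "pos_terminals": 2, "car_wash": True})
```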




System upgrades are populated in like fashion. When the need for an upgrade is identified (step 440), usually based on a customer request, the distribution of the upgrade software is scheduled (step 445). When ready, the system automatically distributes the software to the site controllers and updates the software product database to reflect the new site configuration (step 450). A system review process is then initiated to review exceptions and resolve issues (step 455). Any resulting changes affecting site configuration are added to the site configuration database (step not shown).
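Step 450 could be sketched as iterating over the scheduled sites, pushing the upgrade package, and recording the new software set, as below; the push mechanism and database shape are placeholders rather than details from the specification.

```python
def distribute_upgrade(upgrade_package, scheduled_sites, software_product_db, push):
    """Illustrative step 450: push upgrade software to each scheduled site controller and
    record the new software set in the software product database."""
    exceptions = []
    for site_id in scheduled_sites:
        try:
            push(site_id, upgrade_package)                    # deliver the software to the site
            software_product_db[site_id] = upgrade_package    # keep the database in step
        except OSError as error:                              # e.g. the site was unreachable
            exceptions.append((site_id, error))               # reviewed later (step 455)
    return exceptions

# Toy example; "push" stands in for whatever distribution mechanism is actually used.
db = {"110": ["pos-1.0"], "111": ["pos-1.0"]}
unresolved = distribute_upgrade(["pos-1.1"], ["110", "111"], db, push=lambda s, p: None)
```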




Based on the foregoing description, one of ordinary skill in the art should readily appreciate that the present invention advantageously provides a system and method for backing up distributed controllers in a data network.




It is thus believed that the operation and construction of the present invention will be apparent from the foregoing description. While the system and method shown and described has been characterized as being preferred, it will be readily apparent that various changes and modifications could be made therein without departing from the scope of the invention as defined in the following claims.



Claims
  • 1. A system for controlling, through a data network, an automated fuel-distribution site having fuel-dispensing equipment, said system comprising: a site control system located at the site, comprising: a site controller in communication with the fuel-dispensing equipment, the site controller being configured to manage operations of the fuel-dispensing equipment; and a site hub for routing communications between the site equipment and the site controller, and for routing communications to and from the data network; a site-configuration database populated with information regarding the configuration of the site controller; a central control system remotely located from the site comprising: a spare controller reconfigurable to at least partially match the configuration of the site controller and manage the operations of the fuel-dispensing equipment; a central controller that reconfigures the at least one spare controller when required with information from the site-configuration database; and a central hub for routing communications between the central controller, the spare controller, and the site-configuration database, and for routing communications to and from the data network; and means for determining when configuration of the spare controller is required for managing the operations of the fuel-dispensing equipment.
  • 2. The system of claim 1, wherein the data network is the Internet, and further comprising a function available in the site hub for selectively translating site communications addressed to the site controller into an Internet protocol so that the communications can be routed through the Internet to the spare controller when it assumes management of the fuel-dispensing equipment.
  • 3. The system of claim 1, wherein the means for determining when configuration of the spare controller is required comprises: means at the site for generating a predetermined signal pattern when the site controller is functioning properly; and means for detecting when the predetermined signal pattern has been interrupted, indicating that the site controller is not functioning properly.
  • 4. The system of claim 1, wherein the means for determining when configuration of the spare controller is required resides on the site hub, and wherein the site hub further comprises means for generating a notification message to alert the central controller that a site controller failure has been detected.
  • 5. The system of claim 1, further comprising a function in the central controller for directing the site hub to begin routing to the spare controller communications addressed to the site controller.
  • 6. The system of claim 1, wherein the site-configuration database is maintained at the fuel-distribution site.
  • 7. The system of claim 1, wherein the central control system includes: a plurality of spare controllers in communication with the data network and remotely located from the site; and a function in the central controller for selecting one of the plurality of spare controllers to be configured to manage the operations of the site equipment.
  • 8. A system for backing up a site controller in a distributed network having a plurality of sites, each site having a site controller that is configured to manage operating equipment located at the site and a site hub for routing communications between the site equipment and the site controller, each site hub also being in communication with a data communications network, said system comprising: a configuration database populated with configuration information indicating how each of the plurality of site controllers is configured; a configurable spare controller remotely located from the sites and in communication with the data communications network, said spare controller being configurable using the configuration information in the database to manage the operating equipment at a selected site by communicating with the hub at the selected site over the data communications network; a central controller remotely located from the sites and in communication with the data communications network, the central controller including means for configuring the spare controller with configuration information for the site controller at the selected site when backing up of the site controller at the selected site is required; and means for determining when backing up of the site controller at the selected site is required.
  • 9. The system of claim 8, wherein the means for determining when backing up of the site controller at the selected site is required resides on the hub at the selected site, and wherein the hub at the selected site further comprises means for generating a notification message to alert the central controller that a site controller failure at the selected site has been detected.
  • 10. The system of claim 8, wherein the central controller also includes a function that directs the hub to begin routing to the spare controller communications addressed to the site controller.
  • 11. The system of claim 8, further comprising: a plurality of spare controllers remotely located from the sites and in communication with the data network; and a function in the central controller for selecting one of the plurality of spare controllers to assume the function of the site controller at the selected site.
  • 12. A router for connecting a plurality of site components to a site controller and for connecting the site controller through a data network to a central controller and at least one backup site controller, the router comprising: means for determining when the site controller is not operational; and means for rerouting, to the backup site controller, communications directed to the site controller when the site controller is not operational.
  • 13. The router of claim 12, further comprising a function in the router for converting between serial-interface data and Internet protocol (IP) data packets.
  • 14. A method of backing up an automated fueling-station controller that manages station components at a fueling station by communicating with them through a station router, the fueling station being part of a distributed network having a central controller that communicates with the station router through a data network, said method comprising the steps of: providing at least one spare controller remotely located from the site and in communication with the data network; populating a database with configuration information for the station controller; detecting a station controller failure; configuring the spare controller using the configuration information from the database so that the spare controller is capable of at least partially functioning as the station controller; and rerouting, by the station router, station communications to and from the spare controller over the data communications network so that the spare controller can manage the station components.
  • 15. The method of claim 14, wherein the step of detecting a station controller failure includes detecting the failure of a station-controller heartbeat signal.
  • 16. The method of claim 14, wherein the step of providing at least one spare controller includes providing a plurality of spare controllers remotely located from the site, and the method further comprises the step of selecting one of the plurality of controllers to act as a backup upon detecting a station controller failure.
  • 17. The method of claim 14, further comprising the step of translating the station communications before rerouting them.
  • 18. A method of backing up a site controller that manages a site in a distributed network by communicating through a hub, the distributed network including a central controller, a database, and at least one spare controller remotely located from the site and in communication with the hub through a data network, the method comprising the steps of: populating the database with configuration parameters for the site controller; detecting a site controller failure; configuring the spare controller with the configuration parameters for the site controller; and managing the site using the spare controller as a replacement for the failed site controller by routing site-management communications through the data network.
  • 19. The method of claim 18, wherein the distributed network includes a plurality of spare controllers, and the method further comprises the step of selecting a spare controller from the plurality of spare controllers.
  • 20. The method of claim 18, further comprising the steps of: determining that the site controller is ready to return to service; and transferring site management back to the site controller.
Parent Case Info

This application claims the priority of U.S. Provisional Patent Application Serial No. 60/185,327, filed Feb. 28, 2000.

US Referenced Citations (12)
Number Name Date Kind
4035770 Sarle Jul 1977 A
4351023 Richer Sep 1982 A
5202822 McLaughlin et al. Apr 1993 A
5583796 Reese Dec 1996 A
5796936 Watabe et al. Aug 1998 A
5845095 Reed et al. Dec 1998 A
5886732 Humpleman Mar 1999 A
5895457 Kurowski et al. Apr 1999 A
5980090 Royal, Jr. et al. Nov 1999 A
6085333 DeKoning et al. Jul 2000 A
6230200 Forecast et al. May 2001 B1
6557031 Mimura et al. Apr 2003 B1
Provisional Applications (1)
Number Date Country
60/185327 Feb 2000 US