Title of Invention

METHOD AND APPARATUS FOR FAULT TOLERANT PERSISTENCY SERVICE ON NETWORK DEVICE

Abstract A method for providing persistency fault tolerant data stored in a database on a device in a networked environment for an external application, the device having an active processor system and a standby processor system, involves the following steps: providing an identical standby copy of an active database located on the active processor system, on the standby processor system; monitoring the active processor for a failure; and assuming control by the standby processor when the failure is detected; wherein switching from the active database to the standby database is transparent to the external application.
Full Text FIELD OF THE INVENTION
This invention relates to communication networks and more particularly to data storage for an optical communication network.
BACKGROUND OF THE INVENTION
While Internet Protocol ("IP") traffic will represent more than 90 percent of the total public communication network traffic by 2002 and communication service providers plan to invest more than $70 billion in core routing and optical transmission equipment to significantly expand their IP/optical backbone networks, revenues from IP services will only approach $25 billion, which represents only a third of the total communication network services revenue pool of $75 billion. This revenue dilemma is primarily the result of extensive competition in the Internet access market, which has essentially resulted in commodity, flat rate pricing. Extensive use of graphics, audio and video content has driven average utilization up significantly, yet the user is charged the same rate. Service providers must add capacity in the network core without any corresponding increase in revenue. The real challenge for service providers is how to generate more revenue from their IP/optical backbones. By taking advantage of the latest advances in IP quality of service ("QoS"), multiprotocol label switching ("MPLS"), and service transformation technology (the conversion of non-IP services to IP), service providers can evolve dedicated IP infrastructures into a multi-service network architecture, as an alternative to operating separate service-specific networks. The new network architecture is a single multi-service network using IP as the underlying protocol for all service delivery. This allows service providers to supplement IP revenues with other established network service revenues from frame relay, TDM private lines and ATM, resulting in faster payback of the tremendous carrier investment in their IP/optical networks.
However, all facets of the multi-service network architecture must have the reliability of the networks it intends to supplement or replace. Fault tolerance must start at the network edge where services converge. While traditional databases provide efficient storage, they do not address the problems and issues of providing high reliability fault tolerant systems necessary for network devices in this environment. Therefore there is a need for high reliability fault tolerant database storage in the multi-service network environment.
SUMMARY OF THE INVENTION
In one aspect, the present invention provides a method for providing persistency fault tolerant data stored in a database on a device in a networked environment for an external application, the device having an active processor system and a standby processor system. The method comprises the following steps: providing an identical standby copy of an active database located on the active processor system, on the standby processor system; monitoring the active processor for a failure; and assuming control by the standby processor when the failure is detected; wherein switching from the active database to the standby database is transparent to the external application. A system is also disclosed.
STATEMENT OF THE INVENTION
Accordingly, the present invention relates to a method for providing persistency fault tolerant data stored in a database on a device in a networked environment for an external application, the device having an active processor system and a standby processor system, the method comprised by the steps of: (a) maintaining a checksum for each record in an active database located in the active processor system; (b) checking the checksum during initialization; (c) providing an identical standby copy of the active database located on the active processor system, on the standby processor system as a standby database; (d) monitoring the active processor for a failure; and (e) assuming control by the standby processor system when the failure is detected, wherein switching from the active database to the standby database is transparent to the external application and a magic number is kept to distinguish any tar and zipped file with the standby database.
The present invention also relates to a system for providing persistency fault tolerant data stored in a database on a device (60) in a networked environment for an external application (10), the device (60) having an active processor system (40) and a standby processor system (50), the system comprising: checksum means for maintaining a checksum for each record in an active database (20) located in the active processor system (40) and checking the checksum during initialization; standby means for providing an identical standby copy of the active database (20) located on the active processor system (40), on the standby processor system (50) as a standby database (30); monitor means for monitoring the active processor system (40) for a failure; and control means for assuming control by the standby processor system (50) when the failure is detected, wherein switching from the active database to the standby database (30) is transparent to an external application (10) and a magic number is kept to distinguish any tar and zipped file with the standby database (30).
The present invention also relates to a device providing persistency fault tolerant data stored in a database and having an active processor system and a standby processor system, the device comprising: a checksum unit maintaining a checksum for each record in an active database located in the active processor system and checking the checksum during initialization; a standby unit providing an identical standby copy of the active database located on the active processor system, on the standby processor system as a standby database; a monitor unit monitoring the active processor for a failure; and a control unit assuming control by the standby processor system when the failure is detected, wherein switching from the active database to the standby database is transparent to an external application and a magic number is kept to distinguish any tar and zipped file with the standby database.
BRIEF DESCRIPTION OF THE DRAWINGS
A more complete understanding of the present invention may be obtained from consideration of the following description in conjunction with the drawing in which:
FIG. 1 is a block diagram of a system to provide persistency fault tolerant data in a network environment for an external application;
FIG. 2 is a high-level functional block diagram of the relationship of system elements; and
FIG. 3 is a high-level functional block diagram showing the interaction between a representative external application and the datastore module.
DETAILED DESCRIPTION OF VARIOUS ILLUSTRATIVE EMBODIMENTS
While the present invention is particularly well suited for use with the AmberNetwork ASR2000 and ASR2020 router devices and shall be so described herein, it is equally suited for use with other optical routers having similar capabilities and features for implementation of MPLS redundancy.

MPLS (Multiprotocol Label Switching) is a standards-approved technology for speeding up network traffic flow and making it easier to manage. MPLS involves setting up a specific path for a given sequence of packets, identified by a label put in each packet, thus saving the time needed for a router to look up the address of the next node to forward the packet to. MPLS is called multiprotocol because it works with the Internet Protocol ("IP"), Asynchronous Transfer Mode ("ATM"), and various frame relay network protocols. Referring to the standard Open Systems Interconnection ("OSI") model, MPLS allows most packets to be forwarded at the layer 2 (switching) level rather than at the layer 3 (routing) level. In addition to moving traffic faster overall, MPLS makes it easy to manage a network for quality of service ("QoS"). For these reasons, the technique is expected to be readily adopted as networks begin to carry more and different mixtures of traffic.
While MPLS was originally a way of improving the forwarding speed of routers, it is emerging as a crucial standard technology that offers new capabilities for large-scale IP networks. Traffic engineering, the ability of network operators to dictate the path that traffic takes through their network, and Virtual Private Network support are examples of two key applications where MPLS is superior to any currently available IP technology.
MPLS LDP, CR-LDP, RSVP, RSVP-TE and other protocols are defined by the Internet Engineering Task Force ("IETF"). The definitions describe the need for protocol redundancy; however, they do not provide information on its implementation, which is essentially left to a vendor/manufacturer to implement for their particular application requirements. An edge router is a device that routes data between one or more local area networks (LANs) and a backbone network. An edge router is an example of an edge device and is sometimes referred to as a boundary router. An edge router is sometimes contrasted with a core router, which forwards packets to computer hosts within a network (but not between networks).
With an aggregation and core router application, failure of a protocol can lead to unacceptable network down time. Hardware and software redundancy must be provided to ensure high network availability. While traditional databases provide efficient storage, they do not address the problems and issues of providing high reliability fault tolerant systems necessary for network devices in this environment. The present invention, Method and Apparatus for Fault Tolerant Service on Network Device, enables high reliability fault tolerant database storage in the multi-service network environment.
FIG. 1 is a block diagram of a system to provide persistency fault tolerant data in a network environment for an external application, in accordance with one aspect of the present invention. In one aspect, the present invention provides a system and method for providing persistency fault tolerant data in a networked environment for an external application 10. In brief, the method comprises defining a database 20 using Structure of Management Information version 2 (SMIv2) format, then generating structure and metadata corresponding to the database using the SMIv2 definition. An identical standby copy of the database 20, located on a primary system 40, is provided on a secondary system 50, and the active database is accessed through an application program interface. Switching from the active database 20 to the standby database 30 is done transparently (to the external application) when a fault is detected in the primary system.
In one aspect, the present invention provides a system and method for providing persistency fault tolerant data in a networked environment for an external application. In brief, the method comprises defining a database using Structure of Management Information version 2 (SMIv2) format, then generating structure and metadata corresponding to the database using the SMIv2 definition. An identical standby copy of the database, located on a primary system, is provided on a secondary system, and the active database is accessed through an application program interface. Switching from the active database to the standby database is done transparently (to the external application) when a fault is detected in the primary system.
The present invention provides efficient, fault tolerant persistency for a network data storage device. The present invention enables an application to define the data persistency requirements in SMIv2 (Structure of Management Information version 2) format and generate the required schema. The application interacts using APIs (Application Programming Interfaces) to read and/or write persistent information. This enables the application to be highly available, as a copy of the data and the necessary library are redundantly kept in the other control plane. When a failure occurs, the redundant card takes over and the same data is available on the redundant control plane.
The present invention supports different kinds of conventional data, including opaque data. Copies of the database with its signature can be verified by the application without having to extract the data from the database.
From the perspective of a network manager, network management takes place between two major types of systems: those systems in control, called managing systems, and those systems observed and being controlled, called managed systems. The most common managing system is called a Network Management System (NMS). Managed systems can include hosts, servers, or network components such as routers or intelligent repeaters.
To promote interoperability, cooperating systems must adhere to a common framework and a common language, called a protocol. In the Internet Network Management Framework, that protocol is the Simple Network Management Protocol, commonly called SNMP.
The exchange of information between managed network devices and a robust NMS is essential for reliable performance of a managed network. Because some of these devices may have a limited ability to run management software, the software must minimize its performance impact on the managed device. The bulk of the computer processing burden, therefore, is assumed by the NMS. The NMS in turn runs the network management applications that present management information to network managers and other users.
In a managed device, the specialized low-impact software modules, called agents, access information about the managed devices and make it available to the NMS. Managed devices maintain values for a number of variables and report those, as required, to the NMS. For example, an agent might report such data as the number of bytes and packets in and out of the device, or the number of broadcast messages that were sent and received. In the Internet Network Management Framework, each of these variables is referred to as a managed object. A managed object is a classification of anything that can be managed, anything that an agent can access and report back to the NMS. All managed objects are contained in the Management Information Base (MIB), a database of the managed objects.
An NMS can control a managed device by sending a message to the agent (of that managed device) requiring the device to change the value of one or more of its variables. The managed devices can respond to commands such as Sets or Gets. Sets are used by the NMS to control the device. Gets are used by the NMS to monitor the device.
MIB variables are accessible via the Simple Network Management Protocol (SNMP), which is an application-layer protocol designed to facilitate the exchange of management information between network devices. The SNMP system consists of three parts: SNMP manager, SNMP agent, and MIB.
Instead of defining a large set of commands, SNMP places all operations in a get-request, get-next-request, get-bulk-request, and set-request format. For example, an SNMP manager can get a value from an SNMP agent or store a value in that SNMP agent. The SNMP manager can be part of a network management system (NMS), and the SNMP agent can reside on a networking device such as a router. The MIB is compiled with network management software. If SNMP is configured on a router, the SNMP agent can respond to MIB-related queries being sent by the NMS.
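By way of illustration only, the get, get-next and set operations described above can be sketched against an in-memory table. The class name ToyAgent and its methods are assumptions made for this sketch; no SNMP encoding or network transport is modeled. The two object identifiers are the standard sysName.0 and sysLocation.0 instances.

```python
from bisect import bisect_right


class ToyAgent:
    """In-memory stand-in for an SNMP agent (hypothetical; no network I/O)."""

    def __init__(self, mib):
        # OIDs are tuples of ints, so sorting them gives lexicographic OID order.
        self._mib = dict(mib)

    def get(self, oid):
        """get-request: return the value of exactly this managed object."""
        return self._mib[oid]

    def get_next(self, oid):
        """get-next-request: return the first (oid, value) strictly after oid."""
        oids = sorted(self._mib)
        i = bisect_right(oids, oid)
        if i == len(oids):
            raise KeyError("end of MIB view")
        return oids[i], self._mib[oids[i]]

    def set(self, oid, value):
        """set-request: change the value of an existing managed object."""
        if oid not in self._mib:
            raise KeyError(oid)
        self._mib[oid] = value


SYS_NAME = (1, 3, 6, 1, 2, 1, 1, 5, 0)      # sysName.0
SYS_LOCATION = (1, 3, 6, 1, 2, 1, 1, 6, 0)  # sysLocation.0
```

A manager that repeatedly calls get_next, starting from a short prefix such as (1, 3), walks every managed object in order, which is how an NMS can poll a device without knowing its variables in advance.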
An example of an NMS is the network management software which uses the MIB variables to set device variables and to poll devices on the inter-network for specific information. The results of a poll can be graphed and analyzed to help you troubleshoot inter-network problems, increase network performance, verify the configuration of devices, monitor traffic loads, and more.
The SNMP agent gathers data from the MIB, which is the repository for information about device parameters and network data. The agent also can send traps, or notifications of certain events, to the manager.
The present invention, Method And Apparatus For Fault Tolerant Service On Network Device, utilizes a database implemented using the IETF SMIv2 format as a collection of managed objects contained in a MIB, which is a database of managed objects. The program interacts using the API to read or write persistent information. The database uses the IETF SMIv2 format as a data definition language. In SMIv2, management information is viewed as a collection of managed objects residing in a virtual information store, termed the Management Information Base (MIB). Collections of related objects are defined in MIB modules. These modules are written using an adapted subset of OSI's Abstract Syntax Notation One, ASN.1 (1988). The Structure of Management Information (SMI) defines the adapted subset and assigns a set of associated administrative values. The SMI is divided into three parts: module definitions, object definitions, and notification definitions. The final RFCs (Requests For Comments) defining SMIv2 were published as Internet Standard 58 in April 1999: Structure of Management Information Version 2 (SMIv2), RFC 2578, STD 58, April 1999; Textual Conventions for SMIv2, RFC 2579, STD 58, April 1999; and Conformance Statements for SMIv2, RFC 2580, STD 58, April 1999; and are herein incorporated by reference as if set out in full.
Conventional databases use complex mechanisms for storing data which are essentially not designed for use in a network device because of their lack of fault tolerance. The present invention provides a new way of storing data which makes it fault tolerant. The application services that require persistency information define the layout schema of the database using SMIv2 format. Other databases either use a proprietary data definition language or a Structured Query Language (SQL) for defining their data. The present invention has data elements defined in SMIv2 format, which is then used to generate structures and metadata. The generated structures are used by the application to read and write data. The metadata is used by a database service called datastore to provide access to the data.

As illustrated in FIG. 1, when a network device 60 is started for the first time the layout schema is initialized on top of the file system. The file system is expected to provide POSIX-compliant file I/O functions. The applications are prompted to initialize their records by an error message returned when the first read is done. The present invention supports records that can grow dynamically. The application can then read and write the persistent information using the database record id (that is generated by the tool) and the row number. A checksum is maintained for each record and is checked every time the system reboots. An identical copy of the database 20 is kept on standby. When the standby module 50 is plugged in, provisioning on the active module is frozen and the database is copied from the active system 40 to the standby system 50. After the database copy 30 is completed, standby tasks are spawned. This enables all of the tasks to see the same database, as each change in the active database is sent to the standby database as well.
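By way of illustration only, the per-record checksum and the check performed at reboot can be sketched as follows. The choice of CRC-32 is an assumption; the description does not name the checksum algorithm. The class RecordStore and its method names are likewise hypothetical, and records are held in memory rather than on a file system.

```python
import zlib


class RecordStore:
    """Toy store keyed by (record_id, row); each record carries a checksum."""

    def __init__(self):
        self._rows = {}  # (record_id, row) -> (payload bytes, checksum)

    def set_record(self, record_id, row, payload):
        # The checksum is recomputed on every write (CRC-32 is an assumption).
        self._rows[(record_id, row)] = (payload, zlib.crc32(payload))

    def get_record(self, record_id, row):
        # A record that was never written raises KeyError, which stands in
        # for the error that prompts the application to initialize it.
        payload, checksum = self._rows[(record_id, row)]
        if zlib.crc32(payload) != checksum:
            raise ValueError("checksum mismatch")
        return payload

    def verify_all(self):
        """Run at start-up: list the keys whose checksum no longer matches."""
        return [key for key, (payload, checksum) in self._rows.items()
                if zlib.crc32(payload) != checksum]
```

In this sketch verify_all returns an empty list for a healthy store; a non-empty list would identify the records to be recovered, for example from a backup copy.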
The system for providing persistency fault tolerant data stored in a database on a device (60), in a networked environment for an external application (10) is illustrated, in part, in FIG. 1. The device (60) has an active processor system (40) and a standby processor system (50). The system includes checksum means or unit (75) for maintaining a checksum for each record in an active database (20) located in the active processor system (40) and checking the checksum during initialization. Standby means or unit (80) provides an identical standby copy of the active database (20) located on the active processor system (40), on the standby processor system (50) as a standby database (30). Monitor means or unit (85) monitors the active processor system (40) for a failure and control means or unit (90) assumes control by the standby processor system (50) when the failure is detected. Switching from the active database to the standby database (30) is transparent to an external application (10) and a magic number is kept to distinguish any tar and zipped file with the standby database (30).
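By way of illustration only, the standby copy and the transparent switch can be sketched with a facade through which the external application reads and writes. The class ReplicatedStore and its methods are hypothetical, both copies live in one process, and failure detection is left to a caller that stands in for the monitor unit.

```python
class ReplicatedStore:
    """Facade that mirrors every write to a standby copy and can fail over."""

    def __init__(self):
        self._active = {}
        self._standby = None  # no standby until one is attached

    def attach_standby(self):
        # Bulk copy when the standby is plugged in; provisioning is
        # assumed to be frozen while this copy is made.
        self._standby = dict(self._active)

    def write(self, key, value):
        self._active[key] = value
        if self._standby is not None:
            self._standby[key] = value  # each change is sent to the standby

    def read(self, key):
        return self._active[key]

    def fail_over(self):
        # Called by a (hypothetical) monitor when the active side fails.
        if self._standby is None:
            raise RuntimeError("no standby available")
        self._active, self._standby = self._standby, None
```

Because the application only calls read and write on the facade, it observes the same data before and after fail_over, which is the sense in which the switch is transparent.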
A backup copy (snapshot) of the database is made using tar and compression techniques. This backup mechanism is similar to the standard application. In addition, a magic number is kept to distinguish a datastore snapshot from any other tar and zipped file. A version number is stored in the zipped file. The comment field of the gzip header is used to store both the magic number and the version information. All backup copies are also kept redundant.
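By way of illustration only, a tar archive can be wrapped in a gzip stream whose header comment field carries a magic string and a version. The header layout follows the gzip file format (RFC 1952). The magic string "DSSNAP", the "v<number>" version syntax and the function names are assumptions; the description does not give the actual values. Python's gzip module offers no parameter for the comment field, so the header is assembled by hand here.

```python
import gzip
import io
import struct
import tarfile
import zlib

MAGIC = "DSSNAP"  # hypothetical magic string; the actual value is not given


def make_snapshot(files, version):
    """Tar the given {name: bytes} files and gzip them with a header comment."""
    tar_buf = io.BytesIO()
    with tarfile.open(fileobj=tar_buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    raw = tar_buf.getvalue()
    comment = f"{MAGIC} v{version}".encode("latin-1") + b"\x00"
    # RFC 1952 header: ID1 ID2 CM=8 FLG=FCOMMENT(0x10) MTIME=0 XFL=0 OS=255
    header = struct.pack("<BBBBIBB", 0x1F, 0x8B, 8, 0x10, 0, 0, 255)
    deflater = zlib.compressobj(wbits=-15)  # raw deflate, no zlib wrapper
    body = deflater.compress(raw) + deflater.flush()
    trailer = struct.pack("<II", zlib.crc32(raw), len(raw) & 0xFFFFFFFF)
    return header + comment + body + trailer


def read_comment(blob):
    """Return the header comment, or None (handles comment-only headers)."""
    if blob[:2] != b"\x1f\x8b" or blob[3] != 0x10:
        return None
    end = blob.index(b"\x00", 10)  # the comment is zero-terminated
    return blob[10:end].decode("latin-1")


def is_snapshot(blob):
    """True if the gzip comment starts with the snapshot magic string."""
    comment = read_comment(blob)
    return comment is not None and comment.startswith(MAGIC + " ")
```

The result remains an ordinary gzip file, so standard tools can still decompress it; only the comment field tells a snapshot apart from an unrelated archive, and it can be checked without decompressing the data.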

The database is designed to provide a transparent version upgrade when it detects an older version. This is done by using the dsrevise tool to find the changes between the database versions and then to generate the code for upgrading the older version to the newer version.
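By way of illustration only, the idea of deriving an upgrade step from the difference between two schema versions can be sketched as follows. This is not the dsrevise tool, whose inputs and generated code are not described here; schemas are modeled as plain mappings of field name to default value, and a closure stands in for the generated upgrade code. All names are hypothetical.

```python
def diff_schema(old, new):
    """Compare two {field: default} schemas; return (added, removed) names."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    return added, removed


def make_upgrader(old, new):
    """Build a function that converts a record from the old layout to the new."""
    added, removed = diff_schema(old, new)

    def upgrade(record):
        # Drop fields that no longer exist, keep the rest unchanged.
        out = {k: v for k, v in record.items() if k not in removed}
        for field in added:
            out[field] = new[field]  # new fields start at their default
        return out

    return upgrade
```

For example, upgrading from a schema with only a hostname field to one that adds a location field keeps the stored hostname and fills the location with its default.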
Referring to FIG. 2, there is shown the interaction between the definitions, datastore and application. The application defines the data definitions essentially by defining the MIB. These schema files 102 describe the definition of items such as the host, temperature sensor, system card information and line card information that are required to be persistent in order for the system to be highly reliable and highly available. After the MIB is defined, the MIB definitions are then used to generate information that is used by the system. This is done using the datastore language processor utility (dslp) 104. This generates files used by datastore 106 and application 108. This includes metadata 110 and C header file 112. The application 108 utilizes a compiler 114 to generate an executable module 116 from the runtime library 118 and the C source code file 120.
The dslp utility 104 then generates the following files.
• dsRecId.h: contains the record identities for all the records defined. These record identifiers are used by the applications.
• dsMeta.h: contains the record information required by datastore.
• dsPrintDir.h: contains the mapping for print functions. This is used for ds_showRecords.
• dsPrintProto.h: contains print prototypes for all the datastore records. The application developer can provide implementation of these routines. The default implementations are also implemented in the dsPrintImpl.c file.
• dsPrintImpl.c: the C file containing default print messages for all the records. The applications can also provide implementation of the routines.
• rmDsStruc.h: the structure used by the application to read and write to the files.
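By way of illustration only, the generation of a record identity header from a list of defined records can be sketched as follows. The output format, the DS_REC_ prefix, the include guard name and the numbering from 1 are all assumptions; the actual contents of the files produced by dslp are not reproduced in this description.

```python
def generate_rec_id_header(record_names):
    """Emit C-style #define lines giving each record a numeric identity."""
    lines = ["#ifndef DS_REC_ID_H", "#define DS_REC_ID_H", ""]
    for number, name in enumerate(record_names, start=1):
        # Hypothetical naming scheme: DS_REC_<UPPERCASE RECORD NAME>
        lines.append(f"#define DS_REC_{name.upper()} {number}")
    lines += ["", "#endif"]
    return "\n".join(lines) + "\n"
```

Generating the identities from the schema, rather than writing them by hand, keeps the application and the datastore metadata in agreement about which number denotes which record.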
Referring to Table 1, there is shown exemplary code (found in the MIB file) written using the IETF SMIv2 format as a data definition language. The example relates to the definition of the temperature sensor.
[Table 1: SMIv2 definition of the temperature sensor; not reproduced in this text.]
Referring to FIG. 3, there is shown a block diagram which depicts the interaction between a representative external application 202 and the datastore module 204. The external application 202 uses the datastore module 204 by calling the library functions provided by the "dslibrary" 206. Datastore 204 contains the MetaData 208, log files 210 and data files 212. Commands for accessing datastore 204 include dsinitialize 214, dsutils (check, edit, clear, dump, etc.) 216 and dsexport 218. Dsexport 218 provides the necessary interface to produce an ASCII file 220. Referring to Table 2, there is shown sample pseudo code for accessing persistent information (data).

[Table 2: sample pseudo code for accessing persistent information; not reproduced in this text.]


Here a resource manager task that is responsible for keeping the host name obtains the value stored in the persistent information using the command ds_getRecord. It uses the record identity defined in the dsRecId.h file, a row number (0), and a buffer where the value needs to be put. If the data has not been initialized then ds_getRecord returns an error, and the record is initialized with a default value. When an entry changes it is updated using ds_setRecord.
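By way of illustration only, the read, initialize-on-error and update sequence described above can be sketched as follows. The class DatastoreStub is a dictionary-backed stand-in for the datastore library, not its actual interface; the record identity value, the default host name and the use of an exception in place of an error return code are assumptions.

```python
class DatastoreStub:
    """Dictionary-backed stand-in for the datastore library (hypothetical)."""

    def __init__(self):
        self._rows = {}

    def get_record(self, record_id, row):
        # KeyError stands in for the error that ds_getRecord returns
        # when the data has not been initialized.
        return self._rows[(record_id, row)]

    def set_record(self, record_id, row, value):
        self._rows[(record_id, row)] = value


REC_HOST_NAME = 1                # hypothetical record identity
DEFAULT_HOST_NAME = "localhost"  # hypothetical default value


def load_host_name(ds):
    """Read the host name from row 0, initializing it on the first read."""
    try:
        return ds.get_record(REC_HOST_NAME, 0)
    except KeyError:
        ds.set_record(REC_HOST_NAME, 0, DEFAULT_HOST_NAME)
        return DEFAULT_HOST_NAME
```

On the first call the record does not exist, so the default is written and returned; later calls return whatever value was last stored with set_record.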
The present invention includes a method for exporting data in ASCII format (by using the dsreport command), in which the display mechanism takes care of byte ordering by use of a magic number. Each data file contains a 4-byte magic number whose hex representation is 0xafbeadde. When a datastore data file is read on a little-endian machine, this magic number is read as 0xdeadbeaf. This indicates that the endianness has changed, and all subsequent displays are made by converting big endian to little endian.
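By way of illustration only, the byte-order check can be sketched with the struct module. The two constants come from the description above; the function names, and the assumption that the magic number occupies the first four bytes of the file, are hypothetical. The reader's byte order is passed as a parameter so that the sketch behaves the same on any host.

```python
import struct

MAGIC = 0xAFBEADDE    # value as written by the producing machine
SWAPPED = 0xDEADBEAF  # the same four bytes read with the opposite byte order


def needs_swap(first_four, reader_order="<"):
    """True if the file was written with the opposite byte order to the reader."""
    (value,) = struct.unpack(reader_order + "I", first_four)
    if value == MAGIC:
        return False
    if value == SWAPPED:
        return True
    raise ValueError("not a datastore data file")


def read_u32(raw, reader_order="<", swap=False):
    """Decode a 4-byte unsigned field, reversing the byte order if required."""
    order = {"<": ">", ">": "<"}[reader_order] if swap else reader_order
    return struct.unpack(order + "I", raw)[0]
```

A file written on a big-endian machine starts with the bytes af be ad de; a little-endian reader decodes them as 0xdeadbeaf, concludes that a swap is needed, and converts every later multi-byte field accordingly.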
In view of the foregoing description, numerous modifications and alternative embodiments of the invention will be apparent to those skilled in the art. It should be clearly understood that the particular exemplary computer code can be implemented in a variety of ways in a variety of languages, which are equally well suited for a variety of hardware platforms. Accordingly, this description is to be construed as illustrative only and is for the purpose of teaching those skilled in the art the best mode of carrying out the invention. Details of the structure may be varied substantially without departing from the spirit of the invention, and the exclusive use of all modifications, which come within the scope of the appended claim, is reserved.




We claim:
1. A method for providing persistency fault tolerant data stored in a database on a device in a networked environment for an external application, the device having an active processor system and a standby processor system, the method comprised by the steps of:
(a) maintaining a checksum for each record in an active database located in the active processor system;
(b) checking the checksum during initialization;
(c) providing an identical standby copy of the active database located on the active processor system, on the standby processor system as a standby database;
(d) monitoring the active processor for a failure; and
(e) assuming control by the standby processor system when the failure is detected, wherein switching from the active database to the standby database is transparent to the external application and a magic number is kept to distinguish any tar and zipped file with the standby database.

2. The method as claimed in claim 1 wherein the step (c) keeping a compressed backup copy of the database with signature on the active processor system and the standby processor system.
3. The method as claimed in claim 2, if a failure event occurs or corruption event occurs, the compressed backup copy is recovered.

4. The method as claimed in claim 1, wherein the step of defining the database using a predetermined format.
5. The method as claimed in claim 1 wherein the step (e) is capable of generating structure and metadata corresponding to the database using the definition in the predetermined format.
6. The method as claimed in claim 1 wherein the step (a) of active processor system accesses the active database through an application program interface.
7. The method as claimed in claim 5 wherein the said predetermined format is Structure of Management Information version 2 (SMIv2) format.
8. A system for providing persistency fault tolerant data stored in a database on a device (60) in a networked environment for an external application (10), the device (60) having an active processor system (40) and a standby processor system (50), the system comprising:
checksum means for maintaining a checksum for each record in an active database (20) located in the active processor system (40) and checking the checksum during initialization;
standby means for providing an identical standby copy of the active database (20) located on the active processor system (40), on the standby processor system (50) as a standby database (30);
monitor means for monitoring the active processor system (40) for a failure; and
control means for assuming control by the standby processor system (50) when the failure is detected,
wherein switching from the active database to the standby database (30) is transparent to an external application (10) and a magic number is kept to distinguish any tar and zipped file with the standby database (30).
9. A device providing persistency fault tolerant data stored in a database and having an active processor system and a standby processor system, the device comprising:
a checksum unit maintaining a checksum for each record in an active database located in the active processor system and checking the checksum during initialization;
a standby unit providing an identical standby copy of the active database located on the active processor system, on the standby processor system as a standby database;
a monitor unit monitoring the active processor for a failure; and
a control unit assuming control by the standby processor system when the failure is detected, wherein switching from the active database to the standby database is transparent to an external application and a magic number is kept to distinguish any tar and zipped file with the standby database.
10. A method for providing persistency fault tolerant data stored in a database on a device in a networked environment for an external application substantially as herein described with reference to the foregoing examples and accompanying drawings.

11. A system for providing persistency fault tolerant data stored in a database on a device in a networked environment for an external application substantially as herein described with reference to the foregoing examples and accompanying drawings.
12. A device providing persistency fault tolerant data stored in a database and having an active processor system and a standby processor system substantially as herein described


Patent Number 207244
Indian Patent Application Number 1130/CHENP/2004
PG Journal Number 26/2007
Publication Date 29-Jun-2007
Grant Date 01-Jun-2007
Date of Filing 21-May-2004
Name of Patentee NOKIA, INC.
Applicant Address P.O.BOX 22660 HOT SPRINGS,AR71903-2660
Inventors:
# Inventor's Name Inventor's Address
1 KAMALVANSHI,AJAY 1503 OYAMA PLACE,SAN JOSE CA 95131
PCT International Classification Number G 06 F 11/ 00
PCT International Application Number PCT/US02/12694
PCT International Filing date 2002-12-20
PCT Conventions:
# PCT Application Number Date of Convention Priority Country
1 10/027,577 2001-12-20 U.S.A.