Title of Invention

METHOD, COMPUTER PROGRAM COMPRISING PROGRAM CODE ELEMENTS AND COMPUTER PROGRAM PRODUCT FOR MONITORING A SYSTEM CONDITION OF A SYSTEM WITH DISTRIBUTED COMPONENTS, IN PARTICULAR A NETWORK WITH DISTRIBUTED COMPONENTS, AND NETWORK WITH DISTRIBUTED COMPONENTS

Abstract

According to the invention, the distributed components of the system are arranged in a logical ring structure. Each component of the system monitors only its respective neighboring component in said structure, whereby the condition of said neighboring component is determined. If a component determines a condition of its neighboring component that corresponds to a predefinable condition, the component informs the other components of the system of the determined condition of its neighboring component.
Full Text Description
Method, computer program comprising program code elements and computer program product for monitoring a system condition of a system with distributed components, in particular of a network with distributed components, and network with distributed components
The application relates to the monitoring and/or checking, generally referred to as monitoring for short in the following, of a system condition of a system with distributed components.
In such systems - known as distributed systems for short - such as for example mobile or stationary radio and/or communication networks, there is as a rule a requirement for all the components of the distributed system to have a knowledge of the status or condition of every other component in the system (monitoring).
If for example a component in a distributed system fails, for example as a result of the fact that a component is or becomes inoperative and/or offline, then it is advantageous for every other component in the system to receive this information.
Different approaches in which the monitoring is implemented by means of a so-called Ping-Pong mechanism using so-called Ping-Pong messages are known from the prior art for monitoring a system condition of a distributed system.
In this situation, in other words in the case of such a mechanism based on Ping-Pong messages, a system component periodically sends a Ping message over the distributed system to a component being monitored, to which the component being monitored responds with a Pong response, a so-called Pong Acknowledgment.

If there is no Pong response from a component being monitored, then this component will be classified, by the component sending the Ping message as a rule, as offline or generally as inoperative.
In order to enable a component of a distributed system to ascertain or check the status or condition of every other component of the distributed system, it queries all the components using the Ping-Pong mechanism.
A known (first) approach here for monitoring a system condition of a distributed system, based on the Ping-Pong mechanism, makes provision whereby each component of a distributed system monitors every other component of this system and for this purpose queries corresponding information concerning the status or the condition of the other components in each case.
To this end, each component of the distributed system sends every other component of the system a Ping message - and receives (back) corresponding Pong responses when the other components in question are in an operative or online condition.
Figure 3 illustrates this known first approach.
A distributed communication system 300, a HiPath IP telephony system 300, with a plurality of communication servers 301 to 306 associated in a communication network is thus represented in Figure 3. Each of these communication servers 301 to 306 needs to know about the failure of any other communication server 301 to 306 in the system 300.
To this end, each of these communication servers 301 to 306 now sends a Ping message 310 to each of the other communication servers 301 to 306 in the HiPath IP telephony system 300 - and receives corresponding Pong responses 311 when the other communication servers 301 to 306 in question are in an operative or online condition.
The disadvantage of this known first approach is the fact that a high message volume of the order O(n^2) (n: number of system components) is generated during monitoring of a system condition of a distributed system, which may restrict the power or capacity and/or the error detection capability / speed of the system.
Thus, for example, in the case of the HiPath IP telephony system 300 according to Figure 3, Ping-Pong messages (310, 311) are sent every 60 seconds. In this situation, in the case represented with the 6 communication servers 301 to 306, or alternatively 30 servers, (6 × 5 × 2)/60 s = 1 message per second or (30 × 29 × 2)/60 s = 29 messages per second respectively are produced.
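The arithmetic above can be reproduced with a minimal sketch. The function name `all_to_all_rate` is hypothetical and chosen only for illustration; the sketch assumes that every component pings every other component once per period and that each Ping is answered by exactly one Pong.

```python
def all_to_all_rate(n: int, period_s: float) -> float:
    """Messages per second when each of n components pings every other one.

    Assumption for illustration: one Ping and one Pong per ordered pair
    of components in every monitoring period.
    """
    pings = n * (n - 1)      # each component pings the n - 1 others
    messages = 2 * pings     # every Ping is answered by one Pong
    return messages / period_s


print(all_to_all_rate(6, 60.0))   # the 6-server case of Figure 3
print(all_to_all_rate(30, 60.0))  # the alternative 30-server case
```

The quadratic term n(n - 1) is what produces the O(n^2) message volume named above.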
With regard to a further known approach for monitoring a system condition of a distributed system, based on the Ping-Pong mechanism, provision is made here whereby a central coordinator checks components of a distributed system, registers inoperative components or failed components of the distributed system and propagates corresponding information to all the components in the system.
The message volume generated here is of the order O(n).
The disadvantage of this further known approach is the fact that it can only be implemented robustly with difficulty because the central coordinator of the distributed system must be maintained in redundant form.
The object of the present invention is therefore to set down a method which simplifies monitoring of a system condition of a distributed system and/or implements or enables this with a reduced resource requirement.
This object is achieved by the method, by the computer program comprising program code elements and by the computer program product for monitoring a system condition of a system with distributed components and also by the network with distributed components having the features according to the respective independent claim.
With regard to the method for monitoring a system condition of a system with distributed components, the distributed components of the system are or will be organized in a logical ring structure.
In this situation, a "logical" ring structure (of system components) means that without restricting the generality no physical concrete structuring of the system components into a ring structure takes place as a rule but that these are organized or structured in this manner conceptually, in other words as a conceptual model, in a virtual ring structure.
Each component of the system monitors only its respective neighboring component in the logical ring structure, whereby a condition of said neighboring component is determined.
If a component determines a condition of its neighboring component, which condition corresponds to a predefinable condition, then it informs the other components of the system about the determined, predefined condition of its neighboring component.
One advantage of the method according to the invention consists in the fact that it combines advantages of the approaches described above, known from the prior art.
According to the invention, only messages of the order O(n) are required in order to monitor the status or the condition of all the (distributed) components in the distributed system. In this situation, there is however no central entity, the central coordinator, which needs to be maintained robustly.
The approach according to the invention is thus clearly based on a logical ring structure in which the components of the distributed system are or will be organized.
In this case, each component of the system monitors only its respective neighbor, in other words its respective neighboring component, in the ring. In the event of failure, as in the case of an inoperative and/or offline condition, of the neighbor the component informs all the other components about this.
With regard to the network with distributed components, the distributed components are organized in a logical ring structure. The components organized in the ring structure are set up such that
- a component only monitors its respective neighboring component in the
logical ring structure, whereby a condition of the respective neighboring
component can be determined, and
- a component which has determined a condition of its neighboring
component, which condition corresponds to a predefinable condition, informs
the other components of the system about the determined, predefined
condition of its neighboring component.
The computer program comprising program code elements is set up to perform all the steps according to the method according to the invention when the program is executed on a computer.
The computer program product with program code elements stored on a machine-readable medium is set up to perform all the steps according to the method according to the invention when the program is executed on a computer.
The setup and also the computer program with program code elements, set up to perform all the steps according to the method according to the invention when the program is executed on a computer, and also the computer program product with program code elements stored on a machine-readable medium, set up to perform all the steps according to the method according to the invention when the program is executed on a computer, are in particular suitable for performing the method according to the invention or one of its developments described in the following.
Preferred developments of the invention are set down in the dependent claims.
The developments described in the following relate both to the methods and also to the network.
The invention and the developments described in the following can be implemented both in software and also in hardware, by using a special electrical circuit for example.
In addition, it is also possible to implement the invention or a development described in the following by means of a computer-readable storage medium on which is stored the computer program with program code elements that the invention or development executes.
The invention or any development described in the following can also be implemented by means of a computer program product that includes a storage medium on which the computer program with program code elements is stored that the invention or development executes.
In a preferred development, provision is made whereby the predefinable condition is a functional incapacity, in particular an offline condition, or a functional capacity, in particular an online condition. The term "alive" has become widely accepted among experts for the functional capacity.
The monitoring of the respective neighboring component and/or the determination of the condition of a component can preferably be carried out using a method based on a leasing method.
In this situation, in other words with regard to the method applied according to the invention based on a leasing method, "Alive" information, particularly an "Alive" message, can be sent or is sent from the neighboring component to the component.
It can be advantageous here, particularly in respect of current information about a system condition, to send the "Alive" information periodically.
On the basis of such "Alive" information, which may be periodic but is not necessarily so, it is then possible to determine a functional incapacity of the neighboring component if the neighboring component does not send (no longer sends) "Alive" information. This can clearly mean that - on the other hand - as long as "Alive" information is being sent, an "online" condition can be assumed for the respective component.
In addition, it is advantageous for the process of informing the other components about the predefinable condition of a neighboring component to be carried out using a method based on an "Inform All" method.
Here, in other words with regard to the method based on an "Inform All" method applied according to the invention, an acknowledgment process is performed for each other component in the system, in which the respective other component, if it has received the information about the predefinable condition of a neighboring component, confirms the receipt of the information, in particular by using "Acknowledgment" information, specifically an "Acknowledgment" message. The term "acknowledgment" has become widely accepted among experts for the "confirmation".
In this situation, the acknowledgment (of receipt of the information) occurs as a rule with respect to the component which has determined the predefinable condition of the neighboring component, in particular by the fact that an "Acknowledgment" message is sent to the component.
In addition, provision can be made whereby the predefinable condition is also determined for another component which does not acknowledge the receipt of the information about the determined, defined condition of the neighboring component.
In this case too, all the other components can again be informed about this information. This can be done with or also without a corresponding acknowledgment procedure.
In a further preferred embodiment, provision is made whereby each component stores information about the conditions of the other components, particularly in a local list. As a result, each component (always) has local knowledge about a global status or condition of the distributed system.
By using this knowledge, it is then possible to specifically send "leasing" messages and/or to maintain the ring structure in a "closed" condition.
In addition, the neighboring component to a component in the ring structure can be a predecessor or also a successor in the logical ring structure.
In a further preferred embodiment, provision is made for monitoring a communication network, in particular a stationary communication network, such as a fixed telephone network for example, with distributed components. In this case the components are as a rule communication servers.
The monitoring according to the invention can however also be performed accordingly in the case of mobile, distributed systems, such as mobile radio networks.
Further advantages, features and details of the invention will be described in detail in the following with reference to an embodiment and figures associated with the embodiment. In the drawings:
Figure 1 shows a distributed system with (distributed) components organized in a logical ring structure, in which each component only monitors its respective neighbor in the ring, according to an exemplary embodiment;
Figure 2 shows a distributed system with (distributed) components organized in a logical ring structure, in which each component only monitors its respective neighbor in the ring, where one component has failed, according to an exemplary embodiment;
Figure 3 shows a distributed system with (distributed) components, in which each component monitors every other component of the system, according to the prior art.
Figure 1 and Figure 2 show a distributed communication system 100, a HiPath IP telephony system 100 further developed according to the invention, with a plurality of communication servers 101 to 106, associated in a communication network, also referred to as "components" 101 to 106 for short in the following.
Each of these communication servers 101 to 106 needs to know about a failure (cf. Figure 2) of any other communication server 101 to 106 in the system 100, this facility being implemented by the monitoring mechanism described in the following.
To this end, the communication servers 101 to 106 of the HiPath IP telephony system 100 further developed according to the invention are organized in a logical ring structure 120 (102 follows 103, 101 follows 102, 106 follows 101, 105 follows 106, 104 follows 105, 103 follows 104).
Here, each communication server 101 to 106 then monitors 110 only its respective successor in the ring structure 120 (102 monitors successor 101; 103 monitors successor 102;...).
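The ring order and the resulting monitoring relation can be written down as a minimal sketch. The list `RING` and the helper names `successor` and `predecessor` are hypothetical; only the order of the reference numerals is taken from the description.

```python
# Ring order taken from the description: 102 follows 103, 101 follows 102,
# 106 follows 101, 105 follows 106, 104 follows 105, 103 follows 104.
RING = [103, 102, 101, 106, 105, 104]


def successor(component: int, ring=RING) -> int:
    """Return the component that follows `component` in the logical ring."""
    i = ring.index(component)
    return ring[(i + 1) % len(ring)]  # wrap around to close the ring


def predecessor(component: int, ring=RING) -> int:
    """Return the component that precedes `component` in the logical ring."""
    i = ring.index(component)
    return ring[(i - 1) % len(ring)]


# Each component monitors only its successor.
monitors = {c: successor(c) for c in RING}
print(monitors)
```

Because the ring is only a logical ordering, a plain list with modular indexing is sufficient here; no physical ring topology is implied.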
In the event of failure 200 of a successor (cf. Figure 2, 102 in the event of failure 200), the respective (preceding in the ring 120) communication server (cf. Figure 2, 103) informs all the other communication servers about this (cf. Figure 2, 211).
The monitoring 110 of a successor server 101 to 106 in the ring structure 120 takes place here, in other words in the case of the HiPath IP telephony system 100 further developed according to the invention, by way of a "leasing" method.
In this situation, a successor, component 102 for example, periodically sends a so-called "Alive" message 110 to the respective preceding communication server in the ring, component 103 according to the example.
If in the case of one communication server (103 in Figure 2) this "Alive" message 110 from its successor (102 in Figure 2) is absent (the "lease" expires), the successor (102 in Figure 2) will be categorized as "offline".
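The lease expiry can be illustrated by a minimal sketch. The class `Lease`, its fields and the timeout of 3 seconds are assumptions made purely for illustration; the description does not state a timeout or an "Alive" period. Timestamps are passed in explicitly so that the example is deterministic.

```python
from dataclasses import dataclass


@dataclass
class Lease:
    """Lease held by a monitoring component on its successor (hypothetical)."""
    timeout_s: float   # assumed value; not specified in the description
    last_alive: float  # time at which the last "Alive" message arrived

    def renew(self, now: float) -> None:
        """Record an incoming "Alive" message."""
        self.last_alive = now

    def expired(self, now: float) -> bool:
        """True once no "Alive" message has arrived within the timeout."""
        return now - self.last_alive > self.timeout_s


# Component 103 holds a lease on its successor 102.
lease_on_102 = Lease(timeout_s=3.0, last_alive=0.0)
lease_on_102.renew(now=1.0)  # one "Alive" message arrives at t = 1 s
status = "offline" if lease_on_102.expired(now=5.0) else "online"
print(status)
```

In this sketch the monitoring component never sends a query itself; it merely observes whether the lease is renewed, which is what distinguishes the leasing method from the Ping-Pong mechanism.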
The communication server 103 monitoring this "offline" communication server 102 then informs every other communication server 101, 104 to 106 in the system 100 about this by means of corresponding information messages 211, in other words about the failure 200 of its successor 102.
Every other communication server 101, 104 to 106 in the system 100 must confirm the receipt of the information by means of an "Acknowledgment" message.
If this acknowledgment is missing from one of the other components 101, 104 to 106, then this communication server will also be categorized as "offline".
All the communication servers are in turn informed about this further failure by the communication server 103.
At this stage the communication server 103 does not however expect any further acknowledgment.
This mechanism is referred to as "Inform All".
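The two rounds of the "Inform All" mechanism can be sketched as follows. The function `inform_all`, the callback `send` and the message tuples are hypothetical; the sketch models only the local view of the informing component and assumes that `send` returns True when an "Acknowledgment" is received. That the second round skips the components found to be silent is likewise an assumption.

```python
def inform_all(reporter, failed, status, send):
    """Inform every other component that `failed` is offline.

    `status` is the reporter's local list (component -> "online"/"offline").
    `send(sender, receiver, message)` is assumed to return True if the
    receiver acknowledged the message. Returns the components that did not.
    """
    status[failed] = "offline"
    recipients = [c for c in status if c not in (reporter, failed)]

    # First round: an acknowledgment is expected from every recipient.
    silent = [c for c in recipients if not send(reporter, c, ("offline", failed))]

    # Second round: report each silent component; no acknowledgment expected.
    for missing in silent:
        status[missing] = "offline"
        for c in recipients:
            if c not in silent:
                send(reporter, c, ("offline", missing))
    return silent


log = []
unreachable = {105}  # illustrative assumption: 105 has failed as well


def send(sender, receiver, message):
    log.append((sender, receiver, message))
    return receiver not in unreachable


view_103 = {c: "online" for c in (101, 102, 103, 104, 105, 106)}
silent = inform_all(103, 102, view_103, send)
print(silent, view_103)
```

Not waiting for acknowledgments in the second round keeps the procedure from cascading into further rounds.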
Each component 101 to 106 stores its knowledge of the status of other components 101 to 106 in the system 100 in a local list. By this means, each component 101 to 106 always has a local knowledge of the global status of the system 100.
With the aid of this knowledge, each component 101 to 106 sends a "Lease" message to the next "online" predecessor known to it. This guarantees that the ring 120 is always in a closed condition.
At the same time, each component 101 to 106 sends a "Lease" message to all "offline" predecessors which are situated in the ring between the next known "online" predecessor and the component itself.
This method guarantees that components 101 to 106 which come "online" again are re-integrated into the ring 120.
If one component 101 to 106 recognizes from receiving a Lease message that another component is "online" again, then it informs all the other components 101 to 106 using the "Inform All" mechanism.
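The selection of the recipients of the "Lease" messages can be sketched as follows. The function `lease_targets` and the dictionary `local_list` are hypothetical names; the sketch assumes that the local list holds one entry per component and uses the ring order of the embodiment.

```python
# Ring order of the embodiment: each entry is followed by the next one.
RING = [103, 102, 101, 106, 105, 104]


def lease_targets(me, local_list, ring=RING):
    """Return the predecessors to which `me` sends a "Lease" message.

    Walks backwards through the ring, collecting every "offline" predecessor
    (so that it can be re-integrated once it comes back) and stopping at the
    first predecessor known to be "online" (so that the ring stays closed).
    """
    n = len(ring)
    i = ring.index(me)
    targets = []
    for step in range(1, n):
        candidate = ring[(i - step) % n]
        targets.append(candidate)
        if local_list[candidate] == "online":
            break
    return targets


local_list = {c: "online" for c in RING}
local_list[102] = "offline"  # failure 200 of Figure 2

print(lease_targets(101, local_list))  # 102 is offline, 103 closes the ring
print(lease_targets(106, local_list))  # direct predecessor 101 is online
```

With 102 offline, component 101 thus leases to 103, which now monitors 101 in place of 102, while the lease still sent to 102 allows that component to be detected as soon as it is "online" again.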
If account is taken of the fact that in the case of the above system known from the prior art and its monitoring mechanism (cf. Figure 3) 29 messages per second were necessary or were sent, then according to the embodiment - given the same number of messages per second - a monitoring round can be completed approximately once per second (30 messages at 29 messages per second take 30/29 s).
As a result, a system with 30 servers can, under the same network load, be monitored roughly 60 times more quickly and can therefore recognize an error correspondingly sooner. The more servers a system contains, the better this factor becomes.
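The comparison can be checked with a minimal sketch. The function `ring_speedup` is hypothetical, and the sketch assumes that one ring monitoring round costs exactly one "Alive" message per component, ignoring "Inform All" and acknowledgment traffic. Under these assumptions the factor works out to 2(n - 1), i.e. 58 for 30 servers, in line with the rounded figure of 60 given above.

```python
def ring_speedup(n: int, period_s: float = 60.0) -> float:
    """Factor by which ring monitoring is faster at equal message load.

    Assumption for illustration: one message per component per ring round.
    """
    budget = 2 * n * (n - 1) / period_s  # messages/s of the all-to-all approach
    ring_period = n / budget             # seconds per ring round at that budget
    return period_s / ring_period        # simplifies to 2 * (n - 1)


print(ring_speedup(30))
print(ring_speedup(6))
```

Since the factor grows linearly with n, the advantage increases with the number of servers, as stated above.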




We claim:
1. A method for monitoring a system condition of a communication system with distributed components (101 - 106), in particular of a network (100) with distributed components, comprising the following steps:
- arranging the distributed components (101 - 106) of the system in a logical ring structure (120),
- monitoring, by each component (101 - 106) of the system, of only its respective neighboring component in the logical ring structure (120), whereby a condition of said neighboring component is determined, wherein
- in the event that a component (103) has determined a condition of its neighboring component which corresponds to at least one predefinable condition, informing all the other components of the system about said condition of its neighboring component.
2. The method as claimed in claim 1, wherein the predefinable condition is a functional incapacity corresponding to an offline condition or a functional capacity corresponding to an online condition.
3. The method as claimed in claim 1, wherein the monitoring of the respective neighboring component and the determination of the current condition of the neighbouring component is carried out based on a leasing method.
4. The method as claimed in any of the preceding claims, wherein, with regard to the leasing method, an "Alive" message (110) is sent from the respective neighboring component to the other component.
5. The method as claimed in any of the preceding claims, wherein the "Alive" message (110) is sent periodically.
6. The method as claimed in any of the preceding claims, wherein the functional incapacity of the neighboring component is determined if the neighboring component does not send any "Alive" information.
7. The method as claimed in claim 1, wherein the process of informing the other components about the predefinable condition of a neighboring component is carried out using an "Inform All" method.
8. The method as claimed in claim 7, wherein, with regard to the "Inform All" method, an acknowledgment process is performed by each of the other components, in which each of the other components, if it has received the information about the predefinable condition of the respective neighboring component, confirms the receipt of the information by sending an "Acknowledgment" message.
9. The method as claimed in claim 8, wherein the "Acknowledgement" message is sent to the component which has determined the predefinable condition of the respective neighboring component.
10. The method as claimed in any of the preceding claims, further comprising the step of: determining the predefined condition for any component which does not acknowledge the receipt of the information about the predefinable condition of the respective neighboring component.
11. The method as claimed in any of the preceding claims, wherein each component stores information about the conditions of the other components, in particular in a local list.
12. The method as claimed in any of the preceding claims, wherein the neighboring component is a predecessor component or a successor component of a component in the logical ring structure.
13. The method as claimed in any of the preceding claims, wherein the communication network is one of a stationary communication network and a telephone network and the components are communication servers.
14. A communication system (100) with distributed components (101 - 106), comprising
a plurality of components (101 - 106) organized in a logical ring structure (120), each of which monitors only its respective neighboring component, which is a predecessor or successor of said component in the logical ring structure, whereby a predefinable condition of said neighboring component can be determined, and wherein a component which has determined a condition of its neighboring component, which condition corresponds to a predefinable condition, informs the other components of the system about the determined, predefined condition of its neighboring component.
15. A communication system as claimed in claim 14, wherein the said distributed components (101 - 106) are communication servers associated in a communication network.


Patent Number 258515
Indian Patent Application Number 1756/DELNP/2007
PG Journal Number 03/2014
Publication Date 17-Jan-2014
Grant Date 16-Jan-2014
Date of Filing 06-Mar-2007
Name of Patentee SIEMENS AKTIENGESELLSCHAFT
Applicant Address WITTELSBACHERPLATZ, 2, D-80333 MUNICH. GERMANY
Inventors:
# Inventor's Name Inventor's Address
1 BERNDT ; STEFAN AM DARENBERG 1, 59199, BONEN, GERMANY
2 HANNA ; THOMAS TULPENWEG 8, 32758, DETMOLD,GERMANY
3 LAUX ; THORSTEN HEINRICH-STROHMEIER-STR. 11,33104 PADERBORN, GERMANY
4 RUSITSCHKA ; STEFFEN SOMMERSTRASSE 3, 81543, MUNCHEN, GERMANY
5 SCHEERING ; CHRISTIAN ERFTWEG 38, 33689, BIELEFELD, GERMANY
6 SOUTHBALL ; ALAN KARWINSKISTR. 4, 81247, MUNCHEN, GERMANY
PCT International Classification Number H04L 12/24
PCT International Application Number PCT/EP2005/052828
PCT International Filing date 2005-06-17
PCT Conventions:
# PCT Application Number Date of Convention Priority Country
1 10 2004 044 987.2 2004-09-16 Germany