Fault management
In network management, fault management is the set of functions that detect, isolate, and correct malfunctions in a telecommunications network and compensate for environmental changes. These functions include maintaining and examining error logs, accepting and acting on error detection notifications, tracing and identifying faults, carrying out sequences of diagnostic tests, correcting faults, reporting error conditions, and localizing faults by examining and manipulating database information.
Source: Federal Standard 1037C and MIL-STD-188
When a fault or event occurs, a network component will often send a notification to the network operator using either a proprietary protocol or an open one such as SNMP. An alarm is a persistent indication of a fault that clears only when the problem that triggered it has been resolved. A current list of problems occurring on the network component is often kept in the form of an active alarm list, such as the one defined in RFC 3877, the Alarm MIB.
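
The behaviour of an active alarm list can be illustrated with a minimal Python sketch. It is not an implementation of the Alarm MIB and does not use SNMP. The class names, the `handle` method, and the component and problem names are hypothetical, chosen only to show that an alarm stays on the list until a matching clear notification arrives.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Alarm:
    """One active alarm; all field names are illustrative assumptions."""
    source: str          # component that reported the fault
    problem: str         # short identifier of the fault condition
    raised_at: datetime  # time the alarm was first raised


class ActiveAlarmList:
    """Keeps the alarms that have been raised and not yet cleared."""

    def __init__(self):
        # Keyed by (source, problem) so a repeated notification for the
        # same fault does not create a duplicate entry.
        self._alarms = {}

    def handle(self, source, problem, cleared=False):
        """Process one notification: raise a new alarm or clear an old one."""
        key = (source, problem)
        if cleared:
            # The alarm disappears only when the fault is reported resolved.
            self._alarms.pop(key, None)
        elif key not in self._alarms:
            self._alarms[key] = Alarm(source, problem,
                                      datetime.now(timezone.utc))

    def active(self):
        """Return the currently active alarms in the order they were raised."""
        return list(self._alarms.values())


if __name__ == "__main__":
    alarms = ActiveAlarmList()
    alarms.handle("router-1", "linkDown")
    alarms.handle("router-1", "linkDown")      # repeat: still one alarm
    alarms.handle("switch-2", "fanFailure")
    alarms.handle("router-1", "linkDown", cleared=True)
    for alarm in alarms.active():
        print(alarm.source, alarm.problem)
```

Keying the list on the source and problem together means that repeated notifications for an unresolved fault leave a single persistent entry, which is the distinction the text draws between an alarm and a one-off event notification.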