Say Goodbye to Post Mortems, Say Hello to Effective Problem Management


This paper describes the problem management process my company uses to investigate, classify, communicate, and remediate the causes of service outages. Most outages have multiple addressable root causes; our process links these to the outage for analysis and assignment of multiple remediation actions. Root causes can also be analyzed independently, providing powerful trending metrics. The paper discusses the evolution of our problem management system, along with the classification methods and items tracked. This process has proven very effective in eliminating repeat outages.
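To make the idea of linking several root causes to one outage more concrete, here is a minimal Python sketch. It is not the system described in the paper: the class names, fields, category labels, and sample records are all hypothetical, chosen only to illustrate how root causes attached to outages could also be counted on their own for trending.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class RootCause:
    # Hypothetical fields; the paper's actual classification scheme may differ.
    category: str
    description: str
    remediation: str  # one remediation action per root cause, as an assumption


@dataclass
class Outage:
    outage_id: str
    summary: str
    # An outage can carry several addressable root causes.
    root_causes: List[RootCause] = field(default_factory=list)


def root_cause_trend(outages: List[Outage]) -> Counter:
    """Count root causes by category across all outages, independent of
    which outage each one was linked to."""
    return Counter(rc.category for o in outages for rc in o.root_causes)


# Illustrative sample data only, not real incident records.
outages = [
    Outage("OUT-1", "Service unavailable after a change", [
        RootCause("change management", "Change applied without review",
                  "Require peer review for changes"),
        RootCause("monitoring gap", "No alert on failed health check",
                  "Add health check alert"),
    ]),
    Outage("OUT-2", "Slow response during peak load", [
        RootCause("monitoring gap", "Capacity threshold not monitored",
                  "Add capacity alert"),
    ]),
]

trend = root_cause_trend(outages)
```

In this sketch, `trend.most_common()` ranks categories by how often they recur, which is one simple way a trending metric could be derived from root causes analyzed independently of their outages.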

Read the paper here.


