Alarm management - a phased approach
By Tyron Vardy, Alarm Management Consultant, Honeywell Process Solutions
Thursday, 06 November, 2014
Poor alarm management can lead to serious consequences in process plants. The integrity and effectiveness of alarm management can be crucial in either assisting the operator by indicating abnormal situations quickly and simply, or otherwise by distracting, confusing or failing to notify the operator of abnormal situations.
The process industry is not the only sector that understands the importance of alarm management. In April 2013, The Joint Commission, a United States not-for-profit organisation responsible for accrediting hospitals, warned that doctors were increasingly desensitised to, immune to or overwhelmed by constant hospital medical alarms - what researchers labelled ‘alarm fatigue’.
Between January 2005 and June 2010, 566 alarm-related patient deaths were reported to the US Food and Drug Administration’s Manufacturer and User Facility Device Experience (MAUDE) database. It was found that physicians were prone to turn down alarm volumes, turn alarms off or adjust alarm settings outside safe parameters, resulting in serious and sometimes fatal consequences.
Alarm management in the process industry has been dictated by and restricted to the capabilities and functions of specific tools or local site knowledge. Investment in software tools can do little to resolve underlying problems if the results and findings are not actioned in a structured and timely way. Even where process companies take a wider view to include software tools, formal standards and procedures, a framework to tie these elements together is usually lacking.
It is important for the process company to be clear about the reasons to undertake an alarm management project. These will vary between businesses, but without a clear agreement on the benefits sought, it is impossible to tailor an effective program to the organisation’s needs. There are a number of common drivers for embarking on an alarm management journey:
- Regulatory compliance
- Safety improvement and operating efficiency
- Insurance premiums
- Reduction of trips
- Increased production and throughput
- Reduced maintenance and improved asset availability
There is also evidence that alarm management delivers measurable benefits. Honeywell’s own research suggests that plants with an effectively executed alarm management program can, on average, halve both unplanned downtime and the number and cost of incidents. Other observed benefits include as much as 3% increased capacity utilisation, 5% better energy utilisation and a 5% improvement in mechanical availability.
What these figures suggest is that the primary goal of alarm management is not to reduce the number of alarms. A reduction will occur, but it is the result of a good system put in place to achieve business and operating goals, rather than the aim itself. The quality and clarity of the alarms presented, not the number, is the most important aspect of any alarm management program.
Alarm management contributes to many business objectives, including reliability, productivity or performance, safeguarding the plant and reducing costs. A frequently discussed challenge in Australia is the ageing population, and with it a workforce in which the skills gap between retirees and the next generation is widening. An effective alarm management system will capture the knowledge of experienced staff, record the causes, consequences and corrective actions for each alarm, and retain this information for the benefit of less experienced operators.
The key to developing an effective alarm management strategy is to firstly understand why it is being developed and the different requirements that will drive the solutions.
Program ingredients
Techniques, standards, tools and best practices all play a role in alarm management. Most plants will be familiar with, and already have, a range of these. There is not one specific action for a desired result; instead, operators should pursue a combination of actions that will make a positive and strong impact.
The alarm management program provides the framework to apply the various contributors in a coordinated and logical way to achieve the business goals (Figure 1). It allows lessons learned in one area of a plant or facility to be captured and applied to another area. It also promotes a resilient strategy by building alarm management around a program rather than a specific tool, process or workflow, so that a change to any one of these does not undermine the strategy.
Human factors
Human factors are crucial to effective alarm management. No system can be effective, regardless of how technologically advanced it may be, without the human component.
Established best practice and research should inform every step within the program. If we know that it takes, on average, one to two minutes for an operator to read an alarm, understand the consequences and take corrective action, then there is little point in presenting them with multiple critical alarms in a short time period (an alarm flood). How do you prioritise the emergencies in an alarm system that relies on alerting an operator with two emergency alarms at the same time?
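To make the arithmetic concrete: at two minutes per alarm, one operator can work through at most five alarms in a 10-minute window. The Python sketch below flags the windows that exceed that capacity. The 10-minute window, the function name and the sample timestamps are illustrative assumptions, not figures taken from a standard or from any particular software tool.

```python
from collections import Counter
from datetime import datetime, timedelta


def flooded_windows(timestamps,
                    window=timedelta(minutes=10),   # assumed reporting window
                    response=timedelta(minutes=2)): # upper response time quoted above
    """Return the start times of fixed windows that hold more alarms
    than a single operator could work through in that window."""
    capacity = window // response  # e.g. 10 min // 2 min = 5 alarms
    epoch = datetime(1970, 1, 1)
    # Bucket every alarm into the fixed window it falls in.
    counts = Counter((ts - epoch) // window for ts in timestamps)
    return sorted(epoch + idx * window
                  for idx, n in counts.items() if n > capacity)


# Hypothetical alarm log: six alarms in six minutes, then two stragglers.
start = datetime(2014, 11, 6, 8, 0)
log = [start + timedelta(minutes=m) for m in (0, 1, 2, 3, 4, 5, 12, 17)]
print(flooded_windows(log))
```

Only the first window is reported here, because six alarms arrive where the operator has time for five. A site would take its actual flood threshold from its own alarm philosophy.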
Operators should be engaged at every stage of the improvement plan to capture their knowledge and gain insight into human limitations. This also helps employees to adapt to, and understand, the alarm program.
Alarm management improvement is achieved through a phased approach. The two phases that have the largest impact on the program are the identification and elimination of bad actors, and alarm rationalisation.
As each phase is successfully accomplished, the overall number of daily alarms will reduce, as will the alarm floods (Figure 2).
Identification and elimination of bad actors
This phase should focus on the areas of biggest risk and greatest returns, while generating compliant key performance indicator (KPI) reports that meet business deliverables. We need to address problem alarms as they occur so that the facility can stay within its KPIs. The software tools should be easy to use and generate web-based KPI reports that provide a snapshot of current alarm system performance.
Reporting on KPI metrics is only one part of the solution - improvement comes from the action taken on the information these metrics provide. ‘Bad actors’ can range from faulty transmitters to inadequate on/off delays, and they show up as repeat offenders and chattering alarms. These provide unnecessary ‘noise’ and ‘fog’ for the operator.
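The effect of an on-delay on a chattering alarm can be illustrated with a short Python sketch. It counts how often a high alarm would be raised for a signal hovering around its limit, with and without a delay. The function name, the sample values and the three-second delay are hypothetical; real delay settings depend on the process and the control system.

```python
def count_activations(samples, limit, on_delay=0.0):
    """Count high-alarm activations for (time_s, value) samples.

    With an on-delay, the value must stay above the limit for at least
    `on_delay` seconds before the alarm is raised."""
    activations = 0
    active = False
    above_since = None  # time at which the value last crossed above the limit
    for t, value in samples:
        if value > limit:
            if above_since is None:
                above_since = t
            if not active and t - above_since >= on_delay:
                active = True
                activations += 1
        else:
            # Value back in range: clear the alarm and restart the delay timer.
            above_since = None
            active = False
    return activations


# Hypothetical one-second samples: noise around a limit of 50, then a real excursion.
signal = list(enumerate([49, 51, 49, 51, 49, 51, 52, 53, 54, 55]))
print(count_activations(signal, limit=50))              # no delay
print(count_activations(signal, limit=50, on_delay=3))  # three-second on-delay
```

Without the delay the operator sees three activations, two of them noise. With the delay only the sustained excursion raises an alarm, at the cost of a three-second later notification.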
We can start the improvement plan by simply identifying the top three alarms each week, engaging with Maintenance and Operations to have these problems addressed, and then building this weekly process into the workflow of the organisation to provide ownership and continued delivery of the improvement plan. With this process, it is possible to achieve an 80% reduction in overall alarms.
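The weekly ranking can be as simple as a frequency count over the alarm log. The following Python sketch assumes the week's alarms are available as a list of tag names; the tag names and counts are invented for illustration.

```python
from collections import Counter


def top_bad_actors(alarm_tags, n=3):
    """Return the n most frequent alarm tags as (tag, count, share_of_total)."""
    counts = Counter(alarm_tags)
    total = sum(counts.values())
    return [(tag, count, count / total)
            for tag, count in counts.most_common(n)]


# Hypothetical week of alarms: a few tags account for most of the load.
week = (["FT101"] * 50 + ["LT202"] * 30 + ["PT303"] * 10
        + ["TT404"] * 6 + ["XT505"] * 4)
for tag, count, share in top_bad_actors(week):
    print(f"{tag}: {count} alarms ({share:.0%} of total)")
```

In this made-up log the top three tags account for 90 of 100 alarms, which shows why fixing a handful of bad actors each week can produce a large reduction in the overall count.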
Alarm rationalisation
Alarm rationalisation is misunderstood throughout the industry. Effective alarm rationalisation can only take place once the ‘noise’ caused by nuisance alarms has been eliminated. It is not about reducing the number of alarms, but about improving their quality. This means we need to confirm that the design of each alarm is correct in the first place.
The process involves analysing each alarm and looking at its cause, potential consequences and any corrective actions that are required. If there is no operator action that needs to be taken, it is not an alarm. In some cases, operator alerts may be beneficial; however, the need for each alert should be evaluated, otherwise alerts may also become a nuisance.
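One way to picture the output of this analysis is a record per alarm that holds the cause, consequence and corrective action, together with the rule that an entry without an operator action does not qualify as an alarm. The Python sketch below is illustrative only; the class, field names and sample entries are assumptions, not the schema of any particular rationalisation tool.

```python
from dataclasses import dataclass


@dataclass
class AlarmRecord:
    """Hypothetical rationalisation record for a single alarm tag."""
    tag: str
    cause: str
    consequence: str
    corrective_action: str = ""  # left empty when the operator has nothing to do

    def is_alarm(self) -> bool:
        # No operator action means the entry should not be configured as an alarm.
        return bool(self.corrective_action.strip())


# Invented examples for illustration.
high_level = AlarmRecord("LT202", "Outlet valve stuck closed",
                         "Tank overflow", "Stop the feed pump")
pump_running = AlarmRecord("P101_RUN", "Pump started", "None")

print(high_level.is_alarm())    # has a corrective action
print(pump_running.is_alarm())  # status information only
```

Keeping the three fields together also preserves the reasoning behind each alarm for less experienced operators.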
Alarm rationalisation will include review and approval of changes from phase two - these include grouping, cloning and a tag-by-tag review, as well as addressing standing alarms, operating modes, alarm priorities and so on. This should lead to an end-of-assessment review, summary and training that highlight the changes made and the reasons for them.
Essentially, we need to determine whether the plant alarms have the correct priority, whether operators know what to do and what the actions associated with each alarm are. The purpose of priority is to indicate to the operator which alarm to respond to first when two or more alarms come in at the same time. Priorities should follow the guidelines set out in an alarm system design document - the alarm philosophy document (APD).
Consequences and response time will be site specific and detailed in an APD. The APD should dictate that the change management process is followed and the knowledge of the rationalisation exercise is captured. If the consequence of the alarm is severe and the time the operator has to respond is less than two minutes, the alarm priority will most likely be critical. Those with minor consequences and a response time of greater than 30 minutes will be of a lower priority.
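The two cases described above can be written down as a small rule, with everything else deferred to the site's own APD. This Python sketch is a minimal illustration: the function name, the consequence labels and the "low" and "refer to APD" return values are assumptions, and a real priority matrix would define every combination of consequence and response time.

```python
def suggest_priority(consequence, response_minutes):
    """Suggest a priority for the two cases described in the text.

    Any other combination is site specific, so it is deferred to the APD
    rather than guessed here."""
    if consequence == "severe" and response_minutes < 2:
        return "critical"
    if consequence == "minor" and response_minutes > 30:
        return "low"
    return "refer to APD"


print(suggest_priority("severe", 1))    # severe, under two minutes
print(suggest_priority("minor", 45))    # minor, over thirty minutes
print(suggest_priority("severe", 10))   # not covered by the two stated cases
```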
Most importantly, any additional alarms installed due to new equipment need to be rationalised prior to implementation, otherwise all the effort in this phase of the alarm management improvement program will be lost.
Conclusion
Identification and elimination of bad actors, together with alarm rationalisation, will have a positive impact on the alarm rate and improve the overall performance of the alarm system. Software tools are key enablers to managing, monitoring and maintaining the alarm system. Using the tools continuously will guarantee long-term success of the alarm system and an improvement in the operators’ reactions to alarms.