The purpose of this article is to examine how emergency response agencies such as modern fire services are able to function safely in high-risk, complex, highly variable environments at a moment’s notice. What traits and principles do they apply, and how can these be harnessed in the daily operations of a business or as a component of industry? How can these principles and traits set you up for success as a highly reliable, safe, nimble and sustainable organization?
As observers, we often take for granted what is seen every day without actually delving into the intricacies and complexities of how things need to work in order to achieve a successful outcome. Some examples might include: how large heavy aircraft stay aloft, how line-persons can safely work around dangerous high voltage transmission lines, or how humans can harness nuclear energy to safely meet their needs.
Few actually take the time to stop and consider what might go wrong and how the risk of a severe negative outcome might be mitigated. Fortunately, high reliability organizations must consider what controls and backup systems are in place to ensure safety and resiliency should a problem arise. The examples above all involve organizations that operate in a high-risk environment where planning, developing contingencies and hazard mitigation strategies are crucial to their success. You would think that considering all potential hazards, their probabilities and their consequences, and then developing risk countermeasures, would be enough to create confidence and allow these entities to develop reliable systems that almost always function flawlessly to provide safety. It is not enough on its own. To be successful, what they need is adaptive capacity to operate within a complex, variable and uncertain environment, where safety and reliability are created at the sharp end and the capacity and resilience to prevent a disaster exist in the frontline workers.
It would be prudent to regularly think about these sorts of challenges. Some may consider this a bit fatalistic, but there are many who regularly consider all of these questions and more. Most often the problem is not the ability or inability to foresee and plan for failures, but rather the unforeseen risks that result in catastrophic failures. The challenge becomes how to effectively manage risk amid complexity, variability and uncertainty. Taleb (2010) describes this as the black swan event: an event that, before it happened, was not fathomable to the human mind. While we can consider and mitigate most risks and foreseeable hazards, it is nearly impossible to consider and plan for all eventualities and develop a prevention system that is foolproof.
What is adaptive capacity? I will explore the idea of capacity in a system and explain how fire service incident command builds adaptive capacity, and how this can be translated into leadership principles. A contemporary example of a capacity challenge is the state of our healthcare systems. When our healthcare system is functioning at or above capacity, with full beds, full wards and hallway medicine, what happens if you add a major disaster or a pandemic? Without additional capacity and a resilient system that can adapt to the event, we will see system failure and people may be harmed. In the era of lean management, where efficiency is a priority, how much additional adaptive capacity remains within our systems and businesses?
Modern municipal fire and emergency services have become what is typically referred to as an “all hazards” response organization. I would consider this organizational model one that is adaptive to unforeseen circumstances and highly reliable. Municipal fire departments originated in the late 19th and early 20th centuries and were typically volunteer community groups. With the rise of the insurance industry, many large urban areas began to incorporate career fire services into their protective community safety umbrella. Today, the vast majority of municipal fire services are still paid-on-call (sometimes still referred to as volunteer), but the major urban centre fire services are structured as professional government entities with a significant responsibility for public safety and business continuity.
What is unique about modern fire services is that they are expected to respond to any and all emergencies and then adapt to and address whatever issue or event they encounter. What these municipal fire services truly offer their communities is security through the four pillars of resilience engineering (Hollnagel, 2011): anticipating, monitoring, responding and learning. What we have learned in modern safety science, and through concepts such as Taleb’s black swan, is that we can’t possibly imagine all potential future events. This is why a core set of principles and a systematic approach helps fire services deal with complex and variable situations in a safe manner. Planning and training for common and known hazards is the norm, but true resilience and adaptive capacity exist in the real-time approach to problem solving. Let’s take a look at the history of the fire “Incident Command System”, how it presents an adaptive capacity model, and how the contributing traits and principles within this model build organizational adaptive capacity and make fire services high reliability organizations.
High Reliability Organizations (HROs) are those that function within complex and variable situations where the risk of harm to the operators or bystanders is of high consequence, but where there is an expectation to maintain regular risk-free operations (Weick, 2007). Modern emergency services plan and train for predictable and foreseen emergency situations that present a risk of harm to both the responders and the citizens they serve. However, in an “all hazards” model, there is an expectation of real-time problem solving to maintain public safety and continuity, no matter what the cause of the emergency. The reality is that one cannot imagine all conceivable risks or hazards that may present themselves, yet we still expect the modern emergency services to find solutions to these problems when they occur.
In 2001 I was in training to upgrade from an Advanced Care Paramedic to a Critical Care Flight Paramedic. I was at the Toronto City Centre Airport at the Ontario Ministry of Health Air Ambulance hangar. I vividly recall being told to come upstairs to the crew lounge because I “would not believe what is on the television”. We watched incredulously as the World Trade Centre towers crumbled in New York City and Paramedics, Fire Fighters and Police Officers ran towards the disaster. In response, we started to pack extra equipment and prepare the extra aircraft should we be called upon to provide support.
Many can either recall this image or conjure a mental image of responders rushing towards an emerging situation while others flee for safety. Was this an example of responders exhibiting a complete disregard for their personal safety and life? What was their motivation and what were they thinking? Or is there an explanation for how they were behaving, and a strategic method and framework that was being put into place to address the circumstances they encountered? I should point out that 9/11 was an extreme event and no system or methodology would have prevented harm and death to the first responders and occupants of the Twin Towers that day. However, a solid framework was established that day, a system that can be studied by all organizations that seek to effectively manage uncertainty and evolving emergencies that at first glance appear to be out of control, including in their daily operations.
I would like to approach adaptive capacity through the lens of several theories: high reliability organizations, resilience engineering, and the modern safety science known as “safety differently” (Dekker, 2015) or “Safety II” (Hollnagel, 2014). These concepts are highly complementary, leading to positive work, efficiency and safety as an outcome. What are the key elements of an Incident Management System framework that allow emergency services to operate in high-risk environments without regularly causing significant harm or fatalities? What can be learned or translated from such a system to business and industry?
It is important to understand what I am referring to when I talk about complex work. Complexity means that you are working with a multitude of factors and inputs that you are not able to control. This contrasts with complicated or simple work.
An example of complex versus complicated work is a Paramedic crew responding into the community to a cardiac arrest, where a person’s heart has stopped beating. This is a complex work scenario, where the responders don’t know what they will encounter at the scene. Either on the way or upon arrival, responders seek answers to fundamental questions such as: What caused the cardiac arrest? What are the patient’s underlying medical conditions? Who will be available at the scene to assist, and what other unique circumstances will be encountered during the medical management of the patient? Every person’s physiology differs, with varied complications, comorbidities and medications, as do the presenting cardiac rhythms and the patient’s responses to treatment. Conversely, getting an airplane into the air is a complicated task, but not a complex one. There is a complicated series of steps that must be followed and managed with a detailed checklist, but assuming that a competent person with the right skills and abilities follows the required steps, they will be successful, with a low probability that an unplanned emergency will occur.
So in summary, the complex situation is one where you are managing an infinite number of variables with an infinite number of interactions and unknowns that cannot be controlled. Similarly, when fire crews arrive at a fire scene, they must account for an infinite number of variables and complications such as weather, wind, burn time, structural integrity, adjacent exposed buildings, fire load (building materials and content), unknown hazardous materials and chemicals, unknown number of victims, unknown chemical composition of furniture, unknown building layout, overhead wires, obstructions to access and the deteriorating state of the building, all in addition to the hostile environment created by the chemical and physical reactions that result in a fire.
History of Incident Command System
Most people see the uniforms, rank insignia and the regimented formalities of the fire service and equate them to history, tradition and a command structure that is similar to that of the military. Indeed, the fire service is a paramilitary organization, and this has been very much the case for its command structure since the early 1900s. It makes sense given the two world wars and the many returning soldiers who took on the role of firefighter upon their return to society. As with the autocratic leadership methodologies of scientific management theory (Taylor, 1909), much of this top-down hierarchical leadership style has persisted through the industrial revolution and the nuclear age.
The appearance of a paramilitary organization may still exist, but many leading fire service agencies have changed and adapted in the modern era to more distributed command and operating structures. This is also very true of the modern military. If you read the stories of Navy SEALs engaged in modern combat theatres such as the Gulf War and Afghanistan, you will find a very different leadership style, with distributed authority and deference to expertise. Modern military organizations have also shifted to adopt the principles and characteristics of a high reliability organization.
This paramilitary style continued through the mid-century until there was a major change. Wildfires throughout Southern California ravaged forests and presented enormous risks and dangers to wildland firefighters and local inhabitants. The military style of rank and command structure presented significant shortcomings for rapid decision making, sustained operations and support, and ultimately safety. The unfortunate result was the loss of many wildland firefighters. This brought about the creation of “FIRESCOPE” and the birth of the Incident Command System (ICS), which is still the basis for modern disaster management and is sometimes referred to as the Incident Management System (IMS).
The core purpose of IMS and the ICS was to create a common organizational system and framework that could quickly bring together large numbers of people and form an organization with the capacity needed to overcome the hazard. Safety is achieved through effective communication, maintaining an effective span of control, and strategic and tactical planning in a scalable manner. The goals are to know where your people are at any given time, communicate effectively, share current information, adapt to evolving and emerging hazards, and have a centralized command with specialized branches, divisions, sectors, task forces and strike teams that are all customizable to the type of incident.
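The scalable structure and span of control described above can be illustrated with a minimal sketch. Everything in it is hypothetical and for illustration only: the names `Node`, `build_structure`, `widest_span` and `depth`, the generic "Supervisor" labels, and the limit of five direct reports are assumptions, not figures or terms taken from this article or from any official ICS document. The sketch simply adds supervisory layers until no one, including the Incident Commander, has more direct reports than the chosen limit.

```python
from dataclasses import dataclass, field

# Assumed span-of-control limit, chosen for illustration only.
MAX_SPAN = 5


@dataclass
class Node:
    """One position in the command structure (a commander, supervisor or unit)."""
    name: str
    reports: list["Node"] = field(default_factory=list)


def build_structure(commander: str, units: list[str], max_span: int = MAX_SPAN) -> Node:
    """Group units under hypothetical supervisors until every position,
    including the commander, has at most max_span direct reports."""
    level = [Node(unit) for unit in units]
    tier = 0
    while len(level) > max_span:
        tier += 1
        # Chunk the current level into groups of at most max_span,
        # each placed under a newly created supervisor.
        level = [
            Node(f"Supervisor T{tier}-{i // max_span + 1}", level[i:i + max_span])
            for i in range(0, len(level), max_span)
        ]
    return Node(commander, level)


def widest_span(node: Node) -> int:
    """Largest number of direct reports held by any position in the structure."""
    return max([len(node.reports)] + [widest_span(r) for r in node.reports])


def depth(node: Node) -> int:
    """Number of levels in the structure, counting the commander as level one."""
    return 1 + max((depth(r) for r in node.reports), default=0)


if __name__ == "__main__":
    house_fire = build_structure("Incident Commander", ["Engine 1", "Engine 2", "Ladder 1"])
    major_incident = build_structure(
        "Incident Commander", [f"Unit {n}" for n in range(1, 13)]
    )
    print(depth(house_fire), widest_span(house_fire))
    print(depth(major_incident), widest_span(major_incident))
```

With three units the commander supervises the crews directly; with twelve, a supervisory layer appears automatically. The same framework serves both incidents, which is the sense in which the structure is scalable.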
What municipal fire services found, however, was that deploying a full-scale IMS model to every emergency incident was too large an undertaking for daily operational incidents such as a common house fire. Instead, they developed their own ICS with an Incident Commander and a scalable organizational structure, while incorporating the key elements of IMS, specifically span of control, developing a strategy, and assigning tasks to dispersed leaders to achieve that strategy. Conversely, without a systematized and planned approach to addressing hazards, there were far too many line of duty deaths. What they did recognize was the benefit an adaptable organizational structure could have when confronting the challenges that their daily duties and responsibilities presented. In other words, the work of dealing with complex and variable situations that required problem solving to achieve safety was best managed by frontline operators filling the gaps by adapting to the real conditions of operations and their dynamics (Cook and Rasmussen, 2005).
As a result the modern fire service incident command system was created and adapted over several decades and across many large municipal fire service departments. Two key departments that changed the face of ICS were Phoenix, Arizona with the fire ground command system (Brunacini, 1985) adaptation of ICS now known as “Blue Card” incident command training; and Seattle, Washington with the “PASSPORT” personnel accountability methods.
Translating Theory and Practice
The Phoenix Fire Department’s major shift in its fire ground command system grew out of the post incident review that followed the death of firefighter Bret Tarver, who died in the Southwest Supermarket fire in 2001.
The Phoenix Fire Department discovered that they were applying the same basic incident approach to this large fire as they did to a small house fire. During this tragic incident they discovered that they did not have the capacity required to respond and adapt to the complex and changing conditions of this industrial fire. When firefighter Tarver became lost, the resources and tactics were not in place to save him. Their response capacity was not adaptive to the situation at hand, and there was no reserve capacity to create contingencies and controls in a hazardous environment.
Learning Teams and Understanding Normal Work
Fire departments have been particularly strong in conducting post incident analysis, although in the case of Phoenix this was a particularly reactive process prompted by a line of duty death. The practice of post incident analysis has been adopted by virtually all fire departments to better understand their performance at any particular incident response.
These can be conducted on a small scale with individual companies when they return after the response (often referred to as tailboard discussions) and conducted informally over coffee, or they can be more formal in nature, often following larger responses and planned shortly after to include a larger number of participants. This is not to be confused with an incident investigation, which is very different. Regardless of negative or positive outcome, this is a standard practice used to identify normal work functions that went well and normal work functions that require improvement following an incident response with a goal to learn and improve.
Traditional occupational health and safety professionals and healthcare facilities are among the many organizations that often find themselves conducting incident investigations to discover what went wrong after something bad happens. Unfortunately, this looks at work only in hindsight, with a significant outcome bias, and most often seeks a root cause, which means finding something or someone to blame through a linear chain of causation. This is old science, and this linearity is not a true reflection of complex work. Modern safety science has a new perspective on this approach. Not only should we be looking at events in a post incident analysis, but we should also be actively observing normal work on a regular basis to understand what goes well. In fact, this proactive approach is now seen as superior.
As an example, Professor Sidney Dekker shared a story about a hospital system that he worked with. This particular system harmed 1 out of every 13 patients during the care that was provided. This is, unfortunately, an all too common number. Healthcare itself is often cited as the third leading cause of harm to people being treated for an illness or injury. When they looked at the one patient who was harmed, it led them down the common path of linear cause and effect that was likely to lay blame. However, when taking the Safety-II approach of Erik Hollnagel and looking at the other twelve patients in what would be considered normal work, they were able to identify patterns in the team relationships that create an environment of success and thus the outcome of safety. Though ultimately all thirteen patients were subject to similar risks, they were able to identify the positive factors that led to the successful outcome of not harming the other twelve patients. This allowed adaptive changes within the organization to reduce harm.
To compare, for those fire departments that regularly respond to emergencies, what they are actually doing in a post incident analysis is trying to understand their normal work. They get the team together and learn through their varied perspectives, inputs and described experiences seeking all of the factors that contributed to their normal work in a complex and variable work environment. During these sessions all feedback and input is given equal weight and it is understood that the unique experience of one participant will not be the same as the unique experience of another participant, as no one person experiences the work in the exact same way. The result of the post incident analysis is lessons learned from the combined experiences of all participants, each with unique focus, goals, observations, work locations, and cognitive inputs that they all share. These are the frontline experts doing the work and their knowledge, experience and input is valued. This helps to form a larger, more complete cognitive picture of the work that they perform in order to learn from their combined experiences and find better ways to do the work. There is a long history of innovations that have come out of this process from specific tools, to rescue procedures and approaches to safety.
Continuing with the analysis of normal work and the role of post incident analysis (PIA) as a learning opportunity, PIAs are conducted with the frontline workers, usually with an experienced facilitator leading the session. Higher levels of management are usually not included, as the objective is to provide a safe environment where members can be open and honest about expressing their opinions. The most important aspect of the session is that the frontline input is valued and is a key element in the organizational learning process. Just as important as conducting these learning sessions is the creation of the right environment. The concept of an “investigation” is not part of this process and should be removed from the lexicon. The connotation of an investigation is seeking cause, often to blame. When there is blame, there cannot be learning (Hollnagel, 2014).
The adaptation of this type of learning analysis, as it has evolved from modern safety scientists such as Hollnagel and Dekker, is referred to as learning teams. It mirrors the fire service approach. It can be applied at any time through the work process to learn and understand what normal work looks like and how work is done in comparison to how work is planned, thus finding risk, waste and efficiencies before adverse events occur. Such a facilitated process can also replace incident investigations, creating a more robust and open environment in which to learn and to factor in the complexities and variabilities that contributed to risks, as opposed to Newtonian linear cause and effect.
The key principles of high reliability that come out of the learning teams approach are a preoccupation with failure and a reluctance to simplify. When we find blame in an individual, we have oversimplified the problem and often stop there. This falls far short of finding the underlying reasons for the less than positive outcome and avoids consideration of the true complexities of the situation. In understanding each incident and the normal work that has occurred, we seek to find systemic weaknesses and failures rather than individual ones. The objective is to lower the probability that a negative outcome will occur and thereby avoid the consequences that may follow. Often complex problems need deeper analysis, and organizations need to embrace solutions whether they are simple and painless or painful and complex. HROs are often obsessed with maintaining a continuous sense of unease: they are constantly evaluating risks and they fear unanticipated failure. They know that their methods are tested regularly and can fail at any time.
Thus, these ongoing analyses help them pick apart the small failures, negative outcomes and unanticipated risks, and further build their understanding of the work they do. All members of the team maintain this shared alertness to risk and the potential for failure. They continuously look at data, seek information and challenge their practices and current belief systems with new theories and techniques. This is in part because of the intense sense of team that fire services have developed and a culture of caring for one another. As with aviation, this may be the one cultural aspect that separates fire services from most businesses in the private sector. With fire services and with aviation, a significant error could result in catastrophic fatal outcomes, whereas in the private sector a significant error will most often result in catastrophic financial outcomes.
Work as Planned versus Work as Done
This brings me to the concept of work as planned versus work as actually done. In complex and variable environments, one cannot capture in words a policy or procedure that will encompass all aspects of the work. The people who know the work best are those who actually perform it. This flies in the face of century-old organizational structures and cultures that operate “top down”, where management has the ultimate authority and employees are expected to do as they are told and comply with policy.
The last century has been dominated by the concept of scientific management (Taylor, 1909), where it was believed that there were two classes of employees, the workers and the management, and that the most efficient and safest way to conduct work was to have the “smart” managers design the work and the “dumb” workers follow directions. This approach came about in the era of industrialization, with the mechanization of the workplace, and coincided with an enormous upswing in debilitating worker injuries and deaths. As the insurance industry grew, a paradigm shift occurred: the focus moved from requiring employers to fix unsafe work environments to treating the employee as the cause of unsafe work and directing employees in the safest way to do work. Employees became a problem to be controlled, and management knew the one best way to do work. However, behaviour-based approaches to efficiency and safety made little impact on either, yet we apparently still maintain this Tayloristic approach in many modern-day industries.
What we have learned in modern safety science is that Tayloristic and authoritarian leadership does not create a safe working environment, mainly because it does not allow for the adaptive capacity and normative decision making that need to happen at the pointy end of the stick, the front line. It is the workers who dynamically create workplace safety and efficiency in real time through their expertise in the everyday work that they do. Learning teams and an understanding of normal work are what we can use to feed the design of policy frameworks that assist staff to accomplish their work with support instead of restriction. The functional concept is freedom within a framework, and through a learning team approach most fire departments have moved to standard operating guidelines (SOG) instead of must-do policies and procedures.
In reality, it is often impossible for employees to know and remember the plethora of complex policies in a workplace. Organizations will tout policies as necessary to keep people in line, to make them do the right thing and to hold them accountable. What modern safety science has actually learned is that the majority of people do the right things regardless of, or despite, an existing policy. People are not a problem that needs fixing; they are a solution to be harnessed (Dekker, 2015). Policies are about the corporate entity’s ability to maintain compliance and offset liability. They are not actually about creating efficient and safe work. And most importantly, when looking to create adaptive capacity, strict policies create competing priorities and dissuade innovation.
When learning teams are used as a discovery tool, done well and with the right facilitation, what we learn is that most policies have a workaround or are not followed on a regular basis because they impede efficient work. This creates a goal conflict. As a frontline worker, should I follow the policy and not meet my targeted expectation for work outputs, or should I use the method that I know works best? Accidents happen in the grey areas of these goal conflicts, under unspoken, undertone pressures. And then what happens when we investigate? We blame staff for not following a policy, instead of trying to understand what made sense to that person at that time. One approach that I use is to simply ask people to tell me about any policies that make their work more difficult. It never seems hard for them to find many of these policies, and most often there is little explanation of why they exist or when they were last reviewed.
Understanding real and normal work from the perspective of the frontline staff, without fear of retribution for questioning policies and procedures, is therefore one of the key ways to create the adaptive capacity an organization needs to operate safely and efficiently in complex environments. Your frontline staff will provide the answers when they have the freedom to develop solutions without the confines of over-bureaucratization. Give them guidelines and goals, but every step of the process need not be prescribed.
The key element of the fire incident command system is that it allows us to match the work we perform to the complexity and variability of the environment at hand. The deployment to any such response must be at the right level for the incident, have the capacity to stay ahead of the incident and be resilient enough for changing conditions. When working in hazard zones, the system deployed must have adaptive capacity. The unique element of incident command is that the work is deployed into teams of teams with a common framework and the autonomy to make local decisions in the moment, yet they remain under a central organizational structure. These attributes are the normative cultural aspects of fire department organizations that contribute the most to overall safety and functionality.
Gary Klein (1989), a behavioural psychologist, wrote a landmark paper on Naturalistic Decision Making. His research was a qualitative and ethnographic investigation into how fire ground leaders make decisions in times of stress. Prior to this research, the theory was that decisions were based solely on a process of weighing options based on inputs and choosing one option over another. Surprisingly, this research discovered that such strategic and tactical decisions, made under stress, were most often described as experiential, based on natural inclinations for common situations: in other words, past exposure, experience and instinct. Another famous cognitive psychologist, Daniel Kahneman (2011), describes two distinct decision making frameworks within our human brains: System-I, or fast thinking (heuristic), and System-II, or slow thinking (detailed analysis). Looking back at the work of Klein, it becomes abundantly apparent that much of the initial fire ground operation runs on System-I thinking based on heuristics, which is essentially experience plus gut instinct and rapid decision making based on observation of recognized facts. We all do this all the time. This is called the affect heuristic. Often it can be a benefit, but there are other types of heuristics that can also work against us.
The point I am trying to illustrate is that with training, exposure and experience our minds build strategies to accomplish tasks instinctively with efficiency and safety, and most often we are correct. However, this also sets us up for other heuristics and biases that can cause harm, such as stereotyping and tunnel vision or, worse still, sunk cost bias, where we are inclined to believe that we have come so far on something that going just a little bit further won’t hurt, as it would be a waste to turn back or stop.
The incident command system and adaptive capacity create a system of checks and balances to prevent tunnel vision and broaden the knowledge and perspective of an emerging situation. This is accomplished through a number of elements built into the system to shift into a System-II type of analysis and provide more strategic thinking. At every step of the way there are System-I actions, and System-II re-analysis built into the framework which distributes leadership and shared responsibility. How is this accomplished?
Adaptive Capacity vs Lean Management
Consider how a fire response plays out. The dispatch centre receives reports from the community. Based on these reports, a weight of response is developed, often with a computer aided dispatch system. This weight of response is generally based on past experience and the fire department’s pre-planning protocols. These pre-fire plans are developed during daily operations by frontline fire companies and fire inspection and prevention staff, who feed a database of buildings, building layouts, fire loads, occupancies, hazards, fire suppression systems, access and egress, and hydrant locations. All of this information gathering also helps municipalities consider the size and scope of their fire departments to ensure capacity and resilience. It is through an understanding of all of the buildings and risks within a community that these municipalities build their system models and project the capacity needed.
This type of analysis is an important aspect in understanding the needs of the organization and community. This ongoing and careful analysis is how fire services understand their community and understand adaptive capacity for resilience. The idea of adaptive capacity includes the ability to ramp up operations beyond the typical demand on a system in the event of crisis, while still maintaining an efficient and affordable daily operation. Currently this is a challenge for many industries, including the healthcare sector.
Having many of our hospitals at full capacity in times of normal operations does not allow for the unexpected low probability high consequence events that may occur, such as a pandemic or a mass casualty event. As with healthcare, fire departments must compete with many other interests for valuable taxpayer dollars and are continuously challenged during annual budget deliberations. It stands to reason that we all want this system capacity in the event of crisis and emergencies (which comes with significant increased costs), but we also want value and productivity from these resources on a daily basis along with the dedicated learning and preparation. This draws a parallel to our nation having a standing military. How can we best use their capacity and expertise during periods of low demand while still allowing for learning, development and preparation for periods of high demand?
The concept of lean management, when done properly, is about quality and efficiency of operations. A drive focused solely on efficiency shaves off our adaptive capacity and, in many organizations, has eliminated a certain level of managerial knowledge and expertise that is crucial in supporting the frontlines when major unexpected situations arise. Lean management should not mean removing capacity, since an element of quality is having enough capacity in a system to ensure continued optimal quality in the face of adversity and increased demands. This is the complex challenge of designing a system around quality and efficiency that will not deteriorate in high demand or crisis situations, when a system is pushed to the edge of its operating envelope. Applying lean quality improvement methodology does not mean we should cut resources. It means that resources should be optimized to improve quality, reduce waste and have capacity to maintain quality when the system is stressed, yet have productive capabilities during times of low demand. Waste is not always excess; there are several types of waste in the lean lexicon.
Distributed Leadership and Resilience
The truly adaptive aspect of fire ground incident command is the rate at which an operational scale can be increased in response to the resource needs of a particular situation. Critical to this system is its scalability both upward and downward to handle unknown disruptions. Often starting with a single unit arriving on a scene, there are several key initial elements that set the stage for additional resources while still maintaining a safe operation within a hazard zone.
Initial size up and reporting happens when the first unit arrives. They provide dispatch and all incoming units with a radio report and an action plan that includes: initial observations and the extent of the emergency, the establishment of incident command, the initial actions of the first arriving resources, the type of emergency factors that need to be addressed, universal orientation by officially establishing and naming the front of the incident, any safety issues that subsequent responding crews need to be aware of, and any observable details to help others prepare for what will follow. They establish a responsible leader and define a mental picture of the scene and important locations. This is followed by a more detailed follow-up report after circumnavigating the scene. The significance of this approach is building a scenario where everyone can start to understand the magnitude of the situation and what sort of capacity will be required to operate safely and efficiently. Combined with historical data about similar incidents and operations, this makes the service capable of understanding the best, middle and worst case scenarios and how the system is designed to adapt capacity accordingly. Clearly and transparently communicating your business issues and goals, with well-informed research and current data, to all people involved is a key principle and trait of a high reliability organization. When everyone understands all of the factors in a transparent manner, not only will everyone be acting on the same information, but the leader can also receive the most appropriate and effective feedback to make important decisions.
As additional units arrive, the system sets about being adaptive with capacity. First, the initial leader has established priority needs and work assignments; those arriving have a picture in their mind of where to go and what needs to get done, and autonomy in their work zones. As the work begins, command is often handed over to officers who have additional knowledge and experience and are practiced in larger strategic operations. They operate from a better vantage point and develop strategies and supporting tactics to solve the problem at hand. Many of these operations are standardized within a framework of operating guidelines so that others working in proximity know what to expect and can operate safely and in a complementary way in a hazard zone together. Transparency builds trust in a high functioning team.
Staging is a process in the adaptive capacity of incident command systems that allows the leader to determine what is actually needed for the emergency. As resources are applied to the situation, there is a determination of the actual weight of response that is needed. Once all tasks necessary for the given situation are assigned to incoming units, a primary and a secondary staging area can be set up. The primary staging area establishes capacity for further upscaling of resources needed for the current situation. The secondary staging area holds additional capacity for unexpected events, with a worst case scenario in mind. Both are balanced against keeping resources available to respond to another incident within the community and to address regular business as usual. Does your organization have the ability to handle fluctuating daily demand yet ramp up for crises, maintain quality and safety, and still meet your regular ongoing business needs? Municipal fire departments often take on this role as well, assisting all other city services in times of crisis, even when it is not a front line community emergency response. As I write this, the Canadian Forces are ramping up to deploy as added capacity to the Canadian healthcare system due to the Covid-19 pandemic.
Team Dynamics and Resilience
Another key element of an HRO is a commitment to resilience. Resilience is often thrown around with little meaning other than the ability to recover from adverse events. Organizations that operate with resilience often work with teams of teams, decentralized leadership and autonomy, sharing common goals and expertise across working groups. Each fire company works as a small team that is a part of a much larger team. These teams all carry varied skills, levels of knowledge and expertise. Often what you will find within individual fire companies is a timeline of experience from inexperienced rookie to knowledgeable and experienced captain. This generational model of handing over legacy knowledge and building shared experience helps to provide safety when working within the smaller team. This model also shares knowledge between individual teams and leadership abilities within each working sector, division or branch (multiple teams). In a military scenario this helps to address the unfortunate scenario of a team leader being wounded or killed: others have the shared expertise to carry on with the mission.
Each company can have an area of specialized expertise, for example: search and rescue, ventilation, auto-extrication, high-angle rescue, water rescue, and so forth. These team leaders communicate and coordinate within a centralized command structure that keeps depth and breadth as situations evolve. With complex and variable responses, this decentralized expertise has historically multiplied resourcefulness through the varied areas of expertise that firefighters bring to the profession with them. I have seen firefighters who are structural engineers, carpenters, electricians, chemical engineers and more. When faced with a situation that requires expert knowledge beyond that of an incident commander, experts are trusted and take precedence over authority regardless of rank. HROs and fire incident commanders call upon the required expertise (internal and external to the organization) to provide the in-depth knowledge and fill in the blanks for decision making. This is also a practice in medicine that keeps patients safe. When we don’t defer to experts in particular areas we often see risk expand. In addition, high reliability organizations provide opportunities for this knowledge development and retention.
Many progressive fire services practice a rotational model where staff spend time in specialized areas of operation or various leadership positions to learn about different aspects of the organization. This sort of cross operational leadership and sharing of knowledge creates more trust, support and better working relationships across the service. It has been shown to broaden the perspectives of leaders and reduce silos when they have an understanding of other leaders’ perspectives and operational challenges. It also develops depth and breadth of leadership and internal succession planning. What does your organization do to promote varied knowledge and understanding of other business lines and operating divisions amongst your leaders?
Communications and Hand-Overs
Incident command programs share a common element with professor Dekker’s twelve high functioning medical teams: a common language and communication patterns. As demonstrated in modern safety literature in medicine, highly effective and efficient teams have a communications style that allows them to clearly and effectively communicate amongst each other to ensure understanding. The Fire Incident Command System establishes a radio protocol that creates a clear picture of who is currently leading the situation and when that leadership has been handed over. Less important is rank and more important is simply having a clear picture of who is taking responsibility in the moment. Another unique aspect found in medicine and fire incident command is the ability to share and decentralize leadership and have clear hand-overs. Without a clearly communicated handover there is immense risk to workers and patients.
Built into the incident command communication structure are distinct elements that create moments of work stoppage, another parallel with highly effective, safe teamwork in medicine. When an incident commander calls for a Personnel Accountability Report (PAR) or a Conditions-Actions-Needs report (CAN), or if any crew member calls a “Mayday,” there are defined responses, communications protocols and clear actions and expectations. Priority is given to radio traffic that addresses the needs of those who are in immediate danger and to new information that is most important to the team’s crucial decision making ability. This also clearly came from the medical research for patient safety. Having systems of communication that allow for an immediate stop, safety assessment, shared input and new perspectives from all parties involved is essential to safe operations, especially when there is any status change or an individual with a concern.
This establishes a mindset where the team can reset and consider whether they missed something, whether something has changed, and whether they need a change in strategy or tactics. In the case of firefighting this may mean a withdrawal and a change from offensive to defensive tactics. In traditional industrial Occupational Health & Safety, hindsight often finds failure to stop work as a cause of an accident; if you dig deeper, however, it is never this simple. There are often deep, complex issues and a small incremental creep towards the failure that ultimately results from team dynamics and communication.
Mirroring these fire ICS concepts in HRO theory is the concept of sensitivity to operations: the idea that anyone on the front line can effect immediate change in the current actions, strategies and tactics based on their unique perspectives and expertise. Leaders in HROs take frontline input seriously and welcome openness, transparency and frequent candid communication. They share information and take questions and concerns seriously no matter the level of experience, seniority, rank or corporate level. What does your business or organization do to create an environment of open transparent communication, valued input, and critical stops to facilitate new approaches and strategies?
Risk Management Plan and Safety Officer
When speaking about traditional risk management plans in industry, most often it is an assessment of high risk hazards that could be anticipated to be encountered in the work and threats to the business and its continuity or competition. The anticipated risks are seldom where major failure occurs.
What I am really interested in sharing is the concept of capacity in failure. We need to step away from the idea that we can anticipate or prevent all bad things from happening. We need to re-evaluate and consider that “accident” is not the bad word that it has become. Not all incidents are preventable. Evidence has mounted and modern safety science has recognized that there are no “zero” events. The notion of a “zero harm” goal, while morally laudable in concept, creates secondary behaviours that create more opportunity for harm. Referring back to the insurance industry that developed over the last century, actuaries would have you believe that everything is predictable and event probability is calculable. It perhaps is, for frequent events that are statistically powered. It is not, however, for infrequent and atypical larger scale events. This is our blind spot. We need to embrace the concept of failing well.
High Reliability Organizations have learned that it is not a matter of IF something will happen or fail, but WHEN it will happen or fail. The concept of error or human error is a construct that we place on an adverse situation in hindsight. If the same situation occurred and was successful or no harm came, would we have considered it an error? Contrary to popular belief, modern safety science is showing that hazard prevention is not an effective approach to events that cause significant harm or death. Our focus needs to be on building capacity to fail successfully. What this means is establishing controls for known or anticipated risks, but also having capacity in systems to absorb failures and losses without creating harm. There should never be fewer than one control between any potential failure and harm (Conklin, 2012).
Examples of controls are things such as safety harnesses that are donned while working at height, PASS alarms and gas detectors when working in an oxygen depleted environment, and shoring when working in excavated trenches during a rescue. We see controls in redundancy of aircraft controls and dual pilots, and in engineering controls that prevent access or operation of devices if specified conditions are not met. This must also be translated into daily business operations. What are your basic controls if there is a prolonged power outage? If your network server goes down? If your water supply is cut off? If your sales force all become ill? If a large competitor you didn’t anticipate suddenly comes to market? If your primary logistics carrier fails to operate?
The most distinct element of capacity and resiliency in the fire ground incident command model is the Rapid Intervention Team (RIT), the concept of rescuing the rescuers. This is the ultimate element of capacity to fail well within the fire ground incident command system. This is a fresh crew, equipped and ready to jump into action and rescue any firefighter working inside a structure fire. They are equipped with additional air bottles and masks for downed firefighters and, if they are deployed, the operation focuses on facilitating and augmenting this rescue and on maintaining communications. How many organizations have the capacity to move into immediate crisis management mode with the capacity and expertise to take action and, at the same time, keep all of the standard operations up and running?
Health, Wellness, Safety
The safety officer in the incident command structure on a fire ground is an experienced and knowledgeable person who is well versed in the practices of firefighting and whose role is to recognize and mitigate hazards and risks and to know when standard controls and safety countermeasures have not been applied. This is an independent set of eyes with the ability to roam and observe; with the skills, knowledge and experience to recognize when something is “off” or not quite right; and with a direct line to the incident commander. Consider the idea of an independent person who is not engrossed in operations, with an arm’s length perspective, who can give direct feedback to a CEO. What could this look like in your organization?
The final aspect that I am going to discuss is that of health and wellness. We have worked for a long time within a Tayloristic society focused on productivity, top down management, authoritarianism and driving efficiency. Recent years are showing a lack of capacity to meet workplace demands, resulting in a growing problem with physical and psychological health conditions in the workplace. The author Simon Sinek (2016), in his book Leaders Eat Last, speaks about the sociological aspects of groups and teams and their need to feel safe and supported. This need is for a genuine and authentic concern for our staff, not window dressing. People can tell the difference.
Fire incident command systems plan from the start for enough capacity to sustain and complete the work while accounting for health and wellness. The strategy and tactics are established with a rest and rehabilitation unit, medical monitoring and defined work breaks. Looking back on my past experience this has been an evolution in the industry, and I recall many times in the early days of my career working through multiple air bottles in the summer heat or winter cold with little or no rest, dehydrated and exhausted. Ego and pride allowed us to suck it up and carry on. Firefighters are now well aware of the effects on their health and have an expectation that their wellbeing will be planned into rest and rehabilitation as part of their work on incident scenes.
It is extremely important to understand the workplace demands and create the necessary support to keep people healthy. A fire scene is often an extreme physical condition, whereas in a typical workplace or office environment the factors affecting worker health may not be overtly recognizable. This takes us back to learning about real work as done versus work as planned, and understanding what frontline staff truly need to feel supported and cared for in their work. It means understanding that people have families, stressors and illnesses, and come to work as complete persons who are then exposed to various elements of the work environment. As leaders, we will share these experiences of our staff. Schein (2018) in Humble Leadership talks about the effects of Tayloristic work environments and the detrimental effects of “professional relationships”. Our society has become so scared about liability that many people don’t truly know or develop relationships with their coworkers. Emergency services become high reliability by creating a more family like atmosphere and getting to know the team, which contributes to a better functioning high reliability team.
I would be remiss to not acknowledge that, as with all organizations, fire services are far from perfect. Many have struggled with elements of past practice and tradition, especially in the areas of diversity and inclusion. It is extremely important to outline how detrimental this has been to those services that have failed to implement and adapt psychologically safe and inclusive work environments. These organizations fell behind on diversity of opinion and experience, questioning the status quo, understanding all of their workers’ perspectives and making people feel safe, supported and welcomed. Others likely felt threatened by change and by a lack of communication, transparency and support, and felt underprepared for such new normals. Workplaces are complex systems that require adaptation and leadership capacity internally as well. There is a way forward through a restorative justice approach; please see my earlier article on retributive versus restorative justice.
Application to Business and Industry
When we change our perspective to understanding why we succeed at normal work, we can create the conditions that lead to success and avoid the conditions that lead to failure. We can learn far more from all the times we get things right than we can from the few times that it goes wrong. We need to stop fixating on our past failures, consider where we are vulnerable to failure and celebrate our successes.
My leadership perspective is built on the hypothesis that safety is an outcome of quality leadership, and that ultimately what we are interested in are the principles and concepts that create success in the work that we do. By applying the principles of modern safety science and quality improvement as a leadership framework we can better lead organizations in a manner where safety is the outcome. This includes all aspects of safety including a psychologically safe and diverse workplace. If you want to lead with resilience, you must actively manage your adaptive capacity.
Let’s recap the principles and concepts of Safety Differently and High Reliability Organizations that can be applied in everyday business and industry leadership as Adaptive Capacity Leadership.
1. Adopt a learning teams approach. Commit to Resilience.
Eliminate the old concept of trying to figure out what went wrong and who or what is to blame. Adopt the learning teams approach to discovering normal work, when things are going right.
Constantly look for better ways to do work that reflects how work is actually done by your staff.
Build teams that don’t wait for audits or inspections and improve on their own without outside influence.
2. Be constantly uneasy and aware of potential failure.
Incorporate a diversity of opinion by recruiting staff with broad backgrounds and knowledge bases that can observe and question with varied perspectives.
Provide opportunity for learning outside your business verticals or silos.
Fixate on how things could fail, even if they have not already.
Past success is NOT an indicator of current or future success.
3. Distribute Leadership. Defer to Experts.
Build leadership capacity throughout your organization.
As the CEO or head of a department, facilitate the development of a team of your replacements.
Support cross sectional learning and development of varied expertise.
Trust others to take on leadership challenges with autonomy and provide mentorship.
Trust experts at any level or rank.
4. Open, Transparent, Clear Communication.
Foster trust and psychosocial safety amongst your teams to communicate openly, unreservedly, transparently and with a feeling of safety that builds open dialogue.
People must feel supported and valued in their difference of opinion to gain the high reliability organization communication style.
5. Accept and Embrace Complexity.
Look beyond the easy or surface answers.
Seek to understand the complexities and reject the blame and shame.
If your challenge leads to an individual person, that is only the beginning. Keep digging for the real issues.
Andersson, K., & Ostrom, E. (2008). Analyzing decentralized resource regimes from a polycentric perspective. Policy Sciences, 41(1), 71-93.
Bigley, G., & Roberts, K. (2001). The Incident Command System: High-Reliability Organizing for Complex and Volatile Task Environments. The Academy of Management Journal, 44(6), 1281-1299. Retrieved April 4, 2020, from www.jstor.org/stable/3069401
Bergström, J., Henriqson, E., & Dahlström, N. (2011). From Crew Resource Management to Operational Resilience. In Proceedings of the Fourth Resilience Engineering Symposium: June 8-10, 2011, Sophia Antipolis, France. Paris: Presses des Mines. Retrieved April 4, 2020, from http://books.openedition.org/pressesmines/967. ISBN: 9782356710918. https://doi.org/10.4000/books.pressesmines.967
Branlat, M., Fern, L., Voshell, M., & Trent, S. (2009). Understanding Coordination Challenges in Urban Firefighting: A Study of Critical Incident Reports. In Proceedings of the 52nd Annual Meeting of the Human Factors and Ergonomics Society. San Antonio, TX, Oct. 2009.
Cook, R., & Rasmussen, J. (2005). “Going solid”: a model of system dynamics and consequences for patient safety. Quality & safety in health care, 14(2), 130–134. https://doi.org/10.1136/qshc.2003.009530
Cohen-Hatton, S. R., Butler, P. C., & Honey, R. C. (2015). An Investigation of Operational Decision Making in Situ: Incident Command in the U.K. Fire and Rescue Service. Human Factors, 57(5), 793–804. https://doi.org/10.1177/0018720815578266
Conklin, T. (2012). Pre-Accident Investigation: An Introduction to Organizational Safety. Boca Raton, FL: CRC Press.
Dekker, S. W. A., & Pitzer, C. (2016). Examining the asymptote in safety progress: a literature review. International Journal of Occupational Safety and Ergonomics, 22(1), 57–65. https://doi.org/10.1080/10803548.2015.1112104
Dekker, Sidney. (2014). The field guide to human error investigations, 3rd ed. Aldershot, Hants, England; Burlington, VT : Ashgate
Dekker, S. W. A. (2014). Deferring to expertise versus the prima donna syndrome: a manager’s dilemma. Cognition, Technology and Work, 16(4), 541–548. https://doi.org/10.1007/s10111-014-0284-0
Dekker, S. W. A. (2015). The danger of losing situation awareness. Cognition, Technology and Work, 17(2), 159–161. https://doi.org/10.1007/s10111-015-0320-8
Dekker, S. W. A., Long, R., & Wybo, J. L. (2015). Zero vision and a Western salvation narrative. Safety Science, 88, 219–223. https://doi.org/10.1016/j.ssci.2015.11.016
Hollnagel, E. (2008). Investigation as an Impediment to Learning. In E. Hollnagel, C. P. Nemeth, & S. W. A. Dekker (Eds.), Resilience Engineering Perspectives: Remaining Sensitive to the Possibility of Failure (pp. 259-268). Aldershot, UK: Ashgate.
Hollnagel, E. (2009). The ETTO Principle: Efficiency-Thoroughness Trade-Off – Why Things That Go Right Sometimes Go Wrong. Farnham, UK: Ashgate.
Hollnagel, E., Leveson, N., & Woods, D. D. (Eds.). (2006). Resilience engineering: Concepts and precepts. Aldershot, UK: Ashgate.
Kahneman, D. (2013). Thinking, fast and slow.
Klaene, B. J., & Sanders, R. E. (2008). Structural Firefighting: Strategies and Tactics (2nd ed.). Sudbury, MA: Jones & Bartlett Publishers.
Maglio, M. A., Scott, C., Davis, A. L., Allen, J., & Taylor, J. A. (2016). Situational pressures that influence firefighters’ decision making about personal protective equipment: A qualitative analysis. American Journal of Health Behavior, 40(5), 555–567. https://doi.org/10.5993/AJHB.40.5.2
Shattuck, L. G., & Woods, D. D. (2000). Communication of intent in military command and control systems. In C.McCann & R. Pigeau (Eds.), The Human in Command: Exploring the Modern Military Experience (pp. 279-291). New York, NY: Plenum Publishers.
Taleb, N. N. (2007). The Black Swan: The Impact of the Highly Improbable. New York, NY: Random House.
Weick, K. E., & Sutcliffe, K. M. (2001). Managing the Unexpected: Assuring High Performance in an Age of Complexity. San Francisco, CA: Jossey-Bass.
Weick, K. E., & Sutcliffe, K. M. (2007). Managing the Unexpected. Hoboken, NJ: Jossey-Bass.
Woods, D. D., & Branlat, M. (2011). Basic Patterns in How Adaptive Systems Fail. In E. Hollnagel, J. Pariès, D. D. Woods, & J. Wreathall (Eds.), Resilience
Woods, D. D. (2010). How do systems manage their adaptive capacity to successfully handle disruptions? A resilience engineering perspective. Retrieved from https://www.researchgate.net/publication/286581322
Woods, D. D. (2018). The theory of graceful extensibility: basic rules that govern adaptive systems. Environment Systems and Decisions, 38(4), 433–457. https://doi.org/10.1007/s10669-018-9708-3
Ash, J., & Smallman, C. (2010). A case study of decision making in emergencies. Risk Management, 12(3), 185–207. https://doi.org/10.1057/rm.2010.2
With special thanks to Chief R. Davidson and B.J. Ramey