Resilience engineering is a subfield of safety science research that focuses on understanding how complex adaptive systems cope when encountering a surprise. The term resilience in this context refers to the capabilities that a system must possess in order to deal effectively with unanticipated events. Resilience engineering examines how systems build, sustain, degrade, and lose these capabilities.[1]
Resilience engineering researchers have studied multiple safety-critical domains, including aviation, anesthesia, fire safety, space mission control, military operations, power plants, air traffic control, rail engineering, health care, and emergency response to both natural and industrial disasters.[1][2][3] Resilience engineering researchers have also studied the non-safety-critical domain of software operations.[4]
Whereas other approaches to safety (e.g., behavior-based safety, probabilistic risk assessment) focus on designing controls to prevent or mitigate specific known hazards (e.g., hazard analysis), or on assuring that a particular system is safe (e.g., safety cases), resilience engineering looks at a more general capability of systems to deal with hazards that were not known before they were encountered.
In particular, resilience engineering researchers study how people are able to cope effectively with complexity to ensure safe system operation, especially when they are experiencing time pressure.[5] Under the resilience engineering paradigm, accidents are not attributable to human error. Instead, the assumption is that humans working in a system are always faced with goal conflicts and limited resources, requiring them to constantly make trade-offs while under time pressure. When failures happen, they are understood as being due to the system temporarily being unable to cope with complexity.[6] Hence, resilience engineering is related to other perspectives in safety that have reassessed the nature of human error, such as the "new look",[7] the "new view",[8] "safety differently",[9] and Safety-II.[10]
Resilience engineering researchers ask questions such as:
- What can organizations do in order to be better prepared to handle unforeseeable challenges?
- How do organizations adapt their structure and behavior to cope effectively when faced with an unforeseen challenge?
Because incidents often involve unforeseen challenges, resilience engineering researchers often use incident analysis as a research method.[3][2]
Resilience engineering symposia
The first symposium on resilience engineering was held in October 2004 in Söderköping, Sweden.[5] It brought together fourteen safety science researchers with an interest in complex systems.[11]
A second symposium on resilience engineering was held in November 2006 in Sophia Antipolis, France.[12] The symposium had eighty participants.[13] The Resilience Engineering Association, an association of researchers and practitioners with an interest in resilience engineering, continues to hold biennial symposia.[14]
These symposia led to the publication of a series of books (see the Books section below).
Themes
This section discusses aspects of the resilience engineering perspective that are different from traditional approaches to safety.
Normal work leads to both success and failure
The resilience engineering perspective assumes that the work people do within a system that contributes to an accident is fundamentally the same as the work that contributes to successful outcomes. As a consequence, if work practices are examined only after an accident and interpreted only in the context of that accident, the resulting analysis is subject to selection bias.[11]
Fundamental surprise
The resilience engineering perspective posits that a significant number of failure modes are literally inconceivable before they happen, because the environments that systems operate in are very dynamic and the perspectives of the people within the system are always inherently limited.[11] These sorts of events are sometimes referred to as fundamental surprise. Contrast this with the approach of probabilistic risk assessment, which focuses on evaluating conceivable risks.
Human performance variability as an asset
The resilience engineering perspective holds that human performance variability has positive effects as well as negative ones, and that safety is increased by amplifying the positive effects of human variability as well as by adding controls to mitigate the negative effects. For example, the ability of humans to adapt their behavior based on novel circumstances is a positive effect that creates safety.[11] As a consequence, adding controls to mitigate the effects of human variability can reduce safety in certain circumstances.[15]
The centrality of expertise and experience
Expert operators are an important source of resilience within systems. These operators become experts through previous experience of dealing with failures.[11][16]
Risk is unavoidable
Under the resilience engineering perspective, operators are always required to trade off risks. As a consequence, in order to create safety, it is sometimes necessary for a system to take on some risk.[11]
Bringing existing resilience to bear vs generating new resilience
The researcher Richard Cook distinguishes two separate kinds of work that tend to be conflated under the heading resilience engineering:[17]
Bringing existing resilience to bear
The first type of resilience engineering work is determining how best to take advantage of the resilience that is already present in the system. Cook uses the example of setting a broken bone as this type of work: the resilience is already present in the physiology of bone, and setting the bone uses this resilience to achieve better healing outcomes.
Cook notes that this first type of resilience work does not require a deep understanding of the underlying mechanisms of resilience: humans were setting bones long before the mechanism by which bone heals was understood.
Generating new resilience
The second type of resilience engineering work involves altering mechanisms in the system in order to increase the amount of resilience. Cook uses the example of new drugs such as abaloparatide and teriparatide, which mimic parathyroid hormone-related protein and are used to treat osteoporosis.
Cook notes that this second type of resilience work requires a much deeper understanding of the underlying existing resilience mechanisms in order to create interventions that can effectively increase resilience.
Hollnagel perspective
The safety researcher Erik Hollnagel views resilient performance as requiring four systemic potentials:[18]
- The potential to respond
- The potential to monitor
- The potential to learn
- The potential to anticipate
These potentials are described in a Eurocontrol white paper on Systemic Potentials Management: https://skybrary.aero/bookshelf/systemic-potentials-management-building-basis-resilient-performance
Woods perspective
The safety researcher David Woods considers the following two concepts in his definition of resilience:[19]
- graceful extensibility: the ability of a system to develop new capabilities when faced with a surprise that cannot be dealt with effectively using the system's existing capabilities
- sustained adaptability: the ability of a system to keep adapting to surprises over long periods of time
These two concepts are elaborated in Woods's theory of graceful extensibility.
Woods contrasts resilience with robustness, which is the ability of a system to deal effectively with potential challenges that were anticipated in advance.
The safety researcher Richard Cook argued that bone should serve as the archetype for understanding what resilience is in the Woods perspective.[17] Cook notes that bone has both graceful extensibility (has a soft boundary at which it can extend function) and sustained adaptability (bone is constantly adapting through a dynamic balance between creation and destruction that is directed by mechanical strain).
In Woods's view, there are three common patterns to the failure of complex adaptive systems:[20]
- decompensation: exhaustion of capacity when encountering a disturbance
- working at cross purposes: when individual agents in a system behave in a way that achieves local goals but goes against global goals
- getting stuck in outdated behaviors: relying on strategies that were previously adaptive but are no longer so due to changes in the environment
Resilient health care
In 2012, growing interest in resilience engineering gave rise to the sub-field of Resilient Health Care. This led to a series of annual conferences on the topic, which are still ongoing, a series of books on Resilient Health Care, and, in 2022, the establishment of the Resilient Health Care Society (registered in Sweden; https://rhcs.se/).
Books
- Resilience Engineering: Concepts and Precepts by David Woods, Erik Hollnagel, and Nancy Leveson, 2006.
- Resilience Engineering in Practice: A Guidebook by Jean Pariès, John Wreathall, and Erik Hollnagel, 2013.
- Resilient Health Care, Volume 1 by Erik Hollnagel, Jeffrey Braithwaite, and Robert L. Wears (eds.), 2015.
- Resilient Health Care, Volume 2: The Resilience of Everyday Clinical Work by Erik Hollnagel, Jeffrey Braithwaite, and Robert L. Wears (eds.), 2015.
- Resilient Health Care, Volume 3: Reconciling Work-as-Imagined and Work-as-Done by Jeffrey Braithwaite, Robert L. Wears, and Erik Hollnagel (eds.), 2016.
- Resilience Engineering Perspectives, Volume 1: Remaining Sensitive to the Possibility of Failure by Erik Hollnagel, Christopher Nemeth, and Sidney Dekker (eds.), 2016. ISBN 978-0754671275
- Resilience Engineering Perspectives, Volume 2: Preparation and Restoration by Christopher Nemeth, Erik Hollnagel, and Sidney Dekker (eds.), 2016. ISBN 978-1351903882
- Governance and Control of Financial Systems: A Resilience Engineering Perspective by Gunilla Sundström and Erik Hollnagel, 2018.
References
1. Woods, D.D. (2018). "Resilience is a Verb" (PDF). In Trump, B.D.; Florin, M.-V.; Linkov, I. (eds.). IRGC resource guide on resilience (vol. 2): Domains of resilience for complex interconnected systems. Lausanne, CH: EPFL International Risk Governance Center.
2. Pariès, Jean (15 May 2017). Resilience Engineering in Practice. CRC Press. ISBN 978-1-317-06525-8. OCLC 1151009227.
3. Hollnagel, Erik; Christopher P. Nemeth; Sidney Dekker, eds. (2019). Resilience engineering perspectives. Vol. 2: Preparation and Restoration. CRC Press. ISBN 978-0-367-38540-8. OCLC 1105725342.
4. Woods, D.D. (2017). STELLA: Report from the SNAFUcatchers Workshop on Coping With Complexity. Columbus, OH: Ohio State University.
5. Dekker, Sidney (2019). Foundations of safety science: a century of understanding accidents and disasters. Boca Raton. ISBN 978-1-351-05977-0. OCLC 1091899791.
6. Woods, David D. (2017). Resilience Engineering: Concepts and Precepts. CRC Press. ISBN 978-1-317-06528-9. OCLC 1011232533.
7. Woods, David D.; Sidney Dekker; Richard Cook; Leila Johannesen (2017). Behind human error (2nd ed.). Boca Raton. ISBN 978-1-317-17553-7. OCLC 1004974951.
8. Dekker, Sidney W. A. (2002-10-01). "Reconstructing human contributions to accidents: the new view on error and performance". Journal of Safety Research. 33 (3): 371–385. doi:10.1016/S0022-4375(02)00032-4. ISSN 0022-4375. PMID 12404999. S2CID 46350729.
9. Dekker, Sidney (2015). Safety differently: human factors for a new era (2nd ed.). Boca Raton, FL. ISBN 978-1-4822-4200-3. OCLC 881430177.
10. Hollnagel, Erik (2014). Safety-I and safety-II: the past and future of safety management. Farnham. ISBN 978-1-4724-2306-1. OCLC 875819877.
11. Hollnagel, Erik; Christopher P. Nemeth; Sidney Dekker, eds. (2008–2009). Resilience engineering perspectives. Aldershot, Hampshire, England: Ashgate. ISBN 978-0-7546-7127-5. OCLC 192027611.
12. "2006 Sophia Antipolis (F)". Resilience Engineering Association. Retrieved 2022-09-25.
13. Hollnagel, Erik; Christopher P. Nemeth; Sidney Dekker, eds. (2008–2009). Resilience engineering perspectives. Aldershot, Hampshire, England: Ashgate. ISBN 978-0-7546-7127-5. OCLC 192027611.
14. "Symposium". Resilience Engineering Association. Retrieved 2022-09-25.
15. Dekker, Sidney (2018). The safety anarchist: relying on human expertise and innovation, reducing bureaucracy and compliance. London. ISBN 978-1-351-40364-1. OCLC 1022761874.
16. "Hindsight 31 | SKYbrary Aviation Safety". skybrary.aero. Retrieved 2022-09-25.
17. A Few Observations on the Marvelous Resilience of Bone & Resilience Engineering - Dr. Richard Cook, retrieved 2022-09-25
18. Hollnagel, Erik (2017-05-15), "Epilogue: RAG – The Resilience Analysis Grid", Resilience Engineering in Practice, CRC Press, pp. 275–296, doi:10.1201/9781317065265-19, ISBN 978-1-315-60569-2, retrieved 2022-09-17
19. Woods, David D. (September 2015). "Four concepts for resilience and the implications for the future of resilience engineering". Reliability Engineering & System Safety. 141: 5–9. doi:10.1016/j.ress.2015.03.018.
20. Woods, David D.; Branlat, Matthieu (2017-05-15), "Basic Patterns in How Adaptive Systems Fail", Resilience Engineering in Practice, CRC Press, pp. 127–143, doi:10.1201/9781317065265-10, ISBN 978-1-315-60569-2, retrieved 2022-09-24