Talk:Root cause analysis
This article is rated C-class on Wikipedia's content assessment scale.
Miscellaneous discussion
The link below is broken - use <http://www.realitycharting.com/apollo-rca/comparisons> Mfhall (talk) 00:07, 9 September 2009 (UTC)
Stop being an ad. 75.249.123.2 (talk) 23:26, 17 November 2008 (UTC)
The fact that this page is getting too close to the Apollo method is an effort to clarify and improve upon an otherwise poorly written page. Which by the way is a fundamental tenet of Wikipedia. As for a page being "general" I suppose that is acceptable to anyone who is only generally interested in knowing general information about a given subject. However, when I take the time to learn something new, I generally like to know everything there is to know. Human history is laden with simple minded attempts to explain the nature and structure of causes, ref: <www.realitycharting.com/root-cause-analysis/causal-thinking> for details. For a much more thoughtful discussion of all the RCA methods go to <www.realitycharting.com/root-cause-analysis/comparisons> ARCAMAN 18:30, 30 August 2008 (UTC)
This page is getting too close to the Apollo method of root cause analysis and does not represent the breadth of techniques available. For example, this piece:
General process for performing and documenting an RCA-based Corrective Action
Notice that RCA (in steps 3, 4 and 5) forms the most critical part of successful corrective action, because it directs the corrective action at the root of the problem. That is to say, it is effective solutions we seek, not root causes. Root causes are secondary to the goal of prevention, and are only revealed after we decide which solutions to implement.
1. Define the problem.
2. Gather data/evidence.
3. Ask why and identify the causal relationships associated with the defined problem.
4. Identify which causes if removed or changed will prevent recurrence.
5. Identify effective solutions that prevent recurrence, are within your control, meet your goals and objectives and do not cause other problems.
6. Implement the recommendations.
7. Observe the recommended solutions to ensure effectiveness.
Previously, this was more general.
12.50.102.130 (talk) 15:17, 13 July 2008 (UTC)
The "Basic elements of root cause" at the end of this article should be completely removed from this discussion, as it represents only one of the methods in use today and there is no concurrence on these "elements" whatsoever. Indeed this is only one of hundreds of categorical lists that attempt to use categorization of causes to determine the "Root Cause." See Ishikawa Fish Bone Diagram for a similar list. TapRoot and MORT also use similar lists. ARCAMAN 00:05, 10 June 2008 (UTC) —Preceding unsigned comment added by Deangano (talk • contribs)
It has been suggested that this page has too many embedded lists. I agree with this in some respects, but the introduction, which provides the principles of RCA, is a necessary list to communicate what RCA is. Since RCA is a stepwise process, telling a story with prose is totally inappropriate. As for the list of different types of root cause analysis, as stated in the introduction there is no universal consensus or standard. Several other notable and accepted experts in the field and I tried to create a standard in 2005, and while there was good debate, in the end nothing ever came of it because of the significant differences in the practice of RCA. This is not unlike many things in science, and just because there is no consensus on a particular topic, restricting it from Wikipedia does not serve the user at all. Indeed the article could be made more encyclopedic, but I think the emphasis on encyclopedic style should be in the sub-pages that discuss the various RCA methods. These definitely need improvement. ARCAMAN 23:54, 9 June 2008 (UTC) —Preceding unsigned comment added by Deangano (talk • contribs)
Root Cause Analysis is an engineering term, and I suggest there should be a link in the Failure mode and effects analysis wiki.
If Root Cause Analysis is used in Psychology, could we have separate articles? --Graham Proud 01:45, 25 April 2006 (UTC)
- Root cause analysis is not just an engineering term, as it is applied in a wide variety of fields. Root cause failure analysis (RCFA) might be more appropriate if a separate, engineering-oriented article is desired. Also, FMEA is more of a proactive risk analysis method applied in the design stage, and in my experience is not typically applied in a retroactive RCFA. --72.141.22.69 19:36, 29 July 2006 (UTC)
Root cause analysis is an engineering Term:
The engineering shall start from the incident management to provide evidence of the issue. The responsibility of the service desk lies in providing complete information on the issue, with the evidence and the configuration items impacted.
Service desk shall monitor its life cycle on a regular basis and update the problem management to reduce the down time of the services.
Later the problem management has to analyze the incident completely and shall start working with the evidence of the incident, updating the problem log.
Then, the life cycle of root cause analysis shall provide the detailed information of the errors.
This helps the problem management to raise an RFC (Request for change) if required.
--unsigned
Root Cause Analysis is not just an engineering term. While the most formal techniques are generally used by engineers, the term root cause and attempts to identify root causes have been around for over 100 years in fields other than engineering. In fact, the earliest reference I can find is from November 1905 and was in the medical journal The Lancet. It doesn't look like engineers used the term (at least in any professional journals) until some time in the 1970s.
Also, I was looking through the revision history and was surprised to see what I thought were relevant links removed as spam. Granted, there's no need for 6 Taproot links, but in looking at the guidelines for external links (Wikipedia:External links#Links normally to be avoided) I don't see why a single link there or to the page from the REASON people or a variety of others would be removed. Since Wikipedia is supposed to be a work of reference, isn't it appropriate on the "Root Cause Analysis" page to identify the generally recognized and widely used methods for analyzing root causes? Please excuse my ignorance, but if I'm misunderstanding, then what DOES belong on the links for this page?
--Prainog 13:02, 15 October 2006 (UTC)
I believe the term RCA has multiple uses. It is more than an engineering term, and although to the best of my knowledge RCA has its foundation in engineering applications, it can be used in any endeavor when negative outcomes ("failures") occur. My objection here is to the haphazard mixing of the terms RCA and Failure Modes and Effects (including criticality) Analysis (FMEA/FMECA). Here the link is very tenuous, and we're definitely talking engineering analysis. My concern is that many printed references incorrectly state things like FMEAs identify root causes. This is simply not the case; there is no automatic connection between an FMEA's failure causes and a root cause of the failure. Many failures dealt with in industrial practice may have a root cause, but the cost of using it is prohibitive. The user is advised to use great caution when reading anything connecting the two. JK August (talk) 05:55, 29 April 2014 (UTC)
I agree but most editors would disagree.
68.143.40.146 22:14, 17 October 2006 (UTC)
- I appreciate the support, but which are you agreeing with that you think most wouldn't - that root cause analysis isn't strictly an engineering term or that the page should have references to the common methods for RCA?
I would like to translate this article about RCA into German. RCA is not described in the German version of Wikipedia. Are there concerns?
I notice that this page has no discussion and the article has no mention of software problem root cause analysis for help desks. I notice that this is quite popular when one does a Google search. Anybody have any ideas for an addition?
24.183.226.168 04:00, 30 January 2007 (UTC)
I believe I started this wiki many years ago when there was no information on the subject of RCA available in Wikipedia at all. I wrote the first draft off the top of my head, with no references. My goal was to learn how Wikipedia worked and provide source material to draw in others. Nearly eight years later, I can see I accomplished one goal; I failed on the second. Others have taken and developed this subject making many corrections. However, although current interest is largely software, the origins of RCA are in studies and development of what is now called NASA in the study of rocket launch failures in the late 1950s and early 1960s. Many methods have come along that have not been well-published. They reflect different trains of RCA thought. My contributions reflect nuclear use of RCA following the accident at Three Mile Island in the 1980s. I intend to make periodic contributions again to support my second, unattained goal -- learning how Wikipedia works. I was so frustrated making insertions and changes on the first go around in 2008 that I basically quit editing. Kudos to everyone here who drew me back. JK August (talk) 05:43, 29 April 2014 (UTC)
There is no reference or citation for the importance of RCA. Has there been no scholarly work on differential outcomes for organizations that have good or bad RCA systems? I'm looking, but have not found anything. Normhowe (talk) 15:30, 30 December 2020 (UTC)
5 "schools"
The article listed 5 "schools" of root cause analysis that have their bases in safety, production, process, failure and systems. A 6th was added without changing the prior text to mention a 6th, so I removed it pending cleanup. I don't disagree with the addition made by Dalechadwick, but it didn't make sense to leave the article saying that there are 5 broadly defined schools and then listing 6. So that it isn't lost in the history, the removed text was:
clinical interventions. RCA has emerged as a key tool in healthcare cqi and qi. Often coupled with, compared to, evidence-based practices.
I'm wondering if there's some reference that discusses the 5 or if it's simply based on someone's interpretation of the field. The inclusion of the medical field seems like a legitimate addition to the list, or if that's too narrow, what about "service-based"? While the consequences of unwanted outcomes are steeper than most other service industries, the medical field is still a service industry. --Prainog 01:16, 29 March 2007 (UTC)
Causal factor tree analysis
Is causal factor tree analysis in any way different from fault tree analysis? -- Karada 15:05, 16 May 2007 (UTC)
A fault tree analysis is a deductive (top-down) method used to evaluate faults or failure events. FTA can be used as a hazard analysis tool to show what COULD happen, and it can also be used as a probabilistic risk assessment tool. A causal factor tree would only be used after an event has occurred, but a fault tree is applicable in both scenarios. —Preceding unsigned comment added by 192.35.35.34 (talk) 23:24, 5 January 2011 (UTC)
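As a rough illustration of the probabilistic use of a fault tree mentioned in this thread, here is a minimal Python sketch. It is only an assumption-laden example, not taken from any FTA standard or tool: the tree layout, the event probabilities and the name `gate_probability` are all made up for illustration, and the basic events are assumed to be statistically independent.

```python
from math import prod


def gate_probability(node):
    """Probability of a fault-tree node, assuming independent basic events.

    A node is either a basic event {"p": float} or a gate
    {"gate": "AND" | "OR", "children": [...]} (hypothetical layout).
    """
    if "p" in node:  # basic event: probability is given directly
        return node["p"]
    child_probs = [gate_probability(child) for child in node["children"]]
    if node["gate"] == "AND":  # all children must occur
        return prod(child_probs)
    if node["gate"] == "OR":  # at least one child occurs
        return 1 - prod(1 - p for p in child_probs)
    raise ValueError(f"unknown gate type: {node['gate']!r}")


# Hypothetical top event: one basic failure OR (two other failures together).
tree = {
    "gate": "OR",
    "children": [
        {"p": 0.01},
        {"gate": "AND", "children": [{"p": 0.1}, {"p": 0.2}]},
    ],
}

top_probability = gate_probability(tree)
print(f"top event probability: {top_probability:.4f}")
```

The same tree could be walked before an event (as a hazard analysis) or after one; the sketch covers only the forward probability calculation and says nothing about how a causal factor tree would be built from evidence.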
THEORY OF TROUBLE SHOOTING
(I AM PASTING HERE THE CONTENT OF MY RESEARCH FOR GENERAL COMMENTS. ONCE I GET PERMISSION FOR UPLOADING FILES, I SHALL UPLOAD THE COMPLETE RESEARCH WITH THE SKETCHES THEREIN.)
Very lengthy text pasted herein
SYNOPSIS
After observing many critical troubles in various units and systems over a period of time, it occurred to me that most of the time we have no exact methodology for analysing troubles in machines and systems, and we rely on past experience and a trial-and-error approach. During decision-making lectures at MDI Gurgaon, I conceptualized the important idea that “Trouble shooting is decision making in reverse.” In decision-making we try to alter various inputs to obtain a desirable output from the system, whereas in trouble shooting some undesirable output is before us and we are trying to find the real reason or root cause. Later on, based on various other inputs and concepts, I tried to make a comprehensive methodology for thinking through any root cause analysis, and I have used it many times to get technically verifiable root causes for many troubles. One aspect which occurred to me was that we mostly ask “What has happened?” and “How much is the quantity of the trouble parameter?”, but these questions cannot give any answer to “Why has the trouble appeared?”, which is the most important question. I am sure that this attempt will help all concerned, not only in the O&M field but in all others as well. I have seen good doctors inadvertently using these concepts for diagnosing critical ailments. Even all our elaborate schemes for the protection of transmission lines and transformers are based on the philosophy of this “theory of trouble shooting”.
THEORY OF TROUBLE SHOOTING
FIRST TWO SENTENCES OF THE THEORY
Any man-made or natural device, system or subsystem can be defined as a means by which the movement, flow or transmission of any form of living being, material, mechanical component, energy, money or information is enabled or controlled in desirable directions or is prevented in undesirable directions.
Whenever any trouble appears, the words desirable and undesirable get interchanged in the above definition, from the observer’s perspective.
1. INTRODUCTION: -
When a modern machine is running perfectly well, it gives immense satisfaction to the user, who takes it for granted that the machine will run as intended throughout its working life without any trouble. However, all machines, including complicated ones like a modern power house turbine, start giving trouble at some time or other. These troubles are of two types:
1) Troubles due to wear and tear with aging: - These troubles are very common. They arise from worn-out components and can be solved by replacing, repairing or overhauling the machine as per the manufacturer’s recommendations, which can be found in any good manual prepared by the manufacturer.
2) Critical troubles: - Though relatively rare, these are troubles in which a machine simply cannot perform any useful intended purpose unless the trouble is rectified. Running the machine with a critical trouble puts it at great risk, so no one attempts it. Frequent recurrence of some problem can also be termed a critical trouble. These troubles are seen when all the conventional methods of trouble shooting mentioned in manuals have not yielded any results.
Every mechanic and repairman solves some problem or other in his day-to-day activities. For many centuries, mankind has found solutions to troubles in one way or another. You may ask: if that is so, then what is so great about this so-called “Trouble Shooting Theory”?
For this question, kindly allow me to state that whenever any body has found reasons for any trouble, he or she has inadvertently applied the same theory described in following pages, although for relatively small problems, one may not have concentrated so well on the logic for trouble shooting, which I have attempted to do in some depth. In order to understand the concept better, I am giving an analogy here: Let us assume that there is a sphere of about 12 inches diameter, which is having a power full bulb inside; there are large no. of small 1 inch dia. Circular glasses on the surface of the sphere covering it entirely. Out of all the 1-inch dia. circular glasses, only one glass is having a large black spot on it & we are supposed to find the glass with the black spot. It will not be a difficult task & all we have to do is to scan the entire surface of the sphere & locate the black spot. Now, imagine that the sphere is rotating at say 100 rpm & we are observing it in stationary position near it. Now, the problem is not that simple & we may not be able to locate the black spot since we shall not be in a position to differentiate between normal glasses & the glass with the black spot. The total solution process becomes a bit hazy before our mind. In day-to-day troubles encountered by us, we are in a similar hazy status of mind unless the reason for the trouble is found out by any means what so ever. Now, since the rotation of the sphere cannot be stopped, let us imagine that we are having a stroboscopic light with adjustable frequency for viewing the sphere. If we synchronise the frequency of stroboscope properly, we can observe the sphere in almost stationary condition. Now our job becomes very easy, mind becomes very clear & any one can do it properly. 
The “Theory of Trouble Shooting” described later on is a similar tool in the hands of the persons concerned, by which the haziness in the mind disappears and we are able to find the real root cause of any trouble. This theory has been designed in such a way that instead of concentrating our minds on “What is happening, and how much is the extent of the trouble?” we start thinking about “Why has the trouble appeared after all?”. This is the most important question if the trouble is ever going to be solved. It needs to be appreciated that our mind will never give an answer to any “Why?” if all we are thinking about is “What?” and “How much?”.
Now, before going further, let us define what is meant by trouble, puzzle and problem. These terms are relevant in the subsequent pages of this theory.
TROUBLE: - A trouble is a problem for which no apparent idea is available about its reason or solution.
PUZZLE: - A puzzle is a problem for which some vague guidelines are available, and one has to use a hit-and-trial method to find the solution.
PROBLEM: - A problem is one for which a definite method of solution is available, and when one applies the method accurately, one can easily find the solution.
In this paper, I have tried to devise a method by which a trouble can first be converted into a puzzle, and the puzzle can later be converted into a problem. The problem can then be easily solved if one is serious and accurate with the method.
2. DOT ON PAPER: -
Before going into details, let us think about a very simple analogy, which is given for the purpose of developing the ability to observe simple events from a number of different perspectives. These perspectives are essential for solving any critical trouble. Let us consider a plain white paper without any mark on it. Now, imagine that a pencil has marked a dot on the paper.
Now suppose this dot represents a problem, which has to be removed to return the paper to its original blank state. First, suppose we have a pencil eraser with us. Then removal of the dot is nothing but the small problem of erasing it with the eraser. Secondly, imagine that instead of a pencil, we have marked the paper with a pen and we do not have an ink eraser. Now removing the dot has become difficult without a proper eraser, and the simple problem has turned into a puzzle.
Now let us consider other perspectives. Imagine we have a very, very accurate weighing machine, which can weigh even a dot of ink. With such accurate weighing and other accurate devices, imagine what changes have been effected by the simple event of putting a dot on the paper. The more you can think of, the higher the probability of successful trouble shooting later on. Some of these are listed here as examples, but you are advised to think about them for some time before reading.
1) The dot will cause the weight of the paper to increase, though by a very small amount.
2) The dot, which is visible from the front, can also be seen from the back if you hold the paper in front of a light source.
3) The ink in the pen will last for a shorter time and length after a dot has been released from it onto the paper.
4) If we have a very accurate microscopic laser thickness gauge, the thickness of the paper at the point of the dot will be found to have increased.
5) If a photograph of the paper is taken, it will transfer the dot onto a negative and later onto a positive print of the photograph.
6) If the paper is later to be marked with a pinhole at a particular point only, the dot can also serve as a marker for creating the pinhole.
I know that some of you are going to laugh at all this “bakwas”, but please believe me, the example is given with a particular purpose.
Now, after applying eraser fluid over the dot, say we were happy that the dot had been removed, when suddenly it came to somebody’s mind that it is a technical requirement that the weight of the paper must now be the same as or less than it was initially: even the weight of the eraser fluid has to be removed from it, and the dot also should not be visible. Now we have really landed in trouble. How to go about it? Here a very simple problem was first turned into a puzzle, and the puzzle was later turned into a trouble. This example is given for the purpose of developing the aptitude with which you can turn a trouble into a puzzle and a puzzle into a problem later on. Again I request you to ponder over the trouble for some time before proceeding further.
Now let me offer one possible solution for the trouble mentioned above. Since the white eraser fluid has to be retained to cover the dot, and since the weight of the paper is to be made equal to or less than its original weight, let us consider that we have a sharp blade with which we can make the paper thinner than before by rubbing the edge of the blade over it. If we make some portion of the paper slightly thinner and weigh it again, the weight of the paper may become equal to or less than its original weight.
The “dot” case thus establishes the following concept, which is going to be used extensively later on: “ANY ABNORMALITY IN ANY SYSTEM MANIFESTS ITSELF IN MORE THAN ONE WAY, AND IF A PERSON IS CAREFUL ENOUGH, HE CAN ARRIVE AT SOME CONCLUSION REGARDING THE ROOT CAUSE OF A TROUBLE BASED ON REASONING ABOUT PECULIAR OBSERVATIONS OTHER THAN THE MAIN TROUBLE”.
3. NORMAL RUNNING: -
Before going into the abnormal behaviour of any machine or system, let us think for some time about what is meant by the normal running of the machine or system, and why it is running normally.
For some persons, normal running may mean that the machine is well maintained, well operated, well designed having better quality of materials and manufacturing technique, having reputed / market leaders brand name etc. All of above are true but if one goes still deeper in to this question and concentrate on the times when the said system or types of machine did not come in to existence, one can realize that for every machine / system when it is working normally, actually it is the conceptual design of the machine which must have been flashed in to its inventor’s mind that is working properly. Any device or system on the earth, which is man made, could not have come into existence if its basic idea or conceptual design did not come into its inventor’s mind. Actually all the man made devices and systems are just the ideas which had taken concrete shape due to some one’s initial efforts. In these efforts, the designer or inventor generally must have had taken in to account all the requirements, which must be fulfilled so that the device or system will serve its intended purpose. If one can think for some time what must have been in designer’s mind for some particular requirement in a device, one can guess to some extent what might have gone wrong when a particular requirement is not being mate with resulting in the so called trouble. However, details about it will be discussed later on. Here is a quote from the book “Thought Power” by Swami Shivananda, which was quite helpful for me during development of this technique. “ Every action has a past which leads up to it, every action has a future which proceeds from it. An action implies a desire which prompted it and a thought which shaped it.” “Each thought is a link in an endless chain of causes and effects, each effect becoming a cause & each cause having been an effect and each link in the end less chain is welded out of three components desire, thought and activity. 
A desire stimulates a thought; a thought embodies itself as an act. Act constitutes the web of destiny”. On this basis, for the purpose of “Trouble Shooting” I have got the basic idea as “ Every event for a device (here word event includes each and every thing which a device or system or any component there of is subjected to undergo due to intentional or unintentional efforts by man, nature or any other associated device or system) has a past which leads up to it, every event has a future which proceeds from it.” “Each event is a link in an endless chain of causes and effects, each effect becoming a cause and each cause having been an effect”. If the above principle is correctly understood, one can attempt to diagnose a trouble by observing the nature of abnormality VERY CAREFULLY. 4. ABNORMAL OR PECULIAR OBSERVATIONS: - These observations play most important role for diagnosing a trouble. Some of the salient points of these observations may include (This list can never become exhaustive and one will have to apply individual judgment while observing). The nature of these peculiar observations may be different for various spheres of activities viz. plant O&M, civil, electrical, C&I, Computers & telecommunications, Doctors, Cooks etc. but the basic philosophy for any root cause analysis is the same as has been explained below. A. PHYSICAL OBSERVATIONS a) Sound: - Intensity, pattern or type, periodicity, relationship between sound occurring and rotational rpm of the machine and any other abnormalities occurring simultaneously with sound. b) Vibrations: - Its nature, pattern, cyclic variation in magnitude, directional variations, relation with the RPM of machine etc. c) Visual observations: - Any sign or indication any where which may indicate earlier presence of high temperature, water, steam, oil or any other chemical leakage, physical deformity, rubbing and associated smoothness or dent caused by rubbing. 
Deformity in insulation or support system or lining material, normal wear & tear or excessive wear & tear of a particular component, departure from horizontal (checked by spirit level) or vertical (checked by plumb) component from its intended orientation; deviation from intended concentricity (which may be reflected in an assembled system or during disassembly), looseness some where due to some fault in supporting system etc. Here a small case study may be mentioned. One of the vertical pump was having frequent gland failures as a major problem. Many attempts like repeated overhauling, checking of shaft straightness, replacement of Shaft, sleeve etc. a no. of times were tried but the problem remained. Ultimately, even manufacturers were called but they also could not diagnose the root cause of the problem & the problem remained. At last, when once the motor & coupling had been removed & the stuffing box was being dissembled, it was noticed by chance that the shaft is eccentric in the cavity of the stuffing box. The reason for this peculiar observation was analysed & it was suspected that column pipe encasing the shaft might not be exactly vertical & hence the shaft may also be not in vertical position causing it to become eccentric within the stuffing box. When the column pipe was checked for verticality by means of a Plumb, it was found that the column pipe was itself not exactly vertical while the base of the stuffing box was exactly horizontal. When this column pipe was fixed properly, the problem disappeared altogether. d) Jamming:- If a device can be rotated by hand, the nature of its free movement or if it is requiring more force for rotation, the position in which more force is required. 
e) Recorded parameters: - When recorders are available for some important parameters, the nature of their variation from the normal value, the relationship with other recorded parameters, the nature of the bandwidth obtained by joining all maximum and all minimum values by respective lines, the gradient of variations from normal values, etc.
f) Log sheet parameters: - All related log sheet parameters, and the extent to which any deviation has occurred. However, it must be borne in mind that when taking log sheet readings, one can never be sure of getting accurate readings at a particular time unless one has made some initial effort to ensure them.
g) Ammeters: - The periodicity and amplitude of hunting in the current, and their time relationship with abnormal sound, vibration or any other abnormality.
h) Temperature: - A temperature rise above the normal working temperature is also an indication of some abnormality in the system, most frequently observed as a rise in bearing temperature.
i) Leakages: - Any leakage of oil, water, steam, air etc. in the system may also be a peculiar observation, since it might have caused some inadequacy of lubricant, air, steam, cooling water etc., due to which a trouble might have occurred.
j) Any other observations: - Any other observations such as deformity, erosion, corrosion, improper heat treatment etc., which may or may not have a direct relationship with the abnormality but which represent a major deviation between the earlier normal running of the system and its running after the arrival of the abnormality. These observations must be obtained from a person who is conversant with the system and knows the behaviour of the machine during earlier normal running and in its subsequent “trouble-full” state. Mostly these observations come from the operating staff, including local operators who regularly monitor the machine and its behaviour.
k) For pipelines: - Pipeline layouts, nature & no.
of bends & pipe diameter play an important role in pressure drops. For pipes carrying steam or gas, unintended slight bends sometimes choke the flow of gas due to accumulation of condensed liquid at the lowest point of the pipes.
While taking these observations, one will have to be careful and use judgment to separate normal from abnormal observations. During later analysis, all the abnormal observations are going to play a great role. Apart from these observations, which are generally valid for running machines, many more peculiar observations (deviations from the ideal design state) will be seen when a big machine with a trouble is being disassembled. These can be judged and interpreted only according to the specific circumstances and the prevailing trouble, and they cannot be described here.
A small case study may be included here. One patient (a relative of mine) was putting on weight although her food habits did not account for it; many tests were conducted, but to no avail. Ultimately, a doctor observed a lump on the throat, correlated all the other observations with the disease, and advised a thyroid test. The result was that the thyroid gland had a problem, which was easily cured by taking the appropriate medicines. Here also, the basic philosophy of the “Theory” can be seen to be operative.
E. COOKING / TASTE OF FOOD RELATED PARAMETERS / OBSERVATIONS
Whenever a dish is not to the liking of the consumer, these parameters are relevant. Examples such as the quantity and quality of ingredients, whether the dish is overcooked or undercooked, the sequence of the processing steps, and the staleness or freshness of food items can also be termed peculiar observations pertaining to culinary skills.
Most of the time a person thinks only about the immediate trouble and the ways and means to overcome it as given in the manufacturer’s manual. When one does not get results as per the manual, one generally gets into the mental block that the machine or system has some problem and must be referred to the manufacturer. The manufacturer comes and checks all the dimensions, and if nothing can be concluded from that, the problem remains as it is.
Here, an analogy may serve the purpose of better understanding. When a machine is running normally, it is similar to driving on a highway with well-defined destinations given in the manufacturer’s manual. When one lands in trouble, some manuals or manufacturers do give directions for trouble shooting, which is similar to driving in a town connected to a highway, where you can ask and learn the direction in which the highway lies.
However, when some critical fault develops in the machine and some component of the system deviates from its intended function, it fails to serve its purpose, and the trouble is the ultimate effect of that abnormality in the component or subsystem. The manufacturer or designer, while preparing manuals, takes for granted that all the components will serve their purpose, and their recommendations generally extend only to the most obvious reasons for an abnormality. At that time, they cannot foresee that some particular component, on whose function all their further recommendations are based, will fail to serve its intended purpose. As an analogy, consider yourself placed in a barren field with no landmarks or guideposts in visible range and nobody to ask about the direction in which a highway might lie. You have only a vague idea of your starting point and the approximate direction of your destination. In such an event, one has to navigate by the most fundamental directional sense, provided by the sun and the stars. While travelling this way, any landmarks one comes across serve as guideposts from which one can later find the way using maps or other means. But until some landmark is visible, one cannot guess at all where one has unintentionally landed. In troubleshooting, all the abnormal observations serve as guideposts, since all of them are most probably effects of some particular problem. Although the ultimate abnormality may not reveal its reasons so concretely, these observations can point to the origin of the cause if one constructs a logical cause-and-effect (fishbone) diagram of all the abnormal observations.
Here it must be borne in mind that, just as the sun and the stars were the only guide available to a man in the vast, empty, barren field, only the most fundamental laws of physics, chemistry, electricity, electronics, medical science, etc. can give the cause-and-effect relationships for all the abnormal observations. In most of the troubles we witness, our minds become preoccupied with what is happening and how far the trouble extends. So much so that, after trying some obvious reasons that fail to explain the trouble properly, we stop thinking about its root causes and start feeling that some outside expert will come and solve it for us. Based on the expert’s credentials in the relevant field, we call the expert and follow his advice, which at times also fails to solve the trouble.
Although I initially advised doing such an elaborate exercise on paper to keep the mind focused on the problem, once you have used this technique a few times you will find that all this is redundant: you only need to work backwards in your mind from the peculiar observations to a root cause that can also explain some other observation or the critical trouble. Normally, only one critical fault develops in a machine at a particular point of time, although theoretically the possibility of two different and independent faults developing simultaneously cannot be ruled out. While determining causes and effects for each abnormal observation, a particular link may sometimes go against established conventional thinking on the issue; but it must be borne in mind that abnormal observations are a reality that has to have some explanation, although we may not be in a position to guess it at first. If, even then, you are not able to guess the root cause, let your efforts rest. This is when the subconscious mind takes over the problem, and, as has been established scientifically, the subconscious mind is at times mightier than the conscious mind. Most likely, if all your abnormal observations and the reasons for them are technically correct, some vague idea will suddenly flash in your mind which, though it seems a most unlikely cause at the beginning, will gradually be able to explain all the observations as well as the critical trouble. The job of the analyst thus becomes similar to that of a detective, the only difference being that the detective tries to find the culprit who committed the crime based on the available evidence, while the analyst tries to find the “culprit” reason for the trouble based on all the available “peculiar observations as well as critical trouble” as “evidence”.
Here again, allow me to quote some passages from the book “Thought Power” by Swami Shivananda. “Miss not any opportunity. Avail yourself of all opportunities.” “Man is certainly not a creature of environments or circumstances. He can control and modify them by his capacities, character, thoughts, good actions and right exertion.” “Through right thinking, reasoning, introspection and meditation, you will have to clarify your ideas. Then confusion will vanish. The thoughts will get settled and well-grounded.” “Hard thinking, persistent thinking, clear thinking, thinking to the roots of problems, to the very fundamentals of the situation, to the very presuppositions of all thoughts and beings is the very essence of Vedic Sadhana.”

In case you arrive at some apparently absurd reasons for the trouble, you can check the correctness of your exercise by devising tests or measurements: first predict the result by further logical “If-Then” analysis, and then verify it by measuring the parameter concerned. By now the chances are that you have arrived at the real root cause, but as a confirmation of your analysis you can start reasoning about hitherto unanalyzed observations, or you can predict that if so-and-so is the root cause, it will also affect some other parameter that has not been checked so far. If that parameter also corroborates your analysis, you can rest assured that you have actually arrived at the root cause of the problem and declare it to all concerned. Most of you have at one time or another solved several problems, and if you now reflect on it, you will find that this same theory was working in your mind when you were able to predict the reason for a particular trouble. It may be noted that by plotting fishbone diagrams we convert a trouble into a puzzle, and when we then arrive at the right conclusion, the puzzle changes into a problem.
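The “If-Then” confirmation step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the author's procedure: the `Hypothesis` class, the function names and the `measure` callback are all hypothetical, and the sample data is taken from the thyroid case study mentioned in this essay.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """A candidate root cause plus the effects it should produce (If-Then)."""
    cause: str
    predicted_effects: frozenset


def explains_all(hypothesis, observations):
    """True if every abnormal observation is a predicted effect of the cause."""
    return set(observations) <= hypothesis.predicted_effects


def unchecked_predictions(hypothesis, observations):
    """Predicted effects not yet observed: the parameters to measure next."""
    return sorted(hypothesis.predicted_effects - set(observations))


def confirm(hypothesis, observations, measure):
    """Confirm only if all observations are explained and every
    still-unchecked prediction is corroborated by a measurement.

    `measure` is a hypothetical callback: it takes a predicted effect
    and returns True if the test or measurement shows that effect.
    """
    if not explains_all(hypothesis, observations):
        return False
    return all(measure(e) for e in unchecked_predictions(hypothesis, observations))


# Illustrative data based on the thyroid case study in the essay.
thyroid = Hypothesis(
    cause="thyroid malfunction",
    predicted_effects=frozenset(
        {"weight gain", "lump on throat", "abnormal thyroid test"}
    ),
)
observed = ["weight gain", "lump on throat"]

to_check = unchecked_predictions(thyroid, observed)
confirmed = confirm(thyroid, observed, measure=lambda effect: True)
```

Here `to_check` lists the parameter that has not yet been measured (the thyroid test), and `confirmed` becomes true only when that measurement agrees with the prediction.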
As far as the two case studies mentioned above are concerned, you will appreciate that in the case of the vertical pump, the eccentricity of the shaft within the stuffing box was the peculiar observation that raised doubt about the verticality of the column pipe; when this was confirmed with a plumb line, the correctness of the analysis was confirmed without disturbing the pump. Similarly, in the second case study, the doctor observed the lump on the throat and treated it as what our theory calls a peculiar observation. When he tried to find the reason for the lump, it occurred to him that it might be due to abnormal functioning of the thyroid. This was confirmed after the required test was carried out. In both cases, as in any other successful root cause analysis, the real root cause could not have been found if the required attention had not been paid to the peculiar observations.

6. GRAPHICAL REPRESENTATION OF THE THEORY:

For graphical representation of the theory discussed above, an attempt is made as under:

a. Conventional thinking about the trouble: In conventional thinking, our mind is concentrated on the trouble itself and, based on our experience and judgment of similar troubles encountered earlier, tries to find probable reasons for the trouble one after the other. If this does not help in finding the root cause, we feel that the problem cannot be solved in spite of our best efforts and that some expert or manufacturer needs to be consulted. The manufacturer or expert has a somewhat wider horizon of data and facts pertaining to his system and thinks from that wider perspective. Sometimes even this does not solve the trouble. Then selective replacement of parts starts, which is really a trial-and-error process, and we may or may not solve the trouble, since its root cause is not understood by us.

b. Finding root cause of the trouble by means of the theory: With the theory of troubleshooting, our mind is drawn not only to the trouble itself, which is being monitored meticulously, but also to all the available peculiar observations, which are not so meticulously monitored in the conventional approach. Hence, our mind tries to find probable reasons for all the peculiar observations one by one, and there is now the maximum probability that we will reach some conclusion based on the reasoning for the peculiar observations. When a particular reason thought of for one peculiar observation is also able to explain some other peculiar observation, our mind locks onto it and tries to explain the trouble and the remaining peculiar observations with that reason. I have seen that whenever this happens, all the peculiar observations and the trouble itself can be explained logically by one particular root cause, which can be declared the real culprit for the trouble. Sometimes, instead of the real root cause, we arrive at intermediate reasons (events) that are caused by the real root cause. Then a similar exercise ultimately leads us to the real culprit, the root cause of the trouble, which can be tackled relatively easily without much trial and error. This picture of the troubleshooting process is very easy to remember, and once you have practiced this kind of analysis on a few cases, you need not go through the whole theory again and again.

7. CLOSED V/S OPEN SYSTEMS:

The above type of thinking holds good for closed systems, in which we are not able to see what is happening inside, such as a turbine, a pump or any other closed machine.
However, for open systems such as the boiler of a power plant, a ship, or any other larger system in which we can see some of what is going on inside, a slightly different approach is required, as under: In closed systems, we are able to witness only the effects of internal events (the peculiar observations), and we need not consider the effects of these observations. Only the causes of these observations need to be explored; we go only backwards to find the reasons for the peculiar observations. In open systems, some of the peculiar observations have effects that act as “input” to the system. Here we have to extend our cause-and-effect chain in the forward direction as well (particularly for observations whose effects act as input to the system). In doing so we may stumble upon some fact that is the real culprit for the trouble. Words cannot explain this phenomenon properly, but graphically it can be easily understood as under:
In a nutshell, this theory holds good for closed as well as open systems and can be understood perfectly with a little practice. Once we have found the real root cause of the trouble, its solution involves some creativity, based on the person’s experience and understanding of the system, so that the harmful effects of the trouble are minimised or avoided in case we cannot tackle the trouble itself completely with the available resources. As an example of open-system trouble, one of the units in a power station had a critical clinker formation problem. The flue gas was taken out from the boiler and used to carry coal powder to the furnace by means of individual mill fans. It was observed that a relatively large-diameter duct had been used in this unit in order to avoid choking problems (peculiar observation). The effect was that the flue gas temperature for this unit, in the zone between the coal firing level and the FG recirculation tapping point level, was relatively higher. Besides, there was a shortage of oxygen in this zone, so conditions favourable for clinkering (high temperature and a reducing atmosphere) had been created. When the air supply to this zone was increased by means of damper adjustments, the clinkering problem vanished. Here, the effect of a peculiar observation was contributing to the critical trouble, and the problem was tackled as per the method for open systems.

8. CONCLUSIONS:

By now, you must have realized some of the following conclusions derived by me, which are always true but have not so far been emphasized in these words.

1. Any machine or mechanical system behaves only logically, and logical reasoning is the only language by which a human mind can get inside information about any machine. Hence, if we develop our minds properly in this direction, we shall be able to TALK or CONVERSE with the machines.
2. At any particular time, any particular trouble will be caused by only one root cause, just as any tree can have only one seed as its origin. There can never be two seeds that have grown into one tree, and similarly there can never be two root causes for a particular trouble. If there are apparently two root causes for a particular trouble, then the chances are that one is either an effect of the other or is not really causing the trouble.
3. Whenever a machine has trouble, the trouble is the only measured and monitored parameter of the machine, but the root cause will definitely have some other, unknown effects that are not so critically monitored.
4. Any non-conventional and peculiar trouble is the ultimate result of its “genetic” circumstances, and hence each and every peculiar trouble is one of its own kind. The data bank the manufacturers have can at best give the statistical probability of a somewhat similar reason in troubles that appear very frequently in the overall population.
5. All the tests, checks and allowable deviations or tolerances applied and measured by manufacturers indicate only the health or status of the individual part at that time; they can NOT give the exact cause-and-effect relationship of why a problem arose without the intervention of the human mind. (Tests show us only how things are at present; they cannot directly tell why a particular trouble or peculiar observation happened at all.)
6. Apart from the health of the components and parts of a troubled machine being assembled, needed for deciding on Run / Repair / Replace, the biggest question for the operating agency or user who runs the machine is why exactly the critical trouble occurred in the first place, so that the same kind of trouble does not recur in the same or similar machines in future.
7. The theory of troubleshooting can take the analyst’s mind directly to the root cause of the problem, just as a guided missile reaches its target on its own.
8. One of the most outstanding features of this theory is that it gives the analyst confirmatory checks on the correctness of the analysis during the analysis itself.
9. Any root cause analysis successfully done by anyone anywhere so far IN THE WORLD is the result of INADVERTENT application of the same mental way of thinking, which I have systematically researched and for which I have laid down procedures and rules.
10. Any analysis made using this theory will lead to the solution that is most economical to implement, since we can attack the root cause of the trouble directly without any trial and error.
Sources
It needs them. Also, References.
74.77.137.241 (talk)SAB —Preceding comment was added at 22:03, 18 July 2008 (UTC)
Barrier Analysis Loop Link
The "barrier analysis" link in the body leads back to this same article. If it is a separate technique, it should have an article of its own; if not, the link should be removed here. I am inclined to think the former is the case. —Preceding unsigned comment added by 170.170.59.138 (talk) 23:09, 28 December 2008 (UTC)
Broken links
The link to the reference 4 "The Management Oversight and Risk Tree (MORT)". International Crisis Management Association. Retrieved 1 October 2014 is broken. Ppso (talk) 11:54, 14 June 2017 (UTC)
Yes a number of issues
It does look as though the entry here is obsolete but also not very informative. In the examples the tone is not encyclopedic but the sub topic could be greatly shortened to something like:
- Problem -- My car won't run
- Why #1 -- The engine won't start
- Why #2 -- There is no gasoline for the engine
- Why #3 -- Because the gas tank was not filled
- Why #4 -- Because gasoline could not be purchased
- Why #5 -- Because I'm unemployed
- Why #6 -- Because there are no jobs I'm qualified for
- Why #7 -- Because automation handles the tasks I was trained for now
Finding the root cause of a problem consists of drilling down through as many "whys" as it takes to reach the bedrock cause, though at any step along the way a corrective action can be taken. Why #6 in my example can be corrected by re-training. Why #3 can be corrected by borrowing gasoline. Why #4 can be corrected by borrowing money. And so it goes.
Point being that RCA is performed to identify the root cause but also to identify possible solutions that could have been taken at any step where the cause and contributing factors are identified.
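For illustration only, the "why" chain above can be written down as a small Python structure that keeps each step together with the corrective action suggested for it. This is a rough sketch rather than a proposal for the article: the `Why` class and the two helper functions are made-up names, and only the three corrective actions mentioned in this comment are filled in.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Why:
    """One step in the chain: a cause and an optional corrective action."""
    cause: str
    corrective_action: Optional[str] = None


problem = "My car won't run"
chain = [
    Why("The engine won't start"),
    Why("There is no gasoline for the engine"),
    Why("The gas tank was not filled", "borrow gasoline"),
    Why("Gasoline could not be purchased", "borrow money"),
    Why("I'm unemployed"),
    Why("There are no jobs I'm qualified for", "re-train"),
    Why("Automation handles the tasks I was trained for now"),
]


def root_cause(whys):
    """The last 'why' reached is treated as the bedrock cause."""
    return whys[-1].cause


def possible_actions(whys):
    """Pairs of (why number, action) for every step that has a known fix."""
    return [
        (number, why.corrective_action)
        for number, why in enumerate(whys, start=1)
        if why.corrective_action is not None
    ]


actions = possible_actions(chain)
deepest = root_cause(chain)
```

`actions` then holds the fixes available at Why #3, #4 and #6, while `deepest` is the cause at the bottom of the chain.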
I think that the examples section might benefit from a simpler format, basically a bullet-point series of examples which convey the methodology of RCA better. SoftwareThing (talk) 18:35, 23 July 2018 (UTC)
From a General Problem-Solving Strategy to RCA
This article was difficult to read. It included a lot of material describing a general problem-solving strategy, and it was difficult to see what pertained to RCA per se.
I have significantly edited it with a more Wikipedia tone, refocusing it on RCA and deleting almost everything related to corrective actions. I have also restructured the plan and added new sections. Hopefully, the article is now easier to read and clearer.
Section "Application domains" still needs more work. I have already added some material for two application domains that I know well (IT and telecoms), but the other application domains really need to be developed further.
I have also added several references, but more references are needed.
I meant to add a new section explaining various techniques for performing RCA (e.g., interviews of technicians in manufacturing, automated deductions based on monitoring data, case bases, and dependency graphs in IT), but it is getting late and I need to stop for now. We may also want to elaborate a bit on the data mining and case-based reasoning techniques that are used for automating RCA.