Editor’s Note—The paper on which this article is based was originally presented at the 2016 IEEE Product Safety Engineering Society Symposium, where it received recognition as the Best Symposium Paper. It is reprinted here, with permission, from the proceedings of the 2016 IEEE Product Safety Engineering Society International Symposium on Product Compliance Engineering. Copyright 2016 IEEE.
The international standard ISO 26262 “Road vehicles – Functional safety” was released in final form in late 2011 [1]. It provides a standardized set of processes and methods to assure the functional safety of electrical and electronic systems in the automotive domain. The standard is an evolution of the IEC 61508 functional safety standard, applied specifically to the automotive realm [2].
ISO 26262 requires a variety of processes and frameworks for safety management, safety concept development, requirements flow-down, and verification & validation activities. The standard also requires quantified metrics to be calculated for safety-related systems.
Of particular interest is the Probabilistic Metric for Hardware Failure (PMHF; see note 1), which represents a calculated estimate of the rate of hazard occurrence due to random hardware failures. This value must be calculated for systems rated at a high Automotive Safety Integrity Level (ASIL; see note 2). Specifically, systems rated at ASIL C or ASIL D must achieve targets such as those proposed by the standard and listed in Table 1.
Note 1: Alternate means are available within the standard, for example the Failure Rate Class discussed in ISO 26262-5, clause 9.4.3; however, such methods are not the topic of this paper.
Note 2: Each ASIL specifies the applicable requirements of ISO 26262 and the safety measures needed to avoid unreasonable risk, with D being the most stringent and A the least stringent.
ASIL | PMHF Requirement | Note |
D | < 1E-08/hr (10 FIT, note 3) | required |
C | < 1E-07/hr (100 FIT, note 3) | required |
B | < 1E-07/hr (100 FIT, note 3) | recommended |
A | – | no requirement |
Table 1: PMHF requirements in ISO 26262
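As a rough illustration of how the targets in Table 1 might be used in practice, the following Python sketch stores them as a lookup and checks a candidate PMHF value against them. The function names and the candidate value of 6.0E-09 per hour are assumptions made for this sketch and do not come from the standard.

```python
# Targets from Table 1, in failures per hour.
# None means the table lists no requirement for that ASIL.
FIT = 1e-9  # 1 FIT = 1E-09 failures per hour

PMHF_TARGET_PER_HOUR = {
    "D": 1e-8,   # required
    "C": 1e-7,   # required
    "B": 1e-7,   # recommended
    "A": None,   # no requirement
}


def to_fit(rate_per_hour):
    """Convert a rate in failures per hour to FIT."""
    return rate_per_hour / FIT


def meets_target(asil, pmhf_per_hour):
    """Return True if the PMHF is strictly below the Table 1 target."""
    target = PMHF_TARGET_PER_HOUR[asil]
    if target is None:
        return True
    return pmhf_per_hour < target


# Hypothetical candidate value, for illustration only.
candidate = 6.0e-9
for asil in ("D", "C", "B", "A"):
    print(f"ASIL {asil}: {to_fit(candidate):.1f} FIT meets target? "
          f"{meets_target(asil, candidate)}")
```

Note that the ASIL B entry is only a recommendation in Table 1, so a failed check there would call for engineering judgment rather than indicate a violated requirement.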
Fault Tree Analysis (FTA) is a method often proposed for calculation of the PMHF in real-world systems. However, FTA is a very general method, subject to a wide range of interpretations and techniques depending on the objectives of a given problem, the type of failures and faults being considered, and the terminology employed by various industries. There is not yet an accurate and well-explained practical guide to the specific techniques appropriate for PMHF calculation in the automotive industry. For example, large and complex systems, such as those that comprise real-world automotive products, are often difficult to capture in an FTA in a systematic and repeatable way. Diagnostic coverage (D.C.), for example that of an imperfect safety mechanism which may detect some but not all element faults, is often used in hardware metric calculations. However, D.C. concepts are not widely clarified in the industry literature, leaving a gap in understanding for many FTA practitioners. At lower levels of the FTA, specific frameworks for calculating the effect of single-point and dual-point faults (including dual-point latent faults) are necessary to obtain a correct PMHF estimation. All these topics are addressed here, along with a worked automotive example.
FAULT TREE ANALYSIS
FTA is a logical combination of intermediate events and basic events, which can be assembled using AND and OR logical operators to analyze the effects of component faults on system failures. In a safety analysis, the FTA typically begins with a top-level event representing a major hazardous event, and/or the violation of a safety goal or Functional Safety Requirement, as defined in ISO 26262. The analysis is then performed by deducing what conditions or events would lead to the top-level event, and in what logical combination. The method has been in use in industrial settings for several decades (see for example [3], [4], [5]). More recently, the method has been applied to automotive systems [6], [7] and suggested for wider use as an analysis framework. In some cases, the FTA may be qualitative in nature. If probabilities of the underlying lower-level events can be estimated, then an estimate of the probability can be made for the top-level event. The PMHF is just such a quantitative estimation.
Applicability and Limits on the Use of Quantitative FTA
Quantitative methodologies are a useful tool in safety assurance processes, where the objective is to reduce risk to a quantifiably acceptable level by estimating the rates of occurrence of hazardous events. Standards such as IEC 61508 are based strongly around this principle. Given such an estimate of the probability of safety-related hazards, the risk can in principle be mitigated to an acceptably-low level.
However, there are several critical limitations to such methods that must be recognized. While random failures in electronic hardware may be modeled with probabilistic methods, systematic failures (for example in deterministic software) cannot be modeled in this way. This may lead the analyst to under-represent or overlook important systematic failures [8]. There is wide variability in underlying data available for the reliability failure of electronic components, which in turn leads to calculations with a relatively wide range of uncertainty. There is also evidence of a tendency for the analyst to believe in the independence of events which are represented independently in the FTA, while objective observation would find correlation between events [9]. ISO 26262 takes steps to mitigate these limitations, for example by recognizing the primacy of process adherence in preventing and avoiding systematic faults, which are not generally quantifiable by probabilistic methods. It is important to remember that analyst judgment is a critical factor in the success of a quantified FTA. The analysis is neither a formal proof nor a validation of safety, but merely a structured record of the analyst’s best understanding.
Quantified FTA Calculation – A Simple Example
For the simple lamp circuit shown in Figure 1, assume that the safety goal or safety requirement is “prevent missing lamp indication.” This might be appropriate, for example, in a case where a lamp indication to a safety operator is a critical function of the safety concept. The system lifetime, for which the PMHF is valid and for which the components are specified, is assumed to be 9,500 hours for this example. (Note this would be a typical automotive useful-life estimate.) Failure of the lamp to indicate may be caused by various failure modes such as i) resistor R1 fails open; ii) capacitor C1 fails short; iii) lamp X and lamp Y both fail off (e.g., burn out). For these basic events, represented as circles in the FTA of Figure 2, the rate of occurrence per hour (represented by the variable λ) is estimated based on historical component data, for example using the guidance of a handbook such as reference [10] (see note 3).
(Note 3: Random hardware failure empirical data is often expressed in units of “FIT.” This unit is shorthand for “failures in time” and represents the number of failures expected in 1 billion hours of operation, so that 1 FIT = 1E-09 failures per hour.)
Using this information, the probability of failure (PoF) for the top-level event can be estimated using probability mathematics. Note that this PoF is directly related to the PMHF value required by ISO 26262. For most cases with very small component failure rates, the PMHF is calculated by dividing the PoF of the top-level event by the system lifetime. (This will be shown in more detail, later in the paper). In this simple example, the PMHF would be calculated as 5.061E-09 failures per hour or 5.061 FIT.
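The calculation for the lamp circuit can be sketched in a few lines of Python. The failure rates below are hypothetical placeholders, since the component rates used for Figure 2 are not listed in the text, so the result is not expected to reproduce the 5.061 FIT quoted above. The sketch assumes independent basic events with constant failure rates.

```python
import math

FIT = 1e-9           # 1 FIT = 1E-09 failures per hour
LIFETIME_H = 9500.0  # system lifetime assumed in the lamp example


def pof_from_rate(rate_per_hour, hours):
    """Probability of at least one failure in `hours` at a constant rate."""
    return -math.expm1(-rate_per_hour * hours)  # equals 1 - exp(-rate * hours)


def or_gate(*probs):
    """Output event occurs if any of the independent inputs occurs."""
    none_occur = 1.0
    for p in probs:
        none_occur *= 1.0 - p
    return 1.0 - none_occur


def and_gate(*probs):
    """Output event occurs only if all independent inputs occur."""
    result = 1.0
    for p in probs:
        result *= p
    return result


# Hypothetical failure rates in FIT (not taken from the paper).
rates_fit = {"R1_open": 2.0, "C1_short": 3.0,
             "lamp_X_off": 100.0, "lamp_Y_off": 100.0}
pof = {name: pof_from_rate(rate * FIT, LIFETIME_H)
       for name, rate in rates_fit.items()}

# Both lamps must fail (AND); any of the three branches defeats the goal (OR).
both_lamps_off = and_gate(pof["lamp_X_off"], pof["lamp_Y_off"])
pof_top = or_gate(pof["R1_open"], pof["C1_short"], both_lamps_off)

pmhf = pof_top / LIFETIME_H
print(f"PoF(top) = {pof_top:.3e}, PMHF = {pmhf:.3e} /h ({pmhf / FIT:.3f} FIT)")
```

With these placeholder rates the redundant lamp branch contributes very little, because the AND gate multiplies two small probabilities.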
FTA METHODOLOGIES FOR PMHF CALCULATIONS
While the simple example shown is adequate for a basic description of the method, the analysis of real-world systems includes several added complexities. This section develops an FTA methodology for PMHF estimation, using several different concepts appropriate for different stages of the analysis. An example analysis is developed in parallel to help illustrate the methodology, based on an automotive electronic power-assisted steering (EPAS) system described by Chang [11].
Top-Level Events (TLE)
Modern automotive systems are not monolithic simple systems, but rather complex multi-function systems. It is no small challenge to correctly structure the top-level functions and malfunctions which may contribute to safety goal violation. The method described here is based loosely on the methodology described by Freese [12], and supported by the SAE J2980 standard [13].
In safety analysis, top-level events of fault trees represent hazardous scenarios. In the terminology of ISO 26262, the top-level event for a given item is typically the violation of a safety goal. In an ideal case, a model of item-level functions and malfunctions is developed as a part of the functional safety concept and item definition stages. If such information is not available, it can be built as a first step within the FTA. To build the top levels of the FTA, we must understand the core functions of the system. Functions can often be described as intentional or commanded transitions between states. Safety goal violations are often defined by an unintended transition of a core function between different system states, or by missing transitions which are intended but not achieved.
The various system states of an EPAS could be, for example 1) steer assist ON in left direction; 2) steer assist ON in right direction; and 3) steer assist OFF. The core functions are the allowed transitions between each state. All possible operating and non-operating conditions of the system (system operating at various speeds, system failing to operate etc.) should be covered within these core function transitions. Incorrect transitions or failure to transition when intended can be enumerated as potential malfunctioning behaviors. These core functions and malfunctions are shown for the EPAS example in Figure 3a. Often the malfunctions identified in this way will be somewhat redundant, such that the analysis lends itself to some aggregation. For example, malfunctions “Unintended assist to the left” and “Unintended assist to the right” can be aggregated as one hazard “Unintended assist.” This aggregation step requires some judgment by the analyst. For example, while left and right may be aggregated into a single malfunction category, the malfunction “loss of assist” is distinct from “assist stuck off” because the two conditions present different situations and risks to the driver. Using such judgments, an aggregated list of hazards is assembled as shown in Figure 3b. These Hazards would represent the top-level events of four different FTAs. We will further develop hazard H2 – Unintended Assist in future sections of this paper.
Intermediate Level Events (ILE)
Once the top level event is identified, the corresponding system structure must be built within the FTA. The objective is to build a structure that contains all relevant random hardware failures that could contribute to the top-level event. The low level events of the quantitative FTA will consist of random hardware failures of system components that contribute to the hazardous nature of an unintended transition of a core function. Any system or sub-system can be structured according to its inputs and outputs.
At the highest level, a system incorporates sensed inputs, some logic processing to combine and manipulate inputs, and a set of actuation outputs. In many systems it is convenient to further distinguish between sub-elements. For example, “processing” may include signal A/D conversion, logic processing, and provision of an output signal. The most general version of a system block diagram is shown in Figure 4. If we can arrange major subsystems in series as shown, then we can consider that failures of a given block can be due to one of two categories of faults: i) faults within the block itself; or ii) faults in the input to the block. This logic can be shown in an FTA structure in Figure 5.
This framework for developing intermediate-level events can be demonstrated with a practical automotive EPAS example as shown in the architecture diagram of Figure 6. The EPAS utilizes an electric motor to provide assistance for steering. To determine the proper level of assist, the system incorporates sensors for vehicle speed, driver-applied torque, and steering wheel rotational position, which send information to the microcontroller. Based on these values, the controller can determine and deliver a suitable assistive torque. A relay and an H-bridge driver are used for actuating the MOSFET switches in the H-bridge circuit. Based on the PWM signal output from the microcontroller, specific MOSFET switches are actuated and the DC motor operates in the desired mode and torque value.
A fault tree representing this architecture is shown in Figure 7. Each intermediate-level event (ILE) can consist of either faults within the element itself, or faulted inputs to the element, which in turn are caused by faulty elements upstream. The structure is repeated until all relevant elements involved are listed.
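The repeated pattern "fault within the block OR faulty input to the block" lends itself to a recursive construction. The Python sketch below builds such a tree for a simplified signal chain loosely based on the description of Figure 6. The block names, the `Block` class, and the chain itself are assumptions for illustration and are not a reproduction of Figure 7.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Block:
    """A system element with the upstream blocks that feed its inputs."""
    name: str
    inputs: List["Block"] = field(default_factory=list)


def build_ile(block):
    """Intermediate event: fault within the block OR a faulty input."""
    children = [{"event": f"Fault within {block.name}"}]
    children += [build_ile(src) for src in block.inputs]
    return {"event": f"Faulty output from {block.name}",
            "gate": "OR", "children": children}


def basic_events(node):
    """Collect the leaf events (nodes without children) of the tree."""
    if "children" not in node:
        return [node["event"]]
    leaves = []
    for child in node["children"]:
        leaves += basic_events(child)
    return leaves


# Simplified, hypothetical chain: sensors -> microcontroller -> driver
# -> H-bridge -> motor.
sensors = [Block("vehicle speed sensor"), Block("torque sensor"),
           Block("steering position sensor")]
mcu = Block("microcontroller", inputs=sensors)
driver = Block("H-bridge driver", inputs=[mcu])
h_bridge = Block("H-bridge", inputs=[driver])
motor = Block("DC motor", inputs=[h_bridge])

tree = build_ile(motor)
for event in basic_events(tree):
    print(event)
```

Each leaf produced this way is only a placeholder for the basic events developed in the next section, where the failure modes relevant to the safety goal are selected.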
Bottom-Level Events or Basic Events
At the bottom-most levels of the fault tree, root causes of failures are described, at a level low enough that no contributing events are needed. In some cases, the elements and intermediate events listed in Figures 3 and 4 are simply divisible into such root causes. In other cases, they may consist of circuitry modes which require additional logical gates. Since the objective of the analysis is to analyze contributions to safety goals, only the failure modes that contribute to safety goal violations should be listed.
Single-Point Faults: If the contribution is direct, so that the failure mode would lead to violation of a safety goal, it is termed a potential “single-point fault.” Determination of potential single-point faults should be a priority in the analysis and serves as a defining first step in the definition of basic events. In the case of the EPAS example, we consider overcurrent failure of the H-bridge driver to be a potential single-point fault, since it could lead directly to an unintended steering assist condition. As noted earlier, the rates of such a failure mode may be estimated using readily available sources. For the H-bridge driver employed in this example, a failure rate of 28 FIT is estimated using reference [10].
Safety Mechanisms and Diagnostic Coverage: After potential single-point faults are found, the analysis should consider safety mechanisms that have the capability to protect against the SPF in question. A safety mechanism is a technical solution to detect faults or control failures in order to achieve or maintain a safe state. Examples of safety mechanisms include input and output monitors, ECC, CRC checks, range checks, watchdog functions, and the like. In the example case of the EPAS H-bridge driver, the potential single-point failure “H-bridge overcurrent” is protected by a safety mechanism referred to as “SM1.” SM1 is a rationality check, using the current sensors shown in Figure 6. In the case of implausibly high current which would indicate an unintended motor action, the safety mechanism shuts down power by opening the contacts in the relay assembly. The safety mechanism is comprised of all the hardware and software that directly implements this function, including the current sensors, the microcontroller, software to perform the rationality check, and the independent current-shut-down contained in the relay-switching assembly. Indeed, the reason for the “safety mechanism” concept is to aggregate multiple hardware and software elements into a single construct. It is important to estimate the systematic effectiveness of the safety mechanism. For this, a value of Diagnostic Coverage (D.C.) is determined, based on test data or available references such as [1] and [2]. Diagnostic Coverage is the percentage of a given element’s faults that can be detected and mitigated using the safety mechanism. For example, Rationality Check for an H-bridge driver is estimated to have a D.C. = 90%, based on ISO 26262 part 5 Annex D [1]. 
Critically, this value represents the systematic capability of the safety mechanism; it does not reflect the potential for random hardware faults (e.g., faults within the current sensor or other hardware elements) which might render the safety mechanism ineffective. These random faults within the safety mechanism will be addressed as potential dual-point faults.
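Using the values quoted for the H-bridge driver (28 FIT, D.C. = 90%), the split between the uncovered and covered portions of the failure rate can be sketched as follows. The function name is an assumption for this sketch, and the split reflects only the systematic coverage, not random faults within the safety mechanism itself.

```python
def split_by_coverage(rate_fit, dc):
    """Split a failure-mode rate into (uncovered, covered) portions.

    rate_fit: failure rate of the failure mode, in FIT
    dc:       diagnostic coverage as a fraction between 0 and 1
    """
    if not 0.0 <= dc <= 1.0:
        raise ValueError("diagnostic coverage must be between 0 and 1")
    uncovered = rate_fit * (1.0 - dc)  # remains as a residual fault
    covered = rate_fit * dc            # becomes a potential dual-point fault
    return uncovered, covered


# H-bridge driver overcurrent: 28 FIT with D.C. = 90%, as quoted in the text.
uncovered_fit, covered_fit = split_by_coverage(28.0, 0.90)
print(f"uncovered: {uncovered_fit:.1f} FIT, covered: {covered_fit:.1f} FIT")
# prints: uncovered: 2.8 FIT, covered: 25.2 FIT
```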
Dual Point Faults: In cases where a safety mechanism protects against a potential single point fault, it is technically feasible to incur a random hardware failure in the element under consideration, and additionally in an element of the safety mechanism. Therefore, the portion of the element failure that is successfully covered by the safety mechanism is typically recorded as a potential dual-point fault.
In Figure 8, the structure of potential single-point and dual-point faults is shown. The Single Point Failure Section, in bold at the left, represents the combination of random hardware failure of the H-bridge (E501) AND the systematic inability of the safety mechanism to diagnose and/or react to such failure (E502). Note that, per the definitions of ISO 26262, the actual fault itself is not referred to as single-point when covered by a safety mechanism. This combination is referred to as a “residual fault,” which is the portion of a potential single-point fault that remains uncovered by a safety mechanism, if such a mechanism is in place.
The dual-point portion of the fault tree, shown in italics at the right of Figure 8, contains safety-related potential failures of the H-bridge driver IC that are covered by the diagnostic coverage, AND random hardware failures within the safety mechanism. For this portion to violate a safety goal, the following sequence of events has to occur:
1) An independent random hardware failure occurs in an element of the safety mechanism (for example, in the relay assembly). This fault does not violate a safety goal on its own.
2) The H-bridge driver IC then fails. Since the safety mechanism has already been rendered ineffective, the H-bridge fault will not be detected or controlled, and the safety goal will therefore be violated.
The summary fault tree is shown in Figure 8. Note that the probability of each basic event is calculable, with the exception of E504. Event E504 is not a basic event but is itself a combination of events to be described in the following sections.
Basic Events for Random Hardware Failures in Safety Mechanisms
To formally complete the fault tree, it is necessary to further develop the underlying random hardware faults within the safety mechanism itself. In the terminology of Figure 8, it is necessary to further develop the tree below event E504. This event represents the probability that a random hardware failure has occurred in the safety mechanism itself, rendering the safety mechanism ineffective.
Recall that the safety mechanism SM1 is a rationality check on the current entering the motor, which is performed by software on the microcontroller based on current sensor input. In the case of dangerous inconsistency between commanded and actual current, which might indicate an unintended steer assist, SM1 includes an independent shut-down of power to the motor via the relay switching assembly. Based on this description, SM1 could be impacted by random faults within the microcontroller, the relay switching assembly, or the current sensors. For the scope of this paper, we only consider hazardous faults of the relay switching assembly as shown in Figure 9. A similar structure can be drawn for latent faults in the microcontroller or the current sensor. To further develop these events, we can make some additional design assumptions regarding additional safety mechanisms within the system. Specifically, Safety Mechanism SM2 is defined as relay start-up testing. In our example, the system is designed with start-up testing to verify the relay can actively switch on/off when commanded. This check is run at every start-up event, which is assumed to be performed on average every hour in a typical automotive analysis. With these assumptions, we can build the final lowest-level events related to relay fault within the safety mechanism. Specifically, SM1 may be impacted by two distinct scenarios of relay fault:
- the relay may be faulted in a condition where the safety mechanism SM2 has been systematically unable to detect the condition. This condition is represented by the combination of events E302 and E303 in Figure 9.
- the relay may be in a non-faulted condition and pass the start-up test of SM2, but then enter a faulted state during vehicle operation. This condition is represented by the combination of E304 and E305 in Figure 9.
PMHF CALCULATION USING FAULT TREE METHODS
As noted earlier in the section entitled “Fault Tree Analysis” the PMHF value is calculated using the total probability of the top-level event in the FTA. This probability has several contributors as listed below. After those contributors are summarized, the final equation for PMHF is defined along with some notes on useful simplifications for real-world practice.
Single-Point and Residual Faults (PoFRF)
In general, for residual faults and potential single-point faults: if X% is the diagnostic coverage provided by a safety mechanism and the total probability of failure of the element is $\mathrm{PoF}_{E}$, then the probability of failure of the element due to residual faults is $\mathrm{PoF}_{RF} = \mathrm{PoF}_{E}\,(1 - X\%)$. The remaining portion, $\mathrm{PoF}_{E}\,X\%$, is carried over for calculating the probability of failure due to dual-point faults.
Dual-Point Faults (PoFDPF)
In general, for potential dual-point faults, contributions to random hardware failures of the safety mechanism itself can be divided into two sub-categories.
- Failures of a safety mechanism that are “latent” (e.g., not detectable) – $\mathrm{PoF}_{LSM,T}$: This sub-category represents the scenario in which there is no diagnostic coverage for faults within the safety mechanism during the system lifetime, meaning that diagnostics are incapable of controlling this kind of fault. These are so-called latent safety mechanism faults.
- Failures of a safety mechanism that are detectable, but occur within the diagnostic test interval – $\mathrm{PoF}_{LSM,t}$: This sub-category represents the scenario in which diagnostic coverage for latent faults is capable of covering the faults, but an element of the SM incurs a random fault inside the test interval. In this case, the controller successfully performs the check at system power-on and determines that there is no failure of the SM; the SM then fails in operation, before the next power-up test.
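Combining both sub-categories, we can compute the total probability of a latent failure of the safety mechanism as

$$
\begin{aligned}
\mathrm{PoF}_{LSM} &= \mathrm{PoF}_{LSM,T} + \mathrm{PoF}_{LSM,t} \\
&= \mathrm{PoF}_{SM,T}\,(1 - Y\%) + \mathrm{PoF}_{SM,t}\,Y\%
\end{aligned}
$$

so that the probability of failure due to dual-point faults is $\mathrm{PoF}_{DPF} = \mathrm{PoF}_{E}\,X\%\,\mathrm{PoF}_{LSM}$, where $\mathrm{PoF}_{SM,T}$ is the probability of failure of SM1 over the system lifetime, $\mathrm{PoF}_{SM,t}$ is the probability of failure of SM1 over the diagnostic test time interval, and Y% is the diagnostic coverage of safety mechanism SM2 (e.g., the mechanism that monitors for faults in elements of SM1).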
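The two sub-categories can be evaluated numerically as in the Python sketch below. The relay failure rate of 10 FIT and the SM2 coverage of 90% are hypothetical values chosen for illustration. The 9,500-hour lifetime is borrowed from the lamp example and the one-hour test interval from the start-up test assumption stated earlier. The probability over the test interval is taken here as the full-interval value, which is an assumption of the sketch.

```python
import math

FIT = 1e-9


def pof(rate_per_hour, hours):
    """Probability of at least one failure in `hours` at a constant rate."""
    return -math.expm1(-rate_per_hour * hours)  # equals 1 - exp(-rate * hours)


def latent_sm_pof(sm_rate_per_hour, lifetime_h, test_interval_h, y):
    """Probability that the safety mechanism has failed without being caught.

    y is the coverage (fraction between 0 and 1) of the mechanism that
    tests the safety mechanism, here the start-up test SM2.
    """
    pof_sm_lifetime = pof(sm_rate_per_hour, lifetime_h)
    pof_sm_interval = pof(sm_rate_per_hour, test_interval_h)
    undetectable = pof_sm_lifetime * (1.0 - y)   # never found by the test
    within_interval = pof_sm_interval * y        # fails between two tests
    return undetectable + within_interval


# Hypothetical relay data: 10 FIT, SM2 coverage 90%.
result = latent_sm_pof(10.0 * FIT, lifetime_h=9500.0,
                       test_interval_h=1.0, y=0.90)
print(f"probability of latent safety mechanism failure = {result:.3e}")
```

With these inputs the uncovered lifetime term dominates by roughly three orders of magnitude, which suggests that the coverage of the start-up test matters far more than the length of the test interval in this regime.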
Total Probability of Failure for a Single Element
For a given element, we are concerned with its total contribution to hazardous failure, $\mathrm{PoF}_{E,H}$, which includes both the residual (single-point) and dual-point portions. The total probability of failures contributing to the hazardous top-level event is simply the sum of the two. For a given element:
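$$
\begin{aligned}
\mathrm{PoF}_{E,H} &= \mathrm{PoF}_{RF} + \mathrm{PoF}_{DPF} \\
&= \mathrm{PoF}_{E}\,(1 - X\%) + \mathrm{PoF}_{E}\,(X\%)\,\mathrm{PoF}_{LSM} \\
&= \mathrm{PoF}_{E}\,(1 - X\%) + \mathrm{PoF}_{E}\,(X\%)\,\left[\mathrm{PoF}_{LSM,T} + \mathrm{PoF}_{LSM,t}\right] \\
&= \mathrm{PoF}_{E}\,(1 - X\%) + \mathrm{PoF}_{E}\,(X\%)\,\left[\mathrm{PoF}_{SM,T}\,(1 - Y\%) + \mathrm{PoF}_{SM,t}\,Y\%\right]
\end{aligned}
$$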
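The element-level equation translates directly into code. In the sketch below, the element probability uses the 28 FIT H-bridge driver rate over an assumed 9,500-hour lifetime with 90% coverage. The safety mechanism probabilities and the 90% coverage of the second mechanism are hypothetical, so the resulting split between the two portions is illustrative only.

```python
def element_hazard_pof(pof_e, x, pof_sm_life, pof_sm_interval, y):
    """Return (residual, dual_point) contributions of one element.

    pof_e:           probability of failure of the element
    x:               coverage of the safety mechanism protecting the element
    pof_sm_life:     PoF of the safety mechanism over the system lifetime
    pof_sm_interval: PoF of the safety mechanism over the test interval
    y:               coverage of the mechanism testing the safety mechanism
    Coverages are fractions between 0 and 1.
    """
    residual = pof_e * (1.0 - x)
    pof_lsm = pof_sm_life * (1.0 - y) + pof_sm_interval * y
    dual_point = pof_e * x * pof_lsm
    return residual, dual_point


# H-bridge driver: 28 FIT over an assumed 9,500 h (linear approximation).
pof_e = 28e-9 * 9500.0
# Safety mechanism values below are hypothetical.
residual, dual_point = element_hazard_pof(
    pof_e, x=0.90, pof_sm_life=9.5e-5, pof_sm_interval=1.0e-8, y=0.90)

total = residual + dual_point
print(f"residual   = {residual:.3e} ({residual / total:.4%} of total)")
print(f"dual-point = {dual_point:.3e} ({dual_point / total:.4%} of total)")
```

The dual-point term is a product of two small probabilities, so it stays small relative to the residual term unless the coverage x is very close to 100%.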
Full Fault Tree
It is not possible to generalize an equation for any given fault tree, since system elements can be configured in a huge variety of ways. For the case described here and summarized in Figure 6, when major elements are aligned in series without system-level full redundancy, we can inspect the fault tree in Figure 7 to determine the probability of failure of the top-level event. Using the simplification noted previously for relatively low probabilities of failure, we can estimate the probability as simply the summation
$$
\mathrm{PoF}_{TLE} = \sum_{i} \mathrm{PoF}_{E_i,H}
$$
where the summation index i runs over the full set of elements that could be single-point faults. Put more simply, the framework provided in the section “FTA Methodologies for PMHF Calculations” can be repeated for each potential single-point fault, until the full tree is populated and the top-level event probability is calculated.
Calculating Probabilistic Metric for Hardware Failure – PMHF
ISO 26262 defines PMHF to be the average probability of failure per hour over the operational lifetime of the item. The average probability of failure is calculated using the FTA as described above. If the failure rate is suitably low relative to the system life (see note 4), we can simply estimate the PMHF by dividing the probability of the top-level event by the system lifetime T.
(Note 4: PoF over a vehicle lifetime is modeled as an exponential distribution, such that $\mathrm{PoF} = 1 - e^{-\lambda T}$, where T is the system lifetime and λ is the constant failure rate over time T. For very small values of λ, such as are typical for modern electronic systems, PoF can be estimated as $\mathrm{PoF} = \lambda T$ with good practical accuracy (see [14]). For components in a periodic proof test interval, PoF can be estimated as $\mathrm{PoF} = \lambda t/2$, where t is the proof test interval, if components are assumed to be repaired instantaneously (see [15]).)
This approximation is simply expressed as
$$
\mathrm{PMHF} = \frac{\mathrm{PoF}_{TLE}}{T_{\text{system-life}}}
$$
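The quality of the low-rate approximation can be checked numerically. The sketch below compares the exponential model with the linear estimate for a hypothetical aggregate rate of 5 FIT over 9,500 hours, then divides by the lifetime to obtain the PMHF. Both numbers are assumptions chosen for illustration.

```python
import math

FIT = 1e-9


def pof_exact(rate_per_hour, hours):
    """Exponential model: PoF = 1 - exp(-rate * hours)."""
    return -math.expm1(-rate_per_hour * hours)


def pof_linear(rate_per_hour, hours):
    """Low-rate approximation: PoF = rate * hours."""
    return rate_per_hour * hours


def pmhf_from_pof(pof_tle, lifetime_h):
    """Average probability of failure per hour over the lifetime."""
    return pof_tle / lifetime_h


rate = 5.0 * FIT      # hypothetical aggregate rate of the top-level event
lifetime_h = 9500.0   # lifetime used in the lamp example

exact = pof_exact(rate, lifetime_h)
linear = pof_linear(rate, lifetime_h)
rel_err = (linear - exact) / exact

print(f"exact PoF  = {exact:.6e}")
print(f"linear PoF = {linear:.6e}")
print(f"relative error of linear estimate = {rel_err:.2e}")
print(f"PMHF = {pmhf_from_pof(exact, lifetime_h) / FIT:.4f} FIT")
```

The linear estimate is always slightly larger than the exponential value, and the relative difference is roughly half the product of rate and lifetime, so it only becomes noticeable for much higher rates or longer lifetimes.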
DISCUSSION OF RESULTS
Relative Scale of Single-Point versus Dual-Point Events
Re-visiting the fault tree of Figure 8 and the final equation for $\mathrm{PoF}_{E,H}$, it is evident that the left side of the tree (reflecting the impact of residual or single-point faults) is the dominant side quantitatively, in any system where element failure probabilities are relatively low. In a full implementation of the analysis summarized here, using representative values from an automotive system, 99.98% of the specific PoF for event E05 comes from the single-point side of the tree, and only 0.02% comes from the dual-point side. It is typically a reasonable assumption, especially early in the analysis when details of dual-point faults are not yet well quantified, to estimate PMHF as
$$
\mathrm{PMHF} = \frac{\sum_{i} \mathrm{PoF}_{E_i}\,(1 - X_i\%)}{T_{\text{system-life}}}
$$
where the index i reflects each element that is a potential single point fault. This would be the equivalent of computing the top-level event probability without consideration of any potential dual-point faults, detected or otherwise. But while this estimate is typically close to the actual computed value, it is not conservative, as it essentially ignores the dual-point contribution.
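A minimal sketch of this early-stage estimate is shown below. Only the H-bridge driver entry (28 FIT, 90% coverage) comes from the text. The other two elements, their rates, and their coverages are hypothetical, and the element probabilities use the linear low-rate approximation.

```python
FIT = 1e-9
LIFETIME_H = 9500.0  # assumed system lifetime

# (failure mode, rate in FIT, diagnostic coverage as a fraction)
elements = [
    ("H-bridge driver overcurrent", 28.0, 0.90),   # values from the text
    ("torque sensor offset", 15.0, 0.99),          # hypothetical
    ("microcontroller output stuck", 50.0, 0.99),  # hypothetical
]


def simplified_pmhf(elements, lifetime_h):
    """Sum of residual probabilities divided by lifetime.

    Dual-point contributions are ignored, so the result is not conservative.
    """
    pof_residual = sum(rate_fit * FIT * lifetime_h * (1.0 - dc)
                       for _, rate_fit, dc in elements)
    return pof_residual / lifetime_h


estimate = simplified_pmhf(elements, LIFETIME_H)
print(f"simplified PMHF estimate = {estimate / FIT:.2f} FIT")
```

Because the lifetime cancels under the linear approximation, this estimate reduces to the sum of the residual failure rates, which makes it quick to evaluate before the dual-point branches are developed.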
Conservative Approximation of PMHF
Some alternate methods, notably including Failure Modes, Effects, and Diagnostics Analysis (FMEDA), provide a summed probability of failure for all latent faults (reflecting, for example, the sum total of PoF of event E504 in Figure 8, and all elements similar to it in the analysis). This value, multiplied by the (simple to calculate) total failure rate of all safety-related failure modes, provides a conservative estimate of the dual-point PoF:
Note that, similar to prior equations, the index i reflects summation for each element which could be a single point failure. The index j reflects summation of all safety-related elements, whether they are potential SPFs or not. This conservative approximation supposes that any potential single point fault i may be covered by a safety mechanism, which in turn may be subject to the failure of any element j within that safety mechanism. A full FTA, of the type described in prior sections, would be required to determine the detailed specific combinations of faults that lead to the top-level event. This approximation is often used as a conservative summary of an FMEDA calculation, which is less precise but sometimes more tractable to calculate for complex systems.
SUMMARY
This paper demonstrated a structured and systematic quantitative fault tree analysis while showing various techniques for calculating the PMHF considering both single-point and dual-point latent faults. A real-world automotive example was used, and practical methods provided a step-by-step approach to capture all the functions and failure modes of the system. The techniques were reusable in different sections of the fault tree, making the analysis simpler and easier to understand, as well as providing a defined linkage between the system functions and failures.
In the later portion of the fault tree, the same fault tree methods were used to calculate the overall PMHF value. The results showed the importance of dual point failures in the PMHF calculation and also the importance of proof test time interval between diagnostic tests. These values may be very small, but for a larger and complex system, the final contribution of these values could be significant and should not be ignored. The quantitative analysis also discussed variances between diagnostic coverages and failure of safety mechanisms and also the significance of failure rates and probability of failure.
ACKNOWLEDGMENTS
The authors would like to acknowledge the developers of Medini Analyze software which was used to perform the fault tree analyses shown here. Additionally, our colleague Jatin Goyal of Clemson University was instrumental in several key discussions in the development of this work, for which we are grateful.
REFERENCES
[1] ISO 26262:2011, “Road vehicles – Functional safety,” International Organization for Standardization, first edition.
[2] IEC 61508:2010, “Functional safety of electrical/electronic/programmable electronic safety-related systems,” International Electrotechnical Commission, second edition.
[3] Lee, W. S., Grosh, D. L., Tillman, F. A., and Lie, C. H. “Fault Tree Analysis, Methods, and Applications – A Review.” IEEE Transactions on Reliability 34.3 (1985): 194-203.
[4] Volkanovski, Andrija, Marko Čepin, and Borut Mavko. “Application of the fault tree analysis for assessment of power system reliability.” Reliability Engineering & System Safety 94.6 (2009): 1116-1127.
[5] Dugan, Joanne Bechta, Salvatore J. Bavuso, and Mark Boyd. “Dynamic fault-tree models for fault-tolerant computer systems.” IEEE Transactions on Reliability 41.3 (1992): 363-377.
[6] McKelvin, Mark L., and Alberto Sangiovanni-Vincentelli. “Fault Tree Analysis for the Design Exploration of Fault Tolerant Automotive Architectures.” No. 2009-01-1377. SAE Technical Paper, 2009.
[7] Campean, Felician, and Ed Henshall. “A function failure approach to fault tree analysis for automotive systems.” No. 2008-01-0846. SAE Technical Paper, 2008.
[8] Machol, Robert E. “The Titanic Coincidence.” Interfaces 5 (5): 53-5, 1975.
[9] Leveson, Nancy G. “Engineering a Safer World: Systems Thinking Applied to Safety (Engineering Systems).” MIT Press, Cambridge (2011).
[10] IEC TR 62380:2004, “Reliability data handbook – Universal model for reliability prediction of electronics components, PCBs and equipment,” International Electrotechnical Commission, first edition.
[11] Hyuk-Jun, Chang. “Design and Development of a Functional Safety Compliant Electric Power Steering System.” Journal of Electrical Engineering and Technology 10.4 (2015): 1915-1920.
[12] Freese, W. “System Failures Frame Analysis Deriving Fault Trees from Function Definitions and System Specification.” Medini Analyze User Conference, Detroit, Michigan (2015).
[13] SAE J2980, “Considerations for ISO 26262 ASIL Hazard Classification,” SAE (2015).
[14] Kalbfleisch, John D., and Ross L. Prentice. The statistical analysis of failure time data. Vol. 360. John Wiley & Sons, 2011.
[15] Kovacs, Z., et al. “FRANTIC II-a computer code for time dependent unavailability analysis.” (1987).