How to Deal with High-Integrity Components
Part I of this article focused on the distinction between safety-critical components and high-integrity components. In this second part, we discuss the main aspects related to high-integrity components.
Safety Strategy for High-Integrity Components
In general, good high-integrity component (HIC) design is based on having several physical barriers between a hazard and any entity for whom that hazard may pose a danger. The greater the potential consequences of the risk presented by the hazard, the greater the importance of undertaking whatever mitigation measures or efforts are required to reduce the likelihood of the risk being realized. For some components, especially those used in primary circuits, achieving adequate levels of safety can require meeting very demanding requirements.
An effective HIC safety strategy treats hazard identification, analysis, and control as a continuous, iterative process applied throughout HIC development and use. Once hazards have been identified, they are addressed either by eliminating them from the HIC design if possible or, if not, by preventing or minimizing their likelihood of occurrence, controlling the risks that do occur, and minimizing their potential damage. Safety must be built into an HIC from the beginning; it cannot be added to a completed design or tested into a piece of equipment.
Qualitative rather than quantitative approaches need to be emphasized in any HIC, as quantitative procedures must necessarily omit important but unmeasurable factors and, therefore, may be misleading.
Supporting a claim of high integrity for a component requires substantial effort. The technical literature explicitly states that it is hard to build a safety case for an HIC. In broad terms, a safety case represents the full suite of documentary justification used to support the safe operation of an HIC. This includes claims being made regarding the safety of the HIC, the arguments that allow such claims to be made, and the necessary evidence to substantiate them. Each element of a safety case (e.g., design, manufacture, failure analysis, forewarning of failure) should be robust and stand on its own, with individual elements as independent of one another as possible so that a deficiency in one element does not undermine the arguments presented in the other elements.
Routine verification tests are recommended for high-integrity diode safety barriers, as well as tests of current-carrying capacity of printed circuit board connections on which HICs are mounted.
In addition, qualified non-destructive testing (NDT) during HIC manufacturing is used to ensure the absence of structurally significant defects. Derived from defect tolerance assessment, this type of testing will reliably detect defects early in the product life cycle with a suitable margin. These tests, together with other relevant routine production line tests, are the basis for building confidence for achieving full manufacture inspection qualification. The general approach is to determine the size of a limiting defect at the end of a product’s life and then apply a margin of at least two times. This number is then combined with the predicted failure rate throughout a product’s life to determine a defect size that must be rejected at the start of life.
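The margin arithmetic described above can be sketched as follows. This is a minimal illustration, not a qualified assessment method: it assumes that the "predicted failure rate" is read as the predicted in-service growth of a defect over the product's life, and the function name, the millimetre units, and the example numbers are all hypothetical.

```python
def max_acceptable_start_of_life_defect(limiting_defect_mm,
                                        predicted_growth_mm,
                                        margin=2.0):
    """Largest start-of-life defect (mm) that keeps the margin at end of life.

    Assumption: a defect of size a0 grows to a0 + predicted_growth_mm, and
    that grown size must stay below limiting_defect_mm / margin.
    """
    if margin < 2.0:
        raise ValueError("the article calls for a margin of at least two")
    if limiting_defect_mm <= 0 or predicted_growth_mm < 0:
        raise ValueError("sizes must be positive and growth non-negative")
    allowed = limiting_defect_mm / margin - predicted_growth_mm
    # A non-positive result means no defect of any size can be accepted.
    return max(allowed, 0.0)


# Hypothetical numbers: 20 mm limiting defect, 2 mm predicted growth.
reject_above_mm = max_acceptable_start_of_life_defect(20.0, 2.0)
print(f"Reject defects larger than {reject_above_mm} mm at start of life")
```

Under these assumptions, the inspection qualification target is then to detect reliably any defect larger than the returned size.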
Defect tolerance assessment methodology covers the approach and key input parameters, including selection of limiting locations, material property determination (lower bound materials toughness properties), classification of loadings and stresses, defect characterization, analysis type, failure assessment curves, materials aging, and the determination of limiting and safety-significant defects.
In this context, it is important that an HIC be designed for inspectability. The main elements of this methodology are to develop an inspection specification that defines defect types and performance requirements, develop inspection techniques to meet the requirements of that specification, and then qualify inspection procedures and personnel through a combination of technical justifications and practical trials.
The following examples illustrate the parameters of concern that need to be addressed for several typical components to qualify as an HIC [1]:
- A safety shunt may fail only by short circuit, and at least two shunts in parallel shall be used as an HIC;
- A mains transformer becomes an HIC when it has an attached fuse in the primary circuit and a current limiting resistor in the secondary output. In addition, the wire sizes and segregation must be inspected and must pass routine testing;
- A current limiting resistor HIC needs to be constructed from vitreous-enameled wires and may fail only by opening. A carbon resistor cannot be an HIC;
- Capacitor HICs (e.g., Y1 capacitors) need to have high reliability, and at least two capacitors shall be mounted in series. Electrolytic and tantalum capacitors cannot be HICs;
- A pressure sensor transmitter is an HIC if it is designed to meet safety integrity level 3 (SIL 3), per IEC 61508, the industrial functional safety standard, and has high availability (i.e., continues to work in the presence of failure).
The SIL mentioned above has four levels, from 1 to 4. It is defined by the end-user through a risk analysis of the process. The SIL is related to meeting the tolerable risk. This means that the SIL results from the combination of two factors:
- Frequency of failure occurrence, and
- Consideration of the consequences of failure (dangerous failure or safe failure).
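One common way to combine these two factors is sketched below. It assumes that the consequence analysis has already been turned into a tolerable event frequency, and that the safety function operates in low-demand mode, for which IEC 61508 expresses each SIL as a band of average probability of failure on demand (PFDavg). The function names and example frequencies are hypothetical; a real SIL determination follows the end-user's own risk analysis method.

```python
# Low-demand PFDavg bands (lower bound inclusive, upper bound exclusive).
SIL_BANDS = [
    (4, 1e-5, 1e-4),
    (3, 1e-4, 1e-3),
    (2, 1e-3, 1e-2),
    (1, 1e-2, 1e-1),
]


def required_pfd(hazard_frequency_per_year, tolerable_frequency_per_year):
    """PFDavg needed to bring the hazard frequency down to the tolerable one."""
    if hazard_frequency_per_year <= 0 or tolerable_frequency_per_year <= 0:
        raise ValueError("frequencies must be positive")
    return tolerable_frequency_per_year / hazard_frequency_per_year


def sil_for_pfd(pfd_avg):
    """Return the SIL whose band contains pfd_avg, or None if outside all bands."""
    for sil, low, high in SIL_BANDS:
        if low <= pfd_avg < high:
            return sil
    return None


# Hypothetical case: the hazard would occur once in 10 years without protection,
# and the consequence analysis tolerates it once in 20,000 years.
pfd = required_pfd(hazard_frequency_per_year=0.1,
                   tolerable_frequency_per_year=5e-5)
print("Required SIL:", sil_for_pfd(pfd))
```

A more severe consequence lowers the tolerable frequency, which lowers the required PFDavg and raises the SIL.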
In accordance with established engineering practices, the safety and reliability of the equipment’s expected function should be improved by adding elements of redundancy and diversity. Redundancy is targeted at meeting the single-failure criterion, whereby the failure of just one part of an HIC must not result in the failure of the overall HIC. Diversity aims to provide protection against common cause failure; redundant electrical power and communications are recommended.
High-Integrity Protection Systems (HIPS)
An interesting application of the HIC is represented by the high-integrity protection system (HIPS), a part of a safety instrumented system (SIS) and regarded as the last line of defense. A HIPS is an independently instrumented system, the function of which is to protect an installation from over-pressure, overheating, or overflow hazards, and differs from traditional safety systems such as relief devices. A HIPS consists of multiple barriers, including a process shutdown system (PSD) and an emergency shutdown system (ESD). It also includes processes to isolate the concerned equipment from the source of danger and mitigate the risk of harm before the design conditions are exceeded.
A typical HIPS will include 2 or 3 output elements (solenoid valves and actuators) in series and is often required to shut down within 2-3 seconds for gas and 6-8 seconds for liquids, depending on the pipeline pressure, flow rate, and the diameter and class of the pipeline. The initiator of the shutdown sequence (peak pressure surge, flow, or temperature) is detected by an input element, such as sensing transmitters for pressure, flow, or temperature. In this case, three sensors are connected to the logic solver (solid-state or programmable logic controller (PLC)), which is configured to vote with a 2oo3 logic (2 out of 3). If the predefined parameters for pressure, flow, or temperature are exceeded, the logic solver will shut down the output elements and the process. The 2oo3 configuration is usually preferred for HIPS, as it provides availability as well as reliability for the system [2].
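The 2oo3 vote described above can be illustrated with a short sketch. It is a simplified model only: the function name, the trip point, and the readings are hypothetical, and a real logic solver would also handle sensor diagnostics, bypasses, and timing.

```python
def vote_2oo3(readings, trip_point):
    """Return True (demand a shutdown) when at least 2 of 3 readings exceed trip_point."""
    if len(readings) != 3:
        raise ValueError("2oo3 voting needs exactly three sensor readings")
    votes = sum(1 for value in readings if value > trip_point)
    return votes >= 2


TRIP_POINT_BAR = 100.0  # hypothetical pressure trip point

# Genuine over-pressure with one sensor stuck low: the system still trips.
print(vote_2oo3([104.0, 12.0, 103.5], TRIP_POINT_BAR))   # True

# One sensor failed high while the process is normal: no spurious shutdown.
print(vote_2oo3([250.0, 80.1, 79.8], TRIP_POINT_BAR))    # False
```

The two cases show why the configuration supports both aims: a single sensor that fails to respond cannot block a real trip, and a single sensor that reads falsely high cannot stop the process on its own.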
Each HIPS component needs to be documented with the following information:
- Quality plan and manufacturing control plan
- Component certificates
- Component specifications
- Component reliability report
- Test procedures
- Test reports
- Dimensional drawings
The minimum SIL required for a HIPS is SIL 3. This safety integrity level is to be justified by evidence of compliance with the following requirements [3]:
- Common cause failure (or CCF, the result of one or more events, causing coincident failures of two or more separate channels in a multiple channel system and leading to system failure) is to be considered;
- Safe failure (a failure that does not have the potential to put the HIPS system in a hazardous or fail-to-function state) of the process is to be defined;
- Proof-test (a periodic test performed to detect failures in a HIPS system so that, if necessary, the system can be restored to an “as new” condition or as close as practical to this condition) intervals are to be defined and applied;
- The response time requirements for the HIPS system are to be clearly defined;
- A description of the process measurements and trip points is to be provided;
- A description of SIS process output actions and the criteria for successful operation is to be defined;
- A trip is to be ordered when the system de-energizes;
- The HIPS system is to be reset after shutdown;
- Procedures for starting up and restarting the HIPS system are to be clearly defined;
- All interfaces between the HIPS system and the other systems are to be carefully analyzed;
- The software is to be compliant with SIL 3; and
- The mean time to repair (MTTR) within which it is feasible for the HIPS system to remain compliant with SIL 3 is to be defined.
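To show how the proof-test interval in the list above bears on the SIL 3 claim, the sketch below uses the widely quoted simplified approximation for a single channel, PFDavg ≈ λ_DU × TI / 2, where λ_DU is the dangerous undetected failure rate and TI is the proof-test interval. This is an assumption-laden illustration: it ignores common cause failure, diagnostics, repair time, and imperfect proof tests, and the failure rate used is hypothetical.

```python
SIL3_PFD_UPPER_LIMIT = 1e-3  # low-demand SIL 3 band is 1e-4 <= PFDavg < 1e-3


def pfd_avg_single_channel(lambda_du_per_hour, proof_test_interval_hours):
    """Simplified average PFD of one channel: lambda_DU * TI / 2."""
    return lambda_du_per_hour * proof_test_interval_hours / 2.0


def longest_interval_for_target(lambda_du_per_hour, target_pfd):
    """Proof-test interval (hours) at which the simplified PFDavg equals target_pfd."""
    return 2.0 * target_pfd / lambda_du_per_hour


LAMBDA_DU = 1e-7  # hypothetical dangerous undetected failures per hour

yearly = pfd_avg_single_channel(LAMBDA_DU, 8760)  # one proof test per year
print(f"PFDavg with yearly proof test: {yearly:.2e}")
print("Within SIL 3 limit:", yearly < SIL3_PFD_UPPER_LIMIT)

limit_hours = longest_interval_for_target(LAMBDA_DU, SIL3_PFD_UPPER_LIMIT)
print(f"Interval at which the limit is reached: about {limit_hours:.0f} h")
```

Lengthening the interval raises the average PFD in direct proportion, which is why the interval has to be defined and then actually applied.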
Looking at the components available on the market today, it is not difficult to source the different specifications needed for an HIC. The challenge is more in the validation and verification of the end-use equipment to ensure that it fully meets the requirements outlined in the safety requirements specification (SRS) and that the SIL is maintained throughout the safety lifetime of the equipment.
Software as a High-Integrity Component
The traditional approach to producing software is to determine the requirements, implement them, and then try to ensure that there are no errors in either. The problems with this approach from a safety standpoint are that correct implementation of the requirements does not guarantee safety, and it is impossible to ensure that software is “perfect.” In fact, perfect (error-free) software does not exist. Software as a component requires special attention and needs to be treated as a high-integrity component.
It is possible that the origins of the concept of HIC can be found in software, in which the architecture focuses on the decomposition of the design into individual functional or logical components that expose well-defined communication interfaces containing methods, events, and properties. When high-integrity components are defined as those with a low likelihood of failure, it is difficult to apply this definition to software components. Some regulations consider the probability of failure of software as 100% based on the presumption that if a defect exists in the software (e.g., an error in the algorithm), and the algorithm is executed, the error will happen in any case. In other words, on this view software cannot be of high integrity.
In reality, this is not totally correct. Using adequate tools (e.g., architectural risk control measures, aspect-oriented design, logical and physical design), software components can be of high integrity, becoming fault-tolerant and reducing the opportunities for software failures that can cause an unacceptable risk of harm [3].
When designing high-integrity software, it is important to keep fault tolerance and security issues at the forefront of considerations. The three main objectives of high-integrity software are:
- Confidentiality (sometimes termed privacy) by protecting against unauthorized and/or accidental disclosure of information caused by system failures or user errors;
- Integrity by protecting against unauthorized and/or unintentional modification of information caused by system failures or user errors; and
- Availability by protecting against unauthorized withholding of information and/or failures of resources.
For example, software in safety-critical equipment requires encryption, authentication, and access control to protect against unauthorized modification. When the information from such equipment passes over an untrusted communication link, additional mechanisms must be incorporated to deal with any lost, spurious, or corrupted communications.
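A minimal sketch of such mechanisms is given below, assuming a pre-shared key, a 4-byte sequence number, and an HMAC-SHA256 tag on every frame. The frame layout, class name, and status strings are hypothetical and are not taken from any particular standard or product.

```python
import hashlib
import hmac
import struct

TAG_LEN = 32  # length of an HMAC-SHA256 tag in bytes


def make_frame(key, seq, payload):
    """Build a frame: 4-byte big-endian sequence number + payload + HMAC tag."""
    body = struct.pack(">I", seq) + payload
    return body + hmac.new(key, body, hashlib.sha256).digest()


class Receiver:
    """Classifies incoming frames as ok, gap, stale, or corrupted."""

    def __init__(self, key):
        self.key = key
        self.expected_seq = 0

    def accept(self, frame):
        if len(frame) < 4 + TAG_LEN:
            return "corrupted", None
        body, tag = frame[:-TAG_LEN], frame[-TAG_LEN:]
        expected_tag = hmac.new(self.key, body, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected_tag):
            # Bit errors and frames from a sender without the key both end here.
            return "corrupted", None
        (seq,) = struct.unpack(">I", body[:4])
        if seq < self.expected_seq:
            return "stale", None  # duplicate or replayed frame
        # A jump in the sequence number means earlier frames were lost.
        status = "ok" if seq == self.expected_seq else "gap"
        self.expected_seq = seq + 1
        return status, body[4:]


key = b"k" * 32  # placeholder key for the example only
rx = Receiver(key)
print(rx.accept(make_frame(key, 0, b"P=42")))   # ('ok', b'P=42')
print(rx.accept(make_frame(key, 2, b"P=44")))   # ('gap', b'P=44')
```

How the application reacts to a "gap" or "corrupted" status (for example, moving to a safe state after a number of consecutive bad frames) is a design decision outside this sketch.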
Following is a short list of safety features incorporated in high-integrity software:
- Dual watchdogs, such as independent watchdog and system window watchdog
- Backup clock circuitry with clock security system
- Supply monitoring
- I/O function locking
- Critical register protections with write-once registers
- Memory protection unit with enough regions to ensure data integrity from invalid behavior
- Dual stack pointer
- Fault exceptions and debug module
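As an illustration of the window watchdog in the list above, the sketch below models its behavior in software: a refresh is valid only inside a time window, so both a stalled program (refresh too late) and a runaway loop (refresh too early) are caught. In real devices this is a hardware peripheral that resets the processor; the class name and timing values here are hypothetical.

```python
class WindowWatchdog:
    """Software model of a window watchdog with times in milliseconds."""

    def __init__(self, window_open_ms, timeout_ms):
        if not 0 <= window_open_ms < timeout_ms:
            raise ValueError("need 0 <= window_open_ms < timeout_ms")
        self.window_open_ms = window_open_ms
        self.timeout_ms = timeout_ms
        self.last_refresh_ms = 0
        self.tripped = False

    def refresh(self, now_ms):
        """Try to refresh at time now_ms; return True if the watchdog is still healthy."""
        elapsed = now_ms - self.last_refresh_ms
        if elapsed < self.window_open_ms or elapsed > self.timeout_ms:
            self.tripped = True  # latched: stays tripped until a new instance is made
        if not self.tripped:
            self.last_refresh_ms = now_ms
        return not self.tripped


wd = WindowWatchdog(window_open_ms=20, timeout_ms=50)
print(wd.refresh(30))   # True: 30 ms elapsed, inside the 20-50 ms window
print(wd.refresh(35))   # False: only 5 ms since the last refresh, too early
```

An independent watchdog, clocked from a separate source, would complement this by covering the case in which the main clock itself fails.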
The control system for an HIC software module utilizes fiber optics for communication and incorporates multiple fault-tolerant redundancies and highly reliable SIL-rated, field-proven components. The HIC software design should provide the capability for full system testing as required to maintain its SIL rating over the life of the HIC software module.
To guide the development of high-integrity software, some international safety standards can be used, as follows:
- Industrial functional safety: IEC 61508 series, using SIL 3
- Safety-instrumented systems for the process industry sector: IEC 61511 and ANSI/ISA S84.01
- Machinery equipment: ISO 13849 series and IEC 62061
- Industrial cybersecurity: IEC 62443, using Security Level (SL) 4
- Programmable controllers: IEC 61131 series
- Automotive industry: ISO 26262 series, using ASIL D
- Medical application: IEC 62304, using Class C
- Railway: EN 50128 and EN 50657, using SIL 4
- User-programmable integrated circuits (i.e., FPGA and CPLD): EN 50129
The above standards provide processes and techniques to help make a claim of achieving an acceptable level of integrity and hence risk. Note also that some of these processes and techniques also help to reduce random hardware failure.
Conclusion
There are many different challenges in analyzing, designing, building, and testing an HIC. One of the main challenges is the lack of standards outlining design parameters, resulting in a high level of interaction between end-users, engineering, and contractors during the analysis and design phase.
In our opinion, safety-critical components and high-integrity components are examples that provide a better understanding of safety significance and complexity. Choosing the right path in selecting components that meet specific qualification standards gives increased confidence in component robustness for equipment where safety integrity is required. The designer needs to assess the characteristics of these components and the failure trigger stress factors (electrical, thermal, shock and vibration, aging, electrical noise, etc.) to reduce the likelihood of component failure.
Assessing the mechanisms of failure (both permanent and transient), the mechanisms for detecting these failures, and the capability to respond to a failure through a clear understanding of its propagation limits increases the probability that harmful states are not reached or, if they are reached, are detected and handled safely before losses occur.
Despite the paramount importance of safety issues in electrical equipment, purchasers and vendors of electrical equipment often have a limited understanding of safety issues and the hazards of such equipment. The result of this limited understanding is a lack of effective means to manage these issues. This is an important social, ethical, and regulatory issue that will need to be addressed constructively in order to ensure that these principles are correctly applied to electrical equipment.
References
- Independent High Integrity Safety System, ABB, 2017.
- IEC Standard 61508-1: 2010, Ed.2.0, “Functional safety of electrical / electronic / programmable electronic safety-related systems – Part 1: General requirements.”
- Software System Safety Handbook, Joint Services Computer Resources Management Group, U.S. Navy, U.S. Army, and the U.S. Air Force, 1999.