When calibrated test equipment is found in an out-of-tolerance condition, there is additional risk to all products on which it was used. It is important to understand the magnitude of the potential risk because it can lead to dangerous consumer situations and additional business costs.
Typically, quality systems have a procedure for handling non-conforming material; however, this is non-conforming instrumentation used in a process, not material produced by a process. There is little guidance available describing how to evaluate out-of-tolerance conditions, leaving engineering and quality personnel to develop their own process. When faced with an As-Found: Out-Of-Tolerance (OOT) condition, a systematic approach that identifies what the out-of-tolerance values were and when, where, and how the OOT unit was used will help concentrate your efforts on those areas that need further analysis.
What does out-of-tolerance mean? Calibration is a comparison of a metrology laboratory's standard, with a known value and uncertainty, to the unknown behavior of a unit submitted for calibration. When the unit under test (UUT) does not meet the expected test limits, it is considered to be out-of-tolerance. The type of measurement data and calibration information provided can vary widely, depending on the type of metrology laboratory performing the calibration. For instance, a National Metrology Institute (NMI), such as NIST, may provide the comparison data only, without applying any test limits or making any statement of compliance. It is up to the instrument's owner to perform any analysis and determine the compliance status of each individual piece of calibrated equipment. For the typical NMI customer, this process is relatively easy to handle because they are staffed with highly knowledgeable metrology professionals who are responsible for a limited quantity of lab standards. However, if this is the only information received by a manufacturing customer with significant quantities of test and measurement equipment, monitoring the behavior of each individual piece of equipment is impractical at best! Fortunately, the manufacturers of test equipment have done most of the analysis work. This is accomplished through the manufacturers' published specifications, which describe what behavior can be expected from the majority of the units manufactured over a typical calibration interval. It is from the Original Equipment Manufacturer's (OEM) published specifications that purchasing decisions are made. It is also from these published specifications that a commercial calibration provider will most likely determine the allowable tolerances, or test limits, for the calibration process.
Many commercial calibration providers offer a default service that uses the OEM’s published specifications; however, it is the responsibility of both the customer and the calibration lab (internal or external), to agree upon the specifications which will be used in the calibration process. A customer can request their equipment to be calibrated against any specification they provide. Once the calibration specifications have been agreed upon, the laboratory can calculate the test limits against which the laboratory results can be compared and a statement of compliance can be determined.
Statement of Compliance
Most commercial calibration customers are looking for the calibration laboratory to make a statement of compliance for the As-Found condition of the Unit Under Test (UUT). On the surface, making this determination appears rather straightforward and simple; however, upon closer examination, it becomes more complex: there are no perfect instruments and no perfect measurements. All measurements have some degree of uncertainty, and how laboratories deal with these uncertainties when making a statement of compliance differs greatly. There are several different approaches which could be used when making compliance statements. Some labs will not make a statement at all; some labs will mark the data that does not meet the limits with an asterisk or some other means, but not make a compliance statement; still other labs will make a compliance statement, quantify the results with an uncertainty value, and provide additional consumer risk information. In any case, it is critical for the customer to understand the decision rules used by the laboratory in making any compliance statements.
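To make the idea of decision rules concrete, here is a minimal sketch of two common approaches: simple acceptance, which ignores the lab's uncertainty, and a guard-banded rule that shrinks the acceptance limits by the lab's uncertainty to reduce false-accept risk. The function names and numeric values are illustrative, not taken from any particular standard.

```python
def simple_acceptance(reading, nominal, tol):
    """Pass if the observed error is within the tolerance limits,
    ignoring the laboratory's measurement uncertainty."""
    return abs(reading - nominal) <= tol

def guard_banded_acceptance(reading, nominal, tol, u_lab):
    """Pass only if the observed error is within limits shrunk by the
    lab's uncertainty, reducing the risk of a false in-tolerance call."""
    return abs(reading - nominal) <= tol - u_lab

# Hypothetical example: 1000 V test point, +/-5 V tolerance,
# lab uncertainty of 1.25 V, observed reading 1004 V.
print(simple_acceptance(1004.0, 1000.0, 5.0))              # passes
print(guard_banded_acceptance(1004.0, 1000.0, 5.0, 1.25))  # fails: inside the guard band
```

The same as-found data can therefore yield different compliance statements from different labs, which is exactly why the decision rule must be understood and agreed upon.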
The statement As-Found: In-Tolerance is generally assumed to mean that the entire instrument, including all functions, parameters, ranges, and test points, is within the calibration specifications at the time of calibration, for the stated conditions at the location where the calibration took place. An As-Found: In-Tolerance condition is a good indication the UUT was performing within expectations since the last calibration was completed. For the commercial calibration customer who has hundreds or thousands of calibrated items, the statement of compliance may be the single most important piece of information on a calibration certificate. In essence, the metrology laboratory, staffed with measurement experts, has completed an initial data evaluation and concluded the unit to be performing within the agreed-upon specifications, so the customer does not have to spend much additional time reviewing the calibration. Likewise, an As-Found: Out-Of-Tolerance (OOT) condition indicates that at least one data point in the data report drifted or shifted beyond the allowable tolerance limits, and the measurements the unit was providing may not have been accurate at some point since the previous calibration. Again, the laboratory measurement experts have indicated that this unit had a problem and needs further analysis by the customer. The As-Found: Out-Of-Tolerance statement of compliance is the flag or trigger for many quality or manufacturing engineering departments to start an investigation, evaluation, or analysis.
The object of the OOT evaluation process is to identify the at-risk products the out-of-tolerance unit touched. The following approach is not very difficult and follows a logical thought process; however, there are a few pitfalls to be aware of and to avoid. This is an investigation; I caution against having the end result already in mind. Because of the work involved, it is tempting to want the conclusion to show that there were no at-risk products. The answers to the questions in the process will lead you to the appropriate conclusion. The approach here is to eliminate products without risk and to narrow down the pool of at-risk products.
What is Out-of-Tolerance?
The first thing to do when faced with an out-of-tolerance unit is to read through the calibration certificate and data to get a firm understanding of what specifically failed calibration. A complete set of As-Found and As-Left calibration measurement data is essential for a proper out-of-tolerance evaluation. A calibration certificate without data is never a good idea, but when faced with an out-of-tolerance unit, the lack of measurement data will significantly impact the ability to conduct an analysis and quantify any potential risk. If the metrology laboratory provides an out-of-tolerance report that shows only the out-of-tolerance data, you have something on which to conduct an evaluation, but even this limited information does not provide a complete picture. A review of all the calibration data should be done to identify what functions, parameters, ranges, and test points were found out-of-tolerance. For example, let's say a voltmeter has a full scale range of 1000 V, a resolution of 1 V, and an accuracy of ± 5 V, and the unit was found to read 1008 V at full scale (out-of-tolerance) and in-tolerance at all the other readings, which were taken every 200 V. This means that during the use of the voltmeter, over its most recent calibration cycle, any measurements between 800 V and the full scale 1000 V were likely giving erroneous values to the user of the meter. Again, a full set of data will be very helpful at this point in answering questions like: how many points within a range were out-of-tolerance; was the entire range out-of-tolerance; were all the ranges even checked; was there a linearity issue; was only the zero out-of-tolerance, or only the full scale reading; were other relevant test points close to or at their limits? The quality of the calibration and the quantity of data available can have a tremendous impact on narrowing the scope of the evaluation at this point.
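The data review described above amounts to scanning the as-found data for points outside their limits. A minimal sketch, using a hypothetical data set for the voltmeter example (test points every 200 V, ± 5 V tolerance):

```python
# Hypothetical as-found data: (applied voltage V, as-found reading V)
as_found = [(0, 0), (200, 201), (400, 399), (600, 601), (800, 803), (1000, 1008)]
tolerance = 5.0  # +/-5 V published accuracy

# Flag every test point whose error exceeds the tolerance limit
oot_points = [(applied, reading, reading - applied)
              for applied, reading in as_found
              if abs(reading - applied) > tolerance]
print(oot_points)  # only the full-scale point fails, with a +8 V error
```

A scan like this over the complete data set also answers the follow-up questions: whether the whole range failed, whether there is a linearity trend, and which in-tolerance points were near their limits.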
When did it happen?
The next step should be to identify the time frame during which questionable measurements may have been taken. The objective is to identify a specific time when the instrument was last known to be taking correct measurements. Often, this is going to be the previous calibration date; the historical calibration certificate will have this date. Basically, the unit was known to be measuring correctly when it left the metrology lab, as demonstrated by the As-Left measurement data on the most recent calibration certificate. This will provide a starting point to work from, and most likely the longest period to examine. If you are fortunate enough to have a well-developed measurement assurance program, you might have collected additional data during the period in question which can reduce the evaluation time frame. Most metrology laboratories follow good metrology practices (GMetP) and conduct mid-cycle checks, tests, and inter-comparisons, also called cross-checks, to determine the “health” of their measurement processes and provide confidence in the quality of the measurement process. If these checks are documented and have measurement data, you may be able to reduce the period of questionable measurements. For example, let’s say the voltmeter in a production cell was found out-of-tolerance during its annual calibration, but you have a process where a precision voltage source is used to verify the performance of the voltmeter every quarter. A review of this data may allow you to conclude the voltmeter was performing accurately 3 months ago, so the questionable period is only going to be the last 3 months instead of 12 months, which significantly reduces the pool of potential at-risk products. A schedule of cross-checks and inter-comparisons is often developed for critical measurements or high volume processes in order to reduce risk, liability, and evaluation time.
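The narrowing of the questionable period can be sketched as a lookup over a documented cross-check log; the dates and error values below are hypothetical:

```python
from datetime import date

# Hypothetical quarterly cross-check log for the production-cell voltmeter:
# (check date, observed error in V at the 1000 V point); tolerance is +/-5 V.
cross_checks = [
    (date(2023, 3, 15), 1.0),
    (date(2023, 6, 15), 1.5),
    (date(2023, 9, 15), 2.0),  # last documented check that passed
]
last_calibration = date(2022, 12, 15)
oot_found = date(2023, 12, 15)
tolerance = 5.0

# The questionable period starts at the most recent documented point
# where the meter was known to be within tolerance; without cross-check
# data it would fall all the way back to the last calibration date.
passing_dates = [d for d, err in cross_checks if abs(err) <= tolerance]
window_start = max(passing_dates) if passing_dates else last_calibration
print((oot_found - window_start).days)  # about 3 months instead of 12
```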
Where is it used?
The objective at this point is to identify where this instrument has been used during the questionable period. This is where the really big challenges can start. This is typically where the last link in the chain of traceability breaks: the link between the actual calibrated instrument and the processes, products, and services provided. The ease of identifying potentially impacted product depends upon the design of the end user's processes and systems. In a large facility, test equipment can move around without its location being tracked; this is especially true of handheld and bench-level instruments. A robustly designed system with strict instrument control procedures will be able to identify exactly where any given instrument was located for any given time frame. Nearly all companies have a system that assigns an identification number to each instrument, and some even track its assigned department or location, but few systems track the movement of equipment within the facility, and even fewer log the date and use of instrumentation. The maintenance of such an instrument movement log must be strictly enforced; any hole or missing location data will bring the evaluation to a halt. Imagine a facility with 50 identical instruments that move around different production cells without any control. It would be impossible to identify what measurements or products they touched and what errors went undetected. With a robust tracking system that indicates if and when an instrument moved, you should be able to identify where it was at any given time.
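A minimal sketch of the kind of query a robust movement log enables; the instrument ID, locations, and dates are hypothetical:

```python
# Hypothetical instrument movement log: (instrument id, location, from, to).
# ISO date strings compare correctly as plain strings, keeping the sketch simple.
movement_log = [
    ("DVM-042", "Cell A", "2023-01-01", "2023-05-31"),
    ("DVM-042", "Cell C", "2023-06-01", "2023-12-15"),
]

def locations_during(log, instrument, start, end):
    """Return every location the instrument occupied during the
    questionable period (any overlap between the log entry and the window)."""
    return [loc for inst, loc, frm, to in log
            if inst == instrument and frm <= end and to >= start]

print(locations_during(movement_log, "DVM-042", "2023-09-15", "2023-12-15"))
```

With no gaps in the log, the questionable period from the previous step maps directly onto a short list of locations; one missing entry and the evaluation has to assume the instrument could have been anywhere.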
How is it used?
The last step in the out-of-tolerance information gathering process is to identify how the out-of-tolerance instrument was being used. Determine exactly what measurements were being made at a given location during the time frame in question. This information will likely be found in the end user's procedures, the operator's work instructions, or an engineering specification. The objective at this step is to determine whether the out-of-tolerance instrument could have affected any of the products manufactured or services provided, in this time frame, in this location, for these measurements. This can be accomplished by reviewing the process documentation, and all revisions that were in effect during the time frame in question, for the out-of-tolerance measurements that were identified in the first step. Were any of the out-of-tolerance functions, parameters, ranges, or test points used to make the measurements listed in the process documentation? If the answer is no, congratulations, your evaluation has ruled out the potential risk to product. Now you just have to completely document the steps you have taken, your conclusion, and your justification. As any auditor will tell you, if it isn't written, it didn't happen; you must produce objective evidence.
Analyzing the Impact
If the process documentation indicates that measurements were taken using any of the out-of-tolerance functions or ranges, then you have to go further and quantify the severity of the impact. Now comes the most difficult part of the process: quantifying the impact on products and services. In order to effectively complete this analysis, a thorough understanding of the affected process is necessary, and a working understanding of tolerances and the application of uncertainties is extremely helpful. Due to the wide variety of applications and situations possible, a few sample cases will be used to illustrate the analysis process for common situations likely to occur.
Case 1: No Impact
Let’s say the process documentation states that the voltmeter is used to measure 600 V on a product with a process tolerance of ± 10 V. Since our process measurement was not in the out-of-tolerance portion of the meter (800 V to 1000 V), we can conclude with reasonable confidence that no product was affected.
Case 2: Impact Evaluation Using Ratios
In Case 2 we will use accuracy ratios in our analysis. An analysis by ratios can help quantify the potential impact by a rough order of magnitude, but may not be sufficient. For instance, a ratio change from 100:1 to 80:1 may be fairly insignificant, but a ratio change from 4:1 to 2:1 could have quite an impact on the end products. A ratio analysis may be a quick way to rule out potential recalls if the ratios involved are sufficiently high. However, if the ratios are low, then additional evaluation becomes necessary. This method may also be the only option available if there isn't any historical process measurement data to review. For example, in this case the process documentation states that the voltmeter is used to measure 1000 V on a product with a process tolerance of ± 50 V. Since our process measurement was in the out-of-tolerance portion of the meter (800 V to 1000 V), product might have been negatively impacted. We need to go a step further and compare our process tolerance to the magnitude of the out-of-tolerance data. The process tolerance in this case was ± 50 V, so our process limits are 950 V to 1050 V. The accuracy of the meter was ± 5 V, which means the meter is 10 times more accurate than our process tolerance, giving us a Process Accuracy Ratio (50 V / 5 V) of 10:1. Now the calibration report stated the meter was reading 1008 V when the calibration lab injected a precision 1000 V into the meter, which basically means the meter behaved as if it had an accuracy of ± 8 V, dropping our Process Accuracy Ratio (50 V / 8 V) to 6.25:1. Is the risk due to a reduced process ratio acceptable? That comes down to a business decision.
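The ratio arithmetic for this case takes only a few lines; the values are the ones from the example above:

```python
# Process Accuracy Ratio before and after the out-of-tolerance finding
process_tolerance = 50.0   # +/-50 V process tolerance
meter_spec = 5.0           # +/-5 V published accuracy
as_found_error = 8.0       # meter read 1008 V with 1000 V applied

nominal_ratio = process_tolerance / meter_spec        # ratio as specified
effective_ratio = process_tolerance / as_found_error  # ratio as the meter behaved
print(nominal_ratio, effective_ratio)  # 10:1 drops to 6.25:1
```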
Case 3: Impact Evaluation Using As-Found Calibration Data
In this case, the process documentation states that the voltmeter is used to measure 1000 V on a product with a process tolerance of ± 50 V. Since our process measurement was in the out-of-tolerance portion of the meter (800 V to 1000 V), product might have been negatively impacted. We need to go a step further and compare our process tolerance to the magnitude of the out-of-tolerance data. The process tolerance in this case was ± 50 V, so our process limits are 950 V to 1050 V. The out-of-tolerance data indicated that the meter was reading 1008 V, out of specification beyond the upper tolerance limit of 1005 V by +3 V. This additional 3 V error is well below our ± 50 V process tolerance, so there wasn't a problem…. or was there? You might want to jump to that conclusion, and you would be correct as long as your process stayed centered on 1000 V, but what if your process moved around and didn't stay centered? Isn't that why process tolerances are created in the first place? To figure out what is going on here, go back to the fact that the meter was reading high by +8 V; the meter has a total +8 V bias or offset. Every recorded reading was therefore 8 V higher than the true value, which means the meter was actually delivering process limits of 942 V to 1042 V. Any measurement below 958 V during the time frame in question may therefore represent product that was actually below the 950 V lower process limit. (At the other end, in-specification product with true values between 1042 V and 1050 V read above the upper limit and was falsely rejected, a business cost rather than a consumer risk.) With this information, you should review any historical process measurement data you have and identify any products that had measurements below 958 V. You have now identified the specific units that might have been impacted by the out-of-tolerance unit and may have to be recalled. But wait, there's more! Remember, no measurement is perfect, so what about the metrology lab's measurement data, doesn't that have some error in it too? Why yes, yes it does….
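The historical data review can be sketched as correcting each recorded reading for the bias found at calibration and flagging anything whose corrected value falls outside the process limits; the historical readings below are hypothetical:

```python
# Correct historical readings for the +8 V bias found at calibration and
# flag product that may actually be outside the process limits.
bias = 8.0                    # meter read 1008 V with 1000 V applied
lower, upper = 950.0, 1050.0  # process limits (1000 V +/- 50 V)

# Hypothetical recorded measurements from the questionable period
historical_readings = [952.0, 955.0, 990.0, 1010.0, 1045.0, 1049.0]

# True value behind each reading is (reading - bias)
at_risk = [r for r in historical_readings
           if not (lower <= r - bias <= upper)]
print(at_risk)  # readings below 958 V hide product below the 950 V limit
```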
Case 4: Impact Evaluation Using As-Found Calibration Data and the Lab’s Uncertainty
Continuing with the Case 3 information, let's say the metrology lab reported their uncertainty for the measurement: 1008 V ± 7.1 mV. That means the value they report lies somewhere between 1007.9929 V and 1008.0071 V, so the +8 V bias is itself only known to within ± 7.1 mV. This additional uncertainty carries down to the process tolerance calculation: in the worst case, the threshold for questionable measurements moves from 958 V to 958.0071 V, which in our case is insignificant because the 1 V resolution of the meter is not sensitive enough to see this small difference in voltage. It is interesting to note that in this situation the metrology lab had an uncertainty of ± 7.1 mV against the unit's tolerance of ± 5 V, which provides a calibration Test Uncertainty Ratio (TUR) of about 704:1 (5 V / 7.1 mV), meaning the calibration lab standards were roughly 704 times more accurate than the meter being calibrated. Here is where the value of that pesky Test Uncertainty Ratio those metrology guys are always talking about comes into play. Had the metrology laboratory's uncertainty been ± 1.25 V, their reported measurement would have been 1008 V ± 1.25 V and the TUR would have been 4:1 (5 V / 1.25 V). The bias could then have been as large as 9.25 V, so any measurement below 959.25 V, which the meter's 1 V resolution rounds to 959 V, would be questionable instead of those below 958 V. These additional counts might not seem like a big deal, but they do increase the size of the potential recall and increase the potential risk and cost.
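The effect of the lab's uncertainty can be sketched as a worst-case widening of the Case 3 flagging threshold, using the two uncertainty values from the example:

```python
# Widen the at-risk threshold by the lab's reported uncertainty.
lower_limit = 950.0  # lower process limit (1000 V - 50 V)
bias = 8.0           # +8 V bias found at calibration

# Worst-case flagging threshold for recorded readings, for each
# hypothetical lab uncertainty (7.1 mV vs 1.25 V)
thresholds = {u: lower_limit + bias + u for u in (0.0071, 1.25)}
print(thresholds)  # 7.1 mV barely moves the threshold; 1.25 V moves it to 959.25 V
```

With the 1 V meter resolution, the first threshold still rounds to 958 V while the second pulls readings of 958 V and 959 V into the recall pool, which is exactly the cost of a low TUR.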
Again, here is where a complete calibration report with As-Found and As-Left data becomes very helpful. This is also the point where the Test Uncertainty Ratio (TUR) and the uncertainty of the calibration laboratory come into play, and why all calibrations should include uncertainties for every measurement. The laboratory's uncertainty information on the measurements they provide will give you what you need to further refine your evaluation and subsequent analysis. Every bit of measurement information at your disposal allows you to make additional distinctions, observations, and calculations, and improves the quality of and confidence in your conclusions and recommendations for further actions. The cost of a single product recall will far exceed the additional cost associated with a complete calibration which includes As-Found and As-Left data with uncertainties.
As Cases 2, 3, and 4 illustrate, an out-of-tolerance instrument that could affect the end product or service can lead to a tremendous amount of work, because the analysis will need to be completed for each product or service identified. This could lead to hundreds or thousands of calculations! As you can imagine, any effort spent in the four steps (what, when, where, and how) of the evaluation process that eliminates products from further analysis is well worth the time. The objective is to filter out as many items as possible that do not need closer analysis, so you can get to the ones where detailed analysis is required to quantify the impact on the products or services provided.
All this evaluation and analysis is a tremendous amount of work. However, it does not have to be difficult. A well thought out electronic system linking instrumentation to processes and product traceability, as part of a measurement assurance program, can ease the burden of out-of-tolerance evaluations and analysis. A measurement assurance program is more than a calibration program; it is a thought process that links and relates measurements through the entire product life cycle, from concept to end product. Hopefully this approach and these general guidelines will ease the burden of solving one of the most dreaded situations in the measurement world: the evaluation of an out-of-tolerance instrument and its potential impact.
This article has not been reviewed by the ICM Editorial Advisory Board.