Today, high-speed USB cable connections are everywhere, with data rates up to 10 Gb/s as in USB3 Superspeed+ devices, or even higher. Since end users connect USB cables in their homes, which represent electrostatically unsafe environments, system vendors require high levels of system level ESD robustness, typically 15 kV contact discharge according to IEC 61000-4-2 [1].
It is no trivial matter to properly interpret system level test results on high-speed boards. Board manufacturers (OEMs) assess the ESD robustness of their system by means of gun testing, not always in accordance with the IEC standard. In particular, exposed terminals are often zapped directly. This procedure is similar to the Human Metal Model (HMM) [2]. A recent industry-wide round robin study [3,4], however, showed a very large variance in the HMM tests of up to 5 kV. A major cause of this irreproducibility is gun artifacts, which will be discussed below.
It will be shown that 50 Ω HMM testing instead provides a much more reproducible test, which, moreover, correlates very well with SEED simulations [5] of the system. Even in this case, early first peak failures may lead to significantly lower test results than expected, because the inductances in the system determine the current distribution between protection and SoC in the first peak at low current levels. The root cause is described in detail below and effective protection solutions will be proposed.
What Can I Expect When Shooting an ESD Gun Into a USB3 Interface Board?
A NoiseKen ESS-2000AX with a TC815R gun is used to deliver a contact discharge into an RX input of the USB connector on the board (Figure 1). The board is inserted into a PCI slot of a PC. The gun voltage starts at 200 V and is increased in 100 V steps until the board fails functionally, which is detected by inserting a Passmark PMUSB3 loopback plug into the USB port that monitors the data rate. If the USB3 data transfer at 5 Gb/s fails, the board switches back to USB2 data transfer at 480 Mb/s, via separate pins.
It was found that without on-board protection the board fails at 600 V, which is the inherent ESD robustness of the USB3 IC. With protection, the failure level varies from about 1 kV to 5 kV, which is both unexpectedly low and also highly irreproducible.
How Do I Check Whether the Gun is in Spec?
Always check the calibration of the gun waveform first by firing into a 2 Ω Pellegrini calibration target mounted in a sufficiently large ground plane. Figure 2 shows an example of three current waveforms at 1 kV, recorded using a Fischer-F65 probe connected to a Tektronix DPO7254 2.5 GHz scope. The discharges into the 2 Ω Pellegrini target are very reproducible and the current waveform is in spec: for a 1 kV discharge, the standard [1] stipulates a 1st peak amplitude of 3.75 A with a maximum deviation of 15% and a 2nd peak amplitude of 2 A with a maximum deviation of 30%.
Next, in accordance with the HMM best practice recommendation [2], the gun waveform into the USB3 RX in a PC is verified (Figure 3). The PC chassis defines the ground in this case.
The gun is fired into an SMA adapter connected to the RX input of a USB3 board, seated here in a PCI slot of a Gigabyte X99SLI motherboard. The gun was set to repeated discharges (one per second). It was held by hand while its tip was supported by the SMA connector. Thus, no intentional changes to the set-up occurred between discharges. Nevertheless, the current waveform is much less reproducible in this set-up. Figure 4 shows that the 2nd peak remains stable and on target (2 A) but the first peak amplitude now varies between 65% and 125% of the 3.75 A target at 1 kV, which is clearly out of spec (± 15% [1, 2]), as opposed to the current waveforms into the Pellegrini target (cf. Figure 2).
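The tolerance check described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the nominal peak values and tolerances are the ones quoted in the text, while the function names are hypothetical and not part of any test standard or instrument API.

```python
# Nominal contact-discharge peaks per kV and tolerances, as quoted in the text.
FIRST_PEAK_A_PER_KV = 3.75   # tolerance +/-15 %
SECOND_PEAK_A_PER_KV = 2.0   # tolerance +/-30 %


def peak_in_spec(measured_a, nominal_a_per_kv, tolerance, gun_kv):
    """Return True if a measured peak lies within the relative tolerance."""
    nominal = nominal_a_per_kv * gun_kv
    return abs(measured_a - nominal) <= tolerance * nominal


def waveform_in_spec(first_peak_a, second_peak_a, gun_kv):
    """Check both peaks of one recorded discharge (hypothetical helper)."""
    return (peak_in_spec(first_peak_a, FIRST_PEAK_A_PER_KV, 0.15, gun_kv)
            and peak_in_spec(second_peak_a, SECOND_PEAK_A_PER_KV, 0.30, gun_kv))


if __name__ == "__main__":
    # First peaks at 65 %, 100 % and 125 % of nominal, second peak on target.
    for fraction in (0.65, 1.00, 1.25):
        ok = waveform_in_spec(fraction * 3.75, 2.0, gun_kv=1.0)
        print(f"first peak at {fraction:.0%} of nominal -> in spec: {ok}")
```

With the first peak at 65% or 125% of the 3.75 A target, the check fails even though the second peak is on target, which is the situation observed in the PC set-up.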
The amplitude of the 1st peak is determined by the capacitive coupling of the gun ground to the world ground, which is strongly influenced by the ground plane around the EUT. The impact of the ground plane can be studied by using a Pellegrini target with and without a large ground plane. Without a ground plane the first peak decreases by 40%, which may explain the lower amplitudes in Figure 4. It is not clear what exactly causes the unintentional changes to the capacitive coupling during repetitive zapping.
Can the Gun Discharge Although I Did Not Pull the Trigger?
During gun testing it was observed that sometimes the PC would reset when its chassis was touched by the gun, although the trigger had not been pulled. Because it was suspected that some charge remained on the gun, a Warmbier EFM51 E-field meter was employed to measure the E-field at 2 cm from the gun tip. It was found that, after the gun had been fired into the intended target, its tip charges up again, typically to about 10% of its preset voltage in 30 s. The same effect was observed on another gun, a Schloeder SESD 30000.
Unintended gun charging is reminiscent of the trailing pulse issue in HBM testers, which was discovered some 10 years ago [6,7]. The leakage current by itself is very small, on the order of µA, but if it hits a protection with very low leakage, the voltage across the protection will rise to its clamping voltage. This effect can be observed using a high-ohmic (10 MΩ) Tektronix TDS754 500 MHz scope connected to a protection with a clamping voltage of 6 V and a leakage current well below 1 nA. Figure 5 shows the trailing pulse for a Schloeder gun after a 1 kV discharge. The IEC pulse is indicated. Its waveform cannot be resolved at this time scale. The leakage current can be estimated as 1 kV / 50 MΩ, yielding Ileak ≈ 20 µA.
This current is too small to harm an RX input directly, but if the input is high-ohmic, it will experience 6 V on its input for 10 ms, which can easily damage sensitive gate oxides. A USB3 RX has a 50 Ω termination, however, which would short any low-current trailing pulse. Nevertheless, in a switched termination, the possibility remains that an upset from a previous discharge might put the USB application into a high-impedance state.
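The trailing pulse argument comes down to Ohm's law plus a clamp. The sketch below works through the numbers quoted above (1 kV, 50 MΩ, 10 MΩ scope input, 6 V clamp, 50 Ω termination). It assumes an ideal clamp and a purely resistive load, a simplification introduced here for illustration only.

```python
def leakage_current(gun_voltage_v, leak_resistance_ohm):
    """Estimate the trailing-pulse current from the gun's residual voltage."""
    return gun_voltage_v / leak_resistance_ohm


def pin_voltage(current_a, load_ohm, clamp_v):
    """Voltage across a resistive load, limited by an ideal protection clamp."""
    return min(current_a * load_ohm, clamp_v)


i_leak = leakage_current(1e3, 50e6)              # about 20 uA
v_high_ohmic = pin_voltage(i_leak, 10e6, 6.0)    # 10 MOhm input: clamps at 6 V
v_terminated = pin_voltage(i_leak, 50.0, 6.0)    # 50 Ohm termination: about 1 mV

print(f"I_leak        = {i_leak * 1e6:.1f} uA")
print(f"high-ohmic pin = {v_high_ohmic:.1f} V")
print(f"50 Ohm pin     = {v_terminated * 1e3:.1f} mV")
```

Into a high-ohmic input the tiny current is enough to drive the pin to the clamping voltage, whereas across a 50 Ω termination it develops only about a millivolt.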
What If I Do Not Hit the Target When I Pull the Trigger?
When the gun is triggered without the tip touching a target, the tip charges up to the preset voltage. This means that a capacitor of about 40 pF from the tip to the internal gun ground [8] becomes charged (Figure 6). If the charged tip subsequently touches any target, the 40 pF capacitor discharges via the DUT and Cg (Figure 7).
The waveform is similar to a typical first peak, but since the 330 Ω resistor is now not in the current path, the current is only limited by the wave impedance of Ltip and the DUT resistance Rd, which means that very high peak currents are possible. Figure 7 illustrates this effect. Compared to a controlled discharge into the Pellegrini target, such a stray discharge exhibits no 2nd peak, since the 150 pF gun capacitor is not discharged, but the 1st peak amplitude can be much higher than the nominal amplitude. First peak current values of up to 2.5x the nominal current were observed.
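A back-of-the-envelope comparison puts the stray discharge in perspective. The sketch below uses only values quoted in the text (40 pF tip capacitance, 150 pF gun capacitor, 3.75 A/kV nominal first peak, up to 2.5x observed). The helper names are illustrative, and the 1 kV preset voltage is an arbitrary example.

```python
TIP_CAPACITANCE_F = 40e-12            # tip to internal gun ground
GUN_CAPACITANCE_F = 150e-12           # main gun capacitor
NOMINAL_FIRST_PEAK_A_PER_KV = 3.75
STRAY_PEAK_FACTOR = 2.5               # largest factor observed in the text


def stored_charge(capacitance_f, voltage_v):
    """Q = C * V."""
    return capacitance_f * voltage_v


def stored_energy(capacitance_f, voltage_v):
    """E = 1/2 * C * V^2."""
    return 0.5 * capacitance_f * voltage_v ** 2


preset_v = 1000.0                     # example preset voltage (1 kV)
q_tip = stored_charge(TIP_CAPACITANCE_F, preset_v)
q_gun = stored_charge(GUN_CAPACITANCE_F, preset_v)
e_tip = stored_energy(TIP_CAPACITANCE_F, preset_v)
worst_case_peak_a = STRAY_PEAK_FACTOR * NOMINAL_FIRST_PEAK_A_PER_KV * preset_v / 1e3

print(f"tip charge : {q_tip * 1e9:.0f} nC (gun capacitor: {q_gun * 1e9:.0f} nC)")
print(f"tip energy : {e_tip * 1e6:.0f} uJ")
print(f"stray first peak up to {worst_case_peak_a:.2f} A at 1 kV")
```

Although the tip stores much less charge than the 150 pF capacitor, the missing 330 Ω series resistor allows a first peak well above the nominal value.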
Can the System Be Harmed by Electromagnetic Coupled Gun Radiation?
Even if the above-mentioned causes for error are excluded, it may remain difficult to obtain reproducible gun test results. The most likely explanation for the remaining variation is electromagnetic coupling of the gun tip to traces and IO pins on the board which are not protected by the on-board protection, as described earlier by the HMM working group [2,3]. Indeed, measurements with a 10 mm diameter loop probe showed voltages up to 5 V around the USB3 SoC when the gun was fired into the RX at the other side of the board, 10 cm away. Systematic investigation of EM coupling is complex [9,10] and out of scope for this article, but these examples illustrate once more the volatility of gun tests on PCBs in another system.
How Reproducible is a Gun Test on High-Speed Interface Boards?
We have found that in a typical set-up used by OEMs for system testing, the second peak is relatively stable and always in spec, but first peak current amplitude may vary from 50% to 250% or more of the nominal value according to the HMM model. Furthermore, hard-to-prevent stray discharges may even produce currents or voltages higher than allowed in the IEC 61000-4-2 specification.
Is There a Better Way to Perform System Level Tests on High-Speed Interface Boards?
A 50 Ω HMM system generally provides a much better reproducibility. We will show this by using an HPPI 3010C TLP system, which is also capable of generating HMM waveforms into 50 Ω.
First, we determined the failure signature of an unprotected USB IC by performing an HMM test of the board without protection. A typical HMM I-V curve for the IC by itself is shown in Figure 8.
The HMM measurement was interrupted every 100 mA for functional testing. This was done by inserting the board into a PCI slot of the PC and performing a test of the USB3 connection using the Passmark USB loopback plug.
It was found that functional failure occurs at a second peak current of 1.8 A (Figure 8 – inset). The first peak current at this setting is 2.4 A and the first peak voltage is 23 V. Failure of the internal IC protection, indicated by increased leakage, does not occur until 4.7 A of second peak current.
Further measurements show that negative polarity pulses are less critical. Failure does not occur until a first peak current of -5.4 A. Therefore, we will focus on the positive peak in the remainder of this article.
Does the System Failure Occur in the First or in the Second Peak?
In order to separate the effect of the first and second peak, TLP and vf-TLP tests were performed separately on fresh boards without protection. Figure 9 shows the current and voltage waveforms of the 1 ns / 600 ps vf-TLP measurement after functional failure occurs. The fail current is 2.5 A and the fail voltage is 21 V, which agrees very well with the HMM results.
These results clearly indicate that functional failures occur when the first peak reaches about 2.5 A. Figure 8 (inset) shows that at a first peak current of 2.5 A, the current in the second peak is only 1.8 A, i.e. much lower than It2 = 4.1 A. Moreover, as mentioned above, the functional failure at 2.5 A of first peak current corresponds to the observed failure signature in the HMM tests.
Note that a fail current of 2.5 A may appear low, but it is typical for high-speed communication SoCs. The first peak current waveform is comparable to a CDM charge, albeit with a somewhat slower risetime. A recent JEDEC publication [11] lists the expected CDM peak currents for 10-20 Gb/s devices as 2-3 A.
A 2.5 A amplitude translates into an equivalent CDM fail voltage of about 150 V, taking into account the package capacitance [12].
The fact that the first peak causes a functional failure without causing any noticeable leakage increase suggests that the failure mode is probably a gate oxide failure.
But the Failure Voltage is Much Higher Than the Gate Oxide Failure Voltage…?
The internal gate voltage and the externally measured voltage are not identical. What is the internal voltage on the SoC silicon when an external voltage Vt2 = 21 V is observed?
The SoC is wire-bonded. From the length of the bondwires, the SoC inductance is estimated to be about 3.5 nH. The corresponding voltage during the rising slope of the first peak is about V = L·di/dt ≈ 14.5 V for dt ≈ 0.6 ns. This yields an internal SoC failure voltage of about Vf = 21 V - 14.5 V = 6.5 V.
The OEM confirmed that the SoC is manufactured in a 65 nm CMOS technology with a gate oxide thickness of 1.9 nm. The NMOS gate oxide breakdown voltage for such a gate oxide at 1 ns pulse duration is about BVox = 6.4 V [13]. This confirms the assumption that functional failure occurs because BVox ≈ 6.4 V is exceeded at 2.5 A of first peak current.
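The estimate of the internal voltage can be reproduced with a short calculation. This sketch assumes that the current ramps linearly from zero to the fail current within the risetime, which is a simplification. All numbers are those quoted in the text, and the function name is illustrative.

```python
def inductive_voltage(inductance_h, delta_i_a, delta_t_s):
    """V = L * di/dt for a linear current ramp (simplifying assumption)."""
    return inductance_h * delta_i_a / delta_t_s


L_SOC_H = 3.5e-9          # bondwire inductance of the SoC
I_FAIL_A = 2.5            # first peak fail current
T_RISE_S = 0.6e-9         # risetime of the first peak
V_EXTERNAL_V = 21.0       # externally measured fail voltage
BV_OX_V = 6.4             # gate oxide breakdown at 1 ns, from [13]

v_bondwire = inductive_voltage(L_SOC_H, I_FAIL_A, T_RISE_S)
v_internal = V_EXTERNAL_V - v_bondwire

print(f"bondwire drop    : {v_bondwire:.1f} V")
print(f"internal voltage : {v_internal:.1f} V (BVox = {BV_OX_V} V)")
```

The internal voltage lands right at the quoted oxide breakdown voltage, which is why a first peak of about 2.5 A marks the functional fail level.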
At a first peak current of 2.5 A, the current in the second peak is 1.8 A and the voltage 3.8 V, of which 0.5 V is due to the inductive overshoot, which is lower due to the longer risetime (dt = 10 ns). Thus, the voltage in the second peak is too low to cause any gate oxide damage.
In addition, gun tests on the unprotected USB boards were performed as well. They showed a failure level of 600 V, which is consistent with a first peak Ifail = 2.5 A, since 1 kV corresponds to a first peak current of 3.75 A.
What If I Add an On-Board Protection?
Let us first investigate the electrical response of a USB3 board with protection using TLP testing (Figure 11), as the measurements are easier to interpret. A risetime of 0.6 ns is chosen for compatibility with an HMM pulse. There is an onboard series resistance Rb = 1 Ω (cf. Figure 1). The inset shows the beginning of the voltage waveform of each curve just before triggering (blue) and just after (green).
Consider first the I-V curve: At low TLP currents, the voltage is below Vt1 and the protection has not yet triggered. All current, therefore, flows into the SoC internal protection. Figure 11 shows that the internal protection triggers at about 1 V and that it has an Rs ≈ 1.5 Ω (cf. Figure 1). With the additional Rb = 1 Ω, the total resistance in the path towards the SoC is about 2.5 Ω. At about 0.6 A the trigger voltage Vt1 ≈ 8 V is exceeded and current flows via the board protection. The voltage then drops to the protection snapback voltage Vsb ≈ 1.7 V. Note that Vt1 is already reached in the initial overshoot of the TLP pulse. This overshoot is due to the total inductance of about 5 nH in the path towards the SoC (3.5 nH for the SoC bondwires and an additional 1.5 nH for the non-ideal PCB traces and the 1 Ω resistor). This yields an estimated voltage overshoot of about 5.5 V, which agrees well with the observed overshoot of about 6 V in Figure 11.
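As a rough cross-check of these numbers, the voltage that the on-board protection sees just before it triggers can be approximated as the sum of the on-voltage of the internal protection, the resistive drop, and the inductive overshoot. The sketch below is a first-order estimate only: it treats the internal protection as an ideal 1 V turn-on in series with Rs, and it assumes a linear current ramp over the nominal 0.6 ns risetime. Both are assumptions made here for illustration.

```python
V_ON_INTERNAL_V = 1.0      # internal protection turn-on voltage
R_SOC_PATH_OHM = 1.5 + 1.0 # Rs + Rb
L_SOC_PATH_H = 3.5e-9 + 1.5e-9  # SoC bondwires + PCB traces and resistor
I_TLP_A = 0.6              # TLP current at which triggering is observed
T_RISE_S = 0.6e-9          # TLP risetime


def pin_voltage_estimate(i_a, t_rise_s):
    """Quasi-static voltage plus L*di/dt overshoot (first-order estimate)."""
    v_static = V_ON_INTERNAL_V + i_a * R_SOC_PATH_OHM
    v_overshoot = L_SOC_PATH_H * i_a / t_rise_s
    return v_static, v_overshoot


v_static, v_overshoot = pin_voltage_estimate(I_TLP_A, T_RISE_S)
print(f"quasi-static : {v_static:.1f} V")
print(f"overshoot    : {v_overshoot:.1f} V")
print(f"total        : {v_static + v_overshoot:.1f} V")
```

With these inputs the inductive term comes out at about 5 V, in the same range as the 5.5 V estimate and the roughly 6 V observed. The total of about 7.5 V is close to Vt1 ≈ 8 V.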
This illustrates that any inductance between on-board and on-chip protection helps to trigger the on-board protection. It is important to note that the inductance of the on-board protection of about 3 nH (mainly due to its bondwires) does not impact the trigger voltage. This is because until the protection triggers, there is no current flowing through the protection and hence no L·dI/dt across the protection inductance. The protection triggers at a very low current of about It1 ≈ 50 mA. Therefore, immediately after triggering, L·dI/dt of the protection is very small, on the order of 0.25 V.
For higher currents, however, the protection inductance cannot be neglected anymore, as we shall see below.
The TLP measurement of the USB3 board with on-board protection proves that the root cause of the premature failures is not related to a trigger failure of the protection. The protection triggers at I ≈ 0.6 A of TLP current, which corresponds to the second peak current of an HMM discharge. The corresponding first peak current is about twice this current, i.e. about 1.2 A. This is much lower than the 2.4 A at which the SoC fails (see previous section). We may, therefore, exclude protection trigger failure as root cause.
How Can I Measure Residual Current and Voltage Into the IC?
The residual current into the IC has been defined (Figure 12, from [14]) as the ESD current which does not flow into the external TVS but into the IC instead. Its magnitude depends on the relative impedances of the two current paths and on the on-voltages of the internal and external protection. The residual voltage is the voltage at the IC pin to be protected, which is connected to the TVS through the board components.
It is not easy to measure the residual current into the RX (Figure 1) without modifying the board, e.g. by adding an integrated current loop around the trace. Furthermore, the gate connection is hidden behind the bondwires, so we cannot measure the gate voltage directly. In order to measure these parameters, we built a USB3 system evaluation board, which closely mimics the real components.
The schematic of the protected RX input of the USB3 board, shown in Figure 1, can be simplified into the replacement diagram shown in Figure 13.
Lc and Rc represent the protection inductance and resistance, Lb and Rb the equivalent board inductance and resistance and, finally, Ls and Rs the inductance and resistance of the SoC.
Since the internal SoC nodes are not accessible for electrical measurements, an evaluation board was built (Figure 14), in which two forward biased diodes replace the internal protection. One diode represents the up-diode of the rail-based protection in the SoC and the second one the clamp. By measuring the voltage at point P, the current into the replacement SoC can be deduced. The gun current at point A is measured via a Tektronix F-65 current probe.
Figure 15 shows the measured currents into the replacement SoC, compared to the total gun current for a 1 kV gun discharge. The second peak is significantly reduced (10x) by the protection, but the first peak is only reduced by 3x.
The reason for this difference is the dynamic impedance Z = ωL of the inductances in protection, SoC, and PCB. Due to the fast risetime in the first peak (corresponding to a high frequency), the impedance is most significant in the first peak and virtually negligible in the second. Hence, an inductive current distribution between protection and SoC is established in the first peak. The inductance values yield a current to the SoC which is about 40% of the total gun current during the first peak. This implies that, although the protection triggers, still 40% of the first peak current flows into the SoC.
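The first-peak split can be approximated as a purely inductive current divider. The sketch below neglects the resistances and on-voltages of both branches. That simplification is introduced here and is only reasonable during the fast rising edge. The inductance values are the ones quoted earlier in the article (3 nH for the wire-bonded protection, 3.5 nH for the SoC bondwires, 1.5 nH for the board), and the function name is illustrative.

```python
def soc_current_fraction(l_protection_h, l_board_h, l_soc_h):
    """Fraction of the first-peak current that flows into the SoC branch.

    Two parallel inductive branches share the current in inverse proportion
    to their inductance, so the SoC share is Lc / (Lc + Lb + Ls).
    """
    l_soc_path_h = l_board_h + l_soc_h
    return l_protection_h / (l_protection_h + l_soc_path_h)


fraction = soc_current_fraction(l_protection_h=3e-9,
                                l_board_h=1.5e-9,
                                l_soc_h=3.5e-9)
print(f"SoC share of first peak: {fraction:.1%}")
```

The result of roughly 38% is in line with the "about 40%" quoted above, and the formula makes the design levers explicit: lower the protection inductance or raise the inductance in the path towards the SoC.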
Can I Simulate System Level Discharges?
The inductive current distribution can be simulated with a SEED approach [5] using the schematic of Figure 14. Comparison of Figure 15 and Figure 16 shows that simulated and measured current waveforms agree very well. In particular, the simulations reproduce the difference in peak reduction seen in the measurements.
So, Why Does the Board Fail Prematurely?
In the preceding section, we have shown that the USB3 SoC fails once the first peak current exceeds 2.4 A. At this current the voltage including inductive overshoot on the protection is about 21 V, which is clearly larger than the trigger voltage Vt1 = 8 V. The inductance of the protection does not impact protection triggering, but it reduces the amount of first peak current which can be shunted by the protection, thus putting the SoC at risk at higher currents. The expected fail level of the board with protection is 2.4 A / 40% = 6 A. This would result in an expected gun fail voltage around 2.5 kV (taking into account the reduced first peak due to insufficient gun grounding in the PC).
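The chain of arithmetic behind the expected gun fail voltage can be written out as follows. The SoC fail current, the 40% residual share, and the 3.75 A/kV nominal first peak come from the text. The 0.65 scale factor for the reduced first peak in the PC is an assumption made for this sketch, taken from the lower end of the first-peak range observed in that set-up.

```python
NOMINAL_FIRST_PEAK_A_PER_KV = 3.75


def board_fail_current(soc_fail_a, soc_fraction):
    """Total first-peak current at which the SoC share reaches its fail level."""
    return soc_fail_a / soc_fraction


def gun_fail_voltage_kv(first_peak_fail_a, first_peak_scale=1.0):
    """Gun voltage needed to reach a given first peak.

    first_peak_scale is the actual first peak relative to the nominal one
    (1.0 = calibrated waveform; below 1.0 for a weakly grounded gun).
    """
    return first_peak_fail_a / (NOMINAL_FIRST_PEAK_A_PER_KV * first_peak_scale)


i_fail_a = board_fail_current(soc_fail_a=2.4, soc_fraction=0.40)
v_nominal_kv = gun_fail_voltage_kv(i_fail_a)
v_in_pc_kv = gun_fail_voltage_kv(i_fail_a, first_peak_scale=0.65)  # assumed scale

print(f"board fail current    : {i_fail_a:.1f} A")
print(f"gun fail, nominal     : {v_nominal_kv:.2f} kV")
print(f"gun fail, reduced peak: {v_in_pc_kv:.2f} kV")
```

With a nominal waveform the board would be expected to fail near 1.6 kV. A first peak reduced to about 65% of nominal shifts this to roughly 2.5 kV.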
When the USB3 board is tested in the PC, the variability in the gun test results was found to be very large: Fail levels between 1 kV and 5 kV were found. The following factors explain this result:
The critical factor which determines failure of the USB3 board is the first peak current into the SoC. Once 2.5 A is exceeded, permanent functional failure ensues.
Due to the inductive current distribution between protection and SoC, a large amount of residual current flows into the SoC, although the protection has triggered, yielding a lower than expected system pass level, of around 2.5 kV.
The large variability in the first peak current of the NoiseKen gun (50-250%) causes a large variation in gun test pass levels of 1 kV – 5 kV.
How Can We Improve the Protection of High-Speed Interfaces?
First of all, careful board design to avoid parasitics related to the PCB traces [14] can significantly impact the overall ESD performance. But it is also possible to improve the protection devices. The protection used in the previous sections was wire-bonded, and the bondwires have a significant series inductance. One solution involves using a package with Cu pillars instead of bondwires [16], which reduces the series inductance of the protection.
The effective inductance is difficult to measure directly but it may be derived by comparing the 3 dB point in the insertion loss measurements of both protections [16]. The resulting inductances are 3 nH for the wire-bonded protection and about 1 nH for the one with the Cu pillars.
An even more effective solution is to use a common mode choke with integrated protection [17], which adds about 35 nH of inductance between protection and SoC. Because the inductances for both differential lines are coupled, the effective differential mode inductance is virtually zero. Thus, a common mode choke may significantly improve the ESD protection of the system without adversely impacting any differential (data) signal.
The three solutions with bondwires, Cu pillars, and common mode choke were compared using measurements and SEED simulations. The results of the first peak measurements on a system evaluation board (see previous section) are shown in Figure 17 and compared with simulated first peak amplitudes. Table 1 summarizes the measured and simulated first peak amplitudes. The agreement between SEED simulations and measurements is very good.
1st peak (A) | measured | simulated |
gun | 3.76 | 3.64 |
wire-bonded | 1.40 | 1.42 |
Cu pillar | 1.03 | 0.86 |
CM choke | 0.33 | 0.32 |
Table 1: First peak current on evaluation board (Figure 14)
We see that using a protection with Cu pillars improves the system ESD performance by decreasing the residual current by 30%. The best protection is offered by a common-mode choke, which reduces the residual current by a factor of >10! The main reason for this improved performance is that the common mode choke adds additional inductance between protection and SoC. Because of the coupled coils, the inductance for differential USB3 signals is, nevertheless, very small, which implies that the signal integrity remains very good.
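The reduction factors can be recomputed directly from the measured column of Table 1. The sketch below uses only those measured values. The dictionary and function names are illustrative.

```python
# Measured first peak currents from Table 1, in amperes.
MEASURED_FIRST_PEAK_A = {
    "gun": 3.76,
    "wire-bonded": 1.40,
    "Cu pillar": 1.03,
    "CM choke": 0.33,
}


def reduction_factor(total_a, residual_a):
    """How many times smaller the residual current is than the gun current."""
    return total_a / residual_a


gun_a = MEASURED_FIRST_PEAK_A["gun"]
factors = {
    name: reduction_factor(gun_a, current_a)
    for name, current_a in MEASURED_FIRST_PEAK_A.items()
    if name != "gun"
}

# Improvement of the Cu pillar package relative to the wire-bonded one.
cu_vs_wirebond = 1.0 - (MEASURED_FIRST_PEAK_A["Cu pillar"]
                        / MEASURED_FIRST_PEAK_A["wire-bonded"])

for name, factor in factors.items():
    print(f"{name:12s}: first peak reduced {factor:.1f}x")
print(f"Cu pillar vs wire-bonded: {cu_vs_wirebond:.0%} less residual current")
```

The wire-bonded protection reduces the first peak by a little under 3x, the Cu pillar version by about 3.7x, and the common mode choke by more than 11x relative to the gun current.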
What Does This Solution Mean in Terms of kV?
The proposed solutions were verified on the USB3 boards using HMM tests. For comparison, the tests were repeated with the original wire-bonded protection. The results are summarized in Table 2. The first two columns show the pass and fail currents in the first peak. The third column shows the expected fail current, based on the simulated reduction of the first peak (Table 1). There is good agreement between the simulated and observed fail currents, which confirms that the inductive current distribution is a good model to explain the relative effectiveness of the different protections.
protection | HMM pass (A) | HMM fail (A) | HMM simul. fail (A) | gun pass (kV) | gun fail (kV) |
no prot. | 2.2 | 2.4 | | 0.5 | 0.6 |
wirebond | 5.4 | 7.2 | 6.2 | 2.2 | 2.4 |
Cu pillar | 12.7 | 13.5 | 13.3 | 6.2 | 6.5 |
CM choke | >30 | | 28 | 15 | 16 |
Table 2: HMM and gun test results of the proposed solutions
The last two columns of Table 2 show the observed pass and fail voltages of the different solutions during gun test (NoiseKen, single shots, positive). Using the protections with Cu pillars increases ESD (gun) pass level to over 6 kV. Use of the common mode choke boosts ESD robustness to 15 kV. The HMM results are consistent with the gun test results.
For a real gun test, positive and negative polarities need to be tested, usually 10x at each setting. For negative polarities, the SoC is less sensitive (the first peak failure current during HMM is |Ifail| ≈ 5.4 A). Therefore, the overall failure voltage is determined by failure for positive polarity.
What Are the Bottom Line Recommendations?
It has been shown that the root cause for early failure of a USB3 board is an excessive residual current during the first peak of the HMM discharge. The protection triggers and absorbs the second peak of the discharge but the first peak is not sufficiently suppressed. This is due to an inductive current distribution between protection and SoC.
Using a protection with lower inductance (with leadless Cu pillars) improves the ESD robustness to 6 kV. Using a common mode choke further boosts ESD robustness to 15 kV, because the common mode choke adds additional inductance between protection and SoC. Because of the coupled coils, the inductance for differential USB3 signals is, nevertheless, very small, which implies that the signal integrity remains very good.
On a final note, it was found that the many gun artifacts that were discovered render gun test results irreproducible. It is, therefore, recommended to instead characterize high-speed application boards, such as USB3 boards, by means of 50 Ω HMM.
References
[1] Electromagnetic Compatibility (EMC): Part 4-2: Testing and Measurement Techniques–Electrostatic Discharge Immunity Test, IEC 61000-4-2, edition 2, 2008.
[2] Human Metal Model, ESD TR5.6-01-09, 2009.
[3] K. Muhonen, R. Ashton, T. Smedes, M. Scholz, R. Velghe, J. Barth, N. Peachey, W. Stadler, E. Grund, “HMM Round Robin Study: What to Expect When Testing Components to the IEC 61000-4-2 Waveform”, EOS/ESD Symp. Proc. 2012.
[4] K. Muhonen, “Best Practices for System Level ESD Testing of Semiconductor Components”, CSICS, 2013.
[5] System level ESD: Part II: Implementation of effective ESD robust designs, JEDEC Publication JEP162, 2013.
[6] T. Meuse, L. Ting, J. Schichl, R. Barrett, D. Bennett, R. Cline, Ch. Duvvury, M. Hopkins, H. Kunz, J. Leiserson, and R. Steinhoff, “Formation and suppression of a newly discovered secondary EOS event in HBM test systems”, EOS/ESD Symp. Proc. 2004.
[7] M. Etherton, V. Axelrod, T. Meuse, J. Miller, and H. Marom, “HBM ESD failures caused by a parasitic pre-discharge current spike”, EOS/ESD Symp. Proc. 2008.
[8] H.-M. Ritter, L. Koch, M. Schneider, and G. Notermans, “Air-discharge testing of single components”, EOS/ESD Symp. Proc. 2015.
[9] W. Huang, D. Liu, J. Xiao, D. Pommerenke, J. Ming, G. Muchaidze, “Probe characterization and data process for current reconstruction by near field scanning method”, APEMC, 2010.
[10] W. Huang, J. Dunnihoo, D. Pommerenke, “Effects of TVS integration on system level ESD robustness”, EOS/ESD Symp. Proc. 2010.
[11] “Charged Device Model (CDM) Qualification Issues”, JEDEC, 2014.
[12] N. Jack, H. Gieser, “Advances in Contact CDM and CC-TLP Methods”, IEW 2016.
[13] A. Ille, W. Stadler, A. Kerber, T. Pompl, T. Brodbeck, K. Esmark, and A. Bravaix, “Ultra-thin gate oxide reliability in the ESD time domain”, EOS/ESD Symp. Proc. 2006.
[14] System level ESD: Part I: “Common Misconceptions and Recommended Basic Approaches”, JEDEC Publication JEP161, 2011.
[15] D. Johnsson and H. Gossner, “Study of system co-design of a realistic mobile board”, EOS/ESD Symp. Proc. 2015.
[16] G. Notermans, H.-M. Ritter, J. Utzig, S. Holland, Z. Pan, J. Wynants, P. Huiskamp, W. Peters, and B. Laue, “An off-chip ESD protection for high-speed interfaces”, EOS/ESD Symp. Proc. 2015.
[17] J. Werner, J. Schuett, and G. Notermans, “Integrated Common Mode Filter for USB3 Interfaces”, EMC Symp. Proc. 2016.
The EOS/ESD Association is the largest industry group dedicated to advancing the theory and the practice of ESD avoidance, with more than 2000 members worldwide. Readers can learn more about the Association and its work at www.esda.org.