SIS and the Bathtub Curve

“When I was young, I would sit in the bath and ideas would come to me. But I’m not young anymore, so now I just sit in the bath.” — Aki Kaurismaki

“We’re installing a safety instrumented system in our existing process. Do we have to pull out all the existing instruments and replace them with certified SIL-rated instruments?”

This is a question we hear often, and the answer is not straightforward. It certainly isn’t the definitive “Yes” that many expect.

Certified SIL-Rated Instruments

Some instruments—sensors and final control devices—are designed and manufactured with the intent of being used in safety applications. End users expect the manufacturers of these field devices to get them certified by third-party test organizations. (There was a time when the only accepted third-party test organizations were the German TÜVs: Technischer Überwachungsverein, or in English, Technical Inspection Association. Since then, however, several other organizations have gotten into the business of certifying field devices.) The certification is per the International Electrotechnical Commission standard, IEC 61508, Functional safety of electrical/electronic/programmable electronic safety-related systems.

There is another standard to which most operations in the chemical process industries refer instead: IEC 61511, Functional safety – Safety instrumented systems for the process industry sector. It was written under the umbrella of IEC 61508 but is aimed specifically at end users in the process industries and uses their language and jargon. The ISA standard, S84, Application of Safety Instrumented Systems for the Process Industries, predated IEC 61511 but was harmonized with the IEC standard in 2004. Since then, S84 has been superseded by ANSI/ISA-61511, Functional Safety – Safety Instrumented Systems for the Process Industry Sector, making the harmonization complete and seamless.

This matters because the standards allow end users to choose field devices that they have proven in use, rather than relying solely on equipment manufacturers to certify their devices per IEC 61508. So, when end users have field devices that they have proven in use, they can continue to use them rather than feel compelled to replace them with different devices that have been certified per IEC 61508.

Proven In Use

Establishing that a field device is “proven-in-use” is a significant exercise. The 61508 Association lists the following seven items as the minimum requirements:

A formal system for gathering reliability data that differentiates between safe and dangerous failures
Means of assessing the recorded data to determine the safety integrity of the device/equipment, and its suitability for the intended use
Evidence that the application is clearly comparable
Recorded historical evidence of device hours in use
Evidence of the manufacturer’s management, quality, and configuration management systems
Device firmware revision records
Proof that reliability data records are updated and reviewed regularly

The effort to establish that a device is proven in use is not trivial; it’s not enough to state, “Yeah, we’ve used this device in this application before.” It is manageable, though. “Proven-in-use” is especially important when certified devices aren’t available, or when a facility has a substantial investment in a type of device with which it is happy.
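To make the first and fourth items on that list concrete, here is a minimal sketch of a failure log that separates safe from dangerous failures and turns recorded device hours into a failure rate. The record layout, tag names, and numbers are all hypothetical, and the result is a raw point estimate; a real proven-in-use assessment would typically apply a statistical confidence bound rather than rely on the raw ratio.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 8760


@dataclass
class FailureRecord:
    tag: str         # hypothetical instrument tag
    dangerous: bool  # True if the failure would prevent the safety action
    detected: bool   # True if diagnostics revealed the failure on their own


def dangerous_undetected_rate(records, device_hours):
    """Point estimate of the dangerous undetected failure rate, per hour."""
    if device_hours <= 0:
        raise ValueError("device_hours must be positive")
    n_du = sum(1 for r in records if r.dangerous and not r.detected)
    return n_du / device_hours


# Illustrative data only: 50 devices of one type, each in service 10 years.
records = [
    FailureRecord("PT-101", dangerous=True, detected=False),
    FailureRecord("PT-102", dangerous=False, detected=True),
    FailureRecord("PT-103", dangerous=True, detected=True),
    FailureRecord("PT-104", dangerous=True, detected=False),
]
device_hours = 50 * 10 * HOURS_PER_YEAR
lambda_du = dangerous_undetected_rate(records, device_hours)
print(f"lambda_DU is roughly {lambda_du:.2e} failures per hour")
```

Only two of the four recorded failures count toward the dangerous undetected rate here, which is why a log that lumps all failures together is not enough.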

The Bathtub Curve

“Proven-in-use” can justify the use of a type of device. Before committing to an individual piece of equipment, however, it is important to take the bathtub curve into consideration.

The bathtub curve, familiar to reliability engineers, is a plot of failure rate as a function of a component’s age. The initial phase of the bathtub curve is a period of high but rapidly falling failure rates. It is sometimes referred to as the “burn-in” or “infant mortality” period. It is during this period that flawed equipment reveals itself. For consumers, a 90-day limited warranty is intended to cover the burn-in period.

After the burn-in period, the failure rate flattens out and remains essentially constant. One advantage of a constant failure rate is that the probability of failure of a component is much easier to calculate when the failure rate isn’t changing.

Eventually, though, accumulated wear and tear catch up with devices, to the point that the failure rate begins to climb. This period is called the “wear-out” period. Typically, it is more gradual than the precipitous drop of the burn-in period. When a manufacturer says that their equipment has a useful life of 20 years, they are not saying that at 19.9 years the device is fine and that at 20.1 years it will suddenly fail. No, what they are saying is that at about 20 years, the failure rate will begin to climb noticeably.
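One common way to sketch the three phases described above is to add a falling Weibull hazard (shape less than 1), a constant rate, and a rising Weibull hazard (shape greater than 1). The parameter values below are made up purely to produce a bathtub shape; they are assumptions, not data for any real device.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard rate h(t) = (shape/scale) * (t/scale)**(shape - 1), t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def bathtub_hazard(age_years):
    """Illustrative failure rate (failures per year) at a given age in years."""
    burn_in = weibull_hazard(age_years, shape=0.5, scale=200.0)  # falls with age
    constant = 0.02                                              # flat bottom
    wear_out = weibull_hazard(age_years, shape=5.0, scale=30.0)  # climbs with age
    return burn_in + constant + wear_out


for age in (0.1, 1, 5, 10, 20, 25):
    print(f"{age:>5} yr: {bathtub_hazard(age):.4f} failures/yr")
```

With these assumed parameters, the rate drops quickly in the first year, stays roughly flat through the middle years, and climbs gradually rather than abruptly as the device passes 20 years.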

A Bathtub Curve for People

All components and systems follow the bathtub curve, including human beings. Data from the Social Security Administration shows that Americans experience the “infant mortality” period for the first two years, followed by a constant rate period until about age 13. Then something happens (Do parents just tire of saying no?) that causes a slight upward shift to a new constant that holds until about age 35. Then it’s all downhill from there. Fortunately, we don’t talk much about the “useful life” of people.

Bathtub curve for the American people, based on data from the Social Security Administration.

Implications for Safety Instrumented Systems

The calculations for average probability of failure on demand (PFD_AVG), the basis for all SIL verification studies, assume a constant failure rate. Implicitly, then, SIL verification studies assume the component is in the bottom, flat part of the bathtub curve. However, devices don’t spend their entire life in the constant rate period of the bathtub curve.
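To see where the constant-rate assumption enters, consider the widely used simplified approximation for a single device that is proof tested periodically: PFD_AVG is roughly the dangerous undetected failure rate times the proof test interval, divided by two. The sketch below applies it and looks up the low-demand SIL band. It assumes a single (1oo1) device, a perfect proof test, no diagnostics credit, and no common cause failures; the failure rate is an invented example value. A real SIL verification covers the whole function (sensor, logic solver, and final element), not one device.

```python
def pfd_avg_1oo1(lambda_du_per_hour, proof_test_interval_hours):
    """Simplified PFD_AVG for one device; valid when rate * interval is small."""
    return lambda_du_per_hour * proof_test_interval_hours / 2.0


def sil_band(pfd_avg):
    """Low-demand SIL band for a given PFD_AVG; 0 means no SIL credit."""
    if pfd_avg >= 1e-1:
        return 0
    if pfd_avg >= 1e-2:
        return 1
    if pfd_avg >= 1e-3:
        return 2
    if pfd_avg >= 1e-4:
        return 3
    return 4


lambda_du = 5e-7      # assumed failures per hour, illustrative only
test_interval = 8760  # annual proof test, in hours
pfd = pfd_avg_1oo1(lambda_du, test_interval)
print(f"PFD_AVG = {pfd:.2e}, SIL band {sil_band(pfd)}")
```

The failure rate appears in the formula as a single fixed number, which is exactly the flat-bottom assumption. In this example the result is about 2.2e-3, which falls in the SIL 2 band.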

One implication of the bathtub curve concerns installing new devices. Whether 61508-certified, proven-in-use, or otherwise, all new devices are subject initially to the high failure rates of the burn-in period. Replacing an existing device with a new device will subject the process to infant mortality that it had survived previously. That doesn’t mean you should avoid upgrading equipment, but it is a consideration.

A second consideration is more important. Existing field devices are, by definition, not new. Having already survived the infant mortality period, they may be well into the constant rate period, where PFD_AVG calculations can be based on a constant, well-understood failure rate. However, they may instead be through the constant rate period and into the wear-out period. In that case, any calculation based on historic failure rates will underestimate the PFD_AVG, perhaps to the extent that the Safety Integrity Level of a function using that device will be overstated.
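A rough numerical sketch shows how large that underestimate can be. It compares the constant-rate figure (failure rate times proof test interval, divided by two) with a PFD_AVG averaged over one proof test interval when a rising wear-out term is added to the failure rate. The wear-out model (a Weibull term with shape 5 and scale 30 years), the base rate, and the assumption that a proof test restores function without resetting the device’s age are all illustrative assumptions.

```python
import math


def cumulative_hazard(age_years, base_rate, wear_shape, wear_scale):
    """Integral of (constant rate + Weibull wear-out hazard) from 0 to age."""
    return base_rate * age_years + (age_years / wear_scale) ** wear_shape


def pfd_avg_at_age(age_years, interval_years, base_rate,
                   wear_shape=5.0, wear_scale=30.0, steps=1000):
    """Average probability of failure over one proof test interval.

    The device is assumed to be working at the start of the interval and
    to keep its accumulated age. Midpoint-rule numerical average.
    """
    h_start = cumulative_hazard(age_years, base_rate, wear_shape, wear_scale)
    dt = interval_years / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        h = cumulative_hazard(age_years + t, base_rate,
                              wear_shape, wear_scale) - h_start
        total += 1.0 - math.exp(-h)
    return total / steps


base_rate = 0.004  # assumed historic failures per year, illustrative only
interval = 1.0     # annual proof test
constant_rate_pfd = base_rate * interval / 2.0
pfd_young = pfd_avg_at_age(5.0, interval, base_rate)
pfd_old = pfd_avg_at_age(25.0, interval, base_rate)
print(f"constant-rate estimate: {constant_rate_pfd:.4f}")
print(f"5-year-old device:      {pfd_young:.4f}")
print(f"25-year-old device:     {pfd_old:.4f}")
```

With these made-up parameters, the 5-year-old device comes out close to the constant-rate estimate of 0.002, while the 25-year-old device comes out more than ten times higher, enough to move it from the SIL 2 band into the SIL 1 band even though the historic failure rate has not changed on paper.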

Consider Using Existing Devices for Safety

So, you’ve decided to install a safety instrumented system, or you’ve decided to convert an existing interlock in your control system into a SIL-rated safety function in your safety instrumented system. Don’t feel like you must yank your existing instrumentation and replace it with certified SIL-rated instrumentation. This is especially true if the devices are of a type familiar to your operations and maintenance departments, have served you well, and have been recently installed.

Keep in mind, though, that as devices near the end of their useful life, their failure rates increase. Whether certified SIL-rated devices or proven-in-use, existing instruments will not give the risk reduction you are counting on once they are past their useful life. Maybe they should just sit in the bath.

Author

Mike Schmidt

With a career in the CPI that began in 1977 with Union Carbide, Mike was profoundly impacted by the 1984 tragedy in Bhopal and has been working on process safety ever since.
