The importance of effective Safety Analysis
In the recent Wilson Research industry survey, semiconductor companies made it clear that functional safety activities consume a large portion of the overall development lifecycle. Specifically, over 90% of companies indicated that safety activities consume 20% or more of the lifecycle. In the same survey, 25% of semiconductor companies indicated that Safety Analysis is the biggest challenge faced by safety engineers.
The overall objective of Safety Analysis is to confirm that the design is safe and that there are sufficient safety mechanisms and measures protecting the design from random hardware faults that can occur during the operational life of the IC.
Safety analysis encapsulates many activities, including:
- Failure Mode and Effects Analysis (FMEA)
- Failure Mode Effects Diagnostic Analysis (FMEDA)
- Fault Tree Analysis (FTA)
- Dependent Failure Analysis (DFA) or common cause failure analysis
- Freedom from Interference (FFI)
Each of these activities contributes to the overall objective of defining the optimal safety architecture, one that achieves the safety targets while minimizing impact to silicon area and the power budget.
Random Failure Lifecycle Overview
The random failure lifecycle is a sub-flow within the broader overall safety lifecycle, as described in <guidelines link>. Siemens EDA defines the random failure lifecycle as a three-step process. In the first step (Safety Analysis), a project team identifies the optimal set of safety features required to protect a design from random failures. The second step (Safety Insertion) implements the hardware and software safety features defined in step one. In the third step (Safety Verification), the effectiveness of the safety architecture is proven using fault optimization, analysis, and injection. Step three also includes the evaluation of the safety metrics.
Today, the Safety Planning and Analysis activities are heavily weighted towards an expert-driven exercise. This was acceptable in the past, but the explosion of automotive IC complexity to support ADAS and AV systems often results in sub-optimal or incorrect safety architectures. This isn’t a knock on safety architects. It’s purely a result of the explosion in automotive SoC complexity: multi-billion-gate designs, multiple disparate functional islands containing multi-core processing, dedicated AI/ML engines, mixed-signal processing engines, and more. The sheer complexity of these chips is making safety planning and analysis more challenging than ever before.
The Cost of incorrect Safety Analysis
If the safety metrics generated in step three do not hit the safety target, or the power, performance, and/or area (PPA) impact is too severe, the program incurs a substantial cost. The first cost is the time wasted in performing safety verification, an often time-consuming task that is ranked the #2 biggest challenge in the safety lifecycle. The second cost is the need to iterate back through Safety Analysis and Insertion.
During this iteration, engineers must evaluate why the safety architecture was insufficient and identify the architectural and/or design changes required. Once identified, the changes must be implemented. If the changes are to the underlying hardware, the design must be taken off the shelf and functional verification re-closed to ensure the modifications did not break the intended functionality. And finally, the expensive task of Safety Verification must be performed again to obtain the updated safety metrics.
To avoid this unnecessary cost, there must be a high degree of confidence that the safety architecture proposed in the first step will lead to satisfactory results after Safety Verification completes.
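One way to build that confidence early is to estimate the single-point fault metric (SPFM) from FMEDA-style data before any fault injection is run. The sketch below is a simplified illustration, not a Siemens tool flow. The block names, FIT values, and diagnostic coverage figures are hypothetical, and it assumes that every fault in a safety-related block that is not covered by a safety mechanism counts as single-point or residual (safe faults are ignored). The thresholds are the SPFM targets given in ISO 26262-5 for ASIL B, C, and D.

```python
from dataclasses import dataclass

# SPFM targets per ASIL, as given in ISO 26262-5.
SPFM_TARGET = {"B": 0.90, "C": 0.97, "D": 0.99}


@dataclass
class FailureMode:
    name: str                   # hypothetical block name
    fit: float                  # estimated failure rate in FIT (illustrative value)
    safety_related: bool        # does the block contribute to a safety goal?
    diagnostic_coverage: float  # fraction of faults covered by a safety mechanism


def estimate_spfm(modes):
    """Simplified SPFM: 1 - (uncovered FIT / total FIT) over safety-related blocks."""
    related = [m for m in modes if m.safety_related]
    total = sum(m.fit for m in related)
    residual = sum(m.fit * (1.0 - m.diagnostic_coverage) for m in related)
    return 1.0 - residual / total


def meets_target(spfm, asil):
    """Check an estimated SPFM against the target for the given ASIL."""
    return spfm >= SPFM_TARGET[asil]


# Hypothetical early-architecture estimate.
modes = [
    FailureMode("cpu_core", fit=100.0, safety_related=True, diagnostic_coverage=0.99),
    FailureMode("sram", fit=200.0, safety_related=True, diagnostic_coverage=0.99),
    FailureMode("bus_fabric", fit=50.0, safety_related=True, diagnostic_coverage=0.90),
    FailureMode("debug_port", fit=30.0, safety_related=False, diagnostic_coverage=0.0),
]

spfm = estimate_spfm(modes)
for asil in ("B", "C", "D"):
    print(f"ASIL {asil}: estimated SPFM {spfm:.4f}, meets target: {meets_target(spfm, asil)}")
```

With these illustrative numbers, the estimate clears the ASIL C target but falls short of ASIL D, and the bus fabric contributes most of the uncovered failure rate. That is the kind of gap that is far cheaper to find during Safety Analysis than after Safety Verification.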
Augmenting Safety Analysis with automation
It is important to state that this post is not suggesting replacing expert judgement with a tool. In fact, just the opposite. Automation is intended to augment expert judgement by giving experts early-cycle feedback on the safety architectures they propose. This includes recommendations on safety features, safety architecture gap analysis, safety architecture overlap, and more. Through a series of design analysis techniques, automation helps safety architects lock down the optimal safety architecture before it is implemented.
Best Practice: The automation techniques below are recommended to help identify the optimal safety architecture and safety features to deploy.
- Safety Critical vs. Non-Safety Critical Analysis
- Cone extraction and Cone of Influence Analysis
- Failure rate analysis
- Propagation analysis
This early cycle analysis ensures that the Safety Verification phase is simply a sign-off step rather than a first glimpse into the effectiveness of the architecture.
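To make the first two techniques concrete, here is a minimal sketch of structural cone-of-influence extraction, used to split a design into safety critical and non-safety critical logic. It is an illustration only and does not represent any particular tool. The netlist, signal names, and the choice of `brake_cmd` as the safety critical output are all hypothetical, and a real analysis would run on the actual design database and account for sequential depth, clocks, and resets.

```python
from collections import deque

# Hypothetical netlist: each signal maps to the signals that drive it (its fan-in).
FANIN = {
    "brake_cmd": ["alu_out", "ctrl_fsm"],
    "alu_out": ["reg_a", "reg_b"],
    "ctrl_fsm": ["reg_b", "clk_en"],
    "debug_out": ["trace_buf"],
    "trace_buf": ["reg_a"],
}


def cone_of_influence(fanin, outputs):
    """Return every signal that structurally drives any of the given outputs."""
    cone = set(outputs)
    queue = deque(outputs)
    while queue:
        signal = queue.popleft()
        # Walk backwards through the fan-in; signals with no entry are primary inputs.
        for driver in fanin.get(signal, []):
            if driver not in cone:
                cone.add(driver)
                queue.append(driver)
    return cone


# Collect every signal named anywhere in the netlist.
all_signals = set(FANIN) | {d for drivers in FANIN.values() for d in drivers}

# Anything inside the cone of a safety critical output is treated as safety critical.
critical = cone_of_influence(FANIN, ["brake_cmd"])
non_critical = all_signals - critical

print("safety critical:    ", sorted(critical))
print("non-safety critical:", sorted(non_critical))
```

In this toy example, `debug_out` and `trace_buf` fall outside the cone, so faults in that logic cannot structurally reach `brake_cmd`. Note that `reg_a` stays safety critical even though it also feeds the debug path, because it drives `alu_out`.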
Conclusion
Semiconductors destined for the automotive industry come in a variety of flavors, ranging from small mixed-signal chips to large multi-island domain controllers. Regardless of the flavor, accurate safety planning and analysis is critical to delivering a safe IC on schedule. Spending a little more time on the initial safety analysis and leveraging the power of automation can help produce a single-iteration safety lifecycle and a semiconductor containing the optimal safety architecture.
If you are interested in learning more about how Siemens software automation tools can help, please reach out to your local sales team or go here.
Other Topics
This post is part of a broader safety series highlighting the challenges practitioners face during the development of safety critical ICs. To view other posts in the series, please refer to Guidelines to a successful ISO 26262 Life.