Colliding Worlds in Safety Analysis
Traditionally, failure mode identification has been an expert-driven exercise, with each failure mode written in plain language, such as “The ALU produces the wrong arithmetic result” or “The receive data path has corrupted data.” This top-down, expert-driven analysis yields an easily comprehensible set of failure scenarios. However, exhaustively identifying every failure mode is becoming impractical in today’s architectures, even for seasoned safety engineers. The list below represents just some of the architectural challenges facing safety engineers.
- Several functional islands
- Multiple clock & power domains
- Dozens of intersecting high speed interfaces
Fortunately, EDA has made significant advances in providing automation to aid safety experts. Tools understand design structure and makeup: gates, flops, cells, and the interconnectivity between those core elements. With regard to failure modes, tools comprehend fault models (stuck-at, transient, and so on) for those gates, flops, nets, and cells.
But now we arrive at the crux of the matter. On one hand, we have a world of traditional top-down, expert-driven analysis; on the other, a tool-driven world that can bring significant automation to the safety lifecycle. Tools cannot interpret the unbounded, informal definitions that top-down analysis produces. Likewise, an integrator won’t understand the impact of failure modes when they are stated at the granularity of stuck-at or transient faults. Simply put, the top-down and bottom-up worlds are colliding.
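To make the gap concrete, here is a minimal sketch of one way the two views could be linked: a plain-language failure mode is tied to the design elements an expert associates with it, and fault models are then enumerated on those elements. Everything in it is an assumption for illustration. The class names, the hierarchical element names, and the three-entry fault model set are hypothetical and do not reflect any particular tool’s data model.

```python
from __future__ import annotations
from dataclasses import dataclass, field

# Hypothetical fault model set; real tools define their own.
FAULT_MODELS = ("stuck_at_0", "stuck_at_1", "transient")

@dataclass(frozen=True)
class Fault:
    element: str  # hierarchical name of a gate, flop, net, or cell (made up here)
    model: str    # one of FAULT_MODELS

@dataclass
class FailureMode:
    description: str  # top-down, plain-language statement
    elements: list[str] = field(default_factory=list)  # elements the expert links to it

    def faults(self) -> list[Fault]:
        """Bottom-up view: every fault model on every linked element."""
        return [Fault(e, m) for e in self.elements for m in FAULT_MODELS]

def failure_modes_for(fault: Fault, modes: list[FailureMode]) -> list[str]:
    """Top-down view: which plain-language failure modes a given fault maps to."""
    return [m.description for m in modes if fault.element in m.elements]

# Illustrative data only; the element names are invented.
alu = FailureMode(
    "The ALU produces the wrong arithmetic result",
    ["core.alu.adder.sum_reg", "core.alu.adder.carry_net"],
)
rx = FailureMode(
    "The receive data path has corrupted data",
    ["io.rx.fifo.data_reg"],
)
MODES = [alu, rx]

if __name__ == "__main__":
    print(len(alu.faults()), "faults enumerated for the ALU failure mode")
    print(failure_modes_for(Fault("io.rx.fifo.data_reg", "transient"), MODES))
```

The point of the sketch is the shared key: once a failure mode carries explicit design element references, a tool can expand it into faults, and a fault result can be reported back in language an integrator understands.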
Given the progression of design and automation tools in safety applications, are we long overdue for a methodology shift? I’m not suggesting that safety experts be removed from the loop, but rather a new world where top-down, expert-driven analysis marries seamlessly with bottom-up automation. Specifically, are we ready for a bit of a revolution, one where industry can lean more heavily on tooling when performing safety analysis?