Writing safer, clearer C – MISRA C
Embedded developers often bemoan the fact that no programming language is ideal for their particular needs. In a way, this situation is unsurprising: although a great many developers work on embedded applications, they are still only a small subset of the world’s programming community. Nevertheless, some languages have been developed with embedded applications in mind. Notable examples are PL/M, Forth and Ada, all of which have been widely used, but never universally accepted.
The compromise that has been adopted almost universally is C …
The C language is compact, expressive and powerful. It provides a programmer with the means to write efficient, readable and maintainable code. All of these features account for its popularity. Unfortunately, the language also enables the unwary developer to write dangerous, insecure code that can cause serious problems at all stages of a development project and into deployment. For applications where safety and/or security are a major priority, these shortcomings of the language are a major concern.
It was against this background that, in the late 1990s, the Motor Industry Software Reliability Association [MISRA] introduced a set of guidelines for the use of C in vehicle systems, which became known as MISRA-C. Since then, the guidelines have been steadily refined, with a new update being published in recent weeks. A similar approach to the use of C++ has also been established. Although the guidelines were aimed at developers of software for use in cars, it was quickly realized that they are equally applicable to many other application areas where safety is critical.
Full details of MISRA-C are obtainable from MISRA themselves, and there are many tools available that support the approach. I will just give a flavor of the guidelines here. My references are from the original 1998 guide – small details have changed, but the overall philosophy and approach have not.
Rule 46: The value of an expression shall be the same under any order of evaluation that the standard permits
The C language standard gives compilers very wide latitude with respect to evaluation order in expressions. Any code that is sensitive to evaluation order is, thus, compiler dependent and unsafe.
For example, the use of the increment and decrement operators may be troublesome:
val = n++ + arr[n];
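Which element of arr is accessed? It is not clear, because the standard does not define the order in which n is incremented and read within the expression, so different compilers may produce different results. If the intention is to add the original value of n to the element that follows it, the code should be rewritten so that the increment stands in a statement of its own:
val = n + arr[n + 1];
n++;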
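The rewritten statement can be wrapped in a small function to show that each statement now reads or modifies n at most once. This is only a sketch: the function name sum_with_next, the pointer parameter and the bounds check are assumptions made for illustration and are not taken from the MISRA text.

```c
#include <stddef.h>

/* Hypothetical helper: adds the current index to the element that follows
 * it, then advances the index. No statement both modifies and separately
 * reads *n, so the result does not depend on evaluation order. */
int sum_with_next(const int arr[], size_t len, size_t *n)
{
    int val = 0;

    if ((*n + 1u) < len)              /* guard against reading past the array */
    {
        val = (int)*n + arr[*n + 1u]; /* n is only read here */
        *n = *n + 1u;                 /* the increment stands alone */
    }

    return val;
}
```

With an array holding 10, 20 and 30 and an index starting at zero, the first call adds 0 to the second element and leaves the index at 1, whichever compiler is used.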
A similar problem may occur with multiple function calls used within an expression. A function call might have a side-effect that impacts another. For example:
val = fun1() + fun2();
In this case, if either function can affect the result from the other, the code is ambiguous. To write safe code, any possible ambiguity must be removed:
val = fun1();
val += fun2();
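To make the hazard concrete, here is a sketch in which the two functions share a counter. The bodies of fun1 and fun2, the counter variable and the wrapper ordered_sum are all invented for illustration. Written as a single expression, the sum could be 11 or 1 depending on which call the compiler evaluates first; split into two statements, it is always 11.

```c
/* Shared state that both hypothetical functions touch. */
static int counter = 0;

static int fun1(void)
{
    counter += 1;           /* side effect that fun2 can observe */
    return counter;
}

static int fun2(void)
{
    return counter * 10;    /* result depends on whether fun1 has run */
}

int ordered_sum(void)
{
    int val;

    counter = 0;            /* start from a known state */
    val = fun1();           /* always evaluated first: yields 1 */
    val += fun2();          /* always evaluated second: yields 10 */

    return val;
}
```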
Rule 70: Functions shall not call themselves, either directly or indirectly
From time to time, an elegant way to express an algorithm is through the use of recursion. However, unless the recursion is very tightly controlled, there is a danger of stack overflow, which can, in turn, result in bugs that are very hard to locate. In safety-critical code, recursion should be avoided.
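As an illustration, a recursive routine can often be replaced by a loop with a fixed upper bound. The factorial functions below are my own example rather than anything from the MISRA guide, and returning 0 for out-of-range input is just an assumed error convention.

```c
#include <stdint.h>

/* Recursive form, shown only for comparison: stack usage grows with n. */
uint32_t factorial_recursive(uint32_t n)
{
    return (n <= 1u) ? 1u : (n * factorial_recursive(n - 1u));
}

/* Iterative form: stack usage is constant and the loop runs at most
 * eleven times. 12! is the largest factorial that fits in 32 bits, so
 * larger inputs are rejected (returning 0 is an assumed convention). */
uint32_t factorial_iterative(uint32_t n)
{
    uint32_t result = 1u;
    uint32_t i;

    if (n > 12u)
    {
        return 0u;
    }

    for (i = 2u; i <= n; i++)
    {
        result *= i;
    }

    return result;
}
```

The iterative version makes the worst-case stack depth and execution time easy to establish by inspection, which is the property the rule is trying to protect.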
Rule 110: Unions shall not be used to access the sub-parts of larger data types
Although C is a typed language, typing is not very strictly enforced and developers may be tempted to circumvent typing to “simplify” their code. One example would be using a union to “take apart” an unsigned integer, thus:
union e {
    unsigned int ui;
    unsigned char a[4];
} f;
In this case, each byte of ui can be accessed as an element of a. However, we cannot be sure whether a[0] is the least or most significant byte, as this is an implementation issue. [Essentially associated with the endianness of the processor.] The alternative might be to use shifting and masking, thus:
unsigned char getbyte(unsigned int input, unsigned int index)
{
    input >>= (index * 8);
    return input & 0xff;
}
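A slightly more defensive variant of the same idea is sketched below. It assumes fixed-width types from stdint.h so that the four-byte width is explicit, and it adds a range check on the index, since shifting a 32-bit value by 32 bits or more is undefined. The name get_byte and the choice to return 0 for an out-of-range index are assumptions for illustration.

```c
#include <stdint.h>

/* Hypothetical variant of getbyte: index 0 always selects the least
 * significant byte, whatever the endianness of the processor. */
uint8_t get_byte(uint32_t input, uint32_t index)
{
    uint8_t result = 0u;

    if (index < 4u)    /* a shift of 32 bits or more would be undefined */
    {
        result = (uint8_t)((input >> (index * 8u)) & 0xFFu);
    }

    return result;
}
```

For the value 0x12345678, index 0 yields 0x78 and index 3 yields 0x12 on any target, which is exactly the guarantee the union cannot give.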
It may be argued that these rules [and most, if not all, of MISRA-C] are just common sense and any good programmer would take such an approach. This may be true, but a set of clear guidelines leaves less to chance.
Comments
MISRA-C is great for boosting code quality, though sometimes overkill when no lives are on the line. Sadly, though, it doesn’t provide guidance on any stylistic issues.
Programmers and teams interested in keeping bugs out of their embedded software without paying the full tax of adhering to MISRA-C may want to check out Barr Group’s “Embedded C Coding Standard”.