MISRA C: Write safer, clearer C code




Embedded developers often bemoan the fact that no programming language is ideal for their particular needs. In a way, this situation is unsurprising, because, although a great many developers are working on embedded applications, they are still only quite a small subset of the world’s programming community. Nevertheless, some languages have been developed with embedded in mind. Notable examples are PL/M, Forth and Ada, all of which have been widely used, but never universally accepted. Other languages, like Rust, are gaining support, but are not yet mainstream. The compromise, which has been adopted almost universally, is C. How can that compromise be made to work most effectively?

The C language is compact, expressive and powerful. It provides a programmer with the means to write efficient, readable and maintainable code. All of these features account for its popularity. Unfortunately, the language also enables the unwary developer to write dangerous, insecure code that can cause serious problems at all stages of a development project and into deployment. For applications where safety and/or security are a major priority, these shortcomings of the language are a major concern.

It was against this background that, in the late 1990s, the Motor Industry Software Reliability Association (MISRA) introduced a set of guidelines for the use of C in vehicle systems, which became known as MISRA C. Since then, the guidelines have been steadily refined, with updates being published from time to time. A similar approach to the use of C++ has also been established. Although the guidelines were originally aimed at developers of software for use in cars, it was quickly realized that they are equally applicable to many other application areas where safety is critical, and the standard is now widely adopted in many industries.

Although MISRA C is not a style guide – indeed many users apply a style guide as well as the standard – numerous rules also promote the writing of clear, readable, maintainable code. This is very beneficial, as code that is straightforward to understand is much less likely to harbor subtle bugs or undefined behavior.

Full details of MISRA C are obtainable from https://misra.org.uk and there are many tools available that support the approach.

I will just give a flavor of the guidelines here. My references are from MISRA C:2012 third edition, first revision. MISRA C is under constant review, with incremental changes addressing clarity and accuracy of the guidelines and support for newer versions of the C language standard. Although details change, the overall philosophy and approach do not.

Rule 13.2 – The value of an expression and its persistent side effects shall be the same under all permitted evaluation orders

The C language standard gives compilers very wide latitude with respect to evaluation order in expressions. Any code that is sensitive to evaluation order is, thus, compiler-dependent, and compiler-dependent code should always be considered unsafe.

For example, the use of the increment and decrement operators may be troublesome:

val = n++ + arr[n];

Which element of arr is accessed? Did the programmer expect the value of n used to index the array to be that before the increment or after? Although it might look as if the increment is performed before the array index, that assumes left-to-right expression evaluation, which is not a valid assumption. Worse, because n is modified and read in the same expression with no sequence point in between, the C standard leaves the behavior undefined. So, the code is not clear and should be rewritten thus:

val = n + arr[n+1];
n++;

or

val = n++;
val += arr[n];

or even

val = n;
n++;
val += arr[n];

Which of these options you choose depends on personal style. They all perform the same operation, and, in fact, an optimizing compiler would most likely generate exactly the same code.
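As a rough check of that claim, the three rewrites can be wrapped in small functions and compared. This is only a sketch: the function names and the use of a pointer for n are illustrative assumptions, not part of the original code.

```c
/* Hypothetical wrappers, one per rewrite shown above.
   n is passed by pointer so the caller can observe the increment. */

/* First form: index with n + 1, then increment n. */
int sum_form_a(const int arr[], int *n)
{
    int val = *n + arr[*n + 1];
    (*n)++;
    return val;
}

/* Second form: take the old value of n while incrementing, then index. */
int sum_form_b(const int arr[], int *n)
{
    int val = (*n)++;
    val += arr[*n];
    return val;
}

/* Third form: every step is a separate statement. */
int sum_form_c(const int arr[], int *n)
{
    int val = *n;
    (*n)++;
    val += arr[*n];
    return val;
}
```

With arr holding {10, 20, 30, 40} and n starting at 1, each form returns 31 (that is, 1 + arr[2]) and leaves n at 2.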

A similar problem may occur with multiple function calls used within an expression. A function call might have a side effect that impacts another. For example:

val = fun1() + fun2();

In this case, if either function can affect the result from the other, the code is ambiguous. To write safe code, any possible ambiguity must be removed:

val = fun1();
val += fun2();

It is now clear that fun1() is executed first.
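To make the hazard concrete, here is a minimal sketch in which the two functions share state. The bodies of fun1() and fun2(), the counter variable and the ordered_sum() wrapper are all assumptions made for illustration; the original does not say what the functions do.

```c
/* Hypothetical shared state touched by both functions. */
static int counter = 0;

/* Side effect: modifies counter, then returns its new value. */
int fun1(void)
{
    counter += 1;
    return counter;
}

/* Depends on counter, so its result depends on whether fun1() ran first. */
int fun2(void)
{
    return counter * 10;
}

/* The call order is fixed by using separate statements. */
int ordered_sum(void)
{
    int val;

    counter = 0;
    val = fun1();   /* counter becomes 1, val is 1 */
    val += fun2();  /* counter is 1, so 10 is added */
    return val;
}
```

Written this way the result is always 11. In the single-expression form, a compiler that happened to call fun2() first would produce 1 instead.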

Rule 17.2 – Functions shall not call themselves, either directly or indirectly

From time to time, an elegant way to express an algorithm is through the use of recursion. However, unless the recursion is very tightly controlled, there is a danger of stack overflow, which can, in turn, result in bugs that are very hard to locate. In safety-critical code, recursion should be avoided.
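Many recursive algorithms can be recast as loops, which keeps stack usage constant and easy to reason about. As an illustration only (factorial is my choice of example, not one taken from the guidelines), the classic recursive definition might be written iteratively like this:

```c
#include <stdint.h>

/* Iterative factorial: a hypothetical example of replacing recursion
   with a loop. Stack usage does not grow with n.
   Note: the result only fits in 32 bits for n <= 12; larger values of n
   wrap around, because unsigned arithmetic is modular. */
uint32_t factorial_iterative(uint32_t n)
{
    uint32_t result = 1u;
    uint32_t i;

    for (i = 2u; i <= n; i++)
    {
        result *= i;
    }
    return result;
}
```

For example, factorial_iterative(5u) returns 120, and the loop body simply never runs for inputs of 0 or 1, giving 1.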

Rule 19.2 – The union keyword should not be used

Although C is a typed language, typing is not very strictly enforced, and developers may be tempted to override typing to “simplify” their code. Adhering to the constraints of data types is essential to create safe code, as any attempts to get around data types can produce undefined results. The union keyword can be used for a number of purposes, which generally result in unclear code, but can also be a means to circumvent typing.

One example would be using a union to “take apart” an unsigned integer, thus:

union e
{
   unsigned int ui;
   unsigned char a[4];
}f;

In this case, each byte of ui can be accessed as an element of a (assuming that an unsigned int occupies four bytes). However, we cannot be sure whether a[0] is the least or most significant byte, as this is an implementation issue, essentially determined by the endianness of the processor. The alternative might be to use shifting and masking, thus:

unsigned char getbyte(unsigned int input, unsigned int index)
{
  /* index 0 selects the least significant byte, whatever the endianness */
  input >>= (index * 8u);
  return (unsigned char)(input & 0xffu);
}
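The function above still depends on the width of unsigned int and does not check index. A possible variant, sketched here under the assumption that a 32-bit value is wanted, uses the fixed-width types from <stdint.h>. The name get_byte_u32 and the choice to return 0 for an out-of-range index are mine, not part of the original.

```c
#include <stdint.h>

/* Hypothetical fixed-width variant of getbyte().
   index 0 selects the least significant byte, index 3 the most significant,
   independent of the processor's byte order. */
uint8_t get_byte_u32(uint32_t input, uint32_t index)
{
    /* Assumption: an index outside 0..3 is treated as an error and yields 0.
       This also avoids shifting by 32 bits or more, which is undefined. */
    if (index > 3u)
    {
        return 0u;
    }
    return (uint8_t)((input >> (index * 8u)) & 0xFFu);
}
```

For the input 0x12345678, index 0 gives 0x78 and index 3 gives 0x12 on any conforming implementation, which is exactly the guarantee the union version cannot offer.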

It may be argued that these rules (and most, if not all, of MISRA C) are just common sense and any good programmer would take such an approach. This may be true, but a set of clear guidelines leaves less to chance.



The post MISRA C: Write safer, clearer C code appeared first on Embedded.com.





Original article: MISRA C: Write safer, clearer C code
Author: Colin Walls