User Defined Semantics (-sem)
The -sem() option allows users to endow their functions
with user-defined semantics. This may be considered an extension of
the -function() option (see
Function Mimicry).
Recall that with the -function()
option the user may copy the semantics of a built-in function to any
other function but new semantics cannot be created.
With the -sem option, entirely new checks can be created;
integral and pointer arguments can be checked in combination with each
other using usual C operators and syntax. Also, you can specify
some constraints upon the return value.
The format of the -sem() option is:
-sem( name [ , sem ] ... )
This associates each of the semantics sem ... with the named
function name. The semantics sem are defined below. The
semantics replace any semantics that may have been previously assigned
to name. If no sem is given, i.e. if only name is
given, the option is taken as a request to remove semantics from the
named function. Once semantics have been given to a named function, the
-function() option may be used to copy the semantics in
whole or in part to other functions.
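As a rough sketch of this life cycle, the fragment below assigns semantics to a function, copies them with -function(), and finally requests their removal. The functions lookup() and lookup_cached() and the names table are hypothetical, invented only for this illustration. The options are written as lint comments, so a compiler simply ignores them.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical data and functions, invented for this sketch. */
static const char *names[] = { "alpha", "beta" };

/*lint -sem( lookup, r_null, 1p )  may return NULL; arg 1 is checked for NULL */
/*lint -function( lookup, lookup_cached )  copy lookup()'s semantics */
const char *lookup(const char *key)
{
    size_t i;
    for (i = 0; i < sizeof names / sizeof names[0]; i++)
        if (strcmp(names[i], key) == 0)
            return names[i];
    return NULL;                 /* the case that r_null describes */
}

const char *lookup_cached(const char *key)
{
    return lookup(key);          /* no real cache; kept trivial on purpose */
}

/*lint -sem( lookup )  only a name is given: request removal of its semantics */
```

The same three options could equally be placed in an options file rather than in lint comments.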
Possible Semantics
sem may be one of:
- r_null -- the function may return the null pointer. This
information is used in subsequent value tracking. For example:
/*lint -sem( f, r_null ) */
char *f();
char *p = f();
*p = 0; /* warning, p may be null */
This is identical to the semantic S2 defined in
Function Listing - Semantics
and it is considered a Return semantic. See
Function Mimicry
for the definition of Return semantic. A more flexible way to provide
Return semantics is given below under expressions (exp).
- r_no -- the function does not return. Code following such a function
is considered unreachable. This semantic is identical to the
semantic defined in
Function Listing - Semantics
as S3; it is the semantic
used for the exit() function. This also is considered a
Return semantic.
- ip (e.g. 3p) -- the ith argument should be checked for
null. If the ith argument could possibly be null this
will be reported. For example:
/*lint -sem( g, 1p ) warn if g() is passed a NULL */
/*lint -sem( f, r_null ) f() may return NULL */
char *f();
void g(char *);
g( f() ); /* warning, g is passed a possible null */
This semantic is identical to the S1 semantic described
in Function Listing - Semantics.
Note that for the purposes of this example, we have placed the
-sem options within lint comments. They may also be
placed in a project-wide options file (.lnt file).
- exp (a semantic expression built from the expression elements
listed below)
- in denotes the ith argument, which must be
integral (e.g. 3n refers to the 3rd
argument). An argument is integral if it is typed
int or some variation of integral such as
char, unsigned long, an enumeration,
etc.
i may be @ (commercial at) in which case the return
value is implied. For example, the expression:
@n == 4 || @n > 1n
states that the return value will either be equal to 4 or
will be greater than the first argument.
- ip denotes the ith argument, which must be some
form of pointer (or array). The value of this
variable is the number of items pointed to by the
pointer (or in the array). For example, the
expression:
2p == 10
specifies a constraint that the 2nd argument, which
happens to be a pointer, should have exactly 10 items.
Just as with in, i may be @, in which case
the return value is indicated.
- iP is like ip except that all values are
specified in bytes. For example, the semantic:
2P == 10
specifies that the size in bytes of the area pointed to
by the 2nd argument is 10. To specify a return pointer
where the area pointed to is measured in bytes we use
@P .
- integer (any C integral or character constant) denotes
itself.
- identifier may be a macro (non-function type), enumerator,
or const variable. The identifier is
retained at option processing time and evaluated
at the time of function call. If a macro, the
macro must evaluate to an integral expression.
- ( )
- Unary operators: + - ! ~
- Binary operators: + - * / % < <= == != > >= | & ^
<< >> || &&
- Ternary operator: ?:
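To make the element notation concrete, here is a minimal sketch that attaches several of the semantics listed above to two functions. The functions die() and sum_first() and the macro MIN_ITEMS are hypothetical names chosen for this illustration. The byte count mentioned in the comments assumes a 4-byte int.

```c
#include <stdio.h>
#include <stdlib.h>

#define MIN_ITEMS 4   /* macro used as an identifier element in the semantic */

/*lint -sem( die, r_no )  die() does not return */
void die(const char *msg)
{
    fprintf(stderr, "%s\n", msg);
    exit(EXIT_FAILURE);
}

/* 1p is the number of items v points to (1P would be the same region
 * measured in bytes, e.g. 16 for int v[4] with a 4-byte int).
 * 2n is the integral argument count.
 * The stand-alone 1p asks that argument 1 be checked for null. */
/*lint -sem( sum_first, 1p >= MIN_ITEMS && 2n > 0 && 2n <= 1p, 1p ) */
int sum_first(const int *v, int count)
{
    int total = 0;
    int i;

    if (count <= 0)
        die("count must be positive");   /* code after die() is unreachable */
    for (i = 0; i < count; i++)
        total += v[i];
    return total;
}
```

A call such as sum_first( v, 2 ) with int v[4] satisfies every clause of the expression, whereas passing an array of only three items would contradict 1p >= MIN_ITEMS.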
Semantic Expressions
Operators, parentheses and constants have their usual C/C++ meaning.
Also the precedence of operators is identical to C/C++.
There may be at most two expressions in any one -sem option, one
expressing Return semantics and one expressing Function-wide semantics.
Return semantics
An expression involving the return value (one of @n, @p,
@P) is a Return semantic and indicates something about the
returned value. For example, if the semantics for strlen() were
given explicitly, they might be given as:
-sem( strlen, @n < 1p, 1p )
In words, the return value is strictly less than the size of the buffer
given as first argument. Also the first argument should not be null.
To express further uncertainty about the return value, one or more
expressions involving the return value may be alternated using the
|| operator. For example:
-sem( fgets, @p == 1p || @p == 0 )
represents a possible Return semantic for the built-in function
fgets. Recall that the fgets function returns the address
of the buffer passed as first argument unless an end of file (or error)
occurs in which case the null pointer is returned. If the Return
semantic indicates, in the case of a pointer, that the return value may
possibly be zero (as in this example) this is taken as a possibility of
returning the null pointer.
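As an illustration of what such a Return semantic asks of callers, the sketch below gives a hypothetical function, nonempty(), a semantic of the same shape as the fgets example, together with a caller that tests the result before using it. Both function names are assumptions made for this example.

```c
#include <stddef.h>

/* Returns its argument, or the null pointer if the string is empty. */
/*lint -sem( nonempty, @p == 1p || @p == 0, 1p ) */
char *nonempty(char *s)
{
    return (s[0] != '\0') ? s : NULL;
}

/* A caller that respects the possibility of a null return. */
size_t length_or_zero(char *s)
{
    char *p = nonempty(s);
    size_t n = 0;

    if (p == NULL)               /* test before dereferencing */
        return 0;
    while (p[n] != '\0')
        n++;
    return n;
}
```

Without the test for NULL in length_or_zero(), the dereference of p would be a use of a possibly null pointer, which is the situation value tracking is meant to catch.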
Function-wide semantics
An expression that is not a Return semantic is a 'Function-wide'
semantic (to use the terminology of
Special Functions). It
indicates a predicate that should be true. If
there is a decided possibility that it is false, a diagnostic is issued.
What constitutes a "decided possibility"? This is determined by
considerations described in
Value Tracking. If
nothing is known about a situation, no diagnostic is issued.
If what we do know suggests the possibility of a violation of the
Function-wide semantic, a diagnostic is issued.
For example, to check to see if the region of storage passed to function
g() is at least 6 bytes you may use the following:
//lint -sem( g, 1P >= 6 ) 1st arg. must have at least 6 bytes
short a[3]; // a[] has 6 bytes
short *p = a+1; // p points to 4 bytes
void g( short * );
g( a ); // OK
g( p ); // Warning
Several constraints may be AND'ed using the && operator. For
example, to check that fread( buffer, size,
count, stream ) has non-zero second and third
arguments and that their product exactly equals the size of the buffer
you may use the following option.
-sem( fread, 1P==2n*3n && 2n>0 && 3n>0 )
Note that we rely on C's operator precedence to properly group operator
arguments.
To continue with our example we should add Return Semantics.
fread returns a value no greater than the third argument
(count). Also, the first and fourth arguments should be checked
for null. A complete semantic option for fread becomes:
-sem( fread, 1P==2n*3n && 2n>0 && 3n>0, @n<=3n, 1p, 4p )
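The following sketch shows a call that is consistent with every clause of this option. The function round_trip() and the macro COUNT are hypothetical. The function writes a few integers to a temporary file and reads them back, so it depends on tmpfile() succeeding in the environment where it runs.

```c
#include <stdio.h>

#define COUNT 4

/*lint -sem( fread, 1P==2n*3n && 2n>0 && 3n>0, @n<=3n, 1p, 4p ) */

/* Returns the number of items read back, or 0 on failure. */
size_t round_trip(void)
{
    int src[COUNT] = { 1, 2, 3, 4 };
    int dst[COUNT] = { 0 };
    size_t got;
    FILE *fp = tmpfile();

    if (fp == NULL)              /* 4p: do not pass a null stream */
        return 0;
    if (fwrite(src, sizeof src[0], COUNT, fp) != COUNT) {
        fclose(fp);
        return 0;
    }
    rewind(fp);

    /* sizeof dst equals sizeof dst[0] * COUNT, so 1P == 2n*3n holds;
     * both the size and the count are greater than zero. */
    got = fread(dst, sizeof dst[0], COUNT, fp);
    fclose(fp);
    return got;                  /* @n <= 3n: never more than COUNT */
}
```

By contrast, a call such as fread( dst, sizeof dst[0], COUNT + 1, fp ) would contradict 1P==2n*3n, because the buffer is smaller than the product of size and count.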
It is possible to employ symbols in semantic expressions rather than
literal numbers. For example:
//lint -sem( f, 1p > N )
#define N 100
int a[N]; int b[N+1];
void f( int * );
.
.
.
f( a ); // warn
f( b ); // OK
Just as is the case with -function, -sem may be applied to
member functions. For example:
#include <string.h>   // for strcpy
//lint -sem( X::cpy, 1P <= BUFLEN )
const int BUFLEN = 4;
class X
{
public:
char buf[BUFLEN];
void cpy( char * p )
{ strcpy( buf, p ); }
void slen( char * p );
};
void f( X &x )
{
x.cpy( "abcd" ); // Warning
x.cpy( "abc" ); // OK
}
In this example, the size in bytes of the area pointed to by the
argument to X::cpy must be less than or equal to BUFLEN. The string
"abcd" requires 5 bytes (including the nul character) and BUFLEN is
defined to be 4. Hence a warning is issued for the first call.
Notes on Semantic Specifications
- Every function has, potentially, a Return semantic, a Function-wide
semantic and Argument semantics for each of the arguments. An
expression of the form ip when it stands alone and is not
part of another expression becomes an Argument semantic for
argument i. Thus, for the option
-sem( f, 2p, 1p>0 )
2p becomes an Argument semantic (the pointer should not be
NULL) for argument 2. We can transfer this semantic to, say, the
3rd argument of function g by using the option
-function( f(2), g(3) )
The expression 1p>0 becomes the Function-wide semantic for
function f and can be transferred via the 0 subscript as in:
-function( f(0), g(0) )
We could have placed these two together as one large semantic as
in:
-sem( f, 2p && 1p > 0 )
The earlier rendition is preferred because there is a specialized
set of warning messages for the argument semantic of passing null
pointers to functions.
- Please note that r_null and an expression involving the return
value (@) are both Return semantics. You cannot have both in one
option. Thus you cannot have
-sem( f, r_null, @p == 1p )
It is easy to convert this into an acceptable semantic as follows:
-sem( f, @p == 0 || @p == 1p )
- The notation for arguments and return values was not chosen
capriciously. A notation such as @n == 2n may look strange
at first but it was chosen so as not to conflict with user
identifiers.
- Please note that the values denoted by the argument and return
notations (such as 1p and @n) are signed integral values.
Thus we may write
-sem( strlen, @n<1p )
Here we are not comparing integers with pointers. Rather we are
comparing the number of items that a pointer points to (an integer)
with an integral return value.
For uniformity, the arithmetic of semantics is signed integral
arithmetic, usually long precision. This means that greater-than
comparisons with numbers higher than the largest signed long
will not work.
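To tie the first note to actual declarations, here is a minimal sketch in which the option letters f and g stand for two small functions. The parameter lists and bodies are assumptions made for this illustration; only the three options come from the note itself.

```c
#include <stddef.h>

/*lint -sem( f, 2p, 1p>0 )      2p: Argument semantic; 1p>0: Function-wide */
/*lint -function( f(2), g(3) )  argument 2 of f -> argument 3 of g */
/*lint -function( f(0), g(0) )  Function-wide semantic of f -> g */

/* Multiplies the first element of data by *scale. */
int f(const int *data, const int *scale)
{
    return data[0] * *scale;
}

/* Multiplies the sum of the first n elements of data by *scale. */
int g(const int *data, int n, const int *scale)
{
    int total = 0;
    int i;

    for (i = 0; i < n; i++)
        total += data[i];
    return total * *scale;
}
```

After the two -function() options, g carries the null check on its third argument and the requirement 1p>0 on its first, without a separate -sem option of its own.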
PC-lint and FlexeLint are trademarks of Gimpel Software
Copyright © 2011, Gimpel Software, All rights reserved.