Gimpel Software

User Defined Semantics (-sem)

The -sem() option allows the user to endow functions with user-defined semantics. It may be considered an extension of the -function() option (see Function Mimicry). Recall that the -function() option lets the user copy the semantics of a built-in function to any other function, but it cannot create new semantics.

With the -sem option, entirely new checks can be created: integral and pointer arguments can be checked in combination with each other using the usual C operators and syntax. You can also specify constraints on the return value.

The format of the -sem() option is:

    -sem( name [ , sem ] ... )

This associates each of the semantics sem ... with the named function name. The semantics sem are defined below. The semantics replace any semantics that may have been previously assigned to name. If no sem is given, i.e. if only name is given, the option is taken as a request to remove semantics from the named function. Once semantics have been given to a named function, the -function() option may be used to copy the semantics in whole or in part to other functions.

Possible Semantics

sem may be one of:

  • r_null -- the function may return the null pointer. This information is used in subsequent value tracking. For example:
                /*lint -sem( f, r_null ) */
                char *f();
                char *p = f();
                *p = 0;     /* warning, p may be null */

    This is identical to the semantic S2 defined in Function Listing - Semantics and it is considered a Return semantic. See Function Mimicry for the definition of Return semantic. A more flexible way to provide Return semantics is given below under expressions (exp).

  • r_no -- the function does not return. Code following a call to such a function is considered unreachable. This semantic is identical to the semantic defined in Function Listing - Semantics as S3; it is the semantic used for the exit() function. It is also considered a Return semantic.

  • ip (e.g. 3p) -- the ith argument should be checked for null. If the ith argument could possibly be null, this will be reported. For example:

                /*lint -sem( g, 1p ) warn if g() is passed a NULL */
                /*lint -sem( f, r_null ) f() may return NULL */
                char *f();
                void g(char *);
                g( f() );     /* warning, g is passed a possible null */ 

    This semantic is identical to the S1 semantic described in Function Listing - Semantics.

    Note that for the purposes of this example, we have placed the -sem options within lint comments. They may also be placed in a project-wide options file (.lnt file).

  • exp (a semantic expression involving the expression elements as follows)

    • in denotes the ith argument, which must be integral (e.g. 3n refers to the 3rd argument). An argument is integral if it is typed int or some variation of an integral type such as char, unsigned long, an enumeration, etc.

      i may be @ (commercial at) in which case the return value is implied. For example, the expression:

                  @n == 4 || @n > 1n 

      states that the return value will either be equal to 4 or will be greater than the first argument.

    • ip denotes the ith argument which must be some form of pointer (or array). The value of this variable is the number of items pointed to by the pointer (or in the array). For example, the expression:

                  2p == 10 

      specifies a constraint that the 2nd argument, which happens to be a pointer, should have exactly 10 items. Just as with in, i may be @, in which case the return value is indicated.

    • iP is like ip except that all values are specified in bytes. For example, the semantic:

                  2P == 10 

      specifies that the size in bytes of the area pointed to by the 2nd argument is 10. To specify a return pointer where the area pointed to is measured in bytes, we use @P.

    • integer (any C integral or character constant) denotes itself.

    • identifier may be a macro (non-function type), enumerator, or const variable. The identifier is retained at option processing time and evaluated at the time of the function call. If it is a macro, the macro must evaluate to an integral expression.

    • ( ) -- parentheses, used for grouping

    • Unary operators: + - ! ~

    • Binary operators: + - * / % < <= == != > >= | & ^ << >> || &&

    • Ternary operator: ?:
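
The difference between ip (a count of items) and iP (a count of bytes) can be worked out by hand with sizeof. The following is a minimal sketch in plain C, not PC-lint output; the array table and the helpers items_in_table and bytes_in_table are hypothetical names introduced only for illustration.

```c
#include <stddef.h>

/* Hypothetical array, used only for illustration. */
int table[10];

/* The value an element such as 1p would take if table were the
   first argument: the number of items in the array. */
size_t items_in_table(void)
{
    return sizeof table / sizeof table[0];
}

/* The value an element such as 1P would take for the same argument:
   the size of the array in bytes. */
size_t bytes_in_table(void)
{
    return sizeof table;
}
```

Under these assumptions, a constraint such as 1p == 10 would hold for table regardless of the size of int, whereas 1P == 10 would hold only on a platform where sizeof(int) is 1.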

Semantic Expressions

Operators, parentheses and constants have their usual C/C++ meaning. Also the precedence of operators is identical to C/C++.

There may be at most two expressions in any one -sem option, one expressing Return semantics and one expressing Function-wide semantics.

Return semantics

An expression involving the return value (one of @n, @p, @P) is a Return semantic and indicates something about the returned value. For example, if the semantics for strlen() were given explicitly, they might be given as:

    -sem( strlen, @n < 1p, 1p ) 

In words, the return value is strictly less than the size of the buffer given as first argument. Also the first argument should not be null.

To express further uncertainty about the return value, one or more expressions involving the return value may be alternated using the || operator. For example:

    -sem( fgets, @p == 1p || @p == 0 )

represents a possible Return semantic for the built-in function fgets. Recall that the fgets function returns the address of the buffer passed as first argument unless an end of file (or error) occurs, in which case the null pointer is returned. If the Return semantic indicates, in the case of a pointer, that the return value may possibly be zero (as in this example), this is taken as a possibility of returning the null pointer.
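
The same form of Return semantic could be attached to a user function. The sketch below assumes a hypothetical function nonempty() that either returns its argument or the null pointer, and a hypothetical caller first_char(); neither name comes from PC-lint. The lint comment only restates the expression form shown above for fgets.

```c
#include <stddef.h>

/*lint -sem( nonempty, @p == 1p || @p == 0, 1p ) */

/* Hypothetical function: returns its argument, or NULL for an empty string. */
const char *nonempty(const char *s)
{
    return (s[0] != '\0') ? s : NULL;
}

/* Hypothetical caller: returns the first character of s, or '?'
   when nonempty() yields the null pointer. */
char first_char(const char *s)
{
    const char *p = nonempty(s);
    if (p == NULL)          /* the Return semantic says p may be null */
        return '?';
    return *p;              /* p has been tested, so this use is guarded */
}
```

Because the semantic admits @p == 0, the caller tests the result before dereferencing it; removing that test is the kind of code the "may be null" diagnostic shown earlier is aimed at.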

Function-wide semantics

An expression that is not a Return semantic is a 'Function-wide' semantic (to use the terminology of Special Functions). It indicates a predicate that should be true. If there is a decided possibility that it is false, a diagnostic is issued.

What constitutes a "decided possibility"? This is determined by considerations described in Value Tracking. If nothing is known about a situation, no diagnostic is issued. If what we do know suggests the possibility of a violation of the Function-wide semantic, a diagnostic is issued.

For example, to check to see if the region of storage passed to function g() is at least 6 bytes you may use the following:

    //lint -sem( g, 1P >= 6 ) 1st arg. must have at least 6 bytes
    short a[3];      // a[] has 6 bytes (assuming a 2-byte short)
    short *p = a+1;  // p points to 4 bytes
    void g( short * );
    g( a );     // OK
    g( p );     // Warning
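
The byte counts in the comments above follow from simple arithmetic: the bytes reachable through a pointer are the bytes of the whole array minus the bytes skipped. A minimal sketch of that arithmetic follows; bytes_from and satisfies_g are hypothetical helpers, not part of PC-lint, and the figures 6 and 4 assume sizeof(short) is 2.

```c
#include <stddef.h>

/* Hypothetical helper: bytes remaining in an array of `count` shorts
   when a pointer has been advanced `offset` elements from its start. */
size_t bytes_from(size_t count, size_t offset)
{
    return (count - offset) * sizeof(short);
}

/* Mirrors the Function-wide semantic 1P >= 6 from the example above. */
int satisfies_g(size_t bytes)
{
    return bytes >= 6;
}
```

With a 2-byte short, bytes_from(3, 0) is 6 and satisfies the predicate, while bytes_from(3, 1) is 4 and does not, which matches the OK and Warning annotations.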
    

Several constraints may be AND'ed using the && operator. For example, to check that fread( buffer, size, count, stream ) has non-zero second and third arguments and that their product exactly equals the size of the buffer you may use the following option.

    -sem( fread, 1P==2n*3n && 2n>0 && 3n>0 )

Note that we rely on C's operator precedence to properly group operator arguments.

To continue with our example, we should add Return semantics. fread returns a value no greater than the third argument (count). Also, the first and fourth arguments should be checked for null. A complete semantic option for fread becomes:

    -sem( fread, 1P==2n*3n && 2n>0 && 3n>0, @n<=3n, 1p, 4p )
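
To see how the two expressions read, it can help to evaluate them by hand as ordinary C predicates. The sketch below is only an illustration of the expressions, not of how PC-lint evaluates them internally; the function and parameter names are hypothetical, with buf_bytes standing for 1P, size for 2n, count for 3n and ret for @n.

```c
/* Mirrors the Function-wide semantic 1P==2n*3n && 2n>0 && 3n>0,
   using signed long arithmetic. */
int fread_args_ok(long buf_bytes, long size, long count)
{
    return buf_bytes == size * count && size > 0 && count > 0;
}

/* Mirrors the Return semantic @n<=3n. */
int fread_return_ok(long ret, long count)
{
    return ret <= count;
}
```

For a 40-byte buffer, fread_args_ok(40, 4, 10) holds, while fread_args_ok(40, 4, 8) does not because the product 32 differs from the buffer size.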

It is possible to employ symbols in semantic expressions rather than hard numbers. For example:

            //lint -sem( f, 1p > N )
            #define N 100
            int a[N]; int b[N+1];
            void f( int * );
                .
                .
                .
                f( a );   // warn
                f( b );   // OK
                

Just as is the case with -function, -sem may be applied to member functions. For example:

            //lint -sem( X::cpy, 1P <= BUFLEN )

            const int BUFLEN = 4;

            class X
                {
              public:
                char buf[BUFLEN];
                void cpy( char * p )
                    { strcpy( buf, p ); }
                void slen( char * p );
                };

            void f( X &x )
                {
                x.cpy( "abcd" );    // Warning
                x.cpy( "abc" );     // OK
                } 

In this example, the size in bytes of the area pointed to by the argument to X::cpy must be less than or equal to BUFLEN. The byte requirements of "abcd" are 5 (including the nul character) and BUFLEN is defined to be 4. Hence a warning is issued here.
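
The byte count used in this comparison can be reproduced in plain C. The following is a minimal sketch; string_bytes and fits_in_buf are hypothetical helpers introduced for illustration, and BUFLEN is written as a macro here so that the fragment stands alone.

```c
#include <string.h>

#define BUFLEN 4   /* same value as in the example above */

/* Hypothetical helper: bytes occupied by a string, including the nul. */
size_t string_bytes(const char *s)
{
    return strlen(s) + 1;
}

/* Mirrors the semantic 1P <= BUFLEN for a string argument. */
int fits_in_buf(const char *s)
{
    return string_bytes(s) <= BUFLEN;
}
```

Here "abcd" needs 5 bytes and fails the predicate, while "abc" needs exactly 4 and passes, which is why only the first call in the example is flagged.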

Notes on Semantic Specifications

  1. Every function has, potentially, a Return semantic, a Function-wide semantic and Argument semantics for each of the arguments. An expression of the form ip when it stands alone and is not part of another expression becomes an Argument semantic for argument i. Thus, for the option

       -sem( f, 2p, 1p>0 )

    2p becomes an Argument semantic (the pointer should not be NULL) for argument 2. We can transfer this semantic to, say, the 3rd argument of function g by using the option

       -function( f(2), g(3) )

    The expression 1p>0 becomes the Function-wide semantic for function f and can be transferred via the 0 subscript as in:

       -function( f(0), g(0) )

    We could have placed these two together as one large semantic as in:

       -sem( f, 2p && 1p > 0 )

    The earlier rendition is preferred because there is a specialized set of warning messages for the argument semantic of passing null pointers to functions.

  2. Please note that r_null and an expression involving argument @ are both Return semantics. You cannot have both in one option. Thus you cannot have

       -sem( f, r_null, @p == 1p )

    It is easy to convert this into an acceptable semantic as follows:

       -sem( f, @p == 0 || @p == 1p )

  3. The notations for arguments and return values were not chosen capriciously. A notation such as @n == 2n may look strange at first, but it was chosen so as not to conflict with user identifiers.

  4. Please note that the values of the expression elements (in, ip, iP) are signed integral values. Thus we may write

       -sem( strlen, @n<1p )

    We are not comparing integers with pointers here. Rather, we are comparing the number of items that a pointer points to (an integer) with an integral return value.

    For uniformity, the arithmetic of semantics is signed integral arithmetic, usually long precision. This means that greater-than comparisons with numbers higher than the largest signed long will not work.



PC-lint and FlexeLint are trademarks of Gimpel Software
Copyright © 2013, Gimpel Software, All rights reserved.