Strong type checking is gold;
normal type checking is silver;
and casting is brass
8.1 Introduction
8.2 What are Strong Types?
8.3 -strong
8.3.1 Description of -strong
8.3.2 Examples of -strong
8.4 -index
8.4.1 Description of -index
8.4.2 Examples of -index
8.4.3 Multidimensional Arrays
8.5 Type Hierarchies
8.5.1 The Need for a Type Hierarchy
8.5.2 The Natural Type Hierarchy
8.5.3 Adding to the Natural Hierarchy
8.5.4 Restricting Down Assignments (-father)
8.5.5 Printing the Hierarchy Tree
8.6 Hints on Strong Typing
8.7 Reference Information
8.7.1 Strong Expressions
8.7.2 General Information
8.8 Strong Types and Prototypes
8.1 Introduction
The notion of strong typing is not usually carefully defined. It generally
means the kind of type checking that Pascal has and that C and C++ do not.
This includes the following:
- User-defined types match only through the nominal type, not through
the underlying type as is done with C. In other words, typedefs are ignored
in C and C++ for the purpose of determining type compatibility.
- A special Boolean type is supported which must be used where Booleans
are expected. In C, any scalar can be used as a Boolean and any Boolean
is type int. The new bool type in C++ may be considered
a step in the direction of stronger Boolean typing, but nonetheless, any
scalar may still be employed as a Boolean.
- The Pascal equivalents of char and enum objects are not
automatically converted to and from int as is done in C. Explicit
conversion is required.
- Every array has an expected index type and every subscript must match
this type. In C, any integral-typed expression can be used as a subscript
for any array.
- Pascal has a set facility typically implemented as a finite number
of bits that are either on or off. In C, one uses bit-wise operations on
integral quantities to achieve the same effect. C's approach is more flexible
but Pascal sets and their members cannot be improperly mixed.
In addition to these static checks, Pascal systems have run-time checks
including subscript bounds and null pointer checks. We do not include these
under the notion of strong type checking.
In the following explanation, each of the static type checks enumerated
above is represented by an option to PC-lint/FlexeLint.
Strong type checking is a mixed blessing. Rigid adherence to a strong
type scheme is sometimes more trouble than it is worth. It is difficult,
for example, to write generic functions that operate on nominally different
types when types must match exactly. To provide necessary flexibility a
type hierarchy scheme has been introduced. This also is described below.
8.2 What are Strong Types?
Have you ever gone through the trouble of typedef'ing types and
then wondered whether it was worth the trouble? It didn't seem like the
compiler was checking these types for strict compliance.
Consider the following typical example:
typedef int Count;
typedef int Bool;
Count n;
Bool stop;
.
.
.
n = stop ; // mistake but no warning
This programmer botch goes undetected by the compiler because the compiler
is empowered by the ANSI standard to check only underlying types which,
in this case, are both the same (int).
The -strong option and its supplementary option -index
exist to support full or partial typedef-based type-checking. We
refer to this as strong type-checking. In addition to checking,
these options have an effect on generated prototypes. See 8.8 "Strong
Types and Prototypes".
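As a minimal sketch, the typical example above might be brought under strong type checking by placing a -strong option in a lint comment ahead of the typedefs. The flag letters are described in 8.3.1. The function name copy_flag_by_mistake is hypothetical, and the message noted in the comment is what the flag descriptions lead one to expect, not captured output.

```c
/* Both names are made strong before their typedefs are seen. */
/*lint -strong( AJX, Count, Bool ) */
typedef int Count;
typedef int Bool;

/* Hypothetical function reproducing the mistake from the text. */
Count copy_flag_by_mistake( Bool stop )
{
    Count n;
    n = stop;      /* expected: strong type violation (Bool assigned to Count) */
    return n;
}
```

The compiler still accepts this code without complaint; only the lint pass is expected to object.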
8.3 -strong
8.3.1 Description of -strong
-strong(flags[, name] ... )
identifies each name as a strong type with properties specified
by flags. Presumably there is a later typedef defining any
such name to be a type. This option has no effect on typedef's
defined earlier. If name is omitted, then flags specifies
properties for all typedef'ed types that are not identified by some
other -strong option. Please note, this option must come before
the typedef.
The flags can be:
A Issue a warning upon some kind of Assignment to the strong
type. (assignment operator, return value, argument passing,
initialization). A may be followed by one or more of
the following letters which soften the meaning of A.
i ignore Initialization.
r ignore Return statements.
p ignore argument Passing.
a ignore the Assignment operator.
c ignore assignment of Constants. (constants include
integral constants, quoted strings and expressions of the form:
&x where x is a static or automatic variable.)
z ignore assignment of Zero. A zero is defined as any
zero constant that has not been cast to a strong type.
For example, 0L and (int) 0 are considered zero but
(HANDLE) 0 where HANDLE is a strong type is not.
Also, (HANDLE*) 0 is not considered zero.
As an example, -strong(Ai,BITS) will issue a warning
whenever a value whose type is not BITS is assigned to a
variable whose type is BITS except when initialized.
X Check for strong typing when a value is eXtracted. This
causes a warning to be issued when a strongly typed value is
assigned to a variable of some other type (in one of the four
ways described above). But note, the softeners (i, r, p,
a, c, z) cannot be used with X.
J Check for strong typing when a value is Joined (i.e.,
combined) with another type across a binary operator. This
can be softened with one or more of the following lower-case
letters immediately following the J:
e ignore Equality operators (== and !=)
and the conditional operator (?:).
r ignore the four Relational operators (> >= < <=).
o ignore the Other binary operators which are the five
arithmetic operators (+ - * / %) and the three bit-wise
operators (| & ^).
c ignore combining with Constants.
z ignore when combining with a Zero value. See the 'A' flag
above for what constitutes a zero.
By 'ignoring' we mean that no message is produced. If, for
example, Meters is a strong type then it might be appropriate
to check only Equality and Relational operators and
leave others alone. In this case Jo would be appropriate.
B The type is Boolean. Normally only one name would be provided
and normally this would be used in conjunction with
other flags. (If through the fortunes of using a third party
library, multiple Booleans are thrust upon you, make sure
these are related through a type hierarchy. See 8.5 "Type
Hierarchies".) The letter 'B' has two effects:
1. Every Boolean operator will be assumed, for the purpose of
strong type-checking, to return a type compatible with this
type. The Boolean operators are those that indicate true or false
and include the four Relational and two Equality operators
mentioned above, Unary !, and Binary && and ||.
2. Every context expecting a Boolean, such as an if
clause, while clause, second expression of a for
statement, operands of Unary ! and Binary || and
&&, will expect to see this strong type or a warning will
be issued.
b This is like flag B except that it has only effect
number 1 above. It does not have effect 2. Boolean contexts
do not require the type.
Flag B is quite restrictive insisting as it does that all
Boolean contexts require the indicated Boolean type. By
contrast, flag b is quite permissive. It insists on
nothing by itself and serves to identify certain operators as
returning a type strongly compatible with the strong type. See
also the 'l' flag below.
l is the Library flag. This designates that the objects of the
type may be assigned values from or combined with library
functions (or objects) or may be passed as arguments to
library functions. The usual scenario is that a library
function is prototyped without strong types and the user is
passing in strongly typed arguments. Presumably the user has
no control over the declarations within a library. Also, this
flag is necessary to get built-in predicates such as
isupper to be accepted with flag B. See the
example below.
f goes with B or b and means that bit fields of
length one should not be Boolean (otherwise they are). See
Bit field example below.
These flags may appear in any order except that softeners for A
and J must immediately follow the letter. There is at most one 'B'
or 'b'. If there is an 'f' there should also be a 'B'
or 'b'. In general, lower-case letters reduce or soften the strictness
of the type checking whereas upper-case letters add to it. The only exceptions
are possibly 'b' and 'f' where it is not clear whether they
add or subtract strictness.
If no flags are provided, the type becomes a 'strong type' but engenders
no specific checking other than for declarations.
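The softeners can be pictured with a small sketch. It assumes two strong types: Meters, checked only when Joined and with the Other operators ignored (Jo), and HANDLE, checked on Assignment with Zero ignored (Az). All function names are hypothetical, and the comments record what the flag descriptions above lead one to expect rather than actual lint output.

```c
/*lint -strong( Jo, Meters ) */
/*lint -strong( Az, HANDLE ) */
typedef int Meters;
typedef int HANDLE;

/* Arithmetic between Meters and a plain int: ignored because of 'o'. */
Meters scale( Meters span, int factor )
{
    return span * factor;
}

/* Relational operator between Meters and a plain int: expected J warning. */
int longer_than( Meters span, int limit )
{
    return span > limit;
}

/* Assignment of an uncast zero: ignored because of 'z'. */
HANDLE null_handle( void )
{
    HANDLE h;
    h = 0;
    return h;
}

/* Assignment of a nonzero constant: expected A warning (no 'c' softener). */
HANDLE fifth_handle( void )
{
    HANDLE h;
    h = 5;
    return h;
}
```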
8.3.2 Examples of -strong
For example, the option
-strong(A)
indicates that, by default, all typedef types are checked on
Assignment (A) to see that the value assigned has the same typedef
type.
The options:
-strong(A) -strong(Ac,Count)
specify that all typedef types will be checked on Assignment
and constants will be allowed to be assigned to variables of type Count.
As another example,
-strong(A) -strong(,Count)
removes strong checking for Count but leaves Assignment checking
on for everything else. The order of the options may be inverted. Thus
-strong(,Count) -strong(A)
is the same as above.
Consider:
//lint -strong(Ab,Bool)
typedef int Bool;
Bool gt(a,b)
int a, b;
{
if(a) return a > b; // OK
else return 0; // Warning
}
This identifies Bool as a strong type. If the flag b were
not provided in the -strong option, the result of the comparison
operator in the first return statement would not have been regarded
as matching up with the type of the function. The second return
results in a Warning because 0 is not a Bool type. An option
of -strong(Acb,Bool), i.e. adding the c flag, would suppress
this warning.
We do not recommend the option 'c' with a Boolean type. It is
better to define
#define False (Bool) 0
and
return False;
Had we used an upper-case B rather than lower-case b as
in:
-strong( AB, Bool )
then this would have resulted in a Warning that the if clause
if(a)... is not Boolean (variable a is int). Presumably
we should write:
if( a != 0 ) ...
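Putting these two suggestions together, a version of gt that should pass under the stricter B flag might look like the following sketch. It uses an ANSI-style definition instead of the old-style one above, and it wraps the False macro in an extra pair of parentheses; both are our choices for illustration.

```c
/*lint -strong( AB, Bool ) */
typedef int Bool;

#define False ((Bool) 0)

Bool gt( int a, int b )
{
    if( a != 0 )          /* a Boolean operator supplies the Boolean context */
        return a > b;     /* relational operators yield the Bool type */
    else
        return False;     /* a constant already cast to Bool */
}
```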
As another example:
/*lint -strong( AJXl, STRING ) */
typedef char *STRING;
STRING s;
.
.
.
s = malloc(20);
strcpy( s, "abc" );
Since malloc and strcpy are library routines, we would
ordinarily obtain strong type violations when assigning the value returned
by malloc to a strongly typed variable s or when passing
the strongly typed s into strcpy. However, the l flag
suppresses these strong type clashes.
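The description of the l flag in 8.3.1 also mentions library predicates such as isupper. A minimal sketch of that situation follows, assuming a strict Boolean type declared with -strong(ABl,Bool). The helper count_upper is hypothetical, and the comments state what we would expect from the flag descriptions, not verified output.

```c
#include <ctype.h>

/*lint -strong( ABl, Bool ) */
typedef int Bool;

/* Hypothetical helper: count the upper-case letters in a string. */
int count_upper( const char *s )
{
    int n = 0;
    while( *s != '\0' )                        /* Boolean operator: accepted */
    {
        /* isupper is a library function returning int; the l flag is
           what is expected to let it stand in a Boolean context. */
        if( isupper( (unsigned char) *s ) )
            n++;
        s++;
    }
    return n;
}
```

Without the l flag, the if clause would presumably be reported as not Boolean, in the same way as if(a) in the earlier example.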
Strong types can be used with bit fields. Bit fields of length one are
assumed to be, for the purpose of strong type checking, the prevailing
Boolean type if any. If there is no prevailing Boolean type or if the length
is other than one, then, for the purpose of strong type checking, the type
is the bulk type from which the fields are carved. Thus:
//lint -strong( AJXb, Bool )
//lint -strong( AJX, BitField )
typedef int Bool;
typedef unsigned BitField;
struct foo
{
unsigned a:1, b:2;
BitField c:1, d:2, e:3;
} x;
void f()
{
x.a = (Bool) 1; // OK
x.b = (Bool) 0; // strong type violation
x.a = 0; // strong type violation
x.b = 2; // OK
x.c = x.a; // OK
x.e = 1; // strong type violation
x.e = x.d; // OK
}
In the above, members a and c are strongly typed Bool,
members d and e are typed BitField and member b
is not strongly typed.
To suppress the Boolean assumption for one-bit bit fields use the flag
'f' in the -strong option for the Boolean. In the example
above, this would be -strong(AJXbf,Bool).
8.4 -index
8.4.1 Description of -index
-index(flags,ixtype,sitype[,sitype] ... )
This option is supplementary to and can be used in conjunction with
the -strong option. It specifies that ixtype is the exclusive
index type to be used with arrays of (or pointers to) the Strongly Indexed
type sitype (or sitype's if more than one is provided). Both
the ixtype and the sitype are assumed to be names of types
subsequently defined by a typedef declaration. flags can
be
c allow Constants as well as ixtype, to be used as indices.
d allow array Dimensions to be specified without using an ixtype.
8.4.2 Examples of -index
For example:
//lint -strong( AzJX, Count, Temperature)
//lint -index( d, Count, Temperature )
// Only Count can index a Temperature
typedef float Temperature;
typedef int Count;
Temperature t[100]; // OK because of d flag
Temperature *pt = t; // pointers are also checked
// ... within a function
Count i;
t[0] = t[1]; // Warnings, no c flag
for( i = 0; i < 100; i++ )
t[i] = 0.0; // OK, i is a Count
pt[1] = 2.0; // Warning
i = pt - t; // OK, pt-t is a Count
In the above, Temperature is said to be strongly indexed
and Count is said to be a strong index.
If the d flag were not provided, then the array dimension should
be cast to the proper type as for example:
Temperature t[ (Count) 100 ];
However, this is a little cumbersome. It is better to define the array
dimension in terms of a manifest constant, as in:
#define MAX_T (Count) 100
Temperature t[MAX_T];
This has the advantage that the same MAX_T can be used in the
for statement to govern the range of the for.
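As a sketch of that arrangement, the following fragment defines MAX_T once and uses it both as the array dimension and as the loop bound. The function name reset_temperatures is hypothetical, the macro is wrapped in an extra pair of parentheses, and the empty flags field in -index means that neither c nor d is in effect.

```c
/*lint -strong( AzJX, Count, Temperature ) */
/*lint -index( , Count, Temperature ) */
typedef float Temperature;
typedef int Count;

#define MAX_T ((Count) 100)

Temperature t[MAX_T];             /* dimension already has type Count */

void reset_temperatures( void )
{
    Count i;
    for( i = 0; i < MAX_T; i++ )  /* Count compared with Count */
        t[i] = 0.0;               /* indexed by a Count */
}
```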
Note that pointers to the Strongly Indexed type (such as pt above)
are also checked when used in array notation. Indeed, whenever a value
is added to a pointer that is pointing to a strongly indexed type, the
value added is checked to make sure that it has the proper strong index.
Moreover, when strongly indexed pointers are subtracted, the resulting
type is considered to be the common Strong Index. Thus, in the example,
i = pt - t;
no warning resulted.
It is common to have parallel arrays (arrays with identical dimensions
but different types) processed with similar indices. The -index
option is set up to conveniently support this. For example, if Pressure
and Voltage were types of arrays similar to the array t of
Temperature one might write:
//lint -index( , Count, Temperature, Pressure, Voltage )
.
.
.
Temperature t[MAX_T];
Pressure p[MAX_T];
Voltage v[MAX_T];
.
.
.
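A self-contained sketch of such parallel arrays is given below. The underlying float types, the small value of MAX_T and the function clear_samples are assumptions made only for illustration. A single Count variable serves as the index for all three arrays.

```c
/*lint -index( , Count, Temperature, Pressure, Voltage ) */
typedef float Temperature;
typedef float Pressure;
typedef float Voltage;
typedef int Count;

#define MAX_T ((Count) 4)

Temperature t[MAX_T];
Pressure    p[MAX_T];
Voltage     v[MAX_T];

/* Hypothetical routine: zero every sample in all three arrays. */
void clear_samples( void )
{
    Count i;
    for( i = 0; i < MAX_T; i++ )
    {
        t[i] = 0;     /* the same strong index is valid for each array */
        p[i] = 0;
        v[i] = 0;
    }
}
```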
8.4.3 Multidimensional Arrays
The indices into multidimensional arrays can also be checked. Just make
sure the intermediate type is an explicit typedef type. An example
is Row in the code below:
/* Types to define and access a 25x80 Screen.
a Screen is 25 Row's
a Row is 80 Att_Char's */
/*lint -index( d, Row_Ix, Row )
-index( d, Col_Ix, Att_Char ) */
typedef unsigned short Att_Char;
typedef Att_Char Row[80];
typedef Row Screen[25];
typedef int Row_Ix; /* Row Index */
typedef int Col_Ix; /* Column Index */
#define BLANK (Att_Char) (0x700 + ' ')
Screen scr;
Row_Ix row;
Col_Ix col;
int main(void)
{
int i = 0;
scr[ row ][ col ] = BLANK; /* OK */
scr[ i ][ col ] = BLANK; /* Warning */
scr[col][row] = BLANK; /* Two Warnings */
}
In the above, we have defined a Screen to be an array of Row's.
Using an intermediate type does not change the configuration of
the array in memory. Other than for type-checking, it is the same as if
we had written:
typedef Att_Char Screen[25][80];
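A typical use of these index types is a routine that walks the whole screen. The sketch below repeats the declarations so that it stands alone; the function clear_screen is hypothetical. Each loop variable has the index type of the dimension it subscripts, so no index warnings would be expected.

```c
/*lint -index( d, Row_Ix, Row )
       -index( d, Col_Ix, Att_Char ) */
typedef unsigned short Att_Char;
typedef Att_Char Row[80];
typedef Row Screen[25];
typedef int Row_Ix;     /* Row Index */
typedef int Col_Ix;     /* Column Index */

#define BLANK (Att_Char) (0x700 + ' ')

Screen scr;

/* Hypothetical routine: fill the screen with blanks. */
void clear_screen( void )
{
    Row_Ix row;
    Col_Ix col;
    for( row = 0; row < 25; row++ )
        for( col = 0; col < 80; col++ )
            scr[ row ][ col ] = BLANK;
}
```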
8.5 Type Hierarchies
A discovery that was made only after the first version of strong typing
was implemented was that some sort of type hierarchy was needed.
8.5.1 The Need for a Type Hierarchy
Consider a Flags type which supports the setting and testing
of individual bits within a word. An application might need several different
such types. For example, one might write:
typedef unsigned Flags1;
typedef unsigned Flags2;
typedef unsigned Flags3;
#define A_FLAG (Flags1) 1
#define B_FLAG (Flags2) 1
#define C_FLAG (Flags3) 1
Then, with strong typing, an A_FLAG can be used with only a Flags1
type, a B_FLAG can be used with only a Flags2 type, and a
C_FLAG can be used with only a Flags3 type. This, of course,
is just an example. Normally there would be many more constants of each
Flags type.
What frequently happens, however, is that some generic routines exist
to deal with Flags in general. For example, you may have a stack
facility that will contain routines to push and pop Flags. You might
have a routine to print Flags (given some table that is provided
as an argument to give string descriptions of individual bits).
Although you could cast the Flags types to and from another more
generic type, the practice is not to be recommended, except as a last resort.
Not only is a cast unsightly, it is hazardous since it suspends type-checking
completely.
8.5.2 The Natural Type Hierarchy
The solution is to use a type hierarchy. Define a generic type called
Flags and define all the other Flags in terms of it:
typedef unsigned Flags;
typedef Flags Flags1;
typedef Flags Flags2;
typedef Flags Flags3;
In this case Flags1 can be combined freely with Flags,
but not with Flags2 or with Flags3.
This depends, however, on the state of the fhs (Hierarchy of
Strong types) flag which is normally ON. If you turn it off with the
-fhs
option the natural hierarchy is not formed.
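The generic stack facility mentioned in 8.5.1 can then be written once in terms of Flags. The following is only a sketch: the names push_flags, pop_flags, round_trip and STACK_MAX are hypothetical, and the flag macros are wrapped in an extra pair of parentheses. Because Flags1 and Flags2 are children of Flags, passing either one to push_flags, or assigning the result of pop_flags back to either one, is not expected to produce a strong type violation.

```c
/*lint -strong( AJX ) */
typedef unsigned Flags;
typedef Flags Flags1;
typedef Flags Flags2;

#define A_FLAG ((Flags1) 1)
#define B_FLAG ((Flags2) 1)
#define STACK_MAX 16

static Flags stack[STACK_MAX];
static int   depth = 0;

/* Push a Flags value; returns 1 on success, 0 if the stack is full. */
int push_flags( Flags f )
{
    if( depth >= STACK_MAX )
        return 0;
    stack[depth++] = f;
    return 1;
}

/* Pop the most recent value; an empty stack yields a typed zero. */
Flags pop_flags( void )
{
    if( depth == 0 )
        return (Flags) 0;
    return stack[--depth];
}

/* A child type passes through the generic routines without a cast. */
Flags1 round_trip( Flags1 f1 )
{
    if( push_flags( f1 ) )
        return pop_flags();
    return f1;
}
```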
We say that Flags is a parent type to each of Flags1,
Flags2 and Flags3 which are its children. Being a
parent to a child type is similar to being a base type to a derived type
in an object-oriented system with one difference. A parent is normally
interchangeable with each of its children; a parent can be assigned to
a child and a child can be assigned to a parent. But a base type cannot
normally be assigned to a derived type. But even this property can be obtained
via the -father option (see 8.5.4).
A generic Flags type can be useful for all sorts of things, such
as a generic zero value, as the following example shows:
//lint -strong(AJX)
typedef unsigned Flags;
typedef Flags Flags1;
typedef Flags Flags2;
#define FZERO (Flags) 0
#define F_ONE (Flags) 1
void m()
{
Flags1 f1 = FZERO; // OK
Flags2 f2;
f2 = f1; // Warn
if(f1 & f2) // Warn because of J flag
f2 = f2 | F_ONE; // OK
f2 = F_ONE | f2; // OK Flags2 = Flags2
f2 = F_ONE | f1; // Warn Flags2 = Flags1
}
Note that the type of a binary operator is the type of the most restrictive
type of the type hierarchy (i.e., the child rather than the parent). Thus,
in the last example above, when a Flags OR's with a Flags1
the result is a Flags1 which clashes with the Flags2.
Type hierarchies can be an arbitrary number of levels deep.
There is evidence that type hierarchies are being built by programmers
even in the absence of strong type-checking. For example, the header for
Microsoft's Windows SDK, windows.h, contains:
...
typedef unsigned int WORD;
typedef WORD ATOM;
typedef WORD HANDLE;
typedef HANDLE HWND;
typedef HANDLE GLOBALHANDLE;
typedef HANDLE LOCALHANDLE;
typedef HANDLE HSTR;
typedef HANDLE HICON;
typedef HANDLE HDC;
typedef HANDLE HMENU;
typedef HANDLE HPEN;
typedef HANDLE HFONT;
typedef HANDLE HBRUSH;
typedef HANDLE HBITMAP;
typedef HANDLE HCURSOR;
typedef HANDLE HRGN;
typedef HANDLE HPALETTE;
...
8.5.3 Adding to the Natural Hierarchy
The strong type hierarchy tree that is naturally constructed via typedef's
has a limitation. All the types in a single tree must be the same underlying
type. The -parent option can be used to supplement (or completely
replace) the strong type hierarchy established via typedef's.
An option of the form:
-parent( Parent, Child [, Child] ... )
where Parent and Child are type names defined via typedef
will create a link in the strong type hierarchy between the Parent
and each of the Child types. The Parent is considered to
be equivalent to each Child for the purpose of Strong type matching.
The types need not be the same underlying type and normal checking between
the types is unchanged.
A link that would form a loop in the tree is not permitted.
For example, given the options:
-parent(Flags1,Small)
-strong(AJX)
and the following code:
typedef unsigned Flags;
typedef Flags Flags1;
typedef Flags Flags2;
typedef unsigned char Small;
then the following type hierarchy is established:
Flags
/ \
Flags1 Flags2
|
Small
If an object of type Small is assigned to a variable of type
Flags1 or Flags, no strong type violation will be reported.
Conversely, if an object of type Flags or Flags1 is assigned
to type Small, no strong type violation will be reported but a loss
of precision message will still be issued (unless otherwise inhibited)
because normal type checking is not suspended.
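The two directions can be sketched as follows. We assume here that -parent may be given in a lint comment like the other options; the function names widen and narrow are hypothetical, and the comments restate the behavior described above rather than captured output.

```c
/*lint -parent( Flags1, Small ) -strong( AJX ) */
typedef unsigned Flags;
typedef Flags Flags1;
typedef Flags Flags2;
typedef unsigned char Small;

/* Child to parent: no strong type violation expected. */
Flags1 widen( Small s )
{
    Flags1 f = s;
    return f;
}

/* Parent to child: no strong type violation expected, but the
   ordinary loss of precision message would still be issued. */
Small narrow( Flags1 f )
{
    Small s = f;
    return s;
}
```

Assigning a Small to a Flags2 would still be expected to draw a strong type violation, since no link joins those two types.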
8.5.4 Restricting Down Assignments (-father)
The option
-father( Parent, Child [, Child] ... )
is similar to the -parent option and has all the effects of the
-parent option and has the additional property of making each of
the links from Child to Parent one-way. That is, assignment
from Parent to Child triggers a warning. You may think of
-father as a strict version of -parent.
The rationale for this option is shown in the following example.
typedef int FIndex;
typedef FIndex Index;
Here Index is a special Index into an array. FIndex is
a Flag or an Index. If negative it is taken to be a special flag
and otherwise can take on any of the values of Index. By defining
Index in terms of FIndex we are implying that FIndex
is the parent of Index. The reader not accustomed to OOP may think
that we have the derivation backwards, that the simpler typedef,
Index, should be the parent. But Index is the more specific
type; every Index is an FIndex but not conversely. Whereas
it is expected that we can assign from Index to FIndex it
could be dangerous to do the inverse.
Since we don't want down assignments we give the option
-father( FIndex, Index )
in addition to the strong options, say
-strong( AcJX, FIndex, Index )
Then
FIndex n = -1;
Index i = 3;
i = n; /* Warning */
n = i; /* OK */
The safe way to convert a FIndex to Index is via a function
call as in
Index F_to_I( FIndex fi )
{ return (Index)(fi >= 0 ? fi : 0); }
Then, although we need to use a cast in this function we need not use
a cast in the rest of the program.
The net result of all this is that although flags and indices occupy
the same storage location, we will never use a flag where an Index is needed.
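To make the idea concrete, here is a small sketch of a lookup routine that returns an FIndex (a flag when nothing is found) and a caller that converts the result through F_to_I before subscripting. The names table, TABLE_SIZE, NOT_FOUND, find and lookup_or_first are hypothetical, and we assume -father may appear in a lint comment.

```c
/*lint -father( FIndex, Index ) -strong( AcJX, FIndex, Index ) */
typedef int FIndex;
typedef FIndex Index;

#define NOT_FOUND  ((FIndex) -1)
#define TABLE_SIZE ((Index) 3)

static const int table[3] = { 10, 20, 30 };

/* Returns the position of value, or the flag NOT_FOUND. */
FIndex find( int value )
{
    Index i;
    for( i = 0; i < TABLE_SIZE; i++ )
        if( table[i] == value )
            return i;             /* Index to FIndex: an up assignment */
    return NOT_FOUND;
}

/* The one place where a cast is needed. */
Index F_to_I( FIndex fi )
{ return (Index)(fi >= 0 ? fi : 0); }

/* Falls back to the first entry when value is absent. */
int lookup_or_first( int value )
{
    Index i = F_to_I( find( value ) );
    return table[i];
}
```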
8.5.5 Printing the Hierarchy Tree
To obtain a visual picture of the hierarchy tree, use the letter 'h'
in connection with the -v option.
8.6 Hints on Strong Typing
- Beware of excessive casting. If, in order to pull off a system of strong
typing, you need to cast just about every access, you are missing the point.
The casts will inhibit even ordinary checking which has considerable value.
Remember, strong type-checking is gold, normal type-checking is silver,
and casting is brass.
- Rather than cast, use type hierarchies. For example:
/*lint -strong(AXJ,Tight) -strong(,Loose) */
typedef int Tight;
typedef Tight Loose;
Tight has a maximal amount of Strong Type checking; Loose
has none. Since Loose is defined in terms of Tight the two
types are interchangeable from the standpoint of Strong Type checking.
Presumably you work with Tight int's most of the time. When
absolutely necessary to achieve some effect, Loose is used.
- A time when it is really good to cast is to endow some otherwise neutral
constant with a special type. FZERO of the previous section 8.5.2
is an example.
- For large, mature projects, enter strong typing slowly, working on one
family of strong types at a time. A family of strong types is one hierarchy
structure.
- Don't bother with making pointers to functions strong types. For example:
typedef int (*Func_Ptr)(void);
If you make Func_Ptr strong, you're not likely to get much more
checking than if you didn't make it strong. The problem is that you would
then have to cast any existing function name when assigning to such a pointer.
This represents a net loss of type-checking (remember: gold, silver,
brass).
- Rather than making a pointer a strong type, make the base type a strong
type. For example:
typedef char TEXT;
typedef TEXT *STRING;
TEXT buffer[100];
STRING s;
It may seem wise to strong type both STRING and TEXT.
This would be a mistake since whenever you assign buffer to s,
for example, you would have to cast. But note that -strong(Ac,STRING)
would allow the assignment. It is usually better to strong type just TEXT.
Then when buffer is assigned to s the indirect object TEXT
is strongly checked and no cast is needed.
This holds for structures as well as for scalars. For example, in MS Windows
programming there are a number of typedef'ed types that are pointers.
Examples include: LPRECT, LPLOGFONT, LPMSG, LPSTR,
LPWNDCLASS, etc. If you make these -strong(A) you will have
problems passing to Windows functions addresses of Windows structs.
At most make them -strong(AcX).
- Care is needed in declaring strong self-referential struct's.
The usual method, i.e.,
typedef struct list { struct list * next ; ... }
LIST;
is incompatible with making LIST a strong type because its member
next will not be a pointer to a strong type. It is better to use:
typedef struct list LIST;
struct list { LIST * next; ...};
This is explicitly sanctioned in ANSI C (3.5.2.3) and will make next
compatible with other pointers to LIST.
- Once a type is made strong it should not then be made unstrong or weak.
For example
//lint -strong(AJX)
typedef int INT;
//lint -strong(,INT)
// INT is still strong
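Returning to the hint on self-referential structs, the recommended declaration order can be sketched in a small list routine. The member value and the function list_length are hypothetical. Only the A flag is assumed here, to keep the sketch independent of how pointer comparisons are treated under J.

```c
#include <stddef.h>

/*lint -strong( A, LIST ) */
typedef struct list LIST;         /* the typedef comes first */
struct list { LIST *next; int value; };

/* Hypothetical helper: count the nodes of a list. */
int list_length( const LIST *head )
{
    int n = 0;
    const LIST *p;
    /* next is declared as LIST *, so it matches other pointers to LIST */
    for( p = head; p != NULL; p = p->next )
        n++;
    return n;
}
```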
8.7 Reference Information
8.7.1 Strong Expressions
An expression is strongly typed if
- it is a strongly typed variable, function, array, or member of union
or struct or an indirectly referenced pointer to a strong type.
- it is a cast to some strong type.
- it is one of the type-propagating unary operators, (viz. +
- ++ -- ~), applied to a strongly typed expression.
- it is formed by one of the balance and propagate binary operators
applied to two strongly typed expressions (having the same strong type).
The balance and propagate operators consist of the five binary arithmetics
(+ - * / %), the three bit-wise operators (& |
^), and the conditional operator (? :).
- it is a shift operator whose left side is a strong type.
- it is a comma operator whose right side is a strong type.
- it is an assignment operator whose left side is a strong type.
- it is a Boolean operator and some type has been designated as
Boolean (with a b or B flag in the -strong option).
The Boolean operators consist of the four relationals (> >=
< <=), the two equality operators (== !=), the
two logical operators (|| &&), and unary !
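Several of these rules can be seen in one short function. In the sketch below, Meters and Bool are assumed strong types and demo is a hypothetical name. Each comment names the rule that, by the list above, should make the right-hand side strongly typed.

```c
/*lint -strong( Ab, Bool ) -strong( A, Meters ) */
typedef int Bool;
typedef int Meters;

Meters demo( Meters a, Meters b )
{
    Meters m = a;             /* strongly typed variable */
    Bool longer = a > b;      /* Boolean operator, Bool designated with b */
    m = m + (Meters) 2;       /* cast to a strong type; balance and propagate */
    m = -m;                   /* type-propagating unary operator */
    m = longer ? m : b;       /* conditional operator, both operands Meters */
    return m;
}
```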
8.7.2 General Information
When the option
-strong( flags [, name] ... )
is processed, name and flags are entered into a so-called
Strong Table created for this purpose. If there is no name, then
a variable, Default Flags, is set to the flags provided. When a
subsequent typedef is encountered within the code, the Strong Table
is consulted first. If the typedef name is not found, the Default
Flags are used. These flags become the identifying flags for strong typing
purposes for the type.
The option
-index( flags, ixtype, sitype[, ...] )
is treated similarly. Each sitype is entered into the Strong
Table (if not already there) and its index flags ORed with other strong
flags in the table. A pointer is established from sitype to ixtype
which is another entry in the Strong Table.
For these reasons it does not, in general, matter in what order the
-strong options are placed other than that they be placed before
the associated typedef. There should be at most one option that
specifies Default Flags.
8.8 Strong Types and Prototypes
If you are producing prototypes with some variation of the -od
option (Output Declarations), and if you want to see the typedef
types rather than the raw types, just make sure that the relevant typedef
types are strong. You can make them all strong with a single option: -strong().
Since you have not specified 'A', 'J' or 'X' you will
not receive messages owing to strong type mismatches for Assigning, Joining
or eXtraction. However, you may get them for declarations. You can set
-etd(strong)
to inhibit any such messages.