A C Primer
Basic data types
The basic types are char, int, float, and double. They may be qualified
with the keywords short, long, or unsigned, as in:
-
short int
-
long int
-
unsigned int
Constants
For integers, the usual decimal notation is used. Hexadecimal integer
constants are preceded by 0x.
Single character constants are enclosed by single quotes.
Character string constants are enclosed by double quotes. String
constants are always stored with a '\0' as a terminating byte.
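As a small illustration of the constant forms above, here is a sketch using hypothetical helper names (hex_example, char_example, string_bytes) that are not part of any library. The character-code value assumes an ASCII system.

```c
#include <stddef.h>

/* A hexadecimal constant: 0x1F is the same value as decimal 31. */
int hex_example(void)
{
    return 0x1F;
}

/* A character constant is a small integer; on an ASCII system 'A' is 65. */
int char_example(void)
{
    return 'A';
}

/* The string constant "abc" occupies four bytes: 'a', 'b', 'c', and the
   terminating '\0' that the compiler appends. */
size_t string_bytes(void)
{
    return sizeof "abc";
}
```

The extra byte counted by string_bytes is the reason a buffer for an n-character string needs n+1 bytes.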
Variable declarations
Every variable must be declared. The form of a declaration is
datatype variable-name;
Operators
-
Arithmetic operators
+, -, *, /, % (mod), - (unary minus)
-
Relational and logical operators
> >= < <= == (equals) != (not equals)
&& (and)
|| (or)
! (not)
-
Type conversion rules
char and int can be freely mixed in arithmetic expressions.
A char is treated as a small integer whose value is its character code
(ASCII on most systems).
Relational and logical expressions are defined to have value 1 if true
and 0 if false.
Explicit type conversions may be forced by "type casting", as in
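int x = (int) &y; // store the address of y in the integer variable x
(This assumes that an int is wide enough to hold an address. On most 64-bit
systems it is not, and the type intptr_t from <stdint.h> should be used instead.)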
-
Increment and decrement operators
++ is a unary operator which adds 1 to its operand. -- subtracts
1. These operators may be used as prefixes or suffixes. If
used as a prefix, ++ increments its operand before it is used; if used
as a suffix, ++ increments its operand after it is used.
-
Bit-level boolean and shift operators
& bitwise and
| bitwise or
^ bitwise exclusive or
<< left shift
>> right shift
~ complement (unary)
-
Assignment operators
= is the standard assignment operator. Others can be
formed by a combination of a binary operator (such as +,*,<<, etc.)
and '='. For example, a += b is equivalent to a = a+b. The
result of an assignment is the assigned value. The variable on the
left is changed as a side effect. Thus, an assignment can be embedded
within an expression. For example, a = (b = c) stores the value of
c in both a and b.
-
Operator precedence (from highest to lowest)
() [] -> .
! ~ ++ -- - (type) * & sizeof (all unary operators)
* / % (multiplication and division)
+ - (addition and subtraction)
<< >> (shifts)
< <= > >= (less, greater comparison)
== != (equality comparison)
& (bitwise and)
^ (bitwise xor)
| (bitwise or)
&& (logical and)
|| (logical or)
?: (conditional)
= += -= etc. (assignment operators)
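The operator rules above can be tried out with a few one-line functions. This is only a sketch; all of the function names are hypothetical and chosen for illustration.

```c
/* Prefix: n is incremented first, and the new value is used. */
int prefix_value(int n)
{
    return ++n;
}

/* Suffix: the old value of n is used, and n is incremented afterwards. */
int suffix_value(int n)
{
    return n++;
}

/* Keep only the low 4 bits of x by masking with bitwise and. */
unsigned low_nibble(unsigned x)
{
    return x & 0xFu;
}

/* Turn on bit number `bit` (0 is the least significant bit). */
unsigned set_bit(unsigned x, unsigned bit)
{
    return x | (1u << bit);
}

/* Precedence: == binds more tightly than &, so the parentheses are
   required. A relational expression yields 1 if true and 0 if false. */
int is_even(unsigned x)
{
    return (x & 1u) == 0;
}

/* An assignment is an expression whose value is the value assigned. */
int chained_assignment(int c)
{
    int a, b;

    a = (b = c);    /* both a and b now hold c */
    a += b;         /* same as a = a + b */
    return a;       /* twice the original c */
}
```

Writing x & 1u == 0 without parentheses would be parsed as x & (1u == 0), which is always 0, so bit tests are a common place to add explicit parentheses.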
Statements
-
Any expression followed by a semicolon is an "expression statement".
For example,
x = 35;
i++;
-
Compound statement
{ declarations
statements }
-
if statement
if (expression)
statement
if (expression)
statement1
else
statement2
-
while statement
while (expression)
statement
-
do statement
do statement
while (expression);
-
for statement
for (exp1; exp2; exp3)
statement
-
switch statement
switch (expression)
{
case constant : statement(s)
case constant : statement(s)
.
.
.
case constant : statement(s)
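default : statement(s) (optional)
}
Execution jumps to the matching case and continues through the cases that
follow it until a break statement or the end of the switch is reached.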
-
return statement
return;
return expression;
-
null statement
;
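To show several of these statement forms working together, here is a minimal sketch with two hypothetical functions. The first combines for, switch, and the null statement; the second uses do, whose body always runs at least once. The break statement, which ends a switch, is used here so that control does not fall through into the next case.

```c
/* Count the lowercase vowels in a '\0'-terminated string. */
int count_vowels(const char *s)
{
    int count = 0;
    int i;

    for (i = 0; s[i] != '\0'; i++) {
        switch (s[i]) {
        case 'a':
        case 'e':
        case 'i':
        case 'o':
        case 'u':
            count++;
            break;      /* leave the switch */
        default:
            ;           /* null statement: nothing to do */
        }
    }
    return count;
}

/* Count the decimal digits of n. The do statement tests its condition
   after the body, so 0 is correctly reported as having one digit. */
int count_digits(unsigned n)
{
    int digits = 0;

    do {
        digits++;
        n /= 10;
    } while (n != 0);
    return digits;
}
```

The five case labels share one statement list, which is the usual way to test a value against several constants.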
Program structure
A C program consists of one or more files of external definitions of
functions and variables.
Function definitions may not be nested. Execution begins with
the function called "main".
All arguments to functions are passed by value. However, the values
of these arguments may themselves be pointers (i.e., addresses).
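The difference between passing a value and passing a pointer can be seen in the classic swap example. The two function names below are hypothetical.

```c
/* Receives copies of the caller's values. Exchanging the copies has no
   effect on the caller's variables. */
void swap_copies(int a, int b)
{
    int tmp = a;

    a = b;
    b = tmp;
}

/* Receives copies of two addresses. Dereferencing them reaches the
   caller's variables, so the exchange is visible to the caller. */
void swap_through_pointers(int *a, int *b)
{
    int tmp = *a;

    *a = *b;
    *b = tmp;
}
```

A caller with int variables x and y would write swap_through_pointers(&x, &y); calling swap_copies(x, y) leaves x and y unchanged.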
Command-line arguments
When a program is loaded, arguments may be passed from the shell to the
main program. The shell parses the command line into distinct words,
stores the words as character strings, and creates an array of pointers
to the strings. The shell then passes two parameters to the main
program: the number of words on the command line (including the command
name itself), and an array of character pointers, which point to the strings.
For the main program to be able to access the array of strings, it should
be declared as
int main(int argc, char *argv[])
argc is the word count
argv[0] is the first word from the command line
argv[1] is the second word
argv[2] is the third word, etc.
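As a sketch of how argc and argv are typically used, the hypothetical helper below prints every word it is given. It takes the same two parameters as main, so a main program could simply forward its own arguments to it.

```c
#include <stdio.h>

/* Print each command-line word on its own line and return the word count.
   print_args is a hypothetical name used only for this illustration. */
int print_args(int argc, char *argv[])
{
    int i;

    for (i = 0; i < argc; i++)
        printf("argv[%d] = %s\n", i, argv[i]);
    return argc;
}
```

A complete program would define int main(int argc, char *argv[]) and call print_args(argc, argv) from it. If the program were started as "prog alpha beta" (an assumed command name), argc would be 3 and argv[0] would be "prog".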
Scope
Declarations may have "file scope" or "block scope".
Any declaration appearing outside a function has file scope; that is,
it is in effect from the point of the declaration to the end of the file
in which it occurs.
Any declaration appearing within a block has block scope; it is in effect
from the point of the declaration to the end of the block in which it occurs.
(A "block" is either the body of a function or the body of a compound statement.)
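The following sketch illustrates the two kinds of scope. The names level and scope_demo are hypothetical. It also shows a consequence of the rules that is not spelled out above: a block-scope declaration hides a file-scope declaration of the same name until the block ends.

```c
int level = 1;              /* file scope: in effect to the end of the file */

int scope_demo(void)
{
    int result = level;     /* the file-scope variable: result is 1 */

    {
        int level = 10;     /* block scope: hides the file-scope level */
        result += level;    /* adds 10: result is 11 */
    }                       /* the inner level is no longer in effect */

    result += level;        /* the file-scope variable again: result is 12 */
    return result;
}
```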
Storage class
Variables may have either "auto" or "static" storage class.
Variables declared within a block default to auto storage class, but
this can be overridden with the keyword "static". Memory space for
auto variables is allocated on the run-time stack when control enters the
block and is deallocated when control leaves the block. If an auto
variable's declaration includes an initializer, the initialization is performed
each time the variable is allocated. If there is no initializer,
the initial value of the variable is undefined.
Variables declared outside a function, or those explicitly given the
keyword "static", have static storage class. Memory space for static
variables is allocated in the program's data section prior to the start
of execution. If a static variable's declaration includes an initializer,
the initialization is performed exactly once, prior to the start of execution.
If there is no initializer, the variable is initialized to 0.
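A short sketch makes the difference between the two storage classes visible. Both function names are hypothetical.

```c
/* calls has static storage class: it is initialized once, before
   execution starts, and keeps its value between calls. */
int next_static(void)
{
    static int calls = 0;

    calls++;
    return calls;
}

/* calls has auto storage class: it is allocated and initialized
   again every time the function is entered. */
int next_auto(void)
{
    int calls = 0;

    calls++;
    return calls;
}
```

Successive calls to next_static return 1, 2, 3, and so on, while next_auto returns 1 every time.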
Linkage
Code for a C program may span several source files. Each source file
is compiled separately, producing an object file. A linker combines
the object files into a single executable file.
A function or variable defined in one file can be referred to by code
in another file if it has "external" linkage, according to the following
rules:
A function or variable defined within a block cannot have external linkage.
Functions and variables defined outside any function have, by default,
external linkage, unless overridden by the "static" keyword. (So
the keyword "static", when applied to a function or variable defined outside
a function, does not change its storage class, which is static in any case.
It is used to hide the function or variable from code in other files.)
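The linkage rules can be sketched as follows. To keep the example compilable as a single unit, both "files" are shown in one listing, separated by comments; the file name counter.c and all identifiers are hypothetical.

```c
/* ---- would live in counter.c ---- */

int counter_total = 0;          /* external linkage: other files can see it */

static int clamp(int n)         /* "static": hidden from other files */
{
    return n < 0 ? 0 : n;
}

void counter_add(int n)         /* external linkage by default */
{
    counter_total += clamp(n);
}

/* ---- would live in a second source file ---- */

extern int counter_total;       /* declaration only: refers to the variable
                                   defined in counter.c */
void counter_add(int n);        /* declaration of the function defined there */

int use_counter(void)
{
    counter_add(5);
    counter_add(-3);            /* clamp turns this into adding 0 */
    return counter_total;
}
```

If the second file tried to call clamp directly, the linker would not find it, because the static keyword keeps that name private to counter.c.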
Structured data types
-
Arrays
int a[20];
creates an array of 20 integers. The individual elements are numbered
a[0], a[1],..., a[19]. The size of the array must be expressed as
a constant (C99 also permits variable-length arrays for auto variables).
float b[10][20]; creates a 2-dimensional array of floating-point
numbers with 10 rows and 20 columns.
extern int c[]; declares c to be an array of integers, defined in
another source file. Because the definition (including size) appears
in the other file, the size need not be specified here.
-
Pointers
int *p; declares p to be a pointer to an integer;
that is, p is a variable which can hold the address of an integer variable.
Within an expression, *p refers to the integer variable that p points to.
'*' used in this way is called the "dereferencing" operator. For
example, if b and c are declared to be integer variables, then
p = &b; // stores the address of b in the variable p
c = *p; // the variable which p points to is copied to c
*p = 25; // 25 is stored in the variable which p points to
An array name, used without a subscript, is converted in most expressions
to a pointer to the first element of the array (the operands of sizeof and
& are the exceptions). Thus, if "arr" is the name of an array, arr is
equivalent to &arr[0], and *arr is equivalent to arr[0].
-
Structures
A structure in C is similar to a class in Java, but it includes data
fields only (called "members"), not methods. A typical declaration:
struct {
char name[30];
char ssn[10];
char address[30];
} employee, *employee_pointer;
This declares "employee" to be a structure variable and "employee_pointer"
to be a pointer variable which can point to an employee structure.
The members of employee may be referred to as employee.name, employee.ssn,
and employee.address. The members of the structure which employee_pointer
points to may be referred to as employee_pointer->name, employee_pointer->ssn,
and employee_pointer->address.
Self-referential structures must have "tags". A tag is an identifier
which occurs immediately after the word "struct". For example, the
declaration of a struct used to represent a node in a binary tree might
look like this:
struct treenode {
char data[10];
struct treenode *left;
struct treenode *right;
};
"treenode" is the tag for the structure. This tag can then be used
in declarations of variables, such as
struct treenode *root;
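The relationship between arrays, pointers, and structures can be sketched with two small functions. The struct tag point and both function names are hypothetical and used only for illustration.

```c
struct point {
    int x;
    int y;
};

/* Add up n integers. An array passed as an argument arrives as a pointer
   to its first element, so values[i] means the same as *(values + i). */
int sum(const int *values, int n)
{
    int total = 0;
    int i;

    for (i = 0; i < n; i++)
        total += values[i];
    return total;
}

/* Change a structure through a pointer; p->x is shorthand for (*p).x. */
void move_right(struct point *p, int dx)
{
    p->x += dx;
}
```

Given int a[3] = {1, 2, 3}; the call sum(a, 3) passes the address of a[0]. Given struct point pt; the call move_right(&pt, 4) modifies pt itself, because the function receives its address.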
Type definitions
New data type identifiers can be defined by the programmer, using the keyword
"typedef". Any variable declaration can be transformed into a type
definition simply by preceding it by "typedef". For example,
typedef int int3[3]; // "int3" is now a data type meaning "an array of three integers"
/* As defined below, "treenode" is a data type representing the
given structure. "treenodeptr" is a data type representing variables
of type "pointer to treenode." */
typedef struct treenode {
char data[10];
struct treenode *left;
struct treenode *right;
} treenode, *treenodeptr;
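Once defined, a typedef name is used exactly like a built-in type name. The sketch below repeats the int3 definition and adds a hypothetical structure type, pair, with a matching pointer type, pairptr; the function names are also hypothetical.

```c
typedef int int3[3];            /* "an array of three integers" */

typedef struct pair {
    int first;
    int second;
} pair, *pairptr;               /* pairptr means "pointer to pair" */

int int3_sum(void)
{
    int3 v = {1, 2, 3};         /* same as: int v[3] = {1, 2, 3}; */

    return v[0] + v[1] + v[2];
}

int pair_first(void)
{
    pair pr = {7, 9};           /* same as: struct pair pr = {7, 9}; */
    pairptr p = &pr;            /* same as: struct pair *p = &pr; */

    return p->first;
}
```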
Dynamic storage allocation
Memory space for variables can be allocated at runtime from a process's
run-time "heap", using the library function "malloc" (allocate memory).
malloc accepts one parameter, an integer representing the number of bytes
to be allocated. malloc allocates a block of memory of the requested
number of bytes, and returns a pointer to the first byte of the block.
(malloc returns NULL if it is out of space and cannot perform the allocation.)
The memory space allocated in this way is a dynamic variable which has
no name and must be referred to using a pointer. Dynamic variables
are not automatically garbage-collected as in Java; the programmer is responsible
for deallocating space for dynamic variables by using the library function
"free".
For example, to allocate a node in a binary tree using the treenode
declaration above, we would write
treenode *nodeptr; // declaration of the pointer variable "nodeptr"
nodeptr = malloc(sizeof(treenode)); // sizeof requires parentheses around a type name
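A fuller sketch of allocation and deallocation follows. It repeats the treenode type so that the listing is self-contained; the helper names new_node and free_tree are hypothetical. The NULL check and the truncating copy are precautions assumed for this example, not requirements stated above.

```c
#include <stdlib.h>
#include <string.h>

typedef struct treenode {
    char data[10];
    struct treenode *left;
    struct treenode *right;
} treenode;

/* Allocate a leaf node holding a copy of text (truncated to fit).
   Returns NULL if malloc cannot perform the allocation. */
treenode *new_node(const char *text)
{
    treenode *nodeptr = malloc(sizeof(treenode));

    if (nodeptr == NULL)
        return NULL;
    strncpy(nodeptr->data, text, sizeof nodeptr->data - 1);
    nodeptr->data[sizeof nodeptr->data - 1] = '\0';
    nodeptr->left = NULL;
    nodeptr->right = NULL;
    return nodeptr;
}

/* Deallocate a whole tree: both subtrees first, then the node itself. */
void free_tree(treenode *root)
{
    if (root == NULL)
        return;
    free_tree(root->left);
    free_tree(root->right);
    free(root);
}
```

Every successful call to malloc should eventually be matched by one call to free; here free_tree takes care of that for every node reachable from the root.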
Preprocessor macros
Preprocessor macros can be used to define constants. For example,
#define PI 3.14159
#define NULL 0 // normally already defined by standard headers such as <stddef.h>
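Because the preprocessor replaces each macro name with its text before compilation, a macro constant can appear anywhere a literal could, including in an array size. In the sketch below, MAX_NAME, circle_area, and name_buffer are hypothetical names.

```c
#define PI 3.14159
#define MAX_NAME 30

/* The compiler sees: return 3.14159 * radius * radius; */
double circle_area(double radius)
{
    return PI * radius * radius;
}

/* The compiler sees: char name_buffer[30]; */
char name_buffer[MAX_NAME];
```

Changing the #define line and recompiling updates every place the name is used.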
Header files
Header files are used for the type definitions and function declarations
associated with a code module. The header file is then "included" in a
program that uses the module. For example,
#include <stdio.h>
#include "mymodule.h"
The first form is used for system- or compiler-defined header files.
The second form is used for programmer-defined header files.
Most system calls and library functions require the inclusion of at
least one header file in order to load the proper declarations. Generally,
the man page for the call indicates which header files are needed.
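As an illustration, the contents of a programmer-defined header such as mymodule.h might look like the sketch below. The rect type and the rect_area function are hypothetical. The #ifndef/#define/#endif lines are a common convention (an "include guard") that keeps the declarations from being processed twice; it is an assumption of this example and not something required by the text above. The header and its source file are shown in one listing so that the example compiles as a single unit.

```c
/* ---- mymodule.h (hypothetical contents) ---- */
#ifndef MYMODULE_H
#define MYMODULE_H

typedef struct {
    int width;
    int height;
} rect;                             /* a type definition */

int rect_area(const rect *r);       /* a function declaration */

#endif /* MYMODULE_H */

/* ---- mymodule.c: would begin with #include "mymodule.h" ---- */
int rect_area(const rect *r)
{
    return r->width * r->height;
}
```

Any other source file that includes mymodule.h can then declare rect variables and call rect_area without repeating the declarations.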