8.3 Type Encoding

This is an advanced section. Type encodings are used extensively by the compiler and by the runtime, but you generally do not need to know about them to use Objective-C.

The Objective-C compiler generates type encodings for all the types. These type encodings are used at runtime to find out information about selectors and methods and about objects and classes.

The types are encoded in the following way:

_Bool                B
char                 c
unsigned char        C
short                s
unsigned short       S
int                  i
unsigned int         I
long                 l
unsigned long        L
long long            q
unsigned long long   Q
float                f
double               d
long double          D
void                 v
id                   @
Class                #
SEL                  :
char*                *
enum                 an enum is encoded exactly as the integer type that the compiler uses for it, which depends on the enumeration values. Often the compiler uses unsigned int, which is then encoded as I.
unknown type         ?
Complex types        j followed by the inner type. For example, _Complex double is encoded as "jd".
bit-fields           b followed by the starting position of the bit-field, the type of the bit-field and the size of the bit-field (the bit-field encoding was changed from the NeXT compiler's encoding; see below)
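The single-character encodings in the table above can be turned into a small lookup routine. The following is only a sketch in plain C; the function name atomic_type_name is hypothetical and is not part of the runtime.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not a runtime function: map a single-character
   encoding from the table above to the name of the type it stands for.
   Returns NULL for characters that are not in the table.  */
const char *
atomic_type_name (char code)
{
  static const char codes[] = "BcCsSiIlLqQfdDv@#:*?";
  static const char *const names[] = {
    "_Bool", "char", "unsigned char", "short", "unsigned short",
    "int", "unsigned int", "long", "unsigned long", "long long",
    "unsigned long long", "float", "double", "long double", "void",
    "id", "Class", "SEL", "char*", "unknown type"
  };
  const char *p;

  /* One name per encoding character.  */
  assert (sizeof names / sizeof names[0] == sizeof codes - 1);

  /* strchr would also match the terminating NUL, so reject it first.  */
  p = (code != '\0') ? strchr (codes, code) : NULL;
  return p ? names[p - codes] : NULL;
}
```

Enum, complex and bit-field encodings are left out on purpose, because they are not described by a single character.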

The encoding of bit-fields was changed so that the runtime functions that compute the sizes and alignments of types can properly handle types that contain bit-fields. The previous encoding contained only the size of the bit-field, which is not enough to reliably compute the space that the bit-field occupies. This is very important in the presence of the Boehm garbage collector, because objects are allocated using the typed memory facility available in this collector, and typed memory allocation requires information about where the pointers are located inside the object.

The position in the bit-field is the position, counting in bits, of the bit closest to the beginning of the structure.
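To make the three parts of a bit-field encoding concrete, the following sketch splits an encoding such as b128i3 into its position, type and size. It is plain C and does not use the runtime; the names bitfield_info and parse_bitfield are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper type, not part of any runtime API.  */
struct bitfield_info
{
  long position;  /* bit offset from the beginning of the structure */
  char type;      /* encoding character of the declared type, e.g. 'i' */
  long size;      /* width of the bit-field, in bits */
};

/* Parse one bit-field encoding of the form 'b' position type size.
   Returns a pointer to the first character after the encoding, or NULL
   if ENC does not start with a well-formed bit-field encoding.  */
const char *
parse_bitfield (const char *enc, struct bitfield_info *out)
{
  char *end;

  assert (enc != NULL && out != NULL);
  if (enc[0] != 'b' || !isdigit ((unsigned char) enc[1]))
    return NULL;

  out->position = strtol (enc + 1, &end, 10);
  if (*end == '\0')
    return NULL;                /* the type character is missing */
  out->type = *end++;

  if (!isdigit ((unsigned char) *end))
    return NULL;                /* the size is missing */
  out->size = strtol (end, &end, 10);
  return end;
}
```

Calling parse_bitfield on "b128i3" fills in position 128, type 'i' and size 3, and returns a pointer to the end of the string.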

The non-atomic types are encoded as follows:

pointers     ‘^’ followed by the pointed-to type.
arrays       ‘[’ followed by the number of elements in the array followed by the type of the elements followed by ‘]’
structures   ‘{’ followed by the name of the structure (or ‘?’ if the structure is unnamed), the ‘=’ sign, the type of the members and by ‘}’
unions       ‘(’ followed by the name of the union (or ‘?’ if the union is unnamed), the ‘=’ sign, the type of the members followed by ‘)’
vectors      ‘![’ followed by the vector_size (the number of bytes composing the vector) followed by a comma, followed by the alignment (in bytes) of the vector, followed by the type of the elements followed by ‘]’
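Because the non-atomic encodings nest, code that walks an encoding string usually needs to find where one complete type ends. The following is a minimal sketch of such a routine in plain C, written only from the rules listed above. The names skip_digits and skip_type are hypothetical, and type specifiers are not handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

static const char *
skip_digits (const char *p)
{
  while (isdigit ((unsigned char) *p))
    p++;
  return p;
}

/* Hypothetical helper: return a pointer just past the single type
   encoding that starts at ENC, or NULL if it is malformed.  */
const char *
skip_type (const char *enc)
{
  assert (enc != NULL);
  switch (*enc)
    {
    case '\0':
      return NULL;
    case '^':                   /* pointer: '^' then the pointed-to type */
    case 'j':                   /* complex: 'j' then the inner type */
      return skip_type (enc + 1);
    case 'b':                   /* bit-field: 'b' position type size */
      enc = skip_digits (enc + 1);
      return (*enc != '\0') ? skip_digits (enc + 1) : NULL;
    case '!':                   /* vector: "![" size ',' alignment type ']' */
      if (enc[1] != '[')
        return NULL;
      enc = skip_digits (enc + 2);
      if (*enc != ',')
        return NULL;
      enc = skip_type (skip_digits (enc + 1));
      return (enc != NULL && *enc == ']') ? enc + 1 : NULL;
    case '[':                   /* array: '[' count type ']' */
      enc = skip_type (skip_digits (enc + 1));
      return (enc != NULL && *enc == ']') ? enc + 1 : NULL;
    case '{':                   /* structure: '{' name '=' members '}' */
    case '(':                   /* union:     '(' name '=' members ')' */
      {
        char close = (*enc == '{') ? '}' : ')';

        for (enc++; *enc != '=' && *enc != close; enc++)
          if (*enc == '\0')
            return NULL;
        if (*enc == '=')
          enc++;
        /* skip_type returns NULL at the end of the string, which also
           ends this loop for an unterminated structure or union.  */
        while (enc != NULL && *enc != close)
          enc = skip_type (enc);
        return (enc != NULL) ? enc + 1 : NULL;
      }
    default:                    /* single-character encodings */
      return strchr ("BcCsSiIlLqQfdDv@#:*?", *enc) ? enc + 1 : NULL;
    }
}
```

Applied repeatedly, skip_type steps through a sequence of encoded types one type at a time, however deeply each of them is nested.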

Here are some types and their encodings, as they are generated by the compiler on an i386 machine:


Objective-C type                             Compiler encoding
int a[10];                                   [10i]
struct {                                     {?=i[3f]b128i3b131i2c}
  int i;
  float f[3];
  int a:3;
  int b:2;
  char c;
}
int a __attribute__ ((vector_size (16)));    ![16,16i] (alignment would depend on the machine)
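The bit positions 128 and 131 in the structure encoding above follow from the layout of the members that precede the bit-fields. The sketch below reproduces that arithmetic in plain C. It assumes 4-byte int and float, 8-bit bytes, and that the first bit-field starts directly after the array; the tag example and both function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* The unnamed structure from the table, given the hypothetical tag
   'example' so that offsetof can refer to it.  */
struct example
{
  int i;
  float f[3];
  int a:3;
  int b:2;
  char c;
};

/* Bit position of the first bit-field 'a', assuming that it starts
   right after the array 'f'.  */
long
first_bitfield_position (void)
{
  size_t end_of_f = offsetof (struct example, f)
                    + sizeof (((struct example *) 0)->f);

  /* The figures in the table assume this i386-like layout.  */
  assert (sizeof (int) == 4 && sizeof (float) == 4 && CHAR_BIT == 8);
  return (long) (end_of_f * CHAR_BIT);
}

/* 'b' follows 'a' directly, so it starts 3 bits later.  */
long
second_bitfield_position (void)
{
  return first_bitfield_position () + 3;
}
```

Under those assumptions the two functions return 128 and 131, matching b128i3 and b131i2 in the encoding.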

In addition to the types the compiler also encodes the type specifiers. The table below describes the encoding of the current Objective-C type specifiers:


Specifier   Encoding
const       r
in          n
inout       N
out         o
bycopy      O
byref       R
oneway      V

The type specifiers are encoded just before the type. Unlike types, however, the type specifiers are only encoded when they appear in method argument types.
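Since the specifiers come just before the type, code that reads an encoding can consume them first and then continue with the type itself. The following is a sketch of that step in plain C; the flag names and the function skip_qualifiers are hypothetical and are not taken from the runtime.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values, one per specifier in the table above.  */
enum
{
  Q_CONST  = 1 << 0,   /* r */
  Q_IN     = 1 << 1,   /* n */
  Q_INOUT  = 1 << 2,   /* N */
  Q_OUT    = 1 << 3,   /* o */
  Q_BYCOPY = 1 << 4,   /* O */
  Q_BYREF  = 1 << 5,   /* R */
  Q_ONEWAY = 1 << 6    /* V */
};

/* Hypothetical helper: consume the specifier characters at the start of
   ENC, store the matching flags in *FLAGS (if FLAGS is not NULL) and
   return a pointer to the type encoding that follows them.  */
const char *
skip_qualifiers (const char *enc, unsigned *flags)
{
  unsigned found = 0;

  assert (enc != NULL);
  for (;; enc++)
    switch (*enc)
      {
      case 'r': found |= Q_CONST;  break;
      case 'n': found |= Q_IN;     break;
      case 'N': found |= Q_INOUT;  break;
      case 'o': found |= Q_OUT;    break;
      case 'O': found |= Q_BYCOPY; break;
      case 'R': found |= Q_BYREF;  break;
      case 'V': found |= Q_ONEWAY; break;
      default:
        if (flags != NULL)
          *flags = found;
        return enc;
      }
}
```

For the encoding "Vv", for example, the function reports the oneway flag and returns a pointer to the v that encodes void.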

Note how const interacts with pointers:


Objective-C type   Compiler encoding
const int          ri
const int*         ^ri
int *const         r^i

const int* is a pointer to a const int, and so is encoded as ^ri. int* const, instead, is a const pointer to an int, and so is encoded as r^i.

Finally, there is a complication when encoding const char * versus char * const. Because char * is encoded as * and not as ^c, there is no way to express whether r applies to the pointer or to the pointee.

Hence, by convention, r* means const char * (since that is what is most often meant), and there is no way to encode char *const: it is simply encoded as *, and the const is lost.
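The rules for const and pointers, including the r* convention, can be summarized in a few lines of code. The following is only a sketch in plain C that looks at the outermost level of an encoding; the enumeration const_target and the function const_applies_to are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of what a leading 'r' refers to.  */
enum const_target
{
  CONST_NONE,     /* no const is recorded at the outermost level */
  CONST_OBJECT,   /* the encoded object itself is const: ri, r^i */
  CONST_POINTEE   /* the pointed-to type is const: ^ri, r* */
};

/* Classify ENC according to the rules described above.  */
enum const_target
const_applies_to (const char *enc)
{
  assert (enc != NULL);

  if (enc[0] == 'r')
    /* By convention r* is const char *, so the const belongs to the
       pointee.  Otherwise it belongs to the type that follows, which
       may itself be a pointer.  */
    return (enc[1] == '*') ? CONST_POINTEE : CONST_OBJECT;

  if (enc[0] == '^' && enc[1] == 'r')
    return CONST_POINTEE;       /* pointer to a const type */

  /* This includes a plain *, which may have been char * or char *const.  */
  return CONST_NONE;
}
```

Note that the result for * is CONST_NONE even when the original declaration was char *const, because that const never reaches the encoding.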