When we declare a variable of type int and we don’t tell the compiler whether it is supposed to be signed or unsigned, it is signed by default:
signed int s_number; // signed
unsigned int u_number; // unsigned
int number; // equivalent to signed int
This is true for all integer number types (short, int, long, long long).
But there is one exception: The char type!
The char Type is Special
According to the C Standard, it is implementation-defined whether char is signed or unsigned. The standard also says that char does not simply collapse into signed char or unsigned char but is a distinct type in its own right (although it has the same representation and behavior as one of the two, of course).
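To see why this matters in practice, here is a minimal sketch that stores the bit pattern 0xFF in a char and prints it as an int. Depending on whether char is signed or unsigned on your platform, you will typically see -1 or 255:
#include <stdio.h>

int main(void)
{
    /* Converting 0xFF (bit pattern 1111 1111) to a signed char is
       implementation-defined, but on today's two's-complement machines
       it yields -1; with an unsigned char it is simply 255. */
    char c = (char)0xFF;
    printf("%d\n", c); /* typically prints -1 (signed) or 255 (unsigned) */
    return 0;
}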
But the most surprising thing to know about char is that it is not guaranteed to be exactly 8 bits wide. The C Standard only guarantees that a byte has at least 8 bits (CHAR_BIT >= 8). There are some very old architectures where char is 9 bits wide, for example.
Fortunately for us, basically all architectures in use today have bytes that are 8 bits wide. Still, a program that assumes char to be 8 bits wide will not be portable to every platform where a C compiler is available.
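If your code relies on 8-bit bytes anyway, you can make that assumption explicit so it fails to compile on an exotic platform instead of silently misbehaving. A minimal sketch, assuming a C11 compiler:
#include <limits.h>

/* Refuse to compile on platforms where a byte is not exactly 8 bits wide. */
_Static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");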
Ask the Compiler
The characteristics of the char type (as well as of the other integer types) are exposed in the standard header file limits.h. Therefore we can write a simple C program to find out how char is implemented on our platform:
#include <stdio.h>
#include <limits.h>

int main(void)
{
    printf("char bits: %d\n", CHAR_BIT);
    printf("char is: %s\n", CHAR_MIN < 0 ? "signed" : "unsigned");
    printf("signed char min: %d\n", SCHAR_MIN);
    printf("signed char max: %d\n", SCHAR_MAX);
    printf("unsigned char min: 0\n");
    printf("unsigned char max: %d\n", UCHAR_MAX);
    return 0;
}
If you save this program with the filename char.c, you can use the following command to compile it on Linux and macOS:
$ gcc -o char char.c
If we run this program on a number of modern platforms we get the following results:
Linux (x86-64)
$ ./char
char bits: 8
char is: signed
signed char min: -128
signed char max: 127
unsigned char min: 0
unsigned char max: 255
Linux on Raspberry Pi (32 Bit ARM)
$ ./char
char bits: 8
char is: unsigned
signed char min: -128
signed char max: 127
unsigned char min: 0
unsigned char max: 255
macOS (x86-64)
$ ./char
char bits: 8
char is: signed
signed char min: -128
signed char max: 127
unsigned char min: 0
unsigned char max: 255
macOS (Apple Silicon)
$ ./char
char bits: 8
char is: signed
signed char min: -128
signed char max: 127
unsigned char min: 0
unsigned char max: 255
Windows (x86-64)
char bits: 8
char is: signed
signed char min: -128
signed char max: 127
unsigned char min: 0
unsigned char max: 255
The interesting thing to note is that most modern platforms give us the same output. Only the 32-bit ARM machine running Linux represents char as an unsigned value while all other platforms represent it as a signed value.
The same is true for ARM 64 (aarch64): char is normally unsigned there as well. It is just the case that Apple decided to represent char as a signed type on their platform. On other ARM 64 machines, you will most likely find char to be unsigned (unfortunately, I don’t have one lying around at the moment to show you).
Takeaways
On most platforms in use today, char is a signed type. Not so on non-Mac ARM devices, though. So we should always keep in mind that char can be either signed or unsigned.
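A classic place where this bites is reading input with getchar(). getchar() returns an int, and storing its result in a plain char before comparing it with EOF misbehaves no matter which signedness your platform picked. A minimal sketch of the portable version:
#include <stdio.h>

int main(void)
{
    /* Wrong would be "char c;": if char is unsigned, (c != EOF) is always
       true and the loop never ends; if char is signed, a legitimate 0xFF
       input byte compares equal to EOF and ends the loop too early. */
    int c; /* getchar() returns an int, so store it in an int */

    while ((c = getchar()) != EOF)
        putchar(c);

    return 0;
}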
Most compilers even provide a flag to switch between the two possible representations of char. With gcc, for example, you can use the flags
-fsigned-char
or
-funsigned-char
to explicitly set the signedness of char. If we try to compile our program on Linux (x86-64) with
$ gcc -fsigned-char -o char char.c
we get the following output (unchanged from the default, since char is already signed on x86-64 Linux; compiling with -funsigned-char instead would make the program report char as unsigned):
$ ./char
char bits: 8
char is: signed
signed char min: -128
signed char max: 127
unsigned char min: 0
unsigned char max: 255
Of course, forcing a particular signedness might come with a performance penalty, because the default is normally the representation that is most efficient on your hardware. Therefore, these flags should be seen mainly as an easy way to check the portability of your code.
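If your code really does depend on one particular signedness, you can also document and enforce that assumption in the source itself instead of relying on compiler flags. A minimal sketch using the limits.h macros (the error message is just an example):
#include <limits.h>

/* Fail the build early if char is not signed on this platform. */
#if CHAR_MIN >= 0
#error "this code assumes that char is signed"
#endif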