Linux ABI - Life with Distributed Computational System based on OpenSource

Motivation:

When I read ./linux/types.h, I found a comment saying:

aligned_u64 should be used in defining kernel<->userspace ABIs to avoid common 32/64-bit compat problems.

What is ABI:

It stands for Application Binary Interface.

An ABI is important when it comes to applications that use external libraries. If a program is built to use a particular library and that library is later updated, you don't want to have to re-compile that application (and from the end-user's standpoint, you may not have the source).

If the updated library uses the same ABI, then your program will not need to change.

The interface to the library (which is all your program really cares about) is the same even though the internal workings may have changed. Two versions of a library that have the same ABI are sometimes called "binary-compatible" since they have the same low-level interface (you should be able to replace the old version with the new one and not have any major problems).

Sometimes, ABI changes are unavoidable. When this happens, any programs that use that library will not work unless they are re-compiled to use the new version of the library. If the ABI changes but the API does not, then the old and new library versions are sometimes called "source compatible". This implies that while a program compiled for one library version will not work with the other, source code written for one will work for the other if re-compiled.

For this reason, library writers tend to try to keep their ABI stable (to minimize disruption). Keeping an ABI stable means not changing function interfaces (return type and number, types, and order of arguments), definitions of data types or data structures, defined constants, etc. New functions and data types can be added, but existing ones must stay the same. If you expand, say, a 16-bit data structure field into a 32-bit field, then already-compiled code that uses that data structure will no longer access that field (or any field after it) correctly. Accesses to data structure members get converted into memory addresses and offsets during compilation, and if the data structure changes, these offsets will not point to what the code expects and the results are unpredictable at best.
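To make the 16-bit to 32-bit case concrete, here is a small sketch in C. The struct names `record_v1` and `record_v2` and the helper functions are made up for illustration; they stand in for two releases of some library's public header. The sketch only shows how the offset of a later field shifts, and what a binary compiled against the old layout would read from a new-layout object.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <assert.h>

/* Hypothetical version 1 of a library's public struct. */
struct record_v1 {
    uint16_t id;      /* 16-bit field */
    uint16_t flags;   /* lives at byte offset 2 */
};

/* Hypothetical version 2: `id` was widened to 32 bits. */
struct record_v2 {
    uint32_t id;      /* now 32-bit */
    uint16_t flags;   /* pushed to byte offset 4 */
};

/* The offsets a compiler bakes into code built against each version. */
size_t flags_offset_v1(void) { return offsetof(struct record_v1, flags); }
size_t flags_offset_v2(void) { return offsetof(struct record_v2, flags); }

/*
 * Mimic an already-compiled program: it still reads `flags` at the v1
 * offset, even though the object it is handed uses the v2 layout.
 */
uint16_t read_flags_like_old_binary(const struct record_v2 *r)
{
    uint16_t value;
    const unsigned char *base = (const unsigned char *)r;

    memcpy(&value, base + offsetof(struct record_v1, flags), sizeof value);
    return value;
}
```

With `id` set to 0 and `flags` set to 0xBEEF in a `record_v2`, the "old binary" read returns bytes that belong to `id`, not `flags`. Nothing crashes; the program just silently works with the wrong data.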

That's why the comment says:

aligned_u64 should be used in defining kernel<->userspace ABIs to avoid common 32/64-bit compat problems.

Let's keep going.

An ABI is not necessarily something you will explicitly provide unless you are expecting people to interface with your code using assembly. It is not language-specific either, since (for example) a C application and a Pascal application will use the same ABI after they are compiled.

Furthermore:

Talking about the ABI also means talking about the ELF file format. ELF is relevant here because it defines the interface between the operating system and the application. When you tell the OS to run a program, it expects the program to be formatted in a certain way and (for example) expects the first section of the binary to be an ELF header containing certain information at specific offsets.

This is how the application communicates important information about itself to the operating system. If you build a program in a different binary format, an OS that expects ELF will not be able to interpret the binary file or run the application. This is one big reason why a program built for one binary format must be either re-compiled or run inside some type of emulation layer that can translate from one binary format to another.
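As a rough illustration of "certain information at specific offsets", the sketch below checks the first bytes of a buffer the way a loader's very first sanity check might. The function names `looks_like_elf` and `elf_class_bits` are my own, not from any library. The facts it relies on come from the ELF specification: the file starts with the four bytes 0x7f 'E' 'L' 'F', and the byte at index 4 (EI_CLASS) is 1 for 32-bit objects and 2 for 64-bit objects.

```c
#include <stddef.h>
#include <string.h>
#include <assert.h>

/* Every ELF file begins with these four bytes. */
static const unsigned char ELF_MAGIC[4] = { 0x7f, 'E', 'L', 'F' };

/* Returns 1 if the buffer starts with the ELF magic number, else 0. */
int looks_like_elf(const unsigned char *buf, size_t len)
{
    return len >= sizeof ELF_MAGIC &&
           memcmp(buf, ELF_MAGIC, sizeof ELF_MAGIC) == 0;
}

/*
 * Reads the EI_CLASS byte (index 4 of the ELF identification bytes).
 * Returns 32 or 64, or 0 if the buffer is not ELF, is too short,
 * or carries an unknown class value.
 */
int elf_class_bits(const unsigned char *buf, size_t len)
{
    if (len < 5 || !looks_like_elf(buf, len))
        return 0;
    if (buf[4] == 1)
        return 32;
    if (buf[4] == 2)
        return 64;
    return 0;
}
```

A file whose first bytes are anything else fails the first check immediately, which is the situation described above: the OS has no way to interpret the rest of the file.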

The full comment:

aligned_u64 should be used in defining kernel<->userspace ABIs to avoid common 32/64-bit compat problems.

64-bit values align to 4-byte boundaries on x86_32 (and possibly other architectures) and to 8-byte boundaries on 64-bit architectures. The new aligned_64 type enforces 8-byte alignment so that structs containing aligned_64 values have the same alignment on 32-bit and 64-bit architectures.

No conversions are necessary between 32-bit user-space and a 64-bit kernel.
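Here is a minimal sketch of what that means for a struct shared between user-space and the kernel. The typedef `my_aligned_u64` and both struct names are hypothetical stand-ins I wrote for this post, not the kernel's own definitions, and the `__attribute__((aligned(8)))` syntax assumes GCC or Clang.

```c
#include <stddef.h>
#include <stdint.h>
#include <assert.h>

/* Hypothetical stand-in for an 8-byte-aligned 64-bit type (GCC/Clang syntax). */
typedef uint64_t my_aligned_u64 __attribute__((aligned(8)));

/* Plain 64-bit member: its offset depends on the architecture's ABI. */
struct msg_plain {
    uint32_t len;
    uint64_t addr;        /* offset 4 on x86_32, offset 8 on x86_64 */
};

/* Explicitly aligned member: same layout on 32-bit and 64-bit builds. */
struct msg_aligned {
    uint32_t len;
    my_aligned_u64 addr;  /* forced to offset 8 */
};

size_t plain_addr_offset(void)   { return offsetof(struct msg_plain, addr); }
size_t aligned_addr_offset(void) { return offsetof(struct msg_aligned, addr); }
size_t aligned_size(void)        { return sizeof(struct msg_aligned); }
```

Building this with and without `-m32` on an x86 machine should show `plain_addr_offset()` changing while `aligned_addr_offset()` and `aligned_size()` stay at 8 and 16. That is why a 32-bit program can hand the second struct to a 64-bit kernel without any conversion.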