Linker (computing)

An illustration of the linking process. Object files and static libraries are assembled into a new library or executable

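In computing, a linker or link editor is a program that takes one or more object files (generated by a compiler or an assembler) and combines them into a single executable file, library file, or another object file.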

A simpler version that writes its output directly to memory is called the loader, though loading is typically considered a separate process.^[1]^[2]

Overview

Computer programs are typically composed of several parts or modules. These modules need not be contained within a single object file; when they are not, they refer to each other by means of symbols, which stand in for addresses in other modules and are mapped to memory addresses when the program is linked for execution.

While the process of linking is meant to ultimately combine these independent parts, there are many good reasons to develop them separately at the source level. Among these reasons are the ease of organizing several smaller pieces rather than a monolithic whole and the ability to better define the purpose and responsibilities of each individual piece, which is essential for managing complexity and increasing long-term maintainability in software architecture.

Typically, an object file can contain three kinds of symbols:

defined "external" symbols, sometimes called "public" or "entry" symbols, which allow it to be called by other modules,
undefined "external" symbols, which reference other modules where these symbols are defined, and
local symbols, used internally within the object file to facilitate relocation.

For most compilers, each object file is the result of compiling one input source code file. When a program comprises multiple object files, the linker combines these files into a unified executable program, resolving the symbols as it goes along.
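The symbol resolution described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular linker is implemented: the `ObjectFile` class, the `resolve` function, and the file and symbol names are all hypothetical, and a real object file also carries code, data, and relocation records.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectFile:
    """Hypothetical, simplified object file: symbol names only."""
    name: str
    defined: set = field(default_factory=set)    # defined "external" symbols
    undefined: set = field(default_factory=set)  # references to other modules
    local: set = field(default_factory=set)      # invisible to other modules


def resolve(objects):
    """Map each external symbol to the object defining it.

    Raises ValueError for a symbol defined twice or never defined.
    """
    table = {}
    for obj in objects:
        for sym in sorted(obj.defined):
            if sym in table:
                raise ValueError(
                    f"duplicate symbol {sym!r} in {obj.name} and {table[sym]}")
            table[sym] = obj.name
    for obj in objects:
        missing = sorted(obj.undefined - set(table))
        if missing:
            raise ValueError(f"undefined symbols in {obj.name}: {missing}")
    return table


# Both objects have a local symbol "buf"; local symbols never clash.
main_o = ObjectFile("main.o", defined={"main"}, undefined={"helper"}, local={"buf"})
util_o = ObjectFile("util.o", defined={"helper"}, local={"buf"})
symbols = resolve([main_o, util_o])
print(symbols)
```

Local symbols are deliberately left out of the global table, which is why two objects may reuse the same local name without conflict.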

Linkers can take objects from a collection called a library. Most linkers do not include the whole of a static library in the output; they include only the members that are referenced by other object files or libraries. In the case of a shared library, the entire library has to be loaded at runtime, as it is not known in advance which functions or methods will be called. Library linking may thus be an iterative process, with some referenced modules requiring additional modules to be linked, and so on. Libraries exist for diverse purposes, and one or more system libraries are usually linked in by default.
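The iterative nature of library linking can be illustrated with a small Python sketch. It assumes a hypothetical library represented as a mapping from member name to the symbols that member defines and references; `select_members` and all member and symbol names are invented for the example.

```python
def select_members(needed, library):
    """Pick the library members required to satisfy the `needed` symbols.

    `library` maps a member name to (defined symbols, undefined symbols).
    Returns the chosen member names and any symbols left unresolved.
    """
    chosen = []
    defined = set()
    pending = set(needed)
    progress = True
    while pending and progress:
        progress = False
        for member, (defs, undefs) in library.items():
            if member not in chosen and defs & pending:
                chosen.append(member)
                defined |= defs
                # A newly added member may itself need further symbols.
                pending = (pending | undefs) - defined
                progress = True
    return chosen, pending


libdemo = {
    "print.o": ({"print_line"}, {"write_bytes"}),
    "io.o": ({"write_bytes"}, set()),
    "math.o": ({"square"}, set()),
}
members, unresolved = select_members({"print_line"}, libdemo)
print(members, unresolved)
```

Here `math.o` is never pulled in because nothing references `square`, while `io.o` is added only because `print.o` needs it.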

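The linker also takes care of arranging the objects in a program's address space. This may involve relocating code that assumes a specific base address to another base. Since a compiler seldom knows where an object will reside, it often assumes a fixed base location (for example, zero). Relocating machine code may involve re-targeting absolute jumps, loads, and stores.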
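As a rough sketch of what re-targeting absolute addresses means, the following Python function patches 32-bit little-endian address fields when a module moves to a new base. The byte layout, the placeholder opcode, and the `relocate` function are assumptions made for illustration; real relocation records also describe the kind of fix-up to apply.

```python
def relocate(image, relocation_offsets, old_base, new_base):
    """Return a copy of `image` in which every listed 32-bit little-endian
    absolute address is adjusted from `old_base` to `new_base`."""
    patched = bytearray(image)
    delta = new_base - old_base
    for offset in relocation_offsets:
        value = int.from_bytes(patched[offset:offset + 4], "little")
        value = (value + delta) & 0xFFFFFFFF  # wrap to 32 bits
        patched[offset:offset + 4] = value.to_bytes(4, "little")
    return bytes(patched)


# Hypothetical 8-byte module assembled at base 0: a placeholder opcode byte,
# a 32-bit absolute target address (4), then three padding bytes.
module = bytes([0x01]) + (4).to_bytes(4, "little") + bytes(3)
moved = relocate(module, relocation_offsets=[1], old_base=0, new_base=0x1000)
print(hex(int.from_bytes(moved[1:5], "little")))
```

Only the bytes named in `relocation_offsets` change; the opcode and padding are copied untouched.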

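The executable output by the linker may need another relocation pass when it is finally loaded into memory (just before execution). This pass is usually omitted on hardware offering virtual memory: every program is put into its own address space, so there is no conflict even if all programs load at the same base address.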

On some

Dynamic linking

Many operating system environments allow dynamic linking, deferring the resolution of some undefined symbols until a program is run. That means that the executable code still contains undefined symbols, plus a list of objects or libraries that will provide definitions for these. Loading the program will load these objects/libraries as well, and perform a final linking.

This approach offers two advantages:

Often-used libraries (for example the standard system libraries) need to be stored in only one location, not duplicated in every single executable file, thus saving limited memory and disk space.
If a library is replaced to correct a bug or improve performance, all programs that use it dynamically benefit from the change after being restarted. Programs that included the affected function by static linking would have to be re-linked first.

There are also disadvantages:

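Known on the Windows platform as "DLL hell", an incompatible updated library will break executables that depended on the behavior of the previous version if the new version is not backward compatible.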
A program, together with the libraries it uses, might be certified (e.g. as to correctness, documentation requirements, or performance) as a package, but not if components can be replaced (this also argues against automatic OS updates in critical systems; in both cases, the OS and libraries form part of a qualified environment).

Contained or virtual environments may further allow system administrators to mitigate or trade off these individual pros and cons.

Static linking

Static linking is the result of the linker copying all library routines used in the program into the executable image. This may require more disk space and memory than dynamic linking, but is more portable, since it does not require the presence of the library on the system where it runs. Static linking also prevents "DLL hell", since each program includes exactly the versions of library routines that it requires, with no conflict with other programs. A program using just a few routines from a library does not require the entire library to be installed.

Relocation

As the compiler has no information on the layout of objects in the final output, it cannot take advantage of shorter or more efficient instructions that place a requirement on the address of another object. For example, a jump instruction can reference an absolute address or an offset from the current location, and the offset could be expressed with different lengths depending on the distance to the target. By first generating the most conservative instruction (usually the largest relative or absolute variant, depending on platform) and adding relaxation hints, it is possible to substitute shorter or more efficient instructions during the final link. In regard to jump optimizations this is also called automatic jump-sizing.^[4] This step can be performed only after all input objects have been read and assigned temporary addresses; the linker relaxation pass subsequently reassigns addresses, which may in turn allow more potential relaxations to occur. In general, the substituted sequences are shorter, which allows this process to always converge on the best solution given a fixed order of objects; if this is not the case, relaxations can conflict, and the linker needs to weigh the advantages of either option.
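The relaxation loop can be sketched as a fixed-point iteration. The Python below is a simplified model under stated assumptions: jumps have only two encodings (5 bytes and 2 bytes, the short form reaching a signed 8-bit displacement measured from the end of the jump), and a program is a list of `("code", size)` and `("jump", target_index)` items. None of this corresponds to a specific instruction set or linker.

```python
LONG_JUMP, SHORT_JUMP = 5, 2  # assumed sizes in bytes, not tied to a real ISA


def relax(items):
    """Return the final size of each item after shrinking jumps where possible.

    items: list of ("code", size) or ("jump", target_index).
    """
    # Start conservatively: every jump uses the long encoding.
    sizes = [arg if kind == "code" else LONG_JUMP for kind, arg in items]
    changed = True
    while changed:
        changed = False
        # Assign temporary addresses from the current size estimates.
        addresses = [0]
        for size in sizes:
            addresses.append(addresses[-1] + size)
        for i, (kind, target) in enumerate(items):
            if kind != "jump" or sizes[i] == SHORT_JUMP:
                continue
            if target > i:
                # Forward: bytes between the end of the jump and the target.
                displacement = addresses[target] - addresses[i + 1]
            else:
                # Backward: measured from the end of the short form.
                displacement = addresses[target] - (addresses[i] + SHORT_JUMP)
            if -128 <= displacement <= 127:
                sizes[i] = SHORT_JUMP
                changed = True
    return sizes


program = [("jump", 3), ("code", 124), ("jump", 3), ("code", 1)]
print(relax(program))
```

In this example the first jump is 129 bytes from its target while the second jump is still long, so it cannot be shortened in the first pass. Once the second jump shrinks, the distance drops to 126 bytes and the next pass shortens the first jump as well. Because sizes only ever decrease, the loop must terminate.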

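While instruction relaxation typically occurs at link-time, inner-module relaxation can already take place as part of the optimizing process at compile-time. In some cases, relaxation can also occur at load-time as part of the relocation process or combined with dynamic dead-code elimination techniques.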

Linkage editor

In IBM System/360 mainframe environments such as OS/360, including z/OS for the z/Architecture mainframes, this type of program is known as a linkage editor. As the name implies, a linkage editor has the additional capability of allowing the addition, replacement, and/or deletion of individual program sections. Operating systems such as OS/360 have a format for executable load modules that contains supplementary data about the component sections of a program, so that an individual program section can be replaced and the other parts of the program updated, with relocatable addresses and other references corrected by the linkage editor as part of the process.

One advantage of this is that it allows a program to be maintained without having to keep all of the intermediate object files, or without having to re-compile program sections that haven't changed. It also permits program updates to be distributed in the form of small files (originally card decks), containing only the object module to be replaced. In such systems, object code is in the form and format of 80-byte punched-card images, so that updates can be introduced into a system using that medium. In later releases of OS/360 and in subsequent systems, load modules contain additional data about versions of component modules, to create a traceable record of updates. It also allows one to add, change, or remove an overlay structure from an already linked load module.

The term "linkage editor" should not be construed as implying that the program operates in a user-interactive mode like a text editor. It is intended for batch-mode execution, with the editing commands being supplied by the user in sequentially organized files, such as punched cards, DASD, or magnetic tape.

Linkage editing (IBM nomenclature) or consolidation or collection (ICL nomenclature) refers to the linkage editor's or consolidator's act of combining the various pieces into a relocatable binary, whereas the loading and relocation into an absolute binary at the target address is normally considered a separate step.^[2]

Linker control scripts

Early linkers gave users very limited control over the arrangement of the generated output object files. As target systems such as embedded systems became more complex, with differing memory requirements, it became necessary to give users control over the generated output so that it met their specific requirements, such as defining the base addresses of segments. Linker control scripts were used for this.
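The kind of control such a script provides can be modelled in Python. This is only an analogy: real linker scripts use a dedicated syntax (GNU ld, for example, has `MEMORY` and `SECTIONS` commands), and the `layout` function, section sizes, and base addresses below are hypothetical values chosen for the example.

```python
def layout(sections, placement):
    """Compute where each output section is placed.

    sections:  {name: size in bytes}
    placement: ordered list of (name, base address or None); None means
               "directly after the previous section".
    Returns {name: (start, end)} with `end` exclusive.
    """
    result = {}
    cursor = 0
    for name, base in placement:
        start = cursor if base is None else base
        if start < cursor:
            raise ValueError(
                f"section {name!r} at {start:#x} overlaps previous output")
        end = start + sections[name]
        result[name] = (start, end)
        cursor = end
    return result


# Hypothetical embedded target: code and constants at one base, data at another.
sizes = {".text": 0x200, ".rodata": 0x40, ".data": 0x20}
script = [(".text", 0x08000000), (".rodata", None), (".data", 0x20000000)]
memory_map = layout(sizes, script)
for name, (start, end) in memory_map.items():
    print(f"{name:8} {start:#010x} - {end:#010x}")
```

Giving `.data` its own base address while letting `.rodata` follow `.text` mirrors the common embedded arrangement in which code and data live in different memory regions.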

Common implementations

On Unix and Unix-like systems, the linker is known as "ld". The name "ld" derives from "LoaDer" and "Link eDitor". The term "loader" was used to describe the process of loading external symbols from other programs during the process of linking.^[5]

GNU linker

The GNU linker (or GNU ld) is the GNU Project's implementation of the Unix command ld. It is part of the GNU Binary Utilities (binutils). Two versions of ld are provided in binutils: the traditional GNU ld based on bfd, and a "streamlined" ELF-only version called gold.

The command-line and linker script syntaxes of GNU ld are the de facto standard in much of the Unix-like world. The LLVM project's linker, lld, is designed to be drop-in compatible,^[7] and may be used directly with the GNU compiler. Another drop-in replacement, mold, is a highly parallelized and faster alternative which is also supported by GNU tools.^[8]

References

[1] IBM Corporation. 1972. Archived (PDF) from the original on 2020-03-06. Retrieved 2020-03-07.

[2] LCCN 78-19961. (xii+100 pages)

[3] BRF-LINKER User Manual. August 1984. ND-60.196.01.

[4] ISBN 0-13-052564-2. Archived (PDF) from the original on 2020-03-23. Retrieved 2008-10-01. (xiv+294+4 pages)

[5] "1. ld". UNIX PROGRAMMER'S MANUAL (6 ed.). May 1975.

[6] "GNU Binutils: Linker Scripts". 2018-07-18. Archived from the original on 2020-03-06. Retrieved 2019-01-18.

[7] "LLD - The LLVM Linker — lld 14 documentation". lld.llvm.org.

[8] "GCC 12 Adds Support For Using The Mold Linker". www.phoronix.com.

Further reading

Fraser, Christopher W.; S2CID 206508204.

Operating System 360 - Linkage Editor (E) - Program Logic Manual (PDF) (3 ed.). International Business Machines Corporation. 1969-07-23 [June 1967]. Program number 360S-ED-510. File No. S360-31. Form Y28-6610-2. Archived from the original (PDF) on 2007-10-01. Retrieved 2020-03-07.

S2CID 42995338.

OCLC 42413382. Archived from the original on 2012-12-05. Retrieved 2020-01-12. Code: [1][2] Errata: [3]

S2CID 5694671. Archived (PDF) from the original on 2020-03-07. Retrieved 2020-03-07. (19 pages)

Ramsey, Norman (May 1996). "Relocating Machine Instructions by Currying" (PDF). ACM SIGPLAN Notices. 31 (5): 226–236. doi:10.1145/249069.231429. Archived (PDF) from the original on 2020-05-18.

External links

Ian Lance Taylor's Linkers blog entries

Linkers and Loaders, a Linux Journal article by Sandeep Grover

Another Listing of Where to Get a Complete Collection of Free Tools for Assembly Language Development

GNU linker manual

LLD - The LLVM Linker

ld(1): The GNU linker – Linux User Manual – User Commands


Retrieved from "https://en.wikipedia.org/w/index.php?title=Linker_(computing)&oldid=1217194294"
