C programs in (address) space and (run-)time
Systems Programming
02. C Programs in Space and Time
Alexander Holupirek
Database and Information Systems Group
Department of Computer & Information Science
University of Konstanz
Summer Term 2008

Where is my data and why do I have to know?
- C is closely related to the machine. Before talking about pointers, storage allocation, etc., some background knowledge about address space, (virtual) memory, and its allocation during program execution comes in handy.
- Knowledge about the memory layout of a program is quite helpful when debugging.
- Knowledge about what happens inside the machine during program execution is fundamental both to debugging programs and, in the first place, to writing clean code.

- Repetition Computer Architecture
- Storage Classes
- From Source Code To Executable Code
- Construction of an Executable
- Relocation Process

C, assembler, and machine code
C source code:
    int a, b;
    a = b * b;
Intel IA-32 assembler source code:
    mov   0x403030,%eax
    imul  0x403030,%eax
    mov   %eax,0x403020
Executable binary code (shown in hexadecimal), i.e., machine instructions or processor instructions:
    Address   Content (1 byte each)
    4012ee    a1 30 30 40 00
    4012f3    0f af 05 30 30 40 00
    4012fa    a3 20 30 40 00

C, assembler, and machine code
C source code:
    int a=4, b;
    int main(void) {
      if (a>5)
        b=1;
      else
        b=0;
    }
Executable binary code: memory address, memory content (= machine instruction), and assembler source code:
    8048344:  83 3d 94 94 04 08 05    cmpl   $0x5,0x8049494
    804834b:  7e 0c                   jle    8048359
    804834d:  c7 05 8c 95 04 08 01    movl   $0x1,0x804958c
    8048354:  00 00 00
    8048357:  eb 0a                   jmp    8048363
    8048359:  c7 05 8c 95 04 08 00    movl   $0x0,0x804958c
    8048360:  00 00 00
    8048363:  c9
    ...
All numbers in the binary and assembler code are hexadecimal.
a is located at address 0x8049494
b is located at address 0x804958c

Address Space
[Figure: memory addresses run from 0, the lowest possible address ("start of memory"), to max., the highest possible address ("end of memory"). Each address of an individual byte (e.g., 0x50000000, 0x50000001) holds one byte of memory content (e.g., 0x56, 0xfc).]
A data block of size 16 bytes:
    0x10000000   start address of the data block
    0x1000000f   last byte address of the data block
    0x10000010   address of the first byte after the data block

Byte Ordering
Data (4 bytes): d3 d2 d1 d0, with d3 as MSB and d0 as LSB
    Address   Big-endian system   Little-endian system
    n         d3 (MSB)            d0 (LSB)
    n+1       d2                  d1
    n+2       d1                  d2
    n+3       d0 (LSB)            d3 (MSB)
The program accesses the 4-byte data through address n
MSB = Most Significant Byte
LSB = Least Significant Byte

Alignment Rules
Goal: Optimal Performance
- Determine the address locations for variables and instructions
- Great impact on compiler, assembler, linker tools
[Figure: a misaligned data longword occupies addresses 0x35 to 0x38 (hexadecimal) in the address space. Longword boundaries in the address space are the addresses divisible by 4 without remainder, and the data bus transfers one longword per access (byte offsets +0 to +3). The first access covers 0x34 to 0x37, the second access covers 0x38 to 0x3b.]

Alignment Rules (cont.)
For derived types (arrays, functions, pointers, structures, unions; we will discuss them later), which are constructed from the basic types, alignment rules apply to each single component:
    struct artikel {char name[5];      alignment(1)
                    int anzahl;        alignment(4)
                    double preis;};
Alignment rules may be influenced through compiler directives (-malign-int aligns variables on 32-bit boundaries, producing code that runs somewhat faster on processors with 32-bit busses at the expense of memory)

Storage Classes
Placement of data in memory depends on storage class
An object, such as a variable, is a location in storage, and its interpretation depends on two main attributes: its storage class and its type
- The storage class determines the lifetime of the storage associated with the identified object
- The type determines the meaning of the values found in the identified object
- In C we have two storage classes: automatic and static
- Storage class specifiers (auto, extern, register, static), together with the context of an object's declaration, specify its storage class

Automatic Storage Class
Automatic Objects
- auto and register give the declared objects automatic storage class, and may be used only within functions
- They are local to a block (aka "compound statement", such as the body of a function) and are discarded on exit from the block
- Declarations within a block create automatic objects if no storage class specification is mentioned or auto is used
- Initialization of automatic objects is performed each time the block is entered at the top (if a jump into the block is executed, the initializations are not performed)
- Objects declared register are automatic, and are (if possible) stored in fast registers of the machine
- For register the address operator '&' is not allowed

Static Storage Class
Static Objects
- May be local to a block or external to all blocks
- In both cases, they retain their values across exit from and reentry to functions and blocks
- Within a block, static objects are declared with static
- Objects declared outside of all blocks (at the same level as function definitions) are always static
- On the outer level, the keyword static makes them local to a particular translation unit (internal linkage)
- They are global to an entire program by omitting an explicit storage class, or by using extern (external linkage)

Storage Class and Sections
Intermediate Summary
- A program executed does not only use storage for its instructions, but additionally needs space for, e.g., variables
- Variables may be temporary, dynamically allocated, or static (i.e., permanent in terms of storage allocation), initialized or uninitialized, declared as constant (const) and thus read-only
- Placement of data in memory depends on its storage class
- During the translation process the compiler uses sections to divide the address space into logical units
- Details vary with operating systems and compiler used

Typical Program Organisation
A typical program divides naturally into sections
Code: machine instructions, should be unmodifiable, size is known after compilation, does not change (.text)
Data:
- static data
  - initialized (.data) / uninitialized (.bss)
  - constant address in memory
  - permanent life time
- dynamic data
  - stack or heap
  - storage space not known
  - volatile life time

Program Sections
    Section   Placement in the address space
    .text     PROM or RAM, write-protected
    .data     RAM
    .bss      RAM
PROM: Programmable Read Only Memory (memory chip that cannot be written during operation)
RAM: Random Access Memory

Virtual Memory and Segments
Virtual Memory
- Whenever a process is created, the kernel provides a chunk of physical memory which can be located anywhere
- Through the magic of virtual memory (VM), the process believes it has all the memory on the computer
Typically the VM space is laid out in a similar manner:
- Text Segment (.text)
- Initialized Data Segment (.data)
- Uninitialized Data Segment (.bss)
- The Stack
- The Heap

A Program in Memory
Addresses, from 0 upward:
    static data
        Code, constants       loaded from the executable file
        Initialized data      loaded from the executable file
        Uninitialized data    provided at process start and initialized with 0 (cleared)
    dynamic data
        Heap                  provided at process start, for dynamic storage allocation, grows towards the stack
        Stack                 provided at process start, grows towards lower addresses (or towards higher addresses; processor-dependent)

Different Memory Layouts
[Figure: two memory layouts, each drawn from address 0 upward: (A) solution on the PC (IA-32) and (B) stack growing in the opposite direction. Both show the regions code and constants (at the program start address), initialized data, uninitialized data, heap, and stack.]

Memory Segments
Text Segment: The text segment contains the actual code (including constants) to be executed. It's usually sharable, so multiple instances of a program can share the text segment to lower memory requirements. This segment is usually marked read-only so a program can't modify its own instructions.
Initialized Data Segment: This segment contains global variables which are initialized by the programmer.
Uninitialized Data Segment: Also named .bss (block started by symbol), after an operator used by an old assembler. This segment contains uninitialized global variables. All variables in this segment are initialized to 0 or NULL pointers before the program begins to execute.

Memory Segments (cont.)
The Stack: The stack is a collection of stack frames which we will discuss later. When a new frame needs to be added (as a result of a newly called function), the stack grows downward.
The Heap: Dynamic memory, where storage can be (de-)allocated via C's free(3)/malloc(3). The C library also gets dynamic memory for its own workspace from the heap. As more memory is requested "on the fly", the heap grows upward.

Variable Placement and Life Time (Code)
    #include <stdlib.h>     /* malloc(), free() */

    int a;                  /* Permanent life time */
    static int b;           /* ditto, but reduced scope */

    void
    func(void)
    {
        char c;             /* only for the life time of func() */
                            /* but 2x; visible only in func() */
        static int d;       /* I'm unique, exist once at a stable */
                            /* address, visible only in func() */
    }

    int
    main(void)
    {
        int e;              /* life time of main() */
        int *pi = (int *) malloc(sizeof(int));  /* newborn */
        func();
        func();
        free(pi);           /* RIP, pi points to an invalid address */
        return (0);
    }

Variable Placement and Life Time (Diagram)
[Figure: address space from address 0 to max. Code: 1st, 2nd, 3rd, 4th instruction, ..., with the program counter at PC(t=0) and PC(t=x). Data: a, b, d. Heap: the int that pi points to. Stack: e, pi, and c, with the stack pointer at SP(t=0) and SP(t=x).]
t=0: program execution is started, i.e., the execution environment is already initialized
t=x: arbitrary point in time during program execution

Variable Placement
Variables (outside a function): Globally declared variables go to the Uninitialized Data Segment if they are not initialized, to the Initialized Data Segment otherwise. This is necessary for the OS to decide if storage has to be loaded with initialization data from the executable binary.
Variables (inside a function): Implicit assumption of auto, go to The Stack. Declared as static, see above.
Constants (const): Text Segment
Function Parameters: Are pushed on The Stack or stored in registers. If pointers are passed, data is elsewhere.

From source code to executable code
From high-level language (HLL) source code to executable code, i.e., concrete processor instructions in combination with data.
Translation Steps (multi-phase compilation)
Compilation: HLL source code to assembler source code
Assembly: Assembler source code to object code
Linking: Object code to executable code
Compilers and assemblers create object files containing the generated binary code and data for a source file. Linkers combine multiple object files into one, loaders take object files and load them into memory.
Goal: An executable binary file (a.out)

Translation steps using gcc(1)
    Tool           Input files                                   Output files
    Preprocessor   C/C++ source code (*.c/*.cc/*.cpp)            preprocessed C/C++ source code (*.i/*.ii)
    Compiler       preprocessed C/C++ source code (*.i/*.ii)     assembler source code (*.s)
    Assembler      assembler source code (*.s)                   object file, unlinked (*.o)
    Linker         object files, library files (*.o/*.a)         executable file (= object file, loadable) (a.out)

File suffixes and their meaning
For any given input file, the file name suffix determines what kind of compilation is done (see gcc(1) for more details and suffixes):
    suffix   compilation step
    .c       C source code which must be preprocessed
    .i       C source code which should not be preprocessed
    .h       Header file to be turned into a precompiled header
    .s       Assembler code
    .o       An object file to be fed straight into linking

Creation of an executable file
[Figure: flow of operations, commands, and their inputs and outputs: (Filename).c is compiled by gcc into (Filename).s, which is assembled by gas into (Filename).o, which is linked by ld together with object/library files into a.out.]

The C Preprocessor
The C preprocessor performs ...
- Inclusion of named files
- Macro Substitution
- Conditional Compilation

File Inclusion
A control line of the form
    # include <filename>
causes the replacement of that line by the entire contents of the file filename.
Note
The characters in the name filename must not include > or \n, and the effect is undefined if it contains any of ", ', \, or /*.
Location
The named file is searched for in a sequence of implementation-dependent places (often starting in /usr/include).

Macro Substitution
A control line of the form
    # define identifier token-sequence
causes the preprocessor to replace subsequent instances of the identifier with the given sequence of tokens.
Example
    # define EXIT_FAILURE 1
    # define EXIT_SUCCESS 0
    # define S_IRWXU 0000700    /* RWX mask for owner */
    # define S_IRUSR 0000400    /* R for owner */
    # define S_IWUSR 0000200    /* W for owner */
    # define S_IXUSR 0000100    /* X for owner */

Macro Substitution (cont.)
A control line of the form
    # define identifier(identifier-list) token-sequence
where there is no space between the first identifier and the '(', is a macro definition with parameters given by the identifier list.
Example
    # define S_ISDIR(m)  ((m & 0170000) == 0040000)    /* directory */
    # define S_ISCHR(m)  ((m & 0170000) == 0020000)    /* char sp. */
    # define S_ISBLK(m)  ((m & 0170000) == 0060000)    /* block sp. */
    # define S_ISREG(m)  ((m & 0170000) == 0100000)    /* regular */
    # define S_ISFIFO(m) ((m & 0170000) == 0010000)    /* fifo */

Macro Substitution (cont.)
A control line of the form
    # undef identifier
causes the identifier's preprocessor definition to be forgotten. It is not erroneous to apply #undef to an unknown identifier.
Example
    /*
     * Some header files may define an abs macro.
     * If defined, undef it to prevent a syntax error
     * and issue a warning.
     * #warning is a pragma (implementation-dependent action)
     */
    # ifdef abs
    # undef abs
    # warning abs macro collides with abs() prototype, undefining
    # endif

Conditional Inclusion
Parts of a program may be compiled conditionally
Example
    # ifndef NULL
    # ifdef __GNUG__
    # define NULL __null
    # else
    # define NULL 0L
    # endif
    # endif

Predefined Names
Several identifiers are predefined, and expand to produce special information. They, and also the preprocessor expression operator defined, may not be undefined or redefined.
    __LINE__   A decimal constant containing the current source line number
    __FILE__   A string literal containing the name of the file being compiled
    __DATE__   A string literal containing the date of compilation 'Mmm dd yyyy'
    __TIME__   A string literal containing the time of compilation 'hh:mm:ss'
    __STDC__   The constant 1. It is intended that this identifier be defined to be 1 only in standard-conforming implementations

Compilation
[Figure: the compiler translates HLL source code (text) into assembler source code (text); it may use temporary files and produces a translation listing with error messages.]

Assembly
[Figure: the assembler translates assembler source code (text) into binary code or object format (machine code and additional information); it may use temporary files and produces a translation listing with error messages and a symbol table.]

Linking
[Figure: the linker combines files in object format (machine code and additional information) with files in library object format (machine code and additional information) found by library search; it may use temporary files. The output is absolute code or relocatable code with additional information, plus a link map (address space usage) and a symbol list as text.]

Program Section In Virtual Memory
[Figure: sections after compilation and after linking. After compilation, every section starts at address 0 (section .text with code up to xx, section .data with initialized data up to yy); sections are "logical address spaces" of the compiler. After linking, all sections are placed "absolutely" in the address space from 0 to 0xffffffff, with the labelled addresses 0x08048244 and 0x08049370.]

Linking an Executable Binary
Input data: unlinked object files
    OBJ1: .text1, .data1, .bss1
    OBJ2: .text2, .bss2
    OBJ3: .text3, .data3, .bss3
Linking
Processing result: executable file (linked, relocated)
    OBJtotal: .text1, .text2, .text3, .data1, .data3, .bss1, .bss2, .bss3
.text: code
.data: initialized variables
.bss: uninitialized variables
Each object code (compiled separately) starts at address 0
Linking them together involves
- centralization of sections
- relocation of addresses

Relocation Records
- Once sections are placed one after another, relocation can start
- Executable code contains embedded addresses
  - Static data, function calls, jump targets
- On relocation those have to be changed inside the code
- Without a relocation table this is not possible
- A relocation record holds the relative address of a symbol (name of a variable, a function etc.)
    RELOCATION RECORDS FOR [.text]:
    OFFSET   TYPE              VALUE
    0000001a R_386_32          b
    00000023 R_386_32          a
    00000029 R_386_32          b

Source File: compile.c
    int a = 1;      /* Global variable, initialized   -> .data */
    int b;          /* Global variable, uninitialized -> .bss */

    int
    main(void)
    {
        static int c;   /* Local, static variable -> .bss */
        b = 5;
        c = b + a + 16;
        return c;
    }
- Compile a relocatable object file
    cc -c compile.c (creates compile.o)
- Linking an executable binary (one-step compilation)
    cc compile.c -o compile

Analysis of Object Files (compile.o)
    $ file compile.o
    ELF 32-bit LSB relocatable, Intel 80386, version 1, not stripped

    $ objdump -x compile.o
    compile.o:     file format elf32-i386
    compile.o
    architecture: i386, flags 0x00000011:
    HAS_RELOC, HAS_SYMS
    start address 0x00000000

    Sections:
    Idx Name      Size      VMA       LMA       File off  Algn
      0 .text     0000005a  00000000  00000000  00000034  2**2
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
      1 .data     00000004  00000000  00000000  00000090  2**2
                  CONTENTS, ALLOC, LOAD, DATA
      2 .bss      00000004  00000000  00000000  00000094  2**2
                  ALLOC
      3 .rodata   00000005  00000000  00000000  00000094  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA

Object File: compile.o (cont.)
    SYMBOL TABLE:
    00000000 l    df *ABS*    00000000 compile.c
    00000000 l    d  .text    00000000
    00000000 l    d  .data    00000000
    00000000 l    d  .bss     00000000
    00000000 l     O .bss     00000004 c.0
    00000000 l    d  .rodata  00000000
    00000000 g     O .data    00000004 a
    00000000 g     F .text    0000005a main
    00000004       O *COM*    00000004 b

    RELOCATION RECORDS FOR [.text]:
    OFFSET   TYPE              VALUE
    0000001a R_386_32          b
    00000023 R_386_32          a
    00000029 R_386_32          b
    00000031 R_386_32          .bss
    00000036 R_386_32          .bss
    0000004c R_386_32          .rodata

    compile.o:     file format elf32-i386
    Disassembly of section .text:
    00000000 <main>:
       0:  55                      push   %ebp
       1:  89 e5                   mov    %esp,%ebp
       3:  83 ec 18                sub    $0x18,%esp
       6:  83 e4 f0                and    $0xfffffff0,%esp
       9:  b8 00 00 00 00          mov    $0x0,%eax
       e:  29 c4                   sub    %eax,%esp
      10:  a1 00 00 00 00          mov    0x0,%eax
      15:  89 45 e8                mov    %eax,0xffffffe8(%ebp)
      18:  c7 05 00 00 00 00 05    movl   $0x5,0x0
      1f:  00 00 00
      22:  a1 00 00 00 00          mov    0x0,%eax
      27:  03 05 00 00 00 00       add    0x0,%eax
      2d:  83 c0 10                add    $0x10,%eax
      30:  a3 00 00 00 00          mov    %eax,0x0
      35:  a1 00 00 00 00          mov    0x0,%eax
      3a:  8b 55 e8                mov    0xffffffe8(%ebp),%edx
      3d:  3b 15 00 00 00 00       cmp    0x0,%edx
      43:  74 13                   je     58 <main+0x58>
      45:  83 ec 08                sub    $0x8,%esp
      48:  ff 75 e8                pushl  0xffffffe8(%ebp)
      4b:  68 00 00 00 00          push   $0x0
      50:  e8 fc ff ff ff          call   51 <main+0x51>
      55:  83 c4 10                add    $0x10,%esp
      58:  c9                      leave
      59:  c3                      ret

    compile.o:     file format elf32-i386
    Disassembly of section .text:
    00000000 <main>:
    int b;          /* Global variable, uninitialized -> .bss */
    int
    main(void)
    {
       0:  55                      push   %ebp
    ... 6 more lines ...
      15:  89 45 e8                mov    %eax,0xffffffe8(%ebp)
        static int c;   /* Local, static variable -> .bss */
        b = 5;
      18:  c7 05 00 00 00 00 05    movl   $0x5,0x0
      1f:  00 00 00
        c = b + a + 16;
      22:  a1 00 00 00 00          mov    0x0,%eax
      27:  03 05 00 00 00 00       add    0x0,%eax
      2d:  83 c0 10                add    $0x10,%eax
      30:  a3 00 00 00 00          mov    %eax,0x0
        return c;
      35:  a1 00 00 00 00          mov    0x0,%eax
    }
    ... 10 more lines ...

Executable Binary File: compile
    compile:     file format elf32-i386
    compile
    architecture: i386, flags 0x00000112:
    EXEC_P, HAS_SYMS, D_PAGED
    start address 0x1c000408

    Sections:
    Idx Name      Size      VMA       LMA       File off  Algn
    ...
      9 .text     00000214  1c000408  1c000408  00000408  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
    ...
     12 .data     00000014  3c001008  3c001008  00001008  2**2
                  CONTENTS, ALLOC, LOAD, DATA
    ...
     20 .bss      00000184  3c003100  3c003100  00001100  2**5
                  ALLOC

    SYMBOL TABLE:
    3c003140 l     O .bss     00000004 c.0
    3c003280 g     O .bss     00000004 b
    1c0005c0 g     F .text    0000005a main
    3c001018 g     O .data    00000004 a

    1c0005c0 <main>:
    int b;          /* Global variable, uninitialized -> .bss */
    int
    main(void)
    {
    1c0005c0:  55                      push   %ebp
    1c0005c1:  89 e5                   mov    %esp,%ebp
    1c0005c3:  83 ec 18                sub    $0x18,%esp
    1c0005c6:  83 e4 f0                and    $0xfffffff0,%esp
    1c0005c9:  b8 00 00 00 00          mov    $0x0,%eax
    1c0005ce:  29 c4                   sub    %eax,%esp
    1c0005d0:  a1 00 31 00 3c          mov    0x3c003100,%eax
    1c0005d5:  89 45 e8                mov    %eax,0xffffffe8(%ebp)
        static int c;   /* Local, static variable -> .bss */
        b = 5;
    1c0005d8:  c7 05 80 32 00 3c 05    movl   $0x5,0x3c003280
    1c0005df:  00 00 00
        c = b + a + 16;
    1c0005e2:  a1 18 10 00 3c          mov    0x3c001018,%eax
    1c0005e7:  03 05 80 32 00 3c       add    0x3c003280,%eax
    1c0005ed:  83 c0 10                add    $0x10,%eax
    1c0005f0:  a3 40 31 00 3c          mov    %eax,0x3c003140
        return c;
    1c0005f5:  a1 40 31 00 3c          mov    0x3c003140,%eax
    }

Relocation Of An Assembler Instruction
During the linking process relocated addresses are injected into the code, for example for the assignment b = 5;
    Before relocation (relocatable 'compile.o'):
          18:  c7 05 00 00 00 00 05    movl   $0x5,0x0
    After relocation (executable 'compile'):
    1c0005d8:  c7 05 80 32 00 3c 05    movl   $0x5,0x3c003280

Relocation Of An Assembler Instruction (cont.)
? How to find the right places in the machine code to perform the substitutions?
- Linker has relocation record (relative address) of b
    RELOCATION RECORDS FOR [.text]: (compile.o)
    0000001a R_386_32          b
- The proper address for b can be found in the symbol table.
    SYMBOL TABLE: (compile)
    3c003280 g     O .bss     00000004 b
  The symbol table for compile yields 3c003280 for variable b
- Linker has absolute address of main from symbol table
    SYMBOL TABLE: (compile)
    3c003280 g     O .bss     00000004 b
    1c0005c0 g     F .text    0000005a main

Relocation Of An Assembler Instruction (cont.)
Putting it all together:
    RELOCATION RECORDS FOR [.text]: (compile.o)
    0000001a R_386_32          b       (relative offset)
    SYMBOL TABLE: (compile)
    3c003280 g     O .bss     00000004 b       (abs. address of b)
    1c0005c0 g     F .text    0000005a main    (abs. address of main)
Computing the address where substitution must be performed:
    1c0005c0 + 0000001a = 1c0005da
          18:  c7 05 00 00 00 00 05    movl   $0x5,0x0
    1c0005d8:  c7 05 80 32 00 3c 05    movl   $0x5,0x3c003280