Computer Architecture - Moodle

Transcript

INSA-3IF
Computer Architecture
Session 2 - Appendix
Christian Wolf,
INSA-Lyon, IF Dept.
# 1 C. Wolf
Little Endian vs. Big Endian
-  The order of the bytes within a word can differ from one architecture to another
-  Little-endian: x86, MOS 6510
-  Big-endian: Motorola 68000
-  Configurable: ARM, PowerPC, SPARC
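As a small illustration of the two byte orders listed above, the following Python sketch (not part of the original slides) uses the standard `struct` module to lay out the same 32-bit word both ways. The value 0x12345678 is an arbitrary example.

```python
import struct
import sys

value = 0x12345678  # arbitrary 32-bit example word

little = struct.pack("<I", value)  # least significant byte at the lowest address
big = struct.pack(">I", value)     # most significant byte at the lowest address

print(little.hex())   # 78563412
print(big.hex())      # 12345678
print(sys.byteorder)  # byte order of the machine running this script
```

The last line depends on the host: it reports "little" on x86 machines.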
# 2 C. Wolf
Alignment
-  The connection between the CPU and memory has a granularity
–  generally > 1 byte,
–  typically 1 word (32 bits, 64 bits, etc.)
-  If an access is not aligned, several reads are needed
-  This work has to be done ... but by whom?
Reproduced from: IBM
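The cost of an unaligned access can be sketched with a simplified model, assuming (this is an assumption, not a statement from the slides) that memory is only transferred in whole, aligned words. The function name `bus_reads` and the default 4-byte word are hypothetical choices for illustration.

```python
def bus_reads(address: int, size: int, word_bytes: int = 4) -> int:
    """Number of aligned word transfers needed to read `size` bytes at `address`."""
    first_word = address // word_bytes
    last_word = (address + size - 1) // word_bytes
    return last_word - first_word + 1

print(bus_reads(0x1000, 4))  # 1: the access is aligned on a word boundary
print(bus_reads(0x1001, 4))  # 2: the access straddles two words
```

In this model the second access needs two reads, and the two partial words must then be combined, which is the extra work that either the CPU or a software routine has to perform.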
# 3 C. Wolf
Alignment
-  x86: handled by the CPU, no restrictions
-  ARM / Motorola 68000:
–  interrupts on the early models
–  handled by the CPU on the later models
-  Interrupt handling:
–  Handling by a software routine (slow)
–  Crash / request to the user to restart the machine (original Mac OS in 1984)
-  For an instruction to be atomic, the access must be aligned (this is complicated and related to virtual memory)
# 4 C. Wolf
Operand handling
-  Stack
-  Accumulator (a single register)
-  Extended accumulator
–  One general-purpose register
–  Other specialized registers
-  GPR (general-purpose registers)
–  Register-register ("load-store")
–  Register-memory
–  Advantage: simple allocation of variables by the compiler
Figure A.1 Operand locations for four instruction set architecture classes: (a) Stack, (b) Accumulator, (c) Register-memory, (d) Register-register/load-store. The arrows indicate whether the operand is an input or the result of the arithmetic-logical unit (ALU) operation, or both an input and result. Lighter shades indicate inputs, and the dark shade indicates the result. In (a), a Top Of Stack register (TOS) points to the top input operand, which is combined with the operand below. The first operand is removed from the stack, the result takes the place of the second operand, and TOS is updated to point to the result. All operands are implicit. In (b), the Accumulator is both an implicit input operand and a result. In (c), one input operand is a register, one is in memory, and the result goes to a register. All operands are registers in (d) and, like the stack architecture, can be transferred to memory only via separate instructions: push or pop for (a) and load or store for (d).
Stack      Accumulator   Register (register-memory)   Register (load-store)
Push A     Load A        Load R1,A                    Load R1,A
Push B     Add B         Add R3,R1,B                  Load R2,B
Add        Store C       Store R3,C                   Add R3,R1,R2
Pop C                                                 Store R3,C
Figure A.2 The code sequence for C = A + B for four classes of instruction sets. Note that the Add instruction has implicit operands for stack and accumulator architectures and explicit operands for register architectures.
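To make the difference between implicit and explicit operands concrete, here is a minimal Python sketch that mimics three of the code sequences for C = A + B. The function names and the dictionary-based memory are hypothetical; this is an illustration, not a model of any real machine.

```python
def run_stack(memory):
    """Stack machine: the operands of Add are implicit (top of stack)."""
    stack = []
    stack.append(memory["A"])                # Push A
    stack.append(memory["B"])                # Push B
    stack.append(stack.pop() + stack.pop())  # Add
    memory["C"] = stack.pop()                # Pop C
    return memory

def run_accumulator(memory):
    """Accumulator machine: the accumulator is an implicit operand and result."""
    acc = memory["A"]      # Load A
    acc += memory["B"]     # Add B
    memory["C"] = acc      # Store C
    return memory

def run_load_store(memory):
    """Load-store machine: Add names all three registers explicitly."""
    regs = {}
    regs["R1"] = memory["A"]              # Load R1,A
    regs["R2"] = memory["B"]              # Load R2,B
    regs["R3"] = regs["R1"] + regs["R2"]  # Add R3,R1,R2
    memory["C"] = regs["R3"]              # Store R3,C
    return memory

print(run_stack({"A": 2, "B": 3})["C"])       # 5
print(run_load_store({"A": 2, "B": 3})["C"])  # 5
```

All three variants leave A and B unchanged in memory, as the figure assumes.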
# 5 C. Wolf
Figure A.1 Operand locations for four instruction set architecture classes: (a) Stack, (b) Accumulator, (c) Register-memory, (d) Register-register/load-store.
Figure A.2 The code sequence for C = A + B for four classes of instruction sets. Note that the Add instruction has implicit operands for stack and accumulator architectures and explicit operands for register architectures. It is assumed that A, B, and C all belong in memory and that the values of A and B cannot be destroyed.
Reproduced from: Hennessy and Patterson
# 6 C. Wolf
Number of operands and addresses
Number of memory addresses   Maximum number of operands allowed   Type of architecture   Examples
0                            3                                    Load-store             Alpha, ARM, MIPS, PowerPC, SPARC, SuperH, TM32
1                            2                                    Register-memory        IBM 360/370, Intel 80x86, Motorola 68000, TI TMS320C54x
2                            2                                    Memory-memory          VAX (also has three-operand formats)
3                            3                                    Memory-memory          VAX (also has two-operand formats)
Figure A.3 Typical combinations of memory operands and total operands per typical ALU instruction with examples of computers. Computers with no memory reference per ALU instruction are called load-store or register-register computers. Instructions with multiple memory operands per typical ALU instruction are called register-memory or memory-memory, according to whether they have one or more than one memory operand.
Register-register (0, 3)
Advantages: Simple, fixed-length instruction encoding. Simple code generation model. Instructions take similar numbers of clocks to execute (see Appendix C).
Disadvantages: Higher instruction count than architectures with memory references in instructions. More instructions and lower instruction density lead to larger programs.
Register-memory (1, 2)
Advantages: Data can be accessed without a separate load instruction first. Instruction format tends to be easy to encode and yields good density.
Disadvantages: Operands are not equivalent since a source operand in a binary operation is destroyed. Encoding a register number and a memory address in each instruction may restrict the number of registers. Clocks per instruction vary by operand location.
Memory-memory (2, 2) or (3, 3)
Advantages: Most compact. Doesn’t waste registers for temporaries.
Disadvantages: Large variation in instruction size, especially for three-operand instructions. In addition, large variation in work per instruction. Memory accesses create memory bottleneck. (Not used today.)
Figure A.4 Advantages and disadvantages of the three most common types of general-purpose register computers. The notation (m, n) means m memory operands and n total operands. In general, computers with fewer alternatives simplify the compiler’s task since there are fewer decisions for the compiler to make (see Section A.8). Computers with a wide variety of flexible instruction formats reduce the number of bits required to encode the program. The number of registers also affects the instruction size since you need log2 (number of registers) for each register specifier in an instruction. Thus, doubling the number of registers takes 3 extra bits for a register-register architecture, or about 10% of a 32-bit instruction.
Reproduced from: Hennessy and Patterson
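The arithmetic at the end of the Figure A.4 caption can be checked with a short Python sketch. The helper names are hypothetical, and the register counts 32 and 64 are example values chosen for illustration.

```python
def specifier_bits(num_registers: int) -> int:
    """Bits needed to name one register, i.e. ceil(log2(num_registers))."""
    return (num_registers - 1).bit_length()

def register_field_bits(num_registers: int, operands: int = 3) -> int:
    """Total bits spent on register specifiers in one instruction."""
    return operands * specifier_bits(num_registers)

# Doubling from 32 to 64 registers in a three-operand register-register format
extra = register_field_bits(64) - register_field_bits(32)
print(extra)       # 3
print(extra / 32)  # 0.09375
```

Each of the three specifiers grows by one bit, so the instruction loses 3 of its 32 bits, which is the "about 10%" quoted in the caption.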
# 7 C. Wolf
Addressing modes
[...] of an object they will access. Addressing modes specify constants and registers in addition to locations in memory. When a memory location is used, the actual memory address specified by the addressing mode is called the effective address. Figure A.6 shows all the data addressing modes that have been used in recent computers. Immediates or literals are usually considered memory addressing [...]
Register
Example instruction: Add R4,R3
Meaning: Regs[R4] ← Regs[R4] + Regs[R3]
When used: When a value is in a register.
Immediate
Example instruction: Add R4,#3
Meaning: Regs[R4] ← Regs[R4] + 3
When used: For constants.
Displacement
Example instruction: Add R4,100(R1)
Meaning: Regs[R4] ← Regs[R4] + Mem[100 + Regs[R1]]
When used: Accessing local variables (+ simulates register indirect, direct addressing modes).
Register indirect
Example instruction: Add R4,(R1)
Meaning: Regs[R4] ← Regs[R4] + Mem[Regs[R1]]
When used: Accessing using a pointer or a computed address.
Indexed
Example instruction: Add R3,(R1 + R2)
Meaning: Regs[R3] ← Regs[R3] + Mem[Regs[R1] + Regs[R2]]
When used: Sometimes useful in array addressing: R1 = base of array; R2 = index amount.
Direct or absolute
Example instruction: Add R1,(1001)
Meaning: Regs[R1] ← Regs[R1] + Mem[1001]
When used: Sometimes useful for accessing static data; address constant may need to be large.
Memory indirect
Example instruction: Add R1,@(R3)
Meaning: Regs[R1] ← Regs[R1] + Mem[Mem[Regs[R3]]]
When used: If R3 is the address of a pointer p, then mode yields *p.
Autoincrement
Example instruction: Add R1,(R2)+
Meaning: Regs[R1] ← Regs[R1] + Mem[Regs[R2]]; Regs[R2] ← Regs[R2] + d
When used: Useful for stepping through arrays within a loop. R2 points to start of array; each reference increments R2 by size of an element, d.
Autodecrement
Example instruction: Add R1,–(R2)
Meaning: Regs[R2] ← Regs[R2] – d; Regs[R1] ← Regs[R1] + Mem[Regs[R2]]
When used: Same use as autoincrement. Autodecrement/-increment can also act as push/pop to implement a stack.
Scaled
Example instruction: Add R1,100(R2)[R3]
Meaning: Regs[R1] ← Regs[R1] + Mem[100 + Regs[R2] + Regs[R3] * d]
When used: Used to index arrays. May be applied to any indexed addressing mode in some computers.
Figure A.6 Selection of addressing modes with examples, meaning, and usage. In autoincrement/-decrement and scaled addressing modes, the variable d designates the size of the data item being accessed (i.e., whether the instruction is accessing 1, 2, 4, or 8 bytes). These addressing modes are only useful when the elements being accessed are adjacent in memory. RISC computers use displacement addressing to simulate register indirect with 0 [...]
# 8 C. Wolf
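The "Meaning" entries of Figure A.6 translate almost directly into code. The Python sketch below computes the effective address for several of the modes; the function `effective_address`, its parameter names, and the register and memory contents are all hypothetical and chosen only for illustration.

```python
def effective_address(mode, regs, mem, base=None, index=None, disp=0, d=4):
    """Effective address for a few of the addressing modes of Figure A.6."""
    if mode == "displacement":        # disp(base)
        return disp + regs[base]
    if mode == "register_indirect":   # (base)
        return regs[base]
    if mode == "indexed":             # (base + index)
        return regs[base] + regs[index]
    if mode == "direct":              # (disp)
        return disp
    if mode == "memory_indirect":     # @(base)
        return mem[regs[base]]
    if mode == "scaled":              # disp(base)[index], element size d
        return disp + regs[base] + regs[index] * d
    raise ValueError(f"unknown addressing mode: {mode}")

regs = {"R1": 1000, "R2": 8, "R3": 2}  # example register contents
mem = {1000: 2000}                     # example memory: a pointer stored at 1000

print(effective_address("displacement", regs, mem, base="R1", disp=100))         # 1100
print(effective_address("scaled", regs, mem, base="R2", index="R3", disp=100))   # 116
print(effective_address("memory_indirect", regs, mem, base="R1"))                # 2000
```

Calling the displacement mode with `disp=0` gives the same address as register indirect, which is how the caption says RISC computers simulate that mode.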
PC-relative jumps
Goal: limit the number of bits used to encode the destination
This number depends on:
–  the density of the code
–  whether instructions are forced to be aligned on word boundaries
[...]
■ Dynamically shared libraries (which allow a library to be loaded and linked at runtime only when it is actually invoked by the program rather than loaded and linked statically before the program is run)
In all four cases the target address is not known at compile time, and hence is usually loaded from memory into a register before the register indirect jump.
As branches generally use PC-relative addressing to specify their targets, an important question concerns how far branch targets are from branches. Knowing the distribution of these displacements will help in choosing what branch offsets to support, and thus will affect the instruction length and encoding. Figure A.1[...] shows the distribution of displacements for PC-relative branches in instructions. About 75% of the branches are in the forward direction.
[Figure: distribution of branch displacements. Vertical axis: percentage of distance (0% to 30%). Horizontal axis: bits of branch displacement (0 to 20). Series: integer average, floating-point average.]
# 9 C. Wolf
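The effect of instruction alignment on the number of displacement bits can be sketched as follows. This is a simplified model under stated assumptions: the offset is a signed two's-complement number measured from the address of the branch itself (real ISAs often measure from a later PC value), and the function name and addresses are hypothetical.

```python
def displacement_bits(branch_pc: int, target: int, instr_align: int = 1) -> int:
    """Bits for a signed two's-complement offset counted in units of instr_align bytes."""
    offset = target - branch_pc
    if offset % instr_align != 0:
        raise ValueError("target is not aligned on an instruction boundary")
    units = offset // instr_align
    if units >= 0:
        return units.bit_length() + 1       # magnitude bits plus sign bit
    return (-units - 1).bit_length() + 1    # negative range reaches one value further

print(displacement_bits(0x1000, 0x1400))      # 12: offset counted in bytes
print(displacement_bits(0x1000, 0x1400, 4))   # 10: offset counted in 4-byte words
```

When instructions are forced onto 4-byte boundaries, the two low bits of every offset are zero and need not be stored, so the same target is reached with two fewer bits.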
Encoding
Goals (often contradictory)
-  Have as many registers and addressing modes as possible
-  Limit the impact of the size of the register fields
–  SPARC & IA-64 solution: sliding register windows
-  Have instructions that are easy to decode for pipelined execution (fixed length is better than variable length)
-  Have compact code
–  variable length is better than fixed length
–  x86 code is generally more compact than RISC code
–  ARM solution: a 16-bit instruction subset (Thumb): denser, slower
# 10 C. Wolf
Encoding: orthogonality
An instruction set is orthogonal if every instruction can use every addressing mode.
⇒ Independence of the encoding
-  Advantages:
–  Simple instruction decoding
–  Expressive power
–  Elegance (a very subjective criterion)
-  Drawbacks:
–  Inefficient from the point of view of information compression
# 11 C. Wolf
Orthogonality
Is the ARM ISA orthogonal?
Reproduced from: Mark McDermott, U. Texas
# 12 C. Wolf
Handling immediate values
-  x86 solution: large constants can be loaded into registers
–  mov reg, 0x12345678ABCDEF00
–  Drawback: requires variable-length instructions, which are complicated to decode.
-  Reminder of the ARM solution:
–  12 bits are available to encode an immediate value
–  The value is built by a binary rotation to the right:
•  N = 8 bits for a base value
•  R = 4 bits for a rotation amount
•  V = ROR(N, 2*R)
•  Not all values can be represented!
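The rule V = ROR(N, 2*R) can be explored with a small Python sketch that searches for an encoding of a given 32-bit constant. The helper names `ror32` and `encode_immediate` are hypothetical, and the search simply returns the first rotation that works, which is an illustrative choice rather than what any particular assembler does.

```python
MASK32 = 0xFFFFFFFF

def ror32(value: int, amount: int) -> int:
    """Rotate a 32-bit value right by `amount` bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & MASK32

def encode_immediate(value: int):
    """Return (N, R) such that ror32(N, 2 * R) == value, or None if impossible."""
    for r in range(16):
        # Rotating left by 2*r undoes a right rotation by 2*r
        n = ror32(value, 32 - 2 * r)
        if n <= 0xFF:
            return n, r
    return None

print(encode_immediate(0xFF))        # (255, 0)
print(encode_immediate(0xFF000000))  # (255, 4)
print(encode_immediate(0x101))       # None: the set bits span 9 bits
print(encode_immediate(0x12345678))  # None
```

Constants for which the search returns None have to be built some other way, for instance from several instructions or by loading them from memory.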
# 13 C. Wolf
ARM: conditional instructions
In the ARM ISA (as in the IA-64 ISA), any instruction can be conditional:
Reproduced from: Mark McDermott, U. Texas
# 14 C. Wolf
ARM: conditional instructions
With a branch:
cmp r0, r1
blt .L1
mov r2, r0
b .L2
.L1:
mov r2, r1
.L2:
With conditional instructions:
cmp r0, r1
movge r2, r0
movlt r2, r1
Even when conditional instructions are not executed, they are still fetched from memory.
Beyond about 3 conditional instructions, the branch version becomes faster.
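Both ARM sequences above compute r2 = max(r0, r1). The Python sketch below mirrors them instruction by instruction to show that they are equivalent; the function names are hypothetical and the sketch models only the result, not timing or instruction fetch.

```python
def max_with_branch(r0: int, r1: int) -> int:
    """Mirrors the branching sequence: cmp / blt / mov / b / mov."""
    if r0 < r1:      # cmp r0, r1 ; blt .L1
        r2 = r1      # .L1: mov r2, r1
    else:
        r2 = r0      # mov r2, r0 ; b .L2
    return r2        # .L2:

def max_with_conditionals(r0: int, r1: int) -> int:
    """Mirrors the conditional sequence: both mov instructions are fetched."""
    ge = r0 >= r1    # cmp r0, r1 sets the condition flags
    r2 = None
    if ge:           # movge r2, r0 (takes effect only if r0 >= r1)
        r2 = r0
    if not ge:       # movlt r2, r1 (takes effect only if r0 < r1)
        r2 = r1
    return r2

print(max_with_branch(3, 7), max_with_conditionals(3, 7))  # 7 7
```

In the conditional version exactly one of the two moves takes effect, but both occupy a slot in the instruction stream, which is why long conditional sequences eventually lose to a branch.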
# 15 C. Wolf
Interrupts
-  Reactions to events
–  internal (arithmetic exceptions, forbidden memory accesses)
–  external (hardware, task switching, etc.)
-  The running program is interrupted, and a subroutine is then called
-  Atomicity problem for operations such as call, push, pop, etc.
-  ARM: a different solution
–  The bl (branch and link) instruction saves the PC in R14 and then performs the jump
–  Interrupt routines have their own version of R14 (as for SP/R13)
# 16 C. Wolf