Computer Architecture - Moodle
Transcript
INSA-3IF Computer Architecture
Session 2 - Appendix
Christian Wolf, INSA-Lyon, Dép. IF
# 1 C. Wolf

Little Endian vs. Big Endian
- The order of the bytes within a word can differ from one architecture to another
- Little-Endian: x86, MOS 6510
- Big-Endian: Motorola 68000
- Configurable: ARM, PowerPC, SPARC

Alignment
- The connection between the CPU and memory has a granularity
  – generally > 1 byte,
  – typically 1 word (32 bits, 64 bits, etc.)
- If an access is not aligned: several reads are needed
- This work has to be done ... but by whom?
Reproduced from: IBM

Alignment
- x86: handled by the CPU, no restrictions
- ARM / Motorola 68000:
  – interrupts on the first models
  – handled by the CPU on the later models
- Handling of these interrupts:
  – by a software routine (slow)
  – crash / the user is asked to restart the machine (original Mac OS, 1984)
- For an instruction to be atomic, the access must be aligned (this is complicated and related to virtual memory)

Operand handling
- Stack
- Accumulator (a single register)
- Extended accumulator
  – One general-purpose register
  – Other specialized registers
- GPR (general-purpose registers)
  – Register-register ("load-store")
  – Register-memory
  – Advantage: simple allocation of variables by the compiler

[Figure A.1, four panels: (a) Stack, (b) Accumulator, (c) Register-memory, (d) Register-register/load-store]
Figure A.1 Operand locations for four instruction set architecture classes. The arrows indicate whether the operand is an input or the result of the arithmetic-logical unit (ALU) operation, or both an input and result. Lighter shades indicate inputs, and the dark shade indicates the result. In (a), a Top Of Stack register (TOS) points to the top input operand, which is combined with the operand below. The first operand is removed from the stack, the result takes the place of the second operand, and TOS is updated to point to the result. All operands are implicit. In (b), the Accumulator is both an implicit input operand and a result. In (c), one input operand is a register, one is in memory, and the result goes to a register. All operands are registers in (d) and, like the stack architecture, can be transferred to memory only via separate instructions: push or pop for (a) and load or store for (d).

Stack  | Accumulator | Register (register-memory) | Register (load-store)
Push A | Load A      | Load R1,A                  | Load R1,A
Push B | Add B       | Add R3,R1,B                | Load R2,B
Add    | Store C     | Store R3,C                 | Add R3,R1,R2
Pop C  |             |                            | Store R3,C
Figure A.2 The code sequence for C = A + B for four classes of instruction sets. Note that the Add instruction has implicit operands for stack and accumulator architectures and explicit operands for register architectures. It is assumed that A, B, and C all belong in memory and that the values of A and B cannot be destroyed. Figure A.1 shows the ...
Reproduced from: Hennessy and Patterson

Number of operands and addresses
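To make the operand counts concrete, the four C = A + B sequences of Figure A.2 can be run on toy machines. This is only a minimal sketch: the tuple-based instruction format and the names `run_stack`, `run_accumulator`, `run_register` and `run_all` are assumptions made for illustration, not something taken from the slides or from Hennessy and Patterson.

```python
# Toy interpreters for the four code sequences of Figure A.2 (C = A + B).
# The instruction format (tuples of strings) is an assumption of this sketch.

def run_stack(program, mem):
    """Stack machine: Add takes both operands implicitly from the stack."""
    stack = []
    for op, *args in program:
        if op == "Push":
            stack.append(mem[args[0]])
        elif op == "Add":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif op == "Pop":
            mem[args[0]] = stack.pop()
    return mem


def run_accumulator(program, mem):
    """Accumulator machine: the accumulator is an implicit input and result."""
    acc = 0
    for op, addr in program:
        if op == "Load":
            acc = mem[addr]
        elif op == "Add":
            acc += mem[addr]
        elif op == "Store":
            mem[addr] = acc
    return mem


def run_register(program, mem):
    """Register machine: all Add operands are explicit.

    For simplicity one interpreter serves both register classes: an Add
    source that is not a register name is read from memory (register-memory);
    a load-store program simply never uses that possibility.
    """
    regs = {}

    def value(name):
        return regs[name] if name in regs else mem[name]

    for op, *args in program:
        if op == "Load":
            regs[args[0]] = mem[args[1]]
        elif op == "Add":
            dst, src1, src2 = args
            regs[dst] = value(src1) + value(src2)
        elif op == "Store":
            mem[args[1]] = regs[args[0]]
    return mem


PROGRAMS = {
    "stack": (run_stack,
              [("Push", "A"), ("Push", "B"), ("Add",), ("Pop", "C")]),
    "accumulator": (run_accumulator,
                    [("Load", "A"), ("Add", "B"), ("Store", "C")]),
    "register-memory": (run_register,
                        [("Load", "R1", "A"), ("Add", "R3", "R1", "B"),
                         ("Store", "R3", "C")]),
    "load-store": (run_register,
                   [("Load", "R1", "A"), ("Load", "R2", "B"),
                    ("Add", "R3", "R1", "R2"), ("Store", "R3", "C")]),
}


def run_all(a, b):
    """Return {class name: (value of C, number of instructions)}."""
    results = {}
    for name, (runner, program) in PROGRAMS.items():
        mem = runner(program, {"A": a, "B": b, "C": 0})
        results[name] = (mem["C"], len(program))
    return results


if __name__ == "__main__":
    for name, (c, count) in run_all(2, 3).items():
        print(f"{name}: C = {c} using {count} instructions")
```

All four machines compute the same C, but the stack and load-store sequences need four instructions while the accumulator and register-memory sequences need three, which is the trade-off between instruction count and operand encoding that this slide is about.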
Number of memory addresses | Maximum number of operands allowed | Type of architecture | Examples
0 | 3 | Load-store      | Alpha, ARM, MIPS, PowerPC, SPARC, SuperH, TM32
1 | 2 | Register-memory | IBM 360/370, Intel 80x86, Motorola 68000, TI TMS320C54x
2 | 2 | Memory-memory   | VAX (also has three-operand formats)
3 | 3 | Memory-memory   | VAX (also has two-operand formats)
Figure A.3 Typical combinations of memory operands and total operands per typical ALU instruction with examples of computers. Computers with no memory reference per ALU instruction are called load-store or register-register computers. Instructions with multiple memory operands per typical ALU instruction are called register-memory or memory-memory, according to whether they have one or more than one memory operand.

Type | Advantages | Disadvantages
Register-register (0, 3) | Simple, fixed-length instruction encoding. Simple code generation model. Instructions take similar numbers of clocks to execute (see Appendix C). | Higher instruction count than architectures with memory references in instructions. More instructions and lower instruction density lead to larger programs.
Register-memory (1, 2) | Data can be accessed without a separate load instruction first. Instruction format tends to be easy to encode and yields good density. | Operands are not equivalent since a source operand in a binary operation is destroyed. Encoding a register number and a memory address in each instruction may restrict the number of registers. Clocks per instruction vary by operand location.
Memory-memory (2, 2) or (3, 3) | Most compact. Doesn't waste registers for temporaries. | Large variation in instruction size, especially for three-operand instructions. In addition, large variation in work per instruction. Memory accesses create memory bottleneck. (Not used today.)
Figure A.4 Advantages and disadvantages of the three most common types of general-purpose register computers. The notation (m, n) means m memory operands and n total operands. In general, computers with fewer alternatives simplify the compiler's task since there are fewer decisions for the compiler to make (see Section A.8). Computers with a wide variety of flexible instruction formats reduce the number of bits required to encode the program. The number of registers also affects the instruction size since you need log2 (number of registers) for each register specifier in an instruction. Thus, doubling the number of registers takes 3 extra bits for a register-register architecture, or about 10% of a 32-bit instruction.
Reproduced from: Hennessy and Patterson

Addressing modes
... of an object they will access. Addressing modes specify constants and registers in addition to locations in memory. When a memory location is used, the actual memory address specified by the addressing mode is called the effective address. Figure A.6 shows all the data addressing modes that have been used in recent computers. Immediates or literals are usually considered memory addressing ...

Addressing mode | Example instruction | Meaning | When used
Register | Add R4,R3 | Regs[R4] ← Regs[R4] + Regs[R3] | When a value is in a register.
Immediate | Add R4,#3 | Regs[R4] ← Regs[R4] + 3 | For constants.
Displacement | Add R4,100(R1) | Regs[R4] ← Regs[R4] + Mem[100 + Regs[R1]] | Accessing local variables (+ simulates register indirect, direct addressing modes).
Register indirect | Add R4,(R1) | Regs[R4] ← Regs[R4] + Mem[Regs[R1]] | Accessing using a pointer or a computed address.
Indexed | Add R3,(R1 + R2) | Regs[R3] ← Regs[R3] + Mem[Regs[R1] + Regs[R2]] | Sometimes useful in array addressing: R1 = base of array; R2 = index amount.
Direct or absolute | Add R1,(1001) | Regs[R1] ← Regs[R1] + Mem[1001] | Sometimes useful for accessing static data; address constant may need to be large.
Memory indirect | Add R1,@(R3) | Regs[R1] ← Regs[R1] + Mem[Mem[Regs[R3]]] | If R3 is the address of a pointer p, then mode yields *p.
Autoincrement | Add R1,(R2)+ | Regs[R1] ← Regs[R1] + Mem[Regs[R2]]; Regs[R2] ← Regs[R2] + d | Useful for stepping through arrays within a loop. R2 points to start of array; each reference increments R2 by size of an element, d.
Autodecrement | Add R1,–(R2) | Regs[R2] ← Regs[R2] – d; Regs[R1] ← Regs[R1] + Mem[Regs[R2]] | Same use as autoincrement. Autodecrement/-increment can also act as push/pop to implement a stack.
Scaled | Add R1,100(R2)[R3] | Regs[R1] ← Regs[R1] + Mem[100 + Regs[R2] + Regs[R3] * d] | Used to index arrays. May be applied to any indexed addressing mode in some computers.
Figure A.6 Selection of addressing modes with examples, meaning, and usage. In autoincrement/-decrement and scaled addressing modes, the variable d designates the size of the data item being accessed (i.e., whether the instruction is accessing 1, 2, 4, or 8 bytes). These addressing modes are only useful when the elements being accessed are adjacent in memory. RISC computers use displacement addressing to simulate register indirect with 0 ...

PC-relative jumps
■ Dynamically shared libraries (which allow a library to be loaded and linked at runtime only when it is actually invoked by the program rather than loaded and linked statically before the program is run)
In all four cases the target address is not known at compile time, and hence is usually loaded from memory into a register before the register indirect jump.
As branches generally use PC-relative addressing to specify their targets, an important question concerns how far branch targets are from branches. Knowing the distribution of these displacements will help in choosing what branch offsets to support, and thus will affect the instruction length and encoding. Figure A.1... shows the distribution of displacements for PC-relative branches in instructions. About 75% of the branches are in the forward direction.

Goal: limit the number of bits used to encode the destination
This number depends on
  – the code density
  – whether instructions are forced to be aligned on word boundaries
[Figure: distribution of PC-relative branch displacements. x-axis: bits of branch displacement (0 to 20); y-axis: percentage of distance (0% to 30%); series: integer average, floating-point average]

Encoding
Goals (often contradictory)
- Have as many registers and addressing modes as possible
- Limit the impact of the size of the register fields
  – SPARC & IA64 solution: sliding windows
- Have instructions that are easy to decode for pipelined execution (fixed length is better than variable length)
- Have compact code
  – variable length is better than fixed length
  – x86 code is generally more compact than RISC code
  – ARM solution: a 16-bit subset of the instruction set (THUMB): denser, slower

Encoding: orthogonality
An instruction set is orthogonal if all instructions can use all addressing modes.
⇒ Independence of the encoding
- Advantages:
  – Simple instruction decoding
  – Expressive power
  – Elegance (a very subjective criterion)
- Disadvantages:
  – Inefficient from the point of view of information compression

Orthogonality
Is the "ARM" ISA orthogonal?
Reproduced from: Mark McDermott, U. Texas
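The encoding slides above tie instruction size to two field widths: the branch displacement and the register specifiers. The following Python sketch illustrates both calculations under stated assumptions: the function names are hypothetical, the displacement is measured from the address of the branch itself (real ISAs usually measure it from a later PC value), and the displacement is stored as a two's-complement number of alignment units.

```python
def displacement_bits(branch_addr: int, target_addr: int, alignment: int = 1) -> int:
    """Bits needed to store a signed PC-relative displacement.

    With forced alignment, the low bits of every displacement are zero,
    so only displacement / alignment has to be encoded.
    """
    displacement = target_addr - branch_addr
    if displacement % alignment != 0:
        raise ValueError("target is not aligned relative to the branch")
    units = displacement // alignment
    # Two's complement: n bits cover the range -2**(n-1) .. 2**(n-1) - 1.
    if units >= 0:
        return units.bit_length() + 1
    return (-units - 1).bit_length() + 1


def register_specifier_bits(num_registers: int) -> int:
    """ceil(log2(num_registers)): bits for one register specifier."""
    return (num_registers - 1).bit_length()


def register_bits_per_instruction(num_registers: int, specifiers: int = 3) -> int:
    """Total register-field bits for an instruction with several specifiers."""
    return specifiers * register_specifier_bits(num_registers)


if __name__ == "__main__":
    # A forward branch of 64 bytes: byte granularity vs. 4-byte alignment.
    print(displacement_bits(0x1000, 0x1040))               # byte offsets
    print(displacement_bits(0x1000, 0x1040, alignment=4))  # word offsets
    # Three-operand register-register format with 32 and then 64 registers.
    print(register_bits_per_instruction(32), register_bits_per_instruction(64))
```

With 4-byte alignment the same branch needs two bits fewer, and going from 32 to 64 registers costs one extra bit per specifier, that is 3 bits for a three-operand register-register instruction, in line with the Figure A.4 caption.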
Handling immediate values
- x86 solution: large constants can be loaded into registers
  – mov reg, 0x12345678ABCDEF00
  – Drawback: requires variable-length instructions, which are complicated to decode.
- Reminder of the ARM solution:
  – 12 bits are available to encode an immediate value
  – The value is built by a binary rotate right:
    • N = 8 bits for a base value
    • R = 4 bits for a shift
    • V = ROR(N, 2*R)
    • Not all values can be represented!

ARM: conditional instructions
In the ARM ISA (as in the IA86 ISA), any instruction can be conditional:
Reproduced from: Mark McDermott, U. Texas

ARM: conditional instructions
With a branch:
        cmp   r0, r1
        blt   .L1
        mov   r2, r0
        b     .L2
    .L1:
        mov   r2, r1
    .L2:
With conditional instructions:
        cmp   r0, r1
        movge r2, r0
        movlt r2, r1
Even when conditional instructions are not executed, they are still fetched from memory.
Beyond roughly 3 conditional instructions, the branch becomes faster.

Interrupts
- Reactions to events
  – internal (arithmetic exceptions, forbidden memory accesses)
  – external (hardware, task switching, etc.)
- The current program is interrupted, followed by a call to a subroutine
- Problem of atomicity of the operations call, push, pop, etc.
- ARM: a different solution
  – The bl (branch and link) instruction saves the PC in R14, then jumps
  – Interrupt routines have their own version of R14 (as for SP/R13)
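Coming back to the ARM immediate encoding described above (V = ROR(N, 2*R) with an 8-bit N and a 4-bit R), the following Python sketch shows why not every 32-bit value can be represented. The function names are hypothetical and the search simply returns the first (R, N) pair that works; it is an illustration of the rule on the slide, not the assembler's actual algorithm.

```python
MASK32 = 0xFFFFFFFF


def ror32(value: int, amount: int) -> int:
    """Rotate a 32-bit value right by `amount` bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & MASK32


def decode_immediate(r: int, n: int) -> int:
    """V = ROR(N, 2*R), with N on 8 bits and R on 4 bits."""
    assert 0 <= r < 16 and 0 <= n < 256
    return ror32(n, 2 * r)


def encode_immediate(value: int):
    """Return one (R, N) pair encoding `value`, or None if there is none."""
    for r in range(16):
        # Undo the rotation: rotating left by 2*R is rotating right by 32 - 2*R.
        n = ror32(value, 32 - 2 * r)
        if n <= 0xFF:
            return r, n
    return None


if __name__ == "__main__":
    for v in (0xFF, 0xFF000000, 0x104, 0x102, 0x12345678):
        print(hex(v), encode_immediate(v))
```

Values such as 0xFF000000 or 0x104 fit because their set bits lie within an 8-bit window reachable by an even rotation; 0x102 would need an odd rotation and 0x12345678 has bits spread over the whole word, so neither can be encoded in a single instruction.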