FAME – Operating System - Memory management - Etis
FAME – Operating System
Memory management
2012 – David Picard
Contributions: Arnaud Revel, Mickaël Maillard
ÉCOLE NATIONALE SUPÉRIEURE DE L'ÉLECTRONIQUE ET DE SES APPLICATIONS

Process address space
● The process address space (PAS) is the whole memory as seen by the process
● Properties:
  – Total (if an address is 32 bits long, then the process address space is 4 GB)
  – Continuous (every address between 0x00 and the highest address is accessible)
  – Private (only the process and the kernel have access to it)

Splitting the process address space
● 5 zones:
  – A reserved space
  – Code zone (often called text)
  – Data zone
  – Stack
  – Kernel
● Kernel zone is only accessible by the kernel
● Code, Data and Stack are accessible in user mode
● Layout, from the highest address to the lowest:
  0xffffffff
      kernel
  0xc0000000
      stack
      heap
  0x8xxxxxxx
      text
  0x80000000
      XXXXXXX
  0x00000000

Several processes...
● With several processes, the properties of each process address space become incompatible:
  – Total: not all computers have 4 GB of RAM (let alone 256 TB for x86-64)
  – Private and continuous: with several processes, it's either continuous or private.
● Kernel zone and reserved zone are shared among processes

Virtual Memory
● Addresses in the PAS are not real, physical addresses
● 2 mechanisms:
  – Segmentation
  – Paging
● Logical addresses are the ones in the PAS
● Translation chain: Logical addresses → Segmentation → Linear addresses → Paging → Physical addresses

Segmentation
● Memory is divided into segments
● Any logical address has two parts:
  – One id called the segment selector
  – One offset relative to the selected segment
● On x86, registers are dedicated to specific segments:
  – cs: code segment
  – ss: stack segment
  – ds: data segment

Segment descriptor
● Each segment is represented by a segment descriptor (8 bytes)
● Descriptors are stored in a table (either global, the GDT, or local, the LDT)
● There is only one GDT, whereas each process has its own LDT
● Segment descriptors contain:
  – The address of the first byte
  – Size of the segment
  – Type of segment (cs, ss, ds, ...)
  – Some rights-management information (think of the famous segmentation fault)

Types of descriptors
● Code segment descriptors:
  – Describe a code segment (GDT or LDT)
● Data segment descriptors:
  – Describe a data segment or a stack segment (GDT or LDT)
● Task State Segment (TSS) descriptors:
  – Describe a segment with information on a process context (remember the task_struct of processes); only in the GDT
● Local descriptor table descriptors:
  – Describe a segment containing an LDT (only in the GDT)

Segment selector
● 2 fields:
  – Index in the descriptor table
  – Flag specifying the GDT or LDT (TI)
● Selector layout: Index | TI
● 2 registers store the address of the GDT (GDTR) and of the current LDT (LDTR)

Linear address computation
[Figure: the TI flag of the selector chooses GDTR or LDTR as the table base; descriptor address = table base + 8 * Index; linear address = base field of the segment descriptor + offset]

Paging
● Splitting linear addresses into pages
● Splitting the physical memory into page frames
● Map between pages and page frames
  – Non-linear mapping (addresses are not continuous)
  – Rights management (page fault exception)
  – One mapping per process

Paging
[Figure: process linear addresses (0x00000000 to 0xffffffff: XXXXXXX, text at 0x80000000, heap at 0x8xxxxxxx, stack, kernel at 0xc0000000; pages 1 to 5; example address 0x9a5624ce) mapped onto physical memory (frames 1 to 5, interleaved with frames of other processes)]

Linear addresses
● Split into 3 fields:
  – Directory index: selects an entry of the page directory, i.e. a page table
  – Table index: selects an entry of that page table, i.e. a page
  – Offset: position inside the page
● Field layout: directory | table | offset

Physical address computation
[Figure: cr3 (register with the directory address) + directory index gives the page directory entry, which points to a page table; page table + table index gives the page table entry, which points to a page frame; page frame + offset gives the physical address]

Directory and table structures
Entries of both structures contain:
● Present flag: is the page in RAM? (if 0, page fault exception)
● Address of a physical memory zone containing either a table (for directories) or a page (for tables)
● Dirty flag: the page frame has been written
● Read/Write flag: rights management on the page frame
● User/Supervisor flag: defines the level of permission required to access the page

Paging in Linux
● 4 levels of indirection (3 directories, 1 table and the offset)
● Each process has its own private global directory and associated directories and tables
● During a context switch, the address of the global directory is stored in the task_struct
● The size of a page is fixed by the macro PAGE_SIZE
● The kernel uses its own page tables and maintains their allocation
● No segmentation in Linux by default

Translation scenarios
During the translation of a linear address to a physical address, several things can happen:
● Table entry is valid: the address is replaced by the corresponding entry in the table, everything is OK
● Entry is not valid: we must find a free page frame, associate it with the requested page and push the association into the tables
● Entry is valid, but the read/write rights are not: throw an exception
● Entry is valid, but the context rights (user/supervisor) are not: throw an exception
● Entry is valid but the page is not in RAM: get the page from the disk (swap), find a free page frame in RAM, copy from the first to the second, update the association.
Page fault exception
● Does the requested address lie in the PAS?
  – Yes: does the access type correspond to the memory zone rights?
    Yes → legal access: allocate a new frame
    No → illegal access: send the SIGSEGV signal (the famous segmentation fault)
  – No: is this a user mode exception?
    Yes → illegal access: send the SIGSEGV signal
    No → Oops, kernel error: kill that process

Hardware acceleration
● Modern processors have built-in hardware decoding circuits for segmentation and paging (MMU – memory management unit)
  – Fill in the registers containing the addresses of the VM structures (directories, tables, ...)
  – Activate segmentation or paging via a control register
  – The MMU does everything on its own...

Virtual memory advantages
● Memory abstraction (independence from specific hardware implementations)
● Easy rights management
● Hides memory fragmentation (greatly simplifies the job of programmers)
● All the complex memory management stuff is hidden in the OS
● Avoids a lot of reallocation
● ...
Cache
● Accessing RAM is costly
● Processors have a small internal memory zone named cache
● Thanks to a hardware mechanism, frequently used data is kept in this cache
● If a piece of data is not in the cache (cache miss), it is loaded into it

Translation Lookaside Buffers
● TLBs accelerate the conversion from linear addresses to physical addresses
● Small buffers that store the mapping between linear and physical addresses
● The kernel manages the validity of the mappings in the TLBs (via flush_tlb_*), because only the kernel handles this mapping
● Whenever the register storing the address of the global directory is changed, all TLBs are invalidated
● During a process switch, TLBs are thus cleared

Memory descriptors
● struct mm_struct: contains information on the address space of a process
● Referenced by the mm field of the process descriptor
● Memory descriptors are kept in a doubly linked list
● Sharing an mm_struct between several processes is possible (syscalls clone() and fork())

mm_struct (mm_types.h)
struct mm_struct {
    struct vm_area_struct *mmap;    /* list of VMAs */
    ...
    pgd_t *pgd;                     /* page global directory */
    atomic_t mm_users;              /* How many users with user space? */
    atomic_t mm_count;              /* How many references to "struct mm_struct" (users count as 1) */
    int map_count;                  /* number of VMAs */
    ...
    unsigned long total_vm, locked_vm, shared_vm, exec_vm;
    unsigned long stack_vm, reserved_vm, def_flags, nr_ptes;
    unsigned long start_code, end_code, start_data, end_data;  /* code and data */
    unsigned long start_brk, brk, start_stack;                 /* heap and stack */
    unsigned long arg_start, arg_end, env_start, env_end;      /* arguments and environment */
    ...
};

Memory areas
● Represented by the vm_area_struct structure
● Each one encodes an interval of linear addresses
● No overlap
● Merging or splitting is possible at allocation/reallocation if rights are the same on the 2 regions

vm_area_struct (mm_types.h)
/*
 * This struct defines a memory VMM memory area. There is one of these
 * per VM-area/task. A VM area is any part of the process virtual memory
 * space that has a special rule for the page-fault handlers (ie a shared
 * library, the executable area etc).
 */
struct vm_area_struct {
    struct mm_struct *vm_mm;        /* The address space we belong to. */
    unsigned long vm_start;         /* Our start address within vm_mm. */
    unsigned long vm_end;           /* The first byte after our end address within vm_mm. */

    /* linked list of VM areas per task, sorted by address */
    struct vm_area_struct *vm_next, *vm_prev;

    pgprot_t vm_page_prot;          /* Access permissions of this VMA. */
    unsigned long vm_flags;         /* Flags, see mm.h. */
    ...
    /* Function pointers to deal with this struct. */
    const struct vm_operations_struct *vm_ops;
    ...
};

Area operations
/*
 * These are the virtual MM functions - opening of an area, closing and
 * unmapping it (needed to keep files on disk up-to-date etc), pointer
 * to the functions called when a no-page or a wp-page exception occurs.
 */
struct vm_operations_struct {
    void (*open)(struct vm_area_struct *area);
    void (*close)(struct vm_area_struct *area);
    int (*fault)(struct vm_area_struct *vma, struct vm_fault *vmf);

    /* notification that a previously read-only page is about to become
     * writable, if an error is returned it will cause a SIGBUS */
    int (*page_mkwrite)(struct vm_area_struct *vma, struct vm_fault *vmf);
    ...
};

Area rights management
/*
 * vm_flags in vm_area_struct, see mm_types.h.
 */
#define VM_READ         0x00000001  /* currently active flags */
#define VM_WRITE        0x00000002
#define VM_EXEC         0x00000004
#define VM_SHARED       0x00000008

/* mprotect() hardcodes VM_MAYREAD >> 4 == VM_READ, and so for r/w/x bits. */
#define VM_MAYREAD      0x00000010  /* limits for mprotect() etc */
#define VM_MAYWRITE     0x00000020
#define VM_MAYEXEC      0x00000040
#define VM_MAYSHARE     0x00000080

#define VM_GROWSDOWN    0x00000100  /* general info on the segment */
#define VM_DONTCOPY     0x00020000  /* Do not copy this vma on fork */
#define VM_DONTEXPAND   0x00040000  /* Cannot expand with mremap() */
#define VM_RESERVED     0x00080000  /* Count as reserved_vm like IO */
#define VM_ACCOUNT      0x00100000  /* Is a VM accounted object */
#define VM_NORESERVE    0x00200000  /* should the VM suppress accounting */
#define VM_HUGETLB      0x00400000  /* Huge TLB Page VM */
#define VM_NONLINEAR    0x00800000  /* Is non-linear (remap_file_pages) */

Swap
● Sometimes, the available RAM is not sufficient for storing everything (code and data)
● A major advantage of virtual memory is that pages do not all need to be in RAM
● A zone on the hard disk, called swap, can be reserved to store some pages
● The kernel handles the transfer between RAM and swap
  – By keeping the most used pages in RAM
  – By handling the page fault exception when a requested page is not in RAM but in swap

Handling swap
● Sometimes, a process requests a page that is in swap while RAM is 100% occupied
  – The system chooses a page (hopefully one with a low request rate)
  – Puts it in swap
  – Uses the freed page frame for the requested page
  – Updates the corresponding tables
● Sometimes the chosen page contains data that can safely be erased (cache data, for example)
  – In that case, the system simply erases its content
  – And updates the tables

Memory allocators
● As we have seen, the kernel handles the memory
● Whenever a process needs more memory, it asks the kernel via a system call
● In practice, the request is made by calling a libc function:
  – malloc() or brk(), etc.

Managing the heap
● malloc(size): allocates size bytes
● calloc(n, size): allocates n elements of size size, initialized to 0
● realloc(ptr, size): changes the size of a memory zone allocated with malloc or calloc
● free(ptr): deallocates the zone
● brk(addr): grows the heap (current->mm->brk) until it reaches address addr
● sbrk(incr): grows the heap by incr bytes

sys_brk(addr) system call
● If addr is inside an existing code memory area → out
● Align addr on a page (smallest allocatable element)
● If decreasing the heap, then call do_munmap() → out
● Else, verify permissions
● do_mmap()

do_mmap()
● Allocate a new memory area
  – Verify the request validity (size, max number of areas, rights, ...)
  – Invoke get_unmapped_area() to obtain a valid address
  – Compute the permission flags of the new VM area
  – If VM_SHARED equals 0, then try to merge with an existing area
  – Allocate a new struct vm_area_struct
  – Fill the vm_area_struct
  – Insert it in the tree of VM area structures
  – Return the new address

Kernel object allocation
● kmem_cache_alloc()
● The Linux kernel allocator is called SLAB and follows these principles:
  – Data type can influence memory allocation
  – Kernel functions often ask for areas of the same size (for example the size of a process descriptor)
  – Requests can be ordered by frequency
● The allocator has a cache system that allows for faster allocation by reusing memory areas