FAME – Operating System - Memory management - Etis

FAME – Operating System
Memory management
2012 – David Picard
Contributions: Arnaud Revel, Mickaël Maillard
ÉCOLE NATIONALE SUPÉRIEURE DE L'ÉLECTRONIQUE ET DE SES APPLICATIONS

Process address space
● Process address space (PAS) is the whole memory as seen by the process
● Properties:
  – Total (if an address is 32 bits long, then the process address space is 4 GB)
  – Continuous (every zone between 0x00000000 and the highest address is accessible)
  – Private (only the process and the kernel have access to it)
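
As a quick check of the 4 GB figure, the size of an address space is 2 to the power of the address width. The sketch below is only an illustration; `pas_size_bytes` is a made-up helper name, not something taken from the course material.

```c
#include <assert.h>
#include <stdint.h>

/* Size in bytes of an address space whose addresses are `bits` wide.
 * Hypothetical helper for illustration; restricted to 1..63 bits so
 * that the result fits in a uint64_t. */
static uint64_t pas_size_bytes(unsigned bits)
{
    assert(bits >= 1 && bits <= 63);
    return (uint64_t)1 << bits;
}
```

With 32-bit addresses this gives 4 294 967 296 bytes (4 GB). With the 48 bits that x86-64 processors commonly use for virtual addresses, it gives 256 TB.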

Splitting the process address space
● 5 zones
  – A reserved space
  – Code zone (often called text)
  – Data zone
  – Stack
  – Kernel
● Kernel zone is only accessible by the kernel
● Code, Data and Stack are accessible in user mode
Layout (highest address at the top):
  0xffffffff
      kernel
  0xc0000000
      stack
      heap
  0x8xxxxxxx
      text
  0x80000000
      XXXXXXX (reserved)
  0x00000000

Several processes...
● Each process address space has incompatible properties:
  – Total: not all computers have 4 GB of RAM (or even 256 TB for x86-64)
  – Private and continuous: with several processes in the same physical memory, it is either continuous or private
● Kernel zone and reserved zone are shared among processes

Virtual Memory
● Addresses in PAS are not real, physical addresses
● 2 mechanisms:
  – Segmentation
  – Paging
Logical addresses → [Segmentation] → Linear addresses → [Paging] → Physical addresses
Logical addresses are the ones in PAS

Segmentation
● Memory is divided into segments
● Any logical address has two parts:
  – One ID called segment selector
  – One offset relative to the selected segment
● On x86, registers are dedicated to specific segments:
  – cs: code segment
  – ss: stack segment
  – ds: data segment

Segment descriptor
● Each segment is represented by a segment descriptor (8 bytes)
● Descriptors are stored in a table (either global, the GDT, or local, the LDT)
● There is only one GDT, whereas each process has its own LDT
● Segment descriptors contain:
  – The address of the first byte
  – Size of the segment
  – Type of segment (cs, ss, ds, ...)
  – Some rights management information (think of the famous Segmentation fault)

Types of descriptors
● Code segment descriptors:
  – Describe a code segment (GDT or LDT)
● Data segment descriptors:
  – Describe a data segment or a stack segment (GDT or LDT)
● Task State Segment (TSS):
  – Describe a segment with information on a process context (remember the task_struct of processes), only in GDT
● Local descriptor table descriptors:
  – Describe a segment containing an LDT (only in GDT)

Segment selector
● 2 fields:
  – Index in the descriptor table
  – Flag specifying the GDT or LDT (TI)
Segment selector: [ Index | TI ]
2 registers store the addresses of the GDT (GDTR) and of the current LDT (LDTR)
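
Linear address computation
Logical address: segment selector (Index, TI) and offset
● TI selects the table: GDT (address in GDTR) or LDT (address in LDTR)
● Address of the segment descriptor = GDTR or LDTR + 8 * Index
● Linear address = base (read from the segment descriptor) + offset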
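
The computation above can be sketched in C. This is a simplified model under stated assumptions: `struct seg_desc` and `struct seg_selector` are hypothetical, reduced structures that only keep the fields named on the slides, whereas a real x86 descriptor packs its fields into 8 bytes in a more scattered layout.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified segment descriptor (hypothetical layout). */
struct seg_desc {
    uint32_t base;   /* address of the first byte of the segment */
    uint32_t limit;  /* size of the segment in bytes (simplified) */
};

/* Simplified segment selector: index in the table plus the TI flag. */
struct seg_selector {
    uint16_t index;  /* index in the descriptor table */
    uint8_t  ti;     /* 0: use the GDT, 1: use the LDT */
};

/* Returns 0 and stores the linear address on success, or -1 when the
 * offset falls outside the segment (hardware would raise an exception). */
static int linear_address(const struct seg_desc *gdt,
                          const struct seg_desc *ldt,
                          struct seg_selector sel, uint32_t offset,
                          uint32_t *linear)
{
    assert(gdt && ldt && linear);

    /* TI picks the table; indexing an array of descriptors plays the
     * role of "table address + 8 * Index" in the diagram. */
    const struct seg_desc *d = &(sel.ti ? ldt : gdt)[sel.index];

    if (offset >= d->limit)
        return -1;
    *linear = d->base + offset;
    return 0;
}
```

For example, with a GDT entry 1 whose base is 0x10000, the selector (Index = 1, TI = 0) and the offset 0x24 give the linear address 0x10024.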

Paging
● Splitting linear addresses into pages
● Splitting the physical memory into page frames
● Map between pages and page frames
  – Non-linear mapping (consecutive pages need not map to consecutive page frames)
  – Rights management (Page fault exception)
  – One mapping per process

Paging
[Figure: on the left, the process linear addresses from 0x00000000 to 0xffffffff (reserved zone, text at 0x80000000, heap at 0x8xxxxxxx, stack, kernel from 0xc0000000, with a sample address 0x9a5624ce), cut into pages 1 to 5. On the right, the physical memory, where the corresponding frames 1 to 5 appear in a different order and are interleaved with frames of other processes.]

Linear addresses
● Split into 3 fields:
  – Directory index: selects an entry of the page directory, which points to a page table
  – Table index: selects an entry of the page table, which points to a page frame
  – Offset in the page
Linear address: [ directory | table | offset ]
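
Physical address computation
Linear address: [ directory | table | offset ]
● cr3 (register with the directory address) + directory → entry of the page directory, which gives the page table
● Page table + table → entry of the page table, which gives the page frame
● Page frame + offset → physical address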
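
A toy version of this walk can be written in C. The field widths are an assumption: the slides do not give them, so the sketch uses the classic 32-bit x86 split with 4 KB pages (10 bits of directory index, 10 bits of table index, 12 bits of offset). The structures `page_dir` and `page_table` are hypothetical and store plain pointers and frame addresses, without the flag bits of real entries.

```c
#include <assert.h>
#include <stdint.h>

#define ENTRIES 1024u   /* 2^10 entries per directory or table (assumed) */

/* Assumed layout: | directory (10) | table (10) | offset (12) | */
static uint32_t dir_index(uint32_t lin)   { return lin >> 22; }
static uint32_t table_index(uint32_t lin) { return (lin >> 12) & 0x3ffu; }
static uint32_t page_offset(uint32_t lin) { return lin & 0xfffu; }

/* Toy structures: a directory holds pointers to tables, a table holds
 * the base addresses of page frames. */
struct page_table { uint32_t frame[ENTRIES]; };
struct page_dir   { struct page_table *table[ENTRIES]; };

/* Walk directory -> table -> frame, then add the offset.
 * `dir` plays the role of the address held in cr3. */
static uint32_t translate(const struct page_dir *dir, uint32_t lin)
{
    const struct page_table *pt = dir->table[dir_index(lin)];
    assert(pt);  /* this sketch does not model page faults */
    return pt->frame[table_index(lin)] + page_offset(lin);
}
```

Under these assumptions, the address 0x9a5624ce splits into directory index 0x269, table index 0x162 and offset 0x4ce. If the selected table entry holds the frame address 0x00123000, the physical address is 0x001234ce.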

Directory and table structures
● Flag Present: is the page in RAM (if 0, Page fault exception)
● Address of a physical memory zone containing either a table (for directories) or a page (for tables)
● Flag Dirty: page frame has been written
● Flag Read/Write: rights management on the page frame
● Flag User/Supervisor: defines the level of permission required to access the page
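
These fields can be read with bit masks. The sketch below assumes the bit positions of a 32-bit x86 page table entry (Present in bit 0, Read/Write in bit 1, User/Supervisor in bit 2, Dirty in bit 6, address in bits 12 to 31); the slides do not state them, and the macro and function names are chosen for this illustration only. The access check is simplified and ignores processor options that change how supervisor writes are handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions (32-bit x86 page table entry). */
#define PTE_PRESENT (1u << 0)    /* page is in RAM */
#define PTE_RW      (1u << 1)    /* 1: writable, 0: read-only */
#define PTE_USER    (1u << 2)    /* 1: user mode allowed, 0: supervisor only */
#define PTE_DIRTY   (1u << 6)    /* page frame has been written */
#define PTE_ADDR    0xfffff000u  /* address of the frame or table */

static bool pte_present(uint32_t pte) { return (pte & PTE_PRESENT) != 0; }
static bool pte_dirty(uint32_t pte)   { return (pte & PTE_DIRTY) != 0; }

/* Address stored in the entry; only meaningful when the entry is present. */
static uint32_t pte_frame(uint32_t pte)
{
    assert(pte_present(pte));
    return pte & PTE_ADDR;
}

/* Simplified check: would the flags of this entry allow the access? */
static bool pte_allows(uint32_t pte, bool write, bool user_mode)
{
    if (!pte_present(pte))
        return false;                 /* page fault: not in RAM */
    if (write && !(pte & PTE_RW))
        return false;                 /* write to a read-only page */
    if (user_mode && !(pte & PTE_USER))
        return false;                 /* user access to a supervisor page */
    return true;
}
```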

Paging in Linux
● 4 levels of indirection (3 directories, 1 table and the offset)
● Each process has its own private global directory and associated directories and tables
● The address of the global directory is stored in the process descriptor (task_struct, through its mm field) and is switched during a context switch
● The size of a page is fixed by the macro PAGE_SIZE
● The kernel uses its own page tables and maintains their allocation
● No segmentation in Linux by default
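
PAGE_SIZE is a kernel macro, but a user program can ask for the same value at run time. A minimal sketch, assuming a POSIX system where `sysconf(_SC_PAGESIZE)` is available; the wrapper name `page_size` is made up for the example.

```c
#include <assert.h>
#include <unistd.h>

/* User-space view of the page size used by the running kernel. */
static long page_size(void)
{
    long sz = sysconf(_SC_PAGESIZE);
    assert(sz > 0);   /* sysconf returns -1 on error */
    return sz;
}
```

On many x86 Linux systems this returns 4096, but the value depends on the architecture and configuration, so programs should not hard-code it.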

Translation scenarios
During the translation of a linear address to a physical address, several things can happen:
● Table entry is valid: the address is replaced by the corresponding entry in the table, everything is ok
● Entry is not valid: we must find a free page frame, associate it with the requested page and push the association in the tables
● Entry is valid, but the read/write rights are not: throw an exception
● Entry is valid, but the context rights (user/supervisor) are not: throw an exception
● Entry is valid but the page is not in RAM: locate the page on the disk (swap), find a free page frame in RAM, copy from the first to the second, update the association

Page fault exception
● Does the requested address lie in the PAS?
  – Yes: does the access type correspond to the memory zone rights?
    Yes: legal access, allocate a new frame
    No: illegal access, send SIGSEGV signal (the famous segmentation fault)
  – No: is this a user mode exception?
    Yes: illegal access, send SIGSEGV signal
    No: oops, kernel error: kill that process
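
The decision tree of the page fault handler can be condensed into one function. This is a sketch of the logic only, with hypothetical names (`fault_outcome`, `page_fault_outcome`); the real kernel handler deals with many more cases.

```c
#include <assert.h>
#include <stdbool.h>

enum fault_outcome {
    FAULT_ALLOCATE_FRAME,  /* legal access: allocate a new frame */
    FAULT_SIGSEGV,         /* illegal access: send SIGSEGV */
    FAULT_KERNEL_OOPS      /* kernel error: kill that process */
};

/* addr_in_pas:           the requested address lies in the PAS
 * access_matches_rights: the access type fits the rights of the zone
 * user_mode:             the exception happened in user mode */
static enum fault_outcome page_fault_outcome(bool addr_in_pas,
                                             bool access_matches_rights,
                                             bool user_mode)
{
    if (addr_in_pas)
        return access_matches_rights ? FAULT_ALLOCATE_FRAME : FAULT_SIGSEGV;
    return user_mode ? FAULT_SIGSEGV : FAULT_KERNEL_OOPS;
}
```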

Hardware acceleration
● Modern processors have built-in hardware decoding circuits for segmentation and paging (MMU – memory management unit)
  – Fill in the registers containing the addresses of the virtual memory structures (directories, tables, ...)
  – Activate segmentation or paging via a control register
  – The MMU does everything on its own...

Virtual memory advantages
● Memory abstraction (independence from specific hardware implementations)
● Easy rights management
● Hides memory fragmentation (greatly simplifies the job of programmers)
● All the complex memory management stuff is hidden in the OS
● Avoids a lot of reallocation
● ...

Cache
● Accessing RAM is costly
● Processors have a small internal memory zone named cache
● Thanks to a hardware mechanism, frequently used data are stored in this cache zone
● If a piece of data is not in the cache (cache miss), it is loaded into it

Translation Lookaside Buffers
● TLBs accelerate the conversion from linear addresses to physical addresses
● Small buffers that store the mapping between linear and physical addresses
● Kernel handles the validity of the mapping in the TLBs (via flush_tlb_*), because only the kernel handles this mapping
● Whenever the register storing the address of the global directory is changed, all TLBs are invalidated
● During a process switch, TLBs are thus cleared
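
To make the idea concrete, here is a toy software model of a TLB. Real TLBs are hardware structures whose size and organisation vary between processors; the direct-mapped layout, the size of 16 entries and all names below are assumptions made for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TLB_ENTRIES 16u   /* illustrative size only */

struct tlb_entry { bool valid; uint32_t page; uint32_t frame; };
struct tlb       { struct tlb_entry e[TLB_ENTRIES]; };

/* Invalidate every entry, as happens when the register holding the
 * global directory address changes. */
static void tlb_flush(struct tlb *t)
{
    for (unsigned i = 0; i < TLB_ENTRIES; i++)
        t->e[i].valid = false;
}

/* Look up a page number; returns true and sets *frame on a hit. */
static bool tlb_lookup(const struct tlb *t, uint32_t page, uint32_t *frame)
{
    assert(t && frame);
    const struct tlb_entry *e = &t->e[page % TLB_ENTRIES];
    if (!e->valid || e->page != page)
        return false;   /* miss: the page tables must be walked */
    *frame = e->frame;
    return true;
}

/* Remember a translation that was obtained from the page tables. */
static void tlb_insert(struct tlb *t, uint32_t page, uint32_t frame)
{
    struct tlb_entry *e = &t->e[page % TLB_ENTRIES];
    e->valid = true;
    e->page  = page;
    e->frame = frame;
}
```

A lookup that hits avoids the whole directory and table walk. After `tlb_flush`, every lookup misses again, which is the cost paid at each process switch.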

Memory descriptors
● struct mm_struct: contains information on the address space of a process
● In the field mm of the process descriptor
● Doubly linked list of descriptors
● Sharing an mm_struct between several processes is possible (syscall clone() and fork())

mm_struct (mm_types.h)
struct mm_struct {
    struct vm_area_struct * mmap;   /* list of VMAs */
    ...
    pgd_t * pgd;                    /* page global directory */
    atomic_t mm_users;              /* How many users with user space? */
    atomic_t mm_count;              /* How many references to "struct mm_struct" (users count as 1) */
    int map_count;                  /* number of VMAs */
    ...
    unsigned long total_vm, locked_vm, shared_vm, exec_vm;
    unsigned long stack_vm, reserved_vm, def_flags, nr_ptes;
    unsigned long start_code, end_code, start_data, end_data;   /* code and data */
    unsigned long start_brk, brk, start_stack;                  /* heap and stack */
    unsigned long arg_start, arg_end, env_start, env_end;       /* arguments and environment */
    ...
};

Memory areas
● Exist thanks to the vm_area_struct structure
● Each one encodes an interval of linear addresses
● No overlap
● Fusion or splitting possible at allocation/reallocation if rights are the same on the 2 regions
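
Fusion and splitting of areas can be illustrated with a reduced structure. `struct area` below is a hypothetical stand-in that keeps only an interval and its rights; it is not the kernel structure, and the kernel performs more checks before merging.

```c
#include <assert.h>
#include <stdbool.h>

/* Interval [start, end) of linear addresses and its rights. */
struct area {
    unsigned long start;  /* first address of the area */
    unsigned long end;    /* first address after the area */
    unsigned long flags;  /* access rights */
};

/* Two areas can be fused when they touch and carry the same rights. */
static bool can_merge(const struct area *a, const struct area *b)
{
    return a->end == b->start && a->flags == b->flags;
}

/* Fuse b into a if possible; returns true when a was extended. */
static bool merge(struct area *a, const struct area *b)
{
    if (!can_merge(a, b))
        return false;
    a->end = b->end;
    return true;
}

/* Split a at addr: a keeps [start, addr), *tail receives [addr, end). */
static void split(struct area *a, unsigned long addr, struct area *tail)
{
    assert(a->start < addr && addr < a->end);
    *tail = *a;
    tail->start = addr;
    a->end = addr;
}
```

Splitting is what happens, for instance, when the rights of only part of an area are changed: the area is cut so that each piece has uniform rights.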

vm_area_struct (mm_types.h)
/*
 * This struct defines a memory VMM memory area. There is one of these
 * per VM-area/task. A VM area is any part of the process virtual memory
 * space that has a special rule for the page-fault handlers (ie a shared
 * library, the executable area etc).
 */
struct vm_area_struct {
    struct mm_struct * vm_mm;       /* The address space we belong to. */
    unsigned long vm_start;         /* Our start address within vm_mm. */
    unsigned long vm_end;           /* The first byte after our end address within vm_mm. */

    /* linked list of VM areas per task, sorted by address */
    struct vm_area_struct *vm_next, *vm_prev;

    pgprot_t vm_page_prot;          /* Access permissions of this VMA. */
    unsigned long vm_flags;         /* Flags, see mm.h. */
    ...
    /* Function pointers to deal with this struct. */
    const struct vm_operations_struct *vm_ops;
    ...
};

Area operations
/*
 * These are the virtual MM functions - opening of an area, closing and
 * unmapping it (needed to keep files on disk up-to-date etc), pointer
 * to the functions called when a no-page or a wp-page exception occurs.
 */
struct vm_operations_struct {
    void (*open)(struct vm_area_struct * area);
    void (*close)(struct vm_area_struct * area);
    int (*fault)(struct vm_area_struct *vma, struct vm_fault *vmf);

    /* notification that a previously read-only page is about to become
     * writable, if an error is returned it will cause a SIGBUS */
    int (*page_mkwrite)(struct vm_area_struct *vma, struct vm_fault *vmf);
    ...
};

Area rights management
/*
 * vm_flags in vm_area_struct, see mm_types.h.
 */
#define VM_READ         0x00000001      /* currently active flags */
#define VM_WRITE        0x00000002
#define VM_EXEC         0x00000004
#define VM_SHARED       0x00000008

/* mprotect() hardcodes VM_MAYREAD >> 4 == VM_READ, and so for r/w/x bits. */
#define VM_MAYREAD      0x00000010      /* limits for mprotect() etc */
#define VM_MAYWRITE     0x00000020
#define VM_MAYEXEC      0x00000040
#define VM_MAYSHARE     0x00000080

#define VM_GROWSDOWN    0x00000100      /* general info on the segment */

#define VM_DONTCOPY     0x00020000      /* Do not copy this vma on fork */
#define VM_DONTEXPAND   0x00040000      /* Cannot expand with mremap() */
#define VM_RESERVED     0x00080000      /* Count as reserved_vm like IO */
#define VM_ACCOUNT      0x00100000      /* Is a VM accounted object */
#define VM_NORESERVE    0x00200000      /* should the VM suppress accounting */
#define VM_HUGETLB      0x00400000      /* Huge TLB Page VM */
#define VM_NONLINEAR    0x00800000      /* Is non-linear (remap_file_pages) */
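
The comment about mprotect() points at a useful property: each VM_MAYxxx limit sits exactly 4 bits above the matching active flag, so a single shift lines the limits up with the requested rights. The function below is a sketch of that idea with a made-up name (`rights_allowed`); it is not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Values copied from the flags listed above. */
#define VM_READ     0x00000001UL
#define VM_WRITE    0x00000002UL
#define VM_EXEC     0x00000004UL
#define VM_MAYREAD  0x00000010UL
#define VM_MAYWRITE 0x00000020UL
#define VM_MAYEXEC  0x00000040UL

/* Can the rights in `wanted` (a combination of VM_READ, VM_WRITE and
 * VM_EXEC) be granted to an area whose flags are `vm_flags`? */
static bool rights_allowed(unsigned long vm_flags, unsigned long wanted)
{
    const unsigned long rwx = VM_READ | VM_WRITE | VM_EXEC;
    unsigned long limits = (vm_flags >> 4) & rwx;  /* MAY bits moved onto r/w/x */

    assert((wanted & ~rwx) == 0);
    return (wanted & ~limits) == 0;
}
```

For an area with VM_MAYREAD and VM_MAYWRITE set, asking for read and write is accepted, while asking for execute is refused.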

Swap
● Sometimes, the available RAM is not sufficient for storing everything (code and data)
● A major advantage of virtual memory is that pages do not all need to be in RAM
● A zone on the hard disk, called swap, can be reserved to store some pages
● The kernel handles the transfer between the RAM and the swap
  – By keeping the most used pages in RAM
  – By handling the page fault exception when a requested page is not in RAM but in swap

Handling swap
● Sometimes, a process requests a page in swap while RAM is 100% occupied
  – The system chooses a page (hopefully one with a low request rate)
  – Puts it in swap
  – Uses the freed page frame for the requested page
  – Updates the corresponding tables
● Sometimes the chosen page contains data that can safely be erased (cache data, for example)
  – In that case, the system simply erases its content
  – And updates the tables

Memory allocators
● As we have seen, the kernel handles the memory
● Whenever a process needs more memory, it asks the kernel via a system call
● In practice, the request is made by calling a libc function:
  – malloc() or brk(), etc.

Managing the heap
● malloc(size): allocates size bytes
● calloc(n, size): allocates n elements of size size, initialized to 0
● realloc(ptr, size): changes the size of a memory zone allocated with malloc or calloc
● free(ptr): deallocates the zone
● brk(addr): moves the end of the heap (current->mm->brk) to address addr
● sbrk(incr): increases the size of the heap by incr bytes
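
A short example of the libc functions working together. The function `heap_demo` is invented for the illustration: it allocates a zeroed array, grows it, fills the new part and frees everything.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate n zeroed ints, grow the array to 2n elements, fill the new
 * half with ones and return the sum of all elements.
 * Returns -1 if an allocation fails. */
static long heap_demo(size_t n)
{
    assert(n > 0);

    int *v = calloc(n, sizeof *v);                /* n elements, all 0 */
    if (!v)
        return -1;

    int *bigger = realloc(v, 2 * n * sizeof *v);  /* first n elements are kept */
    if (!bigger) {
        free(v);                                  /* v is still valid on failure */
        return -1;
    }
    v = bigger;

    for (size_t i = n; i < 2 * n; i++)            /* the new half is uninitialized */
        v[i] = 1;

    long sum = 0;
    for (size_t i = 0; i < 2 * n; i++)
        sum += v[i];

    free(v);
    return sum;
}
```

Note that the result of realloc is stored in a separate pointer first: if realloc fails it returns NULL and the original zone must still be freed.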

sys_brk(addr) system call
● If addr is inside an existing code memory area → out
● Align addr on a page (smallest allocable element)
● If decreasing the heap, then call do_munmap() → out
● Else, verify permissions
● do_mmap()
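
The alignment step and the grow or shrink decision can be sketched as follows. The page size of 4096 bytes and the names `PAGE_SZ`, `page_align_up` and `classify_brk` are assumptions made for this example; the kernel uses its own macros and performs further checks.

```c
#include <assert.h>

#define PAGE_SZ 4096UL   /* assumed page size for this sketch */

/* Round addr up to the next page boundary (unchanged if already aligned). */
static unsigned long page_align_up(unsigned long addr)
{
    return (addr + PAGE_SZ - 1) & ~(PAGE_SZ - 1);
}

enum brk_action { BRK_NOTHING, BRK_SHRINK, BRK_GROW };

/* Decide what has to happen when the end of the heap moves from old_brk
 * to new_brk, comparing page-aligned values. */
static enum brk_action classify_brk(unsigned long old_brk,
                                    unsigned long new_brk)
{
    unsigned long oldp = page_align_up(old_brk);
    unsigned long newp = page_align_up(new_brk);

    if (newp == oldp)
        return BRK_NOTHING;   /* same last page: no area to map or unmap */
    return newp < oldp ? BRK_SHRINK : BRK_GROW;
}
```

BRK_SHRINK corresponds to the do_munmap() branch and BRK_GROW to the do_mmap() branch of the steps above.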

do_mmap()
● Allocate a new memory area
  – Verify the request validity (size, max number of areas, rights, ...)
  – Invoke get_unmapped_area() to obtain a valid address
  – Compute permission flags of the new vm area
  – If VM_SHARED equals 0, then try to fuse with an existing area
  – Allocate a new struct vm_area_struct
  – Fill the vm_area_struct
  – Insert it in the tree of vm area structures
  – Return the new address

Kernel object allocation
● kmem_cache_alloc()
● The Linux kernel allocator is called SLAB and follows these principles:
  – Data type can influence memory allocation
  – Kernel functions often ask for areas of the same size (for example the size of a process descriptor)
  – Requests can be ordered by frequency
● The allocator has a cache system that allows for faster allocation by reusing memory areas