MIT6.828 Lab4

Lec 9

“Locking”

When we say that a lock protects data, we really mean that the lock protects some
collection of invariants that apply to the data. Invariants are properties of data
structures that are maintained across operations. Typically, an operation’s correct
behavior depends on the invariants being true when the operation begins. The operation
may temporarily violate the invariants but must reestablish them before finishing. For
example, in the linked list case, the invariant is that list points at the first node in
the list and that each node’s next field points at the next node. The implementation of
insert violates this invariant temporarily: in line 15, l points to the next list element,
but list does not point at l yet (reestablished at line 16). The race condition we
examined above happened because a second CPU executed code that depended on the
list invariants while they were (temporarily) violated. Proper use of a lock ensures that
only one CPU at a time can operate on the data structure in the critical section, so
that no CPU will execute a data structure operation when the data structure’s
invariants do not hold.

You can think of locks as serializing concurrent critical sections so that they run
one at a time, and thus preserve invariants (assuming they are correct in isolation).
You can also think of critical sections as being atomic with respect to each other, so
that a critical section that obtains the lock later sees only the complete set of changes
from earlier critical sections, and never sees partially-completed updates.
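As a rough illustration of the invariant argument above, here is a minimal sketch of a lock-protected list insert. The spinlock is built on C11 `atomic_flag`; the names `locked_list`, `acquire`, `release`, `list_insert`, and `list_length` are made up for this note and are not the xv6 or JOS API.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct node {
    int data;
    struct node *next;
};

struct locked_list {
    atomic_flag lock;     /* clear = unlocked */
    struct node *head;    /* invariant: head points at the first node */
};

#define LOCKED_LIST_INIT { ATOMIC_FLAG_INIT, NULL }

/* Spin until the flag was previously clear, i.e. until we own the lock. */
static void acquire(atomic_flag *lk)
{
    while (atomic_flag_test_and_set_explicit(lk, memory_order_acquire))
        ; /* spin */
}

static void release(atomic_flag *lk)
{
    atomic_flag_clear_explicit(lk, memory_order_release);
}

/* Returns 0 on success, -1 if memory is exhausted. */
static int list_insert(struct locked_list *l, int data)
{
    assert(l != NULL);
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->data = data;

    acquire(&l->lock);
    n->next = l->head;    /* invariant broken: n is not reachable from head yet */
    l->head = n;          /* invariant restored */
    release(&l->lock);
    return 0;
}

static int list_length(struct locked_list *l)
{
    int len = 0;
    acquire(&l->lock);
    for (struct node *n = l->head; n != NULL; n = n->next)
        len++;
    release(&l->lock);
    return len;
}
```

The two assignments between `acquire` and `release` are exactly the window in which the invariant does not hold; the lock keeps every other critical section out of that window.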

Lab 4: Preemptive Multitasking

Part A: Multiprocessor Support and Cooperative Multitasking

Multiprocessor Support

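Symmetric multiprocessing (SMP) model:
The bootstrap processor (BSP) initializes the system and boots the operating system; the application processors (APs) are activated by the BSP only after the operating system is up and running. Which processor is the BSP is determined by the hardware and the BIOS.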

We are going to make JOS support “symmetric multiprocessing” (SMP), a multiprocessor model in which all CPUs have equivalent access to system resources such as memory and I/O buses. While all CPUs are functionally identical in SMP, during the boot process they can be classified into two types: the bootstrap processor (BSP) is responsible for initializing the system and for booting the operating system; and the application processors (APs) are activated by the BSP only after the operating system is up and running. Which processor is the BSP is determined by the hardware and the BIOS.

Each CPU has a LAPIC unit, and the LAPIC units are responsible for delivering interrupts throughout the system.

In an SMP system, each CPU has an accompanying local APIC (LAPIC) unit. The LAPIC units are responsible for delivering interrupts throughout the system. The LAPIC also provides its connected CPU with a unique identifier.

Per-CPU State and Initialization

Mapping MMIO.

A processor accesses its LAPIC using memory-mapped I/O (MMIO). In MMIO, a portion of physical memory is hardwired to the registers of some I/O devices, so the same load/store instructions typically used to access memory can be used to access device registers. You’ve already seen one IO hole at physical address 0xA0000 (we use this to write to the VGA display buffer).

//
// Reserve size bytes in the MMIO region and map [pa,pa+size) at this
// location. Return the base of the reserved region. size does *not*
// have to be multiple of PGSIZE.
//
void *
mmio_map_region(physaddr_t pa, size_t size)
{
    static uintptr_t base = MMIOBASE;

    size = ROUNDUP(size, PGSIZE);
    pa = ROUNDDOWN(pa, PGSIZE);

    if (base + size > MMIOLIM)
        panic("mmio_map_region: cannot go higher than MMIOLIM!\n");

    boot_map_region(kern_pgdir, base, size, pa, PTE_PCD | PTE_PWT | PTE_W);
    base += size;
    return (void *)(base - size);
}
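The bookkeeping in mmio_map_region is a bump allocator over [MMIOBASE, MMIOLIM). The host-side sketch below models only that arithmetic; `mmio_reserve` and `struct mmio_window` are hypothetical names, and the two constants are assumed to match JOS's inc/memlayout.h. Unlike the listing above, it derives the span from both ends of [pa, pa+size), so a request that starts in the middle of a page is still fully covered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PGSIZE   4096u
#define MMIOBASE 0xEF800000u   /* assumed, as in JOS inc/memlayout.h */
#define MMIOLIM  0xEFC00000u   /* assumed: MMIOBASE + 4MB */

static uint32_t round_down(uint32_t x) { return x & ~(PGSIZE - 1); }
static uint32_t round_up(uint32_t x)   { return round_down(x + PGSIZE - 1); }

struct mmio_window {
    uint32_t va;    /* page-aligned virtual base of the reservation */
    uint32_t pa;    /* page-aligned physical base that would be mapped there */
    uint32_t size;  /* bytes reserved, a multiple of PGSIZE */
};

/* Next free virtual address in the MMIO region. */
static uint32_t mmio_next = MMIOBASE;

/*
 * Reserve enough pages to cover [pa, pa + size).
 * Returns 0 and fills *out on success, -1 if the region would overflow.
 * Assumes pa + size does not wrap around 4GB.
 */
static int mmio_reserve(uint32_t pa, uint32_t size, struct mmio_window *out)
{
    assert(out != NULL);
    uint32_t pa_lo = round_down(pa);
    uint32_t span = round_up(pa + size) - pa_lo;

    if (span > MMIOLIM - mmio_next)
        return -1;
    out->va = mmio_next;
    out->pa = pa_lo;
    out->size = span;
    mmio_next += span;
    return 0;
}
```

In the kernel, the `(va, pa, size)` triple is what would be handed to boot_map_region with the cache-disable and write-through bits set.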

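Map a kernel stack region for each CPU.
Note that the loop below uses NCPU rather than ncpu, because ncpu has not been initialized yet at this point.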

// Modify mappings in kern_pgdir to support SMP
// - Map the per-CPU stacks in the region [KSTACKTOP-PTSIZE, KSTACKTOP)
//
static void
mem_init_mp(void)
{
    for (int i = 0; i < NCPU; ++i) {
        uint32_t kstacktop_i = KSTACKTOP - i * (KSTKSIZE + KSTKGAP);
        boot_map_region(kern_pgdir, kstacktop_i - KSTKSIZE, KSTKSIZE,
                        PADDR(percpu_kstacks[i]), PTE_W | PTE_P);
    }
}
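To make the layout concrete, the following sketch computes the virtual range of each per-CPU stack on the host. `percpu_kstack` and `struct kstack_range` are hypothetical helpers; the constants are assumed to match JOS (KSTACKTOP at 0xF0000000, 8-page stacks, 8-page gaps).

```c
#include <assert.h>
#include <stdint.h>

#define PGSIZE    4096u
#define KSTACKTOP 0xF0000000u     /* assumed, as in JOS inc/memlayout.h */
#define KSTKSIZE  (8 * PGSIZE)    /* assumed size of one kernel stack */
#define KSTKGAP   (8 * PGSIZE)    /* assumed size of the unmapped guard gap */
#define NCPU      8u

struct kstack_range {
    uint32_t bottom;  /* lowest mapped address of the stack */
    uint32_t top;     /* one past the highest mapped address */
};

/* Virtual address range of the kernel stack of CPU number `cpu`. */
static struct kstack_range percpu_kstack(unsigned cpu)
{
    assert(cpu < NCPU);
    struct kstack_range r;
    r.top = KSTACKTOP - cpu * (KSTKSIZE + KSTKGAP);
    r.bottom = r.top - KSTKSIZE;
    return r;
}
```

The gap between `percpu_kstack(i).bottom` and `percpu_kstack(i + 1).top` is left unmapped, so a stack overflow on one CPU faults instead of silently corrupting the next CPU's stack.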

Set up the TSS and IDT for each CPU.

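void
trap_init_percpu(void)
{
    int id = cpunum();

    // Each CPU gets its own kernel stack and its own TSS.
    thiscpu->cpu_ts.ts_esp0 = KSTACKTOP - id * (KSTKGAP + KSTKSIZE);
    thiscpu->cpu_ts.ts_ss0 = GD_KD;
    thiscpu->cpu_ts.ts_iomb = sizeof(struct Taskstate);

    // Install this CPU's TSS descriptor in its own GDT slot.
    gdt[(GD_TSS0 >> 3) + id] = SEG16(STS_T32A, (uint32_t) (&(thiscpu->cpu_ts)),
                                     sizeof(struct Taskstate) - 1, 0);
    gdt[(GD_TSS0 >> 3) + id].sd_s = 0;

    // The selector is the byte offset of the descriptor in the GDT.
    // GDT entries are segment descriptors of 8 bytes each, so CPU id
    // uses GD_TSS0 + (id << 3).
    ltr(GD_TSS0 + (id << 3));
    lidt(&idt_pd);
}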

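The figure below helps explain why the GDT entry uses the virtual (linear) address of cpu_ts rather than its physical address, and which TSS selector ltr takes as its operand.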
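The selector arithmetic can be checked on its own. This is a small sketch, assuming GD_TSS0 is 0x28 as in JOS's inc/memlayout.h; `tss_selector` and `gdt_index` are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

#define GD_TSS0 0x28u   /* assumed: selector of CPU 0's TSS descriptor */
#define NCPU    8u      /* assumed maximum number of CPUs */

/* A selector is index (bits 15..3) | TI (bit 2) | RPL (bits 1..0). */
static_assert((GD_TSS0 & 7u) == 0, "TSS selector must have TI = 0 and RPL = 0");

/* Selector that CPU `cpu` loads with ltr. */
static uint16_t tss_selector(unsigned cpu)
{
    assert(cpu < NCPU);
    return (uint16_t)(GD_TSS0 + (cpu << 3));
}

/* GDT slot that a selector refers to. */
static unsigned gdt_index(uint16_t selector)
{
    return selector >> 3;
}
```

Because the low three bits of the selector are flag bits, consecutive GDT slots are 8 apart in selector space, which happens to equal the size of one descriptor.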


Locking

System Calls for Environment Creation

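A handful of system calls. Everything that has to be checked is listed in the comments, so it is just a matter of implementing the checks one by one.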

// Allocate a new environment.
// Returns envid of new environment, or < 0 on error. Errors are:
// -E_NO_FREE_ENV if no free environment is available.
// -E_NO_MEM on memory exhaustion.
static envid_t
sys_exofork(void)
{
    struct Env *e;
    int result;

    if ((result = env_alloc(&e, curenv->env_id)))
        return result;

    // The child starts with a copy of the parent's registers, so it
    // resumes at the same point but sees a return value of 0.
    memcpy(&e->env_tf, &curenv->env_tf, sizeof(struct Trapframe));
    e->env_status = ENV_NOT_RUNNABLE;
    e->env_tf.tf_regs.reg_eax = 0;
    return e->env_id;
}

// Set envid's env_status to status, which must be ENV_RUNNABLE
// or ENV_NOT_RUNNABLE.
//
// Returns 0 on success, < 0 on error. Errors are:
// -E_BAD_ENV if environment envid doesn't currently exist,
// or the caller doesn't have permission to change envid.
// -E_INVAL if status is not a valid status for an environment.
static int
sys_env_set_status(envid_t envid, int status)
{
    int result;
    struct Env *e;

    if ((result = envid2env(envid, &e, 1)))
        return result;
    if (status != ENV_RUNNABLE && status != ENV_NOT_RUNNABLE)
        return -E_INVAL;
    e->env_status = status;
    return 0;
}

// Allocate a page of memory and map it at 'va' with permission
// 'perm' in the address space of 'envid'.
// The page's contents are set to 0.
// If a page is already mapped at 'va', that page is unmapped as a
// side effect.
//
// perm -- PTE_U | PTE_P must be set, PTE_AVAIL | PTE_W may or may not be set,
// but no other bits may be set. See PTE_SYSCALL in inc/mmu.h.
//
// Return 0 on success, < 0 on error. Errors are:
// -E_BAD_ENV if environment envid doesn't currently exist,
// or the caller doesn't have permission to change envid.
// -E_INVAL if va >= UTOP, or va is not page-aligned.
// -E_INVAL if perm is inappropriate (see above).
// -E_NO_MEM if there's no memory to allocate the new page,
// or to allocate any necessary page tables.
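static int
sys_page_alloc(envid_t envid, void *va, int perm)
{
    int result;
    struct PageInfo *page;
    struct Env *e;

    // Reject any bit outside PTE_SYSCALL. Note that the spec above also
    // requires PTE_U | PTE_P to be set, which is not enforced here.
    if ((perm | PTE_SYSCALL) != PTE_SYSCALL)
        return -E_INVAL;
    if (va >= (void *)UTOP || va != ROUNDDOWN(va, PGSIZE))
        return -E_INVAL;
    if ((result = envid2env(envid, &e, 1)))
        return result;

    if (!(page = page_alloc(ALLOC_ZERO)))
        return -E_NO_MEM;
    // page_insert can fail while allocating a page table; release the
    // freshly allocated page in that case.
    if ((result = page_insert(e->env_pgdir, page, va, perm)) < 0) {
        page_free(page);
        return result;
    }
    return 0;
}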

// Map the page of memory at 'srcva' in srcenvid's address space
// at 'dstva' in dstenvid's address space with permission 'perm'.
// Perm has the same restrictions as in sys_page_alloc, except
// that it also must not grant write access to a read-only
// page.
//
// Return 0 on success, < 0 on error. Errors are:
// -E_BAD_ENV if srcenvid and/or dstenvid doesn't currently exist,
// or the caller doesn't have permission to change one of them.
// -E_INVAL if srcva >= UTOP or srcva is not page-aligned,
// or dstva >= UTOP or dstva is not page-aligned.
// -E_INVAL is srcva is not mapped in srcenvid's address space.
// -E_INVAL if perm is inappropriate (see sys_page_alloc).
// -E_INVAL if (perm & PTE_W), but srcva is read-only in srcenvid's
// address space.
// -E_NO_MEM if there's no memory to allocate any necessary page tables.
static int
sys_page_map(envid_t srcenvid, void *srcva,
             envid_t dstenvid, void *dstva, int perm)
{
    struct Env *srcenv, *dstenv = NULL;
    pte_t *srcpte;
    struct PageInfo *srcpage = NULL;
    int result;

    if (srcva >= (void *)UTOP || srcva != ROUNDDOWN(srcva, PGSIZE) ||
        dstva >= (void *)UTOP || dstva != ROUNDDOWN(dstva, PGSIZE))
        return -E_INVAL;
    // Reject any bit outside PTE_SYSCALL (the spec also requires
    // PTE_U | PTE_P, which is not enforced here).
    if ((perm | PTE_SYSCALL) != PTE_SYSCALL)
        return -E_INVAL;
    if ((result = envid2env(srcenvid, &srcenv, 1)))
        return result;
    if ((result = envid2env(dstenvid, &dstenv, 1)))
        return result;

    if (!(srcpage = page_lookup(srcenv->env_pgdir, srcva, &srcpte)))
        return -E_INVAL;

    // Never grant write access to a page the source maps read-only.
    if (((*srcpte) & PTE_W) == 0 && ((perm & PTE_W) != 0))
        return -E_INVAL;

    if ((result = page_insert(dstenv->env_pgdir, srcpage, dstva, perm)))
        return result;
    return 0;
}

// Unmap the page of memory at 'va' in the address space of 'envid'.
// If no page is mapped, the function silently succeeds.
//
// Return 0 on success, < 0 on error. Errors are:
// -E_BAD_ENV if environment envid doesn't currently exist,
// or the caller doesn't have permission to change envid.
// -E_INVAL if va >= UTOP, or va is not page-aligned.
static int
sys_page_unmap(envid_t envid, void *va)
{
    struct Env *e;
    int result;

    if (va >= (void *)UTOP || va != ROUNDDOWN(va, PGSIZE))
        return -E_INVAL;
    if ((result = envid2env(envid, &e, 1)))
        return result;
    page_remove(e->env_pgdir, va);
    return 0;
}
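The address and permission rules quoted in the comments above can be written as two small predicates. This is a sketch of the rules as the comments state them (PTE_U | PTE_P required, nothing outside PTE_SYSCALL allowed); `user_va_ok` and `syscall_perm_ok` are hypothetical names, and the constants are assumed to match JOS's inc/mmu.h and inc/memlayout.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PGSIZE      4096u
#define UTOP        0xEEC00000u   /* assumed, as in JOS inc/memlayout.h */
#define PTE_P       0x001         /* present */
#define PTE_W       0x002         /* writable */
#define PTE_U       0x004         /* user-accessible */
#define PTE_AVAIL   0xE00         /* bits available for software use */
#define PTE_SYSCALL (PTE_AVAIL | PTE_P | PTE_W | PTE_U)

static_assert(PTE_SYSCALL == 0xE07, "unexpected PTE_SYSCALL mask");

/* A user-supplied address must be below UTOP and page-aligned. */
static bool user_va_ok(uint32_t va)
{
    return va < UTOP && va % PGSIZE == 0;
}

/* PTE_U | PTE_P must be set; no bit outside PTE_SYSCALL may be set. */
static bool syscall_perm_ok(int perm)
{
    return (perm & (PTE_U | PTE_P)) == (PTE_U | PTE_P)
        && (perm & ~PTE_SYSCALL) == 0;
}
```

Enforcing the required bits in the kernel means every user-side caller has to pass PTE_P explicitly, so the two sides need to be changed together.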

Part B: Copy-on-Write Fork

User-level page fault handling

Setting the Page Fault Handler
// Set the page fault upcall for 'envid' by modifying the corresponding struct
// Env's 'env_pgfault_upcall' field. When 'envid' causes a page fault, the
// kernel will push a fault record onto the exception stack, then branch to
// 'func'.
//
// Returns 0 on success, < 0 on error. Errors are:
// -E_BAD_ENV if environment envid doesn't currently exist,
// or the caller doesn't have permission to change envid.
static int
sys_env_set_pgfault_upcall(envid_t envid, void *func)
{
    // LAB 4: Your code here.
    struct Env *e;
    int result;

    if ((result = envid2env(envid, &e, 1)))
        return result;
    e->env_pgfault_upcall = func;
    return 0;
}
Invoking the User Page Fault Handler
void
page_fault_handler(struct Trapframe *tf)
{
    uint32_t fault_va;

    // Read the faulting address from CR2.
    fault_va = rcr2();

    // A page fault in kernel mode is a kernel bug.
    if ((tf->tf_cs & 3) == 0)
        panic("kernel page fault\n");

    if (curenv->env_pgfault_upcall == NULL)
        goto bad;

    // If the fault happened while already running on the user exception
    // stack, the user handler itself faulted (a recursive fault).
    int recursive = (tf->tf_esp >= UXSTACKTOP - PGSIZE) && (tf->tf_esp < UXSTACKTOP);
    struct UTrapframe *utf;

    if (recursive) {
        // Leave one scratch word below the trap-time esp; the upcall
        // stores the return address there.
        utf = (struct UTrapframe *)(tf->tf_esp - sizeof(struct UTrapframe) - sizeof(uint32_t));
        // cprintf("recursive\n");
        user_mem_assert(curenv, utf, tf->tf_esp - (uint32_t)utf, PTE_W);
    } else {
        utf = (struct UTrapframe *)(UXSTACKTOP - sizeof(struct UTrapframe));
        // cprintf("non-recursive\n");
        user_mem_assert(curenv, utf, UXSTACKTOP - (uint32_t)utf, PTE_W);
    }

    utf->utf_esp = tf->tf_esp;
    utf->utf_eflags = tf->tf_eflags;
    utf->utf_regs = tf->tf_regs;
    utf->utf_err = tf->tf_err;
    utf->utf_eip = tf->tf_eip;
    utf->utf_fault_va = fault_va;

    // Resume the environment in its upcall, on the exception stack.
    tf->tf_esp = (uintptr_t)utf;
    tf->tf_eip = (uint32_t)curenv->env_pgfault_upcall;
    env_run(curenv);

bad:
    // Destroy the environment that caused the fault.
    cprintf("[%08x] user fault va %08x ip %08x\n",
            curenv->env_id, fault_va, tf->tf_eip);
    print_trapframe(tf);
    env_destroy(curenv);
}
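Where the UTrapframe lands is pure address arithmetic, so it can be modeled on the host. The sketch below assumes UXSTACKTOP is 0xEEC00000 and that a UTrapframe is 52 bytes on 32-bit x86; `on_exception_stack` and `utf_address` are hypothetical helpers, not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PGSIZE     4096u
#define UXSTACKTOP 0xEEC00000u   /* assumed, as in JOS inc/memlayout.h */
#define UTF_SIZE   52u           /* assumed sizeof(struct UTrapframe) on i386 */

static_assert(UTF_SIZE % 4 == 0, "frame must keep the stack word-aligned");

/* True if the trap-time esp already lies on the user exception stack. */
static bool on_exception_stack(uint32_t esp)
{
    return esp >= UXSTACKTOP - PGSIZE && esp < UXSTACKTOP;
}

/* Address at which the kernel would build the UTrapframe. */
static uint32_t utf_address(uint32_t trap_esp)
{
    if (on_exception_stack(trap_esp))
        /* Recursive fault: skip one scratch word, then the frame. */
        return trap_esp - 4 - UTF_SIZE;
    /* First fault: the frame sits at the very top of the exception stack. */
    return UXSTACKTOP - UTF_SIZE;
}
```

The scratch word in the recursive case is what lets the upcall push a return address without overwriting the previous frame.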
User-mode Page Fault Entrypoint
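.text
.globl _pgfault_upcall
_pgfault_upcall:
    // Call the C page fault handler.
    pushl %esp              // function argument: pointer to UTF
    movl _pgfault_handler, %eax
    call *%eax
    addl $4, %esp           // pop function argument

    movl 40(%esp), %ebx     // load the trap-time eip
    subl $4, 48(%esp)       // grow the trap-time stack by 4 bytes to hold the return address (the trap-time eip)
    movl 48(%esp), %eax     // load the adjusted trap-time esp
    movl %ebx, (%eax)       // store the trap-time eip in the reserved slot

    addl $8, %esp           // skip fault_va and err

    popal                   // restore the trap-time registers

    addl $4, %esp           // skip the trap-time eip
    popf                    // restore eflags

    popl %esp               // switch to the adjusted trap-time stack
    ret                     // pop the stored eip and resume the faulting code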
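The magic offsets 40 and 48 in the assembly come straight from the UTrapframe layout. The sketch below restates the layout in C, assuming it matches JOS's inc/trap.h, and checks the offsets at compile time; `reserve_return_slot` is a hypothetical helper that models the `subl $4, 48(%esp)` step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register block in the order that pushal leaves it in memory. */
struct PushRegs {
    uint32_t reg_edi, reg_esi, reg_ebp, reg_oesp;
    uint32_t reg_ebx, reg_edx, reg_ecx, reg_eax;
};

/* Assumed to mirror struct UTrapframe from JOS inc/trap.h. */
struct UTrapframe {
    uint32_t utf_fault_va;
    uint32_t utf_err;
    struct PushRegs utf_regs;
    uint32_t utf_eip;
    uint32_t utf_eflags;
    uint32_t utf_esp;
};

static_assert(offsetof(struct UTrapframe, utf_regs) == 8, "regs follow fault_va and err");
static_assert(offsetof(struct UTrapframe, utf_eip) == 40, "40(%esp) is the trap-time eip");
static_assert(offsetof(struct UTrapframe, utf_eflags) == 44, "eflags follows eip");
static_assert(offsetof(struct UTrapframe, utf_esp) == 48, "48(%esp) is the trap-time esp");
static_assert(sizeof(struct UTrapframe) == 52, "frame is 13 words");

/* Model of `subl $4, 48(%esp)`: make room for the return address. */
static uint32_t reserve_return_slot(struct UTrapframe *utf)
{
    assert(utf != NULL);
    utf->utf_esp -= 4;
    return utf->utf_esp;
}
```

After the slot is reserved and filled with the trap-time eip, `popl %esp` followed by `ret` restores both the stack pointer and the instruction pointer without needing a spare register.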
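void
set_pgfault_handler(void (*handler)(struct UTrapframe *utf))
{
    int r;

    if (_pgfault_handler == 0) {
        // First time through!
        // LAB 4: Your code here.
        envid_t eid = sys_getenvid();

        // panic() here is the user-library version: it terminates only
        // the calling environment, not the kernel.
        if ((r = sys_page_alloc(eid, (void *)(UXSTACKTOP - PGSIZE), PTE_W | PTE_U | PTE_P)) < 0)
            panic("set_pgfault_handler: sys_page_alloc: %e", r);
        if ((r = sys_env_set_pgfault_upcall(eid, _pgfault_upcall)) < 0)
            panic("set_pgfault_handler: sys_env_set_pgfault_upcall: %e", r);
    }

    // Save handler pointer for assembly to call.
    _pgfault_handler = handler;
}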

Implementing Copy-on-Write Fork

Analysis of the page table and page directory mapping

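With the installation of the user-level page fault handler done, we can move on to implementing fork.
Before that, let's understand a clever mapping trick.

When the kernel sets up the address space of an environment, it performs the following operation.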

// UVPT maps the env's own page table read-only.
// Permissions: kernel R, user R
e->env_pgdir[PDX(UVPT)] = PADDR(e->env_pgdir) | PTE_P | PTE_U;

UVPT is defined as follows.

// User read-only virtual page table (see 'uvpt' below)
#define UVPT (ULIM - PTSIZE)

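From the comment, we can see that this maps the page directory itself at virtual address UVPT so that the user environment gets read-only access to it. But how does it work?

Consider this scenario: a function in the user-level library wants to inspect the PTE of some page in its own page table to decide whether an operation is legal. The natural idea is to use the page directory mapped earlier, find the page table address in the corresponding page directory entry, and then read the relevant PTE from that page table.

The problem is that a page directory entry (PDE) stores the physical address of the page table, not a virtual address, so the page table cannot be reached that way. You might object that the kernel accessed PTEs earlier. That worked because the kernel first converted the page table's physical address, taken from the page directory, into a kernel virtual address (KVA), which is possible only because physical memory starting at 0 was mapped at KERNBASE and above.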

boot_map_region(kern_pgdir, KERNBASE, (1ULL << 32) - KERNBASE, 0, PTE_W);

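A user environment, however, cannot access virtual addresses above KERNBASE.

So how should a user environment read page table entries? A page table is itself a physical page, and reaching a physical page requires a page table that maps it. The "page table of the page tables" is the page directory. If we treat the page directory as a page table, doesn't that give us the physical addresses of the page tables themselves?

Recall how paging works: the hardware first uses PDX to find the page table in the page directory, then uses PTX to find the physical page in that page table.

If the PDX lookup in the page directory leads back to the page directory itself, the paging hardware treats the page directory as a page table, and the PTX lookup then yields the physical address of a page table, which is what ends up being accessed.

With that in mind, let's see how to form the address of the page table or page table entry we want.
A 32-bit address addr splits into PDX | PTX | OFFSET.

PDX indexes the page directory. To make it select the self-mapping entry, it must be fixed at (UVPT>>22).
Because the page directory is now acting as a page table, PTX also indexes the page directory; the physical address it finds is that of the PTX-th page table.
OFFSET is the byte offset within that page table. A page table holds 4-byte entries, so OFFSET selects entry number OFFSET/4 of that page table.

Putting it together, uvpt[n] reads the page table entry for the n-th page of the virtual address space.
uvpt[addr>>12] is the page table entry of the page containing addr.
uvpd[addr>>22] is the page directory entry for the page table that covers addr.
This also explains the definitions in entry.S.
It is rather counterintuitive: the page directory is mapped through UVPT, yet an access at UVPT itself reaches the page table for page 0, while the page directory has to be accessed through uvpd, at (UVPT+(UVPT>>12)*4).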

.globl uvpt
.set uvpt, UVPT
.globl uvpd
.set uvpd, (UVPT+(UVPT>>12)*4)
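The address arithmetic behind uvpt and uvpd can be verified without a kernel. The sketch below assumes UVPT is 0xEF400000 as in JOS's inc/memlayout.h; `pte_vaddr` and `pde_vaddr` are hypothetical helpers that return the virtual address at which a user environment would find the PTE or PDE of a given address.

```c
#include <assert.h>
#include <stdint.h>

#define PGSHIFT   12
#define PDXSHIFT  22
#define UVPT      0xEF400000u   /* assumed, as in JOS inc/memlayout.h */
#define UVPD_ADDR (UVPT + (UVPT >> PGSHIFT) * 4)

#define PDX(va)   (((uint32_t)(va) >> PDXSHIFT) & 0x3FF)
#define PTX(va)   (((uint32_t)(va) >> PGSHIFT) & 0x3FF)
#define PGOFF(va) ((uint32_t)(va) & 0xFFF)

static_assert(UVPD_ADDR == 0xEF7BD000u, "uvpd = UVPT + (UVPT >> 12) * 4");

/* Virtual address of the PTE that maps va, i.e. &uvpt[va >> 12]. */
static uint32_t pte_vaddr(uint32_t va)
{
    return UVPT + (va >> PGSHIFT) * 4;
}

/* Virtual address of the PDE that covers va, i.e. &uvpd[va >> 22]. */
static uint32_t pde_vaddr(uint32_t va)
{
    return UVPD_ADDR + (va >> PDXSHIFT) * 4;
}
```

Decomposing `pte_vaddr(va)` shows the trick: its PDX is the self-mapping slot, its PTX is the PDX of `va`, and its offset is four times the PTX of `va`. For `pde_vaddr(va)` both PDX and PTX are the self-mapping slot, so the hardware walks through the page directory twice.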
Fork implementation

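pgfault is the custom page fault handler: if the fault was caused by a write to a COW page, it allocates a new physical page, copies the original data into it, and maps it writable.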

//
// Custom page fault handler - if faulting page is copy-on-write,
// map in our own private writable copy.
//
static void
pgfault(struct UTrapframe *utf)
{
    void *addr = (void *) utf->utf_fault_va;
    uint32_t err = utf->utf_err;
    int r;

    // Only a write to a copy-on-write page can be handled here.
    if ((err & FEC_WR) == 0)
        panic("pgfault: the faulting access was not a write\n");
    pte_t pte = ((pte_t *)UVPT)[PGNUM(addr)];
    if ((pte & PTE_COW) == 0)
        panic("pgfault: the faulting access was not to a copy-on-write page\n");

    // Allocate a fresh page at PFTEMP, copy the old contents into it,
    // then map it over the faulting page as writable and non-COW.
    if ((r = sys_page_alloc(0, PFTEMP, PTE_P | PTE_U | PTE_W)))
        panic("pgfault: sys_page_alloc() failed: %e\n", r);
    memcpy((void *)PFTEMP, ROUNDDOWN(addr, PGSIZE), PGSIZE);
    if ((r = sys_page_map(0, PFTEMP, 0, ROUNDDOWN(addr, PGSIZE),
                          PTE_W | (pte & PTE_SYSCALL & ~PTE_COW))))
        panic("pgfault: sys_page_map() failed: %e\n", r);
    if ((r = sys_page_unmap(0, (void *)PFTEMP)) < 0)
        panic("pgfault: sys_page_unmap() failed: %e\n", r);
}

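duppage maps page pn into the child environment envid. If the page is writable or COW, it is mapped COW in both the child and the caller (the parent), because the parent may itself have been created by fork and may hold the page only as COW rather than as a private copy.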

static int
duppage(envid_t envid, unsigned pn)
{
    int r;
    // LAB 4: Your code here.
    uintptr_t addr = pn * PGSIZE;
    pte_t pte = ((pte_t *)UVPT)[pn];

    if (pte & (PTE_W | PTE_COW)) {
        int perm = PTE_COW | PTE_U | (pte & PTE_SYSCALL & ~PTE_W);

        // Map the page into the child with PTE_COW set.
        if ((r = sys_page_map(0, (void *)addr, envid, (void *)addr, perm)) < 0)
            return r;

        // Then remap it in the parent: clear PTE_W and set PTE_COW.
        if ((r = sys_page_map(0, (void *)addr, 0, (void *)addr, perm)) < 0)
            return r;
    } else {
        // Read-only pages are simply shared with the child.
        if ((r = sys_page_map(0, (void *)addr, envid, (void *)addr,
                              PTE_U | (pte & PTE_SYSCALL))) < 0)
            return r;
    }
    return 0;
}
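The permission decision inside duppage is a pure function of the parent's PTE, so it can be isolated and tested. This is a sketch under the assumption that PTE_COW is 0x800 (one of the PTE_AVAIL bits, as in JOS's inc/lib.h); `duppage_perm` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

#define PTE_P       0x001
#define PTE_W       0x002
#define PTE_U       0x004
#define PTE_AVAIL   0xE00
#define PTE_COW     0x800   /* assumed, as in JOS inc/lib.h */
#define PTE_SYSCALL (PTE_AVAIL | PTE_P | PTE_W | PTE_U)

static_assert((PTE_COW & PTE_AVAIL) == PTE_COW, "PTE_COW must be a software bit");

/*
 * Permission bits with which a page should be mapped after fork,
 * given the parent's current page table entry.
 */
static int duppage_perm(uint32_t pte)
{
    int perm = (int)(pte & PTE_SYSCALL);

    /* Writable or already-COW pages become COW and lose PTE_W. */
    if (pte & (PTE_W | PTE_COW))
        perm = (perm & ~PTE_W) | PTE_COW;
    return perm;
}
```

Applying the same result to both the child's and the parent's mapping is what guarantees that the first write from either side faults and gets a private copy.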

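fork creates the child environment and maps the address space into it with duppage. It installs the page fault handler for both itself and the child.
Note that the child's page fault upcall must be installed by the parent: as soon as the child calls a function or makes a system call, the push onto its stack triggers a page fault (the child's stack is still COW at that point), and with no handler installed yet the fault could not be handled.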

//
// User-level fork with copy-on-write.
// Set up our page fault handler appropriately.
// Create a child.
// Copy our address space and page fault handler setup to the child.
// Then mark the child as runnable and return.
//
// Returns: child's envid to the parent, 0 to the child, < 0 on error.
// It is also OK to panic on error.
//
// Hint:
// Use uvpd, uvpt, and duppage.
// Remember to fix "thisenv" in the child process.
// Neither user exception stack should ever be marked copy-on-write,
// so you must allocate a new page for the child's user exception stack.
//
envid_t
fork(void)
{
    // LAB 4: Your code here.
    envid_t ceid;
    int result;

    set_pgfault_handler(pgfault);

    if ((ceid = sys_exofork()) < 0)
        return ceid;
    if (ceid == 0) {
        // Child: thisenv still refers to the parent, so fix it up.
        thisenv = &envs[ENVX(sys_getenvid())];
        return 0;
    }

    // Parent: walk the user address space and duplicate every mapped page.
    extern volatile pde_t uvpd[];
    extern volatile pte_t uvpt[];
    for (uintptr_t va = 0; va < UTOP;) {
        if ((uvpd[va >> PDXSHIFT] & PTE_P) == 0) {
            // No page table here: skip the whole 4MB region.
            va += NPTENTRIES * PGSIZE;
            continue;
        }
        if ((uvpt[va >> PTXSHIFT] & PTE_P) == 0) {
            // Page not mapped.
            va += PGSIZE;
            continue;
        }
        if (va == UXSTACKTOP - PGSIZE) {
            // The user exception stack is never duplicated.
            va += PGSIZE;
            continue;
        }
        // This page should be duppage()d.
        if ((result = duppage(ceid, (unsigned)(va / PGSIZE))) < 0)
            return result;
        va += PGSIZE;
    }

    // The child gets a fresh exception stack and the same upcall.
    if ((result = sys_page_alloc(ceid, (void *)(UXSTACKTOP - PGSIZE), PTE_P | PTE_U | PTE_W)) < 0)
        return result;
    extern void _pgfault_upcall(void);
    if ((result = sys_env_set_pgfault_upcall(ceid, _pgfault_upcall)) < 0)
        return result;
    if ((result = sys_env_set_status(ceid, ENV_RUNNABLE)) < 0)
        return result;
    return ceid;
}
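Copy-on-Write Walkthrough
Preparation

To enable copy-on-write, an environment needs both _pgfault_upcall and _pgfault_handler installed. The former is the assembly entry point of the page fault handling routine; it calls the latter to handle the fault and then restores the trap-time state. set_pgfault_handler performs this installation and, for the reason given above, the child's _pgfault_upcall must be installed by the parent. set_pgfault_handler also allocates a separate exception stack for the environment; the child's exception stack is allocated by the parent.

Triggering Copy-on-Write

When an environment writes to a page marked PTE_COW, the mapping lacks PTE_W, so the processor raises a page fault and traps into the kernel. The normal exception path eventually reaches the kernel's page_fault_handler. That function checks whether the environment has a page fault upcall installed and whether this is a recursive page fault, builds a UTrapframe for the handler, and finally calls env_run to return to user mode, where execution resumes at _pgfault_upcall.

Handling Copy-on-Write

_pgfault_upcall calls the user-installed _pgfault_handler. In this implementation the handler checks the kind of fault: whether it was a write and whether the page is COW. If the checks pass, it allocates a new physical page for the faulting virtual address and copies the contents of the original page into it.

Returning to the trap-time state

The faulting address now has its own private writable page. _pgfault_upcall restores the trap-time state and resumes execution, and this time the write succeeds.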
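The whole flow can be mimicked in ordinary user-space C. The model below is only a sketch: a "page" is a small heap buffer with a reference count, a "mapping" is a pointer plus two flags, and `cow_share` and `cow_write` are hypothetical stand-ins for duppage and for the fault-then-copy path. None of these names exist in JOS.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_BYTES 64   /* tiny "page" for the model */

struct page {
    int refcount;
    unsigned char data[PAGE_BYTES];
};

struct mapping {
    struct page *pg;
    int writable;   /* models PTE_W */
    int cow;        /* models PTE_COW */
};

static struct page *page_new(void)
{
    struct page *p = calloc(1, sizeof *p);
    if (p != NULL)
        p->refcount = 1;
    return p;
}

/* Models duppage: share the page and mark both sides COW if needed. */
static void cow_share(struct mapping *parent, struct mapping *child)
{
    if (parent->writable || parent->cow) {
        parent->writable = 0;
        parent->cow = 1;
    }
    *child = *parent;
    parent->pg->refcount++;
}

/* Models a write: "faults" on a COW page and copies before writing.
 * Returns 0 on success, -1 on a genuine protection fault or OOM. */
static int cow_write(struct mapping *m, size_t off, unsigned char value)
{
    assert(off < PAGE_BYTES);
    if (!m->writable) {
        if (!m->cow)
            return -1;                       /* read-only page: real fault */
        struct page *copy = page_new();      /* like sys_page_alloc at PFTEMP */
        if (copy == NULL)
            return -1;
        memcpy(copy->data, m->pg->data, PAGE_BYTES);
        if (--m->pg->refcount == 0)
            free(m->pg);
        m->pg = copy;                        /* like sys_page_map over addr */
        m->writable = 1;
        m->cow = 0;
    }
    m->pg->data[off] = value;
    return 0;
}
```

As in the lab's design, the model copies on every COW fault, even when the faulting side is the last one still referencing the page.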

Inter-Process communication (IPC)

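This implementation shares data in two ways:

  1. The value is passed through the read-only page, shared by all environments, that holds envs.
  2. A page of the sender is mapped onto a page of the receiver.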
    static int
    sys_ipc_try_send(envid_t envid, uint32_t value, void *srcva, unsigned perm)
    {
        // LAB 4: Your code here.
        int result;
        struct Env *e;
        pte_t *pte;
        struct PageInfo *page;

        // Any environment may send to any other: no permission check.
        if ((result = envid2env(envid, &e, 0)))
            return result;
        if ((e->env_status != ENV_NOT_RUNNABLE) || (e->env_ipc_recving == 0))
            return -E_IPC_NOT_RECV;

        // (void *)-1 is used on both sides to mean "no page".
        if (((uintptr_t)srcva != -1) && ((uintptr_t)e->env_ipc_dstva != -1)) {
            if ((uintptr_t)srcva >= UTOP || (srcva != ROUNDDOWN(srcva, PGSIZE)) ||
                ((perm & PTE_SYSCALL) != perm))
                return -E_INVAL;

            if (!(page = page_lookup(curenv->env_pgdir, srcva, &pte)))
                return -E_INVAL;
            // Never grant write access to a page the sender maps read-only.
            if (((*pte & PTE_W) == 0) && ((perm & PTE_W) == PTE_W))
                return -E_INVAL;
            if ((result = page_insert(e->env_pgdir, page, e->env_ipc_dstva, perm)))
                return result;
            e->env_ipc_perm = perm;
        } else {
            e->env_ipc_perm = 0;
        }
        e->env_ipc_recving = 0;
        e->env_ipc_from = curenv->env_id;
        e->env_ipc_value = value;

        // Wake up the receiver.
        e->env_status = ENV_RUNNABLE;
        return 0;
    }

    static int
    sys_ipc_recv(void *dstva)
    {
        // LAB 4: Your code here.
        if ((uint32_t)dstva != -1) {
            if (((uintptr_t)dstva >= UTOP) || (dstva != ROUNDDOWN(dstva, PGSIZE)))
                return -E_INVAL;
        }
        curenv->env_ipc_recving = 1;
        curenv->env_ipc_dstva = dstva;
        curenv->env_status = ENV_NOT_RUNNABLE;

        // The system call appears to return 0 when the environment is
        // next run; sys_yield() does not return here.
        curenv->env_tf.tf_regs.reg_eax = 0;
        sys_yield();
        return 0;
    }

int32_t
ipc_recv(envid_t *from_env_store, void *pg, int *perm_store)
{
    // LAB 4: Your code here.
    int result;

    if (pg == NULL)
        pg = (void *)-1;
    if ((result = sys_ipc_recv(pg)) < 0) {
        if (from_env_store != NULL)
            *from_env_store = 0;
        if (perm_store != NULL)
            *perm_store = 0;
        return result;
    }
    if (from_env_store != NULL)
        *from_env_store = thisenv->env_ipc_from;
    if (perm_store != NULL)
        *perm_store = thisenv->env_ipc_perm;
    return thisenv->env_ipc_value;
}

void
ipc_send(envid_t to_env, uint32_t val, void *pg, int perm)
{
    // LAB 4: Your code here.
    int result;

    if (pg == NULL)
        pg = (void *)-1;
    while (1) {
        result = sys_ipc_try_send(to_env, val, pg, perm);
        if (!result)
            break;
        if (result == -E_IPC_NOT_RECV) {
            sys_yield();
            continue;
        }
        panic("ipc_send: fail in send--%e\n", result);
    }
}
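The rendezvous between sender and receiver can be reduced to a small state machine. The following is a user-space model, not the JOS code: `struct mailbox` stands in for the env_ipc_* fields, `model_send` mirrors the retry loop of ipc_send with a bounded number of attempts, and the error value 7 for E_IPC_NOT_RECV is assumed from JOS's inc/error.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define E_IPC_NOT_RECV 7   /* assumed value, as in JOS inc/error.h */

/* Stand-in for the receiver's env_ipc_* fields. */
struct mailbox {
    bool recving;     /* receiver is blocked waiting for a message */
    int32_t from;     /* envid of the last sender */
    uint32_t value;   /* last value received */
};

/* Receiver side: announce willingness to receive. */
static void model_recv(struct mailbox *m)
{
    m->recving = true;
}

/* Sender side: succeeds only if the receiver is waiting. */
static int model_try_send(struct mailbox *m, int32_t from, uint32_t value)
{
    if (!m->recving)
        return -E_IPC_NOT_RECV;
    m->recving = false;
    m->from = from;
    m->value = value;
    return 0;
}

/* A "yield" during which the receiver happens to start waiting. */
static void yield_lets_receiver_run(struct mailbox *m)
{
    model_recv(m);
}

/* Retry loop in the style of ipc_send, giving up after max_tries. */
static int model_send(struct mailbox *m, int32_t from, uint32_t value,
                      void (*yield)(struct mailbox *), int max_tries)
{
    for (int i = 0; i < max_tries; i++) {
        int r = model_try_send(m, from, value);
        if (r != -E_IPC_NOT_RECV)
            return r;
        yield(m);
    }
    return -E_IPC_NOT_RECV;
}
```

Clearing `recving` on delivery is what prevents a second sender from overwriting a message the receiver has not consumed yet.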
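  • Copyright notice: unless otherwise stated, all articles on this blog are copyrighted by the author. Please credit the source when reposting!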
  • Copyrights © 2022-2024 翰青HanQi
