Xen Memory Management
[*]
All low-level memory operations go through Xen.
[*]
Guest OSes are responsible for allocating and initializing page tables (PTs) for processes, but have only read-only access to them
[*]
the guest allocates and initializes a page, then registers it with Xen to serve as the new PT
[*]
Direct writes to page tables are intercepted, validated and applied by the Xen VMM
[*]
updates can be batched into a single hypercall (reducing the cost of entering/exiting Xen)
[*]
page_info struct associated with each machine page frame
[*]
page type (none, l1, l2, l3, l4, LDT, GDT, RW)
[*]
reference count – number of references to the page
[*]
page frame can be reused only when unpinned and its reference count is zero
[*]
Each domain has a maximum and current memory allocation
[*]
max allocation is set at domain creation time and cannot be modified
[*]
PT updates
[*]
hypercall –> mmu_update()
[*]
writable page tables –> vm_assist()
[*]
Xen exists in the top 64MB (0xFC000000 – 0xFFFFFFFF) section of every guest virtual address space (TLB flush avoided when entering/leaving the hypervisor)
[*]
not accessible or remappable by guest OSes.
[*]
“fast handler” for system calls - direct access from app into guest OS, without going through Xen
[*]
must execute outside Ring 0
[*]
Each guest supports a “balloon” memory management driver, which the VMM uses to dynamically adjust the guest’s memory usage
[*]
Page fault handling
[*]
the faulting address is written into an extended stack frame on the guest OS stack; normally it would be read from a privileged processor register (CR2)
[*]
In terms of page protection, Ring1/2 are considered to be part of ‘supervisor mode’. The WP bit in CR0 controls whether read-only restrictions are respected in supervisor mode – if the bit is clear then any mapped page is writable. Xen gets around this by always setting the WP bit and disallowing updates to it. xen/arch/x86/boot/x86_32.S#153
[*]
Xen provides a domain with a list of machine frames during bootstrapping, and it is the domain’s responsibility to create the pseudo-physical address space from this
No guarantee that a domain will receive a contiguous stretch of physical memory. Most OSes do not have good support for operating in a fragmented physical address space.
[*]
Machine memory
[*]
entire amount of memory installed in the machine (physical memory)
[*]
4kB machine page frames numbered consecutively starting from 0.
[*]
Pseudo-physical memory
[*]
per-domain abstraction.
[*]
allows a guest OS to consider its memory allocation to consist of a contiguous range of physical page frames starting at physical frame 0.
[*]
machine-to-physical table
[*]
globally readable table maintained by Xen
[*]
records the mapping from machine page frames to pseudo-physical page frames
[*]
table size is proportional to the amount of RAM installed in the machine
[*]
physical-to-machine table
[*]
per-domain table which performs the inverse (physical-to-machine) mapping.
[*]
table size is proportional to the memory allocation of the given domain.
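The relationship between the two tables can be illustrated with a small toy model. This is only a sketch, not Xen code: `build_p2m` and `build_m2p` are hypothetical helpers, the frame numbers are invented, and the real machine-to-physical table is a single global array indexed by machine frame number rather than a per-domain dictionary.

```python
# Toy model of the pseudo-physical abstraction (not Xen code).
# Assumption: the domain was handed these machine frame numbers (MFNs)
# at boot; the values are invented and deliberately non-contiguous.
machine_frames = [0x1A2B, 0x0042, 0x7F10, 0x0043]


def build_p2m(mfns):
    """Physical-to-machine table: index = pseudo-physical frame, value = MFN."""
    return list(mfns)


def build_m2p(p2m):
    """Inverse (machine-to-physical) mapping for this domain's frames only."""
    return {mfn: pfn for pfn, mfn in enumerate(p2m)}


p2m = build_p2m(machine_frames)
m2p = build_m2p(p2m)

# The guest sees contiguous frames 0..3 even though the MFNs are scattered,
# and the two tables are inverses of each other.
for pfn, mfn in enumerate(p2m):
    assert m2p[mfn] == pfn
```

The list index plays the role of the pseudo-physical frame number, which is why the guest can treat its memory as starting at frame 0 regardless of which machine frames it actually received.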
(XEN) VIRTUAL MEMORY ARRANGEMENT (for DOM0)
(XEN) Loaded kernel: c0100000→c042e254
(XEN) Init. ramdisk: c042f000→c07fca00
(XEN) Phys-Mach map: c07fd000→c086e894 == 454 MB (as can be verified by: xm list)
(XEN) Start info: c086f000→c0870000
(XEN) Page tables: c0870000→c0874000 == 16 MB
(XEN) Boot stack: c0874000→c0875000
(XEN) TOTAL: c0000000→c0c00000
(XEN) ENTRY ADDRESS: c0100000
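The 454 MB annotation on the Phys-Mach map line can be reproduced from the two addresses in the log. The sketch below assumes each entry in the map is a 4-byte machine frame number and that each entry describes one 4 kB page; `p2m_entries` and `p2m_covered_mb` are names made up for this example.

```python
PAGE_SIZE = 4096   # 4 kB machine page frames
ENTRY_SIZE = 4     # assumption: one 32-bit frame number per map entry


def p2m_entries(start, end):
    """Number of entries in a phys-to-machine map occupying [start, end)."""
    return (end - start) // ENTRY_SIZE


def p2m_covered_mb(start, end):
    """Memory in whole MB described by that map, one page per entry."""
    return p2m_entries(start, end) * PAGE_SIZE // (1024 * 1024)


entries = p2m_entries(0xC07FD000, 0xC086E894)
covered = p2m_covered_mb(0xC07FD000, 0xC086E894)
print(entries, covered)
```

With the addresses from the log this gives 116261 entries, which corresponds to 454 MB of pseudo-physical memory.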
x86-32 Xen supports only guests with 2-level page tables: PGD = l2, PTE = l1.
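As an illustration of the 2-level layout, the sketch below splits a 32-bit virtual address into its l2 (PGD) index, l1 (PTE) index and page offset, using the standard non-PAE x86 split of 10 + 10 + 12 bits. The function name `split_vaddr` is made up for this example.

```python
PAGE_SHIFT = 12                    # 4 kB pages: low 12 bits are the offset
L1_BITS = 10                       # 1024 entries per l1 (PTE) table
L2_SHIFT = PAGE_SHIFT + L1_BITS    # top 10 bits select the l2 (PGD) entry


def split_vaddr(va):
    """Return (l2_index, l1_index, offset) for a 32-bit virtual address."""
    l2_index = (va >> L2_SHIFT) & 0x3FF
    l1_index = (va >> PAGE_SHIFT) & 0x3FF
    offset = va & 0xFFF
    return l2_index, l1_index, offset


# The Dom0 kernel entry address from the boot log above.
print(split_vaddr(0xC0100000))
```

For the entry address 0xC0100000 this yields l2 index 768, l1 index 256 and offset 0.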
How to intercept interrupts from guest domains
http://lists.xensource.com/archives/html/xen-devel/2006-09/msg00597.html
http://lists.xensource.com/archives/html/xen-devel/2006-09/msg00604.html
Page fault handling for Xen guests
http://lists.xensource.com/archives/html/xen-devel/2006-02/msg00263.html
show pagetable walk if guest cannot handle page
http://lists.xensource.com/archives/html/xen-devel/2006-09/msg00612.html
Memory management, mapping, paging questions...
http://lists.xensource.com/archives/html/xen-devel/2006-10/msg01151.html
Information related to shadowing
http://lists.xensource.com/archives/html/xen-devel/2006-11/msg00319.html
http://lists.xensource.com/archives/html/xen-devel/2006-11/msg00793.html
http://lists.xensource.com/archives/html/xen-devel/2006-11/msg00802.html
How to intercept memory operation in Xen
http://lists.xensource.com/archives/html/xen-devel/2006-11/msg00659.html
http://lists.xensource.com/archives/html/xen-devel/2006-11/msg00664.html
http://lists.xensource.com/archives/html/xen-devel/2006-11/msg00717.html
alert message from dom0 to domU
http://lists.xensource.com/archives/html/xen-devel/2006-12/msg00967.html
Share Memory Between DomainU and Domain0
http://lists.xensource.com/archives/html/xen-devel/2006-12/msg01008.html
Call hypercall straightly from user space
http://lists.xensource.com/archives/html/xen-devel/2006-12/msg01061.html
xen/arch/x86/traps.c#do_page_fault –> fixup_page_fault –> mm.c#ptwr_do_page_fault
xen-3.0.2-2/xen/arch/x86/setup.c#__start_xen()
 |
 +--> xen-3.0.2-2/xen/common/domain.c#domain_create()
 |     |
 |     v
 |    xen-3.0.2-2/xen/arch/x86/domain.c#arch_domain_create()
 |
 +--> xen-3.0.2-2/xen/arch/x86/domain_build.c#construct_dom0()
The Xen-ELF image vmlinux-syms-2.6.16-xen has a special '__xen_guest' section.
Xen hypercall table:
/xen-3.0.2-2/xen/arch/x86/x86_32/entry.S
#I think this is called when DOM0 attempts to create a DOMU
xen-3.0.2-2/xen/common/dom0_ops.c#do_dom0_op()
trousers-0.2.7/src/tspi/spi_tpm.c#Tspi_TPM_Quote()
|
v
trousers-0.2.7/src/tcsd_api/calltcsapi.c#TCSP_Quote()
|
v
trousers-0.2.7/src/tcsd_api/tcstp.c#TCSP_Quote_TP()
|
v
trousers-0.2.7/src/tcsd_api/tcstp.c#sendTCSDPacket()
Original: https://wiki.cs.dartmouth.edu/nihal/doku.php/xen:memory
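1. How Xen is embedded in Dom0's linear address space on x86_64
IA32 does this with the segment protection mechanism: the top 64M is the Ring-0 Xen space;
1G-64M is the kernel's Ring-1 space;
the remaining 3G goes to applications.
x86_64 has no segment protection mechanism, so page protection must be used: 2^64-2^47 --> 2^64 == kernel space
0 --> 2^47 == user space
The empty part in the middle can be put to other uses == it is used by Xen.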
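The IA32 split described above can be written down as a small classifier. This is a sketch under stated assumptions: the 0xFC000000 boundary comes from the 64 MB Xen region mentioned earlier in these notes, the 0xC0000000 boundary is inferred from the 3G/1G split and the kernel load address in the boot log, and the function name `ia32_region` is invented.

```python
XEN_START = 0xFC000000      # top 64 MB, reserved for Xen (Ring 0)
KERNEL_START = 0xC0000000   # assumed 3 GB boundary: guest kernel (Ring 1) above
ADDRESS_SPACE = 1 << 32     # 4 GB of 32-bit linear address space


def ia32_region(va):
    """Name the owner of a 32-bit linear address under the layout above."""
    if not 0 <= va < ADDRESS_SPACE:
        raise ValueError("not a 32-bit address")
    if va >= XEN_START:
        return "xen"
    if va >= KERNEL_START:
        return "guest kernel"
    return "application"


print(ia32_region(0xC0100000))                  # Dom0 kernel entry address
print((ADDRESS_SPACE - XEN_START) >> 20, "MB")  # size of the Xen region
```

The kernel region is 1G minus 64M because Xen takes its 64 MB from the top of the kernel's gigabyte.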
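2. Xen uses direct mode == the guest OS uses its own page tables to access host physical addresses (HPAs) directly
Method: page-table entries contain HPAs; the guest OS can only read its page tables; ordinary pages can be read and written directly by the guest OS.
Any attempt to update a page table raises a page fault. To update or manipulate a page table, the guest calls the corresponding hypercall.
The VMM also ensures that a guest OS can access only its own memory.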
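The idea of read-only page tables with validated, batched updates can be sketched as a toy model. Everything here is hypothetical: `ToyHypervisor` and `toy_mmu_update` are invented names, the real mmu_update() hypercall has a different interface, and validating the whole batch before applying any of it is a simplification chosen for this example.

```python
class ToyHypervisor:
    """Toy model of validated page-table updates (not the Xen API)."""

    def __init__(self, owned_mfns):
        self.owned = set(owned_mfns)   # machine frames owned by the guest
        self.page_table = {}           # virtual page number -> machine frame
        self.hypercalls = 0            # simulated hypervisor entries

    def toy_mmu_update(self, requests):
        """Validate and apply a batch of (vpn, mfn) updates in one call."""
        self.hypercalls += 1
        # Validation: the guest may only map frames it owns.
        for vpn, mfn in requests:
            if mfn not in self.owned:
                raise PermissionError(f"frame {mfn:#x} not owned by guest")
        # Only after the whole batch has been checked is it applied.
        for vpn, mfn in requests:
            self.page_table[vpn] = mfn
        return len(requests)


hv = ToyHypervisor(owned_mfns=[0x10, 0x11, 0x12])
applied = hv.toy_mmu_update([(0, 0x10), (1, 0x11), (2, 0x12)])
print(applied, hv.hypercalls)
```

Three mappings are installed with a single simulated entry into the hypervisor, which is the cost saving that batching is meant to provide.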
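How a guest OS manipulates memory:
1. The guest OS accesses a new memory address (GVA) and takes a page fault ==> the guest OS page table has to be updated
2. The guest OS first finds the GPA of the page table, and the VMM finds the HPA that corresponds to that GPA (via the P2M)
==> this amounts to a page-table update, so the page-table update hypercall is called (GPA, HPA)
3. If the child page table does not exist, that child page has to be attached
==> this amounts to a page-table attach operation, so the page-table operation hypercall is called (linear address, HPA)
4. Access that PT and repeat steps 2-3 until a GVA ==> HPA translation is obtained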
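3. Writable page tables
Because page-table operations are expensive (each one requires a hypercall), they can be improved in some cases.
The method: first detach the page table (in practice only the top-level table, the PD, needs to be detached) so that nothing else can access it, and treat it as an ordinary read-write page of the guest OS.
The guest OS modifies it freely; once many changes have been made, it is submitted through a hypercall so that the VMM completes the whole update in one go.
Precondition: PAE mode, because the PDE has only one PD page.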
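4. Balloon driver (present in both Dom0 and DomU)
Allocates and releases memory for Dom0 and DomU
Can inspect the memory status of its own domain and of the whole machine
The balloon driver automatically adjusts its domain's memory size according to the target value set in XenStore.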
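The target-driven adjustment can be sketched as a single step function. This is a simplified model, not the actual driver: `balloon_step` is an invented name, sizes are counted in pages, and clamping the target to the domain's maximum allocation is an assumption based on the per-domain maximum described earlier in these notes.

```python
def balloon_step(current_pages, target_pages, max_pages):
    """Return (new_current, delta) after moving towards the target.

    delta > 0: the guest asks Xen for that many more pages.
    delta < 0: the guest hands that many pages back to Xen.
    """
    # Assumption: a domain can never grow past its maximum allocation.
    target = max(0, min(target_pages, max_pages))
    return target, target - current_pages


# Target lowered below the current size: the guest releases 200 pages.
print(balloon_step(current_pages=1000, target_pages=800, max_pages=2000))
# Target above the maximum: growth stops at the maximum allocation.
print(balloon_step(current_pages=1000, target_pages=3000, max_pages=2000))
```

In this model the target is the value that would be read from XenStore, and the sign of the returned delta says in which direction pages move between the guest and Xen.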
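5. How shared pages are implemented
The Start Info Page (including its contents) is assembled by the VMM when the domain is initialized; its contents include the Shared Info Page and the XenStore connection. One of the first things a domain does on entry is to use a page-table update to map its Shared Info Page onto the real page that the VMM has already allocated and recorded in the Start Info Page.
HVM PV drivers (mainly) of course also use a Shared Info Page; theirs is assembled by the drivers themselves.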
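4. Wouldn't it be good for Dom0 to use VT-x as well? Is it used?
No. Paravirtualization does not need VT-x; the aim is to improve system performance.
5. What is PAE mode, and what effect does it have
Physical Address Extension (PAE) allows up to 64 GB of physical memory to be used as regular 4 KB pages, and extends the number of bits the kernel can use for physical memory addresses from 32 to 36.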
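The 64 GB figure follows directly from the wider physical address. A short check, with helper names invented for this example:

```python
PAGE_SIZE = 4096  # regular 4 KB pages


def addressable_gib(address_bits):
    """Physical memory in GiB reachable with the given number of address bits."""
    return (1 << address_bits) // (1 << 30)


def page_frames(address_bits):
    """Number of 4 KB page frames in that physical address space."""
    return (1 << address_bits) // PAGE_SIZE


print(addressable_gib(32), addressable_gib(36))
print(page_frames(36))
```

Going from 32 to 36 address bits multiplies the reachable physical memory by 16, from 4 GB to 64 GB, while the page size stays at 4 KB.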
Dom0 uses shadow page tables only during migration; at all other times it accesses physical memory directly.
Notes:
gpfn/gfn: guest page frame number (the guest OS uses gpfn/gfn to address the guest physical address space)
mfn: machine page frame number
smfn: machine page frame number for shadow pages (the machine frame that holds a shadow page)
l1e: level 1 page table entry
gl1e: level 1 guest page table entry
sl1e: level 1 shadow page table entry
PV: para-virtualization
HVM: Hardware-assisted Virtual Machine