1. About PI (Past Image)
Reposted from:
Oracle RAC Concept of Past Image (PI)
http://www.remote-dba.net/t_rac_concept_past_image_pi.htm
The past image concept was introduced in the RAC version of Oracle 9i to maintain data integrity. In an Oracle database, a typical data block is not written to disk immediately, even after it is dirtied. When the same dirty data block is requested by another instance for write or read purposes, an image of the block is created at the owning instance, and the block itself is shipped to the requesting instance. This backup image of the block is called the past image (PI) and is kept in memory. In the event of failure, Oracle can reconstruct the current version of the block by reading PIs. More than one past image can exist in memory, depending on how many times the data block was requested while it was dirty.
A past image copy of the data block is different from a CR block, which is needed for reconstructing a read-consistent image. A CR version of a block represents a consistent snapshot of the data at a point in time. It is constructed by applying information from the undo/rollback segments. The PI image copy helps the recovery process and aids in maintaining data integrity.
For more on CR blocks, see my blog post:
Notes on CR (consistent read) block creation
http://blog.csdn.net/tianlesoftware/archive/2011/06/07/6529401.aspx
For example, suppose user A of Instance 1 has updated row 2 on block 5. Later, user B of Instance 2 intends to update row 6 on the same block 5. The GCS transfers block 5 from Instance 1 to Instance 2. At this point, the past image (PI) for block 5 is created on Instance 1.
Lock Modes
From the examination of resource roles, resource modes, and past images, the next step is to consider the possible resource access modes as shown in Table 2.2.
Three characters distinguish lock or block access modes. The first character represents the lock mode, the second represents the lock role, and the third (a number) indicates whether the local instance holds a past image for the lock (0 means none, 1 means a past image exists).
-- Meaning of each LOCK MODE value:
LOCK MODE   DESCRIPTION
---------   -----------
NL0         Null Local, no past image
SL0         Shared Local, no past image
XL0         Exclusive Local, no past image
NG0         Null Global: instance owns the current block image
SG0         Shared Global: instance owns the current block image
XG0         Exclusive Global: instance owns the current block image
NG1         Null Global: instance owns a past image
SG1         Shared Global: instance owns a past image
XG1         Exclusive Global: instance owns a past image
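The three-character naming scheme described above can be decoded mechanically. Here is a minimal Python sketch of that decoding; `decode_lock_mode` and the dictionary layout are my own hypothetical names, not anything Oracle exposes, and I assume only the nine codes from the table are valid.

```python
# Hypothetical helper: decodes the three-character lock mode codes
# from the table (mode letter, role letter, past-image digit).
MODES = {"N": "Null", "S": "Shared", "X": "Exclusive"}
ROLES = {"L": "Local", "G": "Global"}


def decode_lock_mode(code: str) -> dict:
    """Split a code such as 'XG1' into mode, role and past-image flag."""
    if (
        len(code) != 3
        or code[0] not in MODES
        or code[1] not in ROLES
        or code[2] not in "01"
    ):
        raise ValueError(f"unrecognized lock mode: {code!r}")
    if code[1] == "L" and code[2] == "1":
        # The table lists no local-role code with a past image.
        raise ValueError(f"local role cannot hold a past image: {code!r}")
    return {
        "mode": MODES[code[0]],
        "role": ROLES[code[1]],
        "has_past_image": code[2] == "1",
    }


if __name__ == "__main__":
    for code in ("NL0", "SG0", "XG1"):
        print(code, decode_lock_mode(code))
```

Rejecting codes such as `XL1` follows the table, where a past image only ever appears together with the global role.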
When a block is brought into the local cache of an instance, it is acquired with the local role. But if a dirty buffer for the same data block is present in a remote instance, a past image is created in the remote instance before the data block is sent to the requesting instance’s cache. Therefore, the data block resource acquires a global role.
For recovery purposes, instances that have past images will keep those past images in their buffer cache until the master instance prompts the lock to release them. When the buffers are discarded, the instance holding the past image will write a block written redo (BWR) to the redo stream. The BWR indicates that the block has already been written to disk and is not needed for recovery by the instance. Buffers are discarded when the disk write is initiated on the master instance. The master instance is where the current status and position of the data block is maintained.
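The discard step can be pictured with a tiny Python model. This is only a sketch of the sequence described in the paragraph above (disk write initiated, past images discarded, BWR written to each holder's redo stream); the `Instance` class, the `on_block_written` function and the tuple-shaped redo records are all hypothetical and say nothing about Oracle's real redo format.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """Toy stand-in for a RAC instance (hypothetical, for illustration only)."""
    name: str
    past_images: set = field(default_factory=set)  # (file#, block#) pairs held as PI
    redo: list = field(default_factory=list)       # simplified redo stream


def on_block_written(block, instances):
    """Once the block is on disk, every PI holder discards its PI and logs a BWR."""
    for inst in instances:
        if block in inst.past_images:
            inst.past_images.remove(block)
            # The BWR marks the block as already written, so this
            # instance's recovery no longer needs it.
            inst.redo.append(("BWR", block))


if __name__ == "__main__":
    rac1 = Instance("RAC1", past_images={(5, 88)})
    rac2 = Instance("RAC2")
    on_block_written((5, 88), [rac1, rac2])
    print(rac1.redo, rac2.redo)
```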
This has been a review of how a GCS resource maintains its access mode and its role. There is another feature called the buffer state, which is covered in the next section.
# Observation 1: the ball (the current copy of the block) is not on any node.
SYS@RAC1//scripts> select inst_id,status from gv$bh where file#=5 and block#=88;
no rows selected
# Node 1 asks for the ball.
SYS@RAC1//scripts> update hr.employees set salary=1 where employee_id=100;
1 row updated.
# Observation 2: the ball is on node 1. xcur is the current buffer acquired for a write, i.e. the exclusive current block.
SYS@RAC1//scripts> select inst_id,dirty,status from gv$bh where file#=5 and block#=88;
INST_ID + D + STATUS
---------- + - + -------
1 + Y + xcur
1 row selected.
# Node 2 asks for the ball.
SYS@RAC2//scripts> update hr.employees set salary=2 where employee_id=101;
1 row updated.
# Observation 3: the ball is on node 2 and a past image is on node 1. pi stands for Past Image: it preserves the block as it looked after the previous change.
SYS@RAC1//scripts> select inst_id,dirty,status from gv$bh where file#=5 and block#=88;
INST_ID + D + STATUS
---------- + - + -------
1 + Y + pi
2 + Y + xcur
2 rows selected.
# Node 1 asks for the ball.
SYS@RAC1//scripts> update hr.employees set salary=3 where employee_id=100;
1 row updated.
# Observation 4: the ball is on node 1, and past images exist on both node 2 and node 1. The past image on node 2 is newer than the one on node 1.
SYS@RAC1//scripts> select inst_id,dirty,status from gv$bh where file#=5 and block#=88;
INST_ID + D + STATUS
---------- + - + -------
1 + Y + pi
1 + Y + xcur
2 + Y + pi
3 rows selected.
# Node 2 asks for the ball again.
SYS@RAC2//scripts> update hr.employees set salary=4 where employee_id=101;
1 row updated.
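# Observation 5: the ball is on node 2. Unluckily, an incremental checkpoint happened right at this moment: DBWR woke up and decided to do some work, and the past images (pi) turned into stale consistent-read buffers (cr), which are free to be overwritten. Annoyingly, the past images I worked so hard to produce are all gone.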
SYS@RAC1//scripts> select inst_id,dirty,status from gv$bh where file#=5 and block#=88;
INST_ID + D + STATUS
---------- + - + -------
1 + N + cr
1 + N + cr
2 + Y + xcur
3 rows selected.
# Start over. Node 1 asks for the ball.
SYS@RAC1//scripts> update hr.employees set salary=5 where employee_id=100;
1 row updated.
# Observation 6: the ball is on node 1 and a past image is on node 2. Ignore the stale consistent-read buffers; they can disappear at any time.
SYS@RAC1//scripts> select inst_id,dirty,status from gv$bh where file#=5 and block#=88;
INST_ID + D + STATUS
---------- + - + -------
1 + Y + xcur
1 + N + cr
1 + N + cr
2 + Y + pi
4 rows selected.
# Node 2 asks for the ball.
SYS@RAC2//scripts> update hr.employees set salary=6 where employee_id=101;
1 row updated.
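# Observation 7: the ball is on node 2, and past images are on nodes 1 and 2. The past image on node 1 is newer than the one on node 2. Compare with observation 4, which looks very similar: past images appear on both nodes.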
SYS@RAC1//scripts> select inst_id,dirty,status from gv$bh where file#=5 and block#=88;
INST_ID + D + STATUS
---------- + - + -------
1 + Y + pi
2 + Y + pi
2 + Y + xcur
3 rows selected.
# Node 1 asks for the ball.
SYS@RAC1//scripts> update hr.employees set salary=7 where employee_id=100;
1 row updated.
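# Observation 8: the ball is on node 1, and past images are on nodes 2 and 1. The past image on node 2 is newer than the one on node 1. The past image that was previously on node 2 has become a stale consistent-read buffer, so the rule that each instance holds at most one past image per data block is not broken.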
SYS@RAC1//scripts> select inst_id,dirty,status from gv$bh where file#=5 and block#=88;
INST_ID + D + STATUS
---------- + - + -------
1 + Y + pi
1 + Y + xcur
2 + Y + pi
2 + N + cr
4 rows selected.
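To double-check my reading of observations 2 through 8, here is a toy Python model of the ping-pong. It is a sketch inferred purely from the gv$bh output above, not a description of how GCS is implemented; `BlockTracker`, `update`, `checkpoint` and `rows` are hypothetical names. It tracks only xcur and pi and deliberately ignores cr buffers, since those can vanish at any time.

```python
class BlockTracker:
    """Toy model of one data block's xcur/pi buffers across RAC instances."""

    def __init__(self, instances):
        # A set per instance enforces "at most one pi per instance per block".
        self.state = {inst: set() for inst in instances}

    def update(self, inst):
        """Instance `inst` modifies the block, so it needs the current copy."""
        holder = next((i for i, s in self.state.items() if "xcur" in s), None)
        if holder is not None and holder != inst:
            # The dirty current buffer on the old holder becomes its past image.
            self.state[holder].discard("xcur")
            self.state[holder].add("pi")
        self.state[inst].add("xcur")

    def checkpoint(self):
        """After the block is written to disk, past images are no longer needed."""
        for statuses in self.state.values():
            statuses.discard("pi")

    def rows(self):
        """Return (inst_id, status) pairs, ordered like the query output."""
        return sorted((i, st) for i, s in self.state.items() for st in s)


if __name__ == "__main__":
    block = BlockTracker([1, 2])
    for step in (1, 2, 1):          # node 1, node 2, node 1 ask for the ball
        block.update(step)
        print(block.rows())
```

Replaying the full sequence from the transcript (including a `checkpoint()` call after the fourth update) gives the same pi and xcur rows as each observation, which is at least consistent with the one-PI-per-instance rule noted in observation 8.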
Session A (on node 1):
SYS@RAC1//scripts> run
1 begin
2 for i in 1..100000 loop
3 update hr.employees set salary=i where employee_id=100;
4 end loop;
5* end;
Session B (on node 2):
SYS@RAC2//scripts> run
1 begin
2 for i in 1..100000 loop
3 update hr.employees set salary=i where employee_id=101;
4 end loop;
5* end;
After both loops finish, check how many buffers block 88 of file 5 occupies in the buffer cache:
SYS@RAC2//scripts> select count(*) from gv$bh where file#=5 and block#=88;
COUNT(*)
----------
412
1 row selected.
409 of them are consistent-read buffers (cr):
SYS@RAC2//scripts> select count(*) from gv$bh where file#=5 and block#=88 and status='cr';
COUNT(*)
----------
409
1 row selected.
1 is the exclusive current buffer (xcur):
SYS@RAC2//scripts> select count(*) from gv$bh where file#=5 and block#=88 and status='xcur';
COUNT(*)
----------
1
1 row selected.
And finally... 2 of our protagonists, the past image buffers (pi), one on each node.
SYS@RAC1//scripts> select inst_id,dirty,status from gv$bh where file#=5 and block#=88 and status='pi';
INST_ID + D + STATUS
---------- + - + -------
1 + Y + pi
2 + Y + pi
2 rows selected.
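If you save this kind of output to a file, a few lines of Python can tally the buffers per instance and status instead of running three separate counts. This is just a convenience sketch: `tally_statuses` is a hypothetical name, and it assumes the ` + ` column separator and the three-column layout (INST_ID, D, STATUS) shown in my transcript.

```python
from collections import Counter


def tally_statuses(output: str) -> Counter:
    """Count (inst_id, status) pairs in SQL*Plus output using ' + ' separators."""
    counts = Counter()
    for line in output.splitlines():
        parts = [p.strip() for p in line.split("+")]
        # Data rows have three columns and a numeric INST_ID; the header,
        # the dashed ruler and the "rows selected" footer are skipped.
        if len(parts) == 3 and parts[0].isdigit():
            counts[(int(parts[0]), parts[2])] += 1
    return counts


if __name__ == "__main__":
    sample = """\
INST_ID + D + STATUS
---------- + - + -------
1 + Y + pi
2 + Y + pi
2 rows selected.
"""
    print(tally_statuses(sample))
    # The three counts above should add up to the total: 409 cr + 1 xcur + 2 pi.
    assert 409 + 1 + 2 == 412
```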