Oracle RAC Past Image (PI) Explained

oqnazjhiqn · Posted on 2016-7-28 07:40:08

I. About PI
Reposted from:
Oracle RAC Concept of Past Image (PI)
http://www.remote-dba.net/t_rac_concept_past_image_pi.htm

The past image concept was introduced in the RAC version of Oracle 9i to maintain data integrity. In an Oracle database, a typical data block is not written to the disk immediately, even after it is dirtied. When the same dirty data block is requested by another instance for write or read purposes, an image of the block is created at the owning instance, and only that block is shipped to the requesting instance. This backup image of the block is called the past image (PI) and is kept in memory. In the event of failure, Oracle can reconstruct the current version of the block by reading PIs. It is also possible to have more than one past image in the memory depending on how many times the data block was requested in the dirty stage.

A past image copy of the data block is different from a CR block, which is needed for reconstructing a read-consistent image. A CR version of a block represents a consistent snapshot of the data at a point in time. It is constructed by applying information from the undo/rollback segments. The PI image copy helps the recovery process and aids in maintaining data integrity.
For more on CR blocks, see my blog:
Notes on creating CR (consistent read) blocks
http://blog.csdn.net/tianlesoftware/archive/2011/06/07/6529401.aspx

For example, suppose user A of Instance 1 has updated row 2 on block 5. Later, user B of Instance 2 intends to update row 6 on the same block 5. The GCS transfers block 5 from Instance 1 to Instance 2. At this point, the past image (PI) for block 5 is created on Instance 1.

Lock Modes

From the examination of resource roles, resource modes, and past images, the next step is to consider the possible resource access modes as shown in Table 2.2.
There are three characters that distinguish lock or block access modes. The first letter represents the lock mode, the second character represents the lock role, and the third character (a number) indicates any past images for the lock in the local instance.
-- The table below explains what each LOCK MODE code means.

LOCK MODE	DESCRIPTION
NL0	Null Local and No past images
SL0	Shared Local with no past image
XL0	Exclusive Local with no past image
NG0	Null Global – Instance owns current block image
SG0	Global Shared Lock – Instance owns current image
XG0	Global Exclusive Lock – Instance owns current image
NG1	Global Null – Instance owns the past image block
SG1	Shared Global – Instance owns past image
XG1	Global Exclusive Lock – Instance owns past image.
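
The three-character naming scheme described above can be unpacked mechanically. The Python sketch below only illustrates that scheme. The function name `decode_lock_mode` and the returned dictionary layout are invented for this example and are not part of any Oracle interface. It also assumes, based on the table, that the third character is always 0 or 1.

```python
# Toy decoder for the three-character lock mode codes in the table above.
# Names and return format are hypothetical; this is not an Oracle API.

MODES = {"N": "null", "S": "shared", "X": "exclusive"}  # first character
ROLES = {"L": "local", "G": "global"}                   # second character


def decode_lock_mode(code):
    """Split a code such as 'XG1' into mode, role and past-image flag."""
    if (
        len(code) != 3
        or code[0] not in MODES
        or code[1] not in ROLES
        or code[2] not in ("0", "1")  # assumption: only 0 and 1 occur
    ):
        raise ValueError(f"unrecognized lock mode code: {code!r}")
    return {
        "mode": MODES[code[0]],
        "role": ROLES[code[1]],
        "past_image": code[2] == "1",
    }


if __name__ == "__main__":
    for code in ("NL0", "SG0", "XG1"):
        print(code, decode_lock_mode(code))
```

For instance, `XG1` decodes to an exclusive mode, a global role, and a past image held by the local instance.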

When a block is brought into the local cache of an instance, it is acquired with the local role. But if a dirty buffer for the same data block is present in a remote instance, a past image is created in the remote instance before the data block is sent to the requesting instance’s cache. Therefore, the data block resource acquires a global role.
For recovery purposes, instances that have past images will keep those past images in their buffer cache until the master instance prompts the lock to release them. When the buffers are discarded, the instance holding the past image will write a block written redo (BWR) to the redo stream. The BWR indicates that the block has already been written to disk and is not needed for recovery by the instance. Buffers are discarded when the disk write is initiated on the master instance. The master instance is where the current status and position of the data block is maintained.
This has been a review of how a GCS resource maintains its access mode and its role. There is another feature called the buffer state, which is covered in the next section.

II. PI Example

Reposted from: http://blogs.oracle.com/toddbao/entry/past_imagepi

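A Past Image is a state of a dirty buffer in a RAC environment. It is an indirect result of different instances in the cluster writing to the same data buffer one after another. In short, a Past Image is a special kind of dirty block that preserves what the block looked like after the previous change. For any given block, each instance can hold at most one Past Image. A PI is also sometimes called an afterimage. Past Images are easy to observe when instances contend for and modify a hot block.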

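The setup is as follows: the rows for employees 100 and 101 in HR.EMPLOYEES both reside in block 88 of data file 5.
A data block can hold multiple rows. You can dump a block and inspect the resulting trace file, for example: alter system dump datafile 4 block 32;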

There is an example on my blog:
Notes on Oracle rdba and dba
http://blog.csdn.net/tianlesoftware/archive/2011/06/07/6529346.aspx

Let us call this data block the ball and have the two instances fight over it.

# Observation 1: the ball is not on any node.
SYS@RAC1//scripts> select inst_id,status from gv$bh where file#=5 and block#=88;
no rows selected

# Node 1 asks for the ball.
SYS@RAC1//scripts> update hr.employees set salary=1 where employee_id=100;
1 row updated.

# Observation 2: the ball is on node 1. xcur is the current buffer acquired for writing, i.e. the exclusive current block.
SYS@RAC1//scripts> select inst_id,dirty,status from gv$bh where file#=5 and block#=88;
INST_ID + D + STATUS
---------- + - + -------
1 + Y + xcur
1 row selected.

# Node 2 asks for the ball.
SYS@RAC2//scripts> update hr.employees set salary=2 where employee_id=101;
1 row updated.

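# Observation 3: the ball is on node 2 and a past image is on node 1. pi stands for Past Image; it preserves what the block looked like after the previous change.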
SYS@RAC1//scripts> select inst_id,dirty,status from gv$bh where file#=5 and block#=88;
INST_ID + D + STATUS
---------- + - + -------
1 + Y + pi
2 + Y + xcur
2 rows selected.

# Node 1 asks for the ball.
SYS@RAC1//scripts> update hr.employees set salary=3 where employee_id=100;
1 row updated.

# Observation 4: the ball is on node 1, and past images exist on both node 2 and node 1. The past image on node 2 is newer than the one on node 1.
SYS@RAC1//scripts> select inst_id,dirty,status from gv$bh where file#=5 and block#=88;
INST_ID + D + STATUS
---------- + - + -------
1 + Y + pi
1 + Y + xcur
2 + Y + pi
3 rows selected.

# Node 2 asks for the ball again.
SYS@RAC2//scripts> update hr.employees set salary=4 where employee_id=101;
1 row updated.

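# Observation 5: the ball is on node 2. Unluckily, an incremental checkpoint fired at this moment: DBWR woke up and went to work, and the past images (pi) turned into stale consistent-read buffers (cr), which can be freely overwritten. Annoyingly, the past images I worked so hard to create are all gone.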
SYS@RAC1//scripts> select inst_id,dirty,status from gv$bh where file#=5 and block#=88;
INST_ID + D + STATUS
---------- + - + -------
1 + N + cr
1 + N + cr
2 + Y + xcur
3 rows selected.

# Start over. Node 1 asks for the ball.
SYS@RAC1//scripts> update hr.employees set salary=5 where employee_id=100;
1 row updated.

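# Observation 6: the ball is on node 1 and a past image is on node 2. The stale consistent-read buffers can be ignored; they may disappear at any time.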
SYS@RAC1//scripts> select inst_id,dirty,status from gv$bh where file#=5 and block#=88;
INST_ID + D + STATUS
---------- + - + -------
1 + Y + xcur
1 + N + cr
1 + N + cr
2 + Y + pi
4 rows selected.

# Node 2 asks for the ball.
SYS@RAC2//scripts> update hr.employees set salary=6 where employee_id=101;
1 row updated.

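# Observation 7: the ball is on node 2, and past images are on nodes 1 and 2. The past image on node 1 is newer than the one on node 2. Compare with observation 4, which is very similar: past images appear on both nodes.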
SYS@RAC1//scripts> select inst_id,dirty,status from gv$bh where file#=5 and block#=88;
INST_ID + D + STATUS
---------- + - + -------
1 + Y + pi
2 + Y + pi
2 + Y + xcur
3 rows selected.

# Node 1 asks for the ball.
SYS@RAC1//scripts> update hr.employees set salary=7 where employee_id=100;
1 row updated.

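# Observation 8: the ball is on node 1, and past images are on nodes 2 and 1. The past image on node 2 is newer than the one on node 1. The past image that was previously on node 2 has become a stale consistent-read buffer, so the rule that each instance holds at most one past image per data block is not violated.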
SYS@RAC1//scripts> select inst_id,dirty,status from gv$bh where file#=5 and block#=88;
INST_ID + D + STATUS
---------- + - + -------
1 + Y + pi
1 + Y + xcur
2 + Y + pi
2 + N + cr
4 rows selected.

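Ignoring checkpoints, when the xcur buffer moves to another node, the xcur buffer on the original node becomes a pi buffer, and that node's previous pi buffer (if any) becomes a cr buffer. As a result, cr buffers keep accumulating, while the number of pi buffers never exceeds the number of nodes.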
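
The transition rule above can be replayed with a small model. This Python sketch is a toy simulation, not Oracle's implementation: the function names `request_current` and `snapshot` are hypothetical, each buffer is reduced to its status string, and checkpoints and buffer aging are ignored.

```python
# Toy model of xcur -> pi -> cr transitions for a single data block.
# Hypothetical names; checkpoints and buffer aging are deliberately ignored.


def request_current(cache, requester):
    """Move the exclusive current (xcur) buffer of the block to `requester`.

    `cache` maps an instance id to the list of buffer statuses it holds
    for this one block.
    """
    holder = next((inst for inst, bufs in cache.items() if "xcur" in bufs), None)
    if holder == requester:
        return cache  # the requester already holds the current buffer
    if holder is not None:
        # The holder's old pi (if any) is demoted to cr ...
        bufs = ["cr" if status == "pi" else status for status in cache[holder]]
        # ... and its xcur buffer becomes the new pi.
        bufs[bufs.index("xcur")] = "pi"
        cache[holder] = bufs
    cache.setdefault(requester, []).append("xcur")
    return cache


def snapshot(cache):
    """Return sorted (instance, status) pairs, similar to the gv$bh output."""
    return sorted((inst, status) for inst, bufs in cache.items() for status in bufs)


if __name__ == "__main__":
    cache = {}
    for node in (1, 2, 1, 2):  # the nodes take turns asking for the ball
        request_current(cache, node)
        print(f"node {node} asks:", snapshot(cache))
```

After requests from nodes 1, 2 and 1, the model holds pi and xcur on node 1 and pi on node 2, which matches observation 4. No instance ever holds more than one pi in this model.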

Next, let the two nodes play a round of pinball: run anonymous blocks A and B on the two nodes at the same time.

A:
SYS@RAC1//scripts> run
begin
  for i in 1..100000 loop
    update hr.employees set salary=i where employee_id=100;
  end loop;
end;

B:
SYS@RAC2//scripts> run
begin
  for i in 1..100000 loop
    update hr.employees set salary=i where employee_id=101;
  end loop;
end;

After both blocks finish, check how many buffers block 88 of file 5 occupies in the buffer cache:
SYS@RAC2//scripts> select count(*) from gv$bh where file#=5 and block#=88;
COUNT(*)
----------
412
1 row selected.

Of these, 409 are consistent-read buffers (cr):
SYS@RAC2//scripts> select count(*) from gv$bh where file#=5 and block#=88 and status='cr';
COUNT(*)
----------
409
1 row selected.

1 is the exclusive current buffer (xcur):
SYS@RAC2//scripts> select count(*) from gv$bh where file#=5 and block#=88 and status='xcur';
COUNT(*)
----------
1
1 row selected.

And the remaining 2 are our protagonists, the past image buffers (pi), one on each node.
SYS@RAC1//scripts> select inst_id,dirty,status from gv$bh where file#=5 and block#=88 and status='pi';
INST_ID + D + STATUS
---------- + - + -------
1 + Y + pi
2 + Y + pi
2 rows selected.

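Sometimes, when you find a large number of consistent-read buffers (cr) in a RAC environment, what you are seeing may be the traces of instances fighting over a hot block. Each of those buffers went from xcur to pi and then to cr.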

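A PI serves at least two purposes:
First, when needed, a node can build a cr buffer from its local pi buffer instead of requesting the cr block from another node.
Second, when the instance holding the xcur buffer crashes, a pi buffer can be converted back into an xcur buffer, which speeds up instance recovery.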

-------------------------------------------------------------------------------------------------------
Blog： http://blog.csdn.net/tianlesoftware
Email: dvd.dba@gmail.com