Tablespace backup

bluekey · posted 2002-7-22 01:16

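I accept that datafiles are indeed still written while in backup mode, and that the copied files are inconsistent. What I don't understand is this: in that case, what is the point of entering backup mode at all? Why not simply ocopy the file without backup mode and roll forward with archived logs at recovery time; wouldn't that be the same? And if the datafiles keep being written in backup mode, why write block images to the log as well? Why did Oracle design it this way?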

rejoice999 · posted 2002-7-22 03:08

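Good question, bluekey.
Oracle must have its reasons for doing it this way. It does a lot of work during an online hot backup, and the reality is far more complicated than we might imagine. Please see the article on hot backup below, paying particular attention to "fractured block" and "before-image logging".
Everyone knows that more redo is generated during a hot backup than in normal operation, but the reason is not, as some have said, that "all changed data blocks are written to the redo log". In fact, both in normal operation and during a hot backup, LGWR has to write data block changes to the redo log; how else could recovery work? As I currently understand it, the extra redo comes mainly from "before-image logging".
Why is "before-image logging" needed? No block in the copied file is guaranteed to be internally consistent (while the front half of a block is being copied, the back half may already have been updated by DBWR). So after the checkpoint triggered by begin backup, before any block is modified for the first time, Oracle writes an image of that block to the redo log. Later, during media recovery, if a before-image of a data block is found in the redo log, that before-image is first written into the datafile, overwriting the image of the block that was restored from the backup (because that block may be a "fractured block"), and the changes to the block are then applied on top of it.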
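The recovery sequence described above can be sketched with a toy model. This is only an illustration under simplifying assumptions: a block is a pair of strings (front half, back half), a change appends a suffix to each half, and the names apply_change and recover are made up for this sketch, not Oracle internals.

```python
def apply_change(block, change):
    """Apply a toy change vector: append a suffix to each half of the block."""
    front, back = block
    return (front + change[0], back + change[1])


def recover(restored_block, redo):
    """Replay redo for one block. A logged full-block image replaces the
    restored copy; change vectors are applied on top of whatever is there."""
    block = restored_block
    for kind, payload in redo:
        if kind == "image":
            block = payload
        else:
            block = apply_change(block, payload)
    return block


before_image = ("A", "B")                      # block as of begin backup
change = ("x", "y")                            # first change during the backup
current = apply_change(before_image, change)   # what the live datafile holds

# The copy utility read the front half before the update and the back half after.
fractured = (before_image[0], current[1])

without_image = recover(fractured, [("change", change)])
with_image = recover(fractured, [("image", before_image), ("change", change)])

print(without_image)   # back half receives the change twice
print(with_image)      # matches the live block
```

In this model, replaying only the change vector on the fractured copy applies the change twice to the back half, while overwriting the block with the logged before-image first gives the same result as the live block.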
OK, discussion welcome.

TECH Internals of Recovery
4. Hot Backup

A hot backup is a copy of a datafile that is taken while the file is in
active use. Datafile writes (by DBWR) go on as usual during the
time the backup is being copied. Thus, the backup gets a "fuzzy"
copy of the datafile:

* Some blocks may be ahead in time versus other blocks of the copy.
* Some blocks of the copy may be ahead of the checkpoint SCN in the
file header of the copy.
* Some blocks may contain updates that constitute breakage of
the redo record atomicity guarantee with respect to other
blocks in this or other datafiles.
* Some block copies may be "fractured" (due to front and back
halves being copied at different times, with an intervening
update to the block on disk).

The "hotbackup-fuzzy" copy is unusable without "focusing" (via
the redo log) that occurs when the backup is restored and
undergoes media recovery. Media recovery applies redo (from all
threads) from the begin-backup checkpoint SCN (see Step 2. in
Section 4.1) through the end-point of the recovery operation (either
complete or incomplete). The result is a transaction-consistent
"focused" version of the datafile.

There are three steps to taking a hot backup:

* Execute the ALTER TABLESPACE ... BEGIN BACKUP command.
* Use an operating system copy utility to copy the constituent
datafiles of the tablespace(s).
* Execute the ALTER TABLESPACE ... END BACKUP command.
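As a rough sketch of the three steps, the helper below only builds the ordered list of commands and runs nothing. The function name hot_backup_plan, the tablespace name, the paths and the choice of cp as the copy utility are all placeholders chosen for illustration.

```python
def hot_backup_plan(tablespace, datafiles, backup_dir):
    """Return the ordered steps for a hot backup of one tablespace.
    SQL statements and OS copy commands are returned as plain strings."""
    steps = [f"ALTER TABLESPACE {tablespace} BEGIN BACKUP;"]
    for path in datafiles:
        # Placeholder OS copy; any copy utility external to Oracle would do.
        steps.append(f"cp {path} {backup_dir}/")
    steps.append(f"ALTER TABLESPACE {tablespace} END BACKUP;")
    return steps


plan = hot_backup_plan("users", ["/u01/oradata/users01.dbf"], "/backup")
for step in plan:
    print(step)
```

The point of the ordering is that every datafile copy falls between the BEGIN BACKUP and END BACKUP statements for its tablespace.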

4.1  BEGIN BACKUP

The BEGIN BACKUP command takes the following actions (not
necessarily in the listed order) for each datafile of the tablespace:

1. It sets a flag in the datafile header - the hotbackup-fuzzy bit
- to indicate that the file is in hot backup. The header with
this flag set (copied by the copy utility) enables the copy to be
recognized as a hot backup. A further purpose of this flag in
the online file header is to cause the checkpoint in the file
header to be "frozen" at the begin-backup checkpoint value
that will be set in Step 4. This is the value that it must have in
the backup copy in order to ensure that, when the backup is
recovered, media recovery will start redo application at a
sufficiently early checkpoint SCN so as to cover all changes to the
file in all threads since the execution of BEGIN BACKUP (see
6.5). Since we cannot guarantee that the file header will be the
first block to be written out by the copy utility, it is important
that the file header checkpoint structure remain "frozen" until
END BACKUP time. This flag keeps the datafile checkpoint
structure "frozen" during hot backup, preventing it (and the
checkpoint SCN in the datafile's controlfile record) from being
updated during thread checkpoint events that advance the
database checkpoint. New in v7.2: While the file is in hot
backup, a new "backup" checkpoint structure in the datafile
header receives the updates that the "frozen" checkpoint
would have received.

2. It executes a datafile checkpoint, capturing the resultant
"begin-backup" checkpoint information, including the begin-
backup checkpoint SCN. When the file is checkpointed, all
instances are requested to write out all dirty buffers they have
for the file. If the need for instance recovery is detected at this
time, the file checkpoint operation waits until it is completed
before proceeding. Checkpointing the file at begin-backup
time ensures that only file blocks changed after begin-backup
time might have been written to disk during the course of the
file copy. This guarantee is crucial to enabling block before-
image logging to cope with the fractured block problem, as
described in Step 3.

3. [Platform-dependent option]: It starts block before-image
logging for the file. During block before-image logging, all
instances log a full block before-image to the redo log prior to
the first change to each block of the file (since the backup
started, or since the block was read anew into the buffer
cache). This is to forestall a recovery problem that would arise
if the backup were to contain a fractured block copy
(mismatched halves). This could happen if (the database block size
is greater than the operating system block size, and) the front
and back halves of the block were copied to the backup at
different times - with an intervening update to the block on
disk. In this eventuality, recovery can reconstruct the block
using the logged block before-image.

4. It sets the checkpoint in the file header equal to the
begin-backup checkpoint captured in Step 2. This file header
checkpoint will be "frozen" until END BACKUP is executed.

5. It clears the file's online-fuzzy bit. The online-fuzzy bit
remains clear during the course of the file copy operation, thus
ensuring a cleared online-fuzzy bit in the file copy. Note that
the online-fuzzy bit is set again by the execution of END
BACKUP.

4.2  File Copy

The file copy is done by utilities that are not part of Oracle. The
presumption is that the platform vendor will have backup facilities
that are superior to any portable facility that we could develop. It is
the responsibility of the administrator to ensure that copies are only
taken between the BEGIN BACKUP and END BACKUP
commands, or when the file is not in use.

4.3  END BACKUP

The END BACKUP command takes the following actions for each
datafile of the tablespace:

1. It restores (i.e. sets) the file's online-fuzzy bit.

2. It creates an end-backup redo record (end-backup "marker")
for the datafile. This record, interpreted only by media
recovery, contains the begin-backup checkpoint SCN (i.e. the SCN
matching that in the "frozen" checkpoint in the backup's
header). This record serves to mark the end of the redo
generated during the backup. The end-backup "marker" is used by
media recovery to determine when all redo generated between
BEGIN BACKUP and END BACKUP has been applied to the
datafile. Upon encountering the end-backup "marker", media
recovery can (at the next media recovery checkpoint: see
6.7.1) clear the hotbackup-fuzzy bit. This is only important in
preventing an incomplete recovery that might erroneously
attempt to end before all redo generated between BEGIN
BACKUP and END BACKUP has been applied. Ending
incomplete recovery at such a point may result in an
inconsistent file, since the backup copy may already have contained
changes beyond this endpoint. As will be seen in 8.1, open
with resetlogs following incomplete media recovery will fail if
any online datafile has the hotbackup-fuzzy bit (or any other
fuzzy bit) set.

3. It clears the file's hotbackup-fuzzy bit.

4. It stops block before-image logging for the file.

5. It advances the file checkpoint to the current database
checkpoint. This compensates for any file header update(s) missed
during thread checkpoints that may have advanced the
database checkpoint while the file was in hot backup state, with its
checkpoint "frozen".
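The header bookkeeping described in 4.1 and 4.3 can be modelled in a few lines. This is a toy state machine under stated assumptions, not the real header layout: the class DatafileHeader, its field names and the SCN values are invented for illustration, and only the flag and checkpoint transitions named in the article are modelled.

```python
class DatafileHeader:
    """Toy stand-in for the checkpoint fields of a datafile header."""

    def __init__(self, checkpoint_scn):
        self.checkpoint_scn = checkpoint_scn    # the checkpoint recovery starts from
        self.backup_checkpoint_scn = None       # "backup" checkpoint (v7.2 and later)
        self.hotbackup_fuzzy = False
        self.online_fuzzy = True

    def begin_backup(self, begin_backup_scn):
        self.hotbackup_fuzzy = True                 # 4.1 step 1
        self.checkpoint_scn = begin_backup_scn      # 4.1 steps 2 and 4
        self.online_fuzzy = False                   # 4.1 step 5

    def thread_checkpoint(self, scn):
        if self.hotbackup_fuzzy:
            # Frozen: only the separate "backup" checkpoint is updated.
            self.backup_checkpoint_scn = scn
        else:
            self.checkpoint_scn = scn

    def end_backup(self, database_checkpoint_scn):
        self.online_fuzzy = True                        # 4.3 step 1
        self.hotbackup_fuzzy = False                    # 4.3 step 3
        self.checkpoint_scn = database_checkpoint_scn   # 4.3 step 5


header = DatafileHeader(checkpoint_scn=100)
header.begin_backup(begin_backup_scn=120)
header.thread_checkpoint(150)          # database checkpoint advances during the copy
frozen_scn = header.checkpoint_scn     # still the begin-backup value
header.end_backup(database_checkpoint_scn=150)
print(frozen_scn, header.checkpoint_scn)
```

Whichever moment the copy utility happens to read the header during the copy, it sees the begin-backup SCN, which is the property the "frozen" checkpoint is meant to provide.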

bluekey · posted 2002-7-22 04:10

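That makes sense. The reason Oracle keeps writing to the datafiles in backup mode, instead of holding the changes back and applying them from archived logs at end backup, is that copying a large file can take a long time. With the second approach a large number of archived logs might have to be reapplied, and if any one of them were lost, the tablespace could never be brought back to a normal state.
Another question: since begin backup automatically performs a checkpoint, there is no need to run a switch logfile first. Why do some people switch logfile before starting a hot backup?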

aaahhh · posted 2002-7-22 04:48

I never really understood this, and now I finally do. Very happy.
Let's keep going and discuss why RMAN does not need alter tablespace begin/end backup.

rejoice999 · posted 2002-7-22 04:54

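1. Switching logfile before starting a hot backup is definitely not required.
2. Doing a switch logfile does no harm, though. Say the log file is 100M and the current one, seq=2000, is already 95M full. If I begin backup without switching, the 100M archived logfile for seq=2000 will also be needed for recovery and has to be kept. If I switch first so that seq=2001 is in use, and then begin backup, I only need to keep archived logfiles from 2001 onward, which saves at least one redo log's worth of space.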
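The arithmetic in point 2 can be written out as a small sketch. The function name logs_to_keep and the end-of-backup sequence 2003 are assumptions made up for the example; only the sequences 2000 and 2001 come from the post.

```python
def logs_to_keep(seq_at_begin_backup, seq_at_end_backup):
    """Archived log sequences that must be kept to recover this backup:
    everything from the log current at begin backup through the log
    current at end backup."""
    return list(range(seq_at_begin_backup, seq_at_end_backup + 1))


# Begin backup while seq 2000 is the current log (no switch first).
no_switch = logs_to_keep(2000, 2003)

# Switch first, so seq 2001 is current when begin backup runs.
with_switch = logs_to_keep(2001, 2003)

print(no_switch)
print(with_switch)
```

With the switch, the mostly full seq 2000 drops out of the set that has to be retained for this backup.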

rejoice999 · posted 2002-7-22 05:31

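Although you say you "never really understood this", your views have been correct all along. Judging by the times you are online, could you be in the same place as I am?

Without RMAN, reading the datafile and writing the backup file are both done by OS commands (cp, ocopy, dd). These commands are not part of Oracle, and Oracle cannot coordinate them with DBWR, so a "fractured block" can occur. That is why alter tablespace begin backup, "before-image logging" and the related operations are needed to make sure recovery is possible later.

With RMAN, data blocks are read by the server processes created when RMAN connects to the target database. They are part of Oracle (they have infiltrated enemy lines), and since these processes are all on our side, coordination is easy. They can tell whether a data block is a "fractured block" and, if so, re-read it until a good image is obtained. So RMAN never writes a "fractured block" to a backup, and naturally it needs no BEGIN BACKUP or "before-image logging" and generates no "extra redo log information".
Also, an RMAN backup does not need to freeze the checkpoint stored in the file header, because it knows the correct order in which to read the blocks and can always back up a valid checkpoint.
Without RMAN, Oracle cannot guarantee that a third-party tool (cp etc.) reads the file header first and the data blocks afterwards, so the file header has to be kept frozen from BEGIN BACKUP through to END BACKUP.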
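The "re-read until a good image is obtained" behaviour can be sketched as a retry loop. The detection rule used here, comparing a version stamp at the head of the block with one at its tail, is an assumption for illustration; the post only says that the server process can tell a fractured block from a good one. The names read_block_consistent, head_version and tail_version are hypothetical.

```python
def read_block_consistent(read_once, max_attempts=5):
    """Call read_once() until the block's head and tail stamps agree.
    Returns the good block and the number of reads it took."""
    for attempt in range(1, max_attempts + 1):
        block = read_once()
        if block["head_version"] == block["tail_version"]:
            return block, attempt
    raise IOError(f"block still fractured after {max_attempts} reads")


# Simulated reads: the first catches the block mid-write, the second is clean.
reads = iter([
    {"head_version": 7, "tail_version": 6, "data": "torn"},
    {"head_version": 7, "tail_version": 7, "data": "good"},
])

block, attempts = read_block_consistent(lambda: next(reads))
print(block["data"], attempts)
```

An OS copy utility has no such check, which is why the begin backup and before-image machinery has to compensate afterwards at recovery time.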


aaahhh · posted 2002-7-22 06:26

The great RMAN

You must be in the US. I'm in Europe; my posting times are close to yours lately because I've been on a run of night shifts.

biti_rainy · posted 2002-7-22 10:25

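I would also like to thank our two friends in America and Europe for their conscientious attitude.

On this part of the topic: it had been a long time since I read the book and I had not reviewed it, so my memory was really hazy, and I did not check again before answering. This proves once more that when you are not 100% sure of something you must consult the documentation or verify it by experiment. And one person alone cannot get everything right, which is why we need to discuss things together and improve together.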

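After begin backup, the datafiles are indeed still being written. With _log_blocks_during_backup = true in the init file (true is also the default), block before-images are logged between the start and the end of the backup.

These before-images are written to the log file. A data block actually records how many times it has been modified, so when a block is modified during the backup, this is noted and the before-image is written to the log file. At recovery, if a block is found to have been modified during the backup, the before-image is copied back from the log file. Only in this way can the backup be guaranteed safe.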

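After the backup ends, the database itself does not do any further work.

What I said earlier was wrong. At the time I was also puzzled: when the backup ends, wouldn't the database have to do a huge amount of work to bring the datafiles back in sync? I didn't look into it carefully, sigh.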

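Experience shows that Oracle usually has good reasons for the way it does things, and its solutions are generally better than what you or I would come up with. When I think I've found a flaw in some implementation, it almost always turns out that my understanding was off.

Really embarrassing.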

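Enough rambling. Given the above, there is one very important point to consider: we must back up the archived logs generated during the hot backup!!! If those archived logs are lost, the hot-backup datafiles are very likely useless, because the blocks in those datafiles are very likely at inconsistent versions.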

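An SCN is recorded at the start of the backup and another at the end; these determine which redo is applied to the datafile. At begin backup the checkpoint (ckpt count) is frozen, and at the end the checkpoint is advanced to the database checkpoint.

My explanation is rather muddled, but broadly speaking this mechanism keeps the database running normally and keeps the backup safe.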
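The requirement to keep every archived log generated during the hot backup can be checked with a simple gap test. This is a sketch with made-up names and numbers: missing_archived_logs, the sequence range 2001 to 2004 and the list of available logs are all hypothetical.

```python
def missing_archived_logs(available, first_needed, last_needed):
    """Return the archived log sequences in [first_needed, last_needed]
    that are not present in the available set."""
    have = set(available)
    return [seq for seq in range(first_needed, last_needed + 1) if seq not in have]


# The backup ran while sequences 2001..2004 were being generated,
# but the archive destination has lost sequence 2003.
missing = missing_archived_logs([2001, 2002, 2004], 2001, 2004)
backup_is_recoverable = not missing

print(missing, backup_is_recoverable)
```

A single gap is enough to make the fuzzy datafile copies unrecoverable, so a check like this belongs next to the backup itself.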

richking · posted 2002-7-22 11:29

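Thanks for all your hard work, comrades. You are really dedicated; I'm very impressed.
This has also made me see alter tablespace begin/end backup in a new light.
I probably only skimmed it before, just to get through the exam.
Thanks again, everyone.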

freewill · posted 2002-7-22 13:11

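Thanks to rejoice999 for the excellent answer; it finally cleared up my doubts. Luckily I have always used RMAN, the one that has infiltrated enemy lines, for my backups.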
