楼主: wangzhonnew

[精华] 来讨论一个recovery的问题

[复制链接]
论坛徽章:
21
在线时间
日期:2007-07-25 04:01:022012新春纪念徽章
日期:2012-02-13 15:09:232012新春纪念徽章
日期:2012-02-13 15:09:232012新春纪念徽章
日期:2012-02-13 15:09:232012新春纪念徽章
日期:2012-02-13 15:09:23马上有车
日期:2014-02-19 11:55:14马上有房
日期:2014-02-19 11:55:14马上有钱
日期:2014-02-19 11:55:14马上有对象
日期:2014-02-19 11:55:142012新春纪念徽章
日期:2012-02-13 15:09:23
31#
发表于 2008-11-18 10:01 | 只看该作者
没法保证DB LEVEL IO SIZE跟FS/STORAGE LEVEL IO SIZE是一致的,并且DB2/ORACLE根本就读不懂SNAPCLONE出来的IMAGE,如果没有WRITE SUSPEND的话。大概相当于乱码。

给个ORACLE ON NAS STORAGE案例,看看是不是有所启发。

NAS的工作原理是对于一个被改变的BLOCK,(NETAPP的方法)要么在PERMENANT SPACE(我们正常所看见的FS SPACE)中找一个新的空的BLOCK,把改变后的内容写到新的BLOCK中,而原先的BLOCK就会自动归类到SNAPSHOT SPACE中;(EMC的方法?)或是在SNAPSHOT SPACE中找一个新的空的BLOCK,把原先的内容拷贝到这个新的BLOCK中,而把改变后的内容写回到原先的BLOCK中。任何时候若是SNAPSHOT SPACE用光了,SNAPSHOT就会自动长到PERMENANT SPACE中。

我遇到的案例是,在SNAPSHOT SPACE全部被用光,FS SPACE也快要被用光的情况下,ORACLE TRANSACTION还在继续,最终导致所有的SPACE都被用光,而某些TRANSACTION还没完成,i.e. DB LEVEL IO还没最终完成。DB STALLED。

这个时候STORAGE/FS 也是相当于一个SNAPCLONE IMAGE,但是最终是好几个在这个FS上的DATA FILES CORRUPTED,ORACLE根本没法自动恢复。

使用道具 举报

回复
论坛徽章:
6
2010新春纪念徽章
日期:2010-03-01 11:21:012013年新春福章
日期:2013-02-25 14:51:242014年新春福章
日期:2014-02-18 16:41:11马上有车
日期:2014-02-18 16:41:11沸羊羊
日期:2015-03-04 14:43:432015年新春福章
日期:2015-03-06 11:57:31
32#
发表于 2008-11-18 11:32 | 只看该作者
I know nothing about ORACLE, and few about DB2 LUW.

But I know FlashCopy without stopping database, this could be RECOVERed, with a little bit of transaction data loss while data consistency is guaranteed (on DB2 z/OS). And you just need to use DR plan to restart DB2, not normal restart, it plays well.

And a tip, you should use FlashCopy COPY option. The example shown above sounds like FC NO COPY.

使用道具 举报

回复
论坛徽章:
6
2010新春纪念徽章
日期:2010-03-01 11:21:012013年新春福章
日期:2013-02-25 14:51:242014年新春福章
日期:2014-02-18 16:41:11马上有车
日期:2014-02-18 16:41:11沸羊羊
日期:2015-03-04 14:43:432015年新春福章
日期:2015-03-06 11:57:31
33#
发表于 2008-11-18 11:40 | 只看该作者
原帖由 askgyliu 于 2008-11-18 00:23 发表
想想IBM的GLOBAL MIRRORING是怎样做的,大概可以比较清楚些。

FS-LEVEL的COPY/CLONE,普通STORAGE-LEVEL的CLONE等等,都没法保证DB-LEVEL TRANSACTION CONSISTENCY。


Transaction level consistency is guaranteed by DB2 when restarted, thru LOG apply, and "current" status ("current"=the time when FC consistency-group completed) would be rebuilt. So, any transactions (committed or not) after "current" would be lost.

使用道具 举报

回复
论坛徽章:
21
在线时间
日期:2007-07-25 04:01:022012新春纪念徽章
日期:2012-02-13 15:09:232012新春纪念徽章
日期:2012-02-13 15:09:232012新春纪念徽章
日期:2012-02-13 15:09:232012新春纪念徽章
日期:2012-02-13 15:09:23马上有车
日期:2014-02-19 11:55:14马上有房
日期:2014-02-19 11:55:14马上有钱
日期:2014-02-19 11:55:14马上有对象
日期:2014-02-19 11:55:142012新春纪念徽章
日期:2012-02-13 15:09:23
34#
发表于 2008-11-18 14:39 | 只看该作者
原帖由 Pythagoras 于 2008-11-18 11:32 发表
I know nothing about ORACLE, and few about DB2 LUW.

But I know FlashCopy without stopping database, this could be RECOVERed, with a little bit of transaction data loss while data consistency is guaranteed (on DB2 z/OS). And you just need to use DR plan to restart DB2, not normal restart, it plays well.

And a tip, you should use FlashCopy COPY option. The example shown above sounds like FC NO COPY.


Global Mirroring, aka, DR implementation in many organization, is able to restart and recover the database up to the latest transaction. Although Global Mirroring is also a continuous synchronization of the primary database at the storage level, the implementation is much more complicated, and not just a simple snapclone or flashcopy of the primary database, and it requires more layer than storage synchronization.

A flashcopy without stopping database but with a write suspend to the database is a valid option for DB2 backup, but will a flashcopy without a write suspend to the databasebe recoverable? I doubt so. Flashcopy is just a fast way of cloning the image at the storage level, and it by itself has no mechamism to ensure database consistency.

In a test DB2 database, I did a write suspend to the database, and manually copied all the database files using OS command, and I was able to recover this database using my slow-copied image. However, I will not be able to use an image created with copy-without-write-suspend to recover the database.

Maybe we can also think in another way, the db2 backup command is a db2 "copy" command to copy all the relavent files that DB2 itself can understand and intepret to re-create the database.

The key point is, DB2 or Oracle has no way to understand what created at the storage (without DB intervention). The mechamism of a storage level copy/synchronization has no meaning for database.

使用道具 举报

回复
论坛徽章:
21
在线时间
日期:2007-07-25 04:01:022012新春纪念徽章
日期:2012-02-13 15:09:232012新春纪念徽章
日期:2012-02-13 15:09:232012新春纪念徽章
日期:2012-02-13 15:09:232012新春纪念徽章
日期:2012-02-13 15:09:23马上有车
日期:2014-02-19 11:55:14马上有房
日期:2014-02-19 11:55:14马上有钱
日期:2014-02-19 11:55:14马上有对象
日期:2014-02-19 11:55:142012新春纪念徽章
日期:2012-02-13 15:09:23
35#
发表于 2008-11-18 14:40 | 只看该作者
原帖由 Pythagoras 于 2008-11-18 11:40 发表


Transaction level consistency is guaranteed by DB2 when restarted, thru LOG apply, and "current" status ("current"=the time when FC consistency-group completed) would be rebuilt. So, any transactions (committed or not) after "current" would be lost.


Correct, it is guaranteed by "DB2", not storage synchronization.

使用道具 举报

回复
论坛徽章:
6
2010新春纪念徽章
日期:2010-03-01 11:21:012013年新春福章
日期:2013-02-25 14:51:242014年新春福章
日期:2014-02-18 16:41:11马上有车
日期:2014-02-18 16:41:11沸羊羊
日期:2015-03-04 14:43:432015年新春福章
日期:2015-03-06 11:57:31
36#
发表于 2008-11-18 21:01 | 只看该作者
原帖由 askgyliu 于 2008-11-18 14:39 发表


Will a flashcopy without a write suspend to the databasebe recoverable?


My answer is YES or NO, depends on your judgement standard.

YES, because if the flashcopy is a complete set of files, including user database files (tablespaces and indexspaces), transaction log files (active and archive), system files (catalog and other control information, etc.). It's enough to restart DB2 from these files. Exactly, this is not a RECOVER but a RESTART, and the results are identical.

NO, because DB2 dose NOT know anything about this flashcopy, then no commands or utilities can be used to recover inside of DB2 automatically. But, the exceptions are: 1.If the flashcopy is triggered by DB2 (like BACKUP SYSTEM utility since DB2 z/OS V8), DB2 dose know everything about it, and can be RESTOREed. 2.Anyway, you can do something manually to recover from this flashcopy files, which is not supported by vendor officially.

[ 本帖最后由 Pythagoras 于 2008-11-18 21:03 编辑 ]

使用道具 举报

回复
招聘 : c/c++研发
论坛徽章:
45
技术图书徽章
日期:2014-03-10 14:09:192012新春纪念徽章
日期:2012-02-13 15:12:092012新春纪念徽章
日期:2012-02-13 15:12:092012新春纪念徽章
日期:2012-01-04 11:51:22ITPUB十周年纪念徽章
日期:2011-11-01 16:21:15现任管理团队成员
日期:2011-05-07 01:45:082011新春纪念徽章
日期:2011-01-25 15:42:562011新春纪念徽章
日期:2011-01-25 15:42:332011新春纪念徽章
日期:2011-01-25 15:42:152011新春纪念徽章
日期:2011-01-25 15:41:50
37#
 楼主| 发表于 2008-11-18 21:54 | 只看该作者
终于看到让人眼前一亮的discussion了^_^
yeah, because currently db2/oracle doesn't officially support this kind of recovery, so yeah it is expected things don't work smoothly if just use FS copy image without io suspend.

my original question is "why ibm doesn't support that", so the we cannot just answer "we cannot do such recovery because there's no command to support such recovery" (we know it's not supported already)

and askgyliu, may i ask what kind of test you did for "However, I will not be able to use an image created with copy-without-write-suspend to recover the database."?
Did you use something like "consistency-group" introduced in ibm flashcopy to make sure the image is consistent at FS level? (i'm not the man for storage, the bottomline for this is that we need to make sure the image of the copy is consistent at filesystem level, all pages must be sync for a particular point-in-time)

and for Pythagoras, when you said "because DB2 dose NOT know anything about this flashcopy, then no commands or utilities can be used to recover inside of DB2 automatically." why db2 doesn't know anything about this flashcopy if we can rebuild the catalog of the database (in db2 LUW, db2initdb can be used to create db from a split mirrow image (io suspend, copy database file from FS level))? We have complete set of data files so we can specify where are each tablespace, where are the transaction logs and where are the database control files to rebuild the full database...

使用道具 举报

回复
论坛徽章:
138
19周年集字徽章-19
日期:2020-06-08 08:30:56马上加薪
日期:2014-02-19 11:55:14马上有对象
日期:2014-02-19 11:55:14马上有钱
日期:2014-02-19 11:55:14马上有房
日期:2014-02-19 11:55:14马上有车
日期:2014-02-19 11:55:14马上有房
日期:2014-02-18 16:42:022014年新春福章
日期:2014-02-18 16:42:02路虎
日期:2013-11-22 12:26:18问答徽章
日期:2014-05-08 12:15:31
38#
发表于 2008-11-18 22:55 | 只看该作者
这讨论的和oracle的instance recovery和media recovery没多大区别吧

使用道具 举报

回复
招聘 : c/c++研发
论坛徽章:
45
技术图书徽章
日期:2014-03-10 14:09:192012新春纪念徽章
日期:2012-02-13 15:12:092012新春纪念徽章
日期:2012-02-13 15:12:092012新春纪念徽章
日期:2012-01-04 11:51:22ITPUB十周年纪念徽章
日期:2011-11-01 16:21:15现任管理团队成员
日期:2011-05-07 01:45:082011新春纪念徽章
日期:2011-01-25 15:42:562011新春纪念徽章
日期:2011-01-25 15:42:332011新春纪念徽章
日期:2011-01-25 15:42:152011新春纪念徽章
日期:2011-01-25 15:41:50
39#
 楼主| 发表于 2008-11-19 00:18 | 只看该作者
楼上的说说阿,media recovery再oracle里面怎么实现的?

使用道具 举报

回复
论坛徽章:
6
2010新春纪念徽章
日期:2010-03-01 11:21:012013年新春福章
日期:2013-02-25 14:51:242014年新春福章
日期:2014-02-18 16:41:11马上有车
日期:2014-02-18 16:41:11沸羊羊
日期:2015-03-04 14:43:432015年新春福章
日期:2015-03-06 11:57:31
40#
发表于 2008-11-19 07:49 | 只看该作者
why db2 doesn't know anything about this flashcopy
Because this flashcopy is NOT triggered by DB2, there is no info stored in the DB2 catalog or anywhere. Other words, this flashcopy is taken from the OS side, out of DB2 control.

if we can rebuild the catalog of the database (in db2 LUW, db2initdb can be used to create db from a split mirrow image (io suspend, copy database file from FS level))?
In fact, you needn't to rebuild the catalog, when flashcopy finished, you will get a complete set of files including catalog and everything DB2 wants, what you need to do is just restart your DB2 from the disks made by split mirror (not the original disks). So, this is why my answer is NO, this recover process is commonly used in DR plan, and precisely, this is a restart process.
And IO suspend is NOT necessary.

[ 本帖最后由 Pythagoras 于 2008-11-19 08:00 编辑 ]

使用道具 举报

回复

您需要登录后才可以回帖 登录 | 注册

本版积分规则 发表回复

TOP技术积分榜 社区积分榜 徽章 团队 统计 知识索引树 积分竞拍 文本模式 帮助
  ITPUB首页 | ITPUB论坛 | 数据库技术 | 企业信息化 | 开发技术 | 微软技术 | 软件工程与项目管理 | IBM技术园地 | 行业纵向讨论 | IT招聘 | IT文档
  ChinaUnix | ChinaUnix博客 | ChinaUnix论坛
CopyRight 1999-2011 itpub.net All Right Reserved. 北京盛拓优讯信息技术有限公司版权所有 联系我们 未成年人举报专区 
京ICP备16024965号-8  北京市公安局海淀分局网监中心备案编号:11010802021510 广播电视节目制作经营许可证:编号(京)字第1149号
  
快速回复 返回顶部 返回列表