本帖最后由 ubotutwin 于 2016-8-3 18:38 编辑
前日,遇到一起ORA-01210: data file header is media corrupt故障,由于datafile header损坏造成数据库不能启动,同时无备份、无归档。网上搜索解决方案未果,因此摸索了一个解决datafile header损坏的通用解决方法。 思路如下: 1、复制未损坏的数据文件的datafile header block到损坏的文件头部。 2、通过bbed修改相关数据结构为合法值。 首先,模拟datafile header损坏的情况,为重现完全无法启动情况,先停止数据库, SQL> shutdown immediate Database closed. Database dismounted. ORACLE instance shut down.
使用dd命令模拟datafile header损坏 [oracle@tony orcl]$ cp ./users01.dbf ./users01.dbf.bak [oracle@tony orcl]$ dd if=/dev/zero of=./users01.dbf bs=8192 count=2 2+0 records in 2+0 records out 16384 bytes (16 kB) copied, 0.000828924 seconds, 19.8 MB/s [oracle@tony orcl]$ dd if=./users01.dbf.bak of=./users01.dbf bs=8192 skip=2 seek=2 639+0 records in 639+0 records out 5234688 bytes (5.2 MB) copied, 0.115857 seconds, 45.2 MB/s 启动数据库 SQL> startup ORACLE instance started. Total System Global Area 608174080 bytes Fixed Size 1275104 bytes Variable Size 163580704 bytes Database Buffers 436207616 bytes Redo Buffers 7110656 bytes Database mounted. ORA-01122: database file 4 failed verification check ORA-01110: data file 4: '/u01/oradata/orcl/users01.dbf' ORA-01210: data file header is media corrupt 可以看到,故障已经重现。 首先要从数据库内搜集修改datafile header的相关信息,通过暂时将损坏的文件离线,就可以临时打开数据库。 SQL> alter database datafile 4 offline; Database altered. SQL> alter database open; Database altered. 需要收集的是: 1、控制文件中,该数据文件的创建时间。 2、File$字典表中,该数据文件的scn相关信息。 SQL> select file#,to_char(creation_time,'yyyymmddhh24miss') from v$datafile; FILE# TO_CHAR(CREATION_TIME,'YYYYM ---------- ---------------------------- 1 20100419102249 2 20100419104641 3 20100419102254 4 20100419102256 5 20130104195719 SQL> select file#,crscnwrp,crscnbas from file$; FILE# CRSCNWRP CRSCNBAS ---------- ---------- ---------- 1 0 8 2 0 381980 3 0 5782 4 0 8853 5 0 412969 使用dd工具,将同一表空间内完好数据文件的header block复制到损坏的文件头部,之所以使用同一表空间下的文件,是因为可以省去表空间相关变量的修改: [oracle@tony orcl]$ rm ./users01.dbf.bak [oracle@tony orcl]$ cp ./users01.dbf ./users01.dbf.bak [oracle@tony orcl]$ dd if=./users02.dbf of=./users01.dbf bs=8192 count=2 2+0 records in 2+0 records out 16384 bytes (16 kB) copied, 0.000442415 seconds, 37.0 MB/s [oracle@tony orcl]$ dd if=./users01.dbf.bak of=./users01.dbf bs=8192 skip=2 seek=2 639+0 records in 639+0 records out 5234688 bytes (5.2 MB) copied, 0.0560191 seconds, 93.4 MB/s 通过bbed观察数据块内部结构: BBED> p kcvfh struct kcvfh, 676 bytes @0 struct kcvfhbfh, 20 bytes @0 ub1 type_kcbh @0 0x0b ub1 frmt_kcbh @1 0xa2 ub1 spare1_kcbh @2 0x00 ub1 spare2_kcbh @3 0x00 ub4 rdba_kcbh @4 0x01400001 ub4 bas_kcbh @8 0x00000000 ub2 wrp_kcbh @12 0x0000 ub1 seq_kcbh @14 0x01 ub1 flg_kcbh @15 0x04 (KCBHFCKV) ub2 chkval_kcbh @16 0x0f36 ub2 spare3_kcbh @18 0x0000 struct kcvfhhdr, 76 bytes @20 ub4 kccfhswv @20 0x00000000 ub4 kccfhcvn @24 0x0a200500 ub4 kccfhdbi @28 0x4f5f0ca1 text kccfhdbn[0] @32 O text kccfhdbn[1] @33 R text kccfhdbn[2] @34 C text kccfhdbn[3] @35 L text kccfhdbn[4] @36 text kccfhdbn[5] @37 text kccfhdbn[6] @38 text kccfhdbn[7] @39 ub4 kccfhcsq @40 0x000001d6 ub4 kccfhfsz @44 0x00000280 s_blkz kccfhbsz @48 0x00 ub2 kccfhfno @52 0x0005 ub2 kccfhtyp @54 0x0003 ub4 kccfhacid @56 0x00000000 ub4 kccfhcks @60 0x00000000 text kccfhtag[0] @64 text kccfhtag[1] @65 text kccfhtag[2] @66 text kccfhtag[3] @67 text kccfhtag[4] @68 text kccfhtag[5] @69 text kccfhtag[6] @70 text kccfhtag[7] @71 text kccfhtag[8] @72 text kccfhtag[9] @73 text kccfhtag[10] @74 text kccfhtag[11] @75 text kccfhtag[12] @76 text kccfhtag[13] @77 text kccfhtag[14] @78 text kccfhtag[15] @79 text kccfhtag[16] @80 text kccfhtag[17] @81 text kccfhtag[18] @82 text kccfhtag[19] @83 text kccfhtag[20] @84 text kccfhtag[21] @85 text kccfhtag[22] @86 text kccfhtag[23] @87 text kccfhtag[24] @88 text kccfhtag[25] @89 text kccfhtag[26] @90 text kccfhtag[27] @91 text kccfhtag[28] @92 text kccfhtag[29] @93 text kccfhtag[30] @94 text kccfhtag[31] @95 ub4 kcvfhrdb @96 0x00000000 struct kcvfhcrs, 8 bytes @100 ub4 kscnbas @100 0x00064d29 ub2 kscnwrp @104 0x0000 ub4 kcvfhcrt @108 0x2fe9cb1f ub4 kcvfhrlc @112 0x2fe9c127 struct kcvfhrls, 8 bytes @116 ub4 kscnbas @116 0x0005da31 ub2 kscnwrp @120 0x0000 ub4 kcvfhbti @124 0x00000000 struct kcvfhbsc, 8 bytes @128 ub4 kscnbas @128 0x00000000 ub2 kscnwrp @132 0x0000 ub2 kcvfhbth @136 0x0000 ub2 kcvfhsta @138 0x0004 (KCVFHOFZ) struct kcvfhckp, 36 bytes @484 struct kcvcpscn, 8 bytes @484 ub4 kscnbas @484 0x000681c4 ub2 kscnwrp @488 0x0000 ub4 kcvcptim @492 0x2fea155d ub2 kcvcpthr @496 0x0001 union u, 12 bytes @500 struct kcvcprba, 12 bytes @500 ub4 kcrbaseq @500 0x00000002 ub4 kcrbabno @504 0x0000872b ub2 kcrbabof @508 0x0010 ub1 kcvcpetb[0] @512 0x02 ub1 kcvcpetb[1] @513 0x00 ub1 kcvcpetb[2] @514 0x00 ub1 kcvcpetb[3] @515 0x00 ub1 kcvcpetb[4] @516 0x00 ub1 kcvcpetb[5] @517 0x00 ub1 kcvcpetb[6] @518 0x00 ub1 kcvcpetb[7] @519 0x00 ub4 kcvfhcpc @140 0x00000012 ub4 kcvfhrts @144 0x00000000 ub4 kcvfhccc @148 0x00000011 struct kcvfhbcp, 36 bytes @152 struct kcvcpscn, 8 bytes @152 ub4 kscnbas @152 0x00000000 ub2 kscnwrp @156 0x0000 ub4 kcvcptim @160 0x00000000 ub2 kcvcpthr @164 0x0000 union u, 12 bytes @168 struct kcvcprba, 12 bytes @168 ub4 kcrbaseq @168 0x00000000 ub4 kcrbabno @172 0x00000000 ub2 kcrbabof @176 0x0000 ub1 kcvcpetb[0] @180 0x00 ub1 kcvcpetb[1] @181 0x00 ub1 kcvcpetb[2] @182 0x00 ub1 kcvcpetb[3] @183 0x00 ub1 kcvcpetb[4] @184 0x00 ub1 kcvcpetb[5] @185 0x00 ub1 kcvcpetb[6] @186 0x00 ub1 kcvcpetb[7] @187 0x00 ub4 kcvfhbhz @312 0x00000000 struct kcvfhxcd, 16 bytes @316 ub4 space_kcvmxcd[0] @316 0x00000000 ub4 space_kcvmxcd[1] @320 0x00000000 ub4 space_kcvmxcd[2] @324 0x00000000 ub4 space_kcvmxcd[3] @328 0x00000000 word kcvfhtsn @332 4 ub2 kcvfhtln @336 0x0005 text kcvfhtnm[0] @338 U text kcvfhtnm[1] @339 S text kcvfhtnm[2] @340 E text kcvfhtnm[3] @341 R text kcvfhtnm[4] @342 S text kcvfhtnm[5] @343 text kcvfhtnm[6] @344 text kcvfhtnm[7] @345 text kcvfhtnm[8] @346 text kcvfhtnm[9] @347 text kcvfhtnm[10] @348 text kcvfhtnm[11] @349 text kcvfhtnm[12] @350 text kcvfhtnm[13] @351 text kcvfhtnm[14] @352 text kcvfhtnm[15] @353 text kcvfhtnm[16] @354 text kcvfhtnm[17] @355 text kcvfhtnm[18] @356 text kcvfhtnm[19] @357 text kcvfhtnm[20] @358 text kcvfhtnm[21] @359 text kcvfhtnm[22] @360 text kcvfhtnm[23] @361 text kcvfhtnm[24] @362 text kcvfhtnm[25] @363 text kcvfhtnm[26] @364 text kcvfhtnm[27] @365 text kcvfhtnm[28] @366 text kcvfhtnm[29] @367 ub4 kcvfhrfn @368 0x00000005 struct kcvfhrfs, 8 bytes @372 ub4 kscnbas @372 0x00000000 ub2 kscnwrp @376 0x0000 ub4 kcvfhrft @380 0x00000000 struct kcvfhafs, 8 bytes @384 ub4 kscnbas @384 0x00000000 ub2 kscnwrp @388 0x0000 ub4 kcvfhbbc @392 0x00000000 ub4 kcvfhncb @396 0x00000000 ub4 kcvfhmcb @400 0x00000000 ub4 kcvfhlcb @404 0x00000000 ub4 kcvfhbcs @408 0x00000000 ub2 kcvfhofb @412 0x0000 ub2 kcvfhnfb @414 0x0000 ub4 kcvfhprc @416 0x2ab85c76 struct kcvfhprs, 8 bytes @420 ub4 kscnbas @420 0x00000001 ub2 kscnwrp @424 0x0000 struct kcvfhprfs, 8 bytes @428 ub4 kscnbas @428 0x00000000 ub2 kscnwrp @432 0x0000 ub4 kcvfhtrt @444 0x00000000 需要调整的是: ub4 rdba_kcbh @4 ---data block address ub2 kccfhfno @52 ---file number ub4 kscnbas @100 ---scn bas ub4 kcvfhcrt @108 ---file create time ub4 kcvfhrfn @368 ---reference file number 具体修改方法: BBED> set offset 4 OFFSET 4 BBED> modify /x 010000 4号文件1号块的dba是0x01000001。 BBED> set offset 52 OFFSET 52 BBED> modify /x 04 修改file number为4。 BBED> set offset 100 OFFSET 100 BBED> modify /x 952200 Scn base由之前查询file$的sql结果获得,由于linux是小头存储结构,因此最后要把字节序翻转。 BBED> set offset 108 OFFSET 108 BBED> modify /x 805c BBED> set offset 110 OFFSET 110 BBED> modify /x b82a 这里要注意的是,bbed的modify一次最多只可以修改3个字节的值,因此4字节的kcvfhcrt变量要分为两次修改。 Oracle数据文件create time的计算方法为:文件建立时间-1988年1月1日零时的秒数,其中每个月都按31天计算。 BBED> set offset 368 OFFSET 368 BBED> modify /x 04 修改rfile#为4。 BBED> sum dba 4,1 apply Check value for File 4, Block 1: current = 0xf202, required = 0xf202 最后重新计算数据块的check sum值并写入。 尝试在数据库中将该文件online, SQL> / alter database datafile 4 online * ERROR at line 1: ORA-01113: file 4 needs media recovery if it was restored from backup, or END BACKUP if it was not ORA-01110: data file 4: '/u01/oradata/orcl/users01.dbf' SQL> recover datafile 4; Media recovery complete. SQL> alter database datafile 4 online; Database altered. 该文件已经成功恢复。 SQL> select owner,table_name from dba_tables a where a.tablespace_name='USERS'; OWNER ------------------------------------------------------------ TABLE_NAME ------------------------------------------------------------ SCOTT DEPT SCOTT EMP SCOTT BONUS OWNER ------------------------------------------------------------ TABLE_NAME ------------------------------------------------------------ SCOTT SALGRADE SQL> select * from scott.dept; DEPTNO DNAME LOC ---------- ---------------------------- -------------------------- 10 ACCOUNTING NEW YORK 20 RESEARCH DALLAS 30 SALES CHICAGO 40 OPERATIONS BOSTON 查询该表空间内的数据,完全正常。 在oracle数据库损坏而造成不能启动的情况下,只要不是system表空间的损坏,是有很大可能性通过分析、调整来打开数据库并挽救数据的。只要对oracle的内部原理、结构有足够的了解。
|