楼主: jayli426

[讨论] 为何用ARCH模式传输不写Standby Redologs,直接保存成归档文件存放于Standby端

[复制链接]
论坛徽章:
47
蒙奇·D·路飞
日期:2017-03-27 08:04:23马上有车
日期:2014-02-18 16:41:112014年新春福章
日期:2014-02-18 16:41:11一汽
日期:2013-09-01 20:46:27复活蛋
日期:2013-03-13 07:55:232013年新春福章
日期:2013-02-25 14:51:24ITPUB 11周年纪念徽章
日期:2012-10-09 18:03:322012新春纪念徽章
日期:2012-02-13 15:13:202012新春纪念徽章
日期:2012-02-13 15:13:202012新春纪念徽章
日期:2012-02-13 15:13:20
11#
发表于 2013-5-10 22:46 | 只看该作者
> Is SRLs necessary when using LGWR transmit redo data?

Yes. According to
Usage, Benefits and Limitations of Standby Redo Logs (SRL) [ID 219344.1]
"Standby Redo Logs are only used if you have the LGWR activated for archival to the Remote Standby Database."
You may say that this English sentence can also imply, grammatically, that SRLs may or may not be used if you use LGWR to send redo, because it literally is equivalent, logically speaking, to the statement that if you don't use LGWR, then SRLs are not used, or the statement that if SRLs are used, LGWR is used. Neither statement is logically the same as if LGWR is used, SRLs must be used.

Linguistic analysis aside, we should think over this situation with our knowledge of Oracle. If LGWR is used to send redo, is it possible to not use SRLs but only archived redo log files? I don't think so. Can you imagine there's no place on the standby to receive instantaneous redo the primary side LGWR must send over?

Documentation is indeed not clear. For example,
http://docs.oracle.com/cd/B28359 ... 4/log_transport.htm
"The synchronous and asynchronous redo transport modes require that a redo transport destination have a standby redo log."
The sentence includes both synchronous and asynchronous. Then it means it's not possible to not have SRLs.

Having said that, I hope you can test by not creating SRLs while you use LGWR to send redo. See if it works or throws errors. Show us the errors. Thanks.

使用道具 举报

回复
论坛徽章:
7
2014年世界杯参赛球队: 比利时
日期:2014-06-19 15:28:35优秀写手
日期:2014-10-21 06:00:13马上有房
日期:2014-10-28 13:05:56马上有车
日期:2015-01-15 10:51:27慢羊羊
日期:2015-03-04 14:53:332015年新春福章
日期:2015-03-06 11:58:39摩羯座
日期:2015-08-18 15:00:45
12#
发表于 2013-5-13 23:37 | 只看该作者
Yong Huang 发表于 2013-5-10 22:46
> Is SRLs necessary when using LGWR transmit redo data?

Yes. According to

Hi,黄版,我在我的Data Guard测试环境上做了测试。
测试结论:「主库如果使用LGWR传输redo数据时,备库即使没有创建SRLs亦可以接收并应用redo数据」

测试情况,大概如下:

---主库信息:
version
SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
PL/SQL Release 11.2.0.1.0 - Production
CORE    11.2.0.1.0      Production
TNS for Linux: Version 11.2.0.1.0 - Production
NLSRTL Version 11.2.0.1.0 - Production

SQL> show parameter log_archive_dest_2;
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
log_archive_dest_2                   string      service=standby lgwr async noa
                                                 ffirm valid_for=(online_logfil
                                                 es,primary_roles) db_unique_na
                                                 me=standby

---备库信息
version
SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
PL/SQL Release 11.2.0.1.0 - Production
CORE    11.2.0.1.0      Production
TNS for Linux: Version 11.2.0.1.0 - Production
NLSRTL Version 11.2.0.1.0 - Production

SQL> select group#, status from v$standby_log ;
    GROUP# STATUS
---------- ----------
         4 ACTIVE
         5 UNASSIGNED
         6 UNASSIGNED
         7 UNASSIGNED

SQL> select member from v$logfile;
MEMBER
--------------------------------------------------------------------------------
/u01/app/oracle/oradata/standby/redo03.log
/u01/app/oracle/oradata/standby/redo02.log
/u01/app/oracle/oradata/standby/redo01.log
/u01/app/oracle/oradata/standby/standby_redo01.log
/u01/app/oracle/oradata/standby/standby_redo02.log
/u01/app/oracle/oradata/standby/standby_redo03.log
/u01/app/oracle/oradata/standby/standby_redo04.log
7 rows selected.

开始测试:

---备库查询open mode
SQL> select open_mode from v$database;
OPEN_MODE
--------------------
READ ONLY

---在备库中删除备库的SRLs
SQL> alter database drop standby logfile group 4;
alter database drop standby logfile group 4
*
ERROR at line 1:
ORA-00261: log 4 of thread 1 is being archived or modified
ORA-00312: online log 4 thread 1:
'/u01/app/oracle/oradata/standby/standby_redo01.log'
注:前面提到过group 4此时是出于active状态的,暂且不能drop。先把group 5、6、7drop掉。

SQL> alter database drop standby logfile group 5;
Database altered.

SQL> alter database drop standby logfile group 6;
Database altered.

SQL> alter database drop standby logfile group 7;
Database altered.

SQL> select group#, status from v$standby_log ;
    GROUP# STATUS
---------- ----------
         4 ACTIVE

---如何drop掉group 4呢?请看如下步骤:
在主库切换日志,触发备库也进行日志切换
SQL> alter system switch logfile;
System altered.

此时再在备库查询视图v$standby_log
SQL> select group#, status,archived from v$standby_log ;
    GROUP# STATUS     ARC
---------- ---------- ---
         4 UNASSIGNED NO

顺利drop掉group 4。至此,备库已经不存在SRLs了
SQL> alter database drop standby logfile group 4;
Database altered.

再次确认备库是否存在SRLS
SQL> select * from v$standby_log;
no rows selected

查看备库日志应用模式
SQL> select dest_id,recovery_mode from v$archive_dest_status where dest_id = 1;
   DEST_ID RECOVERY_MODE
---------- -----------------------
         1 MANAGED

在主库切换日志(从这一步骤开始验证,主库使用LGWR传输redo数据,且备库不存在SRLs时,备库能否接收并应用redo数据)
SQL> archive log list;
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /u01/app/oracle/archivelog
Oldest online log sequence     227
Next log sequence to archive   229
Current log sequence           229
SQL>
SQL>
SQL> alter system switch logfile;
System altered.
SQL>
SQL> archive log list;
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /u01/app/oracle/archivelog
Oldest online log sequence     228
Next log sequence to archive   230
Current log sequence           230
SQL>

备库:
SQL> archive log list;
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /u01/app/oracle/archivelog
Oldest online log sequence     228
Next log sequence to archive   0
Current log sequence           230
SQL>
SQL> select sequence#,applied from v$archived_log where applied <> 'YES';
no rows selected
SQL>
SQL> select sequence#,applied from v$archived_log where sequence# = 229;
SEQUENCE# APPLIED
---------- ---------
       229 YES
SQL>

在主库切换归档后,备库alert所出现的告警信息如下:
Mon May 13 13:56:06 2013
Archived Log entry 218 added for thread 1 sequence 229 rlc 808401979 ID 0x951bfb78 dest 2:
RFS[10]: Opened log for thread 1 sequence 230 dbid -1793319560 branch 808401979
Mon May 13 13:56:09 2013
Media Recovery Log /u01/app/oracle/archivelog/1_229_808401979.dbf
Media Recovery Waiting for thread 1 sequence 230 (in transit)
[oracle@standby trace]$

BTW:在删除备库的SRLs时,备库告警出现过下面的信息。为什么要贴出这些信息呢,因为我发现在我测试的过程中,好像只有我标红的那句话提到了SRLs。如果SRLs的存在对于主库的LGWR来说是必须的话,那么应该会存在跟SRLs有关的错误信息才对。
Mon May 13 12:54:27 2013
alter database drop standby logfile group 5
Completed: alter database drop standby logfile group 5
alter database drop standby logfile group 6
Completed: alter database drop standby logfile group 6
alter database drop standby logfile group 7
Completed: alter database drop standby logfile group 7
Mon May 13 12:54:47 2013
alter database drop standby logfile group 4
ORA-261 signalled during: alter database drop standby logfile group 4...
Mon May 13 12:55:14 2013
RFS[2]: No standby redo logfiles of size 102400 blocks available
RFS[2]: Opened log for thread 1 sequence 220 dbid -1793319560 branch 808401979
Mon May 13 12:55:21 2013
Archived Log entry 208 added for thread 1 sequence 219 ID 0x951bfb78 dest 1:
ARC1: Archive log thread 1 sequence 219 available in 4 minute(s)
Mon May 13 12:56:25 2013
alter database drop standby logfile group 4
Completed: alter database drop standby logfile group 4
Mon May 13 12:56:33 2013
Archived Log entry 209 added for thread 1 sequence 220 rlc 808401979 ID 0x951bfb78 dest 2:
RFS[2]: Opened log for thread 1 sequence 221 dbid -1793319560 branch 808401979

使用道具 举报

回复
论坛徽章:
47
蒙奇·D·路飞
日期:2017-03-27 08:04:23马上有车
日期:2014-02-18 16:41:112014年新春福章
日期:2014-02-18 16:41:11一汽
日期:2013-09-01 20:46:27复活蛋
日期:2013-03-13 07:55:232013年新春福章
日期:2013-02-25 14:51:24ITPUB 11周年纪念徽章
日期:2012-10-09 18:03:322012新春纪念徽章
日期:2012-02-13 15:13:202012新春纪念徽章
日期:2012-02-13 15:13:202012新春纪念徽章
日期:2012-02-13 15:13:20
13#
发表于 2013-5-14 01:58 | 只看该作者
I admit there's something here I don't fully understand. But in 11g, LGWR is a deprecated attribute for log_archive_dest_n. Documentation says "Specify SYNC or ASYNC. ASYNC is the default if neither attribute is specified" instead. So, if you specify "LGWR" as in previous versions, I'm not sure about the effect. Is it possible it's simply ignored and it's still ARCn that handles redo transport? I think you're on Linux (or UNIX). Why not use OS level tools to find out? On the standby, find the RFS processes (select pid from v$managed_standby where process='RFS'). Then for each one

netstat -anp | grep  <RFS process> | grep ^tcp

You may need to run as root. Suppose the line is
tcp        0      0 10.112.66.174:1521          10.114.177.14:49024         ESTABLISHED 27430/oracleoracp84

Based on the remote port number (49024 here), go to the primary and find its process:

netstat -anp | grep 49024

See if the process on the primary is an archiver process.

By the way, technically, it will never be the LGWR process itself that ships redo. Instead it will be LNS, if it's configured to not use ARCn.

使用道具 举报

回复
论坛徽章:
7
2014年世界杯参赛球队: 比利时
日期:2014-06-19 15:28:35优秀写手
日期:2014-10-21 06:00:13马上有房
日期:2014-10-28 13:05:56马上有车
日期:2015-01-15 10:51:27慢羊羊
日期:2015-03-04 14:53:332015年新春福章
日期:2015-03-06 11:58:39摩羯座
日期:2015-08-18 15:00:45
14#
发表于 2013-5-14 15:31 | 只看该作者
Yong Huang 发表于 2013-5-14 01:58
I admit there's something here I don't fully understand. But in 11g, LGWR is a deprecated attribute  ...

SQL> rem 在备库查询
SQL>
SQL> select pid from v$managed_standby where process='RFS';
       PID
----------
      3937
      3932
      3934
      3930
SQL>
SQL> ho
[oracle@standby ~]$
[oracle@standby ~]$ netstat -anp | grep 3937 | grep ^tcp
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp        0      0 192.168.0.52:1526           192.168.0.167:18380         ESTABLISHED 3937/oraclestandby  
[oracle@standby ~]$
[oracle@standby ~]$ netstat -anp | grep 3932 | grep ^tcp
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp        0      0 192.168.0.52:1526           192.168.0.167:18367         ESTABLISHED 3932/oraclestandby  
[oracle@standby ~]$
[oracle@standby ~]$ netstat -anp | grep 3934 | grep ^tcp
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp        0      0 192.168.0.52:1526           192.168.0.167:18368         ESTABLISHED 3934/oraclestandby  
[oracle@standby ~]$
[oracle@standby ~]$ netstat -anp | grep 3930 | grep ^tcp
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp        0      0 192.168.0.52:1526           192.168.0.167:18366         ESTABLISHED 3930/oraclestandby  
[oracle@standby ~]$
SQL>
SQL> rem 在主库查询
SQL>
SQL> ho
[oracle@primary ~]$
[oracle@primary ~]$ netstat -anp | grep 18380
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp        0      0 192.168.0.167:18380         192.168.0.52:1526           ESTABLISHED 3927/ora_arc2_prima
[oracle@primary ~]$
[oracle@primary ~]$ netstat -anp | grep 18367
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp        0      0 192.168.0.167:18367         192.168.0.52:1526           ESTABLISHED 3929/ora_arc3_prima
[oracle@primary ~]$
[oracle@primary ~]$ netstat -anp | grep 18368
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp        0      0 192.168.0.167:18368         192.168.0.52:1526           ESTABLISHED 3931/ora_nsa2_prima
[oracle@primary ~]$
[oracle@primary ~]$ netstat -anp | grep 18366
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp        0      0 192.168.0.167:18366         192.168.0.52:1526           ESTABLISHED 3925/ora_arc1_prima
[oracle@primary ~]$
[oracle@primary ~]$

上述输出说明了什么问题呢?

使用道具 举报

回复
论坛徽章:
47
蒙奇·D·路飞
日期:2017-03-27 08:04:23马上有车
日期:2014-02-18 16:41:112014年新春福章
日期:2014-02-18 16:41:11一汽
日期:2013-09-01 20:46:27复活蛋
日期:2013-03-13 07:55:232013年新春福章
日期:2013-02-25 14:51:24ITPUB 11周年纪念徽章
日期:2012-10-09 18:03:322012新春纪念徽章
日期:2012-02-13 15:13:202012新春纪念徽章
日期:2012-02-13 15:13:202012新春纪念徽章
日期:2012-02-13 15:13:20
15#
发表于 2013-5-15 00:12 | 只看该作者
本帖最后由 Yong Huang 于 2013-5-14 10:16 编辑

It looks like your primary is still using ARCn to ship redo to the standby, as I suspected. That is, "lgwr" in log_archive_dest_2 has no effect in 11g. As documentation suggests, you need to use "sync". The NSAn background process is explained at
http://docs.oracle.com/cd/E11882_01/server.112/e25513/bgprocesses.htm
"Ships redo from current online redo logs to remote standby destinations configured for ASYNC transport". If you use "sync", NSAn" should change to NSSn.

使用道具 举报

回复
论坛徽章:
7
2014年世界杯参赛球队: 比利时
日期:2014-06-19 15:28:35优秀写手
日期:2014-10-21 06:00:13马上有房
日期:2014-10-28 13:05:56马上有车
日期:2015-01-15 10:51:27慢羊羊
日期:2015-03-04 14:53:332015年新春福章
日期:2015-03-06 11:58:39摩羯座
日期:2015-08-18 15:00:45
16#
发表于 2013-5-15 10:55 | 只看该作者
Yong Huang 发表于 2013-5-15 00:12
It looks like your primary is still using ARCn to ship redo to the standby, as I suspected. That is, ...

1. It looks like your primary is still using ARCn to ship redo to the standby
    是的。貌似无论是sync还是async,主库都是用ARCn+NSAn或ARCn+NSSn传输redo数据。什么时候才会用LGWR传输呢?

2.  As documentation suggests, you need to use "sync".
    为什么?

3. That is, "lgwr" in log_archive_dest_2 has no effect in 11g.
   是的,在11g中,ARCH和LGWR都已经被废弃了。不解的是Oracle为何要在11g中废弃这两个属性?
   在10g中这两个属性的作用是:
   ARCH
      Uses the archiver processes (ARCn) to collect transaction redo data and transmit it to standby destinations.
   LGWR
      Uses the log writer process (LGWR) to collect transaction redo data and transmit it to standby destinations.
   在11g中,log_archive_dest_2是如何来实现上面的功能?用属性ASYNC或SYNC实现?

使用道具 举报

回复
论坛徽章:
47
蒙奇·D·路飞
日期:2017-03-27 08:04:23马上有车
日期:2014-02-18 16:41:112014年新春福章
日期:2014-02-18 16:41:11一汽
日期:2013-09-01 20:46:27复活蛋
日期:2013-03-13 07:55:232013年新春福章
日期:2013-02-25 14:51:24ITPUB 11周年纪念徽章
日期:2012-10-09 18:03:322012新春纪念徽章
日期:2012-02-13 15:13:202012新春纪念徽章
日期:2012-02-13 15:13:202012新春纪念徽章
日期:2012-02-13 15:13:20
17#
发表于 2013-5-15 22:37 | 只看该作者
So you tested changing "lgwr" to "sync" and the standby still works as usual without SRLs? And the processes on the primary that have socket connections to the standby changed from ARCn+NSAn to ARCn+NSSn? Then I don't know where the redo is temporarily accumulated or stored on the standby before a log switch. I mean, without an SRL, an NSS process should send the instantaneous redo to the standby but where is it stored?

Let's do some more research. Maybe you can open an SR and ask Oracle.

使用道具 举报

回复
论坛徽章:
7
2014年世界杯参赛球队: 比利时
日期:2014-06-19 15:28:35优秀写手
日期:2014-10-21 06:00:13马上有房
日期:2014-10-28 13:05:56马上有车
日期:2015-01-15 10:51:27慢羊羊
日期:2015-03-04 14:53:332015年新春福章
日期:2015-03-06 11:58:39摩羯座
日期:2015-08-18 15:00:45
18#
发表于 2013-5-15 23:41 | 只看该作者
Yong Huang 发表于 2013-5-15 22:37
So you tested changing "lgwr" to "sync" and the standby still works as usual without SRLs? And the p ...

> So you tested changing "lgwr" to "sync" and the standby still works as usual without SRLs?
是的,主库依然看到是用arch传输redo data

> And the processes on the primary that have socket connections to the standby changed from ARCn+NSAn to ARCn+NSSn?
是的

> Then I don't know where the redo is temporarily accumulated or stored on the standby before a log switch. I mean, without an SRL, an NSS process should send the instantaneous redo to the standby but where is it stored?
我想的是直接以归档文件的形式存储吧

> Let's do some more research. Maybe you can open an SR and ask Oracle.
  sorry , i don't have metalink account,so can't open an SR.

BTW:可否解释下16的questions?

使用道具 举报

回复
论坛徽章:
13
2013年新春福章
日期:2013-04-08 17:42:48技术图书徽章
日期:2014-05-13 10:11:54马上有对象
日期:2014-02-18 16:44:08ITPUB社区12周年站庆徽章
日期:2013-10-17 13:56:392013年新春福章
日期:2013-06-05 15:29:212013年新春福章
日期:2013-06-05 15:29:212013年新春福章
日期:2013-06-05 15:29:212013年新春福章
日期:2013-05-27 10:23:002013年新春福章
日期:2013-05-27 10:23:002013年新春福章
日期:2013-05-27 10:23:00
19#
发表于 2013-5-16 13:21 | 只看该作者
精彩,忍不住想说个英文单词 :nice

使用道具 举报

回复
论坛徽章:
47
蒙奇·D·路飞
日期:2017-03-27 08:04:23马上有车
日期:2014-02-18 16:41:112014年新春福章
日期:2014-02-18 16:41:11一汽
日期:2013-09-01 20:46:27复活蛋
日期:2013-03-13 07:55:232013年新春福章
日期:2013-02-25 14:51:24ITPUB 11周年纪念徽章
日期:2012-10-09 18:03:322012新春纪念徽章
日期:2012-02-13 15:13:202012新春纪念徽章
日期:2012-02-13 15:13:202012新春纪念徽章
日期:2012-02-13 15:13:20
20#
发表于 2013-5-17 03:53 | 只看该作者
I did some reading on this topic and need to clarify some points I made earlier. First, the NSAn and NSSn processes on the primary. These are named at the OS level (as seen in `ps -ef'). Inside the database, they're called LNS (as seen in v$managed_standby). Even though we see ARCn+NSAn processes making socket connections to the standby, normally it's only NSAn (in case of async) or NSSn (in case of sync) that transmits redo; ARCn processes come to help only when the LNSn processes can't catch up. Details are in
How does LGWR/LNS process work in 10gR2 using LGWR ASYNC [ID 1057898.1]
(There may be a document for 11g but I can't find it.)

Since my standby's always have SRLs and I don't have a data guard environment completely free for my experiment, I'd like to not drop the SRLs and see the behavior. In your case, find exactly the process on the standby that connects to your NSAn or NSSn (e.g. process 3934 in your msg #14). Check its open files with `ls -l /proc/<pid>/fd' or `lsof -p <pid>' and also note the descriptor numbers. Mine shows one SRL. Since you don't have any SRL, I guess it has to point to an archive log? Try tracing its writing while you make fairly heavy DML/DDL on the primary and commit:

strace -f -e trace=desc -p 3934 #install strace if needed: sudo yum -y install strace

See which file the file descriptor points to (the first arg to write() or writev() call in strace output).

使用道具 举报

回复

您需要登录后才可以回帖 登录 | 注册

本版积分规则 发表回复

TOP技术积分榜 社区积分榜 徽章 团队 统计 知识索引树 积分竞拍 文本模式 帮助
  ITPUB首页 | ITPUB论坛 | 数据库技术 | 企业信息化 | 开发技术 | 微软技术 | 软件工程与项目管理 | IBM技术园地 | 行业纵向讨论 | IT招聘 | IT文档
  ChinaUnix | ChinaUnix博客 | ChinaUnix论坛
CopyRight 1999-2011 itpub.net All Right Reserved. 北京盛拓优讯信息技术有限公司版权所有 联系我们 未成年人举报专区 
京ICP备16024965号-8  北京市公安局海淀分局网监中心备案编号:11010802021510 广播电视节目制作经营许可证:编号(京)字第1149号
  
快速回复 返回顶部 返回列表