

2008-7-5 08:23 anycall2010
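Problems after installing RAC [Resolved]

[b]Environment:[/b]

Database version: 10.2.0.1. Operating system kernel: 2.6.9-55.0.0.0.2.ELhugemem

[b]Cause of the error:[/b]

I installed on a virtual machine, and the installation initially finished without reporting any errors. Because the machine was extremely slow afterwards, I forced a reboot, and then the problems below appeared.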

[b]Database status:[/b]
[b]rac1-> crs_stat -t[/b]
Name           Type           Target    State     Host
------------------------------------------------------------
ora.dbvdb.db   application    ONLINE    UNKNOWN   rac1
ora....b1.inst application    ONLINE    OFFLINE
ora....b2.inst application    ONLINE    OFFLINE
ora....SM1.asm application    ONLINE    UNKNOWN   rac1
ora....C1.lsnr application    ONLINE    UNKNOWN   rac1
ora.rac1.gsd   application    ONLINE    UNKNOWN   rac1
ora.rac1.ons   application    ONLINE    UNKNOWN   rac1
ora.rac1.vip   application    ONLINE    ONLINE    rac1
ora....SM2.asm application    ONLINE    UNKNOWN   rac2
ora....C2.lsnr application    ONLINE    UNKNOWN   rac2
ora.rac2.gsd   application    ONLINE    UNKNOWN   rac2
ora.rac2.ons   application    ONLINE    UNKNOWN   rac2
ora.rac2.vip   application    ONLINE    ONLINE    rac2
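
To pick the problem resources out of a listing like the one above, a small filter can help. This is only a sketch: it assumes the column layout shown here (Name, Type, Target, State, Host), and the function name `not_online` is made up for illustration.

```shell
# not_online: read `crs_stat -t` style output on stdin and print
# "<name> <state>" for every resource whose State column is not ONLINE.
# Assumes whitespace-separated columns: Name Type Target State [Host].
# The header line is skipped by name; the dashed rule has only one field.
not_online() {
    awk 'NF >= 4 && $1 != "Name" && $4 != "ONLINE" { print $1, $4 }'
}

# Typical use on a cluster node (as the oracle user):
#   crs_stat -t | not_online
```

Against the output above this would list everything except the two VIP resources.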

[b]rac1-> srvctl status nodeapps -n rac1[/b]
VIP is running on node: rac1
GSD is not running on node: rac1
Listener is not running on node: rac1
ONS daemon is not running on node: rac1

[b]rac1->  srvctl status asm -n rac1[/b]  // ASM will not start either

ASM instance +ASM1 is not running on node rac1.


[b]Clusterware check:[/b]

[root@rac1 ~]# [b]/u01/oracle/product/10.2.0/crs_1/bin/cluvfy stage -post crsinst -n rac1,rac2[/b]
Performing post-checks for cluster services setup

Checking node reachability...
Node reachability check passed from node "rac1".


Checking user equivalence...
User equivalence check failed for user "root".
Check failed on nodes:
        rac2,rac1

ERROR:
User equivalence unavailable on all the nodes.
Verification cannot proceed.


Post-check for cluster services setup was unsuccessful on all the nodes.


[b]rac1-> crsctl check crs[/b]  // CRS on node 1 is healthy!
CSS appears healthy
CRS appears healthy
EVM appears healthy

[b]rac2-> crsctl check crs[/b]  // CRS on node 2 is healthy!
CSS appears healthy
CRS appears healthy
EVM appears healthy


[b]Network check:[/b]


[b]rac1-> ping 192.168.0.4[/b]  // RAC2 host address, status OK
PING 192.168.0.4 (192.168.0.4) 56(84) bytes of data.
64 bytes from 192.168.0.4: icmp_seq=0 ttl=64 time=2.03 ms
64 bytes from 192.168.0.4: icmp_seq=1 ttl=64 time=0.736 ms

--- 192.168.0.4 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1002ms
rtt min/avg/max/mdev = 0.736/1.387/2.038/0.651 ms, pipe 2

[b]rac1-> ping 192.168.0.32[/b]  // RAC2 private address, status OK
PING 192.168.0.32 (192.168.0.32) 56(84) bytes of data.
64 bytes from 192.168.0.32: icmp_seq=0 ttl=64 time=5.69 ms
64 bytes from 192.168.0.32: icmp_seq=1 ttl=64 time=0.124 ms

--- 192.168.0.32 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1005ms
rtt min/avg/max/mdev = 0.124/2.907/5.691/2.784 ms, pipe 2

[b]rac1-> ping 10.10.10.32[/b]  // RAC2 VIP, status OK
PING 10.10.10.32 (10.10.10.32) 56(84) bytes of data.
64 bytes from 10.10.10.32: icmp_seq=0 ttl=64 time=2.96 ms
64 bytes from 10.10.10.32: icmp_seq=1 ttl=64 time=0.000 ms

--- 10.10.10.32 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1004ms
rtt min/avg/max/mdev = 0.000/1.482/2.964/1.482 ms, pipe 2

[b]Analysis:

Reconfigure CRS, and then reconfigure the nodes?[/b]
Any advice would be appreciated!

[[i] Last edited by anycall2010 on 2008-7-6 09:52 [/i]]

2008-7-5 08:27 anycall2010
More details:

rac1-> more /etc/ocfs2/cluster.conf
node:
        ip_port = 7777
        ip_address = 192.168.0.3
        number = 0
        name = rac1
        cluster = ocfs2

node:
        ip_port = 7777
        ip_address = 192.168.0.4
        number = 1
        name = rac2
        cluster = ocfs2

cluster:
        node_count = 2
        name = ocfs2


rac2-> more /etc/ocfs2/cluster.conf
node:
        ip_port = 7777
        ip_address = 192.168.0.3
        number = 0
        name = rac1
        cluster = ocfs2

node:
        ip_port = 7777
        ip_address = 192.168.0.4
        number = 1
        name = rac2
        cluster = ocfs2

cluster:
        node_count = 2
        name = ocfs2

The configuration is fine on both nodes!

2008-7-5 08:34 anycall2010
[root@rac1 ~]# /etc/init.d/o2cb status
Module "configfs": Loaded
Filesystem "configfs": Mounted
Module "ocfs2_nodemanager": Loaded
Module "ocfs2_dlm": Loaded
Module "ocfs2_dlmfs": Loaded
Filesystem "ocfs2_dlmfs": Mounted
Checking O2CB cluster ocfs2: Online
  Heartbeat dead threshold: 61
  Network idle timeout: 10000
  Network keepalive delay: 5000
  Network reconnect delay: 2000
Checking O2CB heartbeat: Active
The heartbeat is fine too! The second node shows the same.

2008-7-5 08:37 anycall2010
Do I need to reinstall CRS? Waiting online for replies...

2008-7-5 08:41 zhangweicai74
Following.

2008-7-5 08:52 anycall2010
Reinstalling CRS would be a big job. I would have to go through the following steps:

        rm /etc/oracle/*
        rm -f /etc/init.d/init.cssd
        rm -f /etc/init.d/init.crs
        rm -f /etc/init.d/init.crsd
        rm -f /etc/init.d/init.evmd
        rm -f /etc/rc2.d/K96init.crs
        rm -f /etc/rc2.d/S96init.crs
        rm -f /etc/rc3.d/K96init.crs
        rm -f /etc/rc3.d/S96init.crs
        rm -f /etc/rc5.d/K96init.crs
        rm -f /etc/rc5.d/S96init.crs
        rm -Rf /etc/oracle/scls_scr
        rm -f /etc/inittab.crs
        cp /etc/inittab.orig /etc/inittab
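
Since these deletions are hard to undo, it may be worth checking what is actually present first. A minimal sketch, with a hypothetical helper `list_crs_files` that covers only a subset of the paths above; the optional prefix argument exists solely so that it can be tried against a scratch directory.

```shell
# list_crs_files [PREFIX]: print which of the CRS boot files named above
# exist under PREFIX (empty by default, i.e. the real filesystem).
# Nothing is removed; this only reports.
list_crs_files() {
    root=${1:-}
    for f in /etc/init.d/init.cssd /etc/init.d/init.crs \
             /etc/init.d/init.crsd /etc/init.d/init.evmd \
             /etc/inittab.crs /etc/oracle/scls_scr; do
        [ -e "$root$f" ] && echo "$root$f"
    done
    return 0
}

# On a node, as root:
#   list_crs_files
```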

2008-7-5 09:07 anycall2010
Got lucky! I am not sure which command I just typed, but CRS is fine now!

rac1-> /u01/oracle/product/10.2.0/crs_1/bin/cluvfy stage -post crsinst -n rac1,rac2

Performing post-checks for cluster services setup

Checking node reachability...
Node reachability check passed from node "rac1".


Checking user equivalence...
User equivalence check passed for user "oracle".

Checking Cluster manager integrity...


Checking CSS daemon...
Daemon status check passed for "CSS daemon".

Cluster manager integrity check passed.

Checking cluster integrity...


Cluster integrity check passed


Checking OCR integrity...

Checking the absence of a non-clustered configuration...
All nodes free of non-clustered, local-only configurations.

Uniqueness check for OCR device passed.

Checking the version of OCR...
OCR of correct Version "2" exists.

Checking data integrity of OCR...
Data integrity check for OCR passed.

OCR integrity check passed.

Checking CRS integrity...

Checking daemon liveness...
Liveness check passed for "CRS daemon".

Checking daemon liveness...
Liveness check passed for "CSS daemon".

Checking daemon liveness...
Liveness check passed for "EVM daemon".

Checking CRS health...
CRS health check passed.

CRS integrity check passed.

Checking node application existence...


Checking existence of VIP node application (required)
Check passed.

Checking existence of ONS node application (optional)
Check passed.

Checking existence of GSD node application (optional)
Check passed.


Post-check for cluster services setup was successful.
rac1->
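
Comparing this run with the failed one earlier in the thread, the visible difference is the invoking user: the first run was made from a root prompt and failed user equivalence for "root", while this one was made from the oracle prompt and passed for "oracle". That may well be the whole explanation, since user equivalence is normally set up for the oracle account rather than for root. Below is a sketch for pulling that line out of saved cluvfy output; the function name `user_equiv_result` is hypothetical, and the pattern matches only the message wording shown in this thread.

```shell
# user_equiv_result: read cluvfy output on stdin and print
# "<user> <passed|failed>" for each user equivalence result line.
user_equiv_result() {
    sed -nE 's/^User equivalence check (passed|failed) for user "([^"]*)".*/\2 \1/p'
}

# Example (as the oracle user):
#   cluvfy stage -post crsinst -n rac1,rac2 | user_equiv_result
```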

2008-7-5 09:08 anycall2010
Things already look different from a moment ago:
rac1->  crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.dbvdb.db   application    ONLINE    UNKNOWN   rac1
ora....b1.inst application    ONLINE    OFFLINE
ora....b2.inst application    ONLINE    OFFLINE
ora....SM1.asm application    ONLINE    UNKNOWN   rac1
ora....C1.lsnr application    ONLINE    UNKNOWN   rac1
ora.rac1.gsd   application    ONLINE    UNKNOWN   rac1
ora.rac1.ons   application    ONLINE    UNKNOWN   rac1
ora.rac1.vip   application    ONLINE    ONLINE    rac1
ora....SM2.asm application    ONLINE    UNKNOWN   rac2
ora....C2.lsnr application    ONLINE    UNKNOWN   rac2
ora.rac2.gsd   application    ONLINE    UNKNOWN   rac2
ora.rac2.ons   application    ONLINE    UNKNOWN   rac2
ora.rac2.vip   application    ONLINE    ONLINE    rac2

2008-7-5 09:12 anycall2010
rac1-> srvctl status nodeapps -n rac1
VIP is running on node: rac1
GSD is not running on node: rac1
Listener is not running on node: rac1
ONS daemon is not running on node: rac1
The node application status is the same as before, no change? It now feels like an ASM problem.

2008-7-5 09:16 anycall2010
rac1-> srvctl start asm -n rac1
PRKS-1009 : Failed to start ASM instance "+ASM1" on node "rac1", [PRKS-1009 : Failed to start ASM instance "+ASM1" on node "rac1", [CRS-1028: Dependency analysis failed because of:
CRS-0223: Resource 'ora.rac1.ASM1.asm' has placement error.]]
  [PRKS-1009 : Failed to start ASM instance "+ASM1" on node "rac1", [CRS-1028: Dependency analysis failed because of:
CRS-0223: Resource 'ora.rac1.ASM1.asm' has placement error.]]
ASM cannot be started manually either? It looks like an ASM problem. Experts, what should I do at this point?
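
A commonly reported cause of CRS-0223 on 10g is a resource stuck in the UNKNOWN state, which is where ASM and the listeners sit in the `crs_stat -t` output earlier in this thread. The usual suggestion is to clear that state with `crs_stop -f <resource>` before starting the resource again. The sketch below only finds the candidates. It assumes the long `crs_stat` format (NAME= and STATE= lines), because `crs_stat -t` truncates resource names, and `unknown_resources` is a made-up helper name.

```shell
# unknown_resources: read long-format `crs_stat` output on stdin and
# print the NAME of every resource whose STATE begins with UNKNOWN.
unknown_resources() {
    awk -F= '
        $1 == "NAME"                     { name = $2 }
        $1 == "STATE" && $2 ~ /^UNKNOWN/ { print name }
    '
}

# Print, but do not run, one forced stop per stuck resource:
#   crs_stat | unknown_resources | sed 's/^/crs_stop -f /'
```

Printing the commands instead of executing them leaves room to review the list before forcing anything.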

2008-7-5 09:29 anycall2010
Yes, it really is an ASM problem. I did the following:
Node 1:
[root@rac1 ~]# /etc/init.d/oracleasm enable
Writing Oracle ASM library driver configuration:           [  OK  ]
Scanning system for ASM disks:                             [  OK  ]

[root@rac1 ~]# /etc/init.d/oracleasm scandisks
Scanning system for ASM disks:                             [  OK  ]

Node 2:
rac2-> /etc/init.d/oracleasm enable
Writing Oracle ASM library driver configuration: /etc/init.d/oracleasm: line 348: /etc/sysconfig/oracleasm: Permission denied     [FAILED]     // the error shows up here????

rac2-> /etc/init.d/oracleasm scandisks
Scanning system for ASM disks:                             [  OK  ]

2008-7-5 09:50 anycall2010
Oops, I ran it with the wrong privileges (from the oracle prompt, not as root)! What a blunder!

2008-7-5 09:50 anycall2010
[root@rac2 clusterware]# /etc/init.d/oracleasm  enable
Writing Oracle ASM library driver configuration:           [  OK  ]
Scanning system for ASM disks:                             [  OK  ]
[root@rac2 clusterware]# /etc/init.d/oracleasm scandisks
Scanning system for ASM disks:                             [  OK  ]
[root@rac2 clusterware]#
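
The `Permission denied` on rac2 came from running the init script from the oracle prompt. The script writes /etc/sysconfig/oracleasm, so it needs root. A small guard avoids that kind of slip; this is only a sketch, and both `is_root` and `asm_rescan` are hypothetical helper names.

```shell
# is_root [UID]: succeed if the given uid (default: the current user's) is 0.
is_root() {
    [ "${1:-$(id -u)}" -eq 0 ]
}

# asm_rescan: refuse to touch the ASMLib driver unless running as root.
asm_rescan() {
    if ! is_root; then
        echo "oracleasm must be run as root" >&2
        return 1
    fi
    /etc/init.d/oracleasm scandisks
}
```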

2008-7-5 10:20 anycall2010
[root@rac1 ~]# su - oracle
rac1-> crs_stop -all
Attempting to stop `ora.rac1.LISTENER_RAC1.lsnr` on member `rac1`
Attempting to stop `ora.rac2.LISTENER_RAC2.lsnr` on member `rac2`
Stop of `ora.rac1.LISTENER_RAC1.lsnr` on member `rac1` succeeded.
Target set to OFFLINE for `ora.dbvdb.dbvdb1.inst`
Attempting to stop `ora.rac1.ASM1.asm` on member `rac1`
Stop of `ora.rac2.LISTENER_RAC2.lsnr` on member `rac2` succeeded.
Target set to OFFLINE for `ora.dbvdb.dbvdb2.inst`
Attempting to stop `ora.rac2.ASM2.asm` on member `rac2`
Stop of `ora.rac1.ASM1.asm` on member `rac1` succeeded.
Attempting to stop `ora.rac1.vip` on member `rac1`
Stop of `ora.rac1.vip` on member `rac1` succeeded.
Stop of `ora.rac2.ASM2.asm` on member `rac2` succeeded.
Attempting to stop `ora.rac2.vip` on member `rac2`
Stop of `ora.rac2.vip` on member `rac2` succeeded.
rac1->  srvctl status nodeapps -n rac1
VIP is not running on node: rac1
GSD is not running on node: rac1
Listener is not running on node: rac1
ONS daemon is not running on node: rac1
rac1-> crs_start -all
Attempting to start `ora.rac1.vip` on member `rac1`
Attempting to start `ora.rac2.vip` on member `rac2`
Start of `ora.rac2.vip` on member `rac2` succeeded.
Start of `ora.rac1.vip` on member `rac1` succeeded.
Attempting to start `ora.rac1.ASM1.asm` on member `rac1`
Attempting to start `ora.rac2.ASM2.asm` on member `rac2`
Start of `ora.rac1.ASM1.asm` on member `rac1` succeeded.
Attempting to start `ora.dbvdb.dbvdb1.inst` on member `rac1`
Start of `ora.rac2.ASM2.asm` on member `rac2` succeeded.
Attempting to start `ora.dbvdb.dbvdb2.inst` on member `rac2`

Start of `ora.dbvdb.dbvdb1.inst` on member `rac1` succeeded.
Attempting to start `ora.rac1.LISTENER_RAC1.lsnr` on member `rac1`
Start of `ora.rac1.LISTENER_RAC1.lsnr` on member `rac1` succeeded.
Start of `ora.dbvdb.dbvdb2.inst` on member `rac2` failed.
rac1 : CRS-1018: Resource ora.rac2.vip (application) is already running on rac2


rac1 : CRS-1018: Resource ora.rac2.vip (application) is already running on rac2


CRS-0215: Could not start resource 'ora.dbvdb.dbvdb2.inst'.

CRS-0223: Resource 'ora.rac2.LISTENER_RAC2.lsnr' has placement error.

2008-7-5 15:34 zengzg
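I installed RAC on VMware, and later one of the machines went down. After rebooting both machines I saw the same few resources as in your case, and somehow they all came up.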

2008-7-5 15:43 anycall2010
Yours came up? Mine still has some problems, and the issue is not fully resolved yet. Still working on it...

2008-7-5 15:46 anycall2010
If only 张乐弈 could come and help...

2008-7-5 16:01 paulyibinyi
I ran into this on my first installation too. It is best to restart CRS on the nodes in order and to check the CRS logs.
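
As a rough sketch of that advice: in 10.2, Clusterware is stopped and started per node with `crsctl stop crs` and `crsctl start crs` as root, and its logs live under `$ORA_CRS_HOME/log/<node>/`. The helper name `crs_log_paths` is made up; the CRS home in the comments is the one used earlier in this thread.

```shell
# crs_log_paths CRS_HOME NODE: print the main 10.2 Clusterware log files
# for one node (alert log, crsd, ocssd, evmd), one path per line.
crs_log_paths() {
    crs_home=$1
    node=$2
    base="$crs_home/log/$node"
    printf '%s\n' \
        "$base/alert$node.log" \
        "$base/crsd/crsd.log" \
        "$base/cssd/ocssd.log" \
        "$base/evmd/evmd.log"
}

# As root, one node at a time (rac1 first, then rac2):
#   /u01/oracle/product/10.2.0/crs_1/bin/crsctl stop crs
#   /u01/oracle/product/10.2.0/crs_1/bin/crsctl start crs
# Then look at the tail of each log, for example:
#   crs_log_paths /u01/oracle/product/10.2.0/crs_1 rac1 | xargs tail -n 50
```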

2008-7-5 16:11 anycall2010
I always get the feeling that virtual machines are not very stable...

2008-7-5 16:16 anycall2010
rac1-> crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.dbvdb.db   application    ONLINE    ONLINE    rac1
ora....b1.inst application    ONLINE    ONLINE    rac1
ora....b2.inst application    ONLINE    ONLINE    rac2
ora....SM1.asm application    ONLINE    ONLINE    rac1
ora....C1.lsnr application    ONLINE    ONLINE    rac1
ora.rac1.gsd   application    ONLINE    ONLINE    rac1
ora.rac1.ons   application    ONLINE    ONLINE    rac1
ora.rac1.vip   application    ONLINE    ONLINE    rac1
ora....SM2.asm application    ONLINE    ONLINE    rac2
ora....C2.lsnr application    ONLINE    ONLINE    rac2
ora.rac2.gsd   application    ONLINE    ONLINE    rac2
ora.rac2.ons   application    ONLINE    ONLINE    rac2
ora.rac2.vip   application    ONLINE    ONLINE    rac2
rac1-> srvctl status nodeapps -n rac1
VIP is running on node: rac1
GSD is running on node: rac1
Listener is running on node: rac1
ONS daemon is running on node: rac1
rac1-> srvctl status nodeapps -n rac2


rac1->  srvctl status nodeapps -n rac2
VIP is running on node: rac2
GSD is running on node: rac2
Listener is running on node: rac2
ONS daemon is running on node: rac2
It looks hopeful now... but no guarantees.
