2008-7-5 08:23
anycall2010
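Problems after installing RAC [Solved]
[b]Environment:[/b]
Database version: 10.2.0.1. Operating system (kernel): 2.6.9-55.0.0.0.2.ELhugemem
[b]Cause of the error:[/b]
I installed RAC in virtual machines, and the installation itself finished without any errors. Because the machines were extremely slow afterwards, I forced a reboot, and then the following problems appeared.
[b]Database status:[/b]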
[b]rac1-> crs_stat -t[/b]
Name Type Target State Host
------------------------------------------------------------
ora.dbvdb.db application ONLINE UNKNOWN rac1
ora....b1.inst application ONLINE OFFLINE
ora....b2.inst application ONLINE OFFLINE
ora....SM1.asm application ONLINE UNKNOWN rac1
ora....C1.lsnr application ONLINE UNKNOWN rac1
ora.rac1.gsd application ONLINE UNKNOWN rac1
ora.rac1.ons application ONLINE UNKNOWN rac1
ora.rac1.vip application ONLINE ONLINE rac1
ora....SM2.asm application ONLINE UNKNOWN rac2
ora....C2.lsnr application ONLINE UNKNOWN rac2
ora.rac2.gsd application ONLINE UNKNOWN rac2
ora.rac2.ons application ONLINE UNKNOWN rac2
ora.rac2.vip application ONLINE ONLINE rac2
[b]rac1-> srvctl status nodeapps -n rac1[/b]
VIP is running on node: rac1
GSD is not running on node: rac1
Listener is not running on node: rac1
ONS daemon is not running on node: rac1
[b]rac1-> srvctl status asm -n rac1[/b] //ASM will not start either
ASM instance +ASM1 is not running on node rac1.
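To pick out the problem resources from a listing like the one above, the State column can be filtered. This is a minimal sketch, assuming the five-column `crs_stat -t` layout shown in this post; the function name `crs_not_online` is made up for illustration.

```shell
# List resources from `crs_stat -t` output whose State column is not ONLINE.
# Reads the table on stdin and prints "<name> <state>" per affected resource.
crs_not_online() {
    awk '
        $1 == "Name" || /^-+$/     { next }          # skip header and separator
        NF >= 4 && $4 != "ONLINE"  { print $1, $4 }  # column 4 is State
    '
}

# Usage on a cluster node (assumes crs_stat is on PATH):
#   crs_stat -t | crs_not_online
```

Note that `crs_stat -t` abbreviates long resource names, so the printed names are only good for a quick overview.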
[b]Clusterware check:[/b]
[root@rac1 ~]# [b]/u01/oracle/product/10.2.0/crs_1/bin/cluvfy stage -post crsinst -n rac1,rac2[/b]
Performing post-checks for cluster services setup
Checking node reachability...
Node reachability check passed from node "rac1".
Checking user equivalence...
User equivalence check failed for user "root".
Check failed on nodes:
rac2,rac1
ERROR:
User equivalence unavailable on all the nodes.
Verification cannot proceed.
Post-check for cluster services setup was unsuccessful on all the nodes.
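The check above was started from a root prompt, while the run that passes later in this thread was started from the oracle prompt. Assuming SSH user equivalence was set up only for the oracle user (the usual arrangement for RAC, but not confirmed in this thread), a quick pre-check from the current account could look like the sketch below. `check_user_equiv` and the `SSH_CMD` override are hypothetical names introduced here.

```shell
# Check passwordless SSH (user equivalence) from the current user to each node.
# SSH_CMD is a hypothetical override hook for testing; it defaults to ssh.
check_user_equiv() {
    rc=0
    for node in "$@"; do
        # BatchMode=yes makes ssh fail instead of prompting for a password.
        if ${SSH_CMD:-ssh} -o BatchMode=yes -o ConnectTimeout=5 "$node" true </dev/null >/dev/null 2>&1; then
            echo "$node: ok"
        else
            echo "$node: FAILED"
            rc=1
        fi
    done
    return $rc
}

# Usage, run as the account that will run cluvfy:
#   check_user_equiv rac1 rac2
```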
[b]rac1-> crsctl check crs[/b] // CRS on node 1 is healthy!
CSS appears healthy
CRS appears healthy
EVM appears healthy
[b]rac2-> crsctl check crs[/b] // CRS on node 2 is healthy!
CSS appears healthy
CRS appears healthy
EVM appears healthy
[b]Network check:[/b]
[b]rac1-> ping 192.168.0.4[/b] //RAC2 host address, status OK//
PING 192.168.0.4 (192.168.0.4) 56(84) bytes of data.
64 bytes from 192.168.0.4: icmp_seq=0 ttl=64 time=2.03 ms
64 bytes from 192.168.0.4: icmp_seq=1 ttl=64 time=0.736 ms
--- 192.168.0.4 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1002ms
rtt min/avg/max/mdev = 0.736/1.387/2.038/0.651 ms, pipe 2
[b]rac1-> ping 192.168.0.32[/b] //RAC2 private address, status OK//
PING 192.168.0.32 (192.168.0.32) 56(84) bytes of data.
64 bytes from 192.168.0.32: icmp_seq=0 ttl=64 time=5.69 ms
64 bytes from 192.168.0.32: icmp_seq=1 ttl=64 time=0.124 ms
--- 192.168.0.32 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1005ms
rtt min/avg/max/mdev = 0.124/2.907/5.691/2.784 ms, pipe 2
[b]rac1-> ping 10.10.10.32[/b] //RAC2 VIP, status OK//
PING 10.10.10.32 (10.10.10.32) 56(84) bytes of data.
64 bytes from 10.10.10.32: icmp_seq=0 ttl=64 time=2.96 ms
64 bytes from 10.10.10.32: icmp_seq=1 ttl=64 time=0.000 ms
--- 10.10.10.32 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1004ms
rtt min/avg/max/mdev = 0.000/1.482/2.964/1.482 ms, pipe 2
[b]Analysis:
Reconfigure CRS, then reconfigure the nodes?[/b]
Any suggestions would be appreciated!
[[i] Last edited by anycall2010 on 2008-7-6 09:52 [/i]]
2008-7-5 08:27
anycall2010
More details:
rac1-> more /etc/ocfs2/cluster.conf
node:
ip_port = 7777
ip_address = 192.168.0.3
number = 0
name = rac1
cluster = ocfs2
node:
ip_port = 7777
ip_address = 192.168.0.4
number = 1
name = rac2
cluster = ocfs2
cluster:
node_count = 2
name = ocfs2
rac2-> more /etc/ocfs2/cluster.conf
node:
ip_port = 7777
ip_address = 192.168.0.3
number = 0
name = rac1
cluster = ocfs2
node:
ip_port = 7777
ip_address = 192.168.0.4
number = 1
name = rac2
cluster = ocfs2
cluster:
node_count = 2
name = ocfs2
The configuration on both nodes looks fine!
2008-7-5 08:34
anycall2010
[root@rac1 ~]# /etc/init.d/o2cb status
Module "configfs": Loaded
Filesystem "configfs": Mounted
Module "ocfs2_nodemanager": Loaded
Module "ocfs2_dlm": Loaded
Module "ocfs2_dlmfs": Loaded
Filesystem "ocfs2_dlmfs": Mounted
Checking O2CB cluster ocfs2: Online
Heartbeat dead threshold: 61
Network idle timeout: 10000
Network keepalive delay: 5000
Network reconnect delay: 2000
Checking O2CB heartbeat: Active
The heartbeat is fine too! Node 2 shows the same output.
2008-7-5 08:37
anycall2010
Do I need to reinstall CRS? Waiting online for replies...
2008-7-5 08:41
zhangweicai74
Following.
2008-7-5 08:52
anycall2010
Reinstalling CRS would be a big job. I would have to run the following steps:
rm /etc/oracle/*
rm -f /etc/init.d/init.cssd
rm -f /etc/init.d/init.crs
rm -f /etc/init.d/init.crsd
rm -f /etc/init.d/init.evmd
rm -f /etc/rc2.d/K96init.crs
rm -f /etc/rc2.d/S96init.crs
rm -f /etc/rc3.d/K96init.crs
rm -f /etc/rc3.d/S96init.crs
rm -f /etc/rc5.d/K96init.crs
rm -f /etc/rc5.d/S96init.crs
rm -Rf /etc/oracle/scls_scr
rm -f /etc/inittab.crs
cp /etc/inittab.orig /etc/inittab
2008-7-5 09:07
anycall2010
I got lucky! I don't know which command I just typed, but CRS is fine now!
rac1-> /u01/oracle/product/10.2.0/crs_1/bin/cluvfy stage -post crsinst -n rac1,rac2
Performing post-checks for cluster services setup
Checking node reachability...
Node reachability check passed from node "rac1".
Checking user equivalence...
User equivalence check passed for user "oracle".
Checking Cluster manager integrity...
Checking CSS daemon...
Daemon status check passed for "CSS daemon".
Cluster manager integrity check passed.
Checking cluster integrity...
Cluster integrity check passed
Checking OCR integrity...
Checking the absence of a non-clustered configuration...
All nodes free of non-clustered, local-only configurations.
Uniqueness check for OCR device passed.
Checking the version of OCR...
OCR of correct Version "2" exists.
Checking data integrity of OCR...
Data integrity check for OCR passed.
OCR integrity check passed.
Checking CRS integrity...
Checking daemon liveness...
Liveness check passed for "CRS daemon".
Checking daemon liveness...
Liveness check passed for "CSS daemon".
Checking daemon liveness...
Liveness check passed for "EVM daemon".
Checking CRS health...
CRS health check passed.
CRS integrity check passed.
Checking node application existence...
Checking existence of VIP node application (required)
Check passed.
Checking existence of ONS node application (optional)
Check passed.
Checking existence of GSD node application (optional)
Check passed.
Post-check for cluster services setup was successful.
rac1->
2008-7-5 09:08
anycall2010
The status is already different from a moment ago:
rac1-> crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora.dbvdb.db application ONLINE UNKNOWN rac1
ora....b1.inst application ONLINE OFFLINE
ora....b2.inst application ONLINE OFFLINE
ora....SM1.asm application ONLINE UNKNOWN rac1
ora....C1.lsnr application ONLINE UNKNOWN rac1
ora.rac1.gsd application ONLINE UNKNOWN rac1
ora.rac1.ons application ONLINE UNKNOWN rac1
ora.rac1.vip application ONLINE ONLINE rac1
ora....SM2.asm application ONLINE UNKNOWN rac2
ora....C2.lsnr application ONLINE UNKNOWN rac2
ora.rac2.gsd application ONLINE UNKNOWN rac2
ora.rac2.ons application ONLINE UNKNOWN rac2
ora.rac2.vip application ONLINE ONLINE rac2
2008-7-5 09:12
anycall2010
rac1-> srvctl status nodeapps -n rac1
VIP is running on node: rac1
GSD is not running on node: rac1
Listener is not running on node: rac1
ONS daemon is not running on node: rac1
The node status is the same as before, with no change? It now looks like an ASM problem.
2008-7-5 09:16
anycall2010
rac1-> srvctl start asm -n rac1
PRKS-1009 : Failed to start ASM instance "+ASM1" on node "rac1", [PRKS-1009 : Failed to start ASM instance "+ASM1" on node "rac1", [CRS-1028: Dependency analysis failed because of:
CRS-0223: Resource 'ora.rac1.ASM1.asm' has placement error.]]
[PRKS-1009 : Failed to start ASM instance "+ASM1" on node "rac1", [CRS-1028: Dependency analysis failed because of:
CRS-0223: Resource 'ora.rac1.ASM1.asm' has placement error.]]
ASM cannot be started manually either? It looks like an ASM problem. Experts, what should I do at this point?
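A commonly used way to clear 10g Clusterware resources stuck in UNKNOWN is to force-stop them with `crs_stop -f` and then start them again. This thread does not confirm that this is what resolved the placement error here, so treat the following as a sketch only. It assumes the long `crs_stat` output format (`NAME=...` followed by `STATE=...` lines), and `unknown_stop_cmds` is a hypothetical helper that prints commands without running them.

```shell
# Print (do not run) a forced stop for each resource reported as UNKNOWN.
# Assumes the long `crs_stat` format with NAME=... and STATE=... lines on stdin.
unknown_stop_cmds() {
    awk -F= '
        $1 == "NAME"                     { name = $2 }
        $1 == "STATE" && $2 ~ /^UNKNOWN/ { print "crs_stop -f " name }
    '
}

# Usage on a cluster node (review the output before running any of it):
#   crs_stat | unknown_stop_cmds
```

Printing the commands first keeps a person in the loop, which seems safer than stopping resources automatically on a cluster that is already in an unclear state.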
2008-7-5 09:29
anycall2010
Yes, it really is an ASM problem. I did the following:
Node 1:
[root@rac1 ~]# /etc/init.d/oracleasm enable
Writing Oracle ASM library driver configuration: [ OK ]
Scanning system for ASM disks: [ OK ]
[root@rac1 ~]# /etc/init.d/oracleasm scandisks
Scanning system for ASM disks: [ OK ]
Node 2:
rac2-> /etc/init.d/oracleasm enable
Writing Oracle ASM library driver configuration: /etc/init.d/oracleasm: line 348: /etc/sysconfig/oracleasm: Permission denied [FAILED] //the error appears here????
rac2-> /etc/init.d/oracleasm scandisks
Scanning system for ASM disks: [ OK ]
2008-7-5 09:50
anycall2010
Argh, I ran it with the wrong privileges! What a blunder!
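The "Permission denied" above came from running `/etc/init.d/oracleasm enable` at the oracle prompt, since the script writes to `/etc/sysconfig/oracleasm`. A small guard can catch that before the script runs. This is only a sketch; `is_root` and `oracleasm_as_root` are hypothetical helper names.

```shell
# Succeed only when the given uid (default: the current user's uid) is 0, i.e. root.
is_root() {
    [ "${1:-$(id -u)}" -eq 0 ]
}

# Hypothetical wrapper: refuse to call the oracleasm init script unless root.
oracleasm_as_root() {
    if ! is_root; then
        echo "oracleasm must be run as root" >&2
        return 1
    fi
    /etc/init.d/oracleasm "$@"
}

# Usage:
#   oracleasm_as_root enable
#   oracleasm_as_root scandisks
```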
2008-7-5 09:50
anycall2010
[root@rac2 clusterware]# /etc/init.d/oracleasm enable
Writing Oracle ASM library driver configuration: [ OK ]
Scanning system for ASM disks: [ OK ]
[root@rac2 clusterware]# /etc/init.d/oracleasm scandisks
Scanning system for ASM disks: [ OK ]
[root@rac2 clusterware]#
2008-7-5 10:20
anycall2010
[root@rac1 ~]# su - oracle
rac1-> crs_stop -all
Attempting to stop `ora.rac1.LISTENER_RAC1.lsnr` on member `rac1`
Attempting to stop `ora.rac2.LISTENER_RAC2.lsnr` on member `rac2`
Stop of `ora.rac1.LISTENER_RAC1.lsnr` on member `rac1` succeeded.
Target set to OFFLINE for `ora.dbvdb.dbvdb1.inst`
Attempting to stop `ora.rac1.ASM1.asm` on member `rac1`
Stop of `ora.rac2.LISTENER_RAC2.lsnr` on member `rac2` succeeded.
Target set to OFFLINE for `ora.dbvdb.dbvdb2.inst`
Attempting to stop `ora.rac2.ASM2.asm` on member `rac2`
Stop of `ora.rac1.ASM1.asm` on member `rac1` succeeded.
Attempting to stop `ora.rac1.vip` on member `rac1`
Stop of `ora.rac1.vip` on member `rac1` succeeded.
Stop of `ora.rac2.ASM2.asm` on member `rac2` succeeded.
Attempting to stop `ora.rac2.vip` on member `rac2`
Stop of `ora.rac2.vip` on member `rac2` succeeded.
rac1-> srvctl status nodeapps -n rac1
VIP is not running on node: rac1
GSD is not running on node: rac1
Listener is not running on node: rac1
ONS daemon is not running on node: rac1
rac1-> crs_start -all
Attempting to start `ora.rac1.vip` on member `rac1`
Attempting to start `ora.rac2.vip` on member `rac2`
Start of `ora.rac2.vip` on member `rac2` succeeded.
Start of `ora.rac1.vip` on member `rac1` succeeded.
Attempting to start `ora.rac1.ASM1.asm` on member `rac1`
Attempting to start `ora.rac2.ASM2.asm` on member `rac2`
Start of `ora.rac1.ASM1.asm` on member `rac1` succeeded.
Attempting to start `ora.dbvdb.dbvdb1.inst` on member `rac1`
Start of `ora.rac2.ASM2.asm` on member `rac2` succeeded.
Attempting to start `ora.dbvdb.dbvdb2.inst` on member `rac2`
Start of `ora.dbvdb.dbvdb1.inst` on member `rac1` succeeded.
Attempting to start `ora.rac1.LISTENER_RAC1.lsnr` on member `rac1`
Start of `ora.rac1.LISTENER_RAC1.lsnr` on member `rac1` succeeded.
Start of `ora.dbvdb.dbvdb2.inst` on member `rac2` failed.
rac1 : CRS-1018: Resource ora.rac2.vip (application) is already running on rac2
rac1 : CRS-1018: Resource ora.rac2.vip (application) is already running on rac2
CRS-0215: Could not start resource 'ora.dbvdb.dbvdb2.inst'.
CRS-0223: Resource 'ora.rac2.LISTENER_RAC2.lsnr' has placement error.
2008-7-5 15:34
zengzg
After I installed RAC on VMware, one machine went down. After rebooting both machines I saw the same thing as you: several resources somehow came up.
2008-7-5 15:43
anycall2010
Yours come up? Mine still has a few problems, and the issue is not fully solved yet. Still working on it...
2008-7-5 15:46
anycall2010
If only 张乐弈 could come and help...
2008-7-5 16:01
paulyibinyi
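I ran into this on my first install too. It is best to restart CRS on the nodes one at a time, in node order, and to check the CRS logs.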
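The advice above can be sketched as a dry-run plan. The CRS home is taken from the cluvfy path earlier in this thread; the `crsd.log` location under `log/<node>/crsd/` is the usual 10.2 layout but is an assumption here, and `crs_restart_plan` is a hypothetical helper that only prints commands.

```shell
# CRS home as seen in the cluvfy path earlier in this thread; override if different.
CRS_HOME=${CRS_HOME:-/u01/oracle/product/10.2.0/crs_1}

# Print a restart plan for each node, in the order given. Nothing is executed.
crs_restart_plan() {
    for node in "$@"; do
        echo "# on $node (as root):"
        echo "$CRS_HOME/bin/crsctl stop crs"
        echo "$CRS_HOME/bin/crsctl start crs"
        # Assumed 10.2 log location; adjust if your layout differs.
        echo "tail -n 50 $CRS_HOME/log/$node/crsd/crsd.log"
    done
}

# Usage:
#   crs_restart_plan rac1 rac2
```

Finishing one node, and confirming with `crsctl check crs` that it is healthy, before moving to the next keeps the restart in node order.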
2008-7-5 16:11
anycall2010
I keep feeling that virtual machines are not very stable...
2008-7-5 16:16
anycall2010
rac1-> crs_stat -t
Name Type Target State Host
------------------------------------------------------------
ora.dbvdb.db application ONLINE ONLINE rac1
ora....b1.inst application ONLINE ONLINE rac1
ora....b2.inst application ONLINE ONLINE rac2
ora....SM1.asm application ONLINE ONLINE rac1
ora....C1.lsnr application ONLINE ONLINE rac1
ora.rac1.gsd application ONLINE ONLINE rac1
ora.rac1.ons application ONLINE ONLINE rac1
ora.rac1.vip application ONLINE ONLINE rac1
ora....SM2.asm application ONLINE ONLINE rac2
ora....C2.lsnr application ONLINE ONLINE rac2
ora.rac2.gsd application ONLINE ONLINE rac2
ora.rac2.ons application ONLINE ONLINE rac2
ora.rac2.vip application ONLINE ONLINE rac2
rac1-> srvctl status nodeapps -n rac1
VIP is running on node: rac1
GSD is running on node: rac1
Listener is running on node: rac1
ONS daemon is running on node: rac1
rac1-> srvctl status nodeapps -n rac2
rac1-> srvctl status nodeapps -n rac2
VIP is running on node: rac2
GSD is running on node: rac2
Listener is running on node: rac2
ONS daemon is running on node: rac2
Looks like there is hope... but no guarantees.