Thursday, June 17, 2010

11gR2 nodeapps vip issue

The VIP was down because its address had been attached to a different network device.

We hit the following issue in an 11gR2 RAC environment.


When you run the installer or the cluster verification utility, you can see that nodeapps is not running.

Checking nodeapps shows that one of its components (the VIP) is down.


[oragrd@XXX ~]$ srvctl status nodeapps -n XXX

VIP xxx-vip is enabled
VIP xxx-vip is not running
Network is enabled
Network is running on node: xxx
GSD is enabled
GSD is running on node: xxx
ONS is enabled
ONS daemon is running on node: xxx
eONS is enabled
eONS daemon is running on node: xxx
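
On a cluster with many nodes it can help to filter this output down to the VIPs that are reported as down. The following is a minimal sketch, assuming the output format shown above; the function name vips_not_running is hypothetical and not part of any Oracle tool.

```shell
# List the VIPs that srvctl reports as "not running".
# Reads `srvctl status nodeapps` output on stdin and prints one VIP name per line.
vips_not_running() {
    # Matches lines such as "VIP xxx-vip is not running"; field 2 is the VIP name.
    awk '/^VIP .* is not running/ { print $2 }'
}

# Usage on a cluster node (XXX is the node-name placeholder used in this post):
#   srvctl status nodeapps -n XXX | vips_not_running
```

An empty result means srvctl did not report any stopped VIP for that node.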



Check the status in the cluster registry:


$ crs_stat -t |grep vip
obi.vip application ONLINE ONLINE afso...b101
ora....101.vip ora....t1.type ONLINE OFFLINE
ora....102.vip ora....t1.type ONLINE OFFLINE
ora.scan1.vip ora....ip.type ONLINE ONLINE afso...b102
ora.scan2.vip ora....ip.type ONLINE ONLINE afso...b101
ora.scan3.vip ora....ip.type ONLINE ONLINE afso...b101
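
The rows to look for are those where the target is ONLINE but the state is not. A small filter for that, as a sketch only: it assumes the column order of the output above (name, type, target, state, host), and the function name offline_vips is made up for illustration.

```shell
# Print VIP resources whose target is ONLINE but whose current state is not.
# Reads `crs_stat -t` output on stdin.
# Assumed columns: 1=Name 2=Type 3=Target 4=State 5=Host (host is empty when offline).
offline_vips() {
    awk '$1 ~ /vip$/ && $3 == "ONLINE" && $4 != "ONLINE" { print $1 }'
}

# Usage:
#   crs_stat -t | offline_vips
```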


This indicates that the VIP is in an inconsistent state. Now check whether the VIP address is present on the network.
The command below shows that the VIP address is attached to eth3, as if it were a physical IP.


ifconfig -a


eth3 Link encap:Ethernet HWaddr F4:CE:46:AF:49:44
inet addr:10.32.200.151 Bcast:10.32.200.255 Mask:255.255.255.0
inet6 addr: fe80::f6ce:46ff:feaf:4944/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:6116467 errors:0 dropped:0 overruns:0 frame:0
TX packets:70229 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:1504763071 (1.4 GiB) TX bytes:3056240 (2.9 MiB)
Interrupt:139
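
Scanning a long ifconfig listing by eye is error prone, so here is a rough sketch of a helper that reports which interface holds a given address. It assumes the older ifconfig output format shown above ("Link encap:" and "inet addr:"); newer net-tools versions print a different layout. The name iface_for_ip is hypothetical.

```shell
# Print the interface(s) that carry a given IPv4 address.
# $1 = IPv4 address to look for; `ifconfig -a` output is read from stdin.
iface_for_ip() {
    awk -v ip="$1" '
        # A line containing "Link encap:" starts a new interface block.
        /Link encap:/ { iface = $1 }
        {
            # Compare whole fields so 10.32.200.15 does not match 10.32.200.151.
            for (i = 1; i <= NF; i++)
                if ($i == "addr:" ip) print iface
        }'
}

# Usage:
#   ifconfig -a | iface_for_ip 10.32.200.151
```

If this prints a plain device name such as eth3 while the cluster reports the VIP as offline, the address was most likely configured by the operating system rather than by the clusterware.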



10.32.200.151 is our VIP, and it is up on eth3. It should not be, since crs_stat shows the VIP resource as offline.
When you try to restart it, the cluster log reports that the VIP address is already in use:

===============================

2010-05-27 14:56:15.509: [UiServer][1520855360] Container [ Name: ORDER
MESSAGE:
TextMessage[CRS-5005: IP Address: 10.32.200.151 is already in use in the network]
MSGTYPE:
TextMessage[1]
OBJID:
TextMessage[ora.SOAIAT2.lsnr afsoqatdb102 1]
WAIT:
TextMessage[0]

===============================
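
To pull the conflicting addresses out of a large log, a filter like the one below may help. It is a sketch that assumes the CRS-5005 message text exactly as it appears in the excerpt above; the function name crs5005_ips is hypothetical, and the log file location is deliberately left as a placeholder.

```shell
# Extract the IP addresses named in CRS-5005 "already in use" messages.
# Reads log text on stdin and prints each distinct address once.
crs5005_ips() {
    sed -n 's/.*CRS-5005: IP Address: \([0-9.]*\) is already in use.*/\1/p' | sort -u
}

# Usage (replace <cluster log file> with the log you are inspecting):
#   crs5005_ips < <cluster log file>
```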



Workaround to fix this issue:

[oragrd@XXX ~]$ srvctl stop nodeapps -n XXX

Bring down the device to which the VIP is attached.

Run the command below as root.

ifdown eth3



Run ifconfig and make sure that eth3 is no longer up and that the VIP address is not attached to any device.

Then start the nodeapps

[oragrd@XXX ~]$ srvctl start nodeapps -n XXX
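
Because the steps above switch between the grid owner and root, it can be useful to print the full sequence for review before running anything. The sketch below only prints the commands and executes nothing; the function name print_workaround and the "grid:" / "root:" prefixes are illustrative conventions, not output of any Oracle tool.

```shell
# Dry run: print the workaround commands in order, without executing them.
# $1 = node name, $2 = interface that wrongly holds the VIP address.
print_workaround() {
    printf 'grid: srvctl stop nodeapps -n %s\n' "$1"
    printf 'root: ifdown %s\n' "$2"
    printf 'root: ifconfig %s\n' "$2"     # confirm the VIP address is gone
    printf 'grid: srvctl start nodeapps -n %s\n' "$1"
    printf 'grid: srvctl status nodeapps -n %s\n' "$1"
}

# Usage:
#   print_workaround XXX eth3
```

Before taking an interface down, check that it does not also carry traffic you depend on, such as your own SSH session.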

Check the status.

[oragrd@XXX ~]$ srvctl status nodeapps -n XXX
-n option has been deprecated.
VIP XXX-vip is enabled
VIP XXX-vip is running on node: XXX
Network is enabled
Network is running on node: XXX
GSD is enabled
GSD is running on node: XXX
ONS is enabled
ONS daemon is running on node: XXX
eONS is enabled
eONS daemon is running on node: XXX

Check on the cluster:

[oragrd@XXX ~]$ crs_stat -t | grep vip
obi.vip application ONLINE ONLINE afso...b101
ora....101.vip ora....t1.type ONLINE ONLINE afso...b101
ora....102.vip ora....t1.type ONLINE ONLINE afso...b102
ora.scan1.vip ora....ip.type ONLINE ONLINE afso...b101
ora.scan2.vip ora....ip.type ONLINE ONLINE afso...b101
ora.scan3.vip ora....ip.type ONLINE ONLINE afso...b101
[oragrd@XXX ~]$

Now make sure the VIP address does not get attached to any device at boot. In our case it was getting attached to eth3, so we need to remove it from that interface's configuration permanently.

cd /etc/sysconfig/network-scripts
vi ifcfg-eth3

Remove these lines:
IPADDR=
NETMASK=
PEERDNS=yes




This should prevent the device from picking up the IP after reboot.
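
If you prefer not to edit the file by hand, the same change can be made with a filter after taking a backup. This is a minimal sketch, assuming the three settings each start at the beginning of a line; the function name strip_static_ip and the backup location are hypothetical.

```shell
# Drop the IPADDR, NETMASK and PEERDNS settings from an ifcfg file.
# Reads the file content on stdin and writes the filtered content to stdout.
strip_static_ip() {
    grep -v -E '^(IPADDR|NETMASK|PEERDNS)='
}

# Usage as root (keep the backup outside network-scripts so it is not
# mistaken for an interface configuration):
#   cp /etc/sysconfig/network-scripts/ifcfg-eth3 /root/ifcfg-eth3.bak
#   strip_static_ip < /root/ifcfg-eth3.bak > /etc/sysconfig/network-scripts/ifcfg-eth3
```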

Thanks,
Sandarsh Chavalmane
