|
Symptom:
db2 connect to <dbname>
SQL0902 xxxxxxx
Analysis:
First, clear db2diag.log, reproduce the problem, and then check db2diag.log again.
It contains the error "Requesting too many semaphores".
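The log search can be sketched in shell roughly as follows. The helper name `find_sem_errors` is hypothetical, and the log path is passed in as an argument rather than assumed.

```shell
# Hypothetical helper: print matching lines (with line numbers) from a db2diag.log.
# $1 is the path to the log file; the exit status is non-zero when nothing matches.
find_sem_errors() {
    grep -n -i "Requesting too many semaphores" "$1"
}

# Example call (path taken from this case, adjust to your instance):
#   find_sem_errors /var/ibmdb2/nytxt030/sqllib/db2dump/db2diag.log
```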
So the problem appears to be related to semaphores. db2level shows that this is a Linux build, MI00141_15387.
Since this is Linux, the next step is to look at ipcs -l:
piidb279 /var/msdb2/tmp 28# uname -a
Linux piidb279 2.4.21-32.0.1.EL.msdwhugemem #1 SMP Mon Dec 5 21:32:44 EST 2005 i686
piidb279 /var/msdb2/tmp 29# db2level
DB21085I Instance "nytxt030" uses "32" bits and DB2 code release "SQL08023"
with level identifier "03040106".
Informational tokens are "DB2 v8.1.1.98", "special_15387", "MI00141_15387", and
FixPak "10".
Product is installed at "/opt/IBM/db2/V8.1".
piidb279 /var/ibmdb2/nytxt030/sqllib/db2dump 67# ipcs -s | wc -l
1028
piidb279 /var/ibmdb2/nytxt030/sqllib/db2dump 68# ipcs -l
------ Shared Memory Limits --------
max number of segments = 4096
max seg size (kbytes) = 4194303
max total shared memory (kbytes) = 12582912
min seg size (bytes) = 1
------ Semaphore Limits --------
max number of arrays = 512
max semaphores per array = 250
max semaphores system wide = 32000
max ops per semop call = 32
semaphore max value = 32767
------ Messages: Limits --------
max queues system wide = 1024
max size of message (bytes) = 8192
default max size of queue (bytes) = 16384
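The problem is now fairly obvious: max number of arrays is only 512, yet ipcs -s lists about 1024 semaphore arrays (1028 lines of output, a few of which are header and blank lines).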
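The comparison above can be made less error-prone than `wc -l` by counting only data rows. This is a minimal sketch: the function names `count_sem_arrays` and `max_sem_arrays` are hypothetical, and it assumes the usual Linux `ipcs` layout, where each semaphore row starts with a hex key followed by a numeric semid.

```shell
# Count data rows in `ipcs -s` output read from stdin.
# A data row is assumed to start with a hex key (0x...) followed by a numeric semid,
# so headers and blank lines are not counted.
count_sem_arrays() {
    awk '$1 ~ /^0x[0-9a-fA-F]+$/ && $2 ~ /^[0-9]+$/ { n++ } END { print n + 0 }'
}

# Extract the "max number of arrays" value from `ipcs -l` output read from stdin.
max_sem_arrays() {
    awk -F'= *' '/max number of arrays/ { print $2 + 0; exit }'
}

# Example use on a live system:
#   used=$(ipcs -s | count_sem_arrays)
#   limit=$(ipcs -l | max_sem_arrays)
#   [ "$used" -ge "$limit" ] && echo "semaphore arrays exhausted: $used / $limit"
```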
My first thought was that this might be a bug. The environment details:
Machine Type: i686
----------------------------------------------------libc.so Information
GLIBC version: 2.3.2
--------------------------------------------------libstdc++ Information
DB2 v8.x (x86 for 2.4 kernels): does not require the libstdc++ library.
.
uname -a:
Linux piidb279 2.4.21-32.0.1.EL.msdwhugemem #1 SMP Mon Dec 5 21:32:44 EST 2005 i686 i686 i386 GNU/Linux
.
sysinfo:
...
Manufacturer is Dell (Dell Computer Corporation)
Manufacturer (Short) is Dell
Manufacturer (Full) is Dell Computer Corporation
:
App Architecture is x86
Kernel Architecture is i686
Kernel Bit Size is 32
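I then searched the known Linux and DB2 bugs but found nothing suspicious.
So I set the bug theory aside for the moment and approached the problem from another angle.
I asked the customer and learned that DB2 had crashed a week earlier, after which they ran db2start to restart the instance.
When asked whether they had cleaned up resources before restarting, the answer was no.
The error itself also shows that the SQL0902 was raised by the IPC layer.
From this it can be inferred that the problem was caused by the customer not releasing the IPC resources left over from the crash.
I told the customer that after any future crash they must release all IPC resources, otherwise a similar situation will occur.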
Conclusion:
Always *release resources* before restarting an instance!
This includes:
db2_kill
ipclean
ps -ef|grep -i db2
kill all processes left |
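The cleanup steps listed above could be wrapped in a small script along these lines. This is only a sketch under stated assumptions: the function names and the `DRY_RUN` variable are hypothetical, the script is meant to run as the instance owner, and matching on "db2" in `ps -ef` is deliberately broad, so review the dry-run output before setting `DRY_RUN=0`.

```shell
# Print a command when DRY_RUN=1 (the default in this sketch), otherwise run it.
run_step() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Read `ps -ef` output on stdin and print the PIDs (second column) of lines
# mentioning db2, skipping the header and this pipeline's own awk/grep helpers.
leftover_db2_pids() {
    awk 'NR > 1 && tolower($0) ~ /db2/ && $0 !~ /awk|grep/ { print $2 }'
}

# Cleanup sequence from the list above; intended to run as the instance owner
# after a crash and before db2start.
cleanup_db2_ipc() {
    run_step db2_kill
    run_step ipclean
    for pid in $(ps -ef | leftover_db2_pids); do
        run_step kill "$pid"
    done
}
```

Calling `cleanup_db2_ipc` with the default settings only prints the commands it would run; once the list looks right, `DRY_RUN=0 cleanup_db2_ipc` executes them.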
|