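Environment:
db2 v95
linux RH4 Update6 EM64T, kernel 2.6.9
Problem:
The system originally ran on db2 v91 fp2 and worked fine. After the upgrade to db2 v95 fp1, db2start would no longer start and reported this error: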
$db2start
Floating point exception
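Analysis:
First of all, my initial reaction to this error was that the db2 installation had gone wrong.
I took the db2 installation log and skimmed through it, but found nothing wrong.
So next I ran db2iupdt, which did not seem to fix the problem either...
Strange. Since db2start will not run, let's try db2pd first. (Why db2pd? Because it is lightweight: when run without arguments it does almost no DB2-related work, which makes it ideal for testing whether the problem lies inside the db2 engine.)
After typing the db2pd command, it turned out that db2pd hit exactly the same error!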
db2pd
Floating point exception
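Now that is rather strange...
Where to go next? Could db2pd be corrupted? Or does linux not recognize this executable format?
Let's run file on it. It reports the following: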
setuid setgid ELF 64-bit LSB executable, AMD x86-64, version 1 (SYSV), for GNU/Linux 2.4.1, dynamically linked (uses shared libs), not stripped
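Looks perfectly normal...
I copied db2pd to another box that also runs linux, and there it executes just fine, without the error...
So a corrupted db2pd file does not seem to be the cause. Time for gdb: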
gdb db2pd
GNU gdb Red Hat Linux (6.3.0.0-1.153.el4_6.2rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...Using host libthread_db library "/lib64/tls/libthread_db.so.1".
(gdb) info shared
No shared libraries loaded at this time.
(gdb) run
Starting program: /home/db2inst2/sqllib/adm/db2pd
Program received signal SIGFPE, Arithmetic exception.
0x0000003660107927 in ?? ()
(gdb) where
#0 0x0000003660107927 in ?? ()
#1 0x0000007fbfffecc0 in ?? ()
#2 0x056bafd260109e33 in ?? ()
#3 0x0000007f00000005 in ?? ()
#4 0x0000002a9bb41000 in ?? ()
#5 0x0000000000000000 in ?? ()
(gdb) info shared
From To Syms Read Shared Object Library
0x0000002a9647b3f0 0x0000002a96af8488 No /opt/ibm/db2/V9.5/lib64/libdb2e.so.1
0x0000002a99ff40e0 0x0000002a99ff41c8 No /opt/ibm/db2/V9.5/lib64/libicudatadb2.so.32
0x0000002a9aad56f0 0x0000002a9ab875b8 No /opt/ibm/db2/V9.5/lib64/libicui18ndb2.so.32
0x0000002a9ad13b60 0x0000002a9ad19428 No /opt/ibm/db2/V9.5/lib64/libicuiodb2.so.32
0x0000002a9ae24870 0x0000002a9ae27948 No /opt/ibm/db2/V9.5/lib64/libiculxdb2.so.32
0x0000002a9af99720 0x0000002a9b032278 No /opt/ibm/db2/V9.5/lib64/libicuucdb2.so.32
0x0000002a9b1ca8f0 0x0000002a9b1daf88 No /opt/ibm/db2/V9.5/lib64/libiculedb2.so.32
0x0000002a9b301150 0x0000002a9b304318 No /opt/ibm/db2/V9.5/lib64/libdb2install.so.1
0x0000002a9b6a2ef0 0x0000002a9b6db4c8 No /opt/ibm/db2/V9.5/lib64/libdb2osse.so.1
0x0000003660603df0 0x0000003660646b98 No /lib64/tls/libm.so.6
0x0000002a9bb8f800 0x0000002a9bbe56f8 No /usr/lib64/libstdc++.so.5
0x0000003664b02060 0x0000003664b0ab08 No /lib64/libgcc_s.so.1
0x000000366031c200 0x00000036603faa2c No /lib64/tls/libc.so.6
0x0000003660800f80 0x0000003660801918 No /lib64/libdl.so.2
0x0000002a9bd1f570 0x0000002a9bd1f741 No /usr/lib64/libaio.so.1
0x0000003667700b40 0x0000003667703008 No /lib64/libcrypt.so.1
0x0000003661405100 0x000000366140d048 No /lib64/tls/libpthread.so.0
0x0000003666302c90 0x0000003666307b38 No /lib64/tls/librt.so.1
0x0000002a9bf21d20 0x0000002a9bf21e08 No /opt/ibm/db2/V9.5/lib64/libDB2XML4CMessages.so
0x0000002a9c13bd10 0x0000002a9c13bdf8 No /opt/ibm/db2/V9.5/lib64/libDB2xalanMsg.so
0x0000002a9c4f4b60 0x0000002a9c6e2d48 No /opt/ibm/db2/V9.5/lib64/libDB2xml4c.so.56
0x0000002a9cac1ca0 0x0000002a9caf0488 No /opt/ibm/db2/V9.5/lib64/libDB2xml4c-depdom.so.56
0x0000002a9cf49210 0x0000002a9d2fdec8 No /opt/ibm/db2/V9.5/lib64/libDB2xslt4c.so.110
0x0000002a9d6d46b0 0x0000002a9d6e6ce8 No /opt/ibm/db2/V9.5/lib64/libdb2dascmn.so.1
0x0000002a9d904f20 0x0000002a9d9946c8 No /opt/ibm/db2/V9.5/lib64/libdb2dstf.so.1
0x0000002a9dbe6ef0 0x0000002a9dc27308 No /opt/ibm/db2/V9.5/lib64/libdb2g11n.so.1
0x0000002a9e4500b0 0x0000002a9e477cd8 No /opt/ibm/db2/V9.5/lib64/libdb2genreg.so.1
0x0000002a9e69eca0 0x0000002a9e69f7b8 No /opt/ibm/db2/V9.5/lib64/libdb2locale.so.1
0x0000002a9e8c7110 0x0000002a9e900be8 No /opt/ibm/db2/V9.5/lib64/libdb2osse_db2.so.1
0x0000002a9eb324d0 0x0000002a9eb37b88 No /opt/ibm/db2/V9.5/lib64/libdb2sdbin.so.1
0x0000002a9ed61160 0x0000002a9ed68738 No /opt/ibm/db2/V9.5/lib64/libdb2trcapi.so.1
No /opt/ibm/db2/V9.5/lib64/libicudatadb2.so.38
0x0000002a9fd097d0 0x0000002a9fdf23b8 No /opt/ibm/db2/V9.5/lib64/libicui18ndb2.so.38
0x0000002aa00b7ab0 0x0000002aa00bd9e8 No /opt/ibm/db2/V9.5/lib64/libicuiodb2.so.38
0x0000002aa02e4e70 0x0000002aa02f8d38 No /opt/ibm/db2/V9.5/lib64/libiculedb2.so.38
0x0000002aa052b710 0x0000002aa052f4e8 No /opt/ibm/db2/V9.5/lib64/libiculxdb2.so.38
0x0000002aa07ad790 0x0000002aa085e018 No /opt/ibm/db2/V9.5/lib64/libicuucdb2.so.38
0x0000003660100a80 0x0000003660110d27 No /lib64/ld-linux-x86-64.so.2
(gdb) disas 0x0000002a9bb41000
No function contains specified address.
(gdb)
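What can we see? A SIGFPE during run crashed the process, but gdb cannot resolve any frame on the stack to a function (every frame shows ??)... and frame #5 is 0x0000000000000000, which makes no sense at all ^_^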
Could it be stack corruption? If so, something inside the program must be causing it... so let's set a breakpoint at the entry to main() and see:
(gdb) break main
Breakpoint 1 at 0x41c3dc
(gdb) run
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/db2inst2/sqllib/adm/db2pd
Program received signal SIGFPE, Arithmetic exception.
0x0000003660107927 in ?? ()
(gdb)
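See that? It died before even reaching main()!!
In other words, the problem is certainly not in db2pd's own code...
If it is not in db2pd, then it must be either in a library or in the kernel, right?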
Let's see which libraries db2pd needs:
ldd ~/sqllib/adm/db2pd
libdb2e.so.1 => /opt/ibm/db2/V9.5/lib64/libdb2e.so.1 (0x0000002a95558000)
libicudatadb2.so.32 => /opt/ibm/db2/V9.5/lib64/libicudatadb2.so.32 (0x0000002a99ff3000)
libicui18ndb2.so.32 => /opt/ibm/db2/V9.5/lib64/libicui18ndb2.so.32 (0x0000002a9aa47000)
libicuiodb2.so.32 => /opt/ibm/db2/V9.5/lib64/libicuiodb2.so.32 (0x0000002a9ad0f000)
libiculxdb2.so.32 => /opt/ibm/db2/V9.5/lib64/libiculxdb2.so.32 (0x0000002a9ae1e000)
libicuucdb2.so.32 => /opt/ibm/db2/V9.5/lib64/libicuucdb2.so.32 (0x0000002a9af2d000)
libiculedb2.so.32 => /opt/ibm/db2/V9.5/lib64/libiculedb2.so.32 (0x0000002a9b1ad000)
libdb2install.so.1 => /opt/ibm/db2/V9.5/lib64/libdb2install.so.1 (0x0000002a9b2fe000)
libdb2osse.so.1 => /opt/ibm/db2/V9.5/lib64/libdb2osse.so.1 (0x0000002a9b509000)
libm.so.6 => /lib64/tls/libm.so.6 (0x0000003660600000)
libstdc++.so.5 => /usr/lib64/libstdc++.so.5 (0x0000002a9bb42000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000003664b00000)
libc.so.6 => /lib64/tls/libc.so.6 (0x0000003660300000)
libdl.so.2 => /lib64/libdl.so.2 (0x0000003660800000)
libaio.so.1 => /usr/lib64/libaio.so.1 (0x0000002a9bd1f000)
libcrypt.so.1 => /lib64/libcrypt.so.1 (0x0000003667700000)
libpthread.so.0 => /lib64/tls/libpthread.so.0 (0x0000003661400000)
librt.so.1 => /lib64/tls/librt.so.1 (0x0000003666300000)
libDB2XML4CMessages.so => /opt/ibm/db2/V9.5/lib64/libDB2XML4CMessages.so (0x0000002a9bf21000)
libDB2xalanMsg.so => /opt/ibm/db2/V9.5/lib64/libDB2xalanMsg.so (0x0000002a9c13c000)
libDB2xml4c.so.56 => /opt/ibm/db2/V9.5/lib64/libDB2xml4c.so.56 (0x0000002a9c342000)
libDB2xml4c-depdom.so.56 => /opt/ibm/db2/V9.5/lib64/libDB2xml4c-depdom.so.56 (0x0000002a9ca7b000)
libDB2xslt4c.so.110 => /opt/ibm/db2/V9.5/lib64/libDB2xslt4c.so.110 (0x0000002a9cd2d000)
libdb2dascmn.so.1 => /opt/ibm/db2/V9.5/lib64/libdb2dascmn.so.1 (0x0000002a9d6c8000)
libdb2dstf.so.1 => /opt/ibm/db2/V9.5/lib64/libdb2dstf.so.1 (0x0000002a9d8f5000)
libdb2g11n.so.1 => /opt/ibm/db2/V9.5/lib64/libdb2g11n.so.1 (0x0000002a9dbc0000)
libdb2genreg.so.1 => /opt/ibm/db2/V9.5/lib64/libdb2genreg.so.1 (0x0000002a9e444000)
libdb2locale.so.1 => /opt/ibm/db2/V9.5/lib64/libdb2locale.so.1 (0x0000002a9e68a000)
libdb2osse_db2.so.1 => /opt/ibm/db2/V9.5/lib64/libdb2osse_db2.so.1 (0x0000002a9e8ae000)
libdb2sdbin.so.1 => /opt/ibm/db2/V9.5/lib64/libdb2sdbin.so.1 (0x0000002a9eb25000)
libdb2trcapi.so.1 => /opt/ibm/db2/V9.5/lib64/libdb2trcapi.so.1 (0x0000002a9ed59000)
libicudatadb2.so.38 => /opt/ibm/db2/V9.5/lib64/libicudatadb2.so.38 (0x0000002a9ef6e000)
libicui18ndb2.so.38 => /opt/ibm/db2/V9.5/lib64/libicui18ndb2.so.38 (0x0000002a9fc44000)
libicuiodb2.so.38 => /opt/ibm/db2/V9.5/lib64/libicuiodb2.so.38 (0x0000002aa00b3000)
libiculedb2.so.38 => /opt/ibm/db2/V9.5/lib64/libiculedb2.so.38 (0x0000002aa02c4000)
libiculxdb2.so.38 => /opt/ibm/db2/V9.5/lib64/libiculxdb2.so.38 (0x0000002aa0523000)
libicuucdb2.so.38 => /opt/ibm/db2/V9.5/lib64/libicuucdb2.so.38 (0x0000002aa0735000)
/lib64/ld-linux-x86-64.so.2 (0x000000552aaaa000)
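Heh, quite a few...
How do we tell whether a library is at fault? We know that after a process is forked, the system (more precisely the dynamic loader, ld-linux) loads the libraries it needs. Roughly:
child : spawn----->setup signal mask ---> execve()
and execve covers both loading the libraries and the call to main().
We also know that loading a library does not actually call the ordinary functions inside it (only its initializers, if it has any, are run). So we can roughly simulate the library loading that happens between the creation of the child process and the call to main() by using dlopen()...
Let's write a tiny program: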
#include <dlfcn.h>
#include <stdio.h>

/* Try to load the shared library named on the command line. */
int main(int argc, char **argv)
{
    void *lib;

    if (argc != 2) {
        printf("Syntax: %s <Lib name>\n", argv[0]);
        return 1;
    }

    /* RTLD_NOW: resolve all undefined symbols before dlopen() returns. */
    lib = dlopen(argv[1], RTLD_LOCAL | RTLD_NOW);
    if (lib == NULL) {
        printf("Failed to load library: %s\n", dlerror());
        return 1;
    }

    printf("Successfully load library\n");
    dlclose(lib);
    return 0;
}
Save it as test.c and compile it with cc test.c -ldl
Then run it against each of the libraries in use, one at a time...
……
……
……
BIG SURPRISE!!!
./a.out /opt/ibm/db2/V9.5/lib64/libdb2e.so.1
Floating point exception
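Loading one of the libraries produced the very same error!!
Ha, and which library is it? libdb2e.so.1
Can we say that this is the broken library? No!!
Because before loading this library the system first checks its dependencies and loads every library it depends on...
So let's go on and see which libraries it needs: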
ldd /opt/ibm/db2/V9.5/lib64/libdb2e.so.1
libaio.so.1 => /usr/lib64/libaio.so.1 (0x0000002a99ff3000)
libcrypt.so.1 => /lib64/libcrypt.so.1 (0x0000002a9a213000)
libdl.so.2 => /lib64/libdl.so.2 (0x0000002a9a348000)
libpthread.so.0 => /lib64/tls/libpthread.so.0 (0x0000002a9a44b000)
librt.so.1 => /lib64/tls/librt.so.1 (0x0000002a9a560000)
libm.so.6 => /lib64/tls/libm.so.6 (0x0000002a9a67b000)
libDB2XML4CMessages.so => /opt/ibm/db2/V9.5/lib64/libDB2XML4CMessages.so (0x0000002a9a801000)
libDB2xalanMsg.so => /opt/ibm/db2/V9.5/lib64/libDB2xalanMsg.so (0x0000002a9aa1b000)
libDB2xml4c.so.56 => /opt/ibm/db2/V9.5/lib64/libDB2xml4c.so.56 (0x0000002a9ac22000)
libDB2xml4c-depdom.so.56 => /opt/ibm/db2/V9.5/lib64/libDB2xml4c-depdom.so.56 (0x0000002a9b35b000)
libDB2xslt4c.so.110 => /opt/ibm/db2/V9.5/lib64/libDB2xslt4c.so.110 (0x0000002a9b60c000)
libdb2dascmn.so.1 => /opt/ibm/db2/V9.5/lib64/libdb2dascmn.so.1 (0x0000002a9bfa8000)
libdb2dstf.so.1 => /opt/ibm/db2/V9.5/lib64/libdb2dstf.so.1 (0x0000002a9c1d5000)
libdb2g11n.so.1 => /opt/ibm/db2/V9.5/lib64/libdb2g11n.so.1 (0x0000002a9c49f000)
libdb2genreg.so.1 => /opt/ibm/db2/V9.5/lib64/libdb2genreg.so.1 (0x0000002a9cd24000)
libdb2install.so.1 => /opt/ibm/db2/V9.5/lib64/libdb2install.so.1 (0x0000002a9cf6a000)
libdb2locale.so.1 => /opt/ibm/db2/V9.5/lib64/libdb2locale.so.1 (0x0000002a9d174000)
libdb2osse.so.1 => /opt/ibm/db2/V9.5/lib64/libdb2osse.so.1 (0x0000002a9d398000)
libdb2osse_db2.so.1 => /opt/ibm/db2/V9.5/lib64/libdb2osse_db2.so.1 (0x0000002a9d9b2000)
libdb2sdbin.so.1 => /opt/ibm/db2/V9.5/lib64/libdb2sdbin.so.1 (0x0000002a9dc29000)
libdb2trcapi.so.1 => /opt/ibm/db2/V9.5/lib64/libdb2trcapi.so.1 (0x0000002a9de5e000)
libicudatadb2.so.32 => /opt/ibm/db2/V9.5/lib64/libicudatadb2.so.32 (0x0000002a9e072000)
libicudatadb2.so.38 => /opt/ibm/db2/V9.5/lib64/libicudatadb2.so.38 (0x0000002a9eac5000)
libicui18ndb2.so.32 => /opt/ibm/db2/V9.5/lib64/libicui18ndb2.so.32 (0x0000002a9f79c000)
libicui18ndb2.so.38 => /opt/ibm/db2/V9.5/lib64/libicui18ndb2.so.38 (0x0000002a9fa64000)
libicuiodb2.so.32 => /opt/ibm/db2/V9.5/lib64/libicuiodb2.so.32 (0x0000002a9fed3000)
libicuiodb2.so.38 => /opt/ibm/db2/V9.5/lib64/libicuiodb2.so.38 (0x0000002a9ffe3000)
libiculedb2.so.32 => /opt/ibm/db2/V9.5/lib64/libiculedb2.so.32 (0x0000002aa01f3000)
libiculedb2.so.38 => /opt/ibm/db2/V9.5/lib64/libiculedb2.so.38 (0x0000002aa0344000)
libiculxdb2.so.32 => /opt/ibm/db2/V9.5/lib64/libiculxdb2.so.32 (0x0000002aa05a4000)
libiculxdb2.so.38 => /opt/ibm/db2/V9.5/lib64/libiculxdb2.so.38 (0x0000002aa06b2000)
libicuucdb2.so.32 => /opt/ibm/db2/V9.5/lib64/libicuucdb2.so.32 (0x0000002aa08c4000)
libicuucdb2.so.38 => /opt/ibm/db2/V9.5/lib64/libicuucdb2.so.38 (0x0000002aa0b45000)
libstdc++.so.5 => /usr/lib64/libstdc++.so.5 (0x0000002aa0f33000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000002aa110e000)
libc.so.6 => /lib64/tls/libc.so.6 (0x0000002aa121c000)
/lib64/ld-linux-x86-64.so.2 (0x000000552aaaa000)
Don't shy away from the tedium; keep going one by one...
ANOTHER BIG SURPRISE!!!
./a.out /usr/lib64/libaio.so.1
Floating point exception
Continue recursively and ldd this library:
ldd /usr/lib64/libaio.so.1
statically linked
Heh, two levels down and we have hit bottom...
Which suggests the corruption is in this very library :)
You will probably ask: why was there no problem with v91?
ldd libdb2e.so.1
libirc.so => /wsdb/db2_v91fp2/linuxamd64nocc/s070404/INST/lib/libirc.so (0x0000002a992b0000)
libcrypt.so.1 => /lib64/libcrypt.so.1 (0x0000002a99402000)
libdl.so.2 => /lib64/libdl.so.2 (0x0000002a9953b000)
libpthread.so.0 => /lib64/tls/libpthread.so.0 (0x0000002a9963e000)
librt.so.1 => /lib64/tls/librt.so.1 (0x0000002a99752000)
libDB2xml4c.so.56 => /wsdb/db2_v91fp2/linuxamd64nocc/s070404/INST/lib/libDB2xml4c.so.56 (0x0000002a9985b000)
libDB2xml4c-depdom.so.56 => /wsdb/db2_v91fp2/linuxamd64nocc/s070404/INST/lib/libDB2xml4c-depdom.so.56 (0x0000002a99f1d000)
libDB2XML4CMessages.so => /wsdb/db2_v91fp2/linuxamd64nocc/s070404/INST/lib/libDB2XML4CMessages.so (0x0000002a9a0d9000)
libicudatadb2.so.32 => /wsdb/db2_v91fp2/linuxamd64nocc/s070404/INST/lib/libicudatadb2.so.32 (0x0000002a9a1f4000)
libicui18ndb2.so.32 => /wsdb/db2_v91fp2/linuxamd64nocc/s070404/INST/lib/libicui18ndb2.so.32 (0x0000002a9ac47000)
libicuiodb2.so.32 => /wsdb/db2_v91fp2/linuxamd64nocc/s070404/INST/lib/libicuiodb2.so.32 (0x0000002a9af81000)
libiculxdb2.so.32 => /wsdb/db2_v91fp2/linuxamd64nocc/s070404/INST/lib/libiculxdb2.so.32 (0x0000002a9b093000)
libicuucdb2.so.32 => /wsdb/db2_v91fp2/linuxamd64nocc/s070404/INST/lib/libicuucdb2.so.32 (0x0000002a9b1a4000)
libiculedb2.so.32 => /wsdb/db2_v91fp2/linuxamd64nocc/s070404/INST/lib/libiculedb2.so.32 (0x0000002a9b462000)
libdb2install.so.1 => /wsdb/db2_v91fp2/linuxamd64nocc/s070404/INST/lib/libdb2install.so.1 (0x0000002a9b5ca000)
libdb2locale.so.1 => /wsdb/db2_v91fp2/linuxamd64nocc/s070404/INST/lib/libdb2locale.so.1 (0x0000002a9b6d1000)
libdb2g11n.so.1 => /wsdb/db2_v91fp2/linuxamd64nocc/s070404/INST/lib/libdb2g11n.so.1 (0x0000002a9b7ff000)
libdb2osse.so.1 => /wsdb/db2_v91fp2/linuxamd64nocc/s070404/INST/lib/libdb2osse.so.1 (0x0000002a9bf46000)
libdb2genreg.so.1 => /wsdb/db2_v91fp2/linuxamd64nocc/s070404/INST/lib/libdb2genreg.so.1 (0x0000002a9c3f8000)
libdb2trcapi.so.1 => /wsdb/db2_v91fp2/linuxamd64nocc/s070404/INST/lib/libdb2trcapi.so.1 (0x0000002a9c543000)
libdb2dstf.so.1 => /wsdb/db2_v91fp2/linuxamd64nocc/s070404/INST/lib/libdb2dstf.so.1 (0x0000002a9c650000)
libdb2dascmn.so.1 => /wsdb/db2_v91fp2/linuxamd64nocc/s070404/INST/lib/libdb2dascmn.so.1 (0x0000002a9c845000)
libDB2xslt4c.so.110 => /wsdb/db2_v91fp2/linuxamd64nocc/s070404/INST/lib/libDB2xslt4c.so.110 (0x0000002a9c96d000)
libDB2xalanMsg.so => /wsdb/db2_v91fp2/linuxamd64nocc/s070404/INST/lib/libDB2xalanMsg.so (0x0000002a9d33d000)
libimf.so => /wsdb/db2_v91fp2/linuxamd64nocc/s070404/INST/lib/libimf.so (0x0000002a9d443000)
libm.so.6 => /lib64/tls/libm.so.6 (0x0000002a9d6d5000)
libsvml.so => /wsdb/db2_v91fp2/linuxamd64nocc/s070404/INST/lib/libsvml.so (0x0000002a9d82d000)
libstdc++.so.5 => /usr/lib64/libstdc++.so.5 (0x0000002a9d96e000)
libc.so.6 => /lib64/tls/libc.so.6 (0x0000002a9db4b000)
/lib64/ld-linux-x86-64.so.2 => /lib64/ld-linux-x86-64.so.2 (0x000000552aaaa000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000002a9dd6d000)
libdb2osse_db2.so.1 => /wsdb/db2_v91fp2/linuxamd64nocc/s070404/INST/lib/libdb2osse_db2.so.1 (0x0000002a9de79000)
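See it now? The v91 libdb2e.so.1 does not depend on libaio.so.1 at all.
But don't celebrate just yet: there is a big hole in the reasoning above. Did anyone notice?
If not, think about it for a moment ^_^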
.
.
.
.
.
.
.
.
.
.
.
.
.
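The hole is that we have been assuming all along that the rest of the operating system is fine. Only under that assumption does the FPE in dlopen() lead to the conclusion that libaio.so.1 is corrupted.
But does that conclusion really hold up?
To check, we found another system that runs v95 without trouble, copied libaio.so.1 over, and ran the program... What happened? No problem at all...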
./test libaio.so.1
Successfully load library
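That means the problem may well not be in the library itself, but somewhere else.
Next step: can you guess what to do?
You can't? Go stand in the corner...
Of course: run our program under gdb and see exactly where the FPE occurs: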
(gdb) run libaio.so.1
Starting program: /home/db2inst2/a.out libaio.so.1
Program received signal SIGFPE, Arithmetic exception.
0x0000003660107927 in do_lookup_x () from /lib64/ld-linux-x86-64.so.2
(gdb) where
#0 0x0000003660107927 in do_lookup_x () from /lib64/ld-linux-x86-64.so.2
#1 0x0000003660107cee in _dl_lookup_symbol_x () from /lib64/ld-linux-x86-64.so.2
#2 0x00000036601090c0 in _dl_relocate_object () from /lib64/ld-linux-x86-64.so.2
#3 0x00000036603f8980 in dl_open_worker () from /lib64/tls/libc.so.6
#4 0x000000366010af00 in _dl_catch_error () from /lib64/ld-linux-x86-64.so.2
#5 0x00000036603f8fda in _dl_open () from /lib64/tls/libc.so.6
#6 0x0000003660801054 in dlopen_doit () from /lib64/libdl.so.2
#7 0x000000366010af00 in _dl_catch_error () from /lib64/ld-linux-x86-64.so.2
#8 0x0000003660801552 in _dlerror_run () from /lib64/libdl.so.2
#9 0x0000003660801092 in dlopen@@GLIBC_2.2.5 () from /lib64/libdl.so.2
#10 0x000000000040084f in main (argc=3, argv=0x7fbffff248) at testLoad.c:43
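This looks a lot more comfortable than the stack we saw at the beginning. Searching for do_lookup_x together with SIGFPE then turns up an entry in Red Hat's Bugzilla: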
https://bugzilla.redhat.com/show_bug.cgi?id=213252
Also, checking the glibc version gives:
/lib64/tls/libc.so.6
GNU C Library stable release version 2.3.4, by Roland McGrath et al.
Copyright (C) 2005 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 3.4.6 20060404 (Red Hat 3.4.6-8).
Compiled on a Linux 2.4.20 system on 2007-09-12.
Available extensions:
GNU libio by Per Bothner
crypt add-on version 2.1 by Michael Glad and others
Native POSIX Threads Library by Ulrich Drepper et al
RT using linux kernel aio
The C stubs add-on version 2.1.2.
GNU Libidn by Simon Josefsson
BIND-8.2.3-T5B
NIS(YP)/NIS+ NSS modules 0.19 by Thorsten Kukuk
Thread-local storage support included.
For bug reporting instructions, please see:
<http://www.gnu.org/software/libc/bugs.html>.
[db2inst2@aipl366 lib64]$ ldd libdb2e.so.1 | grep libc
libcrypt.so.1 => /lib64/libcrypt.so.1 (0x0000002a9a214000)
libc.so.6 => /lib64/tls/libc.so.6 (0x0000002aa121c000)
Did you notice "Compiled on a Linux 2.4.20 system on 2007-09-12"? Very suspicious...
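The conclusion: this problem has nothing to do with db2 and needs to be sorted out by the Red Hat folks. It may well be an already known bug.
I hope this case helps with understanding operating system fundamentals such as library dependency, gdb, and library mapping ^_^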