|
db2pd有一个不大常用的选项-catch:
http://publib.boulder.ibm.com/in ... 3%61%74%63%68%22%20
通过使用-catch可以捕捉到一些不会被记录到db2diag.log里面的信息。
下面给大家一个通过这种方法捕捉到locktimeout的例子。
环境:
aix5.2 db2v8 fp11
首先初始化3个窗口a,b,c
a:
cd ~
cd sqllib
cd cfg
cp db2cos ..
db2start
db2 update db cfg for sample using locktimeout 5
db2pd -catch locktimeout count 1
b:
db2 connect to sample
db2 +c "insert into t1 values (1)"
c:
db2 connect to sample
db2 +c "delete from t1"
<---等待5秒返回lock timeout
这时在db2diag.log里有:
2006-06-22-16.25.49.854001-240 I4156A274 LEVEL: Event
PID : 199164 TID : 1 PROC : db2pd
INSTANCE: taoewang NODE : 000
FUNCTION: DB2 UDB, RAS/PD component, pdErrorCatch, probe:30
START : Error catch set for ZRC -2146435004
2006-06-22-16.26.00.544312-240 I4431A1200 LEVEL: Event
PID : 163370 TID : 1 PROC : db2agent (SAMPLE)
INSTANCE: taoewang NODE : 000 DB : SAMPLE
APPHDL : 0-11 APPID: *LOCAL.taoewang.060622200428
FUNCTION: DB2 UDB, lock manager, sqlplnfd, probe:999
DATA #1 : <preformatted>
Caught rc -2146435004. Dumping stack trace.
CALLSTCK:
[0] 0x090000003318F730 pdLogPrintf + 0xB8
[1] 0x090000003324BE7C sqlpErrorCatchForPD__FP8SQLP_LRBT1Uii + 0xFFFFFFFFFFFFF
E9C
[2] 0x0900000032DDA584 sqlplnfd__FP20sqle_agent_privatecbP14SQLP_LOCK_INFOP8SQ
LP_LCBP9SQLP_LHSHP8SQLP_LRBUiiN27P8SQLP_AWBT7 + 0x1EC
[3] 0x0900000032C787FC sqlplrq__FP20sqle_agent_privatecbP14SQLP_LOCK_INFO + 0x
D8
[4] 0x0900000032D0C648 sqldReadNorm__FP13SQLD_DFM_WORKl + 0x92C
[5] 0x0900000032BCAC54 @20@sqldFetchNext__FP13SQLD_DFM_WORK + 0x28C
[6] 0x0900000032BCA8CC sqldfrd__FP13SQLD_DFM_WORK + 0xD0
[7] 0x0900000032BA37FC sqldRowFetch__FP20sqle_agent_privatecbP8SQLD_CCBUlT3PP1
0SQLD_VALUEPUiT3 + 0x38
[8] 0x0900000032BA3AA8 sqlritaSimplePerm__FP8sqlrr_cb + 0xFFFFFFFFFFFFFF1C
[9] 0x0900000032BA0A00 sqlriSectInvoke__FP8sqlrr_cbP12sqlri_opparm + 0x18
2006-06-22-16.26.00.545311-240 I5632A403 LEVEL: Event
PID : 163370 TID : 1 PROC : db2agent (SAMPLE)
INSTANCE: taoewang NODE : 000 DB : SAMPLE
APPHDL : 0-11 APPID: *LOCAL.taoewang.060622200428
FUNCTION: DB2 UDB, trace services, pdInvokeCalloutScript, probe:10
START : Invoking sqllib/db2cos script from lock manager sqlplnfd
2006-06-22-16.26.02.807529-240 I6036A386 LEVEL: Event
PID : 163370 TID : 1 PROC : db2agent (SAMPLE)
INSTANCE: taoewang NODE : 000 DB : SAMPLE
APPHDL : 0-11 APPID: *LOCAL.taoewang.060622200428
FUNCTION: DB2 UDB, trace services, pdInvokeCalloutScript, probe:20
STOP : Completed invoking sqllib/db2cos script
打开db2cos.rpt可以见到当时的lock信息
Lock Timeout Caught
Thu Jun 22 16:26:00 EDT 2006
Instance taoewang
Datbase: SAMPLE
Partition Number: 0
PID: 163370
TID: 1
Function: sqlplnfd
Component: lock manager
Probe: 999
Timestamp: 2006-06-22-16.26.00.545246
AppID: *LOCAL.taoewang.060622200428
AppHdl:
Instance taoewang uses 64 bits and DB2 code release SQL08024
with level identifier 03050106
Informational tokens are DB2 v8.1.1.104, s060120, U805924, FixPak 11.
Operating System Information:
OSName: AIX
NodeName: panipuri
Version: 5
Release: 2
Machine: 0009028F4C00
CPU Information:
TotalCPU OnlineCPU ConfigCPU Speed(MHz) HMTDegree Cores/Socket
4 4 4 375 1 n/a
Physical Memory and Swap (Megabytes):
TotalMem FreeMem AvailMem TotalSwap FreeSwap
4096 47 n/a 28704 20837
Virtual Memory (Megabytes):
Total Reserved Available Free
32800 n/a n/a 20883
Message Queue Information:
MsgSeg MsgMax MsgMap MsgMni MsgTql MsgMnb MsgSsz
n/a 4194304 n/a n/a n/a 4194304 n/a
Shared Memory Information:
ShmMax ShmMin ShmIds ShmSeg
68719476736 1 131072 0
Semaphore Information:
SemMap SemMni SemMns SemMnu SemMsl SemOpm SemUme SemUsz SemVmx SemAem
n/a 131072 n/a n/a 65535 1024 n/a n/a 32767 16384
CPU Load Information:
Short Medium Long
1.398895 1.659851 1.883804
Database Partition 0 -- Database SAMPLE -- Active -- Up 0 days 00:21:52
Applications:[/COLOR]
Address AppHandl [nod-index] NumAgents CoorPid Status C-AnchID C-StmtUID L-AnchID L-StmtUID Appid
0x078000000018D5E0 12 [000-00012] 1 504070 UOW-Waiting 0 0 122 1 *LOCAL.taoewang.060622201324
0x078000000011FB40 11 [000-00011] 1 163370 UOW-Executing 114 1 114 1 *LOCAL.taoewang.060622200428
0x078000000011DB80 10 [000-00010] 1 291350 UOW-Waiting 0 0 122 1 *LOCAL.taoewang.060622200409
Database Partition 0 -- Database SAMPLE -- Active -- Up 0 days 00:21:52
Transactions:[/COLOR]
Address AppHandl [nod-index] TranHdl Locks State Tflag Tflag2 Firstlsn Lastlsn LogSpace SpaceReserved TID AxRegCnt GXID
0x0780000020251980 10 [000-00010] 2 3 WRITE 0x00000000 0x00000000 0x000000FBC103 0x000000FBC103 108 159 0x0000000004C7 1 0
0x0780000020252600 11 [000-00011] 3 4 READ 0x00000000 0x00000000 0x000000000000 0x000000000000 0 0 0x0000000004CD 1 0
0x0780000020253280 12 [000-00012] 4 6 WRITE 0x00000000 0x00000000 0x000000FBC136 0x00000107B0F7 192 345 0x0000000004CB 1 0
Database Partition 0 -- Database SAMPLE -- Active -- Up 0 days 00:21:52
BufferPools:
First Active Pool ID 1
Max Bufferpool ID 1
Max Bufferpool ID on Disk 1
Num Bufferpools 5
Address Id Name PageSz PA-NumPgs BA-NumPgs BlkSize ES NumTbsp PgsLeft CurrentSz PostAlter SuspndTSCt
0x07800000203703E0 1 IBMDEFAULTBP 4096 1000 0 0 N 5 0 1000 1000 0
0x07800000203670C0 4096 IBMHIDDENBP4K 4096 16 0 0 N 0 0 16 16 0
0x0780000020367460 4097 IBMHIDDENBP8K 8192 16 0 0 N 0 0 16 16 0
0x0780000020367800 4098 IBMHIDDENBP16K 16384 16 0 0 N 0 0 16 16 0
0x0780000020367BA0 4099 IBMHIDDENBP32K 32768 16 0 0 N 0 0 16 16 0
Database Partition 0 -- Database SAMPLE -- Active -- Up 0 days 00:21:52
Logs:
Current Log Number 0
Pages Written 219
Method 1 Archive Status n/a
Method 1 Next Log to Archive n/a
Method 1 First Failure n/a
Method 2 Archive Status n/a
Method 2 Next Log to Archive n/a
Method 2 First Failure n/a
Address StartLSN State Size Pages Filename
0x0780000020008118 0x000000FA0000 0x00000000 1000 1000 S0000000.LOG
0x07800000200081B8 0x000001388000 0x00000000 1000 1000 S0000001.LOG
0x0780000020008258 0x000001770000 0x00000000 1000 1000 S0000002.LOG
Database Partition 0 -- Database SAMPLE -- Active -- Up 0 days 00:21:52
Locks:
Address TranHdl Lockname Type Mode Sts Owner Dur HldCnt Att ReleaseFlg
0x07800000202D9180 4 0003000C000000070000000052 Row ..X G 4 1 0 0x0008 0x40000000
0x07800000202D95C0 2 53514C4332453037DF495EFE41 Internal P ..S G 2 1 0 0x0000 0x40000000
0x07800000202D8E40 4 53514C4332453037DF495EFE41 Internal P ..S G 4 1 0 0x0000 0x40000000
0x07800000202D96C0 3 53514C4332453037DF495EFE41 Internal P ..S G 3 1 0 0x0000 0x40000000
0x07800000202D99C0 4 53514C4445464C540763DD2841 Internal P ..S G 4 1 0 0x0000 0x40000000
0x07800000202D9000 4 0003000C000000060000000052 Row ..X G 4 1 0 0x0008 0x40000000
0x07800000202D8CC0 3 00000001000000010001720056 Internal V ..S G 3 1 0 0x0000 0x40000000
0x07800000202D9680 4 0003000C000000050000000052 Row ..X G 4 1 0 0x0008 0x40000000
0x07800000202D9240 2 0003000C000000040000000052 Row ..X G 2 1 0 0x0008 0x40000000
0x07800000202D8740 3 0003000C000000040000000052 Row ..X W* 3 1 0 0x0000 0x40000000 [/COLOR]
0x07800000202D8BC0 2 0003000C000000000000000054 Table .IX G 2 1 0 0x0000 0x40000000
0x07800000202D8B40 4 0003000C000000000000000054 Table .IX G 4 1 0 0x0000 0x40000000
0x07800000202D8C40 3 0003000C000000000000000054 Table .IX G 3 1 0 0x0000 0x40000000
首先我们看到有3个application,3个transaction,然后直接跳到locklist那里找到W*,这个就是被locktimeout的。
看到他的lockname=0003000C000000040000000052,然后还有一个lock有同样的名字:
0x07800000202D9240 2 0003000C000000040000000052 Row ..X G 2 1 0 0x0008 0x40000000
看到transaction id=2
0x0780000020251980 10 [000-00010] 2 3 WRITE 0x00000000 0x00000000 0x000000FBC103 0x000000FBC103 108
application id=10:
0x078000000011DB80 10 [000-00010] 1 291350 UOW-Waiting 0 0 122 1 *LOCAL.taoewang.060622200409
对应的L-AnchID=122 L-StmtUID=1
就可以找到相应的dynamic statement:
0x07800000209EE5C0 122 1 1 1 4 4 insert into t1 values (1)
这个就是lock的holder
同理
0x078000000011FB40 11 [000-00011] 1 163370 UOW-Executing 114 1 114 1 *LOCAL.taoewang.060622200428
所定应的114-1就是
0x07800000209EF440 114 1 1 1 4 4 delete from t1
这个就是被locktimeout的东西。
hope it helps |
|