非常急 Oracle auto restart (ORA-00600) 望高手指点

ihekoko · 发表于 2007-3-5 11:06

急
遇到ORA-00600错误

Sun Mar 04 09:13:29 2007
Errors in file g:\admin\cmsdb\udump\cmsdb_ora_4056.trc:
ORA-00600: internal error code, arguments: [17059], [0x2EC6F18C], [], [], [], [], [], []

然后就是DB 自动重起

cmsdb_ora_4056.trc见附件

去google了一把
确实有这个情况不过那是9202 上的BUG
9203就修复了
而我这里是9208 郁闷

哪位高手帮帮忙啊

dreamback · 发表于 2007-3-5 11:13

关注

oldunix · 发表于 2007-3-5 11:29

所以说嘛:
oracle版本不在高,稳定就行!

oldunix · 发表于 2007-3-5 11:32

等9209

ihekoko · 发表于 2007-3-5 11:45

metalink 上有这样一篇文章:

Subject: ORA-600 [17059]
    Doc ID: Note:138554.1 Type: REFERENCE
    Last Revision Date: 11-OCT-2006 Status: PUBLISHED

Note: For additional ORA-600 related information please read Note 146580.1

PURPOSE:
  This article discusses the internal error "ORA-600 [17059]", what
  it means and possible actions. The information here is only applicable
  to the versions listed and is provided only for guidance.

ERROR:
  ORA-600 [17059] [a]

VERSIONS:
  versions 7.1 to 10.1

DESCRIPTION:

  While building a table to hold the list of child cursor dependencies
  relating to a given parent cursor, we exceed the maximum possible size
  of the table.

ARGUMENTS:
  Arg [a] Object containing the table

FUNCTIONALITY:
  Kernel Generic Library cache manager

IMPACT:
  PROCESS FAILURE
  NON CORRUPTIVE - No underlying data corruption.

SUGGESTIONS:

  One symptom of this error is that the session will appear to hang for a
  period of time prior to this error being reported.

  If the Known Issues section below does not help in terms of identifying
  a solution, please submit the trace files and alert.log to Oracle
  Support Services for further analysis.

  Issuing this SQL as SYS (SYSDBA) may help show any problem
  objects in the dictionary:

select do.obj#,
      po.obj# ,
      p_timestamp,
      po.stime ,
      decode(sign(po.stime-p_timestamp),0,'SAME','*DIFFER*') X
from sys.obj$ do, sys.dependency$ d,  sys.obj$ po
where P_OBJ#=po.obj#(+)
   and D_OBJ#=do.obj#
   and do.status=1 /*dependent is valid*/
   and po.status=1 /*parent is valid*/
   and po.stime!=p_timestamp /*parent timestamp not match*/
order by 2,1
;

  Normally the above select would return no rows. If any rows are
  returned the listed dependent objects may need recompiling.

  Known Issues:

  Bug# 4559728 See Note 4559728.8
   Select lob from view loops / OERI [17059] if synonym dropped
   Fixed: 9.2.0.8, 10.1.0.6, 11

  Bug# 4203874 See Note 4203874.8
   Dictionary corruption after DROP partition (can cause OERI:17059)
   Fixed: 9.2.0.8, 10.1.0.5, 10.2.0.1

  Bug# 3555003 See Note 3555003.8
   View compilation hangs / OERI:17059 after DBMS_APPLY_ADM.SET_DML_HANDLER
   Fixed: 9.2.0.6

  Bug# 2707304 See Note 2707304.8
   OERI:17059 / OERI:kqlupd2 / PLS-907 after adding partitions to Partitioned IOT
   Fixed: 9.2.0.3, 10.1.0.2

  Bug# 2636685 See Note 2636685.8
   Hang / OERI:[17059] after adding a list value to a partition
   Fixed: 9.2.0.3, 10.1.0.2

  Bug# 2626347 See Note 2626347.8
   OERI:17059 accessing view after ADD / SPLIT PARTITION
   Fixed: 9.2.0.3, 10.1.0.2

  Bug# 2306331 See Note 2306331.8
   Hang / OERI[17059] on view after SET_KEY or SET_DML_INVOKATION on base table
   Fixed: 9.2.0.2

  Bug# 1115424 See Note 1115424.8
   Cursor authorization and dependency lists too long - can impact shared pool / OERI:17059
   Fixed: 8.0.6.2, 8.1.6.2, 8.1.7.0

  Bug# 631335 See Note 631335.8
   OERI:17059 from extensive re-user of a cursor
   Fixed: 8.0.4.2, 8.0.5.0, 8.1.5.0

  Bug# 558160 See Note 558160.8
   OERI:17059 from granting privileges multiple times
   Fixed: 8.0.3.2, 8.0.4.0, 8.1.5.0

按照他们的说法,上面提到的所有的ora-600 错误都不应该在9208这个版本出现啊

ihekoko · 发表于 2007-3-5 12:08

还有一个这样的BUG居然没在9208中fix掉

Subject: Bug 4297280 - OERI[17059] can occur if over 32767 schemas reference an object
    Doc ID: Note:4297280.8 Type: PATCH
    Last Revision Date: 24-NOV-2005 Status: PUBLISHED
Click here for details of sections in this note.
Bug 4297280  OERI[17059] can occur if over 32767 schemas reference an object
This note gives a brief overview of bug 4297280.
Affects:

Product (Component) Oracle Server (Rdbms)
Range of versions believed to be affected Versions < 11
Versions confirmed as being affected

      * 9.2.0.6
      * 10.2.0.1

Platforms affected Generic (all / most platforms affected)

Fixed:

This issue is fixed in

      * 10.2.0.2 (Server Patch Set)
      * 11g (Future version)

Symptoms:

Related To:

* Internal Error May Occur (ORA-600)

* (None Specified)

Description

ORA-600[17059] can occur if more than 32767 schemas reference
an object.

Workaround:
   Flush the shared pool

难道是这个没理由啊 32767个schema 啊怎么可能?
哪位高手给点建议啊

ihekoko · 发表于 2007-3-5 12:30

自己再顶下
哪位大哥有过这样的经历啊介绍介绍

ihekoko · 发表于 2007-3-5 13:32

看了几篇metalink的文章总算有点收获了

在我上传的trace file中开头是这样一段

*** 2007-03-04 09:10:49.060
*** SESSION ID

86.57476) 2007-03-04 09:10:48.920
LIBRARY OBJECT HANDLE: handle=2ecfcca8

name=INSERT INTO T_ETYLOGFIELD (LOGID, NAME, LOGVALUE, PREVALUE, CHANGETIME) VALUES (:B5, :B4, :B3, :B2, :B1)
[/COLOR]
hash=34912c92 timestamp=02-09-2007 15:24:33
namespace=CRSR flags=RON/KGHP/TIM/OBS/PN0/MED/[50410000]
kkkk-dddd-llll=0000-0001-0001 lock=N pin=S latch#=1
lwt=2ECFCCC0[2ECFCCC0,2ECFCCC0] ltm=2ECFCCC8[2ECFCCC8,2ECFCCC8]
pwt=2ECFCCD8[2ECFCCD8,2ECFCCD8] ptm=2ECFCD30[2ECFCD30,2ECFCD30]
ref=2ECFCCB0[2ECFCCB0, 2ECFCCB0] lnd=2ECFCD3C[2ECC00A0,2ECFD12C]
  LOCK OWNERS:
   lock    user  session count mode flags
  -------- -------- -------- ----- ---- ------------------------
  2d3aec38 2bbd95d8 2bbd95d8    1 N [00]
  3b8eec70 2bc4f558 2bc4f558    1 N [00]
  2d2479e4 2bbeb0f8 2bbeb0f8    1 N [00]
  2d5ab4f0 2bc05038 2bc05038    1 N [00]
  2d39ddfc 2bbf3518 2bbf3518    1 N [00]
  3b8a2120 2bc1ef78 2bc1ef78    1 N [00]
  2d2be10c 2bc4b348 2bc4b348    1 N [00]
  2d2d0b50 2bc35618 2bc35618    1 N [00]
  3b8eeb98 2bc132b8 2bc132b8    1 N [00]
  2d38b6c4 2bc3b478 2bc3b478    1 N [00]
  2d5cf910 2bc52488 2bc52488    1 N [00]
  2cfab29c 2bc13c28 2bc13c28    1 N [00]
  2d4ba2a4 2bc0a528 2bc0a528    1 N [00]
  2d46773c 2bbfe868 2bbfe868    1 N [00]
  2d3d9550 2bc12948 2bc12948    1 N [00]
  2d536750 2bc2a2c8 2bc2a2c8    1 N [00]
  3b8fe498 2bc41c48 2bc41c48    1 N [00]
  3b8eeac0 2bc2e4d8 2bc2e4d8    1 N [00]
  36ef61f8 2bc3fff8 2bc3fff8    1 N [00]
  3b8fde68 2bbf8098 2bbf8098    1 N PNC/[04]
  PIN OWNERS:
   pin    user  session    lock count mode mask
  -------- -------- -------- -------- ----- ---- ----
  3b8c71f8 2bbf8098 2bbf8098 3b8fde68    1 S 0001

  LIBRARY OBJECT: object=2ec6f18c
  type=CRSR flags=EXS[0001] pflags= [00] status=VALD load=0
  CHILDREN: size=32768
[/COLOR]

注意红色的部分可以注意到
2ec6f18c这个 library object 的children object
已经达到32768 超过了最大的 327767
这样就导致了ora-00600的产生而这个object 应该就是上面那个insert sql

结合metalink上的2篇文章
估计原因就是在同一个sql中采用了绑定变量但是由于绑定变量定义的最大长度不同造成了oralce 用原来的sql 的 bind cache 去存储新的绑定变量时无法存储于是将新的sql进行了硬解析这样造成了每个相同的sql都存在很多个相同版本在library cache中仅仅因为他们绑定的变量长度定义的不同当达到上限32767时就会报ora-00600

联系metalink 上的这2篇文章
Subject: Many Child Cursors create in 10.2.0.3 on Windows and Linux
    Doc ID: Note:403616.1 Type: ALERT
    Last Revision Date: 26-FEB-2007 Status: PUBLISHED

In this Document
  Description
  Likelihood of Occurrence
  Possible Symptoms
  Workaround or Resolution
  Patches
  Modification History

Applies to:
Oracle Server - Enterprise Edition - Version: 10.2.0.3 to 10.2.0.3
Microsoft Windows 2000
Microsoft Windows Server 2003
Linux x86
This problem is introduced by the Oracle Server 10.2.0.3 Patch Set on the 32bit Windows and 32bit Linux platforms only.

The problem also occurs on Windows 10.2.0.2 patch bundle #6, or in any 10.2.0.2 release where a one-off fix for Bug 4752541 has been applied.
Description
Many child cursors may be created for SQL statements using bind variables which have different maximum bind lengths when the same SQL was shared in previous releases. This can lead to shared pool concurrency problems such as latch contention, memory problems such as ORA-4031 or internal errors such as ORA-600 [17059].
Likelihood of Occurrence
In order to encounter this problem the database must:

* Be running 10.2.0.3 Oracle 32bit on Linux or Windows platforms
   (or running an earlier release with the fix for Bug 4752541 installed)
* Have SQL issued against it which uses bind variables
   Bug 4752541 installed)
* Have SQL issued against it which uses bind variables
* The bind variables can have different maximum lengths specified by the client/s

This is true of many applications and so the chances of hitting this problem are quite high in OLTP type environments. Warehouse environments which use mainly literal SQL are less likely to be impacted. The problem is most obvious for SQL statements using a large number of binds whose lengths can differ.
Possible Symptoms
Symptoms of this problem include:

* High VERSION_COUNT in V$SQL for cursors using bind variables
* V$SQL_SHARED_CURSOR shows that the reason for the mismatch is "BIND_MISMATCH"
   (ie: The BIND_MISMATCH column shows as "Y" for the child cursors)

For statements with many binds the lack of sharing can result in problems such as:

* Shared pool / library cache latch contention
* ORA-4031 errors
* In extreme cases ORA-600 [17059] may occur

As an example consider the following SQL / PLSQL code:
create table tst (
a varchar2(10),
b varchar2(10),
c varchar2(10),
d varchar2(10)
);
alter system flush shared_pool;
declare
  a0 varchar2(10) := 'x'; a1 varchar2(10) := 'x';
  a2 varchar2(10) := 'x'; a3 varchar2(10) := 'x';
  b0 varchar2(40) := 'x'; b1 varchar2(40) := 'x';
  b2 varchar2(40) := 'x'; b3 varchar2(40) := 'x';
begin
  insert into tst values(a0,a1,a2,a3);
  insert into tst values(a0,a1,a2,b3);
  insert into tst values(a0,a1,b2,a3);
  insert into tst values(a0,a1,b2,b3);
  insert into tst values(a0,b1,a2,a3);
  insert into tst values(a0,b1,a2,b3);
  insert into tst values(a0,b1,b2,a3);
  insert into tst values(a0,b1,b2,b3);
  insert into tst values(b0,a1,a2,a3);
  insert into tst values(b0,a1,a2,b3);
  insert into tst values(b0,a1,b2,a3);
  insert into tst values(b0,a1,b2,b3);
  insert into tst values(b0,b1,a2,a3);
  insert into tst values(b0,b1,a2,b3);
  insert into tst values(b0,b1,b2,a3);
  insert into tst values(b0,b1,b2,b3);
end;
/

select sql_text, version_count from v$sqlarea
where sql_text like 'INSERT INTO TST VA%%'
;

In 10.2.0.3 this will create 16 child cursors, one for each insert.
In previous releases it would only create about 5 child cursors as inserts share cursors that have already been built with longer bind lengths. The problem is more severe for differing bind lengths and larger numbers of binds.

和这篇

HIGH VERSION COUNT/INVALIDATION CAUSE BAD PERFORMANCE

*** 06/17/03 08:19 pm ***

TAR:

----

.

PROBLEM:

--------

The application has been in place for the past 12months. For the past month,

it has been opened to more regions which means more users added to the system

(1000-1500). The current sessions are 4000+. The first incident happened on

May 20 where the performance got so bad, users can not sign on (waiting for PE

lock), they have to shutdown abort. It happened yesterday again.

The analyze of system state and library cache dump shows very high version

count upto 480 for a INSERT statement. Most SQL with high version count > 200

are INSERT, UPDATE.

.

This is not MTS/OPS, but use distributed transaction. It has master tables

which been refresh to other 2 sites (read only snapshot).

.

So far, force shutdown happened twice. The instance have been shutdown almost

every Sunday for scheduled maintenance.

.

The wait for latch free and library cache pin just slowly piled up until it

has to be shutdown.

.

The down database cause entire application outage which is very serious to

ct's business. Ct wants to find out the root cause ASAP.

.

DIAGNOSTIC ANALYSIS:

--------------------

Per ct, when invalidation rate goes high (eg 40+/sec), the performance gets

bad.

But from our checking, the SQL with high invalidation count does not necessary

have high version count. There are SQL with very high version count. In the

latest library cache dump (Jun 17), it is upto 480. The statement is INSERT.

and most SQL has quite high CHILDREN number.

.

From our earlier check for a different SQL (eg: INSERT INTO SR_TRANSIMP_REQS

with bind variable) with high version count, the object involved only has 1

public synonym and 1 table. We found the reason for this SQL not sharing are:

1. optimizer_mode different

2. bind type mismatch

3. analyze to collect statistics run

.

Ct's application does have different optimizer_mode set in different module.

We first thought it was due to frequent analyze (every Sunday night) and

no flush shared_pool ever done. But for this incident happened on Jun 17,

Database was shutdown on Sunday, then brought up without open to application,

analyze was run, then flush shared_pool done. No further analyze done. But

within two days, the version count built upto 480. This can not be explained

by above theory.

.

We need to find out what is the cause of high version count and invalidation

count.

.

After the first incident, it was thought related with bug 1349501 and event

32333 level 10 has been set. But problem still occur.

.

WORKAROUND:

-----------

shutdown - not acceptable

flush shared_pool, ct has not tried this yet

mustapha · 发表于 2007-3-5 14:07

[17059],没碰到过。呵呵，看来自己解决最有效，我也记录下，解决bug很难啊，很多时候打补丁都解决不了，而且有一次碰到了个600，在metalink上都没有记载

kickster · 发表于 2007-3-5 14:48

开个SR话，oracle support会在internal的文档去找，它们的资料比普通oracle account的多

非常急 Oracle auto restart (ORA-00600) 望高手指点

浏览过的版块