Originally posted by Yong Huang:
I thought it was every 3 seconds, not minutes. Where did you read this?
Yong Huang
metalink
note 53711.996
From: Cate Parry 11-Jan-01 15:49
Subject: PMON cleanup
RDBMS Version: 8.1.7.0.0
Operating System and Version: Sun Solaris 2.7
Error Number (if applicable):
Product (i.e. SQL*Loader, Import, etc.):
Product Version: 8.1.7.0.0
PMON cleanup
Is there an init.ora parameter I can set to have pmon cleanup sessions with a status of KILLED more frequently?
thanx in advance
-Cate
--------------------------------------------------------------------------------
From: Oracle, Melissa Holman 15-Jan-01 21:43
Subject: Re : PMON cleanup
Cate,
PMON is cleaning up and rolling back all uncommitted transactions every 3 minutes and this is not configurable.
You may want to check your parameter setting for CLEANUP_ROLLBACK_ENTRIES. If you have lots of aborted or killed sessions, a high setting can be a problem: users may queue up behind locks held by the killed sessions, because it will take longer to round-robin through all sessions.
Refer to this note:
PARAMETER: INIT.ORA: CLEANUP_ROLLBACK_ENTRIES
Whenever a user session is KILLed, the session will remain in v$session marked as killed until the user session attempts something. On an attempt, such as a select, a message that this session is killed will be displayed to the user and the entry is then removed from v$session.
Depending on what the session was doing at the time it was KILLed, it may take PMON some time to clean it up. After the process is KILLed, it is marked as KILLED and PMON starts the cleanup by rolling back any active transactions and releasing all locks held by that session. Once the transaction has been successfully rolled back all locks and resources held by that session should be released even though the session record may remain in v$session.
The problem is that if the client process no longer exists, there is no way for the message that the session has been killed to be propagated to the client process. Therefore, the session object cannot be entirely cleaned up and the session record remains in v$session. One way to clear it is to manually kill the shadow process from the OS. Another alternative is to enable dead connection detection (DCD) within Net8. DCD will periodically poll the client sessions to identify those that may have terminated abnormally and should initiate a cleanup of those that no longer exist.
WHAT IS DEAD CONNECTION DETECTION (DCD)?
Doc ID: Note:1013364.6
Subject: What is Dead Connection Detection (DCD)?
Type: PROBLEM
Status: PUBLISHED
Content Type: TEXT/PLAIN
Creation Date: 23-OCT-1995
Last Revision Date: 22-APR-2002
Problem Description:
====================
This document discusses Dead Connection Detection (DCD). DCD is a feature of
SQL*Net V2.1 and later. It detects when a partner in a SQL*Net V2
client/server or server/server connection has terminated unexpectedly and
releases the resources associated with it.
Search words:
sqlnet expire time
SQLNET.EXPIRE_TIME
DEAD CONNECTION DETECTION
=========================
OVERVIEW
--------
Dead Connection Detection (DCD) is a feature of SQL*Net V2.1 and later. It
detects when a partner in a SQL*Net V2 client/server or server/server
connection has terminated unexpectedly, and releases the resources associated
with it.
DCD is intended primarily for environments in which clients power down their
systems without disconnecting from their Oracle sessions, a problem
characteristic of networks with PC clients.
DCD is initiated on the server when a connection is established. At this
time SQL*Net reads the SQL*Net parameter files and sets a timer to generate an
alarm. The timer interval is set by providing a non-zero value in minutes for
the SQLNET.EXPIRE_TIME parameter in the sqlnet.ora file.
When the timer expires, SQL*Net on the server sends a "probe" packet to the
client. (In the case of a database link, the destination of the link
constitutes the server side of the connection.) The probe is essentially an
empty SQL*Net packet and does not represent any form of SQL*Net level data,
but it creates data traffic on the underlying protocol.
If the client end of the connection is still active, the probe is discarded,
and the timer mechanism is reset. If the client has terminated abnormally,
the server will receive an error from the send call issued for the probe, and
SQL*Net on the server will signal the operating system to release the
connection's resources.
On Unix servers, the sqlnet.ora file must be in either $TNS_ADMIN or
$ORACLE_HOME/network/admin. Neither /etc nor /var/opt/oracle alone is valid.
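A minimal sketch of checking which sqlnet.ora a Unix server would pick up and what SQLNET.EXPIRE_TIME it sets. The helper names are hypothetical, and trying $TNS_ADMIN before $ORACLE_HOME/network/admin is an assumption; the note only says the file must be in one of the two.

```python
import os
import re
from pathlib import Path


def candidate_sqlnet_paths(env):
    """Return the two sqlnet.ora locations named in the note.

    The order ($TNS_ADMIN first) is an assumption, not stated in the note.
    """
    paths = []
    if env.get("TNS_ADMIN"):
        paths.append(Path(env["TNS_ADMIN"]) / "sqlnet.ora")
    if env.get("ORACLE_HOME"):
        paths.append(Path(env["ORACLE_HOME"]) / "network" / "admin" / "sqlnet.ora")
    return paths


def read_expire_time(text):
    """Return SQLNET.EXPIRE_TIME in minutes, or 0 if no value is set."""
    for line in text.splitlines():
        line = line.split("#", 1)[0]  # ignore anything after a '#' comment marker
        m = re.match(r"\s*SQLNET\.EXPIRE_TIME\s*=\s*(\d+)\s*$", line, re.IGNORECASE)
        if m:
            return int(m.group(1))
    return 0


if __name__ == "__main__":
    for path in candidate_sqlnet_paths(os.environ):
        if path.is_file():
            print(path, "->", read_expire_time(path.read_text()), "minute(s)")
            break
    else:
        print("no sqlnet.ora found in $TNS_ADMIN or $ORACLE_HOME/network/admin")
```

A result of 0 means no non-zero interval was found, so the DCD timer would not be set.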
It should also be noted that in SQL*Net 2.1.x, an active orphan process
(one processing a query, for example) will not be killed until the query
completes. In SQL*Net 2.2, orphaned resources will be released regardless of
activity.
This is a server feature only. The client may be running any supported
SQL*Net V2 release.
THE FUNCTION OF THE PROTOCOL STACK
----------------------------------
While Dead Connection Detection is set at the SQL*Net level, it relies
heavily on the underlying protocol stack for its successful execution. For
example, you might set SQLNET.EXPIRE_TIME=1 in the sqlnet.ora file, but it is
unlikely that an orphaned server process will be cleaned up immediately upon
expiration of that interval.
TCP/IP, for example, is a connection-oriented protocol, and as such, the
protocol will implement some level of packet timeout and retransmission in an
effort to guarantee the safe and sequenced order of data packets. If a timely
acknowledgement is not received in response to the probe packet, the TCP/IP
stack will retransmit the packet some number of times before timing out.
After TCP/IP gives up, then SQL*Net receives notification that the probe
failed.
The time that it takes TCP/IP to timeout is dependent on the TCP/IP stack,
and timeouts of many minutes are entirely common. This has been an area of
concern for many customers, as many retransmissions at the protocol layer
cause what could be a significant lag between the expiration of the DCD
interval and the time when the orphaned process is actually killed.
The easiest way to determine if the protocol stack is causing such a delay
involves testing different DCD intervals.
TESTING THE PROTOCOL STACK
--------------------------
Set the SQLNET.EXPIRE_TIME parameter to 1 minute and note the time required
to clean up an orphaned server process. Then set SQLNET.EXPIRE_TIME to 5
minutes and again observe the time required to clean up the shadow. If the
TCP/IP timeout is the reason the server resources do not get released, the
time to clean up the shadow should increase by about 4 minutes.
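The comparison described above can be sketched as follows. Each run is a hypothetical (SQLNET.EXPIRE_TIME in minutes, observed cleanup time in minutes) pair; the function names, the one-minute tolerance, and the sample figures in the usage note are assumptions for illustration, not measurements.

```python
def cleanup_increase(short_run, long_run):
    """Return (change in DCD interval, change in observed cleanup time), in minutes."""
    (expire_1, cleanup_1), (expire_2, cleanup_2) = short_run, long_run
    return expire_2 - expire_1, cleanup_2 - cleanup_1


def tracks_interval(short_run, long_run, tolerance_min=1.0):
    """True if cleanup time grew by about as much as the DCD interval did.

    Going from 1 to 5 minutes, the note expects an increase of about 4 minutes
    when the protocol-stack timeout accounts for the remaining delay.
    """
    d_expire, d_cleanup = cleanup_increase(short_run, long_run)
    return abs(d_cleanup - d_expire) <= tolerance_min


def residual_delay(run):
    """Minutes of cleanup time not explained by the DCD interval itself."""
    expire, cleanup = run
    return cleanup - expire
```

For example, with made-up observations (1, 9.0) and (5, 13.0), tracks_interval returns True and residual_delay gives 8.0 minutes for both runs, which would point at the protocol stack rather than at DCD.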
If the TCP/IP retransmission timeout is indeed the problem, the operating
system kernel can be tuned to reduce the interval for and number of packet
retransmissions (on many Unix platforms, the file
/usr/include/netinet/tcp_timer.h contains the configuration parameters).
Reducing the interval and number of retransmissions may impact other system
components, since in effect you are shrinking the window allowed for
connections to process data, possibly resulting in inadvertent loss of
connections during periods of heavy system load. Slower connections
from remote sites may be impacted by this change.
Kernel parameters that may affect retransmission include but are not limited
to TCP_TTL, TCPTV_PERSMIN, TCPTV_MAX, and TCP_LINGERTIME.
***To avoid disrupting other system processes, it is important to contact the
appropriate vendor for assistance in tuning the operating system kernel or
protocol stack.***
MONITORING DEAD CONNECTION DETECTION
------------------------------------
The best way to determine if DCD is enabled and functioning properly is to
generate a server trace and search the file for the DCD probe packet. To
generate a server trace, set TRACE_LEVEL_SERVER=16 and
TRACE_DIRECTORY_SERVER=<path> in sqlnet.ora on the server (note the
location of the sqlnet.ora file). The resulting trace file will have a
filename of svr_<PID>.trc and will be located in the specified directory.
Is DCD Enabled?
Search the server trace file for an entry like the following:
osntns: Enabling dead connection detection (1 min)
The timer interval listed should match the value of SQLNET.EXPIRE_TIME.
Is DCD Working?
Search the server trace file for DCD probe packets. They will appear
in the form of empty data packets, as follows:
nstimexp: entry
nstimexp: timer expired at 05-OCT-95 12:15:05
nsdo: entry
nsdo: cid=0, opcode=67, *bl=0, *what=1, uflgs=0x2, cflgs=0x3
nsdo: nsctx: state=8, flg=0x621c, mvd=0
nsdo: gtn=93, gtc=93, ptn=10, ptc=2048
nsdoacts: entry
nsdofls: entry
nsdofls: DATA flags: 0x0
nsdofls: sending NSPTDA packet
nspsend: entry
nspsend: plen=10, type=6
nttwr: entry
nttwr: socket 4 had bytes written=10
nttwr: exit
nspsend: 10 bytes to transport
nspsend: packet dump
nspsend:00 0A 00 00 06 00 00 00 |........|
nspsend:00 00 00 00 00 00 00 00 |........|
nspsend: normal exit
nsdofls: exit (0)
nsdoacts: flushing transport
nttctl: entry
nsdoacts: normal exit
nsdo: normal exit
nstimexp: normal exit
The entry:
nspsend:00 0A 00 00 06 00 00 00 |........|
nspsend:00 00 00 00 00 00 00 00 |........|
represents the probe packet. Note that DCD packets are 10 bytes long when
they are issued to the protocol stack. Once the protocol header and trailer
bytes for the underlying protocols have been added, the packet could be
approximately 70 bytes long.
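A small sketch that reads the two dump lines back into bytes. Treating the first two bytes as a big-endian length and the fifth byte as the packet type is an assumption, chosen only because it reproduces the plen=10 and type=6 values that nspsend logs in the same trace; it is not taken from protocol documentation.

```python
def parse_dump_line(line):
    """Extract the hex bytes from one 'nspsend:' dump line, ignoring the ASCII column."""
    body = line.split(":", 1)[1].split("|", 1)[0]
    return bytes(int(token, 16) for token in body.split())


dump = [
    "nspsend:00 0A 00 00 06 00 00 00 |........|",
    "nspsend:00 00 00 00 00 00 00 00 |........|",
]

raw = b"".join(parse_dump_line(line) for line in dump)

plen = int.from_bytes(raw[0:2], "big")  # assumed: bytes 0-1 hold the packet length
ptype = raw[4]                          # assumed: byte 4 holds the packet type
packet = raw[:plen]                     # the dump shows 16 bytes; only plen belong to the probe

print(f"plen={plen}, type={ptype}, packet={packet.hex(' ')}")
```

Everything after the assumed type byte is zero, which is consistent with the "DATA flags: 0x0" line and with the probe carrying no payload.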
If DCD is enabled, you will see these probe packets written to the trace
file when the timer expires. If the server is a UNIX system, it might be
useful to establish a connection and tail the trace file:
tail -f svr_<PID>.trc
The time elapsed after each probe packet is written to the server trace should
match the SQLNET.EXPIRE_TIME value.
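Both checks can be scripted against a server trace. The sketch below assumes the trace lines look exactly like the excerpts in this note; the function name and the two-digit-year pivot are hypothetical choices.

```python
import re
from datetime import datetime

MONTHS = {name: number for number, name in enumerate(
    "JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC".split(), start=1)}

ENABLED_RE = re.compile(r"Enabling dead connection detection \((\d+) min\)")
EXPIRED_RE = re.compile(
    r"nstimexp: timer expired at (\d{2})-([A-Z]{3})-(\d{2}) (\d{2}):(\d{2}):(\d{2})")


def scan_trace(lines):
    """Return (DCD interval in minutes or None, gaps in minutes between timer expirations)."""
    interval = None
    expirations = []
    for line in lines:
        m = ENABLED_RE.search(line)
        if m:
            interval = int(m.group(1))
            continue
        m = EXPIRED_RE.search(line)
        if m:
            day, mon, yy, hh, mi, ss = m.groups()
            # Assumed pivot for two-digit years: 50-99 -> 19xx, 00-49 -> 20xx.
            year = (1900 if int(yy) >= 50 else 2000) + int(yy)
            expirations.append(
                datetime(year, MONTHS[mon], int(day), int(hh), int(mi), int(ss)))
    gaps = [(later - earlier).total_seconds() / 60
            for earlier, later in zip(expirations, expirations[1:])]
    return interval, gaps
```

Called as scan_trace(open(trace_path)), where trace_path is the svr_<PID>.trc file, an interval of None suggests DCD was never enabled for that connection, and each gap should sit close to the interval if probes are going out on schedule.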
KNOWN PROBLEMS OR LIMITATIONS
-----------------------------
- Of the few reported problems, perhaps the most significant is DCD's poor
performance on Windows NT. Dead connections are cleaned up only when the
server is rebooted and the database is restarted. Exactly how well DCD works
on NT depends on the client's protocol implementation. SQL*Net v2.3 has
improved the performance over earlier releases.
This has been logged as port-specific Bug#303578.
- On SCO Unix, a problem was reported in which server processes spin,
consuming large amounts of CPU, once the DCD timer expires. The problem is
due to improper signal handling and can be eliminated by disabling DCD.
This is port-specific Bug#293264.
- Orphaned resources are not released if only the client application is
terminated. Only after the client PC has been rebooted does DCD release these
resources. For example, if a Windows application is killed yet Windows
remains running, the probe packet may be received and discarded as if the
connection is still active. As it currently stands, it appears that DCD
detects dead client machines, but not dead client processes.
This is logged as generic Bug#280848.
- The SQL*Net V2 implementation on MVS does not use the generic DCD
mechanism, and therefore the SQLNET.EXPIRE_TIME parameter does not apply. The
KEEPALIVE function of IBM's TCP/IP is used instead. This was implemented
prior to the development of DCD.
This is documented in port-specific Bug#301318.
- DCD relies heavily on issuing probe packets during any phase of the
connection. This is not possible with some protocols which run
half-duplex. Hence, DCD is not enabled on protocols like APPC/LU6.2.
This is not a bug, but is rather the intended design.
- Local connections using BEQ protocol adapters are not supported with DCD.
Local connections using the IPC protocol adapters are supported with DCD.
If you set sqlnet.expire_time, the behavior should depend on the operating system's TCP/IP communication mechanism.
Quoted from Cluster:
To speed up cleanup of killed sessions you can increase the value of the
CLEANUP_ROLLBACK_ENTRIES parameter in init.ora.
PMON processes CLEANUP_ROLLBACK_ENTRIES blocks every 3 minutes.
Therefore, we can calculate when all the blocks will have been cleared.
For example, if you have 2753 undo blocks and CLEANUP_ROLLBACK_ENTRIES
is set to 20, it will take approximately
2753 blocks / 20 blocks per batch = 138 batches
138 batches * 3 minutes per batch = 414 minutes = 6 hours 54 minutes
to go through all the undo blocks.
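The same arithmetic as a short sketch, assuming as the quoted text does that PMON handles one batch of CLEANUP_ROLLBACK_ENTRIES blocks per 3-minute cycle. The function and constant names are hypothetical.

```python
import math

PMON_CYCLE_MINUTES = 3  # interval taken from the quoted text


def cleanup_estimate(undo_blocks, cleanup_rollback_entries):
    """Return (batches, total minutes, (hours, minutes)) for a full pass over the undo blocks."""
    batches = math.ceil(undo_blocks / cleanup_rollback_entries)
    total_minutes = batches * PMON_CYCLE_MINUTES
    return batches, total_minutes, divmod(total_minutes, 60)


batches, minutes, (hours, rest) = cleanup_estimate(2753, 20)
print(f"{batches} batches, {minutes} minutes = {hours} h {rest} min")
# prints: 138 batches, 414 minutes = 6 h 54 min
```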
However, this parameter no longer exists in 8i.