Thursday, October 4, 2012

ORA-03135: connection lost contact

A firewall timeout may cause an idle session to receive an "ORA-03135: connection lost contact" when running an SQL.
SQL> select to_char(sysdate,'DD.MON.YYYY HH24:MI:SS') from dual

TO_CHAR(SYSDATE,'DD.MON.YYYYH
-----------------------------
28.SEP.2012 10:52:01

Elapsed: 00:00:00.01
(Delay for 35+ minutes)
SQL> r
  1* select to_char(sysdate,'DD.MON.YYYY HH24:MI:SS') from dual
select to_char(sysdate,'DD.MON.YYYY HH24:MI:SS') from dual
*
ERROR at line 1:
ORA-03135: connection lost contact


Elapsed: 00:05:22.63
Note the elapsed time of 5 mintes 22 seconds.

To debug enable a client side SQL*net trace and reproduce the error. In this case after several tests the error occured after 30 minutes.

Oracle 11.1+ database versions mask the "ORA-03135" error and return a result after the 5 minutes which may lead the DBA down the wrong path believing there is a problem with the database.

For 11g the DIAG_ADR_ENABLED=off must be set otherwise trace files will go to the DIAGNOSTICS destination.

To not interfer with other users create the SQLNET.ORA and TNSNAMES.ORA in your own directory and use them by setting the TNS_ADMIN variable.

SQLNET.ORA


TNSPING.TRACE_LEVEL=admin
TNSPING.TRACE_DIRECTORY=/home/oracle/dba/user1/tns_admin/trace/
TRACE_LEVEL_CLIENT=admin
TRACE_DIRECTORY_CLIENT=/home/oracle/dba/user1/tns_admin/trace/
TRACE_UNIQUE_CLIENT=on
TRACE_TIMESTAMP_CLIENT=on
TRACE_TIMESTAMP_SERVER=true
TRACE_FILE_CLIENT=client_trace
DIAG_ADR_ENABLED=off
Information gathered in the trace file:
[28-SEP-2012 11:29:21:622] nioqsm: send-break: failed to send oob break...
[28-SEP-2012 11:29:21:622] nioqper:  error from send-marker
[28-SEP-2012 11:29:21:622] nioqper:    ns main err code: 12547
[28-SEP-2012 11:29:21:622] nioqper:    ns (2)  err code: 12560
[28-SEP-2012 11:29:21:622] nioqper:    nt main err code: 517
[28-SEP-2012 11:29:21:622] nioqper:    nt (2)  err code: 32
[28-SEP-2012 11:29:21:622] nioqper:    nt OS   err code: 0
[28-SEP-2012 11:29:21:622] nioqsm: exit
[28-SEP-2012 11:29:21:622] nioqer: entry
[28-SEP-2012 11:29:21:622] nioqce: entry
[28-SEP-2012 11:29:21:622] nioqce: exit
[28-SEP-2012 11:29:21:623] nioqer: exit
[28-SEP-2012 11:29:21:623] nioqrs: nioqrs: Couldn't send break. returning 3135
The trace file data above matches what is in MOS note:
  • ORA-3135 with Recovery Catalog Creation Across the Network (Firewall included) [ID 805088.1]
    "where nt[1]=32 is Operating System Dependent(OSD) error code.
    
    OSD error is Err#32 Broken pipe. This OSD error is also defined in errno.h:
    * #define EPIPE 32 /* Broken pipe */"
    
    Resolution was to set an SQLNET.EXPIRE_TIME=15 in the database product $ORACLE_HOME/network/admin/sqlnet.ora (choose 15 minutes because it was less than the 30 minutes the firewall was closing the session).

    Other references

  • Troubleshooting ORA-3135 Connection Lost Contact [ID 787354.1]