Oracle DBA Basement

Thursday, February 7, 2013

Data pump, ORA-12161 TNS bad packet, ORA-12592 internal error.

Datapump import over database link failed with:
ORA-12161: TNS:internal error: partial data received
ORA-12592: TNS:bad packet

Solution was to disable ALG on the Juniper firewall. This is similar to another problem when using SCAN listeners and ALG enabled in the firewall.

Processing object type SCHEMA_EXPORT/TABLE/TABLE
ORA-39126: Worker unexpected fatal error in KUPW$WORKER.FETCH_XML_OBJECTS [TABLE:"ME"."MYTAB"]
ORA-12161: TNS:internal error: partial data received
ORA-02063: preceding line from MYDB_LINK

ORA-06512: at "SYS.DBMS_SYS_ERROR", line 95
ORA-06512: at "SYS.KUPW$WORKER", line 9001

----- PL/SQL Call Stack -----
  object      line  object
  handle    number  name
3f7ed28b0     20462  package body SYS.KUPW$WORKER
3f7ed28b0      9028  package body SYS.KUPW$WORKER
3f7ed28b0     10935  package body SYS.KUPW$WORKER
3f7ed28b0      2728  package body SYS.KUPW$WORKER
3f7ed28b0      9697  package body SYS.KUPW$WORKER
3f7ed28b0      1775  package body SYS.KUPW$WORKER
3f7c070a8         2  anonymous block

ORA-39126: Worker unexpected fatal error in KUPW$WORKER.DISPATCH_WORK_ITEMS [TABLE:"ME"."MYOTHERTAB"]
ORA-31600: invalid input value 100001 for parameter HANDLE in function CLOSE
ORA-06512: at "SYS.DBMS_SYS_ERROR", line 105
ORA-06512: at "SYS.DBMS_METADATA", line 544
ORA-06512: at "SYS.DBMS_METADATA", line 4187
ORA-06512: at "SYS.DBMS_METADATA", line 1475
ORA-06512: at "SYS.DBMS_METADATA", line 7481
ORA-06512: at "SYS.KUPW$WORKER", line 2792
ORA-02067: transaction or savepoint rollback required
ORA-06512: at "SYS.DBMS_METADATA", line 1475
ORA-06512: at "SYS.DBMS_METADATA", line 7481
ORA-06512: at "SYS.KUPW$WORKER", line 10928
ORA-12592: TNS:bad packet
ORA-02055: distributed update operation failed; rollback required
ORA-02063: preceding lines from MYDB_LINK

ORA-06512: at "SYS.DBMS_SYS_ERROR", line 95
ORA-06512: at "SYS.KUPW$WORKER", line 9001

----- PL/SQL Call Stack -----
  object      line  object
  handle    number  name
3f7ed28b0     20462  package body SYS.KUPW$WORKER
3f7ed28b0      9028  package body SYS.KUPW$WORKER
3f7ed28b0      9831  package body SYS.KUPW$WORKER
3f7ed28b0      1775  package body SYS.KUPW$WORKER
3f7c070a8         2  anonymous block

Job "ME"."IMPDP_TEST_JOB" stopped due to fatal error at 10:52:42

Thursday, October 4, 2012

ORA-03135: connection lost contact

A firewall timeout may cause an idle session to receive an "ORA-03135: connection lost contact" when running an SQL.

SQL> select to_char(sysdate,'DD.MON.YYYY HH24:MI:SS') from dual

TO_CHAR(SYSDATE,'DD.MON.YYYYH
-----------------------------
28.SEP.2012 10:52:01

Elapsed: 00:00:00.01

(Delay for 35+ minutes)

SQL> r
  1* select to_char(sysdate,'DD.MON.YYYY HH24:MI:SS') from dual
select to_char(sysdate,'DD.MON.YYYY HH24:MI:SS') from dual
*
ERROR at line 1:
ORA-03135: connection lost contact


Elapsed: 00:05:22.63

Note the elapsed time of 5 mintes 22 seconds.

To debug enable a client side SQL*net trace and reproduce the error. In this case after several tests the error occured after 30 minutes.

Oracle 11.1+ database versions mask the "ORA-03135" error and return a result after the 5 minutes which may lead the DBA down the wrong path believing there is a problem with the database.

For 11g the DIAG_ADR_ENABLED=off must be set otherwise trace files will go to the DIAGNOSTICS destination.

To not interfer with other users create the SQLNET.ORA and TNSNAMES.ORA in your own directory and use them by setting the TNS_ADMIN variable.

SQLNET.ORA


TNSPING.TRACE_LEVEL=admin
TNSPING.TRACE_DIRECTORY=/home/oracle/dba/user1/tns_admin/trace/
TRACE_LEVEL_CLIENT=admin
TRACE_DIRECTORY_CLIENT=/home/oracle/dba/user1/tns_admin/trace/
TRACE_UNIQUE_CLIENT=on
TRACE_TIMESTAMP_CLIENT=on
TRACE_TIMESTAMP_SERVER=true
TRACE_FILE_CLIENT=client_trace
DIAG_ADR_ENABLED=off

Information gathered in the trace file:

[28-SEP-2012 11:29:21:622] nioqsm: send-break: failed to send oob break...
[28-SEP-2012 11:29:21:622] nioqper:  error from send-marker
[28-SEP-2012 11:29:21:622] nioqper:    ns main err code: 12547
[28-SEP-2012 11:29:21:622] nioqper:    ns (2)  err code: 12560
[28-SEP-2012 11:29:21:622] nioqper:    nt main err code: 517
[28-SEP-2012 11:29:21:622] nioqper:    nt (2)  err code: 32
[28-SEP-2012 11:29:21:622] nioqper:    nt OS   err code: 0
[28-SEP-2012 11:29:21:622] nioqsm: exit
[28-SEP-2012 11:29:21:622] nioqer: entry
[28-SEP-2012 11:29:21:622] nioqce: entry
[28-SEP-2012 11:29:21:622] nioqce: exit
[28-SEP-2012 11:29:21:623] nioqer: exit
[28-SEP-2012 11:29:21:623] nioqrs: nioqrs: Couldn't send break. returning 3135

The trace file data above matches what is in MOS note:

ORA-3135 with Recovery Catalog Creation Across the Network (Firewall included) [ID 805088.1]

"where nt[1]=32 is Operating System Dependent(OSD) error code.

OSD error is Err#32 Broken pipe. This OSD error is also defined in errno.h:
* #define EPIPE 32 /* Broken pipe */"

Resolution was to set an SQLNET.EXPIRE_TIME=15 in the database product $ORACLE_HOME/network/admin/sqlnet.ora (choose 15 minutes because it was less than the 30 minutes the firewall was closing the session).

Other references

Troubleshooting ORA-3135 Connection Lost Contact [ID 787354.1]

Friday, September 28, 2012

Copying files to ASM using ASMCMD

When manually copying a file from the filesystem to ASM using ASMCMD the file will be placed under a generic ASM directory regardless of specifying the destination. Below example a file is copied to +DATA/mynewdb1/ however file this creates an alias to +DATA/ASM/DATAFILE/

ASMCMD> cp /app/oracle/backup/mydb1/DF.2105.2105.784545123 +DATA/mynewdb1/datafile/DF.2105
copying /app/oracle/backup/mydb1/DF.2105.2105.784545123 -> +DATA/mynewdb1/datafile/DF.2105
ASMCMD> cd +DATA/mynewdb1/datafile/
ASMCMD> ls -al
WARNING:option 'a' is deprecated for 'ls'
please use 'absolutepath'
Type Redund Striped Time Sys Name
...
N DF.2105 => +DATA/ASM/DATAFILE/DF.2105.401.784554241
...
ASMCMD> pwd
+DATA/mynewdb1/datafile

According to Oracle this is expected behaviour.

However files transferred to ASM via the database ONNN background process set the context to which to store the files and are placed in the "correct" location. For example using RMAN connected to the target database:

RMAN> restore controlfile from '/app/oracle/backup/db1/Current.312.769340147';
Starting restore at 23-SEP-12
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=242 instance=MYNEWDB1 device type=DISK
channel ORA_DISK_1: copied control file copy
output file name=+DATA/mynewdb1/control01.ctl
output file name=+DATA/mynewdb1/control02.ctl
Finished restore at 23-SEP-12
RMAN>

To copy a file without it ending up being an alias to the generic location use RMAN which will communicate with ASM via the ONNN process and place it in the appropriate location for the database.

11.2 Background Processes "Onnn, ASM Connection Pool Process, Maintains a connection to the ASM instance for metadata operations, Onnn slave processes are spawned on demand. These processes communicate with the ASM instance."

On a semi-related note, why did Oracle consider "ls -absolutepath" was better than "ls -al"? "ls -al" had fewer keystrokes.

ASMCMD> ls -al
WARNING:option 'a' is deprecated for 'ls'
please use 'absolutepath'

Saturday, September 1, 2012

ORA-27504: IPC error creating OSD context, ORA-27302: failure occurred at: skgxpcini3

CRS-5017: The resource action "ora.mydb.db start" encountered the following error:
ORA-27504: IPC error creating OSD context
ORA-27300: OS system dependent operation:check if failed with status: 0
ORA-27301: OS failure message: Error 0
ORA-27302: failure occurred at: skgxpcini3
ORA-27303: additional information: requested interface vnet5:1 interface not up
 _disable_interface_checking = TRUE to disable this check for single instance cluster. 
Check output from ifconfig co

Alert log error for instance:

SKGXP: ospid 10823: network interface with IP address 169.254.20.147 is DOWN

The error message mentions network interface vnet5:1 In this case vnet5 and vnet6 are private interconnects using HAIP hence the 169.254.x.x address (Link Local Address). On SERVER01, vnet5 is UP, no 169. address bound to it:

vnet5: flags=1000843 mtu 9000 index 6
        inet 10.0.84.194 netmask fffffff0 broadcast 10.0.84.207

On SERVER02, vnet5 not UP, has 169.254.20.147 bound to both vnet5 and vnet6!

vnet5: flags=1000842 mtu 9000 index 8
        inet 10.0.84.195 netmask fffffff0 broadcast 10.0.84.207
vnet5:1: flags=1000842 mtu 9000 index 8
        inet 169.254.20.147 netmask ffff8000 broadcast 169.254.127.255
vnet6: flags=1000843 mtu 9000 index 7
        inet 10.0.84.211 netmask fffffff0 broadcast 10.0.84.223
vnet6:1: flags=1000843 mtu 9000 index 7
        inet 169.254.164.136 netmask ffff8000 broadcast 169.254.255.255
vnet6:2: flags=1000843 mtu 9000 index 7
        inet 169.254.20.147 netmask ffff8000 broadcast 169.254.127.255

Fix was to reboot SERVER02 as it turned out the UNIX administrator had performed some work on the blades concerning MAC addresses for vnet5. Somewhere vnet5 and vnet6 got the link-local address mixed up.

ORA-27102: out of memory

The below shows an "ORA-27102 out of memory" error on startup for an Oracle RAC database however this error can occur in single instances also.

$ srvctl start database -d MYDB
PRCR-1079 : Failed to start resource ora.MYDB.db
CRS-5017: The resource action "ora.MYDB.db start" encountered the following error:
ORA-27102: out of memory
SVR4 Error: 22: Invalid argument
. For details refer to "(:CLSN00107:)" in 
"/app/11.2.0/grid/log/myserver01/agent/crsd/oraagent_oracle/oraagent_oracle.log".

In alert log:

WARNING: The system does not seem to be configured
optimally. Creating a segment of size 0x000000009b000000
failed. Please change the shm parameters so that
a segment can be created for this size. While this is
not a fatal issue, creating one segment may improve
performance

9b000000 hex = 2400M = MEMORY_TARGET and MEMORY_MAX_TARGET settings for this particular instance. Check the shared memory max in the project. Databases are being started by the 'grid' user for RAC however single instances will usually be 'oracle':

> projects -l

user.oracle
...
                 project.max-shm-memory=(priv,4294967295,deny)
user.grid
...
                 project.max-shm-memory=(priv,4294967295,deny)

Shared memory maximum allocated to the project is 4G. If the sum of all SGA/PGAs allocated for instances is more than 4G within the project an instance will encounter the error either on startup or even during operation. To fix either increase the maximum shared memory limit (OS resources permitting) or decrease the SGA/PGA in individual instances to under the limit. The following metalink note describes the problem and solutions: Database Startup On Solaris 10 Fails With Ora-27102 Out Of Memory Error [ID 399895.1]