kcrrwkx: nothing to do

Wednesday, January 2, 2008

Running Oracle Standard Edition 10.2.0.2 on Solaris 9. Encountered a trace file from the archiver process in my background_dump_dest location. Upon opening the trace file, I found many entries of the message shown below:

*** 2008-01-02 14:21:49.970
kcrrwkx: nothing to do (end)

A quick search on Metalink using “kcrrwkx nothing to do” as the criteria turned up Metalink Note 372364.1. The note labels this as Bug 4883174 and offers two possible solutions.

1. This message has zero impact and can be ignored, except that the trace files consume disk space. They can be removed manually or by a cron job.

2. The one-off patch (4883174) for fixing this issue is also available for some platforms.

The note also mentions that the problem is fixed in 11g and possibly in 10.2.0.3. The 10.2.0.3 Patch Set – List of Bug Fixes indicates the problem is fixed in that Oracle version. I have two other databases running 10.2.0.3 and have confirmed the archiver messages do not appear there.

In the end I chose solution #1 as this server will be replaced with new hardware running Solaris 10 and Oracle 10.2.0.3. Maybe 10.2.0.4 if Oracle releases that version soon.
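
Since solution #1 leaves the trace files to be cleaned up, here is a minimal sketch of what a cleanup could look like. The function name, the *_arc*.trc file name pattern and the 7-day age are assumptions for illustration, not values taken from the Metalink note; check how the trace files are actually named before deleting anything.

```shell
# Delete archiver trace files under directory $1 that were last modified
# more than $2 days ago. find also descends into subdirectories.
# The *_arc*.trc pattern is an assumption about how the files are named.
purge_old_arc_traces() {
    find "$1" -type f -name '*_arc*.trc' -mtime +"$2" -exec rm -f {} \;
}

# Hypothetical usage; substitute the real background_dump_dest path:
# purge_old_arc_traces /path/to/bdump 7
```

Saved in a script, the same call could be scheduled from the oracle user's crontab to run once a day.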


Oracle 10g dbstart errors

Wednesday, December 19, 2007

After correcting my dbhome not found problem, I now encounter a “VER10LIST=10 is not an identifier” error when executing the dbstart script. Found Metalink Note 466241.1 explaining the necessary modification to the $ORACLE_HOME/bin/dbstart script.

At or near line 87, you need to modify the code from this:

export VER10LIST=`$ORACLE_HOME_LISTNER/bin/…`

to this:

VER10LIST=`$ORACLE_HOME_LISTNER/bin/…`
export VER10LIST

The script is executed by the Bourne shell, which cannot assign and export in a single statement. The syntax for setting an environment variable is:

variable=value
export variable

After fixing the above problem, the next error encountered is “`COUNT=$’ unexpected” at or near line 259. This error is also resolved with a modification listed in Metalink Note 466241.1.

From this:

COUNT=$((COUNT+1))

to this:

COUNT=`expr $COUNT + 1`
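
Both fixes can be tried outside of dbstart. The snippet below is a minimal sketch of the two Bourne-compatible forms; DEMO_LIST is a hypothetical stand-in for VER10LIST, and the loop is only there to give the counter something to count.

```shell
# Assign first, then export: the two-step form the Bourne shell accepts.
DEMO_LIST=`echo 10`    # DEMO_LIST is a hypothetical stand-in for VER10LIST
export DEMO_LIST

# Increment with expr instead of the $(( )) arithmetic expansion.
COUNT=0
for item in a b c; do
    COUNT=`expr $COUNT + 1`
done

echo "DEMO_LIST=$DEMO_LIST COUNT=$COUNT"
```

Run with /bin/sh, this should print DEMO_LIST=10 COUNT=3 without either of the two errors above.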


MySQL: Lock wait timeout exceeded

Wednesday, December 12, 2007

Running MySQL version 4.1.20. Encountered error 1205 (Lock wait timeout exceeded) when attempting to purge my PerfParse database, after recently reducing the data retention from 90 days to 60 days.

One of many such errors from the /usr/local/nagios/var/perfparse.log.20071212 log file:

INSERT perfdata_bin_summary_data (metric_id,
frequency, ctime, val_count, sum_val,
sum_square_val, max_val, min_val)
VALUES (21, 1, FROM_UNIXTIME(1197446400),
1, 2.85000000000000e+00,
8.12250000000000e+00,2.85000000000000e+00,
2.85000000000000e+00)

(Lock wait timeout exceeded; try restarting transaction)

The tables in PerfParse use the InnoDB storage engine. I believe the “perfparse-db-purge” process holds its locks for a long time because of the amount of data being purged, which causes normal PerfParse performance data transactions to time out and fail.

To resolve this, I had to reduce the data retention gradually over time instead of trying to purge 30 days all at once.
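
As an illustration of the gradual approach, the sketch below prints one retention value per purge run when stepping down from 90 to 60 days. The function name and the 5-day step are assumptions; how the retention value is actually changed in PerfParse is not shown here.

```shell
# Print one retention value per purge run, stepping from $1 days down to
# $2 days in decrements of $3 days. $3 must be a positive number.
retention_steps() {
    current=$1
    target=$2
    step=$3
    while [ "$current" -gt "$target" ]; do
        current=`expr $current - $step`
        # Never go below the target on the last step.
        if [ "$current" -lt "$target" ]; then
            current=$target
        fi
        echo "$current"
    done
}

retention_steps 90 60 5
```

Each printed value would be applied in turn, letting a purge finish before moving on to the next one, so that no single purge has to remove 30 days of data.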

References
MySQL DBA – “MySQL: Replication stopped: Lock wait timeout exceeded”


Mounting Oracle media on HP-UX

Sunday, December 2, 2007

One annoying aspect of installing Oracle on HP-UX is the requirement to mount the CD or DVD device manually, unlike Solaris, which auto-mounts it. I guess what makes it annoying to me is trying to remember the commands. Since Oracle media for HP-UX is RockRidge-formatted, the usual mount command cannot be used. In my case, I am working with HP-UX 11i V1 (11.11).

Another annoyance is that the mount command in the Oracle documentation appears to be incorrect, or at least does not work in my environment:

# /usr/sbin/mount -F cdfs -o rr /dev/dsk/c0t0d0 /cdrom
mount: illegal file system specific option rr

However, I was able to find a solution in Metalink Note 219190.1 using “mount rockridge cd-rom hp” as the search criteria.

Using a Google search with “mount rockridge” as the criteria, I also found information at comp.sys.hp.hpux FAQ and HP-UX Tips & Tricks blog.

To mount the DVD, I used the steps below while logged in as root. This assumes the pfs executables are located in /usr/sbin.

1) Create/Edit the /etc/pfs_fstab file.

/dev/dsk/c0t0d0 /cdrom pfs-rrip xlat=rrip 0 0

2) Execute commands below.

# nohup pfs_mountd &
# nohup pfsd &
# pfs_mount /cdrom

3) Unmount when finished.

# pfs_umount /cdrom
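
Step 1 is the part I would most likely repeat by accident, so here is a minimal sketch of a helper that adds the entry only when it is missing. The function name is hypothetical, and the example call writes to a scratch file rather than the real /etc/pfs_fstab.

```shell
# Append a RockRidge entry to a pfs_fstab-style file unless the exact
# line is already present.
# $1 = fstab file, $2 = device, $3 = mount point
ensure_pfs_entry() {
    fstab=$1
    entry="$2 $3 pfs-rrip xlat=rrip 0 0"
    if [ -f "$fstab" ] && grep -F -x "$entry" "$fstab" >/dev/null; then
        return 0
    fi
    echo "$entry" >> "$fstab"
}

# Example against a scratch file; on a real system this would be /etc/pfs_fstab.
ensure_pfs_entry "${TMPDIR:-/tmp}/pfs_fstab.example" /dev/dsk/c0t0d0 /cdrom
```

Calling it a second time with the same arguments leaves the file unchanged.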


dbhome not found

Tuesday, November 20, 2007

Encountered a problem where my Oracle 10g database on Solaris 9 would not automatically shut down or start up during a reboot. The error generated is:

/etc/rc0.d/K01oracle: dbhome: not found
ORACLE_HOME = [] ?

On my server, dbhome is located in $ORACLE_HOME/bin as well as /usr/local/bin. This led me to believe the PATH for root might not have /usr/local/bin included during the UNIX system startup/shutdown process. I copied the script to /usr/bin and now the system appears to function fine during UNIX system startup/shutdown.

Strange thing is both directories are in root’s PATH during normal operation.

/usr/sbin:/usr/bin:/sbin:/usr/local/bin:
/usr/local/sbin:/usr/X/bin:/usr/lib/lvm:
/etc/lvm:/usr/sbin/osa:/usr/ccs/bin:/usr/ucb

I will need to investigate further why /usr/local/bin is not included in root’s PATH during UNIX system startup/shutdown. Is that by system design (i.e. security) or a system misconfiguration?

Another strange thing is this does not occur on my Solaris 10 server running Oracle 10g. Even more confused now.

Update (09-Jan-2008)
On Solaris 10 when running the shutdown/startup scripts manually as the root user, this problem does not occur. However, when the system is rebooted the error does occur. Found the error in /var/svc/log/milestone-multi-user:default.log.

Executing legacy init script “/etc/rc2.d/S99oracle”.
Oracle Startup/Shutdown Begins
/etc/rc2.d/S99oracle: dbhome: not found
ORACLE_HOME = [] ? /etc/rc2.d/S99oracle: test: argument expected
Legacy init script “/etc/rc2.d/S99oracle” exited with return code 1.

My resolution to this problem was to copy dbhome from /usr/local/bin to /usr/bin. The error no longer occurs during server reboot.
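
A possible alternative to copying the file, sketched here as an untested assumption rather than a verified fix, would be for the init script to extend its own PATH and check that dbhome can be found before going any further. The find_in_path function is hypothetical; it walks PATH by hand because older Bourne shells may not provide a built-in lookup.

```shell
# Print the first directory on PATH that holds an executable named $1.
# Returns non-zero if no directory on PATH contains it.
find_in_path() {
    name=$1
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        if [ -x "$dir/$name" ]; then
            IFS=$old_ifs
            echo "$dir"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# Assumption: dbhome lives in /usr/local/bin, which is missing from PATH at boot.
PATH=$PATH:/usr/local/bin
export PATH

if find_in_path dbhome >/dev/null; then
    echo "dbhome found in `find_in_path dbhome`"
else
    echo "dbhome not found on PATH: $PATH" >&2
fi
```

Placed near the top of an init script such as /etc/rc2.d/S99oracle, a check like this would at least make the failure message show which PATH was in effect at boot time.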