8.19.2014

Oracle 11204 Clusterware upgrade - ASM glitch

Yet another tough challenge was thrown at my team right after the disaster recovery (DR) simulation drill, which we performed barely a couple of weeks ago. The new task is to upgrade the existing four cluster environments from 11.2.0.2 to 11.2.0.4, as Oracle has already stopped supporting 11.2.0.2.

Although we had successfully upgraded a 3-node cluster last week, we ran into ASM upgrade trouble while running rootupgrade.sh in a new cluster environment (7 nodes). The following error was reported while the rootupgrade.sh script was executing:

CRS-2672: Attempting to start 'ora.asm' on 'node01' 
CRS-5017: The resource action "ora.asm start" encountered the following error: 
ORA-48108: invalid value given for the diagnostic_dest init.ora parameter 
ORA-48140: the specified ADR Base directory does not exist [/u00/app/11.2.0/grid/dbs/{ORACLE_BASE}] 
ORA-48187: specified directory does not exist 
HPUX-ia64 Error: 2: No such file or directory 
Additional information: 1
CRS-2674: Start of 'ora.asm' on 'node01' failed
CRS-2679: Attempting to clean 'ora.asm' on 'node01'
CRS-2681: Clean of 'ora.asm' on 'node01' succeeded 
CRS-4000: Command Start failed, or completed with errors. 

When we tried to start the ASM instance manually from the sqlplus prompt, the following error was thrown:

SQL> 
ORA-32004: obsolete or deprecated parameter(s) specified for ASM instance 
ORA-48108: invalid value given for the diagnostic_dest init.ora parameter 
ORA-48140: the specified ADR Base directory does not exist [/u00/app/11.2.0/grid/dbs/{ORACLE_BASE}] 
ORA-48187: specified directory does not exist 
HPUX-ia64 Error: 2: No such file or directory

Sadly, there wasn't much information available about the nature of this problem. As usual, after spending an hour trying different options, we opened an SR with Oracle Support and agreed to roll back the upgrade on the node where the rootupgrade script had failed. Luckily, this was the first node we tried, and the other 6 nodes were running fine. After rolling back to the previous cluster version, the ASM instance error still persisted.

To resolve the ASM instance startup issue, the following actions were taken:

  • export diagnostic_dest=/u00/app/oracle
  • From an active ASM instance on another node, executed the following statement:
    • SQL> ALTER SYSTEM STOP ROLLING MIGRATION;

Cause:
The problem that caused the ASM instance startup issue has been reported and logged as a known bug (17449823).

Workaround:
According to MOS Doc ID 1598959.1, the bug is still being worked on by the development team. They suggest the following workaround on each node just before running the rootupgrade.sh script:
  • mkdir <New-GI-HOME>/dbs/{ORACLE_BASE} 
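
The workaround can be wrapped in a small shell function so that it is applied the same way on every node. This is only a sketch: the function name make_adr_placeholder and the NEW_GI_HOME variable are made up for illustration, and the argument must be replaced with the real path of the new grid home. Note that {ORACLE_BASE} is a literal directory name here, braces included, not a variable to be expanded.

```shell
#!/bin/sh
# Sketch only: create the literal {ORACLE_BASE} directory under the new GI home.
# make_adr_placeholder is a hypothetical helper name, not an Oracle tool.
make_adr_placeholder() {
    gi_home="$1"
    if [ -z "$gi_home" ] || [ ! -d "$gi_home" ]; then
        echo "grid home not found: $gi_home" >&2
        return 1
    fi
    # Single quotes keep the braces literal; this is a directory name,
    # not a shell variable reference.
    mkdir -p "$gi_home/dbs/"'{ORACLE_BASE}'
}

# Example (NEW_GI_HOME is an assumed variable holding the new grid home path):
#   make_adr_placeholder "$NEW_GI_HOME"
```

Run it as the grid software owner on each node before starting rootupgrade.sh there.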

Third successful attempt
The upgrade failed on the first 2 attempts; the 3rd attempt was successful, and we managed to upgrade all 7 nodes from 11.2.0.2 to 11.2.0.4. We also learnt that CRS_HOME, ORACLE_HOME and ORACLE_BASE had not been unset before runInstaller was started. On the 3rd attempt, with those variables unset, the upgrade went through successfully.
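
A quick pre-flight check along these lines might help avoid repeating that mistake. It is a minimal sketch: the two function names are invented for this post, and the check only looks at exported variables, since those are the ones a child process such as runInstaller would inherit.

```shell
#!/bin/sh
# Sketch: clear inherited Oracle variables in the shell that will launch runInstaller.
# Both function names are hypothetical helpers.
clear_oracle_env() {
    unset CRS_HOME ORACLE_HOME ORACLE_BASE
}

# Returns 0 only when none of the three variables is present in the environment.
oracle_env_is_clean() {
    for name in CRS_HOME ORACLE_HOME ORACLE_BASE; do
        if env | grep -q "^${name}="; then
            echo "$name is still set" >&2
            return 1
        fi
    done
    return 0
}

# Example, in the same shell session that will start the installer:
#   clear_oracle_env
#   oracle_env_is_clean && ./runInstaller
```

The functions have to be called in the current shell (not in a subshell or a separate script), otherwise the unset has no effect on the session that launches the installer.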

Addendum (24-Aug-2014)
A couple of new challenges were encountered in the last upgrade task, on 10 nodes.

  1. The OUI window from which runInstaller was started got closed because the PC rebooted.
  2. Although the {ORACLE_BASE} directory had been created under the new Grid home, the issue recurred.

Here are the solutions:
  1. How to Complete 11gR2 Grid Infrastructure Configuration Assistant(Plug-in) if OUI is not Available (Doc ID 1360798.1)
  2. Ensure that diagnostic_dest in the ASM spfile is updated to the new location before running rootupgrade.sh.
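
For the second item, the statement can be generated with a small helper that first confirms the target directory really exists. Treat this as a sketch under assumptions: diag_dest_sql is a made-up function name, /u00/app/oracle is simply the location used earlier in this post, and the SCOPE=SPFILE SID='*' form assumes the ASM instances share an spfile. Verify the statement against your own environment before running it.

```shell
#!/bin/sh
# Sketch: print the ALTER SYSTEM statement that points diagnostic_dest at an
# existing directory. diag_dest_sql is a hypothetical helper, not an Oracle tool.
diag_dest_sql() {
    dest="$1"
    if [ ! -d "$dest" ]; then
        echo "directory not found: $dest" >&2
        return 1
    fi
    printf "ALTER SYSTEM SET diagnostic_dest='%s' SCOPE=SPFILE SID='*';\n" "$dest"
}

# Example, as the grid owner with the ASM environment set (assumed invocation):
#   diag_dest_sql /u00/app/oracle | sqlplus -S "/ as sysasm"
```

Printing the statement instead of executing it directly leaves room to review it, or to paste it into an existing sqlplus session on the ASM instance.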


References:

  • Things to Consider Before Upgrading to 11.2.0.3/11.2.0.4 Grid Infrastructure/ASM (Doc ID 1363369.1)
  • Things to Consider Before Upgrading to 11.2.0.4 to Avoid Poor Performance or Wrong Results (Doc ID 1645862.1)
  • GI rootupgrade.sh on last node: ASM rolling upgrade action failed (Doc ID 1598959.1)
  • Bug 17449823



