Archive for category ASM

TOD – ASM diskgroup space from any database

Have you ever noticed that v$asm views are available in all databases? This is true whether using ASM or not because the ASM home and software is the same as the database software. On this note, while the DB and ASM can share a home, they should not in order to preserve the ability to upgrade ASM separately from the database home. Simply, ASM should always be at or above the highest database software version, but that is another post.

Ok, now that I am back from that tangent. I have had users call me in a panic when they run ASM scripts in a database instead of the ASM instance. When querying v$asm_disk and checking on the physical disk/luns, it will always report 100% used:

  SELECT
      NVL(a.name, '[CANDIDATE]')                       disk_group_name
    , b.path                                           disk_file_path
    , b.total_mb                                       total_mb
    , (b.total_mb - b.free_mb)                         used_mb
    , ROUND((1- (b.free_mb / b.total_mb))*100, 2)      pct_used
  FROM
      v$asm_diskgroup a RIGHT OUTER JOIN v$asm_disk b USING (group_number)
  ORDER BY
     a.name;
Disk Group Name Path              File Size (MB) Used Size (MB) Pct. Used
--------------- ----------------- -------------- -------------- ---------
DGROUP1         ORCL:VOL001               33,791         33,791    100.00
                ORCL:VOL002               33,791         33,791    100.00
                ORCL:VOL003               33,791         33,791    100.00
                ORCL:VOL008               33,791         33,791    100.00
                ORCL:VOL005               33,791         33,791    100.00
                ORCL:VOL006               33,791         33,791    100.00
                ORCL:VOL007               33,791         33,791    100.00
                ORCL:VOL004               33,791         33,791    100.00
***************                   -------------- --------------
                                         270,328        270,328
 
                                  -------------- --------------
Grand Total:                             270,328        270,328

Which can induce great panic as it appears that ASM is full, but when properly run from ASM:

Disk Group Name Path              File Size (MB) Used Size (MB) Pct. Used
--------------- ----------------- -------------- -------------- ---------
DGROUP1         ORCL:VOL001               33,791         24,925     73.76
                ORCL:VOL002               33,791         24,911     73.72
                ORCL:VOL003               33,791         24,910     73.72
                ORCL:VOL008               33,791         24,921     73.75
                ORCL:VOL005               33,791         24,921     73.75
                ORCL:VOL006               33,791         24,905     73.70
                ORCL:VOL007               33,791         24,932     73.78
                ORCL:VOL004               33,791         24,919     73.74
***************                   -------------- --------------
                                         270,328        199,344
 
                                  -------------- --------------
Grand Total:                             270,328        199,344

Some users, including my own developers, need to know how much space is available on the diskgroup from within their own database. It was thought that we had to query asm via SQL*net or have ASM populate a file to be brought into the database to determine how much space was available. This idea has several problems, it is clumsy and complicated, and it is not up to date and must be refreshed often. I noticed, however, what while you cannot query v$asm_disk and get accurate results, you can query v$asm_diskgroup and get accurate results from any database.

  SELECT
      name                                     group_name
    , type                                     type
    , total_mb                                 total_mb
    , (total_mb - free_mb)                     used_mb
    , ROUND((1- (free_mb / total_mb))*100, 2)  pct_used
  FROM
      v$asm_diskgroup
  ORDER BY
     name;
 
Disk Group
Name             Type   Total Size (MB) Used Size (MB) Pct. Used
---------------- ------ --------------- -------------- ---------
DGROUP1          EXTERN         270,328        199,344     73.74
                        --------------- --------------
Grand Total:                    270,328        199,344

This query will report the same data from both the ASM instance as well as any databases on the same server. The interesting thing is, even if the database is NOT using ASM, it will report as above since every (10g or 11g) database on a host communicates with ASM via its own “mini-cluster” (ocssd) if not RAC and the RAC cluster if it is.

No Comments

ASM and the “Vampire” database

Almost all of the 10g and above databases on which I work use ASM.  One of the most common requests I receive on development systems is to add another lun to add more space to ASM.  This is one of the great features of ASM, adding or removing disk without affecting the current databases.  Before adding a lun, however, I always check to see if space is really needed.  Sometimes, there will be a database which has not been backed up in weeks and it will have literally thousands of archive logs that are filling up ASM.  That remedy is easy, either back up the database and archives or delete the archive logs and run a new full backup.

Then there are the vampire databases, those that suck up disk space, but are not active and not being used.  Often, they are just forgotten databases by developers or DBAs that are shut down, removed from oratab and backups and generally forgotten about since no one notices the space they take up in ASM.  In dealing with space issues, I needed a quick way to determine what databases were in ASM, whether or not they were actually running.  Here is what I came up with:

REM asm_db_size.sql
REM Author - Jay Caviness - Grumpy-dba.com
REM 5 March 2009
REM ------------------------------------------------------------
set pages 999
set heading on
set feedback off
set lines 80
col "Size in MB" for 999,999,999
col "Database" for a25
 
select database_name "Database", 
         sum(space)/1024/1024 "Size in MB" FROM (
  SELECT
      CONNECT_BY_ROOT db_name as database_name, space
  FROM
      ( SELECT
            a.parent_index       pindex
          , a.name               db_name
          , a.reference_index    rindex
          , f.bytes              bytes
          , f.space              space
          , f.type               type
        FROM
            v$asm_file f RIGHT OUTER JOIN v$asm_alias a
                         USING (group_number, file_number)
      )
  WHERE type IS NOT NULL
  START WITH (MOD(pindex, POWER(2, 24))) = 0
      CONNECT BY PRIOR rindex = pindex)
group by database_name
order by database_name
/

Which, when run gives the output:

Database                    Size in MB
------------------------- ------------
DB_UNKNOWN                          10
QA                              29,667
QACA20A                          9,011
QARA20A                         13,258
QCONFIG                         17,653
QHC1011                          8,874
QHR1011                         12,068
QICA11A                         17,247
QIRA11A                         27,041
TESTC02                          8,908
TESTR02                         11,791

In this case I know that TESTC02 and TESTR02 are not running and upon checking with the owners of this system, received permission to drive a stake through the heart of…um..er..remove these databases saving 20G of space.

For a quick and dirty delete script for ASM, see the link page and get the “drop_asm_db.sql” script. Happy vampire hunting!

No Comments

TOD – Coming in from the cold – Bringing datafiles into ASM

I received a panicked call the other day from a DBA who had a user create a new datafile on a cooked filesystem. All was not lost and it didn’t require a downtime to fix, however, it did require that tablespace to be taken offline for a time.

In all databases I create using ASM I set the parameter: DB_FILE_CREATE_DEST (and it’s cohort DB_CREATE_ONLINE_LOG_DEST_X, which I will discuss in the future) and set it to the value of the default diskgroup (in our case +DGROUP1). This makes life much easier for a dba as the file parameters are no longer needed:

 SQL> create tablespace asm_demo datafile size 20m;
 
Tablespace created.
 
SQL> select file_name from dba_data_files
  2  where tablespace_name = 'ASM_DEMO';
 
FILE_NAME
------------------------------------------------------------
+DGROUP1/qa/datafile/asm_demo.974.671209339

In the above example, if the db_create_file_dest parameter is set, the only information I need to create a tablespace is the tablespace name. Technically, even the size is not needed as a default size of 10m is used if the size is omitted.

This is all well and good, but what happens if someone comes along and creates a file the old fashioned way:

SQL> create tablespace test datafile '/home/oracle/test_01.dbf' 
  2> size 10m;

How do you get this file into asm? Elementary my dear Watson! First take the tablespace offline.

 SQL>  alter tablespace test offline;

Start RMAN and backup a copy of the datafile into ASM, since the tablespace is offline, no transactions will change the datafile. (I will leave out the extraneous rman info)

 RMAN> backup as copy tablespace 'TEST' format '+DGROUP1';

Tell the data dictionary to use the new copy.

 RMAN> switch tablespace 'TEST' to copy;
datafile 50 switched to datafile copy 
    "+DGROUP1/qa/datafile/test.973.671207181"

Finally, bring the tablespace back online, either from rman or sqlplus, and check where the datafile is no located.

SQL> alter tablespace test online;
 
Tablespace altered.
 
SQL> select file_name from dba_data_files 
  2> where tablespace_name = 'TEST';
 
FILE_NAME
-------------------------------------------------------
+DGROUP1/qa/datafile/test.973.671206913

That is it, now, admittedly taking an active tablespace offline can be a challenge and may require a downtime based on your application. But, the movement of the file into ASM is quite simple. In the end the DBA is happy and the user that created the cooked datafile was educated and given 50 lashes with a wet noodle.

No Comments

The Re-Balance of Power

One of my favorite things about ASM is the ability to add and remove disks/luns from a diskgroup with no down time and an automatic rebalancing of the data on the luns.    This creates all kinds of possibilities.  I have even used this process to move an entire very large database from one SAN to another with no down time.

To remove a lun, use the following command:

SQL> alter diskgroup dgroup1 drop disk VOL004;
Diskgroup altered.

This will kick off a rebalance, that is moving data off of VOL004 and distributing it across the remaining disks. To see this operation in progress, query the v$asm_operation tables:

SQL> select * from v$asm_operation;
GROUP_NUMBER OPERA STAT      POWER     ACTUAL      SOFAR   EST_WORK   EST_RATE EST_MINUTES
------------ ----- ---- ---------- ---------- ---------- ---------- ---------- -----------
           1 REBAL RUN           1          1        163      29082        576          50

Among the data from this query we can see the diskgroup, the operation and how much work is left to do. Also we can see the asm_power_limit of 1. This “power” is a function of the amount of resources that ASM will use to complete the operation. The higher the number, faster it will be done, but the more resources taken away from other processes. This range is 0-11. 11 you say? Why not 10? The truth is it was an inside joke referring to a scene in “Spinal Tap” and an amplifier that went not to 10, but to 11. The limit of 0 was first made available in 10gR2.

To alter the limit there are two possibilities, altering at the system level:

SQL> alter system set asm_power_limit=11;
System altered.

This will alter the power limit for any new operations, but not existing ones. To change an existing operation, you must directly alter the power limit on the diskgroup involved in the operation:

SQL> select * from v$asm_operation
 
GROUP_NUMBER OPERA STAT      POWER     ACTUAL      SOFAR   EST_WORK   EST_RATE EST_MINUTES
------------ ----- ---- ---------- ---------- ---------- ---------- ---------- -----------
           1 REBAL RUN           1          1       4354      32842        665          42
 
SQL> alter diskgroup dgroup1 rebalance power 11;
 
Diskgroup altered.
SQL> select * from v$asm_operation;
 
GROUP_NUMBER OPERA STAT      POWER     ACTUAL      SOFAR   EST_WORK   EST_RATE EST_MINUTES
------------ ----- ---- ---------- ---------- ---------- ---------- ---------- -----------
           1 REBAL RUN          11         11        331      26085       2682           9

As you can see above, changing the power from 1 to 11 changed the estimated completion (EST_MINUTES) of the rebalance operation from 42 minutes down to 9. This is an estimate only and can vary quite a bit during an operation.

Note: The power can be set from the start in the add/drop statement with the following syntax:

SQL>  alter diskgroup dgroup1 drop disk vol005 rebalance power 11;
Diskgroup altered.

The big advantage to being able to set the power limit to 0 comes when it is necessary to add/drop several disks at the same time. If the power limit is already set, then each add/drop will be serialized and each operation will have to complete before the next can begin. If, however, the power limit is set to 0 first, then submit all the add/drop operations and finally set the power limit to a non-zero number for the disk group, the operations are paralellized and run at the same time. For large moves, such as SAN to SAN, this can take hours off a move, because all the operations are done together. For example, if you had to move 1TB from one SAN to another, if the power limit is non-zero when the luns from the new SAN are added and the luns from the old SAN are removed, then the add operation will run first rebalancing so there are 500G on each SAN when complete. Then the remove operation will start, rebalancing the remaining disk from the old SAN the new one. If parallelized, the entire operation is done at the same time moving the entire 1TB of data from the old SAN to the new SAN.

This ability is one of my favorite features of ASM. Power(limits) to the people!

No Comments

Adding a LUN for ASM Use to a RHEL 4/PowerPath Database Server

This procedure assumes using the Qlogic drivers and PowerPath 4.5.1, all installed. In a RAC, the steps are divided into “first node” steps , “Remaining Node(s)” steps and “Updating the ASM Instance”. For a standalone on SAN database server, the “Remaining Node(s)” step can be skipped.

Whenever performing a procedure like this one that will alter SAN/Disk layout on the fly, make sure that you have adequate backups of the database and the system in the unlikely event that something goes wrong.

First Node Steps:
Choose one node where all of the initial work will be done. Later a subset of these steps will be applied to the other cluster nodes.
1. Using the SAN software, present the new LUNS to the Linux nodes.
2. Get a list of the current visible LUNs for later reference using powermt, or a wrapper like:

# /stage/mck_ks/bin/map_pp_pseudo2lun
/dev/emcpowera = LUN 165
/dev/emcpowerb = LUN 330
/dev/emcpowerc = LUN 423
/dev/emcpowerd = LUN 422
/dev/emcpowere = LUN 420
/dev/emcpowerf = LUN 331
/dev/emcpowerg = LUN 421
/dev/emcpowerh = LUN 335

3. Reboot the node (or rescan the SCSI bus)
a. Rebooting is the supported method to rescan the SCSI bus
b. To rescan (experimental), run: /root/rescan_for_luns

4. Using powermt, (or its wrapper), identify the new LUNs:

# /stage/mck_ks/bin/map_pp_pseudo2lun
/dev/emcpowera = LUN 165
/dev/emcpowerb = LUN 330
/dev/emcpowerc = LUN 423
/dev/emcpowerd = LUN 422
/dev/emcpowere = LUN 420
/dev/emcpowerf = LUN 331
/dev/emcpowerg = LUN 421
/dev/emcpowerh = LUN 335
/dev/emcpoweri = LUN 369

5. Partition the new LUNs so they can be used for ASM. For EACH new LUN added run the following (the newlines matter!):

cat < < EOF | fdisk /dev/emcpower<newdevice>
n
p
1
1
 
w
EOF

6. Using powermt, identify the “path” devices. These are the “sdXX” devices in the output below. In the next step we’ll change the ownership of these devices.

# powermt display dev=emcpower<device>
Pseudo name=emcpoweri
CLARiiON ID=APM00031200473 [SG: HA Linux RAC 10g]
Logical device ID=6006016069C30A0004A2400E993ADA11 [LUN 369]
state=alive; policy=CLAROpt; priority=0; queued-IOs=0
Owner: default=SP A, current=SP B
==========================================================
---------------- Host ---------------   - Stor -   -- I/O Path -  -- Stats ---
### HW Path                 I/O Paths    Interf.   Mode    State  Q-IOs Errors
==========================================================
   1 qla2xxx           sdah      SP B2     active  alive      0      0
   1 qla2xxx           sdai      SP A1     active  alive      0      0
   2 qla2xxx           sdaj      SP A3     active  alive      0      0
   2 qla2xxx           sdak      SP B1     active  alive      0      0</device>

7. Now using the pseudo device name and the “path” device names change the ownership so Oracle can use them and change the Udev permissions file, so these are owned correctly after reboots (the device names below are samples taken from the output above):

for i in emcpoweri sdah sdai sdaj sdak
do
    chown oracle:dba /dev/${i}1
     echo "${i}1:oracle:dba:0660" >> /etc/udev/permissions.d/10-oracle-devices.permissions
done

8. Using the oracleasm command, determine the labels for the current ASM disks. The new disks will generally labeled in the same pattern with the next highest numbers:

# service oracleasm listdisks
VOL001
VOL002
VOL003
VOL004

9. Now label the new disk(s) with the oracleasm command. RE-run and increment the label until all the new disks are done (the example below added emcpoweri1).

# service oracleasm createdisk VOL005 /dev/emcpoweri1
Marking disk "/dev/emcpoweri1" as an ASM disk:           [  OK  ]

10. Now confirm that the new disks are labeled using the ‘oracleam listdisks’ command. The new disk should show up.

# service oracleasm listdisks
VOL001
VOL002
VOL003
VOL004
VOL005

Remaining Node(s) Steps:
These steps need to be performed on ALL the remaining RAC nodes AFTER the above has been completed. For standalone DB servers, skip these steps.

1. Using the SAN software, present the new LUNS to the Linux nodes.
2. Get a list of the current visible LUNs for later reference using powermt, or a wrapper like:

# /stage/mck_ks/bin/map_pp_pseudo2lun
/dev/emcpowera = LUN 165
/dev/emcpowerb = LUN 330
/dev/emcpowerc = LUN 423
/dev/emcpowerd = LUN 422
/dev/emcpowere = LUN 420
/dev/emcpowerf = LUN 331
/dev/emcpowerg = LUN 421
/dev/emcpowerh = LUN 335

3. Reboot the node (or rescan the SCSI bus)
a. Rebooting is the supported method to rescan the SCSI bus
b. To rescan (experimental), run: /root/rescan_for_luns

4. Using powermt, (or its wrapper), identify the new LUNs:

# /stage/mck_ks/bin/map_pp_pseudo2lun
/dev/emcpowera = LUN 165
/dev/emcpowerb = LUN 330
/dev/emcpowerc = LUN 423
/dev/emcpowerd = LUN 422
/dev/emcpowere = LUN 420
/dev/emcpowerf = LUN 331
/dev/emcpowerg = LUN 421
/dev/emcpowerh = LUN 335
/dev/emcpoweri = LUN 369

5. Using powermt, identify the “path” devices. These are the “sdXX” devices in the output below. In the next step we’ll change the ownership of these devices.

# powermt display dev=emcpower<device>
Pseudo name=emcpoweri
CLARiiON ID=APM00031200473 [SG: HA Linux RAC 10g]
Logical device ID=6006016069C30A0004A2400E993ADA11 [LUN 369]
state=alive; policy=CLAROpt; priority=0; queued-IOs=0
Owner: default=SP A, current=SP B
==========================================================
---------------- Host ---------------   - Stor -   -- I/O Path -  -- Stats ---
### HW Path                 I/O Paths    Interf.   Mode    State  Q-IOs Errors
==========================================================
   1 qla2xxx           sdah      SP B2     active  alive      0      0
   1 qla2xxx           sdai      SP A1     active  alive      0      0
   2 qla2xxx           sdaj      SP A3     active  alive      0      0
   2 qla2xxx           sdak      SP B1     active  alive      0      0</device>

6. Now using the Pseudo device name and the “path” device names change the ownership so Oracle can use them and change the Udev permissions file, so these are owned correctly after reboots (the device names below are samples taken from the output above):

for i in emcpoweri sdah sdai sdaj sdak
do
    partprobe /dev/$i
    chown oracle:dba /dev/${i}1
    echo "${i}1:oracle:dba:0660" >> /etc/udev/permissions.d/10-oracle-devices.permissions
done

7. Using the oracleasm command, determine the labels for the current ASM disks. The new disks will generally labeled in the same pattern with the next highest numbers:

# service oracleasm listdisks
VOL001
VOL002
VOL003
VOL004

8. Now scan the ASM disks to add the new ones:

# service oracleasm scandisks
Scanning system for ASM disks:                           [  OK  ]

9. Now confirm that the new disks are labeled using the ‘oracleam listdisks’ command. The new disk should show up.

# service oracleasm listdisks
VOL001
VOL002
VOL003
VOL004
VOL005

Updating the ASM instance:
All of these steps must be performed as the oracle user and only AFTER the above steps have been successfully completed for ALL nodes in the cluster. In a RAC cluster, this step needs to be completed on only ONE node in the cluster; since the database is shared, the others will get the change.

1. Make sure you set your SID to ‘+ASM’:

/home/oracle:(+ASM)$ . oraenv
ORACLE_SID = [+ASM] ? +ASM

2. Log in to the ASM instance as sysdba.

/home/oracle:(+ASM)$ sqlplus "/ as sysdba"

3. Now use the following command to add the new ASM disks to the disk group. Notice you can use some vary basic regular expressions in the naming of the volumes. In the case below, we added ASM VOL002, VOL003, VOL004. Don’t try to add already existing volumes, otherwise the whole command will error out, so customize the volume name portion below to reflect your naming scheme for the new LUNs.

SQL> alter diskgroup dgroup1 add disk 'ORCL:VOL00[234]';
Diskgroup altered.

4. You can verify the new disk by running the following SQL script:

SQL> @/home/oracle/asm/asm_disks.sql
 
DiskGroup NamePath  FileName FailGroup FileSize(MB) UsedSize(MB) Pct.Used
--------------- ----------------- -------------------- ------------------
DGROUP1 ORCL:VOL001  VOL001    VOL001    15,359          7,338     47.78
        ORCL:VOL002  VOL002    VOL002    51,199          2,525      4.93
        ORCL:VOL003  VOL003    VOL003    51,199          2,525      4.93
        ORCL:VOL004  VOL004    VOL004    51,199          2,525      4.93
                                        -------------- --------------
Grand Total:                             168,956         14,913

, , ,

No Comments

TOD – How to check if a LUN is used by ASM

ASM is down, init+ASM.ora is gone and you need to see which luns are owned by ASM, if on linux you can generally use the “service oracleasm” command to find the luns that have been marked for use with ASM.  On other platforms the oracleasm service is not available, how then to check?   You can use the od command to check the first few lines of a lun, it will display “ORCLDISK” and the diskgroup in the first five lines if part of and ASM diskgroup:

/dev:()$ od -c /dev/hdiskpower10 | more         
0000000   \0 202 001 001  \0  \0  \0  \0 200  \0  \0  \0 211 341   $ 207
0000020   \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
0000040   O   R   C   L   D   I   S   K  \0  \0  \0  \0  \0  \0  \0  \0
0000060   \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
0000100   \n 020  \0  \0  \0  \0 001 003 D   G   R   O   U   P   1  _
0000120    0   0   0   0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
0000140   \0  \0  \0  \0  \0  \0  \0  \0   D   G   R   O   U   P   1  \0
0000160   \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
0000200   \0  \0  \0  \0  \0  \0  \0  \0   D   G   R   O   U   P   1   _
0000220    0   0   0   0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
0000240   \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0

You can see above that this lun is part of diskgroup dgroup1.

No Comments