Archive for category ASM
TOD – ASM diskgroup space from any database
Posted by Jay Caviness in ASM, Tip 'o the day on July 14th, 2009
Have you ever noticed that v$asm views are available in all databases? This is true whether using ASM or not because the ASM home and software is the same as the database software. On this note, while the DB and ASM can share a home, they should not in order to preserve the ability to upgrade ASM separately from the database home. Simply, ASM should always be at or above the highest database software version, but that is another post.
Ok, now that I am back from that tangent. I have had users call me in a panic when they run ASM scripts in a database instead of the ASM instance. When querying v$asm_disk and checking on the physical disk/luns, it will always report 100% used:
SELECT
NVL(a.name, '[CANDIDATE]') disk_group_name
, b.path disk_file_path
, b.total_mb total_mb
, (b.total_mb - b.free_mb) used_mb
, ROUND((1- (b.free_mb / b.total_mb))*100, 2) pct_used
FROM
v$asm_diskgroup a RIGHT OUTER JOIN v$asm_disk b USING (group_number)
ORDER BY
a.name;
Disk Group Name Path File Size (MB) Used Size (MB) Pct. Used
--------------- ----------------- -------------- -------------- ---------
DGROUP1 ORCL:VOL001 33,791 33,791 100.00
ORCL:VOL002 33,791 33,791 100.00
ORCL:VOL003 33,791 33,791 100.00
ORCL:VOL008 33,791 33,791 100.00
ORCL:VOL005 33,791 33,791 100.00
ORCL:VOL006 33,791 33,791 100.00
ORCL:VOL007 33,791 33,791 100.00
ORCL:VOL004 33,791 33,791 100.00
*************** -------------- --------------
270,328 270,328
-------------- --------------
Grand Total: 270,328 270,328Which can induce great panic as it appears that ASM is full, but when properly run from ASM:
Disk Group Name Path File Size (MB) Used Size (MB) Pct. Used
--------------- ----------------- -------------- -------------- ---------
DGROUP1 ORCL:VOL001 33,791 24,925 73.76
ORCL:VOL002 33,791 24,911 73.72
ORCL:VOL003 33,791 24,910 73.72
ORCL:VOL008 33,791 24,921 73.75
ORCL:VOL005 33,791 24,921 73.75
ORCL:VOL006 33,791 24,905 73.70
ORCL:VOL007 33,791 24,932 73.78
ORCL:VOL004 33,791 24,919 73.74
*************** -------------- --------------
270,328 199,344
-------------- --------------
Grand Total: 270,328 199,344Some users, including my own developers, need to know how much space is available on the diskgroup from within their own database. It was thought that we had to query asm via SQL*net or have ASM populate a file to be brought into the database to determine how much space was available. This idea has several problems, it is clumsy and complicated, and it is not up to date and must be refreshed often. I noticed, however, what while you cannot query v$asm_disk and get accurate results, you can query v$asm_diskgroup and get accurate results from any database.
SELECT
name group_name
, type type
, total_mb total_mb
, (total_mb - free_mb) used_mb
, ROUND((1- (free_mb / total_mb))*100, 2) pct_used
FROM
v$asm_diskgroup
ORDER BY
name;
Disk Group
Name Type Total Size (MB) Used Size (MB) Pct. Used
---------------- ------ --------------- -------------- ---------
DGROUP1 EXTERN 270,328 199,344 73.74
--------------- --------------
Grand Total: 270,328 199,344This query will report the same data from both the ASM instance as well as any databases on the same server. The interesting thing is, even if the database is NOT using ASM, it will report as above since every (10g or 11g) database on a host communicates with ASM via its own “mini-cluster” (ocssd) if not RAC and the RAC cluster if it is.
ASM and the “Vampire” database
Posted by Jay Caviness in ASM on March 5th, 2009
Almost all of the 10g and above databases on which I work use ASM. One of the most common requests I receive on development systems is to add another lun to add more space to ASM. This is one of the great features of ASM, adding or removing disk without affecting the current databases. Before adding a lun, however, I always check to see if space is really needed. Sometimes, there will be a database which has not been backed up in weeks and it will have literally thousands of archive logs that are filling up ASM. That remedy is easy, either back up the database and archives or delete the archive logs and run a new full backup.
Then there are the vampire databases, those that suck up disk space, but are not active and not being used. Often, they are just forgotten databases by developers or DBAs that are shut down, removed from oratab and backups and generally forgotten about since no one notices the space they take up in ASM. In dealing with space issues, I needed a quick way to determine what databases were in ASM, whether or not they were actually running. Here is what I came up with:
REM asm_db_size.sql
REM Author - Jay Caviness - Grumpy-dba.com
REM 5 March 2009
REM ------------------------------------------------------------
set pages 999
set heading on
set feedback off
set lines 80
col "Size in MB" for 999,999,999
col "Database" for a25
select database_name "Database",
sum(space)/1024/1024 "Size in MB" FROM (
SELECT
CONNECT_BY_ROOT db_name as database_name, space
FROM
( SELECT
a.parent_index pindex
, a.name db_name
, a.reference_index rindex
, f.bytes bytes
, f.space space
, f.type type
FROM
v$asm_file f RIGHT OUTER JOIN v$asm_alias a
USING (group_number, file_number)
)
WHERE type IS NOT NULL
START WITH (MOD(pindex, POWER(2, 24))) = 0
CONNECT BY PRIOR rindex = pindex)
group by database_name
order by database_name
/Which, when run gives the output:
Database Size in MB ------------------------- ------------ DB_UNKNOWN 10 QA 29,667 QACA20A 9,011 QARA20A 13,258 QCONFIG 17,653 QHC1011 8,874 QHR1011 12,068 QICA11A 17,247 QIRA11A 27,041 TESTC02 8,908 TESTR02 11,791
In this case I know that TESTC02 and TESTR02 are not running and upon checking with the owners of this system, received permission to drive a stake through the heart of…um..er..remove these databases saving 20G of space.
For a quick and dirty delete script for ASM, see the link page and get the “drop_asm_db.sql” script. Happy vampire hunting!
TOD – Coming in from the cold – Bringing datafiles into ASM
Posted by Jay Caviness in ASM, Tip 'o the day on November 19th, 2008
I received a panicked call the other day from a DBA who had a user create a new datafile on a cooked filesystem. All was not lost and it didn’t require a downtime to fix, however, it did require that tablespace to be taken offline for a time.
In all databases I create using ASM I set the parameter: DB_FILE_CREATE_DEST (and it’s cohort DB_CREATE_ONLINE_LOG_DEST_X, which I will discuss in the future) and set it to the value of the default diskgroup (in our case +DGROUP1). This makes life much easier for a dba as the file parameters are no longer needed:
SQL> create tablespace asm_demo datafile size 20m; Tablespace created. SQL> select file_name from dba_data_files 2 where tablespace_name = 'ASM_DEMO'; FILE_NAME ------------------------------------------------------------ +DGROUP1/qa/datafile/asm_demo.974.671209339
In the above example, if the db_create_file_dest parameter is set, the only information I need to create a tablespace is the tablespace name. Technically, even the size is not needed as a default size of 10m is used if the size is omitted.
This is all well and good, but what happens if someone comes along and creates a file the old fashioned way:
SQL> create tablespace test datafile '/home/oracle/test_01.dbf' 2> size 10m;
How do you get this file into asm? Elementary my dear Watson! First take the tablespace offline.
SQL> alter tablespace test offline;
Start RMAN and backup a copy of the datafile into ASM, since the tablespace is offline, no transactions will change the datafile. (I will leave out the extraneous rman info)
RMAN> backup as copy tablespace 'TEST' format '+DGROUP1';
Tell the data dictionary to use the new copy.
RMAN> switch tablespace 'TEST' to copy;
datafile 50 switched to datafile copy
"+DGROUP1/qa/datafile/test.973.671207181"Finally, bring the tablespace back online, either from rman or sqlplus, and check where the datafile is no located.
SQL> alter tablespace test online; Tablespace altered. SQL> select file_name from dba_data_files 2> where tablespace_name = 'TEST'; FILE_NAME ------------------------------------------------------- +DGROUP1/qa/datafile/test.973.671206913
That is it, now, admittedly taking an active tablespace offline can be a challenge and may require a downtime based on your application. But, the movement of the file into ASM is quite simple. In the end the DBA is happy and the user that created the cooked datafile was educated and given 50 lashes with a wet noodle.
The Re-Balance of Power
Posted by Jay Caviness in ASM on October 31st, 2008
One of my favorite things about ASM is the ability to add and remove disks/luns from a diskgroup with no down time and an automatic rebalancing of the data on the luns. This creates all kinds of possibilities. I have even used this process to move an entire very large database from one SAN to another with no down time.
To remove a lun, use the following command:
SQL> alter diskgroup dgroup1 drop disk VOL004; Diskgroup altered.
This will kick off a rebalance, that is moving data off of VOL004 and distributing it across the remaining disks. To see this operation in progress, query the v$asm_operation tables:
SQL> select * from v$asm_operation;
GROUP_NUMBER OPERA STAT POWER ACTUAL SOFAR EST_WORK EST_RATE EST_MINUTES
------------ ----- ---- ---------- ---------- ---------- ---------- ---------- -----------
1 REBAL RUN 1 1 163 29082 576 50Among the data from this query we can see the diskgroup, the operation and how much work is left to do. Also we can see the asm_power_limit of 1. This “power” is a function of the amount of resources that ASM will use to complete the operation. The higher the number, faster it will be done, but the more resources taken away from other processes. This range is 0-11. 11 you say? Why not 10? The truth is it was an inside joke referring to a scene in “Spinal Tap” and an amplifier that went not to 10, but to 11. The limit of 0 was first made available in 10gR2.
To alter the limit there are two possibilities, altering at the system level:
SQL> alter system set asm_power_limit=11; System altered.
This will alter the power limit for any new operations, but not existing ones. To change an existing operation, you must directly alter the power limit on the diskgroup involved in the operation:
SQL> select * from v$asm_operation
GROUP_NUMBER OPERA STAT POWER ACTUAL SOFAR EST_WORK EST_RATE EST_MINUTES
------------ ----- ---- ---------- ---------- ---------- ---------- ---------- -----------
1 REBAL RUN 1 1 4354 32842 665 42
SQL> alter diskgroup dgroup1 rebalance power 11;
Diskgroup altered.
SQL> select * from v$asm_operation;
GROUP_NUMBER OPERA STAT POWER ACTUAL SOFAR EST_WORK EST_RATE EST_MINUTES
------------ ----- ---- ---------- ---------- ---------- ---------- ---------- -----------
1 REBAL RUN 11 11 331 26085 2682 9As you can see above, changing the power from 1 to 11 changed the estimated completion (EST_MINUTES) of the rebalance operation from 42 minutes down to 9. This is an estimate only and can vary quite a bit during an operation.
Note: The power can be set from the start in the add/drop statement with the following syntax:
SQL> alter diskgroup dgroup1 drop disk vol005 rebalance power 11; Diskgroup altered.
The big advantage to being able to set the power limit to 0 comes when it is necessary to add/drop several disks at the same time. If the power limit is already set, then each add/drop will be serialized and each operation will have to complete before the next can begin. If, however, the power limit is set to 0 first, then submit all the add/drop operations and finally set the power limit to a non-zero number for the disk group, the operations are paralellized and run at the same time. For large moves, such as SAN to SAN, this can take hours off a move, because all the operations are done together. For example, if you had to move 1TB from one SAN to another, if the power limit is non-zero when the luns from the new SAN are added and the luns from the old SAN are removed, then the add operation will run first rebalancing so there are 500G on each SAN when complete. Then the remove operation will start, rebalancing the remaining disk from the old SAN the new one. If parallelized, the entire operation is done at the same time moving the entire 1TB of data from the old SAN to the new SAN.
This ability is one of my favorite features of ASM. Power(limits) to the people!
Adding a LUN for ASM Use to a RHEL 4/PowerPath Database Server
Posted by Jay Caviness in ASM on September 12th, 2008
This procedure assumes using the Qlogic drivers and PowerPath 4.5.1, all installed. In a RAC, the steps are divided into “first node” steps , “Remaining Node(s)” steps and “Updating the ASM Instance”. For a standalone on SAN database server, the “Remaining Node(s)” step can be skipped.
Whenever performing a procedure like this one that will alter SAN/Disk layout on the fly, make sure that you have adequate backups of the database and the system in the unlikely event that something goes wrong.
First Node Steps:
Choose one node where all of the initial work will be done. Later a subset of these steps will be applied to the other cluster nodes.
1. Using the SAN software, present the new LUNS to the Linux nodes.
2. Get a list of the current visible LUNs for later reference using powermt, or a wrapper like:
# /stage/mck_ks/bin/map_pp_pseudo2lun /dev/emcpowera = LUN 165 /dev/emcpowerb = LUN 330 /dev/emcpowerc = LUN 423 /dev/emcpowerd = LUN 422 /dev/emcpowere = LUN 420 /dev/emcpowerf = LUN 331 /dev/emcpowerg = LUN 421 /dev/emcpowerh = LUN 335
3. Reboot the node (or rescan the SCSI bus)
a. Rebooting is the supported method to rescan the SCSI bus
b. To rescan (experimental), run: /root/rescan_for_luns
4. Using powermt, (or its wrapper), identify the new LUNs:
# /stage/mck_ks/bin/map_pp_pseudo2lun /dev/emcpowera = LUN 165 /dev/emcpowerb = LUN 330 /dev/emcpowerc = LUN 423 /dev/emcpowerd = LUN 422 /dev/emcpowere = LUN 420 /dev/emcpowerf = LUN 331 /dev/emcpowerg = LUN 421 /dev/emcpowerh = LUN 335 /dev/emcpoweri = LUN 369
5. Partition the new LUNs so they can be used for ASM. For EACH new LUN added run the following (the newlines matter!):
cat < < EOF | fdisk /dev/emcpower<newdevice> n p 1 1 w EOF
6. Using powermt, identify the “path” devices. These are the “sdXX” devices in the output below. In the next step we’ll change the ownership of these devices.
# powermt display dev=emcpower<device> Pseudo name=emcpoweri CLARiiON ID=APM00031200473 [SG: HA Linux RAC 10g] Logical device ID=6006016069C30A0004A2400E993ADA11 [LUN 369] state=alive; policy=CLAROpt; priority=0; queued-IOs=0 Owner: default=SP A, current=SP B ========================================================== ---------------- Host --------------- - Stor - -- I/O Path - -- Stats --- ### HW Path I/O Paths Interf. Mode State Q-IOs Errors ========================================================== 1 qla2xxx sdah SP B2 active alive 0 0 1 qla2xxx sdai SP A1 active alive 0 0 2 qla2xxx sdaj SP A3 active alive 0 0 2 qla2xxx sdak SP B1 active alive 0 0</device>
7. Now using the pseudo device name and the “path” device names change the ownership so Oracle can use them and change the Udev permissions file, so these are owned correctly after reboots (the device names below are samples taken from the output above):
for i in emcpoweri sdah sdai sdaj sdak
do
chown oracle:dba /dev/${i}1
echo "${i}1:oracle:dba:0660" >> /etc/udev/permissions.d/10-oracle-devices.permissions
done8. Using the oracleasm command, determine the labels for the current ASM disks. The new disks will generally labeled in the same pattern with the next highest numbers:
# service oracleasm listdisks VOL001 VOL002 VOL003 VOL004
9. Now label the new disk(s) with the oracleasm command. RE-run and increment the label until all the new disks are done (the example below added emcpoweri1).
# service oracleasm createdisk VOL005 /dev/emcpoweri1 Marking disk "/dev/emcpoweri1" as an ASM disk: [ OK ]
10. Now confirm that the new disks are labeled using the ‘oracleam listdisks’ command. The new disk should show up.
# service oracleasm listdisks VOL001 VOL002 VOL003 VOL004 VOL005
Remaining Node(s) Steps:
These steps need to be performed on ALL the remaining RAC nodes AFTER the above has been completed. For standalone DB servers, skip these steps.
1. Using the SAN software, present the new LUNS to the Linux nodes.
2. Get a list of the current visible LUNs for later reference using powermt, or a wrapper like:
# /stage/mck_ks/bin/map_pp_pseudo2lun /dev/emcpowera = LUN 165 /dev/emcpowerb = LUN 330 /dev/emcpowerc = LUN 423 /dev/emcpowerd = LUN 422 /dev/emcpowere = LUN 420 /dev/emcpowerf = LUN 331 /dev/emcpowerg = LUN 421 /dev/emcpowerh = LUN 335
3. Reboot the node (or rescan the SCSI bus)
a. Rebooting is the supported method to rescan the SCSI bus
b. To rescan (experimental), run: /root/rescan_for_luns
4. Using powermt, (or its wrapper), identify the new LUNs:
# /stage/mck_ks/bin/map_pp_pseudo2lun /dev/emcpowera = LUN 165 /dev/emcpowerb = LUN 330 /dev/emcpowerc = LUN 423 /dev/emcpowerd = LUN 422 /dev/emcpowere = LUN 420 /dev/emcpowerf = LUN 331 /dev/emcpowerg = LUN 421 /dev/emcpowerh = LUN 335 /dev/emcpoweri = LUN 369
5. Using powermt, identify the “path” devices. These are the “sdXX” devices in the output below. In the next step we’ll change the ownership of these devices.
# powermt display dev=emcpower<device> Pseudo name=emcpoweri CLARiiON ID=APM00031200473 [SG: HA Linux RAC 10g] Logical device ID=6006016069C30A0004A2400E993ADA11 [LUN 369] state=alive; policy=CLAROpt; priority=0; queued-IOs=0 Owner: default=SP A, current=SP B ========================================================== ---------------- Host --------------- - Stor - -- I/O Path - -- Stats --- ### HW Path I/O Paths Interf. Mode State Q-IOs Errors ========================================================== 1 qla2xxx sdah SP B2 active alive 0 0 1 qla2xxx sdai SP A1 active alive 0 0 2 qla2xxx sdaj SP A3 active alive 0 0 2 qla2xxx sdak SP B1 active alive 0 0</device>
6. Now using the Pseudo device name and the “path” device names change the ownership so Oracle can use them and change the Udev permissions file, so these are owned correctly after reboots (the device names below are samples taken from the output above):
for i in emcpoweri sdah sdai sdaj sdak
do
partprobe /dev/$i
chown oracle:dba /dev/${i}1
echo "${i}1:oracle:dba:0660" >> /etc/udev/permissions.d/10-oracle-devices.permissions
done7. Using the oracleasm command, determine the labels for the current ASM disks. The new disks will generally labeled in the same pattern with the next highest numbers:
# service oracleasm listdisks VOL001 VOL002 VOL003 VOL004
8. Now scan the ASM disks to add the new ones:
# service oracleasm scandisks Scanning system for ASM disks: [ OK ]
9. Now confirm that the new disks are labeled using the ‘oracleam listdisks’ command. The new disk should show up.
# service oracleasm listdisks VOL001 VOL002 VOL003 VOL004 VOL005
Updating the ASM instance:
All of these steps must be performed as the oracle user and only AFTER the above steps have been successfully completed for ALL nodes in the cluster. In a RAC cluster, this step needs to be completed on only ONE node in the cluster; since the database is shared, the others will get the change.
1. Make sure you set your SID to ‘+ASM’:
/home/oracle:(+ASM)$ . oraenv ORACLE_SID = [+ASM] ? +ASM
2. Log in to the ASM instance as sysdba.
/home/oracle:(+ASM)$ sqlplus "/ as sysdba"
3. Now use the following command to add the new ASM disks to the disk group. Notice you can use some vary basic regular expressions in the naming of the volumes. In the case below, we added ASM VOL002, VOL003, VOL004. Don’t try to add already existing volumes, otherwise the whole command will error out, so customize the volume name portion below to reflect your naming scheme for the new LUNs.
SQL> alter diskgroup dgroup1 add disk 'ORCL:VOL00[234]'; Diskgroup altered.
4. You can verify the new disk by running the following SQL script:
SQL> @/home/oracle/asm/asm_disks.sql
DiskGroup NamePath FileName FailGroup FileSize(MB) UsedSize(MB) Pct.Used
--------------- ----------------- -------------------- ------------------
DGROUP1 ORCL:VOL001 VOL001 VOL001 15,359 7,338 47.78
ORCL:VOL002 VOL002 VOL002 51,199 2,525 4.93
ORCL:VOL003 VOL003 VOL003 51,199 2,525 4.93
ORCL:VOL004 VOL004 VOL004 51,199 2,525 4.93
-------------- --------------
Grand Total: 168,956 14,913TOD – How to check if a LUN is used by ASM
Posted by Jay Caviness in ASM, Tip 'o the day on September 3rd, 2008
ASM is down, init+ASM.ora is gone and you need to see which luns are owned by ASM, if on linux you can generally use the “service oracleasm” command to find the luns that have been marked for use with ASM. On other platforms the oracleasm service is not available, how then to check? You can use the od command to check the first few lines of a lun, it will display “ORCLDISK” and the diskgroup in the first five lines if part of and ASM diskgroup:
/dev:()$ od -c /dev/hdiskpower10 | more 0000000 \0 202 001 001 \0 \0 \0 \0 200 \0 \0 \0 211 341 $ 207 0000020 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 0000040 O R C L D I S K \0 \0 \0 \0 \0 \0 \0 \0 0000060 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 0000100 \n 020 \0 \0 \0 \0 001 003 D G R O U P 1 _ 0000120 0 0 0 0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 0000140 \0 \0 \0 \0 \0 \0 \0 \0 D G R O U P 1 \0 0000160 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 0000200 \0 \0 \0 \0 \0 \0 \0 \0 D G R O U P 1 _ 0000220 0 0 0 0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 0000240 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
You can see above that this lun is part of diskgroup dgroup1.