Attributes of a Great DBA #1 – Humility
Posted by Jay Caviness in general on May 22nd, 2009
Humility and a decided lack of ego. It is one thing to be confident in your skills; however, most of the best Oracle people I know are also the most humble. Case in point is Mike Ault. I have known of his work, books and appreciation for all things Oracle for many years. I met him in person about two years ago at RMOUG Training Days in Denver and again at IOUG Collaborate 09 in Orlando. Instead of his regaling me with Oracle knowledge, we talked of diving, kids and tiramisu. Why is this so important? Because while you can learn sitting at the feet of a guru, you can also learn something new from even the lowest, newest and greenest of people if you are open to it. I have also known engineers who believe the world should revolve around them, and you know what? They are almost constantly involved in a perceived crisis that someone, anyone, else caused.
Humility will gain you the world. There are a few Oracle bigwigs out there who set up shop at a conference and look more like P.T. Barnum than a credible source, all in the name of hawking their books, services or advice. (Those who know me know of whom I speak!) Get a perspective from others whose talents differ from yours, lesser as well as greater. Listen to your colleagues and don’t rush to judgment on ideas just because they come from a newbie; you never know where the spark will come from that solves a problem now or in the future.
Attributes of a Great DBA #2 – Integrity
Posted by Jay Caviness in general on May 15th, 2009
Integrity – be honest in all you do; it is easier than trying to remember what you lied about! There is not a whole lot that is new under the sun: ideas, scripts and processes are products of hashing and rehashing old ideas with new to create something that fits your needs. My favorite ASM scripts are based on Jeff Hunter’s scripts; he certainly writes better-formatted SQL than I do, and the scripts I based mine on are very useful to me. Often plagiarism goes under the guise of “code reuse”. Reuse is fine, but give credit where credit is due.
I have solved many problems for customers over the years, but I try never to leave them without an understanding of what went on and how it was resolved, if at all possible and time permits. This is vital to your client relationship and your own sense of self-worth. There are times when a root cause analysis does not bear anything out, and you have to explain to a client or manager that the cause could perhaps be found, but that finding it may be cost- and time-prohibitive. Be honest in all things you do; covering something up almost always involves digging your own grave. Having been on more than one forensic analysis team, I have seen firsthand what happens when someone damages a system, whether maliciously or accidentally, and then tries to cover it up. It rarely works, and the damage to your reputation can be permanent.
Attributes of a Great DBA #3 – Imagination
Posted by Jay Caviness in general on May 13th, 2009
Imagination – Above intelligence? You bet. The ability to think outside the box is critical, and much of that comes from experience. I am not a huge Star Trek fan, but I remember a scene from The Wrath of Khan when the crew was trying to force their enemy’s shields down and Kirk said “You have to know why things work.” That is an excellent point, because so many things can affect an Oracle instance, database, cluster, etc. It is often like the large mixing boards that music producers use in a studio. If you move one slide up, 12 others may move down. It partially comes back to #4: if you amass knowledge in many different areas, it will mix in such a way that your imagination can find solutions for which there is no (apparent) logic.
Attributes of a Great DBA #4 – Intelligence
Posted by Jay Caviness in general on May 6th, 2009
Intelligence – This had to be in the list, right? Not number one, however. There is a difference between “book learnin’” and intelligence. Intelligence is more the process of solving a problem to the point of resolution. It can be coupled directly with wisdom. A DBA must have the ability to go from point A to B to C to solve a problem. With experience, that process may go from A to D to Z because of an intuition born of experience, even if you have not seen a similar problem before.
Where do you get this intelligence? As noted above, time, in the guise of experience, is a large part. Absorbing information from multiple sources is the majority of the rest. Oracle is the type of software that you learn by doing, not reading. Don’t just troll OTN forums and blogs, participate! Get a dialog going; you would be amazed at what you can learn from people in similar and dissimilar circumstances. If you are a RAC person, join the RAC SIG (www.oracleracsig.org). Most important, however, is to learn about more than just your area of expertise by getting outside your comfort zone. While I am a specialist at RAC, I try to be a generalist in as many IT engineering areas as I can. For example, don’t just subscribe to Oracle Magazine or Select Journal, get Network World or Storage. You may not understand all the topics and concepts, but with time you will absorb them, and when the time comes to make an architecture, support, design or down-time decision, you might find that some important data from outside your comfort zone helps you make a better-informed decision.
Monday at IOUG Collaborate 2009
Posted by Jay Caviness in general on May 4th, 2009
I am on site at IOUG Collaborate ’09 this week in sunny Orlando, Florida. I will be speaking as part of a customer panel on RAC on Virtual Machines at 4:30 Wednesday. I have already had a great conversation on Oracle RAC and Streams with Arup Nanda and talked with Mike Ault, and I am looking forward to the conferences and to chatting with many of the friends I have made over the years.
Top 5 attributes of a great DBA
Posted by Jay Caviness in general on April 23rd, 2009
In the next few days I am posting my top 5 list of what it takes to be a great DBA. While many may not agree, it is not all about knowledge and insight. Am I a great DBA? Time and my clients will tell, but these attributes will help to ensure a long and enjoyable career in the Oracle world.
We will start with #5 -
5. Sense of humor & grace under pressure – Jean Kerr may have been right when she said, “If you can keep your head about you when all about you are losing theirs, it’s just possible you haven’t grasped the situation”. Most of us have been in one type of crisis or another over the years involving Oracle, the systems it lies upon and users, managers and clients beating at the door with torches and pitchforks. Many of us have wondered if we would survive the onslaught. When you are at the end of your rope, tie a knot, make a joke and hang on; there is little left to lose. I have been in this type of situation many times. Most of the time I have been called into the middle of a disaster (or what is believed to be one) with the simple command of “fix it, it be broke!”, and have had a cube or conference call full of people staring over my shoulder, waiting on every keystroke. As bad as my typing is, that is never a good thing to watch.
The first truly memorable situation I had like this was back in the 1997-1998 timeframe, when the web was just taking off and Amazon.com was the darling of the internet boom. I was working for Oracle Support at the time as a technical specialist, which is supposed to mean that I knew more than most about how the database kernel worked, when I got called in because Amazon was down. The call between me and their DBA became the two of us and about 30 other people, all on the conference call. I was not allowed to hang up, transfer or call the client back. Everyone, including a senior Oracle VP and two VPs from Amazon, was on the phone and expecting me to articulate every move I made while I used a hex editor to manually rebuild several datafile headers that had become corrupted due to a bug. Someone noted that Amazon had just made the national news because they were down. It was not pretty. But after being on the phone for 422 minutes (our phones had counters on them), everyone signed off and the problem was fixed. The point of this anecdote is that I had to be able, with politeness and humor, to tell everyone on the line three things: one, that I was not going to repeat every keystroke I made to the audience; two, to please shut up and let me do my work; and three, that if they had to make business or political decisions, they should do it on another call. I was tired, grouchy and more than once had to hit the mute button, but I kept my cool with the customer.
Now, I have been in the situation where I screwed up and the only non-expletive thing I could say was to quote that great American philosopher Urkel – “Did I do that?”. Oh boy, believe me, if you have been in the business long enough, you will break something, and break it bad. I have overwritten datafiles, dropped the wrong table and killed the wrong node, just to mention a few.
For those who know me, I fully admit that when the mess is over (and sometimes when I push “mute” on the phone before it is over), I can get grouchy, grumpy and generally be a joy to be around.
However, a sense of humor does not always work, and you have to know when to pick your battles, as it were. There is one hospital client I was working with when I first started working with McKesson. I was on a conference call with them and let out a couple of my humorous observations at which point I would swear I heard crickets in the background. I realized very quickly that they had no sense of humor and dropped it at that point. The best part of it was that I have not been on a conference call with that client since!
No one wants a comedian during a crisis. But establishing a good rapport with a client or group, whether they are experiencing a problem or just in general, is often easier with a little good humor and a lot of empathy and grace.
The Sun also Rises . . .
Posted by Jay Caviness in general on April 20th, 2009
Today Oracle announced it was buying Sun Microsystems in a deal tentatively valued at $7.4 billion. Why does Oracle want Sun? One reason and one reason only: Java. Java is a key part of Oracle’s Fusion Middleware strategy, and Oracle’s ownership will drive changes in what is supposed to be an “open source” programming language to meet their own needs.
The bigger question is, what will Oracle do with the rest of Sun? Sun is really a hardware company that happens to “own” Java. Larry Ellison has said several times in recent years that he does not want Oracle to become a hardware company (remember the Network Computer?). While this may be true, their partnership with HP on the fast-selling (and big-margin) Exadata project shows that this statement is rather flexible. I would imagine that Oracle will sell off the parts of Sun it has no need for, probably to IBM. Sun itself has been cutting staff consistently for eight years now. I worked a consulting engagement with Sun’s storage division back in 2005, right at the time Sun bought StorageTek. The result? A large part of the division was scuttled almost immediately, as it was all but replaced by StorageTek hardware. On a personal level, I am glad it killed my contract, as a few days later I began my great relationship with McKesson. However, quite a few people on that campus were cut.
This may not be the case, however. Oracle has been going to great lengths to own the software stack. Currently, a company can start with Oracle Enterprise Linux, add the database, application server, Fusion Middleware and a whole host of Oracle applications derived from Oracle apps, PeopleSoft, JD Edwards, Siebel and others. It is no longer a best-of-breed scenario. The next logical step would be to own the hardware that runs the stack. You might think that this would put Oracle at odds with HP, but I don’t think so; they will more likely begin a transition to the same love/hate relationship Oracle has with Microsoft.
Another unanswered question from the conference call was the fate of MySQL. I never thought it was a good idea for Sun to buy the open source database in the first place, but now Oracle owns it. I seriously doubt it will become the new Oracle Lite; more likely, any functionality deemed worthy will just be absorbed into the Oracle DB kernel, much like TimesTen and Sleepycat were.
It makes me wonder, however, if Oracle still has the cash to make the multi-billion dollar purchases, what will be next? If Bill, Steve and the crew in Redmond get worried, I guess with all the cash Microsoft has in the bank, they could just buy IBM…or Oracle.
ASM and the “Vampire” database
Posted by Jay Caviness in ASM on March 5th, 2009
Almost all of the 10g and above databases on which I work use ASM. One of the most common requests I receive on development systems is to add another LUN to give ASM more space. This is one of the great features of ASM: adding or removing disks without affecting the current databases. Before adding a LUN, however, I always check to see if space is really needed. Sometimes there will be a database which has not been backed up in weeks, and it will have literally thousands of archive logs that are filling up ASM. That remedy is easy: either back up the database and archives, or delete the archive logs and run a new full backup.
Then there are the vampire databases, those that suck up disk space but are not active and not being used. Often they are databases that developers or DBAs shut down, removed from oratab and backups, and then forgot about, since no one notices the space they take up in ASM. In dealing with space issues, I needed a quick way to determine what databases were in ASM, whether or not they were actually running. Here is what I came up with:
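As a rough illustration of that check, the sketch below looks at free space per disk group and at how much of each group is taken up by archive logs. It assumes you are connected to the ASM instance; the column aliases (pct_free, archive_logs, size_mb) are names chosen here for readability and are not part of any script from this post.

```sql
-- Sketch only: run while connected to the ASM instance.
-- Free space per disk group (NULLIF guards against a group reporting 0 MB).
SELECT name,
       total_mb,
       free_mb,
       ROUND(100 * free_mb / NULLIF(total_mb, 0), 1) AS pct_free
FROM   v$asm_diskgroup
ORDER  BY name;

-- Space consumed by archive logs in each disk group.
-- v$asm_file.space is in bytes, so divide twice by 1024 to get MB.
SELECT group_number,
       COUNT(*)                      AS archive_logs,
       ROUND(SUM(space)/1024/1024)   AS size_mb
FROM   v$asm_file
WHERE  type = 'ARCHIVELOG'
GROUP  BY group_number
ORDER  BY group_number;
```

If the second query accounts for most of the used space, a backup and archive log cleanup is probably a better answer than another LUN.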
REM asm_db_size.sql
REM Author - Jay Caviness - Grumpy-dba.com
REM 5 March 2009
REM
REM Run while connected to the ASM instance.
REM Lists every database directory found in ASM and the space its
REM files occupy, whether or not the database is currently running.
REM START WITH picks the top-level directories (parent_index is an
REM exact multiple of 2^24); CONNECT_BY_ROOT carries that directory
REM name, which is the database name, down to every file beneath it.
REM ------------------------------------------------------------
set pages 999
set heading on
set feedback off
set lines 80
col "Size in MB" for 999,999,999
col "Database" for a25
select database_name "Database",
       sum(space)/1024/1024 "Size in MB"
FROM (
    SELECT
        CONNECT_BY_ROOT db_name as database_name, space
    FROM
        ( SELECT
              a.parent_index     pindex
            , a.name             db_name
            , a.reference_index  rindex
            , f.bytes            bytes
            , f.space            space
            , f.type             type
          FROM
              v$asm_file f RIGHT OUTER JOIN v$asm_alias a
              USING (group_number, file_number)
        )
    WHERE type IS NOT NULL
    START WITH (MOD(pindex, POWER(2, 24))) = 0
    CONNECT BY PRIOR rindex = pindex)
group by database_name
order by database_name
/
Which, when run, gives the output:
Database                    Size in MB
------------------------- ------------
DB_UNKNOWN                          10
QA                              29,667
QACA20A                          9,011
QARA20A                         13,258
QCONFIG                         17,653
QHC1011                          8,874
QHR1011                         12,068
QICA11A                         17,247
QIRA11A                         27,041
TESTC02                          8,908
TESTR02                         11,791
In this case I know that TESTC02 and TESTR02 are not running, and upon checking with the owners of this system, I received permission to drive a stake through the heart of…um..er..remove these databases, saving 20G of space.
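If you do not already know which databases are down, one possible cross-check is to ask ASM which databases are connected to it right now. This is only a sketch and assumes you are connected to the ASM instance. On a RAC cluster each ASM instance sees only its local clients, so you would likely want gv$asm_client instead.

```sql
-- Sketch only: databases currently connected to this ASM instance.
-- v$asm_client has one row per client per disk group, hence DISTINCT.
SELECT DISTINCT db_name, instance_name, status
FROM   v$asm_client
ORDER  BY db_name;
```

A database that shows up in the size report but not in this list has no instance using ASM on this node. That makes it a candidate, not a confirmed vampire, so still check with the owners before removing anything.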
For a quick and dirty delete script for ASM, see the link page and get the “drop_asm_db.sql” script. Happy vampire hunting!
Why crossover cables are not supported in RAC
Posted by Jay Caviness in RAC on February 27th, 2009
Many Oracle shops in this world use a crossover cable, literally a single network cable run directly between two RAC nodes, as the interconnect. Does this work? Yep, you bet. Is it supported? No. Why? Well, it all has to do with how a node reacts when its sister fails in a two-node cluster.
Each node in the cluster constantly checks on the other nodes in the cluster through both the network (interconnect) and storage (voting disks). If one or both are lost, the cluster node is instructed to commit suicide and reboot itself in hopes of rejoining the cluster healthy and happy.
If a crossover cable is used and one of the nodes drops, the remaining node will have to wait for the TCP timeout, generally 60-300 seconds, before it realizes that the lost node is gone. At that point, the cluster will remove the lost node from the cluster. What can happen during that time is twofold: the surviving node can lock up, literally freezing during the wait for the timeout, and/or the cluster can become very confused if the dead node restarts and attempts to join the cluster at a point when the cluster still thinks it is there. Strange things have been known to happen, with many errors thrown, and at times this will cause both nodes to evict and restart.
Having a switch between the nodes allows a signal to be sent immediately if a node quits responding, at which time the surviving node will check for 60 seconds then evict the failing node, allowing it to rejoin (upon reboot) a clean cluster without any problems.
In short, crossover cables are fine in an emergency or in development, any situation where failover is not critical. For production, spend the money on a good switch, two in fact if you can bond your NICs (that’s for another post), for the best scenario to survive a failover with as few issues as possible.
New Year’s Resolutions for DBAs
Posted by Jay Caviness in general on December 30th, 2008
Ah that time again to resolve to be better people in the new year. Most last about as long as a snowflake in hell, but it is good to aspire to new goals. With that in mind, here are a few New Year’s resolutions for DBAs.
1. Understand that we don’t always need root (that’s a tough one right off the bat).
2. Not to make nooses out of spare CAT5 cable and leave them on the network admin’s chair.
3. Take advantage of the fact that Oracle put a Google search right inside the new Metalink.
4. Drop weight by ridding ourselves of the Oracle 7.3 Backup & Recovery Handbook and all the other books we have from two-three versions back. (Don’t laugh, I found a set of Oracle 5.1 and 6.0 manuals in my garage this year).
5. Not to take advantage of caller ID to ignore certain people just because they always ask long, involved and completely innocuous questions.
6. Not to wear my RTFM t-shirt to staff meetings.
7. Complete documentation, clearly, completely and on time to help offset #5
8. Not to beta test new versions of Oracle on production or QA servers (that is all I had left!)
9. Become good friends with the SAs and storage admins, you need all the allies you can get.
10. Learn to say no appropriately
And finally…
11. Resist the temptation to beat the developer with his own laptop when he brings you a 6-page query and complains “it runs slow”.
Happy New Year!!!