Saturday, July 4, 2009

Interview questions for ASE DBAs

This page contains some suggestions for questions to ask when interviewing an applicant for a Sybase ASE DBA job. There are also some questions the candidate might want to ask before (s)he takes the job (see the end of this page).
Please bear in mind that these are just some suggestions which I personally think are relevant. I'm certainly not claiming that these are the "best" or "most representative" questions you could ask. You could use these as a starting point and add further questions of your own.

The questions are listed here. They are repeated, with answers, below. At the bottom of this page, there are also some questions for a candidate DBA to ask a potential future employer.

What are the most important DBA tasks?
What should you do when you find a stacktrace in the server errorlog?
Is there any disadvantage of splitting up your application data into a number of different databases?
Is it necessary to drop & recreate all procedures and triggers every few months?
What are the main advantages and disadvantages of using identity columns?
What do you do when the server can't start due to a corrupt master database?
When you do a BCP-in from a file to a table, what happens to triggers, constraints, rules and defaults on that table?
How do you BCP only a certain set of rows out of a large table?
What's the difference between managing permissions through users and groups or through user-defined roles?
Is there any advantage in using the 64-bit version of ASE instead of the 32-bit version?
Is it a good idea to use datarows locking for all tables by default?
What would you do when the ASE server's performance is bad?
What do you do when a segment gets full?
Are timestamp columns good candidates for primary keys? (since they're always unique for a row)
Does the DBA candidate hold a Sybase Certification?





Questions to ask a candidate DBA


What are the most important DBA tasks ?
In my opinion, these are (in order of importance): (i) ensure a proper database/log dump schedule for all databases (including master); (ii) run dbcc checkstorage on all databases regularly (at least weekly), and follow up any corruption problems found; (iii) run update [index] statistics at least weekly on all user tables; (iv) monitor the server errorlog for messages indicating problems (daily). Of course, a DBA has many other things to do as well, such as supporting users & developers, monitoring performance, etc.
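
As a rough illustration of tasks (i) to (iii), here is a minimal Python sketch that only builds the T-SQL statements for one maintenance pass; it does not connect to a server. The function name, database names, table names and dump directory are all made up for the example, and dbcc checkstorage assumes the dbccdb database has already been set up. Errorlog monitoring (task iv) is not covered.

```python
def weekly_maintenance(databases, tables_by_db, dump_dir):
    """Return T-SQL statements for one maintenance pass, in priority order.

    All names passed in are hypothetical; nothing is sent to a server.
    """
    statements = []
    # (i) dump every database, master included
    for db in databases:
        statements.append(f'dump database {db} to "{dump_dir}/{db}.dmp"')
    # (ii) consistency check (assumes dbccdb is configured for checkstorage)
    for db in databases:
        statements.append(f"dbcc checkstorage({db})")
    # (iii) refresh optimizer statistics on the user tables
    for db in databases:
        tables = tables_by_db.get(db, [])
        if tables:
            statements.append(f"use {db}")
        for table in tables:
            statements.append(f"update index statistics {table}")
    return statements


stmts = weekly_maintenance(["master", "sales"], {"sales": ["orders"]}, "/dumps")
for line in stmts:
    print(line)
```

In practice you'd feed such statements to isql from a scheduler; the point here is only the order of the steps.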

What should you do when you find a stacktrace in the server errorlog ?
Open a case with Sybase TechSupport. There's not much you can do yourself with this information, and only TechSupport has the information to determine whether it's related to a bug, for example. It's not a good idea to ignore such things in the errorlog -- 'cos it might indeed indicate you're hitting a bug.

Is there any disadvantage of splitting up your application data into a number of different databases ?
When there are relations between tables/objects across the different databases, then there is a disadvantage indeed: if you restore a dump of one of the databases, those relations may not be consistent anymore. This means that you should always back up a consistent set of databases; however, this may be difficult when the system is continuously in use, because a single database is the unit of backup/restore. Therefore, when making this kind of design decision, backup/restore issues should be considered (and the DBA should be consulted).

Is it necessary to drop & recreate all procedures and triggers every few months ?
No; in older Sybase versions (4.x), this was sometimes necessary, as query plans could grow bigger over time, hit an upper limit at some point and cause an error. Both the growing plan and the limit have been removed since at least version 11.0 (or was it already fixed in 10 ? -- I'm not sure...).

What are the main advantages and disadvantages of using identity columns ?
The main advantage of an identity column is that it can generate unique, sequential numbers very efficiently, requiring only a minimal amount of I/O. The disadvantage is that the generated values themselves are not transactional, and that the identity values may jump enormously when the server is shut down the rough way (resulting in "identity gaps"). You should therefore only use identity columns in applications if you've addressed these issues (go here for more information about identity gaps).
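
To see why the values can jump, here is a toy Python model, assuming the server reserves identity values in blocks so that one write covers many inserts, and that after a rough shutdown it can only resume from the end of the reserved block. The class name and the block size of 1000 are invented for the example; this is a simplification, not ASE's actual implementation.

```python
class IdentityCounter:
    """Toy model of a counter that reserves values in blocks (hypothetical)."""

    def __init__(self, block_size):
        self.block_size = block_size
        self.last_used = 0       # last value handed out (in memory only)
        self.reserved_upto = 0   # highest value recorded as reserved on disk

    def next_value(self):
        if self.last_used == self.reserved_upto:
            # Reserve a new block: one write covers block_size inserts.
            self.reserved_upto += self.block_size
        self.last_used += 1
        return self.last_used

    def crash(self):
        # Only the on-disk reservation survives, so the unused
        # remainder of the current block is lost.
        self.last_used = self.reserved_upto


counter = IdentityCounter(block_size=1000)
before = [counter.next_value() for _ in range(3)]   # 1, 2, 3
counter.crash()
after = counter.next_value()                        # 1001
print(before, after)
```

Three rows were inserted, yet the next value after the crash is 1001: that's the identity gap. The bigger the reserved block, the cheaper the inserts and the bigger the potential gap.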

What do you do when the server can't start due to a corrupt master database ?
You create a new master device using buildmaster (on 12.5, use dataserver instead); create a RUN_SERVER file and start the server in single-user mode (using the -m option); then manually add an entry for SYB_BACKUP in sysservers; and then load a database dump of the master database. After that, the server will automatically shut down; restart it and see if your application databases are still there.

To turn up the heat a bit: what if you're using a non-default character set or sort order ?
In this case, things are more complicated: you'll first need to create sybsystemprocs and change the sort order/charset of the newly created master database before loading the master database dump (thanks to John Langston for this one).

When you do a BCP-in from a file to a table, what happens to triggers, constraints, rules and defaults on that table ?
For both fast BCP and 'normal' BCP, triggers, constraints and rules are ignored. Defaults will be effective though (go here for a nasty, but little-known side effect).

How do you BCP only a certain set of rows out of a large table ?
If you're in ASE 11.5 or later, create a view for those rows and BCP out from the view. In earlier ASE versions, you'll have to select those rows into a separate table first and BCP out from that table. In both cases, the speed of copying the data depends on whether there is a suitable index for retrieving the rows.
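
The view approach could look roughly like the Python sketch below, which only builds the create view statement and the bcp command line without running either. The database, table, view, file, server and login names are all invented for the example; -c (character format), -S and -U are standard bcp options, and leaving out -P makes bcp prompt for the password.

```python
def bcp_out_subset(database, table, where_clause, view_name,
                   out_file, server, login):
    """Build the view definition and bcp argument list for a partial export.

    All names are hypothetical; nothing is executed here.
    """
    create_view_sql = (
        f"create view {view_name} as "
        f"select * from {table} where {where_clause}"
    )
    bcp_argv = [
        "bcp", f"{database}..{view_name}", "out", out_file,
        "-c",           # character (text) format
        "-S", server,
        "-U", login,    # no -P given: bcp prompts for the password
    ]
    return create_view_sql, bcp_argv


sql, argv = bcp_out_subset(
    database="sales", table="orders",
    where_clause="order_date >= '20090101'",
    view_name="orders_2009_v", out_file="orders_2009.dat",
    server="SYB_PROD", login="dba_login",
)
print(sql)
print(" ".join(argv))
```

Whether this is fast still depends on an index supporting the where clause (here, one on the made-up order_date column).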

What's the difference between managing permissions through users and groups or through user-defined roles ?
The main difference is that user-defined roles (introduced in ASE 11.5) are server-wide, and are granted to logins. Users and groups (the classic method that has always been there since the first version of Sybase) are limited to a single database. Permissions can be granted/revoked to both user-defined roles and users/groups. Whichever method you choose, don't mix 'm, as the precedence rules are complicated.

Is there any advantage in using the 64-bit version of ASE instead of the 32-bit version ?
The only difference is that the 64-bit version of ASE can handle a larger data cache than the 32-bit version, so you'd optimize on physical I/O. Therefore, this may be an advantage if the amount of data cache is currently a bottleneck. There's no point in using 64-bit ASE with the same amount of "total memory" as for the 32-bit version, because 64-bit ASE comes with an additional overhead in memory usage -- so the net amount of data cache would actually be less for 64-bit than for 32-bit in this case.
(Just for clarity: the 64-bit version is not twice as fast as the 32-bit version, and does not perform its I/O at double the size of the 32-bit version (I once heard someone state these as facts...)).

Is it a good idea to use datarows locking for all tables by default ?
Not by default; only if you're having concurrency (locking) problems on a table, and you're not locking many rows of a table in a single transaction, then you could consider datarows locking for that table. In all other cases, use either datapages or allpages locking.
(I personally favor datapages locking as the default lock scheme for all tables because switching to datarows locking is fast and easy, whereas for allpages locking, the entire table has to be converted which may take long for large tables. Also, datapages locking has other advantages over allpages, such as not locking index pages, update statistics running at level 0, and the availability of the reorg command).

What would you do when the ASE server's performance is bad ?
"Bad performance" is not a very meaningful term, so you'll need to get a more objective diagnosis first. Find out (i) what such a complaint is based on (clearly increasing response times or just a "feeling" that it's slower?), (ii) for which applications/queries/users this seems to be happening, and (iii) whether it happens continuously or just incidentally. Without identifying the specific, reproducible problem, any action is no better than speculation.

What do you do when a segment gets full ?
Wrong: a segment can never get full (even though some error messages state something to that effect). A segment is a "label" for one or more database device fragments; the fragments to which that label has been mapped can get full, but the segments themselves cannot. (Well, OK, this is a bit of a trick question... when those device fragments fill up, you either add more space, or clean up old/redundant data.)
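
The "label" idea can be pictured with a small Python model. This is only a way of thinking about it, not how ASE stores things internally; the Fragment class, device names and sizes are all invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """One device fragment of a database (hypothetical model)."""
    device: str
    size_mb: int
    used_mb: int
    segments: set   # segment names (labels) mapped to this fragment


def free_mb(fragments, segment):
    """Free space available to objects placed on the given segment."""
    return sum(f.size_mb - f.used_mb
               for f in fragments if segment in f.segments)


fragments = [
    Fragment("data_dev1", size_mb=100, used_mb=100,
             segments={"system", "default"}),
    Fragment("log_dev1", size_mb=50, used_mb=10,
             segments={"logsegment"}),
]
free_before = free_mb(fragments, "default")    # 0: the fragment is full

# Adding space means adding a fragment that carries the same label.
fragments.append(Fragment("data_dev2", size_mb=200, used_mb=0,
                          segments={"system", "default"}))
free_after = free_mb(fragments, "default")     # 200
print(free_before, free_after)
```

The segment named "default" never changed; only the set of fragments behind it did. On a real server, the extra fragment would come from extending the database with alter database.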

Are timestamp columns good candidates for primary keys? (since they're always unique for a row)
Absolutely not, this would be a bad thing to do; timestamp columns are actually a bit too unique to be used as a primary key. For details, see the quiz question for August 2004.

Does the DBA candidate hold a Sybase Certification ?
If (s)he does, consider that a plus !

Questions for a candidate DBA to ask your potential future employer

When you're being interviewed for a DBA vacancy, there are some things related to the DBA environment you might want to know as well. I'd suggest checking out at least the following:


Does the company have a Technical Support contract with Sybase ?
A support contract is required for getting EBFs and for being able to ask questions about technical problems. Without a support contract, you're completely on your own; you should ask yourself if you can fulfill the company's expectations in that case.

Which version of ASE are they using, and on which platform ?
This matters: for example, if they appear to be running 11.0.3 on Data General, find out if they are aware that both this ASE version and this platform are no longer supported by Sybase. If they're not planning to upgrade to a supported version/platform soon, ask yourself if you want to be working there; you risk being on your own, without support, and with an out-of-date ASE version that stops you from keeping your ASE knowledge current.

How many servers, databases and concurrent users do they have ? What's the database size like ? Is there a 24*7 uptime requirement ?
It helps to know which scale you're talking about. If you're supposed to look after a 500 GB, never-no-downtime, 3000-user system, check whether the salary you're being offered is of the same magnitude as the system.

Is Sybase Replication Server involved ?
If it is, and if you know RepServer, reconsider your financial demands -- upwards, that is. The reason is that RepServer DBAs are hard to find -- much harder than ASE DBAs.

Are you also supposed to take care of their Oracle, MS-SQL, etc. servers ? Do you have to manage ASIQ or ASA (SQLAnywhere) as well ?
You may want to know this in advance rather than find out on your first working day....

If you want to get a Sybase certification (or get a more recent one) will they pay for this ?
It should make 'm happy that you're willing to get your certification, 'cos it will make you a better DBA; try to get them to pay for at least part of it. Tip: if you're talking to a management person, calling this a "win-win scenario" might help....
