Sybase TechWave 2007 Papers – FINAL CALL!

The final call for Sybase TechWave 2007 papers ends tonight at midnight Pacific Daylight Time. Get your paper in!

TechWave 2007 Call for Papers Now Open!

Would you like to share your Sybase solution with hundreds of conference attendees? If you’ve solved a unique business challenge utilizing Sybase products, we’d like to hear about it. This is your chance to attend Sybase’s next annual user conference for free!

  • Submit an abstract – Sybase invites you to submit an abstract of a proposed session to be considered for presentation at TechWave 2007.
  • Be Selected – Not all submissions are accepted. This is a terrific opportunity to have your solution showcased in front of a select group of peers.
  • Network with Peers – Present at TechWave and network with hundreds of industry professionals like yourself. Visit http://www.sybase.com/techwave/conference_content to apply.
  • Attend for free! – If you are selected as a conference speaker, you’ll receive a complimentary conference pass valued at $1,795.

Applications must be submitted no later than Friday, March 16, 2007. You will be notified in April if you have been selected for a technical session.

For more information on the conference and how to complete an application to present, please visit http://www.sybase.com/techwave. Specific questions can be directed to sybase.techwave@sybase.com.


Sybase ASE Tech Talk podcast link now working!

Back in February, I posted Sybase ASE Tech Talk Podcast – FIXED! The problem at the time was that the RSS feed provided by Sybase syndicated the descriptions of the mp3 files rather than the mp3 files themselves. I quickly wrote up a simple Perl script that created a valid RSS 2.0 feed syndicating the mp3 files.

Since then, Sybase has created a valid RSS 2.0 feed, but it was linked only to the Apple iTunes Store – that was the only way to subscribe to it. If you were using another podcast aggregator such as Amarok, you were out of luck. Using iTunes, I was able to identify the URL it uses, so we can now plug the feed into other aggregators. 🙂  http://video.sybase.com/podcasts/ase-podcast.xml

ASE Tech Talk Podcast


Sybase ASE: Helping indexes work better

Drew Montgomery wrote up an excellent description of how to use indexes more wisely:
There are times when it seems that the optimizer just doesn’t “get it”. In other words, it doesn’t use what seems like an obvious choice for an index. There are several possible reasons for that, and we are going to explore a couple of them here…

The first could be that the index in question is using old statistics. The statistics for an index are used when evaluating queries to help the optimizer determine the cost of using a particular index. Each index for a table is evaluated by the optimizer along with some other complex interactions to determine if an index (or set of indices if we are dealing with multiple tables) should be used. If the statistics for an index are out of date, old, tired, worn out, they may no longer accurately represent the distribution of the data. The optimizer, using the older statistics, may determine that another course of action is best. This is why we would want to periodically update the statistics. Of course, this isn’t a cure-all for query woes. Far from it! But it can be a cause of some optimizer confusion.
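A minimal sketch of the maintenance commands in ASE (the table and index names here are only placeholders):

-- refresh statistics for all columns that appear in any index on the table
update index statistics my_table

-- or refresh the statistics for a single index
update statistics my_table my_table_idx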

Another reason could be that the query itself doesn’t let the optimizer use an index. You may be asking yourself, “what could I have done to make the optimizer mad at me?” It’s not personal. It may just be the way you have written your query. One little-known fact is that if you apply a function to a column in the where clause, the optimizer can’t use an index on that column. In other words, if you have a query that looks like this:

select * from local_users
where upper(name) like "%SMITH%"

You have prevented the optimizer from using an index based on name: upper-casing the column hides it from the optimizer. The same would be true if you had used an arithmetic function or expression against the column:

select * from accounts
where account_limit * 10 < expected_account_limit

In the above case, the problem is that the optimizer could use an index on expected_account_limit OR on account_limit, but in no case could it use both. You would have to rewrite the query to move the *10 to the other side of the comparison to allow the optimizer to use the account_limit column.

In the same vein, if there is a data type mismatch between the columns being compared in a query, the optimizer has to make the column types “match” before it can complete the query. This means that the optimizer applies an implied function (convert) to align the data types. If you have the following query:

select p.name, p.address, j.title
from personnel p, jobs j
where p.j_id = j.j_id

and the column j_id in personnel allows nulls (for those people who are working under cover!), while j_id in jobs does not allow nulls since it is the primary key, the optimizer will not use an index based on the j_id column in the jobs table. In order to “match” the data types, it has to convert the j_id column in the jobs table from “int not null” to “int null” to match the definition of the j_id column in the personnel table. Sometimes the difference between the columns can be as subtle as a varchar(10) vs. a char(10), or a varchar(20) vs. a varchar(21). In each case, the optimizer converts to the broadest common denominator to make the match, and in many cases that conversion will prevent the optimizer from using a given index.

The last possibility that I will discuss is the case where the wrong index exists, or a better index could be created. There are two distinct cases here. The first is where one index has an advantage of some kind – it may be unique, or it may be clustered – and that advantage can be enough to sway the optimizer toward that index. The other case is that even though the query is using an index, another index may be more efficient. I recently ran across this specific case. A query was using four different columns in its where clause, and two of those columns in an order by clause. The query was trying to retrieve data on the basis of the date of the record, in descending order (i.e. they wanted the “youngest” record first), and in fact it used “top 1” to return a single record based on these criteria. A facsimile of the query is as follows:

select top 1 name, balance_date, amount
from bank_info
where name = @name
  and account_type = @type
  and branch_id = @branch
  and balance_date < @eval_date
order by name, balance_date desc

The query was using an index that covered two of the four columns involved. Although this was better than a table scan (the table had 82 million rows), it was not hugely efficient. The index did reduce the data set being sifted through, but the query was still taking 30,000 I/Os to complete. Part of the problem was that once the basic query had been satisfied, the result was put into a sort table that was then interrogated for the youngest record. This is a case for a new index. But the question is: what does the index go on? We could build an index on an id column and the date. That would help the sort, but would not help the where clause much. We could build an index on the four columns in the where clause, but the rows found would still go into a sort table. Probably more efficient, but not as good as it gets.
For this particular query, three of the four columns are matched exactly, with the fourth being the date column’s cutoff. If the cutoff date is midnight of March 11th, we are looking for the latest record from March 10th or earlier. The best solution is to create an index on the four columns in the where clause, with the date column last and (in the definition of the index) descending. That way, once we restrict on the other three columns, we start with the newest record. The other half of this is perhaps a little less obvious: use all of the columns of the new index in the order by clause. The change would be as follows:

select top 1 name, balance_date, amount
from bank_info
where name = @name
  and account_type = @type
  and branch_id = @branch
  and balance_date < @eval_date
order by name, account_type, branch_id, balance_date desc

That way the optimizer can use the index for the where clause and the order by clause simultaneously. The result is that we go from thousands of I/Os, to hundreds of I/Os (with the old index and the old order by), and finally to a much more reasonable 10-15 I/Os. The difference in performance is dramatic. The bottom line: be aware not only of what indices exist, but of how they are being used. Take care of them and they will be like a good motor oil – they will improve the performance of your system while extending the life of the application.
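As a footnote, the four-column index described above could be created roughly like this in ASE (the index name is made up):

-- four-column index with the date column last and descending
create index bank_info_bal_idx
    on bank_info (name, account_type, branch_id, balance_date desc)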


Breast Cancer 3 Day – 60 Mile walk – we need supporters!


Join me in the fight against breast cancer

Please support me as I take an amazing journey in the fight against breast cancer! The Breast Cancer 3-Day is a 60-mile walk over the course of three days. Net proceeds benefit the Susan G. Komen Breast Cancer Foundation and the National Philanthropic Trust, funding important breast cancer research, education, screening, and treatment.

My dad’s first wife, Dawn Froebe, died of breast cancer in 1970; she was the mother of my older sister Kathryn and my brother Michael. Last year, my mother-in-law (future mother-in-law at the time) wrote Dawn’s name on her shirt in remembrance of her. My mother-in-law walked the 60 miles last year along with Gwen.

In August of 2007, my wife, my mother-in-law, Mary Bridgett, Gwen, Melinda, and I will be walking the 60 miles in three days. My sister-in-law will be part of the 3 Day Crew! If you can afford it, please pledge a donation to the Susan G. Komen Breast Cancer Foundation. Click on the image to make a donation.

I’m still a long way off from raising the minimum $2,200 in donations required before I can walk the 60 miles. Give what you can – remember that it is TAX DEDUCTIBLE! You can donate as much or as little as you wish – if you provide your email address when you donate, the Foundation will email you a receipt that you can use for your 2007 tax deductions!

SAVE THE BOOBIES!


FW: CaseXpress Beta

March 19th, 2007 by bruce.armstrong

Sybase has opened up a beta of the CaseXpress portion of their website. If you have a support contract and have been frustrated by CaseXpress in the past, here’s your chance to give them feedback that will make your life easier.

Now is your chance! Go check out the new (beta) CaseXpress!


HOWTO: Compressing text in a database while keeping it searchable with Perl Compress::AsciiFlate

I’m archiving our Sybase ASE MDA tables into a central repository, but we do run into space issues from time to time with the stored queries from master..monSysSQLText and master..monProcessSQLText. The workaround was to compress any query over 400 bytes using Compress::Zlib and store it in an image column. The problem is that we then aren’t able to easily search the queries.

Jimi-Carlo Bukowski-Wills’s Compress::AsciiFlate allows me to compress the queries into a searchable string that I can put in a text column. It seems to work fairly well, but there are a few caveats:

  1. The module reads the entire text into memory before processing it. This isn’t a problem for most query texts, though.
  2. If the text consists mostly of unique words, the compressed text may actually be larger than the original text.
  3. It doesn’t handle Unicode text. Expect anything in the Latin iso_1 character set to work, though.

Of course, this should work with any database that is set up to use/accept latin iso_1 characters.

use Compress::AsciiFlate;

my $af = new Compress::AsciiFlate;
my $text = 'some words some words some words';

$af->deflate($text);
print $text; # prints: "some words _1 _2 _1 _2"
$af->inflate($text);

print $text; # now prints: "some words some words some words"
print $af->olength; # original length: 33
print $af->dlength; # deflated length: 23
print $af->difference; # 10
print $af->ratio; # 0.696969696969697
print $af->ratio(3); # 0.697
print $af->percentage; # 69.69
print $af->percentage(4); # 69.697
print $af->count; # how many different words: 2
print join(' ',$af->table); # _1 some _2 words
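
Because deflate leaves the first occurrence of every word intact, a plain like search against the archive still finds queries that mention a given table name. A rough sketch, assuming a hypothetical repository table named mda_sql_archive with the deflated query text stored in a text column named sql_deflated:

-- hypothetical repository table for the deflated query text
create table mda_sql_archive
(
    ServerName   varchar(30) not null,
    CollectTime  datetime    not null,
    SPID         int         not null,
    sql_deflated text        null
)

-- the first occurrence of each word survives deflation, so like still matches
select ServerName, CollectTime, SPID
from mda_sql_archive
where sql_deflated like '%my_table%'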

For those of you in Bismarck/Mandan, North Dakota…..

In the heart of North Dakota, there is a circle of women who rule over all mankind. They are the mothers, sisters, daughters and grandmas who pull the strings more expertly than Marlon Brando in The Godfather. They weave the Garments of Power and harness the magic of the Chopsticks with Points. They are…

The Bismarck Mandan Chicks with Sticks!

If you need any sock yarn, mosey on over to Froebe Fibers 🙂


A standard Iterator super class for Perl?

I’ve been reading Mark Jason Dominus’s excellent Higher Order Perl during my wife’s weekly Stitch & Bitch meeting. In the pursuit of getting going with iterators, I started looking through CPAN and Perlmonks for a ready-made Iterator class that could serve as a superclass for many other iterator classes. Guess what I found… a hodgepodge of iterators that mostly appear to be entirely independent implementations of what an iterator is/does. At least, that is the impression I get.

Yup, I’m aware of Iterator, Iterator::Misc, and Iterator::Util, but they don’t seem to be used by other CPAN modules very much.

Has anyone else seen this or am I just suffering from “Friday: brain shuts down”?

I cross-posted this question to Perlmonks.


Monitoring Databases – what’s wrong with that?

Most database monitoring systems aren’t from the database vendors, as you might think, but from a hodge-podge of 3rd-party vendors that seem to want to charge more than I make in a lifetime for database monitoring software – try finding low-cost monitoring software for DB2 on the mainframe.

They typically use standardized, and often deprecated, monitor counters that, when used by their product, interfere with any other monitoring products you might use. For example, if the Operations Department is using Nimbus to monitor the network, VoIP, hosts, tape archival systems, and the database servers to ensure that they are running, what happens when the DBAs want to use DBA Expert? The two products (keep in mind that I chose the products for the example at random) will trip over each other – neither will provide reliable metrics for the databases.

The front ends for the monitoring products always seem to show a fancy GUI full of bright colors, dials, graphs, and the latest and greatest designer kitchen sink. The vendors are very rarely willing to provide any documented API or mechanism for you to obtain the data from their product without a nasty NDA. The premise is that you will use their front end to display and analyze the monitoring metrics.

The database vendors themselves are largely to blame. The monitoring APIs they offer assume that you will only be using a single monitoring system. For example, in Sybase’s ASE the new API is the MDA tables for obtaining performance metrics, but the problem comes in when monitoring software uses multiple methods to obtain additional information that may not be (easily) obtainable from the MDA tables. sp_sysmon will reset several monitoring counters unless you call it with the ‘noclear’ option. Unfortunately, ‘noclear’ is not widely known and is rarely used by monitoring software. Of course, this is just one example of multiple monitoring APIs from a single database vendor.
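
For illustration, on ASE versions that support the option, the difference is just the extra parameter (a sketch, not a complete monitoring script):

-- sample for one minute, then reset the monitor counters (the default behavior)
sp_sysmon "00:01:00"

-- sample for one minute but leave the counters alone, so other tools
-- reading the same counters aren't disturbed
sp_sysmon "00:01:00", noclear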

You know what? I don’t care about the vendors’ fancy front ends. Give me a web service that I can access, so I can use the monitoring metrics in another application, on a PDA, etc. A few vendors have tried to offer an API, but they are often so damned complicated that you would have had to work at the company to understand them. Don’t even get me started on vendors keeping their APIs updated.

Update: Thanks go to Peter Dorfman of Sybase for helping clarify that the MDA tables in ASE ‘clear’ only on a per-connection basis. That means if you select from monDeadlocks twice on connection #1, the first select might show 5 rows and the second 0 rows. If you ran the select on connection #2 sometime later, you would see the 5 rows plus any other deadlocks that had occurred since then. I wasn’t very clear on that, as I was (in my head) also including sp_sysmon and other monitoring options that would conflict with the MDA tables.
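
In other words (a sketch of the behavior described above, not actual output):

-- connection #1
select * from master..monDeadlocks   -- first select: returns 5 rows
select * from master..monDeadlocks   -- second select: returns 0 rows for this connection

-- connection #2, some time later
select * from master..monDeadlocks   -- returns the original 5 rows plus any newer deadlocks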


From my wife….

I’m in shock. Happy happy shock.

Jason and I just bought a house.

Well, we aren’t done.  We haven’t closed yet.  But we did put down a deposit, and get our financing approved.  We won’t close until September, when the house is actually, um, built. 

Our house is a very, very, very fine house
We get to choose the colors, of course. This isn’t our exact house, but an elevation of what our house will look like from the outside.

I can’t paste the floorplans here because they are Flash files, but if you go here, you can see the floorplans.

I can’t believe we are doing this.  It is literally my dream house.  How the hell can we get my dream house?  This shouldn’t be possible.  But it is!

I’ll post actual pictures later.  Believe me, I took plenty.

Now you know the reason why I haven’t been posting very much here or on the newsgroups lately. 🙂 
