Sep 25 2006

In the interest of health and safety…

This guy is just asking for someone to flick the switch… I’m sure there are safer places to read your morning paper, but I guess this fella likes living life on the hedge.

In the interest of Health and Safety


Sep 25 2006

The Big MySQL Migration – the journey from 4.x to 5.x

I’ve been putting off migrating a (very) big database for a (very) big application from MySQL 4.1.21 to 5.x for a very long time. I’ve just embarked on that journey and I’ll try to collect a list of all the (many) hassles I encounter along the way.
When 5.0 first came out, my usual reflex not to trust first releases for critical jobs kicked in and I was reluctant to move. Then, as the minor releases of 5.x crept up and the early issues were ironed out, that argument carried less weight.
My next excuse was that ‘nothing is broken’, so I had no real compelling reason to go through the hassle. Well, although nothing was broken, I must admit the idea of Views and Stored Procedures seemed appealing. At some point I gave it a go. Moving the data to the test 5.x schema was a breeze: mysqldump the data off 4.1 and pump it back into 5.x. However, the application broke as soon as I told it to use the new schema. Not feeling like a battle at the time, I decided to wait a bit more.

Finally, the potential of building dynamic queries with stored procedures and getting some crosstab action going was reason enough by itself to go for it.
But still I waited a bit more, until 5.0.24.

Data Migration:
As I said: easy peasy. mysqldump all the data from 4.1 and load it into a new database on the 5.x server. When dealing with very large dumps I always disable integrity checks in the import file; otherwise the import process would take days to complete.
To do this, I append this to the file header:

SET AUTOCOMMIT = 0;
SET FOREIGN_KEY_CHECKS=0;

and this to the footer:
SET FOREIGN_KEY_CHECKS = 1;

COMMIT;
SET AUTOCOMMIT = 1;

Because opening a 3GB file in Vi is no fun, I just create two small files containing each segment and do:

cat headerfile schemadump.sql footerfile > fastschemadump.sql

to create the new faster import file (all this is from the Linux bash shell).

Make sure to create the new database using the correct character set, to make doubly sure that all inserted data goes in with the right character set and without any conversions.
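
For example, something along these lines (newdb is just a placeholder name and utf8 is an assumption; use whatever character set your 4.1 data is actually stored in):

-- newdb is a placeholder; utf8 is an assumption, match it to your 4.1 data
CREATE DATABASE newdb
  CHARACTER SET utf8
  COLLATE utf8_general_ci;

When loading the dump, the mysql client’s --default-character-set option can be set to the same value, so the connection itself doesn’t introduce a conversion.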

Un-Quirks
MySQL 4.x had a lot of quirks that were fixed in 5.x. Those were not bugs, just ‘gotchas’.
MySQL can be told to work mode-less, i.e. not in strict mode, by setting the SQL mode to nothing:

SET @@global.sql_mode = '';
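
If you would rather relax things only for your own connection while testing (so other applications on the same server are unaffected), the session-level variable can be used instead; a small sketch:

-- See which modes are currently active
SELECT @@global.sql_mode, @@session.sql_mode;

-- Clear the modes for the current session only
SET @@session.sql_mode = '';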

The problem is that my application was working around a lot of these issues and others, as well as containing some not-quite-standard SQL, simply because 4.x tolerated it.
In 5.x this breaks immediately. Most notable are the JOIN syntax changes (which do not seem to be affected by the above un-strictness directive):

This query works on 4.x but fails badly on 5.x:

SELECT t1.x, t2.y
FROM t1, t2
LEFT JOIN t3 ON t3.a = t1.a
WHERE t2.x=t1.x
AND t1.z=1;

The fix is to use JOINs instead of the comma syntax to tie those first two tables together (or to move that condition to after the last JOIN, which is a less elegant solution):

SELECT t1.x, t2.y
FROM t1
JOIN t2 ON t2.x=t1.x
LEFT JOIN t3 ON t3.a = t1.a
WHERE t1.z=1;

MySQL 5.x is NOT necessarily faster than 4.x! Bear that in mind. In fact, if your application is optimized for 4.x and will not benefit from the new features of 5.x, don’t bother migrating until it will. 5.x is bigger and much more feature-rich. That means a larger codebase and some overhead to support the new features (parsing SQL for all the new functions and the stored procedure syntax, for example).
I found that queries ran somewhat slower in 5.x, until I ran them as prepared statements with bound variables. This improved performance quite a bit, as the SQL is now parsed only once. If you are using ColdFusion, cfqueryparam will do the trick.
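
To see the effect outside of ColdFusion, the same idea can be tried straight from the mysql client with a server-side prepared statement; a minimal sketch, reusing the t1 table from the JOIN example above purely for illustration:

-- Parse (prepare) the statement once...
PREPARE stmt FROM 'SELECT t1.x FROM t1 WHERE t1.z = ?';
SET @z = 1;

-- ...then execute it as many times as needed with bound values
EXECUTE stmt USING @z;
EXECUTE stmt USING @z;

DEALLOCATE PREPARE stmt;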
There are upgrade instructions and a full list of incompatibilities on the MySQL site, though I’ve always found their documentation a bit lacking.

I’ve found the MySQL Performance Blog (see this) to be very helpful.
My migration effort is still ongoing and I’m trying to make sure my rewritten queries work on both 4.x and 5.x, just in case things don’t work out and I have to revert to 4.x.
It’s not an easy, fun ride, but I guess it’s gotta happen.
I hope I’ll have nothing more to tell and the rest of the work goes smoothly.


Sep 7 2006

Internet Explorer and the Ajax Hang up

I’ve spent a good while today trying to figure out why an XmlHttpRequest call, which runs asynchronously, causes IE to ‘hang’ until the whole operation finishes – as if it were run synchronously. Firefox, as usual, is unaffected by this problem.

The reason seems to be IE trying to load a cached version of the file that the XmlHttpRequest call fetches, and getting quite confused in the process.
The solution is to always append a random string to the URL, thus making it unique and helping the poor explorer of the internet not get confused.

Continue reading


Sep 7 2006

God is a Txtr

This is another pic taken outside a church (by Jason). Hilarious. It seems like it’s becoming a trend. After the Resurrection In Pictograms we get messages straight out of God’s mobile phone.

Msg Frm God


Sep 6 2006

CoffeeCat has been rated top 25% on Jars…

After over 4 years, my media cataloging software (still using JRE 1.3) has been rated by Jars! It got a score of 855 out of a possible 1000. I’m pretty stoked ;o)

Now all I need is to find the (lost) source code so that I can release version 1.1 with the source wide open (v1.0 doesn’t run well on Linux).
You can get CoffeeCat here.

Jars top 25%

-------------------------------------------
Presentation : 275
Function     : 290
Originality  : 290
-------------------------------------------
Total        : 855 (out of a possible 1000.)

Sep 4 2006

Time is not on my side: Handling multi-time-zone users

One of the problems when developing multi-user, internationalised applications is how to seamlessly handle users from different time zones who are all part of the same account. Life would have been much easier if we all shared one time zone and time never had to change, ever. But life is not like that, and time is a slippery concept.

Here’s an example of the problem: let’s assume we’re dealing with a CRM-type application that involves a calendar of events and records that are date-time bound (like task deadlines etc.). An account in your application has 5 users. They all work for the same company and share the same data, but they are physically located in 5 different time zones. Say one user from California schedules a conference call event for the entire group at 9 AM Pacific time.
The user in New York should see the event in his own time zone, which would be 12 PM Eastern. The user in the UK, which is on GMT, would see the event at 4 PM (or, more accurately, at 16:00). And so on – each user should see the event scheduled at the right time for his time zone.
Another level of complication comes in when you start to factor in daylight saving time. There is no 100% reliable method that I’m aware of to automatically determine, with complete accuracy, when a daylight saving switch will occur for a particular region. Sometimes the switch is driven by political or religious considerations and does not follow a fixed rule.
So your GMT user might require the event to appear at 5 PM (17:00) instead of 4 PM, for example.

The solution I’ve found works best so far is to baseline all dates in the database to UTC. UTC is a fixed reference based on GMT, though the two are not quite the same thing.
During summer time, Britain switches to BST, which is UTC+1, while UTC stays at 0 all the time. That makes it the perfect time zone baseline.
So, each date goes into the database converted to UTC.
In MySQL, the handy UTC_TIMESTAMP() function can be used to get the current UTC timestamp. This means the only dependency is that the server hosting the MySQL database has an accurate clock that is synced regularly with a time server (via NTP). The sync is important because we do want the server to know automatically when it goes into daylight saving time; otherwise, all times converted to UTC will be one hour off.
As long as we know all date/time values in our database are in UTC, all we have to do is run them through some function to convert them to the current user’s time zone.
This conversion can be done at the database level or via some function or object on the application side. The implementation is irrelevant, as long as the date/time is displayed in the user’s time zone.
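
As an illustration, here is a minimal sketch of doing it at the database level (the events table and its columns are made up for the example; CONVERT_TZ() with named zones needs the MySQL time zone tables to be loaded, while numeric offsets like '+00:00' work without them):

-- Store the event time baselined to UTC (events is a hypothetical table)
INSERT INTO events (title, starts_at_utc)
VALUES ('Conference call', UTC_TIMESTAMP());

-- Display it in the viewing user's time zone
SELECT title,
       CONVERT_TZ(starts_at_utc, 'UTC', 'America/New_York') AS starts_at_local
FROM events;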

Another problem is how to automatically adjust the user’s time zone to account for daylight saving without the user having to change his time zone setting twice a year. I’m still looking for an adequate solution to this problem. Because of the non-standard nature of daylight saving rules, it’s hard to calculate that moment in time automatically. If you have any ideas, I’d love to hear them.

Update: Just to illustrate what a nightmare dealing with time zones can be, see this FAQ.