I’ve been putting off migrating a (very) big database for a (very) big application from MySql 4.1.21 to 5.x for a very long time. I’ve just embarked on that journey and I’ll try to collect a list of all the (many) hassles I encountered.
When 5.0 first came out my initial reflex to not trust initial releases for critical jobs kicked in and I was reluctant to move. Then as the minor releases of 5.x crept up and the initial release issues were being ironed out, there was less weight to that argument.
My next excuse was that ‘nothing is broken’ so I have no real compelling reason to go through the hassle. Well, although nothing was broken, I must admit the idea of Views and Stored Procedures seemed appealing. At some point I gave it a go. Moving the data to the test 5.x schema was a breeze by using mysqldump to dump the data off 4.1 and pumping it back to 5.x. However the application broke as soon as I told it to use the new schema. So not feeling like a battle then, I decided to wait a bit more.
Finally, The potential of creating dynamic queries with stored procedures and get some crosstab action going was reason by itself to go for it.
But still I waited a bit more, until 5.0.24.
Data Migration:
As I said: easy peasy. mysqldump all the data from 4.1, and load it into a new database on the 5.x schema. When dealing with very large dumps I always disable integrity checks on the import file. Otherwise the import process would take days to complete.
To do this, I append this to the file header:
SET AUTOCOMMIT = 0;
SET FOREIGN_KEY_CHECKS=0;
and this to the footer:
SET FOREIGN_KEY_CHECKS = 1;
COMMIT;
SET AUTOCOMMIT = 1;
Because opening a 3GB file in Vi is no fun, I just create two small files containing each segment and do:
cat headerfile schemadump.sql footerfile > fastschemadump.sql
to create the new faster import file (all this is from the Linux bash shell).
Make sure to create the new database using the correct character set to double ensure that all inserted data will go in the right character set without any conversions.
Un-Quirks
MySql 4.x had a lot of quirks that were fixed in 5.x. Those were not bugs, just ‘gotchas‘.
MySql can be told to work modless, i.e., not in strict mode. You can do this by setting the modes to nothing:
SET @@global.sql_mode=”;
The problem is that my application was working around a lot of these issues and others, as well as containing not 100% standard sql, just because it was tolerated in 4.x.
In 5.x this breaks immediatly. Most notable is the Join syntax change issues (which do not seem to be affected by the above unstrictness directive:
This query works on 4.x but fails badly on 5.x
SELECT t1.x, t2.y
FROM t1, t2
LEFT JOIN t3 ON t3.a = t1.a
WHERE t2.x=t1.x
AND t1.z=1;
The fix is to use JOINs instead of comma syntax to tie those first two tables together (or move that condition to after the last JOIN which is a less elegant solution):
SELECT t1.x, t2.y
FROM t1
JOIN t2 ON t2.x=t1.x
LEFT JOIN t3 ON t3.a = t1.a
WHERE t1.z=1;
MySQL 5.x is NOT necessarily faster than 4.x! Bear that in mind. In fact, if your application is optimized to 4.x and will not benefit from the new features of 5, don’t bother migrating until it does. 5.x is bigger and much more feature rich. That means it’s got a larger codebase and some overhead to support the new features (parsing sql for all the new functions and stored procedure syntax for example).
I found that queries ran somewhat slower in 5.x, until I ran them as Prepared Statements with bound variables. This improved performance quite a bit as the sql is now parsed only once. If you are using ColdFusion, cfqueryparam will do the trick.
There are instructions for the upgrade and full list of incompatibilities on the MySql site, though I’ve always found their documentation side of things a bit lacking.
I’ve found the MySql Performance blog (see this) to be very helpfull.
My migration effort is still ongoing and I’m trying to make sure my rewritten queries work both on 4.x and 5.x, just in case things won’t work out and I’ll have to revert back to 4.x.
Its not an easy going fun ride, but I guess it’s gotta happen.
I hope I’ll have nothing more to tell and the rest of the work goes smoothly.