MySQL 5.1 vs. MySQL 5.5: Floats, Doubles, and Scientific Notation

Sheeri

13

For the past few days, I have been upgrading a few servers. We are going from Percona’s patched MySQL 5.1 to MariaDB 5.5 (the subquery optimization is excellent, and we have lots of subqueries). Our process involves upgrading a slave first, and letting it replicate for a while, and if everything’s good, update more slaves, then the master.

This has served us well in the past. And now that we are checking data integrity between masters and slaves with pt-table-checksum, the process involves checking before we start that there is data integrity. This is easy, as we checksum twice daily and have a Nagios alert if there are any discrepancies. After the upgrade, we checksum again, to be sure no data has been changed/corrupted in the process of doing a mysqldump export and import.*

Much to my surprise, after importing the data on one of our dev servers, I found that there were a lot of discrepancies. So I picked a chunk to do some comparisons on, and found something interesting:

On Server version: 5.1.65-rel14.0-log Percona Server (GPL), 14.0, Revision 475:
mysql> select float_field from db.tbl where id=218964;
+-------------+
| float_field |
+-------------+
| 9.58084e-05 |
+-------------+
1 row in set (0.04 sec)

On Server version: 5.5.28a-MariaDB-log MariaDB Server
MariaDB [(none)]> select float_field from db.tbl where id=218964;
+--------------+
| float_field |
+--------------+
| 0.0000958084 |
+--------------+
1 row in set (0.24 sec)

Which of course causes a different checksum. I tried SELECTing the values, casting and converting them, but I could not get them to change in the database. MySQL 5.1 insists on storing in scientific notation, and MariaDB 5.5 (and MySQL 5.5, we tested it out) insists on storing without scientific notation.

Frankly, I’m surprised this has not come up before (I did lots of querying Google for MySQL 5.5 and scientific notation), since it radically changes how numbers look when they are stored and retrieved. I guess code does the right thing…except for pt-table-checksum, and I cannot really blame it.

In the end, I used the –ignore-columns option to pt-table-checksum, with the result of:

SELECT GROUP_CONCAT(DISTINCT COLUMN_NAME) FROM INFORMATION_SCHEMA.COLUMNS WHERE DATA_TYPE IN ('float','double') AND TABLE_SCHEMA NOT IN ('mysql','information_schema','performance_schema');

In this way, I can get an accurate checksum to see if anything has changed, before I mark that the upgrade is complete on this slave server.

* This is just on the first slave. After the first slave is upgraded, we use xtrabackup to copy the data to another server to upgrade it.

13 responses

  1. Justin Swanhart wrote on :

    Could be related to:
    Incompatible change: As of MySQL 5.5.3, the server includes dtoa, a library for conversion between strings and numbers by David M. Gay. In MySQL, this library provides the basis for improved conversion between string or DECIMAL values and approximate-value (FLOAT or DOUBLE) numbers.

    Because the conversions produced by this library differ in some cases from previous results, the potential exists for incompatibilities in applications that rely on previous results. For example, applications that depend on a specific exact result from previous conversions might need adjustment to accommodate additional precision.

    For additional information about the properties of dtoa conversions, see Section 12.2, “Type Conversion in Expression Evaluation”.

  2. Sheeri wrote on ::

    Oh, I agree, it’s documented, but I’m surprised I haven’t seen more *bloggers* cautioning about it. Which probably means people are just upgrading and not verifying their data is the same…which is a bit scary.

    1. Justin Swanhart wrote on :

      Normally when I do a major upgrade I take a snapshot of a master and create two new servers, one with the old version of the software and one with the new version. Both use the same data from the snapshot (consistent –single-transaction mysqldump). Since they are guaranteed to be at the same point in time, I can run pt-upgrade against both to compare the results of queries. This also makes it possible to use CHECKSUM TABLE in parallel over multiple tables, to ensure the data is the same (this will be faster than a single threaded pt-checksum)

      Once that is all confirmed and I’m sure the workload returns the same results and performs the same, then I’ll go ahead and set up replication from the original master to both of the slaves. I’ll then let them run for a some time then stop replication on both and use START SLAVE UNTIL to get them to a consistent position. Then do CHECKSUM TABLE in parallel again to make sure replication has been working consistently. Then let the new version catch up and then use it as a data source for your other slaves.

      You don’t really need two servers for this, you can use two different versions of MySQL on the same server, but performance comparisons may be less meaningful if you do that.

    2. Kolbe Kegel wrote on :

      If anything, I think (and hope) it means that people are wisely avoiding the use of FLOAT/DOUBLE columns!

  3. Peter Laursen wrote on ::

    Thanks for pointing this out. Also I overlooked this change in 5.5.

    It affects not only pt_checksum but also various Data Sync applicationw (including ours) when involving a 5.1- and a 5.5+ server with FLOAT/DOUBLE data. Every identical row on source and target will be UPDATED (to what it was) on target and even though the server may ignore the UPDATE statements it is of course a considerable overhead.

  4. Baron wrote on :

    You probably needed http://www.percona.com/doc/percona-toolkit/2.0/pt-table-checksum.html#cmdoption-pt-table-checksum–float-precision

    1. Sheeri wrote on ::

      Nope, precision is different from scientific notation vs. non-scientific notation. I’ve run into float precision before on some minor upgrades, where 0.00001234 is on one server and 0.00001233 is on another. That’s different from 0.00001234 vs. 1.234e-5, the precision is exactly the same, it’s just that pt-table-checksum does not understand that those are the same number.

      But since you developed it, maybe you know different than I do?

      1. Justin Swanhart wrote on :

        Since you couldn’t be bothered to follow the link, you might want to pay special attention to the part about different versions and hardware:

        –float-precision
        type: int

        Precision for FLOAT and DOUBLE number-to-string conversion. Causes FLOAT and DOUBLE values to be rounded to the specified number of digits after the decimal point, with the ROUND() function in MySQL. This can help avoid checksum mismatches due to different floating-point representations of the same values on different MySQL versions and hardware. The default is no rounding; the values are converted to strings by the CONCAT() function, and MySQL chooses the string representation. If you specify a value of 2, for example, then the values 1.008 and 1.009 will be rounded to 1.01, and will checksum as equal.

        1. Sheeri wrote on ::

          Justin, I did follow the link. Precision is absolutely a different issue than the one I was having. Both 0.00001234 and 1.234e-5 have the same precision, and are the same number. They’re not rounded differently. They’re stored and displayed differently, and pt-table-checksum doesn’t realize that 0.00001234 and 1.234e-5 are the same number.

          The precision stuff for pt-table-checksum is so you can set it to ignore rounding differences in different versions, so that if one machine rounds to 0.00001234 and another rounds to 0.00001235, you can ignore that last digit and only compare the first 3 significant digits. In this case, where it’s appropriate to use the option, the numbers are actually different, but you’re telling pt-table-checksum to ignore that different.

          I know of no such option to convince pt-table-checksum that 0.00001234 and 1.234e-5 are the same number, and if there’s a solution to that, I’ll take it.

    2. Brandon Johnson wrote on :

      Baron,

      In this case, float precision would not have helped. Unfortunately we were dealing with very small numbers (to the order of -10 or greater in scientific notation) and the difference was that 5.1 displayed this number in the result set as scientific notation, whereas 5.5 displayed it in decimal notation. That caused the difference that pt-table-checksum was unable to recognize.

  5. Justin Swanhart wrote on :

    mysql> select c1, round(c1, 25) from foo\G
    *************************** 1. row ***************************
    c1: 281474976710656
    round(c1, 25): 281474976710656.0000000000000000000000000
    *************************** 2. row ***************************
    c1: 1.157920892373162e77
    round(c1, 25): 115792089237316200000000000000000000000000000000000000000000000000000000000000.0000000000000000000000000
    *************************** 3. row ***************************
    c1: 1.3407807929942597e154
    round(c1, 25): 13407807929942597000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000.0000000000000000000000000
    3 rows in set (0.00 sec)

    1. Justin Swanhart wrote on :

      never mind. :) I see the problem. big numbers work fine, small numbers don’t.

  6. Pingback from Re: MySQL 5.1 vs. MySQL 5.5: Floats, Doubles, and Scientific Notation | code.openark.org on ::

    [...] Sheeri's MySQL 5.1 vs. MySQL 5.5: Floats, Doubles, and Scientific Notation, I was baffled at this change of floating point number [...]