Wednesday, June 04, 2014

MariaDB 10 is a Sandbox killjoy?

Using MySQL Sandbox I can install multiple instances of MySQL. It is not uncommon for me to run 5 or 6 instances at once, and in some occasions, I get to have even 10 of them. It is usually not a problem. But today I had an issue while testing MariaDB, for which I needed 5 instances, and I the installation failed after the 4th one. To make sure that the host could run that many servers, I tried installing 10 instances of MySQL 5.6 and 5.7. All at once, for a grand total of 20 instances:

$ make_multiple_sandbox --how_many_nodes=10 5.6.14
installing node 1
installing node 2
installing node 3
installing node 4
installing node 5
installing node 6
installing node 7
installing node 8
installing node 9
installing node 10
group directory installed in $HOME/sandboxes/multi_msb_5_6_14
$ make_multiple_sandbox --how_many_nodes=10 5.7.4
installing node 1
installing node 2
installing node 3
installing node 4
installing node 5
installing node 6
installing node 7
installing node 8
installing node 9
installing node 10
group directory installed in $HOME/sandboxes/multi_msb_5_7_4

$ ~/sandboxes/use_all 'select @@port, @@version'
# server: 1:
@@port  @@version
14015   5.6.14-log
# server: 2:
@@port  @@version
14016   5.6.14-log
# server: 3:
@@port  @@version
14017   5.6.14-log
# server: 4:
@@port  @@version
14018   5.6.14-log
# server: 5:
@@port  @@version
14019   5.6.14-log
# server: 6:
@@port  @@version
14020   5.6.14-log
# server: 7:
@@port  @@version
14021   5.6.14-log
# server: 8:
@@port  @@version
14022   5.6.14-log
# server: 9:
@@port  @@version
14023   5.6.14-log
# server: 10:
@@port  @@version
14024   5.6.14-log
# server: 1:
@@port  @@version
7975    5.7.4-m14-log
# server: 2:
@@port  @@version
7976    5.7.4-m14-log
# server: 3:
@@port  @@version
7977    5.7.4-m14-log
# server: 4:
@@port  @@version
7978    5.7.4-m14-log
# server: 5:
@@port  @@version
7979    5.7.4-m14-log
# server: 6:
@@port  @@version
7980    5.7.4-m14-log
# server: 7:
@@port  @@version
7981    5.7.4-m14-log
# server: 8:
@@port  @@version
7982    5.7.4-m14-log
# server: 9:
@@port  @@version
7983    5.7.4-m14-log
# server: 10:
@@port  @@version
7984    5.7.4-m14-log

This worked fine. Then I removed all the instances, and tried again with MariaDB

 $ sbtool -o delete -s ~/sandboxes/multi_msb_5_6_14/ 
 ...
 $ sbtool -o delete -s ~/sandboxes/multi_msb_5_7_4/
 ...

With MariaDB 10, the installation failed after the 4th node.

$ make_multiple_sandbox --how_many_nodes=10 10.0.11
installing node 1
installing node 2
installing node 3
installing node 4
error while creating grant tables
Installing MariaDB/MySQL system tables in '/home/tungsten/sandboxes/multi_msb_10_0_11/node4/data' ...
140604  8:27:14 [Note] InnoDB: Using mutexes to ref count buffer pool pages
140604  8:27:14 [Note] InnoDB: The InnoDB memory heap is disabled
140604  8:27:14 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
140604  8:27:14 [Note] InnoDB: Compressed tables use zlib 1.2.3
140604  8:27:14 [Note] InnoDB: Using Linux native AIO
140604  8:27:14 [Note] InnoDB: Using CPU crc32 instructions
2014-06-04 08:27:14 7f207d353780 InnoDB: Warning: io_setup() failed with EAGAIN. Will make 5 attempts before giving up.
InnoDB: Warning: io_setup() attempt 1 failed.
InnoDB: Warning: io_setup() attempt 2 failed.
InnoDB: Warning: io_setup() attempt 3 failed.
InnoDB: Warning: io_setup() attempt 4 failed.
InnoDB: Warning: io_setup() attempt 5 failed.
2014-06-04 08:27:16 7f207d353780 InnoDB: Error: io_setup() failed with EAGAIN after 5 attempts.
InnoDB: You can disable Linux Native AIO by setting innodb_use_native_aio = 0 in my.cnf
InnoDB: Warning: Linux Native AIO disabled because os_aio_linux_create_io_ctx() failed. To get rid of this warning you can try increasing system fs.aio-max-nr to 1048576 or larger or setting innodb_use_native_aio = 0 in my.cnf
140604  8:27:16 [Note] InnoDB: Initializing buffer pool, size = 128.0M
140604  8:27:16 [Note] InnoDB: Completed initialization of buffer pool
140604  8:27:16 [Note] InnoDB: The first specified data file ./ibdata1 did not exist: a new database to be created!
140604  8:27:16 [Note] InnoDB: Setting file ./ibdata1 size to 12 MB
140604  8:27:16 [Note] InnoDB: Database physically writes the file full: wait...
140604  8:27:16 [Note] InnoDB: Setting log file ./ib_logfile101 size to 48 MB
140604  8:27:17 [Note] InnoDB: Setting log file ./ib_logfile1 size to 48 MB
140604  8:27:18 [Note] InnoDB: Renaming log file ./ib_logfile101 to ./ib_logfile0
140604  8:27:18 [Warning] InnoDB: New log files created, LSN=45781
140604  8:27:18 [Note] InnoDB: Doublewrite buffer not found: creating new
140604  8:27:18 [Note] InnoDB: Doublewrite buffer created
140604  8:27:18 [Note] InnoDB: 128 rollback segment(s) are active.
140604  8:27:19 [Warning] InnoDB: Creating foreign key constraint system tables.
140604  8:27:19 [Note] InnoDB: Foreign key constraint system tables created
140604  8:27:19 [Note] InnoDB: Creating tablespace and datafile system tables.
140604  8:27:19 [Note] InnoDB: Tablespace and datafile system tables created.
140604  8:27:19 [Note] InnoDB: Waiting for purge to start
140604  8:27:19 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.17-65.0 started; log sequence number 0
140604  8:27:24 [Note] InnoDB: FTS optimize thread exiting.
140604  8:27:24 [Note] InnoDB: Starting shutdown...
140604  8:27:25 [Note] InnoDB: Shutdown completed; log sequence number 1616697
OK
Filling help tables...
140604  8:27:25 [Note] InnoDB: Using mutexes to ref count buffer pool pages
140604  8:27:25 [Note] InnoDB: The InnoDB memory heap is disabled
140604  8:27:25 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
140604  8:27:25 [Note] InnoDB: Compressed tables use zlib 1.2.3
140604  8:27:25 [Note] InnoDB: Using Linux native AIO
140604  8:27:25 [Note] InnoDB: Using CPU crc32 instructions
2014-06-04 08:27:25 7f12bb0e9780 InnoDB: Warning: io_setup() failed with EAGAIN. Will make 5 attempts before giving up.
InnoDB: Warning: io_setup() attempt 1 failed.
InnoDB: Warning: io_setup() attempt 2 failed.
InnoDB: Warning: io_setup() attempt 3 failed.
InnoDB: Warning: io_setup() attempt 4 failed.
InnoDB: Warning: io_setup() attempt 5 failed.
2014-06-04 08:27:28 7f12bb0e9780 InnoDB: Error: io_setup() failed with EAGAIN after 5 attempts.
InnoDB: You can disable Linux Native AIO by setting innodb_use_native_aio = 0 in my.cnf
InnoDB: Warning: Linux Native AIO disabled because os_aio_linux_create_io_ctx() failed. To get rid of this warning you can try increasing system fs.aio-max-nr to 1048576 or larger or setting innodb_use_native_aio = 0 in my.cnf
140604  8:27:28 [Note] InnoDB: Initializing buffer pool, size = 128.0M
140604  8:27:28 [Note] InnoDB: Completed initialization of buffer pool
140604  8:27:28 [Note] InnoDB: Highest supported file format is Barracuda.
140604  8:27:28 [Note] InnoDB: 128 rollback segment(s) are active.
140604  8:27:28 [Note] InnoDB: Waiting for purge to start
140604  8:27:28 [Note] InnoDB:  Percona XtraDB (http://www.percona.com) 5.6.17-65.0 started; log sequence number 1616697
140604  8:27:28 [Note] InnoDB: FTS optimize thread exiting.
140604  8:27:28 [Note] InnoDB: Starting shutdown...
140604  8:27:30 [Note] InnoDB: Shutdown completed; log sequence number 1616707

This smells like a bug. BTW, the installation fails with both MariaDB 10.0.10 and 10.0.11, and only on Ubuntu Linux. I can install 10 instances just fine on Mac OSX. I haven’t tried with CentOS.

Friday, May 02, 2014

MySQL defaults evolution

MySQL, the original brand, the one developed by the MySQL team at Oracle, is steadily evolving. You can feel it if you try every new release that comes out of the milestone release cycle. Or even if you don’t try all of them, just testing a release once in a while gives you something to think about.

The engineers at Oracle are trying hard to improve the defaults. If you are the out-of-the-box type, and just install the new version on top of the previous one, leaving the same setup in place, you may be up for a for a few surprises. It’s the marketing, see? They tell you that just by replacing your old MySQL (5.1 or 5.5) with MySQL 5.6 you get 30% to 70% performance improvement. Which happens to be true, not only because the server is better, but also because they have changed the defaults. However, this change in defaults may come with some serious consequences for the ones who skip the release notes.

An annoying consequence of the MySQL team focusing on security is that in MySQL 5.6 you get a mandatory warning if you use a password in the command line. On one hand, it’s a good thing, because they force you to use better security practices. On the other hand, it’s a royal PITA, because many applications are broken because of this warning, just by replacing MySQL 5.1 or 5.5 with 5.6. There are solutions. For example, you can adopt the new mysql_config_editor to handle your password, but that would break compatibility with previous MySQL versions. Rewriting complex procedures to use configuration files instead of username and passwords is tricky, especially if you are testing exactly the behavior of using a password on the command line to override the contents of an options file.

INTERMISSION: this is a point of contention with the MySQL team. They have started a trend of introducing features that will prevent working smoothly with previous versions of MySQL. Up to MySQL 5.5, installing a server and staring using it was a set of steps that would work in the same way regardless of the version. With MySQL 5.6, all bets are over. When you install a new server, you get a long list of unwanted messages to the error output (which is fortunately fixed in MySQL 5.7), and then you get the warning if you use a password on the command line. For me, and for many developers who build software related to MySQL, the ability of writing a piece of software that works well with any version is paramount. The MySQL team seems to think that users will be happy to throw everything to the wind and start writing new code for MySQL 5.6 only instead of reusing what was working until the month before. And let’s be clear: I fully understand the need of moving forward, but I don’t understand the need of trapping users in the new behavior without remedy.

Back to the defaults. What else is new? One good improvement in MySQL 5.6 is a change in the default value for SQL_MODE. Up to MySQL 5.5, it was an empty string. In MySQL 5.6.6 and later it is ‘NO_ENGINE_SUBSTITUTION.’ Can this change have side effects? Yes. Code that worked before may break. IMO, it’s a good thing, because getting an error when trying to create a table of a non-existing engine is better than having the table created silently with the default engine. I can, however, think of at least one case where a silent substitution is desirable, as I have seen in action at a customer’s deployment. That aside, one wonders why they did not go the extra mile and add STRICT_TRANS_TABLES (or even STRICT_ALL_TABLES) to the default. It turned out that they did it… and didn’t. When you install a new data directory using mysql_install_db, the procedure creates a my.cnf file in the $BASEDIR (the place where your mysql tarball was expanded), containing the line

sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES 

However, if you have skipped the release notes, you will be left wondering where does this additional directive come from, since the manual mentions only one of them, and SHOW VARIABLES tells you that SQL_MODE contains two values.

MySQL 5.7 has also changed something. There is a lot of cleanup going on. Options and variables that were deprecated long ago suddenly disappear. Did you ever use ‘key-buffer’ as a short for ‘key-buffer-size’? If you did, that directive in the my.cnf won’t work anymore. (I could not find it in the release notes, but the test suite for MySQL Sandbox suddenly stopped working when I tried MySQL 5.7 and then I realized what was happening.) More to the point, though, is the installation procedure. In MySQL 5.6 there is a –random-passwords option that generates a random password for the server, and you can’t do anything with root until you use such random password to access the server and change the password to something else. This is an improvement over the old and despicable root without password, which has been the default since the very beginning of MySQL, and it’s been the source of many security nightmares and interesting bugs. In MySQL 5.7.4, this behavior, i.e. the generation of a random password during the installation, is now the default. It is good news, because the old behavior was a disaster, but if you have an automated way of dealing with installation, there will be more hard work in front of you to handle the changes. The implementation is not script friendly, and definitely nightmarish if you want to install several instances of the server in the same host. What happens when you install MySQL 5.7.4? The installation software generates a random password, and writes it to a file named .mysql_secret in your $HOME directory. A sample file that was created by the installer would be:

# The random password set for the root user at Mon Mar 31 10:16:54 2014 (local time):
2X7,S4PGkIg=H(lJ


EOF

If you wanted a script to read the password generated by this procedure, it would be a mistake to look for the second line. In fact, if you repeat the installation on the same host, you get something like this:

# The random password set for the root user at Mon Mar 31 10:16:54 2014 (local time):
2X7,S4PGkIg=H(lJ


# The random password set for the root user at Tue Apr 29 09:35:07 2014 (local time):
_S07zDt7dQ=,sxw9


# The random password set for the root user at Tue Apr 29 09:42:19 2014 (local time):
r[yn4ND$-5p,4q}5


EOF

Then the correct approach would be looking for the last non empty line in the file. However, if you were installing several instances in the same host (such as MySQL Sandbox does) you wouldn’t be able to find which password belongs to which server. Admittedly, multiple instances of the same server is not what most users do, but since it breaks MySQL Sandbox, which is used by many, I mention it here. BTW, MySQL Sandbox 3.0.44 has a temporary fix for this behavior. If you install MySQL 5.7.4 or later, it will include –skip-random-passwords, and defaults to the old password-less installation. There is a hackish workaround for the above uncompromising design and I will add it to MySQL Sandbox unless the next MySQL version introduces an optional path for the .mysql_secret file.

Summing up, there are mostly good improvements from the MySQL team, although the attention to usability is still low. They are whipping users into better security. A gentler approach would be appreciated. Sometimes I see a post from the community team asking for feedback on some feature being deprecated or modified. I would welcome such requests on matters that affect the default behavior of everyday tools.

Thursday, February 13, 2014

On the road again - FOSSAsia

On the road again - FOSSAsia

It has been a few busy months until now. I have moved from Italy to Thailand, and the move has been my first priority, keeping me from attending FOSDEM and interacting with social media. Now I start catching my breath, and looking around for new events to attend. But before I get into this, let’s make a few things clear:

  • I am still working for Continuent. Actually, it’s because of my company flexibility that I could move to a different country (a different continent, 6 time zones away) without much trouble. Thanks, Continuent! (BTW: Continuent is hiring! )
  • I am still involved with MySQL activities, events, and community matters. I just happen to be in a different time zone, where direct talk with people in Europe and US need to happen on a different schedule.

I am already committed to attend Percona Live MySQL Conference & Expo in Santa Clara, where I will present a tutorial on MySQL replication features and a regular session on multi-master topologies with Tungsten.

But in the meantime, Colin encouraged me to submit talk proposals at FOSSAsia, and both my submissions were accepted. So, at the end of February I will be talking about some of my favorite topics:

  • Easy MySQL multi master replication with Tungsten
  • Data in the cloud: mastering the ephemeral

The exact schedule will be announced shortly. I am eager to attend an open source event in Asia. It’s been a long time since I went to a similar event in Malaysia, which was much pleasant.

Thursday, January 16, 2014

PerconaLive 2014 program is published

PerconaLive 2014 program is published

Percona Live MySQL Conference and Expo, April 1-4, 2014

After a few months of submissions and reviews, the program for PerconaLive MySQL Conference and Expo 2014 is published. The conference will be held in Santa Clara, from April 1 to 4, 2014.

Registration with early bird discount is available until February 2nd. If you plan to attend, this is probably the best time to act.

I will be presenting twice at the conference:

Notice that the Call for Participation is still open for lightning talks and BoF. You can submit a talk until the end of January.

Tuesday, January 07, 2014

Multiple masters : attraction to the stars

In the last 10 years I have worked a lot with replication systems, and I have developed a keen interest in the topic of multiple masters in a single cluster. My interest has a two distinct origins:

  • On one hand, I have interacted countless times with users who want to use a replication system as a drop-in replacement for a single server. In many cases, especially when users are dealing with applications that are not much flexible or modular, this means that the replication system must have several points of data entry, and such points must work independently and in symbiosis with the rest of the nodes.
  • On the other hand, I am a technology lover (look it up in the dictionary: it is spelled geek), and as such I get my curiosity stirred whenever I discover a new possibility of implementing multi-master systems.

The double nature of this professional curiosity makes me sometimes forget that the ultimate goal of technology is to improve the users life. I may fall in love with a cute design or a clever implementation of an idea, but that cleverness must eventually meet with usability, or else it loses its appeal. There are areas where the distinction between usefulness and cleverness is clear cut. And there are others where we really don’t know where we stand because there are so many variables involved.

One of such cases is a star topology, where you have many master nodes, which are connected to each other through a hub. You can consider it a bi-directional master/slave. If you take a master/slave topology, and make every node able to replicate back to the master, then you have almost a star. To make it complete, you also need to add the ability of the master of broadcasting the changes received from the outside nodes, so that every node gets the changes from every other node. Compared to other popular topologies, say point-to-point all-masters, and circular replication, the star topology has the distinct advantage of requiring less connections, and of making it very easy to add a new node.

Star

Figure #1: Star topology

However, anyone can see immediately one disadvantage of the star topology: the hub is the cornerstone of the cluster. It’s a single point of failure (SPOF). If the hub fails, there is no replication anywhere. Period. Therefore, when you are considering a multi-master topology, you have to weigh in the advantages and disadvantages of the star, and usually you consider the SPOF as the most important element to consider.

Depending on which technology you choose, though, there is also another important element to consider, i.e. that data must be replicated twice when you use a star topology. It’s mostly the same thing that happens in a circular replication. If you have nodes A, B, C, and D, and you write data in A, the data is replicated three times before it reaches D (A->B, B->C, and C->D). A star topology is similar. In a system where A, B, and D are terminal nodes, and C is the hub, data needs to travel twice before it reaches D (A->C, C->D). Circular replication

Figure #2: Circular replication

This double transfer is bad for two reasons: it affects performance, and it opens to the risk of unexpected transformations of data. Let’s explore this concept a bit. When we replicate data from a master to a slave, there is little risk of mischief. The data goes from the source to a reproducer. If we use row-based-replication, there is little risk of getting the wrong data in the slave. If we make the slave replicate to a further slave, we need to apply the data, generate a further binary log in the slave host, and replicate data from that second binary log. We can deal with that, but at the price of taking into account more details, like where the data came from, when to stop replicating in a loop, whether the data was created with a given configuration set, and so on. In short, if your slave server has been configured differently from the master, chances are that the data down the line may be different. In a star topology, this translates into the possibility of data in each spoke to be replicated correctly in the hub, but to be possibly different in the other spokes.

Compare this with a point-to-point all-masters. In this topology, there are no SPOFs. You pay for this privilege by having to set a higher number of connections between nodes (every node must connect to every other node), but there is no second hand replication. Before being applied to the slave service, the data is applied only once in the originating master.

Point to point all masters

Figure #2: Point-to-point all-masters topology

Where do I want to go from all the above points? I have reached the conclusion that, much as user like star topologies, because of their simplicity, I find myself often recommending the more complex but more solid point-t-point all-masters setup. Admittedly, the risk of data corruption is minimal. The real spoiler in most scenarios is performance. When users realize that the same load will flow effortlessly in a point-to-point scenario, but cause slave lags in a star topology, then the choice is easy to make. If you use row-based replication, and in a complex topology it is often a necessary requirement, the lag grows to a point where it becomes unbearable.

As I said in the beginning, all depends on the use case: if the data load is not too big, a star topology will run just as fine as point-to-point, and if the data flow is well designed, the risk of bad data transformation becomes negligible. Yet, the full extent of star topologies weaknesses must be taken into account when designing a new system. Sometimes, investing some effort into deploying a point-to-point all-masters topology pays off in the medium to long term. Of course, you can prove that only if you deploy a star and try it out with the same load. If you deploy it on a staging environment, no harm is done. If you deploy in production, then you may regret. In the end, it all boils down to my mantra: don’t trust the theory, but test, test, test.

Thursday, December 12, 2013

Quick and dirty concurrent operations from the shell

Let’s say that you want to measure something in your database, and for that you need several operations to happen in parallel. If you have a capable programming language at your disposal (Perl, Python, Ruby, PHP, or Java would fit the bill) you can code a test that sends several transactions in parallel.

But if all you have is the shell and the mysql client, things can be trickier. Today I needed such a parallel result, and I only had mysql and bash to accomplish the task.

In the shell, it’s easy to run a loop:

for N in $(seq 1 10)
do
    mysql -h host1 -e "insert into sometable values($N)" 
done

But this does run queries sequentially, and each session will open and close before the next one starts. Therefore there is no concurrency at all.
Then I thought that the method for parallel execution in the shell is to run things in the background, and then collect the results. I just needed to find a way of keeping the first session open while the others are being started.

Here’s what I did: I ran a loop with a countdown, using the seq command, and I included a sleep statement in each query, waiting for a decreasing amount of seconds. If I start with 10 seconds, the first query will sleep for 10 seconds, the second one for 9 seconds, and so on. I will run each command in the background, so they will eat up the time independently.

#!/bin/bash
mysql -h host1 test -e 'drop table if exists t1'
mysql -h host1 test -e 'create table t1 (i int not null primary key, ts timestamp)'

for N in $(seq 10 -1 1)
do
    query1="set autocommit=0"
    query2="insert into test.t1 (i) values($N)"
    query3="select sleep($N) into @a; commit"
    mysql -h host1 -e "$query1;$query2;$query3" &
done

wait

mysql -h host1 test -e 'select * from t1'

The effect of this small script is that the commit for these 10 commands come at the same time, as you can see from the resulting table:

+----+---------------------+
| i  | ts                  |
+----+---------------------+
|  1 | 2013-12-12 18:08:00 |
|  2 | 2013-12-12 18:08:00 |
|  3 | 2013-12-12 18:08:00 |
|  4 | 2013-12-12 18:08:00 |
|  5 | 2013-12-12 18:08:00 |
|  6 | 2013-12-12 18:08:00 |
|  7 | 2013-12-12 18:08:00 |
|  8 | 2013-12-12 18:08:00 |
|  9 | 2013-12-12 18:08:00 |
| 10 | 2013-12-12 18:08:00 |
+----+---------------------+

This is a very good result, but what happens if I need to run 500 queries simultaneously, instead of 10? I don’t want to wait 500 seconds (8+ minutes). So I made an improvement:

for N in $(seq 5000 -10 1)
do
    echo $N
    query1="set autocommit=0"
    query2="insert into test.t1 (i) values($N)"
    query3="select sleep(concat('0.', lpad($N,4,'0'))) into @a; commit"
    mysql -h host1 -e "$query1;$query2;$query3" &
done

Now each SLEEP command is called with a fractional argument, starting at “0.5000”, and continuing with “0.4999,” and so on. You can try it. All 500 rows are committed at the same time.

However, the same time is a bit fuzzy. When we use timestamps with second granularity, it’s quite easy to show the same time. But with microseconds it’s a different story. Here’s what happens if I use MySQL 5.6 with timestamp columns using microseconds (TIMESTAMP(3)):

+----+-------------------------+
| i  | ts                      |
+----+-------------------------+
|  1 | 2013-12-12 18:27:24.070 |
|  2 | 2013-12-12 18:27:24.070 |
|  3 | 2013-12-12 18:27:24.069 |
|  4 | 2013-12-12 18:27:24.068 |
|  5 | 2013-12-12 18:27:24.065 |
|  6 | 2013-12-12 18:27:24.066 |
|  7 | 2013-12-12 18:27:24.062 |
|  8 | 2013-12-12 18:27:24.064 |
|  9 | 2013-12-12 18:27:24.064 |
| 10 | 2013-12-12 18:27:24.064 |
+----+-------------------------+

For the purpose of my tests (the actual queries were different) this is not an issue. Your mileage may vary.

Monday, December 09, 2013

Old and new MySQL verbosity

I was pleased to see Morgan’s announcement about a fix to an old problem of mine. In March 2012 I complained about MySQL verbosity during installation.

In MySQL 5.7.3, this behavior was changed. While the default is still as loud as it can, you can now add an option (log_error_verbosity) to send only errors to STDERR, which allows you to hide the output of mysql_install_db, and still get the errors, if they occur.

Well done!

However, the same obnoxious verbosity is also in MariaDB 10.0.x. Since I discussed this specific bug with a few MariaDB developers early in 2012, I was disappointed to see this same output when running mysql_install_db with MariaDB. Here’s the same appeal: MariaDB developers, please fix this usability issue!

And now, for the laughing notes. All versions of MySQL available now, from Oracle, Percona, MariaDB, list this line when installing with mysql_install_db:

Please report any problems with the './scripts/mysqlbug' script!

There is Bug#29716 that was reported in 2008, about mysqlbug being unnecessary (by then, it had been obsolete for 2 or 3 years already), with a patch submitted but not committed. So, in 2013, we still see a reference to a tool that has ceased working for at least 8 years. It should not take much to remove this line and replace it with an appropriate link to the bugs system.

Sunday, December 08, 2013

Submissions at Percona Live Santa Clara 2014 and Lightning talks

The call for participation at Percona Live MySQL Conference and Expo 2014 is now closed. There have been more than 320 submissions, and this will keep the review committee busy for a while.

One important point for everyone who has submitted: if you have submitted a proposal but haven’t included a bio in your account, do it now. If you don’t, your chances of being taken seriously are greatly reduced. To add a bio, go to your account page and fill in the Biography field. Including a picture is not mandatory, but it will be definitely appreciated.

Although the CfP is closed for tutorials and regular sessions, your chances of becoming a celebrity are not over yet. The CfP is still open for Lightning talks and Bird of a Feather sessions.

If you want to submit a lightning talk, you still have time until the end of January. Don’t forget to read the instructions and remember that lightning talks don’t give you a free pass, but a healthy 20% discount.

So far, I have received 16 proposals. Of these, 6 have been rated highly enough to guarantee acceptance (including mine, for which I have not voted.) We still have 6 spots to fill (12 spots in total, 5 minutes each,) and I’d rather fill them with talks that appeal to everyone in the committee, than scrap the barrel of the mediocre ones. My unofficial goal is to have so many good submissions that I will have to withdraw my own talk. Thus, the potential number of available spots is 7. Please kick my talk off stage, by submitting outstanding proposals!

Friday, November 15, 2013

Parallel replication: off by one

One of the most common errors in development is where a loop or a retrieval by index falls short or long by one unit, usually because of an oversight or a logic in coding.

Of the following snippets, which one will run 10 times?

/* #1 */    for (N = 0 ; N < 10; N++) printf("%d\n", N);

/* #2 */    for (N = 0 ; N <= 10; N++) printf("%d\n", N); 

/* #3 */    for (N = 1 ; N <= 10; N++) printf("%d\n", N);

/* #4 */    for (N = 1 ; N < 10; N++) printf("%d\n", N);

The question is deceptive, as there are two snippets that will run 10 times (1 and 3). But they will print different numbers. If you ware aiming for numbers from 1 to 10, only #3 is good.

After many years of programming, off-by-one errors are rare in my code, and I have been able to spot them or prevent them at first sight. That’s why I feel uneasy when I look at the way parallel replication is enabled in MySQL 5.6,5.7 and MariaDB 10.0.5. In both cases, there is a variable that sets the number of replication threads:

set global slave_parallel_workers=5 in MySQL
set global slave_parallel_threads=5 in mariadb

Yet, for both implementations, you can set the number of threads to 1, and it is not the same as disabling parallel replication.

set global slave_parallel_workers=1 in MySQL
set global slave_parallel_threads=1 in mariadb

It will run parallel replication with one thread, meaning that you will have all the overhead of parallel replication with none of the benefits. Not only that, but replication actually slows down. The extra channel reduces performance by 7% in MariaDB and 10% in MySQL.

Now for the punch line. In Tungsten-Replicator, to disable parallel replication you set the number of channels to 1 (the intuitive value). If you set it to 0, the setup fails, as it should, since there would be no replication without channels. The reason for the fit is that in Tungsten, parallel replication was designed around the core functionality, while in MySQl and MariaDB it is an added feature that struggles to be integrated.

Wednesday, November 13, 2013

Call for papers (with lightning talks): Percona Live MySQL Conference 2014

The call for participation for Percona Live MySQL Conference 2014 is still open. As part of the review committee, I will be looking at the proposals, and I hope to see many interesting ones.

There is a novelty in the submission form. In addition to tutorials and regular sessions, now you can submit proposals for lightning talks, to which I am particularly interested, as I have organized the lightning talks in the past two editions, and I am in charge to continue the tradition for the next one.

If you want to be a speaker at the conference, here are some tips to get your proposal accepted:

  • Propose a topic that you know well;
  • Take some time to write a well thought and meaningful proposal: nothing gets me faster to the rejection button than statements like “I want to talk about X, I will think of something to say”;
  • Write your proposal with the attendees in mind, i.e. giving information that will make them want to see your presentation;
  • But also write with the committee in mind. There is a space for private messages to the reviewers. Use it wisely if there is something that we need to know.
  • Mind your buzzwords. I am not easily impressed by fashionable topics. Back your proposal with sound reasoning. Don’t assume that I, or anyone in the committee, see things your way, or the way they are reported in the press.
  • Check your spelling. Another way of getting rejected quickly is when you misspell the topic you claim to be an expert of.
  • And check your spelling again. If you miss the difference between “know its shit” and “know it’s shit,” I am less inclined to approve.
  • Write a sensible bio. We need to know who are you and what you do, to see if your story is compatible with your proposal.
  • Write enough to make your proposal clear. A proposal that is shorter than your bio will raise a red flag. But do not write too much. You are writing a proposal, not an article on the matter. If you have written an interesting article on the topic, give us an URL.

Regarding the lightning talks, I have some more recommendations.

  • A lightning talk last 5 minutes maximum. Don’t propose a topic that cannot be exhausted in that timeframe.
  • An accepted lightning talk does not give you a free pass (unless you are also accepted as speaker for a regular talk). You will be given a code to register at a 15% discount.
  • You should propose something that it is either highly interesting, or surprising, or entertaining, or all the above: the lightning talks are a show.
  • Be daring in your proposals. While a regular talk might be refused if you propose to sing the InnoDB settings, a LT on this topic could be seen as legitimate (but you must demonstrate that you can do it!)
  • Convince me (specifically, as I will be choosing the talks accepted by the committee) that you want to be on stage and have the abilities for the show.
  • Be prepared to show your slides earlier than usual. As the organizer, I need to make sure that you have something meaningful to show before sending you on stage.
  • Be aware of the rules:
    • All slides will be loaded into a single computer, to minimize delays between talks;
    • Someone (probably me) will nag the speakers until they either surrender their slides or escape to Mexico;
    • All speakers will meet 15 minutes before the start, and be given the presentation order. Missing speakers will be replaced by reserve speakers;
    • The speaker will have 5 minutes to deliver the talk.
    • When one minute is left, there will be a light sound to remind of the remaining time.
    • When 10 seconds are left, most likely the audience will start chanting the countdown.
    • When the time is finished, the speaker must leave the podium to the next one.

If you have reached this point, you are ready to submit a proposal!

See also:

Vote on Planet MySQL