Feed aggregator
Replication in MySQL 5.6: GTIDs benefits and limitations – Part 2
The main benefit of using GTIDs is to have much easier failover than with file-based replication. We will see how to change the replication topology when using GTID-based replication. That will show where GTIDs shine and where improvements are expected.
This is the second post of a series of articles focused on MySQL 5.6 GTIDs. You can find part one here.
Our goal will be to go from setup #1 to setup #2 on the picture below, following various scenarios:
For these tests, all servers are running on 127.0.0.1 with ports ranging from 10000 for s0 to 10004 for s4.
Scenario #1: All slaves have processed all the writes
This is the easiest case: we will make s2 a master and redirect replication on the other servers to s2. This scenario can happen when you want to perform a planned failover.
With GTIDs, all the operations are straightforward:
# For s2 (the new master), we remove its configuration as a slave
s2> stop slave;
s2> reset slave all;
# For s0
s0> change master to master_host='127.0.0.1',master_user='rsandbox',master_password='rsandbox',master_port=10002,master_auto_position=1;
s0> start slave;
# For s1, s3 and s4
mysql> stop slave;
mysql> change master to master_port=10002;
mysql> start slave;
Those of you who have already done these operations with file-based replication know that they are usually very tedious and that the binlog file/binlog position must be recorded with care if you don’t want to break replication or corrupt your data.
Scenario #2: One of the slaves is behind
Now let’s imagine that s0 has crashed, and that s1 has not received all writes (and therefore s3 and s4 are also lagging behind).
s2> select count(*) from t;
+----------+
| count(*) |
+----------+
|        2 |
+----------+
# s1 is behind
s1> select count(*) from t;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
Can we still use master_auto_position = 1? Let’s hope so, as it is one of the ideas of GTIDs: having a monotonically incrementing identifier for each event across the cluster.
Notice that this is the same problem for s0 (which will be late when it comes back) and s1, s3 and s4.
Let’s give it a try!
# For s0, s1, s3, s4
mysql> stop slave;
mysql> change master to master_port=10002;
mysql> start slave;
# And then check the number of records from the t table
s1> select count(*) from t;
+----------+
| count(*) |
+----------+
|        2 |
+----------+
Great! So again, using GTIDs avoids the tedious work of looking for the binlog position of a specific event. The only part where we should pay attention is the server we choose for promotion: if it is not up-to-date, data may be lost or replication may be broken.
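To make the "is this server up-to-date?" check concrete, here is a minimal Python sketch that compares executed GTID sets and lists the servers whose set contains everybody else's. The function names and the sample values are hypothetical, and the parser only handles the plain `uuid:n` and `uuid:first-last` notation.

```python
def parse_gtid_set(text):
    """Parse 'uuid:1-4:7,uuid2:1' into {uuid: set of transaction numbers}."""
    result = {}
    for part in text.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue
        uuid, *intervals = part.split(":")
        numbers = result.setdefault(uuid, set())
        for interval in intervals:
            first, _, last = interval.partition("-")
            # Expanding ranges is fine for a sketch, wasteful for huge sets.
            numbers.update(range(int(first), int(last or first) + 1))
    return result


def is_subset(candidate, reference):
    """True if every transaction in `candidate` is also in `reference`."""
    a, b = parse_gtid_set(candidate), parse_gtid_set(reference)
    return all(numbers <= b.get(uuid, set()) for uuid, numbers in a.items())


def most_up_to_date(executed_by_server):
    """Return the servers whose executed set contains every other set."""
    return [
        name
        for name, gtids in executed_by_server.items()
        if all(is_subset(other, gtids) for other in executed_by_server.values())
    ]


# Hypothetical values, for illustration only.
executed = {
    "s1": "219be3a9-c3ae-11e2-b985-0800272864ba:1",
    "s2": "219be3a9-c3ae-11e2-b985-0800272864ba:1-3",
}
print(most_up_to_date(executed))
```

On a live MySQL 5.6 server the built-in GTID_SUBSET() function performs the same containment check, so a script like this is mostly useful when you only have the SHOW SLAVE STATUS output at hand.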
Scenario #3: The master has crashed before sending all writes
If the binary logs of the master are no longer readable, you will probably lose the events that have not been sent to the slaves (your last chance is to be able to recover data from the crashed master, but that’s another story). In this case, you will have to promote the most up-to-date slave and reconfigure the other slaves as we did above.
So we will suppose that we can read the binary logs of the crashed master. The first thing to do after choosing which slave will be the new master is to recover the missing events with mysqlbinlog.
Let’s say that we want to promote s1 as the new master. We need to know the coordinates of the last event executed:
s1> show slave status\G
[...]
Executed_Gtid_Set: 219be3a9-c3ae-11e2-b985-0800272864ba:1,
3d3871d1-c3ae-11e2-b986-0800272864ba:1-4
We can see that it’s not obvious which was the last executed event: is it 219be3a9-c3ae-11e2-b985-0800272864ba:1 or 3d3871d1-c3ae-11e2-b986-0800272864ba:4? A ‘Last_Executed_GTID’ column would have been useful.
In our case we can check that 3d3871d1-c3ae-11e2-b986-0800272864ba is the server UUID of s2, and that the other one is from s0 (for s0, which has crashed, the server UUID can be read in the auto.cnf file in the datadir).
So the last executed event is 219be3a9-c3ae-11e2-b985-0800272864ba:1. How can I instruct mysqlbinlog to start reading from there? Unfortunately, there is no --start-gtid-position option or equivalent. See bug #68566.
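Since there is no ‘Last_Executed_GTID’ column, the lookup has to be done by hand or scripted. A minimal Python sketch, assuming you already know the server UUID of the crashed master and that the highest transaction number recorded for that UUID is the last one executed (the function name is hypothetical):

```python
def last_transaction(executed_gtid_set, server_uuid):
    """Return the highest transaction number recorded for server_uuid, or None."""
    for part in executed_gtid_set.split(","):
        uuid, *intervals = part.strip().split(":")
        if uuid != server_uuid or not intervals:
            continue
        # Each interval is either "n" or "first-last"; keep the largest end.
        return max(int(interval.split("-")[-1]) for interval in intervals)
    return None


# Executed_Gtid_Set as reported by s1 in the output above.
executed = (
    "219be3a9-c3ae-11e2-b985-0800272864ba:1,\n"
    "3d3871d1-c3ae-11e2-b986-0800272864ba:1-4"
)
s0_uuid = "219be3a9-c3ae-11e2-b985-0800272864ba"
print(f"{s0_uuid}:{last_transaction(executed, s0_uuid)}")
```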
Does it mean that we cannot easily recover the data with mysqlbinlog? There is a solution of course, but very poor in my opinion: look for the binlog file/position of the last executed event and use mysqlbinlog with the good old --start-position option! Even with GTIDs, you cannot totally forget old-style replication positioning.
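Looking for the file position can at least be scripted. The Python sketch below scans the text output of mysqlbinlog and returns the offset of the first GTID event that follows a given GTID. It assumes the dump contains `# at <offset>` lines before each event and `SET @@SESSION.GTID_NEXT= '<gtid>'` lines for each transaction; the sample dump is a hypothetical, heavily shortened excerpt, so check the patterns against your own output first.

```python
import re

# Patterns assumed from the text output of mysqlbinlog; adjust if yours differs.
GTID_RE = re.compile(r"SET @@SESSION\.GTID_NEXT= ?'([^']+)'")
POS_RE = re.compile(r"^# at (\d+)")


def start_position_after(dump_lines, last_gtid):
    """Return the offset of the first GTID event after last_gtid, or None."""
    position = None   # offset announced by the most recent "# at" line
    found = False     # becomes True once last_gtid has been seen
    for line in dump_lines:
        at = POS_RE.match(line)
        if at:
            position = int(at.group(1))
            continue
        gtid = GTID_RE.search(line)
        # Ignore non-GTID lines and the trailing GTID_NEXT='AUTOMATIC' reset.
        if gtid is None or gtid.group(1) == "AUTOMATIC":
            continue
        if found:
            return position
        found = gtid.group(1) == last_gtid
    return None


# Hypothetical, shortened dump for illustration only.
sample = """\
# at 151
#130523 10:00:00 server id 1  end_log_pos 199  GTID [commit=yes]
SET @@SESSION.GTID_NEXT= '219be3a9-c3ae-11e2-b985-0800272864ba:1'/*!*/;
# at 199
BEGIN
# at 400
#130523 10:00:05 server id 1  end_log_pos 448  GTID [commit=yes]
SET @@SESSION.GTID_NEXT= '219be3a9-c3ae-11e2-b985-0800272864ba:2'/*!*/;
# at 448
BEGIN
""".splitlines()

print(start_position_after(sample, "219be3a9-c3ae-11e2-b985-0800272864ba:1"))
```

The returned offset is the value you would then hand to mysqlbinlog with --start-position. A result of None means no later transaction was found in that file.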
Conclusion
Reconfiguring replication when using GTIDs is usually straightforward: just connect the slave to the correct master with master_auto_position = 1. This can even be made easier with mysqlfailover from the MySQL Utilities (this will be the topic of a future post).
Unfortunately, this will not work for every use case, and until this is fixed, it is good to be aware of the current limitations.
The post Replication in MySQL 5.6: GTIDs benefits and limitations – Part 2 appeared first on MySQL Performance Blog.
How to fix your PRM cluster when upgrading to RHEL/CentOS 6.4
If you are using Percona Replication Manager (PRM) with RHEL/CentOS prior to 6.4, upgrading your distribution to 6.4 may break your cluster. In this post I will explain how to fix your cluster in case it breaks after a distribution upgrade that implies an update of Pacemaker from 1.1.7 to 1.1.8. You can also follow the official documentation here.
The version of Pacemaker (always considered as Technology Preview by RedHat) provided with 6.4 is 1.1.8-x, which is not 100% compatible with 1.1.7-x; see this report.
So if you want to upgrade, you cannot apply a rolling upgrade process. As with the move from Pacemaker 0.6.x to 1.0.x, you again need to update all nodes at once. As noted in RHBA-2013-0375, RedHat encourages people to use Pacemaker in combination with the CMAN manager (it may become mandatory with the next release).
CMAN v3 is a Corosync plugin that monitors the names and number of active cluster nodes in order to deliver membership and quorum information to clients (such as the Pacemaker daemons) and it’s part of the RedHat cluster stack. If you were using some puppet recipes published previously here, you are not yet using CMAN.
Let’s have a look at what happens if we have a cluster with 3 nodes (CentOS 6.3) using PRM as an OCF resource agent:
[root@percona1 percona]# crm_mon -1
============
Last updated: Thu May 23 08:04:30 2013
Last change: Thu May 23 08:03:41 2013 via crm_attribute on percona2
Stack: openais
Current DC: percona1 – partition with quorum
Version: 1.1.7-6.el6-148fccfd5985c5590cc601123c6c16e966b85d14
3 Nodes configured, 3 expected votes
7 Resources configured.
============
Online: [ percona1 percona2 percona3 ]
reader_vip_1 (ocf::heartbeat:IPaddr2): Started percona3
reader_vip_2 (ocf::heartbeat:IPaddr2): Started percona2
reader_vip_3 (ocf::heartbeat:IPaddr2): Started percona1
writer_vip (ocf::heartbeat:IPaddr2): Started percona1
Master/Slave Set: ms_MySQL [p_mysql]
Masters: [ percona2 ]
Slaves: [ percona3 percona1 ]
[root@percona1 ~]# cat /etc/redhat-release
CentOS release 6.3 (Final)
[root@percona1 ~]# rpm -q pacemaker
pacemaker-1.1.7-6.el6.x86_64
[root@percona1 ~]# rpm -q corosync
corosync-1.4.1-7.el6_3.1.x86_64
Everything is working
Let’s update our system to 6.4 on one server…
NOTE: In production you should put the cluster in maintenance mode before the update; see below how to perform this action.
[root@percona1 percona]# yum update -y
[root@percona1 percona]# cat /etc/redhat-release
CentOS release 6.4 (Final)
[root@percona1 ~]# rpm -q pacemaker
pacemaker-1.1.8-7.el6.x86_64
[root@percona1 ~]# rpm -q corosync
corosync-1.4.1-15.el6_4.1.x86_64
Let’s reboot it…
[root@percona1 percona]# reboot
If we check the cluster from another node, we see that percona1 is now offline:
============
Last updated: Thu May 23 08:29:36 2013
Last change: Thu May 23 08:03:41 2013 via crm_attribute on percona2
Stack: openais
Current DC: percona3 – partition with quorum
Version: 1.1.7-6.el6-148fccfd5985c5590cc601123c6c16e966b85d14
3 Nodes configured, 3 expected votes
7 Resources configured.
============
Online: [ percona2 percona3 ]
OFFLINE: [ percona1 ]
reader_vip_1 (ocf::heartbeat:IPaddr2): Started percona2
reader_vip_2 (ocf::heartbeat:IPaddr2): Started percona3
reader_vip_3 (ocf::heartbeat:IPaddr2): Started percona2
writer_vip (ocf::heartbeat:IPaddr2): Started percona3
Master/Slave Set: ms_MySQL [p_mysql]
Masters: [ percona2 ]
Slaves: [ percona3 ]
Stopped: [ p_mysql:2 ]
After the update, and after fixing some small issues like the one below, you are able to start Corosync and Pacemaker, but the node doesn’t join the cluster:
May 23 08:34:12 percona1 corosync[1535]: [MAIN ] parse error in config: Can't open logfile '/var/log/corosync.log' for reason: Permission denied (13).#012.
So now you need to update all nodes to Pacemaker 1.1.8, but to avoid running into issues again with the next distribution update, I prefer to use CMAN as recommended.
First as we have 2 nodes of 3 running, we should try to not stop all our servers… let’s put the cluster in maintenance mode (don’t forget you should have done this even before updating the first node, but I wanted to simulate the problem):
[root@percona3 percona]# crm configure property maintenance-mode=true
We can see that the resources are unmanaged:
============
Last updated: Thu May 23 08:43:49 2013
Last change: Thu May 23 08:43:49 2013 via cibadmin on percona3
Stack: openais
Current DC: percona3 – partition with quorum
Version: 1.1.7-6.el6-148fccfd5985c5590cc601123c6c16e966b85d14
3 Nodes configured, 3 expected votes
7 Resources configured.
============
Online: [ percona2 percona3 ]
OFFLINE: [ percona1 ]
reader_vip_1 (ocf::heartbeat:IPaddr2): Started percona2 (unmanaged)
reader_vip_2 (ocf::heartbeat:IPaddr2): Started percona3 (unmanaged)
reader_vip_3 (ocf::heartbeat:IPaddr2): Started percona2 (unmanaged)
writer_vip (ocf::heartbeat:IPaddr2): Started percona3 (unmanaged)
Master/Slave Set: ms_MySQL [p_mysql] (unmanaged)
p_mysql:0 (ocf::percona:mysql): Master percona2 (unmanaged)
p_mysql:1 (ocf::percona:mysql): Started percona3 (unmanaged)
Stopped: [ p_mysql:2 ]
Now we can upgrade all servers to 6.4
[root@percona2 percona]# yum -y update
[root@percona3 percona]# yum -y update
Meanwhile, we can already prepare the first node to use CMAN:
[root@percona1 ~]# yum -y install cman ccs
Back on the two nodes that were updating, they are now updated to 6.4:
[root@percona3 percona]# cat /etc/redhat-release
CentOS release 6.4 (Final)
And let’s check the cluster status:
[root@percona3 percona]# crm_mon -1
Could not establish cib_ro connection: Connection refused (111)
Connection to cluster failed: Transport endpoint is not connected…
…but MySQL is still running:
[root@percona2 percona]# mysqladmin ping
mysqld is alive
[root@percona3 percona]# mysqladmin ping
mysqld is alive
Let’s install CMAN on percona2 and percona3 too:
[root@percona2 percona]# yum -y install cman ccs
[root@percona3 percona]# yum -y install cman ccs
Then on ALL nodes, stop Pacemaker and Corosync
[root@percona1 ~]# /etc/init.d/pacemaker stop
[root@percona1 ~]# /etc/init.d/corosync stop
[root@percona2 ~]# /etc/init.d/pacemaker stop
[root@percona2 ~]# /etc/init.d/corosync stop
[root@percona3 ~]# /etc/init.d/pacemaker stop
[root@percona3 ~]# /etc/init.d/corosync stop
Remove Corosync from the startup services:
[root@percona1 ~]# chkconfig corosync off
[root@percona2 ~]# chkconfig corosync off
[root@percona3 ~]# chkconfig corosync off
Let’s specify that the cluster can start without quorum:
[root@percona1 ~]# sed -i.sed "s/.*CMAN_QUORUM_TIMEOUT=.*/CMAN_QUORUM_TIMEOUT=0/g" /etc/sysconfig/cman
[root@percona2 ~]# sed -i.sed "s/.*CMAN_QUORUM_TIMEOUT=.*/CMAN_QUORUM_TIMEOUT=0/g" /etc/sysconfig/cman
[root@percona3 ~]# sed -i.sed "s/.*CMAN_QUORUM_TIMEOUT=.*/CMAN_QUORUM_TIMEOUT=0/g" /etc/sysconfig/cman
And create the cluster, perform the following command on one server only:
[root@percona1 ~]# ccs -f /etc/cluster/cluster.conf --createcluster lefred_prm
Now add the nodes to the cluster:
[root@percona1 ~]# ccs -f /etc/cluster/cluster.conf --addnode percona1
Node percona1 added.
[root@percona1 ~]# ccs -f /etc/cluster/cluster.conf --addnode percona2
Node percona2 added.
[root@percona1 ~]# ccs -f /etc/cluster/cluster.conf --addnode percona3
Node percona3 added.
We then need to delegate the fencing to Pacemaker (adding a fence device, a fence method for each node, and the fence instances):
[root@percona1 ~]# ccs -f /etc/cluster/cluster.conf --addfencedev pcmk agent=fence_pcmk
[root@percona1 ~]# ccs -f /etc/cluster/cluster.conf --addmethod pcmk-redirect percona1
Method pcmk-redirect added to percona1.
[root@percona1 ~]# ccs -f /etc/cluster/cluster.conf --addmethod pcmk-redirect percona2
Method pcmk-redirect added to percona2.
[root@percona1 ~]# ccs -f /etc/cluster/cluster.conf --addmethod pcmk-redirect percona3
Method pcmk-redirect added to percona3.
[root@percona1 ~]# ccs -f /etc/cluster/cluster.conf --addfenceinst pcmk percona1 pcmk-redirect port=percona1
[root@percona1 ~]# ccs -f /etc/cluster/cluster.conf --addfenceinst pcmk percona2 pcmk-redirect port=percona2
[root@percona1 ~]# ccs -f /etc/cluster/cluster.conf --addfenceinst pcmk percona3 pcmk-redirect port=percona3
Encrypt the cluster:
[root@percona1 ~]# ccs -f /etc/cluster/cluster.conf --setcman keyfile="/etc/corosync/authkey" transport="udpu"
Let’s check if the configuration file is OK:
[root@percona1 ~]# ccs_config_validate -f /etc/cluster/cluster.conf
Configuration validates
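With more than three nodes, typing the cluster, node and fencing commands by hand gets repetitive. Here is a small Python sketch that only prints the same ccs invocations for an arbitrary node list, using the double-hyphen long options that ccs expects. It runs nothing itself; the function name is hypothetical, and the cluster and node names are the ones from this post.

```python
def ccs_commands(cluster_name, nodes, conf="/etc/cluster/cluster.conf"):
    """Build the ccs command lines used in this post for the given nodes."""
    base = f"ccs -f {conf}"
    commands = [f"{base} --createcluster {cluster_name}"]
    commands += [f"{base} --addnode {node}" for node in nodes]
    # Delegate fencing to Pacemaker through the fence_pcmk agent.
    commands.append(f"{base} --addfencedev pcmk agent=fence_pcmk")
    commands += [f"{base} --addmethod pcmk-redirect {node}" for node in nodes]
    commands += [
        f"{base} --addfenceinst pcmk {node} pcmk-redirect port={node}"
        for node in nodes
    ]
    return commands


for command in ccs_commands("lefred_prm", ["percona1", "percona2", "percona3"]):
    print(command)
```

Review the printed commands before pasting them into a shell on one node, and validate the resulting file with ccs_config_validate.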
We can now copy the configuration file on all nodes:
[root@percona1 ~]# scp /etc/cluster/cluster.conf percona2:/etc/cluster/
[root@percona1 ~]# scp /etc/cluster/cluster.conf percona3:/etc/cluster/
Enable CMAN at startup on all nodes:
[root@percona1 ~]# chkconfig cman on
[root@percona2 ~]# chkconfig cman on
[root@percona3 ~]# chkconfig cman on
And start the services on all nodes:
[root@percona1 ~]# /etc/init.d/cman start
Starting cluster:
Checking if cluster has been disabled at boot… [ OK ]
Checking Network Manager… [ OK ]
Global setup… [ OK ]
Loading kernel modules… [ OK ]
Mounting configfs… [ OK ]
Starting cman… [ OK ]
Waiting for quorum… [ OK ]
Starting fenced… [ OK ]
Starting dlm_controld… [ OK ]
Tuning DLM kernel config… [ OK ]
Starting gfs_controld… [ OK ]
Unfencing self… [ OK ]
Joining fence domain… [ OK ]
[root@percona1 ~]# /etc/init.d/pacemaker start
Starting cluster:
Checking if cluster has been disabled at boot… [ OK ]
Checking Network Manager… [ OK ]
Global setup… [ OK ]
Loading kernel modules… [ OK ]
Mounting configfs… [ OK ]
Starting cman… [ OK ]
Waiting for quorum… [ OK ]
Starting fenced… [ OK ]
Starting dlm_controld… [ OK ]
Tuning DLM kernel config… [ OK ]
Starting gfs_controld… [ OK ]
Unfencing self… [ OK ]
Joining fence domain… [ OK ]
Starting Pacemaker Cluster Manager: [ OK ]
[root@percona2 ~]# /etc/init.d/cman start
[root@percona2 ~]# /etc/init.d/pacemaker start
[root@percona3 ~]# /etc/init.d/cman start
[root@percona3 ~]# /etc/init.d/pacemaker start
We can now connect crm_mon to the cluster and check its status:
[root@percona2 percona]# crm_mon -1
Last updated: Thu May 23 09:18:58 2013
Last change: Thu May 23 09:16:31 2013 via crm_attribute on percona1
Stack: cman
Current DC: percona1 – partition with quorum
Version: 1.1.8-7.el6-394e906
3 Nodes configured, 3 expected votes
7 Resources configured.
Online: [ percona1 percona2 percona3 ]
reader_vip_1 (ocf::heartbeat:IPaddr2): Started percona3
reader_vip_2 (ocf::heartbeat:IPaddr2): Started percona2
reader_vip_3 (ocf::heartbeat:IPaddr2): Started percona1
writer_vip (ocf::heartbeat:IPaddr2): Started percona1
Master/Slave Set: ms_MySQL [p_mysql]
Masters: [ percona1 ]
Slaves: [ percona2 percona3 ]
We can see that some resources moved. This is because we didn’t put the cluster in maintenance mode before updating node1 to 6.4.
If we had put everything in maintenance mode before the upgrade to 6.4, as we should have, it would now be time to stop maintenance mode… but the crm command is not present any more.
It’s still possible to get the command back by installing crmsh (the crm shell, from another repository), or you can just install pcs (Pacemaker Configuration System):
[root@percona2 percona]# yum -y install pcs
[root@percona2 percona]# pcs status
Last updated: Thu May 23 09:24:37 2013
Last change: Thu May 23 09:16:31 2013 via crm_attribute on percona1
Stack: cman
Current DC: percona1 – partition with quorum
Version: 1.1.8-7.el6-394e906
3 Nodes configured, 3 expected votes
7 Resources configured.
Online: [ percona1 percona2 percona3 ]
Full list of resources:
reader_vip_1 (ocf::heartbeat:IPaddr2): Started percona3
reader_vip_2 (ocf::heartbeat:IPaddr2): Started percona2
reader_vip_3 (ocf::heartbeat:IPaddr2): Started percona1
writer_vip (ocf::heartbeat:IPaddr2): Started percona1
Master/Slave Set: ms_MySQL [p_mysql]
Masters: [ percona1 ]
Slaves: [ percona2 percona3 ]
So if you were in maintenance mode, you should see:
[root@percona2 percona]# pcs status
Last updated: Thu May 23 09:26:56 2013
Last change: Thu May 23 09:26:50 2013 via cibadmin on percona2
Stack: cman
Current DC: percona1 – partition with quorum
Version: 1.1.8-7.el6-394e906
3 Nodes configured, 3 expected votes
7 Resources configured.
Online: [ percona1 percona2 percona3 ]
Full list of resources:
reader_vip_1 (ocf::heartbeat:IPaddr2): Started percona3 (unmanaged)
reader_vip_2 (ocf::heartbeat:IPaddr2): Started percona2 (unmanaged)
reader_vip_3 (ocf::heartbeat:IPaddr2): Started percona1 (unmanaged)
writer_vip (ocf::heartbeat:IPaddr2): Started percona1 (unmanaged)
Master/Slave Set: ms_MySQL [p_mysql] (unmanaged)
p_mysql:0 (ocf::percona:mysql): Master percona1 (unmanaged)
p_mysql:1 (ocf::percona:mysql): Slave percona2 (unmanaged)
p_mysql:2 (ocf::percona:mysql): Slave percona3 (unmanaged)
And now you are able to stop maintenance mode:
[root@percona2 percona]# pcs property set maintenance-mode=false
You can also check your cluster using cman_tool or clustat (if you have installed rgmanager)
[root@percona3 ~]# cman_tool nodes
Node Sts Inc Joined Name
1 M 64 2013-05-23 09:52:03 percona1
2 M 64 2013-05-23 09:52:03 percona2
3 M 64 2013-05-23 09:52:03 percona3
[root@percona3 ~]# clustat
Cluster Status for lefred_prm @ Thu May 23 10:20:36 2013
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
percona1 1 Online
percona2 2 Online
percona3 3 Online, Local
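If you want to script this check, the membership table is easy to parse. A minimal Python sketch, assuming the column layout shown above for cman_tool nodes, where a status of M means the node is a member. The function name is hypothetical, and the second sample (a node with status X and no join date) is a made-up illustration of a node that has left the cluster.

```python
def offline_nodes(cman_tool_output):
    """Return the names of nodes whose Sts column is not 'M' (member)."""
    missing = []
    for line in cman_tool_output.splitlines():
        fields = line.split()
        # Skip blank lines and the header row ("Node Sts Inc Joined Name").
        if not fields or not fields[0].isdigit():
            continue
        status, name = fields[1], fields[-1]
        if status != "M":
            missing.append(name)
    return missing


healthy = """\
Node Sts Inc Joined Name
1 M 64 2013-05-23 09:52:03 percona1
2 M 64 2013-05-23 09:52:03 percona2
3 M 64 2013-05-23 09:52:03 percona3
"""

# Hypothetical output: percona2 is no longer a member.
degraded = """\
Node Sts Inc Joined Name
1 M 64 2013-05-23 09:52:03 percona1
2 X 0 percona2
3 M 64 2013-05-23 09:52:03 percona3
"""

print(offline_nodes(healthy))
print(offline_nodes(degraded))
```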
Now the cluster is fixed and everything works again as expected and you should be ready for the next distro upgrade!
INFO: If you have the file /etc/corosync/service.d/pcmk you need to delete it before installing CMAN
The post How to fix your PRM cluster when upgrading to RHEL/CentOS 6.4 appeared first on MySQL Performance Blog.
Choosing a MySQL HA Solution – MySQL Webinar: June 5
Selecting the most appropriate solution for a MySQL HA infrastructure is as much a business and philosophical decision as it is a technical one, but often the choice is made without adequately considering all three perspectives. When too much attention is paid to one of these aspects at the cost of the others, the resulting system may be over-engineered, poorly-performing, and/or various other flavors of suboptimal.
On Wednesday, June 5, at 10 a.m. PDT (1700 UTC), I will be presenting a webinar entitled, Choosing a MySQL HA Solution, in which we’ll explore the topic of MySQL HA from each of these perspectives. The goal will be to motivate your thinking about HA in a holistic fashion and help guide you towards asking the right questions when considering a new or upgraded HA deployment.
This webinar will be both technical and non-technical in nature, beginning with a discussion of some general HA principles and some common misconceptions. We will then explore some of the more well-known MySQL HA tools and technologies available today (largely grouped into those which use traditional MySQL replication, those which use some other MySQL-level replication, and those which replicate at some other layer of the system stack) and then conclude with some typical use cases where a given approach may be well-suited or particularly contraindicated.
If this topic interests you, then register today to reserve your spot. I look forward to speaking with all of you next week.
The post Choosing a MySQL HA Solution – MySQL Webinar: June 5 appeared first on MySQL Performance Blog.
Temporary Tables and Replication
I recently wrote about non-deterministic queries in the replication stream. That’s resolved by using either MIXED or ROW based replication rather than STATEMENT based.
Another thing that’s not fully handled by STATEMENT based replication is temporary tables. Imagine the following:
- Master: CREATE TEMPORARY TABLE rpltmpbreak (i INT);
- Wait for slave to replicate this statement, then stop and start mysqld (not just STOP/START SLAVE)
- Master: INSERT INTO rpltmpbreak VALUES (1);
- Slave: SHOW SLAVE STATUS \G
If for any reason a slave server shuts down and restarts after the temp table creation, replication will break because the temporary table will no longer exist on the restarted slave server. It’s obvious when you think about it, but nevertheless it’s quite annoying.
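The sequence is easy to reproduce in a toy model. The Python sketch below is not MySQL code: it is a hypothetical `ToySlave` class that applies statement-style events and forgets its temporary tables on restart, which is enough to show why the INSERT fails afterwards.

```python
class ToySlave:
    """Toy model of a slave applying statement-based events (not MySQL code)."""

    def __init__(self):
        self.temp_tables = {}   # lost on restart, like real temporary tables
        self.error = None

    def apply(self, event):
        """Apply one (kind, table, value) event; return False if it fails."""
        kind, table, value = event
        if kind == "create_temp":
            self.temp_tables[table] = []
        elif kind == "insert":
            if table not in self.temp_tables:
                self.error = f"Table '{table}' doesn't exist"
                return False
            self.temp_tables[table].append(value)
        return True

    def restart(self):
        # Restarting the server drops every temporary table.
        self.temp_tables.clear()


slave = ToySlave()
slave.apply(("create_temp", "rpltmpbreak", None))   # CREATE TEMPORARY TABLE
slave.restart()                                     # stop and start mysqld
ok = slave.apply(("insert", "rpltmpbreak", 1))      # INSERT ... VALUES (1)
print(ok, slave.error)
```

With row based events the slave would never see the temporary table at all, which is why the problem disappears in that format.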
A long time ago (early 2007, when I was still working at MySQL AB) I filed a bug report on this. It’s important to realise that back then, row based replication did exist but was so buggy that you wouldn’t recommend it, so the topic was quite relevant. For some reason the bug has remained open for over 6 years until some recent activity.
It is not an issue with determinism and most temporary table constructs are technically regarded as “safe” to replicate via statement based replication, so if you use MIXED you will still find replication broken with the above scenario. Important to realise!
http://dev.mysql.com/doc/refman/5.5/en/replication-features-temptables.html (the obvious place to look) doesn’t really explain this well, but http://dev.mysql.com/doc/refman/5.5/en/replication-rbr-usage.html correctly states that ROW based replication doesn’t suffer from this problem as it replicates the values from the temporary table on the master rather than the statement, thus the slave doesn’t have to deal with the temporary table at all. I’ve suggested that the bug be changed to a documentation issue, updating the page on replication and temporary tables to properly explain the issue and point clearly and explicitly to the solution.
So, why would you ever use STATEMENT or MIXED rather than ROW based replication?
- Well, as I mentioned earlier, row based replication wasn’t particularly reliable at first. At that time, for non-deterministic scenarios we recommended mixed as a compromise (that only uses row based information in the replication stream when it’s necessary, and statements the rest of the time). Many issues have been fixed over time and now we can generally say that row based replication is ok in recent versions of MySQL and MariaDB (5.5 or above, just to be sure). So if you’re replicating from an older master, STATEMENT or MIXED might still be preferable, as long as you know what the limitations are.
- Non-local replication (outside the datacenter) is vastly more efficient with STATEMENT based replication: if you’re updating 100,000 rows, it’s a single statement, whereas with ROW based replication it’s 100,000 row updates. So depending on bandwidth/cost and such, that might also be relevant.
If none of those considerations apply, ROW based replication might be the way to go now. But the really important thing to realise is that for each of the choices of STATEMENT, MIXED and ROW, there are advantages and consequences.
Do you have any other reasons for using STATEMENT or MIXED in your environment?
Percona Server for MySQL 5.5.31-30.3 now available
Percona is glad to announce the release of Percona Server for MySQL 5.5.31-30.3 on May 24, 2013 (downloads are available here and from the Percona Software Repositories). Based on MySQL 5.5.31, including all the bug fixes in it, Percona Server 5.5.31-30.3 is now the current stable release in the 5.5 series. All of Percona’s software is open-source and free; all the details of the release can be found in the 5.5.31-30.3 milestone at Launchpad.
New Features:
- Percona Server has ported the Atomic write support for Fusion-io devices patch from MariaDB. This feature adds atomic write support for directFS filesystem on Fusion-io devices. This feature implementation is considered BETA quality.
- Percona Server has introduced innodb_read_views_memory and innodb_descriptors_memory status variables in the Extended Show Engine InnoDB Status to improve InnoDB memory usage diagnostics.
Bugs Fixed:
- Fix for bug #1131187 introduced a regression that could cause a memory leak if query cache was used together with InnoDB. Bug fixed #1170103.
- Fixed the RPM packaging regression that was introduced with the fix for bug #710799. This regression caused mysql schema to be missing after the clean RPM installation. Bug fixed #1174426.
- Fixed the Percona-Server-shared-55 and Percona-XtraDB-Cluster-shared RPM package dependencies. Bug fixed #1050654.
- Fixed the upstream bug #68999 which caused compiling Percona Server to fail on CentOS 5 and Debian squeeze due to older OpenSSL version. Bug fixed #1183610.
- If a slave was running with its binary log enabled and then restarted with the binary log disabled, Crash-Resistant Replication could overwrite the relay log info log with an incorrect position. Bug fixed #1092593.
- Fixed the CVE-2012-5615 vulnerability. This vulnerability would allow a remote attacker to detect what user accounts exist on the server. This bug fix comes originally from MariaDB (see MDEV-3909). Bug fixed #1171941.
- Fixed the CVE-2012-5627 vulnerability, where an unprivileged MySQL account owner could efficiently perform a brute-force password guessing attack on other accounts. This bug fix comes originally from MariaDB (see MDEV-3915). Bug fixed #1172090.
- mysql_set_permission was failing on Debian due to missing libdbd-mysql-perl package. Fixed by adding the package dependency. Bug fixed #1003776.
- Rebuilding Debian source package would fail because dpatch and automake were missing from build-dep. Bug fixed #1023575 (Stephan Adig).
- Backported the fix for the upstream bug #65077 from the MySQL 5.6 version, which removed MyISAM internal temporary table mutex contention. Bug fixed #1179978.
Release notes for Percona Server for MySQL 5.5.31-30.3 are available in our online documentation. Bugs can be reported on the launchpad bug tracker.
The post Percona Server for MySQL 5.5.31-30.3 now available appeared first on MySQL Performance Blog.
A great talk on Go concurrency patterns
This 35-minute video from the recent Google I/O conference explains how to use Go’s concurrency primitives — goroutines, channels, and the select statement — to do things elegantly, correctly, and safely in a few lines of Go, which would otherwise turn your brain into a pretzel in most programming languages.
My favorite thing about Go is that a good Go program looks self-evident and obvious, even when it may be doing things that would be insanely complex in another language. Callbacks, closures, mutexes, and so on just disappear, and the program itself emerges, looking completely unimpressive. In many cases I think “what’s the big deal about that?” until I realize how hard it would be to write in Java, or Perl, or so on. A lot of the code in Percona Toolkit, for example, involved “pipelines” of callbacks passing data along to other callbacks for further processing. These were hard to reason about, hard to make resilient to errors and allow clean termination, and were redesigned several times, never very successfully in my opinion. In Go, channels make those kinds of tasks so simple. Such a program in Go looks suspiciously like a Unix | pipe | and | filter | program. If you think about it, the Unix shell itself is a great example of using “channels” successfully to trivialize what would otherwise be a migraine-inducing task.
ZFS on Linux and MySQL
I am currently working with a large customer and I am involved with servers located in two data centers, one with Solaris servers and the other one with Linux servers. The Solaris side is cleverly set up using zones and ZFS, and this provides a very low virtualization overhead. I learned quite a lot about these technologies while looking at this, thanks to Corey Mosher.
On the Linux side, we recently deployed a pair of servers for backup purposes, boxes with 64 300GB SAS drives, 3 raid controllers and 192GB of RAM. These servers will each run a few slave instances of production database servers and will perform the backups. The write load is not excessive, so a single server can easily handle the write load of all the MySQL instances. The original idea was to configure them with raid-10 + LVM, making sure to stripe the LV when we need to and to align the partitions correctly.
We got decent tpcc performance, nearly 37k NoTPM using 5.6.11 and xfs. Then, since ZFS on Linux is available and there is in-house ZFS knowledge, we decided to reconfigure one of the servers and give ZFS a try. So I trashed the raid-10 arrays, configured JBODs and gave all those drives to ZFS (30 mirrors + spares + OS partition mirror), and I limited the ARC size to 4GB. I don’t want to start a war, but the ZFS performance level was less than half that of xfs for the tpcc test, and maybe that’s just normal. We didn’t try too hard to get better performance because we already had more than enough for our purpose and some ZFS features are just too useful for backups (most also apply to btrfs). Let’s review them.
Snapshots
ZFS does snapshots, like LVM, but since it is a copy-on-write filesystem, the snapshots are free: no performance penalty. You can easily run a server with hundreds of snapshots. With LVM, your IO performance drops to 33% after the first snapshot, so keeping a large number of snapshots running is simply not an option. With ZFS you can easily have:
- one snapshot per day for the last 30 days
- one snapshot per hour for the last 2 days
- one snapshot per 5min for the last 2 hours
and that will be perfectly fine. Since starting a snapshot takes less than a second, you could even be more zealous. Pretty interesting to speed up point-in-time recovery when your dataset is 700GB. If you google a bit with “zfs snapshot script” you’ll find many scripts ready for the task. Snapshots work best with InnoDB; with MyISAM you’ll have to start the snapshot while holding a “flush tables with read lock”, and the flush operation will take some time to complete.
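The three-tier schedule above boils down to a small pruning rule: in each tier, keep the newest snapshot of every time bucket. Here is a Python sketch of that rule, assuming you have already collected the snapshot creation times. The function name and the sample timestamps are hypothetical, and the code only decides what to keep; it never touches ZFS.

```python
from datetime import datetime, timedelta

# (bucket size, how far back the tier reaches), taken from the list above.
TIERS = [
    (timedelta(minutes=5), timedelta(hours=2)),
    (timedelta(hours=1), timedelta(days=2)),
    (timedelta(days=1), timedelta(days=30)),
]


def snapshots_to_keep(snapshot_times, now):
    """Keep the newest snapshot in every bucket of every tier."""
    keep = set()
    for bucket, horizon in TIERS:
        seen = set()
        # Newest first, so the first snapshot seen in a bucket is the newest.
        for taken in sorted(snapshot_times, reverse=True):
            age = now - taken
            if age < timedelta(0) or age > horizon:
                continue
            slot = age // bucket   # timedelta // timedelta gives an int
            if slot not in seen:
                seen.add(slot)
                keep.add(taken)
    return keep


# Hypothetical history: one snapshot every 5 minutes for the last 3 days.
now = datetime(2013, 5, 27, 12, 0)
times = [now - timedelta(minutes=5 * i) for i in range(3 * 24 * 12)]
keep = snapshots_to_keep(times, now)
print(f"{len(times)} snapshots taken, {len(keep)} kept")
```

Anything not in the returned set is a candidate for zfs destroy, which one of the ready-made snapshot scripts would normally handle for you.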
Compression
ZFS can compress data on the fly and it is surprisingly cheap. In fact the best tpcc results I got were when using compression. I still have to explain this; maybe it is related to better raid controller write cache use. Even the fairly slow gzip-1 mode works well. The tpcc database, which contains a lot of random data that doesn’t compress well, showed a compression ratio of 1.70 with gzip-1. Real data will compress much more. That gives us much more disk space than we expected, so even more snapshots!
Integrity
With ZFS, each record on disk has a checksum. If a cosmic ray flips a bit on a drive, instead of crashing InnoDB, it will be caught by ZFS and the data will be read from the other drive in the mirror.
Better availability and disk usage
On purpose, I allocated mirror pairs using drives from different controllers. That way, if a controller dies, the storage will still be working. Also, instead of having 1 or 2 spare drives per controller, I have 2 for the whole setup. A small but yet interesting saving.
All put together, ZFS on Linux is a very interesting solution for MySQL backup servers. All backup solutions have an impact on performance; with ZFS the impact is up front and the backups are almost free.
The post ZFS on Linux and MySQL appeared first on MySQL Performance Blog.
An old note on the Storage Engine API
Whenever I stick my head into the MySQL storage engine API, I’m reminded of a MySQL User Conference from several years ago now.
Specifically, I’m reminded of a slide from an early talk at the MySQL User Conference by Paul McCullagh describing developing PBXT. For “How to write a Storage Engine for MySQL”, it went something like this:
- Develop basic INSERT (write_row) support – INSERT INTO t1 VALUES (42)
- Develop full table scan (rnd_init, rnd_next, rnd_end) - SELECT * from t1
- If you’re sane, stop here.
A lot of people stop at step 3. It’s a really good place to stop too. It avoids most of the tricky parts that are unexpected, undocumented and unlogical (yes, I’m inventing words here).
Non-Deterministic Query in Replication Stream
You might find a warning like the below in your error log:
130522 17:54:18 [Warning] Unsafe statement written to the binary log using statement format since BINLOG_FORMAT = STATEMENT. Statements writing to a table with an auto-increment column after selecting from another table are unsafe because the order in which rows are retrieved determines what (if any) rows will be written. This order cannot be predicted and may differ on master and the slave.
Statement: INSERT INTO tbl2 SELECT * FROM tbl1 WHERE col IN (417,523)
What do MariaDB and MySQL mean by this warning? The server can’t guarantee that this exact query, with STATEMENT based replication, will always yield identical results on the slave.
Does that mean that you have to use ROW based (or MIXED) replication? Possibly, but not necessarily.
For this type of query, it primarily refers to the fact that without ORDER BY, rows have no order and thus a result set may show up in any order the server decides. Sometimes it’s predictable (depending on storage engine and index use), but that’s not something you want to rely on. You don’t have to ponder that, as an ORDER BY is never harmful.
Would ORDER BY col solve the problem? That depends!
If col is unique, yes. If col is not unique, then multiple rows could result and they’d still have a non-deterministic order. So in that case you’d need to ORDER BY col,anothercol to make it absolutely deterministic. The same of course applies if the WHERE clause only referred to a single col value: if multiple rows can match, then it’s not unique and it will require an additional column for the sort.
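The uniqueness rule above can be checked mechanically: an ORDER BY is fully deterministic only if no two matching rows share the same values for the sort columns. Here is a minimal sketch of that check. The rows are made-up sample data and order_is_deterministic is a hypothetical helper, not something the server provides.

```python
from collections import Counter


def order_is_deterministic(rows, order_by):
    """True if no two rows share the same values for the ORDER BY columns."""
    keys = Counter(tuple(row[c] for c in order_by) for row in rows)
    return all(n == 1 for n in keys.values())


# Made-up rows matching WHERE col IN (417,523); col is not unique here.
rows = [
    {"col": 417, "anothercol": 1},
    {"col": 417, "anothercol": 2},
    {"col": 523, "anothercol": 1},
]

print(order_is_deterministic(rows, ["col"]))                # False
print(order_is_deterministic(rows, ["col", "anothercol"]))  # True
```

With only col in the sort key, the two rows with col = 417 can come back in either order, which is exactly the case the warning is about.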
There are other query constructs where going to row based or mixed replication is the only way. But, just because the server tells you it can’t safely replicate a query with statement based replication, that doesn’t mean you can’t use statement based replication at all… there might be another way.
Experiences with the McAfee MySQL Audit Plugin
I recently had to do some customer work involving the McAfee MySQL Audit Plugin and would like to share my experience in this post.
Auditing user activity in MySQL has traditionally been challenging. Most data can be obtained from the slow or general log, but this involves a lot of data you don’t need too, and isn’t flexible at all. The specific problem of logging failed connection attempts has been discussed in a previous post on our blog.
Starting with 5.1, the new plugin API gives us more flexibility by allowing users to extend the server’s functionality with their own code, and this is what the McAfee plugin does.
Installation and configuration are straightforward following the available instructions. The only extra step I had to take was to extract the offsets for the Percona Server version I was using for the test (5.5.28-29.1). This is needed as the plugin needs the offset to some MySQL data structures that, the plugin authors say, aren’t exposed by a consistent API. If you also need to do this, the details are clearly explained here.
The plugin writes its output in json format, and supports writing it directly to a file, or to a unix socket, which means you can write a script to listen on this socket and process the audit records as you wish.
Performance-wise, I did basic tests on the VM I was working in and didn’t get significant differences between either output option, or between using the plugin or enabling the general log. Bear in mind these were basic tests (just a few mysqlslap runs with increasing levels of concurrency), but initially, I would think the advantage of the plugin is its flexibility, and not its performance, which seems to be on par with having the general log enabled.
The flexibility comes from the three variables that can be set to control what is logged by the plugin:
- audit_record_cmds : This is the list of commands you want written to the log (all the lists in these variables are comma separated). As pointed out here, anything that would generate a write to the general log will be sent to the plugin, and you can control whether it gets written or not with this list. I tested this with “connect,Quit” to log successful and failed connections. Yes, it had to be a capital Q in Quit for that to work, and no, my code-fu was not enough to understand why that is the case. Maybe someone more knowledgeable in MySQL internals can enlighten me here.
- audit_record_objs : List of database objects (tables, according to the docs) for which you want events written to the log.
- audit_whitelist_users : This one is undocumented on the wiki at the time of writing, and is a list of users for which you do not want events written to the log.
Just for reference, these are the lines I had to add to my config file for the plugin to work (plus one commented line for switching between file and socket for output):
plugin-load=AUDIT=libaudit_plugin.so
audit_offsets=6464, 6512, 4072, 4512, 104, 2584
audit_json_file=1
audit_json_socket_name=/tmp/audit.sock
#audit_json_socket=1
audit_json_log_file=/var/lib/mysql/audit.log
audit_record_cmds=connect,Quit
Notice the audit_offsets that I mentioned had to be extracted due to this Percona Server version not being included in the binary.
And here are a few sample output lines generated by the plugin with this configuration:
{"msg-type":"activity","date":"1369155747373","thread-id":"6439","query-id":"0","user":"debian-sys-maint","priv_user":"debian-sys-maint","host":"localhost","cmd":"Connect","query":"Connect"}
{"msg-type":"activity","date":"1369155747373","thread-id":"6439","query-id":"219309","user":"debian-sys-maint","priv_user":"debian-sys-maint","host":"localhost","cmd":"Quit","query":"Quit"}
{"msg-type":"activity","date":"1369155747383","thread-id":"6440","query-id":"0","user":"debian-sys-maint","priv_user":"debian-sys-maint","host":"localhost","cmd":"Connect","query":"Connect"}
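Since each record is one JSON object per line, processing the output takes only a few lines of code. The sketch below parses the sample lines above and counts events per command. It assumes the "date" field is milliseconds since the Unix epoch, which matches the sample values but is my reading rather than something stated in the plugin docs. The function names are hypothetical.

```python
import json
from collections import Counter
from datetime import datetime, timezone


def parse_audit_line(line):
    """Parse one JSON audit record and attach a decoded timestamp."""
    record = json.loads(line)
    # Assumption: "date" is milliseconds since the Unix epoch.
    record["timestamp"] = datetime.fromtimestamp(
        int(record["date"]) / 1000, tz=timezone.utc
    )
    return record


def count_commands(lines):
    """Count audit records per command, skipping blank lines."""
    return Counter(parse_audit_line(l)["cmd"] for l in lines if l.strip())


sample = [
    '{"msg-type":"activity","date":"1369155747373","thread-id":"6439","query-id":"0","user":"debian-sys-maint","priv_user":"debian-sys-maint","host":"localhost","cmd":"Connect","query":"Connect"}',
    '{"msg-type":"activity","date":"1369155747373","thread-id":"6439","query-id":"219309","user":"debian-sys-maint","priv_user":"debian-sys-maint","host":"localhost","cmd":"Quit","query":"Quit"}',
    '{"msg-type":"activity","date":"1369155747383","thread-id":"6440","query-id":"0","user":"debian-sys-maint","priv_user":"debian-sys-maint","host":"localhost","cmd":"Connect","query":"Connect"}',
]

print(count_commands(sample))
```

The same functions could be fed lines read from the audit log file or from a script listening on the unix socket.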
In conclusion, the plugin API seems to be opening new possibilities of extending MySQL’s behavior in a way that, once set up, is transparent to users, and the McAfee MySQL Audit Plugin is only one example of what can be achieved with it. It is a very good one for me, since I think proper audit trail support has been an important missing feature on the server, which has made using MySQL in PCI or SOX compliant environments, to name just two, artificially complicated, as one had to rely on too much info (general log) or external help (snort or similar IDS).
The post Experiences with the McAfee MySQL Audit Plugin appeared first on MySQL Performance Blog.
Agile project management tools
Wow, talk about an industry that’s overcrowded with look-alike me-too products. Online agile project management tools are a dime a dozen, which makes me think that they are probably all very similar and probably don’t solve most people’s needs. I’ve observed that when this is true, nearly-indistinguishable tools get reinvented, until the burden of evaluating the options is greater than the burden of just building yet another one, thus perpetuating the cycle.
Here are some of the products I looked at yesterday:
Acunote, ActiveCollab, AgileBench, AgileZen, Asana, Backlog, Basecamp, Blimp, Bugly, Huboard, IceScrum, JIRA, Kanbanery, Kickoff, Lean-To, Lighthouse, OnTime, PivotalTracker, Planbox, Plan.io, Rally, Redmine, ScrumDo, Sensei, SnowyEvening, Sprintly, TargetProcess, Trac, Trajectory, Trello, Unfuddle, YouTrack, Zoho Projects.
Percona XtraBackup 2.1.3 for MySQL available for download
Percona is glad to announce the release of Percona XtraBackup 2.1.3 for MySQL on May 22, 2013. Downloads are available from our download site here and Percona Software Repositories.
This release fixes a high priority bug. It’s advised to upgrade your latest 2.1 version to 2.1.3 if you’re using Percona XtraBackup with Percona XtraDB Cluster. This release is the latest stable release in the 2.1 series.
Bug Fixed:
- Percona XtraBackup 2.1.2 would hang when performing State Snapshot Transfer. Bug fixed #1182698.
Release notes with all the bugfixes for Percona XtraBackup 2.1.3 are available in our online documentation. Bugs can be reported on the launchpad bug tracker.
* * *
Percona XtraBackup is the world’s only open-source, free MySQL hot backup software that performs non-blocking backups for InnoDB and XtraDB databases. With Percona XtraBackup, you can achieve the following benefits:
- Backups that complete quickly and reliably
- Uninterrupted transaction processing during backups
- Savings on disk space and network bandwidth
- Automatic backup verification
- Higher uptime due to faster restore time
XtraBackup makes MySQL hot backups for all versions of Percona Server, MySQL, MariaDB, and Drizzle. It performs streaming, compressed, and incremental MySQL backups.
Percona’s enterprise-grade commercial MySQL Support contracts include support for XtraBackup. We recommend support for critical production deployments.
The post Percona XtraBackup 2.1.3 for MySQL available for download appeared first on MySQL Performance Blog.
MySQL vs Drizzle plugin APIs
There’s a big difference in how plugins are treated in MySQL and how they are treated in Drizzle. The MySQL way has been to create a C API in front of the C++-like (I call it C- as it manages to take the worst of both worlds) internal “API”. The Drizzle way is to have plugins be first class citizens and use exactly the same API as if they were inside the server.
This means that MySQL attempts to maintain API stability. This isn’t something worth trying for. Any plugin that isn’t trivial quickly surpasses what is exposed via the C API and has to work around it, or, it’s a storage engine and instead you have this horrible mash of C and C++. The byproduct of this is that no core server features are being re-implemented as plugins. This means the API is being developed in a vacuum devoid of usefulness. At least, this was the case… The authentication plugin API seems to be an exception, and it’s interesting to note that semisync replication is in fact a plugin.
So times may be changing… sort of. Yesterday I noted that some storage engine API features are only available if you’re InnoDB and I’ve voiced my general disappointment in the audit API being unsuitable to implement various forms of query logging already in the server (general query log, slow query log).
One thing to note: when the API is the same for both inside the server and a plugin, it makes initial refactoring very easy, and you quickly see the bits that could be improved.
Hint of the day: Warning level in Error Log to see Aborted Connections
Yields useful information in the MariaDB or MySQL error log file (or syslog on Debian/Ubuntu) you don’t want to miss out on.
You will know about aborted connections, which are otherwise only visible through global status as Aborted_connects (lost connection before they completed authentication) and Aborted_clients (cut fully authenticated connection).
It looks like
130523 2:14:05 [Warning] Aborted connection 173629 to db: 'unconnected' user: 'someapp' host: '10.2.0.50' (Unknown error)
You will know when, where from, and if for instance a wrong password was used you’ll see the username. Basically you’ll get as much info as the server has available at that point. Useful.
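If you want to aggregate these warnings (per user or per host, say), the line is regular enough to parse with one expression. This is a sketch based only on the sample line above. Other server versions may word the message differently, and parse_aborted is a hypothetical helper name.

```python
import re

# Pattern derived from the single sample line above; treat it as an assumption.
ABORTED_RE = re.compile(
    r"Aborted connection (?P<id>\d+) to db: '(?P<db>[^']*)' "
    r"user: '(?P<user>[^']*)' host: '(?P<host>[^']*)' \((?P<reason>[^)]*)\)"
)


def parse_aborted(line):
    """Return the fields of an 'Aborted connection' warning, or None."""
    match = ABORTED_RE.search(line)
    return match.groupdict() if match else None


line = ("130523 2:14:05 [Warning] Aborted connection 173629 to db: 'unconnected' "
        "user: 'someapp' host: '10.2.0.50' (Unknown error)")
print(parse_aborted(line))
```

Lines that do not match return None, so the function can be applied to the whole error log.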
Percona MySQL University @Portland: June 17
Peter Zaitsev leads a track at the inaugural Percona MySQL University event in Raleigh, N.C. on Jan. 29, 2013.
Portland is a well-recognized hub for Open Source technologies in the Northwest, home to conferences such as OSCON and Open Source Bridge as well as hosts of OpenSQL Camp in 2009. As such it is a very natural place for our next Percona MySQL University event scheduled for June 17.
We run this event in partnership with MySQL Meetup at Portland organized by our own Daniel Nichter, who recently moved to the area.
Percona MySQL University is a daylong, free, fast-paced and very technical MySQL educational event for a wide range of people interested in MySQL – Developers, System Administrators, DBAs, etc. It will be held at Portland State University’s Smith Memorial Student Union.
We’ll finalize the schedule next week and still have some speaking opportunities available – if you would like to share your MySQL story at this event please email Matthew Dowell by Tuesday, May 28.
If you’re not in Portland and would like Percona MySQL University to come to your city, please fill out the form to let us know. We’ll try to come to the cities showing greatest interest.
As usual space is limited, so Register Now!
The post Percona MySQL University @Portland: June 17 appeared first on MySQL Performance Blog.
MySQL and the SSB – Part 2 – MyISAM vs InnoDB low concurrency
This blog post is part two in what is now a continuing series on the Star Schema Benchmark.
In my previous blog post I compared MySQL 5.5.30 to MySQL 5.6.10, both with default settings using only the InnoDB storage engine. In my testing I discovered that innodb_old_blocks_time had an effect on performance of the benchmark. There was some discussion in the comments and I promised to follow up with more SSB tests at a later date.
I also promised more low concurrency SSB tests when Peter blogged about the importance of performance at low concurrency.
The SSB
The SSB tests a database’s ability to optimize queries for a star schema. A star schema presents some unique challenges to the database optimizer. The SSB benchmark consists of four sets of queries. Each set is known as a “flight”. I have labeled each query as Q{FLIGHT_NUMBER}.{QUERY_NUMBER}. In general, each flight examines different time periods or different regions. The flights represent the type of investigations and drill-downs that are common in OLAP analysis.
Each query in each flight (Q1.1 for example) is tested with a cold buffer pool. Then the query is tested again without restarting the database. The first test is described as the cold test, and the second as the hot test. The database software is restarted after the hot test. All OS caches are dropped at this time as well.
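The cold/hot procedure just described can be sketched as a small harness. This is not the script used for these tests. It is an illustration in which run_query, restart_database and drop_os_caches are hypothetical callables you would supply yourself, and it assumes the database has been freshly started with cold caches before the first query.

```python
import time


def run_cold_and_hot(queries, run_query, restart_database, drop_os_caches):
    """Time each query twice (cold, then hot), then reset to a cold state.

    queries: dict mapping a label such as "Q1.1" to its SQL text.
    The three callables are supplied by the caller (hypothetical hooks).
    """
    results = {}
    for name, sql in queries.items():
        timings = {}
        for label in ("cold", "hot"):
            start = time.perf_counter()
            run_query(sql)
            timings[label] = time.perf_counter() - start
        results[name] = timings
        # After the hot run: restart the database and drop OS caches,
        # so the next query starts from a cold buffer pool.
        restart_database()
        drop_os_caches()
    return results
```

Injecting the three operations keeps the timing logic independent of how the server is actually restarted or how caches are dropped on a given machine.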
This set of queries was tested on the SSB at SCALE FACTOR: 20. This means there is approximately 12GB of data in the largest table.
You can find the individual SSB query definitions in my previous blog post.
Test environment
These tests were done on a relatively fast machine with a Xeon E5-2680 (8 cores, 16 threads) with fast IO (OCZ R4 1.6TB) and 128GB memory. For the hot test, the data fits in the buffer pool and has been loaded by the cold test already. The buffer pool and adaptive hash index are cold for the cold test. All tests were done with no concurrency. The hardware for this test was provided by Adotomi. I will be blogging about raw performance of the OCZ card in another post.
Also, while it is labeled on the graphs, it is important to note that in all cases, lower times are better.
SSB Flight #1
Here you will see the start of an interesting trend. MyISAM is faster when the data is not cached (the cold run) but is slower in the hot (cached) run. I did some investigation during the testing and found that InnoDB does more IO than MyISAM when the database is cold, but uses less CPU time when the database is hot. I am only speculating (and I can investigate further), but I believe the adaptive hash index is improving performance of InnoDB significantly during the hot run, as hash indexes are faster than a b-tree index. Also accessing pages from the buffer pool should be faster than getting them from the OS cache, which is another advantage of InnoDB.
SSB Flight #2
Flight #2 is similar to Flight #1. MyISAM is faster than InnoDB when the database is cold, but the opposite is true when the database is hot.
SSB Flight #3
Here in some cases MyISAM is substantially faster than InnoDB both cold and hot.
SSB Flight #4
There is one query in this flight, Q4.3, which is faster using MyISAM than InnoDB. Like the queries in Flight #3 that are faster using MyISAM, Q4.3 examines very little data. It seems that InnoDB performs better when a larger number of rows must be joined together (Q4.1, Q4.2) but worse when small amounts of data are examined.
Conclusion
In some cases MyISAM is faster than InnoDB, but usually only when the buffer pool is cold. Please don’t take away that you should be using MyISAM for everything! MyISAM may be good for raw performance, but there are limitations which MyISAM imposes that are difficult to work with. MyISAM does not maintain checksum consistency during regular operations and is not ACID compliant. MyISAM and InnoDB may perform differently under concurrency, which this benchmark does not cover. I will make a follow-up post about concurrency in another blog post in this series. Regardless, when the working set fits in memory, InnoDB almost always performs better, at least for this workload.
Notes
MySQL version used: 5.6.11, custom compiled to remove performance_schema
For the InnoDB tests, a 64GB buffer pool was used. O_DIRECT was used, so there was no caching of data at the filesystem level. The InnoDB indexes were built using ALTER TABLE fast index creation (merge sort).
For the MyISAM tests I used a 10GB key buffer. I used ALTER TABLE DISABLE KEYS and built the keys with sort via ALTER TABLE ENABLE KEYS.
my.cnf
The post MySQL and the SSB – Part 2 – MyISAM vs InnoDB low concurrency appeared first on MySQL Performance Blog.
Some storage engine features you only get if you’re InnoDB
I had reason to look into the extended secondary index code in MariaDB and MySQL recently, and there was one bit that I really didn’t like.
MariaDB:
share->set_use_ext_keys_flag(legacy_db_type == DB_TYPE_INNODB);
MySQL:
use_extended_sk= (legacy_db_type == DB_TYPE_INNODB);
In case you were wondering what “legacy_db_type” actually does, let me tell you: it’s not legacy at all, it’s kind of key to how the whole “metadata” system in MySQL works. For example, to drop a table, this magic number is used to work out what storage engine to call to drop the table.
Now, these code snippets basically kiss goodbye to the idea of a “pluggable storage engine” architecture. If you’re not InnoDB, you don’t get to have certain features. This isn’t exactly MySQL or MariaDB encouraging an open storage engine ecosystem (quite the opposite really).
Having the MySQL server have this incredibly basic, busy and incomplete understanding of metadata has always been a bit of a mess. The code for reading a table definition out of the FRM file really does show its age, and has fingers all through the server.
If somebody was serious about refactoring server code, you’d certainly be looking here, as this code is a major source of arbitrary limitations. However, if you have the server and the engine(s) both having separate views of what is the “correct” state of metadata you end up with a mess (anyone who has had InnoDB be out of sync with FRMs knows this one). I worry that the FRM code will be replaced with something even less understandable by humans, again making the mistake that the server knows the state of the engine better than the engine does.
See Also:
- Sergey Petrunia’s blog on the topic of extended keys: http://s.petrunia.net/blog/?p=74
- Sergey Glukhov blogs on the MySQL implementation: http://glukhsv.blogspot.com.au/2012/12/innodb-extended-secondary-keys.html
VMware joins the cloud wars with vCloud Hybrid Service
Although this has been long-rumored, and then was formally mentioned in VMware’s recent investor day, VMware has only just formally announced the vCloud Hybrid Service (vCHS), which is VMware’s foray into the public cloud IaaS market.
VMware has previously had a strategy of being an arms dealer to service providers who wanted to offer cloud IaaS. In addition to the substantial ecosystem of providers who use VMware virtualization as part of various types of IT outsourcing offerings, VMware also signed up a lot of vCloud Powered partners, each of which offered what was essentially vCloud Director (vCD) as a service. It also certified a number of the larger providers as vCloud Datacenter Service Providers; each such provider needed to meet criteria for reliability, security, interoperability, and so forth. In theory, this was a sound channel strategy. In practice, it didn’t work.
Of the certified providers, only CSC has managed to get substantial market share, with Bluelock trailing substantially; the others haven’t gotten much in the way of traction, Dell has now dropped their offering entirely, and neither Verizon nor Terremark ended up launching the service. Otherwise, VMware’s most successful service providers — providers like Terremark, Savvis, Dimension Data, and Virtustream — have been the ones who chose to use VMware’s hypervisor but not its cloud management platform (in the form of vCD).
Indeed, those successful service providers (let’s call them the clueful enterprise-centric providers) are the ones that have built the most IP themselves — and not only are they resistant to buying into vCD, but they are increasingly becoming hypervisor-neutral. Even CSC, which has staunchly remained on VMware running on VCE Vblocks, has steadily reduced its reliance on vCD, bringing in a new portal, service catalog, orchestration engine, and so forth. Similarly, Tier 3 has vCD under the covers, but never so much as exposed the vCD portal to customers. (I think the industry has come to a broad consensus that vCD is too complex of a portal for nearly all customers. Everyone successful, even VMware themselves with vCHS, is front-ending their service with a more user-friendly portal, even if customers who want it can request to use vCD instead.)
In other words, even while VMware remains a critical partner for many of its service providers, those providers are diversifying their technology away from VMware — their success will be, over time, less and less VMware’s success, especially if they’re primarily paying for hypervisor licenses, and not the rest of VMware’s IT operations management (ITOM) tools ecosystem. The vCloud Powered providers that are basically putting out vanilla vCD as a service aren’t getting significant traction in the market — not only can they not compete with Amazon, but they can’t compete against clueful enterprise-centric providers. That means that VMware can’t count on them as a significant revenue stream in the future. And meanwhile, VMware has finally gotten the wake-up call that Amazon’s (and AWS imitators) increasing claim on “shadow IT” is a real threat to VMware’s future not only in the external cloud, but also in internal data centers.
That brings us to today’s reality: VMware is entering the public cloud IaaS market themselves, with an offering intended to compete head-to-head with its partners as well as Amazon and the whole constellation of providers that don’t use VMware in their infrastructure.
VMware’s thinking has clearly changed over the time period that they’ve spent developing this solution. What started out as a vanilla vCD solution intended to enable channel partners who wanted to deliver managed services on top of a quality VMware offering, has morphed into a differentiated offering that VMware will take to market directly as well as through their channel — including taking credit cards on a click-through sign-up for by-the-hour VMs, although the initial launch is a monthly resource-pool model. Furthermore, their benchmark for price-competitiveness is Amazon, not the vCloud providers. (Their hardware choices reflect this, too, including their choice to use EMC software but going scale-out architecture and commodity hardware across the board, rather than much more expensive and much less scalable Vblocks.)
Fundamentally, there is virtually no reason for providers who sell vanilla vCD without any value-adds to continue to exist. VMware’s vCHS will, out of the gate, be better than what those providers offer, especially with regard to interoperability with internal VMware deployments — VMware’s key advantage in this market. Even someone like a Bluelock, who’s done a particularly nice implementation and has a few value-adds, will be tremendously challenged in this new world. The clueful providers who happen to use VMware’s hypervisor technology (or even vCD under the covers) will continue on their way just fine — they already have differentiators built into their service, and they are already well on the path to developing and owning their own IP and working opportunistically with best-of-breed suppliers of capabilities.
(There will, of course, continue to be a role for vCloud Powered providers who really just use the platform as cloud-enabled infrastructure — i.e., providers who are mostly going to do managed services of one sort or another, on top of that deployment. Arguably, however, some of those providers may be better served, over the long run, offering those managed services on top of vCHS instead.)
No one should underestimate the power of brand in the cloud IaaS market, particularly since VMware is coming to market with something real. VMware has a rich suite of ITOM capabilities that it can begin to build into an offering. It also has CloudFoundry, which it will integrate, and would logically be as synergistic with this offering as any other IaaS/PaaS integration (much as Microsoft believes Azure PaaS and IaaS elements are synergistic).
I believe that to be a leader in cloud IaaS, you have to develop your own software and IP. As a cloud IaaS provider, you cannot wait for a vendor to do their next big release 12-18 months from now and then take another 6-12 months to integrate it and upgrade to it — you’ll be a fatal 24 months behind a fast-moving market if you do that. VMware’s clueful service providers have long since come to this realization, which is why they’ve moved away from a complete dependence on VMware. Now VMware itself has to ensure that their cloud IaaS offering has a release tempo that is far faster than the software they deliver to enterprises. That, I think, will be good for VMware as a whole, but it will also be a challenge for them going forward.
VMware can be successful in this market, if they really have the wholehearted will to compete. Yes, their traditional buying center is the deeply untrendy and much-maligned IT Operations admin, but if anyone would be the default choice for that population (which still controls about a third of the budget for cloud services), it’s VMware — and VMware is playing right into that story with its emphasis on easy movement of workloads across VMware-based infrastructures, which is the story that these guys have been wanting to hear all along and have been waiting for someone to actually deliver.
Hello, vCHS! Good-bye, vCloud Powered?
Replication in MySQL 5.6: GTIDs benefits and limitations – Part 1
Global Transaction Identifiers (GTIDs) are one of the new features regarding replication in MySQL 5.6. They open up a lot of opportunities to make the life of DBAs much easier when having to maintain servers under a specific replication topology. However you should keep in mind some limitations of the current implementation. This post is the first one of a series of articles focused on the implications of enabling GTIDs on a production setup.
The manual describes very nicely how to switch to GTID-based replication, so I won’t repeat it.
Basically the steps are:
- Make the master read-only so that the slaves can execute all events and be in sync with the master
- Change configuration for all servers and restart them
- Use CHANGE MASTER TO to instruct all servers to use GTIDs
- Disable read-only mode
This procedure will switch all your servers from regular replication to GTID replication. But if you are running a production system, you will probably want to gradually enable GTID replication for an easier rollback in the event of a problem. And some items in the documentation are not so clear.
For instance:
- Do we really need to restart all the servers at the same time? Downtime is something we like to avoid!
- Is it necessary to make the master read-only?
- Can we use regular replication for some slaves and GTID replication for other slaves at the same time?
To find an answer to these questions, let’s create a simple replication configuration with one master and two slaves, all running MySQL 5.6 with GTIDs disabled.
First try: configure only one of the servers with GTIDs
Let’s stop slave #2, change configuration and restart it:
mysql> show slave status\G
[...]
Slave_IO_Running: No
Slave_SQL_Running: Yes
[...]
The error log tells us why the IO thread has not started:
2013-05-17 13:21:26 3130 [ERROR] Slave I/O: The slave IO thread stops because the master has GTID_MODE OFF and this server has GTID_MODE ON, Error_code: 1593
So unfortunately if you want replication to work correctly, gtid_mode must be ON on all servers or OFF on all servers, but not something in the middle.
What if we try to reconfigure the master? This time, replication on slave #1 will stop:
2013-05-17 13:32:08 2563 [ERROR] Slave I/O: The slave IO thread stops because the master has GTID_MODE ON and this server has GTID_MODE OFF, Error_code: 1593
These simple tests answer the first two questions: replication works only if all servers have the same value for gtid_mode, so you should restart them at the same time, which is best done by making the master read-only. However, “at the same time” means “at the same binlog position”, so you can perfectly restart the servers one by one.
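The rule found above (master and slave must agree on gtid_mode, otherwise the IO thread stops with error 1593) is easy to check before restarting anything. Here is a minimal sketch. The server names and the two dictionaries are made up, and in practice the values would come from reading the gtid_mode variable on each server.

```python
def broken_links(gtid_mode, replicates_from):
    """Return (slave, master) pairs whose gtid_mode values differ.

    gtid_mode: dict mapping server name to "ON" or "OFF".
    replicates_from: dict mapping each slave to its master.
    """
    return [
        (slave, master)
        for slave, master in sorted(replicates_from.items())
        if gtid_mode[slave].upper() != gtid_mode[master].upper()
    ]


# Hypothetical topology: one master, two slaves, only slave2 reconfigured.
gtid_mode = {"master": "OFF", "slave1": "OFF", "slave2": "ON"}
replicates_from = {"slave1": "master", "slave2": "master"}

print(broken_links(gtid_mode, replicates_from))  # [('slave2', 'master')]
```

An empty list means every replication link has matching settings on both ends.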
Second try: GTIDs enabled, mixing regular replication and GTID replication
This time, we will enable GTID replication on slave #1, but not on slave #2:
# slave #1
mysql> change master to master_auto_position = 1;
mysql> start slave;
and let’s create a new table on the master:
mysql> create table test.t (id int not null auto_increment primary key);
Executing SHOW TABLES FROM test on both slaves shows that the table has been created everywhere. So once GTIDs are enabled on all servers, you can have some slaves using file-based positioning and some other slaves using GTID-based positioning.
This answers the third question: we can have different replication modes on different servers, but only if all servers have gtid_mode set to ON. Could it be interesting to run file-based replication when gtid_mode is ON? I can’t think of any use case, so in practice, you’ll probably use either file-based replication only (gtid_mode=off for all servers) or GTID-based replication only (gtid_mode=on for all servers).
Additional question: how can you know if a slave is using GTID-based replication by inspecting the output of SHOW SLAVE STATUS? Look at the last field, Auto_Position:
# Slave #1
mysql> show slave status\G
[...]
Auto_Position: 1 -> GTID-based positioning
# Slave #2
mysql> show slave status\G
[...]
Auto_Position: 0 -> File-based positioning
Conclusion
Enabling GTID-based replication can be tricky if your application does not easily tolerate downtime or read-only mode, especially if you have a lot of servers to reconfigure. It would be really nice to be able to mix servers where gtid_mode is ON with servers where gtid_mode is OFF. This would greatly simplify the transition to GTID-based replication and allow easier rollbacks if something goes wrong.
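When you have many slaves to verify, the Auto_Position check described earlier can be scripted. The sketch below parses the vertical (\G) output of SHOW SLAVE STATUS as plain text. The sample text is abbreviated and made up, and the function names are hypothetical.

```python
def parse_vertical_output(text):
    """Turn 'Name: value' lines from \\G-style output into a dict."""
    fields = {}
    for line in text.splitlines():
        # Skip row separators and anything that is not a 'Name: value' pair.
        if ":" not in line or line.lstrip().startswith("*"):
            continue
        name, _, value = line.partition(":")
        fields[name.strip()] = value.strip()
    return fields


def uses_gtid_positioning(slave_status_text):
    """True if the slave reports Auto_Position: 1."""
    return parse_vertical_output(slave_status_text).get("Auto_Position") == "1"


# Abbreviated, made-up sample of SHOW SLAVE STATUS\G output.
sample_status = """\
*************************** 1. row ***************************
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
                Auto_Position: 1
"""

print(uses_gtid_positioning(sample_status))  # True
```

A slave still using file-based positioning reports Auto_Position: 0 and the function returns False.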
The post Replication in MySQL 5.6: GTIDs benefits and limitations – Part 1 appeared first on MySQL Performance Blog.
Dell withdraws from the public cloud IaaS market
Today, not long after its recent acquisition of Enstratius, Dell announced a withdrawal from the public cloud IaaS market. This removes Dell’s current VMware-based, vCloud Datacenter Service from the market; furthermore, Dell will not launch an OpenStack-based public cloud IaaS offering later this year, as it had originally intended to do. This does not affect Dell’s continued involvement with OpenStack as a CMP for private clouds.
It’s not especially surprising that Dell decided to discontinue its vCloud service, which has gained very limited traction in the market and was expensive even compared to other vCloud offerings; given its intent to launch a different offering, the writing was mostly on the wall already. What’s more surprising is that Dell has decided to focus on an Enstratius-enabled cloud services broker (CSB) role when its two key competitors, HP and IBM, are trying to control an entire technology stack that spans hardware, software, and services.
It is clear that it takes significant resources and substantial engineering talent — specifically, software engineering talent — to be truly competitive in the cloud IaaS market, sufficiently so to move the needle of a company as large as Dell. I do not believe that cloud IaaS is, or will become, a commodity; I believe that the providers will, for many years to come, compete to offer the most capable and feature-rich offerings to their customers.
Infrastructure, of course, still needs to be managed. IT operations management (ITOM) tools, whether ITIL-ish as in the current market or DevOps-ish as in the emerging market, will remain necessary. All the capabilities that make it easy to plan, deploy, monitor, manage, and so forth are still necessary, although you potentially do these things differently in the cloud than on-premises. Such capabilities can either be built into the IaaS offerings themselves (perhaps with bundled pricing, perhaps as value-added services, but certainly as the place where much of the margin will be made and where providers will differentiate), or they can come from third-party multi-cloud management vendors who are able to overlay those capabilities on top of other people’s clouds.
Dell’s strategy essentially bets on the latter scenario — that Enstratius’s capabilities can be extended into a full management suite that’s multi-cloud, allowing Dell to focus all of its resources on developing the higher-level functionality without dealing with the lower-level bits. Arguably, even if the first scenario ends up being the way the market goes (I favor the former scenario over the latter one, at present), there will still be a market for cloud-agnostic management tools. And if it turns out that Dell has made the wrong bet, they can either launch a new offering, or they may be able to buy a successful IaaS provider later down the line (although given the behemoths that want to rule this space, this isn’t as likely).
From my perspective, as strategies go, it’s a sensible one. Management is going to be where the money really is — it won’t be in providing the infrastructure resources. (In my view, cloud IaaS providers will eventually make thin margins on the resources in order to get the value-adds, which are basically ITOM SaaS, plus most if not all will extend up into PaaS.) By going for a pure management play, with a cloud-native vendor, Dell gets to avoid the legacy of BMC, CA, HP, IBM/Tivoli, and its own Quest, and their struggles to make the shift to managing cloud infrastructure. It’s a relatively conservative wait-and-see play that depends on the assumption that the market will not mature suddenly (beware the S-curve), and that elephants won’t dance.
If Dell really wants to be serious about this market, though, it should start scooping up every other vendor that’s becoming significant in the public cloud management space and has complementary offerings (everyone from New Relic to Opscode, etc.), building itself into an ITOM vendor that can comprehensively address cloud management challenges.
And, of course, Dell is going to need a partner ecosystem of credible, market-leading IaaS offerings. Enstratius already has those partners — now they need to become part of the Dell solutions portfolio.


