Feed aggregator

Instart Logic launches a new kind of acceleration service

There have been three core techniques for accelerating content and application delivery over the Internet — caching (“classic” CDN), network optimization (think protocol tricks, like F5 Web Application Accelerator on the hardware side, or Akamai DSA on the service side), and front-end optimization (FEO, think content re-write, like Aptimize/Riverbed or Strangeloop/Radware on the software side, or Blaze.io/Akamai or Acceloweb/Limelight on the service side).

Now, with the launch of Instart Logic, there’s a fourth technique, that I don’t yet have a name for. In spirit, it’s probably most similar to a SoftWOC, but in this case, the client endpoint is the browser, and the symmetric remote endpoint is the CDN server. The techniques are also different from typical SoftWOC techniques, as far as I know.

From the perspective of an Instart Logic customer, they’re getting a dynamic acceleration service that, from a deployment perspective, is much like a CDN. For most customers, it would entirely replace using a traditional CDN (rather than being additive) — i.e., they would buy this instead of buying Akamai DSA or a similar service. Note that this is a performance play, not a price play — Instart Logic expects that they’ll be in the ballpark of typical dynamic acceleration pricing, and that performance carries a market premium.

The techniques used in the service are intended to dramatically improve load times, especially on congested networks; this is particularly useful in mobile, but it is not mobile-specific. As with FEO, the goal is to allow the end-user to quickly see and interact with the content while the remainder of the page is still downloading.

On the client side, there’s what they call a “NanoVisor” — an HTML5-based thin virtualization layer that runs in the browser. If Instart Logic is full-proxying the customer’s site, the NanoVisor code can simply be injected; otherwise the customer can insert the code into their site. It requires no other changes to the customer’s site. The NanoVisor provides intelligence about the end-user and serves as the client endpoint for the optimization.

On the server side, the “AppSequencer” analyzes page content, and it fragments and orders objects that are then streamed to the NanoVisor. It does large-scale analysis of usage patterns, and it predictively sends things based on the responses that it’s seen before. There’s compression and network optimization techniques, as well as implicit caching.

Like other recent innovators in the CDN space, Instart Logic is predominantly a software company. Whlie they do have servers of their own, they are also using a variety of cloud IaaS providers for capacity. They’re also using Dyn for DNS.

Instart Logic has raised a significant amount of money, almost purely from top-tier VCs — $26 million to date. I think their technology is very promising, which probably means they’ll get a bit of time to prove themselves out and then they’ll get bought by one of the CDNs looking to get an edge on the competition, or maybe even an ADC or WOC vendor.

Instart Logic’s demos are impressive, and they’ve got paying customers at this point, although obviously they’re newly-launched. While it always takes time to build trust in this industry, at this point they’re worth checking out, and I’ve been referring Gartner clients to them ever since I was briefed by them while they were still in stealth mode, a few months back. They’re potentially an excellent fit for customers who are looking for something beyond what DSA-style network optimization offerings can do, but either do not want to do FEO, have reached the limits of what FEO can offer them, or simply want to explore alternatives.


Percona XtraDB Cluster (PXC) in the real world: Share your use cases!

MySQL Performance Blog - Mon, 17/06/2013 - 20:00

The aim of this post is to enumerate real-world usage of Percona XtraDB Cluster (PXC), and also to solicit use cases from the readers. One of the prominent usages in the production environment that we have come across (and our Percona consultants have assisted) is that of HP Cloud. There is a post about it here by Patrick Galbraith of HP. The post focuses on their deployment of PXC for HP Cloud DNS. The post focuses on the key aspects of synchronous replication setup with high-availability guarantees like split-brain immunity.

Nobody likes to debug async replication while its broken or do the master-master/master-slave switchover when master is dying/dead. Yes, there are wrappers/scripts around this to make life easier, however, wouldn’t it be nice if this was built into the system itself? PXC based on Galera strives to provide that. Scaling makes sense only when addition/removal of hosts from a cluster or a HA setup is simple and uncomplicated.

Their post focuses on following aspects:

  • Initial setup
  • Setup of other nodes with SST (Xtrabackup SST)
  • Integration of chef with PXC
  • Finally, integration of HAProxy as a loadbalancer.

To elucidate, their initial setup goes into bootstrapping the first node. Note that in the cloud environment other nodes are not known until they are brought up, hence bootstrapping with an empty gcomm:// is done for the first node by the chef. The second node is then added which SSTs with node1 (based on gcomm://node1 of node2) through Xtrabackup SST (state snapshot transfer). Node3 subsequently joins the cluster with node1 and node2 in its gcomm:// (since by this time node1, node2 are up). After this, a subsequent run of chef-client is done to update the cnf files with IP address of members (excluding itself). The rationale behind this is that when a node is restarted (and there are others when it comes up) it joins the cluster seamlessly. I would like to note here that we are adding a bootstrap parameter to PXC so that any latter modifications like these to cnf files are not required and preset it during cluster startup itself. The only caveat is that the node information – IP address or hostname – should be known in advance (the node itself needn’t be up), which may not be feasible in a cloud environment.

Next, the SST. Xtrabackup SST is used there. SST matters a lot because not only is it used during initial node setup but also it is required when a node has been down for a while and IST (incremental state transfer) is not feasible. It also helps when node data integrity is compromised. So, naturally duration of SST is paramount. We recommend Xtrabackup SST for its reduced locking period from its use (which means the donor is blocked for a shorter while). By using Xtrabackup for SST, you also get its benefits like compression, parallel streaming, encryption, compact backups which can be used for SST (Note, the wsrep_sst_xtrabackup in 5.5.30 can’t do those except parallel, the one in 5.5.31 will handle them all, also XB 2.1 is required for most).

Finally, the HAProxy. HAProxy is one of the loadbalancers recommended for use with PXC. The other one is glb. HAProxy is used with xinetd on the node along with a script which checks PXC for its sync status. As referenced in that post, you can refer a post by Peter Boros (“Percona XtraDB Cluster reference architecture with HaProxy“) for details. In their setup they have automated this with a HAProxy in each AZ (Availability Zone) for the API server. To add, we are looking at reducing the overhead here, through steps like replacing xinetd and clustercheck with a single serving process (we are adding one in 5.5.31), looking for optimizations with HAProxy to account for high connection rates, and using pacemaker with PXC. The goal is to reduce the overhead of status checks, mainly on the node. You can also look this PLMCE talk for HAProxy deployment strategies with PXC.

To conclude, it is interesting to note that they have been able to manage this with a small team. That strongly implies scalability of resources – you scale more with less, and that is how it should be. We would like to hear from you about your architectural setup around PXC – any challenges you faced (and horror stories if any), any special deployment methodologies you employed (Puppet, Chef, Salt, Ansible etc. ), and finally any suggestions.

The post Percona XtraDB Cluster (PXC) in the real world: Share your use cases! appeared first on MySQL Performance Blog.

Call for papers: Percona Live London

MySQL Performance Blog - Fri, 14/06/2013 - 23:48

If you attended last April’s Percona Live MySQL Conference and Expo in Santa Clara, you know it was a rare opportunity to learn from some of the most accomplished system architects and developers in the business. Now it’s time to give back.

The call for papers (CFP) is now open for Percona Live London, November 11-12 at the Millenium Gloucester Conference Center. Login or register as speaker and submit your proposals.

Your participation is an opportunity to make a difference. After all, the conference would be meaningless without great content. So step up and volunteer to present your unique insights into MySQL and MySQL-related technologies either as a keynote, tutorial or breakout session.

Tell us your war story: What have you experienced while actively using MySQL, Percona Server, MariaDB, Drizzle or (fill in the blank) in production? Are you managing multiple environments? Are you working with MySQL or its variants at the code level? Share your perspective about the internals or explain how specific features work in depth. These stories are gold! And you owe it to the MySQL community to share them.

Don’t worry about being fancy. Some of the most popular talks have been on simple topics done well. Many people come to the conferences to learn — and we need content for MySQL newcomers as well as the veterans.

We’re also looking for sponsors. This is the showcase event for Europe’s growing MySQL community of developers, business managers, technology evangelists, DBAs and entrepreneurs. Download the prospectus and take a closer look at the benefits of sponsorship.

Questions? Let me know in the comments section. I look forward to hearing from you!

The post Call for papers: Percona Live London appeared first on MySQL Performance Blog.

Djancocon 2013 call for papers open

Baron Schwartz - Thu, 13/06/2013 - 22:44

Are you a Django user? There’s an upcoming Django conference in Chicago in a few months, and I know they’re looking for speakers with MySQL experience in particular. One suggestion the organizers have floated is a talk on MySQL:

I’m looking for someone to give at least one MySQL talk there. In particular, I would love a (friendly but vigorous) “Why you should use MySQL instead of PostgreSQL talk”, as PostgreSQL tends to get a lot of love and attention at Django events, and MySQL not so much.

Take a look at it and see if you are interested. Presenting at a conference is one of the best things you can do for your career, your company, and your community of open-source software. I highly encourage it if you haven’t tried it.

Percona MySQL University @Portland next Monday!

MySQL Performance Blog - Wed, 12/06/2013 - 20:00

We’re less than a week away from Percona MySQL University at Portland, Oregon next Monday, June 17. The latest in a series of FREE one-day educational events, we are pleased to feature 10 technical talks by members of Team Percona as well as local members of the MySQL Community:

The daylong event will be held at Portland State University’s Smith Memorial Student Union, located at 1825 SW Broadway, Suite 327/8/9 Portland, Oregon 97201. Afterward, we’ll have a networking reception at the famed Paddy’s Bar and Grill sponsored by Tag1 Consulting featuring great networking possibilities and free drinks for event attendees.

If you’re in the Portland area and work with MySQL, then this is an event you can’t afford to miss…   So register now!

Please also join the Portland MySQL Meetup group for more MySQL-focused events in Portland

If you love the ideal of Percona MySQL University and would like us to bring the event to your city, please let us know!

The post Percona MySQL University @Portland next Monday! appeared first on MySQL Performance Blog.

How people are using MySQL… from 1 user to 100 million (upcoming conference talk)

MySQL Performance Blog - Tue, 11/06/2013 - 18:00

MySQL can be deployed in several ways, and that means you can choose a tailor-made path that best meets your needs. With simple services or development systems, many people are using a single server with some backups configured, and then simply take the downtime when a restore is needed.

As the application evolves, additional requirements will appear like hot backups, online schema changes, replication based high availability (which has some caveats). Also, because of asynchronous replication, you can end up having inconsistent data on the nodes and encounter replication errors, which you have to deal with.

Even later, when asynchronous or single-threaded replication is not enough, people can use Percona XtraDB Cluster, which has some of its own caveats, but you can also gain a lot from write set replication.

If the application gets really popular, sooner or later you have to think about write scalability, and implement sharding ultimately.

I will give a talk on this topic at next month’s RAMP conference in Budapest, Hungary. My presentation will provide an overview of some possible deployment and scaling scenarios, when it makes sense to use one or an other, and common pitfalls we typically see in our consulting practice. Please let me know if you plan on attending – and also feel free to post questions in advance below.

The post How people are using MySQL… from 1 user to 100 million (upcoming conference talk) appeared first on MySQL Performance Blog.

Percona Server 5.1.69-14.7 now available: A drop in replacement for MySQL

MySQL Performance Blog - Tue, 11/06/2013 - 04:32

Percona Server for MySQL version 5.1.69-14.7

Percona is glad to announce the release of Percona Server 5.1.69-14.7 on June 10, 2013. A drop in replacement for MySQL, downloads are available  here and from the Percona Software Repositories. Based on MySQL 5.1.69, this release will include all the bug fixes in it. All of Percona’s software is open-source and free, all the details of the release can be found in the 5.1.69-14.7 milestone at Launchpad.

Bugs Fixed:

  • In Ubuntu Precise libmysqlclient18 package was chosen from the distribution’s repository instead of Percona’s which could lead to package conflicts. Bug fixed #1174271.
  • Fixed the RPM Percona-Server-shared-compat package naming issue that could lead to unresolved package dependencies when installing Percona Server 5.1. Bug fixed #893860.
  • The log tracker thread was unaware of the situation when the oldest untracked log records are overwritten by the new log data. In some corner cases this could lead to assertion errors in the log parser or bad changed page data. Bug fixed #1108613.
  • Percona Server wouldn’t start if the XtraDB changed page tracking was enabled and variable innodb_flush_method was set to ALL_O_DIRECT. Bug fixed #1131949.
  • Fixed the RPM package dependencies for different major versions of Percona Server. Bug fixed #1167109.
  • Fixed the CVE-2012-5627 vulnerability, where an unprivileged MySQL account owner could perform brute-force password guessing attack on other accounts efficiently. This bug fix comes originally from MariaDB (see MDEV-3915). Bug fixed #1172090.
  • OpenSSL libraries were not found in 32-bit builds due to a typo. Bug fixed #1175447.
  • Query to the INNODB_CHANGED_PAGES table would cause server to stop with an I/O error if a bitmap file in the middle of requested LSN range was missing. Bug fixed #1179974.
  • Server would crash if an INNODB_CHANGED_PAGES query is issued that has an empty LSN range and thus does not need to read any bitmap files. Bug fixed #1184427.
  • Incorrect schema definition for the User Statistics tables in INFORMATION_SCHEMA (CLIENT_STATISTICS, INDEX_STATISTICS, TABLE_STATISTICS, THREAD_STATISTICS, and USER_STATISTICS) led to the maximum counter values being limited to 32-bit signed integers. Fixed so that these values can be 64-bit unsigned integers now. Bug fixed #714925.
  • mysql_set_permission was failing on Debian due to missing libdbd-mysql-perl package. Fixed by adding the package dependency. Bug fixed #1003776.
  • XtraDB changed page tracking used to hold the log system mutex for the log reads needlessly, potentially limiting performance on write-intensive workloads. Bug fixed #1171699.
  • Missing path separator between the directory and file name components in a bitmap file name could stop the server starting if the innodb_data_home_dir variable didn’t have the path separator at the end. Bug fixed #1181887.
  • A warning is now returned if a bitmap file I/O error occurs after an INNODB_CHANGED_PAGES query started returning data to indicate an incomplete result set. Bug fixed #1185040.
  • Fixed the upstream bug #69379 which caused MySQL clients to return bogus error number for host-not-found errors on Ubuntu 13.04. Bug fixed #1186690.
  • Under very rare circumstances, deleting a zero-size bitmap file at the right moment would make server stop with an I/O error if changed page tracking is enabled. Bug fixed #1184517.
  • The INNODB_CHANGED_PAGES table couldn’t be queried if the log tracker wasn’t running. Bug fixed #1185304.

Other bug fixes: bug fixed #1174346, bug fixed #1160951, bug fixed #1079688, bug fixed #1132412, bug fixed #1153651.

Release notes for Percona Server for MySQL 5.1.69-17.4 are available in our online documentation. Bugs can be reported on the launchpad bug tracker.

The post Percona Server 5.1.69-14.7 now available: A drop in replacement for MySQL appeared first on MySQL Performance Blog.

Migrating between MySQL schemas with Percona Xtrabackup

MySQL Performance Blog - Mon, 10/06/2013 - 20:00

Recently, I was working with a client that asked about using Percona Xtrabackup to take a snapshot of a particular MySQL schema and then reload it with a different schema name on the same server.  It caught me off guard because I’d never really thought about it – typically, I’ve used Xtrabackup simply to clone a server (for replication or migration) or migrate an existing schema to a new server.  However, given the import/export functionality of Xtrabackup combined with Percona Server (Exporting and Importing Tables), it did seem possible.

Further discussion with the client clarified the use case: clients make a mistake and need to compare their old data to their current data on the current live server.  Mysqldump works well for this, but can quite slow on larger schemas.

One of the downsides to mysqldump is the need to scan the full tables and in turn, load that data into and pollute the buffer pool.  Note that this can be somewhat mitigated using innodb_old_blocks_time, but that is outside the scope of this post.  Similarly, reloading the data will be a very IO intense operation (redo logs, binlogs, etc) as well further polluting the buffer pool.  Enter Xtrabackup…

As a quick refresher, Xtrabackup works by copying the dirty tablespace files while streaming the redo-logs to ensure that all transactions are also captured.  The –apply-logs phase simply utilizes the built in crash recovery and applies the redo-logs to the dirty tablespace and voila, you have a consistent binary backup at a point in time.

When running Percona Server, you can utilize the –export flag during the –apply-logs phase and then re-import those files to a running server.  However, you need to have existing table structures in place.  Fortunately, this can be done easily using mysqldump –no-data.

Now, enough with the theory, here is the procedure I used:

  1. Locate an existing snapshot that contains the schema you are interested in (/tmp/snapshot/2013-06-03_11-30/orig)
  2. Get the table structures:  mysqldump –no-data orig > /tmp/orig.schema.sql
  3. Create the new target database:  mysqladmin create orig_old
  4. Load the schema into the target database:  mysql orig_old < /tmp/orig.schema.sql
  5. Ensure innodb_import_table_from_xtrabackup = 1  (dynamic variable)
  6. Prepare the backup using the –export flag:  innobackupex –apply-log –export /tmp/snapshot/2013-06-03_11-30
  7. For each table, run:  ALTER TABLE tblname DISCARD TABLESPACE
  8. Copy the .exp and .ibd files from the snapshot to the new instance:  cp /tmp/snapshot/2013-06-03_11-30/orig/*[.exp|.ibd] /var/lib/mysql/orig_old
  9. Make sure that the files are owned by mysql: chown mysql:mysql /var/lib/mysql/orig_old/*
  10. For each table, run: ALTER TABLE tblname IMPORT TABLESPACE

Now, you have a old version of the schema running side by side with the current version on the same server.  This will allow you to compare and restore values (potentially corrupted via user error or other issues) with SQL rather than needing to import from a remote server.  This can allow for more targeted restores, easier comparison, and allow remote users to compare on a live system without needed to grant access to another “backup” server.

As noted by one of my colleagues (thanks Bill Karwin!), the cumbersome part of this process is the DISCARD/IMPORT TABLESPACE step as that is done manually for each table (currently a blueprint in innobackupex).  He also included this helpful script to generate all of those statements for steps 7 and 10 in two scripts:

mysql -N -B <<’EOF’ > discard-ddl.sqlSELECT CONCAT(‘ALTER TABLE `’, table_name, ‘` DISCARD TABLESPACE;’) AS _ddlFROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA=’orig’ AND ENGINE=’InnoDB’;EOFmysql -N -B <<’EOF’ > import-ddl.sqlSELECT CONCAT(‘ALTER TABLE `’, table_name, ‘` IMPORT TABLESPACE;’) AS _ddlFROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA=’orig’ AND ENGINE=’InnoDB’;EOF

For full disclosure, here are the versions I was using on my test VM:

  • Percona Server 5.5.30
  • Xtrabackup 2.0.6

Some other use cases that come to mind for this technique would be:

  • Targeted schema migration (to new schema name) using –include=”^orig[.]“ (i.e. functional partitioning)
    • Combine with –replicate-rewrite-db
  • Daily backups from production to staging with different db names (i.e. orig_prod restored to orig_staging)
  • I’m sure there are others as well…

The post Migrating between MySQL schemas with Percona Xtrabackup appeared first on MySQL Performance Blog.

Choosing a MySQL HA Solution – Post-Webinar Q&A

MySQL Performance Blog - Fri, 07/06/2013 - 21:00

Thanks to everyone who was in attendance on 05 June 2013 for my “Choosing a MySQL HA Solution” webinar. If you weren’t able to make it but are interested in listening to the presentation, it’s currently up and available for viewing over at percona.com.

My apologies if we weren’t able to get to your question during the initial session, so I’ll address those lingering questions in this post, along with providing a bit more detail on some of the questions that I did cover during the session.

Q: What is the reason that I recommended DRBD be used only on physical hardware and not on virtual machines?
A: I covered this a bit during the session, but to provide a bit more commentary. There are really two main reasons that I don’t like DRBD with virtual machines. First is the disk IO performance hit that comes with DRBD. When you run a virtual machine, in most cases your virtual disk is basically nothing more than a file on the host filesystem rather than an actual physical volume. So, in effect, you’re adding performance hit to performance hit. The other reason is because virtual machines tend to handle process scheduling and timing much differently than traditional hardware, and I’ve seen situations where even under just a moderate load on the host system, DRBD can end up in a split-brain situation. Typically DRBD is run in combination with heartbeat or pacemaker or something similar, and if one of the servers in the DRBD pair is having issues receiving packets from the other side, things start to degrade.

If you’re just setting up a test environment to get familiar with the technology, certainly there’s nothing wrong with a virtual-machine approach, but for the optimal production deployment you want two identical physical boxes with at least 4 NIC ports. Why 4? Two of those NICs should be directly connected from one machine to another and configured with Linux network interface bonding in balance-rr mode; this is the only NIC bonding mode which will allow you to stripe a TCP connection over multiple ports, and with a two-NIC bonded pair you’ll get roughly 1.67x the throughput of a single port. [Don't try more than 2; the additional work that the kernel has to do in reordering TCP packets can actually result in performance that's worse than a single NIC.] The other two NIC ports should be bonded with active-backup or LACP (depending on your switching infrastructure). In essence, you’re looking for two of everything.

Q: Is it possible to replicate only some tables from master to slave, and if so, how?
A: Yes. There are a few of ways to do it. One way is to set up replication filters on the slave, as described in the MySQL manual. You can configure the slave to only replicate a specific database, a set of databases, or specific tables; you can also configure the slave to replicate everything EXCEPT a specified set of databases and tables. With this method, every event is still written to the binary log on the master, but the slave determines what to do with it. The other approach is to filter what gets written to the binary log on the master. This can be useful if you’re trying to limit the amount of data sent over the wire, but it comes at the cost of having binary logs which are no longer complete or useful for point-in-time recovery. Generally if you’re considering the binary log filtering approach, I think it’s safer to set up a relay master (it can be on the same machine if you use the BLACKHOLE storage engine) in between the actual master and the slaves at the bottom of the replication topology. On the top-level master, you leave all filters disabled; on the relay master you add binary log filters; this ensures that the top-level master and its binary logs are fully intact, and then the binary log filters are executed on the intermediate server, thus resulting in less binary log data being sent down to the lower-level slaves. You can also use the slave_compressed_protocol option in /etc/my.cnf if bandwidth is a concern.

If you’re thinking about employing a filtering solution of any sort and it’s something you’re not that familiar with, I’d suggest reading over the manual’s description of how replication filtering rules are processed. There are some informative flowcharts in addition to the textual explanation.

Q: What is my recommendation for MySQL 5.6? Is MHA ready for MySQL 5.6?
A: My personal recommendation for MySQL 5.6, based on what I have seen so far (admittedly, not that much – I have personally only worked with one customer that’s using 5.6), is that I question its readiness. There have been some unfortunate performance regressions compared to 5.5, such as bug #69258, and bug #69318, and I think it might take one or two more point releases before we can consider it truly production-ready. I suppose I should state an obvious disclaimer here in that the aforementioned statements reflect only my opinion and not any official Percona position, and I will look forward to being able to retract them.

MHA should work fine with 5.6 if you’re not using GTID-based replication. If you are using GTID-based replication, it looks like you’ll need to wait for the next release of MHA.

Q: How do you monitor MySQL to determine when it’s not responsive?
A: The easiest way to do this is to simply connect to it and attempt to run a simple query. I’d suggest doing this as a user that does NOT have the SUPER privilege; a user with SUPER will always be able to connect, and that may not accurately report what your application servers are seeing. But, if you can connect and run a simple query and receive a response back in a reasonable (what is reasonable is determined by your environment), it’s a safe bet that MySQL is up and running.

That said, there are plenty of other MySQL and OS-level status variables that you might want to keep an eye on to prevent a problem before it occurs. For example, if you see a large number of long-running queries in the output of SHOW PROCESSLIST that might be a sign that trouble is brewing. Or if you watch your MySQL server’s memory usage and you see that it’s starting to swap, that might indicate that something is up. FWIW, solutions like MHA and PRM simply attempt to connect and run a simple query, but it’s always possible to build something more involved if that’s what you need.

Q: What MySQL HA solution is most similar to Oracle RAC?
A: Oracle RAC is a “shared everything” system; there really isn’t anything quite like that in the MySQL world, except maybe MyISAM files on top of a clustered filesystem with an external lock manager. You might say that MySQL/NDB Cluster is also somewhat similar to Oracle RAC; all of the SQL nodes in a MySQL Cluster installation are going to be talking to the same set of data nodes on the back end, but I think that’s probably where the similarities end.

Q: What kind of replication issues can you get with storage solutions like GlusterFS?
A: I have to admit that this was probably the most interesting question asked; my initial reaction was, honestly, why would you ever want to use MySQL on top of something like that? I can’t imagine the performance being all that fantastic, and I could see network latency wreaking havoc with MySQL’s internal understanding of what’s happening on the system below. Having never experimented with GlusterFS, though, I decided to give it a shot. I set up a 3 node cluster in AWS, got the volume mounted, and went to try to install MySQL. It failed miserably on the mysql_install_db process.

Every time I tried, I got some variant of this message:
130606 2:22:41 [ERROR] /usr/libexec/mysqld: Incorrect information in file: ‘./mysql/servers.frm’
ERROR: 1033 Incorrect information in file: ‘./mysql/servers.frm’

Running mysql_install_db under strace, I saw that it was getting a lot of “bad file descriptor” errors. I was able to get the server up with –skip-grant-tables and insert a few rows into an InnoDB table, but trying to run a simple mysqlslap just hung on me, so that’s where I left it. Maybe I’ll try messing around with this again at some point in the future, but I’m not optimistic that this is a viable use case. If someone can prove me wrong, I’d be interested in knowing how you’ve set it up.

The post Choosing a MySQL HA Solution – Post-Webinar Q&A appeared first on MySQL Performance Blog.

Private, custom Percona MySQL Training

MySQL Performance Blog - Fri, 07/06/2013 - 20:00

Alexei Rodriguez presenting at PLMCE13

During the Percona Live MySQL Community Event this year we had the pleasure of Alexei Rodriguez discussing why Evernote uses Percona’s Private Custom MySQL Training. Alexei is VP of operations there and you can watch a video of his presentation on YouTube.

Alexei said he values having Percona tailor training to their needs, the expert delivery by our practitioner-trainers, and the ability to speak freely about operational issues during private training.

If you would like to talk with our team about private custom training, please contact Sally Martinez at (855) 55TRAIN or
(925) 271-5054, or via e-mail at sally.martinez@percona.com.

The post Private, custom Percona MySQL Training appeared first on MySQL Performance Blog.

Summertime Percona MySQL training update

MySQL Performance Blog - Thu, 06/06/2013 - 22:44

Now that June has arrived it is time to plan what you will do over the summer months. In addition to your summer vacation plans, give thought to MySQL training for you and your team.

Summer is the time to brush up on those critical skills needed to ensure all systems are ready for the holiday shopping season.

In addition to our revised courses, that I talked about in a previous post, we are also running our new Moving to MySQL 5.6 class. This class covers new features in MySQL 5.6, migration planning, and application verification. This class was designed with the experienced MySQL DBA in mind–so it is a fast paced 2-day course.

Percona has a packed summer MySQL training schedule. In June we have:

In July we have:

In August we have:

We have a 10% discount code for use when ordering, register early and save even more as the 10% discount can be applied to the early registration price.  Just use discount code mpb10 when checking to receive the discount.

The post Summertime Percona MySQL training update appeared first on MySQL Performance Blog.

The moment I first held my newborn daughter in my arms

Baron Schwartz - Thu, 06/06/2013 - 12:06

This is a personal post, not a technical one. We tell ourselves a lot of lies that are not okay. I want to out one of them. It is important to be real, to be true to oneself. This matters.

The lie starts something like this: the moment I held my newborn child in my arms, I looked into her tiny face and felt an all-encompassing, pure love. I was breathless. I could not think, speak, or move. My life was complete, and I knew I was changed forever in that instant. I knew that I would do anything for this tiny creature, even give up my own life.

This script is repeated ad nauseum in books, TV shows, movies, songs, and most of all, person-to-person. If you have children, ask yourself, is this honestly the experience you had? No? Well, you’re not alone, because this is by and large an absurd lie. Unsurprisingly, psychologists who study attachment can tell you that most people take time — some of them a lot of time — to become strongly attached to their children. If you don’t have this phony experience of mindblowing love at first sight, nothing is wrong with you. You are perfectly normal and this is okay.

Many men, in particular, don’t feel strong feelings for their children until many months or even years have passed. I know not only through my own experience, but also because I’ve asked literally dozens of fathers, and heard not one single story that matches what we’re told we should feel by popular media. Shockingly, some men never feel the feelings of fatherly attachment we’re taught we should feel. And that includes many of the best fathers that have ever lived.

If you aren’t a father, maybe you don’t know this, but infants do very little for a long time. They are helpless and confused, and they can’t move, they don’t recognize you, they don’t smile, they don’t laugh, they don’t express affection. They cry, poop, sleep, and eat, and they generally make you cry by interfering with your own pooping, sleeping, and eating. By the time children even begin to crack a smile, parents and grandparents have spent sleep-deprived months mistaking fart grimaces as smiles, out of a desperate desire to believe it’s a smile. Beyond smiles, it takes a long time until children return hugs or say “Daddy!” or snuggle or want to sit on your lap with a book and read.

The experience of being a father is very special, but let’s not lie to ourselves and give ourselves inferiority complexes if we don’t feel anything but exhausted and irritated for a long, long time after a child is born. Be patient and honest with yourself. It is more important to be honest than to conform to ridiculous expectations.

Why do we perpetuate this harmful and inappropriate lie about our blinding love for our children? Maybe it’s because we are confused about what love is. Love is not a feeling, it is the act of caring for another’s well-being, through conscious decision and a huge effort against internal resistance, because you know it is the right thing to do and for no other reason. Loving feelings are sometimes a result of loving action, and often don’t come soon or strongly, but real love is not conditioned upon the feelings, because they are an effect, not a cause. If you’re interested in this line of thought, I recommend reading “The Road Less Traveled” by M. Scott Peck. Love is not a feeling. I know this is not what most cultures teach, but it’s true.

And if you do have an amazing experience of overwhelming feelings in those first few moments, be a little reserved in how you share it with others, okay? It could be a great favor (an act of love, in fact) to a great many people.

Hint of the day: noatime and relatime in fstab

Open Query blog - Thu, 06/06/2013 - 10:16

It’s been written about everywhere, but since we keep spotting installations in the wild where people don’t know about it, it probably deserves another mention.

By default, Linux uses the atime option on a disk mount, which means it writes a timestamp (e.g. a write to the drive) every time it reads anything. So in this case, reads cause writes – and also disk seeks, because a read from a file will then trigger having to write to the directory that contains the file. This even occurs if a file is read from the file system’s page cache (reading from the machine’s memory rather than the drive).

Unless you require an audit trail of users reading files, you generally you don’t want this. Thus, you want to add the noatime option to the disk mount in /etc/fstab. If you have just the defaults in there, you just make it defaults,noatime. It’ll doesn’t necesarily require a reboot as you can use umount/mount, but that gets tricky when dealing with the root filesystem so a reboot is generally easier. Setting these options is one of the first things we do when configuring a server.

Some user applications, such as Mutt (mail reader) do use the read access time. In that case, you can use the relatime option instead, which only writes a timestamp when a file or directory is written to. This is just for completeness of this story, as it’s still sub-optimal for a database server.

If you require read details for auditing (security) of the operating system, make sure all database-related files (database directories, InnoDB log files, binary logs, etc) are on a separate mount where you can use noatime.

Using noatime also makes a lot of sense on a web server, as it does a lot of reads. Remember, the fact that most files are in the filesystem cache doesn’t make a difference. As a general guide, it makes sense to set on most server installations. Quick win.

Multicast replication in Percona XtraDB Cluster (PXC) and Galera

MySQL Performance Blog - Wed, 05/06/2013 - 20:00
Bandwidth multiplication and synchronous clusters

I’ve seen a lot of people setting up clusters with 3-6+ nodes on 1 Gbps networks.  1 Gbps seems like a lot, doesn’t it?  Actually, maybe not as much as you think.  While the theoretical limit of 1Gbps is actually 120MBps, I start to get nervous around 100MBps. By default Galera uses unicast TCP for replication.  Because synchronous replication needs to replicate to all nodes at once, this means 1 copy of your replication message is sent to other node in the cluster.  The more nodes in your cluster, the more the bandwidth required for replication multiplies. Now, this isn’t really much different from standard mysql replication.  1 master with 5 async slaves is going to send a separate replication stream to each, so your bandwidth requirements will be similar.  However, with async replication you have the luxury of not blocking the master from taking writes if bandwidth is constrained and the slaves lag for a bit, not so in Galera. So, let’s see this effect in action.  I have a simple script that outputs the network throughput on an interface every second.  I’m running a sysbench test on one node and measuring the outbound (UP) bandwidth on that same node:

# 2 nodes in the cluster eth1 DOWN:24 KB/s UP:174 KB/s eth1 DOWN:25 KB/s UP:172 KB/s eth1 DOWN:27 KB/s UP:196 KB/s eth1 DOWN:27 KB/s UP:195 KB/s eth1 DOWN:27 KB/s UP:197 KB/s eth1 DOWN:27 KB/s UP:200 KB/s # 3 nodes in the cluster eth1 DOWN:74 KB/s UP:346 KB/s eth1 DOWN:79 KB/s UP:357 KB/s eth1 DOWN:77 KB/s UP:342 KB/s eth1 DOWN:79 KB/s UP:368 KB/s eth1 DOWN:81 KB/s UP:368 KB/s eth1 DOWN:78 KB/s UP:363 KB/s

This isn’t much traffic in my puny local VMs, but you get the idea.  We can clearly see some factor in play adding the extra nodes.

Multicast to the rescue!

One way to address this bandwidth constraint is to switch to multicast UDP replication in Galera.  This is actually really easy to do. First, we need to make sure our environment will support multicast.  This is a question for your network guys and beyond the scope of this post, but in my trivial VM environment, I just need to make sure that the multicast address space routes to my Galera replication interface, eth1:

[all nodes]# ip ro add dev eth1 224.0.0.0/4 [all nodes]# ip ro show | grep 224 224.0.0.0/4 dev eth1 scope link

In that space, we pick an unused mcast address (again, talk to your network guys).  I’m using 239.192.0.11, so we’ll add this to our my.cnf:

wsrep_provider_options = "gmcast.mcast_addr=239.192.0.11"

If you already have wsrep_provider_options set, add it to the semicolon separated list instead of a separate line in your config. If we already have a running cluster, we need to shut it down, configure our mcast address and re-bootstrap it:

[root@node3 mysql]# service mysql stop [root@node2 mysql]# service mysql stop [root@node1 mysql]# service mysql stop[root@node1 mysql]# service mysql start --wsrep_cluster_address=gcomm:// [root@node2 mysql]# service mysql start [root@node3 mysql]# service mysql start

We can see that a multicast node still needs to bind to the Galera replication port, and of course that needs to be bound to the interface that  the multicast will be received on.

[root@node3 mysql]# lsof -P +p 17493 | grep LISTEN mysqld 17493 mysql 11u IPv4 39669 0t0 TCP *:4567 (LISTEN) mysqld 17493 mysql 20u IPv4 39685 0t0 TCP *:3306 (LISTEN)

Now, let’s re-do our above test:

# 2 nodes in the cluster eth1 DOWN:15 KB/s UP:199 KB/s eth1 DOWN:14 KB/s UP:195 KB/s eth1 DOWN:15 KB/s UP:212 KB/s eth1 DOWN:14 KB/s UP:204 KB/s eth1 DOWN:13 KB/s UP:173 KB/s # 3 nodes in the cluster eth1 DOWN:62 KB/s UP:185 KB/s eth1 DOWN:61 KB/s UP:187 KB/s eth1 DOWN:52 KB/s UP:164 KB/s eth1 DOWN:62 KB/s UP:187 KB/s eth1 DOWN:64 KB/s UP:186 KB/s eth1 DOWN:62 KB/s UP:193 KB/s

So, we can see our outbound bandwidth on our master node doesn’t change as we add more nodes when we are using multicast.

Other multicast tips

We can also also bootstrap nodes using the mcast address:

#wsrep_cluster_address = gcomm://192.168.70.2,192.168.70.3,192.168.70.4 wsrep_cluster_address = gcomm://239.192.0.11

And this works fine.  Pretty slick! Note that IST and SST will still use TCP unicast, so we still want to make sure those are configured to use the regular IP of the node. Typically I just set the wsrep_node_address setting on each node if this IP is not the default IP of the server. I could not find a way to migrate an existing unicast cluster to multicast with a rolling update.  I believe (but could be proven wrong) that you must re-bootstrap your entire cluster to enable multicast.

The post Multicast replication in Percona XtraDB Cluster (PXC) and Galera appeared first on MySQL Performance Blog.

IBM buys SoftLayer

CloudPundit: Massive-Scale Computing - Wed, 05/06/2013 - 05:55

It’s been a hot couple of weeks in the cloud infrastructure as a service space. Microsoft’s Azure IaaS (persistent VMs) came out of beta, Google Compute Engine went into public beta, VMware formally launched its public cloud (vCloud Hybrid Service), and Dell withdrew from the mark. Now, IBM is acquiring SoftLayer, with a deal size in the $2B range, around a 4x-5x multiple — roughly the multiple that Rackspace trades at, with RAX no doubt used as a comp despite the vast differences in the two companies’ business models.

SoftLayer is the largest provider of dedicated hosting on the planet, although they do also have cloud IaaS offerings; they sell direct, but also have a huge reseller channel, and they’re often the underlying provider to many mass-market shared hosting providers. Like other dedicated hosters, they are very SMB-centric — tons of dedicated hosting customers are folks with just a server or two. But they also have a sizable number of customers with scale-out businesses to whom “bare metal” (non-virtualized dedicated servers), provisioned flexibly on demand (figure it typically takes 1 to 4 hours to provision bare metal), is very attractive.

Why bare metal? Because virtualization is great for server consolidation (“I used to have 10 workloads on 10 servers that were barely utilized, and now I have one heavily utilized server running all 10 workloads!”), but it’s often viewed as unnecessary overhead when you’re dealing with an environment where all the servers are running nearly fully utilized, as is common in scale-out, Web-scale environments.

SoftLayer’s secret sauce is its automation platform, which handles virtualized and non-virtualized servers with largely equal ease. One of their value propositions has been to bring the kinds of things you expect from cloud VMs, to bare metal — paid by the hour, fully-automated provisioning, API as well as GUI, provisioning from image, and so forth. So the value proposition is often “get the performance of bare metal, in exactly the configuration you want, with the advantages and security of single-tenancy, without giving up the advantages of the cloud”. And, of course, if you want virtualization, you can do that — or SoftLayer will be happy to sell you VMs in their cloud.

SoftLayer also has a nice array of other value-adds that you can self-provision, including being able to click to provision cloud management platforms (CMPs) like CloudStack / Citrix CloudPlatform, and hosting automation platforms like Parallels. Notably, though, they are focused on self-service. Although SoftLayer acquired a small managed hosting business when it merged with ThePlanet, its customer base is nearly exclusively self-managed. (That makes them very different than Rackspace.)

In terms of the competition, SoftLayer’s closest philosophical alignment is Amazon Web Services — don’t offer managed services, but instead build successive layers of automation going up the stack, that eliminate the need for traditional managed services as much as possible. They have a considerably narrower portfolio than AWS, of course, but AWS does not offer bare metal, which is the key attractor for SoftLayer’s customers.

So why does IBM want these guys? Well, they do fill a gap in the IBM portfolio — IBM has historically not served an SMB market directly in gneral, and its developer-centric SmartCloud Enterprise (SCE) has gotten relatively weak traction (seeming to do best where the IBM brand is important, notably Europe), although that can be blamed on SCE’s weak feature set and significant downtime associated with maintenance windows, more so than the go-to-market (although that’s also been problematic). I’ll be interested to see what happens to the SCE roadmap in light of the acquisition. (Also, IBM’s SCE+ offering — essentially a lightweight data center outsourcing / managed hosting offering, delivered on cloud-enabled infrastructure — uses a totally different platform, which they’ll need to converge at some point in time.)

Beyond the “public cloud”, though, SoftLayer’s technology and service philosophy are useful to IBM as a platform strategy, and potentially as bits of software and best practices to embed in other IBM products and services. SoftLayer’s anti-managed-services philosophy isn’t dissonant with IBM’s broader outsourcing strategy as it relates to the cloud. Every IT outsourcer at any kind of reasonable scale actually wants to eliminate managed services where they can, because at this point, it’s hard to get any cheaper labor — the Indian outsourcers have already wrung that dry, and every IT outsourcer today offshores as much of their labor as possible. So your only way to continue to get costs down is to eliminate the people. If you can, through superior technology, eliminate people, then you are in a better competitive position — not just for cost, but also for consistency and quality of service delivery.

I don’t think this was a “must buy” for IBM, but it should be a reasonable acceleration of their cloud plans, assuming that they manage to retain the brains at SoftLayer, and can manage what has been an agility-focused, technology-driven business with a very different customer base and go-to-market approach than the traditional IBM base — and culture. SoftLayer can certainly use more money to for engineering resources (although IBM’s level of engineering commitment to cloud IaaS has been relatively lackluster given its strategic importance), marketing, and sales, and larger customers that might have been otherwise hesitant to use them may be swayed by the IBM brand.

(And it’s a nice exit for GI Partners, at the end of a long road in which they wrapped up EV1 Servers, ThePlanet, and SoftLayer… then pursued an IPO route during a terrible time for IPOs… and finally get to sell the resulting entity for a decent valuation.)


Implementing SchemaSpy in your MySQL environment

MySQL Performance Blog - Tue, 04/06/2013 - 20:00

Lately I have been working with a set of customers on a longer term basis which has given me time to explore new tools using their environments.  One tool that I am finding very helpful is called SchemaSpy.

SchemaSpy is a Java-based tool (requires Java 5 or higher) that analyzes the metadata of a schema in a database and generates a visual representation of it in a browser-displayable format. It lets you click through the hierarchy of database tables via child and parent table relationships as represented by both HTML links and entity-relationship diagrams. It’s also designed to help resolve the obtuse errors that a database sometimes gives related to failures due to constraints.

One of the things that I like about  ERD tools is that I can quickly generate a graphic representation of the database to see which tables are referenced the most, and which are candidates for dropping (why keep old data if you don’t need to?).  I get up to speed a lot faster and can contribute to the Development process a lot more effectively when I know the relationships between tables versus observing queries only.  An additional benefit is that SchemaSpy is command-line driven and builds the html after each run, so I find it convenient to set it up on crontab so that schema changes are automatically picked up.  Nice, eh?

Usage of this tool is very straightforward and it does a lot of the work for you.  Basically if your database has Foreign Keys then you’re laughing, if not don’t despair, you just have a bit more work ahead of you.  Most of the time SchemaSpy “does the right thing” but sometimes you need to give it a little help in the form of metadata files.

Installation of SchemaSpy and DependenciesCreating a mysql.properties file

While not explicitly required, I prefer to create this once so that the  command line is shorter and neater.  In my case here is the properties file in use for my Percona Server 5.6.10 sandbox:

description=MySQL driver=com.mysql.jdbc.Driver connectionSpec=jdbc:mysql://127.0.0.1:5610/schemaspy driverPath=/usr/share/java/mysql-connector-java.jarExample SchemaCREATE TABLE `parent` ( `parent_id` int(10) unsigned NOT NULL AUTO_INCREMENT, PRIMARY KEY (`parent_id`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1CREATE TABLE `child_A` ( `id` int(10) unsigned NOT NULL AUTO_INCREMENT, `parent_id` int(10) unsigned DEFAULT NULL, PRIMARY KEY (`id`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1CREATE TABLE `child_B` ( `id` int(10) unsigned NOT NULL AUTO_INCREMENT, `parent_id` int(10) unsigned DEFAULT NULL, PRIMARY KEY (`id`), KEY `parent_id` (`parent_id`), CONSTRAINT `child_B_ibfk_1` FOREIGN KEY (`parent_id`) REFERENCES `parent` (`parent_id`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1

Notice that in the case of child_A there is an implicit FK relationship to parent via parent_id, however in child_B it is explicit.  This has an impact on how SchemaSpy identifies the relationship, and whether you need to provide a hint or not.

Running SchemaSpy (with Foreign Keys)

When you run schemaSpy without a metadata file, you will see that the relationship between parent and child_B is properly displayed:

Table child_A can be made to display if you tag the Implied relationships box, but this generally only works if you have used the exact same name (in this case, parent_id) in the child and parent tables.  Your environment may be like a lot of shops that we see, where the parent table is known as id, and on the child tables they reference it as parent_id — this might make sense to Developers but it doesn’t help SchemaSpy at all.  This is where Metadata files come in to play

Creating a Metadata File

The format of the metadata file is XML-based.  I have included the metadata file I used to properly link child_A to parent table:

<?xml version="1.0" encoding="UTF-8"?> <schemaMeta xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://schemaspy.sourceforge.net/xmlschema/2011/02/05/schemaspy.meta.xsd"> <comments> Main Production Database. Percona, Michael Coburn </comments> <tables> <table name="parent" comments="Parent table"> <column name="parent_id" primaryKey="true"> </column> </table> <table name="child_A" comments="Non-FK relationship, implicit relationship to parent"> <column name="id" primaryKey="true"> </column> <column name="parent_id"> <foreignKey table="parent" column="parent_id"/> </column> </table> </tables> </schemaMeta>

If you’re looking for other examples you can find a more comprehensive example here.

Running SchemaSpy

The syntax below if specific for MySQL, but note that SchemaSpy works for pretty much all the popular RDBMS out there.

java -jar schemaSpy_5.0.0.jar -t mysql -u msandbox_ro -p msandbox -meta schemaspy.meta.xml -o /var/www/schemaspy/

The -o directive tells SchemaSpy where to write the output to, make sure this is a directory that can then be accessed by a web server.  Also note that since I created the mysql.properties file in the same directory as the jar file, I don’t need to pass any host:port or schema name information.

Viewing SchemaSpy output

A completed Relationships view of these three tables should look like this:

Final Thoughts

While I find the Relationships tab the most useful component of SchemaSpy, I don’t want to leave you with the impression that this is the only component of the tool.  There are additional tabs:

  • Tables – Names, number of children, parents, count of columns, row counts, and comments — a great way for a high level overview of the table sizes and an easy way to use the search feature of your browser to zero in on a particular table
  • Constraints – lists the explicit Foreign Key constraints in the database (this does not include constraints identified via metadata files!)
  • Anomalies – Identifies possible relationships between columns/tables based on names, tables without indexes, columns flagged ‘nullable’ and ‘must be unique’ (woops!), single column tables, incrementing column names in tables, and tables with the string NULL instead of the actual SQL NULL value. This is basically a quick sanity check of your schema for any significant errors or items requiring review.
  • Columns – a listing of all columns in the schema, really handy to sort by name to see if you have any implicit constraints that you might have missed and can then write into your metadata file.
  • Donate – This is free software and John Currier asks for donations so he can justify the time spent maintaining SchemaSpy to his wife

Finally don’t forget to automate SchemaSpy via crontab once you’re done.

What are some ERD tools you use and how do they compare with SchemaSpy?  Feel free to answer via the comments.  Thanks for reading!

The post Implementing SchemaSpy in your MySQL environment appeared first on MySQL Performance Blog.

Percona Server 5.6.11-60.3 first Release Candidate now available

MySQL Performance Blog - Tue, 04/06/2013 - 05:21

Percona Server for MySQL version 5.6.11-60.3

Percona is glad to announce the first Release Candidate release of Percona Server 5.6.11-60.3 on June 3rd, 2013 (Downloads are available here and from the Percona Software Repositories).

Based on MySQL 5.6.11, including all the bug fixes in it, Percona Server 5.6.11-60.3 is the first RC release in the Percona Server 5.6 series. All of Percona‘s software is open-source and free, all the details of the release can be found in the 5.6.11-60.3 milestone at Launchpad.

This release contains all the bug fixes from latest Percona Server 5.5 release (currently Percona Server 5.5.31-30.3).

New Features:

Bugs Fixed:

  • Transaction objects are now allocated calling calloc() directly instead of using InnoDB heap allocation. This may improve write performance for high levels of concurrency. Bug fixed #1185686.
  • Under very rare circumstances, deleting a zero-size bitmap file at the right moment would make server stop with an I/O error if changed page tracking is enabled. Bug fixed #1184517.
  • Missing path separator between the directory and file name components in a bitmap file name could stop the server starting if the innodb_data_home_dir variable didn’t have the path separator at the end. Bug fixed #1181887.
  • Changed page tracking used to hold the log system mutex for the log reads needlessly, potentially limiting performance on write-intensive workloads. Bug fixed #1171699.
  • Incorrect schema definition for the User Statistics tables in INFORMATION_SCHEMA (CLIENT_STATISTICS, INDEX_STATISTICS, TABLE_STATISTICS, THREAD_STATISTICS, and USER_STATISTICS) led to the maximum counter values being limited to 32-bit signed integers. Fixed so that these values can be 64-bit unsigned integers now. Bug fixed #714925.
  • Server would crash if an INNODB_CHANGED_PAGES query is issued that has an empty LSN range and thus does not need to read any bitmap files. Bug fixed #1184427.
  • Query to the INNODB_CHANGED_PAGES table would cause server to stop with an I/O error if a bitmap file in the middle of requested LSN range was missing. Bug fixed #1179974.
  • A warning is now returned if a bitmap file I/O error occurs after an INNODB_CHANGED_PAGES query started returning data to indicate an incomplete result set. Bug fixed #1185040.
  • The INNODB_CHANGED_PAGES table couldn’t be queried if the log tracker wasn’t running. Bug fixed #1185304.
  • Fixed the upstream bug #68970 that, in Percona Server, would cause small tablespaces to expand too fast around 500KB tablespace size. Bug fixed #1169494.
  • Fixed the RPM package dependencies issues. Bug fixed #1186831.
  • Reduced the overhead from the Handle Corrupted Tables check as it was missing branch predictor annotations. Bug fixed #1176864.

Other bug fixes: bug fixed #1184695, bug fixed #1184512, bug fixed #1183585, bug fixed #1178606, bug fixed #1177356, bug fixed #1160895, bug fixed #1182876, bug fixed #1180481, bug fixed #1163135, bug fixed #1157078, bug fixed #1182889, bug fixed #1133926, bug fixed #1165098, bug fixed #1182793, bug fixed #1157075, bug fixed #1183625, bug fixed #1155475, bug fixed #1157037, bug fixed #1182065, bug fixed #1182837, bug fixed #1177780, bug fixed #1154954.

Release notes for Percona Server for MySQL 5.6.11-60.3 are available in our online documentation. Bugs can be reported on the launchpad bug tracker.

The post Percona Server 5.6.11-60.3 first Release Candidate now available appeared first on MySQL Performance Blog.

MySQL Query Patterns, Optimized – Webinar questions followup

MySQL Performance Blog - Mon, 03/06/2013 - 23:22

On Friday I gave a presentation on “MySQL Query Patterns, Optimized” for Percona MySQL Webinars.  If you missed it, you can still register to view the recording and my slides.

Thanks to everyone who attended, and especially to folks who asked the great questions.  I answered as many as we had time for  during the session, but here are all the questions with my complete answers:

Q: Can you compare the use of subqueries/multiple joins vs. multiple queries (e.g. temp tables)?

For performance, it’s hard to make a general rule, because there are different types of subqueries (correlated, non-correlated, derived tables, scalar subqueries) and we saw in my presentation that sometimes these perform worse and sometimes better than another equivalent solution.

If you’re talking about using a temporary table to store an interim result set, and then use that to ultimately produce the result you want, that’s another technique that I see used, although it’s less common.  It can be helpful to simplify the development of a report, but whether it’s a benefit to performance is a case-by-case decision.

There are other drawbacks to using temporary tables in this way, for example when using replication, a temporary table could vanish if the slave restarts and then subsequent queries against the temp table fail.

Q: Why doesn’t the query optimizer take care the execution decisions? Since they are all logically the same, I believe it should all translate to the same execution plan. (I know it doesn’t though.)

Why don’t you try writing a SQL query optimizer and see how easy it is?  ;-)  Generative grammars like SQL can produce an infinite variety of  queries.  The query optimizer analyzes queries for common patterns the MySQL designers know can be improved.  But there’s a finite amount of engineer-years spent developing and testing the query optimizer.  The developers have to make a decision about which query patterns the optimizer recognizes, and which are left to the developer to refactor manually.  There could also be some cases where optimizing a query automatically would be more costly than just running the query in a suboptimal form.

Q: Doesn’t the primary key solution for random selection only work when the IDs for movies are distributed uniformly over the range 1..MAX(id)?

Yes, there’s a risk if you have irregular distribution of matching rows that you’ll get skewed results.  For instance, if there’s a gap of id’s between 47000 and 50000 with no movies, but your random number generator picks values in that range with equal frequency as any other range, then the movie immediately following the “gap” will be picked more frequently.

Nearly every solution for randomly choosing rows involves some compromise, either of performance, or of accuracy of randomness.  The ORDER BY RAND() solution is known to have poor performance, but it returns a better random choice.

Another workaround may be to add a column to the table, and populate the rows you want to choose (movies, in this case) with  values known to be consecutive.  Then you could choose a random value, look up the row with “=” instead of “>” and be guaranteed to find exactly one match.  But the tradeoff of this solution is that it requires storing another column, and reinitializing the sequence after making certain insert/update/delete operations.  If you have a dataset that changes infrequently, then this might be a good tradeoff.

Q: Is the tuple comparison independent of sorting order? Wouldn’t it be a problem if the keywords were returned in a different order from what you specify in the query?

The tuple comparison you’re referring to is this example:

WHERE (k1.keyword, k2.keyword, k3.keyword) = ('espionage', 'nuclear-bomb', 'ejector-seat');

This is okay because each correlation name (k1, k2, k3) is an index lookup.  Look at the EXPLAIN output — it accesses these first, by looking up the keywords you specify.  It doesn’t matter what order you specify the keywords in this tuple, it is equivalent to this expression:

WHERE k1.keyword = 'espionage' AND k2.keyword = 'nuclear-bomb' AND k3.keyword = 'ejector-seat';

The AND operator is commutative, so the order of these terms doesn’t matter.

However, if I were using the tuple syntax to look up values against multiple columns in a compound index, then the order would matter, and it would have to match the order of columns in the index.  Example:

WHERE (last_name, first_name) = ('Karwin', 'Bill');

Q: On the Dynamic Pivot, the straight join, can you explain more about why you thought about doing that and why it helped?

When I did not use the STRAIGHT_JOIN, the query optimizer reordered the tables.  It seemed to prefer an index-scan of 7 rows in the `kind_type` table to be first, then look up matching rows in `title` by a secondary index.  But the result was that it created a temporary table to count the movies per production year for each kind_id.

It was more efficient in this case to force MySQL to scan the `title` table first, grouping by kind_id in index order.  This made the first table in the EXPLAIN seem like it was scanning more rows.  But by avoiding the temporary table, and making lookups to the `kind_types` table by primary key, the result was a query that took half the time.

An important conclusion of my presentation today is that sometimes you find unexpected differences in performance like this, so it pays to test all different solutions, and measure the resulting performance!

Q: Bill, back to the query where you were trying to determine the last episode of each TV show. Couldn’t you have used a SUBQUERY to fetch and compare MAX(episode_id) as an option there?

Yes, I think what you’re referring to is what I covered as the Derived-Table Solution on slides 55-59.  Apologies if that wasn’t clear, because I didn’t present the details of all the tables and columns.  It turned out this solution was 150x faster for this case, so your suggestion is a good one!

Thanks again for attending my webinar!  Here are some more tips:

The post MySQL Query Patterns, Optimized – Webinar questions followup appeared first on MySQL Performance Blog.

The small improvements of MySQL 5.6: Duplicate Index Detection

MySQL Performance Blog - Sat, 01/06/2013 - 00:35

Here at the MySQL Performance Blog, we’ve been discussing the several new features that MySQL 5.6 brought: GTID-based replication, InnoDB Fulltext, Memcached integration, a more complete performance schema, online DDL and several other InnoDB and query optimizer improvements. However, I plan to focus on a series of posts on the small but handy improvements – changes and bug corrections – in MySQL 5.6 that can make our lives easier and have passed almost unnoticed by most (not all) DBAs.

Duplicate Index Detection

I commented about this on my last webinar, but did not have time to analyze it in-depth.  If you try to do something like this in MySQL 5.5, you will succeed without errors or warnings:

mysql> ALTER TABLE test ADD INDEX (col2); Query OK, 0 rows affected (0.22 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> ALTER TABLE test ADD INDEX (col2); Query OK, 0 rows affected (0.25 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> SHOW CREATE TABLE test\G *************************** 1. row *************************** Table: test Create Table: CREATE TABLE `test` ( `id` int(11) NOT NULL AUTO_INCREMENT, `col2` int(11) DEFAULT NULL, `col3` varchar(200) DEFAULT NULL, PRIMARY KEY (`id`), KEY `col2` (`col2`), KEY `col2_2` (`col2`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 1 row in set (0.00 sec)

In previous versions of MySQL, you can create two indexes with the same columns (in the same order) and the server will not complain.

If we execute the same sentences in MySQL 5.6, the second ALTER will also succeed -and the index will be created-, but we will get a warning (note severity, to be exact):

mysql> ALTER TABLE test ADD INDEX (col2); Query OK, 0 rows affected (0.36 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> ALTER TABLE test ADD INDEX (col2); Query OK, 0 rows affected, 1 warning (0.56 sec) Records: 0 Duplicates: 0 Warnings: 1 mysql> SHOW WARNINGS\G *************************** 1. row *************************** Level: Note Code: 1831 Message: Duplicate index 'col2_2' defined on the table 'test.test'. This is deprecated and will be disallowed in a future release. 1 row in set (0.00 sec)

As the message points correctly, this is a human mistake, as it is a waste of resources that could potentially impact our performance, and should be avoided. By the way, a good practice to avoid this is always naming your keys with a consistent pattern. This new behavior was introduced in 5.6.7 with the closing of this bug (although this was initially accepted as a bug as early as 2005!).

The report explains more in detail what the “will be disallowed in a future release” means. In MySQL 5.7 the checks will be stricter: in the default SQL mode, a duplicate index will throw a warning instead of a note. In strict mode, it will throw an error and the second ALTER will fail, preventing the creation of the duplicate index.

Does it mean that tools like pt-duplicate-key-checker will not be necessary for MySQL 5.6? Let’s have a look at the code implementing this feature. The warning will only be thrown if the index has not been created automatically, it is not a foreign key, and it has the exact column definition in the same order. In other words, it checks for duplicate keys, but not redundant ones. What is the difference? Let’s see an example. If we execute:

mysql> ALTER TABLE test ADD INDEX (col2); Query OK, 0 rows affected (0.34 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> ALTER TABLE test ADD INDEX (col2, col3); Query OK, 0 rows affected (0.39 sec) Records: 0 Duplicates: 0 Warnings: 0

We get no warnings and no errors in 5.6, as the indexes are different. But as you may know, we can use the second index not only for filtering on both columns, but also for filtering by just the first one. Why can’t MySQL enforce this kind of constraints? For many reasons: the first one because it would break 99% of all applications out there that use MySQL, for which we at Percona tend to find redundant indexes. And second, because in some cases, we may need to have what at first seems redundant indexes but effectively they are not -for example, if one of the two indexes was unique or a foreign key.

This is the output of our tool when run on the same table, correctly identifying the redundancy:

$ pt-duplicate-key-checker --tables test.test # ######################################################################## # test.test # ######################################################################## # col2 is a left-prefix of col2_2 # Key definitions: # KEY `col2` (`col2`), # KEY `col2_2` (`col2`,`col3`) # Column types: # `col2` int(11) default null # `col3` varchar(200) default null # To remove this duplicate index, execute: ALTER TABLE `test`.`test` DROP INDEX `col2`; # ######################################################################## # Summary of indexes # ######################################################################## # Size Duplicate Indexes 5 # Total Duplicate Indexes 1 # Total Indexes 3

Additionally, pt-duplicate-key-checker will detect subtle redundancies that are engine-dependent, like redundant suffixes for secondary keys in InnoDB. As some of this redundancies could be necessary, depending on the query optimizer and the MySQL version, we always recommend to check manually the optimizations proposed by Percona Toolkit. The MySQL server, of course, cannot risk to block directly all cases.

A set of MySQL utilities were introduced by Oracle recently, which includes mysqlindexcheck, similar to pt-duplicate-key-checker, but it does not detect all cases. For example:

mysql> alter table test add index redundant (col2, id); Query OK, 0 rows affected (1.57 sec) Records: 0 Duplicates: 0 Warnings: 0 $ mysqlindexcheck --server=user:pass@localhost test.test # Source on localhost: ... connected. $ pt-duplicate-key-checker --tables test.test # ######################################################################## # test.test # ######################################################################## # Key redundant ends with a prefix of the clustered index # Key definitions: # KEY `redundant` (`col2`,`id`) # PRIMARY KEY (`id`), # Column types: # `col2` int(11) default null # `id` int(11) not null auto_increment # To shorten this duplicate clustered index, execute: ALTER TABLE `test`.`test` DROP INDEX `redundant`, ADD INDEX `redundant` (`col2`); # ######################################################################## # Summary of indexes # ######################################################################## # Size Duplicate Indexes 9 # Total Duplicate Indexes 1 # Total Indexes 2

By the way, if you want to get more familiar with this and other particularities of the latest MySQL GA release, have a look at our upcoming sessions for the “Moving to 5.6″ training course in America (Austin, San Jose) and Europe (Manchester, Utrecht).

The post The small improvements of MySQL 5.6: Duplicate Index Detection appeared first on MySQL Performance Blog.

Replication in MySQL 5.6: GTIDs benefits and limitations – Part 2

MySQL Performance Blog - Thu, 30/05/2013 - 20:00

The main benefit of using GTIDs is to have much easier failover than with file-based replication. We will see how to change the replication topology when using GTID-based replication. That will show where GTIDs shine and where improvements are expected.

This is the second post of a series of articles focused on MySQL 5.6 GTIDs. You can find part one here.

Our goal will be to go from setup #1 to setup #2 on the picture below, following various scenarios:

For these tests, all servers are running on 127.0.0.1 with ports ranging from 10000 for s0 to 10004 for s4.

Scenario #1: All slaves have processed all the writes

This is the easiest case, we will make s2 a master and redirect replication on the other servers to s2. This scenario can happen when you want to perform a planned failover.

With GTIDs, all the operations are straightforward:

#For s2 (the new master), we remove its configuration as a slave s1> stop slave; s1> reset slave all; # For s0 s0> change master to master_host='127.0.0.1',master_user='rsandbox',master_password='rsandbox',master_port=10001,master_auto_position=1; s0> start slave; # For s1, s3 and s4 mysql> stop slave; mysql> change master to master_port=10002; mysql> start slave;

Those of you who have already done these operations with file-based replication know that it is usually very tedious and that proper recording of binlog file/binlog position needs to be done with care if you don’t want to break replication or corrupt your data.

Scenario #2: One of the slaves is behind

Now let’s imagine that s0 has crashed, and that s1 has not received all writes (and therefore s3 and s4 are also lagging behind).

s2> select count(*) from t; +----------+ | count(*) | +----------+ | 2 | +----------+ # s1 is behind s1> select count(*) from t; +----------+ | count(*) | +----------+ | 0 | +----------+

Can we still use master_auto_position = 1? Let’s hope so, as it is one of the ideas of GTIDs: having for each event across the cluster a monotonically incremental identifier for each event.

Notice that this is the same problem for s0 (which will be late when it comes back) and s1, s3 and s4.

Let’s give it a try!

# For s0,s1, s3, s4 mysql> stop slave; mysql> change master to master_port=10002; mysql> start slave; # And then check the number of records from the t table s1> select count(*) from t; +----------+ | count(*) | +----------+ | 2 | +----------+

Great! So again, using GTIDs avoids the tedious work of looking for the binlog position of a specific event. The only part were we should pay attention is the server we choose for promotion: if it is not up-to-date, data may be lost or replication may be broken.

Scenario #3: The master has crashed before sending all writes

If the binary logs of the master are no longer readable, you will probably lose the events that have not been sent to the slaves (your last chance is to be able to recover data from the crashed master, but that’s another story). In this case, you will have to promote the most up-to-date slave and reconfigure the other slaves as we did above.

So we will suppose that we can read the binary logs of the crashed master. The first thing to do after choosing which slave will be the new master is to recover the missing events with mysqlbinlog.

Let’s say that we want to promote s1 as the new master. We need to know the coordinates of the last event executed:

s1> show slave status\G [...] Executed_Gtid_Set: 219be3a9-c3ae-11e2-b985-0800272864ba:1, 3d3871d1-c3ae-11e2-b986-0800272864ba:1-4

We can see that it’s not obvious to know which was the last executed event: is it 219be3a9-c3ae-11e2-b985-0800272864ba:1 or 3d3871d1-c3ae-11e2-b986-0800272864ba:4 ? A ‘Last_Executed_GTID’ column would have been useful.

In our case we can check that 3ec18c45-c3ae-11e2-b986-0800272864ba is the server UUID of s2, and that the other one is from s0 (for s0 which is crashed, the server UUID can be read in the auto.cnf file in the datadir).

So the last executed event is 219be3a9-c3ae-11e2-b985-0800272864ba:1. How can I instruct mysqlbinlog to start reading from there? Unfortunately, there is no --start-gtid-position option or equivalent. See bug #68566.

Does it mean that we cannot easily recover the data with mysqlbinlog? There is a solution of course, but very poor in my opinion: look for the binlog file/position of the last executed event and use mysqlbinlog with the good old --start-position option! Even with GTIDs, you cannot totally forget old-style replication positioning.

Conclusion

Reconfiguring replication when using GTIDs is usually straightforward: just connect the slave to the correct master with master_auto_position = 1. This can even be made easier with mysqlfailover from the MySQL Utilities (this will be the topic of a future post).

Unfortunately, this will not work for every use case, and until this is fixed, it is good to be aware of the current limitations.

The post Replication in MySQL 5.6: GTIDs benefits and limitations – Part 2 appeared first on MySQL Performance Blog.

Syndicate content