<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Open Query blog &#187; InnoDB</title>
	<atom:link href="http://openquery.com/blog/tag/innodb/feed" rel="self" type="application/rss+xml" />
	<link>http://openquery.com/blog</link>
	<description>About MySQL, Drizzle, MariaDB and more!</description>
	<lastBuildDate>Wed, 07 Dec 2011 04:00:56 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>What a Hosting Provider did Today</title>
		<link>http://openquery.com/blog/what-hosting-provider-did-today</link>
		<comments>http://openquery.com/blog/what-hosting-provider-did-today#comments</comments>
		<pubDate>Mon, 31 Oct 2011 06:39:15 +0000</pubDate>
		<dc:creator>arjen</dc:creator>
				<category><![CDATA[Good practice / Bad practice]]></category>
		<category><![CDATA[destruction]]></category>
		<category><![CDATA[helpful]]></category>
		<category><![CDATA[hosting]]></category>
		<category><![CDATA[InnoDB]]></category>
		<category><![CDATA[logfile]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[recovery]]></category>
		<category><![CDATA[sysadmin]]></category>
		<category><![CDATA[tablespace]]></category>

		<guid isPermaLink="false">http://openquery.com/blog/?p=1573</guid>
		<description><![CDATA[I found Dennis the Menace, he now has a job as system administrator for a hosting company. Scenario: client has a problem with a server becoming unavailable (cause unknown) and has it restarted. MySQL had some page corruption in the InnoDB tablespace. The hosting provider, being really helpful, goes in as root and first deletes [...]]]></description>
			<content:encoded><![CDATA[<p>I found Dennis the Menace, he now has a job as system administrator for a hosting company. Scenario: client has a problem with a server becoming unavailable (cause unknown) and has it restarted. MySQL had some page corruption in the InnoDB tablespace.</p>
<p>The hosting provider, being really helpful, goes in as root and first deletes ib_logfile* then ib* in /var/lib/mysql. He later says &#8220;I am sorry if I deleted it. I thought I deleted the log only. Sorry again.&#8221;  Now this may appear nice, but people who know what they&#8217;re doing with MySQL will realise that deleting the iblogfiles actually destroys data also. MySQL of course screams loudly that while it has FRM files it can&#8217;t find the tables. No kidding!</p>
<p>Then, while he&#8217;s been told to not touch anything any more, and I&#8217;m trying to see if I can recover the deleted files on an ext3 filesystem (yes there are tools for that), he goes in again and puts an ibdata1 file back. No, not the logfiles &#8211; but he had those somewhere else too. The files get restored and turn out to be two months old (no info on how they were made in the first place but that&#8217;s a minor detail in this grand scheme). All the extra write activity on the partition would&#8217;ve also made potential deleted file recovery more difficult or impossible.</p>
<p>This story will still get a &#8220;happy&#8221; ending, using a recent mysqldump to load a new server at a different hosting provider. Really &#8211; some helpfulness is not what you want. Secondary lesson: pick your hosting provider with care. Feel free to ask us for recommendations as we know some excellent providers and have encountered plenty of poor ones.</p>
]]></content:encoded>
			<wfw:commentRss>http://openquery.com/blog/what-hosting-provider-did-today/feed</wfw:commentRss>
		<slash:comments>21</slash:comments>
		</item>
		<item>
		<title>HDlatency &#8211; now with quick option</title>
		<link>http://openquery.com/blog/hdlatency-quick-option</link>
		<comments>http://openquery.com/blog/hdlatency-quick-option#comments</comments>
		<pubDate>Thu, 16 Jun 2011 08:31:07 +0000</pubDate>
		<dc:creator>arjen</dc:creator>
				<category><![CDATA[Software and tools]]></category>
		<category><![CDATA[hdlatency]]></category>
		<category><![CDATA[InnoDB]]></category>
		<category><![CDATA[iscsi]]></category>
		<category><![CDATA[latency]]></category>
		<category><![CDATA[mariadb]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[raid]]></category>
		<category><![CDATA[SAN]]></category>

		<guid isPermaLink="false">http://openquery.com/blog/?p=1508</guid>
		<description><![CDATA[I&#8217;ve done a minor update to the hdlatency tool (get it from Launchpad), it now has a &#8211;quick option to have it only do its tests with 16KB blocks rather than a whole range of sizes. This is much quicker, and 16KB is the InnoDB page size so it&#8217;s the most relevant for MySQL/MariaDB deployments. [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve done a minor update to the hdlatency tool (<a href="https://lauchpad.net/hdlatency">get it from Launchpad</a>), it now has a &#8211;quick option to have it only do its tests with 16KB blocks rather than a whole range of sizes. This is much quicker, and 16KB is the InnoDB page size so it&#8217;s the most relevant for MySQL/MariaDB deployments.</p>
<p>However, I didn&#8217;t just remove the other stuff, because it can be very helpful in tracking down problems and putting misconceptions to rest. On SANs (and local RAID of course) you have things like block sizes and stripe sizes, and opinions on what might be faster. Interestingly, the real world doesn&#8217;t always agree with the opinions.</p>
<p>As Mark Callaghan correctly pointed out when I first published it, hdlatency does not provide anything new in terms of functionality; the db IO tests of sysbench cover it all. A key advantage of hdlatency is that it doesn&#8217;t have any dependencies: it&#8217;s a small single piece of C code that&#8217;ll compile and run on very minimalistic environments. We often don&#8217;t control the base environment we have to work on, which is why hdlatency was initially written. It&#8217;s just a quick little tool that does the job.</p>
<p>We find hdlatency particularly useful for comparing environments, primarily at the same client. For instance, the client might consider moving from one storage solution to another &#8211; well, in that case it&#8217;s useful to know whether we can expect an actual performance benefit.</p>
<p>The burst data rate (big sequential read or write) which often gets quoted for a SAN or even an individual disk is of little interest to database use, since its key performance bottleneck lies in random access I/O. The disk head(s) will need to move. So it&#8217;s important to get some real relevant numbers, rather than just go with magic vendor numbers that are not really relevant to you. Also, you can have a fast storage system attached via a slow interface, and consequently the performance will not be at all what you&#8217;d want to see. It can be quite bad.</p>
<p>To get an absolute baseline on what are sane numbers, run hdlatency also on a local desktop HD. This may seem odd, but you might well encounter storage systems that show a lower performance than that. &#8216;nuf said.</p>
<p>If you&#8217;re willing to share, I&#8217;d be quite interested in seeing some (&#8211;quick) output data from you &#8211; just make sure you tell us what storage it is: type of interface, etc. Simply drop it in a comment to this post, so it can benefit more people. Thanks!</p>
]]></content:encoded>
			<wfw:commentRss>http://openquery.com/blog/hdlatency-quick-option/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Fast paging in the real world</title>
		<link>http://openquery.com/blog/fast-paging-real-world</link>
		<comments>http://openquery.com/blog/fast-paging-real-world#comments</comments>
		<pubDate>Mon, 31 May 2010 03:36:55 +0000</pubDate>
		<dc:creator>cafuego</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[COUNT]]></category>
		<category><![CDATA[drupal]]></category>
		<category><![CDATA[InnoDB]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[SQL_CALC_FOUND_ROWS]]></category>

		<guid isPermaLink="false">http://openquery.com/blog/?p=1271</guid>
		<description><![CDATA[This blag was originally posted at http://cafuego.net/2010/05/26/fast-paging-real-world Some time ago I attended the &#8220;Optimisation by Design&#8221; course from Open Query¹. In it, Arjen teaches how writing better queries and schemas can make your database access much faster (and more reliable). One such way of optimising things is by adding appropriate query hints or flags. These hints [...]]]></description>
			<content:encoded><![CDATA[<p style="text-align: right;"><em>This blag was originally posted at <a href="http://cafuego.net/2010/05/26/fast-paging-real-world">http://cafuego.net/2010/05/26/fast-paging-real-world</a></em></p>
<p>Some time ago I attended the &#8220;Optimisation by Design&#8221; course from Open Query¹. In it, Arjen teaches how writing better queries and schemas can make your database access much faster (and more reliable). One such way of optimising things is by adding appropriate query hints or flags. These hints are magic strings that control how a server executes a query or how it returns results.</p>
<p>An example of such a hint is <em>SQL_CALC_FOUND_ROWS</em>. You use it in a select query with a <em>LIMIT</em> clause. It instructs the server to select a limited number of rows, but also to calculate the total number of rows that would have been returned without the limit clause in place. That total number of rows is stored in a session variable, which can be retrieved via <em>SELECT FOUND_ROWS(); </em> That simply reads the variable and clears it on the server, it doesn&#8217;t actually have to look at any table or index data, so it&#8217;s very fast.</p>
<p>This is useful when queries are used to generate pages of data where a user can click a specific page number or click previous/next page. In this case you need the total number of rows to determine how many pages you need to generate links for.</p>
<p>The traditional way is to first run a <em>SELECT COUNT(*)</em> query and then select the rows you want, with <em>LIMIT</em>. If you don&#8217;t use a <em>WHERE</em> clause in your query, this can be pretty fast on MyISAM, as it has a magic variable that contains the number of rows in a table. On InnoDB however, which is <a href="http://cafuego.net/2009/10/10/mysql-yoursql">my storage engine of choice</a>, there is no such variable and consequently it&#8217;s not pretty fast.</p>
<h3>Paging Drupal</h3>
<p>At DrupalConSF earlier this year I&#8217;d floated the idea of making Drupal 7 use SQL_CALC_FOUND_ROWS in its pager queries. These are queries generated specifically to display paginated lists of content and the API to do this is pretty straightforward. To do it I needed to add query hint support to the MySQL driver. When it turned out that PostgreSQL and Oracle also support query hints though, the aim became adding hint support for all database drivers.</p>
<p>That&#8217;s now done, though the patch only implements hints on the pager under MySQL at the moment.</p>
<p>One issue keeps cropping up though, a <a href="http://www.mysqlperformanceblog.com/2007/08/28/to-sql_calc_found_rows-or-not-to-sql_calc_found_rows/">blog by Alexey Kovyrin in 2007</a> that states <em>SELECT COUNT(*)</em> is faster than using <em>SQL_CALC_FOUND_ROWS</em>. It&#8217;s all very well to not have a patch accepted if that statement is correct, but in my experience that is in fact not the case. In my experience the stats are in fact the other way around, SQL_CALC_FOUND_ROWS is nearly always faster than <em>SELECT COUNT(*)</em>.</p>
<p>To back up my claims I thought I should run some benchmarks.</p>
<p>I picked the Drupal pager query that lists content (nodes) on the content administration page. It selects node IDs from the node table with a WHERE clause which filters by the content language. Or, in plain SQL, what currently happens is:</p>
<pre>SELECT COUNT(*) FROM node WHERE language = 'und';
SELECT nid FROM node WHERE language = 'und' LIMIT 0,50;</pre>
<p>and what I&#8217;d like to happen is:</p>
<pre>SELECT SQL_CALC_FOUND_ROWS nid FROM node WHERE language = 'und' LIMIT 0,50;
SELECT FOUND_ROWS();</pre>
<h3>Methodology</h3>
<p>I ran two sets of tests. One on a node table with 5,000 rows and one with 200,000 rows. For each of these table sizes I ran a pager with 10, 20, 50, 100 and 200 loops, each time increasing the offset by 50; effectively paging through the table. I ran all these using both MyISAM and InnoDB as the storage engine for the node table and I ran them on two machines. One was my desktop, a dual core Athlon X2 5600 with 4GB of RAM and the other was a single core Xen virtual machine with 512MB of RAM.</p>
<p>I was hoping to also run tests with 10,000,000 rows, but the virtual machine did not complete any of the queries. So I ran these on my desktop machine only. Again for 10, 20, 50, 100 and 200 queries per run. First with an offset of 50, then with an offset of 10,000. I restarted the MySQL server between each run. To discount query cache advantages, I ran all tests with the query cache disabled. The script I used is attached at the bottom of this post. The calculated times <strong>do</strong> include the latency of client/server communication, though all tests ran via the local socket connection.</p>
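<p>For anyone re-running this, here is a rough sketch of how the query cache can be taken out of the picture on a MySQL 5.0 server. These statements are an illustration only; the attached script may well do it differently.</p>
<pre>-- Disable the query cache for the whole server (needs the SUPER privilege).
SET GLOBAL query_cache_size = 0;

-- Or bypass the cache for a single statement.
SELECT SQL_NO_CACHE COUNT(*) FROM node WHERE language = 'und';</pre>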
<p>My desktop runs an <a href="http://ourdelta.org/">OurDelta</a> MySQL 5.0.87 (-d10-ourdelta-sail66, to be exact). The virtual machine runs 5.0.87 (-d10-ourdelta65).  Before you complain that not running a vanilla MySQL invalidates the results, I run these because I am able to tweak InnoDB a bit more, so the I/O write load on the virtual machine is somewhat reduced compared to the vanilla MySQL.</p>
<h3>Results</h3>
<p><a href="http://cafuego.net/sites/cafuego.net/files/Graphs.png"><img src="http://cafuego.net/sites/cafuego.net/files/small-count-graphs.png" border="0" alt="Query time graphs - NEW is faster than OLD and InnoDB is not slower than MyISAM" hspace="12" vspace="4" width="600" height="275" /></a></p>
<p>The graphs show that using <em>SQL_CALC_FOUND_ROWS</em> is virtually always faster than running two queries that each need to look at actual data. Even when using MyISAM. As the database gets bigger, the speed advantage of <em>SQL_CALC_FOUND_ROWS</em> increases. At the 10,000,000 row mark, it&#8217;s consistently about twice as fast.</p>
<p>Also interesting is that InnoDB seems significantly slower than MyISAM on the shorter runs. I say <em>seems</em>, because (especially with the 10,000,000 row table) the delay is caused by InnoDB first loading the table from disk into its buffer pool. In the spreadsheet you can see the first query takes up to 40 seconds, whilst subsequent ones are much faster. The MyISAM data is still in the OS file cache, so it doesn&#8217;t have that delay on the first query. Because I use <em>innodb_flush_method=O_DIRECT</em>, the InnoDB data is not kept in the OS file cache.</p>
<h3>Conclusion</h3>
<p>So, it&#8217;s official. COUNT(*) is dead, long live SQL_CALC_FOUND_ROWS!  :-)</p>
<p>I&#8217;ve attached my raw results as a Gnumeric document, so feel free to peruse them. The test script I&#8217;ve used is also attached, so you can re-run the benchmark on your own systems if you wish.</p>
<h3>Conclusion Addendum</h3>
<p>As pointed out in the Drupal <a href="http://drupal.org/node/778050">pager issue</a> that caused me to run these tests, the query I&#8217;m benchmarking uses the language column, which is not indexed, and the test also doesn&#8217;t allow the server to cache the COUNT(*) query. I&#8217;ve rerun the tests with 10 million rows after adding an index and I no longer get a significant speed difference between the two ways of getting the total number of rows.</p>
<p>So I suppose that at least SQL_CALC_FOUND_ROWS will cause your non-indexed pager queries to suck a lot less than they might otherwise and it won&#8217;t hurt if they <em>are</em> properly indexed <img src='http://openquery.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
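<p>For reference, adding an index on the language column could look roughly like this. The index name is made up for illustration, and the exact statement used for the rerun is not recorded in this post.</p>
<pre>ALTER TABLE node ADD INDEX idx_node_language (language);

-- EXPLAIN should now list the index in the "key" column
-- rather than showing a full table scan.
EXPLAIN SELECT COUNT(*) FROM node WHERE language = 'und';</pre>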
<p>¹ I now work for Open Query as a consultant.</p>
]]></content:encoded>
			<wfw:commentRss>http://openquery.com/blog/fast-paging-real-world/feed</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Unqualified COUNT(*) speed PBXT vs InnoDB</title>
		<link>http://openquery.com/blog/unqualified-count-speed-pbxt-vs-innodb</link>
		<comments>http://openquery.com/blog/unqualified-count-speed-pbxt-vs-innodb#comments</comments>
		<pubDate>Thu, 27 May 2010 04:54:47 +0000</pubDate>
		<dc:creator>arjen</dc:creator>
				<category><![CDATA[Good practice / Bad practice]]></category>
		<category><![CDATA[COUNT]]></category>
		<category><![CDATA[index scan]]></category>
		<category><![CDATA[InnoDB]]></category>
		<category><![CDATA[mariadb]]></category>
		<category><![CDATA[MyISAM]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[pbxt]]></category>
		<category><![CDATA[reporting]]></category>

		<guid isPermaLink="false">http://openquery.com/blog/?p=1261</guid>
		<description><![CDATA[So this is about a SELECT COUNT(*) FROM tblname without a WHERE clause. MyISAM has an optimisation for that since it maintains a rowcount for each table. InnoDB and PBXT can&#8217;t do that (at least not easily) because of their multi-versioned nature&#8230; different transactions may see a different number of rows for the same table! [...]]]></description>
			<content:encoded><![CDATA[<p>So this is about a <strong>SELECT COUNT(*) FROM tblname</strong> without a <strong>WHERE</strong> clause. MyISAM has an optimisation for that since it maintains a rowcount for each table. InnoDB and PBXT can&#8217;t do that (at least not easily) because of their multi-versioned nature&#8230; different transactions may <em>see</em> a different number of rows for the same table!</p>
<p>So, it&#8217;s kinda known but nevertheless often ignored that this operation on InnoDB is costly in terms of time; what InnoDB has to do to figure out the exact number of rows is scan the primary key and just tally. Of course it&#8217;s faster if it doesn&#8217;t have to read a lot of the blocks from disk (i.e. smaller dataset or a large enough buffer pool).</p>
<p>I was curious about PBXT&#8217;s performance on this, and behold it appears to be quite a bit faster! For a table with 50 million rows, PBXT took about 20 minutes whereas the same table in InnoDB took 30 minutes. Interesting!</p>
<p>From those numbers [addendum: yes I do realise there's something else wrong on that server to take that long, but it'd be slow regardless] you can tell that doing the query at all is not an efficient thing to do, and definitely not something a frontend web page should be doing. Usually you just need a ballpark figure so running the query in a cron job and putting the value into memcached (or just an include file) will work well in such cases.</p>
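<p>A minimal sketch of that idea, using a small summary table rather than memcached or an include file (the <em>row_counts</em> table and its columns are hypothetical): the cron job runs the expensive query, and the web page only ever reads the stored value.</p>
<pre>CREATE TABLE row_counts (
  tbl_name   VARCHAR(64) NOT NULL PRIMARY KEY,
  num_rows   BIGINT UNSIGNED NOT NULL,
  counted_at DATETIME NOT NULL
) ENGINE=InnoDB;

-- Run from cron: the slow scan happens here, out of the user's way.
REPLACE INTO row_counts
SELECT 'tblname', COUNT(*), NOW() FROM tblname;

-- Run by the web page: a single primary key lookup.
SELECT num_rows FROM row_counts WHERE tbl_name = 'tblname';</pre>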
<p>If you do use a WHERE clause, all engines (including MyISAM) are in the same boat&#8230; they  might be able to use an index to filter on the conditions &#8211; but the  bigger the table, the more work it is for the engine. PBXT being faster than InnoDB for this task makes it potentially interesting for reporting purposes as well, where otherwise you might consider using MyISAM &#8211; we generally recommend using a separate reporting slave with particular settings anyway (fewer connections but larger session-specific buffers), but it&#8217;s good to have extra choices for the task.</p>
<p>(In case you didn&#8217;t know, it&#8217;s ok for a slave to use a different engine from a master &#8211; so you can really make use of that ability for specialised tasks such as reporting.)</p>
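<p>As a sketch, converting a table on the reporting slave only would be something like the following. The table name is a placeholder, and this assumes the slave has the PBXT engine available.</p>
<pre>-- Run on the slave, not on the master.
ALTER TABLE tblname ENGINE=PBXT;

-- Check which engine the table now uses.
SHOW TABLE STATUS LIKE 'tblname';</pre>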
]]></content:encoded>
			<wfw:commentRss>http://openquery.com/blog/unqualified-count-speed-pbxt-vs-innodb/feed</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>PBXT early impressions in production use</title>
		<link>http://openquery.com/blog/pbxt-early-impressions-production</link>
		<comments>http://openquery.com/blog/pbxt-early-impressions-production#comments</comments>
		<pubDate>Thu, 27 May 2010 02:03:19 +0000</pubDate>
		<dc:creator>arjen</dc:creator>
				<category><![CDATA[Software and tools]]></category>
		<category><![CDATA[InnoDB]]></category>
		<category><![CDATA[mariadb]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[pbxt]]></category>
		<category><![CDATA[storage engine]]></category>
		<category><![CDATA[XA]]></category>

		<guid isPermaLink="false">http://openquery.com/blog/?p=1257</guid>
		<description><![CDATA[With Paul McCullagh&#8217;s PBXT storage engine getting integrated into MariaDB 5.1, it&#8217;s never been easier to try it out. So we have, on a slave off one of our own production systems which gets lots of inserts from our Zabbix monitoring system. That&#8217;s possibly an ideal usage profile, since PBXT is a log based engine (simplistically [...]]]></description>
			<content:encoded><![CDATA[<p>With Paul McCullagh&#8217;s <a href="http://primebase.org" target="_blank">PBXT</a> storage engine getting integrated into <a href="http://askmonty.org" target="_blank">MariaDB 5.1</a>, it&#8217;s never been easier to try it out. So we have, on a slave off one of our own production systems which gets lots of inserts from our Zabbix monitoring system.</p>
<p>That&#8217;s possibly an ideal usage profile, since PBXT is a log based engine (simplistically stated, it indexes its transaction logs, rather than rewriting data from log into index and indexing that) so it should require less disk I/O than say InnoDB. And that means it should be particularly suited to, for instance, logging, which has lots of inserts on a sustained basis. Note that for short insert bursts you may not see a difference with InnoDB because of caching, but sustain it and then you can notice.</p>
<p>Because PBXT has such a different/distinct architecture there&#8217;s a lot of learning involved. Together with Paul and help from Roland Bouman we also created a stored procedure that can calculate the optimal average row size for PBXT, and even ALTER TABLE statements you can paste to convert tables. The AVG_ROW_LENGTH option is quite critical with PBXT, if set too big (or if you let PBXT guess and it gets it wrong) it&#8217;ll eat heaps more diskspace as well as being much slower, and if too small it&#8217;ll be slower also; thus, it needs to be in the right ballpark. For existing datasets it can be calculated, so that&#8217;s what we&#8217;ve worked on. The procs will be published shortly, and Paul will also put them in with the rest of the PBXT files.</p>
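<p>To give an idea of the shape of such a conversion statement (the table name and the value 120 are invented for illustration; the right number has to be calculated for your own data):</p>
<pre>-- AVG_ROW_LENGTH is a regular table option, set here together with the engine.
ALTER TABLE history ENGINE=PBXT AVG_ROW_LENGTH=120;</pre>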
<p>Another important aspect for PBXT is having sufficient cache memory allocated, otherwise operations can take much much longer. While the exact &#8220;cause&#8221; is different, one would notice similar performance aspects when using InnoDB on larger datasets and buffers that are too small for the purpose.</p>
<p>So, while using or converting some tables to PBXT takes a bit of consideration, effort and learning, it appears to be dealing with the real world very well so far &#8211; and that&#8217;s a testament to Paul&#8217;s experience. Paul is also very responsive to questions. As we gain more experience, it is our intent to try PBXT for some of our clients that have operational needs that might be a particularly good fit for PBXT.</p>
<p>I should also mention that it is possible to have a consistent  transaction between PBXT, InnoDB and the binary log, because of the  2-phase commit (XA) infrastructure. This means that you should even be  able to do a mysqldump with &#8211;single-transaction if you have both PBXT  and InnoDB tables, and acquire a consistent snapshot!</p>
<p>More experiences and details to come.</p>
]]></content:encoded>
			<wfw:commentRss>http://openquery.com/blog/pbxt-early-impressions-production/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>RAM flakier than expected</title>
		<link>http://openquery.com/blog/ram-flakier-expected</link>
		<comments>http://openquery.com/blog/ram-flakier-expected#comments</comments>
		<pubDate>Thu, 08 Oct 2009 23:12:22 +0000</pubDate>
		<dc:creator>arjen</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[corruption]]></category>
		<category><![CDATA[InnoDB]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[Open Query]]></category>
		<category><![CDATA[RAM]]></category>

		<guid isPermaLink="false">http://openquery.com/blog/?p=1007</guid>
		<description><![CDATA[Ref: Google: Computer memory flakier than expected (CNET DeepTech, Stephen Shankland) Summary: According to tests at Google, it appears that today&#8217;s RAM modules have several thousand errors a year, which would be correctable if it weren&#8217;t for the fact that most of us aren&#8217;t using ECC RAM. Previous research, such as some data from a [...]]]></description>
			<content:encoded><![CDATA[<p>Ref: <a href="http://news.cnet.com/8301-30685_3-10370026-264.html" target="_blank">Google: Computer memory flakier than expected</a> (CNET DeepTech, Stephen Shankland)</p>
<p><em>Summary: According to tests at Google, it appears that today&#8217;s RAM modules have several thousand errors a year, which would be correctable if it weren&#8217;t for the fact that most of us aren&#8217;t using ECC RAM.</em></p>
<blockquote><p>Previous research, such as some data from a 300-computer cluster, showed that memory modules had correctable error rates of 200 to 5,000 failures per billion hours of operation. Google, though, found the rate much higher: 25,000 to 75,000 failures per billion hours.</p></blockquote>
<p>This is quite relevant for database servers because they write a lot rather than mainly read (desktop use). In the MySQL context, if a bit gets flipped in RAM, your data <em>could</em> get corrupted, or it&#8217;s ok on disk and you&#8217;re just reading corrupted data somehow. While using more RAM is good for performance, it also means a bigger RAM footprint for your data and thus more exposure to the issue.</p>
<p>In MySQL 5.0 and the general 5.1, the <em>binary and relay logs</em> do not have checksums on log events. If something gets corrupted anywhere <em>on disk or on its way to disk</em>, garbage will come out and we have seen instances where this happens. There are patches to add a checksum to the binlog structure (Google worked on this) and we&#8217;ll be pushing for this to be ported into MariaDB 5.1 urgently. It&#8217;s no use having it just in later versions. It does change the on-disk format, but so be it. This is very very important stuff.</p>
<p>FYI, InnoDB does use page checksums which are also stored on disk. There is an option to turn them off, but our general recommendation would be to not do that <img src='http://openquery.com/blog/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' />  What about the iblog files though? Normally they just refer to pages which at some stage get flushed, but a) if through a glitch they refer to a different page that could lose some committed data and b) on recovery, it could directly affect data. <em>Mind you I&#8217;m conjecturing here, more research necessary!</em></p>
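<p>A quick way to check whether page checksums are enabled on a given server, assuming MySQL 5.0 or 5.1 where the setting is exposed as the read-only <em>innodb_checksums</em> variable:</p>
<pre>-- Expect the value ON unless checksums were switched off at server startup.
SHOW VARIABLES LIKE 'innodb_checksums';</pre>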
<p>Naturally this does not just affect database systems, <em>file systems </em>too can easily suffer from RAM glitches &#8211; probably with the exception of <em>ZFS</em>, since it has checksums everywhere and keeps them separate from the data.</p>
<p>Anything that keeps data around in RAM, and/or is write intensive. <em>Memcached</em>! How do other database systems work in this respect?</p>
<p><em>Note: this post is not intended to be alarmist; I just think it&#8217;s good to be aware of things so they can be taken into account when designing systems. If you look closely at any system, there are things that can potentially be cause for concern. That doesn&#8217;t mean we shouldn&#8217;t use them, per se.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://openquery.com/blog/ram-flakier-expected/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Tool of the Day: rsnapshot</title>
		<link>http://openquery.com/blog/tool-day-rsnapshot</link>
		<comments>http://openquery.com/blog/tool-day-rsnapshot#comments</comments>
		<pubDate>Thu, 17 Sep 2009 00:06:19 +0000</pubDate>
		<dc:creator>arjen</dc:creator>
				<category><![CDATA[Software and tools]]></category>
		<category><![CDATA[backup]]></category>
		<category><![CDATA[InnoDB]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[open query]]></category>
		<category><![CDATA[recovery]]></category>
		<category><![CDATA[restore]]></category>
		<category><![CDATA[rsnapshot]]></category>
		<category><![CDATA[rsync]]></category>
		<category><![CDATA[xtrabackup]]></category>

		<guid isPermaLink="false">http://openquery.com/blog/?p=946</guid>
		<description><![CDATA[rsnapshot is a filesystem snapshot utility for making backups of local and remote systems, based on rsync. Rather than just doing a complete copy every time, it uses hardlinks to create incrementals (which are from a local perspective a full backup also). You can specify how long to keep old backups, and all the other [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://rsnapshot.org/" target="_blank">rsnapshot</a> is a filesystem snapshot utility for making backups of local and remote systems, based on <a href="http://www.samba.org/rsync/" target="_blank">rsync</a>. Rather than just doing a complete copy every time, it uses hardlinks to create incrementals (which are from a local perspective a full backup also). You can specify how long to keep old backups, and all the other usual jazz. You&#8217;d generally have it connect over ssh. You&#8217;ll want/need to run it on a filesystem that supports hardlinks, so that precludes NTFS.</p>
<p>In the context of MySQL, you can&#8217;t just do a filesystem copy of your MySQL data/logs, that would be inconsistent and broken.<em> (amazingly, I still see people insisting/arguing on this &#8211; but heck it&#8217;s your business/data to gamble with, right?)</em></p>
<p>Anyway, if you do a local mysqldump also, or for instance use <a href="https://launchpad.net/percona-xtrabackup" target="_blank">XtraBackup</a> to take a binary backup of your InnoDB tablespace/logs, then rsnapshot can be used to automate the transfer of those files to a different geographical location.</p>
<p>Two extra things you need to do:</p>
<ul>
<li>Regularly <strong>test your backups</strong>. They can fail, and that can be fatal. For XtraBackup, run the <em>prepare</em> command and essentially start up a MySQL instance on it to make sure it&#8217;s all happy. Having this already done also saves time if you need to restore.</li>
<li>For restore time, you need to <strong>include the time needed to transfer files back</strong> to the target server.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://openquery.com/blog/tool-day-rsnapshot/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Good Practice / Bad Practice: CREATE TABLE and the Storage Engine</title>
		<link>http://openquery.com/blog/good-practice-bad-practice-create-table-storage-engine</link>
		<comments>http://openquery.com/blog/good-practice-bad-practice-create-table-storage-engine#comments</comments>
		<pubDate>Wed, 24 Jun 2009 01:16:20 +0000</pubDate>
		<dc:creator>Walter Heck</dc:creator>
				<category><![CDATA[Good practice / Bad practice]]></category>
		<category><![CDATA[CREATE TABLE]]></category>
		<category><![CDATA[InnoDB]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[NO_ENGINE_SUBSTITUTION]]></category>
		<category><![CDATA[sql_mode]]></category>

		<guid isPermaLink="false">http://openquery.com/blog/?p=799</guid>
		<description><![CDATA[When you write your create table statements, always make sure that you make them non-ambiguous. That way even though other servers might have different configurations, you make sure your table will be created in the same way. Imagine for instance you are developing an application on a development server, nicely storing all the scripts you [...]]]></description>
			<content:encoded><![CDATA[<p>When you write your create table statements, always make sure that you make them non-ambiguous. That way even though other servers might have different configurations, you make sure your table will be created in the same way.<br />
Imagine for instance you are developing an application on a development server, nicely storing all the scripts you need to create the same database on your production server. If the same script creates a table differently on the two servers, that might cause you a lot of headaches later on. At Open Query, we strive to minimise (or preferably eliminate) headaches.</p>
<p>One of the parts of the create table statement that has the largest impact is the storage engine specification. When you omit the storage engine from the create table statement, your table is automatically created with the default storage engine type configured for the server. Since the storage engine is a very important choice when designing your tables, you want to make sure that it is always the correct type.</p>
<p>Here&#8217;s an example: instead of writing <strong>CREATE TABLE city (city_id int, city_name varchar(100))</strong> you should write: <strong>CREATE TABLE city (city_id int, city_name varchar(100)) ENGINE=InnoDB</strong></p>
<p>It is a simple adjustment, but it will save you from possible problems if you just make a habit out of specifying the storage engine.</p>
<p>But wait, there&#8217;s one more thing! It&#8217;s also very important that you have <strong>sql_mode=NO_ENGINE_SUBSTITUTION</strong> in your my.cnf; otherwise your table may still silently become a MyISAM table if your desired engine is (for any reason) disabled in the binary, in the configuration, or at runtime. With this setting, such a situation will cause an error &#8211; so you know for sure.</p>
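<p>One way to make the habit stick is to lint your schema scripts before they reach production. The sketch below is a hypothetical helper (the name <strong>statements_missing_engine</strong> and the naive split on semicolons are assumptions for illustration, not a real SQL parser) that lists CREATE TABLE statements lacking an ENGINE clause.</p>
<pre>
```python
import re

# Matches the start of a CREATE TABLE statement, case-insensitively.
CREATE_RE = re.compile(r"\bCREATE\s+TABLE\b", re.IGNORECASE)
# Matches an explicit engine clause such as ENGINE=InnoDB or ENGINE = InnoDB.
ENGINE_RE = re.compile(r"\bENGINE\s*=\s*\w+", re.IGNORECASE)


def statements_missing_engine(script):
    """Return the CREATE TABLE statements in `script` that have no ENGINE clause."""
    missing = []
    # Naive split: assumes no semicolons inside string literals or comments.
    for statement in script.split(";"):
        statement = statement.strip()
        if CREATE_RE.search(statement) and not ENGINE_RE.search(statement):
            missing.append(statement)
    return missing
```
</pre>
<p>Fed the two example statements from above, it should report only the first one, the one without ENGINE=InnoDB.</p>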
<p><em>In our good practice / bad practice series, we will provide you with byte/bite sized pieces of advice on what we consider good (or bad) practice in MySQL-land. The topics can be just about anything, so expect random things to come up. Also, the level of difficulty varies greatly. A topic might be a no-brainer for some, a reminder for others and a revelation for a third person. We strive to cater to all of you!<br />
</em></p>
]]></content:encoded>
			<wfw:commentRss>http://openquery.com/blog/good-practice-bad-practice-create-table-storage-engine/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>How many files does InnoDB have open?</title>
		<link>http://openquery.com/blog/files-innodb-open</link>
		<comments>http://openquery.com/blog/files-innodb-open#comments</comments>
		<pubDate>Tue, 16 Jun 2009 01:44:58 +0000</pubDate>
		<dc:creator>arjen</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[InnoDB]]></category>
		<category><![CDATA[innodb-file-per-table]]></category>
		<category><![CDATA[innodb-open-files]]></category>
		<category><![CDATA[innodb_file_per_table]]></category>
		<category><![CDATA[innodb_open_files]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[open files]]></category>
		<category><![CDATA[open-files-limit]]></category>
		<category><![CDATA[open_files_limit]]></category>
		<category><![CDATA[SHOW ENGINE INNODB STATUS]]></category>
		<category><![CDATA[SHOW GLOBAL STATUS]]></category>

		<guid isPermaLink="false">http://openquery.com/blog/?p=775</guid>
		<description><![CDATA[If you use innodb_file_per_table = 1 and innodb_open_files = X (whatever amount is suitable for your server) there&#8217;s no way internal to MySQL for finding out how many IBD files InnoDB actually has open. Neither SHOW GLOBAL STATUS LIKE &#8216;innodb%&#8217; nor SHOW ENGINE INNODB STATUS provide this information. Many sites do have a growing number of tables, [...]]]></description>
			<content:encoded><![CDATA[<p>If you use <strong>innodb_file_per_table = 1</strong> and <strong>innodb_open_files = X</strong> (whatever amount is suitable for your server) there&#8217;s no way internal to MySQL for finding out how many <strong>IBD</strong> files InnoDB actually has open. Neither <strong>SHOW GLOBAL STATUS LIKE 'innodb%'</strong> nor <strong>SHOW ENGINE INNODB STATUS</strong> provides this information.</p>
<p>Many sites do have a growing number of tables, so you&#8217;ll want to know when it&#8217;s time to raise <strong>innodb_open_files</strong> (and potentially also <strong>open-files-limit</strong>). Solution: <strong>sudo lsof | grep -c "\.ibd$"</strong></p>
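<p>If lsof is not installed, roughly the same count can be obtained on Linux by walking /proc directly. This is a sketch under assumptions: the function name <strong>count_open_ibd</strong> is made up here, it relies on the Linux /proc layout, and like the lsof command it needs root to see the file descriptors of mysqld.</p>
<pre>
```python
import os


def count_open_ibd(proc_root="/proc"):
    """Count file descriptors under `proc_root` whose target ends in .ibd."""
    count = 0
    for pid in os.listdir(proc_root):
        if not pid.isdigit():
            continue  # skip non-process entries such as "cpuinfo"
        fd_dir = os.path.join(proc_root, pid, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor was closed in the meantime
            if target.endswith(".ibd"):
                count += 1
    return count


if __name__ == "__main__":
    print(count_open_ibd())
```
</pre>
<p>Run it as root from cron or your monitoring agent and compare the result against innodb_open_files to see how close you are to the limit.</p>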
]]></content:encoded>
			<wfw:commentRss>http://openquery.com/blog/files-innodb-open/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>What to do with the Falcon engine?</title>
		<link>http://openquery.com/blog/falcon-engine</link>
		<comments>http://openquery.com/blog/falcon-engine#comments</comments>
		<pubDate>Thu, 14 May 2009 23:41:47 +0000</pubDate>
		<dc:creator>arjen</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[falcon]]></category>
		<category><![CDATA[InnoDB]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[Open Query]]></category>
		<category><![CDATA[oracle]]></category>

		<guid isPermaLink="false">http://openquery.com/blog/?p=738</guid>
		<description><![CDATA[Keep it. Make sure it gets correctly positioned in the coming months. It appears that with the Oracle acquisition, the reason-to-exist for Falcon is regarded as gone (a non-Oracle-owned InnoDB replacement), previously seen as a strategic imperative &#8211; much delayed though. But look, each engine has unique architectural aspects and thus a niche where it [...]]]></description>
			<content:encoded><![CDATA[<p>Keep it. Make sure it gets correctly positioned in the coming months.</p>
<p>It appears that with the Oracle acquisition, the reason-to-exist for Falcon is regarded as gone (a non-Oracle-owned InnoDB replacement), previously seen as a strategic imperative &#8211; much delayed though.</p>
<p>But look, each engine has unique architectural aspects and thus a niche where it does particularly well. Given that Falcon exists, I&#8217;d suggest not just &#8220;ditching it&#8221; but letting it live on as one of the pluggable engines. What Oracle will do to it is unknown, but Sun/MySQL can secure this positioning by ensuring in the coming months that Falcon works in 5.1 as a pluggable engine, perhaps also creating a separate bzr project/tree for it on Launchpad.</p>
<p>Then the good work can find its way into the real world, now.</p>
]]></content:encoded>
			<wfw:commentRss>http://openquery.com/blog/falcon-engine/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>
