<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Quest for Resilience: Multi-DC Masters</title>
	<atom:link href="http://openquery.com/blog/quest-resilience-multidc-masters/feed" rel="self" type="application/rss+xml" />
	<link>http://openquery.com/blog/quest-resilience-multidc-masters</link>
	<description>About MySQL, Drizzle, MariaDB and more!</description>
	<lastBuildDate>Mon, 19 Mar 2012 14:26:12 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
	<item>
		<title>By: Jeremy Cole</title>
		<link>http://openquery.com/blog/quest-resilience-multidc-masters/comment-page-1#comment-2513</link>
		<dc:creator>Jeremy Cole</dc:creator>
		<pubDate>Wed, 19 May 2010 06:30:03 +0000</pubDate>
		<guid isPermaLink="false">http://openquery.com/blog/?p=1241#comment-2513</guid>
		<description>Hey Arjen,

At Yahoo!, where this was a pretty common requirement, we came up with a model that used a simple TCP proxy to forward traffic, similar to (but slightly more heavy-handed than) using a floating/role IP and IP takeover.  Since IP takeover doesn&#039;t work across routers, it can&#039;t be used here, but the proxy gets you similar behaviour.

Regards,

Jeremy</description>
		<content:encoded><![CDATA[<p>Hey Arjen,</p>
<p>At Yahoo!, where this was a pretty common requirement, we came up with a model that used a simple TCP proxy to forward traffic, similar to (but slightly more heavy-handed than) using a floating/role IP and IP takeover.  Since IP takeover doesn&#8217;t work across routers, it can&#8217;t be used here, but the proxy gets you similar behaviour.</p>
<p>Regards,</p>
<p>Jeremy</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Anon</title>
		<link>http://openquery.com/blog/quest-resilience-multidc-masters/comment-page-1#comment-2481</link>
		<dc:creator>Anon</dc:creator>
		<pubDate>Fri, 14 May 2010 17:36:59 +0000</pubDate>
		<guid isPermaLink="false">http://openquery.com/blog/?p=1241#comment-2481</guid>
		<description>Having run replication between DCs for a couple of years, I&#039;ve only seen seconds_behind &gt; 1s when relay log processing stopped (due to an error, etc.).

If the app is reasonably efficient, and its behavior is known/measured, WAN connections shouldn&#039;t be a problem, whether for a read-only client or for read/write (MM), assuming proper planning and an SLA. But of course that&#039;s theory.

Example problem: An important site gets an unexpected traffic spike due to a special event. Its WordPress implementation is horribly inefficient (lots of poorly written queries in plugins, templates, etc.) and the primary server is struggling, hurting hundreds of other sites. (Of course, this happens at the end of a long day, with zero notice.)

Solution: Move connections to the slave, accessed over WAN, abandoning consistency but spreading the load between two servers. This works great, for several days.

New Problem: Possibly due to throttling by the DC admins, packet loss between the two servers hits ~30% and major problems return for the inefficient site. Simple queries slow down, and active connections pile up.

New Solution: The spike in traffic has subsided, so move the connections back to the primary server, dump from the slave, truncate the tables on both master and slave, and restore to the master.

Yeah, it&#039;s probably time for some capacity planning and disaster recovery discussions. :)</description>
		<content:encoded><![CDATA[<p>Having run replication between DCs for a couple of years, I&#8217;ve only seen seconds_behind &gt; 1s when relay log processing stopped (due to an error, etc.).</p>
<p>If the app is reasonably efficient, and its behavior is known/measured, WAN connections shouldn&#8217;t be a problem, whether for a read-only client or for read/write (MM), assuming proper planning and an SLA. But of course that&#8217;s theory.</p>
<p>Example problem: An important site gets an unexpected traffic spike due to a special event. Its WordPress implementation is horribly inefficient (lots of poorly written queries in plugins, templates, etc.) and the primary server is struggling, hurting hundreds of other sites. (Of course, this happens at the end of a long day, with zero notice.)</p>
<p>Solution: Move connections to the slave, accessed over WAN, abandoning consistency but spreading the load between two servers. This works great, for several days.</p>
<p>New Problem: Possibly due to throttling by the DC admins, packet loss between the two servers hits ~30% and major problems return for the inefficient site. Simple queries slow down, and active connections pile up.</p>
<p>New Solution: The spike in traffic has subsided, so move the connections back to the primary server, dump from the slave, truncate the tables on both master and slave, and restore to the master.</p>
<p>Yeah, it&#8217;s probably time for some capacity planning and disaster recovery discussions. :)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: paul</title>
		<link>http://openquery.com/blog/quest-resilience-multidc-masters/comment-page-1#comment-2479</link>
		<dc:creator>paul</dc:creator>
		<pubDate>Fri, 14 May 2010 06:21:19 +0000</pubDate>
		<guid isPermaLink="false">http://openquery.com/blog/?p=1241#comment-2479</guid>
		<description>We run a geographic setup using circular replication and DRBD combined.

Each DC has two MySQL master servers using DRBD failover to handle local failure on the nodes.  Then the servers in the DC are linked together using circular replication.

We wrote our own scripts to handle the changeover of slave nodes feeding off their local master, because we had this kind of setup long before tools like MMM became available.

We have web/application servers in each DC so if one DC goes down then all customer traffic is diverted to the other DC using DNS failover and a low TTL.  DNS failover is not perfect but from our testing it works well enough for Web for ~94% of our customers.</description>
		<content:encoded><![CDATA[<p>We run a geographic setup using circular replication and DRBD combined.</p>
<p>Each DC has two MySQL master servers using DRBD failover to handle local failure on the nodes.  Then the servers in the DC are linked together using circular replication.</p>
<p>We wrote our own scripts to handle the changeover of slave nodes feeding off their local master, because we had this kind of setup long before tools like MMM became available.</p>
<p>We have web/application servers in each DC so if one DC goes down then all customer traffic is diverted to the other DC using DNS failover and a low TTL.  DNS failover is not perfect but from our testing it works well enough for Web for ~94% of our customers.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Gerry</title>
		<link>http://openquery.com/blog/quest-resilience-multidc-masters/comment-page-1#comment-2477</link>
		<dc:creator>Gerry</dc:creator>
		<pubDate>Fri, 14 May 2010 01:27:10 +0000</pubDate>
		<guid isPermaLink="false">http://openquery.com/blog/?p=1241#comment-2477</guid>
		<description>We have master-master and regular setups across data centers. The master-master setup is only for quick failover. We monitor slave status very closely and we have determined that, under normal conditions, the setup works just fine, given the usual caveats and considerations.

I would not recommend it for high traffic, since the slave is almost guaranteed to fall behind by a few seconds. How much counts as &quot;high&quot; depends on the link between data centers and the particular application profile.

Reading from the slave across data centers has never been an issue for us.

My $.02
G</description>
		<content:encoded><![CDATA[<p>We have master-master and regular setups across data centers. The master-master setup is only for quick failover. We monitor slave status very closely and we have determined that, under normal conditions, the setup works just fine, given the usual caveats and considerations.</p>
<p>I would not recommend it for high traffic, since the slave is almost guaranteed to fall behind by a few seconds. How much counts as &#8220;high&#8221; depends on the link between data centers and the particular application profile.</p>
<p>Reading from the slave across data centers has never been an issue for us.</p>
<p>My $.02<br />
G</p>
]]></content:encoded>
	</item>
</channel>
</rss>
