<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Migrating HBase: In the Trenches</title>
	<atom:link href="http://blog.mozilla.org/data/2011/02/04/migrating-hbase-in-the-trenches/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.mozilla.org/data/2011/02/04/migrating-hbase-in-the-trenches/</link>
	<description>Mozilla metrics team technical articles</description>
	<lastBuildDate>Sat, 01 Oct 2011 18:32:33 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
	<item>
		<title>By: Tom Goren</title>
		<link>http://blog.mozilla.org/data/2011/02/04/migrating-hbase-in-the-trenches/comment-page-1/#comment-3589</link>
		<dc:creator>Tom Goren</dc:creator>
		<pubDate>Sat, 01 Oct 2011 18:32:33 +0000</pubDate>
		<guid isPermaLink="false">http://blog.mozilla.org/data/?p=348#comment-3589</guid>
		<description><![CDATA[You guys did an awesome job with this.
You are welcome to check out my solution as well on &lt;a href=&quot;http://tech.tomgoren.com/archives/284&quot; title=&quot;hbase migration&quot; rel=&quot;nofollow&quot;&gt;my blog&lt;/a&gt;.
I was a little hesitant to copy HBase data straight from HDFS because of the same data consistency worries you described in your planning.
Instead I took a more roundabout route, and while I&#039;m sure your solution outperforms mine by far, my approach seems to require a little less manual intervention.
Also, you can divide the table into timestamp-based chunks fairly easily and batch the process.

Anyhow, thanks a lot! This is just my humble contribution; I hope it helps somebody (we had to migrate along the exact same route, 0.20.x to CDH3).]]></description>
		<content:encoded><![CDATA[<p>You guys did an awesome job with this.<br />
You are welcome to check out my solution as well on <a href="http://tech.tomgoren.com/archives/284" title="hbase migration" rel="nofollow">my blog</a>.<br />
I was a little hesitant to copy HBase data straight from HDFS because of the same data consistency worries you described in your planning.<br />
Instead I took a more roundabout route, and while I&#8217;m sure your solution outperforms mine by far, my approach seems to require a little less manual intervention.<br />
Also, you can divide the table into timestamp-based chunks fairly easily and batch the process.</p>
<p>Anyhow, thanks a lot! This is just my humble contribution; I hope it helps somebody (we had to migrate along the exact same route, 0.20.x to CDH3).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: &#187; HBase Backup Options HBase.info -- All things about HBase</title>
		<link>http://blog.mozilla.org/data/2011/02/04/migrating-hbase-in-the-trenches/comment-page-1/#comment-2970</link>
		<dc:creator>&#187; HBase Backup Options HBase.info -- All things about HBase</dc:creator>
		<pubDate>Thu, 23 Jun 2011 12:10:17 +0000</pubDate>
		<guid isPermaLink="false">http://blog.mozilla.org/data/?p=348#comment-2970</guid>
		<description><![CDATA[[...] Because copying a cluster with distcp suffers from data inconsistency problems, the developers at Mozilla built a Backup tool; for details, see their post Migrating HBase in the Trenches. [...]]]></description>
		<content:encoded><![CDATA[<p>[...] Because copying a cluster with distcp suffers from data inconsistency problems, the developers at Mozilla built a Backup tool; for details, see their post Migrating HBase in the Trenches. [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Xavier</title>
		<link>http://blog.mozilla.org/data/2011/02/04/migrating-hbase-in-the-trenches/comment-page-1/#comment-2925</link>
		<dc:creator>Xavier</dc:creator>
		<pubDate>Fri, 27 May 2011 15:44:22 +0000</pubDate>
		<guid isPermaLink="false">http://blog.mozilla.org/data/?p=348#comment-2925</guid>
		<description><![CDATA[Hi Matthias,

I updated the link. I moved the repo to our mozilla-metrics GitHub organization a couple of weeks ago. You don&#039;t actually have to fix anything in the .META. table; HBase will figure that out on its own. But you do need to copy it. As I alluded to in the post, to minimize your downtime you can use Backup to make a &quot;dirty&quot; non-functioning copy of the data first. Then during your downtime you&#039;ll only need to copy the files that have changed.

Feel free to join our IRC channel irc.mozilla.org #metrics if you need any further clarification.

Cheers,

Xavier]]></description>
		<content:encoded><![CDATA[<p>Hi Matthias,</p>
<p>I updated the link. I moved the repo to our mozilla-metrics GitHub organization a couple of weeks ago. You don&#8217;t actually have to fix anything in the .META. table; HBase will figure that out on its own. But you do need to copy it. As I alluded to in the post, to minimize your downtime you can use Backup to make a &#8220;dirty&#8221; non-functioning copy of the data first. Then during your downtime you&#8217;ll only need to copy the files that have changed.</p>
<p>Feel free to join our IRC channel irc.mozilla.org #metrics if you need any further clarification.</p>
<p>Cheers,</p>
<p>Xavier</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Matthias</title>
		<link>http://blog.mozilla.org/data/2011/02/04/migrating-hbase-in-the-trenches/comment-page-1/#comment-2924</link>
		<dc:creator>Matthias</dc:creator>
		<pubDate>Fri, 27 May 2011 15:23:52 +0000</pubDate>
		<guid isPermaLink="false">http://blog.mozilla.org/data/?p=348#comment-2924</guid>
		<description><![CDATA[Hi Xavier,
we are also trying to copy HBase data over to a new cluster, from 0.20.4 to CDH3u0 (0.90.1). Since we don&#039;t have that much data, we could stop HBase for a short while to perform distcp. However, I am not sure how to fix the meta information that references the old regionservers once the data moves to the new cluster. Did you come across that problem when trying distcp?

Thanks for your help,
Matthias

P.S. The link to the backup utility does not work anymore. Is this tool still available?]]></description>
		<content:encoded><![CDATA[<p>Hi Xavier,<br />
we are also trying to copy HBase data over to a new cluster, from 0.20.4 to CDH3u0 (0.90.1). Since we don&#8217;t have that much data, we could stop HBase for a short while to perform distcp. However, I am not sure how to fix the meta information that references the old regionservers once the data moves to the new cluster. Did you come across that problem when trying distcp?</p>
<p>Thanks for your help,<br />
Matthias</p>
<p>P.S. The link to the backup utility does not work anymore. Is this tool still available?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: HBase Backup Options &#171; Sematext Blog</title>
		<link>http://blog.mozilla.org/data/2011/02/04/migrating-hbase-in-the-trenches/comment-page-1/#comment-2663</link>
		<dc:creator>HBase Backup Options &#171; Sematext Blog</dc:creator>
		<pubDate>Fri, 11 Mar 2011 16:57:10 +0000</pubDate>
		<guid isPermaLink="false">http://blog.mozilla.org/data/?p=348#comment-2663</guid>
		<description><![CDATA[[...] came up with their own Backup tool.  They&#8217;ve described the tool and its use in the popular Migrating HBase in the Trenches [...]]]></description>
		<content:encoded><![CDATA[<p>[...] came up with their own Backup tool.  They&#8217;ve described the tool and its use in the popular Migrating HBase in the Trenches [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Links 7/2/2011: FOSDEM 2011 Closing, GNOME 3 Test Day &#124; Techrights</title>
		<link>http://blog.mozilla.org/data/2011/02/04/migrating-hbase-in-the-trenches/comment-page-1/#comment-2573</link>
		<dc:creator>Links 7/2/2011: FOSDEM 2011 Closing, GNOME 3 Test Day &#124; Techrights</dc:creator>
		<pubDate>Mon, 07 Feb 2011 13:48:33 +0000</pubDate>
		<guid isPermaLink="false">http://blog.mozilla.org/data/?p=348#comment-2573</guid>
		<description><![CDATA[[...] Migrating HBase: In the Trenches We recently had a situation where we needed to copy a lot of HBase data while migrating from our old datacenter to our new one. The old cluster was running Cloudera’s CDH2 with HBase 0.20.6 and the new one is running CDH3b3. Usually I would use Hadoop’s distcp utility for such a job. As it turned out we were unable to use distcp while HBase was still running on the source cluster. Part of the reason for this is that the HFTP will throw XML errors due to HBase modifying files (particularly the case if HBase removes a directory). And to transfer our entire dataset at the time was going to take well over a day. This presented a serious problem because we couldn’t accept that kind of downtime. We were also about 75% full in the source cluster so doing HBase export was out as well. Thus I created a utility called Backup. [...]]]></description>
		<content:encoded><![CDATA[<p>[...] Migrating HBase: In the Trenches We recently had a situation where we needed to copy a lot of HBase data while migrating from our old datacenter to our new one. The old cluster was running Cloudera’s CDH2 with HBase 0.20.6 and the new one is running CDH3b3. Usually I would use Hadoop’s distcp utility for such a job. As it turned out we were unable to use distcp while HBase was still running on the source cluster. Part of the reason for this is that the HFTP will throw XML errors due to HBase modifying files (particularly the case if HBase removes a directory). And to transfer our entire dataset at the time was going to take well over a day. This presented a serious problem because we couldn’t accept that kind of downtime. We were also about 75% full in the source cluster so doing HBase export was out as well. Thus I created a utility called Backup. [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Tweets that mention Blog of Data » Blog Archive » Migrating HBase: In the Trenches -- Topsy.com</title>
		<link>http://blog.mozilla.org/data/2011/02/04/migrating-hbase-in-the-trenches/comment-page-1/#comment-2565</link>
		<dc:creator>Tweets that mention Blog of Data » Blog Archive » Migrating HBase: In the Trenches -- Topsy.com</dc:creator>
		<pubDate>Sat, 05 Feb 2011 05:35:30 +0000</pubDate>
		<guid isPermaLink="false">http://blog.mozilla.org/data/?p=348#comment-2565</guid>
		<description><![CDATA[[...] This post was mentioned on Twitter by Mozilla News, Planet Mozilla and anurag, Xavier Stevens. Xavier Stevens said: My blog post about our recent HBase migration along with some code to do backup sync&#039;s - http://t.co/5p3Libq [...]]]></description>
		<content:encoded><![CDATA[<p>[...] This post was mentioned on Twitter by Mozilla News, Planet Mozilla and anurag, Xavier Stevens. Xavier Stevens said: My blog post about our recent HBase migration along with some code to do backup sync&#039;s &#8211; <a href="http://t.co/5p3Libq" rel="nofollow">http://t.co/5p3Libq</a> [...]</p>
]]></content:encoded>
	</item>
</channel>
</rss>
