<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Riak and Cassandra and HBase, oh my!</title>
	<atom:link href="http://blog.mozilla.org/data/2010/05/18/riak-and-cassandra-and-hbase-oh-my/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.mozilla.org/data/2010/05/18/riak-and-cassandra-and-hbase-oh-my/</link>
	<description>Mozilla metrics team technical articles</description>
	<lastBuildDate>Sat, 01 Oct 2011 18:32:33 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
	<item>
		<title>By: Cassandra/Riak/Dynamo Optimistic Concurrency Control</title>
		<link>http://blog.mozilla.org/data/2010/05/18/riak-and-cassandra-and-hbase-oh-my/comment-page-1/#comment-2887</link>
		<dc:creator>Cassandra/Riak/Dynamo Optimistic Concurrency Control</dc:creator>
		<pubDate>Fri, 06 May 2011 02:36:10 +0000</pubDate>
		<guid isPermaLink="false">http://blog.mozilla.org/data/?p=184#comment-2887</guid>
		<description><![CDATA[[...] along with installed user base, the Dynamo clones seem to have something else.&#160; They are truly web scale.&#160; I say this because, unlike virtually all other NoSQL implementations, the Dynamo-based [...]]]></description>
		<content:encoded><![CDATA[<p>[...] along with installed user base, the Dynamo clones seem to have something else.&#160; They are truly web scale.&#160; I say this because, unlike virtually all other NoSQL implementations, the Dynamo-based [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Stacey</title>
		<link>http://blog.mozilla.org/data/2010/05/18/riak-and-cassandra-and-hbase-oh-my/comment-page-1/#comment-2749</link>
		<dc:creator>Stacey</dc:creator>
		<pubDate>Sun, 27 Mar 2011 09:16:49 +0000</pubDate>
		<guid isPermaLink="false">http://blog.mozilla.org/data/?p=184#comment-2749</guid>
		<description><![CDATA[Thanks for an enlightening post and discussion!
@deinspanjer: I’m pretty new to Cassandra, but I can try to comment on your second question, regarding changes that are made during cluster failure, based on some basic knowledge: Cassandra handles changes that occurred while the node was down using what is called a gossip protocol. Basically, when the node that was down rejoins the cluster, it will send/receive gossip messages to/from other nodes. By comparing the generation/version number between its own stats object and the gossip message it received, it will know what changed while it was down (more information can be found in the Cassandra wiki: http://wiki.apache.org/cassandra/ArchitectureGossip).

Regarding the Cassandra performance issue (your first question), for those who have been specifically interested in drilling down into Cassandra performance stats, I use a tool called ClearStone. It performs parallel JMX collections from all the Cassandra nodes and provides performance metrics like thread pool/column family store/commit log statistics. It doesn’t actually perform out-of-the-box comparisons between NoSQL technologies, but you can use it to collect configuration metrics from different NoSQL products and compare their performance. You can find it here: http://www.evidentsoftware.com/products/clearstone-for-cassandra/]]></description>
		<content:encoded><![CDATA[<p>Thanks for an enlightening post and discussion!<br />
@deinspanjer: I’m pretty new to Cassandra, but I can try to comment on your second question, regarding changes that are made during cluster failure, based on some basic knowledge: Cassandra handles changes that occurred while the node was down using what is called a gossip protocol. Basically, when the node that was down rejoins the cluster, it will send/receive gossip messages to/from other nodes. By comparing the generation/version number between its own stats object and the gossip message it received, it will know what changed while it was down (more information can be found in the Cassandra wiki: <a href="http://wiki.apache.org/cassandra/ArchitectureGossip" rel="nofollow">http://wiki.apache.org/cassandra/ArchitectureGossip</a>).</p>
<p>Regarding the Cassandra performance issue (your first question), for those who have been specifically interested in drilling down into Cassandra performance stats, I use a tool called ClearStone. It performs parallel JMX collections from all the Cassandra nodes and provides performance metrics like thread pool/column family store/commit log statistics. It doesn’t actually perform out-of-the-box comparisons between NoSQL technologies, but you can use it to collect configuration metrics from different NoSQL products and compare their performance. You can find it here: <a href="http://www.evidentsoftware.com/products/clearstone-for-cassandra/" rel="nofollow">http://www.evidentsoftware.com/products/clearstone-for-cassandra/</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: NoSQL Startup Basho Raises $7.5M for Riak: Cloud Computing News &#171;</title>
		<link>http://blog.mozilla.org/data/2010/05/18/riak-and-cassandra-and-hbase-oh-my/comment-page-1/#comment-2580</link>
		<dc:creator>NoSQL Startup Basho Raises $7.5M for Riak: Cloud Computing News &#171;</dc:creator>
		<pubDate>Wed, 09 Feb 2011 18:01:18 +0000</pubDate>
		<guid isPermaLink="false">http://blog.mozilla.org/data/?p=184#comment-2580</guid>
		<description><![CDATA[[...] attracted some noteworthy customers, too, including Comcast, Wikia and Opscode, and last spring, Mozilla chose Riak over Cassandra and HBase as the foundation its Mozilla Labs Test Pilot project that analyzes large amounts of Firefox-user [...]]]></description>
		<content:encoded><![CDATA[<p>[...] attracted some noteworthy customers, too, including Comcast, Wikia and Opscode, and last spring, Mozilla chose Riak over Cassandra and HBase as the foundation its Mozilla Labs Test Pilot project that analyzes large amounts of Firefox-user [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: deinspanjer</title>
		<link>http://blog.mozilla.org/data/2010/05/18/riak-and-cassandra-and-hbase-oh-my/comment-page-1/#comment-2504</link>
		<dc:creator>deinspanjer</dc:creator>
		<pubDate>Thu, 13 Jan 2011 15:12:41 +0000</pubDate>
		<guid isPermaLink="false">http://blog.mozilla.org/data/?p=184#comment-2504</guid>
		<description><![CDATA[Riak certainly isn&#039;t a competitor to Hadoop in terms of Map Reducing over billions of keys.  That said, we have been using it for ad-hoc document storage as described in the use case above, and it has worked very well.  Earlier versions were definitely very slow and inefficient for key scanning, exactly as you state.  Basho has put significant development into this area, and the latest version has significantly improved performance there.  I don&#039;t think that comparing run-time on a single-node setup with anything on Hadoop is exactly apples-to-apples, though.  Consider that if you have even a small five- to ten-node Hadoop cluster, your MR jobs are going to take a minute or two just in startup/teardown.]]></description>
		<content:encoded><![CDATA[<p>Riak certainly isn&#8217;t a competitor to Hadoop in terms of Map Reducing over billions of keys.  That said, we have been using it for ad-hoc document storage as described in the use case above, and it has worked very well.  Earlier versions were definitely very slow and inefficient for key scanning, exactly as you state.  Basho has put significant development into this area, and the latest version has significantly improved performance there.  I don&#8217;t think that comparing run-time on a single-node setup with anything on Hadoop is exactly apples-to-apples, though.  Consider that if you have even a small five- to ten-node Hadoop cluster, your MR jobs are going to take a minute or two just in startup/teardown.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Alexander</title>
		<link>http://blog.mozilla.org/data/2010/05/18/riak-and-cassandra-and-hbase-oh-my/comment-page-1/#comment-2502</link>
		<dc:creator>Alexander</dc:creator>
		<pubDate>Thu, 13 Jan 2011 08:05:28 +0000</pubDate>
		<guid isPermaLink="false">http://blog.mozilla.org/data/?p=184#comment-2502</guid>
		<description><![CDATA[Riak is indeed cool, but it&#039;s unusable for large-scale data processing. The reason is that sequential key access is awfully, awfully slow. Riak takes several seconds to traverse just a couple of thousand (!) keys on a single-node setup. Hadoop-style map/reduce jobs on large amounts of sequential log-like data for analysis and aggregation are completely out of the question.]]></description>
		<content:encoded><![CDATA[<p>Riak is indeed cool, but it&#8217;s unusable for large-scale data processing. The reason is that sequential key access is awfully, awfully slow. Riak takes several seconds to traverse just a couple of thousand (!) keys on a single-node setup. Hadoop-style map/reduce jobs on large amounts of sequential log-like data for analysis and aggregation are completely out of the question.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Do you know Riak? a decentralized, internet-scale database &#171; Newsicare</title>
		<link>http://blog.mozilla.org/data/2010/05/18/riak-and-cassandra-and-hbase-oh-my/comment-page-1/#comment-2325</link>
		<dc:creator>Do you know Riak? a decentralized, internet-scale database &#171; Newsicare</dc:creator>
		<pubDate>Thu, 14 Oct 2010 03:58:16 +0000</pubDate>
		<guid isPermaLink="false">http://blog.mozilla.org/data/?p=184#comment-2325</guid>
		<description><![CDATA[[...] our attention is that testpilot of Mozilla labs using the RIAK. What RIAK can do for Mozilla labs? Source 1. Expected minimum users: 1 million. Design to accommodate 10 million by the end of the year and [...]]]></description>
		<content:encoded><![CDATA[<p>[...] our attention is that testpilot of Mozilla labs using the RIAK. What RIAK can do for Mozilla labs? Source 1. Expected minimum users: 1 million. Design to accommodate 10 million by the end of the year and [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Adi R</title>
		<link>http://blog.mozilla.org/data/2010/05/18/riak-and-cassandra-and-hbase-oh-my/comment-page-1/#comment-2196</link>
		<dc:creator>Adi R</dc:creator>
		<pubDate>Thu, 19 Aug 2010 01:20:23 +0000</pubDate>
		<guid isPermaLink="false">http://blog.mozilla.org/data/?p=184#comment-2196</guid>
		<description><![CDATA[Great article and especially comments.

So any thoughts about Hypertable.org? This has long been on my radar and sounds promising, but I don&#039;t know if it&#039;s out there in any real production environments?]]></description>
		<content:encoded><![CDATA[<p>Great article and especially comments.</p>
<p>So any thoughts about Hypertable.org? This has long been on my radar and sounds promising, but I don&#8217;t know if it&#8217;s out there in any real production environments?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Blog of Data &#187; Blog Archive &#187; Benchmarking Riak for the Mozilla Test Pilot Project</title>
		<link>http://blog.mozilla.org/data/2010/05/18/riak-and-cassandra-and-hbase-oh-my/comment-page-1/#comment-2181</link>
		<dc:creator>Blog of Data &#187; Blog Archive &#187; Benchmarking Riak for the Mozilla Test Pilot Project</dc:creator>
		<pubDate>Tue, 17 Aug 2010 04:34:00 +0000</pubDate>
		<guid isPermaLink="false">http://blog.mozilla.org/data/?p=184#comment-2181</guid>
		<description><![CDATA[[...] the experiment results and performing analysis on them. As discussed in the previous blog post, Riak and Cassandra and Hbase, oh my!, we decided on Riak as that [...]]]></description>
		<content:encoded><![CDATA[<p>[...] the experiment results and performing analysis on them. As discussed in the previous blog post, Riak and Cassandra and Hbase, oh my!, we decided on Riak as that [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Alternative Database Technology for the Cloud: There is No Silver Bullet    &#124; Rackspace Cloud Computing &#38; Hosting</title>
		<link>http://blog.mozilla.org/data/2010/05/18/riak-and-cassandra-and-hbase-oh-my/comment-page-1/#comment-2169</link>
		<dc:creator>Alternative Database Technology for the Cloud: There is No Silver Bullet    &#124; Rackspace Cloud Computing &#38; Hosting</dc:creator>
		<pubDate>Fri, 13 Aug 2010 17:39:47 +0000</pubDate>
		<guid isPermaLink="false">http://blog.mozilla.org/data/?p=184#comment-2169</guid>
		<description><![CDATA[[...] of a few database alternatives, but many more also exist.  It also links to the write-up about Mozilla’s Test Pilot project, where they talk about the process they used to select a database that met their [...]]]></description>
		<content:encoded><![CDATA[<p>[...] of a few database alternatives, but many more also exist.  It also links to the write-up about Mozilla’s Test Pilot project, where they talk about the process they used to select a database that met their [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Diego Caravana</title>
		<link>http://blog.mozilla.org/data/2010/05/18/riak-and-cassandra-and-hbase-oh-my/comment-page-1/#comment-2142</link>
		<dc:creator>Diego Caravana</dc:creator>
		<pubDate>Mon, 02 Aug 2010 11:35:14 +0000</pubDate>
		<guid isPermaLink="false">http://blog.mozilla.org/data/?p=184#comment-2142</guid>
		<description><![CDATA[Thanks for this interesting and useful article (and also for the comments). I&#039;m researching tools for a similar project, and I love MongoDB, with automatic sharding and availability almost ready in the not-yet-released 1.6 version. I&#039;ve tried Cassandra (a good experience indeed) and read something about HBase, but now I think Riak deserves some attention, and Hypertable, too.]]></description>
		<content:encoded><![CDATA[<p>Thanks for this interesting and useful article (and also for the comments). I&#8217;m researching tools for a similar project, and I love MongoDB, with automatic sharding and availability almost ready in the not-yet-released 1.6 version. I&#8217;ve tried Cassandra (a good experience indeed) and read something about HBase, but now I think Riak deserves some attention, and Hypertable, too.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
