Cassandra: The Definitive Guide [Paperback]

Eben Hewitt
3.0 out of 5 stars (1 customer review)
RRP: £30.99
Price: £19.83 (delivered free in the UK with Super Saver Delivery)
You Save: £11.16 (36%)

Formats

Kindle Edition: £14.99
Paperback: £19.83

Book Description

Publication date: 29 Nov 2010 | ISBN-10: 1449390412 | ISBN-13: 978-1449390419 | Edition: 1

What could you do with data if scalability wasn't a problem? With this hands-on guide, you'll learn how Apache Cassandra handles hundreds of terabytes of data while remaining highly available across multiple data centers -- capabilities that have attracted Facebook, Twitter, and other data-intensive companies. Cassandra: The Definitive Guide provides the technical details and practical examples you need to assess this database management system and put it to work in a production environment.

Author Eben Hewitt demonstrates the advantages of Cassandra's nonrelational design, and pays special attention to data modeling. If you're a developer, DBA, application architect, or manager looking to solve a database scaling issue or future-proof your application, this guide shows you how to harness Cassandra's speed and flexibility.

  • Understand the tenets of Cassandra's column-oriented structure
  • Learn how to write, update, and read Cassandra data
  • Discover how to add or remove nodes from the cluster as your application requires
  • Examine a working application that translates from a relational model to Cassandra's data model
  • Use examples for writing clients in Java, Python, and C#
  • Use the JMX interface to monitor a cluster's usage, memory patterns, and more
  • Tune memory settings, data storage, and caching for better performance

Frequently Bought Together

Cassandra: The Definitive Guide + Hadoop: The Definitive Guide + HBase: The Definitive Guide
Price For All Three: £64.30


Product details

  • Paperback: 332 pages
  • Publisher: O'Reilly Media; 1 edition (29 Nov 2010)
  • Language: English
  • ISBN-10: 1449390412
  • ISBN-13: 978-1449390419
  • Product Dimensions: 18 x 1.7 x 23.3 cm
  • Average Customer Review: 3.0 out of 5 stars (1 customer review)
  • Amazon Bestsellers Rank: 330,876 in Books

Product Description

About the Author

Eben Hewitt is Director of Application Architecture at a publicly traded company where he is responsible for the design of their mission-critical, global-scale web, mobile and SOA integration projects. He has written several programming books, including Java SOA Cookbook (O'Reilly).


Customer Reviews

5 star: 0
4 star: 0
3 star: 1
2 star: 0
1 star: 0
3.0 out of 5 stars
Most Helpful Customer Reviews
9 of 9 people found the following review helpful
3.0 out of 5 stars OK, but not the usual O'Reilly standard 26 April 2011
Format:Paperback|Amazon Verified Purchase
This seems to be the only Cassandra book available at present, so is probably worth owning if you are interested in Cassandra. However, it added little to the information available on the web in tutorials etc. The section on the internal architecture is confusing and a little disorganised, even when you already understand much of the material. There are quite a few detailed Java code snippets (for client code), but these are very verbose and not well-explained, so don't add as much value as you'd expect. The diagrams explaining the column-based database structure are some of the best I've seen for Cassandra, although they aren't used as much as they could be within the book. The areas I was hoping for extra details on (load balancing, order-preserving partitioning) aren't covered in much detail. The sections on managing Cassandra in production are far too superficial - they describe many of the parameters one might set - but don't really discuss the tradeoffs or how to select the values. This is a problem of style throughout much of the book - it goes into many implementation details, without discussing properly why they matter. Some of the worked examples similarly abandon the reader halfway through without enough explanation.
Most Helpful Customer Reviews on Amazon.com (beta)
Amazon.com: 3.3 out of 5 stars  13 reviews
55 of 57 people found the following review helpful
3.0 out of 5 stars Filled with information but not necessarily the information you want 8 Dec 2010
By John Armstrong - Published on Amazon.com
Format:Paperback
I'm not a database person but I've worked with SQL databases (esp. MySQL) and have read a few papers about non-relational databases, particularly Google's Bigtable. I understand the "web-scale" data challenge and see how a distributed, fault-tolerant, tunable open-source database like Cassandra can be an incredibly useful tool for addressing it. Therefore I was really looking forward to the publication of Eben Hewitt's Cassandra, The Definitive Guide. I was hoping that it would lay out all the important things a person would need to know in order to decide whether Cassandra made sense for their project and, if it did, how specifically they would use it.

Now that the book's out and I've had a chance to read it once through, I have to say that it does not meet my expectations. The author is clearly very interested in his subject and also very anxious to share insights not only into Cassandra but into modern non-relational databases in general (to the extent of including a 25-page appendix "The Nonrelational Landscape" at the end of the book). He does a pretty good job of explaining how Cassandra works at the level of distributed storage including scaling as well as availability and consistency. And though I haven't gone through the steps, he seems to give pretty good instructions for installing, configuring and monitoring a Cassandra cluster.

What he doesn't cover nearly as well as I was hoping (and would have expected from an O'Reilly book) is data modeling in Cassandra and the actual APIs for putting data into the database and getting data out (i.e. querying). It's not that he doesn't cover these subjects at all. In fact he devotes two chapters to data modeling (Chapter 3 The Cassandra Data Model and Chapter 4 Sample Application) and two to APIs (Chapter 7 Reading and Writing Data and Chapter 8 Clients), and these chapters contain a lot of useful information. The problem is that the information I really want is either mixed in with other, for me, less important information and/or is too limited or even not present at all.

Here are some things that I would have expected to be presented in reasonably full, coherent form in a "definitive guide" to Cassandra:

Data modeling:

Column families, supercolumns and columns - what are they for, how do you use them effectively? Especially supercolumns, which, in conjunction with the intrinsically sparse data representation, allow you to blur the distinction between structure and data and store data in "wide" format and even as out-and-out row-specific lists. He touches on matters of this sort, including in the design patterns at the end of his Data Modeling chapter, but doesn't integrate them into a coherent account of how to use the Cassandra data representation model.

Lack of joins - what are the alternatives? He addresses this issue too, but mostly says, denormalize your tables and design for common queries - or even more bluntly, precompute the results of your common queries and put them into your database. This may be a good approach in some situations, but leaves a lot of questions like, when do you precompute your query results, where and how, what triggers the computation, and how do you handle data changes that invalidate previously precomputed query results (one of the problems that normalization and joins were originally designed to solve). Also, I believe he does not say very much about implementing joins and other complex queries on the client side. Does Cassandra have properties that determine more vs. less efficient ways of doing this? How important is planning for locality in your column family organization? And supercolumns for maintaining lists/sets so that you don't have to assemble them at query time?

APIs:

Primary API - what is it? As the author explains, Cassandra doesn't have a query language, so he can't offer a chapter on the Cassandra equivalent of, say, SQL for relational databases. But Cassandra does have an API that lets you put data in and get data out, if not also other things like creating and deleting column families, supercolumns and columns. I was really expecting a chapter (or appendix or whatever) listing out the complete set of API requests and responses, either in some language-neutral format or in terms of the "native" Cassandra language, i.e. Java, ideally with additional information on "bindings" for other client-side languages like PHP, Python and so on. Again the information is sort of there, but not pulled together.

Higher-level wrappers - what are they about? The author talks about Thrift and Avro as (at least somewhat) high-level languages for communicating with Cassandra, but doesn't lay out in any coherent way what those languages are. These tools may be very familiar to some, but I'm sure not to all. He does provide enough information - especially in the form of external links - to make it possible to start exploring these tools, but I would have expected the book to give a pretty good idea of what they're about without having to go off and read other material.

While I am, overall, dissatisfied with the book, I found it both an interesting read and an engaging introduction to the world of Cassandra. It also undeniably offers a wealth of information, even if it's not exactly the information a person may be looking for. For this reason I'm rating it 3 stars.
21 of 21 people found the following review helpful
2.0 out of 5 stars Disappointing: premature, lacks organization and support 5 Jan 2011
By Aiden Mark Humpheys - Published on Amazon.com
Format:Paperback
The information in this book is solid enough but its chaotic structure and lack of support for the code examples make it hard to justify a purchase.

The book was written against version 0.7b2 of Cassandra. That beta status alone should be a warning of the perils of premature publication. None of the code examples work (or indeed compile) with the current API (0.7b5). Downloading the latest code from the author's spartan support site offers little gain. The zip ball contains a readme file noting that the code did work once and suggesting that readers fix it themselves.

There is a consistent pattern of requiring the reader to understand terms which are first defined several chapters later. Slices for example, or setting up the Cassandra JMX interface which is required for data loading in chapter 4 but first described in chapter 8.

Annoying, especially as there is solid information here and it's not badly written. Had the O'Reilly editors been more proactive, ignored the me-first commercial pressures, delayed publication until the API stabilized, and sorted out the structural problems in the writing, this could have been a solid read.
3 of 3 people found the following review helpful
2.0 out of 5 stars Nothing definite about this Guide 4 Mar 2013
By Rajeev Jha - Published on Amazon.com
Format:Paperback
First up, I have nothing against the author. The author comes across as a genuine guy who is actually willing to invest energy in explanations. I just wish he had taken up a different topic. Now I am really fed up with this whole genre of O'Reilly books that do not add anything to what you can otherwise learn on the Internet for free. I bought the Indian edition (and paid only $9).

#1) The edition I have talks about Cassandra 0.7, which is already obsolete (as of 4 March 2013, we have 1.2).
The preferred way of accessing the store may be CQL3 now.

#2) As an application developer, the biggest concern I had was around solving my problem or data modeling. I do not want to delve too much into how to create a cluster and all. The example model of hotel reservation is too simplistic. You are better off reading Jay Patel's eBay tech blogs or DataStax's metric collection sample on the subject. They do a much better job of explaining the Cassandra data model.

Also, any effort to introduce Cassandra data modeling in terms of "equivalent RDBMS terms" is fraught with danger, as Cassandra is actually a big map. The book falls short of my data modeling expectations.

#3) Apart from storage, many people would be looking to run analytics on top of Cassandra. It would have been great to explain in detail how to run Hadoop/Pig on top of the latest Cassandra.

#4) I do not and cannot comment on how this book is for clustering and administration, because that is not my interest. Please check other reviews for that.

The idea that we invest in books because they stand the test of time does not apply here. You cannot pull this book off the shelf two years down the line to check some fact or jog your memory. O'Reilly sucks big time. These kinds of books are nothing but an effort to ride the latest wave of technology.