
Apache Flume: Distributed Log Collection for Hadoop (What You Need to Know) [Kindle Edition]

Steve Hoffman
4.0 out of 5 stars (1 customer review)

Print List Price: £22.99
Kindle Price: £16.79 includes VAT* & free wireless delivery via Amazon Whispernet
You Save: £6.20 (27%)
* Unlike print books, digital books are subject to VAT.




Book Description

In Detail

Apache Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. Its main goal is to deliver data from applications to Apache Hadoop's HDFS. It has a simple and flexible architecture based on streaming data flows. It is robust and fault tolerant with many failover and recovery mechanisms.

Apache Flume: Distributed Log Collection for Hadoop covers problems with HDFS and streaming data/logs, and how Flume can resolve these problems. This book explains the generalized architecture of Flume, including moving data to and from databases and NoSQL-style data stores, as well as optimizing performance. It also includes real-world scenarios on Flume implementation.

Apache Flume: Distributed Log Collection for Hadoop starts with an architectural overview of Flume and then discusses each component in detail. It guides you through the complete installation process and compilation of Flume.

It will give you a heads-up on how to use channels and channel selectors. For each architectural component (Sources, Channels, Sinks, Channel Processors, Sink Groups, and so on), the various implementations are covered in detail along with their configuration options, so you can customize Flume to your specific needs. Pointers on writing custom implementations are also given to help you learn and build your own.

By the end, you should be able to construct a series of Flume agents to transport your streaming data and logs from your systems into Hadoop in near real time.


A starter guide that covers Apache Flume in detail.

Who this book is for

Apache Flume: Distributed Log Collection for Hadoop is intended for people who are responsible for moving datasets into Hadoop in a timely and reliable manner, such as software engineers, database administrators, and data warehouse administrators.

Product Description

About the Author

Steve Hoffman

Steve Hoffman has 30 years of software development experience and holds a B.S. in computer engineering from the University of Illinois at Urbana-Champaign and an M.S. in computer science from DePaul University. He is currently a Principal Engineer at Orbitz Worldwide.

More information on Steve can be found at or on Twitter @bacoboy.

This is Steve's first book.

Product details

  • Format: Kindle Edition
  • File Size: 599 KB
  • Print Length: 108 pages
  • Publisher: Packt Publishing (16 July 2013)
  • Sold by: Amazon Media EU S.à r.l.
  • Language: English
  • ISBN-10: 1782167927
  • ISBN-13: 978-1782167921
  • ASIN: B00DZJA82S
  • Text-to-Speech: Enabled
  • X-Ray:
  • Word Wise: Not Enabled
  • Average Customer Review: 4.0 out of 5 stars (1 customer review)
  • Amazon Bestsellers Rank: #838,834 Paid in Kindle Store


Customer Reviews

Most Helpful Customer Reviews
4.0 out of 5 stars A lot of information 11 Nov. 2013
By Anna
Format:Paperback|Verified Purchase
but not enough. I wouldn't mind if they could show how to integrate Apache JMeter with Flume for tests.
Most Helpful Customer Reviews on (beta) 4.7 out of 5 stars  6 reviews
1 of 1 people found the following review helpful
5.0 out of 5 stars Flume Distilled 22 Oct. 2013
By K. Khambadkone - Published on
This is the only commercial publication on Flume apart from the product docs, and probably the only book you will need to get started. I have become a big fan of Packt publications, mainly because they put out great books written by people who have very extensive experience implementing the technology they are writing about. Not only is the core topic, Flume in this case, extensively covered, but as a bonus you get a lot of other nuggets on how this technology fits into the overall ecosystem. This is illustrated with copious use cases and integration scenarios, which you normally don't find in publications of this kind.

What I liked most about this book is the very thorough coverage of the Flume architecture and the various topologies in which Flume can be deployed. A whole chapter is devoted to each key component, such as the Agent, Source, Sink, Channel, and Interceptors. The writing style was also very professional, so much so that it actually read like a story book and was a delight to read.
1 of 1 people found the following review helpful
4.0 out of 5 stars Good book to start for Apache Flume 12 Sept. 2013
By Padmanabh Sahasrabudhe - Published on
I would say it is a good book for a newbie to start with. I found it helpful for getting familiar with Flume and its usage. I was expecting some exposure to the elastic Flume side but couldn't find that in this book. Nevertheless, as a new person to big data, I think it gave me sufficient know-how at least. It would have been more helpful to follow a single configuration story through the whole book. I would rate it 3.
5.0 out of 5 stars A well-rounded, complete book 10 Feb. 2014
By Helena Kerzner - Published on
In this book the reader will find overview, quick start, channels, sinks (serializers, sink groups), sources, channel selectors, interceptors, ETL and routing, and a practical view of distributed data collection. The material is well organized and goes to the right level of detail. Still, I would give it 4 stars, just because it is not Marcel Proust or James Joyce, but if you also consider the level of other books that you often see - this one stands out so much that it deserves a complete five-star rating.
5.0 out of 5 stars Best book to start with Flume 27 Dec. 2013
By Amit Sharma - Published on
I have gone through the initial chapters of this book and, being a newbie to Flume, I think this is possibly the best way I could have started with Flume.

In 100 pages and 8 chapters, the content is well placed and quite well curated for a beginner.

I would suggest this book to anyone who wants to get started with Flume and gain a fair understanding of how it works.

4.0 out of 5 stars Good for users bad for developers 30 Oct. 2013
By Ivan - Published on
Format:Kindle Edition|Verified Purchase
Good, compact coverage of the main concepts of flume-ng. Best book for starting work with Flume. If you are looking for answers to questions about log import, this book will give you some useful advice, but not answers.