Programming Pig
 
 

Programming Pig [Kindle Edition]

Alan Gates
4.0 out of 5 stars  See all reviews (2 customer reviews)

Print List Price: £25.99
Kindle Price: £17.93 includes VAT* & free wireless delivery via Amazon Whispernet
You Save: £8.06 (31%)
* Unlike print books, digital books are subject to VAT.

Formats

Kindle Edition: £17.93
Paperback: £21.07


Product Description

Book Description

Dataflow Scripting with Hadoop

This guide is an ideal learning tool and reference for Apache Pig, the open source engine for executing parallel data flows on Hadoop. With Pig, you can batch-process data without having to create a full-fledged application—making it easy for you to experiment with new datasets.

Programming Pig introduces new users to Pig and provides experienced users with comprehensive coverage of key features such as the Pig Latin scripting language, the Grunt shell, and User Defined Functions (UDFs) for extending Pig. If you need to analyze terabytes of data, this book shows you how to do it efficiently with Pig.

  • Delve into Pig’s data model, including scalar and complex data types
  • Write Pig Latin scripts to sort, group, join, project, and filter your data
  • Use Grunt to work with the Hadoop Distributed File System (HDFS)
  • Build complex data processing pipelines with Pig’s macros and modularity features
  • Embed Pig Latin in Python for iterative processing and other advanced tasks
  • Create your own load and store functions to handle data formats and storage mechanisms
  • Get performance tips for running scripts on Hadoop clusters in less time

Product details

  • Format: Kindle Edition
  • File Size: 1372 KB
  • Print Length: 222 pages
  • Page Numbers Source ISBN: 1449302645
  • Simultaneous Device Usage: Unlimited
  • Publisher: O'Reilly Media; 1 edition (29 Sep 2011)
  • Sold by: Amazon Media EU S.à r.l.
  • Language: English
  • ASIN: B0065KVFBM
  • Text-to-Speech: Enabled
  • Word Wise: Not Enabled
  • Average Customer Review: 4.0 out of 5 stars  See all reviews (2 customer reviews)
  • Amazon Bestsellers Rank: #390,534 Paid in Kindle Store (See Top 100 Paid in Kindle Store)


Customer Reviews

5 star: 0
3 star: 0
2 star: 0
1 star: 0
4.0 out of 5 stars
Most Helpful Customer Reviews
1 of 1 people found the following review helpful
4.0 out of 5 stars Good text book! 12 Nov 2013
Format:Paperback|Verified Purchase
Great for getting an understanding of how to effectively programme using Pig - good practical examples that are easily laid out and straightforward to follow.
By Zaza
Format:Paperback|Verified Purchase
Right now, there isn't much available about Pig Latin.
This book is decent, but I like chapter 10, about Pig, in the book Hadoop in Action more; it is the best I have found so far.

What is missing in this book, and what you find in the mentioned chapter of the other book, are clear and simple examples of what output a query produces. That really helps in understanding what JOIN, COGROUP, GROUP etc. really do.
Most Helpful Customer Reviews on Amazon.com (beta)
Amazon.com: 4.0 out of 5 stars  15 reviews
16 of 16 people found the following review helpful
2.0 out of 5 stars Lacks Editorial Oversight 27 Mar 2013
By ts - Published on Amazon.com
Format:Paperback|Verified Purchase
The book presents an advanced introduction to PIG. It's a book by an insider for insiders, not an introduction to PIG itself. You may end up producing code to run through some data, but you may not necessarily gain any understanding.

The book reads like a blog. Beyond spell check, it has no editorial oversight whatsoever. The content order is ad hoc; it goes from one topic to another without any apparent continuity. For example, right after a cursory introduction to MapReduce and PIG, the discussion goes into arcane details of command-line and flag settings without any context.

The book covers a whole lot of concepts, but the introduction of these concepts is itself weak. For example, projections are introduced as something PIG has in common with SQL.

The book itself is -2 stars, and -1 for Amazon's Kindle and O'Reilly. The publishing quality of this book is horrible: the code fonts are smaller than the main text, the spacing is uneven, etc. The default settings of freely available web publishing software produce better content than what Amazon and O'Reilly have produced here.
2 of 2 people found the following review helpful
3.0 out of 5 stars Just OK 17 Sep 2013
By Siddhardha - Published on Amazon.com
Format:Paperback
The company I was working for started using Big Data technologies recently and we were all expected to come up to speed quickly. For PIG, this is the only book I could find. Luckily it's available on Safari, so I didn't have to buy it myself (I didn't feel that the content justified the cost of purchasing this book). You can find a good chunk of the content of this book in online blogs, tutorials, and wikis, but it helped to have this book by my side when we were working on a project, since all the relevant information is in a single location.

Some of the information in this book is outdated. For instance, it says the Boolean data type is not supported, but recent versions of PIG do support it, so be sure to refer to the PIG docs from time to time to make sure you have the latest information.

PIG is a relatively immature framework - the authors admit this to a certain extent in the book by mentioning that much of the tuning/optimization effort is outsourced to the user (unlike databases, which make an effort to optimize queries). This book includes some insights into how to tune for performance (e.g., what types of JOINs to use and when, writing UDFs for performance), which is certainly helpful, but the general tone is along the lines of "here are some things to look for but you need to test and find what works best yourself" - in other words, comprehensive examples illustrating the concepts are missing.

Like a lot of other books, this one has some typos, although they are not too hard to figure out. A basic understanding of Hadoop is necessary to use this book, but a solid foundation in Hadoop is not necessarily required (although it helps a lot). All in all, if you are looking for a single place to refer to for PIG-related docs, get this book.
1 of 1 people found the following review helpful
3.0 out of 5 stars Could Have Been Better 17 April 2014
By Big Data Paramedic - Published on Amazon.com
Format:Paperback|Verified Purchase
PIG is a powerful programming tool for big data, yet it is simple to write. I got the book to help me learn PIG programming, and the book does help with it. If you are new to PIG programming, you will find it useful.

I was disappointed that the book has only a cursory reference to piggybank.jar, which is a big plus for PIG, and even that comes at the end with no real examples. Also, there are tons of PIG examples on the internet which walk through many scenarios much better than this book.

This is a quick reference book, not a PIG Bible.
1 of 1 people found the following review helpful
3.0 out of 5 stars Mediocre primer. 7 Aug 2014
By AOL Jack - Published on Amazon.com
Format:Kindle Edition|Verified Purchase
Has some good info, especially on how to extend Pig functionality, but it leaves a lot to be desired as a primer on how to use Pig. Most of the examples are poorly explained. To be honest, I learned more about some of the thinking behind Pig than I learned about how to actually use Pig. This book needs a new, seriously upgraded edition. It only got three stars because it has no competition.
1 of 1 people found the following review helpful
4.0 out of 5 stars Good & Comprehensive, but lacks the last bit of analytic capability I was hoping for 26 Jun 2014
By Frank D. Evans - Published on Amazon.com
Format:Kindle Edition|Verified Purchase
The book is written pretty well, and the examples are clear and easy to follow for the most part. The only lacking aspect in my opinion was a deeper delve into the analytic capabilities for Pig. This is minor, and may actually be a good prompt for a follow-up "Pig Cookbook". Other than that, this is a great reference and has proven very useful.