Python (but really Pandas) for Data Analysis
on 16 May 2014
Python is especially widely used in scientific, engineering, and data-analysis computing. Until a few years ago, an important tool missing from Python was the ability to handle a so-called "data-frame", which in very basic terms is a spreadsheet-like data structure whose columns can each hold a different data type (this type of structure is a main component of, for example, the R programming language for statistical computing). Around four years ago, the pandas library provided this and related data structures, along with a large set of tools for working with them, and pandas is now *the* vital component for doing data analysis in Python.
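To make the data-frame idea concrete, here is a minimal sketch using pandas. The column names and values are made up for illustration and are not taken from the book; the point is only that each column carries its own type while the whole thing behaves like one table.

```python
import pandas as pd

# Hypothetical sample data, invented purely for illustration.
df = pd.DataFrame({
    "name": ["alice", "bob", "carol"],   # text
    "age": [34, 29, 41],                 # integers
    "height_m": [1.65, 1.80, 1.72],      # floats
    "member": [True, False, True],       # booleans
})

# Each column keeps its own data type.
print(df.dtypes)

# Spreadsheet-style operation: keep only the rows where "member" is True,
# then average the "age" column of those rows.
mean_member_age = df.loc[df["member"], "age"].mean()
print(mean_member_age)
```

The boolean column doubles as a row filter here, which is the kind of concise, table-oriented operation that a data-frame makes possible.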
This book is really about pandas (its author is, after all, the main author of pandas), and less about numpy, ipython, or other tools. I don't mean that as a criticism; it is precisely as it should be. If you are doing strictly data analysis in Python, pandas takes center stage, with tools like numpy and ipython playing supporting roles. What this book does convey, however, is just how well all these tools work together and how they form a strong team for scientific and numerical computing in Python.
This book is detailed and extensive. It is built entirely around well-thought-out code examples that you can follow along with yourself, which makes it a remarkably effective way to learn pandas especially, but also to learn more about numpy, ipython, matplotlib, etc.
If you do data analysis in Python, this book is a must-have. It is also highly recommended for anyone doing scientific or numerical computing in Python generally.