
The Data Warehouse ETL Toolkit: Practical Techniques for Extracting, Cleaning, Conforming, and Delivering Data [Paperback]

Ralph Kimball, Joe Caserta
3.7 out of 5 stars  See all reviews (3 customer reviews)
RRP: 30.99
Price: 21.07 & FREE Delivery in the UK.
You Save: 9.92 (32%)

Amazon Price New from Used from
Kindle Edition 19.15  
Paperback 21.07  

Book Description

Publication date: 24 Sep 2004 | ISBN-10: 0764567578 | ISBN-13: 978-0764567575 | Edition: 1
  • Cowritten by Ralph Kimball, the world's leading data warehousing authority, whose previous books have sold more than 150,000 copies
  • Delivers real-world solutions for the most time- and labor-intensive portion of data warehousing: data staging, or the extract, transform, load (ETL) process
  • Delineates best practices for extracting data from scattered sources, removing redundant and inaccurate data, transforming the remaining data into correctly formatted data structures, and then loading the end product into the data warehouse
  • Offers proven time-saving ETL techniques, comprehensive guidance on building dimensional structures, and crucial advice on ensuring data quality

Product details

  • Paperback: 528 pages
  • Publisher: John Wiley & Sons; 1 edition (24 Sep 2004)
  • Language: English
  • ISBN-10: 0764567578
  • ISBN-13: 978-0764567575
  • Product Dimensions: 23.4 x 18.7 x 2.8 cm
  • Average Customer Review: 3.7 out of 5 stars  See all reviews (3 customer reviews)
  • Amazon Bestsellers Rank: 224,006 in Books (See Top 100 in Books)

Product Description

From the Back Cover

The single most authoritative guide on the most difficult phase of building a data warehouse

The extract, transform, and load (ETL) phase of the data warehouse development life cycle is far and away the most difficult, time-consuming, and labor-intensive phase of building a data warehouse. Done right, companies can maximize their use of data storage; if not, they can end up wasting millions of dollars storing obsolete and rarely used data. Bestselling author Ralph Kimball, along with Joe Caserta, shows you how a properly designed ETL system extracts the data from the source systems, enforces data quality and consistency standards, conforms the data so that separate sources can be used together, and finally delivers the data in a presentation-ready format.

Serving as a road map for planning, designing, building, and running the back-room of a data warehouse, this book provides complete coverage of proven, timesaving ETL techniques. Beginning with a quick overview of ETL fundamentals, it then looks at ETL data structures, both relational and dimensional. The authors show how to build useful dimensional structures, providing practical examples of techniques.

Along the way you’ll learn how to:

  • Plan and design your ETL system
  • Choose the appropriate architecture from the many possible options
  • Build the development/test/production suite of ETL processes
  • Build a comprehensive data cleaning subsystem
  • Tune the overall ETL process for optimum performance

About the Author

RALPH KIMBALL, PhD, founder of the Kimball Group, has been a leading visionary in the data warehousing industry since 1982 and is one of today’s best-known speakers and educators. He is the author of several bestselling titles published on data warehousing, including The Data Warehouse Toolkit (Wiley).

JOE CASERTA is the founder of Caserta Concepts, LLC, a data warehousing consulting firm. He writes frequently for print and online magazines, and is an active contributor to DWList, the major online community for data warehousing professionals.

Inside This Book
First Sentence
Ideally, you must start the design of your ETL system with one of the toughest challenges: surrounding the requirements.

Customer Reviews

3.7 out of 5 stars
Most Helpful Customer Reviews
32 of 33 people found the following review helpful
2.0 out of 5 stars Wordy, vague and few "Practical Techniques" 27 Jan 2005
By N. Chivers VINE VOICE
Format:Paperback|Verified Purchase
Computing is an exact and unambiguous discipline; consequently I want my computer books to be written in an exact and unambiguous manner. "The Data Warehouse ETL Toolkit" falls far short of this requirement, being wordy, vague, overblown and crammed with jargon. Worst of all, I found there were very few "Practical Techniques" I could take away with me that would help me in my work.
Here's a sample sentence: "This section discusses what needs to go into the data-cleansing baseline for the data warehouse, including simple methods for detecting, capturing and addressing common data-quality issues and procedures for providing the organisation with improved visibility into data-lineage and data-quality improvements over time". Now imagine a whole book written like this. OK, I've taken this sentence out of context, but if I tell you that this was used to introduce a section - there are no preceding or trailing sentences - then I think I am starting to paint a picture.
The authors and publishers seem to have taken the attitude, "Why use a bullet point when a paragraph will do?". Text and examples have been embellished as if in an effort to prove how clever the authors are. A lot of jargon is employed (no glossary), but the reader is always left in doubt as to whether this is industry standard or idiom employed only by the authors.
I think this book could have been so much more useful if they had taken a worked example right through from start to finish. They could have explained where the real world may be different to this perfect model and drawn on their experiences to add colour. Also, if this truly was supposed to be a book of practical techniques, they should have highlighted them, say 1 to 100, through the text, as applicable.
So why two stars rather than none?
12 of 12 people found the following review helpful
4.0 out of 5 stars Woolly at times - but good overall 16 Jun 2006
Format:Paperback|Verified Purchase
The problem with this book is that it is a bit woolly and wordy (just as the previous reviewer described). However, the main difficulty is that there simply is no other book on the market. As a description of the entire end-to-end ETL process, including many subject areas I'd not even considered (e.g. COBOL copy books), it's very good.

However, I'd say the REAL reason for buying this book is that it works well with Ralph Kimball's other work "The Data Warehouse Toolkit", and gives an excellent summary of Dimensional Design. I guess the authors felt they must put this in to explain the background. Personally I found it invaluable.

Also the description of "real time ETL" was invaluable. Everyone's talking about it, but the book gives a credible outline solution.

Yes woolly, yes it uses 10 words when two would do, but overall I got a lot out of it.

1 of 1 people found the following review helpful
5.0 out of 5 stars Excellent Buy 3 April 2013
Format:Paperback|Verified Purchase
Fantastic down to earth explanations with real business situations.
Worth buying if you are a starter in the DW technology.
Most Helpful Customer Reviews on (beta) 4.6 out of 5 stars  26 reviews
23 of 24 people found the following review helpful
5.0 out of 5 stars Another strong Data Warehousing book from Ralph Kimball 23 Nov 2004
By D. Mathews - Published on
In this book Ralph lays down a framework for constructing the DW ETL. This is useful not just in constructing quality ETL processes, but also because Ralph's works tend to 'set' standards in data warehousing. The format of this book is similar to the Lifecycle Toolkit. Ralph takes a very staged, logical approach to the material. Some sections are just great e.g. the chapters on Extraction and Development. A small amount of the material is repeated from the Lifecycle Toolkit and Dimensional Modeling books, but no more than is needed to make this book stand on its own.

Also like the other books, this one takes a vendor agnostic approach. While this may increase the shelf-life of the book, I would have appreciated some comparisons between the major vendors out there today.

Overall: I recommend this one as a buy, even if you have Ralph's other books.
10 of 10 people found the following review helpful
5.0 out of 5 stars Great coverage of the ETL building blocks 19 Dec 2005
By Vincent Mcburney - Published on
This is one of the few references out there providing the building blocks of good ETL design. There are plenty of technical documents and forums out there that are specific to one ETL tool or DBMS, but this is a better starting place for ETL developers. It is required reading, as ETL projects often take shortcuts in design, data quality, and metadata management and reporting. This leads to very expensive Data Warehouse administration costs and often a complete rebuild of load jobs.

The book is relevant for people using most ETL or ELT tools and it will remain relevant for years even as the ETL products continue to advance and mature. It is targeted at DW but the basic flow of Extract, Clean, Conform and Deliver is suitable for most types of data loads.

Good coverage of the alternatives to traditional overnight bulk loads in the section on real-time ETL systems (also describes Microbatch) as the businesses and the major ETL vendors shift to SOA.
12 of 14 people found the following review helpful
5.0 out of 5 stars An almost complete dwh design with ETL orientation 22 Mar 2005
By Massimiliano Celaschi - Published on
This book takes almost all the issues in data warehouse design and presents them from an ETL perspective. In practice, ETL touches more or less the whole of the data warehouse, so the need to describe those issues makes this book a self-contained work that you can read without referring to Kimball's previous books. Besides, I think some technical descriptions are handled better here: in my experience it is impossible to undertake dwh activities without (at least) a sound knowledge of general RDBMS features (indexes, use of a bulk loader vs. INSERT, etc.), and this book addresses them conveniently. On the other hand, the flat style fails to highlight the most significant issues, which end up mixed in with less important statements; that demands close attention while reading, but a blurred boundary between subtleties and trivialities seems to be a common shortcoming in dwh literature. Even with that flaw, the ETL Toolkit turns out to be an outstanding reference on the state of the art of dwh technology.
6 of 6 people found the following review helpful
5.0 out of 5 stars A handy tool on the desk of any ETL Developer. 27 Jun 2006
By Andre Ackermann - Published on
I am currently working as an ETL Developer at a company, Fourier Approach, in Centurion, South Africa. Most of the time this is a fairly hot seat, because so many business requirements are dependent on the quality of information produced by the ETL process.

I always asked myself:

* Am I doing the right thing?

* Is this the best solution?

* How would other developers do this?

A while ago I attended the course "ETL Architecture and Design Workshop", presented by Joe Caserta and hosted by Alicornio Africa in Johannesburg, South Africa. Before the presentation we received a copy of the book "The Data Warehouse ETL Toolkit". This changed my whole perspective.

The book addressed all my ETL questions, with examples from real-world situations. It covers the whole ETL process and gives answers to almost every question you will ever think of asking.

I must say this is a very handy tool on the desk of any serious ETL Developer.


André Ackermann

ETL Developer
9 of 11 people found the following review helpful
5.0 out of 5 stars Indispensable How-To for all ETL Architects/Developers/Mgrs 4 Jan 2005
By Douglas Little - Published on
Format:Paperback|Verified Purchase
Ralph Kimball has rounded out his complete recipe for building fast, cost effective, robust and durable enterprise dimensional data warehouses with this immensely valuable addition to all IT & Data Warehouse professionals' bookshelves.

Without a doubt, ETL has been the biggest stumbling block to deploying and maintaining well-architected data warehouses that stand the test of time. Ralph draws on his years of experience and engagement with thousands of projects and crystallizes the 'Best Practices' into an effective application architecture for all ETL systems, regardless of what tools projects use for implementation.

In this thorough examination of the Extract, Transform and Load (ETL) process, Ralph identifies 38 critical functions that all ETL systems need to implement for success in the long haul. He thoughtfully lays out simple and practical approaches for how each of these functions can be implemented by projects with any size of budget.

The paradoxical nature of ETL (seeming trivial yet replete with endlessly complex details that constantly change) has been the proverbial straw that has broken the bank for many DW projects. Continual customer pressure to grow, improve performance, and quickly deal with changing business conditions has left developers and architects grasping for more powerful and flexible approaches to ETL that meet project timelines, yet evolve and improve with age. Armed with this enlightening roadmap, many DW professionals will be far better equipped to design and build systems that meet the challenges of today and tomorrow.