
The CUDA Handbook: A Comprehensive Guide to GPU Programming [Kindle Edition]

Nicholas Wilt
3.5 out of 5 stars  See all reviews (2 customer reviews)

Print List Price: £36.99
Kindle Price: £28.11 includes VAT* & free wireless delivery via Amazon Whispernet
You Save: £8.88 (24%)
* Unlike print books, digital books are subject to VAT.


Book Description


The CUDA Handbook begins where CUDA by Example (Addison-Wesley, 2011) leaves off, discussing CUDA hardware and software in greater detail and covering both CUDA 5.0 and Kepler. Every CUDA developer, from the casual to the most sophisticated, will find something here of interest and immediate usefulness. Newer CUDA developers will see how the hardware processes commands and how the driver checks progress; more experienced CUDA developers will appreciate the expert coverage of topics such as the driver API and context migration, as well as the guidance on how best to structure CPU/GPU data interchange and synchronization.


The accompanying open source code (more than 25,000 lines of it, freely available online) is specifically intended to be reused and repurposed by developers.


Designed to be both a comprehensive reference and a practical cookbook, the text is divided into the following three parts:

Part I, Overview, gives high-level descriptions of the hardware and software that make CUDA possible.

Part II, Details, provides thorough descriptions of every aspect of CUDA, including

  • Memory
  • Streams and events
  • Models of execution, including the dynamic parallelism feature, new with CUDA 5.0 and SM 3.5
  • The streaming multiprocessors, including descriptions of all features through SM 3.5
  • Programming multiple GPUs
  • Texturing

The source code accompanying Part II is presented as reusable microbenchmarks and microdemos, designed to expose specific hardware characteristics or highlight specific use cases.
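To give a flavor of that microbenchmark style, here is a minimal sketch (our own illustration, not code from the book) that uses CUDA events, one of the Part II topics above, to measure host-to-device copy bandwidth:

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Minimal microbenchmark sketch: time a host-to-device copy with CUDA events.
// Buffer size and names are our own choices, not the book's.
int main() {
    const size_t N = 64 << 20;            // 64 MiB buffer
    void *dst = nullptr, *src = malloc(N);
    cudaMalloc(&dst, N);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    cudaMemcpyAsync(dst, src, N, cudaMemcpyHostToDevice);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);           // wait until the copy has finished

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    printf("%.2f GB/s\n", (N / 1e9) / (ms / 1e3));

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(dst);
    free(src);
    return 0;
}
```

A real microbenchmark would also compare pinned versus pageable host memory, which is exactly the kind of hardware characteristic the book's samples are designed to expose.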

Part III, Select Applications, details specific families of CUDA applications and key parallel algorithms, including

  • Streaming workloads
  • Reduction
  • Parallel prefix sum (Scan)
  • N-body
  • Image processing

These algorithms cover the full range of potential CUDA applications.
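As a point of reference for the first of those algorithms, reduction can be sketched as a single shared-memory kernel (a simplified version of the standard pattern, our own illustration rather than the book's code):

```cuda
#include <cuda_runtime.h>

// Simplified shared-memory reduction: each block sums a 256-element slice of
// the input, then the per-block partial sums are added atomically to *out.
__global__ void reduceSum(const float *in, float *out, int n) {
    __shared__ float partial[256];
    int tid = threadIdx.x;
    int i = blockIdx.x * blockDim.x + tid;

    partial[tid] = (i < n) ? in[i] : 0.0f;
    __syncthreads();

    // Tree reduction within the block: halve the active threads each step.
    for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (tid < stride)
            partial[tid] += partial[tid + stride];
        __syncthreads();
    }
    if (tid == 0)
        atomicAdd(out, partial[0]);
}
```

Launched as `reduceSum<<<(n + 255) / 256, 256>>>(d_in, d_out, n)` with `*d_out` zeroed first; production versions add warp-level optimizations of the kind the book discusses.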


About the Author

Nicholas Wilt has been programming professionally for more than twenty-five years in a variety of areas, including industrial machine vision, graphics, and low-level multimedia software. While at Microsoft, he served as the development lead for Direct3D 5.0 and 6.0, built the prototype for the Desktop Window Manager, and did early GPU computing work. At NVIDIA, he worked on CUDA from its inception, designing and often implementing most of CUDA’s low-level abstractions. Now at Amazon, Mr. Wilt is working on cloud computing technologies relating to GPUs.

Product details

  • Format: Kindle Edition
  • File Size: 34681 KB
  • Print Length: 528 pages
  • Simultaneous Device Usage: Up to 5 simultaneous devices, per publisher limits
  • Publisher: Addison-Wesley Professional; 1 edition (11 Jun. 2013)
  • Sold by: Amazon Media EU S.à r.l.
  • Language: English
  • Text-to-Speech: Enabled
  • Word Wise: Not Enabled
  • Amazon Bestsellers Rank: #292,479 Paid in Kindle Store (See Top 100 Paid in Kindle Store)

Customer Reviews

Most Helpful Customer Reviews
2.0 out of 5 stars Memory is everything 20 Nov. 2014
Format:Paperback|Verified Purchase
Very detailed book, containing a lot of advanced features (some of them so advanced that typical CUDA programmers will most likely never use them). Definitely recommended for people who want to write nicely optimized CUDA code, making the most of their architecture. There are, however, some major problems with this book. I am not sure the author gave this book to anyone to read before publication. You learn about memory in chapter 2. Then you learn about memory in chapter 3. If you haven't had enough, the whole of chapter 5 is about memory. I know that when working with CUDA, memory is a matter of life and death, but describing the different memory types in three places using pretty much the Ctrl+C, Ctrl+V method seems like a desperate attempt to simply fill the pages. The same goes for the footnotes, which exist to describe some feature that is usually described in more detail two or three sentences below.
The difficulty of this book is also a roller coaster. In some places you are expected to know quite a lot of CUDA, including the whole library documentation (because who needs parameter explanations?), and then chapter 7 teaches you how to use thread and block indices.
5.0 out of 5 stars Five Stars 28 July 2014
By Javier
Format:Paperback|Verified Purchase
The book's quality is quite good, and it arrived in perfect condition even earlier than I expected.
Most Helpful Customer Reviews (beta): 3.9 out of 5 stars, 9 reviews
15 of 16 people found the following review helpful
5.0 out of 5 stars Excellent but NOT for beginners! 4 Sept. 2013
By Timothy Masters
Format:Paperback|Verified Purchase
As one slowly learns CUDA programming, numerous questions arise concerning the internal workings of the GPU. The beginning programmer does many things on faith: the documentation says to do it this way, so you do it that way, and it works. Why was that way necessary? Not clear.

The documentation supplied by NVIDIA is very good, and several excellent beginners' books are available. But these things fail to answer the many subtle issues that arise. That's where this book comes in. Over and over as I read it, I said, "Ohhh, that's why I have to do it that way." This book was written by a real insider, someone who knows CUDA as only an insider can. So this book is MANDATORY for anyone who wants to become an expert in CUDA programming.

However, be warned that this book is NOT for beginners! It presupposes extensive experience in CUDA programming. If this is the first CUDA book you pick up, you'll be hopelessly lost. Tackle this book only after you have a lot of CUDA under your belt.
10 of 12 people found the following review helpful
5.0 out of 5 stars Put a (Bara)CUDA in your programming. 5 Aug. 2013
By Robin T. Wernick
Format:Paperback|Verified Purchase
"The CUDA Handbook" is the largest (480 pages) and latest (June 2013) of NVIDIA's series of GPU programming books. It is also the most comprehensive and useful GPU programming reference for programmers to date. It's a tough world out there for programmers who are trying to keep up with changes in technology, and this reference makes the future a much more comfortable place to live. Learn about GPGPU programming and get ahead of the crowd.

For those programmers who haven't had the time to perceive the changes, GPU programming is a current change in programming design that is sweeping the world of network VOIP management, parallel analysis and simulation, and even supercomputing in a single box. I have personally run a Starfield Simulation on a portable with an i7 processor that increased in speed 112 times by using the internal NVIDIA GeForce 570M. The Starfield frame time reduced from about 2 seconds to about 0.015 sec. Imagine what I could do with a GeForce 690! Charts indicate that it might exceed 700 times the computing speed!! This book not only tells me how to arrange the software to work with the NVIDIA SDK, but it also shows me the important differences in the architecture of many of the NVIDIA cards to obtain optimum performance.

The world of computing is still filled with 32-bit machines (or OS systems) using most of their memory to get their assigned tasks completed. Many of these machines do not even have four-core CPUs, let alone over 4GB of memory. They fill computers in production devices, desktops in database support companies, and the racks of IT departments everywhere. The need for faster and more computing does not slow down or stop for these hardware limits. And the cost to replace them outright is prohibitive. Now, a demand to manage 5000 computer domains arrives, or a messaging demand for 1500 VOIP channels to be mixed in a hundred groups is brought on board, or a control simulation to manage six robotic arms in an assembly line needs to be run. Without clustering a dozen to one hundred other computers to manage the computing load, the only practical solution is to employ one or two GPUs. Projects that ignore this message are destined to fail, and along with that comes damaged careers and lost jobs.

The solution to avoiding the trap of limited legacy hardware is to use GPUs to take up the load and stop overloading the limited memory and CPU cores with the increased workload. Each GPU can add thousands of streaming processors to perform the work. And each GPU card can add 4GB of high-speed memory to the limited program memory on the motherboard, which may be only 2GB.

The book introduces the GPU architecture, GPU device memory usage and loading, and kernel processor code design. Once you have mastered the terminology and run some of the examples, you will be able to start developing code for specific solutions. The first chapters introduce you to NVIDIA GPU devices. The meat of the book starts in Chapter 5 with proper memory handling procedures. Chapter 7 expands the material on blocks, threads, warps, and lanes; it will straighten out the terminology and get you headed into constructive code to manage the upcoming design.

If your task goes beyond the capabilities of a single GPU, Chapter 9 introduces multiple GPU programming management. The choice of one of the later client motherboards provides up to four PCIe sockets, with the potential of holding four GPUs. That kind of super-computing ability for about $500 a GPU can meet even a gamer's budget. Be aware, though, that added complexity requires added design refinement. Routines need to be optimized; Chapter 11 will help you reduce memory usage, and Chapter 12 will help you increase the efficiency of warp usage.

Three more chapters cover reduction and other routines used in specialized applications that may become of interest to you, and they are also helpful in further mastering the concepts needed to master GPU computing.

Personally, I have a financial program that exceeded my i7 CPU capability for prediction using neural networking because it took more than all night to determine ranking for 400,000 stocks. And I thought that the one hour download time off the internet was onerous. Now I have an affordable solution that won't require me to build a shed out in the backyard to hold all the computers that would normally be required to add this feature to my design. All I have to pay for is a bigger power supply and a single GPU card. Happy computing!
5 of 6 people found the following review helpful
5.0 out of 5 stars Fantastic Book 27 Aug. 2013
By pafluxa
Format:Paperback|Verified Purchase
This book is a must have if you want to dive into the GPU programming world. It is written in a user-friendly language; it is not a "CUDA manual", because even if it describes certain functions and technical aspects of CUDA, the book explains the main features of it by addressing (simplified) real life problems in a very pedagogical way. The book also includes a not-so-extensive review of Dynamic parallelism (which is why I bought the book in the first place), but it should be more than sufficient for most CUDA "newbies" like me.

I can't say much more about this book except this: if you really want to learn CUDA, buy it. You won't be disappointed.
4 of 5 people found the following review helpful
5.0 out of 5 stars It'll be a classic 12 Sept. 2013
By cuda.geek
Format:Paperback|Verified Purchase
I know good books about C++, template metaprogramming, and C#. They have become classics for people serious about CS. For CUDA we have only a few books, and basically none of them answer the question of why. But Nicholas does!

I really love it.

The only thing that is not so good, from my point of view, is the last part, about common algorithms. I think people who read this book already know them. But anyway, that's only my feeling.
5.0 out of 5 stars samples work nicely with VS2013 9 Mar. 2015
By Talek
Format:Kindle Edition|Verified Purchase
Awesome book. The book's webpage has the samples, and they work nicely with my VS2013! They are easier to get working than the CUDA by Example samples, and the book has a lot of interesting facts about the CUDA runtime!