N. J. Simicich
- Published on Amazon.com
Vine Customer Review of Free Product
I have been donating spare cycles to Folding@Home. I have a 4-way Intel CPU and an NVIDIA graphics card, and the NVIDIA card far outperforms the main processor at churning out the floating point calculations Folding@Home wants... So I had been interested in what was available. NVIDIA has a toolkit out that lets you program the processing cores on the graphics card. The toolkit is called CUDA.
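To give a flavor of what that access looks like, here is a minimal sketch of my own (not code from the book; the names and numbers are mine). You mark a function __global__, and the runtime fans thousands of copies of it out across the GPU's cores, one thread per data element:

    // Minimal CUDA sketch: add two float arrays on the GPU, one thread per element.
    #include <cstdio>
    #include <cstdlib>
    #include <cuda_runtime.h>

    __global__ void add(const float *a, const float *b, float *c, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;  // this thread's element
        if (i < n) c[i] = a[i] + b[i];
    }

    int main() {
        const int n = 1 << 20;                          // a million elements
        size_t bytes = n * sizeof(float);
        float *ha = (float*)malloc(bytes);              // host (CPU) arrays
        float *hb = (float*)malloc(bytes);
        float *hc = (float*)malloc(bytes);
        for (int i = 0; i < n; i++) { ha[i] = 1.0f; hb[i] = 2.0f; }

        float *da, *db, *dc;                            // device (GPU) copies
        cudaMalloc(&da, bytes); cudaMalloc(&db, bytes); cudaMalloc(&dc, bytes);
        cudaMemcpy(da, ha, bytes, cudaMemcpyHostToDevice);
        cudaMemcpy(db, hb, bytes, cudaMemcpyHostToDevice);

        add<<<(n + 255) / 256, 256>>>(da, db, dc, n);   // 4096 blocks of 256 threads
        cudaMemcpy(hc, dc, bytes, cudaMemcpyDeviceToHost);

        printf("c[0] = %f\n", hc[0]);                   // prints 3.000000
        cudaFree(da); cudaFree(db); cudaFree(dc);
        free(ha); free(hb); free(hc);
        return 0;
    }

The point is that the explicit loop over elements disappears; the hardware runs the kernel body for every element more or less at once.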
I'm a retired programmer with 30 years of programming experience. While I don't work hard these days, I like to keep my hand in and keep up with the technology.
And it seemed that the floating point processing power available in the GPU had to be looked at.
It seemed serendipitous that this book became available from Vine just as I was looking at CUDA and the power of the GPU.
Now, when I was a young (19 year old) programmer, someone gave me an enormous scheduling problem. A hundred students had to be assigned to discussion groups. Each student was available at some times and not at others (they might be in class, for example). They were male and female, and it was imperative that the balance of the sexes in each group be respected. Groups had to end up with 4-8 people; if a group came out too small or too large, you tried again with another choice set.
The students' numbers, requested discussion group numbers, and sex were punched onto cards.
I wrote a FORTRAN program (it was 1970) that read the cards into an array (as few bytes per student as possible) and started traversing the problem set. And it was taking a very long time. Finally, I modified the program to check one of the toggle switches on the console and, if the switch was flipped, to dump its state to the console and stop. I left late Friday night, with instructions at the console that on Monday, when the operators needed the computer, they should flip the switch and wait for the program to print. The program ran for a CPU Weekend. CPU Weekends were important back then: the most processing one programmer could get was a run over a weekend. When I worked at IBM Research, fractals were developed there by Benoit Mandelbrot using spare CPU Weekends.
Now, the machine I had the use of in 1970 was an IBM 1130. It had 8K of core memory, a 3.2 microsecond cycle time, and a 1 megabyte hard drive. I expect that today, on my desktop PC, I could exhaustively search all potential solutions and find the best fit in a few minutes to an hour. But this was a slow machine. It finished no more than 20% of the search in the CPU Weekend. I declared the problem "too large to compute" and tried a Monte Carlo approach, where I learned the weaknesses of pseudorandom number generators.
These days the CPU Weekend, the largest piece of work that can be computed in spare cycles, is what a desktop PC with a high end NVIDIA graphics card can process over a weekend -- because the $200 NVIDIA Superclocked GTX 460 is the most powerful mass-marketed computer to date.
So, I started to think: was this power accessible? Could one reprogram the calculation loop of a spreadsheet, one that might have a million or more rows, to calculate in parallel? How about inherently parallel languages like J, where data is commonly held in matrices and processed all at once, in parallel? Could you speed up that sort of processing by reprogramming the calculation loops of J, so that when you casually toss around a large matrix it gets processed in parallel?
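For what it's worth, here is how I imagine a J-style elementwise expression might map onto the GPU. This is purely my own sketch with made-up names; a real J implementation would need far more plumbing. The idea is that an expression like (a * b) + c over million-element matrices could be fused into a single kernel pass:

    // Hypothetical: fuse the J expression r =: (a * b) + c into one kernel.
    // The matrices are flattened into length-n arrays already on the device.
    __global__ void fused_mul_add(const float *a, const float *b,
                                  const float *c, float *r, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) r[i] = a[i] * b[i] + c[i];   // every element, in parallel
    }

    // Launched like the earlier sketch, e.g. for a 1000x1000 matrix:
    //   int n = 1000 * 1000;
    //   fused_mul_add<<<(n + 255) / 256, 256>>>(da, db, dc, dr, n);

The interpreter's inner loop would vanish; the hardware walks the whole matrix at once.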
And then I saw this book. I was intrigued. Actual projects? Source code, maybe? Explanations?
And today the book came.
Now, I was a bit disappointed. These were all independent papers, and many of them basically say the same thing: we had a computationally hard problem; we reprogrammed it to use CUDA; it sped up a whole lot; we were happy. There is a chart that is repeated in study after study. It goes something like this:
We had been running the app on a 2.6 GHz Intel Core ?? chip. The app ran in 50 seconds. We reprogrammed it for an 8-way Intel chip, the latest, and we were able to do the computation in 10 seconds. When we ran it on an NVIDIA card using CUDA, we were able to run the app in 0.12 seconds after applying full optimization. Our first try on the NVIDIA got us a 3 second compute time, but we did something fancy and were able to get another 75% reduction over three stages. But the detail of that optimization is not explained here.
The chart is almost exactly the one in the CUDA Programming Guide, where they note that the GTX 480 has a theoretical floating point output of over 1.3 teraflops. Some of the researchers got 85% of theoretical max.
One line from the Introduction intrigued me. We expect processing power to double every two years, although these processors are not getting faster; they are just adding more cores. When people port a compute bound app from a high end Intel box to a GPU, the app runs 10 to 100 times faster. So, in a way, people who port to GPUs travel 8-12 years into the future. And there is no reason to believe that GPUs won't continue to add processing power, since they can add more cores or speed up the ones they have -- right now they are loafing along at 1500 MHz. So they have traveled to the future and need not come back, and we can reap the benefits of their computation.
There was a lot of detail in the book, but it was the wrong sort for me. If I were a physical chemist, say, the equations describing the final location of an electron in its shell, and how that position was calculated, would have been useful. It was probably obvious to other physical chemists how to translate those equations into an algorithm, but it wasn't to me. In a few places people included application pseudocode, and here and there, all too briefly, actual code was quoted, but I didn't see any explanations - especially not at the level of detail I hoped for. To be clear, there was plenty of detail; it just wasn't where I needed it.
In "A Brief History of Time", Hawking was warned by an editor that every equation he put in his book would halve the readership. By that principle, this book is flat out of readers. '
So in the end, I was disappointed. If I were looking at the cost of a supercomputer, and needed a lot of floating point processing to achieve my computational goals, and was told that the budget was so large it would mean layoffs, well, I might see this book as proof of concept. I could use it as justification for putting together a staff to investigate CUDA and NVIDIA rather than the more traditional supercomputer vendors.
If I were one of those investigators, though, this book would hardly be useful, other than as a way to determine whether the speedups I was getting were in line with what the rest of the industry was getting on similar projects.
All that said, this is still a four star book. If you are the person who needs this book, it is a great book, pretty much unique in its field.