Guest post from Cara Bonnett, Office of Information Technology
Kouros Owzar analyzes huge datasets to identify key genetic markers associated with cancer survival. For him, time is of the essence.
A new Duke computing resource that taps the power of graphics processors commonly used to render pixels in video games is going to enable Owzar and other researchers to complete complex statistical analyses as much as 75 times faster.
“Nobody wants to wait a month for an analysis to finish,” said Owzar, assistant professor in biostatistics and bioinformatics. “You may run an analysis once, then want to run different scenarios, and if every one of those runs takes a long time, it’s not practical. This kind of tool allows us to tackle certain computationally intensive problems.”
Duke’s new computing cluster – part of a campus-wide network dubbed the Blue Devil Grid – harnesses the power of graphics processing units (GPUs), typically used in high-end gaming. GPUs deliver an order of magnitude more computational power than the central processing units (CPUs) found in desktop computers, said John Pormann, director of Duke’s Scalable Computing Support Center (SCSC).
Image: "Exponential Parameter Space Detail PSP Rays" by Dr. L. Rempe, Wikimedia Commons
Because GPUs have dozens or hundreds of small “cores” or engines capable of doing lots of little things at once, they’re especially good for crunching away at what researchers call “embarrassingly parallel” problems, which are common in areas such as genetics, biostatistics and molecular dynamics.
“GPUs can split the job into smaller chunks, let the different cores just grind them out, and we collect the results at the end,” said Owzar, who, with his colleagues Ivo Shterev, Sin-Ho Jung and Stephen George, has developed a prototype comparing GPU and CPU processing speeds on different tests and data set sizes.
“Based on our experience, the larger the sample size and the more complicated the test statistic, the larger the benefit is in terms of speed.”
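For readers who want to see the split, grind and collect pattern Owzar describes, here is a minimal sketch in Python. It is not the team’s prototype: it assumes a simple permutation test on the difference between two group means, the function names are hypothetical, and ordinary CPU threads stand in for GPU cores, so it shows the structure of the work rather than the speedup.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def mean_difference(values, labels):
    """Test statistic (an assumption for this sketch): mean of group 1 minus mean of group 0."""
    group1 = [v for v, g in zip(values, labels) if g == 1]
    group0 = [v for v, g in zip(values, labels) if g == 0]
    return sum(group1) / len(group1) - sum(group0) / len(group0)


def count_extreme(values, labels, observed, n_perms, seed):
    """One independent chunk of work: shuffle the labels n_perms times and
    count how often the permuted statistic is at least as extreme as observed."""
    rng = random.Random(seed)  # each chunk has its own random stream
    shuffled = list(labels)    # private copy, so chunks never share state
    hits = 0
    for _ in range(n_perms):
        rng.shuffle(shuffled)
        if abs(mean_difference(values, shuffled)) >= abs(observed):
            hits += 1
    return hits


def permutation_p_value(values, labels, n_chunks=4, perms_per_chunk=250):
    """Split the permutations into chunks, run them side by side, then combine."""
    observed = mean_difference(values, labels)
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        futures = [
            pool.submit(count_extreme, values, labels, observed, perms_per_chunk, seed)
            for seed in range(n_chunks)
        ]
        total_hits = sum(f.result() for f in futures)  # collect the results at the end
    total_perms = n_chunks * perms_per_chunk
    # Add-one correction keeps the estimated p-value above zero.
    return (total_hits + 1) / (total_perms + 1)


# Made-up illustration data: two clearly separated groups of five.
values = [1.0, 1.2, 0.9, 1.1, 1.0, 3.0, 3.2, 2.9, 3.1, 3.0]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
p_value = permutation_p_value(values, labels)
print(f"estimated p-value: {p_value:.4f}")
```

No chunk needs anything from another chunk until the final sum, which is what makes the problem “embarrassingly parallel.” On a GPU, the same idea would be spread across hundreds of cores instead of four threads.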
An analysis of data on 600 patients, for example – which would have taken about four days on a traditional CPU – now takes four hours. The prototype will be made available soon through http://code.google.com/p/permgpu/ as a stand-alone application and as an extension package for the R statistical environment.
GPUs also are much less expensive than high-performance computer clusters: One card costs $500 to $1,200.
The 16 machines in Duke’s BDgrid include a mix of consumer-grade GPU cards as well as cards designed specifically for high-performance computing.
To find out more about BDgrid, visit the SCSC wiki or contact Tom Milledge in the SCSC.