Thursday, July 17, 2008

Sifting the Shifting Sands

Remember all the hullabaloo when the first draft of the human genome was announced in 2000? It wasn't the first genome by any stretch, and it certainly won't be the last. Science marches on!

Today, there are more than a thousand completed genomes on record -- the classic models, nematode C. elegans, fruit fly D. melanogaster and mouse M. musculus, plus a rogues' gallery of human pathogens, the sea urchin, rice and corn, a bunch of our fellow primates and on and on and on. New ones tumble in every week. It doesn't even make news anymore!

Virtually all of this data is publicly available to any researcher who needs to make a comparison, but we're talking about billions and billions and even a couple of trillion bits of data.

There's a lot of great science waiting to happen in all this data because time and again researchers find recurring motifs in the genes of wildly divergent species, and in those motifs, clues about which genetic features matter most and which have been changed by evolution. But where to begin? What tools to use? What does it all mean?!

Duke's Simon Gregory, an assistant professor in medical genetics, is one of the folks who added the mouse genome to the collection, and he also knows a few things about how to sift through this huge sandpile of data to find meaning.

So for the fifth year running, he and a few colleagues are putting on a workshop to help scientists tap into this immeasurably rich source of knowledge. The conference is a three-day geek-out they call the Duke Bioinformatics Workshop, or DBW, and it runs August 18-20.

Anyone who thinks they can hang with it and might benefit is encouraged to attend. Details here.