To statistician Ming Yuan, the challenge of dealing with big data is reminiscent of the Indian fable “Blind Men and the Elephant,” in which six blind men each touch one distinct part of an elephant (an ear, a tail, a trunk, a tusk) and reach narrow conclusions about the nature of the animal.
“All of the data we study is just one facet of a very big object,” says Yuan. “One of our biggest challenges is how we piece data together to develop a complete picture.”
Yuan is in a good position to explore the complete picture at UW–Madison, as a professor in the Department of Statistics and a full investigator with the Morgridge Institute for Research. The department and institute partnered on the unique hire this year, offering a powerful combination of opportunities to land Yuan from Georgia Tech.
Yuan is a strategic hire in the Morgridge Institute’s virology and computational biology research areas, which will attempt to harness the big data revolution to improve human health. Big data will help advance basic research and personalized medicine, including ways to create more precise diagnoses, targeted treatments and even preemptive treatments based on genetic susceptibility to disease.
The Yuan hiring may be a model for other UW–Madison disciplines seeking talent in strategic areas. Yuan is a tenured professor of statistics, but his salary and most startup costs are provided by Morgridge. The arrangement gives Yuan a departmental home, enhanced by the cross-campus reach that comes with working at Morgridge.
“The strong emphasis on research made it possible for Ming to imagine a major career shift into a leading research role,” says statistics department Chair Brian Yandell. “Here the emphasis was on the frontiers, such as new technology and enhanced imaging, and Ming was clearly intrigued.”
Paul Ahlquist, lead scientist for the Morgridge virology research group, says other faculty hires are in the works using this new model. As the institute grows into new research priorities, Morgridge can partner with departments to help attract top faculty whose expertise aligns with the Morgridge mission, giving UW–Madison another arrow in its recruitment quiver.
Ahlquist, a Howard Hughes Medical Institute (HHMI) investigator with UW–Madison faculty appointments in the School of Medicine and Public Health, Graduate School and the College of Agricultural and Life Sciences, says the hiring model used for Yuan very closely resembles HHMI’s approach, which is meant to support boundary-pushing biomedical researchers nationally.
In the case of statistics, it opened new avenues in the joint search. “As well as conducting the usual full and open search, we asked our colleagues, ‘Who is your dream candidate?’ Then, when that candidate did rise to the top, we figured out what it would take to bring him here,” Ahlquist says.
Yuan already has fruitful research leads in his first year with numerous departments, including oncology, biostatistics and medical informatics, and medical physics. “This is exactly what I signed up for when I came here,” he says. “This is a very exciting kind of interdisciplinary environment.”
Yuan’s expertise is in high-dimensional data, or highly unstructured data that is derived from many different sources. These sources, which include clinical trials, genetic sequencing and research experiments, are growing exponentially, making it harder to differentiate between meaningful patterns and noise.
Yuan develops appropriate mathematical models and effective algorithms to reduce false positives and allow biomedical researchers to target the most useful data.
“The trend is that the nature of data is heterogeneous,” he says. “It is very difficult to put together clinical data with molecular data in any meaningful way.”
The art of big data analysis will start with asking the right questions — ones that can realistically be answered in the swarm of data. “We need to have the precise biological question defined first,” he says. “Data is so unstructured and there’s so much out there, you are going to find any pattern you want … whether it’s true or fake.”
Yuan developed novel computational approaches to studying immunology while at Georgia Tech, where his department chair called him “one of the best statisticians of his generation.” At UW–Madison his goal is to follow the needs of his biomedical partners.
“What I want to know is how my quantitative skills can help people, particularly in biology, achieve their goals,” Yuan says. “I want to follow the challenges biologists are encountering across campus.”
Yandell says Yuan’s hiring has added a new dimension to his department.
“Having one of our faculty at Morgridge means we are more aware of the institute’s scholarly activities, which draws our students and faculty over to Morgridge quite often,” Yandell says. “This is leading to new collaborations, including grants and publications, that would not have occurred otherwise.”
Adds Yandell: “It has stretched our thinking about the field of statistics, and I view that as quite positive.”