University of Rochester President Joel Seligman announced this morning that the University is committing $50 million, in addition to more than $50 million it has spent in recent years, to greatly expand its work in the burgeoning field of data science. The commitment will include the creation of an Institute for Data Science, the construction of a state-of-the-art building to house it, and the hiring of as many as 20 new faculty members with expertise in the field.

Seligman made the announcement as part of his opening remarks at the Rochester Big Data Forum 2013, at which renowned researchers in data science from around the nation are meeting for a day of interdisciplinary talks and discussions.

"This is the top University priority for the University's 2013-18 strategic plans that were adopted by the Board of Trustees on October 11. Data science is a defining discipline of the 21st century. By combining sophisticated analytic techniques with rapidly improving computational capabilities, data science can help extract useful information from the quintillions of bytes of data that are created every day. It is the foundation, for example, of data-informed, personalized medicine, is central to national security and defense, and has already changed online commerce," said Seligman.

"The field of data science is taking off and we're jumping in with both feet," explained Robert L. Clark, senior vice president for research and dean of the Hajim School of Engineering and Applied Sciences. "We want to make the most of the opportunities that data science offers and, by hiring a significant number of experts in this field in the next few years, we can ensure that the applications of data science advance research across campus."

Clark expects the new Institute to have an impact on the Rochester region through collaborations with local companies and through the emergence of new ones. "The investment opens up great opportunities for the translation of the discoveries and new techniques that are developed to the commercial sector, and we will also be producing highly trained specialists in this area," Clark said. This would continue a long tradition of entrepreneurship at the University, he added, citing as an example the University's Institute of Optics, whose faculty, staff, and alumni have gone on to lead or found 160 companies, many of them local.

The initiative builds on current University strengths in data science, including the Health Science Center for Computational Innovation (which hosts an IBM Blue Gene/Q supercomputer) and active research in fields such as machine learning, artificial intelligence, and biostatistics. It will also leverage existing data science collaborations with companies such as IBM and Xerox.

"Rochester researchers are already exploiting the tools of data science in their work," said Henry Kautz, chair of the computer science department and director of the Rochester Big Data Initiative. "For example, data science has been a key part of research done here to model and predict the spread of infectious diseases, to track the popularity of political ideas, to understand consumer preferences, and to predict the existence of planets."

Kautz added that the University's expertise in data science is currently dispersed across many departments and relies on individual groups of researchers to connect with each other to share their knowledge. The new Institute will bring these faculty members together with the necessary resources to empower collaborations in data science in all fields, he said.

The new faculty members will be recruited into many departments, including biostatistics, psychiatry, physics, computer science, and political science. Data science will be a critical component of their work, whether they develop its methods or use them. These faculty members will also open the way to new areas of research as the work of the Institute develops. Three domains of initial research focus have already been identified.

The first is predictive health analysis. To cite two examples, the University is already a leader in tracking the spread of infectious diseases and developing methods to control it, and is home to a world center for the collection and analysis of cardiac data.

The second domain is cognitive systems and artificial intelligence, which focus on increasing our understanding of how the brain makes sense of the world.

The third focus will be analytics on demand. Analyzing large-scale data requires appropriate tools, and developing them is a challenge that some of the Institute's faculty will be addressing.