Data Science Research Support

Northwestern University currently hosts a significant number of faculty engaged in or interested in data science. Some are in the methods disciplines (computer science, applied math, statistics) and many more are in domain disciplines across many schools. Many of these researchers are unaware of one another, and there have not been opportunities for intellectual exchange and focused discussion of data science research projects. The research of faculty here at Northwestern has demonstrated that strong interdisciplinary collaborations lead to increased scientific productivity and impact. Northwestern’s Office of the Provost and Office of Research have made a financial commitment to strengthen interdisciplinary collaborations around data science, with the goal of moving Northwestern forward in this area. Importantly, we define data science (and “big data”) not by the absolute size of the data, but by its increase relative to what has been typical in a discipline.

As a result of the University’s commitment, the Data Science Initiative has been granted funds to support faculty who are already involved in data science or are planning to get involved. Funds have been allocated for retaining and hiring postdoctoral fellows and for supporting current graduate students. We will support on the order of 15-20 data science projects per year at the level of $10K-$50K in direct costs.

View the Funding Programs page for more information.


Data Science: A seismic shift changing how we research and learn

Northwestern News Special Feature, May 2016

A tidal wave of digital information has ushered in a new era of computing and analysis in the 21st century. Data science, or "big data," is affecting every aspect of Northwestern’s learning and research enterprises. Among other things, it is leading to breakthroughs in precision medicine; contributing to a revolution in astronomy with profound insights about the universe; transforming the scope and depth of social science research with significant policy implications; and fueling research about consumer behavior that is affecting how companies do business.

Visit Northwestern News to read more.


NICO 101-0: Introduction to Programming for Big Data

Fall 2016 - Professors Luis Amaral and Adam Pah

Lectures: September 6-9 and 12-15 from 9:30am-12:00pm & 1:30pm-4:30pm in L361.

Overview: Our digital, connected, sensor-rich world is generating extraordinary amounts of data (“Big Data”) that are being used for purposes as diverse as teaching a computer to win at Jeopardy or offering taxi alternatives. The skills needed to go from data to knowledge and application, which go under the name of Data Science, are in high demand in industry, government, and academia. This course provides an introduction to the foundational skills needed by data scientists. Prior knowledge of programming is not needed.

Prerequisites: None.

Restrictions: Intended primarily for undergraduate students. Other students must contact the instructor. Students will need an up-to-date laptop running Linux, OS X, or Windows 7 or higher. Chromebooks will not be permitted. Prior to the start of the course, students must install several packages and verify that they run properly on their machines.

Texts: Lecture materials are available online at

Requirements: There will be about six homework assignments that involve writing Python code to solve specific problems. Students’ solutions will be uploaded to a server where they will be unit tested. There will also be a final coding project. All students will be expected to attend lectures and complete in-class assignments.

Visit CAESAR to register for the course.

Wherever you look, there is talk of the revolution being brought about by "Big Data." But is Big Data just a fad, as its critics contend, or is there something at its core that is here to stay with us?

Well before the term was coined, particle physicists were confronting Big Data challenges, driven by their large-scale detectors and the datasets those detectors produced. Now, however, we live in a world where not just the amount, but also the diversity and complexity of digital information continue to grow exponentially. Materials simulations and astronomy images are pushing the boundaries of exploration. Social networks enable the exchange of information between people; medical devices and e-commerce record the exchange of information between people and machines; GPS devices and bar code scanners allow the exchange of information between machines.