An image-processing tool developed at Caltech's Center for Data-Driven Discovery helped biologists Yuling Jiao and Elliot Meyerowitz make a discovery about a signaling molecule, which transmits information between cells, in the Arabidopsis thaliana plant. Credit: Alexandre Cunha/Caltech and Jiyan Qi and Yuling Jiao/Chinese Academy of Sciences

There's a growing need among scientists and engineers for tools that can help them handle, explore and analyze big data. A new collaboration between NASA's Jet Propulsion Laboratory and the California Institute of Technology, both in Pasadena, California, has been created to advance this important field.

JPL's Center for Data Science and Technology (CDST) has joined forces with Caltech's Center for Data-Driven Discovery (CD3), creating the Joint Initiative on Data Science and Technology. A kickoff event for the collaboration was held recently at Caltech's Cahill Center for Astronomy and Astrophysics.

"Our joint center is somewhat like an observatory. The software and other expertise brought by Caltech and JPL scientists in the initiative are the instruments that will allow others to make discoveries," said George Djorgovski, professor of astronomy and director of CD3.

Individually, each center strives to provide the intellectual infrastructure, including expertise and advanced computational tools, to help researchers and companies from around the world analyze and interpret the massive amounts of information they now collect using computer technologies, in order to make data-driven discoveries more efficient and timely.

"We've found a lot of synergy across disciplines and an opportunity to apply emerging capabilities in data science to more effectively capture, process, manage, integrate and analyze data," said Daniel Crichton, manager of JPL's CDST. "JPL's work in building observational systems can be applied to several disciplines from planetary science and Earth science to biological research. It's an opportunity for us to not only impact NASA, but also impact other agencies and research enterprises to advance understanding from the vast amount of highly distributed, massive data that is collected from scientific exploration."

JPL, for example, has been working with the National Cancer Institute for the past several years to develop a knowledge environment to support cancer biomarker research. "This collaboration exemplifies the opportunities to leverage data and computational science tools from space science for cancer research and vice versa," Crichton said.

The Caltech center is also interested in taking data science tools and techniques developed for one field and applying them to another. The CD3 recently collaborated on one such project with Ralph Adolphs, Bren Professor of Psychology and Neuroscience and professor of biology at Caltech. They applied machine-learning tools, originally developed to analyze data from astronomical sky surveys, to neurobiological data from a study of autism.

"We're getting some promising results," said Djorgovski. "We think this kind of work will help researchers not only publish important papers but also create tools to be used across disciplines. They will be able to say, 'We've got these powerful new tools for knowledge discovery in large and complex data sets. With a combination of big data and novel methodologies, we can do things that we never could before.'"

Both the CD3 and the CDST began operations last fall. The Joint Initiative already has a few projects under way in the areas of Earth science, cancer research, health care informatics, and data visualization. Caltech manages JPL for NASA.

News Media Contact

Elizabeth Landau
Jet Propulsion Laboratory, Pasadena, Calif.