Argonne researchers move 1 PB of data from Dallas to Chicago in under 6 hours

December 18, 2018

Streaming a blockbuster movie in 4K means sending roughly 15 gigabytes of data to your home over two hours. Argonne National Laboratory researchers recently moved about 23,000 times that much data (more than 340 terabytes) in the same amount of time, from Dallas, Texas, to Chicago, Illinois. The transfer was part of a demonstration at SC18, an annual international conference for high-performance computing, networking, storage and analysis.

“One of the demo high points was the transfer of 1 petabyte of data in under 6 hours – 350 minutes to be precise,” said Ian Foster, director of the Data Science and Learning (DSL) division at Argonne National Laboratory. “And the transfer was done sustaining an average rate of 381 gigabits per second.” He emphasized that science communities increasingly need this extreme level of streaming to tackle big data projects involving multiple locations worldwide.
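As a rough check on those headline figures, here is a back-of-envelope sketch in Python; it assumes decimal units and takes the 1-petabyte volume and 350-minute duration from the quote above.

```python
# Back-of-envelope check of the quoted figures (decimal units assumed).
volume_bytes = 1e15          # 1 petabyte
duration_s = 350 * 60        # 350 minutes

avg_rate_gbps = volume_bytes * 8 / duration_s / 1e9
print(f"average rate: {avg_rate_gbps:.0f} Gbps")   # prints ~381 Gbps

# Comparison with the 4K-movie example above: 15 GB streamed over two hours.
movie_bytes = 15e9
moved_in_two_hours = avg_rate_gbps * 1e9 / 8 * (2 * 3600)
print(f"moved in two hours: {moved_in_two_hours / 1e12:.0f} TB")        # prints ~343 TB
print(f"ratio to one movie: {moved_in_two_hours / movie_bytes:,.0f}x")  # prints 22,857x, i.e., roughly 23,000x
```

The ~343 TB result is consistent with the "more than 340 terabytes in two hours" comparison made at the start of this article.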

For the demo, StarLight, a communications exchange facility for global advanced networks, provided the infrastructure and helped troubleshoot issues, while Ciena, a networking systems, services and software company, provided the hardware for moving and storing the reduced data. The research team also used an experimental feature of the Globus transfer service, a hosted service that orchestrates GridFTP and HTTP transfers.
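For readers unfamiliar with Globus, the sketch below shows how a transfer is typically submitted through the Globus Python SDK. The access token, endpoint UUIDs and paths are placeholders, and the experimental feature used in the demo is not shown here.

```python
import globus_sdk

# Placeholders -- substitute a real access token and endpoint UUIDs.
TRANSFER_TOKEN = "REPLACE_WITH_TOKEN"
SRC_ENDPOINT = "SOURCE-ENDPOINT-UUID"
DST_ENDPOINT = "DESTINATION-ENDPOINT-UUID"

# Authenticate to the hosted Globus transfer service.
tc = globus_sdk.TransferClient(
    authorizer=globus_sdk.AccessTokenAuthorizer(TRANSFER_TOKEN)
)

# Describe the transfer: a label plus checksum verification on arrival.
tdata = globus_sdk.TransferData(
    tc, SRC_ENDPOINT, DST_ENDPOINT,
    label="bulk dataset transfer", sync_level="checksum",
)
tdata.add_item("/data/source_dir/", "/data/dest_dir/", recursive=True)

# Submit; the hosted service orchestrates the underlying GridFTP transfer.
task = tc.submit_transfer(tdata)
print("submitted task:", task["task_id"])
```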

“This demo represents a significant leap – we achieved a rate that is four times what we could reach just two years ago,” said Rajkumar Kettimuthu, a computer scientist in Argonne’s DSL division who led the demo. He attributed the demo’s success to Zhengchun Liu, a research scientist at the University of Chicago, who spent countless hours setting up the environment and testing the Globus transfer service for the demo. Joaquin Chung, a DSL division postdoctoral appointee, also provided essential support.

The team had two additional goals in mind for the demo. One was to demonstrate a workflow in which data would be streamed from the source to an analysis site at 200 Gbps; a (simulated) near-real-time analysis would cut the amount of data in half; and the reduced data would then be split, with each half moved at a rate of 100 Gbps to a file system at one of two other remote sites for further analysis and distribution. “We were able to stream at a rate of 160 Gbps from source to the remote analysis site, but the final transfer was done at a rate of only 45 Gbps,” Kettimuthu said.
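For comparison, here is a small tally of the achieved rates against the planned ones, using only the numbers quoted above; the 100 Gbps figure is treated as the per-path target for the final hop.

```python
# Planned vs. achieved rates for the two stages described above (values from the text).
stages = {
    "source -> analysis site": (200, 160),             # (target Gbps, achieved Gbps)
    "analysis site -> remote file systems": (100, 45),
}

for stage, (target, achieved) in stages.items():
    print(f"{stage}: {achieved}/{target} Gbps ({achieved / target:.0%} of target)")
# source -> analysis site: 160/200 Gbps (80% of target)
# analysis site -> remote file systems: 45/100 Gbps (45% of target)
```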

The second goal was to perform file-system-based transfers in addition to memory-based transfers. But even though the nodes in the demo had NVMe drives capable of providing the required I/O rates, they did not have enough CPU capacity to drive those rates for storage I/O and network I/O simultaneously.
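A rough way to see why CPU capacity became the limiting factor: in a file-system-based transfer, each node must sustain storage I/O at the wire rate on top of the network I/O it already handles for a memory-based transfer. The sketch below illustrates this bandwidth budget, assuming, purely for illustration, a node feeding a 100 Gbps link.

```python
# Illustrative per-node bandwidth budget; the 100 Gbps per-node link rate is an assumption.
link_gbps = 100
network_gbytes_per_s = link_gbps / 8   # ~12.5 GB/s on the wire

# Memory-based transfer: the node only moves data between memory and the NIC.
memory_based = network_gbytes_per_s

# File-system-based transfer: the node must also read from (or write to) NVMe at the
# same rate, roughly doubling the data its CPUs and memory subsystem must move.
file_system_based = network_gbytes_per_s * 2

print(f"network I/O only:      {memory_based:.1f} GB/s")
print(f"network + storage I/O: {file_system_based:.1f} GB/s")
```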

The team regards these shortfalls as challenges to be addressed.

“High-speed data transfer is improving dramatically. A few years ago, achieving the rates we’ve demonstrated would have seemed impossible. Today, fast networks and new services are making large data transfer between distributed facilities, storage and computing resources a reality,” Kettimuthu said.