Seagate’s Apache Hadoop on Lustre Connector

By jfranklin


In November last year, Seagate announced their contribution of the “Apache Hadoop on Lustre Connector” as part of their continued commitment to open source communities. The connector makes Apache Hadoop jobs easier to run because it eliminates the need to copy data into the Hadoop Distributed File System (HDFS). It allows Hadoop and HPC Lustre clusters to use exactly the same data without moving it between file systems or storage devices. Hadoop ecosystem tools such as Mahout, Hive and Pig can now use the Lustre file system as well.

For HPC customers, the key piece is Seagate’s release of the source code for a Hadoop patch that enables Hadoop to share data with HPC architectures that use Lustre for storage. HPC customers, especially in the life sciences and energy fields, are increasingly using Hadoop and Lustre together as part of their data analysis workflows.

From the press release:
“Seagate believes that direct involvement enabling core capabilities as well as fostering the addition of new application environments is critical to open source community vitality, especially for Lustre which is a foundation for much of the success of high performance computing among science, government and business community leaders. Our work with OpenStack Swift, the Open Compute Project (OCP), OpenSFS, EOFS and now Hadoop is just the beginning,” said Ken Claffey, Vice President of ClusterStor, Seagate Cloud Systems and Solutions. “We are committed to driving open source innovation and partnering with open source communities as they develop cutting-edge enabling technologies that are foundational for the entire industry.”

Seagate’s continued involvement and commitment as an OpenSFS Promoter-level member includes serving as an active board member.
