Use Case Overview
JLDG makes sharing and managing large-scale scientific data across multiple research institutions simple and efficient for users by building on Gfarm, a wide-area distributed file system. SINET's high-bandwidth network serves as the foundation for international research cooperation (ILDG), supporting the discovery and verification of new theories in particle physics.
This case study introduces an initiative in particle physics research that linked computational resources at multiple research institutions, built the data grid "JLDG (Japan Lattice Data Grid)" around the wide-area file system "Gfarm," and promoted domestic and international research cooperation.
Challenge
Distributed Data & Increasing Management Burden
In "lattice QCD simulation," an important research topic in particle physics, fundamental data called "QCD configurations" is generated. Generating this data required more computational resources than a single supercomputer could provide. Earlier efforts (hepnet-J/sc) therefore generated data on supercomputers at multiple research institutions and shared it over the network, but several challenges remained.
- Complexity of Data Management
  Because data was distributed across multiple disks, users had to remember every data location and mirror destination (replica location), which made the management burden very high.
- Difficulty in Cross-Organizational Use
  There was no concept of users or groups spanning research institutions, which made it difficult to build a support system as research infrastructure across multiple organizations.
- High-Speed Network Requirements
  A fast and reliable network was essential because replica creation for large-scale data and data transfer from remote sites occurred frequently.
Solution
Achieving Flat & Seamless Wide-Area Data Sharing
To solve these challenges, development of the data grid "JLDG" began in 2005.
1. Seamless Access Through "Gfarm"
JLDG adopted the global file system Gfarm as its core technology. Gfarm provides a "flat data sharing system without space limitations": users can freely access research data simply by logging into their own organization's server, without needing to know where the data is actually stored.
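The idea of location-transparent access can be illustrated with a small sketch. This is a toy model written for this case study, not Gfarm's actual API or implementation: the `ReplicaCatalog` class, the site names, and the logical path are all hypothetical. It only shows how a catalog can map one logical path to several physical locations so that the user never has to name a site.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class ReplicaCatalog:
    """Toy metadata catalog (hypothetical): logical path -> sites holding a copy."""
    locations: Dict[str, Set[str]] = field(default_factory=dict)

    def register(self, logical_path: str, site: str) -> None:
        # Record that `site` stores a replica of `logical_path`.
        self.locations.setdefault(logical_path, set()).add(site)

    def resolve(self, logical_path: str, local_site: str) -> str:
        """Pick the site to read from, preferring a replica at the user's own site."""
        sites = self.locations.get(logical_path)
        if not sites:
            raise FileNotFoundError(logical_path)
        if local_site in sites:
            return local_site
        # Assumption: with no local replica, fall back deterministically
        # (alphabetical order); a real system would weigh network distance or load.
        return min(sites)


catalog = ReplicaCatalog()
catalog.register("/jldg/example/conf_0001", "site-a")
catalog.register("/jldg/example/conf_0001", "site-b")

# The user only ever supplies the logical path.
print(catalog.resolve("/jldg/example/conf_0001", local_site="site-b"))
print(catalog.resolve("/jldg/example/conf_0001", local_site="site-c"))
```

The first call is served from the user's own site; the second falls back to a remote replica, without the caller changing the path it asks for.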
2. Efficient Data Placement
Behind the scenes, Gfarm automatically places file replicas on servers at each site, which reduces access time to data that would otherwise have to be read from a remote site.
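As a rough illustration of automatic replica placement, the sketch below plans the copies needed so that every site ends up holding each file. It is a simplified model under stated assumptions, not Gfarm's replication algorithm: the function `plan_replication`, the "one replica per site" policy, and the site and file names are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def plan_replication(
    locations: Dict[str, Set[str]], sites: List[str]
) -> List[Tuple[str, str, str]]:
    """Return (path, source_site, target_site) copy tasks so every site holds each file."""
    tasks: List[Tuple[str, str, str]] = []
    for path in sorted(locations):
        holders = locations[path]
        if not holders:
            continue  # no existing copy to replicate from
        # Assumption: any existing holder can act as the source; pick one deterministically.
        source = min(holders)
        for site in sites:
            if site not in holders:
                tasks.append((path, source, site))
    return tasks


# Hypothetical state: which sites currently hold which files.
locations = {
    "/jldg/example/conf_0001": {"site-a"},
    "/jldg/example/conf_0002": {"site-a", "site-b"},
}
sites = ["site-a", "site-b", "site-c"]

for path, source, target in plan_replication(locations, sites):
    print(f"copy {path}: {source} -> {target}")
```

Each planned copy is a bulk transfer between sites, which is why replica creation places sustained load on the wide-area network.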
3. Cross-Organizational User Management
Combining a virtual organization management tool (VOMS) with a user authentication system (Naregi-CA) made user management across organizations possible, so that multiple research institutions can jointly use the infrastructure.
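The following sketch shows, in simplified form, why a virtual organization helps: access is decided by group membership tied to an authenticated identity, not by an account at any one institution. It does not use the VOMS or Naregi-CA interfaces; the certificate subjects, group names, directory layout, and the `can_read` function are all hypothetical.

```python
from typing import Dict, Set

# Hypothetical VO membership: authenticated certificate subject -> VO groups.
VO_MEMBERS: Dict[str, Set[str]] = {
    "/C=JP/O=Example University A/CN=Alice": {"jldg-users", "project-x"},
    "/C=JP/O=Example Laboratory B/CN=Bob": {"jldg-users"},
}

# Hypothetical policy: directory -> VO group required to read below it.
DIRECTORY_GROUP: Dict[str, str] = {
    "/jldg/public": "jldg-users",
    "/jldg/project-x": "project-x",
}


def can_read(subject: str, path: str) -> bool:
    """Allow a read if the subject belongs to the group guarding the closest directory."""
    groups = VO_MEMBERS.get(subject, set())
    matches = [
        d for d in DIRECTORY_GROUP if path == d or path.startswith(d + "/")
    ]
    if not matches:
        return False  # no policy covers this path: deny by default
    closest = max(matches, key=len)  # the longest matching directory wins
    return DIRECTORY_GROUP[closest] in groups


alice = "/C=JP/O=Example University A/CN=Alice"
bob = "/C=JP/O=Example Laboratory B/CN=Bob"
print(can_read(alice, "/jldg/project-x/conf_0001"))
print(can_read(bob, "/jldg/project-x/conf_0001"))
print(can_read(bob, "/jldg/public/conf_0001"))
```

Alice and Bob belong to different institutions, yet the same rule set applies to both because the decision depends only on VO group membership.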
4. International Collaboration Through SINET
These large-volume data transfers run over SINET's high-speed, highly reliable network (L3-VPN service). JLDG also participates in the international data grid ILDG (International Lattice Data Grid) and provides QCD configuration data to researchers in Japan and abroad via SINET, enabling international research collaboration.
Future Development
JLDG is currently used to publish research results in computational particle physics widely, but in the near future we aim to develop it further into the foundation of a "research infrastructure" that researchers use daily. To this end, high-speed and stable network services like SINET are essential to support the replication and transfer of ever-increasing volumes of large-scale data.
We will advance research and development across diverse elements such as computer architecture, file systems, communication software, and various libraries to realize a more seamless and efficient data sharing system and contribute to the development of particle physics.
Other Use Cases
- HPCI Shared Storage
- The lifeline of research infrastructure! A highly reliable data sharing platform achieved through geographic distribution & redundancy
- Subaru Telescope Data Analysis
- Initiatives that significantly improved processing speed by leveraging Gfarm & Pwrake
- NICT Science Cloud
- Enables real-time processing, high-speed data visualization, & instant analysis of big data