University of Oklahoma provides storage for academic research

With a high-capacity, cost-effective archiving solution from IBM

Published on 10-Aug-2012

"We only needed three things: bigger, faster and cheaper." - Henry Neeman, Director of the OU Supercomputing Center for Education & Research (OSCER)

Customer:
University of Oklahoma

Industry:
Education

Deployment country:
United States

Solution:
Technical Computing, General Parallel File System (GPFS), High Availability, Storage Consolidation

Overview

Since 1890, the University of Oklahoma (OU) has provided higher-level education and valuable research through its academic programs. With involvement in science, technology, engineering and mathematics, the university has increased its focus on high performance computing (HPC) to support data-centric research. In service of OU’s education and research mission, the OU Supercomputing Center for Education & Research (OSCER), a division of OU Information Technology, supports research projects by providing an HPC infrastructure for the university.

Business need:
The University of Oklahoma needed a highly scalable and cost-efficient onsite data archiving solution to support rapidly expanding amounts of academic research data.

Solution:
The university deployed IBM System Storage® DCS9900, IBM System Storage TS3500 Tape Library and IBM System x® 3650 class servers with IBM General Parallel File System and IBM Tivoli® Storage Manager.

Benefits:
The IBM solution supports up to 1.7 petabytes (PB) for disk and 4.3 PB for tape (initial configuration; expandable to over 60 PB) and delivers high performance with near-peak tape drive speeds.

Case Study


Rapid data growth in academic research

In a worldwide trend that spans the full spectrum of academic research, explosive data growth has directly affected university research programs. Across a diverse range of data sources, from gene sequencing to astronomy, datasets have rapidly grown and in some cases now reach multiple petabytes (millions of gigabytes).

One ongoing research project that produces massive amounts of data is conducted by OU’s Center for Analysis and Prediction of Storms. Each year, this project becomes one of the world’s largest storm forecasting endeavors, frequently producing terabytes of data per day. Much of this real-time data is shared with professional forecasters, but a large amount is stored for later analysis. Long-term storage of this data holds strong scientific value, and in many cases, is required by research funding agencies. Understandably, storage space had become a major issue for the university.

Need for an onsite storage system

In the past, for projects like storm forecasting, OU did not have the capability to store large amounts of data on campus—much of the data had to be stored offsite at national supercomputing centers. This not only created issues for performance and management at the university, it also forced researchers to reduce the amounts of data for offsite storage, creating potential for loss of information that could be valuable for future analysis.

Henry Neeman, director of OSCER, realized that to continue supporting many of the university’s research projects, and to retain funding, OU would need a large-scale archival storage system that enabled long-term data storage while containing costs for deployment and operations.

With a clear vision for the new storage system, OU began reviewing bids from multiple vendors. Neeman noticed that while most proposed solutions were technically capable, the IBM solution was able to meet technical requirements and stay within budget. Ultimately, it offered the best value to the university and would go on to establish a powerful new business model for storage of research data.

High-capacity, cost-effective data archive

By combining disk- and tape-based storage, OU established a storage system known as the Oklahoma PetaStore, which is capable of handling petabytes (PB) of data. For high-capacity disk storage, OU selected the IBM System Storage DCS9900, which is scalable up to 1.7 PB. For longer-term data storage, OU chose the System Storage TS3500 Tape Library, with an initial capacity of up to 4.3 PB that is expandable to over 60 PB. Six IBM System x3650 class servers run these storage systems, using IBM General Parallel File System (GPFS™) on the disk system and IBM Tivoli Storage Manager on the tape library to automatically move or copy data to tape.
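As a rough illustration of the capacities quoted above, the following Python sketch estimates how long each storage tier would last at a constant ingest rate. The function name `days_to_fill`, the 2 TB/day rate and the decimal unit convention (1 PB = 1,000 TB) are assumptions made for illustration, not figures from OU or IBM; the case study states only that some projects frequently produce terabytes of data per day.

```python
# Capacities quoted in the case study, in petabytes.
DISK_PB = 1.7
TAPE_INITIAL_PB = 4.3
TAPE_EXPANDED_PB = 60.0  # the text says "over 60 PB"; 60 is used as a lower bound

# Assumption: decimal units, 1 PB = 1,000 TB.
TB_PER_PB = 1000


def days_to_fill(capacity_pb: float, ingest_tb_per_day: float) -> float:
    """Return the number of days a tier lasts at a constant ingest rate."""
    if ingest_tb_per_day <= 0:
        raise ValueError("ingest rate must be positive")
    return capacity_pb * TB_PER_PB / ingest_tb_per_day


if __name__ == "__main__":
    # Hypothetical ingest rate, chosen only to make the arithmetic concrete.
    rate_tb_per_day = 2.0
    tiers = [
        ("disk", DISK_PB),
        ("tape (initial)", TAPE_INITIAL_PB),
        ("tape (expanded)", TAPE_EXPANDED_PB),
    ]
    for name, capacity_pb in tiers:
        days = days_to_fill(capacity_pb, rate_tb_per_day)
        print(f"{name}: about {days:,.0f} days at {rate_tb_per_day} TB/day")
```

The estimate ignores compression, duplicate copies and data that is deleted or migrated between tiers, so it is best read as an order-of-magnitude guide rather than a capacity plan.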

Neeman says one of the main reasons OU chose IBM was the cost-effectiveness of the tape solution. Unlike the TS3500 and Tivoli Storage Manager, many other tape solutions impose additional costs, such as tape cartridge slot activation upcharges and per-capacity software upcharges, that could be prohibitive to researchers. The TS3500 Tape Library offers a flexible upgrade path, enabling users to easily and affordably expand the initial capacity. These savings even enabled OU to implement a mechanism to access and manage backup data through extensible interfaces.

OU has adopted an innovative business model under which storage costs are shared among stakeholders. In this model, a grant from the National Science Foundation pays for the hardware, software and initial support; OU covers the space, power, cooling, labor and longer-term support costs; and the researchers purchase storage media (tape cartridges and disk drives) to archive their datasets, which OSCER deploys and maintains without usage upcharges.

Storage that impresses on many levels

The PetaStore provides research teams with a hugely expandable archive system, allowing data to be stored through several duplication policy choices that are set by the researchers. The connectivity capabilities allow data to be accessible not only to the university, but to other institutions and collaborators.

Although capacity was more of a priority than speed when designing the PetaStore, this IBM solution has shown strong performance, with tape drives operating close to peak speed. Another key benefit of the solution is its cost-effectiveness, not only in hardware but also in reduced labor costs for the researchers. Neeman has noticed these benefits and says, “Without the PetaStore, several very large scale, data-centric research projects would be considerably more difficult, time consuming and expensive to undertake, some of them so much so as to be impractical.”

Continued innovation with IBM

By choosing the IBM solution for the PetaStore project, the University of Oklahoma has ensured a future of continued innovation in academic research. The system not only facilitates storage for the entire lifecycle of research data, but also ensures that the PetaStore can continue operating and expanding at very low cost. This is critical for the university to continue to receive funding: the solution’s built-in cost efficiency proves to research funding agencies that the university can continue to operate the storage system within budget.

Overall, the university and research teams have seen numerous advantages to the IBM solution, and plan for it to seamlessly expand along with their storage needs. According to Neeman, "We only needed three things: bigger, faster and cheaper," and the IBM solution was able to deliver on all fronts. Neeman predicts that data storage solutions like the Oklahoma PetaStore will become increasingly common at research institutions across the country and worldwide.

Products and services used

IBM products and services that were used in this case study.

Hardware:
Storage: TS3500 Tape Library
System x: System x3650 M3

Software:
Tivoli Storage Manager, Tivoli Storage Manager for Space Management

Legal Information

© Copyright IBM Corporation 2012 IBM Systems and Technology Group Route 100 Somers, New York 10589 Produced in the United States of America August 2012 IBM, the IBM logo, ibm.com, Tivoli, System Storage, GPFS, and System x are trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the web at “Copyright and trademark information” at ibm.com/legal/copytrade.shtml This document is current as of the initial date of publication and may be changed by IBM at any time. Not all offerings are available in every country in which IBM operates. The client examples cited are presented for illustrative purposes only. Actual performance results may vary depending on specific configurations and operating conditions. It is the user’s responsibility to evaluate and verify the operation of any other products or programs with IBM products and programs. THE INFORMATION IN THIS DOCUMENT IS PROVIDED “AS IS” WITHOUT ANY WARRANTY, EXPRESS OR IMPLIED, INCLUDING WITHOUT ANY WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND ANY WARRANTY OR CONDITION OF NON-INFRINGEMENT. IBM products are warranted according to the terms and conditions of the agreements under which they are provided. Actual available storage capacity may be reported for both uncompressed and compressed data and will vary and may be less than stated.