IBM General Parallel File System provides exceptional reliability and performance for file system users

Published on 05 Feb 2009

"GSA File has a service level agreement, and GPFS far exceeds the stringent requirements of that agreement, regularly delivering 100% availability across all 35 sites" - — Stanley Wood, Enterprise Architect, CIO Office, IBM

Customer:
IBM Corporation

Industry:
Computer Services

Deployment country:
United States

Solution:
CIO, Security, Service Management

Overview

To manage the rapidly increasing storage requirements of more than 450,000 users and deliver fast and dependable access to more than 4,500 applications, IBM provided a robust high-performance file system for shared data.

Business need:
Manage the rapidly increasing storage requirements of more than 450,000 users and deliver fast and dependable access for more than 4,500 applications

Solution:
An internal service, based on a combination of IBM products and open source technologies, that provides a robust high-performance file system for shared data

Benefits:
Improves performance and regularly delivers 100 percent availability by transparently moving workloads among servers; protects against unplanned outages; substantially reduces administrative costs by simplifying management and capacity planning of file systems

Case Study

Prior to the introduction of the Global Storage Architecture (GSA) File environment in 2001, the IBM internal file system environment was fragmented into many small islands. These islands often used different technologies and were managed by independent teams, each of which created its own local use policies. Cross-site projects were sometimes hindered by this inconsistent approach to technology and management. Also, many of the technologies employed were not designed for enterprise-scale computing, and teams struggled to expand the environments to meet the growing needs of IBM.

IBM identified a number of specific issues in this environment. First, the cost of managing multiple small islands of unstructured data continued to escalate as volume increased. In addition, the security of data on local servers was increasingly at risk, and strategies for data recovery were often not documented or were difficult to implement. There was no flexibility to move files quickly and transparently to different servers in the event of planned maintenance or other outage. Finally, traditional network file systems such as CIFS (Common Internet File System) or NFS (Network File System) often lacked the scalability, reliability and resilience required by enterprise applications that needed shared access to the same file from multiple application instances.

Minimizing risk with a flexible file system

The GSA File environment was built around the IBM General Parallel File System (GPFS™). Originally designed as a file system to manage resource-intensive multimedia services, GPFS is the core file system in IBM’s internal storage architecture.

Because GPFS is a clustered file system, all the servers within a GPFS cluster have direct access to all the storage in that cluster. The combination of GPFS and IBM WebSphere® Edge Server software makes the system extremely flexible and resilient. Workload can be moved from one server to another transparently for planned or unplanned maintenance. The performance of each cluster is balanced automatically and continuously. IBM Tivoli® Storage Manager delivers high-performance backup to help staff implement data protection policies that align with data availability needs and service level requirements. Through this IBM Service Management solution, staff can cost-effectively manage backup and recovery across disk and tape from a single point of control.

GPFS supports the replication of metadata and data across multiple storage devices. The GSA environment uses this capability to provide greater operational resilience. GPFS survives the loss of individual disks, disk arrays, storage servers and fabric components without an outage. Moreover, data in GPFS can be transparently migrated from one storage device to another to allow for planned maintenance.

Using GPFS, storage devices and servers can be managed in large pools. Managing pools of servers rather than individual servers dramatically simplifies administration and capacity planning of file systems. Within a GPFS cluster, capacity can be scaled up or down simply by adding and removing servers or disks. There is no need to manually rebalance data as the cluster shrinks or grows. Using large pools of capacity also allows the system to be managed with higher utilization than traditional file systems. Higher utilization allows the same amount of data to be stored on less hardware, leading to lower power usage and reduced equipment costs.

GPFS provides a POSIX-compliant file system with coherent file locking across all nodes in a cluster. As a result, most applications require no changes or recompilation.
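To illustrate what POSIX compliance means for applications, the following is a minimal sketch of ordinary POSIX advisory locking in Python. It is not taken from the GSA File environment: the function name append_record and the demo file path are hypothetical, and the sketch assumes a Unix-like client. The point is only that code written against the standard locking interface needs nothing specific to the underlying file system.

```python
import fcntl
import os
import tempfile


def append_record(path: str, record: str) -> None:
    """Append one line to a shared file under an exclusive POSIX advisory lock.

    Hypothetical helper for illustration; it uses only standard POSIX calls.
    """
    with open(path, "a") as f:
        # Blocks until no other process holds a conflicting lock on the file.
        fcntl.lockf(f, fcntl.LOCK_EX)
        try:
            f.write(record + "\n")
            f.flush()
            os.fsync(f.fileno())  # push the data to storage before unlocking
        finally:
            fcntl.lockf(f, fcntl.LOCK_UN)


if __name__ == "__main__":
    # A temporary local file stands in for a path on a shared mount (assumption).
    demo_path = os.path.join(tempfile.mkdtemp(), "shared.log")
    append_record(demo_path, "instance-1 started")
    append_record(demo_path, "instance-2 started")
    with open(demo_path) as f:
        print(f.read(), end="")
```

On a file system with coherent cluster-wide locking, the same call would be expected to serialize writers running on different nodes; the sketch exercises it only on a single machine.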

GSA File is deployed in 35 cells, located around the world. Each cell is built around a single GPFS cluster, and every member of the cluster provides the same services. The WebSphere Edge Server software makes all the servers in the cluster appear as a single server, and GPFS clients are connected transparently by WebSphere software to any server in the cluster.

Because file system service is by nature a large consumer of bandwidth and sensitive to network latency, it is delivered locally at each of the cells. While the service is delivered locally, GPFS is managed centrally. Three control centers (one in Poughkeepsie, NY, and one each in Europe and Japan) remotely manage the various cells.

Offering massive throughput and high reliability for even the most resource-hungry applications, GPFS was the obvious choice at IBM. With a ten-plus year history of development and dependability, GPFS offers clustered file systems and shared device file systems, making it a robust solution that also exploits SAN-attached storage devices for additional resilience.

Meeting the needs of a large user population

Prior to the GPFS solution, file management at IBM was dispersed and non-standardized. Data was tied to a particular server. If that server was lost or down for any reason, users could not access their data. With GPFS, the data is no longer tied to an individual server, eliminating the single point of failure. If the workload increases, it can easily be distributed through the clustered architecture.

GPFS is the only solution that effectively handles the intense file management needs of the large user population at IBM. GPFS meets or exceeds the storage and application access requirements of IBM, provisioning services automatically depending on the storage requirements of the user. The environment supports a diverse user base including hardware and software development, application hosting and end-user file sharing.

Since it first became available in 2001, the GSA File environment has grown to include 35 cells on five continents. There are more than 115,000 users consuming more than 250 terabytes of data, and the GPFS environment in GSA File includes more than 1 billion files and directories.

Providing high availability at a lower cost

While acceptance of GPFS at IBM has continued to grow at an exponential rate over five years, help desk calls have remained remarkably stable. The number of GPFS user IDs has increased to well over 100,000, but the help desk consistently averages fewer than 200 calls per month. This low rate of service calls also highlights the stable and highly available GPFS environment.

Throughout this period of growth GPFS has provided tremendous availability with its clustered architecture. GSA File has a service level agreement, and GPFS far exceeds the stringent requirements of that agreement, regularly delivering 100% availability across all 35 sites.

The overwhelming acceptance of GPFS by IBM employees, demonstrated by more than seven years of continued strong growth, has yielded significant cost savings and increased efficiency and performance. It also offers a substantial test case to demonstrate the many benefits of this solution. Significant cost reductions have been achieved, particularly over the past three years, as a direct result of self-service tooling and centralized management. With self-service configuration and updating, users can easily customize storage for their own requirements. This tooling is made possible by the clustered GPFS environment, and the large pools of storage that it creates.

Based on its success, IBM Services created the Scale Out File System (SOFS), a commercial offering that delivers the GPFS file system solution on customer premises as a highly scalable, global, clustered network attached storage (NAS) solution.

For more information

Contact your IBM sales representative or IBM Business Partner. Visit us at: ibm.com

Additionally, IBM Global Financing can tailor financing solutions to your specific IT needs. For more information on great rates, flexible payment plans and loans, and asset buyback and disposal, visit: ibm.com/financing

Components

IBM products and services that were used in this case study.

Software:
Tivoli Storage Manager, WebSphere Edge Server, General Parallel File System

Service:
GTS ITS Storage & Data: Storage Opt. & Integration

Legal Information

© Copyright IBM Corporation 2009

IBM Corporation
Software Group
Route 100
Somers, NY 10589
U.S.A.

Produced in the United States of America
January 2009
All Rights Reserved

IBM, the IBM logo, ibm.com, GPFS, Tivoli and WebSphere are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. If these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at “Copyright and trademark information” at ibm.com/legal/copytrade.shtml

This case study is an example of how one customer uses IBM products. There is no guarantee of comparable results. References in this publication to IBM products and services do not imply that IBM intends to make them available in all countries in which IBM operates.

TIC14061-USEN-00