Published on 26-May-2011
Validated on 06 Nov 2012
"The IBM and Mellanox solution may give scientists another way to understand properties that they aren’t able to model in a physical way." - Pete Beckman, Director of the Exascale Technology and Computing Institute at the Argonne National Laboratory
Customer:
Argonne National Laboratory
Industry:
Government
Deployment country:
United States
Solution:
Technical Computing, Deep Computing, Optimizing IT, Virtualization, Virtualization - Server
IBM Business Partner:
Mellanox
Overview
The Argonne Leadership Computing Facility (ALCF) at Argonne National Laboratory is operated by the Leadership Computing Facility Division as part of the U.S. Department of Energy’s effort to provide leadership-class computing resources to the scientific community. The Argonne National Laboratory’s mission is to accelerate major scientific discoveries and engineering breakthroughs for humanity, by designing and providing world-leading computing facilities in partnership with the computational science community.
Business need:
Scientists require significant compute power to run complex simulations, and want the ability to scale up their computational experiments almost limitlessly; the cloud computing model potentially holds great promise.
Solution:
Argonne created Magellan, based on 504 IBM® System x® iDataPlex® dx360 M3 compute nodes with Mellanox ConnectX InfiniBand networking, running three software stacks to test the viability of cloud computing for science.
Benefits:
Argonne can rapidly assign exactly the required computing resources to each project, scale them up as required during the project, then return the resources to Magellan on completion.
Case Study
The Argonne National Laboratory was keen to examine what advantages the cloud computing model could offer to the scientific research community. In a joint research effort with the National Energy Research Scientific Computing Center (NERSC), it created the Magellan cloud testbed project. The aim was to explore a cloud solution that might allow existing projects to take advantage of more powerful and flexible compute resources, and potentially reduce the pressure on its other high-performance computing (HPC) resources.
Pete Beckman, Director of the Exascale Technology and Computing Institute at the Argonne National Laboratory, says, “The Magellan project is an experiment to find out what kind of science can be done in a cloud, and what kinds of changes we might have to make either to the technical platform or to the software. Our goal is to understand what kinds of scientific applications can run well in the cloud, and look at running them in the cloud on a grand scale.”
Robust performance
Magellan is based on the IBM System x iDataPlex dx360 M3 server, an innovative, half-depth server optimized for maximum density and incredible efficiency. The Argonne National Laboratory deployed 504 iDataPlex compute nodes, each with two quad-core Intel Xeon X5550 processors running at 2.66 GHz, providing a total of 4,032 cores. Each node has 24 GB of DDR3 1333 MHz memory. The iDataPlex solution at the Argonne National Laboratory also offers 200 expanded compute nodes, using the same processors but featuring 48 GB of DDR3 1066 MHz memory, a 200 GB Solid State Drive and a 1 TB local SATA disk per node. There are also 12 login and network service nodes in Magellan.
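The aggregate figures implied by these node counts can be checked with a few lines of arithmetic. The following is a minimal Python sketch that uses only the numbers quoted above; the variable and function names are illustrative, and it assumes that the expanded nodes also carry two quad-core processors each, as "using the same processors" suggests.

```python
# Figures quoted in the case study text.
STANDARD_NODES = 504      # standard compute nodes
EXPANDED_NODES = 200      # expanded compute nodes
SOCKETS_PER_NODE = 2      # two processors per node
CORES_PER_SOCKET = 4      # quad-core Intel Xeon X5550
STANDARD_MEM_GB = 24      # memory per standard node
EXPANDED_MEM_GB = 48      # memory per expanded node


def total_cores(nodes, sockets=SOCKETS_PER_NODE, cores=CORES_PER_SOCKET):
    """Return the number of cores across a group of identical nodes."""
    return nodes * sockets * cores


standard_cores = total_cores(STANDARD_NODES)
# Assumption: expanded nodes have the same two-socket layout.
expanded_cores = total_cores(EXPANDED_NODES)

standard_mem_gb = STANDARD_NODES * STANDARD_MEM_GB
expanded_mem_gb = EXPANDED_NODES * EXPANDED_MEM_GB

# Memory available per core on each node type.
cores_per_node = SOCKETS_PER_NODE * CORES_PER_SOCKET
standard_gb_per_core = STANDARD_MEM_GB / cores_per_node
expanded_gb_per_core = EXPANDED_MEM_GB / cores_per_node

print(f"Standard nodes: {standard_cores} cores, {standard_mem_gb} GB memory")
print(f"Expanded nodes: {expanded_cores} cores, {expanded_mem_gb} GB memory")
```

The standard-node result matches the 4,032 cores stated in the text. On these assumptions the expanded nodes offer twice the memory per core, at 6 GB against 3 GB.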
The Argonne National Laboratory made use of the IBM Intelligent Cluster offering, including the physical installation of the cluster and system cabling by the IBM Cluster Enablement Team. All nodes, switches and other components were installed and integrated into the racks during manufacture, and the solution was shipped pre-racked for added speed of deployment. The IBM Cluster Enablement Team also handled software installation and system configuration on-site at Argonne.
Susan Coghlan, Deputy Division Director at the Argonne Leadership Computing Facility, says, “NERSC and Argonne have the IBM iDataPlex hardware for their core compute cloud testbed. It’s a very robust platform and it’s exactly what Argonne has on its production HPC cluster. We’re focused on the system software and the science that may be able to run on a cloud, as opposed to the hardware. We want the hardware to just stay up and run without us having to think about it.”
Partnering for speed
IBM delivered a complete turnkey solution, including high-performance Mellanox ConnectX InfiniBand adapters to provide high-bandwidth, low-latency interconnects between the iDataPlex compute nodes. IBM also set up the management nodes for the new solution and the entire surrounding network.
Beckman adds, “For science applications, having a really fast processor, a fast interconnect and these very reliable servers allows the scientists to think about their science and not about what’s happening on the hardware. Scientists actually don’t want to know what it’s running on; they want to get their results published and make new discoveries. Essentially, the user community wants a machine that’s up, that’s generating the right answer, and that’s fast. That’s what we have now with our current solution.”
The Argonne National Laboratory is participating in the development of a new 100 Gb/s network that will link several major research centers across the US. This will enable scientists to move petabytes of data quickly between compute sites. “One of the things that the IBM and Mellanox 40 Gb/s solution gives us is that it allows us to rapidly fill the pipes into the server and empty the pipes back out again, allowing us to take advantage of the 100 Gb/s network,” says Coghlan.
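To give a sense of scale for these link speeds, the following Python sketch estimates an idealized transfer time. It is a back-of-the-envelope calculation, not a measurement from Magellan: it assumes decimal units (1 PB = 10^15 bytes), a link running at full line rate, and no protocol or storage overhead. The function name is illustrative.

```python
import math


def transfer_time_seconds(data_bytes, link_gbps):
    """Ideal time to move data_bytes over a link of link_gbps gigabits/s.

    Assumes the full line rate is sustained with no protocol overhead.
    """
    return data_bytes * 8 / (link_gbps * 1e9)


PETABYTE = 10**15   # decimal petabyte, in bytes (assumption)
WAN_GBPS = 100      # wide-area network speed quoted in the text
NODE_GBPS = 40      # per-node InfiniBand speed quoted in the text

wan_seconds = transfer_time_seconds(PETABYTE, WAN_GBPS)
wan_hours = wan_seconds / 3600

# Minimum number of 40 Gb/s node links needed to keep a 100 Gb/s pipe full.
links_needed = math.ceil(WAN_GBPS / NODE_GBPS)

print(f"1 PB at {WAN_GBPS} Gb/s: {wan_seconds:.0f} s (about {wan_hours:.1f} hours)")
print(f"Nodes needed to saturate the link: {links_needed}")
```

Under these idealized assumptions, one petabyte takes roughly 22 hours at 100 Gb/s, and at least three nodes sending at 40 Gb/s would be needed to keep the wide-area link full. Real transfers would be slower once protocol and storage overheads are included.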
Pioneering solution
Magellan is testing how cloud technologies could make a direct contribution to the scientific community, by potentially enabling unprecedented flexibility, efficiency and speed of provisioning for high-performance computing resources.
Beckman says, “To put our cloud testbed together we decided on a couple of specific building blocks. One was a very fast interconnect that came from Mellanox, and another is the IBM iDataPlex server. Our solution from IBM, Mellanox and Intel has proven to be very stable and reliable. And from the scientists’ perspective, getting their science done and being able to run their applications on the machine is what they want.”
Tackling hard problems
The Department of Energy’s mission is directed toward energy usage, the environment and national security. Research efforts on Magellan include modeling coastal water flows, simulating hurricanes and tornadoes, designing new materials, modeling dark matter, and finding more efficient designs for the internal-combustion engine.
“The IBM and Mellanox solution gives scientists a way to understand properties that they aren’t able to model in any other way, or at least in any other physical way,” says Beckman. “Cloud computing is potentially a great technology for scientific research. People want to take science applications and discovery applications from their laptop and enable them to scale as large as they want. This involves being able to make virtual machines and special kinds of storage systems; it’s these breakthrough technologies that we’re exploring here at Argonne, with help from IBM, Mellanox and Intel.”
The cloud model is potentially well suited to the study of genomics and meta-genomics in the field of biology. Unlike traditional supercomputing applications, which tend to use large parallel applications running on a parallel framework, the applications used by biologists are much more pipeline-driven. Says Beckman, “You have hundreds of thousands of pipelines that have functions that apply to the gene sequences. And that's the kind of processing that a cloud can handle well.”
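The pipeline-driven pattern Beckman describes can be sketched in a few lines of Python. This is an illustration only, not software used on Magellan: the three stages (clean, reverse_complement, gc_content) are hypothetical stand-ins for real genomics tools, and a thread pool stands in for the separate virtual machines a cloud would provide. The point is that each sequence passes through its pipeline independently, so the work divides across as many workers as are available.

```python
from concurrent.futures import ThreadPoolExecutor


def clean(seq):
    """Normalize a raw sequence string (hypothetical first stage)."""
    return seq.strip().upper()


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    complement = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(complement[base] for base in reversed(seq))


def gc_content(seq):
    """Return the fraction of bases that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


# The pipeline is an ordered list of functions applied to one sequence.
PIPELINE = (clean, reverse_complement, gc_content)


def run_pipeline(seq):
    """Pass a single sequence through every stage in order."""
    result = seq
    for stage in PIPELINE:
        result = stage(result)
    return result


def run_all(sequences, workers=4):
    """Run independent pipelines concurrently; results keep input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_pipeline, sequences))


results = run_all(["acgt", "GGCC ", "aatt"])
print(results)
```

Because no pipeline depends on another, adding workers requires no change to the stages themselves, which is the property that makes this kind of workload a natural fit for elastic cloud resources.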
Accelerating computation
The Intel Xeon X5550 processors used in the IBM iDataPlex employ Intel’s Turbo Boost technology, which enables some cores to be temporarily overclocked to overcome bottlenecks in processing. When demand falls, the cores can be underclocked to reduce their energy consumption and heat output.
“From a user perspective,” says Beckman, “what he wants is something that runs blindingly fast and he never knows that it’s there. The IBM architecture allows data to be transferred back and forth from the CPU to memory, so computation can continue at the same time that the system is moving data to and from the nodes.”
Products and services used
IBM products and services that were used in this case study.
Hardware:
Intelligent Cluster, System x: iDataPlex dx360 M3
Legal Information
© Copyright IBM Corporation 2011
IBM Systems and Technology Group
Route 100
Somers, New York 10589
U.S.A.
Produced in the United States of America
May 2011
All Rights Reserved
IBM, the IBM logo, ibm.com, iDataPlex and System x are trademarks of International Business Machines Corporation in the United States, other countries or both. If these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the web at “Copyright and trademark information” at ibm.com/legal/copytrade.shtml
Intel, the Intel logo, and Xeon are trademarks of Intel Corporation in the U.S. and/or other countries.
Other company, product and service names may be trademarks or service marks of others.
IBM and Mellanox are separate companies and each is responsible for its own products. Neither IBM nor Mellanox makes any warranties, express or implied, concerning the other’s products.
References in this publication to IBM products or services do not imply that IBM intends to make them available in all countries in which IBM operates. Offerings are subject to change, extension or withdrawal without notice.
The information in this document is provided “as-is” without any warranty, either expressed or implied.