SC'2001 SCinet High-Performance Bandwidth Challenge

 

Overview

The SC conferences on high-performance networking and computing have long been a place where high-performance computers and high-speed networks meet. For SC 2001 in Denver, the SCinet team will create a particularly exciting network infrastructure (a backbone using 10 Gigabit Ethernet and an OC-192 SONET WAN) that includes multiple gigabit links to the exhibit floor and connections to most high-speed national research networks. In addition, the SC Global infrastructure will link the Colorado Convention Center with SC Constellation Sites across the U.S. and in Australia, Brazil, and Germany, using Access Grid (AG) technology (http://www.accessgrid.org/) to support multinational and multistate participation in the Conference. SCinet is continuing the tradition of challenging the research community to show how this unique network can be used to demonstrate exciting applications at the maximum possible speed. Last year, at SC 2000, two applications broke the gigabit-per-second limit, with one sustaining almost 100% of the available bandwidth.

 

To this end, SCinet and SC 2001 have solicited proposals for innovative and bandwidth-intensive application demonstrations for the “High Performance Bandwidth Challenge.” The challenge is to propose ideas for meaningful applications that fully use the SCinet network infrastructure and capacity and deliver innovative application value over an OC-48 or higher interconnect. In turn, SCinet will facilitate high-speed access to the networks and will provide space and equipment for your demonstrations if you are not participating in an exhibit booth. SC 2001 will also work to ensure that contestant efforts get the recognition they deserve, via publicity prior to and during the conference, publications, and potentially awards. Qwest Communications is again contributing monetary prize(s) for the application(s) that the judges believe make the most effective and/or courageous use of SCinet resources. The primary measure of performance will be the verifiable throughput measured from the contestant’s booth through the SCinet switches and routers to external connections.

 

Challengers

 

Challenge: Visapult: WAN-Deployed Distributed and Parallel Remote Visualization
Institutions: LBNL, NERSC, NCSA, Max Planck Inst., Albert Einstein Inst.
Wide Area Networks: ESnet, Abilene, Qwest, Level3
Demonstration Location and Time: LBNL Booth; TBD

Challenge: TICKET: Traffic Information-Collecting Kernel with Exact Timing
Institutions: LANL
Demonstration Location and Time: LANL and NPACI Booth

Challenge: Terra Wide Data Mining Testbed (TWDM)
Institutions: U of Illinois, Chicago; Project Data Space
Wide Area Networks: SURFnet, SARA, CANARIE III, I-WIRE, StarLight, Abilene
Demonstration Location and Time: R443; TBD

Challenge: Bandwidth Greedy Grid-enabled Object Collection Analysis for Particle Physics
Institutions: Caltech, CACR
Wide Area Networks: Abilene
Demonstration Location and Time: CACR Booth; TBD

Challenge: Dynamic Right-Sizing: An Automatic, Transparent, Scalable, and Lightweight Technique for Enhancing Network Performance in Computational Grids
Institutions: LANL
Wide Area Networks: ESnet, Internet2
Demonstration Location and Time: LANL Booth; TBD

Challenge: IPv6 enabled Telescience Video and Data Services
Institutions: SDSC, UCSD, NPACI, Osaka U
Wide Area Networks: ESnet, Internet2 (?)
Demonstration Location and Time: NPACI Booth R206; TBD

Challenge: Dancing Beyond Boundaries
Institutions: U of Florida, State U of Campinas (Brazil), U of Minnesota, Catalyst Media
Wide Area Networks: Abilene
Demonstration Location and Time: University of Florida Booth R1070; TBD

Challenge: Network Harp
Institutions: Stanford University
Wide Area Networks: Abilene
Demonstration Location and Time: SCinet Network Operations Center; TBD

Challenge: Bandwidth to the World
Institutions: SLAC, Stanford
Wide Area Networks: Internet2, ESnet, JANET, GARR, Renater
Demonstration Location and Time: SLAC/FNAL Booth; TBD

 

 

Judging

The challengers will run their applications on Tuesday night after the exhibit floor closes and on Wednesday morning before the exhibit floor opens.  During this time, SCinet network monitoring staff will measure performance.  Judges will visit each booth on Wednesday for a demonstration of the application and to discuss the challenges with participants.

 

Judges

Dr. Walter Polansky, DOE

Dr. Wesley Kaplow, Qwest

Dr. Robert Borchers, NSF

William Wing, SCinet Chair

Dr. Al Kellie, NCAR

TBD

 

Awards

Squandered Most Effectively
Visapult: WAN-Deployed Distributed and Parallel Remote Visualization

Best Network-Enabled Application
IPv6 enabled Telescience Video and Data Services

Most Courageous and Creative
Dancing Beyond Boundaries


 

Primary contact

John Shalf - Lawrence Berkeley National Laboratory: jshalf@lbl.gov

 

Contact information for collaborators

Wes Bethel - Lawrence Berkeley National Laboratory: ewbethel@lbl.gov

Michael Bennett - Lawrence Berkeley National Laboratory: mjbennett@lbl.gov

John Christaman - Lawrence Berkeley National Laboratory: jrchristman@lbl.gov

Eli Dart – National Energy Research Scientific Computing Center (NERSC): eddart@lbl.gov

Brent Draney - National Energy Research Scientific Computing Center (NERSC): brdraney@lbl.gov

Gabrielle Allen - Albert Einstein Institut/Max Planck Institute for Gravitational Physics, Potsdam, Germany: allen@aei-potsdam.mpg.de

Jim Ferguson - NCSA/NLANR:  ferguson@ncsa.uiuc.edu   

Tony Rimovsky - NCSA: tony@ncsa.uiuc.edu

 

Project description

 

The study of complex astrophysical phenomena involving strong dynamical gravitational fields, such as the mergers of neutron stars and collisions of black holes, requires the integration of computational tools from many disciplines, such as numerical relativity, GRHydro, radiation transport, MHD, and nuclear astrophysics.  The Cactus Computational Toolkit is a modular parallel code framework that integrates the computational tools necessary to carry out this research.  Simulations of these phenomena will be especially important as gravitational wave detectors such as LIGO and Virgo come online in the near future.  Simulating the gravitational waveforms produced by the very phenomena these detectors are designed to detect can provide insight into the signal processing and detection techniques required to maximize the efficiency of these very expensive experimental apparatus.  The results of these experiments will validate, or completely debunk, Einstein's General Theory of Relativity, which has withstood more than 80 years of examination.

 

The burden these simulations place on computational resources is extreme. Such calculations are capable of saturating the largest computational resources in the world even for comparatively simple cases of the phenomena.  Maximizing the effectiveness of these computational resources requires runtime computational monitoring of the progress of the simulation, often coupled with computational steering.  Such monitoring requires high-performance networking and efficient parallel visualization techniques.

 

The LBNL-developed Visapult application is capable of volume rendering multi-gigabyte datasets at interactive rates.  Visapult makes efficient use of high-performance wide area networks, as evidenced by its win in last year's Bandwidth Challenge.  For this year's Bandwidth Challenge experiment we will couple a live feed of simulation data from the Cactus code with the Visapult parallel volume rendering application.  The combined application will highlight cutting-edge physics, high-performance networking, terascale supercomputing resources, and high-performance visualization applications capable of interactive manipulation of the massive data feeds produced by these resources.

 

Summary of hardware, software, and networking

 

The overall application will involve the following components:

·        The Cactus simulation code, which computes the collision of two black holes in 3D.

·        The Cactus code will send visualization data to the Visapult back-end application, which runs on parallel computational resources on the SC2001 show floor.  The Visapult back-end performs partial rendering of huge datasets.

·        The Visapult back-end then forwards its partially rendered results to the Visapult front-end GUI, which does the final compositing of the geometry data produced by the back-end.  The resulting images offer the illusion of real-time interactive volume rendering of massive datasets (such as those produced by the running Cactus code).  A minimal sketch of this three-stage data flow is given after this list.
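The stages above map onto a simple produce/render/composite loop. The following is a minimal, hypothetical sketch of that data flow in Python (single process, in-memory arrays, and a simple sum projection standing in for real volume rendering); the actual Cactus and Visapult codes are parallel applications that stream this data over the network:

    import numpy as np

    def cactus_step(shape=(64, 64, 64)):
        """Stand-in for one Cactus time step: produce a 3D scalar field
        (e.g., a gravitational-wave quantity) to be visualized."""
        return np.random.rand(*shape).astype(np.float32)

    def backend_partial_render(volume, n_slabs=4):
        """Visapult back-end stage: split the volume into slabs and reduce
        each slab to a partially rendered 2D image (a sum projection stands
        in for the real volume rendering)."""
        slabs = np.array_split(volume, n_slabs, axis=0)
        return [slab.sum(axis=0) for slab in slabs]   # one image per slab

    def frontend_composite(partial_images):
        """Visapult front-end stage: combine the per-slab images into the
        final picture shown in the interactive GUI."""
        return np.sum(partial_images, axis=0)

    # One pass through the pipeline: simulate -> partial render -> composite.
    volume = cactus_step()
    partials = backend_partial_render(volume)
    final_image = frontend_composite(partials)
    print(final_image.shape)   # (64, 64): a 2D image from the 3D volume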

 

The Cactus code will run on the NERSC-3 SP2 system.  It will consume a minimum of 4 computational nodes (16 CPUs per node) and 16 gigabytes of memory.  The Cactus code will send data to the Visapult back-end application using all available Gigabit Ethernet connections on NERSC-3.  The NERSC-3 network traffic will be carried via ESnet and Level3 to the SCinet NOC on the SC2001 show floor.  In addition, a sub-task of the simulation will run on the Origin 2000 array at NCSA, and its associated network traffic will be carried by Abilene to the show floor.
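As a rough illustration of how a sender can use several Gigabit Ethernet links at once, the sketch below stripes a data volume across one TCP stream per link and sends the stripes concurrently. The receiver addresses, port, and chunking scheme are hypothetical and are not the actual Cactus or Visapult network code:

    import socket
    import threading
    import numpy as np

    # Hypothetical: one receiver endpoint per Gigabit Ethernet path to the
    # Visapult back-end cluster (addresses and port are illustrative only).
    RECEIVERS = [("10.0.1.1", 9000), ("10.0.2.1", 9000),
                 ("10.0.3.1", 9000), ("10.0.4.1", 9000)]

    def send_stripe(addr, payload):
        """Push one stripe of the volume down a single TCP stream."""
        with socket.create_connection(addr) as s:
            s.sendall(payload)

    def stripe_and_send(volume):
        """Split the volume into one stripe per network path and send the
        stripes concurrently so that all links carry traffic at once."""
        stripes = np.array_split(volume.ravel(), len(RECEIVERS))
        threads = [threading.Thread(target=send_stripe,
                                    args=(addr, stripe.tobytes()))
                   for addr, stripe in zip(RECEIVERS, stripes)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()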

 

The network streams from the Cactus code will be carried from the SCinet NOC to the LBNL booth via a 10 Gigabit Ethernet connection and distributed to a cluster of workstations running the Visapult back-end.  The Visapult back-end will perform the partial computations of the volume rendering of the live data arriving from the Cactus code.  The Visapult back-end then forwards its partially rendered results to a front-end GUI for final compositing and presentation.  The result is an application that performs real-time interactive parallel volume rendering of simulation data arriving from a running application.  This will greatly improve the computational monitoring and real-time analysis capabilities for physicists using the Cactus code for scientific discovery.
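The final compositing step on the front end can be illustrated with the standard front-to-back "over" operator on premultiplied-alpha images; this is a generic sketch of that technique, not the exact Visapult compositing code:

    import numpy as np

    def composite_over(front, back):
        """Composite two premultiplied-alpha RGBA images with the 'over'
        operator: the back slab shows through where the front slab is
        transparent."""
        alpha = front[..., 3:4]
        return front + (1.0 - alpha) * back

    def composite_slabs(slabs_front_to_back):
        """Front-end step: fold the partially rendered slab images, ordered
        front to back along the view direction, into one final image."""
        image = slabs_front_to_back[0]
        for slab in slabs_front_to_back[1:]:
            image = composite_over(image, slab)
        return image

    # Example: four partially rendered 512x512 RGBA slabs from the back-end.
    slabs = [np.random.rand(512, 512, 4).astype(np.float32) for _ in range(4)]
    final = composite_slabs(slabs)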

 

 

URL address for further project documentation.

 

A description of the Cactus code is available at http://www.cactuscode.org/

A description of Visapult is available at  http://vis.lbl.gov/Projects/Visapult.html

An overall description of the project will be available soon at  http://infinite-entropy.lbl.gov/SC2001/

 

What Wide Area Network(s)

 

We will use Level3 to carry the bulk of the network traffic from NERSC, with some traffic also carried by ESnet.  We will use Abilene to carry network traffic from NCSA.  At NCSA, we are coordinating with John Towns, Rob Pennington, Tony Rimovsky, and Jim Ferguson to line up both computational and network resources for the challenge.

 

Off-show resources

 

We will use the LBNL NERSC-3 system (a 5 teraflop IBM SP2 supercomputer) to carry out the primary Cactus calculation.  This will connect to the SCinet NOC on the show floor using an ATM OC-48 connection carried by Level3.  NERSC will also use any available bandwidth from its ESnet OC-12 connections.  In addition to the primary Cactus simulation, a sub-task of the overall simulation program will be carried out on the NCSA Origin 2000 supercomputers and connected to the SCinet NOC through OC-12 via Abilene in Chicago.
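For a rough sense of the wide-area capacity these links provide (using nominal SONET line rates and assuming one OC-12 each via ESnet and Abilene; achievable application throughput will be lower):

    # Nominal line rates of the wide-area links feeding the demonstration (Gbps).
    OC48 = 2.488    # Level3 OC-48 from NERSC to the show floor
    OC12 = 0.622    # each OC-12 (ESnet from NERSC, Abilene from NCSA)

    aggregate_gbps = OC48 + 2 * OC12
    print(f"Aggregate nominal WAN capacity: {aggregate_gbps:.2f} Gbps")
    # Roughly 3.7 Gbps of nominal wide-area capacity toward the show floor.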

 

On-show floor resources and booth location

 

We will have a dark-fiber drop between the SCinet NOC and the LBNL booth that will connect the 10 Gigabit Ethernet switch in the LBNL booth to an identical switch from the same vendor located in the NOC.  On the NOC end, the 10 GbE switch will connect to the SCinet backbone using 8 GigE connections.
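On the show-floor side, the nominal ceiling is set by the 8 GigE uplinks from the NOC switch to the SCinet backbone rather than by the 10 GbE booth drop; a quick check of the numbers:

    # Booth-side capacity check (nominal rates, Gbps).
    booth_drop_gbps = 10.0            # 10 Gigabit Ethernet dark-fiber drop to the booth
    backbone_uplinks_gbps = 8 * 1.0   # 8 GigE connections from the NOC switch to the backbone

    print(min(booth_drop_gbps, backbone_uplinks_gbps), "Gbps effective ceiling")
    # The 8 GigE uplinks cap the booth's path to the backbone at about 8 Gbps.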

 

Specialized on-show floor equipment

·        10 GbE switches in both the SCinet NOC and the LBNL booth.  On the SCinet end, the switch will be connected to the NOC backbone using 8 GigE interfaces.

·        A cluster of 8 dual-processor systems with GigE NICs on each node.

·        An SGI Onyx with GigE for display.