SC2000 plenary and invited speakers include the world's leaders in HPNC research and applications. The Keynote Address promises to be unique and memorable. The four State-of-the-Field talks will continue the tradition of critical assessments about recent progress and future challenges in key areas of computing sciences. This year the focus will be on Cluster Systems, Computer Security, Numerical Methods, and Parallel Programming.

SC2000 is also beginning a new tradition, called Masterworks. The Masterworks track features Invited Speakers who will emphasize novel and innovative uses of HPNC in solving challenging problems in the real world. This track will highlight research-quality results that serve practical priorities. This year, the Masterworks track will include presentations on Computational Biology, Scalable Computing/Servers, Computer-Aided Engineering, Large Computing Platforms, and work by the winners of two prestigious IEEE computing awards. We hope that you find this track to be stimulating, and that it expands your understanding of just how broadly HPNC technology can benefit the larger scientific community and the general public.

It is expected that all of the plenary sessions will be webcast over the Internet, using the unique capabilities of SCinet 2000 to deliver high-quality audio and video broadcasts. Webcast events are indicated by the webcast symbol in the Final Program and conference signage.



Chair: Bill Blake, Compaq
Time: Tuesday, 10:30 AM - Noon
Room: D 263/265

From First Assembly Towards a New Cyber-Pharmaceutical Computing Paradigm
Sorin Istrail, Celera Genomics

The new science of whole genomics will fundamentally change how pharmaceutical companies pursue the vital challenge of developing new and better drugs. Target discovery, lead compound identification, pharmacology, toxicology and clinical trials are likely to merge with the science of bioinformatics into a powerful system for developing new pharmaceutical agents. It will be possible to simulate the action of new molecules or therapeutic programs against diverse metabolic pathways prior to preclinical testing. Thus, a paradigm of cyber-pharmaceutical testing will be available to the industry, speeding the selection of promising new agents, eliminating products that are likely to exhibit toxicity, and reducing the formidable costs and risks associated with the current paradigm of drug development.

We will report on Celera's design of a whole genome shotgun assembler and its application to the sequencing of the Drosophila and Human genomes. We will also present some of the major emerging computational challenges of the above paradigm in the exciting new areas of proteomics, structural genomics, expression profiling, SNPs and pharmacogenomics.

Biographical Sketch: Sorin Istrail's work has focused on several areas of computer science and its applications to biology, physics, and chemistry. He has worked on linguistics, automata theory, parallel algorithms and architectures, semantics of parallel programming languages, programming logic, complexity theory and derandomization, graph theory, voting theory, genomic mapping, protein folding, biomolecular sequence alignment, biomaterials, combinatorial chemistry, and proteomics. Recently, he resolved a longstanding open problem in statistical mechanics, the Three-Dimensional Ising Model Problem, showing the impossibility of deriving explicit thermodynamic formulas for every three-dimensional model. He is Executive Editor of the Journal of Computational Biology, Founder and General Vice-Chair of the RECOMB Conference Series, and Editor of the MIT Press Computational Molecular Biology Book Series.

Sorin Istrail has a PhD in computer science from the University of Bucharest, Romania. After immigrating to the US, he was a visiting scientist at MIT. He joined Sandia Labs in 1992, where he held several positions, most recently Principal Senior Member of the Technical Staff. From 1992, he led the Sandia Labs research in genomics and structural genomics within the DOE MICS Computational Biology Project. In April 2000, he joined Celera Genomics, where he is Senior Director of Informatics Research.

Unveiling the Human Genome
Jill P. Mesirov, Whitehead/MIT Center for Genome Research

We are living in an extraordinarily exciting time in the history of research in molecular biology. On June 26, 2000, the Human Genome Project public consortium announced an assembled working draft of the Human Genome. While work continues to complete a "finished" version of the genome with 99.99% accuracy and no gaps or ambiguities, the working draft is already enabling scientists to uncover the genome's mysteries. This work relies heavily on the use of sophisticated computational techniques.

We will discuss a number of the scientific and computational challenges that face the genomics community. These will be drawn from the identification and elucidation of function of the genes in the human genome, the discovery of gene pathways, the mining of gene expression data to aid in the diagnosis and treatment of disease, the search for disease related genes, and the study of human variation.

Biographical Sketch: Jill P. Mesirov is Director of Bioinformatics and Research Computing at the Whitehead/MIT Center for Genome Research where she is responsible for the informatics, computational biology, and research computing program of the Center. She is also Adjunct Professor of Computer Science at Boston University. Mesirov spent many years working in the area of high performance computing and developing parallel algorithms relevant to problems that arise in science, engineering, and business applications. Her current research interest is the study and development of algorithms for computational biology in such areas as pattern discovery and recognition in gene expression data, genome analysis and interpretation, sequence homology searching, protein secondary structure prediction and classification, molecular dynamics and inverse protein folding. Mesirov received her Ph.D. in mathematics from Brandeis University in 1974. Mesirov came to the Whitehead in 1997 from IBM where she was Manager of Computational Biology and Bioinformatics in the Healthcare/Pharmaceutical Solutions Organization. Mesirov is a member of the Biology and Environmental Research Advisory Committee of the Department of Energy, and a trustee of the Mathematical Sciences Research Institute in Berkeley, California. She has also served as President of the Association for Women in Mathematics, Trustee of the Society for Industrial and Applied Mathematics, and Chair of the Conference Board of the Mathematical Sciences. She is a Fellow of the American Association for the Advancement of Science.


Chair: Eugene Fluder, Merck
Time: Tuesday, 1:30- 3:00 PM
Room: D 263/265

Peta-op Computing for Large-Scale Biomolecular Simulation
Robert S. Germain, IBM Computational Biology Center

Gaining an understanding of the mechanisms underlying the protein folding process is generally acknowledged as a Grand Challenge because of its scale and its importance to biology. Advancing our understanding of these mechanisms is an initial focus area for the scientific portion of the Blue Gene project, whose systems component is a blueprint for a massively parallel computer architecture with a capacity of 1 petaflop/sec, expected to significantly advance the state of the art in the realistic modeling and simulation of this class of problems.

This talk will describe some of the challenges in the modeling and simulation of biological processes on the microscopic level, with special emphasis on the protein folding process. A selection of the scientific and algorithmic issues associated with a large-scale simulation effort aimed at improving our understanding of the mechanisms behind the protein folding process will be discussed.

Biographical Sketch: Robert S. Germain manages the Biomolecular Dynamics and Scalable Modeling Group, which is part of the Computational Biology Center at the IBM Thomas J. Watson Research Center. He received his A.B. in physics from Princeton University and his M.S. and Ph.D. in physics from Cornell University in 1986 and 1989, respectively. After receiving his doctorate, Germain joined the Watson Research Center as a Research Staff Member where he has subsequently worked on both scientific and technical problems including the development of novel algorithms to implement a large scale fingerprint identification system. His research interests include the parallel implementation of scientific algorithms and the applications of these algorithms. Germain is a member of the IEEE and the American Physical Society.

Basic Computational Research for Drug Discovery
Simon Kearsley, Merck Research Laboratories

This talk will examine incidents at Merck where HPC has made a difference to the basic research drug discovery process. In addition, the presentation will focus on the newer challenges facing drug discovery and examine how computing resources must be marshaled to address them in the near future.

Biographical Sketch: Simon Kearsley is currently the Senior Director responsible for the Molecular Modeling Department at Merck Research Laboratories. His modeling group has a long history at Merck and has become the nexus for the interaction of several scientific disciplines focused on drug discovery. Before going to Merck he spent several years at Yale University modeling radical reactions within organic crystals. He received his bachelor's, master's, and doctoral degrees from Gonville and Caius College, Cambridge.


Chair: Pat Teller, UTEP
Time: Tuesday, 3:30- 5:00 PM
Room: D 263/265

Blue Gene
Monty Denneau, IBM

On its target applications, Blue Gene will make the 10-teraflop ASCI White machine look like a Palm Pilot. We tell you how.

Bio : Monty Denneau is the system architect for the Blue Gene project.

Status of the Earth Simulator Project in Japan
Keiji Tani, Japan Atomic Energy Research Institute

The Science and Technology Agency of Japan has proposed a project to promote studies for global change prediction by an integrated, three-in-one research and development approach: earth observation, basic research, and computer simulation. As part of the project, an ultra-fast computer, the "Earth Simulator", with a sustained speed of more than 5 teraflops on an atmospheric circulation code, is being developed. The Earth Simulator is a MIMD-type, distributed-memory, parallel system in which 640 processor nodes are connected via a fast single-stage crossbar network. Each node consists of eight vector-type arithmetic processors that are tightly connected by a 16-gigabyte shared main memory. The peak performance and main memory of the total system are 40 teraflops and 10 terabytes, respectively.

As part of the development of the basic software system, an operating system service routine called the "center routine" is being developed. Since an archival system will be used as the main storage for user files, the most important functions of the center routine are the optimal scheduling of submitted batch jobs and the optimal management of the user files they require. The design and R&D for both the hardware and basic software systems were completed during the last three fiscal years, FY97, 98, and 99. The manufacture of the hardware system and the development of the center routine are underway. Facilities necessary for the Earth Simulator, including buildings, are under construction. The total system will be completed in spring 2002.
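The quoted system figures can be cross-checked with simple arithmetic; the per-processor and per-node numbers below are derived from the abstract's totals, not quoted in it:

```python
# Earth Simulator configuration as described in the abstract.
nodes = 640
procs_per_node = 8
peak_tflops = 40.0       # total peak performance
memory_tbytes = 10.0     # total main memory

total_procs = nodes * procs_per_node                  # 5120 vector processors
gflops_per_proc = peak_tflops * 1000 / total_procs    # ~8 GFLOPS peak each
gbytes_per_node = memory_tbytes * 1024 / nodes        # 16 GB shared per node

print(total_procs, gflops_per_proc, gbytes_per_node)
```

The 16 GB per node matches the shared main memory stated for each node, which is a useful consistency check on the totals.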

Biographical Sketch: Dr. Keiji Tani graduated from Osaka University, Osaka, Japan, in July 1984 with a Doctor of Engineering in Nuclear Fusion Research. Tani has worked at the Japan Atomic Energy Research Institute since April 1974, where he has been involved in research in fast-ion confinement in tokamaks using particle-simulation techniques, establishment of a laboratory of advanced photon research, and research and development of the Earth Simulator. With respect to the Earth Simulator project, he is responsible for the development of the basic software system and collaborations necessary for the development of the Earth Simulator.


Chair: Sally Haerer, Oregon State University
Time: Wednesday, 10:30 AM - Noon
Room: D 263/265

Post-conference update: please see the Awards page for a complete list of all awards presented at SC2000.


Looking Back at Glen Culler's Development of Interactive Scientific Computing
Glen J. Culler, University of California, Berkeley, and David E. Culler

This presentation will be a retrospective on Glen Culler's career. It will include a tape presentation of his 1986 lecture at the ACM History of the Personal Computer, entitled "Mathematical Laboratories: A New Power for the Physical and Social Sciences," which traces the development of the "On-Line System" interactive scientific computing and visualization environment from its origins in 1961. It is surprising how much of this talk is still relevant today. David Culler will introduce the tape and set it in the context of Glen's career, including development of the array processor, digital speech processing, and the personal supercomputer.

Biographical Sketch: The work of Dr. Glen J. Culler truly demonstrates both the creative spirit and the profound impact on high performance computing systems recognized by the Seymour Cray Engineering Award. In 1961, he and physicist Burton Fried developed the first interactive, mathematically based, graphical system, allowing scientists to visualize computational solutions in real time. In 1973, he produced the AP 90B, perhaps the first VLIW array processor, applying extensive instruction-level parallelism to obtain high performance. The AP 90B operated in block floating point, which was extended to floating point in the FPS AP-120B. Delivering over three MFLOPS for under $50,000 in a minicomputer environment, the AP-120B was often deemed "the poor man's Cray." It would be fifteen years before RISC microprocessor workstations obtained the same price-performance.

While array processors became established in numerous scientific laboratories, Glen's interest in real-time digital speech drove a series of very compact, high-performance digital signal processor designs used in aeronautic and submarine applications. In 1982, this effort produced, in collaboration with Motorola, the first 32-bit VLSI array processor. Novel uses of this technology included the Chromophone, an experimental system allowing the deaf to visualize speech characteristics.

Throughout the 80's, Culler Scientific Systems was a leader in the minisupercomputer arena, extending the use of instruction level parallelism, multiprocessors, and array-based addressing in several innovative architectures. The Culler PSC (Personal SuperComputer) delivered a quarter of a Cray 1-S in workstation size and price and the Culler-7 pioneered the networked, multiprocessor Unix compute server arena. In 1991, at Star Technologies, Glen developed the first Sparc-based vector processor, the STAR 910/VP, before a serious stroke forced him into retirement.


Large-scale Parallel Transient Dynamics Simulations of an Explosive Blast Interacting with a Concrete Building
Stephen W. Attaway, Sandia National Laboratories

This talk will illustrate how massively parallel computing is currently being applied to several large-scale transient dynamics numerical analyses. Advances in parallel algorithms for transient dynamics analysis have enabled simulations that employ tens of millions of elements. The key algorithmic work that enables efficient parallel transient dynamics analysis will be reviewed and illustrated with several examples.

One such example is the prediction of a reinforced concrete building response due to blast loading from a terrorist attack. This work demonstrates that it is feasible to predict the response of reinforced concrete buildings subjected to blast loads using finite-element techniques that incorporate sophisticated physics-based material models to represent the concrete and reinforcement behavior. This modeling challenge is compounded by the potential for blastwave/structure interaction. For a coupled analysis, the airblast load is modeled by an Eulerian finite volume code and the structure is modeled using a Lagrangian finite-element code. These codes can be joined in a single coupled analysis.

The feasibility of large-scale computations, both uncoupled and coupled, was demonstrated using structural models as large as 3.2 million elements and as many as 512 processors. Numerous benchmark problems were performed to validate the material models that characterize the concrete and reinforcing bar behavior. Uncoupled analyses show excellent agreement with observed behavior in the tests. In addition to discussing the feasibility demonstration and model validation for the coupled code developed to address the blast/structural response problem, this talk presents preliminary results of the coupled analyses. Efficient coupling between the Eulerian and Lagrangian models requires multiple parallel decompositions with irregular communication patterns. The computational strategy for parallel implementation of the coupled codes will be outlined.
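The coupling pattern described above, where an Eulerian airblast solver supplies pressure loads to a Lagrangian structural solver each step, can be illustrated with a deliberately tiny 1-D analogue: a decaying blast overpressure driving a spring-mass "wall". All names and numbers here are illustrative, not Sandia's codes or data:

```python
# Toy partitioned blast/structure coupling in 1-D. Each step, the "fluid"
# side evaluates the load on the structure, then the "structure" side
# advances one explicit time step under that load.
import math

def blast_pressure(t, p0=1.0e5, tau=0.002):
    """Friedlander-like exponentially decaying overpressure (Pa)."""
    return p0 * math.exp(-t / tau)

def simulate(t_end=0.01, dt=1.0e-5, m=100.0, k=1.0e6, area=1.0):
    """Explicit integration of m*x'' + k*x = p(t)*A; returns peak deflection."""
    x, v, t, xmax = 0.0, 0.0, 0.0, 0.0
    while t < t_end:
        f = blast_pressure(t) * area - k * x   # fluid side: load this step
        v += (f / m) * dt                      # structure side: advance
        x += v * dt
        xmax = max(xmax, x)
        t += dt
    return xmax

print(simulate())
```

In the real coupled analysis the exchange is two-way (the deformed structure also moves the fluid boundary) and each side is itself decomposed across processors, which is what creates the irregular communication patterns mentioned above.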

Biographical Sketch: Stephen W. Attaway is an engineer who works in the field of computational mechanics for crash and impacts. He has worked at Sandia National Laboratories since 1987 and is currently a Distinguished Member of Technical Staff in the Computational Solid Mechanics and Structural Dynamics Department. Stephen received in 1998 the Eric Reissner Medal, awarded every two years at the International Conference on Computational Engineering Science for excellence in computational mechanics research. Stephen was named a fellow of the American Society of Mechanical Engineers (ASME) in March 2000.

Stephen develops numerical algorithms and application codes and applies them to model transient dynamics (TD) problems. The prototypical TD calculation is the simulation of a car crash, but other applications abound in both government and industry, including simulations of metal forming, explosions, and weapons effects. Stephen is the lead developer of Pronto3D, a state-of-the-art TD code developed at Sandia National Laboratories. Stephen, working with his colleagues at Sandia National Laboratories, has made important advances in methodologies for contact detection and enforcement. Pronto3D's methodology is widely respected for its combination of accuracy and efficiency.

Stephen was instrumental in developing a novel parallelization strategy for global contact detection, which enabled Pronto3D to be the first TD code that runs scalably on thousands of processors. This parallelization work enabled Pronto3D to be a finalist in the 1997 Gordon Bell competition. A complex container-crush simulation with 13.8 million finite elements was run at a speed of 120 Gflops on 3600 nodes of the Intel Tflops machine. Today, with faster Pentium processors installed on the machine, the same calculation would run at nearly 200 Gflops. The parallel Pronto3D code also received first prize (out of 34 entries) in the 1999 SuParCup competition, awarded at the Mannheim (Germany) SuperComputing SC'99 conference.


Chair: Tim Mattson, Intel
Time: Wednesday, 1:30 - 3:00 PM
Room: D 263/265

Software Applications Accelerate Performance with the VI Architecture; Hot Clusters, & What's Next?
David Fair, Ed Gronke, Giganet

This talk will focus on the implementation of clustered solutions, success stories, and clustering challenges for 2001. Over the last year, applications have been tuned to exploit the performance benefits of the VI Architecture for database, storage, and backup. In the database arena, for example, Giganet is the backbone for database acceleration; Microsoft's SQL Server 2000 has exhibited performance increases of 30 percent or more utilizing the VI Architecture with off-the-shelf clustering components. Oracle will be announcing VI-based results at the introduction of their 8.2 release. The roadmap for many of the next suite of VI-enabled applications to be introduced in 2001 will be presented.

Biographical Sketch: David Fair, VP Solutions, Giganet Inc. Mr. Fair joined Giganet in early 2000 after 15 years with Intel Corporation. Giganet is a leader in VI-based data center networks and Mr. Fair is responsible for working with industry to deliver VI-enabled solutions stacks. Mr. Fair's implementation role at Giganet is a natural transition from his Intel responsibilities, where he was the Director for the VI Architecture Specification in Intel's Server Architecture Lab. Mr. Fair worked in the Intel Supercomputer Systems Division from 1986 through 1993 as the Director of the Research Projects Office.

Biographical Sketch: Ed Gronke, Principal Solutions Architect, Giganet, Inc. Mr. Gronke joined Giganet in early 2000 after five years with Intel Corporation. He is responsible for delivering VI-enabled applications and also has experience in Intel's Supercomputer Systems Division.

From RAIN to Rainfinity
Jehoshua Bruck, Rainfinity

The RAIN project was a research collaboration between Caltech and NASA-JPL on distributed computing, communication and data storage systems for future spaceborne missions. The goal of the project was to identify and develop key building blocks for reliable distributed systems built with inexpensive off-the-shelf components. The RAIN software components run in conjunction with operating system services and standard network protocols. Through software-implemented fault tolerance, the system tolerates multiple node, link, and switch failures, with no single point of failure. The RAIN technology has been transferred to Rainfinity, a start-up company focusing on creating clustered solutions for improving the performance, availability and scalability of Internet data centers.

Biographical Sketch: Jehoshua (Shuki) Bruck is a Professor of Computation and Neural Systems and Electrical Engineering at the California Institute of Technology. His research interests include parallel and distributed computing, fault-tolerant computing, computation theory, and neural and biological systems. Dr. Bruck has extensive industrial experience, including ten years with IBM at both the IBM Almaden Research Center and the IBM Haifa Science Center. Dr. Bruck is a co-founder and Chairman of Rainfinity, a spin-off company from Caltech focused on providing software for high-performance, reliable Internet servers. Dr. Bruck received his Ph.D. in Electrical Engineering from Stanford University in 1989. He is the recipient of a 1997 IBM Partnership Award, a 1995 Sloan Research Fellowship, a 1994 National Science Foundation Young Investigator Award, six IBM Plateau Invention Achievement Awards, a 1992 IBM Outstanding Innovation Award, and a 1994 IBM Outstanding Technical Achievement Award for his contributions to the design and implementation of the SP-1, the first IBM scalable parallel computer. He has published more than 150 journal and conference papers in his areas of interest and holds 22 US patents.


Chair: Siamek Zadeh, Sun Microsystems
Time: Wednesday, 3:30 PM - 5:00 PM
Room: D 263/265

Computing Challenges in the Travel & Transportation Industry
Richard Ratliff, Sabre Holdings, Inc.

This presentation will focus on the use of high-end Sun computers at Sabre, Inc. Sabre's mid-range environment is composed predominantly of Sun servers, and this equipment is used to run hundreds of different applications for our airline customers worldwide. These applications span a wide range of uses, from large database systems to scientific computing with forecasting and optimization models. Overviews of performance and sizing requirements for selected applications will be discussed. Support, procurement, and operational issues will also be addressed for our centrally hosted outsourcing and ASP applications as well as for customers with local hardware implementations.

Biographical Sketch: Richard Ratliff is the Vice President of Technology and Product Integration at Sabre and is responsible for coordinating the Operations Research model and data exchange among Sabre's airline applications. His primary expertise is in the area of forecasting and optimization modeling. He served for four years as Director - Revenue Management for Ansett Australia and also led Sabre's development and implementation of its advanced yield management systems. Mr. Ratliff has an M.S. in Economics from the Colorado School of Mines.

SETI@home: Internet Distributed Computing for SETI
David Anderson, United Devices, Inc.

SETI@home is a radio SETI (Search for Extraterrestrial Intelligence) project that takes a novel approach to the compute-intensive analysis of radio signals. Instead of placing a dedicated supercomputer at the telescope, SETI@home distributes data over the Internet to computers in the homes and offices of volunteers. In SETI@home's first year of operation, over 2 million people have participated, contributing 300,000 years of computer time. This approach holds promise for a number of other scientific computing problems.
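The approach described above is a master/worker pattern: the telescope data is split into independent work units, volunteers pull units, analyze them locally, and return results to be combined. A minimal sketch of that pattern follows; the function names are illustrative, and the real system adds redundancy checks, persistent state, and network transport:

```python
# Minimal master/worker sketch of volunteer computing, SETI@home style.
from queue import Queue

def make_work_units(data, chunk):
    """Master side: split a signal array into independent work units."""
    return [data[i:i + chunk] for i in range(0, len(data), chunk)]

def analyze(unit):
    """Client side: stand-in for signal analysis; report the peak value."""
    return max(unit)

def run_project(data, chunk=4):
    pending = Queue()
    for unit in make_work_units(data, chunk):
        pending.put(unit)
    results = []
    while not pending.empty():      # each "volunteer" pulls one unit
        results.append(analyze(pending.get()))
    return max(results)             # master combines returned results

print(run_project([3, 1, 4, 1, 5, 9, 2, 6]))   # → 9
```

Because the units are independent, the scheme scales to millions of loosely connected machines, which is what makes the 300,000 contributed CPU-years possible.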

Biographical Sketch: Dr. David Anderson is the Director of the SETI@home project and a former visiting scientist at U.C. Berkeley. As project director, Dr. Anderson was the technical lead, designing and managing the implementation of the client-side and server-side software, the database, and the server hardware architecture. Prior to SETI@home, Dr. Anderson was a CTO, architecting and implementing a database-driven, Web-based system for personalized music discovery and marketing. Before these positions, Dr. Anderson was Director of Software Architecture at Sonic Solutions as well as an Assistant Professor in the CS Division, EECS Department, at UC Berkeley. Dr. Anderson has authored or co-authored 65 papers in computer science, and he is the sole inventor on two pending patent applications for technology related to MediaNet and an invention involving 3D interactive television.


Chair: Ed Turkel, Compaq Corporation
Time: Thursday, 10:30 AM - Noon
Room: D 263/265

Satisfying CFD Engineering Constraints (with Parallel Processing)
Stephen A. Remondi, James Hoch, EXA Corporation

The field of Computational Fluid Dynamics (CFD) within the engineering development process is changing rapidly. CFD is fast becoming an integral part of the engineering design cycle. To meet these real-world demands, CFD must satisfy a set of specific requirements. First, CFD must be capable of handling complex real-world geometries. Second, CFD must be accurate, where the degree of accuracy is defined as sufficient to enable engineering decisions; typically, this means an error range of 3 to 10 percent. Third, CFD must be integrated into the engineering process, coupling to mechanical CAD and design software as well as to other analysis software packages. Last is performance: CFD must meet these requirements in a timely manner, because engineering answers are not useful if the design has already been signed off. Parallel processing on scalable SMP and clustered systems forms the hardware platform that provides fast turnaround times for CFD simulations.

This presentation will review, in some detail, the real engineering requirements for CFD. It will then highlight how parallel computing is utilized to meet these requirements, and conclude with a review of recent examples of how CFD is being utilized in production engineering.

Biographical Sketch: James E. Hoch, vice president of software development, was previously systems architect at Maker Communications, a developer of communication processors and associated software for use in the networking industry. Prior to that he was lead architect on several research parallel computer systems and related compiler projects at Sandia National Laboratories. Hoch originally joined Exa in 1993 with extensive experience in compiler technology and parallel computing. Hoch holds both M.S. and B.S. degrees in Computer and Electrical Engineering from Purdue University.

Biographical Sketch: Stephen A. Remondi is president and CEO and a co-founder of Exa Corporation. Remondi has invested over seven years at Exa leading product development. Before assuming the role of CEO in '99, Remondi began his career at Exa as a director, was promoted to VP of engineering in '95 and then VP of application development in '97. Prior to starting Exa, Remondi worked for Alliant Computer Systems and Data General in the development of parallel computer systems, systems architectures, and ASIC design. Remondi holds a B.S. in Computing & Electrical Engineering from Tufts University and an M.B.A. from Bentley College.

Stochastic Simulation: Breaking the Stagnation and Fragmentation of Contemporary HPC
Jacek Marczyk, EASi Engineering GmbH

Contemporary CAE and HPC are in a state of crisis. Evidence of this is the excessive fragmentation and fractalization of tools, procedures, and the market; such fragmentation appears when a discipline runs out of ideas and can push progress and innovation only in small increments. The crisis has been brought about by determinism, reductionism, and an obsession with accuracy and optimality. Because CAE and HPC consistently neglect uncertainty (in material properties, loads, boundary conditions, and geometry), they fall back on orthodox forms of computing based on brute-force increases in finite element counts, particle counts, and ever finer time steps. At the same time, the fundamental issues of confidence and model validity are embarrassingly omitted, or even silenced, because of their political incorrectness. Stochastic Simulation based on Monte Carlo techniques, recently introduced into industry, has the potential to reconcile CAE/HPC with experimentation and to deliver robust designs that transcend simplistic and trendy Multi-Disciplinary Optimization.
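The core idea of stochastic simulation can be shown in a few lines: instead of one deterministic run with nominal inputs, sample the uncertain inputs many times and study the spread of the response. The model and numbers below are illustrative only, not drawn from the talk:

```python
# Monte Carlo propagation of input uncertainty through a simple model:
# deflection = load / stiffness, with ~10% scatter on both inputs.
import random
random.seed(1)

def deflection(load, stiffness):
    return load / stiffness

nominal = deflection(1000.0, 50.0)         # single deterministic answer

samples = []
for _ in range(10000):                     # sample the uncertain inputs
    load = random.gauss(1000.0, 100.0)     # load with ~10% uncertainty
    stiffness = random.gauss(50.0, 5.0)    # stiffness with ~10% uncertainty
    samples.append(deflection(load, stiffness))

mean = sum(samples) / len(samples)
spread = max(samples) - min(samples)
print(nominal, round(mean, 2), round(spread, 2))
```

Even in this toy case the mean response differs slightly from the nominal answer and the spread is substantial, which is exactly the information a purely deterministic analysis discards.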

Biographical Sketch: Dr. Jacek Marczyk is the Vice President, Advanced Technologies at EASi Engineering GmbH, which does computer-aided engineering for the European automotive industry. Dr. Marczyk has over 18 years experience in Structural Mechanics, Simulation and Control System Design for the Aerospace, Offshore and Automotive industries. He has a PhD in Civil Engineering from Polytechnic University of Catalonia, Barcelona, Spain and has 28 publications, including four books.



Chair: Sally Haerer, Oregon State University
Time: Thursday, 1:30- 3:00 PM
Room: Ballroom C

On the Scale and Performance of Cooperative Web Proxy Caching
Geoff Voelker, UC San Diego, Computing Research Association Digital Fellow

While algorithms for cooperative proxy caching have been widely studied, little is understood about cooperative-caching performance in the large-scale World Wide Web environment. In this talk, I will describe the work that we have done to explore the potential advantages and drawbacks of inter-proxy cooperation across a wide range of scales in client population. Our work used both trace-based analysis and analytic modeling. With our traces, we evaluated quantitatively the performance-improvement potential of cooperation between 200 small-organization proxies within a university environment, and between two large-organization proxies handling 23,000 and 60,000 clients, respectively. With our model, we extended beyond these populations to project cooperative caching behavior in regions with millions of clients. Overall, our results indicate that cooperative caching has performance benefits when scaling client populations only within limited population bounds. We also used our model to examine the implications
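The talk's central finding, that cooperative caching helps only within limited population bounds, can be reproduced qualitatively with a toy simulation: when document popularity is Zipf-skewed, a modest client population already captures most cacheable re-references, so growing the population much further yields diminishing returns. The parameters below are illustrative, not taken from the talk's traces:

```python
# Toy shared-cache hit-rate model under Zipf-skewed document popularity.
import random
random.seed(0)

def zipf_weights(n_docs, alpha=0.8):
    """Unnormalized Zipf popularity, normalized to sum to 1."""
    w = [1.0 / (i + 1) ** alpha for i in range(n_docs)]
    total = sum(w)
    return [x / total for x in w]

def hit_rate(n_requests, n_docs=10000):
    """Fraction of requests that re-reference an already-cached document."""
    weights = zipf_weights(n_docs)
    docs = random.choices(range(n_docs), weights=weights, k=n_requests)
    seen, hits = set(), 0
    for d in docs:
        if d in seen:
            hits += 1
        seen.add(d)
    return hits / n_requests

small = hit_rate(2000)     # smaller client population
large = hit_rate(50000)    # 25x more requests
print(round(small, 2), round(large, 2))
```

The hit rate does rise with population, but far more slowly than the 25x growth in requests, mirroring the sub-linear benefit of inter-proxy cooperation at large scales.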

Biographical Sketch: Geoffrey M. Voelker is an assistant professor at the University of California at San Diego. His research interests include operating systems, distributed systems, and Internet systems. He received a BS degree in Electrical Engineering and Computer Science from the University of California at Berkeley in 1992, and the MS and PhD degrees in Computer Science and Engineering from the University of Washington in 1995 and 2000, respectively.


Chair: Ed Turkel, Compaq Corporation
Time: Thursday, 3:30 PM - 5:00 PM
Room: D 263/265

Moving a Large Commercial Application from SMP to DMP
David Lombard, MSC Software Corporation

Moving a high-performance FEA application from symmetric multi-processing (SMP) to distributed memory processing (DMP) offers many advantages and challenges. The largest challenge is the re-architecture of the application to suit the demands of message passing; that challenge is well recognized and usually understood. Along the way, however, are differences among MPI implementations, issues outside the MPI standard such as program startup, and platform issues such as communications performance, reliability, and scalability, all of which must be addressed to successfully deploy the application.
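The central re-architecture step in an SMP-to-DMP port is deciding which data each process owns. The sketch below shows the simplest baseline, a contiguous block partition of elements across ranks; it is a generic illustration in plain Python (no MPI runtime needed to run it), not MSC's actual decomposition scheme:

```python
def block_partition(num_elements, num_ranks):
    """Split num_elements into num_ranks contiguous blocks whose sizes
    differ by at most one element, a common DMP decomposition baseline."""
    base, extra = divmod(num_elements, num_ranks)
    blocks, start = [], 0
    for rank in range(num_ranks):
        size = base + (1 if rank < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks

# In a real MPI port each rank would keep only blocks[rank] in local
# memory; here we just print the full ownership map for 10 elements.
for rank, block in enumerate(block_partition(10, 4)):
    print(f"rank {rank}: elements {list(block)}")
```

In an SMP code all threads see the whole mesh; after a partition like this, any element data a rank needs from outside its own block must be communicated explicitly, which is exactly where the MPI-implementation and startup issues mentioned above surface.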

LS-DYNA - Application-driven Strategies for High-Performance Computing
Mark Christon, Livermore Software Technology

LS-DYNA is a multi-physics finite element code that has long been associated with large-scale high-performance computing. An application-centric focus coupled with an emphasis on high-performance computing has permitted LS-DYNA's continuing feature development to include a comprehensive set of capabilities for problems ranging from automotive crashworthiness and occupant safety, metal forming, and fluid-structure interaction to heat transfer and compressible and incompressible flows. Although current computing trends are yielding benefits in some application areas, they are also challenging the sustained growth of some grid-based solution strategies. To accommodate these trends, LS-DYNA relies on a configurable hierarchical approach to data distribution and, ultimately, parallelism. At the coarse-grained level, a domain-decomposition message-passing paradigm is used to partition and distribute the data and concomitant computational load across processors. At an intermediate level of granularity, data is distributed across shared-memory processors using directive-based parallelism. Configurable vector/cache blocking is used at the finest level of granularity to achieve an effective processor-level algorithm-to-architecture mapping. Two applications are used to contrast the advantages and potential pitfalls of current trends in scientific computing. The first demonstrates that low-cost clusters and Beowulf systems are making large-eddy simulation accessible to a broad class of users. The second application focuses on optimization with LS-OPT. LS-OPT uses a response surface methodology to construct design rules and conduct optimization requiring a large number of independent parallel LS-DYNA simulations. An example of an automotive crashworthiness optimization problem demonstrates LS-OPT's embarrassingly parallel use of LS-DYNA simulations on an HP V-Class machine and a network of PCs.
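The finest level of the hierarchy described above, vector/cache blocking, amounts to walking a partition's data in fixed-size chunks that map well onto vector units and cache. The following is a toy Python sketch of that pattern, not LS-DYNA's implementation; the block size and the sum-of-squares kernel are stand-ins chosen only to show the loop structure:

```python
BLOCK = 256  # assumed finest-granularity block size (cache/vector sized)

def blocked_sum_of_squares(values, block=BLOCK):
    """Process one partition's data in fixed-size blocks, the finest
    level of a coarse/intermediate/fine parallel hierarchy."""
    total = 0.0
    for start in range(0, len(values), block):
        chunk = values[start:start + block]   # one block-resident slice
        total += sum(v * v for v in chunk)    # per-block kernel
    return total

data = [float(i) for i in range(1000)]
result = blocked_sum_of_squares(data)
# The blocked traversal computes the same answer as a straight loop;
# the win in a compiled code is locality, not arithmetic.
assert result == sum(v * v for v in data)
```

In the real hierarchy, the outer two levels (message-passing domain decomposition and directive-based shared-memory parallelism) would each hand such a blocked kernel its own slice of the mesh.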

Biographical Sketch: Mark Christon has been working in the areas of acoustic fluid-structure interaction and time-dependent incompressible/low-Mach fluid dynamics, with an emphasis on second-order projection methods and large-eddy simulation, for the past 10 years. During this period he worked at Lawrence Livermore National Laboratory in the Methods Development Group and at Sandia National Laboratories in the Computational Physics department. Mark received his PhD in mechanical engineering from Colorado State University in 1990. Currently, Mark is a senior scientist at Livermore Software Technology Corporation and the primary developer of the incompressible/low-Mach flow solver and fluid-structure capability in LS-DYNA.