DOI: 10.1145/1341811.1341857
A common application platform for the SURAgrid (CAP)

Published: 29 January 2008

ABSTRACT

From our experience in developing and deploying research applications on a regional grid infrastructure (SURAgrid, www.sura.org/suragrid), we observe that there are significant barriers to grid-enabling applications and, therefore, to realizing the full benefit of a grid environment. To increase both the number and the variety of applications that can run on SURAgrid, we propose to develop an integrated environment that directly supports MPI-based applications on a subset of SURAgrid resources. Our goal is to emulate the environment of a single Beowulf-style cluster on SURAgrid for easy access by MPI-based distributed-memory applications, which are readily available in almost all fields of science and engineering. Although a single application can engineer a similar environment and performance from the various grid services, a persistent environment such as CAP can significantly reduce the complexity of deploying an application while extending the benefits to a much larger set of applications.

This paper describes the architecture and initial implementation of the Common Application Platform (CAP) on SURAgrid. CAP provides an integrated platform for scheduling and execution of sequential and parallel jobs on the CAP-enabled resources in a user-friendly environment. In particular, we aim to provide the following capabilities:

Meta-scheduling: Co-scheduling is needed for applications whose large memory requirements can be met only by using resources at multiple sites simultaneously. Automatic resource selection across multiple sites will enhance the overall utilization of the grid. The scheduling and job management capabilities in CAP are provided by the GridWay metascheduler.

Orchestration: For parallel applications on CAP, the orchestration capabilities are provided by MPICH-G2, a grid-enabled version of the popular MPI implementation MPICH. Both GridWay and MPICH-G2 depend on the Globus Toolkit to provide basic grid functionality.

Fast data transfer: High-performance networks, such as the National Lambda Rail that connects many SURAgrid resources, can be exploited by performing striped (parallel) file transfers; this project will explore such transfers for enhancing inter-cluster communication.
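The idea behind a striped transfer can be illustrated with a minimal Python sketch. It is not the transfer mechanism used by CAP, which the abstract does not specify; it only copies a local file as several byte ranges moved concurrently. The function names (`stripe_ranges`, `copy_stripe`, `striped_copy`) and the default of four stripes are assumptions made for illustration.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def stripe_ranges(size, stripes):
    """Split [0, size) into at most `stripes` contiguous (offset, length) ranges."""
    if size == 0:
        return []
    chunk = -(-size // stripes)  # ceiling division
    return [(off, min(chunk, size - off)) for off in range(0, size, chunk)]


def copy_stripe(src, dst, offset, length):
    """Copy one byte range; each worker opens its own file handles.

    The whole stripe is read into memory, which is acceptable for a sketch
    but not for very large stripes.
    """
    with open(src, "rb") as fin, open(dst, "r+b") as fout:
        fin.seek(offset)
        fout.seek(offset)
        fout.write(fin.read(length))
    return length


def striped_copy(src, dst, stripes=4):
    """Copy src to dst as byte ranges moved concurrently; return bytes copied."""
    size = os.path.getsize(src)
    # Pre-size the destination so every stripe can write at its own offset.
    with open(dst, "wb") as fout:
        fout.truncate(size)
    with ThreadPoolExecutor(max_workers=stripes) as pool:
        futures = [pool.submit(copy_stripe, src, dst, off, n)
                   for off, n in stripe_ranges(size, stripes)]
        return sum(f.result() for f in futures)
```

In a grid setting each stripe would travel over its own network stream between clusters, so the potential gain comes from using several streams at once rather than from local disk parallelism as in this sketch.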

There are significant issues and challenges in several areas affecting practical deployment of CAP: load balancing, routing, resource heterogeneity, and network performance. To explore these issues, a prototype involving the SURAgrid resources at Old Dominion University and the University of Alabama at Birmingham will be developed. The prototype leverages existing software solutions in a coordinated infrastructure to minimize development effort, uses the SURAgrid infrastructure to provide simplified job control through the SURAgrid portal, and enables inter-institutional resource allocation through the SURAgrid authentication and authorization mechanism.
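To make the co-scheduling requirement and the resource heterogeneity issue concrete, the following Python sketch selects nodes from several sites until an aggregate memory request is met. It is a simple greedy heuristic written for illustration and is not GridWay's scheduling algorithm; the `Site` type, the `co_schedule` function, the site names, and all node and memory figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    name: str             # hypothetical site label
    free_nodes: int       # nodes currently idle at the site
    mem_per_node_gb: int  # memory per node; sites may differ


def co_schedule(sites, required_mem_gb):
    """Greedily pick whole nodes, largest-memory sites first, until the
    aggregate memory meets the request.

    Returns {site name: node count}, or None when all sites together
    cannot satisfy the request.
    """
    allocation = {}
    remaining = required_mem_gb
    for site in sorted(sites, key=lambda s: s.mem_per_node_gb, reverse=True):
        if remaining <= 0:
            break
        needed = -(-remaining // site.mem_per_node_gb)  # ceiling division
        nodes = min(site.free_nodes, needed)
        if nodes > 0:
            allocation[site.name] = nodes
            remaining -= nodes * site.mem_per_node_gb
    return allocation if remaining <= 0 else None


# Two hypothetical sites with different node sizes (figures are invented).
sites = [Site("site_a", free_nodes=8, mem_per_node_gb=16),
         Site("site_b", free_nodes=16, mem_per_node_gb=8)]
print(co_schedule(sites, 100))   # fits on one site
print(co_schedule(sites, 200))   # must span both sites
print(co_schedule(sites, 1000))  # exceeds the combined capacity
```

A request that fits within one site stays there, while a larger request spills onto the second site, which is the case where simultaneous allocation across sites becomes necessary. A production metascheduler would also weigh queue state, network performance, and site policy, none of which this sketch models.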

This paper will disseminate lessons learned through the construction of this prototype as well as share experiences and perspectives towards next-step development. The goal of CAP is to provide immediately useful benefits to the growing SURAgrid application community in a way that also serves as a model for effective integration of existing technologies for the grid use and development community at large.

        • Published in

          ACM Other conferences
          MG '08: Proceedings of the 15th ACM Mardi Gras conference: From lightweight mash-ups to lambda grids: Understanding the spectrum of distributed computing requirements, applications, tools, infrastructures, interoperability, and the incremental adoption of key capabilities
          January 2008
          178 pages
          ISBN:9781595938350
          DOI:10.1145/1341811

          Copyright © 2008 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Qualifiers

          • abstract