Note: The content of this page is extracted from a preprint that is currently under peer review by a distributed systems conference committee.

Scientists attempt to solve many problems by analyzing data. The data can come from several sources, such as initial assumptions or facts, observations from instruments, or the output of a previous experiment. Once the data is acquired, scientists attempt to simulate the properties of the basic elements found within the data, search for similar patterns in the data, or explore the convergence of the data in a dynamic model. They use computational science to encode their mathematical model into a simulation or data analysis application and feed the data into that application. The scientist then analyzes or visualizes the output of the application to identify the relationship between the data and the output.
We identify two main components vital to support this scenario:

[^airavata!airavata-analysis-usecase.png]|border=1!


Managing these two components plays a major role in a science gateway. Depending on the gateway's use cases, its developers have the option of reusing software development kit libraries, leveraging existing services or frameworks, or building these components from scratch.

Some simulations or data analyses are single application executions, although the application itself may be parallel. The UltraScan Science Gateway~\cite{ultrascan} provides a user-friendly yet powerful and flexible environment for analyzing analytical ultracentrifugation experiments. A scientist using the UltraScan science gateway combines an input data set with parameters specifying the type of analysis to be performed and passes them to UltraScan’s data analysis application. The progress of each analysis is monitored separately for each input variation that triggered it.

Many gateways extend this concept to provide many tools to their users. The CIPRES gateway~\cite{cipres} is one such gateway, in which a suite of phylogenetic applications, most adapted to work on high performance computers, is provided to the gateway’s user community. It is thus necessary for the gateway to manage the deployment and execution characteristics of a wide range of applications. The scenario is illustrated in Figure~\ref{fig:scenario1}.

Gateways may also allow users to execute multiple applications in defined sequences. A scientist using the ParamChem science gateway~\cite{paramChem}, which is designed to assist investigations of molecular structures, executes several computational chemistry applications in a workflow in which the input to one stage is provided by another application.
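The chaining described above, in which one application's output becomes the next application's input, can be sketched as follows. This is a minimal illustration under stated assumptions, not ParamChem's actual implementation: the `Stage` type, `run_workflow`, and the two placeholder stages (`optimize_geometry`, `fit_parameters`) are hypothetical names introduced only for this example.

```python
from typing import Callable, Dict, List

# Hypothetical stage signature: named inputs in, named outputs out.
Stage = Callable[[Dict[str, str]], Dict[str, str]]


def run_workflow(stages: List[Stage], initial_input: Dict[str, str]) -> Dict[str, str]:
    """Run the stages in order, passing each stage's output to the next."""
    data = dict(initial_input)
    for stage in stages:
        data = stage(data)
    return data


# Placeholder stages; real stages would submit jobs to compute resources.
def optimize_geometry(inputs: Dict[str, str]) -> Dict[str, str]:
    return {"geometry": "optimized(" + inputs["molecule"] + ")"}


def fit_parameters(inputs: Dict[str, str]) -> Dict[str, str]:
    return {"parameters": "fit(" + inputs["geometry"] + ")"}


result = run_workflow([optimize_geometry, fit_parameters], {"molecule": "H2O"})
print(result)
```

Keeping each stage as a plain function over a dictionary means the sequence can be reordered or extended without changing the runner. A real gateway would also need per-stage error handling and job monitoring, which this sketch omits.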

Gateways also have to perform infrastructural tasks such as acquiring permission for resources, creating resources, copying or moving applications, and setting up special permissions and similar attributes. For example, in the Neuroscience Gateway (NSG)~\cite{nsg}, neuroscience applications such as NEURON require executing scientist-provided Python scripts, which in turn require neuroscience modules to be loaded for the NEURON application before the computation is triggered. Once the computation has completed successfully, the data has to be moved to the storage resources allocated for NSG, where scientists using that gateway can access their results.
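The ordering of these infrastructural steps (environment setup, then computation, then moving results to storage) might be sketched as below. This is an assumption-laden illustration rather than NSG's actual mechanism: `run_staged_job` is a hypothetical helper, module loading is only recorded in a log, and the computation is replaced by a placeholder that writes a file.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def run_staged_job(workdir: Path, storage: Path, modules: List[str]) -> List[str]:
    """Run setup, compute, and stage-out in order; return a log of the steps."""
    log = []
    workdir.mkdir(parents=True, exist_ok=True)

    # Setup: a real gateway would load environment modules on the compute host here.
    for name in modules:
        log.append("load:" + name)

    # Compute: placeholder that writes a result file instead of running a simulation.
    output = workdir / "result.dat"
    output.write_text("placeholder output\n")
    log.append("compute:done")

    # Stage-out: copy the result to the storage area that users can reach.
    storage.mkdir(parents=True, exist_ok=True)
    shutil.copy2(output, storage / output.name)
    log.append("stage-out:" + output.name)
    return log


with tempfile.TemporaryDirectory() as tmp:
    steps = run_staged_job(Path(tmp) / "work", Path(tmp) / "storage", ["neuron"])
    staged = (Path(tmp) / "storage" / "result.dat").exists()
print(steps, staged)
```

Stage-out runs only after the compute step returns, mirroring the requirement that data is moved once the computation has completed successfully.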