General Assumptions

 

  1. A single Application Catalog will exist in Airavata, to be used by all the gateways.
  2. Each gateway will create, update, delete, etc. its own application deployments on the supercomputers.
  3. Each gateway will have ADMIN users who perform the above tasks in the Application Catalog.
  4. Airavata ADMIN users will be able to view details entered through all the gateways using the Airavata instance. They can also search and view application deployments.
    E.g.:
    • Search for all applications which are in the turned-ON state on all supercomputers
    • Search for application deployments of a specific gateway
  5. If necessary, Airavata ADMIN users will also be able to maintain application deployments across gateways.
  6. Applications and workflows will always be deployed or created (workflows will be created using XBAYA) before they are entered in the Application Catalog.

...

Existing Descriptors

 

  1. In Airavata there are two ways of executing experiments on supercomputers and other computational resources (Trestles, Stampede, Amazon EC2, Big Red II, etc.):
    • Execute applications - single-submission experiments
    • Execute workflows - multiple-node or single-node experiments
  2. To execute experiment jobs of either type, we rely on ‘Descriptors’ in Airavata to provide the required information about the application, the supercomputers hosting the application, and the input and output parameters of the application.
  3. There are three types of descriptors:
    • Host Descriptors
      Specify information about the resource hosting the application, e.g. GRAM, GridFTP, Amazon EC2, etc.
    • Service Descriptors
      Specify input and output parameter information for an application.
    • Application Descriptors
      Specify executable location information, deployment location information, scratch locations of output files, metadata specific to the application, metadata specific to the deployment (the location where the application is deployed on the resource), etc.
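
To make the division of responsibility between the three descriptor types concrete, the following is a minimal sketch in Python. It is not the Airavata descriptor schema: every class name, field name and the `describe_job` helper are assumptions chosen for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class HostDescriptor:
    """Resource that hosts the application (hypothetical fields)."""
    host_name: str
    job_submission_protocol: str          # e.g. "GRAM"
    data_movement_protocol: str = ""      # e.g. "GridFTP"


@dataclass
class ServiceDescriptor:
    """Input and output parameters of an application (hypothetical fields)."""
    service_name: str
    inputs: dict = field(default_factory=dict)    # parameter name -> type
    outputs: dict = field(default_factory=dict)   # parameter name -> type


@dataclass
class ApplicationDescriptor:
    """Links a service to a host and records where it is deployed."""
    service_name: str
    host_name: str
    executable_location: str
    scratch_location: str


def describe_job(host, service, app):
    """Collect the information a job submission needs from all three descriptors."""
    if app.host_name != host.host_name or app.service_name != service.service_name:
        raise ValueError("application descriptor does not match host/service")
    return {
        "host": host.host_name,
        "protocol": host.job_submission_protocol,
        "executable": app.executable_location,
        "scratch": app.scratch_location,
        "inputs": sorted(service.inputs),
        "outputs": sorted(service.outputs),
    }
```

The point of the sketch is that no single descriptor is sufficient on its own: a job needs the host, the parameter definitions and the deployment details together.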

 

Introduction - Application Catalog

...

Create Application/Workflow

  • Creating an application involves many subtasks in the Application Catalog. Creating an application instance includes:

 

    • Entering application-specific information, such as:

      • Application version details

      • Deployment visibility to gateway users

      • Defining application-specific parameters

      • Defining resource-specific parameters required for the application execution

      • Defining input and output data file sizes, file locations, scratch folder locations, and their folder sizes for the application

      • Error file locations, log file locations, etc. for the application on the resource where it resides

      • Application user group (the application is available only to the selected user group or groups)

      • Maximum wall-time, number of processors, nodes per processor, etc.

    • Entering resource-specific information, such as:

      • Deployed resource/supercomputer information

      • Defining resource-specific parameters, such as the location where the application resides

      • Queue information

      • Deployment information, such as whether MPI, GPU, Serial, etc. is used

  1. Creating a workflow consists of:

    • Entering workflow information and the resources where the workflow will be executed

    • Defining workflow nodes and the applications to be used at each node in the workflow

    • Defining subsequent input and output files

  2. Application and workflow creation in the catalog will be a gateway ADMIN task.

  3. Applications and workflows can be created by cloning and importing existing applications and workflows.

  4. Creating applications and workflows does not by itself make them available to users for executing experiments.
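
As a rough illustration of the last two points, the sketch below models a catalog entry that starts in the CREATED state, a clone operation that yields an independent CREATED copy, and a user-facing listing that only returns PUBLISHED entries. The `CatalogEntry` class, its fields and both helper functions are hypothetical and are not part of any Airavata API.

```python
import copy
import itertools
from dataclasses import dataclass, field

# Hypothetical ID generator; the catalog is only assumed to hand out unique IDs.
_ids = itertools.count(1)


@dataclass
class CatalogEntry:
    name: str
    version: str
    deployments: list = field(default_factory=list)
    status: str = "CREATED"               # new entries are never visible yet
    entry_id: int = field(default_factory=lambda: next(_ids))


def clone_entry(source, new_name):
    """Create a new, independent entry from an existing one."""
    clone = copy.deepcopy(source)         # deployments must not be shared
    clone.name = new_name
    clone.status = "CREATED"              # a clone always starts unpublished
    clone.entry_id = next(_ids)
    return clone


def visible_to_users(entries):
    """Only PUBLISHED entries can be used by gateway users."""
    return [e for e in entries if e.status == "PUBLISHED"]
```

A deep copy is used so that editing the clone's deployment details cannot silently change the original entry.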

Publish Application/Workflow

  1. Application deployments and workflows in the CREATED state must be published before gateway users can use them.

  2. Publishing applications and workflows is a gateway ADMIN user level task.

  3. At the time of publishing applications, the system would validate:

    • Existence of at least 1 deployment of the application

    • Existence of at least 1 application and resource level

...

    • parameter

...

    • ...
  1. At the time of publishing workflows, the system should validate:

    • Existence of at least one application in the workflow
    • Existence of at least one node in the workflow

    • ...
  2. Users can search within created applications to find the ones to publish, by providing:

    • Application ID/Workflow ID (ID generated within the Application Catalog)

    • Application Name

    • Resource

    • Application User Group

    • ...
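
The publish-time checks listed above can be sketched as follows. This is a minimal illustration, assuming entries are plain dictionaries with hypothetical keys (`status`, `deployments`, `nodes`, `application`). Only the checks that are fully spelled out above are implemented; the remaining ones are left out rather than guessed.

```python
def validate_application(app):
    """Return a list of problems that block publishing an application."""
    errors = []
    if not app.get("deployments"):
        errors.append("application has no deployment")
    return errors


def validate_workflow(workflow):
    """Return a list of problems that block publishing a workflow."""
    errors = []
    nodes = workflow.get("nodes", [])
    if not nodes:
        errors.append("workflow has no node")
    if not any(node.get("application") for node in nodes):
        errors.append("workflow has no application")
    return errors


def publish(entry, validator):
    """Move a CREATED entry to PUBLISHED if validation passes."""
    if entry.get("status") != "CREATED":
        raise ValueError("only CREATED entries can be published")
    errors = validator(entry)
    if errors:
        raise ValueError("; ".join(errors))
    entry["status"] = "PUBLISHED"
    return entry
```

Returning a list of errors instead of failing on the first one lets an ADMIN user see everything that needs fixing in a single attempt.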

 

Search/List Application/Workflow

 

  1. Gateway users (ADMIN and normal) and Airavata ADMIN users will search/list applications and workflows for different purposes. When searching or listing applications and workflows, different keys can be used.

  2. Some examples:

...

    • Application ID/Workflow ID

    • Application name/Workflow Name

    • Resource

    • By ON/OFF flag

    • Status (CREATED or PUBLISHED)

    • By the special application user groups

    • ...

  1. Gateway users or Airavata users may want to use the above listing for auditing purposes.
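
A search over the keys listed above can be sketched as a simple conjunction of equality filters. The entry layout and the key names (`app_id`, `name`, `resource`, `enabled`, `status`, `user_group`) are assumptions for illustration; the catalog's real storage and query mechanism are not specified here.

```python
def search(entries, **criteria):
    """Return the entries whose fields equal every given criterion."""
    return [
        entry for entry in entries
        if all(entry.get(key) == value for key, value in criteria.items())
    ]


# Hypothetical catalog content used only to demonstrate the filter.
catalog = [
    {"app_id": "A1", "name": "Echo", "resource": "Trestles",
     "enabled": True, "status": "PUBLISHED", "user_group": "chem"},
    {"app_id": "A2", "name": "Echo", "resource": "Stampede",
     "enabled": False, "status": "PUBLISHED", "user_group": "chem"},
    {"app_id": "A3", "name": "Amber", "resource": "Stampede",
     "enabled": True, "status": "CREATED", "user_group": "bio"},
]

# All applications that are turned ON and published, on any resource.
turned_on = search(catalog, enabled=True, status="PUBLISHED")
```

Calling `search` with no criteria returns every entry, which corresponds to a plain listing.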

...

  1. Applications/workflows can be turned ON and OFF whenever required by the gateway or Airavata ADMIN users.

  2. This functionality will be used when an application/workflow needs to be temporarily unavailable to the users.

  3. Applications/workflows which are published can be turned ON and OFF.

  4. Turning ON and OFF can be done at different levels, such as:

    • Resource level

    • Application user group level

    • Application level

  5. When an application or a workflow is turned OFF at a certain level, it needs to be turned ON at the same level as well.

    • E.g.: When applications are turned OFF at resource level, they cannot be turned ON individually; turning ON must also happen at resource level.
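
One possible way to enforce the same-level rule is to record each OFF switch together with the level at which it was set, and to treat an application as available only when no switch at any level covers it. The sketch below assumes this model; the class, method and level names are hypothetical.

```python
class AvailabilitySwitches:
    """Tracks OFF switches per level; anything not switched OFF is ON."""

    LEVELS = ("resource", "user_group", "application")

    def __init__(self):
        self._off = set()                 # {(level, key), ...}

    def _check_level(self, level):
        if level not in self.LEVELS:
            raise ValueError(f"unknown level: {level}")

    def turn_off(self, level, key):
        self._check_level(level)
        self._off.add((level, key))

    def turn_on(self, level, key):
        """Remove an OFF switch; it must have been set at this same level."""
        self._check_level(level)
        if (level, key) not in self._off:
            raise ValueError(f"{key!r} was not turned OFF at {level} level")
        self._off.remove((level, key))

    def is_available(self, application, resource, user_group):
        """Available only if no OFF switch covers any of the three scopes."""
        scopes = {
            ("resource", resource),
            ("user_group", user_group),
            ("application", application),
        }
        return not (scopes & self._off)
```

With this model, turning a resource OFF hides every application on it, and an attempt to turn a single application back ON is rejected because no application-level switch exists to remove.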

Read from Application Catalog

  1. Airavata's internal components can also query information from the Application Catalog. The Orchestrator, Workflow Interpreter, and GFac can query the information required to formulate the tasks to be executed on the supercomputers.

  2. The Orchestrator will query the Application Catalog to retrieve the information required to submit single-submission jobs to GFac. It will retrieve:

    • Information on the resource where the application resides

    • Information on application-specific parameters

    • If the experiment did not specify a resource, the Orchestrator will select a resource from the Application Catalog

  3. The Workflow Interpreter will retrieve the information required to submit the jobs of a workflow:

    • Retrieve information on applications to execute for each node

    • Information on resources where applications are running

    • Application version information

...

    • ...

  1. GFac will also use information stored in the Application Catalog:

    • Information on locations for input data files, output data files, error files and log files for each application

    • Information on application versions and other information required for application execution

    • Information to select the correct providers and handlers

    • ...
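
The Orchestrator lookup described above, including the fallback when an experiment names no resource, could look roughly like the sketch below. The deployment record layout is hypothetical, and picking the first matching deployment is purely an assumption, since no selection policy is specified here.

```python
def resolve_deployment(deployments, application, requested_resource=None):
    """Pick the deployment record to use for a single-submission job."""
    candidates = [
        d for d in deployments
        if d["application"] == application
        and d["status"] == "PUBLISHED"
        and d["enabled"]
    ]
    if requested_resource is not None:
        # The experiment named a resource: honour it or fail.
        candidates = [d for d in candidates if d["resource"] == requested_resource]
    if not candidates:
        raise LookupError(f"no usable deployment for {application!r}")
    # Assumption: take the first match; a real policy might weigh load or queues.
    return candidates[0]


# Hypothetical catalog content used only to demonstrate the lookup.
deployments = [
    {"application": "Echo", "resource": "Trestles", "status": "PUBLISHED", "enabled": False},
    {"application": "Echo", "resource": "Stampede", "status": "PUBLISHED", "enabled": True},
    {"application": "Echo", "resource": "Big Red II", "status": "CREATED", "enabled": True},
]

chosen = resolve_deployment(deployments, "Echo")
```

Unpublished and turned-OFF deployments are filtered out before selection, so the fallback never hands GFac a deployment that gateway users are not allowed to run.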

Use Cases - Setting-up & maintenance of Application Catalog

...