Project Title

Small Molecule Ionic Isolation Lattices (SMILES) Data Models.

Abstract

The goal of this project is to design and implement a solution for the Airavata Data Catalog and the data parsers that analyze metadata extracted from literature, experimental, and computational records in support of Small Molecule Ionic Isolation Lattices (SMILES) data. In particular, this includes synchronizing the data with the SEAGrid Data Analysis Portal and the gateway users.

Proposal Content

Problem Definition

Airavata is used by science gateways as a platform to create, submit, execute, and monitor different types of scientific jobs and workflows on scientific grids. Airavata currently uses three separate databases to store the metadata of a particular chemical compound. The current architecture has several drawbacks in how this data is represented in the SEAGrid Data Analysis Portal.

Chemical compounds are currently represented with missing fields and unstructured keys. Representing them with the Scientific Data Model (SDM) and its related ontology (SDMO) would make it easier to analyze the behavior of a compound. Google Protocol Buffers (protobuf) are a good fit for defining the structured schema of a compound.
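To illustrate what a structured compound schema buys over unstructured keys, here is a minimal sketch in Python. A dataclass stands in for a protobuf message here; the class name `CompoundRecord`, its fields, and the helper `missing_fields` are all hypothetical and are not taken from the Airavata code base.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class CompoundRecord:
    """Hypothetical structured record for one compound (illustrative fields only)."""
    smiles: Optional[str] = None
    inchi: Optional[str] = None
    formula: Optional[str] = None
    energy_hartree: Optional[float] = None


def missing_fields(record: CompoundRecord) -> List[str]:
    """Return the names of fields that are unset (None) or empty strings."""
    missing = []
    for f in fields(record):
        value = getattr(record, f.name)
        if value is None or value == "":
            missing.append(f.name)
    return missing


# Water, with no computed energy attached yet.
water = CompoundRecord(smiles="O", inchi="InChI=1S/H2O/h1H2", formula="H2O")
print(missing_fields(water))  # ['energy_hartree']
```

Because every record has the same named fields, a gap shows up as an explicit, reportable condition rather than as a silently absent key.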

Solution Overview

Data Modeling

Implementing a functional database

Once the data has been modeled, it is passed to a functional database, which holds the fixed, refined parameters of each chemical compound and serves them to the dashboard. The functional database is implemented in MongoDB. Scientific identifier strings such as SMILES and InChI serve as primary keys, and the data is accessed through these strings.
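The idea of keying each compound by its identifier string can be sketched as follows. This is only an illustration under stated assumptions: the function `to_functional_document`, the field names, and the in-memory `store` dictionary (a stand-in for a MongoDB collection) are hypothetical, and no database connection is made.

```python
from typing import Any, Dict


def to_functional_document(record: Dict[str, Any], key_field: str = "inchi") -> Dict[str, Any]:
    """Build a MongoDB-style document whose `_id` is the compound's identifier string."""
    key = record.get(key_field)
    if not key:
        raise ValueError(f"record has no usable '{key_field}' to serve as primary key")
    document = {"_id": key}
    # Copy the remaining fields; the key itself now lives in `_id`.
    document.update({k: v for k, v in record.items() if k != key_field})
    return document


# In-memory stand-in for a collection: one document per unique `_id`.
store: Dict[str, Dict[str, Any]] = {}

doc = to_functional_document(
    {"inchi": "InChI=1S/H2O/h1H2", "smiles": "O", "formula": "H2O"}
)
store[doc["_id"]] = doc  # insert, or overwrite an existing entry with the same key

print(store["InChI=1S/H2O/h1H2"]["formula"])  # H2O
```

Against a real MongoDB collection, the same document could be written with pymongo's `collection.replace_one({"_id": doc["_id"]}, doc, upsert=True)`, so that re-ingesting a compound updates its entry instead of creating a duplicate.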

Main Components of the Solution

The main components of the solution are identified as:

Airavata Portal
- Custom Django UI
- Apache Airavata Data Lake
Data Modeling
- Protobuf files

Deliverables

Redesigning the Data Models.
Creating a robust database to reduce the latency.
Synchronizing the data with the Dashboard.

Timeline

Task	Timeline	Deliverables
Study Airavata Django Portal Framework (ADPF)	May 27, 2022	Set up the Django portal locally. Study the procedure of the computational experiment. Understand the input file format required to run the experiment. Understand the Gaussian log. Create a customized Django application.
Initializing the databases	June 1, 2022	Set up the databases locally. Understand the schema.
Data Modeling	June 15, 2022	Configure the data model. Design the logical schema in protobuf format. Detect empty values and define a specific error flag for them.
Functional Database	July 1, 2022	Create a functional database. Define a schema for the functional database. Ingest data from the primary databases.
Testing and Validations	July 12, 2022	Build a test plan for data validation. Execute tests and validate the data flow process.
Project report and Final Documentation	July 29, 2022	Describe the results, observations, and insights in the final documentation.

