Apache Airavata
Small Molecule Ionic Lattices (SMILES) Data Models.
The goal of this project is to design and implement a solution for the Airavata Data Catalog and its data parsers that analyzes the metadata extracted from literature, experimental, and computational records in support of Small Molecule Ionic Isolation Lattices (SMILES) data. In particular, this includes synchronizing the data with the SEAGrid Data Analysis Portal and the gateway users.
Airavata is used by science gateways as a platform to create, submit, execute, and monitor different types of scientific jobs and workflows on scientific grids. Airavata currently uses three separate databases to store the metadata of a particular chemical compound. In the current architecture, there are a few drawbacks in how the data is represented in the SEAGrid Data Analysis Portal.
Chemical compounds are currently represented with missing fields and unstructured keys. Representing the compounds with the Scientific Data Model (SDM) and its related ontology (SDMO) would therefore make it much easier to analyze the behavior of a compound. Google Protocol Buffers (protobuf) is a good choice for structuring the schema of a compound.
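To illustrate the problem of missing fields and unstructured keys, the following minimal Python sketch maps inconsistently named source keys onto one explicit record type. It is only an illustration: the field names, the `KEY_ALIASES` table, and the `normalize` and `missing_fields` helpers are hypothetical and are not taken from the SDM or from any Airavata schema. A protobuf message could play the same role as the dataclass used here.

```python
from dataclasses import dataclass, field, asdict
from typing import Any, Dict, List, Optional

# Hypothetical mapping from inconsistent source keys to canonical field names.
KEY_ALIASES: Dict[str, str] = {
    "smiles": "smiles",
    "SMILES": "smiles",
    "inchi": "inchi",
    "InChI": "inchi",
    "InChi": "inchi",
    "name": "name",
    "Name": "name",
    "mol_weight": "molecular_weight",
    "MolecularWeight": "molecular_weight",
}


@dataclass
class CompoundRecord:
    """Structured compound record: every field is explicit, missing values are None."""
    smiles: Optional[str] = None
    inchi: Optional[str] = None
    name: Optional[str] = None
    molecular_weight: Optional[float] = None
    # Keys that cannot be mapped are kept here rather than silently dropped.
    extras: Dict[str, Any] = field(default_factory=dict)


def normalize(raw: Dict[str, Any]) -> CompoundRecord:
    """Map a raw, loosely keyed dictionary onto the structured record."""
    record = CompoundRecord()
    for key, value in raw.items():
        canonical = KEY_ALIASES.get(key)
        if canonical is None:
            record.extras[key] = value
        elif canonical == "molecular_weight":
            record.molecular_weight = float(value)
        else:
            setattr(record, canonical, value)
    return record


def missing_fields(record: CompoundRecord) -> List[str]:
    """List the structured fields that the source record did not provide."""
    return [k for k, v in asdict(record).items() if k != "extras" and v is None]
```

For example, `normalize({"SMILES": "CCO", "MolecularWeight": "46.07"})` yields a record whose `smiles` and `molecular_weight` are set, while `missing_fields` reports `inchi` and `name` as absent, which makes gaps in the source data visible instead of implicit.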
Once the data has been modeled successfully, it will be passed to the functional database, from which the fixed and finest parameters of a particular chemical compound are rendered in the implemented dashboard. This functional database is implemented in MongoDB. Scientific identifier strings such as SMILES (here, the Simplified Molecular-Input Line-Entry System notation) and InChI are used as primary keys, and the data is accessed using these strings.
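The idea of using an identifier string as the primary key can be sketched as follows. This is a minimal, hedged illustration: `CompoundStore` is a hypothetical in-memory stand-in for a MongoDB collection, and the rule of preferring InChI over SMILES when both are present is an assumption made for the example, not something the project specifies.

```python
from typing import Any, Dict, Optional


def primary_key(doc: Dict[str, Any]) -> str:
    """Choose the identifier string used as the document key.

    Assumption: InChI is preferred when present, SMILES otherwise.
    """
    key = doc.get("inchi") or doc.get("smiles")
    if not key:
        raise ValueError("a compound needs an InChI or SMILES string to be stored")
    return key


class CompoundStore:
    """Hypothetical in-memory stand-in for a MongoDB collection keyed by `_id`."""

    def __init__(self) -> None:
        self._docs: Dict[str, Dict[str, Any]] = {}

    def upsert(self, doc: Dict[str, Any]) -> str:
        """Insert the document, or replace the one stored under the same key."""
        key = primary_key(doc)
        # With pymongo, the equivalent call would roughly be:
        #   collection.replace_one({"_id": key}, {**doc, "_id": key}, upsert=True)
        self._docs[key] = {**doc, "_id": key}
        return key

    def find(self, identifier: str) -> Optional[Dict[str, Any]]:
        """Look up a compound directly by its SMILES or InChI key."""
        return self._docs.get(identifier)
```

Storing the same compound twice then replaces the earlier document rather than duplicating it, and a lookup needs only the identifier string, for example `store.find("CCO")`.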
The main components of the solution are identified as:
Task | Timeline | Deliverables |
---|---|---|
Study Airavata Django Portal Framework (ADPF) | May 27, 2022 | |
Initializing the databases | June 1, 2022 | |
Data Modeling | June 15, 2022 | |
Functional Database | July 1, 2022 | |
Testing and Validations | July 12, 2022 | |
Project report and Final Documentation | July 29, 2022 | |