Abstract

OODT consists of several components, such as the file manager, resource manager and workflow manager, and each of them has its own configuration files and locations. This is complex to manage and causes problems when the platform is distributed across servers or geographic locations. Therefore, the objective of this project is to migrate the OODT configuration to an optional Zookeeper module, so that OODT components can register themselves in the Zookeeper ensemble and maintain each component’s state regardless of the scale of the cluster. The proposed Zookeeper module will minimize the manual configuration required when setting up OODT components.

Introduction

Apache Object Oriented Data Technology (OODT) is an open source data management system framework that was originally developed at NASA Jet Propulsion Laboratory to support capturing, processing and sharing of data for NASA's scientific archives. OODT provides three core components:

File Manager - Responsible for tracking file locations and their metadata, and for transferring files from a staging area to controlled access storage.
Workflow Manager - Responsible for capturing data flow and control flow for complex processes and allowing for reproducibility and the construction of scientific pipelines.
Resource Manager - Responsible for handling allocation of workflow tasks and other jobs to underlying resources.

OODT also includes several other components, such as the file crawler (CAS-crawler), the push/pull framework (CAS-push/pull) and CAS-PGE (Catalog and Archive Service Production Generation Executive).

In addition to the details given in the abstract, this module will make use of inherited configuration at the component level. Consider the file manager as an example: the configurations of almost all file manager instances are identical. File managers that join the cluster later can therefore inherit the configuration of the initial file managers, which removes almost all of the manual configuration required when adding new nodes to the cluster.
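The inheritance idea can be illustrated with plain java.util.Properties, which already supports a set of defaults behind a child set of properties. The following is only a minimal sketch under stated assumptions: the class name InheritedConfigDemo, the method inherit() and the property keys are hypothetical and are not taken from OODT, and the real module would read the shared configuration from Zookeeper rather than from memory.

```java
import java.util.Properties;

/** Hypothetical sketch: none of these names or keys come from OODT. */
final class InheritedConfigDemo {

    private InheritedConfigDemo() { }

    /** Builds the configuration of a new instance on top of a shared configuration. */
    static Properties inherit(Properties shared, Properties overrides) {
        // Keys that are missing in the child are looked up in `shared`.
        Properties child = new Properties(shared);
        for (String key : overrides.stringPropertyNames()) {
            child.setProperty(key, overrides.getProperty(key));
        }
        return child;
    }

    public static void main(String[] args) {
        // Configuration published by the first file manager (illustrative keys).
        Properties shared = new Properties();
        shared.setProperty("example.filemgr.catalog", "lucene");
        shared.setProperty("example.filemgr.port", "9000");

        // A later file manager only states what differs for it.
        Properties overrides = new Properties();
        overrides.setProperty("example.filemgr.port", "9001");

        Properties second = inherit(shared, overrides);
        System.out.println(second.getProperty("example.filemgr.catalog")); // lucene
        System.out.println(second.getProperty("example.filemgr.port"));    // 9001
    }
}
```

In this sketch the new instance supplies only the values that differ, and the shared configuration itself is left unchanged.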

Deliverables

Completed distributed configuration management module
- Implemented using Apache Zookeeper as the underlying distributed coordination system and Apache Curator as the client to connect to Zookeeper.

Unit tests and integration tests (if possible) for the new module
- Tests to check the correct functionality of the distributed configuration management module.
- Integration tests with simulated OODT component behaviors.

Documentation on “How to use the distributed configuration management module”
- By default, OODT will keep using the file-based configuration that is currently available. To enable the distributed configuration management module, users will have to make a few configuration changes.

Developer documentation
- Explaining the architecture and the reasons for each design decision, with corresponding design diagrams.

Design and Implementation

Module Architecture

Since the file manager is the component that most critically needs distributed configuration management, I will use it as the reference component when describing the design.

As shown in the class diagram above, I will use the factory design pattern to obtain the matching configuration manager for a component. There are two types of configuration managers:

Standalone configuration manager - Built on the current configuration management logic. This configuration manager will use .properties files and other local files to fetch configuration.
Distributed configuration manager - The new addition, which stores configuration in Zookeeper and fetches it when needed. New components that come up on the fly will be able to fetch configuration from Zookeeper and use it, with no need to configure each of them manually every time.

ConfigurationFactory will return the corresponding configuration manager by looking at a system property. That is, a user who wants to use the distributed configuration manager should set a system property (say org.apache.oodt.conf.zookeeper=true) telling the component to use distributed configuration management, so that ConfigurationFactory returns the DistributedConfigurationManager as the configuration manager. Calling the loadConfiguration() method of the returned configuration manager will load the system properties. The getConfigurationManager() method of ConfigurationFactory will take the component name and the configuration file names. The component name will be used to identify similar components in the cluster, and the configuration files will be used to fetch the configuration and store it in Zookeeper.
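The selection logic described above could look roughly like the following. This is a minimal sketch, not the actual implementation: the class and method names and the system property come from this proposal, while the constructor signatures, the field names and the constant DISTRIBUTED_FLAG are assumptions made for illustration. The distributed manager is a placeholder, because the Zookeeper/Curator calls are left out on purpose.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.Properties;

/** Shared base type; only loadConfiguration() is named in the proposal. */
abstract class ConfigurationManager {
    protected final String component;
    protected final List<String> propertiesFiles;

    ConfigurationManager(String component, List<String> propertiesFiles) {
        this.component = component;
        this.propertiesFiles = propertiesFiles;
    }

    abstract void loadConfiguration() throws IOException;
}

/** Reads local .properties files and copies every entry into the system properties. */
final class StandaloneConfigurationManager extends ConfigurationManager {
    StandaloneConfigurationManager(String component, List<String> propertiesFiles) {
        super(component, propertiesFiles);
    }

    @Override
    void loadConfiguration() throws IOException {
        for (String file : propertiesFiles) {
            Properties loaded = new Properties();
            try (InputStream in = Files.newInputStream(Paths.get(file))) {
                loaded.load(in);
            }
            for (String key : loaded.stringPropertyNames()) {
                System.setProperty(key, loaded.getProperty(key));
            }
        }
    }
}

/** Placeholder only: the Zookeeper/Curator access is deliberately omitted. */
final class DistributedConfigurationManager extends ConfigurationManager {
    DistributedConfigurationManager(String component, List<String> propertiesFiles) {
        super(component, propertiesFiles);
    }

    @Override
    void loadConfiguration() {
        throw new UnsupportedOperationException("Zookeeper access is not part of this sketch");
    }
}

/** Chooses the implementation from the system property named in the proposal. */
final class ConfigurationFactory {
    static final String DISTRIBUTED_FLAG = "org.apache.oodt.conf.zookeeper";

    private ConfigurationFactory() { }

    static ConfigurationManager getConfigurationManager(String component, List<String> propertiesFiles) {
        // Boolean.getBoolean is true only if the system property is set to "true".
        return Boolean.getBoolean(DISTRIBUTED_FLAG)
                ? new DistributedConfigurationManager(component, propertiesFiles)
                : new StandaloneConfigurationManager(component, propertiesFiles);
    }
}
```

With this shape, a component would be started with -Dorg.apache.oodt.conf.zookeeper=true to opt in. Without the flag, the factory falls back to the file-based behavior, which keeps the module optional.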

Later, this configuration manager class can be used to query the available components in the cluster and their configurations. This will also allow the developer to check which components are currently active and which are not (through ephemeral nodes in Zookeeper). The ZNode structure in Zookeeper is described in the next section.

Zookeeper (ZNode) Structure

All the information related to OODT components will be stored under the ZNode /oodt, and a separate ZNode will be created for each instance in the cluster. In the above structure, /oodt/node1 and /oodt/node2 are such examples, where node1 and node2 are the names of two nodes in the cluster. Inside those nodes, a separate ZNode will be created, as shown, for each component that is running inside that instance. The file manager (file-mgr) and resource manager (resmgr) ZNodes in the above structure are therefore dedicated to the file manager and resource manager components of node1.
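The path layout described in this section can be expressed as a small helper. This is a sketch under assumptions: the class ZNodePaths and its methods are hypothetical names introduced only for illustration, and only the /oodt root, the node names and the component names (file-mgr, resmgr) come from the text above.

```java
/** Hypothetical helper that builds the ZNode paths described in this section. */
final class ZNodePaths {
    static final String ROOT = "/oodt";

    private ZNodePaths() { }

    /** Path of the ZNode that represents one instance in the cluster. */
    static String instancePath(String nodeName) {
        return ROOT + "/" + requireSegment(nodeName);
    }

    /** Path of the ZNode for one component running inside an instance. */
    static String componentPath(String nodeName, String component) {
        return instancePath(nodeName) + "/" + requireSegment(component);
    }

    /** Rejects values that would not form exactly one path segment. */
    private static String requireSegment(String segment) {
        if (segment == null || segment.isEmpty() || segment.contains("/")) {
            throw new IllegalArgumentException("Invalid ZNode path segment: " + segment);
        }
        return segment;
    }

    public static void main(String[] args) {
        System.out.println(instancePath("node1"));              // /oodt/node1
        System.out.println(componentPath("node1", "file-mgr")); // /oodt/node1/file-mgr
        System.out.println(componentPath("node1", "resmgr"));   // /oodt/node1/resmgr
    }
}
```

Paths built this way could then be handed to a Zookeeper client such as Curator when the component registers itself.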

A separate module will be created for these configuration management implementations. As shown in the design, the ConfigurationManager interface acts as the API for configuration managers, so this design can be extended in the future to support other distributed databases and distributed coordination systems such as etcd (https://coreos.com/etcd/). Any module/component that wants to use this new configuration management mechanism should have a dependency on this module. As mentioned previously, a system property will determine at runtime whether that component uses the distributed or the standalone implementation of the configuration manager. I will be using Apache Curator as the client to connect to Apache Zookeeper.

What I have done so far

I have started implementing the design I have proposed. As the initial step, I defined the ConfigurationManager API. While implementing it, I realized that the two implementations of that interface share several major properties. Therefore, I changed ConfigurationManager to an abstract class. Furthermore, I have written the code to connect to the Zookeeper ensemble through Apache Curator’s CuratorFramework class. All of this work can be found in my OODT fork, https://github.com/IMS94/oodt. I believe the code implemented so far resolves any ambiguity in the design given above.

I have added a separate module for OODT configuration management. Any module that wants to use this feature should therefore add this module as a dependency. Currently, I have added it to the file manager module.

Time line

28th February - 3rd April	Getting familiar with the OODT project. Understanding how each component is configured. Coming up with a draft design. Writing the proposal and refining it based on feedback.
4th May	Accepted projects are announced.
5th May - 30th May	Reviewing the design in detail with my mentor. This includes cross-validating the design against the actual requirement, and improving the design to minimize the manual configuration required and to allow components to register/deregister on the fly.
30th May - 26th June	Week 1 & 2: Implementing the core of the ConfigurationManager API will be the major task. As I have already started on that, I will continue refining and implementing the logic inside DistributedConfigurationManager. Week 3: Creating proper test cases in parallel with the implementation in order to test the functionality of the DistributedConfigurationManager. Tests will be created using the curator-test package with test Zookeeper clusters. Week 4 & 5: Reviewing the architecture and the implemented classes to decide what the functionality of each class should be. The mentor’s feedback and suggestions will be the main input when deciding which functionality the APIs should expose.
26th June - 30th June	Preparing for phase 1 evaluations. Finalizing the implemented functionality and tests for the submission. Submissions for phase 1 evaluations.
30th June - 28th July	Week 1 & 2: Adding more functionality to the concrete implementations of ConfigurationManager. Adding proper state management to maintain consistency at runtime. Improving the API exposed by ConfigurationManager. Week 3: Implementing new test cases and improving those written in phase 1. Reviewing the implementation once again with the mentor to identify further required improvements. Week 4: Submissions for phase 2 evaluations.
28th July - 27th August	Week 1 & 2: Making the required improvements to the functionality and APIs, based on the improvements identified. Continuously reviewing and cross-validating the design, the implementation and the mentor’s objectives so that the implementation converges on the actual requirement. In parallel, I will start on the “How to use” documentation and the developer documentation while preparing the required design diagrams. Week 3: Validating the implementation with improved test cases (mostly integration tests similar to those from the earlier weeks). Validating the documentation with my mentor and improving it based on his feedback. Final week: Mostly kept free to be used in case of emergency. Refining and refactoring the code (if required) and the documentation for the final submission. Making the final submission.
29th August	Project timeline ends with submission of deliverables.

About Me

I, Imesha Sudasingha (S.A.I.M. Sudasingha), am a final year undergraduate at the University of Moratuwa, Sri Lanka. My most notable characteristics are learning new technologies, reading about the latest technologies, applying the knowledge and concepts I have learned to real world applications, and following best practices. Apart from that, I like working with different people so that I can gain more knowledge through their advice and experience. When it comes to developing a piece of software or a module, designing the architecture is the phase I like best. I personally believe that the design is the most critical part of developing software. The same attitude led me to choose this project, since it requires a lot of design work and architectural decisions. In addition, my previous experience with Apache Zookeeper and distributed systems helped me to understand this project.

As I have already made several open source contributions, as explained later in this proposal, I wanted to make a larger contribution. I decided to contribute to the Apache Software Foundation since I have been using many Apache open source products/libraries (e.g. Apache2 Server, Zookeeper, HttpComponents, Maven, Tomcat and Curator). This project caught my attention at first glance since it is related to Zookeeper and Java, which are two of the technologies I am most familiar with.

Experience in Zookeeper/Curator and Distributed Systems

I was an intern at AdroitLogic Lanka (PVT) Ltd (www.adroitlogic.com), where I wrote the entire cluster management module of their new product stack, project-x/ultraESB-x (https://www.adroitlogic.com/products/ultraesb/). That module included a distributed command framework written using Zookeeper’s watcher mechanism, as well as a failover support implementation. The documentation written for that module is available here. API of the module I wrote can be found here. The module I have written will replace the current failover support system of the 2nd most critical system in the Singapore Stock Exchange. Furthermore, I have been writing about Apache Zookeeper and Apache Curator (Apache Curator in 5 minutes, Network Partitioning in Zookeeper).

Open source contributions

I have contributed to Apache Curator twice (pull requests https://github.com/apache/curator/pull/175 and https://github.com/apache/curator/pull/177). One of them was an improvement to the curator-test module that allows the test servers to bind to network interfaces other than just localhost.

Other contributions

stackoverflow.com

I’m an active user of stackoverflow, where I have gained 2159 reputation (as of 23rd of March, 2017) within two years. Most of that reputation has been earned by answering questions. Having java and apache-zookeeper among the most popular tags in my stackoverflow profile shows that I have considerable knowledge of java and apache-zookeeper.

Apache Zookeeper and Curator Mailing lists

I have been active in both the Apache Curator and Zookeeper user and dev mailing lists for some time. I was mostly asking questions about the design of Zookeeper while implementing the cluster management module at AdroitLogic (PVT) Ltd, as mentioned above. Several mails I have sent in those mailing threads can be found in Apache Curator Mail Archives of November.

Other commitments during GSoC period

I usually have lectures on 3 days per week, so I have 4 full days for my own work. That was a main reason for me to apply for GSoC this summer, as I wanted to do something useful within this period. I will also have a 1 month long vacation in July. For all those reasons, I can afford around 40-50 hours per week on my GSoC project from 30th May to 27th August, the official coding period for GSoC projects.

Contact Information

LinkedIn - www.linkedin.com/in/imeshasudasingha

Github - https://github.com/IMS94

Stackoverflow - http://stackoverflow.com/users/4012073/imesha-sudasingha

Twitter - https://twitter.com/Imesha94

Medium.com - https://medium.com/@Imesha94

Why Me?

As I have described throughout the proposal, I have a lot of experience with Zookeeper and Apache Curator. In my experience, the most critical thing when working on distributed systems is handling the edge cases. That is, the concern is not the “happy day scenario” but inconsistencies in networks and session handling.

Furthermore, I have a good idea of what needs to be done in this project, and I think that is reflected in the proposed architecture and design. Please note that I have only included a draft design. The actual implementation will be more complex and consistent, as my mentor and I will review each step to refine the outcome as much as possible.

I have been an active open source contributor and an active participant on several Apache mailing lists. Therefore, I have a good understanding of how the Apache ecosystem and the open source culture work. For all these reasons, I am confident that I can complete this project in the best possible manner, adding more value to the OODT project in the future.

References

Apache Zookeeper - https://zookeeper.apache.org
Apache Curator - http://curator.apache.org
CoreOS etcd - https://coreos.com/etcd/
Apache Curator Mail Archives of November - http://mail-archives.apache.org/mod_mbox/curator-user/201611.mbox/browser

Rework OODT configuration to make use of Zookeeper for distributed configuration management
