Abstract
OODT consists of several components, such as the file manager, resource manager and workflow manager, and each of them has its own configuration files and locations. This is complex to manage and creates problems when the platform is distributed across servers or geographic locations. Therefore, the objective of this project is to migrate OODT configuration to an optional Zookeeper module, so that OODT components can register themselves in the Zookeeper ensemble and maintain each component’s state regardless of the scale of the cluster. The proposed Zookeeper module will minimize the manual configuration required when setting up OODT components.
Introduction
Apache Object Oriented Data Technology (OODT) is an open source data management system framework which was originally developed at NASA Jet Propulsion Laboratory to support capturing, processing and sharing of data for NASA's scientific archives. OODT provides three core components:
File Manager - Responsible for tracking file locations, their metadata and for transferring files from a staging area to controlled access storage
Workflow Manager - Responsible for capturing data flow and control flow for complex processes and allowing for reproducibility and the construction of scientific pipelines.
Resource Manager - Responsible for handling allocation of workflow tasks and other jobs to underlying resources.
Apart from those, OODT includes several other components, such as the file crawler (CAS-crawler), the push/pull framework (CAS-push/pull) and CAS-PGE (Catalog and Archive Service Production Generation Executive).
In addition to the goals given in the abstract, this module will support configuration inheritance at the component level. Consider the file manager as an example. Almost all of the configuration is identical across file manager instances. Therefore, file managers that join the cluster later will inherit the configuration of the initial file managers, which removes almost all of the manual configuration required when adding new nodes to the cluster.
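The inheritance idea can be illustrated with a minimal sketch. It assumes the shared configuration is available as a java.util.Properties object and that a new instance only supplies the few values that differ. The class name, method name and property keys below are hypothetical and are not part of OODT.

```java
import java.util.Properties;

/** Hypothetical sketch: a new file manager inherits shared settings and overrides only local ones. */
final class InheritedConfigSketch {

    private InheritedConfigSketch() {
    }

    /** Layers node-specific overrides on top of the shared component configuration. */
    static Properties inherit(Properties shared, Properties localOverrides) {
        // java.util.Properties falls back to its defaults when a key is not set locally.
        Properties effective = new Properties(shared);
        for (String key : localOverrides.stringPropertyNames()) {
            effective.setProperty(key, localOverrides.getProperty(key));
        }
        return effective;
    }

    public static void main(String[] args) {
        // Illustrative keys only; real file manager keys may differ.
        Properties shared = new Properties();
        shared.setProperty("example.catalog.url", "http://catalog:9000");
        shared.setProperty("example.server.port", "9000");

        Properties local = new Properties();
        local.setProperty("example.server.port", "9001");

        Properties effective = inherit(shared, local);
        System.out.println(effective.getProperty("example.catalog.url"));
        System.out.println(effective.getProperty("example.server.port"));
    }
}
```

In the proposed module, the shared values would come from Zookeeper rather than from an in-memory object, but the layering would work the same way.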
Deliverables
Completed distributed configuration management module
Implemented using Apache Zookeeper as the underlying distributed coordination system and Apache Curator as the client to connect to Zookeeper.
Unit tests and integration tests (if possible) for the new module
Tests to check the correct functionality of distributed configuration management module.
Integration tests with simulated OODT component behaviors.
Documentation on “How to use the distributed configuration management module”
By default, OODT will use the file-based configuration that is currently available. To enable the distributed configuration management module, users will have to perform a few configuration steps.
Developer documentation
Explaining the architecture and the reasons for each design decision with corresponding design diagrams.
Design and Implementation
Module Architecture
Since the file manager is the component that most critically requires distributed configuration management, I will use it as the reference component when describing the design.
As shown in the class diagram above, I will use the factory design pattern to obtain the matching configuration manager for a component. There are two types of configuration managers:
Standalone configuration manager - Created with the current configuration management logic. This configuration manager will read configuration from .properties files and similar local files.
Distributed configuration manager - The new addition, which stores configuration in Zookeeper and fetches it when needed. New components that come up on the fly will be able to fetch their configuration from Zookeeper and use it, with no need to configure each of them manually every time.
ConfigurationFactory will return the corresponding configuration manager by looking at a system property. That is, a user who wants to use the distributed configuration manager should set a system property (say org.apache.oodt.conf.zookeeper=true) telling the component to use distributed configuration management, so that ConfigurationFactory returns the DistributedConfigurationManager as the configuration manager. Calling the loadConfiguration() method of the returned configuration manager loads the system properties. The getConfigurationManager() method of ConfigurationFactory will take the component name and the configuration file names. The component name will be used to identify similar components in the cluster, and the configuration files will be used to fetch the configuration and to store it in Zookeeper.
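The factory selection described above could look roughly like the following. This is only a sketch under assumptions: the names ConfigurationFactory, ConfigurationManager, DistributedConfigurationManager, getConfigurationManager() and loadConfiguration() come from the design, while StandaloneConfigurationManager, the wrapper class, the method signatures and the empty method bodies are placeholders chosen for illustration.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical wrapper so the sketch fits in one file; not part of the actual module. */
final class ConfigurationFactorySketch {

    /** Example switch named in the design text. */
    static final String DISTRIBUTED_FLAG = "org.apache.oodt.conf.zookeeper";

    private ConfigurationFactorySketch() {
    }

    /** Common API of both configuration managers. */
    abstract static class ConfigurationManager {
        protected final String componentName;
        protected final List<String> configFiles;

        ConfigurationManager(String componentName, List<String> configFiles) {
            this.componentName = componentName;
            this.configFiles = configFiles;
        }

        /** Loads the component's configuration into system properties. */
        abstract void loadConfiguration();
    }

    /** Assumed name for the file-based implementation. */
    static final class StandaloneConfigurationManager extends ConfigurationManager {
        StandaloneConfigurationManager(String componentName, List<String> configFiles) {
            super(componentName, configFiles);
        }

        @Override
        void loadConfiguration() {
            // Placeholder: the real implementation would read the local .properties files.
        }
    }

    static final class DistributedConfigurationManager extends ConfigurationManager {
        DistributedConfigurationManager(String componentName, List<String> configFiles) {
            super(componentName, configFiles);
        }

        @Override
        void loadConfiguration() {
            // Placeholder: the real implementation would fetch the configuration from Zookeeper.
        }
    }

    static final class ConfigurationFactory {
        private ConfigurationFactory() {
        }

        /** Returns the distributed manager only when the system property is set to "true". */
        static ConfigurationManager getConfigurationManager(String componentName, String... configFiles) {
            List<String> files = Arrays.asList(configFiles);
            if (Boolean.getBoolean(DISTRIBUTED_FLAG)) {
                return new DistributedConfigurationManager(componentName, files);
            }
            return new StandaloneConfigurationManager(componentName, files);
        }
    }

    public static void main(String[] args) {
        System.setProperty(DISTRIBUTED_FLAG, "true");
        ConfigurationManager manager =
                ConfigurationFactory.getConfigurationManager("file-mgr", "filemgr.properties");
        manager.loadConfiguration();
        System.out.println(manager.getClass().getSimpleName());
    }
}
```

Because callers only depend on the ConfigurationManager type, a component's startup code stays the same whichever implementation the factory returns.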
Later, this configuration manager class can be used to query the available components in the cluster and their configurations. This will also allow the developer to check which components are currently active and which are not (through ephemeral nodes in zookeeper). The ZNode structure in zookeeper is described in the next section.
Zookeeper (ZNode) Structure
All the information related to OODT components will be stored under the ZNode /oodt, and a separate ZNode will be created for each instance in the cluster. In the above structure, /oodt/node1 and /oodt/node2 are such examples, where node1 and node2 are the names of two nodes in the cluster. Inside those nodes, a separate ZNode will be created, as shown, for each component running inside that instance. The file manager (file-mgr) and resource manager (resmgr) ZNodes in the above structure are therefore dedicated to the file manager and resource manager components of node1.
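The path layout can be expressed as a small helper. The sketch below assumes the /oodt/<node>/<component> layout described here; the class and method names are hypothetical, and the code only builds path strings without contacting Zookeeper.

```java
/** Hypothetical helper that builds ZNode paths for the /oodt/<node>/<component> layout. */
final class ZNodePathSketch {

    /** Root ZNode under which all OODT information is stored. */
    static final String ROOT = "/oodt";

    private ZNodePathSketch() {
    }

    /** Path of the ZNode that represents one instance (node) in the cluster. */
    static String nodePath(String nodeName) {
        return ROOT + "/" + requireSegment(nodeName);
    }

    /** Path of the ZNode dedicated to one component running on the given node. */
    static String componentPath(String nodeName, String componentName) {
        return nodePath(nodeName) + "/" + requireSegment(componentName);
    }

    /** Rejects names that would break the path hierarchy. */
    private static String requireSegment(String segment) {
        if (segment == null || segment.isEmpty() || segment.contains("/")) {
            throw new IllegalArgumentException("Invalid ZNode path segment: " + segment);
        }
        return segment;
    }

    public static void main(String[] args) {
        System.out.println(componentPath("node1", "file-mgr"));
        System.out.println(componentPath("node1", "resmgr"));
    }
}
```

Keeping path construction in one place would make it easier to change the layout later without touching each component.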
A separate module will be created for the configuration management implementations. As shown in the design, the ConfigurationManager interface acts as the API for configuration managers, so the design can be extended in future to support other distributed databases and distributed coordination systems such as etcd (https://coreos.com/etcd/). Any module or component that wants to use this new configuration management mechanism should declare a dependency on this module. As mentioned previously, a system property will determine at runtime whether that component uses the distributed or the standalone implementation of the configuration manager. I will be using Apache Curator as the client to connect to Apache Zookeeper.
What I have done so far
I have started implementing the proposed design. As the initial step, I defined the ConfigurationManager API. While implementing it, I realized that the two implementations of that interface share several major properties, so I changed ConfigurationManager to an abstract class. Furthermore, I have written the code to connect to the Zookeeper ensemble through Apache Curator’s CuratorFramework class. All of this work can be found in my OODT fork, https://github.com/IMS94/oodt. I believe the code implemented so far resolves any ambiguity in the design given above.
I have added a separate module for OODT configuration management. Any module that wants to use this feature should add this module as a dependency. Currently, I have added it to the file manager module.
Timeline
28th February - 3rd April
4th May
5th May - 30th May
20th May - 26th June
26th June - 30th June
30th June - 28th July
28th July - 27th August
29th August
About Me
I, Imesha Sudasingha (S.A.I.M. Sudasingha), am a final-year undergraduate at the University of Moratuwa, Sri Lanka. My strongest traits are learning and reading about new technologies, applying what I learn to real-world applications, and following best practices. I also like working with different people so that I can learn from their advice and experience. When developing software, designing the architecture is the phase I enjoy most, and I believe design is the most critical part of building software. That attitude led me to choose this project, since it requires many design and architectural decisions. In addition, my previous experience with Apache Zookeeper and distributed systems helped me to understand this project.
Having already made several open source contributions, as explained later in this proposal, I wanted to make a larger one. I decided to contribute to the Apache Software Foundation because I have been using many Apache open source products and libraries (e.g. Apache2 Server, Zookeeper, HttpComponents, Maven, Tomcat and Curator). This project caught my attention at first glance because it involves Zookeeper and Java, two of the technologies I am most familiar with.
Experience in Zookeeper/Curator and Distributed Systems
I was an intern at AdroitLogic Lanka (PVT) Ltd (www.adroitlogic.com), where I wrote the entire cluster management module of their new product stack, project-x/ultraESB-x (https://www.adroitlogic.com/products/ultraesb/). That module included a distributed command framework built on Zookeeper’s watcher mechanism, as well as a failover support implementation. The documentation written for that module is available here. The API of the module I wrote can be found here. The module will replace the current failover support system of the 2nd most critical system in Singapore Stock Exchange. Furthermore, I have been writing about Apache Zookeeper and Apache Curator (Apache Curator in 5 minutes, Network Partitioning in Zookeeper).
Open source contributions
I have contributed to Apache Curator twice (pull requests https://github.com/apache/curator/pull/175 and https://github.com/apache/curator/pull/177). One of them was an improvement to the curator-test module that allows the test servers to bind to network interfaces other than localhost.
Other contributions
stackoverflow.com
I’m an active user of stackoverflow, where I have gained 2159 reputation (as of 23rd of March, 2017) within two years. Most of that reputation has come from giving answers. java and apache-zookeeper are among the most popular tags on my stackoverflow profile, which reflects considerable knowledge of both.
Apache Zookeeper and Curator Mailing lists
I have been active on the user and dev mailing lists of both Apache Curator and Zookeeper for some time. I mostly asked questions about the design of Zookeeper while implementing the cluster management module at AdroitLogic (PVT) Ltd, as mentioned above. Several mails I sent to those lists can be found in the Apache Curator Mail Archives of November.
Other commitments during GSoC period
I usually have lectures on 3 days per week, which leaves 4 full days for my own work. That was a main reason for applying to GSoC this summer, as I wanted to do something useful within this period. I will also have a 1 month long vacation in July. For these reasons, I can afford around 40-50 hours per week on my GSoC project from 30th May to 27th August, the official GSoC coding period.
Contact Information
LinkedIn - www.linkedin.com/in/imeshasudasingha
Github - https://github.com/IMS94
Stackoverflow - http://stackoverflow.com/users/4012073/imesha-sudasingha
Twitter - https://twitter.com/Imesha94
Medium.com - https://medium.com/@Imesha94
Why Me?
As I have described throughout the proposal, I have a lot of experience in Zookeeper and Apache Curator. In my experience, the most critical thing when working on distributed systems is handling the edge cases. That is, the concern is not the “Happy day scenario” but network inconsistencies and session handling.
Furthermore, I have a good idea of what needs to be done in this project, and I think that is reflected in the proposed architecture and design. Please note that the design given here is only a draft. The actual implementation will be more complex and consistent, as my mentor and I will review each step to refine the outcome as much as possible.
I have been an active open source contributor and an active participant on several Apache mailing lists. Therefore, I have a good understanding of how the Apache ecosystem and the open source culture work. For all these reasons, I am confident that I can complete this project in the best possible manner and add more value to the OODT project in future.
References
Apache Zookeeper - https://zookeeper.apache.org
Apache Curator - http://curator.apache.org
CoreOS etcd - https://coreos.com/etcd/
Apache Curator Mail Archives of November - http://mail-archives.apache.org/mod_mbox/curator-user/201611.mbox/browser