...

  1. Automatic failover (for example, switching a request from an unresponsive name-node to the active name-node).
  2. Pluggable support for failover strategies.
  3. A daemon service that regularly pings Hadoop services to check their state (a performance optimization that keeps the known service state up to date).
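The daemon service in item 3 could be sketched roughly as follows. This is a minimal illustration, not Knox code: the class name ServiceStateMonitor, its methods, and the injected ping function are all assumptions made for the example. A background daemon thread refreshes a cached state map, so request handling only reads the cache instead of pinging a service on every request.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Predicate;

/** Hypothetical monitor (not a Knox class) that caches the last known state of each service. */
final class ServiceStateMonitor {
    private final List<String> urls;
    private final Predicate<String> ping; // assumed probe: returns true when the service answers
    private final Map<String, Boolean> state = new ConcurrentHashMap<>();
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor(runnable -> {
            Thread thread = new Thread(runnable, "ha-state-monitor");
            thread.setDaemon(true); // do not keep the JVM alive because of the monitor
            return thread;
        });

    ServiceStateMonitor(List<String> urls, Predicate<String> ping) {
        this.urls = urls;
        this.ping = ping;
    }

    /** Starts the periodic refresh; the first refresh runs immediately. */
    void start(long periodMillis) {
        scheduler.scheduleWithFixedDelay(this::refresh, 0, periodMillis, TimeUnit.MILLISECONDS);
    }

    /** Pings every service once and records the result; a failing probe counts as "down". */
    void refresh() {
        for (String url : urls) {
            boolean up;
            try {
                up = ping.test(url);
            } catch (RuntimeException e) {
                up = false;
            }
            state.put(url, up);
        }
    }

    /** Returns the cached state; services that were never pinged are reported as down. */
    boolean isUp(String url) {
        return state.getOrDefault(url, false);
    }

    void stop() {
        scheduler.shutdownNow();
    }
}
```

Catching the probe's runtime exceptions inside refresh matters here, because a scheduled task that throws is not rescheduled by ScheduledExecutorService.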

 

 

Provider configuration example

 

The provider configuration enables or disables the HA provider and binds a failover strategy to it. An alias is a list of Hadoop services (in our case the active and standby name-nodes) grouped into one entity.

 

Topology
<topology>
  <gateway>
    ...
    <provider>
      <role>ha</role>
      <name>HAProvider</name>
      <param>
        <name>webhdfs.ha</name>
        <value>failover_strategy=BaseStrategy;retryCount=3;timeoutInterval=5000;enabled=true</value>
      </param>
    </provider>
    ...
  </gateway>
  ...
  <service>
    <role>WEBHDFS</role>
    <url>machine1.example.com:50070</url>
    <url>machine2.example.com:50070</url>
  </service>
  ...
  <service>
    <role>NAMENODE</role>
    <url>machine1.example.com:50070</url>
    <url>machine2.example.com:50070</url>
  </service>
  ...
</topology>
  • failover_strategy – selects how the active service is determined and carries the strategy's configuration parameters. The default value is BaseStrategy, which has the following parameters:
      • retryCount – the number of times Knox pings a name-node before deciding that the name-node is down.
      • timeoutInterval – the connection timeout interval.
  • enabled – indicates whether HAProvider is active for the service.
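To make the parameter format concrete, here is a minimal sketch of how a value such as the webhdfs.ha string could be split into its key/value pairs. The class HaParamParser is hypothetical and is not part of Knox; it only assumes the "key=value;key=value" layout shown in the topology example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not a Knox class): parses "k1=v1;k2=v2" parameter strings. */
final class HaParamParser {

    private HaParamParser() {}

    /** Splits the value on ';' and each entry on its first '=', keeping the original order. */
    static Map<String, String> parse(String value) {
        Map<String, String> params = new LinkedHashMap<>();
        if (value == null) {
            return params;
        }
        for (String entry : value.split(";")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate a trailing ';'
            }
            int eq = trimmed.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Expected key=value but got: " + trimmed);
            }
            params.put(trimmed.substring(0, eq).trim(), trimmed.substring(eq + 1).trim());
        }
        return params;
    }
}
```

For the example value, parse returns a map in which failover_strategy is "BaseStrategy", retryCount is "3", timeoutInterval is "5000" and enabled is "true"; converting the strings to numbers or booleans is left to the caller.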

 


Architecture

 

A new provider, a subclass of ProviderDeploymentContributorBase, will be added together with a set of filters. See Pic. #1 for the overall architecture.

Pic. #1 – Providers architecture

 

Definition.

Alias – a set of Hadoop name-nodes configured for High Availability mode.

 

Definition.

High Availability Strategy – the plan for determining the active name-node and for switching between the active and standby name-nodes. A strategy may have parameters such as retryCount and timeoutInterval. See Pic. #2 for the HA mode class diagram.

 

Pic. #2 – Class diagram for HA mode.
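As an illustration of the strategy definition, the sketch below shows one way a pluggable strategy could pick the active name-node from an alias. It is an assumption-laden example, not the design's actual classes: FailoverStrategy, BaseStrategySketch and the injected ping function are hypothetical names. Each URL in the alias is pinged up to retryCount times before the next one is tried; applying timeoutInterval is left to the ping function.

```java
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical strategy contract; the name is illustrative and not taken from Knox. */
interface FailoverStrategy {
    /** Returns the URL from the alias that should receive the next request. */
    String resolveActive(List<String> alias);
}

/** Sketch of a base strategy: try the URLs in order, pinging each up to retryCount times. */
final class BaseStrategySketch implements FailoverStrategy {
    private final int retryCount;
    private final Predicate<String> ping; // assumed probe: true when the name-node answers in time

    BaseStrategySketch(int retryCount, Predicate<String> ping) {
        if (retryCount < 1) {
            throw new IllegalArgumentException("retryCount must be at least 1");
        }
        this.retryCount = retryCount;
        this.ping = ping;
    }

    @Override
    public String resolveActive(List<String> alias) {
        for (String url : alias) {
            for (int attempt = 0; attempt < retryCount; attempt++) {
                if (ping.test(url)) {
                    return url; // first responding name-node is treated as active
                }
            }
            // retryCount failed pings: consider this name-node down and fail over
        }
        throw new IllegalStateException("No name-node in the alias responded");
    }
}
```

Because the strategy is an interface, a different plan (for example, one that consults a cached service state) could be plugged in without changing the callers.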

 

 

See Table #1 for a description of the classes.

Table #1 – Description of the new HA mode classes.

| # | Class name | Description |
|---|---|---|
| 1 | HaUrlRewriteFunctionDescriptor | Describes the function that resolves URLs in HA mode. |
| 2 | HaUrlRewriteFunctionProcessor | Implements the main logic for determining the active or standby URL. |
| 3 | HaBaseStrategyHostMapper | Implements the base strategy for HA mode. Parameters: retryCount, timeoutInterval. |

 

See Pic. #3 for the UML sequence diagram for UrlRewriteProcessor.

Pic. #3 – UML sequence diagram for UrlRewriteProcessor.

Provider configuration example

See the WebHDFS HA section of the Knox User's Guide: http://knox.apache.org/books/knox-0-5-0/knox-0-5-0.html#WebHDFS