Eurosys 2011 Tutorial
Tutorial slides
Code and library
To complete this tutorial you will need the ZooKeeper 3.3.3 "fat" jar. This has all you need to run zookeeper and develop against it.
Example skeleton code:
Running the ZooKeeper server
The easiest way to run ZooKeeper, and the way that is often used for development is to use it in "standalone" mode. This starts up a single ZooKeeper server, so it is not fault tolerant. For example:
java -jar zookeeper-3.3.3-fatjar.jar server 2181 /tmp/zkdata
starts a ZooKeeper server running on port 2181. It is storing its data in /tmp, so this is obviously not a way you want to run in production.
If you want to startup a cluster of server, you must do a bit more. To do this we need to create a configuration file for each server. Generally, we can share a configuration file across server, but since we are starting all three on the same machine, we need to have a configuration for each server:
dataDir=/tmp/1 clientPort=2181 initLimit=3 syncLimit=3 server.1:127.0.0.1:2221:3331 server.2:127.0.0.1:2222:3332 server.3:127.0.0.1:2223:3333
dataDir=/tmp/2 clientPort=2182 initLimit=3 syncLimit=3 server.1:127.0.0.1:2221:3331 server.2:127.0.0.1:2222:3332 server.3:127.0.0.1:2223:3333
dataDir=/tmp/3 clientPort=2183 initLimit=3 syncLimit=3 server.1:127.0.0.1:2221:3331 server.2:127.0.0.1:2222:3332 server.3:127.0.0.1:2223:3333
A ZooKeeper server figures out which server it is by looking in the dataDir for a file called myid that contains its identity, so lets set those up:
mkdir /tmp/1 echo 1 > /tmp/1/myid mkdir /tmp/2 echo 2 > /tmp/2/myid mkdir /tmp/3 echo 3 > /tmp/3/myid
now lets startup the 3 servers:
java -jar zookeeper-3.3.3-fatjar.jar server server1.cfg & java -jar zookeeper-3.3.3-fatjar.jar server server2.cfg & java -jar zookeeper-3.3.3-fatjar.jar server server3.cfg &
you will see a bunch of messages which will eventually stop with something along the lines of Snapshotting: 100000000
.
Setting things up for the example application
We need to create two znodes on our server: /assign and /tasks. Conveniently, the client is also included in the "fat" jar. It even has a bit of a shell. We start up the client with:
java -jar zookeeper-3.3.3-fatjar.jar client -server 127.0.0.1:2181
_if you started up a cluster of servers, you would use 127.0.0.1:2181,127.0.0.1:2182,127.0.0.1:2183 instead of just 127.0.0.1:2181.
Now lets create the znodes. We have to supply initial data for them even though it isn't used:
[zk: 127.0.0.1:2181(CONNECTED) 0] create /assign "" Created /assign [zk: 127.0.0.1:2181(CONNECTED) 1] create /tasks "" Created /tasks [zk: 127.0.0.1:2181(CONNECTED) 2] ls / [assign, tasks, zookeeper]
Extra credit (testing)
The code you have written works and is dynamic, but doesn't handle every error correctly. Check what happens when errors occur: change the TaskQueueWorker.java to use FailFirstZooKeeper.java and see what happens. (When submitting multiple tasks you will find that one gets assigned to a phantom worker.) How do you fix it?