What is Bigtop Sandbox?
A handy tool to build and run big data pseudo clusters atop Docker.
How to run
Make sure you have Docker installed. We've tested this using Docker for Mac.
Currently supported OS list:
- debian-8
- ubuntu-16.04
Run Hadoop HDFS
docker run -d -p 50070:50070 bigtop/sandbox:1.2.1-ubuntu-16.04-hdfs
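Provisioning HDFS takes around 30 seconds. To watch its progress, capture the container ID when starting the container (use this in place of the plain docker run above, since both bind port 50070) and follow the output with docker logs: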
BIGTOP=$(docker run -d -p 50070:50070 bigtop/sandbox:1.2.1-ubuntu-16.04-hdfs)
docker logs -f $BIGTOP
Warning: This method is deprecated, please use the stdlib validate_legacy function, with Stdlib::Compat::Hash. There is further documentation for validate_legacy function in the README. (at /etc/puppet/modules/stdlib/lib/puppet/functions/deprecation.rb:25:in `deprecation')
Warning: This method is deprecated, please use match expressions with Stdlib::Compat::Bool instead. They are described at https://docs.puppet.com/puppet/latest/reference/lang_data_type.html#match-expressions. (at /etc/puppet/modules/stdlib/lib/puppet/functions/deprecation.rb:25:in `deprecation')
Warning: This method is deprecated, please use match expressions with Stdlib::Compat::Array instead. They are described at https://docs.puppet.com/puppet/latest/reference/lang_data_type.html#match-expressions. (at /etc/puppet/modules/stdlib/lib/puppet/functions/deprecation.rb:25:in `deprecation')
Notice: Scope(Class[Node_with_components]): Roles to deploy: [namenode, datanode]
Warning: This method is deprecated, please use the stdlib validate_legacy function, with Pattern[]. There is further documentation for validate_legacy function in the README. (at /etc/puppet/modules/stdlib/lib/puppet/functions/deprecation.rb:25:in `deprecation')
Warning: This method is deprecated, please use the stdlib validate_legacy function, with Stdlib::Compat::Bool. There is further documentation for validate_legacy function in the README. (at /etc/puppet/modules/stdlib/lib/puppet/functions/deprecation.rb:25:in `deprecation')
Warning: This method is deprecated, please use the stdlib validate_legacy function, with Stdlib::Compat::String. There is further documentation for validate_legacy function in the README. (at /etc/puppet/modules/stdlib/lib/puppet/functions/deprecation.rb:25:in `deprecation')
Warning: This method is deprecated, please use match expressions with Stdlib::Compat::Numeric instead. They are described at https://docs.puppet.com/puppet/latest/reference/lang_data_type.html#match-expressions. (at /etc/puppet/modules/stdlib/lib/puppet/functions/deprecation.rb:25:in `deprecation')
Notice: Compiled catalog for 9c26fcceafad.local in environment production in 1.45 seconds
Notice: Baseurl: http://repos.bigtop.apache.org/releases/1.2.1/ubuntu/16.04/x86_64
Notice: /Stage[main]/Bigtop_repo/Notify[Baseurl: http://repos.bigtop.apache.org/releases/1.2.1/ubuntu/16.04/x86_64]/message: defined 'message' as 'Baseurl: http://repos.bigtop.apache.org/releases/1.2.1/ubuntu/16.04/x86_64'
Notice: /Stage[main]/Bigtop_repo/Exec[bigtop-apt-update]/returns: executed successfully
Notice: /Stage[main]/Hadoop::Common_hdfs/File[/etc/hadoop/conf/core-site.xml]/content: content changed '{md5}71506958747641d1a5def83b021e7f75' to '{md5}ce32af59eb015a3bb3774d375be10f11'
Notice: /Stage[main]/Hadoop::Common_hdfs/File[/etc/hadoop/conf/hdfs-site.xml]/content: content changed '{md5}784883dd654527ae577de19ecdec0992' to '{md5}ddc0a621878650832f30eb9690aa7565'
Notice: /Stage[main]/Hadoop::Namenode/Service[hadoop-hdfs-namenode]/ensure: ensure changed 'stopped' to 'running'
Notice: /Stage[main]/Hadoop::Datanode/File[/data/1/hdfs]/mode: mode changed '0700' to '0755'
Notice: /Stage[main]/Hadoop::Datanode/File[/data/2/hdfs]/mode: mode changed '0700' to '0755'
Notice: /Stage[main]/Hadoop::Datanode/Service[hadoop-hdfs-datanode]/ensure: ensure changed 'stopped' to 'running'
Notice: /Stage[main]/Hadoop::Init_hdfs/Exec[init hdfs]/returns: executed successfully
Notice: Finished catalog run in 29.46 seconds
Once provisioning has finished, go to http://localhost:50070 to see the HDFS web UI.
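Instead of watching the log by hand, the check can be scripted. The following is a minimal sketch, assuming provisioning is complete once the Puppet log prints its final "Finished catalog run" line, as in the sample output. The function names provisioned and wait_for_sandbox and the 120-second default timeout are illustrative and not part of Bigtop.

```shell
#!/usr/bin/env bash

# Hypothetical helper: reads log text on stdin and succeeds once the
# final Puppet line ("Finished catalog run") has appeared.
provisioned() {
  grep -q 'Finished catalog run'
}

# Hypothetical helper: poll `docker logs` every 2 seconds until the sandbox
# is provisioned, or give up after the timeout (seconds, default 120).
wait_for_sandbox() {
  local container="$1"
  local timeout="${2:-120}"
  local waited=0
  until docker logs "$container" 2>&1 | provisioned; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "sandbox not provisioned after ${timeout}s" >&2
      return 1
    fi
    sleep 2
    waited=$((waited + 2))
  done
}

# Usage (assumes $BIGTOP holds the container ID returned by docker run):
#   wait_for_sandbox "$BIGTOP" 120 && echo "HDFS web UI should be ready"
```

Matching on the last log line is a heuristic: if a later release changes the Puppet output, the pattern in provisioned needs to be adjusted.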
To destroy the container:
docker stop $BIGTOP
docker rm $BIGTOP
Run Hadoop HDFS + HBase
BIGTOP=$(docker run -d -p 50070:50070 -p 16010:16010 bigtop/sandbox:1.2.1-ubuntu-16.04-hdfs_hbase)
docker exec -ti $BIGTOP hbase shell
Run Hadoop HDFS + Spark Standalone
BIGTOP=$(docker run -d -p 50070:50070 -p 8080:8080 bigtop/sandbox:1.2.1-ubuntu-16.04-hdfs_spark-standalone)
docker exec -ti $BIGTOP spark-shell
Run Hadoop HDFS + YARN + Hive + Pig
BIGTOP=$(docker run -d -p 50070:50070 -p 8088:8088 bigtop/sandbox:1.2.1-ubuntu-16.04-hdfs_yarn_hive_pig)
docker exec -ti $BIGTOP hive
docker exec -ti $BIGTOP pig
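The image tags used in the commands above all follow the same shape: version, OS, then the components joined by underscores. The sketch below composes such a tag. The pattern is inferred from the tags shown in this README rather than from a documented naming rule, and sandbox_image is a hypothetical helper name, so check the list of available sandboxes before relying on a generated tag.

```shell
#!/usr/bin/env bash

# Hypothetical helper: build an image reference of the form
#   bigtop/sandbox:<version>-<os>-<component>_<component>...
# Arguments: version, OS, then one or more component names.
sandbox_image() {
  local version="$1"
  local os="$2"
  shift 2
  # Join the remaining arguments with "_" (first character of IFS).
  local IFS=_
  echo "bigtop/sandbox:${version}-${os}-$*"
}

# Usage:
#   BIGTOP=$(docker run -d -p 50070:50070 -p 16010:16010 \
#     "$(sandbox_image 1.2.1 ubuntu-16.04 hdfs hbase)")
```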
How to build
Download Bigtop
Go to http://bigtop.apache.org/download.html#releases and download the latest Bigtop release. After downloading, extract the archive and change to the sandbox directory:
tar zxvf bigtop-1.2.1-project.tar.gz
cd bigtop-1.2.1/docker/sandbox
Build a Hadoop HDFS sandbox image
./build.sh -a bigtop -o ubuntu-16.04 -c hdfs
Build a Hadoop HDFS, Hadoop YARN, and Spark on YARN sandbox image
./build.sh -a bigtop -o ubuntu-16.04 -c "hdfs, yarn, spark"
Build a Hadoop HDFS and HBase sandbox image
./build.sh -a bigtop -o ubuntu-16.04 -c "hdfs, hbase"
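To build several images in one go, the build.sh calls above can be wrapped in a loop. This is only a sketch, assuming it is run from the docker/sandbox directory. build_sandboxes is a hypothetical helper name and uses only the -a, -o, and -c options shown in this README.

```shell
#!/usr/bin/env bash

# Hypothetical helper: build one sandbox image per component set.
# Arguments: OS name, then one quoted component list per image.
build_sandboxes() {
  local os="$1"
  shift
  local components
  for components in "$@"; do
    echo "Building sandbox for: ${components}"
    # Stop at the first failed build so errors are not buried.
    ./build.sh -a bigtop -o "$os" -c "$components" || return 1
  done
}

# Usage:
#   build_sandboxes ubuntu-16.04 "hdfs" "hdfs, hbase" "hdfs, yarn, spark"
```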
Use --dryrun to skip the build and only generate the Dockerfile and configuration
./build.sh -a bigtop -o ubuntu-16.04 -c "hdfs, hbase" --dryrun
Change the repository of packages
export REPO=https://ci.bigtop.apache.org/view/Packages/job/Bigtop-trunk-repos/OS=ubuntu-16.04,label=docker-slave/ws/output/apt/
./build.sh -a bigtop -o ubuntu-16.04 -c "hdfs, yarn, spark, ignite"
Customize your Big Data Stack
vim site.yaml.template.debian-8_hadoop # Configure your own stack
./build.sh -a bigtop -o debian-8 -f site.yaml.template.debian-8_hadoop -t my_hadoop_stack
Known issues
Fail to start daemons using systemd
Because systemd requires CAP_SYS_ADMIN, an OS that uses systemd currently cannot start daemons at image build time.
Daemons can be brought up only when the container is started with docker run --privileged.
Please read the doc here.
Available Sandboxes: https://hub.docker.com/r/bigtop/sandbox/tags/
Build status: https://ci.bigtop.apache.org/view/Docker/job/Docker-Sandbox/