What is Bigtop Sandbox?
A handy tool for building and running big data pseudo-clusters atop Docker.
How to run
Make sure you have Docker installed. We've tested this using Docker for Mac.
Currently supported OS list:
- debian-8
- ubuntu-16.04
Run Hadoop HDFS
docker run -d -p 50070:50070 bigtop/sandbox:1.2.1-ubuntu-16.04-hdfs
For HDFS, provisioning takes around 30 seconds. You can use docker logs to check whether the container has been provisioned:
BIGTOP=$(docker run -d -p 50070:50070 bigtop/sandbox:1.2.1-ubuntu-16.04-hdfs)
docker logs -f $BIGTOP
Warning: This method is deprecated, please use the stdlib validate_legacy function, with Stdlib::Compat::Hash. There is further documentation for validate_legacy function in the README. (at /etc/puppet/modules/stdlib/lib/puppet/functions/deprecation.rb:25:in `deprecation')
Warning: This method is deprecated, please use match expressions with Stdlib::Compat::Bool instead. They are described at https://docs.puppet.com/puppet/latest/reference/lang_data_type.html#match-expressions. (at /etc/puppet/modules/stdlib/lib/puppet/functions/deprecation.rb:25:in `deprecation')
Warning: This method is deprecated, please use match expressions with Stdlib::Compat::Array instead. They are described at https://docs.puppet.com/puppet/latest/reference/lang_data_type.html#match-expressions. (at /etc/puppet/modules/stdlib/lib/puppet/functions/deprecation.rb:25:in `deprecation')
Notice: Scope(Class[Node_with_components]): Roles to deploy: [namenode, datanode]
Warning: This method is deprecated, please use the stdlib validate_legacy function, with Pattern[]. There is further documentation for validate_legacy function in the README. (at /etc/puppet/modules/stdlib/lib/puppet/functions/deprecation.rb:25:in `deprecation')
Warning: This method is deprecated, please use the stdlib validate_legacy function, with Stdlib::Compat::Bool. There is further documentation for validate_legacy function in the README. (at /etc/puppet/modules/stdlib/lib/puppet/functions/deprecation.rb:25:in `deprecation')
Warning: This method is deprecated, please use the stdlib validate_legacy function, with Stdlib::Compat::String. There is further documentation for validate_legacy function in the README. (at /etc/puppet/modules/stdlib/lib/puppet/functions/deprecation.rb:25:in `deprecation')
Warning: This method is deprecated, please use match expressions with Stdlib::Compat::Numeric instead. They are described at https://docs.puppet.com/puppet/latest/reference/lang_data_type.html#match-expressions. (at /etc/puppet/modules/stdlib/lib/puppet/functions/deprecation.rb:25:in `deprecation')
Notice: Compiled catalog for 9c26fcceafad.local in environment production in 1.45 seconds
Notice: Baseurl: http://repos.bigtop.apache.org/releases/1.2.1/ubuntu/16.04/x86_64
Notice: /Stage[main]/Bigtop_repo/Notify[Baseurl: http://repos.bigtop.apache.org/releases/1.2.1/ubuntu/16.04/x86_64]/message: defined 'message' as 'Baseurl: http://repos.bigtop.apache.org/releases/1.2.1/ubuntu/16.04/x86_64'
Notice: /Stage[main]/Bigtop_repo/Exec[bigtop-apt-update]/returns: executed successfully
Notice: /Stage[main]/Hadoop::Common_hdfs/File[/etc/hadoop/conf/core-site.xml]/content: content changed '{md5}71506958747641d1a5def83b021e7f75' to '{md5}ce32af59eb015a3bb3774d375be10f11'
Notice: /Stage[main]/Hadoop::Common_hdfs/File[/etc/hadoop/conf/hdfs-site.xml]/content: content changed '{md5}784883dd654527ae577de19ecdec0992' to '{md5}ddc0a621878650832f30eb9690aa7565'
Notice: /Stage[main]/Hadoop::Namenode/Service[hadoop-hdfs-namenode]/ensure: ensure changed 'stopped' to 'running'
Notice: /Stage[main]/Hadoop::Datanode/File[/data/1/hdfs]/mode: mode changed '0700' to '0755'
Notice: /Stage[main]/Hadoop::Datanode/File[/data/2/hdfs]/mode: mode changed '0700' to '0755'
Notice: /Stage[main]/Hadoop::Datanode/Service[hadoop-hdfs-datanode]/ensure: ensure changed 'stopped' to 'running'
Notice: /Stage[main]/Hadoop::Init_hdfs/Exec[init hdfs]/returns: executed successfully
Notice: Finished catalog run in 29.46 seconds
Once provisioning finishes, go to http://localhost:50070 and you'll see that the web UI is ready.
To destroy the container:
docker stop $BIGTOP
docker rm $BIGTOP
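The run/check/destroy steps above can be wrapped in a couple of helpers. The sketch below is our own, not something shipped with Bigtop: it relies only on the "Finished catalog run" line that Puppet prints (visible in the sample log above) to decide when the container is ready.

```shell
# Sketch: lifecycle helpers for the sandbox (function names are ours,
# not part of the Bigtop repo).

# True once the container's log reports that Puppet finished its catalog run.
provisioned() {
  docker logs "$1" 2>&1 | grep -q 'Finished catalog run'
}

# Block until the given container is provisioned, polling every 5 seconds.
wait_for_sandbox() {
  until provisioned "$1"; do
    sleep 5
  done
}
```

With these defined, the HDFS example becomes `BIGTOP=$(docker run -d -p 50070:50070 bigtop/sandbox:1.2.1-ubuntu-16.04-hdfs); wait_for_sandbox "$BIGTOP"`, after which the web UI at http://localhost:50070 should be up.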
Run Hadoop HDFS + HBase
docker run -d -p 50070:50070 -p 60010:60010 bigtop/sandbox:1.2.1-ubuntu-16.04_hdfs_hbase
Run Hadoop HDFS + Spark Standalone
docker run -d -p 50070:50070 -p 8088:8088 bigtop/sandbox:1.2.1-ubuntu-16.04_hdfs_spark-standalone
Run Hadoop HDFS + YARN + Hive + Pig
docker run -d -p 50070:50070 -p 8080:8080 bigtop/sandbox:1.2.1-ubuntu-16.04_hdfs_yarn_hive_pig
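Note the tag convention in the commands above: multi-component images join the component names with underscores after the OS (while the single-component HDFS image uses a plain hyphen). A tiny helper, purely our own sketch, can compose such multi-component tags:

```shell
# Sketch: build a multi-component sandbox tag such as
# bigtop/sandbox:1.2.1-ubuntu-16.04_hdfs_hbase (this helper is ours,
# not part of the Bigtop repo).
sandbox_tag() {
  version=$1
  os=$2
  shift 2
  # Join the remaining arguments (component names) with underscores.
  components=$(echo "$*" | tr ' ' '_')
  echo "bigtop/sandbox:${version}-${os}_${components}"
}
```

For example, `sandbox_tag 1.2.1 ubuntu-16.04 hdfs spark-standalone` prints the tag used in the Spark Standalone command above.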
How to build
Build a Hadoop HDFS sandbox image
./build.sh -a bigtop -o centos-6 -c hdfs
Build a Hadoop HDFS, Hadoop YARN, and Spark on YARN sandbox image
./build.sh -a bigtop -o debian-8 -c "hdfs, yarn, spark"
Build a Hadoop HDFS and HBase sandbox image
./build.sh -a bigtop -o ubuntu-16.04 -c "hdfs, hbase"
Use --dryrun to skip the actual build and just generate the Dockerfile and configuration
./build.sh -a bigtop -o ubuntu-16.04 -c "hdfs, hbase" --dryrun
Change the repository of packages
- Change the repository to Bigtop's nightly centos-6 repo
export REPO=http://ci.bigtop.apache.org:8080/job/Bigtop-trunk-repos/BUILD_ENVIRONMENTS=centos-6%2Clabel=docker-slave-06//ws/output
./build.sh -a bigtop -o centos-6 -c "hdfs, yarn, spark, ignite"
Customize your Big Data Stack
- Edit site.yaml.template.centos-6_hadoop to create your own preferred stack
cp site.yaml.template.centos-6_hadoop site.yaml.template.centos-6_hadoop_ignite
vim site.yaml.template.centos-6_hadoop_ignite
- Add ignite to the hadoop_cluster_node::cluster_components array and leave the others unchanged
...
hadoop_cluster_node::cluster_components: [hdfs, yarn, ignite]
...
- Build
./build.sh -a bigtop -o centos-6 -f site.yaml.template.centos-6_hadoop_ignite -t my_ignite_stack
Known issues
Fail to start daemons using systemd
Since systemd requires CAP_SYS_ADMIN, any OS that uses systemd currently cannot start its daemons during image build time.
Daemons can be brought up only if --privileged is specified on the docker run command.
Please read the doc here.
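As a sketch of the workaround (the helper name below is our own illustration, not from the repo), the extra flag just needs to be added to the docker run invocation; this helper prints the command so you can inspect it before piping it to sh:

```shell
# Sketch (our helper): print a docker run command that grants the
# privileges systemd needs; pipe the output to sh to actually run it.
run_privileged() {
  echo docker run -d --privileged -p 50070:50070 "$1"
}
```

Usage: `run_privileged bigtop/sandbox:<tag> | sh`, substituting a real tag from the list linked below.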
Available Sandboxes: https://hub.docker.com/r/bigtop/sandbox/tags/
Build status: https://ci.bigtop.apache.org/view/Docker/job/Docker-Sandbox/