The fuse-dfs project allows Ozone to be mounted (on most flavors of Unix) as a standard file system using the mount command. Once mounted, the user can operate on an Ozone instance using standard Unix utilities such as 'ls', 'cd', 'cp', 'mkdir', 'find', and 'grep', or use standard POSIX calls such as open, write, read, and close from C, C++, Python, Ruby, Perl, Java, bash, etc.
Note that a useful property of FUSE is that a fuse mount can be exported over NFS, so you can use fuse-dfs to mount Ozone on one machine and then export that mount using NFS. The drawback is that fuse relies on the kernel's inode cache, since fuse is path-based and not inode-based. If an inode is flushed from the kernel cache on the server, NFS clients fail: they attempt a read or an open with an inode the server no longer has a mapping for. So, while the NFS route gets you started quickly, for production it is more robust to automount fuse on every machine that needs access to Ozone.
Supported Operating Systems
Linux 2.4, 2.6, FreeBSD, NetBSD, Mac OS X, Windows, OpenSolaris. See: http://fuse.sourceforge.net/wiki/index.php/OperatingSystems
Fuse-DFS
Supports reads, writes, and directory operations (e.g., cp, ls, more, cat, find, less, rm, mkdir, mv, rmdir). Things like touch, chmod, chown, and permissions are in the works. Fuse-dfs currently shows all files as owned by nobody.
CONTRIBUTING
It's pretty straightforward to add functionality to fuse-dfs, as fuse makes things relatively simple. Some tasks also require augmenting libhdfs to expose more HDFS/Ozone functionality to C. See the contrib/fuse-dfs JIRAs.
REQUIREMENTS
- Hadoop with compiled libhdfs.so and compiled fuse.
- Linux kernel > 2.6.9 with fuse (the default), or Fuse 2.7.x or 2.8.x installed. See: http://fuse.sourceforge.net/ or even easier: http://dag.wieers.com/rpm/packages/fuse/
- modprobe fuse to load it
- fuse-dfs executable (see below)
- fuse_dfs_wrapper.sh installed in /bin or other appropriate location (see below)
- Compiled Ozone (See: https://cwiki.apache.org/confluence/display/HADOOP/Single+Node+Deployment)
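One way to confirm the fuse requirement before going further is to look for fuse in the kernel's list of registered file systems. The sketch below assumes a Linux host where /proc/filesystems is readable; the function name fuse_registered is made up for this example, and the optional file argument exists only so the check can be tried against a sample listing.

```shell
# Hypothetical helper (not part of fuse-dfs): return 0 if "fuse" is listed
# as a file system type. The listing defaults to /proc/filesystems (Linux).
fuse_registered() {
    listing="${1:-/proc/filesystems}"
    # Each line is "[nodev]<TAB>name"; compare the last field exactly so
    # that "fuseblk" and "fusectl" do not count as a match.
    awk '$NF == "fuse" { found = 1 } END { exit !found }' "$listing"
}
```

For example, running fuse_registered || echo "fuse not loaded, try: modprobe fuse" prints the hint only when the module is not registered.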
BUILDING
You can compile fuse-dfs with Maven. Execute the following command from this path:
{HADOOP_HOME}/hadoop-hdfs-project/hadoop-hdfs-native-client
Command:
mvn compile -Pnative -Drequire.fuse=true -DskipTests -Dmaven.javadoc.skip=true
INSTALLING
mkdir /export/ozone (or wherever you want to mount it)
(Note: common problems are that libhdfs.so or libjvm.so is not on your LD_LIBRARY_PATH, or that your CLASSPATH does not contain the hadoop and other required jars.)
Also note that fuse-dfs writes error/warn messages to the syslog, typically in /var/log/messages.
You can use fuse-dfs to mount multiple Ozone instances by changing the server/port name and the mount point directory in the mount command.
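Because a missing libhdfs.so or libjvm.so is the most common installation problem, it can help to check LD_LIBRARY_PATH before mounting. The following is a minimal sketch; the function names find_in_path_list and check_fuse_dfs_libs are invented for illustration, and the check only looks for regular files with those exact names.

```shell
# Hypothetical helper: print the first directory in a colon-separated
# path list that contains the named file; return 1 if none does.
find_in_path_list() {
    name="$1"
    list="$2"
    old_ifs="$IFS"
    IFS=:
    for dir in $list; do
        if [ -n "$dir" ] && [ -f "$dir/$name" ]; then
            IFS="$old_ifs"
            printf '%s\n' "$dir"
            return 0
        fi
    done
    IFS="$old_ifs"
    return 1
}

# Hypothetical helper: report whether libhdfs.so and libjvm.so can be found.
# Uses the first argument as the path list, else LD_LIBRARY_PATH.
check_fuse_dfs_libs() {
    path_list="${1:-${LD_LIBRARY_PATH:-}}"
    status=0
    for lib in libhdfs.so libjvm.so; do
        if where=$(find_in_path_list "$lib" "$path_list"); then
            echo "found $lib in $where"
        else
            echo "missing $lib on library path"
            status=1
        fi
    done
    return "$status"
}
```

Calling check_fuse_dfs_libs with no arguments inspects the current LD_LIBRARY_PATH and returns non-zero if either library is missing.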
DEPLOYING
In a root shell, execute the following:
fuse_dfs <Ozone-URI> <absolute path of the directory on which Ozone is to be mounted>
E.g.: ./fuse_dfs o3fs://bucket1.vol1.127.0.0.1:9862/ /export/ozone
After this, Ozone should be mounted on your directory.
You can find fuse_dfs at: {HADOOP_HOME}/hadoop-hdfs-project/hadoop-hdfs-native-client/target/main/native/fuse-dfs/
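Since fuse_dfs takes a URI followed by an absolute mount directory, a small pre-check can catch argument mistakes before the mount is attempted. This is only a sketch: check_mount_args is a hypothetical name, and it assumes the URI uses the o3fs scheme as in the example above.

```shell
# Hypothetical helper: validate the two fuse_dfs arguments.
# Prints a reason and returns 1 on the first problem found.
check_mount_args() {
    uri="$1"
    mount_point="$2"
    # Assumption: only o3fs:// URIs are expected here.
    case "$uri" in
        o3fs://?*) ;;
        *) echo "URI must start with o3fs://: $uri"; return 1 ;;
    esac
    case "$mount_point" in
        /*) ;;
        *) echo "mount point must be an absolute path: $mount_point"; return 1 ;;
    esac
    if [ ! -d "$mount_point" ]; then
        echo "mount point does not exist: $mount_point"
        return 1
    fi
    return 0
}
```

It could be combined with the mount itself, for instance: check_mount_args o3fs://bucket1.vol1.127.0.0.1:9862/ /export/ozone && ./fuse_dfs o3fs://bucket1.vol1.127.0.0.1:9862/ /export/ozone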
RECOMMENDATIONS
- Always start with debug on so you can see if you are missing a classpath or something like that.
Use -odebug -s -d -oallow_other rw -ousetrash to debug.
KNOWN ISSUES
- If you alias ls to ls --color=auto and list a directory with thousands of files, expect it to be slow; with tens of thousands, expect it to be very slow. This is because --color=auto causes ls to stat every file in the directory, and fuse-dfs does not cache attribute entries when doing a readdir. See HADOOP-3797.
- Writes are approximately 33% slower than the DFSClient. How to optimize this is TBD; see HADOOP-3805. Try using -obig_writes on a kernel newer than 2.6.26; it should perform much better, since bigger writes imply fewer context switches.
- Reads are ~20-30% slower even with the read buffering.
- LD_LIBRARY_PATH is not set. LD_LIBRARY_PATH can be set using the following command:
export LD_LIBRARY_PATH=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.242.b08-0.el7_7.x86_64/jre/lib/amd64/server:{HADOOP_HOME}/hadoop-hdfs-project/hadoop-hdfs-native-client/target/native/target/usr/local/lib
- CLASSPATH is not set. CLASSPATH can be set using the following commands:
export CLASSPATH=$({OZONE_HOME}/hadoop-ozone/dist/target/ozone-0.5.0-SNAPSHOT/bin/ozone classpath hadoop-ozone-filesystem --glob)
export CLASSPATH=$CLASSPATH:{OZONE_HOME}/hadoop-ozone/dist/target/ozone-0.5.0-SNAPSHOT/share/ozone/lib/hadoop-ozone-filesystem-0.5.0-SNAPSHOT.jar
- ozone-site.xml is not configured. Save the following minimal snippet to hadoop-ozone/dist/target/ozone-*/etc/hadoop/ozone-site.xml in the compiled distribution:
<configuration>
<property><name>ozone.scm.datanode.id.dir</name><value>/tmp/ozone/data</value></property>
<property><name>ozone.replication</name><value>1</value></property>
<property><name>ozone.metadata.dirs</name><value>/tmp/ozone/data/metadata</value></property>
<property><name>ozone.scm.names</name><value>localhost</value></property>
<property><name>ozone.om.address</name><value>localhost</value></property>
</configuration>
- core-site.xml is not configured. Save the following minimal snippet to hadoop-ozone/dist/target/ozone-*/etc/hadoop/core-site.xml in the compiled distribution:
<configuration>
<property>
<name>fs.o3fs.impl</name>
<value>org.apache.hadoop.fs.ozone.OzoneFileSystem</value>
<description>The implementation class of the Ozone (o3fs) FileSystem</description>
</property>
</configuration>
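To confirm that the two configuration files contain the properties listed above, a plain text search is usually enough. The sketch below is an assumption-laden convenience, not part of fuse-dfs: has_property and report_missing_properties are invented names, and the match is a literal string search rather than an XML parse, so it expects <name>...</name> on a single line as in the snippets above.

```shell
# Hypothetical helper: return 0 if the file contains <name>PROP</name>.
has_property() {
    conf_file="$1"
    prop_name="$2"
    grep -qF "<name>$prop_name</name>" "$conf_file"
}

# Hypothetical helper: print each property missing from the file and
# return 1 if any is missing.
report_missing_properties() {
    target="$1"
    shift
    missing=0
    for prop in "$@"; do
        if ! has_property "$target" "$prop"; then
            echo "missing $prop in $target"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, report_missing_properties core-site.xml fs.o3fs.impl checks the core-site.xml entry, and report_missing_properties ozone-site.xml ozone.scm.names ozone.om.address checks two of the ozone-site.xml entries; pass the full path to each file in your compiled distribution.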