
Presently, a rename/delete operation can become prohibitively expensive for directories that have large sub-trees/sub-paths. Ozone renames or deletes every sub-file and sub-directory under the given directory via multiple RPC calls to OM, which makes the operation very expensive. In addition, rename and delete do not guarantee atomicity.

The prefix-based FileSystem optimization allows a rename or delete of any directory to be performed atomically in deterministic, constant time. With it, Ozone performs a rename/delete operation in a single RPC call by sending only the given directory to OM, so the operation finishes with O(1) complexity. This also makes it possible to support atomic rename/delete of any directory at any level in the namespace.
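The difference in cost can be illustrated with a toy model. The sketch below is not Ozone's implementation: `FullPathStore`, `PrefixStore`, and the `(parent_id, name)` key scheme are hypothetical names chosen for illustration, assuming only that a prefix-style layout lets children refer to their parent indirectly rather than by full path.

```python
import itertools


class FullPathStore:
    """Toy key table keyed by full path (hypothetical, for illustration)."""

    def __init__(self):
        self.keys = {}

    def put(self, path):
        self.keys[path] = True

    def rename_dir(self, src, dst):
        # Every key under src must be rewritten, so the number of
        # entries touched grows with the size of the sub-tree.
        prefix = src + "/"
        affected = [k for k in self.keys if k == src or k.startswith(prefix)]
        for old in affected:
            self.keys[dst + old[len(src):]] = self.keys.pop(old)
        return len(affected)


class PrefixStore:
    """Toy table keyed by (parent_id, name) (hypothetical, for illustration)."""

    ROOT = 0

    def __init__(self):
        self._ids = itertools.count(1)
        self.entries = {}  # (parent_id, name) -> object_id

    def put(self, parent_id, name):
        object_id = next(self._ids)
        self.entries[(parent_id, name)] = object_id
        return object_id

    def rename(self, parent_id, name, new_parent_id, new_name):
        # Only the directory's own entry moves; children keep pointing
        # at the same object_id, so exactly one entry is touched.
        self.entries[(new_parent_id, new_name)] = self.entries.pop((parent_id, name))
        return 1

    def resolve(self, path):
        # Walk the path one component at a time from the root.
        object_id = self.ROOT
        for part in path.strip("/").split("/"):
            object_id = self.entries[(object_id, part)]
        return object_id


def demo(num_files):
    """Rename directory 'a' to 'b' in both stores; return entries touched."""
    flat = FullPathStore()
    flat.put("a")
    tree = PrefixStore()
    a_id = tree.put(PrefixStore.ROOT, "a")
    for i in range(num_files):
        flat.put(f"a/f{i}")
        tree.put(a_id, f"f{i}")
    flat_touched = flat.rename_dir("a", "b")
    tree_touched = tree.rename(PrefixStore.ROOT, "a", PrefixStore.ROOT, "b")
    return flat_touched, tree_touched, flat, tree
```

In this model, `demo(n)` touches `n + 1` entries in the full-path store but a single entry in the parent-keyed store, regardless of `n`. A single-entry update is also easy to make atomic, which is the property the feature aims for.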

Git Branch:

This feature is being implemented in a separate feature branch, HDDS-2939. Thanks to all contributors/reviewers.

How to enable prefix based optimization feature:

The following configurations must be set in 'ozone-site.xml' to enable this feature. By default, the feature is turned OFF.

An example of ozone-site.xml

<configuration>
  <property>
    <name>ozone.om.enable.filesystem.paths</name>
    <value>true</value>
  </property>
 
  <property>
    <name>ozone.om.metadata.layout</name>
    <value>prefix</value>
  </property>
 
</configuration>
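As a quick sanity check before starting OM, the two properties above can be verified with a short script. This is a minimal sketch using only the Python standard library; the helper names `read_properties` and `missing_settings` are hypothetical and not part of Ozone.

```python
import xml.etree.ElementTree as ET

# Property names and values are taken from the example configuration above.
REQUIRED = {
    "ozone.om.enable.filesystem.paths": "true",
    "ozone.om.metadata.layout": "prefix",
}


def read_properties(xml_text):
    """Return a {name: value} dict from Hadoop-style configuration XML."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def missing_settings(xml_text):
    """Return the required settings that are absent or set to another value."""
    props = read_properties(xml_text)
    return {k: v for k, v in REQUIRED.items() if props.get(k) != v}
```

Calling `missing_settings(open("ozone-site.xml").read())` returns an empty dict when both properties are set as in the example, and otherwise lists the entries that still need to be changed.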

Related documents

Branch merge checklist

1. builds/intermittent test failures

There are no intermittent failures specific to the HDDS-2939 branch as of now. During development, it was ensured that all CI checks were clean prior to every commit merge. The plan is to run repeated CI checks on the merge commit to master.

2. documentation

The feature is described on the Apache Ozone page via HDDS-5067.

3. design, attached the docs

All the design docs are linked from the documentation as part of HDDS-2939.

TODO: create a link from the documentation page.

4. s3 compatibility

There are no incompatibilities with respect to S3. This feature can be enabled only together with ozone.om.enable.filesystem.paths. When file system-style path handling is enabled, 100% S3 compatibility cannot be guaranteed. FS-compatible S3 key names are expected to work well, but non-FS-compatible key names (like 'a/../b1', or a real file with the name `key1/`) might be normalized or rejected by the implementation of ozone.om.enable.filesystem.paths.
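To see why such key names are affected, the sketch below uses Python's `posixpath.normpath` as a stand-in for file system-style path handling. This is an assumption for illustration only: the actual normalization rules applied by OM may differ, and `fs_style` and `is_fs_compatible` are hypothetical helper names.

```python
import posixpath


def fs_style(key):
    """Normalize a key the way a POSIX path would be normalized.

    Stand-in for file system-style handling; not Ozone's actual logic.
    """
    return posixpath.normpath(key)


def is_fs_compatible(key):
    """A key is treated as FS compatible here if normalization leaves it unchanged."""
    return fs_style(key) == key


for key in ["dir1/file1", "a/../b1", "key1/"]:
    print(f"{key!r}: normalized={fs_style(key)!r}, fs_compatible={is_fs_compatible(key)}")
```

Under this model, `dir1/file1` is unchanged, while `a/../b1` collapses to `b1` and `key1/` loses its trailing slash, so an S3 client that wrote those keys could no longer find them under the original names.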

TODO: S3 acceptance test when feature is turned on?

5. docker-compose / acceptance tests

The `compose/ozone` cluster is modified to test `ozonefs/ozonefs.robot` both with and without the new feature turned on (both ofs and o3fs, and both linked and unlinked buckets, are tested).

6. support of containers / Kubernetes:

NA. The deployment model for OzoneManager remains as before.

Example files are committed with HDDS-5018.

7. coverage/code quality:

Sonar master branch
Sonar HDDS-2939 branch.

The branch has better coverage than master (73.5% vs 74.2%), but two new Sonar bugs are introduced (169 vs 171).

8. build time

There is no significant difference in local build time.

Recent master build

Recent HDDS-2939 branch build

TODO:

  • test time of the unsecure acceptance tests increased by 20 minutes
  • integration test time increased by 30 minutes

9. possible incompatible changes/used feature flag: 

To use this feature, the "ozone.om.metadata.layout" config needs to be set to "prefix" and "ozone.om.enable.filesystem.paths" needs to be set to true in ozone-site.xml.

The new metadata layout version can be turned on for any cluster, as the flag is stored per bucket. Old buckets will use the old metadata layout (simple), and new buckets will use the new layout. All existing keys can be read and no additional migration is required (TODO: confirm).

10. third party dependencies/licence changes:

No new dependencies are added.

11. performance

Testing was done to evaluate the performance of delete and rename operations in the feature branch vs the master code base. The following charts, which capture the execution time of directory delete and rename operations, show that the feature branch has a very significant performance gain compared to master.

The freon 'dtsg' DFS tree generator benchmark was run on a single-node cluster. V0 represents the master code and V1 represents the feature branch. Please refer to the Jira document for more details.


12. security considerations

Everything works as before, and there are no security implications because of the feature.


