
Apache Kylin : Analytical Data Warehouse for Big Data


...

In the directory tree above, a directory whose label ends with "managed by tool" is one in which StorageCleanupJob checks for useless files and deletes them.

How to use

Option Table 

Option                Data Type  Default Value  Comment
delete                Boolean    false          Whether to perform the real delete operation. The default, false, means a dry run.
cleanupTableSnapshot  Boolean    true           Whether to delete unreferenced snapshot files.
cleanupGlobalDict     Boolean    true           Whether to delete unreferenced global dict files.
cleanupJobTmp         Boolean    false          Whether to delete job tmp files.
cleanupThreshold      Integer    168            Age threshold in hours. Only unreferenced storage that has not been modified within this many hours is deleted (recent files are protected).
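
The effect of cleanupThreshold can be illustrated with a small shell sketch. This is not Kylin's implementation: `is_protected` is a hypothetical helper, and treating a file modified exactly at the cutoff as protected is an assumption made for the illustration.

```bash
# Hypothetical helper, not part of Kylin: reports whether a file is recent
# enough to be protected from cleanup.
# Arguments: file modification time (epoch seconds), current time (epoch
# seconds), threshold in hours (defaults to 168, as in the option table).
is_protected() {
  local mtime="$1" now="$2" threshold_hours="${3:-168}"
  # Anything modified at or after this point in time counts as recent.
  local cutoff=$(( now - threshold_hours * 3600 ))
  [ "$mtime" -ge "$cutoff" ]
}
```

With the default of 168 hours, a file modified one hour ago is protected, while a file last modified more than seven days ago becomes a cleanup candidate if it is also unreferenced.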


List help information

[root@cdh-master apache-kylin-4.0.0-SNAPSHOT-bin]# bin/kylin.sh org.apache.kylin.tool.StorageCleanupJob -help
Retrieving hive dependency...
Retrieving hadoop conf dir...
Retrieving Spark dependency...
...
Running org.apache.kylin.rest.job.StorageCleanupJob -help
usage: org.apache.kylin.rest.job.StorageCleanupJob
 -cleanupGlobalDict <cleanupGlobalDict>         Boolean, whether or not to
                                                delete unreferenced global
                                                dict files. Default value
                                                is true .
 -cleanupJobTmp <cleanupJobTmp>                 Boolean, whether or not to
                                                delete job tmp files.
                                                Default value is false .
 -cleanupTableSnapshot <cleanupTableSnapshot>   Boolean, whether or not to
                                                delete unreferenced
                                                snapshot files. Default
                                                value is true .
 -cleanupThreshold <cleanupThreshold>           Integer, used to specific
                                                delete unreferenced
                                                storage that have not been
                                                modified before how many
                                                hours (recent files are
                                                protected). Default value
                                                is 168 hours.
 -delete <delete>                               Boolean, whether or not to
                                                do real delete operation.
                                                Default value is false,
                                                means a dry run.

...

bin/kylin.sh org.apache.kylin.tool.StorageCleanupJob --delete true

...

Only delete stale job_tmp and stale cuboid files

bin/kylin.sh org.apache.kylin.tool.StorageCleanupJob --delete true \
 --cleanupJobTmp true -cleanupTableSnapshot false \
 -cleanupGlobalDict false --cleanupThreshold 24
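
Because delete defaults to false (a dry run), one cautious workflow is to run the job as a dry run first, review its output, and only then repeat it with delete set to true. The wrapper below is a minimal sketch of that idea: `run_cleanup` and the `KYLIN_SH` variable are hypothetical names introduced here, and only options listed in the help output are passed through.

```bash
# Path to the Kylin launcher; KYLIN_SH is a hypothetical override variable
# for installations where the script lives elsewhere.
KYLIN_SH="${KYLIN_SH:-bin/kylin.sh}"

# Hypothetical wrapper: run StorageCleanupJob with an explicit -delete value
# (default false, i.e. a dry run) and a cleanup threshold in hours (default 168).
run_cleanup() {
  local delete="${1:-false}" threshold="${2:-168}"
  "$KYLIN_SH" org.apache.kylin.tool.StorageCleanupJob \
    -delete "$delete" -cleanupThreshold "$threshold"
}

# Suggested sequence (commented out so that sourcing this file runs nothing):
# run_cleanup false 24   # dry run, review the output
# run_cleanup true 24    # real deletion with the same threshold
```

Keeping both invocations behind one function makes it harder for the threshold to differ accidentally between the dry run and the real run.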