Apache Kylin : Analytical Data Warehouse for Big Data

Background

Kylin generates temporary files in HDFS during cube building. In addition, when cubes are purged, dropped, or merged, some parquet files may be left behind in HDFS even though they will no longer be queried. Kylin performs some automated garbage collection, but it might not cover all cases, so you can run an offline storage cleanup periodically.

Directory tree structure under Kylin 4.0's working dir


Working Dir

  • {PROJECT_NAME}
    • parquet
      • {CUBE_NAME}
        • {SEGMENT_NAME}
          • {CUBOID_ID}
            • parquet files
    • spark_log
      • driver
        • {JOB_ID}
          • driver log of cubing job
      • executor
        • {JOB_ID}
          • executor log of cubing job
    • dict/global_dict
      • {CUBE_NAME}
        • {COLUMN_NAME}
          • dict files
    • table_snapshot
      • {SCHEMA_NAME.TABLE_NAME}
        • {JOB_ID}
          • parquet files
    • job_tmp
      • {JOB_ID}
        • TBD
  • cube_statistics
    • {CUBE_NAME}
      • {JOB_ID}
        • seq file of the cuboid's HLL
  • _sparder_log
    • {DATE}
      • executor log of query job
  • resources-jdbc
    • TBD
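
When reviewing what a storage cleanup might touch, it can help to map a path back to the directory tree above. The following is a minimal sketch in Python, not part of Kylin itself: the function name classify and the field names (cube, segment, cuboid_id, and so on) are hypothetical, and it assumes the path is given relative to the working dir and that no project is named after one of the top-level directories (cube_statistics, _sparder_log, resources-jdbc).

```python
from pathlib import PurePosixPath

# Layouts under {PROJECT_NAME}: directory prefix -> names of the levels below it.
# Field names are illustrative labels for the placeholders in the tree above.
PROJECT_LAYOUTS = {
    ("parquet",): ("cube", "segment", "cuboid_id"),
    ("spark_log", "driver"): ("job_id",),
    ("spark_log", "executor"): ("job_id",),
    ("dict", "global_dict"): ("cube", "column"),
    ("table_snapshot",): ("table", "job_id"),
    ("job_tmp",): ("job_id",),
}

# Layouts that sit directly under the working dir, next to the projects.
GLOBAL_LAYOUTS = {
    ("cube_statistics",): ("cube", "job_id"),
    ("_sparder_log",): ("date",),
    ("resources-jdbc",): (),
}


def classify(rel_path):
    """Map a path relative to the working dir onto the tree; None if unknown."""
    parts = PurePosixPath(rel_path).parts
    if not parts:
        return None
    if (parts[0],) in GLOBAL_LAYOUTS:
        layouts, info, rest = GLOBAL_LAYOUTS, {}, parts
    else:
        # Anything else at the top level is treated as a project name (assumption).
        layouts, info, rest = PROJECT_LAYOUTS, {"project": parts[0]}, parts[1:]
    for prefix, fields in layouts.items():
        if rest[:len(prefix)] == prefix:
            info["kind"] = "/".join(prefix)
            # zip stops at the last named level, so trailing file names are ignored.
            info.update(zip(fields, rest[len(prefix):]))
            return info
    return None
```

For example, classify("sales/parquet/orders_cube/seg1/12345/part-0.parquet") identifies the project, cube, segment, and cuboid that a parquet file belongs to, where sales, orders_cube, and seg1 are made-up names. A path that does not match any branch of the tree returns None, which is a cue to inspect it by hand rather than delete it.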


How to use

