...

  1. Collectors write chunks to logs/*.chukwa files until a 64MB chunk size is reached or a given time interval has passed.
    • to: logs/*.chukwa
  2. Collectors close chunks and rename them to *.done.
    • from: logs/*.chukwa
    • to: logs/*.done
  3. DemuxManager checks for *.done files every 20 seconds.
    1. If *.done files exist, moves files in place for demux processing:
      • from: logs/*.done
      • to: demuxProcessing/mrInput
    2. If demux is successful within 3 attempts, archives the completed files:
      • from: demuxProcessing/mrOutput
      • to: dataSinkArchives/[yyyyMMdd]/*/*.done
    3. Otherwise moves the completed files to an error folder:
      • from: demuxProcessing/mrOutput
      • to: dataSinkArchives/InError/[yyyyMMdd]/*/*.done
  4. PostProcessManager wakes up every few minutes and aggregates, orders and de-dups record files.
    • from: postProcess/demuxOutputDir_*/[clusterName]/[dataType]/[dataType]_[yyyyMMdd]_[HH].R.evt
    • to: repos/[clusterName]/[dataType]/[yyyyMMdd]/[HH]/[mm]/[dataType]_[yyyyMMdd]_[HH]_[N].[N].evt
  5. HourlyChukwaRecordRolling runs M/R jobs at 16 minutes past the hour to roll the 5-minute logs up into hourly files.
    • from: repos/[clusterName]/[dataType]/[yyyyMMdd]/[HH]/[mm]/[dataType]_[yyyyMMdd]_[mm].[N].evt
    • to: temp/hourlyRolling/[clusterName]/[dataType]/[yyyyMMdd]
    • to: repos/[clusterName]/[dataType]/[yyyyMMdd]/[HH]/[dataType]_HourlyDone_[yyyyMMdd]_[HH].[N].evt
    • leaves: repos/[clusterName]/[dataType]/[yyyyMMdd]/[HH]/rotateDone/
  6. DailyChukwaRecordRolling runs M/R jobs at 1:30 AM to roll the hourly logs up into daily files.
    • from: repos/[clusterName]/[dataType]/[yyyyMMdd]/[HH]/[dataType]_[yyyyMMdd]_[HH].[N].evt
    • to: temp/dailyRolling/[clusterName]/[dataType]/[yyyyMMdd]
    • to: repos/[clusterName]/[dataType]/[yyyyMMdd]/[dataType]_DailyDone_[yyyyMMdd].[N].evt
    • leaves: repos/[clusterName]/[dataType]/[yyyyMMdd]/rotateDone/
  7. Every half hour or so, ChukwaArchiveManager uses M/R to aggregate and remove dataSinkArchives data.
    • from: dataSinkArchives/[yyyyMMdd]/*/*.done
    • to: archivesProcessing/mrInput
    • to: archivesProcessing/mrOutput
    • to: finalArchives/[yyyyMMdd]/*/chukwaArchive-part-*
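
The directory templates in steps 4 to 6 can be expanded for a concrete timestamp. The following is a minimal sketch, not part of Chukwa: the function name repos_dirs and the dictionary keys are hypothetical, and it assumes that [mm] is the zero-padded start of the 5-minute bucket. Only directories are built, not file names.

```python
from datetime import datetime


def repos_dirs(cluster_name: str, data_type: str, ts: datetime) -> dict:
    """Expand the directory templates from steps 4-6 for one timestamp."""
    day = ts.strftime("%Y%m%d")  # [yyyyMMdd]
    hour = ts.strftime("%H")     # [HH]
    # Assumption: [mm] is the start of the 5-minute bucket, zero-padded.
    minute = f"{ts.minute - ts.minute % 5:02d}"
    base = f"repos/{cluster_name}/{data_type}/{day}"
    return {
        "five_minute": f"{base}/{hour}/{minute}",  # PostProcessManager output
        "hourly": f"{base}/{hour}",                # HourlyDone files
        "daily": base,                             # DailyDone files
        "hourly_temp": f"temp/hourlyRolling/{cluster_name}/{data_type}/{day}",
        "daily_temp": f"temp/dailyRolling/{cluster_name}/{data_type}/{day}",
    }
```

For example, repos_dirs("demo", "SysLog", datetime(2010, 3, 14, 9, 27))["five_minute"] gives repos/demo/SysLog/20100314/09/25, where the cluster name and data type are made-up values.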

Log Directories Requiring Cleanup

The following directories grow over time and need to be pruned periodically:

  • finalArchives/[yyyyMMdd]/*
  • repos/[clusterName]/[dataType]/[yyyyMMdd]/*.evt
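
Because both locations are keyed by a [yyyyMMdd] directory, pruning can be reduced to picking the day directories that fall outside a retention window. The following is a minimal sketch under that assumption: the function name expired_day_dirs and the keep_days parameter are hypothetical, the retention period is a site decision, and listing and deleting the directories is left out.

```python
from datetime import date, datetime, timedelta


def expired_day_dirs(day_dirs, today: date, keep_days: int) -> list:
    """Return the [yyyyMMdd] directory names older than keep_days."""
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in day_dirs:
        try:
            day = datetime.strptime(name, "%Y%m%d").date()
        except ValueError:
            continue  # skip names that are not a yyyyMMdd date
        if day < cutoff:
            expired.append(name)
    return sorted(expired)
```

With today set to 14 March 2010 and keep_days set to 7, a directory named 20100101 is selected for pruning and one named 20100310 is kept.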