Apache Kylin : Analytical Data Warehouse for Big Data
1. Why is a job scheduler needed?
While building a segment, Kylin produces many tasks that have to be executed.
A job scheduling mechanism is needed to coordinate the execution of these tasks and to use resources efficiently.
2. What schedulers are there in Kylin?
The current Kylin version (v3.1.0) has three job schedulers. Their implementation classes are DefaultScheduler, DistributedScheduler and CuratorScheduler.
...
Please refer to http://kylin.apache.org/cn/docs/install/kylin_cluster.html for the configuration method.
3. What are the differences between the job schedulers?
3.1 DefaultScheduler
DefaultScheduler is the job scheduler that Kylin used originally, and it is still the default job scheduler.
...
Once a job server holds the lock, no other job server can obtain it until the process of the job server holding it has ended.
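The exclusive-lock behavior described above can be illustrated with a minimal sketch. It is not Kylin's implementation: the class JobLockSketch, its methods and the server names are hypothetical, and the lock lives in memory here purely to show the semantics (one holder at a time, and others succeed only after the holder lets go).

```java
import java.util.Objects;

/** Hypothetical in-memory stand-in for the single job-server lock (not Kylin code). */
final class JobLockSketch {
    // Id of the job server currently holding the lock, or null if the lock is free.
    private String holder;

    /** Returns true only if the lock was free and is now held by serverId. */
    synchronized boolean tryAcquire(String serverId) {
        if (holder != null) {
            return false; // another job server already holds the lock
        }
        holder = serverId;
        return true;
    }

    /** Releases the lock, but only if serverId is the current holder. */
    synchronized boolean release(String serverId) {
        if (!Objects.equals(holder, serverId)) {
            return false;
        }
        holder = null;
        return true;
    }

    public static void main(String[] args) {
        JobLockSketch lock = new JobLockSketch();
        System.out.println(lock.tryAcquire("job-server-1")); // true: lock was free
        System.out.println(lock.tryAcquire("job-server-2")); // false: lock is held
        lock.release("job-server-1");                         // holder goes away
        System.out.println(lock.tryAcquire("job-server-2")); // true: lock is free again
    }
}
```

In this sketch the release is an explicit call; in the situation the text describes, the lock becomes available only once the holding job server process has ended.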
3.2 DistributedScheduler
DistributedScheduler is a distributed scheduler contributed by Meituan. It has been supported since Kylin v1.6.1.
...
Users can also set kylin.cube.schedule.assigned.servers to specify the job execution node for a cube.
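As a rough illustration of how such a per-cube assignment could be checked, here is a small sketch. Everything in it is an assumption rather than Kylin's code: the class AssignedServersSketch, the comma-separated value format, the host:port server names, and the rule that an empty value allows any server.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Set;
import java.util.stream.Collectors;

/** Hypothetical check of a cube's assigned job servers (not Kylin code). */
final class AssignedServersSketch {

    /** Parses a property value, assumed here to be a comma-separated server list. */
    static Set<String> parse(String value) {
        if (value == null || value.trim().isEmpty()) {
            return Collections.emptySet();
        }
        return Arrays.stream(value.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toSet());
    }

    /** Assumption: an empty assignment means any job server may run the cube's jobs. */
    static boolean mayRun(String value, String server) {
        Set<String> assigned = parse(value);
        return assigned.isEmpty() || assigned.contains(server);
    }

    public static void main(String[] args) {
        String value = "host1:7070, host2:7070"; // hypothetical property value
        System.out.println(mayRun(value, "host1:7070")); // true
        System.out.println(mayRun(value, "host3:7070")); // false
        System.out.println(mayRun("", "host3:7070"));    // true: nothing assigned
    }
}
```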
3.3 CuratorScheduler
CuratorScheduler is a Curator-based scheduler implemented by Kyligence. It has been supported since Kylin v3.0.0-alpha.
...