
[Progress record]:
Proposed time: 2022/05/06
Discussion time:
Acceptance time:
Complete time:
[issues]: https://github.com/apache/incubator-linkis/issues/xxxxx
[email]:
[release]:

Document the state by adding a label to the LKIP page with one of “discussion”, “accepted”, “released”, “rejected”.

Discussion thread:
Vote thread:
Issue: https://github.com/apache/incubator-linkis/issues/xxxxx
Release: 1.2.0


Motivation & Background

In the current version, when an exception occurs while the Linkis client submits a task, the client decides whether to retry according to its parameter configuration. However, when an error occurs during the execution of the task, the client has no retry mechanism. In particular, some tasks fail because of network, resource, or other issues before they are ever submitted to the EC for execution. To further improve the fault tolerance of the system, the client adds a retry function for tasks that report an error before being submitted to the EC for execution.

Basic concept



Expect to achieve goals

The Linkis client adds a retry function for tasks that report an error before being submitted to the ECM for execution.


Implementation plan

LinkisJob adds the attribute retryNums, of type Int.
The table linkis_ps_job_history_group_history adds a Boolean field, execByEcm, which indicates whether the task has entered the EC for execution.
A related feature is already planned that adds task metrics to the record of EC information (requirement: https://github.com/apache/incubator-linkis/issues/2075).
That record can be reused here.
The client retry function is added to the isCompleted method of LinkisJob. This method has two implementation classes:
1. SimpleOnceJob
This class is mainly for one-time submitted tasks, such as datax, sqoop and other engines, and does not require data interaction. Retry is not considered for this type for the time being.
2. StorableLinkisJob
This class is mainly for once job type tasks, such as hive, spark and other engines, with data interaction.
override def isCompleted: Boolean = getJobInfoResult.isCompleted
getJobInfoResult obtains the execution status from the table linkis_ps_job_history_group_history through the jobhistory interface.
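The retry decision described above can be sketched as a small pure function. This is only an illustration under stated assumptions: `JobInfo`, `shouldRetry`, `retriesUsed`, and the status string `"Failed"` are hypothetical names chosen for the sketch, not part of the Linkis client API; only `retryNums` and `execByEcm` come from this proposal.

```scala
// Hypothetical sketch: only retryNums and execByEcm are names from the proposal.
object RetryDecision {
  // Simplified view of one record in linkis_ps_job_history_group_history.
  final case class JobInfo(status: String, execByEcm: Boolean)

  // Assumed status value for a failed task in this sketch.
  private val FailedStatus = "Failed"

  // Retry only when the task failed, never entered the EC for execution,
  // and the configured number of retries has not been used up.
  def shouldRetry(info: JobInfo, retriesUsed: Int, retryNums: Int): Boolean =
    info.status == FailedStatus && !info.execByEcm && retriesUsed < retryNums
}
```

Keeping the decision separate from the polling code would let isCompleted call it after each getJobInfoResult lookup without mixing retry rules into status handling.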
Retry process

After the task enters the ECM, the execByEcm field is updated to true.

Things to Consider & Note

Does compatibility with the original parameter method need to be considered? Retry is supported only when the retry parameter is added; if it is not added, the original logic still applies.
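A minimal sketch of how the retry process could stay compatible with the original behavior, assuming the retry count arrives as an optional string parameter. The key name `"retryNums"` as a parameter key, the `JobResult` type, and the `retryNumsFrom` and `runWithRetry` helpers are all hypothetical and exist only for this illustration. When the parameter is absent or invalid, the count falls back to 0 and the task is submitted exactly once, which matches the original logic.

```scala
import scala.annotation.tailrec
import scala.util.Try

// Hypothetical sketch: none of these helpers exist in the Linkis client.
object RetryLoop {
  // Outcome of one submission: did it succeed, and did it reach the ECM?
  final case class JobResult(succeeded: Boolean, execByEcm: Boolean)

  // Assumed parameter key; a missing, non-numeric, or negative value means no retry.
  val RetryKey = "retryNums"

  def retryNumsFrom(params: Map[String, String]): Int =
    params.get(RetryKey)
      .flatMap(v => Try(v.trim.toInt).toOption)
      .filter(_ > 0)
      .getOrElse(0)

  // Resubmits while the task failed before entering the ECM and retries remain.
  // Returns the final result together with the total number of submissions.
  @tailrec
  def runWithRetry(submit: () => JobResult, retryNums: Int, attempt: Int = 0): (JobResult, Int) = {
    val result = submit()
    val retryable = !result.succeeded && !result.execByEcm
    if (retryable && attempt < retryNums) runWithRetry(submit, retryNums, attempt + 1)
    else (result, attempt + 1)
  }
}
```

A task that fails after execByEcm has been set to true is returned to the caller without resubmission, since the proposal limits retry to tasks that never reached the EC.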


Changes


Modification Detail
1. Modification of maven module
2. Modification of HTTP interface
3. Modification of the client interface
4. Modification of database table structure
5. Modification of configuration item
6. Modification of error code
7. Modification of third-party dependencies

Compatibility, Deprecation, and Migration Plan

  • What impact (if any) will there be on existing users?
  • If we are changing behavior, how will we phase out the older behavior?
  • If we require special migration tools, describe them here.
  • When will we remove the existing behavior?

