State
[progress record]:
Proposed time: 2022/10/10
Discussion time:
Accept/Reject Time:
Complete time:
[issues]: To be added
[email]: After creating the LKIP and drafting the preliminary content, start a discussion on the proposal. Currently, the discussion must be initiated in the WeChat group [Apache Linkis Community Development Group], and the minutes can then be sent to the official Linkis dev mailing list. The link to the minutes email can be placed here.
[release]: The (planned) release version of Linkis
[proposer]: peacewong
Motivation & Background
1. Users want to run fuzzy searches over the code of Linkis historical tasks.
2. Permissions must be controlled: ordinary users can only search their own code.
Basic concept
- The management console supports fuzzy search over code and highlights the matching content.
Expected goals
- 1. Support fuzzy search of historical code with strict permission control
- 2. Phase 1 only supports retrieval of T-1 code, that is, code from tasks up to the previous day
- 3. Support highlighting of the retrieved code
Implementation plan
- 1. Implement a batch import job: every day, import the previous day's historical tasks into ES through Exchange. Note that code exceeding 50,000 in length is temporarily not searchable.
- 2. Implement the back-end search interface through ESClient; only fuzzy search by code is supported
- 3. Provide the code fuzzy search function as a plug-in, which is disabled by default
Remark: The limit is 50,000 because, when Linkis stores code that exceeds 50,000, the code itself is written to the file system and only the corresponding file path is stored in the database, so there is no code available to batch import into ES. The limit also reduces the load on ES during searches as much as possible.
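The selection rule for the daily import can be sketched as follows. This is only an illustration under stated assumptions: the row keys `code` and `created_time` are hypothetical names, the limit is assumed to be measured in characters, and a row whose code was moved to the file system is assumed to carry no inline code (`None`). The actual transfer would be done by Exchange, not by this function.

```python
from datetime import date, datetime, time, timedelta

# Assumption: the proposal only says "50000"; characters are assumed here.
MAX_CODE_LENGTH = 50_000


def t_minus_1_window(today):
    """Return the half-open interval [start, end) covering the day before `today`."""
    start = datetime.combine(today - timedelta(days=1), time.min)
    end = datetime.combine(today, time.min)
    return start, end


def select_for_import(rows, today):
    """Keep rows created yesterday whose code is stored inline and within the limit."""
    start, end = t_minus_1_window(today)
    selected = []
    for row in rows:
        code = row.get("code")  # hypothetical key
        if code is None:
            # Assumed case: only a file path is stored, so there is nothing to index.
            continue
        if len(code) > MAX_CODE_LENGTH:
            continue
        if start <= row["created_time"] < end:  # hypothetical key
            selected.append(row)
    return selected
```

Using a half-open window means a task created exactly at midnight belongs to the next day's batch, so consecutive daily runs neither overlap nor leave gaps.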
Technology Architecture
As shown in the figure below, the technical architecture is mainly divided into two parts:
1. Scheduled batch tasks import the table linkis_ps_job_history_group_history into ES through Exchange (datax), creating a linkis index and one type per cluster
2. Front-end search is served by calling the ES Client to run match queries
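As an illustration of the search side, the request body sent to ES could look like the sketch below. It is a minimal example, not the proposal's actual interface: the field names `code` and `submit_user`, the `is_admin` flag, and the use of a `match` query with `fuzziness` are all assumptions. The body follows the standard Elasticsearch query DSL (`bool`, `match`, `term`, `highlight`).

```python
def build_search_body(keyword, user, is_admin=False, size=20):
    """Build an Elasticsearch search request body for fuzzy code search.

    Ordinary users are restricted to their own tasks through a term filter;
    the is_admin bypass is an assumption, not something the proposal specifies.
    """
    bool_query = {
        "must": [
            # Assumed field name "code"; fuzziness "AUTO" tolerates small typos.
            {"match": {"code": {"query": keyword, "fuzziness": "AUTO"}}}
        ]
    }
    if not is_admin:
        # Assumed field name "submit_user"; filters do not affect scoring.
        bool_query["filter"] = [{"term": {"submit_user": user}}]
    return {
        "size": size,
        "query": {"bool": bool_query},
        # Ask ES to return highlighted fragments of the matched code.
        "highlight": {"fields": {"code": {}}},
    }
```

Putting the user restriction in the `filter` clause rather than in `must` keeps the permission check out of relevance scoring and makes it harder to bypass from the front end, since the back end adds it to every request from an ordinary user.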
Changes
No. | Modification | Detail |
---|---|---|
1 | Modification of maven module | |
2 | Modification of HTTP interface | |
3 | Modification of the client interface | |
4 | Modification of database table structure | |
5 | Modification of configuration items | |
6 | Modification of error codes | |
7 | Modification of third-party dependencies | Introduce ES Client |
Compatibility, Deprecation, and Migration Plan
- What impact (if any) will there be on existing users?
- If we are changing behavior, how will we phase out the older behavior?
- If we require special migration tools, describe them here.
- When will we remove the existing behavior?