Status
IMPROVING
Principle
- Minimize user operations for managing and monitoring topologies
- Centralize metadata management in the REST service
- Consistent UI experience
Objective
- Keep the semantics of the terms application (user space) and topology (physical space only) consistent
- Topology configuration can be edited in the UI
- Use the REST service as the only entry point for managing applications (START/STOP), and ApplicationManager to schedule task execution and update the execution information
- The Eagle UI should support managing and monitoring applications on the site/application page
- A single site/application may have more than one topology (for example, hdfsAuditLogMonitoring and userProfileMonitoring should run as separate topologies)
Architecture
To improve the user experience of the Eagle UI, we could provide an interface for managing a topology from a UI page, for example to submit a topology.
Schema
Topology Description Service
- Service name: TopologyDescriptionService
- Entity: TopologyDescriptionEntity
- Table name: eagle_metadata
- Prefix: topologyDescription
Category | Attribute | Type | Description |
---|---|---|---|
tags | topology | String | topology name |
fields | exeClass | String | topology entry class |
fields | type | String | topology type: DSL or CLASS |
fields | description | String | description of this topology |
fields | version | String | topology version |
Topology Execution Service
- Service name: TopologyExecutionService
- Entity: TopologyExecutionEntity
- Table name: eagle_metadata
- Prefix: topologyExecution
- Description: defines the relation between an application and a topology, and maintains the execution status as well
Category | Attribute | Type | Description |
---|---|---|---|
tags | site | String | topology site |
tags | application | String | |
tags | topology | String | |
fields | fullName | String | topology execution name: eagle_${site}_${application}_${topology} |
fields | url | String | topology tracking url |
fields | description | String | topology running status description |
fields | status | String | application running status {NEW, STARTING, STOPPING, STARTED, STOPPED} |
fields | mode | String | topology running mode: cluster or local |
fields | environment | String | topology execution environment, e.g., storm |
fields | lastModifiedDate | long | last status update time |
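The `fullName` pattern above can be illustrated with a minimal sketch, assuming the name is built by plain string substitution of the three tags. The class name `TopologyExecutionNames` and the method `fullName` are hypothetical and are not taken from the Eagle code base.

```java
import java.util.Objects;

/** Hypothetical helper; illustrates the fullName pattern from the table above. */
final class TopologyExecutionNames {

    /** Status values listed for the status field. */
    enum Status { NEW, STARTING, STOPPING, STARTED, STOPPED }

    private TopologyExecutionNames() {}

    /** Builds eagle_${site}_${application}_${topology} from the three tags. */
    static String fullName(String site, String application, String topology) {
        Objects.requireNonNull(site, "site");
        Objects.requireNonNull(application, "application");
        Objects.requireNonNull(topology, "topology");
        return "eagle_" + site + "_" + application + "_" + topology;
    }
}
```

With the example values `sandbox`, `hdfsAuditLog` and `userProfile` (chosen only for illustration), `fullName` returns `eagle_sandbox_hdfsAuditLog_userProfile`.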
Topology Operation Service
- Service name: TopologyOperationService
- Entity: TopologyOperationEntity
- Table name: eagle_metadata
- Prefix: topologyOperation
Category | Attribute | Type | Description |
---|---|---|---|
tags | operation | String | {START, STOP, STATUS} |
tags | site | String | |
tags | application | String | |
tags | operationID | String | |
tags | topology | String | topology name: ${topology} |
fields | status | String | {INITIALIZED, PENDING, FAILED, SUCCESS} |
fields | message | String | exception message |
fields | lastModifiedDate | long | last status update time |
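The page lists the operation status values but not how an operation moves between them. The sketch below shows one possible lifecycle, assuming an operation goes from INITIALIZED to PENDING and then ends in either FAILED or SUCCESS. Both that ordering and the names `OperationStatus`, `next` and `canMoveTo` are assumptions made for illustration.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical model of the status field; the allowed transitions are an assumption. */
enum OperationStatus {
    INITIALIZED, PENDING, FAILED, SUCCESS;

    /** Statuses this one may move to; FAILED and SUCCESS are treated as terminal. */
    Set<OperationStatus> next() {
        switch (this) {
            case INITIALIZED:
                return EnumSet.of(PENDING);
            case PENDING:
                return EnumSet.of(FAILED, SUCCESS);
            default:
                return EnumSet.noneOf(OperationStatus.class);
        }
    }

    /** True if moving from this status to the target is allowed in this sketch. */
    boolean canMoveTo(OperationStatus target) {
        return next().contains(target);
    }
}
```

Guarding updates with a check like `canMoveTo` would keep a finished operation from being picked up again, if that is the intended behavior.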
Customized Restful Apis
HTTP Method | URL | Payload | Description |
---|---|---|---|
POST | /app/operation | TopologyOperationEntity | create a topology operation |
DELETE | /app/topology/{topology} | | delete a topology description |
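As a rough client-side sketch of the two endpoints above, the following builds the requests with the JDK `java.net.http` API (Java 11 or later) without sending them. The base URL, the JSON layout of the payload, and the names `OperationRequests`, `startOperation` and `deleteTopology` are assumptions; the page does not specify the wire format of TopologyOperationEntity.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

/** Hypothetical request builders; base URL and JSON field layout are assumptions. */
final class OperationRequests {

    private OperationRequests() {}

    /** POST /app/operation with a TopologyOperationEntity-like JSON body (no JSON escaping is done). */
    static HttpRequest startOperation(String baseUrl, String site, String application, String topology) {
        String json = String.format(
            "{\"tags\":{\"operation\":\"START\",\"site\":\"%s\",\"application\":\"%s\",\"topology\":\"%s\"}}",
            site, application, topology);
        return HttpRequest.newBuilder()
            .uri(URI.create(baseUrl + "/app/operation"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(json, StandardCharsets.UTF_8))
            .build();
    }

    /** DELETE /app/topology/{topology}. */
    static HttpRequest deleteTopology(String baseUrl, String topology) {
        return HttpRequest.newBuilder()
            .uri(URI.create(baseUrl + "/app/topology/" + topology))
            .DELETE()
            .build();
    }
}
```

A caller would pass the resulting request to `HttpClient.send` and then read the operation status back from the service, since the operation itself is executed later by the scheduler.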
17 Comments
Hao Chen
Could we rename the TopologyManagementService to ApplicationManagementService?
qingwzhao
I have two questions about the changes
1) Objective 2 said "Topology should read configuration from SiteApplicationEntity#config instead of configuration file". SiteApplicationEntity#config stores the connection information for the data source, for example HDFS connection properties (see eagle-topology-init.sh).
2) Objective 3 said we need ApplicationManager to schedule the task execution and update the execution information. So do we still need an independent backend process to execute the task, rather than a new thread started by Tomcat?
Hao Chen
qingwzhao
I like the picture. It would be clearer with some explanation of what each line means.
Hao Chen
Please take full responsibility for thinking the design through clearly and improving it with technical details.
Edward Zhang
Formally, I would suggest that application and topology have a 1:1 mapping: application is the user-space term and topology is the implementation term. This 1:1 mapping would simplify the application manager considerably.
qingwzhao
But actually an application may have more than one topology. For example, the application HdfsAuditLog has two topologies: hdfsAuditLog and userProfile (online model). How do we strike the balance?
I have a topologyManager to start/stop/status topologies. Each topology has its own configuration defined in TopologyDescServiceEntity.
Edward Zhang
qingwzhao I think hdfsAuditLog and userProfile are different applications. They consume the same data, but they are different. If an application is exactly one topology, that would make the program simpler. Let me know your thoughts.
qingwzhao
Sorry, I don't see how this makes the program simpler. In my understanding, in the future each data source (application) may have its own user profile; currently we treat user profile as a feature together with classification, metadata, and common. So it is reasonable to put userProfile (hdfsAuditLog version) together with the HdfsAuditLog topology.
For now I put a topology at the same level as a feature: Site -> application -> feature (topology). TopologyExecutionEntity will describe a topology's relation to an application. We can edit a topology's configuration just like we edit a feature, so I don't think the frontend pages are hard to design.
At the backend, I have two ideas:
1) Push mode
We can design a REST API that responds to START/STOP/STATUS commands and hands each command over to the topology manager to execute (it seems no scheduler is needed).
2) Pull mode
We have a daemon scheduler that periodically pulls the command entities from the database and has each command processed by TopologyManager.
Maybe I missed something important, or have not considered extensibility and scalability enough. These are just my half-formed ideas.
Edward Zhang
The user profile detection topology uses models generated by the training program and processes each audit log. But that does not mean the user profile detection topology and the audit log monitoring topology belong to the same monitoring application. Instead, we should simplify the relationship between application and topology to a 1:1 mapping.
Application is analogous to a class in Java, while topology is analogous to an object in Java. An application can also be a standalone Java program, a Spark application, etc.
Application lifecycle management can be mapped to Storm topology lifecycle management if the application is implemented as a Storm topology.
If we want one thing to contain multiple topologies, that thing can be an application group, which loosely groups topologies together.
Push mode has a lot of issues because it is a synchronous operation. The REST API should persist the command into the Eagle service, and a scheduler embedded in the Eagle service should read those operations and execute the corresponding topology commands. We don't want the scheduler to be a separate daemon because we want Eagle to control everything, which improves the user experience.
Let me know your thoughts.
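For readers following the thread, the persist-then-poll flow described in this comment can be sketched roughly as below. This is only an illustration under stated assumptions: an in-memory queue stands in for the operation entities persisted in the Eagle service, a `Consumer<String>` stands in for the topology manager, and the names `OperationScheduler`, `submit`, `pollOnce` and `start` are hypothetical.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical pull-mode scheduler; the queue stands in for persisted operation entities. */
final class OperationScheduler {
    private final Queue<String> pending = new ConcurrentLinkedQueue<>();
    private final Consumer<String> topologyManager; // stand-in for the real command executor

    OperationScheduler(Consumer<String> topologyManager) {
        this.topologyManager = topologyManager;
    }

    /** What the REST API would do: store the command and return immediately. */
    void submit(String command) {
        pending.add(command);
    }

    /** One polling pass: hand every stored command to the executor; returns how many ran. */
    int pollOnce() {
        int handled = 0;
        String command;
        while ((command = pending.poll()) != null) {
            topologyManager.accept(command);
            handled++;
        }
        return handled;
    }

    /** Runs pollOnce periodically on a background thread inside the service process. */
    ScheduledExecutorService start(long periodSeconds) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        timer.scheduleWithFixedDelay(this::pollOnce, periodSeconds, periodSeconds, TimeUnit.SECONDS);
        return timer;
    }
}
```

Because `submit` only stores the command, the REST call returns without waiting for the topology command to finish, which is the difference from push mode raised above.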
Hao Chen
I think there is no conflict here. In the low-level implementation view, we could define the data structure as "application (1) – (*) topology" (keeping the flexibility), but in the user/product view, we could limit the user experience to "application (1) – (1) topology". There is no conflict at all from a unit design perspective.
Hao Chen
Another very important thing is that we should consider the configuration design along with application management
qingwzhao
Can you give more details about the configuration design?
Hao Chen
Besides the high-level engineering design, I think you should work out more detail, for example by taking some metadata and walking it through the whole process.
Hao Chen
qingwzhao Before the design is finalized, I think you could start writing some code for the entities and a prototype of the critical parts, such as topology operations. I will work with you on the development, starting with the code scaffold.
qingwzhao
No problem