Bug Reference
https://issues.apache.org/jira/browse/CLOUDSTACK-4992
Branch
4.3
Introduction
In a CloudStack deployment with multiple regions, each region runs its own management server with a separate database. To support multiple regions while providing a single point of entry for a customer, that customer's domain/account/user information must be duplicated into the database of every region the customer accesses, which causes data discrepancies when the data is updated independently in each management server.
Purpose
I'd like to provide a way to sync up this data using the messaging system introduced in 4.1.0. Using the events published by each management server, updates made in one region can be propagated to the remaining regions and applied there accordingly.
Architecture and Design description
There are two possible approaches.
- master - slave architecture : manual changes are allowed only in the single master management server; changes made in other servers are either prohibited or discarded.
- multiple source architecture : every management server allows manual changes, and any change made in any server is propagated to the rest of the servers.
Restrictions
- The duplicate in each region has a different uuid, so the same domain/account/user name appears with a different 'entityuuid' in the event messages of each region. As a consequence, for 'XXXXX-DELETE' events it is not possible to look up target resource information such as the name, because the resource has already been deleted from its region (see the sketch after this list).
- In approach #2, the messaging system does not provide a way to distinguish events caused by manual operations from those triggered by the automatic sync processing.
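To make restriction #1 concrete, here is a minimal, self-contained sketch (plain Java, not CloudStack code) of why a name cannot be recovered from a DELETE event's uuid alone:

import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

public class UuidMismatchExample {
    public static void main(String[] args) {
        // The same logical account gets an independent, region-local uuid
        // when it is created in each region's database.
        Map<String, String> regionA = new HashMap<>();
        Map<String, String> regionB = new HashMap<>();
        regionA.put(UUID.randomUUID().toString(), "ops-team");
        regionB.put(UUID.randomUUID().toString(), "ops-team");

        // An ACCOUNT-DELETE event from region A carries only region A's uuid.
        String entityUuidFromEvent = regionA.keySet().iterator().next();
        regionA.remove(entityUuidFromEvent); // the row is already gone locally

        // Region B cannot map that uuid to "ops-team", and region A can no
        // longer look the name up either -- hence the need to embed the
        // resource name (and parent name) in the event itself.
        System.out.println("Known in region B? " + regionB.containsKey(entityUuidFromEvent));
    }
}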
Feature Specifications
- For restriction #1, we may add more information, such as the actual resource name and its parent's name, to the current event message in either of the following ways (a sketch follows this list):
- update action command classes like 'CreateAccountCmd', 'CreateDomainCmd', etc.
- create new classes extending some, if not all, of 'ActionEventInterceptor', 'RabbitMQEventBus', etc.
- For restriction #2, we may provide a logging system that stores the details of completed jobs, which gives a reliable way to trace the origin of events.
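As a rough illustration of the intent behind the first option, the sketch below shows a hypothetical enrichment step. The class name 'EnrichedEventPublisher', the 'Publisher' interface and the 'resourceName'/'parentName' keys are illustrative assumptions, not part of the existing 'ActionEventInterceptor' or 'RabbitMQEventBus' code:

import java.util.Map;

// Hypothetical sketch: before an action event is handed to the event bus,
// the resource name and its parent's name are added to the event description
// so that subscribers in other regions can identify the resource even after
// it has been deleted.
public class EnrichedEventPublisher {

    // Stand-in for whatever actually publishes the event message.
    public interface Publisher {
        void publish(String routingKey, Map<String, String> description);
    }

    private final Publisher delegate;

    public EnrichedEventPublisher(Publisher delegate) {
        this.delegate = delegate;
    }

    public void publishActionEvent(String routingKey,
                                   Map<String, String> description,
                                   String resourceName,
                                   String parentName) {
        // Keys are illustrative; the point is that the names travel with the event.
        description.put("resourceName", resourceName);
        description.put("parentName", parentName);
        delegate.publish(routingKey, description);
    }
}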
Overall Architecture
- We may provide a queue in which the 'event handler' simply stores each message as it is received, while the 'message processor' processes the stored messages sequentially (a sketch of this pattern follows this list).
- This way, the event handler is never blocked, the messages are guaranteed to be processed in their original order (unlike processing each one in its own thread), and failed messages can be retried.
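A minimal in-memory sketch of this handler/processor split is shown below; in the actual design the queue would be the 'event_log' table described later rather than an in-memory structure:

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch: the handler only enqueues; a single worker drains the queue in
// order and re-enqueues failures for a later retry.
public class SequentialEventPipeline {

    public static class EventRecord {
        public final String routingKey;
        public final String body;
        public EventRecord(String routingKey, String body) {
            this.routingKey = routingKey;
            this.body = body;
        }
    }

    private final BlockingQueue<EventRecord> queue = new LinkedBlockingQueue<>();
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    // Called by the event handler: never blocks on downstream API calls.
    public void accept(String routingKey, String body) {
        queue.add(new EventRecord(routingKey, body));
    }

    // A single processing thread preserves the original event order.
    public void start() {
        worker.submit(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    EventRecord record = queue.take();
                    if (!process(record)) {
                        queue.add(record); // failed messages are retried later
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
    }

    private boolean process(EventRecord record) {
        // Placeholder for calling the corresponding API on each slave region.
        return true;
    }
}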
Master - Slave Architecture
- Tables
- region : stores information about all regions, in particular the authentication information needed to call their APIs (see the sketch after the table)
Field | Type | Desc
name | char(255) | name of the region
host | char(255) | API server
root_url | char(255) | root url for API interfaces
user_name | char(255) | user name for API call authentication
password | char(255) | password for API call authentication
is_master | boolean | true if this region is the master
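For illustration only, a row of this table could be represented in code roughly as follows; the 'RegionRecord' class and the 'endpoint()' helper are hypothetical:

// Sketch: field names mirror the columns of the proposed region table.
public class RegionRecord {
    public String name;      // name of the region
    public String host;      // API server
    public String rootUrl;   // root url for API interfaces
    public String userName;  // user name for API call authentication
    public String password;  // password for API call authentication
    public boolean isMaster; // true if this region is the master

    // Illustrative endpoint string the Message Processor would call.
    public String endpoint() {
        return "http://" + host + rootUrl;
    }
}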
- event_log : a queue where the received messages are stored
Field | Type | Desc
routing_key | char(255) | routing key of an event message
body | char(255) | body of an event message
created_time | datetime | when this record is created
processed_time | datetime | when this record was processed
result | boolean | true if succeeded
message | char(255) | error message
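A sketch of how the Event Handler could append to this table via JDBC is shown below; the column names are the ones proposed above, and obtaining the database connection is assumed to happen elsewhere:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Sketch of the Event Handler's persistence step: store the routing_key and
// body of an incoming message in the event_log table.
public class EventLogWriter {

    private final Connection connection;

    public EventLogWriter(Connection connection) {
        this.connection = connection;
    }

    public void append(String routingKey, String body) throws SQLException {
        String sql = "INSERT INTO event_log (routing_key, body, created_time) "
                   + "VALUES (?, ?, NOW())";
        try (PreparedStatement stmt = connection.prepareStatement(sql)) {
            stmt.setString(1, routingKey);
            stmt.setString(2, body);
            stmt.executeUpdate();
        }
    }
}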
- Event Handler
- This is a subscriber to the message queue of the EventNotificationBus.
- It filters for 'Completed ActionEvent' messages of 'domain', 'account' and 'user' objects (a filtering sketch follows this list)
- Once it gets a message
- stores the 'routing_key' and 'body' of that message in the 'event_log' table
- launches the 'Message Processor' if it is not currently running
- waits for the next message
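A rough sketch of the filtering step is shown below; the routing-key layout (source.category.type.resourceType.resourceUuid) is an assumption and should be verified against the event bus configuration actually in use, as is whether the completed status appears in the routing key or only in the message body:

// Sketch: keep only ActionEvent messages for domain, account and user resources.
public class DomainAccountUserFilter {

    public boolean accepts(String routingKey) {
        String[] parts = routingKey.split("\\.");
        if (parts.length < 4) {
            return false;
        }
        String category = parts[1];
        String type = parts[2]; // e.g. "ACCOUNT-CREATE", "DOMAIN-DELETE"
        if (!"ActionEvent".equals(category)) {
            return false;
        }
        return type.startsWith("DOMAIN-")
            || type.startsWith("ACCOUNT-")
            || type.startsWith("USER-");
    }
}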
- Message Processor
- This processor syncs the completed action events of the master to the slave regions (a processing sketch follows this list).
- Once launched
- goes through the 'event_log' table to see if there are new and/or failed tasks
- for each new and/or failed task,
- calls the corresponding API interface on each slave with the task information
- in the case of a 'DELETE' task, syncs the overall data of each slave to that of the master because of restriction #1
- goes through the 'event_log' table again until there are no new or failed tasks
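The control flow above could look roughly like the following sketch; 'ApiClient' is a hypothetical wrapper around a slave region's API endpoint, and an 'id' primary-key column is assumed in addition to the 'event_log' fields listed earlier:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.List;

public class MessageProcessor {

    // Hypothetical wrapper around a slave region's API.
    public interface ApiClient {
        boolean replay(String routingKey, String body) throws Exception;
        void fullSyncFromMaster() throws Exception; // fallback for DELETE events
    }

    private final Connection db;
    private final List<ApiClient> slaves;

    public MessageProcessor(Connection db, List<ApiClient> slaves) {
        this.db = db;
        this.slaves = slaves;
    }

    // One pass over the new and failed tasks; the real processor would repeat
    // this until no new or failed tasks remain.
    public void runOnce() throws SQLException {
        String select = "SELECT id, routing_key, body FROM event_log "
                      + "WHERE processed_time IS NULL OR result = FALSE "
                      + "ORDER BY created_time";
        try (PreparedStatement stmt = db.prepareStatement(select);
             ResultSet rs = stmt.executeQuery()) {
            while (rs.next()) {
                process(rs.getLong("id"), rs.getString("routing_key"), rs.getString("body"));
            }
        }
    }

    private void process(long id, String routingKey, String body) {
        boolean ok = true;
        String error = null;
        for (ApiClient slave : slaves) {
            try {
                if (routingKey.contains("-DELETE")) {
                    // Restriction #1: the deleted resource's name is no longer
                    // available, so fall back to a full sync from the master.
                    slave.fullSyncFromMaster();
                } else if (!slave.replay(routingKey, body)) {
                    ok = false;
                }
            } catch (Exception e) {
                ok = false;
                error = e.getMessage();
            }
        }
        markProcessed(id, ok, error);
    }

    private void markProcessed(long id, boolean ok, String error) {
        String update = "UPDATE event_log SET processed_time = NOW(), result = ?, message = ? WHERE id = ?";
        try (PreparedStatement stmt = db.prepareStatement(update)) {
            stmt.setBoolean(1, ok);
            stmt.setString(2, error);
            stmt.setLong(3, id);
            stmt.executeUpdate();
        } catch (SQLException e) {
            // error handling omitted in this sketch
        }
    }
}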
Multiple Source Architecture
- Tables
- Event Handler
- Message Processor
References
Document History
Glossary
Use cases
Appendix