OBJECTIVE

An Admin of Apache Eagle should be able to get audit details of actions performed on Policies and Site/Datasource.

The audit details should include the User, the Action (Create/Update/Delete), the Timestamp, a link to the Policy/Site/Datasource, etc.


PROPOSED APPROACH

Approach #1

  1. Request for Create/Update/Delete comes in for Policies, Site or Datasource.
  2. After the requested data is persisted in the corresponding HBase tables, the audit information is stored in separate audit tables (one for each parent table that holds the actual data for the given service request).
  3. Response for the request is sent back to the user.

Approach #2

  1. Request for Create/Update/Delete comes in for Policies, Site or Datasource.
  2. After the requested data is persisted in the corresponding HBase tables, the audit information is stored in a single audit table (covering any action by any service).
  3. Response for the request is sent back to the user.

Approach #2 is more generic: auditing a new service does not require adding new table definitions and implementations.

 

PROPOSED DESIGN

Following is the initial design upon which changes will be made to accommodate the auditing feature.

Audit Service Flow

To explain the design, we will use the Policy Definition service as an example.

#1 – Client sends a request to create a Policy to the service component with a certain payload.

http://localhost:8080/eagle-service/rest/entities?serviceName=AlertDefinitionService

#2 – After authentication and preprocessing, the call lands in the create(Entities, EntityDefinition) method in HBaseStorage.java.

#3 – The required data for the operation is persisted onto HBase.

#4 – At this point, the method would normally build the response and send it back to the client. The auditing of the service request will be done here, before the response is sent (using one of the approaches mentioned above).

Since HBaseStorage.java has separate methods for Create, Update and Delete, we can tell which action was performed for a particular request and record it in the audit entry.

#5 – The audit data will be persisted in a single HBase audit table or in multiple HBase audit tables (depending on the approach chosen).

#6 – After the audit information is persisted in HBase, the response for the service request will be sent back to the client.
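
The flow in steps #1 to #6 can be sketched as follows. This is only an in-memory illustration under stated assumptions, not Eagle's HBaseStorage code: the class AuditedStorageSketch, the AuditAction enum, the collections standing in for the HBase tables, and the row-key format are all hypothetical. It follows Approach #2 (a single audit table).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: in-memory stand-ins for the HBase data and audit tables.
class AuditedStorageSketch {
    enum AuditAction { CREATE, UPDATE, DELETE }

    // Stand-in for a data table, keyed by encoded row key.
    final Map<String, String> dataTable = new LinkedHashMap<>();
    // Stand-in for the single audit table; one pipe-separated line per audit entry.
    final List<String> auditTable = new ArrayList<>();

    // Steps #2 to #6 for a create request: persist, audit, then respond.
    String create(String serviceName, String userId, String entityId, String payload) {
        String encodedRowKey = serviceName + "__" + entityId;          // assumed key format
        dataTable.put(encodedRowKey, payload);                         // step #3: persist data
        audit(serviceName, encodedRowKey, userId, AuditAction.CREATE); // steps #4 and #5
        return encodedRowKey;                                          // step #6: response
    }

    // Delete by encoded row key; audits only if a row was actually removed.
    boolean delete(String serviceName, String userId, String encodedRowKey) {
        boolean existed = dataTable.remove(encodedRowKey) != null;
        if (existed) {
            audit(serviceName, encodedRowKey, userId, AuditAction.DELETE);
        }
        return existed;
    }

    private void audit(String serviceName, String encodedRowKey, String userId, AuditAction action) {
        auditTable.add(String.join("|", serviceName, encodedRowKey, userId,
                action.name(), Long.toString(System.currentTimeMillis())));
    }

    public static void main(String[] args) {
        AuditedStorageSketch storage = new AuditedStorageSketch();
        String key = storage.create("AlertDefinitionService", "admin", "policy1", "{}");
        storage.delete("AlertDefinitionService", "admin", key);
        for (String entry : storage.auditTable) {
            System.out.println(entry);
        }
    }
}
```

Writing the audit entry before returning keeps the order described above (data first, audit second, response last). How a failed audit write should affect the response is not covered by this sketch.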

 

SAMPLE TABLE DESIGN

Approach #1

// Individual audit tables for individual data tables

Service #1: Policy Definition

Data Table: alertdef

Audit Table: alertdefAudit

Audit Columns:

  1. encodedRowKey (encoded format of the row key as obtained from persisting the policy data)
  2. userID
  3. actionTaken (CREATE/UPDATE/DELETE)
  4. auditTimestamp

 

Service #2: Alert Data Source

Data Table: alertDataSource

Audit Table: alertDataSourceAudit

Audit Columns:

  1. encodedRowKey (encoded format of the row key as obtained from persisting the datasource information)
  2. userID
  3. actionTaken (CREATE/UPDATE/DELETE)
  4. auditTimestamp

 

Approach #2

// Single audit table for all data tables

Service #1: Policy Definition

Service #2: Alert Data Source

Audit Table: serviceAudit

 

 

Audit Columns:

  1. serviceName (to differentiate which service the audit entry belongs to)
  2. encodedRowKey (encoded format of the row key as obtained from persisting the entity data of the given service)
  3. userID
  4. actionTaken (CREATE/UPDATE/DELETE)
  5. auditTimestamp
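
As an illustration of the Approach #2 columns, a row of the serviceAudit table could be modeled as a small value object. The class name ServiceAuditRecord, the Action enum, the toColumns() helper, and the use of epoch milliseconds for auditTimestamp are assumptions made for this sketch; only the column names come from the list above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical value object mirroring the serviceAudit columns of Approach #2.
final class ServiceAuditRecord {
    enum Action { CREATE, UPDATE, DELETE }

    final String serviceName;
    final String encodedRowKey;
    final String userID;
    final Action actionTaken;
    final long auditTimestamp; // epoch milliseconds (an assumption)

    ServiceAuditRecord(String serviceName, String encodedRowKey, String userID,
                       Action actionTaken, long auditTimestamp) {
        this.serviceName = serviceName;
        this.encodedRowKey = encodedRowKey;
        this.userID = userID;
        this.actionTaken = actionTaken;
        this.auditTimestamp = auditTimestamp;
    }

    // Column name to value, in the order the columns are listed in the design.
    Map<String, String> toColumns() {
        Map<String, String> columns = new LinkedHashMap<>();
        columns.put("serviceName", serviceName);
        columns.put("encodedRowKey", encodedRowKey);
        columns.put("userID", userID);
        columns.put("actionTaken", actionTaken.name());
        columns.put("auditTimestamp", Long.toString(auditTimestamp));
        return columns;
    }

    public static void main(String[] args) {
        ServiceAuditRecord record = new ServiceAuditRecord(
                "AlertDefinitionService", "ABC__DEF", "admin", Action.CREATE, 1456790400000L);
        System.out.println(record.toColumns());
    }
}
```

For Approach #1 the same shape would apply without the serviceName column, since each audit table already belongs to a single service.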

 

AUDIT RETRIEVAL APPROACH

Below are the designs for retrieving audit data for each of the proposed approaches.

APPROACH #1

As this approach uses multiple tables, we would need to create an Entity Definition and a DAO implementation for each of the audit tables.

SAMPLE SERVICE CALL

http://localhost:8080/eagle-service/rest/list?query=AlertDefinitionServiceAudit[@encodedRowKey="ABC__DEF"]{*}&pageSize=100

http://localhost:8080/eagle-service/rest/list?query=AlertDefinitionServiceAudit[@encodedRowKey="ABC__DEF" AND @actionTaken="CREATE/UPDATE/DELETE"]{*}&pageSize=100

APPROACH #2

As this approach uses a single audit table regardless of the number of data tables, we need only one Entity Definition and one DAO implementation for it.

Compared to Approach #1, the service call takes one additional parameter, the service name, which identifies the service whose audit entries need to be retrieved.

SAMPLE SERVICE CALL

http://localhost:8080/eagle-service/rest/list?query=AuditService[@serviceName="AlertDefinitionService" AND @encodedRowKey="ABC__DEF"]{*}&pageSize=100

http://localhost:8080/eagle-service/rest/list?query=AuditService[@serviceName="AlertDefinitionService" AND @encodedRowKey="ABC__DEF" AND @actionTaken="CREATE/UPDATE/DELETE"]{*}&pageSize=100
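
A client could assemble retrieval calls like the ones above with a small helper. The sketch below is hypothetical: the class AuditQueryUrlSketch and its methods are not part of Eagle, and URL-encoding the query expression is an assumption about how a client would transmit it. The base URL, the AuditService entity name, and the filter syntax are taken from the sample calls.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that builds audit list-query URLs in the style of the samples.
class AuditQueryUrlSketch {
    static final String BASE = "http://localhost:8080/eagle-service/rest/list";

    // Builds the unencoded expression, e.g. AuditService[@a="x" AND @b="y"]{*}
    static String buildQuery(String service, String... fieldValuePairs) {
        if (fieldValuePairs.length % 2 != 0) {
            throw new IllegalArgumentException("fields and values must come in pairs");
        }
        List<String> filters = new ArrayList<>();
        for (int i = 0; i < fieldValuePairs.length; i += 2) {
            filters.add("@" + fieldValuePairs[i] + "=\"" + fieldValuePairs[i + 1] + "\"");
        }
        return service + "[" + String.join(" AND ", filters) + "]{*}";
    }

    // Appends the encoded query and the page size to the base URL.
    static String buildUrl(String query, int pageSize) {
        return BASE + "?query=" + URLEncoder.encode(query, StandardCharsets.UTF_8)
                + "&pageSize=" + pageSize;
    }

    public static void main(String[] args) {
        String query = buildQuery("AuditService",
                "serviceName", "AlertDefinitionService",
                "encodedRowKey", "ABC__DEF");
        System.out.println(query);
        System.out.println(buildUrl(query, 100));
    }
}
```

For Approach #1 the same helper would be called with the per-service audit entity name and without the serviceName filter.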

 

GENERIC INTERFACE DESIGN

A generic interface for implementing a custom audit source, modeled on the PropertyChangeListener used in TaggedLogAPIEntity.

  1. AuditListener Interface - Contains the method to be implemented for auditing purposes.
  2. HBaseStorageAudit Class - Implements the AuditListener interface. It registers itself for callbacks and overrides the above method with the custom logic that builds the audit data and persists it to an audit table in HBase. This is where we enforce the condition that only HBase operations on entities used for Policy/Datasource are audited.
  3. HBaseStorage Class - From the create/update/delete methods, calls the method in HBaseStorageAudit that triggers the listener method.
  4. AuditSupport Class - Maintains the registered AuditListener implementations and invokes them.
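
The listener arrangement described above could look roughly like the sketch below. Only the names AuditListener and AuditSupport come from this design; the AuditEvent type, all method names, and the InMemoryStorageAudit class (a stand-in for HBaseStorageAudit that collects entries in a list instead of writing to HBase) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical event describing one audited operation.
final class AuditEvent {
    final String serviceName;
    final String encodedRowKey;
    final String userID;
    final String actionTaken;

    AuditEvent(String serviceName, String encodedRowKey, String userID, String actionTaken) {
        this.serviceName = serviceName;
        this.encodedRowKey = encodedRowKey;
        this.userID = userID;
        this.actionTaken = actionTaken;
    }
}

// Callback to be implemented by each audit sink (method name is an assumption).
interface AuditListener {
    void auditPerformed(AuditEvent event);
}

// Keeps the registered listeners and notifies them, similar in spirit to PropertyChangeSupport.
class AuditSupport {
    private final List<AuditListener> listeners = new CopyOnWriteArrayList<>();

    void addAuditListener(AuditListener listener) { listeners.add(listener); }

    void removeAuditListener(AuditListener listener) { listeners.remove(listener); }

    void fireAudit(AuditEvent event) {
        for (AuditListener listener : listeners) {
            listener.auditPerformed(event);
        }
    }
}

// Stand-in for HBaseStorageAudit: audits only Policy/Datasource services.
class InMemoryStorageAudit implements AuditListener {
    final List<String> persisted = new ArrayList<>();

    @Override
    public void auditPerformed(AuditEvent event) {
        boolean audited = "AlertDefinitionService".equals(event.serviceName)
                || "AlertDataSourceService".equals(event.serviceName);
        if (audited) {
            persisted.add(String.join("|", event.serviceName, event.encodedRowKey,
                    event.userID, event.actionTaken));
        }
    }
}

class AuditListenerDemo {
    public static void main(String[] args) {
        AuditSupport support = new AuditSupport();
        InMemoryStorageAudit sink = new InMemoryStorageAudit();
        support.addAuditListener(sink);
        // What the create method of the storage class would trigger after persisting:
        support.fireAudit(new AuditEvent("AlertDefinitionService", "ABC__DEF", "admin", "CREATE"));
        System.out.println(sink.persisted);
    }
}
```

Keeping the service filter inside the listener leaves the storage class generic: it fires an event for every operation and each listener decides what is worth recording.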

 

In addition to the audit columns mentioned in Approach #2, we can also persist the columns available in the @Tags annotation of an entity, so that the audit retrieval service can be modeled on the data retrieval service of the different entities.

We can also add a configuration parameter that controls whether the audit feature is enabled, if required.
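
A configuration switch of this kind could be read as in the sketch below. The property key eagle.service.audit.enabled and the class AuditConfigSketch are hypothetical, not existing Eagle settings, and defaulting to disabled is likewise an assumption.

```java
import java.util.Properties;

// Hypothetical reader for an audit on/off switch.
class AuditConfigSketch {
    // Assumed property key; not an existing Eagle configuration entry.
    static final String AUDIT_ENABLED_KEY = "eagle.service.audit.enabled";

    // Auditing is treated as disabled unless the property is explicitly set to true.
    static boolean isAuditEnabled(Properties props) {
        return Boolean.parseBoolean(props.getProperty(AUDIT_ENABLED_KEY, "false"));
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(AUDIT_ENABLED_KEY, "true");
        System.out.println("audit enabled: " + isAuditEnabled(props));
    }
}
```

The storage layer would check this flag once before firing any audit callback, so that a disabled audit feature adds no extra HBase writes.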


9 Comments

  1. I think option 2 makes more sense than option 1.

    This design tries to audit table modifications at a very low level, but the problem is that users want high-level audit information, for example who created which policy. How does the current design achieve this goal?

  2. Edward Zhang, we could add a few more columns to the current table design for Approach #2, say source_id, source_name and source_description. These columns would hold high-level details of the Policy/Site, such as the policy/site ID and the policy/site name.

    Another retrieval service can be added to get the details of the policies/sites created by a specific user. For example:

    -- Get user specific audits

    http://localhost:8080/eagle-service/rest/list?query=AuditService[@serviceName="AlertDefinitionService" AND @userID="admin"]{*}&pageSize=100

    -- Get user and action specific audits
    http://localhost:8080/eagle-service/rest/list?query=AuditService[@serviceName="AlertDefinitionService" AND @userID="admin" AND @actionTaken="CREATE/UPDATE/DELETE"]{*}&pageSize=100

    1. Thanks. I did not mean that we need to add source_id, source_name and source_description. My original question is: if we audit at a low level, for example in the HBaseStorage.java class, how do we know the content of the updated/created/deleted entity? In HBaseStorage everything is generic, so we don't know whether the operation is for a policy, a site or a data source.

      1. Each of the methods in HBaseStorage.java that performs an HBase operation takes an EntityDefinition parameter, from which we can tell which service is being called. Using this, we can set the value of the serviceName column in the audit table to identify whether the audit entry is for a policy, site or datasource.

        Example,

        Policy creation request >> http://localhost:8080/eagle-service/rest/entities?serviceName=AlertDefinitionService

        AlertDefinitionService - this Service Name will be available in the entity definition

        Datasource creation request >> http://localhost:8080/eagle-service/rest/entities?serviceName=AlertDataSourceService

        AlertDataSourceService - this Service Name will be available in the entity definition

        We would not need the content of the entity if we just go with the encodedRowKeys, as they are readily available in the response object for the CREATE/UPDATE/DELETE operations in HBaseStorage.java.

         

  3. Edward Zhang and Murali Krishna, here is a way to read the contents from a generic entity:

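    // Assumes 'entities' and 'entityDefinition' are in scope.
    for (TaggedLogAPIEntity entity : entities) {
        try {
            // Read the fields by name through the generic EntityDefinition.
            Object policyId = entityDefinition.getValue(entity, "policyId");
            Object policyName = entityDefinition.getValue(entity, "policyName");
            // String concatenation avoids a NullPointerException if a field is missing.
            System.out.println(policyId + "--" + policyName);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }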

    Hopefully this way we can fetch the required fields from the entity and audit them.

     

    1. Thanks for the investigation. I think we probably need a generic callback interface, so that for policy create/update/delete we can implement the logic you have mentioned, even by forcing a type conversion to AlertDefinitionAPIEntity. Please suggest.

      1. Edward Zhang, can you please review the Generic Interface Design approach that has been added?

        1. Thanks Senthil for this proposal. Nice!

  4. Edward Zhang, in the Eagle services for PolicyDefinition and DataSourceDefinition, both the create and update operations go through the create method of HBaseStorage.java. So how do we differentiate between a create and an update operation without hitting HBase again to check whether the row exists (is this acceptable from a performance point of view)? Or is there any other way to differentiate?

    Also, the delete operation is done by passing the encoded row key. We would need to do a GET operation here to get the row details (for example, the policy ID, datasource or site) for the audit.