Introduction

CloudStack currently supports provisioning volumes from a backed-up volume snapshot only through sequential job execution. This works well until a cloud provider admin or user submits several volume create requests at once: the orchestration layer cannot handle simultaneous (concurrent) volume creation from a single snapshot.


For example, suppose simultaneous create requests for volumes V1, V2, and V3 are submitted from the management server against snapshot SS. Volume V1 is created successfully without interruption. However, the CloudStack orchestration layer follows a state transition framework, and the state of snapshot SS changes to occupied while the V1 request is being served. As a result, the creation requests for the other two volumes (V2, V3) fail.
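The failure described above can be illustrated with a minimal sketch of a state machine that has no transition out of the busy state for a second copy request. This is a simplified model under stated assumptions, not CloudStack code: the class name, the READY state, and the OPERATION_SUCCEEDED event are hypothetical; only the 'Copying' state and the CopyingRequested event are named in this document.

```java
/** Simplified, hypothetical model of the snapshot state handling; not CloudStack code. */
class SnapshotStateSketch {
    enum State { READY, COPYING }
    enum Event { COPYING_REQUESTED, OPERATION_SUCCEEDED }

    private State state = State.READY;

    /** Returns the next state, or null when no transition is defined. */
    private static State next(State current, Event event) {
        if (current == State.READY && event == Event.COPYING_REQUESTED) {
            return State.COPYING;
        }
        if (current == State.COPYING && event == Event.OPERATION_SUCCEEDED) {
            return State.READY;
        }
        return null; // e.g. COPYING + COPYING_REQUESTED is not allowed
    }

    /** Applies the event; returns false if it is not allowed in the current state. */
    synchronized boolean fire(Event event) {
        State target = next(state, event);
        if (target == null) {
            return false;
        }
        state = target;
        return true;
    }

    synchronized State getState() {
        return state;
    }

    public static void main(String[] args) {
        SnapshotStateSketch ss = new SnapshotStateSketch();
        // Three create requests arrive before the first copy has finished.
        for (String volume : new String[] {"V1", "V2", "V3"}) {
            boolean accepted = ss.fire(Event.COPYING_REQUESTED);
            System.out.println(volume + (accepted ? " accepted" : " rejected: snapshot is busy"));
        }
    }
}
```

In this model V1 is accepted, and V2 and V3 are rejected because the snapshot is still in the busy state when their requests arrive.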

Purpose

This feature adds support for creating multiple data volumes in parallel from a single volume snapshot on a storage resource.

References

CLOUDSTACK-9895


Branch

Master

Document History

Author              Description       Date
Pavan Aravapalli    First revision    27 April 2017

Glossary

  • NA

Use case

  • User wants to create multiple volumes from a snapshot simultaneously.

Feature specification

  • Support concurrent volume creation from a snapshot.
  • Support creation of multiple volumes from the same snapshot at the same time.
  • Volume creation runs in parallel if the underlying hypervisor supports it.
  • No change to the backup of the volumes created (sequential or parallel depending on the hypervisor, same as the previous behavior).

Test guidelines

  • Create one or more data disks/volumes from a backed-up snapshot in quick succession.
  • The user should be able to perform all volume snapshot operations as before.

Error handling

  • The user is alerted if the volume creation process fails.
  • All errors at various levels of operations will be logged in management-server.log.

 

Audit Events

  • Events are generated in the management server logs for each volume while it is being created.

Target users

  1. CloudStack Users and Admins.

Current behavior

The behavior of multiple volume create operations varies by hypervisor, as described below.

VMware:  

                 TODO : Yet to Update Details

XenServer: Xen allows concurrent requests. Requests are handled in one of the following ways:

      • Sequential: In the CloudStack resource layer, the endpoint (host) is selected using a rand() function, and the selected host processes the copy command for volume creation. If rand() selects the same host for every concurrent create-volume-from-snapshot request, that single host has to handle all of the requests, so it copies the snapshot content to the volumes one after another.

      • Parallel: If rand() selects a different host for every concurrent create-volume-from-snapshot request, the hosts copy the snapshot content to their volumes in parallel.

        Note: The existing functionality is limited because the rand() function has problems handling multiple hosts. This may need to be addressed in a future release.
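The effect of random endpoint selection can be sketched as follows. This is a hypothetical illustration, not the CloudStack endpoint selector: the class, method, and host names are assumptions. Each request picks a host at random, and every host works through its own queue one volume at a time, so the copies are fully parallel only when no host receives more than one request.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

/** Hypothetical illustration of random endpoint selection; not the CloudStack selector. */
class EndpointSelectionSketch {

    /** Assigns each requested volume to a randomly chosen host and returns each host's work queue. */
    static Map<String, List<String>> assign(List<String> volumes, List<String> hosts, Random random) {
        Map<String, List<String>> queues = new LinkedHashMap<>();
        for (String volume : volumes) {
            String host = hosts.get(random.nextInt(hosts.size()));
            queues.computeIfAbsent(host, h -> new ArrayList<>()).add(volume);
        }
        return queues;
    }

    /** Copies run fully in parallel only if no host received more than one request. */
    static boolean fullyParallel(Map<String, List<String>> queues) {
        return queues.values().stream().allMatch(queue -> queue.size() <= 1);
    }

    public static void main(String[] args) {
        List<String> volumes = List.of("V1", "V2", "V3");
        List<String> hosts = List.of("host-1", "host-2", "host-3");
        Map<String, List<String>> queues = assign(volumes, hosts, new Random());
        // The result differs from run to run because the selection is random.
        queues.forEach((host, queue) -> System.out.println(host + " copies " + queue + " in order"));
        System.out.println("fully parallel: " + fullyParallel(queues));
    }
}
```

With a single available host, every request lands in the same queue and execution is always sequential.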

KVM: KVM allows concurrent requests.

          TODO : Yet to update other details

 

Design description

At present, creating volumes from a snapshot is restricted to one volume per snapshot at a time, because the ObjectInDataStore state machine supports only one job at a time. Since the hypervisors allow volume creation from a snapshot to run in parallel, depending on the host/endpoint selected, this feature exposes the same ability and saves time when creating multiple volumes from a single snapshot in quick succession. The possible solutions are:


  1. Add a transition with source state 'Copying', destination state 'Copying', and event CopyingRequested.
    • Introduce a new conncurrent_req_refer_count counter for handling concurrent requests. The counter is incremented for every new concurrent request and decremented by one after a request has been processed (either handled successfully or failed with an exception).
    • The snapshot store template state remains 'Copying' until conncurrent_req_refer_count is decremented to zero.
    • Volume event processing remains the same; there is no functional change in volume creation.
  2. Add a transition with source state 'Copying', destination state 'Copying', and event CopyingRequested.
    • Create a volume snapshot metadata table (id, volume_id, snapshot_id) that holds the volume and snapshot information for each create-volume-from-snapshot request.
    • Insert a row for every concurrent request.
    • After a concurrent request has been processed (success or failure), delete its entry from the metadata table.
    • The snapshot reference state remains 'Copying' while the query select count(*) from volume_snapshot_meta_data where snapshot_id = ? returns a value greater than 0, which means concurrent requests are still in progress.
    • If that query returns 0, reset the snapshot state entry in the snapshot_store_ref table to its default state.

  3. Truncate the state transition for the snapshot reference state, and do not change the state of the derived snapshot entity (the SNAPSHOT_STORE_REF table entry for the snapshot). Store concurrent request information in the VOLUME_DETAILS table as described below.
    • For each new request against a given snapshot, insert a row in VOLUME_DETAILS with the values (id, volume_id, name, snapshot_id).
    • snapshot_id is the value of id from the snapshot table; name is the key name, i.e. "SNAPSHOT_ID".
    • Create an index on the volume_id column of the VOLUME_DETAILS table (if required).
    • The row is inserted when a new concurrent request arrives.
    • Delete the row by volume_id once the request has been processed (success or failure).
    • Verify that overriding the state transition does not break any existing functionality.
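Options 1 and 2 above share one idea: the snapshot stays in 'Copying' for as long as at least one request is outstanding, and they differ only in where the count is kept. The following is a minimal sketch of that idea under stated assumptions, not the CloudStack implementation. The class and method names and the READY default state are hypothetical, and the counter field stands in for the proposed conncurrent_req_refer_count (option 2 would derive the same number from rows in the metadata table).

```java
/** Hypothetical sketch of options 1 and 2: the snapshot stays COPYING while requests are outstanding. */
class RefCountedSnapshotSketch {
    enum State { READY, COPYING }

    private State state = State.READY;
    // Stands in for the proposed conncurrent_req_refer_count.
    private int referCount = 0;

    /** CopyingRequested: accepted from READY and, with the added transition, from COPYING too. */
    synchronized void copyingRequested() {
        referCount++;
        state = State.COPYING;
    }

    /** Called exactly once per request, whether it succeeded or failed. */
    synchronized void requestFinished() {
        if (referCount == 0) {
            throw new IllegalStateException("no request in progress");
        }
        referCount--;
        if (referCount == 0) {
            state = State.READY; // last outstanding request is done
        }
    }

    synchronized State getState() {
        return state;
    }

    synchronized int getReferCount() {
        return referCount;
    }

    public static void main(String[] args) {
        RefCountedSnapshotSketch ss = new RefCountedSnapshotSketch();
        ss.copyingRequested(); // V1
        ss.copyingRequested(); // V2
        ss.copyingRequested(); // V3
        for (int i = 0; i < 3; i++) {
            ss.requestFinished();
            System.out.println("outstanding=" + ss.getReferCount() + " state=" + ss.getState());
        }
    }
}
```

The decrement has to run on both the success and the failure path; otherwise the snapshot would remain in 'Copying' indefinitely.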

Note: Solution 3 was selected for implementation as the optimal way to support concurrent volume creation requests.
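The bookkeeping for solution 3 can be sketched with an in-memory stand-in for the VOLUME_DETAILS rows. This is an illustration under stated assumptions, not the CloudStack DAO layer: the class, field, and method names are hypothetical, and a map keyed by volume_id replaces the database table and its index. The snapshot's own state is never touched; a row exists only while its request is in flight.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical in-memory stand-in for the VOLUME_DETAILS bookkeeping of solution 3. */
class VolumeDetailsSketch {
    static final String SNAPSHOT_ID_KEY = "SNAPSHOT_ID";

    /** One row: (id, volume_id, name, snapshot_id). */
    static final class DetailRow {
        final long id;
        final long volumeId;
        final String name;
        final long snapshotId;

        DetailRow(long id, long volumeId, String name, long snapshotId) {
            this.id = id;
            this.volumeId = volumeId;
            this.name = name;
            this.snapshotId = snapshotId;
        }
    }

    // Keyed by volume_id, mirroring the lookup the proposed index would serve.
    private final Map<Long, DetailRow> rowsByVolumeId = new ConcurrentHashMap<>();
    private final AtomicLong nextId = new AtomicLong(1);

    /** Inserts a row when a create-volume-from-snapshot request arrives. */
    void requestStarted(long volumeId, long snapshotId) {
        rowsByVolumeId.put(volumeId,
                new DetailRow(nextId.getAndIncrement(), volumeId, SNAPSHOT_ID_KEY, snapshotId));
    }

    /** Deletes the row by volume_id once the request has succeeded or failed. */
    void requestFinished(long volumeId) {
        rowsByVolumeId.remove(volumeId);
    }

    /** Number of requests currently in flight for the given snapshot. */
    long inFlight(long snapshotId) {
        return rowsByVolumeId.values().stream()
                .filter(row -> row.snapshotId == snapshotId)
                .count();
    }

    public static void main(String[] args) {
        VolumeDetailsSketch details = new VolumeDetailsSketch();
        long snapshotId = 100L; // hypothetical id of snapshot SS
        details.requestStarted(1L, snapshotId); // V1
        details.requestStarted(2L, snapshotId); // V2
        details.requestStarted(3L, snapshotId); // V3
        System.out.println("in flight: " + details.inFlight(snapshotId));
        details.requestFinished(1L);
        System.out.println("in flight after V1 finished: " + details.inFlight(snapshotId));
    }
}
```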

 

 Known Things 

  1. If the Management Server is restarted while serving concurrent requests, the expected behavior is the same as for single-request handling, i.e. the Management Server does not perform any volume/snapshot sync operation after a restart.
  2. Delete Snapshot and Create Template operations are permitted while a create volume operation is in progress. This should be addressed in a future release or as a bug fix.

 

TODO: Yet to Check and add if any other solutions.

Limitations / Assumptions

  1. In the case of XenServer, the endpoint/host that executes the copy command is selected by a random algorithm. If a different endpoint is selected for each concurrent request, the copy commands run concurrently; otherwise, execution at the resource layer is sequential.

Work Flow

  • Go to Snapshots and create multiple volumes from the snapshot consecutively.

API Changes

No changes

DB Changes

N/A

 

Hypervisors supported

XenServer, KVM, VMware

 

UI Changes

No changes

Upgrade

N/A

Open Items/Questions

N/A

References

N/A
