The goal is to provide the S3 REST API calls in CloudStack. Specifically, the solution should be installable via a self-contained virtual system server within a CloudStack management server.
Operationally the solution will be part of migrating the CloudBridge code into the CloudStack codebase mainstream.
The technical changes emphasise increasing fidelity to the Amazon API specification so far as this is realistic and to start to demonstrate a level of confidence in interworking with the popular third party S3 clients.
The purpose is to facilitate the use of Amazon S3 API compatible tooling and solutions in CloudStack enterprise clouds and datacenters.
We have provided the REST API as the intended solution because it is the most widely adopted and scalable in a distributed cloud setting.
Scope of this document is to provide a functional specification for the EC2 integration and fidelity work planned for the Bonita release of CloudStack.
Ideally the following should be accomplished
Deployment - Solution to be installable via a self contained virtual system server within a CloudStack management server.
The S3 API is an optional Technology Preview solution which may be enabled at user discretion.
The plan is to implement the main operations of the S3 RESTful service including:
List All Buckets, GET Bucket (ListObjects), GET Bucket acl, GET Bucket policy, GET Bucket location, GET Bucket Object versions, GET Bucket versioning, HEAD Bucket, List Multipart Uploads, PUT Bucket, PUT Bucket acl, PUT Bucket policy, PUT Bucket versioning, PUT Bucket website, DELETE Bucket, DELETE Bucket policy, GET Object, GET Object acl, HEAD Object, POST Object, PUT Object, PUT Object acl, Initiate Multipart Upload, Upload Part, Complete Multipart Upload, Abort Multipart Upload, List Parts, DELETE Object, Delete Multiple Objects
See 'Use cases' for the key requests to be tested.
Fidelity to the Amazon S3 API embraces:
To allow the above to be configurable to the resources, especially simple storage resources, at a given cloud management installation.
No SOAP - The SOAP API will be deprecated in the S3 translation layer; SOAP requests result in an explanatory message and an HTTP 501 response.
No internationalization - Messages returned in responses are available in (American) English only, identical to those in amazonaws wherever possible.
No regions – The Amazon AWS S3 provision for geographic regions, plus additionally a default ‘US Standard’ pan-regional option, will not be present in this design. Consequently when a location constraint is processed it will be ignored and, if created, will be empty by default.
The list of supported operations will not be fully coextensive with those at s3.amazonaws.com at this release. The Amazon S3 operations which are not supported within the current release are as follows.
Operations which are candidates for omission are: GET bucket lifecycle, GET bucket notification, GET bucket request payment, GET bucket website, PUT bucket lifecycle, GET object torrent, PUT bucket notification, PUT request payment, PUT bucket website, DELETE bucket lifecycle, DELETE bucket website, Upload part – Copy, Delete multiple objects
Also GET bucket logging, PUT bucket logging (which are in AWS S3 beta, currently).
A further limitation is that bucket ACLs utilize only CannedACL tokens. This approach is mainly in agreement with the O'Reilly Python and AWS Cookbook by Mitch Garnaat for installations that do not use Amazon Identity and Access Management (IAM), which is beyond the scope of this installation. Corresponding operations on policies are also not provided, i.e. neither GET Bucket policy nor PUT Bucket policy.
Lexical rules for hosts and buckets are to be strict DNS compatible naming, i.e. not relaxed to allow mixed case or underscores. The latter is allowed by AWS console creation in the case of the ‘US Standard’ pan-regional option but is not adopted in this solution because it has no cross-region portability.
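The strict naming rule can be illustrated with a small validator. This is a minimal sketch, not the solution's own code: the function name is hypothetical, and the specific limits (3 to 63 characters, dot-separated labels of lowercase letters, digits and hyphens, no IP-address form) are assumed from Amazon's published DNS-compliant bucket naming guidance rather than taken from this specification.

```python
import re

# One DNS label: lowercase letters, digits and hyphens, starting and
# ending with a letter or digit (assumed rule, see lead-in).
_LABEL = re.compile(r"[a-z0-9]([a-z0-9-]*[a-z0-9])?")
# Names formatted like an IPv4 address are rejected.
_IPV4 = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def is_dns_compatible_bucket_name(name):
    """Return True if name satisfies the strict, DNS-compatible rules."""
    if not 3 <= len(name) <= 63:
        return False
    if _IPV4.fullmatch(name):
        return False
    # Every dot-separated label must be a valid DNS label; an empty
    # label (from leading, trailing or doubled dots) fails the match.
    return all(_LABEL.fullmatch(label) for label in name.split("."))


for candidate in ("my-bucket", "My_Bucket", "a..b", "192.168.1.1"):
    print(candidate, is_dns_compatible_bucket_name(candidate))
```

Mixed case and underscores are rejected outright here, which mirrors the decision not to adopt the relaxed 'US Standard' naming.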
If the S3 API Technology Preview solution has been enabled at user discretion, the following debugging steps are available.
The administrator may consult the logs in the logs subdirectory of the $CATALINA_HOME location. The log cloud-bridge.log records events which may be useful for troubleshooting.
Developers who wish to run the S3 API stack inside Eclipse may take advantage of remote application debugging. To do this, run the JPDA configuration of the Tomcat application, e.g. $CATALINA_HOME/bin/catalina.sh jpda start. If the defaults have been accepted, this will run a debug version of the application on port 8787, and a Remote Application listener can be configured on the application at port 8000 (by default) so as to step through chosen Java components. Both the Helios and Indigo releases of Eclipse are suitable choices for running this debug activity.
Enable the S3 API by setting the flag enable.s3.api to 'true' in the configuration table. This can be done via the UI or directly in MySQL:
update configuration set value='true' where name='enable.s3.api'
The first step is to define the location of CATALINA_HOME. This is the location of the supported Java application server (currently Apache Tomcat 6.0.33) which is serving on the S3 API solution's virtual server.
The configuration environment is controlled by a file which needs to be accurately defined at the time of installation. Within the CloudBridge installation directory, the file is at conf/cloud-bridge.properties. Typical configuration information defined in this file is:
host=http://myhost:8080/awsapi
storage.root=/mounts/mymountpoint
storage.multipartDir=_multipartuploads_
bucket.dns=false
serviceEndpoint=myhost:8080
So configured, the S3 API REST translation service will be running at http://myhost:8080/awsapi/rest/AmazonS3/.
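The relationship between the host property and the resulting service URL can be sketched as follows. This is only an illustration under stated assumptions: the helper names parse_properties and rest_endpoint are hypothetical, the parser handles only simple key=value lines, and the /rest/AmazonS3/ suffix is taken from the URL quoted above.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def rest_endpoint(props):
    """Append the REST servlet path to the configured host URL."""
    return props["host"].rstrip("/") + "/rest/AmazonS3/"


# Sample content mirroring the properties listed in this document.
SAMPLE = """\
host=http://myhost:8080/awsapi
storage.root=/mounts/mymountpoint
storage.multipartDir=_multipartuploads_
bucket.dns=false
serviceEndpoint=myhost:8080
"""

props = parse_properties(SAMPLE)
print(rest_endpoint(props))
```

A check of this kind can be useful at installation time to confirm that the host value in the properties file matches the URL that clients are given.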
The following step, with Tomcat running, is to set up user keys using the script awsapi-setup/setup/cloudstack-aws-api-register. Invoke it as in the following example:
./cloudstack-aws-api-register -u http://localhost:8080/awsapi/rest/AmazonS3 -a MyAccessIDKey -s MySecretKey openssl_generated.mycert.pem
The capabilities of the S3 API are intended to satisfy the following use cases. For an overview of the expected capabilities, see docs.amazonwebservices.com/AmazonS3/latest/API/APIRest.htm.
The design establishes an Axis2 webservice acting as a REST servlet, taking lawful HTTP requests such as those validated by the tools discussed previously and providing the HTTP response in accordance with doc.s3.amazonaws.com/2006-03-01/.
The design calls for a total of 14 tables to be held in a database. These hold the datacenter's metadata for user credentials, the endpoint (or master host) for the service offering, the host location and credentials of the object storage service (or slave host), and the specific paths that identify buckets, objects, access controls and policies.
The job of the REST servlet is to model the user status and the requests emanating from such users, so as to validate the requests and provide the legal responses to each request. As a piece of REST design, in accordance with the Amazon docs, the legitimate requests are structured according to the specific HTTP verb, i.e. GET, HEAD, PUT, POST or DELETE. The requests, once validated, allow the lawful use of the storage service and provide a valid HTTP response to allow the S3 API command tool to reason about the result.
The lawful use of the storage service is governed by the status of each incoming request. Certain business logic steps, governed by access control and by granter-grantee rights, are implemented as part of the actions on S3 Buckets and S3 Objects. Unlawful requests result in a REST error response. Unimplemented functions result in a form of REST error response indicating that it is a service limitation.
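The verb-based structure described above can be sketched as a small routing table. This is a hedged illustration, not the servlet's actual code: the Response record, the handler functions and the routing keys are hypothetical, and the use of 405 MethodNotAllowed and 501 NotImplemented is assumed from Amazon's published S3 error codes.

```python
from collections import namedtuple

# Hypothetical response record: HTTP status, S3-style error code, body.
Response = namedtuple("Response", ["status", "code", "body"])

VERBS = ("GET", "HEAD", "PUT", "POST", "DELETE")


def get_bucket(target):
    return Response(200, None, "objects in " + target)


def put_bucket(target):
    return Response(200, None, "created " + target)


def delete_bucket(target):
    return Response(204, None, "")


# Routing table keyed by (verb, resource kind); entries are illustrative.
HANDLERS = {
    ("GET", "bucket"): get_bucket,
    ("PUT", "bucket"): put_bucket,
    ("DELETE", "bucket"): delete_bucket,
}


def dispatch(verb, resource, target):
    """Route a request by HTTP verb and resource kind."""
    verb = verb.upper()
    if verb not in VERBS:
        return Response(405, "MethodNotAllowed", "unsupported verb " + verb)
    handler = HANDLERS.get((verb, resource))
    if handler is None:
        # No handler registered: report a service limitation.
        message = "%s %s is not implemented" % (verb, resource)
        return Response(501, "NotImplemented", message)
    return handler(target)


print(dispatch("GET", "bucket", "photos"))
print(dispatch("GET", "bucket-lifecycle", "photos"))
```

Keeping unimplemented operations out of the table, instead of writing stub handlers, means that the service-limitation response is produced in one place.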
The use of the 14 database tables is critical to the design.
These database tables are held in the CLOUDBRIDGE database supplied alongside the CloudStack cloud management server.
Logically, all data manipulated in the S3 storage service API is controlled exclusively by the REST servlet and consequent actions located in S3-specific packages.
It is not envisaged that any other CloudStack software need access these tables.
+-------------------+
| Tables            |
+-------------------+
| acl               |
| bucket_policies   |
| meta              |
| mhost             |
| mhost_mount       |
| multipart_meta    |
| multipart_parts   |
| multipart_uploads |
| offering_bundle   |
| sbucket           |
| shost             |
| sobject           |
| sobject_item      |
| usercredentials   |
+-------------------+
The tables used are: acl, bucket_policies, meta, mhost, mhost_mount, multipart_meta, multipart_parts, multipart_uploads, offering_bundle, sbucket, shost, sobject, sobject_item, usercredentials.
A user such as cloud, password cloud, shall be given all read-write privileges at deployment time. See also Appendix 1.
The design imposes a service lifecycle in which
To validate the request data structure (termed the canonical string) the following rules are enforced by the design:
In processing the URI, three formats can be distinguished:
In the current implementation of the solution we are concentrating on the first two of these.
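For orientation, the canonical string mentioned above can be related to Amazon's published REST authentication scheme (HMAC-SHA1 over a string-to-sign). The sketch below follows that published scheme as an assumption; it does not reproduce the validation rules of this design, and the function names are hypothetical.

```python
import base64
import hashlib
import hmac


def string_to_sign(verb, content_md5, content_type, date, amz_headers, resource):
    """Build the canonical string for one request (assumed AWS layout)."""
    # x-amz-* headers are lower-cased, trimmed and sorted by name.
    items = sorted(
        (name.lower(), value.strip())
        for name, value in amz_headers.items()
        if name.lower().startswith("x-amz-")
    )
    canonical_amz = "".join("%s:%s\n" % item for item in items)
    head = "\n".join([verb, content_md5, content_type, date])
    return head + "\n" + canonical_amz + resource


def sign(secret_key, to_sign):
    """Return the Base64-encoded HMAC-SHA1 of the canonical string."""
    digest = hmac.new(
        secret_key.encode("utf-8"), to_sign.encode("utf-8"), hashlib.sha1
    ).digest()
    return base64.b64encode(digest).decode("ascii")


def authorization_header(access_key, secret_key, to_sign):
    """Format the Authorization header value for the request."""
    return "AWS %s:%s" % (access_key, sign(secret_key, to_sign))


canonical = string_to_sign(
    "PUT", "", "text/plain", "Tue, 27 Mar 2007 19:36:42 +0000",
    {"X-Amz-Acl": "public-read"}, "/mybucket/hello.txt",
)
print(authorization_header("MyAccessIDKey", "MySecretKey", canonical))
```

A server-side check would rebuild the same string from the incoming request, sign it with the stored secret key for the presented access key, and compare the result with the signature supplied by the client.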
The processing of buckets and objects is critically dependent on the rights ascribed to the requester via ACLs. By default every resource has an ACL, associated at create or update time. The default ACL marks the resource as private, i.e. the owner has full control. An ACL can be updated by its owner. Each ACL can hold up to 100 grant rules.
A grant rule defines a grantee with a specific permission value. Canned ACLs are provided to make the rules easier to apply. An ACL document provides a user or group grantee type with canonical string descriptions. Groups are pre-established organizational units (public, any account holder, bucket access loggers), identified by xsi:type. Five types of permission may be granted: READ, WRITE, READ_ACP, WRITE_ACP, FULL_CONTROL.
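The grant model can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the solution's model classes: the Acl class, the group constants and the canned_acl helper are hypothetical, the canned token names follow Amazon's published canned ACLs, and the sketch assumes that FULL_CONTROL implies every other permission and that the owner's default grant counts toward the limit of 100.

```python
PERMISSIONS = ("READ", "WRITE", "READ_ACP", "WRITE_ACP", "FULL_CONTROL")
MAX_GRANTS = 100

# Hypothetical markers for two of the pre-established groups.
ALL_USERS = "group:AllUsers"
AUTHENTICATED_USERS = "group:AuthenticatedUsers"


class Acl:
    """An ACL that starts out private: the owner has full control."""

    def __init__(self, owner):
        self.owner = owner
        self.grants = [(owner, "FULL_CONTROL")]

    def add_grant(self, grantee, permission):
        if permission not in PERMISSIONS:
            raise ValueError("unknown permission: " + permission)
        if len(self.grants) >= MAX_GRANTS:
            raise ValueError("an ACL holds at most %d grants" % MAX_GRANTS)
        self.grants.append((grantee, permission))

    def allows(self, requester, permission, authenticated=True):
        """Check whether any grant gives requester the permission."""
        for grantee, granted in self.grants:
            matches = (
                grantee == requester
                or grantee == ALL_USERS
                or (grantee == AUTHENTICATED_USERS and authenticated)
            )
            if matches and granted in (permission, "FULL_CONTROL"):
                return True
        return False


def canned_acl(owner, token):
    """Expand a canned ACL token into explicit grants."""
    acl = Acl(owner)
    if token == "public-read":
        acl.add_grant(ALL_USERS, "READ")
    elif token == "public-read-write":
        acl.add_grant(ALL_USERS, "READ")
        acl.add_grant(ALL_USERS, "WRITE")
    elif token == "authenticated-read":
        acl.add_grant(AUTHENTICATED_USERS, "READ")
    elif token != "private":
        raise ValueError("unknown canned ACL: " + token)
    return acl


acl = canned_acl("alice", "public-read")
print(acl.allows("bob", "READ"), acl.allows("bob", "WRITE"))
```

Expanding canned tokens into explicit grants at create time lets the access check work from a single list, whichever way the ACL was specified.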
Code packaging - Currently the code is organized into a package tree rooted in the cloud.com overall package. It is organized into subdivisions of cloud.com.bridge packages, to distinguish these packages from com.cloud.stack related ones.
The main subdivisions are
lifecycle - controls the start and stop of Axis2
auth - maps a client signature to a valid user credential, looked up by an instance of type UserCredentialDao
service - the key definitions for interacting with REST requests and providing their responses
io - helper classes concerned with the intimate details of file, input, output and streaming behaviour and its processing in memory
util - other helper classes, defining and manipulating object structures to be used by the services
model - the classes which the services instantiate to get S3 operation metadata: the master and slave hosts in use, the acls created, the user credentials registered, the representations of buckets and their objects together with user-generated metadata and object items during assembly via multipart upload
persist - the lookup of attributes from the MySQL database tables mentioned above, using the hibernate ORM framework
The ORMs as presently devised are listed in the appendix dealing with the persistence design.
In addition to the business rules, the structure of the CRUD operations upon buckets and objects is governed by an S3 engine, called from the REST servlet according to the action to be executed. In turn the engine makes requests to instances of model classes. These map CRUD queries into the related SQL.
A Hibernate layer coordinates between the business logic executed in the REST servlet and the SQL definitions in the MySQL database. See also Appendix 2.
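# Connect to the S3 API service using the boto library.
from boto.s3.connection import OrdinaryCallingFormat, S3Connection

# Path-style addressing: the bucket name travels in the URL path.
calling_format = OrdinaryCallingFormat()
connection = S3Connection(
    aws_access_key_id='<your api key>',
    aws_secret_access_key='<your secret key>',
    is_secure=False,
    host='<cloudstack-server>',
    port=7080,
    calling_format=calling_format,
    path='/awsapi/rest/AmazonS3')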
CLOUDBRIDGE data definitions
acl
+--------------------+--------------+------+-----+---------+----------------+
| Field              | Type         | Null | Key | Default | Extra          |
+--------------------+--------------+------+-----+---------+----------------+
| ID                 | bigint(20)   | NO   | PRI | NULL    | auto_increment |
| Target             | varchar(64)  | NO   | MUL | NULL    |                |
| TargetID           | bigint(20)   | NO   |     | NULL    |                |
| GranteeType        | int(11)      | NO   |     | 0       |                |
| GranteeCanonicalID | varchar(150) | YES  |     | NULL    |                |
| Permission         | int(11)      | NO   |     | 0       |                |
| GrantOrder         | int(11)      | NO   |     | 0       |                |
| CreateTime         | datetime     | YES  |     | NULL    |                |
| LastModifiedTime   | datetime     | YES  | MUL | NULL    |                |
+--------------------+--------------+------+-----+---------+----------------+

bucket_policies
+------------------+----------------+------+-----+---------+----------------+
| Field            | Type           | Null | Key | Default | Extra          |
+------------------+----------------+------+-----+---------+----------------+
| ID               | bigint(20)     | NO   | PRI | NULL    | auto_increment |
| BucketName       | varchar(64)    | NO   | UNI | NULL    |                |
| OwnerCanonicalID | varchar(150)   | NO   |     | NULL    |                |
| Policy           | varchar(20000) | NO   |     | NULL    |                |
+------------------+----------------+------+-----+---------+----------------+

meta
+----------+--------------+------+-----+---------+----------------+
| Field    | Type         | Null | Key | Default | Extra          |
+----------+--------------+------+-----+---------+----------------+
| ID       | bigint(20)   | NO   | PRI | NULL    | auto_increment |
| Target   | varchar(64)  | NO   | MUL | NULL    |                |
| TargetID | bigint(20)   | NO   |     | NULL    |                |
| Name     | varchar(64)  | NO   |     | NULL    |                |
| Value    | varchar(256) | YES  |     | NULL    |                |
+----------+--------------+------+-----+---------+----------------+

mhost
+-------------------+--------------+------+-----+---------+----------------+
| Field             | Type         | Null | Key | Default | Extra          |
+-------------------+--------------+------+-----+---------+----------------+
| ID                | bigint(20)   | NO   | PRI | NULL    | auto_increment |
| MHostKey          | varchar(128) | NO   | MUL | NULL    |                |
| Host              | varchar(128) | YES  | UNI | NULL    |                |
| Version           | varchar(64)  | YES  |     | NULL    |                |
| LastHeartbeatTime | datetime     | YES  | MUL | NULL    |                |
+-------------------+--------------+------+-----+---------+----------------+

mhost_mount
+---------------+--------------+------+-----+---------+----------------+
| Field         | Type         | Null | Key | Default | Extra          |
+---------------+--------------+------+-----+---------+----------------+
| ID            | bigint(20)   | NO   | PRI | NULL    | auto_increment |
| MHostID       | bigint(20)   | NO   | MUL | NULL    |                |
| SHostID       | bigint(20)   | NO   | MUL | NULL    |                |
| MountPath     | varchar(256) | YES  |     | NULL    |                |
| LastMountTime | datetime     | YES  | MUL | NULL    |                |
+---------------+--------------+------+-----+---------+----------------+

multipart_meta
+----------+--------------+------+-----+---------+----------------+
| Field    | Type         | Null | Key | Default | Extra          |
+----------+--------------+------+-----+---------+----------------+
| ID       | bigint(20)   | NO   | PRI | NULL    | auto_increment |
| UploadID | bigint(20)   | NO   | MUL | NULL    |                |
| Name     | varchar(64)  | NO   |     | NULL    |                |
| Value    | varchar(256) | YES  |     | NULL    |                |
+----------+--------------+------+-----+---------+----------------+

multipart_parts
+------------+--------------+------+-----+---------+----------------+
| Field      | Type         | Null | Key | Default | Extra          |
+------------+--------------+------+-----+---------+----------------+
| ID         | bigint(20)   | NO   | PRI | NULL    | auto_increment |
| UploadID   | bigint(20)   | NO   | MUL | NULL    |                |
| partNumber | int(11)      | NO   |     | NULL    |                |
| MD5        | varchar(128) | YES  |     | NULL    |                |
| StoredPath | varchar(256) | YES  |     | NULL    |                |
| StoredSize | bigint(20)   | NO   |     | 0       |                |
| CreateTime | datetime     | YES  |     | NULL    |                |
+------------+--------------+------+-----+---------+----------------+

multipart_uploads
+------------+--------------+------+-----+---------+----------------+
| Field      | Type         | Null | Key | Default | Extra          |
+------------+--------------+------+-----+---------+----------------+
| ID         | bigint(20)   | NO   | PRI | NULL    | auto_increment |
| AccessKey  | varchar(150) | NO   |     | NULL    |                |
| BucketName | varchar(64)  | NO   |     | NULL    |                |
| NameKey    | varchar(255) | NO   |     | NULL    |                |
| x_amz_acl  | varchar(64)  | YES  |     | NULL    |                |
| CreateTime | datetime     | YES  |     | NULL    |                |
+------------+--------------+------+-----+---------+----------------+

offering_bundle
+--------------------+--------------+------+-----+---------+----------------+
| Field              | Type         | Null | Key | Default | Extra          |
+--------------------+--------------+------+-----+---------+----------------+
| ID                 | int(11)      | NO   | PRI | NULL    | auto_increment |
| AmazonEC2Offering  | varchar(100) | NO   | UNI | NULL    |                |
| CloudStackOffering | varchar(20)  | NO   |     | NULL    |                |
+--------------------+--------------+------+-----+---------+----------------+

sbucket
+------------------+--------------+------+-----+---------+----------------+
| Field            | Type         | Null | Key | Default | Extra          |
+------------------+--------------+------+-----+---------+----------------+
| ID               | bigint(20)   | NO   | PRI | NULL    | auto_increment |
| Name             | varchar(64)  | NO   | UNI | NULL    |                |
| OwnerCanonicalID | varchar(150) | NO   | MUL | NULL    |                |
| SHostID          | bigint(20)   | YES  | MUL | NULL    |                |
| CreateTime       | datetime     | YES  | MUL | NULL    |                |
| VersioningStatus | int(11)      | NO   |     | 0       |                |
+------------------+--------------+------+-----+---------+----------------+

shost
+---------------+--------------+------+-----+---------+----------------+
| Field         | Type         | Null | Key | Default | Extra          |
+---------------+--------------+------+-----+---------+----------------+
| ID            | bigint(20)   | NO   | PRI | NULL    | auto_increment |
| Host          | varchar(128) | NO   | MUL | NULL    |                |
| HostType      | int(11)      | NO   |     | 0       |                |
| ExportRoot    | varchar(128) | NO   |     | NULL    |                |
| MHostID       | bigint(20)   | YES  | MUL | NULL    |                |
| UserOnHost    | varchar(64)  | YES  |     | NULL    |                |
| UserPasssword | varchar(128) | YES  |     | NULL    |                |
| UserPassword  | varchar(255) | YES  |     | NULL    |                |
+---------------+--------------+------+-----+---------+----------------+

sobject
+------------------+--------------+------+-----+---------+----------------+
| Field            | Type         | Null | Key | Default | Extra          |
+------------------+--------------+------+-----+---------+----------------+
| ID               | bigint(20)   | NO   | PRI | NULL    | auto_increment |
| SBucketID        | bigint(20)   | NO   | MUL | NULL    |                |
| NameKey          | varchar(255) | NO   |     | NULL    |                |
| OwnerCanonicalID | varchar(150) | NO   | MUL | NULL    |                |
| NextSequence     | int(11)      | NO   |     | 1       |                |
| DeletionMark     | varchar(150) | YES  |     | NULL    |                |
| CreateTime       | datetime     | YES  | MUL | NULL    |                |
+------------------+--------------+------+-----+---------+----------------+

sobject_item
+------------------+--------------+------+-----+---------+----------------+
| Field            | Type         | Null | Key | Default | Extra          |
+------------------+--------------+------+-----+---------+----------------+
| ID               | bigint(20)   | NO   | PRI | NULL    | auto_increment |
| SObjectID        | bigint(20)   | NO   | MUL | NULL    |                |
| Version          | varchar(64)  | YES  |     | NULL    |                |
| MD5              | varchar(128) | YES  |     | NULL    |                |
| StoredPath       | varchar(256) | YES  |     | NULL    |                |
| StoredSize       | bigint(20)   | NO   | MUL | 0       |                |
| CreateTime       | datetime     | YES  | MUL | NULL    |                |
| LastModifiedTime | datetime     | YES  | MUL | NULL    |                |
| LastAccessTime   | datetime     | YES  | MUL | NULL    |                |
+------------------+--------------+------+-----+---------+----------------+

usercredentials
+--------------+--------------+------+-----+---------+----------------+
| Field        | Type         | Null | Key | Default | Extra          |
+--------------+--------------+------+-----+---------+----------------+
| ID           | bigint(20)   | NO   | PRI | NULL    | auto_increment |
| AccessKey    | varchar(150) | NO   | UNI | NULL    |                |
| SecretKey    | varchar(150) | NO   |     | NULL    |                |
| CertUniqueId | varchar(200) | YES  | UNI | NULL    |                |
+--------------+--------------+------+-----+---------+----------------+
OR mapping definitions