As soon as you enter one of these directory names, a Terraform template launches an instance in your AWS account, executes the necessary setup logic, and then stops the instance so that you can continue with the launch template creation process. Warning: do not stop the instance manually! Note that you need a named AWS CLI profile called 'mxnet-ci-dev'; otherwise this operation will fail.
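Because the operation fails without the 'mxnet-ci-dev' profile, it can help to check for it up front. The following is a minimal sketch, not part of the original tooling: `has_profile` is a hypothetical helper, and it assumes an AWS CLI version that provides `aws configure list-profiles`.

```shell
# Hypothetical helper: returns 0 if the profile name given as $1
# appears as a whole line on stdin, non-zero otherwise.
has_profile() {
  grep -qxF -- "$1"
}

# Usage (assumes `aws configure list-profiles` is available):
#   aws configure list-profiles | has_profile mxnet-ci-dev \
#     || echo "Profile 'mxnet-ci-dev' is missing; run: aws configure --profile mxnet-ci-dev" >&2
```

The helper reads the profile list from stdin so that it can be tried out without AWS access.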

On Ubuntu, no additional steps are necessary after executing the create-slave shell script. Just create an AMI in the EC2 console after the instance has reached the Stopped state. Warning: do not stop the instance manually, as this leaves it in an inconsistent state that will be baked into the launch template.
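As an alternative to the EC2 console, the same step could be scripted with the AWS CLI. This is only a sketch under assumptions: `create_ami_when_stopped` is a hypothetical helper, the instance ID and AMI name are supplied by the caller, and the region is taken from the 'mxnet-ci-dev' profile.

```shell
# Hypothetical helper: waits until the instance has stopped on its own,
# then creates an AMI from it and prints the new AMI ID.
# $1: instance ID, $2: name for the new AMI (both chosen by the caller).
create_ami_when_stopped() {
  aws ec2 wait instance-stopped \
      --instance-ids "$1" --profile mxnet-ci-dev &&
  aws ec2 create-image \
      --instance-id "$1" --name "$2" --profile mxnet-ci-dev \
      --query ImageId --output text
}
```

The waiter only polls the instance state; it never stops the instance itself, which is in line with the warning above.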

Windows

On Windows, there is currently no process to set up a slave from scratch, and the above shell script is not applicable.

...

Configurations

Ubuntu CPU

AMI-ID: ID of the previously created AMI

Instance type: c5.18xlarge

Key-Pair-Name: mxnet_edge_berlin_shared_rsa

Network type: VPC

Network interfaces: -

Volumes: EBS / 400 GB / gp2 / Delete on termination: yes / Default IOPS

Security groups: TODO

IAM instance profile: TODO

Monitoring: Enable

Ubuntu GPU

AMI-ID: ID of the previously created AMI

Instance type: g3.8xlarge

Key-Pair-Name: mxnet_edge_berlin_shared_rsa

Network type: VPC

Network interfaces: -

Volumes: EBS / 2000 GB / gp2 / Delete on termination: yes / Default IOPS

Security groups: TODO

IAM instance profile: TODO

Monitoring: Enable

Ubuntu GPU P3

AMI-ID: ID of the previously created AMI

Instance type: p3.2xlarge

Key-Pair-Name: mxnet_edge_berlin_shared_rsa

Network type: VPC

Network interfaces: -

Volumes: EBS / 2000 GB / gp2 / Delete on termination: yes / Default IOPS

Security groups: TODO

IAM instance profile: TODO

Monitoring: Enable

Ubuntu GPU P3 8xlarge

AMI-ID: ID of the previously created AMI

Instance type: p3.8xlarge

Key-Pair-Name: mxnet_edge_berlin_shared_rsa

Network type: VPC

Network interfaces: -

Volumes: EBS / 2000 GB / gp2 / Delete on termination: yes / Default IOPS

Security groups: TODO

IAM instance profile: TODO

Monitoring: Enable

Windows CPU

AMI-ID: ID of the previously created AMI

Instance type: c5.18xlarge

Key-Pair-Name: mxnet_edge_berlin_shared_rsa

Network type: VPC

Network interfaces: -

Volumes: EBS / 500 GB / gp2 / Delete on termination: yes / Default IOPS

Security groups: TODO

IAM instance profile: TODO

Monitoring: Enable

Windows GPU

AMI-ID: ID of the previously created AMI

Instance type: g3.8xlarge

Key-Pair-Name: mxnet_edge_berlin_shared_rsa

Network type: VPC

Network interfaces: -

Volumes: EBS / 500 GB / gp2 / Delete on termination: yes / Default IOPS

Security groups: TODO

IAM instance profile: TODO

Monitoring: Enable
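Any of the configurations above could also be expressed as launch template data for the AWS CLI. The sketch below is illustrative only: `launch_template_data` is a hypothetical helper, the device name /dev/sda1 is an assumption (use the root device name of your AMI), and security groups and the IAM instance profile are left out because they are still marked TODO above.

```shell
# Hypothetical helper: prints launch template data as JSON.
# $1: AMI ID, $2: instance type, $3: EBS volume size in GB.
# Assumption: /dev/sda1 is the AMI's root device name.
launch_template_data() {
  cat <<EOF
{
  "ImageId": "$1",
  "InstanceType": "$2",
  "KeyName": "mxnet_edge_berlin_shared_rsa",
  "BlockDeviceMappings": [
    {
      "DeviceName": "/dev/sda1",
      "Ebs": {
        "VolumeSize": $3,
        "VolumeType": "gp2",
        "DeleteOnTermination": true
      }
    }
  ],
  "Monitoring": { "Enabled": true }
}
EOF
}

# Example for the Ubuntu CPU configuration
# (the AMI ID and the template name are placeholders):
#   aws ec2 create-launch-template \
#     --launch-template-name ubuntu-cpu-example \
#     --launch-template-data "$(launch_template_data ami-EXAMPLE c5.18xlarge 400)" \
#     --profile mxnet-ci-dev
```

Generating the JSON from three parameters keeps the six configurations consistent, since they differ only in AMI, instance type, and volume size.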

...

In order to manage a distributed Docker cache, we're leveraging Docker Hub.

Cache creation

To generate the cache, we're leveraging a Jenkins job that rebuilds the cache upon new commits to the master branch. To define which bucket is used for cache publishing and retrieval, set the following environment variables at Jenkins -> Manage Jenkins -> Configure System -> Global properties -> Environment variables. Create the variables as follows and insert the values from the secret created above:

Auto scaling

Auto scaling is done by a Lambda function, which is managed using the Serverless Framework.

...

npm install serverless

export PATH="$HOME/node_modules/.bin:$PATH"
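If the export line ends up in a shell profile, it is re-run in every new shell and PATH can accumulate duplicate entries. A small sketch of an idempotent variant follows; `prepend_path` is a hypothetical helper, and it assumes the package was installed under the home directory so that the binaries live in `$HOME/node_modules/.bin`.

```shell
# Hypothetical helper: prints $2 (a PATH-style list) with directory $1
# prepended, unless $1 is already one of its entries.
prepend_path() {
  case ":$2:" in
    *":$1:"*) printf '%s\n' "$2" ;;
    *)        printf '%s\n' "$1:$2" ;;
  esac
}

# Usage:
#   PATH="$(prepend_path "$HOME/node_modules/.bin" "$PATH")"; export PATH
#   command -v serverless >/dev/null || echo "serverless not found on PATH" >&2
```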

...

Role permission

Overall:
- Read
Agent:
- Configure
- Connect
- Create
- Delete
- Disconnect
- Provision
Job:
- Discover
- Read

...

After creating the role, assign it to the user created above by going to Jenkins -> Manage and Assign Roles -> Assign Roles. Enter the GitHub handle at 'User/group to add' and press 'Add'. Attention: this name is case-sensitive! Afterwards, assign the autoscaling role to the newly added entry.

