
Commit 48a3d15

Sean Smith authored and mhuguesaws committed
Lab 2 changes
Signed-off-by: Sean Smith <[email protected]>
1 parent c539909 commit 48a3d15

21 files changed: +128 −131 lines
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
++++
+title = "a. Update Lambda Permissions"
+date = 2019-09-18T10:46:30-04:00
+weight = 20
+tags = ["tutorial", "update", "ParallelCluster"]
++++
+
+Before we get started, we need to add more permissions to **pcluster-manager**.
+
+#### Modify the Lambda Function
+
+1. Go to the [Lambda Console (deeplink)](https://eu-west-1.console.aws.amazon.com/lambda/home?region=eu-west-1#/functions?f0=true&fo=and&k0=functionName&n0=false&o0=%3A&op=and&v0=ParallelClusterFunction) and search for `ParallelClusterFunction`.
+2. Select the function, then choose `Configuration` > `Permissions` and click the role under `Role name`.
+
+![Attach Policies](/images/container-pc/lambda-permissions.jpeg)
+
+3. Select the `AWSXRayDaemonWriteAccess` policy and remove it.
+4. Select `Add permissions` > `Attach policies`.
+
+![Attach Policies](/images/container-pc/attach-policies.jpeg)
+
+5. Search for `AdministratorAccess` > click `Attach policies`.
+
+![Attach Policies](/images/container-pc/attach-admin.png)
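
For reference, the console steps above can also be scripted with the AWS CLI. This is a hedged sketch and not part of the commit: the function name below is a placeholder for the full `ParallelClusterFunction` name shown in the Lambda console.

```bash
# Hedged sketch: CLI equivalent of the console steps above.
# FUNCTION_NAME is a placeholder -- use the full ParallelClusterFunction name from the console.
FUNCTION_NAME=ParallelClusterFunction-example
ROLE_NAME=$(aws lambda get-function-configuration --function-name "$FUNCTION_NAME" \
  --query Role --output text | awk -F/ '{print $NF}')   # role name is the last segment of the role ARN

# Swap the X-Ray policy for AdministratorAccess on the function's role
aws iam detach-role-policy --role-name "$ROLE_NAME" \
  --policy-arn arn:aws:iam::aws:policy/AWSXRayDaemonWriteAccess
aws iam attach-role-policy --role-name "$ROLE_NAME" \
  --policy-arn arn:aws:iam::aws:policy/AdministratorAccess
```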

content/04-container-parallelcluster/02-update-PC.md

Lines changed: 55 additions & 89 deletions
@@ -1,128 +1,94 @@
 +++
-title = "a. Update your cluster"
+title = "b. Update your cluster"
 date = 2019-09-18T10:46:30-04:00
 weight = 30
 tags = ["tutorial", "update", "ParallelCluster"]
 +++

-In this section, you will update the configuration of the HPC cluster you created in Lab I to:
-- Create a post-install script to install Docker and Singularity.
+In this section, you will update the configuration of the HPC cluster you created in [Lab I](03-hpc-aws-parallelcluster-workshop.html) to:
+- Add a script to install Docker and Singularity.
 - Provide access to the container registry, [Amazon Elastic Container Registry (ECR)](https://aws.amazon.com/ecr/).
 - Create a new queue that will be used to run the containerized workload.
 - Update the configuration of the HPC Cluster.

-{{% notice warning %}}
-The following commands must be executed on the AWS Cloud9 environment created at the beginning of the tutorial.
-You can find the AWS Cloud9 environment by opening the [AWS Cloud9 console](https://console.aws.amazon.com/cloud9) and choose **Open IDE**
-{{% /notice %}}
+#### 1. Edit Cluster

-#### Preliminary
+Click the **Edit** button in Pcluster Manager.

-Starting with version 3.x, AWS ParallelCluster uses configuration file in `yaml` format.
-For the following steps, you will use an utility to manipulate `yaml` files, named [yq](https://github.com/mikefarah/yq).
-That will make the editing easier and more reproductible.
+![Edit button](/images/container-pc/edit.png)

-In the Cloud 9 Terminal, copy and paste the command below to install `yq`:
+#### 2. HeadNode

-```bash
-YQ_VERSION=4.21.1
-sudo wget https://github.com/mikefarah/yq/releases/download/v${YQ_VERSION}/yq_linux_amd64 -O /usr/bin/yq && sudo chmod +x /usr/bin/yq
-```
+Leave the first screen as is and advance to the **HeadNode** tab.

-#### 1. Add a compute queue with a different instance type for running the container
+On the **HeadNode** tab, add permission to access the [Amazon Elastic Container Registry (ECR)](https://aws.amazon.com/ecr/) by adding the managed `AmazonEC2ContainerRegistryFullAccess` [AWS IAM](https://aws.amazon.com/iam/) policy.

-In this step, you will add a new compute queue that use **c5.xlarge** EC2 instances.
+1. Expand **Advanced options**
+2. Expand **IAM Policies**
+3. Add the policy `arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryFullAccess` and click **Add**.

-Let create a new queue named __c5xlarge__:
-```bash
-PARALLELCLUSTER_CONFIG=~/environment/my-cluster-config.yaml
-yq -i '.Scheduling.SlurmQueues[1].Name = "c5xlarge"' ${PARALLELCLUSTER_CONFIG}
-```
-
-Let create a new compute resources named __c5xlarge__:
-```bash
-yq -i '.Scheduling.SlurmQueues[1].ComputeResources[0].Name = "c5xlarge"' ${PARALLELCLUSTER_CONFIG}
-yq -i '.Scheduling.SlurmQueues[1].ComputeResources[0].InstanceType = "c5.xlarge"' ${PARALLELCLUSTER_CONFIG}
-yq -i '.Scheduling.SlurmQueues[1].ComputeResources[0].MinCount = 0' ${PARALLELCLUSTER_CONFIG}
-yq -i '.Scheduling.SlurmQueues[1].ComputeResources[0].MaxCount = 8' ${PARALLELCLUSTER_CONFIG}
-yq -i '.Scheduling.SlurmQueues[1].Networking.SubnetIds[0] = strenv(SUBNET_ID)' ${PARALLELCLUSTER_CONFIG}
-yq -i '.Scheduling.SlurmQueues[1].ComputeSettings.LocalStorage.RootVolume.Size = 50' ${PARALLELCLUSTER_CONFIG}
-```
-#### 2. Access to the container registry
+![HeadNode IAM](/images/container-pc/headnode-iam.png)

-In this step, you will add permission to the HPC cluster configuration file to access the [Amazon Elastic Container Registry (ECR)](https://aws.amazon.com/ecr/) by adding the managed `AmazonEC2ContainerRegistryFullAccess` [AWS IAM](https://aws.amazon.com/iam/) policy.
+#### 3. Queue Configuration

-```bash
-yq -i '.HeadNode.Iam.AdditionalIamPolicies[1].Policy = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryFullAccess"' ${PARALLELCLUSTER_CONFIG}
-yq -i '.Scheduling.SlurmQueues[1].Iam.AdditionalIamPolicies[0].Policy = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryFullAccess"' ${PARALLELCLUSTER_CONFIG}
-```
+Click **Next** twice to advance to the **Queues** section. Here we're going to add a queue that has Docker and Singularity installed on the compute nodes.

-#### 3. Create a post-install script
+1. Choose **Add Queue**
+2. Set the **Subnet** to the same subnet as the first queue (queue1)
+3. Set the **Dynamic Nodes** to `8`
+4. Set the **Instance Type** to `c5.xlarge`

-In this step, you will create a post-install script that installs Docker and Singularity on the compute nodes.
+![Queue Edit](/images/container-pc/queue-edit.png)

-```bash
-cat > ~/environment/post_install.sh << EOF
-# Install Docker
-sudo amazon-linux-extras install -y docker
-sudo usermod -a -G docker ec2-user
-sudo systemctl start docker
-sudo systemctl enable docker
-
-# Install Singularity
-sudo yum install -y singularity
-EOF
-```
+Next, add a script that installs Docker and Singularity on the compute nodes.

-For your `post-install.sh` script to be use by the HPC Cluster, you will need to create [Amazon S3](https://aws.amazon.com/s3/) bucket and copy the `post-install.sh` script to the bucket.
+1. Expand **Advanced Options** on the queue you just created
+2. Paste the following URL into the **On Configured** section: `https://github.com/aws-samples/aws-hpc-tutorials/blob/isc22/static/scripts/post-install/container-install.sh`
+3. Expand **IAM Policies**, paste in the policy `arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryFullAccess`, and click **Add**.

-```bash
-BUCKET_POSTFIX=$(python3 -S -c "import uuid; print(str(uuid.uuid4().hex)[:10])")
-export BUCKET_NAME_POSTINSTALL="parallelcluster-isc22-postinstall-${BUCKET_POSTFIX}"
+![Advanced Options](/images/container-pc/queue-iam.png)

-aws s3 mb s3://${BUCKET_NAME_POSTINSTALL} --region ${AWS_REGION}
-aws s3 cp ~/environment/post_install.sh s3://${BUCKET_NAME_POSTINSTALL}/
-```
+#### 4. Increase RootVolume Size of your cluster

-Now, you can add access to the `BUCKET_NAME_POSTINSTALL` bucket and specify the post install script path in the HPC cluster configuration file
+In the cluster's config, add the following snippet at the bottom of the `queue1` section, `line 56`:

-```bash
-export BUCKET_NAME_POSTINSTALL_PATH="s3://${BUCKET_NAME_POSTINSTALL}/post_install.sh"
-yq -i '.HeadNode.Iam.S3Access[0].BucketName = strenv(BUCKET_NAME_POSTINSTALL)' ${PARALLELCLUSTER_CONFIG}
-yq -i '.Scheduling.SlurmQueues[1].Iam.S3Access[0].BucketName = strenv(BUCKET_NAME_POSTINSTALL)' ${PARALLELCLUSTER_CONFIG}
-yq -i '.Scheduling.SlurmQueues[1].CustomActions.OnNodeConfigured.Script= strenv(BUCKET_NAME_POSTINSTALL_PATH)' ${PARALLELCLUSTER_CONFIG}
+```yaml
+ComputeSettings:
+  LocalStorage:
+    RootVolume:
+      Size: 50
 ```

-#### 4. Update your HPC Cluster
+![Add in LocalStorage](/images/container-pc/localstorage-edit.png)

-In this step, you will update your HPC cluster with the configuration changes made in the previous steps.
+#### 5. Update your HPC Cluster

-Prior to an update, the cluster should be a stopped state.
+On the next screen, confirm the cluster configuration and update the cluster.

-```bash
-pcluster update-compute-fleet -n hpc-cluster-lab --status STOP_REQUESTED -r $AWS_REGION
-```
+1. Click **Stop Compute Fleet** and confirm. This will take a minute to complete; wait until the fleet is stopped before running the update.
+2. Click **Dryrun** to validate the cluster configuration. You'll see three warnings that you can safely ignore.
+3. Run **Update**
+
+![Update Cluster](/images/container-pc/update-cluster.png)

-Before proceeding to the cluster update, you can check the content of the configuration file that should look like this:
+Once we've run the update, we'll be redirected to the main pcluster console screen where we can view the update progress.

-`cat ~/environment/my-cluster-config.yaml`
+If the update doesn't succeed, check that the contents of the cluster configuration file look similar to the example below. If anything is missing, review the steps above.

 ```yaml
 HeadNode:
   InstanceType: m5.2xlarge
   Ssh:
-    KeyName: ${SSH_KEY_NAME}
+    KeyName: hpc-lab-key
   Networking:
-    SubnetId: ${SUBNET_ID}
+    SubnetId: subnet-123456789
   LocalStorage:
     RootVolume:
       Size: 50
   Iam:
     AdditionalIamPolicies:
       - Policy: arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
       - Policy: arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryFullAccess
-    S3Access:
-      - BucketName: ${BUCKET_NAME_POSTINSTALL}
   Dcv:
     Enabled: true
   Imds:
@@ -139,40 +105,40 @@ Scheduling:
           DisableSimultaneousMultithreading: true
           Efa:
             Enabled: true
+            GdrSupport: true
       Networking:
         SubnetIds:
-          - ${SUBNET_ID}
+          - subnet-123456789
         PlacementGroup:
           Enabled: true
       ComputeSettings:
         LocalStorage:
           RootVolume:
             Size: 50
-    - Name: c5xlarge
+    - Name: queue1
       ComputeResources:
-        - Name: c5xlarge
-          InstanceType: c5.xlarge
+        - Name: queue1-c5xlarge
           MinCount: 0
           MaxCount: 8
+          InstanceType: c5.xlarge
       Networking:
         SubnetIds:
-          - ${SUBNET_ID}
+          - subnet-123456789
+      CustomActions:
+        OnNodeConfigured:
+          Script: >-
+            https://github.com/aws-samples/aws-hpc-tutorials/blob/isc22/static/scripts/post-install/container-install.sh
       Iam:
         AdditionalIamPolicies:
           - Policy: arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryFullAccess
-        S3Access:
-          - BucketName: ${BUCKET_NAME_POSTINSTALL}
-      CustomActions:
-        OnNodeConfigured:
-          Script: s3://${BUCKET_NAME_POSTINSTALL}/post_install.sh
       ComputeSettings:
         LocalStorage:
           RootVolume:
             Size: 50
 Region: eu-west-1
 Image:
   Os: alinux2
-  CustomAmi: ${CUSTOM_AMI}
+  CustomAmi: ami-0975de9b755cc2d78
 SharedStorage:
   - Name: Ebs0
     StorageType: Ebs
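
For reference, the **Stop Compute Fleet** → **Dryrun** → **Update** flow added above maps roughly onto the ParallelCluster CLI commands this commit removes from the text. A hedged sketch, assuming the cluster from Lab I is still named `hpc-cluster-lab` and the edited configuration is saved locally as `my-cluster-config.yaml`:

```bash
# Hedged sketch: rough CLI equivalent of the Stop Compute Fleet / Dryrun / Update buttons.
pcluster update-compute-fleet -n hpc-cluster-lab --status STOP_REQUESTED -r $AWS_REGION

# Validate the new configuration without applying it (a few ignorable warnings are expected)
pcluster update-cluster -n hpc-cluster-lab -c my-cluster-config.yaml --dryrun true -r $AWS_REGION

# Apply the update, then restart the compute fleet once the update completes
pcluster update-cluster -n hpc-cluster-lab -c my-cluster-config.yaml -r $AWS_REGION
pcluster update-compute-fleet -n hpc-cluster-lab --status START_REQUESTED -r $AWS_REGION
```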

content/04-container-parallelcluster/03-create-repository.md

Lines changed: 8 additions & 2 deletions
@@ -1,15 +1,21 @@
 +++
-title = "b. Create container repository"
+title = "c. Create container repository"
 date = 2019-09-18T10:46:30-04:00
 weight = 40
 tags = ["tutorial", "container", "repository"]
 +++

 In this section, you will create a container repository on Amazon ECR and create a Docker container image.

+#### Start Cluster
+
+After the update completes, be sure to start the cluster.
+
+![Start Cluster](/images/container-pc/start-cluster.png)
+
 #### Preliminary

-Connect to the Head node via DCV, following instructions from part **[h. Connect to the Cluster](/03-hpc-aws-parallelcluster-workshop/09-connect-cluster.html#dcv-connect)**
+From the pcluster manager console, connect to the cluster via [h. Connect to the Cluster](/03-hpc-aws-parallelcluster-workshop/09-connect-cluster.html#optional-ssm-connect).

 Since the HPC Cluster existed prior to `post-install` script, you will need to manually install Docker and Singularity on the head node of the HPC Cluster.

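The manual head-node install mentioned above can be sketched from the `post_install.sh` contents removed earlier in this commit; this assumes an Amazon Linux 2 head node and is shown only as a reference, not as part of the commit:

```bash
# Hedged sketch: manually install Docker and Singularity on the head node
# (Amazon Linux 2 assumed, mirroring the removed post_install.sh used for compute nodes).
sudo amazon-linux-extras install -y docker
sudo usermod -a -G docker ec2-user
sudo systemctl enable --now docker

sudo yum install -y singularity
```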

content/04-container-parallelcluster/04-launch-nextflow.md

Lines changed: 2 additions & 32 deletions
@@ -1,5 +1,5 @@
 +++
-title = "c. Run nextflow container"
+title = "d. Run nextflow container"
 date = 2019-09-18T10:46:30-04:00
 weight = 50
 tags = ["tutorial", "initialize", "ParallelCluster"]
@@ -74,7 +74,7 @@ cat > nextflow_sub.sh << EOF
 #!/bin/bash

 #SBATCH --job-name=nextflow
-#SBATCH --partition=c5xlarge
+#SBATCH --partition=queue1
 #SBATCH --output=%x_%j.out
 #SBATCH --error=%x_%j.err
 #SBATCH --ntasks=1
@@ -96,33 +96,3 @@ The output of the job will be in the `nextflow_[SLURM_JOB_ID].out` file and simi

 You have now run a basic genomics pipeline and you won't need the cluster in the next labs.
 The next section will go over how to delete your HPC Cluster.
-
-
-<!-- ```bash
-cat > Dockerfile << EOF
-FROM nextflow/rnaseq-nf
-
-ENV DEBIAN_FRONTEND=noninteractive
-RUN apt-get --allow-releaseinfo-change update && apt-get update -y && apt-get install -y git python3-pip curl jq
-
-RUN curl -s https://get.nextflow.io | bash \
-    && mv nextflow /usr/local/bin/
-
-RUN pip3 install --upgrade awscli
-EOF
-```
-
-
-```bash
-cat > Dockerfile << EOF
-FROM public.ecr.aws/amazoncorretto/amazoncorretto:8
-
-RUN yum install -y python3
-
-RUN curl -O https://repo.anaconda.com/miniconda/Miniconda2-4.7.12-Linux-x86_64.sh
-
-RUN bash ./Miniconda2-4.7.12-Linux-x86_64.sh -b -p /opt/conda
-RUN ln -s /opt/conda/etc/profile.d/conda.sh /etc/profile.d/conda.sh && echo ". /opt/conda/etc/profile.d/conda.sh" >> ~/.bashrc
-RUN curl -O https://raw.githubusercontent.com/nextflow-io/rnaseq-nf/master/conda.yml && source ~/.bashrc && conda env update -n root -f conda.yml
-EOF
-``` -->
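
Once the job script above targets the renamed `queue1` partition, a typical submit-and-check sequence from the head node (standard Slurm commands, shown here only as a usage sketch) is:

```bash
# Sketch: submit the Nextflow job to the new queue1 partition and check on it
sbatch nextflow_sub.sh     # prints "Submitted batch job <id>"
squeue                     # the job stays pending while a c5.xlarge node spins up
cat nextflow_*.out         # output file name follows the --output=%x_%j.out pattern
```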
Lines changed: 6 additions & 4 deletions
@@ -1,14 +1,16 @@
 +++
-title = "d. Terminate Your Cluster"
+title = "e. Terminate Your Cluster"
 date = 2019-09-18T10:46:30-04:00
 weight = 60
 tags = ["tutorial", "delete", "ParallelCluster"]
 +++

 Now that you are done with your HPC cluster, you can delete it.

-On Pcluster manager, let's delete the cluster by selecting delete:
+From the **pcluster manager** console, choose **Delete** and confirm to start the cluster deletion.

-![ParallelCluster Delete](/images/container-pc/pcluster_manager_delete.png)
+![Delete Cluster](/images/container-pc/delete.png)

-The cluster and all its resources will be deleted.
+The cluster and all its resources will be deleted by AWS CloudFormation. You can check the status on the **Stack Events** tab.
+
+![Delete Cluster](/images/container-pc/delete-stack-events.png)
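
If you prefer the CLI over the console, a rough equivalent of the delete step above (assuming the cluster is named `hpc-cluster-lab` in `eu-west-1`, as in Lab I) is:

```bash
# Hedged sketch: CLI alternative to the console delete
pcluster delete-cluster --cluster-name hpc-cluster-lab --region eu-west-1

# Poll until the cluster and its CloudFormation stack are gone
pcluster describe-cluster --cluster-name hpc-cluster-lab --region eu-west-1
```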

content/04-container-parallelcluster/_index.md

Lines changed: 2 additions & 4 deletions
@@ -1,14 +1,12 @@
 ---
-title: "Container on AWS ParallelCluster"
+title: "Containers on AWS ParallelCluster"
 date: 2019-01-24T09:05:54Z
 weight: 40
 pre: "<b>Lab II ⁃ </b>"
 tags: ["HPC", "Overview"]
 ---

-
-{{% notice info %}}This lab requires an AWS Cloud9 IDE. If you do not have an AWS Cloud9 IDE set up, complete sections *a. Sign in to the Console* through *d. Work with the AWS CLI* in the **[Getting Started in the Cloud](/02-aws-getting-started.html)** workshop.
-{{% /notice %}}
+![ECS Logo](/images/container-pc/ecs-logo.png)

 HPC Applications typically rely on several libraries and software components along with complex dependencies.
 Those applications tend to be deployed on a shared file system for on-premise HPC system.
4 binary files changed (479 KB, 164 KB, 154 KB, 48.1 KB); previews not shown.
