The Beginner's Guide to AWS CDK - Python Version


AWS CDK (Cloud Development Kit) is a software development framework that lets you define cloud resources using your favorite programming language. At the time of this writing, AWS CDK supports TypeScript, JavaScript, Python, C# and Java.

In this tutorial, we're going to use Python. If you prefer TypeScript, you can refer to the CDK with TypeScript tutorial here.

Why AWS CDK?

There are a few options for provisioning AWS cloud resources.

Manually creating resources using AWS console:

Yes, you can use the AWS console to create AWS resources. But it is highly recommended to use an Infrastructure as Code (IaC) solution instead. Infrastructure as Code simply means keeping the desired state of your infrastructure in the form of code.

If you're creating resources manually, below are some of the problems that you might face:

  • Error-prone: Creating resources by hand invites mistakes. Over time, it is very likely that someone will misconfigure something.
  • Lack of review: As your infrastructure is not expressed as code, it is difficult for your peers to review.
  • Not repeatable: For example, if you want to create a similar set of resources 5 to 10 times in different regions, you have to repeat the same task again and again.

CloudFormation:

CloudFormation is one of the IaC tools available in the AWS ecosystem. You define your resources in YAML or JSON format. It is well supported and easy to understand.
There are a few problems with CloudFormation:

  • Lengthy: Even though CloudFormation is easy to understand, the files that we have to write get long. For example, for a serverless web application you might have to write hundreds of lines of YAML.
  • Explicitness: You might have to specify all parameters explicitly to create an AWS resource. Sometimes, it is better to have sensible defaults.
  • Not repeatable: If you want to create a similar set of resources with different parameters, you might have to do a lot of copy-paste in your YAML file and change the parameter values. This is error-prone, as it is easy to forget to update the value of a particular parameter.

AWS CDK:

With AWS CDK, you define your cloud resources using your favorite programming language. The problems that we encountered in CloudFormation are resolved in AWS CDK: you can create your infrastructure with fewer lines of code, and AWS CDK fills in sensible defaults so that you don't need to worry about every nitty-gritty detail.
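To make the "not repeatable" point concrete, here is a minimal sketch in plain Python (no CDK imports) that builds a CloudFormation-shaped template for several queues with a loop instead of copy-paste. The queue names and timeouts are made up for illustration. In a real CDK app you would create the constructs inside the loop and let CDK generate the template for you.

```python
import json

# Hypothetical queue settings: logical ID -> visibility timeout in seconds.
QUEUE_TIMEOUTS = {"OrdersQueue": 300, "EmailsQueue": 60, "ReportsQueue": 900}


def build_template(queue_timeouts):
    """Return a CloudFormation-style dict with one SQS queue per entry."""
    resources = {}
    for logical_id, timeout in queue_timeouts.items():
        resources[logical_id] = {
            "Type": "AWS::SQS::Queue",
            "Properties": {"VisibilityTimeout": timeout},
        }
    return {"Resources": resources}


template = build_template(QUEUE_TIMEOUTS)
print(json.dumps(template, indent=2))
```

Adding a fourth queue is a one-line change to the dictionary, whereas in hand-written YAML it would be another copied block to keep in sync.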

Note: We've restricted our discussion to IaC tools available natively within AWS. There are a few other tools available outside of AWS, such as Terraform and Pulumi, but they are outside the scope of this article.

How AWS CDK works:


When you write your AWS CDK app and deploy your stack, AWS CDK ultimately converts your code into CloudFormation templates. So, indirectly, you're using CloudFormation even when using AWS CDK. AWS CDK just makes it easier for you to develop and maintain CloudFormation stacks.

How AWS CDK works

Enough theory. Let us create our first CDK application.

Creating your first AWS CDK app:

First, you need to install the AWS CDK Toolkit. The AWS CDK Toolkit is a CLI (Command Line Interface) tool to build and deploy CDK code. It is built with TypeScript and published in the npm registry. Yes, even though you use Python to build CDK apps, you still need to have Node.js installed.

npm install -g aws-cdk
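Once the install finishes, a quick way to confirm that the required tools are reachable from your shell is to look them up on your PATH. This is only a small convenience sketch using the Python standard library; the function name find_tools is my own.

```python
import shutil


def find_tools(names=("node", "npm", "cdk")):
    """Map each command name to its location on PATH, or None if it is missing."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name}: {path or 'not found'}")
```

If cdk shows up as not found, the toolkit is either not installed or its install location is not on your PATH.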

Then, create an empty directory and execute the cdk init command from your terminal inside that directory:

mkdir cdk-python
cd cdk-python
cdk init app --language=python

Before discussing the project structure and the significance of each of the created files, let us deploy the predefined stack and see how it works.

There is a predefined stack in the cdk_python directory. The name of the file follows the convention <your-directory-name>_stack.py, with hyphens replaced by underscores. As I've created the CDK app from the cdk-python directory, the file is cdk_python/cdk_python_stack.py

Below are the contents of the file.
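As an aside, the naming convention for the generated file and class can be sketched in a few lines of Python. This only approximates the convention for simple directory names like ours; the toolkit's exact rules for unusual names may differ, and the helper stack_names is hypothetical.

```python
import re


def stack_names(directory_name):
    """Derive the module, stack file path and class name from a directory name."""
    words = [w for w in re.split(r"[^A-Za-z0-9]+", directory_name) if w]
    module = "_".join(w.lower() for w in words)
    class_name = "".join(w.capitalize() for w in words) + "Stack"
    return {
        "module": module,
        "stack_file": f"{module}/{module}_stack.py",
        "class_name": class_name,
    }


print(stack_names("cdk-python"))
```

For the cdk-python directory this gives the cdk_python module and the CdkPythonStack class that appear in the generated file.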

from aws_cdk import (
    # Duration,
    Stack,
    # aws_sqs as sqs,
)
from constructs import Construct

class CdkPythonStack(Stack):

    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # The code that defines your stack goes here

        # example resource
        # queue = sqs.Queue(
        #     self, "CdkPythonQueue",
        #     visibility_timeout=Duration.seconds(300),
        # )

Now, I'm going to uncomment the queue resource creation statement and its associated imports. The objective of this code is to create a queue using SQS. The updated code looks like below.


from aws_cdk import (
    Duration,
    Stack,
    aws_sqs as sqs, 
)
from constructs import Construct


class CdkPythonStack(Stack):

    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # The code that defines your stack goes here

        # example resource
        
        queue = sqs.Queue(
            self, "CdkPythonQueue",
            visibility_timeout=Duration.seconds(300),
        )


Now, I want to deploy the code. So, I'm executing the below command from the terminal in VS Code:

cdk deploy

When I try to run the above command, I get the error message below:

CDK bootstrapping error

In case the above message is not clear: we've got a bootstrapping error. But what is bootstrapping?

Bootstrapping:

Earlier, we mentioned that our CDK app is converted into CloudFormation templates and that those templates are what actually gets deployed. The generated templates need to be stored somewhere, and that place is an S3 bucket. So the bucket has to exist before you deploy your CDK app. It is not only about the S3 bucket - we also need roles with the necessary permissions to put those files there. In essence, some initial resources must be created before we deploy our CDK app.

Bootstrapping is the process of creating those initial resources. When you deploy a CDK app, you're not only deploying the infrastructure; you may need to deploy the actual application too. For example, if you're building a containerized application, you might want CDK to build the Docker image. So these assets (files that are required to run the application) have to be stored somewhere. AWS CDK has two main types of assets: files stored in S3 and Docker images.

You can execute the below command to bootstrap your environment:

cdk bootstrap
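Bootstrapping applies to one environment, that is, one account and region pair. If you plan to deploy to several regions, the toolkit also accepts an explicit environment in the form aws://ACCOUNT/REGION. Here's a small sketch that builds those commands. The account ID is a placeholder, the regions are just examples, and bootstrap_command is a helper name I made up.

```python
def bootstrap_command(account_id, region):
    """Build the argument list for bootstrapping one account/region pair."""
    if not (account_id.isdigit() and len(account_id) == 12):
        raise ValueError("AWS account IDs are 12 digits")
    return ["cdk", "bootstrap", f"aws://{account_id}/{region}"]


# Placeholder account ID and example regions, not real values.
for region in ("us-east-1", "eu-west-1"):
    print(" ".join(bootstrap_command("123456789012", region)))
```

You would run each printed command once per environment before the first deployment to it.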

When you execute the above command, the environment is bootstrapped. When you log in to the AWS console, you can see that an empty bucket has been created in S3.

AWS CDK bootstrap bucket

When you open the CloudFormation service, you will see that a stack named CDKToolkit has been created with the necessary resources such as roles, policies for those roles, a bucket, a policy for that bucket, etc. Additional resources may be created during the bootstrap process, which we'll discuss later in this article.

Once the environment is bootstrapped, we can execute the below command in the terminal to deploy our CDK app:

cdk deploy

It might take some time, but the app will be deployed. Once the app is deployed, you can log in to the AWS console to see the resources created. A new stack has been created, and when you click the Resources tab, you can see that an SQS queue has been created.

And when you visit the S3 service, you can see that a CloudFormation template in JSON format has been uploaded to the bucket that was created during bootstrapping.

When you download this file, you can see the CloudFormation template that was used to create the stack.

Constructs:

Constructs are the building blocks of your CDK application. In fact, we've already used a construct to create the SQS queue.

queue = sqs.Queue(
    self, "CdkPythonQueue",
    visibility_timeout=Duration.seconds(300),
)

If you have any experience with object-oriented programming, you can think of a construct as a class. Just as you can use an instance of one class as a property of another class, you can combine constructs to build more complex constructs.
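To illustrate the idea, here is a toy model in plain Python of how constructs nest. It is not the CDK API: ToyConstruct is a made-up class that only mimics the pattern of passing a scope (the parent) and an id to every construct, which is the same pattern you see in sqs.Queue(self, "CdkPythonQueue", ...).

```python
class ToyConstruct:
    """A stand-in for a construct: it has a scope (parent), an id and children."""

    def __init__(self, scope, construct_id):
        self.scope = scope
        self.construct_id = construct_id
        self.children = []
        if scope is not None:
            scope.children.append(self)

    @property
    def path(self):
        """Ids from the outermost construct down to this one, joined by '/'."""
        if self.scope is None or not self.scope.path:
            return self.construct_id
        return f"{self.scope.path}/{self.construct_id}"


app = ToyConstruct(None, "")                 # the root, playing the role of an App
stack = ToyConstruct(app, "CdkPythonStack")  # a stack inside the app
queue = ToyConstruct(stack, "CdkPythonQueue")  # a resource inside the stack

print(queue.path)
```

The printed path, CdkPythonStack/CdkPythonQueue, shows why each construct needs an id that is unique within its scope: the scope and id together locate it in the tree.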

Levels of Constructs:

There are 3 levels of constructs - Level 1, Level 2 and Level 3.

Level 1 construct:

A Level 1 construct represents a CloudFormation resource directly, with a one-to-one mapping between the Level 1 construct and the CloudFormation resource. For example, CfnBucket is a Level 1 construct. You usually don't use Level 1 constructs often, as Level 2 or Level 3 constructs cover most needs.

Level 2 construct:

These are the most commonly used constructs in CDK. They wrap Level 1 constructs with sensible defaults and a friendlier API. For example, sqs.Queue, which we used earlier, is a Level 2 construct. s3.Bucket is another Level 2 construct.

bucket = s3.Bucket(self, "AwsCdkTutorialBucket",
                   bucket_name="aws-cdk-tutorial-bucket")

Level 3 construct:

Several Level 2 constructs are grouped together to form a Level 3 construct. For example, you can create a Level 3 construct for a simple website, consisting of a bucket Level 2 construct and a CloudFront distribution Level 2 construct. Level 3 constructs are mostly used to represent commonly used patterns.

AWS CDK Construct Levels

Above is a pictorial representation of the relationship between the different levels of constructs.

Stack construct:

The Stack construct represents a CloudFormation stack. When you open the file cdk_python/cdk_python_stack.py, you can see the below stack construct:

class CdkPythonStack(Stack):

    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # The code that defines your stack goes here

        # example resource
        queue = sqs.Queue(
            self, "CdkPythonQueue",
            visibility_timeout=Duration.seconds(300),
        )

App construct:

The App construct is a special type of construct that has no direct mapping to CloudFormation. It exists only in the realm of CDK and acts as the root that contains your stacks.

Project structure:

When you create a Python CDK project, the generated project folder structure looks like below. Let us discuss the significance of each folder and file.

app.py :

This is where your CDK app resides by default. A project should contain at least one CDK app. A stack is created within the scope of an App or within the scope of another stack.

<your-folder-name> folder:

This is where our stacks and custom constructs are defined. For better readability, it is recommended to have a separate file for each stack and construct that you define.

tests:

This is where your test cases reside.

.gitignore:

As with any git project, you can list the files and folders to be ignored when committing. For a Python project, this is where you list entries such as the .venv folder, __pycache__ and cdk.out.

cdk.json:

This is the entry point of our application. The CDK Toolkit looks for the app property in cdk.json and executes the command it contains (python app.py). It is completely customizable - you can pass additional flags in this command as per your needs.

 "app": "python app.py",
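If you're curious how a tool could read this setting, here is a rough sketch using only the standard library. It parses a trimmed-down cdk.json (a generated file contains more keys than this) and splits the app command into a program and its arguments. This is just an illustration, not how the toolkit itself is implemented.

```python
import json
import shlex

# A trimmed-down cdk.json; a generated file contains more keys than this.
CDK_JSON_TEXT = '{"app": "python app.py"}'


def app_command(cdk_json_text):
    """Return the configured app command split into program and arguments."""
    config = json.loads(cdk_json_text)
    return shlex.split(config["app"])


print(app_command(CDK_JSON_TEXT))
```

Any extra flags you append to the app value would simply show up as further items in the returned list.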

CDK CLI commands:

Let us discuss some of the most commonly used CDK CLI commands that you can execute in your terminal.

cdk synth :

This command synthesizes the stacks in your app and prints the CloudFormation template that is going to be deployed to AWS. When I run the command in the CDK app created earlier, I get the output shown below. The picture shows only a segment of the generated CloudFormation template, as the complete template is longer.

AWS CDK synth output

It is recommended to use cdk synth before you deploy your code.
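Besides printing the template, cdk synth also writes it to the cdk.out folder as <StackName>.template.json, so you can inspect it with a few lines of Python. The sketch below lists each resource and its type. To keep it runnable anywhere, it writes a hand-made sample file to a temporary folder first; real logical IDs also carry a generated hash suffix that the sample leaves out.

```python
import json
import tempfile
from pathlib import Path


def list_resources(template_path):
    """Return {logical_id: resource_type} for a CloudFormation template file."""
    template = json.loads(Path(template_path).read_text())
    return {
        logical_id: resource["Type"]
        for logical_id, resource in template.get("Resources", {}).items()
    }


# Stand-in for a synthesized file so the sketch runs anywhere; with a real
# project you would point at cdk.out/CdkPythonStack.template.json instead.
sample = {"Resources": {"CdkPythonQueue": {"Type": "AWS::SQS::Queue"}}}
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "CdkPythonStack.template.json"
    path.write_text(json.dumps(sample))
    resources = list_resources(path)

print(resources)
```

A quick listing like this is handy for spotting resources that a construct added on your behalf.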

cdk diff:

This command shows the changes since the last deployment. For example, if I add an S3 bucket construct to our stack and execute cdk diff, the output lists the new bucket as a resource to be added.
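The real output comes from the toolkit itself. As a rough mental model of what it computes, the toy function below compares two template dictionaries by their logical IDs. It is a simplification for illustration only, not the toolkit's actual algorithm, and both sample templates are hand-written with simplified IDs.

```python
def diff_logical_ids(deployed, proposed):
    """Compare two template dicts by the logical IDs under 'Resources'."""
    old = deployed.get("Resources", {})
    new = proposed.get("Resources", {})
    return {
        "added": sorted(set(new) - set(old)),
        "removed": sorted(set(old) - set(new)),
        "changed": sorted(k for k in set(old) & set(new) if old[k] != new[k]),
    }


# Hand-written sample templates with simplified logical IDs.
deployed = {"Resources": {"CdkPythonQueue": {"Type": "AWS::SQS::Queue"}}}
proposed = {
    "Resources": {
        "CdkPythonQueue": {"Type": "AWS::SQS::Queue"},
        "AwsCdkTutorialBucket": {"Type": "AWS::S3::Bucket"},
    }
}

print(diff_logical_ids(deployed, proposed))
```

Here the bucket ends up under "added" while the untouched queue appears in none of the three lists.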

cdk deploy:

We've used this command before. It deploys our CloudFormation templates to AWS. If you have not run cdk synth beforehand, this command synthesizes the application before deploying it.

Creating simple static website using CDK

We've discussed different concepts in AWS CDK. Let us create a simple static website to reinforce them.

Here's the plan:

1. Create a new project by executing the below command in a new empty directory. You can choose any directory name you want - static-website, for example.
cdk init app --language=python

2. Run cdk bootstrap to bootstrap your environment, if you have not done so before.

3. Create a folder named website - this is where the contents of the static website reside. For now, this folder contains just a skeleton HTML5 file named index.html. The contents of this file are irrelevant, as S3 can host any static files.

4. In your default stack, add the following code:

# At the top of the stack file:
from aws_cdk import aws_s3 as s3, aws_s3_deployment as s3deploy

# Inside the stack's __init__ method:
# Create an S3 bucket configured for static website hosting
website_bucket = s3.Bucket(self, "WebsiteBucket",
                           bucket_name="serverless-app-py-fe-bucket",
                           public_read_access=True,
                           website_index_document="index.html")

# Deploy files from the "website" directory to the S3 bucket
s3deploy.BucketDeployment(self, "BucketDeployment",
                          sources=[s3deploy.Source.asset("./website/")],
                          destination_bucket=website_bucket)

In the above code snippet, we're creating an S3 bucket. S3 has the capability to host a website. We need to set a couple of properties on the s3.Bucket construct to make it behave like a website: public_read_access makes the contents of the bucket public, and website_index_document is set to index.html (the file that we created in the previous step).

The next construct that we use is s3deploy.BucketDeployment. The purpose of this construct is to deploy the contents of our website folder to the created S3 bucket. It accepts a couple of properties: sources lists the contents of the website (the 'website' folder in our case) and destination_bucket is the target bucket.

5. Then, you can run cdk synth to see the generated CloudFormation template.

6. You can run cdk deploy to deploy your website

7. Log in to the AWS console, open the S3 service, select the bucket (serverless-app-py-fe-bucket in our case) and select the Properties tab.

Scroll down to the Static website hosting section to get the URL of the website.

Automatic creation of Lambdas

In some cases, AWS CDK constructs will create Lambda functions on your behalf.

In the below code snippet, we're creating an S3 bucket. We've set the property auto_delete_objects to True (which requires the removal_policy property to be set to RemovalPolicy.DESTROY).

from aws_cdk import RemovalPolicy, aws_s3 as s3

s3bucket = s3.Bucket(self, "S3Bucket",
                     bucket_name="aws-lambda-py-s3-test123",
                     auto_delete_objects=True,
                     removal_policy=RemovalPolicy.DESTROY)

Whenever we destroy the stack, all the objects in the bucket will be deleted. Even a change to the S3 bucket name triggers auto deletion, because the bucket is replaced and all the objects in the old bucket are destroyed. You should NOT use this in a production environment; it is generally used only in development environments.

Please note that we have not defined any Lambda function in our stack.

This deletion of objects is done by a Lambda function, which AWS CDK creates automatically for you.

When you deploy the stack using cdk deploy, you can see that a Lambda function has been created for you.

Automatic creation of Lambda function in AWS CDK

Please let me know your thoughts in the comments section.