Using Environment variables and secrets (API keys, database passwords) in AWS Lambda - Python Version

Passing environment variables to AWS Lambda

In this article, you're going to learn how to use environment variables and pass sensitive information such as API keys and database passwords securely to AWS Lambda using Python.

The TypeScript version of this article is available here.

What are Environment Variables?

Environment variables are variables whose values are set outside the program (in our case, the program is a Lambda function), usually as key-value pairs. The program itself does not change across environments; it reads the values of these variables at runtime, and only the values differ.

For example, if you're running a program in a local environment, you might want it to refer to the database installed on your local machine. If you're running the same program in a dev environment, you would want it to refer to the dev database.
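
To make this concrete, here is a minimal sketch of one program picking its database host from the environment. The variable name DB_HOST, the fallback value, and the example host name are assumptions for illustration only; nothing here is defined by Lambda.

```python
import os


def get_db_host(env=os.environ):
    # DB_HOST is a hypothetical variable name; fall back to the local
    # machine when it is not set (for example, on a developer laptop).
    return env.get("DB_HOST", "localhost")


# The same code resolves to a different database in each environment.
print(get_db_host({}))                                       # local run, nothing set
print(get_db_host({"DB_HOST": "dev-db.example.internal"}))  # dev environment
```

The function body never changes; only the value supplied from outside does.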

Kinds of environment variables used in lambda

At a high level, you can categorize the data that you might want to pass as environment variables to your Lambda function into 2 kinds:

  • Non-sensitive data such as a bucket name, a DynamoDB table name, etc.
  • Sensitive data such as API keys, the username and password of a database, etc.

Apart from this, AWS Lambda itself has some default environment variables.

Non-sensitive data

Let's consider a scenario where you have a DynamoDB table for storing sales transactions. At the end of the day, you might want to run a Lambda function to process the sales information for that day.

In this case, your Lambda function needs to know the name of the DynamoDB table. You can pass this directly as an environment variable because this information is not sensitive.

In the code snippet below (which uses aws-cdk), we're creating the DynamoDB table with a single partition key.

table = ddb.Table(
    self, 'sales_table',
    table_name='sales',
    partition_key=ddb.Attribute(
        name='id', type=ddb.AttributeType.STRING),
    billing_mode=ddb.BillingMode.PAY_PER_REQUEST)

Then, we create a lambda function that uses Python 3.8 as runtime. We configure timeout and memory size for the lambda function. We pass the table_name as an environment variable to the lambda function.

read_ddb_fn = _lambda.Function(
    self,
    'read_ddb_function',
    runtime=_lambda.Runtime.PYTHON_3_8,
    function_name='read_ddb_function',
    timeout=Duration.seconds(180),
    memory_size=256,
    code=_lambda.Code.from_asset('lambdas'),
    environment={
        "table_name": table.table_name
    },
    handler='lambda.handler')

When you deploy using cdk deploy, CDK creates all of the AWS resources.

The environment variables created earlier are shown, with their values in plain text, in the Environment Variables section of the Configuration tab.

As this is not sensitive data, it is fine to pass the DynamoDB table name as an environment variable.

How to use the environment variables in the Lambda function

Accessing environment variables from a Lambda function is no different from accessing them in any other program.

If you're using Python, you can use os.environ.get('env_variable_name').

Below is the actual Lambda function code for retrieving the environment variable:

import os


def handler(event, context):
    table_name = os.environ.get('table_name')
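
Note that os.environ.get returns None when the variable is not set, which can surface later as a confusing error. One option, sketched below, is to fail fast with a clear message. The helper name require_env and the returned dictionary are hypothetical additions for illustration, not part of the original function.

```python
import os


def require_env(name, env=None):
    """Return the value of an environment variable or raise a clear error."""
    env = os.environ if env is None else env
    value = env.get(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


def handler(event, context):
    # 'table_name' matches the key set in the CDK environment mapping.
    table_name = require_env("table_name")
    return {"table_name": table_name}
```

With this in place, a misconfigured deployment fails on the first invocation with a message that names the missing variable.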

Permissions

It may be obvious, but it is worth reiterating: even though the function can get the DynamoDB table name from the environment variable, it still needs the necessary permissions to access the table.

In CDK, you can grant read access like this:

# Grant the Lambda function read access to the table
table.grant_read_data(read_ddb_fn)

Sensitive data

Let's consider a scenario where you have an RDS database and want your Lambda function to interact with it.

In this case, your Lambda function needs the hostname, user name, and password to communicate with the database, whether that is RDS, Aurora, or any other data store.

As the user name and password of the database are sensitive information, it is not advisable to pass them directly as environment variables to your Lambda function. As shown earlier, all environment variables that you pass into the Lambda function are visible in plain text in the AWS console.

This is the reason why you should never pass sensitive data directly as environment variables to your Lambda function.

Instead, store the sensitive information in Parameter Store or Secrets Manager, and pass only its name or ARN (Amazon Resource Name) as an environment variable to your Lambda function.

The Lambda function can then fetch the sensitive value from Parameter Store or Secrets Manager at runtime and use it.

Passing sensitive information using parameter store


Let us assume that our Lambda function talks to an external API and this API requires an API key. As this is sensitive information, you can't pass it as plain text. So, we've stored the value of the API key in Parameter Store under the name API_KEY. We refer to that parameter using the CDK code below:

api_key = 'API_KEY'
ssm_param = ssm.StringParameter.from_secure_string_parameter_attributes(
    self, 'ssm_param', parameter_name=api_key)

Passing the name of the Parameter (of sensitive data) to the Lambda function

Then, we create a Lambda function with Python 3.8 as the runtime. We pass the parameter name (API_KEY) as an environment variable. Please note that we're NOT passing the actual secret value, only the name under which it is stored.

read_ssm_fn = _lambda.Function(
    self,
    'read_ssm_function',
    runtime=_lambda.Runtime.PYTHON_3_8,
    function_name='read_ssm_function',
    code=_lambda.Code.from_asset('lambdas'),
    environment={
        "API_KEY": api_key
    },
    handler='lambda.ssm_handler')

Permissions for lambda to read the secret

Our Lambda function will talk to Parameter Store to get the actual secret value, but it needs permission to retrieve that value. The line of code below grants that permission.

ssm_param.grant_read(read_ssm_fn)

Lambda function code

Our Lambda function talks to Parameter Store to retrieve the secret value, and it can then use that value to communicate with the external API.

import os
import boto3


def ssm_handler(event, context):
    # The environment variable holds the parameter name, not the secret itself
    api_key = os.environ.get('API_KEY')
    ssm = boto3.client('ssm')
    response = ssm.get_parameter(Name=api_key, WithDecryption=True)
    api_value = response['Parameter']['Value']
    # use api_value
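
Calling Parameter Store on every invocation adds latency. A common pattern, sketched below, is to cache the value at module level so that warm invocations reuse it. The names _cache, get_cached, and fake_fetch are hypothetical; in the real function, the fetch callable would wrap the ssm.get_parameter call shown above.

```python
_cache = {}


def get_cached(name, fetch):
    """Return the value for `name`, calling `fetch(name)` only on a cache miss."""
    if name not in _cache:
        _cache[name] = fetch(name)
    return _cache[name]


# Demonstration with a stand-in fetcher instead of a real SSM client.
calls = []


def fake_fetch(name):
    calls.append(name)
    return "secret-value"


first = get_cached("API_KEY", fake_fetch)
second = get_cached("API_KEY", fake_fetch)
print(first == second, len(calls))
```

The trade-off is that a cached value stays in memory until the execution environment is recycled, so a changed key is not picked up immediately.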

Creating a database with credentials as secret stored in Secrets Manager

You can also use Secrets Manager to store and retrieve secrets required for your lambda. Secrets Manager has additional functionalities such as secret rotation.

In the CDK code below, we're creating a database instance in an existing VPC.

database_name = 'ecommerce'

db_instance = rds.DatabaseInstance(
    self, 'db_instance',
    engine=rds.DatabaseInstanceEngine.postgres(
        version=rds.PostgresEngineVersion.VER_13),
    instance_type=ec2.InstanceType.of(
        ec2.InstanceClass.BURSTABLE3, ec2.InstanceSize.SMALL),
    vpc=vpc,
    vpc_subnets=ec2.SubnetSelection(
        subnet_type=ec2.SubnetType.PRIVATE_WITH_EGRESS),
    database_name=database_name,
    credentials=rds.Credentials.from_generated_secret('postgres'),
    max_allocated_storage=200)

Creation of secret

Please note that rds.Credentials.from_generated_secret('postgres') creates a secret in Secrets Manager with the passed value as the user name. As we're talking to a Postgres database, we've specified the user name as postgres.

Passing the secret arn as an environment variable to Lambda

You can pass the database endpoint and database name directly as environment variables. For the database password, however, pass only the secret ARN to the Lambda function.

read_secret_fn = _lambda.Function(
    self,
    'read_secret_function',
    runtime=_lambda.Runtime.PYTHON_3_8,
    function_name='read_secret_function',
    code=_lambda.Code.from_asset('lambdas'),
    environment={
        "DB_ENDPOINT_ADDRESS": db_instance.db_instance_endpoint_address,
        "DB_NAME": database_name,
        "DB_SECRET_ARN": db_instance.secret.secret_arn
    },
    handler='lambda.secret_handler')

Fetching the actual secret from lambda

In the Lambda function, you can fetch the actual password value from Secrets Manager and use it to communicate with the database.

We're using the get_secret_value method of the Secrets Manager client to fetch the actual secret. Please note that, in this case, it is a JSON string with username and password properties.

import json
import os
import boto3


def secret_handler(event, context):
    # The environment variable holds the secret ARN, not the password
    secret_arn = os.environ.get('DB_SECRET_ARN')
    secretsmanager = boto3.client('secretsmanager')
    secret = secretsmanager.get_secret_value(SecretId=secret_arn)
    secret_value = secret.get('SecretString')
    secret_json = json.loads(secret_value)
    db_password = secret_json["password"]
    # use db_password
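
The parsing step can be separated from the AWS call so that it is easy to test without credentials. This is a minimal sketch, assuming the secret JSON has the username and password properties mentioned above; the function name parse_db_secret and the sample values are made up.

```python
import json


def parse_db_secret(secret_string):
    """Extract the username and password from a Secrets Manager JSON string."""
    secret = json.loads(secret_string)
    return secret["username"], secret["password"]


# Sample payload shaped like the secret described above (values are fake).
sample = json.dumps({"username": "postgres", "password": "not-a-real-password"})
username, password = parse_db_secret(sample)
print(username)
```

In the handler, you would pass it the SecretString returned by get_secret_value.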

Default environment variables

Until now, we've discussed environment variables that we pass explicitly. However, the Lambda service also sets several environment variables of its own. Some of them are reserved and cannot be changed, while others are set to sensible defaults that you can override.

LAMBDA_TASK_ROOT is the environment variable that contains the path of your lambda code

AWS_LAMBDA_FUNCTION_VERSION is the environment variable that tells us the version of the lambda function
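
These variables are read the same way as the ones you define yourself. Here is a small sketch; the function name runtime_info and the dictionary keys are hypothetical, and both values are None when the code runs outside Lambda.

```python
import os


def runtime_info(env=None):
    """Collect two Lambda-provided variables; values are None outside Lambda."""
    env = os.environ if env is None else env
    return {
        "task_root": env.get("LAMBDA_TASK_ROOT"),
        "function_version": env.get("AWS_LAMBDA_FUNCTION_VERSION"),
    }


print(runtime_info())
```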

Conclusion

Hope you've learned a bit about passing environment variables in AWS Lambda.

Please let me know your thoughts in the comments.