Using Environment variables and secrets (API keys, database passwords) in AWS Lambda - Python Version
In this article, you're going to learn how to use environment variables and pass sensitive information such as API keys and database passwords securely to AWS Lambda using Python.
A TypeScript version of this article is also available.
What are Environment Variables?
Environment variables are variables whose values are set outside the program (in our case, the program is a Lambda function), usually as key-value pairs. The important point is that the program itself does not change across environments; it reads the values of these variables at run time.
For example, if you're running a program in a local environment, you might want it to use the database installed on your local machine. If you're running the same program in the dev environment, you would want it to use the dev database.
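As a rough sketch of this idea, the snippet below resolves a database host from an environment variable and falls back to a local default. The variable name DB_HOST and the host names are hypothetical, chosen only for illustration.

```python
import os


def get_db_host(env=None):
    """Return the database host for the current environment.

    DB_HOST is a hypothetical variable name used only for this illustration.
    """
    env = os.environ if env is None else env
    # Fall back to the local database when nothing is configured.
    return env.get("DB_HOST", "localhost")


if __name__ == "__main__":
    # With DB_HOST unset this resolves to the local default; in dev you would
    # set DB_HOST to the dev database host before starting the program.
    print(get_db_host())
```

The code is identical in every environment; only the value supplied from outside changes.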
Kinds of environment variables used in lambda
At a high level, you can categorize the data that you might want to pass as environment variables to your Lambda function into two kinds:
- Non-Sensitive data such as bucket name, dynamodb table name, etc...
- Sensitive data such as API keys, username, and password of the database, etc...
Apart from this, AWS Lambda itself has some default environment variables.
Non-sensitive data
Let's consider a scenario where you have a dynamodb table for storing the sales transactions. At the end of the day, you might want to run some lambda function to process the sales information for that day.
In this case, your lambda function needs to know the name of the dynamodb table. You can pass this as an environment variable directly as this information is not sensitive.
In the code snippet below (which uses aws-cdk), we're creating the DynamoDB table with a single partition key.
# Assumes this runs inside a CDK stack, with aws_dynamodb imported as ddb
table = ddb.Table(
    self, 'sales_table',
    table_name='sales',
    partition_key=ddb.Attribute(
        name='id', type=ddb.AttributeType.STRING),
    billing_mode=ddb.BillingMode.PAY_PER_REQUEST)
Then, we create a Lambda function that uses Python 3.8 as its runtime and configure its timeout and memory size. We pass the table name to the Lambda function as the table_name environment variable.
read_ddb_fn = _lambda.Function(
    self, 'read_ddb_function',
    runtime=_lambda.Runtime.PYTHON_3_8,
    function_name='read_ddb_function',
    timeout=Duration.seconds(180),
    memory_size=256,
    code=_lambda.Code.from_asset('lambdas'),
    environment={
        "table_name": table.table_name
    },
    handler='lambda.handler')
When you deploy using cdk deploy, CDK creates all the AWS resources.
You can see the environment variables created earlier (along with their values in plain text) in the Environment Variables section of the Configuration tab.
As this is not sensitive data, you can pass the dynamodb table information as an environment variable.
How to use the environment variables in the Lambda function
Accessing environment variables from a Lambda function is no different from accessing them in any other program.
If you're using Python, you can use os.environ.get('env_variable_name').
Below is the actual Lambda function code for retrieving the environment variable:
import os


def handler(event, context):
    table_name = os.environ.get('table_name')
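Note that os.environ.get returns None when the variable is missing, which can surface later as a confusing error. One possible way to fail early, shown here only as a sketch, is a small helper that raises a clear error instead. The name require_env is hypothetical and not part of any AWS library.

```python
import os


def require_env(name, env=None):
    """Return the value of an environment variable or raise a clear error.

    `require_env` is a hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    value = env.get(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


def handler(event, context):
    # Fails fast with a readable message if 'table_name' was not configured.
    table_name = require_env("table_name")
    return {"table_name": table_name}
```

Whether to fail fast or fall back to a default depends on the variable; for a required table name, failing fast is usually easier to debug.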
Permissions
It is obvious, but worth reiterating: even though the function can get the DynamoDB table name from the environment variable, it still needs the necessary permissions to access the table.
In CDK, you can grant read access with the table's grant_read_data method, for example table.grant_read_data(read_ddb_fn).
Sensitive data
Let's consider a scenario where you want your Lambda function to interact with an RDS database.
In this case, your Lambda function needs the hostname, user name, and password to communicate with the database, whether it is RDS, Aurora, or any other data store.
As the user name and password of the database are sensitive information - it is not advisable to pass this information directly as environment variables to your lambda function. As shown earlier, all environment variables that you pass into the lambda function would be visible in the AWS console.
This is the reason why you should never pass sensitive data as environment variables directly to your lambda function.
For sensitive information, store the value in Parameter Store or Secrets Manager and pass only its name or ARN (Amazon Resource Name) as an environment variable to your Lambda function.
The Lambda function can then fetch the sensitive value from Parameter Store or Secrets Manager at run time and use it.
Passing sensitive information using parameter store
Let us assume that our Lambda function talks to an external API that requires an API key. As this is sensitive information, you can't pass it as plain text. So, we've stored the value of the API key under the name API_KEY in Parameter Store. We refer to that parameter using the CDK code below.
api_key = 'API_KEY'
ssm_param = ssm.StringParameter.from_secure_string_parameter_attributes(
    self, 'ssm_param', parameter_name=api_key)
Passing the name of the Parameter (of sensitive data) to the Lambda function
Then, we create a Lambda function with Python 3.8 as its runtime. We pass the parameter name (API_KEY) as an environment variable. Please note that we're NOT passing the actual secret value; it is just the name under which the value is stored.
read_ssm_fn = _lambda.Function(
    self, 'read_ssm_function',
    runtime=_lambda.Runtime.PYTHON_3_8,
    function_name='read_ssm_function',
    code=_lambda.Code.from_asset('lambdas'),
    environment={
        "API_KEY": api_key
    },
    handler='lambda.ssm_handler')
Permissions for lambda to read the secret
Our Lambda function will ask Parameter Store for the actual secret value, but it needs permission to retrieve that value. The line of code below grants that permission.
ssm_param.grant_read(read_ssm_fn)
Lambda function code
Our Lambda function talks to Parameter Store to retrieve the secret value, and can then use that value to communicate with the external API.
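import os

import boto3


def ssm_handler(event, context):
    # The environment variable holds the parameter name, not the secret itself
    api_key = os.environ.get('API_KEY')
    ssm = boto3.client('ssm')
    # WithDecryption=True returns the decrypted value of a SecureString
    response = ssm.get_parameter(Name=api_key, WithDecryption=True)
    api_value = response['Parameter']['Value']
    # use api_value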
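The handler above calls Parameter Store on every invocation. A possible refinement, sketched here under assumptions, is to keep the fetched value in a module-level cache, since module state survives between invocations that reuse the same execution environment. The helper get_parameter_cached and its client argument are hypothetical names introduced for this illustration; the trade-off is that a changed parameter is not picked up until a new execution environment starts.

```python
import os

# Module-level state lives as long as the execution environment does.
_cache = {}


def get_parameter_cached(name, client=None):
    """Fetch a SecureString parameter once and reuse it on later calls.

    Hypothetical helper for illustration; `client` exists so that a fake
    SSM client can be injected when testing offline.
    """
    if name not in _cache:
        if client is None:
            import boto3  # imported lazily so the helper can be tested offline
            client = boto3.client("ssm")
        response = client.get_parameter(Name=name, WithDecryption=True)
        _cache[name] = response["Parameter"]["Value"]
    return _cache[name]


def ssm_handler(event, context):
    # API_KEY holds the parameter *name*, not the secret value itself.
    api_value = get_parameter_cached(os.environ["API_KEY"])
    # use api_value
```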
Creating a database with credentials as secret stored in Secrets Manager
You can also use Secrets Manager to store and retrieve the secrets required by your Lambda function. Secrets Manager has additional functionality such as secret rotation.
In the CDK code below, we're creating a database instance in an existing VPC.
database_name = 'ecommerce'
db_instance = rds.DatabaseInstance(
    self, 'db_instance',
    engine=rds.DatabaseInstanceEngine.postgres(
        version=rds.PostgresEngineVersion.VER_13),
    instance_type=ec2.InstanceType.of(
        ec2.InstanceClass.BURSTABLE3, ec2.InstanceSize.SMALL),
    vpc=vpc,
    vpc_subnets=ec2.SubnetSelection(
        subnet_type=ec2.SubnetType.PRIVATE_WITH_EGRESS),
    database_name=database_name,
    credentials=rds.Credentials.from_generated_secret('postgres'),
    max_allocated_storage=200)
Creation of secret
Please note that rds.Credentials.from_generated_secret('postgres') creates a secret in Secrets Manager with the passed value as the user name. As we're talking to a Postgres database, we've specified the user name as postgres.
Passing the secret arn as an environment variable to Lambda
You can pass the database endpoint and database name as environment variables directly. For the database password, however, you pass the secret ARN to the Lambda function instead of the password itself.
read_secret_fn = _lambda.Function(
    self, 'read_secret_function',
    runtime=_lambda.Runtime.PYTHON_3_8,
    function_name='read_secret_function',
    code=_lambda.Code.from_asset('lambdas'),
    environment={
        "DB_ENDPOINT_ADDRESS": db_instance.db_instance_endpoint_address,
        "DB_NAME": database_name,
        "DB_SECRET_ARN": db_instance.secret.secret_arn
    },
    handler='lambda.secret_handler')
Fetching the actual secret from lambda
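In Lambda, you can fetch the actual password value from Secrets Manager and use it to communicate with the database. The function's execution role also needs permission to read the secret; in CDK you can grant it with, for example, db_instance.secret.grant_read(read_secret_fn).
We're using the get_secret_value method of the Secrets Manager client to fetch the actual secret. Please note that the secret is (in this case) a JSON string with username and password properties.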
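import json
import os

import boto3


def secret_handler(event, context):
    # The environment variable holds the secret's ARN, not the password
    secret_arn = os.environ.get('DB_SECRET_ARN')
    secretsmanager = boto3.client('secretsmanager')
    secret = secretsmanager.get_secret_value(SecretId=secret_arn)
    # SecretString is a JSON document; parse it to read individual fields
    secret_value = secret.get('SecretString')
    secret_json = json.loads(secret_value)
    db_password = secret_json["password"]
    # use db_password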
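To make the shape of the secret concrete, the sketch below parses a made-up SecretString payload and combines it with the non-sensitive settings passed as environment variables. The function name build_db_settings, the host name, and the payload values are all invented for illustration; a real generated secret may contain more keys than the two shown.

```python
import json


def build_db_settings(secret_string, host, db_name):
    """Combine the parsed secret with the non-sensitive settings.

    Hypothetical helper for illustration only.
    """
    secret = json.loads(secret_string)
    return {
        "host": host,
        "dbname": db_name,
        "user": secret["username"],
        "password": secret["password"],
    }


# Made-up payload for illustration; never hard-code a real secret like this.
sample_secret = '{"username": "postgres", "password": "example-password"}'
settings = build_db_settings(sample_secret, "db.example.internal", "ecommerce")
print(settings["user"])
```

In the handler, the first argument would be the SecretString returned by get_secret_value, and the other two would come from DB_ENDPOINT_ADDRESS and DB_NAME.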
Default environment variables
So far, we've discussed environment variables that we pass explicitly. However, the Lambda service also defines several environment variables of its own, and some of them are set to sensible defaults that you can override.
LAMBDA_TASK_ROOT is the environment variable that contains the path to your Lambda code.
AWS_LAMBDA_FUNCTION_VERSION is the environment variable that tells us the version of the Lambda function.
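As a small sketch, these two variables can be read like any other environment variable. The helper name describe_runtime and the fallback values used outside Lambda are assumptions made for this example.

```python
import os


def describe_runtime(env=None):
    """Read two of Lambda's built-in environment variables.

    Hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    return {
        # Path to the deployed function code; fall back to the current
        # directory when running outside Lambda.
        "task_root": env.get("LAMBDA_TASK_ROOT", os.getcwd()),
        # Version of the function being executed; "local" is a made-up
        # placeholder for runs outside Lambda.
        "version": env.get("AWS_LAMBDA_FUNCTION_VERSION", "local"),
    }


def handler(event, context):
    return describe_runtime()
```

The fallbacks make the same handler usable in local tests, where Lambda has not set these variables.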
Conclusion
I hope you've learned a bit about passing environment variables and secrets to AWS Lambda.
Please let me know your thoughts in the comments.