Improving Your DynamoDB Streams Developer Experience

Introduction

DynamoDB Streams lets us write and execute code that reacts to insert, update and delete operations on DynamoDB items. Each change is pushed to a stream as an event, and you typically configure that stream to trigger a Lambda function.
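
As a rough illustration of what such a function looks like, here is a minimal JavaScript sketch that routes each stream record by its 'eventName' (INSERT, MODIFY or REMOVE). The names routeRecord and handler, and the strings returned, are hypothetical and chosen only for this example.

```javascript
// Minimal sketch: decide what to do with one DynamoDB stream record.
// routeRecord and its return values are illustrative, not part of any AWS API.
function routeRecord(record) {
  switch (record.eventName) {
    case "INSERT":
      return "created";
    case "MODIFY":
      return "updated";
    case "REMOVE":
      return "deleted";
    default:
      return "ignored";
  }
}

// Lambda delivers the stream batch as event.Records.
// Export this however your project exports its handlers.
async function handler(event) {
  const records = event.Records || [];
  return records.map(routeRecord);
}
```

In a real function, each branch would call your own business logic instead of returning a string.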

When writing Lambda code that will be triggered by a DynamoDB stream:

How do you test your code locally before deploying, so that you have the fastest feedback loop?
How do you disable DynamoDB streams for testing purposes?
How do you debug your Lambda code and step through it?

In this blog post, we are going to explore some of the ways we can improve the developer experience while answering the questions above.

We are going to be using the Serverless Framework and VS Code as our tools of choice. However, you can achieve the same with other frameworks such as AWS SAM.

Invoking Locally

Invoking your triggered Lambda function locally can be invaluable during the dev/test cycle.

This can be achieved with the following command entered into your local command prompt:

'serverless invoke local --function functionName'

But there are challenges to overcome when working with serverless code locally:

How can I be sure my Lambda IAM role has sufficient permissions?
How can I access AWS resources via code?
What event/context do I invoke my Lambda handler with?

Lambda IAM Credibility

A drawback of executing any cloud-based code locally is that your local code will usually be making AWS requests using your IAM credentials.

If you are implementing least privilege for your Lambda IAM role (and you should!), your personal AWS IAM user/role will commonly have more permissions than your Lambda function running in AWS. This means that if your Lambda executes successfully locally, there is no guarantee it will also succeed when running in AWS, which invalidates your testing!

One way to combat this would be to assume the role of your deployed Lambda function and then execute your function locally. However, this is difficult to achieve because you would need to extend the role’s trust policy to accommodate it, opening up another security hole.

Best practice is to run end-to-end tests entirely in the cloud on a regular basis, but not at the expense of slowing down development.

So a mixture of local dev/test and cloud-based end-to-end testing is ideal.

Accessing AWS

A big benefit of using DynamoDB over RDS for your data store is that DynamoDB has a public HTTPS API protected via IAM, whereas RDS relies on a combination of username/password and private networking.

This is huge for developers.

It allows us to talk to a real DynamoDB table running in AWS (whereas for RDS you would commonly start a local SQL database and populate it with dummy data🤢).

There aren’t many hoops you need to jump through as a developer:

It's common to set the table name you intend to interact with as an environment variable in your serverless.yml. You can easily replicate this for your local invocation by passing in the table name environment variable on the command line:

  
serverless invoke local -f functionName -e tableName=my-table

# Or more than one variable

serverless invoke local -f functionName \
  -e tableName=my-table \
  -e foo=bar

If your Lambda function is going to run in a private subnet within AWS, you can create a VPC endpoint. This will create a private tunnel between that subnet and the DynamoDB API endpoint. However, when invoking your function locally, you will have no problem connecting via the public internet. Furthermore, the VPC endpoint is just networking, so no Lambda code changes are required to make your private connection work.

As mentioned above, you must have the correct DynamoDB IAM permissions in order to access the service. This goes for your own IAM credentials for a local invoke and for your Lambda’s IAM role when invoking in AWS.
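
Returning to the table name environment variable, here is a minimal sketch of how function code might read it, assuming the variable is called 'tableName' as in the commands above. The helper name getTableName is hypothetical. Failing fast when the variable is missing makes a forgotten '-e' flag obvious on a local invoke.

```javascript
// Read the table name from the environment, failing fast if it is missing.
// "tableName" matches the -e tableName=my-table flag used above;
// getTableName is a hypothetical helper name.
function getTableName(env = process.env) {
  const tableName = env.tableName;
  if (!tableName) {
    throw new Error("Environment variable tableName is not set");
  }
  return tableName;
}
```

Passing the environment in as a parameter (defaulting to process.env) keeps the helper easy to exercise in unit tests without touching real environment variables.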

The Lambda Event/Context

When triggering your function locally, you have to pass it the event. You cannot insert an item in DynamoDB and expect the stream to invoke your local Lambda - that will invoke your function in AWS.

You can do this with the following command from your command line:

'serverless invoke local --function functionName --path events/data.json'

The schema for a DynamoDB event is available in the AWS documentation but it is hard to find. I have provided it below 👇:

  
{
  "Records": [
    {
      "eventID": "c4ca4238a0b923820dcc509a6f75849b",
      "eventName": "INSERT",
      "eventVersion": "1.1",
      "eventSource": "aws:dynamodb",
      "awsRegion": "us-east-1",
      "dynamodb": {
        "Keys": {
          "Id": {
            "N": "101"
          }
        },
        "NewImage": {
          "Message": {
            "S": "New item!"
          },
          "Id": {
            "N": "101"
          }
        },
        "ApproximateCreationDateTime": 1428537600,
        "SequenceNumber": "4421584500000000017450439091",
        "SizeBytes": 26,
        "StreamViewType": "NEW_AND_OLD_IMAGES"
      },
      "eventSourceARN": "arn:aws:dynamodb:us-east-1:123456789012:table/ExampleTableWithStream/stream/2015-06-27T00:48:05.899"
    },
    {...},
    {...}
  ]
}
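
To make the shape of this payload more concrete, here is a hedged JavaScript sketch that converts the typed attribute values in a record (such as { "N": "101" }) into plain values. It only handles the S, N, BOOL and NULL types, and the function names are hypothetical. In real code, the 'unmarshall' helper from the '@aws-sdk/util-dynamodb' package is the more complete option.

```javascript
// Convert one DynamoDB attribute value, e.g. { N: "101" }, to a plain JS value.
// Only a few scalar types are handled; this is a sketch, not a full unmarshaller.
// Note: Number() can lose precision for very large DynamoDB numbers.
function fromAttributeValue(attr) {
  if ("S" in attr) return attr.S;
  if ("N" in attr) return Number(attr.N);
  if ("BOOL" in attr) return attr.BOOL;
  if ("NULL" in attr) return null;
  throw new Error("Unsupported attribute type: " + Object.keys(attr).join(","));
}

// Convert a whole image (NewImage, OldImage or Keys) to a plain object.
function fromImage(image) {
  const item = {};
  for (const [name, attr] of Object.entries(image || {})) {
    item[name] = fromAttributeValue(attr);
  }
  return item;
}

// Summarize each record in a stream event: what happened, and to which item.
function summarizeEvent(event) {
  return (event.Records || []).map((record) => ({
    eventName: record.eventName,
    keys: fromImage(record.dynamodb.Keys),
    newItem: record.dynamodb.NewImage
      ? fromImage(record.dynamodb.NewImage)
      : undefined,
  }));
}
```

For the first record in the sample event, summarizeEvent would report an INSERT with keys { Id: 101 } and a new item { Message: "New item!", Id: 101 }.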

I would recommend creating an 'events/' directory at the location of your serverless.yml and putting common event payloads you would like to invoke locally there.
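
If you would rather generate these payloads than hand-edit them, a small builder function is one option. The sketch below rests on several assumptions: buildStreamEvent is a hypothetical helper, and the event ID, sequence number, size and ARN are placeholders copied from the sample event rather than real identifiers.

```javascript
// Build a single-record DynamoDB stream event for local invocation.
// All identifiers below are placeholders taken from the sample event above.
function buildStreamEvent(eventName, keys, newImage, oldImage) {
  const dynamodb = {
    Keys: keys,
    ApproximateCreationDateTime: 1428537600,
    SequenceNumber: "4421584500000000017450439091",
    SizeBytes: 26,
    StreamViewType: "NEW_AND_OLD_IMAGES",
  };
  // Only attach the images that were supplied by the caller.
  if (newImage) dynamodb.NewImage = newImage;
  if (oldImage) dynamodb.OldImage = oldImage;

  return {
    Records: [
      {
        eventID: "c4ca4238a0b923820dcc509a6f75849b",
        eventName,
        eventVersion: "1.1",
        eventSource: "aws:dynamodb",
        awsRegion: "us-east-1",
        dynamodb,
        eventSourceARN:
          "arn:aws:dynamodb:us-east-1:123456789012:table/ExampleTableWithStream/stream/2015-06-27T00:48:05.899",
      },
    ],
  };
}

// Serialize to JSON text suitable for saving as e.g. events/data.json.
const insertEvent = buildStreamEvent(
  "INSERT",
  { Id: { N: "101" } },
  { Message: { S: "New item!" }, Id: { N: "101" } }
);
const insertEventJson = JSON.stringify(insertEvent, null, 2);
```

Writing insertEventJson to a file under 'events/' (for example with Node's fs.writeFileSync) gives you a payload you can pass to the '--path' option.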

Note: if your Lambda function is going to use the keys in the event payload to retrieve the inserted/updated item from DynamoDB, then that item must, of course, exist in the real DynamoDB table.

A hint for finding these poorly-documented event schemas is to visit the Lambda function “test” tab in the AWS console and find the template for your trigger in the dropdown.

Time to Invoke

Once you have prepared your event JSON, you need to ensure your data is in the correct state in AWS. The DynamoDB console (specifically its Item Explorer) is a good tool for doing this. You can insert/update items quite easily. There are other tools too (such as Dynobase) which serve a similar purpose.

But what if inserting/modifying your data adds events to your existing DynamoDB stream and invokes the deployed version of your function, mutating the item beyond the state you need locally (if your function mutates items)?

(See the Enable/Disable Your DynamoDB Stream section below 👇)

Once:

Your real DynamoDB data is in the correct state.
Your JSON event is saved locally.
Your IAM permissions are configured.

It is time to invoke!

As mentioned in the intro, this post uses the Serverless Framework. However, the same can be achieved with any other tool that supports local function invokes (e.g. AWS SAM).

The following command from a command line will invoke your function, combining all the parameters mentioned previously:

'serverless invoke local --function functionName -e tableName=my-table --path events/data.json'

Remember to reset your data via the Item Explorer if you want to rerun your function (if your function is mutating data).

Enable/Disable Your DynamoDB Stream

DynamoDB streams are commonly used in event-driven architectures, and the Lambda code you want to invoke is a small piece in a larger puzzle. When you are dev/testing your little piece, you probably want to prevent all downstream behavior from being triggered by your (potentially bug-filled) incomplete code.

You may also find yourself pulling your hair out when setting your data in the desired initial state if every insert/update you do via the Item Explorer triggers the deployed version of your function to run - potentially mutating your data before your local function runs, likely breaking everything.

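The answer is to temporarily disable your DynamoDB stream trigger and re-enable it when you are done. Strictly speaking, what you switch off is the Lambda event source mapping that reads from the stream, not the stream itself.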

You cannot achieve this in the AWS console. However, the following CLI commands will enable/disable your stream for you 👇:

  
# Disable the event source mapping so the stream stops invoking the deployed function
region=us-east-1
event_source_mapping_uuid=... # (see below)

aws lambda update-event-source-mapping \
  --region "$region" \
  --uuid "$event_source_mapping_uuid" \
  --no-enabled

  
# Re-enable the event source mapping once you have finished testing
region=us-east-1
event_source_mapping_uuid=... # (see below)

aws lambda update-event-source-mapping \
  --region "$region" \
  --uuid "$event_source_mapping_uuid" \
  --enabled

(credit: Alestic)

You can find your event source mapping UUID using the following steps:

Go to your triggered Lambda function in the console (NOT DynamoDB Stream)
Click on your DynamoDB Streams trigger
Find the trigger UUID

You can also check the state of your stream in the console, rather than relying on the CLI, using the following steps:

Go to your triggered Lambda function in the console (NOT DynamoDB Stream)
Click on your DynamoDB Streams trigger
Check for an enabled/disabled state.

The location of your Event Source Mapping UUID and state. These can be fed into the above CLI commands.

There is a second option to disable your stream in your infrastructure-as-code:

  
events:
    - stream:
        type: dynamodb
        arn: ...
        enabled: false

If you are using the Serverless Framework to deploy your Lambda function and your DynamoDB table/stream, you can add 'enabled: false' to your 'events' section when defining your function to deploy the Lambda event source mapping in a disabled state. You can use the same instructions above to check the state of that event source mapping in the console.

Debugging your Lambda Function

This final piece of advice is not specific to DynamoDB Streams. But it's surprising how many people struggle to attach debuggers to local function invocations. And this is one of the biggest benefits of invoking your functions locally.

This advice is specific to VS Code:

Open a debugging terminal from the VS Code debugging menu (for Node.js functions, this is the JavaScript Debug Terminal)
Simply execute your function locally within this terminal

Your breakpoints will then be hit!

It’s as easy as 1, 2, … step through, pass over … 3

Conclusion

When adopting any new tool/service, developers should always take the time to optimize their developer experience before diving in. This will improve their productivity and allow them to deliver features/fixes faster.

The advice given in this blog post will allow any developer using the Serverless Framework and VS Code to:

Invoke their functions locally - minimizing the dev/test feedback loop - but using a real DynamoDB table.
Attach debuggers to their function code - allowing them to debug issues (without 'console.log()'-ing everything).
Control when their DynamoDB stream is enabled/disabled to isolate their own boundaries in an event-driven architecture, avoiding unwanted processes being triggered downstream.

References

Serverless Framework documentation (specifically the section on the 'invoke local' command)
Pause/Resume AWS Lambda Reading Kinesis Stream - Technical instructions for enabling/disabling a Kinesis Data Stream (also works for DynamoDB Streams)

Samuel Lock

Sr. Serverless Developer

A backend developer passionate about the tooling, platforming and education of serverless technologies
