
Enabling Amazon Data Firehose delivery to Private APIs
In this post, we demonstrate how to enable delivery to private APIs via Amazon Data Firehose streams.
Arnab Ghosh
Amazon Employee
Published Mar 3, 2025
This post is co-authored by Lavanya Tangutur.
Real-time streaming data allows organizations to gain immediate insights from data as it's generated. This capability enables faster, data-driven decisions, swift reactions to changing conditions, and personalized customer experiences. These features are crucial in fast-paced environments like financial markets, cybersecurity, and customer-centric industries, where timely responses directly impact success.
Amazon Data Firehose (formerly Amazon Kinesis Data Firehose) is a streaming ETL solution that provides the simplest way to load streaming data into data stores and analytics tools. It captures, transforms, and loads streaming data into Amazon S3, Amazon Redshift, Amazon OpenSearch Service, Snowflake, Apache Iceberg tables, and Splunk, enabling near real-time analytics with existing business intelligence tools and dashboards. As a fully managed service, Data Firehose automatically scales to match data throughput without requiring ongoing administration. The service batches, compresses, and encrypts data before loading, which minimizes destination storage usage and enhances security.
Data Firehose natively supports data delivery to HTTP endpoints for both customer and third-party use cases. However, these endpoints must be public - a requirement that may not suit customers who want to keep their custom HTTP endpoints private and away from internet exposure. This post provides prescriptive guidance on enabling data delivery from Data Firehose to private custom HTTP endpoints.
Prerequisites
- Access to an AWS account.
- A fully configured Amazon Virtual Private Cloud (VPC) for deploying resources. Learn more about how to create a VPC here.
- An IAM user or role with permissions, following the principle of least privilege, to create an Amazon Data Firehose stream, an Amazon S3 bucket, an AWS Lambda function, and VPC networking components.
- All resources in this post are created in the us-east-1 AWS Region.

Solution overview
Amazon Data Firehose (Firehose) does not natively support delivery to private HTTP endpoints. However, we can accomplish this using the source data transformation feature, which allows Firehose to invoke an AWS Lambda function that transforms incoming data before delivery to the destination. We will leverage this Lambda function and its private networking capability to invoke the private API.
The solution architecture consists of an Amazon Data Firehose stream configured with two key components:
- An Amazon S3 bucket as the destination
- Data transformation enabled through an AWS Lambda function
The critical aspect of this architecture is the Lambda function's network configuration. The function must be attached to a VPC that has access to your private API endpoint. When Firehose receives data, it invokes the Lambda function with batches of data records according to the configured buffering settings. The Lambda function then processes the records and forwards them to the private API, transforming the data as required to match the private API's expectations.
Implementation Guide
For this blog post, we use Amazon API Gateway private endpoints backed by a simple AWS Lambda function to simulate a private API. The Lambda function handles the API requests and processes the records received in the request body.
- Creating a private API using Amazon API Gateway involves the following steps:
- First, create a VPC endpoint for API Gateway.
- Next, create a private API and attach a resource policy to enable access.
- Finally, deploy the API. You can find more detailed steps here.
- Create an AWS Lambda function which will act as the handler for the API. Next, use Lambda proxy integration to associate the function with the private API we created in step 1. You can find more detailed steps here. You can use the following sample Python code to create the function.
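The original sample code is not reproduced here, so the following is a minimal sketch of what such a handler could look like. It assumes Lambda proxy integration, under which the request body arrives as a JSON string in event["body"]; the field names in the response are illustrative.

```python
import json


def lambda_handler(event, context):
    """Handle requests proxied from the private API Gateway endpoint.

    With Lambda proxy integration, the request body arrives as a string
    in event["body"].
    """
    body = event.get("body") or "{}"
    try:
        records = json.loads(body)
    except json.JSONDecodeError:
        return {"statusCode": 400, "body": json.dumps({"error": "invalid JSON"})}

    # A real workload would persist or act on the records here; this
    # sketch simply echoes a count back to the caller.
    count = len(records) if isinstance(records, list) else 1
    print(f"Processed {count} record(s)")  # visible in CloudWatch Logs

    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"processed": count}),
    }
```

The printed count shows up in the function's CloudWatch Logs, which is useful later when verifying end-to-end delivery.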
Create an AWS Lambda function attached to a VPC. Choose the VPC in which you created the VPC endpoint for API Gateway. The Lambda function will be able to access resources hosted in the VPC via the ENIs deployed in the chosen subnets. This Lambda function will serve as the transformation function for the Amazon Data Firehose stream.
- The function accepts the input from Amazon Data Firehose, performs the required transformations, and invokes the private API.
- The function requires information about the VPC endpoint and the API Gateway endpoint in order to invoke the private API. We pass these parameters as environment variables to the Lambda function. You can choose to use AWS Systems Manager Parameter Store as well.
- Next, assign adequate permissions to the Lambda execution role. It requires permissions to create and write to log groups and to create and manage network interfaces. The following is a sample IAM policy that you can use for the Lambda execution role.
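The original policy is not reproduced here; the sample below mirrors the permissions in the AWS managed policy AWSLambdaVPCAccessExecutionRole. In production, scope the log statement's Resource down to your function's specific log group.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "arn:aws:logs:us-east-1:*:*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "ec2:CreateNetworkInterface",
        "ec2:DescribeNetworkInterfaces",
        "ec2:DeleteNetworkInterface"
      ],
      "Resource": "*"
    }
  ]
}
```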
- The Lambda function processes Firehose records, each containing a unique identifier and associated data. Upon receiving these records, the function invokes the private API through the VPC endpoint and must return an 'Ok' result for each record to confirm successful processing. If you want the Firehose records to be delivered to the destination S3 bucket, return the transformed records as part of the response. In the current implementation, however, we return empty data because no S3 storage is required. If any errors occur during processing, Firehose automatically writes failure reports to the destination S3 bucket. Detailed error-handling documentation is available in the Amazon Data Firehose documentation. Below is the sample implementation code for the transformation Lambda function.
- Create an Amazon S3 bucket. We will use this bucket as the destination for the Amazon Data Firehose stream.
- Create a Firehose stream.
- Choose any Source from the options available (we chose “Direct PUT” for our test) and Amazon S3 as the Destination.
- To enable transformation, select Turn on data transformation and choose the Lambda function that we created in the previous section.
- Choose the created S3 bucket as the destination.
- Keep the rest of the options as default and click Create Firehose stream.
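The same stream can alternatively be created programmatically. The following boto3 sketch mirrors the console steps above; the ARNs and stream name are illustrative placeholders for the role, bucket, and transformation function you created earlier.

```python
# Illustrative placeholders -- substitute your own role, bucket, and function.
FIREHOSE_ROLE_ARN = "arn:aws:iam::123456789012:role/firehose-delivery-role"
BUCKET_ARN = "arn:aws:s3:::example-firehose-destination-bucket"
TRANSFORM_LAMBDA_ARN = "arn:aws:lambda:us-east-1:123456789012:function:firehose-transform"

stream_config = {
    "DeliveryStreamName": "private-api-delivery-stream",
    "DeliveryStreamType": "DirectPut",  # the "Direct PUT" source option
    "ExtendedS3DestinationConfiguration": {
        "RoleARN": FIREHOSE_ROLE_ARN,
        "BucketARN": BUCKET_ARN,
        # Equivalent of "Turn on data transformation" in the console
        "ProcessingConfiguration": {
            "Enabled": True,
            "Processors": [
                {
                    "Type": "Lambda",
                    "Parameters": [
                        {"ParameterName": "LambdaArn",
                         "ParameterValue": TRANSFORM_LAMBDA_ARN}
                    ],
                }
            ],
        },
    },
}

if __name__ == "__main__":
    import boto3  # imported lazily so the config can be inspected without the SDK
    boto3.client("firehose", region_name="us-east-1").create_delivery_stream(**stream_config)
```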
Testing the solution
After completing the setup described in the "Implementation Guide" section, send test data to your active Firehose stream. Follow the steps in the Testing Firehose stream with sample data documentation for this process. Monitor your Lambda function logs to confirm successful delivery to your private API, and review them to troubleshoot any issues. This verification confirms that your Firehose stream is correctly configured to deliver data to your private HTTP destination.
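Instead of the console, you can also push test data with the AWS SDK. This sketch sends a small batch of sample records; the stream name and record fields are illustrative assumptions.

```python
import json


def build_test_records(n: int):
    """Build n sample Firehose records, similar to the console's test data."""
    return [
        {"Data": (json.dumps({"ticker": "TEST", "price": 100 + i}) + "\n").encode()}
        for i in range(n)
    ]


if __name__ == "__main__":
    import boto3  # imported lazily so the helper can be inspected without the SDK
    boto3.client("firehose", region_name="us-east-1").put_record_batch(
        DeliveryStreamName="private-api-delivery-stream",
        Records=build_test_records(10),
    )
```

After sending the batch, check the transformation function's CloudWatch Logs and the simulated private API's logs for the record count.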
Clean up
After you have completed testing, delete the AWS resources that you created in your account to stop incurring costs.
- Delete the Firehose stream:
- To delete using the AWS Console, navigate to the Amazon Data Firehose service page.
- Select the Firehose stream from the list of available streams.
- Click the Delete button and follow the prompts.
Conclusion
This post demonstrated how to enable delivery to private APIs using Amazon Data Firehose. The included sample code accelerates implementation and testing. We will update this post with current architectural guidance as Amazon Data Firehose features evolve.
Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.