
Three strategies for handling long-running queries in AppSync

Three strategies to overcome AWS AppSync's 30-second timeout limitation: Step Functions orchestration, API Gateway integration, and event-driven design.

Shengcai Cheng
Amazon Employee
Published Jun 3, 2025
When building applications with AWS AppSync, developers often encounter the 30-second timeout limit for resolver operations. This limit becomes problematic when dealing with, for example, large data processing tasks or external API calls that take a long time to respond.
Let's explore three effective solutions to handle these long-running operations.

Solution 1: Leveraging Step Functions

This solution is discussed in this blog post:
https://aws.amazon.com/blogs/mobile/invoke-aws-services-directly-from-aws-appsync/
When a client makes a query to AppSync that requires extensive processing, instead of trying to handle everything in the resolver, you'll want to initiate a Step Function workflow. The resolver immediately returns an execution ID to the client, allowing the long-running process to continue in the background. Think of it like ordering at a busy restaurant - you get an order number right away, but the cooking happens in the background.
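As a rough sketch, the resolver can be a direct HTTP data source call to the Step Functions API, with no Lambda in between. The example below assumes an APPSYNC_JS resolver attached to a hypothetical `startJob` query, an HTTP data source pointing at the regional Step Functions endpoint with IAM authorization for `states:StartExecution`, and placeholder names and ARNs throughout:

```typescript
// Hypothetical APPSYNC_JS resolver for a `startJob` query, attached to an
// HTTP data source that targets the regional Step Functions endpoint
// (https://states.<region>.amazonaws.com) and signs requests with IAM.
import { Context, util } from '@aws-appsync/utils';

export function request(ctx: Context) {
  return {
    method: 'POST',
    resourcePath: '/',
    params: {
      headers: {
        'Content-Type': 'application/x-amz-json-1.0',
        'X-Amz-Target': 'AWSStepFunctions.StartExecution',
      },
      body: JSON.stringify({
        // Placeholder ARN; point this at your own state machine.
        stateMachineArn:
          'arn:aws:states:us-east-1:123456789012:stateMachine:LongRunningJob',
        // Hand the GraphQL arguments to the workflow as its input.
        input: JSON.stringify(ctx.args),
      }),
    },
  };
}

export function response(ctx: Context) {
  if (ctx.error) {
    util.error(ctx.error.message, ctx.error.type);
  }
  // StartExecution returns { executionArn, startDate }; the ARN doubles as
  // the execution ID the client will poll with.
  const started = JSON.parse(ctx.result.body);
  return { executionId: started.executionArn };
}
```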
The client can then use this execution ID to periodically check the status of their request through another AppSync query. Behind the scenes, this status query communicates with Step Functions to understand the current state of the process. Meanwhile, the Step Function orchestrates all the necessary work, whether that's processing large datasets, running complex calculations, or coordinating multiple AWS services.
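The status query can reuse the same HTTP data source to call `DescribeExecution`. Below is a minimal sketch for a hypothetical `getJobStatus(executionId: String!)` field; the field and return type names are assumptions:

```typescript
// Hypothetical `getJobStatus(executionId: String!)` resolver reusing the
// same Step Functions HTTP data source to call DescribeExecution.
import { Context, util } from '@aws-appsync/utils';

export function request(ctx: Context) {
  return {
    method: 'POST',
    resourcePath: '/',
    params: {
      headers: {
        'Content-Type': 'application/x-amz-json-1.0',
        'X-Amz-Target': 'AWSStepFunctions.DescribeExecution',
      },
      body: JSON.stringify({ executionArn: ctx.args.executionId }),
    },
  };
}

export function response(ctx: Context) {
  if (ctx.error) {
    util.error(ctx.error.message, ctx.error.type);
  }
  const execution = JSON.parse(ctx.result.body);
  // status is RUNNING, SUCCEEDED, FAILED, TIMED_OUT, or ABORTED;
  // output only appears once the execution has finished.
  return { status: execution.status, output: execution.output };
}
```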
When the Step Function completes its work, it can trigger a callback to AppSync, updating the status in a database (DynamoDB for example). If you've set up subscriptions, clients can be automatically notified when their process completes, rather than having to poll for status updates. This creates a smooth, responsive experience for users while handling complex, time-consuming operations in the background.
This pattern is particularly valuable for scenarios like video processing, data analysis, or any operation that might take minutes or even hours to complete. By separating the request initiation from the actual processing, you create a more resilient and scalable system that can handle long-running operations without timing out or blocking other requests.

Solution 2: HTTP Resolver to API Gateway with Asynchronous Lambda

Let's explore handling long-running operations through API Gateway and async Lambda functions. This approach is particularly elegant when you already have a REST API infrastructure or need to expose your APIs beyond AppSync.
Imagine you're building a system where multiple clients need access to your APIs - some through AppSync, others directly via REST endpoints. In this scenario, API Gateway becomes your unified entry point. When a client initiates a long-running query through AppSync, your resolver makes an HTTP call to API Gateway, which then triggers an async Lambda function.
The flow works like this: The AppSync resolver calls your API Gateway endpoint using the HTTP resolver. API Gateway is configured for async integration with Lambda, meaning it doesn't wait for the Lambda function to complete. Instead, it immediately returns a task ID. This ID is passed back through AppSync to the client, similar to our Step Functions approach. The Lambda function continues processing in the background, and when it completes, it writes the results to DynamoDB.
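The key piece is telling API Gateway to invoke the Lambda function asynchronously, which is done by setting the `X-Amz-Invocation-Type` integration request header to `Event`. Here is a hedged CDK (TypeScript) sketch of what that wiring could look like; the `/jobs` resource, the worker function, and the 202 response mapping are illustrative assumptions rather than a prescribed setup:

```typescript
// Hypothetical AWS CDK (TypeScript) wiring: a POST /jobs method that invokes
// the worker Lambda with the Event invocation type, so API Gateway answers
// 202 Accepted immediately instead of waiting for the function to finish.
import * as apigateway from 'aws-cdk-lib/aws-apigateway';
import * as lambda from 'aws-cdk-lib/aws-lambda';

declare const api: apigateway.RestApi;   // existing REST API
declare const worker: lambda.Function;   // long-running worker function

const jobs = api.root.addResource('jobs');
jobs.addMethod(
  'POST',
  new apigateway.LambdaIntegration(worker, {
    proxy: false,
    // Ask the Lambda service for an asynchronous (Event) invocation.
    requestParameters: {
      'integration.request.header.X-Amz-Invocation-Type': "'Event'",
    },
    // Forward the client payload plus a task ID derived from the request ID.
    requestTemplates: {
      'application/json':
        '{ "taskId": "$context.requestId", "payload": $input.json("$") }',
    },
    // Return the task ID to the caller right away.
    integrationResponses: [
      {
        statusCode: '202',
        responseTemplates: {
          'application/json': '{ "taskId": "$context.requestId" }',
        },
      },
    ],
  }),
  { methodResponses: [{ statusCode: '202' }] },
);
```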
What makes this approach particularly valuable is its versatility. Since you're using API Gateway, you can easily expose these endpoints to external systems, generate SDK clients, or apply API Gateway's powerful features like throttling, API keys, and usage plans. For organizations that need to provide both GraphQL and REST interfaces to their services, this pattern prevents duplicate implementation while maintaining consistent business logic.
The trade-off compared to the Step Functions approach is that you have fewer built-in orchestration capabilities, but you gain more flexibility in API management and exposure. It's particularly well-suited for organizations transitioning from REST to GraphQL or maintaining hybrid API architectures.
Remember though, you'll need to implement your own status checking mechanism, typically through a combination of DynamoDB for state storage and additional API endpoints for status queries. While this requires more initial setup than the Step Functions approach, it provides greater control over the API interface and integration patterns.
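A minimal sketch of such a status query, assuming a DynamoDB data source backed by a hypothetical `TaskStatus` table keyed on `taskId`:

```typescript
// Hypothetical `getTaskStatus(taskId: ID!)` resolver on a DynamoDB data
// source; the async Lambda writes progress and results to the same
// `TaskStatus` table keyed on taskId.
import { Context, util } from '@aws-appsync/utils';

export function request(ctx: Context) {
  return {
    operation: 'GetItem',
    key: util.dynamodb.toMapValues({ taskId: ctx.args.taskId }),
  };
}

export function response(ctx: Context) {
  if (ctx.error) {
    util.error(ctx.error.message, ctx.error.type);
  }
  // e.g. { taskId, status: "PENDING" | "COMPLETED" | "FAILED", result }
  return ctx.result;
}
```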

Solution 3: Event-Driven Architecture

The Event-Driven Architecture (EDA) approach for handling long-running AppSync operations is like setting up a sophisticated notification system in a restaurant. Instead of waiting at the counter for your order, you get a buzzer that notifies you when your food is ready.
In this pattern, when a client makes a query through AppSync, the resolver publishes an event to EventBridge. Think of EventBridge as the central nervous system of your application - it receives the initial request and routes it to the appropriate service. The resolver immediately returns a correlation ID to the client, and sets up a subscription channel for updates.
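A sketch of what that resolver could look like using AppSync's EventBridge data source and its `PutEvents` operation; the mutation name, event source, and detail type are illustrative assumptions:

```typescript
// Hypothetical `requestAnalysis` mutation resolver on an EventBridge data
// source: publish the request as an event, hand back a correlation ID.
import { Context, util } from '@aws-appsync/utils';

export function request(ctx: Context) {
  const correlationId = util.autoId();
  ctx.stash.correlationId = correlationId;
  return {
    operation: 'PutEvents',
    events: [
      {
        source: 'myapp.appsync',          // assumed event source name
        detailType: 'AnalysisRequested',  // assumed detail type
        detail: { correlationId, input: ctx.args },
      },
    ],
  };
}

export function response(ctx: Context) {
  if (ctx.error || ctx.result.FailedEntryCount > 0) {
    util.error('Failed to publish event', 'EventBridgeError');
  }
  // The client subscribes to status updates filtered on this correlation ID.
  return { correlationId: ctx.stash.correlationId, status: 'PENDING' };
}
```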
Behind the scenes, EventBridge routes the event to a Lambda function (or any other service) that handles the long-running process. As the process progresses, the Lambda function publishes updates back through EventBridge. These events are then picked up by another Lambda function that acts as a "notifier," updating AppSync through mutations. AppSync's subscription system ensures these updates flow back to the waiting client in real-time. 
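One possible shape for that notifier, assuming a hypothetical `updateJobStatus` mutation and, purely for brevity, API key authentication (IAM-signed requests are equally valid):

```typescript
// Hypothetical "notifier" Lambda: an EventBridge rule targets this function
// with progress events, and it reports them to AppSync via an assumed
// `updateJobStatus` mutation, which in turn fires the client subscription.
import type { EventBridgeEvent } from 'aws-lambda';

const APPSYNC_URL = process.env.APPSYNC_URL!;         // .../graphql endpoint
const APPSYNC_API_KEY = process.env.APPSYNC_API_KEY!;

const UPDATE_STATUS = /* GraphQL */ `
  mutation UpdateJobStatus($correlationId: ID!, $status: String!, $result: AWSJSON) {
    updateJobStatus(correlationId: $correlationId, status: $status, result: $result) {
      correlationId
      status
    }
  }
`;

type ProgressDetail = { correlationId: string; status: string; result?: unknown };

export const handler = async (
  event: EventBridgeEvent<'AnalysisProgress', ProgressDetail>,
): Promise<void> => {
  const { correlationId, status, result } = event.detail;

  // Node.js 18+ Lambda runtimes provide a global fetch.
  const response = await fetch(APPSYNC_URL, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json', 'x-api-key': APPSYNC_API_KEY },
    body: JSON.stringify({
      query: UPDATE_STATUS,
      variables: { correlationId, status, result: JSON.stringify(result ?? {}) },
    }),
  });

  const body: any = await response.json();
  if (body.errors) {
    throw new Error(`AppSync mutation failed: ${JSON.stringify(body.errors)}`);
  }
};
```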
For example, imagine a video processing service where users upload videos that need to be transcoded, analyzed for content, and have subtitles generated. Each of these steps could be handled by different services, all choreographed through events. The client gets real-time updates as each step completes, without any service needing to know about the others.
The beauty of this approach is its scalability and flexibility. You can add new event consumers without changing existing code, and each component can scale independently. It's particularly powerful when combined with other AWS services like SQS for guaranteed delivery, or when you need to fan out a single request to multiple processors.
While it requires more initial setup than the API Gateway or Step Functions approaches, the EDA pattern provides the most flexible and scalable solution for complex, distributed systems where real-time updates and system extensibility are key requirements.
 

Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.
