
Low-Code S3 Key Validation With AWS Step Functions & JSONata
In this post, I use JSONata to add low-code S3 object key validation to an AWS Step Functions state machine.
- Arithmetic (`$price * 1.2`)
- Conditional Logic (`$price > 100 ? 'expensive' : 'affordable'`)
- Filtering (`$orders[status = 'shipped']`)
- String Operations (`$firstName & ' ' & $lastName`)

To use JSONata, set the `QueryLanguage` field to `JSONata` in the state machine definition. This replaces the traditional JSONPath fields with two JSONata fields:
- `Arguments`: Used to customise data sent to state actions.
- `Output`: Used to transform results into custom state output.

The `Assign` field sets variables that can be stored and reused across the workflow.

JSONata expressions are written inside `{% %}` delimiters but otherwise follow standard JSONata syntax. They access data using the `$states` reserved variable with the following structures:
- State input is accessed using `$states.input`
- Context information is accessed using `$states.context`
- Task results (if successful) are accessed using `$states.result`
- Error outputs (if existing) are accessed using `$states.errorOutput`

Step Functions also provides additional JSONata functions, including `$partition`, `$range`, `$hash`, `$random`, and `$uuid`. Some functions, such as `$eval`, are not supported.

Example expressions:
- `{% $states.input.title %}`
- `{% $current_price <= $states.input.desired_priced %}`
- `{% $parse($states.input.json_string) %}`

`Choice` states introduce conditional logic to a state machine. They assess conditions and guide execution accordingly, allowing workflows to branch dynamically based on input data. When used with JSONata, each rule in a `Choice` state must contain the following fields:
- `Condition` field: a JSONata expression that evaluates to `true`/`false`.
- `Next` field: a value that must match a state name in the state machine.

The following `Choice` state rule checks whether a variable `foo` equals `1`:

`{"Condition": "{% $foo = 1 %}", "Next": "NumericMatchState"}`

When `$foo = 1`, the condition is `true` and the workflow transitions to the `NumericMatchState` state.

- A file is uploaded to an Amazon S3 Bucket.
- S3 creates an Object Created event.
- Amazon EventBridge matches the event record to an event rule.
- EventBridge executes the AWS Step Functions state machine and passes the event to it as JSON input.
- The state machine transitions through the various choice states.
- The state machine transitions to the fail state if any choice state criteria are not met.
- The state machine transitions to the success state if all choice state criteria are met.

The first check confirms that the object key has a `txt` suffix:

`{% $lowercase($split($split($states.input.detail.object.key, '/')[-1], '.')[-1]) = 'txt' %}`

First, the object key is read from `$states.input`:

`$states.input.detail.object.key` → `"iTunes/iTunes-AllTunes-2025-02-01.txt"`

Next, the key is split with `$split` using `/` as the delimiter:

`$split($states.input.detail.object.key, '/')` → `["iTunes", "iTunes-AllTunes-2025-02-01.txt"]`

The last element (the filename) is selected using `[-1]`:

`$split(...)[-1]` → `"iTunes-AllTunes-2025-02-01.txt"`

The filename is split with `$split` again, using `.` as the delimiter:

`$split($split(...)[-1], '.')` → `["iTunes-AllTunes-2025-02-01", "txt"]`

The last element (the suffix) is selected using `[-1]`:

`$split($split(...)[-1], '.')[-1]` → `"txt"`

The suffix is passed to `$lowercase` to convert it to lowercase:

`$lowercase($split($split(...)[-1], '.')[-1])` → `"txt"`

The `$lowercase` function ensures consistency, as files with TXT, Txt, or tXt extensions will still match correctly. Here, there is no change as `txt` is already lowercase.

Finally, the result is compared with `'txt'`:

`$lowercase($split($split(...)[-1], '.')[-1]) = 'txt'` → `true` ✅
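
To make the evaluation order concrete, here is a minimal Python sketch that mirrors the JSONata steps above. It only illustrates the logic and is not how Step Functions evaluates the expression. The function name `has_txt_suffix` and the trimmed-down `event` dictionary are assumptions made for this example.

```python
def has_txt_suffix(event: dict) -> bool:
    """Mirror of the JSONata suffix check (illustrative only)."""
    key = event["detail"]["object"]["key"]  # $states.input.detail.object.key
    filename = key.split("/")[-1]           # $split(key, '/')[-1]
    suffix = filename.split(".")[-1]        # $split(filename, '.')[-1]
    return suffix.lower() == "txt"          # $lowercase(suffix) = 'txt'


# Hypothetical event carrying only the field that the check reads.
event = {"detail": {"object": {"key": "iTunes/iTunes-AllTunes-2025-02-01.txt"}}}
print(has_txt_suffix(event))  # True
```

A key with no `.` at all, such as `iTunes/iTunes-2025-02-01`, yields the whole filename as the "suffix", so the comparison with `txt` fails as intended.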

The second check confirms that the filename contains `iTunes`.

`{% $contains($split($states.input.detail.object.key, '/')[-1], 'iTunes') %}`

There is no `$lowercase` this time, as `iTunes` is the correct spelling.

The filename is extracted from the object key as before:

`$split($states.input.detail.object.key, '/')[-1]` → `"iTunes-AllTunes-2025-02-01.txt"`

The `$contains` function checks if the string contains the specified substring. It returns `true` if the substring exists; otherwise, it returns `false`.

`$contains($split(...)[-1], 'iTunes')` → `true` ✅ if `'iTunes'` appears anywhere in the filename.
- ✅ `"iTunes-AllTunes-2025-02-01.txt"` → `true`
- ❌ `"itunes-AllTunes-2025-02-01.txt"` → `false` (case-sensitive)
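
The same behaviour can be sketched in Python, where the `in` operator is also case-sensitive. This is an illustration under assumptions rather than part of the state machine; the function name `filename_contains_itunes` is hypothetical.

```python
def filename_contains_itunes(key: str) -> bool:
    """Mirror of $contains($split(key, '/')[-1], 'iTunes') (illustrative only)."""
    filename = key.split("/")[-1]  # only the part after the last '/' is checked
    return "iTunes" in filename    # case-sensitive substring test


print(filename_contains_itunes("iTunes/iTunes-AllTunes-2025-02-01.txt"))  # True
print(filename_contains_itunes("iTunes/itunes-AllTunes-2025-02-01.txt"))  # False
```

Note that the `iTunes/` prefix in the second key does not help it pass, because the check only looks at the filename.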

The third check confirms that the filename contains a date in the format `YYYY-MM-DD`.

`{% $exists($match($split($states.input.detail.object.key, '/')[-1], /\d{4}-\d{2}-\d{2}/)) %}`

The filename is extracted from the object key as before:

`$split($states.input.detail.object.key, '/')[-1]` → `"iTunes-AllTunes-2025-02-01.txt"`

The `$match` function applies the string to the provided regular expression (regex). If a match is found, an array of objects is returned containing the following fields:
- `match`: the substring that was matched by the regex.
- `index`: the offset (starting at zero) within the string.
- `groups`: if the regex contains capturing groups (parentheses), this contains an array of strings representing each captured group.

`$match(..., /\d{4}-\d{2}-\d{2}/)`
- `\d{4}` → Four digits (year)
- `-` → Hyphen separator
- `\d{2}` → Two digits (month)
- `-` → Another hyphen
- `\d{2}` → Two digits (day)

I can't use the `$match` output yet, as the `Choice` state needs a boolean output. Enter `$exists`. This function returns `true` for a successful match; otherwise, it returns `false`.

`$exists($match(..., /\d{4}-\d{2}-\d{2}/))` → `true` ✅ if a date is found.

Here, `$exists` returns `true` as a date is present. However, note that JSONata lacks built-in functions to validate dates. For example:
- `"2025-02-01"` → `true` (valid date)
- `"2025-02-31"` → `true` (invalid date but still matches format)
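
The format-only nature of this check can be reproduced with a short Python sketch. It assumes Python's `re` module behaves like the JSONata regex for this simple pattern (the `re.ASCII` flag restricts `\d` to the digits 0-9); the name `filename_has_date` is hypothetical.

```python
import re

# Same pattern as the JSONata expression; re.ASCII limits \d to 0-9.
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}", re.ASCII)


def filename_has_date(key: str) -> bool:
    """Mirror of $exists($match(filename, /\\d{4}-\\d{2}-\\d{2}/)) (illustrative only)."""
    filename = key.split("/")[-1]
    return DATE_PATTERN.search(filename) is not None


print(filename_has_date("iTunes/iTunes-2025-02-01.txt"))  # True
print(filename_has_date("iTunes/iTunes-2025-02-31.txt"))  # True (format matches, date is impossible)
print(filename_has_date("iTunes/iTunes-2025-02.txt"))     # False
```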

While I use separate `Choice` states for each JSONata expression in this section, I will add that all the expressions can be combined into a single `Choice` state using `and`:

`{% $lowercase($split($split($states.input.detail.object.key, '/')[-1], '.')[-1]) = 'txt' and $contains($split($states.input.detail.object.key, '/')[-1], 'iTunes') and $exists($match($split($states.input.detail.object.key, '/')[-1], /\\d{4}-\\d{2}-\\d{2}/)) %}`

The backslashes appear doubled here, as they must be escaped when the expression is stored as a JSON string in the state machine definition.
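
As a rough Python equivalent of the combined condition, the sketch below joins the three checks with `and`. It is only meant to show how the clauses fit together; the function name `key_is_valid` is an assumption made for this example.

```python
import re

DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}", re.ASCII)


def key_is_valid(key: str) -> bool:
    """All three checks in one boolean expression (illustrative only)."""
    filename = key.split("/")[-1]
    return (
        filename.split(".")[-1].lower() == "txt"       # suffix check
        and "iTunes" in filename                       # key content check
        and DATE_PATTERN.search(filename) is not None  # date format check
    )


print(key_is_valid("iTunes/iTunes-AllTunes-2025-02-01.txt"))  # True
print(key_is_valid("iTunes/iTunes-AllTunes-2025-02-01.csv"))  # False
```

As with the single `Choice` state, a `False` result here does not say which of the three checks failed.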

Combining the checks into a single `Choice` state has some benefits:
- Simplified Structure: Reducing the number of states can make the state machine easier to understand and maintain visually. Instead of multiple branching paths, all logic is in one centralised `Choice` state.
- Cost Optimisation: AWS Step Functions Standard Workflows pricing is based on the number of state transitions. Combining multiple `Choice` states into one reduces transitions, potentially lowering costs for high-volume workflows.
- Minimises Transition Latency: Each state transition adds a slight delay. By managing all logic within a single `Choice` state, the workflow runs more efficiently due to the reduced transitions.

It also has some drawbacks:
- Added Complexity: A complex `Choice` state with many conditions can be difficult to read, debug, and modify. It may require deeply nested logic, which makes future updates challenging.
- Limited Observability: If multiple conditions are combined into one state, debugging failures becomes more difficult as it is unclear which condition caused an unexpected transition.
- Potential Scaling Difficulty: As the workflow evolves, adding more conditions to a single `Choice` state can become unmanageable. Ultimately, this situation may require breaking it up.

[Figure: state machine with `Choice` states for each JSONata expression]

[Figure: state machine with one `Choice` state for all JSONata expressions]
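
As a rough sketch of the first layout, the snippet below builds a state machine definition with one `Choice` state per check and prints it as JSON. The state names (`SuffixCheck`, `ITunesCheck`, `DateCheck`, `ValidationPassed`, `ValidationFailed`) and the `Error` value are hypothetical, as the original definition is not shown here.

```python
import json

# JSONata fragments reused by each condition.
FILENAME = "$split($states.input.detail.object.key, '/')[-1]"
DATE_REGEX = r"/\d{4}-\d{2}-\d{2}/"


def choice_state(condition: str, next_state: str) -> dict:
    """Build a Choice state that moves on when the condition is true, else fails."""
    return {
        "Type": "Choice",
        "Choices": [{"Condition": "{% " + condition + " %}", "Next": next_state}],
        "Default": "ValidationFailed",  # hypothetical Fail state name
    }


definition = {
    "QueryLanguage": "JSONata",
    "StartAt": "SuffixCheck",
    "States": {
        "SuffixCheck": choice_state(
            f"$lowercase($split({FILENAME}, '.')[-1]) = 'txt'", "ITunesCheck"
        ),
        "ITunesCheck": choice_state(f"$contains({FILENAME}, 'iTunes')", "DateCheck"),
        "DateCheck": choice_state(
            f"$exists($match({FILENAME}, {DATE_REGEX}))", "ValidationPassed"
        ),
        "ValidationPassed": {"Type": "Succeed"},
        "ValidationFailed": {"Type": "Fail", "Error": "InvalidObjectKey"},
    },
}

# json.dumps escapes each backslash, so \d appears as \\d in the output.
print(json.dumps(definition, indent=2))
```

The single-state layout would keep only one `Choice` state whose `Condition` joins the three expressions with `and`.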

- File Suffix (`.txt`)
- Key Content (`iTunes`)
- Date Format (`YYYY-MM-DD`)

| Test Case | S3 Key | Expected | Actual |
|---|---|---|---|
| ✅ Valid Suffix (`.txt`) | `"iTunes/iTunes-2025-02-01.txt"` | Proceed to iTunes Check | ✅ Success → Next: iTunes String Check |
| ❌ Invalid Suffix (`.csv`) | `"iTunes/iTunes-2025-02-01.csv"` | Fail (No further checks) | ❌ Failure → No further checks |
| ❌ Missing Suffix | `"iTunes/iTunes-2025-02-01"` | Fail (No further checks) | ❌ Failure → No further checks |

| Test Case | S3 Key | Expected | Actual |
|---|---|---|---|
| ✅ Valid "iTunes" Key | `"iTunes/iTunes-2025-02-01.txt"` | Proceed to Date Check | ✅ Success → Next: Date Check |
| ❌ Incorrect Case (`itunes` instead of `iTunes`) | `"iTunes/itunes-2025-02-01.txt"` | Fail (No further checks) | ❌ Failure → No further checks |
| ❌ Missing Key String | `""` | Fail (No further checks) | ❌ Failure → No further checks |

| Test Case | S3 Key | Expected | Actual |
|---|---|---|---|
| ✅ Correct Date Format (`YYYY-MM-DD`) | `"iTunes/iTunes-2025-02-01.txt"` | Success (Validation complete) | ✅ Success → Validation complete! |
| ❌ Incorrect Date Format (Missing Day) | `"iTunes/iTunes-2025-02.txt"` | Fail (No further checks) | ❌ Failure → No further checks |
| ❌ Missing Date | `"iTunes/iTunes.txt"` | Fail (No further checks) | ❌ Failure → No further checks |

| Test Case | S3 Key | Expected | Actual |
|---|---|---|---|
| ⚠️ Impossible Date (`2025-02-31`) | `"iTunes/iTunes-2025-02-31.txt"` | Fail (Ideally) | ❌ Unexpected Success (JSONata does not validate real-world dates) |

While the JSONata expression confirms the date format (`YYYY-MM-DD`), it does not validate real-world dates. If strict date validation is needed, an AWS Lambda function would be required.
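
A minimal sketch of what such a Lambda function might look like in Python is shown below. It assumes the function receives the same S3 event as the state machine and only inspects the first date-like match in the filename; the names `contains_real_date` and `validDate` are hypothetical.

```python
import re
from datetime import date

DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}", re.ASCII)


def contains_real_date(key: str) -> bool:
    """Return True only if the filename holds a YYYY-MM-DD string that is a real date."""
    filename = key.split("/")[-1]
    match = DATE_PATTERN.search(filename)
    if match is None:
        return False
    try:
        date.fromisoformat(match.group())  # raises ValueError for dates like 2025-02-31
    except ValueError:
        return False
    return True


def lambda_handler(event, context):
    """Hypothetical Lambda entry point; assumes the S3 event is passed in unchanged."""
    key = event["detail"]["object"]["key"]
    return {"validDate": contains_real_date(key)}


print(contains_real_date("iTunes/iTunes-2025-02-01.txt"))  # True
print(contains_real_date("iTunes/iTunes-2025-02-31.txt"))  # False
```

A `Choice` state placed after the Lambda task could then branch on the returned `validDate` value.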