How it works
When you write a workflow in Polos, the framework automatically persists your execution state to a Postgres database. Here’s what happens behind the scenes:
1. Workflow starts
When you start process_order, before your code runs:
- Polos generates a unique execution ID
- Stores the execution status as PENDING in the database
- Persists the input payload
2. Steps execute and persist
Each time ctx.step.run() executes:
- The step function runs
- The output is persisted to the database
- The output is returned to the workflow
For process_order, this means:
- fetch_order → output stored
- charge_payment → output stored
- send_confirmation → output stored
3. Workflow completes
Once all steps finish, Polos updates the workflow status to SUCCESS.
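The lifecycle above can be sketched with a toy in-memory store. This is only an illustration of the bookkeeping, not Polos's implementation or API: `DB`, `start_execution`, and `run_step` are hypothetical names, and a plain dict stands in for Postgres.

```python
import json
import uuid

# Toy in-memory stand-in for the database (an assumption for illustration).
DB = {}

def start_execution(workflow_name, payload):
    """Record a new execution before any workflow code runs."""
    execution_id = str(uuid.uuid4())       # unique execution ID
    DB[execution_id] = {
        "workflow": workflow_name,
        "status": "PENDING",               # initial status
        "input": json.dumps(payload),      # persisted input payload
        "steps": {},                       # step key -> serialized output
    }
    return execution_id

def run_step(execution_id, key, fn):
    """Run the step function, persist its output, then return it."""
    output = fn()
    DB[execution_id]["steps"][key] = json.dumps(output)
    return output

def process_order(execution_id, order_id):
    order = run_step(execution_id, "fetch_order",
                     lambda: {"id": order_id, "total": 42})
    charge = run_step(execution_id, "charge_payment",
                      lambda: {"charged": order["total"]})
    run_step(execution_id, "send_confirmation", lambda: {"sent": True})
    DB[execution_id]["status"] = "SUCCESS"  # all steps finished
    return charge

execution_id = start_execution("process_order", {"order_id": "o-1"})
status_before = DB[execution_id]["status"]
result = process_order(execution_id, "o-1")
```

After this runs, the record holds the input, one stored output per step, and a final status of SUCCESS.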
Recovery after failure
What happens if your worker crashes at step 2? When the orchestrator detects the failure, it automatically retries the workflow:
- Retrieves the original inputs from the database
- Starts the workflow function from the beginning
- For each step:
  - Checks if that step already executed
  - If yes → returns the cached output (no re-execution)
  - If no → executes the step and persists the output
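The replay logic above can be sketched as a cache lookup in front of every step. This is a simplified model under stated assumptions, not Polos's actual code: `store`, `run_step`, and `WorkerCrash` are hypothetical names, and the crash is simulated with an exception.

```python
class WorkerCrash(Exception):
    """Simulates a worker dying mid-workflow."""

# Toy stand-in for the database: original input plus persisted step outputs.
store = {"input": {"order_id": "o-1"}, "steps": {}}
calls = []  # records which step functions actually ran

def run_step(key, fn):
    if key in store["steps"]:          # step already executed?
        return store["steps"][key]     # yes: return the cached output
    output = fn()                      # no: execute the step...
    store["steps"][key] = output       # ...and persist its output
    return output

def process_order(payload, crash_at=None):
    def step(name, value):
        def fn():
            if name == crash_at:
                raise WorkerCrash(name)  # crash before anything is persisted
            calls.append(name)
            return value
        return run_step(name, fn)

    order = step("fetch_order", {"id": payload["order_id"]})
    step("charge_payment", {"charged": True})
    step("send_confirmation", {"sent": True})
    return order

# First attempt: the worker crashes at step 2.
try:
    process_order(store["input"], crash_at="charge_payment")
except WorkerCrash:
    pass
first_attempt_calls = list(calls)

# Retry: same inputs, workflow function starts from the beginning.
calls.clear()
process_order(store["input"])
retry_calls = list(calls)
```

On the retry, fetch_order is served from the cache, so only charge_payment and send_confirmation actually run.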
Why determinism matters
For this model to work, workflows must be deterministic: given the same inputs, they should invoke the same steps with the same inputs in the same order. This is why non-deterministic operations must be in steps: otherwise, replaying a workflow could take a different execution path.
The problem with non-deterministic code
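As a minimal sketch of this failure mode, the toy engine below branches on a random value outside any step. Everything here is illustrative rather than Polos's API: `run_step`, `analyze`, and `WorkerCrash` are hypothetical names, and `rolls` is a stand-in for random.random() that yields 0.7 on the first run and 0.3 on the replay so the example is reproducible.

```python
class WorkerCrash(Exception):
    """Simulates a worker dying mid-workflow."""

store = {}                 # toy stand-in for persisted step outputs
rolls = iter([0.7, 0.3])   # stand-in for random.random(): first run, then replay

def run_step(key, fn):
    if key in store:       # cached on replay
        return store[key]
    store[key] = fn()      # execute and persist
    return store[key]

def analyze(crash_before_send=False):
    # Non-deterministic branch OUTSIDE any step: nothing records the roll.
    if next(rolls) > 0.5:
        result = run_step("expensive_analysis", lambda: "deep report")
    else:
        result = run_step("quick_check", lambda: "quick report")
    if crash_before_send:
        raise WorkerCrash()
    return run_step("send_result", lambda: f"sent {result}")

try:
    analyze(crash_before_send=True)   # first run: takes the expensive path
except WorkerCrash:
    pass

sent = analyze()                      # replay: takes the quick path instead
```

In this toy model the replay silently diverges: the store ends up holding both expensive_analysis and quick_check, and the result that gets sent ignores the expensive work already done.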
First run:
- random.random() returns 0.7 → takes expensive analysis path
- Step "expensive_analysis" executes and output is cached
- Worker crashes before send_result executes
Replay:
- random.random() returns 0.3 → takes quick check path
- Tries to execute step "quick_check"
- But step "expensive_analysis" already exists in the database!
- Polos can’t reconcile the execution history → replay fails or produces wrong results
The solution: Put non-deterministic operations in steps
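A minimal sketch of the fix, using a toy in-memory engine rather than Polos's API: the random decision moves into a step named "decide", so its outcome is persisted and reused on replay. `run_step`, `analyze`, `WorkerCrash`, and the `rolls` stand-in for random.random() are hypothetical names introduced for illustration.

```python
class WorkerCrash(Exception):
    """Simulates a worker dying mid-workflow."""

store = {}                 # toy stand-in for persisted step outputs
executed = []              # step keys whose functions actually ran
rolls = iter([0.7, 0.3])   # stand-in for random.random()

def run_step(key, fn):
    if key in store:       # cached on replay: the function is not called
        return store[key]
    executed.append(key)
    store[key] = fn()      # execute and persist
    return store[key]

def analyze(crash_before_send=False):
    # The non-deterministic decision is itself a step, so it is cached.
    take_expensive = run_step("decide", lambda: next(rolls) > 0.5)
    if take_expensive:
        result = run_step("expensive_analysis", lambda: "deep report")
    else:
        result = run_step("quick_check", lambda: "quick report")
    if crash_before_send:
        raise WorkerCrash()
    return run_step("send_result", lambda: f"sent {result}")

try:
    analyze(crash_before_send=True)   # first run: decide -> True, then crash
except WorkerCrash:
    pass

sent = analyze()                      # replay: decide is served from the cache
```

The second roll (0.3) is never consumed, because "decide" does not run again on replay.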
First run:
- Step "decide" returns True (cached: 0.7 > 0.5)
- Step "expensive_analysis" executes
Replay:
- Step "decide" returns cached True (not re-executed)
- Takes same path → "expensive_analysis" found in cache
- Workflow resumes deterministically from "send_result"
Built-in helpers for common cases
Polos provides helpers for common non-deterministic operations: ctx.step.now(), ctx.step.uuid(), and ctx.step.random(). These are equivalent to wrapping time.time(), uuid.uuid4(), and random.random() in steps, but more convenient.
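The exact signatures of these helpers aren't shown here, so the sketch below illustrates only the manual approach they replace: wrapping time.time(), uuid.uuid4(), and random.random() in steps. It uses a toy in-memory cache, and `run_step`, `workflow`, and the step keys are hypothetical names, not Polos's API.

```python
import random
import time
import uuid

store = {}  # toy stand-in for persisted step outputs

def run_step(key, fn):
    if key in store:       # cached on replay
        return store[key]
    store[key] = fn()      # execute and persist
    return store[key]

def workflow():
    started_at = run_step("started_at", time.time)
    # Converted to str so the stored value stays JSON serializable.
    request_id = run_step("request_id", lambda: str(uuid.uuid4()))
    sample = run_step("sample", random.random)
    return started_at, request_id, sample

first = workflow()
replayed = workflow()   # every value comes back from the cache
```

Because each value is produced inside a step, a replay sees the same timestamp, ID, and random number as the original run.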
Step output requirements
Step outputs must be JSON serializable so they can be persisted to the database. We recommend using Pydantic models for step outputs.
Key takeaways
- Durable execution = automatic state tracking - Every step’s input and output is persisted
- Steps are cached on replay - Completed steps return cached results, so there is no re-execution and no wasted API calls
- Workflows must be deterministic to replay correctly - Same inputs must produce same execution path
- Non-deterministic operations must be in steps - Time, UUIDs, random, API calls
- Use ctx.step.now(), ctx.step.uuid(), ctx.step.random() for common cases - These helpers are cached on replay
- Step outputs must be JSON serializable - Use Pydantic models for complex types