Workflow-Level Resilience in Orkes Conductor: Timeouts and Failure Workflows

Lomanu4

Building resilient, production-grade workflows means preparing for the unexpected—from task stalls to external service outages. While task-level timeouts catch issues in isolated steps, workflow-level resilience settings act as a safety net for your entire orchestration. They ensure your system behaves predictably under stress and provides a graceful fallback when things go wrong.

In this post, we’ll explore two key features in Orkes Conductor that help you build robust workflows:

  • Workflow Timeouts
  • Failure Workflows (a.k.a. Compensation flows)

Workflow timeouts: Don’t let things hang


A workflow timeout defines how long a workflow is allowed to run before it's forcibly marked as timed out. This is crucial when your business logic needs to meet service-level agreements (SLAs) or avoid workflows stalling indefinitely.

Workflow timeout parameters

Parameter | Description
timeoutSeconds | Maximum duration, in seconds, that the workflow is allowed to run. If the workflow hasn’t reached a terminal state within this time, it is marked as TIMED_OUT. Set to 0 to disable.
timeoutPolicy | Action to take when a timeout occurs. Supported values:
  • TIME_OUT_WF: terminates the workflow as TIMED_OUT.
  • ALERT_ONLY: logs an alert but lets the workflow continue.
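
To make the interaction between the two parameters concrete, here is a minimal sketch of the decision the table describes. It is only a local model for illustration: Conductor evaluates timeouts on the server, and the names PolicyValue and evaluate_timeout are hypothetical, not part of any SDK.

```python
from enum import Enum
from typing import Tuple


class PolicyValue(str, Enum):
    """Local stand-in for the two timeoutPolicy values in the table (hypothetical name)."""
    TIME_OUT_WF = "TIME_OUT_WF"
    ALERT_ONLY = "ALERT_ONLY"


def evaluate_timeout(elapsed_seconds: float, timeout_seconds: int,
                     policy: PolicyValue) -> Tuple[str, bool]:
    """Return (status, alert_logged) for a workflow that has not yet finished."""
    if timeout_seconds < 0:
        raise ValueError("timeout_seconds must be >= 0")
    # 0 disables the timeout; otherwise nothing happens until the limit is exceeded.
    if timeout_seconds == 0 or elapsed_seconds <= timeout_seconds:
        return "RUNNING", False
    if policy is PolicyValue.TIME_OUT_WF:
        return "TIMED_OUT", False  # the workflow is terminated
    return "RUNNING", True         # ALERT_ONLY: alert logged, workflow continues


print(evaluate_timeout(1801, 1800, PolicyValue.TIME_OUT_WF))  # ('TIMED_OUT', False)
print(evaluate_timeout(1801, 1800, PolicyValue.ALERT_ONLY))   # ('RUNNING', True)
```

The practical difference is that ALERT_ONLY never stops the workflow, so it suits monitoring an SLA rather than enforcing one.
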

Use case: E-commerce checkout with 30-minute SLA


Imagine a checkout flow involving payment, inventory locking, and order confirmation. You don’t want stale carts holding inventory hostage for hours. A 30-minute timeout ensures the workflow either completes or fails cleanly.

Here’s a simplified implementation in Python using the Conductor SDK:


# Import paths follow the conductor-python SDK layout; adjust if your version differs.
from conductor.client.workflow.conductor_workflow import ConductorWorkflow
from conductor.client.workflow.executor.workflow_executor import WorkflowExecutor
from conductor.client.workflow.task.http_task import HttpTask
from conductor.client.workflow.task.inline import InlineTask
from conductor.client.workflow.task.set_variable_task import SetVariableTask
from conductor.client.workflow.task.timeout_policy import TimeoutPolicy


def register_workflow(workflow_executor: WorkflowExecutor) -> ConductorWorkflow:
    # 1) HTTP task to fetch product price (simulated with dummy URL)
    fetch_random_number_task = HttpTask(
        task_ref_name="fetch_random_number",
        http_input={
            "uri": "<URL elided in the source>",
            "method": "GET",
            "headers": {
                "Content-Type": "application/json"
            }
        }
    )

    # 2) Set variable for base price
    set_base_price = SetVariableTask(task_ref_name='set_base_price')
    set_base_price.input_parameters.update({
        'base_price': '${fetch_random_number.output.response.body}'
    })

    # 3) Inline task to calculate final price
    calculate_price_task = InlineTask(
        task_ref_name='calculate_final_price',
        script='''
        (function() {
            let basePrice = $.base_price;
            let loyaltyDiscount = $.loyalty_discount === "gold" ? 0.2 : 0;
            let promotionDiscount = $.promotion_discount ? 0.1 : 0;
            return basePrice * (1 - loyaltyDiscount - promotionDiscount);
        })();
        ''',
        bindings={
            'base_price': '${workflow.variables.base_price}',
            'loyalty_discount': '${workflow.input.loyalty_status}',
            'promotion_discount': '${workflow.input.is_promotion_active}',
        }
    )

    # 4) Set final calculated price
    set_price_variable = SetVariableTask(task_ref_name='set_final_price_variable')
    set_price_variable.input_parameters.update({
        'final_price': '${calculate_final_price.output.result}'
    })

    # Define the workflow with a 30-minute timeout
    workflow = ConductorWorkflow(
        name='checkout_workflow',
        executor=workflow_executor
    )
    workflow.version = 1
    workflow.description = "E-commerce checkout workflow with 30-min timeout"
    workflow.timeout_seconds(1800)  # 30 minutes
    workflow.timeout_policy(TimeoutPolicy.TIME_OUT_WORKFLOW)

    # Tasks run in the order they are added
    workflow.add(fetch_random_number_task)
    workflow.add(set_base_price)
    workflow.add(calculate_price_task)
    workflow.add(set_price_variable)

    # Register the workflow definition
    workflow.register(overwrite=True)
    return workflow
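
The pricing rule inside the inline script is easy to check by hand. The sketch below mirrors that arithmetic in plain Python so it can be tested outside Conductor. The function name final_price is hypothetical, and it assumes the base price has already been parsed into a number and the promotion flag into a real boolean.

```python
def final_price(base_price: float, loyalty_status: str, is_promotion_active: bool) -> float:
    """Mirror of the inline script: subtract each discount rate from 1, then scale the price."""
    loyalty_discount = 0.2 if loyalty_status == "gold" else 0.0
    promotion_discount = 0.1 if is_promotion_active else 0.0
    return base_price * (1 - loyalty_discount - promotion_discount)


# Gold member during a promotion: 100 * (1 - 0.2 - 0.1)
print(round(final_price(100.0, "gold", True), 2))     # 70.0
# No discounts apply
print(round(final_price(100.0, "silver", False), 2))  # 100.0
```

Note that the discounts are additive (30% off in total for a gold member during a promotion), not compounded.
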


If the workflow exceeds 30 minutes, it is marked as TIMED_OUT automatically, allowing you to alert a team, start a cleanup flow, or retry.


[Figure: E-commerce workflow with a 30-minute timeout]

Failure workflows: Your fallback plan


What happens when a workflow fails unexpectedly, due to a timeout, an API error, or an unhandled edge case? That’s where failure workflows come in.

These are separate workflows that are triggered when the main workflow fails. They allow you to compensate, clean up, and notify downstream systems or users.

Failure workflow parameters

Parameter | Description
failureWorkflow | The name of the fallback workflow to be triggered if this one fails. The default is empty.
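
Because failureWorkflow is only a name, a typo or an unregistered handler would presumably go unnoticed until the first failure. The sketch below is one way to catch that early: it scans a list of workflow definitions (plain dicts using the failureWorkflow field from the table) and reports any handler name that is not defined in the same list. The helper missing_failure_handlers is hypothetical and not part of the Conductor SDK.

```python
from typing import Dict, List


def missing_failure_handlers(definitions: List[Dict]) -> List[str]:
    """Return names of workflows whose failureWorkflow is not among the given definitions."""
    known = {d["name"] for d in definitions}
    return [
        d["name"]
        for d in definitions
        # An empty or absent failureWorkflow means no handler is configured.
        if d.get("failureWorkflow") and d["failureWorkflow"] not in known
    ]


definitions = [
    {"name": "hotel_booking_workflow", "failureWorkflow": "hotel_booking_failure_handler"},
    {"name": "checkout_workflow", "failureWorkflow": ""},
]
print(missing_failure_handlers(definitions))  # ['hotel_booking_workflow']
```

Adding a definition named hotel_booking_failure_handler to the list makes the result empty.
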

Use case: Hotel booking with compensation flow


Let’s say your travel booking app orchestrates a hotel reservation workflow. If the booking fails (maybe the payment went through, but the room wasn’t confirmed), you’d want to:

  • Trigger a refund flow, and
  • Notify the customer that the booking failed

Main workflow code


# Import paths follow the conductor-python SDK layout; adjust if your version differs.
from conductor.client.workflow.conductor_workflow import ConductorWorkflow
from conductor.client.workflow.executor.workflow_executor import WorkflowExecutor
from conductor.client.workflow.task.http_task import HttpTask
from conductor.client.workflow.task.inline import InlineTask
from conductor.client.workflow.task.set_variable_task import SetVariableTask
from conductor.client.workflow.task.timeout_policy import TimeoutPolicy


def register_hotel_booking_workflow(workflow_executor: WorkflowExecutor) -> ConductorWorkflow:
    # 1) HTTP task to reserve a hotel (simulated with dummy URL)
    reserve_hotel_task = HttpTask(
        task_ref_name="reserve_hotel",
        http_input={
            "uri": "<URL elided in the source>",
            "method": "POST",
            "headers": {"Content-Type": "application/json"},
            "body": {
                "hotel_id": "${workflow.input.hotel_id}",
                "checkin": "${workflow.input.checkin_date}",
                "checkout": "${workflow.input.checkout_date}",
                "customer_id": "${workflow.input.customer_id}"
            }
        }
    )

    # 2) Set variable to confirm reservation status (simulate from body)
    set_status = SetVariableTask(task_ref_name='set_reservation_status')
    set_status.input_parameters.update({
        'reservation_status': '${reserve_hotel.output.response.body.json.status}'
    })

    # 3) Inline task to check booking status; throwing fails the task and the workflow
    evaluate_reservation = InlineTask(
        task_ref_name='check_booking_status',
        script='''
        (function() {
            if ($.reservation_status !== 'confirmed') {
                throw new Error("Booking failed");
            }
            return "confirmed";
        })();
        ''',
        bindings={
            'reservation_status': '${workflow.variables.reservation_status}'
        }
    )

    workflow = ConductorWorkflow(
        name='hotel_booking_workflow',
        executor=workflow_executor
    )
    workflow.version = 1
    workflow.description = "Hotel reservation flow with SLA and failure handling"
    workflow.timeout_seconds(900)  # 15 minutes
    workflow.timeout_policy(TimeoutPolicy.TIME_OUT_WORKFLOW)
    # Name of the workflow to trigger if this one fails
    workflow.failure_workflow("hotel_booking_failure_handler")

    workflow.add(reserve_hotel_task)
    workflow.add(set_status)
    workflow.add(evaluate_reservation)

    workflow.register(overwrite=True)
    return workflow

Failure workflow code


# Import paths follow the conductor-python SDK layout; adjust if your version differs.
from conductor.client.workflow.conductor_workflow import ConductorWorkflow
from conductor.client.workflow.executor.workflow_executor import WorkflowExecutor
from conductor.client.workflow.task.http_task import HttpTask


def register_failure_workflow(workflow_executor: WorkflowExecutor) -> ConductorWorkflow:
    # 1) Notify customer (simulated with dummy URL)
    notify_customer_task = HttpTask(
        task_ref_name="notify_customer",
        http_input={
            "uri": "<URL elided in the source>",
            "method": "POST",
            "headers": {"Content-Type": "application/json"},
            "body": {
                "customer_id": "${workflow.input.customer_id}",
                "message": "Your hotel booking could not be completed. We apologize for the inconvenience."
            }
        }
    )

    # 2) Trigger refund (simulated with dummy URL)
    refund_payment_task = HttpTask(
        task_ref_name="trigger_refund",
        http_input={
            "uri": "<URL elided in the source>",
            "method": "POST",
            "headers": {"Content-Type": "application/json"},
            "body": {
                "payment_id": "${workflow.input.payment_id}",
                "reason": "Hotel booking failed"
            }
        }
    )

    # The name must match the one passed to failure_workflow() in the main workflow
    failure_workflow = ConductorWorkflow(
        name="hotel_booking_failure_handler",
        executor=workflow_executor
    )
    failure_workflow.version = 1
    failure_workflow.description = "Handles failed hotel bookings with customer notification and refund"

    failure_workflow.add(notify_customer_task)
    failure_workflow.add(refund_payment_task)

    failure_workflow.register(overwrite=True)
    return failure_workflow
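
Stripped of Conductor specifics, the pattern above is "run the steps; if any step fails, hand the same input to a handler that notifies and refunds". The following in-process sketch shows that control flow only. In Conductor the handler is a separate workflow started by the server, and every name here (run_with_failure_handler, the step functions, the sample IDs) is hypothetical.

```python
from typing import Callable, Dict, List


def run_with_failure_handler(
    steps: List[Callable[[Dict], None]],
    failure_handler: Callable[[Dict, str], None],
    workflow_input: Dict,
) -> str:
    """Run steps in order; on the first exception, call the handler and report FAILED."""
    try:
        for step in steps:
            step(workflow_input)
    except Exception as exc:
        failure_handler(workflow_input, str(exc))
        return "FAILED"
    return "COMPLETED"


events: List[str] = []  # records side effects so the order is visible


def reserve_hotel(data: Dict) -> None:
    events.append(f"reserve:{data['hotel_id']}")


def check_booking_status(data: Dict) -> None:
    if data.get("reservation_status") != "confirmed":
        raise RuntimeError("Booking failed")


def handle_failure(data: Dict, reason: str) -> None:
    # Same order as the failure workflow above: notify first, then refund.
    events.append(f"notify:{data['customer_id']}")
    events.append(f"refund:{data['payment_id']}")


status = run_with_failure_handler(
    [reserve_hotel, check_booking_status],
    handle_failure,
    {"hotel_id": "H1", "customer_id": "C1", "payment_id": "P1",
     "reservation_status": "pending"},
)
print(status, events)  # FAILED ['reserve:H1', 'notify:C1', 'refund:P1']
```

The handler only works if it receives the identifiers it needs, which is why customer_id and payment_id have to be part of the input it is given.
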


[Figure: Hotel booking workflow with a failure handler workflow]

Best practices

  • Always define timeoutSeconds at both the workflow level and on critical tasks, so that a stalled execution cannot tie up resources indefinitely.
  • Use failureWorkflow for any workflow that produces side effects or artifacts that need cleanup in the event of failure.
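
The first practice can be turned into a simple review check. The sketch below flags a workflow or task that has no timeout, and any task whose timeout is longer than the workflow's own limit. It is only an illustration: timeout_warnings and the task names are hypothetical, and treating 0 as "no timeout" for tasks is an assumption carried over from the workflow-level setting.

```python
from typing import Dict, List


def timeout_warnings(workflow_timeout: int, task_timeouts: Dict[str, int]) -> List[str]:
    """Return human-readable warnings about missing or inconsistent timeouts (in seconds)."""
    warnings: List[str] = []
    if workflow_timeout == 0:
        warnings.append("workflow has no timeout")
    for name, seconds in task_timeouts.items():
        if seconds == 0:
            warnings.append(f"task '{name}' has no timeout")
        elif workflow_timeout and seconds > workflow_timeout:
            # The workflow limit would fire before this task limit ever could.
            warnings.append(f"task '{name}' timeout exceeds the workflow timeout")
    return warnings


for warning in timeout_warnings(1800, {"reserve_hotel": 60, "charge_card": 0, "slow_report": 3600}):
    print(warning)
```

A check like this could run before registration, for example in CI, so that definitions without timeouts are caught before they reach production.
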

Wrap up


Building production-ready workflows in Orkes Conductor means planning for both success and failure. Timeout policies and failure workflows aren’t just safeguards—they’re essential tools for maintaining system health, meeting SLAs, and ensuring a reliable user experience. When combined thoughtfully, they allow your workflows to self-regulate, recover from disruptions, and maintain a clean system state, even when things don’t go as planned.




Orkes Conductor is an enterprise-grade Unified Application Platform for process automation, API and microservices orchestration, agentic workflows, and more.

 