Pipeline States

HERE platform pipelines are stateful. This means that a deployed pipeline always has a state that characterizes it, and it can be in only one state at any given time.

Pipeline Version

The only deployable component of a pipeline is a Pipeline Version. Its run-time configuration is defined by the Pipeline Template. Some run-time parameters can be overridden by optional parameters.

Pipeline Version States

There are 4 possible states for a Pipeline Version:

  • READY – when a Pipeline Version has just been created by the user or is ready to be initiated
  • SCHEDULED – when a Pipeline Version is scheduled and waiting for a trigger to start the processing of input data
  • RUNNING – when a Pipeline Version has an active Job processing the input data
  • PAUSED – when a Pipeline Version is paused by the user and is not actively processing the input data

Note

Pipelines consume resources and generate costs in the RUNNING state only. When a pipeline is in the PAUSED state, no computing resources are allocated and, therefore, no costs accrue. When the pipeline is resumed and returns to the RUNNING state, compute resources are provisioned again and costs start accruing again. For more information on pipeline resource consumption and costs, see Billable services.

Operations

The following operations can be performed on a Pipeline Version, either by the user or by the platform.

  • ACTIVATE
  • PAUSE
  • DEACTIVATE
  • CANCEL
  • RESUME
  • UPGRADE
  • RESTART

Most of these operations are asynchronous and the platform will run them in the background. You will be able to view or list these operations, their state, and related message or error information.

Operation States

These operations may have any of the following states:

  • ACCEPTED – The operation has been accepted by the platform but has not yet been processed.
  • BEING_PROCESSED – The platform is processing the operation request.
  • SUCCEEDED – The operation is successfully completed.
  • FAILED – The operation failed to complete.

    Info

    When an operation on a pipeline version is being processed, a new operation may not be requested on the same pipeline version until the current operation succeeds or fails.
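    Because operations run asynchronously, a client usually has to check an operation's state repeatedly until it reaches SUCCEEDED or FAILED before requesting the next operation. The following is a minimal sketch of that waiting loop, not the platform's API: `fetch_state` is a hypothetical callback standing in for whatever call you use to read the operation state, and `poll_seconds` and `max_polls` are illustrative parameters.

```python
import time
from enum import Enum
from typing import Callable


class OperationState(Enum):
    ACCEPTED = "ACCEPTED"
    BEING_PROCESSED = "BEING_PROCESSED"
    SUCCEEDED = "SUCCEEDED"
    FAILED = "FAILED"


# An operation is finished once it has succeeded or failed.
TERMINAL_STATES = {OperationState.SUCCEEDED, OperationState.FAILED}


def wait_for_operation(
    fetch_state: Callable[[], OperationState],
    poll_seconds: float = 5.0,
    max_polls: int = 60,
) -> OperationState:
    """Poll until the operation reaches a terminal state.

    fetch_state is a hypothetical callback (an assumption of this
    sketch) that returns the current state of one operation.
    """
    for _ in range(max_polls):
        state = fetch_state()
        if state in TERMINAL_STATES:
            return state
        time.sleep(poll_seconds)
    raise TimeoutError("operation did not finish within the polling budget")
```

    Only after `wait_for_operation` returns would a caller request another operation on the same pipeline version.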

Jobs and States

A Job corresponds to an execution instance of a Pipeline Version. A Pipeline Version may have 0 or more Jobs, but only one can be Running at a time.

  • For Batch processing, a Job terminates by either succeeding or failing.

  • For Stream processing, a Job is not expected to terminate on its own; if it does, the platform may automatically restart the processing with a new Job.

Job States

There are 5 possible states for a Job:

  • STARTING – the job is starting and resources are being allocated.
  • RUNNING – the job is running.
  • COMPLETED – the job terminated successfully.
  • FAILED – the job terminated due to a failure.
  • CANCELED – the job is terminated before completion.
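The Job states above can be modeled as a small enumeration, which makes it easy to tell finished Jobs from active ones and to check the rule that only one Job of a Pipeline Version runs at a time. This is an illustrative sketch: the helper names `is_terminal` and `at_most_one_running` are hypothetical and not part of any platform library.

```python
from enum import Enum
from typing import Iterable


class JobState(Enum):
    STARTING = "STARTING"
    RUNNING = "RUNNING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"
    CANCELED = "CANCELED"


# States in which a Job has stopped and will not change again.
TERMINAL_JOB_STATES = {JobState.COMPLETED, JobState.FAILED, JobState.CANCELED}


def is_terminal(state: JobState) -> bool:
    """Return True if the Job has finished, successfully or not."""
    return state in TERMINAL_JOB_STATES


def at_most_one_running(job_states: Iterable[JobState]) -> bool:
    """Check that no more than one Job of a Pipeline Version is RUNNING."""
    running = sum(1 for state in job_states if state is JobState.RUNNING)
    return running <= 1
```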

Pipeline Version State Transitions

Valid transitions between states are independent of the processing model of a Pipeline Version (that is, Batch or Stream). However, some transitions may have a different internal behavior based on the processing model being used. The figure below is a representation of the various Pipeline Version States and Operations.

Figure 1. Pipeline Version State Transitions

Current State Table

1. Create
    Starting State: ~
    Ending State: Ready
    Notes: A new Pipeline Version with a UUID is created and is ready to be used.

2. Activate
    Starting State: Ready
    Ending State: Scheduled
    Notes: A Pipeline Version is activated and gets scheduled to be run.

    When a Batch pipeline version is activated to be run on-demand, it enters the Scheduled state and immediately changes to the Running state to attempt to run once and then returns to the Ready state. No further processing is done even if the input catalogs have new data. It will have to be run again manually.

3. Run (internal)
    Starting State: Scheduled
    Ending State: Running
    Notes: This is an internal operation. A new Job is created for the Pipeline Version and the state changes to Running.

    For Batch processing, the platform waits for the input catalogs to change to trigger a new job.

    For Stream processing, the platform starts running the job after a few minutes of delay and continues running.

4. Terminate (internal)
    Starting State: Running
    Ending State: Scheduled or Ready
    Notes: This is an internal operation. The current Job terminates with a success or failure. If the Pipeline Version is configured to run again, it will be set to Scheduled state, otherwise it will be set to Ready state.

5. Deactivate
    Starting State: Scheduled
    Ending State: Ready
    Notes: A Pipeline Version is deactivated and returns to the Ready state.

    Note: The platform will automatically deactivate a Pipeline Version if it fails 12 times consecutively.

6. Pause
    Starting State: Running
    Ending State: Paused
    Notes: For Batch processing, the current Job is completed and future Jobs are paused.

    For Stream processing, the current state is saved and the Job is gracefully terminated.

7. Resume
    Starting State: Paused
    Ending State: Scheduled
    Notes: For Batch processing, a Pipeline Version gets scheduled for it to create a new Job upon the next trigger of input catalog change.

    For Stream processing, a new Job is started to resume the Pipeline Version from the previously saved state. This saved state is then discarded.

8. Cancel
    Starting State: Running
    Ending State: Ready
    Notes: The running Job is immediately terminated without saving state and the Pipeline Version moves to Ready state. For Batch processing, all the future Jobs are also canceled.

9. Cancel
    Starting State: Paused
    Ending State: Ready
    Notes: For Batch processing, all the future jobs will be canceled and the Pipeline Version moves to Ready state.

    For Stream processing, the saved state of the paused Job is discarded and the Pipeline Version moves to Ready state.

10. Restart (Stream Pipeline only)
    Starting State: Running
    Ending State: Running
    Notes: In situations where a Stream pipeline needs to be interrupted and the users can't perform the requested action, an attempt is made to save the state of the Stream pipeline and restart it from the saved state.

    See Enable Notification and Recovery of Stream Pipelines.
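
The table above can be read as a lookup from an operation and a starting state to an ending state. The sketch below encodes it that way so that a client can check whether an operation is valid before requesting it. It is an illustration of the table, not platform code: the names `State`, `TRANSITIONS`, and `next_state`, and the `runs_again` flag, are hypothetical. Create (which has no starting state) and UPGRADE (which the table does not cover) are left out, and the sketch does not check that Restart is used with a Stream pipeline only.

```python
from enum import Enum
from typing import Dict, Tuple


class State(Enum):
    READY = "READY"
    SCHEDULED = "SCHEDULED"
    RUNNING = "RUNNING"
    PAUSED = "PAUSED"


# (operation, starting state) -> ending state, one entry per table row.
# Terminate is handled in next_state because it has two possible outcomes.
TRANSITIONS: Dict[Tuple[str, State], State] = {
    ("ACTIVATE", State.READY): State.SCHEDULED,
    ("RUN", State.SCHEDULED): State.RUNNING,        # internal operation
    ("DEACTIVATE", State.SCHEDULED): State.READY,
    ("PAUSE", State.RUNNING): State.PAUSED,
    ("RESUME", State.PAUSED): State.SCHEDULED,
    ("CANCEL", State.RUNNING): State.READY,
    ("CANCEL", State.PAUSED): State.READY,
    ("RESTART", State.RUNNING): State.RUNNING,      # Stream pipelines only
}


def next_state(operation: str, current: State, runs_again: bool = False) -> State:
    """Return the ending state, or raise ValueError for an invalid transition.

    runs_again models "the Pipeline Version is configured to run again"
    and only matters for the internal Terminate operation.
    """
    if operation == "TERMINATE" and current is State.RUNNING:
        return State.SCHEDULED if runs_again else State.READY
    try:
        return TRANSITIONS[(operation, current)]
    except KeyError:
        raise ValueError(
            f"{operation} is not valid in state {current.value}"
        ) from None
```

For example, `next_state("PAUSE", State.RUNNING)` returns `State.PAUSED`, while `next_state("PAUSE", State.READY)` raises `ValueError` because the table has no Pause row that starts in Ready.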
