Pipeline Status Definitions

HERE Platform Pipeline Status is a default dashboard available to you in Grafana that shows the following metrics.

For more information on monitoring and troubleshooting pipelines, see Pipeline Monitoring in the Pipeline Developer Guide.

The following metrics report on the internal lifecycle of a batch or stream job. They do not necessarily correspond directly to the Pipeline Status as reported by the API. In addition, these metrics are only available while a pipeline job exists.

pipeline_jobs_submitted

This metric is reported with value 1 from the moment the first components of the pipeline cluster resources have started until the moment the job starts processing.

pipeline_jobs_running

This metric is reported with value 1 from the moment the job starts processing until the moment the job terminates. A job may terminate because it completes successfully, because it fails, or because the user cancels or pauses it.

pipeline_jobs_completed

This metric is reported with value 1 from the moment the job successfully completes processing until the moment the pipeline cluster resources have terminated.

pipeline_jobs_failed

This metric is reported with value 1 from the moment the job terminates with a failure until the moment the pipeline cluster resources have terminated.

pipeline_jobs_canceled (Deprecated. Removal date: February 01, 2021)

This metric applies only to the Stream runtime environment and is reported only when the user issues a Pause or Upgrade request. It is reported with value 1 from the moment a savepoint is created and the job has stopped processing until the moment the pipeline cluster resources have terminated.

Note: Cancel Operation on Pipelines

The platform handles a Cancel request by simply terminating all cluster resources. It is a hard shutdown of the pipeline. The metrics will not show pipeline_jobs_canceled with a value of 1. The last metric reported will be pipeline_jobs_running with a value of 1.
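As an illustration of how the status metrics above can be read together, the following minimal sketch maps one set of metric values to a lifecycle phase. The metric names come from this page. The function name, the phase labels, and the order in which the metrics are checked are assumptions made for the example and are not part of the platform. Because a Cancel is a hard shutdown, a canceled job cannot be distinguished from a running one by these values alone; the metrics simply stop being reported.

```python
from typing import Mapping

# Metric names are taken from the definitions above. The phase labels and
# the check order are illustrative assumptions, not platform behavior.
PHASE_METRICS = (
    ("pipeline_jobs_failed", "failed"),
    ("pipeline_jobs_completed", "completed"),
    ("pipeline_jobs_canceled", "paused_or_upgrading"),
    ("pipeline_jobs_running", "running"),
    ("pipeline_jobs_submitted", "submitted"),
)


def lifecycle_phase(sample: Mapping[str, float]) -> str:
    """Return the phase of the first metric in PHASE_METRICS whose value is 1."""
    for metric, phase in PHASE_METRICS:
        if sample.get(metric, 0) == 1:
            return phase
    # No metric reports 1, for example because the job no longer exists.
    return "unknown"
```

For example, lifecycle_phase({"pipeline_jobs_running": 1}) returns "running", and an empty mapping returns "unknown".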

pipeline_jobs_submitted_ts

This metric reports the approximate timestamp at which the pipeline cluster started initialization, in seconds since January 1, 1970, 00:00 UTC. This metric is reported with a value > 0 from the moment the first components of the pipeline cluster resources have started until the moment the resources have been terminated. The reported value will remain mostly constant, but should one of the components restart then it is possible for this metric to be reset to a later value.

pipeline_jobs_running_ts

This metric reports the timestamp at which the pipeline started processing, in seconds since January 1, 1970, 00:00 UTC, as reported by the Stream or Batch framework. This metric is reported with a value > 0 from the moment the Stream or Batch framework reports the job as running until the moment the resources have been terminated. The reported value will remain mostly constant, but in case of a Stream pipeline the value will be reset when the Stream job restarts. It will then report the time at which the job last entered the RUNNING state.
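Since both pipeline_jobs_submitted_ts and pipeline_jobs_running_ts carry seconds since the epoch, their difference gives a rough startup duration. The sketch below shows one way to compute it. The helper names are hypothetical, and the result is only approximate because either value can be reset to a later time when a component or the Stream job restarts.

```python
from datetime import datetime, timezone
from typing import Optional


def to_utc(ts_seconds: float) -> datetime:
    """Convert a *_ts metric value (epoch seconds) to an aware UTC datetime."""
    return datetime.fromtimestamp(ts_seconds, tz=timezone.utc)


def startup_seconds(submitted_ts: float, running_ts: float) -> Optional[float]:
    """Approximate time between cluster initialization and the job running.

    Returns None when either metric has not been reported with a value > 0,
    or when resets have left the values in an inconsistent order.
    """
    if submitted_ts <= 0 or running_ts <= 0:
        return None
    if running_ts < submitted_ts:
        return None
    return running_ts - submitted_ts
```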

pipeline_jobs_completed_ts

This metric reports the timestamp at which the pipeline completed processing successfully, in seconds since January 1, 1970, 00:00 UTC, as reported by the Stream or Batch framework. This metric is reported with a value > 0 from the moment the Stream or Batch framework reports the job as completed until the moment the resources have been terminated.

pipeline_jobs_failed_ts

This metric reports the timestamp at which the pipeline completed processing with a failure, in seconds since January 1, 1970, 00:00 UTC, as reported by the Stream or Batch framework. This metric is reported with a value > 0 from the moment the Stream or Batch framework reports the job as failed until the moment the resources have been terminated. A Stream pipeline that goes through a restart is not considered to have failed.

Note: Timestamp metrics (_ts)

The timestamp metrics are reported in an attempt to provide greater accuracy for runtime events in the pipeline's lifecycle. As with every metric, they are reported with their own timestamp attribute. However, this attribute is to be interpreted as the time at which the metric was reported and NOT the time at which the event occurred. The metric value and the reported timestamp can differ by several minutes.
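To make the distinction between the metric value and the sample timestamp concrete, the following sketch takes a list of (sample_time, value) pairs for one _ts metric and reads the event time from the value, not from the sample time. The pair layout and the function names are assumptions made for this example; how samples are actually retrieved depends on your monitoring setup.

```python
from typing import Optional, Sequence, Tuple

# (sample_time, value): both in epoch seconds. The sample time is when the
# metric was reported; the value is when the event happened.
Sample = Tuple[float, float]


def event_time(samples: Sequence[Sample]) -> Optional[float]:
    """Return the value of the most recently reported sample with value > 0."""
    positives = [(t, v) for t, v in samples if v > 0]
    if not positives:
        return None
    _, value = max(positives)
    return value


def first_report_lag(samples: Sequence[Sample]) -> Optional[float]:
    """Seconds between the event and the first sample that reported it."""
    positives = [(t, v) for t, v in samples if v > 0]
    if not positives:
        return None
    sample_time, value = min(positives)
    return sample_time - value
```

Using the most recent sample matters for metrics such as pipeline_jobs_running_ts, whose value can be reset when a Stream job restarts.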

Note: Pipeline Cancellation and Upgrade

There is no metric to indicate a cancellation of a pipeline. The platform only provides the terminating metrics pipeline_jobs_completed_ts and pipeline_jobs_failed_ts for cases where the pipeline completes its natural lifecycle, either by completing processing successfully or by terminating a job after encountering an error. This applies primarily to Batch pipelines, since Stream pipelines are expected to run indefinitely. Therefore, in case of a Cancel or Pause by the user, the standard behavior is that the values of pipeline_jobs_completed_ts and pipeline_jobs_failed_ts remain 0 for the remainder of the job's lifetime.
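The rule described in this note can be sketched as a small classification function. The function name and the returned labels are hypothetical. Note that when both values are 0, the metrics alone cannot tell whether the job is still active or was canceled or paused.

```python
def termination_outcome(completed_ts: float, failed_ts: float) -> str:
    """Classify how a job ended, based only on the two terminating metrics.

    The labels are illustrative. "no_natural_termination" covers both a job
    that has not ended yet and a job that the user canceled or paused.
    """
    if failed_ts > 0:
        return "failed"
    if completed_ts > 0:
        return "completed"
    return "no_natural_termination"
```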

In the special case of an Upgrade or Restart, you may observe that the predecessor pipeline job keeps reporting pipeline_jobs_running_ts even after the successor job has started running. In this case, the value of pipeline_jobs_running_ts of the successor job should be interpreted as the approximate time of termination of the predecessor job.
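As a rough illustration of this interpretation, the sketch below estimates how long the predecessor job ran since it last entered the RUNNING state, using the successor's pipeline_jobs_running_ts as the approximate end time. The function names are hypothetical, and the result is an approximation only.

```python
from typing import Optional


def predecessor_end_estimate(successor_running_ts: float) -> Optional[float]:
    """Approximate termination time of the predecessor job, in epoch seconds."""
    return successor_running_ts if successor_running_ts > 0 else None


def predecessor_run_seconds(
    predecessor_running_ts: float, successor_running_ts: float
) -> Optional[float]:
    """Approximate seconds the predecessor ran since it last entered RUNNING."""
    end = predecessor_end_estimate(successor_running_ts)
    if end is None or predecessor_running_ts <= 0 or end < predecessor_running_ts:
        return None
    return end - predecessor_running_ts
```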
