Flink Metrics

Flink Metrics is a default dashboard available to you in Grafana that shows the following metrics. The standard metrics listed here are available for Flink pipelines. Custom metrics can be added to your pipeline code. See the official Flink documentation for more information about Flink metrics.

Flink allows the creation of custom numerical metrics using accumulators. Stream Pipelines using Apache Flink support the following types of accumulators: Long and Double. Once created, these accumulators become available as named metrics that Grafana can query and add to dashboards. The metric names are commonly prefixed with flink_accumulators_.

For more information on using accumulators, see Custom Metrics and the documentation on Flink Accumulators.
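
As a rough illustration of how accumulator-backed metrics could be picked out of a larger set, the sketch below filters a list of metric names by the flink_accumulators_ prefix. The sample names, including the accumulator names inside them, are hypothetical; in practice the list might come from Prometheus or the Grafana metric browser, and the exact naming in your deployment may differ.

```python
# Prefix mentioned in the text above; described there as common, not guaranteed.
ACCUMULATOR_PREFIX = "flink_accumulators_"


def accumulator_metrics(metric_names):
    """Return the names that look like accumulator-backed metrics, sorted."""
    return sorted(n for n in metric_names if n.startswith(ACCUMULATOR_PREFIX))


# Hypothetical metric names, for illustration only.
sample_names = [
    "flink_jobmanager_numRunningJobs",
    "flink_accumulators_processedMessages",
    "flink_taskmanager_Status_JVM_CPU_Time",
    "flink_accumulators_failedMessages",
]

print(accumulator_metrics(sample_names))
```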

Standard Metrics

CPU/Memory Metrics

METRIC UNIT DESCRIPTION
flink_jobmanager_Status_JVM_CPU_Load Percentage JobManager - recent CPU usage of the JVM. This metric does not currently report as expected, and the cause is unclear. (For workarounds, see: How can I see the percentage CPU usage of jobmanager or taskmanagers of a Stream pipeline.)
flink_jobmanager_Status_JVM_CPU_Time Nanoseconds JobManager - CPU Time used by the JVM
flink_jobmanager_Status_JVM_Memory_Heap_Used Bytes JobManager - amount of heap memory currently used
flink_jobmanager_Status_JVM_Memory_Heap_Committed Bytes JobManager - amount of heap memory guaranteed to be available to the JVM
flink_jobmanager_Status_JVM_Memory_Heap_Max Bytes JobManager - maximum amount of heap memory that can be used for memory management
flink_jobmanager_Status_JVM_Memory_NonHeap_Used Bytes JobManager - amount of non-heap memory currently used
flink_jobmanager_Status_JVM_Memory_NonHeap_Committed Bytes JobManager - amount of non-heap memory guaranteed to be available to the JVM
flink_jobmanager_Status_JVM_Memory_NonHeap_Max Bytes JobManager - maximum amount of non-heap memory that can be used for memory management
flink_jobmanager_Status_JVM_Memory_Direct_Count Count JobManager - number of buffers in the direct buffer pool
flink_jobmanager_Status_JVM_Memory_Direct_MemoryUsed Bytes JobManager - amount of memory used by the JVM for the direct buffer pool
flink_jobmanager_Status_JVM_Memory_Direct_TotalCapacity Bytes JobManager - total capacity of all buffers in the direct buffer pool
flink_jobmanager_Status_JVM_Memory_Mapped_Count Count JobManager - number of buffers in the mapped buffer pool
flink_jobmanager_Status_JVM_Memory_Mapped_MemoryUsed Bytes JobManager - amount of memory used by the JVM for the mapped buffer pool
flink_jobmanager_Status_JVM_Memory_Mapped_TotalCapacity Bytes JobManager - total capacity of all buffers in the mapped buffer pool
flink_taskmanager_Status_JVM_CPU_Load Percentage TaskManager - recent CPU usage of the JVM. This metric does not currently report as expected, and the cause is unclear. (For workarounds, see: How can I see the percentage CPU usage of jobmanager or taskmanagers of a Stream pipeline.)
flink_taskmanager_Status_JVM_CPU_Time Nanoseconds TaskManager - CPU Time used by the JVM
flink_taskmanager_Status_JVM_Memory_Heap_Used Bytes TaskManager - amount of heap memory currently used
flink_taskmanager_Status_JVM_Memory_Heap_Committed Bytes TaskManager - amount of heap memory guaranteed to be available to the JVM
flink_taskmanager_Status_JVM_Memory_Heap_Max Bytes TaskManager - maximum amount of heap memory that can be used for memory management
flink_taskmanager_Status_JVM_Memory_NonHeap_Used Bytes TaskManager - amount of non-heap memory currently used
flink_taskmanager_Status_JVM_Memory_NonHeap_Committed Bytes TaskManager - amount of non-heap memory guaranteed to be available to the JVM
flink_taskmanager_Status_JVM_Memory_NonHeap_Max Bytes TaskManager - maximum amount of non-heap memory that can be used for memory management
flink_taskmanager_Status_JVM_Memory_Direct_Count Count TaskManager - number of buffers in the direct buffer pool
flink_taskmanager_Status_JVM_Memory_Direct_MemoryUsed Bytes TaskManager - amount of memory used by the JVM for the direct buffer pool
flink_taskmanager_Status_JVM_Memory_Direct_TotalCapacity Bytes TaskManager - total capacity of all buffers in the direct buffer pool
flink_taskmanager_Status_JVM_Memory_Mapped_Count Count TaskManager - number of buffers in the mapped buffer pool
flink_taskmanager_Status_JVM_Memory_Mapped_MemoryUsed Bytes TaskManager - amount of memory used by the JVM for the mapped buffer pool
flink_taskmanager_Status_JVM_Memory_Mapped_TotalCapacity Bytes TaskManager - total capacity of all buffers in the mapped buffer pool
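
The CPU_Time and heap metrics in the table above can be combined into derived values. The sketch below shows one possible way to estimate CPU usage from two samples of a Status_JVM_CPU_Time counter, and heap usage as a percentage of Heap_Max. It is an illustration of the arithmetic only, not necessarily the workaround described in the FAQ entry referenced above; the function and parameter names are hypothetical.

```python
NANOS_PER_SECOND = 1_000_000_000


def cpu_cores_used(cpu_time_start_ns, cpu_time_end_ns, interval_seconds):
    """Average number of CPU cores the JVM kept busy over the interval.

    Both CPU time values are cumulative nanoseconds, sampled
    interval_seconds apart.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    busy_seconds = (cpu_time_end_ns - cpu_time_start_ns) / NANOS_PER_SECOND
    return busy_seconds / interval_seconds


def heap_used_percent(heap_used_bytes, heap_max_bytes):
    """Heap usage as a percentage of the maximum heap size."""
    if heap_max_bytes <= 0:
        raise ValueError("heap_max_bytes must be positive")
    return 100.0 * heap_used_bytes / heap_max_bytes


# 30 s of CPU time consumed during a 60 s window: half a core on average.
print(cpu_cores_used(120_000_000_000, 150_000_000_000, 60))

# 512 MiB used out of a 2 GiB maximum heap.
print(heap_used_percent(512 * 2**20, 2 * 2**30))
```

Dividing the first result by the number of cores available to the container would give a utilization fraction, assuming that core count is known.
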
METRIC DESCRIPTION
flink_jobmanager_numRegisteredTaskManagers Total Number of Registered Task Managers
flink_jobmanager_numRunningJobs Total Number of Running Jobs
flink_jobmanager_taskSlotsTotal Total Number of Task Slots Allocated
flink_jobmanager_taskSlotsAvailable Total Number of Task Slots Available
METRIC DESCRIPTION
flink_taskmanager_job_task_currentLowWatermark Task - currentLowWatermark: the lowest watermark this task has received
flink_taskmanager_job_task_numBytesInLocal Task - numBytesInLocal: the total number of bytes this task has read from a local source
flink_taskmanager_job_task_numBytesInLocalPerSecond Task - numBytesInLocalPerSecond: the number of bytes this task reads from a local source per second
flink_taskmanager_job_task_numBytesInRemote Task - numBytesInRemote: the total number of bytes this task has read from a remote source
flink_taskmanager_job_task_numBytesInRemotePerSecond Task - numBytesInRemotePerSecond: the number of bytes this task reads from a remote source per second
flink_taskmanager_job_task_numBytesOut Task - numBytesOut: the total number of bytes this task has emitted
flink_taskmanager_job_task_numBytesOutPerSecond Task - numBytesOutPerSecond: the number of bytes this task emits per second
flink_taskmanager_job_task_numRecordsIn Task/Operator - numRecordsIn: the total number of records this operator/task has received
flink_taskmanager_job_task_numRecordsInPerSecond Task/Operator - numRecordsInPerSecond: the number of records this operator/task receives per second
flink_taskmanager_job_task_numRecordsOut Task/Operator - numRecordsOut: the total number of records this operator/task has emitted
flink_taskmanager_job_task_numRecordsOutPerSecond Task/Operator - numRecordsOutPerSecond: the number of records this operator/task sends per second
flink_taskmanager_job_task_operator_latency Operator - latency: the latency distributions from all incoming sources
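
One common use of currentLowWatermark is to estimate how far a task's event time lags behind the wall clock. The sketch below assumes that the pipeline's event timestamps are epoch milliseconds (this depends on the pipeline) and that a value of Long.MIN_VALUE means no watermark has been received yet. The function name is hypothetical.

```python
LONG_MIN = -(2 ** 63)  # Java Long.MIN_VALUE, reported before any watermark arrives


def watermark_lag_seconds(current_low_watermark_ms, now_ms):
    """Seconds between the wall clock and the task's lowest watermark.

    Returns None when no watermark has been received yet.
    """
    if current_low_watermark_ms <= LONG_MIN:
        return None
    return (now_ms - current_low_watermark_ms) / 1000.0


# Watermark is 12.5 s behind the (hypothetical) current time.
print(watermark_lag_seconds(1_700_000_000_000, 1_700_000_012_500))

# No watermark received yet.
print(watermark_lag_seconds(LONG_MIN, 1_700_000_012_500))
```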

Kafka Producer and Consumer Metrics

Standard Kafka metrics are available when enabled in the configuration settings of the HERE platform Data Client, and their names are prefixed with:

METRIC DESCRIPTION
flink_taskmanager_job_task_operator_KafkaProducer Kafka Producer metrics
flink_taskmanager_job_task_operator_KafkaConsumer Kafka Consumer metrics

The complete list of Kafka Producer and Consumer metrics can be found in the Apache Kafka documentation (see links below).

Note: Querying Prometheus

When querying these metrics with PromQL (Prometheus Query Language), you can avoid typing the full metric name by applying a regular-expression matcher to the internal __name__ label. For example, the selector {__name__=~".*consumer_fetch_manager_metrics_fetch_rate"} matches flink_taskmanager_job_task_operator_KafkaConsumer_client_id_consumer_fetch_manager_metrics_fetch_rate, along with any other metric whose name ends in the same suffix.
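
PromQL regular-expression matchers are fully anchored, which is why the pattern above needs the leading .* to match a suffix. The sketch below mimics that behavior with Python's re.fullmatch so the matching rule can be tried offline. Prometheus uses RE2 syntax rather than Python's, so this is only an approximation that holds for simple patterns like this one; the helper name is hypothetical.

```python
import re


def name_matches(pattern, metric_name):
    """Approximate a PromQL __name__=~"pattern" matcher (fully anchored)."""
    return re.fullmatch(pattern, metric_name) is not None


PATTERN = ".*consumer_fetch_manager_metrics_fetch_rate"

full_name = (
    "flink_taskmanager_job_task_operator_KafkaConsumer_client_id_"
    "consumer_fetch_manager_metrics_fetch_rate"
)

# The suffix pattern matches the full Kafka consumer metric name.
print(name_matches(PATTERN, full_name))

# Without the leading ".*" the anchored match fails.
print(name_matches("consumer_fetch_manager_metrics_fetch_rate", full_name))
```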

See Also
