Migrate Pipeline to New Run-Time Environment
Contents
Migrate Batch-2.0.0 Pipeline to Batch-2.1.0 Environment (Apache Spark 2.4.2 with Support for History Server)
To migrate a Batch pipeline to Batch-2.1.0 environment (Apache Spark 2.4.2 with support for History Server), follow these steps:
- Download the HERE platform SDK 2.12 or newer that contains the new
sdk-batch-bom.pom
environment file. For more information on using the new sdk-batch-bom.pom
file, see the SDK Dependency Management. - Re-compile your pipeline code using the new
sdk-batch-bom.pom
file to create a new JAR file. For more information, see Batch Pipeline. - For an existing Batch pipeline, use the CLI or Web Portal to create a new Pipeline Template and a Pipeline Version with the
batch-2.1.0
run-time environment and the new JAR file that was created in Step #2 above. From the Web Portal, you can save time by creating a copy of the existing Pipeline Version that uses a previous version of the batch environment and changing the run-time environment to batch-2.1.0
, and using the new JAR file for the new Pipeline Version. For more information, see Pipeline Deployment. - Upgrade to the new Pipeline Version. For more information, see Upgrade via Portal and Upgrade via CLI.
Caution: Compatibility issues due to AWS SDK versions
History Server support in batch-2.1.0
pipelines currently requires the AWS SDK 1.7.4 in the (Spark 2.4.2) runtime environment. This version is not compatible with later versions of the AWS SDK (1.10.x and up). To use the AWS SDK in a pipeline with the batch-2.1.0
run-time environment, it is recommended to build the pipeline JAR with AWS SDK 1.7.4.
For more information about pipelines general support for Apache Spark, see Batch Pipelines - Apache Spark Support FAQ.
↑ Top
Migrate Batch-2.1.0 Pipeline to Batch-3.0 Environment (Apache Spark 2.4.7)
There are no changes required from Pipeline runtime perspective. Please check the HERE Data SDK for Java and Scala documentation for changes with Batch-3.0
↑ Top
Migrate Stream-2.0.0 Pipeline to Stream-3.0.0 Environment (Apache Flink 1.10.1)
To migrate a Stream-2.0.0 pipeline to Stream-3.0.0 environment (Apache Flink 1.10.1), follow these steps:
Caution: Memory changes to consider before migration
Check the Stream-3.0.0 memory changes below before migrating the pipeline.
- Use the 2.17 version or newer of HERE Data SDK for Java and Scala that contains the new
sdk-stream-bom.pom
environment file. For more information on using the new sdk-stream-bom.pom
file, see the SDK Dependency Management. - Re-compile your pipeline code using the new
sdk-stream-bom.pom
file to create a new JAR file. For more information, see Stream Pipeline. - For an existing Stream pipeline, use the CLI or Web Portal to create a new Pipeline Template and a Pipeline Version with the
stream-3.0.0
run-time environment and the new JAR file that was created in Step #2 above. From the Web Portal, you can save time by creating a copy of the existing Pipeline Version that uses the stream-2.0.0
run-time environment and changing the run-time environment to stream-3.0.0
, and using the new JAR file for the new Pipeline Version. For more information, see Pipeline Deployment. - Upgrade to the new Pipeline Version. For more information, see Upgrade via Portal and Upgrade via CLI.
Stream-3.0.0 Worker (Flink TaskManager) Memory Changes
Apache Flink 1.10 introduced improvements in the memory setup of TaskManagers. Stream-3.0.0 includes these improvements and provides pipeline users with granular control on the memory configuration of the TaskManagers. There are two categories of these changes in Stream-3.0.0 compared to Stream-2.0.0 as documented below:
- In Stream-2.0.0, the memory in a Worker unit is translated to be set as the
taskmanager.heap.size
. However, it includes not only the JVM heap but also other "off-heap" memory components. In Stream-3.0.0, the memory in a Worker unit is translated to taskmanager.memory.process.size
, which is assigned to the TaskManager JVM process. Flink adjusts the rest of the memory components based on default values or additionally configured options (more on this below). - In Stream-3.0.0, pipeline users can use all the memory configuration options via the Stream Configuration option. The memory configurations available to pipeline users include all the configurations listed within the official Flink documentation. It is important to ensure that the values set for these memory components are tuned in accordance with the Worker units configured for the TaskManager.
For more information on the memory improvements, configuration options and defaults, see the official Flink documentation below:
For more information about pipelines general support for Apache Flink, please see Stream Pipelines - Apache Flink Support FAQ.
Known Issues
Change in the Maven Modules of Table API (FLINK-11064)
Users that had a flink-table dependency before, need to update their dependencies to flink-table-planner
and the correct dependency of flink-table-api-*
, depending on whether Java or Scala is used: one of flink-table-api-java-bridge
or flink-table-api-scala-bridge
.
Note: No flink-table dependency for Stream-3.0.0
stream-2.0.0
contained the flink-table
dependency while stream-3.0.0
no longer contains it.
Change in Scala-Macro Generated Serializers (FLINK-11330)
Users that migrated to stream-3.0.0
and have the warning-related scalac
options enabled (e.g. -Xfatal-warnings
, -Ywarn-unused:*
) might face the compilation errors with the following messages: local val in method createSerializer is never used
, *.scala:1: Unused import
. These messages are actually a result of the macro-expanded code inspection.
To prevent the compilation errors for Scala 2.11
, you can choose one of the following options:
- Using
sbt
, add the scalacOptions ~= (_.filterNot(_.startsWith("-Ywarn-unused")))
line to your build.sbt
settings for the project. - Provided you use
Maven
, use the com.github.ghik:silencer-plugin_${scala.version}
plugin as following:
<configuration>
<compilerPlugins>
<compilerPlugin>
<groupId>com.github.ghik</groupId>
<artifactId>silencer-plugin_${scala.version}</artifactId>
<version>1.7.0</version>
</compilerPlugin>
</compilerPlugins>
<args>
<arg>
-P:silencer:globalFilters=local val in method createSerializer is never used
</arg>
<arg>
-P:silencer:lineContentFilters=^/\*$;^package;^//
</arg>
</args>
</configuration>
Note
The purpose of the -P:silencer:lineContentFilters=^/\*$;^package;^//
argument is to ignore the unused import coming from the Flink macro, which is reported to be located at the first line of the source code. You might want to change this line according to your needs.
Environment Comparison
Group ID | Artifact ID | stream-4.0.0 | stream-3.0.0 |
com.fasterxml.jackson.core | jackson-annotations | 2.7.9 | Removed |
com.fasterxml.jackson.core | jackson-core | 2.7.9 | Removed |
com.fasterxml.jackson.core | jackson-databind | 2.7.9 | Removed |
com.fasterxml.jackson.dataformat | jackson-dataformat-csv | 2.7.9 | Removed |
com.fasterxml.jackson.dataformat | jackson-dataformat-yaml | 2.7.9 | Removed |
com.github.scopt | scopt_2.11 | 3.5.0 | Removed |
com.google.code.findbugs | jsr305 | 1.3.9 | Removed |
com.google.protobuf | protobuf-java | 2.6.1 | Removed |
com.twitter | chill-java | 0.7.6 | Removed |
com.twitter | chill_2.11 | 0.7.6 | Removed |
com.typesafe | config | 1.3.0 | Removed |
com.typesafe | ssl-config-core_2.11 | 0.2.1 | Removed |
com.typesafe.akka | akka-actor_2.11 | 2.4.20 | Removed |
com.typesafe.akka | akka-camel_2.11 | 2.4.20 | Removed |
com.typesafe.akka | akka-protobuf_2.11 | 2.4.20 | Removed |
com.typesafe.akka | akka-slf4j_2.11 | 2.4.20 | Removed |
com.typesafe.akka | akka-stream_2.11 | 2.4.20 | Removed |
commons-cli | commons-cli | 1.3.1 | Removed |
commons-collections | commons-collections | 3.2.2 | Removed |
commons-io | commons-io | 2.4 | Removed |
io.netty | netty-all | 4.1.24.Final | Removed |
jline | jline | 2.14.3 | Removed |
org.apache.camel | camel-core | 2.17.7 | Removed |
org.apache.curator | curator-client | 2.12.0 | Removed |
org.apache.curator | curator-framework | 2.12.0 | Removed |
org.apache.curator | curator-recipes | 2.12.0 | Removed |
org.apache.flink | flink-python_2.11 | 1.7.1 | Removed |
org.apache.flink | flink-queryable-state-client-java_2.11 | 1.7.1 | Removed |
org.apache.flink | flink-s3-fs-hadoop | 1.7.1 | Removed |
org.apache.flink | flink-shaded-asm | 5.0.4-5.0 | Removed |
org.apache.flink | flink-shaded-asm-6 | 6.2.1-5.0 | Removed |
org.apache.flink | flink-table_2.11 | 1.7.1 | Removed |
org.clapper | grizzled-slf4j_2.11 | 1.3.2 | Removed |
org.javassist | javassist | 3.19.0-GA | Removed |
org.objenesis | objenesis | 2.1 | Removed |
org.reactivestreams | reactive-streams | 1.0.0 | Removed |
org.rocksdb | rocksdbjni | 5.7.5 | Removed |
org.scala-lang | scala-compiler | 2.11.12 | Removed |
org.scala-lang | scala-library | 2.11.12 | Removed |
org.scala-lang | scala-reflect | 2.11.12 | Removed |
org.scala-lang.modules | scala-java8-compat_2.11 | 0.7.0 | Removed |
org.scala-lang.modules | scala-parser-combinators_2.11 | 1.0.4 | Removed |
org.scala-lang.modules | scala-xml_2.11 | 1.0.5 | Removed |
org.tukaani | xz | 1.5 | Removed |
org.xerial.snappy | snappy-java | 1.1.4 | Removed |
org.yaml | snakeyaml | 1.15 | Removed |
ch.qos.logback | logback-classic | 1.2.3 | 1.2.3 |
ch.qos.logback | logback-core | 1.2.3 | 1.2.3 |
com.esotericsoftware.kryo | kryo | 2.24.0 | 2.24.0 |
com.esotericsoftware.minlog | minlog | 1.2 | 1.2 |
com.esotericsoftware.reflectasm | reflectasm | 1.09 | 1.09 |
org.apache.calcite | calcite-core | 1.17.0 | 1.21.0 |
org.apache.calcite | calcite-linq4j | 1.17.0 | 1.21.0 |
org.apache.calcite.avatica | avatica-core | 1.12.0 | 1.15.0 |
org.apache.commons | commons-compress | 1.4.1 | 1.18 |
org.apache.commons | commons-lang3 | 3.3.2 | 3.3.2 |
org.apache.commons | commons-math3 | 3.5 | 3.5 |
org.apache.flink | flink-annotations | 1.7.1 | 1.10.1 |
org.apache.flink | flink-clients_2.11 | 1.7.1 | 1.10.1 |
org.apache.flink | flink-container_2.11 | 1.7.1 | 1.10.1 |
org.apache.flink | flink-core | 1.7.1 | 1.10.1 |
org.apache.flink | flink-dist_2.11 | 1.7.1 | 1.10.1 |
org.apache.flink | flink-hadoop-fs | 1.7.1 | 1.10.1 |
org.apache.flink | flink-java | 1.7.1 | 1.10.1 |
org.apache.flink | flink-mapr-fs | 1.7.1 | 1.10.1 |
org.apache.flink | flink-mesos_2.11 | 1.7.1 | 1.10.1 |
org.apache.flink | flink-metrics-core | 1.7.1 | 1.10.1 |
org.apache.flink | flink-metrics-jmx_2.11 | 1.7.1 | 1.10.1 |
org.apache.flink | flink-optimizer_2.11 | 1.7.1 | 1.10.1 |
org.apache.flink | flink-runtime-web_2.11 | 1.7.1 | 1.10.1 |
org.apache.flink | flink-runtime_2.11 | 1.7.1 | 1.10.1 |
org.apache.flink | flink-scala-shell_2.11 | 1.7.1 | 1.10.1 |
org.apache.flink | flink-scala_2.11 | 1.7.1 | 1.10.1 |
org.apache.flink | flink-shaded-curator | 1.7.1 | 1.10.1 |
org.apache.flink | flink-shaded-guava | 18.0-5.0 | 18.0-9.0 |
org.apache.flink | flink-shaded-jackson | 2.7.9-5.0 | 2.10.1-9.0 |
org.apache.flink | flink-shaded-netty | 4.1.24.Final-5.0 | 4.1.39.Final-9.0 |
org.apache.flink | flink-statebackend-rocksdb_2.11 | 1.7.1 | 1.10.1 |
org.apache.flink | flink-streaming-java_2.11 | 1.7.1 | 1.10.1 |
org.apache.flink | flink-streaming-scala_2.11 | 1.7.1 | 1.10.1 |
org.apache.flink | flink-table-common | 1.7.1 | 1.10.1 |
org.apache.flink | flink-yarn_2.11 | 1.7.1 | 1.10.1 |
org.apache.flink | force-shading | 1.7.1 | 1.10.1 |
org.apache.mesos | mesos | 1.0.1 | 1.0.1 |
org.codehaus.janino | commons-compiler | 3.0.7 | 3.0.9 |
org.codehaus.janino | janino | 3.0.7 | 3.0.9 |
org.slf4j | log4j-over-slf4j | 1.7.25 | 1.7.25 |
org.slf4j | slf4j-api | 1.7.15 | 1.7.15 |
org.apache.flink | flink-cep_2.11 | - | 1.10.1 |
org.apache.flink | flink-queryable-state-client-java | - | 1.10.1 |
org.apache.flink | flink-shaded-asm-7 | - | 7.1-9.0 |
org.apache.flink | flink-sql-parser | - | 1.10.1 |
org.apache.flink | flink-table-api-java | - | 1.10.1 |
org.apache.flink | flink-table-api-java-bridge_2.11 | - | 1.10.1 |
org.apache.flink | flink-table-api-scala-bridge_2.11 | - | 1.10.1 |
org.apache.flink | flink-table-api-scala_2.11 | - | 1.10.1 |
org.apache.flink | flink-table-planner-blink_2.11 | - | 1.10.1 |
org.apache.flink | flink-table-planner_2.11 | - | 1.10.1 |
org.apache.flink | flink-table-runtime-blink_2.11 | - | 1.10.1 |
org.apache.flink | flink-table-uber-blink_2.11 | - | 1.10.1 |
org.apache.flink | flink-table-uber_2.11 | - | 1.10.1 |
For more information, see SDK Dependencies for Stream Processing
↑ Top
Migrate Stream-3.0.0 Pipeline to Stream-4.0 Environment (Apache Flink 1.10.3)
There are no changes required from Pipeline runtime perspective. Please check the HERE Data SDK for Java and Scala documentation for changes with Stream-4.0
↑ Top
Migrate Stream-4.0 Pipeline to Stream-5.0 Environment (Apache Flink 1.13.5)
To migrate a Stream-4.0 pipeline to Stream-5.0 environment (Apache Flink 1.13.5), follow these steps:
Caution: Memory changes to consider before migration
Check the Stream-5.0 memory changes below before migrating the pipeline.
- Use the 2.34 version or newer of HERE Data SDK for Java and Scala that contains the new
sdk-stream-bom.pom
environment file. For more information on using the new sdk-stream-bom.pom
file, see the SDK Dependency Management. - Re-compile your pipeline code using the new
sdk-stream-bom.pom
file to create a new JAR file. For more information, see Stream Pipeline. - For an existing Stream pipeline, use the CLI or Web Portal to create a new Pipeline Template and a Pipeline Version with the
stream-5.0
run-time environment and the new JAR file that was created in Step #2 above. From the Web Portal, you can save time by creating a copy of the existing Pipeline Version that uses the stream-4.0
run-time environment and changing the run-time environment to stream-5.0
, and using the new JAR file for the new Pipeline Version. For more information, see Pipeline Deployment. - Upgrade to the new Pipeline Version. For more information, see Upgrade via Portal and Upgrade via CLI.
Stream-5.0 Supervisor (Flink JobManager) Memory Changes
Apache Flink 1.11 introduced improvements in the memory setup of the JobManagers. By moving from Flink 1.10 to 1.13, Stream-5.0 now includes these improvements and provides pipeline users with granular control on the memory configuration of the JobManager. There are two categories of these changes in Stream-5.0 compared to Stream-4.0 as documented below:
- In Stream-4.0, the memory in the Supervisor unit is translated to be set as the
jobmanager.heap.size
. However, it includes not only the JVM heap but also other "off-heap" memory components. In Stream-5.0, the memory in a Supervisor unit is translated to jobmanager.memory.process.size
, which is assigned to the JobManager JVM process. Flink adjusts the rest of the memory components based on default values or additionally configured options (more on this below). - In Stream-5.0, pipeline users can use all the memory configuration options via the Stream Configuration option. The memory configurations available to pipeline users include all the configurations listed within the official Flink documentation. It is important to ensure that the values set for these memory components are tuned in accordance with the Supervisor units configured for the JobManager.
For more information on the memory improvements, configuration options and defaults, see the official Flink documentation below:
For more information about pipelines general support for Apache Flink, please see Stream Pipelines - Apache Flink Support FAQ.
Known Issues
None
Environment Comparison
Group ID | Artifact ID | stream-4.0 | stream-5.0 |
org.apache.calcite.avatica | avatica-core | 1.15.0 | 1.17.0 |
org.apache.calcite | calcite-core | 1.21.0 | 1.26.0 |
org.apache.calcite | calcite-linq4j | 1.21.0 | 1.26.0 |
org.apache.commons | commons-compress | 1.18 | 1.21 |
org.apache.flink | flink-annotations | 1.10.3 | 1.13.5 |
org.apache.flink | flink-cep_2.12 | 1.10.3 | 1.13.5 |
org.apache.flink | flink-clients_2.12 | 1.10.3 | 1.13.5 |
org.apache.flink | flink-connector-base | Unavailable | 1.13.5 |
org.apache.flink | flink-connector-files | Unavailable | 1.13.5 |
org.apache.flink | flink-container_2.12 | 1.10.3 | 1.13.5 |
org.apache.flink | flink-core | 1.10.3 | 1.13.5 |
org.apache.flink | flink-csv | Unavailable | 1.13.5 |
org.apache.flink | flink-file-sink-common | Unavailable | 1.13.5 |
org.apache.flink | flink-hadoop-fs | 1.10.3 | 1.13.5 |
org.apache.flink | flink-java | 1.10.3 | 1.13.5 |
org.apache.flink | flink-json | Unavailable | 1.13.5 |
org.apache.flink | flink-mapr-fs | 1.10.3 | 1.13.5 |
org.apache.flink | flink-metrics-core | 1.10.3 | 1.13.5 |
org.apache.flink | flink-metrics-jmx_2.12 | 1.10.3 | Removed |
org.apache.flink | flink-optimizer_2.12 | 1.10.3 | 1.13.5 |
org.apache.flink | flink-queryable-state-client-java | 1.10.3 | 1.13.5 |
org.apache.flink | flink-runtime_2.12 | 1.10.3 | 1.13.5 |
org.apache.flink | flink-runtime-web_2.12 | 1.10.3 | 1.13.5 |
org.apache.flink | flink-scala_2.12 | 1.10.3 | 1.13.5 |
org.apache.flink | flink-shaded-asm-7 | 7.1-9.0 | 7.1-13.0 |
org.apache.flink | flink-shaded-curator | 1.10.3 | Removed |
org.apache.flink | flink-shaded-guava | 18.0-9.0 | 18.0-13.0 |
org.apache.flink | flink-shaded-jackson | 2.10.1-9.0 | 2.12.1-13.0 |
org.apache.flink | flink-shaded-netty | 4.1.39.Final-9.0 | 4.1.49.Final-13.0 |
org.apache.flink | flink-shaded-zookeeper-3 | Unavailable | 3.4.14-13.0 |
org.apache.flink | flink-sql-parser | 1.10.3 | 1.13.5 |
org.apache.flink | flink-sql-parser-hive | Unavailable | 1.13.5 |
org.apache.flink | flink-statebackend-rocksdb_2.12 | 1.10.3 | 1.13.5 |
org.apache.flink | flink-streaming-java_2.12 | 1.10.3 | 1.13.5 |
org.apache.flink | flink-streaming-scala_2.12 | 1.10.3 | 1.13.5 |
org.apache.flink | flink-table-api-java | 1.10.3 | 1.13.5 |
org.apache.flink | flink-table-api-java-bridge_2.12 | 1.10.3 | 1.13.5 |
org.apache.flink | flink-table-api-scala_2.12 | 1.10.3 | 1.13.5 |
org.apache.flink | flink-table-api-scala-bridge_2.12 | 1.10.3 | 1.13.5 |
org.apache.flink | flink-table-common | 1.10.3 | 1.13.5 |
org.apache.flink | flink-table-planner_2.12 | 1.10.3 | 1.13.5 |
org.apache.flink | flink-table-planner-blink_2.12 | 1.10.3 | 1.13.5 |
org.apache.flink | flink-table-runtime-blink_2.12 | 1.10.3 | 1.13.5 |
org.apache.flink | flink-table-uber_2.12 | 1.10.3 | Removed |
org.apache.flink | flink-table-uber-blink_2.12 | 1.10.3 | Removed |
org.apache.flink | flink-yarn_2.12 | 1.10.3 | 1.13.5 |
org.apache.flink | force-shading | 1.10.3 | 1.13.5 |
org.codehaus.janino | commons-compiler | 3.0.9 | 3.0.11 |
org.codehaus.janino | janino | 3.0.9 | 3.0.11 |
For more information, see SDK Dependencies for Stream Processing
↑ Top
See Also