Migrate Pipeline to New Run-Time Environment
Contents
- Migrate Batch-2.0.0 Pipeline to Batch-2.1.0 Environment (Apache Spark 2.4.2 with Support for History Server)
- Migrate Batch-2.1.0 Pipeline to Batch-3.0 Environment (Apache Spark 2.4.7)
- Migrate Stream-2.0.0 Pipeline to Stream-3.0.0 Environment (Apache Flink 1.10.1)
- Migrate Stream-3.0.0 Pipeline to Stream-4.0 Environment (Apache Flink 1.10.3)
- Migrate Stream-4.0 Pipeline to Stream-5.0 Environment (Apache Flink 1.13.5)
Migrate Batch-2.0.0 Pipeline to Batch-2.1.0 Environment (Apache Spark 2.4.2 with Support for History Server)
To migrate a Batch pipeline to Batch-2.1.0 environment (Apache Spark 2.4.2 with support for History Server), follow these steps:
- Download the HERE platform SDK 2.12 or newer that contains the new
sdk-batch-bom.pom
environment file. For more information on using the newsdk-batch-bom.pom
file, see the SDK Dependency Management. - Re-compile your pipeline code using the new
sdk-batch-bom.pom
file to create a new JAR file. For more information, see Batch Pipeline. - For an existing Batch pipeline, use the CLI or Web Portal to create a new Pipeline Template and a Pipeline Version with the
batch-2.1.0
run-time environment and the new JAR file that was created in Step #2 above. From the Web Portal, you can save time by creating a copy of the existing Pipeline Version that uses a previous version of the batch environment and changing the run-time environment tobatch-2.1.0
, and using the new JAR file for the new Pipeline Version. For more information, see Pipeline Deployment. - Upgrade to the new Pipeline Version. For more information, see Upgrade via Portal and Upgrade via CLI.
Caution: Compatibility issues due to AWS SDK versions
History Server support in batch-2.1.0
pipelines currently requires the AWS SDK 1.7.4 in the (Spark 2.4.2) runtime environment. This version is not compatible with later versions of the AWS SDK (1.10.x and up). To use the AWS SDK in a pipeline with the batch-2.1.0
run-time environment, it is recommended to build the pipeline JAR with AWS SDK 1.7.4.
For more information about pipelines general support for Apache Spark, see Batch Pipelines - Apache Spark Support FAQ.
Migrate Batch-2.1.0 Pipeline to Batch-3.0 Environment (Apache Spark 2.4.7)
There are no changes required from Pipeline runtime perspective. Please check the HERE Data SDK for Java and Scala documentation for changes with Batch-3.0
Migrate Stream-2.0.0 Pipeline to Stream-3.0.0 Environment (Apache Flink 1.10.1)
To migrate a Stream-2.0.0 pipeline to Stream-3.0.0 environment (Apache Flink 1.10.1), follow these steps:
Caution: Memory changes to consider before migration
Check the Stream-3.0.0 memory changes below before migrating the pipeline.
- Use the 2.17 version or newer of HERE Data SDK for Java and Scala that contains the new
sdk-stream-bom.pom
environment file. For more information on using the newsdk-stream-bom.pom
file, see the SDK Dependency Management. - Re-compile your pipeline code using the new
sdk-stream-bom.pom
file to create a new JAR file. For more information, see Stream Pipeline. - For an existing Stream pipeline, use the CLI or Web Portal to create a new Pipeline Template and a Pipeline Version with the
stream-3.0.0
run-time environment and the new JAR file that was created in Step #2 above. From the Web Portal, you can save time by creating a copy of the existing Pipeline Version that uses thestream-2.0.0
run-time environment and changing the run-time environment tostream-3.0.0
, and using the new JAR file for the new Pipeline Version. For more information, see Pipeline Deployment. - Upgrade to the new Pipeline Version. For more information, see Upgrade via Portal and Upgrade via CLI.
Stream-3.0.0 Worker (Flink TaskManager) Memory Changes
Apache Flink 1.10 introduced improvements in the memory setup of TaskManagers. Stream-3.0.0 includes these improvements and provides pipeline users with granular control on the memory configuration of the TaskManagers. There are two categories of these changes in Stream-3.0.0 compared to Stream-2.0.0 as documented below:
- In Stream-2.0.0, the memory in a Worker unit is translated to be set as the
taskmanager.heap.size
. However, it includes not only the JVM heap but also other "off-heap" memory components. In Stream-3.0.0, the memory in a Worker unit is translated totaskmanager.memory.process.size
, which is assigned to the TaskManager JVM process. Flink adjusts the rest of the memory components based on default values or additionally configured options (more on this below). - In Stream-3.0.0, pipeline users can use all the memory configuration options via the Stream Configuration option. The memory configurations available to pipeline users include all the configurations listed within the official Flink documentation. It is important to ensure that the values set for these memory components are tuned in accordance with the Worker units configured for the TaskManager.
For more information on the memory improvements, configuration options and defaults, see the official Flink documentation below:
- Flink 1.10 - Memory Management Improvements
- Flink 1.10 - Detailed Memory Model
- Flink 1.10 - Memory Migration Guide
For more information about pipelines general support for Apache Flink, please see Stream Pipelines - Apache Flink Support FAQ.
Known Issues
Change in the Maven Modules of Table API (FLINK-11064)
Users that had a flink-table dependency before, need to update their dependencies to flink-table-planner
and the correct dependency of flink-table-api-*
, depending on whether Java or Scala is used: one of flink-table-api-java-bridge
or flink-table-api-scala-bridge
.
Note: No flink-table dependency for Stream-3.0.0
stream-2.0.0
contained the flink-table
dependency while stream-3.0.0
no longer contains it.
Change in Scala-Macro Generated Serializers (FLINK-11330)
Users that migrated to stream-3.0.0
and have the warning-related scalac
options enabled (e.g. -Xfatal-warnings
, -Ywarn-unused:*
) might face the compilation errors with the following messages: local val in method createSerializer is never used
, *.scala:1: Unused import
. These messages are actually a result of the macro-expanded code inspection.
To prevent the compilation errors for Scala 2.11
, you can choose one of the following options:
- Using
sbt
, add thescalacOptions ~= (_.filterNot(_.startsWith("-Ywarn-unused")))
line to yourbuild.sbt
settings for the project. - Provided you use
Maven
, use thecom.github.ghik:silencer-plugin_${scala.version}
plugin as following:
<configuration>
<compilerPlugins>
<compilerPlugin>
<groupId>com.github.ghik</groupId>
<artifactId>silencer-plugin_${scala.version}</artifactId>
<version>1.7.0</version>
</compilerPlugin>
</compilerPlugins>
<args>
<!-- Unused val in a Flink macro instantiation -->
<arg>
-P:silencer:globalFilters=local val in method createSerializer is never used
</arg>
<!-- Unused import in a Flink macro instantiation, that is reported on line '1' of the source file -->
<arg>
-P:silencer:lineContentFilters=^/\*$;^package;^//
</arg>
</args>
</configuration>
Note
The purpose of the -P:silencer:lineContentFilters=^/\*$;^package;^//
argument is to ignore the unused import coming from the Flink macro, which is reported to be located at the first line of the source code. You might want to change this line according to your needs.
Environment Comparison
Group ID | Artifact ID | stream-4.0.0 | stream-3.0.0 |
---|---|---|---|
com.fasterxml.jackson.core | jackson-annotations | 2.7.9 | Removed |
com.fasterxml.jackson.core | jackson-core | 2.7.9 | Removed |
com.fasterxml.jackson.core | jackson-databind | 2.7.9 | Removed |
com.fasterxml.jackson.dataformat | jackson-dataformat-csv | 2.7.9 | Removed |
com.fasterxml.jackson.dataformat | jackson-dataformat-yaml | 2.7.9 | Removed |
com.github.scopt | scopt_2.11 | 3.5.0 | Removed |
com.google.code.findbugs | jsr305 | 1.3.9 | Removed |
com.google.protobuf | protobuf-java | 2.6.1 | Removed |
com.twitter | chill-java | 0.7.6 | Removed |
com.twitter | chill_2.11 | 0.7.6 | Removed |
com.typesafe | config | 1.3.0 | Removed |
com.typesafe | ssl-config-core_2.11 | 0.2.1 | Removed |
com.typesafe.akka | akka-actor_2.11 | 2.4.20 | Removed |
com.typesafe.akka | akka-camel_2.11 | 2.4.20 | Removed |
com.typesafe.akka | akka-protobuf_2.11 | 2.4.20 | Removed |
com.typesafe.akka | akka-slf4j_2.11 | 2.4.20 | Removed |
com.typesafe.akka | akka-stream_2.11 | 2.4.20 | Removed |
commons-cli | commons-cli | 1.3.1 | Removed |
commons-collections | commons-collections | 3.2.2 | Removed |
commons-io | commons-io | 2.4 | Removed |
io.netty | netty-all | 4.1.24.Final | Removed |
jline | jline | 2.14.3 | Removed |
org.apache.camel | camel-core | 2.17.7 | Removed |
org.apache.curator | curator-client | 2.12.0 | Removed |
org.apache.curator | curator-framework | 2.12.0 | Removed |
org.apache.curator | curator-recipes | 2.12.0 | Removed |
org.apache.flink | flink-python_2.11 | 1.7.1 | Removed |
org.apache.flink | flink-queryable-state-client-java_2.11 | 1.7.1 | Removed |
org.apache.flink | flink-s3-fs-hadoop | 1.7.1 | Removed |
org.apache.flink | flink-shaded-asm | 5.0.4-5.0 | Removed |
org.apache.flink | flink-shaded-asm-6 | 6.2.1-5.0 | Removed |
org.apache.flink | flink-table_2.11 | 1.7.1 | Removed |
org.clapper | grizzled-slf4j_2.11 | 1.3.2 | Removed |
org.javassist | javassist | 3.19.0-GA | Removed |
org.objenesis | objenesis | 2.1 | Removed |
org.reactivestreams | reactive-streams | 1.0.0 | Removed |
org.rocksdb | rocksdbjni | 5.7.5 | Removed |
org.scala-lang | scala-compiler | 2.11.12 | Removed |
org.scala-lang | scala-library | 2.11.12 | Removed |
org.scala-lang | scala-reflect | 2.11.12 | Removed |
org.scala-lang.modules | scala-java8-compat_2.11 | 0.7.0 | Removed |
org.scala-lang.modules | scala-parser-combinators_2.11 | 1.0.4 | Removed |
org.scala-lang.modules | scala-xml_2.11 | 1.0.5 | Removed |
org.tukaani | xz | 1.5 | Removed |
org.xerial.snappy | snappy-java | 1.1.4 | Removed |
org.yaml | snakeyaml | 1.15 | Removed |
ch.qos.logback | logback-classic | 1.2.3 | 1.2.3 |
ch.qos.logback | logback-core | 1.2.3 | 1.2.3 |
com.esotericsoftware.kryo | kryo | 2.24.0 | 2.24.0 |
com.esotericsoftware.minlog | minlog | 1.2 | 1.2 |
com.esotericsoftware.reflectasm | reflectasm | 1.09 | 1.09 |
org.apache.calcite | calcite-core | 1.17.0 | 1.21.0 |
org.apache.calcite | calcite-linq4j | 1.17.0 | 1.21.0 |
org.apache.calcite.avatica | avatica-core | 1.12.0 | 1.15.0 |
org.apache.commons | commons-compress | 1.4.1 | 1.18 |
org.apache.commons | commons-lang3 | 3.3.2 | 3.3.2 |
org.apache.commons | commons-math3 | 3.5 | 3.5 |
org.apache.flink | flink-annotations | 1.7.1 | 1.10.1 |
org.apache.flink | flink-clients_2.11 | 1.7.1 | 1.10.1 |
org.apache.flink | flink-container_2.11 | 1.7.1 | 1.10.1 |
org.apache.flink | flink-core | 1.7.1 | 1.10.1 |
org.apache.flink | flink-dist_2.11 | 1.7.1 | 1.10.1 |
org.apache.flink | flink-hadoop-fs | 1.7.1 | 1.10.1 |
org.apache.flink | flink-java | 1.7.1 | 1.10.1 |
org.apache.flink | flink-mapr-fs | 1.7.1 | 1.10.1 |
org.apache.flink | flink-mesos_2.11 | 1.7.1 | 1.10.1 |
org.apache.flink | flink-metrics-core | 1.7.1 | 1.10.1 |
org.apache.flink | flink-metrics-jmx_2.11 | 1.7.1 | 1.10.1 |
org.apache.flink | flink-optimizer_2.11 | 1.7.1 | 1.10.1 |
org.apache.flink | flink-runtime-web_2.11 | 1.7.1 | 1.10.1 |
org.apache.flink | flink-runtime_2.11 | 1.7.1 | 1.10.1 |
org.apache.flink | flink-scala-shell_2.11 | 1.7.1 | 1.10.1 |
org.apache.flink | flink-scala_2.11 | 1.7.1 | 1.10.1 |
org.apache.flink | flink-shaded-curator | 1.7.1 | 1.10.1 |
org.apache.flink | flink-shaded-guava | 18.0-5.0 | 18.0-9.0 |
org.apache.flink | flink-shaded-jackson | 2.7.9-5.0 | 2.10.1-9.0 |
org.apache.flink | flink-shaded-netty | 4.1.24.Final-5.0 | 4.1.39.Final-9.0 |
org.apache.flink | flink-statebackend-rocksdb_2.11 | 1.7.1 | 1.10.1 |
org.apache.flink | flink-streaming-java_2.11 | 1.7.1 | 1.10.1 |
org.apache.flink | flink-streaming-scala_2.11 | 1.7.1 | 1.10.1 |
org.apache.flink | flink-table-common | 1.7.1 | 1.10.1 |
org.apache.flink | flink-yarn_2.11 | 1.7.1 | 1.10.1 |
org.apache.flink | force-shading | 1.7.1 | 1.10.1 |
org.apache.mesos | mesos | 1.0.1 | 1.0.1 |
org.codehaus.janino | commons-compiler | 3.0.7 | 3.0.9 |
org.codehaus.janino | janino | 3.0.7 | 3.0.9 |
org.slf4j | log4j-over-slf4j | 1.7.25 | 1.7.25 |
org.slf4j | slf4j-api | 1.7.15 | 1.7.15 |
org.apache.flink | flink-cep_2.11 | - | 1.10.1 |
org.apache.flink | flink-queryable-state-client-java | - | 1.10.1 |
org.apache.flink | flink-shaded-asm-7 | - | 7.1-9.0 |
org.apache.flink | flink-sql-parser | - | 1.10.1 |
org.apache.flink | flink-table-api-java | - | 1.10.1 |
org.apache.flink | flink-table-api-java-bridge_2.11 | - | 1.10.1 |
org.apache.flink | flink-table-api-scala-bridge_2.11 | - | 1.10.1 |
org.apache.flink | flink-table-api-scala_2.11 | - | 1.10.1 |
org.apache.flink | flink-table-planner-blink_2.11 | - | 1.10.1 |
org.apache.flink | flink-table-planner_2.11 | - | 1.10.1 |
org.apache.flink | flink-table-runtime-blink_2.11 | - | 1.10.1 |
org.apache.flink | flink-table-uber-blink_2.11 | - | 1.10.1 |
org.apache.flink | flink-table-uber_2.11 | - | 1.10.1 |
Migrate Stream-3.0.0 Pipeline to Stream-4.0 Environment (Apache Flink 1.10.3)
There are no changes required from Pipeline runtime perspective. Please check the HERE Data SDK for Java and Scala documentation for changes with Stream-4.0
Migrate Stream-4.0 Pipeline to Stream-5.0 Environment (Apache Flink 1.13.5)
To migrate a Stream-4.0 pipeline to Stream-5.0 environment (Apache Flink 1.13.5), follow these steps:
Caution: Memory changes to consider before migration
Check the Stream-5.0 memory changes below before migrating the pipeline.
- Use the 2.34 version or newer of HERE Data SDK for Java and Scala that contains the new
sdk-stream-bom.pom
environment file. For more information on using the newsdk-stream-bom.pom
file, see the SDK Dependency Management. - Re-compile your pipeline code using the new
sdk-stream-bom.pom
file to create a new JAR file. For more information, see Stream Pipeline. - For an existing Stream pipeline, use the CLI or Web Portal to create a new Pipeline Template and a Pipeline Version with the
stream-5.0
run-time environment and the new JAR file that was created in Step #2 above. From the Web Portal, you can save time by creating a copy of the existing Pipeline Version that uses thestream-4.0
run-time environment and changing the run-time environment tostream-5.0
, and using the new JAR file for the new Pipeline Version. For more information, see Pipeline Deployment. - Upgrade to the new Pipeline Version. For more information, see Upgrade via Portal and Upgrade via CLI.
Stream-5.0 Supervisor (Flink JobManager) Memory Changes
Apache Flink 1.11 introduced improvements in the memory setup of the JobManagers. By moving from Flink 1.10 to 1.13, Stream-5.0 now includes these improvements and provides pipeline users with granular control on the memory configuration of the JobManager. There are two categories of these changes in Stream-5.0 compared to Stream-4.0 as documented below:
- In Stream-4.0, the memory in the Supervisor unit is translated to be set as the
jobmanager.heap.size
. However, it includes not only the JVM heap but also other "off-heap" memory components. In Stream-5.0, the memory in a Supervisor unit is translated tojobmanager.memory.process.size
, which is assigned to the JobManager JVM process. Flink adjusts the rest of the memory components based on default values or additionally configured options (more on this below). - In Stream-5.0, pipeline users can use all the memory configuration options via the Stream Configuration option. The memory configurations available to pipeline users include all the configurations listed within the official Flink documentation. It is important to ensure that the values set for these memory components are tuned in accordance with the Supervisor units configured for the JobManager.
For more information on the memory improvements, configuration options and defaults, see the official Flink documentation below:
- Flink 1.11 - Memory Management Improvements
- Flink 1.13 - Set up JobManager Memory
- Flink 1.13 - Migration Guide
For more information about pipelines general support for Apache Flink, please see Stream Pipelines - Apache Flink Support FAQ.
Known Issues
None
Environment Comparison
Group ID | Artifact ID | stream-4.0 | stream-5.0 |
---|---|---|---|
org.apache.calcite.avatica | avatica-core | 1.15.0 | 1.17.0 |
org.apache.calcite | calcite-core | 1.21.0 | 1.26.0 |
org.apache.calcite | calcite-linq4j | 1.21.0 | 1.26.0 |
org.apache.commons | commons-compress | 1.18 | 1.21 |
org.apache.flink | flink-annotations | 1.10.3 | 1.13.5 |
org.apache.flink | flink-cep_2.12 | 1.10.3 | 1.13.5 |
org.apache.flink | flink-clients_2.12 | 1.10.3 | 1.13.5 |
org.apache.flink | flink-connector-base | Unavailable | 1.13.5 |
org.apache.flink | flink-connector-files | Unavailable | 1.13.5 |
org.apache.flink | flink-container_2.12 | 1.10.3 | 1.13.5 |
org.apache.flink | flink-core | 1.10.3 | 1.13.5 |
org.apache.flink | flink-csv | Unavailable | 1.13.5 |
org.apache.flink | flink-file-sink-common | Unavailable | 1.13.5 |
org.apache.flink | flink-hadoop-fs | 1.10.3 | 1.13.5 |
org.apache.flink | flink-java | 1.10.3 | 1.13.5 |
org.apache.flink | flink-json | Unavailable | 1.13.5 |
org.apache.flink | flink-mapr-fs | 1.10.3 | 1.13.5 |
org.apache.flink | flink-metrics-core | 1.10.3 | 1.13.5 |
org.apache.flink | flink-metrics-jmx_2.12 | 1.10.3 | Removed |
org.apache.flink | flink-optimizer_2.12 | 1.10.3 | 1.13.5 |
org.apache.flink | flink-queryable-state-client-java | 1.10.3 | 1.13.5 |
org.apache.flink | flink-runtime_2.12 | 1.10.3 | 1.13.5 |
org.apache.flink | flink-runtime-web_2.12 | 1.10.3 | 1.13.5 |
org.apache.flink | flink-scala_2.12 | 1.10.3 | 1.13.5 |
org.apache.flink | flink-shaded-asm-7 | 7.1-9.0 | 7.1-13.0 |
org.apache.flink | flink-shaded-curator | 1.10.3 | Removed |
org.apache.flink | flink-shaded-guava | 18.0-9.0 | 18.0-13.0 |
org.apache.flink | flink-shaded-jackson | 2.10.1-9.0 | 2.12.1-13.0 |
org.apache.flink | flink-shaded-netty | 4.1.39.Final-9.0 | 4.1.49.Final-13.0 |
org.apache.flink | flink-shaded-zookeeper-3 | Unavailable | 3.4.14-13.0 |
org.apache.flink | flink-sql-parser | 1.10.3 | 1.13.5 |
org.apache.flink | flink-sql-parser-hive | Unavailable | 1.13.5 |
org.apache.flink | flink-statebackend-rocksdb_2.12 | 1.10.3 | 1.13.5 |
org.apache.flink | flink-streaming-java_2.12 | 1.10.3 | 1.13.5 |
org.apache.flink | flink-streaming-scala_2.12 | 1.10.3 | 1.13.5 |
org.apache.flink | flink-table-api-java | 1.10.3 | 1.13.5 |
org.apache.flink | flink-table-api-java-bridge_2.12 | 1.10.3 | 1.13.5 |
org.apache.flink | flink-table-api-scala_2.12 | 1.10.3 | 1.13.5 |
org.apache.flink | flink-table-api-scala-bridge_2.12 | 1.10.3 | 1.13.5 |
org.apache.flink | flink-table-common | 1.10.3 | 1.13.5 |
org.apache.flink | flink-table-planner_2.12 | 1.10.3 | 1.13.5 |
org.apache.flink | flink-table-planner-blink_2.12 | 1.10.3 | 1.13.5 |
org.apache.flink | flink-table-runtime-blink_2.12 | 1.10.3 | 1.13.5 |
org.apache.flink | flink-table-uber_2.12 | 1.10.3 | Removed |
org.apache.flink | flink-table-uber-blink_2.12 | 1.10.3 | Removed |
org.apache.flink | flink-yarn_2.12 | 1.10.3 | 1.13.5 |
org.apache.flink | force-shading | 1.10.3 | 1.13.5 |
org.codehaus.janino | commons-compiler | 3.0.9 | 3.0.11 |
org.codehaus.janino | janino | 3.0.9 | 3.0.11 |