Configuration File Reference
Overview
This topic is a summary of the various configuration files used in HERE platform pipelines. Most are only used in certain places, such as with the CLI, and are identified accordingly. In some instances, a configuration file may only be used in specific use cases, which are also described here.
Note
With rare exceptions, there are no configuration files needed when directly using the Pipeline API. Most of the parameters encountered in these configuration files are parameters required by various API functions. Both the CLI and the platform portal simplify working with pipelines, but the data is still required by the Pipeline API. So, we use configuration files with the SDK and CLI, and we use forms with the platform portal.
pipeline-job.conf
CLI only
This file applies only when you activate a batch Pipeline Version without the scheduler (that is, in Run Now mode).
Example content of pipeline-job.conf:
pipeline.job.catalog-versions {
    output-catalog { base-version = 42 }
    input-catalogs {
        test-input-1 {
            processing-type = "no_changes"
            version = 19
        }
        test-input-2 {
            processing-type = "changes"
            since-version = 70
            version = 75
        }
        test-input-3 {
            processing-type = "reprocess"
            version = 314159
        }
    }
}
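Where:
- output-catalog specifies the base version (base-version) of the output catalog.
- input-catalogs specifies, for each input catalog ID, the type of processing (processing-type) and the catalog version (version) to use. In this example, the entry with processing-type "changes" also gives a since-version.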
pipeline-config.conf
CLI only
When you create a Pipeline Version using the CLI, the configuration of the new Pipeline Version is specified by the pipeline-config.conf file, the template file ID, and the pipeline ID. The pipeline-config.conf file is added to the classpath of the main user process.
Because the pipeline template (see discussion below) includes information about the input catalog(s) to be used with this Pipeline Version, the information in the pipeline-config.conf file should agree with the information in the pipeline template, as follows:
- In the pipeline template, the input catalog ID values are identified.
- In the pipeline-config.conf file, you use the same input catalog ID values from the template to associate the HRN value for each catalog.
- The pipeline-job.conf file should only be used when a batch Pipeline Version needs to be executed with specific versions of input catalogs or type of processing. It contains the version information about the catalogs defined in the pipeline-config.conf file. However, if you run the batch Pipeline Version using the scheduler, the scheduler determines the versions and the new data to be processed. Or, if the batch Pipeline Version is run on-demand without any details, the service will, by default, pick the latest catalog versions and reprocess.
Example content of pipeline-config.conf:
pipeline.config {
    billing-tag = "test-billing-tag"
    output-catalog { hrn = "hrn:here:data:::example-output" }
    input-catalogs {
        test-input-1 { hrn = "hrn:here:data:::example1" }
        test-input-2 { hrn = "hrn:here:data:::example2" }
        test-input-3 { hrn = "hrn:here:data:::example3" }
    }
}
Where:
- billing-tag specifies an optional tag to group billing entries for the pipeline.
- output-catalog specifies the HRN that identifies the output catalog of the pipeline.
- input-catalogs specifies one or more input catalogs of the pipeline: for each input catalog, its fixed identifier is provided together with the HRN of the actual catalog.
As of HERE platform Release 2.1, a stream Pipeline Version can use the same catalog for input and output. This does not apply to batch Pipeline Versions, which must use a different output catalog.
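The agreement between the template's input catalog IDs and the pipeline-config.conf entries can be checked before creating a Pipeline Version. The following is a minimal Python sketch of such a check, not part of the CLI or the Pipeline API. It assumes the configuration has already been parsed into a dictionary (Python has no HOCON parser in its standard library), and the function name check_pipeline_config is hypothetical.

```python
def check_pipeline_config(template_input_ids, config, batch):
    """Return a list of problems; an empty list means config and template agree.

    template_input_ids: input catalog IDs declared in the pipeline template.
    config: the contents of the pipeline.config block, already parsed to a dict.
    batch: True for a batch Pipeline Version, False for a stream one.
    """
    problems = []
    inputs = config["input-catalogs"]
    declared = set(template_input_ids)
    configured = set(inputs)

    # Every ID declared in the template needs an HRN in the config file.
    for catalog_id in sorted(declared - configured):
        problems.append(f"no HRN configured for template input catalog ID '{catalog_id}'")
    # The config file should not introduce IDs the template does not know.
    for catalog_id in sorted(configured - declared):
        problems.append(f"input catalog ID '{catalog_id}' is not declared in the template")

    # Batch Pipeline Versions must use an output catalog that is not also an input.
    output_hrn = config["output-catalog"]["hrn"]
    if batch:
        for catalog_id in sorted(inputs):
            if inputs[catalog_id]["hrn"] == output_hrn:
                problems.append(f"batch output catalog is also input '{catalog_id}'")
    return problems


# The same values as in the pipeline-config.conf example.
config = {
    "billing-tag": "test-billing-tag",
    "output-catalog": {"hrn": "hrn:here:data:::example-output"},
    "input-catalogs": {
        "test-input-1": {"hrn": "hrn:here:data:::example1"},
        "test-input-2": {"hrn": "hrn:here:data:::example2"},
        "test-input-3": {"hrn": "hrn:here:data:::example3"},
    },
}
template_ids = ["test-input-1", "test-input-2", "test-input-3"]
problems = check_pipeline_config(template_ids, config, batch=True)
print(problems or "config agrees with template")
```

Returning a list of messages rather than raising on the first mismatch lets you report every inconsistency in one pass.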
pipeline template
The pipeline template is an entity required by the Pipeline API for creating a Pipeline Version. From the platform portal, it is created by answering the questions in the UI. It can also be created using the pipeline template create CLI command. The parameters used in both cases include the following.
Required:
parameter | Required | Description | Format |
name | yes | A meaningful name for pipeline template | string |
cluster type | yes | Distributed processing framework used to execute the pipeline template | string. Possible values: stream-2.0.0, stream-3.0.0, batch-2.0.0, batch-2.1.0 |
JAR file | yes | The pipeline JAR file location, including the path on a local file system, to upload. | string |
class name | yes | Name of the pipeline template main class | string |
Group ID | yes | ID of the group allowed to access the template. | string |
--input-catalog-ids <catalog IDs...> | yes | IDs of the input catalogs expected by the pipeline. Multiple IDs are separated by a space. This list must match the catalog IDs used in the pipeline-config.conf file. Note: --input-catalog-ids may also accept a path to a configuration file that contains the catalog IDs. | string |
--description | no | Description of the pipeline template | string |
--supervisor-units | no | Default size of a supervisor node (1-15 units) | integer |
--supervisor-units-profile | no | ID of the resource profile requested for the supervisor units | string |
--worker-units | no | Default size of a worker node (1-15 units) | integer |
--workers | no | Default number of workers to allocate | integer |
--worker-units-profile | no | ID of the resource profile requested for the worker units | string |
--default-runtime-config | no | Map of default configuration values for the pipeline application given in the key1=value1\nkey2=value2... form. The Pipeline API passes them as the application.properties file to the pipeline by adding it to the classpath of the main JVM process. | key/value pairs. The maximum property name (key) size is 256 and the maximum property value is 1024. |
--credentials | no | The credentials file to use with the command as downloaded from the portal. | |
--profile | no | The name of the credentials profile to be used from the olpcli.ini file | string |
--json | no | Display the created pipeline template contents in JSON format. | string |
However, if you use the API to create a template, the parameters are different, because the platform portal and the CLI take care of uploading the pipeline JAR file for you. With the API, the upload is a separate step that creates a package from the uploaded pipeline JAR file and assigns it a package ID. All pipeline JAR files are identified and tracked by their package ID.
parameter | Required | Description | Format |
name | yes | User provided name for the Pipeline Template. | string [ 3 .. 64 ] characters required |
runtimeEnvironment | yes | The runtime environment type. | string. Possible values: "stream-2.0.0", "stream-3.0.0", "stream-4.0", "stream-5.0", "batch-2.0.0", "batch-2.1.0", "batch-3.0" |
packageId | yes | The Pipeline API generated identifier (UUID) for the Package (that is, the pipeline JAR file) that, when combined with this Pipeline Template, is used to create a Pipeline Version. Note: The package ID is returned by the package upload API function. | string |
entryPointClassName | yes | The fully qualified class name of the entry point of this Pipeline Template. | string |
groupId | yes | Group ID that has ownership of the Pipeline Template. Used to restrict access to the Pipeline Version. | string |
defaultClusterConfiguration | yes | Configuration of the cluster for running the Pipeline Version; see the example, below. | ClusterConfiguration |
inputCatalogIds | yes | List of input catalog identifiers for Pipeline Template. This list of identifiers can be used by a client to describe a list of HRNs needed for a valid Pipeline Version. | string Each identifier is an alpha-numeric string (_ and - are also allowed) providing a label for a catalog HRN. |
description | no | Additional text describing the Pipeline Template. | string [ 0 .. 512 ] characters |
defaultRuntimeConfiguration | yes | Default runtime configuration in Java Properties format (as a string). Any runtime configuration values supplied in this Pipeline Template provide default values for Pipeline Versions that use the template. If a Pipeline Version has its own runtime configuration values, they are added to the defaults from the parent Pipeline Template, and a Pipeline Version value overrides a default value that has the same key. These defaults are applied dynamically whenever the Pipeline Version is run. The Pipeline API passes them as the application.properties file to the pipeline by adding it to the classpath of the main JVM process. | string key/value pair. The maximum property name (key) size is 256 and the maximum property value is 1024. |
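The limits in the table above can be checked on the client before a template is submitted. The following is a minimal Python sketch, not part of the Pipeline API. The function name validate_template is hypothetical, the key and value limits are assumed to be counted in characters, and the runtime configuration is split on the first "=" only, which is a simplification of the Java Properties format.

```python
import re

# Alphanumeric characters plus "_" and "-", as described for inputCatalogIds.
CATALOG_ID = re.compile(r"[A-Za-z0-9_-]+")

# Fields marked "yes" in the Required column of the table.
REQUIRED_FIELDS = (
    "name", "runtimeEnvironment", "packageId", "entryPointClassName",
    "groupId", "defaultClusterConfiguration", "inputCatalogIds",
    "defaultRuntimeConfiguration",
)


def validate_template(payload):
    """Return a list of error messages; an empty list means no problem was found."""
    errors = [f"missing required field: {f}" for f in REQUIRED_FIELDS if f not in payload]

    if not 3 <= len(payload.get("name", "")) <= 64:
        errors.append("name must be 3 to 64 characters long")
    if len(payload.get("description", "")) > 512:
        errors.append("description must be at most 512 characters long")

    for catalog_id in payload.get("inputCatalogIds", []):
        if not CATALOG_ID.fullmatch(catalog_id):
            errors.append(f"invalid input catalog ID: {catalog_id!r}")

    # Simplified properties handling: one key=value pair per line.
    for line in payload.get("defaultRuntimeConfiguration", "").splitlines():
        key, _, value = line.partition("=")
        if len(key.strip()) > 256 or len(value.strip()) > 1024:
            errors.append(f"property too long: {key.strip()[:32]}")
    return errors


# Values borrowed from the examples in this topic.
payload = {
    "name": "sparktestcompiler",
    "runtimeEnvironment": "batch-3.0",
    "packageId": "68d723f6-2ae7-40e4-8c24-61512d511852",
    "entryPointClassName": "com.example.Main",
    "groupId": "GROUP-9479863e-a13b-4d35-9eb1-5a054669046e",
    "defaultClusterConfiguration": {
        "workerResourceProfileId": "HS1B",
        "supervisorResourceProfileId": "HS1B",
        "supervisorUnits": 1,
        "workerUnits": 1,
        "workers": 1,
    },
    "inputCatalogIds": ["test-input-1", "test-input-2", "test-input-3"],
    "description": "",
    "defaultRuntimeConfiguration": "",
}
errors = validate_template(payload)
print(errors or "payload looks valid")
```

A check like this only catches formatting problems early; the Pipeline API remains the authority on whether a template is accepted.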
Example
Results from the pipeline template list CLI command:
{
    "pipelineTemplates": [
        {
            "created": "2018-03-01T15:16:53.796Z",
            "groupId": "GROUP-9479863e-a13b-4d35-9eb1-5a054669046e",
            "defaultClusterConfiguration": {
                "workerResourceProfileId": "HS1B",
                "supervisorResourceProfileId": "HS1B",
                "supervisorUnits": 1,
                "workerUnits": 1,
                "workers": 1
            },
            "name": "sparktestcompiler",
            "packageId": "68d723f6-2ae7-40e4-8c24-61512d511852",
            "entryPointClassName": "com.example.Main",
            "description": "",
            "id": "5c0660a3-0fb4-4f35-bcd0-be6ce25075f6",
            "state": "created",
            "defaultRuntimeConfiguration": "",
            "updated": "2018-03-01T15:16:53.796Z",
            "runtimeEnvironment": "batch-3.0"
        }
    ]
}
credentials.properties
Every application (such as a pipeline) functioning within the HERE platform must be registered with the HERE platform. This is done under the Apps and keys function in the platform portal. When registered, the credentials.properties file is created and must be used whenever loading that specific pipeline into the HERE platform. The following is an example of the credentials.properties file.
here.user.id = HERE-01966c94-aaf1-4ae2-a1y6-6516b3f9b6c1
here.client.id = mzLcb1rL8nskvDQpCAAO
here.access.key.id = BELUTk45QdaYGgZ9A_IMTA
here.access.key.secret = 108lI7w9m8G_6sIw9kng-PXGoeHQQ-cv6xByNOuMcRYixZZp...
here.token.endpoint.url = https://account.api.here.com/oauth2/token
For more information on how to use this file, see how to Get Your Credentials or the Identity & Access Management Guide.
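Before using a credentials.properties file, it can help to confirm that none of its entries is missing. The following is a minimal Python sketch of such a check. It assumes that the five keys shown in the example above are the ones to look for, and the helper names read_credentials and missing_keys are hypothetical. The sample text deliberately leaves out the secret to show a failing check.

```python
# Keys taken from the credentials.properties example in this topic (assumed complete).
REQUIRED_KEYS = (
    "here.user.id",
    "here.client.id",
    "here.access.key.id",
    "here.access.key.secret",
    "here.token.endpoint.url",
)


def read_credentials(text):
    """Parse simple 'key = value' lines into a dict, skipping blanks and comments."""
    creds = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        creds[key.strip()] = value.strip()
    return creds


def missing_keys(creds):
    """Return the required keys that are absent or empty, in a fixed order."""
    return [key for key in REQUIRED_KEYS if not creds.get(key)]


# Sample content without here.access.key.secret.
sample = """\
here.user.id = HERE-01966c94-aaf1-4ae2-a1y6-6516b3f9b6c1
here.client.id = mzLcb1rL8nskvDQpCAAO
here.access.key.id = BELUTk45QdaYGgZ9A_IMTA
here.token.endpoint.url = https://account.api.here.com/oauth2/token
"""
creds = read_credentials(sample)
print(missing_keys(creds))  # prints ['here.access.key.secret']
```

In practice you would read the text from the downloaded file instead of a string, and avoid printing the secret itself.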
olpcli.ini
CLI only
This file is created from the credentials.properties file. For additional information, see HERE Workspace for Java & Scala Developers.
application.properties
This is a custom configuration file (in Java Properties file format) made available on the running pipeline’s classpath. It is constructed from the value of the Pipeline Template’s defaultRuntimeConfig property (API & CLI only), overridden on a key-by-key basis with the value of the Pipeline Version’s customRuntimeConfig property. The values of defaultRuntimeConfig and customRuntimeConfig are strings whose content represents a valid Java Properties file.
Note
For Stream runtimes, if the uber JAR contains an application.properties file, that file takes precedence on the classpath over the application.properties file provided by the runtime.
Example
# Value of Pipeline Template's "defaultRuntimeConfig" property
"myexample.threads = 3\nmyexample.language = \"en_US\"\nmyexample.processing.window=300\nmyexample.processing.mode=stateless"
# Value of Pipeline Version’s "customRuntimeConfig" property
"myexample.threads=5\n\n myexample.processing.mode= \"stateful\"\nmyexample.processing.filterInvalid = true"
# The resulting application.properties file on the pipeline classpath
# (for the given values of "defaultRuntimeConfig" and "customRuntimeConfig")
myexample.threads = 5
myexample.language = "en_US"
myexample.processing.window = 300
myexample.processing.mode = "stateful"
myexample.processing.filterInvalid = true
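The key-by-key override in this example can be reproduced with a short Python sketch. This illustrates the merge rule only and is not the platform's implementation. The parser handles a simplified subset of the Java Properties format (one key=value pair per line, split on the first "="), and the function names are hypothetical.

```python
def parse_properties(text):
    """Parse a simplified subset of the Java Properties format into a dict."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line[0] in "#!":
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def merge_runtime_config(default_config, custom_config):
    """Start from the template defaults, then let Pipeline Version values win per key."""
    merged = parse_properties(default_config)
    merged.update(parse_properties(custom_config))
    return merged


def render_properties(props):
    """Render the merged values as 'key = value' lines."""
    return "\n".join(f"{key} = {value}" for key, value in props.items())


default_config = (
    'myexample.threads = 3\n'
    'myexample.language = "en_US"\n'
    'myexample.processing.window=300\n'
    'myexample.processing.mode=stateless'
)
custom_config = (
    'myexample.threads=5\n\n'
    ' myexample.processing.mode= "stateful"\n'
    'myexample.processing.filterInvalid = true'
)

merged = merge_runtime_config(default_config, custom_config)
print(render_properties(merged))
```

Keys present only in the defaults (language, window) are kept, keys present in both (threads, mode) take the Pipeline Version value, and keys present only in the custom configuration (filterInvalid) are appended.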
See Also