HERE Map Content (HMC) Topology Filter


HERE Map Content (HMC) Topology Filter is a batch pipeline designed to extract topology segment data and related attributes from the HERE Map Content catalog, based on the attribute filters provided by the user. For example, you could find all topology segments that belong to roads of a specific functional class and/or have a certain number of lanes. Please see the file for detailed information on the supported attribute filters. Before deploying this pipeline template, provide your filter criteria in the file in the config folder. One or more filters can be provided at a time; if two or more filters are provided, they are combined using AND logic.

Consider the following example, where the user provides the following filters:

# Attribute Filter #1:
# Attribute Filter #2:

The output of this pipeline will be the topology segments that have the FUNCTIONAL_CLASS_1 attribute and two or more lanes.
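For illustration only, such a pair of filters might look roughly like this in the filter configuration file (the property names and value syntax below are assumptions for illustration; consult the supported-filter documentation for the exact keys):

```
# Attribute Filter #1: keep segments of functional class 1
# (property names below are illustrative, not the actual keys)
filter.functional.class = FUNCTIONAL_CLASS_1

# Attribute Filter #2: keep segments with two or more lanes
filter.lane.count.min = 2
```

With both entries present, only segments satisfying both conditions are emitted, per the AND logic described above.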


Application Flow Diagram
Figure 1. Application Flow Diagram

Legend: Legend Diagram


  • If you wish to provide an existing catalog for the output data, make sure it was created using the same configuration as specified in the config/output-catalog.json file and is shared with the same GROUP that will be used to deploy this pipeline template.
  • Confirm that your local credentials (~/.here/) are added to the same GROUP.
Output Data/Schema

The user can create the output catalog with the OLP CLI using the provided output catalog JSON config (replace any <> variables), or it will be created automatically during execution of the Wizard Deployer.
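As a sketch, catalog creation with the OLP CLI could look like the following (the catalog ID, name, and summary are placeholders; verify the exact flags with `olp catalog create --help` for your CLI version):

```bash
# Create the output catalog from the provided JSON config.
# <my-output-catalog> is a placeholder catalog ID.
olp catalog create <my-output-catalog> "Topology Filter Output" \
    --summary "Filtered HMC topology segments and attributes" \
    --config config/output-catalog.json
```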

Layer Id                               Layer Type  Partition Type  Schema
topology-geometry-output               Versioned   HEREtile
navigation-attributes-output           Versioned   HEREtile
advanced-navigation-attributes-output  Versioned   HEREtile
adas-attributes-output                 Versioned   HEREtile
road-attributes-output                 Versioned   HEREtile
lane-attributes-output                 Versioned   HEREtile
state                                  Versioned   Generic         Zip


In order to deploy and run this pipeline template you will need the Wizard Deployer. The Wizard executes interactively, asking questions about the application and expecting the user to provide the needed answers. Follow the Wizard's documentation instructions, set up the filters as described above, and then follow these steps:

  1. Execute the script as ./
  2. Follow the prompts and provide needed answers

PLEASE NOTE: During deployment with the Wizard, you will be asked for the number of workers (cores) you wish to use for processing. This number depends on the size of the area (bounding box) you wish to process and on the layers used as input. Some layers (for example, adas-attributes) contain more data than others and thus require more processing power to parse. We recommend starting with a small bounding box to assess the processing time required to extract data according to the provided filters, and then tuning the number of workers for your bounding box until you are satisfied with the processing time. When tuning the performance of your pipeline, the Spark Web UI is a great tool for monitoring resource allocation and data distribution.

Rerunning pipeline with different configuration

Once deployed, this pipeline can be re-executed with a different configuration by simply copying the pipeline version and providing different runtime parameters and a different number of workers. The user does not have to go through the Wizard deployment process again. The same pipeline can be reused to extract different attributes for a different bounding box by changing the corresponding runtime parameters during version creation/copying. Make sure to tune the number of workers to the size of the requested bounding box.
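If you prefer the OLP CLI over the Portal for this, the re-run could be sketched roughly as below (command names and arguments are assumptions based on common OLP CLI usage; check `olp pipeline version --help` on your installation):

```bash
# Create a new pipeline version with updated runtime parameters,
# then activate it (all IDs and file names are placeholders).
olp pipeline version create <new-version-name> <pipeline-id> \
    <template-id> config/pipeline-version.conf
olp pipeline version activate <pipeline-id> <new-version-id>
```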

Tuning Data Processing Library parameters

This pipeline template uses the Data Processing Library (DPL) under the hood, which allows the user to tune DPL-related parameters to achieve better performance. Read more on DPL configuration here. The performance tuning section of the DPL documentation can also be useful when tuning Spark-related configuration. All these parameters can be provided in the config/ file before the initial deployment, or later as runtime parameters when creating a new version of a deployed pipeline.
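For instance, a few commonly tuned Spark properties could be added to the configuration file like this (the values are illustrative placeholders, not recommendations; the right values depend on your bounding box and input layers):

```
# Illustrative Spark tuning values; adjust for your workload.
spark.executor.memory=8g
spark.executor.cores=4
spark.sql.shuffle.partitions=512
```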


In the Platform Portal, select the Pipelines tab, where you should see your pipeline deployed and running. The Spark Web UI provides important details about your running pipeline. After your pipeline finishes, the published data will be available in the output catalog.

Cost Estimation

Executing this pipeline template will incur the following costs:

Storage - Blob

Cost will depend on the amount of data stored in a Versioned layer.

Storage - Metadata

Cost will depend on the number and size of partitions (metadata) stored in the Versioned layer.

Data Transfer IO

Cost will depend on the amount of:

  • input data read from the HMC catalog
  • data published to the output catalog
Compute Core and Compute RAM

Cost will depend on the number of workers selected by the user.

Log Search IO

Cost will depend on the log level set for the execution of this pipeline template. To minimize this cost, the user can set log level to WARN.


If you need support with this pipeline template, please contact us.
