Creating Real-Time Anonymization Pipeline
Preparing the first pipeline version for a new pipeline involves the following steps:
- Create a pipeline
- Create a template
- Create a pipeline version
Depending on the tool used, these steps are performed separately or combined into one. In the WebPortal, a single step creates the Pipeline, the Pipeline Template and the first Pipeline Version together.
In preparation for creating a Pipeline, download the Real-Time Anonymization Pipeline Template zip archive and extract all of its contents. The zip archive contains:
- .JAR file (used to create Pipeline Template)
- Readme.md file (providing overview of Pipeline Template)
- config (configuration) folder - contains pipeline-config.conf, which details the runtime parameters for the pipeline.
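The extraction step can be scripted. The following is a minimal sketch, assuming only the archive layout listed above (a JAR file, a Readme.md file and a pipeline-config.conf file). The function name and the paths passed to it are hypothetical and are not part of any HERE tooling.

```python
import zipfile
from pathlib import Path


def extract_template_archive(archive_path: str, target_dir: str) -> dict:
    """Extract the archive and locate the files the list above describes.

    Hypothetical helper: it only checks for the three kinds of file
    named in the documentation and raises if one of them is absent.
    """
    target = Path(target_dir)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(target)

    files = [p for p in target.rglob("*") if p.is_file()]
    found = {
        "jar": [p for p in files if p.suffix.lower() == ".jar"],
        "readme": [p for p in files if p.name.lower() == "readme.md"],
        "config": [p for p in files if p.name == "pipeline-config.conf"],
    }
    missing = [kind for kind, paths in found.items() if not paths]
    if missing:
        raise FileNotFoundError(f"archive is missing: {', '.join(missing)}")
    return found
```

Calling extract_template_archive with the downloaded archive and an empty target folder returns the located paths, so the JAR can be passed on to template creation and pipeline-config.conf can be edited before it is used as runtime parameters.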
Create new Real-Time Anonymization Pipeline
A new Pipeline is created with the following properties:
- Pipeline in a Project - The Pipeline can be created within or outside of a Project. A Project is an access-controlled collection of resources (catalogs, pipelines and schemas).
- Shared with Group - The HERE Platform Group whose Members are able to access the Pipeline. To run this pipeline, members additionally need access to the input and output Catalogs.
- Pipeline Name and Description - details for this specific pipeline, for which pipeline versions can be created.
- Notification Email - Used to distribute information on outages and service incidents.
Create new Real-Time Anonymization Pipeline Template
A new Pipeline Template is created with the following properties:
- Pipeline Template Name - name for the pipeline template created.
- Runtime Environment - Stream 2.0.0 or 3.0.0
- Pipeline Template - New template created by uploading JAR file (provided in zip archive) or by using existing pipeline template (previously created with the template JAR file).
- Pipeline Template Group - The HERE Platform Group whose Members are able to access the Pipeline Template.
- Multi-Region Support - A secondary region is available in case the primary region fails.
- Entry Point Class Name - com.here.platform.extensions.anonymization.stream.AnonymizationStreamingApp
- Input / Output Catalogs - the source / output Catalogs for the location data streams.
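Before submitting the template properties, it can help to check them locally. The sketch below is illustrative only: TemplateSpec and its field names are hypothetical and are not a HERE platform API. The runtime versions and the entry point class name come from the list above; treating both catalogs as mandatory is an assumption.

```python
from dataclasses import dataclass

# Values taken from the property list above.
SUPPORTED_STREAM_RUNTIMES = ("2.0.0", "3.0.0")
ENTRY_POINT = (
    "com.here.platform.extensions.anonymization.stream."
    "AnonymizationStreamingApp"
)


@dataclass(frozen=True)
class TemplateSpec:
    """Hypothetical record of the template properties; not a platform API."""

    name: str
    stream_runtime: str
    jar_path: str
    group_id: str
    input_catalog: str
    output_catalog: str
    entry_point: str = ENTRY_POINT
    multi_region: bool = False

    def problems(self) -> list:
        """Return human-readable problems; an empty list means none found."""
        found = []
        if not self.name.strip():
            found.append("template name is empty")
        if self.stream_runtime not in SUPPORTED_STREAM_RUNTIMES:
            found.append(f"unsupported Stream runtime: {self.stream_runtime}")
        if not self.jar_path.lower().endswith(".jar"):
            found.append("template file is not a JAR")
        if self.entry_point != ENTRY_POINT:
            found.append("unexpected entry point class name")
        # Assumption: both catalogs must be set on the template.
        if not self.input_catalog or not self.output_catalog:
            found.append("input and output catalogs are both required")
        return found
```

A spec whose problems() list is empty matches the properties described above; it says nothing about whether the group or catalogs actually exist on the platform.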
Create new Real-Time Anonymization Pipeline Version
A new Pipeline Version is created with the following properties:
- Version Name - Unique name of this specific Pipeline Version.
- Pipeline Template - specific pipeline template to be used.
- Pipeline ID - specific Pipeline for which version will be created.
- Input / Output Catalogs - catalogs specified here will override those defined for the pipeline template.
- Cluster Configuration - Flink jobmanager and taskmanager size must be configured.
- Runtime Parameters - specific configuration to be used for this Real-Time Anonymization Pipeline Version including streaming layers and anonymization method. More details available here.
- Cost Allocation Tag - used for allocating costs of this pipeline version.
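The catalog override described for the Pipeline Version can be pictured as a simple merge. This is a minimal sketch, assuming the override applies per entry (a catalog left unset on the version falls back to the template's value); the platform's exact behaviour is not stated here. The function name, the dictionary keys and the HRN-like strings are hypothetical placeholders.

```python
from typing import Dict, Optional


def effective_catalogs(
    template_catalogs: Dict[str, str],
    version_catalogs: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Combine template and version catalogs; version entries win.

    Assumption: empty or missing version entries do not override
    the value defined on the pipeline template.
    """
    merged = dict(template_catalogs)
    for key, value in (version_catalogs or {}).items():
        if value:
            merged[key] = value
    return merged


# Hypothetical catalog identifiers, for illustration only.
template = {"input": "hrn:example:template-input",
            "output": "hrn:example:template-output"}
version = {"output": "hrn:example:version-output"}

resolved = effective_catalogs(template, version)
```

Here resolved keeps the template's input catalog and takes the output catalog from the version, which is the behaviour the Input / Output Catalogs property describes under the stated assumption.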