Vehicle Road Coverage Extractor

Overview

Vehicle Road Coverage Extractor allows users to perform large-scale data analytics to understand vehicle road coverage and sensor data as a function of multiple variables. This customizable pipeline template processes archived SDII data stored in the versioned/index layer to calculate the vehicle road coverage of this data. It consists of two pipelines:

  1. vehicle-road-coverage-processor-pipeline
  2. vehicle-road-coverage-aggregator-pipeline

The vehicle-road-coverage-processor-pipeline reads SDII data from the versioned/index layer. It map-matches the data and calculates the total number of probes and messages in each segment of the given bounding box for every 12-hour window in the given time period. It publishes the results for each 12-hour window as a CSV in a HERE Tile partition.
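The 12-hour bucketing described above can be sketched as follows. This is a minimal illustration only; the pipeline's actual partitioning logic and record fields (`segment_id`, `ts`) are hypothetical here:

```python
from datetime import datetime, timezone

def window_index(timestamp_s: float, period_start_s: float) -> int:
    """Assign an epoch timestamp (seconds) to a 12-hour window of the time period."""
    return int((timestamp_s - period_start_s) // (12 * 3600))

# Example: bucket probe messages (hypothetical records) by 12-hour window.
period_start = datetime(2023, 1, 1, tzinfo=timezone.utc).timestamp()
probes = [
    {"segment_id": "seg-1", "ts": period_start + 3600},       # hour 1  -> window 0
    {"segment_id": "seg-1", "ts": period_start + 13 * 3600},  # hour 13 -> window 1
]
buckets = {}
for probe in probes:
    buckets.setdefault(window_index(probe["ts"], period_start), []).append(probe)
```

Each bucket would then be counted per segment and published as one CSV result.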

Each version in the pipeline's output catalog corresponds to the aggregated results for one 12-hour window of the time period. This mapping of time windows to output versions is also published in the log-mapping layer of the output catalog.

The vehicle-road-coverage-aggregator-pipeline aggregates the results published for every 12 hours by the vehicle-road-coverage-processor-pipeline and calculates the coverage information for the entire time period given by the user.
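Conceptually, the aggregation sums the per-window counts for each segment across all published 12-hour results. A minimal sketch of that idea, where the dictionary shape and field names are illustrative and not the pipeline's actual schema:

```python
from collections import defaultdict

def aggregate(windows):
    """Sum per-12-hour message/probe counts per segment across all windows."""
    totals = defaultdict(lambda: {"messages": 0, "probes": 0})
    for window in windows:
        for seg_id, counts in window.items():
            totals[seg_id]["messages"] += counts["messages"]
            totals[seg_id]["probes"] += counts["probes"]
    return dict(totals)

# Two hypothetical 12-hour result sets for the same bounding box.
window_a = {"seg-1": {"messages": 10, "probes": 4}}
window_b = {"seg-1": {"messages": 5, "probes": 2},
            "seg-2": {"messages": 1, "probes": 1}}
result = aggregate([window_a, window_b])
```

The real pipeline performs this reduction over the versions published by the processor pipeline.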

For each segment in a HERE Tile, the final output reports the number of messages, the number of probes, and the hourly message distribution, along with segment attributes such as segment ID, direction (Forward/Backward), functional class, length, and autoAccess (automobile access of the segment).
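A small sketch of reading such a per-segment CSV result. The column names and order below are hypothetical, chosen to mirror the attributes listed above; consult the pipeline's output schema for the actual layout:

```python
import csv
import io

# Hypothetical CSV layout mirroring the attributes listed above.
sample = """segmentId,direction,functionalClass,length,autoAccess,numMessages,numProbes
seg-1,Forward,3,152.4,true,120,37
seg-1,Backward,3,152.4,true,98,30
"""

rows = list(csv.DictReader(io.StringIO(sample)))
total_messages = sum(int(row["numMessages"]) for row in rows)
```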

The result can be analyzed using the Jupyter notebook provided in the vehicle-road-coverage-aggregator-pipeline/notebooks directory.

Structure

Architectural Diagram

Figure 1. Application Flow Diagram

Legend: Legend Diagram

Prerequisites

  • This pipeline template writes output to a Versioned layer of the catalog. You can use your existing output layer or let the Wizard script create a new catalog/layer for you. Please refer to the Execution section below for further details.
  • If you plan to use an existing catalog/layer, make sure the output catalog is shared with the GROUP_ID you will use to deploy this pipeline template.
  • Confirm that your local credentials (~/.here/credentials.properties) are added to the same group.

Execution

Running on the Platform

To deploy and run this pipeline template, you will need the Wizard Deployer. The Wizard runs interactively, asking questions about the application and expecting answers from the user. Follow the Wizard's documentation to set up the needed parameters, then follow these steps:

  1. Execute the script as ./wizard.sh
  2. Follow the prompts and provide needed answers

You can use your existing output layer or let the Wizard create a new catalog/layer for you. If you are using an existing catalog, make sure it is shared with the GROUP_ID that will be used for this deployment.

PLEASE NOTE: You will need to run the Wizard twice, first for the vehicle-road-coverage-processor-pipeline. Once the processor pipeline finishes publishing results to the output catalog, run the Wizard again for the vehicle-road-coverage-aggregator-pipeline to get the final aggregated results. See the README.md of each pipeline for further instructions.

Verification

In the Platform Portal, select the Pipelines tab, where you can see your Pipeline deployed and running. After your Pipeline finishes and your data is published, you can find your output catalog under the Data tab and inspect your data visually, or query/retrieve your data programmatically.

Cost Estimation

Executing this pipeline template will incur the following costs:

Storage-Blob

Cost will depend on the amount of data that will be published to a Versioned layer as an output from execution.

Data Transfer IO

Cost will depend on:

  • the amount of data read from the input catalog and the HMC catalog (this depends on the bounding box you specify)
  • the amount of data published to your output layer

Metadata

Cost will depend on the number and size of partitions (metadata) stored in the Versioned layer.

Compute Core and Compute RAM

Cost will depend on the number of workers you configure when running the Wizard.

Log Search IO

Cost will depend on the log level set for the execution of this pipeline template. To minimize this cost, set the log level to WARN.

Support

If you need support with this pipeline template, please contact us.
