The Data Processing Library contains the following modules:

  • batch-core — provides the core functionality, including the Driver, DriverContext, and DriverTask abstractions and all compilation patterns.
  • batch-catalog — provides the Catalog abstraction to access catalogs via Spark. This module contains the abstract interfaces only.
  • batch-catalog-dataservice — provides the implementation of the batch-catalog abstractions for the HERE Data API.
  • pipeline-runner — provides the PipelineRunner class that constitutes the entry point of a batch pipeline. This module depends on batch-core.
  • batch-core-java — Java bindings for the batch-core module.
  • batch-catalog-java — Java bindings for the batch-catalog module.
  • pipeline-runner-java — Java bindings for the pipeline-runner module.
  • batch-validation — provides a set of classes and DeltaSet transformations to implement data validation pipelines.
  • batch-validation-scalatest — provides ScalaTest bindings to implement data validation suites using the ScalaTest domain-specific language.

To use the Data Processing Library in your Scala applications, it is sufficient to include pipeline-runner and batch-catalog-dataservice as dependencies.

For Java applications, you also need to include pipeline-runner-java as a dependency.
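
As an illustration of the two paragraphs above, here is a minimal sbt sketch of the dependency declarations. The module names come from the list above; the group ID and the version string are assumptions made for this example and should be checked against Dependency Management before use.

```scala
// build.sbt (sketch only). The group ID and version are assumptions,
// not confirmed values; take the real coordinates from Dependency Management.
val dplOrg = "com.here.platform.data.processing" // assumed group ID
val dplVersion = "x.y.z"                         // placeholder version

// Scala applications: these two modules are sufficient.
libraryDependencies ++= Seq(
  dplOrg %% "pipeline-runner" % dplVersion,
  dplOrg %% "batch-catalog-dataservice" % dplVersion
)

// Java applications additionally need the Java bindings:
// libraryDependencies += dplOrg %% "pipeline-runner-java" % dplVersion
```

The `%%` operator appends the Scala binary version to the artifact name, which this sketch assumes the published artifacts require. The remaining modules, such as batch-core, are expected to be resolved transitively.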

For more information on how to manage dependencies, see Dependency Management.
