Configuration

The Data Client Library is configured via Typesafe Config. Usually this means that you provide an application.conf which contains all the application-specific settings that differ from the default ones provided by the reference.conf configuration files from the individual Data Client Library modules.
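
For example, a minimal application.conf that overrides two of the defaults shown below could look like this (the values are illustrative, not recommendations):

here.platform.data-client {
  # switch from the default "normal-pace" retry policy
  retry-policy.type = "best-effort"

  # appended to the User-Agent header of each outgoing request
  user-agent-suffix = "my-app/1.0"
}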

These are the relevant default configuration values for the Data Client Library modules.

Client Core
##########################################
# Data Client Reference Config File #
##########################################
include classpath("here-cacerts.conf")

#akka {
#  # this is the minimum log level of events that will be forwarded by akka to slf4j logger.
#  # slf4j backend configuration can define higher log level in its configuration (i.e. in logback.xml or in log4j.xml).
#  loglevel = "DEBUG"
#  loggers = ["akka.event.slf4j.Slf4jLogger"]
#  logging-filter = "akka.event.slf4j.Slf4jLoggingFilter"
#  logger-startup-timeout = 1m
#}

# This is the reference config file that contains default settings for Data Client.
# Any application-specific settings that differ from the default ones provided here should be set in your application.conf.
here.platform.data-client {
  # Application-defined string to be appended to the User-Agent header and forwarded on each request.
  #
  # user-agent-suffix = "<application user agent>"

  # Define timeout policy during retries.
  retry-policy {
    # The type of policy. Possible values are 'fail-fast', 'normal-pace' and 'best-effort'. Default is 'normal-pace'.
    #
    # For each of these types there is a per request timeout, an overall timeout and
    # a maximum waiting time between retries like this:
    #
    # retry type  | per request timeout | overall timeout | max. wait time
    # ------------+---------------------+-----------------+----------------
    # fail-fast   | 5 seconds           | 35 seconds      | 10 sec
    # normal-pace | 131 seconds         | 195 seconds     | 60 sec
    # best-effort | 5 minutes           | 61 minutes      | 60 sec
    #
    # Note: This is not applicable for streaming which uses a general overall timeout of 3 minutes.
    #
    type = "normal-pace"
  }

  timeouts {
    # All timeouts use the syntax of class scala.concurrent.duration.Duration, which is roughly (for details see the scala doc)
    # <length><space><unit> where length is some integer and unit is one of
    # d, day, h, hour, min, minute, s, sec, second, ms, milli, millisecond, µs, micro, microsecond, ns, nano, nanosecond and their pluralized forms
    # (every form except the first one mentioned for each unit can be pluralized, i.e. no "ds", but "days")
    # Infinity is defined as "Inf" without length or unit
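    # For example: "500 ms", "10 minutes", "1 d" and "Inf" are all valid values.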

    # This timeout defines the overall deadline for uploading all parts of an upload to a durable volume.
    # durable-write-volume-overall = "10 minutes"
  }

  # Dispatcher used for blocking I/O operations
  # performed by the Data Client
  blocking-io-dispatcher {
    type = Dispatcher
    executor = "fork-join-executor"
    fork-join-executor {
      parallelism-min = 1
      parallelism-factor = 0.8
      parallelism-max = 4
    }
    throughput = 1
  }

  # Define the proxy configuration. The credentials key is optional.
  #
  # proxy {
  #   host = "localhost"
  #   port = 9999
  #
  #   credentials {
  #     username: "user"
  #     password: "pass"
  #   }
  # }

  # Component responsible for executing http requests.
  # With a custom implementation it is possible to replace the default akka-http client with any other.
  #
  # Fully qualified class name of the RequestExecutor interface implementation.
  # The Class must be public, have a public constructor with
  # (com.typesafe.config.Config, clientExecutionContext: ClientExecutionContext) parameters.
  # Any implementation MUST follow http redirects.
  request-executor {
    implementation = "com.here.platform.data.client.http.StdRequestExecutor"

    # Maximum number of redirects allowed
    # Applicable only with default implementation, must be >= 0
    max-redirects = 5

    # Please note that this section mirrors `akka.http.host-connection-pool`,
    # however it is used only for requests made with akka-http through StdRequestExecutor
    akka.http.host-connection-pool {
      # The maximum number of times failed requests are attempted again
      # (if the request can be safely retried) before giving up and returning an error.
      # Set to zero to completely disable request retries.
      # Retries apply to TCP errors only.
      max-retries = 5

      # The maximum number of parallel connections that a connection pool to a
      # single host endpoint is allowed to establish. Must be greater than zero.
      max-connections = 16

      # The maximum number of open requests accepted into the pool across all
      # materializations of any of its client flows.
      # Protects against (accidentally) overloading a single pool with too many client flow materializations.
      # Note that with N concurrent materializations the max number of open requests in the pool
      # will never exceed N * max-connections * pipelining-limit.
      # Must be a power of 2 and > 0!
      max-open-requests = 16

      # The time after which an idle connection pool (without pending requests)
      # will automatically terminate itself. Set to `infinite` to completely disable idle timeouts.
      idle-timeout = 60 sec

      # The "new" pool implementation will fail a connection early and clear the slot if a response entity was not
      # subscribed during the given time period after the response was dispatched. In busy systems the timeout might be
      # too tight if a response is not picked up quickly enough after it was dispatched by the pool.
      response-entity-subscription-timeout = 15s

      # Modify to tweak client settings for host connection pools only.
      #
      # IMPORTANT:
      # Please note that this section mirrors `akka.http.client` however is used only for pool-based APIs,
      # such as `Http().superPool` or `Http().singleRequest`.
      client = {
        # The time after which an idle connection will be automatically closed.
        # Set to `infinite` to completely disable idle timeouts.
        idle-timeout = 60 sec

        parsing.max-chunk-size = 10m
      }
    }
  }

  # Turns on/off HERE certificate pinning
  ssl-config.pin-here-certificates = true

  # Discovery of baseUrls of various Data Service APIs like publish, metadata, query, etc.
  endpoint-locator {
    # Determines which environment to use for the discovery service's endpoints.
    # Possible values are: 'here', 'here-dev', 'here-cn', 'here-cn-dev', 'custom'.
    # If 'custom' is specified then 'discovery-service-url' property MUST be set.
    discovery-service-env = here

    # Defines a URL for a custom discovery service endpoint.
    # discovery-service-url = "<custom discovery service URL>"

    # Fully qualified class name of the RequestExecutor interface implementation.
    # The Class must be public, have a public constructor with
    # (com.typesafe.config.Config, clientExecutionContext: ClientExecutionContext) parameters.
    # Any implementation MUST follow http redirects.
    request-executor: ${here.platform.data-client.request-executor}
  }

  # Component responsible for signing all outgoing http requests.
  request-signer {
    # Fully qualified class name of the RequestSigner interface implementation.
    # The Class must be public, have a public constructor with
    # (com.typesafe.config.Config, clientExecutionContext: ClientExecutionContext) parameters.
    implementation = "com.here.platform.data.client.settings.DefaultHttpRequestSigner"

    # Billing tag; if provided, it will be added to every outgoing request.
    # billing-tag = "example_billing_tag"

    # Define credentials which will be used to sign outgoing requests.
    # If this configuration is omitted then credentials from the ~/.here/credentials.properties file will be used.
    # Otherwise only one of the "file-path", "here-account", "here-token" configurations should be specified.
    # credentials {
      # Absolute path of a properties file in the file system. The file should contain the following properties:
      #     here.token.endpoint.url
      #     here.client.id
      #     here.access.key.id
      #     here.access.key.secret
      #     here.token.scope [optional]
      # file-path = "/path/credentials.properties"

      # Settings for HERE account credentials, to sign any outgoing requests.
      # here-account {
      #   here-token-endpoint-url = "https://elb.cn-northwest-1.account.hereapi.cn/oauth2/token"
      #   here-client-id = "example-client-id"
      #   here-access-key-id = "example-access-key-id"
      #   here-access-key-secret = "example-access-key-secret"
      #   here-token-scope = "example-project-hrn [optional]"
      # }

      # Settings for HERE Token credentials, to sign any outgoing requests.
      # here-token = "example-token"

    # }

    # Fully qualified class name of the RequestExecutor interface implementation.
    # The Class must be public, have a public constructor with
    # (com.typesafe.config.Config, clientExecutionContext: ClientExecutionContext) parameters.
    # Any implementation MUST follow http redirects.
    request-executor: ${here.platform.data-client.request-executor}
  }
}
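
For example, the optional proxy section above could be enabled in application.conf like this (host, port, and credentials are placeholders):

here.platform.data-client.proxy {
  host = "proxy.example.com"
  port = 8080

  credentials {
    username = "proxy-user"
    password = "proxy-pass"
  }
}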

Data Client
########################################
# Data Client Reference Config File #
########################################

# This is the reference config file that contains default settings for Data Client.
# Any application-specific settings that differ from the default ones provided here should be set in your application.conf.
here.platform.data-client {

  # To create queryApi/publishApi/etc. the catalog configuration is mandatory.
  # The request to fetch the catalog configuration blocks on object creation.
  # This property defines the max await timeout and MUST be bigger than 5s.
  # Because all operations are timeout bounded, by default it awaits until the underlying operation fails.
  await-configuration-on-creation-timeout = infinite

  # Timeout that controls the maximum time of polling the catalog configuration at the moment of its creation in case of a 403 response status.
  await-permissions-on-creation-timeout = 90s

  enable-flink-kafka-metrics = false

  config {

    # Delay between two subsequent requests to fetch status of create / delete / update catalog
    status-poll-interval = 5s

    # Fully qualified class name of the RequestExecutor interface implementation.
    # The Class must be public, have a public constructor with
    # (com.typesafe.config.Config, clientExecutionContext: ClientExecutionContext) parameters.
    # Any implementation MUST follow http redirects.
    request-executor: ${here.platform.data-client.request-executor}
  }

  metadata {

    # The time after which an idle connection to the server will be automatically closed.
    # It MUST be lower than {here.platform.data-client.metadata.request-executor.akka.http.host-connection-pool.client.idle-timeout}
    # When a metadata service request contains a byte range header, the server can keep the connection idle for some time
    # before sending data; the current max value on the server side is 1 min.
    # The server supports resuming a download, so this value also defines a timeout after which,
    # if no data passes through, the connection will be restarted.
    idle-timeout = 1m

    # Fully qualified class name of the RequestExecutor interface implementation.
    # The Class must be public, have a public constructor with
    # (com.typesafe.config.Config, clientExecutionContext: ClientExecutionContext) parameters.
    # Any implementation MUST follow http redirects.
    request-executor: ${here.platform.data-client.request-executor} {
      akka.http.host-connection-pool {
        max-connections = 128
        max-open-requests = 128
      }
    }
  }

  publish {

    # Delay between two subsequent requests to fetch commit token status
    commit-poll-interval = 15s

    # Max. number of threads publishing to Kafka in parallel.
    # Event ordering is guaranteed only with a value of 1, which has
    # a huge performance impact.
    # Applies to direct Kafka mode only.
    # @deprecated use here.platform.data-client.ingest.kafka-publish-parallelism
    kafka-publish-parallelism = 2048

    # Fully qualified class name of the RequestExecutor interface implementation.
    # The Class must be public, have a public constructor with
    # (com.typesafe.config.Config, clientExecutionContext: ClientExecutionContext) parameters.
    # Any implementation MUST follow http redirects.
    request-executor: ${here.platform.data-client.request-executor} {
      akka.http.host-connection-pool {
        max-connections = 256
        max-open-requests = 256
      }
    }
  }

  ingest {

    # Max. number of threads publishing to Kafka in parallel.
    # Event ordering is guaranteed only with a value of 1, which has
    # a huge performance impact.
    # Applies to direct Kafka mode only.
    # kafka-publish-parallelism = 2048

    # Fully qualified class name of the RequestExecutor interface implementation.
    # The Class must be public, have a public constructor with
    # (com.typesafe.config.Config, clientExecutionContext: ClientExecutionContext) parameters.
    # Any implementation MUST follow http redirects.
    request-executor: ${here.platform.data-client.request-executor} {
      akka.http.host-connection-pool {
        max-connections = 256
        max-open-requests = 256
      }
    }
  }

  query {

    # Fully qualified class name of the RequestExecutor interface implementation.
    # The Class must be public, have a public constructor with
    # (com.typesafe.config.Config, clientExecutionContext: ClientExecutionContext) parameters.
    # Any implementation MUST follow http redirects.
    request-executor: ${here.platform.data-client.request-executor} {
      akka.http.host-connection-pool {
        max-connections = 512
        max-open-requests = 512
      }
    }

    # Defines the behaviour of querying index partitions
    # Default value is `true`, which means that index partitions will be queried in parts.
    # When it is `false` the index partitions will be queried as a whole.
    query-by-parts = true
  }

  stream {
    # Define the type of connector used with any stream connection. Valid values are:
    # "http-connector" - Uses the HTTP API for connections
    # "kafka-connector" - Uses Kafka through kafka-support (must be available)
    connector {
      # Defines the connector used to consume data.
      consumer = "kafka-connector"
      # Defines where notifications are retrieved from.
      consumer-notification = "http-connector"
      # Defines the connector used to publish data.
      producer = "kafka-connector"
    }

    # Maximum size for an embedded payload. If the payload size is bigger than this value,
    # the data will first be uploaded to the blobstore and later sent to the stream by reference.
    max-embedded-data-size = 1024 kB

    # Fully qualified class name of the RequestExecutor interface implementation.
    # The Class must be public, have a public constructor with
    # (com.typesafe.config.Config, clientExecutionContext: ClientExecutionContext) parameters.
    # Any implementation MUST follow http redirects.
    request-executor: ${here.platform.data-client.request-executor} {
      akka.http.host-connection-pool {
        # Managing a stream web connection requires consistent pooling for data, leasing and commits.
        max-connections = 128
        max-open-requests = 128
      }
    }
  }

  index {
    # The time after which a request will be retried
    idle-timeout = 600s
    # Fully qualified class name of the RequestExecutor interface implementation.
    # The Class must be public, have a public constructor with
    # (com.typesafe.config.Config, clientExecutionContext: ClientExecutionContext) parameters.
    # Any implementation MUST follow http redirects.
    request-executor: ${here.platform.data-client.request-executor}
  }

  artifact {
    # Defines the name of the `RequestExecutor` for private access to the `artifact` service.
    request-executor: ${here.platform.data-client.request-executor}
  }

  interactiveMap {
    # Defines the name of the `RequestExecutor` for private access to the `interactiveMap` service.
    request-executor: ${here.platform.data-client.request-executor}
  }
}
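
For instance, to enlarge the connection pool used for metadata queries beyond the default of 512 shown above, only the nested pool settings need to be overridden in application.conf (the values are illustrative; max-open-requests must remain a power of 2):

here.platform.data-client.query.request-executor {
  akka.http.host-connection-pool {
    max-connections   = 1024
    max-open-requests = 1024
  }
}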

Blobstore Client
########################################
# Data Client Reference Config File #
########################################

# This is the reference config file that contains default settings for Data Client.
# Any application-specific settings that differ from the default ones provided here should be set in your application.conf.
here.platform.data-client {

  blobstore {
    # The strategy for uploading a file depends on its size and the multi-region parameter. If the size is less than this property's value
    # then a single upload strategy applies. Otherwise a multipart strategy applies.
    # Single uploading is disabled when the catalog is multi-region.
    # The maximum value is 50 MiB. The minimum value is 0.
    use-multi-part-upload-from = 30 MiB

    # Parallelism for part upload during multipart upload.
    # Allowed values are from 1 to 100
    multi-part-upload-parallelism = 2

    # Buffer size for keeping upload data in memory before sending it to the server.
    # For performance optimizations the actual buffer allocation can be 2x more.
    # Must be bigger than 5 MiB
    upload-buffer-size = 10 MiB

    # The time after which an idle connection to the blobstore server will be automatically closed.
    # It MUST be lower than {blobstore-client.metadata.request-executor.akka.http.host-connection-pool.client.idle-timeout}
    idle-timeout = 15s

    # Define which algorithm will be used to automatically calculate metadata.checksum
    # of data published by WriteEngine.
    checksum-algorithm: "SHA-1"

    # Fully qualified class name of the RequestExecutor interface implementation.
    # The Class must be public, have a public constructor with
    # (com.typesafe.config.Config, clientExecutionContext: ClientExecutionContext) parameters.
    # Any implementation MUST follow http redirects.
    request-executor: ${here.platform.data-client.request-executor} {
      akka.http.host-connection-pool {
        max-connections   = 512
        max-open-requests = 512
      }
    }
  }

}
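
A sketch of tuning the upload behaviour within the documented bounds (threshold at most 50 MiB, parallelism between 1 and 100, buffer above 5 MiB); the exact numbers are illustrative:

here.platform.data-client.blobstore {
  use-multi-part-upload-from    = 50 MiB
  multi-part-upload-parallelism = 8
  upload-buffer-size            = 16 MiB
}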

Flink Support
########################################
# Data Client Reference Config File #
########################################

# This is the reference config file that contains default settings for Data Client.
# Any application-specific settings that differ from the default ones provided here should be set in your application.conf.
here.platform.data-client {

  flink-support {

    # HERE Resource Name (HRN), a unique identifier of the catalog.
    # olp.catalog.hrn = "Whatever catalog HRN"

    # An identifier for a layer that should be unique within a catalog.
    # olp.catalog.layer-id = "Whatever layer-id"

    # Describes the structure of a partition in a catalog layer.
    # It is applicable only for the parquet and avro data formats.
    # olp.catalog.layer-schema = "Whatever layer-schema"

    # An interval in milliseconds that defines how often the sink should aggregate rows with the same index columns together.
    # Default value is 10000 milliseconds.
    # It is applicable only for the avro and parquet formats.
    olp.connector.aggregation-window = 10000

    # The maximum number of blobs that are being read in parallel in one flink task.
    # Default value is 10.
    olp.connector.download-parallelism = 10

    # The overall timeout in milliseconds that is applied for reading a blob from the Blob API.
    # Default value is 300000 milliseconds.
    olp.connector.download-timeout = 300000

    # An interval in milliseconds that defines the length of the time window used to perform the batching of the metadata for the publication.
    # All partitions which need to be published in a given time window defined by this attribute will be published together as one.
    # Default value is 1000 milliseconds; 0 disables the metadata batching & each partition is published separately.
    olp.connector.publication-window = 1000

    # The maximum number of parallel batches that are allowed for the publication.
    # Default value is 20.
    olp.connector.publication-parallelism = 20

    # The maximum number of blobs that are being written in parallel in one flink task.
    # Default value is 10.
    olp.connector.upload-parallelism = 10

    # The overall timeout in milliseconds that is applied for writing a blob to the Blob API.
    # Default value is 300000 milliseconds.
    olp.connector.upload-timeout = 300000

    # It is used to derive/compose the group ID settings of the Kafka consumer config.
    # olp.kafka.group-name = "Whatever group-name"

    # Offset is the sequential ID number that uniquely identifies each data record.
    # It can be set to either the "earliest" or the "latest" value.
    # It is translated to the Kafka auto.offset.reset consumer config.
    olp.kafka.offset = "earliest"

    # A string written in the RSQL query language that is used to query the index layer.
    # olp.layer.query = "Whatever RSQL query"
  }

  stream.kafka.multi-region {
    regionChangeListener {
      enabled = false
      poll.interval {
        ms = 100
        backoff {
          ms = 100
          max.ms = 1000
        }
      }
    }
    timestampConsumer.parallelism = 5
  }
}
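
As an example, the Flink connector defaults above could be tuned in application.conf as follows (the values are illustrative):

here.platform.data-client.flink-support {
  olp.connector.download-parallelism = 20
  olp.connector.upload-parallelism   = 20
}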

Spark Support
########################################
# Data Client Reference Config File #
########################################

# This is the reference config file that contains default settings for Data Client.
# Any application-specific settings that differ from the default ones provided here should be set in your application.conf.
here.platform.data-client {

  spark-support {

    # Indicates that the delete operation for the volatile layer removes only the data.
    # Default value is true.
    olp.volatile.delete-data-only = true

    # The overall timeout in milliseconds that is applied for decompressing the entire data of the partition.
    # Default value is 600000 milliseconds.
    olp.connector.data-decompression-timeout = 600000

    # The overall timeout in milliseconds that is applied for reading the partitions.
    # Default value is 60000 milliseconds.
    olp.connector.read-timeout = 60000

    # Indicates whether the Spark connector must check partition integrity before reading or not
    olp.connector.check-invalid-partitions = false

    # Indicates whether the Spark connector should ignore invalid partitions or throw an exception. When
    # `olp.connector.check-invalid-partitions` is set to false, this has no effect.
    olp.connector.ignore-invalid-partitions = false

    # Indicates the checksum algorithm to be used for checking the checksum of a partition in case the layer
    # digest attribute is undefined. By default it is "MD5". Possible values are: "MD5", "SHA-1" and "SHA-256".
    olp.connector.default-checksum = "MD5"
  }
}
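
For example, partition integrity checking could be switched on from application.conf like this (illustrative, not a recommendation):

here.platform.data-client.spark-support {
  olp.connector.check-invalid-partitions  = true
  # skip invalid partitions instead of throwing an exception
  olp.connector.ignore-invalid-partitions = true
}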

Stream Support
########################################
# Data Client Reference Config File #
########################################

# This is the reference config file that contains default settings for Data Client.
# Any application-specific settings that differ from the default ones provided here should be set in your application.conf.
here.platform.data-client {
  stream {
    kafka {
      implementation = "com.here.platform.data.client.service.stream.kafka.DirectKafkaImpl"

      # Defines the Kafka protocol versions supported by the client
      supportedKafkaProtocolVersion = ["0.10"]

      consumer {
        properties {
          auto.commit.interval.ms = "1000"
          session.timeout.ms      = "30000"
          metadata.max.age.ms     = "30000"
          request.timeout.ms      = "35000"
          connections.max.idle.ms = "35000"
        }
      }

      producer {
        properties {
          retries = "50"
          retry.backoff.ms = "200"
          reconnect.backoff.ms = "250"
          reconnect.backoff.max.ms = "10000"
          linger.ms = "3"
          # A reduced buffer causes Kafka to block early in case it cannot keep up with
          # the speed of publishing. Otherwise it throws a timeout exception.
          buffer.memory = "1048576"
          max.in.flight.requests.per.connection = "4"
          acks = "all"
          # Should be larger than the service timeout, which is 10000.
          # This timeout is extended to take into account the time spent in the Kafka producer buffer
          # in case messages are produced faster than they can be sent.
          request.timeout.ms = "60000"
        }
      }

      # Configuration for multi region stream layer.
      multi-region {
        # regionChangeListener monitors Stream REST's health to determine when a region change occurs.
        regionChangeListener {
          enabled = false
          poll.interval {
            # The frequency at which to check if a region change has occurred.
            ms = 100
            backoff {
              # When a poll failure occurs, this defines how much to increase the poll interval.
              # It will be reset on a success.
              ms = 100
              # Maximum amount of time to wait between polls; the backoff will not exceed this value.
              max.ms = 1000
            }
          }
        }
        # Timestamp consumers consume replicated message timestamp metadata. This value
        # specifies the number of consumers that will be created and run in parallel.
        # The minimum value is 1 and the maximum value is 50, with the default being 5.
        timestampConsumer.parallelism = 5
      }
    }
  }
}
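
A sketch of overriding individual Kafka properties from application.conf (the values are illustrative, not tuned recommendations):

here.platform.data-client.stream.kafka {
  consumer.properties {
    auto.commit.interval.ms = "5000"
  }
  producer.properties {
    linger.ms = "10"
  }
}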

Flink and Spark Connectors

The Flink and Spark connectors offer a set of common configuration properties plus a dedicated set of configuration properties per connector type. Unlike the other modules of the Data Client Library, the connectors do not use Typesafe Config but a plain key-value map for their own configuration. This means all configuration properties in this chapter should be used as is, without adding any prefix. Nevertheless, the connectors do use the functions of the other Data Client modules, so those configuration properties apply as written in the chapters above, depending on the use case and layer type.
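
For example, a connector configuration map could contain entries like the following (the catalog HRN and layer ID are placeholders); note that the keys carry no here.platform.data-client prefix:

olp.catalog.hrn                    = "hrn:here:data::realm:example-catalog"
olp.catalog.layer-id               = "example-layer"
olp.connector.download-parallelism = 20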

Common Properties

  1. Related to catalogs

    • olp.catalog.layer-format (STRING) - defines/overrides the format to be used for the layer. It shall use the same values as the content type from the layer definition.
    • olp.catalog.layer-schema (STRING) - defines/overrides the schema to be used for the layer. It shall use the HRNs of the layer schemas.
  2. Related to Data APIs

    • olp.connector.query-parallelism (LONG) - defines the number of sub-queries and indirectly sets the level of parallelism used for querying the metadata. Default value is 20.
    • olp.connector.download-parallelism (LONG) - defines the level of parallelism used for reading the data (partition payloads). Default value is 10.
    • olp.connector.download-timeout (LONG) - defines the timeout for reading the data (partition payloads). Default value is 300000. Values are expressed in milliseconds.
    • olp.connector.upload-parallelism (LONG) - defines the level of parallelism used for writing the data (partition payloads). Default value is 10.
    • olp.connector.upload-timeout (LONG) - defines the timeout for writing the data (partition payloads). Default value is 300000. Values are expressed in milliseconds.
    • olp.connector.publication-timeout (LONG) - defines the timeout for metadata publication (index only). Default value is 300000.
    • olp.connector.metadata-columns (BOOLEAN) - controls if the connector is providing or expecting metadata columns.

      false (default value) - the metadata columns are not provided/expected

      true - the metadata columns are provided/expected
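
A sketch combining several of the common properties above in one connector map (all values are illustrative):

olp.catalog.layer-format        = "application/x-protobuf"
olp.connector.query-parallelism = 40
olp.connector.metadata-columns  = true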

Spark Connector

  1. Related to Data APIs

    • olp.volatile.delete-data-only (BOOLEAN) - indicates that the delete operation for the volatile layer removes only the data, without modifying the metadata. Default value is true.
  2. Miscellaneous

    • olp.connector.force-raw-data (BOOLEAN) - controls whether the connector forces the use of raw data or decodes the payload of the blob based on the layer configuration.

      true - indicates to treat the layer as containing unstructured/raw data (octet-stream)
      false (default value) - the payload file will be decoded according to the layer configuration

    • olp.connector.data-decompression-timeout (LONG) - defines the timeout for decompressing the entire data of the partition (partition payloads). Default value is 600000. Values are expressed in milliseconds.

    • olp.connector.read-timeout (LONG) - defines the overall timeout in milliseconds that is applied for reading the partitions. Default value is 60000.
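
A sketch of Spark-connector-specific properties using the options above (the values are illustrative):

olp.connector.force-raw-data             = true
olp.connector.read-timeout               = 120000
olp.connector.data-decompression-timeout = 900000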

Flink Connector

  1. Related to catalogs

    • olp.catalog.hrn (STRING) - HRN of the catalog
    • olp.catalog.layer-id (STRING) - ID of the layer
  2. Related to Data APIs

    • olp.layer.query (STRING) - RSQL query predicate to be used when requesting data from the layer
    • olp.kafka.offset (STRING) - when reading data from the streaming layer, specifies the offset to start consuming from; it can be either "earliest" or "latest" and is translated to the Kafka auto.offset.reset consumer config
    • olp.kafka.group-name (STRING) - when reading data from the streaming layer, specifies the name used to derive/compose the group ID settings of the Kafka consumer config
    • olp.layer.output.path (STRING) - path to be used as an output folder for the objectstore layer type
  3. Stream control

    • olp.connector.aggregation-window (LONG) - defines the length of the time window used to perform the aggregation of the rows with the same partitioning information (volatile and index). All rows which need to be written out in a given aggregation window defined by this attribute will be grouped together into a single partition file (payload file). A side effect is the added latency before the data is written out to the layer. Default value is 10000; 0 disables the aggregation and each row is written out as a separate partition. Values are expressed in milliseconds.
    • olp.connector.publication-window (LONG) - defines the length of the time window used to perform the batching of the metadata for the publication (volatile and index). All partitions which need to be published in a given time window defined by this attribute will be published together as one. A side effect is the added latency before the metadata is written out to the layer. Default value is 1000; 0 disables the metadata batching and each partition is published separately. Values are expressed in milliseconds.
    • olp.connector.publication-parallelism (LONG) - defines the level of parallelism used for metadata publication (index only). Default value is 20.
    • olp.connector-refresh-interval (LONG) - defines the interval for detecting changes in the layer which might result (depending on the query predicate) in the data being streamed to the consumer. Default value is 60000. Values are expressed in milliseconds. -1 disables the refresh altogether.

    • here.platform.data-client.enable-flink-kafka-metrics (BOOLEAN) - enables emitting Kafka consumer and producer metrics through Flink.

      true - Kafka metrics are reported when using Flink.

      false (default value) - Kafka metrics are not reported through Flink.
  4. Miscellaneous

    • olp.connector.force-raw-data (BOOLEAN) - controls whether the connector forces the use of raw data or decodes the payload of the blob based on the layer configuration.

      true - indicates to treat the layer as containing unstructured/raw data (octet-stream)

      false (default value) - the payload file will be decoded/encoded according to the layer configuration

    • olp.connector.mode (STRING) - controls the schema of the interactive map Flink table.

      read - row schema for reading (contains all columns)

      write - row schema for writing (contains writable columns only)

    • olp.connector.max-features-per-request (INT) - limits the number of features requested from the interactive map layer by the connector in a single call. Adjust this if the layer contains very big features; the default is 10000.
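
A sketch of Flink connector stream-control tuning using the properties above (the values are illustrative):

olp.connector.aggregation-window = 5000   # group rows into 5-second windows
olp.connector.publication-window = 0      # publish each partition separately
olp.connector-refresh-interval   = -1     # disable change detection entirely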

Other Modules

The other Data Client Library modules do not offer any configuration with Typesafe Config.
