Data API - changes, deprecations and known issues

Summary of changes

January 2022

Added: Use case examples with source code have been added to the documentation for interactive map layers. They cover geometry simplification, data sampling, clustering, and spatial/property search, and are available at https://github.com/heremaps/here-interactive-map-layer-examples.

September 2021

Added: Interactive map layers now support requesting data in the open, industry-standard MVT (Mapbox Vector Tile) format via the interactive API. This format is also optimized for some data visualization purposes (such as using Maps API for JS to produce a map within a web application) because the geometries are already projected and simplified for the requested zoom level, which reduces the work a consuming client must do to render the data.

July 2021

Changed: Due to underlying database changes/improvements, HERE is now able to charge for exact bytes stored for metadata (sum of bytes stored) as opposed to using an approximation of typical metadata partition size (partition count * 256 KB). It is possible that you will see slight differences in your metadata charges during your next billing period due to this change. Learn more here: Billable Cloud Services
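
The difference between the two metering approaches can be illustrated with a small calculation. This is only a sketch: the partition sizes are made-up sample values, the function names are hypothetical, and 1 KB is assumed to be 1024 bytes (the notes do not say which convention billing uses).

```python
KB = 1024  # assumption: 1 KB = 1024 bytes


def approximate_metadata_bytes(partition_count: int) -> int:
    """Previous approach: partition count * 256 KB."""
    return partition_count * 256 * KB


def exact_metadata_bytes(partition_sizes: list[int]) -> int:
    """Current approach: sum of the bytes actually stored."""
    return sum(partition_sizes)


# Made-up metadata sizes for three partitions, in bytes.
sizes = [120 * KB, 300 * KB, 256 * KB]

old_bytes = approximate_metadata_bytes(len(sizes))
new_bytes = exact_metadata_bytes(sizes)
print(old_bytes, new_bytes, new_bytes - old_bytes)
```

Depending on whether your partitions are typically smaller or larger than 256 KB, the exact figure can land below or above the earlier approximation.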

May 2021

Added: Rate limits are now in place for interactive map layers. Learn more here: Data Limits and Cost

Updated: Ingest API has been updated to support CORS (cross-origin requests and data transfers) to support integrations with browser-based applications. Learn more here: Ingest API

February 2021

Added: Upload your data to Object Store layers using Hadoop FS

We have added two new tutorials explaining how to publish data to Object Store layers in a distributed fashion using Hadoop FS. One shows how to do this from Spark and another one shows how to do this in standalone mode. Typically, you would use one or the other when writing data to an Object Store layer that you have stored in AWS S3 or Microsoft Azure.

January 2021

Added: A new Interactive Map layer type is available to store and manipulate data on a feature level (not available in China with this release)

For those of you familiar with HERE Data Hub as a storage/service solution available via the HERE Developer Portal, this new interactive map layer introduces that same storage concept to HERE Workspace where you can work with data at a more granular feature level, modifying individual geofeatures or even feature properties instead of full partitions. You can also request different tiling schemas at the time of request, including quadkey/Virtual Earth, Web Mercator, OSGEO Tile Map Service and HEREtile.
This release includes integration support via API only, with integrated visualization capabilities forthcoming. This feature is not yet available in China. Learn more about this interactive map layer storage solution here: Create a layer.
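
To make the tiling schemas mentioned above more concrete, the following sketch shows the widely used arithmetic behind three of them: Web Mercator x/y/zoom tile indices, the Tile Map Service row flip, and quadkeys. It does not call the interactive API, does not cover HEREtile, and all function names are hypothetical. How each schema is selected in an actual request is described in the API documentation, not here.

```python
import math


def lonlat_to_tile(lon: float, lat: float, zoom: int) -> tuple[int, int]:
    """Web Mercator tile column/row (origin top-left) containing a point."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the far edges stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def xyz_to_tms_y(y: int, zoom: int) -> int:
    """TMS counts rows from the bottom, so the row index is flipped."""
    return (2 ** zoom - 1) - y


def tile_to_quadkey(x: int, y: int, zoom: int) -> str:
    """Interleave the bits of x and y into one base-4 digit per zoom level."""
    digits = []
    for level in range(zoom, 0, -1):
        mask = 1 << (level - 1)
        digit = 0
        if x & mask:
            digit += 1
        if y & mask:
            digit += 2
        digits.append(str(digit))
    return "".join(digits)


print(tile_to_quadkey(3, 5, 3))  # prints 213
```

A quadkey encodes the zoom level in its length, which is why a single string is enough to address a tile.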

Fixed: The blobfs library has been updated to correctly list both directories and files. You can now successfully read a single file from an Object Store layer using the blobfs library without including the respective directory path in the request.

December 2020

Added: A new Object Store layer type is available to store objects without requiring HERE partitions

HERE Workspace now supports Object Storage as a new layer type. Read, write, update, delete and list objects stored in Object Store layers via any of the Data interfaces with the exception of the Portal. Portal support, including a file system view and data inspection/visualization, will be delivered at a later time.
A Hadoop File System interface-based connector is also provided with this solution in the form of a new component of the Data Client Library. This library enables you to access data stored in this layer via any tooling which supports HDFS. The combined solution of storage + library enables you to store objects in HERE Workspace without requiring HERE partitions and reduces custom code necessary to access your stored data from outside HERE Workspace.
Note: The rollout of this new storage type introduces a new version of the Blob API (data-blob-v2). In order to leverage Object Storage directly via API, you must use this new version. The preexisting API (data-blob-v1) will continue to exist. See more information here.

Changed: The Data Stream API and the Data Client Library are now integrated with OAUTH2. With the prior version of authentication, Kafka data producers and consumers had to restart whenever authentication tokens were refreshed. Such restarts can cause data duplication in streaming data workflows. OAUTH2 removes the need for these restarts, and this update is intended to address that issue.

October 2020

Changed: Change layer configurations after a layer is created to correct mistakes without needing to delete and recreate the layer

It is now possible to edit certain layer configurations after a layer has been created and even after the parent catalog has been marked as "Marketplace Ready". This change was made to ease certain changes, mostly to the advantage of developers who make configuration mistakes during layer creation or who want to change the configuration during CI/CD testing. It is important to note that making certain layer configuration changes after data is stored in the layer and/or after the parent catalog is available in Marketplace is very risky and can lead to irrecoverable impacts on data. Such changes should be made with great caution and understanding of the ramifications. Learn more here. The following configurations are now mutable via API, Data Client Library and CLI after a layer has been created:
- Stream layer configurations: Throughput, retention (TTL), content type, content encoding (compression) and schema association
- Index layer configurations: Retention (TTL) and schema association
- Versioned layer configurations: Schema association
Volatile layer configuration changes, as well as support for this functionality via the Portal, will be delivered in a future release.

September 2020

Performance: Set granular Stream layer throughput configurations to optimize performance and cost

Set more granular stream layer throughput configurations so that you can better optimize your stream layer storage based on your use case and budget needs. When creating a Stream layer, you are only charged for the "In" throughput. With this release, you can create new Stream layers with "In" throughput in 100 KB/s increments. Please see more specific details about this change here.

Note:

Going forward, and in order to maintain backward compatibility with older Data Client Library versions, you will see Stream throughput numbers in MB/s and rounded down when requesting layer configuration information using older clients. Example: If you are using a Data Client Library version older than the version released with this HERE Workspace release AND you request Stream layer throughput configuration AND you set your Stream layer "In" throughput to 100 KB/s, you will see a response of "0". This response is explained in HERE Workspace documentation here. Conversely, if you are using the latest SDK version available with this release, you will see the accurate "In" throughput of 100 KB/s for this example.
Please take note of the corresponding deprecation announcement at the end of these release notes.
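
The rounding behavior described in this note amounts to an integer division. The sketch below is illustrative only: the function name is hypothetical, and it assumes 1 MB/s = 1000 KB/s, which the notes do not specify (the 100 KB/s example yields 0 under either the 1000 or 1024 convention).

```python
KB_PER_MB = 1000  # assumption: decimal units; not stated in the release notes


def legacy_reported_throughput_mb(in_throughput_kb: int) -> int:
    """Value an older client would show: MB/s, rounded down."""
    return in_throughput_kb // KB_PER_MB


for configured_kb in (100, 900, 1000, 1500):
    print(configured_kb, "KB/s ->", legacy_reported_throughput_mb(configured_kb), "MB/s")
```

A reported value of 0 from an older client therefore does not mean the layer has no throughput, only that the configured value is below 1 MB/s.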

Added: You can upload data directly from your local machine to a data layer using the "Upload data" button in the data layer user interface.

July 2020

Added: Use Direct Kafka metrics to better monitor and debug streaming data workflows.

Underlying Kafka Producer, Kafka Consumer "Fetch" and Direct Kafka connectivity related metrics are now accessible via the Data Client Library for data workflows with Flink and when using the Direct Kafka connector type (only). These metrics can be used to create custom dashboards in Grafana. Underlying Kafka Consumer metrics are also available for programmatic retrieval. With these metrics, you have more information to help you monitor and debug any of your streaming data workflows using Direct Kafka. Learn more here.

Added: Add Stream layers to multi-region configured catalogs as an additional data loss mitigation strategy

This release supports adding Stream layers to multi-region catalogs. With this addition, it is possible to include Versioned, Volatile and Stream storage layers in multi-region catalogs. Catalogs can be configured to be multi-region when the catalog is first created. Data stored in multi-region catalogs is replicated to a second region, mitigating data loss in the event the primary region experiences a downtime event.

Note: Additional charges apply:

Storage charges double when storing data in a second region.
Data I/O charges increase 2.5 to 4 times, depending on the size of the objects you’re uploading: less for fewer large objects, more for many small objects. This is due to the validation HERE performs to ensure successful replication.
Learn more about multi-region catalogs and associated costs here.
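
For rough budgeting, the multipliers listed above can be applied to your single-region charges. This is a sketch with a hypothetical function name and made-up input amounts; it uses only the factors stated in this note (storage x2, Data I/O x2.5 to x4) and is not a substitute for the official pricing information.

```python
def estimate_multi_region_charges(storage_charge: float, io_charge: float) -> dict:
    """Apply the stated multipliers to single-region charges."""
    return {
        "storage": storage_charge * 2,   # storage charges double
        "io_low": io_charge * 2.5,       # fewer, larger objects
        "io_high": io_charge * 4,        # many small objects
    }


# Made-up single-region monthly charges, in any currency unit.
estimate = estimate_multi_region_charges(storage_charge=10.0, io_charge=4.0)
print(estimate)
```

Where your Data I/O charge falls within the range depends on object size, so the low and high values are best read as bounds.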

Fixed: Issues with Metadata Storage billing were resolved where Metadata Storage tied to the use of Versioned and Index layers was not being charged / appearing on customer invoices. This fix will result in those charges now correctly appearing on invoices. Note that no storage costs have increased; associated increases in invoices are simply due to these charges now getting billed correctly.

May 2020

Added: Parallelize stream data consumption.

Setting a partition parallelization parameter for Stream layers is now supported via the Data Client Library and the CLI. This configuration is therefore possible across all of the following Data interfaces: API, Portal, CLI and Data Client Library. This granular control can help you scale your stream data consumption to your use case by enabling more data consumers to consume your stream data in parallel.

Summary of currently active deprecation notices

Feature Summary: OrgID added to catalog HRN (RoW)

Deprecation period announced: November 2019

Deprecation period end: September 2022 (extended)

Deprecation Summary:

Catalog HRNs without OrgID will no longer be supported in any way.
Referencing catalogs and all other interactions with REST APIs using the old HRN format without OrgID OR by CatalogID will stop working after this deprecation period.
- Please ensure all HRN references in your code are updated to use Catalog HRNs with OrgID before this date so your workflows continue to work.
HRN duplication to ensure backward compatibility of Catalog version dependencies resolution will no longer be supported after this date.
Examples of old and new Catalog HRN formats:
- Old (without OrgID/realm): hrn:here:data:::my-catalog
- New (with OrgID/realm): hrn:here:data::OrgID:my-catalog
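
When updating references in code, a small helper can rewrite old-format HRNs. The sketch below is based only on the two example formats shown above, where the OrgID occupies the fifth colon-separated field. The function name is hypothetical, and "OrgID" is the placeholder from the example, not a real organization ID.

```python
def with_org_id(hrn: str, org_id: str) -> str:
    """Return the catalog HRN with the OrgID field filled in if it is empty."""
    parts = hrn.split(":")
    if len(parts) != 6 or parts[0] != "hrn":
        raise ValueError(f"Unexpected HRN format: {hrn!r}")
    if not parts[4]:
        parts[4] = org_id  # fifth field is empty in the old format
    return ":".join(parts)


old_hrn = "hrn:here:data:::my-catalog"
new_hrn = with_org_id(old_hrn, "OrgID")
print(new_hrn)  # prints hrn:here:data::OrgID:my-catalog
```

HRNs that already contain an OrgID are returned unchanged, so the helper can be applied to a mixed list of references.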

Feature Summary: Schema validation to be added

Deprecation period announced: March 2020

Deprecation period end: June 2022 (extended)

Deprecation Summary:

For security reasons, the platform will start validating schema reference changes in layer configurations after this deprecation period. Schema validation will check if the user or application trying to make a layer configuration change has at least read access to the existing schema associated with that layer (i.e., a user or application cannot reference or use a schema they do not have access to).
If the user or application does not have access to a schema associated with any layer after this date, attempts to update configurations of that layer will fail until the schema association or permissions are corrected. Make sure all layers refer only to real, current schemas - or have no schema reference at all - before the deprecation period end. It's possible to use the Config API to remove or change schemas associated with layers to resolve these invalid schema/layer associations. Also, any CI/CD jobs referencing non-existent or inaccessible schemas need to be updated by this date, or they will fail.

Feature Summary: Additional support to help distinguish Blob API versions in Lookup API responses

Deprecation period announced: December 2020

Deprecation period end: September 2022 (extended)

Deprecation Summary:

Because multiple versions of different Data APIs exist, it's important that your automated workflows that request service endpoints from Lookup API are updated to select the right baseUrls for the right API and API version you are working with. As some existing customer workflow automation is not yet updated to select the right baseUrls from the Lookup API responses, the Lookup API will return multiple Blob API V1 baseUrls in various positions in responses over the next 6 months, starting January 2021.
To prevent downtime, update your workflow automation during this deprecation period to select the right baseUrl from Lookup API responses based on the right API and API version.
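
The selection logic recommended here can be sketched as a filter on API name and version rather than on position in the response. The response shape below (a list of entries with "api", "version" and "baseURL" fields) and the example.com URLs are assumptions for illustration. Check the Lookup API reference for the exact field names before relying on them.

```python
def select_base_url(endpoints: list[dict], api: str, version: str) -> str:
    """Pick the base URL matching both API name and version."""
    for entry in endpoints:
        if entry.get("api") == api and entry.get("version") == version:
            return entry["baseURL"]
    raise LookupError(f"No endpoint found for {api} {version}")


# Hypothetical response; entries may arrive in any order, with repeats.
sample_response = [
    {"api": "blob", "version": "v1", "baseURL": "https://blob-v1.example.com"},
    {"api": "blob", "version": "v2", "baseURL": "https://blob-v2.example.com"},
    {"api": "blob", "version": "v1", "baseURL": "https://blob-v1.example.com"},
    {"api": "ingest", "version": "v1", "baseURL": "https://ingest-v1.example.com"},
]

print(select_base_url(sample_response, "blob", "v2"))
```

Code that instead takes the first or last "blob" entry is the kind of position-dependent logic that is expected to break once the temporary duplicate entries are removed.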

Feature Summary: Publish API deprecation for writing data to stream layers

Deprecation period announced: March 2022

Deprecation period end: March 2023

Deprecation Summary:

Writing data to stream layers via the Publish API is deprecated. Going forward, data writes to stream layers should use the Ingest API only. As of the end of the deprecation period, the ability to write to stream layers via the Publish API will no longer be supported by either the API or SDK. Use the Ingest API or SDK versions equal to or greater than 2.36 to ensure your workflows are not impacted by this deprecation.

Summary of current known issues

Known issue: In support of the Object Store layer type, a newer version of the Blob API (blob v2) is available in production. The availability of this newer Blob API version can impact existing workflows if developers use Lookup API to get a list of all provided endpoints for a given resource BUT do not select the right baseUrl based on the right API and API version. Because multiple versions of the same API exist, Lookup API responses will include specific URLs per API version.
Workaround: Always select the right baseUrl from Lookup API responses based on the API and API version that you intend to work with. To support existing workflows until you can correct your API selection logic, the Lookup API will return multiple Blob API v1 baseUrls in various positions in responses for the next 6 months, starting January 2021. See the corresponding deprecation notice in these release notes for more information.

Known issue: The "Upload data" button available via "More" within the versioned layer details page is hidden when the "Content encoding" field in the layer is set to "gzip".
Workaround: Files (including .zip files) can still be uploaded and downloaded as long as the "Content encoding" field is set to "Uncompressed".

Known issue: The changes released with 2.9 (RoW) and with 2.10 (China) - for adding OrgIDs to catalog HRNs - and with 2.10 (Global) - for adding OrgIDs to schema HRNs - could impact any use case (CI/CD or other) where comparisons are made between HRNs used by various workflow dependencies. For example, requests to compare HRNs that a pipeline is using with those to which a group, user or app has permissions will result in errors if the comparison is expecting results to match the old HRN construct. With this change, data APIs will return only the new HRN construct, which includes the OrgID, e.g. olp-here…, so a comparison between the old HRN and the new HRN will fail.

Reading from and writing to catalogs using old HRNs will continue to work until this functionality is deprecated (see deprecation notice summary).
Referencing old schema HRNs will continue to work indefinitely.

Workaround: Update any workflows comparing HRNs to perform the comparison against the new HRN construct, including the OrgID.

Known issue: Visualization of data is not yet supported by the Data Inspector for index or interactive map layer types.

Jeff Henning