Read From Layers

You can read partitions/data from the following layers:

  • Versioned
  • Volatile
  • Index
  • Stream

The functions read_partitions and read_stream_data return generators. A generator is an iterator that yields its values one at a time as you loop over it, rather than returning them all at once in a list.
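
The practical consequences of getting a generator back can be sketched with a stand-in function. Here fake_read_partitions is hypothetical and is not part of the SDK; it only mimics the shape of the results used in the examples on this page (a pair of partition and blob), with a plain string in place of the partition object.

```python
from typing import Iterator, List, Tuple


def fake_read_partitions(partition_ids: List[str]) -> Iterator[Tuple[str, bytes]]:
    """Hypothetical stand-in for read_partitions: yields (id, blob) pairs lazily."""
    for partition_id in partition_ids:
        # Stand-in work only; nothing is fetched over the network here.
        yield partition_id, f"blob-{partition_id}".encode()


results = fake_read_partitions(["1", "2", "3"])

first = next(results)      # produces only the first pair
remaining = list(results)  # consumes everything that is left
leftover = list(results)   # the generator is exhausted, so this is empty

print(first)
print(remaining)
print(leftover)
```

A generator can be consumed only once. If the results are needed more than once, collect them into a list first or call the read function again.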

Read Partitions/Data from Versioned Layer

Parameters:

  • partition_ids – The list of partition IDs. If not specified, all partitions are read.

  • version – The catalog version. If not specified, the latest catalog version will be used.

  • part – Indicates which part of the layer to query. If not specified, all partitions are returned. It cannot be specified together with partition_ids.

  • decode – If true, decode the data; otherwise return the raw bytes.
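
Because partition_ids and part are mutually exclusive, it can help to check the arguments before issuing the read. The helper below is a minimal sketch; build_read_kwargs is a hypothetical name, not an SDK function, and this page does not say how the SDK itself reacts when both are passed.

```python
def build_read_kwargs(partition_ids=None, part=None, version=None):
    """Hypothetical helper: assemble keyword arguments for read_partitions."""
    if partition_ids is not None and part is not None:
        # The parameter description says these two cannot be combined.
        raise ValueError("Specify either partition_ids or part, not both")
    candidates = {"partition_ids": partition_ids, "part": part, "version": version}
    # Leave out unset values so the callee applies its own defaults.
    return {name: value for name, value in candidates.items() if value is not None}


kwargs = build_read_kwargs(partition_ids=["377893751"])
print(kwargs)

# Intended use (not run here):
# versioned_layer.read_partitions(**kwargs)
```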

Example:

partitions_map = versioned_layer.read_partitions(partition_ids=["377893751"], version=sdii_catalog.latest_version())

for partition_map in partitions_map:
    partition = partition_map[0]
    partition_blob = partition_map[1]
    print(partition.id, partition_blob)

Example: (with part parameter)

Values for the part parameter can be obtained from the metadata API.

partitions_map = versioned_layer.read_partitions(part="eyJwYXJ0SW5kZXgiOjEsIm1heFBhcnRzIjozLCJzY29wZSI6InBwIn0", version=sdii_catalog.latest_version())

for partition_map in partitions_map:
    partition = partition_map[0]
    partition_blob = partition_map[1]
    print(partition.id, partition_blob)
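
The sample part value above happens to be base64-encoded JSON, which can be handy to inspect when debugging. This is only an observation about that one sample string; the token format is not described on this page, so treat part values as opaque and do not construct them by hand. decode_part_token is a hypothetical helper name.

```python
import base64
import json


def decode_part_token(token: str) -> dict:
    """Hypothetical debugging helper: decode a base64 JSON token."""
    # The sample token has no "=" padding, so restore it before decoding.
    padded = token + "=" * (-len(token) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


part = "eyJwYXJ0SW5kZXgiOjEsIm1heFBhcnRzIjozLCJzY29wZSI6InBwIn0"
info = decode_part_token(part)
print(info)  # {'partIndex': 1, 'maxParts': 3, 'scope': 'pp'}
```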

Example: (get values for part parameter)

partition_parts = versioned_layer.data_metadata_api.get_partitions_parts(layer_id=versioned_layer.id, num_requested_parts=3)
print(partition_parts)

Read Partitions/Data from Volatile Layer

Parameters:

  • partition_ids – The list of partition IDs. If not specified, all partitions are read.

  • decode – If true, decode the data; otherwise return the raw bytes.

Example:

partitions_map = volatile_layer.read_partitions(partition_ids=["81150"])

for partition_map in partitions_map:
    partition = partition_map[0]
    partition_blob = partition_map[1]
    print(partition.id, partition_blob)

Read Partitions/Data from Index Layer

Parameters:

  • query – The RSQL query used to select partitions.

  • decode – If true, decode the data; otherwise return the raw bytes.

Example:

# decode=False as this layer has no schema.
partitions_map = index_layer.read_partitions(query="hour_from=ge=10", decode=False)

for partition_map in partitions_map:
    partition = partition_map[0]
    partition_blob = partition_map[1]
    print(partition.id, partition_blob)
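
RSQL queries are plain strings in which ";" means AND and "," means OR, and comparisons are written as field=op=value. The helpers below are a small sketch for composing such strings; compare, rsql_and, and rsql_or are hypothetical names, and which fields and operators a particular index layer accepts depends on how that layer is configured.

```python
def compare(field: str, op: str, value) -> str:
    """Build one RSQL comparison such as hour_from=ge=10 (op is ge, gt, le, or lt)."""
    if op not in {"ge", "gt", "le", "lt"}:
        raise ValueError(f"Unsupported operator in this sketch: {op}")
    return f"{field}={op}={value}"


def rsql_and(*conditions: str) -> str:
    """Join conditions with ';', the RSQL logical AND."""
    return ";".join(conditions)


def rsql_or(*conditions: str) -> str:
    """Join conditions with ',', the RSQL logical OR."""
    return ",".join(conditions)


# hour_from in the range [10, 12)
query = rsql_and(compare("hour_from", "ge", 10), compare("hour_from", "lt", 12))
print(query)  # hour_from=ge=10;hour_from=lt=12
```

The resulting string can be passed as the query argument in the same way as the literal in the example above.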

Read Partitions/Data from Stream Layer

Parameters:

  • subscription – The subscription from which to consume data.

  • update_offsets – Automatically update the offsets so that the next read starts after the last message consumed.

  • decode – If true, decode the data; otherwise return the raw bytes.

Example:

subscription = stream_layer.subscribe()
partitions_map = stream_layer.read_stream_data(subscription=subscription)
for partition_map in partitions_map:
    partition = partition_map[0]
    partition_blob = partition_map[1]
    print(partition.id, partition_blob)
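
When only a few messages are wanted, the generator can be capped with itertools.islice instead of looping until it ends. This page does not say whether the generator returned by read_stream_data stops on its own, so the sketch uses a hypothetical endless stand-in, fake_stream, to show the pattern.

```python
from itertools import islice
from typing import Iterator, Tuple


def fake_stream() -> Iterator[Tuple[str, bytes]]:
    """Hypothetical stand-in for read_stream_data: an endless message source."""
    n = 0
    while True:
        yield f"msg-{n}", b"payload"
        n += 1


# Take at most five messages, then stop pulling from the generator.
batch = list(islice(fake_stream(), 5))

for message_id, blob in batch:
    print(message_id, blob)
```

With the real layer, the same call would wrap stream_layer.read_stream_data(subscription=subscription) in place of fake_stream().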
