Accessing Metadata Programmatically

Metadata can be read and updated from within both nodes and programs in Algoreus.

This enables metadata-driven processing. For instance, if a dataset has a field that contains sensitive information and your organization requires all such information to be masked, you can tag that field as sensitive. A node can then read the metadata of every field in the dataset and mask the content of any field that is tagged as sensitive.
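The masking policy described above can be sketched in plain Java. This is a minimal illustration, not Algoreus code: it makes no Algoreus API calls, the class name SensitiveFieldMasker and the tag name "sensitive" are hypothetical, and the per-field tags are passed in as an ordinary map rather than read from the metadata system.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Plain-Java illustration of tag-driven masking; it does not call the Algoreus API.
final class SensitiveFieldMasker {
  // Hypothetical tag name; use whatever tag your organization's policy defines.
  static final String SENSITIVE_TAG = "sensitive";

  // Returns a copy of the record in which every field tagged as sensitive
  // is replaced by a fixed mask. fieldTags maps a field name to its tags.
  static Map<String, String> mask(Map<String, String> record,
                                  Map<String, Set<String>> fieldTags) {
    Map<String, String> masked = new LinkedHashMap<>();
    for (Map.Entry<String, String> field : record.entrySet()) {
      // Fields with no metadata entry are treated as untagged.
      Set<String> tags = fieldTags.getOrDefault(field.getKey(), Set.of());
      boolean sensitive = tags.contains(SENSITIVE_TAG);
      masked.put(field.getKey(), sensitive ? "****" : field.getValue());
    }
    return masked;
  }

  public static void main(String[] args) {
    Map<String, String> record = new LinkedHashMap<>();
    record.put("name", "Ada");
    record.put("ssn", "123-45-6789");
    Map<String, Set<String>> fieldTags = Map.of("ssn", Set.of(SENSITIVE_TAG));
    System.out.println(mask(record, fieldTags)); // prints {name=Ada, ssn=****}
  }
}
```

In a real node, the tag map would presumably be built from the metadata returned for each field rather than hard-coded as it is here.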

Programs

Metadata can be read using the methods of the MetadataReader object. These methods are available on the program context object, which you obtain by calling getContext() within the initialize method of your program.

Here's an example of how you can fetch the metadata of an entity:

@Override
public void initialize() throws Exception {
  MapReduceContext context = getContext();

  // construct the metadata entity
  MetadataEntity entity = MetadataEntity.builder().append(MetadataEntity.NAMESPACE, "myNamespace")
    .appendAsType(MetadataEntity.DATASET, "myDataset").build();

  // get the metadata
  Map<MetadataScope, Metadata> metadata = context.getMetadata(entity);
}
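Because getMetadata returns a map keyed by scope, a tag may need to be looked up in each scope. The sketch below shows one way to do that, under stated assumptions: the Scope enum and the per-scope Set<String> of tags are stand-ins invented for this illustration, not the Algoreus MetadataScope or Metadata classes, and the scope names SYSTEM and USER are assumed.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.Set;

// Stand-in types for illustration only; these are not the Algoreus
// MetadataScope or Metadata classes.
final class ScopedTagLookup {
  // Assumed scope names, used here only to key the example map.
  enum Scope { SYSTEM, USER }

  // Returns true if the tag is present in any scope of the per-scope tag map.
  static boolean hasTag(Map<Scope, Set<String>> tagsByScope, String tag) {
    for (Set<String> tags : tagsByScope.values()) {
      if (tags.contains(tag)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    Map<Scope, Set<String>> tagsByScope = new EnumMap<>(Scope.class);
    tagsByScope.put(Scope.SYSTEM, Set.of("batch"));
    tagsByScope.put(Scope.USER, Set.of("sensitive"));
    System.out.println(hasTag(tagsByScope, "sensitive")); // prints true
    System.out.println(hasTag(tagsByScope, "archived"));  // prints false
  }
}
```

Checking every scope is a deliberate choice in this sketch; if only user-applied tags matter for a policy, the lookup could be restricted to that one scope.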

Similarly, you can add metadata, such as tags, to an entity using the methods of the MetadataWriter object. These methods are available on the same program context object.

Here's an example of how you can add a tag to an entity:

@Override
public void initialize() throws Exception {
  MapReduceContext context = getContext();

  // construct the metadata entity
  MetadataEntity entity = MetadataEntity.builder().append(MetadataEntity.NAMESPACE, "myNamespace")
    .appendAsType(MetadataEntity.DATASET, "myDataset").build();

  // add a tag
  context.addTags(entity, "someTag");
}

Nodes

Within a node, metadata can be read using the methods of the MetadataReader object, which are available on the context passed to the node's prepareRun method.

Here's an example of how you can retrieve the metadata of an entity within a node:

@Override
public void prepareRun(BatchSourceContext context) throws Exception {
  context.setInput(Input.ofDataset(config.tableName));

  // construct the metadata entity
  MetadataEntity entity = MetadataEntity.builder().append(MetadataEntity.NAMESPACE, "myNamespace")
    .appendAsType(MetadataEntity.DATASET, "myDataset").build();

  // get the metadata
  Map<MetadataScope, Metadata> metadata = context.getMetadata(entity);
}
