Extracting fields to retrieve values

A single column often holds several values, for example values separated by a delimiter. To transform those values individually, you first need to separate them into their own columns. In Algoreus, you can extract values from a column and generate a new column for the extracted values. The extraction can be based on:

  • Patterns

  • Delimiters

  • Positions

Utilizing patterns

You can retrieve values from fields in String columns using the following common patterns:

  • Credit cards

  • Date

  • Date time

  • Email

  • URLs from HTML anchors

  • IPv4 address

  • ISBN codes

  • Mac address

  • N digits number

  • SSN

  • Start/End pattern

  • Time


To extract values based on a pattern:

  1. From the drop-down menu, click Extract data > Utilizing patterns.

  2. Select a pattern to extract the fields. Optionally, click Show pattern to view the regex for that pattern.

Algoreus extracts the fields based on the chosen pattern and adds the extract-regex-groups directive to the recipe. When you run the data axon, the transformation will be applied to all values in the column.
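To illustrate what pattern-based extraction does to each value, here is a minimal Python sketch. It is not the Algoreus implementation: the email regex, the `extract_groups` helper, and the `<column>_<match>_<group>` naming of the new columns are all assumptions made for this example.

```python
import re

# Illustrative email pattern (an assumption, not the regex Algoreus ships).
EMAIL_PATTERN = re.compile(r"([A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,})")


def extract_groups(rows, column, pattern):
    """Return copies of rows with one new column per capture group of each match.

    New columns are named <column>_<match>_<group>; this naming is hypothetical.
    """
    result = []
    for row in rows:
        new_row = dict(row)
        value = row.get(column) or ""
        for match_index, match in enumerate(pattern.finditer(value), start=1):
            for group_index, group in enumerate(match.groups(), start=1):
                new_row[f"{column}_{match_index}_{group_index}"] = group
        result.append(new_row)
    return result


rows = [
    {"body": "Contact alice@example.com for details"},
    {"body": "no address here"},
]
extracted = extract_groups(rows, "body", EMAIL_PATTERN)
```

In this sketch, a value that does not match the pattern is left unchanged and receives no new column.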


Using delimiters

You can break down a column into two or more columns based on a delimiter. The extraction of values can be based on the following delimiters:

  • Comma

  • Tab

  • Pipe

  • Whitespace

  • Custom separator

Note: If you select Custom separator, you must provide a regular expression (regex) that defines the separator. A regex lets you split the column on patterns that a single fixed character cannot express. The separator supports standard Java regular expression constructs.

If a value does not contain the specified separator, no additional columns are generated for it.

To extract values based on a delimiter:

  1. From the drop-down menu, click Extract data > Using delimiters.

  2. Select the delimiter to use to extract the fields.

Algoreus extracts the fields based on the chosen delimiter and adds the split-to-columns directive to the recipe. When you run the data axon, the transformation will be applied to all values in the column.
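The effect of splitting on a custom separator can be sketched in Python as follows. This is an illustration only, not how Algoreus implements the directive: the `split_to_columns` helper and the `<column>_<n>` column names are assumptions, and Python's `re` syntax is not identical to Java's, although simple separators such as the one below behave the same in both.

```python
import re


def split_to_columns(rows, column, separator_regex):
    """Split row[column] on a regex separator into numbered columns.

    Values without the separator are left as they are (hypothetical behavior
    chosen for this sketch).
    """
    pattern = re.compile(separator_regex)
    result = []
    for row in rows:
        new_row = dict(row)
        parts = pattern.split(row[column])
        if len(parts) > 1:
            for index, part in enumerate(parts, start=1):
                new_row[f"{column}_{index}"] = part
        result.append(new_row)
    return result


rows = [{"tags": "red ; green;blue"}, {"tags": "single"}]
# Custom separator: a semicolon with optional surrounding whitespace.
split_rows = split_to_columns(rows, "tags", r"\s*;\s*")
```

A regex separator is useful here because one expression covers both `" ; "` and `";"`, which a fixed one-character delimiter would not.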


Using positions

You can extract a part of a string based on its position in the string.

To extract fields based on positions from a column:

  1. From the drop-down menu, click Extract data > Using positions.

  2. The column appears with a blue background, indicating that you are in Extract mode.

  3. In any one value, highlight the portion that you want to extract from every value in the column. The Extract dialog box appears, showing the positions you selected.

  4. Provide a name for the new column.

Algoreus extracts the fields based on the selected positions and adds the cut-character directive to the recipe. When you run the data axon, the transformation will be applied to all values in the column.
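Position-based extraction amounts to taking the same character range from every value. The Python sketch below shows the idea under stated assumptions: the `cut_characters` helper is hypothetical, and the 1-based, inclusive positions are a convention chosen for this example rather than a documented Algoreus behavior.

```python
def cut_characters(rows, source, destination, start, end):
    """Copy characters start..end (1-based, inclusive) of row[source] into row[destination]."""
    result = []
    for row in rows:
        new_row = dict(row)
        # Convert 1-based inclusive positions to a Python slice.
        new_row[destination] = row[source][start - 1:end]
        result.append(new_row)
    return result


rows = [{"phone": "415-555-0132"}, {"phone": "212-555-0198"}]
with_area = cut_characters(rows, "phone", "area_code", 1, 3)
```

Because the range is fixed, this approach suits values with a consistent layout, such as fixed-width codes; for values of varying length, a delimiter or pattern is usually a better fit.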

