Extracting fields to retrieve values
A single column often holds several values, for example separated by a delimiter. To transform those values individually, you first need to split them into separate columns. In Algoreus, you can extract values from a column and write them to a new column. Extraction can be based on:
Patterns
Delimiters
Positions
Utilizing patterns
You can retrieve values from fields in String columns using the following common patterns:
Credit cards
Date
Date time
Email
URLs from HTML anchors
IPv4 address
ISBN codes
MAC address
N digits number
SSN
Start/End pattern
Time
To extract values based on a pattern:
From the drop-down menu, click Extract data > Utilizing patterns.
Select a pattern to extract the fields. Optionally, click Show pattern to view the regex for that pattern.
Algoreus extracts the fields based on the chosen pattern and adds the extract-regex-groups directive to the recipe. When you run the data axon, the transformation will be applied to all values in the column.
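To make the idea of pattern-based extraction concrete, here is a minimal Python sketch of what such a step might do to a column of strings. It is an illustration only, not Algoreus code: the function name extract_pattern, the new column name log_ip, and the simplified IPv4 regex are all assumptions, and the regex that Algoreus shows under Show pattern may differ.

```python
import re

# Simplified IPv4 pattern, assumed for illustration only.
# It does not validate that each octet is in the range 0-255.
IPV4_PATTERN = re.compile(r"\b((?:\d{1,3}\.){3}\d{1,3})\b")


def extract_pattern(rows, column, pattern, new_column):
    """Return copies of rows with the first regex group match stored in new_column.

    Rows whose value does not match get None in the new column.
    """
    result = []
    for row in rows:
        match = pattern.search(row[column])
        extracted = match.group(1) if match else None
        # Keep the original column and add the extracted value beside it.
        result.append({**row, new_column: extracted})
    return result


rows = [
    {"log": "GET /index.html from 192.168.0.12"},
    {"log": "no address here"},
]
extracted_rows = extract_pattern(rows, "log", IPV4_PATTERN, "log_ip")
```

In this sketch the source column is left untouched and the match is written to a new column, which mirrors the behavior described above. How Algoreus names the generated column is not covered here.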
Using delimiters
You can break down a column into two or more columns based on a delimiter. The extraction of values can be based on the following delimiters:
Comma
Tab
Pipe
Whitespace
Custom separator
Note: If you select Custom separator, you must provide a regular expression (regex) that defines the separator. Regular expressions let you use complex search patterns to split the data in the column. The separator supports standard Java regular expression constructs.
If a value does not contain the specified separator, no additional columns are generated for it.
To extract values based on a delimiter:
From the drop-down menu, click Extract data > Using delimiters.
Select the delimiter to use to extract the fields.
Algoreus extracts the fields based on the chosen delimiter and adds the split-to-columns directive to the recipe. When you run the data axon, the transformation will be applied to all values in the column.
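The effect of splitting on a delimiter can be sketched in a few lines of Python. This is a hedged illustration rather than the Algoreus implementation: the function name split_to_columns is borrowed from the directive name for readability, the generated column names (name_1, name_2, and so on) are an assumption, and Python's re module is used here even though Algoreus expects Java regular expression syntax, which differs in some details.

```python
import re


def split_to_columns(rows, column, separator_regex):
    """Split the value in column on a regex separator into numbered columns.

    Values that do not contain the separator are left unchanged.
    The "<column>_<n>" naming scheme is assumed for illustration.
    """
    pattern = re.compile(separator_regex)
    result = []
    for row in rows:
        parts = pattern.split(row[column])
        new_row = dict(row)
        if len(parts) > 1:
            for index, part in enumerate(parts, start=1):
                new_row[f"{column}_{index}"] = part
        result.append(new_row)
    return result


rows = [
    {"name": "Ada|Lovelace|1815"},
    {"name": "Turing"},
]
# The pipe is a regex metacharacter, so a custom separator must escape it.
split_rows = split_to_columns(rows, "name", r"\|")
```

The escaped pipe shows why a custom separator is treated as a regex: characters such as |, ., and * have special meaning and must be escaped when you want to match them literally.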
Using positions
You can extract a part of a string based on its position in the string.
To extract fields based on positions from a column:
From the drop-down menu, click Extract data > Using positions.
The column appears with a blue background, indicating that you are in Extract mode.
Highlight the portion of one value that you want to extract from all values in the column. The Extract dialog box appears, showing the positions you selected.
Provide a name for the new column.
Algoreus extracts the fields based on the selected positions and adds the cut-character directive to the recipe. When you run the data axon, the transformation is applied to all values in the column.
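Position-based extraction amounts to taking the same character range from every value in the column. The Python sketch below illustrates that idea under stated assumptions: the function name cut_characters, the 1-based inclusive positions, and the sample phone data are hypothetical and are not taken from Algoreus.

```python
def cut_characters(rows, column, new_column, start, end):
    """Copy characters start..end of column into new_column.

    Positions are 1-based and inclusive, an assumption made for this sketch.
    Values shorter than the range yield a shorter or empty string.
    """
    result = []
    for row in rows:
        value = row[column]
        # Convert the 1-based inclusive range to a Python slice.
        result.append({**row, new_column: value[start - 1:end]})
    return result


rows = [
    {"phone": "415-555-0101"},
    {"phone": "212-555-0199"},
]
cut_rows = cut_characters(rows, "phone", "area_code", 1, 3)
```

Because the same positions are applied to every row, this approach works best on fixed-width values. For values of varying length, a pattern or delimiter is usually the safer choice.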