Axons (Pipeline) User Guide
Algoreus Axons is a robust, user-friendly framework that simplifies the management of your big data. With Algoreus Axons, you can ingest, transform, and load data, while your transformations and programmatic logic are converted into parallelized computations that run on Spark and MapReduce.
In a broader sense, the term "axon" refers to the path data travels from a source system, through intermediate datasets, to the curated datasets that form the foundation of machine learning and analytical workflows. A successful axon regularly and reliably guides data through this journey and is maintained by a dedicated person or team.
A high-quality, production-ready axon is characterized by factors such as data scale, latency requirements, maintenance complexity, and build scheduling. Our offering includes quick and easy setup, support for data quality checks, and robust security and governance protocols.
Algoreus Axons aims to provide a seamless user experience, helping users develop complex data processing workflows in both batch and real-time modes through an intuitive UI. To streamline administration, axons expose accessible logs and metrics, eliminating the need for extensive custom tooling.
In an Algoreus Axon, you will see a series of stages arranged in a Directed Acyclic Graph (DAG), which forms a one-way data highway. These stages, the "nodes" of the axon graph, fall into six main categories: Sources, Transformations, Analytics, Actions, Sinks, and Error Handling.
Data Sources: These are the databases, files, or real-time streams from which you obtain your data, with a simple UI for hassle-free data ingestion.
Data Transformations: Once data is ingested, transformations allow you to manipulate it, for example by cloning a record or formatting JSON. You can also create custom transformations with the JavaScript node.
Data Analytics: These nodes enable running analytics or Machine Learning tasks with Algoreus ML on the data, including data merging from different sources. Algoreus Axons provides nodes for numerous use cases.
Actions: These nodes define a custom action that is scheduled within a workflow but does not directly manipulate data. For instance, you could run a database command at the end of your axon or move files within an HDFS cluster.
Error Handling: When a stage encounters errors, an Error Handler node can catch them and record them in a database for further inspection.
Sinks: Finally, your processed data is written to a Sink, which can range from Avro or Parquet files to an RDBMS, and is accessible from the Algoreus UI.
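The stage categories and DAG arrangement described above can be pictured with a small sketch. This is not the Algoreus Axons API: the stage names, the `stages` and `edges` structures, and the `execution_order` helper are all hypothetical, chosen only to illustrate how a set of categorized stages connected in one direction yields a valid run order, and how a cycle would be rejected.

```python
from graphlib import TopologicalSorter, CycleError

# The six stage categories named in this guide.
CATEGORIES = {
    "source", "transformation", "analytics",
    "action", "sink", "error_handling",
}

# Hypothetical stages (name -> category); not real Algoreus node names.
stages = {
    "read_orders": "source",
    "format_json": "transformation",
    "join_customers": "analytics",
    "catch_errors": "error_handling",
    "write_parquet": "sink",
    "log_errors": "sink",
}

# Hypothetical connections between stages: (upstream, downstream).
edges = [
    ("read_orders", "format_json"),
    ("format_json", "join_customers"),
    ("format_json", "catch_errors"),
    ("join_customers", "write_parquet"),
    ("catch_errors", "log_errors"),
]


def execution_order(stages, edges):
    """Return stage names in an order that respects every edge.

    Raises ValueError if a stage has an unknown category, an edge names
    an unknown stage, or the connections contain a cycle.
    """
    for name, category in stages.items():
        if category not in CATEGORIES:
            raise ValueError(f"unknown category {category!r} for stage {name!r}")

    # Start with every stage present and no predecessors, then add edges.
    sorter = TopologicalSorter({name: set() for name in stages})
    for upstream, downstream in edges:
        if upstream not in stages or downstream not in stages:
            raise ValueError(f"edge references unknown stage: {upstream} -> {downstream}")
        sorter.add(downstream, upstream)

    try:
        return list(sorter.static_order())
    except CycleError as exc:
        raise ValueError("stages do not form a directed acyclic graph") from exc


order = execution_order(stages, edges)
print(order)
```

In this sketch, every upstream stage appears before the stages it feeds, which is the "one-way" property of the graph. Adding an edge from a sink back to a source would make `execution_order` raise `ValueError`.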
To summarize, Algoreus Axons allows you to construct and deploy big data applications that run on MapReduce or Spark using only a visual interface.
Finally, to ensure the longevity and maintainability of an axon, we provide support processes and comprehensive documentation. This preserves the axon's quality even as it transitions between teams. Together, these features make Algoreus Axons a powerful tool for managing and processing your big data.