Workflow Components and Operation
The Workflows feature automatically triggers ETLs once the required combination of data sources has surpassed its predefined quality score thresholds.
The items below correspond to the numbered callouts in the screenshot.
The Schedule Run Time works together with Alerts. An Alert can be set to email specified users if the last ETL in a Workflow has not triggered by the designated Schedule Run Time. In the above screenshot, the last ETL in the example Workflow is ‘AmultiPerfPipe_copy5’.
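The alert rule described above can be sketched in a few lines of Python. This is only an illustration of the logic, not the platform's implementation: the function name `alert_is_due` and its parameters are hypothetical.

```python
from datetime import datetime
from typing import Optional


def alert_is_due(
    schedule_run_time: datetime,
    last_etl_triggered_at: Optional[datetime],
    now: datetime,
) -> bool:
    """Return True when the Schedule Run Time has passed and the last ETL
    in the Workflow had not triggered by that time (hypothetical helper)."""
    if now < schedule_run_time:
        # The deadline has not been reached yet, so no alert is needed.
        return False
    # Alert if the last ETL never triggered, or triggered only after the deadline.
    return last_etl_triggered_at is None or last_etl_triggered_at > schedule_run_time
```

For example, with a Schedule Run Time of 09:00, a check at 09:05 would report an alert if the last ETL has no trigger time recorded, and no alert if it triggered at 08:50.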
When creating a Workflow, you must select Data Sources that are already set up in the platform. These Data Sources are also listed in the Sources tab in the blue left-hand navigation pane. The highlighted portion in the above screenshot is the parameter used when setting up an ETL so that the ETL knows which file to process when triggered.
- Clicking the highlighted parameter name automatically copies it to your clipboard.
- This parameter name can also be found in the Settings tab in the Manage screen for any Source.
The Run ETL button manually triggers the designated ETL. In the above example, this would trigger the ETL circled in #5.
When a report has been run on a Data Source, the status displays as “Complete” in green (as shown in the screenshot), along with the Data Quality Score for that report run. The ‘Min DQ Score’ is the threshold that a source’s quality score must surpass in order to auto-trigger the ETL. In the above example, the report score did not surpass the threshold, which is indicated by the Score being highlighted in red.
- The Min DQ Score is shared across every instance of a Data Source. If the same Data Source appears more than once, whether in the same Workflow or in different Workflows, changing the Min DQ Score on one instance automatically updates all the others to the same value.
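One way to picture this shared behavior is a threshold stored once per Data Source and looked up by every Workflow that uses it. The sketch below is an assumption about how such sharing could be modeled, not a description of the platform's internals; `MinDqRegistry`, `Workflow`, and the source name `sales_feed` are all hypothetical.

```python
class MinDqRegistry:
    """Holds one Min DQ Score per Data Source name (hypothetical model)."""

    def __init__(self):
        self._min_dq = {}

    def set_min_dq(self, source_name, score):
        self._min_dq[source_name] = score

    def get_min_dq(self, source_name):
        return self._min_dq[source_name]


class Workflow:
    """A Workflow only references Data Sources; it stores no thresholds itself."""

    def __init__(self, name, source_names, registry):
        self.name = name
        self.source_names = list(source_names)
        self.registry = registry

    def min_dq(self, source_name):
        # Every Workflow reads the same per-source value from the registry.
        return self.registry.get_min_dq(source_name)


registry = MinDqRegistry()
registry.set_min_dq("sales_feed", 80)
workflow_a = Workflow("A", ["sales_feed"], registry)
workflow_b = Workflow("B", ["sales_feed"], registry)

# Changing the threshold once is visible from both Workflows.
registry.set_min_dq("sales_feed", 90)
```

Because neither Workflow keeps its own copy of the threshold, there is nothing to synchronize when the value changes.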
This shows the ETL Pipeline that is designated to auto-trigger when all of the following conditions are met:
- A report is generated on a new Batch file that hasn’t yet auto-triggered the ETL (Batches are explained further in #8),
- The report’s data quality score surpasses the Min DQ Score threshold, and
- Any dependent conditions are met. In the above instance, the circled ETL ‘AmultiPerfPipe_copy4’ has no other dependent conditions, so it did auto-trigger. In the example just to its right, the ETL ‘AmultiPerfPipe_copy3’ depends on three Data Source files, each of which must complete successfully with a score that surpasses its Min DQ Score before the ETL will auto-trigger.
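The three conditions above can be expressed as a small check. This is a minimal sketch under stated assumptions: the `SourceState` fields and both function names are hypothetical, and "surpasses" is assumed to mean strictly greater than the Min DQ Score.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SourceState:
    """Latest report state for one Data Source (hypothetical model)."""
    report_complete: bool       # a report has run on the current Batch file
    dq_score: Optional[float]   # Data Quality Score of that report, if any
    min_dq_score: float         # Min DQ Score threshold for this source
    batch_already_used: bool    # this Batch has already auto-triggered the ETL


def source_ready(s: SourceState) -> bool:
    """True when a new Batch has a report whose score surpasses the threshold."""
    return (
        s.report_complete
        and not s.batch_already_used
        and s.dq_score is not None
        and s.dq_score > s.min_dq_score  # assumption: strictly greater
    )


def should_auto_trigger(sources: List[SourceState]) -> bool:
    """The ETL auto-triggers only when every Data Source it depends on is ready."""
    return bool(sources) and all(source_ready(s) for s in sources)
```

An ETL with a single dependency needs only that one source to be ready, while an ETL that depends on three Data Source files stays idle until all three pass.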
If the Run ETL button mentioned in #3 were pressed, it would trigger the designated ETL.
For information purposes only.