Wednesday, July 2, 2025

Using DAGU for Pipeline Workflows

I wanted a way to run my ML / AI scripts, and there is an increasing number of them. They run on Linux, and I am familiar with the tried-and-true cron, but I wanted something a bit better.

After an evaluation of Windmill and Prefect, I came across dagu. 

I loved that it was lightweight. You can download the source, compile it, and off you go with a single binary.

Understanding how it works, however, was a bit more tedious. 

dagu has a CLI, which is great. It also has a GUI that runs on port 80 (SSL is not supported, I guess).

I decided to set it up as a system service in Linux (systemd), so I crafted a unit file for it. To do this, I had to create a dagu user with no login (for security) and no home directory.
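A minimal sketch of creating that kind of account is below. The exact flags used originally are not recorded here, so treat the command as an assumption; the nologin path in particular varies by distribution (some use /sbin/nologin).

```shell
#!/bin/sh
# Sketch: a locked-down service account for dagu.
# Assumption: /usr/sbin/nologin exists; adjust the path for your distribution.
create_dagu_user() {
    # --system          allocate a UID from the system range
    # --no-create-home  do not create a home directory
    # --shell           use nologin so interactive logins are refused
    useradd --system --no-create-home --shell /usr/sbin/nologin dagu
}

# Needs root: call create_dagu_user from a root shell or via sudo.
```

Wrapping the command in a function just keeps it from running the moment the file is sourced.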

One problem I ran into is that dagu needs a home directory. I created /opt/dagu for this, and dagu created directories underneath it for dags.

The binary itself likes to be in /usr/local/bin, separate from the /opt/dagu directory.
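Putting those pieces together, a unit file could look roughly like the sketch below. The paths, the dagu user, and the start-all command come from this post; everything else in the unit (description, restart policy, target) is an assumption and not a copy of the real file.

```shell
#!/bin/sh
# Sketch: write a minimal systemd unit for dagu to the path given as $1.
# Assumption: the unit contents are illustrative, not the actual file in use.
write_dagu_unit() {
    cat > "$1" <<'EOF'
[Unit]
Description=dagu workflow engine (web UI and scheduler)
After=network.target

[Service]
User=dagu
# dagu keeps its dags and other data under DAGU_HOME
Environment=DAGU_HOME=/opt/dagu
WorkingDirectory=/opt/dagu
ExecStart=/usr/local/bin/dagu start-all
Restart=on-failure
# Binding a port below 1024 as a non-root user needs this capability
AmbientCapabilities=CAP_NET_BIND_SERVICE

[Install]
WantedBy=multi-user.target
EOF
}

# As root, something like:
#   install -d -o dagu -g dagu /opt/dagu
#   write_dagu_unit /etc/systemd/system/dagu.service
#   systemctl daemon-reload
#   systemctl enable --now dagu
```

Since the dagu user has no home directory of its own, setting DAGU_HOME in the unit is what points the service at /opt/dagu.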

If you use the CLI and pass your yaml file to it, dagu wants to run it "then and there". In my testing at least, it ignored the schedule. I could be wrong, and maybe it does acknowledge the schedule, but it still does an immediate run the moment you type: "dagu start my_dag -- NAME=mydag".
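For reference, a scheduled dag might look like the sketch below. The schedule field takes a cron expression; the step names and script paths are hypothetical placeholders. A plausible reading of the behavior above (an assumption, not something verified here) is that only the scheduler process acts on the schedule, while "dagu start" is always a manual, immediate run.

```shell
#!/bin/sh
# Sketch: write an example dag definition to the path given as $1.
# Assumption: step names and script paths are made up for illustration.
write_example_dag() {
    cat > "$1" <<'EOF'
# Run every day at 06:00 (standard five-field cron syntax)
schedule: "0 6 * * *"
steps:
  - name: fetch
    command: python3 /opt/ml/fetch.py
  - name: score
    command: python3 /opt/ml/score.py
    depends:
      - fetch
EOF
}

# A manual run from the CLI starts immediately, regardless of the schedule:
#   dagu start my_dag -- NAME=mydag
```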

So there are a couple of other ways to get your yaml dag file picked up and run on its schedule.

  • Craft your yaml inside the GUI, which will ultimately save your dag in the $DAGU_HOME/dags directory.
  • Drop your yaml into the $DAGU_HOME/dags directory and restart the dagu service (remember, I set it up as a service!) with "systemctl restart dagu". Since the service does a start-all, it starts the scheduler as well, which runs as a separate forked process.
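The second option can be sketched as a small helper. The function name deploy_dag is hypothetical, and the fallback to /opt/dagu simply mirrors the setup described in this post.

```shell
#!/bin/sh
# Sketch: copy a dag file into the dags directory and restart the service.
# Assumption: DAGU_HOME falls back to /opt/dagu when it is not set.
deploy_dag() {
    dags_dir="${DAGU_HOME:-/opt/dagu}/dags"
    # Stop here if the copy fails, so the service is not restarted for nothing
    cp "$1" "$dags_dir/" || return 1
    # Restarting the service brings the scheduler back up with the new dag
    systemctl restart dagu
}

# As root, something like:
#   deploy_dag my_dag.yaml
```

Afterwards, "systemctl status dagu" is a quick way to confirm the service came back up.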

Once it loads in, you can get different views and perspectives of your workflow. 

This is a pipeline view. 

This is a timeline view.

And this is a graph perspective.


 



