Deploying Apache Airflow on Akash: A Step-by-Step Guide

Introduction to Apache Airflow

Apache Airflow is an open-source platform designed for orchestrating workflows. It allows developers to create, schedule, and monitor workflows as directed acyclic graphs (DAGs). Airflow is highly extensible and can be used for a variety of automation tasks.
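
A DAG is ordinary Python code. The sketch below (written against the Airflow 2.x API; the DAG id, schedule, and command are arbitrary examples) defines a one-task workflow that Airflow can schedule and monitor:

from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# A single-task DAG: print a greeting once a day.
with DAG(
    dag_id="hello_akash",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    BashOperator(
        task_id="say_hello",
        bash_command="echo 'Hello from Airflow on Akash'",
    )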

Key Use Cases for Airflow

  1. Data Engineering: Automating ETL pipelines for data transformation and loading.
  2. Machine Learning Pipelines: Coordinating training, validation, and deployment of machine learning models.
  3. DevOps: Managing CI/CD pipelines and system automations.
  4. Analytics: Scheduling reports and running analytics workflows.
  5. Integration: Orchestrating tasks across multiple services and APIs.

Prerequisites

  1. Akash CLI: Ensure the Akash CLI is installed and configured.
  2. Docker Knowledge: Basic understanding of Docker and images.
  3. Apache Airflow Docker Image: We’ll use the official apache/airflow image.
  4. SDL Template: You can start from a pre-built SDL template for deploying applications on Akash, such as the one below.

Steps to Deploy Apache Airflow on Akash

1. Prepare Your SDL File

Create a deploy.yaml file that describes the resources and configurations for your Airflow deployment. Below is a sample SDL file for deploying Apache Airflow:

version: "2.0"
services:
airflow:
image: apache/airflow:latest
expose:
- port: 8080
as: 80
to:
- global: true
env:
- AIRFLOW__CORE__EXECUTOR=LocalExecutor
- AIRFLOW__CORE__SQL_ALCHEMY_CONN=sqlite:////usr/local/airflow/airflow.db
args:
- airflow webserver
volumes:
- size: 1Gi
resources:
cpu:
units: 500m
memory:
size: 512Mi
profiles:
compute:
airflow:
resources:
cpu:
units: 500m
memory:
size: 512Mi
storage:
size: 1Gi
placement:
akash:
pricing:
airflow:
denom: uakt
amount: 100
deployment:
airflow:
airflow:
profile: airflow
count: 1

2. Customize Airflow Configuration

  • Update the environment variables under env in the SDL file to suit your needs.
  • For a production setup, use a database such as PostgreSQL instead of SQLite (see the sketch after this list), and run the Airflow scheduler alongside the webserver, since the webserver alone will not execute DAGs.
  • Adjust resource requirements under the profiles.compute section.
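
As a minimal sketch, the env block pointed at an external PostgreSQL instance could look like the following; the host, credentials, and database name are placeholders to replace with your own:

env:
  - AIRFLOW__CORE__EXECUTOR=LocalExecutor
  # placeholder connection string; substitute your own host, user, password, and database
  - AIRFLOW__DATABASE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:airflow@postgres-host:5432/airflow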

3. Deploy the SDL File to Akash

Run the following commands to deploy Airflow on Akash:

  1. Validate Your SDL File:

    akash deployment validate deploy.yaml
  2. Send the Deployment:

    akash deployment create deploy.yaml
  3. Query the Lease: Find the lease created for your deployment:

    akash deployment lease-status --dseq <deployment-sequence>
  4. Access Airflow: Once the lease is active, the provider returns an external address and port for the service. Open it in your browser to reach the Airflow web server.

4. Set Up and Test DAGs

Once Airflow is running, upload your DAG files to the DAGs folder in the container (/opt/airflow/dags by default in the official image), using persistent storage or a mounted volume so they survive restarts. Then test your workflows to confirm everything is configured properly.
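
As a quick sanity check, you can also run the Airflow CLI from a shell inside the container (for example via the Akash CLI's lease-shell command); hello_akash below stands in for whatever dag_id your DAG defines:

    airflow dags list
    airflow dags test hello_akash 2024-01-01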


Conclusion

Deploying Apache Airflow on Akash leverages decentralized computing resources, reducing costs while maintaining scalability. By customizing the SDL template, you can deploy Airflow for various use cases, from data engineering to machine learning.
