
Data Engineer at qode.world


AI summary: Designs and builds scalable ETL/ELT data pipelines using Databricks, Airflow, Airbyte, and dbt to support analytics and data-driven decision-making across the organization.

Description

Data Engineer

Location: Vietnam

Workplace Type: Remote

About the Role

Our client is looking for a Data Engineer to join their Data Platform team, focusing on building scalable data pipelines and enabling analytics across the organization. In this role, you will work with a modern data stack (Databricks, AWS, Airflow, Airbyte, and dbt) to design and maintain data workflows that support reporting, analytics, and data-driven decision-making.

Responsibilities

  • Design and build scalable ETL/ELT pipelines using both batch and streaming approaches.
  • Develop ingestion workflows from multiple sources such as databases, APIs, and event streams.
  • Implement ingestion strategies including full load, incremental load, and CDC.
  • Orchestrate data workflows using Apache Airflow (a minimal DAG sketch follows this list).
  • Manage data connectors using Airbyte.
  • Work with Databricks Lakehouse to build and optimize data processing pipelines.
  • Write and optimize complex SQL queries for analytics and transformation.
  • Build modular and testable data models using dbt (staging → intermediate → marts).
  • Maintain data quality, observability, and reliability across the platform.
  • Work with AWS services such as S3, Lambda, EC2, IAM.
  • Containerize data services using Docker and Kubernetes (EKS) when needed.
  • Document pipelines, data models, and data dictionaries for long-term maintainability.
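
For candidates who want a concrete picture of the orchestration and dbt work above, here is a minimal Airflow DAG sketch (assuming Airflow 2.4+); the DAG name, schedule, commands, and paths are hypothetical illustrations, not the client's actual setup.

```python
# Minimal Airflow DAG sketch: ingest -> transform (dbt), run as a daily batch.
# All task names, schedules, and paths below are hypothetical examples.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="daily_sales_pipeline",      # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",                  # batch cadence; a streaming pipeline would differ
    catchup=False,
    tags=["example"],
) as dag:
    # Placeholder ingestion step (in practice this might trigger an Airbyte sync).
    ingest = BashOperator(
        task_id="ingest_raw_data",
        bash_command="echo 'trigger ingestion sync here'",
    )

    # Run dbt models in layers: staging -> intermediate -> marts.
    transform = BashOperator(
        task_id="run_dbt_models",
        bash_command="dbt build --project-dir /opt/dbt/analytics",  # hypothetical path
    )

    ingest >> transform  # simple linear dependency keeps the example minimal
```

In a real pipeline the ingestion task would typically call Airbyte and the dbt task would run against the Lakehouse, with quality checks and alerting added after the transform step.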

Requirements

  • At least 5 years of experience in Data Engineering.
  • Strong understanding of data architectures such as Data Lake, Data Warehouse, and Lakehouse.
  • Hands-on experience with ETL/ELT pipelines, including batch and streaming processing.
  • Familiarity with ingestion patterns: full load, incremental, CDC, and event-driven.
  • Experience working with Databricks (Delta Live Tables, Jobs, Notebooks).
  • Strong skills in PySpark or Spark SQL for large-scale data processing.
  • Solid understanding of Delta Lake (ACID transactions, time travel, schema evolution); a short PySpark merge sketch follows this list.
  • Experience with Apache Airflow (DAGs, scheduling, monitoring).
  • Experience with Airbyte or similar ingestion tools.
  • Strong SQL skills (CTEs, joins, window functions, query optimization).
  • Experience with dbt for transformation, testing, and documentation.
  • Hands-on experience with AWS (S3, Lambda, IAM, etc.).
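
As a rough illustration of the incremental/CDC and Delta Lake points above, the sketch below applies a batch of change records to a Delta table with a PySpark merge (assuming a Spark session with Delta Lake available, e.g. on Databricks); the bucket paths and key column are hypothetical.

```python
# PySpark sketch: apply a batch of change records (CDC-style upsert) to a Delta table.
# The bucket paths, key column, and table layout are hypothetical examples.
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("cdc_merge_example")
    .getOrCreate()
)

# Incoming change records from an ingestion landing zone (e.g. written by Airbyte).
changes = spark.read.format("parquet").load("s3://example-bucket/landing/orders/")

# Target Delta table in the lakehouse.
target = DeltaTable.forPath(spark, "s3://example-bucket/lakehouse/silver/orders")

# MERGE gives ACID upsert semantics: update rows whose keys match, insert new ones.
(
    target.alias("t")
    .merge(changes.alias("s"), "t.order_id = s.order_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)
```

A merge like this is a common way to implement incremental upserts on Delta Lake, since the table's ACID guarantees keep readers consistent while the batch is applied.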

Nice to have

  • Experience with Docker, Kubernetes (EKS).
  • Experience running Airflow or Airbyte on Kubernetes.
  • Familiarity with data quality tools such as Great Expectations or Soda.
  • Experience with Terraform or Infrastructure as Code.
  • Exposure to data governance or catalog tools (e.g., Databricks Unity Catalog).
  • Experience with CI/CD pipelines (e.g., GitHub Actions).
  • Strong Python skills for automation and pipeline scripting.

Benefits Package

  • Attractive salary range; we are open to negotiation if you're a strong fit.
  • Hybrid/remote-friendly culture: work where you grow best.
  • Flexible hours and async teamwork (we respect your focus time).
  • Work equipment support.
  • Allowance for Certification & Skill Development.
  • Year-end bonus & performance-based rewards.
  • 15 days of paid leave per year.
  • Career growth with personal coaching sessions.
  • Open, collaborative team culture: no micromanagement, only trust.
  • Tools & AI-powered workflows that make remote work easier.