Sr Data Engineer
Mastech Digital
About the Job
We are looking for a Senior Data Engineer to help us with ingestion, pipeline, and infrastructure work, while also bringing the technical depth to improve the systems around the systems: CI/CD, observability, and developer tooling. We want someone who can deliver on immediate data engineering priorities while also identifying and implementing improvements to our CI/CD and dbt processes that make the whole team faster and more confident. You'll be embedded with the Finance Data team, working closely with our engineers on production systems. Key to this vision is the ability to integrate AI capabilities throughout.
What You’ll Do:
* Build and maintain data ingestion pipelines across a variety of sources (web scraping, RPA, internal and external APIs, Fivetran, databases, etc.) into our Databricks data warehouse
* Develop and extend dbt models supporting financial reporting use cases including billing, revenue recognition, commissions, and ads revenue
* Contribute to monitoring, data quality, and alerting capabilities across our pipelines, ensuring data freshness and reliability for downstream consumers
* Support fine-grained access control and role management in Databricks
* Collaborate with analysts and finance stakeholders to scope and deliver new data sources and reporting requirements
Developer Experience & Platform (secondary focus):
* Own and improve CI/CD infrastructure around the Finance dbt repo — improving PR quality, automated testing, and validation confidence before changes reach production
* Implement AI-assisted code review workflows (using the Claude API) that surface lineage impacts, data diffs, and downstream risks directly in PRs
* Build automated testing frameworks for our data pipelines and containerized applications
* Automate documentation publishing so that dbt model metadata stays current and searchable in Confluence
* Establish and reinforce engineering standards: testing patterns, environment management, observability, and Git workflow discipline.
What We’re Looking For:
* 5+ years of experience in Data Engineering roles in production environments.
* Expert-level dbt proficiency — deep understanding of model configs, lineage, testing patterns, and CI approaches
* Strong Python skills, including writing testable, production-ready code
* Experience with AWS — particularly with ECS, S3, MWAA
* Experience with modern cloud data warehouses (Databricks preferred; Snowflake, BigQuery, Redshift, etc.)
* Proven ability to work with LLM APIs (Claude / Anthropic strongly preferred) to build agentic or automated review systems in a production context.
* Hands-on experience building and maintaining GitHub Actions workflows for data pipelines, including CI/CD for dbt repos.
* Familiarity with orchestration frameworks (Airflow) and how analytics transformations are scheduled within broader data workflows
* Experience implementing data diffing, data quality checks, and anomaly detection at scale.
* Terraform proficiency for infrastructure provisioning and management.
* Strong Git discipline and experience enforcing branching strategies, PR standards, and code review processes on data teams.
* Excellent written communication — you will be producing PR comments, Confluence documentation, and async team communications as artifacts of your work.
* BS in Computer Science, Engineering, Mathematics, or equivalent experience.