This position is listed on behalf of a partner company, which manages all applications and next steps. Our partner is looking for a Data Engineer (Databricks) based in the United States.
This role sits at the intersection of data engineering, product thinking, and modern AI-enabled platform design. You will be responsible for building and scaling production-grade data systems on Databricks that power digital products, analytics, and machine learning use cases across a diverse set of client environments. The work spans architecture, pipeline development, and hands-on implementation, with a strong emphasis on delivering real business outcomes rather than theoretical designs. You will operate in a consulting-style environment, working closely with clients, product teams, and engineers to translate complex data challenges into scalable, production-ready solutions. A key part of the role involves designing lakehouse architectures, enabling governed data access, and supporting AI and ML workflows through modern Databricks capabilities. This is a highly collaborative and fast-moving environment where technical depth, communication skills, and pragmatic decision-making are equally important.
Accountabilities
- Design, build, and maintain production-grade data pipelines on Databricks using tools such as Lakeflow Declarative Pipelines, Auto Loader, Structured Streaming, and CI/CD automation.
- Architect scalable Lakehouse solutions leveraging Delta Lake, medallion architecture, and Unity Catalog to support analytics, AI, and application workloads.
- Develop and optimize transformation layers using PySpark, DLT pipelines, and dbt, ensuring strong data quality, reliability, and performance.
- Implement data governance frameworks including access control, lineage, and compliance using Unity Catalog as a core platform capability.
- Collaborate with product, backend, and client stakeholders to design data models, APIs, and data contracts aligned with product and business needs.
- Build and support AI/ML foundations such as feature stores, MLflow, and model serving to enable production machine learning and AI workflows.
- Engage directly with clients to assess requirements, define data strategies, and deliver scalable, production-ready data solutions.
Requirements
- 3–5+ years of data engineering experience, including at least 2 years working in production Databricks environments.
- Strong hands-on expertise with Databricks components such as Delta Lake, Unity Catalog, Structured Streaming, and Lakeflow pipelines.
- Solid experience designing and operating Lakehouse architectures, including data modeling, partitioning, and performance optimization.
- Proficiency in SQL and Python with a focus on writing clean, efficient, and production-grade code.
- Experience working with cloud platforms (AWS and/or Azure), including storage, compute, networking, and IAM concepts.
- Strong understanding of data pipeline lifecycle, including testing, observability, CI/CD, and version control practices.
- Excellent communication skills with the ability to translate complex technical concepts for both technical and non-technical stakeholders.
- Ability to work in ambiguous, client-driven environments and deliver practical, high-quality engineering solutions.
Benefits
- Competitive compensation package ($120,000–$145,000 base salary)
- Equity participation opportunities
- Remote-first work environment (US-based)
- Flexible working arrangements
- Health, dental, and vision insurance (where applicable)
- Opportunity to work on high-impact client projects across multiple industries
- Learning and career development opportunities in modern data and AI technologies
- Exposure to cutting-edge Databricks and AI platform implementations