Lead Data Engineer - Pipelines, Spark Streaming and Spark Offline

JPMorganChase·Oracle Recruiting
Tampa, FL · Full-time · Posted Jul 3, 2026

Join us as we embark on a journey of collaboration and innovation, where your unique skills and talents will be valued and celebrated. Together we will create a brighter future and make a meaningful difference.

As a Lead Data Engineer at JPMorganChase within the Commercial & Investment Bank, you are an integral part of an agile team that works to enhance, build, and deliver data collection, storage, access, and analytics solutions in a secure, stable, and scalable way. As a core technical contributor, you are responsible for maintaining critical data pipelines and architectures across multiple technical areas within various business functions in support of the firm’s business objectives.

 

Job responsibilities

  • Collaborate with all of JPMorgan’s lines of business and functions to deliver software solutions
  • Experiment with, architect, develop, and productionize efficient data pipelines, data services, and data platforms that contribute to the business
  • Design and implement highly scalable, efficient, and reliable data processing pipelines, and perform analysis and generate insights to drive and optimize business results
  • Design and develop features and entities for ML and rule-based systems using Spark or another big data environment
  • Act on previously identified opportunities to converge physical, IT, and data security architecture to manage access
  • Apply reuse-first, AI-assisted practices within delivery and operational routines (e.g., backup/recovery validation and access control review support), ensuring traceability/auditability and alignment to resiliency and security expectations

 

Required qualifications, capabilities, and skills

  • Formal training or certification in data engineering concepts and 5+ years of applied experience
  • Demonstrated experience using enterprise-authorized AI capabilities within the work environment to support data engineering workflows with strong validation habits and awareness of data sensitivity
  • Ability to review and validate AI-assisted outputs (e.g., model/design summaries or operational checklists) before use, escalating when uncertain and following data handling requirements
  • Strong programming skills in Python and PySpark
  • Experience across the data lifecycle, including building data frameworks and working with data lakes
  • Experience with batch and real-time data processing using Spark or Flink, and with batch and real-time feature engineering using Spark, Flink, or Databricks
  • Working knowledge of AWS Glue and EMR for data processing, and of real-time data processing and feature engineering using Flink, Databricks Delta Live Tables, or Spark Streaming
  • Experience working with Databricks and Databricks Delta Live Tables
  • Experience building services using Glue, Lambda, EMR, or Flask, and deploying them on AWS EKS or Kubernetes
  • Working experience with both relational and NoSQL databases
  • Experience with ETL data pipelines for both batch and real-time processing, data warehousing, and NoSQL databases

Preferred qualifications, capabilities, and skills

  • Expertise in Amazon Web Services (AWS), Docker, and Kubernetes for cloud-native and containerized data solutions
  • Experience in big data technologies: Hadoop, Spark, Kafka, Flink
  • Experience in distributed system design and development
