AdTechTalent
Engineering | 4 days ago | On-site

Merkle

Associate Technical Architect

GCP, BigQuery, Dataflow, Dataproc, DBT, PySpark, SQL, Apache Beam, Looker, Vertex AI, BigQuery ML, LangChain, Vector Databases, Generative AI, Cloud Composer, Airflow, CI/CD, Python, Data Engineering, Data Platforms, Streaming, ETL, ELT, Semantic Layer, Data Governance, AI, ML, GenAI, Leadership

Key details

Salary

Not specified

Employment type

Full-time

Seniority

Lead

Years experience

5-10

Location

Mumbai, India

Full job description

Lead GCP Data Engineer role focused on designing, developing, and implementing scalable cloud-native data platforms and AI-ready data foundations on Google Cloud Platform. Responsibilities include building batch, streaming, and event-driven data pipelines; collaborating with architects; modernizing legacy data ecosystems; developing reusable data products; enabling semantic layers and business consumption; leveraging GCP services such as BigQuery, Dataflow, Dataproc, Pub/Sub, Cloud Storage, Cloud Composer, and Cloud SQL; building AI/ML and GenAI data pipelines; leading engineering teams; ensuring governance, reliability, and observability. Requires 7+ years in data engineering, 4+ years hands-on GCP experience, strong skills in scalable data pipelines, semantic modeling, cloud transformation, leadership, and communication. GCP Professional Data Engineer certification preferred.

What you'll do

  • Design, develop, and optimize scalable cloud-native data platforms and pipelines on GCP
  • Implement robust batch, streaming, and event-driven data processing solutions supporting enterprise analytics and AI use cases
  • Collaborate with Enterprise Architects to translate target-state architecture into scalable engineering implementations
  • Contribute to modernization of legacy data ecosystems into reusable, governed, and AI-ready cloud platforms
  • Support implementation of scalable ingestion, transformation, serving, and orchestration frameworks
  • Develop reusable and domain-oriented data products aligned with data mesh and data-as-a-product principles
  • Implement scalable and modular data pipelines supporting multiple downstream consumers including analytics, AI/ML, and operational applications
  • Contribute to implementation of data contracts, schema management, metadata enrichment, data quality frameworks, and reusable transformation patterns
  • Enable discoverability, trust, and operational reliability of enterprise data assets
  • Support implementation of semantic and business-consumption layers that simplify enterprise data access
  • Collaborate with analytics and BI teams to enable standardized business metrics, reusable dimensions, and governed KPI definitions
  • Contribute to semantic modeling and metadata integration initiatives supporting self-service analytics and AI consumption
  • Assist in improving enterprise data usability, consistency, and discoverability across platforms
  • Develop and optimize solutions leveraging GCP-native services and related tools, including BigQuery, Dataflow, Dataproc, DBT, Pub/Sub, Cloud Storage, Cloud Composer (Airflow), and Cloud SQL
  • Build scalable ETL/ELT frameworks and real-time streaming pipelines
  • Optimize data processing performance, reliability, scalability, and cost efficiency
  • Implement CI/CD pipelines and engineering automation for data platform delivery
  • Build AI-ready data pipelines and scalable feature engineering workflows supporting enterprise AI initiatives
  • Support integration with Vertex AI, BigQuery ML, vector databases, LangChain, and Generative AI Studio
  • Contribute to implementation of RAG architectures, semantic search, and AI-assisted data interaction patterns
  • Partner with AI/ML teams to operationalize scalable ML and GenAI workflows
  • Lead day-to-day engineering activities across multiple data engineering workstreams
  • Guide and mentor junior and mid-level data engineers on modern engineering best practices
  • Ensure adherence to coding standards, architecture guidelines, and operational best practices
  • Drive engineering quality through automated testing, observability, monitoring, and performance optimization
  • Collaborate with architects, product owners, analysts, and client stakeholders to ensure successful delivery outcomes
  • Implement data governance, lineage, monitoring, and observability frameworks
  • Support enforcement of enterprise standards around security, reliability, scalability, and operational readiness
  • Contribute to platform monitoring, incident management, and continuous improvement initiatives
  • Ensure production readiness of pipelines and data services through robust testing and validation processes

Requirements

  • Bachelor’s or Master’s degree in Computer Science, Engineering, Information Systems, or related field
  • 7+ years of experience in data engineering and cloud-native data platform development
  • 4+ years of hands-on experience delivering enterprise-scale solutions on GCP
  • Strong expertise in building scalable batch and streaming data pipelines
  • Experience working on modern enterprise data platforms supporting analytics, AI/ML, and GenAI use cases
  • Good understanding of semantic layer concepts, reusable data models, and governed data consumption patterns
  • Experience working within large-scale data modernization and cloud transformation initiatives
  • Strong problem-solving, debugging, and performance optimization skills
  • Proven ability to lead engineering teams and collaborate across architecture, product, and business functions
  • Excellent communication and stakeholder management skills
  • GCP certifications such as Professional Data Engineer preferred

Tech stack

GCP, BigQuery, Dataflow, Dataproc, DBT, PySpark, SQL, Apache Beam, Looker, Vertex AI, BigQuery ML, LangChain, Vector Databases, Generative AI Studio, Cloud Composer, Airflow, CI/CD, Python, Data Catalog
