AdTechTalent
Engineering, 15 days ago, Hybrid

PubMatic

Principal Software Engineer - Data Analytics

scala, spark, hadoop, kafka, python, java, aws, big data, genai, llm, langchain, data pipelines, real-time processing, distributed systems, software engineering, ai agents, prompt engineering, retrieval-augmented generation, snowflake

Key details

Salary

Not specified

Employment type

Full-time

Seniority

Senior

Years experience

5-10

Location

Pune, Maharashtra, India

Full job description

PubMatic seeks a Principal Software Engineer focused on Data Analytics and AI agent development. The role involves building scalable big data platforms and pipelines using Hadoop, Spark, Kafka, Snowflake, and AWS cloud services. Responsibilities include backend development with Java, REST APIs, and JDBC; designing real-time data workflows; leading projects; collaborating with cross-functional teams; and integrating LLMs (OpenAI, Claude, Mistral) for analytics and decision support. Candidates must have 6+ years of Java/backend experience, strong CS fundamentals, expertise in Big Data tools and GenAI applications, and the ability to lead feature development and debug distributed systems. The team follows Agile methodologies, and the role includes supporting customer issue resolution. A bachelor's degree in engineering or equivalent is required. The role offers a hybrid work schedule in Pune, India, with benefits including parental leave, healthcare, broadband reimbursement, and office amenities.

What you'll do

  • Build, design, and implement a highly scalable, fault-tolerant big data platform to process terabytes of data and provide in-depth analytics
  • Develop backend services using Java, REST APIs, JDBC, and AWS
  • Build and maintain Big Data pipelines using Spark, Hadoop, Kafka, and Snowflake
  • Architect and implement real-time data processing workflows and automation frameworks
  • Lead multiple projects to develop features for data processing and reporting platforms
  • Collaborate with product managers and cross-functional teams to build end-to-end products and features
  • Fix bugs to improve performance
  • Design and develop GenAI-powered agents for analytics, operations, and data enrichment using frameworks like LangChain, LlamaIndex, or custom orchestration systems
  • Integrate LLMs (OpenAI, Claude, Mistral) into existing services for query understanding, summarization, and decision support
  • Manage end-to-end GenAI workflows including prompt engineering, fine-tuning, vector embeddings, and retrieval-augmented generation (RAG)
  • Work closely with cross-functional teams to improve availability and scalability of large data platforms and software functionality
  • Participate in Agile/Scrum processes such as sprint planning, retrospectives, backlog grooming, user story management, and prioritization
  • Discuss software features with product managers
  • Support customer issues via email or JIRA, providing updates and patches
  • Perform code and design reviews

Requirements

  • 6+ years of coding experience in Java and backend development
  • Solid computer science fundamentals, including data structures, algorithm design, and creation of architectural specifications
  • Experience in professional software engineering best practices for full SDLC, including coding standards and code reviews
  • Hands-on experience with Big Data tools and systems like Scala Spark, Kafka, Hadoop, and Snowflake
  • Proven experience in building GenAI applications, including LLM integration, LangChain or similar libraries, prompt engineering, embeddings, and retrieval-augmented generation (RAG)
  • Experience in developing and deploying scalable, production-grade AI or data systems
  • Ability to lead end-to-end feature development and debug distributed systems
  • Experience in developing and delivering large-scale big data pipelines, real-time systems, and data warehouses preferred
  • Ability to achieve stretch goals in an innovative and fast-paced environment
  • Ability to learn new technologies quickly and independently
  • Excellent verbal and written communication skills
  • Strong interpersonal skills and desire to work collaboratively
  • Bachelor’s degree in engineering or equivalent from a recognized institute/university

Tech stack

Hadoop, Spark, Scala, Kafka, Spark Streaming, Python, Java, REST APIs, JDBC, AWS, Snowflake, LangChain, LlamaIndex, OpenAI, Claude, Mistral, Anthropic, Cohere

Benefits

  • Paternity/maternity leave
  • Healthcare insurance
  • Broadband reimbursement
  • Office kitchen with healthy snacks and drinks
  • Catered lunches
