AdTechTalent
Engineering · 15 days ago · Hybrid

PubMatic

Senior/Software Engineer - Data Analytics

scala, python, java, hadoop, spark, kafka, big data, genai, llm, langchain, snowflake, ai, data pipelines, cloud, aws, rest api, spark streaming, prompt engineering, rag, distributed systems, agile, scrum

Key details

Salary: Not specified

Employment type: Full-time

Seniority: Senior

Years of experience: 1-5

Location: Pune, Maharashtra, India

Full job description

PubMatic is hiring a Senior Software Engineer focused on data analytics and AI agent development. The role involves building scalable big data platforms and pipelines using Hadoop, Spark, Kafka, Snowflake, and cloud technologies (AWS). Responsibilities include backend development with Java, REST APIs, and JDBC; designing GenAI-powered agents with frameworks like LangChain and LlamaIndex; integrating LLMs (OpenAI, Claude, Mistral); and managing end-to-end GenAI workflows such as prompt engineering and retrieval-augmented generation (RAG). The candidate will collaborate with cross-functional teams, participate in Agile processes, support customers, and conduct code reviews. Requirements include 1-5 years of Java/backend experience, strong CS fundamentals, expertise in big data tools and GenAI applications, the ability to lead feature development, and a bachelor's degree in engineering or equivalent. The position follows a hybrid work model (3 days in office, 2 remote) in Pune, India. Benefits include parental leave, healthcare insurance, broadband reimbursement, and office amenities.

What you'll do

  • Build, design, and implement a highly scalable, fault-tolerant big data platform to process terabytes of data and provide in-depth analytics
  • Develop backend services using Java, REST APIs, JDBC, and AWS
  • Build and maintain Big Data pipelines using Spark, Hadoop, Kafka, and Snowflake
  • Architect and implement real-time data processing workflows and automation frameworks
  • Design and develop GenAI-powered agents for analytics, operations, and data enrichment using frameworks like LangChain, LlamaIndex, or custom orchestration systems
  • Integrate LLMs (OpenAI, Claude, Mistral) into existing services for query understanding, summarization, and decision support
  • Manage end-to-end GenAI workflows including prompt engineering, fine-tuning, vector embeddings, and retrieval-augmented generation (RAG)
  • Collaborate with cross-functional teams to improve availability and scalability of large data platforms and PubMatic software functionality
  • Participate in Agile/Scrum processes including sprint planning, retrospectives, backlog grooming, user story management, and prioritization
  • Discuss software features with product managers for the PubMatic Data Analytics platform
  • Support customers on issues raised via email or JIRA, and provide updates and patches
  • Perform code and design reviews

Requirements

  • 1-5 years of coding experience in Java and backend development
  • Solid computer science fundamentals, including data structures and algorithm design, and creation of architectural specifications
  • Expertise in professional software engineering best practices for the full software development life cycle, including coding standards and code reviews
  • Hands-on experience with Big Data tools and systems such as Scala, Spark, Kafka, Hadoop, and Snowflake
  • Proven expertise in building GenAI applications, including LLM integration, LangChain or similar agent orchestration libraries, prompt engineering, embeddings, and retrieval-augmented generation (RAG)
  • Experience in developing and deploying scalable, production-grade AI or data systems
  • Ability to lead end-to-end feature development and debug distributed systems
  • Experience in developing and delivering large-scale big data pipelines, real-time systems, and data warehouses preferred
  • Ability to achieve stretch goals in an innovative and fast-paced environment
  • Ability to learn new technologies quickly and independently
  • Excellent verbal and written communication skills
  • Strong interpersonal skills and desire to work collaboratively
  • Bachelor’s degree in engineering or equivalent from a well-known institute/university

Tech stack

Hadoop, Spark, Scala, Kafka, Spark Streaming, Python, Java, REST APIs, JDBC, AWS, Snowflake, LangChain, LlamaIndex, OpenAI, Claude, Mistral, Anthropic, Cohere

Benefits

  • Paternity/maternity leave
  • Healthcare insurance
  • Broadband reimbursement
  • Kitchen with healthy snacks and drinks
  • Catered lunches
