Full job description
PubMatic seeks a senior engineer with 2 to 5 years of experience in Generative AI and AI agent development. The role involves building and optimizing AI agents using Retrieval-Augmented Generation (RAG), vector databases, and large language models (LLMs). Responsibilities include technical leadership, designing and deploying AI-driven features, fine-tuning LLMs, developing RAG-powered AI agents, optimizing vector databases (FAISS, Pinecone, Weaviate), prompt engineering, and performance evaluation. Required skills include a strong understanding of LLMs, experience with agentic frameworks (LangGraph, CrewAI, AutoGen), vector database expertise, proficiency in Python and ML libraries (TensorFlow, PyTorch, Hugging Face), and the ability to communicate complex ideas. A bachelor's degree in engineering or equivalent is required. The position follows a hybrid work model (3 days in the office, 2 days remote) and is based in Pune, India. Benefits include parental leave, healthcare insurance, broadband reimbursement, and office amenities.
What you'll do
- Provide technical leadership and mentorship to engineering teams
- Collaborate with architects, product managers, and UX designers to create innovative AI solutions
- Lead design, development, and deployment of AI-driven features with end-to-end ownership
- Drive quick iterations based on customer feedback in an Agile environment
- Spearhead technical design meetings and produce detailed design documents
- Ensure solutions align with long-term product strategy and technical roadmaps
- Implement and optimize LLMs including fine-tuning, deploying pre-trained models, and evaluating performance
- Develop AI agents powered by RAG systems integrating external knowledge sources
- Design, implement, and optimize vector databases for scalable vector search
- Create and fine-tune sophisticated prompts for LLMs
- Utilize evaluation frameworks and metrics to assess and improve generative models
- Work with data scientists, engineers, and product teams to integrate AI capabilities into products and tools
- Stay current with the latest research and trends in LLMs, RAG, and generative AI
- Continuously monitor and optimize models for performance, scalability, and cost efficiency
Requirements
- 2 to 5 years of experience and a strong understanding of LLMs and their underlying principles, including transformer architecture, attention mechanisms, and hyperparameter tuning
- Proven experience designing and building AI agents, including multi-agent orchestration, tool-use patterns, multi-step planning, and agent memory architectures
- Hands-on experience with agentic frameworks such as LangGraph, CrewAI, or AutoGen, and familiarity with RAG pipelines integrating external knowledge sources
- In-depth knowledge of vector databases and indexing algorithms; practical experience with FAISS, Pinecone, Weaviate, or Milvus
- Experience with agent observability, tracing, and guardrails using tools like Langfuse or equivalent
- Proficiency in prompt engineering for context-sensitive, domain-specific LLM outputs
- Familiarity with Evals and other performance evaluation tools
- Proficiency in Python and experience with machine learning libraries such as TensorFlow, PyTorch, and Hugging Face Transformers
- Experience with data preprocessing, vectorization, and handling large-scale datasets
- Ability to present complex technical ideas to both technical and non-technical stakeholders
- Bachelor’s degree in engineering or equivalent from a recognized institute/university
Tech stack
Generative AI, AI agents, Retrieval-Augmented Generation (RAG), vector databases, large language models (LLMs), FAISS, Pinecone, Weaviate, Milvus, LangGraph, CrewAI, AutoGen, Langfuse, Python, TensorFlow, PyTorch, Hugging Face Transformers, Evals, Docker, Kubernetes, AWS, GCP, Azure
Benefits
- Paternity/maternity leave
- Healthcare insurance
- Broadband reimbursement
- Kitchen stocked with healthy snacks and drinks
- Catered lunches