AdTechTalent
Engineering | 21 days ago | Hybrid

InMobi Advertising

SDE III Gen AI

generative AI, LLM, Python, PyTorch, Transformers, LangChain, vector databases, GCP, Docker, Kubernetes, MLOps, RAG, multi-agent systems, prompt engineering, embedding, computer vision, AI governance, real-time inference, microservices, AI safety, multimodal AI, ONNX, TensorRT

Key details

Salary

Not specified

Employment type

Full-time

Seniority

Senior

Years experience

5-10

Location

Bangalore, Karnataka, India

Full job description

Design and implement production-ready generative AI applications serving millions of users. Build advanced RAG pipelines with vector databases, hybrid search, and caching. Develop multimodal AI systems integrating text, vision, and audio. Architect scalable microservices optimizing cost, latency, and reliability. Lead code reviews and technical design sessions. Optimize large language models through fine-tuning. Implement MLOps practices including automated testing, model versioning, A/B testing, and monitoring. Collaborate with product managers to translate business requirements into AI solutions. Deploy AI models on GCP using containerization and orchestration. Maintain technical documentation and mentor junior engineers. Research and prototype emerging AI technologies. Fine-tune language models for business use cases. Design multi-agent AI systems and advanced prompt engineering strategies. Build embedding systems handling billions of vectors. Develop computer vision pipelines. Create secure AI applications with safeguards and compliance. Optimize token usage and caching to reduce costs. Design evaluation frameworks with human feedback. Build real-time AI inference systems with sub-100ms latency. Integrate multiple foundation models with fallback and load balancing. Develop custom tools extending LLM capabilities. Implement advanced RAG techniques and multimodal search systems. Build AI-powered data processing pipelines. Deploy edge AI solutions optimized for resource-constrained environments.

Requires 5+ years of ML/AI experience, including 2+ years in generative AI; expert Python skills; experience with PyTorch, Transformers, LangChain, vector databases, GCP, Docker, and Kubernetes; and strong communication skills. Bachelor's degree required.

What you'll do

  • Design and implement production-ready generative AI applications serving millions of users
  • Build advanced RAG pipelines combining vector databases, hybrid search, and intelligent caching
  • Develop multimodal AI systems integrating text, vision, and audio capabilities
  • Architect scalable microservices that handle thousands of concurrent AI requests while optimizing cost, latency, and reliability
  • Lead code reviews and technical design sessions, and establish best practices and architectural patterns
  • Optimize large language models through fine-tuning for domain-specific performance
  • Implement comprehensive MLOps practices including automated testing, model versioning, A/B testing, and real-time monitoring
  • Collaborate with product managers and stakeholders to translate business requirements into AI solutions
  • Deploy AI models on cloud platforms (GCP) using containerization and orchestration
  • Create and maintain technical documentation, runbooks, and architectural decision records
  • Mentor junior engineers through pair programming, technical talks, and hands-on guidance
  • Research and prototype emerging AI technologies for competitive advantage
  • Fine-tune and optimize state-of-the-art language models for business use cases
  • Design multi-agent AI systems using orchestration frameworks
  • Implement advanced prompt engineering strategies
  • Build production-grade embedding systems handling billions of vectors with efficient indexing and hybrid search
  • Develop computer vision pipelines for object detection and visual question answering
  • Create secure AI applications with safeguards against prompt injection, jailbreaking, and data leakage, and ensure compliance with AI governance requirements
  • Optimize token usage and implement intelligent caching to reduce costs by 50-70%
  • Design and implement evaluation frameworks incorporating human feedback and domain-specific quality measures
  • Build real-time AI inference systems processing streaming data with sub-100ms latency
  • Integrate multiple foundation models with fallback mechanisms and load balancing for high availability
  • Develop custom tools/functions extending LLM capabilities to interact with databases, APIs, and external systems
  • Implement advanced RAG techniques including contextual embeddings, cross-encoder reranking, and Graph RAG
  • Create multimodal search systems that query text, images, and documents using natural language
  • Build AI-powered data processing pipelines to extract, transform, and enrich unstructured data at scale
  • Deploy edge AI solutions using ONNX and TensorRT to optimize models for resource-constrained environments

Requirements

  • 5+ years of hands-on experience building and deploying ML/AI systems
  • At least 2 years focused on generative AI and LLMs
  • Expert-level Python programming skills, including async programming, multiprocessing, and performance optimization
  • Strong experience with modern AI frameworks including PyTorch, Transformers, LangChain, and vector databases
  • Proven track record of deploying AI applications to production environments serving real users at scale
  • Deep understanding of transformer architectures, attention mechanisms, and latest advances in generative AI
  • Experience with cloud platforms (GCP) and containerization technologies (Docker, Kubernetes)
  • Excellent communication skills to explain complex AI concepts to technical and non-technical audiences
  • Proven experience improving large-scale product search and discovery, including dense retrieval, cross-encoder reranking, query understanding, and hybrid BM25 + vector search
  • Hands-on experience building and deploying production multi-agent systems using orchestration frameworks such as LangGraph and Google ADK
  • Bachelor's degree in Computer Science, Mathematics, or related field (Master's preferred but not required)

Tech stack

Python, PyTorch, Transformers, LangChain, vector databases, GCP, Docker, Kubernetes, ONNX, TensorRT, LangGraph, Google ADK

Benefits

  • Continuous learning and career progression through InMobi Live Your Potential program
  • Equal Employment Opportunity employer with reasonable accommodations for qualified individuals with disabilities
