SDE III Gen AI

generative AI, LLM, Python, PyTorch, Transformers, LangChain, vector databases, GCP, Docker, Kubernetes, MLOps, RAG, multi-agent systems, prompt engineering, embedding, computer vision, AI governance, real-time inference, microservices, AI safety, multimodal AI, ONNX, TensorRT

Key details

Salary

Not specified

Employment type

Full-time

Seniority

Senior

Years experience

5-10

Location

Bengaluru, India

Full job description

Design and implement production-ready generative AI applications serving millions of users. Build advanced RAG pipelines with vector databases, hybrid search, and caching. Develop multimodal AI systems integrating text, vision, and audio. Architect scalable microservices optimizing cost, latency, and reliability. Lead code reviews and technical design sessions. Optimize large language models through fine-tuning. Implement MLOps practices including automated testing, model versioning, A/B testing, and monitoring. Collaborate with product managers to translate business requirements into AI solutions. Deploy AI models on GCP using containerization and orchestration. Maintain technical documentation and mentor junior engineers. Research and prototype emerging AI technologies. Fine-tune language models for business use cases. Design multi-agent AI systems and advanced prompt engineering strategies. Build embedding systems handling billions of vectors. Develop computer vision pipelines. Create secure AI applications with safeguards and compliance. Optimize token usage and caching to reduce costs. Design evaluation frameworks with human feedback. Build real-time AI inference systems with sub-100ms latency. Integrate multiple foundation models with fallback and load balancing. Develop custom tools extending LLM capabilities. Implement advanced RAG techniques and multimodal search systems. Build AI-powered data processing pipelines. Deploy edge AI solutions optimized for resource-constrained environments.

Requires 5+ years ML/AI experience, 2+ years generative AI, expert Python, experience with PyTorch, Transformers, LangChain, vector DBs, GCP, Docker, Kubernetes, and strong communication skills. Bachelor's degree required.

What you'll do

Design and implement production-ready generative AI applications serving millions of users
Build advanced RAG pipelines combining vector databases, hybrid search, and intelligent caching
Develop multimodal AI systems integrating text, vision, and audio capabilities
Architect scalable microservices handling thousands of concurrent AI requests while optimizing cost, latency, and reliability
Lead code reviews and technical design sessions; establish best practices and architectural patterns
Optimize large language models through fine-tuning for domain-specific performance
Implement comprehensive MLOps practices including automated testing, model versioning, A/B testing, and real-time monitoring
Collaborate with product managers and stakeholders to translate business requirements into AI solutions
Deploy AI models across multiple cloud platforms (GCP) using containerization and orchestration
Create and maintain technical documentation, runbooks, and architectural decision records
Mentor junior engineers through pair programming, technical talks, and hands-on guidance
Research and prototype emerging AI technologies for competitive advantage
Fine-tune and optimize state-of-the-art language models for business use cases
Design multi-agent AI systems using orchestration frameworks
Implement advanced prompt engineering strategies
Build production-grade embedding systems handling billions of vectors with efficient indexing and hybrid search
Develop computer vision pipelines for object detection and visual question answering
Create secure AI applications with safeguards against prompt injection, jailbreaking, and data leakage, and ensure AI governance compliance
Optimize token usage and implement intelligent caching to reduce costs by 50-70%
Design and implement evaluation frameworks incorporating human feedback and domain-specific quality measures
Build real-time AI inference systems processing streaming data with sub-100ms latency
Integrate multiple foundation models with fallback mechanisms and load balancing for high availability
Develop custom tools/functions extending LLM capabilities to interact with databases, APIs, and external systems
Implement advanced RAG techniques including contextual embeddings, cross-encoder reranking, and Graph RAG
Create multimodal search systems querying text, images, and documents using natural language
Build AI-powered data processing pipelines to extract, transform, and enrich unstructured data at scale
Deploy edge AI solutions using ONNX and TensorRT, optimizing models for resource-constrained environments

Requirements

5+ years of hands-on experience building and deploying ML/AI systems
At least 2 years focused on generative AI and LLMs
Expert-level Python programming skills with async programming, multiprocessing, and performance optimization
Strong experience with modern AI frameworks including PyTorch, Transformers, LangChain, and vector databases
Proven track record of deploying AI applications to production environments serving real users at scale
Deep understanding of transformer architectures, attention mechanisms, and latest advances in generative AI
Experience with cloud platforms (GCP) and containerization technologies (Docker, Kubernetes)
Excellent communication skills to explain complex AI concepts to technical and non-technical audiences
Proven experience improving large-scale product search and discovery, including dense retrieval, cross-encoder reranking, query understanding, and hybrid BM25 + vector search
Hands-on experience building and deploying production multi-agent systems using orchestration frameworks such as LangGraph and Google ADK
Bachelor's degree in Computer Science, Mathematics, or related field (Master's preferred but not required)

Tech stack

Python, PyTorch, Transformers, LangChain, vector databases, GCP, Docker, Kubernetes, ONNX, TensorRT, LangGraph, Google ADK

Benefits

Continuous learning and career progression through InMobi Live Your Potential program
Equal Employment Opportunity employer with reasonable accommodations for qualified individuals with disabilities
