Full job description
Senior SDE role focused on building and shipping AI-powered workflows, agents, and features into production. The role requires fluency in LLM application design, agentic frameworks, and AI tooling.
Responsibilities include designing LLM-powered features, architecting RAG pipelines, building multi-step agentic workflows, standardising AI-assisted engineering workflows, embedding AI agents into development cycles, partnering cross-functionally to identify AI automation opportunities, running model quality evaluations, documenting AI systems, and ensuring responsible AI practices through monitoring and validation.
Required skills include 3+ years of experience shipping LLM-powered features in production; working knowledge of Claude, GPT-4o, and/or open-source models; agentic frameworks (LangGraph, LangChain, LlamaIndex, or CrewAI); RAG pipelines; prompt engineering; AI developer tooling (Claude Code, Cursor, GitHub Copilot, Augment Code, MCP); strong TypeScript/JavaScript and Node.js skills; React or an equivalent front-end framework; AWS or an equivalent cloud; and observability tooling for LLM systems. Python for ML tooling is a plus.
What you'll do
- Design, build, and ship LLM-powered features and bounded autonomous agents with observability, latency budgets, and fallback behaviour
- Architect RAG pipelines — chunking, embedding selection, retrieval, and re-ranking — for internal knowledge bases and product surfaces
- Build multi-step agentic workflows (LangGraph, LangChain, CrewAI, Claude Agent SDK) with scoped tool sets, structured outputs, and auditable side-effects
- Standardise AI-assisted engineering workflows (Claude Code, Cursor, GitHub Copilot, Augment Code) to drive measurable improvements in feature delivery
- Author CLAUDE.md / AGENTS.md context files, shared system prompts, slash commands, and MCP integrations for the engineering team
- Embed AI agents into code-review and test-generation loops with guardrails ensuring AI output meets engineering quality standards
- Partner with Product, Engineering, Data, and Operations to identify and scope high-leverage AI automation opportunities
- Run structured evals and prompt experiments; maintain harnesses that track model quality and alert on regressions after upgrades
- Document AI system designs, agent architectures, prompt libraries, and runbooks so the team can maintain and extend AI features independently
- Instrument LLM features with structured logging, cost tracking, and latency monitoring; own SLOs and incident response for AI-driven workflows
- Apply output validation (Zod / JSON Schema) and human-in-the-loop checkpoints to manage hallucination risk in production
Requirements
- 3+ years building and shipping LLM-powered features or AI agents in production
- Working knowledge of Claude (Sonnet / Opus), GPT-4o, and/or open-source models (Llama, Hugging Face ecosystem)
- Proficiency with agentic frameworks: LangGraph, LangChain, LlamaIndex, or CrewAI; experience designing bounded, auditable agent workflows
- Hands-on RAG experience: chunking, embeddings, vector stores (Pinecone, ChromaDB, pgvector), retrieval strategies, and re-ranking
- Strong prompt and context engineering: system prompt design, structured output enforcement (JSON Schema / Zod), and prompt caching
- Experience with Claude Code — context files (CLAUDE.md), slash commands, and agent guardrails
- Experience with Cursor — AI-assisted coding and codebase-aware refactoring workflows
- Experience with GitHub Copilot and/or Augment Code in daily development and code-review cycles
- Experience with MCP (Model Context Protocol) — configuring or authoring MCP servers for agent context management
- Strong TypeScript / JavaScript and Node.js skills; Python for ML tooling is a plus
- Experience with React or equivalent front-end framework; ability to build streaming, real-time AI-powered UIs
- Experience with AWS or equivalent cloud (Lambda, API Gateway, S3, DynamoDB); REST, WebSockets, SSE, OAuth 2.0
- Experience with observability tooling applied to LLM systems: structured logging, cost tracking, latency dashboards
Tech stack
LLM, TypeScript, JavaScript, Node.js, Python, React, AWS, Lambda, API Gateway, S3, DynamoDB, REST, WebSockets, SSE, OAuth 2.0, Claude, GPT-4o, Llama, Hugging Face, LangGraph, LangChain, LlamaIndex, CrewAI, Pinecone, ChromaDB, pgvector, Zod, JSON Schema, Cursor, GitHub Copilot, Augment Code, MCP, Weaviate, Qdrant