AdTechTalent
Engineering | 48 days ago | On-site

Amazon Web Services (AWS)

Support Engineer Manager II, Measurements & Data Science

AWS, distributed systems, billing, incident management, support engineering, automation, AI, machine learning, DevOps, cloud, technical leadership, root cause analysis, SLA, on-call, data analysis, team management, site reliability engineering, financial systems

Key details

Salary

Not specified

Employment type

Full-time

Seniority

Senior

Years experience

5-10

Location

Bengaluru, India

Full job description

Amazon's MADS Billing Support team seeks a Senior Support Engineering Manager to lead a team that ensures the high availability and performance of global advertising billing applications. Responsibilities include team leadership, operational excellence, incident management, strategic incident prevention, innovation with AI/ML and automation, cross-functional collaboration, technical leadership, and process quality management. The role requires 5+ years of engineering experience, including 2+ years in leadership, a strong background in distributed systems and support, incident management experience, and a proven record of operational excellence. Preferred qualifications include 8+ years of experience, AWS expertise, knowledge of billing or financial systems, experience implementing automation and AI/ML, global team collaboration, and Agile/DevOps practices. A Bachelor's degree in a relevant technical field is required.

What you'll do

  • Lead, mentor, and develop a team of Support Engineers, fostering a culture of operational excellence and continuous improvement
  • Hire, onboard, and retain top talent to build a world-class support engineering organization
  • Conduct performance reviews, provide coaching, and create career development plans for team members
  • Mentor engineers transitioning from IC roles to management positions, building future leaders
  • Drive team engagement and maintain high morale while managing demanding on-call rotations
  • Own the operational health of MADS Billing applications, ensuring high availability and performance of critical systems processing hundreds of millions of API requests daily
  • Establish and maintain SLAs, operational metrics, and KPIs to measure team effectiveness and system reliability
  • Lead incident management processes, ensuring rapid response, effective communication, and thorough post-incident reviews
  • Drive operational reviews, including Monthly Business Reviews (MBRs) and Weekly Business Reviews (WBRs), with senior leadership, presenting key metrics and improvement initiatives
  • Implement proactive monitoring and alerting strategies to detect and prevent issues before customer impact
  • Analyze incident patterns and trends to identify systemic issues requiring architectural or process improvements
  • Lead comprehensive Root Cause Analysis (RCA) efforts for critical incidents, driving corrective and preventive actions
  • Partner with development teams to prioritize and resolve recurring issues, improving overall system reliability
  • Build mechanisms to track incident recurrence and measure effectiveness of preventive measures
  • Establish feedback loops between support operations and product development to drive continuous improvement
  • Champion the adoption of Generative AI solutions and intelligent agents to transform incident handling and customer support
  • Drive automation initiatives to reduce manual toil, optimize billing systems, and simplify operational processes
  • Build self-healing mechanisms and automated remediation workflows to improve system resilience
  • Leverage AI/ML technologies for predictive analytics to prevent potential issues before they impact customers
  • Foster a culture of innovation where team members are empowered to experiment with new technologies and approaches
  • Partner closely with Software Development Managers (SDMs) leading billing development teams to align on priorities and improvements
  • Collaborate with partner teams across MADS, including Measurements, AdTech, and Data Science organizations
  • Work with Product Management to represent customer pain points and influence product roadmap decisions
  • Coordinate with global teams across multiple regions (US, EU, JP, MX, AU) to ensure consistent support coverage
  • Build strong relationships with stakeholders at all levels, including SVP-level visibility on key initiatives
  • Maintain deep technical expertise in distributed systems, billing platforms, and AWS cloud technologies
  • Guide architectural decisions for support tools, automation frameworks, and operational systems
  • Review and approve technical designs for complex automation and tooling projects
  • Stay current with emerging technologies and evaluate their applicability to support operations
  • Lead technical deep dives and serve as an escalation point for the most complex technical issues
  • Establish and continuously improve support processes, runbooks, and operational documentation
  • Implement quality assurance mechanisms to ensure consistent and high-quality customer interactions
  • Drive adoption of best practices across the support organization
  • Measure and improve key customer success KPIs including resolution time, customer satisfaction, and first-contact resolution
  • Build scalable processes that can support 10X growth in transaction volume

Requirements

  • 5+ years of engineering experience, including at least 2 years in a technical leadership or management role
  • Experience managing teams of engineers in a support, operations, or development capacity
  • Strong technical background in software development, distributed systems, or technical support
  • Experience with incident management, on-call operations, and maintaining high-availability systems
  • Proven track record of driving operational excellence and process improvements
  • 3+ years of experience in data analysis and in using analytics to make decisions
  • 3+ years of experience managing IT environments on behalf of customers
  • 5+ years of experience in a technical support, engineering, or operations environment
  • Bachelor's degree in Computer Science, Engineering, Mathematics, or a related field
  • Knowledge of operating systems, hardware, storage, networking, security, database administration, and cloud infrastructure
  • Experience in management: developing engineers into managers and building teams
  • Preferred: 8+ years of experience in software engineering, technical support, or site reliability engineering
  • Preferred: Experience managing support operations for large-scale distributed systems or SaaS platforms
  • Preferred: Deep knowledge of AWS services and cloud-based architectures
  • Preferred: Experience with billing, payments, or financial systems
  • Preferred: Track record of implementing automation and AI/ML solutions in operational contexts
  • Preferred: Experience working with global teams across multiple time zones and regions
  • Preferred: Strong data-driven decision-making skills with experience presenting to senior leadership
  • Preferred: Experience with Agile methodologies and DevOps practices
  • Preferred: Proven ability to hire, develop, and retain high-performing technical teams
  • Preferred: 3+ years of engineering experience
  • Preferred: Knowledge of AWS and computing concepts (AWS Elastic Beanstalk, AWS CloudFormation, or AWS OpsWorks)
  • Preferred: Experience in direct customer support
  • Preferred: Experience in engineering

Tech stack

AWS, AWS Elastic Beanstalk, AWS CloudFormation, AWS OpsWorks, distributed systems, automation, AI/ML, DevOps, cloud infrastructure, incident management, billing systems, data analysis

