AdTechTalent
Engineering · 1 month ago · Hybrid

Epsilon

Lead, Systems Administration

linux, windows, aws, azure, gcp, docker, kubernetes, terraform, ansible, python, go, bash, ci/cd, itil, sre, networking, infrastructure as code, monitoring, cloud, server management, security, incident management, root cause analysis

Key details

Salary

Not specified

Employment type

Full-time

Seniority

Senior

Years experience

5-10

Location

Bengaluru, India

Full job description

The role involves managing and overseeing server infrastructure including physical and virtual servers, networking, storage, and cloud environments. Responsibilities include monitoring system performance, capacity, and security, troubleshooting issues, conducting root cause analysis, maintaining infrastructure hardware and software updates, and reviewing operational documentation. The candidate will manage Wintel/Linux server farms, daily IT operations, and collaborate with global teams on service improvements and observability. Requirements include 7-10 years of experience, proficiency in Linux/OS administration, cloud platform operations (AWS/Azure/GCP), Docker, Kubernetes, Infrastructure as Code tools (Terraform/Ansible), scripting (Python/Go/Bash), ITIL certification, and strong communication skills. A bachelor's degree in engineering, computer science, IT, or equivalent is required.

What you'll do

  • Oversee server infrastructure (physical/virtual), networking, storage, and cloud environments
  • Proactively monitor system performance and capacity
  • Conduct troubleshooting and contribute to root cause analysis (RCA)
  • Monitor system security and ensure maximum uptime
  • Build reports, analyze data, and communicate findings to management
  • Investigate, diagnose, and act on operational events, alarms, and incidents
  • Ensure infrastructure hardware and software are up to date with patches
  • Review SOPs and operational documentation
  • Manage a Wintel/Linux-based server farm, including Windows/Linux versions, Microsoft SQL, and load balancing
  • Oversee daily IT operations including server maintenance and patching
  • Collaborate with global teams and internal partners on RCA, dashboards, and observability
  • Identify service improvements and operational efficiencies proactively

Requirements

  • Proficient in Linux/OS administration & scripting
  • Experience in Cloud/platform operations (AWS/Azure/GCP concepts)
  • Knowledge of Docker & Kubernetes fundamentals
  • Proficient in Infrastructure as Code (Terraform/Ansible)
  • Observability, SRE practices, SLI/SLO approach
  • Understanding of Networking essentials (DNS, TCP/IP, load balancing)
  • Excellent problem-solving, communication, and mentoring abilities
  • Knowledge of Windows/Wintel/Linux basics
  • Automation/tooling (Python/Go/Bash, CI/CD for infra)
  • Security & compliance fundamentals
  • ITSM & incident leadership (ITIL, RCA, partner communications)
  • Bachelor’s degree in Engineering, Computer Science, IT, or equivalent
  • 7-10 years of related experience
  • Strong verbal/written communication
  • Certifications: ITIL V4, MCSE, CCNA, AWS, IAT, MCSA, RHCE
  • Command Centre/NOC/SOC experience
  • Familiarity with application lifecycle and IT Service Management concepts

Tech stack

Linux, Windows, Wintel, AWS, Azure, GCP, Docker, Kubernetes, Terraform, Ansible, Python, Go, Bash, CI/CD, Microsoft SQL, Load Balancing, ITIL

Benefits

  • Opportunities for growth through learning, development, and career advancement
  • Focus on employee well-being
  • Collaborative work environment
  • Flexibility to balance work and personal life
  • Commitment to diversity, inclusion, and equal employment opportunities

