Full job description
The role involves managing and overseeing server infrastructure, including physical and virtual servers, networking, storage, and cloud environments. Responsibilities include monitoring system performance, capacity, and security; troubleshooting issues; conducting root cause analysis; keeping infrastructure hardware and software patched and up to date; and reviewing operational documentation. The candidate will manage Wintel/Linux server farms and daily IT operations, and collaborate with global teams on service improvements and observability. Requirements include 7-10 years of experience, proficiency in Linux/OS administration, cloud platform operations (AWS/Azure/GCP), Docker, Kubernetes, Infrastructure as Code tools (Terraform/Ansible), scripting (Python/Go/Bash), ITIL certification, and strong communication skills. A bachelor's degree in engineering, computer science, IT, or equivalent is required.
What you'll do
- Oversee server infrastructure (physical/virtual), networking, storage, and cloud environments
- Proactively monitor system performance and capacity
- Conduct troubleshooting and contribute to root cause analysis (RCA)
- Monitor system security and ensure maximum uptime
- Build reports, analyze data, and communicate findings to management
- Investigate, diagnose, and act on operational events, alarms, and incidents
- Ensure infrastructure hardware and software are up to date with patches
- Review SOPs and operational documentation
- Manage a Wintel/Linux-based server farm, including Windows/Linux versions, Microsoft SQL, and load balancing
- Oversee daily IT operations including server maintenance and patching
- Collaborate with global teams and internal partners on RCA, dashboards, and observability
- Identify service improvements and operational efficiencies proactively
Requirements
- Proficient in Linux/OS administration & scripting
- Experience in Cloud/platform operations (AWS/Azure/GCP concepts)
- Knowledge of Docker & Kubernetes fundamentals
- Proficient in Infrastructure as Code (Terraform/Ansible)
- Observability, SRE practices, SLI/SLO approach
- Understanding of Networking essentials (DNS, TCP/IP, load balancing)
- Excellent problem-solving, communication, and mentoring abilities
- Knowledge of Windows/Wintel/Linux basics
- Automation/tooling (Python/Go/Bash, CI/CD for infra)
- Security & compliance fundamentals
- ITSM & incident leadership (ITIL, RCA, partner communications)
- Bachelor’s degree in engineering, computer science, IT, or equivalent
- 7-10 years of related experience
- Strong verbal/written communication
- Certifications: ITIL V4, MCSE, CCNA, AWS, IAT, MCSA, RHCE
- Command Centre/NOC/SOC experience
- Familiarity with application lifecycle and IT Service Management concepts
Tech stack
Linux, Windows, Wintel, AWS, Azure, GCP, Docker, Kubernetes, Terraform, Ansible, Python, Go, Bash, CI/CD, Microsoft SQL, Load Balancing, ITIL
Benefits
- Opportunities for growth through learning, development, and career advancement
- Focus on employee well-being
- Collaborative work environment
- Flexibility to balance work and personal life
- Commitment to diversity, inclusion, and equal employment opportunities