Site Reliability Engineer (SRE)

Location: Bangkok, Thailand

Company: Gulf Group

Job Position

Cloud Site Reliability Engineer (SRE)

Job Responsibilities

  • Design, implement, and manage cloud infrastructure using best practices.
  • Monitor system performance and reliability; troubleshoot and resolve incidents promptly.
  • Collaborate with development teams to ensure services are built with reliability and scalability in mind.
  • Automate manual processes through scripting and tool development.
  • Implement and manage CI/CD pipelines to streamline deployment processes.
  • Conduct capacity planning and performance tuning to optimize system efficiency.
  • Develop and maintain documentation for system architecture, processes, and procedures.
  • Participate in on-call rotations to respond to system alerts and outages.
  • Foster a culture of reliability by driving initiatives around incident management, postmortems, and root cause analysis.

Job Qualifications

  • Bachelor’s degree in Computer Science, Engineering, or a related field.
  • Proven experience with at least one major cloud platform (AWS, Azure, or GCP).
  • Strong understanding of system architecture, networking, and security best practices.
  • Proficiency in programming/scripting languages (Python, Go, Bash, etc.).
  • Experience with containerization and orchestration tools (Docker, Kubernetes).
  • Familiarity with monitoring and logging tools (Prometheus, Grafana, ELK stack).
  • Experience with infrastructure as code tools (Terraform, CloudFormation).
  • Knowledge of DevOps practices and methodologies.
  • Familiarity with agile development processes.
  • Excellent problem-solving skills and attention to detail.
  • Strong communication and collaboration skills.
  • Good command of English (minimum TOEIC score of 750).
  • Embodies the values Goal-Oriented, Unity, Learning, and Flexible.