Site Reliability Engineer (SRE)
Location:
Bangkok, Thailand
Company:
Gulf Group
Job Position
Cloud Site Reliability Engineer (SRE)
Job Responsibilities
- Design, implement, and manage cloud infrastructure using best practices.
- Monitor system performance and reliability; troubleshoot and resolve incidents promptly.
- Collaborate with development teams to ensure services are built with reliability and scalability in mind.
- Automate manual processes through scripting and tool development.
- Implement and manage CI/CD pipelines to streamline deployment processes.
- Conduct capacity planning and performance tuning to optimize system efficiency.
- Develop and maintain documentation for system architecture, processes, and procedures.
- Participate in on-call rotations to respond to system alerts and outages.
- Foster a culture of reliability by driving initiatives around incident management, postmortems, and root cause analysis.
Job Qualifications
- Bachelor’s degree in Computer Science, Engineering, or a related field.
- Proven experience with cloud platforms (AWS, Azure, GCP).
- Strong understanding of system architecture, networking, and security best practices.
- Proficiency in programming/scripting languages (Python, Go, Bash, etc.).
- Experience with containerization and orchestration tools (Docker, Kubernetes).
- Familiarity with monitoring and logging tools (Prometheus, Grafana, ELK stack).
- Experience with infrastructure-as-code tools (Terraform, CloudFormation).
- Knowledge of DevOps practices and methodologies.
- Familiarity with agile development processes.
- Excellent problem-solving skills and attention to detail.
- Strong communication and collaboration skills.
- Good command of English (minimum TOEIC score of 750).
- Goal-Oriented, Unity, Learning, Flexible.