Moladin is hiring a

Site Reliability Engineer (Smart Solutions)

This job was posted more than 2 months ago

As a Site Reliability Engineer, you will play a critical role in ensuring the reliability, availability, and performance of our infrastructure, applications, and hardware and software systems. You will collaborate with cross-functional teams to design, build, maintain, and monitor robust systems that can withstand the challenges of high-traffic and mission-critical environments. You will also have the opportunity to travel around the country for on-site system setup and maintenance. As a critical thinker, your situational awareness and flexibility in dealing with system issues will be key to our success.

Responsibilities

Implement and maintain best practices for system reliability, availability, and scalability while minimizing downtime and disruptions for both software and hardware systems.
Develop and enhance automation tools and scripts for system monitoring, deployment, and recovery to streamline operational processes.
Identify and resolve performance bottlenecks, proactively optimizing system components to ensure optimal response times and resource utilization.
Participate in on-call rotations to respond to and resolve system incidents promptly and efficiently, ensuring minimal impact on end-users. Site visits and on-site debugging will be needed.
Use Infrastructure as Code tools to manage and version infrastructure, making it more predictable and reproducible.
Set up and maintain robust monitoring, alerting, and logging systems to detect and mitigate issues before they impact the user experience.
Analyze system performance trends and collaborate with other teams to plan for capacity requirements and scaling as needed.
Implement security best practices and participate in vulnerability assessments to protect systems and data from threats.
Maintain comprehensive documentation for systems, configurations, and procedures to ensure knowledge sharing and smooth knowledge transfer within the team.

Requirements

Bachelor's degree in a technical or scientific field such as Software Engineering, Computer Science, Electrical Engineering, or IT preferred.
Minimum of 4 years of proven experience as a Site Reliability Engineer or in a similar role.
Proficiency in scripting and automation with languages such as Python and Bash.
Familiarity with cloud platforms (e.g., AWS, Azure, GCP).
Strong knowledge of containerization and orchestration technologies (e.g., Docker, Kubernetes).
Experience with Infrastructure as Code tools (e.g., Terraform, Ansible).
Solid understanding of monitoring tools and practices.
Knowledge of security best practices and incident response.
Experience with and knowledge of IoT (e.g., sensors, Raspberry Pi, device management).
Experience with or interest in electrical circuit design, PCB layout, and soldering preferred.
Excellent problem-solving and communication skills.
Ability to work effectively in a collaborative team environment.
Experience with and knowledge of back-end microservices is a plus.
You are a problem solver with good analytical skills.

Comfortable in conversational English.
