Overview

Designing and implementing SLIs/SLOs aligned to key customer journeys.
Strong knowledge of observability concepts: logs, metrics, traces, SLIs/SLOs.
Integrating observability tools like Dynatrace, Elastic, and Nagios to provide deep insights into application performance and reliability.
Building alerting pipelines via PagerDuty to ensure timely and actionable notifications for support teams.
Collaborating with senior SREs and application teams to identify gaps and drive improvements in monitoring coverage and incident response.
ML/anomaly detection strategies.

Responsibilities

Design and implement SLIs/SLOs for critical customer journeys.
Integrate and maintain observability tools (Dynatrace, Elastic, Nagios).
Build and maintain alerting pipelines (PagerDuty).
Collaborate with senior SREs and application teams.
Develop ML/anomaly detection strategies for observability.

Qualifications

Strong knowledge of logs, metrics, traces, SLIs/SLOs.
Experience with Dynatrace, Elastic, Nagios, PagerDuty, OTEL.
Knowledge of ITIL, Java, AWS.
Experience in containerised environments.
Tertiary qualification in Computer Science or Engineering.

Work With

Environments: mix of on-prem and cloud-hosted applications.
OTEL experience.
ITIL.
Java.
AWS knowledge.
Automation/coding.
Experience with containerised environments.

Contact

Interested candidates can send their updated resume to or reach me @ M:
Site Reliability Engineer
CARECONE GROUP
Council of the City of Sydney
Published 4 days ago