Job VC
Site Reliability Engineer ID60188
Description
Hi there! AgileEngine is an Inc. 5000 company that creates award-winning software for Fortune 500 brands and trailblazing startups across 17+ industries. We rank among the leaders in areas like application development and AI/ML, and our people-first culture has earned us multiple Best Place to Work awards.
Why join us
If you’re looking for a place to grow, make an impact, and work with people who care, we’d love to meet you! :)
About the role
We are looking for an SRE Operations Engineer to keep production and staging environments running reliably across a cloud-based SaaS platform. You’ll respond to live incidents, reduce operational toil through automation, and improve observability using Kubernetes, Terraform, Grafana, and AWS. This is a hands-on role with real ownership across CI/CD pipelines, GitOps workflows, and on-call rotations.
What you will do
● Monitor and support production and staging environments in real time, ensuring high availability, performance, and stability;
● Respond to incidents, perform triage and root cause analysis, and contribute to post-incident reviews and remediation efforts;
● Participate in an on-call rotation with defined SLAs;
● Handle ad-hoc and unplanned operational requests from Product, Support, and internal teams;
● Maintain and enhance monitoring, alerting, dashboards, logs, and metrics, and improve observability practices;
● Support CI/CD pipelines, production releases, and GitOps workflows;
● Contribute to automation efforts to reduce operational toil;
● Maintain and improve Kubernetes-based infrastructure and containerized workloads;
● Support Infrastructure as Code practices and ongoing environment improvements.
Must haves
● 2+ years of experience in Site Reliability Engineering, DevOps, or Production Operations;
● Experience with AWS supporting production environments;
● Experience supporting production SaaS applications;
● Strong understanding of CI/CD systems such as GitHub Actions, Jenkins, or CircleCI;
● Experience with GitOps and strong Git fundamentals;
● Experience using GitHub, Jira, and Confluence in collaborative environments;
● Experience with Kubernetes such as EKS or kOps;
● Experience with Docker and containerization;
● Experience with observability tools such as Grafana, Prometheus, Loki, or PagerDuty;
● Experience with scripting languages such as Bash, Python, or Go;
● Experience with Infrastructure as Code such as Terraform or Helm;
● Ability to work within structured operational processes and SLAs;
● Strong written and verbal English communication skills;
● Self-driven with a growth mindset.
Nice to haves
● AWS certifications such as Solutions Architect, DevOps Engineer, or SysOps Administrator;
● Experience in multi-tenant SaaS environments;
● Experience working in globally distributed teams;
● Familiarity with ChatOps practices;
● Experience improving monitoring quality and reducing alert fatigue.
Perks and benefits
● Professional growth: Accelerate your professional journey with mentorship, TechTalks, and personalized growth roadmaps
● Competitive compensation: We match your ever-growing skills, talent, and contributions with competitive USD-based compensation and budgets for education, fitness, and team activities
● A selection of exciting projects: Join projects with modern solutions development and top-tier clients that include Fortune 500 enterprises and leading product brands
● Flextime: Tailor your schedule for an optimal work-life balance, with the option of working from home or going to the office, whatever makes you the happiest and most productive.
Meet Our Recruitment Process
Asynchronous stage: An automated, self-paced track that helps us move faster and give you quicker feedback:
● Short online form to confirm basic requirements
● 30–60-minute skills assessment
● 5-minute introduction video
Synchronous stage: Live interviews
● Technical interview with our engineering team (scheduled at your convenience)
● Final interview with your future teammates
If it’s a match, you’ll get an offer!