Cloud Operations Engineer ID59491

AgileEngine · dou · Not specified · Lviv, remote
Hi there! This position is for Ukraine-based candidates.
AgileEngine is an Inc. 5000 company that creates award-winning software for Fortune 500 brands and trailblazing startups across 17+ industries. We rank among the leaders in areas like application development and AI/ML, and our people-first culture has earned us multiple Best Place to Work awards.
Why join us
If you’re looking for a place to grow, make an impact, and work with people who care, we’d love to meet you! :)

About the role
We are looking for a Cloud Operations Engineer to own production support across a full AWS-native technology stack supporting multiple platforms and hundreds of terabytes of data. You will monitor systems, triage and escalate incidents, execute operational playbooks, and build automation to reduce manual toil across ECS, RDS, Glue, Lambda, and observability tooling. The role operates as a standalone support function in a fintech environment with SLA accountability and on-call responsibilities.

What you will do
● Monitor production systems and respond to alerts across the full stack;
● Perform first-level triage on incidents and support requests and escalate to developers with thorough context and diagnostics;
● Execute patching, operational tasks, and documented playbooks;
● Contribute to improving documentation, monitoring coverage, reporting, and automation of operational capabilities;
● Follow and contribute to the improvement of incident management procedures and participate in post-incident reviews;
● Identify recurring issues and systemic risks, and diagnose root causes before they escalate;
● Manage and measure SLA performance across supported services and highlight risks;
● Implement and enhance automation and monitoring based on existing frameworks, agentic workflows, and scripting to minimize manual toil;
● Collaborate with help desk and deskside support partners for production tasks affecting employees;
● Support security incident handling in coordination with internal processes.

Must haves

● 3+ years of experience in production support, SRE, NOC, or operations engineering roles responsible for system uptime and incident resolution;
● Hands-on AWS operations experience across compute, networking, and security services;
● Operational proficiency with PostgreSQL and/or Amazon RDS;
● Full-stack triage ability across infrastructure, data pipelines, and application layers;
● Experience working in SLA-driven environments and meeting performance targets;
● Experience implementing automation and using AI and ML tools to streamline operations;
● Strong communication and coordination skills for cross-functional work with developers, security partners, and support providers;
● Upper-intermediate English level.

Nice to haves
● Experience working within incident response frameworks such as ITIL, NIST, or equivalent;
● Experience with AWS data services such as Glue, S3, Athena, and EventBridge and ETL pipeline operations;
● Familiarity with Datadog, Metaplane, or comparable observability and data quality platforms;
● Infrastructure-as-code proficiency with SAM, CloudFormation, or Terraform;
● Background in financial services or environments with regulatory and compliance requirements;
● AWS certifications such as Solutions Architect, SysOps Administrator, or equivalent.

Perks and benefits


● Professional growth: Accelerate your professional journey with mentorship, TechTalks, and personalized growth roadmaps.
● Competitive compensation: We match your ever-growing skills, talent, and contributions with competitive USD-based compensation and budgets for education, fitness, and team activities.
● A selection of exciting projects: Join projects with modern solutions development and top-tier clients that include Fortune 500 enterprises and leading product brands.
● Flextime: Tailor your schedule for an optimal work-life balance, with the option of working from home or going to the office, whichever makes you happiest and most productive.

Meet Our Recruitment Process
Asynchronous stage: an automated, self-paced track that helps us move faster and give you quicker feedback
● Short online form to confirm basic requirements
● 30–60 minute skills assessment
● 5-minute introduction video
Synchronous stage: live interviews
● Technical interview with our engineering team (scheduled at your convenience)
● Final interview with your future teammates
If it’s a match, you’ll get an offer!