This position is no longer accepting applications. The listing was closed on 5/20/2026.

Support Engineer, AWS Incident Response

at Amazon

Location

Seattle, Washington, USA

Compensation

$90k–$158k USD

Type

full time

Posted

3 days ago

Job description

AWS Incident Response (AIR) keeps AWS working for millions of customers. When major incidents hit, AIR leads the response, coordinating resolvers across AWS and driving mitigation. We move fast, but not carelessly, obsessing over observability of the cloud and perpetually improving our detection and response speed and accuracy. We ensure each incident drives improvements that strengthen AWS. It's a high-visibility, high-impact role with a global view of AWS health that few teams get to see.

The Role

As a Support Engineer on AIR's Seattle team, you'll be on the front line of AWS incident response. You'll lead high-severity calls, triage complex failures across distributed systems, coordinate resolver teams, and drive incidents to mitigation while millions of customers depend on the outcome. Between incidents, you'll obsess over metrics and detection analysis, building dashboards and mechanisms that surface problems before customers notice. You will drive operational improvements that make the incident management ecosystem faster and more accurate.

This isn't a role where you watch dashboards and robotically follow runbooks. You'll deep-dive the largest, most complex technical environment in the world. You'll develop expertise across AWS services, networking, and infrastructure. You'll own operational processes end-to-end and use data to find the next leap in how we protect the cloud. If you're interested, you'll also have the opportunity to grow your development skills by taking on coding projects matched to your ability level.

This role includes participation in an on-call rotation, including some weekends and holidays.

Key job responsibilities
Incident Response
Lead high-severity incident response calls. Triage, coordinate resolvers across AWS service teams, communicate clearly under pressure, and drive incidents to mitigation. Manage escalations and ensure accurate documentation throughout.

Operational Excellence and Detection
Own and run operational health reviews. Build and maintain dashboards, metrics, and monitoring that surface trends before they become incidents. Obsess over detection accuracy and speed. Detect patterns across events and drive proactive mechanisms to prevent recurrence.

Metrics and Analysis
Deep-dive operational data to identify systemic issues, measure response effectiveness, and prioritize improvements. Use metrics to tell the story of what's working, what's degrading, and where the next risk is hiding.

Process and Tooling Improvement
Identify gaps in operational processes, documentation, and tooling. Build or improve mechanisms that reduce time-to-detection and time-to-mitigation. Use data to prioritize where effort has the highest impact.

Automation and Generative AI
Leverage scripting, generative AI, and automation to accelerate incident response, improve detection, and reduce toil. Identify opportunities where AI can augment human judgment during incidents or surface insights from operational data at scale.

Driving Continuous Improvement
Ensure each incident makes AWS stronger. Work with service teams to ensure learnings from incidents drive corrective actions and that follow-through happens. Close the loop between what broke and what gets fixed.

Qualifications
- 2+ years of technical support experience
- Direct experience participating in incident response for production systems
- Strong understanding of operating systems (Linux), networking fundamentals, and distributed systems
- Experience with operational monitoring, alerting, and metrics (CloudWatch, Datadog, Grafana, or equivalent)
- Demonstrated ability to troubleshoot complex technical problems spanning multiple systems or services
- Experience scripting or programming in at least one modern language (Python, Bash, Go, or similar)
- Ability to clearly break down technical complexity for a wide range of audiences, from engineers to senior leadership, without relying on jargon
- Familiarity with incident management tooling and workflows
- Experience with AWS services and cloud infrastructure
- Experience using generative AI or automation to solve operational problems or accelerate workflows
- Track record of authoring post-incident analyses (post-mortems) and driving corrective actions to completion
- Experience building operational dashboards, runbooks, or automation that improved team efficiency
- Experience coordinating across globally distributed teams and time zones

Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.

Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.

The base salary range for this position is listed below. Your Amazon package will include sign-on payments and restricted stock units (RSUs). Final compensation will be determined based on factors including experience, qualifications, and location. Amazon also offers comprehensive benefits including health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance and option for Supplemental life plans, EAP, Mental Health Support, Medical Advice Line, Flexible Spending Accounts, Adoption and Surrogacy Reimbursement coverage), 401(k) matching, paid time off, and parental leave. Learn more about our benefits at https://amazon.jobs/en/benefits.

USA, WA, Seattle - 90,400.00 - 158,200.00 USD annually
