AWS Infrastructure Services owns the design, planning, delivery, and operation of all AWS global infrastructure. In other words, we're the people who keep the cloud running. We support all AWS data centers and all of the servers, storage, networking, power, and cooling equipment that ensure our customers have continual access to the innovation they rely on. We work on the most challenging problems, with thousands of variables impacting the supply chain, and we're looking for talented people who want to help.
You'll join a diverse team of software, hardware, and network engineers, supply chain specialists, security experts, operations managers, and other vital roles. You'll collaborate with people across AWS to help us deliver the highest standards for safety and security while providing seemingly infinite capacity at the lowest possible cost for our customers. And you'll experience an inclusive culture that welcomes bold ideas and empowers you to own them to completion.
The HealthGuardian team is looking for a software engineer who is excited about building automated detection and mitigation systems that protect AWS infrastructure at scale. We detect subtle failures that evade traditional health checks and automatically remove affected resources from service before customers are impacted. Our systems run across every AWS region, and we're scaling coverage from hundreds of services to thousands. This is a hands-on position where you will design and deliver significant software components, drive cross-team technical alignment, and mentor other engineers. You need to be a strong software developer with a track record of delivering, but also excel in communication, technical leadership, and customer focus. You'll leverage generative AI tools as part of your daily workflow to accelerate design, development, and validation. This is an opportunity to join a small, high-impact team solving hard reliability problems and help shape both the technology and the direction of automated failure protection across AWS.
Key job responsibilities
Our engineers collaborate across diverse teams, projects, and environments to have a firsthand impact on AWS reliability. You'll bring a passion for distributed systems, safety engineering, and data-driven detection. You'll also:
- Design and deliver systems that span multiple AWS teams and organizational boundaries.
- Build detection algorithms and experimentation frameworks that validate changes at scale.
- Architect safety mechanisms (circuit breakers, throttling, validation) that let automation scale without unintended customer impact.
- Own ambiguous problems end-to-end from design through operations.
- Mentor other engineers and lead technical design reviews.
- Use AI-assisted development tools to prototype, test, and validate faster.
About the team
We are a small team with outsized impact on AWS reliability. We operate what we build, and every engineer has direct visibility into how their code performs during real infrastructure events. We solve complex distributed systems challenges to ensure automated protection works reliably even during the failures it's designed to detect. We value operational rigor, building systems that are safe by default, and solving hard problems with simple designs.
- 3+ years of non-internship professional software development experience
- 2+ years of non-internship design or architecture (design patterns, reliability and scaling) of new and existing systems experience
- 2+ years of programming with at least one software programming language experience
- 2+ years of full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations experience
- Bachelor's degree in computer science or equivalent
- Experience in mentoring, leading, or managing more junior engineers
Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.
Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.
The base salary range for this position is listed below. Your Amazon package will include sign-on payments and restricted stock units (RSUs). Final compensation will be determined based on factors including experience, qualifications, and location. Amazon also offers comprehensive benefits including health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance and option for Supplemental life plans, EAP, Mental Health Support, Medical Advice Line, Flexible Spending Accounts, Adoption and Surrogacy Reimbursement coverage), 401(k) matching, paid time off, and parental leave. Learn more about our benefits at https://amazon.jobs/en/benefits.
USA, WA, Seattle - 143,700.00 - 194,400.00 USD annually