Staff Software Engineer, AI and Infrastructure

at Google

Location

Sunnyvale, CA, USA

Compensation

$207k–$300k USD

Type

full time

Posted

1 week ago

Job description

Google Cloud’s mission is to make every business successful through AI by combining cutting-edge technology, infrastructure, and talent. AI/ML software engineers in Cloud bridge the gap between pioneering models and a massive product vehicle reaching billions. Our talent density and AI-powered tools drive rapid development, rooted in a culture of empowerment and a bias to action. In this role, you aren’t just building technology; you’re shaping the frontier of enterprise and driving the evolution of advanced models.

Our team develops Borglet, Google’s “node agent”, which manages the life cycle of all user processes that run on all our machines. The Borglet Infrastructure group within Borglet is a team of 30 SWEs (distributed between Sunnyvale and Warsaw) developing core pieces of Borglet (Software Architecture, Runtime, Machine Management, Storage) to deliver container infrastructure that is scalable, extensible, efficient, and secure. The team is focused on key large initiatives in MSCA such as ML/AI Infrastructure (GPU/TPUs in Borg), Security, Capacity Fungibility, and Warp Space (TI VMs).

The ML, Systems, & Cloud AI (MSCA) organization at Google designs, implements, and manages the hardware, software, machine learning, and systems infrastructure for all Google services (Search, YouTube, etc.) and Google Cloud. Our end users are Googlers, Cloud customers and the billions of people who use Google services around the world.

We prioritize security, efficiency, and reliability across everything we do, from developing our latest TPUs to running a global network, while shaping the future of hyperscale computing. Our global impact spans software and hardware, including Google Cloud’s Vertex AI, the leading AI platform for bringing Gemini models to enterprise customers.

The US base salary range for this full-time position is $207,000-$300,000 + bonus + equity + benefits. Our salary ranges are determined by role, level, and location. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training. Your recruiter can share more about the specific salary range for your preferred location during the hiring process.

Please note that the compensation details listed in US role postings reflect the base salary only, and do not include bonus, equity, or benefits. Learn more about benefits at Google.

Responsibilities

Design, implement, and analyze computer systems and their interactions with the kernel and hardware.
Collaborate with partner teams and users across Google, e.g., the Borg team, ML teams, HW platform teams, SRE teams, and Google's internal and Cloud users.
Solve ambiguous and high impact problems. Develop junior engineers on the team.
Drive strategic planning and tactical execution in complex projects. Coordinate with partner teams in Warsaw.

Minimum qualifications:

Bachelor's degree or equivalent practical experience.
8 years of experience programming in C++.
5 years of experience testing and launching software products.
5 years of experience building and developing large-scale infrastructure, distributed systems or networks, or experience with compute technologies, storage, or hardware architecture.
3 years of experience with software design and architecture.

Preferred qualifications:

Master’s degree or PhD in Engineering, Computer Science, or a related technical field.
3 years of experience in a technical leadership role leading project teams and setting technical direction.
3 years of experience working in a complex, matrixed organization involving cross-functional or cross-business projects.
Experience with Linux Internals, Cluster Management, System Architecture, Virtualization, and Security.