Staff Software Engineer, Gemini App, Horizontal Quality, DeepMind
at Google
Location
Mountain View, CA, USA
Compensation
$207k–$300k USD
Type
Full-time
Posted
2 weeks ago
Job description
Our mission is to elevate the Gemini experience by perfectly aligning foundational model behaviors with high-quality data. We drive conversational excellence through thoughtful persona shaping, robust safety enforcement, and clear information architecture. By combining these efforts with rich online and offline signals, we deliver a product that is highly performant and effortlessly intuitive.
Artificial intelligence will be one of humanity’s most transformative inventions. At Google DeepMind, we are a pioneering AI lab with exceptional interdisciplinary teams focused on advancing AI development to solve complex global challenges and accelerate high-quality product innovation for billions of users. We use our technologies for widespread public benefit and scientific discovery, ensuring safety and ethics are always our highest priority.
Responsibilities
- Design, build, and maintain a highly scalable evaluation framework specifically tailored to measure the product-level quality of the Gemini App, moving beyond standard model-level benchmarks.
- Create, synthesize, and refine meaningful metrics that accurately capture the user experience.
- Develop a holistic view of quality by combining newly engineered online metrics with deep offline evaluation data.
- Build a transparent, definitive ranking system for product-level quality. Use this system to benchmark the Gemini App against industry standards and clearly identify our competitive strengths and weaknesses.
- Act as a critical partner to Product Management and leadership by translating complex evaluation data into clear, strategic signals. Offer technical leadership on high-impact projects.
Minimum qualifications:
- Bachelor’s degree or equivalent practical experience.
- 8 years of experience in software development.
- 5 years of experience leading technical strategy and architecting large-scale ML infrastructure (e.g., designing serving layers, model evaluation frameworks, or data processing pipelines).
- 5 years of experience testing and launching software products.
- 3 years of experience with Generative AI, Large Language Models (LLMs), Machine Learning, and related frameworks.
Preferred qualifications:
- Master’s degree or PhD in Engineering, Computer Science, or a related technical field.
- 3 years of experience working in a complex, matrixed organization involving cross-functional or cross-business projects.
- Experience in building and scaling evaluation pipelines (e.g., RLHF, auto-evals, or side-by-side human evaluations) to measure helpfulness and accuracy.
- Proficiency in advanced prompting techniques and understanding how model fine-tuning, RL, or RAG impacts final response quality.
- Ability to use SQL, Python, or internal data tools to analyze user behavioral data and "pain points" to identify where the model is failing.