What if AI could query your data warehouse and actually understand what the numbers mean, not just return rows? That's the infrastructure we're building, and we need a data engineer to help us scale it.
We've built a semantic layer that sits between raw operational data and AI agents, encoding metric definitions, business logic, entity relationships, data lineage, and query routing into structured knowledge that large language models can consume and reason about. The foundation is in place. Now we need someone to deepen the data models, expand entity coverage, enrich the ontology with causal relationships, and build the pipeline infrastructure that keeps it all fresh and accurate at scale.
In this role, you'll design and maintain the data infrastructure that powers AI-driven analytics for workforce Learning across Amazon's fulfillment network. That means building SQL pipelines in Redshift that process millions of daily records from nine upstream platforms, defining entity schemas with join keys, primary keys, and PII classifications, writing metric definitions with traceable formulas grounded in actual ETL logic, and modeling granularity levels that tell AI agents whether to query at the associate, site, or network level. You'll own the full stack from raw ingestion through transformation to semantic enrichment.
You'll also work directly with business stakeholders to translate their domain expertise into structured metadata. When a Regional Learning Manager explains that "training compliance resets weekly on Sunday" or "this site type structurally can't meet that threshold," you'll encode that context into the semantic layer so AI agents handle it correctly without human intervention. Over time, you'll push this toward a world model: not just what metrics exist, but how they relate causally, what drives them, and what happens when they change.
We're looking for someone who thinks about data infrastructure as more than pipelines and tables. You'll work with knowledge graphs, entity relationship modeling, YAML-based ontologies, vector embeddings for retrieval, and the prompt engineering that ties it all together. If you want to build the data systems that make AI genuinely useful for business decision-making, at Amazon scale, this is the role.
Key job responsibilities
• Design and maintain semantic layer infrastructure including entity schemas, metric definitions, data lineage, and query routing logic that enables AI agents to accurately query and interpret warehouse data
• Build and optimize SQL pipelines in Redshift processing millions of daily records from multiple upstream platforms, ensuring freshness, accuracy, and traceability from source through transformation to consumption
• Partner with business stakeholders to translate domain expertise and institutional knowledge into structured, machine-readable metadata that AI systems can reason about without human intervention
• Expand data ontologies with causal relationships, temporal logic, and policy constraints that improve AI accuracy and enable increasingly autonomous data investigation
• Interface with upstream data teams to extract, transform, and load data from diverse sources using SQL, Python, and AWS technologies, unifying disparate learning platforms into a coherent analytical layer
• Maintain pipeline infrastructure that keeps semantic layer content synchronized with evolving ETL logic, detecting drift between metric definitions and underlying data structures
• Continuously reduce manual analysis by building toward natural language interfaces where stakeholders get answers directly from AI
• Explore emerging techniques in knowledge representation, retrieval-augmented generation, and semantic data modeling to deepen AI-powered analytics capabilities
- 3+ years of data engineering experience
- Bachelor's degree or above in Computer Science, Computer Engineering, Data Science, Electrical Engineering, or majors relating to these fields, or 3+ years of professional software development experience
- Experience with one or more object-oriented programming languages (e.g., Java, C/C++, Python)
- Experience in data warehouse technical architectures, data modeling, infrastructure components, ETL/ELT and reporting/analytic tools and environments, data structures, and hands-on SQL coding
- Experience with Redshift, Oracle, NoSQL, etc.
- Experience with AWS technologies like Redshift, S3, AWS Glue, EMR, Kinesis, Firehose, Lambda, and IAM roles and permissions
- Experience with non-relational databases / data stores (object storage, document or key-value stores, graph databases, column-family databases)
- Knowledge of software engineering best practices across the development life cycle, including agile methodologies, coding standards, code reviews, source management, build processes, testing, and operations
- Experience building/operating highly available, distributed systems of data extraction, ingestion, and processing of large data sets
- 1+ years of programming experience with at least one programming language
Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.
Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.
The base salary range for this position is listed below. Your Amazon package will include sign-on payments and restricted stock units (RSUs). Final compensation will be determined based on factors including experience, qualifications, and location. Amazon also offers comprehensive benefits including health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance and option for Supplemental life plans, EAP, Mental Health Support, Medical Advice Line, Flexible Spending Accounts, Adoption and Surrogacy Reimbursement coverage), 401(k) matching, paid time off, and parental leave. Learn more about our benefits at https://amazon.jobs/en/benefits.
USA, WA, Bellevue - 132,100.00 - 178,800.00 USD annually