Data Engineer II, QuBIT

Amazon.com Services LLC • 4h ago

United States · Onsite · Full-time · Mid Level · 3+ yrs exp

H-1B verified · 2310 LCAs

Top focus

Data Engineer · VP Data · Data Warehouse Engineer

What if AI could query your data warehouse and actually understand what the numbers mean, not just return rows? That's the infrastructure we're building, and we need a data engineer to help us scale it. We've built a semantic layer that sits between raw operational data and AI agents, encoding metric definitions, business logic, entity relationships, data lineage, and query routing into structured knowledge that large language models can consume and reason about. The foundation is in place. Now we need someone to deepen the data models, expand entity coverage, enrich the ontology with causal relationships, and build the pipeline infrastructure that keeps it all fresh and accurate at scale.

In this role, you'll design and maintain the data infrastructure that powers AI-driven analytics for workforce Learning across Amazon's fulfillment network. That means building SQL pipelines in Redshift that process millions of daily records from nine upstream platforms; defining entity schemas with join keys, primary keys, and PII classifications; writing metric definitions with traceable formulas grounded in actual ETL logic; and modeling granularity levels that tell AI agents whether to query at the associate, site, or network level. You'll own the full stack from raw ingestion through transformation to semantic enrichment.

You'll also work directly with business stakeholders to translate their domain expertise into structured metadata. When a Regional Learning Manager explains that "training compliance resets weekly on Sunday" or "this site type structurally can't meet that threshold," you'll encode that context into the semantic layer so AI agents handle it correctly without human intervention. Over time, you'll push this toward a world model: not just what metrics exist, but how they relate causally, what drives them, and what happens when they change.

We're looking for someone who thinks about data infrastructure as more than pipelines and tables. You'll work with knowledge graphs, entity relationship modeling, YAML-based ontologies, vector embeddings for retrieval, and the prompt engineering that ties it all together. If you want to build the data systems that make AI genuinely useful for business decision-making, at Amazon scale, this is the role.

Key job responsibilities

Design and maintain semantic layer infrastructure including entity schemas, metric definitions, data lineage, and query routing logic that enables AI agents to accurately query and interpret warehouse data
Build and optimize SQL pipelines in Redshift processing millions of daily records from multiple upstream platforms, ensuring freshness, accuracy, and traceability from source through transformation to consumption
Partner with business stakeholders to translate domain expertise and institutional knowledge into structured, machine-readable metadata that AI systems can reason about without human intervention
Expand data ontologies with causal relationships, temporal logic, and policy constraints that improve AI accuracy and enable increasingly autonomous data investigation
Interface with upstream data teams to extract, transform, and load data from diverse sources using SQL, Python, and AWS technologies, unifying disparate learning platforms into a coherent analytical layer
Maintain pipeline infrastructure that keeps semantic layer content synchronized with evolving ETL logic, detecting drift between metric definitions and underlying data structures
Continuously reduce manual analysis by building toward natural language interfaces where stakeholders get answers directly from AI
Explore emerging techniques in knowledge representation, retrieval-augmented generation, and semantic data modeling to deepen AI-powered analytics capabilities

3+ years of data engineering experience
Bachelor's degree or above in Computer Science, Computer Engineering, Data Science, Electrical Engineering, or majors relating to these fields
3+ years of professional software development experience
Experience with one or more object-oriented programming languages (e.g., Java, C/C++, Python)
Experience in data warehouse technical architectures, data modeling, infrastructure components, ETL/ELT and reporting/analytic tools and environments, data structures, and hands-on SQL coding
Experience with Redshift, Oracle, NoSQL, etc.
Experience with AWS technologies like Redshift, S3, AWS Glue, EMR, Kinesis, Firehose, Lambda, and IAM roles and permissions
Experience with non-relational databases / data stores (object storage, document or key-value stores, graph databases, column-family databases)
Knowledge of software engineering best practices across the development life cycle, including agile methodologies, coding standards, code reviews, source management, build processes, testing, and operations
Experience building/operating highly available, distributed systems of data extraction, ingestion, and processing of large data sets
1+ years of programming experience with at least one software programming language

Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.

Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.

The base salary range for this position is listed below. Your Amazon package will include sign-on payments and restricted stock units (RSUs). Final compensation will be determined based on factors including experience, qualifications, and location. Amazon also offers comprehensive benefits including health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance and option for Supplemental life plans, EAP, Mental Health Support, Medical Advice Line, Flexible Spending Accounts, Adoption and Surrogacy Reimbursement coverage), 401(k) matching, paid time off, and parental leave. Learn more about our benefits at https://amazon.jobs/en/benefits.

USA, WA, Bellevue - 132,100.00 - 178,800.00 USD annually

Required skills

SQL · Python · AWS · Redshift · ETL · data modeling · data warehousing · NoSQL · object-oriented programming