Principal Solutions Architect, Foundation Model Providers
Amazon Web Services, Inc. · 2h ago
United States · Onsite · Full-time · Principal Level · 10+ yrs exp
H-1B verified · 2310 LCAs
Top focus
Solutions Architect
As a Principal Solutions Architect supporting Foundation Model Providers (FMP) on AWS, you will tackle some of the most challenging and exciting problems in cloud computing today. Model providers’ compute footprint represents one of the most demanding workloads on the AWS platform, pushing the boundaries of what's possible with networking, GPU infrastructure, storage, container orchestration, and distributed computing at extraordinary scale. In this role, you'll design cloud architectures that enable model providers to train, fine-tune, and serve state-of-the-art generative AI models. You'll help solve technical challenges that few organizations in the world face: exabyte-scale data, millions of interconnected GPUs, complex networking topologies, and custom hardware acceleration requirements that define the leading edge of AI infrastructure. You will help them solve business challenges such as rapidly releasing products and services to the market and building elastic, scalable, cost-optimized applications. You will engage with product owners and service teams to set the strategy for AWS services.
For this role, we are looking for folks who have technical breadth complemented by technical depth in one or two areas, business aptitude, and the ability to lead in-depth technology discussions, articulating the business value of the AWS platform and services.
Key job responsibilities
- Maintain and foster relationships with model providers, becoming their trusted technical advisor and strategic partner
- Develop deep knowledge of core foundational services (Compute, Network, Storage) along with ML expertise to build long-term relationships with customer engineering teams
- Dive deep to understand the details of each model provider’s environment, business goals, and technical requirements for building and deploying foundation models
- Design and implement advanced cloud architectures that enable model providers to scale their AI research and production workloads efficiently
- Partner closely with AWS service teams (EC2, Global Networking, EKS, Bedrock, S3) to influence roadmaps and develop custom solutions that meet model providers’ unique requirements
- Identify patterns and technical solutions that can be broadly applied across the FMP segment to accelerate innovation
- Lead technical discussions that articulate the business value of the AWS platform and services to both technical architects and executive stakeholders
- Drive technical and architectural best practices for GPU optimization, network throughput, distributed training, and cost efficiency at massive scale
About the team
Diverse Experiences
AWS values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.
Why AWS?
Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating. That’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses.
Inclusive Team Culture
Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empowers us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (diversity) conferences, inspire us to never stop embracing our uniqueness.
Mentorship & Career Growth
We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional.
Work/Life Balance
We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud.
- 10+ years of experience in specific technology domain areas (e.g. software development, cloud computing, systems engineering, infrastructure, security, networking, data & analytics)
- Bachelor's degree in computer science, engineering, mathematics or equivalent
- Experience developing technology solutions and evangelising end-to-end technology roadmaps that guide IT transformations toward cloud computing
- Experience communicating across technical and non-technical audiences and at C-level, including training, workshops, publications
- Knowledge of large scale automation and workflow management or equivalent
- Knowledge of presentations and whiteboarding skills with a high degree of comfort speaking with internal and external executives, IT management, and developers
- Experience with training and deploying machine learning systems to solve large-scale optimizations
- Experience operating highly available, distributed systems of data extraction, ingestion, and processing of large data sets
- Experience with CUDA kernels or ML/low-level kernels
- Experience with Machine Learning and Large Language Model fundamentals, including architecture, training/inference lifecycles, and optimization of model execution
- Experience in Kubernetes, Docker or containers ecosystem
- Knowledge of foundation model architectures, training approaches, and serving infrastructure
Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.
Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.
Required skills
cloud computing, software development, systems engineering, infrastructure, security, networking, data & analytics, machine learning, Kubernetes, Docker