AI Field Engineer - Microsoft Foundry
Fireworks AI • 2d ago
United States • Onsite • Full-time • Mid Level • 3+ yrs exp
About Us
At Fireworks, we’re building the future of generative AI infrastructure. Our platform delivers the highest-quality models with the fastest and most scalable inference in the industry. We’ve been independently benchmarked as the leader in LLM inference speed and are driving cutting-edge innovation through projects like our own function calling and multimodal models. Fireworks is a Series C company valued at $4 billion and backed by top investors including Benchmark, Sequoia, Lightspeed, Index, and Evantic. We’re an ambitious, collaborative team of builders, founded by veterans of Meta PyTorch and Google Vertex AI.
The Role
As an AI Field Engineer for Microsoft Foundry, you will be one of the technical owners of Fireworks' most strategic partnership. You’ll work closely with Microsoft's field teams, Azure-aligned ISVs, and the SIs that run enterprise AI transformation programs to make Fireworks the default inference and fine-tuning layer in every Azure AI architecture your partners touch. The role sits at the intersection of engineering, partner development, and customer delivery. You build reference architectures, run benchmarks, debug production integrations, and co-develop POCs, all while holding your own in executive-level conversations about strategy and roadmap.
You spend most of your time building and enabling. You ship code, run joint POCs with Microsoft field teams, and architect deployments that span Azure Foundry and Fireworks. But you also lead discovery conversations, align partner stakeholders, and translate field signals into product improvements that compress the feedback loop from partner to roadmap.
The Segment
As a Field Engineer aligned with our Partnerships team, you own the technical relationship between Fireworks and the Microsoft ecosystem: Azure field teams, ISVs building on Azure Foundry, and the SIs that deliver AI transformation programs on Azure. The Microsoft partnership is a core go-to-market bet: customers like UiPath, StackBlitz, and Motif run via Fireworks on Foundry. Your job is to scale that pattern across the partner ecosystem. These engagements involve large, multi-stakeholder organizations, so you will need to navigate both the enterprise buyer (IT, security, compliance) and the builder (ML engineers, platform teams, app developers), while building the trusted-advisor relationships inside Microsoft's field that multiply your reach.
What You'll Work On
Technical Delivery and Deployment
- Be the technical lead on co-sell motions with Microsoft — joint reference architectures, Azure Foundry integration patterns, and shared POCs for strategic accounts.
- Build end-to-end POCs and MVPs alongside partner engineering teams, working inside their codebases, infrastructure, and constraints.
- Run load tests and establish latency, throughput, and cost baselines against realistic customer traffic profiles, and tune deployments to hit those targets.
- Deploy and validate new model families on inference frameworks (vLLM, SGLang), determining optimal shapes, quantization configs, and serving patterns across workloads.
Model Strategy and Fine-Tuning
- Guide Microsoft’s customers on model selection, fine-tuning strategy (SFT, DPO, RFT), and evaluation methodology.
- Build and run fine-tuning pipelines directly with customers, navigating trade-offs between model families, compute cost, and quality targets.
- Design and implement evaluation frameworks that measure production-quality metrics, not just benchmark scores.
Product Feedback and Platform Improvement
- Own the feedback loop — surface partner-driven product gaps to Fireworks engineering, and translate the roadmap back into partner messaging.
- Ship external technical content: reference architectures, integration guides, and benchmark posts that make it easy for partners to win deals with us.
- Track pipeline health; flag risks and opportunities to Field leadership weekly.
What We're Looking For
Minimum Qualifications
- 3+ years in a pre-sales, partner engineering, forward-deployed, or technical consulting role.
- Demonstrated ability to build production software with customers, not just advise on it. You have shipped code running in someone else's production environment.
- Strong Python skills. Comfortable reading, writing, and debugging production code. Familiarity with Kubernetes and infrastructure engineering.
- Hands-on fluency with LLM inference: latency/throughput tradeoffs, batching strategies, quantization, structured outputs, function calling. You can explain why 50ms p99 matters to an enterprise CTO.
- Real experience with fine-tuning — LoRA at minimum, RFT a strong plus. You understand when SFT is enough and when it isn't.
- Deep familiarity with the Azure AI stack: Azure Foundry, Azure OpenAI Service, Azure ML, AKS, Entra/RBAC for AI workloads. You know where Fireworks fits and where it doesn't.
- Exceptional communication: able to run a sharp discovery call, present to a VP, and debug a latency issue with an ML engineer in the same afternoon.
Preferred Qualifications
- 5+ years in technical field or engineering roles where you've owned a technical relationship with a hyperscaler or major SI, not just supported one.
- Experience with inference serving frameworks (vLLM, SGLang, TensorRT-LLM) and tuning deployments for real workloads.
- Prior role at a hyperscaler, AI-native cloud, or inference provider.
- Experience with agentic frameworks (LangChain, LlamaIndex, or custom tool-use pipelines); you understand how inference latency and reliability shape agent behavior at scale.
- Background in model evaluation — you understand why benchmark gaming is rampant and what rigorous evals actually look like.
- You've written a technical blog post or reference architecture that people actually read.
- Track record taking GenAI POCs from prototype to production-scale deployments.
On-Target Earnings (Plus Equity)
$280,000 - $320,000 USD
Total compensation also includes meaningful equity in a fast-growing startup, along with a competitive salary and comprehensive benefits package. Base salary is determined by a range of factors including individual qualifications, experience, skills, interview performance, market data, and work location.
Fireworks AI is an equal-opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all innovators.
The listed salary range is intended as a guideline and may be adjusted.
Why Fireworks AI?
- Solve Hard Problems: Tackle challenges at the forefront of AI infrastructure, from low-latency inference to scalable model serving.
- Build What’s Next: Work with bleeding-edge technology that impacts how businesses and developers harness AI globally.
- Ownership & Impact: Join a fast-growing, passionate team where your work directly shapes the future of AI—no bureaucracy, just results.
- Learn from the Best: Collaborate with world-class engineers and AI researchers who thrive on curiosity and innovation.
Required skills
Python, Kubernetes, LLM inference, Azure Foundry, Azure OpenAI Service, Azure ML, AKS, Entra/RBAC