Principal Software Engineer – Large-Scale LLM Memory and Storage Systems

Full-stack Engineer | Software Engineer | Full Time | Remote | Team 10,001+ | Since 1993 | H1B Sponsor

Location

California, Massachusetts, Washington

Posted

81 days ago

Salary

$272K - $425.5K / year

Postgraduate Degree | 15 yrs exp | English | Cloud | Distributed Systems | Open Source | Python

Job Description

  • Design and evolve a unified memory layer that spans GPU memory, pinned host memory, RDMA-accessible memory, SSD tiers, and remote file/object/cloud storage to support large-scale LLM inference
  • Architect and implement deep integrations with leading LLM serving engines (such as vLLM, SGLang, TensorRT-LLM), with a focus on KV-cache offload, reuse, and remote sharing across heterogeneous and disaggregated clusters
  • Co-design interfaces and protocols that enable disaggregated prefill, peer-to-peer KV-cache sharing, and multi-tier KV-cache storage (GPU, CPU, local disk, and remote memory) for high-throughput, low-latency inference
  • Partner closely with GPU architecture, networking, and platform teams to exploit GPUDirect, RDMA, NVLink, and similar technologies for low-latency KV-cache access and sharing across heterogeneous accelerators and memory pools
  • Mentor senior and junior engineers, set technical direction for memory and storage subsystems, and represent the team in internal reviews and external forums (open source, conferences, and customer-facing technical deep dives)

Job Requirements

  • Master's or PhD, or equivalent experience
  • 15+ years of experience building large-scale distributed systems, high-performance storage, or ML systems infrastructure in C/C++ and Python, with a track record of delivering production services
  • Deep understanding of memory hierarchies (GPU HBM, host DRAM, SSD, and remote/object storage) and experience designing systems that span multiple tiers for performance and cost efficiency
  • Experience with distributed caching or key-value systems, especially designs optimized for low latency and high concurrency
  • Hands-on experience with networked I/O and RDMA/NVMe-oF/NVLink-style technologies, and familiarity with concepts like disaggregated and aggregated deployments for AI clusters
  • Strong skills in profiling and optimizing systems across CPU, GPU, memory, and network, using metrics to drive architectural decisions and validate improvements in TTFT and throughput
  • Excellent communication skills and prior experience leading cross-functional efforts with research, product, and customer teams

Benefits

  • Equity
  • Benefits
