AI Inference Engineer Job at Signify Technology, Sonoma, CA

  • Signify Technology
  • Sonoma, CA

Job Description

AI Inference Engineer – Stealth Startup | San Francisco Onsite

Compensation: $200K–$300K + equity

Join a stealth-stage team backed by prominent academic research and successful technical founders, working at the bleeding edge of AI infrastructure. As generative AI continues to scale rapidly, the bottleneck is no longer training—it’s inference. This team is rebuilding the core systems that power inference, from kernel-level GPU optimizations to full-stack distributed deployment.

This role is ideal for engineers who want to go deep: working on quantization, KV caching, and efficient attention implementations such as FlashAttention, and designing new strategies for parallelism across heterogeneous compute. You'll contribute to an integrated software-hardware stack that enables large-scale model deployment with dramatically improved performance, efficiency, and quality at production scale.

What You’ll Be Doing:

  • Research and implement state-of-the-art techniques to improve AI model inference speed and quality
  • Architect and optimize distributed AI infrastructure across both GPU kernel and software layers
  • Profile, benchmark, and debug system performance across varied hardware environments
  • Drive improvements in model execution through compiler-level tuning, caching, and runtime strategies

What They’re Looking For:

  • Bachelor's degree in Computer Science, Engineering, Applied Math, or a related field
  • Strong experience with performance optimization and systems-level thinking
  • Proficiency in Python, C++, and CUDA
  • Familiarity with AI frameworks like PyTorch, TensorFlow, ONNX, or vLLM

Nice to Have:

  • Graduate degree in a technical field
  • Experience with MLIR or other compiler frameworks
  • Hands-on work with large-scale GPU infrastructure or custom kernels

This is a hands-on, foundational role in a fast-moving environment, offering the chance to shape the backbone of the next generation of AI systems.
