Cayuse Holdings

Staff Software Engineer, Applied AI

Location: US-CA-Burlingame
ID: 2026-3598
Category: Information Technology
Position Type: Full-Time Hourly Non Exempt
Remote: No
Clearance Required: None

Overview

Role: Staff Software Engineer, Applied AI

Salary: $200,000.00 - $275,000.00 per year

Location: Burlingame, CA (Hybrid)

We are seeking a Staff Software Engineer, Applied AI to build and scale the backend systems that power LLM applications in healthcare. This role is ideal for an engineer who thrives at the intersection of backend architecture and applied AI, designing APIs, pipelines, and infrastructure that make LLMs reliable, secure, and cost-efficient in production. If you want to push LLMs beyond demos into mission-critical healthcare workflows, we’d love to hear from you.

Responsibilities

  • Backend for LLMs – Architect and implement scalable, low-latency APIs and services that wrap, orchestrate, and optimize LLMs for healthcare use cases.

  • Data & Retrieval Pipelines – Build ingestion, preprocessing, and retrieval-augmented generation (RAG) pipelines to ground LLMs in clinical and revenue-cycle data.

  • LLMOps & Observability – Design systems for model monitoring, evaluation, cost tracking, and guardrails, ensuring reliability and responsible use.

  • Performance & Optimization – Engineer solutions for caching, batching, load balancing, and scaling LLM workloads across cloud and containerized environments.

  • Security & Compliance – Implement HIPAA-ready infrastructure, data governance, and auditability for LLM-powered applications.

  • Cross-Functional Collaboration – Partner with product, ML engineers, and healthcare experts to translate business workflows into robust backend systems.

  • Technical Leadership – Drive end-to-end delivery of LLM backend projects, establish engineering best practices, and mentor peers in LLM system design.

Qualifications

Required Qualifications:

  • 5+ years of backend or full-stack software engineering experience, with 3+ years working on ML/LLM-enabled applications.

  • Strong coding skills in Python (and ideally one statically typed language such as Go, Java, or TypeScript).

  • Experience with LLM integration frameworks (Hugging Face, LangChain, LlamaIndex, OpenAI APIs, Anthropic, etc.).

  • Deep knowledge of distributed systems, service-oriented architecture, and building APIs at scale.

  • Cloud-native expertise: AWS/GCP/Azure, Kubernetes, Docker, Terraform, etc.

  • Familiarity with MLOps/LLMOps practices: CI/CD for models, evaluation harnesses, monitoring, and reproducibility.

  • Excellent system design skills and the ability to align technical architecture with product goals.

Preferred Qualifications:

  • Experience applying LLMs in healthcare or other regulated industries (FHIR, HL7, HIPAA).

  • Hands-on experience with RAG pipelines, vector databases, and structured-output orchestration.

  • Background in enterprise SaaS or mission-critical platforms where uptime, latency, and scale matter.

  • Knowledge of responsible AI, safety, and privacy-preserving ML techniques.

Pay Range

USD $200,000.00 - USD $275,000.00 per year
