ABOUT THE ROLE:

AI Engineers at Varick own the intelligence layer. You design, build, and optimize the agent systems that run inside enterprise operations — processing thousands of transactions, making classification decisions, routing exceptions, and learning from human feedback.

This role is for engineers who have been deep in LLMs, agent architectures, and evaluation systems. You’ve built agentic workflows that run in production, not just demos. You understand prompt engineering, retrieval, tool calling, multi-agent orchestration, and the evaluation infrastructure required to ship AI systems that enterprises trust.

WHAT YOU'LL DO:

• Design and build agent architectures for complex enterprise workflows (multi-step reasoning, tool calling, exception handling)

• Build and maintain evaluation systems for agent quality, accuracy, safety, and groundedness

• Design prompt systems, retrieval pipelines, and context engineering strategies for reliable agent behavior

• Build the feedback loops that allow agents to learn from human corrections and improve over time

• Optimize inference cost and latency for production workloads

• Define best practices for agent reliability, observability, and governance

• Stay current with the latest models, frameworks, and research — and ship what matters into production

WHAT WE'RE LOOKING FOR:

• 3+ years of software engineering, including 1–2 years focused on LLM applications or AI systems in production

• Hands-on experience building agentic workflows with tool calling, retrieval, and multi-step reasoning

• Deep understanding of prompt engineering, context engineering, and how to get reliable behavior from LLMs

• Experience building evaluation and quality systems for AI outputs

• Strong Python skills and backend engineering fundamentals

• You’ve shipped AI features to real users and dealt with the messy parts: hallucinations, edge cases, accuracy degradation, cost management

• Based in San Francisco

HELPFUL EXPERIENCE:

• Agent frameworks: LangGraph, CrewAI, Claude Code/Codex patterns, or custom orchestration

• Retrieval systems: vector databases (Qdrant, pgvector, Pinecone), reranking, hybrid search

• MCP, tool-calling protocols, and third-party API integrations

• Fine-tuning, LoRA, or other model adaptation methods

• Evaluation frameworks and continuous quality monitoring

• Experience with enterprise AI deployments (compliance, audit trails, governance)

• Prior work at AI labs, AI-native startups, or applied ML teams

WHY VARICK:

• Ship to production, not to demos. Every system you build runs inside real enterprise operations. 100% deployment rate.

• Early enough to shape everything. Your work defines the product, the platform, and the company.

• Compounding impact. Every client deployment feeds the pattern library and makes the next one faster. You’re building leverage, not doing the same thing twice.

• Work with operators, not committees. You talk directly to the people who run the business — CFOs, COOs, ops leads — not procurement layers.

COMPENSATION:

• $175K – $225K, plus equity

• 100% medical, dental, vision; MacBook Pro + peripherals

APPLY HERE: https://jobs.ashbyhq.com/Varick-Agents/30e16a2a-6374-475d-9154-2e186c481319

*We are an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or protected veteran status. Employment is subject to a standard confidentiality and non-disclosure agreement.
