Staff Software Engineer

Demand.io
Los Angeles, CA
Full-time
$350,000 – $450,000 / year

The Opportunity

$22M revenue. Profitable. Bootstrapped. 16 years in market. 20 people doing what competitors need 200 for.

Every AI assistant is trained to close the deal - to be the realtor. We built the home inspector. Product.ai tells you what NOT to buy, and we're the only ones who can.

We are Product.ai - the company building the verification layer for AI commerce. You may know us as Demand.io, the founder-led team behind SimplyCodes - the leading AI-powered savings tool driving $1B+ in annual commerce. In early 2026, we complete our transformation and formally become Product.ai.

What we're building: The reasoning infrastructure that turns probabilistic AI into deterministic, auditable recommendations. We ingest product data, marketing claims, and real user feedback, then run adversarial AI pipelines that expose hidden flaws - producing the confident "No" that no other AI can deliver. The architecture for this system does not exist off the shelf. You will design the prompt pipelines, build the evaluation infrastructure, make the build/buy/fine-tune decisions on model infrastructure, and ship it all to millions of users and AI agents simultaneously.

Why we're different: Perplexity summarizes the web - including the lies. ChatGPT is trained to be agreeable - it hedges. Google ranks by ads. We pre-verify claims through adversarial pressure testing and deliver the confident "No" that no other AI can. (Read the thesis)

We are seeking a Staff Software Engineer to own the core technical systems that power verified intelligence at scale. This is not a "feature engineer" role - you will architect hybrid AI pipelines, build evaluation systems that define what "correct" means, and ship production systems that serve real users. You own the full stack from reasoning backend to React frontend.

Based in Los Angeles. Hybrid schedule with flexibility. For the right builder, we're open to remote.

What You Will Build

* The Adversarial Verification Pipeline: Own the core reasoning system that collides marketing claims against real user feedback and public consensus to produce verified, auditable recommendations. You will design the prompt architectures, evaluation harnesses, and multi-step reasoning chains that turn probabilistic LLM outputs into deterministic judgments. This is the engine that powers the confident "No."

* The Knowledge Graph Interface: Build and optimize the retrieval architecture over our 75M+ entity commerce knowledge graph. Design retrieval strategies, chunking approaches, and embedding pipelines that serve both human users and AI agents with sub-200ms latency. You will make the architectural decisions about when to use retrieval versus generation, and own the trade-offs.

* The Evaluation Infrastructure: Architect the systems that measure AI output quality at scale. LLM-as-judge pipelines, human evaluation frameworks, automated accuracy metrics - you will define what "correct" means for a verified recommendation and build the machinery to enforce it continuously. You understand that shipping AI without evaluation infrastructure is shipping hallucinations.

* AI-Native Products: Ship features to millions of users through our consumer interfaces, ChatGPT app integration, and agent API endpoints. You own the full stack - from the reasoning backend to the React frontend that renders the result. You make build/buy/fine-tune decisions on model infrastructure and own the prompt architectures that power new capabilities.

Who You Are

We do not care about your pedigree; we care about your fundamentals.

* You have shipped AI systems to production. Not demos. Not prototypes. Not tutorials. You have deployed AI systems that serve real users at scale, and you have the scars to prove it. You measure your work in deployed systems, not Jupyter notebooks. You know what breaks when a prompt that works at 100 queries per day hits 100,000.

* You have opinions about retrieval-augmented generation. You can articulate when RAG fails, when it is the right tool, and the trade-offs among chunking strategies, embedding models, and the balance between retrieval and generation. You have built these systems, not just read about them.

* You think about evaluation as much as features. You understand that shipping AI without measurement infrastructure is shipping noise. You have built systems to measure output quality and have opinions about LLM-as-judge versus human evaluation versus automated metrics. You know which approach to use when, and why.

* You own the full stack. You do not hand off a model output and wait for someone else to build the interface. You build the reasoning pipeline AND the React component that renders the result. A slow query, a confusing UI, a hallucinated output - you fix whichever one is broken because you understand the full system.

* You are a builder-strategist. You do not need a spec to move. Given a problem ("make recommendations more accurate"), you figure out the AI approach, make the build/buy/fine-tune decision, and ship. You own outcomes, not tasks. You operate under ambiguity and thrive in environments where the solution is not yet defined.

* You distrust cargo culting. You can explain why a particular prompt architecture works, not just that it works. You do not use a vector database because everyone else does - you use it because you have evaluated the alternatives and it is the right tool. You reason from first principles about system design.

* You use AI to build AI. You are not just building AI products - you are using AI tools to build them faster. Claude Code, Cursor, Copilot are part of your daily workflow. You generate boilerplate, debug pipelines, write evaluations, and prototype architectures with AI assistance. Your output per hour is measurably higher than it was before these tools existed.

Required Experience:

* 7+ years building and shipping production software

* 2+ years building production LLM/AI systems (not research, not demos - production systems serving real users)

* Strong across the full stack: at minimum, Python, TypeScript/React, and PostgreSQL

* Has built RAG systems, prompt pipelines, or evaluation infrastructure in production

* Can make architectural decisions about model selection, retrieval strategy, and system design without supervision

Strong Signals:

* Former founder or early-stage engineer who built the technical foundation

* Has built for end-users, not just enterprise B2B or internal tools

* E-commerce, fintech, or consumer product background

* Experience with knowledge graphs, structured data, or information retrieval at scale

* Has failed at something hard and learned from it

Compensation & Ownership

We operate as a high-performance studio, not a typical corporation. Because we are profitable with no outside investors, we share the surplus directly with our builders via direct ownership.

* Elite Base Salary: $350,000 - $450,000. We target the 99th percentile of market base pay to ensure you are focused on the mission, not your mortgage. Your base covers your life; your equity builds your freedom.

* Profits Interest Units (PIUs): You will receive a formal equity stake via Class B Membership Interests. Unlike stock options, PIUs have a $0 strike price, participate in the upside from day one, and qualify for capital gains tax treatment, maximizing your long-term wealth.

* Ongoing Profit Distribution: As an equity holder, you participate in our annual success. You are entitled to a pro-rata share of our Free Cash Flow (FCF) - receiving your portion of the company's profit as an annual dividend.

* Annual Liquidity: We do not believe in "Paper Wealth." We operate an Annual Tender Offer where the company buys back vested interests. You have the option to turn your ownership into cash every year - you don't have to wait for an exit.

* Benefits: 100% premium coverage for you and your family, daily catered lunches, and unlimited PTO that we actually expect you to use to recharge.

How to Apply

We skip the 60-minute recruiter screen. Instead, you'll answer 5-6 short video questions (2 minutes each) so we can get to know how you think and work. It takes about 15 minutes total - less time than a phone screen, and you can do it whenever works for you.

The Process:

1. Click Apply.

2. Record short video responses to our questions (no prep required - we want to see how you naturally think).

3. Your responses go directly to the hiring team. No ATS. No keyword parsing.

We're not looking for polished presentations. We're looking for signal on who you are.