LLM/AI Data Engineer
Lifescale Analytics
Full Description
Lifescale Analytics helps organizations unlock the power of data through advanced analytics, AI, and modern digital solutions. We partner with forward-thinking clients to design and implement scalable, high-impact technologies that drive measurable business outcomes.
We are currently seeking an LLM/AI Data Engineer to support a client engagement. In this role, you will work at the intersection of data engineering and AI, designing and validating high-quality, production-grade data pipelines with integrated LLM capabilities. This opportunity is remote, but candidates must live in the United States.
Applicants for this position must be U.S. citizens, may be subject to a government security investigation, and must currently be eligible to view classified government information. Candidates must have lived in the United States for the past 5 years.
The Employer will not sponsor applicants for any employment visas, at hiring or in the future, including but not limited to H-1B visas. Corp-to-Corp or subcontract personnel will not be considered for this position.
What You’ll Do
* Design, build, and operate LLM-assisted analytics pipelines in structured data environments
* Implement retrieval-augmented generation (RAG) and structured data grounding patterns
* Validate and improve LLM output quality, consistency, and traceability
* Develop and maintain production-grade ETL/ELT pipelines
* Review and test pipelines to identify logic errors, data gaps, and performance issues
* Define and track pipeline SLAs (latency, throughput, data freshness)
* Build and enforce data quality frameworks and validation processes
* Document engineering processes including QC logs, test cases, and schema documentation
* Collaborate with cross-functional teams to ensure scalable and auditable data systems
* Perform all other duties as assigned
Required Skills & Experience:
LLM-Integrated Data Engineering
* Experience designing, building, or operating LLM-assisted analytics pipelines
* Experience validating and improving LLM output quality and reliability
Strong understanding of:
* Prompt engineering for structured outputs
* Retrieval-Augmented Generation (RAG) patterns
* Structured-data grounding & hallucination mitigation
Production Data Engineering
Minimum of 4 years of experience in:
* Data engineering
* ETL/ELT pipeline development
* Data quality assurance in production environments
* Proven experience working with high-volume structured data systems
Technical Stack Proficiency
* Advanced proficiency in SQL and Python
* Experience with tools such as dbt, Spark, or similar frameworks
Hands-on experience with Snowflake, including:
* Snowpark or equivalent transformation frameworks
* Data modeling and performance optimization
* Snowflake Cortex
Pipeline Validation & Data Quality
* Ability to design and implement data quality frameworks
Experience reviewing and validating production pipelines:
* Logic validation and transformation accuracy
* Data completeness and integrity checks
* Identification of edge cases and failure modes
Benchmarking & Performance Engineering
* Ability to benchmark and optimize pipelines against performance targets
Experience defining and measuring:
* Pipeline latency
* Throughput
* Data freshness SLAs
Auditability & Documentation
* Experience supporting auditable and explainable data systems
Strong documentation practices, including:
* QC logs and validation reports
* Test case design and execution records
* Schema and lineage documentation
* Issue tracking and remediation workflows
Preferred Qualifications (Nice-to-Have)
Experience supporting U.S. Department of Defense (DoD) environments:
* Air Force Life Cycle Management Center (LCMC)
* Army Materiel Command (AMC)
Familiarity with Palantir Foundry:
* Ontology modeling concepts
* Data product consumption patterns
Experience with defense datasets:
* Government-Industry Data Exchange Program (GIDEP)
* Federal Logistics Information System (FED-LOG)
Exposure to:
* Entity resolution and part matching
* ERP data integration into analytics platforms
* Data normalization across fragmented systems
Education
* Bachelor’s degree in Computer Science, Data Engineering, or related field (or equivalent experience)
Who we are:
Lifescale Analytics is a small business that provides specialized expertise in data and analytics. Formed in 2012, the Lifescale Analytics team has years of experience providing a spectrum of customized data management services and solutions, including Data Management/Analytics, Big Data Solutions, Cloud Services, Business Intelligence, and Data Science, that focus on building strong portfolios and programs. Through experience and innovation, we enable businesses, pharmaceutical companies, financial institutions, and government agencies to manage their biggest asset, their data, and to make proactive decisions based on it. Our specialists are skilled at managing, refining, analyzing, and visualizing information for the specific purpose of increasing the value of IT and helping clients benefit from data science. This job will be remote until the client decides to have employees report to the site.
For more information, please visit our website at www.lifescaleanalytics.com