Data Science Engineer
Predictive Sales AI, a Spectrum Communications & Consulting LLC Brand
Job Title: Data Science Engineer
Location: Chicago, IL (Remote/Hybrid)
About Us
At Predictive Sales AI (PSAI), we’re redefining how technology and intelligence transform digital marketing. Our AI-powered software enables home services businesses to make smarter, faster decisions—fueling growth through automation, prediction, and precision.
We are seeking a Data Science Engineer with strong data engineering and MLOps expertise to build scalable, production-grade ML and data platforms that directly impact customer growth and retention.
Job Overview
As a Data Science Engineer, you will design and operate the data + machine learning foundations behind PSAI’s predictive products. You will build scalable pipelines and robust warehouse/lakehouse models across CRM, marketing, product events, and external datasets — ensuring reliability, accuracy, and business continuity at scale.
This Role Requires
* 4+ years in data-centric engineering
* Proven experience deploying ML models via pipelines
* Deep expertise in Python, SQL, and Azure infrastructure
* Architectural ownership through data contracts and resilient modeling
Key Responsibilities
* Build scalable batch and near-real-time ingestion pipelines using Azure Data Factory, APIs, event streams, and external connectors.
* Develop ML-ready datasets across CRM, marketing automation platforms, product telemetry, and geospatial data sources.
* Design performant, well-modeled warehouse/lakehouse systems in Azure Synapse or Databricks.
* Train and deploy predictive models (lead scoring, churn prediction, forecasting) through reproducible pipelines.
* Build time-aware, leakage-resistant feature pipelines for production ML use cases.
* Support full MLOps lifecycle using Azure Machine Learning, including experiment tracking, model registry, and deployment.
* Implement automated validation, anomaly detection, reconciliation, and monitoring for pipelines and warehouse models.
* Design and enforce data contracts to prevent upstream schema changes from breaking downstream ML workflows.
* Own pipeline SLAs, alerting, incident response, and durable improvements through postmortems.
* Optimize processing for very large datasets (>100GB) through partitioning, incremental loads, distributed compute, and query tuning.
* Improve cost efficiency across compute/storage in Azure environments.
* Maintain clean, testable, production-ready Python codebases using:
  * Object-oriented patterns
  * Type hinting
  * CI/CD workflows via Azure DevOps
* Package models and pipelines using Docker for consistent deployment across dev/staging/prod.
* Communicate architectural trade-offs and technical debt in business terms to Product, RevOps, and leadership.
* Partner with Engineering on instrumentation and scalable data integration.
* Mentor junior engineers through pairing, code reviews, and documentation best practices.
Desired Traits
We are looking for someone organized, proactive, and detail-oriented who will work closely with teams across the company. Here's what we value:
* Ownership mindset with a reliability-first approach
* Strong SQL/Python skills and high attention to data quality
* Scales systems thoughtfully (performance- and cost-aware, maintainable designs)
* Collaborative communicator across engineering, RevOps, and analytics
* Documents well and supports others through reviews/mentorship
Required Skills and Experience
* Master’s degree in Data Science, Computer Science, Statistics, Engineering, or a closely related quantitative field (preferred).
* 4+ years in data engineering, ML engineering, or data platform development.
* Minimum 2 years deploying ML models into production workflows.
* Experience building pipelines and warehouse systems at scale (>100GB datasets).
* Demonstrated adaptability in fast-changing technical and business environments.
* Python (Expert): pandas, polars, scikit-learn; PyTorch, transformers; production engineering (OOP, testing, typing)
* SQL (Expert): advanced analytics, recursive CTEs, query tuning, Azure Synapse optimization
* Azure Data & ML Stack: Data Factory (ETL/ELT), Azure ML (MLOps), Key Vault, Databricks/Spark, Docker deployment
* Distributed & Large-Scale Compute: Spark, Ray, Dask; GPU acceleration with RAPIDS (plus)
* Geospatial & Specialized Data: GeoPandas, Shapely, rasterio
* AI Automation & LLMs: LangChain/Semantic Kernel, agentic workflows
* DevOps & CI/CD: Azure DevOps pipelines, Gitflow, rebasing, clean version control
Why Join Us?
* Innovative Environment: Be part of a forward-thinking company that values creativity and encourages the exploration of new ideas.
* Professional Growth: Access opportunities for continuous learning and career advancement within a supportive and dynamic team.
* Comprehensive Benefits: Enjoy a competitive salary, performance-based bonuses, flexible work arrangements, and a robust benefits package.
* Collaborative Culture: Work in a team-oriented environment where collaboration and mutual respect drive our success.
If you're ready to be part of an innovative, growth-oriented team, apply today!