HPC Operations Lead

LTM
Pyrénées-Atlantiques, Nouvelle-Aquitaine, France
Contract
AI tools:
Azure HPC
Applications go directly to the hiring team

Full Description

HPC Operations Lead / HPC Service Manager (Contract)

Pyrénées-Atlantiques, France | 3 months (renewable) | Start: ASAP

About Us

LTM is a global technology consulting and digital solutions company, partner to more than 700 clients and powered by nearly 90,000 talented professionals across more than 30 countries. Its technology expertise spans ERP, analytics, AI, cloud computing, and cybersecurity. For more information, please visit www.ltm.com.

Job Description

We are looking for an experienced HPC Operations Lead / Service Manager to steer and coordinate the end‑to‑end operations of large‑scale High Performance Computing (HPC) platforms, both on‑premises and in the cloud (Azure HPC).

This is a governance and orchestration role, not hands‑on engineering. You will act as the central point of coordination between HPC users, operations teams (L1/L2), expert providers (L3), and hardware vendors, ensuring reliable, secure, and high‑performance HPC service delivery.

Location

* Onsite - Pyrénées-Atlantiques, France

Client Expectations

* Senior HPC operations background

* Clear leadership and pilotage capability

* Immediate availability

* Comfortable working in a bilingual (FR/EN) environment

Key Responsibilities

HPC Operations & Service Pilotage

* Own daily production and service continuity of HPC platforms (on‑prem & cloud)

* Ensure availability, performance, and operational continuity (MCO, maintien en condition opérationnelle)

* Coordinate security compliance (MCS, maintien en condition de sécurité) with cybersecurity teams

* Lead incident, problem, and change management across stakeholders

Vendor & Stakeholder Coordination

* Act as the primary interface between users, operations teams, vendors, and experts

* Lead and facilitate Steering, Technical, and Operational Committees

* Ensure SLA adherence and delivery commitments

User & Business Engagement

* Maintain strong relationships with scientific, R&D, and AI users

* Align HPC operations with business workloads and expectations

* Provide clear service reporting and operational visibility

Governance & Continuous Improvement

* Monitor capacity, usage, and performance trends

* Anticipate risks (capacity, obsolescence, bottlenecks)

* Coordinate platform transitions, upgrades, and next‑generation systems

* Ensure production readiness for new infrastructure and releases

Technical Environment (Operational Understanding Required)

* Large‑scale HPC clusters (CPU & GPU), on‑prem & Azure HPC

* Very large Lustre storage environments

* Schedulers: SLURM, LSF

* OS: RHEL, SLES

* HPC ecosystem: MPI, Lustre, CUDA awareness

* Monitoring & observability: Grafana, Nagios, Elastic, etc.

Note: deep hands‑on engineering is not required; strong operational understanding is essential.

Required Experience & Skills

Must‑Have

* Strong experience in HPC operations or HPC service delivery

* Proven operational pilotage / service coordination background

* Experience managing complex environments with multiple vendors

* Solid incident, change, and service governance experience

* Comfortable in business‑critical, high‑visibility environments

Soft Skills

* Strong communication and leadership skills

* Ability to orchestrate teams and deliver through others rather than doing the work directly

* Structured, assertive, and delivery‑focused mindset

Languages

* French: Fluent

* English: Fluent

Benefits Package

* Excellent daily rate, flexible working, learning, development and professional growth opportunities, collaborative and inclusive work environment.

#HPCOperationsLead, #HPCLead, #HPCServiceDeliveryLead, #OperationalPilotage, #ServiceCoordination, #HPCVendorManagement, #jobPyrénées-Atlantiques
