AI Cloud Systems Architect
TrekAI
U.S. Citizens Only
Hybrid/Remote
Role Purpose
TrekAI is reinventing education through AI. As a high-growth, mission-driven startup, we build technology that directly impacts teachers, students, and school systems.
As TrekAI’s AI Cloud Systems Architect, you will lead the design, scalability, and security of our AI and cloud platform architecture across infrastructure, distributed systems, AI/ML services, and data platforms. This is a highly hands-on role, not a “paper architect” position. You will build proofs-of-concept, develop infrastructure-as-code, support deployment pipelines, and help drive technical execution alongside engineering leadership.
Working closely with the CTO, the CPO, and the Director of Engineering, you will help shape TrekAI’s long-term architecture strategy while ensuring the platform remains scalable, resilient, secure, and compliant with student data privacy standards.
Key Responsibilities
Cloud Systems & Software Architecture
* Own cloud-native reference architecture (e.g., GCP/DigitalOcean): VPC, IAM, service meshes, Kubernetes, autoscaling, multi-tenant isolation.
* Drive Infrastructure-as-Code (e.g., Kubernetes, Terraform, Ansible) and GitOps (e.g., ArgoCD) adoption.
* Design for high availability, disaster recovery (RPO/RTO targets), and cost efficiency.
* Define the end-to-end reference architecture: microservices, APIs, orchestration, messaging/event buses, and data pipelines.
* Design patterns for multi-agent coordination, experimentation (feature flags, A/B tests), and fast AI model rollouts.
* Maintain clear boundaries between core IP, third-party components, and open-source dependencies.
AI & Data Infrastructure
* Architect scalable model-serving platforms with autoscaling and GPU scheduling.
* Build secure feature stores (e.g., Redis) and vector retrieval pipelines using technologies such as PostgreSQL/pgvector and Pinecone.
* Support continuous training/retraining workflows with CI/CD integration and drift monitoring.
Security Architecture
* Embed zero-trust security at every layer: RBAC, workload identity, network segmentation, secret management (Vault/KMS).
* Perform lightweight threat analysis for model pipelines, APIs, and user data flows; ensure FERPA, COPPA, and GDPR compliance.
Required Experience
* BS/MS in Computer Engineering, Computer Science, Systems Engineering, or related field.
* Strong experience with Kubernetes, virtual machines, service meshes, serverless computing patterns, and cloud networking.
* 10+ years designing and building distributed systems, with direct IaC experience.
* Proven ability to design modular, scalable microservices architectures.
* Solid knowledge of and administrative experience with Linux distributions (e.g., Ubuntu, Debian, RHEL), cloud networking administration, and client-side Windows and macOS.
* Solid programming and database skills: Python, React, Node.js, Java, JSON, SQL, NoSQL.
* Skilled in secure cloud architecture, identity and access management, encryption strategies, and compliance-by-design.
* Excellent communicator who can work with engineers, ML scientists, and product teams to translate requirements into scalable design.
* Educational technology (EdTech) or SaaS platform experience is a plus.
* Startup experience is a plus.