UKIMEA Advisory and Professional Services Sovereign AI Enterprise Architect

This role has been designated as ‘Remote/Teleworker’, which means you will primarily work from home.

Who We Are:

Hewlett Packard Enterprise is the global edge-to-cloud company advancing the way people live and work. We help companies connect, protect, analyze, and act on their data and applications wherever they live, from edge to cloud, so they can turn insights into outcomes at the speed required to thrive in today’s complex world. Our culture thrives on finding new and better ways to accelerate what’s next. We know varied backgrounds are valued and succeed here. We have the flexibility to manage our work and personal needs. We make bold moves, together, and are a force for good. If you are looking to stretch and grow your career, our culture will embrace you. Open up opportunities with HPE.

Job Description:

Sovereign AI Enterprise Architect

UKIMEA Advisory & Professional Services

Location: London

Type: Permanent, full time

About the Role

We are seeking a Sovereign AI Enterprise Architect to support strategic UKIMEA customers in designing, deploying, and scaling secure, high‑performance AI platforms. This role sits at the intersection of AI infrastructure, HPC, Kubernetes, and enterprise architecture, helping sovereign and regulated organisations build resilient, compliant AI environments.

You will act as a trusted technical advisor, working closely with customers, partners, and internal teams to architect end‑to‑end AI solutions—from bare metal and GPUs through to orchestration, operations, and governance.

Key Responsibilities

  • Design and architect sovereign AI platforms for enterprise and public sector customers
  • Lead end‑to‑end AI infrastructure deployments, from design through implementation
  • Advise on Kubernetes-based AI platforms, GPU clusters, and HPC integrations
  • Partner with stakeholders to translate business requirements into scalable architectures
  • Support solution validation, performance tuning, and operational readiness
  • Provide technical leadership across customer engagements and advisory projects

Required Technical Experience

Container Platforms & Automation

  • Strong, demonstrated experience deploying and configuring enterprise Kubernetes platforms, including:
      • Rancher RKE2
      • Red Hat OpenShift
      • CNCF-compliant Kubernetes

Note: Experience with MicroK8s or Kubespray alone is not considered sufficient enterprise experience.

  • 5+ years hands-on Linux experience (RHEL, Ubuntu)
  • Strong background in Ansible automation frameworks
  • Experience integrating platforms using REST APIs

AI, HPC & Accelerated Computing

  • Familiarity with Slurm-on-Kubernetes frameworks (e.g. Slinky, SUNK)
  • Strong understanding of distributed systems architecture (GPU clusters, multi-node training)
  • Knowledge of HPC architectures, network topologies, and high-performance storage
  • Experience with GPU and accelerator platforms (NVIDIA, AMD, or custom ASICs)
  • Familiarity with CUDA, NCCL, and distributed training optimisation
  • Knowledge of NVIDIA AI Enterprise tooling, including Base Command Manager (BCM) and Data Center GPU Manager (DCGM)

Core Technical Skills

Infrastructure & Platforms

  • High-performance storage (NVMe, parallel file systems, object storage)
  • Data centre infrastructure (power, cooling, racks, redundancy)
  • Advanced networking (InfiniBand, RoCE, RDMA, 100–800GbE fabrics)
  • Virtualisation and containerisation (Docker, Kubernetes, OpenShift)
  • Infrastructure as Code: Terraform, Ansible, Pulumi

AI & MLOps

  • AI training and inference pipelines
  • Model lifecycle management and MLOps platforms
  • Data pipeline orchestration (Airflow, Kubeflow)
  • Performance benchmarking and workload profiling
  • Large-scale model deployment (on‑prem, edge, hybrid cloud)

Cloud, Security & Compliance

Cloud & Hybrid

  • AI services across AWS, Azure, and GCP
  • Hybrid cloud design and migration strategies
  • Secure connectivity between on‑prem AI systems and cloud environments
  • Cost optimisation for large-scale compute workloads

Security & Governance

  • Zero-trust architecture principles
  • Identity and Access Management (IAM)
  • Data governance and privacy controls
  • Secure multi‑tenant AI platforms
  • Regulatory compliance (e.g. ISO, SOC 2, GDPR)

Operations & Reliability
