
Setting Up a Development Environment for ML Engineers: From Laptop to Cluster

A deep guide to setting up an ML development environment that scales from a local laptop to shared GPU clusters, covering local GPU setup, remote dev workspaces, notebook-to-IDE workflows, and dev/prod parity.


Most ML environment problems are not caused by the model. They are caused by friction between how engineers develop locally and how systems actually run in shared infrastructure.

One engineer is using a notebook on a MacBook. Another is SSH'd into a remote Linux box. A third is testing against a Kubernetes cluster with a different Python version, a different CUDA stack, and a different model artifact layout. Eventually the team starts asking why things work in one place and break in another.

That is why a good ML development environment setup is not a matter of installing a few packages. It is about designing a path that lets ML engineers move from:

  • quick local iteration
  • to reproducible remote workspaces
  • to shared GPU clusters
  • without constantly changing tools, assumptions, or filesystem layout

This guide is about that path. For a broader look at the infrastructure required to support these environments, see Building an Internal ML Platform on Kubernetes and our Private AI Infrastructure Kubernetes Reference Architecture.

Layer 1: A Clean Local Base Environment

The local laptop should be optimized for iteration, not for pretending to be production. However, it should be powerful enough to run lightweight versions of production tools. For example, using vLLM or LiteLLM locally can help debug model serving logic without needing a full A100 cluster.
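For example, both vLLM and LiteLLM expose an OpenAI-compatible HTTP API, so the same client code can be exercised against a tiny local model and, later, the production endpoint. A minimal sketch, assuming a local server on port 8000 with a small instruct model loaded (the model name is illustrative):

# Sanity-check serving logic against a local OpenAI-compatible server
# (e.g. a local vLLM server or LiteLLM proxy on port 8000).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

response = client.chat.completions.create(
    model="Qwen/Qwen2.5-0.5B-Instruct",  # illustrative small model
    messages=[{"role": "user", "content": "ping"}],
    max_tokens=32,
)
print(response.choices[0].message.content)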

Standardizing on a tool like uv can significantly reduce setup time. A simple pyproject.toml managed by uv ensures that every engineer is running the exact same dependency tree:

[project]
name = "ml-dev-environment"
version = "0.1.0"
dependencies = [
    "torch>=2.2.0",
    "transformers>=4.38.0",
    "vllm>=0.3.0",
    "fastapi>=0.109.0",
]

[tool.uv]
dev-dependencies = [
    "pytest>=8.0.0",
    "ruff>=0.2.0",
    "ipykernel>=6.29.0",
]

Layer 2: Containerized Development (The Environment Contract)

Using VS Code's Dev Containers or DevPod, you can define your entire environment in a devcontainer.json file. This ensures dev/prod parity by mirroring the production base image:

{
  "name": "ML Development",
  "build": {
    "dockerfile": "Dockerfile.dev",
    "context": ".."
  },
  "features": {
    "ghcr.io/devcontainers/features/nvidia-cuda:1": {
      "installCudaDriver": true,
      "cudaVersion": "12.3"
    }
  },
  "customizations": {
    "vscode": {
      "extensions": ["ms-python.python", "ms-toolsai.jupyter", "nvidia.nsight-vscode-edition"]
    }
  },
  "remoteUser": "vscode"
}
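Once the container builds, it is worth a quick sanity check that PyTorch actually sees the GPU before any real work starts:

# Verify the containerized CUDA stack before running anything heavy.
import torch

if torch.cuda.is_available():
    print(f"CUDA {torch.version.cuda} on {torch.cuda.get_device_name(0)}")
else:
    print("No GPU visible; check the NVIDIA container runtime configuration.")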

Layer 3: Remote GPU Workspaces on Kubernetes

When local compute isn't enough, engineers should be able to spin up remote workspaces on a shared cluster. This is where a well-designed GPU development environment on Kubernetes becomes critical.

Using a Custom Resource Definition (CRD) or a tool like DevPod, you can provision a Pod that looks like this:

apiVersion: v1
kind: Pod
metadata:
  name: ml-dev-workspace-shivam
  labels:
    user: shivam
    type: dev-environment
spec:
  containers:
  - name: workspace
    image: resiliotech/ml-base-dev:latest
    resources:
      limits:
        nvidia.com/gpu: 1
        memory: "32Gi"
        cpu: "8"
      requests:
        nvidia.com/gpu: 1
        memory: "16Gi"
        cpu: "4"
    volumeMounts:
    - name: data-volume
      mountPath: /home/jovyan/data
  volumes:
  - name: data-volume
    persistentVolumeClaim:
      claimName: ml-data-pvc-shivam

To manage costs, we recommend implementing GPU Autoscaling and using tools like KEDA to scale down idle dev environments.
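If full KEDA-based autoscaling is more than you need, even a small scheduled job can reclaim idle GPUs. A simplified sketch using the official kubernetes Python client, assuming the type=dev-environment label from the manifest above, an ml-dev namespace, and a hypothetical last-active annotation maintained by your workspace tooling:

# Delete dev workspaces that have been idle for more than four hours.
from datetime import datetime, timedelta, timezone
from kubernetes import client, config

IDLE_LIMIT = timedelta(hours=4)

config.load_kube_config()  # use load_incluster_config() inside a CronJob
v1 = client.CoreV1Api()

pods = v1.list_namespaced_pod("ml-dev", label_selector="type=dev-environment")
for pod in pods.items:
    # Hypothetical annotation, e.g. "2026-04-18T10:00:00+00:00",
    # refreshed by the IDE/SSH heartbeat.
    last_active = (pod.metadata.annotations or {}).get("dev.example.com/last-active")
    if not last_active:
        continue
    if datetime.now(timezone.utc) - datetime.fromisoformat(last_active) > IDLE_LIMIT:
        print(f"Reclaiming idle workspace {pod.metadata.name}")
        v1.delete_namespaced_pod(pod.metadata.name, "ml-dev")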

The Notebook-to-IDE Transition

The goal is not to ban notebooks. The goal is to stop using them as the only execution surface. A proper ML engineer dev environment lets you import logic from a src/ directory into a notebook for visualization, then promote that same logic to a production pipeline without rewriting it. This is a key part of moving up the MLOps Maturity Model.
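Concretely, the same function serves both surfaces (the file layout and column names here are illustrative):

# src/features.py -- importable logic with no notebook state.
import pandas as pd

def add_session_features(df: pd.DataFrame) -> pd.DataFrame:
    """Derive per-session features used in exploration and training alike."""
    out = df.copy()
    out["events_per_minute"] = out["event_count"] / out["session_minutes"]
    return out

# Notebook cell (visualization):
#   from src.features import add_session_features
#   add_session_features(sessions_df)["events_per_minute"].hist()
#
# Production pipeline step (no rewrite):
#   features = add_session_features(load_sessions())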

Observability in Dev

ML engineers should not first encounter logs and metrics in production. Your dev environment should include access to local or dev-cluster Prometheus/Grafana instances. Being able to see GPU utilization in real time during a training run is essential for debugging OOMKilled GPU Pods.
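One lightweight pattern is to export GPU memory from the training process itself so a dev-cluster Prometheus can scrape it. A sketch using pynvml and prometheus_client (the library choices are an assumption, not a requirement of any particular stack):

# Expose live GPU memory usage as a Prometheus gauge during training.
import time
import pynvml
from prometheus_client import Gauge, start_http_server

gpu_mem_used = Gauge("dev_gpu_memory_used_bytes", "GPU memory currently in use")

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)
start_http_server(9100)  # scrape target for the dev-cluster Prometheus

while True:  # in practice, run this alongside the training loop
    mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
    gpu_mem_used.set(mem.used)
    time.sleep(15)

For node-level metrics, NVIDIA's DCGM exporter already covers this ground; an in-process gauge like the one above is mainly useful for correlating memory spikes with specific training steps.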

Final Takeaway

Good ML development environment setup is a systems design problem, not a laptop setup checklist. By layering local speed with remote consistency and shared GPU access, you create a workflow that scales with your team.

At Resilio Tech, we help teams design and implement these layered environments, ensuring that your engineers spend less time debugging CUDA drivers and more time shipping models. Whether you need a standardized Dev Container strategy or a full-scale Kubernetes-based remote dev platform, we can build the bridge from laptop to cluster.

Ready to modernize your ML developer experience? Contact Resilio Tech for a platform assessment.
