Cluster status Thu, 18 Sep 2025 14:32:01 +0000 report from beholder01
Resource usage (overall)
Resource usage (by namespace)
Ceph file system status
Etcd cluster status
Detailed network health report
NVIDIA Driver and GPU reports
Imp
Dretch
Belial
Fierna
Tiamat
Vecna
Asmodeus
Zariel
Demogorgon
Resource usage (overall)
Resource Requested Limit Allocatable Free
cpu (38%) 614.2 (50%) 807.0 1.6k 809.0
├─ asmodeus (98%) 250.3 (98%) 250.1 256.0 5.7
│ ├─ asmodeus-all-gpu-julian-schaefer-zimmermann 250.0 250.0
│ ├─ dnsutils-asmodeus 100.0m 100.0m
│ └─ kube-router-qv489 250.0m 0.0
├─ beholder01 (4%) 900.0m (0%) 100.0m 24.0 23.1
│ ├─ dnsutils-beholder01 100.0m 100.0m
│ ├─ kube-apiserver-beholder01 250.0m 0.0
│ ├─ kube-controller-manager-beholder01 200.0m 0.0
│ ├─ kube-router-d2rq4 250.0m 0.0
│ └─ kube-scheduler-beholder01 100.0m 0.0
├─ beholder02 (8%) 1.9 (9%) 2.1 24.0 21.9
│ ├─ dnsutils-beholder02 100.0m 100.0m
│ ├─ kube-apiserver-beholder02 250.0m 0.0
│ ├─ kube-controller-manager-beholder02 200.0m 0.0
│ ├─ kube-router-dlcmv 250.0m 0.0
│ ├─ kube-scheduler-beholder02 100.0m 0.0
│ ├─ wordpress-8c8944c8d-cxbrq 500.0m 1.0
│ └─ wordpress-mariadb-565c547dc8-59fhr 500.0m 1.0
├─ beholder03 (4%) 900.0m (0%) 100.0m 24.0 23.1
│ ├─ dnsutils-beholder03 100.0m 100.0m
│ ├─ kube-apiserver-beholder03 250.0m 0.0
│ ├─ kube-controller-manager-beholder03 200.0m 0.0
│ ├─ kube-router-hm6f8 250.0m 0.0
│ └─ kube-scheduler-beholder03 100.0m 0.0
├─ belial (74%) 59.4 (146%) 117.1 80.0 0.0
│ ├─ dnsutils-belial 100.0m 100.0m
│ ├─ gpu-pod 1.0 1.0
│ ├─ kube-router-bbv5t 250.0m 0.0
│ ├─ mariya-automl-well-being-job-7w56h 4.0 8.0
│ ├─ mariya-lstm-well-being-job-894nj 4.0 8.0
│ └─ ubuntu-gpu1 50.0 100.0
├─ demogorgon (92%) 88.3 (92%) 88.1 96.0 7.7
│ ├─ a2v2-triple-gpu-jcsz 16.0 16.0
│ ├─ dnsutils-demogorgon 100.0m 100.0m
│ ├─ gpu-demogorgon 48.0 48.0
│ ├─ kube-router-nh4mn 250.0m 0.0
│ └─ ulas-bingoel-model-pod-6 24.0 24.0
├─ fierna (33%) 26.4 (33%) 26.1 80.0 53.6
│ ├─ beesbook-days-0-qb4cm 12.0 12.0
│ ├─ beesbook-days-1-mwm9s 12.0 12.0
│ ├─ dnsutils-fierna 100.0m 100.0m
│ ├─ gpu-pod 1.0 1.0
│ ├─ kube-router-nndmg 250.0m 0.0
│ └─ mi-swarm-pod 1.0 1.0
├─ kiaransalee (0%) 350.0m (0%) 100.0m 192.0 191.7
│ ├─ dnsutils-kiaransalee 100.0m 100.0m
│ └─ kube-router-bw8bz 250.0m 0.0
├─ lolth (2%) 750.0m (8%) 3.1 40.0 36.9
│ ├─ coredns-55cb58b774-cxfmw 100.0m 0.0
│ ├─ dnsutils-lolth 100.0m 100.0m
│ ├─ gatekeeper-audit-59d4b6fd4c-q7cj8 100.0m 1.0
│ ├─ gatekeeper-controller-manager-66f474f785-qfkmj 100.0m 1.0
│ ├─ kube-router-z6szh 250.0m 0.0
│ └─ ubuntu-test-pod 100.0m 1.0
├─ mindflayer01 (4%) 2.3 (8%) 5.1 64.0 58.9
│ ├─ dex-7ddbb96f79-cmc5m 250.0m 0.0
│ ├─ dex-loginapp-7d9c6ff56-lhj2r 100.0m 0.0
│ ├─ dex-mysql-66b5885f7b-2wdw8 100.0m 0.0
│ ├─ dnsutils-mindflayer01 100.0m 100.0m
│ ├─ gatekeeper-controller-manager-66f474f785-lxl5x 100.0m 1.0
│ ├─ kube-router-rzzr2 250.0m 0.0
│ ├─ login-pod 60.0m 1.0
│ ├─ mariadb-849498cb44-dldtm 500.0m 1.0
│ ├─ prometheus-deployment-b65d5d898-t2z9l 500.0m 1.0
│ ├─ registry-7fcf447f4-m5sz9 100.0m 0.0
│ ├─ registry-auth-7dcc7f6766-m8bsh 100.0m 0.0
│ └─ temp-pod 100.0m 1.0
├─ mindflayer02 (34%) 21.6 (44%) 28.1 64.0 35.9
│ ├─ cool-pod 10.0 16.0
│ ├─ dnsutils-mindflayer02 100.0m 100.0m
│ ├─ gpu-pod-aalbi 10.0 10.0
│ ├─ kube-router-k8t5k 250.0m 0.0
│ ├─ mediawiki-mariadb-5cc4866855-xffs4 250.0m 0.0
│ └─ phpfpm-nginx-5db96d6895-97f2g 1.0 2.0
├─ mindflayer03 (3%) 2.1 (4%) 2.6 64.0 61.4
│ ├─ check-data-pod 100.0m 500.0m
│ ├─ coredns-55cb58b774-4kpwk 100.0m 0.0
│ ├─ dnsutils-mindflayer03 100.0m 100.0m
│ ├─ gatekeeper-controller-manager-66f474f785-5gv4n 100.0m 1.0
│ ├─ kube-router-ptw9p 250.0m 0.0
│ ├─ ldap-6987986dbc-2zr2c 250.0m 0.0
│ ├─ ledavio-text-search 1.0 1.0
│ ├─ system-registry-8877cd57-59pzr 100.0m 0.0
│ └─ user-registry-64bb8ff7cf-5cktq 100.0m 0.0
├─ tiamat (0%) 350.0m (0%) 100.0m 256.0 255.7
│ ├─ dnsutils-tiamat 100.0m 100.0m
│ └─ kube-router-6zps8 250.0m 0.0
├─ vecna (17%) 16.4 (104%) 100.1 96.0 0.0
│ ├─ dnsutils-vecna 100.0m 100.0m
│ ├─ felix-petersen-job-20 16.0 100.0
│ └─ kube-router-f67nm 250.0m 0.0
└─ zariel (56%) 142.3 (72%) 184.1 256.0 71.9
  ├─ dnsutils-zariel 100.0m 100.0m
  ├─ julian-welzel-job-1 96.0 96.0
  ├─ kube-router-fww75 250.0m 0.0
  ├─ mariya-cnn-well-being-classifier-job-7kx7t 4.0 8.0
  ├─ oliver-wiedemann-pod 32.0 64.0
  └─ yolo2-kc67n 10.0 16.0
ephemeral-storage (1%) 100.0Gi (1%) 150.0Gi 11.9T 11.7T
├─ asmodeus (0%) 0.0 (0%) 0.0 94.6G 94.6G
├─ beholder01 (0%) 0.0 (0%) 0.0 1.7T 1.7T
├─ beholder02 (0%) 0.0 (0%) 0.0 1.7T 1.7T
├─ beholder03 (0%) 0.0 (0%) 0.0 1.7T 1.7T
├─ belial (57%) 100.0Gi (85%) 150.0Gi 189.2G 28.2G
│ └─ ubuntu-gpu1 100.0Gi 150.0Gi
├─ demogorgon (0%) 0.0 (0%) 0.0 706.7G 706.7G
├─ fierna (0%) 0.0 (0%) 0.0 189.2G 189.2G
├─ kiaransalee (0%) 0.0 (0%) 0.0 1.7T 1.7T
├─ lolth (0%) 0.0 (0%) 0.0 530.5G 530.5G
├─ mindflayer01 (0%) 0.0 (0%) 0.0 211.5G 211.5G
├─ mindflayer02 (0%) 0.0 (0%) 0.0 211.5G 211.5G
├─ mindflayer03 (0%) 0.0 (0%) 0.0 211.5G 211.5G
├─ tiamat (0%) 0.0 (0%) 0.0 164.4G 164.4G
├─ vecna (0%) 0.0 (0%) 0.0 849.0G 849.0G
└─ zariel (0%) 0.0 (0%) 0.0 1.7T 1.7T
memory (30%) 4.2T (45%) 5.9Ti 12.9Ti 7.0Ti
├─ asmodeus (77%) 1.5Ti (77%) 1.5Ti 2.0Ti 467.4Gi
│ ├─ asmodeus-all-gpu-julian-schaefer-zimmermann 1.5Ti 1.5Ti
│ ├─ dnsutils-asmodeus 100.0Mi 100.0Mi
│ └─ kube-router-qv489 250.0Mi 0.0
├─ beholder01 (0%) 350.0Mi (0%) 100.0Mi 92.9Gi 92.5Gi
│ ├─ dnsutils-beholder01 100.0Mi 100.0Mi
│ └─ kube-router-d2rq4 250.0Mi 0.0
├─ beholder02 (1%) 862.0Mi (1%) 1.1Gi 92.9Gi 91.8Gi
│ ├─ dnsutils-beholder02 100.0Mi 100.0Mi
│ ├─ kube-router-dlcmv 250.0Mi 0.0
│ ├─ wordpress-8c8944c8d-cxbrq 256.0Mi 512.0Mi
│ └─ wordpress-mariadb-565c547dc8-59fhr 256.0Mi 512.0Mi
├─ beholder03 (0%) 350.0Mi (0%) 100.0Mi 92.9Gi 92.5Gi
│ ├─ dnsutils-beholder03 100.0Mi 100.0Mi
│ └─ kube-router-hm6f8 250.0Mi 0.0
├─ belial (45%) 340.3Gi (70%) 530.1Gi 754.4Gi 224.3Gi
│ ├─ dnsutils-belial 100.0Mi 100.0Mi
│ ├─ gpu-pod 10.0Gi 10.0Gi
│ ├─ kube-router-bbv5t 250.0Mi 0.0
│ ├─ mariya-automl-well-being-job-7w56h 40.0Gi 60.0Gi
│ ├─ mariya-lstm-well-being-job-894nj 40.0Gi 60.0Gi
│ └─ ubuntu-gpu1 250.0Gi 400.0Gi
├─ demogorgon (59%) 1.2Ti (59%) 1.2Ti 2.0Ti 819.5Gi
│ ├─ a2v2-triple-gpu-jcsz 1.0Ti 1.0Ti
│ ├─ dnsutils-demogorgon 100.0Mi 100.0Mi
│ ├─ gpu-demogorgon 80.0Gi 80.0Gi
│ ├─ kube-router-nh4mn 250.0Mi 0.0
│ └─ ulas-bingoel-model-pod-6 80.0Gi 80.0Gi
├─ fierna (11%) 84.3Gi (20%) 148.1Gi 754.4Gi 606.3Gi
│ ├─ beesbook-days-0-qb4cm 32.0Gi 64.0Gi
│ ├─ beesbook-days-1-mwm9s 32.0Gi 64.0Gi
│ ├─ dnsutils-fierna 100.0Mi 100.0Mi
│ ├─ gpu-pod 10.0Gi 10.0Gi
│ ├─ kube-router-nndmg 250.0Mi 0.0
│ └─ mi-swarm-pod 10.0Gi 10.0Gi
├─ kiaransalee (1%) 9.0G (0%) 100.0Mi 1.5Ti 1.6T
│ ├─ dnsutils-kiaransalee 100.0Mi 100.0Mi
│ ├─ jupyter-aivan-2dau 1.1G 0.0
│ ├─ jupyter-alebellina412 1.1G 0.0
│ ├─ jupyter-huygenssteiner 1.1G 0.0
│ ├─ jupyter-jamannio 1.1G 0.0
│ ├─ jupyter-kmayer24 1.1G 0.0
│ ├─ jupyter-matheus-2dstefanini-2dmariano 1.1G 0.0
│ ├─ jupyter-samrauh 1.1G 0.0
│ ├─ jupyter-saroyehun 1.1G 0.0
│ └─ kube-router-bw8bz 250.0Mi 0.0
├─ lolth (0%) 1.0Gi (1%) 2.3Gi 251.4Gi 249.2Gi
│ ├─ coredns-55cb58b774-cxfmw 70.0Mi 170.0Mi
│ ├─ dnsutils-lolth 100.0Mi 100.0Mi
│ ├─ gatekeeper-audit-59d4b6fd4c-q7cj8 256.0Mi 512.0Mi
│ ├─ gatekeeper-controller-manager-66f474f785-qfkmj 256.0Mi 512.0Mi
│ ├─ kube-router-z6szh 250.0Mi 0.0
│ └─ ubuntu-test-pod 100.0Mi 1.0Gi
├─ mindflayer01 (0%) 1.6G (1%) 4.1Gi 376.5Gi 372.4Gi
│ ├─ dnsutils-mindflayer01 100.0Mi 100.0Mi
│ ├─ gatekeeper-controller-manager-66f474f785-lxl5x 256.0Mi 512.0Mi
│ ├─ kube-router-rzzr2 250.0Mi 0.0
│ ├─ login-pod 100.0Mi 1.0Gi
│ ├─ mariadb-849498cb44-dldtm 256.0Mi 512.0Mi
│ ├─ prometheus-deployment-b65d5d898-t2z9l 500.0M 1.0Gi
│ └─ temp-pod 100.0Mi 1.0Gi
├─ mindflayer02 (20%) 74.8Gi (20%) 75.1Gi 376.5Gi 301.4Gi
│ ├─ cool-pod 64.0Gi 64.0Gi
│ ├─ dnsutils-mindflayer02 100.0Mi 100.0Mi
│ ├─ gpu-pod-aalbi 10.0Gi 10.0Gi
│ ├─ kube-router-k8t5k 250.0Mi 0.0
│ └─ phpfpm-nginx-5db96d6895-97f2g 512.0Mi 1.0Gi
├─ mindflayer03 (9%) 32.8Gi (9%) 33.3Gi 376.5Gi 343.2Gi
│ ├─ check-data-pod 100.0Mi 500.0Mi
│ ├─ coredns-55cb58b774-4kpwk 70.0Mi 170.0Mi
│ ├─ dnsutils-mindflayer03 100.0Mi 100.0Mi
│ ├─ gatekeeper-controller-manager-66f474f785-5gv4n 256.0Mi 512.0Mi
│ ├─ kube-router-ptw9p 250.0Mi 0.0
│ └─ ledavio-text-search 32.0Gi 32.0Gi
├─ tiamat (0%) 350.0Mi (0%) 100.0Mi 1007.6Gi 1007.2Gi
│ ├─ dnsutils-tiamat 100.0Mi 100.0Mi
│ └─ kube-router-6zps8 250.0Mi 0.0
├─ vecna (7%) 100.3Gi (106%) 1.6Ti 1.5Ti 0.0
│ ├─ dnsutils-vecna 100.0Mi 100.0Mi
│ ├─ felix-petersen-job-20 100.0Gi 1.6Ti
│ └─ kube-router-f67nm 250.0Mi 0.0
└─ zariel (28%) 560.3Gi (44%) 890.1Gi 2.0Ti 1.1Ti
  ├─ dnsutils-zariel 100.0Mi 100.0Mi
  ├─ julian-welzel-job-1 100.0Gi 400.0Gi
  ├─ kube-router-fww75 250.0Mi 0.0
  ├─ mariya-cnn-well-being-classifier-job-7kx7t 30.0Gi 60.0Gi
  ├─ oliver-wiedemann-pod 400.0Gi 400.0Gi
  └─ yolo2-kc67n 30.0Gi 30.0Gi
nvidia.com/gpu (77%) 48.0 (77%) 48.0 62.0 14.0
├─ asmodeus (100%) 4.0 (100%) 4.0 4.0 0.0
│ └─ asmodeus-all-gpu-julian-schaefer-zimmermann 4.0 4.0
├─ belial (50%) 4.0 (50%) 4.0 8.0 4.0
│ ├─ gpu-pod 1.0 1.0
│ ├─ mariya-automl-well-being-job-7w56h 1.0 1.0
│ ├─ mariya-lstm-well-being-job-894nj 1.0 1.0
│ └─ ubuntu-gpu1 1.0 1.0
├─ demogorgon (75%) 6.0 (75%) 6.0 8.0 2.0
│ ├─ a2v2-triple-gpu-jcsz 3.0 3.0
│ ├─ gpu-demogorgon 1.0 1.0
│ └─ ulas-bingoel-model-pod-6 2.0 2.0
├─ fierna (50%) 4.0 (50%) 4.0 8.0 4.0
│ ├─ beesbook-days-0-qb4cm 1.0 1.0
│ ├─ beesbook-days-1-mwm9s 1.0 1.0
│ ├─ gpu-pod 1.0 1.0
│ └─ mi-swarm-pod 1.0 1.0
├─ kiaransalee (100%) 6.0 (100%) 6.0 6.0 0.0
│ ├─ jupyter-aivan-2dau 1.0 1.0
│ ├─ jupyter-alebellina412 1.0 1.0
│ ├─ jupyter-huygenssteiner 1.0 1.0
│ ├─ jupyter-jamannio 1.0 1.0
│ ├─ jupyter-matheus-2dstefanini-2dmariano 1.0 1.0
│ └─ jupyter-saroyehun 1.0 1.0
├─ tiamat (0%) 0.0 (0%) 0.0 4.0 4.0
├─ vecna (100%) 16.0 (100%) 16.0 16.0 0.0
│ └─ felix-petersen-job-20 16.0 16.0
└─ zariel (100%) 8.0 (100%) 8.0 8.0 0.0
  ├─ julian-welzel-job-1 4.0 4.0
  ├─ mariya-cnn-well-being-classifier-job-7kx7t 1.0 1.0
  ├─ oliver-wiedemann-pod 2.0 2.0
  └─ yolo2-kc67n 1.0 1.0
nvidia.com/mig-1g.10gb (29%) 2.0 (29%) 2.0 7.0 5.0
└─ kiaransalee (29%) 2.0 (29%) 2.0 7.0 5.0
  ├─ jupyter-kmayer24 1.0 1.0
  └─ jupyter-samrauh 1.0 1.0
nvidia.com/mig-3g.40gb (0%) 0.0 (0%) 0.0 1.0 1.0
└─ kiaransalee (0%) 0.0 (0%) 0.0 1.0 1.0
nvidia.com/mig-4g.40gb (0%) 0.0 (0%) 0.0 1.0 1.0
└─ kiaransalee (0%) 0.0 (0%) 0.0 1.0 1.0
pods (9%) 151.0 (9%) 151.0 1.6k 1.5k
├─ asmodeus (5%) 5.0 (5%) 5.0 110.0 105.0
│ ├─ asmodeus-all-gpu-julian-schaefer-zimmermann 1.0 1.0
│ ├─ dnsutils-asmodeus 1.0 1.0
│ ├─ kube-proxy-rbzdb 1.0 1.0
│ ├─ kube-router-qv489 1.0 1.0
│ └─ nvidia-device-plugin-daemonset-lrkw7 1.0 1.0
├─ beholder01 (5%) 6.0 (5%) 6.0 110.0 104.0
│ ├─ dnsutils-beholder01 1.0 1.0
│ ├─ kube-apiserver-beholder01 1.0 1.0
│ ├─ kube-controller-manager-beholder01 1.0 1.0
│ ├─ kube-proxy-7f5v4 1.0 1.0
│ ├─ kube-router-d2rq4 1.0 1.0
│ └─ kube-scheduler-beholder01 1.0 1.0
├─ beholder02 (7%) 8.0 (7%) 8.0 110.0 102.0
│ ├─ dnsutils-beholder02 1.0 1.0
│ ├─ kube-apiserver-beholder02 1.0 1.0
│ ├─ kube-controller-manager-beholder02 1.0 1.0
│ ├─ kube-proxy-dlc5n 1.0 1.0
│ ├─ kube-router-dlcmv 1.0 1.0
│ ├─ kube-scheduler-beholder02 1.0 1.0
│ ├─ wordpress-8c8944c8d-cxbrq 1.0 1.0
│ └─ wordpress-mariadb-565c547dc8-59fhr 1.0 1.0
├─ beholder03 (5%) 6.0 (5%) 6.0 110.0 104.0
│ ├─ dnsutils-beholder03 1.0 1.0
│ ├─ kube-apiserver-beholder03 1.0 1.0
│ ├─ kube-controller-manager-beholder03 1.0 1.0
│ ├─ kube-proxy-jqcj7 1.0 1.0
│ ├─ kube-router-hm6f8 1.0 1.0
│ └─ kube-scheduler-beholder03 1.0 1.0
├─ belial (8%) 9.0 (8%) 9.0 110.0 101.0
│ ├─ dnsutils-belial 1.0 1.0
│ ├─ gpu-feature-discovery-4w2g5 1.0 1.0
│ ├─ gpu-pod 1.0 1.0
│ ├─ kube-proxy-nsltd 1.0 1.0
│ ├─ kube-router-bbv5t 1.0 1.0
│ ├─ mariya-automl-well-being-job-7w56h 1.0 1.0
│ ├─ mariya-lstm-well-being-job-894nj 1.0 1.0
│ ├─ nvidia-device-plugin-daemonset-kn6h4 1.0 1.0
│ └─ ubuntu-gpu1 1.0 1.0
├─ demogorgon (6%) 7.0 (6%) 7.0 110.0 103.0
│ ├─ a2v2-triple-gpu-jcsz 1.0 1.0
│ ├─ dnsutils-demogorgon 1.0 1.0
│ ├─ gpu-demogorgon 1.0 1.0
│ ├─ kube-proxy-gknqq 1.0 1.0
│ ├─ kube-router-nh4mn 1.0 1.0
│ ├─ nvidia-device-plugin-daemonset-wcsbj 1.0 1.0
│ └─ ulas-bingoel-model-pod-6 1.0 1.0
├─ fierna (8%) 9.0 (8%) 9.0 110.0 101.0
│ ├─ beesbook-days-0-qb4cm 1.0 1.0
│ ├─ beesbook-days-1-mwm9s 1.0 1.0
│ ├─ dnsutils-fierna 1.0 1.0
│ ├─ gpu-feature-discovery-8tz6l 1.0 1.0
│ ├─ gpu-pod 1.0 1.0
│ ├─ kube-proxy-9tg4f 1.0 1.0
│ ├─ kube-router-nndmg 1.0 1.0
│ ├─ mi-swarm-pod 1.0 1.0
│ └─ nvidia-device-plugin-daemonset-r5pcj 1.0 1.0
├─ kiaransalee (14%) 15.0 (14%) 15.0 110.0 95.0
│ ├─ continuous-image-puller-6fs4k 1.0 1.0
│ ├─ continuous-image-puller-6z8bj 1.0 1.0
│ ├─ dnsutils-kiaransalee 1.0 1.0
│ ├─ gpu-feature-discovery-znz6m 1.0 1.0
│ ├─ jupyter-aivan-2dau 1.0 1.0
│ ├─ jupyter-alebellina412 1.0 1.0
│ ├─ jupyter-huygenssteiner 1.0 1.0
│ ├─ jupyter-jamannio 1.0 1.0
│ ├─ jupyter-kmayer24 1.0 1.0
│ ├─ jupyter-matheus-2dstefanini-2dmariano 1.0 1.0
│ ├─ jupyter-samrauh 1.0 1.0
│ ├─ jupyter-saroyehun 1.0 1.0
│ ├─ kube-proxy-65675 1.0 1.0
│ ├─ kube-router-bw8bz 1.0 1.0
│ └─ nvidia-device-plugin-daemonset-wj2wf 1.0 1.0
├─ lolth (16%) 18.0 (16%) 18.0 110.0 92.0
│ ├─ coredns-55cb58b774-cxfmw 1.0 1.0
│ ├─ dnsutils-lolth 1.0 1.0
│ ├─ echo1-77fbfb54d-8t4dq 1.0 1.0
│ ├─ echo1-77fbfb54d-hcrph 1.0 1.0
│ ├─ echo2-5d58759df-ssc7q 1.0 1.0
│ ├─ gatekeeper-audit-59d4b6fd4c-q7cj8 1.0 1.0
│ ├─ gatekeeper-controller-manager-66f474f785-qfkmj 1.0 1.0
│ ├─ hub-767b56fc4d-5k8hh 1.0 1.0
│ ├─ hub-78d6dd898d-89np9 1.0 1.0
│ ├─ hub-85db65cb54-8tktv 1.0 1.0
│ ├─ kube-proxy-cl9gc 1.0 1.0
│ ├─ kube-router-z6szh 1.0 1.0
│ ├─ kube-state-metrics-8945855d-9fqtg 1.0 1.0
│ ├─ nvidia-device-plugin-daemonset-smgtn 1.0 1.0
│ ├─ proxy-5bc89cc587-z8q9p 1.0 1.0
│ ├─ ubuntu-test-pod 1.0 1.0
│ ├─ user-scheduler-5cf5ffbc54-htfdj 1.0 1.0
│ └─ user-scheduler-c7db6c584-cf297 1.0 1.0
├─ mindflayer01 (21%) 23.0 (21%) 23.0 110.0 87.0
│ ├─ dex-7ddbb96f79-cmc5m 1.0 1.0
│ ├─ dex-loginapp-7d9c6ff56-lhj2r 1.0 1.0
│ ├─ dex-mysql-66b5885f7b-2wdw8 1.0 1.0
│ ├─ dnsutils-mindflayer01 1.0 1.0
│ ├─ gatekeeper-controller-manager-66f474f785-lxl5x 1.0 1.0
│ ├─ kube-proxy-d2xl4 1.0 1.0
│ ├─ kube-router-rzzr2 1.0 1.0
│ ├─ local-path-provisioner-759479454f-jxl54 1.0 1.0
│ ├─ login-pod 1.0 1.0
│ ├─ mariadb-849498cb44-dldtm 1.0 1.0
│ ├─ memcached-578474d6f9-8dgb8 1.0 1.0
│ ├─ nginx-frontend-5749b578b9-fjshf 1.0 1.0
│ ├─ nginx-ip-2023-78b9c84dbf-znhrx 1.0 1.0
│ ├─ nginx-k8s-7c8d949b5f-grwnx 1.0 1.0
│ ├─ nginx-rec-2023-79df8c77cd-gg87f 1.0 1.0
│ ├─ nginx-rsn-2024-7dbc6b668b-jbjk7 1.0 1.0
│ ├─ nginx-self-service-password-65b44f7547-dhvwp 1.0 1.0
│ ├─ nvidia-device-plugin-daemonset-5gmd7 1.0 1.0
│ ├─ pdf-b647544f-snvmm 1.0 1.0
│ ├─ prometheus-deployment-b65d5d898-t2z9l 1.0 1.0
│ ├─ registry-7fcf447f4-m5sz9 1.0 1.0
│ ├─ registry-auth-7dcc7f6766-m8bsh 1.0 1.0
│ └─ temp-pod 1.0 1.0
├─ mindflayer02 (7%) 8.0 (7%) 8.0 110.0 102.0
│ ├─ cool-pod 1.0 1.0
│ ├─ dnsutils-mindflayer02 1.0 1.0
│ ├─ gpu-pod-aalbi 1.0 1.0
│ ├─ kube-proxy-zbwk4 1.0 1.0
│ ├─ kube-router-k8t5k 1.0 1.0
│ ├─ mediawiki-mariadb-5cc4866855-xffs4 1.0 1.0
│ ├─ nvidia-device-plugin-daemonset-7wv2j 1.0 1.0
│ └─ phpfpm-nginx-5db96d6895-97f2g 1.0 1.0
├─ mindflayer03 (16%) 18.0 (16%) 18.0 110.0 92.0
│ ├─ check-data-pod 1.0 1.0
│ ├─ coredns-55cb58b774-4kpwk 1.0 1.0
│ ├─ dnsutils-mindflayer03 1.0 1.0
│ ├─ echo1-77fbfb54d-8bnp6 1.0 1.0
│ ├─ echo1-77fbfb54d-rcwzd 1.0 1.0
│ ├─ gatekeeper-controller-manager-66f474f785-5gv4n 1.0 1.0
│ ├─ kube-proxy-b68xq 1.0 1.0
│ ├─ kube-router-ptw9p 1.0 1.0
│ ├─ ldap-6987986dbc-2zr2c 1.0 1.0
│ ├─ ledavio-text-search 1.0 1.0
│ ├─ nginx-ip-2025-5888cccd9-fmnh8 1.0 1.0
│ ├─ nvidia-device-plugin-daemonset-7k6p6 1.0 1.0
│ ├─ proxy-5495d795d5-vx2ld 1.0 1.0
│ ├─ proxy-7f79cc645f-gj2ld 1.0 1.0
│ ├─ system-registry-8877cd57-59pzr 1.0 1.0
│ ├─ user-registry-64bb8ff7cf-5cktq 1.0 1.0
│ ├─ user-scheduler-5cf5ffbc54-gkwvq 1.0 1.0
│ └─ user-scheduler-c7db6c584-4xdvn 1.0 1.0
├─ tiamat (5%) 6.0 (5%) 6.0 110.0 104.0
│ ├─ continuous-image-puller-wtkhd 1.0 1.0
│ ├─ dnsutils-tiamat 1.0 1.0
│ ├─ gpu-feature-discovery-5qb95 1.0 1.0
│ ├─ kube-proxy-nbjvl 1.0 1.0
│ ├─ kube-router-6zps8 1.0 1.0
│ └─ nvidia-device-plugin-daemonset-9b4tb 1.0 1.0
├─ vecna (5%) 5.0 (5%) 5.0 110.0 105.0
│ ├─ dnsutils-vecna 1.0 1.0
│ ├─ felix-petersen-job-20 1.0 1.0
│ ├─ kube-proxy-djtx4 1.0 1.0
│ ├─ kube-router-f67nm 1.0 1.0
│ └─ nvidia-device-plugin-daemonset-qln9t 1.0 1.0
└─ zariel (7%) 8.0 (7%) 8.0 110.0 102.0
  ├─ dnsutils-zariel 1.0 1.0
  ├─ julian-welzel-job-1 1.0 1.0
  ├─ kube-proxy-c8frf 1.0 1.0
  ├─ kube-router-fww75 1.0 1.0
  ├─ mariya-cnn-well-being-classifier-job-7kx7t 1.0 1.0
  ├─ nvidia-device-plugin-daemonset-djsrl 1.0 1.0
  ├─ oliver-wiedemann-pod 1.0 1.0
  └─ yolo2-kc67n 1.0 1.0
Resource usage (by namespace)
Resource Requested Limit Allocatable Free
auth
├─ mindflayer01
│ ├─ cpu 650.0m 0.0
│ └─ pods 5.0 5.0
└─ mindflayer03
  ├─ cpu 250.0m 0.0
  └─ pods 1.0 1.0
frontend 4.0 4.0
├─ lolth 1.0 1.0
│ └─ pods 1.0 1.0
├─ mindflayer01 2.0 2.0
│ └─ pods 2.0 2.0
└─ mindflayer03 1.0 1.0
  └─ pods 1.0 1.0
gatekeeper-system
├─ lolth
│ ├─ cpu 200.0m 2.0
│ ├─ memory 512.0Mi 1.0Gi
│ └─ pods 2.0 2.0
├─ mindflayer01
│ ├─ cpu 100.0m 1.0
│ ├─ memory 256.0Mi 512.0Mi
│ └─ pods 1.0 1.0
└─ mindflayer03
  ├─ cpu 100.0m 1.0
  ├─ memory 256.0Mi 512.0Mi
  └─ pods 1.0 1.0
gcpr-vmv-2022
└─ beholder02
  ├─ cpu 1.0 2.0
  ├─ memory 512.0Mi 1.0Gi
  └─ pods 2.0 2.0
jupyterhub
├─ kiaransalee
│ ├─ memory 6.4G 0.0
│ ├─ nvidia.com/gpu 6.0 6.0
│ └─ pods 7.0 7.0
├─ lolth 3.0 3.0
│ └─ pods 3.0 3.0
└─ mindflayer03 1.0 1.0
  └─ pods 1.0 1.0
jupyterhub-kuckling 3.0 3.0
├─ lolth 1.0 1.0
│ └─ pods 1.0 1.0
├─ mindflayer03 1.0 1.0
│ └─ pods 1.0 1.0
└─ tiamat 1.0 1.0
  └─ pods 1.0 1.0
jupyterhub-students
├─ kiaransalee
│ ├─ memory 2.1G 0.0
│ ├─ nvidia.com/mig-1g.10gb 2.0 2.0
│ └─ pods 3.0 3.0
├─ lolth 2.0 2.0
│ └─ pods 2.0 2.0
└─ mindflayer03 2.0 2.0
  └─ pods 2.0 2.0
kube-system
├─ asmodeus
│ ├─ cpu 350.0m 100.0m
│ ├─ memory 350.0Mi 100.0Mi
│ └─ pods 4.0 4.0
├─ beholder01
│ ├─ cpu 900.0m 100.0m
│ ├─ memory 350.0Mi 100.0Mi
│ └─ pods 6.0 6.0
├─ beholder02
│ ├─ cpu 900.0m 100.0m
│ ├─ memory 350.0Mi 100.0Mi
│ └─ pods 6.0 6.0
├─ beholder03
│ ├─ cpu 900.0m 100.0m
│ ├─ memory 350.0Mi 100.0Mi
│ └─ pods 6.0 6.0
├─ belial
│ ├─ cpu 350.0m 100.0m
│ ├─ memory 350.0Mi 100.0Mi
│ └─ pods 5.0 5.0
├─ demogorgon
│ ├─ cpu 350.0m 100.0m
│ ├─ memory 350.0Mi 100.0Mi
│ └─ pods 4.0 4.0
├─ fierna
│ ├─ cpu 350.0m 100.0m
│ ├─ memory 350.0Mi 100.0Mi
│ └─ pods 5.0 5.0
├─ kiaransalee
│ ├─ cpu 350.0m 100.0m
│ ├─ memory 350.0Mi 100.0Mi
│ └─ pods 5.0 5.0
├─ lolth
│ ├─ cpu 450.0m 100.0m
│ ├─ memory 420.0Mi 270.0Mi
│ └─ pods 5.0 5.0
├─ mindflayer01
│ ├─ cpu 350.0m 100.0m
│ ├─ memory 350.0Mi 100.0Mi
│ └─ pods 4.0 4.0
├─ mindflayer02
│ ├─ cpu 350.0m 100.0m
│ ├─ memory 350.0Mi 100.0Mi
│ └─ pods 4.0 4.0
├─ mindflayer03
│ ├─ cpu 450.0m 100.0m
│ ├─ memory 420.0Mi 270.0Mi
│ └─ pods 5.0 5.0
├─ tiamat
│ ├─ cpu 350.0m 100.0m
│ ├─ memory 350.0Mi 100.0Mi
│ └─ pods 5.0 5.0
├─ vecna
│ ├─ cpu 350.0m 100.0m
│ ├─ memory 350.0Mi 100.0Mi
│ └─ pods 4.0 4.0
└─ zariel
  ├─ cpu 350.0m 100.0m
  ├─ memory 350.0Mi 100.0Mi
  └─ pods 4.0 4.0
local-path-storage 1.0 1.0
└─ mindflayer01 1.0 1.0
  └─ pods 1.0 1.0
monitoring
├─ lolth 1.0 1.0
│ └─ pods 1.0 1.0
└─ mindflayer01
  ├─ cpu 500.0m 1.0
  ├─ memory 500.0M 1.0Gi
  └─ pods 1.0 1.0
registry
└─ mindflayer03
  ├─ cpu 200.0m 0.0
  └─ pods 2.0 2.0
testing 3.0 3.0
├─ lolth 2.0 2.0
│ └─ pods 2.0 2.0
└─ mindflayer03 1.0 1.0
  └─ pods 1.0 1.0
user-adwait-deshpande
├─ mindflayer01
│ ├─ cpu 100.0m 1.0
│ ├─ memory 100.0Mi 1.0Gi
│ └─ pods 1.0 1.0
└─ mindflayer03
  ├─ cpu 100.0m 500.0m
  ├─ memory 100.0Mi 500.0Mi
  └─ pods 1.0 1.0
user-alex-chan
└─ mindflayer02
  ├─ cpu 10.0 16.0
  ├─ memory 64.0Gi 64.0Gi
  └─ pods 1.0 1.0
user-angela-albi
├─ mindflayer02
│ ├─ cpu 10.0 10.0
│ ├─ memory 10.0Gi 10.0Gi
│ └─ pods 1.0 1.0
└─ zariel
  ├─ cpu 10.0 16.0
  ├─ memory 30.0Gi 30.0Gi
  ├─ nvidia.com/gpu 1.0 1.0
  └─ pods 1.0 1.0
user-felix-petersen
└─ vecna
  ├─ cpu 16.0 100.0
  ├─ memory 100.0Gi 1.6Ti
  ├─ nvidia.com/gpu 16.0 16.0
  └─ pods 1.0 1.0
user-jacob-davidson
├─ fierna
│ ├─ cpu 24.0 24.0
│ ├─ memory 64.0Gi 128.0Gi
│ ├─ nvidia.com/gpu 2.0 2.0
│ └─ pods 2.0 2.0
└─ mindflayer01
  ├─ cpu 60.0m 1.0
  ├─ memory 100.0Mi 1.0Gi
  └─ pods 1.0 1.0
user-julian-jandeleit
├─ demogorgon
│ ├─ cpu 48.0 48.0
│ ├─ memory 80.0Gi 80.0Gi
│ ├─ nvidia.com/gpu 1.0 1.0
│ └─ pods 1.0 1.0
└─ lolth
  ├─ cpu 100.0m 1.0
  ├─ memory 100.0Mi 1.0Gi
  └─ pods 1.0 1.0
user-julian-welzel
└─ zariel
  ├─ cpu 96.0 96.0
  ├─ memory 100.0Gi 400.0Gi
  ├─ nvidia.com/gpu 4.0 4.0
  └─ pods 1.0 1.0
user-julian-zimmermann
├─ asmodeus
│ ├─ cpu 250.0 250.0
│ ├─ memory 1.5Ti 1.5Ti
│ ├─ nvidia.com/gpu 4.0 4.0
│ └─ pods 1.0 1.0
└─ demogorgon
  ├─ cpu 16.0 16.0
  ├─ memory 1.0Ti 1.0Ti
  ├─ nvidia.com/gpu 3.0 3.0
  └─ pods 1.0 1.0
user-mariya-tykhonchuk
├─ belial
│ ├─ cpu 9.0 17.0
│ ├─ memory 90.0Gi 130.0Gi
│ ├─ nvidia.com/gpu 3.0 3.0
│ └─ pods 3.0 3.0
└─ zariel
  ├─ cpu 4.0 8.0
  ├─ memory 30.0Gi 60.0Gi
  ├─ nvidia.com/gpu 1.0 1.0
  └─ pods 1.0 1.0
user-maya-dagher
└─ fierna
  ├─ cpu 2.0 2.0
  ├─ memory 20.0Gi 20.0Gi
  ├─ nvidia.com/gpu 2.0 2.0
  └─ pods 2.0 2.0
user-mike-battistella
└─ mindflayer03
  ├─ cpu 1.0 1.0
  ├─ memory 32.0Gi 32.0Gi
  └─ pods 1.0 1.0
user-oliver-wiedemann
└─ zariel
  ├─ cpu 32.0 64.0
  ├─ memory 400.0Gi 400.0Gi
  ├─ nvidia.com/gpu 2.0 2.0
  └─ pods 1.0 1.0
user-segun-aroyehun
└─ belial
  ├─ cpu 50.0 100.0
  ├─ ephemeral-storage 100.0Gi 150.0Gi
  ├─ memory 250.0Gi 400.0Gi
  ├─ nvidia.com/gpu 1.0 1.0
  └─ pods 1.0 1.0
user-ulas-bingoel
└─ demogorgon
  ├─ cpu 24.0 24.0
  ├─ memory 80.0Gi 80.0Gi
  ├─ nvidia.com/gpu 2.0 2.0
  └─ pods 1.0 1.0
user-v-time
├─ mindflayer01
│ ├─ cpu 500.0m 1.0
│ ├─ memory 256.0Mi 512.0Mi
│ └─ pods 1.0 1.0
└─ mindflayer02
  ├─ cpu 1.0 2.0
  ├─ memory 512.0Mi 1.0Gi
  └─ pods 1.0 1.0
web
├─ mindflayer01 6.0 6.0
│ └─ pods 6.0 6.0
├─ mindflayer02
│ ├─ cpu 250.0m 0.0
│ └─ pods 1.0 1.0
└─ mindflayer03 1.0 1.0
  └─ pods 1.0 1.0
Ceph file system status
cluster:
id: 3fee6f38-ba9f-11ec-9328-e188936dcafd
health: HEALTH_OK
services:
mon: 5 daemons, quorum beholder03,beholder01,beholder02,mindflayer02,mindflayer03 (age 4M)
mgr: beholder01.verxwn(active, since 8M), standbys: beholder03.nprqzk, mindflayer03.rzdvrr, mindflayer01.mkuopd, mindflayer02.ympgrs, beholder02.akktmp
mds: 4/4 daemons up, 2 standby
osd: 24 osds: 24 up (since 8M), 24 in (since 15M)
data:
volumes: 1/1 healthy
pools: 3 pools, 545 pgs
objects: 69.20M objects, 97 TiB
usage: 195 TiB used, 86 TiB / 282 TiB avail
pgs: 541 active+clean
3 active+clean+scrubbing+deep
1 active+clean+scrubbing
io:
client: 7.4 MiB/s wr, 0 op/s rd, 15 op/s wr
HEALTH_OK
Etcd cluster status
+------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| 192.168.1.1:4252 | 7126e7a3a9cc42ca | 3.4.30 | 156 MB | false | false | 302053 | 642000778 | 642000778 | |
| 192.168.1.2:4252 | 39d72894bf6c7600 | 3.4.30 | 156 MB | false | false | 302053 | 642000778 | 642000778 | |
| 192.168.1.3:4252 | bbf4a2b99c3fd692 | 3.4.30 | 156 MB | false | false | 302053 | 642000778 | 642000778 | |
| 192.168.2.1:4252 | 5cb9997dd1c2246b | 3.3.25 | 156 MB | true | false | 302053 | 642000778 | 0 | |
| 192.168.2.3:4252 | cbc1cf89959ea4e | 3.3.25 | 156 MB | false | false | 302053 | 642000778 | 0 | |
+------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
Detailed network health report
+-------------------------+---------------------+---------------------+---------------------+----------------+
| CHECK                   | beholder01          | beholder02          | beholder03          | lolth          |
+-------------------------+---------------------+---------------------+---------------------+----------------+
| SSH port open           | yes                 | yes                 | yes                 | yes            |
| Report available        | yes                 | yes                 | yes                 | yes            |
| External interface up   | ok                  | ok                  | ok                  | ok             |
| Infiniband interface up | ok                  | ok                  | ok                  | ok             |
| API servers reachable   | 1 2 3 4             | 1 2 3 4             | 1 2 3 4             | 1 2 3 4        |
| Ceph monitors reachable | 1 2 3               | 1 2 3               | 1 2 3               | 1 2 3          |
| cephfs mounted          | /cephfs             | /cephfs             | /cephfs             | /cephfs        |
| local raid mounted      | /raid               | /raid               | /raid               | /raid          |
| Test pod responding     | dnsutils-beholder01 | dnsutils-beholder02 | dnsutils-beholder03 | dnsutils-lolth |
| Can reach kube-dns      | 10.96.0.10          | 10.96.0.10          | 10.96.0.10          | 10.96.0.10     |
| Pod can reach kube-dns  | yes                 | yes                 | yes                 | yes            |
| Pod can reach internet  | yes                 | yes                 | yes                 | yes            |
+-------------------------+---------------------+---------------------+---------------------+----------------+
NVIDIA Driver and GPU reports
Thu Sep 18 14:32:51 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 565.57.01 Driver Version: 565.57.01 CUDA Version: 12.7 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 Quadro RTX 6000 Off | 00000000:1B:00.0 Off | Off |
| 33% 31C P8 5W / 260W | 1MiB / 24576MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 1 Quadro RTX 6000 Off | 00000000:1C:00.0 Off | Off |
| 33% 31C P8 11W / 260W | 1MiB / 24576MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 2 Quadro RTX 6000 Off | 00000000:1D:00.0 Off | Off |
| 33% 33C P8 7W / 260W | 1MiB / 24576MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 3 Quadro RTX 6000 Off | 00000000:1E:00.0 Off | Off |
| 42% 66C P0 201W / 260W | 7122MiB / 24576MiB | 54% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 4 Quadro RTX 6000 Off | 00000000:3D:00.0 Off | Off |
| 33% 31C P8 12W / 260W | 1MiB / 24576MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 5 Quadro RTX 6000 Off | 00000000:3F:00.0 Off | Off |
| 33% 31C P8 6W / 260W | 1MiB / 24576MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 6 Quadro RTX 6000 Off | 00000000:40:00.0 Off | Off |
| 33% 31C P8 4W / 260W | 1MiB / 24576MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 7 Quadro RTX 6000 Off | 00000000:41:00.0 Off | Off |
| 33% 32C P8 4W / 260W | 1MiB / 24576MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 3 N/A N/A 2389981 C python 7118MiB |
+-----------------------------------------------------------------------------------------+
Thu Sep 18 14:32:53 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 565.57.01 Driver Version: 565.57.01 CUDA Version: 12.7 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 Quadro RTX 6000 Off | 00000000:1B:00.0 Off | Off |
| 33% 48C P0 101W / 260W | 13073MiB / 24576MiB | 49% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 1 Quadro RTX 6000 Off | 00000000:1C:00.0 Off | Off |
| 33% 43C P0 125W / 260W | 13069MiB / 24576MiB | 98% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 2 Quadro RTX 6000 Off | 00000000:1D:00.0 Off | Off |
| 33% 33C P8 4W / 260W | 1MiB / 24576MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 3 Quadro RTX 6000 Off | 00000000:1E:00.0 Off | Off |
| 33% 31C P8 5W / 260W | 1MiB / 24576MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 4 Quadro RTX 6000 Off | 00000000:3D:00.0 Off | Off |
| 33% 30C P8 15W / 260W | 1MiB / 24576MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 5 Quadro RTX 6000 Off | 00000000:3F:00.0 Off | Off |
| 33% 31C P8 4W / 260W | 1MiB / 24576MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 6 Quadro RTX 6000 Off | 00000000:40:00.0 Off | Off |
| 33% 31C P8 12W / 260W | 1MiB / 24576MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 7 Quadro RTX 6000 Off | 00000000:41:00.0 Off | Off |
| 33% 31C P8 16W / 260W | 1MiB / 24576MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 2862971 C python 4358MiB |
| 0 N/A N/A 2862972 C python 4356MiB |
| 0 N/A N/A 2862973 C python 4356MiB |
| 1 N/A N/A 2862768 C python 4356MiB |
| 1 N/A N/A 2862769 C python 4356MiB |
| 1 N/A N/A 2862770 C python 4354MiB |
+-----------------------------------------------------------------------------------------+
Thu Sep 18 14:32:55 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 565.57.01 Driver Version: 565.57.01 CUDA Version: 12.7 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA A100-SXM4-40GB Off | 00000000:01:00.0 Off | 0 |
| N/A 27C P0 50W / 400W | 1MiB / 40960MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 1 NVIDIA A100-SXM4-40GB Off | 00000000:41:00.0 Off | 0 |
| N/A 27C P0 52W / 400W | 1MiB / 40960MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 2 NVIDIA A100-SXM4-40GB Off | 00000000:81:00.0 Off | 0 |
| N/A 26C P0 49W / 400W | 1MiB / 40960MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 3 NVIDIA A100-SXM4-40GB Off | 00000000:C1:00.0 Off | 0 |
| N/A 25C P0 50W / 400W | 1MiB / 40960MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| No running processes found |
+-----------------------------------------------------------------------------------------+
Thu Sep 18 16:32:58 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 565.57.01 Driver Version: 565.57.01 CUDA Version: 12.7 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 Tesla V100-SXM3-32GB On | 00000000:34:00.0 Off | 0 |
| N/A 33C P0 48W / 350W | 1MiB / 32768MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 1 Tesla V100-SXM3-32GB On | 00000000:36:00.0 Off | 0 |
| N/A 33C P0 47W / 350W | 1MiB / 32768MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 2 Tesla V100-SXM3-32GB On | 00000000:39:00.0 Off | 0 |
| N/A 35C P0 53W / 350W | 1MiB / 32768MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 3 Tesla V100-SXM3-32GB On | 00000000:3B:00.0 Off | 0 |
| N/A 36C P0 51W / 350W | 1MiB / 32768MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 4 Tesla V100-SXM3-32GB On | 00000000:57:00.0 Off | 0 |
| N/A 33C P0 48W / 350W | 1MiB / 32768MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 5 Tesla V100-SXM3-32GB On | 00000000:59:00.0 Off | 0 |
| N/A 36C P0 53W / 350W | 1MiB / 32768MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 6 Tesla V100-SXM3-32GB On | 00000000:5C:00.0 Off | 0 |
| N/A 34C P0 51W / 350W | 1MiB / 32768MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 7 Tesla V100-SXM3-32GB On | 00000000:5E:00.0 Off | 0 |
| N/A 38C P0 53W / 350W | 1MiB / 32768MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 8 Tesla V100-SXM3-32GB On | 00000000:B7:00.0 Off | 0 |
| N/A 34C P0 50W / 350W | 1MiB / 32768MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 9 Tesla V100-SXM3-32GB On | 00000000:B9:00.0 Off | 0 |
| N/A 33C P0 49W / 350W | 1MiB / 32768MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 10 Tesla V100-SXM3-32GB On | 00000000:BC:00.0 Off | 0 |
| N/A 36C P0 51W / 350W | 1MiB / 32768MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 11 Tesla V100-SXM3-32GB On | 00000000:BE:00.0 Off | 0 |
| N/A 38C P0 48W / 350W | 1MiB / 32768MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 12 Tesla V100-SXM3-32GB On | 00000000:E0:00.0 Off | 0 |
| N/A 36C P0 48W / 350W | 1MiB / 32768MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 13 Tesla V100-SXM3-32GB On | 00000000:E2:00.0 Off | 0 |
| N/A 36C P0 49W / 350W | 1MiB / 32768MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 14 Tesla V100-SXM3-32GB On | 00000000:E5:00.0 Off | 0 |
| N/A 39C P0 51W / 350W | 1MiB / 32768MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 15 Tesla V100-SXM3-32GB On | 00000000:E7:00.0 Off | 0 |
| N/A 38C P0 49W / 350W | 1MiB / 32768MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| No running processes found |
+-----------------------------------------------------------------------------------------+
Thu Sep 18 14:33:00 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.86.15 Driver Version: 570.86.15 CUDA Version: 12.8 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA A100-SXM4-80GB On | 00000000:01:00.0 Off | 0 |
| N/A 27C P0 62W / 500W | 1MiB / 81920MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 1 NVIDIA A100-SXM4-80GB On | 00000000:41:00.0 Off | 0 |
| N/A 29C P0 58W / 500W | 1MiB / 81920MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 2 NVIDIA A100-SXM4-80GB On | 00000000:81:00.0 Off | 0 |
| N/A 27C P0 60W / 500W | 1MiB / 81920MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 3 NVIDIA A100-SXM4-80GB On | 00000000:C1:00.0 Off | 0 |
| N/A 26C P0 60W / 500W | 1MiB / 81920MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| No running processes found |
+-----------------------------------------------------------------------------------------+
Thu Sep 18 16:33:02 2025
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.216.03 Driver Version: 535.216.03 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA A100-SXM4-40GB On | 00000000:07:00.0 Off | 0 |
| N/A 31C P0 55W / 400W | 3MiB / 40960MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+----------------------+----------------------+
| 1 NVIDIA A100-SXM4-40GB On | 00000000:0F:00.0 Off | 0 |
| N/A 30C P0 52W / 400W | 3MiB / 40960MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+----------------------+----------------------+
| 2 NVIDIA A100-SXM4-40GB On | 00000000:47:00.0 Off | 0 |
| N/A 28C P0 51W / 400W | 0MiB / 40960MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+----------------------+----------------------+
| 3 NVIDIA A100-SXM4-40GB On | 00000000:4E:00.0 Off | 0 |
| N/A 28C P0 52W / 400W | 0MiB / 40960MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+----------------------+----------------------+
| 4 NVIDIA A100-SXM4-40GB On | 00000000:87:00.0 Off | 0 |
| N/A 35C P0 57W / 400W | 0MiB / 40960MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+----------------------+----------------------+
| 5 NVIDIA A100-SXM4-40GB On | 00000000:90:00.0 Off | 0 |
| N/A 32C P0 51W / 400W | 0MiB / 40960MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+----------------------+----------------------+
| 6 NVIDIA A100-SXM4-40GB On | 00000000:B7:00.0 Off | 0 |
| N/A 32C P0 51W / 400W | 0MiB / 40960MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+----------------------+----------------------+
| 7 NVIDIA A100-SXM4-40GB On | 00000000:BD:00.0 Off | 0 |
| N/A 48C P0 86W / 400W | 11745MiB / 40960MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| 7 N/A N/A 880308 C python 11736MiB |
+---------------------------------------------------------------------------------------+
Thu Sep 18 14:33:06 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 565.57.01 Driver Version: 565.57.01 CUDA Version: 12.7 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA A40 On | 00000000:01:00.0 Off | 0 |
| 0% 43C P0 91W / 300W | 3417MiB / 46068MiB | 19% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 1 NVIDIA A40 On | 00000000:25:00.0 Off | 0 |
| 0% 29C P8 24W / 300W | 4MiB / 46068MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 2 NVIDIA A40 On | 00000000:41:00.0 Off | 0 |
| 0% 30C P8 31W / 300W | 1MiB / 46068MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 3 NVIDIA A40 On | 00000000:61:00.0 Off | 0 |
| 0% 28C P8 24W / 300W | 1MiB / 46068MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 4 NVIDIA A40 On | 00000000:81:00.0 Off | 0 |
| 0% 30C P8 24W / 300W | 1MiB / 46068MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 5 NVIDIA A40 On | 00000000:A1:00.0 Off | 0 |
| 0% 29C P8 24W / 300W | 1MiB / 46068MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 6 NVIDIA A40 On | 00000000:C1:00.0 Off | 0 |
| 0% 31C P8 24W / 300W | 1MiB / 46068MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
| 7 NVIDIA A40 On | 00000000:E1:00.0 Off | 0 |
| 0% 31C P8 24W / 300W | 1MiB / 46068MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 3982942 C python 3408MiB |
+-----------------------------------------------------------------------------------------+
Thu Sep 18 14:33:09 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.127.08 Driver Version: 550.127.08 CUDA Version: 12.4 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA H100 80GB HBM3 On | 00000000:26:00.0 Off | 0 |
| N/A 60C P0 635W / 700W | 66514MiB / 81559MiB | 86% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 1 NVIDIA H100 80GB HBM3 On | 00000000:2F:00.0 Off | 0 |
| N/A 50C P0 282W / 700W | 22237MiB / 81559MiB | 100% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 2 NVIDIA H100 80GB HBM3 On | 00000000:46:00.0 Off | 0 |
| N/A 56C P0 401W / 700W | 29810MiB / 81559MiB | 83% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 3 NVIDIA H100 80GB HBM3 On | 00000000:54:00.0 Off | 0 |
| N/A 65C P0 655W / 700W | 65570MiB / 81559MiB | 91% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 4 NVIDIA H100 80GB HBM3 On | 00000000:A6:00.0 Off | 0 |
| N/A 68C P0 351W / 700W | 10504MiB / 81559MiB | 83% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 5 NVIDIA H100 80GB HBM3 On | 00000000:AF:00.0 Off | 0 |
| N/A 30C P0 75W / 700W | 1MiB / 81559MiB | 0% Default |
| | | Disabled |
+-----------------------------------------+------------------------+----------------------+
| 6 NVIDIA H100 80GB HBM3 On | 00000000:C6:00.0 Off | On |
| N/A 31C P0 76W / 700W | 89MiB / 81559MiB | N/A Default |
| | | Enabled |
+-----------------------------------------+------------------------+----------------------+
| 7 NVIDIA H100 80GB HBM3 On | 00000000:CF:00.0 Off | On |
| N/A 29C P0 122W / 700W | 1138MiB / 81559MiB | N/A Default |
| | | Enabled |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| MIG devices: |
+------------------+----------------------------------+-----------+-----------------------+
| GPU GI CI MIG | Memory-Usage | Vol| Shared |
| ID ID Dev | BAR1-Usage | SM Unc| CE ENC DEC OFA JPG |
| | | ECC| |
|==================+==================================+===========+=======================|
| 6 1 0 0 | 51MiB / 40320MiB | 64 0 | 4 0 4 0 4 |
| | 0MiB / 65535MiB | | |
+------------------+----------------------------------+-----------+-----------------------+
| 6 2 0 1 | 38MiB / 40320MiB | 60 0 | 3 0 3 0 3 |
| | 0MiB / 65535MiB | | |
+------------------+----------------------------------+-----------+-----------------------+
| 7 7 0 0 | 13MiB / 9984MiB | 16 0 | 1 0 1 0 1 |
| | 0MiB / 16383MiB | | |
+------------------+----------------------------------+-----------+-----------------------+
| 7 8 0 1 | 13MiB / 9984MiB | 16 0 | 1 0 1 0 1 |
| | 0MiB / 16383MiB | | |
+------------------+----------------------------------+-----------+-----------------------+
| 7 9 0 2 | 13MiB / 9984MiB | 16 0 | 1 0 1 0 1 |
| | 0MiB / 16383MiB | | |
+------------------+----------------------------------+-----------+-----------------------+
| 7 11 0 3 | 13MiB / 9984MiB | 16 0 | 1 0 1 0 1 |
| | 0MiB / 16383MiB | | |
+------------------+----------------------------------+-----------+-----------------------+
| 7 12 0 4 | 13MiB / 9984MiB | 16 0 | 1 0 1 0 1 |
| | 0MiB / 16383MiB | | |
+------------------+----------------------------------+-----------+-----------------------+
| 7 13 0 5 | 13MiB / 9984MiB | 16 0 | 1 0 1 0 1 |
| | 0MiB / 16383MiB | | |
+------------------+----------------------------------+-----------+-----------------------+
| 7 14 0 6 | 1061MiB / 9984MiB | 16 0 | 1 0 1 0 1 |
| | 6MiB / 16383MiB | | |
+------------------+----------------------------------+-----------+-----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 983592 C VLLM::EngineCore 66504MiB |
| 1 N/A N/A 3833152 C python 5846MiB |
| 1 N/A N/A 3836635 C python 16376MiB |
| 2 N/A N/A 816691 C python 29800MiB |
| 3 N/A N/A 913477 C /opt/conda/bin/python 65560MiB |
| 4 N/A N/A 975701 C ...f-mars-experiment/ollama/bin/ollama 10494MiB |
| 7 14 0 888634 C ...nvironments/smda-project/bin/python 398MiB |
| 7 14 0 952670 C ...nvironments/smda-project/bin/python 384MiB |
| 7 14 0 979314 C ...nvironments/smda-project/bin/python 248MiB |
+-----------------------------------------------------------------------------------------+