Cluster status Wed, 04 Feb 2026 02:32:01 +0000 report from beholder01

Resource usage (overall)
Resource usage (by namespace)
Ceph file system status
Etcd cluster status
Detailed network health report
NVIDIA Driver and GPU reports
  Imp
  Dretch
  Belial
  Fierna
  Tiamat
  Vecna
  Asmodeus
  Zariel
  Demogorgon

Resource usage (overall)

 Resource                                                   Requested          Limit  Allocatable     Free 
  cpu                                                     (61%) 963.2     (65%) 1.0k         1.6k    549.3 
  ├─ asmodeus                                             (52%) 132.3    (52%) 132.1        256.0    123.7 
  │  ├─ asmodeus-single-gpu-julian-schaefer-zimmermann           64.0           64.0                       
  │  ├─ dnsutils-asmodeus                                      100.0m         100.0m                       
  │  ├─ kube-router-5cjkf                                      250.0m            0.0                       
  │  ├─ step16-coolchic-gpu-run-asmodeus-a100-hq                 20.0           20.0                       
  │  ├─ step18-ftic-gpu-run-asmodeus-a100-hq-40cpu               40.0           40.0                       
  │  └─ ulas-bingoel-model-pod-5                                  8.0            8.0                       
  ├─ beholder01                                              (7%) 1.7    (0%) 100.0m         24.0     22.3 
  │  ├─ coredns-66bc5c9577-kjst9                               100.0m            0.0                       
  │  ├─ dex-69dddb47b7-pk5ks                                   250.0m            0.0                       
  │  ├─ dex-loginapp-5f5974b54d-6dskk                          100.0m            0.0                       
  │  ├─ dex-mysql-589f4586bc-94f8p                             100.0m            0.0                       
  │  ├─ dnsutils-beholder01                                    100.0m         100.0m                       
  │  ├─ kube-apiserver-beholder01                              250.0m            0.0                       
  │  ├─ kube-controller-manager-beholder01                     200.0m            0.0                       
  │  ├─ kube-router-4zkq2                                      250.0m            0.0                       
  │  ├─ kube-scheduler-beholder01                              100.0m            0.0                       
  │  └─ mediawiki-mariadb-7ffb6c9b8d-g87lk                     250.0m            0.0                       
  ├─ beholder02                                              (5%) 1.1       (5%) 1.1         24.0     22.9 
  │  ├─ dnsutils-beholder02                                    100.0m         100.0m                       
  │  ├─ kube-apiserver-beholder02                              250.0m            0.0                       
  │  ├─ kube-controller-manager-beholder02                     200.0m            0.0                       
  │  ├─ kube-router-6mwfx                                      250.0m            0.0                       
  │  ├─ kube-scheduler-beholder02                              100.0m            0.0                       
  │  └─ nginx-rsn-2024-57d49484d-47rfc                         250.0m            1.0                       
  ├─ beholder03                                              (5%) 1.2    (2%) 400.0m         24.0     22.8 
  │  ├─ dnsutils-beholder03                                    100.0m         100.0m                       
  │  ├─ kube-apiserver-beholder03                              250.0m            0.0                       
  │  ├─ kube-controller-manager-beholder03                     200.0m            0.0                       
  │  ├─ kube-router-jnkdl                                      250.0m            0.0                       
  │  ├─ kube-scheduler-beholder03                              100.0m            0.0                       
  │  ├─ ldap-67b47cf9b9-v6vvm                                  250.0m            0.0                       
  │  └─ nfd-master-6589cf6d4c-9xw6v                            100.0m         300.0m                       
  ├─ belial                                                (90%) 72.1     (91%) 73.1         80.0      6.9 
  │  ├─ cool-pod                                                  2.0            2.0                       
  │  ├─ coredns-66bc5c9577-kxlwp                               100.0m            0.0                       
  │  ├─ dnsutils-belial                                        100.0m         100.0m                       
  │  ├─ gatekeeper-controller-manager-66f474f785-bq2vs         100.0m            1.0                       
  │  ├─ kube-router-2hmkx                                      250.0m            0.0                       
  │  ├─ nfd-gc-7b6c64c4b8-gc2k5                                 10.0m          20.0m                       
  │  ├─ ollama-pod                                                1.0            1.0                       
  │  ├─ prometheus-deployment-b65d5d898-q55j8                  500.0m            1.0                       
  │  ├─ pytorch-pod                                               1.0            1.0                       
  │  ├─ till-aust-baseline-arrowhead-job-pv7w7                   32.0           32.0                       
  │  ├─ till-aust-baseline-car-job-x7tlg                         32.0           32.0                       
  │  ├─ till-aust-ubuntu-entry-pod                                1.0            1.0                       
  │  └─ valentin-schmuker-storage                                 2.0            2.0                       
  ├─ demogorgon                                            (84%) 80.3   (116%) 111.1         96.0      0.0 
  │  ├─ a2v2-single-gpu-jcsz                                      8.0            8.0                       
  │  ├─ demogorgon-a2v2                                          15.0           15.0                       
  │  ├─ dnsutils-demogorgon                                    100.0m         100.0m                       
  │  ├─ gpu-demogorgon                                           48.0           48.0                       
  │  ├─ kube-router-v72jn                                      250.0m            0.0                       
  │  ├─ pycharm                                                   1.0           32.0                       
  │  └─ ulas-bingoel-model-pod-3                                  8.0            8.0                       
  ├─ fierna                                                (82%) 65.5     (84%) 67.1         80.0     12.9 
  │  ├─ dnsutils-fierna                                        100.0m         100.0m                       
  │  ├─ gatekeeper-controller-manager-66f474f785-dvhjf         100.0m            1.0                       
  │  ├─ kube-router-6p89p                                      250.0m            0.0                       
  │  ├─ ledavio-text-search-5d8f755795-cndsv                      1.0            2.0                       
  │  ├─ till-aust-baseline-adiac-job-66j56                       32.0           32.0                       
  │  └─ till-aust-baseline-beef-job-ts4jw                        32.0           32.0                       
  ├─ kiaransalee                                           (26%) 49.4     (39%) 74.1        192.0    117.9 
  │  ├─ bash-pod                                                 12.0           12.0                       
  │  ├─ dnsutils-kiaransalee                                   100.0m         100.0m                       
  │  ├─ kube-router-s5qjv                                      250.0m            0.0                       
  │  ├─ tgillm-pod-7cd979b759-wgxjg                              12.0           12.0                       
  │  └─ ubuntu-gpu1                                              25.0           50.0                       
  ├─ mindflayer01                                            (3%) 1.9       (6%) 4.1         64.0     59.9 
  │  ├─ dnsutils-mindflayer01                                  100.0m         100.0m                       
  │  ├─ gatekeeper-controller-manager-66f474f785-s8b5w         100.0m            1.0                       
  │  ├─ kube-router-w8xhq                                      250.0m            0.0                       
  │  ├─ mariadb-849498cb44-kjj5q                               500.0m            1.0                       
  │  └─ phpfpm-nginx-5db96d6895-vwwf5                             1.0            2.0                       
  ├─ mindflayer02                                         (1%) 900.0m       (3%) 2.1         64.0     61.9 
  │  ├─ dnsutils-mindflayer02                                  100.0m         100.0m                       
  │  ├─ gatekeeper-audit-59d4b6fd4c-gtwjm                      100.0m            1.0                       
  │  ├─ kube-router-sv7f6                                      250.0m            0.0                       
  │  ├─ mediawiki-77f9c84df5-p6k9g                             250.0m            0.0                       
  │  ├─ registry-auth-bd4bb7d8b-cdjnd                          100.0m            0.0                       
  │  └─ ubuntu-test-pod                                        100.0m            1.0                       
  ├─ mindflayer03                                         (1%) 650.0m    (0%) 100.0m         64.0     63.4 
  │  ├─ dnsutils-mindflayer03                                  100.0m         100.0m                       
  │  ├─ kube-router-rzv5h                                      250.0m            0.0                       
  │  ├─ registry-7495ddbf59-vvf2x                              100.0m            0.0                       
  │  └─ registry-browser-7f4cbdf96b-5x2mv                      200.0m            0.0                       
  ├─ tiamat                                              (100%) 255.3   (100%) 255.1        256.0   650.0m 
  │  ├─ a2v-pt-ft-hyena-large-julian-zimmermann                 255.0          255.0                       
  │  ├─ dnsutils-tiamat                                        100.0m         100.0m                       
  │  └─ kube-router-g92j5                                      250.0m            0.0                       
  ├─ vecna                                                 (52%) 50.4     (58%) 56.1         96.0     39.9 
  │  ├─ dnsutils-vecna                                         100.0m         100.0m                       
  │  ├─ kube-router-4gqqj                                      250.0m            0.0                       
  │  ├─ train-maap-qln5k                                         10.0           16.0                       
  │  ├─ ulas-bingoel-model-pod-4                                  8.0            8.0                       
  │  ├─ valentin-schmuker-mvp-seg-full                           16.0           16.0                       
  │  └─ valentin-schmuker-mvp-seg-image                          16.0           16.0                       
  └─ zariel                                               (98%) 250.3    (98%) 250.1        256.0      5.7 
     ├─ dnsutils-zariel                                        100.0m         100.0m                       
     ├─ kube-router-fmt6c                                      250.0m            0.0                       
     └─ zariel-a2v2-pt-ft-jcsz                                  250.0          250.0                       
  ephemeral-storage                                       (1%) 55.0Gi    (1%) 95.0Gi        11.3T    11.2T 
  ├─ asmodeus                                                (0%) 0.0       (0%) 0.0        94.6G    94.6G 
  ├─ beholder01                                              (0%) 0.0       (0%) 0.0         1.7T     1.7T 
  ├─ beholder02                                              (0%) 0.0       (0%) 0.0         1.7T     1.7T 
  ├─ beholder03                                              (0%) 0.0       (0%) 0.0         1.7T     1.7T 
  ├─ belial                                                  (0%) 0.0       (0%) 0.0       189.2G   189.2G 
  ├─ demogorgon                                           (2%) 15.0Gi    (5%) 30.0Gi       706.7G   674.5G 
  │  └─ pycharm                                                15.0Gi         30.0Gi                       
  ├─ fierna                                                  (0%) 0.0       (0%) 0.0       189.2G   189.2G 
  ├─ kiaransalee                                          (1%) 20.0Gi    (3%) 40.0Gi         1.7T     1.7T 
  │  └─ ubuntu-gpu1                                            20.0Gi         40.0Gi                       
  ├─ mindflayer01                                            (0%) 0.0       (0%) 0.0       211.5G   211.5G 
  ├─ mindflayer02                                            (0%) 0.0       (0%) 0.0       211.5G   211.5G 
  ├─ mindflayer03                                            (0%) 0.0       (0%) 0.0       211.5G   211.5G 
  ├─ tiamat                                                  (0%) 0.0       (0%) 0.0       164.4G   164.4G 
  ├─ vecna                                                (3%) 20.0Gi    (3%) 25.0Gi       849.0G   822.1G 
  │  └─ train-maap-qln5k                                       20.0Gi         25.0Gi                       
  └─ zariel                                                  (0%) 0.0       (0%) 0.0         1.7T     1.7T 
  memory                                                   (41%) 5.7T    (43%) 5.4Ti       12.7Ti    7.3Ti 
  ├─ asmodeus                                           (38%) 752.3Gi  (38%) 752.1Gi        2.0Ti    1.2Ti 
  │  ├─ asmodeus-single-gpu-julian-schaefer-zimmermann        512.0Gi        512.0Gi                       
  │  ├─ dnsutils-asmodeus                                     100.0Mi        100.0Mi                       
  │  ├─ kube-router-5cjkf                                     250.0Mi            0.0                       
  │  ├─ step16-coolchic-gpu-run-asmodeus-a100-hq              100.0Gi        100.0Gi                       
  │  ├─ step18-ftic-gpu-run-asmodeus-a100-hq-40cpu            100.0Gi        100.0Gi                       
  │  └─ ulas-bingoel-model-pod-5                               40.0Gi         40.0Gi                       
  ├─ beholder01                                          (0%) 420.0Mi   (0%) 270.0Mi       92.9Gi   92.5Gi 
  │  ├─ coredns-66bc5c9577-kjst9                               70.0Mi        170.0Mi                       
  │  ├─ dnsutils-beholder01                                   100.0Mi        100.0Mi                       
  │  └─ kube-router-4zkq2                                     250.0Mi            0.0                       
  ├─ beholder02                                          (0%) 350.0Mi   (0%) 100.0Mi       92.9Gi   92.5Gi 
  │  ├─ dnsutils-beholder02                                   100.0Mi        100.0Mi                       
  │  └─ kube-router-6mwfx                                     250.0Mi            0.0                       
  ├─ beholder03                                          (1%) 478.0Mi     (4%) 4.1Gi       92.9Gi   88.8Gi 
  │  ├─ dnsutils-beholder03                                   100.0Mi        100.0Mi                       
  │  ├─ kube-router-jnkdl                                     250.0Mi            0.0                       
  │  └─ nfd-master-6589cf6d4c-9xw6v                           128.0Mi          4.0Gi                       
  ├─ belial                                              (38%) 310.6G  (44%) 330.8Gi      754.4Gi  423.7Gi 
  │  ├─ cool-pod                                               32.0Gi         32.0Gi                       
  │  ├─ coredns-66bc5c9577-kxlwp                               70.0Mi        170.0Mi                       
  │  ├─ dnsutils-belial                                       100.0Mi        100.0Mi                       
  │  ├─ gatekeeper-controller-manager-66f474f785-bq2vs        256.0Mi        512.0Mi                       
  │  ├─ kube-router-2hmkx                                     250.0Mi            0.0                       
  │  ├─ nfd-gc-7b6c64c4b8-gc2k5                               128.0Mi          1.0Gi                       
  │  ├─ ollama-pod                                             32.0Gi         32.0Gi                       
  │  ├─ prometheus-deployment-b65d5d898-q55j8                  500.0M          1.0Gi                       
  │  ├─ pytorch-pod                                            12.0Gi         12.0Gi                       
  │  ├─ till-aust-baseline-arrowhead-job-pv7w7                100.0Gi        120.0Gi                       
  │  ├─ till-aust-baseline-car-job-x7tlg                      100.0Gi        120.0Gi                       
  │  ├─ till-aust-ubuntu-entry-pod                             10.0Gi         10.0Gi                       
  │  └─ valentin-schmuker-storage                               2.0Gi          2.0Gi                       
  ├─ demogorgon                                         (38%) 771.3Gi  (40%) 793.1Gi        2.0Ti    1.2Ti 
  │  ├─ a2v2-single-gpu-jcsz                                  256.0Gi        256.0Gi                       
  │  ├─ demogorgon-a2v2                                       395.0Gi        395.0Gi                       
  │  ├─ dnsutils-demogorgon                                   100.0Mi        100.0Mi                       
  │  ├─ gpu-demogorgon                                         80.0Gi         80.0Gi                       
  │  ├─ kube-router-v72jn                                     250.0Mi            0.0                       
  │  ├─ pycharm                                                10.0Gi         32.0Gi                       
  │  └─ ulas-bingoel-model-pod-3                               30.0Gi         30.0Gi                       
  ├─ fierna                                             (31%) 232.6Gi  (36%) 272.6Gi      754.4Gi  481.8Gi 
  │  ├─ dnsutils-fierna                                       100.0Mi        100.0Mi                       
  │  ├─ gatekeeper-controller-manager-66f474f785-dvhjf        256.0Mi        512.0Mi                       
  │  ├─ kube-router-6p89p                                     250.0Mi            0.0                       
  │  ├─ ledavio-text-search-5d8f755795-cndsv                   32.0Gi         32.0Gi                       
  │  ├─ till-aust-baseline-adiac-job-66j56                    100.0Gi        120.0Gi                       
  │  └─ till-aust-baseline-beef-job-ts4jw                     100.0Gi        120.0Gi                       
  ├─ kiaransalee                                         (13%) 214.0G  (16%) 246.1Gi        1.5Ti    1.2Ti 
  │  ├─ bash-pod                                               16.0Gi         16.0Gi                       
  │  ├─ dnsutils-kiaransalee                                  100.0Mi        100.0Mi                       
  │  ├─ jupyter-leo-2dtiedemann                                  1.1G            0.0                       
  │  ├─ jupyter-matheus-2dstefanini-2dmariano                    1.1G            0.0                       
  │  ├─ jupyter-saroyehun                                        1.1G            0.0                       
  │  ├─ kube-router-s5qjv                                     250.0Mi            0.0                       
  │  ├─ tgillm-pod-7cd979b759-wgxjg                            80.0Gi         80.0Gi                       
  │  └─ ubuntu-gpu1                                           100.0Gi        150.0Gi                       
  ├─ mindflayer01                                          (0%) 1.3Gi     (1%) 2.1Gi      376.5Gi  374.4Gi 
  │  ├─ dnsutils-mindflayer01                                 100.0Mi        100.0Mi                       
  │  ├─ gatekeeper-controller-manager-66f474f785-s8b5w        256.0Mi        512.0Mi                       
  │  ├─ kube-router-w8xhq                                     250.0Mi            0.0                       
  │  ├─ mariadb-849498cb44-kjj5q                              256.0Mi        512.0Mi                       
  │  └─ phpfpm-nginx-5db96d6895-vwwf5                         512.0Mi          1.0Gi                       
  ├─ mindflayer02                                        (0%) 706.0Mi     (0%) 1.6Gi      376.5Gi  374.9Gi 
  │  ├─ dnsutils-mindflayer02                                 100.0Mi        100.0Mi                       
  │  ├─ gatekeeper-audit-59d4b6fd4c-gtwjm                     256.0Mi        512.0Mi                       
  │  ├─ kube-router-sv7f6                                     250.0Mi            0.0                       
  │  └─ ubuntu-test-pod                                       100.0Mi          1.0Gi                       
  ├─ mindflayer03                                        (0%) 350.0Mi   (0%) 100.0Mi      376.5Gi  376.1Gi 
  │  ├─ dnsutils-mindflayer03                                 100.0Mi        100.0Mi                       
  │  └─ kube-router-rzv5h                                     250.0Mi            0.0                       
  ├─ tiamat                                             (89%) 896.3Gi  (89%) 896.1Gi     1007.6Gi  111.2Gi 
  │  ├─ a2v-pt-ft-hyena-large-julian-zimmermann               896.0Gi        896.0Gi                       
  │  ├─ dnsutils-tiamat                                       100.0Mi        100.0Mi                       
  │  └─ kube-router-g92j5                                     250.0Mi            0.0                       
  ├─ vecna                                              (15%) 222.3Gi  (21%) 318.1Gi        1.5Ti    1.2Ti 
  │  ├─ dnsutils-vecna                                        100.0Mi        100.0Mi                       
  │  ├─ kube-router-4gqqj                                     250.0Mi            0.0                       
  │  ├─ train-maap-qln5k                                       64.0Gi        160.0Gi                       
  │  ├─ ulas-bingoel-model-pod-4                               30.0Gi         30.0Gi                       
  │  ├─ valentin-schmuker-mvp-seg-full                         64.0Gi         64.0Gi                       
  │  └─ valentin-schmuker-mvp-seg-image                        64.0Gi         64.0Gi                       
  └─ zariel                                               (95%) 1.9Ti    (95%) 1.9Ti        2.0Ti   95.2Gi 
     ├─ dnsutils-zariel                                       100.0Mi        100.0Mi                       
     ├─ kube-router-fmt6c                                     250.0Mi            0.0                       
     └─ zariel-a2v2-pt-ft-jcsz                                  1.9Ti          1.9Ti                       
  nvidia.com/gpu                                           (74%) 46.0     (74%) 46.0         62.0     16.0 
  ├─ asmodeus                                              (100%) 4.0     (100%) 4.0          4.0      0.0 
  │  ├─ asmodeus-single-gpu-julian-schaefer-zimmermann            1.0            1.0                       
  │  ├─ step16-coolchic-gpu-run-asmodeus-a100-hq                  1.0            1.0                       
  │  ├─ step18-ftic-gpu-run-asmodeus-a100-hq-40cpu                1.0            1.0                       
  │  └─ ulas-bingoel-model-pod-5                                  1.0            1.0                       
  ├─ belial                                                 (62%) 5.0      (62%) 5.0          8.0      3.0 
  │  ├─ ollama-pod                                                1.0            1.0                       
  │  ├─ pytorch-pod                                               2.0            2.0                       
  │  ├─ till-aust-baseline-arrowhead-job-pv7w7                    1.0            1.0                       
  │  └─ till-aust-baseline-car-job-x7tlg                          1.0            1.0                       
  ├─ demogorgon                                            (100%) 8.0     (100%) 8.0          8.0      0.0 
  │  ├─ a2v2-single-gpu-jcsz                                      1.0            1.0                       
  │  ├─ demogorgon-a2v2                                           4.0            4.0                       
  │  ├─ gpu-demogorgon                                            1.0            1.0                       
  │  ├─ pycharm                                                   1.0            1.0                       
  │  └─ ulas-bingoel-model-pod-3                                  1.0            1.0                       
  ├─ fierna                                                 (38%) 3.0      (38%) 3.0          8.0      5.0 
  │  ├─ ledavio-text-search-5d8f755795-cndsv                      1.0            1.0                       
  │  ├─ till-aust-baseline-adiac-job-66j56                        1.0            1.0                       
  │  └─ till-aust-baseline-beef-job-ts4jw                         1.0            1.0                       
  ├─ kiaransalee                                            (67%) 4.0      (67%) 4.0          6.0      2.0 
  │  ├─ jupyter-leo-2dtiedemann                                   1.0            1.0                       
  │  ├─ jupyter-saroyehun                                         1.0            1.0                       
  │  ├─ tgillm-pod-7cd979b759-wgxjg                               1.0            1.0                       
  │  └─ ubuntu-gpu1                                               1.0            1.0                       
  ├─ tiamat                                                (100%) 4.0     (100%) 4.0          4.0      0.0 
  │  └─ a2v-pt-ft-hyena-large-julian-zimmermann                   4.0            4.0                       
  ├─ vecna                                                 (62%) 10.0     (62%) 10.0         16.0      6.0 
  │  ├─ train-maap-qln5k                                          1.0            1.0                       
  │  ├─ ulas-bingoel-model-pod-4                                  1.0            1.0                       
  │  ├─ valentin-schmuker-mvp-seg-full                            4.0            4.0                       
  │  └─ valentin-schmuker-mvp-seg-image                           4.0            4.0                       
  └─ zariel                                                (100%) 8.0     (100%) 8.0          8.0      0.0 
     └─ zariel-a2v2-pt-ft-jcsz                                    8.0            8.0                       
  nvidia.com/mig-1g.10gb                                     (0%) 0.0       (0%) 0.0          7.0      7.0 
  └─ kiaransalee                                             (0%) 0.0       (0%) 0.0          7.0      7.0 
  nvidia.com/mig-3g.40gb                                     (0%) 0.0       (0%) 0.0          1.0      1.0 
  └─ kiaransalee                                             (0%) 0.0       (0%) 0.0          1.0      1.0 
  nvidia.com/mig-4g.40gb                                   (100%) 1.0     (100%) 1.0          1.0      0.0 
  └─ kiaransalee                                           (100%) 1.0     (100%) 1.0          1.0      0.0 
     └─ jupyter-matheus-2dstefanini-2dmariano                     1.0            1.0                       
  pods                                                    (10%) 149.0    (10%) 149.0         1.5k     1.4k 
  ├─ asmodeus                                                (8%) 9.0       (8%) 9.0        110.0    101.0 
  │  ├─ asmodeus-single-gpu-julian-schaefer-zimmermann            1.0            1.0                       
  │  ├─ dnsutils-asmodeus                                         1.0            1.0                       
  │  ├─ gpu-feature-discovery-9wxd8                               1.0            1.0                       
  │  ├─ kube-proxy-kxtvf                                          1.0            1.0                       
  │  ├─ kube-router-5cjkf                                         1.0            1.0                       
  │  ├─ nvidia-device-plugin-daemonset-pgtkn                      1.0            1.0                       
  │  ├─ step16-coolchic-gpu-run-asmodeus-a100-hq                  1.0            1.0                       
  │  ├─ step18-ftic-gpu-run-asmodeus-a100-hq-40cpu                1.0            1.0                       
  │  └─ ulas-bingoel-model-pod-5                                  1.0            1.0                       
  ├─ beholder01                                            (12%) 13.0     (12%) 13.0        110.0     97.0 
  │  ├─ coredns-66bc5c9577-kjst9                                  1.0            1.0                       
  │  ├─ dex-69dddb47b7-pk5ks                                      1.0            1.0                       
  │  ├─ dex-loginapp-5f5974b54d-6dskk                             1.0            1.0                       
  │  ├─ dex-mysql-589f4586bc-94f8p                                1.0            1.0                       
  │  ├─ dnsutils-beholder01                                       1.0            1.0                       
  │  ├─ kube-apiserver-beholder01                                 1.0            1.0                       
  │  ├─ kube-controller-manager-beholder01                        1.0            1.0                       
  │  ├─ kube-proxy-6tptz                                          1.0            1.0                       
  │  ├─ kube-router-4zkq2                                         1.0            1.0                       
  │  ├─ kube-scheduler-beholder01                                 1.0            1.0                       
  │  ├─ mediawiki-mariadb-7ffb6c9b8d-g87lk                        1.0            1.0                       
  │  ├─ nginx-k8s-5889449f8b-6p8j7                                1.0            1.0                       
  │  └─ pdf-55ccd6f459-mnq72                                      1.0            1.0                       
  ├─ beholder02                                              (8%) 9.0       (8%) 9.0        110.0    101.0 
  │  ├─ dnsutils-beholder02                                       1.0            1.0                       
  │  ├─ hub-848f5d5578-47s2j                                      1.0            1.0                       
  │  ├─ kube-apiserver-beholder02                                 1.0            1.0                       
  │  ├─ kube-controller-manager-beholder02                        1.0            1.0                       
  │  ├─ kube-proxy-cmzf2                                          1.0            1.0                       
  │  ├─ kube-router-6mwfx                                         1.0            1.0                       
  │  ├─ kube-scheduler-beholder02                                 1.0            1.0                       
  │  ├─ nginx-ip-2025-7fd66b99dd-2khh6                            1.0            1.0                       
  │  └─ nginx-rsn-2024-57d49484d-47rfc                            1.0            1.0                       
  ├─ beholder03                                              (8%) 9.0       (8%) 9.0        110.0    101.0 
  │  ├─ dnsutils-beholder03                                       1.0            1.0                       
  │  ├─ kube-apiserver-beholder03                                 1.0            1.0                       
  │  ├─ kube-controller-manager-beholder03                        1.0            1.0                       
  │  ├─ kube-proxy-cftmz                                          1.0            1.0                       
  │  ├─ kube-router-jnkdl                                         1.0            1.0                       
  │  ├─ kube-scheduler-beholder03                                 1.0            1.0                       
  │  ├─ ldap-67b47cf9b9-v6vvm                                     1.0            1.0                       
  │  ├─ memcached-6b68cdd947-w4k2q                                1.0            1.0                       
  │  └─ nfd-master-6589cf6d4c-9xw6v                               1.0            1.0                       
  ├─ belial                                                (16%) 18.0     (16%) 18.0        110.0     92.0 
  │  ├─ cool-pod                                                  1.0            1.0                       
  │  ├─ coredns-66bc5c9577-kxlwp                                  1.0            1.0                       
  │  ├─ dnsutils-belial                                           1.0            1.0                       
  │  ├─ gatekeeper-controller-manager-66f474f785-bq2vs            1.0            1.0                       
  │  ├─ gpu-feature-discovery-x4hnt                               1.0            1.0                       
  │  ├─ hub-78d6dd898d-7f6kx                                      1.0            1.0                       
  │  ├─ kube-proxy-xxgbc                                          1.0            1.0                       
  │  ├─ kube-router-2hmkx                                         1.0            1.0                       
  │  ├─ nfd-gc-7b6c64c4b8-gc2k5                                   1.0            1.0                       
  │  ├─ nvidia-device-plugin-daemonset-9nzsv                      1.0            1.0                       
  │  ├─ ollama-pod                                                1.0            1.0                       
  │  ├─ prometheus-deployment-b65d5d898-q55j8                     1.0            1.0                       
  │  ├─ pytorch-pod                                               1.0            1.0                       
  │  ├─ till-aust-baseline-arrowhead-job-pv7w7                    1.0            1.0                       
  │  ├─ till-aust-baseline-car-job-x7tlg                          1.0            1.0                       
  │  ├─ till-aust-ubuntu-entry-pod                                1.0            1.0                       
  │  ├─ user-scheduler-5cf5ffbc54-tjld9                           1.0            1.0                       
  │  └─ valentin-schmuker-storage                                 1.0            1.0                       
  ├─ demogorgon                                             (9%) 10.0      (9%) 10.0        110.0    100.0 
  │  ├─ a2v2-single-gpu-jcsz                                      1.0            1.0                       
  │  ├─ demogorgon-a2v2                                           1.0            1.0                       
  │  ├─ dnsutils-demogorgon                                       1.0            1.0                       
  │  ├─ gpu-demogorgon                                            1.0            1.0                       
  │  ├─ gpu-feature-discovery-2d8cr                               1.0            1.0                       
  │  ├─ kube-proxy-xdkh7                                          1.0            1.0                       
  │  ├─ kube-router-v72jn                                         1.0            1.0                       
  │  ├─ nvidia-device-plugin-daemonset-2b2z8                      1.0            1.0                       
  │  ├─ pycharm                                                   1.0            1.0                       
  │  └─ ulas-bingoel-model-pod-3                                  1.0            1.0                       
  ├─ fierna                                                (12%) 13.0     (12%) 13.0        110.0     97.0 
  │  ├─ dnsutils-fierna                                           1.0            1.0                       
  │  ├─ gatekeeper-controller-manager-66f474f785-dvhjf            1.0            1.0                       
  │  ├─ gpu-feature-discovery-k27g5                               1.0            1.0                       
  │  ├─ kube-proxy-pgw9t                                          1.0            1.0                       
  │  ├─ kube-router-6p89p                                         1.0            1.0                       
  │  ├─ ledavio-text-search-5d8f755795-cndsv                      1.0            1.0                       
  │  ├─ nvidia-device-plugin-daemonset-rqv7h                      1.0            1.0                       
  │  ├─ proxy-5495d795d5-jjk7j                                    1.0            1.0                       
  │  ├─ proxy-7f79cc645f-52qjx                                    1.0            1.0                       
  │  ├─ till-aust-baseline-adiac-job-66j56                        1.0            1.0                       
  │  ├─ till-aust-baseline-beef-job-ts4jw                         1.0            1.0                       
  │  ├─ user-scheduler-5cf5ffbc54-wrnrk                           1.0            1.0                       
  │  └─ user-scheduler-c7db6c584-6vbss                            1.0            1.0                       
  ├─ kiaransalee                                           (12%) 13.0     (12%) 13.0        110.0     97.0 
  │  ├─ bash-pod                                                  1.0            1.0                       
  │  ├─ continuous-image-puller-6fs4k                             1.0            1.0                       
  │  ├─ continuous-image-puller-6z8bj                             1.0            1.0                       
  │  ├─ dnsutils-kiaransalee                                      1.0            1.0                       
  │  ├─ gpu-feature-discovery-w8j9g                               1.0            1.0                       
  │  ├─ jupyter-leo-2dtiedemann                                   1.0            1.0                       
  │  ├─ jupyter-matheus-2dstefanini-2dmariano                     1.0            1.0                       
  │  ├─ jupyter-saroyehun                                         1.0            1.0                       
  │  ├─ kube-proxy-l8zn4                                          1.0            1.0                       
  │  ├─ kube-router-s5qjv                                         1.0            1.0                       
  │  ├─ nvidia-device-plugin-daemonset-8jskd                      1.0            1.0                       
  │  ├─ tgillm-pod-7cd979b759-wgxjg                               1.0            1.0                       
  │  └─ ubuntu-gpu1                                               1.0            1.0                       
  ├─ mindflayer01                                           (9%) 10.0      (9%) 10.0        110.0    100.0 
  │  ├─ dnsutils-mindflayer01                                     1.0            1.0                       
  │  ├─ gatekeeper-controller-manager-66f474f785-s8b5w            1.0            1.0                       
  │  ├─ kube-proxy-9xsph                                          1.0            1.0                       
  │  ├─ kube-router-w8xhq                                         1.0            1.0                       
  │  ├─ mariadb-849498cb44-kjj5q                                  1.0            1.0                       
  │  ├─ nginx-rec-2023-79df8c77cd-w9xwl                           1.0            1.0                       
  │  ├─ nvidia-device-plugin-daemonset-qv82v                      1.0            1.0                       
  │  ├─ phpfpm-nginx-5db96d6895-vwwf5                             1.0            1.0                       
  │  ├─ proxy-5bc89cc587-vlm9x                                    1.0            1.0                       
  │  └─ whoami-74dc54d675-swm2n                                   1.0            1.0                       
  ├─ mindflayer02                                          (14%) 15.0     (14%) 15.0        110.0     95.0 
  │  ├─ cert-manager-79559475b4-7kv54                             1.0            1.0                       
  │  ├─ cert-manager-cainjector-966fc8fbc-zql8j                   1.0            1.0                       
  │  ├─ cert-manager-webhook-854cf5f458-wwf4d                     1.0            1.0                       
  │  ├─ dnsutils-mindflayer02                                     1.0            1.0                       
  │  ├─ gatekeeper-audit-59d4b6fd4c-gtwjm                         1.0            1.0                       
  │  ├─ kube-proxy-dfgxf                                          1.0            1.0                       
  │  ├─ kube-router-sv7f6                                         1.0            1.0                       
  │  ├─ kube-state-metrics-8945855d-dqg79                         1.0            1.0                       
  │  ├─ local-path-provisioner-759479454f-7pqw8                   1.0            1.0                       
  │  ├─ mediawiki-77f9c84df5-p6k9g                                1.0            1.0                       
  │  ├─ nginx-ip-2023-78b9c84dbf-x8qs7                            1.0            1.0                       
  │  ├─ nvidia-device-plugin-daemonset-h7f9j                      1.0            1.0                       
  │  ├─ registry-auth-bd4bb7d8b-cdjnd                             1.0            1.0                       
  │  ├─ ubuntu-test-pod                                           1.0            1.0                       
  │  └─ user-scheduler-c7db6c584-2pxhd                            1.0            1.0                       
  ├─ mindflayer03                                            (7%) 8.0       (7%) 8.0        110.0    102.0 
  │  ├─ dnsutils-mindflayer03                                     1.0            1.0                       
  │  ├─ kube-proxy-nph7j                                          1.0            1.0                       
  │  ├─ kube-router-rzv5h                                         1.0            1.0                       
  │  ├─ nginx-self-service-password-54767ddc56-wncd2              1.0            1.0                       
  │  ├─ nvidia-device-plugin-daemonset-5h7w5                      1.0            1.0                       
  │  ├─ registry-7495ddbf59-vvf2x                                 1.0            1.0                       
  │  ├─ registry-browser-7f4cbdf96b-5x2mv                         1.0            1.0                       
  │  └─ traefik-deployment-d8ccbfdd4-7nxdn                        1.0            1.0                       
  ├─ tiamat                                                  (6%) 7.0       (6%) 7.0        110.0    103.0 
  │  ├─ a2v-pt-ft-hyena-large-julian-zimmermann                   1.0            1.0                       
  │  ├─ continuous-image-puller-wtkhd                             1.0            1.0                       
  │  ├─ dnsutils-tiamat                                           1.0            1.0                       
  │  ├─ gpu-feature-discovery-vc97v                               1.0            1.0                       
  │  ├─ kube-proxy-n8m88                                          1.0            1.0                       
  │  ├─ kube-router-g92j5                                         1.0            1.0                       
  │  └─ nvidia-device-plugin-daemonset-pqfxg                      1.0            1.0                       
  ├─ vecna                                                   (8%) 9.0       (8%) 9.0        110.0    101.0 
  │  ├─ dnsutils-vecna                                            1.0            1.0                       
  │  ├─ gpu-feature-discovery-8mn9m                               1.0            1.0                       
  │  ├─ kube-proxy-6r98t                                          1.0            1.0                       
  │  ├─ kube-router-4gqqj                                         1.0            1.0                       
  │  ├─ nvidia-device-plugin-daemonset-hv8rs                      1.0            1.0                       
  │  ├─ train-maap-qln5k                                          1.0            1.0                       
  │  ├─ ulas-bingoel-model-pod-4                                  1.0            1.0                       
  │  ├─ valentin-schmuker-mvp-seg-full                            1.0            1.0                       
  │  └─ valentin-schmuker-mvp-seg-image                           1.0            1.0                       
  └─ zariel                                                  (5%) 6.0       (5%) 6.0        110.0    104.0 
     ├─ dnsutils-zariel                                           1.0            1.0                       
     ├─ gpu-feature-discovery-tpxz6                               1.0            1.0                       
     ├─ kube-proxy-gsqm7                                          1.0            1.0                       
     ├─ kube-router-fmt6c                                         1.0            1.0                       
     ├─ nvidia-device-plugin-daemonset-4q45h                      1.0            1.0                       
     └─ zariel-a2v2-pt-ft-jcsz                                    1.0            1.0                       




Resource usage (by namespace)

 Resource                       Requested    Limit  Allocatable  Free 
  auth                                                                
  ├─ beholder01                                                       
  │  ├─ cpu                        450.0m      0.0                    
  │  └─ pods                          3.0      3.0                    
  ├─ beholder03                                                       
  │  ├─ cpu                        250.0m      0.0                    
  │  └─ pods                          1.0      1.0                    
  └─ mindflayer03                     1.0      1.0                    
     └─ pods                          1.0      1.0                    
  cert-manager                        3.0      3.0                    
  └─ mindflayer02                     3.0      3.0                    
     └─ pods                          3.0      3.0                    
  gatekeeper-system                                                   
  ├─ belial                                                           
  │  ├─ cpu                        100.0m      1.0                    
  │  ├─ memory                    256.0Mi  512.0Mi                    
  │  └─ pods                          1.0      1.0                    
  ├─ fierna                                                           
  │  ├─ cpu                        100.0m      1.0                    
  │  ├─ memory                    256.0Mi  512.0Mi                    
  │  └─ pods                          1.0      1.0                    
  ├─ mindflayer01                                                     
  │  ├─ cpu                        100.0m      1.0                    
  │  ├─ memory                    256.0Mi  512.0Mi                    
  │  └─ pods                          1.0      1.0                    
  └─ mindflayer02                                                     
     ├─ cpu                        100.0m      1.0                    
     ├─ memory                    256.0Mi  512.0Mi                    
     └─ pods                          1.0      1.0                    
  jupyterhub                                                          
  ├─ beholder02                       1.0      1.0                    
  │  └─ pods                          1.0      1.0                    
  ├─ fierna                           1.0      1.0                    
  │  └─ pods                          1.0      1.0                    
  ├─ kiaransalee                                                      
  │  ├─ memory                       3.2G      0.0                    
  │  ├─ nvidia.com/gpu                2.0      2.0                    
  │  ├─ nvidia.com/mig-4g.40gb        1.0      1.0                    
  │  └─ pods                          4.0      4.0                    
  ├─ mindflayer01                     1.0      1.0                    
  │  └─ pods                          1.0      1.0                    
  └─ mindflayer02                     1.0      1.0                    
     └─ pods                          1.0      1.0                    
  jupyterhub-kuckling                 2.0      2.0                    
  ├─ fierna                           1.0      1.0                    
  │  └─ pods                          1.0      1.0                    
  └─ tiamat                           1.0      1.0                    
     └─ pods                          1.0      1.0                    
  jupyterhub-students                 5.0      5.0                    
  ├─ belial                           2.0      2.0                    
  │  └─ pods                          2.0      2.0                    
  ├─ fierna                           2.0      2.0                    
  │  └─ pods                          2.0      2.0                    
  └─ kiaransalee                      1.0      1.0                    
     └─ pods                          1.0      1.0                    
  kube-system                                                         
  ├─ asmodeus                                                         
  │  ├─ cpu                        350.0m   100.0m                    
  │  ├─ memory                    350.0Mi  100.0Mi                    
  │  └─ pods                          5.0      5.0                    
  ├─ beholder01                                                       
  │  ├─ cpu                           1.0   100.0m                    
  │  ├─ memory                    420.0Mi  270.0Mi                    
  │  └─ pods                          7.0      7.0                    
  ├─ beholder02                                                       
  │  ├─ cpu                        900.0m   100.0m                    
  │  ├─ memory                    350.0Mi  100.0Mi                    
  │  └─ pods                          6.0      6.0                    
  ├─ beholder03                                                       
  │  ├─ cpu                        900.0m   100.0m                    
  │  ├─ memory                    350.0Mi  100.0Mi                    
  │  └─ pods                          6.0      6.0                    
  ├─ belial                                                           
  │  ├─ cpu                        450.0m   100.0m                    
  │  ├─ memory                    420.0Mi  270.0Mi                    
  │  └─ pods                          6.0      6.0                    
  ├─ demogorgon                                                       
  │  ├─ cpu                        350.0m   100.0m                    
  │  ├─ memory                    350.0Mi  100.0Mi                    
  │  └─ pods                          5.0      5.0                    
  ├─ fierna                                                           
  │  ├─ cpu                        350.0m   100.0m                    
  │  ├─ memory                    350.0Mi  100.0Mi                    
  │  └─ pods                          5.0      5.0                    
  ├─ kiaransalee                                                      
  │  ├─ cpu                        350.0m   100.0m                    
  │  ├─ memory                    350.0Mi  100.0Mi                    
  │  └─ pods                          5.0      5.0                    
  ├─ mindflayer01                                                     
  │  ├─ cpu                        350.0m   100.0m                    
  │  ├─ memory                    350.0Mi  100.0Mi                    
  │  └─ pods                          4.0      4.0                    
  ├─ mindflayer02                                                     
  │  ├─ cpu                        350.0m   100.0m                    
  │  ├─ memory                    350.0Mi  100.0Mi                    
  │  └─ pods                          4.0      4.0                    
  ├─ mindflayer03                                                     
  │  ├─ cpu                        350.0m   100.0m                    
  │  ├─ memory                    350.0Mi  100.0Mi                    
  │  └─ pods                          4.0      4.0                    
  ├─ tiamat                                                           
  │  ├─ cpu                        350.0m   100.0m                    
  │  ├─ memory                    350.0Mi  100.0Mi                    
  │  └─ pods                          5.0      5.0                    
  ├─ vecna                                                            
  │  ├─ cpu                        350.0m   100.0m                    
  │  ├─ memory                    350.0Mi  100.0Mi                    
  │  └─ pods                          5.0      5.0                    
  └─ zariel                                                           
     ├─ cpu                        350.0m   100.0m                    
     ├─ memory                    350.0Mi  100.0Mi                    
     └─ pods                          5.0      5.0                    
  local-path-storage                  1.0      1.0                    
  └─ mindflayer02                     1.0      1.0                    
     └─ pods                          1.0      1.0                    
  monitoring                                                          
  ├─ belial                                                           
  │  ├─ cpu                        500.0m      1.0                    
  │  ├─ memory                     500.0M    1.0Gi                    
  │  └─ pods                          1.0      1.0                    
  └─ mindflayer02                     1.0      1.0                    
     └─ pods                          1.0      1.0                    
  node-feature-discovery                                              
  ├─ beholder03                                                       
  │  ├─ cpu                        100.0m   300.0m                    
  │  ├─ memory                    128.0Mi    4.0Gi                    
  │  └─ pods                          1.0      1.0                    
  └─ belial                                                           
     ├─ cpu                         10.0m    20.0m                    
     ├─ memory                    128.0Mi    1.0Gi                    
     └─ pods                          1.0      1.0                    
  registry                                                            
  ├─ mindflayer02                                                     
  │  ├─ cpu                        100.0m      0.0                    
  │  └─ pods                          1.0      1.0                    
  └─ mindflayer03                                                     
     ├─ cpu                        300.0m      0.0                    
     └─ pods                          2.0      2.0                    
  traefik                             2.0      2.0                    
  ├─ mindflayer01                     1.0      1.0                    
  │  └─ pods                          1.0      1.0                    
  └─ mindflayer03                     1.0      1.0                    
     └─ pods                          1.0      1.0                    
  user-alex-chan                                                      
  ├─ belial                                                           
  │  ├─ cpu                           2.0      2.0                    
  │  ├─ memory                     32.0Gi   32.0Gi                    
  │  └─ pods                          1.0      1.0                    
  └─ vecna                                                            
     ├─ cpu                          10.0     16.0                    
     ├─ ephemeral-storage          20.0Gi   25.0Gi                    
     ├─ memory                     64.0Gi  160.0Gi                    
     ├─ nvidia.com/gpu                1.0      1.0                    
     └─ pods                          1.0      1.0                    
  user-andri-rutschmann                                               
  └─ kiaransalee                                                      
     ├─ cpu                          12.0     12.0                    
     ├─ memory                     16.0Gi   16.0Gi                    
     └─ pods                          1.0      1.0                    
  user-celine-angonin                                                 
  └─ demogorgon                                                       
     ├─ cpu                          15.0     15.0                    
     ├─ memory                    395.0Gi  395.0Gi                    
     ├─ nvidia.com/gpu                4.0      4.0                    
     └─ pods                          1.0      1.0                    
  user-christoph-hanselka                                             
  └─ belial                                                           
     ├─ cpu                           2.0      2.0                    
     ├─ memory                     44.0Gi   44.0Gi                    
     ├─ nvidia.com/gpu                3.0      3.0                    
     └─ pods                          2.0      2.0                    
  user-eduard-buss                                                    
  └─ demogorgon                                                       
     ├─ cpu                           1.0     32.0                    
     ├─ ephemeral-storage          15.0Gi   30.0Gi                    
     ├─ memory                     10.0Gi   32.0Gi                    
     ├─ nvidia.com/gpu                1.0      1.0                    
     └─ pods                          1.0      1.0                    
  user-giovanna-ratini                                                
  └─ mindflayer02                                                     
     ├─ cpu                        100.0m      1.0                    
     ├─ memory                    100.0Mi    1.0Gi                    
     └─ pods                          1.0      1.0                    
  user-julian-jandeleit                                               
  └─ demogorgon                                                       
     ├─ cpu                          48.0     48.0                    
     ├─ memory                     80.0Gi   80.0Gi                    
     ├─ nvidia.com/gpu                1.0      1.0                    
     └─ pods                          1.0      1.0                    
  user-julian-zimmermann                                              
  ├─ asmodeus                                                         
  │  ├─ cpu                          64.0     64.0                    
  │  ├─ memory                    512.0Gi  512.0Gi                    
  │  ├─ nvidia.com/gpu                1.0      1.0                    
  │  └─ pods                          1.0      1.0                    
  ├─ demogorgon                                                       
  │  ├─ cpu                           8.0      8.0                    
  │  ├─ memory                    256.0Gi  256.0Gi                    
  │  ├─ nvidia.com/gpu                1.0      1.0                    
  │  └─ pods                          1.0      1.0                    
  ├─ tiamat                                                           
  │  ├─ cpu                         255.0    255.0                    
  │  ├─ memory                    896.0Gi  896.0Gi                    
  │  ├─ nvidia.com/gpu                4.0      4.0                    
  │  └─ pods                          1.0      1.0                    
  └─ zariel                                                           
     ├─ cpu                         250.0    250.0                    
     ├─ memory                      1.9Ti    1.9Ti                    
     ├─ nvidia.com/gpu                8.0      8.0                    
     └─ pods                          1.0      1.0                    
  user-mike-battistella                                               
  └─ fierna                                                           
     ├─ cpu                           1.0      2.0                    
     ├─ memory                     32.0Gi   32.0Gi                    
     ├─ nvidia.com/gpu                1.0      1.0                    
     └─ pods                          1.0      1.0                    
  user-mohsen-jenadeleh                                               
  └─ asmodeus                                                         
     ├─ cpu                          60.0     60.0                    
     ├─ memory                    200.0Gi  200.0Gi                    
     ├─ nvidia.com/gpu                2.0      2.0                    
     └─ pods                          2.0      2.0                    
  user-segun-aroyehun                                                 
  └─ kiaransalee                                                      
     ├─ cpu                          25.0     50.0                    
     ├─ ephemeral-storage          20.0Gi   40.0Gi                    
     ├─ memory                    100.0Gi  150.0Gi                    
     ├─ nvidia.com/gpu                1.0      1.0                    
     └─ pods                          1.0      1.0                    
  user-stephen-tyndel                                                 
  └─ kiaransalee                                                      
     ├─ cpu                          12.0     12.0                    
     ├─ memory                     80.0Gi   80.0Gi                    
     ├─ nvidia.com/gpu                1.0      1.0                    
     └─ pods                          1.0      1.0                    
  user-till-aust                                                      
  ├─ belial                                                           
  │  ├─ cpu                          65.0     65.0                    
  │  ├─ memory                    210.0Gi  250.0Gi                    
  │  ├─ nvidia.com/gpu                2.0      2.0                    
  │  └─ pods                          3.0      3.0                    
  └─ fierna                                                           
     ├─ cpu                          64.0     64.0                    
     ├─ memory                    200.0Gi  240.0Gi                    
     ├─ nvidia.com/gpu                2.0      2.0                    
     └─ pods                          2.0      2.0                    
  user-ulas-bingoel                                                   
  ├─ asmodeus                                                         
  │  ├─ cpu                           8.0      8.0                    
  │  ├─ memory                     40.0Gi   40.0Gi                    
  │  ├─ nvidia.com/gpu                1.0      1.0                    
  │  └─ pods                          1.0      1.0                    
  ├─ demogorgon                                                       
  │  ├─ cpu                           8.0      8.0                    
  │  ├─ memory                     30.0Gi   30.0Gi                    
  │  ├─ nvidia.com/gpu                1.0      1.0                    
  │  └─ pods                          1.0      1.0                    
  └─ vecna                                                            
     ├─ cpu                           8.0      8.0                    
     ├─ memory                     30.0Gi   30.0Gi                    
     ├─ nvidia.com/gpu                1.0      1.0                    
     └─ pods                          1.0      1.0                    
  user-v-time                                                         
  └─ mindflayer01                                                     
     ├─ cpu                           1.5      3.0                    
     ├─ memory                    768.0Mi    1.5Gi                    
     └─ pods                          2.0      2.0                    
  user-valentin-schmuker                                              
  ├─ belial                                                           
  │  ├─ cpu                           2.0      2.0                    
  │  ├─ memory                      2.0Gi    2.0Gi                    
  │  └─ pods                          1.0      1.0                    
  └─ vecna                                                            
     ├─ cpu                          32.0     32.0                    
     ├─ memory                    128.0Gi  128.0Gi                    
     ├─ nvidia.com/gpu                8.0      8.0                    
     └─ pods                          2.0      2.0                    
  web                                                                 
  ├─ beholder01                                                       
  │  ├─ cpu                        250.0m      0.0                    
  │  └─ pods                          3.0      3.0                    
  ├─ beholder02                                                       
  │  ├─ cpu                        250.0m      1.0                    
  │  └─ pods                          2.0      2.0                    
  ├─ beholder03                       1.0      1.0                    
  │  └─ pods                          1.0      1.0                    
  ├─ mindflayer01                     1.0      1.0                    
  │  └─ pods                          1.0      1.0                    
  └─ mindflayer02                                                     
     ├─ cpu                        250.0m      0.0                    
     └─ pods                          2.0      2.0                    




Ceph file system status

  cluster:
    id:     3fee6f38-ba9f-11ec-9328-e188936dcafd
    health: HEALTH_OK
 
  services:
    mon: 5 daemons, quorum beholder03,beholder01,beholder02,mindflayer02,mindflayer03 (age 4w) [leader: beholder03]
    mgr: mindflayer01.mkuopd(active, since 6w), standbys: mindflayer02.ympgrs, beholder02.akktmp, beholder01.verxwn, beholder03.nprqzk, mindflayer03.rzdvrr
    mds: 4/4 daemons up, 2 standby
    osd: 24 osds: 24 up (since 4w), 24 in (since 4w)
 
  data:
    volumes: 1/1 healthy
    pools:   3 pools, 545 pgs
    objects: 73.56M objects, 97 TiB
    usage:   195 TiB used, 87 TiB / 282 TiB avail
    pgs:     543 active+clean
             2   active+clean+scrubbing+deep
 
  io:
    client:   1.4 MiB/s rd, 944 KiB/s wr, 10 op/s rd, 15 op/s wr
 
HEALTH_OK




Etcd cluster status

+------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|     ENDPOINT     |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| 192.168.1.1:4252 | 7126e7a3a9cc42ca |  3.4.30 |  156 MB |     false |      false |    302065 |  706720078 |          706720078 |        |
| 192.168.1.2:4252 | 39d72894bf6c7600 |  3.4.30 |  156 MB |     false |      false |    302065 |  706720078 |          706720078 |        |
| 192.168.1.3:4252 | bbf4a2b99c3fd692 |  3.4.30 |  156 MB |     false |      false |    302065 |  706720078 |          706720078 |        |
| 192.168.2.1:4252 | 5cb9997dd1c2246b |  3.4.30 |  156 MB |      true |      false |    302065 |  706720078 |          706720078 |        |
| 192.168.2.3:4252 |  cbc1cf89959ea4e |  3.4.30 |  156 MB |     false |      false |    302065 |  706720078 |          706720078 |        |
+------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+




Detailed network health report

API and web servers

beholder01
SSH port open yes
Report available yes
External interface up ok
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs
local raid mounted /raid
Test pod responding dnsutils-beholder01
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes
beholder02
SSH port open yes
Report available yes
External interface up ok
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs
local raid mounted /raid
Test pod responding dnsutils-beholder02
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes
beholder03
SSH port open yes
Report available yes
External interface up ok
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs
local raid mounted /raid
Test pod responding dnsutils-beholder03
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes




Ceph osd nodes

mindflayer01
SSH port open yes
Report available yes
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs
local raid mounted /raid
Test pod responding dnsutils-mindflayer01
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes
mindflayer02
SSH port open yes
Report available yes
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs
local raid mounted /raid
Test pod responding dnsutils-mindflayer02
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes
mindflayer03
SSH port open yes
Report available yes
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs
local raid mounted /raid
Test pod responding dnsutils-mindflayer03
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes




Compute nodes

vecna
SSH port open yes
Report available yes
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs/abyss
local raid mounted /raid
Test pod responding dnsutils-vecna
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes
kiaransalee
SSH port open yes
Report available yes
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs/abyss
local raid mounted /raid
Test pod responding dnsutils-kiaransalee
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes
belial
SSH port open yes
Report available yes
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs/abyss
local raid mounted /raid
Test pod responding dnsutils-belial
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes
fierna
SSH port open yes
Report available yes
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs/abyss
local raid mounted /raid
Test pod responding dnsutils-fierna
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes
demogorgon
SSH port open yes
Report available yes
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs/abyss
local raid mounted /raid
Test pod responding dnsutils-demogorgon
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes
tiamat
SSH port open yes
Report available yes
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs/abyss
local raid mounted /raid
Test pod responding dnsutils-tiamat
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes
asmodeus
SSH port open yes
Report available yes
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs/abyss
local raid mounted /raid
Test pod responding dnsutils-asmodeus
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes
zariel
SSH port open yes
Report available yes
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs/abyss
local raid mounted /raid
Test pod responding dnsutils-zariel
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes




nVidia Driver and GPU reports

dretch

belial

Wed Feb  4 02:32:49 2026       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.95.05              Driver Version: 580.95.05      CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Quadro RTX 6000                Off |   00000000:1B:00.0 Off |                  Off |
| 33%   33C    P8              4W /  260W |       1MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  Quadro RTX 6000                Off |   00000000:1C:00.0 Off |                  Off |
| 33%   32C    P8             12W /  260W |       1MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  Quadro RTX 6000                Off |   00000000:1D:00.0 Off |                  Off |
| 33%   33C    P8              5W /  260W |       1MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   3  Quadro RTX 6000                Off |   00000000:1E:00.0 Off |                  Off |
| 33%   33C    P8              5W /  260W |       1MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   4  Quadro RTX 6000                Off |   00000000:3D:00.0 Off |                  Off |
| 33%   33C    P8             13W /  260W |       1MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   5  Quadro RTX 6000                Off |   00000000:3F:00.0 Off |                  Off |
| 33%   32C    P8              6W /  260W |       1MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   6  Quadro RTX 6000                Off |   00000000:40:00.0 Off |                  Off |
| 33%   33C    P8              4W /  260W |       1MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   7  Quadro RTX 6000                Off |   00000000:41:00.0 Off |                  Off |
| 33%   33C    P8              4W /  260W |       1MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

fierna

Wed Feb  4 02:32:51 2026       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.95.05              Driver Version: 580.95.05      CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Quadro RTX 6000                On  |   00000000:1B:00.0 Off |                  Off |
| 33%   33C    P8             17W /  260W |       1MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  Quadro RTX 6000                On  |   00000000:1C:00.0 Off |                  Off |
| 33%   32C    P8             17W /  260W |       1MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  Quadro RTX 6000                On  |   00000000:1D:00.0 Off |                  Off |
| 33%   33C    P8              4W /  260W |    4100MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   3  Quadro RTX 6000                On  |   00000000:1E:00.0 Off |                  Off |
| 33%   32C    P8              4W /  260W |       1MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   4  Quadro RTX 6000                On  |   00000000:3D:00.0 Off |                  Off |
| 33%   31C    P8             15W /  260W |       1MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   5  Quadro RTX 6000                On  |   00000000:3F:00.0 Off |                  Off |
| 33%   32C    P8              4W /  260W |       1MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   6  Quadro RTX 6000                On  |   00000000:40:00.0 Off |                  Off |
| 33%   32C    P8             12W /  260W |       1MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   7  Quadro RTX 6000                On  |   00000000:41:00.0 Off |                  Off |
| 33%   32C    P8             16W /  260W |       1MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    2   N/A  N/A         2012068      C   python                                 4096MiB |
+-----------------------------------------------------------------------------------------+

tiamat

Wed Feb  4 02:32:53 2026       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.95.05              Driver Version: 580.95.05      CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A100-SXM4-40GB          On  |   00000000:01:00.0 Off |                    0 |
| N/A   52C    P0            211W /  400W |   39823MiB /  40960MiB |     98%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA A100-SXM4-40GB          On  |   00000000:41:00.0 Off |                    0 |
| N/A   52C    P0            159W /  400W |   39907MiB /  40960MiB |     86%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA A100-SXM4-40GB          On  |   00000000:81:00.0 Off |                    0 |
| N/A   44C    P0             99W /  400W |   40075MiB /  40960MiB |     98%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA A100-SXM4-40GB          On  |   00000000:C1:00.0 Off |                    0 |
| N/A   46C    P0            323W /  400W |   40015MiB /  40960MiB |     66%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A           45258      C   /usr/bin/python3                      39812MiB |
|    1   N/A  N/A           45259      C   /usr/bin/python3                      39896MiB |
|    2   N/A  N/A           45260      C   /usr/bin/python3                      40064MiB |
|    3   N/A  N/A           45261      C   /usr/bin/python3                      40004MiB |
+-----------------------------------------------------------------------------------------+

vecna

Wed Feb  4 03:32:56 2026       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.95.05              Driver Version: 580.95.05      CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Tesla V100-SXM3-32GB           Off |   00000000:34:00.0 Off |                    0 |
| N/A   34C    P0             48W /  350W |       0MiB /  32768MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  Tesla V100-SXM3-32GB           Off |   00000000:36:00.0 Off |                    0 |
| N/A   34C    P0             48W /  350W |       0MiB /  32768MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  Tesla V100-SXM3-32GB           Off |   00000000:39:00.0 Off |                    0 |
| N/A   37C    P0             54W /  350W |       0MiB /  32768MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   3  Tesla V100-SXM3-32GB           Off |   00000000:3B:00.0 Off |                    0 |
| N/A   59C    P0            304W /  350W |   25740MiB /  32768MiB |    100%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   4  Tesla V100-SXM3-32GB           Off |   00000000:57:00.0 Off |                    0 |
| N/A   49C    P0            117W /  350W |   25738MiB /  32768MiB |     80%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   5  Tesla V100-SXM3-32GB           Off |   00000000:59:00.0 Off |                    0 |
| N/A   65C    P0            335W /  350W |   25766MiB /  32768MiB |     54%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   6  Tesla V100-SXM3-32GB           Off |   00000000:5C:00.0 Off |                    0 |
| N/A   53C    P0            345W /  350W |   25770MiB /  32768MiB |     73%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   7  Tesla V100-SXM3-32GB           Off |   00000000:5E:00.0 Off |                    0 |
| N/A   68C    P0            350W /  350W |   25750MiB /  32768MiB |     60%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   8  Tesla V100-SXM3-32GB           Off |   00000000:B7:00.0 Off |                    0 |
| N/A   48C    P0             87W /  350W |   25746MiB /  32768MiB |     94%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   9  Tesla V100-SXM3-32GB           Off |   00000000:B9:00.0 Off |                    0 |
| N/A   50C    P0            233W /  350W |   25766MiB /  32768MiB |     36%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|  10  Tesla V100-SXM3-32GB           Off |   00000000:BC:00.0 Off |                    0 |
| N/A   63C    P0            350W /  350W |   25762MiB /  32768MiB |     50%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|  11  Tesla V100-SXM3-32GB           Off |   00000000:BE:00.0 Off |                    0 |
| N/A   67C    P0            321W /  350W |   25750MiB /  32768MiB |     67%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|  12  Tesla V100-SXM3-32GB           Off |   00000000:E0:00.0 Off |                    0 |
| N/A   37C    P0             48W /  350W |       0MiB /  32768MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|  13  Tesla V100-SXM3-32GB           Off |   00000000:E2:00.0 Off |                    0 |
| N/A   37C    P0             49W /  350W |       0MiB /  32768MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|  14  Tesla V100-SXM3-32GB           Off |   00000000:E5:00.0 Off |                    0 |
| N/A   40C    P0             51W /  350W |       0MiB /  32768MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|  15  Tesla V100-SXM3-32GB           Off |   00000000:E7:00.0 Off |                    0 |
| N/A   39C    P0             49W /  350W |       0MiB /  32768MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    3   N/A  N/A         3915219      C   python                                25734MiB |
|    4   N/A  N/A         1620655      C   /opt/conda/bin/python                 25728MiB |
|    5   N/A  N/A         1620656      C   /opt/conda/bin/python                 25756MiB |
|    6   N/A  N/A         1620657      C   /opt/conda/bin/python                 25760MiB |
|    7   N/A  N/A         1620658      C   /opt/conda/bin/python                 25740MiB |
|    8   N/A  N/A         2939670      C   /opt/conda/bin/python                 25736MiB |
|    9   N/A  N/A         2939671      C   /opt/conda/bin/python                 25756MiB |
|   10   N/A  N/A         2939672      C   /opt/conda/bin/python                 25752MiB |
|   11   N/A  N/A         2939673      C   /opt/conda/bin/python                 25740MiB |
+-----------------------------------------------------------------------------------------+

asmodeus

Wed Feb  4 02:32:59 2026       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.95.05              Driver Version: 580.95.05      CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A100-SXM4-80GB          On  |   00000000:01:00.0 Off |                    0 |
| N/A   46C    P0            244W /  500W |   55084MiB /  81920MiB |    100%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA A100-SXM4-80GB          On  |   00000000:41:00.0 Off |                    0 |
| N/A   44C    P0            216W /  500W |    7498MiB /  81920MiB |    100%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA A100-SXM4-80GB          On  |   00000000:81:00.0 Off |                    0 |
| N/A   27C    P0             59W /  500W |       0MiB /  81920MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA A100-SXM4-80GB          On  |   00000000:C1:00.0 Off |                    0 |
| N/A   27C    P0             60W /  500W |       0MiB /  81920MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A         2085878      C   python                                13020MiB |
|    0   N/A  N/A         2103268      C   python                                15440MiB |
|    0   N/A  N/A         2107017      C   python                                15440MiB |
|    0   N/A  N/A         2125087      C   python                                11110MiB |
|    1   N/A  N/A          388245      C   python                                   12MiB |
|    1   N/A  N/A          388252      C   python                                   12MiB |
|    1   N/A  N/A         2112002      C   python                                 3560MiB |
|    1   N/A  N/A         2124517      C   python                                 3760MiB |
+-----------------------------------------------------------------------------------------+

zariel

Wed Feb  4 03:33:02 2026       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.95.05              Driver Version: 580.95.05      CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A100-SXM4-40GB          On  |   00000000:07:00.0 Off |                    0 |
| N/A   48C    P0            178W /  400W |   34476MiB /  40960MiB |     32%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA A100-SXM4-40GB          On  |   00000000:0F:00.0 Off |                    0 |
| N/A   45C    P0            192W /  400W |   34002MiB /  40960MiB |    100%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA A100-SXM4-40GB          On  |   00000000:47:00.0 Off |                    0 |
| N/A   43C    P0            177W /  400W |   34506MiB /  40960MiB |     77%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA A100-SXM4-40GB          On  |   00000000:4E:00.0 Off |                    0 |
| N/A   44C    P0            174W /  400W |   34942MiB /  40960MiB |     96%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   4  NVIDIA A100-SXM4-40GB          On  |   00000000:87:00.0 Off |                    0 |
| N/A   61C    P0            206W /  400W |   34004MiB /  40960MiB |    100%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   5  NVIDIA A100-SXM4-40GB          On  |   00000000:90:00.0 Off |                    0 |
| N/A   56C    P0            150W /  400W |   34878MiB /  40960MiB |     95%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   6  NVIDIA A100-SXM4-40GB          On  |   00000000:B7:00.0 Off |                    0 |
| N/A   57C    P0            198W /  400W |   34170MiB /  40960MiB |     99%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   7  NVIDIA A100-SXM4-40GB          On  |   00000000:BD:00.0 Off |                    0 |
| N/A   55C    P0            176W /  400W |   34472MiB /  40960MiB |     92%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A         1060845      C   /usr/bin/python                       34466MiB |
|    1   N/A  N/A         1060846      C   /usr/bin/python                       33992MiB |
|    2   N/A  N/A         1060847      C   /usr/bin/python                       34496MiB |
|    3   N/A  N/A         1060848      C   /usr/bin/python                       34932MiB |
|    4   N/A  N/A         1060849      C   /usr/bin/python                       33994MiB |
|    5   N/A  N/A         1060850      C   /usr/bin/python                       34868MiB |
|    6   N/A  N/A         1060851      C   /usr/bin/python                       34160MiB |
|    7   N/A  N/A         1060852      C   /usr/bin/python                       34462MiB |
+-----------------------------------------------------------------------------------------+

demogorgon

Wed Feb  4 02:33:06 2026       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.95.05              Driver Version: 580.95.05      CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A40                     Off |   00000000:01:00.0 Off |                    0 |
|  0%   30C    P8             25W /  300W |       0MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA A40                     Off |   00000000:25:00.0 Off |                    0 |
|  0%   29C    P8             24W /  300W |       0MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA A40                     Off |   00000000:41:00.0 Off |                    0 |
|  0%   29C    P8             24W /  300W |       0MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA A40                     Off |   00000000:61:00.0 Off |                    0 |
|  0%   29C    P8             24W /  300W |       0MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   4  NVIDIA A40                     Off |   00000000:81:00.0 Off |                    0 |
|  0%   29C    P8             17W /  300W |       0MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   5  NVIDIA A40                     Off |   00000000:A1:00.0 Off |                    0 |
|  0%   42C    P0             80W /  300W |   39707MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   6  NVIDIA A40                     Off |   00000000:C1:00.0 Off |                    0 |
|  0%   47C    P0             83W /  300W |    4093MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   7  NVIDIA A40                     Off |   00000000:E1:00.0 Off |                    0 |
|  0%   29C    P8             16W /  300W |       0MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    5   N/A  N/A         1964129      C   /usr/bin/python                       39698MiB |
|    6   N/A  N/A          649962      C   python                                 4084MiB |
+-----------------------------------------------------------------------------------------+

kiaransalee

Wed Feb  4 02:33:08 2026       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.127.08             Driver Version: 550.127.08     CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA H100 80GB HBM3          On  |   00000000:26:00.0 Off |                    0 |
| N/A   28C    P0             97W /  700W |       1MiB /  81559MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA H100 80GB HBM3          On  |   00000000:2F:00.0 Off |                    0 |
| N/A   68C    P0            700W /  700W |   58452MiB /  81559MiB |    100%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA H100 80GB HBM3          On  |   00000000:46:00.0 Off |                    0 |
| N/A   29C    P0             74W /  700W |       1MiB /  81559MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA H100 80GB HBM3          On  |   00000000:54:00.0 Off |                    0 |
| N/A   33C    P0            148W /  700W |     724MiB /  81559MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   4  NVIDIA H100 80GB HBM3          On  |   00000000:A6:00.0 Off |                    0 |
| N/A   43C    P0             82W /  700W |       1MiB /  81559MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   5  NVIDIA H100 80GB HBM3          On  |   00000000:AF:00.0 Off |                    0 |
| N/A   31C    P0            127W /  700W |   75430MiB /  81559MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   6  NVIDIA H100 80GB HBM3          On  |   00000000:C6:00.0 Off |                   On |
| N/A   42C    P0            344W /  700W |   29600MiB /  81559MiB |     N/A      Default |
|                                         |                        |              Enabled |
+-----------------------------------------+------------------------+----------------------+
|   7  NVIDIA H100 80GB HBM3          On  |   00000000:CF:00.0 Off |                   On |
| N/A   26C    P0             73W /  700W |      90MiB /  81559MiB |     N/A      Default |
|                                         |                        |              Enabled |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| MIG devices:                                                                            |
+------------------+----------------------------------+-----------+-----------------------+
| GPU  GI  CI  MIG |                     Memory-Usage |        Vol|      Shared           |
|      ID  ID  Dev |                       BAR1-Usage | SM     Unc| CE ENC DEC OFA JPG    |
|                  |                                  |        ECC|                       |
|==================+==================================+===========+=======================|
|  6    1   0   0  |           29534MiB / 40320MiB    | 64      0 |  4   0    4    0    4 |
|                  |                 3MiB / 65535MiB  |           |                       |
+------------------+----------------------------------+-----------+-----------------------+
|  6    2   0   1  |              38MiB / 40320MiB    | 60      0 |  3   0    3    0    3 |
|                  |                 0MiB / 65535MiB  |           |                       |
+------------------+----------------------------------+-----------+-----------------------+
|  7    7   0   0  |              13MiB /  9984MiB    | 16      0 |  1   0    1    0    1 |
|                  |                 0MiB / 16383MiB  |           |                       |
+------------------+----------------------------------+-----------+-----------------------+
|  7    8   0   1  |              13MiB /  9984MiB    | 16      0 |  1   0    1    0    1 |
|                  |                 0MiB / 16383MiB  |           |                       |
+------------------+----------------------------------+-----------+-----------------------+
|  7    9   0   2  |              13MiB /  9984MiB    | 16      0 |  1   0    1    0    1 |
|                  |                 0MiB / 16383MiB  |           |                       |
+------------------+----------------------------------+-----------+-----------------------+
|  7   11   0   3  |              13MiB /  9984MiB    | 16      0 |  1   0    1    0    1 |
|                  |                 0MiB / 16383MiB  |           |                       |
+------------------+----------------------------------+-----------+-----------------------+
|  7   12   0   4  |              13MiB /  9984MiB    | 16      0 |  1   0    1    0    1 |
|                  |                 0MiB / 16383MiB  |           |                       |
+------------------+----------------------------------+-----------+-----------------------+
|  7   13   0   5  |              13MiB /  9984MiB    | 16      0 |  1   0    1    0    1 |
|                  |                 0MiB / 16383MiB  |           |                       |
+------------------+----------------------------------+-----------+-----------------------+
|  7   14   0   6  |              13MiB /  9984MiB    | 16      0 |  1   0    1    0    1 |
|                  |                 0MiB / 16383MiB  |           |                       |
+------------------+----------------------------------+-----------+-----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    1   N/A  N/A   2316272      C   VLLM::EngineCore                            58442MiB |
|    3   N/A  N/A   2310751      C   /home/jovyan/server_env/bin/python            714MiB |
|    5   N/A  N/A   2572298      C   VLLM::EngineCore                            75420MiB |
|    6    1    0    1989885      C   python                                      29502MiB |
+-----------------------------------------------------------------------------------------+