Cluster status report from beholder03, Wed, 17 Dec 2025 21:52:01 +0000

Resource usage (overall)
Resource usage (by namespace)
Ceph file system status
Etcd cluster status
Detailed network health report
NVIDIA Driver and GPU reports
  Imp
  Dretch
  Belial
  Fierna
  Tiamat
  Vecna
  Asmodeus
  Zariel
  Demogorgon

Resource usage (overall)
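
Note on reading the table in this section: values use Kubernetes quantity suffixes (m for milli, k/M/G/T decimal, Ki/Mi/Gi/Ti binary). The Free column appears consistent with allocatable minus the larger of requested and limit, floored at zero. That relation is inferred from the rows shown here and is an assumption, not a statement about the reporting tool's internals. A minimal Python sketch follows; the helper names parse_quantity and free are hypothetical.

```python
from decimal import Decimal

# Multipliers for Kubernetes-style quantity suffixes.
SUFFIXES = {
    "m": Decimal("0.001"),
    "k": Decimal(10) ** 3,
    "M": Decimal(10) ** 6,
    "G": Decimal(10) ** 9,
    "T": Decimal(10) ** 12,
    "Ki": Decimal(2) ** 10,
    "Mi": Decimal(2) ** 20,
    "Gi": Decimal(2) ** 30,
    "Ti": Decimal(2) ** 40,
}


def parse_quantity(text):
    """Convert a table value such as '100.0m' or '1.0Gi' to a base-unit Decimal."""
    # Check two-letter suffixes first so 'Gi' is not mistaken for a bare number.
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if text.endswith(suffix):
            return Decimal(text[: -len(suffix)]) * SUFFIXES[suffix]
    return Decimal(text)


def free(requested, limit, allocatable):
    """Assumed relation: allocatable minus max(requested, limit), never negative."""
    return max(Decimal(0), allocatable - max(requested, limit))
```

For example, the beholder02 cpu row (requested 1.9, limit 2.1, allocatable 24.0) gives 21.9, and the demogorgon cpu row (limit 103.1 against 96.0 allocatable) gives 0, matching the Free column. Small differences on other rows can come from rounding in the displayed values.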

 Resource                                                                    Requested          Limit  Allocatable     Free 
  cpu                                                                      (41%) 651.5    (51%) 808.4         1.6k    767.6 
  ├─ asmodeus                                                              (96%) 245.3    (96%) 245.1        256.0     10.7 
  │  ├─ asmodeus-three-gpu-julian-schaefer-zimmermann                            225.0          225.0                       
  │  ├─ dnsutils-asmodeus                                                       100.0m         100.0m                       
  │  ├─ kube-router-dnkrb                                                       250.0m            0.0                       
  │  └─ step16-coolchic-gpu-run-asmodeus-a100-hq                                  20.0           20.0                       
  ├─ beholder01                                                               (4%) 1.0    (0%) 100.0m         24.0     23.0 
  │  ├─ coredns-55cb58b774-ts944                                                100.0m            0.0                       
  │  ├─ dnsutils-beholder01                                                     100.0m         100.0m                       
  │  ├─ kube-apiserver-beholder01                                               250.0m            0.0                       
  │  ├─ kube-controller-manager-beholder01                                      200.0m            0.0                       
  │  ├─ kube-router-x6f6z                                                       250.0m            0.0                       
  │  └─ kube-scheduler-beholder01                                               100.0m            0.0                       
  ├─ beholder02                                                               (8%) 1.9       (9%) 2.1         24.0     21.9 
  │  ├─ dnsutils-beholder02                                                     100.0m         100.0m                       
  │  ├─ kube-apiserver-beholder02                                               250.0m            0.0                       
  │  ├─ kube-controller-manager-beholder02                                      200.0m            0.0                       
  │  ├─ kube-router-q5qj6                                                       250.0m            0.0                       
  │  ├─ kube-scheduler-beholder02                                               100.0m            0.0                       
  │  ├─ wordpress-8c8944c8d-cxbrq                                               500.0m            1.0                       
  │  └─ wordpress-mariadb-565c547dc8-59fhr                                      500.0m            1.0                       
  ├─ beholder03                                                            (4%) 900.0m    (0%) 100.0m         24.0     23.1 
  │  ├─ dnsutils-beholder03                                                     100.0m         100.0m                       
  │  ├─ kube-apiserver-beholder03                                               250.0m            0.0                       
  │  ├─ kube-controller-manager-beholder03                                      200.0m            0.0                       
  │  ├─ kube-router-nkl22                                                       250.0m            0.0                       
  │  └─ kube-scheduler-beholder03                                               100.0m            0.0                       
  ├─ belial                                                                 (78%) 62.4     (78%) 62.1         80.0     17.6 
  │  ├─ dnsutils-belial                                                         100.0m         100.0m                       
  │  ├─ kube-router-fh7nx                                                       250.0m            0.0                       
  │  ├─ ledavio-similarity-search                                                  1.0            1.0                       
  │  ├─ till-aust-inceptions-models-job-hgvwj                                      1.0            1.0                       
  │  ├─ till-aust-lc-11-20-all-all-2-ppot-accuracy-job-bl9xc                       5.0            5.0                       
  │  ├─ till-aust-lc-15-20-all-all-2-ppot-accuracy-job-qnnvk                       5.0            5.0                       
  │  ├─ till-aust-lc-16-10-all-all-3-ppot-accuracy-job-546wc                       5.0            5.0                       
  │  ├─ till-aust-lc-2-10-all-all-2-ppot-accuracy-job-r9vth                        5.0            5.0                       
  │  ├─ till-aust-lc-2-60-all-all-3-duration-estimate-accuracy-jobd2wvl            5.0            5.0                       
  │  ├─ till-aust-lc-3-5-all-all-3-duration-estimate-accuracy-job-m6q28            5.0            5.0                       
  │  ├─ till-aust-lc-5-10-all-all-3-ppot-accuracy-job-j2qph                        5.0            5.0                       
  │  ├─ till-aust-lc-5-20-all-all-3-ppot-accuracy-job-c4tjl                        5.0            5.0                       
  │  ├─ till-aust-lc-5-45-all-all-2-ppot-accuracy-job-gdq6m                        5.0            5.0                       
  │  ├─ till-aust-lc-6-10-all-all-2-ppot-accuracy-job-n586b                        5.0            5.0                       
  │  ├─ till-aust-lc-6-20-all-all-3-ppot-accuracy-job-8njqd                        5.0            5.0                       
  │  └─ till-aust-lc-7-2-all-all-2-ppot-accuracy-job-s9thx                         5.0            5.0                       
  ├─ demogorgon                                                             (75%) 72.3   (107%) 103.1         96.0      0.0 
  │  ├─ a2v2-two-gpu-jcsz                                                          8.0            8.0                       
  │  ├─ demogorgon-a2v2                                                           15.0           15.0                       
  │  ├─ dnsutils-demogorgon                                                     100.0m         100.0m                       
  │  ├─ gpu-demogorgon                                                            48.0           48.0                       
  │  ├─ kube-router-shv9v                                                       250.0m            0.0                       
  │  └─ pycharm                                                                    1.0           32.0                       
  ├─ fierna                                                                 (24%) 19.6     (31%) 25.1         80.0     54.9 
  │  ├─ ccu-deepseek-low                                                          10.0           10.0                       
  │  ├─ detect-20251217-071751-94-thsw4                                            2.0            3.0                       
  │  ├─ detect-20251217-071751-95-lqrhm                                            2.0            3.0                       
  │  ├─ detect-20251217-071751-96-zmb5z                                            2.0            3.0                       
  │  ├─ detect-20251217-071751-97-2ftpq                                            2.0            3.0                       
  │  ├─ dnsutils-fierna                                                         100.0m         100.0m                       
  │  ├─ gatekeeper-audit-59d4b6fd4c-kcshh                                       100.0m            1.0                       
  │  ├─ gatekeeper-controller-manager-66f474f785-65q6j                          100.0m            1.0                       
  │  ├─ gpu-pod                                                                    1.0            1.0                       
  │  └─ kube-router-z5hzc                                                       250.0m            0.0                       
  ├─ kiaransalee                                                            (19%) 36.4     (19%) 36.1        192.0    155.7 
  │  ├─ bash-pod                                                                  12.0           12.0                       
  │  ├─ dnsutils-kiaransalee                                                    100.0m         100.0m                       
  │  ├─ kube-router-89zrt                                                       250.0m            0.0                       
  │  ├─ tgillm-pod-689c99c6d7-9jjkz                                               12.0           12.0                       
  │  └─ wave-pod                                                                  12.0           12.0                       
  ├─ mindflayer01                                                             (5%) 3.2       (7%) 4.6         64.0     59.4 
  │  ├─ coolchic-editor                                                         500.0m         500.0m                       
  │  ├─ dex-7b88c8985f-5zqcd                                                    250.0m            0.0                       
  │  ├─ dex-loginapp-779c77c986-wjxqs                                           100.0m            0.0                       
  │  ├─ dex-mysql-66b5885f7b-2wdw8                                              100.0m            0.0                       
  │  ├─ dnsutils-mindflayer01                                                   100.0m         100.0m                       
  │  ├─ gatekeeper-controller-manager-66f474f785-lxl5x                          100.0m            1.0                       
  │  ├─ kube-router-rzwmw                                                       250.0m            0.0                       
  │  ├─ mariadb-849498cb44-dldtm                                                500.0m            1.0                       
  │  ├─ mediawiki-68594fd995-2k4fp                                              250.0m            0.0                       
  │  ├─ mediawiki-mariadb-5c7dbf6b85-9wwp8                                      250.0m            0.0                       
  │  ├─ prometheus-deployment-b65d5d898-t2z9l                                   500.0m            1.0                       
  │  ├─ registry-57fb9f57f4-vplmf                                               100.0m            0.0                       
  │  ├─ registry-auth-7b4598cd74-qctwp                                          100.0m            0.0                       
  │  └─ temp-pod                                                                100.0m            1.0                       
  ├─ mindflayer02                                                           (18%) 11.3     (28%) 18.1         64.0     45.9 
  │  ├─ cool-pod                                                                  10.0           16.0                       
  │  ├─ dnsutils-mindflayer02                                                   100.0m         100.0m                       
  │  ├─ kube-router-sbdv9                                                       250.0m            0.0                       
  │  └─ phpfpm-nginx-5db96d6895-97f2g                                              1.0            2.0                       
  ├─ mindflayer03                                                             (2%) 1.1       (2%) 1.6         64.0     62.4 
  │  ├─ check-data-pod                                                          100.0m         500.0m                       
  │  ├─ coredns-55cb58b774-4kpwk                                                100.0m            0.0                       
  │  ├─ dnsutils-mindflayer03                                                   100.0m         100.0m                       
  │  ├─ gatekeeper-controller-manager-66f474f785-5gv4n                          100.0m            1.0                       
  │  ├─ kube-router-q46sc                                                       250.0m            0.0                       
  │  ├─ ldap-6987986dbc-2zr2c                                                   250.0m            0.0                       
  │  ├─ system-registry-8877cd57-59pzr                                          100.0m            0.0                       
  │  └─ user-registry-64bb8ff7cf-5cktq                                          100.0m            0.0                       
  ├─ tiamat                                                                 (16%) 41.4     (16%) 42.1        256.0    213.9 
  │  ├─ coolchic-gpu-run-perceptive-set3                                          40.0           40.0                       
  │  ├─ dnsutils-tiamat                                                         100.0m         100.0m                       
  │  ├─ kube-router-zc5x6                                                       250.0m            0.0                       
  │  └─ ledavio-text-search-7bcd8c4885-vn2qs                                       1.0            2.0                       
  ├─ vecna                                                                 (0%) 350.0m    (0%) 100.0m         96.0     95.7 
  │  ├─ dnsutils-vecna                                                          100.0m         100.0m                       
  │  └─ kube-router-bsdm4                                                       250.0m            0.0                       
  └─ zariel                                                                (60%) 154.3   (105%) 268.1        256.0      0.0 
     ├─ allin1                                                                    64.0           96.0                       
     ├─ dnsutils-zariel                                                         100.0m         100.0m                       
     ├─ kube-router-svsrt                                                       250.0m            0.0                       
     ├─ oliver-wiedemann-pod                                                      32.0           64.0                       
     ├─ ubuntu-gpu1                                                               50.0          100.0                       
     └─ ulas-bingoel-model-pod-4                                                   8.0            8.0                       
  ephemeral-storage                                                       (1%) 115.0Gi   (2%) 180.0Gi        11.3T    11.1T 
  ├─ asmodeus                                                                 (0%) 0.0       (0%) 0.0        94.6G    94.6G 
  ├─ beholder01                                                               (0%) 0.0       (0%) 0.0         1.7T     1.7T 
  ├─ beholder02                                                               (0%) 0.0       (0%) 0.0         1.7T     1.7T 
  ├─ beholder03                                                               (0%) 0.0       (0%) 0.0         1.7T     1.7T 
  ├─ belial                                                                   (0%) 0.0       (0%) 0.0       189.2G   189.2G 
  ├─ demogorgon                                                            (2%) 15.0Gi    (5%) 30.0Gi       706.7G   674.5G 
  │  └─ pycharm                                                                 15.0Gi         30.0Gi                       
  ├─ fierna                                                                   (0%) 0.0       (0%) 0.0       189.2G   189.2G 
  ├─ kiaransalee                                                              (0%) 0.0       (0%) 0.0         1.7T     1.7T 
  ├─ mindflayer01                                                             (0%) 0.0       (0%) 0.0       211.5G   211.5G 
  ├─ mindflayer02                                                             (0%) 0.0       (0%) 0.0       211.5G   211.5G 
  ├─ mindflayer03                                                             (0%) 0.0       (0%) 0.0       211.5G   211.5G 
  ├─ tiamat                                                                   (0%) 0.0       (0%) 0.0       164.4G   164.4G 
  ├─ vecna                                                                    (0%) 0.0       (0%) 0.0       849.0G   849.0G 
  └─ zariel                                                               (6%) 100.0Gi   (9%) 150.0Gi         1.7T     1.5T 
     └─ ubuntu-gpu1                                                            100.0Gi        150.0Gi                       
  memory                                                                    (37%) 5.1T    (39%) 5.0Ti       12.7Ti    7.7Ti 
  ├─ asmodeus                                                              (82%) 1.6Ti    (82%) 1.6Ti        2.0Ti  367.4Gi 
  │  ├─ asmodeus-three-gpu-julian-schaefer-zimmermann                            1.5Ti          1.5Ti                       
  │  ├─ dnsutils-asmodeus                                                      100.0Mi        100.0Mi                       
  │  ├─ kube-router-dnkrb                                                      250.0Mi            0.0                       
  │  └─ step16-coolchic-gpu-run-asmodeus-a100-hq                               100.0Gi        100.0Gi                       
  ├─ beholder01                                                           (0%) 420.0Mi   (0%) 270.0Mi       92.9Gi   92.5Gi 
  │  ├─ coredns-55cb58b774-ts944                                                70.0Mi        170.0Mi                       
  │  ├─ dnsutils-beholder01                                                    100.0Mi        100.0Mi                       
  │  └─ kube-router-x6f6z                                                      250.0Mi            0.0                       
  ├─ beholder02                                                           (1%) 862.0Mi     (1%) 1.1Gi       92.9Gi   91.8Gi 
  │  ├─ dnsutils-beholder02                                                    100.0Mi        100.0Mi                       
  │  ├─ kube-router-q5qj6                                                      250.0Mi            0.0                       
  │  ├─ wordpress-8c8944c8d-cxbrq                                              256.0Mi        512.0Mi                       
  │  └─ wordpress-mariadb-565c547dc8-59fhr                                     256.0Mi        512.0Mi                       
  ├─ beholder03                                                           (0%) 350.0Mi   (0%) 100.0Mi       92.9Gi   92.5Gi 
  │  ├─ dnsutils-beholder03                                                    100.0Mi        100.0Mi                       
  │  └─ kube-router-nkl22                                                      250.0Mi            0.0                       
  ├─ belial                                                              (49%) 372.3Gi  (68%) 512.1Gi      754.4Gi  242.3Gi 
  │  ├─ dnsutils-belial                                                        100.0Mi        100.0Mi                       
  │  ├─ kube-router-fh7nx                                                      250.0Mi            0.0                       
  │  ├─ ledavio-similarity-search                                               32.0Gi         32.0Gi                       
  │  ├─ till-aust-inceptions-models-job-hgvwj                                  100.0Gi        120.0Gi                       
  │  ├─ till-aust-lc-11-20-all-all-2-ppot-accuracy-job-bl9xc                    20.0Gi         30.0Gi                       
  │  ├─ till-aust-lc-15-20-all-all-2-ppot-accuracy-job-qnnvk                    20.0Gi         30.0Gi                       
  │  ├─ till-aust-lc-16-10-all-all-3-ppot-accuracy-job-546wc                    20.0Gi         30.0Gi                       
  │  ├─ till-aust-lc-2-10-all-all-2-ppot-accuracy-job-r9vth                     20.0Gi         30.0Gi                       
  │  ├─ till-aust-lc-2-60-all-all-3-duration-estimate-accuracy-jobd2wvl         20.0Gi         30.0Gi                       
  │  ├─ till-aust-lc-3-5-all-all-3-duration-estimate-accuracy-job-m6q28         20.0Gi         30.0Gi                       
  │  ├─ till-aust-lc-5-10-all-all-3-ppot-accuracy-job-j2qph                     20.0Gi         30.0Gi                       
  │  ├─ till-aust-lc-5-20-all-all-3-ppot-accuracy-job-c4tjl                     20.0Gi         30.0Gi                       
  │  ├─ till-aust-lc-5-45-all-all-2-ppot-accuracy-job-gdq6m                     20.0Gi         30.0Gi                       
  │  ├─ till-aust-lc-6-10-all-all-2-ppot-accuracy-job-n586b                     20.0Gi         30.0Gi                       
  │  ├─ till-aust-lc-6-20-all-all-3-ppot-accuracy-job-8njqd                     20.0Gi         30.0Gi                       
  │  └─ till-aust-lc-7-2-all-all-2-ppot-accuracy-job-s9thx                      20.0Gi         30.0Gi                       
  ├─ demogorgon                                                            (75%) 1.5Ti    (76%) 1.5Ti        2.0Ti  472.7Gi 
  │  ├─ a2v2-two-gpu-jcsz                                                        1.0Ti          1.0Ti                       
  │  ├─ demogorgon-a2v2                                                        395.0Gi        395.0Gi                       
  │  ├─ dnsutils-demogorgon                                                    100.0Mi        100.0Mi                       
  │  ├─ gpu-demogorgon                                                          80.0Gi         80.0Gi                       
  │  ├─ kube-router-shv9v                                                      250.0Mi            0.0                       
  │  └─ pycharm                                                                 10.0Gi         32.0Gi                       
  ├─ fierna                                                              (28%) 210.8Gi  (29%) 219.1Gi      754.4Gi  535.3Gi 
  │  ├─ ccu-deepseek-low                                                       160.0Gi        160.0Gi                       
  │  ├─ detect-20251217-071751-94-thsw4                                         10.0Gi         12.0Gi                       
  │  ├─ detect-20251217-071751-95-lqrhm                                         10.0Gi         12.0Gi                       
  │  ├─ detect-20251217-071751-96-zmb5z                                         10.0Gi         12.0Gi                       
  │  ├─ detect-20251217-071751-97-2ftpq                                         10.0Gi         12.0Gi                       
  │  ├─ dnsutils-fierna                                                        100.0Mi        100.0Mi                       
  │  ├─ gatekeeper-audit-59d4b6fd4c-kcshh                                      256.0Mi        512.0Mi                       
  │  ├─ gatekeeper-controller-manager-66f474f785-65q6j                         256.0Mi        512.0Mi                       
  │  ├─ gpu-pod                                                                 10.0Gi         10.0Gi                       
  │  └─ kube-router-z5hzc                                                      250.0Mi            0.0                       
  ├─ kiaransalee                                                           (7%) 121.7G   (7%) 112.1Gi        1.5Ti     1.5T 
  │  ├─ bash-pod                                                                16.0Gi         16.0Gi                       
  │  ├─ dnsutils-kiaransalee                                                   100.0Mi        100.0Mi                       
  │  ├─ jupyter-sense-2damid-2dmadness                                            1.1G            0.0                       
  │  ├─ kube-router-89zrt                                                      250.0Mi            0.0                       
  │  ├─ tgillm-pod-689c99c6d7-9jjkz                                             80.0Gi         80.0Gi                       
  │  └─ wave-pod                                                                16.0Gi         16.0Gi                       
  ├─ mindflayer01                                                            (1%) 2.6G     (1%) 4.1Gi      376.5Gi  372.4Gi 
  │  ├─ coolchic-editor                                                          1.0Gi          1.0Gi                       
  │  ├─ dnsutils-mindflayer01                                                  100.0Mi        100.0Mi                       
  │  ├─ gatekeeper-controller-manager-66f474f785-lxl5x                         256.0Mi        512.0Mi                       
  │  ├─ kube-router-rzwmw                                                      250.0Mi            0.0                       
  │  ├─ mariadb-849498cb44-dldtm                                               256.0Mi        512.0Mi                       
  │  ├─ prometheus-deployment-b65d5d898-t2z9l                                   500.0M          1.0Gi                       
  │  └─ temp-pod                                                               100.0Mi          1.0Gi                       
  ├─ mindflayer02                                                         (17%) 64.8Gi   (17%) 65.1Gi      376.5Gi  311.4Gi 
  │  ├─ cool-pod                                                                64.0Gi         64.0Gi                       
  │  ├─ dnsutils-mindflayer02                                                  100.0Mi        100.0Mi                       
  │  ├─ kube-router-sbdv9                                                      250.0Mi            0.0                       
  │  └─ phpfpm-nginx-5db96d6895-97f2g                                          512.0Mi          1.0Gi                       
  ├─ mindflayer03                                                         (0%) 776.0Mi     (0%) 1.3Gi      376.5Gi  375.2Gi 
  │  ├─ check-data-pod                                                         100.0Mi        500.0Mi                       
  │  ├─ coredns-55cb58b774-4kpwk                                                70.0Mi        170.0Mi                       
  │  ├─ dnsutils-mindflayer03                                                  100.0Mi        100.0Mi                       
  │  ├─ gatekeeper-controller-manager-66f474f785-5gv4n                         256.0Mi        512.0Mi                       
  │  └─ kube-router-q46sc                                                      250.0Mi            0.0                       
  ├─ tiamat                                                                (7%) 72.3Gi    (7%) 72.1Gi     1007.6Gi  935.2Gi 
  │  ├─ coolchic-gpu-run-perceptive-set3                                        40.0Gi         40.0Gi                       
  │  ├─ dnsutils-tiamat                                                        100.0Mi        100.0Mi                       
  │  ├─ kube-router-zc5x6                                                      250.0Mi            0.0                       
  │  └─ ledavio-text-search-7bcd8c4885-vn2qs                                    32.0Gi         32.0Gi                       
  ├─ vecna                                                                (0%) 350.0Mi   (0%) 100.0Mi        1.5Ti    1.5Ti 
  │  ├─ dnsutils-vecna                                                         100.0Mi        100.0Mi                       
  │  └─ kube-router-bsdm4                                                      250.0Mi            0.0                       
  └─ zariel                                                              (37%) 754.3Gi  (46%) 920.1Gi        2.0Ti    1.1Ti 
     ├─ allin1                                                                  64.0Gi         80.0Gi                       
     ├─ dnsutils-zariel                                                        100.0Mi        100.0Mi                       
     ├─ kube-router-svsrt                                                      250.0Mi            0.0                       
     ├─ oliver-wiedemann-pod                                                   400.0Gi        400.0Gi                       
     ├─ ubuntu-gpu1                                                            250.0Gi        400.0Gi                       
     └─ ulas-bingoel-model-pod-4                                                40.0Gi         40.0Gi                       
  nvidia.com/gpu                                                            (48%) 30.0     (48%) 30.0         62.0     32.0 
  ├─ asmodeus                                                               (100%) 4.0     (100%) 4.0          4.0      0.0 
  │  ├─ asmodeus-three-gpu-julian-schaefer-zimmermann                              3.0            3.0                       
  │  └─ step16-coolchic-gpu-run-asmodeus-a100-hq                                   1.0            1.0                       
  ├─ belial                                                                  (25%) 2.0      (25%) 2.0          8.0      6.0 
  │  ├─ ledavio-similarity-search                                                  1.0            1.0                       
  │  └─ till-aust-inceptions-models-job-hgvwj                                      1.0            1.0                       
  ├─ demogorgon                                                             (100%) 8.0     (100%) 8.0          8.0      0.0 
  │  ├─ a2v2-two-gpu-jcsz                                                          2.0            2.0                       
  │  ├─ demogorgon-a2v2                                                            4.0            4.0                       
  │  ├─ gpu-demogorgon                                                             1.0            1.0                       
  │  └─ pycharm                                                                    1.0            1.0                       
  ├─ fierna                                                                  (75%) 6.0      (75%) 6.0          8.0      2.0 
  │  ├─ ccu-deepseek-low                                                           1.0            1.0                       
  │  ├─ detect-20251217-071751-94-thsw4                                            1.0            1.0                       
  │  ├─ detect-20251217-071751-95-lqrhm                                            1.0            1.0                       
  │  ├─ detect-20251217-071751-96-zmb5z                                            1.0            1.0                       
  │  ├─ detect-20251217-071751-97-2ftpq                                            1.0            1.0                       
  │  └─ gpu-pod                                                                    1.0            1.0                       
  ├─ kiaransalee                                                             (33%) 2.0      (33%) 2.0          6.0      4.0 
  │  ├─ jupyter-sense-2damid-2dmadness                                             1.0            1.0                       
  │  └─ tgillm-pod-689c99c6d7-9jjkz                                                1.0            1.0                       
  ├─ tiamat                                                                  (50%) 2.0      (50%) 2.0          4.0      2.0 
  │  ├─ coolchic-gpu-run-perceptive-set3                                           1.0            1.0                       
  │  └─ ledavio-text-search-7bcd8c4885-vn2qs                                       1.0            1.0                       
  ├─ vecna                                                                    (0%) 0.0       (0%) 0.0         16.0     16.0 
  └─ zariel                                                                  (75%) 6.0      (75%) 6.0          8.0      2.0 
     ├─ allin1                                                                     2.0            2.0                       
     ├─ oliver-wiedemann-pod                                                       2.0            2.0                       
     ├─ ubuntu-gpu1                                                                1.0            1.0                       
     └─ ulas-bingoel-model-pod-4                                                   1.0            1.0                       
  nvidia.com/mig-1g.10gb                                                      (0%) 0.0       (0%) 0.0          7.0      7.0 
  └─ kiaransalee                                                              (0%) 0.0       (0%) 0.0          7.0      7.0 
  nvidia.com/mig-3g.40gb                                                      (0%) 0.0       (0%) 0.0          1.0      1.0 
  └─ kiaransalee                                                              (0%) 0.0       (0%) 0.0          1.0      1.0 
  nvidia.com/mig-4g.40gb                                                    (100%) 1.0     (100%) 1.0          1.0      0.0 
  └─ kiaransalee                                                            (100%) 1.0     (100%) 1.0          1.0      0.0 
     └─ wave-pod                                                                   1.0            1.0                       
  pods                                                                     (10%) 155.0    (10%) 155.0         1.5k     1.4k 
  ├─ asmodeus                                                                 (5%) 6.0       (5%) 6.0        110.0    104.0 
  │  ├─ asmodeus-three-gpu-julian-schaefer-zimmermann                              1.0            1.0                       
  │  ├─ dnsutils-asmodeus                                                          1.0            1.0                       
  │  ├─ kube-proxy-rbzdb                                                           1.0            1.0                       
  │  ├─ kube-router-dnkrb                                                          1.0            1.0                       
  │  ├─ nvidia-device-plugin-daemonset-lrkw7                                       1.0            1.0                       
  │  └─ step16-coolchic-gpu-run-asmodeus-a100-hq                                   1.0            1.0                       
  ├─ beholder01                                                               (6%) 7.0       (6%) 7.0        110.0    103.0 
  │  ├─ coredns-55cb58b774-ts944                                                   1.0            1.0                       
  │  ├─ dnsutils-beholder01                                                        1.0            1.0                       
  │  ├─ kube-apiserver-beholder01                                                  1.0            1.0                       
  │  ├─ kube-controller-manager-beholder01                                         1.0            1.0                       
  │  ├─ kube-proxy-7f5v4                                                           1.0            1.0                       
  │  ├─ kube-router-x6f6z                                                          1.0            1.0                       
  │  └─ kube-scheduler-beholder01                                                  1.0            1.0                       
  ├─ beholder02                                                               (7%) 8.0       (7%) 8.0        110.0    102.0 
  │  ├─ dnsutils-beholder02                                                        1.0            1.0                       
  │  ├─ kube-apiserver-beholder02                                                  1.0            1.0                       
  │  ├─ kube-controller-manager-beholder02                                         1.0            1.0                       
  │  ├─ kube-proxy-dlc5n                                                           1.0            1.0                       
  │  ├─ kube-router-q5qj6                                                          1.0            1.0                       
  │  ├─ kube-scheduler-beholder02                                                  1.0            1.0                       
  │  ├─ wordpress-8c8944c8d-cxbrq                                                  1.0            1.0                       
  │  └─ wordpress-mariadb-565c547dc8-59fhr                                         1.0            1.0                       
  ├─ beholder03                                                               (5%) 6.0       (5%) 6.0        110.0    104.0 
  │  ├─ dnsutils-beholder03                                                        1.0            1.0                       
  │  ├─ kube-apiserver-beholder03                                                  1.0            1.0                       
  │  ├─ kube-controller-manager-beholder03                                         1.0            1.0                       
  │  ├─ kube-proxy-jqcj7                                                           1.0            1.0                       
  │  ├─ kube-router-nkl22                                                          1.0            1.0                       
  │  └─ kube-scheduler-beholder03                                                  1.0            1.0                       
  ├─ belial                                                                 (17%) 19.0     (17%) 19.0        110.0     91.0 
  │  ├─ dnsutils-belial                                                            1.0            1.0                       
  │  ├─ gpu-feature-discovery-64knd                                                1.0            1.0                       
  │  ├─ kube-proxy-nsltd                                                           1.0            1.0                       
  │  ├─ kube-router-fh7nx                                                          1.0            1.0                       
  │  ├─ ledavio-similarity-search                                                  1.0            1.0                       
  │  ├─ nvidia-device-plugin-daemonset-kn6h4                                       1.0            1.0                       
  │  ├─ till-aust-inceptions-models-job-hgvwj                                      1.0            1.0                       
  │  ├─ till-aust-lc-11-20-all-all-2-ppot-accuracy-job-bl9xc                       1.0            1.0                       
  │  ├─ till-aust-lc-15-20-all-all-2-ppot-accuracy-job-qnnvk                       1.0            1.0                       
  │  ├─ till-aust-lc-16-10-all-all-3-ppot-accuracy-job-546wc                       1.0            1.0                       
  │  ├─ till-aust-lc-2-10-all-all-2-ppot-accuracy-job-r9vth                        1.0            1.0                       
  │  ├─ till-aust-lc-2-60-all-all-3-duration-estimate-accuracy-jobd2wvl            1.0            1.0                       
  │  ├─ till-aust-lc-3-5-all-all-3-duration-estimate-accuracy-job-m6q28            1.0            1.0                       
  │  ├─ till-aust-lc-5-10-all-all-3-ppot-accuracy-job-j2qph                        1.0            1.0                       
  │  ├─ till-aust-lc-5-20-all-all-3-ppot-accuracy-job-c4tjl                        1.0            1.0                       
  │  ├─ till-aust-lc-5-45-all-all-2-ppot-accuracy-job-gdq6m                        1.0            1.0                       
  │  ├─ till-aust-lc-6-10-all-all-2-ppot-accuracy-job-n586b                        1.0            1.0                       
  │  ├─ till-aust-lc-6-20-all-all-3-ppot-accuracy-job-8njqd                        1.0            1.0                       
  │  └─ till-aust-lc-7-2-all-all-2-ppot-accuracy-job-s9thx                         1.0            1.0                       
  ├─ demogorgon                                                               (7%) 8.0       (7%) 8.0        110.0    102.0 
  │  ├─ a2v2-two-gpu-jcsz                                                          1.0            1.0                       
  │  ├─ demogorgon-a2v2                                                            1.0            1.0                       
  │  ├─ dnsutils-demogorgon                                                        1.0            1.0                       
  │  ├─ gpu-demogorgon                                                             1.0            1.0                       
  │  ├─ kube-proxy-gknqq                                                           1.0            1.0                       
  │  ├─ kube-router-shv9v                                                          1.0            1.0                       
  │  ├─ nvidia-device-plugin-daemonset-wcsbj                                       1.0            1.0                       
  │  └─ pycharm                                                                    1.0            1.0                       
  ├─ fierna                                                                 (18%) 20.0     (18%) 20.0        110.0     90.0 
  │  ├─ ccu-deepseek-low                                                           1.0            1.0                       
  │  ├─ detect-20251217-071751-94-thsw4                                            1.0            1.0                       
  │  ├─ detect-20251217-071751-95-lqrhm                                            1.0            1.0                       
  │  ├─ detect-20251217-071751-96-zmb5z                                            1.0            1.0                       
  │  ├─ detect-20251217-071751-97-2ftpq                                            1.0            1.0                       
  │  ├─ dnsutils-fierna                                                            1.0            1.0                       
  │  ├─ echo1-77fbfb54d-bmf98                                                      1.0            1.0                       
  │  ├─ echo1-77fbfb54d-k6nd2                                                      1.0            1.0                       
  │  ├─ echo2-5d58759df-8vfxs                                                      1.0            1.0                       
  │  ├─ gatekeeper-audit-59d4b6fd4c-kcshh                                          1.0            1.0                       
  │  ├─ gatekeeper-controller-manager-66f474f785-65q6j                             1.0            1.0                       
  │  ├─ gpu-feature-discovery-nwnmt                                                1.0            1.0                       
  │  ├─ gpu-pod                                                                    1.0            1.0                       
  │  ├─ kube-proxy-9tg4f                                                           1.0            1.0                       
  │  ├─ kube-router-z5hzc                                                          1.0            1.0                       
  │  ├─ kube-state-metrics-8945855d-zhjd9                                          1.0            1.0                       
  │  ├─ nvidia-device-plugin-daemonset-r5pcj                                       1.0            1.0                       
  │  ├─ proxy-5bc89cc587-gqgfw                                                     1.0            1.0                       
  │  ├─ user-scheduler-5cf5ffbc54-82462                                            1.0            1.0                       
  │  └─ user-scheduler-c7db6c584-gsfr2                                             1.0            1.0                       
  ├─ kiaransalee                                                            (10%) 11.0     (10%) 11.0        110.0     99.0 
  │  ├─ bash-pod                                                                   1.0            1.0                       
  │  ├─ continuous-image-puller-6fs4k                                              1.0            1.0                       
  │  ├─ continuous-image-puller-6z8bj                                              1.0            1.0                       
  │  ├─ dnsutils-kiaransalee                                                       1.0            1.0                       
  │  ├─ gpu-feature-discovery-znz6m                                                1.0            1.0                       
  │  ├─ jupyter-sense-2damid-2dmadness                                             1.0            1.0                       
  │  ├─ kube-proxy-65675                                                           1.0            1.0                       
  │  ├─ kube-router-89zrt                                                          1.0            1.0                       
  │  ├─ nvidia-device-plugin-daemonset-wj2wf                                       1.0            1.0                       
  │  ├─ tgillm-pod-689c99c6d7-9jjkz                                                1.0            1.0                       
  │  └─ wave-pod                                                                   1.0            1.0                       
  ├─ mindflayer01                                                           (23%) 25.0     (23%) 25.0        110.0     85.0 
  │  ├─ coolchic-editor                                                            1.0            1.0                       
  │  ├─ dex-7b88c8985f-5zqcd                                                       1.0            1.0                       
  │  ├─ dex-loginapp-779c77c986-wjxqs                                              1.0            1.0                       
  │  ├─ dex-mysql-66b5885f7b-2wdw8                                                 1.0            1.0                       
  │  ├─ dnsutils-mindflayer01                                                      1.0            1.0                       
  │  ├─ gatekeeper-controller-manager-66f474f785-lxl5x                             1.0            1.0                       
  │  ├─ kube-proxy-d2xl4                                                           1.0            1.0                       
  │  ├─ kube-router-rzwmw                                                          1.0            1.0                       
  │  ├─ local-path-provisioner-759479454f-jxl54                                    1.0            1.0                       
  │  ├─ mariadb-849498cb44-dldtm                                                   1.0            1.0                       
  │  ├─ mediawiki-68594fd995-2k4fp                                                 1.0            1.0                       
  │  ├─ mediawiki-mariadb-5c7dbf6b85-9wwp8                                         1.0            1.0                       
  │  ├─ memcached-578474d6f9-8dgb8                                                 1.0            1.0                       
  │  ├─ nginx-frontend-54d6db9d5c-vnhwd                                            1.0            1.0                       
  │  ├─ nginx-ip-2023-78b9c84dbf-znhrx                                             1.0            1.0                       
  │  ├─ nginx-k8s-7c8d949b5f-grwnx                                                 1.0            1.0                       
  │  ├─ nginx-rec-2023-79df8c77cd-gg87f                                            1.0            1.0                       
  │  ├─ nginx-rsn-2024-7dbc6b668b-jbjk7                                            1.0            1.0                       
  │  ├─ nginx-self-service-password-65b44f7547-dhvwp                               1.0            1.0                       
  │  ├─ nvidia-device-plugin-daemonset-5gmd7                                       1.0            1.0                       
  │  ├─ pdf-b647544f-snvmm                                                         1.0            1.0                       
  │  ├─ prometheus-deployment-b65d5d898-t2z9l                                      1.0            1.0                       
  │  ├─ registry-57fb9f57f4-vplmf                                                  1.0            1.0                       
  │  ├─ registry-auth-7b4598cd74-qctwp                                             1.0            1.0                       
  │  └─ temp-pod                                                                   1.0            1.0                       
  ├─ mindflayer02                                                             (5%) 6.0       (5%) 6.0        110.0    104.0 
  │  ├─ cool-pod                                                                   1.0            1.0                       
  │  ├─ dnsutils-mindflayer02                                                      1.0            1.0                       
  │  ├─ kube-proxy-zbwk4                                                           1.0            1.0                       
  │  ├─ kube-router-sbdv9                                                          1.0            1.0                       
  │  ├─ nvidia-device-plugin-daemonset-7wv2j                                       1.0            1.0                       
  │  └─ phpfpm-nginx-5db96d6895-97f2g                                              1.0            1.0                       
  ├─ mindflayer03                                                           (17%) 19.0     (17%) 19.0        110.0     91.0 
  │  ├─ check-data-pod                                                             1.0            1.0                       
  │  ├─ coredns-55cb58b774-4kpwk                                                   1.0            1.0                       
  │  ├─ dnsutils-mindflayer03                                                      1.0            1.0                       
  │  ├─ echo1-77fbfb54d-8bnp6                                                      1.0            1.0                       
  │  ├─ echo1-77fbfb54d-rcwzd                                                      1.0            1.0                       
  │  ├─ gatekeeper-controller-manager-66f474f785-5gv4n                             1.0            1.0                       
  │  ├─ hub-5c779f8d7c-d26wb                                                       1.0            1.0                       
  │  ├─ hub-78d6dd898d-8wwbq                                                       1.0            1.0                       
  │  ├─ kube-proxy-b68xq                                                           1.0            1.0                       
  │  ├─ kube-router-q46sc                                                          1.0            1.0                       
  │  ├─ ldap-6987986dbc-2zr2c                                                      1.0            1.0                       
  │  ├─ nginx-ip-2025-5888cccd9-fmnh8                                              1.0            1.0                       
  │  ├─ nvidia-device-plugin-daemonset-7k6p6                                       1.0            1.0                       
  │  ├─ proxy-5495d795d5-vx2ld                                                     1.0            1.0                       
  │  ├─ proxy-7f79cc645f-gj2ld                                                     1.0            1.0                       
  │  ├─ system-registry-8877cd57-59pzr                                             1.0            1.0                       
  │  ├─ user-registry-64bb8ff7cf-5cktq                                             1.0            1.0                       
  │  ├─ user-scheduler-5cf5ffbc54-gkwvq                                            1.0            1.0                       
  │  └─ user-scheduler-c7db6c584-4xdvn                                             1.0            1.0                       
  ├─ tiamat                                                                   (7%) 8.0       (7%) 8.0        110.0    102.0 
  │  ├─ continuous-image-puller-wtkhd                                              1.0            1.0                       
  │  ├─ coolchic-gpu-run-perceptive-set3                                           1.0            1.0                       
  │  ├─ dnsutils-tiamat                                                            1.0            1.0                       
  │  ├─ gpu-feature-discovery-5qb95                                                1.0            1.0                       
  │  ├─ kube-proxy-nbjvl                                                           1.0            1.0                       
  │  ├─ kube-router-zc5x6                                                          1.0            1.0                       
  │  ├─ ledavio-text-search-7bcd8c4885-vn2qs                                       1.0            1.0                       
  │  └─ nvidia-device-plugin-daemonset-9b4tb                                       1.0            1.0                       
  ├─ vecna                                                                    (4%) 4.0       (4%) 4.0        110.0    106.0 
  │  ├─ dnsutils-vecna                                                             1.0            1.0                       
  │  ├─ kube-proxy-djtx4                                                           1.0            1.0                       
  │  ├─ kube-router-bsdm4                                                          1.0            1.0                       
  │  └─ nvidia-device-plugin-daemonset-qln9t                                       1.0            1.0                       
  └─ zariel                                                                   (7%) 8.0       (7%) 8.0        110.0    102.0 
     ├─ allin1                                                                     1.0            1.0                       
     ├─ dnsutils-zariel                                                            1.0            1.0                       
     ├─ kube-proxy-c8frf                                                           1.0            1.0                       
     ├─ kube-router-svsrt                                                          1.0            1.0                       
     ├─ nvidia-device-plugin-daemonset-djsrl                                       1.0            1.0                       
     ├─ oliver-wiedemann-pod                                                       1.0            1.0                       
     ├─ ubuntu-gpu1                                                                1.0            1.0                       
     └─ ulas-bingoel-model-pod-4                                                   1.0            1.0                       
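
Note: the "pods" totals above can be cross-checked against the per-node rows. The sketch below is only an illustration, not part of the tool that generated this report. It copies the per-node Requested values from the table and assumes that Free is computed as Allocatable minus Requested and that percentages are rounded to the nearest integer. The name `summarize` is hypothetical.

```python
# Per-node pod counts copied from the "pods" rows of the report above.
pods_requested = {
    "asmodeus": 6, "beholder01": 7, "beholder02": 8, "beholder03": 6,
    "belial": 19, "demogorgon": 8, "fierna": 20, "kiaransalee": 11,
    "mindflayer01": 25, "mindflayer02": 6, "mindflayer03": 19,
    "tiamat": 8, "vecna": 4, "zariel": 8,
}

# Every node row in the report lists 110.0 allocatable pods.
ALLOCATABLE_PER_NODE = 110


def summarize(requested, allocatable_per_node):
    """Return (requested, allocatable, free, percent) for the whole cluster.

    Assumption: free = allocatable - requested, percent rounded to nearest.
    """
    total_requested = sum(requested.values())
    total_allocatable = allocatable_per_node * len(requested)
    free = total_allocatable - total_requested
    percent = round(100 * total_requested / total_allocatable)
    return total_requested, total_allocatable, free, percent


total, allocatable, free, percent = summarize(pods_requested, ALLOCATABLE_PER_NODE)
print(total, allocatable, free, percent)
```

Under these assumptions the result is 155 requested of 1540 allocatable with 1385 free (10%), which is consistent with the abbreviated 1.5k and 1.4k shown in the "pods" summary row.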




Resource usage (by namespace)

 Resource                       Requested    Limit  Allocatable  Free 
  auth                                                                
  ├─ mindflayer01                                                     
  │  ├─ cpu                        650.0m      0.0                    
  │  └─ pods                          5.0      5.0                    
  └─ mindflayer03                                                     
     ├─ cpu                        250.0m      0.0                    
     └─ pods                          1.0      1.0                    
  frontend                            4.0      4.0                    
  ├─ fierna                           1.0      1.0                    
  │  └─ pods                          1.0      1.0                    
  ├─ mindflayer01                     2.0      2.0                    
  │  └─ pods                          2.0      2.0                    
  └─ mindflayer03                     1.0      1.0                    
     └─ pods                          1.0      1.0                    
  gatekeeper-system                                                   
  ├─ fierna                                                           
  │  ├─ cpu                        200.0m      2.0                    
  │  ├─ memory                    512.0Mi    1.0Gi                    
  │  └─ pods                          2.0      2.0                    
  ├─ mindflayer01                                                     
  │  ├─ cpu                        100.0m      1.0                    
  │  ├─ memory                    256.0Mi  512.0Mi                    
  │  └─ pods                          1.0      1.0                    
  └─ mindflayer03                                                     
     ├─ cpu                        100.0m      1.0                    
     ├─ memory                    256.0Mi  512.0Mi                    
     └─ pods                          1.0      1.0                    
  gcpr-vmv-2022                                                       
  └─ beholder02                                                       
     ├─ cpu                           1.0      2.0                    
     ├─ memory                    512.0Mi    1.0Gi                    
     └─ pods                          2.0      2.0                    
  jupyterhub                                                          
  ├─ fierna                           2.0      2.0                    
  │  └─ pods                          2.0      2.0                    
  ├─ kiaransalee                                                      
  │  ├─ memory                       1.1G      0.0                    
  │  ├─ nvidia.com/gpu                1.0      1.0                    
  │  └─ pods                          2.0      2.0                    
  └─ mindflayer03                     2.0      2.0                    
     └─ pods                          2.0      2.0                    
  jupyterhub-kuckling                 2.0      2.0                    
  ├─ mindflayer03                     1.0      1.0                    
  │  └─ pods                          1.0      1.0                    
  └─ tiamat                           1.0      1.0                    
     └─ pods                          1.0      1.0                    
  jupyterhub-students                 5.0      5.0                    
  ├─ fierna                           1.0      1.0                    
  │  └─ pods                          1.0      1.0                    
  ├─ kiaransalee                      1.0      1.0                    
  │  └─ pods                          1.0      1.0                    
  └─ mindflayer03                     3.0      3.0                    
     └─ pods                          3.0      3.0                    
  kube-system                                                         
  ├─ asmodeus                                                         
  │  ├─ cpu                        350.0m   100.0m                    
  │  ├─ memory                    350.0Mi  100.0Mi                    
  │  └─ pods                          4.0      4.0                    
  ├─ beholder01                                                       
  │  ├─ cpu                           1.0   100.0m                    
  │  ├─ memory                    420.0Mi  270.0Mi                    
  │  └─ pods                          7.0      7.0                    
  ├─ beholder02                                                       
  │  ├─ cpu                        900.0m   100.0m                    
  │  ├─ memory                    350.0Mi  100.0Mi                    
  │  └─ pods                          6.0      6.0                    
  ├─ beholder03                                                       
  │  ├─ cpu                        900.0m   100.0m                    
  │  ├─ memory                    350.0Mi  100.0Mi                    
  │  └─ pods                          6.0      6.0                    
  ├─ belial                                                           
  │  ├─ cpu                        350.0m   100.0m                    
  │  ├─ memory                    350.0Mi  100.0Mi                    
  │  └─ pods                          5.0      5.0                    
  ├─ demogorgon                                                       
  │  ├─ cpu                        350.0m   100.0m                    
  │  ├─ memory                    350.0Mi  100.0Mi                    
  │  └─ pods                          4.0      4.0                    
  ├─ fierna                                                           
  │  ├─ cpu                        350.0m   100.0m                    
  │  ├─ memory                    350.0Mi  100.0Mi                    
  │  └─ pods                          5.0      5.0                    
  ├─ kiaransalee                                                      
  │  ├─ cpu                        350.0m   100.0m                    
  │  ├─ memory                    350.0Mi  100.0Mi                    
  │  └─ pods                          5.0      5.0                    
  ├─ mindflayer01                                                     
  │  ├─ cpu                        350.0m   100.0m                    
  │  ├─ memory                    350.0Mi  100.0Mi                    
  │  └─ pods                          4.0      4.0                    
  ├─ mindflayer02                                                     
  │  ├─ cpu                        350.0m   100.0m                    
  │  ├─ memory                    350.0Mi  100.0Mi                    
  │  └─ pods                          4.0      4.0                    
  ├─ mindflayer03                                                     
  │  ├─ cpu                        450.0m   100.0m                    
  │  ├─ memory                    420.0Mi  270.0Mi                    
  │  └─ pods                          5.0      5.0                    
  ├─ tiamat                                                           
  │  ├─ cpu                        350.0m   100.0m                    
  │  ├─ memory                    350.0Mi  100.0Mi                    
  │  └─ pods                          5.0      5.0                    
  ├─ vecna                                                            
  │  ├─ cpu                        350.0m   100.0m                    
  │  ├─ memory                    350.0Mi  100.0Mi                    
  │  └─ pods                          4.0      4.0                    
  └─ zariel                                                           
     ├─ cpu                        350.0m   100.0m                    
     ├─ memory                    350.0Mi  100.0Mi                    
     └─ pods                          4.0      4.0                    
  local-path-storage                  1.0      1.0                    
  └─ mindflayer01                     1.0      1.0                    
     └─ pods                          1.0      1.0                    
  monitoring                                                          
  ├─ fierna                           1.0      1.0                    
  │  └─ pods                          1.0      1.0                    
  └─ mindflayer01                                                     
     ├─ cpu                        500.0m      1.0                    
     ├─ memory                     500.0M    1.0Gi                    
     └─ pods                          1.0      1.0                    
  registry                                                            
  └─ mindflayer03                                                     
     ├─ cpu                        200.0m      0.0                    
     └─ pods                          2.0      2.0                    
  testing                             3.0      3.0                    
  ├─ fierna                           2.0      2.0                    
  │  └─ pods                          2.0      2.0                    
  └─ mindflayer03                     1.0      1.0                    
     └─ pods                          1.0      1.0                    
  user-adwait-deshpande                                               
  ├─ mindflayer01                                                     
  │  ├─ cpu                        100.0m      1.0                    
  │  ├─ memory                    100.0Mi    1.0Gi                    
  │  └─ pods                          1.0      1.0                    
  └─ mindflayer03                                                     
     ├─ cpu                        100.0m   500.0m                    
     ├─ memory                    100.0Mi  500.0Mi                    
     └─ pods                          1.0      1.0                    
  user-alex-chan                                                      
  └─ mindflayer02                                                     
     ├─ cpu                          10.0     16.0                    
     ├─ memory                     64.0Gi   64.0Gi                    
     └─ pods                          1.0      1.0                    
  user-andri-rutschmann                                               
  └─ kiaransalee                                                      
     ├─ cpu                          24.0     24.0                    
     ├─ memory                     32.0Gi   32.0Gi                    
     ├─ nvidia.com/mig-4g.40gb        1.0      1.0                    
     └─ pods                          2.0      2.0                    
  user-celine-angonin                                                 
  └─ demogorgon                                                       
     ├─ cpu                          15.0     15.0                    
     ├─ memory                    395.0Gi  395.0Gi                    
     ├─ nvidia.com/gpu                4.0      4.0                    
     └─ pods                          1.0      1.0                    
  user-daniel-calovi                                                  
  └─ fierna                                                           
     ├─ cpu                          10.0     10.0                    
     ├─ memory                    160.0Gi  160.0Gi                    
     ├─ nvidia.com/gpu                1.0      1.0                    
     └─ pods                          1.0      1.0                    
  user-eduard-buss                                                    
  └─ demogorgon                                                       
     ├─ cpu                           1.0     32.0                    
     ├─ ephemeral-storage          15.0Gi   30.0Gi                    
     ├─ memory                     10.0Gi   32.0Gi                    
     ├─ nvidia.com/gpu                1.0      1.0                    
     └─ pods                          1.0      1.0                    
  user-jacob-davidson                                                 
  └─ fierna                                                           
     ├─ cpu                           8.0     12.0                    
     ├─ memory                     40.0Gi   48.0Gi                    
     ├─ nvidia.com/gpu                4.0      4.0                    
     └─ pods                          4.0      4.0                    
  user-julian-jandeleit                                               
  └─ demogorgon                                                       
     ├─ cpu                          48.0     48.0                    
     ├─ memory                     80.0Gi   80.0Gi                    
     ├─ nvidia.com/gpu                1.0      1.0                    
     └─ pods                          1.0      1.0                    
  user-julian-zimmermann                                              
  ├─ asmodeus                                                         
  │  ├─ cpu                         225.0    225.0                    
  │  ├─ memory                      1.5Ti    1.5Ti                    
  │  ├─ nvidia.com/gpu                3.0      3.0                    
  │  └─ pods                          1.0      1.0                    
  └─ demogorgon                                                       
     ├─ cpu                           8.0      8.0                    
     ├─ memory                      1.0Ti    1.0Ti                    
     ├─ nvidia.com/gpu                2.0      2.0                    
     └─ pods                          1.0      1.0                    
  user-maya-dagher                                                    
  └─ fierna                                                           
     ├─ cpu                           1.0      1.0                    
     ├─ memory                     10.0Gi   10.0Gi                    
     ├─ nvidia.com/gpu                1.0      1.0                    
     └─ pods                          1.0      1.0                    
  user-mike-battistella                                               
  ├─ belial                                                           
  │  ├─ cpu                           1.0      1.0                    
  │  ├─ memory                     32.0Gi   32.0Gi                    
  │  ├─ nvidia.com/gpu                1.0      1.0                    
  │  └─ pods                          1.0      1.0                    
  └─ tiamat                                                           
     ├─ cpu                           1.0      2.0                    
     ├─ memory                     32.0Gi   32.0Gi                    
     ├─ nvidia.com/gpu                1.0      1.0                    
     └─ pods                          1.0      1.0                    
  user-mohsen-jenadeleh                                               
  ├─ asmodeus                                                         
  │  ├─ cpu                          20.0     20.0                    
  │  ├─ memory                    100.0Gi  100.0Gi                    
  │  ├─ nvidia.com/gpu                1.0      1.0                    
  │  └─ pods                          1.0      1.0                    
  ├─ mindflayer01                                                     
  │  ├─ cpu                        500.0m   500.0m                    
  │  ├─ memory                      1.0Gi    1.0Gi                    
  │  └─ pods                          1.0      1.0                    
  └─ tiamat                                                           
     ├─ cpu                          40.0     40.0                    
     ├─ memory                     40.0Gi   40.0Gi                    
     ├─ nvidia.com/gpu                1.0      1.0                    
     └─ pods                          1.0      1.0                    
  user-oliver-wiedemann                                               
  └─ zariel                                                           
     ├─ cpu                          32.0     64.0                    
     ├─ memory                    400.0Gi  400.0Gi                    
     ├─ nvidia.com/gpu                2.0      2.0                    
     └─ pods                          1.0      1.0                    
  user-segun-aroyehun                                                 
  └─ zariel                                                           
     ├─ cpu                          50.0    100.0                    
     ├─ ephemeral-storage         100.0Gi  150.0Gi                    
     ├─ memory                    250.0Gi  400.0Gi                    
     ├─ nvidia.com/gpu                1.0      1.0                    
     └─ pods                          1.0      1.0                    
  user-sifei-li                                                       
  └─ zariel                                                           
     ├─ cpu                          64.0     96.0                    
     ├─ memory                     64.0Gi   80.0Gi                    
     ├─ nvidia.com/gpu                2.0      2.0                    
     └─ pods                          1.0      1.0                    
  user-stephen-tyndel                                                 
  └─ kiaransalee                                                      
     ├─ cpu                          12.0     12.0                    
     ├─ memory                     80.0Gi   80.0Gi                    
     ├─ nvidia.com/gpu                1.0      1.0                    
     └─ pods                          1.0      1.0                    
  user-till-aust                                                      
  └─ belial                                                           
     ├─ cpu                          56.0     56.0                    
     ├─ memory                    320.0Gi  450.0Gi                    
     ├─ nvidia.com/gpu                1.0      1.0                    
     └─ pods                         12.0     12.0                    
  user-ulas-bingoel                                                   
  └─ zariel                                                           
     ├─ cpu                           8.0      8.0                    
     ├─ memory                     40.0Gi   40.0Gi                    
     ├─ nvidia.com/gpu                1.0      1.0                    
     └─ pods                          1.0      1.0                    
  user-v-time                                                         
  ├─ mindflayer01                                                     
  │  ├─ cpu                        500.0m      1.0                    
  │  ├─ memory                    256.0Mi  512.0Mi                    
  │  └─ pods                          1.0      1.0                    
  └─ mindflayer02                                                     
     ├─ cpu                           1.0      2.0                    
     ├─ memory                    512.0Mi    1.0Gi                    
     └─ pods                          1.0      1.0                    
  web                                                                 
  ├─ mindflayer01                                                     
  │  ├─ cpu                        500.0m      0.0                    
  │  └─ pods                          8.0      8.0                    
  └─ mindflayer03                     1.0      1.0                    
     └─ pods                          1.0      1.0                    
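
The quantities in the tree above mix decimal and binary suffixes (for example `500.0M` next to `1.0Gi` under monitoring/mindflayer01, and `500.0m` for CPU). A minimal sketch of converting them to base units for comparison follows; it assumes the usual Kubernetes suffix meanings, and the function name `parse_quantity` is hypothetical, not part of the tool that produced this report.

```python
# Multipliers for the suffixes that appear in this report (assumed to follow
# Kubernetes quantity conventions: lower-case m = milli, Ki/Mi/Gi/Ti = powers of 1024).
SUFFIXES = {
    "m": 1e-3,
    "k": 1e3, "M": 1e6, "G": 1e9, "T": 1e12,
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
}


def parse_quantity(text: str) -> float:
    """Convert a string such as '500.0m' or '1.5Ti' to a plain number."""
    text = text.strip()
    # Try two-letter suffixes first so 'Mi' is never mistaken for a shorter one.
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if text.endswith(suffix):
            return float(text[: -len(suffix)]) * SUFFIXES[suffix]
    return float(text)  # no suffix: already in base units


if __name__ == "__main__":
    # 500.0M (decimal) is slightly smaller than 500.0Mi (binary).
    print(parse_quantity("500.0M"), parse_quantity("500.0Mi"))
    print(parse_quantity("500.0m"), parse_quantity("1.0Gi"))
```

With this, a request of `500.0M` and one of `500.0Mi` can be compared directly instead of by eye.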




Ceph file system status

  cluster:
    id:     3fee6f38-ba9f-11ec-9328-e188936dcafd
    health: HEALTH_OK
 
  services:
    mon: 5 daemons, quorum beholder03,beholder01,beholder02,mindflayer02,mindflayer03 (age 5d)
    mgr: beholder01.verxwn(active, since 11M), standbys: beholder03.nprqzk, mindflayer03.rzdvrr, mindflayer01.mkuopd, mindflayer02.ympgrs, beholder02.akktmp
    mds: 4/4 daemons up, 2 standby
    osd: 24 osds: 24 up (since 11M), 24 in (since 18M)
 
  data:
    volumes: 1/1 healthy
    pools:   3 pools, 545 pgs
    objects: 72.44M objects, 96 TiB
    usage:   193 TiB used, 89 TiB / 282 TiB avail
    pgs:     544 active+clean
             1   active+clean+scrubbing+deep
 
  io:
    client:   38 KiB/s wr, 0 op/s rd, 4 op/s wr
 
HEALTH_OK
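
The usage line above gives raw capacity only as absolute figures. A minimal sketch of turning it into a utilisation percentage follows; the function name `ceph_usage_percent` is hypothetical, and the pattern assumes the line is reported in TiB exactly as it appears here.

```python
import re


def ceph_usage_percent(usage_line: str) -> float:
    """Return used / total raw capacity as a percentage.

    Expects a line shaped like the one in this report:
    'usage:   193 TiB used, 89 TiB / 282 TiB avail'
    """
    match = re.search(
        r"([\d.]+) TiB used, ([\d.]+) TiB / ([\d.]+) TiB avail", usage_line
    )
    if match is None:
        raise ValueError("unrecognised usage line: " + usage_line)
    used, _available, total = (float(value) for value in match.groups())
    return 100.0 * used / total


if __name__ == "__main__":
    line = "usage:   193 TiB used, 89 TiB / 282 TiB avail"
    print(round(ceph_usage_percent(line), 1))  # 68.4
```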




Etcd cluster status

+------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|     ENDPOINT     |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| 192.168.1.1:4252 | 7126e7a3a9cc42ca |  3.4.30 |  156 MB |     false |      false |    302055 |  684215935 |          684215935 |        |
| 192.168.1.2:4252 | 39d72894bf6c7600 |  3.4.30 |  156 MB |     false |      false |    302055 |  684215935 |          684215935 |        |
| 192.168.1.3:4252 | bbf4a2b99c3fd692 |  3.4.30 |  156 MB |      true |      false |    302055 |  684215935 |          684215935 |        |
| 192.168.2.1:4252 | 5cb9997dd1c2246b |  3.3.25 |  156 MB |     false |      false |    302055 |  684215935 |                  0 |        |
| 192.168.2.3:4252 |  cbc1cf89959ea4e |  3.3.25 |  156 MB |     false |      false |    302055 |  684215935 |                  0 |        |
+------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
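
Two members in the table above report a different version from the leader and a RAFT APPLIED INDEX of 0. A minimal sketch of flagging such rows automatically follows. The sample is trimmed to the columns the check uses, with values copied from the report; the helper names are hypothetical. A flagged row is only a prompt to look closer: an applied index of 0 may simply mean an older server does not report that field.

```python
def parse_etcd_table(text: str) -> list:
    """Parse a '|'-delimited status table into a list of dicts keyed by header."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip the +----+ separator lines and blanks
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def flag_members(rows: list) -> dict:
    """Map endpoint -> reasons it differs from the leader or lags the raft index."""
    leader = next(row for row in rows if row["IS LEADER"] == "true")
    flags = {}
    for row in rows:
        reasons = []
        if row["VERSION"] != leader["VERSION"]:
            reasons.append("version " + row["VERSION"] + " differs from leader")
        if int(row["RAFT APPLIED INDEX"]) < int(row["RAFT INDEX"]):
            reasons.append("applied index behind raft index")
        if reasons:
            flags[row["ENDPOINT"]] = reasons
    return flags


SAMPLE = """
+------------------+---------+-----------+------------+--------------------+
|     ENDPOINT     | VERSION | IS LEADER | RAFT INDEX | RAFT APPLIED INDEX |
+------------------+---------+-----------+------------+--------------------+
| 192.168.1.1:4252 |  3.4.30 |     false |  684215935 |          684215935 |
| 192.168.1.3:4252 |  3.4.30 |      true |  684215935 |          684215935 |
| 192.168.2.1:4252 |  3.3.25 |     false |  684215935 |                  0 |
+------------------+---------+-----------+------------+--------------------+
"""

if __name__ == "__main__":
    for endpoint, reasons in flag_members(parse_etcd_table(SAMPLE)).items():
        print(endpoint, "; ".join(reasons))
```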




Detailed network health report

API and web servers

beholder01
SSH port open yes
Report available yes
External interface up ok
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs
local raid mounted /raid
Test pod responding dnsutils-beholder01
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes
beholder02
SSH port open yes
Report available yes
External interface up ok
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs
local raid mounted /raid
Test pod responding dnsutils-beholder02
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes
beholder03
SSH port open yes
Report available yes
External interface up ok
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs
local raid mounted /raid
Test pod responding dnsutils-beholder03
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes




Ceph osd nodes

mindflayer01
SSH port open yes
Report available yes
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs
local raid mounted /raid
Test pod responding dnsutils-mindflayer01
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes
mindflayer02
SSH port open yes
Report available yes
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs
local raid mounted /raid
Test pod responding dnsutils-mindflayer02
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes
mindflayer03
SSH port open yes
Report available yes
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs
local raid mounted /raid
Test pod responding dnsutils-mindflayer03
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes




Compute nodes

vecna
SSH port open NO
Test pod responding dnsutils-vecna
kiaransalee
SSH port open yes
Report available yes
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs/abyss
local raid mounted /raid
Test pod responding dnsutils-kiaransalee
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes
belial
SSH port open yes
Report available yes
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs/abyss
local raid mounted /raid
Test pod responding dnsutils-belial
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes
fierna
SSH port open yes
Report available yes
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs/abyss
local raid mounted /raid
Test pod responding dnsutils-fierna
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes
demogorgon
SSH port open yes
Report available yes
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs/abyss
local raid mounted /raid
Test pod responding dnsutils-demogorgon
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes
tiamat
SSH port open yes
Report available yes
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs/abyss
local raid mounted /raid
Test pod responding dnsutils-tiamat
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes
asmodeus
SSH port open yes
Report available yes
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs/abyss
local raid mounted /raid
Test pod responding dnsutils-asmodeus
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes
zariel
SSH port open yes
Report available yes
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs/abyss
local raid mounted /raid
Test pod responding dnsutils-zariel
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes
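
With this many nodes, a failed check such as the one on vecna is easy to miss when scanning by eye. A minimal sketch of pulling out failing checks per node follows. It assumes that node names are the only single-word lines and that a failed check ends in `NO`, which is the only failure form visible in this report; the function name is hypothetical.

```python
def nodes_with_failures(report: str) -> dict:
    """Map node name -> list of checks whose line ends in ' NO'."""
    failures = {}
    node = None
    for line in report.splitlines():
        line = line.strip()
        if not line:
            continue
        if " " not in line:
            node = line  # a single-word line starts a new node block
            continue
        if node is not None and line.endswith(" NO"):
            failures.setdefault(node, []).append(line[: -len(" NO")])
    return failures


SAMPLE = """
vecna
SSH port open NO
Test pod responding dnsutils-vecna
kiaransalee
SSH port open yes
Report available yes
Infiniband interface up ok
"""

if __name__ == "__main__":
    print(nodes_with_failures(SAMPLE))  # {'vecna': ['SSH port open']}
```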




nVidia Driver and GPU reports

dretch

belial

Wed Dec 17 21:52:57 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 565.57.01              Driver Version: 565.57.01      CUDA Version: 12.7     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Quadro RTX 6000                Off |   00000000:1B:00.0 Off |                  Off |
| 33%   33C    P8              4W /  260W |       1MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  Quadro RTX 6000                Off |   00000000:1C:00.0 Off |                  Off |
| 33%   31C    P8             12W /  260W |     784MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  Quadro RTX 6000                Off |   00000000:1D:00.0 Off |                  Off |
| 33%   32C    P8              4W /  260W |       1MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   3  Quadro RTX 6000                Off |   00000000:1E:00.0 Off |                  Off |
| 33%   32C    P8              5W /  260W |    4552MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   4  Quadro RTX 6000                Off |   00000000:3D:00.0 Off |                  Off |
| 33%   31C    P8             13W /  260W |       1MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   5  Quadro RTX 6000                Off |   00000000:3F:00.0 Off |                  Off |
| 33%   30C    P8              6W /  260W |       1MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   6  Quadro RTX 6000                Off |   00000000:40:00.0 Off |                  Off |
| 33%   39C    P0             60W /  260W |     262MiB /  24576MiB |     12%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   7  Quadro RTX 6000                Off |   00000000:41:00.0 Off |                  Off |
| 33%   32C    P8              4W /  260W |    9064MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    1   N/A  N/A   4044749      C   python                                        780MiB |
|    3   N/A  N/A   1041500      C   python                                       2274MiB |
|    6   N/A  N/A   1778810      C   python                                        258MiB |
|    7   N/A  N/A   2703123      C   python                                       2264MiB |
|    7   N/A  N/A   2703126      C   python                                       2264MiB |
+-----------------------------------------------------------------------------------------+

fierna

Wed Dec 17 21:52:59 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 565.57.01              Driver Version: 565.57.01      CUDA Version: 12.7     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Quadro RTX 6000                Off |   00000000:1B:00.0 Off |                  Off |
| 33%   47C    P0            115W /  260W |    4531MiB /  24576MiB |     35%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  Quadro RTX 6000                Off |   00000000:1C:00.0 Off |                  Off |
| 33%   30C    P8             15W /  260W |       1MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  Quadro RTX 6000                Off |   00000000:1D:00.0 Off |                  Off |
| 33%   32C    P8              4W /  260W |       1MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   3  Quadro RTX 6000                Off |   00000000:1E:00.0 Off |                  Off |
| 33%   31C    P8              5W /  260W |       1MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   4  Quadro RTX 6000                Off |   00000000:3D:00.0 Off |                  Off |
| 33%   38C    P0             98W /  260W |    4559MiB /  24576MiB |     19%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   5  Quadro RTX 6000                Off |   00000000:3F:00.0 Off |                  Off |
| 33%   40C    P0             58W /  260W |    4559MiB /  24576MiB |      6%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   6  Quadro RTX 6000                Off |   00000000:40:00.0 Off |                  Off |
| 33%   40C    P0             63W /  260W |    4553MiB /  24576MiB |     20%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   7  Quadro RTX 6000                Off |   00000000:41:00.0 Off |                  Off |
| 33%   31C    P8             16W /  260W |       1MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A   1797462      C   python                                       2264MiB |
|    0   N/A  N/A   1797463      C   python                                       2264MiB |
|    4   N/A  N/A   3399154      C   python                                       2278MiB |
|    4   N/A  N/A   3399155      C   python                                       2278MiB |
|    5   N/A  N/A   3558893      C   python                                       2278MiB |
|    5   N/A  N/A   3558894      C   python                                       2278MiB |
|    6   N/A  N/A   3644131      C   python                                       2276MiB |
|    6   N/A  N/A   3644132      C   python                                       2274MiB |
+-----------------------------------------------------------------------------------------+

tiamat

Wed Dec 17 21:53:01 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 565.57.01              Driver Version: 565.57.01      CUDA Version: 12.7     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A100-SXM4-40GB          Off |   00000000:01:00.0 Off |                    0 |
| N/A   27C    P0             50W /  400W |       1MiB /  40960MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA A100-SXM4-40GB          Off |   00000000:41:00.0 Off |                    0 |
| N/A   28C    P0             57W /  400W |    4483MiB /  40960MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA A100-SXM4-40GB          Off |   00000000:81:00.0 Off |                    0 |
| N/A   26C    P0             49W /  400W |       1MiB /  40960MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA A100-SXM4-40GB          Off |   00000000:C1:00.0 Off |                    0 |
| N/A   25C    P0             50W /  400W |       1MiB /  40960MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    1   N/A  N/A   2859948      C   python                                       4472MiB |
+-----------------------------------------------------------------------------------------+

vecna

asmodeus

Wed Dec 17 21:53:07 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.86.15              Driver Version: 570.86.15      CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A100-SXM4-80GB          On  |   00000000:01:00.0 Off |                    0 |
| N/A   29C    P0             74W /  500W |     463MiB /  81920MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA A100-SXM4-80GB          On  |   00000000:41:00.0 Off |                    0 |
| N/A   29C    P0             58W /  500W |       1MiB /  81920MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA A100-SXM4-80GB          On  |   00000000:81:00.0 Off |                    0 |
| N/A   27C    P0             60W /  500W |       1MiB /  81920MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA A100-SXM4-80GB          On  |   00000000:C1:00.0 Off |                    0 |
| N/A   26C    P0             60W /  500W |       1MiB /  81920MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A         3800600      C   python                                  452MiB |
+-----------------------------------------------------------------------------------------+

zariel

Wed Dec 17 22:53:09 2025       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.274.02             Driver Version: 535.274.02   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA A100-SXM4-40GB          On  | 00000000:07:00.0 Off |                    0 |
| N/A   30C    P0              54W / 400W |      0MiB / 40960MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA A100-SXM4-40GB          On  | 00000000:0F:00.0 Off |                    0 |
| N/A   32C    P0              73W / 400W |  38828MiB / 40960MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   2  NVIDIA A100-SXM4-40GB          On  | 00000000:47:00.0 Off |                    0 |
| N/A   28C    P0              52W / 400W |      0MiB / 40960MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   3  NVIDIA A100-SXM4-40GB          On  | 00000000:4E:00.0 Off |                    0 |
| N/A   28C    P0              52W / 400W |      0MiB / 40960MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   4  NVIDIA A100-SXM4-40GB          On  | 00000000:87:00.0 Off |                    0 |
| N/A   37C    P0              80W / 400W |  38804MiB / 40960MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   5  NVIDIA A100-SXM4-40GB          On  | 00000000:90:00.0 Off |                    0 |
| N/A   33C    P0              51W / 400W |      0MiB / 40960MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   6  NVIDIA A100-SXM4-40GB          On  | 00000000:B7:00.0 Off |                    0 |
| N/A   31C    P0              50W / 400W |      0MiB / 40960MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   7  NVIDIA A100-SXM4-40GB          On  | 00000000:BD:00.0 Off |                    0 |
| N/A   30C    P0              54W / 400W |      0MiB / 40960MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    1   N/A  N/A   4041701      C   /opt/conda/envs/konx/bin/python           38820MiB |
|    4   N/A  N/A   4041701      C   /opt/conda/envs/konx/bin/python           38796MiB |
+---------------------------------------------------------------------------------------+

demogorgon

Wed Dec 17 21:53:13 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 565.57.01              Driver Version: 565.57.01      CUDA Version: 12.7     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A40                     On  |   00000000:01:00.0 Off |                    0 |
|  0%   29C    P8             23W /  300W |       1MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA A40                     On  |   00000000:25:00.0 Off |                    0 |
|  0%   29C    P8             25W /  300W |       1MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA A40                     On  |   00000000:41:00.0 Off |                    0 |
|  0%   67C    P0            242W /  300W |    8225MiB /  46068MiB |     97%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA A40                     On  |   00000000:61:00.0 Off |                    0 |
|  0%   28C    P8             24W /  300W |       1MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   4  NVIDIA A40                     On  |   00000000:81:00.0 Off |                    0 |
|  0%   28C    P8             24W /  300W |       1MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   5  NVIDIA A40                     On  |   00000000:A1:00.0 Off |                    0 |
|  0%   64C    P0            234W /  300W |    8225MiB /  46068MiB |     99%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   6  NVIDIA A40                     On  |   00000000:C1:00.0 Off |                    0 |
|  0%   69C    P0            238W /  300W |    8305MiB /  46068MiB |     93%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   7  NVIDIA A40                     On  |   00000000:E1:00.0 Off |                    0 |
|  0%   68C    P0            254W /  300W |    8241MiB /  46068MiB |     91%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    2   N/A  N/A   1415817      C   /usr/bin/python                              8216MiB |
|    5   N/A  N/A   1415818      C   /usr/bin/python                              8216MiB |
|    6   N/A  N/A   1415819      C   /usr/bin/python                              8296MiB |
|    7   N/A  N/A   1415820      C   /usr/bin/python                              8232MiB |
+-----------------------------------------------------------------------------------------+

kiaransalee

Wed Dec 17 21:53:16 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.127.08             Driver Version: 550.127.08     CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA H100 80GB HBM3          On  |   00000000:26:00.0 Off |                    0 |
| N/A   31C    P0             98W /  700W |       1MiB /  81559MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA H100 80GB HBM3          On  |   00000000:2F:00.0 Off |                    0 |
| N/A   42C    P0            147W /  700W |   75430MiB /  81559MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA H100 80GB HBM3          On  |   00000000:46:00.0 Off |                    0 |
| N/A   35C    P0             76W /  700W |       1MiB /  81559MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA H100 80GB HBM3          On  |   00000000:54:00.0 Off |                    0 |
| N/A   33C    P0            101W /  700W |       1MiB /  81559MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   4  NVIDIA H100 80GB HBM3          On  |   00000000:A6:00.0 Off |                    0 |
| N/A   51C    P0             85W /  700W |       1MiB /  81559MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   5  NVIDIA H100 80GB HBM3          On  |   00000000:AF:00.0 Off |                    0 |
| N/A   34C    P0             77W /  700W |       1MiB /  81559MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   6  NVIDIA H100 80GB HBM3          On  |   00000000:C6:00.0 Off |                   On |
| N/A   33C    P0             76W /  700W |      89MiB /  81559MiB |     N/A      Default |
|                                         |                        |              Enabled |
+-----------------------------------------+------------------------+----------------------+
|   7  NVIDIA H100 80GB HBM3          On  |   00000000:CF:00.0 Off |                   On |
| N/A   29C    P0             73W /  700W |      90MiB /  81559MiB |     N/A      Default |
|                                         |                        |              Enabled |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| MIG devices:                                                                            |
+------------------+----------------------------------+-----------+-----------------------+
| GPU  GI  CI  MIG |                     Memory-Usage |        Vol|      Shared           |
|      ID  ID  Dev |                       BAR1-Usage | SM     Unc| CE ENC DEC OFA JPG    |
|                  |                                  |        ECC|                       |
|==================+==================================+===========+=======================|
|  6    1   0   0  |              51MiB / 40320MiB    | 64      0 |  4   0    4    0    4 |
|                  |                 0MiB / 65535MiB  |           |                       |
+------------------+----------------------------------+-----------+-----------------------+
|  6    2   0   1  |              38MiB / 40320MiB    | 60      0 |  3   0    3    0    3 |
|                  |                 0MiB / 65535MiB  |           |                       |
+------------------+----------------------------------+-----------+-----------------------+
|  7    7   0   0  |              13MiB /  9984MiB    | 16      0 |  1   0    1    0    1 |
|                  |                 0MiB / 16383MiB  |           |                       |
+------------------+----------------------------------+-----------+-----------------------+
|  7    8   0   1  |              13MiB /  9984MiB    | 16      0 |  1   0    1    0    1 |
|                  |                 0MiB / 16383MiB  |           |                       |
+------------------+----------------------------------+-----------+-----------------------+
|  7    9   0   2  |              13MiB /  9984MiB    | 16      0 |  1   0    1    0    1 |
|                  |                 0MiB / 16383MiB  |           |                       |
+------------------+----------------------------------+-----------+-----------------------+
|  7   11   0   3  |              13MiB /  9984MiB    | 16      0 |  1   0    1    0    1 |
|                  |                 0MiB / 16383MiB  |           |                       |
+------------------+----------------------------------+-----------+-----------------------+
|  7   12   0   4  |              13MiB /  9984MiB    | 16      0 |  1   0    1    0    1 |
|                  |                 0MiB / 16383MiB  |           |                       |
+------------------+----------------------------------+-----------+-----------------------+
|  7   13   0   5  |              13MiB /  9984MiB    | 16      0 |  1   0    1    0    1 |
|                  |                 0MiB / 16383MiB  |           |                       |
+------------------+----------------------------------+-----------+-----------------------+
|  7   14   0   6  |              13MiB /  9984MiB    | 16      0 |  1   0    1    0    1 |
|                  |                 0MiB / 16383MiB  |           |                       |
+------------------+----------------------------------+-----------+-----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    1   N/A  N/A   2188183      C   VLLM::EngineCore                            75420MiB |
+-----------------------------------------------------------------------------------------+