Cluster status report from beholder02, Wed, 02 Jul 2025 13:12:01 +0000

Resource usage (overall)
Resource usage (by namespace)
Ceph file system status
Etcd cluster status
Detailed network health report
NVIDIA Driver and GPU reports
  Imp
  Dretch
  Belial
  Fierna
  Tiamat
  Vecna
  Asmodeus
  Zariel
  Demogorgon

Resource usage (overall)

 Resource                                                    Requested           Limit  Allocatable      Free 
  cpu                                                      (43%) 702.2     (56%) 903.0         1.6k     713.0 
  ├─ asmodeus                                              (98%) 250.3     (98%) 250.1        256.0       5.7 
  │  ├─ asmodeus-all-a2v2-julian-schaefer-zimmermann             250.0           250.0                        
  │  ├─ dnsutils-asmodeus                                       100.0m          100.0m                        
  │  └─ kube-router-4g72z                                       250.0m             0.0                        
  ├─ beholder01                                            (4%) 900.0m     (0%) 100.0m         24.0      23.1 
  │  ├─ dnsutils-beholder01                                     100.0m          100.0m                        
  │  ├─ kube-apiserver-beholder01                               250.0m             0.0                        
  │  ├─ kube-controller-manager-beholder01                      200.0m             0.0                        
  │  ├─ kube-router-pl7hf                                       250.0m             0.0                        
  │  └─ kube-scheduler-beholder01                               100.0m             0.0                        
  ├─ beholder02                                               (8%) 1.9        (9%) 2.1         24.0      21.9 
  │  ├─ dnsutils-beholder02                                     100.0m          100.0m                        
  │  ├─ kube-apiserver-beholder02                               250.0m             0.0                        
  │  ├─ kube-controller-manager-beholder02                      200.0m             0.0                        
  │  ├─ kube-router-vmrf4                                       250.0m             0.0                        
  │  ├─ kube-scheduler-beholder02                               100.0m             0.0                        
  │  ├─ wordpress-8c8944c8d-cxbrq                               500.0m             1.0                        
  │  └─ wordpress-mariadb-565c547dc8-59fhr                      500.0m             1.0                        
  ├─ beholder03                                            (4%) 900.0m     (0%) 100.0m         24.0      23.1 
  │  ├─ dnsutils-beholder03                                     100.0m          100.0m                        
  │  ├─ kube-apiserver-beholder03                               250.0m             0.0                        
  │  ├─ kube-controller-manager-beholder03                      200.0m             0.0                        
  │  ├─ kube-router-8kcvh                                       250.0m             0.0                        
  │  └─ kube-scheduler-beholder03                               100.0m             0.0                        
  ├─ belial                                                 (84%) 67.3    (166%) 133.1         80.0       0.0 
  │  ├─ carmely-nn-standard-hps-job-lstm-2-b5pnk                   4.0             8.0                        
  │  ├─ carmely-nn-standard-hps-job-lstm-2x949                     4.0             8.0                        
  │  ├─ carmely-nn-standard-hps-job-transformer-4dhdr              4.0             8.0                        
  │  ├─ dnsutils-belial                                         100.0m          100.0m                        
  │  ├─ gpu-pod                                                    1.0             1.0                        
  │  ├─ kube-router-swlkw                                       250.0m             0.0                        
  │  ├─ mariya-nn-well-being-classifier-job-8mwrx                  4.0             8.0                        
  │  └─ ubuntu-gpu1                                               50.0           100.0                        
  ├─ demogorgon                                             (98%) 94.3    (108%) 104.1         96.0       0.0 
  │  ├─ a2v2-single-gpu-jcsz                                      10.0            10.0                        
  │  ├─ dino-exp-pod-02                                            4.0             6.0                        
  │  ├─ dino-exp-pod-03                                            4.0             8.0                        
  │  ├─ dino-exp-pod-04                                            4.0             8.0                        
  │  ├─ dnsutils-demogorgon                                     100.0m          100.0m                        
  │  ├─ gpu-demogorgon                                            48.0            48.0                        
  │  ├─ kube-router-mnkcb                                       250.0m             0.0                        
  │  ├─ leon-wenzler-ag-pod                                       16.0            16.0                        
  │  └─ ulas-bingoel-model-pod-6                                   8.0             8.0                        
  ├─ fierna                                                 (89%) 71.3      (89%) 71.1         80.0       8.7 
  │  ├─ all-gpu-a2v2-pretrain-j-schaefer-zimmermann               70.0            70.0                        
  │  ├─ carmely-ubuntu-entry-pod                                   1.0             1.0                        
  │  ├─ dnsutils-fierna                                         100.0m          100.0m                        
  │  └─ kube-router-q5gnq                                       250.0m             0.0                        
  ├─ kiaransalee                                            (19%) 36.4      (19%) 36.1        192.0     155.7 
  │  ├─ bash-pod                                                  36.0            36.0                        
  │  ├─ dnsutils-kiaransalee                                    100.0m          100.0m                        
  │  └─ kube-router-67fv9                                       250.0m             0.0                        
  ├─ lolth                                                 (2%) 850.0m       (10%) 4.1         40.0      35.9 
  │  ├─ coredns-55cb58b774-cxfmw                                100.0m             0.0                        
  │  ├─ dnsutils-lolth                                          100.0m          100.0m                        
  │  ├─ gatekeeper-audit-59d4b6fd4c-q7cj8                       100.0m             1.0                        
  │  ├─ gatekeeper-controller-manager-66f474f785-qfkmj          100.0m             1.0                        
  │  ├─ kube-router-clh9r                                       250.0m             0.0                        
  │  └─ ubuntu-test-pod                                         200.0m             2.0                        
  ├─ mindflayer01                                             (5%) 3.1        (8%) 5.1         64.0      58.9 
  │  ├─ dex-cd447fbd5-jrqq2                                     250.0m             0.0                        
  │  ├─ dex-loginapp-597f8d79f8-8mb9n                           100.0m             0.0                        
  │  ├─ dnsutils-mindflayer01                                   100.0m          100.0m                        
  │  ├─ gatekeeper-controller-manager-66f474f785-lxl5x          100.0m             1.0                        
  │  ├─ gpu-pod                                                    1.0             1.0                        
  │  ├─ kube-router-rfg76                                       250.0m             0.0                        
  │  ├─ mariadb-849498cb44-dldtm                                500.0m             1.0                        
  │  ├─ prometheus-deployment-b65d5d898-t2z9l                   500.0m             1.0                        
  │  ├─ registry-685bf565ff-k6b4q                               100.0m             0.0                        
  │  ├─ registry-auth-5b679f6456-x7m66                          100.0m             0.0                        
  │  └─ temp-pod                                                100.0m             1.0                        
  ├─ mindflayer02                                           (18%) 11.7      (28%) 18.1         64.0      45.9 
  │  ├─ cool-pod                                                  10.0            16.0                        
  │  ├─ dex-mysql-66b5885f7b-5h8hw                              100.0m             0.0                        
  │  ├─ dnsutils-mindflayer02                                   100.0m          100.0m                        
  │  ├─ kube-router-7xmk6                                       250.0m             0.0                        
  │  ├─ mediawiki-mariadb-5cc4866855-xffs4                      250.0m             0.0                        
  │  └─ phpfpm-nginx-5db96d6895-97f2g                              1.0             2.0                        
  ├─ mindflayer03                                             (3%) 2.1        (4%) 2.6         64.0      61.4 
  │  ├─ check-data-pod                                          100.0m          500.0m                        
  │  ├─ coredns-55cb58b774-4kpwk                                100.0m             0.0                        
  │  ├─ dnsutils-mindflayer03                                   100.0m          100.0m                        
  │  ├─ gatekeeper-controller-manager-66f474f785-5gv4n          100.0m             1.0                        
  │  ├─ kube-router-rgkl9                                       250.0m             0.0                        
  │  ├─ ldap-6987986dbc-2zr2c                                   250.0m             0.0                        
  │  ├─ ledavio-text-search                                        1.0             1.0                        
  │  ├─ system-registry-8877cd57-59pzr                          100.0m             0.0                        
  │  └─ user-registry-64bb8ff7cf-5cktq                          100.0m             0.0                        
  ├─ tiamat                                                (0%) 350.0m     (0%) 100.0m        256.0     255.7 
  │  ├─ dnsutils-tiamat                                         100.0m          100.0m                        
  │  └─ kube-router-nww9q                                       250.0m             0.0                        
  ├─ vecna                                                  (17%) 16.4    (104%) 100.1         96.0       0.0 
  │  ├─ dnsutils-vecna                                          100.0m          100.0m                        
  │  ├─ felix-petersen-job-20                                     16.0           100.0                        
  │  └─ kube-router-cx7mv                                       250.0m             0.0                        
  └─ zariel                                                (56%) 144.3     (69%) 176.1        256.0      79.9 
     ├─ dnsutils-zariel                                         100.0m          100.0m                        
     ├─ julian-welzel-job-1                                       96.0            96.0                        
     ├─ kube-router-vx8s8                                       250.0m             0.0                        
     ├─ leon-wenzler-ma-pod                                       16.0            16.0                        
     └─ oliver-wiedemann-pod                                      32.0            64.0                        
  ephemeral-storage                                       (1%) 100.0Gi    (1%) 150.0Gi        11.9T     11.7T 
  ├─ asmodeus                                                 (0%) 0.0        (0%) 0.0        94.6G     94.6G 
  ├─ beholder01                                               (0%) 0.0        (0%) 0.0         1.7T      1.7T 
  ├─ beholder02                                               (0%) 0.0        (0%) 0.0         1.7T      1.7T 
  ├─ beholder03                                               (0%) 0.0        (0%) 0.0         1.7T      1.7T 
  ├─ belial                                              (57%) 100.0Gi   (85%) 150.0Gi       189.2G     28.2G 
  │  └─ ubuntu-gpu1                                            100.0Gi         150.0Gi                        
  ├─ demogorgon                                               (0%) 0.0        (0%) 0.0       706.7G    706.7G 
  ├─ fierna                                                   (0%) 0.0        (0%) 0.0       189.2G    189.2G 
  ├─ kiaransalee                                              (0%) 0.0        (0%) 0.0         1.7T      1.7T 
  ├─ lolth                                                    (0%) 0.0        (0%) 0.0       530.5G    530.5G 
  ├─ mindflayer01                                             (0%) 0.0        (0%) 0.0       211.5G    211.5G 
  ├─ mindflayer02                                             (0%) 0.0        (0%) 0.0       211.5G    211.5G 
  ├─ mindflayer03                                             (0%) 0.0        (0%) 0.0       211.5G    211.5G 
  ├─ tiamat                                                   (0%) 0.0        (0%) 0.0       164.4G    164.4G 
  ├─ vecna                                                    (0%) 0.0        (0%) 0.0       849.0G    849.0G 
  └─ zariel                                                   (0%) 0.0        (0%) 0.0         1.7T      1.7T 
  memory                                                    (30%) 4.2T     (47%) 6.1Ti       12.9Ti     6.8Ti 
  ├─ asmodeus                                              (77%) 1.5Ti     (77%) 1.5Ti        2.0Ti   467.4Gi 
  │  ├─ asmodeus-all-a2v2-julian-schaefer-zimmermann             1.5Ti           1.5Ti                        
  │  ├─ dnsutils-asmodeus                                      100.0Mi         100.0Mi                        
  │  └─ kube-router-4g72z                                      250.0Mi             0.0                        
  ├─ beholder01                                           (0%) 350.0Mi    (0%) 100.0Mi       92.9Gi    92.5Gi 
  │  ├─ dnsutils-beholder01                                    100.0Mi         100.0Mi                        
  │  └─ kube-router-pl7hf                                      250.0Mi             0.0                        
  ├─ beholder02                                           (1%) 862.0Mi      (1%) 1.1Gi       92.9Gi    91.8Gi 
  │  ├─ dnsutils-beholder02                                    100.0Mi         100.0Mi                        
  │  ├─ kube-router-vmrf4                                      250.0Mi             0.0                        
  │  ├─ wordpress-8c8944c8d-cxbrq                              256.0Mi         512.0Mi                        
  │  └─ wordpress-mariadb-565c547dc8-59fhr                     256.0Mi         512.0Mi                        
  ├─ beholder03                                           (0%) 350.0Mi    (0%) 100.0Mi       92.9Gi    92.5Gi 
  │  ├─ dnsutils-beholder03                                    100.0Mi         100.0Mi                        
  │  └─ kube-router-8kcvh                                      250.0Mi             0.0                        
  ├─ belial                                              (52%) 390.3Gi   (84%) 630.1Gi      754.4Gi   124.3Gi 
  │  ├─ carmely-nn-standard-hps-job-lstm-2-b5pnk                30.0Gi          60.0Gi                        
  │  ├─ carmely-nn-standard-hps-job-lstm-2x949                  30.0Gi          60.0Gi                        
  │  ├─ carmely-nn-standard-hps-job-transformer-4dhdr           40.0Gi          40.0Gi                        
  │  ├─ dnsutils-belial                                        100.0Mi         100.0Mi                        
  │  ├─ gpu-pod                                                 10.0Gi          10.0Gi                        
  │  ├─ kube-router-swlkw                                      250.0Mi             0.0                        
  │  ├─ mariya-nn-well-being-classifier-job-8mwrx               30.0Gi          60.0Gi                        
  │  └─ ubuntu-gpu1                                            250.0Gi         400.0Gi                        
  ├─ demogorgon                                          (15%) 304.3Gi   (29%) 572.1Gi        2.0Ti     1.4Ti 
  │  ├─ a2v2-single-gpu-jcsz                                    32.0Gi          32.0Gi                        
  │  ├─ dino-exp-pod-02                                         16.0Gi          48.0Gi                        
  │  ├─ dino-exp-pod-03                                         16.0Gi          48.0Gi                        
  │  ├─ dino-exp-pod-04                                         16.0Gi         220.0Gi                        
  │  ├─ dnsutils-demogorgon                                    100.0Mi         100.0Mi                        
  │  ├─ gpu-demogorgon                                          80.0Gi          80.0Gi                        
  │  ├─ kube-router-mnkcb                                      250.0Mi             0.0                        
  │  ├─ leon-wenzler-ag-pod                                     64.0Gi          64.0Gi                        
  │  └─ ulas-bingoel-model-pod-6                                80.0Gi          80.0Gi                        
  ├─ fierna                                             (100%) 754.3Gi  (100%) 754.1Gi      754.4Gi   100.4Mi 
  │  ├─ all-gpu-a2v2-pretrain-j-schaefer-zimmermann            744.0Gi         744.0Gi                        
  │  ├─ carmely-ubuntu-entry-pod                                10.0Gi          10.0Gi                        
  │  ├─ dnsutils-fierna                                        100.0Mi         100.0Mi                        
  │  └─ kube-router-q5gnq                                      250.0Mi             0.0                        
  ├─ kiaransalee                                           (8%) 135.7G    (8%) 120.1Gi        1.5Ti      1.5T 
  │  ├─ bash-pod                                               120.0Gi         120.0Gi                        
  │  ├─ dnsutils-kiaransalee                                   100.0Mi         100.0Mi                        
  │  ├─ jupyter-elenasolar                                        1.1G             0.0                        
  │  ├─ jupyter-giordano-2ddemarzo                                1.1G             0.0                        
  │  ├─ jupyter-jonaschrade                                       1.1G             0.0                        
  │  ├─ jupyter-leanderstreun                                     1.1G             0.0                        
  │  ├─ jupyter-samrauh                                           1.1G             0.0                        
  │  ├─ jupyter-saroyehun                                         1.1G             0.0                        
  │  └─ kube-router-67fv9                                      250.0Mi             0.0                        
  ├─ lolth                                                  (0%) 1.1Gi      (1%) 3.3Gi      251.4Gi   248.2Gi 
  │  ├─ coredns-55cb58b774-cxfmw                                70.0Mi         170.0Mi                        
  │  ├─ dnsutils-lolth                                         100.0Mi         100.0Mi                        
  │  ├─ gatekeeper-audit-59d4b6fd4c-q7cj8                      256.0Mi         512.0Mi                        
  │  ├─ gatekeeper-controller-manager-66f474f785-qfkmj         256.0Mi         512.0Mi                        
  │  ├─ kube-router-clh9r                                      250.0Mi             0.0                        
  │  └─ ubuntu-test-pod                                        200.0Mi           2.0Gi                        
  ├─ mindflayer01                                           (3%) 12.2G     (3%) 13.1Gi      376.5Gi   363.4Gi 
  │  ├─ dnsutils-mindflayer01                                  100.0Mi         100.0Mi                        
  │  ├─ gatekeeper-controller-manager-66f474f785-lxl5x         256.0Mi         512.0Mi                        
  │  ├─ gpu-pod                                                 10.0Gi          10.0Gi                        
  │  ├─ kube-router-rfg76                                      250.0Mi             0.0                        
  │  ├─ mariadb-849498cb44-dldtm                               256.0Mi         512.0Mi                        
  │  ├─ prometheus-deployment-b65d5d898-t2z9l                   500.0M           1.0Gi                        
  │  └─ temp-pod                                               100.0Mi           1.0Gi                        
  ├─ mindflayer02                                         (17%) 64.8Gi    (17%) 65.1Gi      376.5Gi   311.4Gi 
  │  ├─ cool-pod                                                64.0Gi          64.0Gi                        
  │  ├─ dnsutils-mindflayer02                                  100.0Mi         100.0Mi                        
  │  ├─ kube-router-7xmk6                                      250.0Mi             0.0                        
  │  └─ phpfpm-nginx-5db96d6895-97f2g                          512.0Mi           1.0Gi                        
  ├─ mindflayer03                                          (9%) 32.8Gi     (9%) 33.3Gi      376.5Gi   343.2Gi 
  │  ├─ check-data-pod                                         100.0Mi         500.0Mi                        
  │  ├─ coredns-55cb58b774-4kpwk                                70.0Mi         170.0Mi                        
  │  ├─ dnsutils-mindflayer03                                  100.0Mi         100.0Mi                        
  │  ├─ gatekeeper-controller-manager-66f474f785-5gv4n         256.0Mi         512.0Mi                        
  │  ├─ kube-router-rgkl9                                      250.0Mi             0.0                        
  │  └─ ledavio-text-search                                     32.0Gi          32.0Gi                        
  ├─ tiamat                                               (0%) 350.0Mi    (0%) 100.0Mi     1007.6Gi  1007.2Gi 
  │  ├─ dnsutils-tiamat                                        100.0Mi         100.0Mi                        
  │  └─ kube-router-nww9q                                      250.0Mi             0.0                        
  ├─ vecna                                                (7%) 100.3Gi    (106%) 1.6Ti        1.5Ti       0.0 
  │  ├─ dnsutils-vecna                                         100.0Mi         100.0Mi                        
  │  ├─ felix-petersen-job-20                                  100.0Gi           1.6Ti                        
  │  └─ kube-router-cx7mv                                      250.0Mi             0.0                        
  └─ zariel                                              (31%) 628.3Gi   (46%) 928.1Gi        2.0Ti     1.1Ti 
     ├─ dnsutils-zariel                                        100.0Mi         100.0Mi                        
     ├─ julian-welzel-job-1                                    100.0Gi         400.0Gi                        
     ├─ kube-router-vx8s8                                      250.0Mi             0.0                        
     ├─ leon-wenzler-ma-pod                                    128.0Gi         128.0Gi                        
     └─ oliver-wiedemann-pod                                   400.0Gi         400.0Gi                        
  nvidia.com/gpu                                            (84%) 52.0      (84%) 52.0         62.0      10.0 
  ├─ asmodeus                                               (100%) 4.0      (100%) 4.0          4.0       0.0 
  │  └─ asmodeus-all-a2v2-julian-schaefer-zimmermann               4.0             4.0                        
  ├─ belial                                                  (75%) 6.0       (75%) 6.0          8.0       2.0 
  │  ├─ carmely-nn-standard-hps-job-lstm-2-b5pnk                   1.0             1.0                        
  │  ├─ carmely-nn-standard-hps-job-lstm-2x949                     1.0             1.0                        
  │  ├─ carmely-nn-standard-hps-job-transformer-4dhdr              1.0             1.0                        
  │  ├─ gpu-pod                                                    1.0             1.0                        
  │  ├─ mariya-nn-well-being-classifier-job-8mwrx                  1.0             1.0                        
  │  └─ ubuntu-gpu1                                                1.0             1.0                        
  ├─ demogorgon                                              (88%) 7.0       (88%) 7.0          8.0       1.0 
  │  ├─ a2v2-single-gpu-jcsz                                       1.0             1.0                        
  │  ├─ dino-exp-pod-02                                            1.0             1.0                        
  │  ├─ dino-exp-pod-03                                            1.0             1.0                        
  │  ├─ dino-exp-pod-04                                            1.0             1.0                        
  │  ├─ gpu-demogorgon                                             1.0             1.0                        
  │  ├─ leon-wenzler-ag-pod                                        1.0             1.0                        
  │  └─ ulas-bingoel-model-pod-6                                   1.0             1.0                        
  ├─ fierna                                                 (100%) 8.0      (100%) 8.0          8.0       0.0 
  │  └─ all-gpu-a2v2-pretrain-j-schaefer-zimmermann                8.0             8.0                        
  ├─ kiaransalee                                             (50%) 3.0       (50%) 3.0          6.0       3.0 
  │  ├─ jupyter-giordano-2ddemarzo                                 1.0             1.0                        
  │  ├─ jupyter-leanderstreun                                      1.0             1.0                        
  │  └─ jupyter-saroyehun                                          1.0             1.0                        
  ├─ tiamat                                                   (0%) 0.0        (0%) 0.0          4.0       4.0 
  ├─ vecna                                                 (100%) 16.0     (100%) 16.0         16.0       0.0 
  │  └─ felix-petersen-job-20                                     16.0            16.0                        
  └─ zariel                                                 (100%) 8.0      (100%) 8.0          8.0       0.0 
     ├─ julian-welzel-job-1                                        4.0             4.0                        
     ├─ leon-wenzler-ma-pod                                        2.0             2.0                        
     └─ oliver-wiedemann-pod                                       2.0             2.0                        
  nvidia.com/mig-1g.10gb                                     (29%) 2.0       (29%) 2.0          7.0       5.0 
  └─ kiaransalee                                             (29%) 2.0       (29%) 2.0          7.0       5.0 
     ├─ jupyter-jonaschrade                                        1.0             1.0                        
     └─ jupyter-samrauh                                            1.0             1.0                        
  nvidia.com/mig-3g.40gb                                      (0%) 0.0        (0%) 0.0          1.0       1.0 
  └─ kiaransalee                                              (0%) 0.0        (0%) 0.0          1.0       1.0 
  nvidia.com/mig-4g.40gb                                    (100%) 1.0      (100%) 1.0          1.0       0.0 
  └─ kiaransalee                                            (100%) 1.0      (100%) 1.0          1.0       0.0 
     └─ jupyter-elenasolar                                         1.0             1.0                        
  pods                                                      (9%) 153.0      (9%) 153.0         1.6k      1.5k 
  ├─ asmodeus                                                 (5%) 5.0        (5%) 5.0        110.0     105.0 
  │  ├─ asmodeus-all-a2v2-julian-schaefer-zimmermann               1.0             1.0                        
  │  ├─ dnsutils-asmodeus                                          1.0             1.0                        
  │  ├─ kube-proxy-rbzdb                                           1.0             1.0                        
  │  ├─ kube-router-4g72z                                          1.0             1.0                        
  │  └─ nvidia-device-plugin-daemonset-lrkw7                       1.0             1.0                        
  ├─ beholder01                                               (5%) 6.0        (5%) 6.0        110.0     104.0 
  │  ├─ dnsutils-beholder01                                        1.0             1.0                        
  │  ├─ kube-apiserver-beholder01                                  1.0             1.0                        
  │  ├─ kube-controller-manager-beholder01                         1.0             1.0                        
  │  ├─ kube-proxy-7f5v4                                           1.0             1.0                        
  │  ├─ kube-router-pl7hf                                          1.0             1.0                        
  │  └─ kube-scheduler-beholder01                                  1.0             1.0                        
  ├─ beholder02                                               (7%) 8.0        (7%) 8.0        110.0     102.0 
  │  ├─ dnsutils-beholder02                                        1.0             1.0                        
  │  ├─ kube-apiserver-beholder02                                  1.0             1.0                        
  │  ├─ kube-controller-manager-beholder02                         1.0             1.0                        
  │  ├─ kube-proxy-dlc5n                                           1.0             1.0                        
  │  ├─ kube-router-vmrf4                                          1.0             1.0                        
  │  ├─ kube-scheduler-beholder02                                  1.0             1.0                        
  │  ├─ wordpress-8c8944c8d-cxbrq                                  1.0             1.0                        
  │  └─ wordpress-mariadb-565c547dc8-59fhr                         1.0             1.0                        
  ├─ beholder03                                               (5%) 6.0        (5%) 6.0        110.0     104.0 
  │  ├─ dnsutils-beholder03                                        1.0             1.0                        
  │  ├─ kube-apiserver-beholder03                                  1.0             1.0                        
  │  ├─ kube-controller-manager-beholder03                         1.0             1.0                        
  │  ├─ kube-proxy-jqcj7                                           1.0             1.0                        
  │  ├─ kube-router-8kcvh                                          1.0             1.0                        
  │  └─ kube-scheduler-beholder03                                  1.0             1.0                        
  ├─ belial                                                 (10%) 11.0      (10%) 11.0        110.0      99.0 
  │  ├─ carmely-nn-standard-hps-job-lstm-2-b5pnk                   1.0             1.0                        
  │  ├─ carmely-nn-standard-hps-job-lstm-2x949                     1.0             1.0                        
  │  ├─ carmely-nn-standard-hps-job-transformer-4dhdr              1.0             1.0                        
  │  ├─ dnsutils-belial                                            1.0             1.0                        
  │  ├─ gpu-feature-discovery-j9rgb                                1.0             1.0                        
  │  ├─ gpu-pod                                                    1.0             1.0                        
  │  ├─ kube-proxy-nsltd                                           1.0             1.0                        
  │  ├─ kube-router-swlkw                                          1.0             1.0                        
  │  ├─ mariya-nn-well-being-classifier-job-8mwrx                  1.0             1.0                        
  │  ├─ nvidia-device-plugin-daemonset-kn6h4                       1.0             1.0                        
  │  └─ ubuntu-gpu1                                                1.0             1.0                        
  ├─ demogorgon                                             (10%) 11.0      (10%) 11.0        110.0      99.0 
  │  ├─ a2v2-single-gpu-jcsz                                       1.0             1.0                        
  │  ├─ dino-exp-pod-02                                            1.0             1.0                        
  │  ├─ dino-exp-pod-03                                            1.0             1.0                        
  │  ├─ dino-exp-pod-04                                            1.0             1.0                        
  │  ├─ dnsutils-demogorgon                                        1.0             1.0                        
  │  ├─ gpu-demogorgon                                             1.0             1.0                        
  │  ├─ kube-proxy-gknqq                                           1.0             1.0                        
  │  ├─ kube-router-mnkcb                                          1.0             1.0                        
  │  ├─ leon-wenzler-ag-pod                                        1.0             1.0                        
  │  ├─ nvidia-device-plugin-daemonset-wcsbj                       1.0             1.0                        
  │  └─ ulas-bingoel-model-pod-6                                   1.0             1.0                        
  ├─ fierna                                                   (6%) 7.0        (6%) 7.0        110.0     103.0 
  │  ├─ all-gpu-a2v2-pretrain-j-schaefer-zimmermann                1.0             1.0                        
  │  ├─ carmely-ubuntu-entry-pod                                   1.0             1.0                        
  │  ├─ dnsutils-fierna                                            1.0             1.0                        
  │  ├─ gpu-feature-discovery-8tz6l                                1.0             1.0                        
  │  ├─ kube-proxy-9tg4f                                           1.0             1.0                        
  │  ├─ kube-router-q5gnq                                          1.0             1.0                        
  │  └─ nvidia-device-plugin-daemonset-r5pcj                       1.0             1.0                        
  ├─ kiaransalee                                            (13%) 14.0      (13%) 14.0        110.0      96.0 
  │  ├─ bash-pod                                                   1.0             1.0                        
  │  ├─ continuous-image-puller-6fs4k                              1.0             1.0                        
  │  ├─ continuous-image-puller-6z8bj                              1.0             1.0                        
  │  ├─ dnsutils-kiaransalee                                       1.0             1.0                        
  │  ├─ gpu-feature-discovery-znz6m                                1.0             1.0                        
  │  ├─ jupyter-elenasolar                                         1.0             1.0                        
  │  ├─ jupyter-giordano-2ddemarzo                                 1.0             1.0                        
  │  ├─ jupyter-jonaschrade                                        1.0             1.0                        
  │  ├─ jupyter-leanderstreun                                      1.0             1.0                        
  │  ├─ jupyter-samrauh                                            1.0             1.0                        
  │  ├─ jupyter-saroyehun                                          1.0             1.0                        
  │  ├─ kube-proxy-65675                                           1.0             1.0                        
  │  ├─ kube-router-67fv9                                          1.0             1.0                        
  │  └─ nvidia-device-plugin-daemonset-wj2wf                       1.0             1.0                        
  ├─ lolth                                                  (17%) 19.0      (17%) 19.0        110.0      91.0 
  │  ├─ coredns-55cb58b774-cxfmw                                   1.0             1.0                        
  │  ├─ dnsutils-lolth                                             1.0             1.0                        
  │  ├─ echo1-77fbfb54d-8t4dq                                      1.0             1.0                        
  │  ├─ echo1-77fbfb54d-hcrph                                      1.0             1.0                        
  │  ├─ echo2-5d58759df-ssc7q                                      1.0             1.0                        
  │  ├─ gatekeeper-audit-59d4b6fd4c-q7cj8                          1.0             1.0                        
  │  ├─ gatekeeper-controller-manager-66f474f785-qfkmj             1.0             1.0                        
  │  ├─ hub-767b56fc4d-5k8hh                                       1.0             1.0                        
  │  ├─ hub-78d6dd898d-89np9                                       1.0             1.0                        
  │  ├─ hub-85db65cb54-8tktv                                       1.0             1.0                        
  │  ├─ kube-proxy-cl9gc                                           1.0             1.0                        
  │  ├─ kube-router-clh9r                                          1.0             1.0                        
  │  ├─ kube-state-metrics-8945855d-9fqtg                          1.0             1.0                        
  │  ├─ nvidia-device-plugin-daemonset-smgtn                       1.0             1.0                        
  │  ├─ proxy-5bc89cc587-z8q9p                                     1.0             1.0                        
  │  ├─ ubuntu-test-pod                                            2.0             2.0                        
  │  ├─ user-scheduler-5cf5ffbc54-htfdj                            1.0             1.0                        
  │  └─ user-scheduler-c7db6c584-cf297                             1.0             1.0                        
  ├─ mindflayer01                                           (20%) 22.0      (20%) 22.0        110.0      88.0 
  │  ├─ dex-cd447fbd5-jrqq2                                        1.0             1.0                        
  │  ├─ dex-loginapp-597f8d79f8-8mb9n                              1.0             1.0                        
  │  ├─ dnsutils-mindflayer01                                      1.0             1.0                        
  │  ├─ gatekeeper-controller-manager-66f474f785-lxl5x             1.0             1.0                        
  │  ├─ gpu-pod                                                    1.0             1.0                        
  │  ├─ kube-proxy-d2xl4                                           1.0             1.0                        
  │  ├─ kube-router-rfg76                                          1.0             1.0                        
  │  ├─ local-path-provisioner-759479454f-jxl54                    1.0             1.0                        
  │  ├─ mariadb-849498cb44-dldtm                                   1.0             1.0                        
  │  ├─ memcached-578474d6f9-8dgb8                                 1.0             1.0                        
  │  ├─ nginx-frontend-7d744d4cdb-5rkwg                            1.0             1.0                        
  │  ├─ nginx-ip-2023-78b9c84dbf-znhrx                             1.0             1.0                        
  │  ├─ nginx-k8s-7c8d949b5f-grwnx                                 1.0             1.0                        
  │  ├─ nginx-rec-2023-79df8c77cd-gg87f                            1.0             1.0                        
  │  ├─ nginx-rsn-2024-7dbc6b668b-jbjk7                            1.0             1.0                        
  │  ├─ nginx-self-service-password-65b44f7547-dhvwp               1.0             1.0                        
  │  ├─ nvidia-device-plugin-daemonset-5gmd7                       1.0             1.0                        
  │  ├─ pdf-b647544f-snvmm                                         1.0             1.0                        
  │  ├─ prometheus-deployment-b65d5d898-t2z9l                      1.0             1.0                        
  │  ├─ registry-685bf565ff-k6b4q                                  1.0             1.0                        
  │  ├─ registry-auth-5b679f6456-x7m66                             1.0             1.0                        
  │  └─ temp-pod                                                   1.0             1.0                        
  ├─ mindflayer02                                             (7%) 8.0        (7%) 8.0        110.0     102.0 
  │  ├─ cool-pod                                                   1.0             1.0                        
  │  ├─ dex-mysql-66b5885f7b-5h8hw                                 1.0             1.0                        
  │  ├─ dnsutils-mindflayer02                                      1.0             1.0                        
  │  ├─ kube-proxy-zbwk4                                           1.0             1.0                        
  │  ├─ kube-router-7xmk6                                          1.0             1.0                        
  │  ├─ mediawiki-mariadb-5cc4866855-xffs4                         1.0             1.0                        
  │  ├─ nvidia-device-plugin-daemonset-7wv2j                       1.0             1.0                        
  │  └─ phpfpm-nginx-5db96d6895-97f2g                              1.0             1.0                        
  ├─ mindflayer03                                           (16%) 18.0      (16%) 18.0        110.0      92.0 
  │  ├─ check-data-pod                                             1.0             1.0                        
  │  ├─ coredns-55cb58b774-4kpwk                                   1.0             1.0                        
  │  ├─ dnsutils-mindflayer03                                      1.0             1.0                        
  │  ├─ echo1-77fbfb54d-8bnp6                                      1.0             1.0                        
  │  ├─ echo1-77fbfb54d-rcwzd                                      1.0             1.0                        
  │  ├─ gatekeeper-controller-manager-66f474f785-5gv4n             1.0             1.0                        
  │  ├─ kube-proxy-b68xq                                           1.0             1.0                        
  │  ├─ kube-router-rgkl9                                          1.0             1.0                        
  │  ├─ ldap-6987986dbc-2zr2c                                      1.0             1.0                        
  │  ├─ ledavio-text-search                                        1.0             1.0                        
  │  ├─ nginx-ip-2025-5888cccd9-fmnh8                              1.0             1.0                        
  │  ├─ nvidia-device-plugin-daemonset-7k6p6                       1.0             1.0                        
  │  ├─ proxy-5495d795d5-vx2ld                                     1.0             1.0                        
  │  ├─ proxy-7f79cc645f-gj2ld                                     1.0             1.0                        
  │  ├─ system-registry-8877cd57-59pzr                             1.0             1.0                        
  │  ├─ user-registry-64bb8ff7cf-5cktq                             1.0             1.0                        
  │  ├─ user-scheduler-5cf5ffbc54-gkwvq                            1.0             1.0                        
  │  └─ user-scheduler-c7db6c584-4xdvn                             1.0             1.0                        
  ├─ tiamat                                                   (5%) 6.0        (5%) 6.0        110.0     104.0 
  │  ├─ continuous-image-puller-wtkhd                              1.0             1.0                        
  │  ├─ dnsutils-tiamat                                            1.0             1.0                        
  │  ├─ gpu-feature-discovery-5qb95                                1.0             1.0                        
  │  ├─ kube-proxy-nbjvl                                           1.0             1.0                        
  │  ├─ kube-router-nww9q                                          1.0             1.0                        
  │  └─ nvidia-device-plugin-daemonset-9b4tb                       1.0             1.0                        
  ├─ vecna                                                    (5%) 5.0        (5%) 5.0        110.0     105.0 
  │  ├─ dnsutils-vecna                                             1.0             1.0                        
  │  ├─ felix-petersen-job-20                                      1.0             1.0                        
  │  ├─ kube-proxy-djtx4                                           1.0             1.0                        
  │  ├─ kube-router-cx7mv                                          1.0             1.0                        
  │  └─ nvidia-device-plugin-daemonset-qln9t                       1.0             1.0                        
  └─ zariel                                                   (6%) 7.0        (6%) 7.0        110.0     103.0 
     ├─ dnsutils-zariel                                            1.0             1.0                        
     ├─ julian-welzel-job-1                                        1.0             1.0                        
     ├─ kube-proxy-c8frf                                           1.0             1.0                        
     ├─ kube-router-vx8s8                                          1.0             1.0                        
     ├─ leon-wenzler-ma-pod                                        1.0             1.0                        
     ├─ nvidia-device-plugin-daemonset-djsrl                       1.0             1.0                        
     └─ oliver-wiedemann-pod                                       1.0             1.0                        




Resource usage (by namespace)

 Resource                       Requested    Limit  Allocatable  Free 
  auth                                                                
  ├─ mindflayer01                                                     
  │  ├─ cpu                        550.0m      0.0                    
  │  └─ pods                          4.0      4.0                    
  ├─ mindflayer02                                                     
  │  ├─ cpu                        100.0m      0.0                    
  │  └─ pods                          1.0      1.0                    
  └─ mindflayer03                                                     
     ├─ cpu                        250.0m      0.0                    
     └─ pods                          1.0      1.0                    
  frontend                            4.0      4.0                    
  ├─ lolth                            1.0      1.0                    
  │  └─ pods                          1.0      1.0                    
  ├─ mindflayer01                     2.0      2.0                    
  │  └─ pods                          2.0      2.0                    
  └─ mindflayer03                     1.0      1.0                    
     └─ pods                          1.0      1.0                    
  gatekeeper-system                                                   
  ├─ lolth                                                            
  │  ├─ cpu                        200.0m      2.0                    
  │  ├─ memory                    512.0Mi    1.0Gi                    
  │  └─ pods                          2.0      2.0                    
  ├─ mindflayer01                                                     
  │  ├─ cpu                        100.0m      1.0                    
  │  ├─ memory                    256.0Mi  512.0Mi                    
  │  └─ pods                          1.0      1.0                    
  └─ mindflayer03                                                     
     ├─ cpu                        100.0m      1.0                    
     ├─ memory                    256.0Mi  512.0Mi                    
     └─ pods                          1.0      1.0                    
  gcpr-vmv-2022                                                       
  └─ beholder02                                                       
     ├─ cpu                           1.0      2.0                    
     ├─ memory                    512.0Mi    1.0Gi                    
     └─ pods                          2.0      2.0                    
  jupyterhub                                                          
  ├─ kiaransalee                                                      
  │  ├─ memory                       4.3G      0.0                    
  │  ├─ nvidia.com/gpu                3.0      3.0                    
  │  ├─ nvidia.com/mig-4g.40gb        1.0      1.0                    
  │  └─ pods                          5.0      5.0                    
  ├─ lolth                            3.0      3.0                    
  │  └─ pods                          3.0      3.0                    
  └─ mindflayer03                     1.0      1.0                    
     └─ pods                          1.0      1.0                    
  jupyterhub-kuckling                 3.0      3.0                    
  ├─ lolth                            1.0      1.0                    
  │  └─ pods                          1.0      1.0                    
  ├─ mindflayer03                     1.0      1.0                    
  │  └─ pods                          1.0      1.0                    
  └─ tiamat                           1.0      1.0                    
     └─ pods                          1.0      1.0                    
  jupyterhub-students                                                 
  ├─ kiaransalee                                                      
  │  ├─ memory                       2.1G      0.0                    
  │  ├─ nvidia.com/mig-1g.10gb        2.0      2.0                    
  │  └─ pods                          3.0      3.0                    
  ├─ lolth                            2.0      2.0                    
  │  └─ pods                          2.0      2.0                    
  └─ mindflayer03                     2.0      2.0                    
     └─ pods                          2.0      2.0                    
  kube-system                                                         
  ├─ asmodeus                                                         
  │  ├─ cpu                        350.0m   100.0m                    
  │  ├─ memory                    350.0Mi  100.0Mi                    
  │  └─ pods                          4.0      4.0                    
  ├─ beholder01                                                       
  │  ├─ cpu                        900.0m   100.0m                    
  │  ├─ memory                    350.0Mi  100.0Mi                    
  │  └─ pods                          6.0      6.0                    
  ├─ beholder02                                                       
  │  ├─ cpu                        900.0m   100.0m                    
  │  ├─ memory                    350.0Mi  100.0Mi                    
  │  └─ pods                          6.0      6.0                    
  ├─ beholder03                                                       
  │  ├─ cpu                        900.0m   100.0m                    
  │  ├─ memory                    350.0Mi  100.0Mi                    
  │  └─ pods                          6.0      6.0                    
  ├─ belial                                                           
  │  ├─ cpu                        350.0m   100.0m                    
  │  ├─ memory                    350.0Mi  100.0Mi                    
  │  └─ pods                          5.0      5.0                    
  ├─ demogorgon                                                       
  │  ├─ cpu                        350.0m   100.0m                    
  │  ├─ memory                    350.0Mi  100.0Mi                    
  │  └─ pods                          4.0      4.0                    
  ├─ fierna                                                           
  │  ├─ cpu                        350.0m   100.0m                    
  │  ├─ memory                    350.0Mi  100.0Mi                    
  │  └─ pods                          5.0      5.0                    
  ├─ kiaransalee                                                      
  │  ├─ cpu                        350.0m   100.0m                    
  │  ├─ memory                    350.0Mi  100.0Mi                    
  │  └─ pods                          5.0      5.0                    
  ├─ lolth                                                            
  │  ├─ cpu                        450.0m   100.0m                    
  │  ├─ memory                    420.0Mi  270.0Mi                    
  │  └─ pods                          5.0      5.0                    
  ├─ mindflayer01                                                     
  │  ├─ cpu                        350.0m   100.0m                    
  │  ├─ memory                    350.0Mi  100.0Mi                    
  │  └─ pods                          4.0      4.0                    
  ├─ mindflayer02                                                     
  │  ├─ cpu                        350.0m   100.0m                    
  │  ├─ memory                    350.0Mi  100.0Mi                    
  │  └─ pods                          4.0      4.0                    
  ├─ mindflayer03                                                     
  │  ├─ cpu                        450.0m   100.0m                    
  │  ├─ memory                    420.0Mi  270.0Mi                    
  │  └─ pods                          5.0      5.0                    
  ├─ tiamat                                                           
  │  ├─ cpu                        350.0m   100.0m                    
  │  ├─ memory                    350.0Mi  100.0Mi                    
  │  └─ pods                          5.0      5.0                    
  ├─ vecna                                                            
  │  ├─ cpu                        350.0m   100.0m                    
  │  ├─ memory                    350.0Mi  100.0Mi                    
  │  └─ pods                          4.0      4.0                    
  └─ zariel                                                           
     ├─ cpu                        350.0m   100.0m                    
     ├─ memory                    350.0Mi  100.0Mi                    
     └─ pods                          4.0      4.0                    
  local-path-storage                  1.0      1.0                    
  └─ mindflayer01                     1.0      1.0                    
     └─ pods                          1.0      1.0                    
  monitoring                                                          
  ├─ lolth                            1.0      1.0                    
  │  └─ pods                          1.0      1.0                    
  └─ mindflayer01                                                     
     ├─ cpu                        500.0m      1.0                    
     ├─ memory                     500.0M    1.0Gi                    
     └─ pods                          1.0      1.0                    
  registry                                                            
  └─ mindflayer03                                                     
     ├─ cpu                        200.0m      0.0                    
     └─ pods                          2.0      2.0                    
  testing                             3.0      3.0                    
  ├─ lolth                            2.0      2.0                    
  │  └─ pods                          2.0      2.0                    
  └─ mindflayer03                     1.0      1.0                    
     └─ pods                          1.0      1.0                    
  user-adwait-deshpande                                               
  ├─ mindflayer01                                                     
  │  ├─ cpu                        100.0m      1.0                    
  │  ├─ memory                    100.0Mi    1.0Gi                    
  │  └─ pods                          1.0      1.0                    
  └─ mindflayer03                                                     
     ├─ cpu                        100.0m   500.0m                    
     ├─ memory                    100.0Mi  500.0Mi                    
     └─ pods                          1.0      1.0                    
  user-alex-chan                                                      
  └─ mindflayer02                                                     
     ├─ cpu                          10.0     16.0                    
     ├─ memory                     64.0Gi   64.0Gi                    
     └─ pods                          1.0      1.0                    
  user-andri-rutschmann                                               
  └─ kiaransalee                                                      
     ├─ cpu                          36.0     36.0                    
     ├─ memory                    120.0Gi  120.0Gi                    
     └─ pods                          1.0      1.0                    
  user-carmely-reiska                                                 
  ├─ belial                                                           
  │  ├─ cpu                          12.0     24.0                    
  │  ├─ memory                    100.0Gi  160.0Gi                    
  │  ├─ nvidia.com/gpu                3.0      3.0                    
  │  └─ pods                          3.0      3.0                    
  └─ fierna                                                           
     ├─ cpu                           1.0      1.0                    
     ├─ memory                     10.0Gi   10.0Gi                    
     └─ pods                          1.0      1.0                    
  user-felix-petersen                                                 
  └─ vecna                                                            
     ├─ cpu                          16.0    100.0                    
     ├─ memory                    100.0Gi    1.6Ti                    
     ├─ nvidia.com/gpu               16.0     16.0                    
     └─ pods                          1.0      1.0                    
  user-jacob-davidson                                                 
  └─ lolth                                                            
     ├─ cpu                        100.0m      1.0                    
     ├─ memory                    100.0Mi    1.0Gi                    
     └─ pods                          1.0      1.0                    
  user-julian-jandeleit                                               
  ├─ demogorgon                                                       
  │  ├─ cpu                          48.0     48.0                    
  │  ├─ memory                     80.0Gi   80.0Gi                    
  │  ├─ nvidia.com/gpu                1.0      1.0                    
  │  └─ pods                          1.0      1.0                    
  └─ lolth                                                            
     ├─ cpu                        100.0m      1.0                    
     ├─ memory                    100.0Mi    1.0Gi                    
     └─ pods                          1.0      1.0                    
  user-julian-welzel                                                  
  └─ zariel                                                           
     ├─ cpu                          96.0     96.0                    
     ├─ memory                    100.0Gi  400.0Gi                    
     ├─ nvidia.com/gpu                4.0      4.0                    
     └─ pods                          1.0      1.0                    
  user-julian-zimmermann                                              
  ├─ asmodeus                                                         
  │  ├─ cpu                         250.0    250.0                    
  │  ├─ memory                      1.5Ti    1.5Ti                    
  │  ├─ nvidia.com/gpu                4.0      4.0                    
  │  └─ pods                          1.0      1.0                    
  ├─ demogorgon                                                       
  │  ├─ cpu                          10.0     10.0                    
  │  ├─ memory                     32.0Gi   32.0Gi                    
  │  ├─ nvidia.com/gpu                1.0      1.0                    
  │  └─ pods                          1.0      1.0                    
  └─ fierna                                                           
     ├─ cpu                          70.0     70.0                    
     ├─ memory                    744.0Gi  744.0Gi                    
     ├─ nvidia.com/gpu                8.0      8.0                    
     └─ pods                          1.0      1.0                    
  user-leon-wenzler                                                   
  ├─ demogorgon                                                       
  │  ├─ cpu                          16.0     16.0                    
  │  ├─ memory                     64.0Gi   64.0Gi                    
  │  ├─ nvidia.com/gpu                1.0      1.0                    
  │  └─ pods                          1.0      1.0                    
  └─ zariel                                                           
     ├─ cpu                          16.0     16.0                    
     ├─ memory                    128.0Gi  128.0Gi                    
     ├─ nvidia.com/gpu                2.0      2.0                    
     └─ pods                          1.0      1.0                    
  user-mariya-tykhonchuk                                              
  ├─ belial                                                           
  │  ├─ cpu                           4.0      8.0                    
  │  ├─ memory                     30.0Gi   60.0Gi                    
  │  ├─ nvidia.com/gpu                1.0      1.0                    
  │  └─ pods                          1.0      1.0                    
  └─ mindflayer01                                                     
     ├─ cpu                           1.0      1.0                    
     ├─ memory                     10.0Gi   10.0Gi                    
     └─ pods                          1.0      1.0                    
  user-maya-dagher                                                    
  └─ belial                                                           
     ├─ cpu                           1.0      1.0                    
     ├─ memory                     10.0Gi   10.0Gi                    
     ├─ nvidia.com/gpu                1.0      1.0                    
     └─ pods                          1.0      1.0                    
  user-mike-battistella                                               
  └─ mindflayer03                                                     
     ├─ cpu                           1.0      1.0                    
     ├─ memory                     32.0Gi   32.0Gi                    
     └─ pods                          1.0      1.0                    
  user-oliver-wiedemann                                               
  └─ zariel                                                           
     ├─ cpu                          32.0     64.0                    
     ├─ memory                    400.0Gi  400.0Gi                    
     ├─ nvidia.com/gpu                2.0      2.0                    
     └─ pods                          1.0      1.0                    
  user-philip-zimmermann                                              
  └─ demogorgon                                                       
     ├─ cpu                          12.0     22.0                    
     ├─ memory                     48.0Gi  316.0Gi                    
     ├─ nvidia.com/gpu                3.0      3.0                    
     └─ pods                          3.0      3.0                    
  user-segun-aroyehun                                                 
  └─ belial                                                           
     ├─ cpu                          50.0    100.0                    
     ├─ ephemeral-storage         100.0Gi  150.0Gi                    
     ├─ memory                    250.0Gi  400.0Gi                    
     ├─ nvidia.com/gpu                1.0      1.0                    
     └─ pods                          1.0      1.0                    
  user-ulas-bingoel                                                   
  └─ demogorgon                                                       
     ├─ cpu                           8.0      8.0                    
     ├─ memory                     80.0Gi   80.0Gi                    
     ├─ nvidia.com/gpu                1.0      1.0                    
     └─ pods                          1.0      1.0                    
  user-v-time                                                         
  ├─ mindflayer01                                                     
  │  ├─ cpu                        500.0m      1.0                    
  │  ├─ memory                    256.0Mi  512.0Mi                    
  │  └─ pods                          1.0      1.0                    
  └─ mindflayer02                                                     
     ├─ cpu                           1.0      2.0                    
     ├─ memory                    512.0Mi    1.0Gi                    
     └─ pods                          1.0      1.0                    
  web                                                                 
  ├─ mindflayer01                     6.0      6.0                    
  │  └─ pods                          6.0      6.0                    
  ├─ mindflayer02                                                     
  │  ├─ cpu                        250.0m      0.0                    
  │  └─ pods                          1.0      1.0                    
  └─ mindflayer03                     1.0      1.0                    
     └─ pods                          1.0      1.0                    
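
Note on units: the figures in the tree above appear to use Kubernetes-style quantity suffixes (`m` for thousandths, `k` for thousands, `Ki`/`Mi`/`Gi` for powers of 1024). A minimal sketch for converting them to plain numbers when comparing rows, assuming that interpretation; `parse_quantity` is a hypothetical helper, not part of the reporting tool:

```python
import re

# Multipliers for the suffixes seen in this report (e.g. 1.6k, 256.0Mi, 64.0Gi).
# The "m" (milli) suffix is handled separately so that 500.0m divides exactly.
MULTIPLIERS = {
    "": 1,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
}


def parse_quantity(text: str) -> float:
    """Convert a quantity such as '500.0m' or '256.0Mi' to base units."""
    m = re.fullmatch(r"([0-9.]+)(Ki|Mi|Gi|Ti|m|k|M|G|T)?", text.strip())
    if m is None:
        raise ValueError(f"unrecognised quantity: {text!r}")
    value = float(m.group(1))
    suffix = m.group(2) or ""
    if suffix == "m":
        return value / 1000
    return value * MULTIPLIERS[suffix]


# Values taken from the user-v-time rows above.
print(parse_quantity("500.0m"))   # CPU cores
print(parse_quantity("256.0Mi"))  # bytes
```

Handling only the suffixes that occur in this report keeps the sketch short; a full parser would also need exponent notation and the remaining suffixes.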




Ceph file system status

  cluster:
    id:     3fee6f38-ba9f-11ec-9328-e188936dcafd
    health: HEALTH_WARN
            1 nearfull osd(s)
            3 pool(s) nearfull
 
  services:
    mon: 5 daemons, quorum beholder03,beholder01,beholder02,mindflayer02,mindflayer03 (age 8w)
    mgr: beholder01.verxwn(active, since 5M), standbys: beholder03.nprqzk, mindflayer03.rzdvrr, mindflayer01.mkuopd, mindflayer02.ympgrs, beholder02.akktmp
    mds: 4/4 daemons up, 2 standby
    osd: 24 osds: 24 up (since 5M), 24 in (since 12M)
 
  data:
    volumes: 1/1 healthy
    pools:   3 pools, 545 pgs
    objects: 77.85M objects, 113 TiB
    usage:   227 TiB used, 55 TiB / 282 TiB avail
    pgs:     543 active+clean
             2   active+clean+scrubbing+deep
 
  io:
    client:   1.1 MiB/s wr, 0 op/s rd, 8 op/s wr
 
HEALTH_WARN 1 nearfull osd(s); 3 pool(s) nearfull
[WRN] OSD_NEARFULL: 1 nearfull osd(s)
    osd.16 is near full
[WRN] POOL_NEARFULL: 3 pool(s) nearfull
    pool '.mgr' is nearfull
    pool 'cephfs_data' is nearfull
    pool 'cephfs_metadata' is nearfull
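
For context on the warning, the cluster-wide raw fill can be derived from the `usage:` line above. The sketch below is illustrative only: `raw_fill_ratio` is a hypothetical helper, and the 0.85 threshold is Ceph's default nearfull ratio, which this cluster may have overridden.

```python
import re

# Usage line copied from the status output above.
USAGE_LINE = "usage:   227 TiB used, 55 TiB / 282 TiB avail"

# Assumption: Ceph's default nearfull ratio; the configured value is not in the report.
ASSUMED_NEARFULL_RATIO = 0.85


def raw_fill_ratio(line: str) -> float:
    """Return used/total from a 'usage:' line whose figures share one unit."""
    m = re.search(
        r"([\d.]+) (\w+) used, ([\d.]+) (\w+) / ([\d.]+) (\w+) avail", line
    )
    if m is None:
        raise ValueError("unrecognised usage line")
    used, used_unit, _free, free_unit, total, total_unit = m.groups()
    if not (used_unit == free_unit == total_unit):
        raise ValueError("mixed units are not handled in this sketch")
    return float(used) / float(total)


ratio = raw_fill_ratio(USAGE_LINE)
print(f"raw fill: {ratio:.1%}")  # raw fill: 80.5%
print("above assumed nearfull ratio:", ratio >= ASSUMED_NEARFULL_RATIO)
```

The cluster-wide figure is an average over all OSDs, so it can sit below the threshold while a single OSD (here osd.16) is above it.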




Etcd cluster status

+------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|     ENDPOINT     |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| 192.168.1.1:4252 | 7126e7a3a9cc42ca |  3.4.30 |  156 MB |     false |      false |    302053 |  606650739 |          606650739 |        |
| 192.168.1.2:4252 | 39d72894bf6c7600 |  3.4.30 |  156 MB |     false |      false |    302053 |  606650739 |          606650739 |        |
| 192.168.1.3:4252 | bbf4a2b99c3fd692 |  3.4.30 |  156 MB |     false |      false |    302053 |  606650739 |          606650739 |        |
| 192.168.2.1:4252 | 5cb9997dd1c2246b |  3.3.25 |  156 MB |      true |      false |    302053 |  606650739 |                  0 |        |
| 192.168.2.3:4252 |  cbc1cf89959ea4e |  3.3.25 |  156 MB |     false |      false |    302053 |  606650739 |                  0 |        |
+------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
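
The table above shows two member versions, and the 3.3.25 members report a RAFT APPLIED INDEX of 0. A minimal sketch of how such rows could be checked automatically; the parsing assumes the ten-column layout shown above, and `parse_rows` is a hypothetical helper:

```python
# Data rows copied from the endpoint status table above.
ETCD_ROWS = """\
| 192.168.1.1:4252 | 7126e7a3a9cc42ca |  3.4.30 |  156 MB |     false |      false |    302053 |  606650739 |          606650739 |        |
| 192.168.1.2:4252 | 39d72894bf6c7600 |  3.4.30 |  156 MB |     false |      false |    302053 |  606650739 |          606650739 |        |
| 192.168.1.3:4252 | bbf4a2b99c3fd692 |  3.4.30 |  156 MB |     false |      false |    302053 |  606650739 |          606650739 |        |
| 192.168.2.1:4252 | 5cb9997dd1c2246b |  3.3.25 |  156 MB |      true |      false |    302053 |  606650739 |                  0 |        |
| 192.168.2.3:4252 |  cbc1cf89959ea4e |  3.3.25 |  156 MB |     false |      false |    302053 |  606650739 |                  0 |        |
"""


def parse_rows(text):
    """Turn table rows into dicts; lines without ten cells are skipped."""
    members = []
    for line in text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 10:
            continue
        members.append({
            "endpoint": cells[0],
            "version": cells[2],
            "leader": cells[4] == "true",
            "raft_index": int(cells[7]),
            "applied_index": int(cells[8]),
        })
    return members


members = parse_rows(ETCD_ROWS)
versions = sorted({m["version"] for m in members})
leaders = [m["endpoint"] for m in members if m["leader"]]
unapplied = [m["endpoint"] for m in members
             if m["applied_index"] != m["raft_index"]]

print("versions:", versions)
print("leaders:", leaders)
print("applied index differs from raft index:", unapplied)
```

A zero applied index on the older members may just mean that version does not report the field, so the last list is a prompt to look, not proof of a fault.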




Detailed network health report

API and web servers

beholder01
SSH port open yes
Report available yes
External interface up ok
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs
local raid mounted /raid
Test pod responding dnsutils-beholder01
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes
beholder02
SSH port open yes
Report available yes
External interface up ok
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs
local raid mounted /raid
Test pod responding dnsutils-beholder02
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes
beholder03
SSH port open yes
Report available yes
External interface up ok
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs
local raid mounted /raid
Test pod responding dnsutils-beholder03
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes
lolth
SSH port open yes
Report available yes
External interface up ok
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs
local raid mounted /raid
Test pod responding dnsutils-lolth
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes




Ceph osd nodes

mindflayer01
SSH port open yes
Report available yes
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs
local raid mounted /raid
Test pod responding dnsutils-mindflayer01
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes
mindflayer02
SSH port open yes
Report available yes
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs
local raid mounted /raid
Test pod responding dnsutils-mindflayer02
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes
mindflayer03
SSH port open yes
Report available yes
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs
local raid mounted /raid
Test pod responding dnsutils-mindflayer03
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes




Compute nodes

vecna
SSH port open yes
Report available yes
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs/abyss
local raid mounted /raid
Test pod responding dnsutils-vecna
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes
kiaransalee
SSH port open yes
Report available yes
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs/abyss
local raid mounted /raid
Test pod responding dnsutils-kiaransalee
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes
belial
SSH port open yes
Report available yes
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs/abyss
local raid mounted /raid
Test pod responding dnsutils-belial
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes
fierna
SSH port open yes
Report available yes
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs/abyss
local raid mounted /raid
Test pod responding dnsutils-fierna
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes
demogorgon
SSH port open yes
Report available yes
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs/abyss
local raid mounted /raid
Test pod responding dnsutils-demogorgon
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes
tiamat
SSH port open yes
Report available yes
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs/abyss
local raid mounted /raid
Test pod responding dnsutils-tiamat
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes
asmodeus
SSH port open yes
Report available yes
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs/abyss
local raid mounted /raid
Test pod responding dnsutils-asmodeus
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes
zariel
SSH port open yes
Report available yes
Infiniband interface up ok
API servers reachable 1 2 3 4
Ceph monitors reachable 1 2 3
cephfs mounted /cephfs/abyss
local raid mounted /raid
Test pod responding dnsutils-zariel
Can reach kube-dns 10.96.0.10
Pod can reach kube-dns yes
Pod can reach internet yes




nVidia Driver and GPU reports

dretch

belial

Wed Jul  2 13:12:49 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 565.57.01              Driver Version: 565.57.01      CUDA Version: 12.7     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Quadro RTX 6000                Off |   00000000:1B:00.0 Off |                  Off |
| 33%   33C    P8              4W /  260W |       1MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  Quadro RTX 6000                Off |   00000000:1C:00.0 Off |                  Off |
| 34%   55C    P0            146W /  260W |     480MiB /  24576MiB |     53%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  Quadro RTX 6000                Off |   00000000:1D:00.0 Off |                  Off |
| 33%   34C    P8              5W /  260W |       1MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   3  Quadro RTX 6000                Off |   00000000:1E:00.0 Off |                  Off |
| 45%   68C    P0            230W /  260W |    4418MiB /  24576MiB |     93%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   4  Quadro RTX 6000                Off |   00000000:3D:00.0 Off |                  Off |
| 45%   68C    P0            257W /  260W |   15642MiB /  24576MiB |    100%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   5  Quadro RTX 6000                Off |   00000000:3F:00.0 Off |                  Off |
| 33%   33C    P8              6W /  260W |       1MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   6  Quadro RTX 6000                Off |   00000000:40:00.0 Off |                  Off |
| 47%   70C    P0            262W /  260W |    2536MiB /  24576MiB |     99%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   7  Quadro RTX 6000                Off |   00000000:41:00.0 Off |                  Off |
| 33%   34C    P8              4W /  260W |       1MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    1   N/A  N/A   1505774      C   python                                        476MiB |
|    3   N/A  N/A   1508092      C   python                                       4414MiB |
|    4   N/A  N/A   1131784      C   python                                      15638MiB |
|    6   N/A  N/A   1508091      C   python                                       2532MiB |
+-----------------------------------------------------------------------------------------+

fierna

Wed Jul  2 13:12:51 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 565.57.01              Driver Version: 565.57.01      CUDA Version: 12.7     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Quadro RTX 6000                Off |   00000000:1B:00.0 Off |                  Off |
| 33%   47C    P0             70W /  260W |   19631MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  Quadro RTX 6000                Off |   00000000:1C:00.0 Off |                  Off |
| 33%   42C    P0             63W /  260W |   18315MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  Quadro RTX 6000                Off |   00000000:1D:00.0 Off |                  Off |
| 33%   55C    P0             59W /  260W |   18315MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   3  Quadro RTX 6000                Off |   00000000:1E:00.0 Off |                  Off |
| 33%   46C    P0             60W /  260W |   17471MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   4  Quadro RTX 6000                Off |   00000000:3D:00.0 Off |                  Off |
| 33%   45C    P0             68W /  260W |   17455MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   5  Quadro RTX 6000                Off |   00000000:3F:00.0 Off |                  Off |
| 34%   47C    P0             58W /  260W |   18315MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   6  Quadro RTX 6000                Off |   00000000:40:00.0 Off |                  Off |
| 34%   48C    P0             68W /  260W |   17475MiB /  24576MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   7  Quadro RTX 6000                Off |   00000000:41:00.0 Off |                  Off |
| 33%   49C    P0            115W /  260W |   18431MiB /  24576MiB |    100%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A   3289258      C   /usr/bin/python3                            18402MiB |
|    0   N/A  N/A   3289259      C   /usr/bin/python3                              164MiB |
|    0   N/A  N/A   3289260      C   /usr/bin/python3                              164MiB |
|    0   N/A  N/A   3289261      C   /usr/bin/python3                              164MiB |
|    0   N/A  N/A   3289262      C   /usr/bin/python3                              164MiB |
|    0   N/A  N/A   3289263      C   /usr/bin/python3                              164MiB |
|    0   N/A  N/A   3289264      C   /usr/bin/python3                              164MiB |
|    0   N/A  N/A   3289265      C   /usr/bin/python3                              164MiB |
|    1   N/A  N/A   3289259      C   /usr/bin/python3                            18302MiB |
|    2   N/A  N/A   3289260      C   /usr/bin/python3                            18302MiB |
|    3   N/A  N/A   3289261      C   /usr/bin/python3                            17458MiB |
|    4   N/A  N/A   3289262      C   /usr/bin/python3                            17442MiB |
|    5   N/A  N/A   3289263      C   /usr/bin/python3                            18302MiB |
|    6   N/A  N/A   3289264      C   /usr/bin/python3                            17462MiB |
|    7   N/A  N/A   3289265      C   /usr/bin/python3                            18418MiB |
+-----------------------------------------------------------------------------------------+

tiamat

Wed Jul  2 13:13:04 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 565.57.01              Driver Version: 565.57.01      CUDA Version: 12.7     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A100-SXM4-40GB          On  |   00000000:01:00.0 Off |                    0 |
| N/A   28C    P0             55W /  400W |    1101MiB /  40960MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA A100-SXM4-40GB          On  |   00000000:41:00.0 Off |                 ERR! |
|ERR!  ERR! ERR!             ERR! / ERR!  |    1101MiB /  40960MiB |    ERR!      Default |
|                                         |                        |                 ERR! |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA A100-SXM4-40GB          On  |   00000000:81:00.0 Off |                 ERR! |
|ERR!  ERR! ERR!             ERR! / ERR!  |    1101MiB /  40960MiB |    ERR!      Default |
|                                         |                        |                 ERR! |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA A100-SXM4-40GB          On  |   00000000:C1:00.0 Off |                    0 |
| N/A   26C    P0             56W /  400W |    1101MiB /  40960MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
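
GPUs 1 and 2 on tiamat report `ERR!` in several fields above. A minimal sketch for picking such GPUs out of captured `nvidia-smi` text, assuming the table layout shown in this report; `gpus_with_errors` is a hypothetical helper that scans only the device table and stops at the process list:

```python
import re

# Device rows copied from the tiamat output above.
SMI_EXCERPT = """\
|   0  NVIDIA A100-SXM4-40GB          On  |   00000000:01:00.0 Off |                    0 |
| N/A   28C    P0             55W /  400W |    1101MiB /  40960MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA A100-SXM4-40GB          On  |   00000000:41:00.0 Off |                 ERR! |
|ERR!  ERR! ERR!             ERR! / ERR!  |    1101MiB /  40960MiB |    ERR!      Default |
|                                         |                        |                 ERR! |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA A100-SXM4-40GB          On  |   00000000:81:00.0 Off |                 ERR! |
|ERR!  ERR! ERR!             ERR! / ERR!  |    1101MiB /  40960MiB |    ERR!      Default |
|                                         |                        |                 ERR! |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA A100-SXM4-40GB          On  |   00000000:C1:00.0 Off |                    0 |
| N/A   26C    P0             56W /  400W |    1101MiB /  40960MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
"""

# First row of a device block: GPU index, name, then a PCI bus id in the next cell.
GPU_ROW = re.compile(
    r"\|\s+(\d+)\s+.*\|\s+[0-9A-Fa-f]{8}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}\.\d"
)


def gpus_with_errors(text):
    """Return indices of GPUs whose device-table rows contain 'ERR!'."""
    bad = []
    current = None
    for line in text.splitlines():
        if "Processes:" in line:
            break  # the process table reuses the index column; ignore it
        m = GPU_ROW.match(line)
        if m:
            current = int(m.group(1))
        if "ERR!" in line and current is not None and current not in bad:
            bad.append(current)
    return bad


print(gpus_with_errors(SMI_EXCERPT))  # [1, 2]
```

Matching on the bus id keeps the second row of each block (fan, temperature, power) from being mistaken for a new GPU.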

vecna

Wed Jul  2 15:14:01 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 565.57.01              Driver Version: 565.57.01      CUDA Version: 12.7     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Tesla V100-SXM3-32GB           On  |   00000000:34:00.0 Off |                    0 |
| N/A   48C    P0            180W /  350W |   16516MiB /  32768MiB |     99%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  Tesla V100-SXM3-32GB           On  |   00000000:36:00.0 Off |                    0 |
| N/A   49C    P0            192W /  350W |   16542MiB /  32768MiB |     99%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  Tesla V100-SXM3-32GB           On  |   00000000:39:00.0 Off |                    0 |
| N/A   59C    P0            256W /  350W |   16540MiB /  32768MiB |     99%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   3  Tesla V100-SXM3-32GB           On  |   00000000:3B:00.0 Off |                    0 |
| N/A   62C    P0            246W /  350W |   16542MiB /  32768MiB |     99%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   4  Tesla V100-SXM3-32GB           On  |   00000000:57:00.0 Off |                    0 |
| N/A   52C    P0            253W /  350W |   16542MiB /  32768MiB |     99%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   5  Tesla V100-SXM3-32GB           On  |   00000000:59:00.0 Off |                    0 |
| N/A   68C    P0            264W /  350W |   16542MiB /  32768MiB |     99%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   6  Tesla V100-SXM3-32GB           On  |   00000000:5C:00.0 Off |                    0 |
| N/A   53C    P0            243W /  350W |   16542MiB /  32768MiB |     99%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   7  Tesla V100-SXM3-32GB           On  |   00000000:5E:00.0 Off |                    0 |
| N/A   66C    P0            247W /  350W |   16542MiB /  32768MiB |     99%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   8  Tesla V100-SXM3-32GB           On  |   00000000:B7:00.0 Off |                    0 |
| N/A   50C    P0            210W /  350W |   16542MiB /  32768MiB |     99%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   9  Tesla V100-SXM3-32GB           On  |   00000000:B9:00.0 Off |                    0 |
| N/A   49C    P0            255W /  350W |   16542MiB /  32768MiB |     99%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|  10  Tesla V100-SXM3-32GB           On  |   00000000:BC:00.0 Off |                    0 |
| N/A   63C    P0            237W /  350W |   16542MiB /  32768MiB |     99%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|  11  Tesla V100-SXM3-32GB           On  |   00000000:BE:00.0 Off |                    0 |
| N/A   66C    P0            239W /  350W |   16542MiB /  32768MiB |     99%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|  12  Tesla V100-SXM3-32GB           On  |   00000000:E0:00.0 Off |                    0 |
| N/A   53C    P0            207W /  350W |   16540MiB /  32768MiB |     99%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|  13  Tesla V100-SXM3-32GB           On  |   00000000:E2:00.0 Off |                    0 |
| N/A   52C    P0            260W /  350W |   16542MiB /  32768MiB |     99%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|  14  Tesla V100-SXM3-32GB           On  |   00000000:E5:00.0 Off |                    0 |
| N/A   68C    P0            261W /  350W |   16540MiB /  32768MiB |     99%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|  15  Tesla V100-SXM3-32GB           On  |   00000000:E7:00.0 Off |                    0 |
| N/A   65C    P0            237W /  350W |   16420MiB /  32768MiB |     99%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A   1251301      C   python                                      16508MiB |
|    1   N/A  N/A   1251461      C   python                                      16534MiB |
|    2   N/A  N/A   1251699      C   python                                      16532MiB |
|    3   N/A  N/A   1251784      C   python                                      16534MiB |
|    4   N/A  N/A   1251868      C   python                                      16534MiB |
|    5   N/A  N/A   1252020      C   python                                      16534MiB |
|    6   N/A  N/A   1252118      C   python                                      16534MiB |
|    7   N/A  N/A   1252120      C   python                                      16534MiB |
|    8   N/A  N/A   1252255      C   python                                      16534MiB |
|    9   N/A  N/A   1252339      C   python                                      16534MiB |
|   10   N/A  N/A   1252409      C   python                                      16534MiB |
|   11   N/A  N/A   1252551      C   python                                      16534MiB |
|   12   N/A  N/A   1253617      C   python                                      16532MiB |
|   13   N/A  N/A   1252631      C   python                                      16534MiB |
|   14   N/A  N/A   1253547      C   python                                      16532MiB |
|   15   N/A  N/A   1252650      C   python                                      16412MiB |
+-----------------------------------------------------------------------------------------+

asmodeus

Wed Jul  2 13:14:03 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.86.15              Driver Version: 570.86.15      CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A100-SXM4-80GB          On  |   00000000:01:00.0 Off |                    0 |
| N/A   58C    P0            432W /  500W |   71687MiB /  81920MiB |    100%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA A100-SXM4-80GB          On  |   00000000:41:00.0 Off |                    0 |
| N/A   52C    P0            514W /  500W |   70535MiB /  81920MiB |     98%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA A100-SXM4-80GB          On  |   00000000:81:00.0 Off |                    0 |
| N/A   27C    P0             60W /  500W |       1MiB /  81920MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA A100-SXM4-80GB          On  |   00000000:C1:00.0 Off |                    0 |
| N/A   39C    P0             96W /  500W |   70535MiB /  81920MiB |    100%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A         2613081      C   /usr/bin/python3                      70376MiB |
|    0   N/A  N/A         2613082      C   /usr/bin/python3                        416MiB |
|    0   N/A  N/A         2613083      C   /usr/bin/python3                        416MiB |
|    1   N/A  N/A         2613082      C   /usr/bin/python3                      70372MiB |
|    3   N/A  N/A         2613083      C   /usr/bin/python3                      70372MiB |
+-----------------------------------------------------------------------------------------+

zariel

Wed Jul  2 15:14:07 2025       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.216.03             Driver Version: 535.216.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA A100-SXM4-40GB          On  | 00000000:07:00.0 Off |                    0 |
| N/A   30C    P0              55W / 400W |      0MiB / 40960MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA A100-SXM4-40GB          On  | 00000000:0F:00.0 Off |                    0 |
| N/A   29C    P0              53W / 400W |      0MiB / 40960MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   2  NVIDIA A100-SXM4-40GB          On  | 00000000:47:00.0 Off |                    0 |
| N/A   47C    P0             216W / 400W |  24167MiB / 40960MiB |     99%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   3  NVIDIA A100-SXM4-40GB          On  | 00000000:4E:00.0 Off |                    0 |
| N/A   49C    P0             220W / 400W |  28939MiB / 40960MiB |    100%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   4  NVIDIA A100-SXM4-40GB          On  | 00000000:87:00.0 Off |                    0 |
| N/A   55C    P0             234W / 400W |  24167MiB / 40960MiB |     99%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   5  NVIDIA A100-SXM4-40GB          On  | 00000000:90:00.0 Off |                    0 |
| N/A   53C    P0             197W / 400W |  24167MiB / 40960MiB |     99%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   6  NVIDIA A100-SXM4-40GB          On  | 00000000:B7:00.0 Off |                    0 |
| N/A   41C    P0              53W / 400W |      3MiB / 40960MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   7  NVIDIA A100-SXM4-40GB          On  | 00000000:BD:00.0 Off |                    0 |
| N/A   40C    P0              58W / 400W |      3MiB / 40960MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    2   N/A  N/A    528077      C   python                                    24158MiB |
|    3   N/A  N/A    501467      C   python                                    28930MiB |
|    4   N/A  N/A   3827687      C   python                                    24158MiB |
|    5   N/A  N/A   2247100      C   python                                    24158MiB |
+---------------------------------------------------------------------------------------+

demogorgon

Wed Jul  2 13:14:11 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 565.57.01              Driver Version: 565.57.01      CUDA Version: 12.7     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA A40                     On  |   00000000:01:00.0 Off |                    0 |
|  0%   28C    P8             24W /  300W |       1MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA A40                     On  |   00000000:25:00.0 Off |                    0 |
|  0%   28C    P8             26W /  300W |       1MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA A40                     On  |   00000000:41:00.0 Off |                    0 |
|  0%   28C    P8             23W /  300W |       1MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA A40                     On  |   00000000:61:00.0 Off |                    0 |
|  0%   28C    P8             24W /  300W |       1MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   4  NVIDIA A40                     On  |   00000000:81:00.0 Off |                    0 |
|  0%   29C    P8             24W /  300W |       1MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   5  NVIDIA A40                     On  |   00000000:A1:00.0 Off |                    0 |
|  0%   28C    P8             24W /  300W |       1MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   6  NVIDIA A40                     On  |   00000000:C1:00.0 Off |                    0 |
|  0%   29C    P8             24W /  300W |       1MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   7  NVIDIA A40                     On  |   00000000:E1:00.0 Off |                    0 |
|  0%   30C    P8             24W /  300W |       1MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

kiaransalee

Wed Jul  2 13:14:14 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.127.08             Driver Version: 550.127.08     CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA H100 80GB HBM3          On  |   00000000:26:00.0 Off |                    0 |
| N/A   28C    P0             95W /  700W |       1MiB /  81559MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA H100 80GB HBM3          On  |   00000000:2F:00.0 Off |                    0 |
| N/A   32C    P0             79W /  700W |       1MiB /  81559MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA H100 80GB HBM3          On  |   00000000:46:00.0 Off |                    0 |
| N/A   30C    P0             81W /  700W |       1MiB /  81559MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA H100 80GB HBM3          On  |   00000000:54:00.0 Off |                    0 |
| N/A   33C    P0            141W /  700W |   61082MiB /  81559MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   4  NVIDIA H100 80GB HBM3          On  |   00000000:A6:00.0 Off |                    0 |
| N/A   31C    P0            166W /  700W |   29392MiB /  81559MiB |     27%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   5  NVIDIA H100 80GB HBM3          On  |   00000000:AF:00.0 Off |                    0 |
| N/A   55C    P0            609W /  700W |   34202MiB /  81559MiB |     71%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   6  NVIDIA H100 80GB HBM3          On  |   00000000:C6:00.0 Off |                   On |
| N/A   32C    P0             81W /  700W |      89MiB /  81559MiB |     N/A      Default |
|                                         |                        |              Enabled |
+-----------------------------------------+------------------------+----------------------+
|   7  NVIDIA H100 80GB HBM3          On  |   00000000:CF:00.0 Off |                   On |
| N/A   26C    P0             74W /  700W |      90MiB /  81559MiB |     N/A      Default |
|                                         |                        |              Enabled |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| MIG devices:                                                                            |
+------------------+----------------------------------+-----------+-----------------------+
| GPU  GI  CI  MIG |                     Memory-Usage |        Vol|      Shared           |
|      ID  ID  Dev |                       BAR1-Usage | SM     Unc| CE ENC DEC OFA JPG    |
|                  |                                  |        ECC|                       |
|==================+==================================+===========+=======================|
|  6    1   0   0  |              51MiB / 40320MiB    | 64      0 |  4   0    4    0    4 |
|                  |                 0MiB / 65535MiB  |           |                       |
+------------------+----------------------------------+-----------+-----------------------+
|  6    2   0   1  |              38MiB / 40320MiB    | 60      0 |  3   0    3    0    3 |
|                  |                 0MiB / 65535MiB  |           |                       |
+------------------+----------------------------------+-----------+-----------------------+
|  7    7   0   0  |              13MiB /  9984MiB    | 16      0 |  1   0    1    0    1 |
|                  |                 0MiB / 16383MiB  |           |                       |
+------------------+----------------------------------+-----------+-----------------------+
|  7    8   0   1  |              13MiB /  9984MiB    | 16      0 |  1   0    1    0    1 |
|                  |                 0MiB / 16383MiB  |           |                       |
+------------------+----------------------------------+-----------+-----------------------+
|  7    9   0   2  |              13MiB /  9984MiB    | 16      0 |  1   0    1    0    1 |
|                  |                 0MiB / 16383MiB  |           |                       |
+------------------+----------------------------------+-----------+-----------------------+
|  7   11   0   3  |              13MiB /  9984MiB    | 16      0 |  1   0    1    0    1 |
|                  |                 0MiB / 16383MiB  |           |                       |
+------------------+----------------------------------+-----------+-----------------------+
|  7   12   0   4  |              13MiB /  9984MiB    | 16      0 |  1   0    1    0    1 |
|                  |                 0MiB / 16383MiB  |           |                       |
+------------------+----------------------------------+-----------+-----------------------+
|  7   13   0   5  |              13MiB /  9984MiB    | 16      0 |  1   0    1    0    1 |
|                  |                 0MiB / 16383MiB  |           |                       |
+------------------+----------------------------------+-----------+-----------------------+
|  7   14   0   6  |              13MiB /  9984MiB    | 16      0 |  1   0    1    0    1 |
|                  |                 0MiB / 16383MiB  |           |                       |
+------------------+----------------------------------+-----------+-----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    3   N/A  N/A   3985182      C   /opt/conda/bin/python                       61072MiB |
|    4   N/A  N/A      6593      C   /opt/conda/bin/python                       29382MiB |
|    5   N/A  N/A     20037      C   python                                      34192MiB |
+-----------------------------------------------------------------------------------------+