Differences
This shows you the differences between two versions of the page.
| breakout [2018/12/27 09:40] – [Running long jobs] beckmanf | breakout [2025/03/12 17:25] (current) – [Breakout - GPU HPC an der Fakultät Elektrotechnik] beckmanf | ||
|---|---|---|---|
| Line 9: | Line 9: | ||
| * 2 x 480 GB SATA3 SSD Intel DC S3500 | * 2 x 480 GB SATA3 SSD Intel DC S3500 | ||
| * 1 x 400 GB PCIe NVME SSD P3500 | * 1 x 400 GB PCIe NVME SSD P3500 | ||
| + | * 1 x 12 TB Western Digital DC HC520 SATA3 (new 12/2021) | ||
| * Intel X540-T2 10GBASE-T Ethernet network interface | * Intel X540-T2 10GBASE-T Ethernet network interface | ||
| * 4 x NVIDIA GeForce GTX 1080 with GP104 Pascal, 2560 cores, 8 GB RAM | * 4 x NVIDIA GeForce GTX 1080 with GP104 Pascal, 2560 cores, 8 GB RAM | ||
| - | * Debian | + | * Debian |
| - | * NVIDIA driver | + | * NVIDIA driver |
| - | * Kernel | + | * Kernel |
| - | * CUDA 10 | + | * CUDA 12.8.1-1 |
| + | * Tensorflow, Torch | ||
| + | * Docker 5: | ||
| + | * NVIDIA Container Toolkit 1.17.5-1 | ||
| ===== Usage Notes ===== | ===== Usage Notes ===== | ||
| Line 21: | Line 25: | ||
| < | < | ||
| - | MacBook: | + | ssh -p 2222 < |
| </ | </ | ||
| Line 32: | Line 36: | ||
| < | < | ||
| - | MacBook: ssh -Y -p 2222 < | + | MacBook: ssh -Y -p 2222 < |
| </ | </ | ||
| Line 50: | Line 54: | ||
| < | < | ||
| - | MacBook: ssh -p 2222 fritz@hs-augsburg.de | + | MacBook: ssh -p 2222 fritz@breakout.hs-augsburg.de |
| </ | </ | ||
| Line 150: | Line 154: | ||
| </ | </ | ||
| - | Now you can start a program. To leave the tmux session (and the program) running, type CTRL-b d. This detaches you from the tmux session. You can then log out of your ssh session and everything keeps running on the breakout. | + | Now you can start a program. To leave the tmux session (and the program) running, type CTRL-b d. This detaches you from the tmux session. You can then log out of your ssh session and everything keeps running on the breakout. |
| < | < | ||
| Line 156: | Line 160: | ||
| </ | </ | ||
| - | Then you should see the output from your running program. | + | You should see the output from your running program. |
| === kerberos - keep your file system alive === | === kerberos - keep your file system alive === | ||
| Line 226: | Line 230: | ||
| == Start a job with automatic kerberos ticket renew == | == Start a job with automatic kerberos ticket renew == | ||
| - | You can do the ticket renew process automatically. When you start a job with " | + | You can do the ticket renew process automatically. When you start a job with " |
| < | < | ||
| Line 232: | Line 236: | ||
| </ | </ | ||
| - | If you do this inside a tmux session, then you detach and log out. The job will run for up to seven days. | + | If you do this inside a tmux session, then you can detach and log out. The job will run for up to seven days. When you log in later, you can check the status of the job's Kerberos ticket again with klist. You have to provide the filename of the job's ticket cache. |
| + | |||
| + | < | ||
| + | klist / | ||
| + | </ | ||
| + | |||
| + | In my example the new cache name from krenew was / | ||
| + | |||
| + | == Login via Public Key Authentication == | ||
| + | |||
| + | When you log in via Public Key Authentication, | ||
| ==== PyTorch ==== | ==== PyTorch ==== | ||
| Line 324: | Line 338: | ||
| </ | </ | ||
| - | The training takes about 5 days on the breakout. Refer to "Running long jobs" | + | The training takes about 5 days on the breakout. Refer to [[#Running long jobs]] to see how you can run such long jobs on the breakout. |
| ==== Civil Engineers - Photoscan ==== | ==== Civil Engineers - Photoscan ==== | ||
| Line 497: | Line 510: | ||
| Once you reconnected to the server, you are ready to use python3 with TensorFlow. | Once you reconnected to the server, you are ready to use python3 with TensorFlow. | ||
| + | |||
| + | ==== Deskproto ==== | ||
| + | |||
| + | The Deskproto CAM software [[sw-milling|for milling]] is installed and can be started with the GUI. Please start the graphical desktop manager via TurboVNC as described in the [[breakout# | ||
| + | |||
| + | === First Time Setup === | ||
| + | |||
| + | The first run of Deskproto requires two setup steps. First, run Deskproto from your home directory. | ||
| + | |||
| + | < | ||
| + | cd | ||
| + | / | ||
| + | </ | ||
| + | |||
| + | Select your language and scaling, and choose any machine. We will overwrite the machine choice in the next step. Once Deskproto has started, close it. Starting Deskproto for the first time creates two directories | ||
| + | |||
| + | < | ||
| + | ~/ | ||
| + | ~/ | ||
| + | </ | ||
| + | |||
| + | which contain drivers, help pages, etc. We have the [[sw-milling|StepFour XPERT 1000s]] mill in the lab and use [[https:// | ||
| + | |||
| + | < | ||
| + | cd | ||
| + | cp / | ||
| + | </ | ||
| + | |||
| + | === Startup of Deskproto === | ||
| + | |||
| + | After you have overwritten the configuration file, you can start Deskproto. Due to a bug, file access to your NFS-mounted home directory is slow: any file dialog takes a long time (roughly 2 minutes) to display the files in your home directory. As a workaround, you can redefine the HOME variable for Deskproto only when you start it. | ||
| + | |||
| + | < | ||
| + | cd | ||
| + | HOME=/fast / | ||
| + | </ | ||
| + | |||
| + | |||
| + | |||
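
The HOME override used for the Deskproto startup relies on a general shell feature: a `VAR=value` prefix changes the environment of that one command only. The following is a minimal sketch of that behaviour, not the actual startup command. `sh -c` is a stand-in for the Deskproto binary, whose path is truncated above, and the variable names `original_home` and `seen_by_child` are chosen purely for illustration.

```shell
#!/bin/sh
# Sketch: a VAR=value prefix applies to a single command, not to the login shell.

# Remember the real home directory so it can be compared afterwards.
original_home="$HOME"

# 'sh -c' stands in for the real program (assumption: any child process behaves alike).
# The child reports the HOME value it received in its environment.
seen_by_child=$(HOME=/fast sh -c 'printf "%s" "$HOME"')

echo "child process saw HOME=$seen_by_child"
echo "login shell still has HOME=$HOME"
```

Because the override is limited to the one command, other programs started from the same shell keep using the regular NFS home directory.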