Jason Nucciarone · @nuccitheboss
27 followers · 36 posts · Server mast.hpc.social

If anyone is interested in testing a backport of Slurm 23.02 for Ubuntu 22.04 LTS, I published a PPA yesterday that contains the necessary Debian packages (I had to backport rocm-smi-lib too):

```bash
sudo add-apt-repository ppa:ubuntu-hpc/slurm-wlm-23.02
sudo apt install <slurm package>
```

I'm keen on making sure that it works for other people too and not just me 😅

#slurm

Last updated 1 year ago

Bjørnar 🇧🇻 · @btuftin
109 followers · 2610 posts · Server social.coop

Job keeps running out of memory and I keep increasing the memory request until I notice I've got a typo so it's been running on the default 4GB every time ... 🤦

#slurm #programming #analysis

Last updated 1 year ago

Tyler Smith · @plantarum
692 followers · 1001 posts · Server ottawa.place

@Danwwilson @Mehrad @rstats with Org mode you have a lot of options here: you can open and edit files on the server in your local Emacs. Or you can edit local files in a local Emacs, but run the code on a remote server.

It's a little roundabout, but I even use this system for editing multiple scripts on a local machine and submitting them as jobs to a remote cluster. I have been meaning to write this up for a while, maybe later in the summer.

#orgmode #rstats #slurm

Last updated 1 year ago

nojhan · @nojhan
497 followers · 1891 posts · Server mamot.fr

Depends on core utils, and colout: github.com/nojhan/colout

#shell #bash #slurm #hpc

Last updated 1 year ago

Matthieu Urvoy · @birozularutti
6 followers · 257 posts · Server piaille.fr

Hello Mastodon!

Are some of you using Slurm or GNU Parallel to distribute computations across a set of network nodes?

Requirements:
- multiple executions of a single C++ app, with different arguments
- no communication between executions required
- approximately 20000 executions to distribute across 5-10 nodes
- possibility to run 2-4 tasks in parallel on the same node to speed things up
- retrieval of individual output files for further analysis

What do you suggest?
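
One possible shape for this with Slurm is a job array in which each array task works through a chunk of the runs. This is only a sketch under assumptions: `args.txt` (one argument string per line), `./my_app`, the `results/` directory and the chunk size of 200 are all hypothetical, and whether several tasks actually share a node depends on how the cluster is configured. The `%20` suffix caps how many array tasks run at the same time.

```bash
#!/bin/bash
# 100 array tasks, at most 20 running at once (e.g. 5 nodes x 4 tasks each).
#SBATCH --array=0-99%20
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1

# Runs per array task: 100 tasks x 200 runs = 20000 runs in total.
CHUNK=200

# Print "first last": the 1-based line range of the args file owned by a task.
# $1 = array task index (0-based), $2 = chunk size.
chunk_range() {
    echo "$(( $1 * $2 + 1 )) $(( ($1 + 1) * $2 ))"
}

# Only do real work when Slurm has set the array index for this task.
if [ -n "${SLURM_ARRAY_TASK_ID:-}" ]; then
    set -- $(chunk_range "$SLURM_ARRAY_TASK_ID" "$CHUNK")
    first=$1
    last=$2
    mkdir -p results
    n=$first
    # args.txt (hypothetical) holds one argument string per line.
    sed -n "${first},${last}p" args.txt | while IFS= read -r args; do
        # $args is left unquoted on purpose so a line splits into separate
        # arguments; stdin is redirected so the app cannot eat the loop's input.
        ./my_app $args > "results/run_${n}.out" < /dev/null
        n=$(( n + 1 ))
    done
fi
```

Each run writes its own `results/run_N.out`, so the individual outputs can be collected afterwards. Chunking keeps the array small, which helps on clusters that limit the size of a single array.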

#distributedcomputing #gnuparallel #slurm

Last updated 2 years ago

AskUbuntu · @askubuntu
78 followers · 2079 posts · Server ubuntu.social

Who is mounting /cgroup/cpuacct and /cgroup/freezer in Ubuntu 22.04? #2204

askubuntu.com/q/1464812/612

#cgroup #slurm

Last updated 2 years ago

nebelgrau/Michal · @nebelgrau77
108 followers · 883 posts · Server fosstodon.org

I don't want to brag but I kinda do 🤭 I ran my own little program on two clusters today. It's a tiny tool that runs a command, processes the output with regex, and outputs metrics. But it works really nicely, it's easy to adapt the RegEx to what I need to get, and it's easy to configure the metrics the way I want. Right now it's just counting the GPUs in use, labeling them with GPU type, EC2 instance type, etc., but technically could be...
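
The command-to-regex-to-metrics flow can be sketched in a few lines of shell. This is not the Rust tool from the post, just an illustration of the idea under assumptions: the input format (`gpu:<type>:<count>`, one entry per line) and the metric name `slurm_gpus_in_use` are made up. The output follows the Prometheus text exposition format.

```bash
#!/bin/bash
# Read lines such as "gpu:a100:4" on stdin (a hypothetical format) and print
# one Prometheus gauge sample per GPU type. The metric name is made up.
gpu_metrics() {
    echo '# TYPE slurm_gpus_in_use gauge'
    # The regex captures the GPU type and the count; other lines are dropped.
    sed -nE 's/^gpu:([A-Za-z0-9_]+):([0-9]+)$/\1 \2/p' |
        awk '{ sum[$1] += $2 }
             END { for (t in sum) printf "slurm_gpus_in_use{gpu_type=\"%s\"} %d\n", t, sum[t] }' |
        sort
}
```

Feeding it sample lines with `printf 'gpu:a100:4\ngpu:v100:2\n' | gpu_metrics` prints one sample per GPU type; in practice the input would come from whichever Slurm command the exporter wraps, which the post does not name.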

#rustlang #slurm #regex #prometheus

Last updated 2 years ago

nebelgrau/Michal · @nebelgrau77
106 followers · 832 posts · Server fosstodon.org

Good news, everyone! 😀 I wrote a little piece of 🦀 code, which is now running on a cluster, like, at work!
It's a simple exporter that reads the output of a sinfo command and outputs the metric. But it works, and it takes some args (port and update interval). The original idea was to figure out how to modify the existing exporter written in Go, but as I know almost nothing about it, and also this opportunity showed up... go Rust! 🙃

#rustlang #prometheus #slurm

Last updated 2 years ago

Scalable Analyses · @scalable
3 followers · 32 posts · Server fosstodon.org

AWS ParallelCluster has a new user interface. The software automates the setup of multi-node HPC machines in the public cloud.

day1hpc.com/post/announcing-th

#aws #hpc #cloud #slurm

Last updated 2 years ago

Matt Vaughn ⛅️ · @yakshavers
202 followers · 154 posts · Server hachyderm.io

Optimize your workloads with Slurm-based memory-aware scheduling. Automatically balance memory usage for improved efficiency and reduced wait times 🚀☁️ day1hpc.com/post/slurm-based-m

#hpc #aws #parallelcluster #slurm

Last updated 2 years ago

Aalto Scientific Computing · @SciCompAalto
101 followers · 36 posts · Server fosstodon.org

On a cluster, array jobs let you parallelize things without parallelizing your code: you parallelize an easy script instead. For many tasks, this is enough! The basic idea is same code, slightly different data, and shell scripts connect it all. Our array tutorial explains the concepts and provides copy-and-paste examples, and works on any cluster.

scicomp.aalto.fi/triton/tut/ar
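
A minimal sketch of the "same code, slightly different data" idea, not taken from the tutorial: it assumes a hypothetical `inputs.txt` with one input per line and a hypothetical `./analyze` program. Slurm sets `SLURM_ARRAY_TASK_ID` for every array task, and the script uses it to pick its own line.

```bash
#!/bin/bash
# Ten array tasks with indices 1..10; %A is the array job ID, %a the task index.
#SBATCH --array=1-10
#SBATCH --time=00:10:00
#SBATCH --mem=1G
#SBATCH --output=array_%A_%a.out

# Return line N (1-based) of a file; each array task calls this with its index.
# $1 = line number, $2 = file name.
nth_line() {
    sed -n "${1}p" "$2"
}

# Only do real work when Slurm has set the array index for this task.
if [ -n "${SLURM_ARRAY_TASK_ID:-}" ]; then
    # inputs.txt and ./analyze are hypothetical placeholders.
    input=$(nth_line "$SLURM_ARRAY_TASK_ID" inputs.txt)
    ./analyze "$input"
fi
```

Submitted once with `sbatch`, this runs the same script ten times, and the only thing that differs between tasks is the line each one reads.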

#hpc #shellscript #slurm #rseng #scicomp #tip

Last updated 2 years ago

LisPi · @lispi314
186 followers · 2686 posts · Server mastodon.top

@miko Neat, I didn't know about Slurm.

#slurm

Last updated 2 years ago

Angel Pizarro · @delagoya
41 followers · 21 posts · Server hachyderm.io

Nice post by @yakshavers on Slurm accounting, which adds flexibility, transparency, and control to operating an HPC cluster on AWS using ParallelCluster! Version 3.3.0 can now automatically configure accounting whether you are using your own database or Amazon Aurora.

aws.amazon.com/blogs/hpc/lever

#hpc #aws #parallelcluster #slurm #aurora

Last updated 2 years ago

Matt Vaughn 🎄🕎 · @yakshavers
191 followers · 128 posts · Server hachyderm.io

🚨Release Alert: AWS ParallelCluster 3.4.1 is out today on PyPI. It fixes an issue with Slurm where nodes could become inaccessible or backed by the wrong instance type 🐛 pypi.org/project/aws-parallelc

#aws #parallelcluster #slurm #ec2 #hpc

Last updated 2 years ago

Bart Janssens 🇧🇪 · @bart
632 followers · 549 posts · Server sociabl.be

@mbauman Aha, that is good to know, as well as that there is a Slurm option to disable turbo mode. Impressive results then:

discourse.julialang.org/t/how-

#slurm

Last updated 2 years ago

Rebecca Hartman-Baker · @hpcgal
211 followers · 125 posts · Server mast.hpc.social

@rkdarst @minrk @priesgo @SciCompAalto at NERSC we also offer JupyterHub on HPC using Slurm. I was not involved in setting it up or running it, but I can connect you with the right people if you have questions.

#nersc #jupyterhub #hpc #slurm

Last updated 2 years ago

· @johanneskoester
115 followers · 19 posts · Server fosstodon.org

So here is your first Christmas present: Snakemake 7.19 is released, adding native Slurm support, which @rupdecat and I have implemented over the last few months. Apart from that, the release provides various bug fixes. snakemake.github.io

#snakemake #slurm #sciworkflows #reproducibility

Last updated 2 years ago

nebelgrau/Michal · @nebelgrau77
73 followers · 341 posts · Server fosstodon.org

So... I think I finally got it to work. I have a tiny cluster 🙃 Well, cluster is a big word. A clusterino, as Ned Flanders would put it. I followed some of the "Raspberry Pi cluster" articles and tutorials, but as I only have one Raspberry Pi, it's the head node, and the compute node is my desktop. And a 32GB USB key is the NFS 😂 But it is fun for sure 🙃

#slurm #raspi #hpc

Last updated 2 years ago

Paniz Karbasi · @hpcgenome
26 followers · 13 posts · Server mast.hpc.social

Slurm presentations from SC22 are now available: slurm.schedmd.com/publications
There's one topic I had been hoping would be discussed for so long, and it finally happened: Slurm and/or/versus Kubernetes.
It talks about potentially getting Slurm to work with K8s. That may not be strictly necessary, as engineers run on-prem and K8s separately, but there are still good reasons to think about integration, like managing one infrastructure instead of two.

#slurm #sc22

Last updated 2 years ago

F.Felix · @ffelixr
53 followers · 55 posts · Server mastodon.social

After many years, I'm coming back to play with job schedulers for cluster computing. 😀 This is a humble project but happy to see that "I Still Do"

#slurm

Last updated 2 years ago