Useful tips for using Minerva
Quota
To find your current quota:
beegfs-ctl --getquota --uid $(id -u) --mount=/home
beegfs-ctl --getquota --uid $(id -u) --mount=/work
Unsorted
(From Ian's collection of useful commands. TODO: tidy these up)
Using ethernet
export I_MPI_FABRICS=shm:tcp
export I_MPI_TCP_NETMASK=enp7s0
Idle loaded nodes
sinfo -p container -t idle -o "%n %O" -h|sort -k 2 -n|awk '$2 > 0.5 {print}'
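In the command above, %n is the node hostname and %O its CPU load, so the pipeline lists idle nodes whose load is still above 0.5. A minimal sketch of the same filter with an adjustable threshold follows; the function name loaded_nodes is hypothetical and not part of Slurm.

```shell
# loaded_nodes [THRESHOLD]
# Reads "hostname load" lines on stdin, sorts them by load and prints
# only the nodes whose load exceeds THRESHOLD (default 0.5).
# Hypothetical helper, not a Slurm command.
loaded_nodes() {
    LC_ALL=C sort -k 2 -n | awk -v t="${1:-0.5}" '$2 > t {print}'
}

# Assumed usage on the cluster:
#   sinfo -p container -t idle -o "%n %O" -h | loaded_nodes 1.0
```

LC_ALL=C is set so that the numeric sort treats "." as the decimal separator whatever the login locale is.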
Rogue processes
for n in $(sinfo -p container -t idle -o %n -h); do printf "%s\t%s\n" "$n" "$(ssh -n "$n" ps -e -o comm= | grep -v 'supervisord\|slurmd\|rsyslogd\|sshd\|ps\|munged\|nslcd' | tr '\n' ' ')"; done
for n in $(sinfo -p container -t drain,idle -o %n -h); do printf "%s\t%s\n" "$n" "$(ssh -n "$n" ps -e -o comm= | grep -v 'supervisord\|slurmd\|rsyslogd\|sshd\|ps\|munged\|nslcd' | tr '\n' ' ')"; done
(the second form also checks drained nodes)
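The grep filter in these loops matches substrings, so the ps pattern also hides any command whose name merely contains "ps" (cups-browsed, for example). A minimal sketch of an exact-match filter over the same list of expected daemons follows; the function name filter_rogue is hypothetical.

```shell
# filter_rogue
# Reads one command name per line on stdin and prints only the names
# that are NOT in the list of expected system processes.
# -x matches whole lines, -F treats each pattern as a fixed string.
# Hypothetical helper; the list is the one used in the loops above.
filter_rogue() {
    grep -vxF \
        -e supervisord -e slurmd -e rsyslogd -e sshd \
        -e ps -e munged -e nslcd
}

# Assumed usage for a single node held in $n:
#   ssh -n "$n" ps -e -o comm= | filter_rogue
```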
Active processes on each node
while read node; do echo $node; ssh $node top -b -n 1 -i -d 10 </dev/null|tail -n +8; done <machines.txt
Job results
sacct -S 2016-07-19 -u ian -o jobid,end,alloccpus,jobname%20,state|grep -v batch|sort -k 2
Reason nodes are in drain
sinfo -o "%P %.5a %.10l %.6D %.6t %N %E"
sinfo -p normal -t drain,down -o "%20H %.40E: %N" | sort -r
(with timestamps)
Nodes used by users
squeue -a|awk '$5 == "R" {user = $4; nodes[user] += $7} END {for (u in nodes) {print u,nodes[u]}}'
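The awk program above relies on the column positions of the default squeue output, which shift if a job name contains spaces. A sketch that asks squeue for only the three fields it needs (%u user, %t compact state, %D node count) and sums nodes per user follows; the function name nodes_per_user is hypothetical.

```shell
# nodes_per_user
# Reads "user state nodes" lines on stdin and prints the total number
# of nodes held by running (state R) jobs for each user, sorted by user.
# Hypothetical helper, not a Slurm command.
nodes_per_user() {
    awk '$2 == "R" {nodes[$1] += $3}
         END {for (u in nodes) print u, nodes[u]}' | sort
}

# Assumed usage on the cluster:
#   squeue -a -h -o "%u %t %D" | nodes_per_user
```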
Nodes in job
For a job whose node list is c[109,135-141]:
for node in c109 c135 c136 c137 c138 c139 c140 c141; do echo $node; ssh $node top -b -n 1 -i -d 10 </dev/null|tail -n +8; done
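The loop above spells out the compressed node list c[109,135-141] by hand. On a Slurm cluster, scontrol show hostnames 'c[109,135-141]' performs this expansion. Where scontrol is not at hand, a minimal awk sketch can do the same for the simple case of one prefix and one bracket group; the function name expand_nodelist is hypothetical, and zero-padded numbers are not preserved.

```shell
# expand_nodelist LIST
# Expands a simple Slurm-style node list such as c[109,135-141] into
# one hostname per line. Handles a single prefix followed by a single
# bracket group; a plain hostname is printed unchanged.
# Hypothetical helper; leading zeros in ranges are dropped.
expand_nodelist() {
    printf '%s\n' "$1" | awk '
    {
        lb = index($0, "[")
        if (lb == 0 || substr($0, length($0)) != "]") { print; next }
        prefix = substr($0, 1, lb - 1)
        body = substr($0, lb + 1, length($0) - lb - 1)
        n = split(body, parts, ",")
        for (k = 1; k <= n; k++) {
            if (split(parts[k], range, "-") == 2) {
                # a range such as 135-141
                for (i = range[1] + 0; i <= range[2] + 0; i++) print prefix i
            } else {
                # a single number such as 109
                print prefix parts[k]
            }
        }
    }'
}

# Assumed usage:
#   for node in $(expand_nodelist 'c[109,135-141]'); do echo $node; done
```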
Node list of a job
squeue -j 37411 -o %N -h
Interactive job
srun -N 1 --pty bash
Installing LALSimulation
git clone git@git.ligo.org:lscsoft/lalsuite.git
cd lalsuite
module load gsl/gcc/2.4
export LIBRARY_PATH=/home/sossokine/sw/lib
export CPATH=/home/sossokine/sw/include
export PYTHONPATH=$PWD/lalsimulation/python:$PWD/lal/python/lal:$PYTHONPATH
./00boot
./configure --prefix ~/software/lalsuite --enable-all-lal=no --enable-lalsimulation=yes
nice make -j 8
make install
clusters/minerva/useful_tips.txt · Last modified: by stefgru
