
How to onboard researchers and labs

Two types of researchers use our cluster: CAIS researchers and researchers who belong to other labs. A researcher's lab information should be on the Onboarding Kanban; look for a tile with the researcher's name. If the information isn't present, or you are unable to determine which lab a researcher belongs to, contact Oliver Zhang. Researchers who are ready to be onboarded are listed under the Need to Add to Cluster section. The researcher's public SSH key should be attached to the tile, in the tile's SSH Key section.

Onboarding Script

We now have an onboarding script that automates most of the onboarding process. The script is located at /opt/oci-hpc/bin/onboard.sh. To use it, edit the script and provide values for the following variables:

...
# Variables (Replace these with actual values or pass them as arguments)
lab_name='' # john_smith
researcher_name='' # paul_atredies
full_name='' # Paul Atredies
user_password=''
parent_account='' # cais, Labs, grads, etc... 
ssh_key=''
...
Then execute onboard.sh. When you're done, don't forget to clear the values you set.
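
To help confirm that the values were cleared afterwards, a small check along these lines may be useful. This is only a sketch: `list_filled_vars` is a hypothetical helper, not part of onboard.sh, and it assumes the variables are assigned with single quotes at the start of a line, as in the snippet above.

```shell
# Hypothetical helper: list onboarding variables that still hold a value.
# Assumes single-quoted assignments at the start of a line (name='value').
list_filled_vars() {
  script_path="$1"
  grep -E "^(lab_name|researcher_name|full_name|user_password|parent_account|ssh_key)='[^']+'" "$script_path" \
    | cut -d= -f1
}
```

Running `list_filled_vars /opt/oci-hpc/bin/onboard.sh` should print nothing once every value has been reset to `''`.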

Manual Steps

Lab onboarding

If the researcher belongs to a lab that already exists on the cluster, skip to the next section. For name_of_lab, use the name of the lab's PI (for example, john_smith).

# on the bastion node
# add the lab group: 
cluster group create {name_of_lab}
# add slurm lab account:
sudo sacctmgr add account {name_of_lab} Parent=labs Description="Professor X Lab" Organization=Prof_X
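
The lab name can be derived from the PI's name mechanically. The sketch below assumes the lowercase, underscore-separated convention suggested by the john_smith example in the onboarding script; `lab_name_from_pi` is a hypothetical helper, so adjust it if the cluster uses a different convention.

```shell
# Hypothetical helper: turn a PI's full name into a lab name.
# Assumption: lab names are lowercase with underscores (e.g. john_smith).
lab_name_from_pi() {
  printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | tr -s ' ' '_'
}
```

For example, `lab_name_from_pi "John Smith"` prints `john_smith`.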

Researcher onboarding

Create cluster user

# retrieve the lab group id
cluster group list | grep -wA1 {name_of_lab} | grep gidNumber
> gidNumber: {group_id}

cluster user add {researcher_name} --gid {group_id}
password = NQVjnPD6bYY8SNNade7aeFxTSKJZWVqR
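
To avoid copying the group id by hand, the lookup can be captured in a shell variable. This is a minimal sketch: `extract_gid` is a hypothetical helper, and it assumes the output contains a line of the form `gidNumber: {group_id}`, as shown above.

```shell
# Hypothetical helper: print the number from the first "gidNumber: N" line on stdin.
extract_gid() {
  awk '$1 == "gidNumber:" { print $2; exit }'
}
```

It could be used as `group_id=$(cluster group list | grep -wA1 {name_of_lab} | extract_gid)`, followed by `cluster user add {researcher_name} --gid "$group_id"`. Check that the variable is not empty before creating the user.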

Create Slurm user:

Add the user to the proper Slurm account:

sudo sacctmgr create user {researcher_name} DefaultAccount={name_of_lab}

Add SSH key

# become the user
sudo su {researcher_name}
# add SSH key to authorized_keys file
vim ~/.ssh/authorized_keys
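
If you prefer not to open an editor, the key can be appended from the command line. The function below is a sketch, not part of the cluster tooling: `add_authorized_key` is a hypothetical name, it should be run as the researcher, and it sets the conventional 700/600 permissions on the .ssh directory and the authorized_keys file.

```shell
# Hypothetical helper: append a public key to authorized_keys without an editor.
# Run as the researcher. The second argument defaults to ~/.ssh.
add_authorized_key() {
  pubkey="$1"
  ssh_dir="${2:-$HOME/.ssh}"
  auth_file="$ssh_dir/authorized_keys"

  mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir"
  touch "$auth_file" && chmod 600 "$auth_file"

  # Skip the append if this exact line is already present.
  if ! grep -qxF -- "$pubkey" "$auth_file"; then
    printf '%s\n' "$pubkey" >> "$auth_file"
  fi
}
```

Call it with the full key line from the Kanban tile in quotes, for example `add_authorized_key "ssh-ed25519 AAAA... user@host"`, where the key shown is only a placeholder.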

Test

Make sure that the researcher can get jobs allocated:

# become the user
sudo su {researcher_name}
# run this slurm job
srun --gpus=2 --pty /bin/bash
# keep an eye out for errors
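
A non-interactive variant of this test can confirm that the job actually sees both GPUs. This sketch assumes the compute nodes have NVIDIA GPUs with `nvidia-smi` installed; `count_gpus` is a hypothetical helper that counts the `GPU N: ...` lines printed by `nvidia-smi -L`.

```shell
# Hypothetical helper: count the "GPU N: ..." lines that `nvidia-smi -L` prints.
count_gpus() {
  grep -c '^GPU [0-9]'
}
```

As the researcher, `srun --gpus=2 nvidia-smi -L | count_gpus` should then print 2 if the allocation worked.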

Filesystem Quotas

By default, we give each researcher 1TB of space in their home directory. We manage filesystem quotas using Weka's built-in quota feature. Set a researcher's quota with the following command:

sudo weka fs quota set /data/{researcher_name} --hard 1TB --grace 1d
If a researcher's request for more space has been approved, increase their quota by running the same command with the updated amount.
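
To reduce typos when raising a quota, the command can be generated and reviewed before it is run. This is only a sketch: `quota_cmd` is a hypothetical helper that prints the command rather than executing it, and the restriction to whole-number GB or TB sizes is an assumption based on the 1TB form used above.

```shell
# Hypothetical helper: print (not run) the quota command for a researcher and size.
# Assumption: sizes are whole numbers followed by GB or TB, e.g. 2TB.
quota_cmd() {
  researcher="$1"
  size="$2"
  if ! printf '%s' "$size" | grep -Eq '^[1-9][0-9]*(GB|TB)$'; then
    echo "invalid size: $size" >&2
    return 1
  fi
  printf 'sudo weka fs quota set /data/%s --hard %s --grace 1d\n' "$researcher" "$size"
}
```

For example, `quota_cmd {researcher_name} 2TB` prints the command with the hard limit set to 2TB, which can then be copied and run.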

Additional commands

How to change a user's group

Change their group via PAM

First find the new group's number, then move the user to that group.

cluster group list

cluster user change-group --help

# Changes opc to cais group
cluster user change-group opc 10011

Then update Slurm. Remove the user from the old account and re-add them under the new one:

sudo sacctmgr remove user where user=opc and account=test
sudo sacctmgr add user name=opc DefaultAccount=cais
Alternatively, change the default account in place:

sudo sacctmgr modify user where user=opc set defaultaccount=cais

Check the results with

sudo sacctmgr show user where user=opc
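
The check can also be scripted by asking sacctmgr for parsable output and reading the default account from it. This is a sketch under the assumption that the installed Slurm version accepts `-n` (no header), `-P` (pipe-separated output), and `format=User,DefaultAccount`; `default_account_of` is a hypothetical helper.

```shell
# Hypothetical helper: read "user|account" lines on stdin and print the
# default account for the given user.
default_account_of() {
  awk -F'|' -v u="$1" '$1 == u { print $2; exit }'
}
```

For example, `sudo sacctmgr -nP show user where user=opc format=User,DefaultAccount | default_account_of opc` should print cais after the change above.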

Cluster command

For more information about the cluster command:

cluster --help
cluster user --help
cluster group --help