Slurm Source Code Install | Cluster Deployment - Day3 Deploy slurm
Running it!
First of all, I need to test-run slurmctld and slurmd.
All the binaries are installed under usr/local/etc, so I only had to run them. However, they failed with an error that pointed to a failed configuration test. I debugged it for a long time and finally found that it was my own fault for modifying the source code.
I had modified testconfig so that it was initialized to false;
slurmctld then could not initialize successfully because that variable was false.
DO NOT MODIFY THE SOURCE BEFORE YOU HAVE INSTALLED IT SUCCESSFULLY
After that, I could see the following messages, and it really did run successfully.
slurmctld: error: Configured MailProg is invalid
Recover and preserve
slurmctld: Recovered state of 4 nodes
MCS
slurmctld: No parameter for mcs plugin, default values set
Cgroup deployment
I chose not to use cgroups this time, but I really want to try them.
Slurm provides cgroup versions of a number of plugins.
proctrack (process tracking)
task (task management)
jobacct_gather (job accounting statistics)
The cgroup plugins can provide a number of benefits over the other more standard plugins, as described below.
cgroup.conf provides general options that are common to all cgroup plugins, plus additional options that apply only to specific plugins.
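As a rough sketch of how the three plugin families above are selected, the snippet below builds the matching slurm.conf lines. ProctrackType, TaskPlugin, and JobAcctGatherType are the slurm.conf parameters for these families; the helper name cgroup_plugin_lines is made up for illustration, and whether all three should be enabled together depends on the Slurm version and the site.

```python
# Map each plugin family named above to the slurm.conf parameter that selects it.
SLURM_CONF_KEYS = {
    "proctrack": "ProctrackType",
    "task": "TaskPlugin",
    "jobacct_gather": "JobAcctGatherType",
}


def cgroup_plugin_lines(families):
    """Return slurm.conf lines choosing the cgroup variant of each family.

    Hypothetical helper, not part of Slurm.
    """
    return [f"{SLURM_CONF_KEYS[family]}={family}/cgroup" for family in families]


if __name__ == "__main__":
    for line in cgroup_plugin_lines(["proctrack", "task", "jobacct_gather"]):
        print(line)
```

The printed lines would go into slurm.conf, while the options shared by the plugins stay in cgroup.conf.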
When I ran it, I hit this error: cgroup namespace ‘freezer’ not mounted. aborting
I searched for the error online. It looks like this happened because I had not set up cgroup.conf.
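To confirm what the error is complaining about, one option is to check which cgroup v1 controllers are actually mounted. The sketch below parses /proc/mounts-style text; the function name and the list of controller names are my own assumptions, and it only covers cgroup v1 mounts. On older Slurm releases, setting CgroupAutomount=yes in cgroup.conf is a commonly suggested way to have the missing subsystems mounted automatically, but I have not verified it for every version.

```python
def mounted_cgroup_controllers(mounts_text):
    """Return the set of cgroup v1 controllers found in /proc/mounts-style text.

    Hypothetical helper; the controller list below is illustrative, not exhaustive.
    """
    known = {"freezer", "cpuset", "memory", "devices", "cpu", "cpuacct", "blkio"}
    found = set()
    for line in mounts_text.splitlines():
        fields = line.split()
        # Each line reads: device mountpoint fstype options dump pass
        if len(fields) >= 4 and fields[2] == "cgroup":
            found.update(opt for opt in fields[3].split(",") if opt in known)
    return found


if __name__ == "__main__":
    with open("/proc/mounts") as f:
        controllers = mounted_cgroup_controllers(f.read())
    print("freezer mounted:", "freezer" in controllers)
```

If freezer is reported as not mounted, the slurmd error above is expected until the subsystem is mounted or Slurm is configured to mount it.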
cgroup.conf can set options such as:
- AllowedKmemSpace: Constrain the job cgroup kernel memory to this amount of the allocated memory.
- AllowedRAMSpace: Constrain the job cgroup RAM to this percentage of the allocated memory.
- AllowedSwapSpace: Constrain the job cgroup swap space to this percentage of the allocated memory.
- ConstrainCores: Constrain allowed cores to the subset of allocated resources.
- ConstrainDevices: If configured to “yes” then constrain the job’s allowed devices based on GRES allocated resources. It uses the devices subsystem for that. The default value is “no”.
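For illustration, a cgroup.conf using a few of these options could be generated as below. This is only a sketch: render_cgroup_conf is a hypothetical helper, and the values are examples chosen for the demonstration rather than recommendations.

```python
def render_cgroup_conf(options):
    """Render a dict of option -> value as cgroup.conf text.

    Hypothetical helper, not part of Slurm.
    """
    lines = []
    for key, value in options.items():
        # cgroup.conf spells booleans as yes/no; other values are written as-is.
        if isinstance(value, bool):
            value = "yes" if value else "no"
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example values only (assumptions, not tuned for any real cluster).
    example = {
        "ConstrainCores": True,
        "ConstrainDevices": False,
        "AllowedRAMSpace": 100,
        "AllowedSwapSpace": 0,
    }
    print(render_cgroup_conf(example), end="")
```

The resulting text is what would be saved as cgroup.conf next to slurm.conf.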