Slurm-Day3

Slurm Source Code Install | Cluster Deployment - Day3 Deploy slurm

  • Running it
  • Cgroup Deployment

Running it!

First, I needed to test-run slurmctld and slurmd.

I had all the binaries in usr/local/etc, so I only needed to run them. The first run failed, though, with an error hinting that a configuration test had failed. I debugged it for a long time and finally found that it was my own fault: I had modified the source code.

I had changed testconfig so that it was initialized to false;

with this variable set to false, slurmctld could not initialize successfully.

DO NOT MODIFY THE SOURCE BEFORE YOU HAVE INSTALLED IT SUCCESSFULLY

After that I saw the following output, and slurmctld really did run successfully:

slurmctld: error: Configured MailProg is invalid
slurmctld: slurmctld version 19.05.7 started on cluster cluster

slurmctld: No memory enforcing mechanism configured.

slurmctld: layouts: no layout to initialize
slurmctld: layouts: loading entities/relations information

recover and preserve

slurmctld: Recovered state of 4 nodes
slurmctld: Recovered information about 0 jobs
slurmctld: cons_res: select_p_node_init
slurmctld: cons_res: preparing for 1 partitions
slurmctld: Recovered state of 0 reservations
slurmctld: _preserve_plugins: backup_controller not specified
slurmctld: cons_res: select_p_reconfigure
slurmctld: cons_res: select_p_node_init
slurmctld: cons_res: preparing for 1 partitions
slurmctld: Running as primary controller

MCS

slurmctld: No parameter for mcs plugin, default values set
slurmctld: mcs: MCSParameters = (null). ondemand set.
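
The startup output above mixes one real error (the MailProg line) with many informational messages. As a minimal sketch, assuming the output has been captured as a list of text lines, the errors can be separated from the rest like this. The function name `split_log` and the regular expression are illustrative and are not part of Slurm.

```python
import re

# Matches daemon output such as "slurmctld: error: Configured MailProg is invalid".
LOG_LINE = re.compile(r"^(?P<daemon>slurm\w+): (?P<message>.*)$")


def split_log(lines):
    """Separate daemon log lines into error messages and all other messages."""
    errors, others = [], []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # skip blank lines and anything that is not daemon output
        message = match.group("message")
        if message.startswith("error:"):
            errors.append(message)
        else:
            others.append(message)
    return errors, others


# Lines copied from the output shown above.
sample = [
    "slurmctld: error: Configured MailProg is invalid",
    "slurmctld: slurmctld version 19.05.7 started on cluster cluster",
    "",
    "slurmctld: Running as primary controller",
]
errors, others = split_log(sample)
print(errors)
```

In this run the only error is the MailProg warning, and slurmctld still started as the primary controller.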

Cgroup deployment

I chose not to use cgroup this time, but I really want to try it.

Slurm provides cgroup versions of a number of plugins.

  • proctrack (process tracking)
  • task (task management)
  • jobacct_gather (job accounting statistics)

The cgroup plugins can provide a number of benefits over the other more standard plugins.

cgroup.conf provides general options that are common to all cgroup plugins, plus additional options that apply only to specific plugins.
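
The cgroup versions of these plugins are selected in slurm.conf through ProctrackType, TaskPlugin and JobAcctGatherType. The following is a rough sketch, assuming the contents of slurm.conf are available as a string, that reports which of the three are set to their cgroup variant. The helper names `parse_conf` and `cgroup_plugins_enabled` are made up for this example, and the sample configuration is illustrative rather than taken from my cluster.

```python
# slurm.conf parameter -> value that selects the cgroup version of the plugin.
CGROUP_PLUGINS = {
    "ProctrackType": "proctrack/cgroup",
    "TaskPlugin": "task/cgroup",
    "JobAcctGatherType": "jobacct_gather/cgroup",
}


def parse_conf(text):
    """Parse 'Key=Value' lines, ignoring comments and blank lines."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip().lower()] = value.strip()  # keys compared case-insensitively
    return settings


def cgroup_plugins_enabled(conf_text):
    """Return {parameter: True/False} for the three cgroup-capable plugins."""
    settings = parse_conf(conf_text)
    return {
        key: expected in settings.get(key.lower(), "").split(",")
        for key, expected in CGROUP_PLUGINS.items()
    }


# Illustrative slurm.conf fragment (an assumption, not my real configuration).
example = """
ProctrackType=proctrack/cgroup
TaskPlugin=task/affinity,task/cgroup   # more than one task plugin may be listed
JobAcctGatherType=jobacct_gather/linux
"""
status = cgroup_plugins_enabled(example)
print(status)
```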

When I ran it, I hit this error: cgroup namespace ‘freezer’ not mounted. aborting

I searched for the error online. It looks like this happened because I had not set up cgroup.conf.
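
One way to confirm what the error message says is to check whether a cgroup v1 hierarchy with the freezer controller is actually mounted. Below is a small sketch, assuming the text of /proc/mounts is passed in as a string. The function name `cgroup_v1_mount_point` and the sample mount table are illustrative. If the controller is missing, a commonly suggested setting is `CgroupAutomount=yes` in cgroup.conf, though I have not verified that fix on this cluster.

```python
def cgroup_v1_mount_point(mounts_text, controller):
    """Return the mount point of a cgroup v1 controller, or None if not mounted.

    Each /proc/mounts line has the fields:
    device, mount point, filesystem type, options, dump, pass.
    """
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] != "cgroup":
            continue  # not a cgroup v1 mount
        if controller in fields[3].split(","):
            return fields[1]
    return None


# Illustrative mount table; on a real node use open("/proc/mounts").read().
sample_mounts = """\
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
cgroup /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0
cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0
"""
print(cgroup_v1_mount_point(sample_mounts, "freezer"))
print(cgroup_v1_mount_point(sample_mounts, "memory"))
```

With this sample table the freezer lookup finds nothing, which is the situation the error describes.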

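Options that cgroup.conf can set include:

  • AllowedKmemSpace: constrain the job cgroup kernel memory to this amount of the allocated memory.
  • AllowedRAMSpace: constrain the job cgroup RAM to this percentage of the allocated memory.
  • AllowedSwapSpace: constrain the job cgroup swap space to this percentage of the allocated memory.
  • ConstrainCores: if configured to “yes”, constrain allowed cores to the subset of allocated resources.
  • ConstrainDevices: if configured to “yes”, constrain the job’s allowed devices based on GRES allocated resources. It uses the devices subsystem for that. The default value is “no”.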
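
To make the options above concrete, here is a minimal sketch that renders a cgroup.conf from a dictionary. The function name `render_cgroup_conf` is made up for this example, and the values are illustrative assumptions, not settings I have tested. `CgroupAutomount` is not in the list above and is included only as an assumption related to the freezer error.

```python
def render_cgroup_conf(options):
    """Render a dict of cgroup.conf options as 'Key=Value' lines."""
    lines = []
    for key, value in options.items():
        if isinstance(value, bool):
            value = "yes" if value else "no"  # yes/no is the spelling used in cgroup.conf
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


# Illustrative values only; adjust them for the real cluster.
example_options = {
    "CgroupAutomount": True,    # assumption: let Slurm mount missing cgroup subsystems
    "ConstrainCores": True,     # restrict jobs to their allocated cores
    "ConstrainDevices": False,  # "no" is the documented default
    "AllowedRAMSpace": 100,     # percentage of the allocated memory
    "AllowedSwapSpace": 0,      # percentage of the allocated memory
}
conf_text = render_cgroup_conf(example_options)
print(conf_text)
```

The resulting text could then be written to the cgroup.conf file next to slurm.conf.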