HP XC System 2.x Software User's Guide    Page 87

To illustrate how the external scheduler is used to launch an application, consider the following
command line, which launches an application on ten nodes with one task per node:
$ bsub -n 10 -ext "SLURM[nodes=10]" srun my_app
The following command line launches the same application, also on ten nodes, but stipulates
that node n16 should not be used:
$ bsub -n 10 -ext "SLURM[nodes=10;exclude=n16]" srun my_app
7.1.3 Notes on LSF-HPC
The following are noteworthy items for users of LSF-HPC on HP XC systems:
You must run jobs as a non-root user such as lsfadmin or any other local user; do not
run jobs as the root user.
A SLURM partition named lsf is used to manage LSF jobs. You can view information
about this partition with the sinfo command.
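As a minimal sketch of the point above (the partition name lsf comes from the text; the guard simply keeps the snippet harmless on machines without SLURM installed):

```shell
# Show only the SLURM partition that LSF-HPC manages; sinfo's -p flag
# restricts output to the named partition.
if command -v sinfo >/dev/null 2>&1; then
    sinfo -p lsf
fi
```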
LSF daemons run on only one node in the HP XC system. As a result, the lshosts
and bhosts commands list only one host that represents all the resources of the HP XC
system. The total number of CPUs for that host should be equal to the total number of CPUs
found in the nodes assigned to the SLURM lsf partition.
When a job is submitted and the resources are available, LSF-HPC creates a properly sized
SLURM allocation and adds several standard LSF environment variables to the environment
in which the job is to be run. The following two environment variables are also added:
SLURM_JOBID
This environment variable is created so that subsequent srun
commands make use of the SLURM allocation created by
LSF-HPC for the job. This variable can be used by a job script to
query information about the SLURM allocation, as shown here:
$ squeue --jobs $SLURM_JOBID
SLURM_NPROCS
This environment variable passes along the total number of
tasks requested with the bsub -n command to all subsequent
srun commands. User scripts can override this value with the
srun -n command, but the new value must be less than or
equal to the original number of requested tasks.
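The two variables above might be used together in a job script along these lines (a sketch only: the step name, task count, and submission command are hypothetical; the comparison mirrors the rule that srun -n must not exceed the original request):

```shell
#!/bin/sh
# Hypothetical job script, submitted with: bsub -n 8 ./my_job.sh
# LSF-HPC sets SLURM_JOBID and SLURM_NPROCS in the job's environment.

if command -v squeue >/dev/null 2>&1; then
    # Query the allocation that LSF-HPC created for this job.
    squeue --jobs "$SLURM_JOBID"
fi

# A job step may use fewer tasks than requested, but never more than
# SLURM_NPROCS (the bsub -n value).
STEP_TASKS=4
if [ "$STEP_TASKS" -le "${SLURM_NPROCS:-0}" ] && command -v srun >/dev/null 2>&1; then
    srun -n "$STEP_TASKS" ./setup_step    # hypothetical helper binary
fi
```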
LSF-HPC dispatches all jobs locally. The default installation of LSF-HPC for SLURM
on the HP XC system provides a job starter script that is configured for use by all
LSF-HPC queues. This job starter script adjusts the LSB_HOSTS and LSB_MCPU_HOSTS
environment variables to the correct resource values in the allocation. Then, the job starter
script uses the srun command to launch the user task on the first node in the allocation.
If this job starter script is not configured for a queue, user jobs begin execution locally
on the LSF-HPC execution host. In this case, it is recommended that the user job use one
or more srun commands to make use of the resources allocated to the job. Work done
on the LSF-HPC execution host competes for CPU time with the LSF-HPC daemons, and
could affect the overall performance of LSF-HPC on the HP XC system.
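As a sketch of that recommendation (the wrapper and application names are hypothetical), a job submitted to a queue without a job starter script can still reach its allocation by calling srun itself:

```shell
#!/bin/sh
# run_in_allocation.sh -- hypothetical wrapper, submitted with:
#   bsub -n 4 -ext "SLURM[nodes=4]" ./run_in_allocation.sh
# Without a job starter script this wrapper starts on the LSF-HPC execution
# host; the srun call moves the real work onto the allocated nodes.
if command -v srun >/dev/null 2>&1; then
    srun ./my_app    # hypothetical application; runs inside the allocation
else
    echo "srun not found; not inside a SLURM environment?" >&2
fi
```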
The bqueues -l command displays the full queue configuration, including whether or
not a job starter script has been configured. See the Platform LSF documentation or the
bqueues(1) manpage for more information on the use of this command.
For example, consider an LSF-HPC configuration in which node n20 is the LSF-HPC
execution host and nodes n[1-10] are in the SLURM lsf partition. The default normal
Using LSF 7-3