HP XC System 2.x Software User Manual, Page 84

Example 6-8: Reporting Reasons for Downed, Drained, and Draining Nodes
$ sinfo -R
REASON          NODELIST
Memory errors   dev[0,5]
Not Responding  dev8
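Because the sinfo -R report is plain text, it can be post-processed with standard tools. The following is a minimal sketch, not part of HP XC or SLURM: the function name nodes_by_reason is hypothetical, and it assumes that the node list is always the last whitespace-separated field on each line and that the first line is the header. It prints the node list first, followed by the reason.

```shell
#!/bin/sh
# Hypothetical helper (not an HP XC or SLURM command): reads "sinfo -R" style
# output on standard input and prints "NODELIST<TAB>REASON" for each data line.
# Assumption: the node list is the last field and contains no spaces.
nodes_by_reason() {
    awk 'NR > 1 && NF >= 2 {
        nodes = $NF              # last field is the node list
        $NF = ""                 # drop it; the rest is the reason text
        sub(/[ \t]+$/, "")       # trim the trailing separator left behind
        printf "%s\t%s\n", nodes, $0
    }'
}

# Sample input copied from Example 6-8. On a live system, the output of
# "sinfo -R" would be piped into the function instead.
nodes_by_reason <<'EOF'
REASON          NODELIST
Memory errors   dev[0,5]
Not Responding  dev8
EOF
```

On a system where SLURM is configured, the equivalent call would be sinfo -R | nodes_by_reason.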
6.8 Job Accounting
HP XC System Software provides an extension to SLURM for job accounting. The sacct
command displays job accounting data in a variety of forms for your analysis. Job accounting
data is stored in a log file; the sacct command filters that log file to report on your jobs,
job steps, status, and errors. See your system administrator if job accounting is not configured
on your system.
You can find detailed information on the sacct command and job accounting data in the
sacct(1) manpage.
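As an illustration of a typical query, the sketch below wraps sacct in a small shell function. The function name job_accounting is hypothetical, and the sketch assumes that your version of sacct accepts the -j option to select a job by ID; check sacct(1) on your system before relying on it. If sacct is not installed, the function prints a reminder to contact the system administrator instead of failing silently.

```shell
#!/bin/sh
# Hypothetical wrapper (the name job_accounting is ours, not an HP XC or
# SLURM command). Assumption: "sacct -j JOBID" is supported by the
# installed version of sacct; see sacct(1).
job_accounting() {
    if [ $# -ne 1 ]; then
        echo "usage: job_accounting JOBID" >&2
        return 2
    fi
    # Job accounting may not be installed or configured on every system.
    if ! command -v sacct >/dev/null 2>&1; then
        echo "sacct not found; see your system administrator about job accounting" >&2
        return 1
    fi
    # Report the accounting data recorded for the given job.
    sacct -j "$1"
}
```

For example, job_accounting 1234 would report on job 1234, where 1234 is a placeholder job ID.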
6.9 Fault Tolerance
SLURM can handle a variety of failure modes without terminating workloads, including crashes
of the node running the SLURM controller. User jobs may be configured to continue execution
despite the failure of one or more nodes on which they are executing (refer to Section 6.4.5.1 for
further information). The command controlling a job may detach from and reattach to the parallel
tasks at any time. A node allocated to a job is available for reuse as soon as the job(s) allocated
to that node terminate. If some nodes fail to complete job termination in a timely fashion because
of hardware or software problems, only the scheduling of those tardy nodes will be affected.
6.10 Security
SLURM has a simple security model:
Any user of the system can submit jobs to execute. Any user can cancel his or her own jobs.
Any user can view SLURM configuration and state information.
Only privileged users can modify the SLURM configuration, cancel any job, or
perform other restricted activities. Privileged users in SLURM include root users and
SlurmUser (as defined in the SLURM configuration file).
If permission to modify SLURM configuration is required by others, set-uid programs may
be used to grant specific permissions to specific users.
SLURM accomplishes security by means of communication authentication, job authentication,
and user authorization.
Refer to SLURM documentation for further information about SLURM security features.