Distributed Computing and Long-Running Jobs at CS

The Department of Computer Science has a number of machines that may be used for distributed computing, provided it is done responsibly.

Depending on your account class, you may be able to run jobs on machines in the Ryerson or CSPP labs, and possibly on other desktops around the department. These are shared machines, and someone may be sitting at the console, so you must make sure your processes do not consume so many resources that the machine becomes unusable for others.

If distributed processes cause problems for regular users, we reserve the right to kill the processes.

You can find machines you have access to here.

Compute Responsibly

Use nice!

  • nice runs a program with an adjusted scheduling priority.
    • Niceness values range from 19 (lowest priority) to -20 (highest priority).

  • Any program that runs for more than a few minutes should use nice.
    • nice -n 19 myProgram

  • See: man nice and man renice for more details.
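
As an illustration of the advice above, here is a minimal sketch of a wrapper that starts a command at the lowest priority. The function name run_low is hypothetical, and myProgram and the PID 12345 are placeholders. The check on the second-to-last step assumes GNU coreutils (or BusyBox), where nice with no arguments prints the current niceness.

```shell
#!/bin/sh
# run_low: hypothetical helper that starts any command at niceness 19.
run_low() {
    nice -n 19 "$@"
}

# Assumption: GNU coreutils or BusyBox, where "nice" with no arguments
# prints the current niceness. Running it through run_low therefore shows
# the niceness that a job started the same way would receive.
run_low nice

# Typical use (myProgram is a placeholder for your own executable):
#   run_low ./myProgram &
#
# To lower the priority of a job that is already running
# (12345 is a placeholder PID):
#   renice -n 19 -p 12345
```

Wrapping the call in a function makes it harder to forget nice when launching many jobs from a script.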

Monitor your jobs

  • Use tools such as top, ps, and ping to monitor your long-running jobs.
  • See their respective man pages for details, or email techstaff@cs.uchicago.edu for advice.
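
One possible way to check on your own jobs with ps is sketched below. The helper name my_jobs is hypothetical; the column names (pid, nice, pcpu, vsz, etime, comm) are the standard POSIX ps format specifiers.

```shell
#!/bin/sh
# my_jobs: hypothetical helper that lists every process owned by the
# current user, with its niceness, CPU share, virtual memory size,
# elapsed run time, and command name.
my_jobs() {
    ps -u "$(id -u)" -o pid,nice,pcpu,vsz,etime,comm
}

my_jobs
```

For a continuously updating view, top can usually be restricted to one user in a similar way, for example top -u "$(id -un)" on systems that ship the procps version of top.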

Be ready to kill your jobs

  • Distributed jobs can run amok.
  • When setting up your code, know how to kill all your processes effectively.
  • Feel free to email techstaff@cs.uchicago.edu for advice.
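
The following is a sketch of one way to keep track of your processes so that all of them can be stopped at once. It is an assumption about how you might structure a launcher, not a department-provided tool: the helper name stop_jobs is hypothetical, and sleep 60 stands in for a real worker.

```shell
#!/bin/sh
# Record the PID of every worker at launch so all of them can be stopped later.
pidfile=$(mktemp)

# "sleep 60" is a placeholder for a real worker process.
for i in 1 2 3; do
    nice -n 19 sleep 60 &
    echo $! >> "$pidfile"
done

# stop_jobs: hypothetical helper. Sends SIGTERM first so workers can exit
# cleanly, waits briefly, then sends SIGKILL to any PID that still exists.
stop_jobs() {
    for pid in $(cat "$1"); do
        kill -TERM "$pid" 2>/dev/null
    done
    sleep 2
    for pid in $(cat "$1"); do
        if kill -0 "$pid" 2>/dev/null; then
            kill -KILL "$pid" 2>/dev/null
        fi
    done
}

stop_jobs "$pidfile"

# Reap the terminated background jobs so none are left as zombies.
wait
```

The same idea extends to jobs on other machines by running the stop step over ssh, for example ssh somehost pkill -u "$USER" -x myProgram, where somehost and myProgram are placeholders.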

Talk to techstaff

  • Please email techstaff@cs.uchicago.edu if you think you may be putting a large load on the system, or if you would like help or advice with running or monitoring long-running jobs.

  • It is easy to "lose control" of distributed jobs.
    • Don't just walk away! Contact techstaff for help.

Appropriate Use Policy

CS computers are intended primarily to support Computer Science department academic activities and departmental research. Shared resources, such as disk space, CPU, memory, open files, and processes, must not be monopolised by any one person or project, and priority will be given to departmental academic and research activities.