HetCluster

The HetCluster is a cluster of PC nodes used for small-scale numerical computations in research and education by members of the Theory Group of the Physics Department of the National Technical University of Athens.

You can see current and past scientific and educational activity on the cluster here.

Table of Nodes

Node | CPU (GHz) | Cache (MB) | RAM (GB) | M/B | Other | User | Notes
het1 * | 3.0 (PIV) | 1 | 2 | ASUS P4C800-E | HD250G | | RP: ILB 0910; CA 0411
het2 | 3.0 (PIV) | 1 | 2 | ASUS P4C800-E | HD80G | | RP: ILB 0910; RP: ILB 0607 Fan; CA 0411
het3 | 3.0 (PIV) | 1 | 2 | ASUS P4C800-E | HD80G | | RP: ILB 0910; CA 0411
het4 * | 3.0 (PIV) | 1 | 2 | ASUS P4C800-E | HD80G | | RP: ILB 0910; CA 0411
het5 | 3.0 (PIV) | 1 | 1 | ASRock P4V88 | HD80G | | ILB 0507
het6 | 3.0 (PIV) | 1 | 1 | ASRock P4V88 | HD80G | | RP: ILB 0910; ILB 0507
het7 | 2.4+2.4 (Dual Xeon) | 0.5 | 2 | Intel SE7500CW2 | HD80G+HD80G | | CA 0211
het8 | 3.0 (PIV) | 2 | 0.5 | ASRock 755V88 | HD80G | | RP: ILB 0607 RAM; UP: ILB 0512 M/B+CPU+VGA+RAM; CA 0211
het9 | 2.5 (PIV) | 0.5 | 0.5 | ASRock P4i65G | HD80G | | RP: ILB 0610 M/B+RAM; CA 0211
het10 | 2.5 (PIV) | 0.5 | 0.5 | ASUS P4T533-C | HD80G | | UP: ILB 0512 PS; CA 0211
het11 * | 2.5 (PIV) | 0.5 | 0.5 | ASUS P4T533-C | HD80G | | RP: ILB 0910; CA 0211
het12 | 2.5 (PIV) | 0.5 | 0.5 | ASRock P4i65G | HD80G | | RP: ILB 0610 M/B+RAM+VGA; CA 0211
het13 | 2.5 (PIV) | 0.5 | 0.5 | ASUS P4T533-C (?) | HD80G | | RP: ILB 0605 M/B+VGA; RP: ILB 0610 PS; CA 0211
het14 | 3.0 (PIV) | 2 | 0.5 | ASRock 755V88 | HD80G | | UP: ILB 0512 M/B+CPU+RAM; CA 0211
het15 * | 2.5 (PIV) | 0.5 | 0.5 | ASUS P4T533-C (?) | HD80G | | RP: ILB 0605 M/B; UP: ILB 0512 PS; CA 0211
het16 | 3.0 (PIV) | 2 | 0.5 | ASRock 755V88 | HD80G | | RP: ILB 0910; UP: ILB 0602 M/B+CPU+RAM+VGA; CA 0211
het17 | 3.4 (P D 945) | 2 | 1.0 (DDR2 @400MHz) | Asus P5VD2-VM | HD160G | | ILB 0702
het18 | 3.4 (P D 945) | 2 | 1.0 (DDR2 @533MHz) | Asus P5VD2-VM | HD160G | | ILB 0702
het19 | 3.4 (P D 945) | 2 | 1.0 (DDR2 @533MHz) | Asus P5VD2-VM | HD160G | | ILB 0702
het20 | 3.4 (P D 945) | 2 | 1.0 (DDR2 @533MHz) | Asus P5VD2-VM | HD160G | | ILB 0702
het21 | 3.0 (C2 Duo) | 6 | 4.0 (DDR2 @800MHz) | Asus P5K-VM | HD250G | | ILB 0805
het22 | 3.0 (C2 Duo) | 6 | 4.0 (DDR2 @800MHz) | Asus P5K-VM | HD250G | | ILB 0805
het23 | 3.0 (C2 Duo) | 6 | 4.0 (DDR2 @800MHz) | Asus P5K-VM | HD250G | | ILB 0805
het24 | 3.0 (C2 Duo) | 6 | 4.0 (DDR2 @800MHz) | Asus P5K-VM | HD250G | | ILB 0805
het25 | 3.0 (C2 Duo) | 6 | 4.0 (DDR2 @800MHz) | Asus P5K-VM | HD250G | | ILB 0805
het26 | 3.0 (C2 Duo) | 6 | 4.0 (DDR2 @800MHz) | Asus P5K-VM | HD250G | | ILB 0805
het27 | 3.0 (C2 Duo) | 6 | 4.0 (DDR2 @800MHz) | Asus P5K-VM | HD250G | | ILB 0805
het28 | 3.0 (C2 Duo) | 6 | 4.0 (DDR2 @800MHz) | Asus P5K-VM | HD250G | | ILB 0805
het29 | 3.0 (C2 Duo) | 6 | 4.0 (DDR2 @1872MHz) | Asus P5K-VM | HD250G | | ILB 0805
het30 | 3.0 (C2 Duo) | 6 | 4.0 (DDR2 @1872MHz) | Asus P5K-VM | HD250G | | ILB 0805
het31 | 3.0 (C2 Duo) | 6 | 4.0 (DDR2 @1872MHz) | Asus P5K-VM | HD250G | | ILB 0805
het32 | 3.0 (C2 Duo) | 6 | 4.0 (DDR2 @1872MHz) | Asus P5K-VM | HD250G | | ILB 0805
het33 | 3.0 (C2 Duo) | 6 | 4.0 (DDR2 @1872MHz) | Asus P5K-VM | HD250G | | ILB 0805
het34 | 3.0 (C2 Duo) | 6 | 4.0 (DDR2 @1872MHz) | Asus P5K-VM | HD250G | | ILB 0805
het35 | 3.0 (C2 Duo) | 6 | 4.0 (DDR2 @1872MHz) | Asus P5K-VM | HD250G | | ILB 0805
het36 | 3.07 (i7) | 8 | 12.0 (@1066MHz) | ASUSTeK P6T SE | HD500G+HD500G | farakos | PPS 0910
het37 | 2.33 (Quad) | 2 | 4.0 (DDR2 @800MHz) | Dell 0M858N (Optiplex 760) | HD250 | farakos | PPS 0910
het38 | 2.33 (Quad) | 2 | 4.0 (DDR2 @800MHz) | Dell 0M858N (Optiplex 760) | HD250 | farakos | PPS 0910
Abbreviations:
i7 = Intel Core i7
Quad = Intel Core 2 Quad
C2 Duo = Intel Core 2 Duo
UP = Upgrade, HWP = Hardware Problem, RP = Replace/Repair
M/B = Motherboard, HD = Hard Disk, PS = Power Supply, VGA = Video Card
CA = California Computers, ILB = Infolab, PPS = Papasavvas
Notes:

News

General Information

The HetCluster is a cluster of PCs connected to the network via 10/100 Ethernet. Each node is independent, and the operating system is Linux under the Fedora Core 4 and 6 and Ubuntu Server 8.04 and 9.10 distributions. For information on how to obtain an account and on the available software, contact the administrator. You may log in to a node via ssh at the address node.physics.ntua.gr, where node is the node's name (for example het3.physics.ntua.gr). For the moment there is no server and the filesystems are independent. This may change in the future: a common filesystem may be created on het7 and mounted on all other nodes on /data via NFS. Some observed instabilities of NFS have made us delay this option.
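
For example, assuming an account named myuser (a placeholder), logging in to het3 looks like:
    ssh myuser@het3.physics.ntua.gr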

You may use ssh to submit jobs on remote machines. This has been set up so that no password is asked between het nodes. A prototype script for job submission and remote command execution that checks node occupancy is /usr/local/bin/rtop. Use it to check the load of each machine before you decide to submit your jobs.
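
As a minimal sketch (the directory ~/run and the executable a.out are placeholders), one might inspect a node and then start a job on it without logging in interactively:
    rtop het5
    ssh het5 'cd ~/run && nohup nice -n 19 ./a.out > log 2>&1 < /dev/null &'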

Rules for Using the Cluster

Please observe the following rules and, in case they cannot accommodate your needs, contact the administrator.

Tips for Using the Cluster

Ssh gives great flexibility in automating remote operations on the nodes; you will benefit greatly from learning to use it. Some examples are given below.
  1. Monitor the usage of all nodes: Helpful scripts (available on the nodes) are rtop, rload and ruse. rtop gives a snapshot of the command top on each node, rload reports the load averages, and ruse reports the load averages and the most important jobs. Each command can take as arguments the names of one or more hosts in order to report only on them, e.g.
    rload (reports on all nodes)
    rtop het3 het15 het4 het6
    rload 3 5 7 9 15
    ruse 3 het7 het9 15 het14 3
  2. nice your jobs to the required level using the command /usr/bin/nice, e.g.
    /usr/bin/nice -n 19 a.out >& log &
    /usr/bin/nice -n 8 a.out >& log &
    (a niceness of 19 is the lowest priority; ordinary users may only increase the niceness, up to 19)
  3. Copy files: Use the script rcopy to copy files to the exact same location in the filesystem of all or some of the cluster nodes. Use relative, not absolute, links. E.g.:
    rcopy file1 dir1 (copies the file file1 and the directory dir1 to all nodes - but not to the one you are on now!)
    rcopy -n 2 -n het7 -n 12 file1 file2 (copies the files file1 and file2 to het2, het7 and het12)
  4. Submit Jobs: There are many possibilities. One example is the script rexec, where I submit a job sign.com, passing parameters to it via flags that are set by editing the table at the top of the script. A simpler example is rrun, where by giving the command
    rrun run.het1 run.het2 run.het3
    I submit each job run.hetN on the node hetN. A sketch of such a wrapper is given after this list.
  5. Transfer Data: The GNU utilities tar and find, together with scp and/or ssh, can work wonders:
    scp -r -p het2:data . (copies the directory ~/data and all of its contents from het2 to the current node)
    ssh het2 "tar cfp - data" | tar xfvp - (does the same using the capabilities of tar)
    More complicated tasks can be handled by scripts that keep data directories up to date. An example is the script get-new-data-remote, which is used together with the script get-new-data.
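
For illustration, here is a minimal sketch of what a submission wrapper along the lines of rrun could do. The script name submit-jobs, the job-script naming convention and the niceness value are assumptions made for the sketch; the actual rrun and rexec scripts on the cluster may differ.
    #!/bin/bash
    # submit-jobs: start each job script on the node named by its suffix,
    # e.g. "submit-jobs run.het1 run.het3" starts run.het1 on het1 and run.het3 on het3.
    # Assumes the job scripts already exist at the same path on every node (e.g. copied with rcopy).
    dir=$(pwd)
    for job in "$@"; do
        node=${job##*.}        # the "hetN" suffix of the file name selects the target node
        echo "submitting $job on $node"
        ssh "$node" "cd $dir && nohup nice -n 19 ./$job > $job.log 2>&1 < /dev/null &"
    done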