Changes Archive

Intel Compiler: '-nofor-main' flag removed from the $FC and $F77 variables
Paul Kapinos posted on Feb 19, 2014
From 20.02.2014 on, the flag '-nofor-main' will no longer be set in the $FC and $F77 variables (for the first time since 2007-09-25), because 'cmake' cannot handle it. If you run into problems compiling or linking your software, please let us know!
Status RWTH Compute Cluster, 2014-02-14
Frank Robel posted on Feb 14, 2014
Status RWTH Compute Cluster, 2014-02-14 The wrapper r_memusage replaces memusage: the old memusage wrapper could only show the virtual memory peak. The new command r_memusage can also show the amount of memory actually used and includes all child processes. The Harpertown nodes have been removed from the group of MPI backends.
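A minimal usage sketch, assuming r_memusage is invoked like the old memusage wrapper by prefixing it to the command to be measured (the program name is only an example):
$ r_memusage ./a.out    # reports the peak of really used memory, including child processes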
Status RWTH Compute Cluster, 2014-02-06
Frank Robel posted on Feb 07, 2014
Status RWTH Compute Cluster, 2014-02-06 A faulty update package led to disruptions in the graphical dialog systems. The misbehavior of one user resulted in multiple failures of cluster and cluster-linux; to avoid this kind of failure, please use $TMP and not /tmp to store temporary files. Due to the failure of Lustre ($HPCWORK), some files may be corrupted. The "Center for Computing and Communication" is now called "IT Center".
openmpi/1.7.4
Paul Kapinos posted on Feb 06, 2014
A new version of Open MPI, 1.7.4 (without multithreading, and with multithreading as 1.7.4mt), will be available in the HPC cluster from 07.02.2014 on. http://www.open-mpi.org/community/lists/announce/2014/02/0059.php At the same time, version 1.7.3 is moved to DEPRECATED.
gcc/4.8
Paul Kapinos posted on Feb 04, 2014
A new version of the GCC compilers, 4.8.2, has been installed on the cluster and is available from 05.02.2014 on via $ module switch intel gcc/4.8 The previous gcc/4.8 version (4.8.1) was moved to DEPRECATED.
Status RWTH Compute Cluster, 2014-01-30
Frank Robel posted on Jan 30, 2014
Status RWTH Compute Cluster, 2014-01-30 Our Lustre file system $HPCWORK has failed, possibly as a result of incorrect use. If you want to use $HPCWORK in batch mode, you need to request it; otherwise, your job will be rejected. Add the following line to your batch script if you want to use $HPCWORK: #BSUB -R "select[hpcwork_fast]"
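A minimal batch script sketch using this option (job name, wall time, and working directory are illustrative assumptions):
#!/usr/bin/env zsh
#BSUB -J hpcwork_example          # job name (example)
#BSUB -W 1:00                     # wall-clock limit (example)
#BSUB -R "select[hpcwork_fast]"   # request a node with access to $HPCWORK
cd $HPCWORK/mycase                # hypothetical working directory
./a.out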
Status RWTH Compute Cluster, 2014-01-17
Frank Robel posted on Jan 17, 2014
Status RWTH Compute Cluster, 2014-01-17 Partial failure of the cluster gateway: on Friday 2014-01-10 between 13:48 and 14:01 there was a partial failure of the cluster gateway. Only cluster-internal communication was possible; the frontends could not be reached. Partial failure of q_cpuquota: between 2013-12-18 12:45 and 2014-01-13 07:30 q_cpuquota delivered incorrect values. Java JDK has been updated to version 1.7.0_51.
FDS (Fire Dynamic Simulator) Installation
Paul Kapinos posted on Jan 15, 2014
The installation of FDS (Fire Dynamic Simulator, http://code.google.com/p/fds-smv/wiki/FDS_Release_Notes) in the cluster has been renewed: versions 5.3 (5.3.1), 5.5 (5.5.3, built with older compilers), 6.0.1 (SVN:17598), and 6.0alpha (SVN:16086) were moved to DEPRECATED; version 6.0.2 (updated to SVN:18014) was built with the current Intel compiler 14.0, installed as *6.0*, and made the default FDS version.…
Status RWTH Compute Cluster, 2014-01-10
Frank Robel posted on Jan 10, 2014
Status RWTH Compute Cluster, 2014-01-10 The Intel license server was down on 2014-01-06 from 13:00 to 17:00. The frontend cluster-linux-opteron was shut down as announced.
Status RWTH Compute Cluster, 2013-12-19
Paul Kapinos posted on Dec 23, 2013
Status RWTH Compute Cluster, 2013-12-19 In the week of 12-19.12.2013 there were some problems with the LDAP servers; thus the 'member' function was not usable from time to time. The JARA accounting tool 'q_cpuquota' did not work properly, see the fault reports ("Störungsmeldungen"): http://maintenance.rz.rwth-aachen.de/list.php?id=14 As of the end of 2013 the frontend cluster-linux-opteron will go offline. R.I.P., good old Opteron hardware! Merry HPC-mas!
Status RWTH Compute Cluster, 2013-12-12
Frank Robel posted on Dec 13, 2013
Status RWTH Compute Cluster, 2013-12-12 At present we sporadically see unusual job aborts. The problem is being investigated.
Status RWTH Compute Cluster, 2013-12-06
Frank Robel posted on Dec 13, 2013
Status RWTH Compute Cluster, 2013-12-06 Problems with memory cgroups on BCS: some BCS jobs crashed without any error message.
ABAQUS: known incompatibility with the new Intel compiler version
Paul Kapinos posted on Dec 03, 2013
The ABAQUS software can use user subroutines written in Fortran. For this it uses the available Fortran compiler, which in our environment is Intel's 'ifort', cf. $ abaqus info=system During the maintenance on 26.11.2013 the Intel compiler version was changed from 13.1 to 14.0. With the current installation this leads to errors such as >
/rwthfs/rz/SW/ABAQUS/6.13-1/code/bin/standard: symbol lookup error:
/tmp/xx000000/linux000_14578/xx000000_m01-v01_2299/libstandardU.…
FDS (Fire Dynamic Simulator) - version 6.0.1 installed
Paul Kapinos posted on Dec 03, 2013
Version 6.0.1 (SVN 17598) installed. Optimization -O2 instead of -O3 because of a bug in Intel compiler 14.0. $ module load TECHNICS $ module load fds/6.0.1
Status RWTH Compute Cluster, 2013-11-29
Frank Robel posted on Nov 29, 2013
Status RWTH Compute Cluster, 2013-11-29 Maintenance on Nov. 26th successfully completed Default modules for Intel compiler and OpenMPI changed The frontend cluster-linux-opteron will be shut down at the end of the year X-Win32 / NX Client Maintenance on Nov. 26th successfully completed: LSF 8.x was replaced by LSF 9.1.1.1. From now on, memory usage is monitored; processes using more than the requested memory will be terminated.…
Old modules moved to DEPRECATED
Paul Kapinos posted on Nov 27, 2013
The following modules have been moved to the DEPRECATED area: DEVELOP/openmpi/1.6.1 DEVELOP/openmpi/1.6.1mt DEVELOP/openmpi/1.6.4 DEVELOP/openmpi/1.6.4mt
DEVELOP/openmpi/1.6.4knem DEVELOP/studio/12.2 LIBRARIES/nag/c_mark8 LIBRARIES/nag/c_mark8_ilp64
LIBRARIES/nag/c_mark9 LIBRARIES/nag/c_mark9_ilp64 LIBRARIES/nag/fortran_mark21 LIBRARIES/nag/fortran_mark22
LIBRARIES/nag/smp_mark21 LIBRARIES/nag/smp_mark22
Status RWTH Compute Cluster, 2013-11-21
Frank Robel posted on Nov 22, 2013
Status RWTH Compute Cluster, 2013-11-21 Maintenance 26.11., 7:30 – 16:30 Jobs with longer runtimes will not be started before the maintenance. Deprecation of 32 bit support On November 26th we will have a big maintenance during which we have to shut down the complete cluster including the frontends (7:30 - 16:30). Work on the following components will be done: power supplies, air conditioning, fire alarm system, major update of LSF (v 9.1.1.…
Gromacs 4.6.4 installed
Paul Kapinos posted on Nov 19, 2013
Version 4.6.4 of Gromacs http://www.gromacs.org was installed in the RZ cluster in three variants: gromacs/4.6.4 (Intel compiler, standard precision), gromacs/4.6.4dp (Intel compiler, double precision, cf. http://www.gromacs.org/Documentation/Terminology/Precision ), gromacs/4.6.4cuda (gcc/4.8 compiler, standard precision, uses GPGPU - only usable on the GPU cluster). Version gromacs/4.6.4 is now the new default version. The older version 4.5.…
TotalView: new BETA version 8.13
Paul Kapinos posted on Nov 18, 2013
A new BETA version of the TotalView debugger is available in the cluster from 19.11.2013 on: $ module load totalview/8.13b loads the 8T.13.0-0 version. This version uses its own license file, which is valid until 03.03.2014. Please report any problems and feedback to the ServiceDesk. An excerpt from the official announcement: >We would like testing on all platforms and of all functionality but there are some really cool new features that I want to encourage you to look at - MemoryScape for Xeon Phi.…
Status RWTH Compute Cluster, 2013-11-14
Paul Kapinos posted on Nov 14, 2013
Status RWTH Compute Cluster, 2013-11-14 The maintenance of power supplies (announced for 7.11-15.11) is completed. As
a consequence of power shutdown, a couple of nodes got hardware problems; these nodes are currently under repair.
New Intel Software
Paul Kapinos posted on Nov 14, 2013
New Intel software has been installed: intel/14.0.1.106 (will replace 14.0.0.080 as intel/14.0 within the next days), intelmpi/5.0b (5.0.0.007, a new beta version of Intel MPI with e.g. MPI-3 support, cf. http://software.intel.com/en-us/articles/intel-mpi-library-50-beta-readme), intelitac/9.0b (9.0.0.007, a new beta version).
OpenFOAM version 2.1.1p
Paul Kapinos posted on Nov 12, 2013
A new version of OpenFOAM was installed on 12.11.2013: version 2.1.1 with the patch from 30.06.2012, https://github.com/OpenFOAM/OpenFOAM-2.1.x/commit/7d703d585daf11438fbc4ad3bae01199675e7f78 $ module load TECHNICS $ module load openfoam/2.1.1p
Status RWTH Compute Cluster, 2013-11-07
Frank Robel posted on Nov 07, 2013
Status RWTH Compute Cluster, 2013-11-07 Maintenance of power supplies 7.11.-15.11 Secondary accounts From 7.11. to 15.11. we have to maintain the power supplies of each rack individually. As a consequence, waiting times may increase slightly during this period. All frontends will be available. An automatic creation of "Hochleistungsrechnen RWTH Aachen" secondary accounts is no longer possible.
Status RWTH Compute Cluster, 2013-10-31
Tim Cramer posted on Oct 31, 2013
Status RWTH Compute Cluster, 2013-10-31 Maintenance of power supplies 7.11.-15.11. Maintenance 26.11., 7:30 – 16:30 Deprecation of 32 bit support Cgroups issue on BCS X-Win32 issues From 7.11. to 15.11. we have to maintain the power supplies of each rack individually. As a consequence, waiting times may increase slightly during this period. All frontends will be available.…
Intel compiler version update
Paul Kapinos posted on Oct 29, 2013
The version of the default-loaded Intel compiler (13.1) was updated from 13.1.1.163 (= "Composer XE2013.3.163") to 13.1.3.192 (= "Composer XE2013.5.192"). Version 13.1.1.163 was moved to DEPRECATED. This should fix the following bug: http://software.intel.com/en-us/articles/svd-multithreading-bug-in-mkl
Status RWTH Compute Cluster, 2013-10-25
Frank Robel posted on Oct 25, 2013
Status RWTH Compute Cluster, 2013-10-25 Announcement of Kepler GPUs New software stack on GPU cluster Discontinuance of the Windows GPU machine On Wednesday, we announced four NVIDIA Kepler GPUs on the rzcluster mailing list. Access will be granted by an e-mail to [email protected]. We updated the software stack on the whole GPU cluster (57 Fermi GPUs, 4 Kepler GPUs): new graphics driver 319.49, new CUDA toolkit 5.5, new kernel 2.6.32-358.23.2.el6.…
Status RWTH Compute Cluster, 2013-10-18
Tim Cramer posted on Oct 18, 2013
Status RWTH Compute Cluster, 2013-10-18 1. ScaleMP downtime 2. X-Win32 / NX Client 1. The ScaleMP machine will be down as of Tuesday, October 22nd, due to a reconfiguration. Please note that the downtime might last longer than usual. 2. As already announced, Starnet's X-Win32 software will be the new standard remote desktop software for accessing the RWTH Compute Cluster. It will replace the NX Client for connecting to the frontends.…
The $OMP_THREAD_LIMIT environment variable will
not be set
Paul Kapinos posted on Oct 10, 2013
On 06.08. we changed the environment to set the $OMP_THREAD_LIMIT environment variable, see The $OMP_THREAD_LIMIT environment variable. Due to several issues with the Intel compiler OpenMP runtime, we have now decided to revert this change and not to set the $OMP_THREAD_LIMIT environment variable from tomorrow on. You have to re-login once for this change to take effect.
Status RWTH Compute Cluster, 2013-09-26
Frank Robel posted on Sep 26, 2013
Maintenance on Sep. 24th successfully completed We apologize for delays. Some LSF jobs were aborted and restarted after the maintenance. For further information: https://maintenance.rz.rwth-aachen.de/list.php?id=14 Oracle Java 1.6 is switched off, please use Oracle Java 1.7
Status RWTH Compute Cluster, 2013-09-19
Tim Cramer posted on Sep 19, 2013
Maintenance on Sep. 24th, 11:00-18:00 As already announced on the rzcluster mailing list, we scheduled a maintenance downtime for the whole RWTH compute cluster on Sep. 24th, 11:00-18:00. Both parts, Linux and Windows, are affected and unavailable for the denoted period. Planned works: - Software upgrade of NetApp filers - Firmware upgrade of the HPCWORK storage systems - Installation of service pack Feb 2013 for LSF 8.0.1 - Removal of Oracle JDK 1.6
Intel MPI - Startup Issue on Big Jobs in LSF fixed
Marcus Wagner posted on Sep 12, 2013
The issue with starting big jobs (more than some 32 nodes) using Intel MPI in the LSF batch system is now fixed. As usual, please use this line to start your jobs: $MPIEXEC $FLAGS_MPI_BATCH a.out As the fix did not work for older versions of Intel MPI, the version intelmpi/4.0 was moved to the DEPRECATED area.
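A minimal LSF job sketch around this start line; the process count and the module switch are illustrative assumptions, not a prescribed setup:
#!/usr/bin/env zsh
#BSUB -n 64                       # number of MPI processes (example)
module switch openmpi intelmpi    # assumption: switch from the default MPI to Intel MPI
$MPIEXEC $FLAGS_MPI_BATCH a.out   # the start line recommended above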
Status RWTH Compute Cluster, 2013-09-06
Tim Cramer posted on Sep 06, 2013
Status RWTH Compute Cluster, 2013-09-06 Java 1.6 is not supported anymore X-Win32 supersedes the NoMachine NX client Defective power supply in MPI complex Due to security bugs which will not be fixed by Oracle, Java 1.6 is not supported anymore. Due to several issues with the NoMachine NX client, the support for this software package will be stopped by the end of 2013. The replacement for live sessions will be X-Win32. Please refer to Interactive Mode for more information.…
New Primer release 8.2.6
Paul Kapinos posted on Aug 15, 2013
A new edition of the Primer, version 8.2.6 of 15 August 2013, is now available at http://www.rz.rwth-aachen.de/hpc/primer. Most important changes: • As some older nodes reached their end-of-life (EOL), the chapters – 2.4 The older Xeon based Machines – 2.5 IBM eServer LS42 have been removed • As the idb debugger is deprecated by Intel, chapter – 7.3.3 Intel idb (Lin) has been removed • As the Intel Thread Checker and Profiler tools are superseded by the Intel Inspector and VTune tools,…
The interactive $MPIEXEC wrapper updated
Paul Kapinos posted on Aug 13, 2013
The interactive $MPIEXEC wrapper ('mpiexec', 'mpirun' when you are logged in interactively on the HPC cluster) has been updated to the new version 2.6. (See chapter 6.2.1 of the Primer: http://www.rz.rwth-aachen.de/hpc/primer ) Please report any problems you encounter.
Status RWTH Compute Cluster, 2013-08-09
Tim Cramer posted on Aug 09, 2013
Status RWTH Compute Cluster, 2013-08-09 1. New PuTTY version 2. OMP_THREAD_LIMIT errors The ssh client PuTTY 0.63 fixes several security bugs. If you use an older version to log in to the cluster, you should upgrade the software. Refer here for more information. As announced last week, we set the maximum OpenMP thread limit (environment variable OMP_THREAD_LIMIT) to 2x the number of logical cores. Unfortunately, this causes errors for some applications (e.g., Gaussian).…
TotalView - new version 8.12 installed and set to default
Paul Kapinos posted on Aug 08, 2013
A new version 8.12 of the TotalView debugger (http://www.roguewave.com/products/totalview.aspx) will be available from 06.08.2013 on. This version will be the new default version of TotalView. The version 8.12b was moved to DEPRECATED (=> 8T.12.0-1). The version 8.9.2 was moved to DEPRECATED (=> 8.9.2-2).
The $OMP_THREAD_LIMIT environment variable
Paul Kapinos posted on Aug 06, 2013
From 06.08.2013 on, the environment variable $OMP_THREAD_LIMIT will be set to twice the number of logical cores (the number of 'CPUs' the operating system believes it has available) in the HPC-Cluster standard environment. This will limit the number of threads of an OpenMP program and avoid extreme overloading of the nodes.
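For illustration only, the new default behaves roughly as if something like the following had been set at login (values are examples; the actual mechanism may differ):
$ export OMP_THREAD_LIMIT=$(( 2 * $(nproc) ))   # twice the number of logical cores
$ echo $OMP_THREAD_LIMIT                        # e.g. 48 on a node with 24 logical cores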
Intel Threading Tools (ITT) DEPRECATED
Paul Kapinos posted on Aug 06, 2013
The old Intel Threading Tools (ITT) module 'intelitt' was moved to DEPRECATED area. The "Thread Checker" functionality
(and more) is now contained in the Intel Inspector product (module: 'intelixe'); The "Thread Profiler" functionality (and more) is
now contained in the Intel VTune/Amplifier product (module: 'intelvtune').
Intel MPI stability problems after the OFED update Fixed
Paul Kapinos posted on Aug 01, 2013
The stability problems with Intel MPI reported in https://wiki2.rz.rwth-aachen.de/display/bedoku/2013/06/27/Intel+MPI+problems+in+the+cluster should be resolved now. The workaround described in the linked page (disabling DAPL) is no longer needed. Side notes: 1. when using older Intel MPI versions than the current default 4.1, these warnings: linuxscc004.rz.RWTH-Aachen.DE:7ea7:ad202700: 2189 us(2189 us): open_hca: device mlx4_0 not found linuxscc004.rz.RWTH-Aachen.…
Status RWTH Compute Cluster, 2013-07-22
Tim Cramer posted on Jul 22, 2013
Status RWTH Compute Cluster, 2013-07-22 1. BCS firmware update 2. Lustre (HPCWORK) malfunction 3. Request to set your password 4. KNEM kernel module Due to a firmware update on all BCS systems, the waiting times for these systems might be longer than usual. After a malfunction (refer to http://maintenance.rz.rwth-aachen.de/list.php?id=14) the Lustre (HPCWORK) file system is fully operational again.…
Python installation in the HPC Cluster
Paul Kapinos posted on Jul 17, 2013
The Python installation in the HPC Cluster was rebuilt during 15-17.07.2013. 1. The older versions 2.5.6, 2.6.8, 2.7.3, 3.3.0 were moved to DEPRECATED. 2. The new versions 2.7.5 and 3.3.2 were installed. Version 2.7.5 is now the default python version in modules. (Note that the Linux-default Python is 2.6.6.) The new versions support the NumPy, SciPy, and Matplotlib modules. These versions are configured with the '--enable-unicode=ucs4 --enable-ipv6' flags, like the Linux-default Python.…
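A quick sanity check of the new default installation might look like this ('python' as the module name is an assumption based on the text above):
$ module load python                     # assumed to load the default 2.7.5
$ python -c 'import numpy, scipy, matplotlib; print numpy.__version__'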
Status RWTH Compute Cluster, 2013-07-16
Tim Cramer posted on Jul 16, 2013
Status RWTH Compute Cluster, 2013-07-16 1. ulimits for batch jobs 2. Intel Xeon Phi cluster 3. New frontend: cluster-copy2 In the future we will set the ulimits in batch jobs to the cluster-wide default of a default account (i.e., the limits you see with ulimit -a for a default, unmodified account). If you have special requirements concerning the stack size or the core file size, you can change the limits by setting the corresponding LSF option in MB (e.g., #BSUB -C 400 or #BSUB -S 1000).…
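For example, a job that needs larger core file and stack limits could request them like this (values are illustrative, units in MB per the note above) and verify them inside the job:
#BSUB -C 400      # core file size limit
#BSUB -S 1000     # stack size limit
ulimit -a         # print the effective limits in the job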
Intel Trace Analyzer and Collector (ITAC) - new version
installed
Paul Kapinos posted on Jul 15, 2013
New version 8.1(.1.027) of the Intel Trace Analyzer and Collector (ITAC) was installed, available as 'intelitac'. The older versions 7.2 and 8.0 were moved to DEPRECATED.
Intel Advisor XE - new version installed
Paul Kapinos posted on Jul 15, 2013
New version '2013 XE update 3' of Intel Advisor was installed and is now the default version of the 'intelaxe' module. The older version 'update 1' was moved to DEPRECATED.
Intel Inspector XE - new version installed
Paul Kapinos posted on Jul 15, 2013
The 'intelixe' software (Intel Inspector 2013 XE) is now installed in the new version 'Update 6'. This version is also set as the default. Older versions were moved to the DEPRECATED area.
GCC compiler - new version installed
Paul Kapinos posted on Jul 11, 2013
GCC compilers version 4.8.1 are now installed in the HPC-Cluster and available as 'gcc/4.8': $ module switch intel gcc/4.8 The previous 4.8 version, 4.8.0, has now moved to DEPRECATED.
Intel MPI problems in the cluster
Paul Kapinos posted on Jun 27, 2013
Currently, running Intel MPI jobs in the cluster can be disturbed; you may get errors such as:
linuxbmc1226.rz.RWTH-Aachen.DE:5cae:490e4700: 1053229598 us(1053229598 us!!!): dapl_cma_active: CONN_ERR
event=0x7 status=-110 TIMEOUT DST 134.61.205.38, 30720 rank 51 in job 1 linuxbmc1226.rz.RWTH-Aachen.DE_48603
caused collective abort of all ranks exit status of rank 51: return code 1 Especially bigger jobs (more than some 5 nodes / 60
processes) are affected.…
Intel MPI 4.1 update
Paul Kapinos posted on Jun 27, 2013
Intel MPI 4.1 update: the default intelmpi/4.1 was updated from 4.1.0.030 to 4.1.1.036 (version 4.1.0.030 was moved to DEPRECATED).
Status RWTH Compute Cluster, 2013-06-13
Tim Cramer posted on Jun 13, 2013
Status RWTH Compute Cluster, 2013-06-13 Maintenance 11th June, 9:00-16:00 During the maintenance the new OFED stack was installed successfully. Up to now we do not see any serious stability problems. A bug in the libpsm_infinipath library was fixed after the maintenance. As expected, the Intel Xeon Phi coprocessors are not working with the new OFED stack, so the bandwidth between host and coprocessor is very slow. We are trying to find a solution for that issue.…
Status RWTH Compute Cluster, 2013-06-07
Tim Cramer posted on Jun 07, 2013
Status RWTH Compute Cluster, 2013-06-07 1. Power Outage 2. Replacement of Power Supply in JARA Partition 3.
Maintenance 11th, 9:00-16:00 4. Staff Outing 14th Due to a defective electric cable (excavator) we had a power outage on
Saturday, 1st. About 800 systems were rebooted. Please resubmit affected jobs. We had a replacement of a power supply in
the JARA partition, so that some jobs had a slightly longer waiting time. As already announced we will have a big maintenance
on 11th, 9:00-16:00.…
memusage
Paul Kapinos posted on Jun 05, 2013
The 'memusage' script (see Primer, chapter 5.11 on page 66, http://www.rz.rwth-aachen.de/hpc/primer) was updated to version 1.3
Status RWTH Compute Cluster, 2013-05-27
Tim Cramer posted on May 27, 2013
Status RWTH Compute Cluster, 2013-05-27 1. ScaleMP 2. cluster-linux-counter 3. New OFED stack The ScaleMP system is online again. The special purpose frontend cluster-linux-counter.rz.rwth-aachen.de is not needed anymore and was turned off. Due to a security bug in the Linux kernel we are forced to switch the OFED stack. The current OpenFabrics stack is not compatible with the latest kernel and the Lustre (HPCWORK) system.…
Status RWTH Compute Cluster, 2013-05-17
Tim Cramer posted on May 17, 2013
Status RWTH Compute Cluster, 2013-05-17 Due to license issues the big ScaleMP node is offline at the moment. We are
working on a solution.
New Primer release 8.2.5
Paul Kapinos posted on May 08, 2013
A new edition of the Primer, version 8.2.5 of 08 May 2013, is now available at http://www.rz.rwth-aachen.de/hpc/primer. Most important changes: • The version of the default compiler changed to intel/13.1 (instead of the 12.1 versions). • Some other modules were updated to newer versions; old versions were deprecated. • The runtime limit in the JARA-HPC partition has been changed from 24h to 72h, cf. chapter 4.6 on page 48 • The array job example script has been fixed, cf. listing 3 on page 40.…
Boost (Lin) installed
Paul Kapinos posted on May 07, 2013
Boost library now installed for Intel and Open MPI, and for Intel and GCC compilers. Boost provides free peer-reviewed
portable C++ source libraries that work well with the C++ Standard Library. Boost libraries are intended to be widely useful, and
usable across a broad spectrum of applications. More information can be found at http://www.boost.org/. To initialize the
environment, use $ module load LIBRARIES; module load boost. This will set the environment variables boost_root,…
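A hedged compile sketch for a header-only Boost program; $BOOST_ROOT is an assumed name for the variable set by the module (the announcement above is truncated), and $CXX is assumed to point to the loaded C++ compiler:
$ module load LIBRARIES; module load boost
$ $CXX -I$BOOST_ROOT/include my_prog.cpp -o my_prog   # $BOOST_ROOT is a hypothetical variable name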
HDF5 (Lin) installed
Paul Kapinos posted on Apr 29, 2013
HDF5 library now installed for Intel and Open MPI, and for Intel and GCC compilers. HDF5 is a data model, library, and file
format for storing and managing data. It supports an unlimited variety of datatypes, and is designed for flexible and efficient I/O
and for high volume and complex data. More information can be found at http://www.hdfgroup.org/HDF5/. To initialize the
environment, use $ module load LIBRARIES; module load hdf5. This will set the environment variables hdf5_root,…
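A hedged compile-and-link sketch for a serial HDF5 program in C; $HDF5_ROOT is an assumed name for the variable set by the module, and $CC is assumed to point to the loaded C compiler:
$ module load LIBRARIES; module load hdf5
$ $CC my_io.c -I$HDF5_ROOT/include -L$HDF5_ROOT/lib -lhdf5 -o my_io   # variable names are assumptions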
Status RWTH Compute Cluster, 2013-04-26
Tim Cramer posted on Apr 26, 2013
Status RWTH Compute Cluster, 2013-04-26 Maintenance On May 6th, 13:00-17:00, we have scheduled a maintenance downtime for the RWTH compute cluster. Both Linux and Windows dialog and batch systems are affected and will be unavailable in that time. The following work is planned: upgrade of all Netapp filers (HOME/WORK) to a new ONTAP version, upgrade of all Lustre (HPCWORK) clients from version 1.8 to version 2.…
OpenFOAM news
Paul Kapinos posted on Apr 23, 2013
The OpenFOAM (http://openfoam.com/) installation was redone on 22.04.2013: the new version 2.2.0 was installed and made the default (built with the intel/13.1 and gcc/4.8 compilers, Intel and Open MPIs). Versions 2.0.0, 2.0.1, 2.1.1 were rebuilt with the current compilers (intel/13.1, GCC/system_default, Intel and Open MPIs). ParaView is now included in the OpenFOAM installation; loading ParaView from modules is no longer necessary. Until 22.04.…
Intel MPI default change
Paul Kapinos posted on Apr 22, 2013
Intel MPI version update: 4.1(.0.024) is moved to DEPRECATED; 4.1(.0.030) becomes the default 4.1 (instead of 4.1.0.024) and the Intel MPI default. 4.0 (the former default) remains for a while and will be moved to DEPRECATED in a few weeks.
Intel Compiler minor revision update
Paul Kapinos posted on Apr 22, 2013
The minor revision of intel/13.1 (default) changed from ".0.146" to ".1.163": 13.1(.0.146) => DEPRECATED, 13.1.1.163 => 13.1 (default)
Status RWTH Compute Cluster, 2013-04-19
Tim Cramer posted on Apr 19, 2013
Status RWTH Compute Cluster, 2013-04-19 The default behaviour of screen was modified (/etc/screenrc), so that the
LD_LIBRARY_PATH will be retained.
New Version of ParaView - 3.98.1
Paul Kapinos posted on Apr 18, 2013
The new version 3.98.1 of ParaView http://www.paraview.org/ was installed and immediately made the default ParaView: module load GRAPHICS module load paraview Versions 3.6.1 and 3.10.1 were moved to DEPRECATED
Status RWTH Compute Cluster, 2013-04-12
Tim Cramer posted on Apr 12, 2013
Status RWTH Compute Cluster, 2013-04-12 There will be a maintenance on 6.5.2013 during which we have to shut down the complete Linux cluster, because a defective Infiniband switch has to be replaced and the Lustre client will be switched. Details will follow.
TotalView Debugger - new (beta) release 8.12
Paul Kapinos posted on Apr 11, 2013
A new (beta) version of the TotalView debugger is now available: totalview/8.12b
GCC compilers - new release 4.8.0 available
Paul Kapinos posted on Apr 11, 2013
The new release (4.8.0) of the GCC compilers will be available from 12.04.2013 on: module load gcc/4.8
New Intel Software
Paul Kapinos posted on Apr 02, 2013
New Intel software has been installed in the RZ cluster (available from 03.04.2013 on): Intel® Cluster Studio XE for Linux Version 2013 (Update 3 Eng/Jpn) $ module switch intel intel/13.1.1.163 Intel® Inspector XE for Linux Version 2013 (Update 5) $ module load intelixe
Python installations
Paul Kapinos posted on Mar 28, 2013
Major update of the Python installations on the RZ cluster: versions 2.5.6, 2.6.8, 2.7.3, 3.3.0 installed. On all but the 3.x versions, these Python modules are now available: NumPy, SciPy, matplotlib. Unset PYTHONPATH to disable these modules (e.g. for installing/using your own versions). The GCC compiler was used for building; thus, if in doubt, try 'module switch intel gcc'. Version 2.7.3 was set as the default python module (instead of 2.7.1). Versions 2.4.6, 2.5.5, 2.6.6, 2.7.1, 3.1.1, 3.2.…
Status RWTH Compute Cluster, 2013-03-28
Tim Cramer posted on Mar 28, 2013
Status RWTH Compute Cluster, 2013-03-28 1. Maintenance 2. Easter During the maintenance this week we successfully upgraded the Lustre servers without losing any data. We wish all our users a happy Easter vacation.
New default modules
Paul Kapinos posted on Mar 26, 2013
During the maintenance on 26.03.2013 the following changes to the default modules were made: 1. The new default MPI is Open MPI 1.6.4 (previously 1.6.1). 2. The new default compiler is Intel 13.1(.0.146), instead of the previous 12.1(.6.361). 3. The intel/12.1 compiler is now 12.1.7.367 instead of 12.1.6.361 (the latter version was moved to DEPRECATED). 4. The modules intel/13.0 and openmpi/1.5.…
Status RWTH Compute Cluster, 2013-03-21
Frank Robel posted on Mar 21, 2013
Status RWTH Compute Cluster, 2013-03-21 1. Maintenance, March 26th 9:00 – 12:00 2. Malfunction Lustre filesystem 3. Malfunction of one IH chassis As already announced, we scheduled a maintenance for March 26th 9:00 - 12:00: upgrade of the Lustre filesystem (HPCWORK). Please note the following: the downtime of HPCWORK will exceed the announced maintenance period. Therefore, it is important to mark all jobs using HPCWORK with #BSUB -R "select[hpcwork]".…
VASP news
Paul Kapinos posted on Mar 21, 2013
The following changes were made to the installation of VASP (Vienna Ab initio Simulation Package, https://www.vasp.at/): versions 4.5, 4.6hack, 5.2.2, 5.2.12 (without the fix from 11.11.2011) were moved to DEPRECATED. If you absolutely need one of these versions, please let us know; otherwise these versions will be removed irrevocably at the end of 2013. Version 5.2.12f (5.2.12 with the fix from 11.11.…
Status RWTH Compute Cluster, 2013-03-08
Tim Cramer posted on Mar 08, 2013
Status RWTH Compute Cluster, 2013-03-08 1. PPCES 2013 2. LSF and HPCWORK For our PPCES 2013 several nodes were reserved (GPU cluster, 10 MPI nodes), so they cannot be used in the normal queue during the next week. Find all details about PPCES here. We improved the submission routine, so that you might get a warning if you try to use HPCWORK without setting #BSUB -R "select[hpcwork]". Please always set this option if you need HPCWORK.
Status RWTH Compute Cluster, 2013-03-01
Tim Cramer posted on Mar 01, 2013
Status RWTH Compute Cluster, 2013-03-01 Maintenance, March 26th 9:00 – 12:00 As already announced, we scheduled a maintenance for March 26th 9:00 - 12:00: upgrade of the Lustre filesystem (HPCWORK). Please note the following: the downtime of HPCWORK will exceed the announced maintenance period. Therefore, it is important to mark all jobs using HPCWORK with #BSUB -R "select[hpcwork]". Jobs not marked accordingly will fail when trying to access paths in $HPCWORK.…
Status RWTH Compute Cluster, 2013-02-22
Tim Cramer posted on Feb 22, 2013
Status RWTH Compute Cluster, 2013-02-22 1. cluster-linux-tuning available again 2. Java update 3. Firmware updates After the replacement of a defective motherboard, cluster-linux-tuning is available again. Java was (again) updated with the latest security patches. We have to install several firmware updates on all compute nodes. This will be done gradually, so parts of the compute nodes will not be available during this time.
NAG Toolbox for MATLAB available
Paul Kapinos posted on Feb 22, 2013
Dear MATLAB users, the NAG Toolbox for MATLAB is now available for the MATLAB versions 2008a, 2008b, 2009a, 2009b, and 2010a. (For the versions 2010b, 2011a, 2011b, 2012a, 2012b the NAG Toolbox is currently not usable due to license problems. Please let us know if you urgently need the NAG Toolbox with one of these versions.) The toolbox is integrated directly into MATLAB, so no additional modules need to be loaded.…
Status RWTH Compute Cluster, 2013-02-19
Tim Cramer posted on Feb 19, 2013
Status RWTH Compute Cluster, 2013-02-19 1. Update Intel Compiler and Intel MPI 2. Memory limits on graphical frontends We installed the Intel Compiler 13.1 and the Intel MPI 4.1.0.030. Both are available in the module system. We decreased the memory limit (cgroups) per user to 16 GB of really used memory on the frontends cluster-x and cluster-x2. There is no limit for the virtual memory.
NAG's numerical libraries - new versions installed
Paul Kapinos posted on Feb 18, 2013
The following new versions of the numerical libraries from NAG (Numerical Algorithms Group, www.nag.com/) are now available: The NAG C library Mk23: http://www.nag.com/numeric/CL/CLdescription.asp The NAG SMP library Mk23: http://www.nag.com/numeric/FL/FSdescription.asp Note that the NAG C libraries are now available in two versions: LP64 ('usual' 64bit library, 32bit integers) and ILP64 ('..._ilp64' version, 64bit integers). Using the wrong version will crash your application.…
NAG's "nagfor" compiler version 5.3.1
Paul Kapinos posted on Feb 18, 2013
The Fortran compiler "nagfor" from the Numerical Algorithms Group (NAG, www.nag.com/) is now installed in version 5.3.1. The older version 5.2 is still available. Version 5.3.1: $ module load nagfor or $ module load nagfor/5.3.1 Version 5.2: $ module load nagfor/5.2
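A minimal compile sketch with the NAG compiler (the source file name is just an example):
$ module load nagfor/5.3.1
$ nagfor -o hello hello.f90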
AMD's ACML installation updates
Paul Kapinos posted on Feb 13, 2013
New ACML versions installed: 4.4.0 (set as the default ACML version instead of 4.3.0), 4.4.0_mt, 5.3.0, 5.3.0_mt. The 5.3.0 version is known to work with GCC compilers version 4.6 or newer, so it does not work with the default GCC compiler gcc/system-default. These versions of ACML were moved to the DEPRECATED category: 4.0.1, 4.0.1_mp, 5.0.0, 5.0.0_mp
Status RWTH Compute Cluster, 2013-02-11
Tim Cramer posted on Feb 11, 2013
Status RWTH Compute Cluster, 2013-02-11 1. Update likwid 2. Maintenance 26.03.2013 3. cluster-linux-tuning likwid was
updated to version 3.0. Use $ module load likwid to use the performance tool. Please note that the tool does not work on the
BCS systems. We scheduled a big maintenance for the 26.03.2013. Morning: Whole cluster including the frontends. At least (!)
one day: HPCWORK. Due to a bug in the lustre file system HPCWORK will not be available for at least the complete day.…
Status RWTH Compute Cluster, 2013-01-25
Tim Cramer posted on Jan 25, 2013
Status RWTH Compute Cluster, 2013-01-25 1. New frontend: cluster-copy This week we just have a little reminder that we established a new cluster frontend: cluster-copy.rz.RWTH-Aachen.DE (short: cluster-copy). This node is intended to be used for data transfer operations. Please use this frontend for copying data from/to/within the cluster, for packing with TAR, for compressing with GZIP, or similar operations.
Status RWTH Compute Cluster, 2013-01-17
Frank Robel posted on Jan 17, 2013
1. Correct use of $HPCWORK in batch mode. Correct use of $HPCWORK in batch mode: if your batch job uses the HPCWORK file system, you should set this parameter: #BSUB -R "select[hpcwork]" This will ensure that the job runs on machines with an up-and-running Lustre file system. On some machines (mainly the hardware from the pre-Bull installation and some machines from Integrative Hosting) the HPCWORK is connected via Ethernet instead of InfiniBand,…
Status RWTH Compute Cluster, 2013-01-11
Paul Kapinos posted on Jan 11, 2013
1. We're glad to announce a new RZ-Cluster frontend: cluster-copy.rz.RWTH-Aachen.DE (short: cluster-copy). This node is dedicated to big data transfer operations (movement of data from/to the RZ cluster, packing with TAR, compressing with GZIP and so on). 2. The number of jobs waiting for execution in the batch queue is extraordinarily high at the moment; thus the actual waiting time can be pretty long. 3. A new Linux kernel was installed, thus all RZ-Cluster nodes are rebooted.…
Happy New Year 2013
Tim Cramer posted on Jan 04, 2013
Dear cluster users, we hope you all had a Merry Christmas and wish you and your families a Happy New Year. The Cluster Team
Removal of old deprecated modules
Paul Kapinos posted on Dec 14, 2012
The following modules in DEPRECATED are now removed: Deleted: openmpi/1.4.3-O2 openmpi/1.4.4nn openmpi/1.4.4rc3
openmpi/1.5.4 openmpi/1.5.4hjs openmpi/1.5.4nn openmpi/1.5.5 openmpi/1.5.5mt openmpi/1.6 openmpi/1.6mt Deleted:
python/2.4.2 python/2.5.2 python/2.6.4
TotalView Debugger update
Paul Kapinos posted on Dec 11, 2012
TotalView Debugger update: version 8.11 (8.11.0-0) installed and made the default. Versions 8.9.1 and 8.10 were moved to DEPRECATED.
Status RWTH Compute Cluster, 2012-12-10
Frank Robel posted on Dec 10, 2012
Status RWTH Compute Cluster, 2012-12-10 1. Memory limit with cgroups 2. Maintenance HPCWORK (Lustre) 3. Malfunction
LSF License server Now cgroups memory limits are used on all frontends and MPI backends. One single user can use 75% of
the available memory as a hard limit. In mid-January we plan maintenance on HPCWORK (Lustre). If your batch job relies on
HPCWORK make sure that you have included the line #BSUB -R "select[hpcwork]" in your batch script.…
Status RWTH Compute Cluster, 2012-11-23
Tim Cramer posted on Nov 23, 2012
Status RWTH Compute Cluster, 2012-11-23 1. Balancing JARA partition / normal queue 2. Java heap space The number of available mpi-s nodes in the JARA partition was increased from 450 nodes to 648. As a consequence, the number of available nodes for the normal queue was reduced from 648 to 450. Please find all information about the JARA partition here. The default heap space for Java applications was increased from 512 MB to 2048 MB by setting the environment variable JAVA_TOOL_OPTIONS=-Xmx2048m.
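If a job needs a different heap size, the cluster-wide default can presumably be overridden in the job script (the value and the jar name are only examples):
export JAVA_TOOL_OPTIONS=-Xmx4096m   # override the default 2048 MB maximum heap
java -jar my_app.jar                 # hypothetical application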
Status RWTH Compute Cluster, 2012-11-16
Tim Cramer posted on Nov 16, 2012
Status RWTH Compute Cluster, 2012-11-16 1. Memory limit with cgroups 2. VASP jobs We changed the behavior of the
cgroup memory limits on cluster-linux and cluster-linux-xeon to evaluate the usability. One single user can use 80 % of the
available memory as a hard limit and 25 % as a soft limit. This means that one can execute an application which uses more
than 25% of the total memory, but if the memory becomes tight on the node this process might get killed by the operating
system.…
Intel Software updates
Paul Kapinos posted on Nov 14, 2012
The following minor updates of the Intel software installation were done: 1. Intel Compiler 13.0.1.117 installed and now available as intel/13.0 (the previous 13.0 version, 13.0.0.079, goes to DEPRECATED) 2. Intel Compiler 12.1.7.367 installed and now available as intel/12.1.7.367 Note: an update of 12.1 (the default Intel compiler) is planned for the next cluster maintenance. 3.…
Status RWTH Compute Cluster, 2012-11-09
Tim Cramer posted on Nov 09, 2012
Status RWTH Compute Cluster, 2012-11-09 1. Maintenance 14.11.2012, 8:00 - 9:00 2. Binding to BCS and ScaleMP boards As already announced, there will be a maintenance on 14.11.2012, 8:00 - 9:00. Due to a security kernel update, all Linux batch nodes will be rebooted in this maintenance. The frontends are not affected. We switched the mechanism of the board binding on BCS and ScaleMP to a cgroups-based approach.…
Status RWTH Compute Cluster, 2012-11-02
Tim Cramer posted on Nov 02, 2012
Status RWTH Compute Cluster, 2012-11-02 1. New Primer release 2. Malfunction cooling system 3. Malfunction Bull Infiniband 4. Maintenance HPCWORK (Lustre) 5. Deactivation of XRC We published a new (minor) Primer release: http://www.rz.rwth-aachen.de/hpc/primer In the night from Thursday, November 1st to Friday morning we had a malfunction of the cooling systems. As a result of this malfunction around 100 nodes shut down automatically. Please resubmit affected jobs. From Friday,…
Status RWTH Compute Cluster, 26.10.2012
Frank Robel posted on Oct 26, 2012
Status RWTH Compute Cluster, 26.10.2012 1. Memory limitation on the frontend cluster-linux-xeon 2. Maintenance of cluster On cluster-linux-xeon the amount of memory each user can access has been limited to 8GB using cgroups for testing purposes. The cluster will likely be maintained on Nov. 13th 2012. Further information is going to be announced soon.
Status RWTH Compute Cluster, 2012-10-22
Tim Cramer posted on Oct 22, 2012
Status RWTH Compute Cluster, 2012-10-22 1. New Java versions 2. Cgroups 3. NX Client 4. Kernel update 5. LSF configuration JARA queue 6. GPU cluster 7. MPI jobs on barcelona The Java versions were updated: jdk1.6.0_35 updated to jdk1.6.0_37, jdk1.7.0_07 updated to jdk1.7.0_09. The java module will be deleted. As already announced, we established a fair share for the CPU time on all frontend nodes by applying cgroups. Furthermore,…
VASP - switch to openmpi-1.6.1 versions, possible failures on Monday and Tuesday
Paul Kapinos posted on Oct 16, 2012
Dear VASP users, all MPI versions of all VASP versions available via modules are now those compiled with Open MPI 1.6.1. Due to the activation of the XRC feature on Monday, your batch jobs may have failed between 2012.10.15 10:00 and 2012.10.16 18:30 with this error message: > WARNING: The Open MPI build was compiled without XRC support, but XRC("X") queues were specified in the btl_openib_receive_queues MCA parameter. Please resubmit these jobs.
OpenMPI 1.6.1 XRC activated
Tim Cramer posted on Oct 16, 2012
OpenMPI 1.6.1: XRC activated The XRC feature was activated for OpenMPI 1.6.1 on Monday. With this update, old binaries cannot be started in the batch system although they might work interactively. Please rebuild your applications (this can be done on the frontends).
Status RWTH Compute Cluster, 12.10.2012
Tim Cramer posted on Oct 12, 2012
Status RWTH Compute Cluster, 12.10.2012 1. Upgrade SL 6.3 2. MPI fallback deactivated 3. BCS / ScaleMP binding We will upgrade all nodes in the cluster to Scientific Linux 6.3 during the next days – an impact for the users is not expected. We will deactivate the MPI fallback for all Infiniband-connected nodes. This will avoid MPI jobs starting with very slow connection alternatives like IPoIB by mistake. The disadvantage is that the job will die after an IB malfunction.…
Status RWTH Compute Cluster, 2012-10-04
Frank Robel posted on Oct 04, 2012
Status RWTH Compute Cluster, 2012-10-04 The CPU ulimits on the frontends have been switched off for testing as a consequence of the introduction of cgroups.
Status RWTH Compute Cluster, 2012-09-27
Frank Robel posted on Oct 04, 2012
Status RWTH Compute Cluster, 2012-09-27 Small disruption on Monday 24.09. from 05:25 to 08:40. No jobs were started.
New Intel Software
Paul Kapinos posted on Sep 21, 2012
New Intel software: Intel Compiler 12.1.6.361 (module load intel/12.1, now the default; version 12.1.5.339 moved to DEPRECATED), Intel MKL 10.3.9.293 (module load LIBRARIES intelmkl/12.3; version 10.3.7.256 moved to DEPRECATED)
New GCC compiler - 4.7.2
Paul Kapinos posted on Sep 21, 2012
Dear RZ cluster users, a new version of the GCC compiler is installed: 4.7.2, available as gcc/4.7. The version 4.7.0, which was previously available under this name, has been moved to DEPRECATED.
Status RWTH Compute Cluster, 2012-09-14
Tim Cramer posted on Sep 14, 2012
Status RWTH Compute Cluster, 2012-09-14 1. Malfunction of the Job Scheduler 2. Maintenance of the Cooling System 3.
Power Outage, Monday 10.09.2012 4. Defective Power Supply, Wednesday 12.09.2012 5. Cgroups The vendor of the LSF job
scheduler provided a workaround for the slot reservation issue. Due to the fact that the slot reservation works with this
modification at the moment, it is possible to get big MPI jobs scheduled again. Platform Computing (IBM) is still working on a
final solution.…
Status RWTH Compute Cluster, 2012-09-07
Tim Cramer posted on Sep 07, 2012
Status RWTH Compute Cluster, 2012-09-07 1. Malfunction of the Job Scheduler 2. JAVA Update 3. Cgroups There is still a
problem with the LSF job scheduler which makes it very difficult to get big MPI jobs scheduled. The vendor Platform LSF (IBM)
was able to reproduce the problem and is working with the highest priority on this issue. In order to fix a security issue with
JAVA, the version in the cluster was updated.…
New Portland Group Compilers
Marcus Wagner posted on Sep 06, 2012
have been installed. While 12.8 is the new version, 12.3 still remains the default.
OpenFOAM - installations reworked
Paul Kapinos posted on Sep 03, 2012
The OpenFOAM installations can currently (as of 18.04.2012) not be used with the current Intel compiler; the Intel 12.1 compiler must be used. The OpenFOAM installations will be renewed as soon as possible. The installations of OpenFOAM (http://www.openfoam.com/) have been reworked today (effective 04.09.2012): the versions 2.0.0, 2.0.1, 2.1.1 (the current default version) were rebuilt with Intel 12.1 and GCC compilers and Open MPI 1.6.…
FDS (Fire Dynamic Simulator) - new installations again
Paul Kapinos posted on Sep 03, 2012
The previously announced work on the FDS installation was extended/adjusted today (2012.09.03) as follows: version 6.0a12450 newly installed (available as 6.0a); version 6.0a11934 (formerly available as 6.0a) moved to DEPRECATED; version 5.2 (5.2.5) moved to DEPRECATED; versions 5.5 (5.5.3) and 5.3 (5.3.1) rebuilt with the Intel compiler (12.1.5.339), Intel MPI (4.0.3.008), Open MPI (1.6.1).…
VASP new version 5.2.12 with accumulated fixes from
11.11.2012
Paul Kapinos posted on Aug 31, 2012
A new version of the VASP software is available: $ module load CHEMISTRY $ module load vasp/5.2.12f This version includes bug fixes for VASP 5.2.12: http://www.vasp.at/index.php?option=com_content&view=article&id=98:bugfix-accumulated-fixes-for-vasp5212&catid=40:bugfixes&Itemid=63 This version was built with Intel compilers 12.1.5.339 and Open MPI 1.6.1. The files mpi.o, main.o, xcgrad.o, xcspin.o were compiled with a reduced optimization level (-O1 instead of -O3).…
NWChem
Paul Kapinos posted on Aug 31, 2012
A new release (6.1.1) of the NWChem software (see http://www.nwchem-sw.org/index.php/Main_Page) is installed. This version uses Open MPI 1.6.1 and was compiled with the Intel compiler 12.1.5.339. The older versions 6.0 and 6.1 of the NWChem software were moved to the DEPRECATED category.
Status RWTH Compute Cluster, 2012-08-31
Tim Cramer posted on Aug 31, 2012
Status RWTH Compute Cluster, 2012-08-31 1. Malfunction of the Job Scheduler 2. Maintenance of the Cooling System 3. New
MPI: Open MPI 1.6.1 Due to a malfunction of the LSF job scheduler from last Monday 12:00 to Tuesday 00:45 no jobs started
on the linux-cluster. To fix this problem the scheduler does not make any reservations at the moment. As a consequence there
might be longer waiting times for parallel jobs.…
PETSc - new installation
Paul Kapinos posted on Aug 28, 2012
The following changes were made to the installation of PETSc (Portable, Extensible Toolkit for Scientific Computation, http://www.mcs.anl.gov/petsc/): version 3.3 (p2) newly installed, available as petsc/3.3. Note: this version was built with OpenMPI 1.6.1, so this MPI must be loaded before loading the PETSc module. The default version was set to 3.3 (version 3.2 remains available). Version 3.0.0 was moved to DEPRECATED.
FDS (Fire Dynamic Simulator) - new installations
Paul Kapinos posted on Aug 13, 2012
The following changes were made to the FDS (Fire Dynamic Simulator) installations: version 6.0a11934 newly installed (available as 6.0a); version 6.0a10519 (formerly available as 6.0a) moved to DEPRECATED; version 5.5.3 (available as 5.5) built with the current Intel compiler with Intel MPI, Open MPI (1.6), and serial; the default FDS was set to 5.5 (instead of 5.…
Status RWTH Compute Cluster, 2012-08-10
Tim Cramer posted on Aug 10, 2012
Status RWTH Compute Cluster, 2012-08-10 1. Requeued jobs For maintenance reasons we had to requeue several jobs during the week.
Status RWTH Compute Cluster, 2012-08-03
Tim Cramer posted on Aug 03, 2012
Status RWTH Compute Cluster, 2012-08-03 1. Node cleaner 2. Maintenance 20.08.2012, 9:00 - 15:00 3. Modification of ulimits 4. Security problems on the GPU cluster In order to improve the performance of the cluster we established a new node cleaner, which is executed before and after every exclusive batch job. The node cleaner does two things: a flush of the file system cache, which will solve some serious performance issues concerning the correct ccNUMA placement of your data,…
Status RWTH Compute Cluster, 2012-07-27
Tim Cramer posted on Jul 27, 2012
Status RWTH Compute Cluster, 2012-07-27 1. Firmware updates on Bull machines 2. Binding of MPI processes At the
moment all nodes in the Bull cluster get new firmware updates for a better memory monitoring. These updates may lead to
earlier crashes of the nodes in case of defective memory modules, so that the mean time to recover (MTTR) will be lower. If
one of your jobs crashed unexpectedly this week, please just try to resubmit it.…
Status RWTH Compute Cluster, 2012-07-13
Frank Robel posted on Jul 13, 2012
BCS OpenMPI 1.6 Lustre / $HPCWORK Air conditioning maintenance BCS: you can now submit 120-hour jobs on BCS; previously, only 24 hours were possible. OpenMPI 1.6: OpenMPI 1.6 is now available. Since it will soon be set up as the default, please convert and test your programs. At the moment, you can switch to the new version with the following command: module switch openmpi openmpi/1.6 Programs built with older versions of OpenMPI won't work with OpenMPI 1.6.…
Status RWTH Compute Cluster, 2012-06-29
Tim Cramer posted on Jun 29, 2012
Status RWTH Compute Cluster, 2012-06-29 1. Maintenance 2. Power outage 3. Update quota command As announced (refer to http://www1.rz.rwth-aachen.de/kommunikation/betrieb/auto/stoerungsmeldungen/index.php) there will be a maintenance starting on Sunday 1st, 12 am until Monday 2nd, 12 am: Bull benchmarks, operating system update to Scientific Linux 6.2, Lustre (HPCWORK) update to 1.8.8, update of the Netapp (HOME/WORK) system to Ontap 8.1.p1. The default Intel compiler will be updated to 12.1.5.…
Status RWTH Compute Cluster, 2012-06-15
Frank Robel posted on Jun 15, 2012
Status RWTH Compute Cluster, 2012-06-15 As announced on the rzcluster list, we scheduled two maintenance frames
29.06.2012 15:00-16:00: Bull will prepare their benchmark session. All BCS-systems will be rebooted The
Infiniband-Subnet-Manager in the Bull-Fabric will be restarted. For that, the batch-mode in the Bull-Fabric must be stopped On
cluster, cluster-linux and cluster-x, there will be a short network failure. You might use cluster2 or cluster-x2. 02.07.…
Status RWTH Compute Cluster, 2012-06-04
Tim Cramer posted on Jun 04, 2012
Status RWTH Compute Cluster, 2012-06-04 All information in /etc/passwd is now stored anonymously. As the name (e.g. for the finger command) only the username is displayed; gender-specific data has been removed.
Status RWTH Compute Cluster, 28.05.2012
Tim Cramer posted on May 29, 2012
Status RWTH Compute Cluster, 28.05.2012 1. QPI speed on BCS nodes 2. MPI test reactivated 3. Minor malfunction Due to some stability problems detected by Bull in their BCS chips, the speed of the QPI was reduced from 6.4 GT/s to 4.8 GT/s on the corresponding nodes. Please note that there will be additional maintenance windows for these systems (only the BCS) in the next weeks. The MPI test which is executed before every MPI-parallel job has been reactivated.…
Open MPI - 'carto' feature activated on BCS nodes
Paul Kapinos posted on May 22, 2012
On nodes with more than one active InfiniBand card (i.e. the BCS nodes), the 'carto' feature of Open MPI has been active since 22.05.2012. Note these environment variables: OMPI_MCA_carto OMPI_MCA_carto_file_path
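A hedged sketch of how these variables could be inspected or overridden for a run; the component name 'file' and the file path are assumptions:
$ echo $OMPI_MCA_carto $OMPI_MCA_carto_file_path       # show the current carto settings
$ export OMPI_MCA_carto=file                           # assumed component name
$ export OMPI_MCA_carto_file_path=$HOME/my_carto.txt   # hypothetical topology file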
Status RWTH Compute Cluster, 2012-05-16
Tim Cramer posted on May 16, 2012
Status RWTH Compute Cluster, 2012-05-16 1. Maintenance on last Monday 2. MPI Test 3.…
Module changes
Paul Kapinos posted on May 14, 2012
On the occasion of the cluster maintenance on 14.05.2012, some changes to the available modules were made. 1) Older software versions were moved to the DEPRECATED category. These versions are still available after loading the DEPRECATED category with 'module load DEPRECATED'. 2) The names of some beta versions were unified: beta versions now have 'b' as the last letter of the version number.
Status RWTH Compute Cluster, 2012-05-11
Tim Cramer posted on May 11, 2012
Status RWTH Compute Cluster, 2012-05-11 1. Maintenance 2. BCS nodes partly online 3. ScaleMP online 4. Fault on central IB network component 5. Routing algorithm Infiniband subnet manager As announced on the rzcluster list, we scheduled a maintenance for 14.05.2012, 15:00-18:00: the complete Linux and Windows cluster (including frontends) will not be available. The following changes will be made: update of the file systems, configuration of integrative hosted systems. 14.05.2012, 18:00 – 15.05.2012,…
Status RWTH Compute Cluster, 2012-05-04
Tim Cramer posted on May 04, 2012
Status RWTH Compute Cluster, 2012-05-04 1. Batch System / Compute Resources 2. Fault on central IB network component
3. Defective Compute Nodes One third of the Bull cluster is now reserved exclusively for the JARA HPC partition. In order to
make big parallel jobs possible for all other users and get a fair share of the remaining resources all serial jobs will not be
scheduled to the Bull Cluster anymore, but only to older parts of the cluster.…
Status RWTH Compute Cluster, 2012-04-21
Tim Cramer posted on Apr 22, 2012
Status RWTH Compute Cluster, 2012-04-21 1. Maintenance April, 24th 8am to April, 25th 8am 2. New hardware of cluster As
already announced on the rzcluster list we scheduled the next maintenance to stabilize the Infiniband network together with
Bull from April, 24th 8am to April, 25th 8am. During the maintenance no batch jobs will start on the Bull cluster and it is not
possible to run MPI jobs interactively on the backends with $MPIEXEC wrapper.…
Status RWTH Compute Cluster, 2012-04-13
Tim Cramer posted on Apr 13, 2012
Status RWTH Compute Cluster, 2012-04-13 1. Shell profiles / module system 2. $WORK / $HPCWORK clean up 3. Maintenance April, 12th (yesterday) 4. Next maintenance from Monday, 16.4. 12:00 – Tuesday 17.4. 12:00 (24 hours) Many users load modules in their shell profiles (e.g. $HOME/.zshrc). Please note that this can trigger a lot of issues for your batch jobs, so we really recommend not to do that. Unfortunately, we cannot support module loads in your profiles.…
Status RWTH Compute Cluster, 2012-03-30
Tim Cramer posted on Mar 30, 2012
Status RWTH Compute Cluster, 2012-03-30 1. Maintenance 2. $HPCWORK performance 3. Dispatching time of bigger MPI jobs As already announced last week, we scheduled a maintenance for 03.04.2012, 10:00-14:00. In addition to the changes announced last week, the following modifications will be made: change of the startup mechanism of Intel MPI. If you use the $MPIEXEC and $FLAGS_MPI_BATCH environment variables, your old job scripts will still work. The machine
cluster.rz.rwth-aachen.…
GCC 4.7.0 installed
Paul Kapinos posted on Mar 26, 2012
$ module load gcc/4.7
Status RWTH Compute Cluster, 2012-03-26
Tim Cramer posted on Mar 26, 2012
Status RWTH Compute Cluster, 2012-03-26 1. Power outage 2. Maintenance 03.04.2012, 10:00-14:00 3. Storage system Due to a short circuit in one of the chassis and a power outage (city center and campus Melaten) caused by a bird (really!) last week, 1000 systems in the Linux cluster were shut down. Due to this failure, 90% of all jobs running at this time were killed. We apologize for any inconvenience. All systems are up and running again. We scheduled a maintenance for 03.04.2012, 10:00-14:00.…
new (beta) version of TotalView Debugger
Paul Kapinos posted on Mar 21, 2012
Log: TotalView version maintenance: 8.10.0 beta installed (actually 8X.10.0-4, available as totalview/8.10beta). The default TotalView remains 8.9.2, but now in version 8.9.2-2. The former 8.9.2 (actually 8.9.2-0) moved to DEPRECATED; 8.9 (actually 8.9.0-0) moved to DEPRECATED.
Status RWTH Compute Cluster, 2012-03-16
Tim Cramer posted on Mar 16, 2012
Status RWTH Compute Cluster, 2012-03-16 1) Reconfiguration of the SMP systems 2) Mailing list rzcluster 3) Configuration of license servers 1) Bull will be installing the BCS systems until the end of this month, so all nodes known as the SMP complex are still offline. 2) Please note that the rzcluster list is not a public discussion forum; only members of the communication center are allowed to write to this list, in order to inform the users. If you have any questions please contact servicedesk@rz.…
TotalView 8.9.2-2 installed
Paul Kapinos posted on Mar 15, 2012
TotalView /8.9.2-2 installed and patched right away: the file /rwthfs/rz/SW/ETNUS/toolworks/totalview.X.Y.Z/lib/parallel_support.tvd was edited.
FLAGS_LPATH
Paul Kapinos posted on Mar 12, 2012
In version 12 of the Intel compiler the paths are, of course, different from the 11.x versions (yes, Intel likes to change its paths!). Thanks to a user report (ticket 20120308-0466) this has now been noticed, and it is fixed. Regards, PK, HPCG, RZ RWTH AC
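A hedged sketch of how $FLAGS_LPATH is typically used (assuming the variable carries the library search path flags of the currently loaded compiler, and with a placeholder source file):

$ $CC $FLAGS_LPATH -o prog prog.c

With the variable still pointing at the old 11.x directory layout, such a link step would fail to find the Intel 12 runtime libraries.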
Status RWTH Compute Cluster, 2012-03-09
Tim Cramer posted on Mar 09, 2012
Status RWTH Compute Cluster, 2012-03-09 1) Reconfiguration of the SMP systems 2) JARA HPC 3) Reconfiguration of the automounter 4) Accessing /rwthfs 5) Maintenance March 12th 1) We expect the second stage of the cluster configuration to be completed by the end of March. In this stage, two or four of the 4-socket systems (known as SMP systems / Nehalem EX) will be connected into 8- or 16-socket systems, respectively, using proprietary BCS chips from Bull.…
Interactive MPIEXEC wrapper update - 'mpitest' updated
Paul Kapinos posted on Mar 05, 2012
Interactive MPIEXEC wrapper update: 'mpitest' updated. U mpiexec.py, U test_wrap.sh. Updated to revision 3358.
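A hedged example of interactive use of this wrapper (assuming it accepts the usual -np option; process count and program are placeholders):

$ $MPIEXEC -np 4 ./a.out

The wrapper (mpiexec.py) then maps this onto the startup mechanism of the currently loaded MPI module.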
Status RWTH Compute Cluster, 2012-03-02
Tim Cramer posted on Mar 04, 2012
Status RWTH Compute Cluster, 2012-03-02 1) Data in $WORK 2) Unstable $HPCWORK (Lustre Filesystem) 3) Deactivated
automounter 4) Frontends cluster, cluster-linux 5) Maintenance March 12th 1) The data in $WORK is not automatically cleaned
up after 4 weeks anymore. However, please keep in mind that this might change in the future again (after a new
announcement, of course) and that there is no backup for $WORK. 2) We expect that the problems with $HPCWORK hanging are solved.…
Status RWTH Compute Cluster
Tim Cramer posted on Feb 24, 2012
Status RWTH Compute Cluster, 2012-02-24 1) "Hangs" during interactive work on the Linux frontends 2) Unstable HPCWORK (Lustre file system) 3) Long waiting times in the Linux batch system 4) Performance problems with MPI applications 5) Further notes Regarding 1) On the frontends cluster.rz.rwth-aachen.de, cluster-x.rz.rwth-aachen.de and cluster-linux.rz.rwth-aachen.de the NFS mounts were switched from TCP to UDP. Since then, in our measurements, since approx.…
Software maintenance - removal of obsolete software
Paul Kapinos posted on Feb 17, 2012
Many old versions of Intel software and some other (very) old software installations have been deleted. The following directories are affected: /rwthfs/rz/SW/NAG /rwthfs/rz/SW/UTIL /rwthfs/rz/SW/intel Accordingly, some modules have been removed from the DEPRECATED area.
New release of the Primer, 8.2.1
Paul Kapinos posted on Feb 14, 2012
A new edition of the Primer, version 8.2.1 of February 14, 2012, is now available at http://www.rz.rwth-aachen.de/hpc/primer. Most important changes: 1. LSF chapter (array jobs and chain jobs, sketched below; removal of the -We parameter; new parameters for jobs accessing $HPCWORK; recommendations for maximum memory usage per slot) 2. Sun Analyzer (can use the hardware counters again, also for MPI programs) 3. memalign32 script (important for MPI programs on SMP nodes)
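A hedged sketch of the two job types from item 1, using standard LSF syntax (job names and the job ID are placeholders; see the Primer for the cluster-specific details):

#BSUB -J "myarray[1-10]"
# array job with 10 tasks; the task index is available as $LSB_JOBINDEX

$ bsub -w "done(123456)" ./next_step.sh
# chain job: starts only after job 123456 has finished successfully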
InfiniBand network problems
Marcus Wagner posted on Feb 14, 2012
On Friday, February 10, 2012, InfiniBand problems occurred which also dragged one of the gateways down with them. As a result, a quarter of the machines in the Bull fabric were no longer reachable from outside the Bull fabric. Jobs started on these systems vanished into nirvana, however, because they could not reach the LDAP servers outside the Bull fabric, so user authentication was not possible. The problem has since been resolved.
"Hangs" on the Linux cluster frontends
Georg Schramm posted on Feb 03, 2012
On the frontend systems (primarily within the new BULL cluster), "hangs" have been occurring frequently in recent weeks. This problem is known and a solution is being worked on intensively.…
New release of the Primer, 8.2
Paul Kapinos posted on Jan 30, 2012
A new edition of the Primer, version 8.2 of January 2012, is now available at http://www.rz.rwth-aachen.de/hpc/primer. Most important changes: 1. LSF chapter (array jobs, chain jobs, removal of -We) 2. Sun Analyzer (can use the hardware counters again, also for MPI programs).
Sun Studio version 12.3 (release) installed.
Paul Kapinos posted on Jan 26, 2012
Sun Studio version 12.3 (release) has been installed and made the default Studio right away. Available with "module load studio". To show the available hardware counters: $ collect -h
Intel compiler version maintenance: 12.1.2.273 is the new default compiler
Paul Kapinos posted on Jan 04, 2012
Intel compiler version maintenance: 12.1.0.233 moves to DEPRECATED; 12.1.2.273 becomes the new default compiler (12.1). Background: "version 12.1.0.233 of the Intel compiler has a vectorization bug." https://svn.open-mpi.org/trac/ompi/changeset/25290 http://www.open-mpi.org/community/lists/users/2012/01/18091.php…
Matlab modules adjusted
Paul Kapinos posted on Dec 21, 2011
Matlab modules edited: 1. Versions 2006* and 2007* are no longer available and are hereby dispatched to nirvana. 2. The module has been adjusted so that all changes happen in the module itself and the version files contain nothing but the version number. 3. A warning has been added that other programs may not be usable while Matlab is loaded.
Module maintenance - old TotalView versions DEPRECATED
Paul Kapinos posted on Dec 16, 2011
Module maintenance: TotalView versions 8.8, 8S.9.1-0A and 8.9.2beta moved to DEPRECATED; the new default TotalView is now 8.9.2.
Gromacs 4.5.5 installed
Paul Kapinos posted on Nov 18, 2011
Version 4.5.5 of Gromacs (http://www.gromacs.org) has been installed in the new (Bull) cluster. Loading the module: $ module load CHEMISTRY $ module load gromacs Older versions are now out of service.
NWChem 6.0 (no guarantee)
Paul Kapinos posted on Nov 07, 2011
http://www.nwchem-sw.org NWChem version 6.0 has been installed, built with the standard combination of the Intel 12 compiler and Open MPI 1.5.3. Usage: $ module load CHEMISTRY nwchem Users are free to try this installation; error reports are welcome, but we give NO GUARANTEE that this installation works or that we will be able to repair it in case of failure.
HPC ChangeLog: PETSc installations (no guarantee)
Paul Kapinos posted on Oct 28, 2011
An attempt has been made to provide "vanilla" installations of PETSc. Users are free to try these installations; error reports are welcome, but we give NO GUARANTEE that these installations work or that we will be able to repair them in case of failure. $ module load LIBRARIES $ module load petsc Available variants: for the Intel compiler, version 3.0.0 for Open MPI and Intel MPI, version 3.2 for Open MPI only.…
GCC compiler v4.3.4 installed
Paul Kapinos posted on Oct 24, 2011
Version 4.3.4 of the GCC compilers has been installed and is available from tomorrow with "module load gcc/4.3". This aged version is needed for Matlab.
Sun Studio version 12.3 (beta) installed.
Paul Kapinos posted on Oct 24, 2011
Available with "module load studio/12.3beta".
ParaView - new versions and modules moved
Paul Kapinos posted on Oct 21, 2011
ParaView (http://www.paraview.org/) is an open-source, multi-platform data analysis and visualization application. It is used, among other things, together with OpenFOAM. I. The following versions have been installed: 3.10.1 ==> new default; 3.12.0.RC2 ==> (does not work yet); UPD 2011.10.24 ==> version 3.12.0.RC2 removed again because it was broken. Version 3.6.1 has been moved to a new installation location (and continues to work). II.…
Permissions for PETSc fixed
Paul Kapinos posted on Oct 18, 2011
The read permissions for users were not set; this has now been corrected. /rwthfs/rz/SW/NUMLIB/PETSc-3.0.0/openmpi-1.5 Reference No.: 20111018-0524
Studio versions: 12.2 stays - everything else DEPRECATED
Paul Kapinos posted on Oct 10, 2011
The only available version of Studio remains /12.2. If needed, /12.3 (currently beta) can be installed; it is still available for download. studio/12.1p6 moves to DEPRECATED (problems with C++ in the new cluster, and too old anyway). studio/express moves to DEPRECATED (older than 12.2). Apart from that, the names have been straightened out.
gcc 4.5 fixed on SL6x machines
Paul Kapinos posted on Oct 10, 2011
The version gcc/4.5 was wrongly linked on Scientific Linux 6.x machines (new cluster), pointing to 4.5.1 from the old cluster. This has now been fixed; it points to 4.5.3.
Sun MPI has had its day and is now DEPRECATED.
Paul Kapinos posted on Oct 09, 2011
No further text.
VASP - new installation
Paul Kapinos posted on Oct 05, 2011
Dear test cluster users, the installation of VASP has been redone with the current compiler (Intel Fortran 12.1.0.233) and Open MPI 1.5.3. A switch to openmpi/1.4.3 is no longer necessary. A version for Intel MPI (v. 4.0.3.008) is also available. The AEDENS and TOTAL variants are no longer built, since it turned out that they do not differ from the standard variant. The versions available until now can be made available again for test purposes.…
New default modules (new cluster)
Paul Kapinos posted on Sep 21, 2011
New default modules: intel/12.1 (instead of intel/12, which has been renamed to intel/12.0) and openmpi/1.5.3 (instead of openmpi/1.4.3). On this occasion, two dead links to no longer existing versions of Open MPI were also cleaned up.
TotalView 8.9.2 (beta) installed
Paul Kapinos posted on Sep 13, 2011
The beta version 8T.9.2-0 of TotalView 8.9.2 has been installed. Accessible with $ module load totalview/8.9.2beta License valid until 01-oct-2011.
LSF Memory limit changed
Tim Cramer posted on Sep 02, 2011
We changed the configuration of the LSF memory limit. Unfortunately, LSF used a per-process limit in some cases and a per-job limit in others, which was very confusing. We have deactivated the control by LSF, so the memory limit is now enforced by the operating system. If you still have problems with the memory limits, please contact us. Note that the error message is not very clear if you request too little memory: the LSF mail says "Successfully completed",…
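For reference, a hedged example of setting the limit at submission time (the value and program are placeholders; the value is in MB as used on this cluster):

$ bsub -M 2048 ./a.out

If the operating system kills the job for exceeding this limit, the LSF mail may nevertheless report a normal completion, as described above.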
OpenFOAM 2.0.0
Paul Kapinos posted on Aug 11, 2011
OpenFOAM 2.0.0 is now installed in the beta cluster. This version is built using the GCC compilers and the MPI that is loaded by default, so you have to load the GCC compiler instead of the Intel compiler. An Intel-compiler build may be added if needed. Note: this installation is not intended for development on OpenFOAM itself. To use OpenFOAM,…
gcc 4.5 and 4.6
Paul Kapinos posted on Aug 11, 2011
The GCC installations in the new cluster (gcc/4.5 and gcc/4.6) have been redone on the new SL60 cluster. The C++ compilers should now work as well. Rolled out on cluster-beta2 (further machines from 12.08.2011).
Open MPI for the Intel compiler (the default) rebuilt with intel 11.1
Paul Kapinos posted on Aug 11, 2011
The versions of Open MPI for the Intel compiler (including the default one) were rebuilt with the intel/11.1 compiler on 10.08.2011. Thus, it is now also possible to use Open MPI with the intel/11.1 compiler (the previous builds worked with intel/12 only). The source of error messages like 'forrtl: severe (71): integer divide by zero' (also see below) is still unknown. We recommend trying the intel/11.1 compiler if you see such messages from your Fortran program.…
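A hedged sketch of switching to the older compiler and rebuilding (assuming the usual module name intel/11.1 and the $MPIFC wrapper variable for the MPI Fortran compiler; the source file is a placeholder):

$ module switch intel intel/11.1
$ $MPIFC -o prog prog.f90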
Default queue changed back to normal
Georg Schramm posted on Aug 01, 2011
With the majority of nodes installed under SL6.0, the default queue is now again the queue "normal". All jobs in the queue "normal-sl6" will be allowed to finish, but the queue no longer accepts new jobs.
ssh2blaunch wrapper now less verbose
Paul Kapinos posted on Jul 29, 2011
The SSH-to-blaunch wrapper in /opt/lsf/8.0/linux2.6-glibc2.3-x86_64/bin/rsh has been adjusted and now does its work silently (debug output switched off). UPD (Tue Aug 2 15:30:13 CEST 2011): the fix has been transferred to the currently active LSF installation, namely the one reachable from linuxtc04 (from linuxtc03). A versioning system for the LSF configuration files would certainly be a great advantage.
VASP installation redone on SL60
Paul Kapinos posted on Jul 29, 2011
The VASP installation in the new cluster has been redone. The versions 4.5, 4.6 and 5.2, known from the old cluster, are compiled with the new Intel compilers and with Open MPI for the MPI versions. Version 5.2 is now the default VASP version. The new installation will be active from tomorrow, 20.07.2011, on the Scientific Linux 6.0 (SL60) part of the cluster only (cluster-beta2). The version 5.2 is the same as 5.2.2 in the old part of the cluster.
Closure of CentOS 5.6 queue and new default queue
Georg Schramm posted on Jul 29, 2011
The "normal" queue with CentOS 5.6 systems was closed today and the queue normal-sl6 was configured as the default
queue. Jobs running in the CentOS 5.6 queue will be left running until next week. Waiting jobs will be switched to the SL6.0
queue, if not removed. Submitting jobs to the normal queue is disabled.
OpenMPI for SL60 computers re-installed
Paul Kapinos posted on Jul 28, 2011
All Open MPI versions have been built and installed for the Scientific Linux 6.0 (SL60) computers. Newly installed: /1.5.3 (all versions), /1.5.3mt (all versions), /1.4.3/pgi, /1.4.3mt/pgi (active from tomorrow, 29.07.2011). Rebuilt: /1.4.3/(gcc,intel,studio), /1.4.3mt/(gcc,intel,studio) (active immediately). Note: the old cluster and the CentOS 5.6 part of the new cluster are not changed.
Superfluous output during intelmpi jobs
Georg Schramm posted on Jul 28, 2011
The superfluous output when running an Intel MPI job has been avoided by redirecting the output of the PAM (Parallel Application Manager) to the error stream.
SunMPI moved to NFS
Paul Kapinos posted on Jul 28, 2011
The installation method of SunMPI has been changed: instead of rolling out RPMs on every node, the installation directories have been moved to /rwthfs/rz/SW/MPI/SCIENTIFIC-6.0/SUNWhpc and linked under /opt/SUNWhpc (sunmpi.lnk). At the same time a small bug was fixed (the PGI compiler should now be able to build MPI programs as 32-bit).
LSF updated
Georg Schramm posted on Jul 26, 2011
The default queue is called "normal" now instead of "parallel". Also, there exists a new queue, called "normal-sl6", which is the
queue for the freshly installed Scientific Linux 6.0 Systems.
Memory limit introduced
Georg Schramm posted on Jul 19, 2011
Per-process memory limits have been introduced. If the limit is not set with the bsub option -M <n> (in MB), or is set below 512 MB, the default value of 512 MB is used. Jobs submitted prior to job ID 17514 are not affected.
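A hedged example (the value and program are placeholders): to request 1 GB per process instead of the 512 MB default,

$ bsub -M 1024 ./a.out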
Exclusive Jobs
Marcus Wagner posted on Jul 19, 2011
Every job that requests >= 32 slots is set to exclusive.
File access to hpcwork
Georg Schramm posted on Jul 18, 2011
Access to the hpcwork directory on the frontend nodes cluster-beta and cluster-beta2 via file transfer clients such as WinSCP or Secure File Transfer Client is now possible.
Changed the startup-way of intelmpi
Marcus Wagner posted on Jul 18, 2011
This was done transparently for the user, who still uses mpirun.lsf. The startup no longer uses mpdboot, but starts the mpd daemons on the remote hosts by hand. This way, Intel MPI can be controlled by LSF and the administrators.
work replacement lustre
Georg Schramm posted on Jul 15, 2011
In the new cluster the Lustre file system is available and will be used as a replacement for the work directory due to an outage of the work file server. $WORK can be accessed only on the frontend nodes cluster-beta and cluster-beta2, in order to copy files to the new Lustre directories. The Lustre directory is under /hpcwork/<user>. The quota can be listed using the following command: $> lfs quota /lustreb By default each user has soft limits of 1 TB of data and 50000 files.…
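A hedged sketch of the workflow described above (the data directory is a placeholder; $USER is the standard login name variable):

$> cp -r $WORK/mydata /hpcwork/$USER/
$> lfs quota /lustreb

The first command copies existing data from $WORK (on a frontend node) into the new Lustre directory; the second shows the current usage against the 1 TB / 50000 file soft limits.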
WORK file server outage
Georg Schramm posted on Jul 15, 2011
Due to a file server outage, dispatching of jobs has been disabled.
Rescheduling fixed
Georg Schramm posted on Jul 14, 2011
Jobs exiting with value 98 were not rescheduled correctly; the problem has been fixed.
Running multiple commands with bsub
Georg Schramm posted on Jul 12, 2011
$> bsub "echo $SHELL; echo $SHELL" produced:
before user command execution in jobstarter
/usr/local_rwth/bin/zsh
after user command execution in jobstarter
/usr/local_rwth/bin/zsh
now produces:
before user command execution in jobstarter
/usr/local_rwth/bin/zsh
/usr/local_rwth/bin/zsh
after user command execution in jobstarter
Default runtime limit enabled
Georg Schramm posted on Jul 08, 2011
A default run time limit of 15 minutes has been introduced; it is applied if no run time limit is set with the bsub option -W [<hours>:]<minutes>.
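A hedged example (the time and program are placeholders): to request 90 minutes instead of the 15-minute default,

$ bsub -W 1:30 ./a.out

or, equivalently, -W 90.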
Run time limit normalization deactivated
Georg Schramm posted on Jul 07, 2011
ABS_RUNLIMIT=Y prevents the normalization of run time limits according to the CPU factor of a host. Run time limits (bsub option -W <minutes>) should now be interpreted as absolute values.
changes in the serial queue
Marcus Wagner posted on Jul 05, 2011
The serial queue is, for the moment, a special-purpose queue; only one particular user has access. Please do NOT submit to this queue, your jobs won't run.
added host-groups
Marcus Wagner posted on Jul 05, 2011
There are now new host groups: bull-mpi-l, bull-mpi-s, bull-smp-l, bull-smp-s.
Requeue exit value and maximum requeue value set
Georg Schramm posted on Jul 05, 2011
Values have been set as follows: REQUEUE_EXIT_VALUES=98 MAX_JOB_REQUEUE=10
New group introduced for access to cluster-beta*
Georg Schramm posted on Jul 04, 2011
Access to the LSF cluster frontends (cluster-beta, cluster-beta2) is now controlled by the Unix group pilotstage. Problems may occur when logging in to these nodes; if so, please let us know.
openssh-askpass installed on the bull nodes
Marcus Wagner posted on Jun 30, 2011
supposedly needed by VASP
Change of Limits
Marcus Wagner posted on Jun 30, 2011
Up to now, all limits used in LSF, such as the requested memory, were interpreted in KB. We changed this to MB, as it was in SGE.
Kernel-ib module built newly
Sascha Bücken posted on Jun 29, 2011
The InfiniBand kernel module was recompiled and activated after a reboot. MPI should be functional now.
Environment variables TMP, TEMP, TMPDIR,
TMPSESS set
admin posted on Jun 29, 2011
The environment variables TMP, TEMP, TMPDIR and TMPSESS are now set for batch job execution.
Several "frontend" packages have been additionally installed on the two frontend nodes
Sascha Bücken posted on Jun 29, 2011
firefox-3.6.18-1.el5.centos.i386 firefox-3.6.18-1.el5.centos.x86_64 nano-1.3.12-1.1.x86_64 nedit-5.5-21.el5.x86_64
perl-PDL-2.4.1-47.x86_64 1:qt-3.3.6-23.el5.x86_64 8:arts-1.5.4-1.x86_64 libieee1284-0.2.9-4.el5.x86_64
desktop-backgrounds-basic-2.0-41.el5.centos.noarch avahi-qt3-0.6.16-10.el5_6.x86_64 1:tix-8.4.0-11.fc6.x86_64
tkinter-2.4.3-44.el5.x86_64 python-imaging-1.1.5-5.el5.x86_64 3:htdig-3.2.0b6-11.el5.x86_64 fribidi-0.10.7-5.1.x86_64
libexif-0.6.13-4.0.2.el5_1.1.x86_64 gphoto2-2.2.0-3.…
Gaussian module edited
admin posted on Jun 29, 2011
The default TMP path in the gaussian module caused an error. Problem solved.
