Changes Archive
Transcription
Intel Compiler, $FC and $F77 variables: '-nofor-main' flag removed
Paul Kapinos posted on Feb 19, 2014
From 20.02.2014 on, the flag '-nofor-main' will no longer be set in the $FC and $F77 variables, for the first time since 2007-09-25, because 'cmake' cannot cope with it. If you run into trouble compiling or linking applications, please let us know!

Status RWTH Compute Cluster, 2014-02-14
Frank Robel posted on Feb 14, 2014
The wrapper r_memusage will replace memusage: the old memusage wrapper could only show the virtual memory peak. The new command, r_memusage, can also show the amount of memory actually used and includes all child processes.
The Harpertown nodes have been removed from the group of MPI backends.

Status RWTH Compute Cluster, 2014-02-06
Frank Robel posted on Feb 07, 2014
A faulty update package led to disruptions on the graphical dialog systems.
The misbehavior of one user resulted in multiple failures of cluster and cluster-linux. To avoid this kind of failure, please use $TMP and not /tmp to store temporary files.
Due to the failure of Lustre ($HPCWORK), some files may be corrupted.
The "Center for Computing and Communication" is now called "IT Center".

openmpi/1.7.4
Paul Kapinos posted on Feb 06, 2014
The new version of Open MPI, 1.7.4 (without and with multithreading, the latter as 1.7.4mt), will be available in the HPC cluster from 07.02.2014 on: http://www.open-mpi.org/community/lists/announce/2014/02/0059.php At the same time, version 1.7.3 is moved to DEPRECATED.
gcc/4.8
Paul Kapinos posted on Feb 04, 2014
A new version of the GCC compilers, 4.8.2, has been installed on the cluster and is available from 05.02.2014 on via $ module switch intel gcc/4.8 The earlier gcc/4.8 version (4.8.1) has been moved to DEPRECATED.

Status RWTH Compute Cluster, 2014-01-30
Frank Robel posted on Jan 30, 2014
Our Lustre file system $HPCWORK has failed, possibly as a result of incorrect use.
If you want to use $HPCWORK in batch mode, you need to request it; otherwise your job will be rejected. Add the following line to your batch script if you want to use $HPCWORK: #BSUB -R "select[hpcwork_fast]"

Status RWTH Compute Cluster, 2014-01-17
Frank Robel posted on Jan 17, 2014
Partial failure of the cluster gateway: on Friday 2014-01-10 between 13:48 and 14:01 there was a partial failure of the cluster gateway. Only cluster-internal communication was possible; the frontends could not be reached.
Partial failure of q_cpuquota: between 2013-12-18 12:45 and 2014-01-13 07:30, q_cpuquota delivered incorrect values.
Java JDK has been updated to version 1.7.0_51.

FDS (Fire Dynamic Simulator) installation
Paul Kapinos posted on Jan 15, 2014
The installation of FDS (Fire Dynamic Simulator, http://code.google.com/p/fds-smv/wiki/FDS_Release_Notes) in the cluster has been renewed: versions 5.3 (5.3.1), 5.5 (5.5.3, built with older compilers), 6.0.1 (SVN:17598), and 6.0alpha (SVN:16086) were moved to DEPRECATED; version 6.0.2 (updated to SVN:18014) was built with the current Intel compiler 14.0, installed as *6.0*, and made the default FDS version.…

Status RWTH Compute Cluster, 2014-01-10
Frank Robel posted on Jan 10, 2014
The Intel license server was down on 2014-01-06 from 13:00 to 17:00.
The frontend cluster-linux-opteron was shut down as announced.
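The $HPCWORK batch requirement from the 2014-01-30 status post above can be sketched as a minimal LSF job script. Only the `#BSUB -R "select[hpcwork_fast]"` line comes from the announcement; the job name, runtime, and executable are illustrative placeholders:

```shell
#!/usr/bin/env bash
#BSUB -J hpcwork_job              # job name (placeholder)
#BSUB -W 1:00                     # wall-clock limit (placeholder)
#BSUB -R "select[hpcwork_fast]"   # request $HPCWORK, as required in batch mode

# Work in the Lustre scratch directory (placeholder workload):
cd "$HPCWORK"
./a.out
```

Without the select line, jobs touching $HPCWORK would be rejected as described in the post.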
Status RWTH Compute Cluster, 2013-12-19
Paul Kapinos posted on Dec 23, 2013
In the week 12-19.12.2013, some problems with the LDAP servers were present; the 'member' function was therefore not usable from time to time. The JARA accounting tool 'q_cpuquota' did not work properly, see "Störungsmeldungen": http://maintenance.rz.rwth-aachen.de/list.php?id=14
As of the end of 2013, the frontend cluster-linux-opteron will go offline. R.I.P., good old Opteron hardware! Merry HPC-mas!

Status RWTH Compute Cluster, 2013-12-12
Frank Robel posted on Dec 13, 2013
At present we see sporadic, unusual job aborts. The problem is being investigated.

Status RWTH Compute Cluster, 2013-12-06
Frank Robel posted on Dec 13, 2013
Problems with memory cgroups on BCS: some BCS jobs crashed without any error message.

ABAQUS: a known incompatibility with the new Intel compiler
Paul Kapinos posted on Dec 03, 2013
The ABAQUS software can use user subroutines written in Fortran. For this it uses the available Fortran compiler, which on our systems is Intel's 'ifort', cf. $ abaqus info=system During the maintenance on 26.11.2013, the Intel compiler version was changed from 13.1 to 14.0. With the current installation this leads to errors like > /rwthfs/rz/SW/ABAQUS/6.13-1/code/bin/standard: symbol lookup error: /tmp/xx000000/linux000_14578/xx000000_m01-v01_2299/libstandardU.…

FDS (Fire Dynamic Simulator) - version 6.0.1 installed
Paul Kapinos posted on Dec 03, 2013
Version 6.0.1 (SVN 17598) installed. Optimization -O2 instead of -O3 due to a bug in Intel compiler 14.0. $ module load TECHNICS $ module load fds/6.0.1

Status RWTH Compute Cluster, 2013-11-29
Frank Robel posted on Nov 29, 2013
Maintenance on Nov.
26th successfully completed
Default modules Intel compiler and OpenMPI changed
The frontend cluster-linux-opteron will be shut down at the end of the year
X-Win32 / NX Client
Maintenance on Nov. 26th successfully completed: LSF 8.x was replaced by LSF 9.1.1.1. From now on, memory usage is monitored; processes using more than the requested memory will be terminated.…

Old modules deprecated
Paul Kapinos posted on Nov 27, 2013
The following modules have been moved to the DEPRECATED area: DEVELOP/openmpi/1.6.1 DEVELOP/openmpi/1.6.1mt DEVELOP/openmpi/1.6.4 DEVELOP/openmpi/1.6.4mt DEVELOP/openmpi/1.6.4knem DEVELOP/studio/12.2 LIBRARIES/nag/c_mark8 LIBRARIES/nag/c_mark8_ilp64 LIBRARIES/nag/c_mark9 LIBRARIES/nag/c_mark9_ilp64 LIBRARIES/nag/fortran_mark21 LIBRARIES/nag/fortran_mark22 LIBRARIES/nag/smp_mark21 LIBRARIES/nag/smp_mark22

Status RWTH Compute Cluster, 2013-11-21
Frank Robel posted on Nov 22, 2013
Maintenance 26.11., 7:30-16:30: jobs with a longer runtime will not be started before the maintenance.
Deprecation of 32-bit support
On November 26th we will have a big maintenance during which we have to shut down the complete cluster including the frontends (7:30-16:30). Work on the following components will be done: power supplies, air conditioning, fire alarm system, major update of LSF (v 9.1.1.…

Gromacs 4.6.4 installed
Paul Kapinos posted on Nov 19, 2013
Version 4.6.4 of Gromacs (http://www.gromacs.org) has been installed in the cluster in three variants:
gromacs/4.6.4 (Intel compiler, standard precision)
gromacs/4.6.4dp (Intel compiler, double precision, cf. http://www.gromacs.org/Documentation/Terminology/Precision )
gromacs/4.6.4cuda (gcc/4.8 compiler, standard precision, uses GPGPU - only usable on the GPU cluster)
The version gromacs/4.6.4 is now the new default version.
The older version 4.5.…

TotalView: new BETA version 8.13
Paul Kapinos posted on Nov 18, 2013
A new BETA version of the TotalView debugger is available in the cluster from 19.11.2013 on: $ module load totalview/8.13b loads the 8T.13.0-0 version. This version uses its own license file, which is valid until 03.03.2014. Please report any problems and feedback to the ServiceDesk. An excerpt from the official announcement: >We would like testing on all platforms and of all functionality but there are some really cool new features that I want to encourage you to look at - MemoryScape for Xeon Phi.…

Status RWTH Compute Cluster, 2013-11-14
Paul Kapinos posted on Nov 14, 2013
The maintenance of the power supplies (announced for 7.11-15.11) is completed. As a consequence of the power shutdown, a couple of nodes developed hardware problems; these nodes are currently under repair.

New Intel software
Paul Kapinos posted on Nov 14, 2013
New Intel software has been installed:
intel/14.0.1.106 (will replace 14.0.0.080 as intel/14.0 in the next few days)
intelmpi/5.0b (5.0.0.007, a new beta version of Intel MPI with, e.g., MPI-3 support, cf. http://software.intel.com/en-us/articles/intel-mpi-library-50-beta-readme)
intelitac/9.0b (9.0.0.007, a new beta version)

OpenFOAM version 2.1.1p
Paul Kapinos posted on Nov 12, 2013
A new version of OpenFOAM was installed on 12.11.2013: version 2.1.1 with the patch from 30.06.2012, https://github.com/OpenFOAM/OpenFOAM-2.1.x/commit/7d703d585daf11438fbc4ad3bae01199675e7f78
$ module load TECHNICS
$ module load openfoam/2.1.1p
Status RWTH Compute Cluster, 2013-11-07
Frank Robel posted on Nov 07, 2013
Maintenance of power supplies 7.11.-15.11.
Secondary accounts
From 7.11. to 15.11. we have to maintain the power supplies of each rack individually. As a consequence, waiting times may slightly increase during this period. All frontends will be available.
Automatic creation of "Hochleistungsrechnen RWTH Aachen" secondary accounts is no longer possible.

Status RWTH Compute Cluster, 2013-10-31
Tim Cramer posted on Oct 31, 2013
Maintenance of power supplies 7.11.-15.11.
Maintenance 26.11., 7:30-16:30
Deprecation of 32-bit support
Cgroups issue on BCS
X-Win32 issues
From 7.11. to 15.11. we have to maintain the power supplies of each rack individually. As a consequence, waiting times may slightly increase during this period. All frontends will be available.…

Intel compiler version update
Paul Kapinos posted on Oct 29, 2013
The default intel/13.1 compiler was updated from 13.1.1.163 (= "Composer XE2013.3.163") to 13.1.3.192 (= "Composer XE2013.5.192"). Version 13.1.1.163 was moved to DEPRECATED. This should fix the following bug: http://software.intel.com/en-us/articles/svd-multithreading-bug-in-mkl

Status RWTH Compute Cluster, 2013-10-25
Frank Robel posted on Oct 25, 2013
Announcement of Kepler GPUs
New software stack on GPU cluster
Discontinuance of the Windows GPU machine
On Wednesday, we announced four NVIDIA Kepler GPUs on the rzcluster mailing list. Access will be granted by an e-mail to [email protected].
We updated the software stack on the whole GPU cluster (57 Fermi GPUs, 4 Kepler GPUs):
New graphics driver: 319.49
New CUDA toolkit: 5.5
New kernel: 2.6.32-358.23.2.el6.…

Status RWTH Compute Cluster, 2013-10-18
Tim Cramer posted on Oct 18, 2013
1. ScaleMP downtime
2. X-Win32 / NX Client
1. The ScaleMP machine will be down as of Tuesday, October 22nd, due to a reconfiguration. Please note that the downtime might last longer than usual.
2. As already announced, StarNet's X-Win32 software will be the new standard remote desktop software for accessing the RWTH Compute Cluster. It will replace the NX Client for connecting to the frontends.…

The $OMP_THREAD_LIMIT environment variable will no longer be set
Paul Kapinos posted on Oct 10, 2013
On 06.08. we changed the environment to set the $OMP_THREAD_LIMIT environment variable, see "The $OMP_THREAD_LIMIT environment variable". Due to several issues with the Intel compiler's OpenMP runtime, we have now decided to revert this change and no longer set the $OMP_THREAD_LIMIT environment variable from tomorrow on. You have to log in again once for this change to take effect.

Status RWTH Compute Cluster, 2013-09-26
Frank Robel posted on Sep 26, 2013
Maintenance on Sep. 24th successfully completed. We apologize for delays: some LSF jobs were aborted and restarted after the maintenance. For further information: https://maintenance.rz.rwth-aachen.de/list.php?id=14
Oracle Java 1.6 is switched off; please use Oracle Java 1.7.

Status RWTH Compute Cluster, 2013-09-19
Tim Cramer posted on Sep 19, 2013
Maintenance on Sep. 24th, 11:00-18:00
As already announced on the rzcluster mailing list, we scheduled a maintenance downtime for the whole RWTH Compute Cluster on Sep. 24th, 11:00-18:00. Both parts, Linux and Windows, are affected and unavailable for the denoted period.
Planned works:
- Software upgrade of NetApp filers
- Firmware upgrade of the HPCWORK storage systems
- Installation of service pack Feb 2013 for LSF 8.0.1
- Removal of Oracle JDK 1.6

Intel MPI - startup issue on big jobs in LSF fixed
Marcus Wagner posted on Sep 12, 2013
The issue with starting big jobs (more than some 32 nodes) using Intel MPI in the LSF batch system is fixed now. Please use, as usual, this line to start your jobs: $MPIEXEC $FLAGS_MPI_BATCH a.out As the fix did not work for older versions of Intel MPI, the version intelmpi/4.0 was moved to the DEPRECATED area.

Status RWTH Compute Cluster, 2013-09-06
Tim Cramer posted on Sep 06, 2013
Java 1.6 is not supported anymore
X-Win32 supersedes the NoMachine NX client
Defective power supply in MPI complex
Due to security bugs which will not be fixed by Oracle, Java 1.6 is not supported anymore.
Due to several issues with the NoMachine NX client, support for this software package will be stopped by the end of 2013. The replacement for live sessions will be X-Win32. Please refer to Interactive Mode for more information.…

New release of the Primer, 8.2.6
Paul Kapinos posted on Aug 15, 2013
A new edition of the Primer, version 8.2.6 of August 15, 2013, is now available at http://www.rz.rwth-aachen.de/hpc/primer. Most important changes:
• As some older nodes reached their end of life (EOL), the chapters - 2.4 The older Xeon based Machines - 2.5 IBM eServer LS42 - have been removed
• As the idb debugger is deprecated by Intel, chapter - 7.3.3 Intel idb (Lin) - has been removed
• As the Intel Thread Checker and Profiler tools are superseded by the Intel Inspector and VTune tools,…

The interactive $MPIEXEC wrapper updated
Paul Kapinos posted on Aug 13, 2013
The interactive $MPIEXEC wrapper ('mpiexec', 'mpirun' when you are logged in to the HPC cluster interactively) has been updated to the new version 2.6.
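The recommended start line from the 'Intel MPI - startup issue' post above fits into a batch script as in the following sketch; the job size options are illustrative placeholders, and only the `$MPIEXEC $FLAGS_MPI_BATCH a.out` line comes from the announcement:

```shell
#!/usr/bin/env bash
#BSUB -J big_mpi_job   # job name (placeholder)
#BSUB -n 384           # number of MPI processes (placeholder)

# $MPIEXEC and $FLAGS_MPI_BATCH are provided by the cluster
# environment; this is the usual way to launch MPI jobs under LSF:
$MPIEXEC $FLAGS_MPI_BATCH a.out
```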
(See chapter 6.2.1 of the Primer: http://www.rz.rwth-aachen.de/hpc/primer ) Please report any malfunctions.

Status RWTH Compute Cluster, 2013-08-09
Tim Cramer posted on Aug 09, 2013
1. New PuTTY version
2. OMP_THREAD_LIMIT errors
The SSH client PuTTY 0.63 fixes several security bugs. If you use an older version to log in to the cluster, you should upgrade the software. Refer here for more information.
As announced last week, we set the maximum OpenMP thread limit (environment variable OMP_THREAD_LIMIT) to 2x the number of logical cores. Unfortunately, this causes errors for some applications (e.g., Gaussian).…

TotalView - new version 8.12 installed and set to default
Paul Kapinos posted on Aug 08, 2013
A new version 8.12 of the TotalView debugger (http://www.roguewave.com/products/totalview.aspx) will be available from 06.08.2013 on. This version will be the new default version of TotalView.
The version 8.12b moved to DEPRECATED (=> 8T.12.0-1)
The version 8.9.2 moved to DEPRECATED (=> 8.9.2-2 )

The $OMP_THREAD_LIMIT environment variable
Paul Kapinos posted on Aug 06, 2013
From 06.08.2013 on, the environment variable $OMP_THREAD_LIMIT will be set to 2x the number of logical cores (the number of 'CPUs' the operating system believes to have available) in the HPC cluster standard environment. This will limit the number of threads of an OpenMP program and avoid extreme overloading of the nodes.

Intel Threading Tools (ITT) DEPRECATED
Paul Kapinos posted on Aug 06, 2013
The old Intel Threading Tools (ITT) module 'intelitt' was moved to the DEPRECATED area.
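The 2x rule from "The $OMP_THREAD_LIMIT environment variable" post above can be reproduced by hand; a minimal sketch, assuming `nproc` is available to count the logical CPUs:

```shell
# Set OMP_THREAD_LIMIT to twice the number of logical cores,
# mirroring what the standard environment did from 06.08.2013 on.
# nproc reports the number of CPUs the operating system sees.
export OMP_THREAD_LIMIT=$((2 * $(nproc)))
echo "OMP_THREAD_LIMIT=$OMP_THREAD_LIMIT"
```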
The "Thread Checker" functionality (and more) is now contained in the Intel Inspector product (module: 'intelixe'); the "Thread Profiler" functionality (and more) is now contained in the Intel VTune/Amplifier product (module: 'intelvtune').

Intel MPI stability problems after the OFED update fixed
Paul Kapinos posted on Aug 01, 2013
The stability problems with Intel MPI reported in https://wiki2.rz.rwth-aachen.de/display/bedoku/2013/06/27/Intel+MPI+problems+in+the+cluster should be resolved now. The workaround described in the linked page (disabling DAPL) is no longer needed.
Side notes: 1. Using older versions of Intel MPI than the current default 4.1, these warnings: linuxscc004.rz.RWTH-Aachen.DE:7ea7:ad202700: 2189 us(2189 us): open_hca: device mlx4_0 not found linuxscc004.rz.RWTH-Aachen.…

Status RWTH Compute Cluster, 2013-07-22
Tim Cramer posted on Jul 22, 2013
1. BCS firmware update
2. Lustre (HPCWORK) malfunction
3. Request to set your password
4. KNEM kernel module
Due to a firmware update on all BCS systems, the waiting times for these systems might be longer than usual. After a malfunction (refer to http://maintenance.rz.rwth-aachen.de/list.php?id=14) the Lustre (HPCWORK) file system is fully operative again.…

Python installation in the HPC Cluster
Paul Kapinos posted on Jul 17, 2013
The Python installation in the HPC Cluster was rebuilt during 15-17.07.2013:
1. Older versions 2.5.6, 2.6.8, 2.7.3, 3.3.0 moved to DEPRECATED
2. New versions 2.7.5 and 3.3.2 installed.
The version 2.7.5 is now the default Python version in modules. (Note that the Linux-default Python is 2.6.6.) The new versions support the NumPy, SciPy, and Matplotlib modules. These versions are configured with the '--enable-unicode=ucs4 --enable-ipv6' flags, like the Linux-default Python.…

Status RWTH Compute Cluster, 2013-07-16
Tim Cramer posted on Jul 16, 2013
1. ulimits for batch jobs
2. Intel Xeon Phi cluster
3.
New frontend: cluster-copy2
In future we will set the ulimits in batch jobs to the cluster-wide default of a default account (i.e., the limits you see with ulimit -a for a default, unmodified account). If you have special requirements concerning the stack size or the core file size, you can change the limits by setting the corresponding LSF option in MB (e.g., #BSUB -C 400 or #BSUB -S 1000).…

Intel Trace Analyzer and Collector (ITAC) - new version installed
Paul Kapinos posted on Jul 15, 2013
New version 8.1(.1.027) of the Intel Trace Analyzer and Collector (ITAC) was installed, available as 'intelitac'. Older versions 7.2 and 8.0 have been moved to DEPRECATED.

Intel Advisor XE - new version installed
Paul Kapinos posted on Jul 15, 2013
New version "2013 XE update 3" of Intel Advisor was installed and is now the default version of the 'intelaxe' module. The older version 'update 1' was moved to DEPRECATED.

Intel Inspector XE - new version installed
Paul Kapinos posted on Jul 15, 2013
The 'intelixe' software (Intel Inspector 2013 XE) is now installed in the new version 'Update 6'. This version is also set as default. Older versions are moved to the DEPRECATED area.

GCC compiler - new version installed
Paul Kapinos posted on Jul 11, 2013
GCC compilers version 4.8.1 are now installed in the HPC cluster and available as 'gcc/4.8': $ module switch intel gcc/4.8 The previous 4.8 version, 4.8.0, has now moved to DEPRECATED.
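The LSF limit options named in the 2013-07-16 status post above can be sketched in a job script; the values 400 and 1000 are the examples from the post itself, the rest is an illustrative placeholder:

```shell
#!/usr/bin/env bash
#BSUB -J limits_demo   # job name (placeholder)
#BSUB -C 400           # core file size limit, in MB (example from the post)
#BSUB -S 1000          # stack size limit, in MB (example from the post)

# Show the limits actually in effect inside the job:
ulimit -a
```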
Intel MPI problems in the cluster
Paul Kapinos posted on Jun 27, 2013
Currently, running Intel MPI jobs in the cluster can be disturbed; you can get errors like: linuxbmc1226.rz.RWTH-Aachen.DE:5cae:490e4700: 1053229598 us(1053229598 us!!!): dapl_cma_active: CONN_ERR event=0x7 status=-110 TIMEOUT DST 134.61.205.38, 30720 rank 51 in job 1 linuxbmc1226.rz.RWTH-Aachen.DE_48603 caused collective abort of all ranks exit status of rank 51: return code 1 Especially bigger jobs (more than some 5 nodes / 60 processes) are affected.…

Intel MPI 4.1 update
Paul Kapinos posted on Jun 27, 2013
Intel MPI 4.1 update: the default intelmpi/4.1 was updated from 4.1.0.030 to 4.1.1.036 (version 4.1.0.030 goes to DEPRECATED software).

Status RWTH Compute Cluster, 2013-06-13
Tim Cramer posted on Jun 13, 2013
Maintenance 11th June, 9:00-16:00
During the maintenance the new OFED stack was installed successfully. Up to now we do not see any serious stability problems. A bug in the libpsm_infinipath library was fixed after the maintenance. As expected, the Intel Xeon Phi coprocessors are not working with the new OFED stack, so the bandwidth between host and coprocessor is very slow. We are trying to find a solution for that issue.…

Status RWTH Compute Cluster, 2013-06-07
Tim Cramer posted on Jun 07, 2013
1. Power outage
2. Replacement of power supply in JARA partition
3. Maintenance 11th, 9:00-16:00
4. Staff outing 14th
Due to a defective electric cable (excavator), we had a power outage on Saturday, 1st. About 800 systems were rebooted. Please resubmit affected jobs. We had a replacement of a power supply in the JARA partition, so some jobs had a slightly longer waiting time.
As already announced, we will have a big maintenance on the 11th, 9:00-16:00.…

memusage
Paul Kapinos posted on Jun 05, 2013
The 'memusage' script (see Primer, chapter 5.11 on page 66, http://www.rz.rwth-aachen.de/hpc/primer) was updated to version 1.3.

Status RWTH Compute Cluster, 2013-05-27
Tim Cramer posted on May 27, 2013
1. ScaleMP
2. cluster-linux-counter
3. New OFED stack
The ScaleMP system is online again. The special-purpose frontend cluster-linux-counter.rz.rwth-aachen.de is not needed anymore and was turned off. Due to a security bug in the Linux kernel we are forced to switch the OFED stack. The current OpenFabrics stack is not compatible with the latest kernel and the Lustre (HPCWORK) system.…

Status RWTH Compute Cluster, 2013-05-17
Tim Cramer posted on May 17, 2013
Due to license issues the big ScaleMP node is offline at the moment. We are working on a solution.

New release of the Primer, 8.2.5
Paul Kapinos posted on May 08, 2013
A new edition of the Primer, version 8.2.5 of May 08, 2013, is now available at http://www.rz.rwth-aachen.de/hpc/primer. Most important changes:
• The version of the default compiler changed to intel/13.1 (instead of the 12.1 versions).
• Some other modules were updated in their versions; old versions were deprecated.
• The runtime limit in the JARA-HPC partition has been changed from 24h to 72h, cf. chapter 4.6 on page 48
• The Array Job example script has been fixed, cf. listing 3 on page 40.…

Boost (Lin) installed
Paul Kapinos posted on May 07, 2013
The Boost library is now installed for Intel and Open MPI, and for the Intel and GCC compilers. Boost provides free peer-reviewed portable C++ source libraries that work well with the C++ Standard Library. Boost libraries are intended to be widely useful and usable across a broad spectrum of applications. More information can be found at http://www.boost.org/.
To initialize the environment, use $ module load LIBRARIES; module load boost. This will set the environment variables boost_root,…

HDF5 (Lin) installed
Paul Kapinos posted on Apr 29, 2013
The HDF5 library is now installed for Intel and Open MPI, and for the Intel and GCC compilers. HDF5 is a data model, library, and file format for storing and managing data. It supports an unlimited variety of datatypes, and is designed for flexible and efficient I/O and for high-volume and complex data. More information can be found at http://www.hdfgroup.org/HDF5/. To initialize the environment, use $ module load LIBRARIES; module load hdf5. This will set the environment variables hdf5_root,…

Status RWTH Compute Cluster, 2013-04-26
Tim Cramer posted on Apr 26, 2013
Maintenance
On May 6th, 13:00-17:00, we have scheduled a maintenance downtime for the RWTH Compute Cluster. Both Linux and Windows dialog and batch systems are affected and will be unavailable in that time. The following works are planned:
Upgrade of all NetApp filers (HOME/WORK) to a new ONTAP version
Upgrade of all Lustre (HPCWORK) clients from version 1.8 to version 2.…

OpenFOAM news
Paul Kapinos posted on Apr 23, 2013
The installation of OpenFOAM (http://openfoam.com/) was redone on 22.04.2013:
New version 2.2.0 installed and made the default (built with the intel/13.1 and gcc/4.8 compilers, Intel and Open MPIs).
Versions 2.0.0, 2.0.1, 2.1.1 rebuilt with the current compilers (intel/13.1, GCC/system_default, Intel and Open MPIs).
ParaView is now included in the OpenFOAM installation; loading ParaView from modules is no longer necessary.
Those until 22.04.…

Intel MPI default change
Paul Kapinos posted on Apr 22, 2013
Intel MPI version update: 4.1(.0.024) is moved to DEPRECATED; 4.1(.0.030) becomes the default 4.1 (instead of 4.1.0.024) and the Intel MPI default. Version 4.0 (the former default) remains for a while and will be moved to DEPRECATED in a few weeks.

Intel compiler minor revision update
Paul Kapinos posted on Apr 22, 2013
The minor revision of intel/13.1 (default) has changed from ".0.146" to ".1.163":
13.1(.0.146) => DEPRECATED
13.1.1.163 => 13.1 (default)

Status RWTH Compute Cluster, 2013-04-19
Tim Cramer posted on Apr 19, 2013
The default behaviour of screen was modified (/etc/screenrc), so that the LD_LIBRARY_PATH will be retained.

New version of ParaView - 3.98.1
Paul Kapinos posted on Apr 18, 2013
The new version 3.98.1 of ParaView (http://www.paraview.org/) was installed and immediately made the default ParaView: module load GRAPHICS module load paraview Versions 3.6.1 and 3.10.1 were moved to DEPRECATED.

Status RWTH Compute Cluster, 2013-04-12
Tim Cramer posted on Apr 12, 2013
There will be a maintenance on 6.5.2013 where we have to shut down the complete Linux cluster, because a defective InfiniBand switch and the Lustre client will be switched. Details will follow.
TotalView Debugger - new (beta) release 8.12
Paul Kapinos posted on Apr 11, 2013
A new (beta) version of the TotalView debugger is now available: totalview/8.12b

GCC compilers - new release 4.8.0 available
Paul Kapinos posted on Apr 11, 2013
The new release (4.8.0) of the GCC compilers will be available from 12.04.2013 on: module load gcc/4.8

New Intel software
Paul Kapinos posted on Apr 02, 2013
New Intel software has been installed in the RZ cluster (available from 03.04.2013 on):
Intel® Cluster Studio XE for Linux Version 2013 (Update 3 Eng/Jpn) $ module switch intel intel/13.1.1.163
Intel® Inspector XE for Linux Version 2013 (Update 5) $ module load intelixe

Python installations
Paul Kapinos posted on Mar 28, 2013
Major update of the Python installations in the RZ cluster: versions 2.5.6, 2.6.8, 2.7.3, 3.3.0 installed. On all but the 3.x versions these Python modules are now available: NumPy, SciPy, matplotlib. Unset PYTHONPATH to disable these modules (e.g. for installing/using your own versions). The GCC compiler was used for building; if in doubt, try 'module switch intel gcc'. Version 2.7.3 was set as the default python module (instead of 2.7.1). Versions 2.4.6, 2.5.5, 2.6.6, 2.7.1, 3.1.1, 3.2.…

Status RWTH Compute Cluster, 2013-03-28
Tim Cramer posted on Mar 28, 2013
1. Maintenance
2. Easter
During the maintenance this week we successfully upgraded the Lustre servers without losing any data. We wish all our users a happy Easter vacation.

New default modules
Paul Kapinos posted on Mar 26, 2013
During the maintenance on 26.03.2013, the following changes to the default modules were made:
1. The new default MPI is Open MPI 1.6.4 (previously 1.6.1)
2. The new default compiler is Intel 13.1(.0.146), instead of the previous 12.1(.6.361).
3. The intel/12.1 compiler is now 12.1.7.367 instead of 12.1.6.361 (the latter version has been moved to DEPRECATED)
4.
The modules intel/13.0 and openmpi/1.5.…

Status RWTH Compute Cluster, 2013-03-21
Frank Robel posted on Mar 21, 2013
1. Maintenance, March 26th 9:00-12:00
2. Malfunction of the Lustre filesystem
3. Malfunction of one IH chassis
As already announced, we scheduled a maintenance for March 26th 9:00-12:00: upgrade of the Lustre filesystem (HPCWORK). Please note the following: the downtime of HPCWORK will exceed the announced maintenance period. Therefore, it is important to mark all jobs using HPCWORK with #BSUB -R "select[hpcwork]".…

VASP news
Paul Kapinos posted on Mar 21, 2013
The following changes were made to the installation of VASP (Vienna Ab initio Simulation Package, https://www.vasp.at/):
Versions 4.5, 4.6hack, 5.2.2, 5.2.12 (without the fix of 11.11.2011) were moved to DEPRECATED. Should you absolutely need one of these versions, please inform us; otherwise they will be irrevocably removed at the end of 2013.
Version 5.2.12f (5.2.12 with the fix of 11.11.…

Status RWTH Compute Cluster, 2013-03-08
Tim Cramer posted on Mar 08, 2013
1. PPCES 2013
2. LSF and HPCWORK
For our PPCES 2013, several nodes were reserved (GPU cluster, 10 MPI nodes), so they cannot be used in the normal queue during the next week. Find all details about PPCES here.
We improved the submission routine, so you might get a warning if you try to use HPCWORK without setting #BSUB -R "select[hpcwork]". Please always set this option if you need HPCWORK.

Status RWTH Compute Cluster, 2013-03-01
Tim Cramer posted on Mar 01, 2013
Maintenance, March 26th 9:00-12:00
As already announced, we scheduled a maintenance for March 26th 9:00-12:00: upgrade of the Lustre filesystem (HPCWORK). Please note the following: the downtime of HPCWORK will exceed the announced maintenance period.
Therefore, it is important to mark all jobs using HPCWORK with #BSUB -R "select[hpcwork]". Jobs not marked accordingly will fail when trying to access paths in $HPCWORK.…

Status RWTH Compute Cluster, 2013-02-22
Tim Cramer posted on Feb 22, 2013
1. cluster-linux-tuning available again
2. Java update
3. Firmware updates
After the replacement of a defective motherboard, cluster-linux-tuning is available again. Java was (again) updated with the latest security patches. We have to install several firmware updates on all compute nodes. This will be done gradually, so parts of the compute nodes will not be available during this time.

NAG Toolbox for MATLAB available
Paul Kapinos posted on Feb 22, 2013
Dear MATLAB users, the NAG Toolbox for MATLAB is now available for the MATLAB versions 2008a, 2008b, 2009a, 2009b, 2010a. (For the versions 2010b, 2011a, 2011b, 2012a, 2012b the NAG Toolbox is currently not usable due to license problems. Please let us know if you urgently need the NAG Toolbox with one of these versions.) The Toolbox is integrated directly into MATLAB, so no additional modules need to be loaded.…

Status RWTH Compute Cluster, 2013-02-19
Tim Cramer posted on Feb 19, 2013
1. Update of Intel Compiler and Intel MPI
2. Memory limits on graphical frontends
We installed the Intel Compiler 13.1 and Intel MPI 4.1.0.030. Both are available in the module system. We decreased the memory limit (cgroups) per user to 16 GB of really used memory on the frontends cluster-x and cluster-x2. There is no limit for the virtual memory.
NAG's numerical libraries - new versions installed Paul Kapinos posted on Feb 18, 2013 The following new versions of the numerical libraries from NAG (Numerical Algorithms Group, www.nag.com/) are now available: The NAG C library Mk23: http://www.nag.com/numeric/CL/CLdescription.asp The NAG SMP library Mk23: http://www.nag.com/numeric/FL/FSdescription.asp Note that the NAG C libraries are now available in two versions: LP64 (the 'usual' 64-bit library with 32-bit integers) and ILP64 (the '..._ilp64' version with 64-bit integers). Using the wrong version will crash your application.… NAG's "nagfor" compiler version 5.3.1 Paul Kapinos posted on Feb 18, 2013 The Fortran compiler "nagfor" from the Numerical Algorithms Group (NAG, www.nag.com/) is now installed in version 5.3.1. The older version 5.2 is still available. Version 5.3.1: $ module load nagfor or $ module load nagfor/5.3.1 Version 5.2: $ module load nagfor/5.2 AMD's ACML installation updates Paul Kapinos posted on Feb 13, 2013 New ACML versions installed: 4.4.0 (set as the default ACML version instead of 4.3.0) 4.4.0_mt 5.3.0 5.3.0_mt The 5.3.0 version is known to work with GCC compilers version 4.6 or newer, so it does not work with the default GCC compiler gcc/system-default. These versions of ACML are moved to the DEPRECATED category: 4.0.1, 4.0.1_mp, 5.0.0, 5.0.0_mp Status RWTH Compute Cluster, 2013-02-11 Tim Cramer posted on Feb 11, 2013 Status RWTH Compute Cluster, 2013-02-11 1. Update likwid 2. Maintenance 26.03.2013 3. cluster-linux-tuning likwid was updated to version 3.0. Use $ module load likwid to use the performance tool. Please note that the tool does not work on the BCS systems. We scheduled a big maintenance for 26.03.2013. Morning: whole cluster including the frontends. At least (!) one day: HPCWORK. Due to a bug in the Lustre file system HPCWORK will not be available for at least the complete day.… Status RWTH Compute Cluster, 2013-01-25 Tim Cramer posted on Jan 25, 2013 Status RWTH Compute Cluster, 2013-01-25 1.
New frontend: cluster-copy This week we just have a little reminder that we established a new cluster frontend: cluster-copy.rz.RWTH-Aachen.DE (short: cluster-copy). This node is intended for data transfer operations. Please use this frontend for copying data from/to/within the cluster, for packing with TAR, for compressing with GZIP or similar operations. Status RWTH Compute Cluster, 2013-01-17 Frank Robel posted on Jan 17, 2013 1. Correct use of $HPCWORK in batch mode. Correct use of $HPCWORK in batch mode: If your batch job uses the HPCWORK file system you should set this parameter: #BSUB -R "select[hpcwork]" This will ensure that the job will run on machines with an up-and-running Lustre file system. On some machines (mainly the hardware from the pre-Bull installation and some machines from Integrative Hosting) HPCWORK is connected via Ethernet instead of InfiniBand,… Status RWTH Compute Cluster, 2013-01-11 Paul Kapinos posted on Jan 11, 2013 1. We're glad to announce a new RZ-Cluster frontend: cluster-copy.rz.RWTH-Aachen.DE (short: cluster-copy). This node is dedicated to big data transfer operations (movement of data from/to the RZ-Cluster, packing with TAR, compressing with GZIP and so on). 2. The number of jobs waiting for execution in the batch queue is extraordinarily high at the moment. Thus the actual waiting time can be pretty high. 3. A new Linux kernel was installed, thus all RZ-Cluster nodes are rebooted.… Happy New Year 2013 Tim Cramer posted on Jan 04, 2013 Dear Cluster user, We hope you all had a Merry Christmas and wish you and your families a Happy New Year.
The Cluster Team Removal of old deprecated modules Paul Kapinos posted on Dec 14, 2012 The following modules in DEPRECATED are now removed: Deleted: openmpi/1.4.3-O2 openmpi/1.4.4nn openmpi/1.4.4rc3 openmpi/1.5.4 openmpi/1.5.4hjs openmpi/1.5.4nn openmpi/1.5.5 openmpi/1.5.5mt openmpi/1.6 openmpi/1.6mt Deleted: python/2.4.2 python/2.5.2 python/2.6.4 TotalView Debugger update Paul Kapinos posted on Dec 11, 2012 TotalView Debugger update: version 8.11 (8.11.0-0) installed and made the default; versions 8.9.1 and 8.10 moved to DEPRECATED. Status RWTH Compute Cluster, 2012-12-10 Frank Robel posted on Dec 10, 2012 Status RWTH Compute Cluster, 2012-12-10 1. Memory limit with cgroups 2. Maintenance HPCWORK (Lustre) 3. Malfunction LSF License server Now cgroups memory limits are used on all frontends and MPI backends. One single user can use 75% of the available memory as a hard limit. In mid-January we plan maintenance on HPCWORK (Lustre). If your batch job relies on HPCWORK make sure that you have included the line #BSUB -R "select[hpcwork]" in your batch script.… Status RWTH Compute Cluster, 2012-11-23 Tim Cramer posted on Nov 23, 2012 Status RWTH Compute Cluster, 2012-11-23 1. Balancing JARA Partition / normal Queue 2. Java Heap-Space The number of available mpi-s nodes in the JARA partition was increased from 450 nodes to 648. As a consequence the number of available nodes for the normal queue was reduced from 648 to 450. Please find all information about the JARA partition here. The default heap space for Java applications was increased from 512 MB to 2048 MB by setting the environment variable JAVA_TOOL_OPTIONS=-Xmx2048m. Status RWTH Compute Cluster, 2012-11-16 Tim Cramer posted on Nov 16, 2012 Status RWTH Compute Cluster, 2012-11-16 1. Memory limit with cgroups 2. VASP jobs We changed the behavior of the cgroup memory limits on cluster-linux and cluster-linux-xeon to evaluate the usability.
One single user can use 80% of the available memory as a hard limit and 25% as a soft limit. This means that one can execute an application which uses more than 25% of the total memory, but if memory becomes tight on the node this process might get killed by the operating system.… Intel Software updates Paul Kapinos posted on Nov 14, 2012 The following minor updates of the Intel software installation are done: 1. Intel Compiler 13.0.1.117 installed and now available as intel/13.0 (the previous 13.0 version, 13.0.0.079, goes to DEPRECATED) 2. Intel Compiler 12.1.7.367 installed and now available as intel/12.1.7.367 Note: an update of 12.1 (the default Intel compiler) is planned for the next cluster maintenance. 3.… Status RWTH Compute Cluster, 2012-11-09 Tim Cramer posted on Nov 09, 2012 Status RWTH Compute Cluster, 2012-11-09 1. Maintenance 14.11.2012, 8:00 - 9:00 2. Binding to BCS and ScaleMP boards As already announced there will be a maintenance on 14.11.2012, 8:00 - 9:00. Due to a security kernel update all Linux batch nodes will be rebooted in this maintenance. The frontends are not affected. We switched the mechanism of the board binding on BCS and ScaleMP to a cgroups-based approach.… Status RWTH Compute Cluster, 2012-11-02 Tim Cramer posted on Nov 02, 2012 Status RWTH Compute Cluster, 2012-11-02 1. New Primer Release 2. Malfunction Cooling System 3. Malfunction Bull Infiniband 4. Maintenance HPCWORK (Lustre) 5. Deactivation of XRC We published a new (minor) primer release: http://www.rz.rwth-aachen.de/hpc/primer In the night from Thursday, November 1st to Friday morning we had a malfunction of the cooling systems. As a result of this malfunction around 100 nodes shut down automatically. Please resubmit affected jobs. From Friday,… Status RWTH Compute Cluster, 26.10.2012 Frank Robel posted on Oct 26, 2012 Status RWTH Compute Cluster, 26.10.2012 1. Memory limitation on the frontend cluster-linux-xeon 2.
Maintenance of cluster On cluster-linux-xeon the amount of memory each user can access has been limited to 8 GB using cgroups for testing purposes. The cluster will likely undergo maintenance on Nov. 13th 2012. Further information is going to be announced soon. Status RWTH Compute Cluster, 2012-10-22 Tim Cramer posted on Oct 22, 2012 Status RWTH Compute Cluster, 2012-10-22 1. New Java versions 2. Cgroups 3. NX Client 4. Kernel Update 5. LSF configuration JARA queue 6. GPU cluster 7. MPI jobs on barcelona The Java versions were updated: jdk1.6.0_35 updated to jdk1.6.0_37, jdk1.7.0_07 updated to jdk1.7.0_09. The java module will be deleted. As already announced we established a fair share for the CPU time on all frontend nodes by applying cgroups. Furthermore,… VASP - switch to openmpi-1.6.1 versions, possible failures on Monday and Tuesday Paul Kapinos posted on Oct 16, 2012 Dear VASP users, all MPI versions of all VASP versions available via modules are now those compiled with Open MPI 1.6.1. Due to the activation of the XRC feature on Monday, your batch jobs may have failed between 10:00 on 2012.10.15 and 18:30 on 2012.10.16 with this error message: > WARNING: The Open MPI build was compiled without XRC support, but XRC ("X") queues were specified in the btl_openib_receive_queues MCA parameter. Please resubmit these jobs. OpenMPI 1.6.1 XRC activated Tim Cramer posted on Oct 16, 2012 OpenMPI 1.6.1: XRC activated The XRC feature was activated for OpenMPI 1.6.1 on Monday. With this update old binaries cannot be started in the batch system although they might work interactively. Please rebuild your applications (this can be done on the frontends). Status RWTH Compute Cluster, 12.10.2012 Tim Cramer posted on Oct 12, 2012 Status RWTH Compute Cluster, 12.10.2012 1. Upgrade SL 6.3 2. MPI Fallback deactivated 3. BCS / ScaleMP Binding We will upgrade all nodes in the cluster to Scientific Linux 6.3 during the next days – an impact on the users is not expected.
We will deactivate the MPI fallback for all InfiniBand-connected nodes. This prevents MPI jobs from starting with very slow connection alternatives like IPoIB by mistake. The disadvantage is that the job will die after an IB malfunction.… Status RWTH Compute Cluster, 2012-10-04 Frank Robel posted on Oct 04, 2012 Status RWTH Compute Cluster, 2012-10-04 The CPU ulimits on the frontends are switched off for testing as a consequence of the introduction of cgroups. Status RWTH Compute Cluster, 2012-09-27 Frank Robel posted on Oct 04, 2012 Status RWTH Compute Cluster, 2012-09-27 Small disruption on Monday 24.09. from 05:25 to 08:40. No jobs were started. Neue Intel Software Paul Kapinos posted on Sep 21, 2012 New Intel software: Intel Compiler 12.1.6.361 (module load intel/12.1, by default — version 12.1.5.339 DEPRECATED) Intel MKL 10.3.9.293 (module load LIBRARIES intelmkl/12.3 — version 10.3.7.256 DEPRECATED) neuer GCC Compiler - 4.7.2 Paul Kapinos posted on Sep 21, 2012 Dear RZ Cluster users, a new version of the GCC compiler is installed: 4.7.2, available as gcc/4.7. The version 4.7.0, which was available under this name until now, is moved to DEPRECATED. Status RWTH Compute Cluster, 2012-09-14 Tim Cramer posted on Sep 14, 2012 Status RWTH Compute Cluster, 2012-09-14 1. Malfunction of the Job Scheduler 2. Maintenance of the Cooling System 3. Power Outage, Monday 10.09.2012 4. Defective Power Supply, Wednesday 12.09.2012 5. Cgroups The vendor of the LSF job scheduler provided a workaround for the slot reservation issue. Since slot reservation works with this modification at the moment, it is possible to get big MPI jobs scheduled again.
Platform Computing (IBM) is still working on a final solution.… Status RWTH Compute Cluster, 2012-09-07 Tim Cramer posted on Sep 07, 2012 Status RWTH Compute Cluster, 2012-09-07 1. Malfunction of the Job Scheduler 2. JAVA Update 3. Cgroups There is still a problem with the LSF job scheduler which makes it very difficult to get big MPI jobs scheduled. The vendor Platform LSF (IBM) was able to reproduce the problem and is working on this issue with the highest priority. In order to fix a security issue with JAVA, the version in the cluster was updated.… New Portland Group Compilers Marcus Wagner posted on Sep 06, 2012 have been installed. While 12.8 is the new version, 12.3 still remains the default. OpenFOAM - Installationen überarbeitet Paul Kapinos posted on Sep 03, 2012 The OpenFOAM installations currently (as of 18.04.2012) cannot be used with the current Intel compiler; the Intel 12.1 compiler must be used. The OpenFOAM installations will be renewed as soon as possible.
The installations of OpenFOAM (http://www.openfoam.com/) have been reworked today (effective 04.09.2012): the versions 2.0.0, 2.0.1 and 2.1.1 (the current default version) were rebuilt with the Intel 12.1 and GCC compilers and Open MPI 1.6.… FDS (Fire Dynamic Simulator) - erneut neue Installationen Paul Kapinos posted on Sep 03, 2012 The previously announced work on the FDS installation was extended/adjusted today (2012.09.03) as follows: version 6.0a12450 newly installed (available as 6.0a); version 6.0a11934 (formerly available as 6.0a) moved to DEPRECATED; version 5.2 (5.2.5) moved to DEPRECATED; versions 5.5 (5.5.3) and 5.3 (5.3.1) rebuilt with the Intel compiler (12.1.5.339), Intel MPI (4.0.3.008) and Open MPI (1.6.1).… VASP new version 5.2.12 with accumulated fixes from 11.11.2012 Paul Kapinos posted on Aug 31, 2012 A new version of the VASP software is available: $ module load CHEMISTRY $ module load vasp/5.2.12f This version contains bugfixes for VASP 5.2.12: http://www.vasp.at/index.php?option=com_content&view=article&id=98:bugfix-accumulated-fixes-for-vasp5212&catid=40:bugfixes&Itemid=63 This version is built with the Intel compilers 12.1.5.339 and Open MPI 1.6.1. The files mpi.o, main.o, xcgrad.o, xcspin.o were compiled with a reduced optimization level (-O1 instead of -O3).… NWChem Paul Kapinos posted on Aug 31, 2012 A new release (6.1.1) of the NWChem software (see http://www.nwchem-sw.org/index.php/Main_Page) is installed. This version uses Open MPI 1.6.1 and is compiled with the Intel compiler 12.1.5.339. The older versions 6.0 and 6.1 of the NWChem software are moved to the DEPRECATED category. Status RWTH Compute Cluster, 2012-08-31 Tim Cramer posted on Aug 31, 2012 Status RWTH Compute Cluster, 2012-08-31 1. Malfunction of the Job Scheduler 2. Maintenance of the Cooling System 3. New MPI: Open MPI 1.6.1 Due to a malfunction of the LSF job scheduler from last Monday 12:00 to Tuesday 00:45 no jobs started on the Linux cluster.
To fix this problem the scheduler does not make any reservations at the moment. As a consequence there might be longer waiting times for parallel jobs.… PETSc - neue Installation Paul Kapinos posted on Aug 28, 2012 The following changes to the installation of PETSc (Portable, Extensible Toolkit for Scientific Computation, http://www.mcs.anl.gov/petsc/) were carried out: version 3.3 (p2) newly installed, available as petsc/3.3. Note: this version was built with OpenMPI 1.6.1, so this MPI must be loaded before loading the PETSc module. The default version is set to 3.3 (version 3.2 remains available); version 3.0.0 is moved to DEPRECATED. The following changes on PETSc installation: v3.… FDS (Fire Dynamic Simulator) - neue Installationen Paul Kapinos posted on Aug 13, 2012 The following changes to the installations of FDS (Fire Dynamic Simulator) were carried out: version 6.0a11934 newly installed (available as 6.0a); version 6.0a10519 (formerly available as 6.0a) moved to DEPRECATED; version 5.5.3 (available as 5.5) built with the current Intel compiler, with Intel MPI, Open MPI (1.6) and serial variants; the default FDS set to 5.5 (instead of 5.… Status RWTH Compute Cluster, 2012-08-10 Tim Cramer posted on Aug 10, 2012 Status RWTH Compute Cluster, 2012-08-10 1. Requeued jobs For maintenance reasons we had to requeue several jobs during the week. Status RWTH Compute Cluster, 2012-08-03 Tim Cramer posted on Aug 03, 2012 Status RWTH Compute Cluster, 2012-08-03 1. Node Cleaner 2. Maintenance 20.08.2012, 9:00 - 15:00 3. Modification of ulimits 4. Security problems on the GPU cluster In order to improve the performance of the cluster we established a new node cleaner which is executed before and after every exclusive batch job. The node cleaner does two things: a flush of the file system cache.
This will solve some serious performance issues concerning the correct ccNUMA placement of your data,… Status RWTH Compute Cluster, 2012-07-27 Tim Cramer posted on Jul 27, 2012 Status RWTH Compute Cluster, 2012-07-27 1. Firmware updates on Bull machines 2. Binding of MPI processes At the moment all nodes in the Bull cluster get new firmware updates for better memory monitoring. These updates may lead to earlier crashes of the nodes in case of defective memory modules, so that the mean time to recover (MTTR) will be lower. If one of your jobs crashed unexpectedly this week, please just try to resubmit it.… Status RWTH Compute Cluster, 2012-07-13 Frank Robel posted on Jul 13, 2012 BCS OpenMPI 1.6 Lustre / $HPCWORK Air conditioning maintenance BCS: Now you can submit 120-hour jobs on BCS. Previously, only 24 hours were possible. OpenMPI 1.6: OpenMPI 1.6 is now available. Since it will soon be set up as the default, you should convert and test your programs. At the moment, you can switch to the new version with the following command: module switch openmpi openmpi/1.6 Programs built with older versions of OpenMPI won't work with OpenMPI 1.6.… Status RWTH Compute Cluster, 2012-06-29 Tim Cramer posted on Jun 29, 2012 Status RWTH Compute Cluster, 2012-06-29 1. Maintenance 2. Power outage 3. Update quota command As announced (refer to http://www1.rz.rwth-aachen.de/kommunikation/betrieb/auto/stoerungsmeldungen/index.php) there will be a maintenance starting on Sunday 1st, 12 am until Monday 2nd, 12 am: Bull benchmarks Operating system update to Scientific Linux 6.2 Lustre (HPCWORK) update to 1.8.8 Update of the Netapp (HOME/WORK) system to Ontap 8.1.p1 The default Intel Compiler will be updated to 12.1.5.… Status RWTH Compute Cluster, 2012-06-15 Frank Robel posted on Jun 15, 2012 Status RWTH Compute Cluster, 2012-06-15 As announced on the rzcluster list, we scheduled two maintenance frames: 29.06.2012 15:00-16:00: Bull will prepare their benchmark session.
All BCS systems will be rebooted. The InfiniBand subnet manager in the Bull fabric will be restarted; for that, the batch mode in the Bull fabric must be stopped. On cluster, cluster-linux and cluster-x, there will be a short network failure. You might use cluster2 or cluster-x2. 02.07.… Status RWTH Compute Cluster, 2012-06-04 Tim Cramer posted on Jun 04, 2012 Status RWTH Compute Cluster, 2012-06-04 All information in /etc/passwd is stored anonymously now. As name (e.g. for the finger command) only the username is displayed; gender-specific data has been removed. Status RWTH Compute Cluster, 28.05.2012 Tim Cramer posted on May 29, 2012 Status RWTH Compute Cluster, 28.05.2012 1. QPI speed on BCS nodes 2. MPI test reactivated 3. Minor Malfunction Due to some stability problems detected by Bull in their BCS chips the speed of the QPI was reduced from 6.4 GT/s to 4.8 GT/s on the corresponding nodes. Please note that there will be additional maintenance windows for these systems (only the BCS) in the next weeks. The MPI test which is executed before every MPI-parallel job is reactivated again.… Open MPI - 'carto' feature activated on BCS nodes Paul Kapinos posted on May 22, 2012 On nodes with more than one active InfiniBand card (=> BCS nodes), the 'carto' feature of Open MPI is active since 22.05.2012. Note these environment variables: OMPI_MCA_carto OMPI_MCA_carto_file_path Status RWTH Compute Cluster, 2012-05-16 Tim Cramer posted on May 16, 2012 Status RWTH Compute Cluster, 2012-05-16 1. Maintenance on last Monday 2. MPI Test 3.… Module changes Paul Kapinos posted on May 14, 2012 On the occasion of the cluster maintenance on 14.05.2012, some changes to the available modules were made. 1) Older software versions are moved to the DEPRECATED category. These versions are still available after loading the DEPRECATED category with 'module load DEPRECATED'. 2) The names of some beta versions are unified: beta versions now have 'b' as the last letter in the version number.
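The DEPRECATED mechanism mentioned in the "Module changes" post above is a two-step module operation. A sketch of a typical session, assuming the cluster's environment-modules setup (the switch target is an illustrative example, not from the post):

```shell
$ module load DEPRECATED                 # make the retired versions visible again
$ module avail                           # deprecated versions now appear in the listing
$ module switch openmpi openmpi/1.5.4    # e.g. fall back to a retired Open MPI version
```

This keeps retired software loadable on request without cluttering the default module listing.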
Status RWTH Compute Cluster, 2012-05-11 Tim Cramer posted on May 11, 2012 Status RWTH Compute Cluster, 2012-05-11 1. Maintenance 2. BCS nodes partly online 3. ScaleMP online 4. Fault on central IB network component 5. Routing algorithm Infiniband subnet manager As announced on the rzcluster list we scheduled a maintenance on 14.05.2012, 15:00-18:00: The complete Linux and Windows cluster (including frontends) will not be available. The following changes will be made: Update of the file systems Configuration of integrative hosted systems 14.05.2012, 18:00 – 15.05.2012,… Status RWTH Compute Cluster, 2012-05-04 Tim Cramer posted on May 04, 2012 Status RWTH Compute Cluster, 2012-05-04 1. Batch System / Compute Resources 2. Fault on central IB network component 3. Defective Compute Nodes One third of the Bull cluster is now reserved exclusively for the JARA HPC partition. In order to make big parallel jobs possible for all other users and to get a fair share of the remaining resources, serial jobs will no longer be scheduled to the Bull cluster, but only to older parts of the cluster.… Status RWTH Compute Cluster, 2012-04-21 Tim Cramer posted on Apr 22, 2012 Status RWTH Compute Cluster, 2012-04-21 1. Maintenance April, 24th 8am to April, 25th 8am 2. New cluster hardware As already announced on the rzcluster list we scheduled the next maintenance to stabilize the InfiniBand network together with Bull from April, 24th 8am to April, 25th 8am. During the maintenance no batch jobs will start on the Bull cluster and it is not possible to run MPI jobs interactively on the backends with the $MPIEXEC wrapper.… Status RWTH Compute Cluster, 2012-04-13 Tim Cramer posted on Apr 13, 2012 Status RWTH Compute Cluster, 2012-04-13 1. Shell profiles / module system 2. $WORK / $HPCWORK clean up 3. Maintenance April, 12th (yesterday) 4. Next maintenance from Monday, 16.4. 12:00 – Tuesday 17.4. 12:00 (24 hours) Many users load modules in their shell profiles (e.g. $HOME/.zshrc).
Please note that this can trigger a lot of issues for your batch jobs, so we really recommend not doing that. Unfortunately, we cannot support module loads in your profiles.… Status RWTH Compute Cluster, 2012-03-30 Tim Cramer posted on Mar 30, 2012 Status RWTH Compute Cluster, 2012-03-30 1. Maintenance 2. $HPCWORK performance 3. Dispatching time of bigger MPI jobs As already announced last week we scheduled maintenance for 03.04.2012, 10:00-14:00. In addition to the changes announced last week, the following modifications will be done: Change of the startup mechanism of Intel MPI. If you use the $MPIEXEC and $FLAGS_MPI_BATCH environment variables your old job scripts will still work. The machine cluster.rz.rwth-aachen.… GCC 4.7.0 installiert Paul Kapinos posted on Mar 26, 2012 $ module load gcc/4.7 Status RWTH Compute Cluster, 2012-03-26 Tim Cramer posted on Mar 26, 2012 Status RWTH Compute Cluster, 2012-03-26 1. Power outage 2. Maintenance 03.04.2012, 10:00-14:00 3. Storage system Due to a short circuit in one of the chassis and a power outage (city center and campus Melaten) caused by a bird (really!) last week, 1000 systems in the Linux cluster were shut down. Due to this failure 90% of all jobs running at this time were killed. We apologize for any inconvenience. All systems are up and running again. We scheduled maintenance for 03.04.2012, 10:00-14:00.… new (beta) version of TotalView Debugger Paul Kapinos posted on Mar 21, 2012 Log: TotalView version maintenance: 8.10.0 Beta installed (actually 8X.10.0-4, available as totalview/8.10beta); the default TotalView remains 8.9.2, now in version 8.9.2-2; the former 8.9.2 (actually 8.9.2-0) moved to DEPRECATED; 8.9 (actually
8.9.0-0) moved to DEPRECATED Status RWTH Compute Cluster, 2012-03-16 Tim Cramer posted on Mar 16, 2012 Status RWTH Compute Cluster, 2012-03-16 1) Reconfiguration of the SMP systems 2) Mailing list rzcluster 3) Configuration of license servers 1) Bull will install the BCS systems until the end of this month, so all nodes known as the SMP complex are still offline. 2) Please note that the rzcluster list is not a public discussion forum; only members of the communication center are allowed to write on this list to inform the users. If you have any questions please contact servicedesk@rz.… TotalView 8.9.2-2 installiert Paul Kapinos posted on Mar 15, 2012 TotalView 8.9.2-2 installed and immediately patched: the file /rwthfs/rz/SW/ETNUS/toolworks/totalview.X.Y.Z/lib/parallel_support.tvd was edited. FLAGS_LPATH Paul Kapinos posted on Mar 12, 2012 In version 12 of the Intel compiler the paths are different from those in the 11.x versions (yes, Intel likes to change the paths!). Thanks to a user report (ticket 20120308-0466) this has now been noticed and fixed. Regards, PK HPCG RZ RWTH AC Status RWTH Compute Cluster, 2012-03-09 Tim Cramer posted on Mar 09, 2012 Status RWTH Compute Cluster, 2012-03-09 1) Reconfiguration of the SMP-Systems 2) JARA HPC 3) Reconfiguration of the automounter 4) Accessing /rwthfs 5) Maintenance March 12th 1) We expect that the second stage of the cluster configuration will be completed by the end of March. In this stage, two or four of the 4-socket systems (known as SMP systems / Nehalem EX) will be connected into 8- or 16-socket systems, respectively, with proprietary BCS chips from Bull.… Interaktiver MPIEXEC Wrapper update - 'mpitest' updated Paul Kapinos posted on Mar 05, 2012 Interactive MPIEXEC wrapper update: 'mpitest' updated U mpiexec.py U test_wrap.sh Updated to revision 3358.
Status RWTH Compute Cluster, 2012-03-02 Tim Cramer posted on Mar 04, 2012 Status RWTH Compute Cluster, 2012-03-02 1) Data in $WORK 2) Unstable $HPCWORK (Lustre Filesystem) 3) Deactivated automounter 4) Frontends cluster, cluster-linux 5) Maintenance March 12th 1) The data in $WORK is no longer automatically cleaned up after 4 weeks. However, please keep in mind that this might change again in the future (after a new announcement, of course) and that there is no backup for $WORK. 2) We expect that the problems with the hanging $HPCWORK are solved.… Status RWTH Compute Cluster Tim Cramer posted on Feb 24, 2012 Status RWTH Compute Cluster, 2012-02-24 1) "Hangs" during interactive work on the Linux frontends 2) Unstable HPCWORK (Lustre filesystem) 3) Long waiting times in the Linux batch system 4) Performance problems with MPI applications 5) Further notes Re 1) On the frontends cluster.rz.rwth-aachen.de, cluster-x.rz.rwth-aachen.de and cluster-linux.rz.rwth-aachen.de the NFS mounts were switched from TCP to UDP. Since then, in our measurements, for about…Softwarepflege - Löschung ausgedienter Software Paul Kapinos posted on Feb 17, 2012 Many old versions of Intel software and some other very old software stocks were deleted. The following directories are affected: /rwthfs/rz/SW/NAG /rwthfs/rz/SW/UTIL /rwthfs/rz/SW/intel Accordingly, some modules were removed from the DEPRECATED area. Neues Release des Primers 8.2.1 Paul Kapinos posted on Feb 14, 2012 A new edition of the primer, version 8.2.1 of 14 February 2012, is now available at http://www.rz.rwth-aachen.de/hpc/primer. Most important changes: 1. LSF chapter (array jobs, chain jobs, removal of the -We parameter, new parameters for jobs accessing $HPCWORK, recommendations for max. memory usage per slot) 2. Sun Analyzer (can once again use the hardware counters - also for MPI programs). 3.
memalign32 script (important for MPI programs on SMP nodes) Infiniband Netzwerk Probleme Marcus Wagner posted on Feb 14, 2012 On Friday, 10 February 2012, InfiniBand problems occurred which also took one of the gateways down with them. As a result, a quarter of the machines in the Bull fabric were no longer reachable from outside the Bull fabric. Jobs started on these systems nevertheless vanished into nirvana, since they could not reach the LDAP servers outside the Bull fabric, so user authentication was not possible. The problem has since been resolved. "Hänger" auf den Linux-Cluster-Frontends Georg Schramm posted on Feb 03, 2012 On the frontend systems (primarily within the new Bull cluster) "hangs" have occurred frequently in recent weeks. The problem is known and a solution is being worked on intensively.… Neues Release des Primers 8.2 Paul Kapinos posted on Jan 30, 2012 A new edition of the primer, version 8.2 of January 2012, is now available at http://www.rz.rwth-aachen.de/hpc/primer. Most important changes: 1. LSF chapter (array jobs, chain jobs, removal of -We) 2. Sun Analyzer (can once again use the hardware counters - also for MPI programs). Sun Studio Version 12.3 (release) installiert. Paul Kapinos posted on Jan 26, 2012 Sun Studio version 12.3 (release) installed and made the default Studio right away. Available with "module load studio". Show the available hardware counters: $ collect -h Versionspflege des Intel Compilers 12.1.2.273 neuer Standardcompiler Paul Kapinos posted on Jan 04, 2012 Intel compiler version maintenance: 12.1.0.233 goes to DEPRECATED; 12.1.2.273 becomes the new default compiler (12.1). Background: "version 12.1.0.233 of the Intel compiler has a vectorization bug."
https://svn.open-mpi.org/trac/ompi/changeset/25290 http://www.open-mpi.org/community/lists/users/2012/01/18091.php A + version-3.0/modulefiles/linux/x86-64/DEPRECATED/intel/12.1.0.233 R + version-3.0/modulefiles/linux/x86-64/DEVELOP/intel/12.… Matlab Module angepasst Paul Kapinos posted on Dec 21, 2011 Matlab modules edited: 1. Versions 2006*, 2007* are no longer available and are hereby dispatched to nirvana. 2. Modules adjusted so that all changes happen in the module itself and the version files contain nothing but the version number. 3. A warning was added that other programs may not be usable while Matlab is loaded. Modulepflege - alte TotalView Versionen DEPRECATED Paul Kapinos posted on Dec 16, 2011 Module maintenance: TotalView versions 8.8, 8S.9.1-0A and 8.9.2beta moved to DEPRECATED; the new default TotalView is now 8.9.2 Gromacs 4.5.5 installiert Paul Kapinos posted on Nov 18, 2011 Version 4.5.5 of Gromacs (http://www.gromacs.org) was installed in the new (Bull) cluster. Loading the module: $ module load CHEMISTRY $ module load gromacs Older versions are now out of service. NWChem 6.0 (ohne Gewähr) Paul Kapinos posted on Nov 07, 2011 http://www.nwchem-sw.org The NWChem software version 6.0 was installed, built with the standard combination of the Intel 12 compiler and OpenMPI 1.5.3. Usage: $ module load CHEMISTRY nwchem Users are free to try this installation; error reports are welcome, but we give NO GUARANTEE that this installation works or that we will be able to repair it in case of errors. HPC ChangeLog PETSc Installationen (ohne gewähr) Paul Kapinos posted on Oct 28, 2011 An attempt was made to provide "vanilla" installations of PETSc.
Users are free to try these installations; error reports are welcome, but we give NO GUARANTEE that these installations work or that we will be able to repair them in case of errors. $ module load LIBRARIES $ module load petsc Available variants: for the Intel compiler, version 3.0.0 for OpenMPI and IntelMPI, version 3.2 only for OpenMPI. And here an image,… GCC compiler v.4.3.4 installiert Paul Kapinos posted on Oct 24, 2011 Version 4.3.4 of the GCC compilers was installed and is available from tomorrow with "module load gcc/4.3". This aged version is needed for Matlab. Sun Studio Version 12.3 (beta) installiert. Paul Kapinos posted on Oct 24, 2011 Available with "module load studio/12.3beta". ParaView- neue Versionen und Module verschoben Paul Kapinos posted on Oct 21, 2011 ParaView (http://www.paraview.org/) is an open-source, multi-platform data analysis and visualization application. It is used, among other things, together with OpenFOAM. I. The following versions were installed: 3.10.1 ==> new default 3.12.0.RC2 ==> (does not work yet) UPD: 2011.10.24 ==> version 3.12.0.RC2 removed again because it is broken. Version 3.6.1 was moved to a new installation location (it continues to work). II.… Permissions für PETSc nachgebessert Paul Kapinos posted on Oct 18, 2011 The read permissions for users were not set; this has now been fixed. /rwthfs/rz/SW/NUMLIB/PETSc-3.0.0/openmpi-1.5 Reference No.: 20111018-0524 studio Versionen v12.2 bleibt - alles andere DEPRECATED Paul Kapinos posted on Oct 10, 2011 The only available version of Studio remains /12.2. If needed, /12.3 (currently beta) can be installed; it is still available for download. studio/12.1p6 goes to DEPRECATED (because of problems with C++ in the new cluster, and because it is otherwise too old); studio/express goes to DEPRECATED (because it is older than 12.2); apart from that, the names were straightened out.
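Several of the posts above follow the same two-step pattern: first load a category module (CHEMISTRY, LIBRARIES), then the software module itself. A sketch of a typical session, assuming the cluster's module setup described in the posts:

```shell
$ module load LIBRARIES     # category module: makes the library modules visible
$ module load petsc         # PETSc variant matching the loaded compiler/MPI
$ module load CHEMISTRY     # category module for chemistry codes
$ module load nwchem        # NWChem, as announced above
```

The category modules act as namespaces, so only the software groups a user actually needs show up in `module avail`.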
gcc 4.5 fixed on SL6x machines Paul Kapinos posted on Oct 10, 2011
The version gcc/4.5 was wrongly linked on Scientific Linux 6.x machines (new cluster): it pointed to 4.5.1 from the old cluster. This is now fixed; it points to 4.5.3.

Sun MPI has served its time and is now DEPRECATED Paul Kapinos posted on Oct 09, 2011
No further comment.

VASP: new installation Paul Kapinos posted on Oct 05, 2011
Dear test cluster users, the VASP installation has been redone with the current compiler (Intel Fortran 12.1.0.233) and Open MPI 1.5.3. Switching to openmpi/1.4.3 is no longer necessary. A version for Intel MPI (v. 4.0.3.008) is also available. The AEDENS and TOTAL variants are no longer built, since it turned out that they do not differ from the standard variant. The versions available until now can be made available for testing purposes.…

New default modules (new cluster) Paul Kapinos posted on Sep 21, 2011
New default modules: intel/12.1 (instead of intel/12, which has been renamed to intel/12.0) and openmpi/1.5.3 (instead of openmpi/1.4.3). On this occasion, two dead links to no-longer-existing Open MPI versions were also cleaned up.

TotalView 8.9.2 (beta) installed Paul Kapinos posted on Sep 13, 2011
The beta version 8T.9.2-0 of TotalView 8.9.2 has been installed. Accessible with $ module load totalview/8.9.2beta License valid until 01-Oct-2011.

LSF memory limit changed Tim Cramer posted on Sep 02, 2011
We changed the configuration of the LSF memory limit. Unfortunately, LSF used a per-process limit in some cases and a per-job limit in others, which was very confusing. We deactivated the control by LSF, so the memory limit is now enforced by the operating system. If you still have problems with the memory limits, please contact us. Note that the error message is not very clear if you request too little memory.
The LSF mail says "Successfully completed",…

OpenFOAM 2.0.0 Paul Kapinos posted on Aug 11, 2011
OpenFOAM 2.0.0 is now installed in the beta cluster. This version is built using the GCC compilers and the default-loaded MPI, so you have to load the GCC compiler instead of the Intel compiler. An Intel-compiler build may be added if needed. Note: this installation is not intended for development on OpenFOAM itself. To use OpenFOAM,…

gcc 4.5 and 4.6 Paul Kapinos posted on Aug 11, 2011
The GCC installations in the new cluster (gcc/4.5 and gcc/4.6) have been redone on the new SL60 cluster. The C++ compilers should now work as well. Rolled out on cluster-beta2 (further machines from 12.08.2011).

Open MPI for the Intel compiler (the default) rebuilt with Intel 11.1 Paul Kapinos posted on Aug 11, 2011
The versions of Open MPI for the Intel compiler (including the default one) were rebuilt with the intel/11.1 compiler on 10.08.2011. Thus it is now possible to use Open MPI with the intel/11.1 compiler as well (the previous versions worked with intel/12 only). The source of error messages like 'forrtl: severe (71): integer divide by zero' (also see below) is still unknown. We recommend trying the intel/11.1 compiler if you see such messages from your Fortran program.…

Default queue changed back to normal Georg Schramm posted on Aug 01, 2011
With the majority of nodes installed under SL6.0, the default queue is now again the queue "normal". All jobs in the queue "normal-sl6" will be allowed to finish, but the queue no longer accepts new jobs.

ssh2blaunch wrapper now less verbose Paul Kapinos posted on Jul 29, 2011
The SSH-to-blaunch wrapper in /opt/lsf/8.0/linux2.6-glibc2.3-x86_64/bin/rsh has been adjusted and now does its work silently (debug output disabled). UPD (Tue Aug 2 15:30:13 CEST 2011): the fix has been transferred to the currently active LSF installation, namely the one reachable from linuxtc04 (from linuxtc03).
A versioning system for the LSF configuration files would certainly be of great advantage.

VASP installation re-done on SL60 Paul Kapinos posted on Jul 29, 2011
The VASP installation in the new cluster has been re-done. The versions 4.5, 4.6, and 5.2, known from the old cluster, are compiled with the new Intel compilers and with Open MPI for the MPI versions. Version 5.2 is now the default VASP version. The new installation will be active from tomorrow, 20.07.2011, on the Scientific Linux 6.0 (SL60) part of the cluster only (cluster-beta2). Version 5.2 is the same as 5.2.2 in the old part of the cluster.

Closure of CentOS 5.6 queue and new default queue Georg Schramm posted on Jul 29, 2011
The "normal" queue with CentOS 5.6 systems was closed today, and the queue "normal-sl6" was configured as the default queue. Jobs running in the CentOS 5.6 queue will be left running until next week. Waiting jobs will be switched to the SL6.0 queue if not removed. Submitting jobs to the normal queue is disabled.

OpenMPI for SL60 computers re-installed Paul Kapinos posted on Jul 28, 2011
All Open MPI versions have been built and installed for the Scientific Linux 6.0 (SL60) computers. Newly installed: /1.5.3 (all versions), /1.5.3mt (all versions), /1.4.3/pgi, /1.4.3mt/pgi. Active: from tomorrow, 29.07.2011. Rebuilt: /1.4.3/(gcc,intel,studio), /1.4.3mt/(gcc,intel,studio). Active: immediately. Note: the old cluster and the CentOS 5.6 part of the new cluster are not changed.

Superfluous output during IntelMPI jobs Georg Schramm posted on Jul 28, 2011
The superfluous output when running an Intel MPI job has been avoided by redirecting the output of the PAM (Parallel Application Manager) to the error stream.

SunMPI moved to NFS Paul Kapinos posted on Jul 28, 2011
The installation method for SunMPI has been changed: instead of rolling out RPMs on every node, the installation directories have been moved to /rwthfs/rz/SW/MPI/SCIENTIFIC-6.0/SUNWhpc and linked under /opt/SUNWhpc (sunmpi.lnk).
At the same time a small bug was fixed (the PGI compiler should now be able to build 32-bit MPI programs).

LSF updated Georg Schramm posted on Jul 26, 2011
The default queue is now called "normal" instead of "parallel". There is also a new queue called "normal-sl6", which is the queue for the freshly installed Scientific Linux 6.0 systems.

Memory limit introduced Georg Schramm posted on Jul 19, 2011
Per-process memory limits have been introduced. If no limit is set with the bsub option -M <n> (unit MB), or if it is below 512 MB, the default value of 512 MB is used. Jobs submitted prior to job id 17514 are not affected.

Exclusive jobs Marcus Wagner posted on Jul 19, 2011
Every job that requests >= 32 slots is made exclusive.

File access to hpcwork Georg Schramm posted on Jul 18, 2011
Access to the hpcwork directory on the frontend nodes cluster-beta and cluster-beta2 via file transfer clients such as WinSCP or Secure File Transfer Client is now possible.

Changed the startup method of Intel MPI Marcus Wagner posted on Jul 18, 2011
This was done transparently for the user, who still uses mpirun.lsf. We no longer use mpdboot, but start the mpds on the remote hosts by hand. This way, Intel MPI is controllable by LSF and the administrators.

work replacement lustre Georg Schramm posted on Jul 15, 2011
In the new cluster the Lustre file system is available and will be used as a replacement for the work directory due to an outage of the work file server. $WORK can be accessed only on the frontend nodes cluster-beta and cluster-beta2, to copy files to the new Lustre directories. The Lustre directory is under /hpcwork/<user>. The quota can be listed using the following command: $> lfs quota /lustreb By default each user has soft limits of 1 TB of data and 50000 files.…

WORK file server outage Georg Schramm posted on Jul 15, 2011
Due to a file server outage, dispatching of jobs has been disabled.
Rescheduling fixed Georg Schramm posted on Jul 14, 2011
Jobs exiting with value 98 were not rescheduled correctly; the problem has been fixed.

Running multiple commands with bsub Georg Schramm posted on Jul 12, 2011
$> bsub "echo $SHELL; echo $SHELL" previously produced:
before user command execution in jobstarter
/usr/local_rwth/bin/zsh
after user command execution in jobstarter
/usr/local_rwth/bin/zsh
It now produces:
before user command execution in jobstarter
/usr/local_rwth/bin/zsh
/usr/local_rwth/bin/zsh
after user command execution in jobstarter

Default runtime limit enabled Georg Schramm posted on Jul 08, 2011
A default run time limit of 15 min has been introduced; it is applied if no run time limit is set using the bsub option -W [<hours>:]<minutes>.

Run time limit normalization deactivated Georg Schramm posted on Jul 07, 2011
ABS_RUNLIMIT=Y avoids the normalization of run time limits according to the CPU factor of a host. Run time limits (bsub option -W <minutes>) should now be interpreted absolutely.

Changes in the serial queue Marcus Wagner posted on Jul 05, 2011
The serial queue is for the moment a special-purpose queue; only one special user has access. Please do NOT submit to this queue, your jobs won't run.

Host groups added Marcus Wagner posted on Jul 05, 2011
There are now new host groups: bull-mpi-l, bull-mpi-s, bull-smp-l, bull-smp-s.

Requeue exit value and maximum requeue value set Georg Schramm posted on Jul 05, 2011
Values have been set as follows: REQUEUE_EXIT_VALUES=98 MAX_JOB_REQUEUE=10

New group introduced for access to cluster-beta* Georg Schramm posted on Jul 04, 2011
Access to the LSF cluster frontends (cluster-beta, cluster-beta2) is from now on controlled by the Unix group pilotstage. Problems may occur when logging in to these nodes; in that case, please let us know.
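The memory limit (bsub option -M) and the run time limit (bsub option -W) described in the entries above can also be set via #BSUB directives in a job script. A minimal sketch, assuming a typical LSF job script layout; the job name and resource values are illustrative, not site defaults:

```shell
#!/usr/bin/env zsh
# Hypothetical LSF job script sketch; adjust the values to your job.
#BSUB -J example_job   # job name (illustrative)
#BSUB -W 0:30          # run time limit: 30 minutes (otherwise the 15 min default applies)
#BSUB -M 1024          # per-process memory limit in MB (otherwise 512 MB applies)
echo "job started on ${HOSTNAME:-unknown}"
```

Submitted with `bsub < job.sh`, the #BSUB lines are read by LSF; when the script is run directly, they are plain comments.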
openssh-askpass installed on the Bull nodes Marcus Wagner posted on Jun 30, 2011
Supposedly needed by VASP.

Change of limits Marcus Wagner posted on Jun 30, 2011
Up to now, all limits used in LSF, such as the requested memory, were interpreted in KB. We changed this to MB, as it was in SGE.

Kernel-ib module rebuilt Sascha Bücken posted on Jun 29, 2011
The InfiniBand module was newly compiled and activated after a reboot. MPI should be functional now.

Environment variables TMP, TEMP, TMPDIR, TMPSESS set admin posted on Jun 29, 2011
The environment variables TMP, TEMP, TMPDIR, and TMPSESS are now set for batch job execution.

Several "frontend" packages additionally installed on the two frontend nodes Sascha Bücken posted on Jun 29, 2011
firefox-3.6.18-1.el5.centos.i386 firefox-3.6.18-1.el5.centos.x86_64 nano-1.3.12-1.1.x86_64 nedit-5.5-21.el5.x86_64 perl-PDL-2.4.1-47.x86_64 1:qt-3.3.6-23.el5.x86_64 8:arts-1.5.4-1.x86_64 libieee1284-0.2.9-4.el5.x86_64 desktop-backgrounds-basic-2.0-41.el5.centos.noarch avahi-qt3-0.6.16-10.el5_6.x86_64 1:tix-8.4.0-11.fc6.x86_64 tkinter-2.4.3-44.el5.x86_64 python-imaging-1.1.5-5.el5.x86_64 3:htdig-3.2.0b6-11.el5.x86_64 fribidi-0.10.7-5.1.x86_64 libexif-0.6.13-4.0.2.el5_1.1.x86_64 gphoto2-2.2.0-3.…

Gaussian module edited admin posted on Jun 29, 2011
The default TMP path in the Gaussian module caused an error. The problem has been solved.
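With TMP, TEMP, and TMPDIR now set for batch jobs, scratch files can be kept out of the shared /tmp. A minimal sketch, assuming the batch system exports one of these variables; the file name is illustrative and the /tmp fallback is only for interactive testing:

```shell
# Prefer the job-private scratch directory provided by the batch system;
# fall back to /tmp only when running interactively outside a job.
SCRATCH="${TMP:-${TMPDIR:-/tmp}}"
tmpfile="$SCRATCH/scratch_demo.$$"   # $$ (PID) keeps the name unique per process
echo "intermediate data" > "$tmpfile"
cat "$tmpfile"
rm -f "$tmpfile"                     # remove your own scratch files when done
```

Cleaning up explicitly matters: node-local scratch space is limited and shared among jobs.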