Production-Quality Grid Environments with UNICORE
Transcription
Production-Quality Grid Environments with UNICORE
Production-Quality Grid Environments with UNICORE Dietmar Erwin, Michael Rambadt, Achim Streit and Philipp Wieder Research Centre Jülich (FZJ) GGF 14 Workshop on Grid Applications: from Early Adopters to Mainstream Users June 27, 2005, Chicago Forschungszentrum Jülich in der Helmholtz-Gemeinschaft Contents ¾ ¾ ¾ ¾ ¾ Setting the scene UNICORE in production OpenMolGRID: an application example Lessons learned: what users expect (& what they get) Conclusions Forschungszentrum Jülich in der Helmholtz-Gemeinschaft GGF 14, June 27, 2005 2 Status UNICORE ¾ Long project history (and future), currently DEISA, NaReGI, UniGrids, VIOLA, ... ¾ Features ¾ GUI with single sign-on, X.509 certificates for AA and job/data signing, only one opened port in firewall required, workflow engine for complex multi-site/multi-step workflows, job monitoring, extensible application support, secure data transfer integrated, resource management, resources autonomy remains, production quality, … ¾ Available as Open Source under BSD license on SourceForge at http://unicore.sourceforge.net Forschungszentrum Jülich in der Helmholtz-Gemeinschaft GGF 14, June 27, 2005 3 Production at FZJ ¾ National high-performance computing centre “John von Neumann Institute for Computing” ¾ About 650 users in 150 research projects ¾ Access via UNICORE to ¾ p690 IBM eServer Cluster 1600 (8,9 TFlops peak) since June 2004 ¾ BlueGene/L (5,7 TFlops peak) ¾ Cray SV1ex, Cray XD1 ¾ 116 active UNICORE users ¾ 72 external, 44 internal ¾ Resource usage (2004/05; CPU-hours) ¾ Dec: 18.4%, Jan: 30.4%, Feb: 30.5%, Mar: 27.1%, Apr: 29.7%, May: 39.1% Forschungszentrum Jülich in der Helmholtz-Gemeinschaft GGF 14, June 27, 2005 4 DEISA ¾ Status (Q1/2005): 4 sites / ~24 TFlops IBM Power4 / 1 Gbit / GPFS / LoadLeveler ¾ Goal (2006): 11 sites / ~140 TFlops / 10 GBit / GPFS / Metascheduler Forschungszentrum Jülich in der Helmholtz-Gemeinschaft GGF 14, June 27, 2005 5 DEISA configuration Forschungszentrum Jülich in der Helmholtz-Gemeinschaft GGF 14, June 27, 2005 6 OpenMolGRID: Problem Goal: “Speed-up, automate, and standardise the drug-design using Grid technology” ¾ Characteristics of the problem: ¾ ¾ ¾ ¾ ¾ High calculation complexity Large amounts of data Integration of different applications Secure access to distributed data Collaborative work A real Grid application Forschungszentrum Jülich in der Helmholtz-Gemeinschaft GGF 14, June 27, 2005 7 OpenMolGRID: Solution ¾ ¾ ¾ ¾ Usage of UNICORE, integration of applications Extended workflow support to automate processes Interface to databases API & command line interface QSAR 3D Output EPA ECOTOX Database Descriptors > 5 days 280 2D structures downloaded < 2 hours QSAR 3D Output Forschungszentrum Jülich in der Helmholtz-Gemeinschaft Descriptors GGF 14, June 27, 2005 8 Lessons Learned ¾ Taught by end-users and software developers from ¾ Industry: pharmaceutical, petrochemical, bio-molecular, weather prediction, automotive, engineering ¾ Research: astrophysics, quantum physics, material science, medicine, biology, chemistry ¾ Deployment of new production software has to offer added value ¾ Ease usage, increase effectiveness, decrease cost, … ¾ Users have to be stimulated and encourage to ¾ use Grid technology for applications, computations, data transfer and access to resources ¾ adapt/integrate their applications to/into Grids ¾ Operation of production environments is costly ¾ Certification authority, administrative tools, integration into site management, licenses, … ¾ Common production environment difficult to maintain Forschungszentrum Jülich in der Helmholtz-Gemeinschaft GGF 14, June 27, 2005 9 More lessons learned ¾ Fulfilment of functional requirements is not enough ¾ Users want ¾ software of high quality, especially high reliability and resilience ¾ help to overcome initial hurdles like • obtaining certificates, adapt applications, ... ¾ 24/7 availability of the Grid infrastructure ¾ 24/7 availability of the Grid experts • support hotline, help desk, mailing lists, … ¾ long-term commitment for continuous development and support ¾ workshops, hands-on training, ... Forschungszentrum Jülich in der Helmholtz-Gemeinschaft GGF 14, June 27, 2005 10 Conclusions ¾ Production Grids are possible ¾ Easy/unified usage, integration of legacy applications, cost reduction, … ¾ When will we see the WS-* impact? ¾ Users demand a fully-fledged product ¾ Functions, but also support, support, support ¾ Continuity is crucial ¾ Open Source distribution is the right way ¾ Source for bug reports, requirements, … ¾ Higher visibility & community building Forschungszentrum Jülich in der Helmholtz-Gemeinschaft GGF 14, June 27, 2005 11