Production-Quality Grid Environments with UNICORE

Transcription

Production-Quality Grid Environments with UNICORE
Production-Quality Grid Environments
with UNICORE
Dietmar Erwin, Michael Rambadt,
Achim Streit and Philipp Wieder
Research Centre Jülich (FZJ)
GGF 14 Workshop on Grid Applications:
from Early Adopters to Mainstream Users
June 27, 2005, Chicago
Forschungszentrum Jülich
in der Helmholtz-Gemeinschaft
Contents
¾
¾
¾
¾
¾
Setting the scene
UNICORE in production
OpenMolGRID: an application example
Lessons learned: what users expect (& what they get)
Conclusions
Forschungszentrum Jülich
in der Helmholtz-Gemeinschaft
GGF 14, June 27, 2005
2
Status UNICORE
¾ Long project history (and future), currently DEISA,
NaReGI, UniGrids, VIOLA, ...
¾ Features
¾ GUI with single sign-on, X.509 certificates for AA and
job/data signing, only one opened port in firewall required,
workflow engine for complex multi-site/multi-step
workflows, job monitoring, extensible application support,
secure data transfer integrated, resource management,
resources autonomy remains, production quality, …
¾ Available as Open Source under BSD license on
SourceForge at http://unicore.sourceforge.net
Forschungszentrum Jülich
in der Helmholtz-Gemeinschaft
GGF 14, June 27, 2005
3
Production at FZJ
¾ National high-performance computing centre
“John von Neumann Institute for Computing”
¾ About 650 users in 150 research projects
¾ Access via UNICORE to
¾ p690 IBM eServer Cluster 1600 (8,9 TFlops peak)
since June 2004
¾ BlueGene/L (5,7 TFlops peak)
¾ Cray SV1ex, Cray XD1
¾ 116 active UNICORE users
¾ 72 external, 44 internal
¾ Resource usage (2004/05; CPU-hours)
¾ Dec: 18.4%, Jan: 30.4%, Feb: 30.5%,
Mar: 27.1%, Apr: 29.7%, May: 39.1%
Forschungszentrum Jülich
in der Helmholtz-Gemeinschaft
GGF 14, June 27, 2005
4
DEISA
¾ Status (Q1/2005): 4 sites / ~24 TFlops IBM Power4 /
1 Gbit / GPFS / LoadLeveler
¾ Goal (2006): 11 sites / ~140 TFlops / 10 GBit / GPFS
/ Metascheduler
Forschungszentrum Jülich
in der Helmholtz-Gemeinschaft
GGF 14, June 27, 2005
5
DEISA configuration
Forschungszentrum Jülich
in der Helmholtz-Gemeinschaft
GGF 14, June 27, 2005
6
OpenMolGRID: Problem
Goal: “Speed-up, automate, and standardise the
drug-design using Grid technology”
¾ Characteristics of the problem:
¾
¾
¾
¾
¾
High calculation complexity
Large amounts of data
Integration of different applications
Secure access to distributed data
Collaborative work
A real Grid application
Forschungszentrum Jülich
in der Helmholtz-Gemeinschaft
GGF 14, June 27, 2005
7
OpenMolGRID: Solution
¾
¾
¾
¾
Usage of UNICORE, integration of applications
Extended workflow support to automate processes
Interface to databases
API & command line interface
QSAR
3D Output
EPA
ECOTOX
Database
Descriptors
> 5 days
280
2D
structures
downloaded
< 2 hours
QSAR
3D Output
Forschungszentrum Jülich
in der Helmholtz-Gemeinschaft
Descriptors
GGF 14, June 27, 2005
8
Lessons Learned
¾ Taught by end-users and software developers from
¾ Industry: pharmaceutical, petrochemical, bio-molecular, weather
prediction, automotive, engineering
¾ Research: astrophysics, quantum physics, material science,
medicine, biology, chemistry
¾ Deployment of new production software has to offer added
value
¾ Ease usage, increase effectiveness, decrease cost, …
¾ Users have to be stimulated and encourage to
¾ use Grid technology for applications, computations, data transfer
and access to resources
¾ adapt/integrate their applications to/into Grids
¾ Operation of production environments is costly
¾ Certification authority, administrative tools, integration into site
management, licenses, …
¾ Common production environment difficult to maintain
Forschungszentrum Jülich
in der Helmholtz-Gemeinschaft
GGF 14, June 27, 2005
9
More lessons learned
¾ Fulfilment of functional requirements is not enough
¾ Users want
¾ software of high quality, especially high reliability and
resilience
¾ help to overcome initial hurdles like
• obtaining certificates, adapt applications, ...
¾ 24/7 availability of the Grid infrastructure
¾ 24/7 availability of the Grid experts
• support hotline, help desk, mailing lists, …
¾ long-term commitment for continuous development and
support
¾ workshops, hands-on training, ...
Forschungszentrum Jülich
in der Helmholtz-Gemeinschaft
GGF 14, June 27, 2005
10
Conclusions
¾ Production Grids are possible
¾ Easy/unified usage, integration of legacy applications, cost
reduction, …
¾ When will we see the WS-* impact?
¾ Users demand a fully-fledged product
¾ Functions, but also support, support, support
¾ Continuity is crucial
¾ Open Source distribution is the right way
¾ Source for bug reports, requirements, …
¾ Higher visibility & community building
Forschungszentrum Jülich
in der Helmholtz-Gemeinschaft
GGF 14, June 27, 2005
11