CHIPP Computing Board (CCB) - Minutes of Meeting (April 21st 2006)
================================================================

Present: F. Orellana, D. Feichtinger, H. P. Beck, C. Grab, P. Kunszt, R. Bernet, U. Langenegger, S. Haug, J. van Hunen
Location: ETHZ - HPK E33; 10:30 - 16:00

----------------------------------------------------
1) Phoenix Status (CG)
----------------------------------------------------
- APEL accounting system currently defective; PK said they are looking at it.
- Phoenix ramp-up plan: we agree not to change the tables for the moment.
- Estimated cost: hardware acquisition ~3 Mio; operation ~0.5 Mio per year.
- Financing: some preliminary information was transmitted, but no letter received yet.
- Mention of the offer by SUN to establish a research collaboration in order to build up the cluster. We need to understand what this would mean for us in terms of additional research work.
- PK suggested that KTI (a federal organisation) could also help with financing and that CSCS has good contacts there.
- Long discussion about the CHIPP vote that led to the current stalemate --> need to wait for the letter from the CHIPP board.
- We decided to make a decision at the end of May.

----------------------------------------------------
2) CSCS status (PK)
----------------------------------------------------
- Experiment support personnel:
- - DF+FO have stopped at CERN and are now attached to CMS+ATLAS.
- - LHCb: does RB need help (currently uses ~15%)? Depends on what the Swiss LHCb physicists want to do. No user analysis for now; for production, just send an email. --> Conclusion: no extra help needed for the moment. No need for a VO box.
- CSCS support personnel:
- - PK is the single person for now; Vincenzo is also active.
- - System administration is done by the internal FUS group.
- - CSCS is hiring 1-2 more persons; a new person starts August 1.
- PK described user support through GGUS; Derek advised that he, FO and RB join the GGUS support group.
- PK: most tickets are handled by CSCS, since that is what they are paid for by EGEE. Experiment-specific tickets will be reassigned to DF, RB and FO.
- PK described Linux kernel 2.4 -> 2.6 upgrade issues.
- LCG 2.7 upgrade:
- - DPM is there now.
- - No LFC yet - expected in two weeks.
- SH: who will be responsible for VO box and services uptime?
- PK: the relevant persons here will have root access.
- Networking:
- - The Manno - Domodossola - Martigny - Geneva link is now in place.
- - 1 Gb/s now, 10 Gb/s possible.
- - Redundancy possible in the future by hooking on to the FZK - CNAF link.
- SC4:
- - Good collaboration with FZK.
- - We are the first tier-2 site to join, apart from DESY.
- DF asked about presence on monitoring sites. Seems OK.
- Bottlenecks:
- - Storage bandwidth and firewalls.
- - Dedicated links to CERN are NOT an issue.
- - Good connectivity to more than one tier-1 is important and should be guaranteed, since everything Swiss goes through the CERN hub.
- - SWITCH can help tune: http://www.switch.ch/network/pert/
- EGEE:
- - CSCS is part of EGEE-2: 184'000 Euros for 2 years: one extra person (perhaps 3 extra in total, depending on needs).
- - Suggestion: extend PHOENIX by 15-20% for other communities, with funding from CSCS/EGEE - a win-win situation.
- Portal:
- - alprose01.projects.cscs.ch.
- - Will implement NG support.
- CG on the tier-1 issue:
- - CMS: attached to the Northern tier-1 group, effectively FZK.
- - LHCb: FZK.
- - ATLAS: FZK/CERN, will be clear after SC4.
- PK: it is still completely unclear which services CERN will provide, so better to stay with FZK.
- UL: where will the money come from for a faster link at CSCS when the need arises?
- CG: 50'000 Euros per year should be enough; not too worried - the fibre is there.
- CG, again: peaks on tier-1s when reprocessing has finished; the bottleneck is not the network, but disk I/O on the tier-1s.

----------------------------------------------------
3) Usage of CSCS tier-2
----------------------------------------------------
CMS (DF):
- DF set up a full VO box in Manno at the beginning of the year: CMS file transfer service, CMS file catalogue.
- PK has bought 5 servers for all 3 experiments to use for VO boxes.
- DF described issues with file catalogue publishing: large effort, will be obsolete in 1-2 months; 3 datasets published in total so far.
- CMS software is in a process of change, both core and grid software.
- Main user is UL: running Monte Carlo; submits about a thousand jobs a day when running.
- UL is worried about ATLAS jobs running via 2 queues in Manno: we agreed to check the Maui configuration to ensure true fair-sharing.

ATLAS (FO+SH):
- Central production via LCG. CSCS is participating via LCG.
- A few small- to medium-scale user productions via NG.
- Will investigate setting up ATLAS DDM on the VO box.

LHCb (RB):
- Central production via DIRAC. CSCS is participating.
- No VO box needed.

----------------------------------------------------
4) Status of Service Challenge 4 (experiments)
----------------------------------------------------
- DF reminded us to look at the SC4 blog and follow what each experiment needs to do to participate.
- DF reminded us to look at and update the CSCS wiki.
- Tier-1 attachment from the experiments' point of view:
- - CMS: FZK.
- - ATLAS: FZK/CERN.
- Timeline:
- - CMS: set up software in the next few weeks, run tests in May.
- - ATLAS: get ready before June.

----------------------------------------------------
5) Status at home institutes
----------------------------------------------------
- ETHZ (UL):
- - Student + Andreas Holzner.
- - Old hardware.
- - PBS, no LCG, but interest in installing LCG after SC4.
- DF comments: currently tier-3 centres do not exist in LCG and cannot expect support.
- PSI (DF): no cluster as such, but an AFS installation of the whole CMS environment.
- ATLAS (FO+SH):
- - Presentation by SH.
- - NorduGrid setup in operation.
- - Planning to implement a GUI over the summer.
- - Users will start small productions during the summer.
- EPFL (JH):
- - 15 FTEs, half of whom are running physics analysis jobs.
- - Shift from hardware to software work.
- - Jobs are run by a central production team after a request per email.
- - Desktop project progressing - using Condor because of security problems with PBS Pro.
- Comment from DF: qemu can act as a producer of VMware disks.
- LHCb (RB):
- - T3 not running LCG, but standalone DIRAC.
- - An agent makes opportunistic use of the cluster.
- - Just running production jobs.
- - No big storage attached.

----------------------------------------------------
6) AOB
----------------------------------------------------
- Next meeting: June 2nd.
- Extended discussion on the call for tender.