The PHENIX Computing Center in Japan (RIKEN CC-J) is located at the RIKEN Wako campus. It is intended as a principal computing site for PHENIX simulation, a regional PHENIX computing center, especially for Japanese and Asian collaborators, and a center for the analysis of RHIC spin physics. CC-J is operated by the local Planning and Coordination Office (PCO), headed by Takashi Ichihara. In addition, the CC-J Working Group has been organized by PHENIX-J members to prepare and construct CC-J. Because CC-J is still in its three-year construction phase, users are encouraged to keep in close contact with these people. CC-J has its own web site: http://ccjsun.riken.go.jp/ccj/.
- Get an account.
- Log in to ccjgw.riken.go.jp or ccjsun.riken.go.jp from the WAN, and from ccjgw or ccjsun log in to one of the interactive hosts of the Linux Farm (linux1, linux2, linux3). ccjgw and ccjsun run Linux. (The latter ran Solaris until it was replaced in November 2007.) linux1/2/3/4 are assigned for interactive work (editing, compiling, testing, submitting jobs, etc.). Since 2007/09/01, ssh public-key authentication is required to log in to the login servers.
- Please use ccjsun.riken.go.jp to transfer your files to/from other sites, because the Linux farm cannot be seen from the WAN. "scp", "ssh protected ftp" and "bbftp" are available. These can be used to and from rftpexp.rcf.bnl.gov. From rcas20??, ccjsun can be reached using scp. For bbftp, the performance achieved at CCJ can be found here. The 'bbftp' command should be run at CCJ, not on rftpexp or rcas at RCF. If you need to transfer more than 50GB of files, please consult phenix-ccj-admin -at- rarfaxp.riken.go.jp.
- Make binaries for the Linux Farm on the interactive hosts above. The PHENIX software environment is also built here using AFS.
- Users have to use LSF to submit large-scale jobs that use many CPUs. A sample script for submitting with LSF is here: http://ccjsun.riken.go.jp/ccj/doc/LSF/Primer.html.
- HPSS is a large-scale storage system. Users can access it using the /usr/local/bin/pftp (parallel ftp) or /usr/local/bin/cftp (ccj parallel ftp) command. The former is available for large-scale archiving of PWG-level work; please consult phenix-ccj-admin -at- rarfaxp.riken.go.jp. The latter resembles 'rftp' at RCF and can be used for personal archiving, for listing HPSS directories and for getting files from HPSS. For multiple accesses using pftp/cftp, some error handling is needed; a sample script for error handling can be seen here.
http://rscc.riken.go.jp/hpss/systemusage/index.html is a support page maintained by SE (in Japanese).
Recently, chtar and htar/hsi have become available in CCJ-HPSS. The 'chtar' command is a modified version of 'htar' that, like cftp, limits the COS to 7 for archive; this was changed from COS 6 in June 2009. In COS 7, tapes are duplicated automatically for safety. The path is /usr/local/bin/chtar or /opt/ccj/bin/chtar. 'chtar' is available to all users, while htar/hsi are restricted to a limited set of users. Please consult us if you want to use htar/hsi.
- The temporary disk on each Linux node (/job_tmp) is used as a read/write buffer for jobs on that node. Users have to clear the disk at the end of each job running on the node. Any file found after the jobs have finished may be removed by the System Manager. Please use the "/usr/local/bin/rcpx" command instead of rcp to upload/download large files between the local disk and non-local disks. It limits the number of simultaneous accesses to a disk and prevents the NFS server from being overloaded.
- The backup policy is here. Users are recommended to back up their own important source files themselves via the network.
- Tutorials
- local software
cftp|| rcpx|| Objectivity/DB
- technical local rules/tips (in Japanese)
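The login and file transfer steps in the list above can be sketched as a pair of small shell helpers. This is only an illustration: CCJ_USER, the helper names and the example destination directory are assumptions, not part of the CC-J setup; only the host names come from the text.

```shell
#!/bin/sh
# Sketch only. CCJ_USER is a hypothetical variable holding your CC-J login name.
CCJ_USER="${CCJ_USER:-yourname}"
CCJ_LOGIN_HOST="ccjsun.riken.go.jp"   # login server named in the text

# Print the scp-style remote path for a location on the CC-J side.
ccj_target() {
    printf '%s@%s:%s\n' "$CCJ_USER" "$CCJ_LOGIN_HOST" "$1"
}

# Copy one local file to CC-J through the login server.
ccj_put() {
    scp "$1" "$(ccj_target "$2")"
}

# Example (not run here; the destination directory is a placeholder):
#   ccj_put results.tar /ccj/w/r01/yourname/
#   ssh yourname@ccjgw.riken.go.jp    # then, on the login server: ssh linux1
```

Transfers go through the login server because the farm nodes have no WAN-visible address; the interactive nodes are reached in a second hop.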
The system configuration is summarized in this figure. The key components are:
- Login Server (ccjgw.riken.go.jp, ccjsun.riken.go.jp)
- Linux CPU Farm
- Disk Server (ccjnfs11, ccjnfs12, ccjnfs13, ccjnfs14, ccjnfs15(SUN Fire V40z), ccjnfs20(SUN M4000))
- HPSS (High Performance Storage System)
- Network
- RSCC (replaced by RICC in Aug. 2009; the description on this page should be updated)
See also http://ccjsun.riken.go.jp/ccj/Wako-Sys/ for more detailed information.
- Login Server (ccjgw.riken.go.jp and ccjsun.riken.go.jp)
Users have to log in to these machines from the WAN, because the Linux Farm has only private IP addresses. Users can receive e-mail at ccjsun, especially from the mailing list for user support, but Japanese input and tex are not supported.
- Linux CPU Farm
The Linux CPU Farm consists of 166 dual Pentium nodes: ap01~ap148 and ap201~ap218 (thus 332 CPUs). They are in the private IP address space and can be reached only via ccjsun. Each node has 9 to 16 GB of local disk space, 1GB of memory and 2GB of swap space. The local disk must be cleaned after each job. Out of the 166 nodes, 3 are currently assigned for interactive use and aliased linux1, linux2 and linux3, while the others are for batch jobs, which users should submit with LSF. The current LSF version is 6.0. The interactive nodes are subject to change, so users are recommended to use the aliases linux1~4. The OS has been Scientific Linux 3.0 on almost all nodes since March 2005. The 'IBM' nodes, ap113-148, are also used to hold the PHENIX nDSTs on their local disks.
- Disk Server
The disk servers ccjnfs11/12/13/14/15/20 are NFS servers for /ccj/u and /ccj/w, and ccjnfs20 is also a NIS server. Users cannot log in to these servers, except for large-scale data transfer.
- HPSS (High Performance Storage System)
HPSS is hierarchical storage software running on five IBM p630 workstation nodes, which replaced the RS6000/SP in Dec. 2003. Eight STK T9940B tape drives handle 200GB tape cartridges (30MB/s) and two T10000 drives handle 500GB tape cartridges (120MB/s). In total, 9000 tape cartridges (1.5 PB) are held in two tape robots (STK Powderhorn 9310). The total size is expandable to 4.5 PB by replacing the cartridges with 500GB tapes. A transfer rate of about 80MB/sec has been obtained by parallel ftp (pftp) from HPSS to Linux, and 60-100MB/sec from the farm to HPSS at CC-J.
The HPSS home page is here: http://www5.clearlake.ibm.com.
- Network
The disk servers, login servers and HPSS are connected with Gigabit Ethernet and a Gigabit Ether Switch as shown in the figure. Each node of the Linux CPU farm is connected to the Gigabit Ether Switch by 100-Base-T. About 100Mbytes/sec is achieved as the transfer speed between the Linux farm and HPSS. Super-Sinet/SINET3 (10Gbps) connects RIKEN to the Internet. The RIKEN FW and CCJ (disk servers and CPUs), and CCJ and RSCC (HPSS and new CPUs), are each connected with two aggregated gigabit fibers, and approximately 1.5Gbps and 2Gbps are achieved on them, respectively.
- RSCC
RSCC is the RIKEN Super Combined Cluster System operated by the RIKEN IT division. Integrated operation between RSCC and CCJ started in 2004. While RSCC has 2048 3.06GHz Xeon CPUs, we can use 256 CPUs, named the pc2c cluster, for dedicated use by CCJ. The PHENIX environment is shared with the pc2c cluster. Because the NIS is separate, users have to submit another application form to log in to the pc2c cluster. Please consult phenix-ccj-admin -at- rarfaxp.riken.go.jp to get the account. After the application is processed by the RIKEN IT division, you can use the same login name and the same home directory as at CCJ on the pc2c cluster. pc2cn001 and pc2cn002 are the interactive nodes of the pc2c cluster. You can log in to them from ccjgw or ccjsun, as for linux1-4. HPSS is also available on the pc2c cluster. If you want to submit batch jobs to the pc2c cluster, you have to submit them from pc2cn001/2.
Accounts are issued to the persons responsible for large-scale computing projects authorized by the PHENIX Physics Working Groups (PWG). In principle, at most 3 accounts are allowed for each PWG. Users have to fill in the account request form and email it to phenix-ccj-admin -at- rarfaxp.riken.go.jp.
A CC-J user should also have a visiting position at RIKEN. If you don't have any position at RIKEN, please read here (Japanese || English) and fill out another form.
When you lose your RIKEN position, your account is subject to suspension. In such a case, please consult us about access to your account.
For ssh public-key login, the fingerprint of your key should be submitted with the account request form. If you have a web page at RCF, the public key itself should be placed in your WWW/p/draft-region and the URL should be sent to us. If you don't have an account at RCF, please consult us.
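As a minimal sketch of how the key fingerprint mentioned above can be obtained, the standard OpenSSH tool ssh-keygen prints it with the -l option. The helper name and the default key path are assumptions; which key type or length CC-J expects is not stated here, so check with the administrators.

```shell
#!/bin/sh
# Print the fingerprint of a public key file.
# The default path ~/.ssh/id_rsa.pub is an assumption; pass your own if it differs.
key_fingerprint() {
    ssh-keygen -l -f "${1:-$HOME/.ssh/id_rsa.pub}"
}

# Example (not run here):
#   ssh-keygen -t rsa      # create a key pair if you do not have one yet
#   key_fingerprint        # prints key size, fingerprint and comment
```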
The initial disk quota is 4GB/5GB (soft/hard limit) on /ccj/u/ (user home region) and 40GB/50GB on /ccj/w/r01 or /ccj/w/r02 (work region), which are served by ccjnfs20. Users can check their own quotas using the command '/usr/bin/rsh ccjnfs20 vxquota -v' on ccjsun. Users can archive to the HPSS archive region using the command /usr/local/bin/cftp. The COS is fixed to 6 (archive) and the total available space is limited. If one user wastes it, other users are affected. HPSS is also used to archive PWG-project-level work. When a project is finished, the account may be suspended until the next project starts.
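For a rough local check against the soft/hard limits quoted above, a directory's size can be compared with the two thresholds. This is only a sketch: the function name is hypothetical, it uses du rather than the vxquota command, and du's accounting may differ from what the quota system actually counts.

```shell
#!/bin/sh
# usage_status <dir> <soft_gb> <hard_gb>
# Prints "ok", "over-soft" or "over-hard" based on du's size for <dir>.
# Assumes 1 GB = 1024 * 1024 KB; the real quota accounting may differ.
usage_status() {
    used_kb=$(du -sk "$1" | awk '{print $1}')
    soft_kb=$(( $2 * 1024 * 1024 ))
    hard_kb=$(( $3 * 1024 * 1024 ))
    if [ "$used_kb" -ge "$hard_kb" ]; then
        echo over-hard
    elif [ "$used_kb" -ge "$soft_kb" ]; then
        echo over-soft
    else
        echo ok
    fi
}

# Example (not run here), with the initial home-region limits from the text:
#   usage_status "$HOME" 4 5
```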
Storage usage guidelines
- /ccj/u/ and /ccj/w/ are quota-limited.
- HPSS is for archiving the results of PWG work.
- HPSS is also for personal archiving using 'cftp'. The total space for personal archives is limited.
- /job_tmp on each Linux node can be used as a temporary working area. Users should make their own directory, named after their username, under /job_tmp on each node and use it. Temporary files under the directory should be removed by the end of each job on the node.
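The /job_tmp rule above (work in your own directory, remove your files when the job ends) can be sketched with a shell trap. The per-job subdirectory, the JOB_TMP_BASE override and the function names are assumptions added for illustration; LSB_JOBID is the job ID variable that LSF sets for batch jobs, with the shell's process ID used as a fallback.

```shell
#!/bin/sh
# Base of the node-local scratch area; /job_tmp is from the text.
# JOB_TMP_BASE is a hypothetical override, handy for trying this elsewhere.
JOB_TMP_BASE="${JOB_TMP_BASE:-/job_tmp}"

# Create a per-job working directory under <base>/<username> and print its path.
make_scratch() {
    scratch="$JOB_TMP_BASE/$(id -un)/job_${LSB_JOBID:-$$}"
    mkdir -p "$scratch" && printf '%s\n' "$scratch"
}

# Remove the directory created by make_scratch; does nothing for an empty argument.
clean_scratch() {
    [ -n "$1" ] && rm -rf -- "$1"
}

# Example job body (not run here; ./my_job is a placeholder):
#   SCRATCH=$(make_scratch) || exit 1
#   trap 'clean_scratch "$SCRATCH"' EXIT    # clean up on normal or early exit
#   cd "$SCRATCH" && ./my_job
```

Using a per-job subdirectory rather than wiping the whole user directory avoids deleting files of another job by the same user that happens to run on the same node.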
CCJ-PCO expects that proposals for computing projects using the CCJ computing resources are approved in and submitted from the Physics Working Groups in PHENIX. The proposals are submitted to CCJ-PCO either directly or through the PHENIX simulation coordinator. CCJ-PCO allocates the available computing resources to each project. If necessary, job priority is defined by CCJ-PCO, reflecting the PHENIX decision made in the collaboration meeting. CCJ-PCO reports regularly to the PHENIX collaboration on the status of the computing projects, including the allocation of computing resources, the priority, and the degree of completion. A conflict among projects is to be referred to the PHENIX spokesperson and Executive Council if it is beyond coordination by CCJ-PCO.
The template of the proposal is here: http://ccjsun.riken.go.jp/ccj/forms/app.html . Fill it in and send it by email to ccj-pco -at- rarfaxp.riken.go.jp
Users have to use the tool '/usr/local/bin/cftp' to access HPSS. All kinds of files can be retrieved ('get') with this tool if you have permission. For 'put', however, the available region is limited to the personal archive. It is an 'rftp'-like command. A data carousel like the one at RCF is under testing now. 'pftp/cftp' has an interface like the ordinary ftp command. To use it from a script, see the examples below.
/usr/local/bin/cftp-get-test.pl
/usr/local/bin/raw_transfer_pftp.pl
'cftp' and 'chtar' are only for personal archiving and for getting any kind of file from HPSS. For large-scale storage, please consult phenix-ccj-admin -at- rarfaxp.riken.go.jp and use 'pftp' to use HPSS effectively.
'hsi' and 'htar' are also available. Please consult the administrators.
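Scripted HPSS transfers need some error handling, and the sample scripts listed above are the reference for that. As a generic illustration only, a retry wrapper in plain shell might look like the following. It assumes the wrapped command reports failure through a non-zero exit status; whether cftp or pftp do so is not stated here, and the wrapped script name in the example is hypothetical.

```shell
#!/bin/sh
# Run a command up to <max_attempts> times, sleeping between failed attempts.
# usage: retry <max_attempts> <sleep_seconds> <command> [args...]
retry() {
    max=$1; delay=$2; shift 2
    attempt=1
    while :; do
        "$@" && return 0                        # success: stop retrying
        [ "$attempt" -ge "$max" ] && return 1   # attempts exhausted
        echo "attempt $attempt failed; retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
    done
}

# Example (not run here): wrap your own transfer script.
#   retry 3 60 ./my_hpss_get.sh
```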
LSF (Load Sharing Facility) is also used at RCF.
At CCJ, several queues (short, long, bg, etc.) are currently set up. See http://ccjsun.riken.go.jp/ccj/doc/LSF/index.html.
For PWG work, a separate queue for each group will be created so that CCJ-PCO can control job priority.
Submit jobs from the interactive nodes linux1/2/3/4. At RSCC, LSF has also been available since Feb. 2005.
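A minimal sketch of a job submission from an interactive node, using the standard LSF bsub options -q (queue) and -o (output file, where %J is replaced by the job ID). The helper name, the DRY_RUN switch, the output file name and the job script are assumptions for illustration; the LSF primer linked above is the authoritative local reference.

```shell
#!/bin/sh
# Submit a command to an LSF queue. Set DRY_RUN=1 to print the bsub
# command line instead of running it.
submit() {
    queue=$1; shift
    ${DRY_RUN:+echo} bsub -q "$queue" -o "lsf.%J.out" "$@"
}

# Example (not run here), using the 'short' queue named in the text and a
# hypothetical job script:
#   submit short ./run_job.sh
#   bjobs                      # list your pending and running jobs
```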
The PHENIX Linux software environment at CC-J is very similar to that of RCF: you will find the "/afs/rhic/phenix" and "/opt/phenix" directories in your Linux environment.
-----> If you are an experienced RCF/PHENIX user, this is enough to start.
In order to get a standard PHENIX environment, run
source /opt/phenix/bin/phenix_setup.csh
However, there is one point to remember. CC-J Linux nodes do not use AFS, neither Transarc's original AFS nor the free client arla. Instead, the directories "/afs/rhic/...", "/opt/phenix/..." and "/cern" are served via NFS. "sys name" (@sys) is symbolically linked to "i386_sl302" at every switchyard. As long as you use "SL3" (at the moment), this difference should not show up in normal use.
No klog is necessary for "CVS checkout". But of course it makes no sense to do "CVS checkin", because it will not be reflected on the real AFS server at RCF.
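Since these directories are NFS mounts rather than real AFS, a quick check that they are visible on the node can save some confusion before sourcing the setup script. The helper below is a hypothetical sketch; only the directory names come from the text.

```shell
#!/bin/sh
# Report which of the given directories are missing.
# Returns 0 if all exist, 1 otherwise.
check_dirs() {
    missing=0
    for d in "$@"; do
        [ -d "$d" ] || { echo "missing: $d" >&2; missing=1; }
    done
    return "$missing"
}

# Example (not run here), with the NFS-served directories named above:
#   check_dirs /afs/rhic/phenix /opt/phenix /cern
```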
-----> If you are a PHENIX software beginner, follow this very basic tutorial: To BUILD and RUN PISA99 (PHENIX version of GEANT3) !!
http://ccjsun.riken.go.jp/~hayashi/usguide-mypart.txt
Objectivity/DB is not supported yet. A copy of the PHENIX PostgreSQL database is maintained and operated here. See http://ccjsun.riken.go.jp/ccj/doc/phenix-data/phenix-db/index.html
When users transfer PHENIX data files (raw data or DST) from RCF to CC-J, please consult phenix-ccj-admin -at- rarfaxp.riken.go.jp. We will assign some disk space or HPSS space for public use. Part of the nDSTs for Run2-Run4 is available on the disks. See also the page: PHENIX DATA Location at CCJ
/ccj/u and /ccj/w are RAID disks, while /job_tmp on each Linux node is non-RAID. /ccj/w and /job_tmp are NOT backed up. /ccj/u is 'rsync'ed to /ccj/w/data56/ccj-u-bkp once a day. Users can retrieve their own data from there if they do so shortly after removing it. /ccj/u is also backed up so that it can be recovered after a disk crash. Restore requests from users who have lost files will NOT be approved.
- current status of Linux cpu farm:
http://ccjsun.riken.go.jp/ccj/doc/farms/stat/
- Network:
http://ccjsun.riken.go.jp/ccj/info/mrtg/network/
Users should subscribe to the mailing list ccj-users. Notices concerning shutdowns and other information are announced on this list. Users can report any trouble with CC-J to this list. Use English.
PHENIX-J members are recommended to subscribe to the mailing list ccj-users-j. This list is for discussion of CC-J as a regional center. Use Japanese.
The mailing list ccj-pco is for administrative issues.
Users (PHENIX-J members) will be added to the list ccj-users (and ccj-users-j) when they get their own account. Mail will be delivered to your CCJ account. Please set your .forward if you want to receive mail on your own mail server.
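Setting a .forward file amounts to writing the destination address into a file named .forward in your home directory. A minimal sketch, assuming the mail system on the login server honours the usual sendmail-style ~/.forward; the function name and the address are placeholders.

```shell
#!/bin/sh
# Write a .forward file that sends incoming mail to another address.
# usage: set_forward <address> [home_dir]
set_forward() {
    target="${2:-$HOME}/.forward"
    printf '%s\n' "$1" > "$target" && chmod 644 "$target"
}

# Example (not run here), with a placeholder address:
#   set_forward you@example.org
```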
| Last modified: Sep. 16, 2009 | T. Nakamura/S. Yokkaichi | Back to the CC-J Home page |