References

[1] Paul Albitz and Cricket Liu. DNS and BIND. O'Reilly & Associates, Inc., Sebastopol, CA 95472, 4th edition, 2001.

[2] Stephen F. Altschul, Thomas L. Madden, Alejandro A. Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res., 25:3389–3402, 1997.

[3] Sridhar Anandakrishnan. Penguins everywhere: GNU/Linux in Antarctica. IEEE Software, 16(6):90–96, Nov/Dec 1999.

[4] E. Anderson, Z. Bai, C. Bischof, J. Demmel, J. Dongarra, J. Du Croz, A. Greenbaum, S. Hammarling, A. McKenney, S. Ostrouchov, and D. Sorensen. LAPACK Users' Guide. SIAM, Philadelphia, 1992.

[5] Thomas E. Anderson, Michael D. Dahlin, Jeanna M. Neefe, David A. Patterson, Drew S. Roselli, and Randolph Y. Wang. Serverless network file systems. ACM Transactions on Computer Systems, 14(1):41–79, February 1996.

[6] Aztec home page. http://www.cs.sandia.gov/CRF/aztec1.html.

[7] Zhaojun Bai, James Demmel, Jack Dongarra, Axel Ruhe, and Henk van der Vorst. Templates for the Solution of Algebraic Eigenvalue Problems, A Practical Guide. SIAM, 2000.

[8] Satish Balay, Kris Buschelman, William D. Gropp, Dinesh Kaushik, Matt Knepley, Lois Curfman McInnes, Barry F. Smith, and Hong Zhang. PETSc web page. http://www.mcs.anl.gov/petsc, 2001.

[9] Satish Balay, Kris Buschelman, William D. Gropp, Dinesh Kaushik, Matt Knepley, Lois Curfman McInnes, Barry F. Smith, and Hong Zhang. PETSc users manual. Technical Report ANL-95/11 - Revision 2.1.5, Argonne National Laboratory, 2002.

[10] Satish Balay, William D. Gropp, Lois Curfman McInnes, and Barry F. Smith. Efficient management of parallelism in object oriented numerical software libraries. In E. Arge, A. M. Bruaset, and H. P. Langtangen, editors, Modern Software Tools in Scientific Computing, pages 163–202. Birkhauser Press, 1997.

[11] Daniel J. Barrett and Richard Silverman. SSH, The Secure Shell: The Definitive Guide. O'Reilly & Associates, Inc., Sebastopol, CA 95472, 1st edition, 2001.

[12] Richard Barrett, Michael Berry, Tony F. Chan, James Demmel, June Donato, Jack Dongarra, Victor Eijkhout, Roldan Pozo, Charles Romine, and Henk van der Vorst. Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods. SIAM, Philadelphia PA, 1994. http://www.netlib.org/templates/.

[13] Luiz André Barroso, Jeffrey Dean, and Urs Hölzle. Web search for a planet: The Google cluster architecture. IEEE Micro, 2003.

[14] David M. Beazley. Python Essential Reference. New Riders Publishing, second edition, 2001.

[15] L. S. Blackford, J. Choi, A. Cleary, E. D'Azevedo, J. Demmel, I. Dhillon, J. Dongarra, S. Hammarling, G. Henry, A. Petitet, K. Stanley, D. Walker, and R. C. Whaley. ScaLAPACK Users' Guide. SIAM, 1997.

[16] BLAS web page. http://www.netlib.org/blas.

[17] Peter J. Braam. The Lustre storage architecture. Technical report, Cluster File Systems, Inc., 2003.

[18] Tim Bray. Bonnie file system benchmark. http://www.textuality.com/bonnie/.

[19] Ron Brightwell, Tramm Hudson, Arthur B. Maccabe, and Rolf Riesen. The Portals 3.0 message passing interface. Technical Report SAND99-2959, Sandia Technical Report, November 1999.

[20] Surendra Byna, William Gropp, Xian-He Sun, and Rajeev Thakur. Improving the performance of MPI derived datatypes by optimizing memory-access cost. Technical Report ANL/MCS-P1045-0403, Mathematics and Computer Science Division, Argonne National Laboratory, 2003.

[21] B. Callaghan, B. Pawlowski, and P. Staubach. NFS version 3 protocol specification. Technical Report RFC 1813, Sun Microsystems, Inc., June 1995.

[22] Philip H. Carns, Walter B. Ligon III, Robert B. Ross, and Rajeev Thakur. PVFS: A parallel file system for Linux clusters. In Proceedings of the 4th Annual Linux Showcase and Conference, pages 317–327, Atlanta, GA, October 2000. USENIX Association.

[23] CERT web site. http://www.cert.org.

[24] Chaco web page. http://www.cs.sandia.gov/~bahendr/chaco.html.

[25] Albert Cheng and Michael Folk. HDF5: High performance science data solution for the new millennium. In ACM, editor, SC2000: High Performance Networking and Computing. Dallas Convention Center, Dallas, TX, USA, November 4–10, 2000, pages 149–149, New York, NY 10036, USA and 1109 Spring Street, Suite 300, Silver Spring, MD 20910, USA, 2000. ACM Press and IEEE Computer Society Press.

[26] Avery Ching, Alok Choudhary, Kenin Coloma, Wei keng Liao, Robert Ross, and William Gropp. Noncontiguous I/O accesses through MPI-IO. In Proceedings of the Third IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGrid2003), May 2003.

[27] Avery Ching, Alok Choudhary, Wei keng Liao, Robert Ross, and William Gropp. Noncontiguous I/O through PVFS. In Proceedings of the 2002 IEEE International Conference on Cluster Computing, September 2002.

[28] Douglas Comer. Internetworking with TCP/IP, Volume 1: Principles, Protocols, and Architecture. Prentice Hall, Inc., Englewood Cliffs, NJ 07632, 4th edition, 2000.

[29] Peter F. Corbett and Dror G. Feitelson. The Vesta parallel file system. In Hai Jin, Toni Cortes, and Rajkumar Buyya, editors, High Performance Mass Storage and Parallel I/O: Technologies and Applications, chapter 20, pages 285–308. IEEE Computer Society Press and Wiley, New York, NY, 2001.

[30] Cray Research. Application Programmer's Library Reference Manual, 2nd edition, November 1995. Publication SR-2165.

[31] David E. Culler, Richard M. Karp, David A. Patterson, Abhijit Sahay, Klaus E. Schauser, Eunice Santos, Ramesh Subramonian, and Thorsten von Eicken. LogP: towards a realistic model of parallel computation. ACM SIGPLAN Notices, 28(7):1–12, July 1993.

[32] I. S. Dhillon. A new O(n^2) Algorithm for the Symmetric Tridiagonal Eigenvalue/Eigenvector Problem. PhD thesis, Computer Science Division, University of California, Berkeley, California, 1997.

[33] Chris DiBona, Sam Ockman, and Mark Stone. Open Sources: Voices from the Open Source Revolution. O'Reilly & Associates, Inc., 1999.

[34] Jack Dongarra. Performance of various computers using standard linear equations software. Technical Report Number CS-89-85, University of Tennessee, Knoxville TN, 37996, 2001. http://www.netlib.org/benchmark/performance.ps.

[35] Jack J. Dongarra, Iain S. Duff, Danny C. Sorensen, and Henk A. van der Vorst. Solving Linear Systems on Vector and Shared Memory Computers. SIAM, Philadelphia, 1991.

[36] R. Evard, N. Desai, J. Navarro, and D. Nurmi. Clusters as large-scale development facilities. In Proceedings of the 2002 IEEE International Conference on Cluster Computing, September 2002.

[37] FFTW web page. http://www.fftw.org.

[38] Fluent web page. http://www.fluent.com.

[39] G. C. Fox, S. W. Otto, and A. J. G. Hey. Matrix algorithms on a hypercube I: Matrix multiplication. Parallel Computing, 4:17–31, 1987.

[40] Matteo Frigo and Steven G. Johnson. FFTW: An adaptive software architecture for the FFT. In Proc. 1998 IEEE Intl. Conf. Acoustics Speech and Signal Processing, volume 3, pages 1381–1384. IEEE, 1998.

[41] The Galley parallel file system. http://www.cs.dartmouth.edu/~dfk/nils//galley.html.

[42] Gaussian web page. http://www.gaussian.com.

[43] Al Geist, Adam Beguelin, Jack Dongarra, Weicheng Jiang, Bob Manchek, and Vaidy Sunderam. PVM: Parallel Virtual Machine—A User's Guide and Tutorial for Network Parallel Computing. MIT Press, Cambridge, Mass., 1994.

[44] W. Gropp and E. Lusk. Scalable Unix tools on parallel processors. In Proceedings of the Scalable High-Performance Computing Conference, May 23–25, 1994, Knoxville, Tennessee, pages 56–62, 1109 Spring Street, Suite 300, Silver Spring, MD 20910, USA, 1994. IEEE Computer Society Press.

[45] W. D. Gropp, D. K. Kaushik, D. E. Keyes, and B. F. Smith. Towards realistic performance bounds for implicit CFD codes. In Proceedings of Parallel CFD'99, pages 241–248, 1999.

[46] William Gropp, Steven Huss-Lederman, Andrew Lumsdaine, Ewing Lusk, Bill Nitzberg, William Saphir, and Marc Snir. MPI—The Complete Reference: Volume 2, The MPI-2 Extensions. MIT Press, Cambridge, MA, 1998.

[47] William Gropp, Ewing Lusk, Nathan Doss, and Anthony Skjellum. A high-performance, portable implementation of the MPI Message-Passing Interface standard. Parallel Computing, 22(6):789–828, 1996.

[48] William Gropp, Ewing Lusk, and Anthony Skjellum. Using MPI: Portable Parallel Programming with the Message Passing Interface, 2nd edition. MIT Press, Cambridge, MA, 1999.

[49] William Gropp, Ewing Lusk, and Debbie Swider. Improving the performance of MPI derived datatypes. In Anthony Skjellum, Purushotham V. Bangalore, and Yoginder S. Dandass, editors, Proceedings of the Third MPI Developer's and User's Conference, pages 25–30. MPI Software Technology Press, 1999.

[50] William Gropp, Ewing Lusk, and Rajeev Thakur. Using MPI-2: Advanced Features of the Message-Passing Interface. MIT Press, Cambridge, MA, 1999.

[51] William D. Gropp and Ewing Lusk. Reproducible measurements of MPI performance characteristics. In Jack Dongarra, Emilio Luque, and Tomás Margalef, editors, Recent Advances in Parallel Virtual Machine and Message Passing Interface, volume 1697 of Lecture Notes in Computer Science, pages 11–18. Springer Verlag, 1999. 6th European PVM/MPI Users' Group Meeting, Barcelona, Spain, September 1999.

[52] Michael Hasenstein. The logical volume manager (LVM). Technical Report Whitepaper, SuSE Inc., 2001.

[53] Don Heller. Rabbit: A performance counters library for Intel/AMD processors and Linux. www.scl.ameslab.gov/Projects/Rabbit/.

[54] J. M. D. Hill, B. McColl, D. C. Stefanescu, M. W. Goudreau, K. Lang, S. B. Rao, T. Suel, T. Tsantilas, and R. H. Bisseling. BSPlib: The BSP programming library. Parallel Computing, 24(14):1947–1980, December 1998.

[55] James V. Huber, Jr., Christopher L. Elford, Daniel A. Reed, Andrew A. Chien, and David S. Blumenthal. PPFS: A high performance portable parallel file system. In Hai Jin, Toni Cortes, and Rajkumar Buyya, editors, High Performance Mass Storage and Parallel I/O: Technologies and Applications, chapter 22, pages 330–343. IEEE Computer Society Press and Wiley, New York, NY, 2001.

[56] Craig Hunt. TCP/IP Network Administration. O'Reilly & Associates, Inc., Sebastopol, CA 95472, 3rd edition, 2002.

[57] S. A. Hutchinson, J. N. Shadid, and R. S. Tuminaro. Aztec user's guide: Version 1.1. Technical Report SAND95-1559, Sandia National Laboratories, 1995.

[58] IEEE/ANSI Std. 1003.1. Portable operating system interface (POSIX)-part 1: System application program interface (API) [C language], 1996 edition.

[59] Iperf home page. http://dast.nlanr.net/projects/iperf.

[60] Alan H. Karp. Bit reversal on uniprocessors. SIAM Review, 38(1):1–26, March 1996.

[61] Jeffrey Kephart and David Chess. The vision of autonomic computing. IEEE Computer, pages 41–50, January 2003.

[62] David Kotz. Disk-directed I/O for MIMD multiprocessors. In Hai Jin, Toni Cortes, and Rajkumar Buyya, editors, High Performance Mass Storage and Parallel I/O: Technologies and Applications, chapter 35, pages 513–535. IEEE Computer Society Press and John Wiley & Sons, 2001.

[63] LAPACK software. http://www.netlib.org/lapack.

[64] C. Lawson, R. Hanson, D. Kincaid, and F. Krogh. Basic linear algebra subprograms for FORTRAN usage. Transactions on Mathematical Software, 5:308–323, 1979.

[65] Edward K. Lee and Chandramohan A. Thekkath. Petal: Distributed virtual disks. In Proceedings of the Seventh International Conference on Architectural Support for Programming Languages and Operating Systems, pages 84–92, Cambridge, MA, October 1996.

[66] J. Li, W.-K. Liao, A. Choudhary, R. Ross, R. Thakur, W. Gropp, and R. Latham. Parallel netCDF: A scientific high-performance I/O interface. Technical Report ANL/MCS-P1048-0503, Mathematics and Computer Science Division, Argonne National Laboratory, May 2003.

[67] Xiaoye S. Li. Sparse Gaussian Elimination on High Performance Computers. PhD thesis, University of California at Berkeley, 1996.

[68] Josip Loncaric. Linux 2.2.12 TCP performance fix for short messages. www.icase.edu/coral/LinuxTCP2.html. This web site is no longer available.

[69] LS-Dyna web page. http://www.lstc.com.

[70] Alex Martelli and David Ascher, editors. Python Cookbook. O'Reilly and Associates, 2002.

[71] John D. McCalpin. STREAM: Sustainable memory bandwidth in high performance computers. http://www.cs.virginia.edu/stream/.

[72] Message Passing Interface Forum. MPI: A Message-Passing Interface standard. International Journal of Supercomputer Applications, 8(3/4):165–414, 1994.

[73] Message Passing Interface Forum. MPI2: A message passing interface standard. International Journal of High Performance Computing Applications, 12(1–2):1–299, 1998.

[74] Jeffrey Mogul and Steve Deering. Path MTU discovery. Technical Report IETF RFC 1191, Digital Equipment Corporation WRL and Stanford University, November 1990. http://www.ietf.org/rfc/rfc1191.txt.

[75] P. Mucci, S. Brown, C. Deane, and G. Ho. PAPI: A portable interface to hardware performance counters. icl.cs.utk.edu/projects/papi/.

[76] NAMD web page. http://www.ks.uiuc.edu/Research/namd/.

[77] Nastran web page. http://www.mscsoftware.com/products/products_detail.cfm?S=74&PI=7&M=0.

[78] Nils Nieuwejaar and David Kotz. The Galley parallel file system. Parallel Computing, 23(4):447–476, June 1997.

[79] Nils Nieuwejaar, David Kotz, Apratim Purakayastha, Carla Schlatter Ellis, and Michael Best. File-access characteristics of parallel scientific workloads. IEEE Transactions on Parallel and Distributed Systems, 7(10):1075–1089, October 1996.

[80] Bill Nowicki. NFS: Network file system protocol specification. Technical Report RFC 1094, Sun Microsystems, Inc., March 1989.

[81] NWChem web page. http://www.emsl.pnl.gov:2080/docs/nwchem/nwchem.html.

[82] Emil Ong, Ewing Lusk, and William Gropp. Scalable Unix commands for parallel processors: A high-performance implementation. In Jack Dongarra and Yiannis Cotronis, editors, Proceedings of Euro PVM/MPI. Springer Verlag, 2001.

[83] OpenMP Web page. www.openmp.org.

[84] ParMetis web page. http://www-users.cs.umn.edu/~karypis/metis/parmetis/index.html.

[85] Chrisila Pettey, Ralph Butler, Brad Rudnik, and Thomas Naughton. A rapid recovery Beowulf platform. In Henry Selvaraj and Venkatesan Muthukumar, editors, Proceedings of Fifteenth International Conference on Systems Engineering, pages 278–283, 2002.

[86] PLAPACK web page. http://www.cs.utexas.edu/users/plapack/.

[87] Jon Postel, editor. Transmission control protocol. Technical Report IETF RFC 793, Information Sciences Institute, University of Southern California, September 1981. http://www.ietf.org/rfc/rfc0793.txt.

[88] Kenneth W. Preslan, Andrew Barry, Jonathan E. Brassow, Russell Cattelan, Adam Manthei, Erling Nygaard, Seth Van Oort, David C. Teigland, Mike Tilstra, Matthew O'Keefe, Grant Erickson, and Manish Agarwal. A 64-bit, shared disk file system for Linux. In Proceedings of the Eighth NASA Goddard Conference on Mass Storage Systems and Technologies, March 2000.
