
%
%
\documentclass[11pt, letterpaper]{article}
\usepackage[dvips]{graphicx}
%\usepackage{html}
\usepackage{epsfig}
\usepackage{rotating}
\usepackage{times}
\pagestyle{empty}

%
% GET THE MARGINS RIGHT, THE UGLY WAY
%
\topmargin 0.0in
\textwidth 6.5in
\textheight 9.0in
\columnsep 0.25in
\oddsidemargin 0.0in
\evensidemargin 0.0in
\headsep 0.0in
\headheight 0.0in


\title{A Quick Start Guide to PVFS2}
\author{ PVFS2 Development Team }

%
% BEGINNING OF DOCUMENT
%
\begin{document}
\maketitle

\tableofcontents

\newpage

\thispagestyle{empty}

\section{How to use this document}
\label{sec:howto}

The quick start guide is intended to be a reference on how to quickly
install and configure a PVFS2 file system.  It is broken down into
three parts.  The first describes how to download and compile the
PVFS2 software.  The next section walks through the steps of
configuring PVFS2 to store and access files on a single host, which
may be useful for simple testing and evaluation.  The final section of
this document describes how to install and configure PVFS2 in a true
cluster environment with multiple servers and/or clients.

\subsection{Versions}

This document only applies to the most recent snapshot of PVFS2.

\section{Downloading and compiling PVFS2}

Follow the instructions at http://www.pvfs.org/pvfs2/download.html to
obtain the source code.  Once it is downloaded, compiling PVFS2 is a
matter of running './configure' followed by 'make' from the top level
source directory.  More detailed instructions for building and
installing are provided below.

\subsection{Dependencies}

The following software packages are currently required by PVFS2:
\begin{itemize}
\item Berkeley DB with development libraries (version 3 or 4)
\item aio support (provided by glibc and librt)
\item pthreads
\item gcc 2.96 or newer (DO NOT USE gcc 2.95! gcc 3.x recommended)
\item GNU Make
\item flex
\item bison
\item kernel sources (for client kernel interface)
\item GTK+ (for Karma)
\end{itemize}
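
As a quick sanity check before running configure, a small shell sketch
such as the following can report which of the command line build tools
listed above cannot be found in your PATH.  The function name
\texttt{missing\_tools} is purely illustrative and is not part of PVFS2,
and libraries such as Berkeley DB or GTK+ are not covered by this check.

\begin{verbatim}
# Print the name of every given tool that is not found in PATH.
# (missing_tools is a hypothetical helper, not a PVFS2 utility.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

missing=$(missing_tools gcc make flex bison)
if [ -n "$missing" ]; then
    echo "Missing build tools:" $missing
else
    echo "All checked build tools were found."
fi
\end{verbatim}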

The following software packages are currently recommended for use with PVFS2:
\begin{itemize}
\item GNU Libc (glibc) 2.3.2 [ or later ]
\item Linux kernel version 2.6.0 (or later) or 2.4.19 (or later) (NOTE: not
necessary for running PVFS2 servers, only the client kernel module).
\item A GNU/Linux environment (heterogeneous configurations are
supported)
\end{itemize}

ROMIO supports PVFS2.  It is not provided with pvfs2, but can be found
as part of the following MPI implementations:

\begin{itemize}
\item MPICH2-0.96p2 or newer, though we suggest using the most recent MPICH2
release
\item OpenMPI-1.0 or newer, though it may not have some of the bug fixes or
features of the MPICH2 version
\end{itemize}

\subsection{Untarring the packages}

All source code is contained in one tarball: pvfs2-x.x.x.tar.gz.  The
following example assumes that you will be building in the /usr/src
directory, although that is not required:

\begin{verbatim}
[root@testhost /root]# cp pvfs2-x.x.x.tar.gz /usr/src
[root@testhost /root]# cd /usr/src
[root@testhost /usr/src]# tar -xzf pvfs2-x.x.x.tar.gz
[root@testhost /usr/src]# ln -s pvfs2-x.x.x pvfs2
[root@testhost /usr/src]# ls -lF
total 476
lrwxrwxrwx   1 root    root         15 Aug 14 17:42 pvfs2 -> pvfs2-x.x.x/
drwxr-xr-x  12 root    root        512 Aug 14 10:11 pvfs2-x.x.x/
-rw-r--r--   1 root    root     371535 Aug 14 17:41 pvfs2-x.x.x.tar.gz
\end{verbatim}

\subsection{Building and installing the packages}

The default steps for building and installing PVFS2 are as follows:

\begin{verbatim}
[root@testhost /usr/src]# cd pvfs2
[root@testhost /usr/src/pvfs2-XXX]# ./configure
[root@testhost /usr/src/pvfs2-XXX]# make
[root@testhost /usr/src/pvfs2-XXX]# make install
\end{verbatim}

Here are some optional configure arguments which may be of interest:
\begin{itemize}
\item --prefix=$<$path$>$: installs all files in the specified
directory (/usr/local/ is the default if --prefix is not specified)
\item --with-kernel=$<$path to 2.6.x kernel source$>$: this enables
compilation of the PVFS2 Linux kernel driver [ Requires Linux Kernel
2.6.0 or later ]
\item --with-kernel24=$<$path to 2.4.x kernel source$>$: this enables
compilation of the PVFS2 Linux kernel driver [ Requires Linux Kernel
2.4.19 or later ]
\item --with-mpi=$<$path to mpi installation$>$: this enables
compilation of MPI based test programs
\item --with-efence: automatically links in Electric Fence for
debugging assistance
\end{itemize}

Also note that the pvfs2 2.6.x kernel source supports out-of-tree
builds if you prefer to use that technique.

\section{Configuring PVFS2 for a single host}
\label{sec:single}

This section documents the steps required to configure PVFS2 on a system
in which a single machine acts as both the client and server for all
PVFS2 operations.  It assumes that you have completed the above sections
on building and installation already.  The hostname of the example machine
is ``testhost'' and will be referenced as such in the following examples.

IMPORTANT: if you intend to use the provided rc scripts to handle startup
and shutdown of the PVFS2 server, then you must specify a valid hostname
as reported by the \texttt{hostname} command line tool in the configuration.
For this reason, we recommend that you \emph{not} use ``localhost'' as
the hostname of your server, even if you intend to test on only one machine.

We will store all PVFS2 data in /pvfs2-storage-space.  /mnt/pvfs2 will
serve as the mount point for the file system.  For more details about
the purpose of these directories please see the PVFS2 Users Guide.

\subsection{Server configuration}

Since this is a single host configuration, we only have to configure
one server daemon.  In the original PVFS, the metadata and I/O servers
were separated into two separate programs (mgr and iod).  PVFS2,
however, has only a single daemon called pvfs2-server which serves
both roles.

The most important part of server configuration is simply generating
the configuration files.  These can be created using the
pvfs2-genconfig script.  This is an interactive script which will ask
several questions to determine your desired configuration.  Please pay
particular attention to the listing of the metadata servers and I/O
servers.  In this example we will use ``testhost'' for both.

The pvfs2-genconfig tool will generate a single file system configuration
file that will be identical for all servers.  This script should be
executed as root, so that we can place the configuration file in its
default /etc/ location.

In this simple configuration, we can accept the default options for
every field.  We will use the hostname ``testhost'' rather than
``localhost'' however.

\begin{verbatim}
root@testhost:~# /usr/bin/pvfs2-genconfig  \
	/etc/pvfs2-fs.conf
**********************************************************************
    Welcome to the PVFS2 Configuration Generator:

This interactive script will generate configuration files suitable
for use with a new PVFS2 file system.  Please see the PVFS2 quickstart
guide for details.

**********************************************************************
You must first select the network protocol that your file system will use.
The only currently supported options are "tcp", "gm", "mx", "ib", and "portals".
(For multi-homed configurations, use e.g. "ib,tcp".)

* Enter protocol type [Default is tcp]:

Choose a TCP/IP port for the servers to listen on.  Note that this
script assumes that all servers will use the same port number.

* Enter port number [Default is 3334]:

Choose a directory for each server to store data in.

* Enter directory name: [Default is /pvfs2-storage-space]:

Choose a directory for each server to store metadata in.

* Enter directory name: [Default is /pvfs2-storage-space]:

Choose a file for each server to write log messages to.

* Enter log file location [Default is /tmp/pvfs2-server.log]:

Next you must list the hostnames of the machines that will act as
I/O servers.  Acceptable syntax is "node1, node2, ..." or "node{#-#,#,#}".

* Enter hostnames [Default is localhost]: testhost

Use same servers for metadata? (recommended)

* Enter yes or no [Default is yes]:

Configured a total of 1 servers:
1 of them are I/O servers.
1 of them are Metadata servers.

* Would you like to verify server list (y/n) [Default is n]?

Writing fs config file... done
\end{verbatim}

The generated config file will have conservative default values.  The PVFS2
Users Guide has more information about the settings and the consequences of
setting more aggressive, high performance values.

\subsection{Starting the server}

Before you run pvfs2-server for the first time, you must run it with a special
argument that tells it to create a new storage space if it does not already
exist.  In this example, we must run the server as root in order to create
a storage space in /pvfs2-storage-space as specified in the configuration
files.

\begin{verbatim}
bash-2.05b# /usr/sbin/pvfs2-server /etc/pvfs2-fs.conf -f
\end{verbatim}

Once the above step is done, you can start the server in normal mode
as follows:

\begin{verbatim}
bash-2.05b# /usr/sbin/pvfs2-server /etc/pvfs2-fs.conf
\end{verbatim}

All log messages will be directed to /tmp/pvfs2-server.log, unless you specified
a different location while running pvfs2-genconfig.  If you would prefer to run
pvfs2-server in the foreground and direct all messages to stderr, then
you may run the server as follows:

\begin{verbatim}
bash-2.05b# /usr/sbin/pvfs2-server /etc/pvfs2-fs.conf -d
\end{verbatim}

On startup, the PVFS2 server uses the hostname of the machine that it
is running on to determine necessary information from the configuration
file.  If the hostname doesn't match any of the addresses specified in
the config file, then you must use the -a option.  For example,
each of the above command lines could include ``-a testhost'' to
specify that the server is using the \texttt{testhost} alias in the
configuration file.
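
To decide whether the \texttt{-a} option is needed, you can check whether
the local hostname appears in the config file.  The following is a minimal
sketch only: it assumes that server addresses appear in the config file in
the \texttt{tcp://host:port} form shown by pvfs2-genconfig, and the function
name \texttt{config\_mentions\_host} is hypothetical.

\begin{verbatim}
# Succeed if the file CONF ($2) contains an address of the form "://HOST:".
# Assumption: addresses look like tcp://testhost:3334 in the config file.
config_mentions_host() {
    grep -F -q "://$1:" "$2"
}

conf=/etc/pvfs2-fs.conf
if [ ! -r "$conf" ]; then
    echo "cannot read $conf"
elif config_mentions_host "$(hostname)" "$conf"; then
    echo "hostname found in $conf; -a should not be needed"
else
    echo "hostname not found in $conf; start the server with -a <alias>"
fi
\end{verbatim}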

\subsubsection{Automatic server startup and shutdown}
\label{sec:rc}

Like most other system services, PVFS2 may be started up automatically
at boot up time through the use of rc scripts.  We have provided one
such script that is suitable for use on RedHat (or similar) rc
systems.  The following example demonstrates how to set this up:

\begin{verbatim}
bash-2.05b# cp /usr/src/pvfs2/examples/pvfs2-server.rc \
    /etc/init.d/pvfs2-server
bash-2.05b# chmod a+x /etc/init.d/pvfs2-server
bash-2.05b# chkconfig pvfs2-server on
bash-2.05b#  ls -al /etc/rc3.d/S35pvfs2-server
lrwxrwxrwx  1 root  root   22 Sep 21 13:11 /etc/rc3.d/S35pvfs2-server \
    -> ../init.d/pvfs2-server
\end{verbatim}

This script will now automatically launch on startup and shutdown to
ensure that the pvfs2-server is started and stopped gracefully.
To manually start the server, you can run the following command:

\begin{verbatim}
bash-2.05b# /etc/init.d/pvfs2-server start
Starting PVFS2 server:                                     [  OK  ]
\end{verbatim}

To manually stop the server:

\begin{verbatim}
bash-2.05b# /etc/init.d/pvfs2-server stop
Stopping PVFS2 server:                                     [  OK  ]
\end{verbatim}

\subsection{Client configuration}
\label{subsec:client}

There are two primary methods for accessing a PVFS2 file system.  The first is
using the kernel module to provide standard Linux file system compatibility.
This interface allows the user to run existing binaries and system utilities
on PVFS2 without recompiling.  The second is through the MPI-IO interface,
which is built on top of the \texttt{libpvfs2} library and allows for higher
performance for parallel applications.

Both of these methods require the same bit of information on the client to
tell the client where to find the PVFS2 file system (or systems).  The
information is presented in the same way as an \texttt{fstab (5)} entry:

\begin{verbatim}
tcp://testhost:3334/pvfs2-fs /mnt/pvfs2 pvfs2 defaults,noauto 0 0
\end{verbatim}

The entry lists a PVFS2 server (\texttt{tcp://testhost:3334/pvfs2-fs}) and a
mount point (\texttt{/mnt/pvfs2}) on the client.  See the \texttt{fstab (5)}
man page for more information on the format of these lines.

We must create a mount point for the file system as well as an {\tt
/etc/pvfs2tab} entry that will be used by the PVFS2 libraries to
locate the file system.  The {\tt pvfs2tab} file is analogous to the
{\tt /etc/fstab} file that most Linux systems use to keep track of file
system mount points.

\begin{verbatim}
[root@testhost /root]# mkdir /mnt/pvfs2
[root@testhost /root]# touch /etc/pvfs2tab
[root@testhost /root]# chmod a+r /etc/pvfs2tab
\end{verbatim}

Now edit this file so that it contains the following, except that you should
substitute your host name in place of ``testhost'':

\begin{verbatim}
tcp://testhost:3334/pvfs2-fs /mnt/pvfs2 pvfs2 defaults,noauto 0 0
\end{verbatim}
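
If you prefer to generate this line rather than type it, the following
sketch builds the entry from its parts.  The function name
\texttt{pvfs2tab\_entry} is hypothetical; the defaults for the port, file
system name, and mount point are the ones used in this example.

\begin{verbatim}
# Print a pvfs2tab entry for HOST [PORT] [FSNAME] [MOUNTPOINT].
# (pvfs2tab_entry is an illustrative helper, not a PVFS2 utility.)
pvfs2tab_entry() {
    host=$1
    port=${2:-3334}
    fsname=${3:-pvfs2-fs}
    mnt=${4:-/mnt/pvfs2}
    echo "tcp://$host:$port/$fsname $mnt pvfs2 defaults,noauto 0 0"
}

pvfs2tab_entry testhost
\end{verbatim}

Redirecting the output to /etc/pvfs2tab as root stores the entry in the
file created above.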

There are a few alternatives to using an /etc/pvfs2tab which may be useful
in production environments:
\begin{itemize}
\item One could put this entry in \texttt{/etc/fstab} file instead of
\texttt{/etc/pvfs2tab}.
\item One could avoid static tab file entries entirely and let the pvfs2 tools
detect file systems that have been mounted using the Linux kernel
driver.  This approach only works if you use the 2.6 Linux kernel or
install the mount.pvfs2 utility on 2.4 Linux kernel systems.
\end{itemize}

\subsection{Testing your installation}
\label{subsec:testing}
PVFS2 currently includes (among others) the following tools for
manipulating the file system using the native PVFS2 library:
pvfs2-ping, pvfs2-cp, and pvfs2-ls.  These tools
check the health of the file system, copy files to and from a PVFS2 file system, and list the
contents of directories, respectively.  Their usage
can best be summarized with the following examples:

\begin{verbatim}
bash-2.05b# ./pvfs2-ping -m /mnt/pvfs2

(1) Searching for /mnt/pvfs2 in /etc/pvfs2tab...

   Initial server: tcp://testhost:3334
   Storage name: pvfs2-fs
   Local mount point: /mnt/pvfs2

(2) Initializing system interface and retrieving configuration from server...

   meta servers (duplicates are normal):
   tcp://testhost:3334

   data servers (duplicates are normal):
   tcp://testhost:3334

(3) Verifying that all servers are responding...

   meta servers (duplicates are normal):
   tcp://testhost:3334 Ok

   data servers (duplicates are normal):
   tcp://testhost:3334 Ok

(4) Verifying that fsid 9 is acceptable to all servers...

   Ok; all servers understand fs_id 9

(5) Verifying that root handle is owned by one server...

   Root handle: 0x00100000
   Ok; root handle is owned by exactly one server.

=============================================================

The PVFS2 filesystem at /mnt/pvfs2 appears to be correctly configured.

bash-2.05b# ./pvfs2-ls /mnt/pvfs2/

bash-2.05b# ./pvfs2-cp -t /usr/lib/libc.a /mnt/pvfs2/testfile
Wrote 2310808 bytes in 0.264689 seconds. 8.325842 MB/seconds

bash-2.05b# ./pvfs2-ls /mnt/pvfs2/
testfile

bash-2.05b# ./pvfs2-ls -alh /mnt/pvfs2/
drwxrwxrwx    1 pcarns  users            0 2003-08-14 22:45 .
drwxrwxrwx    1 pcarns  users            0 2003-08-14 22:45 .. (faked)
-rw-------    1 root    root            2M 2003-08-14 22:47 testfile

bash-2.05b# ./pvfs2-cp -t /mnt/pvfs2/testfile /tmp/testfile-out
Wrote 2310808 bytes in 0.180621 seconds. 12.201016 MB/seconds

bash-2.05b# diff /tmp/testfile-out /usr/lib/libc.a
\end{verbatim}
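
The copy, copy back, and compare steps at the end of this transcript can be
wrapped in a small helper if you want to repeat the check.  This is only a
sketch: the function name \texttt{verify\_roundtrip} is hypothetical, and it
uses \texttt{cmp -s} in place of \texttt{diff} so that only the exit status
reports the result.

\begin{verbatim}
# verify_roundtrip COPY SRC REMOTE BACK
# Copy SRC to REMOTE and back to BACK with the command COPY, then
# succeed only if BACK is byte-for-byte identical to SRC.
verify_roundtrip() {
    copy=$1; src=$2; remote=$3; back=$4
    $copy "$src" "$remote" &&
        $copy "$remote" "$back" &&
        cmp -s "$src" "$back"
}
\end{verbatim}

With the file names used above, this could be invoked as
\texttt{verify\_roundtrip pvfs2-cp /usr/lib/libc.a /mnt/pvfs2/testfile
/tmp/testfile-out}.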

\section{Installing PVFS2 on a cluster}
\label{sec:cluster}
It is important to have in mind the roles that machines (a.k.a. nodes) will
play in the PVFS2 system. There are three potential roles that a machine might
play: metadata server,  I/O server, or client.

A metadata server is a node that keeps track of metadata (such as permissions
and time stamps) for the file system. An I/O server is a node that actually
stores a portion of the PVFS2 file data. A client is a node that can read and
write PVFS2 files. Your applications will typically be run on PVFS2 clients so
that they can access the file system.

A machine can fill one, two, or all of these roles simultaneously. Unlike
PVFS-1, each server role requires just the pvfs2-server binary.  It consults
the file system config file when it starts up to determine which role (or
roles) it should perform on this machine.

There can be many metadata servers, I/O servers, and clients. In this section
we will discuss the components and configuration files needed to fulfill each
role.

We will configure our example system so that the node ``cluster1'' provides
metadata information, eight nodes (named ``cluster1'' through ``cluster8'')
provide I/O services, and all nodes act as clients.

\subsection{Server configuration}
\label{sec:server-config}

We will assume that at this point you have either performed a make install
on every node, or else have provided the pvfs2 executables, headers, and
libraries to each machine by some other means.

Installing PVFS2 on a cluster is quite similar to installing it on a single
machine, so familiarize yourself with Section~\ref{sec:single}.  We are going
to generate a single config file that will be shared by all eight servers.
Again, remember that it is critical to list correct hostnames for each machine,
and to make sure that these hostnames match the output of the \texttt{hostname}
command on each machine that will act as a server.

\begin{verbatim}
root@cluster1:~# /usr/local/pvfs2/bin/pvfs2-genconfig  \
	/etc/pvfs2-fs.conf
**********************************************************************
        Welcome to the PVFS2 Configuration Generator:

This interactive script will generate configuration files suitable
for use with a new PVFS2 file system.  Please see the PVFS2 quickstart
guide for details.

**********************************************************************

You must first select the network protocol that your file system will use.
The only currently supported options are "tcp" and "gm".

* Enter protocol type [Default is tcp]:

Choose a TCP/IP port for the servers to listen on.  Note that this
script assumes that all servers will use the same port number.

* Enter port number [Default is 3334]:

Next you must list the hostnames of the machines that will act as
I/O servers.  Acceptable syntax is "node1, node2, ..." or "node{#-#,#,#}".

* Enter hostnames [Default is localhost]: cluster{1-8}

Now list the hostnames of the machines that will act as Metadata
servers.  This list may or may not overlap with the I/O server list.

* Enter hostnames [Default is localhost]: cluster1

Configured a total of 8 servers:
8 of them are I/O servers.
1 of them are Metadata servers.

* Would you like to verify server list (y/n) [Default is n]? y

****** I/O servers:
tcp://cluster1:3334
tcp://cluster2:3334
tcp://cluster3:3334
tcp://cluster4:3334
tcp://cluster5:3334
tcp://cluster6:3334
tcp://cluster7:3334
tcp://cluster8:3334

****** Metadata servers:
tcp://cluster1:3334

* Does this look ok (y/n) [Default is y]? y

Choose a file for each server to write log messages to.

* Enter log file location [Default is /tmp/pvfs2-server.log]:

Choose a directory for each server to store data in.

* Enter directory name: [Default is /pvfs2-storage-space]:

Writing fs config file... Done.

Configuration complete!
\end{verbatim}

The generated config files will have conservative default values.  The PVFS2
Users Guide has more information about the settings and the consequences of
setting more aggressive, high performance values.

We have now generated the config file for an 8-node storage cluster:
\begin{verbatim}
root@cluster1:~# ls /etc/pvfs2-fs.conf
/etc/pvfs2-fs.conf
\end{verbatim}

Now the config file must be copied out to all of the server nodes.  If you
use the provided (RedHat style) rc scripts, then you can simply copy the same
config file to every node; each server will find its own entry in the file
based on its own hostname at startup time.  The following example assumes
that you will use scp to copy files to cluster nodes.  Other possibilities
include rcp, bpcp, or simply storing the configuration files on an NFS volume.
Please note, however, that the rc script should be modified if you intend
to store config files in any location other than the default /etc/.

At this time, we will also copy out the example rc script and enable it on
each machine.

\begin{verbatim}
root@cluster1:~# for i in `seq 1 8`; do
> scp /etc/pvfs2-fs.conf cluster${i}:/etc/
> scp /usr/src/pvfs2/examples/pvfs2-server.rc \
    cluster${i}:/etc/init.d/pvfs2-server
> ssh cluster${i} /sbin/chkconfig pvfs2-server on
> done
\end{verbatim}

\subsection{Starting the servers}

As with the single-machine case, you must run pvfs2-server with a
special argument to create the storage space on all the nodes if it
does not already exist.  Run the following command on every metadata
or I/O node in the cluster:

\begin{verbatim}
root@cluster1# /usr/sbin/pvfs2-server /etc/pvfs2-fs.conf -f
\end{verbatim}

Then once the storage space is created, start the server for real with a
command like this on every metadata or I/O node in the cluster:

\begin{verbatim}
root@cluster1# /usr/sbin/pvfs2-server /etc/pvfs2-fs.conf
\end{verbatim}
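
Since the same two commands must be run on every server node, it can be
convenient to generate them in a loop.  The sketch below only prints the
ssh command lines for the example nodes ``cluster1'' through ``cluster8''
so that they can be reviewed before anything is run.  Root ssh access to
each node and the /usr/sbin install path are assumptions carried over from
the examples above, and the function name \texttt{print\_start\_commands}
is hypothetical.

\begin{verbatim}
# print_start_commands FIRST LAST [CONF]
# Print (do not run) the commands that create the storage space and
# then start pvfs2-server on cluster<FIRST> .. cluster<LAST>.
print_start_commands() {
    i=$1
    last=$2
    conf=${3:-/etc/pvfs2-fs.conf}
    while [ "$i" -le "$last" ]; do
        echo "ssh cluster$i /usr/sbin/pvfs2-server $conf -f"
        echo "ssh cluster$i /usr/sbin/pvfs2-server $conf"
        i=$((i + 1))
    done
}

print_start_commands 1 8
\end{verbatim}

Once the printed commands look correct, the output can be piped to
\texttt{sh} to execute them.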

If you want to run the server in the foreground (e.g. for debugging), use the
-d option.

If you wish to automate server startup and shutdown with rc scripts, refer
to Section~\ref{sec:rc} from the single server example.

\subsection{Client configuration}

Setting up a client for multiple servers is the same as setting up a client
for a single server.  Refer to Section~\ref{subsec:client}.

The \texttt{/etc/pvfs2tab} file (or an \texttt{/etc/fstab} entry) needs to
exist on each client so that each client can find the file system.  The server
listed for each client can be different; any server in the PVFS2 file system
will do.  For large clusters, using different server names will eliminate one
potential bottleneck in the system by balancing the load of clients reading
initial configuration information.
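
One simple way to spread clients across servers is a round-robin
assignment.  The following sketch assumes the eight example servers
``cluster1'' through ``cluster8'' and numbers the clients starting from 1;
the function name \texttt{server\_for\_client} is hypothetical.

\begin{verbatim}
# server_for_client N NSERVERS
# Print the server name for client number N (1-based), cycling through
# NSERVERS servers named cluster1 .. cluster<NSERVERS>.
server_for_client() {
    echo "cluster$(( ($1 - 1) % $2 + 1 ))"
}

# Example: the pvfs2tab line for the 10th client of an 8-server system.
srv=$(server_for_client 10 8)
echo "tcp://$srv:3334/pvfs2-fs /mnt/pvfs2 pvfs2 defaults,noauto 0 0"
\end{verbatim}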

\subsection{Testing your installation}

Testing a multiple-server PVFS2 installation is the same as testing a
single-server PVFS2 installation.  Refer to Section~\ref{subsec:testing}.

\section{The PVFS2 Linux Kernel Interface}
\label{sec:kernel-interface}

\subsection{Finding an Appropriate Kernel Version}
\label{sec:kernel-check}

Now that you've mastered the download and installation steps of
managing the userspace PVFS2 source code, configuring the PVFS2 Linux
Kernel Interface is relatively straightforward.  We assume at this
point that you are familiar with running the server and that a PVFS2
storage space has already been created on the node that you would like
to configure for use with the VFS.

A Linux 2.6.0 kernel or later is recommended for the kernel interface,
although 2.4.x kernel support has been added for systems that require
it.  If you're using a 2.4.x kernel, you must be running 2.4.19 or
later, as previous versions are NOT (and will not be) supported.

The following examples assume that you've already downloaded,
compiled, and are now running the Linux kernel located in the
/usr/src/linux-2.x.x directory on your system.

Before compiling the kernel module against your running kernel, check
to make sure that you are running an appropriate kernel version.  You
can do this in the following manner:

\begin{verbatim}
lain linux # cat /proc/version
Linux version 2.6.6 (root@lain.mcs.anl.gov) (gcc version 3.3.3
20040412 (Gentoo Linux 3.3.3-r5, ssp-3.3-7, pie-8.7.5.3)) #3 SMP Wed
May 26 16:22:11 CDT 2004
\end{verbatim}

By issuing that command, we are able to inspect the output to ensure
that we're running an appropriate kernel version.  If your kernel is
older than 2.6.0 (for 2.6.x kernels) or 2.4.19 (for 2.4.x kernels),
please download and install a later kernel version (or submit a
request to your site's System Administrator).
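
The version rule above can also be expressed as a small shell function.
This is a minimal sketch: the name \texttt{kernel\_supported} is
hypothetical, it looks only at the numeric fields of the version string,
and it treats any major version above 2 as ``later than 2.6.0'', which is
an assumption that this guide does not confirm.

\begin{verbatim}
# kernel_supported VERSION
# Succeed if VERSION (e.g. the output of "uname -r") is 2.6.0 or later,
# or a 2.4.x release that is 2.4.19 or later.
kernel_supported() {
    ver=${1%%-*}                 # drop a suffix such as "-1.667"
    major=${ver%%.*}
    rest=${ver#*.}
    minor=${rest%%.*}
    case "$rest" in
        *.*) patch=${rest#*.}; patch=${patch%%[!0-9]*} ;;
        *)   patch=0 ;;
    esac
    [ -n "$patch" ] || patch=0
    if [ "$major" -gt 2 ]; then return 0; fi   # assumption, see above
    if [ "$major" -lt 2 ]; then return 1; fi
    if [ "$minor" -ge 6 ]; then return 0; fi
    if [ "$minor" -eq 4 ] && [ "$patch" -ge 19 ]; then return 0; fi
    return 1
}

if kernel_supported "$(uname -r)"; then
    echo "running kernel passes the version check"
else
    echo "running kernel is too old for the PVFS2 kernel module"
fi
\end{verbatim}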

For reference, you can download Linux kernels at:
\begin{verbatim}
2.6.x kernels: http://www.kernel.org/pub/linux/kernel/v2.6/
2.4.x kernels: http://www.kernel.org/pub/linux/kernel/v2.4/
\end{verbatim}

Once you're convinced the Linux kernel version is appropriate, it's
time to compile the PVFS2 kernel module.

\subsection{Preparing Linux Kernel 2.6.x configurations}
\label{sec:vfs-configure}

To generate the Makefile, you need to make sure that you run
'./configure' with the '--with-kernel=path' argument.  An example is
provided here for your convenience:

\begin{verbatim}
gil:/usr/src/pvfs2# ./configure --with-kernel=/usr/src/linux-2.6.0
\end{verbatim}

Note that you can often find a kernel source tree (or a symlink to the
right place) at \texttt{/lib/modules/`uname -r`/build}.  For example, if you were
running the default Fedora 3 kernel (linux-2.6.9-1.667) you would find the
kernel tree in \texttt{/lib/modules/2.6.9-1.667/build}.

After this configure command is issued, build the PVFS2 source tree if it
has not yet been built.

Building the 2.6.x kernel module requires an extra step.  Since
current kernels require writing a few files in the kernel source
directory to build a module, you may have to become root to compile
the kernel module.  To build the module, type ``make kmod''.

At this point, we have a valid PVFS2 2.6.x Kernel module.  The module
itself is the file {\tt pvfs2.ko} in subdirectory {\tt
src/kernel/linux-2.6} in your build tree.  You may install it to the
standard system location with ``make kmod\_install''; again, you will
likely have to be root to do this.  Alternatively, you may override the
install location by setting the {\tt KMOD\_DIR} variable when you
install.

\subsection{Preparing Linux Kernel 2.4.x configurations}
\label{sec:vfs24-configure}

To generate the Makefile, you need to make sure that you run
'./configure' with the '--with-kernel24=path' argument.  An example is
provided here for your convenience:

\begin{verbatim}
gil:/usr/src/pvfs2# ./configure --with-kernel24=/usr/src/linux-2.4.26
\end{verbatim}

After this command is issued, build the PVFS2 source tree if it has
not yet been built.

Building the 2.4.x kernel module requires an extra step.  Since
current kernels require writing a few files in the kernel source
directory to build a module, you may have to become root to compile
the kernel module.  To build the module, type ``make kmod24''.

At this point, we have a valid PVFS2 2.4.x Kernel module.  The module
itself is the file {\tt pvfs2.o} in subdirectory {\tt
src/kernel/linux-2.4} in your build tree.  You may install it to the
standard system location with ``make kmod24\_install''; again, you will
likely have to be root to do this.  Alternatively, you may override the
install location by setting the {\tt KMOD\_DIR} variable when you
install.

\subsection{Testing the Kernel Interface}
\label{sec:vfs-test}

Now that you've built a valid PVFS2 kernel module, there are several
steps to perform to properly use the file system.

The basic steps are as follows:
\begin{itemize}
\item Create a mount point on the local filesystem
\item Load the Kernel Module into the running kernel
\item Start the PVFS2 Server application
\item Start the PVFS2 Client application
\item Mount your existing PVFS2 volume on the local filesystem
\item Issue VFS commands
\end{itemize}

First, choose where you'd like to mount your existing PVFS2 volume.
Create this directory on the local file system if it does not already
exist.  Our mount point in this example is /mnt/pvfs2.

\begin{verbatim}
gil:~# mkdir /mnt/pvfs2
\end{verbatim}

Now load the kernel module into your running kernel.  You can do this
by using the 'insmod' program, or modprobe if you've copied your
module into the appropriate /lib/modules directory for your running
kernel.

\subsubsection{Loading the kernel module}
For 2.6.x kernels ONLY:
\begin{verbatim}
gil:~# insmod /usr/src/pvfs2/src/kernel/linux-2.6/pvfs2.ko
\end{verbatim}

For 2.4.x kernels ONLY:
\begin{verbatim}
gil:~# insmod /usr/src/pvfs2/src/kernel/linux-2.4/pvfs2.o
\end{verbatim}

You should verify that the module was loaded properly using the
command ``lsmod''.  Also, you can use ``rmmod'' to remove the
PVFS2 module after it's been loaded.  Only remove the module when you
have safely unmounted all mounted file systems (if any) and stopped
the pvfs2-client software.
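
The ``lsmod'' check can also be scripted.  The sketch below reads
/proc/modules, the kernel's list of loaded modules, in which each line
begins with the module name.  The function name \texttt{module\_loaded} is
illustrative, and its optional second argument exists only so that a
different list file can be supplied.

\begin{verbatim}
# module_loaded NAME [LISTFILE]
# Succeed if a line of LISTFILE (default /proc/modules) starts with NAME.
module_loaded() {
    list=${2:-/proc/modules}
    [ -r "$list" ] && grep -q "^$1 " "$list"
}

if module_loaded pvfs2; then
    echo "pvfs2 module is loaded"
else
    echo "pvfs2 module is not loaded"
fi
\end{verbatim}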

At this point, we need to start the PVFS2 server and the PVFS2 client
applications before trying to mount a PVFS2 volume.  See previous
sections on how to properly start the PVFS2 server if you're unsure.
Starting the PVFS2 client is covered below.

The PVFS2 client application consists of two programs:
``pvfs2-client-core'' and ``pvfs2-client''.  DO NOT run
``pvfs2-client-core'' by itself.  ``pvfs2-client'' is the PVFS2 client
application.  This application cannot be started unless the PVFS2
server is already running.  Here is an example of how to start the
PVFS2 client:

\begin{verbatim}
gil:/usr/src/pvfs2# cd src/apps/kernel/linux-2.6/
gil:/usr/src/pvfs2/src/apps/kernel/linux-2.6# ./pvfs2-client -f -p ./pvfs2-client-core
pvfs2-client starting
Spawning new child process
About to exec ./pvfs2-client-core
Waiting on child with pid 17731
\end{verbatim}

The -f argument is not required.  For reference, this keeps the PVFS2
client application running in the foreground.

The -p argument is required unless the pvfs2-client-core is installed
and can be found in your PATH.

Also worth noting is the -a argument (not required).  For reference,
this sets the timeout value (in milliseconds) of the client side
attribute cache.  Setting this to a large value will improve attribute
read times (e.g. running ``ls'' repeatedly), but can reflect incorrect
attributes if a remote client is modifying them.  The default value is
0 milliseconds, effectively disabling this client side attribute
cache.

Other arguments and descriptions can be viewed by running the program
with the -h option.

Now that the module is loaded, and the pvfs2-server and pvfs2-client
programs are running, we can mount our PVFS2 file system (and verify
that it's properly mounted) as follows:

\begin{verbatim}
lain pvfs2 # mount -t pvfs2 tcp://testhost:3334/pvfs2-fs /mnt/pvfs2
lain pvfs2 # mount | grep pvfs2
tcp://testhost:3334/pvfs2-fs on /mnt/pvfs2 type pvfs2 (rw)
\end{verbatim}

NOTE: The device of the format tcp://testhost:3334/pvfs2-fs MUST be
specified, as we need to know a valid running pvfs2-server and file
system name to dynamically mount a pvfs2 volume.  These values can be
read from your configuration files.  As a side note, you can use
``umount'' to unmount the PVFS2 volume when you're ready.

Now that a PVFS2 volume is mounted, normal VFS operations can be issued
on the command line.  An example is provided below:

\begin{verbatim}
gil:/usr/src/pvfs2/src/kernel/linux-2.6# mkdir /mnt/pvfs2/newdir
gil:/usr/src/pvfs2/src/kernel/linux-2.6# ls -al /mnt/pvfs2/newdir
total 1
drwxr-xr-x    2 root     root            0 Aug 15 13:29 .
drwxr-xr-x    3 root     root            0 Aug 15 13:21 ..
gil:/usr/src/pvfs2/src/kernel/linux-2.6# cp pvfs2.ko \
    /mnt/pvfs2/newdir/foo
gil:/usr/src/pvfs2/src/kernel/linux-2.6# ls -al /mnt/pvfs2/newdir
total 2
drwxr-xr-x    2 root     root            0 Aug 15 13:29 .
drwxr-xr-x    3 root     root            0 Aug 15 13:21 ..
-rw-r--r--    1 root     root       330526 Aug 15 13:30 foo
\end{verbatim}

\subsubsection{Special Note for 2.4 kernels}

We need a small helper application \texttt{/sbin/mount.pvfs2} to mount pvfs2
under 2.4 kernels.   It must be installed under \texttt{/sbin}.  Note that
``make install'' will not touch \texttt{/sbin}, so you will have to install it
by hand.  With the helper application installed, the mount commands and
fstab entries are the same as for 2.6 kernels.

If you do not have \texttt{/sbin/mount.pvfs2} available, you can still use the
old approach:

\begin{verbatim}
gil:~# mount -t pvfs2 pvfs2 /mnt/pvfs2 -o tcp://testhost:3334/pvfs2-fs
gil:~# mount | grep pvfs2
pvfs2 on /mnt/pvfs2 type pvfs2 (rw)
\end{verbatim}


\subsection{Unmounting and stopping PVFS2 on a client}

While this is a quick \emph{start} guide, knowing how to cleanly shut
things down can be helpful too!

Unmounting a PVFS2 volume is as simple as using ``umount'':
\begin{verbatim}
gil:~# umount /mnt/pvfs2
gil:~# mount | grep pvfs2
\end{verbatim}

After all PVFS2 volumes have been unmounted, it is safe to kill the
pvfs2-client:
\begin{verbatim}
gil:~# killall pvfs2-client
\end{verbatim}

Waiting a few seconds after killing the pvfs2-client will ensure that
everything has terminated properly.  Once the pvfs2-client has been
killed, it is safe to remove the PVFS2 kernel module:
\begin{verbatim}
gil:~# rmmod pvfs2
\end{verbatim}
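
The shutdown steps above can be collected into one shell function.
This is only a sketch under stated assumptions: the function name
\texttt{pvfs2\_client\_shutdown} is hypothetical, a single PVFS2 volume
is assumed to be mounted, and the five-second pause is an arbitrary
stand-in for ``a few seconds''.

\begin{verbatim}
# Hypothetical wrapper around the shutdown steps described above.
# $1 is the mount point (defaults to /mnt/pvfs2); must be run as root.
pvfs2_client_shutdown() {
    mnt="${1:-/mnt/pvfs2}"
    umount "$mnt" || return 1         # unmount the volume first
    killall pvfs2-client || return 1  # then stop the client daemon
    sleep 5                           # assumed value for "a few seconds"
    rmmod pvfs2                       # finally remove the kernel module
}
\end{verbatim}

It would be invoked as \texttt{pvfs2\_client\_shutdown /mnt/pvfs2}.  The
function stops at the first failing step, so the module is never
removed while a volume is still mounted or the client is still running.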

\appendix

\section{Notes on running PVFS2 without root access}

The preceding documentation assumes that you have root access on the
machine(s) on which you wish to install the file system.  However, this is
not strictly required for any component except for the kernel VFS
support.  The servers, client libraries (such as MPI-IO), and
administrative tools can all be used by non-privileged users.  This
may be particularly useful for evaluation or testing purposes.

In order to do this, you must make the following adjustments to the
installation and configuration process:
\begin{itemize}
\item Use the \texttt{--prefix} option at configure time to choose an alternate
directory (one that you have write access to) for installation.  An example
would be /home/username/pvfs2-build.
\item When generating the server config files, choose a data storage
directory that you have write access to, but preferably not NFS mounted.  An
example would be /tmp/pvfs2-test-space.
\item Place the pvfs2tab file in an alternate location, such as
/home/username/pvfs2-build/pvfs2tab, instead of /etc/pvfs2tab.
Then set the PVFS2TAB\_FILE environment variable to the full path
to this file.  A tcsh example would be: ``setenv PVFS2TAB\_FILE
/home/username/pvfs2-build/pvfs2tab''.
\end{itemize}
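
For users of Bourne-style shells (sh, bash), the adjustments above
might look like the following sketch.  It reuses the example paths from
the list; substitute your own directories.  The build commands are
shown as comments because they must be run from the top-level PVFS2
source directory.

\begin{verbatim}
# Build time: install under a directory you can write to
# (run from the top-level PVFS2 source directory):
#   ./configure --prefix=/home/username/pvfs2-build
#   make
#   make install

# Run time: tell the PVFS2 tools and libraries where the pvfs2tab is.
PVFS2TAB_FILE=/home/username/pvfs2-build/pvfs2tab
export PVFS2TAB_FILE
\end{verbatim}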


\section{Debugging your PVFS2 configuration}

Bug reports and questions should be directed to the PVFS2 users
mailing list for best results (see the PVFS2 web site for details:
http://www.pvfs.org/pvfs2/lists.html).  It is helpful to include a
description of your problem, the PVFS2 version number, and
relevant log information from /var/log/messages and
/tmp/pvfs2-server.log.

People who wish to find more verbose information about what the file
system is doing can enable extra logging messages from the server.
This is done by adjusting the ``EventLogging'' field in the file
system configuration file.  By default it is set to ``none''.  You can
set it to a comma-separated list of log masks to get more information.
An example would be ``EventLogging storage,network,server'', which
will result in verbose messages from the storage subsystem, the
network subsystem, and server state machines.  \emph{WARNING: this may
result in extremely large log files!}  The logging masks can also be
set at runtime using the pvfs2-set-debugmask command line tool.  Usage
information and a list of supported masks will be shown if it is run
with no arguments.

Similarly, run-time client debugging information can be gathered by
using environment variables before running the client application.
The default client logging method is to set the variable
PVFS2\_DEBUGMASK to values such as ``client,network''.  Many of the
supported client debugging masks overlap with the server masks, which
can be listed using pvfs2-set-debugmask.  By default, setting
PVFS2\_DEBUGMASK emits debugging information to stderr, often
intermixed with the client program output.  If you'd like to redirect
client debugging to a file, set the PVFS2\_DEBUGFILE environment
variable to a valid file name.  This causes all debug information
specified by the PVFS2\_DEBUGMASK to be stored in the file specified,
no longer intermixing the output with the client program.  The file
specified in the PVFS2\_DEBUGFILE environment variable will be
appended to if it already exists.
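
For Bourne-style shells, the client settings described above might look
like the following sketch.  The log file name is an arbitrary choice
made for illustration, and pvfs2-ls merely stands in for whatever
client application you want to debug.

\begin{verbatim}
# Select client debug masks (values taken from the text above).
PVFS2_DEBUGMASK="client,network"
# Illustrative log path; any writable file name works.
PVFS2_DEBUGFILE="/tmp/pvfs2-client-debug.log"
export PVFS2_DEBUGMASK PVFS2_DEBUGFILE

# Then run the client application as usual, for example:
#   pvfs2-ls /mnt/pvfs2
\end{verbatim}
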

Another environment variable, PVFS2\_KMODMASK, can be used to trigger
kmod logging messages.  Setting this variable to ``file, inode'' prior
to starting the pvfs2-client-core daemon causes verbose kmod subsystem
error diagnostics to be written to the system ring buffer and
eventually to the kernel logs.  The kmod diagnostic level can also be
set when the kernel module is loaded, as in
\verb|insmod pvfs2.ko module_parm_debug_mask=<diagnostic level>|.
The diagnostic level is a bitwise OR of values specified in pvfs2-debug.h.
For more information on setting the kernel or client debug mask, see
\texttt{doc/pvfs2-logging.txt} in the PVFS source tree.

\section{ROMIO Support}
\label{sec:romio}

Building ROMIO with PVFS2 support can be a bit tricky, and is certainly not
well documented.  Reports of the correct way to build for OpenMPI would be
appreciated.  This document will cover building for MPICH2.

First, get the software.  Download MPICH2 from
http://www.mcs.anl.gov/mpi/mpich2/.  Bugs may have been fixed since the
last MPICH2 release; any resulting patches can be found at
http://www.mcs.anl.gov/romio/pvfs2-patches.html.

Unpack MPICH2.  Apply the ROMIO patch in the src/mpi/romio directory if one is
needed.

\begin{verbatim}
prompt% tar xzf ~/src/mpich2-1.4.0p1.tar.gz    # unpack mpich2 source
prompt% cd mpich2-1.4.0p1/src/mpi/romio        # change to ROMIO dir
prompt% patch -p1 < ~/src/romio-<CORRECT_VERSION>.diff   #apply patch
prompt% cd ../../..                            # return to top of src
prompt%
\end{verbatim}

In order to build MPICH2 with a ROMIO that speaks PVFS2, pass the
\texttt{--with-pvfs2=PVFS\_PREFIX} option to configure.  \texttt{PVFS\_PREFIX}
is the prefix you gave to the PVFS2 configure script (e.g. /usr/local or
/opt/packages/pvfs-2.6.0).

\begin{verbatim}
configure --with-pvfs2=PVFS_PREFIX [other flags]
\end{verbatim}

Now compile and install MPICH2 as you normally would.  Applications accessing
PVFS2 through MPI-IO will bypass the kernel interface and talk to PVFS2 servers
directly.  If you do not have the file system mounted, ROMIO will still work.
Just be sure to add the ``pvfs2:'' prefix to your file name.
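
As an illustration of the file name convention, the sketch below passes
a ``pvfs2:''-prefixed name to an MPI-IO program.  The program name
\texttt{./my\_mpiio\_app}, the process count, and the file name are
hypothetical; the sketch also assumes that the path after the prefix
lies under a mount point listed in your pvfs2tab.

\begin{verbatim}
# Hypothetical MPI-IO program that takes the output file as its argument.
APP=./my_mpiio_app
# The "pvfs2:" prefix tells ROMIO to use PVFS2 directly for this file.
OUTFILE="pvfs2:/mnt/pvfs2/testfile"

# Only launch if mpiexec and the program are actually present.
if command -v mpiexec >/dev/null 2>&1 && [ -x "$APP" ]; then
    mpiexec -n 4 "$APP" "$OUTFILE"
fi
\end{verbatim}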

Please note: older versions of MPICH2 need a few changes
to the normal configure process.  MPICH2-1.0.4p1 and older must be
told the path to the PVFS2 installation.  To do so, set the {\tt CFLAGS},
{\tt LDFLAGS} and {\tt LIBS} environment variables before running configure.

\begin{verbatim}
prompt% export CFLAGS="<other desired flags> -I/usr/local/pvfs2/include"
prompt% export LDFLAGS="-L/usr/local/pvfs2/lib"
prompt% export LIBS="-lpvfs2 -lpthread"
prompt% configure --with-file-system=ufs+nfs+testfs+pvfs2 [other flags]
\end{verbatim}

\end{document}