\documentclass[11pt]{article}
\usepackage{multicol}
\usepackage{url}
\usepackage[pdfstartview={FitH}]{hyperref}
\usepackage[margin=1in]{geometry}
\usepackage[pdftex]{graphicx}
\usepackage{float}
% Title Page
\title{Tracing Paper}
\author{Paul Arthur Wortman}
\begin{document}
\maketitle
\setlength{\columnsep}{11pt}
\begin{multicols}{2} {
\section{Abstract}
\label{Abstract}
As technologies advance, their pace \& development are recorded so that we are able to see the strengths and weaknesses of said technologies \& can refocus our efforts to improve them. But as with any sort of benchmark, there are inherent simplifications made when first setting these watermarks for advancing technology. In the case of networking benchmarks, many of these oversimplifications occur when dealing with either the spatial or temporal scaling of systems. While these simplifications were acceptable for the past systems being tested, such a facile outlook is no longer acceptable for supplying advantageous data/information. Without taking these intricacies into account, technology will only be able to progress along the avenues we already know of, while never being able to tackle the bottlenecks that more accurate benchmarking would make apparent.
I/O benchmarking, the process of comparing I/O systems by subjecting them to known workloads, is a widespread practice in the storage industry and serves as the basis for purchasing decisions, performance tuning studies, and marketing campaigns~\cite{Anderson2004}.
As seen in previous trace work [Leung et al., Ellard et al., Meyer et al.], entire concepts of how these computer systems are being used, versus their initial purpose, have allowed for great strides in eliminating real bottlenecks rather than spending unnecessary time working on ``imagined'' bottlenecks. Leung \textit{et al.} made a series of observations, from the fact that files are rarely re-opened to the finding that read-write access patterns have become more frequent~\cite{Leung2008}. Without the illumination of these underlying actions (e.g. read-write ratios, file death rates, file access rates), these issues cannot be readily tackled. The purpose of my work is to tackle this gap, \& hopefully this new study will bring insight into the complexity of network communication.
\small {
\section{Introduction}
\label{Introduction}
\subsection{Purpose of Tracing}
\label{Purpose of Tracing}
Performing these sorts of investigations \& traces is important because, without attempts to better understand the intricacies of computer systems, it is impossible for humankind to progress its technologies and make optimal use of its resources. Without a better understanding of materials one cannot improve computer hardware; without a greater understanding of memory one can never make effective (and efficient) use of memory resources; and without further investigation one cannot hope to strengthen the human understanding of network communication between devices \& how aspects of this communication may be directly affecting the performance of these systems.
\subsection{Issues with Tracing}
\label{Issues with Tracing}
The majority of benchmarks are facile attempts to represent some known system structure on which the ``original'' design was tested. While this is all well and good, there are many issues with this sort of approach; two important ones are temporal \& spatial concerns. With temporal scaling, the main concern is that current-day benchmarks do not account for the subtleties of intercommunication between clients \& servers on a network. While these timing issues have been tackled for single-processor (and, to some extent, multi-processor) systems, they are not properly tackled when dealing with inter-network communication. \textit{\textbf{ADD IN SECTION ON TEMPORAL SCALING.}} One finds similar handicaps when dealing with the spatial aspect of computer networks. Due to the single-system nature of most benchmarks, much is simplified away when such a benchmark is used to represent a larger network. Common practice is to take a benchmark representative of a single computer system, run it in parallel across some $N$ computer systems, \& take the result to be a facile representation of a parallel/networked system; the simplification being that communication between the systems is not accounted for. Thus the more interesting data (e.g. inter-network communication) is not accurately represented, and nothing can be done about inter-network bottlenecks because these issues are not even known.
\textbf{\textit{Temporal Scaling}:} This is where one needs to account for the nuances of timing with respect to the run time of ``commands'': computation, communication \& service. \textbf{Example:} If process A takes 10s to run, and then at time 20s process B occurs, then the concern is how interaction between different systems affects the benchmarks of processes A \& B. In these scenarios the nuances being examined are the ways in which these processes interact with each other (if at all), and whether the limiting factor in their operation is the result of another process (e.g. process B must wait for process A), a result of the speed capabilities of the network (e.g. inter-arrival time is shorter due to faster transmission speeds), or no effect at all (e.g. neither process A nor process B influences the other). A system with temporal scaling would take these subtleties into account when expanding its operation across multiple machines in a network. Better still, a benchmark would be able to induce all three scenarios.
\textbf{\textit{Spatial Scaling}:} This is where one needs to account for the nuances of expanding a benchmark to incorporate a number of \textbf{n} machines over a network. While performing a benchmark on a single machine is easily feasible, there is much more to consider when dealing with multiple machines communicating with each other, and with the requirements of fully testing these aspects. \textbf{Example:} Regardless of the number of machines being used, the benchmark should be able to adapt \& incorporate this into the full testing of the system. The idea is that a system that properly incorporates spatial scaling would be able to detect (or be told) the number of machines on the network and incorporate communication (possibly in varying intensities) between all of them, thus stress testing all aspects of the network, which in turn allows for accurate identification of bottlenecks.
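The temporal-scaling example above (processes A \& B) can be captured in a small model. The following C++ sketch is purely illustrative, not part of any real benchmark: the function name \texttt{effective\_start} and its parameters are our own inventions. It computes when process B effectively begins given A's run time, B's scheduled start, whether B depends on A, and the network transit delay.

```cpp
#include <algorithm>
#include <cassert>

// Toy model of the three temporal-scaling scenarios described above (all
// names here are hypothetical). Process A runs from t = 0 for a_runtime
// seconds; process B is scheduled at t = b_sched.
//  - Scenario 1: B depends on A, so it cannot start before A finishes.
//  - Scenario 2: transmission speed shifts B by network_delay.
//  - Scenario 3: no dependency and no delay, so B starts as scheduled.
double effective_start(double a_runtime, double b_sched,
                       bool b_depends_on_a, double network_delay) {
    double start = b_sched;
    if (b_depends_on_a)
        start = std::max(start, a_runtime);  // wait for A to finish
    return start + network_delay;            // add transit time
}
```

With the numbers from the example (A runs 10s, B scheduled at 20s), a dependent B is unaffected; but were A to run for 30s, B would be pushed back to t = 30s, and a benchmark with proper temporal scaling should reproduce exactly this kind of shift.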
\subsection{Previous Advances Due to Testing}
\label{Previous Advances Due to Testing}
\subsection{The Need for a New Study}
\label{The Need for a New Study}
As has been pointed out by past work (see Leung \textit{et al.}'s ``Table 2: Summary of major file system studies over the past two decades''), the design of systems is usually guided by an understanding of file system workloads and user behavior~\cite{Leung2008}. It is for that reason that new studies are constantly performed by the research community, from large-scale studies to individual protocol studies \textit{CITE SUCH STUDIES HERE}. Even within these studies, the information gleaned is only as meaningful as the consideration put into the nuances of how the data is handled. The disconcerting points that our work hopes to alleviate are the following: there has been no large-scale study of networks for some time, there has been no study of CIFS/SMB for even longer, and, most importantly, these studies have not tackled the spatial \& temporal scaling idiosyncrasies of network communication. It is for these reasons that we developed this tracing system along with new studies of temporal scaling. This was done through process ID tracking, which is further explained in Section~\ref{Process ID Tracking}.
\section{Methodology}
\label{Methodology}
\subsection{Effects of System Setup on Tracing}
\label{Effects of System Setup on Tracing}
When initially designing the tracing system of this paper, there are different aspects that one has to take into account, such as the space limitations of the tracing system, packet capture limitations (e.g. file size), and the speed limitations of the hardware. The major space limitation dealt with in this work is the amount of space that the system has for storing the captured packets, including the resulting compressed ds files. The encountered limitation of the packet capture system is that the functional pcap (packet capture) file size was found to be about 750MB. \textit{When attempting to run tshark with larger pcap files (such as 1GB) it was found that once the program ran for some time (typically about 772 files) it would crash, often due to a stack smashing error. Unfortunately the nature of this error has yet to be discovered.} The speed limitations are dictated by the hardware being used (e.g. the GB capture interface) and the software that makes use of this hardware (e.g. PF\_RING). After all, our data can only be as accurate as the information being captured.
\subsection{Main Challenges}
\label{Main Challenges}
\section{Tracing System}
\label{Tracing System}
\subsection{Different Stages of Trace}
\label{Different Stages of Trace}
\textit{tshark}: The purpose of tshark is to act as a collection program for grabbing all of the redirected network traffic and saving this packet information into files (e.g. pcap files). In order to help minimize packet loss, as this represents lost data, the '-n' option is used so that network object name resolution is disabled, thus helping simplify the packet capturing process.
\\\textit{pcap2ds}: The purpose of pcap2ds is to go through the contents of each pcap file and re-write the information in the DataSeries format (e.g. ds files). The most important aspect of this step is that while this re-formatting is occurring, a compression of the information also takes place. Preliminary examination of the numbers shows approximately 99\% compression. The key reason for this compression is that pcap2ds goes through the contents of the pcap file and only writes/compresses field information that the user believes to be important or useful; consequently, not all of the captured packet information is saved. The reason for this selective behavior is that not all of the information sent through network communications is pertinent to our tracking of client-server interactions. Due to the basal nature of this work, there is no need to track every piece of information that is exchanged, only that information which illuminates the behavior of the clients \& servers that function over the network (e.g. read \& write transactions).
\\\textit{inotify}: The purpose of inotify is to act as a watchdog for the directory in which tshark is writing its pcap files. As each pcap file is ``completed'' (i.e. has been written to the full desired size of 750MB), inotify sees the `closed after write' event that occurs and calls pcap2ds on the newly finished pcap file. To do this, inotify calls the fork\_test() function, where a fork is performed and each child process prepares the arguments required for running pcap2ds with a certain protocol (e.g. SMB, NFS, iSCSI) and then runs that instance of pcap2ds. It should be noted that while the system is capable of performing pcap2ds using the SMB, NFS \& iSCSI protocols, it currently only deals with the SMB/CIFS protocol. While these forked pcap2ds instances run, inotify continues to monitor the pcap file directory so as not to miss any of the incoming information.
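The fork-per-protocol dispatch described above can be sketched roughly as follows. This is a simplified illustration, not the actual tracing code: the pcap2ds argument layout (a hypothetical \texttt{-t <protocol>} flag and an output \texttt{.ds} path derived from the pcap name) is assumed, and the real program's interface may differ.

```cpp
#include <string>
#include <vector>
#include <cassert>
#include <unistd.h>

// Build the argument vector for one pcap2ds child. The "-t <protocol>"
// flag and the derived ".ds" output name are HYPOTHETICAL; the real
// pcap2ds interface may differ.
std::vector<std::string> pcap2ds_args(const std::string& pcap,
                                      const std::string& proto) {
    std::string::size_type dot = pcap.rfind('.');
    std::string ds = pcap.substr(0, dot) + ".ds";
    return {"pcap2ds", "-t", proto, pcap, ds};
}

// On each 'closed after write' event reported by inotify for a finished
// pcap file, fork one pcap2ds child per protocol of interest; the parent
// returns immediately so inotify can keep watching the directory.
void dispatch(const std::string& pcap) {
    for (const char* proto : {"smb", "nfs", "iscsi"}) {
        if (fork() == 0) {                        // child process
            std::vector<std::string> args = pcap2ds_args(pcap, proto);
            std::vector<char*> argv;
            for (auto& a : args)
                argv.push_back(const_cast<char*>(a.c_str()));
            argv.push_back(nullptr);
            execvp(argv[0], argv.data());         // replace child image
            _exit(1);                             // only reached on failure
        }
    }
}
```

Forking before exec keeps the watchdog loop responsive: the parent never blocks on a conversion, which matches the requirement that inotify not miss incoming pcap files while earlier ones are still being converted.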
\subsection{About the Systems Being Traced}
\label{About the Systems Being Traced}
\textit{SMB Server, iSCSI Trace, ECS, etc.}: The SMB/CIFS information being captured comes from the university network. All packet and transaction information is passed through a duplicating switch (\textit{pipe?}) that then allows the trace1 system to capture these packet transactions over a 10 GB (\textit{bytes? bits?}) port. The reason for using 10GB/b hardware is to help ensure that the system is able to capture any \& all information on the network.
\\\textit{Expectations}: Blah blah
\\\textit{Worries}: la ti da
\section{Trace Analysis}
\label{Trace Analysis}
\subsection{SMB}
\label{SMB}
The Common Internet File System (CIFS) is a dialect of the Server Message Block (SMB) protocol. The most important aspect of SMB (implemented on Unix-like systems by Samba) is that it is a stateful protocol: the information sent via SMB carries identifying fields that allow for process ID tracking.
\\The structure for sending message payloads in SMB is as follows: each SMB message is split into three blocks. The first block is a fixed-length SMB header. The second is a variable-length block of SMB parameters, and the third is a variable-length block of SMB data. Depending on the transaction occurring, these blocks are used in different manners. For example, the SMB protocol dictates that error responses \textbf{should} be sent with empty SMB parameters \& SMB data blocks (along with the WordCount \& ByteCount fields set to zero). The SMB header is particularly important because it identifies the message as an SMB message payload~\cite{MS-CIFS}. When used in a response message, the header also includes status information that indicates whether and how the command succeeded or failed. The aspects of the SMB header most important to the tracing system are the PID/MID tuple (for the purpose of identifying a client) and the command value (notifying our tracing system of the actions taking place on the network). It is through this command field that the process ID tracking system is able to follow the different commands (read/write/general event) that occur \& try to find patterns in these network communications.
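As a concrete illustration of the header fields the tracer relies on, the following C++ sketch shows a simplified (not wire-accurate) view of the fixed-length SMB header; the field names follow MS-CIFS, but the struct itself is our own and ignores the protocol magic, flags, byte order, and padding.

```cpp
#include <cstdint>
#include <cassert>

// Simplified view of the SMB header fields the tracer reads. Field names
// follow MS-CIFS, but this struct is illustrative only, not a wire layout.
struct SmbHeaderView {
    uint8_t  command;  // e.g. 0x2E = SMB_COM_READ_ANDX, 0x2F = SMB_COM_WRITE_ANDX
    uint32_t status;   // in responses: whether/how the command succeeded
    uint16_t tid;      // tree identifier (share)
    uint16_t pid;      // client process identifier (PIDLow)
    uint16_t uid;      // authenticated user identifier
    uint16_t mid;      // multiplex identifier, set by the client
};

// The (PID, MID) tuple identifies one logical thread of operation on the
// client node, so messages with matching tuples belong to that thread.
bool same_thread(const SmbHeaderView& a, const SmbHeaderView& b) {
    return a.pid == b.pid && a.mid == b.mid;
}
```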
\subsection{Other (e.g. HTML)}
\label{Other (e.g. HTML)}
\subsection{Process ID Tracking}
\label{Process ID Tracking}
%\begin{figure}
\centerline{\includegraphics[width=50mm]{communications_sketch.png}}
\centerline{Figure A: Sketch of Communication}
%\caption{Sketch of Communication}
%\label{figure sketch}
%\end{figure}
Following these process IDs is a way to check for intercommunication between two or more processes. In particular, we examine the compute time \& I/O time by examining the inter-arrival times (iat; the time spent in communication, between information arrivals) between the server \& the client. This is interesting because it gives us a realistic sense of the data transit time of the network connections being used (e.g. Ethernet, FireWire, Fibre Channel, etc.). Other pertinent information is how often the client makes requests \& how often this event occurs per client process ID, identifiable by the PID/MID tuple. One could also track the amount of sharing occurring between users. The PID is the process identifier and the MID is the multiplex identifier, which is set by the client to identify groups of commands belonging to the same logical thread of operation on the client node. Tracking the iat is interesting because we want to know the activity of each client (e.g. how many connections/connection requests each client is producing), since that can be used to map behavior for low-, medium- \& high-level clients (i.e. by amount of traffic produced) for use in an adaptive benchmarking system. The second half (per client process ID) is of interest because this information can be used to map the activity of given programs, thus allowing for finer granularity in the produced benchmark (e.g. control down to process types run by individual client levels). \textbf{Figure A} shows a rough sketch of communication between a client \& server.
The general order that constitutes a full tracking is as follows: (client) computation [process to filesystem], (client) communication [SMB protocol used to send data client $\rightarrow$ server], (server) timestamping + service [server gets data, logs it, performs service], (server) communication [SMB data sent server $\rightarrow$ client], (client) next computation. Other areas of interest are the time between an open \& a close, or how many opens/closes occur in a window (i.e. a period of time). This information could be used as a gauge of current-day trends in filesystem usage \& its consequent taxation on the surrounding network. It would also allow for greater insight into the read/write habits of users on a network, along with a rough comparison against other registered events that occur on the network. Last, though no less important, is to look at how many occurrences there are of files shared between different users, though one must note that there is some issue (though hopefully rare) of resource locking (e.g. of shared files) that needs to be taken into account. This is initially addressed by monitoring any oplock flags that are sent for reads \& writes.
\\Currently the focus of process ID tracking is to see the number of reads, writes and events that occur due to the actions of clients on the network. This is done using a tuple of the PID \& MID fields, which allows for the identification of a client. Since these values are unique and \textbf{MUST} be sent with each packet, this tuple is used as the key for the unordered map that tracks this information. The structure is as follows: the tuple functions as the key for the pairing of the identifying tuple \& a corresponding event\_data structure, which houses pertinent information about reads/writes/events. The information stored in the structure is the last time a read/write/event occurred, the total iat (inter-arrival time) of the observed reads/writes/events, and the total number of reads/writes/events that have occurred for the identified tuple. The purpose of tracking this information is to profile the read/write ``habits'' of the users on the network, as well as to compare this information against the general events' inter-arrival times, thus allowing one to see if the read \& write events are being processed differently (e.g. with longer or shorter iats) than the rest of the events occurring on the network. This information also helps provide a preliminary mapping of how the network is used \& what sort of traffic populates the communication.
\\One should note that the PID/MID tuple serves a separate purpose from the PID/MID/TID/UID tuple. The first (2-tuple) is used to uniquely identify groups of commands belonging to the same logical thread of operation on the client node, while the latter (4-tuple) allows for the unique identification of requests \& responses that are part of the same transaction. While the PID/MID tuple is mainly what we are interested in, since it allows the following of a single logical thread, there is some interest in making use of the TID/UID fields as well, because this would allow us to count the number of transactions that occur in a single logical thread. This could provide interesting insight into how the computer systems on the network decide to handle/send commands over the network; e.g. sending multiple commands per transaction, multiple packet commands per transaction, etc.
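A minimal sketch of the unordered-map bookkeeping described in this subsection, assuming invented names (the exact fields of event\_data and the function track\_event are illustrative, not the real tracer's code): each (PID, MID) tuple maps to the last event timestamp, the accumulated inter-arrival time, and the event count.

```cpp
#include <cstdint>
#include <cassert>
#include <unordered_map>

// Per-tuple bookkeeping, as described above. The field names here are
// illustrative; the real tracer's event_data structure may differ.
struct event_data {
    double   last_time = -1.0;  // timestamp of previous event; < 0 = none yet
    double   total_iat = 0.0;   // accumulated inter-arrival time
    uint64_t count     = 0;     // number of reads/writes/events seen
};

// Pack the 16-bit PID and MID into a single 32-bit key for the map.
inline uint32_t pid_mid_key(uint16_t pid, uint16_t mid) {
    return (uint32_t(pid) << 16) | mid;
}

std::unordered_map<uint32_t, event_data> tuple_table;

// Record one read/write/event for the (pid, mid) tuple at timestamp ts.
void track_event(uint16_t pid, uint16_t mid, double ts) {
    event_data& e = tuple_table[pid_mid_key(pid, mid)];
    if (e.last_time >= 0.0)
        e.total_iat += ts - e.last_time;  // inter-arrival time since last event
    e.last_time = ts;
    ++e.count;
}
```

Dividing total\_iat by (count $-$ 1) then yields the mean inter-arrival time for that tuple, which is the quantity compared between read/write events and general events.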
\subsection{Run Patterns}
\label{Run Patterns}
\subsection{Locating Performance Bottlenecks}
\label{Locating Performance Bottlenecks}
\section{Intuition Confirm/Change}
\label{Intuition Confirm/Change}
\subsection{Characterizations of Different Packet Types}
\label{Characterizations of Different Packet Types}
\section{Related Work}
\label{Related Work}
\subsection{Anderson 2004 Paper}
\label{Anderson 2004 Paper}
This paper tackles the temporal inaccuracy of current-day benchmarks \& the impact and errors produced by these naive benchmarking tools. Timing accuracy (issuing I/Os at the desired time) at high I/O rates is difficult to achieve on stock operating systems~\cite{Anderson2004}. This inaccuracy may introduce substantial errors into observed system metrics when benchmarking I/O systems, including when these inaccurate tools are used for replaying traces or for producing synthetic workloads with known inter-arrival times~\cite{Anderson2004}. Anderson \textit{et al.} demonstrate the need for timing accuracy in I/O benchmarking in the context of replaying I/O traces, showing that the error in perceived I/O response times can be as much as $+350\%$ or $-15\%$ when using naive benchmarking tools that have timing inaccuracies~\cite{Anderson2004}. Their measurements indicated that the issue accuracy achieved using standard system calls is not adequate, and that errors in issuing I/Os can lead to substantial errors in measurements of I/O statistics such as mean latency and number of outstanding I/Os.
\subsection{Ellard Ledlie 2003}
\label{Ellard Ledlie 2003}
This paper examines two workloads (research and email) to see if they resemble previously studied workloads, as well as performing several new analyses of the NFS protocol. Trace-based analyses have guided and motivated contemporary file system design for the past two decades; the original analysis of the 4.2BSD file system motivated many of the design decisions of the log-structured file system (LFS)~\cite{EllardLedlie2003}. The paper also argues that since technology's use has expanded and evolved, there can be a fundamental change in workloads, which therefore need to be traced in order to observe \& understand the changes: ``We believe that as the community of computer users has expanded and evolved there has been a fundamental change in the workloads seen by file servers, and that the research community must find ways to observe and measure these new workloads.''~\cite{EllardLedlie2003} Some of the contributions of this paper include new techniques for analyzing NFS traces, along with tools to gather new anonymized NFS traces. Ellard and Ledlie also observed that much of the variance of load characterization statistics over time can be explained by high-level changes in the workload over time; this correlation has been observed in many trace studies, but its effects are usually ignored~\cite{EllardLedlie2003}. The most noticeable change in their traces was the difference between peak and off-peak hours of operation. This finding conveyed that time is a strong predictor of operation counts, amount of data transferred, and read-write ratios for their CAMPUS (email) workload.
\subsection{Ellard 2003}
\label{Ellard 2003}
This paper shows that technology being actively researched gains improvement faster, and that the technology that is not improved ends up being the bottleneck of the system. Ellard and Seltzer give the example of how file system performance is steadily losing ground relative to CPU, memory, and even network performance. Though Ellard and Seltzer began their efforts by trying to accurately measure the impact of changes to their system, they also discovered several other phenomena that interacted with the performance of the disk and file system in ways that had far more impact on the overall performance of the system than their improvements~\cite{Ellard2003}. This paper loosely groups all benchmarks into two categories: micro benchmarks and macro/workload benchmarks. The difference between the two is that micro benchmarks measure specific low-level aspects of system performance, while workload benchmarks estimate the performance of the system running a particular workload.
\subsection{Leung 2008 Paper}
\label{Leung 2008 Paper}
\textit{Compared to Previous Studies}
\begin{itemize}
\item 1. Both of our workloads are more write-oriented. Read to write byte ratios have significantly decreased.
\item 2. Read-write access patterns have increased 30-fold relative to read-only and write-only access patterns.
\item 3. Most bytes are transferred in longer sequential runs. These runs are an order of magnitude larger.
\item 4. Most bytes transferred are from larger files. File sizes are up to an order of magnitude larger.
\item 5. Files live an order of magnitude longer. Fewer than 50\% are deleted within a day of creation.
\end{itemize}
\textit{New Observations}
\begin{itemize}
\item 6. Files are rarely re-opened. Over 66\% are re-opened once and 95\% fewer than five times.
\item 7. File re-opens are temporally related. Over 60\% of re-opens occur within a minute of the first.
\item 8. A small fraction of clients account for a large fraction of file activity. Fewer than 1\% of clients account for 50\% of file requests.
\item 9. Files are infrequently shared by more than one client. Over 76\% of files are never opened by more than one client.
\item 10. File sharing is rarely concurrent and is usually read-only. Only 5\% of files opened by multiple clients are opened concurrently, and 90\% of sharing is read-only.
\end{itemize}
\textit{List of interesting data points (comes from 'Table 3: Summary of trace statistics')}
\begin{itemize}
\item Clients, Days, Data Read (GB), Data Written (GB), R:W I/O Ratio, R:W Byte Ratio, Total Operations
\item Operation Names: Session Create, Open, Close, Read, Write, Flush, Lock, Delete, File Stat, Set Attribute, Directory Read, Rename, Pipe Transactions
\end{itemize}
\textit{Table 4: Comparison of file access patterns -- this table gives a good breakdown of Read-Only, Write-Only \& Read-Write access}
\\\textit{Observations:}
\begin{itemize}
\item 1) ``Both of our workloads are more write-heavy than workloads studied previously''
\item 2) ``Read-write access patterns are much more frequent compared to past studies''
\item 3) ``Bytes are transferred in much longer sequential runs than in previous studies''
\item 4) Bytes are transferred from much larger files than in previous studies
\item 5) Files live an order of magnitude longer than in previous studies
\item 6) Most files are not re-opened once they are closed
\item 7) If a file is re-opened, it is temporally related to the previous close
\end{itemize}
\section{Conclusion}
\label{Conclusion}
\textit{Do the results show a continuation in the trend of traditional computer science workloads?}
% Figure ~\ref{figure sketch}
%references section
\bibliographystyle{plain}
\bibliography{body}
}
}
\end{multicols}
\end{document}