diff --git a/TracingPaper.log b/TracingPaper.log index 4bb2708..e05274f 100644 --- a/TracingPaper.log +++ b/TracingPaper.log @@ -1,4 +1,4 @@ -This is pdfTeX, Version 3.1415926-2.3-1.40.12 (MiKTeX 2.9 64-bit) (preloaded format=pdflatex 2012.11.13) 19 FEB 2015 17:17 +This is pdfTeX, Version 3.1415926-2.3-1.40.12 (MiKTeX 2.9 64-bit) (preloaded format=pdflatex 2012.11.13) 5 MAR 2015 13:22 entering extended mode **C:/Users/rundeMT/Documents/UConn/TracingPaper/TracingPaper.tex (C:/Users/rundeMT/Documents/UConn/TracingPaper/TracingPaper.tex diff --git a/TracingPaper.pdf b/TracingPaper.pdf index d673bd4..b915b5c 100644 Binary files a/TracingPaper.pdf and b/TracingPaper.pdf differ diff --git a/TracingPaper.tex b/TracingPaper.tex index 398b864..1fec586 100644 --- a/TracingPaper.tex +++ b/TracingPaper.tex @@ -76,38 +76,69 @@ \maketitle \begin{abstract} -With any sort of benchmark, there are inherent oversimplifications that are taken into account when first designing these watermarks for advancing technology. In the case of networking benchmarks, many of these simplifications occur when dealing with the low level operations of the system; spatial/temporal scaling, timestamping, IO and system behavior. While these simplifications were acceptable for past systems being tested, this facile outlook is no longer acceptable for supplying worthwhile information. Without taking into account the intricacies of current day machines, technology will only be able to progress in the avenues that we know of, while never being able to tackle the bottlenecks that are made apparent through more accurate benchmarking. +Traces are an important and necessary part of systems research because this work leads to a better understanding of the behavior of protocols, architectures, or even entire networks.
Furthermore, the lessons learned through examining these traces can lead to the development of better benchmarks, which will in turn allow for more accurate testing and advancement of technologies and their performance. Some key findings from the examination of CIFS traces include \textbf{ADD ONCE KEY FINDINGS HAVE BEEN FOUND}. +\\ +%NOTE: Perhaps bring up at the end of the paper when mentioning why trace work is important in a grander scheme +%With any sort of benchmark, there are inherent oversimplifications that are taken into account when first designing these watermarks for advancing technology. In the case of networking benchmarks, many of these simplifications occur when dealing with the low level operations of the system; spatial/temporal scaling, timestamping, IO and system behavior. While these simplifications were acceptable for past systems being tested, this facile outlook is no longer acceptable for supplying worthwhile information. Without taking into account the intricacies of current day machines, technology will only be able to progress in the avenues that we know of, while never being able to tackle the bottlenecks that are made apparent through more accurate benchmarking. \end{abstract} \section{Introduction} \label{Introduction} -Benchmarks are important for the purpose of developing and taking accurate metrics of current technologies. Benchmarks allow for the stress testing of different aspects of a system (e.g. network, single system). There are three steps to creating a benchmark; first one takes a trace of an existing system. This information is then used to compare the expected actions of a system (theory) against the traced actions of said system (practice). The next step is to determine which aspects of the trace are most representative of what occurred during the tracing of the system, while figuring out which are represntative of the habits and patterns of said system.
This discovered information is used to produce a benchmark, either by running a repeat of the captured traces or by using synthetic benchmark created from the trends detailed within the captured tracing data~\cite{Anderson2004}. +Traces are important for the purpose of developing and taking accurate metrics of current technologies. One must determine which aspects of the trace are most representative of what occurred during the tracing of the system, while figuring out which are representative of the habits and patterns of said system. This discovered information is used to produce a benchmark, either by running a repeat of the captured traces or by using a synthetic benchmark created from the trends detailed within the captured tracing data~\cite{Anderson2004}. -[ADD THIS SECTION TO RELATED WORK?]As seen in previous trace work done [Leund et al, ellard et al, roselli et al], the general perceptions of how computer systems are being used versus their initial purpose have allowed for great strides in eliminating actual bottlenecks rather than spending unnecessary time working on imagined bottlenecks. Leung's \textit{et. al.} work led to a series of obervations, from the fact that files are rarely re-opened to finding that read-write access patterns are more frequent ~\cite{Leung2008}. Without illumination of these underlying actions (e.g. read-write ratios, file death rates, file access rates) these issues can not be readily tackled. +As seen in previous trace work [Leung et al., Ellard et al., Roselli et al.], comparing the general perceptions of how computer systems are being used against their initial purpose has allowed for great strides in eliminating actual bottlenecks rather than spending unnecessary time working on imagined bottlenecks. The work of Leung \textit{et al.} led to a series of observations, from the fact that files are rarely re-opened to the finding that read-write access patterns are increasingly frequent~\cite{Leung2008}.
Without illumination of these underlying actions (e.g. read-write ratios, file death rates, file access rates) these issues cannot be readily tackled. +\\ +\textbf{NOT SURE IF KEEP OR NEEDED} I/O benchmarking, the process of comparing I/O systems by subjecting them to known workloads, is a widespread practice in the storage industry and serves as the basis for purchasing decisions, performance tuning studies, and marketing campaigns~\cite{Anderson2004}. -The purpose of my work is to tackle this gap and hopefully bring insight to the complexity of network communication. I/O benchmarking, the process of comparing I/O systems by subjecting them to known workloads, is a widespread pratice in the storage industry and serves as the basis for purchasing decisions, performance tuning studies, and marketing campaigns ~\cite{Anderson2004}. +The purpose of my work is to tackle this gap and to bring insight into the complexity of network communication through the examination of CIFS network traffic. \subsection{Issues with Tracing} \label{Issues with Tracing} -The majority of benchmarks are attempts to represent a known system and structure on which some “original” design/system was tested. While this is all well and good, there are many issues with this sort of approach; temporal \& spatial scaling concerns, timestamping and buffer copying, as well as driver operation for capturing packets~\cite{Orosz2013,Dabir2008,Skopko2012}. Each of these aspects contribute to the inital problems with dissection and analysis of the captured information. Inaccuracies in scheduling I/Os may result in as much as a factor of 3.5 differences in measured response time and factor of 26 in measured queue sizes; differences that are too large to ignore~\cite{Anderson2004}. [MENTION EXAMPLE ISSUES BROUGHT FROM THIS - TWO GOOD EXAMPLES].
+\textbf{REWORD TO REMOVE MENTION OF BENCHMARKS}\\ +The majority of benchmarks are attempts to represent a known system and structure on which some “original” design/system was tested. However, there are many issues with this sort of approach: temporal \& spatial scaling concerns, timestamping and buffer copying, as well as driver operation for capturing packets~\cite{Orosz2013,Dabir2008,Skopko2012}. Each of these aspects contributes to the initial problems with dissection and analysis of the captured information. Inaccuracies in scheduling I/Os may result in as much as a factor of 3.5 difference in measured response time and a factor of 26 in measured queue sizes; differences that are too large to ignore~\cite{Anderson2004}. Inaccuracies in packet timestamping can be caused by overhead in generic kernel-time based solutions, as well as by use of the kernel data structures~\cite{Orosz2013,PFRINGMan}.
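To make the timestamping concern concrete, the following is a minimal sketch (illustrative only, not part of the tracing system; the arrival times and the 1 ms kernel-clock resolution are assumed values) of how a coarse capture clock distorts measured inter-arrival times:

```python
def quantize(ts, resolution):
    """Round a timestamp down to the capture clock's resolution (seconds)."""
    return int(ts / resolution) * resolution

# Hypothetical packet arrival times (seconds) and an assumed 1 ms kernel clock.
true_arrivals = [0.0000, 0.0012, 0.0019, 0.0041]
coarse = [quantize(t, 0.001) for t in true_arrivals]

true_gaps = [b - a for a, b in zip(true_arrivals, true_arrivals[1:])]
seen_gaps = [b - a for a, b in zip(coarse, coarse[1:])]
# The 0.7 ms gap between the second and third packets collapses to zero
# under the coarse clock, so the burst structure of the traffic is lost.
```

This is one reason kernel-bypass approaches such as PF\_RING, which avoid extra copies and coarse kernel timestamps, matter for trace fidelity.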
A system that properly incorporates spatial scaling is one that would be able to inccorporate communication (even in varying intensities) between all the machines on a system, thus stress testing all communicative actions and aspects (e.g. resource ocks, queueing) on the network. Common practice is to have this singular benchmark run in parallel across some N computer systems \& to take the result as a facile representation of a parallel/networks system; thus the more interesting data (e.g. inter-network communication) is not accurately represented and nothing can be done about inter-network bottlenecks because these issues are not even known. - -[CLOSING SENTENCES?]While performing a benchmark on a single machine is easily feasible, there is much more to consider when dealing with multiple machines communicating with each other, and the expected requirements of fully testing these aspects +Temporal scaling refers to the need to account for the nuances of timing with respect to the run time of commands, consisting of computation, communication, \& service. A temporally scalable benchmarking system would take these subtleties into account when expanding its operation across multiple machines in a network. While these temporal issues have been tackled for a single processor (and even somewhat for cases of multi-processor), these same timing issues are not properly handled when dealing with inter-network communication. Spatial scaling refers to the need to account for the nuances of expanding a benchmark to incorporate a number of (\textbf{n}) machines over a network. A system that properly incorporates spatial scaling is one that would be able to incorporate communication (even in varying intensities) between all the machines on a system, thus stress testing all communicative actions and aspects (e.g. resource locks, queueing) on the network.
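As a sketch of the temporal-scaling idea above (the command stream and timestamps are hypothetical, not taken from our traces), a replayer that respects temporal scaling would issue each command after its original inter-arrival delay rather than back-to-back:

```python
# Hypothetical traced command stream: (arrival time in seconds, command).
trace = [(0.00, "open"), (0.05, "read"), (0.06, "read"), (0.30, "close")]

def replay_schedule(trace):
    """Return (delay_before_issue, command) pairs that preserve the original
    inter-arrival gaps, instead of collapsing them to zero as a naive
    back-to-back replay would."""
    schedule, prev = [], trace[0][0]
    for ts, cmd in trace:
        schedule.append((ts - prev, cmd))
        prev = ts
    return schedule

sched = replay_schedule(trace)
```

A replay driven by such a schedule exercises the same idle periods and bursts as the original clients did, which is what a back-to-back replay misrepresents.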
\subsection{Previous Advances Due to Testing} \label{Previous Advances Due to Testing} -Previous tracing work has shown that one of the largest \& broadest hurdles to tackle is that benchmarks must be tailored (to every extent) to the system being tested. There are always some generalizations taken into account but these generalizations can also be a major source of error~\cite{Anderson2004,Traeger2008,Vogels1999,Dabir2008,Orosz2013,Skopko2012,Ellard2003,EllardLedlie2003,Ruemmler1993}. To produce a benchmark with high fidelity one needs to understand not only the technology being used but how it is being implemented within the system to benchmark~\cite{Roselli2000,Ruemmler1993,Traeger2008}. All of these aspects will lend to the behavior of the system; from timing \& resource elements to how the managing software governs~\cite{Ellard2003,EllardLedlie2003,Douceur1999}. Further more, in persuing this work one may find unexpected results and learn new things through examination~\cite{Leung2008,Ellard2003,Roselli2000}. - -[PERHAPS USE THIS PART?]Understanding that no paper an really see the whole scope of tracing/benchmarks, this paper attempts to tackle an aspect of trying to bridge macro and micro benchmarks by building a system that incorporates a micro benchmark's low level replication fidelity with proper scaling to allow for macro level and a "full spectrum scope" analysis of everything in-between using traces of data input and synthetic trace generation. Due to the magnitude of this goal, this paper will further limit its focus towards the often forgot [CITE NEEDED?] networking aspect of multi-system scalable benchmarking and tracing. +Previous tracing work has shown that one of the largest \& broadest hurdles to tackle is that benchmarks (and traces) must be tailored (to every extent) to the system being tested. 
There are always some generalizations taken into account, but these generalizations can also be a major source of error~\cite{Anderson2004,Traeger2008,Vogels1999,Dabir2008,Orosz2013,Skopko2012,Ellard2003,EllardLedlie2003,Ruemmler1993}. To produce a benchmark with high fidelity, one needs to understand not only the technology being used but how it is being implemented within the system to trace \& benchmark~\cite{Roselli2000,Ruemmler1993,Traeger2008}. All of these aspects contribute to the behavior of the system, from timing \& resource elements to how the managing software governs~\cite{Ellard2003,EllardLedlie2003,Douceur1999}. Furthermore, in pursuing this work one may find unexpected results and learn new things through examination~\cite{Leung2008,Ellard2003,Roselli2000}.
A detailed overview of the tracings and analysis system can be seen in section ~\ref{Tracing System}. The hope is to further the progress made with benchmarks \& tracing in the hope that it too will lend to improbvng and deepening the knowledge and understanding of these systems so that as a result the technology and methodology is bettered as a whole. +Trace collection and analysis has proved its worth in previous studies, from which important lessons can be drawn: changes in the behavior of read/write events, overhead concerns originating in system implementation, bottlenecks in communication, and other revelations found in the traces \textbf{CITE PAPERS HERE}. GRAB TEXT FROM OTHER SECTION WRITTEN TO STATE WHY TRACES/BENCHMARKS MATTER. +Certain elements of this research are intended to improve tracing knowledge along with general understanding of networked systems. One such element is the tracking of I/O inter-arrival times along with processor times. This allows for more realistic replay of the commands run due to more complete temporal considerations: time taken for computation and ``travel'' time. Another element is the PID/MID/TID/UID tracking, which allows for following command executions between a given client and server. This element, paired with the previous development, helps expand the understanding and replay-ability of temporal scaling. +Things to make sure to say: Need trace of CIFS behavior. +These studies are required in order to evaluate the development of technologies and methodologies along with furthering knowledge of different system aspects and capabilities. +\\ +As has been pointed out by past work, the design of systems is usually guided by an understanding of the file system workloads and user behavior~\cite{Leung2008}.
It is for that reason that new studies are constantly performed by the science community, from large scale studies to individual protocol studies~\cite{Leung2008,Ellard2003,Anderson2004,Roselli2000,Vogels1999}. Even within these studies, the information gleaned is only as meaningful as the considerations of how the data is handled. The following are issues that our work hopes to alleviate: there has been no large scale study done on networks for some time, there has been no study on CIFS (Common Internet File System)/SMB (Server Message Block) protocols for even longer, and most importantly these studies have not tackled lower level aspects of the trace, such as spatial \& temporal scaling idiosyncrasies of network communication. It is for these reasons that we have developed this tracing system and have developed new studies for lower level aspects of network communication. A detailed overview of the tracing and analysis system can be seen in Section~\ref{Tracing System}. The hope is to further the progress made with benchmarks \& tracing so that this work will lend to improving and deepening the knowledge and understanding of these systems, bettering the technology and methodology as a whole. \section{Methodology} \label{Methodology} +\subsection{Interesting Aspects of Research} +\label{Interesting Aspects of Research} +\textbf{RENAME THIS SECTION SOMETHING MORE INTELLIGENT} \\ +Key components of the tracing system are as follows: +\begin{enumerate} +\item PF\_RING lends to the tracing system by minimizing the copying of packets, which allows for more accurate timestamping of incoming traffic packets being captured \textbf{CITE Orosz and PF\_RING here}. + \begin{itemize} + \item PF\_RING licenses are free for students doing research (how licenses were obtained) + \item PF\_RING makes use of a memory ring allocated at creation time. Incoming packets are copied by the kernel module to the memory ring, and read by the user-space applications.
This aids in minimizing packet loss/timestamping issues by not passing packets through the kernel data structures (straight from the PF\_RING user manual). + \end{itemize} +\item Setup of trace1 to intake up to 10 Gb/s of traffic that comes from a network tap on the UITS system. + \begin{itemize} + \item The use of the PF\_RING software aids in allowing for the 10 Gb/s capture rate + \end{itemize} +\item Code written to convert CIFS protocol traffic into DataSeries format. Specific fields were chosen as the fields of interest to be kept for analysis. It should be noted that this selection was somewhat arbitrary, and changes/additions have been made as certain fields are determined to be worth examining. +\item Code written to analyze the captured DataSeries format packets \& other aspects of the analysis. + \begin{itemize} + \item Packet dissection for R/W events. + \item ID tracking (PID/TID/MID/UID) + \item OpLock information. + \item \textbf{Note:} Future work - combine ID tracking with OpLock info to track resource sharing. + \end{itemize} +\end{enumerate} + \subsection{System Limitations} \label{System Limitations} When initially designing the tracing system used in this paper, different aspects were taken into account, such as space limitations of the tracing system, packet capture limitations (e.g. file size), and speed limitations of the hardware. The major space limitation that is dealt with in this work is the amount of space that the system has for storing the captured packets, including the resulting DataSeries-file compressions. One limitation encountered in the packet capture system deals with the functional pcap (packet capture file) size. The concern is that the pcap files only need to be held until they have been filtered for specific protocol information and compressed using the DataSeries format, while still allowing room for the DataSeries files being created to be stored.
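The ID tracking (PID/TID/MID/UID) listed among the analysis components above can be sketched as follows; this is a simplified illustration under our own assumptions (the dictionary-based matcher and the field layout are hypothetical, not the actual analysis code), pairing each SMB request with its response by the (PID, MID, TID, UID) tuple so that per-command response times can be recovered:

```python
def pair_requests(packets):
    """packets: dicts with 'ids' = (PID, MID, TID, UID), 'is_response', 'ts'.
    Returns (ids, response_time) for each matched request/response pair."""
    pending = {}   # ids tuple -> timestamp of the outstanding request
    matched = []
    for p in packets:
        if not p["is_response"]:
            pending[p["ids"]] = p["ts"]          # remember the request
        elif p["ids"] in pending:
            # Response time = response timestamp minus request timestamp.
            matched.append((p["ids"], p["ts"] - pending.pop(p["ids"])))
    return matched

# Hypothetical capture: two outstanding requests, one response seen so far.
pkts = [
    {"ids": (1, 10, 2, 3), "is_response": False, "ts": 0.000},
    {"ids": (1, 11, 2, 3), "is_response": False, "ts": 0.001},
    {"ids": (1, 10, 2, 3), "is_response": True,  "ts": 0.004},
]
pairs = pair_requests(pkts)
```

Combined with the inter-arrival tracking described earlier, this kind of matching is what allows command executions between a given client and server to be followed through the trace.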
Other limitation concerns came from the software and packages used to collect the network traffic data~\cite{Orosz2013,Dabir2008,Skopko2012}. These ranged from the timestamp resolution provided by the tracing system's kernel~\cite{Orosz2013} to how the packet capturing drivers and programs (such as dumpcap and tshark) operate, along with how many copies are performed and how often. These aspects were tackled by installing PF\_RING, a kernel module that allows for kernel-based capture and sampling, with the idea that this will limit packet loss and timestamping overhead, leading to faster packet capture while efficiently preserving CPU cycles~\cite{PFRING}. The speed limitations of the hardware are dictated by the hardware being used (e.g. GB capture interface) and the software that makes use of this hardware (e.g. PF\_RING). After all, our data can only be as accurate as the information being captured~\cite{Ellard2003,Anderson2004}. @@ -367,6 +398,14 @@ The future work of this project would be to \item 2. All DataSeries files (which are purposed for distribution) would be a single file per day's worth of communication; this may be possible with new additions to the DataSeries code but pcap limitations do not currently allow for this. \item 3. Modulation of the capturing software would not only pull out information pertinent to the SMB/CIFS protocol, but would be able to pull multiple protocols which a user would be able to define prior to run-time. \item 4. Better automation of the capturing system would remove the potential of human error causing loss of data. Use of new DataSeries tools may allow for recovery of previously corrupted DataSeries files. + \item 5. Update DataSeries code to current git version to see if this takes care of some of the nuances that exist in the DataSeries code + \begin{itemize} + \item Issues with unsigned values.
+ \item Issues with reading from DataSeries fields + \begin{itemize} + \item The problem is that, if not read properly, data is lost due to the method by which the reading is performed. + \end{itemize} + \end{itemize} \end{itemize} %references section @@ -416,6 +455,9 @@ A Study of Practical Deduplication}, ACM Transactions on Storage (January 2012) \bibitem{PFRING} \emph{PF\_RING High-speed packet capture, filtering and analysis}, url{http://www.ntop.org/products/pf\_ring/} +\bibitem{PFRINGMan} \emph{PF\_RING User Guide}, +url{https://svn.ntop.org/svn/ntop/trunk/PF\_RING/doc/UsersGuide.pdf} + \bibitem{Traeger2008} Avishay Traeger and Erez Zadok and Nikolai Joukov and Charles P.~Wright, \emph{ A Nine Year Study of File System and Storage Benchmarking}, ACM Transactions on Storage (May 2008)