Smaller edits and re-wordings
Duncan committed Feb 3, 2020
1 parent 8ac7cf2 commit 6766e8c
Showing 1 changed file with 4 additions and 4 deletions.
8 changes: 4 additions & 4 deletions trackingPaper.tex
@@ -368,8 +368,8 @@ Average Write Size (B) & 63 \\ \hline
\end{table}
% NOTE: Not sure but this reference keeps referencing the WRONG table

Tables~\ref{tbl:TraceSummaryTotal}
show a summary of the SMB traffic captured, statistics of the I/O operations, and read/write data exchange observed for the network filesystem. This information is further detailed in Table~\ref{tbl:SMBCommands}, which illustrates that the majority of I/O operations are general (74.87\%). As shown in the bottom part of Table~\ref{tbl:SMBCommands} general includes metadata commands such as \texttt{connect}, close, query info, etc.
Table~\ref{tbl:TraceSummaryTotal}
shows a summary of the SMB traffic captured, statistics of the I/O operations, and read/write data exchange observed for the network filesystem. This information is further detailed in Table~\ref{tbl:SMBCommands}, which illustrates that the majority of I/O operations are general (74.87\%). As shown in the bottom part of Table~\ref{tbl:SMBCommands}, general I/O includes metadata commands such as connect, close, query info, etc.

Our examination of the collected network filesystem data revealed interesting patterns for the current use of CIFS/SMB in a large engineering academic setting. The first is that there is a major shift away from read and write operations towards more metadata-based ones. This matches the last CIFS observations made by Leung et~al.\ that files were being generated and accessed infrequently. The change in operations is due to a shift in activity from reading and writing data to simply checking file and directory metadata. However, since the earlier study, SMB has transitioned to the SMB2 protocol, which was supposed to be less ``chatty,'' and thus we would expect fewer general SMB operations. Table~\ref{tbl:SMBCommands} shows a breakdown of SMB and SMB2 usage during May. From this table, one can see that the SMB2 protocol makes up $99.14$\% of total network operations compared to just $0.86$\% for SMB, indicating that most clients have upgraded to SMB2. However, $74.66$\% of SMB2 I/O operations are still general operations. Contrary to the purpose of implementing the SMB2 protocol, there is still a large amount of general I/O.
%While CIFS/SMB protocol has less metadata operations, this is due to a depreciation of the SMB protocol commands, therefore we would expect to see less total operations (e.g. $0.04$\% of total operations).
@@ -488,7 +488,7 @@ Figures~\ref{fig:PDF-Bytes-Read} and~\ref{fig:PDF-Bytes-Write} show the probabil
% \label{fig:CDF-Bytes-RW}
%\end{figure}
Figures~\ref{fig:CDF-Bytes-Read} and~\ref{fig:CDF-Bytes-Write} show cumulative distribution functions (CDF) for bytes read and bytes written. As can be seen, almost no read transfer sizes are less than 32 bytes, whereas 20\% of writes are below 32 bytes. Table~\ref{fig:transferSizes} shows a tabular view of this data. For reads, $34.97$\% are between 64 and 512 bytes, with another $28.86$\% at 64-byte request sizes. There is a negligible percentage of read requests larger than 512 bytes.
This read data differs from the size of reads observed by Leung et al. by a factor of 4 smaller.
This read data differs from the size of reads observed by Leung et al. by a factor of four smaller.
%This read data is similar to what was observed by Leung et al, however at an order of magnitude smaller.
Writes observed also differ from previous inspection of the protocol's usage. % are very different.
Leung et al. showed that $60$--$70$\% of writes were less than 4K in size and $90$\% less than 64K in size. In our data, however, we see that almost all writes are less than 1K in size. In fact, $11.16$\% of writes are less than 4 bytes, $52.41$\% are 64-byte requests, and $43.63$\% of writes are smaller than 64 bytes.
@@ -964,7 +964,7 @@ Due to the large number of metadata operations, the use of smart storage solutio

\subsection{System Limitations and Challenges}
\label{System Limitations and Challenges}
When initially designing the tracing system used in this paper, different aspects were taken into account, such as space limitations of the tracing system, packet capture limitations (e.g. file size), and speed limitations of the hardware. One limitation encountered in the packet capture system deals with the functional pcap (packet capture file) size. The concern being that the pcap files only need to be held until they have been filtered for specific protocol information and then compressed using the DataSeries format, but still allow for room for the DataSeries files being created to be stored. Other limitation concerns came from the software and packages used to collect the network traffic data~\cite{Orosz2013,dabir2007bottleneck,skopko2012loss}. These ranged from timestamp resolution provided by the tracing system's kernel~\cite{Orosz2013} to how the packet capturing drivers and programs (such as dumpcap and tshark) operate along with how many copies are performed and how often. The speed limitations of the hardware are dictated by the hardware being used (e.g. GB capture interface) and the software that makes use of this hardware (e.g. PF\_RING). After all, our data can only be as accurate as the information being captured~\cite{seltzer2003nfs,anderson2004buttress}.
When initially designing the tracing system used in this paper, different aspects were taken into account, such as space limitations of the tracing system, packet capture limitations (e.g. file size), and speed limitations of the hardware. One limitation encountered in the packet capture system deals with the functional pcap (packet capture file) size. The concern being that the pcap files only need to be held until they have been filtered for specific protocol information and then compressed using the DataSeries format, but still allow for room for the DataSeries files being created to be stored. Other limitation concerns came from the software and packages used to collect the network traffic data~\cite{Orosz2013,dabir2007bottleneck,skopko2012loss}. These ranged from timestamp resolution provided by the tracing system's kernel~\cite{Orosz2013} to how the packet capturing drivers and programs (such as dumpcap and tshark) operate along with how many copies are performed and how often. The speed limitations of the hardware are dictated by the hardware being used (e.g. Gb capture interface) and the software that makes use of this hardware (e.g. PF\_RING). After all, our data can only be as accurate as the information being captured~\cite{seltzer2003nfs,anderson2004buttress}.
Another concern was whether or not the system would be able to function optimally during periods of high network traffic. All aspects of the system, from the hardware to the software, have been altered to help combat these concerns and allow for the most accurate packet capturing possible.

%About Challenges of system