%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Please remove the next line of code if you
%% are satisfied that your installation is
%% complete and working.
%%
%% It is only there to help you in detecting
%% potential problems.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%\input{aipcheck}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% SELECT THE LAYOUT
%%
%% The class supports further options.
%% See aipguide.pdf for details.
%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\documentclass[
,final % use final for the camera ready runs
%% ,draft % use draft while you are working on the paper
%% ,numberedheadings % uncomment this option for numbered sections
%% , % add further options here if necessary
]
{aipproc}
\layoutstyle{8x11single}
\usepackage{epsfig}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% FRONTMATTER
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{document}
\title{An Analysis of Program Phase Behavior and its Predictability}
\classification{07.05.Kf, 89.20.Ff, 89.75.Kd}
\keywords {program phase behavior, phase predictability, Basic Block Vectors, SPEC CPU2000, modeling}
\author{Frederik Vandeputte}{
address={Ghent University
\\
Department of Electronics and Information Systems (ELIS)
\\
Sint-Pietersnieuwstraat 41, B-9000 Gent, Belgium}
}
\author{Lieven Eeckhout}{
address={Ghent University
\\
Department of Electronics and Information Systems (ELIS)
\\
Sint-Pietersnieuwstraat 41, B-9000 Gent, Belgium}
}
\author{Koen De Bosschere}{
address={Ghent University
\\
Department of Electronics and Information Systems (ELIS)
\\
Sint-Pietersnieuwstraat 41, B-9000 Gent, Belgium}
}
\begin{abstract}
The execution of a computer program typically consists of a number of phases where each phase exhibits its own behavior. This phase behavior can be exploited for various purposes such as hardware adaptation, performance modeling, etc. Recently, a number of phase identification and classification schemes have been proposed to capture this phase behavior. For some of these phase classification schemes, phase predictors are needed to accurately predict phase transitions.
In this paper, we analyze program phase behavior by constructing a model for representing the phase sequence of a program execution. We validate this model by generating a probabilistic phase sequence from it and by comparing the phase prediction accuracy of this phase sequence with the original program phase sequence for a number of phase predictors.
From this model, we derive a compact model which we use to evaluate the performance of various types of phase predictors. This gives us a better understanding of program phase behavior.
\end{abstract}
\maketitle
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% MAINMATTER
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{1. Introduction}
\label{section:introduction}
A computer program execution typically consists of a number of phases. Ideally, each phase identifies some part of the execution of a program that exhibits similar behavior within that phase and distinct behavior in comparison with the other phases of the program execution. Being aware of this large-scale time-varying behavior is key to understanding the behavior of a program as a whole. Phase behavior can be exploited for various purposes, like performance modeling~\cite{sherwood02automatically}, compiler optimizations~\cite{sherwood02automatically}, hardware adaptation~\cite{albonesi99selective,dhodapkar02managing,dropsho02integrating,huang01profile,huang03positional,sherwood02automatically,sherwood03tracking,vandeputte05offline}, etc. For example in phase-based hardware adaptation, if we know that particular parts of the processor are nearly unused during some program phase, we could turn off those parts during that phase, resulting in a reduced energy consumption without affecting overall performance.
There are many ways to identify phases in a program. One way is to divide the entire program execution into fixed-length instruction intervals and to cluster similar instruction intervals into a phase, based on the code that is executed within these intervals; this is called a temporal approach.
When this type of phase classification is used in a dynamic phase-based optimization system, it is important to be able to accurately predict when the next phase transition will occur and to what phase the program will change so that the system can anticipate that phase transition.
In this paper, we will focus on a temporal approach, which is roughly sketched in section 3. Using this approach, we will extract phase sequences from a number of benchmarks and study their phase behavior by building a model that is capable of summarizing these phase sequences. The main goal of this work is to gain better insight into the phase behavior of computer programs. We will then validate this model by comparing the phase prediction accuracies of the original phase sequences and the phase sequences generated with the model, for a number of phase predictors. Finally, we will evaluate the performance of several phase predictors on particular synthetically generated phase sequences.
%and try to partition a number of benchmarks into a number of categories, based on their phase behavior.
%contributions
%outline?
%emphasize the fact that we model temporal phases here, not positional/hierarchical ones...
\section{2. Previous Work}
\label{section:related}
%Automatically caracterizing...
In~\cite{sherwood02automatically}, Sherwood et al.\ developed an offline phase identification and classification method for automatically extracting a number of homogeneous phases in a program. In their approach, they first collect so-called \textit{Basic Block Vectors} (BBVs) to summarize the execution behavior of a program, followed by a clustering algorithm to partition these BBVs into a number of phases. In~\cite{sherwood03tracking}, they developed an online phase classification method, which tries to identify these phases on the fly, while the program is being executed by the processor. In that work, they also proposed a number of phase predictors to effectively predict the next phase transition. In~\cite{lau05transition}, Lau et al.\ added some improvements to this online phase classification method by adding a transition phase and variable thresholds. They also improved existing phase predictor schemes by adding confidence counters to the phase predictor table.
%Phase tracking and prediction
%Comparing program phase detection techniques
Dhodapkar et al.~\cite{dhodapkar02managing} made a comparison between several state-of-the-art phase detection methods by evaluating their ability to effectively detect phase changes. They concluded that BBV techniques provide higher sensitivity and more stable phases than other phase detection techniques.
%Offline phase analysis and optimization
In~\cite{vandeputte05offline}, Vandeputte et al.\ proposed an offline phase classification method based on~\cite{sherwood03tracking}, that is able to efficiently deal with multi-configuration processors.
In~\cite{vandeputte05detailed}, Vandeputte et al.\ analyzed and compared a number of state-of-the-art phase predictors. They also improved the phase predictors presented in~\cite{lau05transition} by adding conditional update.
%A detailed study on phase predictors
\section{3. Methodology}
\label{section:methodology}
\begin{figure}[t]
%\begin{center}
\includegraphics[width=0.75\textwidth]{phasesequence.eps}
\vspace{-5mm}
\caption{An example phase sequence, its breakdown into burst length sequences and phase transitions, and the phase transition matrix.}
\label{fig:example}
%\end{center}
\vspace{-6mm}
\end{figure}
In our analysis, we use the complete SPEC CPU2000 benchmark suite\footnote{http://www.spec.org/osg/cpu2000/}. The binaries were taken from the SimpleScalar website\footnote{http://www.simplescalar.com/}. We used an off-line phase classification scheme as presented in~\cite{vandeputte05offline}, using 1 million and 8 million instruction intervals\footnote{To be exact, each interval consists of $2^{20}$ and $2^{23}$ instructions, respectively.}. %Due to lack of space, we will only show the results for 8M instruction intervals. For 1M instruction intervals, the results are comparable.
Once the phases corresponding to a program execution are identified, we extract the phase sequence corresponding to that program execution by determining for each execution interval to what phase it belongs. A phase sequence thus consists of a sequence of phase IDs. In this paper, we will study the program phase behavior using these phase sequences and construct a model from them.
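As a concrete illustration (our own sketch, not the paper's implementation), the breakdown of such a phase sequence into phase bursts and a phase transition matrix, as in Fig.~\ref{fig:example}, can be computed as follows:

```python
from collections import Counter

def analyze_phase_sequence(seq):
    """Break a phase sequence (list of phase IDs) into phase bursts and
    count the phase transitions (the entries of the transition matrix)."""
    bursts = []              # list of (phase ID, burst length)
    transitions = Counter()  # (from phase, to phase) -> count
    cur, length = seq[0], 1
    for phase in seq[1:]:
        if phase == cur:
            length += 1
        else:
            bursts.append((cur, length))
            transitions[(cur, phase)] += 1
            cur, length = phase, 1
    bursts.append((cur, length))
    return bursts, transitions

bursts, ptm = analyze_phase_sequence(list("AABBA"))
```

Here the sequence AABBA decomposes into three bursts (A of length 2, B of length 2, A of length 1) and two transitions.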
%During the evaluation, we will often use two types of phase sequences for a given program execution, namely the reference phase sequences and the cross-reference phase sequences. For the cross-reference phase sequences, the phases appearing in the sequence are identified by profiling the same program executed with a training input. These phases are then used when executing the program with a reference input. For the reference phase sequences, This For the
To validate our model, we use a number of existing phase predictors as presented in~\cite{vandeputte05detailed}. For each predictor type, we use the optimal predictor for a given hardware budget of 256 bytes.
% also be sure to explain cross and ref
\section{4. Program Phase Model}
\label{section:model}
In this section, we will build a model for representing program phase sequences. First, we will introduce and discuss the phase model parameters. Second, we will show how a probabilistic program phase sequence can be generated from this phase model.
%NOTE: the model built here is mainly intended for testing predictability. For other applications this model may be too extensive or even not complete enough.
\subsection{4.1. Phase Model Parameters}
\label{subsection:params}
Our phase model consists of two types of parameters, namely intra-phase and inter-phase parameters. The intra-phase parameters consist of all parameters needed for modeling one phase. The inter-phase parameters consist of all parameters needed for combining all phases into a full program phase sequence. An example phase sequence is shown in Fig.~\ref{fig:example}. An overview of all parameters is shown in Table~\ref{tab:params}.
\begin{figure}%[t]
\hspace{-5mm}
\mbox
{
{{ \includegraphics[width=0.25\textwidth]{burstlengthdistr.gcc.166.phase.eps} }}
\hspace{-5mm}
{{ \includegraphics[width=0.25\textwidth]{BLRdistr.gcc.166.phase.eps} }}
\hspace{-5mm}
% {{ \includegraphics[width=0.20\textwidth]{vtransdistr.gcc.166.phase.eps} }}
% \hspace{-5mm}
{{ \includegraphics[width=0.25\textwidth]{htransdistr.gcc.166.phase.eps} }}
\hspace{-5mm}
{{ \includegraphics[width=0.25\textwidth]{PTRdistr.gcc.166.phase.eps} }}
}
\vspace{-5mm}
\caption{An example of the real and approximated burst length distribution, BLR distribution, next phase transition distribution and PTR distribution, for one phase of gcc.}
\label{fig:paramfitting}
\vspace{-6mm}
\end{figure}
\subsubsection{4.1.1. Intra-Phase Parameters}
Each phase has a number of characteristics that are specific to that phase and that are (more or less) independent of the characteristics of all other phases in a program. For our model, we tried to find a minimal set of parameters that was able to accurately model a particular phase. In Fig.~\ref{fig:paramfitting}, the real distributions and their approximations in our model are shown for a number of parameters of the model.
\begin{description}
\item[Phase length] The total number of intervals belonging to a particular phase in the program phase sequence is called the phase length. This parameter indicates the relative importance of each phase, as there typically exist only a few large phases.
\item[Number of phase bursts] Not all intervals belonging to the same phase are contiguous; they are often interleaved with intervals of other phases, because a program often executes some pieces of code several times throughout its execution. A sequence of consecutive intervals belonging to the same phase is called a phase burst (see Fig.~\ref{fig:example}). Each phase thus consists of a set of phase bursts. As the number of phase bursts may affect the predictability of a phase sequence -- many small phase bursts could drastically reduce the predictability -- this is an important parameter in our model.
\item[Burst length distribution] As a phase consists of several phase bursts and each phase burst has its own burst length, we also model the distribution of these burst lengths for each phase. We experimentally verified that this distribution can be modeled using a lognormal distribution, which handles well the constraint that all burst lengths must be positive, so that the distribution is not symmetric when the average burst length is close to 1. This distribution requires two parameters $\mu$ and $\sigma$, the mean and standard deviation of the underlying normal distribution, respectively. In Fig.~\ref{fig:paramfitting}, an example is shown for a burst length distribution with an average close to 1. Notice that in such cases, the distribution also resembles a geometric distribution. This is of course not the case if the average burst length is much higher than 1.
\item[Burst length run distribution] When looking at real phase sequences, it appears that for a particular phase, the burst lengths are not randomly spread over all phase bursts. Instead, contiguous phase bursts of the same phase frequently have exactly the same burst length for a given period of time. The number of contiguous bursts with the same burst length is called a burst length run (BLR). For example, in Fig.~\ref{fig:example}, all three bursts of phase $C$ have the same burst length -- namely 1, meaning that phase $C$ has a BLR of length 3. Phase $B$ has a BLR of length 3, followed by a BLR of length 1, as the last burst length differs from the previous one. Phase $A$ has two BLRs of length 1 and one of length 3. The distribution of all the BLRs of one phase can be accurately modeled by a (weighted) geometric distribution which has only one parameter $p$. An example is shown in Fig.~\ref{fig:paramfitting}. For evaluating phase predictability, modeling this non-random behavior is very important.
\end{description}
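To make the intra-phase parameters above concrete, the following sketch (our own illustration, with hypothetical parameter values) samples the burst lengths of one phase: a lognormal distribution for the burst lengths themselves, and a geometric distribution for the burst length runs:

```python
import random

def sample_geometric(p, rng):
    """Number of trials up to and including the first success (prob. p)."""
    k = 1
    while rng.random() > p:
        k += 1
    return k

def sample_burst_lengths(n_bursts, mu, sigma, blr_p, seed=42):
    """Sample n_bursts burst lengths for one phase: lengths follow a
    lognormal(mu, sigma) distribution (rounded to at least 1), and
    consecutive bursts share the same length in runs (BLRs) whose
    lengths follow a geometric(blr_p) distribution."""
    rng = random.Random(seed)
    bursts = []
    while len(bursts) < n_bursts:
        # one BLR: a run of consecutive bursts sharing one burst length
        run = min(sample_geometric(blr_p, rng), n_bursts - len(bursts))
        length = max(1, round(rng.lognormvariate(mu, sigma)))
        bursts.extend([length] * run)
    return bursts

bursts = sample_burst_lengths(20, mu=0.5, sigma=0.8, blr_p=0.4)
```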
\begin{table}[t]
%\begin{center}
\caption{An overview of all phase model parameters.}
\resizebox{12cm}{!}
{
\begin{tabular}{|l|l|}
\hline
\textbf{Parameter} & \textbf{Meaning} \\
\hline
PhaseLength[$k$] & Total phase length of phase $k$ \\
NBursts[$k$] & Number of bursts in phase $k$\\
BurstLength\_$\mu$[$k$] & Burst length distribution mean\\
BurstLength\_$\sigma$[$k$] & Burst length distribution standard deviation\\
BLR\_P[$k$] & Burst length run distribution parameter \\
\hline
NPhases & Number of phases in the program\\
InPhaseTransitions\_P[$k$] & Distribution parameter for the transitions \textit{to} phase $k$ \\
OutPhaseTransitions\_P[$k$] & Distribution parameter for the transitions \textit{from} phase $k$ \\
PTR\_P[$k$] & Phase Transition run distribution parameter\\
\hline
\end{tabular}
\label{tab:params}
}
%\end{center}
\vspace{-6mm}
\end{table}
\subsubsection{4.1.2. Inter-Phase Parameters}
The previous parameters only describe the characteristics for each phase separately; in this section we will describe a number of parameters that model the relationship between all phases in the program.
\begin{description}
\item[Number of phases] In our model, we also include how many phases there are in the program phase sequence.
\item[Phase transition distribution] As can be seen in Fig.~\ref{fig:example}, the phase transition sequence can be described with a phase transition matrix (PTM). We model the transition probabilities for going from phase $A$ to any other phase (row $A$ in the PTM) and, vice versa, the transition probabilities for going from any phase to phase $A$ (column $A$ in the PTM). It appears that these probabilities can be modeled accurately with a (truncated) geometric distribution. So in our model, for each phase, we have one parameter modeling the previous-phase and next-phase transition probability, respectively. Fig.~\ref{fig:paramfitting} shows an example next-phase transition probability distribution.
%In our model, we do not keep the full matrix and the exact transitions from phase A to phase B; we only try to preserve the distribution
\item[Phase transition run distribution] As for the burst lengths, particular phase transitions are not randomly spread across the program. Instead, phase transitions (e.g.\ going from phase $A$ to phase $C$) tend to occur in bursts, which is understandable, as most programs are not random, but follow a distinct pattern. As this is also a property most phase predictors take advantage of, it is important to model this bursty behavior. As such, we calculate for each phase how long each phase transition remains the same and we make a distribution of these phase transition runs (PTRs). As for the BLR distribution, it appears that this distribution can also be modeled accurately using a (weighted) geometric distribution. An example is shown in Fig.~\ref{fig:paramfitting}.
\end{description}
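The phase transition runs described above can be measured from a phase sequence as follows (a minimal sketch of our own, not the paper's implementation):

```python
def phase_transition_runs(seq):
    """For each phase, measure how long each outgoing phase transition
    stays the same, i.e. the phase transition runs (PTRs)."""
    # collect, per phase, the sequence of next-phase targets
    targets = {}
    prev = seq[0]
    for cur in seq[1:]:
        if cur != prev:
            targets.setdefault(prev, []).append(cur)
            prev = cur
    # turn each target sequence into run lengths
    runs = {}
    for phase, ts in targets.items():
        lengths, run = [], 1
        for a, b in zip(ts, ts[1:]):
            if a == b:
                run += 1
            else:
                lengths.append(run)
                run = 1
        lengths.append(run)
        runs[phase] = lengths
    return runs

runs = phase_transition_runs(list("ABABAC"))
```

In ABABAC, phase A transitions to B twice and then to C once, giving PTRs of length 2 and 1.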
%TODO: where to say something about unifying all intra-phase params???
\begin{figure}[t]
%\begin{center}
\mbox
{
{{ \includegraphics[width=0.45\textwidth]{modelpredictoraccuracy_ref_1M.eps} }}
{{ \includegraphics[width=0.45\textwidth]{modelpredictoraccuracy_ref_8M.eps} }}
}
\vspace{-5mm}
\caption{The misprediction rate of various types of phase predictors on the x axis applied to the original program phase sequence and the probabilistic phase sequence. The left graph shows the results when 1M instruction intervals are used; the right graph shows the results for 8M instruction intervals. The results are averaged over all SPEC CPU2000 benchmarks. }
\label{fig:modelpredictoraccuracy}
%\end{center}
\vspace{-6mm}
\end{figure}
\subsection{4.2. Construction of a Probabilistic Program Phase Sequence}
\label{subsection:construction}
In the previous section, we introduced all parameters involved in the model. In this section, we will give a very brief description of how a probabilistic program phase sequence can be generated, using the model parameters as described in the previous section. The construction process consists of three steps. First, a sequence of phase bursts is generated for each phase. Second, a phase transition matrix and a phase transition sequence is built. Finally, the phase burst sequences are merged with the phase transition sequence into a probabilistic program phase sequence that matches the model.
As most phase model parameters are not independent of each other, the main challenge was to find a scheme that allows for generating a sequence that meets all modeling parameters as well as possible.
To generate the phase burst sequences, we first generate a collection of BLRs using a geometric distribution, resulting in a number of clusters of bursts with the same burst length. Then we generate for each such BLR cluster an appropriate burst length (which all bursts in this BLR cluster will share). Because this may not lead to a good result immediately, we repeat this process for several iterations. Finally, we order this collection of clusters in a way that adjacent clusters of bursts do not have the same burst length. For this, we generate a burst length transition matrix and extract a valid burst length transition sequence from it, which is similar to finding a valid phase transition sequence as will be discussed further in this subsection.
Once we have the phase burst sequences, we still have to find a valid phase transition sequence in which each phase $A$ is split into NBursts[$A$] bursts. In other words, we have to find a combination of all phase bursts of all phases such that no two bursts of the same phase are adjacent. To solve this problem, we divide this process into two steps. First, we build a phase transition matrix (PTM) that satisfies Kirchhoff's node law for networks and the modeling parameters from the previous subsection. We developed an algorithm to generate a PTM that meets these two conditions, but due to lack of space, we will not elaborate on this. Then we try to find a valid path in the transition graph described by this PTM that traverses all the edges of the transition graph. Such a path is called an Euler path, and we used a modified version of Fleury's algorithm~\cite{lucas91fleury} to find such a path that also matches the PTR distributions. For this, we had to build and solve another transition matrix to model the transitions from one successor node to another. We will not elaborate on this due to lack of space.
Finally, the program phase sequence can be generated by simply merging the phase burst sequences with the phase transition sequence.
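This final merging step can be sketched as follows, assuming the per-phase burst-length sequences and the phase transition sequence have already been generated consistently (our own illustration):

```python
def merge_sequences(burst_lengths, transition_seq):
    """Merge per-phase burst-length sequences with a phase transition
    sequence into a full program phase sequence: each occurrence of a
    phase in the transition sequence consumes its next burst length."""
    iters = {p: iter(b) for p, b in burst_lengths.items()}
    out = []
    for phase in transition_seq:
        out.extend([phase] * next(iters[phase]))
    return out

seq = merge_sequences({'A': [2, 1], 'B': [3]}, ['A', 'B', 'A'])
```

For a consistent input, each phase appears in the transition sequence exactly as many times as it has bursts.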
\section{5. Evaluation}
\label{section:evaluation}
In this section, we will evaluate our model as described in section 4. First, we will validate this model by comparing the misprediction rates for several phase predictors on the original and probabilistic phase sequence.
Second, we will try to find a distribution for each phase specific parameter. From these distributions, a compact model will be derived.
Third, we will use this compact model to investigate the performance of the predictor types mentioned in~\cite{vandeputte05detailed}.
%Finally, we will try to partition all SPEC CPU2000 benchmarks into a number of categories, based on the modeling parameters.
%We will also try to explain why some benchmarks are more predictable than others.
\subsection{5.1. Model Validation}
\label{subsection:validation}
\begin{figure}[t]
%\begin{center}
\includegraphics[width=1.0\textwidth]{predictoraccuracy_benchmarks_ref_8M.eps}
\vspace{-5mm}
\caption{For all SPEC CPU2000 benchmarks, the misprediction rate of the 1-level burst predictor with conditional update and confidence, applied to the original reference phase sequence and the probabilistic phase sequence. The last-value misprediction rate is shown as a reference point. The results are for 8M instruction intervals.}
\label{fig:predictoraccuracy_benchmarks}
%\end{center}
\vspace{-6mm}
\end{figure}
To validate the model, we compare the performance of several phase predictors on the original and probabilistic phase sequence for all SPEC CPU2000 benchmarks. The results of this\footnote{A note on the abbreviations identifying the phase predictors: the first part states whether it is a burst predictor or an RLE predictor; \textit{sat} means that conditional update is used; then the number of history levels used by the predictor is shown; and a \textit{t} at the end means that confidence is used in that predictor. An overview and description of all these predictor types can be found in~\cite{vandeputte05detailed}.} can be seen in Fig.~\ref{fig:modelpredictoraccuracy}. On average, the error between the misprediction rates of the original and probabilistic phase sequence is about 3\% for 1M instruction intervals and about 2\% for 8M instruction intervals.
%Especially for the 8M instruction intervals, the largest error can be noticed for the 2-level phase predictors. This is because we do not explicitly model second-order effects and thus our model could generate a phase sequence which is 'too random' for these types of predictors.
Fig.~\ref{fig:predictoraccuracy_benchmarks} shows the misprediction rates for each benchmark separately for a 1-level burst and RLE predictor with conditional update and confidence for 8M instruction intervals. These results show that the probabilistic phase sequences track the original phase sequences fairly well. The largest errors can be seen for phase sequences where the (original) burst predictors outperform the simple last value predictor (i.e.\ they have a much higher prediction accuracy).
%For the 2-level predictor, one can notice a positive error between the misprediction rates of the original and the probabilistic phase sequences for most benchmarks. As mentioned before, this is because we do not explicitly model second-order effects and thus our model generates a phase sequence that is 'too random' for these types of predictors. As a result, one can see that on average the difference between the misprediction rates of the 1-level and 2-level predictor is not as high for the probabilistic phase sequences as for the original phase sequences.
%reason for the negative error of the 1-level pred: the sequence generated from the model does not satisfy the model 100%, but deviates slightly from it => also measure something about this???
\subsection{5.2. Summarizing Phase Specific Parameters}
The model as presented in section 4 collects statistics for each phase separately. In this subsection, we will try to find a connection between the phase model parameters (as listed in Table~\ref{tab:params}) of all phases of one program and represent them by an appropriate distribution. The end goal is to derive a compact model from them, which will be used in the next subsection for evaluating phase predictor behavior.
\begin{table}[t]
%\begin{center}
\caption{The parameters of the compact phase sequence model.}
\resizebox{15cm}{!}
{
\begin{tabular}{|ll|l|}
\hline
\multicolumn{2}{|c|}{\textbf{Parameter in compact model}} & \multicolumn{1}{|c|}{\textbf{Meaning}} \\
\hline
NPhases & & Number of phases\\
\hline
PhaseLength\_P & Avg\_PhaseLength& Distribution of phase lengths\\
NBursts\_P & Avg\_NBursts& Distribution of number of bursts\\
BurstLength\_$\mu$\_P & Avg\_BurstLength\_$\mu$& Distribution of BurstLength\_$\mu$ parameter\\
BurstLength\_$\sigma$\_P & Avg\_BurstLength\_$\sigma$& Distribution of BurstLength\_$\sigma$ parameter\\
\hline
BLR\_P\_$\mu$ & BLR\_P\_$\sigma$ (=0.24)& Distribution of burst length run parameter\\
PhaseTransition\_P\_$\mu$ & PhaseTransition\_P\_$\sigma$ (=0.22)& Distribution of phase transition parameter\\
PTR\_P\_$\mu$ & PTR\_P\_$\sigma$ (=0.24)& Distribution of phase transition run parameter\\
\hline
\end{tabular}
\label{tab:compactmodel}
}
%\end{center}
\vspace{-6mm}
\end{table}
\begin{figure}[t]
% \begin{center}
\mbox
{
\hspace{-5mm}
{{ \includegraphics[width=0.33\textwidth]{phaselengthdistr.ammp.eps} }}
\hspace{-5mm}
{{ \includegraphics[width=0.33\textwidth]{phaselengthdistr.gcc.166.eps} }}
\hspace{-5mm}
{{ \includegraphics[width=0.33\textwidth]{phaselengthdistr.facerec.eps} }}
}
\vspace{-5mm}
\caption{Truncated geometric distribution versus the sorted relative phase lengths for 3 benchmarks. }
\label{fig:phaselengthdistribution}
% \end{center}
\vspace{-3mm}
\end{figure}
For example, if we take the relative phase lengths of all phases of one program and sort them in decreasing order, it appears that these fractions approximate a truncated geometric distribution very well, as can be seen in Fig.~\ref{fig:phaselengthdistribution}. For benchmarks ammp and gcc, the error when fitting the phase lengths using the least mean square method is less than 0.5\%. For facerec, the least mean square error is 3\%. The reason for this mismatch is that there is only one major phase; all other phases are small and more or less equal in length. Averaging over all SPEC CPU2000 benchmarks, the error is less than 1\%. As a result, the phase lengths of all phases of a program can be modeled with only 2 parameters, namely parameter $PhaseLength\_P$ coming from the geometric distribution, and $Avg\_PhaseLength$, the sum of all phase lengths (which is also necessary because the geometric distribution only describes the relative phase lengths), divided by the number of phases in the program. The same can be done for parameters NBursts[$k$], BurstLength\_$\mu$[$k$] and BurstLength\_$\sigma$[$k$] in Table~\ref{tab:params}, with a least mean square error of 1\%, 0.7\% and 5\%, respectively.
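This fit can be sketched as follows; the grid search below is our own minimal substitute for a proper least-squares fit, not the paper's method:

```python
def truncated_geometric(p, n):
    """Truncated geometric pmf over n phases, renormalized to sum to 1."""
    w = [p * (1 - p) ** k for k in range(n)]
    s = sum(w)
    return [x / s for x in w]

def fit_phase_lengths(lengths):
    """Fit parameter p of a truncated geometric distribution to the
    sorted relative phase lengths by a least-squares grid search."""
    total = sum(lengths)
    fractions = sorted((l / total for l in lengths), reverse=True)
    n = len(fractions)
    best_p, best_err = None, float("inf")
    for i in range(1, 1000):
        p = i / 1000
        model = truncated_geometric(p, n)
        err = sum((a - b) ** 2 for a, b in zip(fractions, model))
        if err < best_err:
            best_p, best_err = p, err
    return best_p, best_err

# phase lengths 8:4:2:1 are exactly truncated geometric with p = 0.5
p, err = fit_phase_lengths([8, 4, 2, 1])
```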
% \begin{table}[t]
% \begin{center}
% \caption{The range of all parameters in the compact model over all SPEC CPU2000 benchmarks.}
% \resizebox{\textwidth}{!}
% {
%
% \begin{tabular}{|l|rrrrrrrrrrrr|}
% \hline
% & NPhases & PhLen\_P & Avg\_PhLen & BL\_$\mu$\_P & Avg\_BL\_$\mu$ & BL\_$\sigma$\_P & Avg\_BL\_$\sigma$ & NBursts\_P & Avg\_NBursts & BLR\_P\_$\mu$ & PhTrans\_P\_$\mu$ & PTR\_P\_$\mu$ \\
% \hline
% min & 2 & 0.12 & 60 & 0.056 & 0.25 & 0.001 & 0 & 0.001 & 1 & 0.062 & 0.39 & 0.034\\
% avg & 10 & 0.52 & 2911 & 0.31 & 1.89 & 0.25 & 0.96 & 0.22 & 259 & 0.52 & 0.68 & 0.35\\
% max & 32 & 1 & 18713 & 1 & 7.55 & 1 & 4.06 & 0.45 & 3540 & 1 & 1 & 0.71\\
% stddev & 8.02 & 0.25 & 3796 & 0.18 & 1.61 & 0.20 & 0.76 & 0.14 & 547 & 0.26 & 0.14 & 0.19\\
% \hline
% \end{tabular}
% \label{tab:modelparamrange}
% }
% \end{center}
% \vspace{-6mm}
% \end{table}
The other parameters in the model (namely BLR\_P[$k$], In/OutPhaseTransitions\_P[$k$] and PTR\_P[$k$]) can be modeled using a Normal distribution having two parameters $\mu$ and $\sigma$. Because the standard deviation of these distributions is fairly constant over all benchmarks, we used a fixed standard deviation to simplify the compact model. We also combined the In/OutPhaseTransitions\_P[$k$] distributions into one distribution. The resulting compact model is shown in Table~\ref{tab:compactmodel}.
\begin{figure}[t]
%\begin{center}
\includegraphics[width=1.0\textwidth]{predictoraccuracy_benchmarks_ref_8M_compact.eps}
\vspace{-5mm}
\caption{For all SPEC CPU2000 benchmarks, the misprediction rate of the 1-level burst predictor with conditional update and confidence, applied to the original reference phase sequence and the probabilistic phase sequence using only the parameters of the compact model. The last-value misprediction rate is shown as a reference point. The results are for 8M instruction intervals.}
\label{fig:predictoraccuracy_benchmarks_compact}
%\end{center}
\vspace{-6mm}
\end{figure}
To generate a probabilistic phase sequence with this compact model, we first use the compact model to generate a value for each parameter of each phase in the extended model. Then we simply generate a probabilistic phase sequence as described in section 4. The main issue here is to find a good and valid permutation such that the generated parameters for each phase are consistent and viable.
%Due to lack of space, we will not elaborate on this.
We validated the model in the same manner as described in section 5.1. Fig.~\ref{fig:predictoraccuracy_benchmarks_compact} shows the performance of several phase predictors when applied to the original and corresponding probabilistic phase sequences. In general, the accuracy is similar to that of the more extended model (compare Fig.~\ref{fig:predictoraccuracy_benchmarks} and Fig.~\ref{fig:predictoraccuracy_benchmarks_compact}), except for equake and facerec -- the benchmarks with the highest last-value misprediction rates -- where there is a significant under- and overestimation of the misprediction rate, respectively. In these cases, the compact model was not able to fully capture their specific short-burst phase behavior.
%say something more about accuracy with this model?
\subsection{5.3. Predictor Type Evaluation}
\begin{table}[t]
%\begin{center}
\caption{The initial values for the parameters of the compact phase sequence model.}
\vspace{-3mm}
\resizebox{15cm}{!}
{
\begin{tabular}{|l|r||l|r||l|r||l|r|}
\hline
Parameter & Value & Parameter & Value & Parameter & Value & Parameter & Value\\
\hline
NPhases & 10 & BLR\_P\_$\mu$ & 0.40 & PhaseTransition\_P\_$\mu$ & 0.70 & PTR\_P\_$\mu$ & 0.40\\
PhaseLength\_P & 0.30 & Avg\_PhaseLength & 1000 & NBursts\_P & 0.20 & Avg\_NBursts & 200\\
BurstLength\_$\mu$\_P & 0.40 & Avg\_BurstLength\_$\mu$ & 0.90 & BurstLength\_$\sigma$\_P & 0.30 & Avg\_BurstLength\_$\sigma$ & 0.30\\
\hline
\end{tabular}
\label{tab:defaultparams}
}
%\end{center}
\vspace{-6mm}
\end{table}
We will now use this compact model to investigate the relationship between some parameters of the model and the performance of several phase predictor types, and to compare the different predictor types against each other. The initial values for all phase model parameters used in our evaluation are shown in Table~\ref{tab:defaultparams}. All graphs presented in this subsection show the improvement of certain predictors over the misprediction rate of the standard last-value predictor\footnote{The last-value predictor predicts that the phase ID of the next execution interval will be the same as the phase ID of the current interval. In other words, with the last-value predictor, there is a misprediction at the end of every phase burst.}.
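The last-value predictor of the footnote can be expressed directly (our own sketch):

```python
def last_value_misprediction_rate(seq):
    """The last-value predictor predicts that the next interval keeps the
    current phase ID, so it mispredicts exactly once per phase burst,
    namely at the burst's final interval (except for the last burst)."""
    mispredictions = sum(1 for a, b in zip(seq, seq[1:]) if a != b)
    return mispredictions / (len(seq) - 1)

# AABBA has three bursts, so 2 mispredictions over 4 predictions
rate = last_value_misprediction_rate(list("AABBA"))
```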
\begin{figure}[t]
%\begin{center}
\mbox
{
{{ \includegraphics[width=0.48\textwidth]{eval_burst_9.eps} }}
{{ \includegraphics[width=0.48\textwidth]{eval_burst_11.eps} }}
}
\vspace{-5mm}
\caption{The reduction of the phase misprediction rate of various types of burst predictors over the last-value predictor when parameter BLR\_P\_$\mu$ (left graph) or parameter PTR\_P\_$\mu$ (right graph) is varied between 0.1 and 1.0. }
\label{fig:evalburstparam9and11}
%\end{center}
\vspace{-6mm}
\end{figure}
Fig.~\ref{fig:evalburstparam9and11} shows the effect of varying parameter BLR\_P\_$\mu$ (i.e.\ the mean of the \textit{burst length run} geometric distribution parameter) and parameter PTR\_P\_$\mu$ (i.e.\ the mean of the \textit{phase transition run} geometric distribution parameter) on the performance of various types of burst predictors\footnote{As not all parameters are independent of each other, we also had to vary some other parameters accordingly to be able to explore the full range of these parameters.}. Anything below zero means that the predictor performs worse than the simple last-value predictor for that configuration. From these graphs, a number of interesting observations can be made.
First, there is a clear difference in performance between the predictors with and without confidence: the former remain above zero, whereas the latter drop far below zero when BLR\_P\_$\mu$ (or PTR\_P\_$\mu$) approaches 1.0. The main reason is the following. As BLR\_P\_$\mu$ approaches one, the average burst length run decreases to one, meaning that the lengths of two consecutive bursts of the same phase differ (almost) all the time. This increases the number of phase mispredictions, as the standard burst predictor uses the length of the previous burst to predict the length of the current burst. Moreover, if the real burst length is larger than the predicted burst length, there are two mispredictions per phase burst: one when the current burst reaches the predicted burst length, and one when it reaches its real burst length. This is why the burst predictor performs worse for high values of BLR\_P\_$\mu$. The predictors with confidence counters, however, only predict a phase change when they are confident about the prediction~\cite{vandeputte05detailed}. As a result, when BLR\_P\_$\mu$ reaches 1.0, phase predictors with confidence in effect degrade to the last-value predictor: all confidence counters remain 0 almost all the time, and the phase prediction accuracy converges to that of the last-value predictor. The same holds for parameter PTR\_P\_$\mu$. Note that for very predictable phase sequences, adding confidence can also degrade the performance of the phase predictors, because of hesitation.
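The mechanism just described can be illustrated with a small sketch. This is our own simplified code, not the evaluated predictors, and it is restricted to predicting \emph{whether} the phase changes rather than \emph{which} phase follows: a plain burst predictor stores each phase's previous burst length, and a confidence-gated variant only predicts a change once that length has repeated often enough.

```python
class LastValue:
    """Always predicts the current phase continues: one miss per burst end."""
    def predict_change(self, phase, pos):
        return False
    def update(self, phase, length):
        pass

class Burst:
    """Plain 1-level burst predictor: remembers each phase's previous burst
    length and predicts a change exactly when the current burst reaches it.
    With no valid table entry it behaves like the last-value predictor. A
    stored length that is too short costs two misses per burst: one at the
    predicted end and one at the real end."""
    def __init__(self):
        self.table = {}  # phase ID -> previous burst length
    def predict_change(self, phase, pos):
        return pos + 1 == self.table.get(phase, 0)
    def update(self, phase, length):
        self.table[phase] = length

class ConfidentBurst(Burst):
    """Burst predictor gated by a saturating confidence counter: a change is
    only predicted once the stored length has repeated often enough, so on
    irregular burst lengths it degrades to last-value instead of doing worse."""
    def __init__(self, threshold=2, max_conf=3):
        super().__init__()
        self.conf = {}
        self.threshold = threshold
        self.max_conf = max_conf
    def predict_change(self, phase, pos):
        return (self.conf.get(phase, 0) >= self.threshold
                and super().predict_change(phase, pos))
    def update(self, phase, length):
        if self.table.get(phase) == length:
            self.conf[phase] = min(self.conf.get(phase, 0) + 1, self.max_conf)
        else:
            self.conf[phase] = 0  # deviating length: lose all confidence
        super().update(phase, length)

def mispredictions(seq, predictor):
    """Count wrong change/no-change decisions over a sequence of phase IDs."""
    miss, pos = 0, 0  # pos = position inside the current burst
    for i in range(len(seq) - 1):
        changed = seq[i] != seq[i + 1]
        if predictor.predict_change(seq[i], pos) != changed:
            miss += 1
        if changed:
            predictor.update(seq[i], pos + 1)  # a burst of length pos+1 ended
            pos = 0
        else:
            pos += 1
    return miss
```

On a perfectly regular sequence the plain burst predictor wins while confidence adds a few learning misses (the hesitation effect); on a sequence whose burst lengths vary constantly, the plain predictor does worse than last-value, while the confidence-gated variant degrades to last-value behavior.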
A second observation is that conditional update is especially beneficial when BLR\_P\_$\mu$ or PTR\_P\_$\mu$ is low. For values close to 1.0, the performance of both predictor types (i.e.\ with and without conditional update) converges. This is understandable, because conditional update is intended to filter out noise in phase sequences with regular patterns; for irregular patterns, conditional update is less effective.
Fig.~\ref{fig:evalparam9en11} shows the effect of varying both BLR\_P\_$\mu$ and PTR\_P\_$\mu$ simultaneously. For the standard 1-level burst predictor, there is only a small range of phase sequences for which it outperforms the last-value predictor. When confidence (and conditional update) is added, however, the performance is almost never worse than that of the last-value predictor. Notice that for the standard burst predictor, the performance decreases faster with increasing BLR\_P\_$\mu$ than with increasing PTR\_P\_$\mu$. This is because the burst predictor stores predicted burst lengths in a table indexed by the last phase ID. In other words, if the last phase changes frequently, the predictor simply does not find a valid entry corresponding with that behavior and thus does not make an extra incorrect prediction, whereas if the phase transition behavior is stable but the burst length changes frequently, the predictor introduces many extra incorrect predictions.
In this subsection, we focused on the effect of varying two parameters of the model. We chose these parameters because they have the most direct and intuitive effect on the performance of the phase predictors. We also examined the effect of varying the other model parameters; due to lack of space, we only give a quick overview of some other interesting characteristics. First, when all other parameters are kept constant, the performance of the burst and RLE phase predictors relative to the last-value predictor increases with the number of phases. The effectiveness of conditional update also increases with the number of phases. On the other hand, the performance of the burst and RLE predictors decreases drastically when parameters BurstLength\_$\mu$\_P and BurstLength\_$\sigma$\_P, respectively, are decreased towards 0. The reason is that the per-phase values of BurstLength\_$\mu$ and BurstLength\_$\sigma$ are then spread more evenly over all phases.
\begin{figure}[t]
% \hspace{-5mm}
%\begin{center}
\mbox
{
{{ \includegraphics[width=0.45\textwidth]{eval_burst_9_11_1lev_def.eps} }}
% \hspace{-5mm}
{{ \includegraphics[width=0.45\textwidth]{eval_burst_9_11_1lev_satt.eps} }}
% \hspace{-5mm}
% {{ \includegraphics[width=0.25\textwidth]{eval_burst_9_11_2lev_def.eps} }}
% \hspace{-5mm}
% {{ \includegraphics[width=0.25\textwidth]{eval_burst_9_11_2lev_satt.eps} }}
}
\vspace{-5mm}
\caption{The reduction of the phase misprediction rate of the 1-level burst predictor over the last-value predictor when BLR\_P\_$\mu$ and PTR\_P\_$\mu$ are varied between 0.1 and 1.0. The left graph shows the performance of the standard 1-level burst predictor; the right graph shows the performance of the 1-level burst predictor with conditional update and confidence. }
\label{fig:evalparam9en11}
%\end{center}
\vspace{-3mm}
\end{figure}
\section{6. Conclusions}
\label{section:conclusions}
The execution of most computer programs can be divided into a number of phases, each having its own behavior. Recently, this knowledge has been exploited for several purposes, such as hardware adaptation for energy efficiency and performance modeling. When this phase behavior is used in a dynamic system, phase predictors may be needed to predict upcoming phase transitions and anticipate them.
In this paper, we analyzed the program phase behavior of the SPEC CPU2000 benchmarks by constructing a model for representing program phase sequences. We first validated the model by generating probabilistic phase sequences based on the model and by comparing the behavior of several phase predictors on the original phase sequences and on their probabilistic equivalents. We then derived a more compact model by fitting the per-phase modeling parameters to distributions. The resulting compact model was used to analyze and gain insight into the performance of several phase predictor types.
\begin{theacknowledgments}
This research was funded by Ghent University and by the Fund for Scientific Research-Flanders (FWO-Flanders).
\end{theacknowledgments}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% The bibliography can be prepared using the BibTeX program or
%% manually.
%%
%% The code below assumes that BibTeX is used. If the bibliography is
%% produced without BibTeX comment out the following lines and see the
%% aipguide.pdf for further information.
%%
%% For your convenience a manually coded example is appended
%% after the \end{document}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% You may have to change the BibTeX style below, depending on your
%% setup or preferences.
%%
%%
%% For The AIP proceedings layouts use either
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\bibliographystyle{aipproc} % if natbib is available
%\bibliographystyle{aipprocl} % if natbib is missing
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% You probably want to use your own bibtex database here
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\bibliography{casys05}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Just a reminder that you may have to run bibtex
%% All of it up to \end{document} can be removed
%% if you don't like the warning.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% \IfFileExists{\jobname.bbl}{}
% {\typeout{}
% \typeout{******************************************}
% \typeout{** Please run "bibtex \jobname" to optain}
% \typeout{** the bibliography and then re-run LaTeX}
% \typeout{** twice to fix the references!}
% \typeout{******************************************}
% \typeout{}
% }
\end{document}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% The following lines show an example how to produce a bibliography
%% without the help of the BibTeX program. This could be used instead
%% of the above.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\endinput
%%
%% End of file `template-8s.tex'.