diff --git a/autobva/autobva.tex b/autobva/autobva.tex new file mode 100644 index 0000000000000000000000000000000000000000..721927fe8c7d051cf42063cba4859108ebedebff --- /dev/null +++ b/autobva/autobva.tex @@ -0,0 +1,61 @@ +% arara: lualatex: {shell: true} + +\documentclass[a4paper, oneside, article]{memoir} + +% \input{../preamble.tex} + +\usepackage{inputenc} +\usepackage{amsmath} +\usepackage{mathtools} +\usepackage{unicode-math} +\usepackage{todonotes} + +\author{Jasper Gräflich\\Potsdam University\\Agentur für Innovationen in der Cybersicherheit} +\title{General Global Search for Automated Boundary Value Analysis} + +\begin{document} + +\maketitle + +\section{Introduction} + +\cite{AutoBVA} introduces a \emph{AutoBVA}, an algorithm for automated boundary value analysis, building on the \emph{program derivative} defined in previous work \cite{TowardsAutoBVT}, \cite{BVE-SW-Analysis}. The authors mention several limitations of their implementation: + +\begin{enumerate} + \item The systems under test (SUTs) considered are simple, and more work on how AutoBVA works on complex SUTs is needed. + \item All SUTs only take integers as inputs. General input types are not possible. + \item The search algorithm is separated into global sampling and local search. This necessitates clustering/filtering of results and could result in duplicate work. +\end{enumerate} + +We address the second and third limitation by presenting a global search algorithm that can work on more types. In particular: + +\begin{itemize} + \item A type $T$ is \emph{searchable}, if there is a suitable metric $d \colon T \times T \to \mathbb{R}_{\geq 0} $ defined on the type. + \item For algebraic data types, that is product and sum types, such a metric can be derived automatically, although a specialized implementation may prove more useful in many cases. + \item Subtypes can use the metric of their supertype, as long as the Liskov substitution principle holds. + \item \cite{AutoBVA} used random uniform sampling. The algorithm was relatively stable, but we present an algorithm that does not depend on randomness. Our search algorithm makes use of \emph{low discrepancy series} instead, where possible.\todo{lean on bituniform/byteuniform sampling} This ensures a uniform sampling of the whole search space in a deterministic manner. It could also easily be used as sampling algorithm for AutoBVA.\todo{Is it really better? Performance and results.} + \item \emph{Stratified Sampling} can be used to manipulate the distribution of sampled points. In this way, a tester could manually mark subdomains that are likely to contain boundary values to be searched more intensively, or reduce the samples in subdomains that are probably less interesting. Using e. g. \emph{Smirnov Transformations}, certain regions can be sampled more densely \todo{How big is the influence of that?} + \item Samples are shrunk to hone in on a boundary (similarly to the BCS method explored in \cite{AutoBVA}) using a \emph{midpoint generator}. This requires the input type to be a so-called \emph{CAT$(0)$-space}, that is a geodesic metric space in which the CN inequality holds. +\end{itemize} + +\section{The Algorithm} + +Input: +\begin{itemize} + \item $f \colon A \to B$, the SUT. $A$ and $B$ are metric spaces. $A$ has unique midpoints, via a midpoint generator $m \colon A² \to A$. + \item $g \colon \mathbb{N}_0 \to A$, a sample generator. + \item $r$, relevance minimum for input pairs\todo{Can also use this to check if, after halving the interval, both halves are relevant} + \item $\delta$, closeness minimum for midpoint generation +\end{itemize} + +\section{Evaluation} + +\todo[inline]{Comparison with the algorithm in \cite{AutoBVA}} + +\section{Limitations} + +\begin{itemize} + \item If there are very small subdomains (for example, a reciprocal function would have a single point at which it reaches an error state), the sampling may not reach them. If all points near this domain are relatively similar, shrinking towards this small subdomain would not happen. +\end{itemize} + +\end{document} \ No newline at end of file