
%TODO: some package abuses \emph for \underline (bibliography!!!)

\section*{---------------}

This is another text (in line with Florian Haftmann's ``statement
why Isabelle/HOL is the system of choice'' and ``Theorem Proving
underway to Engineering Practice'') preparing for cooperation between
the Theorem Proving Group Munich and RISC Linz.

\section{Computer Algebra's Steps towards Isabelle}

Computer Algebra (CA) is traditionally based on computational
principles, whereas Isabelle, as a theorem prover (TP), is based on
principles of deduction. Both kinds of foundations have their
strengths, which have led to widespread acceptance of both CA tools
and TP tools in science and industry.

Widening application scenarios for both have promoted mutual
approaches of TP and CA over the years\footnote{CA approaches TP by
implementing ``assume facilities'' \cite{Beaumont02towardsbetter}
etc., and TP approaches CA by implementing solvers for differential
equations \cite{immler2012ode} etc.}. The steps below describe a
possible process by which CA approaches TP, and account for issues
arising alongside these steps.

\medskip

\S\ref{step-1} discusses embedding CA into Isabelle using only the
available infrastructure; \S\ref{step-2} and \S\ref{step-3} present CA
features which seem to require specific enhancements of Isabelle for
conciseness of CA and elegance of the CA implementation.
\S\ref{step-4} revises the enhancements required for improvement of
CA, and \S\ref{step-5} concludes by discussing the relevance of
possible enhancements beyond their application in CA.

\subsection{Step 1: Exploit Isabelle's Computational Features}\label{step-1}

Some CA features are algorithms which are total over a wide range (and
do not directly involve predicates and propositions constraining that
range), and thus are mere computations independent of deductive
infrastructure. Examples of such algorithms are the Gr\"obner Bases
method, cylindrical algebraic decomposition
\cite{Collins:1976:QER:1093390.1093393} and others. For implementing
these algorithms Isabelle already provides appropriate infrastructure,
as shown below.

\paragraph{Example:} Cancellation of fractions, an indispensable part

of symbolic simplification:

$$\frac{x^2-y^2}{x^2+x\cdot y}=\frac{\cancel{(x+y)}\cdot(x-y)}{\cancel{(x+y)}\cdot x}=\frac{x-y}{x}\;\land\;x+y\not=0\;\land\;x\not=0$$

Cancellation of multivariate polynomials requires advanced methods of
symbolic computation (Hensel lifting or residues); it cannot be
accomplished by rewriting. Rather, a respective algorithm implemented
according to \cite{winkler:poly} comprises about 77\fixme[exact number
of funs in gcd\_poly] functions with about 600 lines of
code\fixme[exact LoCs in gcd\_poly].

This algorithm can be implemented\footnote{At the time of writing this
paper, a translation from SML to Isabelle is under construction. First
experiences do not suggest fundamental problems with the
implementation.} by

use of Isabelle's function package

\cite{krauss:partial06,krauss-fun-def-10} according to the

\textit{implicit} specification\footnote{We use the notion
`\textit{implicit} specification' in order to distinguish it from
other notions of `specification'; for instance, in the literature
`specification' is also used for code like the one describing
\textit{how} to compute the \texttt{gcd} for natural numbers; here
this is called `\textit{explicit} specification'.}:

\label{gcd_poly-def}

{\small

\begin{verbatim}
locale gcd_poly =
  fixes gcd_poly :: "int poly => int poly => int poly"
  assumes "gcd_poly a b = c
             and not (a = [:0:]) and not (b = [:0:])
           ==> c dvd a and c dvd b
             and (forall c'. (c' dvd a and c' dvd b) ==> c' <= c)"
\end{verbatim}

}%\small

So this algorithm not only generates assumptions, as shown in the
example above; it is also partial, and for most usages the arguments
must not be zero polynomials \texttt{[:0:]}. However, in HOL there is
the acknowledged method of ``totalization'', which would assign the
value \texttt{[:0:]} to such arguments.

\medskip The function package translates algorithmic code, here called
`\textit{explicit} specification', into equations. For instance, the
code for calculating the \texttt{gcd} for natural numbers is as
follows:

{\small

\begin{verbatim}

fun gcd :: "nat => nat => nat"

where "gcd m n = (if n = 0 then m else gcd n (m mod n))"

\end{verbatim}

}%\small

\noindent From this explicit specification the function package
generates the following equations, one theorem of each kind in this
simple example:

{\small

\begin{verbatim}

thm gcd.simps

-- "gcd ?m ?n = (if ?n = 0 then ?m else gcd ?n (?m mod ?n))"

thm gcd.induct

-- "(forall m n. (n ~= 0 => ?P n (m mod n)) => ?P m n) => ?P ?a0.0 ?a1.0"

thm gcd.cases

-- "(forall m n. ?x = (m, n) => ?P) => ?P"

thm gcd.psimps

-- "gcd_dom (?m, ?n) => gcd ?m ?n = (if ?n = 0 then ?m else gcd ?n (?m mod ?n))"

thm gcd.pinduct

-- "gcd_dom (?a0.0, ?a1.0) => (forall m n. gcd_dom (m, n)

=> (n ~= 0 => ?P n (m mod n)) => ?P m n) => ?P ?a0.0 ?a1.0"

\end{verbatim}

}%\small

The theorem \texttt{gcd.simps} allows evaluating, for instance,
\texttt{gcd 18 12} to \texttt{6}. The other theorems serve different
purposes, for instance, proving termination automatically or, if too
complicated for automation, proving it interactively.
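
For illustration, the recursion of \texttt{gcd.simps} transcribed to Python (a sketch, not generated code) shows how rewriting with that single equation evaluates concrete instances:

```python
# The equation gcd.simps, transcribed to Python (illustrative): rewriting
# with this single equation evaluates concrete instances.
def gcd(m, n):
    return m if n == 0 else gcd(n, m % n)

print(gcd(18, 12))  # prints 6
```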

\medskip While the calculation of \texttt{gcd 18 12} is determined by
the two lines presented above, the calculation of \texttt{gcd\_poly}
according to the implicit specification above requires 77 functions
comprising 600 LoCs. \fixme[exact LoCs and no. of funs in gcd\_poly]
So the function package has to operate on a large number of equations,
but it does so with response times (including for evaluation)
sufficient for interactive work.

Another essential purpose of the generated theorems is to prove the
postcondition $c \;{\it dvd}\; a \land c \;{\it dvd}\; b \land \forall
c^\prime. (c^\prime \;{\it dvd}\; a \land c^\prime \;{\it dvd}\; b)
\Rightarrow c^\prime \leq c$. That way implementations can be proved
correct; for \texttt{gcd\_poly} this is already a substantial task.

\bigskip

Once the algorithm is implemented by the 77 (TODO) functions, these

functions can be fed into the code generator

\cite{nipkow-bergh-code-gen-02,haftm-nipkow-code-gen-HRS-10}; the

proof of the implementation's correctness transfers to the generated

code. There are several techniques to embed generated code into

Isabelle's runtime system, as well as a mechanism for separate
compilation \cite{codegen-man-12}. Given such an
embedding,

%WN: all \fixme answered in mail 01/28/2013 10:47 PM

%

% \fixme[?Florian: which kind(s) of embedding would you

% recommend in this case?],

% In this case there already is ML code. It already contains a
% symbolic description of the problem, and exactly this is the
% embedding. To prove something about this symbolic description, one
% must give it a semantics in the logic. Just one partial example:
% the ML code contains (to my knowledge) lists ('a list) which stand
% for polynomials. I.e. in the logic one would need the embedding
% 'a list => 'a poly (which, I suspect, already exists).

proving in Isabelle/Isar would allow omitting
the assumptions, as shown below by \sout{strike-through}.

{\small

\begin{tabbing}

12\=12\=12\=12\=12\kill

\> lemma cancel:\\
\>\> assumes \sout{asm1: "(x$\hat{\;\;}$2 - y$\hat{\;\;}$2) = (x + y) * (x - y)"} \\
\>\>\> \sout{and asm2: "x$\hat{\;\;}$2 + x * y = x * (x + y)"}\\
\>\>\> \sout{and asm3: "x $\not=$ 0" and} asm4: "x$\hat{\;\;}$2 + x * y $\not=$ 0" \\
\>\> shows "(x$\hat{\;\;}$2 - y$\hat{\;\;}$2) / (x$\hat{\;\;}$2 + x * y) = (x - y) / (x::real)"\\
\>\> apply (insert \sout{asm1 asm2 asm3} asm4)\\
\>\> apply simp\\
\> done

\end{tabbing}

}%\small

The algorithm, generated to SML code, is expected to constitute an
important part of Isabelle's simplifier, such that simplifications
like the following are automated:

% \fixme[Florian: how would that code be integrated into Isabelle's

% simplifier?]

% A priori it is hard to say exactly how. One can take the bare ML
% code and write a simproc. Or think more in the direction of
% reflection and write a binding like the one the typical examples
% in HOL/Decision_Procs have. Or...

%

% \fixme[Florian: are there already mechanisms in the simplifier to push

% assumptions like $x+y\not=0\;\land\;x\not=0$ into the context?]

% Excellent ones. By the way, an overview of the simplifier is given
% in §3.1 of the HOL tutorial.

$$\frac{y}{x}+\frac{x^2-y^2}{x^2+x\cdot y}=1\;\;\;\land\;x+y\not=0\;\land\;x\not=0$$

\paragraph{Planned achievements} from implementing this kind of CA
algorithm using Isabelle's function package and from generating SML
code using the code generator could be as follows:

\begin{itemize}

\item CA functions existing in Isabelle made efficient via code
generation: ordinary differential equations%
\footnote{\url{http://afp.sourceforge.net/entries/Ordinary_Differential_Equations.shtml}},
fast Fourier transform%
\footnote{\url{http://afp.sourceforge.net/entries/FFT.shtml}},
Gauss--Jordan elimination%
\footnote{\url{http://afp.sourceforge.net/entries/Gauss_Jordan_Elim_Fun.shtml}}.

% \fixme[?Florian: Do these cases make sense?],

% From my understanding of the algorithms, yes. At least a grep for
% »code« found nothing suspicious so far. The example for ordinary
% differential equations looks a bit like magic...

%

% \fixme[?Florian: other relevant candidates of existing Isabelle

% developments for code generation?]

% Answering this is an analysis of its own, and the answer becomes
% outdated quickly as the AFP grows (three new entries in Jan. 2013
% alone!). The index at http://afp.sourceforge.net/topics.shtml
% gives a fairly good overview, though.

\item Enhance existing calculational mechanisms, for instance linear
arithmetic, Presburger arithmetic, floating-point arithmetic,
normalising terms in semi-rings, etc.

% \fixme[?Florian: other relevant candidates of existing Isabelle

% developments for actually calculating results?]

% Linear arithmetic. Presburger arithmetic. Normalisation of
% expressions in semirings. All in HOL/Decision_Procs. Partly
% rough-hewn stones.

\item Existing Isabelle developments extended such that they
actually compute results (instead of merely checking a given
result):

% \fixme[?Florian: is there a way for the existing Gr\"obner Bases to

% automatically compute a result already?]):

% Good question. Amine (the author) could probably answer
% it. Otherwise only a non-trivial code analysis remains.
%
% WN: did not ask isabelle-users@, because Isar has no syntax for
% such calculations.

Gr\"obner

Bases\footnote{\url{http://isabelle.in.tum.de/dist/library/HOL/Groebner_Basis.html}},

\dots

\item New CA functionality implemented, the implementation proved

correct and/or made efficient by code generation. Cancellation for

multivariate polynomials has begun, other functions shall be agreed

on.

\item Experience of how to exploit Isabelle's features to their

limits: function package, code generation, syntax of Isabelle's

executable fragment, exception handling.

\end{itemize}

\paragraph{Open issues,} which already seem clear, are the following:

\begin{itemize}

\item How to handle partiality? ``Totalization'' ($\frac{a}{0}\leadsto
0$, \dots) --- non-evaluation as in CAS (${\it hd}\;[\,]\leadsto{\it
hd}\;[\,]$, \dots) --- guarding evaluation by preconditions?\\ Note:
in our implementation of \textit{gcd\_poly} an argument $[:0:]$ would
cause non-termination in deeply nested functions.

\item Nested exception handling in composed functions?

\item Missing language features?

\end{itemize}
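
The three options for partiality can be contrasted on the ${\it hd}\;[\,]$ example; the following Python sketch (ours, with illustrative names) mimics each strategy:

```python
# Illustrative sketch (ours) of the three options, on the example hd []:
def hd_totalized(xs):
    return xs[0] if xs else 0            # "totalization": a default value

def hd_unevaluated(xs):
    return xs[0] if xs else ("hd", xs)   # CAS-style: term stays unevaluated

def hd_guarded(xs):
    assert xs != [], "precondition xs ~= [] violated"
    return xs[0]                         # guarded evaluation
```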

\medskip

\paragraph{Summary:} $\;$\\ \textit{

Isabelle's computational features (function package $+$ code

generator) are {\rm powerful enough to tackle implementation of

CA}, which

\begin{compactenum}

\item includes the most advanced functionality implemented in
contemporary CAS,

\item is more trustable than specific CA features due to Isabelle's
logical foundations and deductive infrastructure,

\item is efficient enough for engineering practice due to code generation.

\end{compactenum}

}

\bigskip

\subsection{Step 2: Novel Approaches to ``multivalued functions''}\label{step-2}

CA systems at the state of the art are indispensable tools in
engineering and science. Practice in both benefits from efficient
computation by highly general algorithms --- while users are more or
less aware that interpreting the systems' results is their own
responsibility. Academic education makes significant efforts to remind
students of CA systems' limited trustability.

\paragraph{\large State-of-the-art in CA} is characterized by fervent
discussions about ``multivalued functions'', a major source of CA
systems' limited trustability. For pure mathematics ``multivalued
functions'' are a ``contradictio per se'', whereas practitioners need
to operate on ``functions'' like $\sqrt{\;}$, $\arcsin$, etc., more or
less aware that these might result in multiple values. Below, CA
systems' limitations are identified as (1)
\textit{under-specification} and (2) lack of \textit{deductive
mechanisms}.

\subparagraph{Multiple views of functions} comprise the ``branch
view'', ``table-maker's view'', ``differential algebra view'', ``view
of power-series analysis'' --- a variety of views which has been made
explicit over the years, and which exhibits crucial incompatibilities
\cite{davenport-10:multivalued}. CA originated from early AI's ideas
of ``general problem solvers'' and still adheres to the idea of
serving as many of the above views in parallel as possible and of
providing tools covering several views at once. So CA traditionally
prefers techniques which are under-specified, but which {\it ``can
verify that simplifications are valid {\rm `almost}
everywhere'~''} \cite{Beaumont02towardsbetter} --- while {\it `almost}
is a point of view unacceptable for TP.

\subparagraph{Lack of deductive mechanisms} is a consequence of
original design decisions and of the inclination towards
``totalization'' of functions, which, however, cannot hold for
handling expressions containing ``multivalued functions'' like this
one:

\begin{align}\label{arctan}

\arctan x + \arctan y = \arctan \left( \frac{x+y}{1-xy} \right) +

\begin{cases}

\pi & xy > 1 \land x > 0 \\

0 & xy < 1 \\

-\pi & xy > 1 \land x < 0

\end{cases}

\end{align}

A symbolic computation on such expressions can be seen as a tree where
nodes are branching points and branches denote assumptions. Each
occurrence opens new branches (three in the case of (\ref{arctan})),
each of which is related to assumptions --- inequalities in the case
of (\ref{arctan}). Collections of such inequalities can be simplified
``to a minimum'' by cylindrical algebraic decomposition
\cite{Collins:75} and other powerful methods, which allow preserving
``totality''.
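
A numeric sketch of ours (in Python, illustrative only) of the branch selection in (\ref{arctan}) shows how each rewrite step picks one branch and implicitly collects the corresponding inequality as an assumption:

```python
import math

# Illustrative numeric sketch (ours): the branch cases of the arctan
# identity, with the inequality of the chosen branch as implicit assumption.
def arctan_sum(x, y):
    if x * y == 1:
        raise ValueError("x*y = 1: branch point, no case applies")
    base = math.atan((x + y) / (1 - x * y))
    if x * y > 1 and x > 0:
        return base + math.pi     # assumption: xy > 1 and x > 0
    if x * y < 1:
        return base               # assumption: xy < 1
    return base - math.pi         # assumption: xy > 1 and x < 0

# arctan 2 + arctan 3 lies in the first branch:
assert abs(arctan_sum(2, 3) - (math.atan(2) + math.atan(3))) < 1e-12
```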

On the other hand, the issue of branching problems has been recognised
for some time. Thus automated solvers have been incorporated into CA
systems \cite{assume-maple-92}. Also, techniques handling ``guarded
expressions'' \cite{guarded-expr-97} have been introduced. However,
such ``assume facilities'' still remain isolated within specific tools
in the predominant CA systems.

\medskip

Present Maple, for instance, employs a tight integration of an
``evaluator'', a ``property reasoner'' and a ``solver''
\cite{Armando:2005:REM:1740741.1740840} \fixme[add a figure depicting
Maple's ``evaluator'', ``property reasoner'' and
``solver''.]

% \fixme[?Florian: Can Isabelle already compute (=``evaluate'')

% floating-point numbers? How do I invoke that feature interactively?].

%

% There is an approximation theory by Johannes
% (Decision_Procs/Approximation). For exact symbolic computation I
% once posted a question on Isabelle users, where quite a few
% references to publications on how one could implement such a thing
% came together.
%
%WN very outdated:

%https://lists.cam.ac.uk/mailman/htdig/cl-isabelle-users/2005-August/msg00013.html

However, ``property functions'' allow one to
express only a restricted class of properties, and this limits the
potential of the technique.

Since CA strives for ``totality'', for having total functions, such
solvers need to be restricted to efficient algorithms, for instance to
\textit{linear programming}. Still, there are suggestions for removing
errors and improving automated solvers, for instance
\cite{Armando:2005:REM:1740741.1740840}. The latter work also
identifies open issues; a simple example is the following:

\begin{align}

\sin{x} &= y \label{sin}\\

x &= \arcsin{y} \label{arcsin}\\

x &=

\begin{cases}

\{\} & y<-1 \;\lor\; 1< y \label{arcsin-cases}\\

\!\begin{aligned}[b]%http://tex.stackexchange.com/questions/5183/split-like-environment-inside-cases-environment

& \{x. \;\sin{x}=y\land

\forall x^\prime. \;\sin{x^\prime}=y \Rightarrow \\

& \quad \exists k\in{\cal Z}.\;x^\prime = 2k \pi-x

\lor x^\prime = (2k+1)\pi+x\}

\end{aligned} & -1 \leq y \;\land\; y < 0\\[1ex]%#

\!\begin{aligned}[b]%http://tex.stackexchange.com/questions/5183/split-like-environment-inside-cases-environment

& \{x. \;\sin{x}=y\land

\forall x^\prime. \;\sin{x^\prime}=y \Rightarrow \\

& \quad \exists k\in{\cal Z}.\;x^\prime = 2k \pi+x

\lor x^\prime = (2k+1)\pi-x\}

\end{aligned} & 0 \leq y \;\land\; y \leq 1

\end{cases}

\end{align}

Expressions like (\ref{sin}) and (\ref{arcsin}) are customary in CA,

but obviously under-specified. The branching problem stated in

expression (\ref{arcsin-cases}) cannot be handled by any known CA

algorithm: the $k$ introduces an infinite number of equalities, which

is beyond the scope of cylindrical algebraic decomposition and other

methods.
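
Although no CA algorithm can discharge the quantified branch conditions of (\ref{arcsin-cases}), the solution structure itself is easy to check numerically; a small Python sketch of ours:

```python
import math

# Illustrative spot-check (ours) of the solution structure in the branch
# 0 <= y <= 1: every solution of sin x' = y has the form
# 2*k*pi + x  or  (2*k+1)*pi - x,  where x = arcsin y.
y = 0.5
x = math.asin(y)
for k in range(-3, 4):
    assert abs(math.sin(2 * k * math.pi + x) - y) < 1e-9
    assert abs(math.sin((2 * k + 1) * math.pi - x) - y) < 1e-9
```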

%Davenport 2010, challenges of multival:

% $$\sqrt{z-1}\sqrt{z+1}=\sqrt{z^2 -1} \;\;\;

% \text{for}\; \Re(z) \geq 0\lor (\Re(z)=0 \land \Im(z)>0) $$

%

% \[

% \log z_1 + \log z+2 = \log z_1z_2 +

% \begin{cases}

% 2\pi i & \arg z_1 + \arg z_2 > \pi \\

% 0 & -\pi < \arg z_1 + \arg z_2 < \pi \\

% -2\pi i & \arg z_1 + \arg z_2 < -\pi

% \end{cases}

% \]

\medskip So, a recent conclusion on the state-of-the-art in CA is
\textit{``Producing (formal) proofs of such statements [on
`multivalued functions'] is a developing subject, even in the context
of a computer algebra system. Converting them into an actual theorem
prover is a major challenge. Unfortunately, cylindrical algebraic
decomposition, as used here [on `multivalued functions'], does not
seem to lend itself to being used as an `oracle' by theorem provers''}
\cite{davenport-10:multivalued}.

\paragraph{\large Novel approaches to ``multivalued functions''} which
employ Isabelle's logical foundations and powerful deductive
mechanisms appear in a straightforward manner, as shown subsequently.

\subparagraph{In Isabelle, ``univalued'' definitions} for
transcendental functions are already in place. For instance, $\sin$ is
defined as a power series, with the coefficients {\small\tt
sin\_coeff} defined separately:

{\small\tt

\footnote{\url{http://isabelle.in.tum.de/dist/library/HOL/Transcendental.html}

for Isabelle2012.}

\begin{tabbing}

121212\=12\="sin\_coeff = ($\lambda$n. \=12\=12\=12\=12\=12\=12\=12\=12\=12\kill

\> definition sin\_coeff :: "nat $\Rightarrow$ real" where\\

\>\> "sin\_coeff = ($\lambda$n. if even n then 0 \\

\>\>\> else -1 $\widehat{\;}$ ((n - Suc 0) div 2) / real (fact n))"\\

%\\

\> definition sin :: "real $\Rightarrow$ real" where\\

\>\> "sin = ($\lambda$x. $\Sigma$n. sin\_coeff n * x $\widehat{\;}$ n)"

\end{tabbing}

}%small
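
Transcribed to Python (as an illustrative sketch of ours; Isabelle's definition is a formal infinite sum over the reals), the same definitions read:

```python
import math

# Transcription (ours, illustrative) of sin_coeff and the power series;
# the infinite sum is truncated, unlike Isabelle's formal definition.
def sin_coeff(n):
    if n % 2 == 0:
        return 0.0
    return (-1.0) ** ((n - 1) // 2) / math.factorial(n)

def sin_series(x, terms=40):
    return sum(sin_coeff(n) * x ** n for n in range(terms))

print(sin_series(1.0))  # approximately 0.8414709848..., i.e. sin 1
```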

These definitions, of course, come along with proofs of the respective

characteristic properties:

{\small\tt

\begin{tabbing}

121212\=12\=12\=12\kill

\> lemma sin\_periodic [simp]: "sin (x + 2*pi) = sin x"\\

\> \dots\\

\> lemma sin\_gt\_zero2: "[| 0 < x; x < pi/2 |] ==$>$ 0 < sin x"\\

\> \dots

\end{tabbing}

}%small

When $\sin$ and other functions are extended to complex numbers, the
real and complex versions will be rigorously distinguished (by
Isabelle's axiomatic type classes). $\arcsin$ is defined as an inverse
function on an interval, such that it actually is a function (and not
a relation):

{\small\tt

\begin{tabbing}

121212\=12\=12\=12\kill

\> "arcsin y = (THE x. -(pi/2) $\leq$ x \& x $\leq$ pi/2 \& sin x = y)"

\end{tabbing}

}%small

With definitions like the above ones, expressions (\ref{sin}) and
(\ref{arcsin}) on p.\pageref{sin} are well defined --- however,
solving (\ref{sin}) as an equation in $x$ cannot be done by expression
(\ref{arcsin}) if $x\not\in[-\frac{\pi}{2},\frac{\pi}{2}]$; this point
will be discussed in the next paragraph.

\subparagraph{In Isabelle, computational mechanisms} have been
introduced recently (where ``computation'' means rewriting with
equations generated by the function package as well as execution of
code generated by the code generator). Using these means to push the
implementation of CA to the cutting edge and beyond indicates a need
for further integration of computational mechanisms into Isabelle's
elaborate deductive mechanisms. Such a need particularly arises from
tackling novel approaches towards ``multivalued functions'', as shown
in the following.

\medskip %some algorithms according to step 1

In order to reach the state-of-the-art, there are some ``total'' CA
algorithms which can be implemented in Isabelle according to
\S\ref{step-1}, for instance cylindrical algebraic decomposition (CAD)
\cite{Collins:1976:QER:1093390.1093393} or probably linear
programming. Such an implementation would recast CA algorithms in
terms of TP concepts. For instance, Maple's evaluation process can be
recast as Constraint Contextual Rewriting (CCR)
\cite{Armando:2005:REM:1740741.1740840}. Thus reconstruction of CA
algorithms within a logical framework immediately improves
trustability.

\medskip %more expressiveness in TP than in CA

When heading beyond the state-of-the-art, contextual reasoning as
required by ``multivalued functions'' is well addressed in TP: TP is
specialised to handle assumptions like $\exists k\in{\cal
Z}. \;x^\prime = 2k \pi-x \lor x^\prime = (2k+1)\pi+x$ in
(\ref{arcsin-cases}) on p.\pageref{arcsin-cases} (assumptions on
branches, which cannot be handled by traditional CA, i.e. by
cylindrical algebraic decomposition). Thus a considerably wider class
of properties can be handled than those presently expressible by means
of property functions in Maple, see for instance
\cite{Armando:2005:REM:1740741.1740840}.

%special methods for special cases -- interactivity

Handling special expressions by special methods agrees well with the
concepts of interactive proving. Isabelle's \textit{contexts} are
designed to handle arbitrary logical data locally and consistently
with scopes --- so \textit{contexts} are the perfect mechanism for
tackling cascades of branches in expressions with ``multivalued
functions''.

%interactively follow into branches during computation

Isabelle's interactivity also supports following specific branches,
while the \textit{context} handles the respective assumptions during
computation --- that means the respective \textit{context} is able to
provide the simplifier (evaluating equations generated by the function
package) with the local assumptions, and also to provide interactive
proof tools with appropriate logical data. This way even special cases
might be handled which have been excluded so far
\cite{Beaumont02towardsbetter}, for instance $\log{\sin(x)},
\arctan(x\arctan(\frac{1}{x}))$ on all arguments, or
$\sin((x\arctan(\frac{1}{x}))^2), \frac{\arctan(x)}{\arctan(2x)}$ for
$x=1$.

\medskip As already mentioned, efficiency reasons suggest transferring
CA algorithms for ``multivalued functions'' to SML by Isabelle's code
generator. Such a transfer from interactive Isabelle/Isar to generated
code involves deductive mechanisms as well. This opens novel issues
for R\&D, addressed below among others.

% \cite{davenport-10:multivalued} NOTATION:

% $\mathbf{P}(A)$ denotes the power set of $A$. For a function $f$, we

% write ${\it graph}(f)$ for $\{(x,f(x)).\; x\in{\it dom}(f)\}$ and

% ${\it graph}(f)^T$ for $\{(f(x),x).\; x\in{\it dom}(f)\}$. Polar

% coordinates in the complex plane $\mathbf{C}\equiv X\times_{\it

% polar}Y$ are written as $z=r e^{i\phi}.\;r\in X\land\phi\in

% Y$. According to Bourbaki, a function is {\em total} and {\em

% single-valued}, and written as $(F,A,B)$ making explicit the domain

% $A$ and the codomain $B$.

\paragraph{Planned achievements} in handling ``multivalued

functions'':

\begin{itemize}

\item Clarify principles for overcoming long-standing deficiencies in

CA by implementation in a logical framework:

\begin{itemize}

\item assign specific provers appropriate for specific kinds of

assumptions alongside branches

\item keep management of such assumptions open for interactivity

\item localize assumptions using Isabelle's contexts.

\end{itemize}

\item Design specific improvements for particular deficiencies

selected in mutual agreement from topics like simplification and

equation solving. Select one or more specific topics and investigate

possible improvements in detail.

\item Implement improvements for agreed topics in Isabelle.

\end{itemize}

\paragraph{Open issues} from novel approaches to ``multivalued

functions'' in logical frameworks:

\begin{itemize}

\item Selection and combination of specific interactive and automated
provers and decision procedures for specific classes of assumptions,
ranging from the present state of CA towards the full expressiveness
of the predicate calculus.

\item Stopping rewriting at branching points during evaluation
(i.e.\ while the simplifier rewrites with the equations generated by
the function package), for two purposes:

\begin{itemize}

\item for entering interactive proof on simplification of

assumptions and respective branching

\item for debugging and inspecting contexts within deeply nested

functions.

\end{itemize}

\item Transfer of integrated mechanisms for proving and computing

(i.e. rewriting in the simplifier) to generated code: identify

subclasses of problems, which allow {\it automated} proofs on {\it

all} branches.

\item Calls of deductive mechanisms from generated code; call of

\textit{interactive} proof methods?

\end{itemize}

\medskip

\paragraph{Summary:} $\;$\\ \textit{
``Multivalued functions'' are an important topic, which suffers from
long-standing deficiencies unresolved in CA. At the state of the art,
CA cannot appropriately handle the branching problems involved.

\medskip\\

Handling assumptions alongside cascades of branch cuts by deductive
mechanisms appears most promising. Such handling has not yet been
tackled, because

\begin{compactitem}

\item CA systems are not interested (no logics!)

\item TP systems are not ready (no complex functions yet).\medskip

\end{compactitem}

In Isabelle, calling deductive mechanisms during computation would
require

\begin{compactitem}

\item adapting the simplifier for calls during ``execution'' of

functions

\item adapting the code generator for inserting calls into generated

code.

\end{compactitem}

}

\bigskip

\subsection{Step 3: Modularisation for Large CA Features}\label{step-3}

While the previous ``Step 2'' addressed relations between Isabelle's
computational and deductive mechanisms \textit{within} algorithms,
this step addresses these relations \textit{between} algorithms. The
motivation for this is architectural: an architecture of CA should
achieve congruence between computational structure and deductive
structure.

\paragraph{Example `Symbolic Integration':} The
Risch algorithm \cite{risch-1st-69} was an early breakthrough of CA,
outperforming the human experts who had previously constituted an
important professional field concerned with finding integrals. The
algorithm is also theoretically important as a decision procedure,
which assures the non-existence of a solution (in the form of an
elementary function) if the algorithm doesn't find one.

The Risch-algorithm decomposes an integral into polynomial parts

(essentially relying on partial fraction decomposition) like in

{\footnotesize

$$\int{\frac{x^5-2x^4+x^3+x+5}{x^3-2x^2+x-2}}dx=\int{x^2}dx+\int{\frac{3}{x-2}}dx+\int{\frac{-x-1}{x^2+1}}dx$$

}

\noindent and transcendental parts like in

{\footnotesize

\[

\begin{aligned}

\int{\frac{-e^x-x+ \ln(x)x+\ln(x)xe^x}{x(e^x+x)^2}}dx &= \\

\int{\frac{1+e^x}{(e^x+x)^2}}dx=-\frac{1}{e^x+x} \;\dots\; \\

\dots &= -\frac{\ln(x)}{e^x+x}

\end{aligned}

\]

}

\noindent and creates a cascade of sub-problems while applying
Liouville's theorem and computing the Risch differential equation. The
Risch algorithm has been considerably improved in the meantime, see
for instance \cite{lazard-rioboo-trager-90}. Errors in CA systems'
integration have been reported over the years, for instance by
\cite{hoelzl09realineequalities}\footnote{The reported error
$\int_{-1}^1{\frac{1}{\sqrt{x^2}}dx=0}$ is found corrected to
``integral does not converge'' in Wolfram Alpha on Jan.\ 20, 2013.};
errors found in Maple's implementation have been thoroughly
investigated and published, for instance \cite{Mulders97anote}.
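
Both decompositions above can be spot-checked numerically; the following Python sketch of ours verifies the partial-fraction split at a sample point and checks that the derivative of the transcendental antiderivative matches the integrand:

```python
import math

# Illustrative numeric spot-checks (ours) of the two decompositions.
# Polynomial part, at the sample point x = 3:
x = 3.0
lhs = (x**5 - 2*x**4 + x**3 + x + 5) / (x**3 - 2*x**2 + x - 2)
rhs = x**2 + 3 / (x - 2) + (-x - 1) / (x**2 + 1)
assert abs(lhs - rhs) < 1e-9

# Transcendental part: the derivative of F(x) = -ln(x)/(e^x + x)
# should match the integrand (central difference at x = 2):
def F(x):
    return -math.log(x) / (math.exp(x) + x)

def integrand(x):
    ex = math.exp(x)
    return (-ex - x + math.log(x)*x + math.log(x)*x*ex) / (x * (ex + x)**2)

h = 1e-6
assert abs((F(2 + h) - F(2 - h)) / (2 * h) - integrand(2)) < 1e-6
```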

\bigskip

For some time architectural considerations were concerned with
``oracles''
\cite{ballarin-paulson-ca-99,harr:HOL-maple,broker-archi-01}: these
seem particularly plausible in the case of integration --- just let a
CA system state an integral like an oracle, and let trustable TP
mechanisms check its correctness by differentiation (which is the
simple inverse problem). When striving to improve the
state-of-the-art, oracles in this case appear insufficient for several
reasons:

\begin{enumerate}

\item An oracle is incapable of serving within a decision procedure;
in the above example it cannot ensure the absence of an integral. So
CA features like integration call for a better integration of
computation into deduction than can be accomplished by oracles (where
any non-certified code can only provide trustability on the level of
oracles).

\item In CA systems, errors like \cite{Mulders97anote} should not be
found by chance; such errors should be excluded by the systems'
architecture. The integration of computation and deduction as
envisaged in Step 2 (\S\ref{step-2}) promises some advancement in this
respect; however, we want to go further~\dots

\item If sufficiently modularised, large CA modules like the Risch
algorithm could easily have respective parts replaced by improvements
like the one by Lazard/Rioboo/Trager
\cite{lazard-rioboo-trager-90}. Such replacements concern not only the
code of the algorithm (including checks for preconditions), but also
the deductive structure (the postcondition should be preserved).

\end{enumerate}

\medskip

More generally, for reasons of clarity, a congruence of computational

structure and of deductive structure seems desirable: the former is

represented by the code of the algorithm, the latter by preconditions

and postconditions, i.e. implicit specifications as given for

\texttt{\small gcd\_poly} on p.\pageref{gcd_poly-def}.

Thus, particularly within large CA modules, we want to guard essential
parts by implicit specifications (input / preconditions / output /
postconditions), abbreviated \textit{imSpec}. Interactive manipulation
of \textit{imSpecs} shall be integrated into Isabelle/Isar:
postconditions in the full expressiveness of HOL's predicate calculus,
support for storage of \textit{imSpecs} in appropriate collections and
retrieval from such (large) collections, assignment to specific
functions and connection to proofs of the respective postconditions,
etc.

The idea of such guards is not new. ``Design by Contract''
\cite{meyer-contract-09}, as implemented in the Eiffel language, integrates preconditions, invariants and
postconditions into the code. Our proposal re-raises these principles
within a logical framework. So, proofs of a large CA function's
postcondition can follow the structure of the algorithm's code, where
essential parts are connected with postconditions; these
postconditions can directly contribute to the overall proof, and the
structure of that proof is nicely congruent with the computational
structure.
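To illustrate what such guards amount to operationally, the following Python sketch (a hypothetical illustration in the spirit of Design by Contract, not Isabelle infrastructure) attaches a precondition and a postcondition to a \texttt{gcd} function; all names are ours:

```python
def with_contract(pre, post):
    """Guard a function: check `pre` before execution and `post` after."""
    def wrap(f):
        def guarded(*args):
            assert pre(*args), "precondition violated"
            result = f(*args)
            assert post(*args, result), "postcondition violated"
            return result
        return guarded
    return wrap

# Implicit specification of gcd on naturals, in the spirit of gcd_poly:
# precondition: not both arguments are zero;
# postcondition: the result divides both arguments and is maximal.
@with_contract(
    pre=lambda a, b: a > 0 or b > 0,
    post=lambda a, b, c: a % c == 0 and b % c == 0
        and all(not (a % d == 0 and b % d == 0)
                for d in range(c + 1, max(a, b) + 1)),
)
def gcd(a, b):
    # Euclid's algorithm; the explicit specification guarded by the imSpec
    while b:
        a, b = b, a % b
    return a
```

Here the pre- and postcondition are merely checked at run time; in the envisaged Isabelle setting the precondition would be evaluated by deductive mechanisms and the postcondition proved once and for all.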

So the idea appears in line with Calculemus
\cite{calculemus-report-05}: Prove $\forall x.\;F\,x$; Solve $\exists
x.\;F\,x$; Compute $\lambda x.\;F\,x$.

\paragraph{Planned achievements}

\begin{itemize}

\item Clarify principles of congruence between computational and deductive
aspects in CA, such that particular inputs to the algorithm illuminate
the proof as special cases, and such that the deductive aspects
illuminate the semantics of the algorithm's code.

\item Clarify advancements of the above congruence for modularisation of
large CA functions, where sub-functions with semantics significant for
the overall function are guarded by implicit specifications (input /
precondition / output / postcondition).

\item Demonstrate the concepts by a specific example; promising
examples are expected in the areas of integration\footnote{The Risch
algorithm concerns integration over complex numbers, whereas Isabelle
so far only implements these functions over the reals.}, equation solving with
``multivalued functions'', differential and integral equations, etc.

\item Implement demonstrators and actually useful CA algorithms in

Isabelle.

\end{itemize}

\paragraph{Open issues}

\begin{itemize}

\item How to represent \textit{implicit} specifications (input /
precondition / output / postcondition) in Isabelle? \\ Locales? An
additional type of data? \dots

\item What is an appropriate structure in Isabelle for collections of
such specifications? \\ Trees? DAGs? \dots

\item What are the requirements for interactive manipulation of large
collections of implicit specifications: manipulations like store,
retrieve, assign a particular specification to particular code, or
support an immediate proof of the respective postcondition?

\item How to adjust the presentation of collections to Isabelle/Isar's
document model? Integration into the existing model? An additional
tree (in line with the trees for counterexamples, etc.)?

\end{itemize}

\medskip

\paragraph{Summary:} $\;$\\ \textit{
Given large CA modules embedded in logical frameworks, the architecture
should make explicit both computational structure and deductive
structure.
\medskip\\
The deductive structure of programs is described by implicit
specifications (pre- and postconditions); so programs and implicit
specifications need to be tightly connected:

\begin{compactitem}

\item preconditions

\begin{compactitem}

\item require evaluation by deductive mechanisms

\item guard programs, i.e.\ start execution only if the preconditions
evaluate to `true'

\end{compactitem}

\item postconditions of sub-programs

\begin{compactitem}

\item support proving the postcondition of the main-program

\item complete contexts for evaluating preconditions of subsequent

sub-programs.\medskip

\end{compactitem}

\end{compactitem}

In Isabelle, the connection of functions and implicit specifications
involves

\begin{compactitem}

\item rethinking of the function package

\item rethinking of the document model (for managing a multitude of

different models in parallel)

\end{compactitem}

}

\bigskip

\subsection{Step 4: Tackle Open Issues from Steps 1--3}\label{step-4}

.\\.\\.\\

\paragraph{Planned achievements}

\begin{itemize}

\item

\item

\item

\end{itemize}

\paragraph{Expected open issues}

\begin{itemize}

\item

\item

\item

\end{itemize}

\medskip

\paragraph{Summary:} $\;$\\ \textit{

The advancement of CA features as described in
Steps 1--3 raises novel requirements.

\\

\medskip These requirements need to be sorted out into a

\emph{consistent software design}

\\

\medskip --- a process which shall proceed in steps and by mutual

agreement between Munich and Linz.

}

\bigskip

\subsection{Step 5: Towards Application beyond CA?}\label{step-5}

% AFP: 2012-11-14: A Separation Logic Framework for Imperative HOL

% Author: Peter Lammich and Rene Meis

% \\

% ?Florian: ``...As we target HOL, our programs can be translated to

% efficiently executable code in various target languages, including ML,

% OCaml, Haskell, and Scala.'' ... how does Imperative HOL relate to

% codegen ?

%

% A layer on top of the existing code generator that makes imperative
% data structures usable. Somewhat brittle in its handling. If one wants
% to verify imperative programs with it, it is advisable to first verify
% a corresponding functional program and then ``carry over'' that
% result. For our work, I think, not interesting at the beginning.

\paragraph{Planned achievements}

\begin{itemize}

\item

\item

\item

\end{itemize}

\paragraph{Expected open issues}

\begin{itemize}

\item

\item

\item

\end{itemize}

\medskip

\paragraph{Summary:} $\;$\\ \textit{
The requirements raised above for advancing CA features are novel
in a way which raises questions like:
\\
\medskip Which of the concepts and mechanisms might \emph{scale up to
a scope beyond CA?}\\ Which of them might reach immediate relevance
for engineering practice?\\ \dots?

\\

\medskip Such questions are considered hypothetical at present, but

might be relevant for future practice in ``Formal Methods'' {\em and}

give directions even in small steps.

}

\bigskip


\subsection{Step 1: Exploit Isabelle's Computational Features}\label{step-1}

Some CA features are algorithms which are total over a wide range (and
do not directly involve predicates and propositions constraining the
range), and thus are mere computations independent of deductive
infrastructure. Such algorithms are the Gr\"obner Bases method,
cylindrical algebraic decomposition
\cite{Collins:1976:QER:1093390.1093393} and others. For implementing
these algorithms Isabelle already provides appropriate
infrastructure, as shown in the sequel.

\paragraph{Example:} Cancellation of fractions, an indispensable part

of symbolic simplification:

$$\frac{x^2-y^2}{x^2-x\cdot y}=\frac{\cancel{(x+y)}\cdot(x-y)}{\cancel{(x+y)}\cdot x}=\frac{x-y}{x}\;\land\;x+y\not=0\;\land\;x\not=0$$

Cancellation of multivariate polynomials requires advanced methods of
symbolic computation (Hensel lifting or residues); it cannot be accomplished by
rewriting. Rather, a respective algorithm implemented according to
\cite{winkler:poly} comprises about 77\fixme[exact number of funs in
gcd\_poly] functions with about 600 lines of code\fixme[exact LoCs in
gcd\_poly].
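To give a flavour of the computational core, the following Python fragment is a drastically simplified \textit{univariate} analogue (an illustration with exact rational coefficients, not the \texttt{gcd\_poly} implementation): a polynomial gcd by Euclid's algorithm, with coefficient lists in ascending order:

```python
from fractions import Fraction

def poly_mod(a, b):
    """Remainder of polynomial division a mod b; b must not be the
    zero polynomial (the partiality issue discussed for [:0:])."""
    a = [Fraction(c) for c in a]
    while len(a) >= len(b):
        q = a[-1] / b[-1]          # eliminate the leading coefficient of a
        shift = len(a) - len(b)
        for i, c in enumerate(b):
            a[i + shift] -= q * c
        while a and a[-1] == 0:    # drop trailing zero coefficients
            a.pop()
    return a

def poly_gcd(a, b):
    """Monic gcd of two univariate polynomials by Euclid's algorithm."""
    a = [Fraction(c) for c in a]
    b = [Fraction(c) for c in b]
    while b:
        a, b = b, poly_mod(a, b)
    return [c / a[-1] for c in a]  # normalise to a monic polynomial
```

For example, \texttt{poly\_gcd([-1, 0, 1], [0, -1, 1])} computes the gcd $x-1$ of $x^2-1$ and $x^2-x$, mirroring the cancellation example above. The multivariate case additionally needs contents, primitive parts and modular techniques, which is what drives \texttt{gcd\_poly} to its roughly 600 lines.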

This algorithm can be implemented\footnote{At the time of writing this
paper, a translation from SML to Isabelle is under construction. First
experiences do not suggest fundamental problems with the implementation.} by
use of Isabelle's function package
\cite{krauss:partial06,krauss-fun-def-10} according to the
\textit{implicit} specification~\footnote{We use the notion
`\textit{implicit} specification' in order to distinguish it from other
notions of `specification'; for instance, in the literature
`specification' is also used for code like the one describing
\textit{how} to compute the \texttt{gcd} for natural numbers; here
this is called `\textit{explicit} specification'.}:

\label{gcd_poly-def}

{\small

\begin{verbatim}

locale gcd_poly =
  fixes gcd_poly :: int poly => int poly => int poly
  assumes gcd_poly a b = c
      and not (a = [:0:])
      and not (b = [:0:])
      ==> c dvd a and c dvd b
          and forall c'. (c' dvd a and c' dvd b) => c' <= c

\end{verbatim}

}%\small

So this algorithm not only generates assumptions as shown in the
example above; it is also partial, and for most usages the
arguments must not be zero polynomials \texttt{[:0:]}. However, in HOL there
is the acknowledged method of ``totalization'', which would assign the
value \texttt{[:0:]} to such arguments.

\medskip The function package translates algorithmic code, here called

'\textit{explicit} specification', into equations. For instance, the

code for calculating the \texttt{gcd} for natural numbers is as

follows:

{\small

\begin{verbatim}

fun gcd :: "nat => nat => nat"

where "gcd m n = (if n = 0 then m else gcd n (m mod n))"

\end{verbatim}

}%\small

\noindent From this explicit specification the function package
generates the following equations, one theorem of each kind in this simple
example:

{\small

\begin{verbatim}

thm gcd.simps

-- "gcd ?m ?n = (if ?n = 0 then ?m else gcd ?n (?m mod ?n))"

thm gcd.induct

-- "(forall m n. (n ~= 0 => ?P n (m mod n)) => ?P m n) => ?P ?a0.0 ?a1.0"

thm gcd.cases

-- "(forall m n. ?x = (m, n) => ?P) => ?P"

thm gcd.psimps

-- "gcd_dom (?m, ?n) => gcd ?m ?n = (if ?n = 0 then ?m else gcd ?n (?m mod ?n))"

thm gcd.pinduct

-- "gcd_dom (?a0.0, ?a1.0) => (forall m n. gcd_dom (m, n)

=> (n ~= 0 => ?P n (m mod n)) => ?P m n) => ?P ?a0.0 ?a1.0"

\end{verbatim}

}%\small

The theorem \texttt{gcd.simps} allows one to evaluate, for instance,
\texttt{gcd 18 12} to \texttt{6}. The other theorems serve different
purposes, for instance, to prove termination automatically or, if too
complicated for automation, to prove it interactively.
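The single defining equation behind \texttt{gcd.simps} can be transcribed one-to-one into ordinary code; the following Python sketch (an illustration only, not part of the Isabelle development) evaluates exactly as the rewriting of \texttt{gcd 18 12} does:

```python
def gcd(m, n):
    # one-to-one transcription of:
    #   gcd m n = (if n = 0 then m else gcd n (m mod n))
    return m if n == 0 else gcd(n, m % n)

# rewriting trace: gcd 18 12 -> gcd 12 6 -> gcd 6 0 -> 6
```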

\medskip While the calculation of \texttt{gcd 18 12} is determined by the two
lines presented above, the calculation of \texttt{gcd\_poly} according
to the implicit specification above requires 77 functions comprising
600 LoCs. \fixme[exact LoCs and no. of funs in gcd\_poly] So the
function package has to operate on a large number of equations, but it
does that with response times (including for evaluation) sufficient
for interactive work.

Another essential purpose of the generated theorems is to prove the
postcondition $c \;{\it dvd}\; a \land c \;{\it dvd}\; b \land \forall
c^\prime. (c^\prime \;{\it dvd}\; a \land c^\prime \;{\it dvd}\; b)
\Rightarrow c^\prime \leq c$. That way implementations can be proved
correct, which for \texttt{gcd\_poly} is already a comprehensive task.

\bigskip

Once the algorithm is implemented by the 77 (TODO) functions, these

functions can be fed into the code generator

\cite{nipkow-bergh-code-gen-02,haftm-nipkow-code-gen-HRS-10}; the

proof of the implementation's correctness transfers to the generated

code. There are several techniques to embed generated code into

Isabelle's runtime system, as well as a mechanism for separated

compilation \cite{codegen-man-12}. Given such an

embedding

%WN all \fixme answered in mail 01/28/2013 10:47 PM

%

% \fixme[?Florian: which kind(s) of embedding would you

% recommend in this case?],

% In this case there is already ML code. It already contains a
% symbolic description of the problem, and exactly this is the
% embedding. To prove something about this symbolic description, one
% has to give it a semantics in the logic. Just one partial example:
% the ML code contains (to my knowledge) lists ('a list) which stand
% for polynomials. I.e. in the logic one would need the embedding
% 'a list => 'a poly (which, I suspect, already exists).

proving in Isabelle/Isar would allow one to omit
the assumptions, as shown below by \sout{strike-through}.

{\small

\begin{tabbing}

12\=12\=12\=12\=12\kill

\> lemma cancel:\\

\>\> assumes \sout{asm1: "(x$\hat{\;\;}$2 - y$\hat{\;\;}$2) = (x + y) * (x - y)"} \\

\>\>\> \sout{and asm2: "x$\hat{\;\;}$2 - x * y = x * (x - y)"}\\

\>\>\> \sout{and asm3: "x $\not=$ 0" and} asm4: "x$\hat{\;\;}$2 - x * y $\not=$ 0" \\

\>\> shows "(x$\hat{\;\;}$2 - y$\hat{\;\;}$2) / (x$\hat{\;\;}$2 - x * y) = (x + y) / (x::real)"\\

\>\> apply (insert \sout{asm1 asm2 asm3} asm4)\\

\>\> apply simp\\

\> done

\end{tabbing}

}%\small

The algorithm, generated to SML code, is expected to constitute an
important part of Isabelle's simplifier, such that simplifications
like the following are automated:

% \fixme[Florian: how would that code be integrated into Isabelle's

% simplifier?]

% A priori it is hard to say exactly how. One can take the bare ML code
% and write a simproc. Or think more in the direction of reflection
% and write a binding like the typical examples in HOL/Decision_Procs
% have. Or\dots

%

% \fixme[Florian: are there already mechanisms in the simplifier to push

% assumptions like $x+y\not=0\;\land\;x\not=0$ into the context?]

% Excellent ones. By the way, an overview of the simplifier is given
% in §3.1 of the HOL tutorial.

$$\frac{y}{x}+\frac{x^2-y^2}{x^2-x\cdot y}=1\;\;\;\land\;x+y\not=0\;\land\;x\not=0$$

\paragraph{Planned achievements} from implementing this kind of CA

algorithms using Isabelle's function package and from generating SML

using the code generator could be as follows:

\begin{itemize}

\item CA functions existing in Isabelle made efficient via code

generation: Differential Equations

\footnote{\url{http://afp.sourceforge.net/entries/Ordinary_Differential_Equations.shtml}},

fast Fourier transform

\footnote{\url{http://afp.sourceforge.net/entries/FFT.shtml}},

Gauss-Jordan Elimination

\footnote{\url{http://afp.sourceforge.net/entries/Gauss_Jordan_Elim_Fun.shtml}}.

% \fixme[?Florian: Do these cases make sense?],

% From my understanding of the algorithms, yes. At least a grep for
% »code« has found nothing suspicious so far. The example for ordinary
% differential equations looks a bit magic\dots

%

% \fixme[?Florian: other relevant candidates of existing Isabelle

% developments for code generation?]

% Answering this is an analysis of its own, and the answer quickly
% becomes outdated as the AFP grows (three new entries in
% Jan. 2013 alone!). The index at http://afp.sourceforge.net/topics.shtml
% gives a fairly good overview, though.

\item Enhance existing calculational mechanisms, for instance linear
arithmetic, Presburger arithmetic, floating-point arithmetic,
normalisation of terms in semirings, etc.

% \fixme[?Florian: other relevant candidates of existing Isabelle

% developments for actually calculating results?]

% Linear arithmetic. Presburger arithmetic. Normalisation of
% expressions in semirings. All in HOL/Decision_Procs. Partly
% rough-hewn stones.

\item Existing Isabelle developments extended such that they

actually compute results (instead of checking an input

result):

% \fixme[?Florian: is there a way for the existing Gr\"obner Bases to

% automatically compute a result already?]):

% Good question. Probably Amine (the author) could answer it.
% Otherwise only a non-trivial code analysis remains.

%

% WN: did not ask isabelle-users@ because Isar has no syntax for
% such calculations.

Gr\"obner

Bases\footnote{\url{http://isabelle.in.tum.de/dist/library/HOL/Groebner_Basis.html}},

\dots

\item New CA functionality implemented, the implementation proved
correct and/or made efficient by code generation. Cancellation for
multivariate polynomials has begun; other functions shall be agreed
on.

\item Experience of how to exploit Isabelle's features to their

limits: function package, code generation, syntax of Isabelle's

executable fragment, exception handling.

\end{itemize}

\paragraph{Open issues,} which already seem clear, are the following:

\begin{itemize}

\item How to handle partiality? ``Totalization'' ($\frac{a}{0}\leadsto
0$, \dots) --- non-evaluation as in CAS (${\it hd}\;[\,]\leadsto{\it
hd}\;[\,]$, \dots) --- guarding evaluation by preconditions?\\ Note:
in our implementation of \textit{gcd\_poly} an argument $[:0:]$ would
cause non-termination in deeply nested functions.

\item Nested exception handling in composed functions?

\item Missing language features?

\end{itemize}
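The three options for partiality listed above can be contrasted in ordinary code; the following Python sketch (a hypothetical illustration, with division as the partial operation) juxtaposes them:

```python
def div_total(a, b):
    """'Totalization' in HOL style: a/0 yields a default value, here 0."""
    return 0 if b == 0 else a / b

def div_symbolic(a, b):
    """Non-evaluation in CAS style: the offending application is
    returned unevaluated, as a symbolic term."""
    return ("div", a, b) if b == 0 else a / b

def div_guarded(a, b):
    """Guarded evaluation: the precondition is checked before execution."""
    if b == 0:
        raise ValueError("precondition b != 0 violated")
    return a / b
```

Each choice has the trade-offs discussed above: totalization silently produces a logically consistent but possibly surprising value, non-evaluation defers the problem to the caller, and guarding refuses to start execution at all.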

\medskip

\paragraph{Summary:} $\;$\\ \textit{
Isabelle's computational features (function package $+$ code
generator) are {\rm powerful enough to tackle the implementation of
CA}, which
\begin{compactenum}
\item includes the most advanced functionality implemented in contemporary
CAS,
\item is more trustable than specific CA features due to Isabelle's
logical foundations and Isabelle's deductive infrastructure,
\item is efficient enough for engineering practice due to code generation.
\end{compactenum}

}

\bigskip

\subsection{Step 2: Novel Approaches to ``multivalued functions''}\label{step-2}

State-of-the-art CA systems are indispensable tools in
engineering and science. Practice in both benefits from efficient
computation by highly general algorithms --- while users are more
or less aware that the interpretation of the systems' results is their own
responsibility. Academic education makes significant efforts to remind
students of CA systems' limited trustability.

\paragraph{\large State-of-the-art in CA} is characterized by fervent
discussions about ``multivalued functions'', a major source of the
limited trustability of CA systems. For pure mathematics ``multivalued
functions'' are a ``contradictio per se'', whereas practitioners need
to operate on ``functions'' like $\sqrt{\phantom{x}}, \arcsin,$ etc., more or less
aware that these might result in multiple values. Below, CA systems'
limitations are identified as (1) \textit{under-specification} and (2)
lack of \textit{deductive mechanisms}.

\subparagraph{Multiple views of functions} comprise the ``branch view'',
``table-maker's view'', ``differential algebra view'' and ``view of
power-series analysis'', a variety of views which has been made
explicit over the years, and which exhibits crucial incompatibilities
\cite{davenport-10:multivalued}. CA originated from early AI's ideas
of ``general problem solvers'' and still adheres to the idea of serving
as many of the above views in parallel as possible and of providing
tools covering several views at once. So CA traditionally prefers
techniques which are under-specified, but which {\it ``can verify that
simplifications are valid {\rm 'almost} everywhere'~''}
\cite{Beaumont02towardsbetter} --- while {\it 'almost} is a point of
view unacceptable for TP.

\subparagraph{Lack of deductive mechanisms} is a consequence of
original design decisions and of the inclination towards ``totalization'' of
functions, which, however, cannot hold for handling expressions
containing ``multivalued functions'' like this one:

\begin{align}\label{arctan}

\arctan x + \arctan y = \arctan \left( \frac{x+y}{1-xy} \right) +

\begin{cases}

\pi & xy > 1 \land x > 0 \\

0 & xy < 1 \\

-\pi & xy > 1 \land x < 0

\end{cases}

\end{align}

A symbolic computation on such expressions can be seen as a tree where
nodes are branching points and branches denote assumptions. Each
occurrence of such a function opens new branches (three in the case of (\ref{arctan})),
each of which is related to assumptions, inequalities in the case of
(\ref{arctan}). Collections of such inequalities can be simplified
``to a minimum'' by ``cylindrical algebraic decomposition''
\cite{Collins:75} and other powerful methods, which allow
``totality'' to be preserved.
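The three branches of identity (\ref{arctan}) can at least be checked numerically; the following Python sketch (a numeric illustration, not a CA implementation; it assumes $xy\not=1$) selects the correction term from the branch conditions:

```python
from math import atan, pi, isclose

def atan_sum(x, y):
    """Right-hand side of the arctan addition formula, with its branches."""
    base = atan((x + y) / (1 - x * y))   # assumes x*y != 1
    if x * y > 1 and x > 0:
        return base + pi                 # branch: x*y > 1 and x > 0
    if x * y < 1:
        return base                      # branch: x*y < 1
    return base - pi                     # branch: x*y > 1 and x < 0

# one sample point per branch agrees with the left-hand side:
for x, y in [(2.0, 3.0), (0.5, 0.5), (-2.0, -3.0)]:
    assert isclose(atan(x) + atan(y), atan_sum(x, y))
```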

On the other hand, the issue of branching problems has been recognised
for some time. Thus automated solvers have been incorporated into CA
systems \cite{assume-maple-92}. Also, techniques
handling ``guarded expressions'' \cite{guarded-expr-97} have been
introduced. However, such ``assume facilities'' still remain isolated
within specific tools in predominant CA systems.
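A ``guarded expression'' in the sense of \cite{guarded-expr-97} can be sketched as a list of guard--value pairs; the following minimal Python illustration (our own hypothetical representation, shown here for $\sqrt{x^2}$) makes the branch structure explicit:

```python
# a guarded expression: a list of (guard, value) pairs,
# guards and values both represented as predicates/functions of x
sqrt_of_square = [
    (lambda x: x >= 0, lambda x: x),    # branch  x >= 0 :  sqrt(x^2) =  x
    (lambda x: x < 0,  lambda x: -x),   # branch  x <  0 :  sqrt(x^2) = -x
]

def evaluate(guarded, x):
    """Return the value of the first branch whose guard holds at x."""
    for guard, value in guarded:
        if guard(x):
            return value(x)
    raise ValueError("no branch applicable")
```

An ``assume facility'' then amounts to discharging guards from contextual knowledge instead of evaluating them pointwise, which is precisely where deductive mechanisms enter.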

\medskip

Present Maple, for instance, employs a tight integration of an
``evaluator'', a ``property reasoner'' and a ``solver''
\cite{Armando:2005:REM:1740741.1740840}. \fixme[add a figure depicting
Maple's ``evaluator'', ``property reasoner'' and
``solver''.]

% \fixme[?Florian: Can Isabelle already compute (=``evaluate'')

% floating-point numbers? How do I invoke that feature interactively?].

%

% There is an approximative theory by Johannes
% (Decision_Procs/Approximation). For exact symbolic computation I
% asked a question on Isabelle users a while ago, where quite a few
% references to publications came together on how one could
% implement something like that.

%

%WN very outdated:

%https://lists.cam.ac.uk/mailman/htdig/cl-isabelle-users/2005-August/msg00013.html

However, ``property functions'' allow one to
express only a restricted class of properties, and this limits the potential
of the technique.

Since CA strives for ``totality'', i.e.\ for having total functions, such solvers
need to be restricted to efficient algorithms, for instance to
\textit{linear programming}. Still there are suggestions for removing
errors and improving automated solvers, for instance
\cite{Armando:2005:REM:1740741.1740840}. The latter work also
identifies open issues; a simple example is the following:

\begin{align}

\sin{x} &= y \label{sin}\\

x &= \arcsin{y} \label{arcsin}\\

x &=

\begin{cases}

\{\} & y<-1 \;\lor\; 1< y \label{arcsin-cases}\\

\!\begin{aligned}[b]%http://tex.stackexchange.com/questions/5183/split-like-environment-inside-cases-environment

& \{x. \;\sin{x}=y\land

\forall x^\prime. \;\sin{x^\prime}=y \Rightarrow \\

& \quad \exists k\in{\cal Z}.\;x^\prime = 2k \pi-x

\lor x^\prime = (2k+1)\pi+x\}

\end{aligned} & -1 \leq y \;\land\; y < 0\\[1ex]%#

\!\begin{aligned}[b]%http://tex.stackexchange.com/questions/5183/split-like-environment-inside-cases-environment

& \{x. \;\sin{x}=y\land

\forall x^\prime. \;\sin{x^\prime}=y \Rightarrow \\

& \quad \exists k\in{\cal Z}.\;x^\prime = 2k \pi+x

\lor x^\prime = (2k+1)\pi-x\}

\end{aligned} & 0 \leq y \;\land\; y \leq 1

\end{cases}

\end{align}

Expressions like (\ref{sin}) and (\ref{arcsin}) are customary in CA,

but obviously under-specified. The branching problem stated in

expression (\ref{arcsin-cases}) cannot be handled by any known CA

algorithm: the $k$ introduces an infinite number of equalities, which

is beyond the scope of cylindrical algebraic decomposition and other

methods.
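The infinite solution family in (\ref{arcsin-cases}) can at least be enumerated numerically for finitely many $k$; the following Python sketch (an illustration of the branch $0 \leq y \leq 1$, not a CA algorithm) generates $x^\prime = 2k\pi + x$ and $x^\prime = (2k+1)\pi - x$ and checks them against $\sin$:

```python
from math import asin, sin, pi, isclose

def sin_solutions(y, ks=range(-2, 3)):
    """Solutions of sin x = y for |k| <= 2, branch 0 <= y <= 1 only."""
    if not 0 <= y <= 1:
        raise ValueError("only the branch 0 <= y <= 1 is sketched here")
    x = asin(y)   # the canonical solution in [-pi/2, pi/2]
    return ([2 * k * pi + x for k in ks]
            + [(2 * k + 1) * pi - x for k in ks])

# every generated x' indeed satisfies sin x' = y:
assert all(isclose(sin(xp), 0.5, abs_tol=1e-12)
           for xp in sin_solutions(0.5))
```

Checking \textit{all} solutions for \textit{all} $k$, however, is exactly the universally quantified statement that lies beyond CA algorithms and calls for deduction.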

%Davenport 2010, challenges of multival:

% $$\sqrt{z-1}\sqrt{z+1}=\sqrt{z^2 -1} \;\;\;

% \text{for}\; \Re(z) \geq 0\lor (\Re(z)=0 \land \Im(z)>0) $$

%

% \[

% \log z_1 + \log z+2 = \log z_1z_2 +

% \begin{cases}

% 2\pi i & \arg z_1 + \arg z_2 > \pi \\

% 0 & -\pi < \arg z_1 + \arg z_2 < \pi \\

% -2\pi i & \arg z_1 + \arg z_2 < -\pi

% \end{cases}

% \]

\medskip So, a recent conclusion on the state of the art in CA is

\textit{``Producing (formal) proofs of such statements [on

'multivalued functions'] is a developing subject, even in the context

of a computer algebra system. Converting them into an actual theorem

prover is a major challenge. Unfortunately, cylindrical algebraic

decomposition, as used here [on 'multivalued functions'], does not

seem to lend itself to being used as an ‘oracle’ by theorem provers''}

\cite{davenport-10:multivalued}.

\paragraph{\large Novel approaches to ``multivalued functions''} which
employ Isabelle's logical foundations and powerful deductive
mechanisms appear in a straightforward manner, as shown subsequently.

\subparagraph{In Isabelle ``univalued'' definitions} for

transcendental functions are already there. For instance, $\sin$ is

defined as power-series, such that the coefficients {\small\tt

sin\_coeff} are defined separately:

{\small\tt

\footnote{\url{http://isabelle.in.tum.de/dist/library/HOL/Transcendental.html}

for Isabelle2012.}

\begin{tabbing}

121212\=12\="sin\_coeff = ($\lambda$n. \=12\=12\=12\=12\=12\=12\=12\=12\=12\kill

\> definition sin\_coeff :: "nat $\Rightarrow$ real" where\\

\>\> "sin\_coeff = ($\lambda$n. if even n then 0 \\

\>\>\> else -1 $\widehat{\;}$ ((n - Suc 0) div 2) / real (fact n))"\\

%\\

\> definition sin :: "real $\Rightarrow$ real" where\\

\>\> "sin = ($\lambda$x. $\Sigma$n. sin\_coeff n * x $\widehat{\;}$ n)"

\end{tabbing}

}%small

These definitions, of course, come along with proofs of the respective

characteristic properties:

{\small\tt

\begin{tabbing}

121212\=12\=12\=12\kill

\> lemma sin\_periodic [simp]: "sin (x + 2*pi) = sin x"\\

\> \dots\\

\> lemma sin\_gt\_zero2: "[| 0 < x; x < pi/2 |] ==$>$ 0 < sin x"\\

\> \dots

\end{tabbing}

}%small

When $\sin$ and other functions are extended to complex numbers,
these will be rigorously distinguished (by Isabelle's axiomatic type
classes). $\arcsin$ is defined as an inverse function on an interval,
such that it actually is a function (and not a relation):

{\small\tt

\begin{tabbing}

121212\=12\=12\=12\kill

\> "arcsin y = (THE x. -(pi/2) $\leq$ x \& x $\leq$ pi/2 \& sin x = y)"

\end{tabbing}

}%small

By definitions like the above, expressions (\ref{sin}) and
(\ref{arcsin}) on p.\pageref{sin} are well defined --- however,
solving (\ref{sin}) as an equation in $x$ cannot be done by expression
(\ref{arcsin}) if $x\not\in[-\frac{\pi}{2},\frac{\pi}{2}]$; this point
will be discussed in the next paragraph.
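This restriction is easy to observe numerically; a short Python sketch (illustration only) of the point just made:

```python
from math import asin, sin, pi, isclose

x = 2.5            # a solution of sin x = y outside [-pi/2, pi/2]
y = sin(x)

# arcsin returns THE solution inside [-pi/2, pi/2] ...
assert -pi / 2 <= asin(y) <= pi / 2
# ... which here is pi - x, not x itself:
assert isclose(asin(y), pi - x)
assert not isclose(asin(y), x)
```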

\subparagraph{In Isabelle computational mechanisms} have been
introduced recently (where ``computation'' means rewriting by
equations generated by the function package as well as execution of
code generated by the code generator). Using these means when pushing
the implementation of CA to the cutting edge and beyond indicates a need
for further integration of computational mechanisms into Isabelle's
elaborated deductive mechanisms. Such a need particularly arises from
tackling novel approaches towards ``multivalued functions'', as shown
in the following.

\medskip %some algorithms according to step 1
In order to reach the state of the art, there are some ``total'' CA
algorithms which can be implemented in Isabelle according to
\S\ref{step-1}, for instance cylindrical algebraic decomposition (CAD)
\cite{Collins:1976:QER:1093390.1093393} or probably linear
programming. Such an implementation would recast CA algorithms in terms
of TP concepts. For instance, Maple's evaluation process can be recast
as Constraint Contextual Rewriting (CCR)
\cite{Armando:2005:REM:1740741.1740840}. Thus the reconstruction of CA
algorithms within a logical framework immediately improves
trustability.

\medskip %more expressiveness in TP than in CA
When heading beyond the state of the art, contextual reasoning as
required by ``multivalued functions'' is well addressed in TP: TP is
specialised to handle assumptions like $\exists k\in{\cal
Z}. \;x^\prime = 2k \pi-x \lor x^\prime = (2k+1)\pi+x$ in
(\ref{arcsin-cases}) on p.\pageref{arcsin-cases} (assumptions on
branches, which cannot be handled by traditional CA, i.e.\ by cylindrical
algebraic decomposition). Thus a considerably wider class of
properties can be handled than those presently expressible by means of property
functions in Maple, see for instance
\cite{Armando:2005:REM:1740741.1740840}.

%special methods for special cases -- interactivity

Handling special expressions by special methods agrees well with

concepts of interactive proving. Isabelle's \textit{contexts} are

designed to handle arbitrary logical data locally and consistent with

scopes --- so \textit{contexts} are the perfect mechanism to tackle

cascades of branches in expressions with ``multivalued functions''.

%interactively follow into branches during computation
Isabelle's interactivity also supports following specific
branches, while the \textit{context} handles the respective
assumptions during computation --- that means, the respective
\textit{context} is able to provide the simplifier (evaluating
equations generated by the function package) with the local
assumptions, and also to provide interactive proof tools with
appropriate logical data. This way even special cases might be
handled which have been excluded so far
\cite{Beaumont02towardsbetter}, for instance $\log{\sin(x)}$ or
$\arctan(x\arctan(\frac{1}{x}))$ on all arguments, or
$\sin((x\arctan(\frac{1}{x}))^2)$ and $\frac{\arctan(x)}{\arctan(2x)}$ for
$x=1$.

\medskip As already mentioned,
efficiency reasons suggest transferring CA algorithms for ``multivalued
functions'' to SML by Isabelle's code generator. Such a transfer from
interactive Isabelle/Isar to generated code involves deductive
mechanisms as well. This opens novel issues for R\&D, addressed
below among others.

% \cite{davenport-10:multivalued} NOTATION:

% $\mathbf{P}(A)$ denotes the power set of $A$. For a function $f$, we

% write ${\it graph}(f)$ for $\{(x,f(x)).\; x\in{\it dom}(f)\}$ and

% ${\it graph}(f)^T$ for $\{(f(x),x).\; x\in{\it dom}(f)\}$. Polar

% coordinates in the complex plane $\mathbf{C}\equiv X\times_{\it

% polar}Y$ are written as $z=r e^{i\phi}.\;r\in X\land\phi\in

% Y$. According to Bourbaki, a function is {\em total} and {\em

% single-valued}, and written as $(F,A,B)$ making explicit the domain

% $A$ and the codomain $B$.

\paragraph{Planned achievements} in handling ``multivalued

functions'':

\begin{itemize}

\item Clarify principles for overcoming long-standing deficiencies in

CA by implementation in a logical framework:

\begin{itemize}

\item assign specific provers appropriate for specific kinds of

assumptions alongside branches

\item keep management of such assumptions open for interactivity

\item localize assumptions using Isabelle's contexts.

\end{itemize}

\item Design specific improvements for particular deficiencies

selected in mutual agreement from topics like simplification and

equation solving. Select one or more specific topics and investigate

possible improvements in detail.

\item Implement improvements for agreed topics in Isabelle.

\end{itemize}

\paragraph{Open issues} from novel approaches to ``multivalued

functions'' in logical frameworks:

\begin{itemize}

\item Selection and combination of specific interactive and automated

provers and decision procedures for specific classes of assumptions,

assumptions from the present state of CA towards the full expressiveness

of predicate calculus.

\item Stop rewriting at branching points during evaluation (i.e.\ while
the simplifier rewrites with the function package's equations), for two
purposes:

\begin{itemize}

\item for entering interactive proof on simplification of

assumptions and respective branching

\item for debugging and inspecting contexts within deeply nested

functions.

\end{itemize}

\item Transfer of integrated mechanisms for proving and computing
(i.e. rewriting in the simplifier) to generated code: identify
subclasses of problems which allow {\it automated} proofs on {\it
all} branches.

\item Calls of deductive mechanisms from generated code; possibly even
calls of \textit{interactive} proof methods?

\end{itemize}

\medskip

\paragraph{Summary:} $\;$\\ \textit{

``Multivalued functions'' is an important topic, which suffers from
long-standing deficiencies unresolved in CA. At the state of the art,
CA cannot appropriately handle the branching problems involved.

\medskip\\

Handling assumptions along cascades of branch cuts by deductive
mechanisms appears most promising. Such handling has not yet been
tackled, because

\begin{compactitem}

\item CA systems are not interested (no logics!)

\item TP systems are not ready (no complex functions yet).\medskip

\end{compactitem}

In Isabelle, calling deductive mechanisms during computation would
require

\begin{compactitem}

\item adapting the simplifier for calls during ``execution'' of

functions

\item adapting the code generator for inserting calls into generated

code.

\end{compactitem}

}

\bigskip

\subsection{Step 3: Modularisation for Large CA Features}\label{step-3}

While the previous ``Step 2'' addressed relations between Isabelle's
computational and deductive mechanisms \textit{within} algorithms,
this step addresses these relations \textit{between} algorithms. The
motivation is architectural: an architecture of CA should achieve
congruence between computational structure and deductive structure.

\paragraph{Example 'Symbolic Integration':} The
Risch algorithm \cite{risch-1st-69} was an early breakthrough of CA,
outperforming the human experts who previously constituted an important
professional field concerned with finding integrals. The algorithm is
also theoretically important as a decision procedure, which assures the
non-existence of a solution (in the form of an elementary function) if
the algorithm doesn't find one.

The Risch-algorithm decomposes an integral into polynomial parts

(essentially relying on partial fraction decomposition) like in

{\footnotesize

$$\int{\frac{x^5-2x^4+x^3+x+5}{x^3-2x^2+x-2}}dx=\int{x^2}dx+\int{\frac{3}{x-2}}dx+\int{\frac{-x-1}{x^2+1}}dx$$

}

\noindent and transcendental parts like in

{\footnotesize

\[

\begin{aligned}

\int{\frac{-e^x-x+ \ln(x)x+\ln(x)xe^x}{x(e^x+x)^2}}\,dx &= \\
\int{\frac{1+e^x}{(e^x+x)^2}}\,dx=-\frac{1}{e^x+x} \;\dots\; \\
\dots &= -\frac{\ln(x)}{e^x+x}

\end{aligned}

\]

}

\noindent and creates a cascade of sub-problems while applying

Liouville's theorem and computing the Risch differential equation. The
Risch algorithm has been considerably improved in the meantime, see
for instance \cite{lazard-rioboo-trager-90}. Errors in CA systems'
integration have been reported over the years, for instance by
\cite{hoelzl09realineequalities}\footnote{The reported error
$\int_{-1}^1{\frac{1}{\sqrt{x^2}}dx=0}$ was found corrected to
``integral does not converge'' in Wolfram Alpha on Jan.~20, 2013.};
errors found in Maple's implementation have been thoroughly
investigated and published, for instance in \cite{Mulders97anote}.
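The polynomial part of the example above can be reproduced with any CA
system; the following sketch uses Python's SymPy library (an
illustration of the decomposition step only, not of the Risch algorithm
itself):

```python
from sympy import symbols, apart, integrate, diff, simplify

x = symbols('x')
f = (x**5 - 2*x**4 + x**3 + x + 5) / (x**3 - 2*x**2 + x - 2)

# partial fraction decomposition yields the three summands shown above:
# x**2, 3/(x - 2) and (-x - 1)/(x**2 + 1)
decomposed = apart(f, x)
print(decomposed)

# each summand is integrated separately; the sum is an antiderivative of f
F = integrate(decomposed, x)
assert simplify(diff(F, x) - f) == 0
```

Note that the final check already hints at the ``oracle'' discussion
below: differentiation is the cheap, trustworthy direction.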

\bigskip

For some time architectural considerations were concerned with

``oracles''

\cite{ballarin-paulson-ca-99,harr:HOL-maple,broker-archi-01}: these

seem particularly plausible in the case of integration --- just let a
CA system provide an integral like an oracle, and let trusted TP
mechanisms check correctness by differentiation (which is the simpler
inverse problem). When striving to improve the state of the art,
oracles in this case appear insufficient for several reasons:

\begin{enumerate}

\item An oracle is incapable of serving within a decision procedure; in
the above example it cannot ensure the absence of an integral. So CA
features like integration call for a better integration of computation
into deduction than can be accomplished by oracles (where any
non-certified code can only provide trustability on the level of
oracles).

\item In CA systems, errors like \cite{Mulders97anote} should not be
found by chance; such errors should be excluded by the systems'
architecture. So the integration of computation and deduction as
envisaged in Step 2 (\S\ref{step-2}) promises some advancement in this
respect; however, we want to go further~\dots

\item If sufficiently modularised, large CA modules like the
Risch algorithm could easily have respective parts replaced by
improvements like the one by Lazard/Rioboo/Trager
\cite{lazard-rioboo-trager-90}. Such replacements concern not only
the code of the algorithm (including checks for
preconditions), but also the deductive structure (the postcondition
should be preserved).

\end{enumerate}
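The oracle scheme itself is easy to sketch; the following Python/SymPy
fragment (SymPy standing in for the untrusted CA system) shows the
trusted check as differentiation plus simplification. Point 1 above is
exactly what this check cannot provide: a failed check says nothing
about the non-existence of an elementary integral.

```python
from sympy import symbols, exp, log, diff, simplify

x = symbols('x', positive=True)

# integrand and candidate antiderivative from the transcendental example;
# the candidate plays the role of the CA oracle's answer
integrand = (-exp(x) - x + log(x)*x + log(x)*x*exp(x)) / (x*(exp(x) + x)**2)
candidate = -log(x) / (exp(x) + x)

# trusted check: differentiate the oracle's answer and compare
assert simplify(diff(candidate, x) - integrand) == 0
```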

\medskip

More generally, for reasons of clarity, a congruence of computational

structure and of deductive structure seems desirable: the former is

represented by the code of the algorithm, the latter by preconditions

and postconditions, i.e. implicit specifications as given for

\texttt{\small gcd\_poly} on p.\pageref{gcd_poly-def}.

Thus, particularly within large CA modules, we want to guard essential
parts by implicit specifications (input / preconditions / output /
postconditions), abbreviated \textit{imSpec}. Interactive
manipulation of \textit{imSpecs} shall be integrated into
Isabelle/Isar: postconditions in the full expressiveness of HOL's
predicate calculus, support for storage of \textit{imSpecs} in
appropriate collections and retrieval from such (large) collections,
assignment to specific functions and connection with proofs of the
respective postconditions, etc.

The idea of such guards is not new. ``Design by Contract''
\cite{meyer-contract-09}, as implemented in the Eiffel language,
integrates preconditions, invariants and postconditions in the
code. Our proposal re-raises these principles within a logical
framework. So proofs of a large CA function's postcondition can follow
the structure of the algorithm's code, where essential parts are
connected with postconditions --- so these

postconditions can directly contribute to the overall proof, and the

structure of that proof is nicely congruent with the computational

structure.

So the idea appears in line with Calculemus
\cite{calculemus-report-05}: Prove $\forall x.\;F\,x$, Solve $\exists
x.\;F\,x$, Compute $\lambda x.\;F\,x$.
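The flavour of such guards can be sketched in a few lines of Python; the
function name and the assert-based checks are illustrative stand-ins for
\textit{imSpecs} evaluated by deductive mechanisms, not a proposal for
the Isabelle representation:

```python
def gcd_guarded(a: int, b: int) -> int:
    # precondition: evaluated before execution starts; the program is
    # only entered if the guard holds
    assert a > 0 and b > 0, "precondition violated"

    x, y = a, b
    while y:                 # Euclidean algorithm
        x, y = y, x % y

    # postcondition: x is a common divisor of a and b, and no larger
    # common divisor exists (checked by brute force for illustration)
    assert a % x == 0 and b % x == 0
    assert all(a % d or b % d for d in range(x + 1, min(a, b) + 1))
    return x

print(gcd_guarded(12, 18))   # 6
```

In a logical framework the brute-force postcondition check would be
replaced by a proof, and the postcondition of each sub-function would
feed the context for the preconditions of subsequent ones.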

\paragraph{Planned achievements}

\begin{itemize}

\item Clarify principles of the congruence of computational and deductive
aspects in CA such that particular inputs to the algorithm illuminate

the proof code as special cases, and such that the deductive aspects

illuminate the semantics of the algorithm's code.

\item Clarify advancements of the above congruence for modularisation of
large CA functions, where sub-functions with semantics significant for
the overall function are guarded by implicit specifications (input /
precondition / output / postcondition).

\item Demonstrate the concepts by a specific example; promising
examples are expected in the areas of integration\footnote{The Risch
algorithm concerns integration over complex numbers, whereas Isabelle
has implemented such functions only over the reals.}, equation solving
with ``multivalued functions'', differential and integral equations, etc.

\item Implement demonstrators and actually useful CA algorithms in

Isabelle.

\end{itemize}

\paragraph{Open issues}

\begin{itemize}

\item How to represent \textit{implicit} specifications (input /
precondition / output / postcondition) in Isabelle? \\ Locales? An
additional type of data? \dots

\item What is an appropriate structure in Isabelle for collections of
such specifications? \\ Trees? DAGs? \dots

\item What are the requirements for interactive manipulation of large
collections of implicit specifications --- manipulations like store,
retrieve, assign a particular specification to particular code,
support immediate proof of respective postconditions?

\item How to adjust the presentation of collections to Isabelle/Isar's
document model? Integration into the existing model? An additional
tree (in line with trees for counterexamples, etc.)?

\end{itemize}

\medskip

\paragraph{Summary:} $\;$\\ \textit{

Given large CA modules embedded in logical frameworks, the architecture
should make explicit both the computational structure and the deductive
structure.

\medskip\\

Deductive structure of programs is described by implicit

specifications (pre-, postconditions); so programs and implicit

specifications need to be tightly connected:

\begin{compactitem}

\item preconditions

\begin{compactitem}

\item require evaluation by deductive mechanisms

\item guard programs, i.e. start execution only if preconditions are
``true''

\end{compactitem}

\item postconditions of sub-programs

\begin{compactitem}

\item support proving the postcondition of the main-program

\item complete contexts for evaluating preconditions of subsequent

sub-programs.\medskip

\end{compactitem}

\end{compactitem}

In Isabelle, the connection of functions with implicit specifications
involves

\begin{compactitem}

\item rethinking of the function package

\item rethinking of the document model (for managing a multitude of

different models in parallel)

\end{compactitem}

}

\bigskip

\subsection{Step 4: Tackle Open Issues from Steps 1..3}\label{step-4}

.\\.\\.\\

\paragraph{Planned achievements}

\begin{itemize}

\item

\item

\item

\end{itemize}

\paragraph{Expected open issues}

\begin{itemize}

\item

\item

\item

\end{itemize}

\medskip

\paragraph{Summary:} $\;$\\ \textit{

The advancement of CA features as described in Steps
1..3 raises novel requirements.

\\

\medskip These requirements need to be sorted out into a

\emph{consistent software design}

\\

\medskip --- a process which shall proceed in steps and by mutual

agreement between Munich and Linz.

}

\bigskip

\subsection{Step 5: Towards Application beyond CA?}\label{step-5}

% AFP: 2012-11-14: A Separation Logic Framework for Imperative HOL

% Author: Peter Lammich and Rene Meis

% \\

% ?Florian: ``...As we target HOL, our programs can be translated to

% efficiently executable code in various target languages, including ML,

% OCaml, Haskell, and Scala.'' ... how does Imperative HOL relate to

% codegen ?

%

% A layer on top of the existing code generator in order to be able to
% use imperative data structures. Somewhat brittle in handling. If one
% wants to verify imperative programs with it, it is advisable to first
% verify a corresponding functional program and then to ``carry over''
% that result. For our work, I think, not interesting at the beginning.

\paragraph{Planned achievements}

\begin{itemize}

\item

\item

\item

\end{itemize}

\paragraph{Expected open issues}

\begin{itemize}

\item

\item

\item

\end{itemize}

\medskip

\paragraph{Summary:} $\;$\\ \textit{

The above requirements raised for advancing CA features are
novel in a way which prompts questions like:

\\

\medskip Which of the concepts and mechanisms might \emph{scale up to

a scope beyond CA?}\\ Which of them might reach immediate relevance

for practice of engineering?\\ \dots?

\\

\medskip Such questions are considered hypothetical at present, but

might be relevant for future practice in ``Formal Methods'' {\em and}

give directions even in small steps.

}

\bigskip