viff

changeset 863:5e2a190baf4a

Some documentation on preprocessing.
author Martin Geisler <mg@daimi.au.dk>
date Thu, 31 Jul 2008 21:18:55 +0200
parents 14be5568022c
children 175a77921da9
files doc/implementation.txt doc/preprocessing.txt doc/runtime.txt
diffstat 3 files changed, 92 insertions(+), 0 deletions(-) [+]
line diff
     1.1 --- a/doc/implementation.txt	Thu Jul 31 19:23:05 2008 +0200
     1.2 +++ b/doc/implementation.txt	Thu Jul 31 21:18:55 2008 +0200
     1.3 @@ -16,6 +16,7 @@
     1.4     prss
     1.5     config
     1.6     program-counters
     1.7 +   preprocessing
     1.8     coding-style
     1.9     development
    1.10     unit-testing
     2.1 --- /dev/null	Thu Jan 01 00:00:00 1970 +0000
     2.2 +++ b/doc/preprocessing.txt	Thu Jul 31 21:18:55 2008 +0200
     2.3 @@ -0,0 +1,89 @@
     2.4 +
     2.5 +Preprocessing
     2.6 +=============
     2.7 +
     2.8 +Some of the protocols in VIFF use auxiliary data which does not depend
     2.9 +on the protocol inputs. Since the values does not depend on the actual
    2.10 +inputs we can calculate them in advance in a *pre-processing* step.
    2.11 +This is also called an *offline* calculation, not because it does not
    2.12 +require network communication, but because it can be done outside of
    2.13 +the peek hours.
    2.14 +
    2.15 +The best example is the multiplication protocol which is secure
    2.16 +against active adversaries. It uses random numbers *a*, *b*, and *c*
    2.17 +where *ab = c*. Since they are random they cannot depend on the inputs
    2.18 +to the multiplication and we can thus safely calculate them in
    2.19 +advance.
    2.20 +
    2.21 +
    2.22 +Knowing how much to preprocess
    2.23 +------------------------------
    2.24 +
    2.25 +When a program is invoked for the first time we cannot know how much
    2.26 +preprocessed data it will need and all data must therefore be produced
    2.27 +on demand (online). The problem is that there is no parsing step in
    2.28 +the execution of a program using VIFF, so we have no chance of
    2.29 +analysing the program in advance and since Python is a general
    2.30 +Turing-complete programming language doing analysis would probably end
    2.31 +up either impossible or very conservative meaning that the estimate
    2.32 +would be higher than the actual need for would preprocessed data.
    2.33 +
    2.34 +What we can do is to monitor the program when it runs and note when
    2.35 +preprocessed data was needed and how much. We then use this data on
    2.36 +the next run.
    2.37 +
    2.38 +We record the following:
    2.39 +
    2.40 +* the program counter,
    2.41 +* the function that produced the data,
    2.42 +* the arguments used function.
    2.43 +
    2.44 +Using this information the next program run can start with a
    2.45 +preprocessing phase where data is produced and associated with the
    2.46 +program counters stored.
    2.47 +
    2.48 +This will obviously only work if executing the program always produces
    2.49 +the same trace of program counters. And indeed a secure program must
    2.50 +behave like this, for if the program counter trace changed depending
    2.51 +on the (private) inputs to the program, then this would be observable
    2.52 +from the outside and thus leak information.
    2.53 +
    2.54 +One can therefore build a program and test it on toy-input to
    2.55 +establish the program counter trace. This trace can then be used when
    2.56 +the program is deployed and used on real data.
    2.57 +
    2.58 +Branching programs
    2.59 +~~~~~~~~~~~~~~~~~~
    2.60 +
    2.61 +The above is not entirely true for a program like the double auction.
    2.62 +This program branches on values opened during the execution and could
    2.63 +thus potentially produce different traces on each run while still
    2.64 +being secure.
    2.65 +
    2.66 +There is no problem in this particular case since the double auction
    2.67 +has the same trace in each branch, but this is not true in general. We
    2.68 +will need to find a better mechanism for dealing with programs that
    2.69 +intentionally leak information when executing and branch to different
    2.70 +parts.
    2.71 +
    2.72 +
    2.73 +Implementing preprocessing
    2.74 +--------------------------
    2.75 +
    2.76 +Preprocessing is implemented using two tools: the
    2.77 +:func:`viff.runtime.preprocess` decorator is used on methods that
    2.78 +produce data which can be preprocessed. It replaces the methods with
    2.79 +proxies which will attempt to return preprocessed data. If the data
    2.80 +could not be satisfied from the pool of preprocessed data, then the
    2.81 +program counter and function arguments are stored before falling back
    2.82 +to calculating the data online using the original method.
    2.83 +
    2.84 +To generate preprocessed data based on a trace of program counters one
    2.85 +must call the :meth:`viff.runtime.BasicRuntime.preprocess` method. It
    2.86 +returns a :class:`Deferred` which triggers when all preprocessed data
    2.87 +is ready which means that the online part of the computation can
    2.88 +begin.
    2.89 +
    2.90 +Instead of starting the online phase immediately, one could also
    2.91 +choose to store the preprocessed data for later use, but this has not
    2.92 +been implemented yet.
     3.1 --- a/doc/runtime.txt	Thu Jul 31 19:23:05 2008 +0200
     3.2 +++ b/doc/runtime.txt	Thu Jul 31 21:18:55 2008 +0200
     3.3 @@ -23,6 +23,8 @@
     3.4  
     3.5     .. autofunction:: increment_pc
     3.6  
     3.7 +   .. autofunction:: preprocess
     3.8 +
     3.9     .. autofunction:: create_runtime
    3.10  
    3.11     .. autoclass:: BasicRuntime