- Published on
Solving Proofs, Arguments, and Zero-Knowledge
- Authors
- Name
- Marco Besier, Ph.D.
This post is a work in progress.
Chapter 3: Definitions and Technical Preliminaries
Exercise 3.1
Let be matrices over a field . In Section 2.2, we presented a randomized algorithm for checking that . The algorithm picked a random field element , let , and output EQUAL if , and output NOT-EQUAL otherwise. Suppose instead that each entry of the vector is chosen independently and uniformly at random from . Show that:
If for all , then the algorithm outputs EQUAL for every possible choice of .
If there is even one such that , then the algorithm outputs NOT-EQUAL with probability at least .
Solution
First, let , where each is chosen independently and uniformly at random from . Furthermore, let , so that our goal is to determine whether the product matrix actually equals the product matrix .
The first property (completeness) is easy to see: if , then clearly for any , so the algorithm will output EQUAL for every choice of .
To see that the second property (soundness) holds, let us define the matrix . Furthermore, suppose , i.e., . Since is non-zero, we have that . By the rank-nullity theorem, we then have that . Since our vector coordinates are sampled independently and uniformly at random from , the soundness error is given by
In other words, if there is even one such that , then the algorithm outputs NOT-EQUAL with probability at least .
Exercise 3.2
In Section 2.1, we described a communication protocol of logarithmic cost for determining whether Alice's and Bob's input vectors are equal. Specifically, Alice and Bob interpreted their inputs as degree- univariate polynomials and , chose a random with , and compared to . Give a different communication protocol in which Alice and Bob interpret their inputs as multilinear rather than univariate polynomials over . How large should be to ensure that the probability Bob outputs the wrong answer is at most ? What is the communication cost in bits of this protocol?
Solution
For the protocol to work, we assume (without loss of generality) that Alice's and Bob's vectors are both of length . Whenever that's not the case, Alice and Bob agree to pad their vectors with zeros until the length becomes the next power of two available.
With that assumption, Alice's vector can be viewed as defining a function . Let denote the corresponding multilinear extension.
Similarly, Bob's vector can be viewed as defining a function with a corresponding multilinear extension .
The steps of the protocol are as follows:
- Alice and Bob agree to encode their vectors into multilinear extensions as described above.
- Alice chooses a random point .
- Alice sends and to Bob.
- Bob computes and checks if it's equal to .
Now, how large should be to ensure that the probability Bob outputs the wrong answer is at most ? To answer this question, note that if , then is a non-zero multilinear polynomial of degree . By the Schwartz-Zippel Lemma, the probability that for a randomly chosen is at most:
To ensure the error is , set:
Lastly, to compute the protocol's total communication cost in bits, recall that the number of bits needed to represent an element of is
because that's the number of bits needed to represent integers up to .
During the protocol, Alice sends:
- The vector , i.e., bits.
- The value , i.e., bits.
Therefore, the protocol's total communication cost in bits is bits.
To see how efficient this is, let's consider an example. Suppose , i.e., . To ensure the soundness error is , we choose . With this choice, we have . The total communication cost in bits is, therefore, given by:
Compared to naively sending the whole vector (1024 field elements = 14,336 bits), our protocol gives an exponential saving.
Exercise 3.3
Let . Consider the function given by , and . Write out an explicit expression for the multilinear extension of . What is ?
Now consider the function given by . What is ? How many field multiplications did you perform during the calculation? Can you work through the calculation of that uses "just" 20 multipliactions operations? Hint: see Lemma 3.8.
Solution
Let . Consider the function given by , and . Using Lemma 3.6, we can compute via:
Consequently, we have .
Now consider the function given by . Using Lemma 3.6, we can compute via
where:
Consequently, we can compute using a total of 24 multiplications. Using the memoization procedure from Lemma 3.8, we can reduce the number of required multiplications to 20. To see that, notice that each of the following four multiplications appears twice in the above expressions for the .
Therefore, it suffices to compute each of them only once instead of evaluating each instance at independently. As a result, we only need a total of multiplications to compute .
Exercise 3.4
Fix some prime of your choosing. Write a Python program that takes as input an array of lenght specifying all evaluations of a function and a vector , and outputs .
Solution
Recall that the multilinear extension of a function is defined as
where:
We can, therefore, implement the multilinear extension in Python as follows:
from itertools import product
def eval_multilinear_extension(f_vals, x, p):
"""
Evaluate the multilinear extension of a function f: {0,1}^l -> F_p at point x in F_p^l.
Inputs:
- f_vals: List of 2^l field elements IN LEX ORDER(!) (w in {0,1}^l)
- x: List of length l, point in F_p^l
- p: Prime field modulus
Output:
- tilde_f(x) in F_p
"""
l = len(x)
result = 0
for idx, w in enumerate(product([0, 1], repeat=l)):
chi = 1
for i in range(l):
chi *= (1 - x[i]) if w[i] == 0 else x[i]
chi %= p
result += f_vals[idx] * chi
result %= p
return result
Notice that the function assumes that the input list f_vals
is ordered according to lexicoraphic order on bitstrings. That is, the input values must correspond to ordered like:
This ordering is generated by product([0, 1], repeat=l)
in the above code and must match exactly. If this assumption is violated, the output will be incorrect.
As an example, here's how to correctly reproduce from the previous exercise:
# Input values
p = 11
# indices map to:
# (0,0,0), (0,0,1), (0,1,0), (0,1,1), (1,0,0), (1,0,1), (1,1,0), (1,1,1)
f_vals = [1, 5, 2, 6, 3, 7, 4, 8]
x = [2, 4, 6]
# Output: 0 (= 33%11)
print(eval_multilinear_extension(f_vals, x, p))
If we had passed the f_vals
in a way that would result in non-lexicographic order of the values instead, our function would have given a wrong result:
# Input values
p = 11
# indices map to:
# (0,0,0), (0,1,0), (1,0,0), (1,1,0), (0,0,1), (0,1,1), (1,0,1), (1,1,1)
# Note: this order is NOT LEXICOGRAPHIC
f_vals = [1, 2, 3, 4, 5, 6, 7, 8]
x = [2, 4, 6]
# Output: 1
print(eval_multilinear_extension(f_vals, x, p))