Fuzzy extractor




Fuzzy extractors are a method that allows to use biometric data as inputs to standard cryptographic techniques for biometric security. "Fuzzy", in this context, refers to the fact that the fixed values required for cryptography will be extracted from values close but not identical to the original one, without compromising the security required. One application is to encrypt and authenticate users records, using the biometric inputs of the user as a key.


Fuzzy extractors are a biometric tool that allows to authenticate a user using for the key a biometric template constructed from his biometric data. They extract a uniform and random string R{displaystyle R}R from an input w{displaystyle w}w with a tolerance for noise. If the input changes to w′{displaystyle w'}w' but is still close to w{displaystyle w}w, the same string R{displaystyle R}R will be re-constructed. To achieve this, during the initial computation of R{displaystyle R}R the process also outputs a helper string P{displaystyle P}Pwhich will be stored to recover R{displaystyle R}R later and can be made public without compromising the security of R{displaystyle R}R. The security of the process is ensured also when an adversary modifies P{displaystyle P}P. Once the fixed string R{displaystyle R}Rhas been calculated, it can be used for example for key agreement between a user and a server based only on a biometric input.


Historically, the first biometric system of this kind was designed by Juels and Wattenberg and was called "Fuzzy commitment", where the cryptographic key is decommitted using biometric data. Later, Juels and Sudan came up with Fuzzy vault schemes which are order invariant for the fuzzy commitment scheme but uses a Reed–Solomon code. Codeword is evaluated by polynomial and the secret message is inserted as the coefficients of the polynomial. The polynomial is evaluated for different values of a set of features of the biometric data. So Fuzzy commitment and Fuzzy Vault were precursors to Fuzzy extractors.


This description is based on the papers "Fuzzy Extractors: A Brief Survey of Results from 2004 to 2006" and "Fuzzy Extractors: How to Generate Strong Keys from Biometrics and Other Noisy Data" by Yevgeniy Dodis, Rafail Ostrovsky, Leonid Reyzin and Adam Smith




Contents






  • 1 Motivation


  • 2 Basic definitions


    • 2.1 Predictability


    • 2.2 Min-entropy


    • 2.3 Statistical distance


    • 2.4 Definition 1 (strong extractor)


    • 2.5 Secure sketch


    • 2.6 Definition 2 (secure sketch)


    • 2.7 Fuzzy extractor


    • 2.8 Definition 3 (fuzzy extractor)


    • 2.9 Secure sketches and fuzzy extractors


    • 2.10 Lemma 1 (fuzzy extractors from sketches)


    • 2.11 Corollary 1




  • 3 Basic constructions


    • 3.1 Hamming distance constructions


      • 3.1.1 Code-offset construction


      • 3.1.2 Syndrome construction




    • 3.2 Set difference constructions


      • 3.2.1 Pin sketch construction




    • 3.3 Edit distance constructions


    • 3.4 Other distance measure constructions




  • 4 Improving error-tolerance via relaxed notions of correctness


    • 4.1 Random errors


    • 4.2 Input-dependent errors


    • 4.3 Computationally bounded errors




  • 5 Privacy guarantees


    • 5.1 Correlation between helper string and biometric input


    • 5.2 Gen(W) as a probabilistic map


    • 5.3 Uniform fuzzy extractors


    • 5.4 Uniform secure sketches


    • 5.5 Applications




  • 6 Protection against active attacks


    • 6.1 Robust fuzzy extractors




  • 7 References





Motivation


In order for fuzzy extractors to generate strong keys from Biometrics and other Noisy Data, cryptography paradigms will be applied to this biometric data which means they must be allow to


(1) Limit the number of assumptions required about the content of the biometric data (these data comes from a variety of sources and in order to avoid an adversary to exploit them, it's best to assume the input is unpredictable)


(2) Apply usual cryptographic techniques from the input. (for this fuzzy extractors convert biometric data into secret, uniformly random and reliably reproducible random string).


The paper "Fuzzy Extractors: How to Generate Strong Keys from Biometrics and Other Noisy Data" by Yevgeniy Dodis, Rafail Ostrovsky, Leonid Reyzin and Adam Smith shows that these techniques can also have other broader applications for other type of noisy inputs such as approximative data from human memory, images used as passwords, keys from quantum channel. According to the Differential Privacy paper by Cynthia Dwork (ICALP 2006), fuzzy extractors also have applications in the proof of impossibility of the strong notions of privacy for statistical databases.



Basic definitions



Predictability


Predictability indicates the probability that adversary can guess a secret key. Mathematically speaking, the predictability of a random variable A{displaystyle A}A is maxaP[A=a]{displaystyle max _{mathrm {a} }P[A=a]}max _{{{mathrm  {a}}}}P[A=a].


For example, given a pair of random variable A{displaystyle A}A and B{displaystyle B}B, if the adversary knows b{displaystyle b}b of B{displaystyle B}B, then the predictability of A{displaystyle A}A will be maxaP[A=a|B=b]{displaystyle max _{mathrm {a} }P[A=a|B=b]}max_{mathrm{a}}P[A = a | B = b]. So, an adversary can predict A{displaystyle A}A with Eb←B[maxaP[A=a|B=b]]{displaystyle E_{bleftarrow B}[max _{mathrm {a} }P[A=a|B=b]]} E_{b leftarrow B}[max_{mathrm{a}} P[A = a | B = b]]. We use the average over B{displaystyle B}B as it is not under adversary control, but since knowing b{displaystyle b}b makes the prediction of A{displaystyle A}A adversarial, we take the worst case over A{displaystyle A}A.



Min-entropy


Min-entropy indicates the worst-case entropy. Mathematically speaking, it is defined as H∞(A)=−log⁡(maxaP[A=a]){displaystyle H_{infty }(A)=-log(max _{mathrm {a} }P[A=a])}H_{infty }(A)=-log(max _{{{mathrm  {a}}}}P[A=a]) .


A random variable with a min-entropy at least of m{displaystyle m}m is called a m{displaystyle m}m-source.



Statistical distance


Statistical distance is a measure of distinguishability. Mathematically speaking, it is expressed for two probability distributions A{displaystyle A}A and B{displaystyle B}B as SD[A,B]{displaystyle SD[A,B]}SD[A,B] = 12∑v|P[A=v]−P[B=v]|{displaystyle {frac {1}{2}}sum _{mathrm {v} }|P[A=v]-P[B=v]|}{frac  {1}{2}}sum _{{{mathrm  {v}}}}|P[A=v]-P[B=v]|. In any system, if A{displaystyle A}A is replaced by B{displaystyle B}B, it will behave as the original system with a probability at least of 1−SD[A,B]{displaystyle 1-SD[A,B]}1-SD[A,B].



Definition 1 (strong extractor)


Setting M{displaystyle M}M as a strong randomness extractor. The randomized function Ext: M→{0,1}l{displaystyle Mrightarrow {0,1}^{l}}Mrightarrow {0,1}^{l} with randomness of length r{displaystyle r} r is a (m,l,ϵ){displaystyle (m,l,epsilon )}(m,l,epsilon ) strong extractor if for all m{displaystyle m}m-sources W{displaystyle W}W on M(Ext(W;I),I)≈ϵ(Ul,Ur),{displaystyle M(Ext(W;I),I)approx _{epsilon }(U_{l},U_{r}),}M(Ext(W;I),I)approx _{epsilon }(U_{l},U_{r}), where I=Ur{displaystyle I=U_{r}}I=U_{r} is independent of W{displaystyle W}W.


The output of the extractor is a key generated from w←W{displaystyle wleftarrow W}wleftarrow W with the seed i←I{displaystyle ileftarrow I}ileftarrow I. It behaves independently of other parts of the system with the probability of 1−ϵ{displaystyle 1-epsilon }1-epsilon . Strong extractors can extract at most l=m−2log1ϵ+O(1){displaystyle l=m-2log{frac {1}{epsilon }}+O(1)}l=m-2log{frac  {1}{epsilon }}+O(1) bits from an arbitrary m{displaystyle m}m-source.



Secure sketch


Secure sketch makes it possible to reconstruct noisy input, so that if the input is w{displaystyle w}w and the sketch is s{displaystyle s}s, given s{displaystyle s}s and a value w′{displaystyle w'}w'close to w{displaystyle w}w, w{displaystyle w}w can be recovered. But the sketch s{displaystyle s}s must not reveal information about w{displaystyle w}w, in order to keep it secure.


If M{displaystyle mathbb {M} }{mathbb  {M}} is a metric space with the distance function dis, Secure sketch recovers the string w∈M{displaystyle win mathbb {M} }win {mathbb  {M}} from any close string w′∈M{displaystyle w'in mathbb {M} }w'in {mathbb  {M}} without disclosing w{displaystyle w}w.



Definition 2 (secure sketch)


An (m,m~,t){displaystyle (m,{tilde {m}},t)}(m,{tilde  {m}},t) secure sketch is a pair of efficient randomized procedures (the Sketch noted SS, the Recover noted Rec) such that :


(1) The sketching procedure SS applied on input w∈M{displaystyle win mathbb {M} }win {mathbb  {M}} returns a string s∈{0,1}∗{displaystyle sin {{0,1}^{*}}}sin {{0,1}^{*}}.


The recovery procedure Rec uses as input two elements w′∈M{displaystyle w'in mathbb {M} } w' in mathbb{M} and s∈{0,1}∗{displaystyle sin {{0,1}^{*}}}s in {{0,1}^*} .


(2) Correctness: If dis(w,w′)≤t{displaystyle dis(w,w')leq t}dis(w,w')leq t then Rec(w′,SS(w))=w{displaystyle Rec(w',SS(w))=w}Rec(w',SS(w))=w.


(3) Security: For any m{displaystyle m}m-source over M{displaystyle M}M, the min-entropy of W{displaystyle W}W given s{displaystyle s}s is high:


for any (W,E){displaystyle (W,E)}(W,E), if H~(W|E)≥m{displaystyle {tilde {H}}_{mathrm {infty } }(W|E)geq m}{tilde  {H}}_{{{mathrm  {infty }}}}(W|E)geq m, then H~(W|SS(W),E)≥m~{displaystyle {tilde {H}}_{mathrm {infty } }(W|SS(W),E)geq {tilde {m}}}{tilde  {H}}_{{{mathrm  {infty }}}}(W|SS(W),E)geq {tilde  {m}}.



Fuzzy extractor


Fuzzy extractors do not recover the original input but generate a string R{displaystyle R}R (which is close to uniform) from w{displaystyle w}w and allow its subsequent reproduction (using helper string P{displaystyle P}P) given any w′{displaystyle w'}w' close to w{displaystyle w}w. Strong extractors are a special case of fuzzy extractors when t{displaystyle t}t = 0 and P=I{displaystyle P=I}P=I.



Definition 3 (fuzzy extractor)


An (m,l,t,ϵ){displaystyle (m,l,t,epsilon )}(m,l,t,epsilon ) fuzzy extractor is a pair of efficient randomized procedures (Gen – Generate and Rep – Reproduce) such that:


(1) Gen, given w∈M{displaystyle win mathbb {M} }win {mathbb  {M}}, outputs an extracted string R∈{0,1}l{displaystyle Rin {mathbb {{} 0,1}^{l}}}Rin {{mathbb  {}0,1}^{l}} and a helper string P∈{0,1}∗{displaystyle Pin {mathbb {{} 0,1}^{*}}}Pin {{mathbb  {}0,1}^{*}}.


(2) Correctness: If dis(w,w′)≤t{displaystyle dis(w,w')leq t}dis(w,w')leq t and (R,P)←Gen(w){displaystyle (R,P)leftarrow Gen(w)}(R,P)leftarrow Gen(w), then Rep(w′,P)=R{displaystyle Rep(w',P)=R}Rep(w',P)=R.


(3) Security: For all m-sources W{displaystyle W}W over M{displaystyle M}M, the string R{displaystyle R}R is nearly uniform even given P{displaystyle P}P, So H~(W|E)≥m{displaystyle {tilde {H}}_{mathrm {infty } }(W|E)geq m}{tilde  {H}}_{{{mathrm  {infty }}}}(W|E)geq m, then (R,P,E)≈(Ul,P,E){displaystyle (R,P,E)approx (U_{mathrm {l} },P,E)}(R,P,E)approx (U_{{{mathrm  {l}}}},P,E).


So Fuzzy extractors output almost uniform random sequences of bits which are a prerequisite for using cryptographic applications (as secret keys). Since the output bits are slightly non-uniform, there's a risk of a decreased security, but the distance from an uniform distribution is no more than ϵ{displaystyle epsilon }epsilon and as long as this distance is sufficiently small, the security will remains adequate.



Secure sketches and fuzzy extractors


Secure sketches can be used to construct fuzzy extractors. Like applying SS to w{displaystyle w}w to obtain s{displaystyle s}s and strong extractor Ext with randomness x{displaystyle x}x to w{displaystyle w}w to get R{displaystyle R}R. (s,x){displaystyle (s,x)}(s,x) can be stored as helper string P{displaystyle P}P. R{displaystyle R}R can be reproduced by w′{displaystyle w'}w' and P=(s,x){displaystyle P=(s,x)}P=(s,x). Rec(w′,s){displaystyle Rec(w',s)}Rec(w',s) can recover w{displaystyle w}w and Ext(w,x){displaystyle Ext(w,x)}Ext(w,x) can reproduce R{displaystyle R}R.
Following Lemma formalize this.



Lemma 1 (fuzzy extractors from sketches)


Assume (SS,Rec) is an (M,m,m~,t){displaystyle (M,m,{tilde {m}},t)}(M,m,{tilde  {m}},t) secure sketch and let Ext be an average-case (n,m~,l,ϵ){displaystyle (n,{tilde {m}},l,epsilon )}(n,{tilde  {m}},l,epsilon ) strong extractor. Then the following (Gen, Rep) is an (M,m,l,t,ϵ){displaystyle (M,m,l,t,epsilon )}(M,m,l,t,epsilon ) fuzzy extractor:
(1) Gen (w,r,x){displaystyle (w,r,x)}{displaystyle (w,r,x)}: set P=(SS(w;r),x),R=Ext(w;x),{displaystyle P=(SS(w;r),x),R=Ext(w;x),}{displaystyle P=(SS(w;r),x),R=Ext(w;x),} and output (R,P){displaystyle (R,P)}(R,P).
(2) Rep (w′,(s,x)){displaystyle (w',(s,x))}(w',(s,x)): recover w=Rec(w′,s){displaystyle w=Rec(w',s)}w=Rec(w',s) and output R=Ext(w;x){displaystyle R=Ext(w;x)}R=Ext(w;x).


Proof: From the definition of secure sketch (Definition 2),
H∞(W|SS(W))≥m~{displaystyle H_{infty }(W|SS(W))geq {tilde {m}}}H_{infty }(W|SS(W))geq {tilde  {m}}. And since Ext is an average-case (n,m,l,ϵ){displaystyle (n,m,l,epsilon )}(n,m,l,epsilon )-strong extractor. SD((Ext(W;X),SS(W),X),(Ul,SS(W),X))=SD((R,P),(Ul,P))≤ϵ.{displaystyle SD((Ext(W;X),SS(W),X),(U_{l},SS(W),X))=SD((R,P),(U_{l},P))leq epsilon .}SD((Ext(W;X),SS(W),X),(U_{l},SS(W),X))=SD((R,P),(U_{l},P))leq epsilon .



Corollary 1


If (SS,Rec) is an (M,m,m~,t){displaystyle (M,m,{tilde {m}},t)}(M,m,{tilde  {m}},t) secure sketch and Ext is an (n,m~log(1δ),l,ϵ){displaystyle (n,{tilde {m}}-log({frac {1}{delta }}),l,epsilon )}(n,{tilde  {m}}-log({frac  {1}{delta }}),l,epsilon ) strong extractor, then the above construction (Gen,Rep) is a (M,m,l,t,ϵ){displaystyle (M,m,l,t,epsilon +delta )}(M,m,l,t,epsilon +delta ) fuzzy extractor.


The reference paper "Fuzzy Extractors: How to Generate Strong Keys from Biometrics and Other Noisy Data" by Yevgeniy Dodis, Rafail Ostrovsky, Leonid Reyzin and Adam Smith (2008) includes many generic combinatorial bounds on secure sketches and fuzzy extractors



Basic constructions


Due to their error tolerant properties, a secure sketches can be treated, analyzed, and constructed like a (n,k,d)F{displaystyle (n,k,d)_{mathcal {F}}}(n,k,d)_{{{mathcal  {F}}}} general error correcting code or [n,k,d]F{displaystyle [n,k,d]_{mathcal {F}}}[n,k,d]_{{{mathcal  {F}}}} for linear codes, where n{displaystyle n}n is the length of codewords, k{displaystyle k}k is the length of the message to be codded, d{displaystyle d}d is the distance between codewords, and F{displaystyle {mathcal {F}}}{mathcal {F}} is the alphabet. If Fn{displaystyle {mathcal {F}}^{n}}{mathcal  {F}}^{n} is the universe of possible words then it may be possible to find an error correcting code C∈Fn{displaystyle Cin {mathcal {F}}^{n}}Cin {mathcal  {F}}^{n} that has a unique codeword c∈C{displaystyle cin C}cin C for every w∈Fn{displaystyle win {mathcal {F}}^{n}}win {mathcal  {F}}^{n} and have a Hamming distance of disHam(c,w)≤(d−1)/2{displaystyle dis_{Ham}(c,w)leq (d-1)/2}dis_{{Ham}}(c,w)leq (d-1)/2. The first step for constructing a secure sketch is determining the type of errors that will likely occur and then choosing a distance to measure.




Red is the code-offset construction, blue is the syndrome construction, green represents edit distance and other complex constructions.



Hamming distance constructions


When there is no risk of data being deleted and only of it being corrupted then the best measurement to use for error correction is the Hamming distance. There are two common constructions for correcting Hamming errors depending on whether the code is linear or not. Both constructions start with an error correcting code that has a distance of 2t+1{displaystyle 2t+1}2t+1 where t{displaystyle {t}}{t} is the number of tolerated errors.



Code-offset construction


When using a (n,k,2t+1)F{displaystyle (n,k,2t+1)_{mathcal {F}}}(n,k,2t+1)_{{{mathcal  {F}}}} general code, assign a uniformly random codeword c∈C{displaystyle cin C}cin C to each w{displaystyle w}w, then let SS(w)=s=w−c{displaystyle SS(w)=s=w-c}SS(w)=s=w-c which is the shift needed to change c{displaystyle c}c into w{displaystyle w}w. To fix errors in w′{displaystyle w'}w' subtract s{displaystyle s}s from w′{displaystyle w'}w' then correct the errors in the resulting incorrect codeword to get c{displaystyle c}c and finally add s{displaystyle s}s to c{displaystyle c}c to get w{displaystyle w}w. This means Rec(w′,s)=s+dec(w′−s)=w{displaystyle Rec(w',s)=s+dec(w'-s)=w}Rec(w',s)=s+dec(w'-s)=w. This construction can achieve the best possible tradeoff between error tolerance and entropy loss when F≥n{displaystyle {mathcal {F}}geq n}{mathcal  {F}}geq n and a Reed–Solomon code is used resulting in an entropy loss of 2tlog⁡(F){displaystyle 2tlog({mathcal {F}})}2tlog({mathcal  {F}}). The only way to improve upon would be to find a code better than Reed–Solomon.



Syndrome construction


When using a [n,k,2t+1]F{displaystyle [n,k,2t+1]_{mathcal {F}}}[n,k,2t+1]_{{{mathcal  {F}}}} linear code let the SS(w)=s{displaystyle SS(w)=s}SS(w)=s be the syndrome of w{displaystyle w}w. To correct w′{displaystyle w'}w' find a vector e{displaystyle e}e such that syn(e)=syn(w′)−s{displaystyle syn(e)=syn(w')-s}syn(e)=syn(w')-s, then w=w′−e{displaystyle w=w'-e}w=w'-e.



Set difference constructions


When working with a very large alphabet or very long strings resulting in a very large universe U{displaystyle {mathcal {U}}}{mathcal {U}}, it may be more efficient to treat w{displaystyle w}w and w′{displaystyle w'}w' as sets and look at set differences to correct errors. To work with a large set w{displaystyle w}w it is useful to look at its characteristic vector xw{displaystyle x_{w}}x_{w}, which is a binary vector of length n{displaystyle n}n that has a value of 1 when an element a∈U{displaystyle ain {mathcal {U}}}ain {mathcal  {U}} and a∈w{displaystyle ain w}ain w, or 0 when a∉w{displaystyle anotin w}anotin w. The best way to decrease the size of a secure sketch when n{displaystyle n}n is large is make k{displaystyle k}k large since the size is determined by n−k{displaystyle n-k}n-k. A good code to base this construction on is a [n,n−,2t+1]2{displaystyle [n,n-talpha ,2t+1]_{2}}[n,n-talpha ,2t+1]_{{2}} BCH code where n=2α1{displaystyle n=2^{alpha }-1}n=2^{{alpha }}-1 and t≪n{displaystyle tll n}tll n so k≤n−log(nt){displaystyle kleq n-log{n choose {t}}}kleq n-log{n choose {t}}, it is also useful that BCH codes can be decode in sub-linear time.



Pin sketch construction


Let SS(w)=s=syn(xw){displaystyle SS(w)=s=syn(x_{w})}SS(w)=s=syn(x_{w}). To correct w′{displaystyle w'}w' first find SS(w′)=s′=syn(xw′){displaystyle SS(w')=s'=syn(x_{w}')}SS(w')=s'=syn(x_{w}'), then find a set v where syn(xv)=s′−s{displaystyle syn(x_{v})=s'-s}syn(x_{v})=s'-s, finally compute the symmetric difference to get Rec(w′,s)=w′△v=w{displaystyle Rec(w',s)=w'triangle v=w}Rec(w',s)=w'triangle v=w. While this is not the only construction than can be used to set the difference, it is the easiest one.



Edit distance constructions


When data can be corrupted or deleted, the best measurement to use is edit distance. To make a construction based on edit distance, the easiest is to start with a construction for set difference or hamming distance as an intermediate correction step and then build the edit distance construction around that.



Other distance measure constructions


There are many other types of errors and distances that can be used to model other situations. Most of these other possible constructions are built upon simpler constructions, like edit distance constructions.



Improving error-tolerance via relaxed notions of correctness


It can be shown that the error-tolerance of a secure sketch can be improved by applying a probabilistic method to error correction and only requesting errors to be correctable with a high probability. This allows to exceed the Plotkin bound which limits to correcting n/4{displaystyle n/4}n/4 errors, and to approach Shannon’s bound allowing for nearly n/2{displaystyle n/2}n/2 corrections. To achieve this enhanced error correction, a less restrictive error distribution model must be used.



Random errors


For this most restrictive model use a BSCp{displaystyle _{p}}_{{p}} to create a w′{displaystyle w'}w' that a probability p{displaystyle p}p at each position in w′{displaystyle w'}w' that the bit received is wrong. This model can show that entropy loss is limited to nH(p)−o(n){displaystyle nH(p)-o(n)}nH(p)-o(n), where H{displaystyle H}H is the binary entropy function, and if min-entropy m≥n(H(12−γ))+ε{displaystyle mgeq n(H({frac {1}{2}}-gamma ))+varepsilon }mgeq n(H({frac  {1}{2}}-gamma ))+varepsilon then n(12−γ){displaystyle n({frac {1}{2}}-gamma )}n({frac  {1}{2}}-gamma ) errors can be tolerated, for some constant γ>0{displaystyle gamma >0}gamma >0.



Input-dependent errors


For this model errors do not have a known distribution and can be from an adversary, the only constraints are diserr≤t{displaystyle dis_{text{err}}leq t}dis_{{text{err}}}leq t and that a corrupted word depends only on the input w{displaystyle w}w and not on the secure sketch. It can be shown for this error model that there will never be more than t{displaystyle t}t errors since this model can account for all complex noise processes, meaning that Shannon’s bound can be reached, to do this a random permutation is prepended to the secure sketch that will reduce entropy loss.



Computationally bounded errors


This differs from the input dependent model by having errors that depend on both the input w{displaystyle w}w and the secure sketch, and an adversary is limited to polynomial time algorithms for introducing errors. Since algorithms that can run in better than polynomial time are not currently feasible in the real world, then a positive result using this error model would guarantee that any errors can be fixed. This is the least restrictive model the only known way to approach Shannon’s bound is to use list-decodable codes although this may not always be useful in practice since returning a list instead of a single codeword may not always be acceptable.



Privacy guarantees


In general a secure system attempts to leak as little information as possible to an adversary. In the case of biometrics if information about the biometric reading is leaked the adversary may be able to learn personal information about a user. For example an adversary notices that there is a certain pattern in the helper strings that implies the ethnicity of the user. We can consider this additional information a function f(W){displaystyle f(W)}f(W). If an adversary were to learn a helper string, it must be ensured that, from this data he can not infer any data about the person from which the biometric reading was taken.



Correlation between helper string and biometric input


Ideally the helper string P{displaystyle P}P would reveal no information about the biometric input w{displaystyle w}w. This is only possible when every subsequent biometric reading w′{displaystyle w'}w' is identical to the original w{displaystyle w}w. In this case there is actually no need for the helper string, so it is easy to generate a string that is in no way correlated to w{displaystyle w}w.


Since it is desirable to accept biometric input w′{displaystyle w'}w' similar to w{displaystyle w}w the helper string P{displaystyle P}P must be somehow correlated. The more different w{displaystyle w}w and w′{displaystyle w'}w' are allowed to be, the more correlation there will be between P{displaystyle P}P and w{displaystyle w}w, the more correlated they are the more information P{displaystyle P}P reveals about w{displaystyle w}w. We can consider this information to be a function f(W){displaystyle f(W)}f(W). The best possible solution is to make sure the adversary can't learn anything useful from the helper string.



Gen(W) as a probabilistic map


A probabilistic map Y(){displaystyle Y()}Y() hides the results of functions with a small amount of leakage ϵ{displaystyle epsilon }epsilon . The leakage is the difference in probability two adversaries have of guessing some function when one knows the probabilistic map and one does not. Formally:


|Pr[A1(Y(W))=f(W)]−Pr[A2()=f(W)]|≤ϵ{displaystyle |Pr[A_{1}(Y(W))=f(W)]-Pr[A_{2}()=f(W)]|leq epsilon }|Pr[A_{1}(Y(W))=f(W)]-Pr[A_{2}()=f(W)]|leq epsilon

If the function Gen(W){displaystyle Gen(W)}Gen(W) is a probabilistic map, then even if an adversary knows both the helper string P{displaystyle P}P and the secret string R{displaystyle R}R they are only negligibly more likely figure something out about the subject as if they knew nothing. The string R{displaystyle R}R is supposed to kept secret, so even if it is leaked (which should be very unlikely) the adversary can still figure out nothing useful about the subject, as long as ϵ{displaystyle epsilon }epsilon is small. We can consider f(W){displaystyle f(W)}f(W) to be any correlation between the biometric input and some physical characteristic of the person. Setting Y=Gen(W)=R,P{displaystyle Y=Gen(W)=R,P}Y=Gen(W)=R,P in the above equation changes it to:


|Pr[A1(R,P)=f(W)]−Pr[A2()=f(W)]|≤ϵ{displaystyle |Pr[A_{1}(R,P)=f(W)]-Pr[A_{2}()=f(W)]|leq epsilon }|Pr[A_{1}(R,P)=f(W)]-Pr[A_{2}()=f(W)]|leq epsilon

This means that if one adversary A1{displaystyle A_{1}}A_{1} has (R,P){displaystyle (R,P)}(R,P) and a second adversary A2{displaystyle A_{2}}A_{2} knows nothing, their best guesses at f(W){displaystyle f(W)}f(W) are only ϵ{displaystyle epsilon }epsilon apart.



Uniform fuzzy extractors


Uniform fuzzy extractors are a special case of fuzzy extractors, where the output (R,P){displaystyle (R,P)}(R,P) of Gen(W){displaystyle Gen(W)}Gen(W) are negligibly different from strings picked from the uniform distribution, i.e. (R,P)≈ϵ(Uℓ,U|P|){displaystyle (R,P)approx _{epsilon }(U_{ell },U_{|P|})}(R,P)approx _{epsilon }(U_{ell },U_{{|P|}})



Uniform secure sketches


Since secure sketches imply fuzzy extractors, constructing a uniform secure sketch allows for the easy construction of a uniform fuzzy extractor. In a uniform secure sketch the sketch procedure SS(w){displaystyle SS(w)}SS(w) is a randomness extractor Ext(w;i){displaystyle Ext(w;i)}Ext(w;i). Where w{displaystyle w}w is the biometric input and i{displaystyle i}i is the random seed. Since randomness extractors output a string that appears to be from a uniform distribution they hide all the information about their input.



Applications


Extractor sketches can be used to construct (m,t,ϵ){displaystyle (m,t,epsilon )}(m,t,epsilon )-fuzzy perfectly one-way hash functions. When used as a hash function the input w{displaystyle w}w is the object you want to hash. The P,R{displaystyle P,R}P,R that Gen(w){displaystyle Gen(w)}Gen(w) outputs is the hash value. If one wanted to verify that a w′{displaystyle w'}w' within t{displaystyle t}t from the original w{displaystyle w}w, they would verify that Rep(w′,P)=R{displaystyle Rep(w',P)=R}Rep(w',P)=R. (m,t,ϵ){displaystyle (m,t,epsilon )}(m,t,epsilon )-fuzzy perfectly one-way hash functions are special hash functions where they accept any input with at most t{displaystyle t}t errors, compared to traditional hash functions which only accept when the input matches the original exactly. Traditional cryptographic hash functions attempt to guarantee that is it is computationally infeasible to find two different inputs that hash to the same value. Fuzzy perfectly one-way hash functions make an analogous claim. They make it computationally infeasible two find two inputs, that are more than t{displaystyle t}t Hamming distance apart and hash to the same value.



Protection against active attacks


An active attack could be one where the adversary can modify the helper string P{displaystyle P}P. If the adversary is able to change P{displaystyle P}P to another string that is also acceptable to the reproduce function Rep(W,P){displaystyle Rep(W,P)}Rep(W,P), it causes Rep(W,P){displaystyle Rep(W,P)}Rep(W,P) to output an incorrect secret string R~{displaystyle {tilde {R}}}{tilde  {R}}. Robust fuzzy extractors solve this problem by allowing the reproduce function to fail, if a modified helper string is provided as input.



Robust fuzzy extractors


One method of constructing robust fuzzy extractors is to use hash functions. This construction requires two hash functions H1{displaystyle H_{1}}H_{1} and H2{displaystyle H_{2}}H_{2}. The Gen(W){displaystyle Gen(W)}Gen(W) functions produces the helper string P{displaystyle P}P by appending the output of a secure sketch s=SS(w){displaystyle s=SS(w)}s=SS(w) to the hash of both the reading w{displaystyle w}w and secure sketch s{displaystyle s}s. It generates the secret string R{displaystyle R}R by applying the second hash function to w{displaystyle w}w and s{displaystyle s}s. Formally:
Gen(w):s=SS(w),return:P=(s,H1(w,s)),R=H2(w,s){displaystyle Gen(w):s=SS(w),return:P=(s,H_{1}(w,s)),R=H_{2}(w,s)}Gen(w):s=SS(w),return:P=(s,H_{1}(w,s)),R=H_{2}(w,s)



FuzzyExtGen.png


The reproduce function Rep(W,P){displaystyle Rep(W,P)}Rep(W,P) also makes use of the hash functions H1{displaystyle H_{1}}H_{1} and H2{displaystyle H_{2}}H_{2}. In addition to verifying the biometric input is similar enough to the one recovered using the Rec(W,S){displaystyle Rec(W,S)}Rec(W,S) function, it also verifies that hash in the second part of P{displaystyle P}P was actually derived from w{displaystyle w}w and s{displaystyle s}s. If both of those conditions are met it returns R{displaystyle R}R which is itself the second hash function applied to w{displaystyle w}w and s{displaystyle s}s. Formally:


Rep(w′,P~):{displaystyle Rep(w',{tilde {P}}):}Rep(w',{tilde  {P}}): Get s~{displaystyle {tilde {s}}}{tilde  {s}} and h~{displaystyle {tilde {h}}}{tilde  {h}} from P~;w~=Rec(w′,s~).{displaystyle {tilde {P}};{tilde {w}}=Rec(w',{tilde {s}}).}{tilde  {P}};{tilde  {w}}=Rec(w',{tilde  {s}}).
If Δ(w~,w′)≤t{displaystyle Delta ({tilde {w}},w')leq t}Delta ({tilde  {w}},w')leq t and h~=H1(w~,s~){displaystyle {tilde {h}}=H_{1}({tilde {w}},{tilde {s}})}{tilde  {h}}=H_{1}({tilde  {w}},{tilde  {s}}) then return:H2(w~,s~){displaystyle return:H_{2}({tilde {w}},{tilde {s}})}return:H_{2}({tilde  {w}},{tilde  {s}}) else return:fail{displaystyle return:fail}return:fail



Rep.png


If P{displaystyle P}P has been tampered with it will be obvious because, Rep{displaystyle Rep}Rep will output fail with very high probability. To cause the algorithm accept a different P{displaystyle P}P an adversary would have to find a w~{displaystyle {tilde {w}}}{tilde  {w}} such that H1(w,s)=H1(w~,s~){displaystyle H_{1}(w,s)=H_{1}({tilde {w}},{tilde {s}})}H_{1}(w,s)=H_{1}({tilde  {w}},{tilde  {s}}). Since hash function are believed to be one way functions, it is computationally infeasible to find such a w~{displaystyle {tilde {w}}}{tilde  {w}}.
Seeing P{displaystyle P}P would provide the adversary with no useful information. Since, again, hash function are one way functions, it is computationally infeasible for the adversary to reverse the hash function and figure out w{displaystyle w}w. Part of P{displaystyle P}P is the secure sketch, but by definition the sketch reveals negligible information about its input. Similarly seeing R{displaystyle R}R(even though it should never see it) would provide the adversary with no useful information as the adversary wouldn't be able to reverse the hash function and see the biometric input.



References





  • Fuzzy Extractors: A Brief Survey of Results from 2004 to 2006

  • Fuzzy Extractors: How to Generate Strong Keys from Biometrics and Other Noisy Data

  • Biometric Fuzzy Extractor Scheme for Iris Templates

  • A Fuzzy Vault Scheme




Comments

Popular posts from this blog

Information security

Volkswagen Group MQB platform

刘萌萌