# Introduction

## Data Provenance without Compromising Privacy, That is Why!

The Internet currently lacks effective, privacy-preserving **Data Provenance**. TLS, also known as the "s" in "https" 🔒 to the general public, ensures that data can be securely communicated between a server and a user. But how can this user credibly share this data with another user or server without compromising security, privacy, and control?

Enter TLSNotary: a protocol enabling users to export data securely from any website. Using Zero Knowledge Proof (ZKP) technology, this data can be selectively shared with others in a cryptographically verifiable manner.

TLSNotary makes data truly portable and allows a user, the `Prover`, to share it with another party, the `Verifier`, as they see fit.

## How Does the TLSNotary Protocol Work?

The TLSNotary protocol consists of 3 steps:

- The `Prover` **requests** data from a `Server` over TLS while cooperating with the `Verifier` in secure and privacy-preserving *multi-party computation (MPC)*.
- The `Prover` **selectively discloses** the data to the `Verifier`.
- The `Verifier` **verifies** the data.

### ① Multi-party TLS Request

TLSNotary works by adding a third party, a `Verifier`, to the usual TLS connection between the `Prover` and a `Server`. This `Verifier` is **not "a man in the middle"**. Instead, the `Verifier` participates in a **secure multi-party computation** (MPC) to jointly operate the TLS connection without seeing the data in plain text. By participating in the MPC, the `Verifier` can validate the authenticity of the data the `Prover` received from the `Server`.

The TLSNotary protocol is transparent to the `Server`. From the `Server`'s perspective, the `Prover`'s connection is a standard TLS connection.

### ② Selective Disclosure

The TLSNotary protocol enables the `Prover` to selectively prove the authenticity of arbitrary parts of the data to a `Verifier`. In this **selective disclosure** phase, the `Prover` can **redact** sensitive information from the data prior to sharing it with the `Verifier`.

This capability can be paired with Zero-Knowledge Proofs to prove properties of the redacted data without revealing the data itself.
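The commit-and-open pattern behind selective disclosure can be sketched in a few lines. The chunking scheme and the names `commit_transcript` and `verify_opening` below are illustrative inventions, and `DefaultHasher` is a non-cryptographic stand-in for a real commitment (TLSNotary's actual commitments are derived from the MPC encodings, as described in the Notarization section):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in commitment; NOT cryptographically secure, purely illustrative.
fn commit(data: &[u8], salt: u64) -> u64 {
    let mut h = DefaultHasher::new();
    salt.hash(&mut h);
    data.hash(&mut h);
    h.finish()
}

// The Prover splits the transcript into chunks and commits to each one.
fn commit_transcript(chunks: &[&[u8]], salts: &[u64]) -> Vec<u64> {
    chunks.iter().zip(salts).map(|(c, &s)| commit(c, s)).collect()
}

// Selective disclosure: the Prover reveals only some (index, chunk, salt)
// openings; the Verifier checks each one against the commitments.
fn verify_opening(commitments: &[u64], index: usize, chunk: &[u8], salt: u64) -> bool {
    commitments.get(index).map_or(false, |&c| c == commit(chunk, salt))
}

fn main() {
    let chunks: [&[u8]; 3] = [b"GET / HTTP/1.1", b"host: example.com", b"secret-token"];
    let salts = [11u64, 22, 33];
    let commitments = commit_transcript(&chunks, &salts);

    // Reveal only the first two chunks; the third stays redacted.
    assert!(verify_opening(&commitments, 0, b"GET / HTTP/1.1", 11));
    assert!(verify_opening(&commitments, 1, b"host: example.com", 22));

    // A tampered opening does not verify.
    assert!(!verify_opening(&commitments, 1, b"host: evil.com", 22));
}
```

The key property is that the commitments can be signed once, while openings are produced later and per-`Verifier`.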

### ③ Data Verification

The `Verifier` now validates the proof received from the `Prover`. The data origin can be verified by inspecting the `Server` certificate through trusted certificate authorities (CAs). The `Verifier` can now make assertions about the non-redacted content of the transcript.

## TLS verification with a general-purpose Notary

Since the validation of the TLS traffic reveals nothing about the plaintext of the TLS session, nor about the `Server`, it is possible to outsource the MPC-TLS verification ① to a general-purpose TLS verifier, which we term a `Notary`. This `Notary` can sign (aka *notarize*) ② the data, making it portable. The `Prover` can then take this signed data and selectively disclose ③ sections to an application-specific `Verifier`, who then verifies the data ④.

In this setup, the `Notary` cryptographically signs commitments to the data and the server's identity. The `Prover` can store this signed data, redact it, and share it with any `Verifier` as they see fit, making the signed data both reusable and portable.

`Verifiers` will only accept the signed data if they trust the `Notary`. A data `Verifier` can also require signed data from multiple `Notaries` to rule out collusion between the `Prover` and a `Notary`.

## What Can TLSNotary Do?

TLSNotary can be used for various purposes. For example, you can use TLSNotary to prove that:

- you have access to an account on a web platform
- a website showed specific content on a certain date
- you have private information about yourself (address, birth date, health, etc.)
- you have received a money transfer using your online banking account without revealing your login credentials or sensitive financial information
- you received a private message from someone
- you purchased an item online
- you were blocked from using an app
- you earned professional certificates

While TLSNotary can notarize publicly available data, it does not solve the "oracle problem". For this use case, existing oracle solutions are more suitable.

## Who is behind TLSNotary?

TLSNotary is developed by the Privacy and Scaling Exploration (PSE) research lab of the Ethereum Foundation. The PSE team is committed to conceptualizing and testing use cases for cryptographic primitives.

TLSNotary is not a new project; in fact, it has been around for more than a decade.

In 2022, TLSNotary was rebuilt from the ground up in Rust incorporating state-of-the-art cryptographic protocols. This renewed version of the TLSNotary protocol offers enhanced security, privacy, and performance.

Older versions of TLSNotary, including PageSigner, have been archived due to a security vulnerability.

# Motivation

The decentralized internet demands privacy-respecting data provenance!

Data provenance ensures internet data is authentic. It allows verification of the data's origin and ensures the data hasn't been fabricated or tampered with.

Data provenance will make data truly portable, empowering users to share it with others as they see fit.

## Non-repudiation: TLS is not enough

Transport Layer Security (TLS) plays a crucial role in digital security. TLS protects communication against eavesdropping and tampering. It ensures that the data received by a user (*"Alice"*) indeed originated from the `Server` and was not changed. The `Server`'s identity is verified by Alice through trusted Certificate Authorities (CAs). Data integrity is maintained by transmitting a keyed cryptographic hash (called a Message Authentication Code, or MAC, in TLS) alongside the data, which safeguards against deliberate alterations.

However, this hash does not provide **non-repudiation**, meaning it cannot serve as evidence for the **authenticity and integrity** of the data to Bob (e.g., a service or an app). Because it is a keyed hash and TLS requires that the key is known to Alice, she could potentially modify the data and compute a corresponding hash after the TLS session is finished.
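The repudiation problem fits in a few lines of code: because the MAC is a keyed hash and Alice holds the key, she can produce a valid tag for any message of her choosing. The `mac` function below is a toy keyed hash, not TLS's actual HMAC/AEAD construction, and the key and messages are invented for illustration:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Toy keyed hash standing in for a TLS MAC; NOT cryptographically secure.
fn mac(key: u64, message: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    key.hash(&mut h);
    message.hash(&mut h);
    h.finish()
}

fn main() {
    let session_key: u64 = 0xdead_beef; // known to BOTH the Server and Alice
    let server_msg = b"balance: 100";
    let server_tag = mac(session_key, server_msg);

    // Alice can verify that the message arrived untampered...
    assert_eq!(server_tag, mac(session_key, server_msg));

    // ...but because she holds the same key, she can also produce a valid
    // tag for a message the Server never sent. The pair (message, tag)
    // therefore proves nothing to a third party like Bob.
    let forged_msg = b"balance: 1000000";
    let forged_tag = mac(session_key, forged_msg);
    assert_eq!(forged_tag, mac(session_key, forged_msg));
}
```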

Achieving non-repudiation requires digital signatures implemented with asymmetric, public-key cryptography.

While the concept seems straightforward, enabling servers to sign data is not a part of the TLS protocol. Even if all data were securely signed, naively sharing all data with others could expose too much information, compromising Alice's privacy. **Privacy** is a vital social good that must be protected.

## Status Quo: delegate access

Currently, when Alice wants to share data from a `Server` with another party, OAuth can be used to facilitate this if the application supports it. In this way, the other party receives the data directly from the `Server`, ensuring authentic and unchanged data. However, applications often do not provide fine-grained control over which data to share, leading to the other party gaining access to more information than strictly necessary.

Another drawback of this solution is that the `Server` is aware of the access delegation, enabling it to monitor and censor the other user's requests.

It's worth noting that in many instances, OAuth is not even presented as an option. This is because a lot of servers lack the incentive to provide third-party access to the data.

## TLSNotary: data provenance and privacy with secure multi-party computation

TLSNotary operates by executing the TLS communication using **multi-party computation** (MPC). MPC allows Alice and Bob to jointly manage the TLS connection.
With TLSNotary, Alice can selectively prove the authenticity of arbitrary portions of the data to Bob. Since Bob participated in the MPC-TLS communication, he is guaranteed that the data is authentic.

The TLSNotary protocol is **transparent** to the `Server`. From the `Server`'s perspective, the TLS connection appears just like any other connection, meaning **no modifications to the TLS protocol are necessary**.

## Make your data portable with TLSNotary!

TLSNotary is a solution designed to prove the authenticity of data while preserving user privacy. It unlocks a variety of new use cases. So, if you're looking for a way to make your data portable without compromising on privacy, TLSNotary is built for you!

Dive into the protocol and integrate it into your applications. We eagerly await your feedback on Discord.

# Quick Start

In this guide we will set up a general-purpose TLS verifier (a.k.a. the `Notary`), so that a `Prover` can notarize some TLS data and generate a proof which he then shows to a `Verifier` for selective disclosure.

This guide will take you through the steps of:

- starting a `Notary` server
- running a `Prover` to notarize some web data
- running a `Verifier` to verify the notarized data

## Preliminaries

### Install rust

If you don't have `rust` installed yet, install it with rustup:

```
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
```

## Guide

### Start a Notary server:

```
git clone https://github.com/tlsnotary/notary-server
cd notary-server
cargo run --release
```

The `Notary` server will now be running in the background, waiting for connections from a `Prover`. You can switch to another console to run the `Prover`.

For more information on how to configure the `Notary` server, please refer to this.

### Run a simple Prover:

```
git clone https://github.com/tlsnotary/tlsn
cd tlsn/tlsn/examples
cargo run --release --example simple_prover
```

The notarization session usually takes a few moments and the resulting proof will be written to the `proof.json` file. The proof can then be passed on to the `Verifier` for verification.

The `simple_prover` notarizes https://example.com and redacts the `USER_AGENT` HTTP header from the proof for the `Verifier`. You can change the code in `tlsn/tlsn/examples/simple_prover.rs` to meet your needs:

- change which server the `Prover` connects to
- add or remove HTTP request headers
- redact other strings in the request or the response

⚠️ Please note that by default the `Notary` server expects that the cumulative size of the request and the server response is not more than 16KB.

### Run a simple Verifier:

```
cargo run --release --example simple_verifier
```

This will verify the proof from the `simple_prover` (`proof.json`) and output the result to the console.

Note how the parts which the prover chose not to disclose will be shown as "X":

```
GET / HTTP/1.1
host: example.com
accept: */*
accept-encoding: identity
connection: close
user-agent: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
```

# MPC-TLS

During the MPC-TLS phase the `Prover` and the `Verifier` work together to generate an authenticated `Transcript` ^{1} of a TLS session with a `Server`.

Listed below are some key points regarding this process:

- The `Verifier` only ever sees the *encrypted* application data of the TLS session.
- The protocol guarantees that the `Prover` is not solely capable of constructing requests, nor can they forge responses from the `Server`.
- When the `Verifier` is a `Notary` (see the section on Notarization), the identity of the `Server` is hidden and can be proven to another application-specific verifier later.

^{1} A transcript is the application-level data that is sent to and received from the `Server`.

# Handshake

A TLS handshake is the first step in establishing a TLS connection between a `Prover` and a `Server`. In TLSNotary the `Prover` is the one who starts the TLS handshake and physically communicates with the `Server`, but all cryptographic TLS operations are performed together with the `Verifier` using MPC.

The `Prover` and `Verifier` use a series of MPC protocols to compute the TLS session key in such a way that both only have their share of the key and never learn the full key. Both parties then proceed to complete the TLS handshake using their shares of the key.

See our section on Key Exchange for more details of how this is done.

Note: to a third-party observer, the `Prover`'s connection to the server appears like a regular TLS connection, and the security guaranteed by TLS remains intact for the `Prover`. The only exception is that since the `Verifier` is a party to the MPC TLS, the security for the `Prover` against a malicious `Verifier` is provided by the underlying MPC protocols and not by TLS.

With the shares of the session key computed and the TLS handshake completed, the parties now proceed to the next MPC protocol, where they use their session key shares to jointly generate encrypted requests and decrypt server responses while keeping the plaintext of both the requests and responses private from the `Verifier`.

# Encryption, Decryption, and MAC Computation

This section explains how the `Prover` and `Verifier` use MPC to encrypt data sent to the server, decrypt data received from the server, and compute the MAC for the ciphertext. It shows how the `Prover` and `Verifier` collaborate to encrypt and decrypt data. The `Verifier` performs these tasks "blindly", without acquiring knowledge of the plaintext.

## Encryption

To encrypt the plaintext, both parties input their TLS key shares as private inputs to the MPC protocol, along with some other public data. Additionally, the `Prover` inputs her plaintext as a private input.

Both parties see the resulting ciphertext and execute the 2PC MAC protocol to compute the MAC for the ciphertext.

The `Prover` then dispatches the ciphertext and the MAC to the server.
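The role of the key shares can be illustrated with a toy sketch. This is not the actual AES-GCM 2PC that TLSNotary runs; the XOR keystream and the local combination of shares below are simplified stand-ins used only to show that neither share alone is the session key:

```rust
// Toy stream "cipher": XOR with a keystream derived from the key.
// A stand-in for AES-GCM; NOT secure, purely illustrative.
fn keystream(key: u64, len: usize) -> Vec<u8> {
    (0..len).map(|i| (key.rotate_left((i % 64) as u32) as u8) ^ (i as u8)).collect()
}

fn xor(data: &[u8], stream: &[u8]) -> Vec<u8> {
    data.iter().zip(stream).map(|(a, b)| a ^ b).collect()
}

fn main() {
    // The Prover and Verifier each hold a share; the full key is the XOR.
    let (share_prover, share_verifier): (u64, u64) =
        (0x1234_5678_9abc_def0, 0x0f1e_2d3c_4b5a_6978);
    let full_key = share_prover ^ share_verifier;

    let plaintext = b"GET / HTTP/1.1";
    // Inside the MPC the shares are combined without ever being revealed;
    // here we combine them locally just to show the relationship.
    let ciphertext = xor(plaintext, &keystream(full_key, plaintext.len()));

    // Neither share alone yields the plaintext...
    assert_ne!(
        xor(&ciphertext, &keystream(share_prover, ciphertext.len())),
        plaintext.to_vec()
    );
    // ...but the combined key decrypts correctly.
    assert_eq!(
        xor(&ciphertext, &keystream(full_key, ciphertext.len())),
        plaintext.to_vec()
    );
}
```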

## Decryption

Once the `Prover` receives the ciphertext and its associated MAC from the server, the parties first authenticate the ciphertext by validating the MAC. They do this by running the MPC protocol to compute the authentic MAC for the ciphertext. They then verify that the authentic MAC matches the MAC received from the server.

Next, the parties decrypt the ciphertext by providing their key shares as private inputs to the MPC protocol, along with the ciphertext and some other public data.

The resulting plaintext is revealed ONLY to the `Prover`.

Please note, the actual low-level implementation details of decryption are more nuanced than what we have described here. For more information, please consult Low-level Decryption details.

# Notarization

As part of the TLSNotary protocol, the `Prover` can create authenticated commitments to the plaintext and have the `Notary` sign them without ever seeing the plaintext. This offers a way for the `Prover` to selectively prove the authenticity of arbitrary portions of the plaintext to a different `Verifier` later.

A naive approach to creating such authenticated commitments is to extend the `Encryption and Decryption` steps to also compute a commitment (e.g. a BLAKE3 hash) to the plaintext using MPC and have the `Notary` sign that commitment. Unfortunately, such an approach is too resource-intensive, prompting us to provide a more lightweight commitment scheme.

The high-level idea is that the `Prover` creates a commitment to the encodings from the MPC protocol used for `Encryption and Decryption`. Since those encodings are chosen by the `Notary` and are not known to the `Prover` at the time when she makes a commitment, they can be thought of as *"authenticated plaintext"*.

## Signing the Session Header

The `Notary` signs an artifact known as a `Session Header`, thereby attesting to the authenticity of the plaintext from a TLS session. A `Session Header` contains a `Prover`'s commitment to the plaintext and a `Prover`'s commitment to TLS-specific data which uniquely identifies the server.

The `Prover` can later use the signed `Session Header` to prove data provenance to a third-party `Verifier`.

It's important to highlight that throughout the entire TLSNotary protocol, including this signing stage, the `Notary` does not gain knowledge of either the plaintext or the identity of the server with which the `Prover` communicated.

# Verification

To prove data provenance to a third-party `Verifier`, the `Prover` provides the following information:

- the `Session Header` signed by the `Notary`
- the `opening` to the plaintext commitment
- the `TLS-specific data` which uniquely identifies the server
- the `identity` of the server

The `Verifier` performs the following verification steps:

- verifies that the `opening` corresponds to the commitment in the `Session Header`
- verifies that the `TLS-specific data` corresponds to the commitment in the `Session Header`
- verifies the `identity` of the server against the `TLS-specific data`

Next, the `Verifier` parses the `opening` with an application-specific parser (e.g. HTTP or JSON) to get the final output. Since the `Prover` is allowed to selectively disclose the data, any data which was not disclosed by the `Prover` will appear to the `Verifier` as redacted.

Below is an example of a verification output for an HTTP 1.1 request and response. Note that since the `Prover` chose not to disclose some sensitive information like their HTTP session token and address, that information will be withheld from the `Verifier` and will appear to him as redacted (in red).

# Key Exchange

In TLS, the first step towards obtaining the TLS session keys is to compute a shared secret between the client and the server by running the ECDH protocol. The resulting shared secret is called the **pre-master secret** (`PMS`) in TLS terms.

With TLSNotary, at the end of the key exchange, the `Server` gets the `PMS` as usual. The `Prover` and the `Verifier`, jointly operating as the TLS client, compute additive shares of the `PMS`. This prevents either party from unilaterally sending or receiving messages with the `Server`. Subsequently, the authenticity and integrity of the messages are guaranteed to both the `Prover` and the `Verifier`, while also keeping the plaintext hidden from the `Verifier`.

The 3-party ECDH protocol between the `Server`, the `Prover`, and the `Verifier` works as follows:

- `Server` sends its public key $Q_b$ to the `Prover`, and the `Prover` forwards it to the `Verifier`
- `Prover` picks a random private key share $d_c$ and computes a public key share $Q_c = d_c \cdot G$
- `Verifier` picks a random private key share $d_n$ and computes a public key share $Q_n = d_n \cdot G$
- `Verifier` sends $Q_n$ to the `Prover`, who computes $Q_a = Q_c + Q_n$ and sends $Q_a$ to the `Server`
- `Prover` computes an EC point $(x_p, y_p) = d_c \cdot Q_b$
- `Verifier` computes an EC point $(x_q, y_q) = d_n \cdot Q_b$
- Addition of the points $(x_p, y_p)$ and $(x_q, y_q)$ results in the coordinate $x_r$, which is the `PMS`. (The coordinate $y_r$ is not used in TLS.)
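The steps above can be checked numerically on a toy curve. The sketch below uses a tiny short-Weierstrass curve over $F_{97}$ (real TLS uses curves such as P-256), with made-up secrets, and verifies that the point sum $d_c \cdot Q_b + d_n \cdot Q_b$ computed on the client side equals the point the `Server` derives from $Q_a$, so both sides agree on the `PMS` x-coordinate:

```rust
// Toy short-Weierstrass curve y^2 = x^3 + 2x + 3 over F_97; illustration only.
const P: u64 = 97;
const A: u64 = 2;

type Point = Option<(u64, u64)>; // None = point at infinity

fn powm(mut b: u64, mut e: u64) -> u64 {
    let mut r = 1;
    while e > 0 {
        if e & 1 == 1 { r = r * b % P; }
        b = b * b % P;
        e >>= 1;
    }
    r
}
fn inv(a: u64) -> u64 { powm(a, P - 2) } // Fermat's little theorem

fn add(p: Point, q: Point) -> Point {
    let (x1, y1) = match p { Some(v) => v, None => return q };
    let (x2, y2) = match q { Some(v) => v, None => return p };
    if x1 == x2 && (y1 + y2) % P == 0 {
        return None; // opposite points (or doubling a 2-torsion point)
    }
    let l = if x1 == x2 {
        // tangent slope: (3*x1^2 + A) / (2*y1)
        (3 * x1 % P * x1 % P + A) % P * inv(2 * y1 % P) % P
    } else {
        // chord slope: (y2 - y1) / (x2 - x1)
        (y2 + P - y1) % P * inv((x2 + P - x1) % P) % P
    };
    let x3 = (l * l % P + 2 * P - x1 - x2) % P;
    let y3 = (l * ((x1 + P - x3) % P) % P + P - y1) % P;
    Some((x3, y3))
}

// Double-and-add scalar multiplication.
fn smul(mut k: u64, mut p: Point) -> Point {
    let mut acc: Point = None;
    while k > 0 {
        if k & 1 == 1 { acc = add(acc, p); }
        p = add(p, p);
        k >>= 1;
    }
    acc
}

fn main() {
    let g: Point = Some((3, 6)); // generator on the toy curve
    let server_secret = 7;
    let qb = smul(server_secret, g); // Server's public key Q_b

    // Prover and Verifier pick private key shares d_c and d_n;
    // the combined client public key is Q_a = Q_c + Q_n.
    let (dc, dn) = (5, 11);
    let qa = add(smul(dc, g), smul(dn, g));

    // Server derives the shared point from Q_a; the client side derives it
    // as the sum of the two separately computed points.
    let server_point = smul(server_secret, qa);
    let client_point = add(smul(dc, qb), smul(dn, qb));
    assert_eq!(server_point, client_point); // same point => same PMS x-coord
}
```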

Using the notation from here, our goal is to compute $x_r = \left(\frac{y_q - y_p}{x_q - x_p}\right)^2 - x_p - x_q$ in such a way that

- Neither party learns the other party's $x$ value
- Neither party learns $x_r$, only their respective shares of $x_r$.

We will use two maliciously secure protocols described on p. 25 of the paper Efficient Secure Two-Party Exponentiation:

- the `A2M` protocol, which converts additive shares into multiplicative shares, i.e. given shares `a` and `b` such that `a + b = c`, it converts them into shares `d` and `e` such that `d * e = c`
- the `M2A` protocol, which converts multiplicative shares into additive shares, i.e. given shares `d` and `e` such that `d * e = c`, it converts them into shares `a` and `b` such that `a + b = c`

We apply `A2M` to $y_q + (-y_p)$ to get $A_q \cdot A_p$ and also we apply `A2M` to $x_q + (-x_p)$ to get $B_q \cdot B_p$. Then the above can be rewritten as:

$x_r = \left(\frac{A_q}{B_q}\right)^2 \cdot \left(\frac{A_p}{B_p}\right)^2 - x_p - x_q$

Then the first party locally computes the first factor and gets $C_q$, the second party locally computes the second factor and gets $C_p$. Then we can again rewrite as:

$x_r = C_q \cdot C_p - x_p - x_q$

Now we apply `M2A` to $C_q \cdot C_p$ to get $D_q + D_p$, which leads us to two final terms, each of which is the share of $x_r$ of the respective party:

$x_r = (D_q - x_q) + (D_p - x_p)$
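This derivation can be verified numerically. The sketch below works over a small prime field and simulates `A2M`/`M2A` with a trusted dealer that simply resplits the secret; the real protocols achieve the same resharing via oblivious transfer, without any party learning the underlying value. The coordinates and randomness are arbitrary example values:

```rust
// Arithmetic in the prime field F_P with P = 2^31 - 1.
const P: u64 = 2_147_483_647;

fn fadd(a: u64, b: u64) -> u64 { (a + b) % P }
fn fsub(a: u64, b: u64) -> u64 { (a + P - b % P) % P }
fn fmul(a: u64, b: u64) -> u64 { ((a as u128 * b as u128) % P as u128) as u64 }
fn fpow(mut b: u64, mut e: u64) -> u64 {
    let mut r = 1;
    while e > 0 {
        if e & 1 == 1 { r = fmul(r, b); }
        b = fmul(b, b);
        e >>= 1;
    }
    r
}
fn finv(a: u64) -> u64 { fpow(a, P - 2) } // Fermat's little theorem

// A2M simulated by a trusted dealer: converts a + b = c into shares whose
// product is c. The real protocol does this via OT, without reconstructing c.
fn a2m(a: u64, b: u64, rand_nonzero: u64) -> (u64, u64) {
    let c = fadd(a, b);
    (rand_nonzero, fmul(c, finv(rand_nonzero)))
}

// M2A simulated by a trusted dealer: converts d * e = c into shares whose
// sum is c.
fn m2a(d: u64, e: u64, rand: u64) -> (u64, u64) {
    let c = fmul(d, e);
    (rand, fsub(c, rand))
}

fn main() {
    // Coordinates of the EC points held by the Prover (p) and Verifier (q).
    let (xp, yp, xq, yq) = (17, 29, 1234, 5678);

    // Direct computation: x_r = ((y_q - y_p) / (x_q - x_p))^2 - x_p - x_q.
    let lambda = fmul(fsub(yq, yp), finv(fsub(xq, xp)));
    let xr = fsub(fsub(fmul(lambda, lambda), xp), xq);

    // The shared computation from the text:
    let (aq, ap) = a2m(yq, fsub(0, yp), 999); // A_q * A_p = y_q - y_p
    let (bq, bp) = a2m(xq, fsub(0, xp), 777); // B_q * B_p = x_q - x_p
    let cq = fpow(fmul(aq, finv(bq)), 2);     // C_q = (A_q / B_q)^2
    let cp = fpow(fmul(ap, finv(bp)), 2);     // C_p = (A_p / B_p)^2
    let (dq, dp) = m2a(cq, cp, 555);          // D_q + D_p = C_q * C_p
    // Each party locally subtracts its x coordinate from its share.
    assert_eq!(fadd(fsub(dq, xq), fsub(dp, xp)), xr);
}
```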

# Finite-Field Arithmetic

Some protocols used in TLSNotary need to convert two-party sharings of products or sums of some field elements into each other. For this purpose we use share conversion protocols which use oblivious transfer (OT) as a sub-protocol. Here we want to have a closer look at the security guarantees these protocols offer.

### Adding covert security

Our goal is to add covert security to our share conversion protocols. This
means that we want an honest party to be able to detect a malicious adversary,
so that the honest party can abort the protocol. Our main concern is that the
adversary might be able to leak private inputs of the honest party without
being noticed. For this reason we require that the adversary cannot do
anything which would give him a better chance than guessing the private input
at random, which is guessing $k$ bits with a probability of $2^{-k}$ of not
being detected.

In the following we want to have a closer look at how the sender and receiver can deviate from the protocol.

#### Malicious receiver

Note that in our protocol a malicious receiver cannot forge the protocol output, since he does not send anything to the sender during protocol execution. Even when this protocol is embedded into an outer protocol, where at some point the receiver has to open his output or a computation involving it, all he can do is open an output $y'$ with $y' \neq y$, which is just equivalent to changing his input from $b$ to $b'$.

#### Malicious sender

In the case of a malicious sender the following things can happen:

- The sender can impose an arbitrary field element $b'$ as input onto the receiver without him noticing. To do this he simply sends $(t_i^k, t_i^k)$ in every OT, where $k$ is the $i$-th bit of $b'$.
- The sender can execute a selective-failure attack, which allows him to learn any predicate about the receiver's input. For an OT round $i$, the sender alters one of the OT values to be $T_i^k = t_i^k + c_i$, where $c_i \in \mathbb{F}, k \in \{0,1\}$. This will cause the equation $a \cdot b = x + y$ to no longer hold, but only if the forged OT value has actually been picked by the receiver.
- The sender does not use a random number generator with a seed $r$ to sample the masks $s_i$, and instead simply chooses them at will.

### M2A Protocol Review

Without loss of generality let us recall the Multiplication-To-Addition (M2A) protocol, but our observations also apply to the Addition-To-Multiplication (A2M) protocol, which is very similar. We start with a short review of the M2A protocol.

Let there be a sender with some field element $a$ and some receiver with another field element $b$. After protocol execution the sender ends up with $x$ and the receiver ends up with $y$, so that $a \cdot b = x + y$.

- $a, b, x, y \in \mathbb{F}$
- $r$ - rng seed
- $m$ - bit-length of elements in $\mathbb{F}$
- $n$ - bit-length of rng seed

#### OT Sender

with input $a \in \mathbb{F}, r \in \{0,1\}^n$

- Sample some random masks: $s_i = rng(r, i)$, $0 \le i < m$, $s_i \in \mathbb{F}$
- For every $s_i$ compute:
  - $t_i^0 = s_i$
  - $t_i^1 = a \cdot 2^i + s_i$
- Compute new share: $x = -\sum s_i$
- Send $m$ OTs to the receiver: $(t_i^0, t_i^1)$

#### OT Receiver

with input $b \in \mathbb{F}$

- Set $v_i = t_i^{b_i}$ (from OT), where $b_i$ is the $i$-th bit of $b$
- Compute new share: $y = \sum v_i$
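The sender and receiver steps can be simulated locally. In the sketch below, `rng` is a toy deterministic mixer standing in for the seeded rng, and both OT values are simply handed to the receiver, who picks $t_i^{b_i}$ as the OT would; the field and inputs are example choices:

```rust
// Field F_P with P = 2^31 - 1; field elements have bit-length M = 31.
const P: u64 = 2_147_483_647;
const M: u32 = 31;

fn fadd(a: u64, b: u64) -> u64 { (a + b) % P }
fn fmul(a: u64, b: u64) -> u64 { ((a as u128 * b as u128) % P as u128) as u64 }

// Toy deterministic mixer standing in for rng(r, i); NOT cryptographic.
fn rng(seed: u64, i: u32) -> u64 {
    let mut v = seed ^ (i as u64 + 1).wrapping_mul(0x9E37_79B9_7F4A_7C15);
    v ^= v >> 33;
    v = v.wrapping_mul(0xFF51_AFD7_ED55_8CCD);
    v ^= v >> 33;
    v % P
}

// Sender with input a: builds the OT pairs (t_i^0, t_i^1) and the share
// x = -sum(s_i).
fn m2a_sender(a: u64, seed: u64) -> (Vec<(u64, u64)>, u64) {
    let mut ots = Vec::new();
    let mut sum_s = 0;
    for i in 0..M {
        let s = rng(seed, i);
        ots.push((s, fadd(fmul(a, 1 << i), s))); // (t_i^0, t_i^1)
        sum_s = fadd(sum_s, s);
    }
    (ots, (P - sum_s) % P)
}

// Receiver with input b: picks t_i^{b_i} from each OT and sums: y = sum(v_i).
fn m2a_receiver(b: u64, ots: &[(u64, u64)]) -> u64 {
    ots.iter().enumerate().fold(0, |y, (i, &(t0, t1))| {
        fadd(y, if (b >> i) & 1 == 1 { t1 } else { t0 })
    })
}

fn main() {
    let (a, b) = (123_456, 654_321);
    let (ots, x) = m2a_sender(a, 42);
    let y = m2a_receiver(b, &ots);
    assert_eq!(fmul(a, b), fadd(x, y)); // a * b = x + y
}
```

The masks $s_i$ cancel out: $y = \sum (b_i \cdot a \cdot 2^i + s_i) = a \cdot b + \sum s_i$ and $x = -\sum s_i$.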

### Replay protocol

In order to mitigate the mentioned protocol deviations in the case of a malicious sender we will introduce a replay protocol.

In this section we will use capital letters for values sent in the replay protocol, which in the case of an honest sender are equal to their lowercase counterparts.

The idea for the replay protocol is that at some point after the conversion protocol, the sender has to reveal the rng seed $r$ and his input $a$ to the receiver. In order to do this, he will send $R$ and $A$ to the receiver after the conversion protocol has been executed. If the sender is honest then of course $r == R$ and $a == A$. The receiver can then check that the values he picked during protocol execution match what he can now reconstruct from $R$ and $A$, i.e. that $T_i == t_i$.

**Using this replay protocol the sender at some point reveals all his secrets
because he sends his rng seed and protocol input to the receiver. This means
that we can only use covertly secure share conversion with replay as a
sub-protocol if it is acceptable for the outer protocol, that the input to
share-conversion becomes public at some later point.**

Now in practice we often want to execute several rounds of share-conversion, as we need to convert several field elements. Because of this we let the sender use the same rng seed $r$ to seed his rng once, and then he uses this rng instance for all protocol rounds. This means we have $l$ protocol executions $a_n \cdot b_n = x_n + y_n$, with all masks $s_{n,i}$ produced from this rng seed $r$. So the sender will write his seed $R$ and all the $A_n$ to some tape, which in the end is sent to the receiver. As a security precaution we also let the sender commit to his rng seed before the first protocol execution. In detail:

##### Sender

- Sender has some inputs $a_{n}$ and picks some rng seed $r$.
- Sender commits to his rng seed and sends the commitment to the receiver.
- Sender sends all his OTs for the $l$ protocol executions.
- Sender sends tape which contains the rng seed $R$ and all the $A_{n}$.

##### Receiver

- Receiver checks that $R$ is indeed the committed rng seed.
- For every protocol execution $n$ the receiver checks that $T_{n,i} == t_{n,i}$.

Having a look at the ways a malicious sender could cheat from earlier, we notice:

- The sender can no longer impose an arbitrary field element $b'$ onto the receiver, because the receiver would notice that $t \neq T$ during the replay.
- The sender can still carry out a selective-failure attack, but this is equivalent to guessing $k$ bits of $b$ at random with a probability of $2^{-k}$ of being undetected.
- The sender is now forced to use an rng seed to produce the masks, because during the replay these masks are reproduced from $R$ and indirectly checked via $t == T$.
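A sketch of the receiver-side replay check, with `DefaultHasher` as a non-cryptographic stand-in for the seed commitment and a toy mixer standing in for the seeded rng; the function names and parameters are illustrative:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-ins: a real implementation would use a cryptographic commitment
// scheme and a seeded cryptographic PRG.
fn commit(seed: u64) -> u64 {
    let mut h = DefaultHasher::new();
    seed.hash(&mut h);
    h.finish()
}
fn rng(seed: u64, i: u32) -> u64 {
    let mut v = seed ^ (i as u64 + 1).wrapping_mul(0x9E37_79B9_7F4A_7C15);
    v ^= v >> 33;
    v.wrapping_mul(0xFF51_AFD7_ED55_8CCD)
}

// Receiver-side replay check: given the revealed seed R and input A,
// reconstruct every OT pair and compare it with the pair (T_i^0, T_i^1)
// observed during the protocol.
fn replay_check(commitment: u64, r_revealed: u64, a_revealed: u64,
                observed: &[(u64, u64)]) -> bool {
    if commit(r_revealed) != commitment {
        return false; // revealed seed does not match the earlier commitment
    }
    observed.iter().enumerate().all(|(i, &(t0, t1))| {
        let s = rng(r_revealed, i as u32);
        t0 == s && t1 == a_revealed.wrapping_mul(1 << i).wrapping_add(s)
    })
}

fn main() {
    let (seed, a): (u64, u64) = (7, 3);
    let com = commit(seed);

    // An honest sender derives every pair from the seeded rng.
    let honest: Vec<(u64, u64)> = (0..4u32)
        .map(|i| {
            let s = rng(seed, i);
            (s, a.wrapping_mul(1u64 << i).wrapping_add(s))
        })
        .collect();
    assert!(replay_check(com, seed, a, &honest));

    // A sender who altered one OT value is caught during the replay.
    let mut cheating = honest.clone();
    cheating[2].1 = cheating[2].1.wrapping_add(1);
    assert!(!replay_check(com, seed, a, &cheating));
}
```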

# Dual Execution with Asymmetric Privacy

TLSNotary uses the `DEAP` protocol described below to ensure malicious security of the overall protocol.

When using DEAP in TLSNotary, the `User` plays the role of Alice and has full privacy, and the `Notary` plays the role of Bob and reveals all of his private inputs after the TLS session with the server is over. The `Notary`'s private input is his TLS session key share.

The parties run the `Setup` and `Execution` steps of `DEAP` but defer the `Equality Check`. Since during the `Equality Check` all of the `Notary`'s secrets are revealed to the `User`, it must be deferred until after the TLS session with the server is over; otherwise the `User` would learn the full TLS session keys and be able to forge the TLS transcript.

## Introduction

Malicious secure 2-party computation with garbled circuits typically comes at the expense of dramatically lower efficiency compared to execution in the semi-honest model. One technique, called Dual Execution [MF06] [HKE12], achieves malicious security with a minimal 2x overhead. However, it comes with the concession that a malicious adversary may learn $k$ bits of the other party's input with probability $2^{-k}$.

We present a variant of Dual Execution which provides different trade-offs. Our variant ensures complete privacy *for one party*, by sacrificing privacy entirely for the other. Hence the name, Dual Execution with Asymmetric Privacy (DEAP). During the execution phase of the protocol both parties have private inputs. The party with complete privacy learns the authentic output prior to the final stage of the protocol. In the final stage, prior to the equality check, one party reveals their private input. This allows a series of consistency checks to be performed which guarantees that the equality check can not cause leakage.

Similarly to standard DualEx, our variant ensures output correctness and detects leakage (of the revealing party's input) with probability $1 - 2^{-k}$, where $k$ is the number of bits leaked.

## Preliminary

The protocol takes place between Alice and Bob who want to compute $f(x,y)$ where $x$ and $y$ are Alice and Bob's inputs respectively. The privacy of Alice's input is ensured, while Bob's input will be revealed in the final steps of the protocol.

### Premature Leakage

Firstly, our protocol assumes a small amount of premature leakage of Bob's input is tolerable. By premature, we mean prior to the phase where Bob is expected to reveal his input.

If Alice is malicious, she has the opportunity to prematurely leak $k$ bits of Bob's input with $2^{-k}$ probability of it going undetected.

### Aborts

We assume that it is acceptable for either party to cause the protocol to abort at any time, with the condition that no information about Alice's inputs is leaked from doing so.

### Committed Oblivious Transfer

In the last phase of our protocol Bob must open all oblivious transfers he sent to Alice. To achieve this, we require a very relaxed flavor of committed oblivious transfer. For more detail on these relaxations see section 2 of Zero-Knowledge Using Garbled Circuits [JKO13].

### Notation

- $x$ and $y$ are Alice and Bob's inputs, respectively.
- $[X]_A$ denotes an encoding of $x$ chosen by Alice.
- $[x]$ and $[y]$ are Alice and Bob's encoded *active* inputs, respectively, i.e. $Enc(x, [X]) = [x]$.
- $com_x$ denotes a binding commitment to $x$.
- $G$ denotes a garbled circuit for computing $f(x, y) = v$, where:
  - $Gb([X], [Y]) = G$
  - $Ev(G, [x], [y]) = [v]$
- $d$ denotes output decoding information where $De(d, [v]) = v$.
- $\Delta$ denotes the global offset of a garbled circuit where $\forall i: [x]_i^1 = [x]_i^0 \oplus \Delta$.
- $PRG$ denotes a secure pseudo-random generator.
- $H$ denotes a secure hash function.

## Protocol

The protocol can be thought of as three distinct phases: The setup phase, execution, and equality-check.

### Setup

- Alice creates a garbled circuit $G_{A}$ with corresponding input labels $([X]_{A},[Y]_{A})$, and output label commitment $com_{[V]_{A}}$.
- Bob creates a garbled circuit $G_{B}$ with corresponding input labels $([X]_{B},[Y]_{B})$.
- For committed OT, Bob picks a seed $\rho$ and uses it to generate all random-tape for his OTs with $PRG(\rho)$. Bob sends $com_\rho$ to Alice.
- Alice retrieves her active input labels $[x]_{B}$ from Bob using OT.
- Bob retrieves his active input labels $[y]_{A}$ from Alice using OT.
- Alice sends $G_{A}$, $[x]_{A}$, $d_{A}$ and $com_{[V]_{A}}$ to Bob.
- Bob sends $G_{B}$, $[y]_{B}$, and $d_{B}$ to Alice.

### Execution

Both Alice and Bob can execute this phase of the protocol in parallel as described below:

#### Alice

- Evaluates $G_B$ using $[x]_B$ and $[y]_B$ to acquire $[v]_B$.
- Decodes $[v]_B$ to $v_B$ using $d_B$ which she received earlier. She computes $H([v_B]_A, [v]_B)$, which we will call $check_A$.
- Computes a commitment $Com(check_A, r) = com_{check_A}$, where $r$ is a key known only to Alice. She sends this commitment to Bob.
- Waits to receive $[v]_A$ from Bob ^{1}.
- Checks that $[v]_A$ is authentic, aborting if not, then decodes $[v]_A$ to $v_A$ using $d_A$.

At this stage, if Bob is malicious, Alice could detect that $v_A \neq v_B$. However, Alice must not react in this case. She proceeds with the protocol regardless, having the authentic output $v_A$.

#### Bob

- Evaluates $G_A$ using $[x]_A$ and $[y]_A$ to acquire $[v]_A$. He checks $[v]_A$ against the commitment $com_{[V]_A}$ which Alice sent earlier, aborting if it is invalid.
- Decodes $[v]_A$ to $v_A$ using $d_A$ which he received earlier. He computes $H([v]_A, [v_A]_B)$, which we'll call $check_B$, and stores it for the equality check later.
- Sends $[v]_A$ to Alice ^{1}.
- Receives $com_{check_A}$ from Alice and stores it for the equality check later.

Bob, even if malicious, has learned nothing except the purported output $v_{A}$ and is not convinced it is correct. In the next phase Alice will attempt to convince Bob that it is.

Alice, if honest, has learned the correct output $v$ thanks to the authenticity property of garbled circuits. Alice, if malicious, has potentially learned Bob's entire input $y$.

[^1]: This is a significant deviation from standard DualEx protocols such as [HKE12]. Typically the output labels are *not* returned to the Generator; instead, output authenticity is established during a secure equality check at the end. See the section below for more detail.

### Equality Check

- Bob opens his garbled circuit and OT by sending $\Delta_{B}$, $y$ and $\rho$ to Alice.
- Alice can now derive the *purported* input labels to Bob's garbled circuit $([X]_{B},[Y]_{B})$.
- Alice uses $\rho$ to open all of Bob's OTs for $[x]_{B}$ and verifies that they were performed honestly. Otherwise she aborts.
- Alice verifies that $G_{B}$ was garbled honestly by checking $Gb([X]_{B},[Y]_{B})==G_{B}$. Otherwise she aborts.
- Alice now opens $com_{check_{A}}$ by sending $check_{A}$ and $r$ to Bob.
- Bob verifies $com_{check_{A}}$ then asserts $check_{A}==check_{B}$, aborting otherwise.

Bob is now convinced that $v_{A}$ is correct, i.e. $f(x,y)=v_{A}$. Bob is also assured that Alice only learned up to $k$ bits of his input prior to revealing, with a probability of $2^{-k}$ of it being undetected.
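The commit-then-reveal pattern of this equality check can be sketched with a simple hash commitment. This is an illustration only: the commitment scheme, the hash, and the placeholder label bytes are all assumptions, not the protocol's actual construction.

```python
import hashlib
import secrets

def commit(value: bytes, key: bytes) -> bytes:
    """A simple hash commitment: Com(value, key) = H(key || value)."""
    return hashlib.sha256(key + value).digest()

# Placeholder bytes stand in for the garbled-label encodings [v_B]_A and [v]_B.
labels_a = b"[v_B]_A-labels"
labels_b = b"[v]_B-labels"

# Alice computes check_A and commits to it under a random key r.
check_a = hashlib.sha256(labels_a + labels_b).digest()
r = secrets.token_bytes(32)
com_check_a = commit(check_a, r)   # sent to Bob

# Bob computes check_B from the labels he holds; honest runs agree.
check_b = hashlib.sha256(labels_a + labels_b).digest()

# Alice opens the commitment by revealing (check_a, r); Bob verifies both.
assert commit(check_a, r) == com_check_a
assert check_a == check_b
```

Binding comes from collision resistance of the hash; hiding comes from the random key $r$ that only Alice knows until she opens.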

## Analysis

### Malicious Alice

*On the Leakage of Corrupted Garbled Circuits* [DPB18] is recommended reading on this topic.

During the first execution, Alice has some degrees of freedom in how she garbles $G_{A}$. According to [DPB18], when using a modern garbling scheme such as [ZRE15], these corruptions can be analyzed as two distinct classes: detectable and undetectable.

Recall that our scheme assumes Bob's input is an ephemeral secret which can be revealed at the end. For this reason, we are entirely unconcerned about the detectable variety. Simply providing Bob with the output labels commitment $com_{[V]_{A}}$ is sufficient to detect these types of corruptions. In this context, our primary concern is regarding the *correctness* of the output of $G_{A}$.

[DPB18] shows that any undetectable corruption made to $G_{A}$ is constrained to the arbitrary insertion or removal of NOT gates in the circuit, such that $G_{A}$ computes $f_{A}$ instead of $f$. Note that any corruption of $d_{A}$ has an equivalent effect. [DPB18] also shows that Alice's ability to exploit this is constrained by the topology of the circuit.

Recall that in the final stage of our protocol Bob checks that the output of $G_{A}$ matches the output of $G_{B}$, or more specifically:

$f_{A}(x_{1},y_{1})==f_{B}(x_{2},y_{2})$

For the moment we'll assume Bob garbles honestly and provides the same inputs for both evaluations.

$f_{A}(x_{1},y)==f(x_{2},y)$

In the scenario where Bob reveals the output of $f_{A}(x_{1},y)$ prior to Alice committing to $x_{2}$ there is a trivial *adaptive attack* available to Alice. As an extreme example, assume Alice could choose $f_{A}$ such that $f_{A}(x_{1},y)=y$. For most practical functions this is not possible to garble without detection, but for the sake of illustration we humor the possibility. In this case she could simply compute $x_{2}$ where $f(x_{2},y)=y$ in order to pass the equality check.

To address this, Alice is forced to choose $f_{A}$, $x_{1}$ and $x_{2}$ prior to Bob revealing the output. In this case it is obvious that any *valid* combination of $(f_{A},x_{1},x_{2})$ must satisfy all constraints on $y$. Thus, for any non-trivial $f$, choosing a valid combination would be equivalent to guessing $y$ correctly. In which case, any attack would be detected by the equality check with probability $1-2^{-k}$ where $k$ is the number of guessed bits of $y$. This result is acceptable within our model as explained earlier.

### Malicious Bob

*Zero-Knowledge Using Garbled Circuits* [JKO13] is recommended reading on this topic.

The last stage of our variant is functionally equivalent to the protocol described in [JKO13]. After Alice evaluates $G_{B}$ and commits to $[v]_{B}$, Bob opens his garbled circuit and all OTs entirely. Following this, Alice performs a series of consistency checks to detect any malicious behavior. These consistency checks do *not* depend on any of Alice's inputs, so any attempted selective failure attack by Bob would be futile.

Bob's only options are to behave honestly, or cause Alice to abort without leaking any information.

### Malicious Alice & Bob

They deserve whatever they get.

# Encryption

Here we will explain our protocol for 2PC encryption using a block cipher in counter-mode.

Our documentation on Dual Execution with Asymmetric Privacy is recommended prior reading for this section.

## Preliminary

### Ephemeral Keyshare

It is important to recognise that the Notary's keyshare is an *ephemeral secret*. It is only private for the duration of the User's TLS session, after which the User is free to learn it without affecting the security of the protocol.

It is this fact which allows us to achieve malicious security for relatively low cost. More details on this here.

### Premature Leakage

A small amount of undetected premature keyshare leakage is quite tolerable. For example, if the Notary leaks 3 bits of their keyshare, it gives the User no meaningful advantage in any attack, as she could have simply guessed the bits correctly with probability $2^{-3}=12.5\%$ and mounted the same attack. Assuming a sufficiently long cipher key is used, e.g. 128 bits, this is not a concern.

The equality check at the end of our protocol ensures that premature leakage is detected with a probability of $1-2^{-k}$ where $k$ is the number of leaked bits. The Notary is virtually guaranteed to detect significant leakage and can abort prior to notarization.

### Plaintext Leakage

Our protocol assures *no leakage* of the plaintext to the Notary during both encryption and decryption. The Notary reveals their keyshare at the end of the protocol, which allows the Notary to open their garbled circuits and oblivious transfers completely to the User. The User can then perform a series of consistency checks to ensure that the Notary behaved honestly. Because these consistency checks do not depend on any inputs of the User, aborting does not reveal any sensitive information (in contrast to standard DualEx which does).

### Integrity

During the entirety of the TLS session the User performs the role of the garbled circuit generator, thus ensuring that a malicious Notary can not corrupt or otherwise compromise the integrity of messages sent to/from the Server.

### Notation

- $p$ is one block of plaintext
- $c$ is the corresponding block of ciphertext, i.e. $c=Enc(k,ctr) \oplus p$
- $k$ is the cipher key
- $ctr$ is the counter block
- $k_{U}$ and $k_{N}$ denote the User and Notary cipher keyshares, respectively, where $k=k_{U} \oplus k_{N}$
- $z$ is a mask randomly selected by the User
- $ectr$ is the encrypted counter-block, i.e. $ectr=Enc(k,ctr)$
- $Enc$ denotes the block cipher used by the TLS session
- $com_{x}$ denotes a binding commitment to the value $x$
- $[x]_{A}$ denotes a garbled encoding of $x$ chosen by party $A$

## Encryption Protocol

The encryption protocol uses DEAP without any special variations. The User and Notary directly compute the ciphertext for each block of a message the User wishes to send to the Server:

$f(k_{U},k_{N},ctr,p)=Enc(k_{U} \oplus k_{N},ctr) \oplus p=c$

The User creates a commitment to the plaintext active labels for the Notary's circuit $Com([p]_{N},r)=com_{[p]_{N}}$ where $r$ is a random key known only to the User. The User sends this commitment to the Notary to be used in the authdecode protocol later. It's critical that the User commits to $[p]_{N}$ prior to the Notary revealing $\Delta$ in the final phase of DEAP. This ensures that if $com_{[p]_{N}}$ is a commitment to valid labels, then it must be a valid commitment to the plaintext $p$. This is because learning the complementary wire label for any bit of $p$ prior to learning $\Delta$ is virtually impossible.

## Decryption Protocol

The protocol for decryption is very similar to encryption, but with some key differences.

For decryption, DEAP is used for every block of the ciphertext to compute the *masked encrypted counter-block*:

$f(k_{U},k_{N},ctr,z)=Enc(k_{U} \oplus k_{N},ctr) \oplus z=ectr_{z}$

This mask $z$, chosen by the User, hides $ectr$ from the Notary and thus the plaintext too. Conversely, the User can simply remove this mask in order to compute the plaintext $p=c \oplus ectr_{z} \oplus z$.
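The masking algebra can be checked concretely with a stand-in block cipher. SHA-256 as a toy PRF here is purely illustrative (TLS uses AES); only the XOR identities are the point.

```python
import hashlib
import secrets

def enc(k: bytes, ctr: bytes) -> bytes:
    """Stand-in block cipher: derive a 16-byte pseudorandom block from key and counter."""
    return hashlib.sha256(k + ctr).digest()[:16]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

# Keyshares: k = k_U xor k_N
k_u = secrets.token_bytes(16)
k_n = secrets.token_bytes(16)
k = xor(k_u, k_n)
ctr = (1).to_bytes(16, "big")

# Server-side encryption of one plaintext block p.
p = b"attack at dawn!!"
c = xor(enc(k, ctr), p)

# 2PC decryption: the parties jointly compute the masked encrypted counter-block.
z = secrets.token_bytes(16)   # User's random mask
ectr_z = xor(enc(k, ctr), z)  # = Enc(k_U xor k_N, ctr) xor z

# The User removes the mask locally to recover the plaintext: p = c xor ectr_z xor z
assert xor(xor(c, ectr_z), z) == p
```

The Notary only ever sees $ectr_{z}$, which is uniformly random from their perspective because of $z$.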

Following this, the User can retrieve the wire labels $[p]_{N}$ from the Notary using OT.

Similarly to the procedure for encryption, the User creates a commitment $Com([p]_{N},r)=com_{[p]_{N}}$ where $r$ is a random key known only to the User. The User sends this commitment to the Notary to be used in the authdecode protocol later.

### Proving the validity of $[p]_{N}$

In addition to computing the masked encrypted counter-block, the User must also prove that the labels $[p]_{N}$ they chose afterwards actually correspond to the ciphertext $c$ sent by the Server.

This can be done efficiently in one execution using the zero-knowledge protocol described in [JKO13], the same as in the final phase of DEAP.

The Notary garbles a circuit $G_{N}$ which computes:

$p \oplus ectr=c$

Notice that the User and Notary will already have computed $ectr$ when they computed $ectr_{z}$ earlier. Conveniently, the Notary can re-use the garbled labels $[ectr]_{N}$ as input labels for this circuit. For more details on the reuse of garbled labels see [AMR17].

# Computing MAC in 2PC

- What is a MAC
- How a MAC is computed in AES-GCM
- Computing MAC using secure two-party computation (2PC)

## 1. What is a MAC

When sending an encrypted ciphertext to the Webserver, the User attaches a checksum to it. The Webserver uses this checksum to check whether the ciphertext has been tampered with while in transit. This checksum is known as the "authentication tag" and also as the "Message Authentication Code" (MAC).

In order to create a MAC for some ciphertext, not only the ciphertext but also a secret key is used as an input. This makes it impossible to forge a valid MAC for some ciphertext without knowing the secret key.

The first few paragraphs of this article
explain what would happen if there was no MAC: it would be possible for a
malicious actor to modify the **plaintext** by flipping certain bits of the
**ciphertext**.

## 2. How a MAC is computed in AES-GCM

In TLS the plaintext is split up into chunks called "TLS records". Each TLS record is encrypted and a MAC is computed for the ciphertext. The MAC (in AES-GCM) is obtained by XORing together the `GHASH output` and the `GCTR output`. Let's see how each of those outputs is computed:

### 2.1 GCTR output

The `GCTR output` is computed by simply AES-ECB encrypting a counter block with the counter set to 1 (the IV, nonce and AES key are the same as for the rest of the TLS record).

### 2.2 GHASH output

The `GHASH output` is the output of the GHASH function described in the NIST publication in section 6.4 in this way: "In effect, the GHASH function calculates $X_{1} \cdot H^{m} \oplus X_{2} \cdot H^{m-1} \oplus \dots \oplus X_{m-1} \cdot H^{2} \oplus X_{m} \cdot H$". $H$ and $X_{i}$ are elements of the extension field $GF(2^{128})$.

- "$\cdot$" is a special type of multiplication called `multiplication in a finite field`, described in section 6.3 of the NIST publication.
- "$\oplus$" is `addition in a finite field` and it is defined as XOR.

In other words, GHASH splits up the ciphertext into 16-byte blocks, each block is numbered $X_{1},X_{2},\dots$ etc. There's also $H$, called the `GHASH key`, which is just the AES-encrypted zero-block. We need to raise $H$ to as many powers as there are blocks, i.e. if we have 5 blocks then we need 5 powers: $H,H^{2},H^{3},H^{4},H^{5}$. Each block is multiplied by the corresponding power and all products are summed together.

Below is the pseudocode for multiplying two 128-bit field elements `x` and `y` in $GF(2^{128})$:

```
result = 0
R = 0xE1000000000000000000000000000000
bit_length = 128
for i = 0 upto bit_length - 1:
    if y[i] == 1:
        result ^= x
    x = (x >> 1) ^ ((x & 1) * R)
return result
```

Standard math properties hold in finite field math, viz. commutative: $a+b=b+a$ and distributive: $a(b+c)=ab+ac$.
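A direct Python translation of this pseudocode, with bit `i` of `y` counted from the most significant bit as GCM specifies. This is a sketch for illustration, not TLSNotary's implementation.

```python
R = 0xE1000000000000000000000000000000  # GCM reduction constant

def gf_mult(x: int, y: int) -> int:
    """Multiply two elements of GF(2^128) as used by GHASH."""
    result = 0
    for i in range(128):
        if (y >> (127 - i)) & 1:   # bit i of y, counting from the MSB
            result ^= x
        # Shift x one step; reduce with R if a bit falls off the low end.
        x = (x >> 1) ^ ((x & 1) * R)
    return result

# The multiplicative identity in this bit ordering is the MSB-only value.
ONE = 1 << 127
assert gf_mult(0x1234, ONE) == 0x1234

# Commutative and distributive, as noted above:
a, b, c = 0xDEADBEEF, 0xCAFEBABE, 0x123456789
assert gf_mult(a, b) == gf_mult(b, a)
assert gf_mult(a, b ^ c) == gf_mult(a, b) ^ gf_mult(a, c)
```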

## 3. Computing MAC using secure two-party computation (2PC)

The goal of the protocol is to compute the MAC in such a way that neither party would learn the other party's share of $H$, i.e. the `GHASH key` share. At the start of the protocol each party has:

1. ciphertext blocks $X_{1},X_{2},...,X_{m}$.
2. an XOR share of $H$: the `User` has $H_{u}$ and the `Notary` has $H_{n}$.
3. an XOR share of the `GCTR output`: the `User` has $GCTR_{u}$ and the `Notary` has $GCTR_{n}$.

Note that **2.** and **3.** were obtained at an earlier stage of the TLSNotary protocol.

### 3.1 Example with a single ciphertext block

To illustrate what we want to achieve, we consider the case of just having a single ciphertext block $X_{1}$. The `GHASH output` will be:

$X_{1} \cdot H=X_{1} \cdot (H_{u} \oplus H_{n})=X_{1} \cdot H_{u} \oplus X_{1} \cdot H_{n}$

The `User` and the `Notary` will locally compute the left and the right terms respectively. Then each party will XOR their result with their `GCTR output` share and will get their XOR share of the MAC:

`User`: $X_{1} \cdot H_{u} \oplus GCTR_{u}=MAC_{u}$

`Notary`: $X_{1} \cdot H_{n} \oplus GCTR_{n}=MAC_{n}$

Finally, the `Notary` sends $MAC_{n}$ to the `User`, who obtains:

$MAC=MAC_{n} \oplus MAC_{u}$
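The single-block flow can be simulated end to end with random stand-in values, reusing the field multiplication from the pseudocode above:

```python
import secrets

R_POLY = 0xE1000000000000000000000000000000  # GCM reduction constant

def gf_mult(x: int, y: int) -> int:
    """GF(2^128) multiplication as used by GHASH (bit 0 = MSB)."""
    result = 0
    for i in range(128):
        if (y >> (127 - i)) & 1:
            result ^= x
        x = (x >> 1) ^ ((x & 1) * R_POLY)
    return result

X1 = secrets.randbits(128)                              # a single ciphertext block
H_u, H_n = secrets.randbits(128), secrets.randbits(128) # XOR shares of the GHASH key
GCTR_u, GCTR_n = secrets.randbits(128), secrets.randbits(128)  # XOR shares of GCTR output

# Each party computes its MAC share locally.
MAC_u = gf_mult(X1, H_u) ^ GCTR_u
MAC_n = gf_mult(X1, H_n) ^ GCTR_n

# The Notary sends MAC_n to the User, who combines the shares.
MAC = MAC_u ^ MAC_n

# The combined value equals the MAC computed from the full secrets.
H, GCTR = H_u ^ H_n, GCTR_u ^ GCTR_n
assert MAC == gf_mult(X1, H) ^ GCTR
```

This works because field multiplication distributes over XOR, so $X_{1} \cdot H_{u} \oplus X_{1} \cdot H_{n} = X_{1} \cdot H$.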

**For longer ciphertexts, the problem is that higher powers of the hashkey $H^{k}$ cannot be computed locally, because we deal with additive sharings, i.e. $(H_{u})^{k} \oplus (H_{n})^{k} \neq H^{k}$.**

### 3.2 Computing ciphertexts with an arbitrary number of blocks

We now introduce our 2PC MAC protocol for computing ciphertexts with an arbitrary number of blocks. Our protocol can be divided into the following steps.

##### Steps

1. First, both parties convert their **additive** shares $H_{u}$ and $H_{n}$ into **multiplicative** shares $H_{u}^{*}$ and $H_{n}^{*}$.
2. This allows each party to **locally** compute the needed higher powers of these multiplicative shares, i.e. for $m$ blocks of ciphertext:
   - the user computes $H_{u}^{*2},H_{u}^{*3},\dots,H_{u}^{*m}$
   - the notary computes $H_{n}^{*2},H_{n}^{*3},\dots,H_{n}^{*m}$
3. Then both parties convert each of these multiplicative shares back to additive shares:
   - the user ends up with $H_{u}^{2},H_{u}^{3},\dots,H_{u}^{m}$
   - the notary ends up with $H_{n}^{2},H_{n}^{3},\dots,H_{n}^{m}$
4. Each party can now **locally** compute their additive MAC share $MAC_{n/u}$.

The conversion steps (**1** and **3**) require communication between the user
and the notary. They will use **A2M** (Addition-to-Multiplication) and **M2A**
(Multiplication-to-Addition) protocols, which make use of **oblivious
transfer**, to convert the shares. **The user will be the sender and the notary
the receiver.**

#### 3.2.1 (A2M) Convert additive shares of H into multiplicative share

At first (step **1**) we have to get multiplicative shares $H_{n/u}^{*}$, so that notary and user can locally compute the needed higher powers. For this we use an adapted version of the A2M protocol in chapter 4 of *Efficient Secure Two-Party Exponentiation*.

The user will decompose his share into $i$ individual oblivious transfers $t_{u,i}=R \cdot \left(k \cdot 2^{i} \oplus H_{u,i} \cdot 2^{i} \oplus s_{i}\right)$, where

- $R$ is some random value used for all oblivious transfers
- $s_{i}$ is a random mask used per oblivious transfer, with $\sum_{i} s_{i}=0$
- $k \in \{0,1\}$ depending on the receiver's choice.

The notary's choice in the $i$-th OT will depend on the bit value in the $i$-th position of his additive share $H_{n}$. In the end the multiplicative share of the user $H_{u}^{*}$ will simply be the inverse $R^{-1}$ of the random value, and the notary will sum all his OT outputs, so that all the $s_{i}$ will vanish and hence he gets his multiplicative share $H_{n}^{*}$.

$H=H_{u} \oplus H_{n}=R^{-1} \cdot R \cdot \sum_{i}\left((H_{u,i} \oplus H_{n,i}) \cdot 2^{i} \oplus s_{i}\right)=R^{-1} \cdot \sum_{i}t_{u,i}=\underbrace{R^{-1}}_{H_{u}^{*}} \cdot \underbrace{\sum_{i}t_{u,i}}_{H_{n}^{*}}$

#### 3.2.2 (M2A) Convert multiplicative shares of $H^{k}$ into additive shares

In step **3** of our protocol, we use the oblivious transfer method described
in chapter 4.1 of the Gilboa paper *Two Party RSA Key Generation*
to convert all the multiplicative shares $H_{n/u}^{*}$ back into
additive shares $H_{n/u}$. We only show how the method works for the share
$H_{n/u}^{*}$, because it is the same for higher powers.

The user will be the OT sender and decompose his shares into $i$ individual oblivious transfers $t_{u,i}=k \cdot H_{u}^{*} \cdot 2^{i} \oplus s_{i}$, where $k \in \{0,1\}$ depending on the receiver's choices. Each of these OTs is masked with a random value $s_{i}$. He will then obliviously send them to the notary. Depending on the binary representation of his multiplicative share, the notary will make his choice in each of the 128 oblivious transfers.

After that the user will locally XOR all his $s_{i}$ and end up with his additive share $H_{u}$, and the notary will do the same for all the results of the oblivious transfers and get $H_{n}$.

$H=H_{u}^{*} \cdot H_{n}^{*}=H_{u}^{*} \cdot \sum_{i}H_{n,i}^{*} \cdot 2^{i}=\sum_{i}\left(H_{n,i}^{*} \cdot H_{u}^{*} \cdot 2^{i} \oplus s_{i}\right) \oplus \sum_{i}s_{i}=\sum_{i}t_{u,i} \oplus \sum_{i}s_{i} \equiv H_{n} \oplus H_{u}$
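A sketch of this Gilboa-style conversion with the oblivious transfers simulated in the clear (a real OT hides the unchosen message from the notary and the choice bits from the user); `gf_mult` follows the GHASH pseudocode from earlier:

```python
import secrets

R_POLY = 0xE1000000000000000000000000000000  # GCM reduction constant

def gf_mult(x: int, y: int) -> int:
    """GF(2^128) multiplication as used by GHASH (bit 0 = MSB)."""
    result = 0
    for i in range(128):
        if (y >> (127 - i)) & 1:
            result ^= x
        x = (x >> 1) ^ ((x & 1) * R_POLY)
    return result

# Multiplicative shares held by user and notary.
H_u_star = secrets.randbits(128)
H_n_star = secrets.randbits(128)

# User (OT sender): one masked message pair per bit position i.
s = [secrets.randbits(128) for _ in range(128)]
ot_pairs = [
    (s[i],                               # choice k = 0
     gf_mult(H_u_star, 1 << i) ^ s[i])   # choice k = 1
    for i in range(128)
]

# Notary (OT receiver): chooses per the bits of his multiplicative share.
t = [ot_pairs[i][(H_n_star >> i) & 1] for i in range(128)]

# New additive shares: user XORs his masks, notary XORs his OT outputs.
H_u_add, H_n_add = 0, 0
for si in s:
    H_u_add ^= si
for ti in t:
    H_n_add ^= ti

# The new additive sharing reconstructs the product of the old shares.
assert H_u_add ^ H_n_add == gf_mult(H_u_star, H_n_star)
```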

### 3.3 Free Squaring

In the actual implementation of the protocol we only compute odd multiplicative
shares, i.e. $H,H^{3},H^{5},\dots$, so that
we only need to share these odd shares in step **3**. This is possible because
we can compute even additive shares from odd additive shares. We observe that
for even $k$:

$H^{k}=(H_{n}^{k/2} \oplus H_{u}^{k/2})^{2}=(H_{n}^{k/2})^{2} \oplus H_{n}^{k/2}H_{u}^{k/2} \oplus H_{u}^{k/2}H_{n}^{k/2} \oplus (H_{u}^{k/2})^{2}=(H_{n}^{k/2})^{2} \oplus (H_{u}^{k/2})^{2}=H_{n}^{k} \oplus H_{u}^{k}$

So we only need to convert odd multiplicative shares into odd additive shares, which gives us a 50% reduction in cost. The remaining even additive shares can then be computed locally.
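Because the cross terms cancel in characteristic 2, the squaring identity can be verified directly with the field multiplication from the earlier pseudocode:

```python
import secrets

R_POLY = 0xE1000000000000000000000000000000  # GCM reduction constant

def gf_mult(x: int, y: int) -> int:
    """GF(2^128) multiplication as used by GHASH (bit 0 = MSB)."""
    result = 0
    for i in range(128):
        if (y >> (127 - i)) & 1:
            result ^= x
        x = (x >> 1) ^ ((x & 1) * R_POLY)
    return result

# Additive shares of the GHASH key H.
H_u = secrets.randbits(128)
H_n = secrets.randbits(128)
H = H_u ^ H_n

# In characteristic 2, squaring distributes over XOR:
# (H_u + H_n)^2 = H_u^2 + H_n^2, so even powers are computed locally.
H_sq = gf_mult(H, H)
assert H_sq == gf_mult(H_u, H_u) ^ gf_mult(H_n, H_n)
```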

### 3.4 Creating a robust protocol

Both the A2M and M2A protocols on their own only provide semi-honest security. They are secure against a malicious receiver, but the sender has degrees of freedom to cause leakage of the MAC keyshares. However, for our purposes this does not present a problem as long as leakage is detected.

To detect a malicious sender, we require the sender to commit to the PRG seed used to generate the random values in the share conversion protocols. After the TLS session is closed the MAC keyshares are no longer secret, which allows the sender to reveal this seed to the receiver. Subsequently, the receiver can perform a consistency check to make sure the sender followed the protocol honestly.

#### 3.4.1 Malicious notary

The protocol is secure against a malicious notary because he is the OT receiver, which means that there is actually no input from him during the protocol execution except for the final MAC output. He just receives the OT outputs from the user, so the only thing he can do is provide a wrong MAC keyshare. This will cause the server to reject the MAC when the user sends the request, and the protocol simply aborts.

#### 3.4.2 Malicious user

A malicious user could manipulate what he sends in the OT and potentially endanger the security of the protocol by leaking the notary's MAC keyshare. To address this, we force the user to reveal his MAC keyshare after the server response, so that the notary can check the correctness of the whole 2PC MAC protocol. If the notary detects that the user cheated, he simply aborts the protocol.

The only problem with this approach is that the whole TLSNotary protocol must work under the assumption that the notary can intercept the traffic between the user and the server. This would allow the notary to trick the user into thinking that the TLS session is already terminated, provided he can force the server to respond. The user would send his MAC key share too early, and the notary, now having the complete MAC key, could forge the ciphertext and create a valid MAC for it. He would then send this forged request to the server and forward the server's response to the user.

To prevent this scenario we need to make sure that the TLS connection to the server is terminated before the user sends his MAC key share to the notary. Following the TLS RFC, we leverage `close_notify` to ensure all messages sent to the server have been processed and the connection is closed. Unfortunately, many server TLS implementations do not support `close_notify`. In these cases we instead send an invalid message to the server, which forces it to respond with a fatal alert message and close the connection.

# Glossary

| Term | Explanation |
|---|---|
| 2PC | Secure Two-Party Computation |
| A2M | Addition-to-Multiplication |
| AES | Advanced Encryption Standard |
| DEAP | Dual Execution with Asymmetric Privacy |
| ECB | Electronic Codebook (encryption mode) |
| ECDH | Elliptic-Curve Diffie-Hellman |
| GC | Garbled Circuit |
| GCM | Galois/Counter Mode |
| GHASH | GCM hash |
| HMAC | Hash-based Message Authentication Code |
| M2A | Multiplication-to-Addition |
| MAC | Message Authentication Code |
| MPC | Secure Multi-Party Computation |
| OT | Oblivious Transfer |
| PMS | Pre-Master Secret (TLS) |
| PRF | Pseudorandom Function |
| PRG | Pseudorandom Generator |
| PSE | Privacy and Scaling Explorations |
| RSA | Rivest–Shamir–Adleman (public-key cryptosystem) |
| TLS | Transport Layer Security |