148 Statically Verified Refinements for Multiparty Protocols FANGYI ZHOU, Imperial College London, United Kingdom FRANCISCO FERREIRA, Imperial College London, United Kingdom RAYMOND HU, University of Hertfordshire, United Kingdom RUMYANA NEYKOVA, Brunel University London, United Kingdom NOBUKO YOSHIDA, Imperial College London, United Kingdom With distributed computing becoming ubiquitous in the modern era, safe distributed programming is an open challenge. To address this, multiparty session types (MPST) provide a typing discipline for message-passing concurrency, guaranteeing communication safety properties such as deadlock freedom. While originally MPST focus on the communication aspects, and employ a simple typing system for communication payloads, communication protocols in the real world usually contain constraints on the payload. We introduce refined multiparty session types (RMPST), an extension of MPST, that express data dependent protocols via refinement types on the data types. We provide an implementation of RMPST, in a toolchain called Session⋆, using Scribble, a toolchain for multiparty protocols, and targeting F⋆, a verification-oriented functional programming language. Users can describe a protocol in Scribble and implement the endpoints in F⋆ using refinement-typed APIs generated from the protocol. The F⋆ compiler can then statically verify the refinements. Moreover, we use a novel approach of callback-styled API generation, providing static linearity guarantees with the inversion of control. We evaluate our approach with real world examples and show that it has little overhead compared to a naïve implementation, while guaranteeing safety properties from the underlying theory. CCS Concepts: • Theory of computation→ Automated reasoning; Distributed computing models; • Soft- ware and its engineering→ Source code generation. ACM Reference Format: Fangyi Zhou, Francisco Ferreira, Raymond Hu, Rumyana Neykova, and Nobuko Yoshida. 2020. Statically Veri- fied Refinements for Multiparty Protocols. Proc. ACM Program. Lang. 4, OOPSLA, Article 148 (November 2020), 30 pages. https://doi.org/10.1145/3428216 1 INTRODUCTION Distributed interactions and message passing are fundamental elements of the modern computing landscape. Unfortunately, language features and support for high-level and safe programming of communication-oriented and distributed programs are much lacking, in comparison to those enjoyed for more traditional “localised” models of computation. One of the research directions towards addressing this challenge is concurrent behavioural types [Ancona et al. 2016; Gay and Ravara 2017], which seek to extend the benefits of conventional type systems, as the most successful form of lightweight formal verification, to communication and concurrency. Authors’ addresses: Fangyi Zhou, Imperial College London, United Kingdom, fangyi.zhou15@imperial.ac.uk; Francisco Ferreira, Imperial College London, United Kingdom, f.ferreira-ruiz@imperial.ac.uk; RaymondHu, University of Hertfordshire, United Kingdom, r.z.h.hu@herts.ac.uk; Rumyana Neykova, Brunel University London, United Kingdom, rumyana.neykova@ brunel.ac.uk; Nobuko Yoshida, Imperial College London, United Kingdom, n.yoshida@imperial.ac.uk. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org. © 2020 Copyright held by the owner/author(s). Publication rights licensed to ACM. 2475-1421/2020/11-ART148 https://doi.org/10.1145/3428216 Proc. ACM Program. Lang., Vol. 4, No. OOPSLA, Article 148. Publication date: November 2020. 148:2 Fangyi Zhou, Francisco Ferreira, Raymond Hu, Rumyana Neykova, and Nobuko Yoshida G =A→ B : Count(count : int{count ≥ 0}). µt(curr : int{curr ≥ 0 ∧ curr ≤ count})⟨curr := 0⟩. B→ C { Hello(it : int{curr < count ∧ it = count}).t⟨curr := curr + 1⟩ Finish(it : int{curr = count ∧ it = count}).end } A Global Type G Projection onto each Participant Local Type for A LA Local Type for B LB Local Type for C LC Fig. 1. Top-down View of (R)MPST Multiparty session types (MPST) [Honda et al. 2008, 2016], one of the most active topics in this area, offer a theoretical framework for specifying message passing protocols between multiple participants. MPST use a type system–based approach to statically verify whether a system of processes implements a given protocol safely. The type system guarantees key execution properties such as freedom from message reception errors or deadlocks. However, despite much recent progress, there remain large gaps between the current state of the art and (i) powerful and practical languages and techniques available to programmers today, and (ii) more advanced type disciplines needed to express a wider variety of constraints of interaction found in real-world protocols. This paper presents and combines two main developments: a theory of MPST enriched with refinement types [Freeman and Pfenning 1991], and a practical method, callback-based programming, for safe session programming. The key ideas are as follows: Refined Multiparty Session Types (RMPST). The MPST theory [Honda et al. 2008, 2016] provides a core framework for decomposing (or projecting) a global type structure, describing the collective behaviour of a distributed system, into a set of participant-specific local types (see Fig. 1). The local types are then used to implement endpoints. Our theory of RMPST follows the same top-down methodology, but enriches MPST with features from refinement types [Freeman and Pfenning 1991], to support the elaboration of data types in global and local types. Refinement types allow refinements in the form of logical predicates and constraints to be attached to a base type. This allows to express various constraints in protocols. To motivate our refinement type extension, we use a counting protocol shown in Fig. 1, and leave the details to § 4. Participant A sends B a number with a Count message. In this message, the refinement type count : int{count ≥ 0} restricts the value for count to be a natural number. Then B sends C exactly that number of Hello messages, followed by a Finish message. We demonstrate how refinement types are used to better specify the protocol: The counting part of the protocol is described using a recursive type with two branches, where we use refinement types to restrict the protocol flow. The variable curr is a recursion variable, which remembers the current iteration, initialised to 0, and increments on each recursion (curr := curr + 1). The refinement curr = count in the Finish branch specifies that the branch may only be taken at the last iteration; the refinement it = count in bothHello and Finish branches specifies a payload value dependent on the recursion variable curr and the variable count transmitted in the first message. We establish the correctness of Refined MPST. In particular, we show that projection is behaviour- preserving and that well-formed global types with refinements satisfy progress, i.e. they do not get stuck. Therefore, if the endpoints follow the behaviour prescribed by the local types, derived (via projection) from a well-formed global type with refinements, the system is deadlock-free. Callback-styled, Refinement-typedAPIs for Endpoint Implementations. One of the main challenges in applying session types in practice is dealing with session linearity: a session channel Proc. ACM Program. Lang., Vol. 4, No. OOPSLA, Article 148. Publication date: November 2020. Statically Verified Refinements for Multiparty Protocols 148:3 is used once and only once. Session typing relies on a linear treatment of communication channels, in order to track the I/O actions performed on the channel against the intended session type. Most existing implementations adopt one of two approaches: monadic interfaces in functional lan- guages [Imai et al. 2020, 2019; Orchard and Yoshida 2017], or “hybrid” approaches that complement static typing with dynamic linearity checks [Hu and Yoshida 2016; Scalas et al. 2017]. This paper proposes a fresh look to session-based programming that does not require linearity checking for static safety. We promote a form of session programming where session I/O is implicitly implemented by callback functions — we say “implicitly” because the user does not write any I/O operations themself: an input callback is written to take a received message as a parameter, and an output callback is written to simply return the label and value of the message to be sent. The callbacks are supported by a runtime, generated along with APIs in refinement types according to the local type. The runtime performs communication and invokes user-specified callback functions upon corresponding communication events. We provide a code generation tool to streamline the writing of callback functions for the projected local type. The inversion of control allows us to dispense with linearity checking, because our approach does not expose communication channels to the user. Our approach is a natural fit to functional programming settings, but also directly applicable to any statically typed language. Moreover, the linearity guarantee is achieved statically without the use of a linear type system, a feature that is usually not supported by mainstream programming languages. We follow the principle of event-based programming via the use of callbacks, prevalent in modern days of computing. A Toolchain Implementation: Session⋆. To evaluate our proposal, we implement RMPST with a toolchain — Session⋆, as an extension to the Scribble toolchain [Hu 2017; Scribble Authors 2015] (http://www.scribble.org/) with F⋆ [Swamy et al. 2016] as the target endpoint language. Building on our callback approach, we show how to integrate RMPST with the verification- oriented functional programming language F⋆, exploiting its capabilities of refinement types and static verification to extend our fully static safety guarantees to data refinements in sessions. Our experience of specifying and implementing protocols drawn from the literature and real-world applications attests to the practicality of our approach and the value of statically verified refinements. Our integration of RMPST and F⋆ allows developers to utilise advanced type system features to implement safe distributed application protocols. Paper Summary and Contributions. § 2 We present an overview of our toolchain: Session⋆, and provide background knowledge of Scribble and MPST. We use a number guessing game, HigherLower, as our running example. § 3 We introduce Session⋆, a toolchain for RMPST. We describe in detail how our generated APIs can be used to implement multiparty protocols with refinements. § 4 We establish the metatheory of RMPST, which gives semantics of global and local types with refinements. We prove trace equivalence of global and local types w.r.t. projection (Theorem 4.10), and show progress and type safety of well-formed global types (Theorem 4.14 and Theorem 4.15). § 5 We evaluate our toolchain with examples from the session type literature, and measure the time taken for compilation and execution. We show that our toolchain does not have a long compilation time, and our runtime does not incur a large overhead on execution time. The accompanying artifact [Zhou et al. 2020] contains the source code of our toolchain Session⋆, with examples and benchmarks used in the evaluation. The artifact is available as a Docker image, and can be accessed on the Docker Hub. The source files are available on GitHub (https://github. com/sessionstar/oopsla20-artifact). We present the proof of our theorems, and additional technical details of the toolchain, in the full version of the paper (https://arxiv.org/abs/2009.06541). Proc. ACM Program. Lang., Vol. 4, No. OOPSLA, Article 148. Publication date: November 2020. 148:4 Fangyi Zhou, Francisco Ferreira, Raymond Hu, Rumyana Neykova, and Nobuko Yoshida 1 2 3 4 5 6 7 global protocol HigherLower (role A, role B, role C) { start(n0:int) from A to B; ... } type state = ... type callbacks = ... type conn = ... let run callbacks conn = ... let callbacks = ... let connection = ... let () = run callbacks connection Extracted OCaml Program program.ml Projection via Scribble Generates Implements Extracts into Scribble Protocol (§ 2.2) Endpoint Implementation (§ 3.2) CFSM Representation F⋆ API User Input Internal/Generated Fig. 2. Overview of Toolchain 2 OVERVIEW OF REFINED MULTIPARTY SESSION TYPES In this section, we give an overview of our toolchain: Session⋆, describing its key toolchain stages. Session⋆ extends the Scribble toolchain with refinement types and uses F⋆ as a target language. We begin with a short background on basic multiparty session types and Scribble, then demonstrate the specification of distributed applications with refinements using the extended Scribble. 2.1 Toolchain Overview We present an overview of our toolchain in Fig. 2, where we distinguish user provided input by developers in solid boxes , from generated code or toolchain internals in dashed boxes . Development begins with specifying a protocol using an extended Scribble protocol description language. Scribble is closely associatedwith theMPST theory [Hu 2017; Neykova and Yoshida 2019], and provides a user-friendly syntax for multiparty protocols. We extend the Scribble toolchain to support RMPST, allowing refinements to be added via annotations. The extended Scribble toolchain (as part of Session⋆) validates the well-formedness of the protocol, and produces a representation in the form of a communicating finite state machine (CFSM, [Brand and Zafiropulo 1983]) for a given participant. We then use a code generator (also as part of Session⋆) to generate F⋆ APIs from the CFSM, utilising a number of advanced type system features available in F⋆ (explained later in § 3.1). The generated APIs, described in detail in § 3, consist of various type definitions, and an entry point function taking callbacks and connections as arguments. In our design methodology, we separate the concern of communication and program logic. The callbacks, corresponding to program logic, do not involve communication primitives — they are invoked to prompt a value to be sent, or to process a received value. Separately, developers provide a connection that allows base types to be serialised and transmitted to other participants. Developers implement the endpoint by providing both callbacks and connections, according to the generated refinement typed APIs. They can run the protocol by invoking the generated entry point. Finally, the F⋆ source files can be verified using the F⋆ compiler, and extracted to an OCaml program (or other supported targets) for efficient execution. Proc. ACM Program. Lang., Vol. 4, No. OOPSLA, Article 148. Publication date: November 2020. Statically Verified Refinements for Multiparty Protocols 148:5 1 global protocol HigherLower(role A, role B, role C) { 2 // A tells B a secret number `n0`, 3 // and the number `t0` of attempts that C has to guess it 4 start(n0:int) from A to B; @'0≤n0<100' 5 limit(t0:int) from A to B; @'0x ∧ t>1' // Secret is higher 10 higher() from B to A; 11 do Aux(A, B, C); @'B[n, t-1]' 12 } or { win() from B to C; @'n=x' // C wins, A loses 13 lose() from B to A; 14 } or { lower() from B to C; @'n1' // Secret is lower 15 lower() from B to A; 16 do Aux(A, B, C); @'B[n, t-1]' 17 } or { lose() from B to C; @'n,x ∧ t=1' // A wins, C loses 18 win() from B to A; 19 } } Fig. 3. A Refined Scribble Global Protocol for a HigherLower Game. 2.2 Global Protocol Specification — RMPST in Extended Scribble The workflow in the standardMPST theory [Honda et al. 2008], as is generally the case in distributed application development, starts from identifying the intended protocol for participant interactions. In our toolchain, a global protocol—the description of the whole protocol between all participants from a bird eye’s view—is specified using our RMPST extension of the Scribble protocol description language [Hu 2017; Scribble Authors 2015]. Figure 3 gives the global protocol for a three-party game, HigherLower, which we use as a running example. Basic Scribble/MPST. We first explain basic Scribble (corresponding to the standard MPST) without the @-annotations (annotations are extensions to the basic Scribble). (1) The main protocol HigherLower declares three roles A, B and C, representing the runtime communication session participants. The protocol starts with A sending B a start message and a limit message, each carrying an int payload. (2) The do construct specifies all three roles to proceed according to the (recursive) Aux sub-protocol. C sends B a guess message, also carrying an int. (The aux keyword simply tells Scribble that a sub-protocol does not need to be verified as a top-level entry protocol.) (3) The choice at B construct specifies at this point that B should decide (make an internal choice) by which one of the four cases the protocol should proceed. This decision is explicitly communicated (as an external choice) to A and C via the messages in each case. The higher and lower cases are the recursive cases, leading to another round of Aux (i.e. another guess by C); the other two cases, win and lose, end the protocol. To sum up, A sends B two numbers, and C sends a number (at least one) to B for as long as B replies with either higher or lower to C (and A). Next we demonstrate how we can express data dependencies using refinements with our extended Scribble. Extended Scribble/RMPST. As described above, a basic global protocol (equivalent to a stand- ard MPST global type) specifies the structure of participant interactions, but is not very informative Proc. ACM Program. Lang., Vol. 4, No. OOPSLA, Article 148. Publication date: November 2020. 148:6 Fangyi Zhou, Francisco Ferreira, Raymond Hu, Rumyana Neykova, and Nobuko Yoshida about the behaviour of the application in the aspect of data transmitted. This limitation can be contrasted against standardised (but informal) specifications of real-world application protocols, e.g. HTTP, OAuth, where a significant part, if not the majority, of the specification is devoted to the data side of the protocol. It goes without saying, details missing from the specification cannot be verified on an implementation. We go through Fig. 3 again, this time including the practical extensions proposed by this paper to address these limitations: RMPST enables a refinement type–based treatment of protocols, capturing and integrating both data constraints and interactions. While refinement types by themselves can already greatly enrich the specification of individual messages, the most valuable and interesting aspect is the interplay between data refinements and the consequent interactions (i.e. the protocol flow) in the distributed, multiparty setting. In our setup, we allow data-dependent refinements to be specified in the global protocol, we explain various ways of using them in the running example: • Message Values. A basic use of refinements is on message values, specifically their payload contents. The annotation on the first interaction (Line 4) specifies that A and B not only com- municate an n0:int, but that 0≤n0<100 is a postcondition for the value produced by A and a precondition on the value received by B. Similarly, the int carried by limit must be positive. • Local Protocol State. RMPST also supports refinements on the recursions that a protocol trans- itions through. The B[...] annotation (Line 7) in the Aux header specifies the local state known by B during the recursion, whenever B reaches this point in the ongoing protocol. The local state includes an int n such that 0≤n<100 and an int t such that 01 in the higher and lower refinements are necessary to ensure that the 0 x ∧ t > 1} expresses a protocol flow refinement on an output of a higher message to C. For brevity, we omit the payload data types in the CFSM edges, as this example features only ints; we omit empty payloads “()” likewise. We show the local state refinements as annotations on the corresponding CFSM states (shaded in grey, with an arrow to the state). Refined API Generation for B. CFSMs offer an intuitive understanding of the semantics of endpoint projections. Building on recent work [Castro et al. 2019; Hu and Yoshida 2016; Neykova Proc. ACM Program. Lang., Vol. 4, No. OOPSLA, Article 148. Publication date: November 2020. Statically Verified Refinements for Multiparty Protocols 148:9 1 2 3 n{0 ≤ n < 100}, t{0< t} 5 6 4 7 8 A?start(n0){0 ≤ n0 < 100} A?limit(t0){0 ≤ t0} C?guess(x){0 ≤ x< 100} C!higher{n> x ∧ t> 1}A!higher C!lower{n< x ∧ t> 1}A!lower C!win{n= x} A!lose C!lose{n, x ∧ t= 1} A!win (a) CFSM Representation of the Projection. ! stands for sending actions, and ? for receiving actions on edges. Generated F⋆ API State Edge Generated Callback Type 1 A?start s1 → (n:int{0≤n<100}) → ML unit 2 A?limit s2 → (t:int{01} | s4_lose of unit{s.n,s.x ∧ s.t=1} | s4_win of unit{s.n=s.x} | s4_higher of unit{s.n>s.x ∧ s.t>1} (c) Generated Data Type for the Output Choice Fig. 4. Projection and F⋆ API Generation for B in HigherLower et al. 2018], we use our CFSM-based representation of refined projections to generate protocol- and role-specific APIs for implementing each role in F⋆. We highlight a novel and crucial development: we exploit the approach of type generation to produce functional-style callback-based APIs that internalise all of the actual communication channels and I/O actions. In short, the transitions of the CFSM are rendered as a set of transition-specific function types to be implemented by the user — each of these functions take and return only the user-level data related to I/O actions and the running of the protocol. The transition function of the CFSM itself is embedded into the API by the generation, exporting a user interface to execute the protocol by calling back the appropriate user-supplied functions according to the current CFSM state and I/O event occurrences. We continue with our example, Fig. 4b lists the function types for B, detailed as follows. Note, a characteristic of MPST-based CFSMs is that each non-terminal state is either input- or output-only. • State Types. For each state, we generate a separate type (named by enumerating the states, by default). Each is defined as a record containing previously known payload values and its local recursion variables, or unit if none, for example: type s3 = { n0: int{0≤n0<100}; t0: int{0 1} B?higher{n > x ∧ t > 1} B?win{n = x} B?lose{n , x ∧ t = 1} (a) CFSM Representation of the Projection User implementation (* Allocate a refined int reference *) State Edge Generated type let next: ref (x:int{0≤x<100}) = alloc 50 1 B!guess s1 → ML (x:int{0≤x<100}) fun _ → !next (*Deref next*) 2 B?higher s2 → unit{n>x ∧ t>1} → ML unit fun s → next B s.x + 1 B?lower s2 → unit{n1} → ML unit fun s → next B s.x - 1 B?win s2 → unit{n=x} → ML unit fun _ → () B?lose s2 → unit{n,x ∧ t=1} → ML unit fun _ → () (b) Generated I/O Callback Types Fig. 5. Projection and F⋆ API Generation for C in HigherLower let main () = (* connect to B via TCP *) let server_B = connect ip_B port_B in (* Setup connection from TCP *) let conn = mk_conn server_B in (* Invoke the Entry Point `run` *) let () = run callbacks conn in (* Close TCP connection *) close server_B (a) Running the Endpoint C (* Signature (s:s4) → ML (s4Cases s) *) fun (s:s4) → (* Win if guessed number is correct *) if s.x=s.n then s4_win () (* Lose if running out of attempts *) else if s.t=1 then s4_lose () (* Otherwise give hints accordingly *) else if s.n>s.x then s4_higher () else s4_lower () (b) Implementing the Internal Choice for B Fig. 6. Selected Snippets of Endpoint Implementation the lose case altogether would break both the lower and higher cases, as the type checker would not be able to prove s.t>1 as required by the subsequent constructors. Lastly, Fig. 5b implements C to guess the secret by a simple search, given we know its value is bounded within the specified interval. We draw attention to the input callback for higher, where we adjust the next value. Given that the value being assigned is one more than the existing value, it might have been the case that the new value falls out of the range (in the case where next is 99), hence violating the prescribed type. However, despite that the value of n is unknown, we have known from the refinement attached to the edge that n>x holds, hence it must have been the case that our last guess x is strictly less than the secret n, which rules out the possibility that x can be 99 (the maximal value of n). Had the refinement and the erased variable not been present, the type-checker would not be able to accept this implementation, and it demonstrates that our encoding allows such reasoning with latent information from the protocol. Moreover, the type and effect system of F⋆ prevents the erased variables from being used in the callbacks. On one hand, int and erased int types are not compatible, because they are not the same type. This prevents an irrelevant variable from being used in place of a concrete variable. On the other hand, the function reveal converts a value of erased 'a to a value of 'a with Ghost effect. A function with Ghost effect cannot be mixed with a function with ML effect (as in the case of our callbacks), so irrelevant variables cannot be used in the implementation via the reveal function. Proc. ACM Program. Lang., Vol. 4, No. OOPSLA, Article 148. Publication date: November 2020. 148:12 Fangyi Zhou, Francisco Ferreira, Raymond Hu, Rumyana Neykova, and Nobuko Yoshida Interested readers are invited to try the running example out with our accompanying artifact. We propose a few modifications on the implementation code and the protocol, and invite the readers to observe errors when implementations no longer correctly conforms to the prescribed protocol. 3.4 Executing the Communicating Finite State Machine (Generated Code) As mentioned earlier, our API design sepearates the concern of program logic (with callbacks) and communication (with connections). A crucial piece of the generated code involves threading the two parts together — the execution function performs the communications actions and invokes the appropriate callbacks for handling. In this way, we do not expose explicit communication channels, so linearity can be achieved with ease by construction in our generated code. The entry point function, named run, takes callbacks and connections as arguments, and ex- ecutes the CFSM for the specified endpoint. The signature uses the permissive ML effect, since communicating with the external world performs side effects. We traverse the states (the set of states is denoted Q) in the CFSM and generate appropriate code depending on the nature of the state and its outgoing transitions. Internally, we define mutually recursive functions for each state q ∈ Q, named runq , taking the state record ⟦q⟧ as argument (⟦q⟧ stands for the state record for a given state q), which performs the required actions at state q. The run state function for a state q either (1) invokes callbacks and communication primitives, then calls the run state function for the successor state q′, or (2) returns directly for termination if q is a terminal state (without outgoing transitions). The main entry point invokes the run function for the initial state q0, starting the finite state machine. The internal run state functions are not exposed to the developer, hence it is not possible to tamper with the internal state with usual means of programming. This allows us to guarantee linearity of communication channels by construction. In the following text, we outline how to run each state, depending on whether the state is a sending state or a receiving state. Note that CFSMs constructed from local types do not have mixed states [Deniélou and Yoshida 2013, Prop. 3.1] let rec run_q (st: stateq) = let choice = callbacks.stateq_send st in match choice with | Choiceqli payload → comm.send_string q "li"; comm.send_S q payload; let st = { · · · ; xi=payload } in run_q′ st Repeat for i ∈ I (a) Template for Sending State q let rec run_q (st: stateq) = let label = comm.recv_string p () in match label with | "li" → let payload = comm.recv_S p () in callbacks.stateq_receive_li st payload; let st = { · · · ; xi=payload } in run_q′ st (b) Template for Receiving State q Fig. 7. Template for runq Running the CFSM at a Sending State. For a sending state q ∈ Q, the developer makes an internal choice on how the protocol proceeds, among the possible outgoing transitions. This is done by invoking the sending callback stateq_send with the current state record, to obtain a choice with the associated payload. We pattern match on the constructor of the label li of the message, and find the corresponding successor state q′. The label li is encoded as a string and sent via the sending primitive to q. It is followed by the payload specified in the return value of the callback, via corresponding sending primitive according to the base type with refinement erased. Proc. ACM Program. Lang., Vol. 4, No. OOPSLA, Article 148. Publication date: November 2020. Statically Verified Refinements for Multiparty Protocols 148:13 We construct a state record of ⟦q′⟧ from the existing record ⟦q⟧, adding the new field xi in the action using the callback return value. In the case of recursive protocols, we also update the recursion variable according to the definition in the protocol when constructing ⟦q′⟧. Finally, we call the run state function runq′ to continue the CFSM, effectively making the transition to state q′. Following the procedure, runq is generated as shown in Fig. 7a. Running the CFSM at a Receiving State. For a receiving state q ∈ Q, how the protocol proceeds is determined by an external choice, among the possible outgoint actions. To know what choice is made by the other party, we first receive a string and decode it into a label l , via the receiving primitive for string. Subsequently, according to the label l , we can look up the label in the possible transitions, and find the state successor q′. By invoking the appropriate receiving primitive, we obtain the payload value. We note that the receiving primitive has a return type without refinements. In order to re-attach the refinements, we use the F⋆ builtin assume to reinstate the refinements according to the protocol before using the value. According the label l received, we can call the corresponding receiving callback with the received value. This allows the developer to process the received value and perform any relevant program logic. This is followed by the same procedure for constructing the state record for the next state q′ and invoking the run function for q′. Following the procedure, runq is generated as shown in Fig. 7b. 3.5 Summary We demonstrated with our running example, HigherLower, how to implement a refined multiparty protocol with our toolchain Session⋆. Exploiting the powerful type system of F⋆, our approach has several key benefits: First, it guarantees fully static session type safety in a lightweight, practical manner — the callback-style API is portable to any statically typed language. Existing work based on code generation has considered only hybrid approaches that supplement static typing with dynamically checked linearity of explicit communication channel usages. Moreover, the separation of program logic and communication leads to a modular implementation of protocols. Second, it is well suited to functional languages like F⋆; in particular, the data-oriented nature of the user interface allows the refinements in RMPST to be directly mapped to data refinements in F⋆, allowing the refinements constraints to be discharged at the user implementation level by the F⋆ compiler — again, fully statically. Furthermore, our endpoint implementation inherits core communication safety properties such as freedom from deadlock or communication mismatches, based on the original MPST theory. We use the F⋆ type-checker to validate statically that an endpoint implementation is correctly typed with respect to the prescribed type obtained via projection of the global protocol. Therefore, the implementation benefits from additional guarantees from the refinement types. 4 A THEORY OF REFINED MULTIPARTY SESSION TYPES (RMPST) In this section, we introduce refined multiparty session types (RMPST for short). We give the syntax of types in § 4.1, extending original multiparty session types (MPST) with refinement types. We describe the refinement typing system that we use to type expressions in RMPST in § 4.2. We follow the standard MPST methodology. Global session types describe communication struc- tures of many participants (also known as roles). Local session types, describing communication structures of a single participant, can be obtained via projection (explain in § 4.3). Endpoint processes implement local types obtained from projection. We give semantics of global types and local types Proc. ACM Program. Lang., Vol. 4, No. OOPSLA, Article 148. Publication date: November 2020. 148:14 Fangyi Zhou, Francisco Ferreira, Raymond Hu, Rumyana Neykova, and Nobuko Yoshida in § 4.4, and show the equivalence of semantics with respect to projection. As a consequence, we can compose all endpoint processes implementing local types for roles in a global type, obtained via projection, to implement the global type correctly. 4.1 Syntax of Types We define the syntax of refined multiparty session types (refined MPST) in Fig. 8. We use different colours for different syntactical categories to help disambiguation, but the syntax can be understood without colours. We use pink for global types, dark blue for local types, blue for expressions, purple for base types, indigo for labels, and Teal with bold fonts for participants. S ::= int | bool | . . . Base Types T ::= x : S{E} Refinement Types E ::= x | n | op1 E | E op2 E . . . Expressions G ::= Global Types | p→ q {li (xi :Ti ).Gi }i ∈I Message | µt(x :T )⟨x := E⟩.G Recursion | t⟨x := E⟩ | end Type Var., End L ::= Local Types | p&{li (xi :Ti ).Li }i ∈I Receiving | p⊕{li (xi :Ti ).Li }i ∈I Sending | l(x :T ).L Silent Prefix | µt (x :T )⟨x := E⟩.L Recursion | t⟨x := E⟩ | end Type Var., End Fig. 8. Syntax of Refined Multiparty Session Types Value Types andExpressions. Weuse S for base types of values, ranging over integers, booleans, etc. Values of the base types must be able to be communicated. The base type S can be refined by a boolean expression, acting as a predicate on the members of the base type. A refinement type is of the form (x : S{E}). A value x of the type has base type S , and is refined by a boolean expression E. The boolean expression E acts as a predicate on the members x (possibly involving the variable x). For example, we can express natural numbers as (x : int{x ≥ 0}). We use fv(·) to denote the free variables in refinement types, expressions, etc. We consider variable x be bound in the refinement expression E, i.e. fv(x : S{E}) = fv(E) \ {x}. Where there is no ambiguity, we use the base type S directly as an abbreviation of a refinement type (x : S{true}), where x is a fresh variable, and true acts as a predicate that accepts all values. Global Session Types. Global session types (global types or protocols for short) range over G,G ′,Gi , . . . Global types give an overview of the overall communication structure. We extend the standard global types [Deniélou and Yoshida 2013] with refinement types and variable bindings in message prefixes. Extensions to the syntax are shaded in the following explanations. p→ q { li ( xi : Ti ).Gi } i ∈I is a message from p to q, which branches into one or more continu- ations with label li , carrying a payload variable xi with type Ti . We omit the curly braces when there is only one branch, like p→ q : l(x :T ). We highlight the difference from the standard syntax, i.e. the variable binding. The payload variable xi occurs bound in the continuation global type Gi , for all i ∈ I . We sometimes omit the variable if it is not used in the continuations. The free variables are defined as: fv(p→ q {li (xi :Ti ).Gi }i ∈I ) = ⋃ i ∈I fv(Ti ) ∪ ⋃ i ∈I (fv(Gi ) \ {xi }) We require that the index set I is not empty, and all labels li are distinct. To prevent duplication, we write l(x : S{E}) instead of l(x : (x : S{E})) (the first x occurs as a variable binding in the message, the second x occurs as a variable representing member values in the refinement types). We extend the construct of recursive protocols to include a variable carrying a value in the inner protocol. In this way, we enhance the expressiveness of the global types by allowing a recursion variable to be maintained across iterations of global protocols. Proc. ACM Program. Lang., Vol. 4, No. OOPSLA, Article 148. Publication date: November 2020. Statically Verified Refinements for Multiparty Protocols 148:15 The recursive global type µt( x :T )⟨ x := E ⟩.G specifies a variable x carrying type T in the recursive type, initialised with expression E. The type variable t⟨ x := E ⟩ is annotated with an assignment of expression E to variable x . The assignment updates the variable x in the current recursive protocol to expression E. The free variables in recursive type is defined as fv(µt(x :T )⟨x := E⟩.G) = fv(T ) ∪ fv(E) ∪ (fv(G) \ {x}) We require that recursive types are contractive [Pierce 2002, §21], so that recursive protocols have at least a message prefix, and protocols such as µt(x :T )⟨x := E1⟩.t⟨x := E2⟩ are not allowed. We also require recursive types to be closed with respect to type variables, e.g. protocols such as t⟨x := E⟩ alone are not allowed. WewriteG[µt(x :T ).G/t] to substitute all occurrences of type variableswith expressions t⟨x := E⟩ into µt(x :T )⟨x := E⟩.G. We write r ∈ G to say r is a participating role in the global type G. Example 4.1 (Global Types). We give the following examples of global types. (1) G1 = A→ B : Fst(x : int).B→ C : Snd(y : int{x = y}).C→ D : Trd(z : int{x = z}).end. G1 describes a protocol where A sends an int to B, and B relays the same int to C, similar for C to D. Note that we can write x = z in the refinement of z, whilst x is not known to C. (2) G2 = A→ B : Number (x : int).B→ C  Positive(unit{x > 0}).end Zero(unit{x = 0}).end Neдative(unit{x < 0}).end  G2 describes a protocol where A sends an int to B, and B tells C whether the int is positive, zero, or negative. We omit the variable here since it is not used later in the continuation. (3) G3 = µt(try : int{try ≥ 0 ∧ try ≤ 3})⟨try := 0⟩. A→ B : Password(pwd : string). B→ A  Correct(unit).end Retry(unit{try < 3}).t⟨try := try + 1⟩ Denied(unit{try = 3}).end  G3 describes a protocol where A authenticates with B with maximum 3 tries. Local Session Types. Local session types (local types for short) range over L, L′, Li , . . . Local types give a view of the communication structure of an endpoint, usually obtained from a global type. In addition to standard syntax, the recursive types are similarly extended as those of global types. Suppose the current role is q, the local type p⊕{li (xi :Ti ).Li }i ∈I describes that the role q sends a message to the partner role pwith label li (where i is selected from an index set I ), carrying payload variable xi with typeTi , and continues with Li . It is also said that the role q takes an internal choice. Similarly, the local type p&{li (xi :Ti ).Li }i ∈I describes that the role q receives a message from the partner role p. In this case, it is also said that the role q offers an external choice. We omit curly braces when there is only a single branch (as is done for global messages). We add a new syntax construct of l(x :T ).L for silent local types. We motivate this introduction of the new prefix to represent knowledge obtained from the global protocol, but not in the form of a message. Silent local types are useful to model variables obtained with irrelevant quantification [Abel and Scherer 2012; Pfenning 2001]. These variables can be used in the construction of a type, but cannot be used in that of an expression, as we explain later in § 4.2. We show an example of a silent local type later in Example 4.3, after we define endpoint projection, the process of obtaining local types from a global type. Proc. ACM Program. Lang., Vol. 4, No. OOPSLA, Article 148. Publication date: November 2020. 148:16 Fangyi Zhou, Francisco Ferreira, Raymond Hu, Rumyana Neykova, and Nobuko Yoshida [WF-Rty] Σ+, x : S ⊢ E : bool Σ ⊢ (x : S{E}) ty [TE-Var] Σ1, x ω :T , Σ2 ⊢ x : T [TE-Plus] Σ ⊢ E1 : int Σ ⊢ E2 : int Σ ⊢ E1 + E2 : (v : int{v = E1 + E2}) [TE-Sub] Σ ⊢ E : (v : S{E1}) Valid(⟦Σ⟧ ∧ ⟦E1⟧ =⇒ ⟦E2⟧) Σ ⊢ E : (v : S{E2}) [TE-Const] Σ ⊢ n : (v : int{v = n}) Fig. 9. Selected Typing Rules for Expressions in a Local Typing Context 4.2 Expressions and Typing Expressions We use E, E ′, Ei to range over expressions. Expressions consist of variables x , constants (e.g. integer literals n), and unary and binary operations. We use an SMT assisted refinement type system for typing expressions, in the style of [Rondon et al. 2008]. The simple syntax of expressions allows all expressions to be encoded into SMT logic, for deciding a semantic subtyping relation of refinement types [Bierman et al. 2012]. Typing Contexts. We define two categories of typing contexts, for use in handling global types and local types respectively. Γ ::= œ | Γ, xP : T Σ ::= œ | Σ, xθ : T θ ::= 0 | ω We annotate global and local typing contexts differently. For global contexts Γ, variables carry the annotation of a set of roles P, to denote the set of roles that have the knowledge of its value. For local contexts Σ, variables carry the annotation of their multiplicity θ . A variable with multiplicity 0 is an irrelevantly quantified variable (irrelevant variable for short), which cannot appear in the expression when typing (also denoted as x ÷T in the literature [Abel and Scherer 2012; Pfenning 2001]). Such a variable can only appear in an expression used as a predicate, when defining a refinement type. A variable with multiplicity ω is a variable without restriction. We often omit the multiplicity ω. Well-formedness. Since a refinement type can contain free variables, it is necessary to define well-formedness judgements on refinement types, and henceforth on typing contexts. We define Σ+ to be the local typing context where all irrelevant variables x0 become unrestricted xω , i.e. (œ)+ = œ; (Σ, xθ :T )+ = Σ+, xω :T . We show thewell-formedness judgement of a refinement type [WF-Rty] in Fig. 9. For a refinement type (x : S{E}) to be a well-formed type, the expression E must have a boolean type under the context Σ+, extended with variable x (representing the members of the type) with type S . The typing context Σ+ promotes the irrelevant quantified variables into unrestricted variables, so they can be used in the expression E inside the refinement type. The well-formedness of a typing context is defined inductively, requiring all refinement types in the context to be well-formed. We omit the judgements for brevity. Typing Expressions. We type expressions in local contexts, forming judgements of form Σ ⊢ E : T , and show key typing rules in Fig. 9. We modify the typing rules in a standard refinement type system [Rondon et al. 2008; Vazou et al. 2014, 2017], adding distinction between irrelevant and unrestricted variables. [TE-Const] gives constant values in the expression a refinement type that only contains the con- stant value. Similarly, [TE-Plus] gives typing derivations for the plus operator, with a corresponding refinement type encoding the addition. Proc. ACM Program. Lang., Vol. 4, No. OOPSLA, Article 148. Publication date: November 2020. Statically Verified Refinements for Multiparty Protocols 148:17 Γ ∪ {xP : T } =  Γ, xP :T if x < Γ Γ1, x P :T , Γ2 if Γ = Γ1, xœ :T , Γ2 Γ1, x P :T , Γ2 if Γ = Γ1, xP :T , Γ2 undefined otherwise Σ ∪ {xθ : T } =  Σ, xθ :T if x < Σ Σ1, x θ :T , Σ2 if Σ = Σ1, x0 :T , Σ2 Σ1, x ω :T , Σ2 if Σ = Σ1, xω :T , Σ2 undefined otherwise Fig. 10. Typing Context Extension We draw attention to the handling of variables ([TE-Var]). An irrelevant variable in the typing context cannot appear in an expression, i.e. there is no derivation for Σ1, x0 :T , Σ2 ⊢ x :T . These variables can only be used in an refinement type (see [WF-Rty]). The key feature of the refinement type system is the semantic subtyping relation decided by SMT [Bierman et al. 2012], we describe the feature in [TE-Sub]. We use ⟦E⟧ to denote the encoding of expresion E into the SMT logic. We encode a type binding xθ : (v : S{E}) in a typing context by encoding the term E[x/v], and define the encoding of a typing context ⟦Σ⟧ inductively. We define the extension of typing contexts (Γ ∪ {xP : T }; Σ ∪ {xθ : T }) in Fig. 10, used in definitions of semantics. We say a global typeG (resp. a local type L) is closed under a global context Γ (resp. a local context Σ), if all free variables in the type are in the domain of the context. Remark 4.2 (Empty Type). A refinement type may be empty, with no inhabited member. We can construct such a type under the empty context œ as (x : S{false}) with any base types S . A more specific example is a refinement type for an integer that is both negative and positive (x : int{x > 0 ∧ x < 0}). Similarly, under the context xω : int{x > 0}, the refinement type y : int{y < 0 ∧ y > x} is empty. In these cases, the typing context with the specified type becomes inconsistent, i.e. the encoded context gives a proof of falsity. Moreover, an empty type can also occur without inconsistency. For instance, in a typing context of x0 : int, the type y : int{y > x} is empty — it is not possible to produce such a value without referring to x (cf. [TE-Var]). 4.3 Endpoint Projection: From Global Contexts and Types to Local Contexts and Types In the methodology of multiparty session types, developers specify a global type, and obtain local types for the participants via endpoint projection (projection for short). In the original theory, projection is a partial function that takes a global type G and a participant p, and returns a local type L. The resulting local type L describes a the local communication behaviour for participant p in the global scenario. Such workflow has the advantage that each endpoint can obtain a local type separately, and implement a process of the given type, hence providing modularity and scalability. Projection is defined as a partial function, since only well-formed global types can be projected to all participants. In particular, a partial merge operator ⊔ is used during the projection, for creating a local type Σ ⊢ L1 ⊔ L2 = Lmerged that captures the behaviour of two local types, under context Σ. In RMPST, we first define the projection of global typing contexts (Fig. 11), and then define the projection of global types under a global typing context (Fig. 12). We use expression typing judgements in the definition of projection, to type-check expressions against their prescribed types. Projection of Global Contexts. We define the judgement Γ ↾ p = Σ for the projection of global typing context Γ to participant p in Fig. 11. In the global context Γ, a variable x is annotated with the set of participants Pwho know the value. If the projected participant p is in the set P, [P-Var-ω] is applied to obtain an unrestricted variable in the resulting local context; Otherwise, [P-Var-0] is applied to obtain an irrelevant variable. Projection of Global Types with a Global Context. When projecting a global type G, we include a global context Γ, forming a judgement of form ⟨Γ ≺ G⟩ ↾ p = ⟨Σ ≺ L⟩. Projection Proc. ACM Program. Lang., Vol. 4, No. OOPSLA, Article 148. Publication date: November 2020. 148:18 Fangyi Zhou, Francisco Ferreira, Raymond Hu, Rumyana Neykova, and Nobuko Yoshida Γ ↾ p = Σ [P-Empty] œ ↾ p = œ [P-Var-ω] p ∈ P Γ ↾ p = Σ Γ, xP : T ↾ p = Σ, xω : T [P-Var-0] p < P Γ ↾ p = Σ Γ, xP : T ↾ p = Σ, x0 : T Fig. 11. Projection Rules for Global Contexts [P-Send] Γ ↾ p = Σ ∀i ∈ I . Σ ⊢ Ti ty ⟨Γ ∪ {x {p, q}i : Ti } ≺ Gi ⟩ ↾ p = ⟨Σi ≺ Li ⟩ ⟨Γ ≺ (p→ q {li (xi :Ti ).Gi }i ∈I )⟩ ↾ p = ⟨Σ ≺ q⊕{li (xi :Ti ).Li }i ∈I ⟩ [P-Recv] Γ ↾ q = Σ ∀i ∈ I . Σ ⊢ Ti ty ⟨Γ ∪ {x {p, q}i : Ti } ≺ Gi ⟩ ↾ q = ⟨Σi ≺ Li ⟩ ⟨Γ ≺ (p→ q {li (xi :Ti ).Gi }i ∈I )⟩ ↾ q = ⟨Σ ≺ p&{li (xi :Ti ).Li }i ∈I ⟩ [P-Phi] Γ ↾ r = Σ r < {p, q} ∀i ∈ I . Σ ⊢ Ti ty ⟨Γ ∪ {x {p, q}i : Ti } ≺ Gi ⟩ ↾ r = ⟨Σi ≺ Li ⟩ ⟨Γ ≺ (p→ q {li (xi :Ti ).Gi }i ∈I )⟩ ↾ r = ⟨Σ ≺ ⊔i ∈I li (xi :Ti ).Li ⟩ [P-Rec-In] Γ ↾ r = Σ r ∈ G ⟨Γ ∪ {x {r |r∈G } : T } ≺ G⟩ ↾ r = ⟨Σ′ ≺ L⟩ Σ ⊢ T ty Σ ⊢ E : T ⟨Γ ≺ µt(x :T )⟨x := E⟩.G)⟩ ↾ r = ⟨Σ ≺ µt (x :T )⟨x := E⟩.L⟩ [P-Rec-Out] Γ ↾ r = Σ r < G ⟨Γ ≺ µt(x :T )⟨x := E⟩.G)⟩ ↾ r = ⟨Σ ≺ end⟩ [P-End] Γ ↾ r = Σ ⟨Γ ≺ end⟩ ↾ r = ⟨Σ ≺ end⟩ ⟨Γ ≺ G⟩ ↾ p = ⟨Σ ≺ L⟩ [P-Var] Γ ↾ r = Σ = Σ1, x :T , Σ2 Σ1, x :T , Σ2 ⊢ E :T ⟨Γ ≺ t⟨x := E⟩⟩ ↾ r = ⟨Σ ≺ t⟨x := E⟩⟩ Fig. 12. Projection Rules for Global Types rules are shown in Fig. 12. Including a typing context allows us to type-check expressions during projection, hence ensuring that variables attached to recursive protocols are well-typed. If the prefix of G is a message from role p to role q, the projection results a local type with a send (resp. receive) prefix into role p (resp. q) via [P-Send] (resp. [P-Recv]). For other roles r, the projection results in a local type with a silent label via [P-Phi], with prefix l(x :T ). This follows the concept of a coordinated distributed system, where all the processes follow a global protocol, and base assumptions of their local actions on actions of other roles not involving them. The projection defined in the original MPST theory does not contain information for role r about a message between p and q. We use the silent prefix to retain such information, especially the refinement typeT of the payload. For merging two local types (as used in [P-Phi]), we use a simple plain merge operator defined as Σ ⊢ L ⊔ L = L, requiring two local types to be identical in order to be merged.3 If the prefix ofG is a recursive protocol µt(x :T )⟨x := E⟩.G , the projection preserves the recursion construct if the projected role is in the inner protocol via [P-Rec-In] and that the expression E can be typed with type T under the projected local context. Typing expressions under local contexts ensures that no irrelevant variables x0 are used in the expression E, as no typing derivation exists for irrelevant variables. Otherwise projection results in end via [P-Rec-Out]. IfG is a type variable t⟨x := E⟩, we similarly validate that the expression E carries the specified type in the correspondent recursion definition, and its projection also preserves the type variable construct. 3We build upon the standard MPST theory with plain merging. Full merge [Denielou et al. 2012], allowing certain different index sets to be merged, is an alternative, more permissive merge operator. Our implementation Session⋆ uses the more permissive merge operator for better expressiveness. Proc. ACM Program. Lang., Vol. 4, No. OOPSLA, Article 148. Publication date: November 2020. Statically Verified Refinements for Multiparty Protocols 148:19 Example 4.3 (Projection of Global Types of Example 4.1 (1)). We draw attention to the projection of G1 to C, under the empty context œ. ⟨œ ≺ G1⟩ ↾ C = ⟨œ ≺ Fst(x : int).B&Snd(y : int{x = y}).D⊕Trd(z : int{x = z}).end⟩ We note that the local type for C has a silent prefix Fst(x : int), which binds the variable x in the continuation. The silent prefix adds the variable x and its type to the “local knowledge” of the endpoint C, yet the actual value of x is unknown. Remark 4.4 (Empty Session Type). Global types G and local types L can be empty because one of the value types in the protocol in an empty type (cf. Remark 4.2). For example, the local type A⊕Impossible(x : int{x > 0 ∧ x < 0}).end cannot be implemented, since such an x cannot be provided. For the same reason, the local type Pos(x : int{x > 0}).A⊕Impossible(y : int{y > x}).end can- not be implemented. Remark 4.5 (Implementable Session Types). Consider the following session type: L = B&Num(x : int).B⊕ { Pos(unit{x > 0}).end Neд(unit{x < 0}).end } . When the variable x has the value 0, neither of the choices Pos or Neд could be selected, as the refinements are not satisfied. In this case, the local type L cannot be implemented, as the internal choice callback may not be implemented in a total way, i.e. the callback returns a choice label for all possible inputs of integer x .4 4.4 Labelled Transition System (LTS) Semantics We define the labelled transition system (LTS) semantics for global types and local types. We show the trace equivalence of a global type and the collection of local types projected from the global type, to demonstrate that projection preserves LTS semantics. The equivalence result allows us to use the projected local types for the implementation of local roles separately. Therefore, we can implement the endpoints in F⋆ separately, and they compose to the specified protocol. We also prove a type safety result that well-formed global types cannot be stuck. This, combined with the trace equivalence result, guarantees that endpoints are free from deadlocks. Actions. We begin with defining actions in the LTS system. We define the label in the LTS as α ::= p→ q : l(x : T ), a message from role p to q with label l carrying a value named x with type T . We define subj(α) = {p, q} to be the subjects of the action α , namely the two roles in the action. Semantics of Global Types. We define the LTS semantics of global types in Fig. 13. Different from the original LTS semantics in [Deniélou and Yoshida 2013], we include the context Γ in the semantics along with the global typeG. Therefore, the judgements of global LTS reduction have form ⟨Γ ≺ G⟩ α−→ ⟨Γ′ ≺ G ′⟩. [G-Pfx] allows the reduction of the prefix action in a global type. An action, matching the definition in set defined in the prefix, allows the correspondent continuation to be selected. The resulting global type is the matching continuation and the resulting context contains the variable binding in the action. [G-Cnt] allows the reduction of an action that is causally independent of the prefix action in a global type, here, the subjects of the action are disjoint from the prefix of the global type. If all continuations in the global types can make the reduction of that action to the same context, then 4Since we use a permissive ML effect in the callback type, allowing all side effects to be performed in the callback, the callback may throw exceptions or diverge in case of unable to return a value. Proc. ACM Program. Lang., Vol. 4, No. OOPSLA, Article 148. Publication date: November 2020. 148:20 Fangyi Zhou, Francisco Ferreira, Raymond Hu, Rumyana Neykova, and Nobuko Yoshida [G-Pfx] j ∈ I ⟨Γ ≺ p→ q {li (xi :Ti ).Gi }i ∈I ⟩ p→q:lj (x j :Tj )−−−−−−−−−−→ ⟨Γ ∪ {x {p, q}j : Tj } ≺ G j ⟩ [G-Cnt] {p, q} ∩ subj(α) = œ ∀j ∈ I .⟨Γ ∪ {xœj : Tj } ≺ G j ⟩ α−→ ⟨Γ′ ≺ G ′j ⟩ ⟨Γ ≺ p→ q {li (xi :Ti ).Gi }i ∈I ⟩ α−→ ⟨Γ′ ≺ p→ q { li (xi :Ti ).G ′i } i ∈I ⟩ ⟨Γ ≺ G⟩ α−→ ⟨Γ′ ≺ G ′⟩ [G-Rec] ⟨Γ ∪ {x {r |r∈G } : T } ≺ G[µt(x :T ).G/t]⟩ α−→ ⟨Γ′ ≺ G ′⟩ ⟨Γ ≺ µt(x :T )⟨x := E⟩.G⟩ α−→ ⟨Γ′ ≺ G ′⟩ Fig. 13. LTS Semantics for Global Types the result context is that context and the result global type is one with continuations after reduction. When reducing the continuations, we add the variable of the prefix action into the context, but tagged with an empty set of known roles. This addition ensures that relevant information obtainable from the prefix message is not lost when performing reduction. [G-Rec] allows the reduction of a recursive type by unfolding the type once. Example 4.6 (Global Type Reductions). We demonstrate two reduction paths for a global type G = p→ q : Hello(x : int{x < 0}).r→ s : Hola(y : int{y > x}).end. Note that the two messages are not causally related (they have disjoint subjects). We have the following two reduction paths of ⟨œ ≺ G⟩ (omitting payload in LTS actions): ⟨œ ≺ G⟩ [G-Pfx] p→q:Hello−−−−−−−−−→ ⟨x {p, q} : int{x < 0} ≺ r→ s : Hola(y : int{y > x}).end⟩ [G-Pfx] r→s:Hola−−−−−−−→ ⟨x {p, q} : int{x < 0},y {r, s} : int{y > x} ≺ end⟩ ⟨œ ≺ G⟩ [G-Cnt] r→s:Hola−−−−−−−→ ⟨xœ : int{x < 0},y {r, s} : int{y > x} ≺ p→ q : Hello(x : int{x < 0}).end⟩ [G-Pfx] p→q:Hello−−−−−−−−−→ ⟨x {p, q} : int{x < 0},y {r, s} : int{y > x} ≺ end⟩ Semantics of Local Types. We define the LTS semantics of local types in Fig. 14. Similar to global type LTS semantics, we include the local context Σ in the semantics. Therefore, the judgements of local LTS reductions have form ⟨Σ ≺ L⟩ α−→ ⟨Σ′ ≺ L′⟩. When defining the LTS semantics, we also use judgements of form ⟨Σ ≺ L⟩ ϵ−→ ⟨Σ′ ≺ L′⟩. It represents a silent action that can occur without an observed action. We write ϵ−→∗ to denote the reflexive transition closure of silent actions ϵ−→. We first have a look at silent transitions. [E-Phi] allows the variable in a silent type to be added into the local context in the irrelevant form. This rule allows local roles to obtain knowledge from the messages in the global protocol without their participation. [E-Cnt] allows prefixed local type to make a silent transition, if all of its continuations are allowed to make a silent transition to reach the same context. The rule allows a prefixed local type to obtain new knowledge about irrelevant variables, if such can be obtained in all possible continuations. [E-Rec] unfolds recursive local types, analogous to the unfolding of global types. For concrete transitions, we have [L-Send] (resp. [L-Recv]) to reduce a local type with a sending (resp. receiving) prefix, if the action label is in the set of labels in the local type. The resulting Proc. ACM Program. Lang., Vol. 4, No. OOPSLA, Article 148. Publication date: November 2020. Statically Verified Refinements for Multiparty Protocols 148:21 [E-Cnt] † ∈ {&, ⊕} ∀j ∈ I . ⟨Σ ∪ {x0j : Tj } ≺ Lj ⟩ ϵ−→ ⟨Σ′ ≺ L′j ⟩ ⟨Σ ≺ q †{li (xi :Ti ).Li }i ∈I ⟩ ϵ−→ ⟨Σ′ ≺ q † { li (xi :Ti ).L′i } i ∈I ⟩ [E-Rec] ⟨Σ ≺ µt (x :T )⟨x := E⟩.L⟩ ϵ−→ ⟨Σ ∪ {xω : T } ≺ L[µt (x :T ).L/t]⟩ ⟨Σ ≺ L⟩ ϵ−→ ⟨Σ′ ≺ L′⟩ [E-Phi] ⟨Σ ≺ l(x :T ).L⟩ ϵ−→ ⟨Σ ∪ {x0 : T } ≺ L⟩ [L-Send] j ∈ I ⟨Σ ≺ q⊕{li (xi :Ti ).Li }i ∈I ⟩ p→q:lj (x j :Tj )−−−−−−−−−−→ ⟨Σ ∪ {xωj : Tj } ≺ Lj ⟩ [L-Recv] j ∈ I ⟨Σ ≺ p&{li (xi :Ti ).Li }i ∈I ⟩ p→q:lj (x j :Tj )−−−−−−−−−−→ ⟨Σ ∪ {xωj : Tj } ≺ Lj ⟩ ⟨Σ ≺ L⟩ α−→ ⟨Σ′ ≺ L′⟩ [L-Eps] ⟨Σ ≺ L⟩ ϵ−→ ⟨Σ′′ ≺ L′′⟩ ⟨Σ′′ ≺ L′′⟩ α−→ ⟨Σ′ ≺ L′⟩ ⟨Σ ≺ L⟩ α−→ ⟨Σ′ ≺ L′⟩ Fig. 14. LTS Semantics for Local Types context contains the variable in the message as a concrete variable, since the role knows the value via communication. The resulting local type is the continuation corresponding to the action label. In addition, [L-Eps] permits any number of silent actions to be taken before a concrete action. Remark 4.7 (Reductions for Empty Session Types). We consider empty session types to be reducible, since it is not possible to distinguish which types are inhabited. However, it does not invalidate the safety properties of endpoints, since no such endpoints can be implemented for an empty session type. Relating Semantics of Global and Local Types. We extend the LTS semantics to a collection of local types in Definition 4.8, in order to prove that projection preserves semantics. We define the semantics in a synchronous fashion. The set of local types reduces with an action α = p → q : l(x : T ), if the local type for role p and q both reduce with that action α . All other roles in the set of the local types are permitted to make silent actions (ϵ actions). Our definition deviates from the standard definition [Deniélou and Yoshida 2013, Def. 3.3] in two ways: One is that we use a synchronous semantics, so that one action involves two reductions, namely at the sending and receiving sides. Second is that we use contexts and silent transitions in the LTS semantics. The original definition requires all non-action roles to be identical, whereas we relax the requirement to allow silent transitions. Definition 4.8 (LTS over a collection of local types). A configuration s = {⟨Σr ≺ Lr⟩}r∈P is a collection of local types and contexts, indexable via participants. Let p ∈ P and q ∈ P. We say s = {⟨Σr ≺ Lr⟩}r∈P α=p→q:l (x :T )−−−−−−−−−−−→ s ′ = {⟨Σ′r ≺ L′r⟩}r∈P if (1) ⟨Σp ≺ Lp⟩ α−→ ⟨Σ′p ≺ L′p⟩ and, ⟨Σq ≺ Lq⟩ α−→ ⟨Σ′q ≺ L′q⟩ and, (2) for all s ∈ P, s , p, s , q. ⟨Σs ≺ Ls⟩ ϵ−→ ∗ ⟨Σ′s ≺ L′s⟩ Proc. ACM Program. Lang., Vol. 4, No. OOPSLA, Article 148. Publication date: November 2020. 148:22 Fangyi Zhou, Francisco Ferreira, Raymond Hu, Rumyana Neykova, and Nobuko Yoshida For a closed global typeG under context Γ, we show that the global type makes the same trace of reductions as the collection of local types obtained from projection. We prove it in Theorem 4.10. Definition 4.9 (Association of Global Types and Configurations). Let ⟨Γ ≺ G⟩ be a global context. The collection of local contexts associated to ⟨Γ ≺ G⟩, is defined as the configuration {⟨Γ ≺ G⟩ ↾ r}r∈G . We write s ⇔ ⟨Γ ≺ G⟩ if a configuration s is the associated to ⟨Γ ≺ G⟩. Theorem 4.10 (Trace Eqivalence). Let ⟨Γ ≺ G⟩ be a closed global context and s ⇔ ⟨Γ ≺ G⟩ be a configuration associated with the global context. ⟨Γ ≺ G⟩ α−→ ⟨Γ′ ≺ G ′⟩ if and only if s α−→ s ′, where s ′ ⇔ ⟨Γ′ ≺ G ′⟩. The theorem states that semantics are preserved after projection. Practically, we can implement local processes separately, and run them in parallel with preserved semantics. We also show that a well-formed global type G has progress. This means that a well-formed global type does not get stuck, which implies deadlock freedom. Definition 4.11 (Well-formed Global Types). A global type under typing context ⟨Γ ≺ G⟩ is well- formed, if (1) G does not contain free type variables, (2) G is contractive [Pierce 2002, §21], and (3) for all roles in the protocol r ∈ G, the projection ⟨Γ ≺ G⟩ ↾ r is defined. We also say a global type G is well-formed, if ⟨œ ≺ G⟩ is well-formed. Theorem 4.12 (Preservation of Well-formedness). If ⟨Γ ≺ G⟩ is a well-formed global type under typing context, and ⟨Γ ≺ G⟩ α−→ ⟨Γ′ ≺ G ′⟩, then ⟨Γ′ ≺ G ′⟩ is well-formed. Definition 4.13 (Progress). A configuration s satisfies progress, if either (1) For all participants p ∈ s , Lp = end, or (2) there exists an action α and a configuration s ′ such that s α−→ s ′. A global type under typing context ⟨Γ ≺ G⟩ satisfies progress, if its associated configuration s ⇔ ⟨Γ ≺ G⟩, exists and satisfies progress. We also say a global type G satisfies progress, if ⟨œ ≺ G⟩ satisfies progress. Theorem 4.14 (Progress). If ⟨Γ ≺ G⟩ is a well-formed global type under typing context, then ⟨Γ ≺ G⟩ satisfies progress. Theorem 4.15 (Type Safety). If G is a well-formed global type, then for any global type under typing context ⟨Γ′ ≺ G ′⟩ such that ⟨œ ≺ G⟩ −−→ ∗⟨Γ′ ≺ G ′⟩, ⟨Γ′ ≺ G ′⟩ satisfies progress. Proof. Direct consequence of Theorem 4.12 and Theorem 4.14. □ 5 EVALUATION We evaluate the expressiveness and performance of our toolchain Session⋆. We describe the methodology and setup (§ 5.1), and comment on the compilation time (§ 5.2) and the execution time (§ 5.3). We demonstrate the expressiveness of Session⋆ (§ 5.4) by implementing examples from the session type literature and comparing with related work. The source files of the benchmarks used in this section are included in our artifact, along with a script to reproduce the results. 5.1 Methodology and Setup We measure the time to generate the CFSM representation from a Scribble protocol (CFSM), and the time to generate F⋆ code from the CFSM representation (F⋆ APIs). Since the generated APIs in F⋆ need to be type-checked before use, we also measure the type-checking time for the generated code (Gen. Code). Finally, we provide a simple implementation of the callbacks and measure the type-checking time for the callbacks against the generated type (Callbacks). Proc. ACM Program. Lang., Vol. 4, No. OOPSLA, Article 148. Publication date: November 2020. Statically Verified Refinements for Multiparty Protocols 148:23 global protocol PingPongn(role A, role B) { choice at A { Ping(x1:int) from A to B; Pong(y1:int) from B to A; @"y1>x1" Ping(x2:int) from A to B; @"x2>y1" Pong(y2:int) from B to A; @"y2>x2" · · · Ping(xn:int) from A to B; @"xn>yn−1" Pong(yn:int) from B to A; @"yn>xn" do PingPongn(A, B); } or { Bye() from A to B; Bye() from B to A; } } Fig. 15. Ping Pong Protocol (Parameterised by Protocol Length n) To execute the protocols, we need a network transport to connect the participants by providing appropriate sending and receiving primitives. In our experiment setup, we use the standard library module FStar.Tcp to establish TCP connections between participants, and provide a simple serialisation module for base types. Due to the small size of our payloads, we set TCP_NODELAY to avoid the delays introduced by the congestion control algorithms. Since our entry point to execute the protocol is parameterised by the connection/transport type, the implementation may use other connections if developers wish, e.g. an in-memory queue for local setups. We measure the execution time of the protocol (Execution Time). To measure the overhead of our implementation, we compare against an implementation of the protocol without session types or refinement types, which we call bare implementation. In this implementation, we use the same sending and receiving primitives (i.e. connection) as in the toolchain implementation. The bare implementation is in a series of direct calls of sending and receiving primitives, for the same communication pattern, but without the generated APIs. We use a Ping Pong protocol (Fig. 15), parameterised by the protocol length, i.e. the number of Ping Pong messages n in a protocol iteration. When the protocol length n increases, the number of CFSM states increases linearly, which gives rise to longer generated code and larger generated types. In each Ping Pong message, we include payload of increasing numbers, and encode the constraints as protocol refinements. We study its effect on the compilation time (§ 5.2) and the execution time (§ 5.3). We run the experiment on varying sizes of n, up to 25. Larger sizes of n leads to unreasonably large resource usage during type-checking in F⋆. Table 1 reports the results for the Ping Pong protocol in Fig. 15. We run the experiments under a network of latency of 0.340ms (64 bytes ping), and repeat each experiment 30 times. Measurements are taken using a machine with Intel i7-7700K CPU (4.20 GHz, 4 cores, 8 threads), 16 GiB RAM, operating system Ubuntu 18.04, OCaml compiler version 4.08.1, F⋆ compiler commit 8040e34a, Z3 version 4.8.5. 5.2 Compilation Time CFSM and F⋆ Generation Time. Wemeasure the time taken for Scribble to generate the CFSM from the protocol in Fig. 15, and for the code generation tool to convert the CFSM to F⋆ APIs. We observe from Table 1 that the generation time for CFSMs and F⋆ APIs is short. It takes less than 1 second to complete the generation phase for each case. Type-checking Time of Generated Code and Callbacks. We measure the time taken for the generated APIs to type-check in F⋆. We provide a simple F⋆ implementation of the callbacks following the generated APIs, and measure the time taken to type-check the callbacks. The increase of type-checking time is non-linear with regard to the protocol length. We encode CFSM states as records corresponding to local typing contexts. In this case, the size of local typing contexts and the number of type definitions grows linearly, giving rise to a non-linear increase. Moreover, the entry point function is likely to cause non-linear increases in the type-checking time. Proc. ACM Program. Lang., Vol. 4, No. OOPSLA, Article 148. Publication date: November 2020. 148:24 Fangyi Zhou, Francisco Ferreira, Raymond Hu, Rumyana Neykova, and Nobuko Yoshida Protocol Generation Time Type Checking Time Execution Time Length (n) CFSM F⋆ APIs Gen. Code Callbacks (100 000 ping-pongs) bare n/a n/a n/a n/a 28.79s 1 0.38s 0.01s 1.28s 0.34s 28.75s 5 0.48s 0.01s 3.81s 1.12s 28.82s 10 0.55s 0.01s 14.83s 1.34s 28.84s 15 0.61s 0.01s 42.78s 1.78s n/a 20 0.69s 0.02s 98.35s 2.54s 28.81s 25 0.78s 0.02s 206.82s 3.87s 28.76s Table 1. Time Measurements for Ping Pong Protocol The long type-checking time of the generated code could be avoided if the developer chooses to trust our toolchain to always generate well-typed F⋆ code for the entry point. The entry point would be available in an interface file (cf. OCaml .mli files), with the actual implementation in OCaml instead of F⋆5. There would otherwise be no changes in the development workflow. Although neither does type-checking time of the callback implementation fit a linear pattern, it remains within reasonable time frame. 5.3 Runtime Performance (Execution Time) We measure the execution time taken for an exchange of 100,000 ping pongs for the toolchain and bare implementation under the experiment network. The execution time is dominated by network communication, since there is little computation to be performed at each endpoint. We provide a bare implementation using a series of direct invocations of sending and receiving primitives, in a compatible way to communicate with generated APIs. The bare implementation does not involve a record of callbacks, which is anticipated to run faster, since the bare implementation involves fewer function pointers when calling callbacks. Moreover, the bare implementation does not construct state records, which record a backlog of the communication, as the protocol progresses. To measure the performance impact of book-keeping of callback and state records, we run the Ping Pong protocol from Fig. 15 for a protocol of increasing size (number of states and generated types), i.e. for increasing values of n. All implementations, including bare are run until 100,000 ping pong messages in total are exchanged6. We summarise the results in Table 1. Despite the different protocol lengths, there are no significant changes in execution time. Since the execution is dominated by time spent on communication, the measurements are subject to network fluctuations, difficult to avoid during the experiments. We conclude that our implementation does not impose a large overhead on the execution time. 5.4 Expressiveness We implement examples from the session type literature, and add refinements to encode data dependencies in the protocols. We measure the time taken for code generation and type-checking, and present them in Table 2. The time taken in the toolchain for examples in the session type literature is usually short, yet we demonstrate that we are able to implement the examples easily with our callback style API. Moreover, the time taken is incurred at the compilation stage, hence there is no overhead for checking refinements by our runtime. 5Defining a signature in an interface file, and providing an implementation in the target language (OCaml) allows the F⋆ compiler to assume the implementation is correct. This technique is used frequently in the standard library of F⋆. This is not to be confused with implementing the endpoints in OCaml instead of F⋆, as that would bypass the F⋆ type-checking. 6For n = 1, we run 100,000 iterations of recursion; for n = 10, we run 10,000 iterations, etc. Total number of ping pong messages exchanged by two parties remain the same. Proc. ACM Program. Lang., Vol. 4, No. OOPSLA, Article 148. Publication date: November 2020. Statically Verified Refinements for Multiparty Protocols 148:25 Example (Endpoint) Gen. / TC. Time MP RV IV STP Two Buyer a (A) 0.46s / 2.33s ✓ ✗ ✓ ✓† Negotiation b (C) 0.46s / 1.59s ✗ ✓ ✗ ✗ Fibonacci c (A) 0.44s / 1.58s ✗ ✓ ✗ ✗ Travel Agency d (C) 0.62s / 2.36s ✓ ✗ ✗ ✓† Calculator c (C) 0.51s / 2.30s ✗ ✗ ✗ ✓ SH e (P) 1.16s / 4.31s ✓ ✗ ✓ ✓† Online Wallet f (C) 0.62s / 2.67s ✓ ✓ ✗ ✗ Ticket д (C) 0.45s / 1.90s ✗ ✓ ✗ ✗ HTTP h (S) 0.55s / 1.79s ✗ ✗ ✗ ✓† MP Multiparty Protocol RV Uses Recursion Variables IV Irrelevant Variables STP Implementable in STP ✓† STP requires dynamic checks a [Honda et al. 2016] b [Demangeon and Honda 2012] c [Hu and Yoshida 2016] d [Hu et al. 2008] e [Neykova et al. 2018] f [Neykova et al. 2013] д [Bocchi et al. 2013] h [Fielding and Reschke 2014] Table 2. Selected Examples from Literature We also compare the expressiveness of our work with two most closely related works, namely Bocchi et al. [2010] and Neykova et al. [2018], which study refinements in MPST (also see § 6). Neykova et al. [2018] (Session Type Provider, STP) implements limited version of refinements in the Scribble toolchain. Our version is strictly more expressive than STP for two reasons: (1) support for recursive variables to express invariants and (2) support for irrelevant variables. Fig. 16 illustrates those features and Table 2 identifies which of the implemented examples use them. protocol Adder(role S, role C) @"S[acc:=0]" { Num(x:int) from C to S; @"x≥0" Sum(sum:int) from S to C; @"sum=acc+x" do Adder(S, C); @"S[sum]" } (a) Accumulator (using Recursive Invariants) protocol Broadcast(role A, role B, role C) { Broadcast(x:int) from A to B; @"x≥0" // C does not learn y≥0 in STP Broadcast(y:int) from A to C; @"x=y" } (b) Broadcasting (using Irrelevant Variables) Fig. 16. Example Protocols Demonstrating Additional Expressiveness to [Neykova et al. 2018] In STP, when recursion occurs, all information about the variables is lost at the end of an iteration, hence their tool does not support even the simple example in Fig. 16a. In contrast, our work retains the recursion variables, which are available throughout the recursion. Additionally, the endpoint projection in STP is more conservative with regards to refinements. Whilst there must be no variables unknown to a role in the refinements attached to a message for the sending role, there may be unknown variables to the receiving role. The part unknown to the receiving role is discarded (hence weakening the pre-condition). In our work such information can still be retained and used for type checking, thanks to irrelevant variables. In Bocchi et al. [2010], a global protocol with assertions must be well-asserted (§3.1). In particular, the history sensitivity requirement states: "A predicate guaranteed by a participant p can only contain those interaction variables that p knows." Our theory lifts this restriction by allowing variables unknown to a sending role to be used in the global or local type, whereas such variables cannot be used in the implementation. For example, Example 4.1 fails the well-asserted requirement in [Bocchi et al. 2010]. In the refinement x = z for variable z (for message label Trd), the variable x is not known to C, hence the protocol would not be well-asserted. In our setup, such protocol is permitted, the endpoint implementation for C can provide the value y received from B to satisfy the refinement type — The SMT solver can validate the refinement from the transitivity of equality. 6 RELATEDWORK We summarise the most closely related works in the areas of refinement and session types. For a detailed survey on theory and implementations of session types, see Gay and Ravara [2017]. Proc. ACM Program. Lang., Vol. 4, No. OOPSLA, Article 148. Publication date: November 2020. 148:26 Fangyi Zhou, Francisco Ferreira, Raymond Hu, Rumyana Neykova, and Nobuko Yoshida Refinement Types for Verification and Reasoning. Refinement types were introduced to allow recursive data structures to be specified in more details using predicates [Freeman and Pfenning 1991]. Subsequent works on the topic [Bengtson et al. 2011; Schmid and Kuncak 2016; Vazou et al. 2014, 2017] utilise SMT solvers, such as Z3 [De Moura and Bjørner 2008], to aid the type system to decide a semantic subtyping relation [Bierman et al. 2012] using SMT encodings. Refinement types have been applied to numerous domains, such as resource usage analysis [Handley et al. 2019; Knoth et al. 2020], secure implementations [Bengtson et al. 2011; Bhargavan et al. 2010], information control flow enforcements [Polikarpova et al. 2020], and theorem proving [Vazou et al. 2017]. Our aim is to utilise refinement types for the specification and verification of distributed protocols, by combining refinement and session types in a single practical framework. Implementation of Session Types. Neykova et al. [2018] provides an implementation of MPST with assertions using Scribble and F#. Their implementation, the session type provider (STP), relies on code generation of fluent (class-based) APIs, initially described in [Hu and Yoshida 2016]. Each protocol state is implemented as a class, with methods corresponding to the possible transitions from that state. It forces a programming style that not only relies extensively on method chaining, but also requires dynamic checks to ensure the linearity of channel usage. Our work differs from STP in multiple ways. First, we extend the Scribble toolchain to support recursion variables, allowing refinements on recursions, hence improving expressiveness. In this way, developers can specify dependencies across recursive calls, which is not supported in STP. Second, we depart from the class-based API generation, and generate a callback-based API. Our approach has the advantage that the linear usage of channels is ensured by construction, saving dynamic checks for channels. Third, we use refinement types in F⋆ to verify refinements statically, in contrast, STP performs dynamic evaluations to validate assertions in protocols. Finally, the metatheory of session types extended with refinements was not developed in their work. Several other MPST works follow a similar technique of class-based API generation to overcome limitations of the type system in the target language, e.g. Castro et al. [2019] for Go, Ng et al. [2015] for C. All of the above works, suffer from the same limitations – they detect linearity violations at runtime, and offer no static alternative. Indeed, to our knowledge, Imai et al. [2020] provide the only MPST implementation which statically checks linearity violation. It relies on specific type-level OCaml features, and a monadic programming style. Our work proposes generation of a callback-styled API from MPST protocols. Although our target language is F⋆, the callback-styled API code generation technique is applicable to any mainstream programming language. Dependent and Refinement Session Types. Bocchi et al. [2010] propose a multiparty session π -calculus with logical assertions. By contrast, our formulation of RMPST is based on refinement types, projection with silent prefixes and correspondence with CFSMs, to target practical code generation, such as for F⋆. They do not formulate any semantics for global types nor prove an equivalence between refined global types and projections, as in this paper. Toninho and Yoshida [2017] extend MPST with value dependent types. Invariants on values are witnessed by proof objects, which then may be erased at runtime. Our work uses refinement types, which follows the principle naturally, since refinements that appear in types are proof-irrelevant and can be erased safely. These works are limited to theory, whereas we provide an implementation. De Muijnck-Hughes et al. [2019] propose an Embedded Domain Specific Language (EDSL) approach of implementing multiparty sessions (analogous to MPST) in Idris. They use value dependent types in Idris to define combinators, with options to specify data dependencies, contrary to our approach of code generation. However, the combinators only describe the sessions, and how to implement and execute the sessions remains unanswered. Our work provides a complete toolchain from protocol description to implementation and verification. Proc. ACM Program. Lang., Vol. 4, No. OOPSLA, Article 148. Publication date: November 2020. Statically Verified Refinements for Multiparty Protocols 148:27 In the setting of binary session types, Das and Pfenning [2020] extend session types with arithmetic refinements, with application to work analysis for computing upper bounds of work from a given session type. Thiemann and Vasconcelos [2019] extend binary session types with label dependent types. In the setup of their work, specification of arithmetic properties involves complicated definitions of inductive arithmetic relations and functions. In contrast, we use SMT solvers, which have built-in functions and relations for arithmetic. Furthermore, there is no need to construct proofs manually, since SMT solvers find the proof automatically, which enhances usability and ergonomics. Hinrichsen et al. [2019] combine binary session types with concurrent separation logic, allowing reasoning about mixed-paradigm concurrent programs, and planned to extend the framework to MPST. Swamy et al. [2020] provide a framework of concurrent separation logic in F⋆, and demonstrate its expressiveness by showing how (dependent) binary session types can be represented in the logic and used in reasoning. Our work is based on the MPST theory, subsuming binary session types. We also implement a toolchain for developers to use. Bhargavan et al. [2009] use refinement types to implement a limited form of multiparty session types. Session types are encoded in refinement types via code generation. The specification language they use, albeit similar to MPST, has limited expressive power. Only patterns of interactions where participants alternate between sending and receiving are permitted. Moreover, they do not study data dependencies in protocols, hence they can neither specify, nor verify constraints on payloads or recursions. We use refinement types to specify constraints and dependencies in multiparty protocols, and use the F⋆ compiler [Swamy et al. 2016] for verifying the endpoint implementations. The verified endpoint program does not only comply to the multiparty protocol, enjoying the guarantees provided by the original MPST theory (deadlock freedom, session fidelity), but also satisfies additional guarantees provided by refinement types with respect to data constraints. 7 CONCLUSIONS AND FUTUREWORK We present a novel toolchain for implementing refined multiparty session types (RMPST), which enables developers to use Scribble, a protocol description language for multiparty session types, and F⋆, a state-of-the-art verification-oriented programming language, to implement a multiparty protocol and statically verify endpoint implementations. To the best of the authors’ knowledge, this is the first work on statically verified multiparty protocols with refinement types. We extend the theory of multiparty session types with data refinements, and present a toolchain that enables developers to specify multiparty protocols with data dependencies, and implement the endpoints using generated APIs in F⋆. We leverage the advanced typing system in F⋆ to encode local session types for endpoints, and validate the data dependencies in the protocol statically. The verified endpoint program in F⋆ is extracted into OCaml, where the refinements are erased — adding no runtime overhead for refinements. The callback-styled API avoids linearity checks of channel usage by internalising communications in generated code. We evaluate our toolchain and demonstrate that our overhead is small compared to an implementation without session types. Whereas refinement types express the data dependencies of multiparty protocols, the availability of refinement types in general purpose mainstream programming languages is limited. For future work, we wish to study how to mix participants with refined implementation and those without, possibly using a gradual typing system [Igarashi et al. 2019; Lehmann and Tanter 2017]. ACKNOWLEDGMENTS We thank OOPSLA reviewers for their comments and suggestions, David Castro, Julia Gabet and Lorenzo Gheri for proofreading, and Wei Jiang and Qianyi Shu for testing the artifact. The work is supported by EPSRC EP/T006544/1, EP/K011715/1, EP/K034413/1, EP/L00058X/1, EP/N027833/1, EP/N028201/1, EP/T006544/1, EP/T014709/1 and EP/V000462/1, and NCSS/EPSRC VeTSS. Proc. ACM Program. Lang., Vol. 4, No. OOPSLA, Article 148. Publication date: November 2020. 148:28 Fangyi Zhou, Francisco Ferreira, Raymond Hu, Rumyana Neykova, and Nobuko Yoshida REFERENCES Andreas Abel and Gabriel Scherer. 2012. On Irrelevance and Algorithmic Equality in Predicative Type Theory. Logical Methods in Computer Science Volume 8, Issue 1 (March 2012). https://doi.org/10.2168/LMCS-8(1:29)2012 Davide Ancona, Viviana Bono, Mario Bravetti, Joana Campos, Giuseppe Castagna, Pierre-Malo Deniélou, Simon J. Gay, Nils Gesbert, Elena Giachino, Raymond Hu, Einar Broch Johnsen, Francisco Martins, Viviana Mascardi, Fabrizio Montesi, Rumyana Neykova, Nicholas Ng, Luca Padovani, Vasco T. Vasconcelos, and Nobuko Yoshida. 2016. Behavioral Types in Programming Languages. Found. Trends Program. Lang. 3, 2–3 (July 2016), 95–230. https://doi.org/10.1561/2500000031 Jesper Bengtson, Karthikeyan Bhargavan, Cédric Fournet, Andrew D Gordon, and Sergio Maffeis. 2011. Refinement types for secure implementations. ACM Transactions on Programming Languages and Systems (TOPLAS) 33, 2 (2011), 8. Karthikeyan Bhargavan, Ricardo Corin, Pierre-Malo Deniélou, Cédric Fournet, and James J. Leifer. 2009. Cryptographic Protocol Synthesis and Verification for Multiparty Sessions. In 2009 22nd IEEE Computer Security Foundations Symposium. 124–140. https://doi.org/10.1109/CSF.2009.26 Karthikeyan Bhargavan, Cédric Fournet, and Andrew D. Gordon. 2010. Modular Verification of Security Protocol Code by Typing. In Proceedings of the 37th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (Madrid, Spain) (POPL ’10). ACM, New York, NY, USA, 445–456. https://doi.org/10.1145/1706299.1706350 Gavin M Bierman, Andrew D Gordon, Cătălin Hriţcu, and David Langworthy. 2012. Semantic subtyping with an SMT solver. Journal of Functional Programming 22, 1 (2012), 31–105. https://doi.org/10.1017/S0956796812000032 Laura Bocchi, Romain Demangeon, and Nobuko Yoshida. 2013. A Multiparty Multi-session Logic. In Trustworthy Global Computing, Catuscia Palamidessi and Mark D. Ryan (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 97–111. Laura Bocchi, Kohei Honda, Emilio Tuosto, and Nobuko Yoshida. 2010. A Theory of Design-by-Contract for Distributed Multiparty Interactions. In CONCUR 2010 - Concurrency Theory, Paul Gastin and François Laroussinie (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 162–176. Daniel Brand and Pitro Zafiropulo. 1983. On Communicating Finite-State Machines. J. ACM 30, 2 (April 1983), 323–342. https://doi.org/10.1145/322374.322380 David Castro, Raymond Hu, Sung-Shik Jongmans, Nicholas Ng, and Nobuko Yoshida. 2019. Distributed Programming Using Role-parametric Session Types in Go: Statically-typed Endpoint APIs for Dynamically-instantiated Communication Structures. Proc. ACM Program. Lang. 3, POPL, Article 29 (Jan. 2019), 30 pages. https://doi.org/10.1145/3290342 Ankush Das and Frank Pfenning. 2020. Session Types with Arithmetic Refinements. In 31st International Conference on Concurrency Theory (CONCUR 2020) (Leibniz International Proceedings in Informatics (LIPIcs), Vol. 171), Igor Konnov and Laura Kovács (Eds.). Schloss Dagstuhl–Leibniz-Zentrum für Informatik, Dagstuhl, Germany, 13:1–13:18. https: //doi.org/10.4230/LIPIcs.CONCUR.2020.13 Leonardo De Moura and Nikolaj Bjørner. 2008. Z3: An Efficient SMT Solver. In Proceedings of the Theory and Practice of Software, 14th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (Budapest, Hungary) (TACAS’08/ETAPS’08). Springer-Verlag, Berlin, Heidelberg, 337–340. http://dl.acm.org/citation.cfm?id=1792734. 1792766 Jan de Muijnck-Hughes, Edwin Brady, and Wim Vanderbauwhede. 2019. Value-Dependent Session Design in a Dependently Typed Language. In Proceedings Programming Language Approaches to Concurrency- and Communication-cEntric Software, Prague, Czech Republic, 7th April 2019 (Electronic Proceedings in Theoretical Computer Science, Vol. 291), Francisco Martins and Dominic Orchard (Eds.). Open Publishing Association, 47–59. https://doi.org/10.4204/EPTCS.291.5 Romain Demangeon and Kohei Honda. 2012. Nested Protocols in Session Types. In CONCUR 2012 - Concurrency Theory - 23rd International Conference, CONCUR 2012, Newcastle upon Tyne, UK, September 4-7, 2012. Proceedings (Lecture Notes in Computer Science, Vol. 7454), Maciej Koutny and Irek Ulidowski (Eds.). Springer, 272–286. https://doi.org/10.1007/978-3- 642-32940-1_20 Pierre-Malo Denielou, Nobuko Yoshida, Andi Bejleri, and Raymond Hu. 2012. Parameterised Multiparty Session Types. Logical Methods in Computer Science Volume 8, Issue 4 (Oct. 2012). https://doi.org/10.2168/LMCS-8(4:6)2012 Pierre-Malo Deniélou and Nobuko Yoshida. 2013. Multiparty Compatibility in Communicating Automata: Characterisation and Synthesis of Global Session Types. In 40th International Colloquium on Automata, Languages and Programming (LNCS, Vol. 7966). Springer, Berlin, Heidelberg, 174–186. https://doi.org/10.1007/978-3-642-39212-2_18 Roy T. (Ed.) Fielding and Julian F. (Ed.) Reschke. 2014. Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing. RFC 7230. RFC Editor. 1–89 pages. https://www.rfc-editor.org/info/rfc7230 Tim Freeman and Frank Pfenning. 1991. Refinement Types for ML. In Proceedings of the ACM SIGPLAN 1991 Conference on Programming Language Design and Implementation (Toronto, Ontario, Canada) (PLDI ’91). ACM, New York, NY, USA, 268–277. https://doi.org/10.1145/113445.113468 Simon Gay and António Ravara (Eds.). 2017. Behavioural Types: from Theory to Tools. River Publishers. https://doi.org/10. 13052/rp-9788793519817 Martin A. T. Handley, Niki Vazou, and Graham Hutton. 2019. Liquidate Your Assets: Reasoning about Resource Usage in Liquid Haskell. Proc. ACM Program. Lang. 4, POPL, Article 24 (Dec. 2019), 27 pages. https://doi.org/10.1145/3371092 Proc. ACM Program. Lang., Vol. 4, No. OOPSLA, Article 148. Publication date: November 2020. Statically Verified Refinements for Multiparty Protocols 148:29 Jonas Kastberg Hinrichsen, Jesper Bengtson, and Robbert Krebbers. 2019. Actris: Session-Type Based Reasoning in Separation Logic. Proc. ACM Program. Lang. 4, POPL, Article 6 (Dec. 2019), 30 pages. https://doi.org/10.1145/3371074 Kohei Honda, Nobuko Yoshida, and Marco Carbone. 2008. Multiparty Asynchronous Session Types. In Proceedings of the 35th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (San Francisco, California, USA) (POPL ’08). ACM, New York, NY, USA, 273–284. https://doi.org/10.1145/1328438.1328472 Kohei Honda, Nobuko Yoshida, and Marco Carbone. 2016. Multiparty Asynchronous Session Types. J. ACM 63 (2016), 1–67. Issue 1-9. https://doi.org/10.1145/2827695 Raymond Hu. 2017. Distributed Programming Using Java APIs Generated from Session Types. Behavioural Types: from Theory to Tools (2017), 287–308. Raymond Hu and Nobuko Yoshida. 2016. Hybrid Session Verification through Endpoint API Generation. In 19th International Conference on Fundamental Approaches to Software Engineering (LNCS, Vol. 9633). Springer, Berlin, Heidelberg, 401–418. https://doi.org/10.1007/978-3-662-49665-7_24 Raymond Hu, Nobuko Yoshida, and Kohei Honda. 2008. Session-Based Distributed Programming in Java. In ECOOP 2008 - Object-Oriented Programming, 22nd European Conference, Paphos, Cyprus, July 7-11, 2008, Proceedings (Lecture Notes in Computer Science, Vol. 5142), Jan Vitek (Ed.). Springer, 516–541. https://doi.org/10.1007/978-3-540-70592-5_22 Atsushi Igarashi, Peter Thiemann, Yuya Tsuda, Vasco T. Vasconcelos, and Philip Wadler. 2019. Gradual session types. Journal of Functional Programming 29 (2019), e17. https://doi.org/10.1017/S0956796819000169 Keigo Imai, Rumyana Neykova, Nobuko Yoshida, and Shoji Yuen. 2020. Multiparty Session Programming with Global Protocol Combinators. https://github.com/keigoi/ocaml-mpst (LIPIcs). Schloss Dagstuhl - Leibniz-Zentrum für Informatik. To appear in ECOOP’20. Keigo Imai, Nobuko Yoshida, and Shoji Yuen. 2019. Session-ocaml: A session-based library with polarities and lenses. Science of Computer Programming 172 (2019), 135 – 159. https://doi.org/10.1016/j.scico.2018.08.005 Tristan Knoth, Di Wang, Adam Reynolds, Jan Hoffmann, and Nadia Polikarpova. 2020. Liquid Resource Types. Proc. ACM Program. Lang. 4, ICFP, Article 106 (Aug. 2020), 29 pages. https://doi.org/10.1145/3408988 Nico Lehmann and Éric Tanter. 2017. Gradual Refinement Types. In Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages (Paris, France) (POPL 2017). Association for Computing Machinery, New York, NY, USA, 775–788. https://doi.org/10.1145/3009837.3009856 Rumyana Neykova, Raymond Hu, Nobuko Yoshida, and Fahd Abdeljallal. 2018. A Session Type Provider: Compile-time API Generation of Distributed Protocols with Refinements in F#. In Proceedings of the 27th International Conference on Compiler Construction (Vienna, Austria) (CC 2018). ACM, New York, NY, USA, 128–138. https://doi.org/10.1145/3178372.3179495 Rumyana Neykova and Nobuko Yoshida. 2019. Featherweight Scribble. Springer International Publishing, Cham, 236–259. https://doi.org/10.1007/978-3-030-21485-2_14 Rumyana Neykova, Nobuko Yoshida, and Raymond Hu. 2013. SPY: Local Verification of Global Protocols. In Runtime Verification, Axel Legay and Saddek Bensalem (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 358–363. Nicholas Ng, Jose G.F. Coutinho, and Nobuko Yoshida. 2015. Protocols by Default: Safe MPI Code Generation based on Session Types. In 24th International Conference on Compiler Construction (LNCS, Vol. 9031). Springer, 212–232. https: //doi.org/10.1007/978-3-662-46663-6_11 Dominic Orchard and Nobuko Yoshida. 2017. Session Types with Linearity in Haskell. Behavioural Types: from Theory to Tools (2017), 219–241. Frank Pfenning. 2001. Intensionality, extensionality, and proof irrelevance in modal type theory. In Proceedings 16th Annual IEEE Symposium on Logic in Computer Science. IEEE, 221–230. Benjamin C. Pierce. 2002. Types and Programming Languages (1st ed.). The MIT Press. Nadia Polikarpova, Deian Stefan, Jean Yang, Shachar Itzhaky, Travis Hance, and Armando Solar-Lezama. 2020. Liquid Information Flow Control. Proc. ACM Program. Lang. 4, ICFP, Article 105 (Aug. 2020), 30 pages. https://doi.org/10.1145/ 3408987 Patrick M. Rondon, Ming Kawaguchi, and Ranjit Jhala. 2008. Liquid Types. In Proceedings of the 29th ACM SIGPLAN Conference on Programming Language Design and Implementation (Tucson, AZ, USA) (PLDI ’08). ACM, New York, NY, USA, 159–169. https://doi.org/10.1145/1375581.1375602 Alceste Scalas, Ornela Dardha, Raymond Hu, and Nobuko Yoshida. 2017. A Linear Decomposition of Multiparty Sessions for Safe Distributed Programming. In ECOOP. https://doi.org/10.4230/LIPIcs.ECOOP.2017.24 Georg Stefan Schmid and Viktor Kuncak. 2016. SMT-based Checking of Predicate-qualified Types for Scala. In Proceedings of the 2016 7th ACM SIGPLAN Symposium on Scala (Amsterdam, Netherlands) (SCALA 2016). ACM, New York, NY, USA, 31–40. https://doi.org/10.1145/2998392.2998398 Scribble Authors. 2015. Scribble: Describing Multi Party Protocols. http://www.scribble.org/. Accessed on 20th April 2020. Proc. ACM Program. Lang., Vol. 4, No. OOPSLA, Article 148. Publication date: November 2020. 148:30 Fangyi Zhou, Francisco Ferreira, Raymond Hu, Rumyana Neykova, and Nobuko Yoshida Nikhil Swamy, Cătălin Hriţcu, Chantal Keller, Aseem Rastogi, Antoine Delignat-Lavaud, Simon Forest, Karthikeyan Bhargavan, Cédric Fournet, Pierre-Yves Strub, Markulf Kohlweiss, Jean-Karim Zinzindohoue, and Santiago Zanella- Béguelin. 2016. Dependent Types and Multi-monadic Effects in F*. In Proceedings of the 43rd Annual ACM SIGPLAN- SIGACT Symposium on Principles of Programming Languages (POPL ’16). ACM, St. Petersburg, FL, USA, 256–270. https://doi.org/10.1145/2837614.2837655 Nikhil Swamy, Aseem Rastogi, Aymeric Fromherz, Denis Merigoux, Danel Ahman, and Guido Martínez. 2020. SteelCore: An Extensible Concurrent Separation Logic for Effectful Dependently Typed Programs. Proc. ACM Program. Lang. 4, ICFP, Article 121 (Aug. 2020), 30 pages. https://doi.org/10.1145/3409003 Peter Thiemann and Vasco T. Vasconcelos. 2019. Label-Dependent Session Types. Proc. ACM Program. Lang., Article 67 (Dec. 2019), 29 pages. https://doi.org/10.1145/3371135 Bernardo Toninho and Nobuko Yoshida. 2017. Certifying data in multiparty session types. Journal of Logical and Algebraic Methods in Programming 90 (2017), 61 – 83. https://doi.org/10.1016/j.jlamp.2016.11.005 Niki Vazou, Eric L. Seidel, Ranjit Jhala, Dimitrios Vytiniotis, and Simon Peyton-Jones. 2014. Refinement Types for Haskell. In Proceedings of the 19th ACM SIGPLAN International Conference on Functional Programming (Gothenburg, Sweden) (ICFP ’14). ACM, New York, NY, USA, 269–282. https://doi.org/10.1145/2628136.2628161 Niki Vazou, Anish Tondwalkar, Vikraman Choudhury, Ryan G. Scott, Ryan R. Newton, Philip Wadler, and Ranjit Jhala. 2017. Refinement Reflection: Complete Verification with SMT. Proc. ACM Program. Lang. 2, POPL, Article 53 (Dec. 2017), 31 pages. https://doi.org/10.1145/3158141 Fangyi Zhou, Francisco Ferreira, Raymond Hu, Rumyana Neykova, and Nobuko Yoshida. 2020. Statically Verified Refinements for Multiparty Protocols. https://doi.org/10.5281/zenodo.3970760 Proc. ACM Program. Lang., Vol. 4, No. OOPSLA, Article 148. Publication date: November 2020.