Publications


Published Papers, Technical Reports and Theses

The mutable consensus protocol: In this paper we propose the Mutable Consensus protocol, a pragmatic and theoretically appealing approach to enhance the performance of distributed consensus. First, an apparently inefficient protocol is developed using the simple Stubborn channel abstraction for unreliable message passing. Then, performance is improved by introducing judiciously chosen finite delays in the implementation of channels. Although this does not compromise correctness, which rests on an asynchronous system model, it makes it likely that the transmission of some messages is avoided and thus the message exchange pattern at the network level changes noticeably. By choosing different delays in the underlying stubborn channels, the mutable consensus protocol can actually be made to resemble several different protocols. Besides presenting the mutable consensus protocol and four different mutations, we evaluate in detail the particularly interesting permutation gossip mutation, which allows the protocol to scale gracefully to a large number of processes by balancing the number of messages to be handled by each process with the number of communication steps required to decide. The evaluation is performed using a realistic simulation model which accurately reproduces resource consumption in real systems.
Revisiting Epsilon Serializabilty to improve the Database State Machine (Extended Abstract): In this paper, we investigate how to relax the consistency criteria of the Database State Machine (DBSM) in a controlled manner, following Epsilon Serializability (ESR) concepts, and evaluate the direct performance benefits.
Distributed Transaction Processing in the Escada Protocol: Database replication is an invaluable technique for implementing fault-tolerant databases, and is also frequently used to improve database performance. Unfortunately, when strong consistency among the replicas and the ability to update the database at any replica are both required, replication protocols do not scale. The problem is related to the number of interactions among the replicas needed to guarantee consistency and to the protocols used to ensure that all replicas agree on the outcome of each transaction. Roughly, the number of aborts, deadlocks and messages exchanged among the replicas grows drastically as the number of replicas increases. Related work has shown that database replication in such a scenario is impractical. Several studies have sought to overcome these problems. Initially, most of them relaxed the strong consistency and update-anywhere requirements to achieve feasible solutions. More recently, replication protocols based on group communication were proposed, in which the strong consistency and update-anywhere requirements are preserved and the problems circumvented. This is the context of the Escada project. Briefly, it aims to study, design and implement transaction replication mechanisms suited to large-scale distributed systems. In particular, the project exploits partial replication techniques to provide strong consistency criteria without introducing significant synchronization and performance overheads. In this thesis, we augment Escada with a distributed query processing model and mechanism, which is an unavoidable requirement in a partially replicated environment. Moreover, exploiting characteristics of its protocols, we propose a semantic cache to reduce the overhead incurred when accessing remote replicas. We also improve the certification process, attempting to reduce aborts by using the semantic information available in the transactions. Finally, to evaluate the Escada protocols, the semantic caching and the certification process, we use a simulation model that combines simulated and real code, which allows us to evaluate our proposals under distinct scenarios and configurations. Furthermore, instead of using unrealistic workloads, we test our proposals using workloads based on the TPC-W and TPC-C benchmarks.
An Indulgent Uniform Total Order Algorithm with Optimistic Delivery: A total order algorithm is a fundamental building block in the construction of distributed fault-tolerant applications. Unfortunately, the implementation of such a primitive can be expensive both in terms of communication steps and of number of messages exchanged. This problem is exacerbated in large-scale systems, where the performance of the algorithm may be limited by the presence of high-latency links. Typically, the most efficient total order algorithms do not provide uniform delivery and assume the availability of a perfect failure detector. Such algorithms may provide inconsistent results if the system assumptions do not hold. On the other hand, algorithms that assume an unreliable failure detector always provide consistent results but exhibit higher costs. This paper presents a new algorithm that combines the advantages of both approaches. In good periods, when the system is stable and no processes are suspected, the algorithm operates as if a perfect failure detector were available. Yet the algorithm is indulgent, since it never violates consistency, even in runs where processes are suspected.
Testing the Dependability and Performance of GCS-Based Database Replication Protocols: Database replication based on group communication, or simply GCS-based database replication, has recently been the focus of much attention as a promising technology for achieving strongly consistent large-scale data management. GCS-based database replication is expected to provide increased dependability by relying on the properties of atomic multicast protocols, and to exhibit good performance due to the absence of distributed locking and reduced interaction among concurrent transactions. However, until now the evaluation of such protocols has been conducted either on simplistic simulation models, which fail to assess concrete implementations, or on complete system implementations, which are costly to test with realistic loads and faults at large scale. This paper presents a hybrid model that combines simulated network and database engine components with real implementations of the replication and communication protocols. Such a model allows a precise evaluation of the overall performance and resilience of GCS-based database replication when subjected to realistic loads and fault-injection campaigns in several environments. Besides describing the design and validation of the model, the paper presents results from simulating the Database State Machine replication technique using prototype implementations of its protocols.
Avaliação de Plataformas de Suporte à Composição de Protocolos para Sistemas de Replicação de Bases de Dados (Evaluation of Protocol Composition Support Platforms for Database Replication Systems): This report presents a performance evaluation of platforms that support protocol composition, in order to assess their suitability for supporting database replication systems.
Concretização e Avaliação de uma Plataforma de Suporte à Composição e Execução de Protocolos (Implementation and Evaluation of a Platform Supporting Protocol Composition and Execution): The development of communication protocols can be simplified by using suitable composition and execution support platforms. This dissertation describes the implementation of one such platform, Appia, and evaluates its various facets. In particular, it evaluates the expressiveness and effectiveness of the composition support mechanisms, as well as the performance of the execution environment. To support the evaluation, an implementation of a group communication service with complex composition requirements, Light-Weight Groups (Grupos Ligeiros), was developed. Implementing this service on the Appia platform required the prior development of a complete group communication system offering view synchrony, which is also described in the dissertation. To facilitate a comparative analysis, and to draw lessons that were applied in the development of Appia, the Light-Weight Groups service was also implemented on another composition and execution support platform, namely Ensemble. The dissertation presents an analysis of the two resulting prototypes and draws guidance for the development of future platforms.
Relatório de Trabalhos (Work Report): This report describes the scientific research activities carried out by the grant holder Alfrânio Correia Jr. from 1 December 2002 to 30 November 2003, within the StrongRep project, under the supervision of Professor Rui Carlos Oliveira.
