
A 'Thin Arbiter' for glusterfs replication


Formal Metadata

Title
A 'Thin Arbiter' for glusterfs replication
Number of Parts
490
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Abstract
Maintaining consistency in replication is a challenging problem involving locking of nodes, quorum checks and reconciliation of state, all of which impact performance of the I/O path if not done right. In a distributed system, a minimum of 3 nodes storing metadata is imperative to achieve consensus and prevent the dreaded split-brain state. Gluster has had solutions like the trusted 3-way replication or the '2 replica + 1 arbiter' configuration to achieve this. The latest in the series is a 'Thin Arbiter (TA)', which is more minimalist than the existing '1 arbiter', targeted at container platforms and stretch cluster deployments. A TA node can be deployed outside a gluster cluster and can be shared by multiple gluster volumes. It requires zilch storage space and does not affect I/O path latencies in the happy case. This talk describes the design, working and deployment of TA and the potential gotchas one needs to be aware of while choosing this solution. The intended audience is sysadmins/dev-ops personnel who might want to try out the thin-arbiter volume and troubleshoot any operational issues that may arise. The Thin Arbiter (TA) is different from normal arbitration logic in the sense that even if only one file is bad in one of the copies of the replica, it marks that entire replica unavailable (despite it having other files in it that are healthy), until it is healed and syncs up with the other good copy. While this might seem like a very bad idea for a highly available system, it works very well to prevent split-brains due to intermittent network disconnects rather than a whole node going off-line indefinitely. In talking about this feature, the talk will cover: an introduction to how synchronous replication in gluster works; the role of quorum in preventing split-brains; a brief description of the working of replica 3 and arbiter volumes; the basic idea behind thin-arbiter based replication; the state machine behind the thin-arbiter transaction model; and how it can be installed and used.
Transcript: English (auto-generated)
Okay. Everyone, let's welcome Ravi Shankar. Yeah. With the thin arbiter for GlusterFS. All right. Thank you, guys. Yeah, so my name is Ravi Shankar. I'm a senior software engineer with Red Hat. I've been working with GlusterFS for about seven years,
mostly on the replication component, but I've also worked on other areas of Gluster like the CLI, the Gluster daemon and the POSIX translators. So anyway, so this talk is mainly about the thin arbiter for GlusterFS replication. So the agenda for today is I'll spend the first few minutes on the first three bullet points
where I talk about what the GlusterFS architecture is and how it achieves replication using the automatic file replication or AFR translator. And then we will discuss how quorum logic is important in preventing split-brains when you're writing to files from multiple clients. Once we have an idea of these three things,
then we can actually go to the topic, which is the thin arbiter for GlusterFS replication. All right. So this is the architecture of Gluster. So on the right-hand side, you see many green boxes. They are all servers, server one to server N. Each of the servers hosts a GlusterFS process, which is composed of many translators, starting from the server translator at the top and going down to POSIX at the bottom.
So all of these servers are connected together to form a trusted storage pool, and this is what the volume is comprised of. And on the other side, you have the client, which is basically accessing the volume via different mechanisms like FUSE or NFS-Ganesha, or there's also a libgfapi binding where you can write your own application
using those bindings to access the volume. So most of the logic in GlusterFS is done by translators. So each translator has a specific job. The replication translator sits here on the client side, and it has many children. So depending on the replication factor, each client talks to the respective bricks,
and it does the replication. So the synchronous replication in Gluster is mainly client-driven, meaning the client connects to all the bricks, and the updates are sent synchronously to all the servers, and we wait for the responses from all the servers before sending back the response to the application. So it follows a strong consistency model,
unlike geo-replication where the consistency is eventual. Here, the moment you do the write, because we propagate it to all the servers, you get the writes immediately on the disk. And the writes follow a transaction model, because you also have multiple clients accessing the same file, and you need to have a transaction to prevent stale data or partial writes going from one client to one brick
and the other client to the other brick. The reads are served from one of the replicas, and the slowest brick also dominates the write performance, because we are winding the writes to all the bricks. So there's also a feature of self-healing, where when an update from the client does not go on all the bricks,
the self-healing keeps track of what files need heal, and when the brick comes back up, it automatically does the healing. So to that effect, you have CLI commands to monitor the status of the pending heals, and you also have commands to resolve split-brains in case of replica 2.
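For reference, those heal-monitoring and split-brain commands look roughly like this; "testvol" is a placeholder volume name and the file path is hypothetical:

    # list files that still have pending heals
    gluster volume heal testvol info
    # list files that are in split-brain
    gluster volume heal testvol info split-brain
    # resolve a split-brained file by picking a policy, e.g. keep the copy with the latest mtime
    gluster volume heal testvol split-brain latest-mtime /dir/file.txt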
So I was telling you that the write follows a transaction model. So there are basically five steps for a write when a client does a write. The first is the lock. So lock is when the client takes the lock on all the participating replicas. You need to do this because there are multiple clients accessing the same file, and while doing the write, you need a lock to prevent out-of-order writes. So once you get the lock,
then you do something called the pre-op. It's basically a setxattr call that's done over the wire. So you mark an extended attribute on the file saying, hey, I'm about to do the write. Let's mark something called a dirty bit. So then after that, you do the actual write operation, and if the write is successful, you clear the dirty bit, and if the write transaction fails on some of the bricks,
then the good bricks will actually mark another extended attribute blaming the bad bricks, saying that there is something pending heal, so that when the brick comes back online, it can start healing. And then finally, we do the unlock.
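To make the dirty flag and the pending-heal markers concrete, this is roughly what they look like when you query a file directly on a brick; the volume name "testvol", the brick path and the values shown are illustrative only:

    # dump the AFR extended attributes of a file on a brick, in hex
    getfattr -d -m . -e hex /bricks/brick1/file.txt
    # trusted.afr.dirty=0x000000000000000000000000            <- dirty flag, cleared after a successful write
    # trusted.afr.testvol-client-1=0x000000020000000000000000 <- non-zero: this brick blames the second brick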
So the reads are very simple. You basically serve the read from one of the good bricks. So AFR uses the extended attributes to know which is the good and which is the bad brick, and it is always ensured that you serve from a good brick. So which brick does the read get served from? That's configurable. You have various policies. So the default policy which is used is the hash of the GFID of the file. So that means that even though there are multiple clients,
if they are accessing the same file, they will go to the same brick. But you can also load balance it using other strategies, like mixing the hash of the GFID and the client PID. The client PID is unique to each client, so you can distribute the reads too. So the self-heal daemon, as I was telling you, it is the responsibility of the self-heal daemon to ensure that the missed writes are actually healed onto the bricks when they come back up. So the self-heal daemon runs on every node of the cluster and it heals both data, metadata and entries that were missed when one of the bricks was down. So there are two ways to do the heal. One is to crawl the entire file system, right?
That's like a really stupid way of doing it. So what AFR does is, it maintains the list of failures in a special directory, the .glusterfs/indices folder on the brick. So whenever a write transaction fails on some of the bricks, the good bricks record these GFIDs inside this .glusterfs/indices folder, and when the bricks come back up, the self-heal daemon crawls this folder and just gets the list of files that need to be healed, and then it does the heal. So the self-heal daemon does the healing under the presence of locks, because clients can also be writing to the same file when the healing is going on. So you will have to take locks for mutual exclusion from the client I/O.
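A minimal way to see that index, assuming a brick at /bricks/brick1 and a volume named testvol:

    # GFIDs of files with pending heals are recorded under the brick's index directory
    ls /bricks/brick1/.glusterfs/indices/xattrop/
    # the same information, resolved to file names, via the CLI
    gluster volume heal testvol info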
So the traditional way of replicating had earlier been replica 2, but the problem, as you guys might already know, is that replica 2 is prone to split-brains. So there can be two types of split-brains, split-brain in time and split-brain in space. Split-brain in time is when the write from the same client succeeds on one brick and fails on the other, and the next write succeeds on the opposite brick.
So here we have brick 1 success, brick 2 failure. Here the write on brick 1 failed and brick 2 succeeded. So when both bricks come back up, the client doesn't know which is the good copy and you cannot resolve it. The other one is split-brain in space, where clients can partially see the bricks. So there are two clients, client 1 and client 2.
Each of them can see only one brick and you still allow the write because there is no concept of quorum in replica 2. So you can end up in a split-brain in that state also. So how do you avoid split-brains? You have to have a notion of quorum, which means that you need to go at least for replica 3 or basically an odd number of replicas. So the general thing is that for a 2n plus 1 replica,
you can at most tolerate failure of n nodes. So that means if you have a replica 3, you can tolerate failure of one node which is going down. So the thing to note here is that just because you have replica 3 and two nodes online, you cannot say that you are always guaranteed to serve a read.
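On regular replica volumes this client-side quorum is controlled through volume options; a rough sketch, with "testvol" as a placeholder:

    # 'auto' allows writes only when a majority of the bricks (including the first brick, for replica 2) is up
    gluster volume set testvol cluster.quorum-type auto
    # or demand a fixed number of bricks to be up
    gluster volume set testvol cluster.quorum-type fixed
    gluster volume set testvol cluster.quorum-count 2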
So there is a problem that if the only good brick which had witnessed all the writes goes down, then you still have to fail the IO. So let's just look at that with a diagram. So here you have a client which is trying to write to all the three bricks. The first write did not succeed on the third brick
and the second write did not succeed on brick 2. So now we have brick 1, which is the only one which has witnessed all the writes. So when the third write comes, even though the client is connected to brick 2 and 3, we cannot allow the write because the only good brick which witnessed all the writes previously, that is brick 1, is down.
So that's one of the things in any replication system. So even if you have a quorum number of bricks, if the good bricks are down, then you still cannot serve the reads. So you guys must have already used or known about the arbiter feature also. So arbiter is also a type of replica 2 plus, the third brick is used as an arbiter where it stores only the namespace. It doesn't store the contents of the file, so they are all zero byte files. So I was telling that AFR uses extended attributes to figure out which brick is good and which brick is bad, right? So in case of replica 2, because there are only two copies, you cannot break the tie with the extended attributes; you don't have three copies of them.
So the arbiter kind of overcomes that problem by storing the file name alone, so only the namespace is captured, and you store the extended attributes on those respective files. So now, since we have three copies of the metadata information, we can prevent ending up in a split-brain state. So this was the arbiter.
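For comparison, a classic arbiter volume is created like this; host names and brick paths are placeholders, and the third brick stores only file names and metadata:

    gluster volume create arbvol replica 3 arbiter 1 \
        server1:/bricks/brick1 server2:/bricks/brick1 server3:/bricks/arbiter-brick
    gluster volume start arbvol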
So why did we go with the thin arbiter? So let's see what thin arbiter is. So thin arbiter is essentially a replica 2 volume plus a lightweight thin arbiter process. So if you look at the normal arbiter, it's actually a full blown process in the sense that there is one arbiter for every replicate sub-volume and it stores all the files of that particular volume
but thin arbiter is not like that. It is actually lying outside the trusted storage pool, which means that it's not a part of the cluster at all. So you can host this in a cloud environment somewhere where the management daemon of GlusterD is not running at all. So the node is not here
and it's not managed by GlusterD. So if you look at the volume information, you will still see that it is depicted as a 1x2, which is a replica 2 volume, but you will also see an extra line saying it's a thin arbiter. So that's how you will identify that the volume is a thin arbiter volume.
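The output of gluster volume info for such a volume looks something like the sketch below; the exact field names vary between GlusterFS versions, so treat it as illustrative only:

    Volume Name: tavol
    Type: Replicate
    Number of Bricks: 1 x 2 = 2
    Brick1: server1:/bricks/brick1
    Brick2: server2:/bricks/brick1
    Thin-arbiter-path: ta-node:/bricks/ta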
The advantage of thin arbiter is that you can host multiple replica 2 volumes with the same thin arbiter node. So if you look at the diagram here, you see that there are different trusted storage pools. You have TSP1, TSP2 and they host different volumes. Some of them are thin arbiter volumes, some of them are normal volumes and they all use the same thin arbiter
which can be hosted separately in the cloud. So all the clients which are in the respective, which belong, which access the respective volumes also talk to the thin arbiter. So one caveat here is that we must use volume names which are unique across the different trusted storage pools. So the reason is because the thin arbiter has some ID file
which we will see in the next slide. So that uses the name of the volume to identify which replica it belongs to. So if you have the same volume name across multiple trusted storage pools then the thin arbiter ID file uniqueness is lost. So that's why as long as the volume names are unique you can use the same thin arbiter for multiple storage pools.
So what exactly is the thin arbiter process? So it is essentially a lightweight brick process. So you have all the standard translators which you see in the GlusterFS brick process like starting from protocol server and ending with the POSIX. But you also have one additional translator called the thin arbiter here
which is sitting just above POSIX. So I was telling you that the thin arbiter contains only one file and that file is used for quorum to determine which brick is good and which brick is bad, right? So the only operations that come on the thin arbiter are first creating the ID file, which happens only once during the life cycle of a volume, and then the actual setxattr calls
which AFR uses to track which brick is good and bad. So any other op which comes on this has to be denied. So that's the job of the thin arbiter translator. So it allows only the create and the xattrop FOP to go through. Anything else would be denied.
And the other thing is that, as I was telling you, you can run the thin arbiter process on a node which does not have GlusterD. So if you know GlusterD, GlusterD is a management daemon which is used for spawning all the brick processes, the self-heal daemons and all that. So if you restart a node, it is GlusterD which ensures that the brick process comes back up. So without GlusterD, how does it actually work?
So if you have mounted a Gluster volume, you know the way the mount logic works: when you issue the mount -t command with the server name and the volume name, the client initially talks to GlusterD, gets the information from the volfile, and then connects to each of the bricks using a particular port number.
But because on the thin arbiter node we are not hosting any GlusterD process, we need to hard-code a port number. So we currently use 24007, because that's the port of GlusterD. So because GlusterD is not running, when you mount the volume the client will directly connect to this thin arbiter process on this port.
If you want to change it to some other port, there are volume options available, so you can configure it to use a different port. So let's look at how thin arbiter works for writes and reads. So let's assume that the application is writing something on file 1 and you have brick 1, brick 2 and the thin arbiter
and let's say the write failed on the second brick and succeeded on the first brick. So what AFR does is before sending back the response to the application it marks on the first brick that there is some pending operation on the second brick. So it essentially marks that brick 2 is bad. So that information is marked both on the first brick
and also on the thin arbiter. So after marking that, it also stores in memory which brick is good and which brick is bad. The reason why it does this is that the thin arbiter does not need to be within the 5 millisecond latency limit which is there for normal Gluster brick processes.
So in order to not contact the thin arbiter for every file operation, you try to maintain in memory the information about the bad brick. So the client saves in memory that brick 2 is bad and then it responds with success to the application. Now when the write 2 comes on the same file
as long as the write succeeds on the previously known good copy of the brick we say that it is a success to the application. So if you look at the diagram from here write 2 comes on file 1 and it succeeds on the first brick but fails on the second brick. So because we already know from the previous write
that brick 1 is the good copy and brick 2 is bad we can return success to the application without actually contacting the thin arbiter. So that's how it's not actually participating in the IO path. So let's see what happens when a write comes and it fails on the
opposite replica. So write 3 comes on the same file, and this time it fails on the first brick but succeeds on the second brick. Now, because we know that if we allow this as a success we can end up in a split-brain state, we don't return success and we actually fail the FOP. So this is the essence of thin arbiter.
So to summarize how the write works: if the write fails on both the data bricks, then you can obviously say that the write failed. If the write fails on one brick and it is already a known bad brick, then you can return success to the application, but if the write fails on the brick which was so far the good one, then you will have to fail the write to the application.
Alright, so let's look at how reads work. So let's look at each case. So the case when the client is connected to both the bricks. So when we have this state let's say that we already are in a state where the brick 1 is marking the brick 2 as bad.
So brick 1 is good, brick 2 is bad and the client is connected to both the bricks. In this case we don't have to query the thin arbiter, because the AFR extended attributes on both the bricks already tell us which brick is good and which brick is bad. So you don't have to actually contact the thin arbiter to know whether the results that you interpret
from the extended attributes are valid or not. You can trust them and you can serve the read. The only reason you need to contact the thin arbiter is when... okay, before that, let's go to the second case. So case 2 is when we have the client connected to the good brick
and it is disconnected from the bad brick here. So if this good brick already blames the second one with an extended attribute, then we can be sure that you don't have to contact the thin arbiter node, because the extended attribute state is known and the extended attribute here is already blaming this guy. So you can directly allow the read to go through.
But let's say the client is connected only to this brick which is bad, and not to the first one which is good. So if this guy doesn't blame anybody, we cannot blindly say that we'll serve the read from it, because it does not contain any extended attributes; you will have to query the thin arbiter.
So that is the case where the client actually has to query the thin arbiter, and if the thin arbiter doesn't blame the brick which the client is connected to, then you can serve the read; otherwise you will not be able to. So to summarize what we discussed in the previous slide, if both the data bricks are up
then you serve the read from a good copy; both can be good. If one of them is down, then you will have to query the brick which is up, and if it doesn't blame the brick which is down then you can surely serve it, but otherwise you will have to contact the thin arbiter and get the information from the thin arbiter to see which is good and which is bad.
Okay, so the next two slides are a bit of an implementation detail. Yeah, so I was telling you that the client maintains in memory which brick is good and which brick is bad, right? So the self-heal daemon actually also heals the files when the bricks come up. So how does the client invalidate its in-memory information
when the cell field daemon heals the file? So for that it makes use of upcalls. So the locks translator in GlusterFS provides this notion of an upcall for locks. So when there is a conflicting lock from another client the locks translator will send a notification to the one which is currently holding the lock
and it is up to that client to release it so that the conflicting client can take the lock. So locks translator also supports taking lock on the same file from the same client on multiple domains. So for example client one can take a lock on file one say from offset zero to ten
on domain one it will be granted. If it is trying to get the lock on the same file using a different domain it will still be granted. So the locks translator has this distinguishing feature of domains wherein if the offset and the range of a lock on a given file is same but the domain is different you will allow the locks to go through.
So AFR actually uses these two features to invalidate the in-memory information. So we will see how that works. So when the first failure happens while writing from a client, during the post-op phase AFR takes two locks, one in the notify domain and the other one in the modify domain, and then it marks on the thin arbiter saying that this brick is good and this brick is bad.
After doing the marking, it releases only one lock, which is in the modify domain. So the notify domain lock still resides on the thin arbiter. So for every client that is connected to the thin arbiter, you have a bunch of notify domain locks which are residing in the brick process.
So how that is used is what we will see. So when the self-heal daemon starts to heal the files in the volume, it will attempt to take both the notify and the modify locks, and because of the lock contention feature available in the locks translator, it will send an upcall notification back to the client. So if there are ongoing writes in the client,
it will complete the writes and then release the notify lock. So then the self-heal daemon can actually get the lock and it will proceed with healing the file. So the thing to note is that if IO fails during the heal, the client will again mark the bad brick, and it will basically invalidate its in-memory information.
So this is how the locks translator's upcall infrastructure and multiple-domain locks are used for maintaining the in-memory information. So installation and usage is pretty simple. So on the thin-arbiter node you will have to install the server RPMs
and you will have to run a script to start the thin-arbiter process. Once you have that done, then you can peer probe and create the volume, which is a replica 2, using the gluster volume create syntax. I will show this in a demo. So if you are using it in a standalone mode, you can use this method.
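As a rough end-to-end sketch of that standalone setup; host names, brick paths and the volume name "tavol" are placeholders, and the script location can differ between packages:

    # on the thin-arbiter node (server RPMs installed, no GlusterD needed)
    setup-thin-arbiter.sh -s        # asks for the brick path and starts the thin-arbiter process
    # on one of the data nodes
    gluster peer probe server2
    gluster volume create tavol replica 2 thin-arbiter 1 \
        server1:/bricks/brick1 server2:/bricks/brick1 ta-node:/bricks/ta
    gluster volume start tavol
    # on the client
    mount -t glusterfs server1:/tavol /mnt/tavol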
If you are using containers and a provisioner for containers, there is something called kadalu.io which also recently added support for thin-arbiter. You can try that out. So the things still to do are: we currently do not have support for the add-brick and remove-brick CLI. So if you have existing Gluster volumes, if you want to replace a brick
or convert existing replica 2 or replica 3 volumes to thin-arbiter volumes, it is currently not possible. So those are the things that we need to work on. Also for the reads: I was telling that the writes have in-memory information on which brick is good and which brick is bad, but the reads do not have that information. So every time, they query the thin-arbiter to get the information.
So that is something which we need to work on, to optimize the reads by using the in-memory information about the bricks. And we also have to fix bugs. If you guys try it out and report bugs, we will be happy to fix them. So I will just show you a demo now. I have recorded it already. I will just play it out.
So we have, I hope the font is visible.
So we have four VMs here like 1, 2, 3 and 4. Ravi 1 and Ravi 2 I am going to use for hosting the actual volume and Ravi 3 I am going to use for hosting the thin-arbiter process and the fourth machine would be the client. So let's first start with installing the thin-arbiter on VM 3.
So you have to run the script called setup-thin-arbiter.sh and you have to run it with the -s flag. So it will basically ask you for the brick path, and you enter where you want the brick to be hosted, say bricks/brick-ta, and then the thin-arbiter has been started. So if you check whether the process is running, you can see it. And I was telling you that there is no GlusterD on the thin-arbiter node,
which means that the process management has to be done automatically by systemd. So we have integrated this with systemd so that even when the thin-arbiter node gets rebooted or the process crashes it will automatically start it. So the unit file takes care of that so let's try to kill the process
and see, you can see that it is spawned again with a different PID. So you don't really need GlusterD on this node.
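A quick way to see that behaviour; the systemd unit name may vary between versions, so check your installation:

    pgrep -af glusterfsd                  # note the PID of the thin-arbiter brick process
    kill <PID>                            # kill it (placeholder PID)
    pgrep -af glusterfsd                  # it is respawned with a new PID
    systemctl status gluster-ta-volume    # the unit that keeps the thin-arbiter process running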
So having started the thin-arbiter, let us try to create a replica 2 volume using the first two VMs. So I will just export some environment variables with the IP addresses of the VMs, and then I am going to create a thin-arbiter volume. So the syntax is gluster volume create, the volume name, replica 2, thin-arbiter 1, then you give the list of bricks which form the data bricks of the volume, and at the end you mention the thin-arbiter. So VM3 is the thin-arbiter here, so we say VM3 bricks/brick-TA and that's it.
So we will start the volume. Okay, so now let's see on the second node whether the bricks are up and running. It is, so now we are good to actually mount the volume and start doing IO on it.
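In command form, that is roughly the following, with "tavol" as the placeholder volume name:

    gluster volume start tavol
    gluster volume status tavol    # shows whether the data bricks are online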
So we will go to node 4 now. So before mounting the volume, let me just show you the ID file. So if you look at the thin-arbiter brick, it is currently empty, there is nothing here. So I was telling you that the ID file is created when you first mount the volume. So if you issue the mount command and then come back and check here,
there is this ID file which is created. It is a 0-byte file, and this ID file is used for capturing the good and bad brick information for all the files in this replica volume. So if you do a write to a file, you see that the file contents are getting replicated to both data bricks.
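For example, a quick check that a write from the mount lands on both data bricks; all paths are placeholders:

    echo hello > /mnt/tavol/file1      # on the client
    md5sum /bricks/brick1/file1        # on server1
    md5sum /bricks/brick1/file1        # on server2 - same checksum
    ls -l /bricks/ta/                  # on the thin-arbiter node: only the 0-byte ID file, no data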
So let's kill the second data brick and try to write something. So the write is successful, and you are also able to read it, because the brick we killed is now the only bad brick. So if you now look at the extended attributes on the thin-arbiter, if you do a getfattr and see,
you will see that it contains certain extended attributes which blame the client 1. Client 1 is the second brick. So client 0 is first and client 1 is second. So because we killed the bricks in VM 2 it is saying that there is a pending data heal on the second brick.
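On the thin-arbiter node this looks roughly as follows; the attribute names follow the volume name and the values shown are illustrative:

    getfattr -d -m . -e hex /bricks/ta/*
    # trusted.afr.tavol-client-0=0x000000000000000000000000   <- the first brick is not blamed
    # trusted.afr.tavol-client-1=0x000000010000000000000000   <- the second brick has a pending heal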
So the thin-arbiter essentially captures this information. The client also has it in memory that now the second brick is bad and I should not allow any writes which might fail on the first brick. So let's kill the first brick and see what happens. So now we are going to kill the brick which is good and we will bring the second VM back up.
So now you see that the first brick, which witnessed the write, was killed and now the second brick is up. But even now, when you try to access the volume from the client, you will see that the client fails both reads and writes. So ls and reads fail with an input/output error and writes also fail with an input/output error.
So because the only good brick is down, we are not allowing the writes anymore. So if we bring the brick back up, so we will restart GlusterD on the first node so that the brick comes back up. Now the self-heal daemon would have automatically healed the file by now.
And you can see that the file is again accessible from the mount. And if you look at the thin-arbiter, before that let's look at the contents of the file from both bricks. So the contents of the file are the same on both bricks. And if you do a getfattr now and look at the extended attribute that AFR maintains, it has now been reset to all zeros.
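After the heal, the same checks show a clean state again; a short sketch:

    gluster volume heal tavol info           # Number of entries: 0 on both bricks
    getfattr -d -m . -e hex /bricks/ta/*     # trusted.afr.tavol-client-* is back to all zeros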
So earlier it was like blaming the second brick. Now because the self-heal has happened, it's reset it and now you can continue with the IO as usual. Yeah, so that's pretty much it. Guys have any questions?
Alright then, thank you.