In oVirt datacenter virtualization environments, a manager directs hosts to initiate operations on shared storage. These operations create or remove volumes, copy data between volumes, create or merge snapshots, and perform various other actions related to virtual machine storage. For efficiency and balance, these operations should be distributed across multiple hosts and run in parallel when possible. Maintaining reliability under real-world conditions requires careful management and resilient algorithms. This talk will introduce some of the problems that can arise, including dropped communications, scheduling conflicts, and host or storage array failure. Next, it will present a solution to these problems that uses shared storage locking, atomic operations, volume generations, and forensic analysis of the storage. Through step-by-step examples, the audience will understand how the proposed solution can solve all of the outlined problems.