Ceph USB Gateway
Formal Metadata

Title: Ceph USB Gateway
Number of Parts: 63
License: CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers: 10.5446/54565 (DOI)
openSUSE Conference 2016 (Part 50 of 63)
Transcript: English (auto-generated)
00:08
Okay. Thanks for coming. So I'm David Disseldorp, and I'm going to talk about a Hack Week project I worked on, which is a Ceph USB gateway.
00:25
So just a quick agenda for starters. So yeah, just start off with a brief introduction to the project, then move on to Ceph, just give a quick overview of Ceph and how it works. Then on to the USB storage stack in kernel.
00:43
So what I've basically made use of there, and we'll finish off with a demonstration. So this project started end of last year during Hack Week 13. So if you don't know, Hack Week 14 starts today.
01:02
So happy hacking next week. But anyway, back to last year, I was considering what to work on for Hack Week. I had an ARM board gathering dust in the corner and thought, okay, I'd like to do something with this board. My day job is Ceph, so I thought, okay,
01:23
I'll put these two things together, and I will work on a Ceph USB storage gateway. So the idea with this is we have our storage cluster, potentially at home in our basement, and we'd like to access that storage somehow from mobile phones, from smart TVs,
01:44
from whatever device with a USB port. The idea is that I can then just plug in my USB gateway and access my Ceph storage cluster with that utility.
02:01
So yes, as mentioned, the goals of the project, or the main goal is just to allow this Ceph storage to be accessed by any device with a USB port. A secondary goal was also to be able to boot from a Rados block device image. So yeah, most laptops can boot from USB.
02:22
So this means, yeah, basically it should just be a matter of connecting this device, and I can boot from the Ceph cluster. Yeah, one other goal was just that the configuration would be sort of as straightforward and easy as possible. So, you know, I didn't wanna have to log into the board
02:41
if I change a key ring or a configuration parameter. So now on to Ceph. Hopefully most of you saw Owen's talk yesterday. I don't really wanna go into big detail, but yeah, Ceph is basically an amazing open source project
03:02
which allows you to, yeah, pull storage across a number of nodes, and that storage is then highly available so that if anything dies within those nodes, you know, potentially you have a power failure or a disk dies, you retain access to your storage.
03:23
Yeah, so it's all open source. It's self-managing, self-healing in that, you know, if you do have a failure, it will then, you know, reconstruct your data with the amount of redundancies you require. And it's also incredibly scalable. So, yeah, you know, you can run petabyte
03:42
or you can store petabytes of information on Ceph. So on the user access side, Ceph has generally broken up into sort of three main protocols or components. So we have on the left there, the Rados Gateway,
04:02
and this is basically supports the RESTful protocols. So Amazon S3 or OpenStack Swift. We have the Rados block device interface, which is what I'm using for this project. And that's basically a block device image,
04:21
which is then backed by objects on the Ceph cluster. And finally on the right, there's the Ceph file system, which is, yeah, POSIX file system on top of the Ceph object store. So a bit more on Rados block device or RBD.
04:41
And with this, we have, as mentioned, a block device image, which is then stored across the Ceph cluster. So obviously that inherits the reliability and scalability aspects of Ceph. Has a number of other neat features. So they're thin provisions. You can resize them online, grow and shrink.
05:03
They support, or you can do snapshots and clones of those images. And on the access side, we have the Linux kernel. So from the kernel, we can locally map a Rados block device image that appears as a local device, and then use it like any other disk.
05:22
And we have user space clients as well for other applications. So now onto the hardware I use for this project. Yeah, actually I've got one.
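As a sketch of the kernel-client path just described, mapping an image locally could look roughly like this; the pool and image names, the client id and the keyring path are assumptions for illustration, not details from the talk:

```shell
# Map an RBD image with the kernel client; it appears as /dev/rbdN
# (pool/image names, client id and keyring path are hypothetical)
rbd map rbd/usb --id admin --keyring /etc/ceph/ceph.client.admin.keyring

# Use it like any other disk, e.g. put a filesystem on it and mount it
mkfs.ext4 /dev/rbd0
mount /dev/rbd0 /mnt

# ...and unmap when finished
umount /mnt
rbd unmap /dev/rbd0
```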
05:45
Yeah, so basically I'm using a Cubietruck, which is just a low-end ARMv7 CPU, dual core with two gigabytes RAM. Yeah, just sort of pretty slow, cheap board,
06:02
but this is all I had access to, so it's also capable of doing the job. My main requirement for the hardware was that it had mainline kernel support. So obviously I didn't wanna be playing with rewriting new USB driver code.
06:24
So yeah, the good thing about the Cubietruck is that the sunxi community have done a lot of work upstream to get basically all the components on the board working. There's also a Tumbleweed port. So thanks to the ARM guys, Andreas, Dirk, Alex,
06:43
they've done great work getting the Tumbleweed port up for that. It is sort of a bit large for something which I'd hoped would be a USB key. There are probably half the components on the board aren't needed for this project,
07:00
but yeah, that's what I had access to. Just a quick look at a couple of alternatives. So there's the open-source-hardware C.H.I.P. computer, which has just been released by Next Thing Co. And there's, on the right, a MIPS
07:24
embedded board, both of which can run Linux and should be basically potential options for this project. And both are also around the 10 Euro mark, which is obviously a benefit. So now onto USB storage.
07:42
So I've sort of covered the Ceph side and what I was using for hardware. Next was just plumbing in the USB side. So within Linux, there are sort of a couple of options there, so there's the mass storage kernel module and the TCM kernel module.
08:01
Both are basically the storage USB gadget layers. And they then support the two SCSI over USB standards. So we have support for some of the interesting performance-based features, so high-speed support
08:23
means command queuing on the device. And SuperSpeed I think is also out-of-order completion. Yeah, there's impressive support on the kernel side for acting as a USB storage device.
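A minimal sketch of the mass-storage gadget side, assuming the gadget modules are built for the board's USB controller; the backing device name and parameters here are illustrative:

```shell
# Expose a backing block device to the USB host via the legacy
# mass-storage gadget module (the backing device name is an assumption)
modprobe g_mass_storage file=/dev/rbd0 removable=1

# ...and tear the gadget down again when done
modprobe -r g_mass_storage
```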
08:42
I should mention at this stage here that I had to, so some of these features weren't enabled on the Tumbleweed kernel. So I did pull down a recent mainline kernel and basically just ran that on the board. So now onto how the board is basically put together
09:02
or the boot sequence of the board. So the idea is you have your board, you plug it into your machine. You then need to point this board at your Ceph cluster. So that involves getting the Ceph configuration and key ring for authentication.
09:21
And also telling it which image should be mapped or exposed by the board as a USB storage device. Once that's done, so this is handled via a, yeah, what I call a configuration file system. So basically I provision a RAM disk, format it with FAT so it can be handled
09:42
on Windows or Linux, and then the user can then copy those configuration files onto the board. Once that's ejected, so we intercept or detect the eject event, and then we can go ahead and connect our, or map our Rados block device image
10:01
and expose it via USB. So with that, I'd like to move on to the demonstration. So with this, I have, or my setup here is I have, on my laptop, a Ceph cluster running with a few OSDs and a monitor node.
10:24
So this is, yeah, then my backing Ceph cluster. At the front here, I have my Cubietruck board. And this Cubietruck is then connected by a network to the Ceph cluster, being the laptop. And finally, on the USB side, the board has a USB
10:45
on the go port, which is then acting in device mode or slave mode. So first off, I'll just show how this is actually configured. So I'll plug this into my laptop.
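On the board side, the RAM-backed FAT configuration drive described earlier could be provisioned along these lines; the image size and paths are my assumptions:

```shell
# Build a small FAT-formatted, RAM-backed image for the config drive
# (size and path are assumptions)
dd if=/dev/zero of=/dev/shm/config.img bs=1M count=4
mkfs.vfat /dev/shm/config.img

# Offer it to the host as removable USB storage, so the Ceph config,
# keyring and USB config file can be copied on from Windows or Linux
modprobe g_mass_storage file=/dev/shm/config.img removable=1
```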
11:08
And I'll bring up the configuration file system.
11:22
So there you can see basically this config drive. So this is backed by a RAM disk on the board. So here we have our Ceph configuration, the key ring to actually access the Ceph cluster, and finally this USB config. And that basically says, okay, the image that I want
11:43
to expose via USB is this USB named image in the RBD pool on the Ceph cluster. So once we're happy with that, which it all looks good,
12:02
I've copied the key ring and config from my cluster, so I can go ahead and eject the device. And what we do then is on the board, intercept the eject, and hopefully, uh-oh.
12:25
If you'd like to see a demo, come by the table later. It looks like I must have done something wrong with the config there. Okay, so normally once the configuration is ejected, we then map the Rados block device image locally,
12:44
and it comes up as a USB storage device. I was then going to show my Android phone connected and copy a photo from the phone onto this device, onto my Ceph cluster. Yeah, if you'd like to see the demo, just come by later.
13:02
It was working seconds ago, so. Otherwise, yeah, a few other options for the project.
13:20
So on the performance side, as you'd expect with, so this is a USB 2 interface. It's not great. I was sort of seeing around 35 megabytes per second in and out. So the USB seems to be the bottleneck there.
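A throughput figure like that can be reproduced from the USB host side with a plain sequential dd; the device node is an assumption and stands for whatever the gateway enumerates as:

```shell
# Rough sequential write and read test against the exposed device
# (/dev/sdX is a placeholder for the enumerated USB disk)
dd if=/dev/zero of=/dev/sdX bs=1M count=256 oflag=direct
dd if=/dev/sdX of=/dev/null bs=1M count=256 iflag=direct
```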
13:40
On the boot side, obviously, you want the board to be booting as quickly as possible so that the storage is exposed as soon as possible. In terms of speeding that up, I looked at, or I did play with running the, or exposing the USB device from the initrd,
14:00
so basically not booting into the full operating system, just booting the initrd and doing everything from there. That works well and sort of speeds up the boot time a lot. Yeah, for the TCM module, so currently I'm using the mass storage kernel module, which has its own SCSI emulation.
14:23
I would have liked to have used the TCM module, which then makes use of the LIO SCSI stack, so more mature, thoroughly used SCSI stack in the kernel. I couldn't get that working with this hardware.
14:41
I did get it working in my VM, so something's wrong there. Otherwise, a couple of other potential improvements, so this has four gigabytes of NAND on board, so we could make use of that for caching, so potentially running dm-cache or bcache on the board,
15:02
to cache locally. Another option would be dm-crypt on the board itself, as well, so transparently handling compression, sorry, encryption of data going in and out. Otherwise, yeah, any questions?
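The dm-crypt idea mentioned could look roughly like this on the board, so the data stored on the cluster is encrypted while the USB host sees plaintext; the device and mapping names are assumptions:

```shell
# Transparently encrypt the mapped RBD image with dm-crypt/LUKS
# (device and mapping names are assumptions)
cryptsetup luksFormat /dev/rbd0
cryptsetup open /dev/rbd0 usbcrypt

# Expose the decrypted mapping over USB instead of the raw device
modprobe g_mass_storage file=/dev/mapper/usbcrypt
```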
15:21
Sorry about the demo, it was working moments ago, so please just come by and take a look. Yes, Alex. Yep, this time the real Alex.
15:43
So, do you actually have images made of OpenSUSE and your tools and everything just assembled together so I can just take it, download it, put it on my board and have it running? No, so I have, basically the scripts I use to expose the ConfigFS and the image are in this repo,
16:00
but otherwise you do have to build your own kernel with those USB gadget, with USB gadget support, so yes, I would like to get that into the standard Cubietruck config if it's possible or if you guys would be interested. Totally, yeah, let's just make sure that all this works out of the box on all systems
16:22
and then create images out of it. Cool. I don't see any reason why we shouldn't and then people can just go and take it on their boards that happen to support OTG and just expose them as Ceph clusters. Sounds good. I should also say I managed to brick one board
16:41
while I was working on this, playing with lithium batteries, so if anyone wants a bricked Cubietruck board, I think it's just the PMU, so yeah, just come see me if you want it. Otherwise, yeah, thanks for attending.