Video in TIB AV-Portal: Mailpile

Formal Metadata

Title of Series
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Release Date

Content Metadata

Subject Area
Mailpile is the new kid on the block in the world of F/LOSS e-mail clients. This talk introduces Mailpile from a F/LOSS hacker's perspective, going briefly into the motivation of the project before delving into demos and technical implementation details. Mailpile is a Free and Open Source Software e-mail client which raised over 163,000 USD last September (on IndieGoGo and Bitcoin) to support development of the first version of the software. Mailpile is built around a powerful search engine and uses web technology for the user interface, hoping to rival GMail and other popular web-mail services when it comes to both usability and speed. The project places a strong emphasis on privacy and decentralization and is integrating GPG encryption as a core feature of the user interface. Written in Python and using modern web technology for the user interface, Mailpile aims to be very accessible and easy for F/LOSS hackers to tweak and explore. This talk will serve as an introduction both for potential end users and for folks interested in helping us reboot the world of Free Software e-mail.
[inaudible] I'm here to present Mailpile [inaudible].
My qualifications are just a bachelor's in computer science [inaudible], but I've been doing Linux and stuff since the early days, and it has been my passion for a very long time. As a result, I decided to dedicate myself to it full time in 2010, and I've managed to pull that off one way or another, working on projects and products with a nice community of people.
What is Mailpile? In brief, Mailpile is an e-mail client. There is some confusion because we're using web-based technology for the user interface, so some people think maybe it's a server. No, it's not really a server; it's an e-mail client. It's built on web technology so that we can use modern web design to make it nice and accessible. It has a powerful search engine, which is actually how the project started, and we're hoping to make it really easy for people to use PGP, both signing and encryption.
It's obviously free software. We're currently in a dual-licensing phase where the code is available under two licenses; we're going to drop one of those when our community has voted on which they prefer. It's written in popular languages: Python [inaudible]. The first half of this talk is the motivation and history of the project, and in the second half I'm going to go into some of the technical details, because I know you guys are all techies.
So, why? I don't mean to offend anyone, but I feel like the state of free software e-mail has been pretty crap for a long time. There's not a lot of development, and the way we're doing things is kind of [inaudible]. I feel like we need to catch up, because if the free software community is going to compete with, you know, the Microsofts, we need to innovate and provide a good experience to our users, and we haven't been doing that. If we don't, we will never have mass encryption: e-mail will always be written on the back of a postcard, while we trust the cloud or use proprietary software. So this is important.
E-mail has been becoming increasingly centralized. People these days, when they think of e-mail, just sort of assume that it's Gmail or Hotmail; they just assume it's something running on someone else's hardware. A lot of companies don't bother running mail servers, a lot of schools don't bother, and even government organizations don't run their own e-mail anymore. I think this is a pattern that is going to continue if we don't provide better software. Now, I don't like having someone in the cloud filtering my spam, because I consider that to be a form of censorship. I would like to have a nice spam filter running on my own machine, where I have all the data and can find a message if it gets lost. If Google filters something away, I have no idea why; I don't know what happened. But they provide a good service, and all you have to do is pay for it with your privacy.
Edward Snowden has explained one facet of why that's a bad thing. Eben Moglen has talked about this, and Richard Stallman has talked about this for a long time [inaudible], but he's right: putting stuff in the cloud, with other people running things for us, is worse for our freedoms than closed source. So I like to piss people off by saying that Microsoft is [inaudible], because honestly, at least Microsoft is letting you run the software yourself. When you put stuff in the cloud, they can very easily lock you in, it's very hard to move, and you have a sort of natural monopoly.
The biggest provider always seems to be where everyone gravitates, so you get these massive centralized silos of data. That's terrible for privacy, and it's bad for a lot of other things. There's also a risk of something which some people in the room are old enough to remember. You guys remember Microsoft's strategy of embrace, extend, extinguish? [inaudible] This is what Microsoft tried to do to the open web.
They said, hey, this web thing is kind of neat, let's give everyone a free browser. And, you know, Internet Explorer wasn't bad. But then they came up with this thing called ActiveX, and they encouraged developers in the Microsoft ecosystem to use ActiveX to make their websites do new and sort of interesting and unique things. So they extended the web using proprietary technology, and they were hoping that by doing so, enough websites would move to their proprietary version of the web that they would kill the open web, because nobody else had ActiveX; it was closed.
There's a risk of this happening with things like e-mail if we let the cloud run it for us, and we're already seeing very small hints of things moving in this direction. I'm sorry to pick on Google, because I used to work there and they're nice guys, but the examples I have do come from Gmail. Google has started adding features where, if you send messages formatted in a particular way and you're a Google partner, then buttons show up [inaudible]. You might be able to buy something from within your e-mail client, and do stuff that you won't be able to do with any other client. So they have started extending e-mail and making it into something that it never was before, something that is definitely not an open standard and is not interoperable with the rest of the ecosystem. This is happening, and it's something we should be aware of.
And then, of course, e-mail in the cloud is fundamentally incompatible with encryption. The reason is that these cloud providers all base their businesses on advertising. They need to be able to target those ads, and they can't target those ads if they can't read the content of your e-mail. If they don't know what you're talking about, they can't show you relevant advertisements, and they won't be making money. So they're never going to [inaudible].
So I was worried about this, other people were worried about this, and so we came up with this idea: let's build Mailpile.
Strategically, there are five things we want to achieve. We want to make e-mail software that the free software community enjoys hacking on and playing with, something that's accessible to the community. We want to write software that regular people want to use, so it has to be attractive, user-friendly and fast, all of those things that we expect from good software. We want to make e-mail encryption understandable, so people will do it without thinking about it, and so that it stops being something that just techies do and becomes something everyone can do. We want to make it easy to decentralize e-mail: to take your data back, to make it easy for people to run their own infrastructure and move out of the cloud as much as possible. Even if not everyone does this, it will have a noticeable benefit to the overall tech community, because [inaudible]: if people can leave, then [inaudible]. Finally, we need to find better business models, because one of the Achilles' heels of the free software community is that so much work is done by volunteers and we have very limited resources. We need to figure out ways to actually do this as a full-time job. So this is the sort of stuff that we're thinking about.
The timeline of the project so far: in 2011 I wrote an experimental search engine. I was able to search all my e-mail in milliseconds, and it was great. Then nothing happened; 2012 is not on the slide [inaudible]. But in 2013 things started happening. Smári McCarthy had been harassing me, saying, you know, let's do something with this e-mail thing, it's kind of cool; I'm using Thunderbird, I don't like it, I want something better [inaudible]. Then we bumped into Brennan, who is a user interface designer. We ran into him in downtown Reykjavik [inaudible] café. We had some coffee, we had some beer, and he was like, what are you working on? I said I'm playing with this e-mail thing, and ten minutes later he had made a logo for the project, and we're still using it.
So we decided to join forces. We came up with this plan, and we decided to do a fundraising campaign. We launched an Indiegogo fundraising campaign in August [inaudible], which was fantastic, and we succeeded. We raised enough money that we can work on this full time for a year, and actually a little bit extra.
[inaudible] We'll probably get 16, 17, 18 months out of that, but it depends a bit on how Bitcoin develops [inaudible]. Since September we've just had our heads down writing code, going to a couple of conferences to talk, but for the most part we've just been working.
In 2014 we realized, well, you know, we have to keep those promises we made back in August. So we mailed out most of the things we promised to people: we mailed out T-shirts [inaudible] and some rocks as well [inaudible].
So, what's the state of the alpha? The highlight is that this is not vaporware: we actually are writing software, and you can download it and try it. Part of what we're doing is trying to prove that you can crowdfund free software like this [inaudible], so keeping your promises is really important. We have a nice HTML5-based user interface. We have a fast search engine, which is not based on [inaudible] and which I will discuss shortly. We have some basic support for PGP encryption and signatures; it's a bit clunky, it's new, but it's there. We have spam filtering based on Bayesian filters, and we have over 30 volunteers translating Mailpile into their local languages, which we think is just wonderful.
Alas, it also has lowlights. It's an alpha, so don't expect it to work, or don't expect it to work well. It's tricky to install and configure; at the moment it is still very much for developers. We don't have IMAP and POP3 support implemented yet, and the reason is that we haven't figured out how to store all that data, because one thing we want to be sure of is that [inaudible], so we need to think about how we do that. We don't have support for S/MIME; I don't know if anyone cares about it, but it was on the list of things we talked about. And we still don't have a compelling story about how we're going to do key management yet: handling keys expiring and keys being revoked and all that stuff. We need to put a UI on that which doesn't kill the normal user, and that's going to be tricky. There's lots of work to do.
You can get the source code. We tagged the alpha release, and from this point on we'll try to keep that sort of stable for people to just look at, and we will break stuff and make things messy on the main branch. We also have live demos, because one of the things that we face is that we were backed by over 3,000 people, most of whom are not particularly technical, so they're not going to be able to download the source code [inaudible]. So we put some effort into making sure demos were up, so they can see it and play with it and know what we're working on. And of course you can check that out as well.
[inaudible] We have a website [inaudible]. So this is the intermission of my talk: I'm going to do a little demo to show you what it looks like, and then I'm going to go through some of the technical implementation. I may use all of my time and then some, so I'm really counting on them to stop me.
[inaudible] This is the demo page, so we should try it [inaudible]. It doesn't work [inaudible], so I'll use the one on localhost; that's how we do it most of the time anyway.
This is what the demo is supposed to look like. It's an e-mail client [inaudible]. This is my personal mail [inaudible]. For reading e-mail [inaudible] you can read messages; see my cursor there. Here at the top we have icons that signify the state of encryption, so whether the message was signed or encrypted, and of course there are tooltips to explain what's going on. This is the first draft; we're going to be improving the interface over time.
I can quickly compose a message here and try to encrypt it, but it says there's an error, and the reason is that I don't have the key for one of the recipients. If I look at the details [inaudible], I can try removing Chris, because he doesn't have a key [inaudible]. No, it still can't encrypt [inaudible]. There's this other guy here [inaudible], and now it's fine. So this is what we're trying to do: we're trying to make it really easy. The idea is that if we know that we have a key for the person that we're communicating with, we will suggest that the mail be encrypted by default. [inaudible] A lot of us are using multiple devices, and only one of them can actually decrypt, so we're also looking at ways to communicate preferences. If we see that we always get encrypted mail from this person, or that their mail is always signed, then [inaudible]. That's stuff that we need to work on over time.
This is the search engine.
I could search for Linux, or Iceland, or conference [inaudible]. My mailbox has about 160 thousand messages, and this is running on a relatively crappy laptop [inaudible]. Reasonably fast, right? It shows tags here [inaudible], it shows the conversations, and I can filter and only show the ones I haven't bothered reading. So it's making progress.
We have contacts. Right here, this is pulling contacts out of my GPG keyring. One of the core ideas we have is that we want to integrate key management with contacts. You shouldn't really have a totally separate application to manage keys; we should be able to manage them along with everything else to do with the contacts. We're also pulling in pictures; Gravatar is where those images come from. I would like to pull pictures from other sources [inaudible] if we can; that would be a cool thing.
We have various plug-ins for pulling contacts from other sources as well. We can pull in contacts from the Mork database, which is how Thunderbird stores contacts, and we have a CardDAV plug-in; I haven't tried it myself. These things are all under development, but they're definitely going to happen, and you can help.
[inaudible] This is where we beg for money so we can keep working on this. Please send us all your money; that would be great. So that's the web interface in a nutshell. Keyboard shortcuts? Not yet, but that's just a matter of time.
The back end of this is written as an API: every time you hit an endpoint, you get the details back as JSON. So we can actually just request that instead of looking at the HTML.
This is the same data. It's all the information about the search results that are visible. These big blobs here are actually the thumbnail images that get displayed. That gives you a lot of stuff to interact with programmatically, and we're hoping that will lead to some interesting things. [Audience question, inaudible.]
[inaudible] I should show you the composer, the full composer.
This is what the full composer looks like. It has the same features as the other one [inaudible]. I can search for people: this is searching in my personal list of addresses, and any address that it sees in incoming mail shows up as a suggestion there. The ones that actually have keys, as I said, get a little lock; hopefully that will encourage people to use those ones over the others.
This here is a command-line interface to Mailpile. It has a bunch of commands, and you can search for things [inaudible] and do various other things, so it's a relatively powerful command line. [Audience question, inaudible.] Could you repeat the question? Yes, I will actually get into that in a moment, but yes: all this mail is available offline, it is stored locally, and we are encouraging people to store things locally [inaudible]. We just have to figure out how we do that in a sane way.
I'll go back to my slides and tell you all about the technology behind this and how it all works. This is your chance to run away.
So, the overall architecture of the project.
Our guiding principles are: it's free software; we use open standards; and we're proponents of decentralization, so we feel that users should hold their own data, and we want to help them do so. Searching is a critical feature: it's not OK to expect people to organize all their e-mail into folders, because a regular user has thousands of messages, and someone like me, who has been on the internet for well over a decade, has a couple hundred thousand. Our main user interface is the web, but we're going to allow for alternate ones as well. We want to encrypt by default whenever possible. And we have the expectation that our end users are not the people in this room.
They're non-technical people; that's who we're writing this software for. I would like this software to be the answer to the question when someone comes and says, "I'm worried about my privacy, what can I do?", because today we don't have a good answer. When someone comes and says "I want my e-mail to be private", right now we say "install Thunderbird, learn [inaudible]", and that's kind of hard. And Thunderbird is [inaudible]; it's not being actively developed. Getting into the code:
We have a Python core, which is where we do things like configuration. The search engine is part of the core; reading, writing and sending e-mail are part of the core, as are the crypto and the HTTP server. We have a plug-in architecture. It's not stable yet, but we do want people to be able to extend things, and we're already doing so ourselves. We support multiple mailbox formats, and that's starting to become pluggable. Importing contacts was written as a plug-in from day one. [inaudible] When you install Mailpile on a machine, it goes and finds your mail and your settings, and we'll do that with plug-ins that people write for different operating systems. And we have plug-ins that allow you to tweak the search engine itself, so you can teach the search engine to read your mail and create keywords, or to create complex queries based on something that makes sense to you.
We have a web-based API; I touched on that briefly in the demo, where you got JSON back. Most of the URLs that people see in the web browser map directly to a Python method, and that Python method, by default, if you hit it as an API endpoint, returns JSON. If you're using the regular interface it will give you HTML, but we also provide HTML embedded in JSON, which is useful for doing AJAX and such things, and plain text, and we might do XML or whatever there's a demand for. Everything that we render is rendered using Jinja2.
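A minimal sketch of the "URL maps to a Python method, output format chosen per request" idea described above. The registry, the `as.json` / `as.jhtml` / `as.txt` URL suffixes and every function name here are assumptions made for illustration, not Mailpile's actual API, and the HTML renderer is a stand-in for a real template engine such as Jinja2.

```python
import json

# Hypothetical command registry: URL path -> Python function.
COMMANDS = {}


def command(path):
    """Decorator that registers a function as the handler for a URL path."""
    def register(func):
        COMMANDS[path] = func
        return func
    return register


@command("/search/")
def search(args):
    # Stand-in result; a real command would query the search index.
    return {"query": args.get("q", ""), "results": []}


def render_html(data):
    # Stand-in for a template engine.
    rows = "".join(f"<li>{k}: {v}</li>" for k, v in sorted(data.items()))
    return f"<ul>{rows}</ul>"


def handle(url, args):
    """Dispatch a URL to its command and render the result in the requested format."""
    fmt = "html"
    for ext in ("json", "jhtml", "txt"):
        suffix = "as." + ext
        if url.endswith(suffix):
            url, fmt = url[: -len(suffix)], ext
    data = COMMANDS[url](args)
    if fmt == "json":
        return json.dumps(data)
    if fmt == "jhtml":
        # HTML embedded in JSON, convenient for AJAX updates.
        return json.dumps({"html": render_html(data)})
    if fmt == "txt":
        return "\n".join(f"{k}: {v}" for k, v in sorted(data.items()))
    return render_html(data)
```

With this shape, the same `search` function serves the browser (`/search/`), scripts (`/search/as.json`) and AJAX callers (`/search/as.jhtml`) without knowing which one is asking.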
For the interface itself we're using [inaudible], jQuery, LESS, all the modern tools. We want to do progressive enhancement: we would like it to work without JavaScript. It doesn't quite do that yet, but you can use it in [inaudible]. And we want to have responsive design, so it scales to different-size screens, mobile devices and so forth, because honestly it's going to be a while before someone develops a Mailpile app for your phone, but we want the web interface to at least be nice on a phone if you expose your Mailpile to the outside [inaudible].
Alternate interfaces: I showed you the command line very briefly. There is also a Python interface: you can import Mailpile, create an instance, and interact with it programmatically. We use that for testing, and it could be used for various other weird things. We might support XML-RPC, because that's really easy to do in Python. I would love it if someone wrote a user interface in ncurses.
The search engine: how does it work? I guess the question of why we are not using an existing search library [inaudible] also applies. It's because that's how I started: I was just curious, can I write a search engine? I wrote it, and it works, and it's fast. It's under 600 lines of code at the moment [inaudible]. The benefit is that it's very, very simple: I can explain to you now how it works, and you'll understand it. I consider that to be a huge benefit, and I don't have that visibility into other tools. We want to do things like make sure this gives people a good experience [inaudible], and we want to be sure that we're storing all the data encrypted on disk, so we want a lot of control over how this is done. So having a simple, small code base that we wrote ourselves is very appealing. It also means we don't have another dependency to worry about when packaging and shipping to users. How does it work?
reads in mail, parses the message, and generates an ID for it. It extracts metadata, things like the subject, the recipients, who sent it, the size of the message, that kind of thing, and it extracts a bunch of keywords, which are the things you can actually search for. Then it creates posting lists, which are basically maps that map a keyword to a set of message IDs. These are very simple files: you have a list of the keywords and they map to message IDs. And there is a metadata index, which maps the same IDs to the metadata, which tells you something about the message. The metadata index is something I store in RAM. Realizing that e-mail is now small enough, and our computers are now powerful enough, that I can put all that data in RAM is what started this project, because once we can do that, all of a sudden everything becomes really fast: most search queries can be answered by opening a small file, reading it, and then looking up all your data in RAM. That means I can answer any search query in under 200 ms on crappy hardware; if you have good hardware, it's just faster. This is how you would put it if you were writing the code yourself: to get the message IDs for a given keyword, you generate a file name by hashing the keyword, you open the file, parse it, and return the set. In reality I'm grouping similar keywords together so we don't have an infinite number of files, and there are some problems with this: adding things is easy, deleting them is harder, and so on, so that's a bit messy. The metadata index looks like this: it's just a dictionary that maps an ID to information about the message. What we put in the metadata index is all of the data that we need to generate search results. So if you search for "potato", I get back a list of message IDs, and I can find the subject lines, senders, all the details I need to show you the list of search results, in RAM, and that's how we get a good experience when you search. The metadata index is stored encrypted on disk. The posting lists are not
encrypted yet, but they will be.
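The lookup just described (hash the keyword to get a file name, open the file, parse it, return a set, then consult an in-memory dictionary for display details) can be sketched roughly as follows. This is an illustration under assumptions, not Mailpile's actual code: the helper names `posting_path`, `add_to_posting_list` and `message_ids_for`, the use of SHA-1, the one-ID-per-line file format and the metadata field names are all invented for the example, and it uses one file per keyword rather than the grouping of similar keywords mentioned in the talk.

```python
import hashlib
import os
import tempfile


def posting_path(index_dir, keyword):
    """Hypothetical layout: one file per keyword, named by a hash of it."""
    digest = hashlib.sha1(keyword.lower().encode("utf-8")).hexdigest()
    return os.path.join(index_dir, digest)


def add_to_posting_list(index_dir, keyword, msg_id):
    """Adding is the easy case: append the message ID to the keyword's file."""
    with open(posting_path(index_dir, keyword), "a", encoding="utf-8") as fd:
        fd.write(msg_id + "\n")


def message_ids_for(index_dir, keyword):
    """Open the keyword's file, parse it, and return the set of message IDs."""
    try:
        with open(posting_path(index_dir, keyword), encoding="utf-8") as fd:
            return {line.strip() for line in fd if line.strip()}
    except FileNotFoundError:
        return set()  # nobody has ever used this keyword


# The metadata index is just a dictionary held in RAM: message ID -> the
# details needed to draw a result list (field names are assumptions).
metadata = {
    "m1": {"subject": "Potato recipes", "from": "alice@example.org", "size": 2048},
    "m2": {"subject": "Lunch?", "from": "bob@example.org", "size": 512},
}

index_dir = tempfile.mkdtemp()
add_to_posting_list(index_dir, "potato", "m1")
add_to_posting_list(index_dir, "lunch", "m2")

hits = message_ids_for(index_dir, "potato")
subjects = [metadata[msg_id]["subject"] for msg_id in sorted(hits)]
```

Only the small posting-list file is read from disk; everything needed to render the results comes from the dictionary in memory, which is the property the talk credits for the speed.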
Putting those two things together, that's a five-line search engine. You start with the set of all the message IDs that exist. For each keyword in the query you narrow down the set: you look up the message IDs that match that keyword and do a set intersection. When you've done that for all of the terms that people are searching for, what is left is the results, and that's the core of a search engine right there.
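As a minimal sketch of that loop, assuming the posting lists are already available as an in-memory dictionary of sets (a stand-in for the on-disk files; the `search` function name and the whitespace-split query are assumptions made for the example):

```python
def search(all_ids, posting_lists, query):
    """Intersect the posting list of every keyword in the query."""
    results = set(all_ids)                            # start with every message ID
    for keyword in query.lower().split():             # one search term at a time
        results &= posting_lists.get(keyword, set())  # set intersection narrows it
    return results


posting_lists = {
    "potato": {"m1", "m3"},
    "salad": {"m2", "m3"},
}
all_ids = {"m1", "m2", "m3"}

both = search(all_ids, posting_lists, "potato salad")
```

Each additional term can only shrink the result set, so a term with no posting list empties it, and an empty query returns every message.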
Then we added tags, which are similar to the labels in Gmail. These are basically search terms where you can edit the results, so you can add and remove message IDs for these keywords. That allows you to do all sorts of things: you can create an inbox, you can mark certain messages as unread, or whatever. Then we have filters, which apply tags automatically, and this allows you to organize your e-mail in a structured way. The search engine has a plug-in interface, which allows you to make new rules for how to generate keywords. I'm really hoping someone will write a plug-in to parse PDF files and extract the text, so I can search those as well. Someone could write a plug-in that understands the grammar of a language, or understands a particular field of specialization. One thing I want is a plug-in that parses messages that come from Ryanair, or from another airline, extracting the information about the flight and making that into a search keyword. Lots of things can be done there. And then there's the other side: plug-ins can generate keywords magically, where they are dynamic. The way you search for dates is implemented as one of these.
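One way to picture tags as "search terms whose results you can edit" is a posting list that is changed by hand rather than derived from message content. The sketch below is only an illustration: the `TagIndex` class, the `apply_filter` helper, the tag names and the `from:ryanair` keyword are all hypothetical and are not taken from Mailpile.

```python
class TagIndex:
    """Tags modelled as keywords whose posting lists can be edited by hand."""

    def __init__(self):
        self.tagged = {}  # tag name -> set of message IDs

    def add_tag(self, tag, msg_ids):
        self.tagged.setdefault(tag, set()).update(msg_ids)

    def remove_tag(self, tag, msg_ids):
        self.tagged.get(tag, set()).difference_update(msg_ids)

    def search(self, tag):
        # Return a copy so callers cannot edit the index by accident.
        return set(self.tagged.get(tag, set()))


def apply_filter(tags, msg_id, keywords, rules):
    """A filter: when a message carries a given keyword, tag it automatically."""
    for keyword, tag in rules:
        if keyword in keywords:
            tags.add_tag(tag, {msg_id})


tags = TagIndex()
tags.add_tag("inbox", {"m1", "m2"})
tags.add_tag("new", {"m1", "m2"})
tags.remove_tag("new", {"m1"})  # the user has now read m1

# Hypothetical rule: mail with the keyword "from:ryanair" gets the tag "travel".
apply_filter(tags, "m2", {"from:ryanair", "flight"}, [("from:ryanair", "travel")])
```

Because a tag is looked up exactly like any other keyword, an inbox view or an unread marker needs no separate storage model, only an editable set.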
You can search for dates in a range, 2010 to 2014. There are no keywords in the index for that, but I can create the keywords out of it, so what actually happens is that it tells the engine to search for 2010 or 2011 or 2012, that kind of thing. And of course plug-ins can also just use the search engine to generate interesting views and figure out things about your mail.
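The date-range trick can be sketched as a rewrite step that turns one dynamic term into several ordinary keywords joined by OR. The `dates:2010..2014` query syntax and the `year:` keyword prefix are assumptions made for this example; the talk does not spell out the real syntax.

```python
def expand_year_range(term):
    """Turn 'dates:2010..2014' into ['year:2010', ..., 'year:2014']."""
    _, _, span = term.partition(":")
    first, _, last = span.partition("..")
    return ["year:%d" % year for year in range(int(first), int(last) + 1)]


def search_term(posting_lists, term):
    """Ordinary terms are looked up directly; a date range becomes an OR."""
    if term.startswith("dates:"):
        hits = set()
        for keyword in expand_year_range(term):
            hits |= posting_lists.get(keyword, set())  # union implements OR
        return hits
    return set(posting_lists.get(term, set()))


posting_lists = {
    "year:2009": {"m0"},
    "year:2010": {"m1"},
    "year:2012": {"m2", "m3"},
    "potato": {"m3"},
}

in_range = search_term(posting_lists, "dates:2010..2014")
```

The index itself only ever stores per-year keywords; the range exists purely at query time, which is what makes this kind of keyword dynamic.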
Most of the work in a search engine like this is to do with actually reading the mail and generating useful terms: which keywords we put in the index and which we leave out. And then of course there is the constant optimization to make it fast enough, and so forth and so on. Works in progress: we are not able to delete things from the search index, which is obviously a bit of a limitation. The encryption needs work; my security people tell me that it's not good enough as it is. And it would be nice to
have a better query language; the current query language is fairly simplistic. So, pause: do people have
questions about the search? [Audience question, inaudible.]
Yeah. So the question is what we do about encrypted mail, and the answer is: if you tell Mailpile to (and this has to be a setting, because not everyone is comfortable with it), Mailpile will just decrypt the mail when it sees it and index it just like anything else. That's one of the reasons that the search index needs to be kept secure, because I believe that encrypted e-mail will not be usable if you can't search it. So this needs to work, but in order to do so we need to build the search index and the metadata index in a way that we're not leaking the contents of the encrypted messages. [Brief, largely inaudible exchange before the talk moves on to spam.] So how does it
work? Our spam filters are based on statistical analysis, so we have engines that read the mail and give you some sort of result. By default we are using SpamBayes. Messages that match are all tagged with the spam tag, and messages that have that tag are by default hidden from search results. We train the filter, or rather train the whole Bayesian engine, by looking at the user's behaviour. So, going in a little deeper:
statistical analysis is trying to answer the question "what is spam to you?", because not everyone gets the same spam and not everyone has the same idea of what spam is. We do that by feeding the same keywords that go into the search engine into a Bayesian filter, and it will then, hopefully, classify the mail as spam, or maybe spam, or definitely not spam, and by default we use tags for this.
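To make the three-way outcome concrete, here is a small, generic naive Bayes classifier over keyword sets. It is a teaching sketch, not the Bayesian engine the talk refers to: the `KeywordBayes` class, the add-one smoothing, the equal priors and the 0.2 / 0.8 thresholds are all assumptions chosen for the example.

```python
import math
from collections import Counter


class KeywordBayes:
    """Tiny naive Bayes over keyword sets, with add-one smoothing."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.totals = {"spam": 0, "ham": 0}

    def train(self, keywords, label):
        self.counts[label].update(set(keywords))  # count each keyword once per message
        self.totals[label] += 1

    def spam_probability(self, keywords):
        # Sum log-odds per keyword; equal priors are an assumption.
        log_odds = 0.0
        for kw in set(keywords):
            p_spam = (self.counts["spam"][kw] + 1) / (self.totals["spam"] + 2)
            p_ham = (self.counts["ham"][kw] + 1) / (self.totals["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        return 1.0 / (1.0 + math.exp(-log_odds))

    def classify(self, keywords, low=0.2, high=0.8):
        # Illustrative thresholds, not values from the talk.
        p = self.spam_probability(keywords)
        if p >= high:
            return "spam"
        if p <= low:
            return "not-spam"
        return "maybe-spam"


bayes = KeywordBayes()
bayes.train({"viagra", "cheap", "pills"}, "spam")
bayes.train({"cheap", "watches"}, "spam")
bayes.train({"meeting", "lunch"}, "ham")
bayes.train({"lunch", "potato"}, "ham")

verdicts = [
    bayes.classify({"cheap", "pills"}),
    bayes.classify({"lunch", "potato"}),
    bayes.classify({"cheap", "lunch"}),
]
```

The input is the same kind of keyword set the search index already produces, so no separate tokenizer is needed; the middle band exists so that uncertain messages can be treated differently from confident verdicts.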
The training is the important part: if we get this wrong then the spam filter will perform very poorly, and this is where we need to do more work. What we're trying to do is look at how the user interacts with the mail. If you take a message and drag it to the spam folder, that's a pretty clear signal that it's spam, and if you read a message and then you reply to it, that's a pretty clear signal that it was not. So we're using things like that to assemble a corpus of messages for training. My vision for this is that I want to retrain relatively frequently, because I believe that both your communication patterns and the patterns of spam are going to change over time. I do not expect your spam filter to match the spam from two years ago, but I would like it to match the spam you are getting now. So we'll be tracking these actions: when you click on a message and read it, that message is tagged as having been read; when you reply to a message, it is tagged as having been replied to; when you manually reorganize things, moving a message from one tag to another, we track that as well. That allows us to choose messages for training. And we also have tagging filters. It turns out that nothing I just said has anything specifically to do with spam; it's just how you organize e-mail. So what we're interested in doing is using the exact same plug-in, the exact same use of SpamBayes, as a general-purpose auto-tagger. You can just create a tag [inaudible], and then over time you put some messages in it that have something in common with each other,
and then Mailpile should learn which messages belong in that tag and do that for you. That's how it works.
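The corpus-building step described above can be sketched as a function that turns behaviour tags into training labels and skips messages that carry no clear signal. The tag names (`spam`, `replied`, `read`) and the helper names are hypothetical; the talk only says that such actions are recorded as tags.

```python
def label_from_tags(tags):
    """Turn behaviour tags into a training label, or None when there is no signal."""
    if "spam" in tags:      # the user dragged the message to the spam folder
        return "spam"
    if "replied" in tags:   # reading and then replying is a strong not-spam signal
        return "ham"
    return None             # merely received or read: not a clear signal


def build_corpus(messages):
    """messages: iterable of (keywords, tags) pairs -> list of (keywords, label)."""
    corpus = []
    for keywords, tags in messages:
        label = label_from_tags(tags)
        if label is not None:
            corpus.append((keywords, label))
    return corpus


mailbox = [
    ({"cheap", "pills"}, {"spam"}),
    ({"lunch", "potato"}, {"read", "replied"}),
    ({"newsletter"}, {"read"}),
    ({"invoice"}, set()),
]

corpus = build_corpus(mailbox)
```

Swapping the `spam` tag for any user-created tag gives the general-purpose auto-tagging idea: messages the user put into a tag become positive examples for it.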
Yes, I think I am [inaudible]. Should I go through one more? This was kind of a long one. [inaudible]
So yeah, did anyone want to ask about spam filters? Yes?
[Audience question, largely inaudible.]
So the question is whether Bayes is actually good enough on its own, and it raises the question of whether we will be able to compete with people like Google, who have access to lots of other people's mail and can cross-correlate what is reported as spam by many people. [A colleague] has been using Thunderbird and its Bayesian filter for a while, and he has been happy. However, this is an area where I do expect we'll need to iterate and evolve, and if necessary it might be interesting to look into ways that users can collaborate and share details in a decentralized way. But yeah, this is a valid question and I don't really have the answer. Yeah? [Audience:] What if a spammer starts
GPG-encrypting the spam e-mails sent to you? Then you won't be able to do the spam filtering [inaudible]. So I
think what you're asking is what happens when spammers start encrypting the mail, and that's the one way that we win, because in that case we can filter and Google can't, because we're doing this processing after the mail has been decrypted. So that's one of the reasons that, in an encrypted world, you need to move the spam filters out to the edges, to where the users are. Yes? [Audience comment, partly inaudible, about spam that users forward on to others.] Yeah, that's true. What that actually means is that it was an interesting message, and if the user is going to forward it, that means I have a very strong signal that it was interesting, so I'm perfectly happy with how it works both ways. [inaudible] So, yes, helping a friend of yours benefit from your spam filter: I'm not sure you actually can; it depends on whether your spam is similar enough to theirs. One of the things that we need to look into before we ship this is whether there is value in building a sort of generic pre-trained filter and shipping that, and then having it learn over time, and how we would do that. But that's something that we won't even start looking at until later. It's a good question; I don't know. So, crypto: where
should we use crypto? [inaudible] This is what we're talking about: we want to encrypt the data at rest, so all the data that Mailpile generates should be stored in a secure way. I know some people have full-disk encryption, but not everyone does. I would like people to be able to put Mailpile on USB sticks and carry it around; there are many reasons for doing this. We use crypto when we're reading, writing and sending mail, using TLS, obviously, wherever appropriate. We would like to anonymize communication with the network, so when Mailpile goes and downloads images from Gravatar, it shouldn't really be telling Gravatar that I'm talking to these people; that should be anonymized in some way. And we might want to implement some sort of proof of work in the spam testing, which
relates to one of the other crypto ideas that comes later. Our data-at-rest strategy
there is to just shell out to the existing tools, the GnuPG binary and the OpenSSL binary or libraries, and use them to encrypt basically everything: application settings, contacts, search indexes, and so on and so on. Using GPG is relatively slow, so we're probably only going to use it on the config file, which will then contain a key that is used for symmetric encryption with OpenSSL itself, because OpenSSL and AES encryption is
quite fast. That's the strategy there.
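The two-tier strategy described above (slow public-key encryption for one small config file that holds a random key, fast symmetric encryption for everything else) could be sketched as the command lines one would shell out to. This is an assumption-laden illustration: the helper names, the file names and the `MAILPILE_KEY` environment variable are invented, and the functions only build argument lists; nothing is executed here.

```python
import secrets


def new_master_key():
    """Random key for the symmetric layer: 32 bytes, hex-encoded."""
    return secrets.token_hex(32)


def gpg_encrypt_command(recipient, src, dst):
    """Slow public-key step, intended only for the small config file."""
    return ["gpg", "--batch", "--yes", "--recipient", recipient,
            "--output", dst, "--encrypt", src]


def openssl_encrypt_command(src, dst, key_env="MAILPILE_KEY"):
    """Fast symmetric step for bulk data; the key is read from the environment
    (variable name is hypothetical) so it never appears on the command line."""
    return ["openssl", "enc", "-aes-256-cbc", "-salt",
            "-pass", "env:" + key_env, "-in", src, "-out", dst]


master_key = new_master_key()
config_cmd = gpg_encrypt_command("user@example.org", "config.json", "config.json.gpg")
index_cmd = openssl_encrypt_command("postinglist.dat", "postinglist.dat.enc")
```

In use, each list would be handed to something like `subprocess.run(cmd, check=True, env=...)` with the master key placed in the named environment variable; keeping the expensive GPG step to a single small file is what keeps index reads and writes fast.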
Reading e-mail: the basic theory is that you parse the PGP/MIME, you call out to GPG, check whether the signature checks out, and decrypt if that's necessary. But we would like to be able to plug in other preprocessors, because the crypto community is developing all these things and I want to have a platform where they can experiment. A close friend of mine wrote something called PBP, which is based on elliptic curves but does similar things, and we could just plug that in at the same place as well. As I said when answering the question, decrypting can be done during the indexing phase, so encrypted messages can be searched. When we do that, we record information about the state of the messages in the metadata and in the search index, so you can search for messages that have bad signatures, or the UI can give you visual feedback, because everything the UI shows is based on what's in the metadata index. So that's important, and it is mostly working as of this month. Writing e-mail: again, we generate the MIME ourselves and use GPG to actually do the signing and encrypting. We want to do best-effort security: we would like to encrypt messages whenever we can. How we figure out when we can is an open question. We would like to sign by default; there's really no reason not to. We need to give users the power to change these settings, but for the most part we want to do our best. We would like to use trust on first use, or TOFU, for key verification. That's like SSH, where the first time you see a key you say "OK, this is probably the right key", and you only complain if it changes. This is because the web of trust is very complicated; I'm not sure we can put a usable user interface on it. And the web of trust is also leaking private information: it reveals to anyone who cares to look who is talking to whom, who knows whom, and how we are connected.
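The trust-on-first-use rule mentioned above (accept the first key seen for a correspondent, complain only if it later changes) fits in a few lines. The `TofuKeyStore` class, its return values and the shortened fingerprints are hypothetical and only illustrate the rule; a real store would persist its pins and give the user a way to accept a legitimate key change.

```python
class TofuKeyStore:
    """Remember the first key fingerprint seen for each address."""

    def __init__(self):
        self.known = {}  # e-mail address -> pinned fingerprint

    def check(self, address, fingerprint):
        seen = self.known.get(address)
        if seen is None:
            self.known[address] = fingerprint  # first use: trust and pin it
            return "first-use"
        if seen == fingerprint:
            return "ok"                        # same key as before
        return "changed"                       # only now do we complain


store = TofuKeyStore()
events = [
    store.check("alice@example.org", "AAAA1111"),
    store.check("alice@example.org", "AAAA1111"),
    store.check("alice@example.org", "BBBB2222"),
]
```

Note that a changed key does not overwrite the pinned one, so the warning keeps firing until someone decides what to do about it; this mirrors how SSH treats a changed host key.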
A lot of people consider that to be such a massive privacy problem that it just shouldn't be used. So this is how we're leaning right now. You're welcome to talk us out of it, but this is what we're thinking about at the moment. We're probably going to distribute keys in an ad-hoc way, including just attaching them to the mail we send. E-mail in transit: we should use TLS when possible. That's been in SMTP forever and obviously we should use it. But we also had this idea, SMTorP, SMTP over Tor. When Mailpile starts (I built a very simple SMTP server into Mailpile) we should be bundling Tor together with it, so when an end user installs Mailpile they also get Tor. People know what Tor is, right? It's an anonymity tool, it anonymizes communications, and it's also a darknet, so you can register what's called a hidden service on Tor, people connect to that, and the anonymity of that connection is strongly protected. What this means is you get an e-mail address which looks really ugly: it's your name at some long gibberish string (I think they very helpfully made the strings longer recently) and then .onion. But what that means is
that when your Mailpile is connected to Tor, one Mailpile can deliver e-mail directly to another, and it doesn't have to go through any outside relay. It never leaves the encrypted Tor network, and we have suddenly closed the metadata leak that people are complaining about with SMTP, where the NSA is
listening to who is sending messages to whom; a lot of the time that happens in the clear. And this
is a different problem from the lack of encryption of message headers in PGP: it is in the actual SMTP dialogue, where, you know, MAIL FROM somebody, RCPT TO somebody else, that's always in the clear unless people are using TLS, which is not
widely deployed, or not widely deployed enough. This closes that leak completely and allows e-mail to suddenly become a peer-to-peer messaging protocol. We don't want to invent anything new, just simple things. By building on SMTP instead of inventing some new protocol, we can use the existing infrastructure: you can use something like Postfix, or something that runs as a service, instead
and use it as a relay if your Mailpile is not online often enough. Was there a question? [Audience question about Dark Mail.] Well, I have no idea whether what Dark Mail, or whoever, is doing is relevant. From what I've heard, Dark Mail are doing
something clever with XMPP, which is Jabber, so in my mind that's not
really e-mail. I don't know if they intend to be backwards compatible with the world of e-mail; they will have to answer that, I can't. But if they publish something we might want to implement it, if it's a good protocol. We just have to wait and see. I think this is my last slide. HTTPS: that's a no-brainer, we should be using
it all the time. And I would like to anonymize: by shipping Tor we can start downloading from Gravatar, downloading from Facebook, here and there, without revealing who you talk to. In some cases we will actually also be able to use it for IMAP and other connections. There are providers like Gmail (this is one of the things they do really well) that allow incoming connections from Tor, and they do this because they want activists to be able to reach them. So there are some providers that are known to behave well, and we should be able to connect to them over Tor instead of going over the open Internet. So that's it for me.
Thank you.
[Moderator:] Do we have time for more questions? Do we have some more questions? Put your hands up so I can hand over the microphone. [inaudible]