Fighting spam for fun and profit
Formal metadata
Number of parts | 561
License | CC Attribution 2.0 Belgium: You may use and modify the work or its content for any legal purpose, and reproduce, distribute and make it publicly available in unchanged or modified form, provided that you credit the author/rights holder in the manner they specify.
Identifiers | 10.5446/44266 (DOI)
FOSDEM 2019, 63 / 561
Transcript: English (automatically generated)
00:06
Hello everyone, let's welcome the next speaker, Giovanni Bechis, with the talk "Fighting spam for fun and profit".
00:21
Hi, I'm Giovanni Bechis. I'm an Apache and OpenBSD developer. Today I'll talk about what has happened in the past four years in the SpamAssassin community, and what the future of the software may be.
00:47
First, just to note that SpamAssassin is mostly seen as plug-and-play software by many users, who just install it, update the rules and forget about it completely.
01:04
It should be seen as something different: a framework to develop something on top of. To get the most out of it, you have to write your own rules for your own kind of spam.
01:23
The spam you are seeing is different for everybody. It depends on the language you speak, your interests and a lot of other things.
01:40
And you should participate in masscheck, the client/server mechanism; I will talk more about it later. And it's a general-purpose framework, not just anti-spam software, because it's also used to protect web forms.
02:03
I think in Holland there's a not-so-famous CMS that integrated SpamAssassin to check web form submissions. So, first, masscheck. Masscheck is client/server software that's integrated in SpamAssassin.
02:26
It's not shipped with the general distributions. It lives in SVN; you have to check it out separately.
02:42
It takes as input a spam folder and a ham folder. It downloads the latest version of SpamAssassin with all the rules, and it checks all your spam and ham messages against the new rules.
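As a rough sketch of what such a run measures, the Python below counts how often each candidate rule hits a spam corpus and a ham corpus. The rule names, the regular expressions and the sample messages are made up for illustration; the real masscheck runs the full SpamAssassin engine over mail folders, not plain regular expressions over strings.

```python
import re

# Hypothetical candidate rules: name -> regular expression applied to the message text.
RULES = {
    "BTC_MENTION": re.compile(r"\bbitcoin\b", re.IGNORECASE),
    "URGENT_SUBJECT": re.compile(r"^Subject:.*\burgent\b", re.IGNORECASE | re.MULTILINE),
}


def hit_rates(rules, spam, ham):
    """Return {rule: (spam_rate, ham_rate)}, the fraction of each corpus the rule hits."""
    rates = {}
    for name, pattern in rules.items():
        spam_hits = sum(1 for msg in spam if pattern.search(msg))
        ham_hits = sum(1 for msg in ham if pattern.search(msg))
        rates[name] = (
            spam_hits / len(spam) if spam else 0.0,
            ham_hits / len(ham) if ham else 0.0,
        )
    return rates


# Made-up sample corpora, standing in for the spam and ham folders.
SPAM = [
    "Subject: URGENT offer\n\nSend bitcoin now",
    "Subject: hello\n\nPay in Bitcoin to this wallet",
]
HAM = [
    "Subject: meeting\n\nSee you at FOSDEM",
    "Subject: urgent patch review\n\nPlease look at the diff",
]

RATES = hit_rates(RULES, SPAM, HAM)
for rule, (spam_rate, ham_rate) in RATES.items():
    print(f"{rule}: spam {spam_rate:.0%}, ham {ham_rate:.0%}")
```

A rule that hits a lot of spam and almost no ham is a good candidate for a high score; a rule that hits both equally tells you little.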
03:03
Rules are committed to the SVN repo, in each developer's sandbox, so there are some rules that are never pushed to the public. Once the software has determined how the new rules perform on your spam folder,
03:30
it sends the results to an Apache server. The Apache server collects everything, decides whether spam has changed in some way, and decides whether to push new rules.
03:51
For example, when there is some new Bitcoin obfuscation technique, the new rules get pushed to the public and get new default scores, which may go up, go down, or stay the same.
04:10
It depends on how it goes. You can also use it on your own: it's a good way to know how your rules, or the public rules in general,
04:28
are performing on your mail flow. There's also a web interface.
04:40
So, on your mail flow, you can see the score assigned to each message and by which rules, and whether some rules overlap. For example, if a lot of spam messages are hit by two or three rules
05:07
and there's one rule that hits all of those messages, it may be possible to remove that rule and raise the score of another rule, so you get the same results
05:27
but with better performance, because there are fewer rules to run and check. So, the latest release is 3.4.2, in September, and the previous one was three and a half years earlier.
05:49
That long gap was due to some problems we encountered in development. In particular, there were some problems with the Apache VMs, so the sysadmin team had to recreate some infrastructure from scratch,
06:09
and the server part of masscheck was not very well documented, so it had to be recreated from scratch.
06:23
The main problem is that a couple of days pass between the masscheck tool sending its reports to the server and the server publishing new rules. So if you are in a trial-and-error phase of development, a couple of days is a long time to wait
06:52
to know the results of your code. There were some security fixes for the PDFInfo plugin and in the core,
07:07
and a general security audit has been done and is still ongoing. And there was a Perl bug that we found.
07:24
During the development of 3.4.2 we optimized the startup code, and during the regression tests we found that the parser skipped some URLs in the emails,
07:42
only on Red Hat systems. In the end, the cause was that Red Hat compiled Perl with the FORTIFY_SOURCE option of GCC, and this may trigger a bug in Perl itself, so that the code gets evaluated in a different way:
08:09
a part of the optimized code, the optimization of the hash, drops some random data. So we changed the code, and we're working with Red Hat to find the root cause of this bug in the Perl code.
08:35
There were some other improvements. First of all, faster startup code thanks to some optimizations.
08:42
Finally, we took a look at spamc and optimized its code. On security, we removed SSLv3 and other things. There were some FreeMail anti-forgery improvements.
09:02
There is some code that checks whether, for example, you are sending an email as a Gmail user while you're not actually using Gmail. So it tries to catch this kind of abuse.
09:23
In previous versions you could check whether mail is coming from a particular country, so from France or wherever, and now you can check by continent as well. So it's easier to block or score a particular continent without writing a lot of rules, one for each country.
09:49
There are some improvements in the URILocalBL plugin, which detects whether a URL in the mail comes from a particular source, and fixes for some very bad file descriptor leaks in the TxRep plugin,
10:07
which is a plugin that recognizes and stores where an email is coming from: the IP address, the score of the email, the DKIM signature.
10:22
So if a similar email comes from the same IP address with the same DKIM signature, it's probably not a spammer, because it's a known correspondent, so the plugin can detect that and lower the global score.
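The reputation idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not TxRep's actual storage or formula: the key (address, IP, DKIM domain), the `weight` parameter and the "pull toward the historical mean" rule are all illustrative choices.

```python
from collections import defaultdict


class ReputationStore:
    """Toy sender-reputation table; not TxRep's real storage or formula."""

    def __init__(self, weight=0.5):
        # Hypothetical parameter: how strongly the sender's history pulls the score.
        self.weight = weight
        # key -> [sum of past scores, number of past messages]
        self.totals = defaultdict(lambda: [0.0, 0])

    def adjust(self, sender, ip, dkim_domain, score):
        """Return the score pulled toward the sender's historical mean, then record it."""
        key = (sender.lower(), ip, dkim_domain.lower())
        total, count = self.totals[key]
        adjusted = score
        if count:
            mean = total / count
            adjusted = score + self.weight * (mean - score)
        self.totals[key][0] += score
        self.totals[key][1] += 1
        return adjusted


store = ReputationStore()
# Two clean messages from a known correspondent build up a good history...
first = store.adjust("alice@example.org", "192.0.2.10", "example.org", 0.0)
second = store.adjust("alice@example.org", "192.0.2.10", "example.org", 0.0)
# ...so a later message that scores 6.0 on its own is pulled down.
third = store.adjust("alice@example.org", "192.0.2.10", "example.org", 6.0)
print(first, second, third)
```

The same message from an unknown address, IP or DKIM domain would land under a different key and keep its raw score.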
10:45
And the regression tests have been changed completely to be more performant and to make it possible to add better tests. There are some additions that were made in 3.4.2 and that will be even better in SpamAssassin 4.
11:09
We expect to release 3.4.3 in the next weeks, I think, and maybe SpamAssassin 4 this year, I hope. The next module is the HashBL plugin. It's a plugin which is present in Rspamd as well.
11:28
It's a particular kind of DNS blacklist, because it's not a blacklist of IP addresses; it's a blacklist of hashes. For example, if a spam message is coming from a Google server or a Microsoft server,
11:53
you cannot block the whole Google or Microsoft netblock. So you block a hash instead, and the hash that is stored is that of the particular email address.
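A minimal sketch of how such a lookup name could be built, assuming the list stores a hex digest of the lowercased address as a DNS label. The zone `hashbl.example.net`, the normalization and the choice of SHA-1 are assumptions for illustration; a real list documents its own hash and normalization rules.

```python
import hashlib


def hashbl_query_name(address, zone, algorithm="sha1"):
    """Build the DNS name to look up for a hashed email address."""
    # Assumed normalization: trim whitespace and lowercase before hashing.
    normalized = address.strip().lower()
    digest = hashlib.new(algorithm, normalized.encode("utf-8")).hexdigest()
    return f"{digest}.{zone}"


# Hypothetical zone name; the address is a placeholder.
QUERY = hashbl_query_name("  Spammer@Example.COM ", "hashbl.example.net")
print(QUERY)
```

The client would then resolve that name: an answer means the hash is listed, a "no such name" reply means it is not. Because only the hash travels over DNS, the list can flag one sender on a large provider without touching the provider's IP ranges.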
12:09
With this technique, in SpamAssassin 4, we developed a new plugin. Well, SpamAssassin 4 has a modified version of HashBL; if you want to use this feature in SpamAssassin 3.4, there's an additional plugin.
12:33
We developed a DNS blacklist for Bitcoin addresses. So you can query the DNS for the hash of the Bitcoin address at bl.btcblack.it, and it replies whether the address has been used for fraudulent purposes or not.
12:53
So the plugin scans the email for Bitcoin addresses and checks the DNS, and you can very easily detect Bitcoin scam emails.
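The scanning step might look like the sketch below. The regular expression is a rough filter for legacy Base58 addresses, not a validator, and the zone `btcbl.example.net` and the SHA-1 hashing are assumptions made for the example, not the real list's interface.

```python
import hashlib
import re

# Legacy (Base58) Bitcoin addresses start with 1 or 3 and avoid 0, O, I and l.
# This pattern is a rough filter, not a validator (no checksum verification).
BTC_RE = re.compile(r"\b[13][a-km-zA-HJ-NP-Z1-9]{25,34}\b")


def btc_queries(body, zone):
    """Return one DNS query name per distinct Bitcoin-looking string in the body."""
    seen = []
    for candidate in BTC_RE.findall(body):
        if candidate not in seen:
            seen.append(candidate)
    # Assumption: the list is keyed by the SHA-1 hex digest of the address.
    return [f"{hashlib.sha1(addr.encode('ascii')).hexdigest()}.{zone}" for addr in seen]


# Sample text; the address-like string is only there to exercise the pattern.
BODY = (
    "Send 500 USD to 1BoatSLRHtKNngkdXEeobR76b53LETtpyT within 48 hours.\n"
    "Again, the wallet is 1BoatSLRHtKNngkdXEeobR76b53LETtpyT."
)
QUERIES = btc_queries(BODY, "btcbl.example.net")
print(QUERIES)
```

Deduplicating before the lookup keeps the number of DNS queries per message small, which matters when a scam mail repeats the same wallet several times.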
13:11
Another thing that was developed was GeoIP2 support, because MaxMind is the biggest player in geolocation.
13:22
They decided that, starting last April, they would no longer support the legacy GeoIP database, and they are pushing the new version. The problem is that the Perl modules for the new version are either very slow or x86-64 only.
13:52
So we developed support for the new MaxMind version in the plugin. And at the same time we also added support for IP::Country::DB_File as an additional option.
14:04
It's a different approach. It doesn't use the MaxMind databases; its database is created by downloading the text files directly from RIPE, AfriNIC, ARIN, etc.
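To illustrate the approach, here is a small Python sketch that builds a country lookup table from lines in the pipe-separated "delegated" statistics format the regional registries publish (registry, country, type, start, count, date, status). The sample records are made up and use documentation address ranges; this is not how IP::Country::DB_File itself is implemented.

```python
import bisect
import ipaddress

# Made-up sample records in the RIR "delegated" statistics format.
SAMPLE = """\
# registry|cc|type|start|value|date|status
ripencc|FR|ipv4|192.0.2.0|128|20190101|allocated
ripencc|IT|ipv4|198.51.100.0|256|20190101|allocated
ripencc|DE|ipv6|2001:db8::|32|20190101|allocated
"""


def build_table(text):
    """Parse IPv4 records into a sorted list of (first_ip, last_ip, country)."""
    table = []
    for line in text.splitlines():
        fields = line.split("|")
        # Skip comments, header/summary lines and non-IPv4 records.
        if line.startswith("#") or len(fields) < 7 or fields[2] != "ipv4":
            continue
        first = int(ipaddress.IPv4Address(fields[3]))
        # For IPv4 records the fifth field is the number of addresses.
        table.append((first, first + int(fields[4]) - 1, fields[1]))
    table.sort()
    return table


def lookup(table, ip):
    """Return the country code for an IPv4 address, or None if no range covers it."""
    value = int(ipaddress.IPv4Address(ip))
    index = bisect.bisect_right(table, (value, float("inf"), "")) - 1
    if index >= 0 and table[index][0] <= value <= table[index][1]:
        return table[index][2]
    return None


TABLE = build_table(SAMPLE)
print(lookup(TABLE, "192.0.2.5"), lookup(TABLE, "203.0.113.1"))
```

A sorted list with binary search keeps lookups fast with no external dependencies, at the cost of knowing only the country and nothing about ISPs or CDNs.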
14:21
And it's very fast. It doesn't need the long list of dependencies that MaxMind has. It's not as complete as the MaxMind one, because the MaxMind one has, at least with commercial support,
14:41
databases for ISPs, for example, or CDNs, and a lot of other interesting things. There's a new anti-phishing plugin based on the PhishTank or OpenPhish projects. It was developed for 3.4.2.
15:04
And in 3.4.3 it has been changed to be very, very fast, and more databases will be added soon. So it's a way to try to detect more phishing attempts in emails.
15:30
Another interesting feature is resource limits, which tries to limit the resource consumption on the server.
15:49
This is one interesting thing. The main problem is that masscheck depends on the people who are using the software.
16:01
So the public rules are determined by the spam of the people using masscheck. And the vast majority of people using masscheck are from the US, and the developers write rules for English spam.
16:20
So there are some additional channels. There are some from Italy, France, Germany and Greece, I think. They're trying to write rules specific to non-English languages. So they are very, very effective at detecting spam if your main language is not English.
16:48
This is a new plugin in SpamAssassin 4. It detects whether there's an attachment with a macro in Word or Excel.
17:01
And it detects whether this macro is trying to do something it's not supposed to do. SpamAssassin 4 will have full UTF-8 support, so there's no more conversion between the email's encoding and UTF-8.
17:24
It's full UTF-8. It will have GeoDB support even better than what we have in 3 at the moment, and better TxRep handling. Some fixes for PostgreSQL have been committed in the last few days and will be available in 3.4.3.
17:44
Some more will be available only in 4. If you have any questions.
18:10
So we happen to receive a lot of French spam, and also Ukrainian spam for some reason. And I've actually developed a huge list of custom rules.
18:20
Should I actually submit them somewhere, or what should I do with them? Because I use them for myself, but I presume other people are also getting this spam. Yeah, I have a similar problem with Italian spam. I put my rules on a web server, and there's a procedure to...
18:44
At the moment it means you write your rules, you sign them with a PGP key, then you put them on a web server, and in the DNS you have a particular TXT record.
19:05
So when the client tries to download, it just queries the DNS and checks whether there are new rules to download or not. So it will be... I think there is an official French channel, so you can get in touch with them to merge your efforts.
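The DNS check can be sketched as follows, assuming the convention sa-update uses for its channels: the SpamAssassin version with its components reversed is prepended to the channel host name, and the TXT record at that name carries the number of the latest published rule update. The channel name `rules.example.org` and the update numbers are placeholders, and no real DNS query is made here.

```python
def channel_query_name(sa_version, channel):
    """Build the TXT lookup name: reversed version components, then the channel host."""
    reversed_version = ".".join(reversed(sa_version.split(".")))
    return f"{reversed_version}.{channel}"


def needs_download(installed_update, txt_value):
    """True when the TXT record advertises a newer update number than the installed one."""
    return int(txt_value.strip()) > installed_update


# Hypothetical channel host and update numbers.
NAME = channel_query_name("3.4.2", "rules.example.org")
print(NAME)
print(needs_download(1850000, "1856000"))
```

Only when the advertised number is newer does the client fetch the rule archive from the web server and verify its PGP signature, so an unchanged channel costs one DNS query per check.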
19:29
That could be a... Any other questions?
19:46
In a case where there are a lot of different people behind a SpamAssassin infrastructure, do you have any advice to make it work? Because the spam is, like you said, different for everybody.
20:01
Should there be different instances of SpamAssassin running in parallel with different kinds of rules? I had some of those problems in the past, because I had a customer that had a lot of commercial traffic with China,
20:22
and on the other side there were other people who were getting only spam from China. And the solution... one solution would be two instances, but the better one would be user preferences dedicated to that:
20:43
different user preferences dedicated to that particular customer. And you can also use a per-user database.
21:01
So every user can have his own Bayes database, which you can train per user, so it can detect this difference. Any other questions?
21:23
So thank you.