Building a Collaborative News Platform with Plone

Video in TIB AV-Portal: Building a Collaborative News Platform with Plone

Formal Metadata

CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Subject Area
In this talk, Érico will present tools and solutions used to build and maintain Pendect. Plone is the core of a solution that integrates with Thumbor, DBpedia, ElasticSearch, IFTTT and
Yeah, so hello to track two. The next talk we'll have here will be about building a collaborative news platform with Plone (yeah, I'm not a native speaker), and it will be presented by Érico Andrei. Please note that we will have all the questions taken to Jitsi afterwards, so please just put them into the Slido on the right of your window, next to the video. So then, go ahead and have fun.
First, thank you, Yanina, and
here we go. Let's talk about building a collaborative news platform with Plone. First of all, I'm Érico Andrei. Most of you
know me, or at least my name from one of the previous talks. I'm a Brazilian living in Berlin. I've been working with open source for quite a long time. I'm a fellow of the Python Software Foundation and I'm one of the seven members of the foundation board. In the past I worked for Microsoft, so every time Timo mentions Microsoft it hurts me. Thank you, Timo. I worked in many other companies, including Simples Consultoria, which was the main provider of Plone solutions in Brazil, and also at Rocket Internet, where I was CTO for two different companies. Right now I work at Pendect.
Pendect is a collaborative news platform. The idea is that facts are more important than opinions when you're forming your own opinion. It was launched in March of 2020. It's still in beta, we are still evolving the platform, and we've got some pretty good results.
But first I would like to talk about my lovely co-founders. I have Ashley Winker, our design wizard. She's amazing, and she lives in Vienna. We have Christopher Young, the genius behind the idea. We worked together in a previous company, and he basically had the idea of Pendect. He's also based here in Berlin. And I'm Érico, CTO, and I'm based in Berlin most of the time. You can follow me by the footer on Pendect: usually I change it to where I am at the moment, either Brasília, São Paulo or, hopefully, Sorrento next year.
Right. So, planning
Pendect. The idea was to build a "too long; didn't read" news platform, because we consume too much news and we want to be informed, but we do not necessarily want to read every op-ed, every opinion article in the New York Times about something. We want to understand first, and then ask: okay, which are the sources for this information? Then click one and go analyze. The idea is that short cards are better than articles for you to get this first idea, and of course, like on Wikipedia, you go to the primary source from there.
The idea is to have the cards submitted by our community members; we call them contributors. The cards should include metadata that allows us to cross-reference them. So I want to know everything about Joe Biden: I click and I see a list of everything that was written about him. Users can follow tags, people, organizations and locations, so they form their own personal feed. And a recent idea, but one that is important to us: for each new card we plant a tree. So you submit a card, it's published, we plant a tree.
Some of the technical requirements. We needed a collaborative workflow. This is something that, after working for years with content management, you know: workflow is important, workflow is king. You need to have permissions for people to submit their cards, for other people to review the cards, and eventually for a third group of people to schedule and publish the cards, and so on and so forth. Also the whole permission control, to set who can do what and when. We needed a platform that would help us leverage metadata and categorization as much as possible. Of course it needed to be SEO friendly because, even though we are a startup in Berlin, we do not have a Rocket-sized truck of money to invest in search engine marketing. We count on organic growth, so SEO is really important for us. And of course it needs to be open source. We truly believe that being transparent, being in the open, is the best way to go.
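
The submit, review and publish requirement above can be pictured as a small state machine. What follows is a minimal sketch in plain Python, not Plone's workflow engine and not Pendect's actual workflow definition: the state names, transition names and role names (`contributor`, `reviewer`, `publisher`) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Set, Tuple


class State(Enum):
    DRAFT = "draft"
    PENDING = "pending"
    REVIEWED = "reviewed"
    PUBLISHED = "published"


# Hypothetical transition table: name -> (source state, target state, required role).
TRANSITIONS: Dict[str, Tuple[State, State, str]] = {
    "submit": (State.DRAFT, State.PENDING, "contributor"),
    "approve": (State.PENDING, State.REVIEWED, "reviewer"),
    "reject": (State.PENDING, State.DRAFT, "reviewer"),
    "publish": (State.REVIEWED, State.PUBLISHED, "publisher"),
}


@dataclass
class Card:
    title: str
    state: State = State.DRAFT


def apply_transition(card: Card, transition: str, roles: Set[str]) -> Card:
    """Move a card along the workflow if the state and the user's roles allow it."""
    source, target, required_role = TRANSITIONS[transition]
    if card.state is not source:
        raise ValueError(f"'{transition}' is not available from state '{card.state.value}'")
    if required_role not in roles:
        raise PermissionError(f"'{transition}' requires the '{required_role}' role")
    card.state = target
    return card
```

The point of the table is "who can do what and when": the state answers "when", the role answers "who". In Plone itself this is expressed as a workflow definition with guarded transitions rather than as application code.
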
So, the Pendect stack. First thing:
content management. We decided on Plone for many reasons. One of them, of course: I am the CTO, I'm building, so I'm going with something that's familiar. But Plone was not actually my first choice. In the beginning I decided to question my own knowledge about CMSs and ask: okay, how hard would it be if I built something with Pyramid instead of going with a full-blown CMS? I considered Substance D and I considered Kotti, and at some point I was considering developing a simple API on Pyramid itself. But in the end, when I started adding some of the features that would come up, it became obvious that, instead of spending a lot of time building the solution, we could go to market really fast with Plone, because Plone brings most of the features we wanted out of the box, and what was not there we were able to adapt instead of building from scratch.
And of course, drawing on my own experience: back in Brazil, with Simples Consultoria, we had many news portals as our customers, so it basically means that we were able to build news portals with Plone easily. And of course this is a very friendly community. It's easy to ask stuff and get answers, and we have really smart people in here, including Matthew Wilkes, who just published a book. The book should be there (I completely messed up), but it's a really good book about Python, so you can go there and take a look.
Also, I decided that if we were going to do lots of metadata and lots of categorization, we should try to find some magic way of tagging stuff. Not only the hashtags, or the tags in Plone's Subject field, but also specific categories. So we decided to approach DBpedia. They have a solution called Spotlight. It's a REST API that you basically put a text into, and it gives you back some markings, basically annotating the original content. This is something I wanted to play with for a long, long time, since a group of
friends did that in 2014 or 2013 in Brazil, and that was an idea that stayed in the back of my head, so we went for it. We use SPARQL to query DBpedia to also get the "about" and the abstract for everything we tag our content with. I'm going to talk a little bit more about that later.
And we have a bit of everything else. In here I start with Cloudflare, nginx, Varnish, HAProxy and Ansible. And I start with a screenshot of Jair Bolsonaro, the Brazilian president, not because I'm Brazilian, but mostly because this card brought us the biggest Reddit effect so far. In a few hours after submission we had between 120,000 and 200,000 different users. So imagine: we have our usual amount of users, and then we publish this, it goes to the front page, and it skyrockets. I found out only the next day, when I was looking at Google Analytics. We have all sorts of monitoring, but in the end nothing special happened, because with Cloudflare doing the caching of static resources and with Varnish doing the caching for everything that's dynamic, it was really easy to support the amount of new users coming to us. We have actually survived 12 different Reddit effects already. This was the biggest one, but we had one about the Brussels, no, the Belgian government offering free train rides to their citizens to bring back internal tourism to Belgium, and we also got a few tens of thousands of users into the site.
We use Thumbor to generate all the images and the scaling of images. Thumbor is an open source platform. I was expecting a hard time integrating it with Plone, but it was actually just replacing one of the browser views, so adding a little bit of code solved the problem really easily, without having to integrate too much. We have Sentry, Mailgun, IFTTT and Zapier, so when we publish something we push it everywhere else. And now we have DeepL and Scrapinghub, and I'm going to talk a bit more about that when I talk about a new service we
implemented.
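
To make the Spotlight step more concrete, here is a minimal sketch of calling the annotate endpoint and sorting the returned resources into people, organizations and locations. It uses only the standard library. The public endpoint URL, the `confidence` value, the helper names and the type-to-field mapping are assumptions chosen for illustration; this is not Pendect's code.

```python
import json
import urllib.parse
import urllib.request
from dataclasses import dataclass, field
from typing import List

# Public DBpedia Spotlight endpoint for English text (assumed here; a
# self-hosted instance would use a different base URL).
SPOTLIGHT_URL = "https://api.dbpedia-spotlight.org/en/annotate"

# Hypothetical mapping from DBpedia ontology classes to the categorization fields.
TYPE_TO_FIELD = {
    "DBpedia:Person": "people",
    "DBpedia:Organisation": "organizations",
    "DBpedia:Place": "locations",
}


@dataclass
class Annotations:
    people: List[str] = field(default_factory=list)
    organizations: List[str] = field(default_factory=list)
    locations: List[str] = field(default_factory=list)


def build_request(text: str, confidence: float = 0.5) -> urllib.request.Request:
    """Build the GET request that asks Spotlight for a JSON annotation of `text`."""
    query = urllib.parse.urlencode({"text": text, "confidence": confidence})
    return urllib.request.Request(
        f"{SPOTLIGHT_URL}?{query}", headers={"Accept": "application/json"}
    )


def group_resources(payload: dict) -> Annotations:
    """Sort the annotated resources into the three fields, without duplicates."""
    result = Annotations()
    for resource in payload.get("Resources", []):
        # The last URI segment is the resource name, with underscores for spaces.
        name = resource["@URI"].rsplit("/", 1)[-1].replace("_", " ")
        types = set(resource.get("@types", "").split(","))
        for dbpedia_type, field_name in TYPE_TO_FIELD.items():
            bucket = getattr(result, field_name)
            if dbpedia_type in types and name not in bucket:
                bucket.append(name)
    return result


def annotate(text: str) -> Annotations:
    """Network call: send the text to Spotlight and group what comes back."""
    with urllib.request.urlopen(build_request(text), timeout=10) as response:
        return group_resources(json.load(response))
```

Keeping `group_resources` separate from the network call means the grouping logic can be tested with a canned payload, and the slow HTTP request can live in a different process from the CMS.
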
So, building Pendect: tools and
add-ons. First things first: I have been developing with Python 3 since I left Simples Consultoria, back at the end of 2014, so every single thing I developed afterwards in my career was based on Python 3. I use other languages, like PHP and Ruby and a lot of JavaScript, but every time I came back to Python it was Python 3. So Pyramid with Python 3, FastAPI with Python 3, and so on and so forth, even a bit of Guillotina with Python 3. So I got used to Python 3, and I admit that I love f-strings. They solve a problem for me: every time I need to do string interpolation, f-strings are basically the way to go, instead of doing the old .format and so on and so forth.
I use a lot of type hints, mostly because it's a way for me to understand what I'm doing with my code. So every time I'm writing, I'm putting type hints in there, from moment zero. Another thing I started using a lot is data classes. I use data classes as a kind of contract between functions and systems. For me it was a better way than just passing and returning dictionaries everywhere. It's simple, and of course, as I use PyCharm, it speeds up development a lot and avoids a lot of errors.
Important: type hints are not available for Plone. So one of the things I did was to create a facade of plone.api, adding some new methods to be used by my company, and for all the methods I needed to change I added type hints. For instance, to implement create user using a UUID as the user id: plone.api does not do that by default, even though plone.restapi does and the login form does. So I had to refactor plone.api and implement something there, and for that function I have type hints indicating that, okay, it's going to return you a member data object, so it becomes easier for me to work with. And of course Black, isort, code analysis and so on and so forth, and many, many flake8 plugins, thanks to Nejc Zupan and to Gil Forcada for doing that.
When it comes to Plone add-ons, I'm going to start with the easy ones. collective.z3cform.datagridfield, to deal with many sources for one content. It could basically be a JSON field, saved as JSON like Volto saves the blocks, but I wanted to give people the ability to edit easily, and I did not want to write my own implementation of a widget, so DataGridField. I use collective.sentry, and I believe Andreas Jung worked on that, thank you. I use contentrules.slack. That was something I developed a few years back for a previous company I worked at: basically, every time someone does any action, it sends information to Slack. This is something that for me is quite important, because I spend most of my day on Slack. So, okay, we have a new user, or we have a new submission: it's easier to reach me there.
And we have souper.plone, which is one of the hidden gems of the Plone community and the Zope community, because there's the souper and the souper.plone version of it. I thank Eric Bréhault for giving a talk about that a long, long, long time ago at the conference in Brazil, and that kind of stayed in the back of my mind. A few months ago I said, okay, we need to implement something, and I was close to implementing a small API with Pyramid, and then: okay, maybe it's overkill. I implemented the same thing using souper inside Plone.
So, content types. We use the default ones: Folder, Document (Document mostly for content pages like content guidelines, about us and so on; Folder to organize some things), Image and Collection. Collection we use for grouping some of the content together, but we also use something called Category. It's folderish, with a collection behavior with some sane defaults. So every time you click, for instance, Regional News, you'll see a listing of everything that's regional news, and then you click on
Regional News, Americas, and then you see everything that's Americas. So we organize the content inside categories and subcategories, and we have a collection listing all of them. Besides that, we implemented a content type, the Card, that's basically the News Item with a behavior that implements the other categorization.
Adapting Plone: we went for, okay, we know how Plone works, so we know how to play with it. First thing, we have the concept of My Feed. When you go and follow some content (let me... yeah), you go and say, okay, I want to follow, for instance, news from Regional News, or specifically from the Americas, specifically from South America, and I also want American news. So you go there and follow. I'll talk a bit more about that. And then we have a list of content, so the dashboard was turned into My Feed. We have the contributor profile, basically leveraging the logic from the author page, and we adapted the image and scale browser views to support proxying to Thumbor.
Content rules: we have actions for Slack, email and webhooks. We adapted those a bit, and we added one trigger, for when a principal is added to or removed from a group, because every time I create a new subcategory I want to create a group of contributors that can submit something there, and also editors that can review content there.
Besides that, new features. We have aggregator pages. As I said, we added support for categorization using DBpedia, and with that
we added some additional fields: people, location and organization. So, for instance, this card you see on your screen, "Qantas lays off 2,000 more employees": when someone submitted that, when Charlie David submitted that, I made a call to DBpedia saying, oh, this is the content you need to tag, and it came back saying: organization, Qantas; location, Australia; people involved, if there's someone prominent in here, they would be tagged and grouped in there. And we have pages for each one of those categorizations. What you see here is actually Qantas, and we are going to have the same for Joe Biden and so on. The name that appears there is one to one with the name that appears on the Wikipedia page.
To do that, we also implemented the concept that I can follow stuff. You see here we have Qantas, and in there we have Follow. If you click here, everything about this organization is going to appear in your feed. To implement that we use souper. It's possible to follow categories, contributors, tags, people and so on and so forth, and every time you do this we go and add an entry to the souper catalog, and when you unfollow we remove the entry. So we can even do some reports, based on how our users love to follow these organizations more than these other ones, and so on and so forth. The first approach I was considering was to implement a simple REST API to deal with that and save the data in a Postgres database, but souper was way easier to implement, especially because it's already included: I do not need to wait for a REST or a database call to say, okay, you are already following this page, and so on.
Also implemented, but not released: voting up, like saying, oh, this card is relevant or not, and bookmarking, so you can bookmark some cards. It's all there, it's just not in the user interface yet.
A few months ago I stumbled on a problem, which was: okay, every time I wanted to
do something, like, for instance, ping archive.org, the Wayback Machine, to store a card's page as soon as the card is published. Trying to do that with Plone right now... First of all, I started doing stuff like that by simply listening to some events and doing it synchronously, but archive.org is the greatest example of why that's not possible. When you go and do that, it takes something between 45 and 60 seconds to ping you back. So it was clear that I would need some kind of async solution to do that.
The approach was to develop a special microservice, even though it's not a microservice, because it's not so micro: something that was going to run in a different process, and Plone would basically send it a message like "do something". If it's synchronous, wait for the answer; otherwise, "do something, and I don't care". So right now we have everything from translation, auto-summary and archiving to even the bridge to DBpedia Spotlight implemented in this microservice, which was developed with FastAPI plus httpx. Of course each endpoint has its own Python dependencies. Ideally we should have one microservice for each one of them, but as I'm using DigitalOcean's App Platform, every new service is going to cost us five bucks a month, and we do not have a volume that justifies that. But the moment we decide to go for a Kubernetes scenario, or an even bigger deployment, it's easy to do with this application, because it's easy to split into smaller pieces.
Some of the lessons learned during this. First of all, I have upgrade steps. Every time I need to deploy something, if there's new code that changes configuration or needs something there, there's going to be an upgrade step. Over time I added catalog indexes, I added information to the user schema, the profile, and so on and so forth. And of course, always be aware of configuration stuff that you store in the registry but where you
have a default value: do not forget to add purge="false", because that already hurt me in the past.
First thing, I would like to thank Philip Bauer and everyone else involved in the trainings, because right now the Plone training materials are the de facto documentation for Plone. It's the most up-to-date one. Even though from time to time I need to go back to the docs, most of the time the information I need to learn comes from the training materials.
I am going to do my one second of ranting here: I hate the resource registry as it is today, and JS development and so on. I saw Andreas Jung mentioning in the past that he already does everything with webpack, as an outside solution. I never got the time to properly understand the resource registry and how it works. It's working, but it's not ideal. But yeah, it's there.
Plone now lacks a simple and working async or delay solution. Everything that we had in the past with "async" in the name is gone. So if someone is willing to develop a new add-on, I would love to help.
And be careful with WebP images, because they are not for everyone. That was a lesson I learned with Thumbor, because Thumbor basically reads the request and says, okay, you support WebP, so I send you WebP. And I have Cloudflare in front, caching basically every image. Then we had users with iPhones: first request WebP, second request an iPhone, and people do not see the image. It was a pain, and I preferred the benefit of Cloudflare's caching to the idea of reducing the image size a bit, so no WebP for us.
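
The WebP lesson comes down to a cache key problem, which the following sketch tries to reproduce in a few lines. It is a toy model, assuming a shared cache keyed by URL in front of an origin that negotiates the format from the `Accept` header. `pick_format` and `EdgeCache` are hypothetical stand-ins, not Thumbor's or Cloudflare's implementation, and the header strings are illustrative.

```python
from typing import Dict, Tuple


def pick_format(accept_header: str) -> str:
    """Origin-side negotiation: serve WebP only if the client advertises it."""
    media_types = [part.split(";")[0].strip() for part in accept_header.split(",")]
    return "webp" if "image/webp" in media_types else "jpeg"


class EdgeCache:
    """Toy shared cache sitting in front of the image origin."""

    def __init__(self, vary_on_accept: bool) -> None:
        self.vary_on_accept = vary_on_accept
        self._store: Dict[Tuple[str, str], str] = {}

    def _key(self, url: str, accept_header: str) -> Tuple[str, str]:
        # Without variation, every client shares one entry per URL.
        variant = pick_format(accept_header) if self.vary_on_accept else "*"
        return (url, variant)

    def get(self, url: str, accept_header: str) -> str:
        key = self._key(url, accept_header)
        if key not in self._store:
            # Cache miss: the origin answers for *this* client's Accept header.
            self._store[key] = pick_format(accept_header)
        return self._store[key]


# Illustrative Accept headers: one client that supports WebP, one that does not.
WEBP_CLIENT = "image/avif,image/webp,image/apng,*/*;q=0.8"
LEGACY_CLIENT = "image/png,image/svg+xml,image/*;q=0.8,*/*;q=0.5"

if __name__ == "__main__":
    url = "/card/example/image.jpg"
    naive = EdgeCache(vary_on_accept=False)
    print(naive.get(url, WEBP_CLIENT), naive.get(url, LEGACY_CLIENT))
    aware = EdgeCache(vary_on_accept=True)
    print(aware.get(url, WEBP_CLIENT), aware.get(url, LEGACY_CLIENT))
```

With the URL-only key, whichever client arrives first decides what everyone else receives. The two ways out are the ones named in the talk: make the cache vary on the negotiated format, at the cost of more cache entries, or stop negotiating and serve one format to everyone.
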
Some future steps. First, consider contributions. We have this feature in beta for some users: you basically put in the URL and we generate an auto-summary, and if it's in another language we first translate and then generate the auto-summary. It's going to be available for everyone. I'm considering implementing that with either React or Svelte. It's something I'm going to decide at the end of this conference.
We are planning card threads, something like Twitter threads, where you basically have a tweet that replies to a tweet, and so on and so forth. Search improvements: we want to move to Elasticsearch. I've been saying that for a few months and never got the time. Last week was the first time I was able to play with it. And we want to move to RelStorage, because right now we have a ZEO setup.
When we go out of beta, first quarter next year, we want to use Plone as a headless CMS, with Volto and mobile applications for Pendect. We want to implement more features in terms of quality control: integrate LanguageTool to do the basics of grammar and spelling checks, and do a bit of auto-tagging. And something that I wanted to do before launching but was not able to, and this is the important part, is user management. Right now it's using the user folder and it's already getting slower and slower. We need to move away from this. And of course,
join us, help us, contribute cards. For every card we plant trees, and I would love to see you all there helping us, like Eric Bréhault does. Thank you, Eric, and
that was it. I'm very, very glad to be here, and I'm glad to have this Plone Conference. Thank you all.
I ask you to follow me on Twitter, it's @ericof. Follow Pendect, it's @pendecthq. If you want to get in touch, this is my contact information and Pendect's information. And, really important, my presentation is already on Speaker Deck, slash ericof, "Building a Collaborative News Platform with Plone". That was it. I'm going to join you all on Jitsi pretty soon
and answer your questions. It was a pleasure. See you soon.
Yes, thank you, Érico. As he mentioned, we will be in the face-to-face in Jitsi, so if you have any further questions or discussions, just meet us there. And yes, see you soon.