
Have content quality, will search your Intranet

Formal Metadata

Title
Have content quality, will search your Intranet
Title of Series
Number of Parts
61
Author
License
CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
TriMet runs the public transportation system for the city of Portland, Oregon, and the surrounding area. Over several years, TriMet's Plone-based intranet had accumulated lots of content, and the built-in search was no longer working very well. For this case study, I will show how we solved the problem by focusing on content quality. Faceted search was helpful both in the push for content quality and in the final search functionality. These ideas can help any Plone site, large or small, and should be considered for additional default features of Plone.
Transcript: English (auto-generated)
[Music] Okay, so this morning I want to talk to you about how TriMet improved their search, and about the challenges with search.

So who is TriMet? TriMet runs the entire public transportation system for the metropolitan area of Portland, Oregon. Just a few numbers: they have a budget of five hundred eighty-seven million, they cover a pretty large area with lots of people, and they are a free-standing government agency that is not part of the city or the state.

They run buses. They run MAX, which is the light rail, sort of like a long-distance streetcar, with lots of vehicles, five lines, and [unclear] miles of service and stations. The Westside Express Service is actually a full-size train for commuters, which has a 15-mile service line with five stations and six vehicles. Then they run paratransit, the LIFT, which is a transit service [unclear], and they run the streetcar, which is owned by the City of Portland but is actually maintained and operated by TriMet. Some of you probably [unclear] in Portland this year.

In terms of trips, people take a hundred million trips every year. In terms of coverage, or size, the transportation system is the ninth largest per capita in the country, even though Portland is a relatively small city; it's the twenty-fourth largest city in the U.S. So in terms of transportation system, it is very large.

Talking about employees, we get to the point where I start thinking about the users of our systems. TriMet has twenty-nine hundred employees. Ninety percent of these work in the field, so they are supervisors [unclear], and they do need access to their [unclear]. The majority of the employees [unclear]. There are just about a dozen, or maybe two dozen, roughly speaking, content developers who have the Editor role. So that gives you an idea of the user base. [unclear] If it's not in the browser, we don't really know what to do with it. So let's talk about their IT infrastructure. [unclear]
[Music] [Music]
They do run Plone sites, and before this project they had over five Plone sites. I say "over" because five sites were actually actively being used, and there were some others that were basically no longer in use. Of those five, three were blogs. One was a knowledge base, sort of a document management system, and it is still the repository for all of their technical documentation and manuals for all the technical hardware [unclear]. You can find information there about anything concerning trains, buses, [unclear] systems.

And then they have the intranet, which is what the majority of this talk is about. After this project there were two sites left, because the blogs were merged into the intranet [unclear]. The knowledge base was upgraded from Plone 3-something. The reason why they were stuck on 3 was because they [Music]
[Music]
[Music] So while this project was in process, the blogs kept looking exactly the way they were before.

All right, the project. The main goal was to improve the searchability of their intranet, and there were three pieces to this. One was, obviously, [unclear] if it's not responsive
[Music] So this was [unclear]. News publications, magazines and newspapers, have developed an art of basically guiding you, the user, to what they want you to find, and collective.cover is the perfect application for this. So the idea was: let's see if we can surface the information that we want users to find in a dynamic way, so you don't have to create a bunch of links on pages. I created a few custom tiles. At the time there was no calendar tile, so I developed a calendar tile, and that is an example: you as a content developer just create events wherever they need to be, and the tile surfaces them.

Just one word here about this process of using covers and Bootstrap. Lineage was great for this, because I created some new landing pages using collective.cover that were obviously themed with Bootstrap, and tested them, and worked with [unclear] on how the composition process works. But they needed to build links to production content. So instead of doing this work on staging [unclear], this all actually happened on the production server by using subsites with Lineage. The rest of the site was completely unchanged, still the old theme, but the subsite had the new theme, Bootstrap and cover installed and running in there. So they could create these covers, and then on launch day I could just turn on the theme for the rest of the site, copy and paste the cover objects over to wherever they needed to be, and switch them to be the default landing pages for those folders. So really, that's one reason why Lineage is definitely one of my stars in [unclear].

So most of the description under there does not tell you anything about where the search found the keywords [unclear], so they were really frustrated by that. Or they turn up files [unclear]. And now for the same search we have 28 results, and this is what it looks like now for the same search. So this is just a little preview of more to come later.

Another workaround is keyword stuffing. Somebody had not quite figured out what this keyword thing is: "I wonder if I can trick the search by putting in all the keywords that I think people want to use when they search for this particular page. I'll put them all in here." The result is like this [unclear]. There were about 80 keywords, most of which were duplicates, or singular and plural forms of the same word. [Music]
So [unclear]. When you are a user looking for something in particular [Music] [unclear], I have taken a bird's-eye view of Google search results. [Music]

[unclear] So actually, it's really, really important to know what you're clicking on. Just looking at the domain, I get a good idea of [unclear], so that is really important. The byline: Google doesn't show that [unclear], but I think it's very important. Snippets, aka description, aka summary, [unclear]. We decided that tags are useful to have. And then icons or thumbnails could also be very good hints. I implemented something for that, but it ended up not being used for now; maybe later. [Music]
This is [unclear], just using the description. [Music]

[unclear] give me some information about this thing, [unclear] the type. But to be honest, [unclear]. So these examples show [unclear] search results like that, we wish. I'll get there.

Okay, [unclear] Google: you had one job, give me the picture of a guy without a [unclear]. Actually, search is about two jobs: sorting and filtering. Because there are just over one trillion possible search results, okay? It doesn't just give you a trillion search results sorted by the keyword that you typed. No, it also filters.

In Plone, the scoring algorithm that is used is something called Okapi BM25. BM stands for "best match", and it's not too scary. Even if you don't have an idea what this means and what this does, there is a lot you can learn just by looking at the formula and at what it depends on. What it depends on is: the frequency of each keyword in the document you're computing the score for, so how often does this keyword appear; the length of the document and the average length of all documents, the ratio of those two; and then the total number of documents in the whole set, and the number of documents containing this particular keyword. That's it. That's all this depends on.

If you think about that for a minute, you realize that this scoring does not understand anything about the context of a keyword, in terms of either the location in the site or the position of the keyword inside the document. It does not take any other keywords into account, so if you're using a synonym, for example, that's your fault. If you misspell something, it can't help you there, and it obviously doesn't offer any suggestions. So this is why Plone's default search is fine when you install a Plone site and start creating content: ten pages, a few dozen pages, a couple hundred pages. Once you get to a thousand, ten thousand, forget it.
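To make the inputs listed above concrete, here is a minimal sketch of BM25 scoring in plain Python. It is not Plone's actual implementation: the function name `bm25_scores`, the parameter values `k1=1.2` and `b=0.75`, the smoothed form of the inverse document frequency, and the sample documents are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query_terms, documents, k1=1.2, b=0.75):
    """Score each tokenized document against the query terms with BM25.

    `documents` is a non-empty list of token lists. k1 and b are the usual
    tuning parameters; the values here are common defaults, not values
    taken from the talk.
    """
    n_docs = len(documents)
    avg_len = sum(len(doc) for doc in documents) / n_docs
    # Number of documents that contain each query term at least once.
    doc_freq = {t: sum(1 for doc in documents if t in doc) for t in query_terms}

    scores = []
    for doc in documents:
        term_freq = Counter(doc)
        score = 0.0
        for t in query_terms:
            f = term_freq[t]
            if f == 0:
                continue  # a synonym or a misspelling contributes nothing
            # Rare terms weigh more (smoothed inverse document frequency).
            idf = math.log(1 + (n_docs - doc_freq[t] + 0.5) / (doc_freq[t] + 0.5))
            # Long documents are penalized relative to the average length.
            norm = f + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * f * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical sample documents, already split into lowercase tokens.
docs = [
    "bus schedule for the holiday".split(),
    "holiday party photos".split(),
    "vacation request form".split(),
]
scores = bm25_scores(["holiday"], docs)
```

In this sample, the shorter document that mentions "holiday" scores higher than the longer one, and the document that only says "vacation" scores zero. Nothing in the formula looks at where the word sits in the site or in the page, which is the limitation described above.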
So I thought: okay, I'll make my own sort index, an index that depends on these four elements, in this order. First, content quality. For items that have the same content quality, it will then score items with tags higher. Then, for the same score, it will rank content that is not a file higher than files. And then also last modified, which seems to be a pretty useful indicator of quality. So that's about the sorting.

Now it's about filtering, because that's the second big job of the search engine: eliminating anything that doesn't have anything to do with what you're looking for. And so what if users do it themselves? [unclear] But first we have to decide which facets we want to use.
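The four-element sort index described above can be sketched as a composite sort key. This is only an illustration of the ordering idea in plain Python, not the index built for TriMet; the dictionary fields (`symptoms`, `tags`, `is_file`, `modified`) and the sample items are hypothetical.

```python
from datetime import date


def quality_sort_key(item):
    """Build a tuple that sorts better content first.

    Tuples compare element by element, so each later element only breaks
    ties left by the earlier ones.
    """
    return (
        len(item["symptoms"]),           # fewer quality symptoms first
        0 if item["tags"] else 1,        # tagged content before untagged
        1 if item["is_file"] else 0,     # pages before files
        -item["modified"].toordinal(),   # most recently modified first
    )


# Hypothetical catalog entries.
items = [
    {"id": "a", "symptoms": [], "tags": ["hr"], "is_file": False,
     "modified": date(2015, 1, 10)},
    {"id": "b", "symptoms": [], "tags": ["hr"], "is_file": True,
     "modified": date(2015, 6, 1)},
    {"id": "c", "symptoms": ["no summary"], "tags": ["hr"], "is_file": False,
     "modified": date(2015, 9, 1)},
    {"id": "d", "symptoms": [], "tags": [], "is_file": False,
     "modified": date(2015, 9, 1)},
    {"id": "e", "symptoms": [], "tags": ["hr"], "is_file": False,
     "modified": date(2015, 3, 1)},
]

ranked = [item["id"] for item in sorted(items, key=quality_sort_key)]
```

A tuple key keeps the priorities explicit: the modification date only matters among items that already agree on quality, tags and type.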
And that's what our process entailed. So in this project we needed to decide [unclear]. In case you haven't seen it: collective.jekyll gives you a little viewlet that appears in the byline. You see that little red thing in there? That's the viewlet collective.jekyll puts there, and it has a little drop-down arrow. It says "warning", and when you click on it, it drops down this summary of content quality symptoms. In this case, this page does not have a summary, and that's the one symptom; for everything else it says okay. So this is what Jekyll gives you, and it's really nice. I looked at the code; it's really well written, I think, and it's really extensive.

So which symptoms do we want to use, and how do we prioritize which ones we are going to fix first? We want to have really good titles. We want them all to be, you know, [Music] [unclear] the AP style guide, and be really clean about it, so that's something that I created a particular type of filter for. Then the summary, aka description: we want every content item to have a description. It has to be a complete sentence, and it must not be the same as the title, or even contain the title; that's another symptom. There is some stuff already in collective.jekyll, but we improved it. And then the ID: when you're creating a copy, the ID always has "copy_of" in front, and [unclear] that is first of all ugly, but moreover it's actually a symptom of work that was probably left unfinished.
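The symptoms just listed can be written down as simple checks. The sketch below is plain Python and does not use collective.jekyll's API; the function name `find_symptoms`, the symptom labels, the 60-character title limit, and the rule that a complete sentence ends with a period, exclamation mark or question mark are all assumptions made for illustration.

```python
def find_symptoms(item_id, title, description, max_title_length=60):
    """Return a list of content quality symptoms for one content item."""
    symptoms = []
    title = title.strip()
    description = description.strip()

    if len(title) > max_title_length:
        symptoms.append("long title")

    if not description:
        symptoms.append("no summary")
    else:
        # Rough stand-in for "is a complete sentence".
        if not description.endswith((".", "!", "?")):
            symptoms.append("summary is not a complete sentence")
        # Covers both "same as the title" and "contains the title".
        if title and title.lower() in description.lower():
            symptoms.append("summary repeats the title")

    # Leftover from copy and paste, likely unfinished work.
    if "copy_of" in item_id:
        symptoms.append("id contains copy_of")

    return symptoms
```

For example, `find_symptoms("copy_of_meeting", "Meeting", "Meeting")` reports three symptoms, while an item with a short title and a one-sentence summary that does not repeat the title reports none.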
The other thing is, it gives you the warning, and it also gives you a collection that you can use. But it computes all these symptoms on the fly, when you're actually requesting a page. So I created a custom index, so that all those symptoms are persistent in the catalog and we can create reports. The reports we can create with faceted navigation, and then you see the widgets in faceted navigation that have the symptoms. So then the managers could give the task to their editors: okay, let's start fixing long titles. And there are other widgets down there that you don't see, so that people could find all the things they have to fix.
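The difference between computing symptoms on the fly and keeping them in an index can be sketched with a small in-memory structure. This is a toy stand-in for a catalog index, not Plone's catalog or the faceted navigation add-on; the class name `SymptomIndex`, its methods, and the sample paths are hypothetical.

```python
from collections import defaultdict


class SymptomIndex:
    """Store symptoms per item so reports do not have to recompute them."""

    def __init__(self):
        # Maps a symptom label to the set of item paths that show it.
        self._by_symptom = defaultdict(set)

    def index(self, path, symptoms):
        """(Re)index one item, for example whenever it is saved."""
        self.unindex(path)
        for symptom in symptoms:
            self._by_symptom[symptom].add(path)

    def unindex(self, path):
        """Forget everything previously stored for this item."""
        for paths in self._by_symptom.values():
            paths.discard(path)

    def facet_counts(self):
        """Return how many items show each symptom, as a facet widget would."""
        return {s: len(paths) for s, paths in self._by_symptom.items() if paths}

    def search(self, symptom):
        """Return the work list for one symptom, sorted by path."""
        return sorted(self._by_symptom.get(symptom, ()))
```

An editor's work list is then a lookup such as `search("long title")`, and reindexing an item after it has been fixed removes it from the report.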
So that's how the process worked. Then tags: we decided to go with a controlled vocabulary [unclear]. With Plone Keyword Manager we eliminated all the bad keywords [unclear]. And with [unclear] we removed the widget that lets editors add new keywords; instead they can only pick from the controlled vocabulary.
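As an illustration of what a controlled vocabulary buys you, the sketch below splits an item's tags into those that match the vocabulary and those that would have to be cleaned up. It is plain Python, not Plone Keyword Manager or any Plone widget; the function name `split_tags` and the sample vocabulary are hypothetical.

```python
def split_tags(tags, vocabulary):
    """Split tags into (kept, rejected) against a controlled vocabulary.

    Matching ignores case and surrounding whitespace, and kept tags are
    returned in the vocabulary's canonical spelling without duplicates.
    """
    allowed = {term.lower(): term for term in vocabulary}
    kept, rejected = [], []
    for tag in tags:
        canonical = allowed.get(tag.strip().lower())
        if canonical is None:
            rejected.append(tag)       # not in the vocabulary
        elif canonical not in kept:
            kept.append(canonical)     # first occurrence only
    return kept, rejected
```

With the vocabulary `["Benefits", "Payroll", "Safety"]`, the tags `["benefits", "Benefits ", "benefit", "payroll", "perks"]` come back as `["Benefits", "Payroll"]` kept and `["benefit", "perks"]` rejected, which is the kind of duplicate and near-duplicate cleanup described for the stuffed keyword lists.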
Then we had to decide which facets we want in our search page, and for those we needed to create some custom indexes. The division is one; it's based on the default index [unclear] users. [unclear] bundled into [unclear], so this week, this year. And finally, [unclear] Amazon does: you don't see the widgets when you first [Music] [unclear].

I just want to give you a few takeaways. For TriMet, this was a process, and they needed to decide which symptoms they [Music] [unclear] on a schedule. For me, I had to build the indexes and create their reports, and [unclear] search page for Plone.

Okay, and for us as a community: we need to get better editor feedback on content quality. That's something that Castle CMS does, and I think we should definitely do something like that. We need to be able to give users a way to both filter and sort. The folder contents view in Plone 5 is great, and now you can do bulk updates, but you still can't really do a lot with the metadata, like content quality or the summary. So think about content quality; it always matters. [Applause]