Have content quality, will search your Intranet
Formal Metadata
Number of Parts | 61
License | CC Attribution 3.0 Germany: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/54943 (DOI)
Plone Conference 2017, 46 / 61
Transcript: English (auto-generated)
00:25
[Music] Okay, so this morning I want to talk to you about how TriMet improved
00:42
their search, and the challenges with search. So who is TriMet? TriMet basically runs the
01:01
entire public transportation system for the metropolitan area of Portland, Oregon. Just a few numbers: they have a budget of five hundred eighty-seven million, they cover a pretty large area, lots of
01:22
people, and they are a free-standing government agency that is not part of the city or the state. They run buses;
01:45
the MAX, which is the light rail, sort of like a long-distance streetcar, with lots of vehicles, five lines, [unclear] miles of service and stations; the
02:03
Westside Express Service, which is actually a full-size train for commuters, with a 15-mile service line, five stations, and six vehicles; and
02:21
then they run paratransit, the LIFT, which is a transit service, and
02:42
they run the streetcar, which is owned by the City of Portland but is actually maintained and operated by TriMet. And some of you probably know someone who [unclear] in Portland this year.
03:12
In terms of trips, people take a hundred million trips every year. In terms of
03:23
coverage, or size of the transportation system, it is the ninth per capita in the country, even though Portland is a relatively small city; it's the twenty-fourth largest city in the U.S. So in terms of transportation system, that
03:48
is very large. Talking about employees, we get to the point where I start thinking about the users of our systems. So TriMet has
04:05
twenty-nine hundred employees. Ninety percent of these work in the field, so they are supervisors, so therefore they
04:28
do need access to their so the nine
04:51
connect and actually so the majority of the employees the cannery for a little bit loan and it's there
05:03
There are just about a dozen, or maybe two dozen, roughly speaking, content developers who have editor access. So that gives you an idea of the user base. So if it's not in the
05:24
browser, we don't really know what to do with it. So let's talk about their IT infrastructure. So the public, so
05:48
[Music] [Music]
06:02
They do run Plone sites, and before this project they had over five Plone sites. I say "over" because five sites were actually actively being used, and there were some others that were basically no longer in use. So of those five, three were blogs, one
06:27
was a knowledge base, sort of a document management system, and it was, and still is, the repository for all of their
06:40
technical documentation: manuals for all the technical hardware, so [unclear] you can find information about anything: [unclear], buses, [unclear]
07:00
systems. And then they have the intranet, which is what the majority of this talk is about. After this project
07:22
there were two sites left, because the blogs were merged into the intranet [unclear], and the knowledge base was upgraded from Plone 3-something. The
07:42
reason why they were stuck on 3 was because they [Music]
08:10
[Music]
08:22
[Music] So while this project was in process, the blogs kept looking exactly the way they were before. All right, the project:
09:02
basically, the main goal was to improve the searchability of their intranet, and there were two pieces, three pieces, to this. One was, obviously, if it's not responsive
09:21
[Music] so this was so they have to news
10:22
publications, magazines and newspapers, you know, they have developed an art of basically guiding you, the user, to what they want you to find, and collective
10:43
cover is the perfect application for this. And so the idea was: let's see if we can surface the information that we want users to find in a dynamic way, so you don't have to create a bunch of
11:02
links on pages. And so I created a few custom tiles. At the time there was no calendar tile, so I developed a
11:20
calendar tile, and that is an example. So you, as a content developer, just create events wherever they need to be, and it surfaces them. Just one word here about this process of using covers and
11:42
Bootstrap: Lineage was great for this, because I created some new landing pages using collective.cover that were obviously themed with Bootstrap, and
12:00
tested them, worked with [unclear] composition process [unclear]. But they needed to build links to production
12:25
content. So instead of doing this work on staging, this all actually happened on the production server, by using subsites with Lineage. There were
12:43
so the rest of the site was completely unchanged, still the old theme, nothing new, but the subsite had a new theme, Bootstrap, and cover installed and running in there. And so they could create these covers, and then on
13:01
launch day I could just turn on the theme for the rest of the site, copy and paste the cover objects over to wherever they needed to be, and switch them to be the default landing pages for those folders. So really, that's one reason why
13:22
definitely one of my stars in the contest. So most of the description under there does
16:24
not tell you anything about where the search found the keywords; in other words, you don't see these words anywhere. So they
16:45
were really frustrated by that, or they turn up files [unclear]. And now for the same search we have 28 results, and this is what it looks like
17:04
now, for the same search. So this is just a little preview of more to come later. Another workaround is keyword stuffing. So somebody has not quite figured out what this keyword thing is: "I
17:21
wonder if I can trick the search by putting, you know, all the keywords that I think people want to use when they search for this particular page; I put them all in here." It results in, like, this whole table [unclear],
17:43
and so there were about 80 keywords, most of which were duplicates, where they were like singular and plural forms of the same word. [Music]
18:09
so just what they did to replicate this
18:56
sound and you know when you are a user
19:04
looking for something in particular [Music]
19:30
taken and like a bird's-eye view of Google search results [Music]
19:42
and it's done I think so okay so
20:54
Actually, it's really, really important to know what you're clicking on. Just looking at the domain,
21:03
I get a good idea of what [unclear], so that is really important. The byline: Google doesn't show that, [unclear] I think it's very important. Snippets, aka description, aka summary,
21:22
which we [unclear]. We decided that tags are useful to have, and then icons or thumbnails could also be
21:44
very good hints. I implemented something for that, but it ended up not being used for now; maybe later. So [Music]
22:07
this is a so just using the description
22:46
[Music] [Music]
23:00
give me some information about this thing this repeat the type so but to be honest this is not the user it's a little bit so these examples that show
23:48
even if they that's where that's all
25:03
search results like that we wish
25:29
I'll get there. Okay, not like Google. You had one job: give
25:47
me the picture of a guy without a [unclear]. Actually, search is about two jobs: sorting and filtering. Because as
26:04
soon as [unclear] that's just over one trillion possible search results. It doesn't just sort that; it doesn't just give you a trillion search results sorted by the keyword that you typed. No, it also
26:21
filters. But in Plone, the scoring algorithm that is used is something called Okapi BM25; BM stands for "best match", and it's not too scary. There is
26:45
actually, even if you don't have an idea what this means and what this does, a lot you can learn just by looking at the formula and looking at what it depends on.
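For readers who have not seen the formula on the slide, here is a minimal Python sketch of textbook Okapi BM25. It is not taken from the talk and is not necessarily the exact variant Plone's text index implements; the function name `bm25_score`, the parameter values `k1=1.2` and `b=0.75`, the smoothed IDF, and the three-document corpus are all assumptions made for illustration.

```python
import math
from collections import Counter


def bm25_score(query_terms, doc_tokens, corpus, k1=1.2, b=0.75):
    """Score one document against a query with textbook Okapi BM25.

    corpus is a list of token lists; doc_tokens is one of its members.
    k1 and b are the usual free parameters (common defaults, assumed here).
    """
    n_docs = len(corpus)
    avg_len = sum(len(d) for d in corpus) / n_docs
    term_freq = Counter(doc_tokens)
    score = 0.0
    for term in query_terms:
        freq = term_freq[term]
        if freq == 0:
            continue  # a keyword that is absent contributes nothing
        docs_with_term = sum(1 for d in corpus if term in d)
        # Smoothed inverse document frequency; the "+ 1" keeps it positive.
        idf = math.log((n_docs - docs_with_term + 0.5) / (docs_with_term + 0.5) + 1)
        # Ratio of this document's length to the average document length.
        length_ratio = len(doc_tokens) / avg_len
        score += idf * freq * (k1 + 1) / (freq + k1 * (1 - b + b * length_ratio))
    return score


# Hypothetical three-document corpus, already split into keywords.
corpus = [
    "bus schedule for the holiday bus service".split(),
    "holiday party planning committee notes".split(),
    "rail maintenance manual".split(),
]
scores = [bm25_score(["bus"], doc, corpus) for doc in corpus]
synonym_score = bm25_score(["coach"], corpus[0], corpus)
```

The only inputs are keyword counts, document lengths, and document totals, so a synonym such as "coach" scores zero against a page about buses, and nothing in the score reflects where in the site or where in the document a keyword appears.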
27:03
So what it depends on is the frequency of each keyword in the document you are computing the score for, so how often does this keyword appear; the length of the document, the average
27:24
length of all documents, the ratio of those two; and then the total number of documents in the whole set and the number of documents containing this particular keyword. That's it, that's all this depends on. If you think about
27:42
that for a minute, you realize that this scoring does not understand anything about the context of a keyword, in terms of either the location in the site or the position of
28:01
the keyword inside the document. It does not consider any other keywords, so if you're using a synonym, for example, that's your fault. If you
28:22
misspell something, it can't help you there, and obviously it doesn't offer any suggestions. So this is why Plone's default search is, you know, fine when you install a Plone site and start
28:40
creating content: ten pages, a few dozen pages, a couple hundred pages. Now you get to a thousand, ten thousand: forget it. So thinking about that, I thought, okay,
29:02
I'll make my own sort order, an index that depends on these four elements, in this order. So first, content quality. For two items that have the
29:23
same content quality, it will then score items with tags higher; and then, for the same score, it will rank content that is not
29:42
a file higher, and then files. And then also last modified seems to be a pretty useful indicator of quality. So that's about the
30:08
sorting, and now let's talk about filtering, because that's the second big job the search engine performs: eliminating anything that doesn't have anything to do with the query. And so what
30:26
if users do it themselves? [unclear] But first we have to decide which facets we want to use.
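The four-element sort order described a moment ago (content quality, then tags, then non-file content before files, then last modified) can be sketched as a tuple sort key. This is only an illustration in plain Python, not the index built for the project; the `Item` class, the `rank_key` function, and the idea of measuring quality as a count of symptoms are assumptions.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Item:
    title: str
    symptom_count: int  # assumed measure: fewer symptoms means better quality
    has_tags: bool
    is_file: bool
    modified: date


def rank_key(item):
    """Tuple compared left to right; smaller values sort first."""
    return (
        item.symptom_count,          # 1. content quality
        not item.has_tags,           # 2. tagged items before untagged ones
        item.is_file,                # 3. pages before files
        -item.modified.toordinal(),  # 4. most recently modified first
    )


# Hypothetical items to rank.
items = [
    Item("Old PDF", 0, True, True, date(2015, 1, 1)),
    Item("Untagged page", 0, False, False, date(2017, 1, 1)),
    Item("Fresh page", 0, True, False, date(2017, 6, 1)),
    Item("Stale page", 0, True, False, date(2016, 6, 1)),
    Item("Messy page", 2, True, False, date(2017, 9, 1)),
]
ranked = [item.title for item in sorted(items, key=rank_key)]
```

Because tuples compare element by element, each later criterion only breaks ties left by the earlier ones, which matches the "for the same score, then..." wording in the talk.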
30:40
And that's what our process entailed. So in this project we needed to decide [unclear]. So in case
31:18
you haven't seen it, it gives you a
31:21
little viewlet that appears in the byline. You see that little red thing in there? That's the viewlet collective.jekyll puts there, and it has a little drop-down arrow. It says "warning", and when you click on it, it drops down this summary of content
31:42
quality symptoms. In this case, this page does not have a summary, and that's the only one; everything else is fine. [unclear] it says okay. So
32:07
this is what Jekyll gives you, and it's really nice. I looked at the code; it's really well written, I think, and it's really extensible. So what symptoms do we
32:28
want to use, and prioritize: which ones are we going to fix first? So we want to have really good
32:41
titles, so we want them all to [Music] follow the style guide and be really clean about it. So that's something that I had a particular type of filter that I
33:02
created. Then the summary, aka description: we want every content item to have a description. It has to be, you know, a complete sentence, and it has to be not the same as the title, or even
33:20
contain the title; that's another symptom. There is some stuff already in collective.jekyll, but we improved it. And then IDs: you know, when you're creating a copy, the
33:41
ID always has "copy_of" in front, and so forth. That is, first of all, ugly, but moreover it's actually a symptom of work that was probably left unfinished. And so...
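The symptoms just listed (titles, summaries, leftover copy IDs) can be sketched as simple checks. This is a standalone illustration in plain Python and does not use the collective.jekyll API; the function name `content_symptoms`, the symptom labels, the 60-character title limit, and the "ends with punctuation" test for a complete sentence are all assumptions.

```python
def content_symptoms(item_id, title, description, max_title_length=60):
    """Return the names of the content quality symptoms found on one item."""
    symptoms = []
    title = title.strip()
    description = description.strip()
    if len(title) > max_title_length:
        symptoms.append("title too long")
    if not description:
        symptoms.append("summary missing")
    else:
        # A summary that equals or contains the title adds no information.
        if title and title.lower() in description.lower():
            symptoms.append("summary repeats title")
        # Crude stand-in for "is a complete sentence".
        if not description.endswith((".", "!", "?")):
            symptoms.append("summary is not a complete sentence")
    # IDs starting with "copy_of" usually mean a copy was never cleaned up.
    if item_id.startswith("copy_of"):
        symptoms.append("id left over from a copy")
    return symptoms


# Hypothetical items.
unfinished = content_symptoms("copy_of_holiday-schedule", "Holiday schedule", "")
clean = content_symptoms(
    "holiday-schedule",
    "Holiday schedule",
    "When buses and trains run on public holidays.",
)
```

Returning a list of symptom names, rather than a single pass or fail flag, is what lets each symptom later become its own filter when editors go looking for things to fix.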
34:03
The other thing is, it gives you the warning, and it also gives you a collection that you can use, but it computes all these symptoms on the fly when you are actually
34:20
requesting a page. So I created a custom index, so that all those symptoms are persistent in the catalog and we can create reports. The reports we can create with faceted navigation, and then you see the widgets in faceted
34:42
navigation that have the symptoms. And so then the managers could give the task to their editors to go in: okay, let's start fixing long titles. And there are other widgets down there that you don't see, so that people could find all the things
35:02
that they have to fix. So that's how the process worked. Then tags: we decided to go with a controlled vocabulary, [unclear]. So with PloneKeywordManager they eliminated all the bad keywords, [unclear], and with
35:20
[unclear] we removed the widget that lets editors add new keywords; instead they can only pick from the controlled vocabulary. Then they had to decide: what facets do we
35:41
want in our search? And for those we needed to create some custom indexes. The division is one; it's based on the default index [unclear]
36:37
so this week this year and finally
37:39
Amazon does don't see the widgets when
37:43
you first [Music] [unclear]. I just want to give you a few takeaways. So for TriMet, this meant this was a
38:09
process, and they needed to decide which symptoms they [Music]
38:21
on a schedule. For me: I had to be the one to create the indexes and to create their reports and the search page. For Plone, okay, this is for us as a community: we need to give editors better
38:42
feedback on content quality. That's something that Castle does, and I think we should definitely do something like that. We need to be able to give users a way to both filter and sort. The folder contents view in
39:03
Plone 5 is great; now you can do bulk updates, but you still can't really do a lot with the metadata, like content quality or the summary. So think about
39:22
content quality; it always matters. [Applause]