Data mining: Role of TeX files?
Cork, Ireland

Nowadays most journals accept TeX files from authors, and they publish PDF files of the article, or of its abstract, on their web sites. A user wants to search the repository for a particular word or concept. Can the availability of the TeX files make this search faster? TeX files are structured documents, so we can extract most of the important words from a document and make them available to the user as groups of related words. This would let the user search faster and find the required document in minimum time.
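The idea in the abstract can be illustrated with a minimal sketch. It assumes that the arguments of a few structural commands (\title, \section, \keywords, \index, \textbf and similar) carry the important words. The command list, the stopword list, and the function names extract_terms and build_index are hypothetical choices made for illustration, not part of the talk. The regular expression does not handle nested braces, so a real tool would need a proper TeX parser.

```python
import re
from collections import defaultdict

# Assumption: these commands mark "important" text. The list is illustrative only.
STRUCTURAL_COMMANDS = (
    "title", "section", "subsection", "keywords", "index", "textbf", "emph",
)

# Matches \command{argument} or \command*{argument}; nested braces are not handled.
COMMAND_RE = re.compile(
    r"\\(" + "|".join(STRUCTURAL_COMMANDS) + r")\*?\{([^{}]*)\}"
)

# Assumption: a tiny stopword list, just enough for the example.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}


def extract_terms(tex_source):
    """Group the words found in structural commands by command name."""
    groups = defaultdict(set)
    for command, argument in COMMAND_RE.findall(tex_source):
        for word in re.findall(r"[A-Za-z]+", argument.lower()):
            if word not in STOPWORDS:
                groups[command].add(word)
    return dict(groups)


def build_index(documents):
    """Build an inverted index: word -> set of document names containing it."""
    index = defaultdict(set)
    for name, source in documents.items():
        for words in extract_terms(source).values():
            for word in words:
                index[word].add(name)
    return dict(index)


if __name__ == "__main__":
    docs = {
        "a.tex": r"\title{Data Mining of TeX Files} \section{Clustering} "
                 r"some \textbf{inverted index} text",
        "b.tex": r"\section*{Clustering of Documents}",
    }
    index = build_index(docs)
    # Look up which documents mention a word in a structural position.
    print(sorted(index.get("clustering", set())))
```

Looking a word up in the index avoids scanning the full text of every document, which is where the hoped-for speed-up would come from. Grouping the words by command, as extract_terms does, is one simple way to present "groups of related words" to the user.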
the the and a lot form of relative so obviously we look more in the matter applications and so lending in the last for some reason hello but also simultaneously working on some of the I was it there are some good of values that
is 1 the I that thing is of the results were also the only in the you know some information From Table of table legal along and he died and find out that the means it is the 1 has a has little but especially the young looking for research the the very keen on getting at the at the cost is the so these are not in use several scenario ensemble and the thinking is that contain of the and you know that is very fast it was the 1 has some we want to have some knowledge the existing users on the list can be used kind of if you want to there are something like this to find out of the so these all all questions that in the mind so this is a one tho actually main all so the use of ask class only kind then the user goes to the heart of it and it's all done this side then use only last visited course that is not the only ones looking so can
millions of and the not at the time of the and initially I want all hands that it will be and how would do so this is very so time was some was doing the 2nd alignment is and there's also so I was thinking you made on hollow 1 with a good idea to get some users faster is a final time dependent the that bundles of question in my mind this
so all as you know that the are given access to registered users on the and the new European period file this too but they also give big files and then so that is 1
you of videos and his idea so that kind of not having enough for the use of the but alters this is also something that's fine so that the dialog is don't so can we make use of this table it is the only thing I don't know and and then it is a
general many of them ended up 1 and on on you the of the relation but you also you on the status of the that's not classify you find something like this so from users ideas and wants to have the peak
because what some consensus on that was beyond the find also it is usually to get it done fast know how to do what is the optimal solution on this they
so is it possible to plot the lon of summary summary and there was determined in the unit and meaning 2 of the causes of inventories all depend on just so again we long clusters the the we don't have that in some similar consists for Monday so can you think of as it wasn't that this so there is not known but general so if you have 1 of
consider resource in the late and is more sensitive it's exactly meant to it will take more time so can you find some emitting so that so that don't plaintext so can be I wanted and still want to give some by users so far if you if you go on documents up before we get some clarity and documents of authority and then if you have some will be presentation and the that mean that that they can become bending the rules that is not what I so these clustering method use of anybody many developed and
7 that's missing so that's going to have a little the loop gain on that is 1 the so all of the beta testing the residues and then how do find in so this Bible verses that they're in the mine so group can deform the doubling trick that is 1 challenge so far again in optimal but that it was in the fall began defined and
presented in the company and the the maintenance is what I hope will get and you
present deformable committee that is suppose we have documents of the controls and words you know and and then you don't want a lot weird they're getting the something else we don't want but the knowledge of the document you know in some of the men in an optimum if there is some bonding you would like to highlight the the high information is the model documents that kind of information is that's the so another question is we understand how
information is all what was it began in the central and you want information but now how pattern million you have pre-defined and if the text extraction will the begin but then in is this text format RDF format but I want to get to that Committee so that means that we find some not better FIL is human 5 is
to find format is that enough money the you can also look at the bottom and some users may gas boldface then there was a mechanism users and we can get some tests it's be on in this rules but I mean environment having more lots 1 in the loss of words In this baseline is the if you want the next about you just give someone Indian even more than this 1 so that meant find debt-financed somebody has commands so that is 1 1 that compares with him 5 and all United's about index 5 this
mainly commands that combines the understand but they come pounded you incumbent highly skilled most among the number of the section on the of month in the sense of In this of all the it does me on this the most then has to go in was the next the 1 that of in the command and new rate of all these commands on the station was we describe the document so this 1 unintelligible understand given as the combined by but then why not captured these commands and guides and who is all these commands I put it in some specifying this can on that's the form but she journalists for the document the so the working and you will
like that yeah it's foreign big company what the independent and exciting you can extend we just to do household or community and then that the general standard and some fine benefit others so that when you have something instead of looking at the same time the the removal of final features and the music from Gray not the general file that we can use some of that because the
general have only in modern was later yeah I mean you stop is he's an these last molestation meaning gender so you can have on the set I don't moral and again captain was this has been done that side they come on the side but there must be getting things and comprehensive and and use organisms the uncertainty in the germ line for each
document now we can assume those who understand this wall can find 1 rule so there is no if you have 70 per cent matching features improvise well known together the set of this and all this kind of rules be some experiment the other media model get some groups and those little standard visited and 70 % form when he was In the tubules embodiment and it did so it's only can have some destiny for the native now if the have 10 minutes some pretty strong morning is dopamine in the dead we haven't we can take in the last of the present active and that because of this and and you can see that the process a gave
them the control and assassinating so so that the money that
will have some level so that was taken visibility on this side I'm not going to get making it possible to get problem find the user and you can have some the other illnesses of the things that excite them useful the beginning of a set of like you know this also that is also we can use the standard this ability is not what we have is just an idea about it and I was looking on destinations ended the form of the documents and that arise from that they say can be embedded in it look you said the study so that this idea of in my mind so this this is just a lot of thinking not implementation of the law the only thing is that the began can do that then you solve it just didn't what say because the sign is based so we can make use of that structure of the big 5 and do guys don't you that is the idea that another thing is is that a lot in so I would like to discuss and I like a of small New what and on and on the no that noted this is not possible because of this so that it can relate to an organization and if the portals so and that's it you if a b words is yeah talking about the most prominent onto this users and assume that is really true but it is not all the time and they can go on maternity in the human being is going on of there no question is if those they don't is not sufficient the inspiration this all this is and that many this can work well and in the end users involved in the situation and in general economic group that fossils it is a little yeah I would have to this here the would you come in next week the 1st half of this time of knowing it this means that we mean that I gave more and think this time is productivity and as we were not hold as the this is in the yeah it's a lot work in this and that you so what I'm really talking about adopted in line and then use the software and it was used in Germany there are many the conference the vector and only next week that is what you to the next of work the the the that will search for the Hungarians would think 1 
thing that struck me as being potentially serious problem with this is the strength of the green programming and all those making more use of you so you trying to identify some sort of your your key words in the text so you have to actually to some extent on the on the part of the depends on what the authors doing right on the basis of the mean financing next month because in some structure that you usually has built using the I I don't know what it is to work on problems on this so what on the you know the common cause of words you get a serious problem for you dining resources because it will be text you have a more graceful and such year or something which for the reality of what is the source of task which you looking for I don't think you can use the OK so I'm and I'm concerning the subsite signals from that point things happen very and when we're at so what you're still seeing oxygen right and I know I only learning in the basal and nominative many business you know what you want is the sum of the of the of the medical books look at a single people of through indexes blue region will and there is a little all of these things on 1 from the cost we do have something talk because the quality is destructive to to have your own environment not the money that structure rather than system is almost certainly of the structure of the or final we find thing is it was useful in Wall Street doesn't and not just to get meeting on formants for the application of ordinary women were made on the top of the you can have feature also ordered if it if it involves it involves in some sort of scenario and if we take it as well in theory if you take the log of accuracy and you find anything is useful In 1 that and there some tools on that your saying that the commands and as I find formal journals the standpoint is taking a general use commands not only the oldest taking care of me something that we think of 1 of your life understand only some we also itself with preamble his 
father relation looking you want me to actually cost you know this is the sort of easier for you know the wall of the left in this next phase of the war he was looking at and this was 1 of a series shows that you would need to do a certain level of we want to take all of the energy in time to his from the she was the of the of the of the of the of the of the so called because you want to do and how we might as well as on the