
Harnessing the Power of AI for Managing Grey Literature

Formal Metadata

Title
Harnessing the Power of AI for Managing Grey Literature
Title of Series
Number of Parts
14
Author
License
CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Computer animation
Transcript: English (auto-generated)
Harnessing the Power of Artificial Intelligence for Managing Grey Literature. Presentation made by Dobrica Savić at the 25th International Conference on Grey Literature, Confronting Climate Change with Trusted Grey Resources. The conference was held from 13th to 14th November 2023 in Amsterdam, the Netherlands.
When encountering a new topic in an article, book, illustration or presentation, there are typically three questions that come to mind. Firstly, we often wonder, what is this all about?
This presentation focuses on the power of ChatGPT and related AI systems which belong to a group of Large Language Models or LLMs. Immediately following is a somewhat provocative question, so what? Many experts predict that 30% of all jobs will be replaced by artificial intelligence in just a few years,
while many others will undergo substantial transformation. Lastly, we might ask ourselves, what's in it for me? This conference and presentation cater specifically to information and knowledge managers operating in the realm of grey literature management.
Apart from providing a brief overview of ChatGPT, it offers insights into potential ways of managing grey literature using this new technology, transitioning from current information management practices to AI-driven transformations.
Let's take a moment to explore the significance of grey literature. Grey literature refers to any recorded, referable and sustainable data or information resource of current or future value made publicly available without undergoing the traditional peer review process.
According to GreyNet, there exist over 150 types of grey literature. These encompass reports, feasibility studies, dissertations, proceedings, news releases,
newsletters, brochures, notes, posters, blogs, datasets, databases and various others. Grey literature originates from diverse sources, including individuals, businesses, public institutions, research centres and local, national or global organisations.
It can exist in electronic or paper-based formats generated by either machines or individuals. The volume of grey literature is vast and boundless. Millions upon millions of grey literature items are already accessible, with more being generated daily.
However, a significant challenge lies in locating and identifying specific grey literature documents. One of the primary reasons for this challenge is that search engines lack mechanisms for distinguishing grey literature specifically. Nevertheless, grey literature offers several significant advantages.
First, diverse perspectives. Grey literature provides valuable insights from non-traditional sources, such as government reports, conference proceedings and unpublished research, thereby offering a broader perspective on a given topic.
Second, filling information gaps. It often contains specialized and niche knowledge not readily available in mainstream publications, helping to address gaps in existing research. Third, timely and current information.
Grey literature is typically produced more rapidly than formal publications, serving as a valuable resource for staying abreast of the latest developments and trends in a particular field. And finally, it supports evidence-based decision-making. Access to grey literature empowers researchers, policymakers and practitioners to make well-informed
decisions by incorporating a wide range of evidence beyond peer-reviewed journals and books. Let's dive into the capabilities of ChatGPT, beginning with a concise overview.
Chat Generative Pre-trained Transformer, or ChatGPT, is a sophisticated machine learning model adept at executing natural language processing (NLP) tasks with remarkable precision. It simulates human conversation, exhibiting a level of fluency that enables it to pass the Turing test.
Developed by OpenAI in 2021 and launched in November 2022, ChatGPT rapidly garnered an immense user base. Within a week of its release, it acquired 1 million users, reaching a staggering 57 million users in its inaugural month.
By January 2023, it surpassed the 100 million milestone. While over 180 million individuals have created ChatGPT accounts to date, approximately 100 million remain active users. Currently, the website experiences a staggering 1.8 billion visits per month.
GPT-3 has 175 billion parameters; its successor, GPT-4, launched on March 14, 2023, is reported to have a colossal 170 trillion parameters.
Notably, ChatGPT isn't the only large language model in circulation. Numerous major IT corporations have either developed or are in the process of developing their own. For instance, Baidu has Ernie Bot. Google's AI is called Bard.
Microsoft Bing incorporates GPT technology. Amazon unveiled Bedrock with Titan Text, similar to ChatGPT. And Elon Musk's AI startup xAI recently introduced its inaugural AI model, named Grok.
This prompts the question, what attributes contribute to ChatGPT's widespread popularity and how can it benefit us? The scope of potential applications for ChatGPT is extensive. Integration and utilization of extensive knowledge from diverse sources like books, articles and websites.
Providing comprehensive answers across various topics. Generating coherent and contextually relevant text, drafting emails and crafting creative written content. Assisting in coding by offering snippets, code suggestions, explanations and debugging aid.
Facilitating multilingual text translation. Engaging in simulated conversations with users. Automating customer support, offering round-the-clock assistance without human intervention.
Categorizing, classifying, tagging and auto-generating metadata. Summarizing lengthy text, extracting pivotal information into concise summaries. Expanding users' knowledge base by delivering insight on diverse subjects, historical events, scientific concepts and current affairs.
Facilitating meetings by providing summaries and identifying key decisions and action items. Generating diverse content, including images, jokes, stories and poems. Adapting explanations to specific styles and successfully executing a multitude of other tasks.
Here are just some notable instances showcasing the diverse applications of ChatGPT. Microsoft effectively employs ChatGPT to enable users to conduct searches and obtain results using a conversational interface.
Duolingo, recognized as the world's largest platform for learning foreign languages, leverages ChatGPT to offer students comprehensive explanations in natural language akin to guidance from a human tutor.
Slack integrates ChatGPT to streamline workflow management and project administration, enhance productivity and facilitate communication among team members. Octopus Energy, a prominent British renewable energy group specializing in sustainable energy solutions, delegates 44% of its customer inquiries to ChatGPT.
CheggMate harnesses ChatGPT capabilities to assist students with assignments, providing support akin to interacting with human professionals. Freshworks slashes the development time for complex software applications from 10 weeks to less than a week by utilizing ChatGPT.
Udacity employs GPT-4 to craft an intelligent virtual tutor capable of delivering personalized guidance and feedback to students. Air India utilizes ChatGPT to elevate customer-centric offerings on its website, including FAQs, pilot briefings and other related services.
These examples underscore the versatility and effectiveness of ChatGPT across diverse industries and applications, showcasing its ability to enhance processes, improve user experiences and streamline operations.
Having reviewed the fundamental characteristics of grey literature and the capabilities of ChatGPT, let's explore how ChatGPT can enhance the management of grey literature.
To achieve this, we will first examine the primary workflow phases involved in managing grey literature. Subsequently, we will outline the current key functions performed within each phase. Finally, we will explore the potential role ChatGPT can play in transforming these phases while fulfilling the necessary functions associated with each.
A high-level overview of the workflow for grey literature reveals four primary phases. They are pre-processing, processing, post-processing and the final utilization phase. The pre-processing phase involving the collection of grey literature encompasses key functions such
as identification, selection, acquisition, purchasing or requesting, obtaining items, formatting and scanning if necessary. The processing phase is pivotal for creating metadata achieved primarily through descriptive
cataloguing, preparing bibliographic descriptions, assigning subject categories and keywords and crafting abstracts. Post-processing activities entail managing a repository of grey literature.
This includes sending and receiving documents, repository management and maintaining associated IT systems on both the back and front ends. During the utilization phase, grey literature becomes available for search and retrieval. The search is typically accomplished through keyword-based queries that match search terms with document metadata or its content.
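As a rough illustration of the keyword-based matching just described, the sketch below scores records by how many query terms appear in their metadata. The record fields (`title`, `keywords`, `year`), the sample records and the scoring rule are all assumptions made for this example, not the behaviour of any particular repository.

```python
import re

def tokenize(text):
    """Lowercase a string and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())

def keyword_search(records, query):
    """Return records matching at least one query term, best match first.

    Relevance is the number of distinct query terms found in the title
    or keywords; ties are broken by the most recent year.
    """
    terms = set(tokenize(query))
    scored = []
    for rec in records:
        haystack = set(tokenize(rec["title"])) | {
            term for kw in rec["keywords"] for term in tokenize(kw)
        }
        score = len(terms & haystack)
        if score:
            scored.append((score, rec["year"], rec))
    # Sort on (score, year) only, so the record dicts are never compared.
    scored.sort(key=lambda item: (item[0], item[1]), reverse=True)
    return [rec for _, _, rec in scored]

# Hypothetical sample records, invented for this example.
RECORDS = [
    {"title": "Annual report on reactor safety",
     "keywords": ["nuclear", "safety"], "year": 2019},
    {"title": "Climate adaptation feasibility study",
     "keywords": ["climate"], "year": 2022},
    {"title": "Nuclear safety workshop proceedings",
     "keywords": ["nuclear", "proceedings"], "year": 2021},
]

results = keyword_search(RECORDS, "nuclear safety")
for rec in results:
    print(rec["year"], rec["title"])
```

The user gets back a ranked list of records and still has to open and read them, which is the behaviour the conversational approach is contrasted with later in the talk.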
Results are usually sorted by potential relevance or date. Now, let's explore how ChatGPT can transform each phase of grey literature management and the improvements and benefits it can offer.
The traditional pre-processing phase involving the collection of grey literature can transition towards web scraping. ChatGPT, when appropriately directed, can effectively extract information from various online sources like websites, databases, journals, conference proceedings and preprint servers.
Metadata creation can potentially be replaced by automated tagging and metadata generation, alongside contextual analysis and summarization. This approach is already widely adopted by several information databases and repository providers.
A significant shift in user experience will occur, moving from a database-style Boolean and keyword search user interface to conversational interfaces. A ChatGPT-based interface will offer specific replies through dialogue instead of merely providing a list of information sources for further search.
Closely related to user experience and maximizing the use of valuable information resources is ChatGPT's ability in natural language processing. It can comprehend context-based queries, conduct exploratory searches on
related subjects, offer personalized recommendations and expand searches in unforeseen directions, uncovering unexpected facets and possibilities for the user's attention. Let's delve deeper into the techniques of web scraping, particularly exploring its significant
features that render it highly valuable for enhancing the management of grey literature. Web scraping offers direct targeting of grey literature. By fine-tuning the ChatGPT model through pertinent training data
and specific prompts relevant to grey literature, it can effectively aid in retrieving and extracting targeted information from websites. This encompasses a wide range of parameters, including different geographical locations, organizations, specific journals, topics, and predefined access frequencies.
ChatGPT possesses the ability to cross-reference and verify scraped information against reputable sources, ensuring the accuracy and reliability of publications. This validation process ensures that gathered information is credible and meets requisite quality standards.
Another notable feature of ChatGPT is its capacity for knowledge integration. It can seamlessly integrate scraped grey literature with its existing knowledge base, providing supplementary context, related articles, historical data, or scientific background.
This enrichment significantly enhances the overall comprehension of the topic. Furthermore, ChatGPT's multilingual capabilities empower it to scrape grey literature from websites in diverse languages. This versatility broadens the scope of data collection and analysis, enabling a more comprehensive understanding across various linguistic domains.
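The collection step of web scraping can be sketched without any AI component at all. The example below, using only the Python standard library, pulls links to document files out of a page's HTML; a language model would then be applied to the documents found. The class name, the file extensions and the sample HTML are assumptions made for illustration, and no network request is made.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class DocumentLinkParser(HTMLParser):
    """Collect absolute URLs of links that point to document files."""

    def __init__(self, base_url, extensions=(".pdf", ".docx")):
        super().__init__()
        self.base_url = base_url
        self.extensions = extensions
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Compare case-insensitively so "Poster.PDF" is also picked up.
        if href and href.lower().endswith(self.extensions):
            self.links.append(urljoin(self.base_url, href))

def extract_document_links(html, base_url):
    """Return document links found in html, resolved against base_url."""
    parser = DocumentLinkParser(base_url)
    parser.feed(html)
    return parser.links

# Hypothetical page content, invented for this example.
SAMPLE_HTML = """
<ul>
  <li><a href="/reports/annual-2023.pdf">Annual report</a></li>
  <li><a href="about.html">About us</a></li>
  <li><a href="https://example.org/files/Poster.PDF">Poster</a></li>
</ul>
"""

links = extract_document_links(SAMPLE_HTML, "https://example.org/library/")
for link in links:
    print(link)
```

In practice the HTML would come from a fetched page, and the targeting criteria mentioned above (organization, topic, location) would decide which pages are visited and how often.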
Metadata creation stands as an area where leveraging ChatGPT can yield swift and substantial benefits.
Through an analysis of specific document content and context, ChatGPT can automatically generate pertinent metadata, including author names, publication dates, journal titles, and other essential citation information. This automated process significantly streamlines cataloguing and referencing of grey literature, resulting in considerable time and effort savings.
Capitalizing on its adept contextual comprehension, ChatGPT excels in identifying relationships between concepts, detecting nuances in terminology usage, and establishing connections across various research domains.
Such capabilities empower researchers to attain comprehensive insights into specific topics, pinpoint knowledge gaps, and explore potential research trajectories. The process of automated tagging involves ChatGPT analyzing the content of grey literature
to extract key topics, concepts, and keywords that precisely represent the document's subject matter. These tags, when integrated with relevant knowledge organization systems, serve as invaluable metadata, enhancing efficient document organization, searchability, and retrieval.
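To make the idea of tying tags to a knowledge organization system concrete, here is a deliberately simple stand-in: it counts terms in a text and maps them to preferred subject headings from a small controlled vocabulary. This is plain term counting, not a language model, and the vocabulary and sample text are hypothetical.

```python
import re
from collections import Counter

# Hypothetical mini-vocabulary mapping free-text terms to preferred tags.
VOCABULARY = {
    "reactor": "Nuclear reactors",
    "reactors": "Nuclear reactors",
    "safety": "Safety",
    "climate": "Climate change",
    "emissions": "Climate change",
}

def suggest_tags(text, vocabulary=VOCABULARY, max_tags=3):
    """Return up to max_tags preferred terms, most frequently evidenced first."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(vocabulary[w] for w in words if w in vocabulary)
    return [tag for tag, _ in counts.most_common(max_tags)]

# Sample text invented for this example.
text = (
    "The report reviews reactor safety after the upgrade. "
    "Safety margins for older reactors are discussed, and "
    "safety-related emissions data are included."
)
tags = suggest_tags(text)
print(tags)
```

A language model could propose candidate terms from context rather than exact word matches, but mapping its suggestions onto a controlled vocabulary in this way keeps the resulting tags consistent across a repository.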
ChatGPT's ability to generate concise summaries, encapsulating the essence of lengthy scientific articles, proves immensely beneficial. These summaries furnish researchers with an overview of document content, expediting the review of pertinent literature. They also
aid in filtering relevant resources and significantly contribute to informed decision-making concerning the document's relevance and significance. Here is an illustration of how ChatGPT can effortlessly generate Dublin Core metadata from an article.
In this instance, the article under consideration was one of my previous works, titled When is Grey Too Grey? A Case of Grey Data. This article was featured in the conference proceedings of the 20th International Conference on Grey Literature, held in New Orleans in December 2018.
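The record shown on the slide is not reproduced in this transcript. As a minimal sketch of what a Dublin Core record for this article might look like when serialized to JSON, the example below fills in only the details mentioned in the talk. The `dc:` key prefix, the function name and the choice of `type` value are assumptions, and elements not mentioned in the talk are left out rather than guessed.

```python
import json

# The 15 elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)

def to_dublin_core_json(fields):
    """Serialize a dict of Dublin Core fields to JSON with 'dc:' keys.

    Unknown element names raise ValueError; empty values are dropped.
    """
    unknown = set(fields) - set(DC_ELEMENTS)
    if unknown:
        raise ValueError(f"Not Dublin Core elements: {sorted(unknown)}")
    record = {
        f"dc:{name}": fields[name]
        for name in DC_ELEMENTS
        if fields.get(name)
    }
    return json.dumps(record, indent=2, ensure_ascii=False)

# Only values stated in the talk are used; "type" is an assumed label.
record_json = to_dublin_core_json({
    "title": "When is Grey Too Grey? A Case of Grey Data",
    "date": "2018-12",
    "type": "Conference paper",
    "source": "Proceedings of the 20th International Conference on Grey Literature",
})
print(record_json)
```

Checking field names against the element set before serializing is one simple way to validate machine-generated metadata before it enters a repository.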
As demonstrated, the output generated by ChatGPT captures a comprehensive set of Dublin Core metadata. Beyond its proficiency in creating valid Dublin Core records, ChatGPT showcases its
capability to generate the entire metadata in JSON format with a simple command. Here is an example of such a record in JSON format. To illustrate the disparity between future user experiences and current interfaces, let's examine a
sophisticated and widely used search interface within the INIS (International Nuclear Information System) repository. Additionally, we'll explore an example of utilizing ChatGPT to inquire about the same subject, specifically nuclear information.
The INIS repository search interface offers diverse search options, allowing users to explore all content, bibliographic records, or exclusively access full-text content. Furthermore, it provides the ability to refine results to those with full-text access.
With nearly 700,000 results identified, users can sort findings by relevance or date and set the number of records displayed per page. Further granularity is available through primary subjects, subject areas, records and literature types, conference
and journal titles, publication years and ranges, country of publication, language, descriptors, and INIS volume. It stands as a robust and all-encompassing search engine offering numerous search facets. However, the user's specific requirement was a simple definition and potential clarification of nuclear information.
In contrast, ChatGPT swiftly analyzed the search prompt, providing a succinct definition of nuclear information accompanied by a descriptive overview of its usage, associations, likely related areas, and significance.
This stark contrast in output signifies the potential time saving for searchers aiming to grasp the initial definition and core elements of nuclear information. It's important to note that this comparison doesn't establish superiority but highlights a distinct approach.
Ultimately, users can determine which method best aligns with their needs. A blend of both interfaces might present the most advantageous and practical solution. Moving from a database-type Boolean search user interface to a ChatGPT interface offers several advantages.
Here are five major benefits of such a system, which provides replies instead of simply pointing to external information sources. First, natural language interaction. ChatGPT's conversational interface allows users to interact with the system using
natural language queries and receive responses in a conversational manner. This eliminates the need for users to formulate complex Boolean queries or understand the underlying database structure. It also makes the search process more intuitive and user-friendly, resembling
a conversation with an expert rather than navigating through a rigid search interface. Second is contextual understanding. ChatGPT's advanced language model enables it to understand the context of user queries and provide relevant and contextualized responses.
Instead of providing a list of potential sources for further search, ChatGPT can directly address user queries, offer explanations or provide specific information within the conversation. This contextual understanding enhances the user experience by reducing the
cognitive load associated with searching for and evaluating multiple sources. Third, personalized recommendations. ChatGPT can leverage its understanding of user preferences and previous search queries to offer personalized recommendations. By analyzing user behavior and feedback, the system can adapt and refine its
responses, ensuring that the information provided aligns with the user's needs and interests. This personalized approach enhances the user experience by tailoring the search results to individual requirements and increasing the relevance and usefulness of the information provided.
Fourth, improved efficiency and time saving. With the ChatGPT interface, users can quickly obtain relevant information without the need to browse through multiple sources or sift through lengthy search results.
The conversational interface allows users to directly ask questions and receive concise answers or summaries, saving time and effort. The system can also provide further context, related information or follow-up questions to refine the search and provide a more comprehensive understanding of the topic.
Fifth, collaboration and knowledge sharing. ChatGPT can facilitate collaboration among users by enabling features like document sharing, commenting and annotation. Users can engage in discussions, share insights and collectively contribute to the repository content.
In conclusion, it's evident that grey literature is a valuable information resource. It provides diverse perspectives, bridges existing information gaps, delivers timely and current information and substantiates evidence-based decision making.
ChatGPT presents an exceptional opportunity for leveraging grey literature. It boasts outstanding natural language processing capabilities, contextual understanding, human-like response generation, widespread use across diverse domains and immense potential to revolutionize information systems while enhancing user experiences.
The collaboration between grey literature and ChatGPT can yield excellent synergy. ChatGPT can enhance all phases and functions of information management,
offering web scraping for pertinent publications, automatic tagging and metadata creation. It augments user experiences with improved interfaces and valuable natural language dialogues. The future of grey literature remains a shared responsibility.
While acknowledging the need for further enhancements and developments in grey literature management, progress must be built upon the success already attained. Advocacy for the significance of grey literature needs amplification alongside the development of standards and guidance materials. The potency of grey literature hinges on collective cooperation, necessitating innovative leadership.
In summary, the future of grey literature is contingent upon our collective actions. Let us reflect on the words of Mahatma Gandhi, who wisely said,
"The future depends on what you do today." Thank you.