
Dissemination of information in event-based surveillance, a case study of Avian Influenza


Formal Metadata

Title
Dissemination of information in event-based surveillance, a case study of Avian Influenza
Title of Series
Number of Parts
45
Author
License
CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language
Producer
Production Year
2023
Production Place
Wageningen

Content Metadata

Subject Area
Genre
Abstract
Event-Based Surveillance (EBS) tools monitor online news reports and other unofficial sources, with the primary aim of providing timely information to users from health agencies on disease outbreaks occurring worldwide. In this presentation, Sarah Valentin presents the results of a recently published paper that focuses on how outbreak-related information disseminates from a primary source to a definitive aggregator, an EBS tool. She and her team analysed news items reporting avian influenza outbreaks in birds worldwide between July 2018 and June 2019 that were detected by PADI-web and HealthMap. They used the sources cited in the news to trace the path of each outbreak. They built a directed network with nodes representing the sources (characterised by type, specialisation, and geographical focus) and edges representing the flow of information. They calculated the degree as a centrality measure to determine the importance of the nodes in information dissemination. They analysed the role of the sources in early detection (detection of an event before its official notification to the World Organisation for Animal Health (WOAH)) and in late detection. A total of 23% and 43% of the avian influenza outbreaks detected by PADI-web and HealthMap, respectively, were shared before their official notification. For both tools, national and local veterinary authorities were the primary sources of early detection. The early detection component mainly relied on the dissemination of nationally acknowledged events by online news and press agencies, bypassing international reporting to the WOAH. WOAH was the major secondary source for late detection, occupying a central position between national authorities and disseminator sources, such as online news.
Keywords
Computer animation
Transcript: English (auto-generated)
So good afternoon, everyone. I'm really happy to share with you today the results of work that I conducted with my colleagues from TETIS and ASTRE, which is about the dissemination of information
in event-based surveillance, and which was recently published in a PLOS One article. I will say a brief word about EBS, although I guess most of you are familiar with this
concept. The objective is to complement formalized, traditional disease surveillance, which is based on notification to official authorities.
And the focus is on what we call informal sources, which include a wide range of sources such as social media, press, and so on. Here, we will focus only on online news published on the web.
And one of the big challenges is that we deal with unverified, non-validated data. That's why this surveillance is sometimes called non-official, informal, and so on. The rationale behind it is that informal sources are able to detect epidemiological
information in a timely manner compared to official notification. A striking example here is the detection of the first introduction of African swine fever in Europe, which was first reported in an online news article talking about a suspicion.
This online news was then detected and disseminated by an expert-based surveillance system called ProMED, and the official notification was released 11 days later. There have been a lot of studies on EBS that focused on evaluating their timeliness,
that is, how early or late detection is compared to official notification, or on evaluating their sensitivity. But there has been little work on where the information comes from before being detected
by the event-based surveillance system. That is to say, what is the content of the online news that is detected? And that's where our study comes in. We wanted to determine how outbreak-related information, that is, information
about outbreaks, disseminated between sources before being detected by an EBS, and what the role was of the sources involved in the detection and dissemination of the information.
In other words, to have a more graphical view of the problem, we wanted to reconstruct this network. On the far right, you have what we call the final sources: our EBS tools, which finally detect the information. On the left, we hypothesized that there were some primary sources that first
emitted information about the outbreaks. And between the two, a range of intermediate sources that detect and disseminate the information until it is finally collected by the EBS.
So here, the nodes represent the sources, and the links between them represent the propagation of event information. I will use the term event to refer to a detected outbreak. To build this network, we decided to select a case study, which was avian influenza
during a one-year period. And we retrieved the reports collected by both HealthMap, a semi-automated EBS system, and PADI-web, which is a fully automated one.
We first curated the reports to select only the relevant ones. We defined as relevant the reports containing either a declaration or a suspicion of an outbreak, and we only focused on animal cases. So we ignored zoonotic events in this study.
Then, what did we do? We started the long step of manual curation, and I will try to illustrate it with an example. At the top here, you have an example of an online news article that is reporting an avian
flu outbreak in the Galapagos Islands. It's a very recent example. The outbreak is reported by an online news outlet called VOA News. And in this article, the primary source that is cited is the Galapagos National Park.
Okay, so we used a backward approach to reconstruct the signal. First the EBS, then the online news it detected, and then all the sources cited in the article.
And then we reversed the network, because the event propagated from the source, on the right, to the left. We did this for all the reports we had collected. We also collected information about the sources that we detected.
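The backward reconstruction and the reversal step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (`ebs`, `news`, `cited`), the placeholder name `"EBS tool"` and the function names are hypothetical and do not come from the study; only the VOA News and Galapagos National Park example is taken from the talk.

```python
from collections import defaultdict

# Hypothetical curated records (field names are assumptions): the EBS tool that
# collected the report, the outlet that published it, and the sources it cites.
reports = [
    {"ebs": "EBS tool", "news": "VOA News", "cited": ["Galapagos National Park"]},
]

def backward_edges(report):
    """Trace one report backwards: EBS -> news outlet -> each cited source."""
    edges = [(report["ebs"], report["news"])]
    edges += [(report["news"], cited) for cited in report["cited"]]
    return edges

def build_dissemination_network(reports):
    """Reverse the backward edges so that they follow the flow of information."""
    network = defaultdict(set)  # maps a source to the sources it informs
    for report in reports:
        for downstream, upstream in backward_edges(report):
            network[upstream].add(downstream)
    return network

network = build_dissemination_network(reports)
for source, receivers in network.items():
    print(source, "->", sorted(receivers))
```

Storing the receivers in a set means that repeated reports along the same link are merged into a single edge, which is one possible design choice among others.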
I will say a few words about that source information later. The third step was to link all the events present in the articles to the OIE (WOAH in English, that's easier) database, which was our gold standard.
So there were two cases. Either we could link the event from the article with an official one, and in that case we call them official events. Or, in some cases, we couldn't make this match, either because it was a false positive or because it was an outbreak that was never declared.
And that part, we cannot evaluate without ground truth. And so here is a list of all the different types of sources that we found in the articles
and news contents. They include official ones like the OIE or national vet authorities, ministries of health and so on, and more truly informal sources such as social media. Sometimes radio was cited and so on.
And of course, online news and press agencies. Regarding the events that the network detected, the large majority of them, nearly
70%, were official ones, so we were able to match them with the EMPRES-i database. The remaining ones were the non-official events. And these proportions were the same for PADI-web and HealthMap.
But we identified a big difference between PADI-web and HealthMap in terms of the percentage of early detected events, that is to say, events detected before the OIE notification, which was much higher in HealthMap. We attribute that to the semi-automated process in HealthMap, which leads to more
fine-grained filtering and removes part of the events that are detected late by the EBS. Regarding the network, we observed that most of the paths followed by the information
were very short, composed of two or three edges, which means one or two intermediate sources between the primary one and the EBS. And this was reflected in the reactivity of the network.
That is to say that in most cases, the information was emitted and detected by the EBS in one day or less. So this is a view of the network we obtained.
No need to go into detail here. What is interesting to see is that every small figure corresponds to a source, and the orange one is the EBS. And we can notice that, first of all, the online news sources, which are represented in green,
are the most numerous sources in our network. We also see a difference between the two EBS. For example, HealthMap includes a social platform, because it also uses Twitter, which PADI-web doesn't use.
It's visible here in purple. What we can also see, if you look at the length of the segments, is that one segment is one source and the length of the segment reflects its capacity to both collect and disseminate.
So the longer the segment, the more events the source has received and communicated. And we can see that it's highly heterogeneous between the sources. Some of them only receive from one source and communicate to one source, while
some others are truly good disseminators. To go into more detail on that idea, we looked at the top five sources in terms of collection, diffusion, and both. In network metrics, this corresponds to the in-degree, out-degree, and all-degree.
So collection, the in-degree, is the capacity of a source to retrieve information from many other sources; diffusion, the out-degree, is the capacity to communicate to many sources; and at the bottom, the all-degree is the ability to do both.
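As a rough illustration of the three metrics, the sketch below computes in-degree, out-degree and all-degree from a list of directed edges. The edge list and node names are made up for the example, and counting each distinct link once is an assumption rather than the study's documented choice.

```python
from collections import Counter

# Toy directed edges (source -> receiver); names are illustrative only.
edges = [
    ("National vet authority", "WOAH"),
    ("WOAH", "Online news A"),
    ("National vet authority", "Online news A"),
    ("Online news A", "EBS"),
]

def degree_table(edges):
    """Return in-degree (collection), out-degree (diffusion) and their sum per node."""
    in_degree, out_degree = Counter(), Counter()
    for source, receiver in set(edges):  # count each distinct link once
        out_degree[source] += 1
        in_degree[receiver] += 1
    nodes = set(in_degree) | set(out_degree)
    return {
        node: {
            "in": in_degree[node],
            "out": out_degree[node],
            "all": in_degree[node] + out_degree[node],
        }
        for node in nodes
    }

table = degree_table(edges)
# Rank sources by all-degree, highest first.
for node in sorted(table, key=lambda n: table[n]["all"], reverse=True):
    print(node, table[node])
```

In this toy network, "Online news A" collects from two sources and diffuses to one, so it has the highest all-degree.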
So what we can see first is that the WOAH is the top source in our network, which is consistent with the fact that it collects information from all the national vet authorities, and it's also highly cited when online news communicate about official
events. And then we have other types of sources, such as press agencies, here Xinhua and Reuters. We also have specialized websites, such as The Poultry Site, and national vet authorities.
I will come back to it later for the diffusion. And what was really interesting, and there are more details in the article, is that these top five or top six were really outliers. There were very few sources that had high in-degree, high out-degree, and high all-degree.
And for collection and diffusion, we can consider them as hubs, which are essential in the network because they are at the crossroads between a lot of primary sources and the secondary and EBS sources.
Then we looked at the role of the different kinds of sources, depending on whether late events or early events were detected. So this is the detection of late events.
We call an event late when the EBS detected it either on the day of notification or later than the OIE. For us, this is information that is not relevant, because we already have it.
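The late-event rule just stated, together with the definition of early detection as detection before official notification, can be written as a small date comparison. This is a minimal sketch: the function name, the label strings and the treatment of unmatched events as "non-official" are assumptions made for illustration, and the dates are invented.

```python
from datetime import date

def classify_detection(detected_on, notified_on):
    """Label an event by comparing EBS detection with official notification.

    notified_on is None when the event could not be matched to an official one.
    """
    if notified_on is None:
        return "non-official"
    # Same-day detection counts as late, following the rule stated in the talk.
    return "early" if detected_on < notified_on else "late"

# Invented dates, for illustration only.
print(classify_detection(date(2019, 1, 10), date(2019, 1, 21)))
print(classify_detection(date(2019, 1, 21), date(2019, 1, 21)))
print(classify_detection(date(2019, 1, 21), None))
```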
So the primary sources are mostly national vet authorities and, to a lesser extent, local ones. And logically, the information is diffused by the OIE and also by online news and press agency sources.
What was more interesting for us was to look at the same typology, but for the early detected events, so before the notification. Here we saw that the local authorities were much more involved as primary sources.
This means that even if they were early detected events, they were acknowledged at local level or at national level, which means that they were likely to be true positive events. And in that case, for early detected events, the online news and press agencies were the
most important sources in terms of diffusion. It's quite the same result in a different view, and here we separated the HealthMap and
PADI-web networks. And what was interesting to see here is the difference in the types of sources used by the two tools, and more specifically that PADI-web involves more local sources than HealthMap.
So, to summarize the major results of our study and their implications for further improvement of EBS systems. We saw that even though we use informal sources, they mostly disseminate validated
information, either validated at international level, when they communicate on already known events, or validated at national or local scale. So the risk here is redundancy and overwhelming the monitoring capacity
of EBS. And so we advise trying to find methods to automatically recognize when a report talks about an already known outbreak, for example by detecting the reference to the
cited source. In most cases, the OIE is cited in the text. So this would make it possible to filter out this type of report. And then we showed that local and national sources were the sources most
involved in early detection, which is truly the objective of EBS. So for that, it would increase coverage if EBS implemented as many languages
as possible, to target even more local sources. It can also be done by targeting national authorities through national vet authority websites. We also saw that there was really heterogeneous behavior in terms of collection
and dissemination, meaning that it could be very efficient not only to monitor keywords, that is, to search for reports based on specific keywords, but also to focus on specific sources that are known to be good collectors.
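The earlier recommendation, recognizing reports about already known outbreaks by detecting a reference to the official source, could be approximated with a simple text filter. The sketch below is only one possible approach and not the method used by PADI-web or HealthMap: the pattern, the function name and the sample sentences are assumptions, and a real system would need more than a keyword match.

```python
import re

# Acronyms are matched case-sensitively; the full name is matched case-insensitively
# and accepts both the "Organisation" and "Organization" spellings.
OFFICIAL_SOURCE_PATTERN = re.compile(
    r"\b(?:OIE|WOAH|(?i:World Organi[sz]ation for Animal Health))\b"
)

def cites_official_source(text):
    """Return True if the report text mentions the official notification body."""
    return OFFICIAL_SOURCE_PATTERN.search(text) is not None

# Invented report snippets, for illustration only.
samples = [
    "The outbreak was reported to the OIE on Monday.",
    "Farmers reported dead ducks near the lake.",
]
for text in samples:
    label = "likely already notified" if cites_official_source(text) else "keep for review"
    print(label, "|", text)
```

A report flagged this way is only a candidate for filtering; as noted in the talk, human moderation would still be needed to confirm it.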
We also saw during our data curation that both systems collected some irrelevant reports, posing the risk of false positive alerts. So in this process, human moderation is still required.
And we also observed that both tools were complementary and didn't use the same range of sources. So yes, they have to be used in a complementary way. And I will finish with some limitations and biases in this study.
We were limited by the time needed for manual curation to go through all these reports. That's why we focused on a one-year period, which could be extended.
Another bias is that we focused on a very well-known disease, influenza, which is well documented, with high economic and public health impacts. So it could also be interesting to study the behavior of the network with lesser-known
diseases. Also, we focused on English reports here. This didn't keep us from having access to local sources. But of course, we know that we missed all the information that was not translated into
English, from sources that may not have English websites and so on. So this is a great improvement that we could make. And the last important thing to keep in mind is that the results on individual sources' performance here are not necessarily generic.
They are very specific to the disease studied. We saw that, for example, the Bulgarian vet authority was a highly cited source. This is related to the fact that during the study period, there was the introduction
of a new subtype of highly pathogenic avian influenza, leading to high coverage in the [...]. But this is very specific to our data. Thank you very much to all the people who helped in this work.
It was also conducted in the context of the master's internship of Baja, who is not in the picture, and Claire, who also helped with the internship. So yes, thank you.
And I hope we have time for questions.