
Development of a graphical user interface to support the semi-automatic semantic segmentation of UAS-images


Formal Metadata

Title
Development of a graphical user interface to support the semi-automatic semantic segmentation of UAS-images
Title of Series
Number of Parts
351
Author
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language
Production Year
2022

Content Metadata

Subject Area
Genre
Abstract
Image semantic segmentation addresses the problem of properly separating and classifying the different regions of an image according to their specific meaning or use, e.g. their belonging to the same object. It is worth noticing that, in general, segmentation is an ill-posed problem: a unique solution cannot be provided, and different solutions are typically acceptable depending on the segmentation criterion that is applied. Nevertheless, regularization techniques are typically used to mitigate the issues related to ill-posedness, hence ensuring the computability of a unique solution. In the case of semantic segmentation, ill-posedness is also reduced by the specific data and object interpretation embedded in the semantic part of the data. Image semantic segmentation tools are useful in many applications, related both to the interpretation of the images themselves and to that of other entities associated with them. The latter is, for instance, the case of a point cloud whose objects and areas are also described by a set of images: a proper image semantic segmentation can be back-projected from the images to the point cloud, thus exploiting the process to properly segment the point cloud itself. Automatic image semantic segmentation is a quite challenging problem that is nowadays usually handled by means of artificial intelligence tools, such as deep-learning-based neural networks.
The availability of reliable image segmentation datasets plays a key role in the training phase of any artificial intelligence and machine learning tool based on image analysis: although such tools can currently be considered the state of the art in terms of recognition and segmentation ability, they require very large training datasets in order to ensure reliable segmentation results. The developed graphical user interface aims at supporting the semi-automatic semantic segmentation of images, hence easing and speeding up the generation of a ground-truth segmentation database. Such a database is of remarkable importance for properly training any machine- or deep-learning-based classification and segmentation method. Although the development of the proposed graphical user interface was originally motivated by the need to ease the production of a ground-truth segmentation and classification of plastic objects in maritime and fluvial environments, within a project aiming at reducing plastic pollution in rivers, the developed tool can actually be used in more general contexts. The interface supports, in particular, two types of operations: 1) segmenting and identifying objects in a single image; 2) exporting previously obtained results to new images, while also enabling the computation of certain related parameters (e.g. navigation-related ones, such as tracking the same object over different data frames). Different types of images are supported: standard RGB, multispectral images (already available as TIFF, Tagged Image File Format, files) and thermal ones. Concerning the semantic segmentation of a single image, several alternative segmentation options are supported, from manual to semi-automatic methods. First, manual segmentation of the objects is ensured by means of properly inserted polylines.
Then, intensity-based and graph-based methods are implemented as well. On the semi-automatic side, two tools are provided: a) a machine-learning-based method exploiting a few clicks by the user (implementing a rationale similar to that of Majumder et al., "Multi-Stage Fusion for One-Click Segmentation", 2020, i.e. aiming at minimizing the user input); b) a method exploiting the fact that, when images are periodically acquired by a UAS at a fairly high frequency, two successive frames are expected to be quite similar to each other. Consequently, the system aims at determining the camera motion between frames, and at using machine learning tools to properly extend and generalize the results obtained on the previous image to the new one. The latter method opens a wider scenario, where more information may come from the availability of consecutive frames. In particular, such additional information, determined by properly analyzing consecutive frames, can be used to assess and track the UAS movements while acquiring the video frames, and to increase the automation of the segmentation and classification of an object. Overall, the developed graphical user interface is expected to support the semi-automatic identification of objects, and to help determine the UAS and object movements as well. Although fully autonomous image semantic segmentation would clearly be of interest, its development is quite challenging; future investigations will nevertheless be dedicated to these aspects, in order to increase the automation level of the procedure. The tool will be freely available for download from the website of the GeCo (Geomatics and Conservation) laboratory of the University of Florence (Italy).
Transcript: English (auto-generated)
Thank you very much, Maria. During this presentation we are going to talk about an interface, actually a graphical interface, that we are developing in order to properly support the segmentation of images.

This work is motivated by the fact that, as you probably know, during the last decades we have had a lot of new instruments providing a huge amount of data. This is clearly a very good thing, because you can definitely obtain a much better knowledge about reality, try to have a good interpretation of events, and so on. However, it also leads to some issues related to the analysis of this data: we need some kind of automatic or semi-automatic tools that ease, in some way, our understanding and analysis of these images. You are probably already quite familiar with the idea of big data, and that is actually what we are talking about: we would like to find ways to ease our analysis of this big data. So here we also have a question: too big or not too big, that is the question.
Roughly speaking, a lot of our data are actually photos, images, or, if you prefer, rasters. The point is that, since we typically use artificial intelligence tools, machine learning and deep learning tools, to analyze these images, we typically need quite large datasets in order to properly train them. Unfortunately, generating the datasets used to train these tools is quite time-consuming: it typically requires a lot of work by a human operator and, despite being in some sense quite a standard way to do it, it is still not really nice to do, and typically quite annoying. So the idea here is to support the human operator in such a way that this labeling, which is still going to be quite an annoying procedure, becomes as fast as possible, and possibly also more accurate, if the tool can also enable some kind of semi-automatic analysis of the image itself.
We are quite lucky, because during the last decades a lot of tools have been developed for image analysis, and in general we would like to exploit the potential of the new methods as well. What we are going to talk about is trying to ease the generation of a ground-truth dataset, in order to properly support the learning part of these artificial-intelligence-based tools. So, just to summarize: we are trying to ease the generation of ground-truth data, that is, all of the segmentation and labeling part of the work. Since a lot of human interaction is usually required, what we would like to have is something that makes it faster, and the faster the better. We are going to make some assumptions.
First, we assume that we work with UAS images. This means that we typically have a lot of redundancy in our data, because we usually acquire, for example, an image every second or something similar. So you typically have a lot of overlap between these data, and hence a lot of redundancy, which we are going to exploit in what we show in the next slides. Furthermore, we also assume that we have some multispectral information, so we are going to deal with multispectral images. In particular, we make an implicit assumption: that the objects of our interest can in some way be distinguished from each other, and in particular from whatever is not of our interest, by means of their spectral signature. This is not so surprising, and is done quite in a standard way, but it is quite important with respect to what we are going to see in the next slides. This last assumption can also be summarized, if you like, as: everything you do is a signature of yourself, or, if you prefer, everything I do, especially during this presentation, I am going to do it for you. So, just to make things clear, let's consider the real motivation of this work: we collected a lot of images
for plastic object detection in a fluvial environment. Our original idea was to deal properly with this kind of images, so we started developing this interface to help ourselves produce the ground-truth data for properly training a machine learning tool for the detection of plastic objects in this kind of problem. As you can see in this slide, we used a multispectral camera, a MAIA camera in the Sentinel-2 configuration, so we used nine multispectral bands, and we mounted this camera on a Matrice 300; we collected some thousands of images of plastic objects on the Arno river, as in the example we are looking at in this slide. So the real motivation was first to help ourselves in doing this, and what we have started to do is develop
this graphical user interface, which we are developing in MATLAB. The first question in your mind right now is probably: MATLAB is definitely not a free tool. However, most universities and academic institutions do typically have a MATLAB license, so the idea is to provide a freely downloadable tool, with all of the source code, for a programming language which cannot really be open in any way right now, but which is still, in some sense, free for academic use. Obviously, we also do it this way because we are quite familiar with this kind of programming language, and that is the reason why we started doing it this way. In the next slides we are going to focus on the three kinds of operations that can typically be done with the GUI. The first operation is related to the visualization of the data,
the second one to the segmentation, and the third to the object classification; we are mostly going to focus on the second step, the segmentation. Concerning the visualization of the imagery, this is quite standard: we provide the possibility of performing quite standard operations such as, obviously, selecting the dataset, navigating the data, zooming, shifting, changing the visible channels, and so on. These are all quite standard operations, so I am not going to spend too much time on them. Concerning the segmentation, instead, we distinguish between editing an already existing object and
inserting information about a new object. In order to consider this, we should first make an assumption: we assume that the objects can be distinguished by properly defining a mask on each of our images. In particular, it is worth noticing that currently, according to our assumptions, if we have two overlapped objects, even if they are different from each other, we do not actually distinguish between the two of them. This means that we only have one binary mask for each image, and if we have two overlapping objects we just consider them as one object. This is clearly a limitation, and we are thinking about generalizing this a little more: generalizing means, in some sense, changing from a binary mask to an indexed mask with different numbers, depending on the object we are considering. For now, however, we are just making this assumption.
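To make the binary-versus-indexed mask distinction concrete, here is a small hedged sketch in Python/NumPy (purely illustrative; the actual tool is written in MATLAB, and the array shapes and object placements below are made up): a binary mask merges overlapping objects into a single region, while an indexed label mask keeps them separable.

```python
import numpy as np

# Binary mask: one boolean layer per image, so overlapping objects
# merge into a single indistinguishable blob
binary_mask = np.zeros((4, 6), dtype=bool)
binary_mask[1:3, 1:3] = True   # object A
binary_mask[1:3, 2:5] = True   # object B overlaps A -> one connected region

# Indexed (label) mask: one integer id per object, so overlaps stay separable
label_mask = np.zeros((4, 6), dtype=np.uint8)
label_mask[1:3, 1:3] = 1       # object A keeps id 1
label_mask[1:3, 2:5] = 2       # object B overwrites the shared pixels with id 2

n_objects_indexed = len(np.unique(label_mask)) - 1  # id 0 is the background
```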
We have developed quite a lot of tools to support the manual segmentation of the objects, and also some semi-automatic tools that we are going to talk about in a minute. As you can see, the list of tools is relatively long, not incredibly long, but relatively long. We are just going to focus on the three or four tools which are in some way more remarkable. Let's first start with the more obvious ones. For the freehand selection there is nothing really surprising: you can just select each object of your interest with a standard freehand tool, so I am not going to provide too many details about it, because it is pretty obvious. We also support pixel-by-pixel selection: in this case, just by mouse clicking, you can directly select or unselect the pixels of your interest. This can typically be thought of as an operation to slightly modify an object that you have already defined.
It is pretty clear that the two tools we have just talked about are quite obvious, so let's move on to something a bit more interesting. In particular, we developed a couple of tools for the semi-automatic area selection of the objects. In this case, what we would like to obtain is a tool where the operator just selects, with a left click of the mouse, any pixel of the object, and the tool then automatically finds the borders of the object. Here you can see an example of how it works: the user has just selected a point inside the object, and the tool automatically selected the borders of the object, in such a way to ease the segmentation labeling. This can work well, and we have developed a couple of different ways to do it. Before describing them, let's focus on a couple of assumptions. We assume that the user, as we have already said, just selects a point inside the object. The other assumption is that the user has also properly set, in some way, the size of a search window: we know that the object should have a certain maximum size around the selected point.
Also, as we have already said, we assume that the object can be distinguished from whatever is around it by means of its multispectral characteristics. We have implemented a couple of ways to do this. The first is based on the Otsu method for segmentation. It is pretty simple, and I guess most of you have already heard about it, so let's describe it in a minute. Let's focus on using just one channel: you may imagine that there are many ways to combine the information from different channels but, for the sake of simplicity, in this presentation let's just consider the case in which we use one channel. In this case, we compute a histogram of the intensities on the channel we are considering, inside the search window. What we would like to find is, in some way, an optimal threshold for these intensities, in order to distinguish between the two components we are expecting: one related to the object, and the other one supposed to be related to the background. You see here that we have a multimodal distribution, where one mode is related to the object and the other to the background, and we would like to define an optimal threshold to distinguish between the two of them. We search for the threshold that minimizes the intra-class variance within each of the two classes; since we are considering just two classes, this is equivalent, in some way, to maximizing the inter-class variance. The point is that, if you prefer, we would like to minimize the dispersion around the mean of each of the two distributions we are considering.
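As a hedged illustration of the Otsu criterion just described, here is a minimal Python/NumPy sketch (the real tool is written in MATLAB; the function name and the synthetic bimodal "search window" data are assumptions made for illustration). It exploits the equivalence mentioned above: scanning the threshold that maximizes the between-class variance is the same as minimizing the within-class variance.

```python
import numpy as np

def otsu_threshold(values, n_bins=256):
    """Return the threshold maximizing the between-class variance,
    equivalent to minimizing the intra-class variance for two classes."""
    hist, edges = np.histogram(values, bins=n_bins)
    p = hist.astype(float) / hist.sum()          # probability per bin
    centers = (edges[:-1] + edges[1:]) / 2.0

    best_t, best_sigma_b = centers[0], -1.0
    for k in range(1, n_bins):
        w0, w1 = p[:k].sum(), p[k:].sum()        # class weights
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (p[:k] * centers[:k]).sum() / w0   # class means
        mu1 = (p[k:] * centers[k:]).sum() / w1
        sigma_b = w0 * w1 * (mu0 - mu1) ** 2     # between-class variance
        if sigma_b > best_sigma_b:
            best_sigma_b, best_t = sigma_b, centers[k]
    return best_t

# Bimodal toy window: "background" around 50, "object" around 200
rng = np.random.default_rng(0)
window = np.concatenate([rng.normal(50, 10, 500), rng.normal(200, 10, 500)])
t = otsu_threshold(window)
mask = window > t   # pixels above the threshold belong to the object
```

On a well-separated bimodal histogram like this one, the threshold falls in the gap between the two modes, so the mask recovers roughly the 500 "object" samples.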
This is pretty simple, as you might have imagined, and it is just the first way we have used. The other one is instead based on graph-based segmentation: in this case, we have implemented the normalized cut. We consider a graph where each pixel inside the search window is a node of the graph. Let's look at the graph: this is clearly just a simple one, to give an idea of what is happening. For each edge between two nodes we define a weight, related to the similarity of the values of the nodes: what can be shown is that, in order to make this kind of computation work, the smaller the weight, the more different the two pixels, the two nodes, we are considering. What we would like to find is a cut: you see that this red line is actually cutting the graph into two parts, distinguishing, in the kind of problem we are considering, between the object of our interest and the rest of the search window; in some sense, let's refer to the rest as the background. A way to compute this kind of cut may be, for example, to minimize the weights of the edges we are cutting: let's sum the weights of all the edges crossed by this kind of line. A first, quite naive, way is to minimize this sum. This is quite reasonable but, unfortunately, it tends to favor the generation of quite small regions, which is not really our case, so we would like something not affected by this kind of issue. To make it fairer, that is, not depending on the area of the region we are segmenting, we consider the normalized cut, in which this kind of functional is minimized. The good thing about this procedure is that, as can be shown (you can find the details in this paper), the problem can be reposed as the computation of the eigenvalues and eigenvectors of this kind of equation. The most important thing we need to do is compute the eigenvectors; in particular, what we are really interested in is just the eigenvector associated to the second smallest eigenvalue.
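The normalized-cut step can be sketched in Python/NumPy as follows (again a toy illustration, not the MATLAB implementation: a tiny 1-D "image" stands in for the pixels of a search window, and the Gaussian similarity with its sigma value is an assumption). The relaxed problem is the generalized eigenproblem (D - W) y = λ D y, solved here through its symmetric equivalent.

```python
import numpy as np

# Tiny 1-D "image": two regions with distinct intensities
x = np.array([10., 11., 12., 90., 91., 92.])

# Weight matrix: similarity decays with intensity difference
# (smaller weight = more different pixels, as described in the talk)
sigma = 20.0
W = np.exp(-(x[:, None] - x[None, :]) ** 2 / sigma ** 2)
np.fill_diagonal(W, 0.0)

D = np.diag(W.sum(axis=1))
# Normalized cut relaxes to (D - W) y = lambda * D y; solve the
# symmetric form D^{-1/2} (D - W) D^{-1/2} z = lambda z, y = D^{-1/2} z
d_isqrt = np.diag(1.0 / np.sqrt(np.diag(D)))
L_sym = d_isqrt @ (D - W) @ d_isqrt
eigvals, eigvecs = np.linalg.eigh(L_sym)  # eigenvalues in ascending order

# The eigenvector of the second smallest eigenvalue gives the partition
fiedler = d_isqrt @ eigvecs[:, 1]
labels = fiedler > 0  # the sign splits the graph into the two sides of the cut
```

Thresholding the eigenvector associated to the second smallest eigenvalue (here by its sign) yields the two sides of the cut, which is exactly the quantity the talk says the tool computes.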
In addition to what I told you, we have also developed another tool, which tries to exploit the results from previous images. Since we are acquiring images with a UAV, or UAS, successive images are quite similar to each other, and we would like, in some way, to exploit this information. What we do in this case is assume that the camera is oriented in an ideal orientation, and that some navigation data are available; from these two kinds of information we can define a proper search window to find a previously found object in the next images.
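A hedged sketch of this idea follows (a hypothetical helper in Python/NumPy, not the tool's MATLAB code; it assumes the camera motion between frames has already been reduced, from the navigation data and the ideal orientation, to a plain pixel offset, which is a simplification of what the talk describes).

```python
import numpy as np

def predict_search_window(prev_mask, shift_rc, margin=5):
    """Shift the previous frame's object mask by the camera motion
    (here given as a (row, col) pixel offset) and return an enlarged,
    image-clamped bounding box to search in the new frame."""
    rows, cols = np.nonzero(prev_mask)
    r0, r1 = rows.min() + shift_rc[0], rows.max() + shift_rc[0]
    c0, c1 = cols.min() + shift_rc[1], cols.max() + shift_rc[1]
    h, w = prev_mask.shape
    return (max(r0 - margin, 0), min(r1 + margin, h - 1),
            max(c0 - margin, 0), min(c1 + margin, w - 1))

prev_mask = np.zeros((100, 100), dtype=bool)
prev_mask[40:50, 40:50] = True               # object segmented in frame k
window = predict_search_window(prev_mask, shift_rc=(3, -2))
```

Inside the predicted window, candidate regions can then be compared against the previous object, e.g. with the feature descriptors mentioned below in the talk.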
It is not such a difficult thing. You can see here that the objects are moving: you are directly looking at the binary masks, and you see that we have defined search windows, and so on. You may also imagine developing different tools to distinguish which of the areas inside the search window correspond to the real object; in our case, we implemented a comparison based on feature descriptors. Here you can see what we found as the best candidate, the medium point, actually the centroid, of the new plastic positions in the new image; you are now directly looking at some of the bands of a real image acquired by our drone. Concerning the classification, instead, it is something we are still working on. For now
it is mostly based on the manual selection of the objects, and the manual selection can also be propagated in the way you have just seen, by means of this kind of tool. For now it is mostly manual, and we are going to work a little bit more on this step in our future developments. So, just to conclude: you have just seen the kind of tool that we have developed to support the segmentation of images, and it is clearly something we are still developing. In the foreseen future work, you can see that there are still a couple of steps that we are planning to do, hopefully quite soon. This is all, thank you.