
The Complexity of Dynamic Data Race Prediction


Formal Metadata

Title
The Complexity of Dynamic Data Race Prediction
Subtitle
Q/A Session E - Paper E6.E
Number of Parts
56
License
CC Attribution 3.0 Germany:
You may use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose, as long as the work is attributed to the author in the manner specified by the author or rights holder.

Transcript: English (automatically generated)
Hi, I am Umang Mathur and I am going to be talking about the complexity of dynamic data race prediction. This is joint work with Andreas Pavlogiannis and Mahesh Viswanathan.
Concurrency is a ubiquitous programming paradigm and most software applications are concurrent by design. But at the same time, concurrent programs are very challenging to develop and that is because when developers have to reason about the correctness of their programs, they have to look at exponentially many interleavings. And as a result, concurrency
bugs manifest themselves in production-level software despite rigorous in-house testing. Data races happen to be one of the most common concurrency issues, and I am going to be talking about data races in the rest of the talk. Broadly, race detection techniques
are either static analysis techniques or dynamic analysis techniques. Static analyzers look at the entire source code and try to figure out the presence of data races. This is an undecidable problem and most static analyzers report a lot of false alarms. Dynamic analysis techniques on the other hand, look at executions of programs and try
to infer the presence of data races based on these executions. And typically, dynamic analyzers are sound, they do not report false alarms and that is one of the reasons why they are more widely adopted than static analyzers. So, when one talks about dynamic
analysis techniques, one looks at executions. And executions of concurrent programs are basically sequences of events where every event tells you what is the thread that performed that event and what is the operation that was performed in that event. So, an operation is either a read or a write from a memory location or it is an acquire or release
of a lock. So, if you look at the execution on the right, it has six events performed by either thread T1 or thread T2. The first event is an acquire of lock L performed by T1 and the fifth event is the matching release of the same lock again performed by T1. Similarly, the third event is the write to memory location Y and the fourth
event is a read from a memory location X. Executions are assumed to be well formed, that is, at any point in the execution, if you look at a given lock L, it is held by at most one thread. Let's now define data races. An execution has a data race if there is a pair of conflicting events in the execution that are next to each other. By
conflicting, I mean two events that access the same memory location are performed by different threads and at least one of them is a write. So, if you look at the execution on the right, it has a data race because it has two conflicting events that are next to each other. The problem of dynamic data race detection asks if an execution
has a data race and as you can imagine, it has a very trivial algorithm that linearly scans a given execution. But this way of detecting data races is not very meaningful because executions of concurrent programs are very sensitive to thread scheduling and this way of detecting data races will miss a lot of races. So, let's look at this example. This is a very similar trace, execution
trace as the previous one, but this does not have a data race. Nevertheless, it can be reordered to expose the data race: if you reorder the read of x above the release of lock L in thread T1, you get the previous execution, which has a data race. So,
reorderings can expose data races, and this is the question that data race prediction asks: given an execution sigma, is there a reordering rho of sigma such that rho has a data race? Before I proceed, I must warn you that not all reorderings make sense and only some reorderings are allowed. And the fundamental principle
behind defining the set of allowable reorderings is the following. Any program that generates the observed execution must also be capable of generating the reordering and this principle ensures that a reasoning behind reorderings can be made agnostic of the source code. So, with this principle, let's state some of the properties that we want our reorderings to satisfy. First of all,
we want them to be well formed. That is, we don't want critical sections on the same lock to overlap. So, if you look at this execution, I cannot reorder an acquire of L between the write of x and the release of L in T1, because that would violate the semantics of locks, but I can reorder the entire critical section
above the critical section of T1. Similarly, reorderings are assumed to preserve the intra-thread ordering, and therefore this kind of reordering is not allowed. Lastly, we want these reorderings to preserve the control flow of the original execution, and we ensure this by requiring that every read event sees the same
last write as in the original execution. So here, reordering the read of x above the write of x is not allowed, because the read of x would not see the same write. This notion is called correct reordering and, in fact, dates back to the days
of Herlihy and Wing in the linearizability paper of 1990. So, just to recap, the problem of data race prediction is to check: given an execution sigma, is there a correct reordering of sigma in which two conflicting events are consecutive? Now, before I move to the contributions of this paper, I would like to mention the historical
context of our work. Data race prediction has been an actively studied area for over five decades now. Much of the developments in the initial three decades were focused on simple techniques like the use of happens before partial order and much of the focus was in developing scalable techniques that work well in practice. These techniques were not all
powerful and would often miss a lot of data races. Now, in the next decade or so, people started looking at ways to improve data race prediction by making sure that their techniques report more races than before and most of these were based on either exhaustively
enumerating the space of reorderings or symbolic techniques that made calls to SAT solvers. Now, as one can imagine, these techniques were slow and, when implemented in practice, would perform even worse than happens-before race prediction. In 2012, Smaragdakis et al. came up with a new partial order that detects more races than
happens-before and provided a good balance of predictive power and performance. And then, from 2017 onwards, this area started getting more attention and many algorithms for predicting data races have been proposed since then. Each of these algorithms runs in polynomial, often linear, time and detects more races than the previous one. In the last three years or so,
there have been more than 30 papers on this topic. So, if you look at this initial phase, the algorithms developed would run in polynomial time but would be almost always incomplete. They would miss a lot of races. And then, in the middle era, the algorithms were very powerful
but their runtime was very poor. And in the last few years, the techniques proposed almost always run in polynomial time. And they also, in practice, do not miss any data race. So, looking at this recent trend, one interesting question we wanted to answer is,
is there an algorithm that runs in polynomial time and still, provably, never misses any data race? In other words, we wanted to understand the complexity of the data race prediction problem. Let us first discuss some obvious complexity bounds for this problem. So, the specific problem we will look at is the following.
You are given as input an execution trace sigma and two events e1 and e2. The execution has n events, k threads, and d memory locations and locks. And the output is yes if and only if there is a correct reordering of sigma in which e1 and e2 are consecutive.
So, easily, this problem is in NP, because one can guess a new sequence and check in polynomial time if this new sequence is a correct reordering in which e1 and e2 are consecutive. Enumeration-based techniques typically solve this problem in k to the n time, and SAT-solving-based techniques solve it in 2 to the polynomial of n time.
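The polynomial-time check behind this NP membership can be sketched in code. Below is a toy Python verifier; the tuple encoding of events (`(eid, tid, op, obj)`) and the simplified prefix-based thread-order check are my own illustrative assumptions, not the paper's formal definitions.

```python
# Toy model (an assumption for illustration): an event is a tuple
# (eid, tid, op, obj), where op is "acq", "rel", "r" or "w", and obj
# is a lock name or a memory location.

def thread_proj(trace, tid):
    return [e for e in trace if e[1] == tid]

def last_writer(trace, eid):
    """eid of the last write to the read's location before it, or None."""
    loc = next(e[3] for e in trace if e[0] == eid)
    w = None
    for e in trace:
        if e[0] == eid:
            return w
        if e[2] == "w" and e[3] == loc:
            w = e[0]
    return None

def is_correct_reordering(sigma, rho, e1, e2):
    """Check that rho is a correct reordering of sigma in which the
    events e1 and e2 (given by their eids) are consecutive."""
    ids = [e[0] for e in rho]
    if e1 not in ids or e2 not in ids:
        return False
    # (1) per-thread projections of rho are prefixes of those of sigma
    for tid in {e[1] for e in sigma}:
        ps, pr = thread_proj(sigma, tid), thread_proj(rho, tid)
        if pr != ps[:len(pr)]:
            return False
    # (2) lock semantics: every lock is held by at most one thread
    held = {}
    for eid, tid, op, obj in rho:
        if op == "acq":
            if obj in held:
                return False
            held[obj] = tid
        elif op == "rel":
            if held.get(obj) != tid:
                return False
            del held[obj]
    # (3) every read sees the same last write as in sigma
    for e in rho:
        if e[2] == "r" and last_writer(rho, e[0]) != last_writer(sigma, e[0]):
            return False
    # (4) e1 and e2 are next to each other in rho
    return abs(ids.index(e1) - ids.index(e2)) == 1
```

Note that rho may drop a suffix of each thread, as correct reorderings over a subset of events do, and the reads-from check is what enforces that control flow is preserved.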
Let us look at some lower bounds for this problem. Well, to the best of our knowledge, there were no lower bounds for this specific problem. So, we would like to talk about closely related work before we proceed to the contributions.
Often, this problem is said to be NP-hard and this is attributed to Netzer and Miller. But I'd like to point out that the specific problem that Netzer and Miller consider is slightly different. The notion of correct reordering is, in fact, very restrictive. And the hardness comes from very powerful synchronization primitives that are used in
executions. Gibbons and Korach in 1997 also looked at a slightly different problem called Verifying Sequential Consistency and the specific problem is called VSC-RM or VSC-Read Mapping. But then again, the results do not apply to our case because that is a different problem and
I'm going to talk about that problem later in this talk. So, let's look at the contributions of this paper. In this paper, we perform an extensive study of complexity theoretic questions in data race prediction. So, in the general case, we show that there is an algorithm that
runs in time k times n to the 2k, and when the number of threads k is constant, this problem is polynomial time. We also further show that, in fact, this problem is W[1]-hard in the parameter k, and that also means that this problem is NP-hard when the number of threads is not constant
and, in fact, there's no FPT algorithm in the parameter k. And then we consider special cases based on practical instances of this problem and we show that when we restrict the space of input traces in some fashion, there is a quadratic algorithm for this problem and there is also a
lower bound for the same problem. We then look at the problem when we restrict the space of data races that we want to report and we show that in that case, there is a linear time algorithm. So, the rough outline of the rest of the talk is as follows. I'm going to first
talk about some notations that will be useful for the results and then I'm going to give the algorithm for data race prediction in the general case as well as in two specific cases and then I'm going to move to lower bounds and then I'm going to conclude with summary and some future work. So, if you have an execution sigma, the intra-thread ordering of the execution is
denoted by TO. The RF mapping maps every read event to its last write event on the same memory location and every release to its matching acquire. And TRF is a partial order that combines
TO and also respects RF. Now, one can make a very simple observation about these orders. If you have a sequence rho on a subset of events of sigma, then rho is a correct reordering of sigma if it satisfies three properties. Number one, for every lock L, there is at most one
unmatched acquire of L in rho. Number two, rho respects the TRF ordering of sigma, and number three, rho also respects the reads-from mapping of sigma. Let's look at what feasible trace ideals are. A feasible trace ideal I of an execution sigma is a set of events such that it is downward
closed with respect to the TRF order, and for every lock L, there is at most one unmatched acquire of L in I. So, if you look at this execution, the TRF ordering is shown in red, and the blue set of events is downward closed with respect to TRF and has at most one
unmatched acquire on every lock. Now, the canonical RF poset of an ideal I is the smallest partial order that respects TRF and also orders every release before every unmatched acquire on the same lock. The realizability problem for trace ideals is the following. If you are given a
trace ideal I of an execution sigma, you have to check if there is a linearization sigma star of this canonical RF poset P_I such that the linearization respects the
reads-from mapping. Now, data race prediction can be posed in terms of the realizability problem of trace ideals as follows. A pair of conflicting events e1, e2 is a predictable data race of an execution sigma if and only if there exists a feasible trace ideal I of sigma such that I is realizable and both e1 and e2 are enabled in I.
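Both defining conditions of a feasible trace ideal can be checked directly in code. Below is a brief Python sketch, again assuming a toy tuple encoding of events (`(eid, tid, op, obj)`), with the ideal given as a set of indices into the trace; this is an illustration, not the paper's algorithm.

```python
def trf_predecessors(sigma, idx):
    """Indices of the immediate TRF predecessors of sigma[idx]: the
    program-order predecessor, the write a read reads from, and the
    matching acquire of a release."""
    eid, tid, op, obj = sigma[idx]
    preds = []
    for j in range(idx - 1, -1, -1):          # program-order predecessor
        if sigma[j][1] == tid:
            preds.append(j)
            break
    if op == "r":                             # last write to the same location
        for j in range(idx - 1, -1, -1):
            if sigma[j][2] == "w" and sigma[j][3] == obj:
                preds.append(j)
                break
    if op == "rel":                           # matching acquire
        for j in range(idx - 1, -1, -1):
            if sigma[j][2] == "acq" and sigma[j][3] == obj and sigma[j][1] == tid:
                preds.append(j)
                break
    return preds

def is_feasible_ideal(sigma, ideal):
    """ideal: a set of indices into sigma."""
    # condition 1: downward closed with respect to TRF
    for i in ideal:
        if any(p not in ideal for p in trf_predecessors(sigma, i)):
            return False
    # condition 2: at most one unmatched acquire per lock
    unmatched = {}
    for i in ideal:
        eid, tid, op, obj = sigma[i]
        if op == "acq":
            unmatched[obj] = unmatched.get(obj, 0) + 1
        elif op == "rel":
            unmatched[obj] = unmatched.get(obj, 0) - 1
    return all(c <= 1 for c in unmatched.values())
```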
Let's look at the same execution trace as before. Here, e1 and e2 form a predictable race, because there is an ideal, namely the set of events in blue, which is realizable.
This in fact gives us a simple algorithm for data race prediction. We simply enumerate the feasible ideals of the input execution sigma in which both e1 and e2 are enabled, and we check if any of these ideals is realizable. If so, then e1 and e2 form a predictable data race; otherwise not.
So, if the number of feasible ideals is bounded by alpha and the time taken to check the realizability of each such ideal is bounded by beta, then we have an algorithm for race prediction that runs in time alpha times beta. In our paper, we showed that the number
of ideals is at most n to the k, and the time taken to check realizability can be bounded by k times n to the k, and this gives us our first result: data race prediction can be solved in time n to the 2k. In particular, the exponent depends only on k, and the problem
can be solved in polynomial time when the number of threads is constant. So, next we look at special case for data race prediction when executions have a special topology. So, the topology of an execution is an undirected graph whose nodes are the threads in the execution and two threads are connected with an edge if they perform
conflicting accesses or acquire a common lock. So, for the example on the right, the topology has an edge between t1 and t2 because of the common lock L. Similarly, there is an edge between t2 and t4 because of conflicting accesses but there is no edge between t4 and t3 because
they don't perform conflicting accesses. Recall that read x and read x are not conflicting. Our next result is about executions whose topology is acyclic. So, this kind of topology is commonly found in many scenarios including server-client architectures, divide-and-conquer parallelism or even the special case of two threads.
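As a concrete illustration of this construction, here is a Python sketch that builds the topology graph and tests it for acyclicity, assuming the same kind of toy tuple encoding of events (`(eid, tid, op, obj)`); this is not the paper's implementation.

```python
from itertools import combinations

def conflicting(e, f):
    """Same memory location, different threads, at least one write."""
    return (e[2] in ("r", "w") and f[2] in ("r", "w") and
            e[3] == f[3] and e[1] != f[1] and "w" in (e[2], f[2]))

def topology_edges(sigma):
    """Undirected edges between threads that perform conflicting
    accesses or acquire a common lock."""
    edges = set()
    for e, f in combinations(sigma, 2):
        common_lock = (e[2] == f[2] == "acq" and e[3] == f[3] and e[1] != f[1])
        if conflicting(e, f) or common_lock:
            edges.add(frozenset((e[1], f[1])))
    return edges

def is_acyclic(threads, edges):
    """Union-find cycle check: the topology is acyclic iff no edge
    joins two threads already in the same connected component."""
    parent = {t: t for t in threads}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for a, b in (tuple(e) for e in edges):
        ra, rb = find(a), find(b)
        if ra == rb:
            return False
        parent[ra] = rb
    return True
```

For a server-client trace, every client thread shares edges only with the server thread, so the topology is a star and the acyclicity check succeeds.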
For this case, we show that the ideal realizability problem can in fact be solved in time quadratic in the length of the execution. Further, we show that in this case, it also suffices to check the realizability of a single well-defined ideal.
And thus, the overall complexity of data race prediction for the case of acyclic topologies is n squared log n. We next consider the problem of distance-bounded data race prediction.
Given an execution sigma and its correct reordering rho, the set of reversals between rho and sigma is the set of pairs of conflicting writes, or of acquires on the same lock, whose order is different in sigma and rho. The distance between sigma and rho is the
number of such reversals. So, as an example, the two executions on the left are correct reorderings of each other and there are two reversals between them, and thus their distance is two. Now, we can define the distance-bounded race prediction problem as follows. It is a promise problem. We first fix a constant L, and the input to this problem
is as before in execution sigma and two conflicting events e1 and e2. Now, if e1 and e2 do not constitute a predictable data race, then we are required to say no. If e1 and e2 are indeed a predictable data race and if the correct reordering that
witnesses this data race is at distance at most L from the original execution sigma, then we are required to say yes. Finally, if the witness trace is farther away from sigma, then we are allowed to say either yes or no. So, we can define the distance-bounded
realizability problem for ideals analogously as a promise problem. Now, a nice solution to the distance-bounded realizability problem would be to enumerate all possible sets of reversals of size at most L and this would
give an algorithm that runs in time n to the 2L. But in our paper, we present a better algorithm than this. We show that the realizability problem can be solved in time k to the L times n, which happens to be linear when k is a constant. And additionally, if some other parameters are also constant, then the data race prediction problem can be solved in linear time in this case.
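The distance measure itself is straightforward to compute for a given pair of traces. Here is a sketch, once more over an assumed `(eid, tid, op, obj)` tuple encoding of events:

```python
from itertools import combinations

def reversals(sigma, rho):
    """Pairs of conflicting writes, or of acquires on the same lock,
    whose relative order differs between sigma and rho."""
    pos_s = {e[0]: i for i, e in enumerate(sigma)}
    pos_r = {e[0]: i for i, e in enumerate(rho)}
    out = set()
    for e, f in combinations(rho, 2):
        confl_writes = (e[2] == f[2] == "w" and e[3] == f[3] and e[1] != f[1])
        same_lock_acqs = (e[2] == f[2] == "acq" and e[3] == f[3])
        if not (confl_writes or same_lock_acqs):
            continue
        # opposite relative order in the two traces?
        if (pos_s[e[0]] - pos_s[f[0]]) * (pos_r[e[0]] - pos_r[f[0]]) < 0:
            out.add(frozenset((e[0], f[0])))
    return out

def distance(sigma, rho):
    return len(reversals(sigma, rho))
```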
We now turn to lower bound results of our paper. So, we first establish a lower bound for the general case. Recall that when the number of threads k is constant, the data race prediction
problem can be solved in polynomial time. So, is this problem also in P even when k is not constant? Well, if not, is the problem at least FPT in k? That is, is there an algorithm whose running time looks like F of k times n to the c, where c is a constant? To this end, we show
that the problem is in fact W[1]-hard in the parameter k. We show that both data race prediction and ideal realizability are W[1]-hard. For this, we show a reduction from the Independent Set problem, which asks if there is an independent set of size at least c in a given graph.
This reduction runs in time that is polynomial in both n and k, and as a result, the problem is also NP-hard when k is not a constant. I would like to remark that this result in fact improves a previous result due to Gibbons and Korach. In the 1997 paper Testing Shared Memories, Gibbons and Korach considered the problem
of verifying sequential consistency given a reads-from mapping, that is, VSC-RM. Here, one is given thread-wise sequences of reads and writes, and also a reads-from mapping, and the problem asks if there is an interleaving of these sequences that respects the reads-from
mapping. They showed that this problem is NP-hard in general. This problem is similar to ideal realizability, with the crucial difference that there are no locks, and also that the input to this problem is not a complete execution, but only thread-wise projections which may or may not belong to a complete execution. Based on our result, it is fair to say that
it is unlikely that there is a polynomial-time algorithm or even an FPT algorithm for this problem, unless the Exponential Time Hypothesis fails. The Exponential Time Hypothesis intuitively states that there is no sub-exponential-time algorithm for 3SAT, and it is widely believed to be true. We next show our upper bound is tight even when the traces just have
two threads. In particular, we show a fine-grained reduction from the Orthogonal Vectors problem to race prediction with two threads. It is widely believed that there is no sub-quadratic-time algorithm for the Orthogonal Vectors problem; this is known as the Orthogonal Vectors
conjecture. I'd like to point out that a stronger version of the Exponential Time Hypothesis, called the Strong Exponential Time Hypothesis or SETH, implies the Orthogonal Vectors conjecture.
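For reference, the Orthogonal Vectors problem is simple to state: given two sets of d-dimensional 0/1 vectors, decide whether some pair is orthogonal. The conjecture says that, essentially, nothing beats the obvious quadratic scan, sketched here:

```python
def has_orthogonal_pair(A, B):
    """Brute-force Orthogonal Vectors check in O(|A| * |B| * d) time.
    The OV conjecture asserts there is no O(n^(2 - eps)) algorithm,
    even when the dimension d is only slightly super-logarithmic in n."""
    return any(all(x * y == 0 for x, y in zip(a, b))
               for a in A for b in B)
```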
Finally, this lower bound also means that our algorithm for acyclic topologies is optimal up to logarithmic factors. So, let me summarize our contributions. We considered the problem of dynamic data race prediction and investigated complexity-theoretic questions in this space. We showed that, in general,
when the number of threads is constant, we have a polynomial-time algorithm, and otherwise the problem is hard for the parameterized class W[1] when the number of threads is the parameter. We then considered specific sub-problems motivated by practical instances, and we showed that the problem is often strongly polynomial.
There are lots of avenues for future work, including tighter bounds for some of the questions and complexity-theoretic investigations for other algorithmic paradigms like randomization or parallelization. Finally, one very interesting avenue is to come up with algorithms that meet
practical constraints and are deployable in practice. With this, I'd like to conclude.