René van Bevern:

The Computational Hardness and Tractability of Restricted Seriation Problems on Inaccurate Data.


Diplomarbeit, Institut für Informatik, Friedrich-Schiller-Universität Jena, Germany, April 2010.

Seriation is the task of finding a linear order of objects, whose pairwise similarities are given, such that similar objects are placed more closely in the order than dissimilar ones. The seriation problem originates from the task of relatively dating archaeological artifacts. Moreover, it is applied in the social sciences, in psychology, and technical diagnosis. Seriation is a special case of ordination as applied in the ecological sciences.

If there is an order of the objects such that two objects are not more similar to each other than any object pair ordered in between them, then the similarity associated with the objects is called Robinsonian—named after W. S. Robinson, who applied seriation to date archaeological artifacts. The corresponding order is called compatible with the given similarity. If a similarity is Robinsonian, that is, if it allows for a compatible order, then the order is computable in polynomial time. However, not all similarities allow for a compatible order. This, for example, can be due to inaccuracies in the input data. This work focuses on similarities that are not Robinsonian. We study two approaches to solve the seriation task in this case.

The first approach tries to identify a small subset of outliers of the given object set, so that the remaining objects allow for a compatible order. We call the problem Partial Robinsonian Similarity and show that it is NP-complete. By developing a fixed-parameter algorithm for the closely related Unit Interval Vertex Deletion problem, it is shown that Partial Robinsonian Similarity is fixed-parameter tractable (parameterized by the number of outliers) in case that the similarities between objects obtain only binary values. This case has applications in measuring indifference in the social sciences.

The second approach is searching for a Robinsonian similarity (obtaining values from a set M) that comes closest to a given similarity. For the resulting similarity, a compatible order can be constructed to solve the seriation problem for the original input data. Here, we measure closeness using Lp-norms and call the problem Robinson (Lp, M)-Fitting. We show that the problem is NP-complete for M=ℕ and M={0,1}. We present a mixed integer program formulation of the problem.

As a variant of the second approach, we study the Restricted Robinson (Lp, M)-Fitting problem, which searches for a similarity (obtaining values from M) that comes closest to the given input similarity and that is compatible with a given order. Algorithms solving this problem are applicable to seriation problems where parts of the sought order are already known. Exploiting convex programming, we obtain that Restricted Robinson (Lp, ℝ)-Fitting is polynomial-time solvable within arbitrary precision. Moreover, we present linear-time algorithms for Restricted Robinson (L, M)-Fitting when M is one of or , and for Restricted Robinson (L1, {0,1})-Fitting.

Download

The Diploma thesis is available in PDF format. A talk about the Unit Interval Vertex Deletion chapter is also available.

BibTex Entry

For information on using BibTex in your LaTeX documents, refer to the chapter "Bibliography management" in the wikibook for LaTeX.

@Misc{Bev10,
  author = {Ren\'{e} van Bevern},
  title = {The Computational Hardness and Tractability of
    Restricted Seriation Problems on Inaccurate Data},
  year = {2010},
  howpublished = {Diplomarbeit, Institut f\"{u}r Informatik, 
    Friedrich-Schiller-Universit\"{a}t
    Jena, Germany},
  note = {Available electronically.}
}

© 2010 René van Bevern. Last modified 2010-05-01.