Geometric probing considers problems of determining a geometric structure or some aspect of that structure from the results of a mathematical or physical measuring device, a probe. A variety of problems from robotics, medical instrumentation, mathematical optimization, integral and computational geometry, graph theory and other areas fit into this paradigm. Beginning with my dissertation, I have worked to develop a field of geometric probing. The emphasis is on interactive reconstruction, where the results of all previous measurements are used to determine the orientation of the next probe so it provides the maximum amount of information about the structure. Through interactive reconstruction, we have developed finite determination strategies for such diverse models as finger, x-ray, and half-plane probes.
Efficiently maintaining the partition induced by a set of features is an important problem in building decision-tree classifiers. In order to identify a small set of discriminating features, we need the capability of efficiently adding and removing specific features and determining the effect of these changes on the induced classification, or partition.
We introduce a variety of efficient (randomized and deterministic) data structures to support these operations on both general and geometrically-induced set partitions. We give both Monte Carlo and Las Vegas algorithms which realize near-optimal time bounds and are practical to implement. We provide an efficient suffix-tree based algorithm for a more general problem on maintaining the sorted order of strings under character insertion/deletion and introduce an interesting related problem on simultaneously sorting a binary matrix by both rows and columns.
A well-known dynamic programming algorithm computes the longest
common subsequence of strings X and Y in
time.
In this paper, we develop significantly faster algorithms for a special
class of strings which emerge frequently in pattern matching problems.
In particular, we present the first algorithm which finds the
longest common
subsequence of strings X and Y in time polynomial
in the size of the compressed strings.
Our final algorithm runs in
time,
where k and l are the compressed lengths of strings X and Y, and is
a substantial improvement on the previously best algorithm of
Bunke and Csirik,
which runs in
O(l|Y|+k|X|) time.
The need to approximately match run-length encoded strings emerged during development of an optical character recognition (OCR) system. This system, built in association with Data Capture Systems Inc. has been designed to achieve a low substitution error-rate via fixed-font character recognition.