Complexity of computing distances between geometric trees

Aasa Feragen

doi:10.1007/978-3-642-34166-3_10

Department of Computer Science

6 Citations (Scopus)

Abstract

Geometric trees can be formalized as unordered combinatorial trees whose edges are endowed with geometric information. Examples are skeleta of shapes from images; anatomical tree structures such as blood vessels; or phylogenetic trees. An inter-tree distance measure is a basic prerequisite for many pattern recognition and machine learning methods to work on anatomical, phylogenetic or skeletal trees. Standard distance measures between trees, such as tree edit distance, can be readily translated to the geometric tree setting. It is well known that the tree edit distance for unordered trees is generally NP-complete to compute. However, the classical proof of NP-completeness depends on a particular case of edit distance with integer edit costs for trees with discrete labels, and does not obviously carry over to the class of geometric trees. The reason is that edge geometry is encoded in continuous scalar or vector attributes, allowing for continuous edit paths from one tree to another, rather than finite, discrete edit sequences with discrete costs for discrete label sets. In this paper, we explain why the proof does not carry over directly to the continuous setting, and why it does not work for the important class of trees with scalar-valued edge attributes, such as edge length. We prove the NP-completeness of tree edit distance and another natural distance measure, QED, for geometric trees with vector-valued edge attributes.

Original language	English
Title of host publication	Structural, Syntactic, and Statistical Pattern Recognition : Joint IAPR International Workshop, SSPR&SPR 2012, Hiroshima, Japan, November 7-9, 2012. Proceedings
Editors	Georgy Gimel'farb, Edwin Hancock, Atsushi Imiya, Arjan Kuijper, Mineichi Kudo, Shinichiro Omachi, Terry Windeatt, Keiji Yamada
Number of pages	9
Publisher	Springer
Publication date	2012
Pages	89-97
ISBN (Print)	978-3-642-34165-6
ISBN (Electronic)	978-3-642-34166-3
DOIs	https://doi.org/10.1007/978-3-642-34166-3_10
Publication status	Published - 2012
Event	Joint IAPR International Workshops on Structural and Syntactic Pattern Recognition (SSPR 2012) and Statistical Techniques in Pattern Recognition (SPR 2012) - Miyajima-Itsukushima, Hiroshima, Japan. Duration: 7 Nov 2012 → 9 Nov 2012

Conference

Conference	Joint IAPR International Workshops on Structural and Syntactic Pattern Recognition (SSPR 2012) and Statistical Techniques in Pattern Recognition (SPR 2012)
Country/Territory	Japan
City	Miyajima-Itsukushima, Hiroshima
Period	07/11/2012 → 09/11/2012

Series	Lecture notes in computer science
Volume	7626
ISSN	0302-9743

Access to Document

10.1007/978-3-642-34166-3_10

10.1007%2F978-3-642-34166-3_10.pdf (Final published version, 384 KB)

Cite this

Feragen, A. (2012). Complexity of computing distances between geometric trees. In G. Gimel'farb, E. Hancock, A. Imiya, A. Kuijper, M. Kudo, S. Omachi, T. Windeatt, & K. Yamada (Eds.), Structural, Syntactic, and Statistical Pattern Recognition : Joint IAPR International Workshop, SSPR&SPR 2012, Hiroshima, Japan, November 7-9, 2012. Proceedings (pp. 89-97). Springer. Lecture notes in computer science, Vol. 7626. https://doi.org/10.1007/978-3-642-34166-3_10

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

@inproceedings{c32e96e17a364861b7f36d162e2ba325,

title = "Complexity of computing distances between geometric trees",

author = "Aasa Feragen",

year = "2012",

doi = "10.1007/978-3-642-34166-3_10",

language = "English",

isbn = "978-3-642-34165-6",

series = "Lecture notes in computer science",

publisher = "Springer",

pages = "89--97",

editor = "Georgy Gimel'farb and Edwin Hancock and Atsushi Imiya and Arjan Kuijper and Mineichi Kudo and Shinichiro Omachi and Terry Windeatt and Keiji Yamada",

booktitle = "Structural, Syntactic, and Statistical Pattern Recognition",

note = "Joint IAPR International Workshops on Structural and Syntactic Pattern Recognition (SSPR 2012) and Statistical Techniques in Pattern Recognition (SPR 2012) ; Conference date: 07-11-2012 Through 09-11-2012",

}

TY - GEN

T1 - Complexity of computing distances between geometric trees

AU - Feragen, Aasa

PY - 2012

Y1 - 2012

U2 - 10.1007/978-3-642-34166-3_10

DO - 10.1007/978-3-642-34166-3_10

M3 - Article in proceedings

SN - 978-3-642-34165-6

T3 - Lecture notes in computer science

SP - 89

EP - 97

BT - Structural, Syntactic, and Statistical Pattern Recognition

A2 - Gimel'farb, Georgy

A2 - Hancock, Edwin

A2 - Imiya, Atsushi

A2 - Kuijper, Arjan

A2 - Kudo, Mineichi

A2 - Omachi, Shinichiro

A2 - Windeatt, Terry

A2 - Yamada, Keiji

PB - Springer

T2 - Joint IAPR International Workshops on Structural and Syntactic Pattern Recognition (SSPR 2012) and Statistical Techniques in Pattern Recognition (SPR 2012)

Y2 - 7 November 2012 through 9 November 2012

ER -
