In topological data analysis, a simplex tree is a type of trie used to represent efficiently any general simplicial complex. Through its nodes, this data structure notably explicitly represents all the simplices. Its flexible structure allows the implementation of many basic operations useful to computing persistent homology. This data structure was invented by Jean-Daniel Boissonnat and Clément Maria in 2014, in the article The Simplex Tree: An Efficient Data Structure for General Simplicial Complexes.[1] This data structure offers efficient operations on sparse simplicial complexes. For dense or maximal simplices, Skeleton-Blocker[2] representations or Toplex Map[3] representations are used.
Definitions
editMany researchers in topological data analysis consider the simplex tree to be the most compact simplex-based data structure for simplicial complexes, and a data structure allowing an intuitive understanding of simplicial complexes due to integrated usage of their mathematical properties.[1][3][4]
Heuristic definition
editConsider any simplicial complex is a set composed of points (0 dimensions), line segments (1 dimension), triangles (2 dimensions), and their n-dimensional counterparts, called n-simplexes within a topological space. By the mathematical properties of simplexes, any n-simplex is composed of multiple -simplexes. Thus, lines are composed of points, triangles of lines, and tetrahedrons of triangles. Notice each higher level adds 1 vertex to the vertices of the n-simplex. The data structure is simplex-based, therefore, it should represent all simplexes uniquely by the points defining the simplex. A simple way to achieve this is to define each simplex by its points in sorted order.
Let be a simplicial complex of dimension k, its vertex set, where vertices are labeled from 1 to and ordered accordingly. Now, construct a dictionary size containing all vertex labels in order. This represents the 0-dimensional simplexes. Then, for the path to the initial dictionary of each entry in the initial dictionary, add as a child dictionary all vertices fully-connected to the current set of vertices, all of which have a label greater than . Represent this step on k levels. Clearly, considering the first dictionary as depth 0, any entry at depth of any dictionary in this data structure uniquely represents a -simplex within . For completeness, the point to the initial dictionary is considered the representation of the empty simplex. For the practicality of the operations, labels that are repeated on the same level are linked together, forming a looped linked list. Finally, child dictionaries also have pointers to their parent dictionary, for fast ancestor access.[1]
Constructive definition
editLet be a simplicial complex of dimension k. We begin by decomposing the simplicial complex into mutually exclusive simplexes. This can be achieved in a greedy way by iteratively removing from the simplicial complex the highest order simplexes until the simplicial complex is empty. We then need to label each vertex from 1 to and associate each simplex with its corresponding "word", that is the ordered list of its vertices by label. Ordering the labels ensures no repetition in the simplex tree, as there is only one way to describe a simplex. We start with a null root, representing the null simplex. Then, we iterate through all simplexes, and through each label of each simplex word. If the label is available as a child to the current root, make that child the temporary root of the insertion process, otherwise, create a new node for the child, make it the new temporary root, and continue with the rest of the word. During this process, k dictionaries are maintained with all the labels and insert the address of the node for the corresponding label. If an address is already at that space in the dictionary, a pointer is created from the old node to the new node. Once the process is finished, all children of each node are entered into a dictionary, and all pointers are looped to make looped linked lists. A wide range of dictionaries could be applied here, like hash tables, but some operations assume the possibility of an ordered traversal of the entries, leading most of the implementations to use red-black trees are dictionaries.[1]
Operations
editWhile simplex trees are not the most space efficient data structures for simplicial complex representation, their operations on sparse data are considered state-of-art. Here, we give the bounds of different useful operations possible through this representation. Many implementations of these operations are available.[1][4][5][6][7]
We first introduce the notation. Consider is a given simplex, is a given node corresponding to the last vertex of , is the label associate to that node, is the depth of that node, is the dimension of the simplicial complex, is the maximal number of operations to access in a dictionary (if the dictionary is a red-black tree, is the complexity) . Consider is the number of cofaces of , and is the number of nodes of the simplex tree ending with the label at depth greater than . Notice .
- Search, insert and remove words are done in .[1]
- Insert and remove an entire simplex is done in .[1]
- Computing persistent homology, or in a more involved way, computing Betti numbers, using a simplex tree most efficiently remains an open problem, however, current algorithms for this task on sparse simplicial complexes achieve state-of-art performance.[4]
- The structure of simplex trees allows for elementary collapse of collapsible simplexes, however the bounds of this operation in the general case are unknown.[1][5][7]
- A subcase of elementary collapse is edge-contraction. Edge contraction can be
achieved in .[1]
- Locating cofaces of given simplex can be achieved in .[1]
- Locating cofacets of given simplex can be achieved in .[1]
As for construction, as seen in the constructive definition, construction is proportional to the number and complexity of simplexes in the simplicial complex. This can be especially expensive if the simplicial complex is dense. However, some optimizations for particular simplicial complexes, including for Flag complexes, Rips complexes and Witness complexes.[1][8]
Applications
editSimplex trees are efficient in sparse simplicial complexes. For this purpose, many persistent homology algorithms focusing on high-dimensional real data (often sparse) use simplex trees within these algorithms. While simplex trees are not as efficient as incidence matrices, their simplex-based structure allows them to be useful and efficient for simplicial complex storage within persistent homology algorithms.[9]
References
edit- ^ a b c d e f g h i j k l Boissonnat, Jean-Daniel; Maria, Clément (November 2014). "The Simplex Tree: an Efficient Data Structure for General Simplicial Complexes". Algorithmica. 70 (3): 406–427. arXiv:2001.02581. doi:10.1007/s00453-014-9887-3. ISSN 0178-4617. S2CID 15335393.
- ^ Salinas, David (7 February 2020). "Skeleton-Blocker". Geometry Understanding in Higher Dimensions. Retrieved 9 December 2021.
- ^ a b Godi, Francois (7 February 2020). "Toplex Map". Geometry Understanding in Higher Dimensions. Retrieved 9 December 2021.
- ^ a b c Boissonnat, Jean-Daniel. "Simplex tree reference manual". Geometry Understanding in Higher Dimensions. Retrieved 9 December 2021.
- ^ a b Piekenbrock, Matt (13 September 2020). "simplex_tree: Simplex Tree". rdrr.io. Retrieved 9 December 2021.
- ^ Nanda, Vidit. "Perseus, the Persistent Homology Software". The Perseus Software Project for Rapid Computation of Persistent Homology. Retrieved 9 December 2021.
- ^ a b Morozov, Dmitriy (2019). "Basics". Dionysus 2. Retrieved 9 December 2021.
- ^ Morozov, Dmitriy (2019). "Vietoris–Rips Complexes". Dionysus 2. Retrieved 9 December 2021.
- ^ Mandal, Sayan (2020). "Applications of Persistent Homology and Cycles". The Ohio State University. PHD Thesis.