Skip to content

quotient_graph

QuotientGraph Link

QuotientGraph()

Bases: Graph

Represents a quotient graph derived from a point cloud graph.

The QuotientGraph class is designed to create a quotient graph based on the provided point cloud graph structure. The class allows for computation of graph nodes, which are regions derived from clustering in the point cloud graph, and establishes intra-class and inter-class connections with computed weights and labels. Additionally, coordinates of nodes can be derived, representing the geometric center of their respective regions.

Attributes:

  • point_cloud_graph (PointCloudGraph or None) –

    The underlying point cloud graph associated to this quotient graph.

  • nodes_coordinates (ndarray or None) –

    Computed coordinates of the nodes within the quotient graph.

Source code in spectral_clustering/quotient_graph.py
28
29
30
31
32
33
34
def __init__(self):
    super().__init__()
    # self.seed_colors = None
    # self.graph_labels_dict = None
    # self.label_count = None
    self.point_cloud_graph = None
    self.nodes_coordinates = None

build_from_pointcloudgraph Link

build_from_pointcloudgraph(G, labels_from_cluster, region_growing=True)

Build a quotient graph from a given point cloud graph.

Builds a quotient graph from a given point cloud graph based on clustering results and optionally applies a region-growing algorithm to extract connected components made of the same k-means cluster label. The resulting quotient graph nodes represent connected components, while edges represent interactions between these components.

Parameters:

  • G (PointCloudGraph) –

    The input point cloud graph where nodes represent points and edges may have associated weights describing relationships between points.

  • labels_from_cluster (ndarray) –

    An array of labels from the initial clustering (e.g., k-means clustering) where each element corresponds to the cluster label of a graph node.

  • region_growing (bool, default: True ) –

    A flag indicating whether to use a region-growing algorithm to group nodes into connected components based on their k-means cluster labels. If False, connected components are not grown, and nodes are assigned to clusters based on initial cluster labels directly. Default is True.

Source code in spectral_clustering/quotient_graph.py
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
def build_from_pointcloudgraph(self, G, labels_from_cluster, region_growing=True):
    """Build a quotient graph from a given point cloud graph.

    Builds a quotient graph from a given point cloud graph based on clustering results
    and optionally applies a region-growing algorithm to extract connected components
    made of the same k-means cluster label. The resulting quotient graph nodes represent
    connected components, while edges represent interactions between these components.

    Parameters
    ----------
    G : spectral_clustering.point_cloud_graph.PointCloudGraph
        The input point cloud graph where nodes represent points and edges may have
        associated weights describing relationships between points.
    labels_from_cluster : numpy.ndarray
        An array of labels from the initial clustering (e.g., k-means clustering) where
        each element corresponds to the cluster label of a graph node.
    region_growing : bool, optional
        A flag indicating whether to use a region-growing algorithm to group nodes into
        connected components based on their k-means cluster labels. If False, connected
        components are not grown, and nodes are assigned to clusters based on initial
        cluster labels directly.
        Default is ``True``.
    """

    kmeans_labels = labels_from_cluster
    connected_component_labels = np.zeros(kmeans_labels.shape, dtype=int)
    current_cc_size = 0
    connected_component_size = []
    seed_kmeans_labels = []
    # This line has been added to obtain particular colors when the initial clustering is done with four clusters
    # especially kmeans.
    lablist = list(kmeans_labels[:])
    my_count = pd.Series(lablist).value_counts()
    # if len(my_count) == 4:
    #     #cluster_colors = ['#0000FF', '#00FF00', '#FFFF00', '#FF0000']
    #     cluster_colors = ['glasbey_'+str(i) for i in range(len(my_count))]
    # else:
    #     jet = cm.get_cmap('jet', len(my_count))
    #     cluster_colors = jet(range(len(my_count)))
    #     list_labels = np.unique(lablist).tolist()
    seed_colors = []
    label_count = 0
    visited = np.zeros(kmeans_labels.shape, dtype=int)
    queue = []

    # Region growing algorithm to extract connected components made of the same k-means class.
    if region_growing:
        for i in range(kmeans_labels.shape[0]):

            # Find a seed to initialize the seed fill process.

            if visited[i] == 0:
                seed = i
                visited[i] = 1
                queue.append(i)
                seed_kmeans_labels.append(kmeans_labels[seed])
                seed_colors.append('glasbey_' + str(kmeans_labels[seed][0] % 256))
                # if len(my_count) == 4:
                #     seed_colors.append(cluster_colors[kmeans_labels[seed][0]])
                # else:
                #     seed_colors.append(cluster_colors[list_labels.index(kmeans_labels[seed][0])][:])
                current_cc_size = 0

                # Region growing from the specified seed.

                while len(queue) != 0:

                    current = queue.pop(0)
                    connected_component_labels[current] = label_count
                    current_cc_size += 1

                    for n in G[current]:
                        if kmeans_labels[n] == kmeans_labels[seed] and visited[n] == 0:
                            queue.append(n)
                            visited[n] = 1

                connected_component_size.append(current_cc_size)
                label_count += 1
    else:
        # To correct/finish, it doesn't work for now
        list_kmeans_labels = [kmeans_labels[i][0] for i in range(len(kmeans_labels))]
        label_count = len(list(set(list_kmeans_labels)))
        for i in range(kmeans_labels.shape[0]):
            seed_colors.append('glasbey_' + str(kmeans_labels[i][0] % 256))
            seed_kmeans_labels.append(kmeans_labels[i])

    # Create quotient graph nodes.

    self.add_nodes_from(range(label_count))

    node_kmeans_labels_values = dict(
        zip(np.asarray(self.nodes()), np.transpose(np.asarray(seed_kmeans_labels))[0]))
    nx.set_node_attributes(self, node_kmeans_labels_values, 'kmeans_labels')

    # Create quotient graph edges.

    intra_edge_weight_sum = np.zeros(label_count, dtype=np.float64)
    intra_edge_count = np.zeros(label_count, dtype=int)

    for (u, v) in G.edges:
        a = connected_component_labels[u, 0]
        b = connected_component_labels[v, 0]

        # Inter-class edge case:

        if a != b:
            if not self.has_edge(a, b):
                w = G.edges[u, v]['weight']
                self.add_edge(a, b, inter_class_edge_weight=w, inter_class_edge_number=1)
                # nx.set_edge_attributes(self, G.edges[u, v]['weight'], 'inter_class_edge_weight')
                # self.edges[a, b]['inter_class_edge_weight'] = G.edges[u, v]['weight']
                # nx.set_edge_attributes(self, 1, 'inter_class_edge_number')
                # self.edges[a, b]['inter_class_edge_number'] = 1
            else:
                self.edges[a, b]['inter_class_edge_weight'] += G.edges[u, v]['weight']
                self.edges[a, b]['inter_class_edge_number'] += 1

        # Intra-class edge case:

        else:
            intra_edge_weight_sum[a] += G.edges[u, v]['weight']
            intra_edge_count[a] += 1

    # Assign to each point cloud graph node the corresponding quotient graph node.

    node_cc = dict(
        zip(np.asarray(G.nodes()), np.transpose(np.asarray(connected_component_labels))[0]))
    nx.set_node_attributes(G, node_cc, 'quotient_graph_node')

    nx.set_node_attributes(self, dict(
        zip(np.asarray(self.nodes()), np.transpose(np.asarray(connected_component_size)))),
                           'intra_class_node_number')

    nx.set_node_attributes(self, dict(
        zip(np.asarray(self.nodes()), np.transpose(np.asarray(intra_edge_weight_sum)))), 'intra_class_edge_weight')

    nx.set_node_attributes(self, dict(
        zip(np.asarray(self.nodes()), np.transpose(np.asarray(intra_edge_count)))), 'intra_class_edge_number')

    # if len(my_count) == 4:
    #     nx.set_node_attributes(self, dict(
    #         zip(np.asarray(self.nodes()), np.transpose(np.asarray(seed_colors)))), 'seed_colors')
    # else:
    nx.set_node_attributes(self, dict(
        zip(np.asarray(self.nodes()), seed_colors)), 'seed_colors')

    # self.seed_colors = seed_colors
    # self.label_count = label_count
    self.point_cloud_graph = G

compute_direction_info Link

compute_direction_info(list_leaves)

Computes directional gradient information for specific nodes and edges in the graph.

This function computes a mean direction gradient for each node in the graph, based on the sum of the direction gradients of the individual nodes that belong to its quotient graph node. Additionally, it calculates the energy dot product for each edge based on the dot product of the mean direction gradients of the connected nodes. Leaves from the specified list are assigned a predefined energy dot product value.

Parameters:

  • list_leaves (list) –

    A list of nodes that are considered as leaves in the graph. These nodes will have their associated edges assigned a constant energy dot product value.

Source code in spectral_clustering/quotient_graph.py
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
def compute_direction_info(self, list_leaves):
    """Computes directional gradient information for specific nodes and edges in the graph.

    This function computes a mean direction gradient for each node in the graph, based on
    the sum of the direction gradients of the individual nodes that belong to its quotient
    graph node. Additionally, it calculates the energy dot product for each edge based
    on the dot product of the mean direction gradients of the connected nodes. Leaves
    from the specified list are assigned a predefined energy dot product value.

    Parameters
    ----------
    list_leaves : list
        A list of nodes that are considered as leaves in the graph. These nodes will
        have their associated edges assigned a constant energy dot product value.
    """
    G = self.point_cloud_graph
    for l in self:
        self.nodes[l]['dir_gradient_mean'] = 0
        list_of_nodes_in_qnode = [x for x, y in G.nodes(data=True) if y['quotient_graph_node'] == l]
        for n in list_of_nodes_in_qnode:
            self.nodes[l]['dir_gradient_mean'] += G.nodes[n]['direction_gradient']
        self.nodes[l]['dir_gradient_mean'] /= len(list_of_nodes_in_qnode)

    # take all the edges and compute the dot product between the two cluster
    for e in self.edges():
        v1 = self.nodes[e[0]]['dir_gradient_mean']
        v2 = self.nodes[e[1]]['dir_gradient_mean']
        if e[0] in list_leaves or e[1] in list_leaves:
            self.edges[e]['energy_dot_product'] = 4
        else:
            dot = v1[0] * v2[0] + v1[1] * v2[1] + v1[2] * v2[2]
            energy_dot_product = 1 - dot
            self.edges[e]['energy_dot_product'] = energy_dot_product

compute_local_descriptors Link

compute_local_descriptors(method='all_qg_cluster', data='coords', neighborhood='radius', scale=10)

Compute local descriptors for the points in a graph or its quotient graph nodes based on various methods, data types, and neighborhood definitions.

This method calculates geometric and structural descriptors such as planarity, linearity, scattering, curvature, and eigenvalues using covariance matrices derived from either the point coordinates or gradient vectors. Depending on the method and data type, the computation is either performed for individual points or for clusters of points belonging to quotient graph nodes.

Parameters:

  • method (str, default: 'all_qg_cluster' ) –

    Specifies the type of computation to perform. Default is 'all_qg_cluster'. Options include:

    • 'all_qg_cluster': Compute descriptors for quotient graph nodes by aggregating information from all points in a node.
    • 'each_point': Compute descriptors for individual points based on their direct neighborhoods.
  • data (str, default: 'coords' ) –

    Specifies the type of data on which descriptors are computed. Default is 'coords'. Options include:

    • 'coords': Use the coordinates of points.
    • 'gradient_vector_fiedler': Use the gradient vector of the Fiedler eigenvector.
  • neighborhood (str, default: 'radius' ) –

    Specifies the neighborhood definition when computing descriptors for individual points. Default is 'radius'. Options include:

    • 'radius': Neighborhood defined by points within a given radius.
    • 'pointcloudgraph': Predefined graph-based neighborhood.
  • scale (int, default: 10 ) –

    When 'neighborhood' is set to 'radius', this parameter determines the radius to define the neighborhood. Default is 10.

Raises:

  • ValueError

    If an invalid combination of method, data, or neighborhood is provided.

  • KeyError

    If required attributes are missing in the point cloud graph.

Returns:

  • None

    This method updates nodes in the graph with their computed descriptors as attributes.

Source code in spectral_clustering/quotient_graph.py
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
def compute_local_descriptors(self, method='all_qg_cluster', data='coords', neighborhood='radius', scale=10):
    """
    Compute local descriptors for the points in a graph or its quotient graph nodes based on various methods, data types, and neighborhood definitions.

    This method calculates geometric and structural descriptors such as planarity, linearity, scattering, curvature, and eigenvalues
    using covariance matrices derived from either the point coordinates or gradient vectors. Depending on the method and data type, the
    computation is either performed for individual points or for clusters of points belonging to quotient graph nodes.

    Parameters
    ----------
    method : str, optional
        Specifies the type of computation to perform.
        Default is ``'all_qg_cluster'``.
        Options include:

        - ``'all_qg_cluster'``: Compute descriptors for quotient graph nodes by aggregating information from all points
          in a node.
        - ``'each_point'``: Compute descriptors for individual points based on their direct neighborhoods.
    data : str, optional
        Specifies the type of data on which descriptors are computed.
        Default is 'coords'.
        Options include:

        - ``'coords'``: Use the coordinates of points.
        - ``'gradient_vector_fiedler'``: Use the gradient vector of the Fiedler eigenvector.
    neighborhood : str, optional
        Specifies the neighborhood definition when computing descriptors for individual points.
        Default is 'radius'.
        Options include:

          - ``'radius'``: Neighborhood defined by points within a given radius.
          - ``'pointcloudgraph'``: Predefined graph-based neighborhood.
    scale : int, optional
        When 'neighborhood' is set to 'radius', this parameter determines the radius to define the neighborhood.
        Default is ``10``.

    Raises
    ------
    ValueError
        If an invalid combination of `method`, `data`, or `neighborhood` is provided.
    KeyError
        If required attributes are missing in the point cloud graph.

    Returns
    -------
    None
        This method updates nodes in the graph with their computed descriptors as attributes.
    """
    G = self.point_cloud_graph
    # compute the descriptor for all the point in a quotient graph node
    if method == 'all_qg_cluster' and data == 'coords':
        for qnode in self:
            list_of_nodes_in_qnode = [x for x, y in G.nodes(data=True) if y['quotient_graph_node'] == qnode]
            mat = np.zeros((len(list_of_nodes_in_qnode), 3))
            i = 0
            for n in list_of_nodes_in_qnode:
                mat[i] = G.nodes_coords[n]
                i += 1
            covmat = np.cov(np.transpose(mat))

            eigenval, eigenvec = np.linalg.eigh(covmat)

            self.nodes[qnode]['planarity'] = (eigenval[1] - eigenval[0]) / eigenval[2]
            self.nodes[qnode]['planarity2'] = (eigenval[1] - eigenval[0]) / eigenval[1]
            self.nodes[qnode]['linearity'] = (eigenval[2] - eigenval[1]) / eigenval[2]
            self.nodes[qnode]['scattering'] = eigenval[0] / eigenval[2]
            self.nodes[qnode]['curvature_eig'] = eigenval[0] / (eigenval[0] + eigenval[2] + eigenval[1])
            self.nodes[qnode]['lambda1'] = eigenval[2]
            self.nodes[qnode]['lambda2'] = eigenval[1]
            self.nodes[qnode]['lambda3'] = eigenval[0]

    if method == 'each_point' and data == 'coords' and neighborhood == 'radius':
        # compute a new pointcloudgraph with the neighborhood wanted
        NewG = sgk.create_riemannian_graph(G.pcd, method=neighborhood, nearest_neighbors=neighborhood, radius=scale)

    if method == 'each_point' and data == 'coords' and neighborhood == 'pointcloudgraph':
        # compute each descriptor for each point of the point cloud and its neighborhood
        for p in G:
            mat = np.zeros((len(G[p]), 3))
            i = 0
            for n in G[p]:
                mat[i] = G.nodes_coords[n]
                i += 1
            covmat = np.cov(np.transpose(mat))

            eigenval, eigenvec = np.linalg.eigh(covmat)
            G.nodes[p]['planarity'] = (eigenval[1] - eigenval[0]) / eigenval[2]
            G.nodes[p]['linearity'] = (eigenval[2] - eigenval[1]) / eigenval[2]
            G.nodes[p]['scattering'] = eigenval[0] / eigenval[2]
            G.nodes[p]['curvature_eig'] = eigenval[0] / (eigenval[0] + eigenval[2] + eigenval[1])
        # compute mean value for the qnode
        for qnode in self:
            list_of_nodes_in_qnode = [x for x, y in G.nodes(data=True) if y['quotient_graph_node'] == qnode]
            self.nodes[qnode]['planarity'] = 0
            self.nodes[qnode]['linearity'] = 0
            self.nodes[qnode]['scattering'] = 0
            self.nodes[qnode]['curvature_eig'] = 0

            for n in list_of_nodes_in_qnode:
                self.nodes[qnode]['planarity'] += G.nodes[n]['planarity']
                self.nodes[qnode]['linearity'] += G.nodes[n]['linearity']
                self.nodes[qnode]['scattering'] += G.nodes[n]['scattering']
                self.nodes[qnode]['curvature_eig'] += G.nodes[n]['curvature_eig']
            self.nodes[qnode]['planarity'] /= len(list_of_nodes_in_qnode)
            self.nodes[qnode]['linearity'] /= len(list_of_nodes_in_qnode)
            self.nodes[qnode]['scattering'] /= len(list_of_nodes_in_qnode)
            self.nodes[qnode]['curvature_eig'] /= len(list_of_nodes_in_qnode)

    if method == 'all_qg_cluster' and data == 'gradient_vector_fiedler':
        for qnode in self:
            list_of_nodes_in_qnode = [x for x, y in G.nodes(data=True) if y['quotient_graph_node'] == qnode]
            mat = np.zeros((len(list_of_nodes_in_qnode), 3))
            i = 0
            for n in list_of_nodes_in_qnode:
                mat[i] = G.nodes[n]['direction_gradient']
                i += 1
            covmat = np.cov(np.transpose(mat))

            eigenval, eigenvec = np.linalg.eigh(covmat)

            self.nodes[qnode]['max_eigenval'] = eigenval[2]
            self.nodes[qnode]['planarity'] = (eigenval[1] - eigenval[0]) / eigenval[2]
            self.nodes[qnode]['linearity'] = (eigenval[2] - eigenval[1]) / eigenval[2]
            self.nodes[qnode]['scattering'] = eigenval[0] / eigenval[2]
            self.nodes[qnode]['curvature_eig'] = eigenval[0] / (eigenval[0] + eigenval[2] + eigenval[1])

compute_nodes_coordinates Link

compute_nodes_coordinates()

Compute and assign average 3D coordinates to nodes in the quotient graph.

This function calculates the centroid coordinates for each node in the quotient graph by computing the mean of the coordinates of the corresponding point cloud graph nodes. These coordinates are subsequently stored as an attribute for each node in the quotient graph. It provides a mechanism for visualizing the quotient graph in 3D space.

Source code in spectral_clustering/quotient_graph.py
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
def compute_nodes_coordinates(self):
    """Compute and assign average 3D coordinates to nodes in the quotient graph.

    This function calculates the centroid coordinates for each node in the quotient
    graph by computing the mean of the coordinates of the corresponding point cloud
    graph nodes. These coordinates are subsequently stored as an attribute for each
    node in the quotient graph. It provides a mechanism for visualizing the quotient
    graph in 3D space.
    """

    G = self.point_cloud_graph

    # Calcul de coordonnées moyennes pour chaque noeud du graphe quotient, dans le but d'afficher en 3D le graphe.
    new_classif = np.asarray(list((dict(G.nodes(data='quotient_graph_node')).values())))
    nodes_coords_moy = np.array([np.mean(G.nodes_coords[new_classif == n], axis=0) for n in self.nodes])

    # import scipy.ndimage as nd
    # nodes_coords_moy = np.transpose([nd.mean(G.nodes_coords[:,k],
    #                                          new_classif,
    #                                          index=list(self.nodes)) for k in range(3)])
    #
    # nodes_coords_moy = np.zeros((len(self), 3))
    # for j, n in enumerate(self.nodes):
    #     nodes_coords_moy[j] = np.mean(G.nodes_coords[new_classif == n], axis=0)
    #
    #     X = []
    #     Y = []
    #     Z = []
    #     for i in range(pcd_attribute.shape[0]):
    #         if sorted_pcd_attribute_by_quotient_graph_attribute[i, 3] == n:
    #             X.append(sorted_pcd_attribute_by_quotient_graph_attribute[i, 0])
    #             Y.append(sorted_pcd_attribute_by_quotient_graph_attribute[i, 1])
    #             Z.append(sorted_pcd_attribute_by_quotient_graph_attribute[i, 2])
    #     nodes_coords_moy[j, 0] = np.mean(X)
    #     nodes_coords_moy[j, 1] = np.mean(Y)
    #     nodes_coords_moy[j, 2] = np.mean(Z)

    # new_classif = new_classif[:, np.newaxis]
    # pcd_attribute = np.concatenate([G.nodes_coords, new_classif], axis=1)
    # sorted_pcd_attribute_by_quotient_graph_attribute = pcd_attribute[np.argsort(pcd_attribute[:, 3])]
    # nodes_coords_moy = np.zeros((len(self), 3))
    # j = 0
    # for n in self.nodes:
    #     X = []
    #     Y = []
    #     Z = []
    #     for i in range(pcd_attribute.shape[0]):
    #         if sorted_pcd_attribute_by_quotient_graph_attribute[i, 3] == n:
    #             X.append(sorted_pcd_attribute_by_quotient_graph_attribute[i, 0])
    #             Y.append(sorted_pcd_attribute_by_quotient_graph_attribute[i, 1])
    #             Z.append(sorted_pcd_attribute_by_quotient_graph_attribute[i, 2])
    #     nodes_coords_moy[j, 0] = np.mean(X)
    #     nodes_coords_moy[j, 1] = np.mean(Y)
    #     nodes_coords_moy[j, 2] = np.mean(Z)
    #     j += 1
    self.nodes_coordinates = nodes_coords_moy

    for nodes in self.nodes:
        self.nodes[nodes]['centroide_coordinates'] = np.nan
    i = 0
    for nodes in self.nodes:
        self.nodes[nodes]['centroide_coordinates'] = nodes_coords_moy[i, :]
        i += 1

compute_quotientgraph_metadata_on_a_node_interclass Link

compute_quotientgraph_metadata_on_a_node_interclass(node)

Computes metadata for a specific node in the quotient graph while focusing on interclass relationships.

This function analyzes the relationships of the node's neighbors that belong to different classes within the quotient graph. It returns a dictionary that counts the occurrences of neighboring nodes belonging to other classes relative to the specified node.

Parameters:

  • node (any) –

    The node in the quotient graph for which the interclass metadata should be computed.

Returns:

  • dict

    A dictionary where the keys are the different classes (quotient graph nodes) and the values are the count of neighboring nodes belonging to those classes.

Notes

The function uses the 'quotient_graph_node' attribute of the nodes in the graph to determine the class of each node. Nodes with the same 'quotient_graph_node' value are considered to belong to the same class.

Source code in spectral_clustering/quotient_graph.py
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
def compute_quotientgraph_metadata_on_a_node_interclass(self, node):
    """Computes metadata for a specific node in the quotient graph while focusing on interclass relationships.

    This function analyzes the relationships of the node's neighbors that belong to different classes within
    the quotient graph. It returns a dictionary that counts the occurrences of neighboring nodes belonging
    to other classes relative to the specified node.

    Parameters
    ----------
    node : any
        The node in the quotient graph for which the interclass metadata should be computed.

    Returns
    -------
    dict
        A dictionary where the keys are the different classes (quotient graph nodes)
        and the values are the count of neighboring nodes belonging to those classes.

    Notes
    -----
    The function uses the 'quotient_graph_node' attribute of the nodes in the graph to determine
    the class of each node. Nodes with the same 'quotient_graph_node' value are considered to
    belong to the same class.
    """
    G = self.point_cloud_graph
    count = {}
    for c in self[node]:
        count[c] = 0

    list_of_nodes_G = [x for x, y in G.nodes(data=True) if y['quotient_graph_node'] == node]
    for ng in list_of_nodes_G:
        for neing in G[ng]:
            if G.nodes[ng]['quotient_graph_node'] != G.nodes[neing]['quotient_graph_node']:
                count[G.nodes[neing]['quotient_graph_node']] += 1

    return count

compute_silhouette Link

compute_silhouette(method='topological', data='direction_gradient_vector_fiedler')

Compute the silhouette score for nodes in a point cloud graph based on clustering.

This function calculates silhouette scores for each node in a point cloud graph based either on the "all_qg_cluster" or "topological" approach. The silhouette score is a measure of how similar a node's association to its own cluster is versus its association to neighboring, adjacent clusters. The results are stored in the graph for individual nodes and quotient graph nodes.

Parameters:

  • method (str, default: 'topological' ) –

    The method used to compute the silhouette score. Acceptable values are 'all_qg_cluster' for traditional silhouette computation or 'topological' for a more nuanced topology-based computation. Default is 'topological'.

  • data (str, default: 'direction_gradient_vector_fiedler' ) –

    Indicates the data used for silhouette computation. Options are 'direction_gradient_vector_fiedler' or 'norm_gradient_vector_fiedler', determining the specific gradient vector field used in the calculation. Default is 'direction_gradient_vector_fiedler'.

Notes
  • For the 'all_qg_cluster' method, silhouette scores are based on the classical approach available in scikit-learn.
  • For the 'topological' method, silhouette calculation considers the dissimilarities with adjacent clusters only, leveraging adjacency matrices for the computations.
  • The computed silhouette scores are stored directly as node attributes in the underlying graph.
  • Quotient graph nodes' silhouette scores are computed as the average of individual nodes within the respective cluster.
Source code in spectral_clustering/quotient_graph.py
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
def compute_silhouette(self, method='topological', data='direction_gradient_vector_fiedler'):
    """Compute the silhouette score for nodes in a point cloud graph based on clustering.

    This function calculates silhouette scores for each node in a point cloud graph
    based either on the "all_qg_cluster" or "topological" approach. The silhouette
    score is a measure of how similar a node's association to its own cluster is
    versus its association to neighboring, adjacent clusters. The results are stored
    in the graph for individual nodes and quotient graph nodes.

    Parameters
    ----------
    method : str, optional
        The method used to compute the silhouette score. Acceptable values are
        'all_qg_cluster' for traditional silhouette computation or 'topological'
        for a more nuanced topology-based computation. Default is 'topological'.
    data : str, optional
        Indicates the data used for silhouette computation. Options are
        'direction_gradient_vector_fiedler' or 'norm_gradient_vector_fiedler',
        determining the specific gradient vector field used in the calculation.
        Default is 'direction_gradient_vector_fiedler'.

    Notes
    -----
    - For the 'all_qg_cluster' method, silhouette scores are based on the classical
      approach available in scikit-learn.
    - For the 'topological' method, silhouette calculation considers the dissimilarities
      with adjacent clusters only, leveraging adjacency matrices for the computations.
    - The computed silhouette scores are stored directly as node attributes in the
      underlying graph.
    - Quotient graph nodes' silhouette scores are computed as the average of individual
      nodes within the respective cluster.
    """
    G = self.point_cloud_graph
    label = []
    for node in G:
        label.append(G.nodes[node]['quotient_graph_node'])

    if method == 'all_qg_cluster':
        # this method is based on the classical way to compute silhouette coefficients in the sci-kit learn module

        if data == 'direction_gradient_vector_fiedler':
            sil = sk.metrics.silhouette_samples(G.direction_gradient_on_fiedler_scaled, label, metric='euclidean')
        elif data == 'norm_gradient_vector_fiedler':
            sil = sk.metrics.silhouette_samples(G.gradient_on_fiedler, label, metric='euclidean')

    if method == 'topological':
        import scipy.ndimage as nd
        from sklearn.metrics import pairwise_distances_chunked
        # this method aims at comparing the cluster A of the point considered with the adjacent clusters of A.
        X = []

        if data == 'direction_gradient_vector_fiedler':
            X = G.direction_gradient_on_fiedler_scaled
        elif data == 'norm_gradient_vector_fiedler':
            X = G.gradient_on_fiedler

        labels = np.array(label)

        label_list = np.array(self.nodes)

        label_adjacency = nx.adjacency_matrix(self).todense().astype('float64')
        label_adjacency -= 1 * np.eye(len(label_adjacency)).astype('float64')

        label_index = np.array([np.where(label_list == l)[0][0] for l in labels])

        def mean_by_label(D_chunk, start):
            """Computes the mean of data points by label, excluding contributions from the current label.

            This function calculates the per-label mean of a given chunk of data points, while
            excluding contributions from the label currently being processed. It operates on an
            input chunk of data and associated labels, aggregating information based on the specified
            index labels.

            Parameters
            ----------
            D_chunk : np.ndarray
                A chunk of multi-dimensional data points split from the full dataset, where each row
                represents a data point.
            start : int
                The start index for the current chunk of data within the full dataset. Used to
                correctly align the chunk with the corresponding labels.

            Returns
            -------
            np.ndarray
                An array representing the mean value of data points grouped by label, excluding
                contributions from the current label. The size of the array matches the provided
                `D_chunk`.
            """
            label_mean = []
            chunk_labels = labels[start:start + len(D_chunk)]
            for line, label in zip(D_chunk, chunk_labels):
                label_sum = nd.sum(line, labels, index=label_list)
                label_count = nd.sum(np.ones_like(labels), labels, index=label_list)
                label_count -= label_list == label
                label_mean += [label_sum / label_count]
            return np.array(label_mean)

        gen = pairwise_distances_chunked(X, reduce_func=mean_by_label)

        start = 0
        silhouette = []
        for label_mean_dissimilarity in gen:
            chunk_label_index = label_index[start:start + len(label_mean_dissimilarity)]

            adjacent_label_mean_dissimilarity = np.copy(label_mean_dissimilarity)
            adjacent_label_mean_dissimilarity[label_adjacency[chunk_label_index] != 1] = np.nan

            self_label_mean_dissimilarity = np.copy(label_mean_dissimilarity)
            self_label_mean_dissimilarity[label_adjacency[chunk_label_index] != -1] = np.nan

            intra = np.nanmin(self_label_mean_dissimilarity, axis=1)
            inter = np.nanmin(adjacent_label_mean_dissimilarity, axis=1)

            silhouette += list((inter - intra) / np.maximum(inter, intra))
            start += len(label_mean_dissimilarity)

        sil = np.array(silhouette)

    i = 0
    for node in G:
        G.nodes[node]['silhouette'] = sil[i]
        i += 1
    for qnode in self:
        list_of_nodes_in_qnode = [x for x, y in G.nodes(data=True) if y['quotient_graph_node'] == qnode]
        self.nodes[qnode]['silhouette'] = 0
        for n in list_of_nodes_in_qnode:
            self.nodes[qnode]['silhouette'] += G.nodes[n]['silhouette']
        self.nodes[qnode]['silhouette'] /= len(list_of_nodes_in_qnode)

count_local_extremum_of_Fiedler Link

count_local_extremum_of_Fiedler()

Counts the local extrema of the Fiedler vector in the graph associated with each node.

The method utilizes a point cloud graph and updates the associated nodes in the graph with the total number of local extrema of the Fiedler vector belonging to their corresponding quotient graph nodes.

Source code in spectral_clustering/quotient_graph.py
659
660
661
662
663
664
665
666
667
668
669
670
671
672
def count_local_extremum_of_Fiedler(self):
    """Counts the local extrema of the Fiedler vector in the graph associated with each node.

    The method utilizes a point cloud graph and updates the associated nodes
    in the graph with the total number of local extrema of the Fiedler vector
    belonging to their corresponding quotient graph nodes.
    """
    G = self.point_cloud_graph
    G.find_local_extremum_of_Fiedler()
    for c in self.nodes():
        self.nodes[c]['number_of_local_Fiedler_extremum'] = 0
        list_of_nodes_r = [x for x, y in G.nodes(data=True) if y['quotient_graph_node'] == c]
        for pt in list_of_nodes_r:
            self.nodes[c]['number_of_local_Fiedler_extremum'] += G.nodes[pt]['extrem_local_Fiedler']

delete_empty_edges_and_nodes Link

delete_empty_edges_and_nodes()

Delete edges from the quotient graph that do not represent any edges in the distance-based graph anymore.

Delete nodes from the quotient graph that do not represent any nodes of the distance-based graph. Update of the topological structure by removal only.

Source code in spectral_clustering/quotient_graph.py
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
def delete_empty_edges_and_nodes(self):
    """Delete edges from the quotient graph that do not represent any edges in the distance-based graph anymore.

    Delete nodes from the quotient graph that do not represent any nodes of the distance-based graph.
    Update of the topological structure by removal only.
    """
    to_remove = []
    list = [e for e in self.edges]
    for e in list:
        if self.edges[e[0], e[1]]['inter_class_edge_number'] == 0:
            to_remove.append(e)
    print(to_remove)
    self.remove_edges_from(to_remove)
    to_remove = []
    for n in self.nodes:
        if self.nodes[n]['intra_class_node_number'] == 0:
            to_remove.append(n)
    self.remove_nodes_from(to_remove)
    print(to_remove)

ponderate_with_coordinates_of_points Link

ponderate_with_coordinates_of_points()

Computes the weights of edges based on the centroid coordinates of their connected nodes.

This method first calculates the coordinates of the centroids for all nodes and then uses these coordinates to calculate the distance between connected nodes in the graph. The calculated distance is stored as an additional weight attribute for each edge.

Attributes:

  • self.nodes (dict) –

    A dictionary representing graph nodes where keys are node identifiers and values are associated attributes, including 'centroide_coordinates'.

  • self.edges (dict) –

    A dictionary representing graph edges where keys are tuples of connected node identifiers and values are associated attributes, including the new 'distance_centroides'.

Notes

The edge weight 'distance_centroides' is calculated as the Euclidean distance between the centroid coordinates of two nodes connected by the edge.

Source code in spectral_clustering/quotient_graph.py
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
def ponderate_with_coordinates_of_points(self):
    """Computes the weights of edges based on the centroid coordinates of their connected nodes.

    This method first calculates the coordinates of the centroids for all nodes and then uses these
    coordinates to calculate the distance between connected nodes in the graph. The calculated
    distance is stored as an additional weight attribute for each edge.

    Attributes
    ----------
    self.nodes : dict
        A dictionary representing graph nodes where keys are node identifiers and values are
        associated attributes, including 'centroide_coordinates'.
    self.edges : dict
        A dictionary representing graph edges where keys are tuples of connected node identifiers
        and values are associated attributes, including the new 'distance_centroides'.

    Notes
    -----
    The edge weight 'distance_centroides' is calculated as the Euclidean distance between the
    centroid coordinates of two nodes connected by the edge.
    """
    self.compute_nodes_coordinates()
    # Calcul des poids
    for e in self.edges:
        n1 = e[0]
        n2 = e[1]
        c1 = self.nodes[n1]['centroide_coordinates']
        c2 = self.nodes[n2]['centroide_coordinates']
        dist = np.linalg.norm(c1 - c2)
        self.edges[e]['distance_centroides'] = dist

rebuild Link

rebuild(G, clear=True)

Rebuilds the internal structure based on the provided graph and its associated data.

This method processes the input graph and reinitializes the internal structure. If the clear flag is set to True, any existing data in the internal structure is cleared before rebuilding. The method extracts node attributes from the graph, transforms them into a numpy array, and utilizes this array to rebuild the internal representation from the given graph.

Parameters:

  • G (PointCloudGraph) –

    A graph structure that contains nodes with the attribute 'quotient_graph_node', which is used to rebuild the internal data structure.

  • clear (bool, default: True ) –

    If True, clears the current internal structure before rebuilding. Default is True.

Source code in spectral_clustering/quotient_graph.py
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
def rebuild(self, G, clear=True):
    """Rebuilds the internal structure based on the provided graph and its associated data.

    This method processes the input graph and reinitializes the internal structure. If the
    `clear` flag is set to True, any existing data in the internal structure is cleared
    before rebuilding. The method extracts node attributes from the graph, transforms them
    into a numpy array, and utilizes this array to rebuild the internal representation
    from the given graph.

    Parameters
    ----------
    G : spectral_clustering.point_cloud_graph.PointCloudGraph
        A graph structure that contains nodes with the attribute 'quotient_graph_node', which is
        used to rebuild the internal data structure.
    clear : bool, optional
        If ``True``, clears the current internal structure before rebuilding.
        Default is ``True``.
    """
    if clear:
        self.clear()
    labels_qg = [k for k in dict(G.nodes(data='quotient_graph_node')).values()]
    labels_qg_re = np.asarray(labels_qg)[:, np.newaxis]
    self.build_from_pointcloudgraph(G, labels_qg_re)