quotient_graph

QuotientGraph Link

QuotientGraph()

Bases: Graph

Represents a quotient graph derived from a point cloud graph.

The QuotientGraph class is designed to create a quotient graph based on the provided point cloud graph structure. The class allows for computation of graph nodes, which are regions derived from clustering in the point cloud graph, and establishes intra-class and inter-class connections with computed weights and labels. Additionally, coordinates of nodes can be derived, representing the geometric center of their respective regions.

Attributes:

point_cloud_graph (PointCloudGraph or None) –

The underlying point cloud graph associated to this quotient graph.
nodes_coordinates (ndarray or None) –

Computed coordinates of the nodes within the quotient graph.

Source code in spectral_clustering/quotient_graph.py

def __init__(self):
    super().__init__()
    # self.seed_colors = None
    # self.graph_labels_dict = None
    # self.label_count = None
    self.point_cloud_graph = None
    self.nodes_coordinates = None

build_from_pointcloudgraph Link

build_from_pointcloudgraph(G, labels_from_cluster, region_growing=True)

Build a quotient graph from a given point cloud graph.

Builds a quotient graph from a given point cloud graph based on clustering results and optionally applies a region-growing algorithm to extract connected components made of the same k-means cluster label. The resulting quotient graph nodes represent connected components, while edges represent interactions between these components.

Parameters:

G (PointCloudGraph) –

The input point cloud graph where nodes represent points and edges may have associated weights describing relationships between points.
labels_from_cluster (ndarray) –

An array of labels from the initial clustering (e.g., k-means clustering) where each element corresponds to the cluster label of a graph node.
region_growing (bool, default: True ) –

A flag indicating whether to use a region-growing algorithm to group nodes into connected components based on their k-means cluster labels. If False, connected components are not grown, and nodes are assigned to clusters based on initial cluster labels directly. Default is True.

Source code in spectral_clustering/quotient_graph.py

def build_from_pointcloudgraph(self, G, labels_from_cluster, region_growing=True):
    """Build a quotient graph from a given point cloud graph.

    Builds a quotient graph from a given point cloud graph based on clustering results
    and optionally applies a region-growing algorithm to extract connected components
    made of the same k-means cluster label. The resulting quotient graph nodes represent
    connected components, while edges represent interactions between these components.

    Parameters
    ----------
    G : spectral_clustering.point_cloud_graph.PointCloudGraph
        The input point cloud graph where nodes represent points and edges may have
        associated weights describing relationships between points.
    labels_from_cluster : numpy.ndarray
        An array of labels from the initial clustering (e.g., k-means clustering) where
        each element corresponds to the cluster label of a graph node.
    region_growing : bool, optional
        A flag indicating whether to use a region-growing algorithm to group nodes into
        connected components based on their k-means cluster labels. If False, connected
        components are not grown, and nodes are assigned to clusters based on initial
        cluster labels directly.
        Default is ``True``.
    """

    kmeans_labels = labels_from_cluster
    connected_component_labels = np.zeros(kmeans_labels.shape, dtype=int)
    current_cc_size = 0
    connected_component_size = []
    seed_kmeans_labels = []
    # This line has been added to obtain particular colors when the initial clustering is done with four clusters
    # especially kmeans.
    lablist = list(kmeans_labels[:])
    my_count = pd.Series(lablist).value_counts()
    # if len(my_count) == 4:
    #     #cluster_colors = ['#0000FF', '#00FF00', '#FFFF00', '#FF0000']
    #     cluster_colors = ['glasbey_'+str(i) for i in range(len(my_count))]
    # else:
    #     jet = cm.get_cmap('jet', len(my_count))
    #     cluster_colors = jet(range(len(my_count)))
    #     list_labels = np.unique(lablist).tolist()
    seed_colors = []
    label_count = 0
    visited = np.zeros(kmeans_labels.shape, dtype=int)
    queue = []

    # Region growing algorithm to extract connected components made of the same k-means class.
    if region_growing:
        for i in range(kmeans_labels.shape[0]):

            # Find a seed to initialize the seed fill process.

            if visited[i] == 0:
                seed = i
                visited[i] = 1
                queue.append(i)
                seed_kmeans_labels.append(kmeans_labels[seed])
                seed_colors.append('glasbey_' + str(kmeans_labels[seed][0] % 256))
                # if len(my_count) == 4:
                #     seed_colors.append(cluster_colors[kmeans_labels[seed][0]])
                # else:
                #     seed_colors.append(cluster_colors[list_labels.index(kmeans_labels[seed][0])][:])
                current_cc_size = 0

                # Region growing from the specified seed.

                while len(queue) != 0:

                    current = queue.pop(0)
                    connected_component_labels[current] = label_count
                    current_cc_size += 1

                    for n in G[current]:
                        if kmeans_labels[n] == kmeans_labels[seed] and visited[n] == 0:
                            queue.append(n)
                            visited[n] = 1

                connected_component_size.append(current_cc_size)
                label_count += 1
    else:
        # To correct/finish, it doesn't work for now
        list_kmeans_labels = [kmeans_labels[i][0] for i in range(len(kmeans_labels))]
        label_count = len(list(set(list_kmeans_labels)))
        for i in range(kmeans_labels.shape[0]):
            seed_colors.append('glasbey_' + str(kmeans_labels[i][0] % 256))
            seed_kmeans_labels.append(kmeans_labels[i])

    # Create quotient graph nodes.

    self.add_nodes_from(range(label_count))

    node_kmeans_labels_values = dict(
        zip(np.asarray(self.nodes()), np.transpose(np.asarray(seed_kmeans_labels))[0]))
    nx.set_node_attributes(self, node_kmeans_labels_values, 'kmeans_labels')

    # Create quotient graph edges.

    intra_edge_weight_sum = np.zeros(label_count, dtype=np.float64)
    intra_edge_count = np.zeros(label_count, dtype=int)

    for (u, v) in G.edges:
        a = connected_component_labels[u, 0]
        b = connected_component_labels[v, 0]

        # Inter-class edge case:

        if a != b:
            if not self.has_edge(a, b):
                w = G.edges[u, v]['weight']
                self.add_edge(a, b, inter_class_edge_weight=w, inter_class_edge_number=1)
                # nx.set_edge_attributes(self, G.edges[u, v]['weight'], 'inter_class_edge_weight')
                # self.edges[a, b]['inter_class_edge_weight'] = G.edges[u, v]['weight']
                # nx.set_edge_attributes(self, 1, 'inter_class_edge_number')
                # self.edges[a, b]['inter_class_edge_number'] = 1
            else:
                self.edges[a, b]['inter_class_edge_weight'] += G.edges[u, v]['weight']
                self.edges[a, b]['inter_class_edge_number'] += 1

        # Intra-class edge case:

        else:
            intra_edge_weight_sum[a] += G.edges[u, v]['weight']
            intra_edge_count[a] += 1

    # Assign to each point cloud graph node the corresponding quotient graph node.

    node_cc = dict(
        zip(np.asarray(G.nodes()), np.transpose(np.asarray(connected_component_labels))[0]))
    nx.set_node_attributes(G, node_cc, 'quotient_graph_node')

    nx.set_node_attributes(self, dict(
        zip(np.asarray(self.nodes()), np.transpose(np.asarray(connected_component_size)))),
                           'intra_class_node_number')

    nx.set_node_attributes(self, dict(
        zip(np.asarray(self.nodes()), np.transpose(np.asarray(intra_edge_weight_sum)))), 'intra_class_edge_weight')

    nx.set_node_attributes(self, dict(
        zip(np.asarray(self.nodes()), np.transpose(np.asarray(intra_edge_count)))), 'intra_class_edge_number')

    # if len(my_count) == 4:
    #     nx.set_node_attributes(self, dict(
    #         zip(np.asarray(self.nodes()), np.transpose(np.asarray(seed_colors)))), 'seed_colors')
    # else:
    nx.set_node_attributes(self, dict(
        zip(np.asarray(self.nodes()), seed_colors)), 'seed_colors')

    # self.seed_colors = seed_colors
    # self.label_count = label_count
    self.point_cloud_graph = G

compute_direction_info Link

compute_direction_info(list_leaves)

Computes directional gradient information for specific nodes and edges in the graph.

This function computes a mean direction gradient for each node in the graph, based on the sum of the direction gradients of the individual nodes that belong to its quotient graph node. Additionally, it calculates the energy dot product for each edge based on the dot product of the mean direction gradients of the connected nodes. Leaves from the specified list are assigned a predefined energy dot product value.

Parameters:

list_leaves (list) –

A list of nodes that are considered as leaves in the graph. These nodes will have their associated edges assigned a constant energy dot product value.

Source code in spectral_clustering/quotient_graph.py

def compute_direction_info(self, list_leaves):
    """Computes directional gradient information for specific nodes and edges in the graph.

    This function computes a mean direction gradient for each node in the graph, based on
    the sum of the direction gradients of the individual nodes that belong to its quotient
    graph node. Additionally, it calculates the energy dot product for each edge based
    on the dot product of the mean direction gradients of the connected nodes. Leaves
    from the specified list are assigned a predefined energy dot product value.

    Parameters
    ----------
    list_leaves : list
        A list of nodes that are considered as leaves in the graph. These nodes will
        have their associated edges assigned a constant energy dot product value.
    """
    G = self.point_cloud_graph
    for l in self:
        self.nodes[l]['dir_gradient_mean'] = 0
        list_of_nodes_in_qnode = [x for x, y in G.nodes(data=True) if y['quotient_graph_node'] == l]
        for n in list_of_nodes_in_qnode:
            self.nodes[l]['dir_gradient_mean'] += G.nodes[n]['direction_gradient']
        self.nodes[l]['dir_gradient_mean'] /= len(list_of_nodes_in_qnode)

    # take all the edges and compute the dot product between the two cluster
    for e in self.edges():
        v1 = self.nodes[e[0]]['dir_gradient_mean']
        v2 = self.nodes[e[1]]['dir_gradient_mean']
        if e[0] in list_leaves or e[1] in list_leaves:
            self.edges[e]['energy_dot_product'] = 4
        else:
            dot = v1[0] * v2[0] + v1[1] * v2[1] + v1[2] * v2[2]
            energy_dot_product = 1 - dot
            self.edges[e]['energy_dot_product'] = energy_dot_product

compute_local_descriptors Link

compute_local_descriptors(method='all_qg_cluster', data='coords', neighborhood='radius', scale=10)

Compute local descriptors for the points in a graph or its quotient graph nodes based on various methods, data types, and neighborhood definitions.

This method calculates geometric and structural descriptors such as planarity, linearity, scattering, curvature, and eigenvalues using covariance matrices derived from either the point coordinates or gradient vectors. Depending on the method and data type, the computation is either performed for individual points or for clusters of points belonging to quotient graph nodes.

Parameters:

method (str, default: 'all_qg_cluster' ) –
Specifies the type of computation to perform. Default is 'all_qg_cluster'. Options include:
- 'all_qg_cluster': Compute descriptors for quotient graph nodes by aggregating information from all points in a node.
- 'each_point': Compute descriptors for individual points based on their direct neighborhoods.
data (str, default: 'coords' ) –
Specifies the type of data on which descriptors are computed. Default is 'coords'. Options include:
- 'coords': Use the coordinates of points.
- 'gradient_vector_fiedler': Use the gradient vector of the Fiedler eigenvector.
neighborhood (str, default: 'radius' ) –
Specifies the neighborhood definition when computing descriptors for individual points. Default is 'radius'. Options include:
- 'radius': Neighborhood defined by points within a given radius.
- 'pointcloudgraph': Predefined graph-based neighborhood.
scale (int, default: 10 ) –

When 'neighborhood' is set to 'radius', this parameter determines the radius to define the neighborhood. Default is 10.

Raises:

ValueError –

If an invalid combination of method, data, or neighborhood is provided.
KeyError –

If required attributes are missing in the point cloud graph.

Returns:

None –

This method updates nodes in the graph with their computed descriptors as attributes.

Source code in spectral_clustering/quotient_graph.py

def compute_local_descriptors(self, method='all_qg_cluster', data='coords', neighborhood='radius', scale=10):
    """
    Compute local descriptors for the points in a graph or its quotient graph nodes based on various methods, data types, and neighborhood definitions.

    This method calculates geometric and structural descriptors such as planarity, linearity, scattering, curvature, and eigenvalues
    using covariance matrices derived from either the point coordinates or gradient vectors. Depending on the method and data type, the
    computation is either performed for individual points or for clusters of points belonging to quotient graph nodes.

    Parameters
    ----------
    method : str, optional
        Specifies the type of computation to perform.
        Default is ``'all_qg_cluster'``.
        Options include:

        - ``'all_qg_cluster'``: Compute descriptors for quotient graph nodes by aggregating information from all points
          in a node.
        - ``'each_point'``: Compute descriptors for individual points based on their direct neighborhoods.
    data : str, optional
        Specifies the type of data on which descriptors are computed.
        Default is 'coords'.
        Options include:

        - ``'coords'``: Use the coordinates of points.
        - ``'gradient_vector_fiedler'``: Use the gradient vector of the Fiedler eigenvector.
    neighborhood : str, optional
        Specifies the neighborhood definition when computing descriptors for individual points.
        Default is 'radius'.
        Options include:

          - ``'radius'``: Neighborhood defined by points within a given radius.
          - ``'pointcloudgraph'``: Predefined graph-based neighborhood.
    scale : int, optional
        When 'neighborhood' is set to 'radius', this parameter determines the radius to define the neighborhood.
        Default is ``10``.

    Raises
    ------
    ValueError
        If an invalid combination of `method`, `data`, or `neighborhood` is provided.
    KeyError
        If required attributes are missing in the point cloud graph.

    Returns
    -------
    None
        This method updates nodes in the graph with their computed descriptors as attributes.
    """
    G = self.point_cloud_graph
    # compute the descriptor for all the point in a quotient graph node
    if method == 'all_qg_cluster' and data == 'coords':
        for qnode in self:
            list_of_nodes_in_qnode = [x for x, y in G.nodes(data=True) if y['quotient_graph_node'] == qnode]
            mat = np.zeros((len(list_of_nodes_in_qnode), 3))
            i = 0
            for n in list_of_nodes_in_qnode:
                mat[i] = G.nodes_coords[n]
                i += 1
            covmat = np.cov(np.transpose(mat))

            eigenval, eigenvec = np.linalg.eigh(covmat)

            self.nodes[qnode]['planarity'] = (eigenval[1] - eigenval[0]) / eigenval[2]
            self.nodes[qnode]['planarity2'] = (eigenval[1] - eigenval[0]) / eigenval[1]
            self.nodes[qnode]['linearity'] = (eigenval[2] - eigenval[1]) / eigenval[2]
            self.nodes[qnode]['scattering'] = eigenval[0] / eigenval[2]
            self.nodes[qnode]['curvature_eig'] = eigenval[0] / (eigenval[0] + eigenval[2] + eigenval[1])
            self.nodes[qnode]['lambda1'] = eigenval[2]
            self.nodes[qnode]['lambda2'] = eigenval[1]
            self.nodes[qnode]['lambda3'] = eigenval[0]

    if method == 'each_point' and data == 'coords' and neighborhood == 'radius':
        # compute a new pointcloudgraph with the neighborhood wanted
        NewG = sgk.create_riemannian_graph(G.pcd, method=neighborhood, nearest_neighbors=neighborhood, radius=scale)

    if method == 'each_point' and data == 'coords' and neighborhood == 'pointcloudgraph':
        # compute each descriptor for each point of the point cloud and its neighborhood
        for p in G:
            mat = np.zeros((len(G[p]), 3))
            i = 0
            for n in G[p]:
                mat[i] = G.nodes_coords[n]
                i += 1
            covmat = np.cov(np.transpose(mat))

            eigenval, eigenvec = np.linalg.eigh(covmat)
            G.nodes[p]['planarity'] = (eigenval[1] - eigenval[0]) / eigenval[2]
            G.nodes[p]['linearity'] = (eigenval[2] - eigenval[1]) / eigenval[2]
            G.nodes[p]['scattering'] = eigenval[0] / eigenval[2]
            G.nodes[p]['curvature_eig'] = eigenval[0] / (eigenval[0] + eigenval[2] + eigenval[1])
        # compute mean value for the qnode
        for qnode in self:
            list_of_nodes_in_qnode = [x for x, y in G.nodes(data=True) if y['quotient_graph_node'] == qnode]
            self.nodes[qnode]['planarity'] = 0
            self.nodes[qnode]['linearity'] = 0
            self.nodes[qnode]['scattering'] = 0
            self.nodes[qnode]['curvature_eig'] = 0

            for n in list_of_nodes_in_qnode:
                self.nodes[qnode]['planarity'] += G.nodes[n]['planarity']
                self.nodes[qnode]['linearity'] += G.nodes[n]['linearity']
                self.nodes[qnode]['scattering'] += G.nodes[n]['scattering']
                self.nodes[qnode]['curvature_eig'] += G.nodes[n]['curvature_eig']
            self.nodes[qnode]['planarity'] /= len(list_of_nodes_in_qnode)
            self.nodes[qnode]['linearity'] /= len(list_of_nodes_in_qnode)
            self.nodes[qnode]['scattering'] /= len(list_of_nodes_in_qnode)
            self.nodes[qnode]['curvature_eig'] /= len(list_of_nodes_in_qnode)

    if method == 'all_qg_cluster' and data == 'gradient_vector_fiedler':
        for qnode in self:
            list_of_nodes_in_qnode = [x for x, y in G.nodes(data=True) if y['quotient_graph_node'] == qnode]
            mat = np.zeros((len(list_of_nodes_in_qnode), 3))
            i = 0
            for n in list_of_nodes_in_qnode:
                mat[i] = G.nodes[n]['direction_gradient']
                i += 1
            covmat = np.cov(np.transpose(mat))

            eigenval, eigenvec = np.linalg.eigh(covmat)

            self.nodes[qnode]['max_eigenval'] = eigenval[2]
            self.nodes[qnode]['planarity'] = (eigenval[1] - eigenval[0]) / eigenval[2]
            self.nodes[qnode]['linearity'] = (eigenval[2] - eigenval[1]) / eigenval[2]
            self.nodes[qnode]['scattering'] = eigenval[0] / eigenval[2]
            self.nodes[qnode]['curvature_eig'] = eigenval[0] / (eigenval[0] + eigenval[2] + eigenval[1])

compute_nodes_coordinates Link

compute_nodes_coordinates()

Compute and assign average 3D coordinates to nodes in the quotient graph.

This function calculates the centroid coordinates for each node in the quotient graph by computing the mean of the coordinates of the corresponding point cloud graph nodes. These coordinates are subsequently stored as an attribute for each node in the quotient graph. It provides a mechanism for visualizing the quotient graph in 3D space.

Source code in spectral_clustering/quotient_graph.py

def compute_nodes_coordinates(self):
    """Compute and assign average 3D coordinates to nodes in the quotient graph.

    This function calculates the centroid coordinates for each node in the quotient
    graph by computing the mean of the coordinates of the corresponding point cloud
    graph nodes. These coordinates are subsequently stored as an attribute for each
    node in the quotient graph. It provides a mechanism for visualizing the quotient
    graph in 3D space.
    """

    G = self.point_cloud_graph

    # Calcul de coordonnées moyennes pour chaque noeud du graphe quotient, dans le but d'afficher en 3D le graphe.
    new_classif = np.asarray(list((dict(G.nodes(data='quotient_graph_node')).values())))
    nodes_coords_moy = np.array([np.mean(G.nodes_coords[new_classif == n], axis=0) for n in self.nodes])

    # import scipy.ndimage as nd
    # nodes_coords_moy = np.transpose([nd.mean(G.nodes_coords[:,k],
    #                                          new_classif,
    #                                          index=list(self.nodes)) for k in range(3)])
    #
    # nodes_coords_moy = np.zeros((len(self), 3))
    # for j, n in enumerate(self.nodes):
    #     nodes_coords_moy[j] = np.mean(G.nodes_coords[new_classif == n], axis=0)
    #
    #     X = []
    #     Y = []
    #     Z = []
    #     for i in range(pcd_attribute.shape[0]):
    #         if sorted_pcd_attribute_by_quotient_graph_attribute[i, 3] == n:
    #             X.append(sorted_pcd_attribute_by_quotient_graph_attribute[i, 0])
    #             Y.append(sorted_pcd_attribute_by_quotient_graph_attribute[i, 1])
    #             Z.append(sorted_pcd_attribute_by_quotient_graph_attribute[i, 2])
    #     nodes_coords_moy[j, 0] = np.mean(X)
    #     nodes_coords_moy[j, 1] = np.mean(Y)
    #     nodes_coords_moy[j, 2] = np.mean(Z)

    # new_classif = new_classif[:, np.newaxis]
    # pcd_attribute = np.concatenate([G.nodes_coords, new_classif], axis=1)
    # sorted_pcd_attribute_by_quotient_graph_attribute = pcd_attribute[np.argsort(pcd_attribute[:, 3])]
    # nodes_coords_moy = np.zeros((len(self), 3))
    # j = 0
    # for n in self.nodes:
    #     X = []
    #     Y = []
    #     Z = []
    #     for i in range(pcd_attribute.shape[0]):
    #         if sorted_pcd_attribute_by_quotient_graph_attribute[i, 3] == n:
    #             X.append(sorted_pcd_attribute_by_quotient_graph_attribute[i, 0])
    #             Y.append(sorted_pcd_attribute_by_quotient_graph_attribute[i, 1])
    #             Z.append(sorted_pcd_attribute_by_quotient_graph_attribute[i, 2])
    #     nodes_coords_moy[j, 0] = np.mean(X)
    #     nodes_coords_moy[j, 1] = np.mean(Y)
    #     nodes_coords_moy[j, 2] = np.mean(Z)
    #     j += 1
    self.nodes_coordinates = nodes_coords_moy

    for nodes in self.nodes:
        self.nodes[nodes]['centroide_coordinates'] = np.nan
    i = 0
    for nodes in self.nodes:
        self.nodes[nodes]['centroide_coordinates'] = nodes_coords_moy[i, :]
        i += 1

compute_quotientgraph_metadata_on_a_node_interclass Link

compute_quotientgraph_metadata_on_a_node_interclass(node)

Computes metadata for a specific node in the quotient graph while focusing on interclass relationships.

This function analyzes the relationships of the node's neighbors that belong to different classes within the quotient graph. It returns a dictionary that counts the occurrences of neighboring nodes belonging to other classes relative to the specified node.

Parameters:

node (any) –

The node in the quotient graph for which the interclass metadata should be computed.

Returns:

dict –

A dictionary where the keys are the different classes (quotient graph nodes) and the values are the count of neighboring nodes belonging to those classes.

Notes

The function uses the 'quotient_graph_node' attribute of the nodes in the graph to determine the class of each node. Nodes with the same 'quotient_graph_node' value are considered to belong to the same class.

Source code in spectral_clustering/quotient_graph.py

def compute_quotientgraph_metadata_on_a_node_interclass(self, node):
    """Computes metadata for a specific node in the quotient graph while focusing on interclass relationships.

    This function analyzes the relationships of the node's neighbors that belong to different classes within
    the quotient graph. It returns a dictionary that counts the occurrences of neighboring nodes belonging
    to other classes relative to the specified node.

    Parameters
    ----------
    node : any
        The node in the quotient graph for which the interclass metadata should be computed.

    Returns
    -------
    dict
        A dictionary where the keys are the different classes (quotient graph nodes)
        and the values are the count of neighboring nodes belonging to those classes.

    Notes
    -----
    The function uses the 'quotient_graph_node' attribute of the nodes in the graph to determine
    the class of each node. Nodes with the same 'quotient_graph_node' value are considered to
    belong to the same class.
    """
    G = self.point_cloud_graph
    count = {}
    for c in self[node]:
        count[c] = 0

    list_of_nodes_G = [x for x, y in G.nodes(data=True) if y['quotient_graph_node'] == node]
    for ng in list_of_nodes_G:
        for neing in G[ng]:
            if G.nodes[ng]['quotient_graph_node'] != G.nodes[neing]['quotient_graph_node']:
                count[G.nodes[neing]['quotient_graph_node']] += 1

    return count

compute_silhouette Link

compute_silhouette(method='topological', data='direction_gradient_vector_fiedler')

Compute the silhouette score for nodes in a point cloud graph based on clustering.

This function calculates silhouette scores for each node in a point cloud graph based either on the "all_qg_cluster" or "topological" approach. The silhouette score is a measure of how similar a node's association to its own cluster is versus its association to neighboring, adjacent clusters. The results are stored in the graph for individual nodes and quotient graph nodes.

Parameters:

method (str, default: 'topological' ) –

The method used to compute the silhouette score. Acceptable values are 'all_qg_cluster' for traditional silhouette computation or 'topological' for a more nuanced topology-based computation. Default is 'topological'.
data (str, default: 'direction_gradient_vector_fiedler' ) –

Indicates the data used for silhouette computation. Options are 'direction_gradient_vector_fiedler' or 'norm_gradient_vector_fiedler', determining the specific gradient vector field used in the calculation. Default is 'direction_gradient_vector_fiedler'.

Notes

For the 'all_qg_cluster' method, silhouette scores are based on the classical approach available in scikit-learn.
For the 'topological' method, silhouette calculation considers the dissimilarities with adjacent clusters only, leveraging adjacency matrices for the computations.
The computed silhouette scores are stored directly as node attributes in the underlying graph.
Quotient graph nodes' silhouette scores are computed as the average of individual nodes within the respective cluster.

Source code in spectral_clustering/quotient_graph.py

def compute_silhouette(self, method='topological', data='direction_gradient_vector_fiedler'):
    """Compute the silhouette score for nodes in a point cloud graph based on clustering.

    This function calculates silhouette scores for each node in a point cloud graph
    based either on the "all_qg_cluster" or "topological" approach. The silhouette
    score is a measure of how similar a node's association to its own cluster is
    versus its association to neighboring, adjacent clusters. The results are stored
    in the graph for individual nodes and quotient graph nodes.

    Parameters
    ----------
    method : str, optional
        The method used to compute the silhouette score. Acceptable values are
        'all_qg_cluster' for traditional silhouette computation or 'topological'
        for a more nuanced topology-based computation. Default is 'topological'.
    data : str, optional
        Indicates the data used for silhouette computation. Options are
        'direction_gradient_vector_fiedler' or 'norm_gradient_vector_fiedler',
        determining the specific gradient vector field used in the calculation.
        Default is 'direction_gradient_vector_fiedler'.

    Notes
    -----
    - For the 'all_qg_cluster' method, silhouette scores are based on the classical
      approach available in scikit-learn.
    - For the 'topological' method, silhouette calculation considers the dissimilarities
      with adjacent clusters only, leveraging adjacency matrices for the computations.
    - The computed silhouette scores are stored directly as node attributes in the
      underlying graph.
    - Quotient graph nodes' silhouette scores are computed as the average of individual
      nodes within the respective cluster.
    """
    G = self.point_cloud_graph
    label = []
    for node in G:
        label.append(G.nodes[node]['quotient_graph_node'])

    if method == 'all_qg_cluster':
        # this method is based on the classical way to compute silhouette coefficients in the sci-kit learn module

        if data == 'direction_gradient_vector_fiedler':
            sil = sk.metrics.silhouette_samples(G.direction_gradient_on_fiedler_scaled, label, metric='euclidean')
        elif data == 'norm_gradient_vector_fiedler':
            sil = sk.metrics.silhouette_samples(G.gradient_on_fiedler, label, metric='euclidean')

    if method == 'topological':
        import scipy.ndimage as nd
        from sklearn.metrics import pairwise_distances_chunked
        # this method aims at comparing the cluster A of the point considered with the adjacent clusters of A.
        X = []

        if data == 'direction_gradient_vector_fiedler':
            X = G.direction_gradient_on_fiedler_scaled
        elif data == 'norm_gradient_vector_fiedler':
            X = G.gradient_on_fiedler

        labels = np.array(label)

        label_list = np.array(self.nodes)

        label_adjacency = nx.adjacency_matrix(self).todense().astype('float64')
        label_adjacency -= 1 * np.eye(len(label_adjacency)).astype('float64')

        label_index = np.array([np.where(label_list == l)[0][0] for l in labels])

        def mean_by_label(D_chunk, start):
            """Computes the mean of data points by label, excluding contributions from the current label.

            This function calculates the per-label mean of a given chunk of data points, while
            excluding contributions from the label currently being processed. It operates on an
            input chunk of data and associated labels, aggregating information based on the specified
            index labels.

            Parameters
            ----------
            D_chunk : np.ndarray
                A chunk of multi-dimensional data points split from the full dataset, where each row
                represents a data point.
            start : int
                The start index for the current chunk of data within the full dataset. Used to
                correctly align the chunk with the corresponding labels.

            Returns
            -------
            np.ndarray
                An array representing the mean value of data points grouped by label, excluding
                contributions from the current label. The size of the array matches the provided
                `D_chunk`.
            """
            label_mean = []
            chunk_labels = labels[start:start + len(D_chunk)]
            for line, label in zip(D_chunk, chunk_labels):
                label_sum = nd.sum(line, labels, index=label_list)
                label_count = nd.sum(np.ones_like(labels), labels, index=label_list)
                label_count -= label_list == label
                label_mean += [label_sum / label_count]
            return np.array(label_mean)

        gen = pairwise_distances_chunked(X, reduce_func=mean_by_label)

        start = 0
        silhouette = []
        for label_mean_dissimilarity in gen:
            chunk_label_index = label_index[start:start + len(label_mean_dissimilarity)]

            adjacent_label_mean_dissimilarity = np.copy(label_mean_dissimilarity)
            adjacent_label_mean_dissimilarity[label_adjacency[chunk_label_index] != 1] = np.nan

            self_label_mean_dissimilarity = np.copy(label_mean_dissimilarity)
            self_label_mean_dissimilarity[label_adjacency[chunk_label_index] != -1] = np.nan

            intra = np.nanmin(self_label_mean_dissimilarity, axis=1)
            inter = np.nanmin(adjacent_label_mean_dissimilarity, axis=1)

            silhouette += list((inter - intra) / np.maximum(inter, intra))
            start += len(label_mean_dissimilarity)

        sil = np.array(silhouette)

    i = 0
    for node in G:
        G.nodes[node]['silhouette'] = sil[i]
        i += 1
    for qnode in self:
        list_of_nodes_in_qnode = [x for x, y in G.nodes(data=True) if y['quotient_graph_node'] == qnode]
        self.nodes[qnode]['silhouette'] = 0
        for n in list_of_nodes_in_qnode:
            self.nodes[qnode]['silhouette'] += G.nodes[n]['silhouette']
        self.nodes[qnode]['silhouette'] /= len(list_of_nodes_in_qnode)

count_local_extremum_of_Fiedler Link

count_local_extremum_of_Fiedler()

Counts the local extrema of the Fiedler vector in the graph associated with each node.

The method utilizes a point cloud graph and updates the associated nodes in the graph with the total number of local extrema of the Fiedler vector belonging to their corresponding quotient graph nodes.

Source code in spectral_clustering/quotient_graph.py

def count_local_extremum_of_Fiedler(self):
    """Counts the local extrema of the Fiedler vector in the graph associated with each node.

    The method utilizes a point cloud graph and updates the associated nodes
    in the graph with the total number of local extrema of the Fiedler vector
    belonging to their corresponding quotient graph nodes.
    """
    G = self.point_cloud_graph
    G.find_local_extremum_of_Fiedler()
    for c in self.nodes():
        self.nodes[c]['number_of_local_Fiedler_extremum'] = 0
        list_of_nodes_r = [x for x, y in G.nodes(data=True) if y['quotient_graph_node'] == c]
        for pt in list_of_nodes_r:
            self.nodes[c]['number_of_local_Fiedler_extremum'] += G.nodes[pt]['extrem_local_Fiedler']

delete_empty_edges_and_nodes Link

delete_empty_edges_and_nodes()

Delete edges from the quotient graph that do not represent any edges in the distance-based graph anymore.

Delete nodes from the quotient graph that do not represent any nodes of the distance-based graph. Update of the topological structure by removal only.

Source code in spectral_clustering/quotient_graph.py

def delete_empty_edges_and_nodes(self):
    """Delete edges from the quotient graph that do not represent any edges in the distance-based graph anymore.

    Delete nodes from the quotient graph that do not represent any nodes of the distance-based graph.
    Update of the topological structure by removal only.
    """
    to_remove = []
    list = [e for e in self.edges]
    for e in list:
        if self.edges[e[0], e[1]]['inter_class_edge_number'] == 0:
            to_remove.append(e)
    print(to_remove)
    self.remove_edges_from(to_remove)
    to_remove = []
    for n in self.nodes:
        if self.nodes[n]['intra_class_node_number'] == 0:
            to_remove.append(n)
    self.remove_nodes_from(to_remove)
    print(to_remove)

ponderate_with_coordinates_of_points Link

ponderate_with_coordinates_of_points()

Computes the weights of edges based on the centroid coordinates of their connected nodes.

This method first calculates the coordinates of the centroids for all nodes and then uses these coordinates to calculate the distance between connected nodes in the graph. The calculated distance is stored as an additional weight attribute for each edge.

Attributes:

self.nodes (dict) –

A dictionary representing graph nodes where keys are node identifiers and values are associated attributes, including 'centroide_coordinates'.
self.edges (dict) –

A dictionary representing graph edges where keys are tuples of connected node identifiers and values are associated attributes, including the new 'distance_centroides'.

Notes

The edge weight 'distance_centroides' is calculated as the Euclidean distance between the centroid coordinates of two nodes connected by the edge.

Source code in spectral_clustering/quotient_graph.py

def ponderate_with_coordinates_of_points(self):
    """Computes the weights of edges based on the centroid coordinates of their connected nodes.

    This method first calculates the coordinates of the centroids for all nodes and then uses these
    coordinates to calculate the distance between connected nodes in the graph. The calculated
    distance is stored as an additional weight attribute for each edge.

    Attributes
    ----------
    self.nodes : dict
        A dictionary representing graph nodes where keys are node identifiers and values are
        associated attributes, including 'centroide_coordinates'.
    self.edges : dict
        A dictionary representing graph edges where keys are tuples of connected node identifiers
        and values are associated attributes, including the new 'distance_centroides'.

    Notes
    -----
    The edge weight 'distance_centroides' is calculated as the Euclidean distance between the
    centroid coordinates of two nodes connected by the edge.
    """
    self.compute_nodes_coordinates()
    # Calcul des poids
    for e in self.edges:
        n1 = e[0]
        n2 = e[1]
        c1 = self.nodes[n1]['centroide_coordinates']
        c2 = self.nodes[n2]['centroide_coordinates']
        dist = np.linalg.norm(c1 - c2)
        self.edges[e]['distance_centroides'] = dist

rebuild Link

rebuild(G, clear=True)

Rebuilds the internal structure based on the provided graph and its associated data.

This method processes the input graph and reinitializes the internal structure. If the clear flag is set to True, any existing data in the internal structure is cleared before rebuilding. The method extracts node attributes from the graph, transforms them into a numpy array, and utilizes this array to rebuild the internal representation from the given graph.

Parameters:

G (PointCloudGraph) –

A graph structure that contains nodes with the attribute 'quotient_graph_node', which is used to rebuild the internal data structure.
clear (bool, default: True ) –

If True, clears the current internal structure before rebuilding. Default is True.

Source code in spectral_clustering/quotient_graph.py

def rebuild(self, G, clear=True):
    """Rebuilds the internal structure based on the provided graph and its associated data.

    This method processes the input graph and reinitializes the internal structure. If the
    `clear` flag is set to True, any existing data in the internal structure is cleared
    before rebuilding. The method extracts node attributes from the graph, transforms them
    into a numpy array, and utilizes this array to rebuild the internal representation
    from the given graph.

    Parameters
    ----------
    G : spectral_clustering.point_cloud_graph.PointCloudGraph
        A graph structure that contains nodes with the attribute 'quotient_graph_node', which is
        used to rebuild the internal data structure.
    clear : bool, optional
        If ``True``, clears the current internal structure before rebuilding.
        Default is ``True``.
    """
    if clear:
        self.clear()
    labels_qg = [k for k in dict(G.nodes(data='quotient_graph_node')).values()]
    labels_qg_re = np.asarray(labels_qg)[:, np.newaxis]
    self.build_from_pointcloudgraph(G, labels_qg_re)