cflg code documentation
cflg
- class Edge(*, start_node: Node, end_node: Node, timestamp: int)
Bases: BaseModel
Edge class representing a connection between two nodes with a timestamp.
- get_max_node()
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- model_fields: ClassVar[dict[str, FieldInfo]] = {'end_node': FieldInfo(annotation=Node, required=True), 'start_node': FieldInfo(annotation=Node, required=True), 'timestamp': FieldInfo(annotation=int, required=True)}
Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo].
This replaces Model.__fields__ from Pydantic V1.
- class Node(*, number: int)
Bases: BaseModel
Node class with a numerical value for comparison operations.
- Parameters:
number (int) – The node's numerical value, used for comparison operations.
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class SelectApproach(s_node1_number: int = None, s_node2_number: int = None)
Bases: object
Class to select a subgraph from a given graph using different sampling approaches.
- start_node2_number
An additional starting node number for the snowball sampling method.
- Type:
Optional[int]
- __call__(graph: StaticGraph)
Execute the selected sampling method on the graph.
Chooses between snowball sampling and random vertex sampling based on the provided starting nodes.
- Parameters:
graph (StaticGraph) – The original graph from which a subgraph is to be sampled.
- Returns:
The sampled subgraph.
- Return type:
StaticGraph
- static random_selected_vertices(graph: StaticGraph) → StaticGraph
Perform random vertex sampling on the graph.
Randomly selects a specified number of vertices and their associated edges to create a subgraph.
- Parameters:
graph (StaticGraph) – The original graph from which a subgraph is to be sampled.
- Returns:
The sampled subgraph.
- Return type:
StaticGraph
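Random vertex sampling as described above can be sketched as follows. This is a minimal illustration on a plain adjacency dictionary, not the library's implementation: the function name, the `n_vertices` and `seed` parameters, and the choice to keep only edges whose two endpoints were both selected are assumptions.

```python
import random


def random_vertex_sample(adjacency, n_vertices, seed=None):
    """Return the subgraph induced by n_vertices randomly chosen nodes.

    n_vertices and seed are hypothetical parameters used for illustration.
    """
    rng = random.Random(seed)
    k = min(n_vertices, len(adjacency))
    chosen = set(rng.sample(sorted(adjacency), k))
    # Keep only edges whose two endpoints were both chosen.
    return {
        node: {nbr: list(ts) for nbr, ts in adjacency[node].items() if nbr in chosen}
        for node in chosen
    }


graph = {
    1: {2: [10]},
    2: {1: [10], 3: [20]},
    3: {2: [20], 4: [30]},
    4: {3: [30]},
}
sample = random_vertex_sample(graph, n_vertices=3, seed=0)
```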
- snowball_sample(graph: StaticGraph) → StaticGraph
Perform snowball sampling on the graph.
Starting from one or two nodes, it expands to include neighbors of these nodes, up to a specified limit.
- Parameters:
graph (StaticGraph) – The original graph from which a subgraph is to be sampled.
- Returns:
The sampled subgraph.
- Return type:
StaticGraph
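The snowball expansion described above can be sketched as a breadth-first traversal from the seed nodes that stops once a node limit is reached. The function below is an illustration under stated assumptions, not the library's code: the `seeds` and `max_nodes` parameters are hypothetical, and the result is the subgraph induced by the visited nodes.

```python
from collections import deque


def snowball_sample(adjacency, seeds, max_nodes):
    """Expand from the seed nodes neighbor by neighbor, up to max_nodes nodes."""
    selected = []
    seen = set()
    queue = deque()
    for seed in seeds:
        if seed in adjacency and seed not in seen:
            seen.add(seed)
            queue.append(seed)
    while queue and len(selected) < max_nodes:
        node = queue.popleft()
        selected.append(node)
        for neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    keep = set(selected)
    # Induced subgraph: drop edges that lead outside the selected nodes.
    return {
        node: {nbr: list(ts) for nbr, ts in adjacency[node].items() if nbr in keep}
        for node in selected
    }


path_graph = {
    1: {2: [1]},
    2: {1: [1], 3: [2]},
    3: {2: [2], 4: [3]},
    4: {3: [3], 5: [4]},
    5: {4: [4]},
}
snowball = snowball_sample(path_graph, seeds=[1], max_nodes=3)
```

With a single seed the traversal is an ordinary breadth-first search; passing two seeds simply starts the queue with both of them.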
- class StaticGraph
Bases: object
Class representing a static graph with nodes and edges.
The graph is represented as an adjacency dictionary of dictionaries. Each node is a key in the outer dictionary, and its value is another dictionary containing adjacent nodes as keys and a list of timestamps as values.
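To make the adjacency layout concrete, here is a small sketch that builds such a dictionary of dictionaries from `(start, end, timestamp)` triples. It does not use the library's classes. The helper name is hypothetical, and storing each edge in both directions is an assumption about how the graph treats direction.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build {node: {neighbor: [timestamps]}} from (start, end, timestamp) triples."""
    adjacency = defaultdict(dict)
    for start, end, timestamp in edges:
        # Assumption: each edge is recorded in both directions.
        adjacency[start].setdefault(end, []).append(timestamp)
        if start != end:
            adjacency[end].setdefault(start, []).append(timestamp)
    return {node: dict(neighbors) for node, neighbors in adjacency.items()}


adjacency = build_adjacency([(1, 2, 10), (2, 3, 20), (1, 2, 30)])
```

Repeated contacts between the same pair of nodes end up as several timestamps in one list rather than as parallel edges.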
- largest_connected_component
Largest connected component of the graph.
- Type:
Optional[StaticGraph]
- add_edge(edge: Edge) → int
Add a new edge to the graph. If the nodes of the edge do not exist, they are added to the graph.
- adjacency_dict_of_dicts: dict[int, dict[int, list[int]]] = None
- assortative_factor() → float
Calculate the assortativity coefficient of the graph.
Assortativity measures the similarity of connections in the graph with respect to the node degree. It indicates whether high-degree nodes tend to connect with other high-degree nodes (assortative mixing) or low-degree nodes (disassortative mixing). A positive assortativity coefficient indicates a preference for high-degree nodes to attach to other high-degree nodes, while a negative coefficient indicates the opposite.
- Returns:
The assortativity coefficient of the graph.
- Return type:
float
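One common way to compute degree assortativity is as the Pearson correlation between the degrees at the two ends of every edge. The sketch below follows that definition on a plain adjacency dictionary. Whether `assortative_factor()` uses exactly this formula is an assumption, and the function name is hypothetical.

```python
from math import sqrt


def degree_assortativity(adjacency):
    """Pearson correlation of the degrees at both ends of each edge."""
    degree = {node: len(neighbors) for node, neighbors in adjacency.items()}
    xs, ys = [], []
    # Each undirected edge contributes once per direction.
    for node, neighbors in adjacency.items():
        for neighbor in neighbors:
            xs.append(degree[node])
            ys.append(degree[neighbor])
    n = len(xs)
    if n == 0:
        return float("nan")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when all degrees are equal
    return cov / sqrt(var_x * var_y)


# A star: the hub (degree 3) connects only to leaves (degree 1).
star = {0: {1: [1], 2: [1], 3: [1]}, 1: {0: [1]}, 2: {0: [1]}, 3: {0: [1]}}
star_assortativity = degree_assortativity(star)
```

The star graph is the extreme disassortative case: every edge joins the highest-degree node to a lowest-degree node, so the coefficient is -1.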
- average_cluster_factor() → float
Calculate the average clustering coefficient for the largest connected component in the graph.
The clustering coefficient for a vertex quantifies how close its neighbors are to being a complete graph (clique).
- Returns:
The average clustering coefficient for the largest connected component.
- Return type:
float
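The clustering coefficient of a vertex with k neighbors is the number of edges among those neighbors divided by k(k-1)/2. A minimal sketch on a plain adjacency dictionary follows. The function names are hypothetical, and treating vertices with fewer than two neighbors as having coefficient 0 is an assumption.

```python
from itertools import combinations


def local_clustering(adjacency, node):
    """Fraction of neighbor pairs of `node` that are themselves connected."""
    neighbors = [nbr for nbr in adjacency[node] if nbr != node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(neighbors, 2) if b in adjacency[a])
    return 2 * links / (k * (k - 1))


def average_clustering(adjacency):
    """Mean of the local clustering coefficients over all nodes."""
    if not adjacency:
        return 0.0
    return sum(local_clustering(adjacency, node) for node in adjacency) / len(adjacency)


# Triangle 1-2-3 with a pendant node 4 attached to node 3.
triangle_with_tail = {
    1: {2: [1], 3: [1]},
    2: {1: [1], 3: [1]},
    3: {1: [1], 2: [1], 4: [1]},
    4: {3: [1]},
}
avg_clustering = average_clustering(triangle_with_tail)
```

Here nodes 1 and 2 have coefficient 1, node 3 has 1/3, and node 4 has 0, so the average is 7/12.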
- density() → float
Calculate and return the density of the graph.
Density is defined as the ratio of the number of edges to the maximum possible number of edges in a graph with the same number of vertices.
- Return type:
float
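For an undirected simple graph with n vertices and m edges, the definition above gives density = m / (n(n-1)/2). The sketch below applies it to a plain adjacency dictionary. The function name is hypothetical, and counting a pair with several timestamps as one edge is an assumption.

```python
def graph_density(adjacency):
    """Ratio of existing edges to the n * (n - 1) / 2 possible edges."""
    n = len(adjacency)
    if n < 2:
        return 0.0
    # Every undirected edge appears in two neighbor dictionaries.
    m = sum(len(neighbors) for neighbors in adjacency.values()) / 2
    return m / (n * (n - 1) / 2)


triangle = {1: {2: [1], 3: [1]}, 2: {1: [1], 3: [1]}, 3: {1: [1], 2: [1]}}
three_node_path = {1: {2: [1]}, 2: {1: [1], 3: [1]}, 3: {2: [1]}}
triangle_density = graph_density(triangle)
path_density = graph_density(three_node_path)
```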
- get_adjacency_dict_of_dicts() → dict
Return the adjacency dictionary of dictionaries representing the graph.
- Return type:
dict
- get_diameter(graph: StaticGraph) → int
Calculate the diameter of the graph.
The diameter is the greatest distance between any pair of vertices in the graph.
- Parameters:
graph (StaticGraph) – The graph for which the diameter is to be calculated.
- Returns:
The diameter of the graph.
- Return type:
int
- get_largest_connected_component() → StaticGraph
If not already found, find the largest weakly connected component.
- Returns:
The largest weakly connected component.
- Return type:
StaticGraph
- get_number_of_connected_components() → int
If not already found, find the number of weakly connected components.
- Returns:
Number of weakly connected components.
- Return type:
int
- get_radius(graph: StaticGraph) → int
Calculate the radius of the graph.
The radius is the minimum eccentricity of any vertex in the graph. Eccentricity of a vertex is the greatest distance between that vertex and any other vertex in the graph.
- Parameters:
graph (StaticGraph) – The graph for which the radius is to be calculated.
- Returns:
The radius of the graph.
- Return type:
int
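Diameter and radius both follow from vertex eccentricities: the diameter is the largest eccentricity and the radius the smallest. The sketch below computes them with one breadth-first search per vertex on a plain adjacency dictionary. It assumes a connected, unweighted graph, and the helper names are hypothetical. The library may compute these values differently.

```python
from collections import deque


def bfs_distances(adjacency, source):
    """Shortest-path distances (in hops) from source to every reachable node."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances


def eccentricities(adjacency):
    """Greatest distance from each node to any other node (connected graph assumed)."""
    return {node: max(bfs_distances(adjacency, node).values()) for node in adjacency}


# Path 1 - 2 - 3 - 4
four_node_path = {
    1: {2: [1]},
    2: {1: [1], 3: [1]},
    3: {2: [1], 4: [1]},
    4: {3: [1]},
}
ecc = eccentricities(four_node_path)
diameter = max(ecc.values())
radius = min(ecc.values())
```

On this path the end nodes have eccentricity 3 and the inner nodes 2, so the diameter is 3 and the radius is 2.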
- largest_connected_component: Optional[StaticGraph] = None
- percentile_distance(graph: StaticGraph, percentile: int = 90) → int
Calculate a specific percentile of the distance distribution in the graph.
- Parameters:
graph (StaticGraph) – The graph for which the distances are calculated.
percentile (int) – The percentile to calculate (between 0 and 100).
- Returns:
The calculated percentile distance.
- Return type:
int
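A percentile of the distance distribution can be sketched by collecting all pairwise shortest-path distances and picking the value at the requested rank. The version below uses the nearest-rank percentile definition, which is an assumption, so the library may interpolate or sample instead. The function names are hypothetical.

```python
from collections import deque


def pairwise_distances(adjacency):
    """Hop distances for every ordered pair of distinct, mutually reachable nodes."""
    result = []
    for source in adjacency:
        distances = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in distances:
                    distances[neighbor] = distances[node] + 1
                    queue.append(neighbor)
        result.extend(d for target, d in distances.items() if target != source)
    return result


def distance_percentile(adjacency, percentile=90):
    """Nearest-rank percentile of the pairwise distance distribution."""
    distances = sorted(pairwise_distances(adjacency))
    if not distances:
        return 0
    # Integer ceiling of percentile / 100 * len(distances), at least rank 1.
    rank = max(1, (percentile * len(distances) + 99) // 100)
    return distances[rank - 1]


chain = {
    1: {2: [1]},
    2: {1: [1], 3: [1]},
    3: {2: [1], 4: [1]},
    4: {3: [1]},
}
p90 = distance_percentile(chain, 90)
p50 = distance_percentile(chain, 50)
```

The 90th-percentile distance is often used as an "effective diameter" because it is less sensitive to a few unusually long paths than the true diameter.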
Calculate the proportion of vertices in the largest connected component.
- Returns:
Proportion of vertices in the largest connected component.
- Return type:
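The component-related quantities above (number of weakly connected components, the largest component, and the proportion of vertices it contains) can be sketched with a simple graph traversal. The code assumes a symmetric adjacency dictionary, so weak connectivity reduces to ordinary connectivity. The function name is hypothetical.

```python
def connected_components(adjacency):
    """Return a list of node sets, one per connected component."""
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        component = {start}
        stack = [start]
        while stack:
            node = stack.pop()
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    component.add(neighbor)
                    stack.append(neighbor)
        components.append(component)
    return components


# Components: {1, 2}, {3, 4, 5}, and the isolated node 6.
split_graph = {
    1: {2: [1]},
    2: {1: [1]},
    3: {4: [1]},
    4: {3: [1], 5: [1]},
    5: {4: [1]},
    6: {},
}
components = connected_components(split_graph)
largest = max(components, key=len)
share_in_largest = len(largest) / len(split_graph)
```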
- features_for_edges_of_static_graph(path_to_data, verbose=False)
Generate features for the edges of the static graph from a data file.
- Parameters:
path_to_data – Path to the data file. Each data line has the form “num_node_1 num_node_2 timestamp”. The first two lines of the file are skipped, so the data starts on the third line.
- Returns:
Features for the edges of the static graph.
- Return type:
pandas.DataFrame
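The input format described above can be read with a few lines of standard-library code. This sketch only illustrates the file layout and is not the function's actual parsing logic. The helper name and the header contents in the demo file are hypothetical, and skipping lines with fewer than three fields is an assumption.

```python
import os
import tempfile


def read_temporal_edges(path):
    """Read (node_1, node_2, timestamp) triples, skipping the first two lines."""
    edges = []
    with open(path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle):
            if line_number < 2:
                continue  # the first two lines are not data
            parts = line.split()
            if len(parts) < 3:
                continue  # assumption: ignore blank or incomplete lines
            start, end, timestamp = (int(part) for part in parts[:3])
            edges.append((start, end, timestamp))
    return edges


# Demo file with two made-up header lines followed by two edges.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("% header line 1\n% header line 2\n1 2 100\n2 3 200\n")
edges = read_temporal_edges(tmp.name)
os.remove(tmp.name)
```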
- graph_features_auc_score_tables(datasets_info: DataFrame, cls_model=None, verbose=False)
Generate LaTeX tables of network features from a DataFrame of dataset information.
- Parameters:
datasets_info (pd.DataFrame) – DataFrame with the columns ‘Network’, ‘Label’, ‘Category’, ‘Edge type’, and ‘Path’. ‘Path’ is the path to the data file. Each data line has the form “num_node_1 num_node_2 timestamp”. The first two lines of the file are skipped, so the data starts on the third line.
cls_model – Classification model used to predict the appearance of an edge.
- Returns:
A tuple of LaTeX strings for different feature tables of the networks.
- Return type:
tuple of str