# Counting Connected Components

## Number of Connected Components in an Undirected Graph

Given `n`

nodes labeled from `0`

to `n - 1`

and a list of undirected edges (each edge is a pair of nodes),
find the number of connected components in the undirected graph.

Example 1:

0 – 1 – 2

3 – 4

Given

`n = 5`

and`edges = [[0, 1], [1, 2], [3, 4]]`

, return`2`

.

Example 2:

0 – 1 – 2 – 3 – 4

Given

`n = 5`

and`edges = [[0, 1], [1, 2], [2, 3], [3, 4]]`

, return`1`

.

Note:
You can assume that no duplicate edges will appear in edges. Since all edges are undirected, `[0, 1]`

is the same as `[1, 0]`

and thus will not appear together in edges.

## Idea

- Construct the graph from
`edges`

- Initialize
`components = 0`

- Start at an
*unvisited*node, traverse the connected component that contains this node and mark every node in this component as*visited* - Increment
`components`

by 1 - Repeat 3 and 4 until every node has been
*visited*. - Return
`components`

## Before Implementing …

### How do we represent a graph?

We want to represent a graph in such a way that supports fast neighbor lookup (like a `getNeighbors()`

method). So we can either use a HashMap to map a node to its neighbors or write a `GraphNode`

class:

```
class GraphNode:
def __init__(self, id):
self._id = id
self._neighbors = []
def add_neighbor(self, neighbor):
self._neighbors.append(neighbor)
@property
def neighbors(self):
return self._neighbors
```

This is a good opportunity to discuss with your interviewer about the trade-offs between the two implementations.

HashTable:

- Pro: Easy to implement
- Con: Code is not self-explanatory

GraphNode[^1]:

- Pro: Good design, self-explanatory code
- Con: Extra overhead

[^1]: An important note about using a `GraphNode`

class: after initializing `GraphNode(0), GraphNode(1), ..., GraphNode(n-1)`

, when we see `[2, 3]`

, we want to add `GraphNode(3)`

to `GraphNode(2).neighbors`

and vice versa. `GraphNode(2)`

and `GraphNode(3)`

have been created – how do we find them? I.e, given some `i`

, how do we store the `GraphNode`

's so we can quickly look up the `GraphNode(i)`

that has been created?

Solution 1: Store `GraphNode(0)`

,`GraphNode(1)`

,…, `GraphNode(n-1)`

in an array :
`python nodes = [GraphNode(i) for i in range(n)]`

Solution 2: Use a dictionary to map `i`

to `GraphNode(i)`

:
`python nodes = {i: GraphNode(i) for i in range(n)}`

In this case when the node ids range from 0 to `n-1`

, solution 1 is slightly better since the dictionary solution uses more space. In other cases, however, when the node ids are strings or integers but all are sparsely distributed within some range, using an array to store all the nodes will be either impossible or a huge waste of space.

Seeing that you understand how to use a `GraphNode`

class to construct the graph, the interviewer will likely suggest you use a HashMap to represent the graph to save some time.

### How do we traverse a connected component?

Both BFS and DFS will work.

### How do we store the visited nodes?

In our algorithm, we need a way to retrieve an *unvisited* node. So instead of keeping track the *visited* nodes, we use a set to store the *unvisited* nodes and we can get retrive an *unvisited* node using `unvisited.pop()`

.

## Implementation

#### Main code

```
def countComponents(n, edges):
children = {i: [] for i in range(n)}
for a, b in edges:
children[a].append(b)
children[b].append(a)
components = 0
unvisited = set(range(n))
while unvisited:
somenode = unvisited.pop()
dfs(somenode, children, unvisited) # bfs also works here
components += 1
return components
```

#### DFS

```
def dfs(root, children, unvisited):
stack = [root]
while stack:
current = stack.pop()
for x in children[current]:
if x in unvisited:
unvisited.discard(x)
stack.append(x)
```

Note: `set.discard(x)`

can be replaced by `set.remove(x)`

. The difference between `discard(x)`

and `remove(x)`

is that `set.remove(x)`

will raise `KeyError`

if `x`

is not present.

#### BFS

```
from collections import deque
def bfs(root, children, unvisited):
q = deque([root])
while q:
current = q.popleft()
for x in children[current]:
if x in unvisited:
unvisited.discard(x)
q.append(x)
```

## A Union Find Solution

A Disjoint Set Union (DSU) data structure[^2] can also be used to find the number of connected components in a graph. We’ll cover DSU in detail in a future post.

[^2]: Also called Union Find because of its two main operations.