# Hierarchical clustering in Python with SciPy


Using SciPy, we can perform hierarchical clustering on our dataset and efficiently traverse the resulting dendrogram to generate clusters at different levels.

SciPy provides a hierarchical clustering implementation that makes clustering data relatively straightforward.

`linkage`

`linkage` produces a linkage matrix that defines the dendrogram. The `method` parameter determines how clusters at each level of the hierarchy are linked, and the `metric` parameter determines which distance measure is used. These parameters should be tuned to your problem. More information on the `linkage` function and its valid parameters can be found in the SciPy documentation for `scipy.cluster.hierarchy.linkage`.
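As a rough illustration of building a linkage matrix, here is a minimal sketch. The toy `data` array, the random seed, and the choice of `method="ward"` with `metric="euclidean"` are assumptions made for this example only, not recommendations for your problem.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

# Hypothetical toy dataset: two loose blobs of 5 points each in 2-D.
rng = np.random.default_rng(0)
data = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(5, 2)),
    rng.normal(loc=5.0, scale=0.5, size=(5, 2)),
])

# Build the linkage matrix. Ward linkage is only defined for Euclidean distances.
Z = linkage(data, method="ward", metric="euclidean")

# Each row of Z records one merge: the two cluster indices that were joined,
# the distance between them, and the number of points in the new cluster.
print(Z.shape)  # (9, 4): n - 1 merges for n = 10 observations
```

The last row of `Z` describes the final merge, so its fourth column equals the total number of observations.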

`fcluster`

Once we have our linkage matrix, we can extract clusters from it using `fcluster` as in line #22. Here we must specify a threshold `t` at which to cut the dendrogram. `fcluster` returns an array of cluster assignments in which each index corresponds to a row index of the `data` array and each value is the cluster that row was assigned to. More information on `fcluster` can be found in the SciPy documentation for `scipy.cluster.hierarchy.fcluster`.
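To make the cut concrete, here is a small sketch under a few assumptions: the six-point `data` array, the `average` linkage method, and the threshold `t=1.0` are all made up for illustration. Note that `fcluster` defaults to `criterion="inconsistent"`, so we pass `criterion="distance"` explicitly to cut the dendrogram at a merge distance.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical toy dataset: two tight groups of three points each.
data = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],
])

Z = linkage(data, method="average", metric="euclidean")

# Cut the dendrogram at distance t: merges above t are undone,
# and every remaining subtree becomes one flat cluster.
labels = fcluster(Z, t=1.0, criterion="distance")

# labels[i] is the cluster id (starting at 1) assigned to data[i].
for row, label in zip(data, labels):
    print(row, "-> cluster", label)
```

Raising `t` merges more of the tree into each cluster, while lowering it splits the data into more, smaller clusters.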

`fclusterdata`

If we already know the threshold value `t` ahead of time and don't require a dendrogram plot, we can use the `fclusterdata` function to create the linkage matrix and extract the clusters in a single function call, as in line #34. More information on `fclusterdata` can be found in the SciPy documentation for `scipy.cluster.hierarchy.fclusterdata`.
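The one-call shortcut might look like the sketch below. As before, the toy `data` array, the threshold, and the `average` method are assumptions for illustration. One thing to watch: `fclusterdata` defaults to `method="single"` and `criterion="inconsistent"`, so we set both explicitly to match a two-step `linkage` plus `fcluster` call.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster, fclusterdata

# Hypothetical toy dataset: two tight groups of three points each.
data = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],
])

# One call: compute distances, build the hierarchy, and cut it at t.
labels_direct = fclusterdata(
    data, t=1.0, criterion="distance", metric="euclidean", method="average"
)

# The equivalent two-step version, for comparison.
Z = linkage(data, method="average", metric="euclidean")
labels_two_step = fcluster(Z, t=1.0, criterion="distance")

print(np.array_equal(labels_direct, labels_two_step))
```

The trade-off is that `fclusterdata` does not return the linkage matrix, so if we later want a dendrogram or a different threshold, the two-step version avoids recomputing the hierarchy.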

duncster
