## Issue

I have two arrays of *x*–*y* coordinates, and I would like to find the minimum Euclidean distance between *each* point in one array with *all* the points in the other array. The arrays are not necessarily the same size. For example:

```
import numpy

xy1 = numpy.array(
    [[ 243, 3173],
     [ 525, 2997]])
xy2 = numpy.array(
    [[ 682, 2644],
     [ 277, 2651],
     [ 396, 2640]])
```

My current method loops through each coordinate `xy` in `xy1` and calculates the distances between that coordinate and all the coordinates in `xy2`.

```
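mindist = numpy.zeros(len(xy1))
minid = numpy.zeros(len(xy1), dtype=int)  # indices into xy2, so store them as integers
for i, xy in enumerate(xy1):
    # distances from this point of xy1 to every point of xy2
    dists = numpy.sqrt(numpy.sum((xy - xy2)**2, axis=1))
    mindist[i], minid[i] = dists.min(), dists.argmin()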
```

Is there a way to eliminate the for loop and somehow do element-by-element calculations between the two arrays? I envision generating a distance matrix for which I could find the minimum element in each row or column.

Another way to look at the problem: say I concatenate `xy1` (length *m*) and `xy2` (length *p*) into `xy` (length *n*), and I store the lengths of the original arrays. Theoretically, I should then be able to generate an *n x n* distance matrix from those coordinates, from which I can grab an *m x p* submatrix. Is there a way to efficiently generate this submatrix?

## Solution

(Months later)

`scipy.spatial.distance.cdist( X, Y )` gives all pairs of distances, for X and Y 2 dim, 3 dim … It also does 22 different norms, detailed in the SciPy documentation for `cdist`.
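Applied to the two arrays from the question, that might look like the sketch below. The variable names `dist`, `minid` and `mindist` just follow the question; nothing here is specific to `cdist` beyond the single call.

```python
import numpy as np
from scipy.spatial.distance import cdist

xy1 = np.array([[243, 3173],
                [525, 2997]])
xy2 = np.array([[682, 2644],
                [277, 2651],
                [396, 2640]])

# Row i holds the Euclidean distances from xy1[i] to every point of xy2.
dist = cdist(xy1, xy2)          # shape (len(xy1), len(xy2)) = (2, 3)

minid = dist.argmin(axis=1)     # index of the nearest xy2 point for each xy1 point
mindist = dist.min(axis=1)      # the corresponding minimum distances

print(minid)                    # [1 2]
print(mindist)
```

This is the *m x p* submatrix the question asks about, built directly without ever forming the full *n x n* matrix.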

```
# cdist example: (nx,dim) (ny,dim) -> (nx,ny)
# (Python 3 syntax)
import sys
import numpy as np
from scipy.spatial.distance import cdist
#...............................................................................
dim = 10
nx = 1000
ny = 100
metric = "euclidean"
seed = 1
# change these params in sh or ipython: run this.py dim=3 ...
for arg in sys.argv[1:]:
    exec( arg )  # executes each argument as Python code: trusted input only
np.random.seed(seed)
np.set_printoptions( precision=2, threshold=100, edgeitems=10, suppress=True )
title = "%s dim %d nx %d ny %d metric %s" % (
    __file__, dim, nx, ny, metric )
print( "\n" + title )
#...............................................................................
X = np.random.uniform( 0, 1, size=(nx,dim) )
Y = np.random.uniform( 0, 1, size=(ny,dim) )
dist = cdist( X, Y, metric=metric )  # -> (nx, ny) distances
#...............................................................................
print( "scipy.spatial.distance.cdist: X %s Y %s -> %s" % (
    X.shape, Y.shape, dist.shape ))
print( "dist average %.3g +- %.2g" % (dist.mean(), dist.std()) )
# cdist returns a (1, 1) array here, so take its single element before formatting
print( "check: dist[0,3] %.3g == cdist( [X[0]], [Y[3]] ) %.3g" % (
    dist[0,3], cdist( [X[0]], [Y[3]] )[0,0] ))
# (trivia: how do pairwise distances between uniform-random points in the unit cube
# depend on the metric ? With the right scaling, not much at all:
# L1 / dim ~ .33 +- .2/sqrt dim
# L2 / sqrt dim ~ .4 +- .2/sqrt dim
# Lmax / 2 ~ .4 +- .2/sqrt dim )
```
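If SciPy is not available, the same *m x p* distance matrix can be built with NumPy broadcasting alone. This is a different technique from `cdist`, shown here only as a sketch on the arrays from the question. It allocates an intermediate array of shape `(m, p, 2)`, so it may use too much memory for very large inputs.

```python
import numpy as np

xy1 = np.array([[243, 3173],
                [525, 2997]])
xy2 = np.array([[682, 2644],
                [277, 2651],
                [396, 2640]])

# (m, 1, 2) - (1, p, 2) broadcasts to (m, p, 2): all pairwise coordinate differences.
diff = xy1[:, np.newaxis, :] - xy2[np.newaxis, :, :]

# Sum the squared differences over the last (coordinate) axis -> (m, p) distances.
dist = np.sqrt((diff ** 2).sum(axis=-1))

minid = dist.argmin(axis=1)                     # nearest xy2 index per xy1 point
mindist = dist[np.arange(len(xy1)), minid]      # the matching minimum distances
```

Here `minid` and `mindist` hold the same values as in the loop from the question, with the loop replaced by a single broadcasted subtraction.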

Answered By – denis

Answer Checked By – Candace Johnson (BugsFixing Volunteer)