Issue
I am taking data with latitude, longitude, and a z value and interpolating it using a cubic method. The values seem to be saved in a numpy array. Is there a way to save the interpolated results to a pandas dataframe? I’m trying to save it with a longitude, latitude, and z value column for the interpolated result.
The input file can be found here and is called nutrition.csv. Here is what I have done so far:
#Import modules
import pandas as pd
import numpy as np
import os
import shapely
import geopandas as geo
import glob
import holoviews as hv
hv.extension('bokeh')
from scipy.interpolate import griddata  # interp2d is unused here and was removed in SciPy 1.14
import fiona
from osgeo import gdal, ogr  # current GDAL bindings are imported from the osgeo package
#Read file
nut = pd.read_csv('nutrition.csv') #Data to be interpolated
#Minimum and maximum longitude values
lon_min = nut['longitude'].min()
lon_max = nut['longitude'].max()
#Create a range of longitude values
lon_vec = np.linspace(lon_min, lon_max, 100) #100 points sets the grid size
#Find min and max latitude values
lat_min = nut['latitude'].min()
lat_max = nut['latitude'].max()
# Create a range of 100 values spanning the latitude range
# Reverse the order of lat_min and lat_max so the first grid row is the northernmost one
lat_vec = np.linspace(lat_max, lat_min, 100)
# Generate the grid
lon_grid, lat_grid = np.meshgrid(lon_vec,lat_vec)
#Cubic interpolation
points = (nut['longitude'], nut['latitude'])
targets = (lon_grid, lat_grid)
grid_cubic = griddata(points, nut['n_ppm'], targets, method='cubic', fill_value=np.nan)
#Generate the graph
map_bounds=(lon_min,lat_min,lon_max,lat_max)
map_cubic = hv.Image(grid_cubic, bounds=map_bounds).opts(aspect='equal',
                                                         colorbar=True,
                                                         title='Cubic',
                                                         xlabel='Longitude',
                                                         ylabel='Latitude',
                                                         cmap='Reds')
map_cubic
I think that I really just need to find a way to combine the targets geo-referenced grid (which has the longitude and latitude of the interpolated points) and the grid_cubic interpolated z-field data into a pandas dataframe.
Solution
You can construct a dataframe yourself from the arrays you already have. Flatten each 2D grid to 1D so that every row of the dataframe is one grid point:
df = pd.DataFrame({
    'latitude': lat_grid.reshape(-1),
    'longitude': lon_grid.reshape(-1),
    'value': grid_cubic.reshape(-1)
})
Let’s check the data by converting the columns back to numpy arrays and plotting them:
import matplotlib.pyplot as plt
# Reshape each column back to the 100 x 100 grid used for the interpolation
plt.pcolor(np.array(df['longitude']).reshape(100, 100),
           np.array(df['latitude']).reshape(100, 100),
           np.array(df['value']).reshape(100, 100))
plt.show()
Notice that matplotlib formats the axis tick labels differently, but the coordinates are the same as in your original data.
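As a quick sanity check on the flatten-and-reshape step, here is a minimal sketch. It assumes a tiny synthetic 3 x 4 grid (the bounds and the stand-in `values` array are made up for illustration, not taken from nutrition.csv). Flattening uses row-major order, so reshaping a column back to the grid shape should restore the original array.

```python
import numpy as np
import pandas as pd

# Hypothetical small grid standing in for lon_grid / lat_grid
lon_vec = np.linspace(-100.0, -99.0, 4)
lat_vec = np.linspace(40.0, 39.0, 3)  # descending, as in the question
lon_grid, lat_grid = np.meshgrid(lon_vec, lat_vec)
values = lon_grid + lat_grid  # stand-in for grid_cubic

# One dataframe row per grid point
df = pd.DataFrame({
    'latitude': lat_grid.ravel(),
    'longitude': lon_grid.ravel(),
    'value': values.ravel(),
})

# Rows of the grid follow latitude, columns follow longitude
shape = lon_grid.shape
restored = df['value'].to_numpy().reshape(shape)
round_trip_ok = np.array_equal(restored, values)
print(round_trip_ok)
```

Using `lon_grid.shape` instead of a hard-coded `(100, 100)` keeps the check valid if you change the grid size later.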
If you want to drop the rows with undefined (NaN) values from the dataframe, you can filter them out like this:
df[df['value'].notna()]
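Putting the pieces together, the following is a self-contained sketch of the whole path from scattered points to a cleaned dataframe. The input points are randomly generated stand-ins (the coordinate ranges, the 20 x 20 grid, and the `z` formula are assumptions for illustration), so it runs without nutrition.csv. It uses `dropna(subset=...)`, which should give the same rows as the `notna()` filter.

```python
import numpy as np
import pandas as pd
from scipy.interpolate import griddata

# Synthetic scattered samples standing in for the longitude/latitude/n_ppm columns
rng = np.random.default_rng(0)
lon = rng.uniform(-100.0, -99.0, 50)
lat = rng.uniform(39.0, 40.0, 50)
z = np.sin(lon) + np.cos(lat)

# Regular target grid, latitude descending as in the question
lon_vec = np.linspace(lon.min(), lon.max(), 20)
lat_vec = np.linspace(lat.max(), lat.min(), 20)
lon_grid, lat_grid = np.meshgrid(lon_vec, lat_vec)

# Cubic interpolation; grid points outside the convex hull of the samples become NaN
grid_cubic = griddata((lon, lat), z, (lon_grid, lat_grid), method='cubic')

# One row per grid point
df = pd.DataFrame({
    'latitude': lat_grid.ravel(),
    'longitude': lon_grid.ravel(),
    'value': grid_cubic.ravel(),
})

# Keep only the grid points where the interpolation produced a value
df_valid = df.dropna(subset=['value']).reset_index(drop=True)
print(len(df), len(df_valid))
```

From here, `df_valid.to_csv(...)` would write the interpolated points to disk if you need them outside Python.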
Answered By – Bob
Answer Checked By – Clifford M. (BugsFixing Volunteer)