Mapping with mapbox compared to folium in python

by Dane Miller

A quick comparison of mapping with Mapbox and with folium in Python. Mapbox works from GeoJSON, is very easy to use, and produces maps quickly. Folium is a Python mapping library that requires several dependencies before it can produce a map.

If you need to produce a high-resolution map quickly, go with Mapbox and GeoJSON. It will save you hours of work. If your end goal is a statistically modeled map, you will likely need Python or R to create it.

Click on the links below to see the difference.

GeoJSON map
This map shows conifer cone species collected across the Western United States. The data were saved in a CSV file of latitude/longitude coordinates.
Cone Map (fixing the link)
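Since Mapbox reads GeoJSON, a CSV of coordinates has to be converted first. The following is a minimal sketch of that conversion using only the standard library. The column names (Location, Latitude, Longitude) and the two sample rows are assumptions for illustration, not the actual cone dataset.

```python
import csv
import io
import json

# Hypothetical sample rows standing in for the real CSV file.
SAMPLE_CSV = """Location,Latitude,Longitude
Kenosha Pass,39.4133,-105.7567
Santa Fe,35.6870,-105.9378
"""

def csv_to_geojson(csv_text):
    """Convert CSV text with Latitude/Longitude columns to a GeoJSON FeatureCollection."""
    features = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        features.append({
            "type": "Feature",
            # GeoJSON stores coordinates as [longitude, latitude], in that order.
            "geometry": {
                "type": "Point",
                "coordinates": [float(row["Longitude"]), float(row["Latitude"])],
            },
            "properties": {"Location": row["Location"]},
        })
    return {"type": "FeatureCollection", "features": features}

geojson = csv_to_geojson(SAMPLE_CSV)
print(json.dumps(geojson, indent=2))
```

The coordinate order is the usual stumbling block: folium takes [latitude, longitude], while GeoJSON expects [longitude, latitude].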

folium map (python)
Conifer_Map

Gas production

by Dane Miller

In this post I compare the rates of two chemical reactions.

1) Hydrochloric acid and seashell
CaCO3 (s) + 2HCl (aq) –> CaCl2 (aq) + H2O (l) + CO2 (g)

2) Hydrogen peroxide and Yeast
2H2O2 (aq) –(yeast catalyst)–> 2H2O (l) + O2 (g) + heat

Both reactions were measured with a Vernier LabQuest O2/CO2 probe. The data were then exported to a CSV file and analyzed with descriptive statistics in a Jupyter notebook (Python 3.6).

1) Hydrochloric acid and seashell

2) Hydrogen peroxide and Yeast

First import the necessary modules you will be using for the analysis.

# %load ../standard_import.txt
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import axes3d
import seaborn as sns

from sklearn.preprocessing import scale
import sklearn.linear_model as skl_lm
from sklearn.metrics import mean_squared_error, r2_score
import statsmodels.api as sm
import statsmodels.formula.api as smf

%matplotlib inline
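plt.style.use('seaborn-white')  # named 'seaborn-v0_8-white' on matplotlib >= 3.6
df = pd.read_csv('/.../YeastO2.csv')  # load the CSV file into a DataFrame
df.info()
df.head()
# Ordinary least squares fit of O2 concentration against time
est = smf.ols('Yeast_O2_ppm ~ Time', df).fit()
est.summary().tables[1]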

Yeast and H2O2

           coef       std err  t       P>|t|  [0.025    0.975]
Intercept  -467.1246  122.092  -3.826  0.000  -709.382  -224.867
Time       17.2861    0.527    32.779  0.000  16.240    18.333

HCl and Seashell

           coef        std err  t       P>|t|  [0.025     0.975]
Intercept  -1575.5714  527.309  -2.988  0.006  -2659.471  -491.672
Time       87.8234     8.379    10.481  0.000  70.599     105.048
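The Time coefficient in each table is the estimated rate of gas production in ppm per unit of time (the time unit is whatever the probe recorded; it is not stated here). As a rough sketch, the two rates can be compared directly from the fitted values above:

```python
# Slopes and 95% confidence intervals copied from the regression tables above.
yeast = {"slope": 17.2861, "ci": (16.240, 18.333)}
hcl = {"slope": 87.8234, "ci": (70.599, 105.048)}

# Ratio of the two estimated rates of gas production.
ratio = hcl["slope"] / yeast["slope"]

# Two intervals overlap only if each one starts before the other ends.
intervals_overlap = yeast["ci"][1] >= hcl["ci"][0] and hcl["ci"][1] >= yeast["ci"][0]

print(f"HCl/seashell rate is about {ratio:.1f} times the yeast/H2O2 rate")
print(f"Confidence intervals overlap: {intervals_overlap}")
```

Keep in mind that the two probes measure different gases (CO2 and O2), so the ratio compares ppm readings rather than moles of gas.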

We can compare correlations.
              Time      Yeast_O2_ppm
Time          1.000000  0.956887
Yeast_O2_ppm  0.956887  1.000000

             Time      HCl_CO2_ppm
Time         1.000000  0.899226
HCl_CO2_ppm  0.899226  1.000000
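Squaring a correlation coefficient gives the share of variance explained by a one-predictor linear fit (R squared). A small sketch using the two correlations reported above:

```python
# Pearson correlations between time and gas concentration, from the tables above.
correlations = {"Yeast_O2_ppm": 0.956887, "HCl_CO2_ppm": 0.899226}

# For a one-predictor linear regression, R^2 equals the squared correlation.
r_squared = {name: r ** 2 for name, r in correlations.items()}

for name, value in r_squared.items():
    print(f"{name}: R^2 = {value:.3f}")
```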

We can compare the linear regression models in the first two figures above.

regr = skl_lm.LinearRegression()

X = df[['Time']].values  # 2-D feature array; as_matrix() was removed in pandas 1.0
y = df.Yeast_O2_ppm  # the same code was run for the HCl dataset

regr.fit(X,y)
print(regr.coef_)
print(regr.intercept_)

Yeast and H2O2
[17.28611823]
-467.1246359930119

HCl and Seashell
[87.8234127]
-1575.571428571428
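For a single predictor, the coefficient and intercept that LinearRegression reports are the ordinary least squares solution, which has a closed form. The sketch below computes it in plain Python on made-up data (the numbers are illustrative, not the probe readings):

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = sum of cross-deviations / sum of squared x-deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical readings: time and gas concentration in ppm.
time = [0, 10, 20, 30, 40]
ppm = [5, 180, 340, 525, 690]

slope, intercept = least_squares_line(time, ppm)
print(slope, intercept)
```

This is also why statsmodels and scikit-learn return the same numbers: both solve the same least squares problem.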

Folium mapping from CSV file

by Dane Miller

These are field data from graduate school, collected while sampling cones from different conifer species. Here is a link to that publication:

https://www.sciencedirect.com/science/article/pii/S0033589414000738

Folium makes it straightforward to create interactive maps from multiple latitude and longitude coordinates. The module is powerful, and the maps it produces are fully interactive.

Here is a link to the interactive map. The map allows you to zoom in and click on the cloud icons for additional information.
Conifer_Map

import folium
from folium import plugins
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

%matplotlib inline
df = pd.read_csv('/.../gradcone.csv')
df.head()

I set the start point of this map at Kenosha Pass, Colorado. I could have easily put in a different location.

m = folium.Map([39.4133, -105.7567], zoom_start=5)
m

For each row, pass the latitude and longitude to the marker, and put any additional information you want to show in popup.

for index, row in df.iterrows():
    folium.Marker([row['Latitude'], row['Longitude']], 
                  popup=row['Location'],
                  icon=folium.Icon(icon='cloud')
                 ).add_to(m)
m
m.save('/.../map4.html')
# For the map to display inline in a Jupyter notebook, comment out m.save so that m is the last line of the cell.
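Instead of hard-coding a start point, the map can be centered on the data. The sketch below computes a center and a bounding box from a list of coordinates in plain Python; the sample points and the function name are made up for illustration. The results could then be passed to folium.Map(location=center) and m.fit_bounds(bounds).

```python
def center_and_bounds(points):
    """Return the mean [lat, lon] and [[south, west], [north, east]] for the points."""
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    center = [sum(lats) / len(lats), sum(lons) / len(lons)]
    bounds = [[min(lats), min(lons)], [max(lats), max(lons)]]
    return center, bounds

# Hypothetical collection sites as (latitude, longitude) pairs.
sites = [(39.4133, -105.7567), (35.6870, -105.9378), (37.7749, -122.4194)]

center, bounds = center_and_bounds(sites)
print(center)
print(bounds)
```

Averaging longitudes like this is fine for sites within one region such as the Western United States; it would not be appropriate for points that straddle the 180th meridian.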

If you are interested in digging into folium mapping with Python, take a look at the links below.
http://folium.readthedocs.io/en/latest/index.html
https://alysivji.github.io/getting-started-with-folium.html

Mapping with folium

by Dane Miller

Folium is a very easy-to-use interactive mapping module for Python. It is a fast way to make maps, and the resulting maps are interactive.

Click on the link to open the map.
Santa fe map

Here is some documentation that walks through how to create your own map.
quickstart

To create a map you will first need to install folium.

https://anaconda.org/conda-forge/folium

I would also suggest installing ipyleaflet, which provides many additional mapping features.

https://anaconda.org/conda-forge/ipyleaflet
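import folium

# Recent folium releases may no longer bundle Stamen tiles; a tile URL plus attr= may be needed there.
map_osm = folium.Map(location=[35.6870, -105.9378],
                     tiles='Stamen Toner',
                     zoom_start=14)
map_osm.save('/.../map3.html')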
map_1 = folium.Map(location=[35.6870, -105.9378],
                   zoom_start=12,
                   tiles='Stamen Terrain')
folium.Marker([35.6892, -105.9413], popup='Georgia O Keeffe Museum').add_to(map_1)
folium.Marker([35.6865, -105.9359], popup='Cathedral Basilica of St. Francis of Assisi').add_to(map_1)
folium.Marker([35.6641, -105.9266], popup='Museum of International Folk Art').add_to(map_1)
folium.Marker([35.5889, -106.0775], popup='arroyo de los chamisos trail').add_to(map_1)
folium.Marker([35.6661433, -105.8308525], popup='Thompson Peak Trail, NM').add_to(map_1)

map_1
map_1.save('/.../map3.html')
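When there are more than a few markers, it can be tidier to keep the places in a list and add them in a loop. A sketch of that idea, reusing the coordinates above; the helper name add_markers is hypothetical, and folium is only imported when the function is called:

```python
# (latitude, longitude, popup text) for each place, copied from the calls above.
PLACES = [
    (35.6892, -105.9413, 'Georgia O Keeffe Museum'),
    (35.6865, -105.9359, 'Cathedral Basilica of St. Francis of Assisi'),
    (35.6641, -105.9266, 'Museum of International Folk Art'),
    (35.5889, -106.0775, 'arroyo de los chamisos trail'),
    (35.6661433, -105.8308525, 'Thompson Peak Trail, NM'),
]

def add_markers(folium_map, places):
    """Add one folium.Marker per (lat, lon, label) tuple to an existing map."""
    import folium  # imported here so the list above can be used without folium
    for lat, lon, label in places:
        folium.Marker([lat, lon], popup=label).add_to(folium_map)
    return folium_map
```

With folium installed, add_markers(map_1, PLACES) would stand in for the five separate folium.Marker calls.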

Basemap

by Dane Miller

Here is a step-by-step guide to using the matplotlib Basemap toolkit in Python. Before you get started, make sure you have installed matplotlib and basemap.

Links:
Matplotlib
https://matplotlib.org/

The process is a lot easier if you are using Anaconda with Jupyter notebook.
https://anaconda.org/conda-forge/matplotlib
https://anaconda.org/anaconda/basemap

Also make sure your Jupyter notebook has been updated to the latest version.
http://jupyter.readthedocs.io/en/latest/projects/upgrade-notebook.html

Once you have successfully installed and updated all modules then we can get to the fun stuff! Mapping!!!

Basemap has a lot of features; in this post I focus on a few simple ones.
https://matplotlib.org/basemap/api/basemap_api.html#module-mpl_toolkits.basemap

from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt
m = Basemap(projection='mill',
           llcrnrlat = -90,
           llcrnrlon = -180,
           urcrnrlat = 90,
           urcrnrlon = 180, 
           resolution = 'l')
m.drawcoastlines()
m.drawcountries(linewidth=2)
m.drawrivers(color='blue')
m.fillcontinents(color='g', lake_color='blue', alpha=0.5)

plt.title('Basemap of the globe')
plt.show()

Here is our first map of the globe.

# Lambert Conformal map of lower 48 states.
m = Basemap(llcrnrlon=-119,llcrnrlat=20,urcrnrlon=-64,urcrnrlat=49,
            projection='lcc',lat_1=33,lat_2=45,lon_0=-95)

m.drawcoastlines()
m.drawcountries(linewidth=2)
m.drawrivers(color='blue')
m.fillcontinents(color='g', lake_color='blue', alpha=0.5)
m.drawstates()
# m.bluemarble()

plt.title('Basemap of the United States of America')
plt.show()

Map of the lower 48 states in the US. Note that all the states and rivers are drawn onto the map. This is easily done with m.drawrivers(color='blue'), using any color you like, and m.drawstates().

# Lambert Conformal map of California
m = Basemap(width=1284000,height=1164000,projection='lcc',lat_1=30.,lat_2=60,\
             lat_0=37,lon_0=-120.5,resolution='h',rsphere=6370000.00)

m.drawcoastlines()
m.drawcountries(linewidth=2)
m.drawrivers(color='blue')
m.fillcontinents(color='g', lake_color='blue', alpha=0.5)
m.drawstates()
m.drawcounties()
# m.bluemarble()

plt.title('Basemap of California')
plt.show()

Zooming in even closer to a single state, California.
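The width and height arguments are given in meters, so it helps to know roughly how many degrees they span. A back-of-the-envelope sketch, assuming a spherical Earth with the same radius passed as rsphere in the California call; the function names are illustrative:

```python
import math

RSPHERE = 6370000.0  # sphere radius in meters, as in the California Basemap call

def meters_to_degrees_lat(meters, radius=RSPHERE):
    """Approximate span in degrees of latitude covered by a north-south distance."""
    return math.degrees(meters / radius)

def meters_to_degrees_lon(meters, lat_deg, radius=RSPHERE):
    """Approximate span in degrees of longitude covered by an east-west distance."""
    # Circles of latitude shrink by cos(latitude) away from the equator.
    return math.degrees(meters / (radius * math.cos(math.radians(lat_deg))))

# Map size and center latitude from the California example.
height_deg = meters_to_degrees_lat(1164000)
width_deg = meters_to_degrees_lon(1284000, lat_deg=37)

print(f"about {height_deg:.1f} degrees of latitude by {width_deg:.1f} degrees of longitude")
```

Working backwards from the extent you want in degrees is a quick way to pick width and height for a new region.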

# Lambert Conformal map of the San Francisco Bay
# width/height are in meters: roughly 128 km x 116 km centered on San Francisco
m = Basemap(width=128400,height=116400,projection='lcc',lat_1=36,lat_2=38,\
             lat_0=37.7749,lon_0=-122.4194,resolution='h',rsphere=6370000.00)

m.drawcoastlines()
m.drawcountries(linewidth=2)
m.drawrivers(color='blue')
m.fillcontinents(color='tan', lake_color='blue', alpha=0.5)
m.drawcounties(linewidth=1, color='black')
# m.bluemarble()

plt.title('Basemap San Francisco Bay')
plt.show()

And finally, zoomed into the San Francisco Bay.

Bottled Water pH

by Dane Miller

Here is a quick analysis of bottled drinking water pH, plotted with seaborn. I started by looking up published measurements of bottled water pH (see the chart below). For my analysis, I converted the pH of each brand to hydrogen ion (H+) and hydroxide ion (OH-) concentrations.

 

Brand               pH     [H+] (mol/L)  [OH-] (mol/L)
Coca-Cola           2.24   5.75E-03      1.74E-12
VitaminWater        2.49   3.24E-03      3.09E-12
Gatorade            2.92   1.20E-03      8.32E-12
Ozarka water        5.16   6.92E-06      1.45E-09
Aquafina            5.63   2.34E-06      4.27E-09
Dasani              5.72   1.91E-06      5.25E-09
Nestle Pure Life    6.24   5.75E-07      1.74E-08
Evian               6.89   1.29E-07      7.76E-08
Fiji                6.90   1.26E-07      7.94E-08
Smart Water         6.91   1.23E-07      8.13E-08
Houston Tap Water   7.29   5.13E-08      1.95E-07
Pasadena Tap Water  7.58   2.63E-08      3.80E-07
Evamor              8.78   1.66E-09      6.03E-06
Essentia            10.38  4.17E-11      2.40E-04

Concentrations follow [H+] = 10^(-pH) and [OH-] = 10^(pH - 14), assuming 25 °C.


Here is the article if you would like more information:  http://jdh.adha.org/content/89/suppl_2/6.full.pdf

 

Plotting [H+] and [OH-] against pH gives a clearer picture of how they are related: high pH values go with high hydroxide ion concentrations, and low pH values go with high hydrogen ion concentrations.

H: Hydrogen ion concentration

OH: Hydroxide ion concentration
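The conversions follow from the definition pH = -log10[H+] and, at 25 °C, the relation [H+][OH-] = 1e-14. A minimal sketch of the calculation; the function names are illustrative and the temperature is an assumption:

```python
KW = 1e-14  # ion product of water at 25 degrees C (assumed temperature)

def hydrogen_concentration(ph):
    """Hydrogen ion concentration in mol/L for a given pH."""
    return 10 ** (-ph)

def hydroxide_concentration(ph):
    """Hydroxide ion concentration in mol/L, from [H+][OH-] = KW."""
    return KW / hydrogen_concentration(ph)

for brand, ph in [("Coca-Cola", 2.24), ("Fiji", 6.9), ("Essentia", 10.38)]:
    h = hydrogen_concentration(ph)
    oh = hydroxide_concentration(ph)
    print(f"{brand}: pH {ph}, [H+] = {h:.2e} mol/L, [OH-] = {oh:.2e} mol/L")
```

Because the scale is logarithmic, each step of one pH unit changes both concentrations by a factor of ten.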

 

pH and [H+] – Hydrogen


pH and [OH-] Hydroxide


 

 

Python code – bottled water pH

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import axes3d
import seaborn as sns

from sklearn.preprocessing import scale
import sklearn.linear_model as skl_lm
from sklearn.metrics import mean_squared_error, r2_score
import statsmodels.api as sm
import statsmodels.formula.api as smf

%matplotlib inline
plt.style.use('seaborn-white')  # named 'seaborn-v0_8-white' on matplotlib >= 3.6
df = pd.read_csv('/.../phwater.csv')

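df.head()
# pH against hydroxide ion concentration, one color per brand
g = sns.lmplot(x="OH", y="pH", hue="Brands", data=df)
g.set(ylim=(0,14))

# pH against hydrogen ion concentration, one color per brand
g = sns.lmplot(x="H", y="pH", hue="Brands", data=df)
g.set(ylim=(0,14))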

Python code “Old faithful geyser dataset rebooted with Python”

# %load ../standard_import.txt
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import axes3d
import seaborn as sns

from sklearn.preprocessing import scale
import sklearn.linear_model as skl_lm
from sklearn.metrics import mean_squared_error, r2_score
import statsmodels.api as sm
import statsmodels.formula.api as smf

%matplotlib inline
plt.style.use('seaborn-white')  # named 'seaborn-v0_8-white' on matplotlib >= 3.6
of = pd.read_csv('/.../oldfaith.csv')
of.info()
of.head()
regr = skl_lm.LinearRegression()

# Linear fit
X = of.wait_time_min.values.reshape(-1,1)
y = of.duration_sec
regr.fit(X, y)

of['pred1'] = regr.predict(X)
of['resid1'] = of.duration_sec - of.pred1

# Quadratic fit: use wait_time_min and its square as the two features
X2 = np.column_stack([of.wait_time_min, of.wait_time_min**2])
regr.fit(X2, y)

of['pred2'] = regr.predict(X2)
of['resid2'] = of.duration_sec - of.pred2
fig, (ax1,ax2) = plt.subplots(1,2, figsize=(12,5))

# Left plot
sns.regplot(x=of.pred1, y=of.resid1, lowess=True,
            ax=ax1, line_kws={'color':'r', 'lw':1},
            scatter_kws={'facecolors':'None', 'edgecolors':'k', 'alpha':0.5})
ax1.hlines(0,xmin=ax1.xaxis.get_data_interval()[0],
           xmax=ax1.xaxis.get_data_interval()[1], linestyles='dotted')
ax1.set_title('Residual Plot for Linear Fit')

# Right plot
sns.regplot(x=of.pred2, y=of.resid2, lowess=True,
            line_kws={'color':'r', 'lw':1}, ax=ax2,
            scatter_kws={'facecolors':'None', 'edgecolors':'k', 'alpha':0.5})
ax2.hlines(0,xmin=ax2.xaxis.get_data_interval()[0],
           xmax=ax2.xaxis.get_data_interval()[1], linestyles='dotted')
ax2.set_title('Residual Plot for Quadratic Fit')

for ax in fig.axes:
    ax.set_xlabel('Fitted values')
    ax.set_ylabel('Residuals')
est = smf.ols('wait_time_min ~ duration_sec', of).fit()
est.summary().tables[1]
# Scatter plot with a regression line and marginal distributions
sns.jointplot(x='wait_time_min', y='duration_sec', data=of, kind='reg')

# Kernel density estimate of the joint distribution
g = sns.jointplot(x="wait_time_min", y="duration_sec", data=of,
                  kind="kde", space=0, color="g")

# Scatter plot with density contours drawn underneath
g = (sns.jointplot(x="wait_time_min", y="duration_sec",
                   data=of, color="k")
        .plot_joint(sns.kdeplot, zorder=0, n_levels=6))

# Scatter plot with customized markers and 15-bin marginal histograms
g = sns.jointplot(x="wait_time_min", y="duration_sec", data=of,
                  marginal_kws=dict(bins=15),
                  s=40, edgecolor="w", linewidth=1)
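The point of the two residual plots is to check whether adding a squared term leaves less structure in the residuals. As a self-contained illustration on synthetic data (not the geyser measurements), the sketch below fits both models with numpy.polyfit and compares the residual sums of squares:

```python
import numpy as np

# Synthetic data with a curved relationship plus noise (illustrative only).
rng = np.random.default_rng(0)
x = np.linspace(40, 100, 50)
y = 0.05 * (x - 40) ** 2 + 60 + rng.normal(0, 5, size=x.size)

# Degree-1 and degree-2 polynomial least squares fits.
linear_coefs = np.polyfit(x, y, 1)
quadratic_coefs = np.polyfit(x, y, 2)

# Residuals are observed values minus fitted values.
resid_linear = y - np.polyval(linear_coefs, x)
resid_quadratic = y - np.polyval(quadratic_coefs, x)

# Sum of squared residuals for each model.
sse_linear = float(np.sum(resid_linear ** 2))
sse_quadratic = float(np.sum(resid_quadratic ** 2))
print(sse_linear, sse_quadratic)
```

The quadratic model can never fit worse than the linear one on the same data, so the useful question is whether the drop in residual error is large and whether the curved pattern in the residual plot disappears.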