Calculating the total subsurface storage anomalies using geopandas
In this example spatial mean monthly total subsurface storage for each federal state of Germany from 2011 until 2021 will be calculated taking into account how much (ratio) of the gridpoints is actually being intersected inside each federal state.
Before starting it is important to import the needed libraries.
Make sure to make geopandas and SLOTH available to import in python. If you already have added SLOTH to your PYHTONPATH, below step is not needed.
import os
import sys
cwd = os.getcwd()
sloth_path = f"{cwd}/../"
sys.path.append(sloth_path)
import geopandas as gpd
import xarray as xr
import pandas as pd
from shapely.geometry import Polygon
import sloth.toolBox
The paths needed for the anomalies calculations are prepared next.
path = '//p/scratch/pfgpude05/belleflamme1/sim/ADAPTER_DE05_ECMWF-HRES_detforecast__FZJ-IBG3-ParFlowCLM380D_v01bJuwelsGpuProdClimatologyTl_PRhourly/postpro/data/'
fname= 'tss_DE05_ECMWF-HRES_hindcast_r1i1p2_FZJ-IBG3-ParFlowCLM380_hgfadapter-h00-v3_1hr_*-*.nc'
fname_tssAVE = 'tssAve_DE05_ECMWF-HRES_hindcast_r1i1p2_FZJ-IBG3-ParFlowCLM380_hgfadapter-h00-v3_1hr_reference-year-2011-2021.nc'
In the following step all files will be compiled together and the monthly total subsurface
storage is calculated.
ds_tss = xr.open_mfdataset(path+fname, decode_times=True)
ds_tss_mon = ds_tss.resample({'time': '1MS'}).mean() # calculate the monthly mean
df_tss_mon = ds_tss_mon.to_dataframe()
df_tss_mon = df_tss_mon.reset_index() # the values here represent the monthly tts for the whole 2000*2000 grid
df_tss_mon_sorted=df_tss_mon.sort_values(by=['time'])
For calculating the anomalies, the long term total subsurface storage is needed and calculated next.
ds_tssAVE = xr.open_mfdataset(path + fname_tssAVE, engine="netcdf4", decode_times=True)
ds_tssAVE = ds_tssAVE.drop('rotated_pole')
ds_tssAVE_mon = ds_tssAVE.resample({'time': '1MS'}).mean()
df_tssAVE_mon = ds_tssAVE_mon.to_dataframe()
df_tssAVE_mon = df_tssAVE_mon.reset_index()
A dataframe is created with the lons and lats of the 2000*2000 grid (the domain that contains the total subsurface storage) in order to add the calculated anomalies later to this dataframe.
df_anomaly = pd.DataFrame(columns=['lon',
'lat',])
df_anomaly['lon'] = df_tss_mon_sorted['lon'].iloc[0:4000000]
df_anomaly['lat'] = df_tss_mon_sorted['lat'].iloc[0:4000000]
The final step is to calculate the anomalies, in this loop the anomalies for each month from 2011-2022 for each grid is calculated, it loops through the years and months and add the values in a column, which means for each month a column with its anomalies values will be added. t
As shown below, the dataframe is prepared as follows:
each row represents a grid point in the domain
the columns contains information on lon and lat as well as anomalies for each month from 2011 until 2021
df_anomaly
lon | lat | 2011_1 | 2011_2 | 2011_3 | 2011_4 | 2011_5 | 2011_6 | 2011_7 | 2011_8 | ... | 2022_3 | 2022_4 | 2022_5 | 2022_6 | 2022_7 | 2022_8 | 2022_9 | 2022_10 | 2022_11 | 2022_12 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 2.864156 | 44.254330 | -0.455078 | -0.175781 | 0.312500 | -11.029297 | -37.771484 | -26.111328 | 16.777344 | 65.214844 | ... | -0.265625 | 4.953125 | -23.765625 | -55.781250 | -37.515625 | -91.335938 | -103.468750 | -117.330078 | -125.214844 | -57.529297 |
127800604 | 4.254391 | 48.737358 | -0.806641 | -0.367188 | 0.195312 | -9.496094 | -33.369141 | -19.093750 | 20.230469 | 51.841797 | ... | -0.441406 | 4.152344 | -19.548828 | -45.017578 | -23.996094 | -75.730469 | -86.578125 | -95.554688 | -96.341797 | -27.708984 |
170586722 | 4.320573 | 51.805782 | -0.154297 | -0.035156 | 0.207031 | -4.126953 | -6.591797 | 3.384766 | 16.972656 | 15.625000 | ... | -0.224609 | 1.564453 | -10.769531 | -14.166016 | -4.728516 | -36.582031 | -32.123047 | -2.962891 | -0.312500 | -0.189453 |
646492778 | 15.528274 | 47.738449 | -0.394531 | -0.123047 | 0.328125 | -10.277344 | -37.744141 | -24.435547 | 20.156250 | 64.380859 | ... | -0.312500 | 4.826172 | -24.535156 | -53.386719 | -30.810547 | -84.294922 | -97.890625 | -110.275391 | -112.976562 | -38.738281 |
22256262 | 2.113726 | 48.209255 | -0.347656 | -0.041016 | 0.251953 | -10.488281 | -37.468750 | -23.828125 | 21.552734 | 63.205078 | ... | -0.408203 | 4.753906 | -23.535156 | -51.101562 | -29.115234 | -81.833984 | -94.062500 | -104.388672 | -103.269531 | -26.167969 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
9990224 | 0.359184 | 52.314259 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
566925424 | 13.920066 | 46.913773 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
47866008 | 2.631417 | 48.333389 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
567278116 | 14.007371 | 45.914566 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
567809288 | 13.674760 | 49.967464 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
4000000 rows × 146 columns
anomaly_list = []
list_year = [2011, 2012, 2013, 2014,
2015, 2016, 2017, 2018,
2019, 2020, 2021, 2022
]
list_month = [1, 2, 3,
4, 5, 6,
7, 8, 9,
10, 11, 12,
]
print('calculating the anomalies')
for i in range(len(list_year)):
for n in range(len(list_month)):
print(list_month[n])
start_mon = f'{list_year[i]}-{list_month[n]}-01'
end_mon = f'{list_year[i]}-{list_month[n]}-01'
start_mon = pd.to_datetime(start_mon)
end_mon = pd.to_datetime(end_mon)
start_mon_tssAVE = f'2017-{list_month[n]}-01'
end_mon_tssAVE = f'2017-{list_month[n]}-01'
start_mon_tssAVE = pd.to_datetime(start_mon_tssAVE)
end_mon_tssAVE = pd.to_datetime(end_mon_tssAVE)
df_tss_yr = df_tss_mon[df_tss_mon['time'].between(start_mon, end_mon)]
df_tss_yr = df_tss_yr.reset_index()
df_tssAVE_mon_yr = df_tssAVE_mon[df_tssAVE_mon['time'].between(start_mon_tssAVE, end_mon_tssAVE)]
df_tssAVE_mon_yr = df_tssAVE_mon_yr.reset_index()
df_anom = df_tss_yr['tss'] - df_tssAVE_mon_yr['tssAve']
df_anomaly[f'{list_year[i]}_{list_month[n]}'] = df_anom.values
df_anomaly.to_csv('/p/scratch/pfgpude05/hammoudeh1/sim/ADAPTER_DE05_ECMWF-HRES_detforecast__FZJ-IBG3-ParFlowCLM380D_v01bJuwelsGpuProdClimatologyTl_PRhourly/postpro/data/time_series/tss_anomalies_DE05_2021-20230823.csv')
After the anomalies are calculated, the paths for the data are prepared.
Data needed:
Data containing information on the total subsurafce storage to calculate the monthly anomaly.
Dataframe (can also be a nectdf or any other data type) containing infromation on the four longitudes and latitudes that surrounds each gridpoint in the source dataframe. this information is needed to be able to create polygons for each gridpoint.
Shapefile fo Germany containing all the federal stations.
Another information is needed such as:
Projected coordinate reference system (utm).
Number of years.
Number of days/months etc.
A path to save the output.
df_data= df_anomaly
#df_data = '/p/scratch/pfgpude05/hammoudeh1/sim/ADAPTER_DE05_ECMWF-HRES_detforecast__FZJ-IBG3-ParFlowCLM380D_v01bJuwelsGpuProdClimatologyTl_PRhourly/postpro/data/time_series/tss_anomalies_DE05_2021-20230823.csv'
corners = '/p/project/pfgpude05/hammoudeh1/geo_data/corner_grids_DE06/corner_grids_DE06.nc'
area_of_interest ='/p/project/pfgpude05/hammoudeh1/geo_data/nuts2500/2500_NUTS1.shp'
crs_utm = 'EPSG:25832'
nr_yr =12
nr_entries = 12
save_dir = '/p/scratch/pfgpude05/hammoudeh1/sim/ADAPTER_DE05_ECMWF-HRES_detforecast__FZJ-IBG3-ParFlowCLM380D_v01bJuwelsGpuProdClimatologyTl_PRhourly/geo_data/anomalies_geopandas/NUTS3_tss_monthly_anomalies_DE06_ECMWF-HRES_hindcast_r1i1p2_FZJ-IBG3-ParFlowCLM380_hgfadapter-h00-v3_1hr/NUTS1_tss_monthly_anomalies_DE06_Parflow_SH_v1.shp'
shapefile
NUTS_LEVEL | NUTS_CODE | NUTS_NAME | geometry | |
---|---|---|---|---|
0 | 1 | DE1 | Baden-Württemberg | MULTIPOLYGON (((591826.754 5434806.578, 591502... |
1 | 1 | DE2 | Bayern | POLYGON ((759919.057 5493994.847, 759656.392 5... |
2 | 1 | DE3 | Berlin | POLYGON ((812675.236 5831469.065, 813573.397 5... |
3 | 1 | DE4 | Brandenburg | POLYGON ((785946.492 5753692.786, 785689.525 5... |
4 | 1 | DE5 | Bremen | MULTIPOLYGON (((471292.489 5932922.759, 470685... |
5 | 1 | DE6 | Hamburg | MULTIPOLYGON (((463107.577 5977587.614, 462620... |
6 | 1 | DE7 | Hessen | POLYGON ((492636.671 5483361.184, 491734.166 5... |
7 | 1 | DE8 | Mecklenburg-Vorpommern | MULTIPOLYGON (((713722.584 6019028.989, 714041... |
8 | 1 | DE9 | Niedersachsen | MULTIPOLYGON (((508518.071 5806884.353, 508505... |
9 | 1 | DEA | Nordrhein-Westfalen | POLYGON ((373588.503 5612322.662, 372409.785 5... |
10 | 1 | DEB | Rheinland-Pfalz | POLYGON ((420154.696 5433007.302, 418891.303 5... |
11 | 1 | DEC | Saarland | POLYGON ((361553.412 5446433.660, 360354.267 5... |
12 | 1 | DED | Sachsen | POLYGON ((753976.948 5653939.390, 753924.450 5... |
13 | 1 | DEE | Sachsen-Anhalt | POLYGON ((669178.456 5690646.781, 668139.009 5... |
14 | 1 | DEF | Schleswig-Holstein | MULTIPOLYGON (((427783.160 6003522.451, 427486... |
15 | 1 | DEG | Thüringen | POLYGON ((726352.625 5648413.044, 726600.494 5... |
The ‘NUTS_NAME’ from the table below, will be used for dissovling the field in the end.
That’s why this information is needed as an input in the tool
Name_area = "NUTS_NAME"
The intersection_calculations tool is imported from sloth.toolbox
sloth.toolbox.intersection_calculations(df_data, corners, area_of_interest, Name_area, crs_utm, nr_yr, nr_entries, save_dir)
The result is shown in the table below:
dissolve_gdf.drop('index_left',axis='columns')
geometry | lon | lat | 2011_1 | 2011_2 | 2011_3 | 2011_4 | 2011_5 | 2011_6 | 2011_7 | ... | 2022_7 | 2022_8 | 2022_9 | 2022_10 | 2022_11 | 2022_12 | area | NUTS_LEVEL | area_intersected | weight | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
NUTS_NAME | |||||||||||||||||||||
Baden-Württemberg | MULTIPOLYGON (((400493.888 5268816.619, 401103... | 9.045352 | 48.537956 | 0.999507 | -2.988537 | -13.584477 | -21.015722 | -57.277115 | -63.289494 | -43.851151 | ... | -64.984001 | -93.216202 | -89.057289 | -77.876862 | -81.391617 | -76.319214 | 374678.085372 | 1.0 | 369487.918176 | 0.986148 |
Bayern | POLYGON ((507442.465 5491666.064, 507374.615 5... | 11.420963 | 48.933758 | 1.731518 | -3.034703 | -13.670825 | -21.404711 | -57.849911 | -64.082687 | -44.784664 | ... | -63.859665 | -91.726974 | -87.553711 | -76.597458 | -80.274864 | -75.176254 | 375050.561093 | 1.0 | 370964.112033 | 0.989103 |
Berlin | POLYGON ((780348.536 5813302.828, 779911.820 5... | 13.403963 | 52.500927 | 25.705936 | 25.638256 | 14.032354 | 9.348076 | -15.815358 | -17.693853 | 1.582355 | ... | -85.105247 | -108.459457 | -101.509254 | -95.753525 | -108.704613 | -102.388306 | 375838.643631 | 1.0 | 347783.234208 | 0.925356 |
Brandenburg | POLYGON ((676049.159 5877864.309, 676109.609 5... | 13.397862 | 52.468426 | 5.543557 | 0.937342 | -10.016368 | -17.526293 | -52.535980 | -59.144440 | -40.211243 | ... | -63.198277 | -90.098953 | -85.329941 | -75.727402 | -80.210075 | -75.104263 | 375848.489952 | 1.0 | 369595.519417 | 0.983361 |
Bremen | MULTIPOLYGON (((468226.157 5893810.567, 467760... | 8.736111 | 53.206532 | -9.506310 | -12.064992 | -21.316515 | -31.167387 | -75.887642 | -81.646461 | -60.356339 | ... | -78.357590 | -107.278130 | -101.842323 | -84.059784 | -87.900513 | -85.182060 | 374818.598036 | 1.0 | 326689.773857 | 0.871589 |
Hamburg | MULTIPOLYGON (((551170.414 5928592.519, 550963... | 10.002545 | 53.550182 | 25.570534 | 26.046181 | 13.929835 | 10.464141 | -12.814467 | -16.788189 | 1.120710 | ... | -84.439415 | -104.581825 | -96.698769 | -92.634155 | -105.355164 | -98.748741 | 374803.502594 | 1.0 | 343649.951681 | 0.916876 |
Hessen | POLYGON ((420136.833 5536098.325, 419637.142 5... | 9.029756 | 50.597702 | -1.622735 | -6.414443 | -17.024820 | -24.803610 | -62.581284 | -69.298172 | -49.687885 | ... | -65.227036 | -94.198318 | -89.954620 | -78.154289 | -81.152153 | -76.200714 | 375041.337313 | 1.0 | 367454.244201 | 0.979770 |
Mecklenburg-Vorpommern | MULTIPOLYGON (((617186.776 5910703.204, 616915... | 12.542992 | 53.756886 | 3.585589 | -0.735577 | -11.536910 | -19.055853 | -54.574936 | -60.570900 | -41.148571 | ... | -66.532906 | -94.190651 | -89.620071 | -79.108841 | -83.113586 | -77.816010 | 375282.656001 | 1.0 | 364617.468794 | 0.971585 |
Niedersachsen | MULTIPOLYGON (((354207.146 5814329.405, 354033... | 9.159881 | 52.761726 | 7.291996 | 3.948156 | -6.924499 | -13.796718 | -47.395435 | -53.088966 | -33.938244 | ... | -66.691063 | -93.021286 | -88.040375 | -78.674881 | -84.412430 | -79.208153 | 374952.406115 | 1.0 | 368999.595577 | 0.984124 |
Nordrhein-Westfalen | POLYGON ((290655.096 5623782.625, 290656.794 5... | 7.558647 | 51.475185 | 4.601259 | 0.266528 | -10.279454 | -17.371027 | -52.570518 | -58.947182 | -39.851475 | ... | -64.477089 | -91.286812 | -86.429611 | -76.440918 | -80.905113 | -75.741623 | 375154.023219 | 1.0 | 369483.374360 | 0.984885 |
Rheinland-Pfalz | POLYGON ((301290.602 5531392.741, 301084.142 5... | 7.450834 | 49.910480 | 6.048045 | 2.004749 | -9.018032 | -16.458361 | -50.644306 | -56.789558 | -37.998516 | ... | -64.440720 | -91.378166 | -86.539139 | -76.646225 | -81.347626 | -76.340233 | 375094.153727 | 1.0 | 368624.320980 | 0.982752 |
Saarland | POLYGON ((315507.595 5481970.296, 314910.921 5... | 6.951203 | 49.383770 | 2.946488 | -1.885279 | -12.328921 | -20.652977 | -58.255016 | -63.524227 | -44.533035 | ... | -61.877522 | -90.718102 | -88.003914 | -76.803322 | -79.527748 | -73.550659 | 375104.627509 | 1.0 | 359668.655686 | 0.958849 |
Sachsen | POLYGON ((712923.245 5582183.306, 712733.682 5... | 13.345225 | 51.048813 | 4.797780 | -0.519058 | -11.254671 | -18.747925 | -54.531155 | -61.287735 | -42.006641 | ... | -61.797779 | -88.396599 | -83.679886 | -74.015678 | -78.097069 | -72.758545 | 375951.445976 | 1.0 | 368478.593865 | 0.980122 |
Sachsen-Anhalt | POLYGON ((615254.052 5726743.256, 615250.857 5... | 11.702046 | 52.007923 | -10.956268 | -16.022284 | -26.324970 | -35.175495 | -76.511673 | -83.026962 | -63.564831 | ... | -60.799084 | -92.132446 | -89.499664 | -74.617950 | -75.833855 | -71.662788 | 375364.559161 | 1.0 | 369012.082595 | 0.983077 |
Schleswig-Holstein | MULTIPOLYGON (((426389.821 6005095.954, 426461... | 9.806491 | 54.186234 | 8.032061 | 4.532070 | -6.483582 | -13.468135 | -46.871441 | -52.392162 | -33.390022 | ... | -66.364571 | -92.719681 | -87.642197 | -78.375977 | -83.884872 | -78.847504 | 374599.191754 | 1.0 | 363606.684402 | 0.970651 |
Thüringen | POLYGON ((562083.355 5611197.224, 562099.302 5... | 11.030182 | 50.901054 | 11.136419 | 8.009848 | -2.874439 | -9.569095 | -42.398903 | -48.243431 | -29.178293 | ... | -69.037308 | -94.825653 | -89.288139 | -80.275635 | -86.308487 | -80.896896 | 375262.305978 | 1.0 | 366870.117407 | 0.977638 |
16 rows × 151 columns