{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Generating a Custom ATL06 over Grand Mesa, CO\n",
    "\n",
    "Process ATL03 data from the Grand Mesa, CO region and produce a customized ATL06 dataset.\n",
    "\n",
    "### What is demonstrated\n",
    "\n",
    "* The `icesat2.atl06p` API is used to perform a SlideRule parallel processing request of the Grand Mesa region\n",
    "* The `icesat2.cmr` and `icesat2.h5p` API's are used to manually retrieve specific ATL06 datasets corresponding to the Grand Mesa region\n",
    "* The `pyproj` and `shapely` packages are used to subset ATL06 data that was manually retrieved\n",
    "* The `matplotlib` package is used to plot the data processed by SlideRule alongside the manually retrieved and subsetted data\n",
    "\n",
    "### Points of interest\n",
    "\n",
    "The resulting datasets plotted at the bottom of the notebook show that existing ATL06 data is not available for the entire Grand Mesa region.  By using the SlideRule API to process ATL03 data and produce a customized ATL06 dataset, elevation data can be returned for the entire region of interest."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "import concurrent.futures\n",
    "import time\n",
    "from datetime import datetime\n",
    "import geopandas as gpd\n",
    "import matplotlib.pyplot as plt\n",
    "from pyproj import Transformer\n",
    "from shapely.geometry import Polygon, Point\n",
    "from sliderule import sliderule, icesat2, earthdata, h5"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## SlideRule Configuration"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# Configure ICESat-2 API\n",
    "icesat2.init(\"slideruleearth.io\", verbose=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Specify Region of Interest"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# Specify region of interest from geojson\n",
    "poly_fn = 'grandmesa.geojson'\n",
    "region = sliderule.toregion(poly_fn)[\"poly\"]\n",
    "region"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# Read geojson with geopandas\n",
    "pregion = gpd.read_file(poly_fn)\n",
    "pregion.plot()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# Prepare coordinate lists for plotting the region of interest polygon\n",
    "region_lon = [e[\"lon\"] for e in region]\n",
    "region_lat = [e[\"lat\"] for e in region]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Specify parameters for ATL06-SR processing"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# Build ATL06 Request\n",
    "parms = {\n",
    "    \"poly\": region,\n",
    "    \"srt\": icesat2.SRT_LAND,\n",
    "    \"cnf\": icesat2.CNF_SURFACE_HIGH,\n",
    "    \"ats\": 10.0,\n",
    "    \"cnt\": 10,\n",
    "    \"len\": 40.0,\n",
    "    \"res\": 20.0\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Calculate ATL06-SR Elevations from ATL03 Photons using SlideRule"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# Latch Start Time\n",
    "perf_start = time.perf_counter()\n",
    "\n",
    "# Request ATL06 Data\n",
    "atl06_sr = icesat2.atl06p(parms)\n",
    "\n",
    "# Latch Stop Time\n",
    "perf_stop = time.perf_counter()\n",
    "\n",
    "# Display Statistics\n",
    "perf_duration = perf_stop - perf_start\n",
    "print(\"Completed in {:.3f} seconds of wall-clock time\".format(perf_duration))\n",
    "print(\"Reference Ground Tracks: {}\".format(atl06_sr[\"rgt\"].unique()))\n",
    "print(\"Cycles: {}\".format(atl06_sr[\"cycle\"].unique()))\n",
    "print(\"Received {} elevations\".format(atl06_sr.shape[0]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Plot ATL06-SR Results"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "f, ax = plt.subplots()\n",
    "ax.set_title(\"ATL06-SR Points\")\n",
    "ax.set_aspect('equal')\n",
    "atl06_sr.plot(ax=ax, column='h_mean', cmap='inferno', s=0.1)\n",
    "ax.plot(region_lon, region_lat, linewidth=1, color='g')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "atl06_sr"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Retrieve ATL06 Elevations Directly using `icesat2.h5p` API\n",
    "\n",
    "This method of reading H5 data directly is the recommended method and runs faster than `icesat2.h5` as each dataset is read in parallel on the server and shares a common cache. The code below has a couple other optimizations including only sampling every 10th coordinate for point inclusion, and reading the lat,lon information first and then reading only the necessary heights.  \n",
    "\n",
    "See https://nsidc.org/data/atl06 for the source dataset."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# read ATL06 resource and return heights within polygon\n",
    "def subsetted_read(resource, polygon, transformer):\n",
    "\n",
    "    heights = []\n",
    "    latitudes = []\n",
    "    longitudes = []\n",
    "    api_time = 0\n",
    "\n",
    "    try:\n",
    "\n",
    "        # List of tracks to read\n",
    "        tracks = [\"1l\", \"1r\", \"2l\", \"2r\", \"3l\", \"3r\"]\n",
    "\n",
    "        # Build list of each lat,lon dataset to read\n",
    "        geodatasets = []\n",
    "        for track in tracks:\n",
    "            prefix = \"/gt\"+track+\"/land_ice_segments/\"\n",
    "            geodatasets.append({\"dataset\": prefix+\"latitude\", \"startrow\": 0, \"numrows\": -1})\n",
    "            geodatasets.append({\"dataset\": prefix+\"longitude\", \"startrow\": 0, \"numrows\": -1})\n",
    "\n",
    "        # Read lat,lon from resource\n",
    "        api_start = time.perf_counter()\n",
    "        geocoords = h5.h5p(geodatasets, resource, \"icesat2\")\n",
    "        api_stop = time.perf_counter()\n",
    "        api_time += (api_stop - api_start)\n",
    "\n",
    "        # Build list of the subsetted h_li datasets to read\n",
    "        hidatasets = []\n",
    "        for track in tracks:\n",
    "            prefix = \"/gt\"+track+\"/land_ice_segments/\"\n",
    "            lat_dataset = geocoords[prefix+\"latitude\"]\n",
    "            lon_dataset = geocoords[prefix+\"longitude\"]\n",
    "            startrow = -1\n",
    "            numrows = -1\n",
    "            index = 0\n",
    "            while index < len(lat_dataset):\n",
    "                lat = lat_dataset[index]\n",
    "                lon = lon_dataset[index]\n",
    "                c = transformer.transform(lat, lon)\n",
    "                point = Point(c[0], c[1])\n",
    "                intersect = point.within(polygon)\n",
    "                if startrow == -1 and intersect:\n",
    "                    startrow = index\n",
    "                elif startrow != -1 and not intersect:\n",
    "                    break\n",
    "                index += 10 # only sample values for speed increase\n",
    "            if startrow >= 0:\n",
    "                numrows = index - startrow\n",
    "            if numrows > 0:\n",
    "                hidatasets.append({\"dataset\": prefix+\"h_li\", \"startrow\": startrow, \"numrows\": numrows, \"prefix\": prefix})\n",
    "\n",
    "        # Read h_li from resource\n",
    "        if len(hidatasets) > 0:\n",
    "            api_start = time.perf_counter()\n",
    "            hivalues = h5.h5p(hidatasets, resource, \"icesat2\")\n",
    "            api_stop = time.perf_counter()\n",
    "            api_time += (api_stop - api_start)\n",
    "\n",
    "        # Append results\n",
    "        for entry in hidatasets:\n",
    "            heights += hivalues[entry[\"prefix\"]+\"h_li\"].tolist()\n",
    "            latitudes += geocoords[entry[\"prefix\"]+\"latitude\"][entry[\"startrow\"]:entry[\"startrow\"]+entry[\"numrows\"]].tolist()\n",
    "            longitudes += geocoords[entry[\"prefix\"]+\"longitude\"][entry[\"startrow\"]:entry[\"startrow\"]+entry[\"numrows\"]].tolist()\n",
    "\n",
    "    except Exception as e:\n",
    "        pass\n",
    "\n",
    "    # Return results\n",
    "    return {\"resource\":  resource,\n",
    "            \"h_li\":      heights,\n",
    "            \"latitude\":  latitudes,\n",
    "            \"longitude\": longitudes,\n",
    "            \"time\":      api_time}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# Initialize Total Time Spent Inside API\n",
    "api_total_time = 0\n",
    "\n",
    "# Latch Start Time\n",
    "perf_start = time.perf_counter()\n",
    "\n",
    "# Query ATL06 Files from NASA CMR System\n",
    "resources = earthdata.cmr(polygon=region, short_name='ATL06')\n",
    "print('Retrieved %s resources that intersect region' % (len(resources)))\n",
    "\n",
    "# Create Projection Transformer\n",
    "transformer = Transformer.from_crs(4326, 3857) # GPS to Web Mercator\n",
    "\n",
    "# Project Polygon\n",
    "pregion = []\n",
    "for point in region:\n",
    "    ppoint = transformer.transform(point[\"lat\"], point[\"lon\"])\n",
    "    pregion.append(ppoint)\n",
    "polygon = Polygon(pregion)\n",
    "\n",
    "# Initialize Result Dataset\n",
    "results = {\"latitude\": [], \"longitude\": [], \"h_li\":[]}\n",
    "\n",
    "# Update Available Servers #\n",
    "num_servers, _ = sliderule.update_available_servers()\n",
    "print('Allocating %d workers across %d processing nodes' % (num_servers, num_servers))\n",
    "\n",
    "# Make Parallel Processing Requests\n",
    "with concurrent.futures.ThreadPoolExecutor(max_workers=num_servers) as executor:\n",
    "    futures = [executor.submit(subsetted_read, resource, polygon, transformer) for resource in resources]\n",
    "    # Wait for Results\n",
    "    result_cnt = 0\n",
    "    for future in concurrent.futures.as_completed(futures):\n",
    "        result_cnt += 1\n",
    "        result = future.result()\n",
    "        print('%d results returned for %s (%d out of %d)' % (len(result[\"h_li\"]), result[\"resource\"], result_cnt, len(resources)))\n",
    "        results[\"h_li\"] += result[\"h_li\"]\n",
    "        results[\"latitude\"] += result[\"latitude\"]\n",
    "        results[\"longitude\"] += result[\"longitude\"]\n",
    "        api_total_time += result[\"time\"]\n",
    "\n",
    "# Latch Stop Time\n",
    "perf_stop = time.perf_counter()\n",
    "perf_duration = perf_stop - perf_start\n",
    "\n",
    "# Build GeoDataframe of ATL06 Standard Data Product\n",
    "geometry = gpd.points_from_xy(results[\"longitude\"], results[\"latitude\"])\n",
    "df = gpd.pd.DataFrame(results)\n",
    "atl06_sdp = gpd.GeoDataFrame(df, geometry=geometry)\n",
    "\n",
    "# Filter Height Values\n",
    "atl06_sdp = atl06_sdp[atl06_sdp[\"h_li\"] < 10000]\n",
    "\n",
    "# Print Statistics\n",
    "print(\"Completed in {:.3f} seconds of wall-clock time\".format(perf_duration))\n",
    "print(\"Spent {:.3f} concurrent seconds waiting for api\".format(api_total_time))\n",
    "print(\"Retrieved {} valid elevations out of {} total elevations\".format(len(atl06_sdp), len(results[\"h_li\"])))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Trim ATL06 points to region polygon"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# Create shapely polygon\n",
    "pregion = Polygon(zip(region_lon, region_lat))\n",
    "\n",
    "# Using geopandas\n",
    "idx = atl06_sdp.within(pregion)\n",
    "atl06_sdp = atl06_sdp[idx]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Plot ATL06-SR vs. ATL06"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# Set color ramp limits\n",
    "vmin, vmax = atl06_sdp['h_li'].quantile((0.02, 0.98))\n",
    "\n",
    "# Create dictionary of common keyword arguments\n",
    "plot_kw = {'cmap':'inferno', 's':0.1, 'vmin':vmin, 'vmax':vmax}\n",
    "\n",
    "f, axa = plt.subplots(2,1, figsize=(10,10)) # sharex=True, sharey=True\n",
    "axa[0].set_title(\"ATL06-SR Points\")\n",
    "atl06_sr.plot(ax=axa[0], column='h_mean', **plot_kw)\n",
    "axa[1].set_title(\"ATL06 Points\")\n",
    "atl06_sdp.plot(ax=axa[1], column='h_li', **plot_kw)\n",
    "\n",
    "for ax in axa:\n",
    "    # Plot the region polygon\n",
    "    ax.plot(region_lon, region_lat, linewidth=1, color='g')\n",
    "    ax.set_aspect('equal');\n",
    "    ax.set_facecolor('lightgray')"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "sliderule_env",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
