GeoEPIC Regional Simulation Tutorial
This tutorial provides step-by-step guidance on setting up and running simulations for multiple sites across a region, often as part of a calibration process, using GeoEPIC.
-
Set Up Environment: As with single-site simulations, the first step is to configure your local machine and install the necessary
geoEpic
components. This ensures your system is ready for both single-site and regional-scale tasks. For detailed instructions, please refer to the Installation Page. Completing this step is a prerequisite for all subsequentgeoEpic
operations. -
Prepare Input Files for Multiple Sites: For regional simulations, you need the standard input files (Site - .SIT, Soil - .SOL, Weather - .DLY, Operation Schedule - .OPC) for each site within your study region. This typically involves batch processing or scripting to generate files for numerous locations. Consistent naming conventions (e.g., using site IDs) are crucial for managing these files.
- Refer to the Site File Section, Soil File Section, Weather File Section, and Operation Schedule File Section in the getting started menu for details on the format and generation of each file type.
- You might use scripts to loop through a list of site coordinates or identifiers to automate the generation and saving of these files. Example concept for downloading weather for multiple sites:
Successfully preparing these files systematically for all sites is essential for regional analysis.
from geoEpic.weather import get_daymet_data from geoEpic.io import DLY import os sites = [ {'id': 'siteA', 'lat': 35.1, 'lon': -78.5}, {'id': 'siteB', 'lat': 35.2, 'lon': -78.6}, # ... more sites ] start_date, end_date = '2015-01-01', '2019-12-31' output_weather_dir = './workspace/weather/' os.makedirs(output_weather_dir, exist_ok=True) for site_info in sites: site_id = site_info['id'] lat, lon = site_info['lat'], site_info['lon'] print(f"Processing weather for {site_id}...") try: df = get_daymet_data(lat, lon, start_date, end_date) dly_path = os.path.join(output_weather_dir, f"{site_id}.DLY") DLY(df).save(dly_path) except Exception as e: print(f"Error processing {site_id}: {e}")
-
Set Up Workspace and Folder Structure: Managing inputs and outputs for numerous sites requires an organized workspace. A well-defined folder structure is critical for regional runs and calibration workflows.
- It's recommended to create a main
workspace
directory. Inside, organize subdirectories for each input type and for outputs, often with further subdirectories per site or run configuration. - Example Structure:
workspace/ ├── sites/ # .SIT files (e.g., siteA.SIT, siteB.SIT) ├── soil/ # .SOL files (e.g., siteA.SOL, siteB.SOL) ├── weather/ # .DLY files (e.g., siteA.DLY, siteB.DLY) ├── opc/ # .OPC files or templates │ ├── files/ # Generated .OPC files (e.g., siteA.OPC) │ └── templates/ # OPC templates if using generate_opc ├── model/ # EPIC executable (e.g., EPIC1102.exe) └── outputs/ # Simulation outputs, potentially nested ├── run_config_1/ │ ├── siteA/ # Outputs for siteA (siteA.ACY, siteA.DGN) │ └── siteB/ # Outputs for siteB └── run_config_2/ └── ...
- Refer to the Workspace Setup Guide for best practices and potential templates provided by
geoEpic
. This organization facilitates batch processing and managing results across the region.
- It's recommended to create a main
-
Run Regional Simulation / Calibration: Executing simulations for multiple sites often involves looping through your site list, configuring the model for each, and running it. Calibration adds complexity by adjusting model parameters based on observed data, typically requiring iterative runs.
- This process usually involves scripting to automate the setup and execution for each site within your organized workspace. You might use Python loops, parallel processing libraries (like
multiprocessing
ordask
), or specialized functions withingeoEpic
if available for batch runs. - For detailed strategies on handling multiple sites, managing parameters during calibration, potentially using parallel execution, and interpreting calibration results, please consult the Regional Simulations and Calibration Guide. This guide covers the advanced techniques needed for efficient and effective regional modeling.
- This process usually involves scripting to automate the setup and execution for each site within your organized workspace. You might use Python loops, parallel processing libraries (like
-
Process Outputs: After running simulations for all sites, outputs (e.g., .ACY, .DGN files) will be available, likely organized within your workspace structure (e.g., in
workspace/outputs/run_config_X/siteY/
). Processing involves reading these numerous files, aggregating data, and performing regional analysis.-
Reading Multiple Output Files: Use scripts to loop through your output directories and load data for each site.
import os import pandas as pd from geoEpic.io import ACY # Or DGN, etc. output_base_dir = './workspace/outputs/run_config_1/' sites_to_process = [d for d in os.listdir(output_base_dir) if os.path.isdir(os.path.join(output_base_dir, d))] # Get site folders all_yields = [] for site_id in sites_to_process: acy_path = os.path.join(output_base_dir, site_id, f"{site_id}.ACY") # Assuming output ACY is named after site_id if os.path.exists(acy_path): try: yield_df = ACY(acy_path).get_var('YLDG') yield_df['site_id'] = site_id # Add site identifier all_yields.append(yield_df) except Exception as e: print(f"Error reading {acy_path}: {e}") # Combine data from all sites into a single DataFrame if all_yields: regional_yields = pd.concat(all_yields, ignore_index=True) print(regional_yields.head()) else: print("No yield data processed.")
-
Regional Analysis and Visualization: Analyze aggregated data to understand regional patterns, average performance, or variability.
Visualizations might include maps of yields, comparative charts, or plots showing the range of outcomes across the region.import matplotlib.pyplot as plt import seaborn as sns # For potentially more complex plots if 'regional_yields' in locals() and not regional_yields.empty: # Example: Boxplot of yields across sites for a specific year year_to_plot = 2017 yields_year = regional_yields[regional_yields['YR'] == year_to_plot] if not yields_year.empty: plt.figure(figsize=(12, 7)) sns.boxplot(x='site_id', y='YLDG', data=yields_year) plt.title(f'Regional Corn Yield Distribution ({year_to_plot})') plt.xlabel('Site ID') plt.ylabel('Yield (t/ha)') plt.xticks(rotation=45, ha='right') plt.tight_layout() plt.show() else: print(f"No data found for year {year_to_plot}") # Example: Calculate average yield per year across the region avg_regional_yield = regional_yields.groupby('YR')['YLDG'].mean().reset_index() print("\nAverage Regional Yield per Year:") print(avg_regional_yield)
-
Exporting Aggregated Results: Save the combined regional data for reporting or further analysis.
This final step transforms individual site outputs into meaningful regional insights, often crucial for calibration validation and understanding spatial patterns.
-