About and Reproducibility
Project
This website presents my JSC370 final project on daily PM2.5, weather, and high-pollution prediction across major U.S. metropolitan areas in 2024.
The project builds on my midterm research question:
How are daily PM2.5 levels associated with temperature, precipitation, wind, barometric pressure, and humidity-related conditions across major U.S. metropolitan areas in 2024?
For the final project, I extend that descriptive analysis into predictive modeling. I use Random Forest and XGBoost models to predict daily PM2.5 concentrations and to classify whether a monitor-day exceeds 35 µg/m³.
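As a rough illustration of the classification target, the sketch below turns daily PM2.5 values into a binary high-pollution label. The function name, the constant name, and the sample values are hypothetical, and treating "exceeds" as strictly greater than the threshold is an assumption; the report itself defines the actual label.

```python
from typing import Iterable, List

# Threshold taken from the project description (35 micrograms per cubic metre).
THRESHOLD_UG_M3 = 35.0


def label_high_pollution(
    pm25_values: Iterable[float], threshold: float = THRESHOLD_UG_M3
) -> List[int]:
    """Return 1 for each value strictly above the threshold, otherwise 0.

    Assumption: "exceeds" means strictly greater than the threshold.
    """
    return [1 if value > threshold else 0 for value in pm25_values]


# Made-up illustrative values, not taken from the project data.
daily_pm25 = [8.2, 35.0, 35.1, 60.4]
labels = label_high_pollution(daily_pm25)
print(labels)
```

Under this assumption a monitor-day at exactly 35.0 is not flagged; switching to `>=` would change that boundary case.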
Data Sources
The analysis uses a saved merged file, pm25_weather_local_2024_2.0.csv, created from:
- EPA AQS daily PM2.5 monitor data.
- EPA AQS daily wind, pressure, relative humidity, and dew point summaries.
- NOAA Climate Data Online API variables for maximum temperature, minimum temperature, and precipitation.
The modeling unit is a monitor-day observation. Geographic identifiers, weather measurements, and temporal variables are used as predictors.
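To make the idea of temporal predictors concrete, here is a minimal sketch that derives calendar features from the date of a monitor-day. The specific features (month, day of week, day of year, weekend flag) and the function name are assumptions chosen for illustration; the project may use a different set.

```python
from datetime import date
from typing import Dict, Union


def temporal_features(day: date) -> Dict[str, Union[int, bool]]:
    """Derive simple calendar features from a single observation date.

    Hypothetical feature set, not necessarily the one used in the report.
    """
    weekday = day.weekday()  # Monday = 0, Sunday = 6
    return {
        "month": day.month,
        "day_of_week": weekday,
        "day_of_year": day.timetuple().tm_yday,
        "is_weekend": weekday >= 5,
    }


# Example date within the 2024 study year.
features = temporal_features(date(2024, 7, 4))
print(features)
```

Each monitor-day row would carry these values alongside its geographic identifiers and weather measurements.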
Reproducibility Notes
All source files are in the final_project directory:
- index.qmd builds the project homepage.
- visualizations.qmd builds the interactive Plotly figures.
- report.qmd builds both the HTML report and the downloadable PDF report.
- _quarto.yml defines the Quarto website structure.
- pm25_weather_local_2024_2.0.csv is the analysis dataset.
The project repository is available here: https://github.com/NKwyk/JSC370-project.
The final_project directory on its own is available here: https://github.com/NKwyk/JSC370-project/tree/main/final_project.
Rendering
To reproduce the site locally, run this command from the final_project directory:
quarto render
The rendered website includes index.html, visualizations.html, report.html, about.html, and report.pdf.
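After rendering, a quick check that the listed files exist can catch a partial build. The sketch below is one way to do that; the helper name is hypothetical, and the `_site` directory is an assumption based on Quarto's default output location for website projects, which `_quarto.yml` may override.

```python
from pathlib import Path
from typing import List

# File names listed on this page as the rendered output.
EXPECTED_OUTPUTS = [
    "index.html",
    "visualizations.html",
    "report.html",
    "about.html",
    "report.pdf",
]


def missing_outputs(output_dir: str, expected: List[str] = EXPECTED_OUTPUTS) -> List[str]:
    """Return the expected file names that are not present in output_dir."""
    root = Path(output_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    # Assumption: the site renders into "_site"; adjust if the project differs.
    missing = missing_outputs("_site")
    if missing:
        print(f"Missing rendered files: {missing}")
    else:
        print("All expected files found.")
```

Run it from the final_project directory after the render finishes; an empty result means every listed file was produced.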