Metadata-Version: 2.4
Name: hrrr-data
Version: 2.3.8
Summary: A toolkit for accessing, downloading, and processing HRRR Lite data
Author: Jan Kazil
License-Expression: BSD-3-Clause
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: cartopy
Requires-Dist: matplotlib
Requires-Dist: numpy
Requires-Dist: netcdf4
Requires-Dist: pygrib
Requires-Dist: requests
Requires-Dist: s3fs
Requires-Dist: xarray
Provides-Extra: dev
Requires-Dist: pytest>=8; extra == "dev"
Requires-Dist: pytest-cov>=5; extra == "dev"
Requires-Dist: mypy>=1.11; extra == "dev"
Requires-Dist: ruff>=0.5; extra == "dev"
Requires-Dist: pre-commit>=3.7; extra == "dev"
Dynamic: license-file

# hrrr-data

**hrrr-data** is a Python toolkit for accessing, downloading, and processing High-Resolution Rapid Refresh (HRRR) forecast data from NOAA’s public S3 bucket.

It provides:

- Top-level command-line tools that
  - Download HRRR surface forecast GRIB2 files from NOAA’s public S3 bucket for a specified date range, forecast initialization time, forecast lead time, and region.
  - Extract a subset of commonly used variables from the GRIB2 files into netCDF files.
  - Plot single-level (2-D) HRRR variables over CONUS (the contiguous U.S.).

- Modules for
  - Interacting with the NOAA HRRR S3 bucket and downloading HRRR forecast data
  - Working with HRRR data in GRIB2 and netCDF formats
  - Plotting HRRR data

## Installation

### pip

```bash
pip install hrrr-data
```

### conda / mamba

```bash
mamba install -c jan.kazil -c conda-forge hrrr-data
```

## Overview

This repository provides the following top-level command-line interface (CLI) scripts for working with HRRR surface forecasts:

- **`hrrr-fetch-sfc-forecast`**: Download HRRR surface forecast GRIB2 files for a given date range, initialization hour, forecast lead time, and region from the NOAA S3 bucket. If requested, a subset of pre-defined variables (temperature, humidity, wind speed, precipitation) is extracted into a netCDF file. Both the GRIB2 and the processed netCDF files are stored locally.

- **`hrrr-extract-sfc-vars`**: Convert a single, previously downloaded HRRR GRIB2 file to netCDF and write a new netCDF file that contains the pre-defined set of variables, with long names and metadata attributes added.
  
- **`hrrr-plot-singlelevel-conus`**: Create a plot of every HRRR variable that has only the horizontal grid dimensions latitude and longitude in a given HRRR netCDF file, one PNG file per variable. Assumes that the netCDF file contains variables over CONUS.

The pre-defined variables extracted and saved in netCDF files are:

  - Air temperature at 2 m above ground
  - Dew point temperature at 2 m above ground
  - Relative humidity at 2 m above ground
  - Wind speed at 10 m and 80 m above ground
  - 1 h accumulated precipitation

## Workflow

The typical workflow is:

1. **Download forecast data** using `hrrr-fetch-sfc-forecast`, specifying the date range, initialization hour, forecast lead time, and region of interest. This fetches the GRIB2 files from the NOAA HRRR S3 bucket, stores them locally, and, if requested, extracts pre-defined variables into netCDF files.

2. **Extract from a single GRIB2 file** using `hrrr-extract-sfc-vars` when you already have a GRIB2 file available locally and only want to extract a set of pre-defined variables for analysis.

3. **Work with the outputs** in standard netCDF format using your preferred scientific Python libraries (`xarray`, `netCDF4`, etc.), integrate them into downstream machine learning and analytics workflows, or plot the data using `hrrr-plot-singlelevel-conus`.

## Command-line interface (CLI)

### `hrrr-fetch-sfc-forecast`

Download HRRR surface forecast GRIB2 files from the NOAA S3 archive and optionally extract selected surface variables into netCDF files.

**Usage:**

```bash
hrrr-fetch-sfc-forecast START_YEAR START_MONTH START_DAY END_YEAR END_MONTH END_DAY FORECAST_INIT_HOUR FORECAST_LEAD_HOUR REGION DATA_DIR [-n N_JOBS] [-e] [-r] [-v]
```

**Arguments:**

- `START_YEAR START_MONTH START_DAY`: beginning of date range  
- `END_YEAR END_MONTH END_DAY`: end of date range  
- `FORECAST_INIT_HOUR`: forecast initialization hour (UTC)  
- `FORECAST_LEAD_HOUR`: forecast lead time in hours  
- `REGION`: HRRR region (e.g. `conus`)  
- `DATA_DIR`: local directory into which HRRR forecast files will be downloaded  

**Options:**

- `-n N_JOBS, --n N_JOBS`: number of parallel download processes  
- `-e, --extract`: extract selected surface variables (temperature, humidity, wind, precipitation) into a netCDF file  
- `-r, --refresh`: download and process files even if they already exist in the data directory  
- `-v, --verbose`: print detailed progress information  

Downloaded and processed files follow the naming convention:

```
hrrr.<YYYYMMDD>/<HRRR region tag>/hrrr.t<II>z.wrfsfcf<FF>.grib2  
hrrr.<YYYYMMDD>/<HRRR region tag>/hrrr.t<II>z.wrfsfcf<FF>.nc
```

where:

- `YYYYMMDD` is the year, month, and day  
- `II` is the initialization hour  
- `FF` is the forecast lead time in hours
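
To check which files a run should have produced, the naming convention above can be reproduced in a few lines. This is an illustrative sketch under stated assumptions: `forecast_paths` is a hypothetical helper, the end date is treated as inclusive, the region tag is taken to equal the `REGION` argument (e.g. `conus`), and `II` and `FF` are assumed to be zero-padded to two digits.

```python
from datetime import date, timedelta
from pathlib import PurePosixPath


def forecast_paths(start: date, end: date, init_hour: int, lead_hour: int,
                   region: str = "conus"):
    """Yield (GRIB2, netCDF) relative paths for each day from start to end."""
    day = start
    while day <= end:  # assumption: the end date is inclusive
        folder = PurePosixPath(f"hrrr.{day:%Y%m%d}") / region
        # assumption: two-digit zero padding for II and FF
        stem = f"hrrr.t{init_hour:02d}z.wrfsfcf{lead_hour:02d}"
        yield folder / f"{stem}.grib2", folder / f"{stem}.nc"
        day += timedelta(days=1)


for grib2, netcdf in forecast_paths(date(2025, 6, 30), date(2025, 7, 1), 12, 32):
    print(grib2, netcdf)
```

Joining each relative path onto `DATA_DIR` gives the expected local location, which makes it straightforward to detect missing files after a download.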


### `hrrr-extract-sfc-vars`

Extract from an HRRR surface forecast GRIB2 file the variables listed in Section "Overview" into a netCDF file.

**Usage:**

```bash
hrrr-extract-sfc-vars /path/to/file.grib2
```

**Arguments:**

- `file.grib2`: path to a local HRRR GRIB2 file.

The tool produces a new netCDF file named `file.nc` containing the variables listed in Section "Overview" (2 m air temperature, 2 m dew point temperature, 2 m relative humidity, wind speed at 10 m and 80 m, and 1 h accumulated precipitation) and adds global metadata identifying the processing.

### `hrrr-plot-singlelevel-conus`

Create a plot of each single-level (2-D) HRRR variable in a netCDF file for CONUS and save one PNG per variable.

**Usage:**

```bash
hrrr-plot-singlelevel-conus /path/to/file.nc
```

**Arguments:**

- `file.nc`: path to a local HRRR netCDF file containing single-level variables on the HRRR grid (expects coordinates `gridlat_0` and `gridlon_0`, and variables with dimensions exactly `('ygrid_0', 'xgrid_0')`).

**Output:**

- One PNG per qualifying variable, saved alongside the input file and named:
  ```
  file.<variable_name>.png
  ```
  Each figure is a Lambert Conformal CONUS map with a colorbar labeled from the variable’s `long_name` and `units` attributes. Variables that do not match the required 2-D grid shape are skipped.
  
**Example:**

  Relative humidity at 2 m, forecast valid 2025-07-01 20:00:00 UTC (12 UTC initialization, 32 h forecast lead time):

![Relative humidity at 2 m, 1 July 2025](plots/hrrr.t12z.wrfsfcf32.RH_P0_L103_GLC0.png)  


## Modules

- **`hrrr_fetch_surface_forecasts.py`**: Provides the `run_fetch()` function for programmatic use, allowing Python scripts to download HRRR surface forecast GRIB2 files from NOAA’s S3 archive over a specified date range and configuration, and optionally extract selected surface variables into netCDF files for further analysis.

- **`s3.py`**: Functions for interacting with the NOAA HRRR S3 bucket, including:
  - Listing available files via direct path matching or wildcard-style expressions
  - Downloading HRRR data files from S3 to a local directory
  - Retrieving S3 object metadata (size, last modified, ETag, user-defined metadata, etc.) for a file within the NOAA HRRR S3 bucket

- **`tools.py`**: Utilities for working with HRRR data in GRIB2 and netCDF formats, including:
  - Listing variables in GRIB2 files (`pygrib`)
  - Converting GRIB2 files to netCDF using `ncl_convert2nc`
  - Extracting pre-defined variables from netCDF files using `xarray`
  - Retrieving the metadata for an HRRR file in S3

- **`plotting`**: Utilities for plotting HRRR data.
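
The function names in these modules are not listed here, so the sketch below does not use them. It shows instead how a wildcard-style listing could be done directly with `s3fs`, one of the package's dependencies. The bucket name `noaa-hrrr-bdp-pds` is an assumption (this README does not name the bucket), as is the key layout, which is taken to match the local naming convention. `sfc_glob` and `list_sfc_files` are hypothetical helpers.

```python
from datetime import date

# Assumption: name of NOAA's public HRRR bucket; not stated in this README.
BUCKET = "noaa-hrrr-bdp-pds"


def sfc_glob(day: date, init_hour: int, region: str = "conus") -> str:
    """Build a glob pattern matching all lead times of one surface forecast."""
    return (
        f"{BUCKET}/hrrr.{day:%Y%m%d}/{region}/"
        f"hrrr.t{init_hour:02d}z.wrfsfcf*.grib2"
    )


def list_sfc_files(day: date, init_hour: int, region: str = "conus") -> list[str]:
    """List matching objects using anonymous (unsigned) S3 access."""
    import s3fs  # imported here so that sfc_glob works without s3fs installed

    fs = s3fs.S3FileSystem(anon=True)
    return sorted(fs.glob(sfc_glob(day, init_hour, region)))


print(sfc_glob(date(2025, 7, 1), 12))
```

Calling `list_sfc_files(date(2025, 7, 1), 12)` requires network access; the package's own `s3.py` functions are likely the better choice in practice, since they are written for this bucket.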

## Demo Scripts

The `demos` directory provides example scripts demonstrating individual operations:

- `demo_s3_ls.py` and `demo_s3_ls_re.py`: List available HRRR files in the NOAA S3 bucket.
- `demo_s3_download.py`: Download a single GRIB2 file.
- `demo_s3_download_date_range.py`: Download GRIB2 files for a given date range.
- `demo_tools_grib_list_vars.py`: List variables in a GRIB2 file.
- `demo_tools_grib2nc.py`: Convert a GRIB2 file to netCDF.
- `demo_s3_info.py`: Retrieve and display S3 object metadata.

## Development

### Code Quality and Testing Commands

- `make fmt` - Runs `ruff format`, which automatically reformats Python files according to the style rules in `pyproject.toml`.
- `make lint` - Runs `ruff check --fix`, which checks for style errors, bugs, outdated patterns, etc., and auto-fixes what it can.
- `make check` - Runs `fmt` and `lint`.
- `make type` - Currently disabled. Runs `mypy`, the static type checker, using the strictness settings from `pyproject.toml`.
- `make test` - Runs `pytest` with reporting (configured in `pyproject.toml`).

## Disclaimer

The HRRR data accessed by this software are publicly available from NOAA and are subject to their terms of use. This project is not affiliated with or endorsed by NOAA.

## Author

Jan Kazil - jan.kazil.dev@gmail.com - [jankazil.com](https://jankazil.com)

## License

BSD 3-Clause
