Research
Current Research
AGNBoost: Expanding Color Selection with Machine Learning to identify IR AGN
Identifying active galactic nuclei (AGN) from photometry alone is notoriously challenging. UV-optical photometry is highly susceptible to dust obscuration, and even mid-IR color selections—while more robust—often sacrifice either reliability or completeness (e.g., Kirkpatrick+2023). The problem becomes particularly acute when working with sparse photometric coverage: star-forming galaxies (SFGs) can easily mimic the rising mid-IR power-law characteristic of AGN emission, as illustrated in the animated color-color plot below.
Traditional color selection is essentially a rudimentary classification algorithm that draws decision boundaries in 2-D color space. AGNBoost extends this concept to higher dimensions, leveraging all available spectral information through machine learning. The key innovation is using distributional regression via XGBoostLSS (März 2019) to simultaneously predict two quantities: fracAGN (the fraction of 3–30 μm mid-IR emission attributable to AGN) and photometric redshift.
Unlike standard regression methods that only predict a single value (the conditional mean), XGBoostLSS predicts the entire conditional distribution for each target variable. This means AGNBoost naturally quantifies both the uncertainty in each prediction and the full range of plausible values—critical for AGN identification where degeneracies between SFGs and AGN are common.
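To make the difference between a point estimate and a predicted distribution concrete, here is a minimal sketch using only the Python standard library. It assumes, purely for illustration, that the predicted distribution for one galaxy's redshift is Gaussian with a mean and standard deviation. The distribution families AGNBoost actually uses are not stated here, and the function names and numbers below are hypothetical.

```python
from statistics import NormalDist


def credible_interval(mu, sigma, level=0.95):
    """Central interval of an assumed Gaussian predictive distribution."""
    dist = NormalDist(mu, sigma)
    tail = (1.0 - level) / 2.0
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)


def prob_above(mu, sigma, threshold):
    """Probability that the target exceeds `threshold` under the same assumption."""
    return 1.0 - NormalDist(mu, sigma).cdf(threshold)


# Hypothetical predicted parameters for a single galaxy's redshift.
mu_z, sigma_z = 2.0, 0.1

low, high = credible_interval(mu_z, sigma_z)
print(f"point estimate: z = {mu_z:.2f}")
print(f"95% interval:   z = {low:.2f} to {high:.2f}")
print(f"P(z > 2.1) = {prob_above(mu_z, sigma_z, 2.1):.3f}")
```

A point regressor would report only the first printed line. The interval and the exceedance probability are what a full predictive distribution adds, and they are the quantities one would inspect when an SFG and an AGN solution are both plausible.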
Key Features:
- Trained on 10^6 mock galaxies from CIGALE spanning z = 0.01–8.0
- Uses 11 JWST bands (7 NIRCam + 4 MIRI) plus 55 derived colors as inputs
- Achieves sub-1% outlier fractions on test data (0.19% for fracAGN, 0.63% for redshift)
- Processes catalogs of ~1000 galaxies in minutes (vs. hours-to-days for traditional SED fitting)
- Provides robust uncertainty estimates combining aleatoric, epistemic, and photometric uncertainties
- Handles missing photometric bands through integrated imputation methods
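The count of 55 derived colors matches the number of unique band pairs among 11 bands (11 choose 2 = 55). The sketch below illustrates that bookkeeping. The filter names and magnitudes are assumptions chosen for illustration, not a list taken from AGNBoost, and `derive_colors` is a hypothetical helper rather than part of its API.

```python
from itertools import combinations

# Assumed filter set: 7 NIRCam + 4 MIRI bands, ordered by wavelength.
# These names are illustrative; check the AGNBoost documentation for the real list.
BANDS = [
    "F115W", "F150W", "F200W", "F277W", "F356W", "F410M", "F444W",  # NIRCam
    "F770W", "F1000W", "F1500W", "F2100W",                          # MIRI
]


def derive_colors(mags):
    """Return every pairwise color (bluer minus redder magnitude)."""
    colors = {}
    for blue, red in combinations(BANDS, 2):
        colors[f"{blue}-{red}"] = mags[blue] - mags[red]
    return colors


# Made-up AB magnitudes for one galaxy that brightens toward the mid-IR.
mags = {band: 25.0 - 0.1 * i for i, band in enumerate(BANDS)}

colors = derive_colors(mags)
print(len(colors))  # 55 unique band pairs
```

Feeding all pairwise colors to the model alongside the 11 magnitudes gives 66 input features under this reading of the list above.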
AGNBoost is publicly available on GitHub and enables rapid AGN candidate identification in large JWST surveys, which is essential for efficient follow-up observations in the era of wide-field infrared astronomy. For more detail, see the paper preprint: [arXiv:2506.03130](https://arxiv.org/abs/2506.03130).