Drone-based Unmanned Aerial Systems (UASs) provide an efficient means for the early detection and monitoring of remote wildland fires owing to their rapid deployment, low flight altitudes, high 3D maneuverability, and ever-expanding sensor capabilities. Recent sensor advancements have made side-by-side RGB/IR sensing feasible for UASs. Aggregating optical and thermal imagery enables robust environmental observation: the thermal feed supplies information that would otherwise be obscured in a purely RGB setup, effectively "seeing through" thick smoke and tree occlusion. In this work, we present the Fire detection and modeling: Aerial Multi-spectral image dataset (FLAME 2), the first labeled collection of UAS-collected side-by-side RGB/IR aerial imagery of prescribed burns. Using FLAME 2, we then present two image-processing methodologies built on multi-modal learning: (1) Deep Learning (DL)-based benchmarks for detecting fire and smoke frames using transfer learning and feature fusion, and (2) an exemplary image-processing system cascaded with the DL-based classifier to perform fire localization. We show that these two techniques achieve reasonable gains over either single-domain video inputs or models trained from scratch on the fire detection task.
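As a minimal sketch of the feature-fusion idea referenced above (not the paper's actual benchmark architecture), the toy example below concatenates per-frame feature vectors extracted from the RGB and IR streams before a single classifier; all feature extractors, weights, and example pixel values here are hypothetical placeholders.

```python
# Toy sketch of feature fusion for RGB/IR fire-frame classification.
# The extractor, weights, and frames are illustrative stand-ins,
# not the FLAME 2 benchmark models.

from typing import List

def extract_features(frame: List[float]) -> List[float]:
    # Stand-in for a CNN backbone: summarize a frame by mean and max intensity.
    return [sum(frame) / len(frame), max(frame)]

def fuse(rgb_feat: List[float], ir_feat: List[float]) -> List[float]:
    # Feature fusion by concatenating the two modality vectors.
    return rgb_feat + ir_feat

def classify(fused: List[float], weights: List[float], bias: float) -> bool:
    # Linear classifier on the fused vector: True means a "fire" frame.
    score = sum(w * x for w, x in zip(weights, fused)) + bias
    return score > 0.0

# Hypothetical example: the IR response stays strong over a hot spot
# even when the optical view is obscured by smoke.
rgb_frame = [0.2, 0.3, 0.25, 0.2]   # dim, smoke-obscured optical pixels
ir_frame = [0.9, 0.95, 0.85, 0.9]   # bright thermal response

fused = fuse(extract_features(rgb_frame), extract_features(ir_frame))
is_fire = classify(fused, weights=[0.1, 0.1, 1.0, 1.0], bias=-1.0)
```

The IR features dominate the fused score in this sketch, illustrating why the thermal stream can rescue a decision that an RGB-only classifier would miss under heavy smoke.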