arXiv 2509.17247

DeepASA: An Object-Oriented One-for-All Network for Auditory Scene Analysis

By Dongheon Lee, Younghoo Kwon, et al.

Published 2025-09-21

We propose DeepASA, a multi-purpose model for auditory scene analysis that performs multi-input multi-output (MIMO) source separation, dereverberation, sound event detection (SED), audio classification, and direction-of-arrival estimation (DoAE) within a unified framework. DeepASA is designed for complex auditory scenes where multiple, often similar, sound sources overlap in time and move dynamically in space. To ac…
