arXiv 2509.17247

DeepASA: An Object-Oriented One-for-All Network for Auditory Scene Analysis

By Dongheon Lee, Younghoo Kwon, et al.

Published 2025-09-21

We propose DeepASA, a multi-purpose model for auditory scene analysis that performs multi-input multi-output (MIMO) source separation, dereverberation, sound event detection (SED), audio classification, and direction-of-arrival estimation (DoAE) within a unified framework. DeepASA is designed for complex auditory scenes where multiple, often similar, sound sources overlap in time and move dynamically in space. To ac…
