Mathematics and Statistics Seminar

When:
Venue: Birkbeck Main Building, Malet Street

No booking required

Sound event detection (SED) is the task of detecting the onset and offset times of sound events in an audio recording. SED has many applications in both academia and industry, such as multimedia information retrieval and monitoring for domestic and public security. However, compared with speech signal processing, which has been researched for many years, the classification and detection of general, everyday sounds received little attention until recent years. One limitation on the study of audio classification and sound event detection was the scarcity of publicly available data sets before the release of the Detection and Classification of Acoustic Scenes and Events (DCASE) data set. The DCASE data set consists of data for acoustic scene classification (ASC), audio tagging (AT) and sound event detection. ASC and AT are tasks in which a system predicts pre-defined labels for an audio clip. SED is a task in which a system predicts both the presence or absence of sound events in an audio clip and the onset and offset times of those events. My presentation will focus on some of the outcomes of the Making Sense of Sound project. The recently released databases (specifically for the speaker verification task) will be explained. At the end, I will show two of our recently developed software demos for the SED task.

Contact name: