Fully Convolutional Event-camera Voice Activity Detection Based on Event Intensity
| dc.contributor.author | Arman Savran | |
| dc.contributor.author | Savran, Arman | |
| dc.date.accessioned | 2025-10-06T17:49:35Z | |
| dc.date.issued | 2023 | |
| dc.description.abstract | The use of visual signals to detect vocally active duration is quite helpful when there is severe acoustic noise, and can even be the only option if the audio channel is missing. There has been significant progress in video-based voice activity detection (VAD). On the other hand, while recently emerging event camera (EC) technology has demonstrated great benefits for applications in robotics, drones, autonomous vehicles, and mobile devices, including visual speech recognition topics, it has not been explored as a vision-only VAD front-end. In this work, we propose an event intensity-based method by designing a fully convolutional network to efficiently realize an EC-VAD that segments vocally active duration. Efficiency is due to pooling the data over the mouth area, which reduces the dimensions by totally collapsing local spatial information, as well as to one-stage detection by a fully temporal convolutional network. Experimental evaluations show successful detection of voice activity, with about 0.91 area under the receiver operating characteristic curve, over a dataset including high speech content variability and different types of facial actions. © 2023 Elsevier B.V. All rights reserved. | |
| dc.description.sponsorship | Yaşar University Project Evaluation Commission (BAP112) | |
| dc.description.sponsorship | Supported by the Yaşar University Project Evaluation Commission for the project “Dynamic Facial Analysis with Neuromorphic Camera” [grant number: BAP112]. | |
| dc.identifier.doi | 10.1109/ASYU58738.2023.10296754 | |
| dc.identifier.isbn | 9798350306590 | |
| dc.identifier.scopus | 2-s2.0-85178262391 | |
| dc.identifier.uri | https://www.scopus.com/inward/record.uri?eid=2-s2.0-85178262391&doi=10.1109%2FASYU58738.2023.10296754&partnerID=40&md5=947d732b1a1eb6c99546515ddb3a5ff0 | |
| dc.identifier.uri | https://gcris.yasar.edu.tr/handle/123456789/8519 | |
| dc.identifier.uri | https://doi.org/10.1109/ASYU58738.2023.10296754 | |
| dc.language.iso | English | |
| dc.publisher | Institute of Electrical and Electronics Engineers Inc. | |
| dc.relation.ispartof | 2023 Innovations in Intelligent Systems and Applications Conference ASYU 2023 | |
| dc.rights | info:eu-repo/semantics/closedAccess | |
| dc.subject | Event Camera, Fully Convolutional Network, Lip Activity, Visual Speech, Voice Activity Detection, Acoustic Noise, Audio Acoustics, Convolution, Speech Recognition, Audio Channels, Autonomous Vehicles, Camera Technology, Convolutional Networks, Visual Signals, Cameras | |
| dc.subject | Event Camera | |
| dc.subject | Visual Speech | |
| dc.subject | Fully Convolutional Network | |
| dc.subject | Lip Activity | |
| dc.subject | Voice Activity Detection | |
| dc.title | Fully Convolutional Event-camera Voice Activity Detection Based on Event Intensity | |
| dc.type | Conference Object | |
| dspace.entity.type | Publication | |
| gdc.author.institutional | Savran, Arman (14032056900) | |
| gdc.author.scopusid | 14032056900 | |
| gdc.bip.impulseclass | C5 | |
| gdc.bip.influenceclass | C5 | |
| gdc.bip.popularityclass | C4 | |
| gdc.coar.type | text::conference output | |
| gdc.collaboration.industrial | false | |
| gdc.description.department | ||
| gdc.description.departmenttemp | [Savran A.] Yaşar University, Department of Computer Engineering, İzmir, Turkey | |
| gdc.description.endpage | 6 | |
| gdc.description.publicationcategory | Conference Item - International - Institutional Faculty Member | |
| gdc.description.startpage | 1 | |
| gdc.identifier.openalex | W4388038797 | |
| gdc.index.type | Scopus | |
| gdc.oaire.diamondjournal | false | |
| gdc.oaire.impulse | 3.0 | |
| gdc.oaire.influence | 2.5462394E-9 | |
| gdc.oaire.isgreen | false | |
| gdc.oaire.popularity | 4.118417E-9 | |
| gdc.oaire.publicfunded | false | |
| gdc.openalex.fwci | 1.2226 | |
| gdc.openalex.normalizedpercentile | 0.8 | |
| gdc.opencitations.count | 5 | |
| gdc.plumx.mendeley | 11 | |
| gdc.plumx.scopuscites | 6 | |
| gdc.scopus.citedcount | 6 | |
| gdc.virtual.author | Savran, Arman | |
| person.identifier.scopus-author-id | Savran, Arman (14032056900) | |
| project.funder.name | Supported by the Yaşar University Project Evaluation Commission for the project “Dynamic Facial Analysis with Neuromorphic Camera” [grant number: BAP112]. | |
| relation.isAuthorOfPublication | ec3245ee-803e-4537-8ade-40b369fad1c3 | |
| relation.isAuthorOfPublication.latestForDiscovery | ec3245ee-803e-4537-8ade-40b369fad1c3 | |
| relation.isOrgUnitOfPublication | ac5ddece-c76d-476d-ab30-e4d3029dee37 | |
| relation.isOrgUnitOfPublication.latestForDiscovery | ac5ddece-c76d-476d-ab30-e4d3029dee37 |
