Programme SpeD 2025
The 13th Conference on Speech Technology and Human-Computer Dialogue
October 19-22 – Cluj-Napoca, Romania
Aula HUB UTC-N, 3rd floor, George Barițiu 4, Cluj-Napoca
Monday, October 20th, 2025
8:30-9:30 |
Registration |
9:00-9:30 |
Opening notes |
9:30-10:30 |
Keynote: “Building Trust in the Age of Speech Deepfakes: From Detection to Source Tracing and Robust Biometrics”
Tomi Kinnunen, University of Eastern Finland |
10:30-11:00 |
Coffee break |
11:00-12:20 |
Oral Session: Audio deepfakes and forensics
Chair: Tomi Kinnunen
11:00-11:20 |
Audio Splicing Detection Using Self-Supervised Representations
Octavian Pascu, Horia Cucu |
11:20-11:40 |
Augmented Transfer Learning for Synthetic Speech Detection
Irina Mutica, Șerban Mihalache, Gheorghe Pop, Dragoș Burileanu |
11:40-12:00 |
Spoofed Speech Detection for Physical and Logical Access Applications using Deep Neural Networks
Cristian-Teodor Neamțu, Șerban Mihalache, Dragoș Burileanu |
12:00-12:20 |
Detecting Audio Deepfakes on the Edge: Lightweight SSL-Based Detection in a Browser Plugin
Octavian Pascu, Dan Oneata, Horia Cucu, Nicolas Müller |
|
12:20-14:00 |
Lunch – onsite |
14:00-14:40 |
Keynote: “New research directions of Romanian Academy Research Institute for Artificial Intelligence ‘Mihai Drăgănescu’ (ICIA)”
Paul-Andrei Păun, Romanian Academy |
14:40-16:00 |
Oral Session: Natural Language Processing
Chair: Camelia Lemnaru
14:40-15:00 |
A Study on Metric-Human Alignment in Code Summarization by LLMs through Kolmogorov-Arnold Networks
Georgian Nicolae, Corneliu Burileanu |
15:00-15:20 |
Effects of Attention Head Pruning on Encoder-only Language Models for Multilingual Recipe Classification
Angheluș Alin-Gabriel, Vlad Andrei Negru, Camelia Lemnaru, Rodica Potolea |
15:20-15:40 |
Safeguarding Online Trust: NLP-ML Detection of Computer-Generated Reviews
Cornelia Ionela Bădoi |
15:40-16:00 |
Transcription and Classification of Compound and Special Numeral Entities Using Artificial Intelligence and Rule-Based Methods
Vasile Mitruț, Ștefania Ștefănescu, Ștefan-Adrian Toma |
|
16:00-16:30 |
Coffee break |
16:30-17:50 |
Oral Session: Emotion, speaker recognition, diarization, multimodal
Chair: Peter Mihajlik
16:30-16:50 |
Improved visually prompted keyword localisation in real low-resource settings
Leanne Nortje, Dan Oneata, Gabriel Pîrlogeanu, Herman Kamper |
16:50-17:10 |
On the Contribution of Lexical Features to Speech Emotion Recognition
David Combei |
17:10-17:30 |
From Lab to Field: Practical Deployment and Evaluation of a Speaker Recognition System
Raluca-Ionela Costan, Mihai Coca, Ștefan-Adrian Toma |
17:30-17:50 |
Expanding and Refining RoMEMEs: A Multimodal Corpus of Romanian Memes for Advanced AI Analysis
Vasile Păiș, Daniela Gîfu |
|
19:00-21:00 |
Conference Banquet |
Tuesday, October 21st, 2025
9:00-9:30 |
Registration |
9:30-10:30 |
Keynote: “Evaluating speech technology – challenges and opportunities”
Cassia Valentini-Botinhao, University of Edinburgh |
10:30-11:00 |
Coffee break |
11:00-12:40 |
Oral Session: Synthetic speech
Chair: Cassia Valentini-Botinhao
11:00-11:20 |
Latent Insights: Exploring Phoneme Diversity in Natural and Synthetic Speech through Latent Representations
Diptasree Debnath, Helard Becerra, Andrew Hines |
11:20-11:40 |
Style-Controlled VALL-E for Few-Shot Emotional German TTS
Rami Kammoun, Mohammed Salah Al-Radhi |
11:40-12:00 |
Towards Hungarian-English Code-Switching Speech Dataset Construction by using Multi Speaker-adaptive Text-to-Speech Synthesis
Piroska Zsófia Barta, Peter Mihajlik |
12:00-12:20 |
Adding Emotion Conditioning in Speech Synthesis via Multi-Term Classifier-Free Guidance
Radu-George Bolborici, Ana Antonia Neacșu |
12:20-12:40 |
Improving Speech Synthesis by Using a Cognitive View on F0 Contours
Doina Jitcă |
|
12:40-14:00 |
Lunch – onsite |
14:00-14:20 |
Sponsor: “Orange Romania Research and Development Ecosystem”
Răzvan Mihai, Orange Romania |
14:20-14:40 |
Sponsor: “Infineon Romania – Semiconductors R&D Competence Center”
Valentina Davidoiu, Infineon Romania |
14:40-15:40 |
Oral Session: Automatic Speech Recognition
Chair: Ștefan-Adrian Toma
14:40-15:00 |
Impact of Text Origin and Real-Synthetic Data Ratio in TTS-Augmented Low-Resource ASR
Mengke Dalai, Peter Mihajlik |
15:00-15:20 |
Open Source State-Of-the-Art Solution for Romanian Speech Recognition
Gabriel Pîrlogeanu, Alexandru-Lucian Georgescu, Horia Cucu |
15:20-15:40 |
Cross-lingual Transfer Learning Experiments for Arabic ASR
Amin Hassairi, Peter Mihajlik |
|
15:40-16:10 |
Coffee break |
16:10-17:50 |
Oral Session: Signal processing
Chair: Constantin Paleologu
16:10-16:30 |
Pretrained Speech Models Learn Boundaries, Not Patterns: An Analysis of Supervised vs. Unsupervised Capabilities
Abdul Rehman, Kavisha Jayathunge, Jian-Jun Zhang, Xiaosong Yang |
16:30-16:50 |
ADNAC: Audio Denoiser using Neural Audio Codec
Lucian Daniel Jimon, Mircea Florin Vaida, Adriana Stan |
16:50-17:10 |
A Robust Decomposition-Based RLS Algorithm for Echo Cancellation Applications
Radu-Andrei Otopeleanu, Camelia Elisei-Iliescu, Constantin Paleologu, Jacob Benesty, Cristian-Lucian Stanciu, Cristian Anghel |
17:10-17:30 |
Range of lip movement during the production of selected consonants in Kannada speakers
Ajish Abraham, V. Sivaramakrishnan, N. Swapna, N. Manohar |
17:30-17:50 |
About the spatial arrangement of acoustic sensors for monitoring protected wildlife area
Corneliu Rusu |
|
17:50 |
Closing notes |
Page last updated: 02.10.2025
