Programme SpeD 2025

The 13th Conference on Speech Technology and Human-Computer Dialogue

October 19-22 – Cluj-Napoca, Romania

Aula HUB UTC-N, 3rd floor, George Barițiu 4, Cluj-Napoca

Monday, October 20th, 2025

8:30-9:30

Registration

9:00-9:30

Opening notes

9:30-10:30

Keynote: “Building Trust in the Age of Speech Deepfakes: From Detection to Source Tracing and Robust Biometrics”
Tomi Kinnunen, University of Eastern Finland

10:30-11:00

Coffee break

11:00-12:20

Oral Session: Audio deepfakes and forensics
Chair: Tomi Kinnunen

11:00-11:20	Audio Splicing Detection Using Self-Supervised Representations	Octavian Pascu, Horia Cucu
11:20-11:40	Augmented Transfer Learning for Synthetic Speech Detection	Irina Mutica, Șerban Mihalache, Gheorghe Pop, Dragoș Burileanu
11:40-12:00	Spoofed Speech Detection for Physical and Logical Access Applications using Deep Neural Networks	Cristian-Teodor Neamțu, Șerban Mihalache, Dragoș Burileanu
12:00-12:20	Detecting Audio Deepfakes on the Edge: Lightweight SSL-Based Detection in a Browser Plugin	Octavian Pascu, Dan Oneata, Horia Cucu, Nicolas Müller

12:20-14:00

Lunch – onsite

14:00-14:40

Keynote: “New research directions of Romanian Academy Research Institute for Artificial Intelligence “Mihai Drăgănescu” (ICIA)”
Paul-Andrei Păun, Romanian Academy

14:40-16:00

Oral Session: Natural Language Processing
Chair: Camelia Lemnaru

14:40-15:00	A Study on Metric-Human Alignment in Code Summarization by LLMs through Kolmogorov-Arnold Networks	Georgian Nicolae, Corneliu Burileanu
15:00-15:20	Effects of Attention Head Pruning on Encoder-only Language Models for Multilingual Recipe Classification	Alin-Gabriel Angheluș, Vlad Andrei Negru, Camelia Lemnaru, Rodica Potolea
15:20-15:40	Safeguarding Online Trust: NLP-ML Detection of Computer-Generated Reviews	Cornelia Ionela Bădoi
15:40-16:00	Transcription and Classification of Compound and Special Numeral Entities Using Artificial Intelligence and Rule-Based Methods	Vasile Mitruț, Ștefania Ștefănescu, Ștefan-Adrian Toma

16:00-16:30

Coffee break

16:30-17:50

Oral Session: Emotion, speaker recognition, diarization, multimodal
Chair: Peter Mihajlik

16:30-16:50	Improved visually prompted keyword localisation in real low-resource settings	Leanne Nortje, Dan Oneata, Gabriel Pîrlogeanu, Herman Kamper
16:50-17:10	On the Contribution of Lexical Features to Speech Emotion Recognition	David Combei
17:10-17:30	From Lab to Field: Practical Deployment and Evaluation of a Speaker Recognition System	Raluca-Ionela Costan, Mihai Coca, Ștefan-Adrian Toma
17:30-17:50	Expanding and Refining RoMEMEs: A Multimodal Corpus of Romanian Memes for Advanced AI Analysis	Vasile Păiș, Daniela Gîfu

19:00-21:00

Conference Banquet at Maimuța Plângătoare

Tuesday, October 21st, 2025

9:00-9:30

Registration

9:30-10:30

Keynote: “Evaluating speech technology – challenges and opportunities”
Cassia Valentini-Botinhao, University of Edinburgh

10:30-11:00

Coffee break

11:00-12:40

Oral Session: Synthetic speech
Chair: Cassia Valentini-Botinhao

11:00-11:20	Latent Insights: Exploring Phoneme Diversity in Natural and Synthetic Speech through Latent Representations	Diptasree Debnath, Helard Becerra, Andrew Hines
11:20-11:40	Style-Controlled VALL-E for Few-Shot Emotional German TTS	Rami Kammoun, Mohammed Salah Al-Radhi
11:40-12:00	Towards Hungarian-English Code-Switching Speech Dataset Construction by using Multi Speaker-adaptive Text-to-Speech Synthesis	Piroska Zsófia Barta, Peter Mihajlik
12:00-12:20	Adding Emotion Conditioning in Speech Synthesis via Multi-Term Classifier-Free Guidance	Radu-George Bolborici, Ana Antonia Neacșu
12:20-12:40	Improving Speech Synthesis by Using a Cognitive View on F0 Contours	Doina Jitcă

12:40-14:00

Lunch – onsite

14:00-14:20

Sponsor: “Orange Romania Research and Development Ecosystem”
Răzvan Mihai, Orange Romania

14:20-14:40

Sponsor: “Infineon Romania – Semiconductors R&D Competence Center”
Valentina Davidoiu, Infineon Romania

14:40-15:40

Oral Session: Automatic Speech Recognition
Chair: Ștefan-Adrian Toma

14:40-15:00	Impact of Text Origin and Real-Synthetic Data Ratio in TTS-Augmented Low-Resource ASR	Mengke Dalai, Peter Mihajlik
15:00-15:20	Open Source State-Of-the-Art Solution for Romanian Speech Recognition	Gabriel Pîrlogeanu, Alexandru-Lucian Georgescu, Horia Cucu
15:20-15:40	Cross-lingual Transfer Learning Experiments for Arabic ASR	Amin Hassairi, Peter Mihajlik

15:40-16:10

Coffee break

16:10-17:50

Oral Session: Signal processing
Chair: Constantin Paleologu

16:10-16:30	Pretrained Speech Models Learn Boundaries, Not Patterns: An Analysis of Supervised vs. Unsupervised Capabilities	Abdul Rehman, Kavisha Jayathunge, Jian-Jun Zhang, Xiaosong Yang
16:30-16:50	ADNAC: Audio Denoiser using Neural Audio Codec	Lucian Daniel Jimon, Mircea Florin Vaida, Adriana Stan
16:50-17:10	A Robust Decomposition-Based RLS Algorithm for Echo Cancellation Applications	Radu-Andrei Otopeleanu, Camelia Elisei-Iliescu, Constantin Paleologu, Jacob Benesty, Cristian-Lucian Stanciu, Cristian Anghel
17:10-17:30	Range of lip movement during the production of selected consonants in Kannada speakers	Ajish Abraham, V. Sivaramakrishnan, N. Swapna, N. Manohar
17:30-17:50	About the spatial arrangement of acoustic sensors for monitoring protected wildlife area	Corneliu Rusu

17:50

Closing notes

Wednesday, October 22nd, 2025

9:00-15:00

LDS Country Workshop in Romania

Page last updated: 02.10.2025