SpeD 2025 – Welcome message

The “SpeD 2025” Organizing Committee warmly invites you to attend the 13th Conference on Speech Technology and Human-Computer Dialogue in Cluj-Napoca, Romania. The conference will be held in person at the Technical University of Cluj-Napoca.

The conference will bring together scientists, developers, and professionals to present their work, meet colleagues, discuss new ideas, and build collaborations among research groups from universities, research centers, and the commercial sector. The technical program will include oral sessions, keynotes by renowned speakers, and demonstrations of the latest research on a wide range of topics at the forefront of science and engineering in speech technology and human-computer dialogue.

Past editions of the “SpeD” conference series were technically sponsored by IEEE and EURASIP, with proceedings indexed in the IEEE Xplore® Digital Library, Scopus, and the Web of Science Conference Proceedings Citation Index (WoS indexing has not yet been finalized for the previous, 2023 edition). This year, papers accepted and presented at the conference will again be submitted for inclusion in IEEE Xplore, subject to meeting IEEE Xplore’s scope and quality requirements, and for indexing in Web of Science.

Joint event

This year, the Language Data Space (LDS) Workshop, organized by the Research Institute for Artificial Intelligence “Mihai Drăgănescu” of the Romanian Academy, will be co-located with “SpeD”.

Main Topics

  • Automatic Speech Recognition (ASR): algorithms, models, and systems for accurate and robust transcription of spoken language in diverse acoustic and linguistic conditions.
  • Audio Deepfakes and Forensics: detection, analysis, and prevention of manipulated or synthetic speech; forensic applications for speaker verification and authenticity assessment.
  • Text-to-Speech (TTS) Synthesis: neural and statistical approaches to generating natural, expressive, and intelligible synthetic speech.
  • Speech Emotion Recognition (SER): computational methods for analyzing affective and paralinguistic cues in speech.
  • Automatic Speaker Recognition and Diarization: techniques for speaker identification, verification, and segmentation in multi-speaker environments.
  • Audio and Speech Signal Processing: enhancement, separation, coding, and transformation of speech and audio signals.
  • Multimodal and Audio-Visual Speech Processing: integration of visual, linguistic, and contextual cues for improved understanding and synthesis of human communication.
  • Natural Language Processing (NLP): models and tools for understanding, generating, and interacting with human language, including dialogue systems and large language models.

Gold Partners

Schedule

  • Paper submission (5 – 6 pages, IEEE format): June 2, 2025, extended to July 7, 2025.
  • Notification of acceptance and reviewers’ comments: August 15, 2025.
  • Submission of final papers: September 5, 2025.
  • Conference: October 19-22, 2025.