Abdou Aziz DIOP

PhD Researcher in Speech-to-Speech Translation · NLP · ASR · African Languages

I am a PhD student at Université Cheikh Anta Diop, Dakar, Senegal, working on Speech-to-Speech Translation (S2ST) for low-resource and African languages. My research spans automatic speech recognition, neural machine translation, and speech synthesis, with a focus on building spoken-language technology for the hundreds of millions of speakers of African languages whom existing NLP systems underserve.

Before my PhD, I earned a Bachelor’s degree in Mathematics & Computer Science and a Master’s in Mathematics, with a thesis on Rank-Based Cryptography (Encryption & Signature), then specialised in Data Transmission and Information Systems. I have since worked extensively on multilingual NLP, conversational AI, and data science.

Email: abdouaziz@gmail.com  ·  GitHub: abdouaziz  ·  LinkedIn: abdouaziiz  ·  Twitter/X: @abdouaziiz


News

  • Mar 2026: Started working on a unified end-to-end S2ST architecture for the Wolof–French and Pulaar–French language pairs.
  • Jan 2026: Released a new Wolof ASR dataset containing 120 hours of transcribed speech, available on Hugging Face.
  • Sep 2025: Presented work on low-resource ASR at an African NLP workshop. Slides and recording are available on GitHub.
  • Jun 2025: Open-sourced a fine-tuned Whisper model for Wolof, the first publicly available Wolof ASR checkpoint.
  • 2023: Started PhD in Speech-to-Speech Translation.

Research Interests

  • Speech-to-Speech Translation
  • Automatic Speech Recognition
  • Neural Machine Translation
  • Text-to-Speech Synthesis
  • Low-Resource NLP
  • African Languages
  • Multilingual Models
  • Self-Supervised Learning for Speech
  • Conversational AI

Publications & Preprints

Towards End-to-End Speech-to-Speech Translation for Low-Resource African Languages
Abdou Aziz DIOP
Work in progress, 2026 · Tags: S2ST, African Lang.

Fine-Tuning Whisper for Wolof: A Low-Resource ASR Study
Abdou Aziz DIOP
Preprint, 2025 · Tags: ASR, Wolof

Building NLP Resources for Wolof: Tokenizers, Language Models & Benchmarks
Abdou Aziz DIOP
Preprint, 2024 · Tags: NLP, African

Paper links and co-author lists will be added as submissions are finalised.


Education

PhD in Speech-to-Speech Translation (in progress)
Université Cheikh Anta Diop (UCAD), Dakar · 2023 – present

Postgraduate in Data Transmission & Information Systems
Université Cheikh Anta Diop (UCAD), Dakar

Master’s in Mathematics · Thesis on Rank-Based Cryptography (Encryption & Signature)

Bachelor’s in Mathematics & Computer Science


Selected Projects

  • African ASR Toolkit — Whisper fine-tuning pipelines and pre-trained checkpoints for Wolof, Pulaar, and Mandinka.
  • Low-Resource S2ST — Cascaded and direct S2ST systems benchmarked on African language pairs.
  • NLP for Wolof — Tokenizers, language models, and evaluation datasets for Wolof.