Syed Mostofa Monsur
I am a PhD student in the Computer Science Department at the State University of New York at Stony Brook.
Previously, I led the AI/ML Team at Celloscope. I received my Bachelor of Science in Computer Science and Engineering from CSE, BUET.
Email / CV / gScholar / LinkedIn
|
Research Interests
I'm interested in NLP, LLM reasoning, and AI for science, among other areas.
|
[5] Scaling down, Powering up: A Survey on the Advancements of Small Vision-Language Models
Sheikh Iftekhar Ahmed, Muhammad Zubair Hasan, Abrar Jahin Niloy, Syed Mostofa Monsur, Mark V. Albert
Information Fusion, 2025
[paper]
|
[4] SynthNID: Synthetic Data to Improve End-to-end Bangla Document Key Information Extraction
Syed Mostofa Monsur, Shariar Kabir, Sakib Chowdhury
BLP Workshop at EMNLP, 2023
[paper]
|
[3] Grid-Coding: An Accessible, Efficient, and Structured Coding Paradigm for Blind and Low-Vision Programmers
Md Ehtesham-Ul-Haque, Syed Mostofa Monsur, Syed Masum Billah
UIST, 2022 (Best Paper Award)
[paper] / [video] / [featured]
|
[2] SHONGLAP: A Large Bengali Open-Domain Dialogue Corpus
Syed Mostofa Monsur, Sakib Chowdhury, Md Shahrar Fatemi, Shafayat Ahmed
LREC, 2022
[poster] / [paper]
|
[1] Distributing Active Learning Algorithms
Syed Mostofa Monsur, Muhammad Abdullah Adnan
NSysS, 2020
[video] / [slides] / [paper]
|
Agrani Voice Banking
Led AI/ML Team at Celloscope
Agrani Bank is one of Bangladesh's largest state-owned banks, with a large
customer base that has very little access to information. Agrani Voice Banking
makes banking services accessible to everyone. It is powered by Bengali ASR
and a fine-tuned NLU engine for natural-language-driven
fund transfers and inquiries.
|
Industry-Grade ASR, TTS and Speaker Verification for Bengali Speech-Driven Systems
Led AI/ML Team at Celloscope
Collected and pre-processed 400+ hours of Bengali audio and
transcriptions, and trained high-quality end-to-end ASR models.
Trained an industry-grade TTS system for Bengali on 40+ hours of
curated data and improved the naturalness of the generated audio with vocoders.
Integrated these models with natural-language-driven user interfaces,
including speech-driven chatbots. Developed an industry-grade speaker
verification system using an ensemble of pre-trained
UniSpeech-SAT, WavLM, and ECAPA-TDNN models.
|