Syed Mostofa Monsur

I am a first-year PhD student in the Computer Science Department at Stony Brook University.

Previously, I led the AI/ML Team at Celloscope, a Bangladeshi fintech company. I have 5+ years of industry experience, during which I developed multiple user-facing systems powered by NLP and speech-based UX. I built a number of real-world applications with natural language interfaces, making services accessible to hundreds of thousands of users who have limited access to information. I received my Bachelor of Science in Computer Science and Engineering from CSE BUET, where I worked with Professor Muhammad Abdullah Adnan on active learning for distributed systems.

Email  /  CV  /  gScholar  /  LinkedIn   


Research Interests

I'm interested in NLP, conversational AI, and human-centered design.

Publications

[4] SynthNID: Synthetic Data to Improve End-to-end Bangla Document Key Information Extraction
Syed Mostofa Monsur, Shariar Kabir, Sakib Chowdhury
BLP Workshop at EMNLP, 2023
[paper]
[3] Grid-Coding: An Accessible, Efficient, and Structured Coding Paradigm for Blind and Low-Vision Programmers
Md Ehtesham-Ul-Haque, Syed Mostofa Monsur, Syed Masum Billah
UIST, 2022 (Best Paper Award)
[project page] / [video] / [paper]
[2] SHONGLAP: A Large Bengali Open-Domain Dialogue Corpus
Syed Mostofa Monsur, Sakib Chowdhury, Md Shahrar Fatemi, Shafayat Ahmed
LREC, 2022
[poster] / [paper]
[1] Distributing Active Learning Algorithms
Syed Mostofa Monsur, Muhammad Abdullah Adnan
NSysS, 2020
[video] / [slides] / [paper]

Industry Projects

Agrani Voice Banking
Led the Speech and NLU Team at Celloscope

Agrani Bank is one of Bangladesh's largest state-owned banks, with a large number of customers who have very little access to information. Agrani Voice Banking makes banking services accessible to everyone. It is powered by Bengali ASR and a fine-tuned NLU engine for natural language-driven fund transfers and inquiries.

National ID Information Extraction using Document Transformers
Led the NLP Team at Celloscope
slides

We treated NID information extraction as a document question-answering problem, querying the key fields of the NID document image. Pretrained document transformers, fine-tuned on both real user data and synthetic data, achieve strong performance on extracting NID information.

License-Plate Extraction from Very Noisy Real-World Deployment
Led the NLP Team at Celloscope and the ML Team at Spectrum
slides

License-plate extraction in a very noisy real-world setting. We fine-tuned end-to-end sequence extraction models on real and synthetic data for better performance. The system has been deployed at several toll booths in Bangladesh for reporting analytics.

Industry-Grade ASR, TTS and Speaker Verification for Bengali Speech-Driven Systems
Led the NLP Team at Celloscope

Collected and pre-processed 400+ hours of Bengali audio and transcriptions. Trained high-quality end-to-end ASR models. Trained an industry-grade TTS system for Bengali with 40+ hours of curated data and improved the generated audio quality with vocoders (naturalizing audio). Integrated these models with natural language-driven user interfaces, including speech-driven chatbots. Developed an industry-grade speaker verification system using an ensemble of pre-trained UniSpeech-SAT, WavLM, and ECAPA-TDNN models.


Template stolen from Jon Barron's Site.