CV
Technical Skills
Speech & Audio
ML & LLM
Engineering
Languages
Education
VNUHCM – University of Science Ho Chi Minh City, Vietnam
Bachelor of Science, Advanced Program in Computer Science (APCS K17) Sep 2017 – Jun 2021
- GPA: 3.83 / 4.0 · TOEFL ITP: 550 / 677
Experience
Vinfast JSC Ho Chi Minh City, Vietnam
AI Team Lead – Speech Generation Jun 2026 – Present
- Unified TTS Serving Gateway: Architected a production FastAPI gateway consolidating multiple TTS model tiers and normalization backends, with streaming, an async job queue, API-key auth with per-key usage tracking, and horizontally sharded deployment for multilingual production serving.
- TTS Normalization Automation Harness: Designed an end-to-end QC automation framework with LLM-based error classification, BA-configurable per-project switch-code vocabulary management, automated Java SDK and prompt rebuilds, and full regression evaluation, eliminating engineering intervention for vocabulary updates.
VinSmart Future – Vingroup JSC Ho Chi Minh City, Vietnam
AI Team Lead – Speech Generation Feb 2026 – Jun 2026
- Research Team Leadership: Led a team of 6 on speech generation: voice design, non-verbal audio, and reinforcement learning for speech synthesis.
- Voice Model Product Management: Coordinated between research and product; standardized QA and release processes across robot assistants, in-vehicle AI, and super-app voice mode.
Senior AI Engineer – Core AI Research Jul 2025 – Feb 2026
- Vietnamese Phoneme-Based Flow Matching TTS: Developed a flow-matching TTS model with phoneme-adaptive fine-tuning, achieving stable quality and low WER in production.
- TensorRT & Triton Serving Optimization: Achieved 2× throughput improvement and increased concurrent user capacity.
- Hybrid Text Normalization: SLM + rule-based hybrid pipeline improving accuracy 13% over the rule-based baseline.
Vingroup Big Data Institute – Vingroup JSC Hanoi, Vietnam
Middle AI Engineer – AI Agent & Voice Biometric Jan 2024 – Jul 2025
- AI Agent with Knowledge Base: Built a RAG-based agent with query processing, embedding, function calling, and automatic double-bot QC evaluation (2025).
- Speech Data Warehouse: Managed a team building storage, querying, processing, release management, and NL querying integration (2025).
- Streaming Diarization Optimization: EEND-VC on Triton, with 2× speed and higher quality than third-party solutions (2025).
- Anti-spoofing Module: EER of 3.75% combining RawBoost augmentation, speech SSL, and Graph Neural Networks (2024).
AI Engineer – Speech Synthesis & Voice Biometric Oct 2021 – Dec 2023
- Smart AI Voice Recording Service: Speaker Diarization + ASR + Voice Biometrics for insurance fraud detection (2023).
- Government Voice Biometric Project: Microservice gateway with SNR Estimator, VAD, ASR, Speaker Counter via TorchServe and Milvus (2023).
- Multispeaker Acoustic Model: 4× GPU reduction, 30-min fine-tuning, zero-shot cloning with 30 s of audio (2023).
- Universal Multistream Vocoder: HiFi-GAN optimization with 1.5× faster inference and no quality loss (2022).
- AI Service User Interfaces: Demo and labeling tools using Streamlit, Material UI React, Wavesurfer.js (2022).
- Tacotron2 Enhancement: Monotonic Alignment Attention integration to eliminate noise artifacts (2021).
- Massive Speech Data Crawler: Large-scale automated pipeline covering selection, denoising, transcription, and quality ranking (2021).
VinAI Research Institute – Vingroup JSC Hanoi, Vietnam
Engineering Resident – ERP Batch 1 Dec 2020 – Sep 2021
- Vietnamese TTS for Electric Vehicles: Integrated Tacotron2, FastSpeech2, GlowTTS, HiFi-GAN; optimized with ONNX Runtime for edge inference.
- SpeechMT: Real-time voice-based machine translation service (Docker + FastAPI).
- Speech-Based QA for Car Manuals: Lightweight end-to-end voice-driven QA system.
KMS Technology Ho Chi Minh City, Vietnam
Software Engineer – Center of Excellence Jun 2020 – Sep 2020
- RASA Chatbot Core Research: Intent classification combining Dense (ConveRT, BERT) and Sparse (TF-IDF, Count Vectors) features.
- Reading Comprehension Backend: BIDAF-based services with AllenNLP, Flask, Swagger, Docker Compose, RabbitMQ.
Scooter Saigon Tour – Tourism Startup Ho Chi Minh City, Vietnam
SEO Developer – Founder Jan 2014 – Aug 2017
- Built a WordPress site with a PHP backend and VTCpay integration; managed SEO and social channels.
Freelance AI Projects: Voice-Preserving Speech MT · Singing Voice Conversion · Pronunciation Assessment · Speech Enhancement · Sleep Stage Classification · Emotional Dubbing
Research Affiliations
Speech and Language Lab, Nanyang Technological University Singapore
Research Intern (Remote) Sep 2022 – Mar 2023
- Government project with ST Engineering on classifying emergency sound events in urban areas.
Artificial Intelligence Lab, University of Science Ho Chi Minh City
Research Assistant Feb 2019 – Sep 2021
- Built a Vietnamese speech corpus (19 GB); researched text normalization and phonology.
- Organized ERC2019 – Emotion Recognition Challenge.
- Teaching assistant at VNsigma Python beginner class (American Consulate).
Robotics & IoT Lab, University of Science Ho Chi Minh City
Research Assistant May 2019 – Sep 2021
- Teacher for EV3 Mindstorm; coach for FLL, EVJ Makethon, and WRO competitions.
Computational Linguistics Center, University of Science Ho Chi Minh City
Research Assistant Apr 2019 – Sep 2019
- Teaching assistant for the English Linguistics faculty (USSH), guiding students in SDL Trados Studio.
Publications
- MAPR 2025 · Unified Acoustic Representation Learning for Vietnamese Speech Classification Tasks
- MAPR 2025 · Analyzing the Correlation and Impact of Speech Evaluation Metrics on Speaker Verification and ASR
- MAPR 2025 · Adapting WavLM for Vietnamese Speaker Diarization in Real-world Conversations
- RIVF 2024 · Enhancing Deepfake Detection: WavLM and Advanced RawBoost Augmentation
- Voice Privacy 2024 · Voice Attacker: Multi-Head Factorized Attentive Reconstructor and Gradient Reversal for Random Prosody Anonymization
- DCASE 2023 · Sound Event Detection with Soft Labels Using Self-Attention for Global Scene Feature Extraction
- IJAL 2022 · FastSpeechStyle: Fast, Emotion-Controllable, High-Quality Speech Synthesis
- NeurIPS 2021 · Vietnamese Speech-based Question Answering over Car Manuals
- NAFOSTED 2020 · Vietnamese Speech Synthesis with End-to-End Model and Text Normalization
- MediaEval 2020 · Emotion Classification Using WaveNet Features with SpecAugment and EfficientNet
PyPI
- vinorm: Vietnamese Text Normalization · 34,933+ downloads / 6 months
- viphoneme: Vietnamese Grapheme-to-IPA Phonetization · 4,195+ downloads / 6 months
Thesis: DeepSpeechVC – Voice Cloning Framework with Speech Synthesis and Voice Conversion
Awards & Recognition
- Outstanding Employee of the Year – VinBigdata (2023)
- First Place, VLSP TTS Emotional Speech Synthesis (2022)
- First Prize, Science-A-Thon, 9th Vietnamese Summer School of Science (2022)
- Outstanding Student Award in Artificial Intelligence, Ho Chi Minh City (2021)
- Excellent Thesis – Science and Technology Student Award (2021)
- Runner-Up, KMS Hackathon (2020)
Certifications & Competitions
- ISO/IEC 19795-1:2021 (NIST/NVLAP) for Voice Biometric Testing (2024)
- IEEE Spoken Language Technology Workshop Hackathon (2022)
- 7th of 23 teams, IEEE Signal Processing Cup (2022)
- FPT Scholarship, Final Round, Code War Competition (2019)
- Final Round, Samsung Collegiate Programming Cup (2018)
- ACM-ICPC Vietnam National Programming Contest – Rank 33 (2018)
- Self-Driving Cars Specialisation – University of Toronto (Coursera) (2018)
Volunteering
- Volunteer for medical research on dental surgery at HCMC Oromaxillofacial Hospital (2021)
- F1 contact tracing support during the Covid-19 pandemic (2022)
- Organized ERC2019 – Emotion Recognition Challenge (2019)
- Teaching assistant at VNsigma Python beginner class (American Consulate) (2019)
