Portfolio

Voice Assessment

Published: January 01, 2025

This project is customized to use the Whisper-large ASR model, with a React UI and several signal processing tweaks added.

The 9th International Workshop on Vietnamese Language and Speech Processing (VLSP Hanoi 2022)

Published: January 01, 2023

FastSpeechStyle: Vietnamese Emotional Speech Synthesis for the VLSP 2022 Shared Task. VLSP is the most prestigious, high-quality contest in Vietnam for speech and natural language processing. I was fortunate to join talented colleagues at Vinbigdata, and we won two first prizes.

Synthetic Speech Attribution - 2022 IEEE Signal Processing Cup

Published: January 01, 2022

Fake synthetic speech audio tracks can be generated through a wide variety of available methods. Given an audio recording of a synthetically generated speech track, the task is to detect which method, from a list of candidates, was used to synthesize it.

Music Voice Conversion - 1st The Sound of AI Hackathon 2022

Published: January 01, 2022

Sing any song without speaking the language

Voice-Preserving Speech Machine Translation - STL Hackathon (Qatar 2022)

Published: January 01, 2021

Speech translation for low-resource languages, with voice input and output, that preserves the characteristics of the input voice

Sleep Stage Classification with Multi-Scale Multi-Period CNN

Published: January 01, 2021

Sleep stage classification refers to the process of categorizing different stages of sleep based on the patterns and characteristics of brain activity.

Viphoneme

Published: January 01, 2020

PyPI package Viphoneme: phonetization that converts Vietnamese graphemes to IPA. I built this project in my 3rd year of college. It converts raw text into phonemes so that speech AI models can learn from them. The package is publicly accessible and is used by most voice research projects in Vietnam (not in production).

Vinorm

Published: January 01, 2020

Python NSW (non-standard word) package for Vietnamese: a normalization system that converts numbers, abbreviations, and words that cannot be pronounced as written into syllables. I worked on this project at AILAB with an outstanding friend from the APCS Program. It was originally written in C++ and then packaged for Python for ease of use in research projects.

Reading Comprehension Backend - AI Core

Published: January 01, 2020

This is the AI Service Core system design for virtual assistants I made during my internship in my 3rd year of university.

Core AI Research and Statistic Report for RASA

Published: January 01, 2020

These are the test statistics of the Rasa chatbot system I made during my 3rd-year internship.

Statistics and Frontend Improvements for Voice of Southern Speech Synthesis (AILAB)

Published: January 01, 2019

This report aims to analyze the performance of the Voice of Southern (VOS) TTS system and give an overview of Vietnamese TTS. It also presents statistics and improvements for the VOS frontend, based on the syllables in common use today.

Lyric based approach for music emotion recognition using hierarchical attention networks (NLP)

Published: January 01, 2019

We exploit the natural structure of a song, in which words combine into lines, lines into segments, and segments into a complete song, by adapting a hierarchical attention network (HAN).
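The word, line, segment, and song hierarchy can be illustrated with a minimal Python sketch. It is not the project's actual preprocessing: the function name `song_hierarchy` is hypothetical, and the sketch assumes segments are separated by blank lines and words by whitespace. A hierarchical model would then encode each level of this nested list in turn.

```python
def song_hierarchy(lyrics: str) -> list[list[list[str]]]:
    """Split raw lyrics into a nested list: song -> segments -> lines -> words.

    Assumptions (illustrative only): a blank line ends a segment,
    and words are separated by whitespace.
    """
    song: list[list[list[str]]] = []
    segment: list[list[str]] = []
    for raw_line in lyrics.splitlines():
        words = raw_line.split()
        if words:
            # A non-empty line becomes a list of words in the current segment.
            segment.append(words)
        elif segment:
            # A blank line closes the current segment, if it has any lines.
            song.append(segment)
            segment = []
    if segment:
        # Close the last segment when the text does not end with a blank line.
        song.append(segment)
    return song


example = "la la la\nsing along\n\nnew verse here\n"
hierarchy = song_hierarchy(example)
print(len(hierarchy), "segments;", len(hierarchy[0]), "lines in the first")
```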

Ethics in Voice Recreation Technology, or Voice Cloning (AI Ethics)

Published: January 01, 2019

Technology is growing rapidly, and the explosion of artificial intelligence in recent years in particular has raised many concerns about the dangers of that development.

CUDA programming language

Published: January 01, 2018

CUDA is an architecture and programming model developed by NVIDIA for running parallel computations on graphics processing units (GPUs). CUDA stands for Compute Unified Device Architecture.

Do Tri Nhan
