Biography

I graduated from Yali Middle School in 2014.

I received my B.Sc. degree from the Department of Computer Science and Technology, Nanjing University in 2018.

I received my M.Sc. degree from the LAMDA group, Department of Computer Science and Technology, Nanjing University in 2021, under the supervision of Professor Wu-Jun Li.

Currently, I am a senior algorithm engineer at HUAWEI.

I am interested in machine learning and data mining. Currently, I am mainly focusing on:

  • Multimodal Large Language Models

I am also interested in:

  • Text Recognition
  • Speaker Recognition
  • Speech Recognition
  • Text-to-Speech
  • Face Recognition

Publications

  • TextHawk: Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models
    Ya-Qi Yu, Minghui Liao, Jihao Wu, Yongxin Liao, Xiaoyu Zheng and Wei Zeng
    Preprint, 2024
  • CAM: Context-Aware Masking for Robust Speaker Verification
    Ya-Qi Yu, Siqi Zheng, Hongbin Suo, Yun Lei and Wu-Jun Li
    Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toronto, Canada, 2021
  • Densely Connected Time Delay Neural Network for Speaker Verification
    Ya-Qi Yu and Wu-Jun Li
    Proceedings of the Annual Conference of the International Speech Communication Association (INTERSPEECH), Shanghai, China, 2020
  • Deep Hashing for Speaker Identification and Retrieval
    Lei Fan, Qing-Yuan Jiang, Ya-Qi Yu and Wu-Jun Li
    Proceedings of the Annual Conference of the International Speech Communication Association (INTERSPEECH), Graz, Austria, 2019
  • Ensemble Additive Margin Softmax for Speaker Verification
    Ya-Qi Yu, Lei Fan and Wu-Jun Li
    Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK, 2019

Experience

Senior Algorithm Engineer

July 2021 - Present
HUAWEI, Shanghai, China

Algorithm Engineer Intern

June 2020 - August 2020
Speech Lab, Alibaba DAMO Academy, Alibaba Group, Hangzhou, China

Worked in the speaker recognition group on:

  • Deep hashing-based large-scale speaker retrieval
  • Noise-robust speaker verification (paper accepted by ICASSP 2021)

Awards & Honors

Graduation with Distinction
April 2021
Nanjing University, Nanjing, China

Excellent Graduate Student
December 2020
Nanjing University, Nanjing, China

Excellence Scholarship
November 2020
Nanjing University, Nanjing, China

HUAWEI Scholarship
December 2019
Nanjing University, Nanjing, China

Speaker Verification Competition (2/345, ¥50,000)
October 2018
Tongdun, Hangzhou, China

Projects

D-TDNN - PyTorch implementation of densely connected time delay neural networks
KaldiFeat - A lightweight Python library for computing Kaldi-style acoustic features based on NumPy
PyShengyun - A Python converter for Chinese Pinyin and Shengyun (initials and finals)

Skills

PyTorch

Kaldi

Python

C & C++