Yichao Cao

I am a Ph.D. candidate at Southeast University, under the supervision of Professor Xiaobo Lu. My research interests lie in the fields of computer vision and artificial intelligence.

In recent years, I have also served as the Manager of the Algorithm Department at Enbo Technology Co., Ltd. in Nanjing, where I am responsible for the research and development of algorithmic products for forest fire video monitoring systems. We recently developed "ForestMind", the first multimodal large model in the forestry domain. This single model handles a diverse range of smart forestry tasks.

Email  /  CV  /  Scholar  /  GitHub


Research

My research interests include, but are not limited to: (1) human-object interaction (HOI) detection; (2) fine-tuning and applications of multimodal large models; (3) embodied intelligence. I am also eager to explore other emerging fields. Some papers are highlighted.

Detecting any human-object interaction relationship: Universal HOI detector with spatial prompt learning on foundation models
Yichao Cao, Qingfei Tang, Xiu Su, Song Chen, Shan You, Xiaobo Lu, Chang Xu
NeurIPS, 2023

Re-mine, learn and reason: Exploring the cross-modal semantic correlations for language-guided HOI detection
Yichao Cao, Qingfei Tang, Feng Yang, Xiu Su, Shan You, Xiaobo Lu, Chang Xu
ICCV, 2023

Universal Frequency Domain Perturbation for Single-Source Domain Generalization
Chuang Liu*, Yichao Cao*, Haogang Zhu, Xiu Su (* equal contribution)
ACM MM, 2024

Searching for better spatio-temporal alignment in few-shot action recognition
Yichao Cao, Xiu Su, Qingfei Tang, Shan You, Xiaobo Lu, Chang Xu
NeurIPS, 2022

Coarse2fine: local consistency aware re-prediction for weakly supervised object localization
Yixuan Pan, Yao Yao, Yichao Cao, Chongjin Chen, Xiaobo Lu
AAAI, 2023 (Oral)

Attributes grouping and mining hashing for fine-grained image retrieval
Xin Lu, Shikun Chen, Yichao Cao, Xin Zhou, Xiaobo Lu
ACM MM, 2023

EFFNet: Enhanced feature foreground network for video smoke source prediction and detection
Yichao Cao, Qingfei Tang, Xuehui Wu, Xiaobo Lu
IEEE TCSVT, 2021

Global2Salient: Self-adaptive feature aggregation for remote sensing smoke detection
Shikun Chen, Yichao Cao, Xiaoqiang Feng, Xiaobo Lu
Neurocomputing, 2021


Stolen from Jon Barron. Big thanks!