📄 HyperLLaVA

The official repository of the paper HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models.

🎓 HyperLLaVA Overview

HyperLLaVA is a Multimodal Large Language Model (MLLM) designed to improve performance on downstream multimodal tasks. It is composed of a Visual Expert-Assisted Projector and a Language Expert-integrated Tuning module. The architecture of the proposed HyperLLaVA is shown in the following figure.

Code will be available soon.

🤝 Referencing and Citing

If you find our work useful in your research, please consider giving this repository a star and citing our paper as follows:

@misc{zhang2024hyperllava,
      title={HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models}, 
      author={Wenqiao Zhang and Tianwei Lin and Jiang Liu and Fangxun Shu and Haoyuan Li and Lei Zhang and He Wanggui and Hao Zhou and Zheqi Lv and Hao Jiang and Juncheng Li and Siliang Tang and Yueting Zhuang},
      year={2024},
      eprint={2403.13447},
      archivePrefix={arXiv},
      primaryClass={cs.AI}
}
