Vid-LLM: A Compact Video-based 3D Multimodal LLM
with Reconstruction–Reasoning Synergy

Project page under construction. Content will be updated as the paper progresses.

Abstract (placeholder)

This page is a placeholder for the Vid-LLM project. A detailed abstract, figures, demos, and benchmark results will be released here.

The code will be released upon acceptance of the paper.

Citation (placeholder)

Once the paper is on arXiv, the final BibTeX entry will appear here. Until then, the entry below is a placeholder.

@article{vidllm2025,
  title   = {Vid-LLM: A Compact Video-based 3D Multimodal LLM with Reconstruction–Reasoning Synergy},
  author  = {Your Name et al.},
  journal = {arXiv preprint arXiv:XXXXX},
  year    = {2025}
}

Contact

For questions, please contact: 2024282140075111@whu.edu.cn