The Yi-VL version can understand and discuss images at 448x448 resolution.

⚖️ The Verdict
It is highly optimized for both English and Chinese instructions.
High-end versions (34B) require significant VRAM: up to 80GB+ per GPU for full fine-tuning.
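
To give a feel for where the VRAM figure comes from, here is a rough back-of-the-envelope sketch. It assumes exactly 34 billion parameters and a common mixed-precision Adam setup of about 16 bytes per parameter (2 for bf16 weights, 2 for bf16 gradients, 4 for fp32 master weights, and 8 for the two Adam moments). Those byte counts and the function names are illustrative assumptions, not figures published for Yi, and activation memory is ignored entirely.

```python
import math


def training_state_gb(n_params: float, bytes_per_param: float = 16.0) -> float:
    """Approximate memory in GB for weights, gradients, and optimizer state.

    The default of 16 bytes per parameter is an assumption (mixed-precision Adam);
    activations and framework overhead are not included.
    """
    return n_params * bytes_per_param / 1e9


def min_gpus(total_gb: float, gpu_gb: float = 80.0) -> int:
    """Smallest number of GPUs whose combined memory covers total_gb,
    assuming the state is sharded perfectly evenly (an idealization)."""
    return math.ceil(total_gb / gpu_gb)


if __name__ == "__main__":
    n_params = 34e9  # assumed parameter count for the 34B model

    weights_only = training_state_gb(n_params, bytes_per_param=2.0)
    full_state = training_state_gb(n_params)

    print(f"bf16 weights alone: {weights_only:.0f} GB")
    print(f"weights + grads + Adam state: {full_state:.0f} GB")
    print(f"80 GB GPUs needed (ideal sharding): {min_gpus(full_state)}")
```

Under these assumptions the weights alone come to about 68 GB and the full training state to about 544 GB, so full fine-tuning would need the state sharded across at least seven 80 GB GPUs before counting activations.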