MIA-Bench: Towards Better Instruction Following Evaluation of Multimodal LLMs



We introduce MIA-Bench, a new benchmark designed to evaluate multimodal large language models (MLLMs) on their ability to strictly adhere to complex instructions. Our benchmark comprises a diverse set of 400 image-prompt pairs, each crafted to challenge the models’ compliance with layered instructions in generating accurate responses that satisfy specific requested patterns. Evaluation results from a wide array of state-of-the-art MLLMs reveal significant variations in performance, highlighting areas for improvement in instruction fidelity. Additionally, we create extra training data and explore supervised fine-tuning to enhance the models’ ability to strictly follow instructions without compromising performance on other tasks. We hope this benchmark not only serves as a tool for measuring MLLM adherence to instructions, but also guides future developments in MLLM training methods.
