Seeing past words: Testing the cross-modal capabilities of pretrained V&L models on counting tasks
