Testing Encoders

I wanted to get a feel for how well masked language models do at filling in masked tokens. This is a fairly trivial exercise, but I wanted to do it myself to get a better intuition for their relative performance. Maybe it’ll be helpful to you as well; the code is at the bottom.

The Task: Fill in the [MASK] in each of these sentences:

Question	Expected Answer
The cat chased the [MASK].	mouse
The capital of [MASK] is Paris.	France
Bread is typically baked in an [MASK].	oven
When it’s raining, you use an [MASK] to stay dry.	umbrella
You brush your [MASK] to keep them clean.	teeth
Money doesn’t grow on [MASK].	trees
A doctor often works in a [MASK].	hospital
You don’t put metal in a [MASK].	microwave
When you’re sleepy, you go to [MASK].	bed
The [MASK] is the center of our solar system.	sun
People breathe [MASK] to live.	oxygen
Plants need [MASK] to grow.	water
The [MASK] is a natural satellite of the Earth.	moon
Cars usually run on [MASK].	gasoline
A fridge is used to keep food [MASK].	cold
To see stars, you look at the [MASK].	sky
Humans have [MASK] fingers on each hand.	five
Fish live in [MASK].	water
Fire needs [MASK] to burn.	oxygen
Birds use [MASK] to fly.	wings
The [MASK] rises in the East.	sun

Models tested include BERT, RoBERTa, ELECTRA, DeBERTa, XLM_Roberta_Base, BERT_Large_Cased, Legal_BERT, InfoXLM_Large, and Albert_Base_V1. For each question, the results are displayed in a table comparing each model’s top 5 predicted tokens and their scores.
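For readers who don’t want to open the notebook, here is a minimal sketch of how a comparison like this could be run with the Hugging Face `transformers` fill-mask pipeline. It is not the notebook’s code: the checkpoint names in `MODEL_NAMES` and the helper names `adapt_mask`, `top_predictions`, and `compare_models` are my own assumptions for illustration. One detail worth knowing is that not every model spells its mask the same way (BERT uses `[MASK]`, RoBERTa uses `<mask>`), so the sketch swaps in each tokenizer’s own mask token before querying.

```python
from typing import Callable, List, Tuple

# A few of the post's questions; each pairs a prompt with its expected answer.
QUESTIONS = [
    ("The cat chased the [MASK].", "mouse"),
    ("The capital of [MASK] is Paris.", "France"),
    ("The [MASK] rises in the East.", "sun"),
]

# ASSUMPTION: these Hub checkpoint names are illustrative picks, not
# necessarily the exact checkpoints used in the notebook.
MODEL_NAMES = ["bert-base-uncased", "roberta-base"]


def adapt_mask(sentence: str, mask_token: str) -> str:
    """Swap the generic [MASK] placeholder for a model's own mask token."""
    return sentence.replace("[MASK]", mask_token)


def top_predictions(fill: Callable, mask_token: str, sentence: str,
                    top_k: int = 5) -> List[Tuple[str, float]]:
    """Return (token, score) pairs for the top_k fillers of one sentence."""
    raw = fill(adapt_mask(sentence, mask_token), top_k=top_k)
    # Some tokenizers return tokens with a leading space, so strip it.
    return [(p["token_str"].strip(), float(p["score"])) for p in raw]


def compare_models(model_names=MODEL_NAMES, questions=QUESTIONS) -> None:
    """Print each model's top 5 guesses per question (downloads checkpoints)."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import pipeline

    for name in model_names:
        fill = pipeline("fill-mask", model=name)
        mask_token = fill.tokenizer.mask_token  # "[MASK]" for BERT, "<mask>" for RoBERTa
        for sentence, expected in questions:
            preds = top_predictions(fill, mask_token, sentence)
            guesses = ", ".join(f"{tok} ({score:.3f})" for tok, score in preds)
            print(f"{name} | {sentence} | expected: {expected} | {guesses}")
```

Calling `compare_models()` downloads each checkpoint on first use, so it needs `transformers`, a backend such as PyTorch, and a network connection. Other models can be tried by adding their Hub names to `MODEL_NAMES`, provided the checkpoint ships with a masked-language-modeling head.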

CODE:

https://colab.research.google.com/drive/1FRyFGc9R70xd7nLeu9pmNBrWc2G5wcuh?usp=sharing

Evaluation:

I’m underwhelmed by the models’ performance. I’m sure I could’ve tweaked things further to get better outcomes, but I still expected more. RoBERTa did reasonably well, but even its results weren’t too exciting.
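The impression above comes from eyeballing the tables. One way to put a rough number on it, sketched here as an assumption rather than something the notebook does, is to count how often the expected answer is a model’s first guess (top-1) or appears anywhere in its top 5. The helper names `hit_rank` and `score_model` are hypothetical, and the sample predictions are made up for illustration; they are not actual model outputs.

```python
from typing import Dict, List, Optional, Tuple


def hit_rank(predicted_tokens: List[str], expected: str) -> Optional[int]:
    """Return the 1-based rank of the expected answer, or None if it is absent."""
    target = expected.strip().lower()
    for rank, token in enumerate(predicted_tokens, start=1):
        if token.strip().lower() == target:
            return rank
    return None


def score_model(results: List[Tuple[List[str], str]]) -> Dict[str, float]:
    """Compute top-1 and top-5 hit rates from (predicted tokens, expected) pairs."""
    if not results:
        raise ValueError("need at least one question to score")
    ranks = [hit_rank(tokens, expected) for tokens, expected in results]
    total = len(ranks)
    top1 = sum(1 for r in ranks if r == 1) / total
    top5 = sum(1 for r in ranks if r is not None and r <= 5) / total
    return {"top1": top1, "top5": top5}


# Made-up predictions for four of the questions, for illustration only.
SAMPLE_RESULTS = [
    (["dog", "mouse", "ball", "cat", "bird"], "mouse"),
    (["france", "italy", "spain", "europe", "belgium"], "France"),
    (["hurry", "office", "lab", "clinic", "team"], "hospital"),
    (["sun", "moon", "star", "earth", "sky"], "sun"),
]

print(score_model(SAMPLE_RESULTS))
```

Matching is case-insensitive on purpose, since uncased models return lowercase tokens such as "france". Exact string matching is still strict: a reasonable answer like "rat" for the first question counts as a miss, so these rates understate how sensible the guesses are.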

Written September 30, 2023 by Yonbel

Battle of the Forms

The Personal Website of Yonathan Arbel
