MCQBert Model Card

MCQBert is a BERT-based model fine-tuned to predict the correct answers to multiple-choice questions (MCQs) within Intelligent Tutoring Systems (ITS). Using LernnaviBERT as its base model, MCQBert can process educational language in German, especially in grammar teaching, where sentences contain mistakes. The model takes both the question text and a candidate answer as input, and it is designed to be further fine-tuned on students' interactions for Student Answer Forecasting. It is trained on a single objective: given a question-answer pair, classify whether the answer is correct.

Model Sources

Direct Use

MCQBert is primarily intended to predict correct answers to MCQs in Intelligent Tutoring Systems (ITS). Given a question and answer pair, it performs a binary classification to decide whether the answer is correct or not.
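
The decision rule described above can be sketched without loading the model. This is a minimal illustration, not the authors' implementation: it assumes the "Q: ... [SEP]A: ..." input layout shown in the usage snippet of this card, the default BERT separator token "[SEP]", and a 0.5 threshold on the sigmoid of a single raw score. The helper names `format_pair` and `is_correct` and the German example sentence are hypothetical.

```python
import math


def format_pair(question: str, answer: str, sep_token: str = "[SEP]") -> str:
    """Build the question-answer string (layout assumed from this card's usage snippet)."""
    return f"Q: {question} {sep_token}A: {answer}"


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def is_correct(raw_score: float, threshold: float = 0.5) -> bool:
    """Binary decision: the answer counts as correct if sigmoid(score) exceeds the threshold."""
    return sigmoid(raw_score) > threshold


# Hypothetical example pair; the raw scores below are made up for illustration.
print(format_pair("Welcher Satz ist korrekt?", "Ich gehe nach Hause."))
print(is_correct(2.0), is_correct(-2.0))
```

In practice the raw score would come from the model's forward pass rather than being supplied by hand.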

Downstream Use

Its intended downstream use is to be fine-tuned on user interactions (as in MCQStudentBertCat and MCQStudentBertSum) for Student Answer Forecasting, as described in https://arxiv.org/abs/2405.20079.

Bias, Risks, and Limitations

While MCQBert is effective, it has some limitations:

It is primarily trained on German-language MCQs and may not generalize well to other languages or subjects without further fine-tuning. The model may not capture all nuances of student learning behavior, particularly in diverse educational contexts.

Privacy: No personally identifiable information has been used in any training phase.

How to Use MCQBert

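import torch
from transformers import AutoModel, AutoTokenizer

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hugging Face access token (placeholder); leave as None if no token is needed
token = None

# load MCQBert (the repository ships custom model code, hence trust_remote_code=True)
model_bert = AutoModel.from_pretrained("epfl-ml4ed/MCQBert", trust_remote_code=True, token=token).to(device)
model_bert.eval()
tokenizer_bert = AutoTokenizer.from_pretrained("dbmdz/bert-base-german-uncased")

# build the question-answer pair, separated by the tokenizer's [SEP] token
qna = f"Q: my_question {tokenizer_bert.sep_token}A: candidate_answer"

# run inference without tracking gradients; the model returns one raw score
with torch.no_grad():
    logit = model_bert(
        tokenizer_bert(qna, return_tensors="pt").input_ids.to(device),
    ).cpu()

# the answer is predicted correct if its probability exceeds 0.5
output = torch.sigmoid(logit).item() > 0.5

print(output)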
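
Since an MCQ usually comes with several candidate answers, one possible way to use the per-pair score is to score every candidate and keep the highest one. The sketch below is an assumption about usage, not something this card prescribes: the model is trained as a binary classifier per pair, and some questions may have more than one correct answer. The function `pick_answer` and the stand-in scorer `dummy_score` are hypothetical; in real use the scorer would wrap the model call from the snippet above and return the sigmoid probability.

```python
from typing import Callable, Sequence, Tuple


def pick_answer(
    question: str,
    candidates: Sequence[str],
    score_fn: Callable[[str], float],
    sep_token: str = "[SEP]",
) -> Tuple[str, float]:
    """Score each 'Q: ... [SEP]A: ...' pair and return the best candidate with its score."""
    scored = [
        (score_fn(f"Q: {question} {sep_token}A: {candidate}"), candidate)
        for candidate in candidates
    ]
    best_score, best_candidate = max(scored, key=lambda pair: pair[0])
    return best_candidate, best_score


def dummy_score(text: str) -> float:
    """Stand-in for the model: returns fixed, made-up probabilities."""
    return 0.9 if text.endswith("A: Bern") else 0.1


best, score = pick_answer(
    "Wie heisst die Hauptstadt der Schweiz?", ["Zürich", "Bern", "Genf"], dummy_score
)
print(best, score)
```

Passing the scorer as a callable keeps the ranking logic independent of how the model is loaded, so the same function works with a stub in tests and with the real model at inference time.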

Training Details

The model was trained on questions from a real-world ITS, Lernnavi, for 20k steps with a batch size of 16. The optimizer was AdamW with a learning rate of 1.75e-5, β1 = 0.9, β2 = 0.999, and a weight decay of 0.01.
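
The hyperparameters above can be collected in a small configuration sketch. The values are the ones stated in this card; the variable names are hypothetical, and the pair count assumes one optimizer step per batch with no gradient accumulation, which the card does not state.

```python
# Hyperparameters as stated in this model card.
TRAIN_STEPS = 20_000
BATCH_SIZE = 16
ADAMW_KWARGS = {
    "lr": 1.75e-5,
    "betas": (0.9, 0.999),
    "weight_decay": 0.01,
}

# Question-answer pairs processed during training, counting repeats
# (assumes one optimizer step per batch, no gradient accumulation).
pairs_processed = TRAIN_STEPS * BATCH_SIZE
print(pairs_processed)  # 320000

# With PyTorch installed, these keyword arguments match torch.optim.AdamW:
#   optimizer = torch.optim.AdamW(model.parameters(), **ADAMW_KWARGS)
```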

Citation

If you find this useful in your work, please cite our paper:

@misc{gado2024student,
      title={Student Answer Forecasting: Transformer-Driven Answer Choice Prediction for Language Learning}, 
      author={Elena Grazia Gado and Tommaso Martorella and Luca Zunino and Paola Mejia-Domenzain and Vinitra Swamy and Jibril Frej and Tanja Käser},
      year={2024},
      eprint={2405.20079},
      archivePrefix={arXiv},
}

Gado, E., Martorella, T., Zunino, L., Mejia-Domenzain, P., Swamy, V., Frej, J., Käser, T. (2024). Student Answer Forecasting: Transformer-Driven Answer Choice Prediction for Language Learning. In: Proceedings of the Conference on Educational Data Mining (EDM 2024).

Model size: 110M params (Safetensors, tensor type F32)