Hi everyone, this is the first time I'm fine-tuning an LLM, and I just can't get above 40% accuracy on my text-classification task.
I'm using BERT from the transformers library to load and train the model, and peft for the LoRA implementation. My dataset contains English summaries of news articles, each labeled with one of 14 categories such as Economics, Politics, Science, Entertainment, etc. Summaries run up to 250-300 tokens. My training set has 800 examples and my validation set has 200.
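For context, my setup looks roughly like this (a simplified sketch; the model name and LoRA values shown here are illustrative, not my exact config):

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from peft import LoraConfig, get_peft_model, TaskType

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=14,  # one per news category
)

lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,  # keeps the new classification head trainable
    r=8,                         # illustrative rank
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["query", "value"],  # BERT's attention projections
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```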
At first the training loss dropped very low while the validation loss plateaued much higher, with validation accuracy peaking at about 45%. Since the model was overfitting, I changed the dropout rate from 0.1 to 0.5. After that, the model is no longer overfitting, but it is underfitting instead: training and validation loss are almost identical, and validation accuracy still tops out at 45%.
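The dropout change went through BERT's config, along these lines (a simplified sketch of what I did):

```python
# Raising dropout from the default 0.1 to 0.5 via config overrides
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=14,
    hidden_dropout_prob=0.5,            # was 0.1
    attention_probs_dropout_prob=0.5,   # was 0.1
)
```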
I tried removing the LoRA implementation altogether, but nothing changed except the training time. At this point I'm confused about what to do next. I've tried tuning hyperparameters, but nothing changes.
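My training loop is the standard Trainer setup, roughly like this (hyperparameter values are illustrative, and `train_ds`/`val_ds` stand in for my tokenized splits):

```python
import numpy as np
from transformers import Trainer, TrainingArguments

def compute_metrics(eval_pred):
    # Validation accuracy, the number quoted above
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {"accuracy": (preds == labels).mean()}

args = TrainingArguments(
    output_dir="bert-news-classifier",
    learning_rate=2e-5,              # illustrative; I've been varying these
    per_device_train_batch_size=16,
    num_train_epochs=5,
    evaluation_strategy="epoch",
)

trainer = Trainer(
    model=model,                 # the (optionally LoRA-wrapped) model above
    args=args,
    train_dataset=train_ds,      # tokenized train/validation splits
    eval_dataset=val_ds,
    compute_metrics=compute_metrics,
)
trainer.train()
```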
Can anyone help me understand what I could be missing here? I can share stats and my code implementation, or even get on a call if that's possible. Any help would be very much appreciated.