Enhancing Semi-Supervised Learning for Extractive Summarization with an LLM-based pseudolabeler
Gaurav Sahu, Olga Vechtomova, Issam H. Laradji
Nov 2023
Abstract
This work tackles extractive text summarization in a limited labeled data scenario using a semi-supervised approach. Specifically, we propose a prompt-based pseudolabel selection strategy using GPT-4. We evaluate our method on three text summarization datasets: TweetSumm, WikiHow, and ArXiv/PubMed. Our experiments show that using an LLM to evaluate and generate pseudolabels improves ROUGE-1 by 10-20% across the datasets, an improvement akin to that obtained by enhancing the pretrained models. We also show that such a method needs a smaller pool of unlabeled examples to perform better.
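The pseudolabel selection step can be illustrated with a minimal sketch. The abstract does not give the prompt, the scoring scale, or the selection criterion, so everything below is an assumption made for illustration: `select_pseudolabels`, `toy_score`, and the `Candidate` type are hypothetical names, and `toy_score` is a simple word-coverage heuristic standing in for an LLM rating. In a real setup, `score_fn` would presumably wrap a prompt sent to GPT-4.

```python
from typing import Callable, List, Tuple

# Hypothetical representation: (document sentences, indices of the sentences
# chosen as the extractive summary, i.e. the pseudolabel).
Candidate = Tuple[List[str], List[int]]


def select_pseudolabels(
    candidates: List[Candidate],
    score_fn: Callable[[List[str], List[int]], float],
    top_k: int,
) -> List[Candidate]:
    """Keep the top_k candidates ranked by score_fn, highest score first."""
    scored = [(score_fn(doc, idx), i) for i, (doc, idx) in enumerate(candidates)]
    # Sort by descending score; ties are broken by original position.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [candidates[i] for _, i in scored[:top_k]]


def toy_score(doc: List[str], idx: List[int]) -> float:
    """Stand-in for an LLM rating (assumption, not the paper's method):
    fraction of the document's distinct words covered by the summary."""
    doc_words = {w.lower() for sent in doc for w in sent.split()}
    summary_words = {w.lower() for i in idx for w in doc[i].split()}
    return len(summary_words) / len(doc_words) if doc_words else 0.0


candidates = [
    (["a b", "c d e"], [0]),
    (["a b", "c d e"], [1]),
    (["x y"], [0]),
]
selected = select_pseudolabels(candidates, toy_score, top_k=2)
```

The selected pairs would then be added to the labeled set for the next round of training. Passing the scorer in as a function keeps the selection logic independent of whichever model or prompt produces the ratings.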