Summary Augmenter: A Text Augmentation Framework to Improve Summarization Quality

https://doi.org/10.1142/S0218213024500052

Data augmentation in Natural Language Processing (NLP) faces challenges that hinder its widespread adoption, unlike its ubiquitous use in computer vision. This is even more the case for text summarization, where augmentation must attend to both the article and the summary. In this paper, we review the effect of back-translation augmentation and present diverse beam search decoding and masking as methods for generating synthetic data for text summarization. The approaches are evaluated with ROUGE score, novelty, summary length, and GPT-4 to analyze their effectiveness. Our proposed framework combines back translation and masking for articles with diverse beam search augmentation for summaries. Although the framework is applicable to networks of any size, we use BART-large, a relatively small model, in order to run a larger number of experiments. The experiments demonstrate superior performance across all specified metrics compared with fine-tuning BART-large on the CNN/DailyMail dataset. In particular, novelty improves significantly, with increases of 158% for bigrams and 56% for unigrams, which could alleviate some copyright concerns around generating content similar to human writing. Additionally, the GPT-4 assessment indicates that models trained with the augmentation techniques capture important information more effectively than the baseline model.
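
As a rough illustration of the two augmentation ingredients named in the abstract, the sketch below shows back translation of an article and diverse beam search decoding of summary variants using the Hugging Face transformers library. This is not the authors' released code; the model checkpoints (Helsinki-NLP MarianMT pairs, facebook/bart-large-cnn) and hyper-parameters (diversity penalty, lengths) are illustrative assumptions rather than values from the paper.

```python
# Minimal sketch of back-translation and diverse beam search augmentation.
# Checkpoints and hyper-parameters are assumptions for illustration only.
from transformers import MarianMTModel, MarianTokenizer
from transformers import BartForConditionalGeneration, BartTokenizer


def back_translate(text: str,
                   src_to_tgt: str = "Helsinki-NLP/opus-mt-en-fr",
                   tgt_to_src: str = "Helsinki-NLP/opus-mt-fr-en") -> str:
    """Translate English -> French -> English to obtain a paraphrased article."""
    def translate(sentence: str, model_name: str) -> str:
        tok = MarianTokenizer.from_pretrained(model_name)
        model = MarianMTModel.from_pretrained(model_name)
        batch = tok([sentence], return_tensors="pt", truncation=True)
        out = model.generate(**batch, max_length=512)
        return tok.decode(out[0], skip_special_tokens=True)

    return translate(translate(text, src_to_tgt), tgt_to_src)


def diverse_summaries(article: str, n: int = 4) -> list[str]:
    """Generate n summary variants with diverse (group) beam search."""
    name = "facebook/bart-large-cnn"  # assumed summarization checkpoint
    tok = BartTokenizer.from_pretrained(name)
    model = BartForConditionalGeneration.from_pretrained(name)
    inputs = tok([article], return_tensors="pt", truncation=True, max_length=1024)
    outputs = model.generate(
        **inputs,
        num_beams=n,
        num_beam_groups=n,       # one beam group per desired variant
        diversity_penalty=1.0,   # push groups toward different token choices
        num_return_sequences=n,
        max_length=142,
        do_sample=False,
    )
    return [tok.decode(o, skip_special_tokens=True) for o in outputs]


if __name__ == "__main__":
    article = "The quick brown fox jumps over the lazy dog near the river bank."
    print(back_translate(article))        # paraphrased article for augmentation
    print(diverse_summaries(article))     # diverse summary variants
```

In a pipeline along these lines, the back-translated (and optionally masked) articles would be paired with diverse summary variants to enlarge the training set before fine-tuning the summarization model; the exact pairing and filtering strategy would follow the paper rather than this sketch.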