
Kut University College Journal for Humanitarian Science


Continuous Translation Pretraining: Self-Supervised Methods for Emerging Language Variations

Author

  • Assoc. Prof. Dr. MUTHANA HAMEED KHALAF

Document Type: Research paper


Abstract

Large language models (LLMs) are increasingly trained under regimes that exploit self-supervised language-modelling objectives such as next-token prediction or span corruption. Machine translation (MT) systems, in contrast, rely on cross-lingual supervision and therefore require aligned data for each source and target language pair; such parallel data is scarce for emerging language variations (ELVs) such as dialects and other novel lingual forms. To address this challenge, we develop Continuous Translation Pretraining (CTP), a novel framework that maps a continuous language space with reliable, constrained language mapping. We show that models pretrained on both self-supervised language modelling and supervised machine translation objectives perform significantly better across translation tasks, and particularly well on low-resource language pairs. Extensive experiments on several language pairs demonstrate substantial gains in both zero-shot and fine-tuned settings, with improvements of up to 4.5 BLEU points over traditional methods. The proposed framework improves translation for novel lingual forms without requiring vast parallel corpora, which is advantageous for under-resourced languages and dialects. Our contributions include an in-depth account of the architecture, its training process and applications, and a novel evaluation framework tailored to low-resource language settings.
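
The abstract describes pretraining jointly on self-supervised language modelling and supervised machine translation objectives. As a rough illustration of how such a combined objective can be assembled, the sketch below mixes a next-token loss on monolingual text with a translation loss on parallel text in PyTorch; the shared decoder-style model, the prefix-LM framing of translation, and the weight `lambda_mt` are illustrative assumptions, not the paper's actual CTP implementation.

```python
# A minimal sketch of joint pretraining that mixes a self-supervised
# next-token objective on monolingual text with a supervised translation
# objective on parallel text. The model shape, the prefix-LM framing of
# translation, and the loss weight `lambda_mt` are assumptions made for
# illustration, not the paper's actual CTP architecture.
import torch
import torch.nn as nn

PAD_ID = 0  # assumed padding token id

class TinyCausalLM(nn.Module):
    def __init__(self, vocab_size: int, d_model: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model, padding_idx=PAD_ID)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # Causal mask: each position attends only to earlier positions.
        mask = nn.Transformer.generate_square_subsequent_mask(
            tokens.size(1)).to(tokens.device)
        h = self.encoder(self.embed(tokens), mask=mask)
        return self.lm_head(h)

def joint_loss(model, mono, src, tgt, lambda_mt: float = 1.0):
    """Self-supervised LM loss on monolingual text plus a supervised
    translation loss on a (typically much smaller) parallel batch."""
    ce = nn.CrossEntropyLoss(ignore_index=PAD_ID)

    # Objective 1: next-token prediction on monolingual text.
    lm_logits = model(mono[:, :-1])
    lm_loss = ce(lm_logits.reshape(-1, lm_logits.size(-1)),
                 mono[:, 1:].reshape(-1))

    # Objective 2: translation framed as a prefix LM over [source; target].
    # Only target-side positions contribute to the loss.
    pair = torch.cat([src, tgt], dim=1)
    mt_logits = model(pair[:, :-1])
    labels = pair[:, 1:].clone()
    labels[:, : src.size(1) - 1] = PAD_ID  # mask out source-side predictions
    mt_loss = ce(mt_logits.reshape(-1, mt_logits.size(-1)),
                 labels.reshape(-1))

    return lm_loss + lambda_mt * mt_loss

# Usage with random token ids (vocabulary of 1000, batch of 2):
model = TinyCausalLM(vocab_size=1000)
mono = torch.randint(1, 1000, (2, 32))
src = torch.randint(1, 1000, (2, 16))
tgt = torch.randint(1, 1000, (2, 16))
loss = joint_loss(model, mono, src, tgt, lambda_mt=0.5)
loss.backward()
```

In practice the ratio of monolingual to parallel batches and the weight lambda_mt would need tuning per language pair; the sketch only fixes the shape of the combined objective.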

Keywords

  • continuous translation pretraining
  • self-supervised learning
  • emerging language variations
  • cross-lingual transfer
  • low-resource languages
  • neural machine translation
Volume 6, Issue 1, June 2025, Pages 168-191

APA

KHALAF, M. H. (2025). Continuous Translation Pretraining: Self-Supervised Methods for Emerging Language Variations. Kut University College Journal for Humanitarian Science, 6(1), 168-191.

