My work at the intersection of NLP and low-resource Indian languages.
Machine translation systems often fail on sentences that contain negation. Changing “I am happy” to “I am not happy” should flip the meaning entirely, but most models struggle to carry that change into the translation, especially for low-resource language pairs.
We investigated how negation impacts English–Assamese machine translation using Transformer-based models. Assamese is a low-resource Indo-Aryan language spoken by ~15 million people in Northeast India, with limited parallel corpora available for training.
Our work systematically analyzed negation handling across multiple MT architectures and proposed techniques to improve translation accuracy for negated sentences, contributing to the broader goal of making NLP work for underrepresented languages.
This work was part of my NLP internship at NIT Silchar (Dec 2021 – May 2022), where I worked on Transformer-based models for machine translation under the guidance of Prof. Partha Pakray. The research focused on improving MT quality for Northeast Indian languages, which are severely underrepresented in the NLP community.
The paper appeared in Sadhana, a peer-reviewed journal of the Indian Academy of Sciences published by Springer.