Hi, I'm Yash

My research focuses on designing trustworthy language agents that combine NeuroSymbolic reasoning, information retrieval, and source attribution.

About Me

I'm Yash Saxena, a Ph.D. student in Computer Science at the University of Maryland, Baltimore County. My research sits at the intersection of retrieval-augmented generation, interpretability, and source attribution. I focus on making each stage of a language agent (retrieval, evidence selection, and generation) transparent and verifiable through interpretable retrievers, rationale-driven evidence selection, and citation schemes that balance coverage with correctness. I care about language models that can show their work and earn trust in domains where mistakes actually matter.

Before my Ph.D., I completed a B.Tech. in Computer Science and Engineering (AI and ML) at Galgotias University. I have worked on projects in mental health classification, game content generation, and legal tech, in roles that combine research and engineering. I enjoy mentoring undergraduate researchers, collaborating with interdisciplinary teams, and turning ideas into practical tools. Outside of research, you will usually find me exploring music, reading about neuroscience, or at a hackathon.

Research Focus

NeuroSymbolic AI

Bridging neural networks and symbolic reasoning to build hybrid, interpretable systems that combine structured knowledge with deep learning.

Source Attribution

Developing reliable citation techniques and evidence selection methods that ensure every statement in generated text is traceable to its origin.

Trustworthy AI

Building AI systems that are transparent, robust, and fair, with a focus on interpretability, safety, and user confidence in automated decisions.

Information Retrieval

Designing efficient retrieval and ranking algorithms to fetch relevant documents and passages that power retrieval-augmented generation.

NLP

Exploring natural language processing from sentiment analysis to language generation, with a focus on large language models and reasoning.

Skills

Retrieval and RAG
  • Chunking and query reformulation
  • Dense and sparse retrieval
  • FAISS and HNSW indexing (see the retrieval sketch below)
LLM Tuning
  • Instruction tuning (SFT)
  • Preference tuning (PPO, DPO, GRPO)
  • PEFT (LoRA and QLoRA)
Tools and Libraries
  • PyTorch, Hugging Face (Transformers, PEFT)
  • LlamaIndex, LangChain
  • FAISS, SQL, Git
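
As a small illustration of the dense-retrieval side of this stack, the sketch below embeds a toy corpus with sentence-transformers and searches it through a FAISS inner-product index. The model name, corpus, and query are placeholders chosen for the example, not pieces of my actual pipelines.

# Minimal dense-retrieval sketch with sentence-transformers and FAISS.
# All names and texts here are illustrative placeholders.
import faiss
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
corpus = [
    "Retrieval augmented generation grounds answers in retrieved passages.",
    "Dense retrievers embed queries and documents in a shared vector space.",
    "Post hoc citation attaches evidence to text after it is generated.",
]

# Normalized embeddings plus an inner-product index give cosine-similarity search.
doc_emb = encoder.encode(corpus, normalize_embeddings=True)
index = faiss.IndexFlatIP(doc_emb.shape[1])
index.add(doc_emb)

query_emb = encoder.encode(["How does RAG ground its answers?"], normalize_embeddings=True)
scores, ids = index.search(query_emb, 2)  # retrieve the top-2 passages
for rank, (i, score) in enumerate(zip(ids[0], scores[0]), start=1):
    print(f"{rank}. ({score:.3f}) {corpus[i]}")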

Education

University of Maryland, Baltimore County

Ph.D. in Computer Science (Jan 2025 - Present)

Research Assistant working on interpretable retrieval for trustworthy LLMs; GPA 4.0/4.0.

Galgotias University, India

B.Tech. in CSE (AI and ML) (Nov 2020 - May 2024)

Completed coursework with a CGPA of 8.75/10, with a focus on artificial intelligence and machine learning.

Bishop Conrad Senior Secondary School

Senior Secondary (XII) (May 2018 - May 2019)

Graduated with 83.20 percent in Physics, Chemistry, and Mathematics.

Professional Experience

Knowledge Infused AI and Inference Lab, UMBC

Remote Research Intern (Aug 2024 - Jan 2025)

  • Fine-tuned Llama 3 70B using the DPO algorithm and developed a RAG pipeline (a rough sketch of this kind of setup follows below).
  • Designed and ran experiments on large-scale GPU clusters via Slurm.
  • Collaborated with faculty to advance research on interpretable retrieval.
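
The snippet below is a rough sketch of a DPO preference-tuning setup, assuming a recent release of Hugging Face TRL (one that provides DPOConfig and the processing_class argument). The small base model, the toy preference pairs, and the hyperparameters are illustrative stand-ins; the actual Llama 3 70B run also involved PEFT adapters and Slurm-managed GPUs.

# Illustrative DPO sketch with TRL; not the actual training configuration.
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

# A small open model stands in for Llama 3 70B so the sketch runs on modest hardware.
model_name = "facebook/opt-350m"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# DPO learns from (prompt, chosen, rejected) preference triples.
pairs = Dataset.from_dict({
    "prompt":   ["What does RAG stand for?"],
    "chosen":   ["Retrieval augmented generation."],
    "rejected": ["Randomly assembled guesses."],
})

args = DPOConfig(output_dir="dpo-demo", beta=0.1, per_device_train_batch_size=1,
                 num_train_epochs=1, logging_steps=1, report_to="none")
trainer = DPOTrainer(model=model, args=args, train_dataset=pairs,
                     processing_class=tokenizer)
trainer.train()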
Stareout Games

AI Engineer Intern (Jan 2024 - Mar 2024)

  • Developed pipelines that combined LLMs and generative models to bring game ideas to life.
  • Fine-tuned language and image generation models for interactive experiences.
  • Used LangChain, Ntscraper, Streamlit, and other Python libraries.
Artificial Intelligence Institute, University of South Carolina

Remote Research Intern (Aug 2023 - Jul 2024)

  • Explored LLMs using frameworks such as LangChain and LlamaIndex.
  • Built web scrapers to construct datasets for ongoing research.
Celebal Technologies

Data Science Intern (May 2023 - Jul 2023)

  • Processed large datasets and engineered features for specific use cases.
  • Developed NLP applications using spaCy and related libraries.
HCLTech

Intern (Nov 2022 - Jan 2023)

  • Developed a Python-based web app using spaCy and Streamlit to extract information from résumés.
  • Built an application that enabled HR teams to filter résumés based on custom criteria.

Presentations and Talks

Building Trustworthy LLM Agents for Academia

Presented at the UMBC AI Symposium, this talk outlines three pillars of trust: interpretability, robustness, and credibility. It covers interpretable retrieval (IMRNNs), rationale-driven reranking (METEORA), and citation paradigms such as G Cite and P Cite. The talk demonstrates improved recall and precision, discusses tradeoffs between coverage and correctness, and highlights a transparent pipeline for academic research.

Watch video 
RASOR: Contextual Legal Intelligence

At Bloomberg Law's AI Symposium on the future of legal technology, I presented RASOR: Contextual Legal Intelligence via Rationalized Selection and Refinement in RAG, joint work with collaborators. The symposium brought together legal and AI experts, and our work shows how retrieval-augmented systems can provide contextual legal intelligence, using rationalized selection and refinement to improve citation quality in legal tasks.

Event article 

Publications and Preprints

A selection of my research papers with summaries and keywords. Click the titles to read more.

Generation Time vs Post hoc Citation

NeurIPS LLM Evaluation Workshop 2025. This paper compares two citation paradigms, Generation Time Citation (G Cite) and Post hoc Citation (P Cite), across multiple attribution datasets. It shows that retrieval quality drives attribution quality and that P Cite offers higher coverage with competitive correctness, and it recommends a retrieval-centric, P Cite-first approach for high-stakes domains.

Keywords: LLM, attribution, citations, evaluation

Evaluating Consistency and Reasoning of LLMs

Second International Conference on Data Science and Information Systems 2024. This study evaluates the consistency and reasoning abilities of public and proprietary LLMs on the BoolQ dataset. Models are assessed with metrics such as BERTScore, BLEU, and F1 on generated explanations and answers, revealing that proprietary models outperform public ones, yet none achieves high scores for both consistency and reasoning.

Keywords: LLM, consistency, reasoning, evaluation

Can LLMs Obfuscate Code?

AAAI 2025. This study asks whether LLMs can generate obfuscated assembly code and presents the MetamorphASM benchmark with a dataset of 328,200 obfuscated samples. By evaluating multiple LLMs across obfuscation techniques such as dead code, register substitution, and control flow change, the authors show that LLMs can produce obfuscated code, which poses risks for static analysis tools.

Keywords: code obfuscation, LLM, security, malware

Emotion Based Mental Health Classifier

IEEE IC3I 2023. This work presents a machine learning pipeline to detect and classify the mental state of engineering students using social media text. It combines sentiment analysis with models such as RNN, GRU, and SVM to identify emotions and support early detection of mental health issues.

Keywords: mental health, sentiment analysis, emotion classification

Works Under Review and In Preparation

Ranking Free RAG

Under review (ICLR 2026). Proposes replacing re-ranking with a selection mechanism in retrieval-augmented generation, with the goal of improving fairness and transparency in sensitive domains by selecting evidence based on rationales instead of a fixed top-k ranking.

Keywords: RAG, fairness, sensitive domains

IMRNNs: Efficient Embedding Modulation

Under review (EACL 2026). Extends interpretable retrieval by introducing efficient embedding modulation techniques that produce token-level explanations while reducing computational overhead in dense retrieval.

Keywords: interpretable retrieval, efficient embedding, dense retrieval

Attribution in Scientific Literature

In preparation. Introduces a benchmark and methods for evaluating source attribution in scientific literature, with the goal of improving citation coverage and correctness in generative models.

Keywords: source attribution, benchmarking, citations

Awards and Recognitions

  • 2023: Led a team in the grand finale of the Smart India Hackathon (Ministry of Education, India).
  • 2022: Winner of the UNESCO India Africa Hackathon (Ministry of Education, India).
  • 2022: Led a team of five in the grand finale of the Smart India Hackathon and won the event.

Contact

Have a project in mind or want to collaborate? Feel free to reach out.

ysaxena1@umbc.edu