Philipp Borchert | NLP Researcher

Let's connect and collaborate! You can find me here:


💻 GitHub
🤗 HuggingFace
💼 LinkedIn
📚 Google Scholar
👋 Hi, I'm Philipp!
I'm an NLP researcher fascinated by how we can teach LLMs to reason and tackle complex, structured problems. My research spans multilingual NLP and information extraction, and currently centers on reasoning in AI for Math at Huawei in London. I completed my PhD at KU Leuven, where I focused on multilinguality and NLP for business applications.

📄 Selected Publications & Projects

Language Fusion for Parameter-Efficient Cross-lingual Transfer
Philipp Borchert, Ivan Vulić, Marie-Francine Moens, Jochen De Weerdt
ACL 2025 | Vienna, Austria 📍
This study introduces Fusion for Language Representations (FLARE), a novel method that merges source and target language representations within low-rank adapters, enhancing cross-lingual transfer performance while maintaining parameter efficiency. Keywords: cross-lingual transfer, peft
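For a concrete picture of the mechanism, here is a minimal sketch of fusing source- and target-language representations inside a low-rank adapter. It is an illustration only: the module name FusionLoRALayer, the shared down-projection, and element-wise addition as the fusion operator are my simplifications, not the exact FLARE implementation.

```python
import torch
import torch.nn as nn

class FusionLoRALayer(nn.Module):
    def __init__(self, hidden_dim: int, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.down = nn.Linear(hidden_dim, rank, bias=False)  # shared low-rank down-projection
        self.up = nn.Linear(rank, hidden_dim, bias=False)    # low-rank up-projection
        self.scaling = alpha / rank
        nn.init.zeros_(self.up.weight)  # standard LoRA init: adapter starts as a no-op

    def forward(self, target_hidden: torch.Tensor, source_hidden: torch.Tensor) -> torch.Tensor:
        # Project both languages' hidden states into the shared low-rank space
        # and fuse them there (element-wise addition is one simple choice).
        fused = self.down(target_hidden) + self.down(source_hidden)
        # Residual connection, as in standard LoRA.
        return target_hidden + self.scaling * self.up(fused)

# Toy usage: source-language states might come from a translated copy of the input.
layer = FusionLoRALayer(hidden_dim=768)
tgt = torch.randn(2, 16, 768)  # target-language hidden states (batch, seq, dim)
src = torch.randn(2, 16, 768)  # source-language hidden states
out = layer(tgt, src)          # fused output, same shape as tgt
```

Because the fusion happens in the rank-r space with a shared down-projection, the added parameter count stays at the usual LoRA budget while the adapter sees both languages at once.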
Self-Distillation for Model Stacking Unlocks Cross-Lingual NLU in 200+ Languages 💻 GitHub 📝 arXiv
Fabian David Schmidt, Philipp Borchert, Ivan Vulić, Goran Glavaš
EMNLP (Findings) 2024 | Miami, USA 📍
This paper introduces MT-LLM, a method that integrates machine translation (MT) encoders into LLM backbones through self-distillation. The approach unlocks natural language understanding in over 127 languages by giving those languages access to the rich knowledge of English-centric LLMs. Keywords: cross-lingual transfer, model merging, mt-llm
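As a rough sketch of the stacking idea, under assumed names: a learned projection maps MT-encoder states into the LLM's input space, and the stacked model is trained to imitate the frozen LLM reading the English source (self-distillation). The adapter design and loss below are illustrative, not the paper's exact recipe.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MTEncoderAdapter(nn.Module):
    """Projects MT-encoder hidden states into the LLM's embedding space."""
    def __init__(self, mt_dim: int, llm_dim: int):
        super().__init__()
        self.proj = nn.Linear(mt_dim, llm_dim)

    def forward(self, mt_states: torch.Tensor) -> torch.Tensor:
        return self.proj(mt_states)

def self_distillation_loss(student_logits, teacher_logits, temperature: float = 2.0):
    # The "student" is the stacked model (MT encoder -> adapter -> LLM) reading
    # any supported language; the "teacher" is the same frozen LLM reading the
    # English source, so the LLM distills its knowledge into the stacked variant.
    t = temperature
    return F.kl_div(
        F.log_softmax(student_logits / t, dim=-1),
        F.softmax(teacher_logits / t, dim=-1),
        reduction="batchmean",
    ) * (t * t)

# Toy shapes: an NLLB-style encoder dim of 1024 mapped into an LLM dim of 4096.
adapter = MTEncoderAdapter(mt_dim=1024, llm_dim=4096)
llm_inputs = adapter(torch.randn(2, 16, 1024))
```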
Efficient Information Extraction in Few-Shot Relation Classification through Contrastive Representation Learning 💻 GitHub 📝 arXiv
Philipp Borchert, Jochen De Weerdt, Marie-Francine Moens
NAACL 2024 | Mexico City, Mexico 📍
This paper presents MultiRep, a novel approach to improve few-shot relation classification by combining multiple sentence representations using contrastive learning. This method effectively extracts complementary, discriminative information, proving especially beneficial in low-resource scenarios. Keywords: low-resource nlp, information extraction
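A toy sketch of the general recipe follows: derive several pooled views of one encoded sentence and train with a supervised contrastive objective so that instances of the same relation cluster together. The specific pooling functions and loss formulation here are my assumptions for illustration, not the MultiRep code.

```python
import torch
import torch.nn.functional as F

def multi_representations(hidden: torch.Tensor, mask: torch.Tensor):
    # hidden: (batch, seq, dim) encoder states; mask: (batch, seq) attention mask.
    cls_rep = hidden[:, 0]                                                # [CLS] token
    mean_rep = (hidden * mask.unsqueeze(-1)).sum(1) / mask.sum(1, keepdim=True)
    max_rep = hidden.masked_fill(mask.unsqueeze(-1) == 0, -1e9).max(1).values
    return [cls_rep, mean_rep, max_rep]

def supervised_contrastive_loss(reps, labels, temperature: float = 0.07):
    # Stack all views of all sentences; pairs sharing a relation label
    # (including the other views of the same sentence) act as positives.
    z = F.normalize(torch.cat(reps, dim=0), dim=-1)
    y = labels.repeat(len(reps))
    sim = z @ z.t() / temperature
    self_mask = torch.eye(len(z), dtype=torch.bool)
    positives = ((y.unsqueeze(0) == y.unsqueeze(1)) & ~self_mask).float()
    logits = sim.masked_fill(self_mask, -1e9)
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    return -(log_prob * positives).sum(1).div(positives.sum(1).clamp(min=1)).mean()

# Toy usage with random encoder states and two relation classes:
hidden, mask = torch.randn(4, 12, 768), torch.ones(4, 12)
loss = supervised_contrastive_loss(multi_representations(hidden, mask),
                                   labels=torch.tensor([0, 0, 1, 1]))
```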
CORE: A Few-Shot Company Relation Classification Dataset for Robust Domain Adaptation 💻 GitHub 📝 arXiv
Philipp Borchert, Jochen De Weerdt, Kristof Coussement, Arno De Caigny, Marie-Francine Moens
EMNLP 2023 | Singapore 📍
CORE is a dataset for few-shot relation classification focused on company relations, designed to challenge models with the contextual complexity of business entities. The study demonstrates that while models struggle to adapt to CORE, training on this high-quality, information-rich dataset improves out-of-domain performance. Keywords: relation extraction, domain adaptation, dataset
Investigating Bias in Multilingual Language Models: Cross-Lingual Transfer of Debiasing Techniques 💻 GitHub 📝 arXiv
Manon Reusens, Philipp Borchert, Margot Mieskes, Jochen De Weerdt, Bart Baesens
EMNLP 2023 | Singapore 📍
This study investigates whether debiasing techniques can be effectively transferred across different languages within multilingual LLMs. The findings confirm that cross-lingual transfer is not only feasible but also beneficial, with the SentenceDebias technique proving most effective by reducing bias by an average of 13% across the tested languages. Keywords: fairness, llm bias, cross-lingual transfer
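As background on the transferred technique: SentenceDebias estimates a bias subspace from counterfactual sentence pairs and projects it out of every sentence representation. Here is a NumPy-only sketch of that idea, with stand-in data and function names of my choosing.

```python
import numpy as np

def bias_subspace(pair_reps_a: np.ndarray, pair_reps_b: np.ndarray, k: int = 1):
    # pair_reps_*: (n_pairs, dim) representations of counterfactual sentence
    # pairs, e.g. the same sentence with "he" vs. "she".
    diffs = pair_reps_a - pair_reps_b
    diffs -= diffs.mean(axis=0)
    # The top-k principal components of the differences span the bias subspace.
    _, _, vt = np.linalg.svd(diffs, full_matrices=False)
    return vt[:k]  # (k, dim), orthonormal rows

def debias(reps: np.ndarray, subspace: np.ndarray) -> np.ndarray:
    # Remove the component of each representation lying in the bias subspace.
    return reps - (reps @ subspace.T) @ subspace

# Usage with random stand-in embeddings:
a, b = np.random.randn(100, 768), np.random.randn(100, 768)
V = bias_subspace(a, b, k=2)
clean = debias(np.random.randn(5, 768), V)
```

Cross-lingual transfer of this technique amounts to estimating the subspace from pairs in one language and applying the projection to representations of another.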
SEER: A Knapsack approach to Exemplar Selection for In-Context HybridQA 💻 GitHub 📝 arXiv
Jonathan Tonglet, Manon Reusens, Philipp Borchert, Bart Baesens
EMNLP 2023 | Singapore 📍
This paper introduces SEER, a novel method for selecting diverse and representative examples for in-context learning in complex question-answering tasks. SEER formulates exemplar selection as a knapsack problem, which allows it to optimize for desirable attributes under size constraints and outperform previous methods on the FinQA and TAT-QA benchmarks. Keywords: in-context learning, integer linear programming
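To illustrate the knapsack framing with placeholder scoring: each candidate exemplar carries a value (its desirability for the test question) and a weight (its token length), and the goal is to maximize total value under the prompt budget. SEER itself solves a richer integer linear program with diversity and attribute constraints; the dynamic-programming toy below only captures the core trade-off.

```python
def knapsack_select(values, weights, budget):
    """0/1 knapsack: return indices of exemplars maximizing total value
    while keeping total token weight within the budget."""
    dp = [(0.0, [])] * (budget + 1)  # dp[w] = (best value, chosen indices) at capacity w
    for i in range(len(values)):
        # Iterate capacities downward so each exemplar is used at most once.
        for w in range(budget, weights[i] - 1, -1):
            candidate = dp[w - weights[i]][0] + values[i]
            if candidate > dp[w][0]:
                dp[w] = (candidate, dp[w - weights[i]][1] + [i])
    return dp[budget][1]

# Usage: pick the exemplars worth the most "score" that fit in 900 tokens.
scores = [0.9, 0.75, 0.6, 0.55, 0.4]  # e.g., similarity to the test question
lengths = [400, 350, 300, 250, 200]   # token counts per exemplar
print(knapsack_select(scores, lengths, budget=900))  # -> [0, 2, 4]
```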