Senior NLP / ML Researcher (LLM Evaluation & Agentic Systems)

  • Indefinite
  • Full time
  • Remote
  • Research & Development

Why Iris.ai

At Iris.ai, we’re building an agentic AI platform that scales expert-level domain knowledge across entire organizations.

For more than a decade, we’ve worked at the intersection of scientific research, industrial data, and applied AI, helping researchers, engineers, and business teams reason over complex technical knowledge.

Our products - Neuralith, Axion, and RSpace - span the full GenAI lifecycle:

  • Data ingestion across text, tables, figures, and technical formats

  • Advanced RAG and indexing pipelines

  • Agentic orchestration and reasoning

  • Rigorous LLM evaluation and governance

What makes us different: we care deeply about accuracy, evaluation, and responsibility. We don’t optimize for demos and proofs of concept; we optimize for systems that experts trust and use.

The Role

We’re looking for a Senior NLP / ML Researcher who wants to work on hard, unsolved problems in modern language models — and see their ideas land in real products used by enterprises and researchers.

This role combines research and applied engineering. You’ll drive novel research directions, build prototypes, run experiments, and help turn the results into production capabilities inside our platform.

You’ll also play a key role in securing research funding by contributing to high-quality proposals for both EU and national grants (EIC, Horizon, etc.).

If you enjoy thinking deeply about why models fail, how to measure intelligence and uncertainty, and when agents should reason vs. act — you’ll feel at home here.

What You’ll Research

You’ll work on a focused set of high‑impact research directions that sit at the core of modern applied NLP and agentic AI. The exact mix will evolve based on your strengths and interests, but broadly includes:

  • LLM evaluation & uncertainty — confidence estimation, answer relevance, and robustness in open‑book QA and RAG systems

  • Agentic reasoning & control — understanding when models should reason, stop reasoning, or act, including inference‑time steering

  • Translation & multilingual NLP — evaluation and system design for modern LLM‑based translation, including low‑resource languages

Your goal will be turning rigorous research into capabilities that real users can trust and use.

What You’ll Do

  • Design and implement novel NLP & ML methods (from theory to code)

  • Run end‑to‑end experiments: data, training, evaluation, ablations

  • Translate research insights into prototypes and production features

  • Collaborate closely with engineers and product teams

  • Publish, present, and engage with the AI research community

  • Lead and co‑author EU and national research grant proposals

  • Write and publish research articles

Our Tech Stack

  • Languages: Python (strong OOP practices)

  • ML: PyTorch, Transformers, TensorFlow

  • LLMs: Hugging Face, OpenAI, custom and fine‑tuned models

  • Systems: RAG pipelines, multi-agent frameworks, evaluation tools

  • Infra: AWS, Docker, distributed computing

  • Practices: Git, CI/CD, reproducible research workflows

What We’re Looking For

  • PhD in ML, NLP, Computer Science, or a related field

  • Strong, hands‑on experience with R&D grants and proposal writing (e.g. Horizon Europe, EIC, national or international research funding)

  • 5+ years of industry or applied research experience

  • Strong background in NLP (transformers, semantic search, RAG)

  • Hands‑on experience with LLMs and their evaluation

  • Solid software engineering skills and experience with Python

  • Publications in ML/NLP conferences or journals

  • Able to work within European time zones

🌱 Why Join Iris.ai?

If you want to do meaningful NLP work, help secure funding for frontier AI research, and grow in a culture built on trust, rigor, and fairness — let’s talk.

We’re not your typical tech company. We believe in:

  • Real transparency — information is shared, context is open, and questions are welcome.

  • Fairness by design: policies, opportunities, and growth are aligned across countries and teams.

  • Ownership and the empowerment to make decisions without micromanagement.

  • Metrics that guide us, but never replace human thinking or responsibility.

Compensation & Ownership

Pay

  • Compensation that reflects your value. Our salaries are typically 25% above local market averages, ensuring competitive, fair pay across regions and roles. We review them annually.

Equity

  • We believe salary helps you get by, while stock options build wealth. At Iris.ai, all colleagues receive ownership in the company as part of our ESOP pool (3%). When we grow, you grow: that's what shared success really means.

(Just imagine: Someone once bought a Tesla option for $1 — it's worth $400 today.)

Benefits

We’ve built our benefits to reflect how we work: with trust, fairness, and room to grow.

  • 30 days paid vacation

  • 5 additional days paid vacation for Learning and Development

  • Private health insurance (premium coverage) and bi-annual health checks

  • Free MultiSport card for your physical well-being

  • Remote-first & flexible hours — work where you're at your best

  • Personal annual learning budget for conferences, courses, or certifications

  • Personal equipment budget to choose the gear that suits your style

  • Charity and volunteer activities

  • Seasonal working camps (summer & winter) and team retreats

  • Ongoing growth through weekly tech deep dives, mentorship, pair coding, and knowledge-sharing

🚀 Let’s Build the Future of Responsible AI

If you care about building high-quality, ethical AI — guided by data and human judgment — you’ll feel at home at Iris.ai.

👉 Apply now or reach out with questions. We’re transparent by default.