Senior NLP / ML Researcher (LLM Evaluation & Agentic Systems)

Indefinite
Full time
Remote
Research & Development

Why Iris.ai

At Iris.ai, we’re building an agentic AI platform that scales expert-level domain knowledge across entire organizations.

For more than a decade, we’ve worked at the intersection of scientific research, industrial data, and applied AI, helping researchers, engineers, and business teams reason over complex technical knowledge.

Our products - Neuralith, Axion, and RSpace - span the full GenAI lifecycle:

Data ingestion across text, tables, figures, and technical formats
Advanced RAG and indexing pipelines
Agentic orchestration and reasoning
Rigorous LLM evaluation and governance

What makes us different: we care deeply about accuracy, evaluation, and responsibility. We don’t optimize for demos and proofs of concept; we optimize for systems that experts trust and use.

The Role

We’re looking for a Senior NLP / ML Researcher who wants to work on hard, unsolved problems in modern language models — and see their ideas land in real products used by enterprises and researchers.

This role combines research and applied engineering. You’ll drive novel research directions, build prototypes, run experiments, and help turn the results into production capabilities inside our platform.

You’ll also play a key role in securing research funding by contributing to high-quality proposals for both EU and national grants (EIC, Horizon, etc.).

If you enjoy thinking deeply about why models fail, how to measure intelligence and uncertainty, and when agents should reason vs. act — you’ll feel at home here.

What You’ll Research

You’ll work on a focused set of high‑impact research directions that sit at the core of modern applied NLP and agentic AI. The exact mix will evolve based on your strengths and interests, but broadly includes:

LLM evaluation & uncertainty — confidence estimation, answer relevance, and robustness in open‑book QA and RAG systems
Agentic reasoning & control — understanding when models should reason, stop reasoning, or act, including inference‑time steering
Translation & multilingual NLP — evaluation and system design for modern LLM‑based translation, including low‑resource languages

Your goal will be to turn rigorous research into capabilities that real users can trust and use.

What You’ll Do

Design and implement novel NLP & ML methods (from theory to code)
Run end‑to‑end experiments: data, training, evaluation, ablations
Translate research insights into prototypes and production features
Collaborate closely with engineers and product teams
Publish, present, and engage with the AI research community
Lead and co‑author EU and national research grant proposals
Write and publish research articles

Our Tech Stack

Languages: Python (strong OOP practices)
ML: PyTorch, Transformers, TensorFlow
LLMs: Hugging Face, OpenAI, custom and fine‑tuned models
Systems: RAG pipelines, multi-agent frameworks, evaluation tools
Infra: AWS, Docker, distributed computing
Practices: Git, CI/CD, reproducible research workflows

What We’re Looking For

PhD in ML, NLP, Computer Science, or a related field
Strong, hands‑on experience with R&D grants and proposal writing (e.g. Horizon Europe, EIC, national or international research funding)
5+ years of industry or applied research experience
Strong background in NLP (transformers, semantic search, RAG)
Hands‑on experience with LLMs and their evaluation
Solid software engineering skills and experience with Python
Publications in ML/NLP conferences or journals
Able to work within European time zones

🌱 Why Join Iris.ai?

If you want to do meaningful NLP work, help secure funding for frontier AI research, and grow in a culture built on trust, rigor, and fairness — let’s talk.

We’re not your typical tech company. We believe in:

Real transparency — information is shared, context is open, and questions are welcome.
Fairness by design: policies, opportunities, and growth are aligned across countries and teams.
Ownership and the empowerment to make decisions without micromanagement.
Metrics that guide us, but never replace human thinking or responsibility.

Compensation & Ownership

Pay

Compensation that reflects your value. Our salaries are typically 25% above local market averages, ensuring competitive, fair pay across regions and roles. We review pay annually.

Equity

We believe salary helps you get by, while stock options build wealth. At Iris.ai, all colleagues receive ownership in the company as part of our ESOP pool (3%). Because when we grow, you grow: that's what shared success really means.

(Just imagine: Someone once bought a Tesla option for $1 — it's worth $400 today.)

Benefits

We’ve built our benefits to reflect how we work: with trust, fairness, and room to grow.

30 days paid vacation
5 additional days paid vacation for Learning and Development
Private health insurance (premium coverage) and bi-annual health checks
Free MultiSport card for your physical well-being
Remote-first & flexible hours — work where you're at your best
Personal annual learning budget for conferences, courses, or certifications
Personal equipment budget to choose the gear that suits your style
Charity and volunteer activities
Seasonal working camps (summer & winter) and team retreats
Ongoing growth through weekly tech deep dives, mentorship, pair coding, and knowledge-sharing

🚀 Let’s Build the Future of Responsible AI

If you care about building high-quality, ethical AI — guided by data and human judgment — you’ll feel at home at Iris.ai.

👉 Apply now or reach out with questions. We’re transparent by default.
