Deep Thinking: Reinforcement Learning and Benchmarking for Better LLMs

Wednesday 24 June • 5 PM - 7 PM CEST

Overview

The 20th edition of Better AI Meetup will be all about fine-tuning and benchmarking LLMs to help people get the most out of them.

LIVESTREAM LINK: coming soon

The 20th edition of Better AI Meetup partly returns to the topic of the very first meetup in 2021, where the first-ever Slovak language neural model was presented. This time, the focus is on fine-tuning and benchmarking LLMs to help people get the most out of them.

First, Jakub Mačina, an LLM researcher from ETH Zurich, will talk about a post-training alignment method for collaborative agents that helps LLMs co-reason and co-act with humans. He will show how a 7B open model can rival larger proprietary LLMs through thoughtful use of reinforcement learning.

Marek Šuppa, a Principal Machine Learning and AI Engineer at Slido, and Andrej Ridzik, a Senior Research Engineer at KInIT, will show us how language models are evaluated to separate real capability from marketing fluff. They will cover the ongoing effort to build evaluation benchmarks for Slovak language models, why different models require different benchmarks, and what challenges these benchmarks involve.

SPEAKERS

Jakub Mačina

Researcher at ETH Zurich

Jakub Mačina is a researcher at ETH Zurich working on large language models (LLMs) for reasoning and multi-turn capabilities with applications to education. He earned his PhD in Machine Learning at ETH Zurich as a Fellow of the ETH AI Center and was named to the Forbes 30 Under 30 list in the Science and Education category. Previously, he led a machine learning team at the Slovak startup Exponea (acquired by Bloomreach).

Marek Šuppa

Principal Machine Learning / AI Engineer at Slido

Marek Šuppa is a Principal Machine Learning / AI Engineer at Slido, which was acquired by Cisco. He leads Slido’s Data team, and before that he was one of the early employees of DuckDuckGo, the search engine that does not track you. He also teaches at Comenius University in Bratislava. His work spans natural language processing, machine learning, robotics, and applied AI. He helps organize various events, such as RoboCup, Slovakia’s International Olympiad in AI, AI Build Day, and Bratislava Slush’D.

Andrej Ridzik

Senior Research Engineer at KInIT

Andrej Ridzik is a Senior Research Engineer at the Kempelen Institute of Intelligent Technologies (KInIT), where he works on a portfolio of NLP activities spanning research, commercial projects, and LLM-based solutions for Slovak. He represents KInIT as the lead for benchmarking activities within the Slovak NLP Community, where he co-leads the development of benchmarks for the Slovak language. He has worked in AI and NLP for over a decade, with prior experience as an engineer across industry and research projects.

Join us online for the 20th Better AI Meetup on June 24th!

Good to know

Highlights

2 hours
Online

Location

Online event

Agenda

05:00 PM

Welcome

05:05 PM - 05:30 PM

Post-training LLMs for Collaboration using Reinforcement Learning

Jakub Mačina

Reinforcement learning (RL) has made LLMs stronger reasoners by scaling test-time compute, but most RL methods optimize for single-turn problem solving. This talk presents a post-training alignment method for collaborative agents: models that co-reason and co-act with humans across multi-turn tutoring and planning settings. It shows how a 7B open model can rival larger proprietary LLMs, and it highlights practical lessons for applying RL across verifiable and less verifiable domains while preserving accuracy and avoiding SFT-style overspecialization.

05:30 PM - 05:55 PM

Benchmarking Language Models for Slovak: What It Takes and Why It Matters.

Marek Šuppa, Andrej Ridzik

As language models become central to NLP applications, reliable evaluation is what separates real capability from marketing claims – and building good benchmarks is far from trivial, especially for languages with limited resources like Slovak. This talk presents the ongoing effort to build reliable evaluation benchmarks for Slovak language models. It covers why different model types require different benchmarks, how raw datasets are turned into well-defined evaluation tasks, and the specific challenges of evaluating LLM outputs. It also shares lessons learned from delivering these benchmarks and reflects on the role of cross-team collaboration in scaling such efforts.

Organised by

BETTER_AI Meetup
