OFF CLOUD by Sovereign AI Lab
A hands-on migration and tuning workshop for DACH engineers moving from cloud APIs to local LLMs. Date: 28.04.2026, 17:30–22:00 CET
ABOUT THIS SESSION
Your cloud API invoice is climbing. Your compliance team is asking harder questions. You've heard "just run it locally," but nobody has shown you how. And many engineers who already run locally are leaving meaningful headroom on their hardware.
This session fixes both problems in the same room.
Two tracks, same pizza:
Track A, Cloud to Local. For teams currently using hosted LLM APIs who have never run local inference. You walk in with an API key. You walk out with a working local model, a drop-in API swap, and a cost comparison template you can fill with your own workload numbers.
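To make the "drop-in API swap" concrete: most local inference servers (llama.cpp's llama-server, Ollama, vLLM) speak the OpenAI chat completions protocol, so the switch is often a one-line base URL change. A minimal sketch, assuming a local server already listening on port 8080; the URL, key, and model name below are placeholders, not the workshop's exact setup:

    # Point the standard openai client at a local OpenAI-compatible server.
    # base_url, api_key, and model are placeholders; your server sets the real values.
    from openai import OpenAI

    client = OpenAI(
        base_url="http://localhost:8080/v1",  # local endpoint instead of api.openai.com
        api_key="not-needed",                 # most local servers ignore the key
    )

    resp = client.chat.completions.create(
        model="local-model",
        messages=[{"role": "user", "content": "Say hello from localhost."}],
    )
    print(resp.choices[0].message.content)

The cost comparison template is equally unglamorous at its core: monthly token volume against per-token API pricing on one side, amortized hardware plus power on the other. A sketch with made-up numbers; every figure is a placeholder you replace with your own workload:

    # Cost comparison skeleton: cloud API vs. local inference.
    # All numbers below are illustrative placeholders.
    tokens_per_month = 500_000_000    # your workload
    api_price_per_1m = 3.00           # USD per 1M tokens, per your provider's sheet

    gpu_cost = 8_000.0                # USD, one-off hardware
    amortization_months = 24
    power_hosting_month = 120.0       # USD, electricity plus hosting

    cloud_monthly = tokens_per_month / 1_000_000 * api_price_per_1m
    local_monthly = gpu_cost / amortization_months + power_hosting_month

    print(f"cloud: ${cloud_monthly:,.0f}/mo   local: ${local_monthly:,.0f}/mo")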
Track B, GPU Whispering. For engineers already running local inference engines who want to go deeper than default settings. Quantization, VRAM layout, CPU memory bus saturation, real benchmarks on real hardware.
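A taste of the Track B arithmetic: the first quantization question is whether the model fits at all. A rough VRAM estimate, a sketch only; the overhead constant is a simplification, and real engines add context-dependent KV cache on top:

    # Back-of-envelope VRAM estimate for a quantized model.
    # Real usage depends on engine, context length, and KV cache precision.
    def vram_estimate_gb(params_billion: float, bits_per_weight: float,
                         overhead_fraction: float = 0.15) -> float:
        weights_gb = params_billion * bits_per_weight / 8  # 1B params at 8 bits ~ 1 GB
        return weights_gb * (1 + overhead_fraction)

    print(f"70B @ 4-bit : ~{vram_estimate_gb(70, 4):.0f} GB")   # ~40 GB
    print(f"70B @ 16-bit: ~{vram_estimate_gb(70, 16):.0f} GB")  # ~161 GB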
Cross over during open lab. Ask the awkward questions.
WHO THIS IS FOR
- Technical founders, CTOs, and engineers whose pay-per-token invoices have become a fast-growing expense line.
- ML and backend engineers who want to get off the API treadmill.
- DevOps leads and infrastructure architects already running local models who want to go deeper than default settings.
- Sectors: healthcare, legal, manufacturing, financial services, media, startups.
GROUND RULES
- No sales pitches. Ever.
- Bring a laptop. If you're already running local models, bring that setup. If not, we'll get you started during doors (17:30–18:00), and a pre-event setup gist goes out the day before.
- All skill levels welcome, provided you write code.
- English is the working language.
GOOD TO KNOW
- 4 hours 30 minutes
- In person
LOCATION
Mariahilfer Straße 36
1070 Wien
