I’m trying to set up a Kubernetes cluster that allows me to submit SQL queries directly as Flink jobs. My goal is to use the Flink Kubernetes Operator to manage jobs and take advantage of new Flink 2.0 features such as the ForSt state backend and materialized tables.
The documentation I’ve found is scattered and incomplete, and I’m not sure whether this approach is officially supported or recommended. From what I can tell, the Flink Kubernetes Operator requires me to wrap my SQL in a packaged application of some kind, which feels clunky and overcomplicated.
Has anyone successfully implemented this kind of setup? Are there better approaches or alternative solutions I should consider instead?
I ran into the same issues setting this up for production. The Flink Kubernetes Operator expects jobs to be packaged as application deployments, which complicates direct SQL submission. After testing several setups, I found that running the Flink SQL Gateway as a standalone service alongside the operator works best: deploy the gateway with a standard Kubernetes Deployment and point it at a session cluster managed by the operator, so that statements are submitted to that cluster’s JobManager REST endpoint. This keeps the operator’s job-management benefits (upgrades, savepoints, HA) while enabling direct SQL query execution. Storing queries in ConfigMaps and using templates for common SQL patterns also simplifies the process.
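In case it helps, here’s roughly what that looks like. This is a sketch, not a drop-in manifest: the image tag, the `flink-session-rest` Service name, and the `flink-queries` ConfigMap name are all placeholders you’d replace with your own.

```yaml
# Sketch: SQL Gateway as a plain Deployment next to an operator-managed
# session cluster. Names and the image tag are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flink-sql-gateway
spec:
  replicas: 1
  selector:
    matchLabels:
      app: flink-sql-gateway
  template:
    metadata:
      labels:
        app: flink-sql-gateway
    spec:
      containers:
        - name: sql-gateway
          image: flink:2.0.0            # match your cluster's Flink version
          command: ["/opt/flink/bin/sql-gateway.sh", "start-foreground"]
          args:
            # Point the gateway at the session cluster's REST endpoint
            # (the Service name here is an assumption).
            - "-Drest.address=flink-session-rest"
            - "-Drest.port=8081"
            - "-Dsql-gateway.endpoint.rest.address=0.0.0.0"
          ports:
            - containerPort: 8083       # default SQL Gateway REST port
          volumeMounts:
            - name: queries
              mountPath: /opt/flink/queries
      volumes:
        - name: queries
          configMap:
            name: flink-queries         # your SQL statements live here
```

With the gateway up, you can submit statements over its REST API (open a session with `POST /v1/sessions`, then send SQL via `POST /v1/sessions/<handle>/statements`) without packaging anything into a JAR.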
Cool setup! Have you looked into Ververica Platform or Kafka Connect with Flink? What made you go with the K8s operator approach - trying to skip session clusters completely? And have you tried any custom resource definitions for this?
Yeah, I hear you! I went down that path too, and it gets messy. The SQL Gateway works but can be a pain to run alongside the operator. I ended up using the Flink SQL client against a session cluster instead. Much smoother experience, and no wrapper hassles!
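For anyone landing here later, here’s roughly how that setup goes. This is a sketch under assumptions: the deployment name, resource sizes, and image tag are placeholders, and the `flinkVersion` value depends on which operator release you’re running.

```shell
# 1. Create an operator-managed session cluster (omitting the job spec
#    gives you session mode). Names and the image tag are placeholders.
cat <<'EOF' | kubectl apply -f -
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: sql-session
spec:
  image: flink:2.0.0
  flinkVersion: v2_0
  jobManager:
    resource: { memory: "2048m", cpu: 1 }
  taskManager:
    resource: { memory: "2048m", cpu: 1 }
  serviceAccount: flink
EOF

# 2. Port-forward the JobManager's REST service locally.
kubectl port-forward svc/sql-session-rest 8081:8081 &

# 3. Point the SQL client at it -- no wrapper JAR involved. (If your
#    Flink version doesn't accept -D overrides here, set rest.address
#    and rest.port in the client's Flink configuration file instead.)
./bin/sql-client.sh -Drest.address=localhost -Drest.port=8081
```

Each query you run in the client is submitted as a job to the session cluster, and the operator still handles the cluster’s lifecycle.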