Tara Meyer
February 17, 2026
ComfyUI Workflows: An Open-Source AI Tools Guide and What It Means for Creative Pipelines
A quiet revolution is underway at mobile game studios, and the wave started in China. Local teams are using open-source AI tools to scale user acquisition (UA) 10x without adding headcount. These fast-growing teams test hundreds of ad creatives per week rather than the traditional per-month cadence, and they power that growth with free tools.
But there is a major gap across global markets. While Chinese developers keep refining their open-source AI workflows, Western studios are still debating which subscription service to buy.
Jakub of Two & a Half Gamers predicts: "By the end of 2026, around 50% of all UA creatives will either contain AI elements or be made entirely with AI." He has more than ten years of experience in the mobile game industry, specializing in system design, monetization, and scaling user acquisition for studios worldwide. For the past three years he has worked as an independent consultant, advising clients from indie studios to major publishers on optimizing their creative workflows.
These days I work with many game studios around the world, and with non-gaming clients as well. Just look at Duolingo: apps have practically taken over the App Store, and those teams are looking for our know-how.
What makes Jakub and his team unique is that they implement ComfyUI workflows every day, for real clients with real budgets. Studios investing in ComfyUI workflows and similar creative automation tools are building a competitive moat that subscription-based tools cannot replicate.
Jakub sat down with Roman, Marketing Director at Tenjin, to share a practical guide to growing UA and creative output with open-source AI tools. The guide is for UA managers drowning in request and testing backlogs, studio founders who want to scale without linear cost growth, creative directors worn down by repetitive work and burnout, and indie developers who want professional-grade creatives on a limited budget.
This episode of Tenjin ROI 101 is for anyone who wants to grow a mobile app with practical tools.
What you'll learn:
- Why open-source tools beat "black box" tools
- What you need to get started
- Creative automation tool comparison: ComfyUI vs the alternatives
- How professionals use ComfyUI's image-to-video pipeline
- What creative automation tools do for your team
- Why speed and volume decide success in mobile games
- The full process from creative generation to performance measurement
Why Open-Source AI Workflows Beat "Black Box" Tools
Before diving into the ComfyUI workflow tutorial and the technical details, you need to understand the fundamental difference between the Western and Chinese approaches to AI tools. Western AI tools charge monthly fees like subscription services, while many Chinese open-source AI tools are effectively free after the initial setup.
The Western "Black Box" Approach
Examples: OpenAI, Anthropic, Midjourney.
- Minimal learning curve, easy to start
- Closed source, subscription-dependent
- "Prompt in, result out" with extremely limited control
The Western "black box" AI tool approach is, again, completely closed. You get a positive prompt, a negative prompt, and only very limited customization.
According to Jakub, these top-tier UGC video generation tools often perform brilliantly, but they hit a wall once you need:
- Consistent character design across hundreds of ad variations
- Precise compositional control for specific hooks
- Integration with an existing creative production pipeline
- Budget predictability (no per-generation costs)
Jakub argues that when you mass-produce ad creatives at global scale, most of these "black box" tools eventually become the bottleneck. That is why he recommends open-source AI solutions, especially for creative iteration.
China's Open-Source AI Ecosystem
China's AI strategy deliberately mimics the mod communities of successful games.
China is currently flooding the market with all these open-source models. It's part of their political policy: "Get the models into people's hands, and we control the ecosystem."
This strategy has produced a thriving culture and ecosystem:
- AI models continually improved by community contributions
- Unlimited customization (if you put in the effort)
- No subscription fees; all you pay for is hardware
- Workflows become a proprietary advantage
Jakub's Skyrim analogy is spot on:
Imagine it basically like Skyrim. The game is played to this day and is considered one of the best RPGs in the world. Why? Because it has a giant modding community that revives it, patches it, and keeps improving it. That's essentially their approach.
Why This Matters for Mobile Game UA
ComfyUI workflows bring a "modding mindset" to creative production. Teams remix within the community and use open-source AI models to rapidly generate the assets they need across multiple formats.
Open-source AI generation is not confined to images and video. Basically, as long as you have the open-source model for it, you can generate any form of content... audio, 3D assets, 2D assets, 2D sprites, whatever you want.
Ultimately, your creative workflow evolves into: a compounding growth engine that becomes more capable over time, a proprietary IP moat competitors cannot easily replicate, and an asset that grows in value rather than a recurring expense. Those are the main reasons forward-looking mobile game studios are investing now.
Tools for Growth: ComfyUI Hardware and Software Requirements
Jakub lays out a practical hardware and shopping list for building an automated creative production system with ComfyUI workflows (and models from platforms like CivitAI).
You need a good computer. Specifically, at minimum an NVIDIA GPU with 8 to 10GB of VRAM. It won't work on AMD. Maybe in some experimental form it will, but first you need an NVIDIA GPU with CUDA cores. That's step one. Once you have that, you need ComfyUI, which you can also get easily on the internet.
The Hardware Investment
Unlike cloud services, open-source AI workflow tools run locally. That requires an upfront investment but eliminates ongoing costs.
Minimum spec:
- GPU: NVIDIA RTX 3060 (12GB VRAM)
- RAM: 16GB
- Storage: 512GB SSD (for models and workflow files)
Recommended spec:
- GPU: NVIDIA RTX 4070 or 4080 (16GB+ VRAM)
- RAM: 32GB
- Storage: 1TB NVMe SSD
ROI math:
- MidJourney subscription: $60/month ≈ $720/year
- Runway video generation: $95/month ≈ $1,140/year
- Total annual savings: $1,860
- Hardware payback period: 6-16 months
Depending on the configuration you choose, hardware payback takes roughly 6 to 16 months. After the first year, every new generation is effectively "free": no monthly fees, no seat fees, no per-generation charges.
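The payback math above can be sketched in a few lines of Python. The subscription figures come from the list above; the two hardware prices are illustrative assumptions for a budget versus high-end build, not figures from the article.

```python
# Payback-period sketch for replacing subscriptions with local hardware.
# Subscription prices come from the ROI list above; hardware costs are
# illustrative assumptions.
MONTHLY_SUBSCRIPTIONS = {"Midjourney": 60, "Runway": 95}

def payback_months(hardware_cost: float) -> float:
    """Months until the one-time hardware cost beats the monthly fees."""
    monthly_savings = sum(MONTHLY_SUBSCRIPTIONS.values())  # $155/month
    return hardware_cost / monthly_savings

annual_savings = 12 * sum(MONTHLY_SUBSCRIPTIONS.values())
print(annual_savings)                  # 1860
print(round(payback_months(1000), 1))  # 6.5 months for a ~$1,000 build
print(round(payback_months(2500), 1))  # 16.1 months for a ~$2,500 build
```

The 6-16 month range in the article falls out directly from dividing the hardware cost by the $155/month the two subscriptions would have charged.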
The Software Stack (All Free)
- ComfyUI - the core creative workflow framework
- Stable Diffusion models - SDXL, SD 1.5, and various specialized models
- LoRA models - character consistency and style control
- ControlNet - precise compositional control
- AnimateDiff / video extensions - image-to-video capability for ComfyUI
- Face restoration models - professional-quality finishing
Where to Download
- CivitAI - a treasure trove of models and preset workflows
- Hugging Face - base models
- GitHub - ComfyUI itself and its many extension plugins
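Downloaded files need to land in the right folder of a ComfyUI install before the UI can see them. The sketch below assumes ComfyUI's default models/ directory layout (which can be remapped via extra_model_paths.yaml); the filenames are made-up examples.

```python
from pathlib import Path

# Default subfolders inside a ComfyUI install where each kind of
# downloaded model belongs. These names match ComfyUI's stock models/
# layout; adjust if you have remapped paths in extra_model_paths.yaml.
MODEL_DIRS = {
    "checkpoint": "models/checkpoints",  # SDXL, SD 1.5, etc.
    "lora": "models/loras",              # character/style LoRAs
    "controlnet": "models/controlnet",   # composition control
    "vae": "models/vae",
}

def target_path(comfy_root: str, kind: str, filename: str) -> Path:
    """Resolve where a downloaded model file should be placed."""
    return Path(comfy_root) / MODEL_DIRS[kind] / filename

print(target_path("~/ComfyUI", "lora", "my_character.safetensors").as_posix())
# ~/ComfyUI/models/loras/my_character.safetensors
```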
The Time Investment to Learn
It's effort-based. You put in some effort, and you have it. Even I can do it. I'm not a programmer; I'm a game designer. I can do Excel sheets, math, and economy, but I can't code, and I was able to do all of this. So it's not that hard.
Jakub's core insight is that it comes down less to technical skill than to commitment to the process and the willingness to create your own assets and ad creatives.
Creative Automation Tools Comparison: ComfyUI vs Alternatives
| Feature | ComfyUI | Midjourney | Runway | Traditional |
| --- | --- | --- | --- | --- |
| Monthly cost | $0 | $60-$120 | $95-$600 | $5,000-$15,000 |
| Setup time | 2-4 hours | 5 minutes | 5 minutes | Weeks |
| Level of control | Complete | Limited | Medium | Complete |
| Character consistency | Excellent | Poor | Medium | Excellent |
| Video generation | Yes | No | Yes | Yes |
| Iteration speed | Very fast | Fast | Medium | Slow |
| Learning curve | Steep | Easy | Easy | Steep |
| Best for | High-volume UA teams | Quick concepts | Video polish | Hero assets |
The Verdict for Mobile Game UA
For teams producing 50+ creative variations per week, ComfyUI workflows are clearly the best option. The upfront setup investment pays off over the long term through unlimited generation capacity and fine-grained control, which is essential for branding.
What I'm saying is that the teams of the future will be building their own tools, their own data models and datasets, and using them through these open-source AI models.
ComfyUI Tutorial: The Image-to-Video Workflow
This is where ComfyUI workflows prove their worth for scaling UA creatives. There is also a core insight that professionals pick up quickly:
The key to video generation, to any generation, is image generation. That's the number one rule you learn with these tools.
Why Text-to-Video Doesn't Scale
The workflow looks intuitive: type a prompt, get an instant result... For one-off creation that might work. But try to scale, or to show a client similar options, and you hit a serious problem.
"Lots of times, people just go text-to-video. You go to an image generator, input some text, and it generates something. Which is great, but you don't have control. That's the big problem. You don't have control of how it looks, how the characters look, how the environment, how anything looks."
If you test dozens (or hundreds, or thousands) of UA creatives a month, that lack of control is fatal. You can't isolate what is working in your A/B tests, and you can't iterate fast enough to stay competitive.
The Professional "Image-First" Pipeline
Phase 1: Base image generation
- Precise prompt engineering
- ControlNet for compositional control
- Initial generation batch (20-50 variations)
Phase 2: Fixes and adjustments
- Face restoration
- Hand correction (essential for UGC realism)
- Background enhancement
- Quality upscaling
Phase 3: Animation
- Image-to-video conversion in ComfyUI
- Character consistency maintenance
- Motion parameter fine-tuning
- Length and pacing control
Phase 4: Post-processing
- Final color grading
- Text/UI overlays
- Export optimization
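The four phases above amount to a fan-out-then-refine loop: generate many base images, then push each one through fixes, animation, and post. The sketch below is purely structural; every stage here is a hypothetical placeholder standing in for a ComfyUI workflow invocation, not a real API.

```python
# Structural sketch of the image-first pipeline. Each phase is modeled
# as a string transformation; in a real setup each would be a ComfyUI
# workflow run (generation, detailing, image-to-video, post-processing).
def run_pipeline(prompt: str, n_variations: int = 20) -> list:
    # Phase 1: base image generation (batch fan-out of variations)
    images = [f"{prompt}-img-{i}" for i in range(n_variations)]
    # Phase 2: fixes (face restore, hands, background, upscale)
    images = [img + "+fixed" for img in images]
    # Phase 3: image-to-video animation, keeping the character consistent
    clips = [img + "+animated" for img in images]
    # Phase 4: post-processing (grading, overlays, export)
    return [clip + "+final" for clip in clips]

batch = run_pipeline("puzzle-hook", n_variations=3)
print(len(batch))   # 3
print(batch[0])     # puzzle-hook-img-0+fixed+animated+final
```

The point of the structure is that variation happens once, up front, in phase 1; phases 2-4 are deterministic refinements, which is what makes the output controllable at volume.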
This is where ComfyUI workflows give creative and UA teams real leverage. Professionals quickly internalize one principle: the key to video generation is image generation.
Basically, as long as you have the open-source model for it, you can generate whatever you want in any modality. The ComfyUI process I'm showing is just the frame for it, so to speak. But you can also generate from audio, 3D assets, 2D assets... whatever you want, basically.
That positions ComfyUI not just as a creative tool but as creative workflow infrastructure.
You control how characters look, how environments render, and how brand elements appear frame by frame. For teams A/B testing and iterating on hundreds of assets per month, that is non-negotiable.
Consistency is the key to brand building. Without it, you can't isolate specific variables, and you can't move fast enough to stay competitive.
What Creative Automation Tools Do for Your Team
The output benefits are obvious, but the impact on creative teams may matter even more.
Avoiding Creative Fatigue and Burnout
Producing creative at volume the traditional way takes a toll. Minor changes and repeated revisions erode morale and motivation, leading to creative fatigue and burnout.
Testing many variations usually demands more time and effort for analysis, which drives overtime. All of this degrades the quality of creative output and puts unhealthy pressure on the team. With today's tools and the right pipeline, these effects are avoidable.
Creative automation alleviates these problems by eliminating repetitive work cycles, freeing creators to spend more time on strategy and execution. It also shifts mass production and testing to the technology layer instead of human resources.
The teams of the future will build their own tools, their own data models and datasets, and use them through these open-source AI models.
Jakub expects UA teams to evolve from pixel-focused craftspeople into tool builders, producing content that is more engaging, more valuable, and more sustainable.
The New Pipeline Builds a Competitive Moat
The real competitive advantage comes from building a custom creative pipeline competitors cannot buy. A fundamental shift happens when a studio invests the time to train LoRAs on its own character designs, develop on-brand style models, and curate a library of high-performing creative elements.
An open-source AI workflow becomes actual intellectual property, not just another tool in the stack.
We're talking about proprietary workflows that reach brand-specific quality levels generic tools can't replicate, organizational knowledge baked directly into creative infrastructure, and appreciating assets that get better with every generation.
Unlike subscription services that vanish the moment you stop paying, these custom pipelines compound in value over time. They learn the studio's aesthetic preferences, optimize for its specific UA metrics, and become ever harder for competitors to reverse-engineer. That's why the smartest mobile game teams are doing this long-term tooling work now.
Speed and Volume Decide Success in Mobile Games
No case illustrates this shift more clearly than King Shot. Launched in February 2025, the game scaled rapidly to roughly $1.5-2 million in daily revenue, a trajectory that would have been nearly impossible just two years ago. Jakub explains:
King Shot is the biggest game of this year. It was launched somewhere around February, and currently it's doing about one and a half, sometimes nearly two million a day.
What makes King Shot's success especially instructive is not just the revenue. Its UA strategy relies on a sophisticated bait-and-switch approach: the ads show approachable puzzle-style gameplay (inspired by the Steam game Thronefall), and after install, players are seamlessly transitioned into a deeper 4X strategy experience.
This isn't deceptive advertising in the traditional sense; it's a carefully engineered funnel that dramatically widens top-of-funnel acquisition while maintaining strong retention metrics.
It's all based on this kind of bait-and-switch: fake ads, fake onboarding, real gameplay, 4X-style... Because it's so approachable, you can widen the funnel enormously.
The genius is in the execution. Users see an appealing puzzle mechanic in the ad, experience the same mechanic in the initial onboarding, and gradually discover the game's more complex 4X systems as they progress. The "fake ad" and the real gameplay are aligned closely enough that user trust holds, and the accessible entry point captures audiences who would never have considered a traditional 4X strategy game.
But here is the key insight that explains why ComfyUI workflows and creative automation tools are essential: this strategy only works with massive creative volume.
King Shot isn't running five or ten ad creatives. It tests hundreds of variations simultaneously, targeting slightly different audience segments, creative hooks, and messaging angles, and it iterates on winning concepts daily rather than weekly or monthly.
This volume-dependent approach is now spreading across mobile game genres. Social casino games have adopted similar strategies, puzzle games use them too, and traditional RPG and strategy titles are exploring how creative-first UA can widen their acquisition funnels without compromising their core gameplay identity.
In other words, automated creative production is no longer a nice-to-have optimization; it has become table stakes for competitive UA in 2026. Studios that can generate, test, and iterate on hundreds of creative variations per week are building an insurmountable edge over those tied to traditional production schedules. If a competitor can test 50 new concepts in the time it takes you to produce five, they aren't just moving faster; they're learning exponentially more about audience resonance, performance-driving hooks, and how to optimize every stage of the creative funnel.
The Full Process from Creative Generation to Performance Measurement
Jakub's work with open-source AI tools like ComfyUI represents more than a technical roadmap for restructuring mobile game creative teams. Generating hundreds of creative variations is meaningless without accurate attribution to measure their performance.
Leading studios integrate their AI pipelines directly with a mobile measurement platform like Tenjin to track:
- Creative-level ROAS: creative ID tagging via file naming conventions plus granular attribution
- Conversion rates: click-to-install conversion rates segmented by creative-level data
- Cohort analysis: used to refine creative performance on large platforms like Meta
- LTV trajectories: lifetime value trajectories of AI-generated versus traditional creatives
These measurements show which creative-model combinations and strategies actually deliver returns.
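Creative-level attribution only works if the creative ID survives from the generation pipeline into the ad network report, which is what a file naming convention buys you. The pattern below (game_hook_model_vNNN.mp4) is an assumed example for illustration, not a Tenjin requirement.

```python
import re

# Assumed naming convention: game_hook_model_vNNN.mp4. Each segment
# becomes a dimension you can aggregate ROAS and conversion data by.
PATTERN = re.compile(
    r"(?P<game>[a-z0-9]+)_(?P<hook>[a-z0-9-]+)_(?P<model>[a-z0-9-]+)_v(?P<variant>\d+)\.mp4"
)

def parse_creative_id(filename: str) -> dict:
    """Split an exported creative's filename into attribution tags."""
    m = PATTERN.fullmatch(filename)
    if not m:
        raise ValueError(f"unparseable creative name: {filename}")
    return m.groupdict()

tags = parse_creative_id("kingshot_puzzle-hook_z-image-turbo_v042.mp4")
print(tags["hook"])     # puzzle-hook
print(tags["variant"])  # 042
```

With every exported file named this way, joining generation metadata against MMP performance data becomes a plain group-by on the parsed tags.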
Growing with these tools also requires accurate attribution. An investment in open-source AI workflow infrastructure pays off only when paired with a mobile measurement partner that can close the loop between creative production and performance outcomes.
Read the full conversation below.
In this video, we cover:
• 🇨🇳 The difference between Western and Chinese AI adoption and open-source models.
• 🖥️ The hardware and software you need (GPU requirements & ComfyUI).
• 🎨 A live breakdown of image generation workflows, including “Detailers” and specific rendering techniques.
Leveraging Open-Source AI for Mobile Game User Acquisition
Roman: Hi everyone, welcome to another episode of ROI 101. I’m Roman from Tenjin, and today I’m joined by Jakub from Two and a Half Gamers. Hi, Jakub!
Jakub: Hi, hello there. Nice, thanks for having me. I’m Jakub from Two and a Half Gamers for those who don’t know.
Roman: What do you do there, Jakub? A super quick intro for people who might not know who you are.
Jakub: So currently I’m like 10 years plus in the game industry, mainly mobile game industry. Lately, I’ve been working for the last three years, I guess, as an independent consultant, pretty much. But of course, yeah, we run the Two and a Half Gamers podcast with Felix and Matteo, which will be four years next month, so quite some time, I guess.
Roman: Feels a lot longer, dude. It feels a lot longer. I’m not sure how you feel.
Jakub: Yeah, yeah, yeah. That’s the grind there. But yeah, I work with multiple gaming studios around the world, or even non-gaming people these days, because based on the whole Duolingo—you know, apps pretty much taking over the App Store—they’re looking for our know-how. And it’s a perfect match a lot of times where, you know, they need progressions, monetization, and all these other things like system design, basically.
Roman: Yeah, yeah. We’ve seen the same with apps—like a huge amount. But anyway, we met at Modictum with Jakub, and we decided that we want to talk about AI. Of course, it’s still 2025, so we have to talk about AI.
Let’s just jump in, Jakub. It’s going to be like free flow. We don’t have an agenda. We’ll just see what Jakub has to show, and I’ll ask plenty of questions.
Jakub: Yeah, yeah. There’s lots of stuff, and yeah, I guess this will hopefully be as practical as possible because this won’t be one of those discussions that like, “AI will replace your job, AI will be this, AI will be that,” and so on and so forth. This will be like, what can you do now, completely free, and it’s extremely impactful. So let’s start there.
Jakub: Okay, so I guess, yeah, for those listening, best case scenario, you can probably watch this on YouTube or somewhere there, because we’ll be sharing the screen, and I guess it’ll be from now on some kind of a workflow.
So yeah, before we get to this nice image, which we’ll get to in a second, let’s first look at some of the actual stuff that’s currently completely taking over the market, which is basically AI creatives.
AI creatives are actually the most impactful, let’s say, surface-level view of AI that we see in the market. It’s one of the most important things in the current environment because UA is more important than product this year and next year even more, and so on and so forth.
It was not like this a few years before, but now it is. And if you want to give the best example, just look at King Shot. King Shot is the biggest game of this year. It was launched somewhere like February, and currently it’s doing something like one and a half, nearly two million a day.
And it’s all based on this kind of bait-and-switch fake ads, fake onboarding, real gameplay, 4X-style thing, where it was actually taken from Thronefall, which was the game on Steam.
(There we go.) That was pretty much very good but, again, very approachable.
But what happens is basically they widen the funnel so much because it’s so approachable. Users get to see these fake ads. Then when they go into the game, they see the gameplay which is the same as the one in the ads, which means like the fake ads, fake onboarding kinda equalizes itself. Therefore, nothing’s fake anymore, and it’s exactly the thing that you’ve seen in the ads. But slowly, the game unfolds you into 4X or some other high-LTV engine that we see.
It’s proliferating also to other genres, like Social Casino. Like, just wait when we release the next episode on the channel. You’ll see how this bait-and-switch also works there.
And all of this is, again, possible because creatives and marketing is the key in this whole setup. And AI creatives—I’m not saying you can’t do this without the AI creatives—but it’s enabling it in a very, very big way that, again, it gives you volume because you need volume for this.
And AI creatives these days are extremely prevalent. And we think that our prediction is basically that by the end of 2026, there will be around 50% of all UA creatives either having AI hooks or completely done by AI. Like, here you have an example. The one that I showed before, it was actually like a hook, and there was the creative real gameplay and so on. This is the fully generated one where you would have stuff like—you see here, completely generated in an image and video editor, and you just run it as your creative, and that’s it, basically.
So, again, we won’t talk about “AI takes your job, AI does this.” We’re literally talking about what’s currently trending in the market now and how to get this. So if your creative team is not using AI, you’re already behind. That’s basically the state of it.
So how do we actually get to this? And how are these things done? And like, a little bit more nitty-gritty stuff of generation?
Because, as I said, I won’t talk about any other AI use cases these days, because in my opinion, mastering the UA pipeline and using this in addition to boost your volume is the key.
Of course, there are stuff like—let’s say, you know, it’s just an example here. Here’s an example from YouTube that I found where, again, you can use the ComfyUI thing, which I’m using today, and generate 3D assets through it. Again, open-source AI generation is not confined only to images and video. You can generate whatever you want, basically, in any modality, as long as you have the open-source model for it. The ComfyUI thing that I will be showing is just like the, let’s say, the frame for it. But you can do from audio, 3D assets, 2D assets, 2D sprites—like, you can generate whatever you want, basically, and completely for free, as I said, as long as your graphics card is able to handle it.
So that’s there. So don’t just think, “Oh yeah, this is just images and videos, and it won’t help us through.” We can do pretty much everything, because how I think the teams of the future will be going is that they will all be making this custom. Because that’s the biggest difference between the Western approach of like blackbox AI tools, which are, again, completely closed—as for you can only do, I don’t know, positive prompt, negative prompt, then like some very small customization to it—whereas if we go actually to what we can do today…Yeah, it’s kind of very heavy what you can do and what you can actually create and check and stuff like that.
It gives you completely free hands, uncomparable. And as I said, what I’m saying is that the teams of the future will be building their own tools and own data models and old datasets that they will be then pretty much using through these open-source AI models. Because that’s the attitude, or let’s say that’s the way that China handles it.
Like, China is currently flooding the market with all these open-source models because it’s their kind of political policy of, “We’ll get these models in the people’s hands. Therefore, we control the ecosystem.” Instead of the Western approach, which is like, “We have these giant OpenAI companies that are doing like the best of everything,” but again, it’s not that supportive as in China.
In China, the community is also driving these models because they’re adding all of these additions and stuff. Imagine it basically like Skyrim. Skyrim is played to this day and is one of the best RPGs in the world. Why? Because it has a giant modding community that revives it, patches it, improves it, so on and so forth. So that’s their approach, basically.
Roman: …Your first creative when we started. It had the Chinese characters, and I already—because I also follow the channel—I know that you have some folks from China, and they’re like sharing some crazy stuff.
And that leads me to my first question: Do you feel like they’re further ahead than, like, everyone else with this AI adoption? And like, clearly you’re saying yes, right?
Jakub: I would say so. Not only are their models—again, they’re open source, so you can go customize and use them for yourself—but the approach and pipeline is, again, different in China.
Because, again, this is the big difference between the West and the East: user acquisition is the most important job in the mobile game industry in China. In the West, it’s not.
In the West, it’s a product, usually. Product—as for either design or, you know, live ops, PM, monetization stuff like that. That’s the most important part, the core of it. User acquisition for them [China] is, again, as I said, the most important part, because also the product is so up to par across the whole industry there. So their product is great to begin with. But yeah, that’s another discussion for some different time.
Roman: But can the folks from the West adopt this kind of—like, the models are open source, as you said?
Jakub: Yeah. Again, they can. Like, you know, we have AIs all over the place, so there’s basically no language barrier if you know how to use them. It’s just artificial. It’s like, you know, effort-based. Like, you need to put in some effort, and then you have it.
But other than that, like, yeah, it’s quite easy. Like, I can do it. I’m not a programmer. Like, I’m a game designer. I can do Excel sheets like maths and economy, but I can’t code, and I was able to do all these things. So it’s not that hard. Yeah, everybody can do that.
And it’s, again, just people in the West kind of sleeping on themselves, whereas they should be doing these things all over the place. But yeah, we’ll get to it.
So, as I said, how to do these creatives and how to pretty much even get to some of these things. Because, again, you can still do this pretty easily through, like, Nano Banana or ChatGPT, or any other image generator in the West. You can still do great. Like, don’t get me wrong.
This is more hardcore and, let’s say, more customizable stuff because of what you can do and what you can create. You can, for instance, create your own LoRA. We’ll get to it—what that means. But basically, what it means is that you create your own dataset from your art, your custom art, your whatever you want to do, and you add it onto a model. Therefore, the model suddenly spits out like an art that would be coming from your artist, which isn’t really the thing that you can do with GPT or these other tools.
Because currently, as I’m seeing it, for instance, every big company—and I mean like companies like, I don’t know, Blizzard, CD Project Red, and all these other guys—they’re probably already creating their own models, which are completely fed only on their own data, meaning that they’re, again, creating the armies of these artists that they’ll suddenly be able to do and use, which is completely legally okay. That’s because there’s no copyright so far, and they’re just using the model, not the training material. But yeah, that’s again one of these things.
So how does it look, and what’s there? So this is ComfyUI. Let’s start maybe from a little bit easier workflow until we get to the hard stuff. Again, it’s quite easy. It’s visual prompting once you get into it. So you just download the thing from Hugging Face. Hugging Face is the big programmer repo with all the databases and models and everything. It’s all open source on the internet.
And the important part—like, you’re looking at this like, “Oh, this is so—like, how did you create?” No, you don’t. You don’t need to. It’s very easy because all of these things that you see here, for instance, these workflows that I have here, you just take from someone else.
Like, if you’re hardcore, you can literally go and like, “Okay, add a node and like edge spaghetti here and do this visual coding thing,” that, you know, goes from here, from here, from here. You can do it yourself, but I don’t. Because, for instance, this one that I have here—the big one—yeah, no chance for me.
But again, what you do: You go on the internet, you read the guide, and on the guide you have like this whole thing, pretty much. And again, somebody did it for you. So don’t get—maybe let’s get rid of this so it’s a little bit more easy on the eyes. Don’t get scared and don’t think, “Oh, this is just horrible.” As I said, I went through these. I didn’t know shit about all of this, and pretty much by trial and error, you can figure it out quite fast. It’s not that hard.
And my number one advice when working with these tools: Whatever errors or stuff that you have there, just throw it into ChatGPT, and it will just tell you in layman’s terms like, “You need to do this, you need to do that, you need to do this.” And it’s great because, again, we need to realize that suddenly we have this AI that’s literally right there sitting in the corner for us, which we can ask anything, and it will do anything for us.
So all of these things—like, “I don’t understand this, I don’t understand that”—doesn’t matter, because again, you slap it into AI, it will tell you. And especially programming code. Immediately, it’ll fix errors and do stuff for you. So it’s, again, an effort-based barrier, no other barrier.
So if we go into the basics…
Roman: So maybe we can clarify, maybe for the small one. This is what was used to generate one of those creatives that you’re showing at the bottom [of the screen]?
Jakub: Yeah, yeah. So let’s say this one. So how do you use this? How do you generate those?
So, for instance, this one—this was an image, and you run the image through a video generator which then animates it, and then you stitch it into a movie, or like a creative, basically. Because all of these kinds of cuts, that means that it’s another image and another generation, usually. So in order to do these—for instance, this one already requires a little bit more advanced workflow because one thing that we have here is a consistent character, which is like, yeah, it’s not something that you see every day.
So, again, for this you use ComfyUI, where you have workflows for consistent character. Literally, create a character, and from that point on, you kind of save it like, “This is my character.” And then all the generations can go through that character. Therefore, you end up with something like this, where I said like, “Okay, let my character sit in the evening in the office,” and there it goes.
And the video generator is just kind of a cherry on top. It’s not that hard. The important part of, let’s say, creative video generation is actually the image itself. That’s because the workflow that you always go to is image-to-video, not text-to-video.
Lots of times, people just go text-to-video. Like, you go to an image generator, and you do something and just input some text, and it just generates something, which is great, but you don’t have control. That’s the big problem. You don’t have control of how it looks, how the characters look, how the environment—how anything looks.
So again, the key to video generation, anything, is image generation. That’s the number one rule that you learn with these things.
Therefore, if you want to have great creatives, you first need to master the image generation. Once you master the image generation, then always the first frame starts with your image, and from that image you go and create the creative, and you can do pretty much whatever you want.
So how do we get to image generation? So, as I said, you install stuff like ComfyUI. You can do Nano Banana or whatever—anything is good. But this is just a much better way of having controllability. So let’s just go over this very simple workflow and how it works and what we have here.
So this is the Z Image Turbo, which is the latest model from Alibaba that is literally taking over the internet in the last month. For those who don’t know, it’s unheard of because this is a very small model—literally like 6.1 billion parameters—and it’s outstandingly good. But yeah, I’ll just go very fast through it.
So here, for instance, we have the base model which is quantized. Quantized means that in order for these—some of these models—we don’t really have the top-of-the-line graphics cards, so the community, again, creates lower versions of these models to cut down on the VRAM requirement but also a little bit on the quality. So that means that I can run this on my 3080 Ti GPU graphics card, which has 12GB of VRAM, even though the base version of this model requires 16.
So you literally go on the internet, and again, in the guide itself—I have here, for instance—you can get and find. So you have these repositories. For instance, the quantized version of the model—you go all the way into small ones, which is like 2 gigs or whatever, and you can run this even on 6 gigs VRAM card.
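The quant-picking advice Jakub got from ChatGPT boils down to a size check: the model file plus some working overhead has to fit in VRAM. A rough sketch follows; the GGUF tier sizes and the overhead figure are illustrative assumptions, not measurements for any specific model.

```python
# Picking a quantized model variant by available VRAM. File sizes below
# are illustrative assumptions for one model's quantization tiers; the
# 2GB overhead (text encoder, VAE, activations) is a rough assumption.
QUANT_SIZES_GB = {"Q8_0": 6.5, "Q5_K_M": 4.5, "Q4_K_M": 3.8, "Q3_K_S": 2.9}
OVERHEAD_GB = 2.0

def pick_quant(vram_gb: float) -> str:
    """Largest (highest-quality) quant that fits in the given VRAM."""
    for name, size in sorted(QUANT_SIZES_GB.items(),
                             key=lambda kv: kv[1], reverse=True):
        if size + OVERHEAD_GB <= vram_gb:
            return name
    raise RuntimeError("no quant fits; try CPU offload or a smaller model")

print(pick_quant(12))  # Q8_0 fits comfortably on a 12GB card
print(pick_quant(6))   # Q4_K_M on a 6GB card
print(pick_quant(5))   # Q3_K_S when VRAM is very tight
```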
Roman: So the first step is actually to buy a good computer. Is that what it is? Haha.
Jakub: Haha, yes, you need a good computer. So we need at least something like, I would say, 8 to 10GB of VRAM, NVIDIA GPU. This stuff won’t work on AMD. Maybe in some experimental form it will, but you need a CUDA core GPU. That’s the first step. Once you have this, you need ComfyUI. Again, you can get it on the internet, very easy. It’s just one repository from Hugging Face.
Also, I recommend installing ComfyUI Manager, which is just the UI add-ons stuff, pretty much a utility that, again, you don’t need to go on the website, download manually. You can just click on it, and it will download it from GitHub immediately.
And once you have this, again, you just drag stuff. You can literally go here and drag an image here, and the image and its metadata will then create the workflow if it’s embedded in it. So that’s the beauty of it. Like, you don’t really need to create all this spaghetti visual coding stuff. It will just have the—for instance, this one is an example workflow on the site. It was just like, throw in an image, here we go.
So again, what we have here and what are some of the things that you can control here and what gives you the things. So here we have the base model, as I said—the text encoder and the model itself. It’s quantized, so it’s lower quality, lower VRAM, so we can actually run it. Then we have stuff like “shift.” This is specific for the model. It’s more of like a contrast slider. So less shift means more contrast. More shifts means less contrast. That’s there.
Then we have the positive prompt. Yeah, I’ll get to it—how I got it. And the negative prompt. If I understand correctly, this one doesn’t really work with negative prompts that much. It’s, again, some image generators don’t even have that. Like Flux, for instance—they don’t have a negative prompt. Then we have the image size, which is just like a square of 1024 pixels times the same. We could pump it up to 2K, even higher. The problem is that it will just load longer, and we don’t need it for the sake of this video. So that’s there.
Roman: Jakub, quick question. Is it also effort-based, as you said at the beginning, in order to understand everything you actually—
Jakub: Yeah. As I said, no programming skills on my side, no computer science, no nothing. My background is psychology. Like, you don’t need anything. You can get these things still. As I said, for instance, we can link the literally the how-to guide tutorial into the video. There’s like a 40-minute tutorial, but most of the stuff—it’s not even a tutorial. It’s just the guy goes over what’s the comparison between these models—Z Image, Flux, and Qwen—is more of a comparison.
Jakub: So really, where he goes through file manager and just tells you how to install it—this takes like 10 minutes, honestly. It’s not like it will do this and that and it will be super hard. No, it won’t. It will be just like four or five clicks. Again, you have ChatGPT sitting right next to you that if you don’t understand, you just tell it, “I don’t understand this. What should I do?” It will tell you. It’s that easy.
Like, for instance, I didn’t understand which quantized model I should pick for my graphics card. And yeah, so this is what it told me. So I just literally pasted the repository from the thing, and it told me like, “Okay, so you go here, and these are the models. So if you have 10 to 12GB VRAM, pick this one because this one will probably be enough for your memory.” That’s it. And you do all these steps like this. It’s super easy. So nothing really to it.
So once we have all these fixed, let’s just finish the last step. So steps are very important. This is the setup that tells you how many actual parts of generations it goes through, because all these images—usually the diffusion models—it starts from noise. So imagine just a black-and-white grainy picture that all these pictures start like that. And this will be like how many steps—the noise will be run through this.
Then we have CFG value, which is how much prompt adherence compared to creativity we let the model do. Meaning, how much more creative we let it be compared to how it must be exactly as we prompted. Again, a value that you can play with. And then some base stuff that you don’t really need there.
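The knobs Jakub walks through here (seed, steps, CFG) correspond to fields of a KSampler node in the workflow JSON that ComfyUI can export in API format. The sketch below shows what such a node looks like; the node ids and link references are illustrative, and in practice you would export this JSON from the UI and only patch the values you vary.

```python
import json

# Illustrative KSampler node in ComfyUI's API-format workflow JSON.
# The [node_id, output_index] pairs link it to other (omitted) nodes.
ksampler = {
    "class_type": "KSampler",
    "inputs": {
        "seed": 123456789,  # fix for reproducibility, randomize to explore
        "steps": 8,         # denoising steps (turbo models need very few)
        "cfg": 1.0,         # prompt adherence vs. creativity trade-off
        "sampler_name": "euler",
        "scheduler": "simple",
        "denoise": 1.0,
        "model": ["1", 0],
        "positive": ["2", 0],
        "negative": ["3", 0],
        "latent_image": ["4", 0],
    },
}
workflow = {"5": ksampler}  # a complete workflow would also hold nodes 1-4

# This payload would be POSTed to /prompt on a running ComfyUI server
# (default 127.0.0.1:8188) to queue a generation.
payload = json.dumps({"prompt": workflow})
print(json.loads(payload)["prompt"]["5"]["inputs"]["steps"])  # 8
```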
So if we go here and run it, we have this kind of a demon guy, which is like hyper-realistic—a line drawing of a furious forest spirit. Da-da-da-da-da. Let’s run it.
Jakub: We have the same seed. Yeah, we need to change the seed to random because we don’t want to have the same seed each time.
Roman: Prompt. Mhm, I see a lot of text in there. Yeah. How do we get this?
Jakub: Yeah, exactly. Yeah, let me just generate the thing so you see it. Went through the prompt, now it’s in the K sampler, and then from K sampler, it goes to decode, and then there we have our image. So we have, instead of this guy, we have this guy. That’s quite easy.
How do we get this giant prompt? So prompting is kind of another way of learning these things. So, for instance, this prompt I got from CivitAI. CivitAI, again, is one of those things that I would recommend you go check it out. It’s pretty much the biggest open-source community website on the internet. Think of it literally as an Instagram. So it’s just basically images and videos of other creators that people vote on and then can check and do stuff.
The very important part about this site is that you can go there and learn and get stuff for yourself. So, for instance, our forest spirit is just—I was just browsing here. For instance, everyone, today’s images—what’s that? Generated. You have some very interesting stuff that you can get here.
By the way, spoiler alert: I’m using the CivitAI Green site, because there’s also the Civitai.com site, which is like 90% porn, because that’s what people generate with user-generated content. So just saying: if you want the one without it, it’s the Green one. If you want the one with it, it’s the base one.
Roman: Thanks for picking the right one for this recording. I appreciate it.
Jakub: No worries. So again, I just found the image from a creator, and the key part here is not the image itself, but again, this thing on the right, which we can zoom on a little bit.
Jakub: So what we have here is that it tells us actually how this was created. And we can even run it on the site itself and generate it there if you really want. The site allows it if you buy literally through credits. But again, why should we do it if we have it open source?
So what this tells us: It’s using the Z Image Turbo generator. So I can literally just go here, click here, and then I have the model. It was released November 26th, and I can download it or create with it or basically get stuff from it. You also have some kind of current generations and what people are doing there and stuff like that. But again, we already know the model.
Then we have the prompt. So we have the prompt. We can take the prompt, and you can play with it and use it. Prompts have very specific setups. Again, we would probably need a different podcast for it. But again, you don’t need to create this stuff yourself from scratch. You can learn from other people. This is why this site is so important.
It comes into the formula because you can create amazing stuff just by copying other people’s work and reverse-engineering it and seeing how it works. And therefore, you learn, and you learn very, very fast.
Then we also have some other important things, which is the metadata—basically how the guy specified his sliders in ComfyUI. So we see, as we talked, CFG scales a little bit more to the adherence, so it’s 1.1 only, eight steps. The sampler—we can even take the same seed and generate the same exact image if we want. That’s also possible because he left the seed here.
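The metadata block read off here (steps, CFG scale, sampler, seed) is typically shown on CivitAI as an A1111-style text blob. A minimal, stdlib-only parser for the common three-part layout might look like this — the exact format varies by uploader, so treat it as a sketch:

```python
def parse_generation_info(text: str) -> dict:
    """Parse an A1111-style generation-info blob (the format CivitAI
    typically shows next to an image) into prompt, negative prompt,
    and the key/value settings. Assumes the common three-part layout:
    prompt line, 'Negative prompt:' line, comma-separated settings line."""
    lines = text.strip().split("\n")
    info = {"prompt": lines[0], "negative_prompt": "", "settings": {}}
    for line in lines[1:]:
        if line.startswith("Negative prompt:"):
            info["negative_prompt"] = line.split(":", 1)[1].strip()
        elif ":" in line:  # the comma-separated settings line
            for part in line.split(","):
                key, _, value = part.partition(":")
                if value:
                    info["settings"][key.strip()] = value.strip()
    return info

blob = """a furious forest spirit, line drawing, hyper-realistic
Negative prompt: blurry, low quality
Steps: 8, Sampler: Euler, CFG scale: 1.1, Seed: 1234567890"""

meta = parse_generation_info(blob)
print(meta["settings"]["Steps"])      # '8'
print(meta["settings"]["CFG scale"])  # '1.1'
print(meta["settings"]["Seed"])       # '1234567890'
```

With the seed included, you have everything needed to reproduce the exact image, as mentioned above.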
Some people don’t share their generation metadata because they’re very—you know, want to stay confidential and stuff like that because some people work very hard on their workflow. But most of the stuff that you see here, you can do, and you can just take and learn from it. This is the beauty of the site—that you learn so much.
Jakub: So again, this was pretty easy to do, and we can do whatever we want, actually. Just for the sake of it—so if we go here, we can leave our fire guy and—
Jakub: Roman, tell me, what do you want to generate?

Roman: Let’s do something Christmas-related.

Jakub: Christmas. Zombie?

Roman: Zombie. Like, do you remember Plants vs. Zombies?
Jakub: Okay, this is what immediately sparked for me. By the way, good that you’re saying it. The beauty of these models—uhhh Christmas postcard…
Yeah, let’s try this one. The beauty of these models—good that you mentioned—is that they’re completely uncensored, which is, again, the big advantage of it. Because if you go into ChatGPT or, again, one of these kinds of main models, you can’t generate IP-based stuff. For instance, my son asked me, like, “Oh, can you—can I have, like, Olaf or whoever from Frozen?” Or like, no, you can’t, because these models have other AIs that are censoring the output of them so that you can’t do it. It’s impossible here.
Roman: Quick!
Jakub: Yeah, it’s very quick. Again, as I said, I’m using a lower-quality quantized one, so this would be a little bit different from the usual quality—you can pump it up, and there are still better models. This is the Turbo one, so speed is more important than quality itself. But again, whatever you do here, you see, you still can get amazing quality.
But again, if I go and, as I said, type “Elsa and Anna from Frozen standing in front of a giant frozen castle, cinematic, high quality, realistic”—let’s try. Yeah, the more of these tags and words you add to it, the better the image will be, of course. That goes without saying. As I said, I would recommend for anyone to learn just the process.
Oh, there we go! See?
Roman: Oh, that’s literally—well, yeah, like 95%.
Jakub: It’s like if we would fine-tune it a little bit more with details and—you see, the ice maybe needs a little bit of stuff like that here and there. And yeah, we can get to it very easily.
Roman: Your legal department is not going to be happy about—
Jakub: Yeah, yeah, yeah. But again, you can do whatever you want. That’s the beauty of it. So it gives you—and it’s completely free. You know, just take electricity and your GPU, nothing really to it.
But again, I would recommend for anyone just to kind of touch this, run through it, and just learn it. Because, again, you can apply this same process—how this works—to any modality, to like, as I said, text-to-video, image-to-video, 2D art, 3D art, voice, you know, whatever. It works the same. And I think it’s important for people to understand what’s under the hood and how much control they can actually have. Because it’s amazing.
And we’ll probably end up with this last thing, which is my signature stuff that I was working on. And yeah, this gives you much, much, much more control. This is a very advanced workflow that—not this one, sorry, this one. There we go. Let it run because this one is actually 240 steps.
Roman: What does it do? I didn’t understand. What does it do?
Jakub: Yeah, yeah. So what we have here is that we are actually using an Ion on Justice anime model, and we are using the model only for 140 steps. And what we’re trying to achieve—we’re trying to generate a snow leopard anthropomorphic warrior in a realistic style for our game. There’s a pretty big prompt here, pretty big negative prompt also.
It took some time for me to do this. But we want this to be realistic, and the anime model that I’m using here is not able to do realistic stuff. So what’s happening here?
So what you do: You use a refiner. So what it does—after 140 steps, this model stops, and I actually plug in a different model.
So now we’re doing a two-model generation through Fennekin, which is a realistic model, which finishes the generation—the denoising of the noise from the image—for another 100 steps. So it goes all the way to 240. That’s why it’s taking like 3 minutes. And then it basically creates something that neither of these models could create on its own. Because we want, again, a fantasy-style snow leopard warrior guy that—again, I was not satisfied with anything I found on the internet, so I just dug deeper and deeper and deeper and got into it.
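The refiner hand-off is essentially step bookkeeping: the first model denoises up to a switch-over step, and the second model finishes. In ComfyUI this is typically wired with two sampler nodes sharing start/end step settings; the toy sketch below models only the schedule, not the actual diffusion:

```python
def two_stage_schedule(total_steps: int, switch_at: int) -> list[str]:
    """Which model handles which denoising step in a base+refiner setup.

    Steps [0, switch_at) go to the base model, [switch_at, total_steps)
    to the refiner -- the pattern described in the episode: an anime
    base model for 140 steps, a realistic model for the remaining 100.
    """
    return [
        "anime_base" if step < switch_at else "realistic_refiner"
        for step in range(total_steps)
    ]

schedule = two_stage_schedule(total_steps=240, switch_at=140)
print(schedule.count("anime_base"))         # 140
print(schedule.count("realistic_refiner"))  # 100
```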
The very important thing is that this model and the workflow that we have here—and by the way, everything that you see here, we’re not using even half of it. Here, we have basically the possibilities of this workflow, and you can just plug them out like functions. You know, just click here and enable it or not. All the violet stuff that you see means that that’s inactive. We’re not using an OpenPose, IP adapter, or ControlNet upscaler, all these other things. It can do so many things that, again, would take a different podcast to do.
But what it can do is still—we’re using the after-generation corrections, like the detailer passes. This is the really important part. Because in the image that we generated here, for instance, you see that, yeah, they’re great, but there’s something strange about these two. It’s not just the position of their eyes and everything—it’s like they look like they’re from Wish.
So what happens here is we can look at it in real time, actually, as the workflow is continuing. And okay, it’s already on ADetailer. So we have the base image here, and you see, it’s not perfect. It’s like the face is kind of distorted. Yeah, we don’t really want this. So what’s happening? We have a face detailer, and the face detailer actually fixes only the face. So we are putting another generation on the image that we already have here. And not only that, we’re also fixing the eyes to make them a little bit better.
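Conceptually, a face-detailer pass is “detect a region, regenerate only that crop, paste it back.” Here is a toy sketch of just that bookkeeping — real detailer nodes run an actual inpainting diffusion pass on the crop, and the box here is a hypothetical detection result:

```python
def face_detailer(image: list, box: tuple) -> list:
    """Toy sketch of a detailer pass: regenerate only a detected region
    and paste it back. Here 'regeneration' just marks pixels; real
    detailer nodes run an inpainting diffusion pass on the crop."""
    x0, y0, x1, y1 = box
    fixed = [row[:] for row in image]  # leave the base image untouched
    for y in range(y0, y1):
        for x in range(x0, x1):
            fixed[y][x] = "fixed"
    return fixed

base = [["raw"] * 4 for _ in range(4)]
result = face_detailer(base, (1, 1, 3, 3))  # hypothetical detected face box
print(sum(cell == "fixed" for row in result for cell in row))  # 4
```

The same pattern is then repeated for the eyes, hands, and body passes described above.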
Roman: Oh, yeah, yeah, yeah. I see, I see.
Jakub: Basically. And you can do—again, there are like four passes we can do, both hand and body kind of setup. You see how the body is kind of, again, fixed a little bit.
So last time I was checking some stuff on the internet, a professional from an AI agency who was sharing his workflow said that it takes him something like 20 hours and 500 generations per image to get it where he wants it to be—like top quality. So just to give you an example, that’s how it works: from the really, really basic stuff, like “Let me generate Olaf for my son,” to very, very advanced stuff like this. Because, again, this is something that needs to be kind of perfect, because it, again, defines what you want to do.
Jakub: And if we go, again, somewhere here—not this one, but the one creative that I got really, really—not this one. Yeah, there we go. So you see how beautiful these creatives are? Literally like a Pixar movie. And again, you get to this quality by being able to use advanced workflow. And what you end up with are these perfect creatives afterwards.
So, again, that’s the beauty of it. Because this looks literally like a high-level cinematic. It’s like something that somebody would take, again, lots and lots and lots of work and time to kind of get and generate—I mean, draw. But then, again, you can just generate it through pretty much an advanced workflow timeline. And yeah, it would go, and you need consistent characters and all these other things.
But as I said, it’s like step one to getting all these things. So for any creative team that is making creatives, yeah, I think AI—image generation and video generation mastery—is like an existential problem next year, basically. Because you just won’t be able to keep up with the volume otherwise. It’s just very, very hard. And having enough creative volume means that your CPI stays low, as it should, instead of creeping higher. So eventually, all of these things translate basically into that.
And as I said, even though this looks super complex and stuff, it’s not. Like, I’m not even scratching the surface of how complex it can get. It’s just some basic stuff that I’m showing—nothing really to it.
Roman: Yeah, I really like how we look at both of the schemas. At the end of the day, we generate an image.
Jakub: Yeah, yeah, yeah, yeah, right. Yeah, we have a nice warrior here. But again, you can do whatever you want in the end. And that’s, again, as I said, that’s the beauty. You can play with it and try it for yourself. There’s not some money-hungry website consuming credits or whatever. As I said, the only thing that you need is your GPU and electricity. So you can do whatever. I’m currently past something like 12K, 13K generations, probably.
So yeah, sometimes, even here, for instance, there’s a setting to run it in batches of eight or whatever. You can just go to sleep, let it run, and then pick the best one. I do that sometimes. Yeah, it’s like an idle game, you know—you come back and you collect your rewards afterwards.
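The overnight-batch routine is just a loop with a fresh seed per image. A toy sketch, with `generate` as a hypothetical stand-in for the actual sampler call:

```python
import random

def generate(prompt: str, seed: int) -> int:
    """Hypothetical stand-in for the actual sampler call."""
    return random.Random(f"{prompt}|{seed}").getrandbits(32)

def overnight_batch(prompt: str, batch_size: int = 8) -> list:
    """One (seed, image) pair per queued generation -- run it, sleep,
    and pick the best result in the morning. Keeping the seed with each
    image means any keeper can be regenerated and refined later."""
    results = []
    for _ in range(batch_size):
        seed = random.randrange(2 ** 32)  # fresh seed per image
        results.append((seed, generate(prompt, seed)))
    return results

batch = overnight_batch("snow leopard anthropomorphic warrior, realistic")
print(len(batch))  # 8
```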
But I think it’s very fulfilling in order to be able to know how this stuff works. Because then also what it gives you is, once you go and once you see these creatives and all these other things that are currently trending on the—most of the times, I can even spot the generation model just by looking at it. Because some models are very specific—you know, you can see the giveaway.
Like, for instance, all the ChatGPT images—they have this orange tint behind them. So if we go, I think, to one of the last ones here—yeah, see, this one. So this is a creative that Candy Crush is running, and you can see that there’s this kind of orange aura around the image, which means that it was based on ChatGPT as the base image. So here, it’s very, very visible. But in general, it’s quite subtle. Once you’ve seen 10,000 of these images, though, it’s very obvious.
So therefore, you can immediately see how these teams are working, how they are generating these things, and you can do it yourself. Again, it gives you a lot of edge over the process.
Roman: And it’s really a skill, right, for probably 2026—if you’re doing creatives, whether you’re like a UA manager and that’s part of your responsibility, or you’re part of the creative team.
And just trying to summarize at the end: The cloud-based services like ChatGPT or Claude will give you less flexibility with what you can do. Therefore, we would like to use open-source models.
Jakub: Like, honestly, you can even combine those in a way that, for instance, you can do the image in your open-source model to kind of refine it better, and then use a video generator. Video generator is quite important, but it’s like the final part of the thing where, you know, what you want to generate, which starts from the base image, which can be a combination of like, “I don’t know, use the ComfyUI—I want to generate the image,” and then like Veo 3 for finishing, you know, the video and stuff like that. Again, you can do whatever you want.
Best-case scenario, you test all of it, and you figure out which one works best for you. That’s the beauty of it. But again, by knowing these things, we even have the option to test it yourself. Otherwise, you just like, “Oh, we can just use Veo 3. That’s the only thing we know.” And that’s it.
Roman: I see.
Jakub: And the other big problem with these—these things degrade very fast. And not really degrade, but pretty much new stuff gets released all the time, and you need to keep up. I’ve seen some of the creatives here, for instance—I think, yeah, these ones, or maybe a little bit older ones—that you can see some of those are just running on old models. People just haven’t updated yet, which is, again, normal, because this was—the update cycle here is like 3 months or something. But you want to be using cutting-edge stuff because, again, it gives you an edge on quality, stuff like that, and all these other things. So yeah, just kind of moves very, very fast.
Roman: How do you keep up, Jakub? You personally? Do you have time?
Jakub: I don’t know. Okay, yeah, understandable. So I listen to a few podcasts, of course, based on AI. Literally, we can link—this one is, in my opinion, the—this kind of AI Search YouTube channel is literally a guy that just does the news. Every week, he goes, “What was released this week?” and just goes through all the models. And they have specific videos on specific stuff, like comparisons, stuff like that. So this one kind of debrief I watch very regularly.
Jakub: Then I have a few podcasts that are industry-based, like what’s the latest of ChatGPT versus Microsoft and all these other things. And then, of course, you go on the Civitai site, and you just go and see what people generate.
So, for instance, here you could clearly see that lots of these things like, okay, there’s a lot of ChatGPT actually in it. There’s a lot of—what’s this? Yeah, Google Nano Banana, pretty much is trending really high these days. And you just see what people generate and what’s pretty much there in the market. And this just tells you, “Okay, yeah, that’s basically it.” So, you know who generates what, right?
Roman: And this is so interesting that you’re actually a game design expert, and you are now fully into this creative part of the game. Full in. How does it feel, Jakub?
Jakub: Oh, it’s great, you know. I’m always kind of obsessed in—but again, the biggest problem was I didn’t understand how this thing works, which for me is the biggest itch of like, “I need to do something about it,” because I don’t feel safe, or how do we say it? I don’t feel on top of it if I don’t understand how it works.
So it kind of drives me to kind of go into this rabbit hole and learn about this. Because, again, if you don’t know something, at least know how it works. You don’t need to specifically do it, but at least know that there are these options. Because that way, you won’t get sidelined. You won’t get in a situation where somebody tells you something, you can’t call their bullshit, and you don’t know if this is the best, not the best, or are they even telling you the truth, stuff like that.
So, again, AI is just moving so fast these days, and it’s one of the most important technologies of our lives currently. So why not, you know, kill two birds with one stone? Where, again, we need this because of our game-industry expertise. And on the other hand, of course, AI will be there, and it will change stuff. That’s for sure.
But again, the edge that it gives me now is I know, for instance, gaming-wise—no, gameplay-wise, it won’t change stuff that much. It will maybe help with some optimizations, like matchmaking or whatever bolts, I don’t know what. But we still haven’t reached the point that AI is doing the, you know, the AI-enabled games—games that won’t be working without AI capabilities. We still didn’t hit that inflection point.
It’s not something like it was, for instance, in, I don’t know, 1999 or something, where Doom 3D was released. Because the 3D-ness of it enabled you to do stuff that you couldn’t do in 2D. We haven’t reached this point yet. Again, why? Because you learn about this, and you know that, like, “Oh, this doesn’t make sense. It cannot even code properly yet, so much hallucinations.”
Roman: Everything is connected. All right, Jakub, this was super insightful. I’m sure the guys will have a lot of questions. I’ll ask everyone to leave their questions in the comments. We’ll ask Jakub to answer them once he has time.
Any parting thoughts? Or we will, of course, leave all the links in the descriptions to Two and a Half Gamers and the stuff that we mentioned during the videos. But I also want Jakub to say something, especially at the end of the year. Last parting thoughts from you, Jakub?
Jakub: Yeah, yeah. As I said, if you have any questions or any thoughts, comments, feel free to leave them under the video, or you can join the Two and a Half Gamers Slack. That’s also open for all the people to kind of share their knowledge and talk with others.
Jakub: Yeah, I would—as I said, parting line is: Go and learn it. Don’t wait for it until it will kind of, you know, it’s too late to kind of catch up.
Roman: Well said, well said. We’ll end on this point. Like and subscribe. I’m sure you liked this episode. And thanks a lot, Jakub.
Jakub: Yeah, no worries. See you there. Cheers.
Roman: Bye-bye.
Marketing Content Manager
Tara Meyer