<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.4.1">Jekyll</generator><link href="https://paolino.me/feed.xml" rel="self" type="application/atom+xml" /><link href="https://paolino.me/" rel="alternate" type="text/html" /><updated>2026-04-19T07:27:41+00:00</updated><id>https://paolino.me/feed.xml</id><title type="html">Carmine Paolino</title><subtitle>I build AI tools at &lt;a href=&quot;https://chatwithwork.com&quot;&gt;Chat with Work&lt;/a&gt; and &lt;a href=&quot;https://rubyllm.com&quot;&gt;RubyLLM&lt;/a&gt;. Co-founded &lt;a href=&quot;https://freshflow.ai&quot;&gt;Freshflow&lt;/a&gt;. Outside tech, I &lt;a href=&quot;https://crimsonlake.live&quot;&gt;make music&lt;/a&gt;, run &lt;a href=&quot;https://floppydisco.live&quot;&gt;Floppy Disco&lt;/a&gt;, and &lt;a href=&quot;https://paolino.photography&quot;&gt;take photos&lt;/a&gt;.</subtitle><author><name>Carmine Paolino</name></author><entry><title type="html">Your Agent’s Context Window Is Not a Junk Drawer</title><link href="https://paolino.me/your-agents-context-window-is-not-a-junk-drawer/" rel="alternate" type="text/html" title="Your Agent’s Context Window Is Not a Junk Drawer" /><published>2026-04-07T00:00:00+00:00</published><updated>2026-04-07T00:00:00+00:00</updated><id>https://paolino.me/your-agents-context-window-is-not-a-junk-drawer</id><content type="html" xml:base="https://paolino.me/your-agents-context-window-is-not-a-junk-drawer/"><![CDATA[<p>Your agent’s context window is the most precious resource it has. The more you stuff into it, the worse your agent performs.</p>

<p>Researchers call it <a href="https://research.trychroma.com/context-rot">context rot</a>: the more tokens in the window, the harder it becomes for the model to follow instructions, retrieve information, and stay on task. Chroma tested 18 frontier models and found that accuracy drops by up to 30% when you go from a focused 300-token input to 113k tokens of conversation history, with the task held constant. The model essentially becomes <em>dumber</em>.</p>

<p>This holds true regardless of how big the window is, yet most agent setups treat the context window like a junk drawer.</p>

<p>“Just toss it in there, the LLM will figure it out!”</p>

<h2 id="mcp-the-biggest-offender">MCP: the biggest offender</h2>

<p>Don’t get me wrong. MCP is a fine idea. You need to talk to a service? Grab an MCP server, plug it in, and you’re running in ten minutes. For prototyping, for exploration, for answering “is this even worth building?”, it’s great.</p>

<p>The problem is what happens next. Which is: nothing.</p>

<p>People leave the MCP servers plugged in. They add more. Every MCP server you connect dumps tool descriptions, schemas, and instructions into your context. You didn’t write those. You didn’t optimize them. You probably haven’t even read them. You’re handing over a chunk of your context window to whatever some third party decided to shove in there.</p>

<p>Say you need a tool that checks the weather. You could plug in an MCP server and get dozens of tool descriptions, parameter schemas, and whatever instructions its author decided to write. Or you could write this:</p>

<pre><code class="language-ruby">class Weather &lt; RubyLLM::Tool
  description "Gets current weather for a location"

  param :latitude, desc: "Latitude (e.g., 52.5200)"
  param :longitude, desc: "Longitude (e.g., 13.4050)"

  def execute(latitude:, longitude:)
    url = "https://api.open-meteo.com/v1/forecast?latitude=#{latitude}&amp;longitude=#{longitude}&amp;current=temperature_2m,wind_speed_10m"
    Faraday.get(url).body
  rescue =&gt; e
    { error: e.message }
  end
end
</code></pre>

<p>Twelve lines of <a href="https://rubyllm.com">RubyLLM</a>. You wrote the description, so you know exactly what tokens are going into your context. You wrote the parameters, so the model gets precisely the interface it needs, no more. You own it, you can tune it, and nobody can inject anything into your agent’s brain through it.</p>

<p>Use MCP to prototype. Then replace it with crafted tools you actually control.</p>

<h2 id="tool-responses-are-context-too">Tool responses are context too</h2>

<p>Your RAG retrieves ten full documents when the model needs a paragraph. Your API call returns a massive JSON blob when the model needs two fields. You’re paying for every one of those tokens with your agent’s IQ.</p>

<p>The fix is progressive disclosure. At <a href="https://chatwithwork.com">Chat with Work</a>, when the agent searches your Google Drive, we don’t dump entire files into context. The search tool returns only some metadata and a single line from the file, the line that matched the search keywords. Fifty results, fifty lines. The AI reads those, decides which files actually matter, and only then reads them. If a file is too large, it reads it in chunks. At every step, the model is only looking at what it needs.</p>

<p>The same principle applies to any tool. Don’t return everything. Return enough for the model to decide what to look at next.</p>
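
<p>In code, a search tool shaped this way might look like the sketch below. It’s a minimal illustration in RubyLLM; the <code>Document</code> model and its <code>snippet_for</code> method are hypothetical stand-ins for whatever your search backend provides:</p>

<pre><code class="language-ruby">class SearchFiles &lt; RubyLLM::Tool
  description "Searches files and returns matching snippets, not full contents"

  param :query, desc: "Search keywords"

  def execute(query:)
    # Return one matched line per file plus just enough metadata
    # for the model to decide which files to open next.
    Document.search(query).limit(50).map do |doc|
      { id: doc.id, title: doc.title, matched_line: doc.snippet_for(query) }
    end
  end
end
</code></pre>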

<h2 id="your-instructions-are-context-too">Your instructions are context too</h2>

<p>Then there’s the stuff you wrote yourself. Your system prompt is context. Your tool descriptions are context. Your parameter schemas are context. Every edge case, every guardrail, every overly detailed description competes for attention. You think you’re being thorough. You’re actually drowning the instructions that matter in a sea of instructions that don’t. A focused system prompt will outperform an exhaustive one every time.</p>

<h2 id="tool-count-is-context-too">Tool count is context too</h2>

<p>You hand-crafted 40 beautiful tools. Your agent needs 5 for this task. The other 35 sit in context doing nothing except making the model slower at picking the right one.</p>

<p>Don’t register every tool your agent might ever need. Load the tools the current task actually requires. If you’re building a support agent that handles billing and technical issues, don’t give it all of both. Route billing questions to a billing agent and technical questions to a technical agent. Two focused agents will outperform one bloated one.</p>
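
<p>As a sketch of that split with the RubyLLM agent DSL (the agent names, tools, and the <code>billing_question?</code> router below are hypothetical):</p>

<pre><code class="language-ruby">class BillingAgent &lt; RubyLLM::Agent
  model "gpt-5-nano"
  instructions "You are a billing support agent. Handle billing questions only."
  tools LookupInvoice, IssueRefund
end

class TechnicalAgent &lt; RubyLLM::Agent
  model "gpt-5-nano"
  instructions "You are a technical support agent. Handle technical issues only."
  tools SearchDocs, CheckServiceStatus
end

# Route first, then let a small agent work with only the tools its task needs.
agent = billing_question?(message) ? BillingAgent : TechnicalAgent
agent.new.ask(message)
</code></pre>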

<h2 id="every-token-should-earn-its-place">Every token should earn its place</h2>

<p>The context window is not a junk drawer. It’s a workbench. Everything on it should be there for a reason, and you should be able to say what that reason is.</p>

<p>So before you plug in another MCP server, add another RAG source, or write another paragraph in your system prompt, ask yourself one question: is this worth making my agent dumber?</p>]]></content><author><name>Carmine Paolino</name></author><category term="AI" /><category term="LLM" /><category term="MCP" /><category term="Agents" /><category term="Developer Experience" /><summary type="html"><![CDATA[Strategies to combat context rot.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://paolino.me/images/context-rot.png" /><media:content medium="image" url="https://paolino.me/images/context-rot.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">I Built a Monitor Configuration Tool for Hyprland</title><link href="https://paolino.me/hyprmoncfg-monitor-configuration-for-hyprland/" rel="alternate" type="text/html" title="I Built a Monitor Configuration Tool for Hyprland" /><published>2026-03-31T00:00:00+00:00</published><updated>2026-03-31T00:00:00+00:00</updated><id>https://paolino.me/hyprmoncfg-monitor-configuration-for-hyprland</id><content type="html" xml:base="https://paolino.me/hyprmoncfg-monitor-configuration-for-hyprland/"><![CDATA[<p>Configuring monitors in Hyprland means writing <code>monitor=</code> lines by hand. A 4K display at 1.33x scale is effectively 2880x1620 pixels, so the monitor next to it needs to start at x=2880. Vertically centering a 1080p panel against it means doing division in your head to get the y-offset right. You reload, you’re off by 40 pixels, you edit, you reload again. There’s no visual feedback until after you’ve committed to a config.</p>
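
<p>For example, the hand-written lines for that setup might look like this (a sketch; the monitor names are illustrative, and the scale is written as 1.333333 so the 4K display’s logical size comes out to exactly 2880x1620):</p>

<pre><code class="language-ini"># 4K display at ~1.33x scale: 2880x1620 logical, so the next monitor starts at x=2880
monitor=DP-1,3840x2160@60,0x0,1.333333
# 1080p panel, vertically centered against 1620 logical: y = (1620 - 1080) / 2 = 270
monitor=HDMI-A-1,1920x1080@60,2880x270,1
</code></pre>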

<p>Then it gets worse. You unplug your laptop, go to a conference, plug into a projector, and you’re back to editing config files backstage before your talk. You come home, dock the laptop, and the layout is wrong again.</p>

<p>I looked at what was available. The closest to what I wanted was <a href="https://github.com/ToRvaLDz/monique">Monique</a>: spatial editor, profiles, workspace management, a hotplug daemon. It does exactly what I need. But it’s a GTK4 GUI that pulls in Python and a stack of dependencies, and the daemon was broken when I tried it. The other tools each cover parts of this: <a href="https://sr.ht/~emersion/kanshi/">kanshi</a> does profiles and auto-switching but has no editor (you write config files by hand); <a href="https://github.com/nwg-piotr/nwg-displays">nwg-displays</a> and <a href="https://github.com/erans/hyprmon">HyprMon</a> have spatial editors but no daemon; <a href="https://github.com/fiffeek/hyprdynamicmonitors">HyprDynamicMonitors</a> has a daemon but no real layout tool, and it pulls in UPower and D-Bus.</p>

<p>I wanted Monique’s feature set without the dependency baggage, in something that works over SSH when your monitors are broken. So I built <a href="https://hyprmoncfg.dev">hyprmoncfg</a>.</p>

<h2 id="a-real-spatial-editor-in-your-terminal">A real spatial editor, in your terminal</h2>

<p>The TUI is the thing I’m most proud of. It’s not a config editor with a preview pane. It’s a full spatial layout tool.</p>

<p>The left side is a canvas where your monitors are drawn as rectangles, proportional to their resolution. You click one to select it, drag it to move it. Monitors snap to each other’s edges as you position them, just like arranging windows in a GUI display manager. Arrow keys give you fine control: 100px per step, Shift for 10px, Ctrl for 1px.</p>

<p>The right side is a per-monitor inspector. Pick a resolution and refresh rate from a scrollable list. Set scale, position, transform, VRR, mirroring. All inline, no dialogs within dialogs. A third tab handles workspace planning.</p>

<p>And because it’s a TUI: it works over SSH. When your monitor configuration is broken and you can’t see anything, you can SSH into the machine and fix it. Try that with a GTK app.</p>

<h2 id="safe-apply-with-automatic-revert">Safe apply with automatic revert</h2>

<p>Every apply, whether from the TUI or the daemon, follows the same path: write <code>monitors.conf</code> atomically (temp file + rename, no corruption), reload Hyprland, re-read the actual monitor state, and verify the result matches what was requested.</p>

<p>Then it gives you 10 seconds to confirm. If you don’t, maybe because the layout left you staring at a black screen, it reverts automatically. No stuck monitors. No reaching for a second machine to undo the damage.</p>

<p>This is the same apply engine everywhere. The TUI and the daemon share identical code. If it works when you test it interactively, it works when the daemon fires at 2am because you bumped your dock cable.</p>

<h2 id="workspace-planning">Workspace planning</h2>

<p>Monitor configuration and workspace assignment are the same problem. If you’re rearranging monitors, you probably want workspaces to follow. hyprmoncfg has a workspace planner built into its third tab, with three strategies:</p>

<ul>
  <li><strong>Sequential</strong>: Groups in chunks. Workspaces 1-3 on monitor A, 4-6 on monitor B.</li>
  <li><strong>Interleave</strong>: Round-robins. 1→A, 2→B, 3→A, 4→B.</li>
  <li><strong>Manual</strong>: Explicit per-workspace rules when you want full control.</li>
</ul>

<p>Workspace assignments are stored inside each profile and applied together with the layout. Switch profiles, switch workspace distribution. One operation.</p>

<h2 id="source-chain-verification">Source-chain verification</h2>

<p>Here’s something no other tool does. Before writing anything, hyprmoncfg parses your <code>hyprland.conf</code> and verifies it actually sources the target <code>monitors.conf</code>. If it doesn’t, it refuses to write.</p>

<p>Other tools skip this check. They silently update a file that Hyprland never reads. You spend twenty minutes debugging why nothing changed, only to realize the file was never sourced. I lost an evening to this once. Never again.</p>

<h2 id="dotfiles-integration">Dotfiles integration</h2>

<p>Profiles are stored as JSON files in <code>~/.config/hyprmoncfg/profiles/</code>, one per profile. The generated <code>monitors.conf</code> is a build artifact; you don’t commit it. You commit the profiles.</p>

<pre><code class="language-sh">chezmoi add ~/.config/hyprmoncfg
</code></pre>

<p>Save a “desk” profile at home with your ultrawide. Save “conference-1080p” at one venue. Save “conference-4k” at another. Sync them across machines via your <a href="https://github.com/crmne/dotfiles">dotfiles</a>. The daemon matches profiles to connected hardware automatically. Arrive somewhere, plug in, and the right layout applies.</p>

<p>This is portable. The same profile library works across machines because matching is based on the monitors you have, not on the machine you’re at.</p>

<h2 id="one-runtime-dependency-hyprland">One runtime dependency: Hyprland</h2>

<p>Two compiled Go binaries. No Python, no GTK, no GObject introspection, no D-Bus, no UPower. Install them and you’re done. The only runtime requirement is Hyprland itself.</p>

<h2 id="how-it-compares">How it compares</h2>

<table>
  <thead>
    <tr>
      <th> </th>
      <th>hyprmoncfg</th>
      <th>Monique</th>
      <th>HyprDynamicMonitors</th>
      <th>HyprMon</th>
      <th>nwg-displays</th>
      <th>kanshi</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>GUI or TUI</td>
      <td>TUI</td>
      <td>GUI</td>
      <td>TUI</td>
      <td>TUI</td>
      <td>GUI</td>
      <td>CLI</td>
    </tr>
    <tr>
      <td>Spatial layout editor</td>
      <td>Yes</td>
      <td>Yes</td>
      <td>Partial</td>
      <td>Yes</td>
      <td>Yes</td>
      <td>No</td>
    </tr>
    <tr>
      <td>Drag-and-drop</td>
      <td>Yes</td>
      <td>Yes</td>
      <td>No</td>
      <td>Yes</td>
      <td>Yes</td>
      <td>No</td>
    </tr>
    <tr>
      <td>Snapping</td>
      <td>Yes</td>
      <td>Not documented</td>
      <td>No</td>
      <td>Yes</td>
      <td>Yes</td>
      <td>No</td>
    </tr>
    <tr>
      <td>Profiles</td>
      <td>Yes</td>
      <td>Yes</td>
      <td>Yes</td>
      <td>Yes</td>
      <td>No</td>
      <td>Yes</td>
    </tr>
    <tr>
      <td>Auto-switching daemon</td>
      <td>Yes</td>
      <td>Yes</td>
      <td>Yes</td>
      <td>No (roadmap)</td>
      <td>No</td>
      <td>Yes</td>
    </tr>
    <tr>
      <td>Workspace planning</td>
      <td>Yes</td>
      <td>Yes</td>
      <td>No</td>
      <td>No</td>
      <td>Basic</td>
      <td>No</td>
    </tr>
    <tr>
      <td>Mirror support</td>
      <td>Yes</td>
      <td>Yes</td>
      <td>Yes</td>
      <td>Yes</td>
      <td>Yes</td>
      <td>No</td>
    </tr>
    <tr>
      <td>Safe apply with revert</td>
      <td>Yes</td>
      <td>Yes</td>
      <td>No</td>
      <td>Partial (manual rollback)</td>
      <td>No</td>
      <td>No</td>
    </tr>
    <tr>
      <td>Source-chain verification</td>
      <td>Yes</td>
      <td>No</td>
      <td>No</td>
      <td>No</td>
      <td>No</td>
      <td>No</td>
    </tr>
    <tr>
      <td>Additional runtime dependencies</td>
      <td>None</td>
      <td>Python + GTK4 + libadwaita</td>
      <td>UPower, D-Bus</td>
      <td>None</td>
      <td>Python + GTK3</td>
      <td>None</td>
    </tr>
  </tbody>
</table>

<h2 id="try-it">Try it</h2>

<p>On Arch:</p>

<pre><code class="language-sh">yay -S hyprmoncfg
</code></pre>

<p>Or build from source:</p>

<pre><code class="language-sh">go install github.com/crmne/hyprmoncfg/cmd/hyprmoncfg@latest
go install github.com/crmne/hyprmoncfg/cmd/hyprmoncfgd@latest
</code></pre>

<p>Check out the <a href="https://hyprmoncfg.dev/">documentation</a> for the full guide, or browse the <a href="https://github.com/crmne/hyprmoncfg">source on GitHub</a>.</p>]]></content><author><name>Carmine Paolino</name></author><category term="Hyprland" /><category term="Open Source" /><category term="Go" /><category term="Linux" /><category term="TUI" /><summary type="html"><![CDATA[A spatial TUI editor with drag-and-drop, safe apply with revert, workspace planning, and a hotplug daemon. All in two zero-dependency Go binaries.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://paolino.me/images/hyprmoncfg-demo.gif" /><media:content medium="image" url="https://paolino.me/images/hyprmoncfg-demo.gif" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Comb Shaped Slices</title><link href="https://paolino.me/comb-shaped-slices/" rel="alternate" type="text/html" title="Comb Shaped Slices" /><published>2026-03-24T00:00:00+00:00</published><updated>2026-03-24T00:00:00+00:00</updated><id>https://paolino.me/comb-shaped-slices</id><content type="html" xml:base="https://paolino.me/comb-shaped-slices/"><![CDATA[<p>A friend who’s built and shut down companies in this space sat across from me at breakfast during a conference recently. He knows what I’m building: <a href="https://chatwithwork.com">Chat with Work</a>, an AI tool that lets you talk to your actual work data. He wanted to know what my plan was. I think he was a bit concerned.</p>

<p>“Add more integrations, finish the security assessment, market it well,” I said.</p>

<p>That didn’t help. “All those LLM providers are going to eat the whole market. They’ll ship every integration you can think of. If you want a slice of the pie, you need to pick a vertical and own it.”</p>

<p>I told him I was going to grab a T shaped slice of the pie instead.</p>

<p>He looked at me like I’d lost it.</p>

<hr />

<p>Here’s the thing about the “pick a vertical” advice: it’s not wrong. It’s just not the only way. And for a lot of small software companies, it’s a trap dressed up as strategy.</p>

<p>The conventional wisdom goes like this: the market is huge, the big players are coming, so you’d better find your little corner and defend it. Specialize. Go deep. Become the AI assistant for dentists in Luxembourg or the knowledge tool for corporate lawyers in Berlin-Brandenburg. Calculate your total addressable market. Build a defensible moat. Make investors happy.</p>

<p>But what if you don’t care about making investors happy? Most companies don’t need investors. What if you just want to build something good?</p>

<h2 id="the-comb">The comb</h2>

<p>I said T shaped in the moment. One horizontal, one vertical. But the more I thought about it, the more teeth it grew. Less like a T, more like a comb.</p>

<p>Here’s why. When you’re OpenAI or Google, you sample from the top of the distribution. You build what most people use first, then work your way down. The result is always the same: a broad horizontal platform that serves everyone and surprises no one.</p>

<p>When you’re small, you sample from what’s right in front of you. You build for yourself because no amount of user research, design thinking, or theory of mind will ever match the depth of actually needing the thing you’re making. You understand your own problems in a way that connects to your emotions, your workflow, your instincts. You can’t fake that. You can’t interview your way to it. I chose fast onboarding over full sync, because I don’t want to wait to start working. Nextcloud, Todoist, IMAP, and CalDAV: that’s my stack, so that’s where I’ll go deep next.</p>

<p>Then you listen to your customers. “This is cool, but I use Slack.” So you build that too. A team needs to own their data, so you add on-premises installation. Someone uses Basecamp, and you build that integration because the people behind it think like you. One tooth at a time.</p>

<p>The shape that emerges is yours. Not because you planned it on a whiteboard, but because you started from yourself and grew outward. It works for the small teams, the freelancers, the music collectives, the people who don’t have an IT department and don’t want one. That’s the comb: not a strategy you choose, but what naturally happens when you’re small and you give a damn.</p>

<p>There’s a reason people still choose Linear over Jira, or Proton over Gmail, or Plausible over Google Analytics. It’s not because the small player has more features. It’s because someone built it for themselves first, and that resonated. The entire market doesn’t need to resonate with you. Just enough of it.</p>

<p>So yes, the big players are coming. They’re going to ship a lot of integrations. They’re going to spend a lot of money. And they’re going to build software that feels like it was built by a company that spends a lot of money.</p>

<p>I’ll be over here, grabbing my comb shaped slice of pie. It’s <a href="https://plenty.is">Plenty</a>.</p>

<p><em>Today also happens to be the day I officially founded <a href="https://plenty.is">Plenty</a>. The papers are signed. The comb is real!</em></p>]]></content><author><name>Carmine Paolino</name></author><category term="Startups" /><category term="Product Strategy" /><category term="SaaS" /><category term="Bootstrapping" /><category term="Indie" /><summary type="html"><![CDATA[You don't need to pick a vertical. Build for yourself, listen to your customers, and the shape that emerges is yours.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://paolino.me/assets/images/og/posts/comb-shaped-slices.png" /><media:content medium="image" url="https://paolino.me/assets/images/og/posts/comb-shaped-slices.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Ruby Deserves Beautiful Documentation</title><link href="https://paolino.me/ruby-deserves-beautiful-documentation/" rel="alternate" type="text/html" title="Ruby Deserves Beautiful Documentation" /><published>2026-03-19T00:00:00+00:00</published><updated>2026-03-19T00:00:00+00:00</updated><id>https://paolino.me/ruby-deserves-beautiful-documentation</id><content type="html" xml:base="https://paolino.me/ruby-deserves-beautiful-documentation/"><![CDATA[<p>Have you ever looked at a VitePress documentation site and felt a little jealous?</p>

<p>The sidebar navigation. The “On this page” outline on the right. The search that pops up with <code>/</code>. The homepage that actually looks like a product page, not a README with a nav bar. Dark mode that just works. Code blocks with copy buttons and language labels. It all looks like someone sat down and designed the whole experience.</p>

<p>Because someone did. VitePress is genuinely great. And Ruby developers know it, because some of the most visible projects in our community are shipping their docs on VitePress. Not on a Jekyll theme, not on a Ruby tool. On a JavaScript static site generator built for Vue.</p>

<p>I don’t blame them. I looked at what we had in the Jekyll ecosystem and understood immediately. The best option is Just the Docs, and I’ve been using it for <a href="https://rubyllm.com">RubyLLM</a>. It’s solid. But I had to patch in proper dark mode support that follows the browser setting. I had to add a copy-page button. The homepage layout is narrow and document-y. It works. It doesn’t wow.</p>

<p>So I built <a href="https://jekyll-vitepress.dev">Jekyll VitePress Theme</a>.</p>

<h2 id="what-it-is">What It Is</h2>

<p>A Jekyll theme gem that recreates the VitePress documentation experience. Everything you’d expect:</p>

<ul>
  <li>Top nav with mobile menu</li>
  <li>Left sidebar, right “On this page” outline</li>
  <li>Homepage layout with hero section and feature cards</li>
  <li>Built-in local search (press <code>/</code> or <code>Cmd+K</code>)</li>
  <li>Dark/light/auto appearance toggle</li>
  <li>Code blocks with copy buttons, language labels, and file title bars</li>
  <li>Doc footer with edit link, previous/next pager, and “last updated”</li>
  <li>GitHub star widget</li>
  <li>Rouge syntax highlighting with separate light and dark themes</li>
</ul>

<p>All configured through <code>_config.yml</code> and <code>_data/*.yml</code> files. No JavaScript toolchain. No Node.js. Just Jekyll.</p>

<h2 id="getting-started">Getting Started</h2>

<div data-title="Gemfile" class="language-ruby highlighter-rouge"><div class="highlight"><pre><code>gem "jekyll-vitepress-theme"
</code></pre></div></div>

<div data-title="_config.yml" class="language-yaml highlighter-rouge"><div class="highlight"><pre><code>theme: jekyll-vitepress-theme
plugins:
  - jekyll-vitepress-theme

jekyll_vitepress:
  branding:
    site_title: My Project
</code></pre></div></div>

<pre><code class="language-sh">bundle install
bundle exec jekyll serve --livereload
</code></pre>

<p>That’s it. Your docs site now looks like VitePress. Customize the nav, sidebar, colors, fonts, and everything else from the <a href="https://jekyll-vitepress.dev/configuration-reference/">configuration reference</a>.</p>

<h2 id="why-this-matters">Why This Matters</h2>

<p>When I came back to Ruby in 2024, I kept finding things that could be better. There wasn’t a great LLM library, so I built <a href="https://rubyllm.com">RubyLLM</a>. Async deserved more attention, so I <a href="/async-ruby-is-the-future">blogged about it</a>. And our documentation sites? They didn’t look the part.</p>

<p>In open source, looks matter. A beautiful docs site tells potential users: this project is serious, maintained, and worth your time. It lowers the barrier to adoption. It makes people want to try your library.</p>

<p>VitePress understood this. Now Jekyll has it too.</p>

<pre><code class="language-ruby">gem "jekyll-vitepress-theme", "~&gt; 1.0"
</code></pre>]]></content><author><name>Carmine Paolino</name></author><category term="Ruby" /><category term="Jekyll" /><category term="Documentation" /><category term="Open Source" /><category term="VitePress" /><summary type="html"><![CDATA[The Ruby community doesn't have a great documentation theme. So I made one. Jekyll VitePress Theme brings VitePress's docs UX to Jekyll.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://paolino.me/images/jekyll-vitepress.png" /><media:content medium="image" url="https://paolino.me/images/jekyll-vitepress.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">RubyLLM 1.14: From Zero to AI Chat App in Under Two Minutes</title><link href="https://paolino.me/rubyllm-1-14-chat-ui/" rel="alternate" type="text/html" title="RubyLLM 1.14: From Zero to AI Chat App in Under Two Minutes" /><published>2026-03-18T00:00:00+00:00</published><updated>2026-03-18T00:00:00+00:00</updated><id>https://paolino.me/rubyllm-1-14-chat-ui</id><content type="html" xml:base="https://paolino.me/rubyllm-1-14-chat-ui/"><![CDATA[<p>RubyLLM 1.14 ships a full chat UI generator. Two commands and you have a working AI chat app with Turbo streaming, model selection, and tool call display, in under two minutes. The demo above shows the whole thing: new Rails app to working chat in 1:46, including trying it out.</p>

<h2 id="why-this-matters">Why This Matters</h2>

<p>RubyLLM turned one last week. <a href="/rubyllm-1-0/">1.0 shipped on March 11, 2025</a> with Rails integration from day one: ActiveRecord models, <code>acts_as_chat</code>, Turbo streaming, persistence out of the box. <a href="/rubyllm-1.4-1.5.1/">1.4</a> added the install generator. <a href="https://github.com/crmne/ruby_llm/releases/tag/1.7.0">1.7</a> brought the first scaffold chat UI with Turbo Streams. <a href="/rubyllm-1-12-agents/">1.12</a> introduced agents with prompt conventions. Each release got closer to the same thing: AI that works the way Rails works.</p>

<p>1.14 fully realizes that goal. A beautiful Tailwind chat UI (with automatic fallback to scaffold if you’re not using Tailwind). Generators for agents and tools. Conventional directories for everything. All of it extracted from <a href="https://chatwithwork.com">Chat with Work</a>, where it’s been running in production for months.</p>

<h2 id="what-you-get">What You Get</h2>

<p>Two generators. That’s it.</p>

<pre><code class="language-sh">bin/rails generate ruby_llm:install
bin/rails generate ruby_llm:chat_ui
</code></pre>

<p>Your app now has this structure:</p>

<pre><code class="language-plaintext">app/
├── agents/
├── controllers/
│   ├── chats_controller.rb
│   └── messages_controller.rb
├── helpers/
│   └── messages_helper.rb
├── jobs/
│   └── chat_response_job.rb
├── models/
│   ├── chat.rb
│   ├── message.rb
│   ├── model.rb
│   └── tool_call.rb
├── prompts/
├── schemas/
├── tools/
└── views/
    ├── chats/
    │   ├── index.html.erb
    │   ├── show.html.erb
    │   └── _chat.html.erb
    └── messages/
        ├── _assistant.html.erb
        ├── _user.html.erb
        ├── _tool.html.erb
        ├── _error.html.erb
        ├── create.turbo_stream.erb
        ├── tool_calls/
        │   └── _default.html.erb
        └── tool_results/
            └── _default.html.erb
</code></pre>

<p>Separate partials for each message role. Turbo Stream templates for real-time updates via <code>broadcasts_to</code>. A background job that handles the AI response. Tool calls and tool results each get their own rendering pipeline. A complete Tailwind chat interface, not a scaffold you need to fight with.</p>

<h2 id="full-tutorial-new-app-from-scratch">Full Tutorial: New App from Scratch</h2>

<p>If you want to start from zero, this is what the demo shows. The whole thing takes under two minutes.</p>

<pre><code class="language-sh">rails new chat_app --css tailwind
cd chat_app
bundle add ruby_llm
bin/rails generate ruby_llm:install
bin/rails generate ruby_llm:chat_ui
bin/rails db:migrate
bin/rails ruby_llm:load_models
bin/dev
</code></pre>

<p>That’s a new Rails app with Tailwind, RubyLLM installed, the chat UI generated, the database set up, models loaded, and the server running. Open <code>localhost:3000/chats</code> and start talking to an AI.</p>

<h2 id="generators-for-agents-tools-and-schemas">Generators for Agents, Tools, and Schemas</h2>

<p>Now the fun part. You scaffold agents, tools, and schemas the same way you’d scaffold anything else in Rails:</p>

<pre><code class="language-bash">bin/rails generate ruby_llm:agent SupportAgent
</code></pre>

<pre><code class="language-plaintext">app/
├── agents/
│   └── support_agent.rb
└── prompts/
    └── support_agent/
        └── instructions.txt.erb
</code></pre>

<p>The agent class comes with the <a href="/rubyllm-1-12-agents/">1.12 DSL</a> ready to go. The instructions file is an ERB template for your system prompt, so you can version it, review it in PRs, and template it with runtime context.</p>

<pre><code class="language-bash">bin/rails generate ruby_llm:tool WeatherTool
</code></pre>

<pre><code class="language-plaintext">app/
├── tools/
│   └── weather_tool.rb
└── views/
    └── messages/
        ├── tool_calls/
        │   └── _weather.html.erb
        └── tool_results/
            └── _weather.html.erb
</code></pre>

<p>Each tool gets its own partials for rendering calls and results. Show a weather widget for the weather tool, a search results list for a search tool, all through Rails partials.</p>

<pre><code class="language-bash">bin/rails generate ruby_llm:schema Product
</code></pre>

<pre><code class="language-plaintext">app/
└── schemas/
    └── product_schema.rb
</code></pre>

<p>This creates a schema for structured output validation.</p>

<p>More on all of this in the <a href="https://rubyllm.com/rails/">Rails integration docs</a>, and the dedicated guides for <a href="https://rubyllm.com/agents/">agents</a> and <a href="https://rubyllm.com/tools/">tools</a>.</p>

<h2 id="self-registering-provider-config">Self-Registering Provider Config</h2>

<p>For people building provider gems: providers now register their own configuration options instead of patching a monolithic <code>Configuration</code> class.</p>

<pre><code class="language-ruby">class DeepSeek &lt; RubyLLM::Provider
  class &lt;&lt; self
    def configuration_options
      %i[deepseek_api_key deepseek_api_base]
    end
  end
end
</code></pre>

<p>When the provider is registered, its options become <code>attr_accessor</code>s on <code>RubyLLM::Configuration</code> automatically. Third-party gems can add their config keys without touching the core.</p>
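
<p>In application code, those keys are then set through the usual configure block. A quick illustration using the example provider above:</p>

<pre><code class="language-ruby">RubyLLM.configure do |config|
  # These accessors exist because the DeepSeek provider registered them above.
  config.deepseek_api_key  = ENV["DEEPSEEK_API_KEY"]
  config.deepseek_api_base = ENV["DEEPSEEK_API_BASE"]
end
</code></pre>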

<h2 id="bug-fixes">Bug Fixes</h2>

<ul>
  <li><strong>Faraday logging memory bloat</strong>: logging no longer serializes large payloads (like base64-encoded PDFs) when the log level is above DEBUG.</li>
  <li><strong>Agent <code>assume_model_exists</code> propagation</strong>: setting this on the agent class now actually works.</li>
  <li><strong>Renamed model associations</strong>: foreign key references with <code>acts_as</code> helpers are fixed.</li>
  <li><strong>MySQL/MariaDB compatibility</strong>: JSON column defaults work correctly now.</li>
  <li><strong>Error.new with string argument</strong>: no longer raises a <code>NoMethodError</code>.</li>
</ul>

<p>Full list in the <a href="https://github.com/crmne/ruby_llm/releases/tag/1.14.0">release notes</a>.</p>

<pre><code class="language-ruby">gem 'ruby_llm', '~&gt; 1.14'
</code></pre>]]></content><author><name>Carmine Paolino</name></author><category term="Ruby" /><category term="AI" /><category term="Rails" /><category term="LLM" /><category term="Open Source" /><category term="RubyLLM" /><category term="Chat UI" /><summary type="html"><![CDATA[RubyLLM 1.14 ships a Tailwind chat UI, Rails generators for agents and tools, and a simplified config DSL. Watch the full setup in 1:46.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://paolino.me/images/rubyllm-1.14.png" /><media:content medium="image" url="https://paolino.me/images/rubyllm-1.14.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Ruby Is the Best Language for Building AI Apps</title><link href="https://paolino.me/ruby-is-the-best-language-for-ai-apps/" rel="alternate" type="text/html" title="Ruby Is the Best Language for Building AI Apps" /><published>2026-02-20T00:00:00+00:00</published><updated>2026-02-20T00:00:00+00:00</updated><id>https://paolino.me/ruby-is-the-best-language-for-ai-apps</id><content type="html" xml:base="https://paolino.me/ruby-is-the-best-language-for-ai-apps/"><![CDATA[<blockquote>
  <p>If your goal is to ship AI applications in 2026, Ruby is the best language to do it.</p>
</blockquote>

<h2 id="the-ai-training-ecosystem-is-irrelevant">The AI Training Ecosystem Is Irrelevant</h2>

<p>Python owns model training. PyTorch, TensorFlow, the entire notebooks-and-papers gravity well. Nobody disputes that.</p>

<p>But you’re not training LLMs. Almost nobody is. Each training run costs millions of dollars. The dataset is the internet!</p>

<p>This is what AI development today looks like:</p>

<pre><code class="language-bash">curl https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{"model": "gpt-5.2", "messages": [{"role": "user", "content": "Hello"}]}'
</code></pre>

<p>That’s it. An HTTP call.</p>

<p>The entire Python ML stack is <em>irrelevant</em> to achieve this. What matters is everything around it: streaming responses to users, persisting conversations, tracking costs, switching providers when pricing changes.</p>

<p>That’s web application engineering. That’s where Ruby and Rails shine like no other.</p>

<h2 id="you-need-a-complex-agent-framework-or-youre-not-doing-real-ai">“You Need a Complex Agent Framework or You’re Not Doing Real AI”</h2>

<p>Bullshit.</p>

<p>You need a beautiful, truly provider-independent API. Let me show you.</p>

<h2 id="python-vs-javascript-vs-ruby-llm-libraries">Python vs JavaScript vs Ruby LLM Libraries</h2>

<h3 id="simple-chat">Simple chat</h3>

<p><strong>Python (LangChain):</strong></p>

<pre><code class="language-python">from langchain.chat_models import init_chat_model
from langchain.messages import HumanMessage

model = init_chat_model("gpt-5.2", model_provider="openai")
response = model.invoke([HumanMessage("Hello!")])
</code></pre>

<p>You have to specify the provider, instantiate message objects, and wrap them in an array before you can even say hello.</p>

<p>That’s ceremony.</p>

<p><strong>JavaScript (AI SDK):</strong></p>

<pre><code class="language-javascript">import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai';

const { text } = await generateText({
  model: openai('gpt-5.2'),
  prompt: 'Hello!',
});
</code></pre>

<p>What if you want to use a model from another provider?</p>

<p><strong>Ruby (<a href="https://rubyllm.com">RubyLLM</a>):</strong></p>

<pre><code class="language-ruby">require 'ruby_llm'

RubyLLM.chat.ask "Hello!"
</code></pre>

<p>Reads like it should.</p>

<h3 id="token-usage-tracking">Token usage tracking</h3>

<p>If you’re running AI in production, you need to track token usage. This is how you price your app.</p>

<p><strong>LangChain (GPT):</strong></p>

<pre><code class="language-python">response = model.invoke([HumanMessage("Hello!")])
response.response_metadata['token_usage']
# {'completion_tokens': 12, 'prompt_tokens': 8, 'total_tokens': 20}
</code></pre>

<p><strong>LangChain (Claude):</strong></p>

<pre><code class="language-python">response.response_metadata['usage']
# {'input_tokens': 8, 'output_tokens': 12}
</code></pre>

<p>Different key and different structure!</p>

<p><strong>LangChain (Gemini):</strong></p>

<pre><code class="language-python">response.response_metadata
# ...nothing...
</code></pre>

<p>It’s not even there!</p>

<p><a href="https://rubyllm.com">RubyLLM</a>:</p>

<pre><code class="language-ruby">response.tokens.input   # =&gt; 8
response.tokens.output  # =&gt; 12
</code></pre>

<p>Same interface. Every provider. Every model.</p>

<h3 id="agents">Agents</h3>

<p>This is where it gets fun.</p>

<p><strong>Python (LangChain):</strong></p>

<pre><code class="language-python">from langchain_openai import ChatOpenAI
from langchain.agents import create_agent

model = ChatOpenAI(model="gpt-5-nano")

graph = create_agent(
    model=model,
    tools=[search_docs, lookup_account],
    system_prompt="You are a concise support assistant",
)

inputs = {"messages": [{"role": "user", "content": "How do I reset my API key?"}]}

for chunk in graph.stream(inputs, stream_mode="updates"):
    print(chunk)
</code></pre>

<p><strong>JavaScript (AI SDK 6):</strong></p>

<pre><code class="language-javascript">import { ToolLoopAgent } from 'ai';
import { openai } from '@ai-sdk/openai';

const supportAgent = new ToolLoopAgent({
  model: openai('gpt-5-nano'),
  system: 'You are a concise support assistant.',
  tools: { searchDocs, lookupAccount },
});

const { text } = await supportAgent.generateText({
  messages: [{ role: 'user', content: 'How do I reset my API key?' }],
});
</code></pre>

<p><strong>Ruby (<a href="https://rubyllm.com">RubyLLM</a>):</strong></p>

<pre><code class="language-ruby">require 'ruby_llm'

class SupportAgent &lt; RubyLLM::Agent
  model "gpt-5-nano"
  instructions "You are a concise support assistant."
  tools SearchDocs, LookupAccount
end

SupportAgent.new.ask "How do I reset my API key?"
</code></pre>

<p>Pure joy.</p>

<h2 id="its-about-cognitive-overhead">It’s About Cognitive Overhead</h2>

<p>This isn’t just about aesthetics.</p>

<p>It’s about <em>cognitive overhead</em>: how many abstractions, how many provider-specific details, how many different data structures you need to hold in your head instead of focusing on what really matters: prompts and tool design.</p>

<p>Low cognitive overhead compounds: faster onboarding, fewer accidental bugs, easier refactors, and cleaner debugging when production explodes at 2AM.</p>

<p>Ruby’s advantage here is cultural: elegant APIs are treated as first-class engineering work, not icing on the cake.</p>

<h2 id="rails-gives-you-the-rest-of-the-product-for-free">Rails Gives You the Rest of the Product for Free</h2>

<p>Model calls are only a small chunk of your code. The rest makes up the bulk of it: auth, billing, background jobs, streaming UI, persistence, admin screens, observability, even <a href="https://native.hotwired.dev/">native apps</a>.</p>

<p>Rails gives you a beautiful, coherent answer for all of it.</p>

<p>With <a href="https://rubyllm.com">RubyLLM</a> + Rails, the core streaming loop is tiny:</p>

<pre><code class="language-ruby">class ChatResponseJob &lt; ApplicationJob
  def perform(chat_id, content)
    chat = Chat.find(chat_id)

    chat.ask(content) do |chunk|
      message = chat.messages.last
      message.broadcast_append_chunk(chunk.content) if chunk.content.present?
    end
  end
end
</code></pre>

<p>And on the model side:</p>

<pre><code class="language-ruby">class Chat &lt; ApplicationRecord
  acts_as_chat
end

class Message &lt; ApplicationRecord
  acts_as_message
  has_many_attached :attachments
end
</code></pre>

<p>This gives you streaming chunks to your web app and persistence in your DB in absurdly few lines of code.</p>

<h2 id="it-scales">It Scales</h2>

<p>“Ruby can’t handle AI scale.”</p>

<p>Wrong.</p>

<p>LLM workloads are mostly network-bound and streaming-bound. That’s exactly where Ruby’s <a href="https://socketry.github.io/async/">Async</a> ecosystem shines. Fibers let you handle high concurrency without thread explosion and resource waste. No need to plaster the code with <code>async</code>/<code>await</code> keywords. <a href="https://rubyllm.com">RubyLLM</a> became concurrent with 0 code changes.</p>
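
<p>As a minimal sketch with the async gem (the prompts are illustrative), fanning out several independent requests is just plain Ruby:</p>

<pre><code class="language-ruby">require "async"
require "ruby_llm"

prompts = ["Summarize this ticket", "Draft a reply", "Suggest three tags"]

Async do |task|
  # Each ask runs in its own fiber; the calls overlap while waiting on the network.
  responses = prompts.map do |prompt|
    task.async { RubyLLM.chat.ask(prompt) }
  end
  responses.each { |t| puts t.wait.content }
end
</code></pre>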

<p>I wrote a deep dive here: <a href="/async-ruby-is-the-future">Async Ruby is the Future of AI Apps (And It’s Already Here)</a></p>

<h2 id="dont-take-my-word-for-it">Don’t Take My Word for It</h2>

<p>Someone ported <a href="https://rubyllm.com">RubyLLM</a>’s API design to JavaScript as <a href="https://github.com/nicholasgriffintn/node-llm">NodeLLM</a>. Same design. Clean code, good docs.</p>

<p>The JavaScript community’s response: zero upvotes on Reddit. 14 GitHub stars. Top comments: “How’s this different from AI SDK?” and “It’s always fun when you AI bros post stuff. They all look and sound the same. Also, totally unnecessary.”</p>

<p><a href="https://rubyllm.com">RubyLLM</a>: #1 on Hacker News. ~3,600 stars. 5 million downloads. Millions of people using RubyLLM-powered apps today.</p>

<p>Same design. Wildly different reception. That tells you everything about which community is ready for this moment.</p>

<p>And teams that switched from Python are not going back:</p>

<blockquote>
  <p>We had a customer deployment coming up and our Langgraph agent was failing. I rebuilt it using <a href="https://rubyllm.com">RubyLLM</a>. Not only was it far simpler, it performed better than the Langgraph agent.</p>
</blockquote>

<blockquote>
  <p>Our first pass at the AI Agent used langchain… it was so painful that we built it from scratch in Ruby. Like a cloud had lifted. Langchain was that bad.</p>
</blockquote>

<blockquote>
  <p>At Yuma, serving over 100,000 end users, our unified AI interface was awful. <a href="https://rubyllm.com">RubyLLM</a> is so much nicer than all of that.</p>
</blockquote>

<p>These aren’t people who haven’t tried Python. They tried it, shipped it, and replaced it.</p>

<h2 id="go-ship-ai-apps-with-ruby-rails-and-rubyllm">Go Ship AI Apps with Ruby, Rails, and <a href="https://rubyllm.com">RubyLLM</a></h2>

<p>When we freed ourselves from complexity, this community built Twitter, GitHub, Shopify, Basecamp, Airbnb. Rails changed web development forever.</p>

<p>Now we have the chance to change AI app development. Because AI apps are all about the product. And nobody builds products better than Ruby developers.</p>]]></content><author><name>Carmine Paolino</name></author><category term="Ruby" /><category term="Rails" /><category term="AI" /><category term="LLM" /><category term="RubyLLM" /><category term="Async" /><category term="Developer Experience" /><summary type="html"><![CDATA[A pragmatic, code-first argument for Ruby as the best language to ship AI products in 2026.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://paolino.me/images/rubyconfth-2026-keynote.jpg" /><media:content medium="image" url="https://paolino.me/images/rubyconfth-2026-keynote.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">RubyLLM 1.12: Agents Are Just LLMs with Tools</title><link href="https://paolino.me/rubyllm-1-12-agents/" rel="alternate" type="text/html" title="RubyLLM 1.12: Agents Are Just LLMs with Tools" /><published>2026-02-17T00:00:00+00:00</published><updated>2026-02-17T00:00:00+00:00</updated><id>https://paolino.me/rubyllm-1-12-agents</id><content type="html" xml:base="https://paolino.me/rubyllm-1-12-agents/"><![CDATA[<p>“Agent” might be the most overloaded word in tech right now. Every startup claims to have one. Every framework promises to help you build them. The discourse has gotten so thick that the actual concept is buried under layers of marketing.</p>

<p>So let’s start from first principles.</p>

<h2 id="whats-an-agent">What’s an Agent?</h2>

<p>An agent is an LLM that can call functions.</p>

<p>That’s it. When you give a language model a set of tools it can invoke – a database lookup, an API call, a file operation – and the model decides when and how to use them, you have an agent. The model reasons about the problem, picks the right tool, looks at the result, and continues reasoning. Sometimes it calls several tools in sequence. Sometimes none.</p>

<p>There’s no special “agent mode.” No orchestration engine. No graph of nodes. It’s just a conversation where the model can do things besides talk.</p>

<h2 id="rubyllm-always-had-this">RubyLLM Always Had This</h2>

<p>Tool calling has been a core feature of <a href="https://rubyllm.com">RubyLLM</a> since 1.0:</p>

<pre><code class="language-ruby">class SearchDocs &lt; RubyLLM::Tool
  description "Searches our documentation"
  param :query, desc: "Search query"

  def execute(query:)
    Document.search(query).map(&amp;:title)
  end
end

chat = RubyLLM.chat
chat.with_tool(SearchDocs)
chat.ask "How do I configure webhooks?"
# Model searches docs, reads results, answers the question
</code></pre>

<p>That’s an agent. The model decides to search, interprets the results, and responds. You didn’t need a special class or framework to make this happen.</p>

<p>But there was a problem.</p>

<h2 id="the-reuse-problem">The Reuse Problem</h2>

<p>In a real application, you don’t configure a chat once. You configure it in controllers, background jobs, service objects, API endpoints. The same instructions, the same tools, the same temperature – scattered across your codebase:</p>

<pre><code class="language-ruby"># In the controller
chat = RubyLLM.chat(model: 'gpt-4.1')
chat.with_instructions("You are a support assistant for #{workspace.name}...")
chat.with_tools(SearchDocs, LookupAccount, CreateTicket)
chat.with_temperature(0.2)

# In the background job
chat = RubyLLM.chat(model: 'gpt-4.1')
chat.with_instructions("You are a support assistant for #{workspace.name}...")
chat.with_tools(SearchDocs, LookupAccount, CreateTicket)
chat.with_temperature(0.2)

# In the service object...
# You get the idea
</code></pre>

<p>Every Rubyist’s instinct kicks in: this should be a class.</p>

<h2 id="rubyllm-112-a-dsl-for-agents">RubyLLM 1.12: A DSL for Agents</h2>

<p>That’s exactly what 1.12 adds. Define your agent once, use it everywhere:</p>

<pre><code class="language-ruby">class SupportAgent &lt; RubyLLM::Agent
  model 'gpt-4.1'
  instructions "You are a concise support assistant."
  tools SearchDocs, LookupAccount, CreateTicket
  temperature 0.2
end

# Anywhere in your app
response = SupportAgent.new.ask "How do I reset my API key?"
</code></pre>

<p>Every macro maps to a <code>with_*</code> call you already know. <code>model</code> maps to <code>RubyLLM.chat(model:)</code>. <code>tools</code> maps to <code>with_tools</code>. <code>instructions</code> maps to <code>with_instructions</code>. No new concepts. Just a cleaner way to package what you were already doing.</p>

<h2 id="runtime-context">Runtime Context</h2>

<p>Static configuration is only half the story. Real agents need runtime data – the current user, the workspace, the time of day. Agents support lazy evaluation for this:</p>

<pre><code class="language-ruby">class WorkAssistant &lt; RubyLLM::Agent
  chat_model Chat
  inputs :workspace

  instructions { "You are helping #{workspace.name}" }

  tools do
    [
      TodoTool.new(chat: chat),
      GoogleDriveTool.new(user: chat.user)
    ]
  end
end

chat = WorkAssistant.create!(user: current_user, workspace: @workspace)
chat.ask "What's on my todo list?"
</code></pre>

<p>Blocks and lambdas are evaluated at runtime, with access to the chat object and any declared inputs. Values that depend on runtime context must be lazy – a constraint that Ruby makes trivially natural.</p>

<h2 id="prompt-conventions">Prompt Conventions</h2>

<p>If you’re using Rails, agents follow a convention for prompt management:</p>

<pre><code class="language-ruby">class WorkAssistant &lt; RubyLLM::Agent
  chat_model Chat
  instructions display_name: -&gt; { chat.user.display_name_or_email }
end
</code></pre>

<p>This renders <code>app/prompts/work_assistant/instructions.txt.erb</code> with <code>display_name</code> available as a local. Namespaced agents map naturally: <code>Admin::SupportAgent</code> looks in <code>app/prompts/admin/support_agent/</code>.</p>

<p>Your prompts are ERB templates. Version them in git. Review them in PRs. Treat them like the application code they are.</p>
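
<p>The template itself is just ERB. For the example above, <code>app/prompts/work_assistant/instructions.txt.erb</code> could be as simple as this (contents illustrative):</p>

<pre><code class="language-erb">You are a helpful work assistant for &lt;%= display_name %&gt;.
Keep answers concise and reference the source document for each fact.
</code></pre>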

<h2 id="rails-integration">Rails Integration</h2>

<p>The <code>chat_model</code> macro activates Rails-backed persistence:</p>

<pre><code class="language-ruby">class WorkAssistant &lt; RubyLLM::Agent
  chat_model Chat
  model 'gpt-4.1'
  instructions "You are a helpful assistant."
  tools SearchDocs, LookupAccount
end

# Create a persisted chat with agent config applied
chat = WorkAssistant.create!(user: current_user)

# Load an existing chat, apply runtime config
chat = WorkAssistant.find(params[:id])

# User sends a message, everything persisted automatically
chat.ask(params[:message])
</code></pre>

<p><code>create!</code> persists both the chat and its instructions. <code>find</code> applies configuration at runtime without touching the database. This distinction matters when your prompts evolve faster than your data.</p>

<h2 id="also-in-112">Also in 1.12</h2>

<p>Agents are the headline, but this release also adds:</p>

<ul>
  <li><strong>AWS Bedrock full coverage</strong> via the Converse API – every Bedrock chat model through one interface</li>
  <li><strong>Azure Foundry API</strong> – broad model access across Azure’s ecosystem</li>
  <li><strong>Clearer <code>with_instructions</code> semantics</strong> – explicit append options, guaranteed message ordering</li>
</ul>

<h2 id="already-in-production">Already in Production</h2>

<p>This isn’t a spec or a proposal. The agent DSL powers <a href="https://chatwithwork.com">Chat with Work</a> in production right now. The <code>WorkAssistant</code> examples above aren’t hypothetical – they’re simplified versions of real code handling real conversations.</p>

<p>If you want to see what it feels like, <a href="https://chatwithwork.com">try it out</a>.</p>

<h2 id="the-point">The Point</h2>

<p>The industry is making agents complicated. They’re not. An agent is an LLM with tools. You define the tools in Ruby. You package them in a class. You use the class in your app.</p>

<p>No graphs. No chains. No orchestration frameworks. Just Ruby.</p>

<pre><code class="language-ruby">gem 'ruby_llm', '~&gt; 1.12'
</code></pre>]]></content><author><name>Carmine Paolino</name></author><category term="Ruby" /><category term="AI" /><category term="Agents" /><category term="LLM" /><category term="Rails" /><category term="Open Source" /><category term="Tool Calling" /><category term="RubyLLM" /><summary type="html"><![CDATA[Agents aren't magic. They're LLMs that can call your code. RubyLLM 1.12 adds a clean DSL to define and reuse them.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://paolino.me/images/rubyllm-1.12.png" /><media:content medium="image" url="https://paolino.me/images/rubyllm-1.12.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Dictation Is the New Prompt (Voxtype on Omarchy)</title><link href="https://paolino.me/dictation-is-the-new-prompt/" rel="alternate" type="text/html" title="Dictation Is the New Prompt (Voxtype on Omarchy)" /><published>2026-01-07T00:00:00+00:00</published><updated>2026-01-07T00:00:00+00:00</updated><id>https://paolino.me/dictation-is-the-new-prompt</id><content type="html" xml:base="https://paolino.me/dictation-is-the-new-prompt/"><![CDATA[<p>Typing every prompt feels backwards in 2026. You can speak faster than you can type. Hold a hotkey, speak, your OS types it for you. If you care about flow, dictation is the most underrated upgrade you can make.</p>

<p>In the <a href="https://omarchy.org/">Omarchy</a> world, <a href="https://github.com/goodroot/hyprwhspr">Hyprwhspr</a> is getting a lot of attention after a recent DHH tweet:</p>

<div class="jekyll-twitter-plugin"><blockquote class="twitter-tweet"><p lang="en" dir="ltr">I had no idea that local model dictation had gotten this good and this fast! I&#39;m blown away by how good hyprwhspr with Omarchy is just using a base model backed by the CPU. Unbelievably accurate. <a href="https://t.co/Jtz3eN84Jf">https://t.co/Jtz3eN84Jf</a></p>&mdash; DHH (@dhh) <a href="https://twitter.com/dhh/status/2007498242561593535?ref_src=twsrc%5Etfw">January 3, 2026</a></blockquote>
<script async="" src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

</div>

<p>He’s right: local dictation is <em>shockingly</em> good now. The catch is Hyprwhspr uses Python virtual environments, which don’t mix well with <a href="http://mise.jdx.dev/">mise</a>. Fortunately <a href="https://github.com/peteonrails">Pete Jackson</a> <a href="https://github.com/basecamp/omarchy/discussions/3872">saw that and created</a> <a href="https://github.com/peteonrails/voxtype/">Voxtype</a> to solve exactly this issue!</p>

<p>EDIT: five minutes after I posted this, DHH confirmed that Voxtype will ship with Omarchy 3.3! 🎉</p>

<div class="jekyll-twitter-plugin"><blockquote class="twitter-tweet"><p lang="en" dir="ltr">Voxtype is shipping with Omarchy 3.3 👍 <a href="https://t.co/Pt1EkgNLoi">https://t.co/Pt1EkgNLoi</a></p>&mdash; DHH (@dhh) <a href="https://twitter.com/dhh/status/2008856834258645389?ref_src=twsrc%5Etfw">January 7, 2026</a></blockquote>
<script async="" src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

</div>

<h2 id="why-voxtype">Why Voxtype</h2>

<p>Voxtype is built in Rust, so you don’t need Python virtual environments, which means it works well with mise. It’s fast, it just works, and when <a href="https://github.com/peteonrails/voxtype/issues/26">I opened an issue asking for an Omarchy theme</a>, <a href="https://github.com/peteonrails/voxtype/releases/tag/v0.4.4">the author shipped it immediately</a>. Now it looks <em>stunning</em> in my setup.</p>

<p>With Vulkan enabled, transcription is almost instant on my Ryzen AI 9 HX370. The video at the top is not sped up. Longer text also transcribes instantly.</p>

<p>If you want to copy my exact configuration, here it is.</p>

<h2 id="install">Install</h2>

<pre><code class="language-bash">sudo pacman -S wtype ydotool wl-clipboard vulkan-icd-loader # last only if you want to use your GPU
yay -S voxtype

voxtype setup --download
voxtype setup gpu # if you want to use your GPU
voxtype setup systemd
</code></pre>

<p>Restart Waybar after the changes:</p>

<pre><code class="language-bash">pkill -SIGUSR2 waybar
</code></pre>

<h2 id="voxtype-config">Voxtype config</h2>

<p><code>~/.config/voxtype/config.toml</code></p>

<pre><code class="language-toml">state_file = "auto"

[hotkey]
enabled = false

[audio]
device = "default"
sample_rate = 16000
max_duration_secs = 600

[audio.feedback]
enabled = true
# Sound theme: "default", "subtle", "mechanical", or path to custom theme directory
theme = "default"
volume = 0.7

[whisper]
model = "base.en"
language = "en"
translate = false
on_demand_loading = true # saves your GPU until it's needed

[output]
mode = "type"
fallback_to_clipboard = true

# Delay between typed characters in milliseconds
# 0 = fastest possible, increase if characters are dropped
type_delay_ms = 1

[output.notification]
on_recording_start = false
on_recording_stop = false
on_transcription = true

[text]
replacements = { "hyperwhisper" = "hyprwhspr" }

[status]
icon_theme = "omarchy"
</code></pre>

<h2 id="waybar-integration">Waybar integration</h2>

<p><code>~/.config/waybar/config.jsonc</code></p>

<pre><code class="language-jsonc">"custom/voxtype": {
  "exec": "voxtype status --follow --format json",
  "return-type": "json",
  "format": "{}",
  "tooltip": true
},
</code></pre>

<p>And add it to <code>modules-right</code>:</p>

<pre><code class="language-jsonc">"modules-right": [
  "group/tray-expander",
  "custom/voxtype",
  "bluetooth",
  "network",
  "pulseaudio",
  "cpu",
  "battery"
]
</code></pre>

<p><code>~/.config/waybar/style.css</code></p>

<pre><code class="language-css">@import "voxtype.css";
@import "../omarchy/current/theme/waybar.css";
</code></pre>

<p><code>~/.config/waybar/voxtype.css</code></p>

<pre><code class="language-css">#custom-voxtype {
  margin: 0 16px 0 0;
  font-size: 12px;
  font-weight: bold;
  border-top: 2px solid transparent;
  border-bottom: 2px solid transparent;
  transition: color 150ms ease-in-out, border-color 150ms ease-in-out;
}

#custom-voxtype.recording {
  color: #ff5555;
  animation: pulse 1s ease-in-out infinite;
}

#custom-voxtype.transcribing {
  color: #ff5555;
}

#custom-voxtype.stopped {
  color: #6272a4;
}

@keyframes pulse {
  0% { opacity: 1; }
  50% { opacity: 0.5; }
  100% { opacity: 1; }
}
</code></pre>

<h2 id="keybinding">Keybinding</h2>

<p>In your Hyprland config:</p>

<pre><code class="language-ini"># Voxtype
bindd = SHIFT, XF86AudioMicMute, Transcribe, exec, voxtype record toggle
</code></pre>

<p>That’s it. Use your voice whenever possible. It’s faster, more natural, and keeps you in flow.</p>]]></content><author><name>Carmine Paolino</name></author><category term="AI" /><category term="Voice" /><category term="Linux" /><category term="Omarchy" /><category term="Rust" /><category term="Productivity" /><summary type="html"><![CDATA[Stop typing every prompt. Speak it instead, with a fast Rust stack and a clean Omarchy setup.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://paolino.me/images/omarchy-voxtype-demo.png" /><media:content medium="image" url="https://paolino.me/images/omarchy-voxtype-demo.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Nano Banana with RubyLLM</title><link href="https://paolino.me/nano-banana-with-rubyllm/" rel="alternate" type="text/html" title="Nano Banana with RubyLLM" /><published>2025-10-23T00:00:00+00:00</published><updated>2025-10-23T00:00:00+00:00</updated><id>https://paolino.me/nano-banana-with-rubyllm</id><content type="html" xml:base="https://paolino.me/nano-banana-with-rubyllm/"><![CDATA[<p>Google wired Nano Banana into the chat interface <code>generateContent</code>, not the image API’s <code>predict</code>. Counterintuitive if you’re using RubyLLM, which makes you think in terms of <em>actions</em> like <a href="https://rubyllm.com/image-generation/"><code>paint</code></a> instead of <a href="https://rubyllm.com/chat/"><code>chat</code></a>.</p>

<p>Once you know that quirk, it’s straightforward. Only caveat: you need the latest trunk or v1.9+, because that’s where we taught RubyLLM to unpack inline file data from chat responses.</p>
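
<p>In Gemfile terms, that’s something like the following (the GitHub ref only if you want trunk):</p>

<pre><code class="language-ruby"># released version
gem "ruby_llm", "~&gt; 1.9"

# or trunk, for the very latest
# gem "ruby_llm", github: "crmne/ruby_llm"
</code></pre>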

<h2 id="wire-it-up">Wire It Up</h2>

<pre><code class="language-ruby">chat = RubyLLM
         .chat(model: "gemini-2.5-flash-image")
         .with_temperature(1.0) # optional, but you like creativity, right?
         .with_params(generationConfig: { responseModalities: ["image"] }) # also optional, if you prefer the model to return only images

response = chat.ask "your prompt", with: ["all.png", "the.jpg", "attachments.png", "you.png", "want.jpg"]

image_io = response.content[:attachments].first.source
</code></pre>

<p>That <code>StringIO</code> holds the generated image. Stream it to S3, attach it to Active Storage, or keep it in memory for a downstream processor.</p>
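
<p>The Active Storage route is a one-liner too. A minimal sketch, assuming a <code>Post</code> model with <code>has_one_attached :image</code>:</p>

<pre><code class="language-ruby">attachment = response.content[:attachments].first

post.image.attach(
  io: attachment.source,          # the StringIO from above
  filename: "nano-banana.png",
  content_type: "image/png"
)
</code></pre>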

<p>Want a file?</p>

<pre><code class="language-ruby">response.content[:attachments].first.save "nano-banana.png"
</code></pre>

<p>That’s it. Chat endpoint, one call. Ship the image feature and go enjoy the rest of your day.</p>]]></content><author><name>Carmine Paolino</name></author><category term="Ruby" /><category term="AI" /><category term="RubyLLM" /><category term="Google" /><category term="Gemini" /><summary type="html"><![CDATA[Nano Banana hides behind Google's chat endpoint. Here's the straight line to ship it with RubyLLM.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://paolino.me/images/nano-banana.png" /><media:content medium="image" url="https://paolino.me/images/nano-banana.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">RubyLLM 1.4-1.5.1: Three Releases in Three Days</title><link href="https://paolino.me/rubyllm-1.4-1.5.1/" rel="alternate" type="text/html" title="RubyLLM 1.4-1.5.1: Three Releases in Three Days" /><published>2025-08-01T00:00:00+00:00</published><updated>2025-08-01T00:00:00+00:00</updated><id>https://paolino.me/rubyllm-1.4-1.5.1</id><content type="html" xml:base="https://paolino.me/rubyllm-1.4-1.5.1/"><![CDATA[<p>Three releases in three days. Wednesday, Friday, and Friday again. Each one shipped as soon as it was ready.</p>

<h2 id="140-the-structured-output-release-wednesday">1.4.0: The Structured Output Release (Wednesday)</h2>

<p>Getting LLMs to return data in the format you need has always been painful.</p>

<p>We all had code like this:</p>

<pre><code class="language-ruby"># The old struggle
response = chat.ask("Return user data as JSON. ONLY JSON. NO MARKDOWN.")
begin
  data = JSON.parse(response.content.gsub(/```json\n?/, '').gsub(/```\n?/, ''))
rescue JSON::ParserError
  # Hope and pray
end
</code></pre>

<p>Now with structured output:</p>

<pre><code class="language-ruby"># Define your schema with the RubyLLM::Schema DSL
class PersonSchema &lt; RubyLLM::Schema
  string :name
  integer :age
  array :skills, of: :string
end

# Get perfectly structured JSON every time
chat = RubyLLM.chat.with_schema(PersonSchema)
response = chat.ask("Generate a Ruby developer profile")

# =&gt; {"name" =&gt; "Yukihiro", "age" =&gt; 59, "skills" =&gt; ["Ruby", "C", "Language Design"]}
</code></pre>

<p>No more regex. No more parsing. Just data structures that work.</p>

<p>Oh, and Daniel Friis released <a href="https://github.com/danielfriis/ruby_llm-schema">RubyLLM::Schema</a> just for the occasion, but you can use any gem you want with RubyLLM, or even write your own JSON schema from scratch.</p>
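
<p>If you’d rather skip the gem, a hand-rolled schema equivalent to <code>PersonSchema</code> above looks roughly like this (assuming <code>with_schema</code> also accepts a plain JSON schema hash):</p>

<pre><code class="language-ruby">person_schema = {
  type: "object",
  properties: {
    name: { type: "string" },
    age: { type: "integer" },
    skills: { type: "array", items: { type: "string" } }
  },
  required: ["name", "age", "skills"],
  additionalProperties: false
}

chat = RubyLLM.chat.with_schema(person_schema)
response = chat.ask("Generate a Ruby developer profile")
</code></pre>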

<h2 id="rails-generators-from-zero-to-chat">Rails Generators: From Zero to Chat</h2>

<p>We didn’t have Rails generators before. Now we do:</p>

<pre><code class="language-bash">rails generate ruby_llm:install
</code></pre>

<p>This creates everything you need:</p>
<ul>
  <li>Migrations</li>
  <li>Models with <code>acts_as_chat</code>, <code>acts_as_message</code>, and <code>acts_as_tool_call</code> (sketched below)</li>
  <li>A clean initializer</li>
</ul>
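
<p>The generated models are just plain Active Record classes with the mixins applied. Roughly:</p>

<pre><code class="language-ruby">class Chat &lt; ApplicationRecord
  acts_as_chat
end

class Message &lt; ApplicationRecord
  acts_as_message
end

class ToolCall &lt; ApplicationRecord
  acts_as_tool_call
end
</code></pre>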

<p>Your Chat model works like any Rails model:</p>

<pre><code class="language-ruby">chat = Chat.create!(model: "gpt-4.1-nano")
response = chat.ask("Explain Ruby blocks")
# Messages are automatically persisted with proper associations
</code></pre>

<p>From <code>rails new</code> to working chat in under 5 minutes.</p>

<h2 id="tool-call-transparency">Tool Call Transparency</h2>

<p>New callback to see what your AI is doing:</p>

<pre><code class="language-ruby">chat.on_tool_call do |tool_call|
  puts "🔧 AI is calling: #{tool_call.name}"
  puts "   Arguments: #{tool_call.arguments}"

  Rails.logger.info "[AI Tool] #{tool_call.name}: #{tool_call.arguments}"
end

chat.ask("What's the weather in Tokyo?").with_tools([weather_tool])
# =&gt; 🔧 AI is calling: get_weather
#    Arguments: {"location": "Tokyo"}
</code></pre>

<p>Essential for debugging and auditing AI behavior.</p>

<h2 id="direct-parameter-provider-access">Direct Parameter Provider Access</h2>

<p>Need that one weird parameter? Use <code>with_params</code>:</p>

<pre><code class="language-ruby"># OpenAI's JSON mode
chat.with_params(response_format: { type: "json_object" })
     .ask("List Ruby features as JSON")
</code></pre>

<p>No waiting for us to wrap every provider option.</p>

<h2 id="critical-bug-fixes-and-other-improvements-in-140">Critical Bug Fixes and Other Improvements in 1.4.0</h2>

<ul>
  <li><strong>Anthropic multiple tool calls</strong>: Was only processing the first tool call, silently ignoring the rest</li>
  <li><strong>Streaming errors</strong>: Now handled properly in both Faraday V1 and V2</li>
  <li><strong>Test fixtures</strong>: Removed 60MB of unnecessary test data</li>
  <li><strong>Message ordering</strong>: Fixed race conditions in streaming responses</li>
  <li><strong>JRuby support</strong>: Now officially tested and supported</li>
  <li><strong>Direct access to raw responses</strong>: Get the raw responses from Faraday for debugging</li>
  <li><strong>GPUStack support</strong>: A production-ready alternative to Ollama</li>
</ul>

<p><a href="https://github.com/crmne/ruby_llm/releases/tag/1.4.0">Full release notes for 1.4.0 available on GitHub.</a></p>

<h2 id="150-two-new-providers-friday">1.5.0: Two New Providers (Friday)</h2>

<h3 id="mistral-ai">Mistral AI</h3>

<p>63 models from France, from tiny to massive:</p>

<pre><code class="language-ruby">RubyLLM.configure do |config|
  config.mistral_api_key = ENV['MISTRAL_API_KEY']
end

# Efficient small model
chat = RubyLLM.chat(model: 'ministral-3b-latest')

# Their flagship model
chat = RubyLLM.chat(model: 'mistral-large-latest')

# Vision with Pixtral
vision = RubyLLM.chat(model: 'pixtral-12b-latest')
vision.ask("What's in this image?", with: "path/to/image.jpg")
</code></pre>

<h3 id="perplexity">Perplexity</h3>

<p>Real-time web search meets LLMs:</p>

<pre><code class="language-ruby">RubyLLM.configure do |config|
  config.perplexity_api_key = ENV['PERPLEXITY_API_KEY']
end

# Get current information with web search
chat = RubyLLM.chat(model: 'sonar-pro')
response = chat.ask("What are the latest Ruby 3.4 features?")
# Searches the web and returns current information
</code></pre>

<p><a href="https://github.com/crmne/ruby_llm/releases/tag/1.5.0">Full release notes for 1.5.0 available on GitHub.</a></p>

<h3 id="rails-generator-fixes">Rails Generator Fixes</h3>

<ul>
  <li>Fixed migration order (Chats → Messages → Tool Calls)</li>
  <li>Fixed PostgreSQL detection that was broken by namespace collision</li>
  <li>PostgreSQL users now get <code>jsonb</code> columns instead of <code>json</code></li>
</ul>

<h2 id="151-quick-fixes-also-friday">1.5.1: Quick Fixes (Also Friday)</h2>

<p>Found issues Friday afternoon. Fixed them. Shipped them. That’s it.</p>

<p>Why make users wait through the weekend with broken code?</p>

<ul>
  <li>Fixed Mistral model capabilities (was a Hash, should be Array)</li>
  <li>Fixed Google Imagen output modality</li>
  <li>Updated to JRuby 10.0.1.0</li>
  <li>Added JSON schema validation for model registry</li>
</ul>

<p><a href="https://github.com/crmne/ruby_llm/releases/tag/1.5.1">Full release notes for 1.5.1 available on GitHub.</a></p>

<h2 id="the-philosophy-ship-when-ready">The Philosophy: Ship When Ready</h2>

<p>Three days. Three releases. Each one made someone’s code work better.</p>

<p>We could have bundled everything into one release next week. But every moment we wait is a moment someone’s dealing with a bug we already fixed.</p>

<p>The structured output in 1.4.0? People needed that since before RubyLLM existed. The PostgreSQL fix in 1.5.0? Someone’s migrations were failing Thursday. The Mistral fix? Breaking someone’s code Friday morning.</p>

<p>When code is ready, you ship.</p>

<h2 id="what-you-can-build-now">What You Can Build Now</h2>

<p>With structured output and multiple providers, you can build real features:</p>

<pre><code class="language-ruby"># Extract structured data from any text
class InvoiceSchema &lt; RubyLLM::Schema
  string :invoice_number
  date :date
  float :total
  array :line_items do
    object do
      string :description
      float :amount
    end
  end
end

# Use Mistral for cost-effective extraction
extractor = RubyLLM.chat(model: 'ministral-8b-latest')
                    .with_schema(InvoiceSchema)

invoice_data = extractor.ask("Extract invoice details from: #{pdf_text}")
# Reliable data extraction at a fraction of GPT-4's cost

# Use Perplexity for current information
researcher = RubyLLM.chat(model: 'sonar-deep-research')
market_data = researcher.ask("Current Ruby job market trends in 2025")
# Real-time data, not training cutoff guesses
</code></pre>

<h2 id="use-it">Use It</h2>

<pre><code class="language-ruby">gem 'ruby_llm', '~&gt; 1.5'
</code></pre>

<p>Full backward compatibility. Your 1.0 code still runs. These releases just made everything better.</p>]]></content><author><name>Carmine Paolino</name></author><category term="Ruby" /><category term="AI" /><category term="LLM" /><category term="Rails" /><category term="Structured Output" /><category term="Mistral" /><category term="Perplexity" /><summary type="html"><![CDATA[Structured output that works, Rails generators that didn't exist, and why we shipped Wednesday, Friday, and Friday again.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://paolino.me/assets/images/og/posts/rubyllm-1.4-1.5.1.png" /><media:content medium="image" url="https://paolino.me/assets/images/og/posts/rubyllm-1.4-1.5.1.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry></feed>