GitHub Hot — 30 May 2026

Top 10 repos trending on GitHub this week — what they do, why they matter, and how to use them in your projects.

1. op7418/guizang-social-card-skill

1,551 stars this week · HTML · agent-skill ai-agent anthropic claude-code

A Claude Code agent skill that turns any text or article into publication-quality social media image sets (Xiaohongshu carousels + WeChat cover pairs) using single-file HTML → PNG via Playwright — no design tools needed.

Use case

Developers and content creators who write long-form content need to repurpose it into platform-native image formats (Xiaohongshu's 3:4 carousel, WeChat's 21:9 header + 1:1 share card) without opening Figma or Canva. Concrete example: you finish a 2,000-word teardown of a visa policy change — you run the skill against the article, specify 'Swiss style, 6 cards, jade green', and get a pixel-perfect PNG set ready to post, with consistent grid, typography hierarchy, and colour anchor across all cards.

Why it's trending

It's gaining traction because it plugs directly into the Claude Code agent skill ecosystem (installable via npx skills add) just as agentic coding workflows go mainstream: content production becomes a one-prompt CLI operation rather than a multi-tool design sprint. The Chinese social media angle (Rednote/Xiaohongshu) also taps a fast-growing audience of Western creators who are migrating to the platform.

How to use it

Install the skill: npx skills add https://github.com/op7418/guizang-social-card-skill --skill guizang-social-card-skill (requires Playwright and a Claude Code agent with shell access).
Verify installation — the agent should confirm SKILL.md, assets/, and references/ directories exist in ~/.claude/skills/guizang-social-card-skill.
Invoke it in plain language: 帮我基于这篇文章做一套瑞士风小红书图文,5 张,IKB 蓝 — or in English: 'Generate a 5-card Swiss-style Xiaohongshu set from this article, IKB blue accent'.
The skill renders a self-contained HTML file using one of 28 layouts, then Playwright headlessly screenshots each card at the correct canvas size (1080×1440 for 3:4, 2100×900 for WeChat header).
Output PNGs land in your working directory — ready to upload directly with zero post-processing.
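
The canvas sizes quoted in the steps above follow from each format's aspect ratio. A minimal TypeScript sketch of that arithmetic, assuming hypothetical names (CanvasSize, CANVAS, aspectRatio) that are not part of the skill; only the two pixel sizes come from the description above:

```typescript
// Hypothetical lookup table: these names are illustrative, not the skill's API.
// The pixel sizes are the two quoted in the steps above.
interface CanvasSize {
  width: number;
  height: number;
}

const CANVAS = {
  xiaohongshuCard: { width: 1080, height: 1440 }, // 3:4 carousel card
  wechatHeader: { width: 2100, height: 900 },     // wide WeChat header
};

// Greatest common divisor, used to reduce width:height to lowest terms.
function gcd(a: number, b: number): number {
  return b === 0 ? a : gcd(b, a % b);
}

// Returns the reduced aspect ratio as a "w:h" string.
function aspectRatio(size: CanvasSize): string {
  const divisor = gcd(size.width, size.height);
  return `${size.width / divisor}:${size.height / divisor}`;
}

const cardRatio = aspectRatio(CANVAS.xiaohongshuCard); // "3:4"
const headerRatio = aspectRatio(CANVAS.wechatHeader);  // "7:3"
```

The header reduces to 7:3, which is the same proportion as the 21:9 quoted for the WeChat header.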

How I could use this

Auto-generate a branded 'Career Edge' digest card set every time Henry's weekly AI-jobs roundup posts — wire the skill into the existing fetch-ai-news GitHub Actions cron so each digest article gets a 3-card Swiss-style Xiaohongshu set committed alongside the markdown, giving the blog a social-ready asset with zero extra effort.
Build a 'Share My Gap Analysis' feature in the resume analyser: after Claude returns a skill gap report, offer a one-click 'Export as card' button that calls a /api/share-card route, passes the top 5 gaps + the user's target role to the skill, and returns a 1:1 WeChat-style summary card the user can post to LinkedIn or Rednote — turning a private tool output into a viral shareable.
Create a visa milestone card generator: when a user logs a status update in the visa tracker (e.g. '485 granted'), trigger the skill to produce an Editorial-style commemorative card with the visa subclass, grant date, and a short personalised message — delivered as a downloadable PNG from the dashboard, making a mundane database write emotionally rewarding and naturally shareable.

2. helloianneo/ian-xiaohei-illustrations

1,095 stars this week · various · ai-agent chinese codex-skill handdrawn

A Codex Skill that instructs AI agents to generate hand-drawn 16:9 article illustrations for Chinese content using a consistent 'Xiaohei' character IP — bridging the gap between abstract concepts and visual metaphors.

Use case

Chinese content creators writing methodology articles, AI workflow explainers, or knowledge posts need inline diagrams that aren't generic stock art or PowerPoint infographics. This skill gives an AI agent a repeatable visual grammar: extract the cognitive anchor from a paragraph, assign it a Xiaohei action (sorting, pressing, bridging), and generate a white-background, hand-drawn PNG. Example: when you write a post about context window limits, the skill generates a 'Xiaohei drowning in a pile of tokens' illustration rather than a bar chart.

Why it's trending

OpenAI Codex's recent resurgence as an agentic coding platform has sparked a wave of shareable 'skills' — reusable instruction sets that give AI agents a persistent visual or behavioral identity. This repo is one of the first to nail a coherent illustration style-guide-as-skill, making it a reference point for anyone building content-production agents.

How to use it

Clone the repo and copy it into your Codex skills directory: cp -R ./ian-xiaohei-illustrations ~/.codex/skills/
In your Codex agent session, load the skill with @ian-xiaohei-illustrations; the agent now has the full visual grammar and shot-list logic in context.
Paste your Chinese article text and prompt: '为这篇文章生成 4-6 张正文配图的 shot list，每张图聚焦一个认知动作' (in English: 'Generate a shot list of 4-6 in-body illustrations for this article, each focused on one cognitive action'). The agent returns a structured shot list with Xiaohei actions and annotation suggestions.
For each shot, run the image generation step: the skill instructs the agent to use DALL-E or a compatible model with the exact style constraints (white BG, hand-drawn line weight, 40-60% subject fill, sparse red/orange/blue Chinese labels).
Output PNGs land in assets/<article-slug>-illustrations/; drop them directly into your CMS or Notion doc.

How I could use this

Build a 'Gradland post illustrator' pipeline: when a new blog post lands in content/posts/, a GitHub Action triggers a Codex agent with this skill to auto-generate 2-3 Xiaohei illustrations and commit them alongside the markdown — giving your Chinese-language visa/career posts a distinctive, recognizable visual identity that separates Gradland from generic IT career blogs.
Adapt the shot-list pattern for your resume analyser feature: instead of returning only text gaps, generate a Xiaohei 'gap map' diagram showing the candidate's current skills (small Xiaohei) vs the job requirements (large structure looming over them) — a shareable visual that international students can post on LinkedIn, driving referral traffic back to Gradland.
Fork the skill's cognitive-anchor extraction logic (the step where it identifies 'judgment, process, state, or metaphor' in a passage) and wire it into your AI news digest pipeline — before fetching the weekly AI news summary, run this extraction pass to auto-tag each article with its core concept type, then use that tag to select a matching Xiaohei illustration template, giving content/digest/ posts a consistent illustrated header without manual art direction.

3. UditAkhourii/adhd

572 stars this week · TypeScript · adhd agents ai ai-agents

A drop-in Claude Code skill that replaces linear chain-of-thought with isolated parallel reasoning processes across divergent cognitive frames — then scores, prunes, and deepens the survivors before returning an answer.

Use case

LLMs anchor on their first token — ask Claude 'how should I structure this API?' and every branch of its thinking is contaminated by whatever it said in line 1. ADHD fixes this architecturally: it spawns N completely isolated sub-agents under deliberately different cognitive frames (e.g. 'systems thinker', 'sceptic', 'first-principles engineer'), lets them reason in parallel with zero shared context, then runs a critic pass that clusters overlapping ideas, kills traps, and sends only the surviving divergent threads into a deeper pass. Concrete scenario: you're deciding between three resume-scoring architectures and every LLM you ask converges on the same RAG pattern — ADHD would surface the outlier approach (graph-based skill matching, say) that a single CoT pass would anchor past.

Why it's trending

The repo picked up 572 stars in a week after a New Stack feature and adoption by repowire. It's the first repo to frame LLM premature convergence as an architectural bug rather than a prompting problem, which resonates strongly right now as teams hit the ceiling of single-agent CoT on complex design decisions. The Claude Agent SDK launch also brought a new audience looking for ready-made skills.

How to use it

Install: npm install -g adhd-agent (requires Node ≥18 and an active Claude Code Pro session).
Drop the skill into your Claude Code project: copy adhd.md from the repo into your .claude/skills/ directory, or register it via the SDK if you're building an agent.
Invoke it from Claude Code: /adhd How should I design the rate-limiting layer for my resume-analysis API? — it will fan out N isolated reasoning agents under different frames and return a scored, pruned synthesis.
Tune divergence width and pruning threshold in adhd.config.ts — start with frames: 5, pruneThreshold: 0.4 for most design questions.
Consume the structured output (survivors[], traps[], synthesis) — wire synthesis into your next prompt or display survivors as ranked alternatives in your UI.
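
The score-and-prune step described above could be sketched as follows. This is a minimal TypeScript illustration under stated assumptions: the Idea and PruneResult shapes, the prune function, and the exact-label rule for detecting overlapping ideas are hypothetical and not taken from the repo. Only the survivors/traps names and the 0.4 threshold come from the steps above.

```typescript
// Hypothetical shapes: field names are assumptions for illustration only.
interface Idea {
  frame: string; // cognitive frame that produced the idea
  label: string; // short label; identical labels count as overlapping ideas
  score: number; // critic score in [0, 1]
}

interface PruneResult {
  survivors: Idea[];
  traps: Idea[];
}

// Keep the best-scoring idea per label, then split on the threshold.
// A real critic would cluster by meaning; exact label match is a stand-in.
function prune(ideas: Idea[], pruneThreshold: number): PruneResult {
  const bestByLabel = new Map<string, Idea>();
  for (const idea of ideas) {
    const seen = bestByLabel.get(idea.label);
    if (seen === undefined || idea.score > seen.score) {
      bestByLabel.set(idea.label, idea);
    }
  }
  const unique = Array.from(bestByLabel.values()).sort(
    (a, b) => b.score - a.score,
  );
  return {
    survivors: unique.filter((idea) => idea.score >= pruneThreshold),
    traps: unique.filter((idea) => idea.score < pruneThreshold),
  };
}

// Made-up scores for the resume-scoring scenario described earlier.
const pruned = prune(
  [
    { frame: "systems thinker", label: "RAG pipeline", score: 0.7 },
    { frame: "sceptic", label: "RAG pipeline", score: 0.5 },
    { frame: "first-principles engineer", label: "graph-based skill matching", score: 0.8 },
    { frame: "sceptic", label: "keyword counting", score: 0.2 },
  ],
  0.4,
);
```

With these made-up scores, the duplicate RAG idea collapses to its best-scoring copy, the graph-based outlier ranks first among the survivors, and the keyword-counting idea lands in traps.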

How I could use this

Wire ADHD into a 'blog post ideation' endpoint: when Henry drafts a title, hit /adhd 'What are 5 non-obvious angles on: {title}?' and surface the pruned survivors as a sidebar of alternative framings he can pivot to — solves the blank-page problem for the content moat without him switching tools.
Use ADHD's critic+prune output as the backbone of a 'resume strategy' feature: instead of one Claude pass scoring a resume against a JD, spawn frames like 'ATS filter', 'hiring manager gut check', 'visa-sponsor lens', and 'skills-gap analyst' in parallel — the pruned survivors become the ranked, non-redundant feedback bullets shown to the user, which is meaningfully better than a single-pass gap analysis.
Build a 'career path divergence' tool for the 485 visa audience: given a user's current role and target PR pathway, run ADHD with frames like 'fastest skills ROI', 'least visa risk', 'highest salary ceiling Australia', and 'most transferable to NZ/UK fallback' — the synthesis surfaces the non-obvious path that a linear 'here are your options' prompt would collapse into generic advice.

4. withkynam/vibecode-pro-max-kit

569 stars this week · JavaScript · agentic ai-agents ai-coding-assistant ai-development

A spec-driven harness of 12 agents and 32 skills that wraps Claude Code / Codex to enforce PRD-first development and persist project context across sessions — so your AI doesn't amnesia-code its way into spaghetti.

Use case

When you're building a multi-week feature with an AI coding agent, every new session starts cold — the AI re-invents decisions, ignores past constraints, and ships code that contradicts what it wrote yesterday. vibecode-pro-max-kit solves this by maintaining a living spec (PRD + backlog + ADRs) in-repo and routing every agent call through that context first. Concrete example: you're adding Stripe webhooks over three sessions; instead of re-explaining your auth pattern each time, the harness loads the relevant spec slices automatically and the agent picks up exactly where it left off.

Why it's trending

Context rot is the #1 complaint from developers running Claude Code or Codex on anything beyond a single-session feature — the timing overlaps with Claude Code going GA and a wave of 'vibe-coded' apps hitting production and breaking. The repo is essentially a community-standardised answer to Anthropic's own CLAUDE.md pattern, but with 12 pre-wired agent roles and persistent memory out of the box.

How to use it

Install into any existing project: npx vibecode-pro-max-kit init, which scaffolds a .vibecode/ directory with spec templates, agent configs, and a MEMORY.md index.
Run the Spec agent to generate a PRD from a plain-English feature description: claude /spec 'Add resume PDF export with ATS scoring overlay'; it produces a structured PRD in .vibecode/specs/ with acceptance criteria and a file change list.
Hand off to the Builder agent: claude /build specs/resume-pdf-export.md; the agent reads the spec, checks MEMORY.md for existing patterns (your auth setup, rate-limit conventions, etc.), then implements without re-asking.
After shipping, run /remember; the harness extracts non-obvious decisions from the session and writes them to .vibecode/memory/ so the next agent inherits them.
For long autonomous runs (CI, overnight tasks), use /loop with a spec file as the target; the harness self-checkpoints state every N agent calls so a quota limit or timeout doesn't lose progress.

How I could use this

Wire the Spec agent into your TODO.md workflow from AGENTS.md §16 — instead of manually writing feature entries, run /spec 'feature description' and let it generate the TODO entry + PRD stub, then review before implementation starts. This enforces your own spec-before-code rule without the friction.
Use the memory self-improvement loop to build a Gradland-specific agent knowledge base: after each PR merged to main, run /remember against the diff — over time the harness accumulates your exact auth patterns, RLS conventions, and rate-limit rules, so any new career-tools feature (gap analysis, interview routes) gets implemented consistently without you re-citing AGENTS.md every session.
Create a /career-tools-scaffold custom skill that uses the harness's agent routing to spin up a new career tool (route handler + client component + Supabase migration + rate limit registration) from a single spec file — the 12-agent architecture means you can parallelise the API route agent, the UI agent, and the migration agent, cutting feature scaffolding from 30 minutes of prompting to one command.

5. Michaelliv/pi-dynamic-workflows

557 stars this week · TypeScript

A Pi extension that ports Claude Code's dynamic multi-agent workflow system to the open-source Pi AI assistant, letting you fan out work across isolated parallel subagents from a plain JavaScript script.

Use case

When a task is too large or multidimensional for one sequential AI pass — a codebase security audit, a multi-angle PR review, a fan-out research job — you write a small JS script that spawns isolated subagents in parallel and synthesizes their outputs. Concrete example: auditing a Next.js app for performance, security, and accessibility simultaneously across three subagents, rather than waiting for each dimension to finish before starting the next.

Why it's trending

Anthropic shipped dynamic workflows in Claude Code this week and published a blog post about it — this repo is the immediate community response, porting the same pattern to Pi (an open-source Claude Code alternative). Developers who can't or won't pay for Claude Code Pro are using this to get the same fan-out capability in a self-hosted setup.

How to use it

Install: pi install npm:pi-dynamic-workflows, then /reload in Pi.
Ask Pi in plain English: 'Run a workflow to audit this repo for security issues across auth, API routes, and SQL queries.'
Pi auto-generates a workflow script with phase(), agent(), and parallel() calls.
Watch inline progress — each subagent shows ✓/✗ as it completes; press Esc to abort.
The synthesized result is returned to the main conversation as structured output.

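A generated workflow script looks roughly like this:

// Label the current stage so progress is reported inline
phase('Review')
// Fan out three isolated subagents and wait for all of them to finish
const [auth, api, sql] = await parallel([
  () => agent('Audit auth middleware for session leaks'),
  () => agent('Check all /api routes for missing rate limits'),
  () => agent('Scan Supabase queries for RLS bypass risks'),
])
// Return the combined findings to the main conversation
return { auth, api, sql }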

How I could use this

Automate the githot digest: write a Pi workflow that fans out across the week's top 10 repos in parallel — one subagent per repo doing the use-case/trending/brainstorm analysis you're doing manually right now — then synthesizes into a single content/githot/YYYY-MM-DD.md file ready to commit. Cuts multi-minute sequential Claude calls down to the latency of the slowest single repo.
Parallel gap analysis for career tools: instead of sending a job description through one sequential Claude call that checks skills, visa fit, salary band, and ACS classification one after another, fan out each dimension to its own subagent. The four analyses run concurrently; the synthesizer merges them into the structured gap report. On a 485-visa candidate with a complex JD, this could halve response latency.
Multi-lens cover letter generator: spawn three subagents per application — one researches the company's recent GitHub activity and tech stack, one reads the JD for implicit culture signals, one checks ACS ANZSCO code alignment for visa sponsorship likelihood — then a synthesis agent writes the cover letter using all three inputs. Replaces the current single-pass approach with grounded, parallel research.
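
The latency claim in the gap-analysis idea above comes down to simple arithmetic: sequential calls add their latencies, while parallel subagents finish when the slowest one does. A small TypeScript sketch, where the function names and the per-dimension timings are hypothetical and not measured:

```typescript
// Sequential calls wait for each analysis in turn, so latencies add up.
function sequentialLatency(latenciesMs: number[]): number {
  return latenciesMs.reduce((total, ms) => total + ms, 0);
}

// Parallel subagents are done when the slowest one finishes.
function parallelLatency(latenciesMs: number[]): number {
  return latenciesMs.reduce((slowest, ms) => Math.max(slowest, ms), 0);
}

// Hypothetical latencies in milliseconds (not measured) for the four
// dimensions: skills, visa fit, salary band, ACS classification.
const analysisLatenciesMs = [4000, 3000, 2500, 3500];

const sequentialMs = sequentialLatency(analysisLatenciesMs); // 13000
const parallelMs = parallelLatency(analysisLatenciesMs);     // 4000
```

With these made-up numbers the fan-out takes 4,000 ms instead of 13,000 ms before synthesis. The merge pass and any provider rate limits add to the parallel figure, so real savings would be smaller, which is why "halve" is a more realistic expectation than "quarter".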

6. baoweise-bot/aimili-vpngate

509 stars this week · Python

AimiliVPN automates cycling through free VPNGate public nodes on a Linux VPS to give you a clean, rotating outbound IP with HTTP/SOCKS5 proxy endpoints — no paid VPN subscription required.

Use case

If you self-host scrapers, API polling bots, or outbound webhooks on a VPS, your server's static IP eventually gets rate-limited or blocked. AimiliVPN solves this by automatically benchmarking hundreds of free VPNGate nodes, picking the lowest-latency one, routing your outbound traffic through it via OpenVPN + policy routing, and exposing a local SOCKS5/HTTP proxy your app can use — so each session looks like it originates from a different country's clean IP.

Why it's trending

Free 'clean IP' egress has become a hot topic as AI scraping bots saturate shared VPS subnets, getting them flagged by Cloudflare and API providers almost instantly. VPNGate's open academic relay network (run by University of Tsukuba) is one of the few high-quality free sources left, and automating it is newly attractive.

How to use it

Spin up a fresh Ubuntu 22.04 VPS (DigitalOcean, Hetzner, etc.). The tool installs system-level OpenVPN and iptables rules, so use a dedicated machine.
Run the one-liner: bash <(curl -Ls https://raw.githubusercontent.com/baoweise-bot/aimili-vpngate/main/install.sh), which installs dependencies, registers the ml CLI, and starts the service.
Check status and grab your local proxy address: ml status shows the active VPN node, latency, and the SOCKS5/HTTP proxy port (typically 127.0.0.1:1080).
Point your app at the proxy: curl --socks5 127.0.0.1:1080 https://api.example.com or set HTTPS_PROXY=socks5://127.0.0.1:1080 in your env.
Use ml restart to rotate to a fresh node when the current one degrades, or automate rotation with a cron job calling the CLI.

How I could use this

Route your scripts/fetch-ai-news.ts and scripts/fetch-visa-news.ts scrapers through a rotating SOCKS5 proxy so your Vercel-deployed cron jobs don't share the same Vercel egress IP pool that news sites increasingly block — run the scraper on a sidecar VPS with AimiliVPN and POST results to a Supabase edge function.
For the job scraper backing Gradland's job search feature, proxy outbound Jora/ACS requests through AimiliVPN to bypass per-IP rate limits on Australian job boards — you'd get significantly higher successful scrape rates without paying for residential proxies.
Use the HTTP proxy endpoint as a passthrough in a Next.js route handler that geo-tests your own content: hit /api/geo-preview?country=JP which proxies a self-request through a Japanese VPNGate node and returns what an overseas user actually sees — useful for validating visa-news and salary content accuracy by region.

7. MatinSenPai/SenPaiScanner

477 stars this week · Go

A Go terminal UI tool that scans Cloudflare IP ranges to find low-latency endpoints that survive DPI/censorship filtering, validated end-to-end through your actual VLESS/Trojan proxy config.

Use case

Developers behind restrictive networks (Iran, China, corporate firewalls) where Cloudflare-fronted proxies randomly drop due to DPI need to find CF IPs that actually pass traffic. SenPaiScanner automates the two-phase hunt: first a lightweight connectivity probe across the CF IP space, then a live xray validation that downloads real bytes through your config — so you get IPs ranked by actual throughput, not just ping.

Why it's trending

Network restriction enforcement has intensified in several regions in 2026, driving demand for practical circumvention tooling. The repo stands out because it embeds xray natively and adds a TUI with zero CLI flags — making a technically complex workflow accessible enough to go viral in affected communities.

How to use it

Download the latest release binary for your platform from GitHub Releases (or go install github.com/matinsenpai/senpaiscanner@latest).
Run senpaiscanner — no flags needed; a terminal menu appears immediately.
Select 'Find Working IPs', paste your VLESS or Trojan config URL (e.g. vless://uuid@domain:443?...).
Phase 1 probes CF IP ranges using SNI/host/path extracted from your config; Phase 2 launches an embedded xray instance and benchmarks the best hits for real download speed and TTFB.
Press c when Phase 2 finishes to copy a working config with the fastest discovered IP substituted in — drop it straight into your proxy client.
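
The two-phase pattern described above (a cheap reachability filter, then an expensive benchmark only on the survivors) can be sketched in TypeScript. Everything here is an illustrative assumption rather than SenPaiScanner's actual code: the Benchmark and RankedIp shapes, the twoPhaseScan function, the injected probe functions, and the tie-break on TTFB are all hypothetical, and the IPs are documentation-range placeholders, not Cloudflare addresses.

```typescript
// Hypothetical result shapes; the real tool is written in Go and differs.
interface Benchmark {
  throughputKbps: number; // measured download speed through the proxy
  ttfbMs: number;         // time to first byte
}

interface RankedIp extends Benchmark {
  ip: string;
}

// Phase 1: keep only IPs that pass the lightweight probe.
// Phase 2: benchmark the survivors and rank by throughput (TTFB breaks ties).
function twoPhaseScan(
  ips: string[],
  isReachable: (ip: string) => boolean,
  benchmark: (ip: string) => Benchmark,
): RankedIp[] {
  const reachable = ips.filter((ip) => isReachable(ip));
  const measured = reachable.map((ip) => {
    const b = benchmark(ip);
    return { ip: ip, throughputKbps: b.throughputKbps, ttfbMs: b.ttfbMs };
  });
  return measured.sort(
    (a, b) => b.throughputKbps - a.throughputKbps || a.ttfbMs - b.ttfbMs,
  );
}

// Stubbed probes with made-up numbers so the sketch runs offline.
const rankedIps = twoPhaseScan(
  ["192.0.2.1", "192.0.2.2", "192.0.2.3"],
  (ip) => ip !== "192.0.2.2", // pretend .2 fails the Phase 1 probe
  (ip) =>
    ip === "192.0.2.3"
      ? { throughputKbps: 2400, ttfbMs: 90 }
      : { throughputKbps: 800, ttfbMs: 120 },
);
```

Injecting the probe functions keeps the expensive Phase 2 work off IPs that already failed Phase 1, which is the point of splitting the scan.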

How I could use this

Write a 'How Cloudflare IP scanning works' deep-dive post for the blog — explaining BGP anycast, DPI evasion via TLS SNI fronting, and why latency varies so wildly across the CF range. This is high-value SEO content for developers in restricted regions, a real audience gap, and directly relevant to the Go/networking audience already reading githot digests.
Build a 'Network-aware job search' feature for the career tools: detect whether a visitor is behind a high-latency or restricted connection (via a lightweight CF worker probe from the browser) and surface Australian remote-first job listings first, since on-site roles are irrelevant if the candidate is still overseas with spotty connectivity.
Use SenPaiScanner's two-phase scan architecture as a design pattern reference when building your own AI feature health-check endpoint — Phase 1 checks if Anthropic's API edge is reachable at all (DNS + TLS), Phase 2 runs a cheap claude-haiku probe request to confirm the session is truly functional before an expensive Sonnet call. Expose this as a /api/health/ai route that your frontend can poll to show a degraded-mode banner instead of a silent spinner.

8. 2aronS/Duel-Agents

450 stars this week · TypeScript · ai-agents anthropic claude-code cli

A routing layer that fans your prompt out to multiple LLMs simultaneously and returns the cheapest model response that passes a quality threshold — wired directly into Claude Code, Cursor, and Codex CLI.

Use case

The real problem: you're paying Claude Opus rates for tasks that GPT-4o-mini or Haiku could handle fine, but you don't want to manually route per-task. Duel Agents intercepts every prompt, runs it against a fleet of models in parallel, scores the outputs, and returns the winner by cost-quality ratio. Concrete example: Henry's cover letter endpoint currently hardcodes claude-sonnet-4-6 — Duel would automatically downgrade to Haiku on simple rewrites and only escalate to Sonnet when the scoring threshold demands it.

Why it's trending

Model routing is the hot infra problem of mid-2026 — everyone has multi-model access but no clean way to arbitrage cost vs quality per-request without writing bespoke logic. This repo landed 450 stars this week because it drops routing into existing IDE toolchains (Claude Code plugin, Cursor skill) with a single npx command, requiring zero app-level code changes.

How to use it

Get a Duel API key from duelagents.com/dashboard/settings and export DUEL_API_KEY=duel_yourprefix_yoursecret.
Run npx @duel-agents/install claude-code to patch Claude Code's model config to point at https://duelagents.com/v1.
For app-level integration, swap your Anthropic base URL: const client = new Anthropic({ baseURL: 'https://duelagents.com/v1', apiKey: process.env.DUEL_API_KEY }); the SDK is OpenAI-compatible so the same pattern works with the OpenAI SDK.
Use model: 'duel-auto' in API calls and Duel picks the winner; or use the 'duel-haiku-preferred' / 'duel-sonnet-floor' aliases to set a quality floor.
Run npx @duel-agents/install doctor to verify routing is live and inspect per-model latency/cost breakdown.
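
The selection rule described earlier (return the cheapest response that clears a quality threshold) could look something like the TypeScript sketch below. This is not Duel's implementation: the Candidate shape, the pickWinner function, the tier names, the scores, and the choice to return undefined when nothing passes are all assumptions for illustration.

```typescript
// Hypothetical candidate record; Duel's real scoring is not documented here.
interface Candidate {
  model: string;   // placeholder tier name, not a real model ID
  costUsd: number; // cost of producing this response
  quality: number; // grader score in [0, 1]
}

// Returns the cheapest candidate whose quality meets the threshold,
// or undefined when none do (the caller decides how to escalate).
function pickWinner(
  candidates: Candidate[],
  threshold: number,
): Candidate | undefined {
  const passing = candidates.filter((c) => c.quality >= threshold);
  passing.sort((a, b) => a.costUsd - b.costUsd);
  return passing[0];
}

// Made-up costs and scores for three tiers.
const sampleCandidates: Candidate[] = [
  { model: "large", costUsd: 0.021, quality: 0.93 },
  { model: "small", costUsd: 0.003, quality: 0.62 },
  { model: "medium", costUsd: 0.009, quality: 0.81 },
];

const strictWinner = pickWinner(sampleCandidates, 0.8);  // the "medium" tier
const lenientWinner = pickWinner(sampleCandidates, 0.6); // the "small" tier
```

Raising the threshold is what pushes a request up to a more expensive tier, which is roughly the behaviour the 'duel-sonnet-floor' style aliases suggest.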

How I could use this

Wire Duel into the githot digest pipeline (scripts/fetch-ai-news.ts) — the summarisation pass is a perfect duel-auto candidate since quality floor is low and it runs daily, potentially cutting the Claude Haiku bill by routing some calls to cheaper models on off-peak nights.
Build a transparent cost-comparison widget for the blog: log which model Duel actually selected for each cover-letter or gap-analysis call, then render a 'this analysis cost $0.003 vs $0.021 if Sonnet had run' stat on the results page — concrete social proof that smart routing is production-viable.
Use Duel's OpenAI-compatible endpoint to A/B test model quality on the interview prep feature without code changes — flip model: 'duel-auto' for 10% of users, capture their session ratings, and compare against the hardcoded Sonnet baseline to quantify whether routing degrades perceived answer quality for that audience.

9. nv-tlabs/Gamma-World

429 stars this week · various · aigc multi-agent robotics video-game

NVIDIA's Gamma-World generates consistent, multi-agent video game worlds with more than two simultaneous AI-controlled players — replacing scripted NPCs with emergent world-model agents.

Use case

Game engines and simulation platforms have long been bottlenecked by hand-scripted NPC behaviour that breaks the moment players deviate from expected paths. Gamma-World trains a single diffusion-based world model that jointly predicts the next game state for N agents at once, keeping all characters physically coherent with each other. Concretely: imagine a six-player battle arena where every character's next frame is generated by the model, not a rule tree — no scripted edge cases, no desync between agents.

Why it's trending

It dropped on arXiv (2605.28816) in the last two weeks from NVIDIA's SIL lab and immediately hit the intersection of two hot topics: video-native world models (à la Genie 2, GameGen) and multi-agent coordination in generative AI. 429 stars in the first week signals strong interest from both the game-dev and robotics communities who are hungry for scalable world-model baselines.

How to use it

Clone and install dependencies: git clone https://github.com/nv-tlabs/Gamma-World && cd Gamma-World && pip install -r requirements.txt
Download the pretrained checkpoint from the project page (NVIDIA Hugging Face hub) and place it under checkpoints/.
Run inference on the provided demo scenario: python generate.py --config configs/demo_6player.yaml --ckpt checkpoints/gamma_world.pt --out_dir outputs/
Inspect outputs/ — you get a rendered video of N agents interacting over a horizon of ~64 frames.
To condition on your own action sequences, edit the agent_actions field in the YAML config (each agent gets an independent action token stream) and re-run — no retraining needed for action-conditional rollouts.

How I could use this

Write a deep-dive blog post titled 'Why World Models Are the New Game Engine' that uses Gamma-World's 6-player demo as the centrepiece — embed the NVIDIA demo video, walk through the architecture (joint diffusion over N agent tokens), and contrast it with how Unreal scripted NPCs work today. This targets the intersection of your AI-news readers and any game-dev audience you can pull from Hacker News.
Build a 'Career Simulation' concept page for Gradland: pitch the idea of using a world-model-style approach to simulate interview panels (multiple interviewers as agents, each with independent reasoning tokens). You don't need to implement the full model — write a speculative product spec as a blog post, mock up the UI with your existing design system, and use it as a thought-leadership piece that signals Gradland is tracking frontier AI research relevant to career tools.
Use Gamma-World as a case study in your AI digest to explain the core technical idea of 'action-conditional video generation' to a non-ML audience — strip the paper down to one concrete analogy (e.g. 'it's like a chess engine that thinks in pixels, not move trees'), generate a Mermaid diagram of the inference loop (action tokens → world model → next frame → repeat), and publish it as a short 'paper explained in 5 minutes' post using your existing fetch-ai-news pipeline to keep it topical.

10. Sophomoresty/gemini-web2api

420 stars this week · Python

A single-file Python proxy that wraps Google Gemini's free web interface as an OpenAI-compatible API — drop it behind any OpenAI SDK call and pay nothing.

Use case

If you're building AI features but hitting Anthropic/OpenAI rate limits or cost ceilings, this lets you route overflow requests through Gemini's free web tier using the exact same OpenAI SDK calls you already have. Concrete scenario: your cover-letter endpoint hits its daily Claude quota at 2am — instead of returning a 429, you fall back to this local proxy and the request succeeds at zero marginal cost.

Why it's trending

Gemini 2.5 Flash's 1M-token context and 20k+ character output via Flash Thinking just became genuinely competitive, and developers are realising the free web tier can absorb serious workloads. The zero-dependency single-file design also means it deploys as a sidecar on any machine without touching your package graph.

How to use it

Clone and run: git clone https://github.com/Sophomoresty/gemini-web2api && cd gemini-web2api && python gemini_web2api.py; the server starts at http://localhost:8081/v1
Point your existing OpenAI client at the proxy: client = OpenAI(base_url='http://localhost:8081/v1', api_key='any-string')
Swap model names to Gemini variants: model='gemini-3.5-flash-thinking' for deep reasoning, model='gemini-3.5-flash' for fast responses
For streaming in Next.js route handlers, call the proxy as you would any OpenAI-compatible endpoint: the SSE format follows the OpenAI convention, so stream-parsing code written for the OpenAI SDK works unchanged
Optional: set api_keys in config.json to add Bearer auth if you're exposing the proxy beyond localhost

How I could use this

Use Gemini Flash Thinking's 20k-char output limit as a free fallback for your gap-analysis endpoint — when a user's resume is long and Claude Haiku hits context limits, proxy the request through this and get a full structured analysis back without the truncation problem you'd hit at 4k tokens.
Wire this as a secondary provider in your cover-letter and interview-prep routes: track daily Claude usage in Supabase, and once a user crosses your free-tier threshold, transparently route their next request through the Gemini proxy — same API contract, no code change on the frontend, and you avoid hard-blocking free users before they convert.
Gemini has native web search built in (@search suffix) — use this proxy to build a live visa-news enrichment step: when your fetch-visa-news.ts script runs, send the article title through Gemini with web search enabled to pull in same-day updates from DIBP and tribunal decisions that weren't in the original RSS feed, then merge that into your existing markdown pipeline.
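
The quota-based fallback in the second idea above reduces to one routing decision per request. A minimal TypeScript sketch, assuming a hypothetical chooseRoute helper and a per-user daily counter you maintain yourself (for example in Supabase); only the proxy address comes from the steps above:

```typescript
// Default local address of the proxy, as given in the setup steps.
const GEMINI_PROXY_BASE_URL = "http://localhost:8081/v1";

type Provider = "claude" | "gemini-proxy";

// Hypothetical routing result; baseUrl is omitted when the SDK default applies.
interface Route {
  provider: Provider;
  baseUrl?: string;
}

// callsToday is the number of Claude calls the user has already made today.
// Once it reaches the free limit, the next request goes to the proxy.
function chooseRoute(callsToday: number, dailyFreeLimit: number): Route {
  if (callsToday < dailyFreeLimit) {
    return { provider: "claude" };
  }
  return { provider: "gemini-proxy", baseUrl: GEMINI_PROXY_BASE_URL };
}

const underLimit = chooseRoute(3, 5); // still on Claude
const atLimit = chooseRoute(5, 5);    // routed to the local proxy
```

Keeping the decision in one pure function makes the threshold easy to test and leaves the frontend contract untouched, since only the base URL changes.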