Every workflow automation project starts with the same question: do we build it in a visual tool like n8n or Make, or do we write custom code? We have shipped all three across client projects, and the honest answer is that the right choice depends on volume, how often the logic changes, and who maintains it after we hand it off. The wrong choice usually shows up six months later as either a brittle node graph nobody can debug or a custom service that costs more to maintain than the problem it solved. Here is how we actually decide, with the numbers we use to back the call.
What n8n, Make, and custom code each do well
Make (formerly Integromat) is the fastest way to connect SaaS apps with a clean visual canvas. It ships with roughly 3,000 prebuilt connectors and, since August 2025, prices on credits rather than the older “operations” model. In practice one credit still maps to one module run for standard steps, while AI modules and steps that fan out across multiple data bundles can burn more than one credit each. For a marketing team automating lead routing across HubSpot, Slack, and Google Sheets, a non-technical operator can build and own the scenario without a developer. The ceiling is real logic: branching, loops, and data reshaping get awkward fast, and complex scenarios become expensive because every module call counts against your credit balance.
n8n sits in the middle. It is fair-code licensed, self-hostable, and lets us drop into JavaScript or Python inside any node. It offers 500-plus official integrations plus unlimited custom HTTP calls, so we get the visual graph for the boring 80 percent (auth, retries, pagination) and real code for the 20 percent that actually matters. The pricing model is the key difference: n8n bills per execution, where one workflow run is one execution regardless of how many nodes it touches, and self-hosted Community Edition has no execution cap at all. That changes the math entirely at volume. The tradeoff is that you now own a running service: upgrades, the Postgres database behind it, and queue mode if you need concurrency.
Custom code (a TypeScript or Python service, often with a job queue like BullMQ or a durable execution engine like Temporal) wins when the workflow is core to the product, runs at high volume, or needs guarantees a visual tool cannot express cleanly: durable multi-step state, fine-grained idempotency, complex retry policies, or sub-100ms latency. The cost is engineering time up front and ongoing ownership.
The cost math that usually decides it
The number that flips decisions is volume, and the two platforms bill on opposite axes. Make charges per module run, so an 8-module scenario running 50,000 times a month is roughly 400,000 credits. The Core plan ($9/month) includes about 10,000 credits, so that workload pushes you into add-on packs (priced around 25 percent above your plan’s included rate) or a much higher tier. n8n charges per execution, so those same 50,000 runs are 50,000 executions no matter how many nodes each contains, and on self-hosted Community Edition the marginal cost is just your server.
A worked example makes the gap concrete. Take a daily sync that runs 30 modules on each of 20,000 records a month. On Make that is roughly 30 × 20,000 = 600,000 credits, comfortably into the higher paid tiers. The same job on a self-hosted n8n box (a $20 to $50/month VPS for light loads, $50 to $150/month for production) costs the same however many records pass through, plus a few hours of DevOps a month. At the volumes we tend to run, tens of thousands of executions a day, a per-module model is simply the wrong tool: the bill grows linearly with success while a self-hosted n8n instance or a custom worker runs on a fixed server cost.
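The arithmetic behind both examples fits in a few lines of Python. This is a minimal sketch, not a pricing calculator: it assumes one credit per module run (true for standard steps, not for AI modules or multi-bundle fan-out), it assumes each record in the sync triggers its own run, and the function names are ours for illustration.

```python
def make_credits(modules_per_run: int, runs_per_month: int) -> int:
    """Credits under per-module billing, assuming one credit per module run."""
    return modules_per_run * runs_per_month


def n8n_executions(runs_per_month: int) -> int:
    """Executions under per-execution billing: node count does not matter."""
    return runs_per_month


# The 8-module scenario running 50,000 times a month.
scenario_credits = make_credits(modules_per_run=8, runs_per_month=50_000)

# The 30-module sync over 20,000 records (assumption: one run per record).
sync_credits = make_credits(modules_per_run=30, runs_per_month=20_000)

print(f"8-module scenario: {scenario_credits:,} credits vs "
      f"{n8n_executions(50_000):,} executions")
print(f"30-module sync:    {sync_credits:,} credits vs "
      f"{n8n_executions(20_000):,} executions")
```

Adding a module to the workflow raises the first number and leaves the second unchanged. That is why the two billing models pull apart as workflows grow.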
A rough heuristic we use:
- Under ~10,000 module runs/month with simple logic and a non-technical owner: Make. Speed to ship and zero infrastructure usually beat everything else.
- 10,000 module runs/month up to a few hundred thousand, or logic that needs real code: self-hosted n8n. You escape per-module billing and keep the visual layer for maintainability.
- High volume, product-critical paths, or strict reliability and latency needs: custom code. The maintenance burden is justified because the workflow is the product.
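The heuristic can be written down as a small function, which is a useful way to check where a given project lands. Treat it as a sketch of a rule of thumb, not a policy. The 500,000 cutoff is our stand-in for "a few hundred thousand", and the parameter names are assumptions made for illustration.

```python
HIGH_VOLUME_CUTOFF = 500_000  # assumption: stand-in for "a few hundred thousand"


def suggest_tool(module_runs_per_month: int,
                 needs_real_code: bool = False,
                 product_critical: bool = False) -> str:
    """Map volume and logic complexity to a starting recommendation."""
    # Product-critical paths and very high volume justify the maintenance burden.
    if product_critical or module_runs_per_month > HIGH_VOLUME_CUTOFF:
        return "custom code"
    # Past ~10,000 module runs, or once real code is needed, escape per-module billing.
    if module_runs_per_month >= 10_000 or needs_real_code:
        return "self-hosted n8n"
    # Low volume, simple logic: speed to ship wins.
    return "Make"


print(suggest_tool(5_000))
print(suggest_tool(5_000, needs_real_code=True))
print(suggest_tool(120_000))
print(suggest_tool(2_000_000))
```

The function deliberately ignores who owns the workflow after launch. That factor is a judgment call and can override whatever the volume numbers say.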
A concrete cost example with an LLM step
The math gets sharper the moment an AI call enters the workflow, because now you pay twice: once to the automation platform and once to the model. Say a workflow classifies 100,000 support tickets a month, each prompt about 1,000 input tokens and 200 output tokens. On a small model like GPT-4o mini ($0.15 per million input, $0.60 per million output), that is roughly $15 in input plus $12 in output, about $27/month in model cost. On Claude Haiku 4.5 ($1.00 per million input, $5.00 per million output) the same volume comes to $100 in input plus $100 in output, about $200/month. The model choice alone swings the bill by roughly 7x, and a visual tool gives you almost no leverage over it.
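The same calculation as code, so you can plug in your own volumes. This is a sketch under the assumptions stated in the paragraph: fixed token counts per ticket, and the per-million-token prices quoted above, which may have changed since. The function name is ours.

```python
def monthly_model_cost(requests: int,
                       input_tokens: int,
                       output_tokens: int,
                       input_price_per_m: float,
                       output_price_per_m: float) -> float:
    """Monthly model spend in dollars, given per-million-token prices."""
    input_cost = requests * input_tokens / 1_000_000 * input_price_per_m
    output_cost = requests * output_tokens / 1_000_000 * output_price_per_m
    return input_cost + output_cost


# 100,000 tickets, ~1,000 input and ~200 output tokens each.
gpt_4o_mini = monthly_model_cost(100_000, 1_000, 200, 0.15, 0.60)
haiku_4_5 = monthly_model_cost(100_000, 1_000, 200, 1.00, 5.00)

print(f"GPT-4o mini:      ${gpt_4o_mini:,.2f}/month")
print(f"Claude Haiku 4.5: ${haiku_4_5:,.2f}/month")
print(f"Ratio:            {haiku_4_5 / gpt_4o_mini:.1f}x")
```

Because cost is linear in both token counts, trimming the prompt from 1,000 to 500 input tokens halves the input side of the bill regardless of model.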
This is exactly why we pull AI steps into custom code. In a service we control prompt length, cache repeated context, batch requests, set hard token budgets, and send each request to a cheaper model first, escalating to a larger one only when its confidence is low. Those levers routinely cut model spend by half or more, and none of them is easy to express inside a node on a canvas, where each AI module is also burning platform credits on top of the API cost.
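Three of those levers (a hard budget, a cache, and a cheap-first model cascade) can be sketched together in about thirty lines. Everything here is illustrative. The `Model` callable is a hypothetical interface, not any vendor's SDK. The character cap is a crude stand-in for a real token budget. The cache stores whole responses in memory, which is simpler than provider-side prompt caching. The stub models exist only to make the example runnable.

```python
import hashlib
from typing import Callable, Dict, Tuple

# Hypothetical interface: prompt in, (label, confidence between 0 and 1) out.
Model = Callable[[str], Tuple[str, float]]

MAX_PROMPT_CHARS = 4_000  # assumption: crude proxy for a hard token budget


class CostAwareClassifier:
    def __init__(self, cheap: Model, strong: Model, min_confidence: float = 0.8):
        self.cheap = cheap
        self.strong = strong
        self.min_confidence = min_confidence
        self.cache: Dict[str, str] = {}
        self.calls = {"cheap": 0, "strong": 0, "cache_hits": 0}

    def classify(self, ticket: str) -> str:
        # Enforce the budget before spending anything.
        prompt = ticket.strip()[:MAX_PROMPT_CHARS]
        key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        if key in self.cache:
            self.calls["cache_hits"] += 1
            return self.cache[key]

        # Cheap model first; pay for the strong one only on low confidence.
        self.calls["cheap"] += 1
        label, confidence = self.cheap(prompt)
        if confidence < self.min_confidence:
            self.calls["strong"] += 1
            label, _ = self.strong(prompt)

        self.cache[key] = label
        return label


# Stub models standing in for real API calls.
def cheap_stub(prompt: str) -> Tuple[str, float]:
    return ("billing", 0.95) if "invoice" in prompt.lower() else ("other", 0.4)


def strong_stub(prompt: str) -> Tuple[str, float]:
    return ("technical", 0.99)


clf = CostAwareClassifier(cheap_stub, strong_stub)
first = clf.classify("Invoice is wrong")
repeat = clf.classify("Invoice is wrong")
hard = clf.classify("App crashes on login")
print(first, repeat, hard, clf.calls)
```

The `calls` counter is the part worth keeping in a real service: it shows how much traffic the cache and the cheap model are absorbing, which is the number that justifies the extra code.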
Maintenance and debugging are the hidden cost
Visual tools look cheaper because the build is fast, but maintenance is where the bill actually lands. A 40-node Make scenario or a sprawling n8n graph is hard to diff, hard to code review, and hard to reason about when it breaks at 2am. Make has no real version control on the visual layer; n8n narrows the gap with JSON export and Git-backed source control. Custom code gets you proper testing, CI, and a stack trace, but only if the team maintaining it can read it. We have inherited n8n graphs that were genuinely faster to rewrite than to debug, and we have also seen custom services abandoned because the one engineer who understood them left.
The real question is not which tool is most powerful. It is which tool the team that owns this after us can keep running. We weight that heavily, and it is the single factor that most often overrides a pure cost calculation.
Scaling and reliability: where the guarantees live
At low volume every tool is reliable enough. The differences appear under load. Self-hosted n8n scales horizontally through queue mode, which moves executions onto a Redis queue picked up by separate worker processes; each worker defaults to a concurrency of 10, and n8n recommends 5 or higher per worker while watching your database connection pool. That gets you a long way, and it is genuinely production-grade when configured well.
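For a first guess at how many workers that implies, a back-of-the-envelope estimate is usually enough. The sketch below applies Little's law (work in flight = arrival rate × average duration) and divides by the per-worker concurrency. The peak factor of 3 and the function name are our assumptions; real sizing should come from load testing.

```python
import math


def workers_needed(executions_per_day: int,
                   avg_seconds_per_execution: float,
                   concurrency_per_worker: int = 10,
                   peak_factor: float = 3.0) -> int:
    """Rough worker count for a queue of executions (an estimate, not a guarantee)."""
    avg_rate_per_second = executions_per_day / 86_400
    # Assumption: peak traffic is peak_factor times the daily average.
    peak_in_flight = avg_rate_per_second * peak_factor * avg_seconds_per_execution
    return max(1, math.ceil(peak_in_flight / concurrency_per_worker))


light = workers_needed(executions_per_day=50_000, avg_seconds_per_execution=5)
heavy = workers_needed(executions_per_day=500_000, avg_seconds_per_execution=20)
print(f"light load: {light} worker(s), heavy load: {heavy} worker(s)")
```

Workers multiplied by concurrency is also the number of executions that can touch the database at once, so compare that product with your connection pool before raising either value.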
Custom code earns its keep when you need guarantees beyond throughput. A durable execution engine like Temporal gives exactly-once semantics for workflow logic, with at-least-once for the side-effecting activities it calls, which still must be idempotent. A Redis-backed queue like BullMQ gives at-least-once delivery and is the right fit for atomic, independently retryable jobs: send an email, process an upload, sync a record. Reach for durable execution when the work is process-shaped, multi-step, long-running, and must complete no matter what fails in between. No visual tool expresses that cleanly, which is the honest reason we drop to code for the load-bearing paths.
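At-least-once delivery means a handler can see the same job twice, so the side effect has to be safe to repeat. A minimal sketch of the usual defense, an idempotency key checked before the effect runs, is shown below. The class, the `idempotency_key` field, and the in-memory set are illustrative assumptions; none of this is BullMQ's or Temporal's API.

```python
from typing import Callable, List, Set


class IdempotentHandler:
    """Wraps a side effect so redelivered jobs do not repeat it."""

    def __init__(self, side_effect: Callable[[dict], None]):
        self.side_effect = side_effect
        # Assumption: in production this is a durable store with a unique
        # constraint, not process memory.
        self.completed: Set[str] = set()

    def handle(self, job: dict) -> bool:
        """Return True if the effect ran, False if the job was a duplicate."""
        key = job["idempotency_key"]
        if key in self.completed:
            return False
        self.side_effect(job)
        # A crash between the effect and this line can still cause a repeat.
        # Real systems close that gap with a transaction or by passing the
        # key to the downstream API.
        self.completed.add(key)
        return True


sent: List[dict] = []
handler = IdempotentHandler(sent.append)

job = {"idempotency_key": "welcome-email:user-42", "to": "user-42"}
first_delivery = handler.handle(job)
redelivery = handler.handle(job)  # the queue retries the same job

print(first_delivery, redelivery, len(sent))
```

The key should come from the business event (which email, to which user) and not from the queue's job ID, because a retried job may be re-enqueued under a new ID.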
How we mix all three in practice
Most production systems we build are not purist. A common pattern: n8n orchestrates the high-level flow and handles the dozen API integrations, then calls out to a small custom service for the one piece that needs real logic, heavy data processing, or an LLM call with strict prompt and cost control. This keeps the integration glue in a maintainable visual layer while the load-bearing logic lives in tested code. For RAG chatbots and AI agents specifically, we almost always put retrieval and model orchestration in custom code (where caching, token budgets, and evaluation matter) and use n8n only for the surrounding triggers and notifications.
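The custom half of that pattern can be very small. Below is a sketch of a service that a workflow's HTTP request step could POST to, written against the standard library's WSGI interface so it has no dependencies. The JSON shape, the function names, and the keyword-based `classify` placeholder are all assumptions for illustration; in a real build `classify` is where the LLM call or heavy processing would live.

```python
import json
from typing import Tuple


def classify(text: str) -> str:
    """Placeholder for the load-bearing logic (hypothetical keyword rule)."""
    return "billing" if "invoice" in text.lower() else "other"


def handle_request(method: str, raw_body: bytes) -> Tuple[str, bytes]:
    """Pure request handler: easy to unit test without starting a server."""
    if method != "POST":
        return "405 Method Not Allowed", b'{"error": "POST only"}'
    try:
        text = json.loads(raw_body)["text"]
        if not isinstance(text, str):
            raise TypeError("text must be a string")
    except (ValueError, KeyError, TypeError):
        return "400 Bad Request", b'{"error": "expected JSON with a text field"}'
    return "200 OK", json.dumps({"label": classify(text)}).encode("utf-8")


def app(environ, start_response):
    """Thin WSGI wrapper around handle_request."""
    length = int(environ.get("CONTENT_LENGTH") or 0)
    raw_body = environ["wsgi.input"].read(length)
    status, body = handle_request(environ.get("REQUEST_METHOD", ""), raw_body)
    start_response(status, [("Content-Type", "application/json"),
                            ("Content-Length", str(len(body)))])
    return [body]


# To try it locally (not run here):
#   from wsgiref.simple_server import make_server
#   make_server("127.0.0.1", 8080, app).serve_forever()

status, body = handle_request("POST", b'{"text": "Invoice overdue"}')
print(status, body.decode("utf-8"))
```

Keeping the logic in `handle_request` rather than in the WSGI function is the design choice that matters: the part that carries the business rules gets ordinary unit tests and code review, while the workflow tool only needs to know one URL and one JSON shape.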
We also separate prototype from production. Make or n8n is excellent for proving a workflow is worth building at all. Once it earns its keep and volume climbs, we know which parts to harden into code and which to leave visual.
A decision checklist we run before building
Before committing to a tool we answer five questions. Who owns this after launch, an operator or an engineer? How many runs a month, today and projected in a year? Does the logic fit clean branching, or does it need loops, state, and data reshaping? Are there reliability or latency guarantees the business actually depends on? And is an LLM or other metered API in the path, where model cost will dominate? The answers usually point at one tool without much debate. When they conflict, ownership wins, because a powerful system nobody can maintain is worth less than a modest one a team can keep alive.
Our recommendation
Start with the cheapest tool that can express the logic and is owned by the right person. Choose Make for fast, low-volume, SaaS-to-SaaS automation owned by an operator. Choose self-hosted n8n when you need code inside your workflows or you are scaling past Make’s per-module pricing. Choose custom code when the workflow is core, high-volume, or has reliability needs a visual tool cannot guarantee. Then revisit the decision at six months: volume and logic both grow, and the tool that was right at launch is often wrong at scale.
Send us your monthly run count, your trickiest step, and who maintains it after launch, and we will tell you which of the three fits, or build you the hybrid that splits the visual glue from the load-bearing code.