What does the system actually look like under the hood, and how does a non-developer go from 13 services to 7?
Part of the ForkIt! Case Study.
Three versions. Three layers of complexity. Each one added because the previous version couldn't do something users needed.
V1: Three pieces. The app talks to a Vercel backend. The backend talks to Google Places API. One button, one API call, one random restaurant.
Total services: 3. Monthly cost: Google API only.
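A minimal sketch of that single round trip as a Vercel serverless function. The file name, environment variable, and field mask are illustrative, and V1 may have used the legacy Places API rather than the Places API (New) shown here:

```javascript
// api/forkit.js (illustrative): one request in, one Google Places call, one random restaurant out.
export default async function handler(req, res) {
  const { lat, lng, radius = 5000 } = req.query;

  // Single Nearby Search (New) call; the API key stays server-side.
  const response = await fetch('https://places.googleapis.com/v1/places:searchNearby', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'X-Goog-Api-Key': process.env.GOOGLE_PLACES_API_KEY,
      'X-Goog-FieldMask': 'places.displayName,places.formattedAddress',
    },
    body: JSON.stringify({
      includedTypes: ['restaurant'],
      maxResultCount: 20,
      locationRestriction: {
        circle: { center: { latitude: Number(lat), longitude: Number(lng) }, radius: Number(radius) },
      },
    }),
  });

  const { places = [] } = await response.json();
  // One button, one call, one random pick.
  const pick = places[Math.floor(Math.random() * places.length)];
  res.status(200).json(pick ?? { error: 'No restaurants found' });
}
```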
V2: Fork Around sessions require real-time coordination (who joined, what filters they set, whether the host picked). Redis (Vercel KV) handles ephemeral session state with a 1-hour TTL.
The web joiner lets people without the app join from a browser. A static HTML page, no framework, no build step.
Total services: 5. Added: Redis, Web.
V3: User accounts need auth (Clerk). History and favorites need a database (Neon PostgreSQL). Subscriptions need payment infrastructure (RevenueCat). Crashes need reporting (Sentry). Code needs automated gates (GitHub Actions).
Total services: 13. Monthly cost: still Google API only; everything else was free tier.
V4: With fewer than 100 users, 13 services was overhead, not infrastructure. Each dependency cost money and maintenance whether there were 10 users or 10,000.
Clerk removed — no accounts needed. Subscriptions verified locally via the store. Neon removed — no database needed. History and favorites stored on-device with export/import. RevenueCat removed — react-native-iap talks to Apple and Google directly.
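A hedged sketch of what local verification can look like with react-native-iap; the product ID and helper name are placeholders, not the actual ForkIt! identifiers:

```javascript
// Illustrative only: restore purchases from the store and check for an active Pro product.
// 'forkit_pro_monthly' is a placeholder product ID.
import { initConnection, getAvailablePurchases, endConnection } from 'react-native-iap';

export async function hasProSubscription() {
  try {
    await initConnection();
    const purchases = await getAvailablePurchases(); // Apple/Google report the user's current entitlements
    return purchases.some((p) => p.productId === 'forkit_pro_monthly');
  } catch (e) {
    return false; // fail closed: treat store errors as "no subscription"
  } finally {
    await endConnection();
  }
}
```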
Tiered API field masks dropped free users from Enterprise billing ($35/1K calls, 1K free cap) to Pro billing ($0 within 5K free cap). Monthly API cost: $0.
Total services: 7. Monthly cost: $0. Same features, fewer dependencies, no ongoing bills.
Vercel Hobby plan caps serverless functions at 12. The backend hit that limit. When Claude refactored endpoints and renamed them, backward-compatible rewrites were needed so deployed apps wouldn't break.
Those rewrites still count toward the function limit. Every new endpoint means consolidating or removing an old one. V4's removal of auth, database, and quota endpoints freed up half the function slots.
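One way such backward-compatible rewrites can be implemented is a thin shim file that re-exports the new handler under the old path, which is why they still consume function slots. Both paths here are hypothetical:

```javascript
// api/search.js (hypothetical legacy path, kept so apps already in the field keep working).
// It only delegates to the renamed endpoint, yet it still deploys as its own
// serverless function and counts against the Hobby plan's 12-function cap.
export { default } from './restaurants/search.js';
```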
Sentry was added after users reported crashes. Not before. CI was added after a bad deploy broke production. Not before.
Lesson: set up observability on day 1, not after the problems are already live.
An LLM built the code. An LLM also builds bugs, billing exposures, dead code, and accessibility failures. The review suite exists because "trust, but verify" applies double when neither you nor your tool has built software before.
Run with one command: npm run review
Read every source file. Cite lines. Rate by severity.
Findings table by severity (CRITICAL / HIGH / MEDIUM / LOW). Top 5 issues to fix first. Areas that passed clean. Recommended fix order grouped by related changes.
The last full run found 12 HIGH findings across sync logic, account deletion, test coverage, and cost caps. Every one was filed as a GitHub Issue, triaged, and fixed in order.
The review suite is not a checkbox. It is the reason production hasn't broken since it was adopted. The code is LLM-generated. The quality gates are human-designed.
Four phases. Every feature, every bug fix, every session follows the same loop. The order matters: stability first, then polish, then new features.
Stability: crashes, data loss, security exploits, broken deploys. Nothing else moves until every Tier 1 issue is verified closed on a physical device via USB debugging.
Polish: theme inconsistencies, font scaling, layout breaks, misaligned elements. These affect trust. A polished app signals care.
Features: new functionality only after stability and polish are clean. Features are scoped to a single session when possible. If a bug surfaces mid-feature, it gets filed as an issue, not fixed inline (prevents rabbit trails).
Review: run the full 31-review suite. Automated checks first (npm run review), then manual deep-dives. New findings become GitHub Issues, triaged by severity.
Reviews always surface new issues. Those go back into the build queue at the appropriate tier. The cycle repeats until the review comes back clean enough to ship.
Never submit store builds without testing locally first. Commit, build local, test on a physical device, THEN submit. This rule exists because of the Annapolis demo: a backend update deployed without the app being tested against it. The app broke at a restaurant. Three users lost.
The guiding principle: "as free as possible." If it weren't for Google API costs, this app might not charge at all. The Pro tier exists to offset infrastructure, not as a revenue model.
Monthly operating cost
V3 (13 services)
| Service | Monthly cost |
| --- | --- |
| Google Places API | ~$49/mo (Enterprise tier) |
| Vercel (backend + web) | Free tier |
| Neon (PostgreSQL) | Free tier |
| Vercel KV (Redis) | Free tier |
| Clerk (auth) | Free tier |
| RevenueCat (IAP) | Free tier |
| Sentry (crash reporting) | Free tier |
| GitHub Actions (CI) | Free tier |
| Total | ~$49/mo |
V4 (7 services)
| Service | Monthly cost |
| --- | --- |
| Google Places API | $0 (tiered field masks, within Pro free cap) |
| Vercel (backend + web) | Free tier |
| Vercel KV (Redis) | Free tier |
| Sentry (crash reporting) | Free tier |
| GitHub Actions (CI) | Free tier |
| Neon, Clerk, RevenueCat | Removed in V4 |
| Total | $0/mo |
The single most impactful optimization. Instead of calling Google Places on every tap, the first "Fork It" tap fetches a full pool of results. Every subsequent tap picks randomly from the cached pool locally, with zero API calls.
Before: user taps 5 times = 5 API calls.
Every re-roll hits Google Places. Costs scale linearly with user engagement.
After: user taps 5 times = 1 API call.
First tap fetches the pool. Re-rolls pick locally. Cache invalidates on filter change or after 8 hours.
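A minimal sketch of the idea (the real implementation is readable in forkit-open's utils/api.js and App.js; the names and shapes here are simplified):

```javascript
// Simplified pool cache: fetch once, re-roll locally.
const POOL_TTL_MS = 8 * 60 * 60 * 1000; // 8 hours, matching the invalidation window above
let pool = null; // { key, fetchedAt, places }

export async function forkIt(filters, fetchPool) {
  const key = JSON.stringify(filters); // any filter change produces a new key
  const stale = !pool || pool.key !== key || Date.now() - pool.fetchedAt > POOL_TTL_MS;

  if (stale) {
    // First tap (or invalidated cache): one API call fetches the whole pool.
    pool = { key, fetchedAt: Date.now(), places: await fetchPool(filters) };
  }

  // Every subsequent tap is a local random pick, zero API calls.
  return pool.places[Math.floor(Math.random() * pool.places.length)];
}
```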
The pool cache is one of several caching layers; each one reduced per-search cost or eliminated unnecessary calls entirely.
The client-side pool cache and autocomplete cache are readable in forkit-open (utils/api.js, App.js). The server-side Redis layer stays closed.
Caching strategy isn't set-and-forget. After researching alternative API providers (Foursquare, OSM/Overpass, HERE, Apple MapKit), the conclusion was clear: no alternative provides ratings, price level, and real-time "open now" data. Google is the only viable source for the filters users depend on. That meant cost reduction had to come from smarter caching, not a provider switch.
The first version used a flat 1-hour server cache and coarse 11km location buckets for everything. That worked for drive mode but was problematic for walk mode: a user searching within 0.25 miles could get results cached from a point 7 miles away. The fix: radius-aware precision. Searches under 5km (all of walk mode plus the shortest drive radius) use tight 1.1km location buckets. Longer drives keep the coarse grid for better cache sharing.
Server cache TTL was also extended from 1 hour → 3 hours → 8 hours as usage patterns showed restaurant data doesn't change mid-meal-decision. The details call for the picked restaurant always hits Google live, so stale pool data only affects which restaurants appear, not the info shown for the one that gets selected.
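A sketch of how radius-aware bucketing can be expressed in the cache key: rounding coordinates to one decimal place of a degree gives roughly 11 km cells, two decimal places roughly 1.1 km (the key format and mechanism are illustrative, not the actual server code):

```javascript
// Illustrative server-side cache key: tighter location buckets for short-radius searches.
// One decimal place of latitude is ~11 km; two decimal places is ~1.1 km.
const CACHE_TTL_SECONDS = 8 * 60 * 60; // extended 1h -> 3h -> 8h as usage data came in

function searchCacheKey({ lat, lng, radiusMeters, filters }) {
  const precision = radiusMeters < 5000 ? 2 : 1; // walk mode (and the shortest drive radius) gets the tight grid
  const bucketLat = Number(lat).toFixed(precision);
  const bucketLng = Number(lng).toFixed(precision);
  return `search:${bucketLat},${bucketLng}:r${radiusMeters}:${JSON.stringify(filters)}`;
}
```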
Google killed the $200/month free credit in March 2025. The app hit Enterprise-tier pricing without warning. V4's response: stop requesting Enterprise fields for free users. Tiered field masks dropped costs from ~$49/month to $0. The lesson: cost architecture should be designed up front, not bolted on after the bill arrives.
The V4 Phase 1 prediction was specific: route free users to Pro field-mask billing, gate Enterprise-field-requiring filters behind paid tiers. April was the first full month of real billing data against that prediction.
Q1 logged 7,374 Enterprise-tier calls. April logged 132. Pro-tier calls absorbed the free traffic: 623 calls. Net cost: $0.00, within the Pro free cap.
The interesting bit isn't the cost number; it's where the split lives. Pro vs Enterprise isn't a different endpoint, a different SDK, or a different Google API. It's the per-request field mask. One backend, one Google API, two billing tiers, decided per-call by which fields the request asks for. Filters that need Enterprise fields (like real-time "open now") get gated to paid tiers; everything else stays in the Pro mask. The router is the request body, not the URL.
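A sketch of that per-request split. The only thing that differs between tiers is the X-Goog-FieldMask header; the field lists below are illustrative, and which field lands in which Google SKU follows Google's pricing docs, not anything ForkIt! controls:

```javascript
// Illustrative field masks per app tier. Same URL, same body, same backend;
// the billing SKU is decided by which fields the mask requests.
const FIELD_MASKS = {
  // Free app users: Pro-billable fields only (enough for name, address, location).
  free: 'places.id,places.displayName,places.formattedAddress,places.location',
  // Paid app users: adds Enterprise-billable fields such as real-time opening hours ("open now").
  paid: 'places.id,places.displayName,places.formattedAddress,places.location,places.currentOpeningHours,places.rating,places.priceLevel',
};

async function searchPlaces(body, appTier) {
  const res = await fetch('https://places.googleapis.com/v1/places:searchNearby', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'X-Goog-Api-Key': process.env.GOOGLE_PLACES_API_KEY,
      'X-Goog-FieldMask': FIELD_MASKS[appTier], // the router is this header, not the URL
    },
    body: JSON.stringify(body),
  });
  return res.json();
}
```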
The backend API had no rate limiting or origin checking for weeks. No user data was at risk (the search endpoint doesn't store or transmit personal data), but anyone who found the URL could have run up the Google Places API bill on my account. An LLM built it. An LLM didn't flag the gap. I didn't know to check.
No rate limiting. No CORS restrictions. No origin checking. The Google API key was server-side (good), but the endpoints that used it were open to the internet (bad). The risk was financial, not personal: someone could have run up the API bill, not accessed user data.
The security hardening took multiple sessions spread across weeks. None of it was in the original design. All of it should have been. The review suite (specifically Reviews 7, 19, and 21) now catches these patterns before they ship.
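A minimal sketch of the kind of guardrails that were missing, assuming Vercel KV holds the counters; the header names, limits, and allowed origins are illustrative:

```javascript
// Illustrative per-IP rate limit plus origin check for a Vercel serverless endpoint.
import { kv } from '@vercel/kv';

const ALLOWED_ORIGINS = ['https://forkaround.io']; // web joiner; native apps send no Origin header
const LIMIT_PER_MINUTE = 30;

export async function guard(req, res) {
  const origin = req.headers.origin;
  if (origin && !ALLOWED_ORIGINS.includes(origin)) {
    res.status(403).json({ error: 'Forbidden origin' });
    return false;
  }

  // Fixed-window rate limit keyed by caller IP.
  const ip = req.headers['x-forwarded-for']?.split(',')[0] ?? 'unknown';
  const windowKey = `ratelimit:${ip}:${Math.floor(Date.now() / 60000)}`;
  const count = await kv.incr(windowKey);
  if (count === 1) await kv.expire(windowKey, 60);
  if (count > LIMIT_PER_MINUTE) {
    res.status(429).json({ error: 'Too many requests' });
    return false;
  }
  return true;
}
```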
Fork Around lets a group of friends collectively pick a restaurant. One host, up to 8 participants, 4-letter session code, browser or app. Here is how it works at the system level.
Create: /api/group/create. Backend generates a 4-letter code and stores the session in Redis with a 1-hour TTL. The session is also saved to AsyncStorage so the host can reconnect (sketched after this list).
Join: /api/group/join with the code and a display name. App users join in-app; browser users join via the web joiner at forkaround.io/group/.
Pick: /api/group/pick. Backend merges all filters (most restrictive wins), searches Google Places with the merged criteria, and picks one randomly.
Leave: /api/group/leave.
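A sketch of the create step under those constraints; the code alphabet, Redis key names, and session shape are illustrative:

```javascript
// Illustrative /api/group/create handler: 4-letter code, Redis session, 1-hour TTL.
import { kv } from '@vercel/kv';

const LETTERS = 'ABCDEFGHJKLMNPQRSTUVWXYZ'; // illustrative alphabet

export default async function handler(req, res) {
  const { hostName, filters } = req.body;

  // Generate a 4-letter session code; retry on the rare collision with a live session.
  let code;
  do {
    code = Array.from({ length: 4 }, () => LETTERS[Math.floor(Math.random() * LETTERS.length)]).join('');
  } while (await kv.exists(`group:${code}`));

  const session = { host: hostName, participants: [{ name: hostName, filters }], picked: null };
  await kv.set(`group:${code}`, session, { ex: 60 * 60 }); // ephemeral: 1-hour TTL

  res.status(200).json({ code });
}
```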
If Alice wants a 5-mile radius and Bob wants 2 miles, the search uses 2 miles. If one person sets a $$ max price, no $$$ results appear.
This is deliberate. The alternative (union/broadest) returns results that someone actively excluded. The most restrictive merge respects everyone's constraints, even if it narrows the pool.
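A sketch of the most-restrictive merge; the filter field names are illustrative:

```javascript
// Illustrative merge: the most restrictive value wins for each filter, so nobody
// gets a result they actively excluded.
function mergeFilters(participants) {
  return participants.reduce(
    (merged, { filters }) => ({
      radiusMiles: Math.min(merged.radiusMiles, filters.radiusMiles), // Alice 5 mi + Bob 2 mi -> 2 mi
      maxPrice: Math.min(merged.maxPrice, filters.maxPrice),          // a $$ cap excludes $$$ results
      openNow: merged.openNow || filters.openNow,                     // anyone requiring "open now" makes it required
    }),
    { radiusMiles: Infinity, maxPrice: Infinity, openNow: false }
  );
}
```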
A standalone HTML page. No framework, no build step, no app install required. The host shares a link; the friend opens it in any browser.
It calls the same backend endpoints as the app. No duplicate logic. The web joiner is a thin UI layer over the same API.
Vercel serverless functions are stateless. They spin up, execute, and die. WebSockets require persistent connections, which means a dedicated server (added cost, added complexity).
Instead: polling. The app and web joiner poll /api/group/status every few seconds. It is not elegant. It works. It costs nothing. At 8 participants max and 1-hour sessions, the polling load is negligible.
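A sketch of the polling loop shared by the app and the web joiner; the interval and status-endpoint shape are illustrative:

```javascript
// Illustrative status polling: good enough for 8 participants and 1-hour sessions.
const POLL_INTERVAL_MS = 3000;

function pollGroupStatus(code, onUpdate) {
  const timer = setInterval(async () => {
    const res = await fetch(`/api/group/status?code=${code}`);
    if (res.status === 404) { clearInterval(timer); return; } // session expired (1-hour TTL)
    onUpdate(await res.json()); // who joined, merged filters, whether the host picked
  }, POLL_INTERVAL_MS);
  return () => clearInterval(timer); // caller stops polling on leave/unmount
}
```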
Three things that bit (or almost bit) the system in v4.2 and v4.3, and the analytics tool that surfaces what real users are doing. Each one is a specific incident, not an abstraction.
Two version values exist. They bump on different cadences. Confusing them silently breaks OTA delivery.
app.json version is the binary version (CFBundleShortVersionString on iOS, versionName on Android). Because runtimeVersion.policy is "appVersion", this string IS the OTA runtime version. Bumping it orphans every existing OTA bundle: the new value won't match any deployed binary's runtime, so the OTA never reaches users. It bumps only immediately before cutting a new EAS build for the stores. Three-segment MAJOR.MINOR.PATCH.
constants/config.js APP_VERSION is the value sent in the X-App-Version request header and shown on the in-app version line. Free to update on every OTA. Two-segment MAJOR.MINOR.
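The two values side by side. The keys are standard Expo conventions; the version numbers are illustrative:

```javascript
// app.json (binary version; with runtimeVersion.policy "appVersion" this IS the OTA runtime version):
// {
//   "expo": {
//     "version": "4.3.1",                          // bump ONLY right before cutting a new EAS store build
//     "runtimeVersion": { "policy": "appVersion" }
//   }
// }

// constants/config.js (display/header version; safe to bump on every OTA):
export const APP_VERSION = '4.3'; // sent as the X-App-Version header and shown on the in-app version line
```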
Discovered the hard way: an OTA published successfully (the EAS dashboard confirmed it) and never reached users. The runtime mismatch was silent on both sides.
The platform-side framing of the OTA contract (channel, runtime, store-review interaction) lives on the platform deep-dive.
The "Skip the Chains" filter originally trusted Google Places review counts as a chain proxy: 100+ reviews implied chain. False negatives proliferated. Small chains (regional fast-casual, coffee mini-chains) slipped through with low review counts and showed up in Hidden Gems results. The user-facing impact was bad enough to land a 3-star review.
v4.3 dropped the heuristic entirely. The replacement is a curated 768-entry keyword list, expanded from 159 via Wikidata seeding (chain-restaurant entities + their alternate labels). Pure name-match against the list. No review-count gate.
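A sketch of the pure name-match; the matching rule and list shape are simplified, and the real list has 768 curated entries:

```javascript
// Illustrative chain check: case-insensitive name match against the curated keyword list.
// No review-count heuristic involved.
const CHAIN_KEYWORDS = ['mcdonald', 'starbucks', 'chipotle' /* ...the rest of the 768-entry list */];

function isChain(placeName) {
  const name = placeName.toLowerCase();
  return CHAIN_KEYWORDS.some((keyword) => name.includes(keyword));
}

// "Skip the Chains" / Hidden Gems then simply filters matches out of the pool.
const hiddenGems = (places) => places.filter((p) => !isChain(p.name));
```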
The maintenance loop is the part that matters. A report-as-chain flag icon sits on result cards; reports flow to a backend endpoint and a weekly cron pipeline proposes new candidates as a GitHub Issue. The list grows from real misses, not from guessing. Closes the loop from "user sees a chain in Hidden Gems" to "filter learns about it" without a rebuild or store submission.
scripts/analytics is a local CLI that pulls Apple sales reports, subscription events, and Google Play stats from gs://pubsite_prod_<DEVELOPER_ID>/. Run with ./monthly-report.sh [current | YYYY-MM]. Output is a single monthly summary that ties units, revenue, renewals, refunds, and Play installs together.
Apple side: the monthly sales report finalizes about 5 days after month end; subscription events are daily and include renewals, refunds, cancels, and upgrades that the sales report doesn't surface. Both pulled via the App Store Connect API.
Google side: bucket access requires Application Default Credentials (gcloud auth application-default login), not the Play Console service account. Service account permissions don't immediately propagate to the bucket (24+ hours in practice); ADC works immediately and uses the developer's own identity. This was the single largest setup-time cost.
The contract-level framing of Apple vs Google sales reporting (what each store guarantees, what each report contains, finalization windows) lives on the platform deep-dive.
The architecture is not clever. It peaked at 13 services, added optimistically as each version needed something new. V4 stripped it back to 7: the services that actually earn their place. The 31-point review suite and a dev cycle that prioritizes stability over speed hold it together.
Every service was added because the previous version couldn't do something users needed. Three were removed when it became clear that fewer than 100 users didn't need accounts, a database, or a third-party payment layer. The constraint ("as free as possible") drove the design better than any architecture diagram could have.
The architecture isn't clever. It's cheap, observable, and it ships.