Configuration
Environment Variables
All configuration is done through environment variables in the `.env` file. At least one provider must be configured (Gemini, Groq, or Ollama); the route returns 503 otherwise.
| Variable | Required | Description |
|---|---|---|
| `GEMINI_API_KEY` | One of these | Google AI API key (Gemma 3 27B) |
| `GROQ_API_KEY` | One of these | Groq API key (Llama 3.3 70B) |
| `OLLAMA_BASE_URL` | One of these | Base URL of a local Ollama daemon (e.g. `http://127.0.0.1:11434`) |
| `OLLAMA_MODEL` | Optional | Ollama model tag, defaults to `llama3.2`. Use any tag from `ollama list`. |
| `OLLAMA_API_KEY` | Optional | Bearer token sent as `Authorization: Bearer {key}` on every Ollama request. Only needed if your Ollama is behind auth. |
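
The "at least one provider" rule can be sketched as a small check. This is a minimal illustration, not the project's actual code: `configuredProviders` and `statusForConfig` are hypothetical names, and treating blank values as unset for the provider variables is an assumption.

```typescript
// Hypothetical helpers; only the variable names come from the table above.
type Env = Record<string, string | undefined>;

const PROVIDER_VARS = ["OLLAMA_BASE_URL", "GEMINI_API_KEY", "GROQ_API_KEY"] as const;

// Returns the provider variables that hold a non-empty value.
// Assumption: blank or whitespace-only values count as "not set".
function configuredProviders(env: Env): string[] {
  return PROVIDER_VARS.filter((name) => (env[name] ?? "").trim() !== "");
}

// Mirrors the documented behaviour: 503 when no provider is configured.
function statusForConfig(env: Env): number {
  return configuredProviders(env).length > 0 ? 200 : 503;
}
```

In a Node process this could be called as `statusForConfig(process.env)` at startup to catch a missing key before the first request.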
Provider Priority
The LLM chain composes from whatever’s configured in env. Ordering is fixed:
- Ollama (`OLLAMA_BASE_URL`), local first when configured
- Gemma 3 27B via Google (`GEMINI_API_KEY`)
- Llama 3.3 70B via Groq (`GROQ_API_KEY`)
If a provider fails (timeout, rate limit, malformed response), the system automatically tries the next one. Because each provider uses a separate credential, their quotas are completely independent. Self-hosters who want a fully offline scanner should set only OLLAMA_BASE_URL and leave the cloud keys unset.
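
The fallback behaviour described above can be sketched as a loop over an ordered provider list. This is an illustration under assumptions, not the project's implementation: the `Provider` and `ChainResult` shapes and the `runChain` name are hypothetical.

```typescript
// A provider is modelled as a name plus an async call; both are illustrative.
interface Provider {
  name: string;
  analyze: (resume: string) => Promise<string>;
}

interface ChainResult {
  provider: string | null; // null means every provider failed
  output: string | null;
  errors: string[];
}

// Try each provider in the fixed order and return the first success.
async function runChain(providers: Provider[], resume: string): Promise<ChainResult> {
  const errors: string[] = [];
  for (const p of providers) {
    try {
      const output = await p.analyze(resume);
      return { provider: p.name, output, errors };
    } catch (err) {
      // Timeout, rate limit, malformed response: record it and move on.
      errors.push(`${p.name}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  return { provider: null, output: null, errors };
}
```

Collecting the error messages rather than discarding them makes it easier to see why a request ended up on a later provider.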
Running Locally with Ollama
For privacy-first deployments where every byte of the resume stays on your machine:

```bash
# install ollama from https://ollama.com and pull a model
ollama pull llama3.2

# in your .env (or as shell vars before pnpm dev):
OLLAMA_BASE_URL=http://127.0.0.1:11434
OLLAMA_MODEL=llama3.2

# leave GEMINI_API_KEY / GROQ_API_KEY unset for offline-only mode
```

The Ollama path uses Ollama’s `format: 'json'` so the model returns strict JSON without prompt-engineering tricks. First scan is slow on commodity hardware (60-120s for `llama3.2:3b` on a typical laptop); subsequent scans of the same resume hit the in-memory result cache and return in <100ms. Bigger models produce noticeably better suggestions but take longer.
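
To make the `format: 'json'` setting concrete, here is a rough sketch of what such a request could look like. It assumes Ollama's `/api/generate` endpoint with streaming disabled; which endpoint the project actually calls is not stated here, and `buildOllamaRequest` is a hypothetical helper.

```typescript
interface OllamaRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Builds (but does not send) a JSON-mode request for an Ollama daemon.
// Assumption: the /api/generate endpoint is used.
function buildOllamaRequest(baseUrl: string, model: string, prompt: string): OllamaRequest {
  return {
    // Strip trailing slashes so the base URL joins cleanly with the path.
    url: `${baseUrl.replace(/\/+$/, "")}/api/generate`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      // format: "json" asks Ollama to constrain the output to valid JSON.
      body: JSON.stringify({ model, prompt, format: "json", stream: false }),
    },
  };
}
```

The returned `url` and `init` can be passed straight to `fetch(url, init)`.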
The /api/analyze response includes _provider: "ollama-{model}" so you can confirm requests are landing locally and not falling back to a cloud key you forgot to remove.
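
A quick way to act on that field is a small guard like the one below. It is a sketch: only the `_provider` field and its `ollama-` prefix come from the text above, while the `AnalyzeResponse` type and the `servedLocally` name are hypothetical.

```typescript
// Only the field described above is modelled; the real response has more fields.
interface AnalyzeResponse {
  _provider?: string;
}

// True when the response reports an Ollama model as its provider.
function servedLocally(res: AnalyzeResponse): boolean {
  return typeof res._provider === "string" && res._provider.startsWith("ollama-");
}
```

For an offline-only deployment, logging a warning whenever `servedLocally` returns `false` would surface an accidental cloud fallback early.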
Behind a reverse proxy or auth gate
Vanilla `ollama serve` on 127.0.0.1 has no authentication, which is fine for a local-only setup. If your Ollama lives behind a reverse proxy that requires a bearer token, or you’re pointing at a hosted Ollama-compatible endpoint (OpenWebUI, LiteLLM, OpenRouter’s Ollama-compatible routes, a Cloudflare-tunneled daemon with a service token, etc.), set `OLLAMA_API_KEY` and the request will include `Authorization: Bearer {key}` on every call:

```bash
# in your .env
OLLAMA_BASE_URL=https://ollama.your-domain.tld
OLLAMA_MODEL=llama3.2
OLLAMA_API_KEY=sk-your-proxy-token
```

The header is only attached when the env var is non-empty, so leaving it unset keeps the request shape identical to the local-only setup. Empty or whitespace-only values are treated as not set so a stray `OLLAMA_API_KEY=` line in `.env` does not produce a malformed `Authorization: Bearer` header that the proxy would reject.
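
The conditional header logic can be sketched as follows. The `ollamaHeaders` name is hypothetical, and both the `Content-Type` header and trimming the key before sending it are assumptions made for the example.

```typescript
// Builds request headers, attaching Authorization only for a usable key.
function ollamaHeaders(apiKey: string | undefined): Record<string, string> {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  // Empty and whitespace-only values are treated as "not set".
  const key = (apiKey ?? "").trim();
  if (key !== "") {
    headers["Authorization"] = `Bearer ${key}`;
  }
  return headers;
}
```

With this shape, `ollamaHeaders(undefined)` and `ollamaHeaders("   ")` produce identical headers, so a stray empty line in `.env` behaves the same as no line at all.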
Authentication
How users sign in (or whether they sign in at all) is a separate choice from the LLM provider, and it’s also driven by environment variables. ATS Screener supports three modes, picked automatically:
- Anonymous: leave Firebase and LDAP unset. The scanner is open and history is local. This is the default.
- Firebase: set the `PUBLIC_FIREBASE_*` variables for Google / email sign-in and synced history.
- Active Directory: set `LDAP_URL` for on-premise AD sign-in.
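
One way to picture the automatic selection is the sketch below. It is illustrative only: `detectAuthMode` is a hypothetical name, requiring all six Firebase variables is inferred from this page, and the precedence when both LDAP and Firebase are configured is not documented here, so giving LDAP priority is purely an assumption.

```typescript
type Env = Record<string, string | undefined>;
type AuthMode = "anonymous" | "firebase" | "ldap";

const FIREBASE_VARS = [
  "PUBLIC_FIREBASE_API_KEY",
  "PUBLIC_FIREBASE_AUTH_DOMAIN",
  "PUBLIC_FIREBASE_PROJECT_ID",
  "PUBLIC_FIREBASE_STORAGE_BUCKET",
  "PUBLIC_FIREBASE_MESSAGING_SENDER_ID",
  "PUBLIC_FIREBASE_APP_ID",
] as const;

// Picks a sign-in mode from the environment.
function detectAuthMode(env: Env): AuthMode {
  const isSet = (name: string): boolean => (env[name] ?? "").trim() !== "";
  // Assumption: LDAP takes precedence if both LDAP and Firebase are configured.
  if (isSet("LDAP_URL")) return "ldap";
  if (FIREBASE_VARS.every(isSet)) return "firebase";
  return "anonymous"; // the documented default
}
```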
See Authentication for the full comparison and the Active Directory guide for AD setup. The Firebase variables are listed below.
```bash
# self-host without firebase: leave every PUBLIC_FIREBASE_* var unset (the default).
# self-host with firebase: set all six.
PUBLIC_FIREBASE_API_KEY=...
PUBLIC_FIREBASE_AUTH_DOMAIN=your-project.firebaseapp.com
PUBLIC_FIREBASE_PROJECT_ID=your-project
PUBLIC_FIREBASE_STORAGE_BUCKET=your-project.appspot.com
PUBLIC_FIREBASE_MESSAGING_SENDER_ID=1234567890
PUBLIC_FIREBASE_APP_ID=1:1234567890:web:abc
```

Free Tier Limits
| Provider | Model | RPM | RPD | TPM | Cost |
|---|---|---|---|---|---|
| Google | Gemma 3 27B | 30 | 14,400 | 15K | Free |
| Groq | Llama 3.3 70B | 1000 | 14,400 | 12K | Free |
Both providers block at their limits and never auto-charge. You cannot accidentally incur costs.
For the latest limits, see each provider’s official documentation.
Rate Limiting
Rate limiting is configured in `src/routes/api/analyze/+server.ts`:

```ts
const RATE_LIMIT = {
  maxPerMinute: 10,
  maxPerDay: 200
};
```

Adjust these values based on your expected traffic and API key limits.
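
For intuition about how the two limits interact, here is a minimal in-memory, fixed-window limiter. It is a sketch, not the code in `+server.ts`: the `allowRequest` function, the per-client `Counter` shape, and the fixed-window strategy are all assumptions.

```typescript
const RATE_LIMIT = { maxPerMinute: 10, maxPerDay: 200 };

interface Counter {
  minuteStart: number;
  minuteCount: number;
  dayStart: number;
  dayCount: number;
}

const counters = new Map<string, Counter>();

// Returns true if the client may make a request at time `now` (ms since epoch).
function allowRequest(clientId: string, now: number): boolean {
  const c = counters.get(clientId) ??
    { minuteStart: now, minuteCount: 0, dayStart: now, dayCount: 0 };
  // Start a fresh window once the previous one has fully elapsed.
  if (now - c.minuteStart >= 60_000) { c.minuteStart = now; c.minuteCount = 0; }
  if (now - c.dayStart >= 86_400_000) { c.dayStart = now; c.dayCount = 0; }
  counters.set(clientId, c);
  if (c.minuteCount >= RATE_LIMIT.maxPerMinute || c.dayCount >= RATE_LIMIT.maxPerDay) {
    return false; // rejected requests do not consume quota
  }
  c.minuteCount += 1;
  c.dayCount += 1;
  return true;
}
```

Passing `now` explicitly instead of calling `Date.now()` inside the function keeps the limiter easy to test. Note that an in-memory map resets whenever the server process restarts.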
Timeouts
Each provider has its own timeout. Vercel Fluid Compute is enabled by default and allows up to 300 seconds on the Hobby plan:

```ts
// Gemma: 90s, Groq: 30s → worst case total: 120s
const PROVIDER_TIMEOUTS_MS = [90_000, 30_000];
```

Gemma 3 27B typically takes 30-45 seconds for the full scoring prompt but can spike under load. The 90s timeout gives generous headroom. Groq responds in under 1 second but gets 30s for safety. If both providers fail, the system falls back to rule-based scoring on the client side.
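
The worst-case arithmetic and a per-provider timeout can be sketched like this. The `worstCaseMs` and `withTimeout` helpers are hypothetical, and racing against a timer is just one possible approach; unlike an `AbortController`, it does not cancel the underlying request.

```typescript
const PROVIDER_TIMEOUTS_MS = [90_000, 30_000]; // Gemma, Groq (values from this section)

// Worst case: each provider runs until its timeout before the next one starts.
function worstCaseMs(timeouts: number[]): number {
  return timeouts.reduce((sum, t) => sum + t, 0);
}

// Rejects if `work` has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}
```

Here `worstCaseMs(PROVIDER_TIMEOUTS_MS)` evaluates to 120,000 ms, which leaves 180 seconds of headroom under the 300-second limit.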