Crucible

Documentation

Learn Crucible

Everything you need to set up and run your autonomous research engine.

Quick Start

Quick Start Guide

Zero to first research cycle in under 10 minutes.

Crucible needs a language model. The easiest free option is Ollama: download it from ollama.ai, then run `ollama pull gemma3:4b` in a terminal. If you already have an OpenAI, Anthropic, Google, or OpenRouter API key, skip this step.

Double-click the Crucible app (or run `py datagrabber.py serve` from the command line). Your browser opens to http://localhost:8000.

Click "Continue without account" to start immediately on the Free plan, or sign up with an email and password if you want to upgrade later.

Go to Settings. For Ollama: set provider to ollama, model to gemma3:4b, base URL to http://localhost:11434. For cloud APIs: pick your provider, paste your key. Click Test Connection to verify.

Go to the Interests page. Type what you want to research in plain English (e.g. "Global energy markets, AI regulation, semiconductor supply chains"). Click Generate -- the LLM converts it into structured research topics.

Go to the Dashboard and click Run Now. In 5–15 minutes you’ll have your first batch of articles -- graded, summarized, and filed by topic.
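
If Test Connection fails for Ollama, the same settings can be checked outside the app. The sketch below is not part of Crucible. It assumes Ollama's standard `/api/tags` endpoint, which lists the models pulled on the local machine, and the base URL from the Settings step. The function names are illustrative only.

```python
import json
import urllib.request

OLLAMA_BASE_URL = "http://localhost:11434"  # base URL from the Settings step


def installed_models(tags_payload):
    """Extract model names from an Ollama /api/tags response body."""
    return [model["name"] for model in tags_payload.get("models", [])]


def fetch_tags(base_url=OLLAMA_BASE_URL, timeout=5.0):
    """GET /api/tags from a running Ollama server and decode the JSON body.

    Raises OSError (for example, connection refused) if Ollama is not running.
    """
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return json.load(resp)
```

With Ollama running, `"gemma3:4b" in installed_models(fetch_tags())` should be true once the model has been pulled. A connection error usually means `ollama serve` has not been started.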

User Guide

Every feature in Crucible, explained.

Your home base. Shows total documents, topic breakdown with counts, average quality score, source count, token usage per cycle, recent documents with clickable titles, and a live activity feed.

Browse collected articles by topic. Each article has a title, source URL, quality score (1–10), topic, LLM-written summary, tags, and full cleaned text. Delete documents that don’t belong.

Find any article by keyword. Full-text search covers titles, summaries, and content. Results return quickly because queries run against a SQLite FTS5 index.
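
For readers curious how an FTS5 index behaves, here is a minimal sketch using Python's built-in `sqlite3` module. The table name, columns, and sample rows are assumptions for illustration. Crucible's actual schema is not documented here and may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one FTS5 table covering the three searched fields.
conn.execute("CREATE VIRTUAL TABLE docs USING fts5(title, summary, content)")
conn.executemany(
    "INSERT INTO docs (title, summary, content) VALUES (?, ?, ?)",
    [
        ("Chip shortage eases", "Foundries add capacity.", "Semiconductor lead times fell."),
        ("Grid storage outlook", "Battery costs decline.", "Utilities expand energy storage."),
    ],
)


def search(conn, query, limit=10):
    """Return titles of matching rows, best match first (FTS5 rank order)."""
    rows = conn.execute(
        "SELECT title FROM docs WHERE docs MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return [title for (title,) in rows]
```

A call such as `search(conn, "semiconductor")` matches the term in any of the three columns, and matching is case-insensitive under the default tokenizer.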

Every website Crucible has scraped, with trust tiers (high/medium/low), doc counts, quality averages, and success rates. Promote good domains, block bad ones, unblock when needed.

See how your vault is organized. Each topic shows its document count. Edit the taxonomy directly to add, remove, or restructure categories.

Your structured research goals. Add, remove, or regenerate topics any time. Changes take effect on the next cycle.

Visual map of how documents connect: causal links, contradictions, thematic ties, temporal chains. Explore clusters, browse narrative chains, and view edge statistics. Use Reconnect after cycles to discover new edges.

Log of every collection cycle: documents fetched/kept/rejected, token usage, estimated cost, and errors. Click a run for step-by-step details.

Configure LLM provider and model, temperature, context window, per-step model overrides, scrape delays, blocked domains, and research templates. Test Connection verifies your LLM before you run.

Pre-built profiles for Geopolitical Intelligence, Business Intelligence, Academic Research, and Sports Analytics. Load one to populate your goals, topics, and feeds instantly.

Start fresh: deletes all documents, index, graph, and activity -- but keeps your configuration, interests, and templates.

Your vault is plain markdown files. Open the folder in Obsidian for advanced note-taking alongside collected research. Wiki-links between related articles work natively.

AI Providers

Configure local or cloud language models.

For fully private inference, install Ollama and pull a model (e.g. `ollama pull llama3`). Configure the Ollama endpoint in Crucible settings. Requires an NVIDIA GPU with 4+ GB VRAM.

Add your OpenAI API key in settings. Crucible uses GPT models for document evaluation, summarization, and knowledge graph construction. Per-step model overrides let you use cheaper models for bulk tasks.

Add your Anthropic API key in settings. Claude models are used for document analysis and relationship extraction. Recommended for high-quality knowledge graph edges.

Search

Search Providers

Where Crucible finds your data.

Available on all plans. Privacy-respecting web search with no API key required. Good for general research topics.

Available on Starter and above. Requires a Brave Search API key. Provides independent web index results that complement DuckDuckGo.

Available on Pro and above. Meta-search engine that queries dozens of search engines simultaneously through a privacy-respecting proxy. Self-host it or use a public instance.

Available on Pro and above. Direct integration with the arXiv API for discovering academic papers. Supports citation extraction and structured metadata.

Vault

Vault Structure

How your research is stored and connected.

Every document is stored as a markdown file with YAML frontmatter containing metadata: source URL, quality score, categories, timestamps, and relationship data. Fully compatible with Obsidian.
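
As an illustration of the file layout, the sketch below splits a markdown document into its frontmatter and body using only the standard library. The field names in `SAMPLE` are hypothetical, since the exact keys Crucible writes are not listed here. The parser handles flat `key: value` lines only, so a full YAML library would be needed for lists or nested values.

```python
SAMPLE = """\
---
source_url: https://example.com/article
quality_score: 8
topic: Energy Markets
---
# Article title

Cleaned article text goes here.
"""


def split_frontmatter(text):
    """Split a markdown document into (metadata dict, body string).

    Only flat `key: value` lines are parsed, and all values stay strings.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no frontmatter block
    try:
        end = lines.index("---", 1)  # closing delimiter
    except ValueError:
        return {}, text  # opening delimiter without a close
    meta = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")  # split at the first colon only
        if sep:
            meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:])
    return meta, body
```

Calling `split_frontmatter(SAMPLE)` gives a dictionary of the three metadata fields and the article body that follows the closing `---`.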

Documents are connected via typed edges: causal, temporal, thematic, and contradictory. Multi-hop traversal lets you discover chains of reasoning across your entire vault.
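
Multi-hop traversal over typed edges can be pictured as a breadth-first search that follows only the edge types you care about. The sketch below uses a hypothetical edge list and function name. It illustrates the idea and is not Crucible's implementation.

```python
from collections import deque

# Hypothetical edge list: (source doc, edge type, target doc).
EDGES = [
    ("drought", "causal", "crop-failure"),
    ("crop-failure", "causal", "price-spike"),
    ("price-spike", "temporal", "export-ban"),
    ("drought", "thematic", "water-policy"),
]


def reachable(edges, start, allowed_types, max_hops):
    """Breadth-first search following only edges whose type is allowed.

    Returns {doc: hop count} for every document within max_hops of start.
    """
    adjacency = {}
    for src, kind, dst in edges:
        if kind in allowed_types:
            adjacency.setdefault(src, []).append(dst)
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in adjacency.get(node, []):
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    del hops[start]  # report only the documents reached, not the start itself
    return hops
```

Restricting `allowed_types` to `{"causal"}` follows a single chain of cause and effect, while adding `"temporal"` lets the chain continue into what happened next.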

Available on Starter and above. Export the entire vault as a ZIP archive for backup, migration, or offline access.

Collection

Collection Modes

Manual, scheduled, or continuous research.

Available on all plans. Trigger a single research cycle on demand. The AI director plans queries, executes searches, scrapes pages, evaluates relevance, and stores results.

Available on Starter and above. Set daily, weekly, or custom cron intervals for automatic collection cycles. Crucible runs in the background on your schedule.

Available on Pro and above. Always-on collection mode that continuously monitors your topics of interest, discovering and ingesting new sources as they appear.

Use Cases

How teams and individuals use Crucible.

Monitor competitors, track industry trends, and aggregate market data automatically. Crucible continuously scrapes press releases, filings, news, and social signals into a structured vault you can query and analyze.

Pull papers from arXiv, government databases (FRED, EIA), and institutional repositories. The knowledge graph surfaces citation chains and thematic connections across hundreds of papers without manual curation.

Build information edges by aggregating diverse signals from news, data feeds, and expert analysis. Crucible's structured vault makes it easy to track evolving narratives and export curated datasets for modeling.

Use Crucible as the collection layer upstream of your analysis stack. Export structured data to Jupyter, Tableau, or custom pipelines. The YAML frontmatter and knowledge graph metadata make datasets immediately queryable.

Hobbyists, writers, and lifelong learners use Crucible to build deep knowledge bases on any topic. The Obsidian-compatible vault integrates with your existing notes, and scheduled collection keeps your vault growing while you sleep.

Troubleshooting

Common issues and fixes.

Make sure your LLM is running. For Ollama: run `ollama serve` in a terminal first. For cloud APIs: verify your API key is correct and has credits.

Make your interests more specific. Vague topics produce poor search queries. Try narrowing from 'technology' to 'semiconductor supply chain disruptions in 2025'.

Try a smaller model (gemma3:4b instead of llama3:70b), increase the timeout in Settings, or reduce the number of documents per cycle.

Block bad source domains, lower temperature to 0.2 for more focused evaluation, and promote domains you trust.

You need at least 5 documents before the graph can find connections. Run Reconnect after a few cycles to discover new edges.

Reference

Free Plan Limits

What you get on the free tier.

Documents per cycle: 50
Research priorities: 3
Queries per priority: 2
Search engine: DuckDuckGo
PDFs per cycle: 3
Scheduling: Manual (Run Now)
Export: Starter plan
Curator: Pro plan