Research

Inside the intelligence engine.

Capital for Cures runs a purpose-built AI research platform for pharmaceutical royalties. Owned hardware, no third-party cloud APIs, full data sovereignty. Three layers of compute and data designed for one job: surface every royalty-relevant signal in pharmaceutical filings, benchmark every disclosure against four decades of comparables, and ship the result as research the next morning. Daily ingestion. Every verified outcome feeds back into training.

50K+ Indexed data points
6,000+ Reverse-engineered transactions
5.5M Pricing data points
40+ Years of deal history
9 Primary jurisdictions, daily
5+ Languages parsed natively

Architecture

Three layers, end to end.

Ingestion brings primary documents in. Analysis turns documents into a structured knowledge graph and runs the valuation work on top of it. Reporting surfaces the outputs as scored briefs, hidden-connection alerts, new-stream detection, and benchmark queries.
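As a rough illustration of how a knowledge graph can produce hidden-connection alerts, the sketch below groups extracted facts by entity and flags any entity linked to more than one partner across filings. The tuple layout, the entity names, and the `hidden_connections` helper are assumptions made for this example; they do not describe the engine's actual schema.

```python
from collections import defaultdict

# Hypothetical extracted facts: (molecule, counterparty, filing_id).
# Every name and ID here is invented for illustration.
edges = [
    ("molecule-A", "Holder X", "filing-001"),
    ("molecule-A", "Holder Y", "filing-002"),
    ("molecule-B", "Holder X", "filing-003"),
    ("molecule-C", "Holder Z", "filing-004"),
]


def hidden_connections(edges):
    """Return entities that are linked to more than one partner across filings."""
    by_molecule = defaultdict(set)
    by_counterparty = defaultdict(set)
    for molecule, counterparty, _filing in edges:
        by_molecule[molecule].add(counterparty)
        by_counterparty[counterparty].add(molecule)
    # Keep only the entities whose links span several partners.
    shared_molecules = {
        m: sorted(c) for m, c in by_molecule.items() if len(c) > 1
    }
    shared_counterparties = {
        c: sorted(m) for c, m in by_counterparty.items() if len(m) > 1
    }
    return shared_molecules, shared_counterparties


molecules, counterparties = hidden_connections(edges)
print(molecules)       # {'molecule-A': ['Holder X', 'Holder Y']}
print(counterparties)  # {'Holder X': ['molecule-A', 'molecule-B']}
```

Neither relationship is visible from any one filing in the toy data; each only appears once the facts are merged into a single structure.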

Under the hood

Four compute roles, one purpose.

The system is partitioned by workload. Ingestion runs in parallel across primary sources. Enrichment runs domain-trained models against a forty-year deal corpus. A dedicated role handles non-English markets continuously. A central data layer serves the rest of the system and powers the scoring engine. We do not publish hardware specifics.

Node 1: Ingest

Parallel extraction · multi-source

Parallel extraction from SEC EDGAR, clinical trial registries (ClinicalTrials.gov, EU CTR), patent databases (USPTO, EPO, JPO), press releases, and annual reports. Every new filing is automatically parsed, classified, and queued for enrichment. Coverage runs daily across the source set.

Node 2: Enrich & Value

Domain-trained models · multi-pass enrichment

Runs large-parameter models purpose-trained on our deal corpus. Multi-pass enrichment covers scientific context, competitive landscape, patent analysis, and deal-precedent matching. A proprietary valuation engine runs discounted cash flow (DCF), Monte Carlo, and comparables analyses against the full forty-year database for every asset.
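For readers unfamiliar with how DCF and Monte Carlo combine, here is a minimal sketch, not the proprietary engine. It discounts a royalty stream that ramps linearly to peak sales, then repeats the calculation over randomly drawn scenarios. The ramp shape, the lognormal peak-sales distribution, the 60% approval probability, and the 5% royalty and 10% discount rates are all illustrative assumptions.

```python
import random


def royalty_npv(peak_sales, royalty_rate, discount_rate, years, ramp_years):
    """Discounted value of a royalty stream with a linear ramp to peak sales."""
    npv = 0.0
    for t in range(1, years + 1):
        sales = peak_sales * min(1.0, t / ramp_years)
        npv += sales * royalty_rate / (1.0 + discount_rate) ** t
    return npv


def simulate(n_paths=10_000, seed=0):
    """Monte Carlo over assumed peak-sales and approval uncertainty."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_paths):
        # Assumed distributions, chosen only to make the example run.
        peak = rng.lognormvariate(6.0, 0.5)  # peak annual sales, $M
        approved = rng.random() < 0.6        # assumed approval probability
        if approved:
            values.append(royalty_npv(peak, 0.05, 0.10, 10, 3))
        else:
            values.append(0.0)
    values.sort()
    return {
        "mean": sum(values) / len(values),
        "p10": values[int(0.10 * len(values))],
        "p90": values[int(0.90 * len(values))],
    }


result = simulate()
print(result)
```

Reporting a spread (`p10`, `p90`) alongside the mean is the point of the simulation step: a single DCF figure hides how much of the value depends on the approval outcome.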

Node 3: Global Markets

Non-English filings · five-plus languages

Dedicated to non-English markets. EDINET filings (Japan), DART system (Korea), EMA regulatory data (EU). Custom Japanese NER and OCR pipeline for pharmaceutical deal extraction. Always-on coverage scanning filings in five-plus languages. This is how we surface deals that English-only coverage misses.

Node 4: Score & Surface

Central data layer · scoring & screening

Central data layer serves all nodes. Deal scoring engine rates every asset on royalty and licensing attractiveness. Distressed-biotech detection (cash runway, management changes, declining market cap). Origination screening: which holders are most likely to monetise next. Surfaces results to the research team and the publication pipeline.
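The three distress signals named above (cash runway, management changes, declining market cap) can be sketched as a simple rule-based screen. The field names, the thresholds, and the example issuer are hypothetical; the actual scoring engine is not described at this level of detail.

```python
from dataclasses import dataclass


@dataclass
class Issuer:
    name: str
    cash_musd: float              # cash and equivalents, $M
    quarterly_burn_musd: float    # net cash used per quarter, $M
    mgmt_changes_12m: int         # senior departures in the last 12 months
    market_cap_change_12m: float  # e.g. -0.45 for a 45% decline


def runway_months(issuer):
    """Months of cash left at the current burn rate."""
    if issuer.quarterly_burn_musd <= 0:
        return float("inf")  # not burning cash
    return 3.0 * issuer.cash_musd / issuer.quarterly_burn_musd


def distress_flags(issuer, min_runway=12.0, max_cap_change=-0.40):
    """Return the distress signals an issuer trips (assumed thresholds)."""
    flags = []
    if runway_months(issuer) < min_runway:
        flags.append("short_runway")
    if issuer.mgmt_changes_12m >= 2:
        flags.append("management_turnover")
    if issuer.market_cap_change_12m <= max_cap_change:
        flags.append("market_cap_decline")
    return flags


example = Issuer("ExampleBio (hypothetical)", 30.0, 10.0, 2, -0.55)
print(runway_months(example))   # 9.0
print(distress_flags(example))  # ['short_runway', 'management_turnover', 'market_cap_decline']
```

A production screen would presumably weight and combine these signals rather than list them, but the flags show which inputs feed the detection.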

Why owned infrastructure

Three reasons we do not run this on a cloud API.

Data sovereignty

Royalty research touches private deal documents, draft term sheets, confidential disclosures from holders. None of that leaves our infrastructure. No third-party model provider trains on our corpus, and no inference call routes through anyone else's logging.

Domain training, not generic

Our models are continuously fine-tuned on our own 40-year deal corpus. A general-purpose frontier model is excellent at language but does not know what a step-down clause looks like in a 1998 Merck-Schering term sheet. The domain training is the moat.

Cost discipline at the long tail

Daily ingestion across nine jurisdictions and five-plus languages would be uneconomic on per-token pricing. Owned hardware drives marginal cost toward zero, which is what makes systematic coverage of sub-$100M positions tractable in the first place.

The output

What the engine surfaces, every day.

Surface	What it is
Daily intelligence briefs	Scored and ranked against your investment thesis. Asset, action, why now.
Hidden connections	Entity relationships invisible in any single filing. Same molecule, multiple counterparties; same counterparty, multiple molecules.
New royalty streams	Buried in footnotes, litigation, amendments. Detected the day they appear in a filing.
Competitive landscape shifts	Pipeline, formulary, and pricing changes that move existing royalty economics.
Deal structure patterns	Rate benchmarks by therapy area, stage, structure. Where the cohort is tight; where it is wide.
Royalty rate calculator	Query fair rates by stage, indication, deal structure against the corpus.
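A benchmark query of the kind the last two rows describe might look like the sketch below: filter comparable deals by stage, indication, and structure, then report the median rate and the interquartile range as a measure of how tight the cohort is. The records, field names, and the `benchmark` function are invented for illustration and are not the calculator's actual interface.

```python
import statistics

# Hypothetical deal records, not real transactions.
deals = [
    {"stage": "Phase 3", "indication": "oncology", "structure": "royalty", "rate_pct": 4.0},
    {"stage": "Phase 3", "indication": "oncology", "structure": "royalty", "rate_pct": 5.0},
    {"stage": "Phase 3", "indication": "oncology", "structure": "royalty", "rate_pct": 6.0},
    {"stage": "Phase 3", "indication": "oncology", "structure": "royalty", "rate_pct": 7.0},
    {"stage": "Phase 3", "indication": "oncology", "structure": "royalty", "rate_pct": 8.0},
    {"stage": "Phase 2", "indication": "oncology", "structure": "royalty", "rate_pct": 3.0},
    {"stage": "Approved", "indication": "neurology", "structure": "synthetic", "rate_pct": 9.5},
]


def benchmark(deals, **filters):
    """Median rate and interquartile range for deals matching every filter."""
    rates = sorted(
        d["rate_pct"]
        for d in deals
        if all(d.get(key) == value for key, value in filters.items())
    )
    if len(rates) < 2:
        return None  # too few comparables to describe a spread
    q1, median, q3 = statistics.quantiles(rates, n=4)
    return {"n": len(rates), "median": median, "iqr": q3 - q1}


cohort = benchmark(deals, stage="Phase 3", indication="oncology", structure="royalty")
print(cohort)
```

A narrow interquartile range indicates a tight cohort where a fair rate is well pinned down; a wide one signals that structure or asset specifics dominate.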


Numbers and architecture descriptions reflect the state of the engine as of the date of publication. The engine is a research tool. Outputs are research and analysis, not investment advice. Capital for Cures does not manage capital, hold positions, or place securities.