Self-hostable · AI-driven · Open source

Own your sources.
Own your pipeline.
Own your output.

Turn a firehose of newsletters and news sources into bite-sized, source-tracked units you actually read — then publish your own digest with citations attached automatically. No copy-pasting. No lock-in. No cloud middleman.

Ingest → Scrape & dedupe → AI breakdown → Read & curate → Cite & publish

The problem

One does not simply copy-paste.
Fix your publishing pipeline.

Fragmented

Newsletters arrive in your inbox. Articles live in browser tabs. You lose your place every time you switch context.

Ephemeral

You have no canonical copy. Services shut down. Paywalls move. Your reading history belongs to someone else's database.

Citation hell

Turning what you read into writing means hunting down the original URL, pulling the right quote, formatting a reference. Every. Single. Time.

Locked in

Every reader app, every newsletter tool has a conflict of interest with you. Their incentive is your continued subscription, not your autonomy.

Samizdat collapses the read → curate → publish loop into a single pipeline that runs on hardware you control, stores everything as Markdown you own, and hands you a citable, publish-ready digest with one command.

How it works

Five steps.
One pipeline.

Ingest newsletters and feeds

Point an email alias at your server for newsletters. Add RSS feeds for news portals. Everything lands in one place.
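As a rough illustration of the feed half of ingestion, here is a minimal sketch that pulls item links out of an RSS 2.0 document using only Go's standard library. The names `FeedItem` and `parseFeed` are hypothetical and not taken from the Samizdat codebase; the real ingester may handle Atom, encodings, and email quite differently.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// FeedItem is a hypothetical shape for one ingested feed entry.
type FeedItem struct {
	Title string `xml:"title"`
	Link  string `xml:"link"`
}

// rssDoc maps only the parts of an RSS 2.0 document this sketch needs.
type rssDoc struct {
	Items []FeedItem `xml:"channel>item"`
}

// parseFeed returns every item that carries a non-empty link,
// with surrounding whitespace trimmed from the link.
func parseFeed(data []byte) ([]FeedItem, error) {
	var doc rssDoc
	if err := xml.Unmarshal(data, &doc); err != nil {
		return nil, err
	}
	var items []FeedItem
	for _, it := range doc.Items {
		it.Link = strings.TrimSpace(it.Link)
		if it.Link != "" {
			items = append(items, it)
		}
	}
	return items, nil
}

func main() {
	sample := `<?xml version="1.0"?>
<rss version="2.0"><channel><title>Example</title>
<item><title>One</title><link>https://example.com/one</link></item>
<item><title>Two</title><link> https://example.com/two </link></item>
</channel></rss>`

	items, err := parseFeed([]byte(sample))
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, it := range items {
		fmt.Println(it.Title, it.Link)
	}
}
```

Each link returned here would then be handed to the scrape step as a candidate URL.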

Scrape once, deduplicate forever

Each canonical URL is fetched exactly once — never twice, never for each subscriber. The result is stored as Markdown in your vault. Scraping is expensive and ban-prone; Samizdat respects that.
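One way to picture "fetched exactly once" is a canonicalization step followed by a seen-set lookup that runs before any request goes out. The sketch below is an assumption about how that could work, not Samizdat's actual rules: `canonicalize`, `seenSet`, and the specific normalizations (lowercased host, dropped fragment, stripped `utm_*` parameters, sorted query, trimmed trailing slash) are all hypothetical.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// canonicalize reduces a URL to a stable key so that trivially different
// links to the same article compare equal. The rules here are assumptions.
func canonicalize(raw string) (string, error) {
	u, err := url.Parse(strings.TrimSpace(raw))
	if err != nil {
		return "", err
	}
	if u.Host == "" {
		return "", fmt.Errorf("not an absolute URL: %q", raw)
	}
	u.Scheme = strings.ToLower(u.Scheme)
	u.Host = strings.ToLower(u.Host)
	u.Fragment = ""
	u.RawFragment = ""

	// Drop tracking parameters; Encode then emits the rest sorted by key.
	q := u.Query()
	for key := range q {
		if strings.HasPrefix(strings.ToLower(key), "utm_") {
			q.Del(key)
		}
	}
	u.RawQuery = q.Encode()

	// Treat "/post/" and "/post" as the same page, but keep a bare "/".
	if u.Path != "/" {
		u.Path = strings.TrimSuffix(u.Path, "/")
	}
	return u.String(), nil
}

// seenSet remembers canonical URLs that have already been scheduled.
type seenSet map[string]bool

// shouldFetch reports whether the URL is new, recording it if so.
// It is meant to be called before any network request is made.
func (s seenSet) shouldFetch(raw string) (bool, error) {
	key, err := canonicalize(raw)
	if err != nil {
		return false, err
	}
	if s[key] {
		return false, nil
	}
	s[key] = true
	return true, nil
}

func main() {
	seen := seenSet{}
	for _, raw := range []string{
		"https://example.com/post?a=1&b=2",
		"https://EXAMPLE.com/post/?b=2&a=1&utm_medium=email#comments",
	} {
		fetch, err := seen.shouldFetch(raw)
		fmt.Println(raw, "fetch:", fetch, "err:", err)
	}
}
```

In a real server the seen-set would live in the index rather than in memory, but the ordering is the point: normalize, look up, and only then touch the network.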

Break down with your AI rules

Your editable Pipeline runs over each Document and emits bite-sized Highlights — the key facts, quotes, and ideas, tagged and sorted. The breakdown is personal; the document is shared.
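To make the Document → Highlight relationship concrete, here is a minimal sketch of the data shapes involved. Everything in it is hypothetical: the `Document`, `Highlight`, `Rule`, and `Pipeline` types are illustrative, and `quoteRule` is a deterministic stand-in for what would really be an AI-driven rule.

```go
package main

import (
	"fmt"
	"strings"
)

// Document is the shared, opinion-free result of scraping one URL.
type Document struct {
	URL      string
	Markdown string
}

// Highlight is one bite-sized unit that keeps a pointer to its source.
type Highlight struct {
	SourceURL string
	Text      string
	Tags      []string
}

// Rule turns a Document into zero or more Highlights.
type Rule func(doc Document) []Highlight

// Pipeline is an ordered, user-editable list of rules.
type Pipeline []Rule

// Run applies every rule in order and concatenates the results.
func (p Pipeline) Run(doc Document) []Highlight {
	var out []Highlight
	for _, rule := range p {
		out = append(out, rule(doc)...)
	}
	return out
}

// quoteRule emits one Highlight per Markdown blockquote line.
// A real rule would call an AI provider instead of matching prefixes.
func quoteRule(doc Document) []Highlight {
	var out []Highlight
	for _, line := range strings.Split(doc.Markdown, "\n") {
		if strings.HasPrefix(line, "> ") {
			out = append(out, Highlight{
				SourceURL: doc.URL,
				Text:      strings.TrimPrefix(line, "> "),
				Tags:      []string{"quote"},
			})
		}
	}
	return out
}

func main() {
	doc := Document{
		URL:      "https://example.com/article",
		Markdown: "# Title\n\nSome intro.\n\n> The key sentence.\n\nMore body.",
	}
	for _, h := range (Pipeline{quoteRule}).Run(doc) {
		fmt.Println(h.Tags, h.Text, h.SourceURL)
	}
}
```

Because every Highlight carries its `SourceURL`, two readers can run different pipelines over the same stored Document and still cite the same origin.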

Read and curate on your phone

The mobile app syncs highlights offline-first. Swipe to curate. Every linked URL is tracked and openable. One tap to mark, annotate, or skip.

Publish your digest with citations attached

Run sam digest. Every item you curated becomes a bullet with the source URL, quote, and original context already wired. No copy-paste. Publish to email, the web, or anywhere.
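The idea of citations being "already wired" can be sketched as a pure rendering step over curated items. The output format below is an assumption, not the actual output of `sam digest`, and the `Item` type and `renderDigest` function are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// Item is a hypothetical curated highlight ready for publication.
type Item struct {
	Quote     string // the excerpt the reader kept
	SourceURL string // where the excerpt came from
	Note      string // optional annotation added while curating
}

// renderDigest writes a Markdown digest in which every bullet
// carries its quote and a link back to the source.
func renderDigest(title string, items []Item) string {
	var b strings.Builder
	fmt.Fprintf(&b, "# %s\n\n", title)
	for _, it := range items {
		fmt.Fprintf(&b, "- \"%s\" ([source](%s))\n", it.Quote, it.SourceURL)
		if it.Note != "" {
			fmt.Fprintf(&b, "  %s\n", it.Note)
		}
	}
	return b.String()
}

func main() {
	fmt.Print(renderDigest("Weekly digest", []Item{
		{Quote: "Scraping is expensive.", SourceURL: "https://example.com/a", Note: "Worth a follow-up."},
		{Quote: "Own your output.", SourceURL: "https://example.com/b"},
	}))
}
```

Since the source URL travels with each item from the moment it is scraped, the publish step never has to look anything up.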

Components

Four pieces.
One system.

server/

The hub. REST API, cron worker, scraper engine, dedup, pipeline runner, TLS — everything in a single static binary you can scp anywhere.

Go · SQLite (pure-Go) · CertMagic

app/

The reader. Offline-first, swipe-to-curate, syncs with your server. Runs on iOS, Android, and the web from one codebase.

Expo · React Native · RN Web

cli/

Meet sam: the interface for power users and robots alike. Init, configure providers, manage jobs, run pipelines, generate digests. All headless, all scriptable.

Go · shared engine

clipper/

[IN PROGRESS] The browser extension. Clip any page, highlight text, and add it to your pipeline on the fly. It posts to the same API your server already exposes.

Chrome MV3 · Defuddle · Turndown

Design rules

Non-negotiables.

📄

Markdown vault is the source of truth. SQLite is a rebuildable index. sam reindex reconstructs the DB from your files at any time.

🔒

Nothing lives only in a hosted DB. Everything has a Markdown path. Backups are yours to own and to look after. Archiving is a single command: sam archive current

⚡

Scrape one URL once. Deduplication by canonical URL happens before any network request. Scraping is expensive; we respect that.

🧱

Phase split is sacred. Scraper → Document is opinion-free and community-maintainable. Pipeline → Highlight is personal. Never mix them.

🏠

Paywalled content stays local. Credentialed content never goes to a cloud LLM by default — it's routed to your local AI provider.
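The routing rule can be pictured as a small decision function that runs before any document is sent to a model. This is a sketch under stated assumptions: the `Provider` and `Document` types, the `chooseProvider` function, and the `allowCloud` override (modelling the "by default" in the rule) are hypothetical and not taken from Samizdat's configuration.

```go
package main

import (
	"errors"
	"fmt"
)

// Provider is a hypothetical AI backend entry from the user's config.
type Provider struct {
	Name  string
	Local bool // true when inference runs on hardware the user controls
}

// Document carries only the fields this routing decision needs.
type Document struct {
	URL          string
	Credentialed bool // true when fetched with the user's login or subscription
}

var errNoLocalProvider = errors.New("no local provider configured for credentialed content")

// chooseProvider picks the first provider in preference order, except that
// credentialed documents go to the first local provider unless the user has
// explicitly opted in to cloud processing via allowCloud.
func chooseProvider(doc Document, providers []Provider, allowCloud bool) (Provider, error) {
	if len(providers) == 0 {
		return Provider{}, errors.New("no providers configured")
	}
	if !doc.Credentialed || allowCloud {
		return providers[0], nil
	}
	for _, p := range providers {
		if p.Local {
			return p, nil
		}
	}
	// Fail closed: never fall back to a cloud provider silently.
	return Provider{}, errNoLocalProvider
}

func main() {
	providers := []Provider{{Name: "cloud-llm", Local: false}, {Name: "local-llm", Local: true}}

	public := Document{URL: "https://example.com/free"}
	paywalled := Document{URL: "https://example.com/members", Credentialed: true}

	p1, _ := chooseProvider(public, providers, false)
	p2, _ := chooseProvider(paywalled, providers, false)
	fmt.Println(p1.Name, p2.Name)
}
```

The important design choice is the failure mode: when no local provider exists, the function returns an error rather than quietly sending credentialed content to the cloud.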

📦

One binary, no Docker, no nginx. scp it, run it. TLS is in-binary via CertMagic. The happy path has no moving parts.

Philosophy

Anti-slop.
Anti-lock-in.
No-copy-paste.

Samizdat: self-published, hand-passed, traceable writing. Underground, anti-corporate, self-rolled.

"There is an inherent conflict of interest between you as the publisher and the for-profit platforms that feed you info. Capitalism's incentive is to lock you in, manipulate your consumption. Make your own feed, that serves you."

Focus on what YOU care about, what you consume. The algorithm keeps you scrolling; gain control of it.

Samizdat has no engagement metric. No recommendation engine. No "you might also like." It does exactly what you configure it to do — and the configuration is a text file you own.

OSS-first. Self-hosted, self-rolled.

Own your sources.
Own your pipeline.
Own your output.