Building an Engineering & Security News Aggregator (10 Sources, No APIs)

  • Thread starter: Sascha



We built a curated engineering and security news aggregator that pulls from 10 high-signal sources, deduplicates content, and updates every 6 hours.

No paid APIs. No scraping. No login. Just clean, structured news for developers.

This post breaks down exactly how it works.


What This Is​


A lightweight news wire combining:

  • Hacker News
  • Lobsters
  • InfoQ
  • Cloudflare Blog
  • Krebs on Security
  • The Hacker News (Security)
  • NIST NVD (vulnerabilities)
  • GitHub Blog
  • OpenAI Blog
  • Anthropic Research

The goal: high-quality signal, zero noise, zero cost.


Why Build This?​


Most engineering news aggregators fail in at least one of these ways:

  • Too noisy (no curation)
  • Too expensive (paid APIs)
  • Too slow (manual updates)
  • Too fragmented (you check 10 sites anyway)

We wanted:

  • A single feed
  • Fresh updates (without chasing real-time)
  • No operational cost
  • No lock-in (no accounts, no tracking)

Stack​

  • Hono (API layer)
  • Drizzle ORM
  • Postgres
  • Next.js (frontend)
  • RSS feeds + Hacker News Firebase API

High-Level Architecture​


Code:
           ┌───────────────┐
           │ 9 RSS feeds + │
           │ HN API        │
           └──────┬────────┘
                  │
                  ▼
           ┌───────────────┐
           │ Fetch Workers │
           │ (every 6 hrs) │
           └──────┬────────┘
                  │
                  ▼
        ┌──────────────────────┐
        │ Normalize Articles   │
        │ title, url, date     │
        └─────────┬────────────┘
                  │
                  ▼
        ┌──────────────────────┐
        │ SHA-256 Deduplication│
        │ (based on URL)       │
        └─────────┬────────────┘
                  │
                  ▼
           ┌───────────────┐
           │   Postgres    │
           └──────┬────────┘
                  │
                  ▼
           ┌───────────────┐
           │   Hono API    │
           └──────┬────────┘
                  │
                  ▼
           ┌───────────────┐
           │   Next.js UI  │
           └───────────────┘



Data Sources​


We deliberately chose sources with:

  • High editorial quality
  • Little overlap with one another
  • Stable RSS feeds or APIs

Breakdown​

Source              | Type    | Why It Matters
--------------------|---------|------------------------------
Hacker News         | API     | Real-time dev signal
Lobsters            | RSS     | More technical discussions
InfoQ               | RSS     | Deep engineering content
Cloudflare Blog     | RSS     | Infra + performance insights
Krebs on Security   | RSS     | Trusted security reporting
The Hacker News     | RSS     | Security news (broader)
NIST NVD            | RSS/API | Verified vulnerabilities
GitHub Blog         | RSS     | Platform + ecosystem updates
OpenAI Blog         | RSS     | AI developments
Anthropic Research  | RSS     | AI + safety research

Fetching Strategy​


We run a simple scheduled job:


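Code:
import cron from "node-cron"; // assumption: the node-cron package

// Runs at minute 0 of every 6th hour (00:00, 06:00, 12:00, 18:00).
cron.schedule("0 */6 * * *", async () => {
  await fetchAllSources();
});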



Why every 6 hours?

  • Keeps content fresh
  • Avoids unnecessary load
  • Works well with RSS update frequencies
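
One way `fetchAllSources` could be structured is sketched below. This is an assumption, not the project's actual code: `SourceFetcher`, `FetchSummary`, and the `sources` parameter are hypothetical names (the scheduled job above calls the function without arguments). The sketch uses `Promise.allSettled` so that one failing feed does not abort the whole run.

```typescript
// Hypothetical shapes; the original post only names fetchAllSources().
type Article = { title: string; url: string; source: string; publishedAt: Date };
type SourceFetcher = { name: string; fetch: () => Promise<Article[]> };
type FetchSummary = { articles: Article[]; failed: string[] };

// Run every fetcher concurrently. Successful results are merged into one
// list; failures are recorded by source name instead of rejecting the batch.
async function fetchAllSources(sources: SourceFetcher[]): Promise<FetchSummary> {
  const results = await Promise.allSettled(sources.map((s) => s.fetch()));
  const summary: FetchSummary = { articles: [], failed: [] };

  results.forEach((result, i) => {
    if (result.status === "fulfilled") {
      summary.articles.push(...result.value);
    } else {
      summary.failed.push(sources[i]?.name ?? "unknown");
    }
  });

  return summary;
}
```

With a 6-hour interval, a source listed in `failed` simply gets another chance on the next run, so retries inside a single run are arguably optional.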

Deduplication (Key Part)​


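Different sources sometimes link to the same story. Link aggregators such as Hacker News and Lobsters, for example, can both point at the same article URL.

We catch these by hashing each URL with SHA-256 and treating matching hashes as duplicates.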


Code:
import { createHash } from "crypto";

// Returns the SHA-256 digest of the URL as a 64-character hex string.
function hashUrl(url: string): string {
  return createHash("sha256").update(url).digest("hex");
}


Why URL hashing?​

  • Fast
  • Deterministic
  • No fuzzy matching complexity
  • Works across sources

Tradeoff​

  • Won’t catch the same story published under a different URL (rewrites, syndicated copies, tracking parameters)
  • But avoids false positives (important for trust)
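
The hashing step can be turned into a batch filter along the following lines. This is a minimal sketch, assuming an in-memory `Set` of hashes seen so far; `dedupeByUrl` and its `seen` parameter are hypothetical names, and a real deployment would more likely enforce uniqueness in Postgres as well.

```typescript
import { createHash } from "crypto";

type Article = { title: string; url: string; source: string; publishedAt: Date };

function hashUrl(url: string): string {
  return createHash("sha256").update(url).digest("hex");
}

// Keep the first article seen for each URL hash; later copies are dropped.
// `seen` can be pre-filled with hashes that are already stored, so the same
// filter also works across fetch runs (assumption: the caller loads them).
function dedupeByUrl(articles: Article[], seen: Set<string> = new Set()): Article[] {
  const unique: Article[] = [];
  for (const article of articles) {
    const hash = hashUrl(article.url);
    if (seen.has(hash)) continue; // exact same URL already kept
    seen.add(hash);
    unique.push(article);
  }
  return unique;
}
```

Because the comparison is on the exact URL string, two links that differ only by a trailing slash or a query parameter count as different articles, which is the tradeoff described above.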

Normalization​


Each source has its own format. We normalize into a single shape:


Code:
type Article = {
  title: string;
  url: string;
  source: string;
  publishedAt: Date;
};



This keeps the frontend simple and predictable.
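
To illustrate what normalization might look like for two kinds of source, here is a hedged sketch. The `HnItem` fields (`id`, `title`, `url`, `time`) follow the Hacker News Firebase API item format; `RssItem` is a hypothetical shape for an already-parsed RSS item, and `fromHnItem` / `fromRssItem` are illustrative names, not the project's own functions.

```typescript
type Article = { title: string; url: string; source: string; publishedAt: Date };

// Subset of the fields returned by the Hacker News Firebase API for an item.
type HnItem = { id: number; title: string; url?: string; time: number };

// Hypothetical shape for an already-parsed RSS <item>; a real parser's
// output may use different property names.
type RssItem = { title?: string; link?: string; pubDate?: string };

function fromHnItem(item: HnItem): Article {
  return {
    title: item.title.trim(),
    // Text posts (e.g. Ask HN) have no url; fall back to the discussion page.
    url: item.url ?? `https://news.ycombinator.com/item?id=${item.id}`,
    source: "Hacker News",
    publishedAt: new Date(item.time * 1000), // HN `time` is in Unix seconds
  };
}

// Returns null when the item lacks a field the Article shape requires
// or when the date cannot be parsed.
function fromRssItem(item: RssItem, source: string): Article | null {
  if (!item.title || !item.link || !item.pubDate) return null;
  const publishedAt = new Date(item.pubDate);
  if (Number.isNaN(publishedAt.getTime())) return null;
  return { title: item.title.trim(), url: item.link.trim(), source, publishedAt };
}
```

Returning `null` for incomplete items, rather than guessing a date or title, keeps malformed feed entries out of the database.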


API Layer (Hono)​


Example endpoint:


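Code:
import { Hono } from "hono";
import { db } from "./db"; // assumption: a Drizzle client with the articles schema

const app = new Hono();

// Return the 100 most recent articles, newest first.
app.get("/articles", async (c) => {
  const articles = await db.query.articles.findMany({
    orderBy: (a, { desc }) => [desc(a.publishedAt)],
    limit: 100,
  });

  return c.json(articles);
});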



Minimal, fast, no overengineering.


Frontend (Next.js)​

  • Server-rendered list
  • No login required
  • No personalization
  • Just chronological, deduplicated news

Limitations​

  • Not real-time (by design)
  • No personalization
  • Deduplication is URL-based only
  • Dependent on RSS availability

What We’d Improve​

  • Smarter clustering (same story, different URLs)
  • Tagging (infra, AI, security, etc.)
  • Optional filters (without accounts)
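
As a cheap first step toward catching the same story under slightly different URLs, links could be canonicalized before hashing. This is not part of the pipeline described in this post; `canonicalizeUrl` and the `TRACKING_PARAMS` list are assumptions chosen for illustration, and stripping trailing slashes may not be safe for every site.

```typescript
// Assumed list of tracking parameters; extend or trim as needed.
const TRACKING_PARAMS = [
  "utm_source",
  "utm_medium",
  "utm_campaign",
  "utm_term",
  "utm_content",
];

// Reduce a URL to a canonical form so trivially different links compare equal.
// Throws a TypeError if `raw` is not a valid absolute URL.
function canonicalizeUrl(raw: string): string {
  const url = new URL(raw); // the parser already lowercases scheme and host
  url.hash = ""; // fragments never change which article is meant
  for (const param of TRACKING_PARAMS) {
    url.searchParams.delete(param);
  }
  url.searchParams.sort(); // make parameter order irrelevant
  // Drop a trailing slash, except for the bare root path.
  if (url.pathname.length > 1 && url.pathname.endsWith("/")) {
    url.pathname = url.pathname.slice(0, -1);
  }
  return url.toString();
}
```

Hashing `canonicalizeUrl(url)` instead of the raw string would merge links that differ only in tracking parameters, fragments, parameter order, or a trailing slash, while still avoiding fuzzy matching.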

Try It​


The news wire is open to everyone:

👉 Engineering & Security News Wire — How We Built a $0/Month News Aggregator https://clawship.app/blog/engineering-security-news-wire

