ETL to QE, Update 70, Embarrassing Over-Engineering
Date: 2025-04-20
Let's review where we are versus where we wanted to be.
By now I am supposed to be halfway through the RBAC LDAP-Like Content Addressable Storage System within the DDaemon 2025 Roadmap Rev. 0.0.3, but in reality I am still stuck on the Nostr Scraping Project, which I continue to over-engineer again and again.
Workflow Engines are Not Easy
I started with the idea of developing my own workflow engine within Postgres. A task would be loaded into Postgres, a worker would process it and update its progress and results in Postgres, and when the job completed, other jobs would start or get triggered.
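For the curious, the shape I had in mind was roughly this (a minimal sketch with made-up names, not the actual schema; the FOR UPDATE SKIP LOCKED subquery is the standard Postgres job-claiming trick):

    -- Hypothetical task queue table, names are illustrative only.
    CREATE TABLE tasks (
        id         bigserial PRIMARY KEY,
        kind       text NOT NULL,                  -- what the worker should do
        payload    jsonb NOT NULL,                 -- e.g. the Nostr filter to scrape
        status     text NOT NULL DEFAULT 'queued', -- queued | running | done | failed
        progress   jsonb,                          -- worker-reported progress
        result     jsonb,                          -- final output of the job
        updated_at timestamptz NOT NULL DEFAULT now()
    );

    -- A worker claims the next queued task without blocking other workers.
    UPDATE tasks
    SET status = 'running', updated_at = now()
    WHERE id = (
        SELECT id FROM tasks
        WHERE status = 'queued'
        ORDER BY id
        FOR UPDATE SKIP LOCKED
        LIMIT 1
    )
    RETURNING id, kind, payload;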
Turns out this was all a cope to avoid Doing The Thing. All I really wanted was 1,000,000 Nostr events in a database so I could start making sense of things; I didn't need an entire workflow engine with signed hash chains for the I/O of every function run on the system.
When it came to developing a dependency-aware job queue for the workflow engine, which would mean actually representing the DAG in Postgres, I realized I had a harder project on my hands than just scraping a ton of Nostr events.
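To give a sense of the scope, representing the DAG would have meant at least an edges table plus a "which jobs are runnable" query, something like this sketch (hypothetical, building on the tasks table above):

    -- Hypothetical DAG edges: a child cannot start until every parent is done.
    CREATE TABLE task_dependencies (
        child_id  bigint NOT NULL REFERENCES tasks(id),
        parent_id bigint NOT NULL REFERENCES tasks(id),
        PRIMARY KEY (child_id, parent_id)
    );

    -- Queued tasks whose parents have all finished, i.e. safe to dispatch.
    SELECT t.id
    FROM tasks t
    WHERE t.status = 'queued'
      AND NOT EXISTS (
          SELECT 1
          FROM task_dependencies d
          JOIN tasks p ON p.id = d.parent_id
          WHERE d.child_id = t.id
            AND p.status <> 'done'
      );

And that is before retries, failure propagation, and cycle checks, which is where it stops being a scraping project.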
You Don't Need Separate Logs
Earlier today I tried to create a separate SQL table for the raw Nostr filters that were to be scraped via paginated timestamps, rather than just logging everything to the logging table.
Then I realized that adding the correct label to the log entries accomplished the same thing as having a separate table for the raw Nostr filters themselves.
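Concretely, the idea is just a label column and, if lookups matter, a partial index. A sketch assuming a generic logs table, not the real one:

    -- Hypothetical logging table; a label stands in for the dedicated filters table.
    CREATE TABLE logs (
        id         bigserial PRIMARY KEY,
        label      text NOT NULL,    -- e.g. 'nostr_filter'
        body       jsonb NOT NULL,   -- the raw Nostr filter, or any other payload
        created_at timestamptz NOT NULL DEFAULT now()
    );

    -- Partial index makes reading the filters back as cheap as a dedicated table.
    CREATE INDEX logs_nostr_filter_idx ON logs (created_at)
        WHERE label = 'nostr_filter';

    SELECT body FROM logs WHERE label = 'nostr_filter' ORDER BY created_at;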
Turns Out There Is Better Tooling
After seeing how slow paginating 100 events at a time through Postgres is, I realized I could have been using nosdump the entire time. Every filter from every relay could just be a good old NDJSON file, stored and uploaded to S3 or Google Drive, and then a nice simple ingestion script could do some epic SQL transactions to ingest the data quickly and easily.
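That ingestion script could be almost nothing: a staging table, a COPY, and one transaction. A sketch with made-up table and file names, assuming an events table keyed by the Nostr event id; the odd CSV quote/delimiter bytes are a common trick to stop COPY from mangling backslashes inside the JSON lines:

    -- Hypothetical ingestion path: NDJSON dump -> staging table -> events table.
    CREATE TABLE staging_events (raw jsonb);

    -- In psql: one JSON object per line goes straight into jsonb.
    \copy staging_events (raw) FROM 'dump.ndjson' WITH (FORMAT csv, QUOTE E'\x01', DELIMITER E'\x02')

    BEGIN;
    INSERT INTO events (id, pubkey, kind, created_at, content)
    SELECT raw->>'id',
           raw->>'pubkey',
           (raw->>'kind')::int,
           to_timestamp((raw->>'created_at')::bigint),
           raw->>'content'
    FROM staging_events
    ON CONFLICT (id) DO NOTHING;  -- the same event from two relays only lands once
    TRUNCATE staging_events;
    COMMIT;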