An entire Social Network in 1.6GB using Roaring Bitmaps (GraphD Part 2)

In Part 1 of this series, we tried to answer the question “who do you follow who also follows user B” in Bluesky, a social network with millions of users and hundreds of millions of follow relationships. At the conclusion of the post, we’d developed an in-memory graph store for t … | Continue reading


@jazco.dev | 13 days ago

Your (Graph) Data Fits in Memory

I recently shipped a new revision of Bluesky’s global AppView at the start of February and things have been going very well. The system scales and handles millions of users without breaking a sweat, the ScyllaDB-backed Data Plane service sits at under 5% DB load in the most inten … | Continue reading


@jazco.dev | 18 days ago

Scaling Golang to 192 Cores with Heavy I/O

When running on baremetal, however, we found two key limitations of the Go runtime so far: 1. Systems with a lot of RAM can have a lot of allocations, prompting the Go Garbage Collector to aggressively steal CPU. 2. Applications performing hundreds of thousands of requests per secon … | Continue reading


@jazco.dev | 3 months ago

Solving Thundering Herds with Request Coalescing in Go

Caches are a wonderful way to make your most frequent operations cheaper. If you’ve got a resource somewhere on disk (or a network hop away) that is accessed often, changes infrequently, and fits in memory, you’ve got an excellent candidate for a cache! Caching Celebrity Posts Fo … | Continue reading


@jazco.dev | 7 months ago

Speeding Up Massive PostgreSQL Joins with Common Table Expressions

I’ve been continuing to work on a growing series of services that archive, analyze, and represent data from a social network. This network creates text-based posts at a rate of around 400,000 posts per day, and I’ve been feeding the posts through different ML models to try and ga … | Continue reading


@jazco.dev | 8 months ago

Speeding up Postgres Queries by 200x with Analyze

Postgres uses an internal table called ‘pg_statistic’ to keep track of some metadata on all tables in the DB. Postgres’s Planner uses these statistics when estimating the cost of operations, which, if out of date, can cause the Planner to pick a suboptimal plan for our query. To … | Continue reading


@jazco.dev | 11 months ago

How to use ChatGPT to Write Good Code Faster

ChatGPT has incredible potential for accelerating your development flow. When working on new projects and starting things from scratch, it allows you to rapidly iterate, make decisions that would usually mean a painful refactor, or make use of libraries and/or APIs you’re unfamil … | Continue reading


@jazco.dev | 1 year ago

Workload Agnosticism in Large Language Models: The Foundation for the Next Generation of Computing

As discussed in my previous post, LLMs such as OpenAI’s ChatGPT and GPT-4, Google’s Bard, and Meta’s LLaMA have risen seemingly out of nowhere, poised to disrupt the future of computing. Cloud Compute changed the landscape drastically when it was introduced by Amazon in the mid 2 … | Continue reading


@jazco.dev | 1 year ago

A Tale of Two Technologies: Why Large Language Models are the Future and the Metaverse Isn't

In the digital landscape of recent years, two major technologies have vied for the spotlight: the Metaverse and Large Language Models (LLMs). Though the Metaverse, a virtual reality-based universe, initially garnered significant attention and expectations, it ultimately failed to … | Continue reading


@jazco.dev | 1 year ago