This is what not optimizing anything looks like: lots and lots and lots of things to manage, and a burning stream of money. https://twitter.com/th3j35t3r/status/1350612426115452935
I'm seeing a lot of discussion here on resources to run a major website and that's awesome - it's a fun topic I think. I want to add some nuance to the thinking I'm seeing in various Twitter threads just to get you considering what matters here with scale...
First: every site is different. All of them. They're all doing different things and nothing is apples to apples, but that's okay, just keep it in mind.
Second: you generally have 2 buckets of users: those that haven't visited in a long time (orphaned accounts, basically), and "active" ones.
The thing is, people just say "active" users, and I've yet to find any 2 major websites that define that the same way.
What does "active" mean? It's whatever some PM or management defined it as, often to look better to people asking the question. How it's defined matters.
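To make that concrete, here's a minimal sketch of how the same user data gives different "active" numbers depending on the window you pick. The user names, timestamps, and the `active_users` helper are all made up for illustration; no real site's definition is implied.

```python
from datetime import datetime, timedelta

# Hypothetical last-visit timestamps (illustrative data only).
NOW = datetime(2021, 1, 17)
last_visit = {
    "alice": NOW - timedelta(hours=3),
    "bob": NOW - timedelta(days=5),
    "carol": NOW - timedelta(days=20),
    "dave": NOW - timedelta(days=400),
}

def active_users(last_visit, now, window):
    """Return the users whose last visit falls within `window` of `now`."""
    return sorted(u for u, seen in last_visit.items() if now - seen <= window)

# Same data, three different definitions of "active".
daily = active_users(last_visit, NOW, timedelta(days=1))
weekly = active_users(last_visit, NOW, timedelta(days=7))
monthly = active_users(last_visit, NOW, timedelta(days=30))
print(len(daily), len(weekly), len(monthly))
```

With this toy data the three counts come out different, so "we have N active users" says very little until you know which window is behind it.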
When you're rendering webpages based on data, your ratios of efficiency are generally how much you're rendering (e.g. page views) and what it takes to make that happen. That one's pretty straightforward to think about.
Then, there's "feeds". Those are a bit of a quagmire...
Again, what is a "feed"? An RSS style feed is fairly simple, something along the lines of query the latest N and cache for some duration if you need to. This could be global, by category, by user, etc. But: you'd almost always *query* it, and not *pre-generate* it. Huge difference.
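Here's a rough sketch of that "query the latest N and cache for some duration" shape. Everything named here (`CachedLatestFeed`, `query_latest`, the `POSTS` list, the 60 second TTL) is an assumption for illustration; a real site would hit a database and probably a shared cache instead of in-process dictionaries.

```python
import time

class CachedLatestFeed:
    """Query the latest N items on demand and cache the result for a short TTL."""

    def __init__(self, query_latest, n=20, ttl_seconds=60.0, clock=time.monotonic):
        self._query_latest = query_latest  # hypothetical callable: (key, n) -> list
        self._n = n
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}  # key -> (expires_at, items)

    def get(self, key="global"):
        now = self._clock()
        hit = self._cache.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]  # still fresh: no query needed
        items = self._query_latest(key, self._n)
        self._cache[key] = (now + self._ttl, items)
        return items

# Stand-in for a real database query; counts how often it actually runs.
POSTS = [{"id": i, "category": "news" if i % 2 else "tech"} for i in range(1, 11)]
calls = {"count": 0}

def query_latest(key, n):
    calls["count"] += 1
    rows = [p for p in POSTS if key == "global" or p["category"] == key]
    return sorted(rows, key=lambda p: p["id"], reverse=True)[:n]

fake_now = [0.0]  # controllable clock so the example doesn't need to sleep
feed = CachedLatestFeed(query_latest, n=3, ttl_seconds=60.0, clock=lambda: fake_now[0])
first = feed.get("global")
second = feed.get("global")   # served from cache
fake_now[0] = 61.0
third = feed.get("global")    # TTL expired, so the query runs again
```

The key can be "global", a category, a user, whatever you slice by. The point is that nothing gets built until someone asks, and repeat asks inside the TTL are basically free.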
An "interesting" feed is where you can quickly get into real performance issues. You can do these in soooo many ways. If I had to guess, that's what's happening for Parler unless so many things are just very badly implemented across the board (and that's entirely possible).
Let's take a few examples:
Do we query what's interesting? If so, when?
Do we cache it?
Do we do it ahead of time? ...or when the user asks? (I'd *guess* the latter is what Parler is doing, with that database resource list)
...or, do we use some event-based system?
There's a litany of trade-offs to consider with all of these questions.
If you query when the user asks, you're down to the minimum possible waste: only the users asking require work. But: that doesn't scale and gets slower and slower and more and more expensive. Or...
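A minimal sketch of the query-when-asked approach, assuming a simple follow graph and a single numeric "interesting" score per post. The `follows` and `posts` structures and `feed_on_read` are hypothetical in-memory stand-ins for tables and a query, not anyone's actual implementation.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for tables; names are illustrative only.
follows = defaultdict(set)   # user -> accounts they follow
posts = []                   # (post_id, author, score) tuples, append-only

def feed_on_read(user, limit=10):
    """Build the feed at request time by scanning posts from followed authors."""
    followed = follows[user]
    candidates = [p for p in posts if p[1] in followed]
    # "Interesting" is reduced to one score here; ties go to the newer post id.
    candidates.sort(key=lambda p: (p[2], p[0]), reverse=True)
    return [p[0] for p in candidates[:limit]]

follows["alice"] = {"bob", "carol"}
posts.extend([(1, "bob", 5), (2, "dave", 9), (3, "carol", 7), (4, "bob", 1)])
top_two = feed_on_read("alice", limit=2)
print(top_two)
```

Nobody who doesn't ask costs you anything, which is the appeal. But every single request redoes the filter and the sort over a `posts` collection that only ever grows, and that's the part that gets slower and more expensive.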
...you could pre-generate the feed. Again: options!
The main 2 buckets are "do it every so often" (with the caveat of "when the users pull to refresh" being a generation point there too), or "do it as you go" (e.g. with events).
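For the "do it as you go" bucket, here's a small sketch of an event-driven version: when a post is created, it gets pushed into each follower's stored feed, and reading is just a lookup. The names (`on_post_created`, `read_feed`, `FEED_SIZE`) and the in-memory structures are assumptions for illustration; a real system would put a queue and a datastore behind this.

```python
from collections import defaultdict, deque

FEED_SIZE = 3  # assumed cap on stored feed entries per user

followers = defaultdict(set)                          # author -> users following them
feeds = defaultdict(lambda: deque(maxlen=FEED_SIZE))  # user -> newest-first post ids

def on_post_created(post_id, author):
    """Event handler: push the new post into every follower's stored feed."""
    for user in followers[author]:
        feeds[user].appendleft(post_id)  # oldest entry falls off when full

def read_feed(user):
    """Reading is a lookup; no query over posts happens here."""
    return list(feeds[user])

followers["bob"] = {"alice", "erin"}
followers["carol"] = {"alice"}
for post_id, author in [(1, "bob"), (2, "carol"), (3, "bob"), (4, "bob")]:
    on_post_created(post_id, author)

print(read_feed("alice"), read_feed("erin"))
```

Reads get cheap, but notice `on_post_created` does work for every follower whether or not they ever come back to look at the result.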
Why does this matter? Now we care about "active".