people are building tiktok clones on atproto now. instagram alternatives. livestreaming platforms. photo-sharing apps. the atmosphere is getting media heavy.
here's the thing: the whole premise of atproto is that your data lives on your PDS, not the app's servers. your posts, your follows, your images, your videos are all stored in your personal data repository. the app is just an interface.
this is beautiful in theory. in practice? media is expensive. storage is cheap, bandwidth is not. and "unlimited storage" is a subsidy that can't last forever.
i've been building pollen, a tumblr-style app on atproto. last night i added video uploads. let me tell you about the constraints i hit. or should have hit!!!
the 10mb wall
bluesky's hosted PDS has a ~10MB per-blob upload limit. that's the individual file size cap. sounds reasonable until you realize:
a 30-second 1080p video is easily 50-100MB
a high-res photo from a modern phone is 5-15MB
a short screen recording can hit 20MB instantly
so when someone uploads a video to pollen, i can't just pass it through to their PDS. i have to process it first.
ffmpeg to the rescue (sort of)
here's what pollen does now:
export async function processVideo(
  inputPath: string,
  maxSizeMB = 10
): Promise<ProcessVideoResult> {
  const info = await getVideoInfo(inputPath);
  // already small enough?
  const currentSizeMB = info.size / (1024 * 1024);
  if (currentSizeMB <= maxSizeMB && info.width <= 720) {
    return { outputPath: inputPath, wasProcessed: false, ... };
  }
  // calculate target bitrate for desired file size
  const targetBitrate = calculateTargetBitrate(info.duration, maxSizeMB * 0.9);
  // two-pass x264 encoding, max 720p
  await runFfmpeg(pass1Args);
  await runFfmpeg(pass2Args);
  return { outputPath, wasProcessed: true, ... };
}
every video gets:
downscaled to 720p max
re-encoded with x264
bitrate calculated to fit under 10MB based on duration
two-pass encoding for optimal quality at that bitrate
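a rough sketch of how those four steps could turn into ffmpeg arguments. the flags themselves are standard ffmpeg options, but buildTwoPassArgs, the 96k audio bitrate and the pass log prefix are assumptions for illustration, not pollen's actual code:

```typescript
// Hypothetical helper: builds the argument lists for a two-pass x264 encode.
// Function name, audio bitrate, and log prefix are illustrative assumptions.
interface TwoPassArgs {
  pass1: string[];
  pass2: string[];
}

function buildTwoPassArgs(
  inputPath: string,
  outputPath: string,
  videoKbps: number,
  audioKbps = 96,
  passLogPrefix = "ffmpeg2pass",
): TwoPassArgs {
  // Cap the height at 720 without upscaling; -2 keeps the aspect ratio
  // and forces an even width, which x264 requires.
  const scale = "scale=-2:'min(720,ih)'";
  const common = [
    "-y", "-i", inputPath,
    "-c:v", "libx264",
    "-b:v", `${Math.floor(videoKbps)}k`,
    "-vf", scale,
    "-passlogfile", passLogPrefix,
  ];
  return {
    // Pass 1 only gathers statistics: no audio, output thrown away.
    // (/dev/null is the Unix form; on Windows the null device is NUL.)
    pass1: [...common, "-pass", "1", "-an", "-f", "null", "/dev/null"],
    // Pass 2 writes the real file using the statistics from pass 1.
    pass2: [
      ...common, "-pass", "2",
      "-c:a", "aac", "-b:a", `${audioKbps}k`,
      "-movflags", "+faststart",
      outputPath,
    ],
  };
}
```

each array is meant to be handed to something like child_process.spawn("ffmpeg", args), so nothing here goes through a shell.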
it works! but a 60-second video at 720p with a 10MB cap means ~1.3 Mbps bitrate. that's... fine. not great. definitely not "i'm building the next tiktok" quality.
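here's the arithmetic behind that number as a small sketch. estimateBitrateKbps is a hypothetical stand-in (the real calculateTargetBitrate isn't shown in this post), and the 96 kbps audio budget is an assumption:

```typescript
// Hypothetical sketch of a "fit under N MB" bitrate calculation.
// The 96 kbps audio reservation is an assumed value, not pollen's setting.
interface BitrateEstimate {
  totalKbps: number; // video + audio combined
  videoKbps: number; // what is left for the video stream
}

function estimateBitrateKbps(
  durationSeconds: number,
  targetSizeMB: number,
  audioKbps = 96,
): BitrateEstimate {
  if (durationSeconds <= 0) {
    throw new RangeError("durationSeconds must be positive");
  }
  // Size budget in bits, spread evenly over the duration.
  const totalBits = targetSizeMB * 1024 * 1024 * 8;
  const totalKbps = totalBits / durationSeconds / 1000;
  // Reserve the audio bitrate; never return a negative video bitrate.
  const videoKbps = Math.max(totalKbps - audioKbps, 0);
  return { totalKbps, videoKbps };
}
```

for a 60-second clip and a 9MB target (90% of the 10MB cap, leaving headroom for container overhead), this gives roughly 1,258 kbps in total and about 1,162 kbps for the video stream.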
the per-blob limit is annoying but workable. the bigger question is total storage per user.
right now, bluesky's hosted PDS offers effectively unlimited storage. your repo can grow and grow. this is a subsidy: bluesky is eating the storage costs to bootstrap the network.
but think about what happens when media-heavy apps take off:
a flashes creator (tiktok-style) posting short videos: 10GB/month easy
a grain photographer sharing galleries: 2-5GB/month
a pinksea (classic oekaki / art platform) artist creating daily high-res pieces: 1-2GB/month
multiply by millions of users
someone's going to pay for that storage. the question is who and how.
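to put a number on "multiply by millions", here's a back-of-the-envelope sketch. the function names are made up for illustration, and the price per TB is a parameter you supply, not a quote from any real provider:

```typescript
// Back-of-the-envelope storage growth. Uses decimal units (1 TB = 1000 GB).
function monthlyGrowthTB(users: number, gbPerUserPerMonth: number): number {
  return (users * gbPerUserPerMonth) / 1000;
}

// Stored media accumulates, so the bill in month N covers everything
// uploaded so far. pricePerTBMonth is a placeholder supplied by the caller.
function monthlyBillAfter(
  months: number,
  growthTBPerMonth: number,
  pricePerTBMonth: number,
): number {
  return months * growthTBPerMonth * pricePerTBMonth;
}
```

if a million users posted at the flashes rate above (10GB/month), monthlyGrowthTB(1000000, 10) comes out to 10,000 TB, or 10PB of new media every month.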
the bandwidth problem
and really, storage is the easy part. hook it up to s3 or something s3-compatible and you're fine for a long time. bandwidth is where it gets harder.
when you view a pollen post with an image, here's what happens:
your browser requests the image
pollen proxies the request to the author's PDS
the PDS serves the blob
pollen caches the image aggressively and then passes it through to you
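a minimal sketch of that flow. com.atproto.sync.getBlob is the real XRPC endpoint for fetching a blob from a PDS; everything else here (blobUrl, BlobProxyCache, the injected fetcher) is hypothetical and not pollen's actual proxy:

```typescript
// Builds the getBlob URL for a blob on the author's PDS.
// Assumption: pdsOrigin has already been resolved from the author's DID.
function blobUrl(pdsOrigin: string, did: string, cid: string): string {
  const base = pdsOrigin.replace(/\/+$/, "");
  return `${base}/xrpc/com.atproto.sync.getBlob` +
    `?did=${encodeURIComponent(did)}&cid=${encodeURIComponent(cid)}`;
}

// Blobs are addressed by content hash (CID), so the bytes behind a URL
// never change and long-lived caching is safe.
const IMMUTABLE_CACHE_CONTROL = "public, max-age=31536000, immutable";

type BlobFetcher = (url: string) => Promise<Uint8Array>;

// Caches the promise rather than the bytes, so concurrent requests for the
// same blob share a single origin fetch. Unbounded: a sketch, not production.
class BlobProxyCache {
  private readonly entries = new Map<string, Promise<Uint8Array>>();
  private readonly fetchBlob: BlobFetcher;

  constructor(fetchBlob: BlobFetcher) {
    this.fetchBlob = fetchBlob;
  }

  get(url: string): Promise<Uint8Array> {
    const cached = this.entries.get(url);
    if (cached !== undefined) return cached;
    const pending = this.fetchBlob(url);
    this.entries.set(url, pending);
    // Forget failed fetches so a later request can retry.
    pending.catch(() => { this.entries.delete(url); });
    return pending;
  }
}
```

the proxy would send IMMUTABLE_CACHE_CONTROL as the Cache-Control header on its responses so browsers and any CDN in front can hold on to the image too.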
now imagine a post goes viral. 100,000 people want to see that image. without a cache in front of it, that's 100,000 requests hitting the author's PDS.
if you're on bluesky's hosted PDS, bluesky absorbs that. they have CDN infrastructure, caching, the whole deal.
but the dream of atproto is self-hosted PDSes. what happens when your self-hosted PDS on a $5/month VPS suddenly needs to serve 100,000 image requests?
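a quick sketch of what that means in bytes. originEgressGB is a hypothetical helper, and the 2MB average image size used below is an assumption, not a measurement:

```typescript
// Rough origin bandwidth for a burst of requests, in decimal GB.
// cacheHitRatio is the fraction of requests served by a cache or CDN
// before they ever reach the PDS (0 = no cache, 1 = everything cached).
function originEgressGB(
  requests: number,
  avgBlobMB: number,
  cacheHitRatio = 0,
): number {
  if (cacheHitRatio < 0 || cacheHitRatio > 1) {
    throw new RangeError("cacheHitRatio must be between 0 and 1");
  }
  return (requests * avgBlobMB * (1 - cacheHitRatio)) / 1000;
}
```

with no cache, 100,000 requests for a 2MB image is about 200GB leaving the VPS. with a 99% hit ratio in front of it, the PDS serves roughly 2GB.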
the missing layer: media CDN infrastructure
microcosm has built amazing community infrastructure for atproto:
slingshot for identity caching
constellation for backlink tracking
spacedust for real-time interactions
what we don't have yet: a community CDN layer for media.
imagine something like:
blob caching at the edge
automatic cache population from the firehose
bandwidth pooling across indie PDS operators
"media relay" infrastructure parallel to the existing firehose relays
this feels like it needs to exist for the self-hosted PDS dream to work at scale (someone else build it and i'll pay for it pls)
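the "blob caching at the edge" piece could start as small as a size-bounded LRU cache keyed by CID. this is a sketch of that one idea only (BlobLruCache is a made-up name); it says nothing about how a real media relay would discover or distribute blobs:

```typescript
// Byte-bounded LRU cache keyed by blob CID. Relies on Map preserving
// insertion order: the first key is always the least recently used.
class BlobLruCache {
  private readonly entries = new Map<string, Uint8Array>();
  private readonly maxBytes: number;
  private usedBytes = 0;

  constructor(maxBytes: number) {
    this.maxBytes = maxBytes;
  }

  get(cid: string): Uint8Array | undefined {
    const blob = this.entries.get(cid);
    if (blob === undefined) return undefined;
    // Re-insert to mark this entry as most recently used.
    this.entries.delete(cid);
    this.entries.set(cid, blob);
    return blob;
  }

  set(cid: string, blob: Uint8Array): void {
    if (blob.byteLength > this.maxBytes) return; // too big to cache at all
    const existing = this.entries.get(cid);
    if (existing !== undefined) {
      this.usedBytes -= existing.byteLength;
      this.entries.delete(cid);
    }
    this.entries.set(cid, blob);
    this.usedBytes += blob.byteLength;
    // Evict from the least recently used end until we fit again.
    for (const [oldCid, oldBlob] of this.entries) {
      if (this.usedBytes <= this.maxBytes) break;
      this.entries.delete(oldCid);
      this.usedBytes -= oldBlob.byteLength;
    }
  }

  get sizeBytes(): number {
    return this.usedBytes;
  }
}
```

keying by CID means the same blob fetched through different apps lands in the same cache slot, which is part of why a shared layer is appealing.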
futures
i see a few possible directions:
1. storage tiers on hosted PDSes
bluesky (or other PDS hosts) introduce paid tiers:
free: 5GB total storage
$5/month: 50GB
$20/month: 500GB
this is the boring but realistic answer. it's how email works (gmail gives you 15GB free, then you pay). it's how most cloud storage works.
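as a data structure this is about as simple as it gets. the tiers below are the hypothetical ones from this post, and the tier names and cheapestTierFor are invented for the sketch:

```typescript
// The hypothetical tiers from this post; no PDS host has announced these.
interface StorageTier {
  name: string;
  monthlyUSD: number;
  quotaGB: number;
}

// Sorted from cheapest to most expensive.
const STORAGE_TIERS: StorageTier[] = [
  { name: "free", monthlyUSD: 0, quotaGB: 5 },
  { name: "plus", monthlyUSD: 5, quotaGB: 50 },
  { name: "pro", monthlyUSD: 20, quotaGB: 500 },
];

// Returns the cheapest tier whose quota covers the usage,
// or undefined if the repo is too big for every tier.
function cheapestTierFor(usageGB: number): StorageTier | undefined {
  return STORAGE_TIERS.find((tier) => usageGB <= tier.quotaGB);
}
```

at the 10GB/month rate from earlier, a video creator would outgrow the free tier within the first month and the $5 tier in about five.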
the nice thing about atproto: you can migrate your PDS. don't like bluesky's pricing? move to a different host. your data comes with you.
2. self-hosted PDS becomes normal
like running your own mastodon instance, but for data instead of a full social app.
the official PDS repo already has docker images and setup guides. it's not trivial, but it's doable for technical users.
the challenge: you need to solve the bandwidth problem yourself. cloudflare in front of your PDS? bunny CDN? this is where most people will give up.
3. app-specific media hosting
apps like pollen could offer to host media on behalf of users, while still maintaining the atproto identity model.
your posts would still be in your PDS, but the blob refs would point to pollen's CDN instead of your PDS directly. you'd trade some "true ownership" for practical scalability.
this feels like a compromise, but maybe a reasonable one for media-heavy apps. the catch: it makes migration much more difficult.
4. reference counting and garbage collection
delete a post? the blob might still be in your repo! orphaned forever. taking up space.
at some point, PDSes will need tooling to:
identify orphaned blobs
let users reclaim space
maybe auto-clean after some period
this is especially important if storage tiers become a thing.
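identifying orphans is basically a set difference: every blob the PDS stores, minus every blob some record still references. a sketch, assuming you already have the stored CIDs (for example from com.atproto.sync.listBlobs) and the repo's records as JSON. the function names are made up, and only the current blob ref shape is handled, not the legacy one:

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Walks a record and collects the CID of every blob ref it contains.
// A blob ref in JSON looks like:
//   { "$type": "blob", "ref": { "$link": "<cid>" }, "mimeType": "...", "size": 123 }
function collectBlobCids(value: Json, found: Set<string> = new Set()): Set<string> {
  if (Array.isArray(value)) {
    for (const item of value) collectBlobCids(item, found);
  } else if (value !== null && typeof value === "object") {
    const ref = value["ref"];
    if (
      value["$type"] === "blob" &&
      ref !== null && typeof ref === "object" && !Array.isArray(ref) &&
      typeof ref["$link"] === "string"
    ) {
      found.add(ref["$link"]);
    } else {
      for (const child of Object.values(value)) collectBlobCids(child, found);
    }
  }
  return found;
}

// Stored blobs that no record references any more.
function findOrphanedBlobs(storedCids: string[], records: Json[]): string[] {
  const referenced = new Set<string>();
  for (const record of records) collectBlobCids(record, referenced);
  return storedCids.filter((cid) => !referenced.has(cid));
}
```

a real tool would also want a grace period before deleting anything, since a blob gets uploaded before the record that references it is written.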
what i'm doing in pollen
for now, pollen works within the constraints:
aggressive video compression (720p max, calculated bitrate)
server-side ffmpeg processing before PDS upload
rejecting uploads over 100MB even before processing
aggressive cache headers on an image proxy serving media from pollen
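the upload-side rules boil down to a three-way decision. this sketch mirrors the thresholds mentioned in this post (100MB reject, 10MB blob cap, the 720 width check from processVideo), but classifyUpload itself is a hypothetical helper, not pollen's code:

```typescript
type UploadDecision = "reject" | "passthrough" | "transcode";

const BYTES_PER_MB = 1024 * 1024;

// Decides what to do with an incoming video before any ffmpeg work starts.
function classifyUpload(
  sizeBytes: number,
  width: number,
  blobCapMB = 10,
  rejectOverMB = 100,
): UploadDecision {
  const sizeMB = sizeBytes / BYTES_PER_MB;
  // Too big to even attempt: fail fast instead of burning CPU on it.
  if (sizeMB > rejectOverMB) return "reject";
  // Already under the blob cap and small enough: upload untouched.
  if (sizeMB <= blobCapMB && width <= 720) return "passthrough";
  // Everything else gets downscaled and re-encoded to fit the cap.
  return "transcode";
}
```

checking the size first keeps the expensive path (probing and transcoding) away from files that would be refused anyway.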
it's not perfect. video quality is acceptable, not great. but it works, and users actually own their content.
the architecture means i don't need massive storage infrastructure. each user's media lives on their PDS. pollen is just the interface layer plus a read cache.
if pollen somehow goes viral, i don't need 10x the storage; the users' PDSes absorb it. that's the magic of the model! and maybe the downfall idk.
the optimistic take
i think this is solvable. the architecture allows for solutions:
CDN layers can be added without changing the protocol (and i bet some PDS hosts are already doing this)
storage tiers can be introduced by PDS hosts
community infrastructure can emerge (it already is with microcosm)
apps can get creative with compression and caching
the question is whether we'll build this infrastructure before the "unlimited storage" subsidy runs out.
or maybe the answer is simpler: cheap storage + smart caching + users pay for premium. that's worked for basically every other cloud service.
either way, i'm going to keep building pollen and hitting these constraints. every workaround teaches me something about what the protocol actually needs.
if you're building media-heavy apps on atproto, i'd love to hear what constraints you're hitting. find me on bluesky or check out pollen.