this post was submitted on 28 Apr 2025
73 points (97.4% liked)

Fediverse

cross-posted from: https://lemmy.world/post/28808772

Finally released an alpha build for the PeerTube recommendation algorithm!
Basic UI is complete. If you want to try it out, the link is here:
👉 https://github.com/solidheron/peertube_recomendation_algorythm

New features since the last build:

  • Sort by videos that share your time engagement similarity.
  • Sort by videos that share your like similarity.
  • Display of like similarity cosine values.
  • Basic information shown for recommended videos (title, account, and channel names).
  • 404 check for generated instance links (so you don’t get stuck clicking into dead videos—you’ll know which instance hosts the video).
  • De-ranking for previously seen videos (simply a 0.5x multiplier on time and like similarity).
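
A minimal sketch of the ranking described in the list above, assuming the user profile and each video are stored as sparse word-weight vectors. The names `cosine`, `rank_videos`, and the sample data are illustrative and not taken from the extension; only the 0.5x multiplier comes from the post.

```python
import math

SEEN_MULTIPLIER = 0.5  # from the post: previously seen videos get a 0.5x multiplier


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity between two sparse word-weight vectors."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def rank_videos(user_vector, videos, seen_ids):
    """Return (score, video_id) pairs sorted best-first."""
    scored = []
    for video_id, vector in videos.items():
        score = cosine(user_vector, vector)
        if video_id in seen_ids:
            score *= SEEN_MULTIPLIER  # de-rank videos already watched
        scored.append((score, video_id))
    return sorted(scored, reverse=True)


# Hypothetical data: "a" is the closest match but has already been seen.
user_likes = {"linux": 2.0, "kernel": 1.0}
videos = {
    "a": {"linux": 1.0, "kernel": 1.0},
    "b": {"linux": 1.0, "cooking": 1.0},
    "c": {"cooking": 1.0},
}
ranking = rank_videos(user_likes, videos, seen_ids={"a"})
```

Here the seen video "a" drops below the unseen video "b" even though its raw similarity is higher, which is the intended effect of the de-ranking.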

Features from previous builds:

  • Ability to input multiple instance domain names (DNs) and generate playable video links.
  • Limit of 5 recommendations per channel to avoid floods (e.g., during testing, The Linux Experiment would dominate otherwise—this limit is more of a failsafe than a feature).
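
The per-channel limit could be sketched like this, assuming the recommendations are already sorted best-first and each entry carries its channel name. `cap_per_channel` and the sample IDs are hypothetical; only the limit of 5 comes from the post.

```python
from collections import defaultdict

MAX_PER_CHANNEL = 5  # limit stated in the post


def cap_per_channel(ranked, max_per_channel=MAX_PER_CHANNEL):
    """Keep at most max_per_channel entries per channel, preserving order.

    ranked is a list of (video_id, channel) pairs, already sorted best-first.
    """
    kept = []
    counts = defaultdict(int)
    for video_id, channel in ranked:
        if counts[channel] < max_per_channel:
            kept.append((video_id, channel))
            counts[channel] += 1
    return kept


# Hypothetical ranking where one channel would otherwise flood the list.
ranked = [(f"tle-{i}", "The Linux Experiment") for i in range(8)]
ranked.insert(3, ("other-1", "another channel"))
capped = cap_per_channel(ranked)
```

Because the cap is applied after sorting, each channel keeps its five best-scoring videos and everything else moves up.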

Personal thoughts:
I still think cosine similarity beats chronological algorithms.
This algorithm also synergizes with other algorithms—it's great for finding videos that appear next to or below what you're currently watching.

You can also revisit videos you previously liked to help strengthen your like similarity vectors.


Moving forward: basic design philosophies and current issues

There’s an issue I’m calling the “Linux pipeline.”
Basically, Linux-related videos tend to dominate PeerTube’s well-produced content.
Since the algorithm relies on English words in descriptions, titles, and tags, Linux videos—which sometimes have fewer general keywords—end up being more "orthogonal" to typical user vectors, causing lower ranking.
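
To illustrate the "orthogonal" point, here is a rough sketch that builds word-count vectors from titles, descriptions, and tags. The tokenizer, the function names, and the sample videos are assumptions for illustration; the extension may weight or filter words differently.

```python
import math
import re
from collections import Counter


def text_vector(title: str, description: str, tags: list[str]) -> Counter:
    """Count lowercase word occurrences across title, description, and tags."""
    text = " ".join([title, description, *tags])
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical user profile built from one liked music video.
user = text_vector("Live guitar concert", "A live music recording", ["music", "guitar"])

# A video described only in specialist terms shares no words with the profile.
linux = text_vector("Btrfs snapshots on Arch", "Rolling back with snapper", ["btrfs", "arch"])

# A video with general vocabulary overlaps with it.
general = text_vector("Guitar lesson", "Learn a music scale", ["guitar"])

linux_score = cosine(user, linux)
general_score = cosine(user, general)
```

With no shared words the dot product is zero, so the similarity is exactly 0 no matter how well produced the video is.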

Another challenge:
It’s really hard to properly combine like cosine similarity and time engagement cosine similarity.
You could add them, but it doesn’t fully make sense:

  • High like similarity + high time engagement similarity = you probably like and will watch the video longer.
  • But short videos can be liked even if they contribute almost nothing to time engagement (because time engagement is based on percentage watched × video length).
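
A small worked example of why short videos barely register, using the definition from the post (percentage watched × video length). The function name and the sample numbers are made up for illustration.

```python
def time_engagement(fraction_watched: float, length_seconds: float) -> float:
    """Seconds of watch time: fraction watched times video length."""
    return fraction_watched * length_seconds


# A 30-second clip watched to the end versus half of a 30-minute video.
short_clip = time_engagement(1.0, 30)
long_video = time_engagement(0.5, 1800)

# How many times more weight the long video gets in the time-engagement data.
ratio = long_video / short_clip
```

Even a fully watched and liked clip carries 30 times less time engagement than the half-watched long video, so simply adding the two similarities would undervalue liked short videos.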

If I combined them, it would basically enter machine learning territory:
You'd have to adjust proportions dynamically based on user behavior.
Since I want this algorithm scoped to one person only (no data sharing yet), that level of ML is out of scope for now.

(Sharing data across devices could come later—Brave browser has sync features, and PeerTube watch history syncing could be possible.)


Summary:
Most of the data structure is settling into place.
Future updates will probably focus on expanding the data structure and making small improvements.

top 8 comments
[–] onlinepersona@programming.dev 8 points 1 day ago (1 children)

What is this? It jumps in explaining features and details about other stuff, but doesn't explain the basic goal. There are also no screenshots except of some table. It's not clear how to use this thing.

Anti Commercial-AI license

[–] Cattail@lemmy.world 4 points 1 day ago

It's a browser extension for Brave (or any Chromium-based browser); that's in the GitHub README. I thought "recommendation algo" was self-explanatory: it's meant to recommend you videos on PeerTube. I only screenshotted the one UI that exists; the only other things I could screenshot are variables stored in IndexedDB and the extension's local storage.

Also, the installation instructions are in the GitHub README.

[–] iso@lemy.lol 7 points 1 day ago* (last edited 1 day ago)

I think open discovery algorithms are the way. We are against algos but sorting by like similarity would be beneficial.

What are you guys thinking? @dessalines@lemmy.ml @nutomic@lemmy.ml Are you optimistic about this or fuck any algorithms?

[–] cyrano@lemmy.dbzer0.com 4 points 1 day ago (1 children)
[–] atro_city@fedia.io 8 points 1 day ago (1 children)

I don't even know what "it" is. A recommendation algorithm? But peertube already has a "similar" video section to the right of all videos. Does this replace that? Techies really have a problem with presenting stuff to laymen.

[–] cyrano@lemmy.dbzer0.com 0 points 1 day ago

Yeah the algo recommendation

[–] ArtificialHoldings@lemmy.world 3 points 1 day ago (1 children)

Here’s a cleaned-up version of your Lemmy post that keeps your tone but improves clarity, flow, and grammar:

Did they forget to delete ChatGPT's bit or did they intentionally copy the whole thing lol

[–] Cattail@lemmy.world 2 points 1 day ago

Lol, I didn't copy the whole prompt. I deleted the bit at the end, but it was late and I was tired, so I used an AI just to fix my original text.