this post was submitted on 07 Sep 2025
Bluesky
I agree, and anyone who is interested may enjoy this awesome blog post from Mohammed AlQuraishi. It gets pretty deep into the architecture of how AlphaFold works, but it's good at conveying the scope of what makes AlphaFold so impressive, even for curious laypeople (provided those laypeople are willing to skim over the parts they don't understand).
AlphaFold is super cool, but I see the same problem with the rhetoric around it as I do with a lot of AI: overconfidence in the model's capabilities and a lack of understanding of which problems it's actually useful for. For example, Google published AlphaFold's predicted structures for every protein in the human proteome, including proteins for which we have no experimental structural data.
That's super cool and useful, but it wouldn't be accurate to say that AlphaFold has "solved" the entire human proteome, as I often saw reported back when this was released. A huge chunk of the human proteome is referred to as "the dark proteome" because of how little we understand about those proteins, and AlphaFold's predictions for them are far less reliable precisely because there isn't much training data to go on.
To give a general example: most of the high-resolution, experimentally determined structures in the Protein Data Bank (which AlphaFold was trained on) were solved using X-ray crystallography, which isn't great at resolving long, unstructured loop regions or other highly flexible areas. That doesn't mean AlphaFold or other computational tools (like RoseTTAFold) are useless in those areas, but it does mean you need to critically consider a tool in the context of the problem you're actually trying to solve.
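As a concrete illustration of that kind of critical check: AlphaFold writes its per-residue confidence score (pLDDT, on a 0–100 scale) into the B-factor column of the PDB files it outputs, so you can flag the regions you shouldn't trust as rigid structure before drawing conclusions from a model. This is just a minimal sketch, not real AlphaFold tooling — the parsing follows the standard fixed-column PDB format, the tiny example lines are hand-made (not actual AlphaFold output), and the cutoff of 70 is just the commonly cited "low confidence" threshold:

```python
def low_confidence_residues(pdb_lines, threshold=70.0):
    """Return residue numbers whose CA atom has pLDDT below threshold.

    Assumes AlphaFold-style PDB output, where the per-residue pLDDT
    confidence is stored in the B-factor column (columns 61-66).
    """
    flagged = []
    for line in pdb_lines:
        # ATOM records use fixed columns: atom name in 13-16,
        # residue number in 23-26, B-factor (here: pLDDT) in 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            res_num = int(line[22:26])
            plddt = float(line[60:66])
            if plddt < threshold:
                flagged.append(res_num)
    return flagged

# Tiny hand-made example (not real AlphaFold output): residue 1 is
# high-confidence (pLDDT 92.5), residue 2 is low-confidence (55.3).
example = [
    "ATOM      1  CA  MET A   1      11.104  13.207   9.100  1.00 92.50           C",
    "ATOM      2  CA  GLY A   2      12.000  14.100  10.000  1.00 55.30           C",
]
print(low_confidence_residues(example))  # [2]
```

Residues flagged this way often correspond to exactly those flexible loops and disordered regions the crystallographic training data covers poorly, which is why a low pLDDT stretch is better read as "probably flexible or unknown" than as a confident structural prediction.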
One might think, "okay, but surely the scientists working in this space know this stuff," but I have anecdotally seen a split in the community that feels analogous to what we see in AI discourse more generally: a sharp divide between the pro- and anti- camps. People in the "anti-" camp aren't necessarily opposed to AlphaFold and other machine learning tools, but they're often completely oblivious to how those tools work and how they, as experimental scientists, could interact with them. This is exacerbated by the pro- side being pretty hard to parse as an outsider.

I'd describe myself as being in the pro- camp, because I've tinkered with this stuff and understand enough about how it works under the hood to be blown away by how cool it is. Even so, I often feel like I need to take the position of "machine learning buzzkill" to temper the excessive hype I see coming from others on the pro- side of the debate. There's a nebulous sense of distrust towards these tools, and I understand why (especially when non-ML researchers are also increasingly sick of seeing AI shoved everywhere in their regular lives).
The root problem here isn't necessarily the machine learning, though. Personally, I see this as a sort of separate subfield, and the communication difficulties that arise (such as overhype making it hard for outsiders to tell what's worth caring about) can be attributed to the "publish or perish" pressure that's prevalent across basically all fields of research.