Use resources available through your library's website
Help, I'm in a loop - between this ask Lemmy and this comment
You know how unhinged our grasp on reality is, right?
Well, OP just said "efficiently" ... nothing about the quality. So you are technically correct.
SearXNG with Brave, DuckDuckGo, Google, Mullvad Leta, Mullvad Leta (Brave) and Qwant as the search engines. The law of large numbers makes it quite useful.
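For anyone curious how pooling several engines can help, here is a minimal sketch of merging ranked result lists. It uses reciprocal rank fusion, which is a swapped-in technique for illustration and not a description of how SearXNG actually scores results. The function name `merge_results`, the constant `k=60`, and the engine names and URLs in the usage note are all assumptions.

```python
from collections import defaultdict


def merge_results(results_by_engine, k=60):
    """Merge ranked URL lists from several engines into one list.

    results_by_engine: dict of engine name -> list of URLs, best first.
    k: damping constant for reciprocal rank fusion (60 is an assumed default).
    """
    scores = defaultdict(float)
    for urls in results_by_engine.values():
        seen = set()
        for rank, url in enumerate(urls, start=1):
            if url in seen:
                # Count each URL once per engine.
                continue
            seen.add(url)
            # A URL earns more the higher it ranks and the more engines list it.
            scores[url] += 1.0 / (k + rank)
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda u: (-scores[u], u))
```

Calling `merge_results({"engine_a": ["x", "y"], "engine_b": ["y", "z"]})` puts `y` first, because it is the only URL both hypothetical engines returned.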
There are some paid options that are pretty good (I’m thinking Kagi).
Easy, but one obvious downside.
Does Kagi let you add a domain to a denylist (like a new, well-SEOed genAI site with inaccuracies you've noticed), or positively bias search results (like saying you want Wikipedia entries high in the list)?
It does. You can outright block domains, rank them higher or lower and I think even pin them to the top.
It’s one of their best features. No ads is the best one, since that also means you get real results and no “sponsored” bullshit. They also have AI slop filters.
For whatever reason, Wikipedia seems to have been pushed far down the page on search engines, specifically for medical information. It's a shame, because I can get surface-level information (which is all I ever really need) way faster from Wikipedia than from the other sites that come to the top of the list (Mayo Clinic, Johns Hopkins, Cleveland Clinic, government sites).
I really shouldn't complain about it too much, because they could be pushing pseudoscience blogs.
If you're interested in building a new general purpose search engine, it probably makes the most sense to start with Common Crawl's data set and augment it rather than starting from scratch.
For research in STEM, look for sites like ResearchGate and others that host peer-reviewed papers. Articles, magazines, and blogs are not good sources unless they cite the research paper and link you to the proper site. It is also important not to take findings out of context, which might lull people into pseudoscientific beliefs. Some people jump the gun on these sites, which are basically articles, often with dumbed-down wording. Universities and colleges often have access to most, if not all, of the papers that are usually behind publishers' paywalls; if you can somehow get access to those, go for it.
By using books from a shadow library like Anna's Archive (you can use Wikipedia to find the right domains), you can read previously written material on academic subjects, relevant books on various subjects from the pre-internet era, and so forth. Some users from newer fields (such as 3D printing/CAD) are going as far as uploading their own PDF works to Anna's Archive for distributed access.
Go back to 2022 and run your search then
Deny list plugins!?? I'd been looking for a search engine with that built in. It seems so obvious. I didn't even think to look up a plugin. I had been writing keyword searches for browsers that manually added the query params for particularly frustrating results.
Kagi has that feature built in, though it is a paid search engine.
Just found uBlacklist.
Now to find something for whitelist searches (basically I only ever want recipes or medical information from a small list of sites).
Edit: DuckDuckGo has the capability built in, too
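In case it helps, here is a rough sketch of the "small list of sites" idea done by hand with the `site:` operator, much like a browser keyword search that adds the query params for you. The helper names `allowlist_query` and `search_url` are made up for this example, the domains are placeholders, and how well `OR`-combined `site:` filters work varies by search engine, so treat that part as an assumption.

```python
from urllib.parse import urlencode


def allowlist_query(terms, sites):
    """Append a site: filter for each allowed domain to the search terms."""
    if not sites:
        return terms
    site_filter = " OR ".join(f"site:{s}" for s in sites)
    return f"{terms} ({site_filter})"


def search_url(terms, sites, base="https://duckduckgo.com/"):
    """Build a search URL carrying the restricted query in the q parameter."""
    return base + "?" + urlencode({"q": allowlist_query(terms, sites)})


# Placeholder domains; swap in the recipe or medical sites you actually trust.
print(search_url("lentil soup", ["example.org", "example.com"]))
```

Keeping the allowed domains in one list per topic (recipes, medical) means the same helper can serve as several separate keyword searches.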
I know you came here for answers, but how would one start making their own metasearch engine? Do you have any guides to point me towards? I hate Google so much I’m willing to learn to make my own search engine.
- StartPage, Mojeek, SearXNG, YaCy
- hyperlink surfing "extranets", as you would WikiMedia WikiPedia InternetArchive FediVerse posts etc.
- webscrapers like Monolith etc. for offline PIR and just as you say convenience of having it all there
i look forward to reading what you come up with, because i am still kinda at the theoretical stage with keeping such a knowledgebase.
edit: i keep thinking a plaintext document of information is way simpler to deal with than webpages. at what point is information posted online preserved in its "original" form? just dumping this FediThread into a plaintext file or a folder of plaintext files with names being 'hierarchy•postID•username' or something so it is presented self-organized.
OP is ¤, 1st rank comments are ¤a ¤b ¤c and 2nd rank comments attached to comment ¤a are ¤a-a ¤a-b ¤a-c and 3rd rank comments attached to ¤a-c are ¤a-c-a ¤a-c-b ¤a-c-c so on. this then lists itself in a self-organized way, given all ASCII & unicode characters are provided in order. not just a-Z... because that would limit size of posts to take on.
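A small sketch of that naming scheme, assuming a thread is a nested dict with hypothetical `id`, `user` and `replies` keys. To keep it short it only uses the letters a-z, so it refuses comments with more than 26 direct replies; the fuller character range described above is not implemented here.

```python
import string


def label_thread(post, path="¤"):
    """Yield (path, file_name) for a post and its replies, depth first."""
    yield path, f"{path}•{post['id']}•{post['user']}"
    replies = post.get("replies", [])
    if len(replies) > len(string.ascii_lowercase):
        raise ValueError("this sketch only labels up to 26 replies per comment")
    for letter, reply in zip(string.ascii_lowercase, replies):
        # First-rank comments are ¤a, ¤b, ...; deeper ranks add -a, -b, ...
        child = f"{path}{letter}" if path == "¤" else f"{path}-{letter}"
        yield from label_thread(reply, child)


# Made-up thread: one post, two comments, one reply to the first comment.
thread = {
    "id": 1,
    "user": "op",
    "replies": [
        {"id": 2, "user": "ann", "replies": [{"id": 4, "user": "cy"}]},
        {"id": 3, "user": "bob"},
    ],
}
for _, name in label_thread(thread):
    print(name)
```

One thing to watch: sorting the hierarchy part alone keeps replies under their parent, but sorting whole file names may not, because `-` sorts before `•` in Unicode order, so `¤a-a•…` would land ahead of `¤a•…`.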
of course more difficult and complicated solutions like selfhosting webservers and managing ports and databases exist... not that i grasp the necessity for so many services.