Author |
Topic |
Italian Archmage Karsus
Learned Scribe
126 Posts |
Posted - 12 Nov 2022 : 00:02:53
|
Hey, guys, I've been working on and off for a year now on a tool to find Realmslore quickly. We've been using a prototype at the wiki, it's been invaluable. If anyone wants to take a look, it's here at
https://askvalhaeria.com
Basically, you input a subject and it tries to find matches in existing sources. The database on Ask Valhaeria so far only includes web sources that were put out for free by Wizards or TSR, so it shouldn't step on any legal toes. I might add search data for other books to that database later; the database doesn't actually store the books, it only stores searches it has already made inside the books, so it shouldn't be anywhere near infringement.
Right now it's ugly as sin, I've been putting off a migration and redesign. But I could really use some feedback, if anyone has two cents to share about it. Next I gotta change the way it shows links so it's a little more user friendly, and then make all of the webpage way, way prettier.
It has other functions, too. If you click "ZOOM" on a source, it restricts your searches to that source alone. That lets us at the wiki find where in a book a specific word or sequence shows up without having to sift through the whole book by hand. It also accepts multiple queries separated by bullet points, so that we can copy-paste straight off the wiki. The last function is a plagiarism search: it finds matches for longer sequences (minimum 5 or so words), letting us catch plagiarism in the wiki in a blink. Since plagiarism exposes us to legal action, it's crucial we don't let it sit for long!
I've been wondering what else I could add to Valhaeria to make it more useful, but in the meantime, I got told you peeps would enjoy it. So I figured I'd show ya.
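For the curious, the plagiarism check can be pictured roughly like this. It's a minimal sketch and not Valhaeria's actual code: it assumes both texts are on hand as plain strings, and the names `words`, `shingles` and `shared_runs`, as well as the sample sentences, are made up for illustration. It lists every run of 5 consecutive words that two texts have in common.

```python
import re

def words(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def shingles(text, size=5):
    """Return the set of every run of `size` consecutive words in the text."""
    w = words(text)
    return {tuple(w[i:i + size]) for i in range(len(w) - size + 1)}

def shared_runs(candidate, source, size=5):
    """List the runs of `size` words that appear in both texts, sorted."""
    common = shingles(candidate, size) & shingles(source, size)
    return sorted(" ".join(run) for run in common)

# Invented sample sentences, only here to exercise the functions.
source = "The archmage raised his staff and the tower fell silent at once."
wiki_draft = "Legend says he raised his staff and the tower fell silent that night."

hits = shared_runs(wiki_draft, source)
for run in hits:
    print(run)
```

A minimum run length is what keeps short, common phrases from being flagged; the 5 here simply mirrors the "5 or so words" mentioned above.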
|
|
Karthak
Seeker
63 Posts |
Posted - 12 Nov 2022 : 12:42:52
|
Seems very useful, only thing I could think of adding is an index to show users what terms are currently searchable. |
|
|
Athreeren
Learned Scribe
144 Posts |
Posted - 12 Nov 2022 : 15:33:51
|
Thanks, this seems very useful! However, I have tried lots of terms that give me results on the wiki but not on this platform. How does the platform search for terms? Do you have a database that was filled manually, or did you scrape the wiki? |
|
|
Italian Archmage Karsus
Learned Scribe
126 Posts |
Posted - 12 Nov 2022 : 22:26:44
|
Thanks for your feedback!
Karthak, there are 11 million terms in the database that it has indexes for. They can't be displayed, as it stores not just individual words but 2-word and even 3-word sequences. The list would be endless and hard to navigate; it's easiest to just ask for a term yourself and see if it's there.
Athreeren, it's not wiki scraped, it's only indexing a specific subset of the sources. Later versions may include more sources; the database scrapes sources, not wiki pages. We use Valhaeria to help us research subjects and write wiki pages, not the other way around. |
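To give an idea of why the term list balloons, here is a minimal sketch of an index over 1-, 2- and 3-word sequences. It is an illustration under assumptions, not the real database: `tokenize`, `build_index` and `lookup` are hypothetical names, the two sample pages are invented, and the index keeps only (source, page) locations rather than the text itself.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def build_index(pages, max_n=3):
    """Map every 1..max_n word sequence to the (source, page) pairs it occurs on.

    `pages` is an iterable of (source_name, page_number, text) tuples.
    """
    index = defaultdict(set)
    for source, page, text in pages:
        tokens = tokenize(text)
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                index[" ".join(tokens[i:i + n])].add((source, page))
    return index

def lookup(index, query):
    """Return the sorted locations for a query of up to three words."""
    key = " ".join(tokenize(query))
    return sorted(index.get(key, set()))

# Invented sample pages; the source name is a placeholder.
pages = [
    ("Example Web Column", 1, "Khelben Arunsun walked the streets of Waterdeep."),
    ("Example Web Column", 2, "The streets of Waterdeep were quiet."),
]
index = build_index(pages)
print(lookup(index, "streets of Waterdeep"))
print(lookup(index, "Khelben"))
```

Even these two short sentences produce dozens of distinct keys, which is why dumping the whole term list isn't practical.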
|
|
Karthak
Seeker
63 Posts |
Posted - 13 Nov 2022 : 02:09:00
|
Ah right, I was figuring it was a short list of maybe 10,000 terms. No point in having an index, even if it were possible, when most search terms will return some sources.
Is there a difference between Database A and B?
|
|
|
Italian Archmage Karsus
Learned Scribe
126 Posts |
Posted - 13 Nov 2022 : 12:31:47
|
Karthak, there is no difference between the two databases. They are a holdover from the prototype; I should remove them later. |
Edited by - Italian Archmage Karsus on 13 Nov 2022 12:35:40 |
|
|
sleyvas
Skilled Spell Strategist
USA
11829 Posts |
Posted - 14 Nov 2022 : 19:32:34
|
searched mingari - no results
searched savyels (for the entry of a bounty hunter in powers and pantheons) - no results
searched Lauzoril - lot of results, but not to anything I would have expected (i.e. nothing to Dreams of the Red Wizards, spellbound, etc...)
... but don't see how you can read the results, it just tells you where to look. |
Alavairthae, may your skill prevail
Phillip aka Sleyvas |
|
|
Italian Archmage Karsus
Learned Scribe
126 Posts |
Posted - 14 Nov 2022 : 22:12:39
|
That's right, sleyvas. It just tells you where to look, but the idea is that it never misses anything, so it helps you sift that much faster.
Valhaeria right now runs on web-only stuff, and I'm working on adding links to every web-available source. I'm also working on a prototype with books indexed in it, but it's not quite ready for primetime; we just use it internally in the wiki. Even then, it still only tells us where to look, it just happens to do so in a comfortable blink. |
Edited by - Italian Archmage Karsus on 14 Nov 2022 22:14:56 |
|
|
BadCatMan
Senior Scribe
Australia
401 Posts |
Posted - 15 Nov 2022 : 02:35:07
|
Consider it the shareware demo version.
But I've been using the full prototype version, with sourcebooks, novels, and magazines, as well as the free articles you get in Ask Valhaeria, and it has been an absolute godsend to Realmslore research for the Forgotten Realms Wiki. Rather than spending ages searching books manually, even searching PDFs en masse, we can have a list of every source in under a second. That can be way too much for a common topic or a regular word, of course, and long phrases run the risk of false positives. But for less common stuff, it's a very quick way to zero in on the relevant sources, down to within a page or so. All we need to do is find the book and open to the page.
With it, we've been able to write articles that cover every source to mention a topic, and can include some obscure lore and forgotten sources, without having to come back later when we discover or obtain something new. We can also more easily find references for unsourced pages, so we're steadily cleaning them up on the wiki.
It's a D&D-dedicated search engine and the quickest and most comprehensive resource for lore. If you're looking for some Realmslore, do consider trying out Ask Valhaeria. |
BadCatMan, B.Sc. (Hons), M.Sc. Scientific technical editor Head DM of the Realms of Adventure play-by-post community Administrator of the Forgotten Realms Wiki |
|
|
sleyvas
Skilled Spell Strategist
USA
11829 Posts |
Posted - 15 Nov 2022 : 15:19:28
|
I must say, I like the concept. Lord knows I'm constantly using the DOS findstr and PowerShell's Select-String to find things for work. I've never tried searching PDFs, though, without opening them and doing a search within the book because, like many things, you don't get useful outputs. This will definitely help me in my Realms research, as I often end up at the wiki to look things up. |
Alavairthae, may your skill prevail
Phillip aka Sleyvas |
|
|
Italian Archmage Karsus
Learned Scribe
126 Posts |
Posted - 25 Nov 2022 : 22:37:53
|
Thanks for the vote of confidence, BCM! I've been banging my head against the code for a bit, looking to upgrade the database. In my next delivery I'll see about throwing in archive links for every web resource, the novels with Google Books, Google Reader, Internet Archive and Amazon links (all of those let you have a taste for free, I figure they'd be useful for a thrifty lore scholar), and look at getting a presentable version of the magazines somehow. |
|
|
martynq
Seeker
United Kingdom
90 Posts |
Posted - 28 Nov 2022 : 16:18:57
|
This looks like a very worthwhile endeavour ... and far more sensible than the vast text file that I once embarked upon with every reference that I could find in every FR book. (My project was long since abandoned since it was clearly not feasible to complete and it wasn't using technology in a sensible way -- though I did start it back in the late 1990s when such technology was harder to come by!!) |
|
|
Italian Archmage Karsus
Learned Scribe
126 Posts |
Posted - 30 Nov 2022 : 23:22:22
|
Thanks, martynq.
I've managed to finagle the SQL into adding links and topics; the next version of Valhaeria ought to have links to every column it has cribbed in Internet Archive, with print-friendly and one-file download options wherever available. Crawling through the IA's glacial interface is something I wouldn't wish on my worst enemy... I've considered local downloads in case the IA goes down, but that'll hafta wait for the next version: I don't want anyone to take my site's word for anything, as it is intended mostly as a wiki writing accelerator.
I also think I've refined the crawler module for getting the IA's pages about as far as I'll be able to refine it. The rest, I figure I'm doing by hand, the hard way. |
|
|
Italian Archmage Karsus
Learned Scribe
126 Posts |
Posted - 06 Dec 2022 : 00:09:48
|
So, uh, I improved on Valhaeria some, at last. Now we have links on almost every web source, and on a few novels. I'll make a point of adding more sources, and providing links to every one of them whenever available. This should improve its usability, even if many of the links are unfortunately broken.
I've lost some references while rebuilding; I'll get them back, I promise. |
Edited by - Italian Archmage Karsus on 06 Dec 2022 00:58:47 |
|
|
Italian Archmage Karsus
Learned Scribe
126 Posts |
Posted - 18 Jan 2023 : 22:22:55
|
Hey! I fixed up the database. For whatever reason it has rejected a handful of sources, which I'll get to in due time.
Try it on for size now. It's got a single database, no more database-swapping button, and it's got links to every one of its sources. Most of them work, too. |
|
|
Italian Archmage Karsus
Learned Scribe
126 Posts |
Posted - 08 Feb 2023 : 22:31:20
|
Hey! Twitter's about to sunset its API access, so this might be the last time I get to grab tweets en masse. Did I mention Valhaeria also links to tweets? I did my best to skip the tweets that related to his personal life or to other projects, so I might have caught some Realmslore in that blacklist... but it's more comfortable this way.
Also, recently added a new capability! By prefacing your query with ">>", you get a snippet from every page showing you exactly what text it thinks you are matching. This system uses some maths to guess what the text must've been, as Valhaeria most emphatically does not, can not, and will not ever store the full text, but it might be good for quickly ruling out turns of phrase when you're looking for something in particular.
Be warned, the system is heavily taxed by this function, I'd prefer it if you didn't use it for queries that have too many results. I'm not running this one on a super robust VPS.
Val's almost feature complete now. Just needs one more feature, and I'll call it a day. It's going to be mostly for us wiki editors, tho, so I guess as it is, Valhaeria's feature complete for general use. The next feature is going to be pointless for anyone who isn't a wiki editor: I am going to create a checker that looks for matches *outside* a given book, so that when we search for, say, "Valraxaxath" in his own article, we only get hits for Valraxaxath OUTSIDE that article. A thingy for knowing if something exists OUTSIDE a particular book, that is. :) |
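In case it helps picture it, the "outside a book" checker is just the mirror image of ZOOM. This is a minimal sketch under assumptions, not the real implementation: hits are taken to be (source, page) pairs, and `filter_hits` and the sample source names are made up for illustration.

```python
def filter_hits(hits, only=None, exclude=None):
    """Keep hits from `only` (if given) and drop hits from `exclude` (if given)."""
    kept = []
    for source, page in hits:
        if only is not None and source != only:
            continue  # ZOOM-style restriction: skip every other source
        if exclude is not None and source == exclude:
            continue  # "outside" check: skip the source we already know about
        kept.append((source, page))
    return kept

# Invented search results as (source, page) pairs.
hits = [("Book A", 12), ("Book B", 40), ("Book A", 97), ("Web Column C", 1)]

print(filter_hits(hits, only="Book A"))     # restrict to one source
print(filter_hits(hits, exclude="Book A"))  # everything outside that source
```

If the second list comes back empty, the subject only exists in that one source.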
|
|
Italian Archmage Karsus
Learned Scribe
126 Posts |
Posted - 16 Feb 2023 : 00:41:19
|
Hey! Added the final feature. It's mostly for wiki editing, but by adding space-bullet point-space between search queries, you get results separated by query, instead of by year. This is useful for telling whether any of N elements, once indexed, requires us to look at other sources before starting a fresh article.
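Roughly, the splitting and grouping works like the sketch below. It's a minimal illustration under assumptions rather than the site's code: the separator is assumed to be the "•" character with a space on each side, and `split_queries`, `group_by_query`, `fake_db` and `fake_search` are hypothetical stand-ins for the real search.

```python
def split_queries(raw, bullet="•"):
    """Split the input on space-bullet-space and drop empty pieces."""
    return [q.strip() for q in raw.split(f" {bullet} ") if q.strip()]

def group_by_query(raw, search):
    """Run `search` once per query and return {query: results}."""
    return {q: search(q) for q in split_queries(raw)}

# Stand-in for the real database: a tiny made-up lookup table.
fake_db = {"waterdeep": ["Source 1", "Source 3"], "vecna": ["Source 2"]}

def fake_search(query):
    """Pretend search that returns a list of source names, or an empty list."""
    return fake_db.get(query.lower(), [])

grouped = group_by_query("Waterdeep • Vecna • Greyhawk", fake_search)
for query, results in grouped.items():
    print(query, "->", results or "no matches")
```

Any query that comes back empty is one we'd have to chase through other sources before writing the article.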
From here on, I just gotta fix some stuff with the text digester, and add a bunch of other things from the archived versions of TSRINC.com and TSR.com. Meager, but best have them all, I figure! That's how I learned that the Realms and Greyhawk are on the same planet, for instance. We have to put this in the wiki! Everyone's gonna hate it!
https://web.archive.org/web/19970424044506/http://tsrinc.com/info/tabloid.html |
|
|
Athreeren
Learned Scribe
144 Posts |
Posted - 16 Feb 2023 : 08:08:18
|
quote: Originally posted by Italian Archmage Karsus
Hey! Added the final feature. It's mostly for wiki editing, but by adding space-bullet point-space between search queries, you get results separated by query, instead of by year. This is useful for telling whether any of N elements, once indexed, requires us to look at other sources before starting a fresh article.
From here on, I just gotta fix some stuff with the text digester, and add a bunch of other things from the archived versions of TSRINC.com and TSR.com. Meager, but best have them all, I figure! That's how I learned that the Realms and Greyhawk are on the same planet, for instance. We have to put this in the wiki! Everyone's gonna hate it!
https://web.archive.org/web/19970424044506/http://tsrinc.com/info/tabloid.html
Any idea what this is about? "The above story is a work of fiction. Any resemblance to any being living, dead, or employee of TSR, is purely concidental." makes me think it's more satire than April fool, but in 1997 or earlier, I'm not sure what this is in reference to. Have you been able to find more about the context of that joke? (although I'm definitely keeping that bit about Vecna being a bartender in Waterdeep) |
Edited by - Athreeren on 16 Feb 2023 08:09:30 |
|
|
Karthak
Seeker
63 Posts |
Posted - 16 Feb 2023 : 09:44:01
|
The context is in the name, it's a parody of a trashy tabloid article. |
|
|
Italian Archmage Karsus
Learned Scribe
126 Posts |
Posted - 05 Mar 2023 : 17:45:23
|
I'm about to take another swing at making the database a bit more complete.
For my next act, I'm going to cache the page preview (still working with free mats, shouldn't be a problem) in every page. I've rebuilt the bulk DL'er to cast a bit wider net. I'm also going to include more DOC files, and the piece de resistance, I'm going to grind up and snort the old TSR and RPGA pages. This may add some stuff that is a wee sketchy and of dubious finality, but better sketchy than gone for good. I also gotta rework the tweet-sucking module a li'l dri'l bit. The current version's OK-ish, but sometimes it fails to hoover up the context where the search keywords are supposed to be. That's not great. Finally, there's PDF versions of game manuals that shouldn't be a challenge to hoover up.
Was also considering tossin' up a PayPal link. I'm not doing this for a living, but hey, why not? |
|
|
Italian Archmage Karsus
Learned Scribe
126 Posts |
Posted - 15 Mar 2023 : 14:32:16
|
It took me way longer than I thought, but the new site is up!
Forgot a few sources... Will throw them in later today, my bad. |
|
|
Ayrik
Great Reader
Canada
7989 Posts |
Posted - 16 Mar 2023 : 00:10:53
|
I tried using the site.
But I couldn't follow any of the links. Because the font colours are too painful. The only way I could really read them is with highlighting.
Sorry. |
[/Ayrik] |
|
|
Italian Archmage Karsus
Learned Scribe
126 Posts |
Posted - 17 Mar 2023 : 02:04:55
|
I'm open to feedback, of course. I can change the colors, shouldn't take more than a moment. ;) If you don't have any suggestions I'll just check out style guides online, how hard could it be? |
|
|
Karthak
Seeker
63 Posts |
Posted - 17 Mar 2023 : 02:17:49
|
The purple used for showing clicked links looks fine. For some reason the blue looks blurry unless zoomed in; maybe changing it to a bright orange or red will make it stand out more. |
|
|
Ayrik
Great Reader
Canada
7989 Posts |
Posted - 17 Mar 2023 : 08:08:25
|
I'm using dark mode, basically a black or dark grey background.
That particular blue on black is somehow just bright enough to be obnoxious and just dark enough to be illegible. On my small laptop screen and on my large desktop monitor. I want to get closer to make out what it says but I can't look at it long because of the awful brightness.
Not trying to be negative. But my eyes aren't as young as they used to be, lol. |
[/Ayrik] |
|
|
Italian Archmage Karsus
Learned Scribe
126 Posts |
Posted - 17 Mar 2023 : 22:31:04
|
Alright! Lightened up link colors. I hope this will be more forgiving on your eyes, going forward.
Thanks for the feedback! Please let me know how it is right now! |
|
|
Ayrik
Great Reader
Canada
7989 Posts |
Posted - 17 Mar 2023 : 23:08:00
|
That is much better, I can read it in the dark, I can read it in sunlight. Thank you. |
[/Ayrik] |
|
|
Italian Archmage Karsus
Learned Scribe
126 Posts |
Posted - 19 Mar 2023 : 02:54:40
|
Alright. Added a thousand and change sources to the database. It's getting pretty complete... If anything it is somewhat overfilled: some of them turned out to display links incorrectly and some others just don't belong.
It will take some time to go the rest of the way. There are about 70 columns missing about 310 articles, and I didn't want them in the DB yet. I will try to stick those in with the next update. See you then! |
|
|
Italian Archmage Karsus
Learned Scribe
126 Posts |
Posted - 25 Mar 2023 : 23:04:01
|
Stuck in another database, hopefully with less Star Wars and Neopets and Harry Potter and other bycatch. Just enough to let us check for crossovers and inspirations, I hope!
Also added a LIBRARY view. It will let you know exactly what we've got in the base. Please don't abuse it! There are still a lot of results involved, and it may be a wee slow for your connection. |
Edited by - Italian Archmage Karsus on 25 Mar 2023 23:04:56 |
|
|
Athreeren
Learned Scribe
144 Posts |
Posted - 12 Apr 2023 : 12:14:50
|
The Mages and Sages podcast can dump lore at high speed, and there is of course no way YouTube's automated subtitles can properly spell the names of the uncommon locations mentioned by the sages. Ask Valhaeria has been very useful for this, and I'm wondering how I can improve my search process: can I include wildcard (joker) characters, or characters that could be one of a few possibilities? |
|
|
sleyvas
Skilled Spell Strategist
USA
11829 Posts |
Posted - 13 Apr 2023 : 14:15:08
|
just wondering... has anyone asked something like ChatGPT yet to "write a report about Khelben Arunsun" or "make a list of known harpers from the forgotten realms"? I've seen the news going on and on about this AI, but I've yet to set up an account (there is a portion of me that loathes the idea of making an account with them). |
Alavairthae, may your skill prevail
Phillip aka Sleyvas |
|
|