November 2025: Knowledge Scraping

“Web scraping” is pulling information from webpages.
“Knowledge scraping” is a term I just made up for a process that starts with scraping information, then uses machine learning to build a knowledge graph from the text. We’ll use Python’s Scrapy and pretrained models from NLTK, but no LLMs - we don’t need them!

This will be in preparation for an upcoming Dec. 6 talk at OLF. Yes, it’s the long-awaited return of Captain Columbus, using automated data processing to continue her battle with the Innovators of Iniquity. (However, this will be well short of a full dress rehearsal - much preparation still needs to be done!)

Location: Innovation Hub

We meet in the Innovation Hub, a gorgeous new facility that’s part of the renovated Dayton Arcade complex.
Enter through the doors that face the Wright Stop Plaza bus hub.

Street parking is free in the evening. I usually park on Ludlow Street.

If for any reason coming downtown doesn’t work for you (for instance, you’ve been exposed to COVID, or you’ve converted yourself to a purely digital format and now exist as a set of cloud-hosted algorithms), we’ll be online as well!

Join us at 7 PM EDT on the PyFri Discord channel, discord.gg/9SgTh3T, and click on the General voice chat link. You may need to install the Discord desktop app rather than just using the web interface.