AthTech meeting - "Historical Results" presentation

Thank you to Andy R. @andyrobinson for the great presentation. It was fascinating to see how old results could be parsed and digitized, and I would love to start a dialogue about it here.

I have a newspapers.com subscription and used it this year to try to harvest historical results, e.g. for old editions of the USA Indoor Track & Field Championships, to preserve a list of medallists on Wikipedia in the category I created here: https://en.wikipedia.org/wiki/Category:USA_Indoor_Track_and_Field_Championships

Is there any proof of concept for using AI to comb these historical newspaper sites for results? The hardest part for me was finding the right search queries to locate scans of the results.

Thank you and welcome!

I’m not aware of any specific AI for this. As an old-time Python programmer, if I had an evening (warning: I rarely do!) I would be inclined to just write some web scraping code, since they have a very samey structure and it’s a lot of data. I guess that’s similar to what you did?
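To make the "samey structure" point concrete, here is a rough sketch of just the parsing half, applied to text that has already been pulled off a page or out of a scan. Everything in it is an assumption for illustration: the line layout ("event: 1. name, club, mark; 2. ..."), the `Result` record, and the `parse_results_line` helper are all hypothetical, and real newspaper text would need messier rules and OCR clean-up first.

```python
import re
from dataclasses import dataclass

# Hypothetical line layout, assumed purely for illustration:
#   "<event>: 1. <name>, <club>, <mark>; 2. <name>, <club>, <mark>; ..."
PLACING = re.compile(
    r"(?P<place>\d+)\.\s*(?P<name>[^,;]+),\s*(?P<club>[^,;]+),\s*(?P<mark>[^;]+)"
)


@dataclass
class Result:
    event: str
    place: int
    name: str
    club: str
    mark: str


def parse_results_line(line: str) -> list[Result]:
    """Split one 'event: placings' line into Result records."""
    event, sep, rest = line.partition(":")
    if not sep:
        return []  # no colon, so not a results line in the assumed layout
    return [
        Result(
            event=event.strip(),
            place=int(m["place"]),
            name=m["name"].strip(),
            club=m["club"].strip(),
            mark=m["mark"].strip(),
        )
        for m in PLACING.finditer(rest)
    ]


if __name__ == "__main__":
    # Made-up sample line, not a real result.
    sample = "60-yard dash: 1. Runner One, Example AC, 6.2; 2. Runner Two, Unattached, 6.3"
    for row in parse_results_line(sample):
        print(row)
```

Keeping the output as plain records like this means the same rows could later be turned into a wiki table or a CSV, whatever the source was.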

However, this raises another question I have been meaning to investigate, and haven’t: how does all this athletics data get onto Wikipedia in the first place? Is all the other international stuff on Wikipedia done by you?

We have a site and wiki and can share edit rights now, so perhaps we should at least start a page of links to promising sources, wiki-style, and include this. Also, you have prompted me to allow, in the design I am thinking of, for sources that start with a URL to something rather than a scan or old picture!

  • Andy