about this map
this map was developed by nick deMarchis over the course of a few weeks as a proof-of-concept to map places mentioned in local news articles in realtime in New York City.
it seems like it's worked, at least to some degree. there's some dust and clear areas for improvement that i have documented, obviously. check the issue tracker to see what's up. or, you can contact me with the information mentioned at the bottom of the about page.
process
there are three main parts of this project: a recurring Python script, a database, and the frontend.
the Python script is supposed to run on a schedule. it checks against a predetermined set of RSS feeds, and determines whether any new articles have been added since it last checked. if so, it will pull relevant information from those articles, and temporarily store them.
it then uses OpenAI's gpt-4o-mini-2024-07-18 model to extract physical locations that are mentioned in the text of each article.
after we have relevant locations extracted, we use the Google Maps Geocoding API to determine their location, as well as to provide metadata on whether the place is too generic to be mapped at this proof-of-concept stage (for instance, counties, states). we can then use the place_id that they provide to associate articles with each other.
we then send information on our locations, our articles and each relation between an article and location to our database.
on the frontend, whenever someone loads the page, we use a Next.js Route Handler to reformat our database information as GeoJSON as requested. we then serve that GeoJSON to an OpenLayers map.
when a user clicks a location, we then use another route to handle database requests, and pull the information of the articles that match the selected location.
blind spots
there are quite a few blind spots and limitations associated here. to name a few:
- the OpenAI model isn't amazing at pulling out non-obvious locations, and sometimes is either too verbose or too general
- at this moment, i can only really use publications that have working RSS feeds. this is mostly fine, as most still do, but not all.
- i did my best to filter feeds by their metro/local reporting, if relevant. that obviously could leave a lot of state- or federal-level news with local impact off the map.
next steps
like i said, issue tracker. there's a lot of work to do — especially to expand this work to more communities and understand better how neighborhoods receive news coverage.