Last week I attended a web scraping training session run by the OpenAustralia Foundation. It was a day rammed full of hands-on work, run with lots of positive encouragement and enthusiasm by Luke and Henare.
Using the morph.io platform (seriously - what an amazing resource is this?!), Luke and Henare walked us through how to scrape a “random” page of the Australian Parliament. Interestingly, while doing this we learnt that Cory Bernardi’s parliamentary ID is G0D (and I am serious …*G0D*…).
We were then set two tasks, which we paired up for. The first was to scrape data from the NSW Rural Fire Service Daily Fire Warnings, which you can check out here, and if you ever feel like it, you can download the CSV, or even access an API! It’s updated daily (this part is magical). I will be looking at this over the summer to see what changes by council region. We also started on one that scrapes a link and description of every bill passed in the NSW Parliament since 1997. This isn't working yet :-( so I won’t link it, but it’s on my list of things to fix this week.
We wrote the code in Ruby, so we had a quick crash course on how it works, but Luke and Henare kept it very beginner-friendly, and their encouragement and positivity made us all leave feeling like we could do ANYTHING.
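For anyone curious what "scraping a link and description" looks like in Ruby, here is a minimal sketch. It is not the code we wrote on the day: the HTML snippet, the bill names and the `extract_links` helper are all made up for illustration, and it uses a plain regular expression from Ruby's standard library so it runs without installing anything. On a real, messier page you would more likely reach for an HTML parsing library such as Nokogiri.

```ruby
# Hypothetical HTML standing in for a page that lists bills.
# The paths and titles are invented for this example.
SAMPLE_HTML = <<~HTML
  <ul>
    <li><a href="/bills/1">First Example Bill 1997</a></li>
    <li><a href="/bills/2">Second Example Bill 1998</a></li>
  </ul>
HTML

# Return one hash per <a> tag, holding its href and its link text.
# A regular expression is enough for this small, tidy sample only.
def extract_links(html)
  html.scan(%r{<a\s+href="([^"]*)"[^>]*>(.*?)</a>}m).map do |href, text|
    { "link" => href, "description" => text.strip }
  end
end

rows = extract_links(SAMPLE_HTML)
rows.each { |row| puts "#{row['description']}: #{row['link']}" }
```

Each row is just a hash of column name to value, which is roughly the shape you want before saving scraped records to a table or exporting them as CSV.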
One of the great things about this course is that the course fee went straight back into the OpenAustralia Foundation, which is all about equipping Australians (and beyond) with the tools and information to delve into and better understand our democratic process. They created and maintain morph.io, which is such an amazing resource used by people all around the world (go and have a search of the available data sets on there), plus OpenAustralia and PlanningAlerts (https://www.openaustraliafoundation.org.au/projects/). Great stuff!
Having skills like this makes it easier to get access to data that may be "hidden". As we also discovered, it helps you understand more about data structures and how organisations classify their data. Peeping under the hood is fun!
Keep an eye on the @OpenAustralia Twitter feed for upcoming courses, as they’re definitely worth attending (I’d recommend them to all my data-loving MDSI comrades!). Oh, and they had catering from Fleetwood Machiato. Seriously very good!