On the subtleties of URL parsing

A recent side project of mine has been to write a scalable crawler that looks for broken resources (links, stylesheets, …) on a website. The project is meant to replace an existing crawler written in PHP with a more efficient implementation in Go. Part of writing a crawler is parsing the URLs found on pages. Thankfully, Go has the url.Parse function, which makes this job easy, though there are a couple of caveats to look out....
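To make the parsing step concrete, here is a minimal sketch of how a crawler might turn a link found on a page into an absolute URL using url.Parse. It is not the crawler's actual code: the helper name resolveLink and the example.com URLs are assumptions made up for illustration.

```go
package main

import (
	"fmt"
	"net/url"
)

// resolveLink is a hypothetical helper (not from the original project).
// It parses the URL of the page being crawled and a link found on that
// page, then resolves the link against the page URL.
func resolveLink(pageURL, href string) (string, error) {
	base, err := url.Parse(pageURL)
	if err != nil {
		return "", fmt.Errorf("parsing page URL: %w", err)
	}
	ref, err := url.Parse(href)
	if err != nil {
		return "", fmt.Errorf("parsing link: %w", err)
	}
	// ResolveReference follows the RFC 3986 rules for relative references;
	// an href that is already absolute is returned unchanged.
	return base.ResolveReference(ref).String(), nil
}

func main() {
	// Example inputs are placeholders, not real crawl targets.
	links := []string{"../style.css", "/about", "https://other.example/page"}
	for _, href := range links {
		abs, err := resolveLink("https://example.com/blog/post/", href)
		if err != nil {
			fmt.Println("skipping:", err)
			continue
		}
		fmt.Println(href, "->", abs)
	}
}
```

Resolving every link against the page it was found on means the rest of the crawler only has to deal with absolute URLs, whatever form the link took in the HTML.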

January 9, 2021 · 2 min · Me