Another option is to not scrape at all, and use an existing data set. Common crawl is one good example, and http archive is another.
If you just want meta data from the homepage of all domains we scrape that every month at https://host.io and make the data available over our API: https://host.io/docs
If you just want meta data from the homepage of all domains we scrape that every month at https://host.io and make the data available over our API: https://host.io/docs