@postmodern
541 followers · 354 posts · Server infosec.exchange

Coming up with the options for a web spider command, and deciding which options are mutually exclusive, is really difficult. Obviously such a command should print the URLs by default. However, what if the user also wants to scrape HTML nodes out of each webpage using an XPath? Should you print both the URLs and the matched content, disable printing of URLs if --xpath is specified, or have a separate option like --no-print-urls to explicitly disable printing the URLs when you only want to pipe the matched HTML into some other util?
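
To make the trade-off concrete, here's a rough sketch of the third choice (a separate --no-print-urls flag) using Python's argparse. The program name `spider` and the helpers `build_parser` and `format_output` are made up for illustration, and the XPath matching itself is stubbed out as a list of already-matched strings, so this only models the option handling, not a real spider.

```python
import argparse


def build_parser():
    # Hypothetical CLI: only --xpath and --no-print-urls come from the post.
    parser = argparse.ArgumentParser(prog="spider")
    parser.add_argument("url", help="URL to start spidering from")
    parser.add_argument("--xpath", help="XPath of HTML nodes to print for each page")
    # store_false means print_urls defaults to True, so URLs print by default.
    parser.add_argument(
        "--no-print-urls",
        dest="print_urls",
        action="store_false",
        help="do not print visited URLs (useful when piping matched HTML)",
    )
    return parser


def format_output(args, url, matches):
    # Decide what to emit for one visited page.
    # `matches` stands in for the nodes an XPath query would have returned.
    lines = []
    if args.print_urls:
        lines.append(url)
    if args.xpath:
        lines.extend(matches)
    return lines
```

With this layout, `spider URL --xpath //a` prints both URLs and matches, and adding `--no-print-urls` leaves only the matched HTML. The two options stay independent rather than mutually exclusive, at the cost of one extra flag for the piping case.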

#webspidering #webspider #spidering #cli #recon

Last updated 3 years ago