To run this:
pip3 install -r requirements.txt-
Output:
python link_extractor.py --helpusage: link_extractor.py [-h] [-m MAX_URLS] url Link Extractor Tool with Python positional arguments: url The URL to extract links from. optional arguments: -h, --help show this help message and exit -m MAX_URLS, --max-urls MAX_URLS Number of max URLs to crawl, default is 30. - For instance, to extract all links from 2 first URLs appeared in github.com:
This will result in a large list, here is the last 5 links:
python link_extractor.py https://github.com -m 2This will also save these URLs in[!] External link: https://developer.github.com/ [*] Internal link: https://help.github.com/ [!] External link: https://github.blog/ [*] Internal link: https://help.github.com/articles/github-terms-of-service/ [*] Internal link: https://help.github.com/articles/github-privacy-statement/ [+] Total Internal links: 85 [+] Total External links: 21 [+] Total URLs: 106github.com_external_links.txtfor external links andgithub.com_internal_links.txtfor internal links.