This script retrieves all links from a given webpage and saves them as a text (.txt) file
Prerequisites
Required Modules
- BeautifulSoup4
- requests
To Install:
$ pip install -r requirements.txt
How to Run the Script:
$ python get_links.py
You will then be asked which webpage you would like to analyze. After that, the extracted links will be saved as a list in myLinks.txt.
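Because the script appends each run's links to myLinks.txt as a Python-style list literal, the file can be parsed back with the standard-library ast module. A minimal sketch (the load_saved_links helper name is hypothetical, not part of the script):

```python
import ast

def load_saved_links(path="myLinks.txt"):
    # Hypothetical helper: each run of get_links.py appends one
    # Python-style list per line, so ast.literal_eval can safely
    # parse each non-empty line back into a list.
    all_links = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line:
                all_links.extend(ast.literal_eval(line))
    return all_links
```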
Source Code:
import requests as rq
from bs4 import BeautifulSoup

url = input("Enter Link: ")
# Note: checking `("https" or "http") in url` only ever tests "https",
# so we check for a proper scheme prefix instead
if url.startswith(("http://", "https://")):
    data = rq.get(url)
else:
    data = rq.get("https://" + url)

soup = BeautifulSoup(data.text, "html.parser")
links = []
for link in soup.find_all("a"):
    links.append(link.get("href"))

# Writing the output to a file (myLinks.txt) instead of to stdout
# You can change 'a' to 'w' to overwrite the file each time
with open("myLinks.txt", 'a') as saved:
    print(links, file=saved)
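Note that `link.get("href")` often returns relative URLs such as "/about". If you want absolute links, one possible tweak is to resolve each href against the page URL with urllib.parse.urljoin (the absolutize helper below is a hypothetical extension, not part of the script):

```python
from urllib.parse import urljoin

def absolutize(base_url, hrefs):
    # Resolve relative hrefs (e.g. "/about" or "page.html") against
    # the page URL; skip anchors that had no href attribute (None).
    return [urljoin(base_url, h) for h in hrefs if h]

# Example:
# absolutize("https://example.com/docs/", ["/about", "page.html", None])
# returns ["https://example.com/about", "https://example.com/docs/page.html"]
```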