WebScrapper "eltoque.com"

This is a Python script that uses the BeautifulSoup module to parse an HTML file and extract information from it. Specifically, it extracts the title, date, author name(s), and emphasized text from an article.
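The extraction described above might look roughly like the following sketch. It is not the script's actual code: the function name `extract_article`, the sample markup, and the class names `article-title`, `article-date`, and `author-name` are all hypothetical placeholders, since the real classes used on eltoque.com are not listed here.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the tag and class names below are placeholders,
# not the real structure of an eltoque.com article.
SAMPLE_HTML = """
<html>
  <head><title>Sample page</title></head>
  <body>
    <h1 class="article-title">Example headline</h1>
    <time class="article-date">2023-01-15</time>
    <span class="author-name">Ana Perez</span>
    <span class="author-name">Luis Gomez</span>
    <p>Some text with <em>first emphasis</em> and <em>second emphasis</em>.</p>
  </body>
</html>
"""


def extract_article(html, parser="html5lib"):
    """Return title, date, author names, and emphasized text from an HTML string."""
    soup = BeautifulSoup(html, parser)

    # find() returns None when nothing matches, so guard before reading text.
    title = soup.find("h1", class_="article-title")
    date = soup.find("time", class_="article-date")

    # find_all() returns a (possibly empty) list of matching tags.
    authors = soup.find_all("span", class_="author-name")
    emphasized = soup.find_all("em")

    return {
        "title": title.get_text(strip=True) if title else None,
        "date": date.get_text(strip=True) if date else None,
        "authors": [a.get_text(strip=True) for a in authors],
        "emphasized": [e.get_text(strip=True) for e in emphasized],
    }
```

Calling `extract_article(SAMPLE_HTML)` returns a dictionary with one entry per extracted field. To read from a file instead, pass the file's contents as the `html` argument.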

Requirements

Python 3.8

BeautifulSoup4

html5lib

Installation

Install Python 3 (the requirements list Python 3.8) on your system if it is not already installed.

Install BeautifulSoup4 and html5lib using pip: `pip install beautifulsoup4 html5lib`
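After the steps above, one optional way to confirm that both packages are importable is a small check like this sketch. The helper name `missing_packages` and the `REQUIRED` mapping are illustrative and not part of the script; the mapping pairs each import name with the name used on the pip command line.

```python
import importlib.util

# Import name -> pip package name (beautifulsoup4 is imported as "bs4").
REQUIRED = {"bs4": "beautifulsoup4", "html5lib": "html5lib"}


def missing_packages(required=REQUIRED):
    """Return the pip names of required packages that cannot be imported."""
    return [
        pip_name
        for module_name, pip_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]
```

An empty list from `missing_packages()` suggests the installation succeeded; any names it returns can be passed to `pip install`.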

Notes

The script assumes that the HTML file has a specific structure and classes. If the structure or classes change, the script may not work as expected.

The script is not optimized for performance and may be slow on large HTML files.

The script includes commented-out code for printing the first try text. Uncomment this code to print the first try text instead of the emphasized text.
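Regarding the first note, about the script depending on a specific structure and classes: one possible way to make a layout change fail loudly, rather than surface as an unclear `AttributeError` on `None`, is a small lookup helper. This is a sketch under assumptions, not the script's code; `require_text` and the class names in the usage line are hypothetical.

```python
from bs4 import BeautifulSoup


def require_text(soup, tag, class_name):
    """Return the stripped text of the first matching tag, or raise LookupError."""
    node = soup.find(tag, class_=class_name)
    if node is None:
        # find() returns None on no match; report which selector failed.
        raise LookupError(
            f"no <{tag}> element with class '{class_name}' found; "
            "the page structure may have changed"
        )
    return node.get_text(strip=True)
```

For example, `require_text(BeautifulSoup(html, "html5lib"), "h1", "article-title")` either returns the heading text or names the selector that no longer matches.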