Currently, the scraper works as intended for 13 of the 30 team pages. Of the remaining 17 pages, 8 pull data successfully but swap the employee and title fields because the HTML tags on those pages are in the opposite order. 2 pages produce incomplete output with missing employees, 1 page raises an IndexError, 1 page raises a UnicodeEncodeError, and the last 5 pages have significantly different HTML structures that will require separate scraping code.
Handle all of these cases so that all 30 team pages can be scraped into a single CSV file with the same columns (team, department, subdepartment, employee, title).
See https://github.com/albertlyu/mlb-front-offices/blob/master/mlbfrontoffice_scraper.py#L23-L48.
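One possible approach to three of the cases above (swapped fields, the IndexError, and the UnicodeEncodeError) is to normalize every scraped entry into the same five columns before writing. The sketch below is not taken from `mlbfrontoffice_scraper.py`; it assumes Python 3, and the names `normalize_row`, `write_rows`, and the `swapped` flag are hypothetical. It also assumes the IndexError comes from an entry with fewer text cells than expected, which has not been confirmed against the failing page.

```python
import csv

# Output columns, in the order listed in this issue.
FIELDS = ["team", "department", "subdepartment", "employee", "title"]


def normalize_row(team, department, subdepartment, cells, swapped=False):
    """Build one output row from the text cells scraped for a staff entry.

    `cells` is assumed to hold [employee, title]; pass swapped=True for
    pages whose markup puts the title first (hypothetical flag).
    """
    # Pad short cell lists so a missing value becomes "" instead of
    # raising IndexError (assumed cause of that failure).
    first, second = (list(cells) + ["", ""])[:2]
    employee, title = (second, first) if swapped else (first, second)
    return {
        "team": team,
        "department": department,
        "subdepartment": subdepartment,
        "employee": employee.strip(),
        "title": title.strip(),
    }


def write_rows(rows, stream):
    """Write normalized rows, with a header line, to an open text stream."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
```

Opening the output with `open(path, "w", encoding="utf-8", newline="")` and passing that file to `write_rows` keeps non-ASCII names from depending on the platform's default encoding, which is a common source of UnicodeEncodeError. The 5 pages with different HTML structures would still need their own parsing code, but that code could feed the same `normalize_row` so the CSV stays uniform.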