A collection of useful Python snippets and functions, from scrapers to SSL and database code, drawn from past working examples.
Top-level Python code, ranging across scraping, database, and API connection files, kept for reference purposes. This is the old-school way of storing our code blocks. Some items are general scripts that can be applied to various URLs and parse the pages using Beautiful Soup.
Instagram scraper: still need to find the API; it may be a Meta API now.
This package was archived from code written 5 years ago. Its intent was to capture information for curiosity's sake and to dabble with Python.
It is a generalized bucket of code and output.txt
Its initial purpose as a GitHub repo is to provide access to code bases for a variety of needs, mostly exploring a crawler, its techniques, and migrations into a database. See the #Secondary-folder
section below for further explanations of the code in subdirectories.
research.py
A compilation of Python connectors that read data for database consumption (license_spring folder). This body of code reflects the process of bringing external CRM data and licensing data together to merge the information. It was used in an ETL into HubSpot CRM.
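The merge step described above can be sketched as a join on a shared key. This is a minimal illustration, not the code in the license_spring folder: the `email` join key, the field names, and the `merge_by_email` helper are all assumptions made for the example.

```python
def merge_by_email(crm_rows, license_rows):
    """Attach license keys to CRM contacts, matching on a normalized email.

    The field names ("email", "license_key") are hypothetical.
    """
    # Index the licensing records by lower-cased, trimmed email.
    licenses = {}
    for row in license_rows:
        key = row["email"].strip().lower()
        licenses.setdefault(key, []).append(row["license_key"])

    # Copy each contact and add whatever licenses share its email.
    merged = []
    for contact in crm_rows:
        key = contact["email"].strip().lower()
        merged.append({**contact, "email": key, "license_keys": licenses.get(key, [])})
    return merged


# Hypothetical sample data standing in for the two external sources.
crm_rows = [
    {"email": "Ada@Example.com", "name": "Ada"},
    {"email": "bob@example.com", "name": "Bob"},
]
license_rows = [
    {"email": "ada@example.com ", "license_key": "KEY-1"},
    {"email": "ada@example.com", "license_key": "KEY-2"},
]

merged = merge_by_email(crm_rows, license_rows)
for record in merged:
    print(record)
```

Normalizing the key before matching matters in practice, since the two systems rarely agree on casing or stray whitespace.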
Database-level code for ingesting data from CSVs and API JSON data. Used as a basic building block for ETL work.
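As a rough sketch of that building block, the example below loads one CSV source and one JSON payload into the same table. SQLite, the `products` table, and the column names are assumptions chosen so the example is self-contained; the original scripts may target a different database and schema.

```python
import csv
import io
import json
import sqlite3


def create_table(conn):
    # Hypothetical schema used only for this sketch.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS products (name TEXT, price REAL, source TEXT)"
    )


def ingest_csv(conn, csv_file, source):
    """Insert every row of an open CSV file; return the number of rows."""
    rows = [(r["name"], float(r["price"]), source) for r in csv.DictReader(csv_file)]
    conn.executemany("INSERT INTO products VALUES (?, ?, ?)", rows)
    return len(rows)


def ingest_json(conn, payload, source):
    """Insert the objects under the "items" key of a JSON string."""
    items = json.loads(payload)["items"]
    rows = [(i["name"], float(i["price"]), source) for i in items]
    conn.executemany("INSERT INTO products VALUES (?, ?, ?)", rows)
    return len(rows)


conn = sqlite3.connect(":memory:")
create_table(conn)

# In-memory stand-ins for a CSV file and an API response body.
csv_text = "name,price\nsoap,4.50\nshampoo,9.00\n"
json_text = '{"items": [{"name": "balm", "price": 6.25}]}'

n_csv = ingest_csv(conn, io.StringIO(csv_text), "csv")
n_json = ingest_json(conn, json_text, "api")
conn.commit()

total = conn.execute("SELECT COUNT(*) FROM products").fetchone()[0]
print(n_csv, n_json, total)
```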
Screen scrapers for various industries: Amazon best sellers, a beauty products recipe scraper, and a stock scraper.
Reading folder files to ingest into a database. Outputting Excel from a database.
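A minimal sketch of that round trip follows. It makes two substitutions worth naming plainly: it uses SQLite as the database, and it exports a CSV file (which Excel can open) instead of a native Excel workbook, so that only the standard library is needed. The `prices` table, the column names, and the sample files are hypothetical.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def load_folder(conn, folder):
    """Insert the rows of every *.csv file in folder; return the file names read."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS prices (symbol TEXT, price REAL, file TEXT)"
    )
    names = []
    for path in sorted(Path(folder).glob("*.csv")):
        with path.open(newline="") as fh:
            rows = [
                (r["symbol"], float(r["price"]), path.name)
                for r in csv.DictReader(fh)
            ]
        conn.executemany("INSERT INTO prices VALUES (?, ?, ?)", rows)
        names.append(path.name)
    conn.commit()
    return names


def export_csv(conn, out_path):
    """Write the whole table, with a header row, to a CSV file."""
    cursor = conn.execute("SELECT symbol, price, file FROM prices ORDER BY symbol")
    with open(out_path, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow([col[0] for col in cursor.description])
        writer.writerows(cursor)


with tempfile.TemporaryDirectory() as tmp:
    # Two hypothetical input files standing in for a folder of scraper output.
    Path(tmp, "a.csv").write_text("symbol,price\nXYZ,10.5\n")
    Path(tmp, "b.csv").write_text("symbol,price\nABC,3.0\n")

    conn = sqlite3.connect(":memory:")
    files_read = load_folder(conn, tmp)

    out_path = Path(tmp, "export_out.txt")
    export_csv(conn, out_path)
    exported = out_path.read_text().splitlines()

print(files_read)
print(exported)
```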
--
bestPerformers.py, biotechstocks.py, stockScrape.py, stockBestPerformers.py, htmlReaderBloomberg.py -- these files read stocks from a dividend stock website and from Bloomberg, and write the results to files. They parse the page with XPath to strip out the components of interest.
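The XPath-style parsing can be illustrated with the standard library. This sketch uses `xml.etree.ElementTree`, which supports only a limited XPath subset and requires well-formed markup; real pages usually need a more forgiving HTML parser, and the scripts themselves may use a different library. The sample markup, class names, and ticker symbols are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for a downloaded stock page.
SAMPLE_PAGE = """
<html><body>
  <table id="dividends">
    <tr class="stock"><td class="symbol">AAA</td><td class="yield">4.2</td></tr>
    <tr class="stock"><td class="symbol">BBB</td><td class="yield">3.1</td></tr>
    <tr class="ad"><td>Sponsored</td></tr>
  </table>
</body></html>
"""


def extract_stocks(page):
    """Return (symbol, yield) pairs from rows marked class="stock"."""
    root = ET.fromstring(page.strip())
    stocks = []
    # The attribute predicate skips rows, such as ads, that are not stock rows.
    for row in root.findall(".//tr[@class='stock']"):
        symbol = row.find("td[@class='symbol']").text.strip()
        dividend_yield = float(row.find("td[@class='yield']").text)
        stocks.append((symbol, dividend_yield))
    return stocks


stocks = extract_stocks(SAMPLE_PAGE)
# Tab-separated lines, ready to be written to an output file.
lines = [f"{symbol}\t{dividend_yield}" for symbol, dividend_yield in stocks]
print("\n".join(lines))
```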
--
crawlProducts.py
Lush -- dailyBestSellersLush.py, lushBestSellersAll.py, LushFace.py, LushFaceTest.py, LushHairShampoo.py, lushIngredients.py, lushLeavingSoon.py
Humblebee.py, JustNaturals.py, JustNaturalsURLs.py, AnniesRemedies.py, etsyBath.py
Amazon -- beauty best sellers: htmlReaderAmazon.py, htmlAmazonSkinCareReader.py
detox_market.py
psychArticles.py happinessLinks.py
databaseConnectionFromStackO.py requestSample.py
OutboundAPICAL.py
convertPyFiletoApp.py -- makes a script into an app that runs on a scheduler to scrape Amazon Best Sellers for Beauty (macOS)
scheduledRun.py, htmlCrawler.py, htmlCrawlerTest.py, randomExamples.py, randomNamesGenerator.py, readFolder.py, stringRelace.py
UserAgentList.py -- used to obfuscate the browser user agent as a friendly hacking experiment against Amazon web scraping
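The user agent idea can be sketched as picking a random entry from a list for each request. The list contents, the `build_request` helper, and the example URL below are assumptions for illustration and are not taken from UserAgentList.py. The request is only built here, not sent.

```python
import random
import urllib.request

# Hypothetical user agent strings; the real list may be longer and different.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.0 Safari/605.1.15",
    "Mozilla/5.0 (X11; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0",
]


def build_request(url, rng=random):
    """Build a request whose User-Agent header is chosen at random."""
    agent = rng.choice(USER_AGENTS)
    return urllib.request.Request(url, headers={"User-Agent": agent})


# A seeded generator keeps the choice repeatable while experimenting.
request = build_request("https://example.com/best-sellers", random.Random(0))
print(request.get_header("User-agent"))
```

Passing the request to `urllib.request.urlopen` would perform the actual fetch.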
sheduledRun.py
create-contact.py, scrapeModule.py, dealsSample.py -- API connection to HubSpot using Python
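For orientation, a contact-creation call might look like the sketch below. It assumes the HubSpot CRM v3 contacts endpoint with a bearer token and a `properties` payload; the scripts listed above may use a different API version or client library. The `build_create_contact_request` helper, the token placeholder, and the sample contact are hypothetical. The request is built but not sent.

```python
import json
import urllib.request

# Assumed endpoint: HubSpot CRM v3 contacts. Check against the version in use.
HUBSPOT_CONTACTS_URL = "https://api.hubapi.com/crm/v3/objects/contacts"


def build_create_contact_request(token, email, firstname, lastname):
    """Build (but do not send) a POST request that creates one contact."""
    body = {
        "properties": {
            "email": email,
            "firstname": firstname,
            "lastname": lastname,
        }
    }
    return urllib.request.Request(
        HUBSPOT_CONTACTS_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder token and contact, for illustration only.
request = build_create_contact_request(
    "HYPOTHETICAL_TOKEN", "ada@example.com", "Ada", "Lovelace"
)
payload = json.loads(request.data)
print(request.get_method(), request.full_url)
print(payload)
```

Sending it would be a matter of passing `request` to `urllib.request.urlopen` with a real token.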