sitemap index / robots.txt parsers #10

Open
ghost opened this issue Dec 25, 2020 · 1 comment
ghost commented Dec 25, 2020

Hi,

Hope you are all well! And Merry Christmas, first of all!

I was playing with site-audit-seo today and found I was missing some features, such as a robots.txt parser to find the sitemaps available for a website, and a related sitemap extractor (one that also handles sitemap indexes).

Do you think it would be possible to add these two components easily?

Please find below some references that I found:

Thanks for any insights or inputs on that.

P.S. Do you have a Telegram account? I have some questions for you and do not want to pollute this thread.
My handle is "deepocrates".

Cheers,
Luc Michalski

popstas (Contributor) commented Dec 25, 2020

site-audit-seo is based on https://github.com/yujiosaka/headless-chrome-crawler, which uses https://www.npmjs.com/package/robots-parser.

In our case, this parser works correctly for robots.txt.
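
For readers wondering what the two requested pieces might look like, here is a minimal sketch in plain JavaScript. It is not code from site-audit-seo, headless-chrome-crawler, or robots-parser; the function names `extractSitemapUrls`, `isSitemapIndex`, and `extractLocs` are hypothetical, and the regex-based XML handling is a simplification (it ignores CDATA sections and XML namespaces). The robots-parser README also appears to document a `getSitemaps()` method, which may be preferable to the hand-rolled directive parsing shown here.

```javascript
// Hypothetical helper: collect the URLs named in "Sitemap:" lines of a
// robots.txt body. The directive name is matched case-insensitively.
function extractSitemapUrls(robotsTxt) {
  const urls = [];
  for (const rawLine of robotsTxt.split(/\r?\n/)) {
    // Drop trailing comments. Assumption: sitemap URLs carry no "#" fragment.
    const line = rawLine.replace(/#.*$/, '').trim();
    const match = /^sitemap\s*:\s*(\S+)/i.exec(line);
    if (match) urls.push(match[1]);
  }
  return urls;
}

// Hypothetical helper: true when the XML root looks like a sitemap index,
// i.e. a file that lists other sitemaps rather than pages.
function isSitemapIndex(xml) {
  return /<sitemapindex[\s>]/i.test(xml);
}

// Hypothetical helper: list every <loc> value in a sitemap or sitemap index.
// Simplified on purpose: regex matching, only "&amp;" is unescaped.
function extractLocs(xml) {
  const locs = [];
  const re = /<loc>\s*([^<]+?)\s*<\/loc>/gi;
  let m;
  while ((m = re.exec(xml)) !== null) {
    locs.push(m[1].replace(/&amp;/g, '&'));
  }
  return locs;
}
```

A crawler could fetch robots.txt, pass the body to `extractSitemapUrls`, download each URL, and, when `isSitemapIndex` returns true, treat the `extractLocs` results as further sitemaps to fetch instead of as page URLs.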
