Urls not listed in filter still pass through? #2

Open
coodoo opened this issue May 29, 2016 · 6 comments

Comments

@coodoo

coodoo commented May 29, 2016

First of all thanks for making all these great things happen, big kudos!

I just did a quick run, and it seems that URLs not listed in the filter still get passed to the callback.

See below: edgesuite.net is not listed in the filter, so I would assume requests to it shouldn't be passed to the callback at all. Am I doing something wrong here?

this.browser
.filter({
    urls: ['https://*.github.com/*', '*://electron.github.io']
  }, function(details, cb){
    // a request to http://img.edgesuite.net/foo.png gets passed in and blocked, which shouldn't happen
    return cb({cancel: (details.url.indexOf('edgesuite.net') !== -1 )});
  })
.goto( url )
@rosshinkley
Owner

@coodoo What version of Nightmare and nightmare-load-filter are you using, out of curiosity?

I tried your example, and if you add logging in the filter callback, it doesn't look like it gets called. How are you asserting that the image is getting blocked? (The URL provided returns a 502.)

@coodoo
Author

coodoo commented May 30, 2016

@rosshinkley I'll provide a detailed report soon. Quick question: how do I log from inside the filter callback? I tried the standard console.log('foo') to no avail.

@coodoo
Author

coodoo commented May 30, 2016

Here's a short code sample that reproduces the issue. edgesuite.net is not listed in the rules, yet all images from that site were blocked. I would expect that not to happen.

const rules = [
    'google.com',
    // 'edgesuite.net'
]

this.browser
    .filter( { urls: rules }, ( details, cb ) => cb({ cancel: details.url.indexOf('edgesuite.net') != -1 }) )
    .goto( 'http://www.appledailytw.com/realtimenews/article/nextmag/20160531/874328/' )

Using:

"nightmare": "^2.5.0",
"nightmare-load-filter": "0.2.0",

@rosshinkley
Owner

> I tried the standard console.log('foo') to no avail.

Output from the callback goes to the Electron process's stdout, not your script's console. Run your script with the DEBUG environment variable set and you'll have better luck.

> ...yet all images from that website were blocked, I would expect that should not happen?

That's odd. Maybe this is a quirk of later versions of Electron or Chromium. I would expect a whole match (e.g., http://www.google.com) to match only that address, but it looks like that filter is completely ignored. In fact, I'd expect it to behave the way WebRequest match patterns work. I'll dig into this as time permits.

It looks like it works as expected if you are willing to use wildcards. Your example, slightly modified:

const rules = [
    'http://google.com/*',
    // 'edgesuite.net'
]

this.browser
    .filter( { urls: rules }, ( details, cb ) => cb({ cancel: details.url.indexOf('edgesuite.net') != -1 }) )
    .goto( 'http://www.appledailytw.com/realtimenews/article/nextmag/20160531/874328/' )
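
For reference, here is a rough sketch of the match-pattern behaviour I would expect, assuming the filter follows Chrome-style <scheme>://<host><path> patterns. This is not how Electron or nightmare-load-filter implements it; matchesPattern is a hypothetical helper written only for illustration, and it handles just the http, https, and * schemes.

```javascript
// Hypothetical helper: a simplified sketch of Chrome-style match patterns.
// Assumption: a pattern has the form <scheme>://<host><path>, all three required.
function matchesPattern(pattern, url) {
  const parts = /^(\*|https?):\/\/(\*|\*\.[^/*]+|[^/*]+)(\/.*)$/.exec(pattern);
  if (!parts) return false; // malformed pattern, e.g. 'google.com'

  const [, scheme, host, path] = parts;
  const target = new URL(url);
  const targetScheme = target.protocol.slice(0, -1); // 'http:' -> 'http'

  // A '*' scheme stands for http or https only.
  const schemeOk = scheme === '*'
    ? targetScheme === 'http' || targetScheme === 'https'
    : scheme === targetScheme;

  // '*' matches any host; '*.example.com' matches example.com and its subdomains.
  let hostOk;
  if (host === '*') {
    hostOk = true;
  } else if (host.startsWith('*.')) {
    const base = host.slice(2);
    hostOk = target.hostname === base || target.hostname.endsWith('.' + base);
  } else {
    hostOk = target.hostname === host;
  }

  // In the path, '*' matches any run of characters; everything else is literal.
  const pathRegex = new RegExp(
    '^' +
    path.split('*').map(s => s.replace(/[.+?^${}()|[\]\\]/g, '\\$&')).join('.*') +
    '$'
  );
  const pathOk = pathRegex.test(target.pathname + target.search);

  return schemeOk && hostOk && pathOk;
}
```

Under these assumptions, 'http://google.com/*' does not match http://img.edgesuite.net/foo.png, and a bare 'google.com' is rejected as malformed rather than treated as a match.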

@coodoo
Author

coodoo commented May 31, 2016

Very interesting findings! After playing with it a bit more, I found that the URL pattern must contain :// after the scheme and a / after the domain, so something like *://google.com/* works; any other form won't.

@rosshinkley
Owner

That doesn't surprise me as much: google.com is ambiguous and should be "fully qualified" (even if the qualification uses wildcards, you have to be explicit about what you expect). I can kind of understand why that wouldn't work.
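
As a workaround until the patterns are validated somewhere, a small helper could expand bare entries into the fully qualified form before they are handed to .filter(). This is only a sketch: qualifyPattern is a hypothetical name, not part of nightmare-load-filter, and it assumes the rule found above (a :// after the scheme and a / after the domain).

```javascript
// Hypothetical helper: expand a bare rule such as 'google.com' into a
// fully qualified pattern such as '*://google.com/*'.
function qualifyPattern(pattern) {
  let qualified = pattern;

  // No scheme separator: accept any scheme.
  if (!qualified.includes('://')) {
    qualified = '*://' + qualified;
  }

  // No path after the host: accept any path.
  const hostStart = qualified.indexOf('://') + 3;
  if (!qualified.includes('/', hostStart)) {
    qualified += '/*';
  }

  return qualified;
}

// Rules taken from the examples earlier in this thread.
const rules = ['google.com', '*://electron.github.io', 'https://*.github.com/*']
  .map(qualifyPattern);
```

Patterns that are already fully qualified pass through unchanged, so it should be safe to map every rule through the helper.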
