Urls not listed in filter still pass through? #2

coodoo · 2016-05-29T03:06:11Z

First of all thanks for making all these great things happen, big kudos!

Just did a quick run, and it seems URLs not listed in the filter still get passed to the fn.

See below: edgesuite.net is not listed in the filter, so I would assume it shouldn't get passed into fn at all. Am I doing something wrong here?

this.browser
  .filter({
    urls: ['https://*.github.com/*', '*://electron.github.io']
  }, function(details, cb) {
    // a request to http://img.edgesuite.net/foo.png got passed in and blocked, which shouldn't happen
    return cb({ cancel: details.url.indexOf('edgesuite.net') !== -1 });
  })
  .goto(url)

rosshinkley · 2016-05-30T15:05:41Z

@coodoo What version of Nightmare and nightmare-load-filter are you using, out of curiosity?

I tried your example, and if you add logging in the filter callback, it doesn't look like it gets called. How are you asserting that the image is getting blocked? (The URL provided returns a 502.)

coodoo · 2016-05-30T23:30:07Z

@rosshinkley I'll provide a detailed report soon. Quick question: how do I log in the filter callback? I tried the standard console.log('foo') to no avail.

coodoo · 2016-05-30T23:44:46Z

Here's a short code sample to reproduce the issue. edgesuite.net is not listed in the rules, yet all images from that website were blocked. I would expect that not to happen?

const rules = [
    'google.com',
    // 'edgesuite.net'
]

this.browser
    .filter( { urls: rules }, ( details, cb ) => cb({ cancel: details.url.indexOf('edgesuite.net') != -1 }) )
    .goto( 'http://www.appledailytw.com/realtimenews/article/nextmag/20160531/874328/' )

Using:

"nightmare": "^2.5.0",
"nightmare-load-filter": "0.2.0",

rosshinkley · 2016-05-31T02:19:37Z

> I tried the standard console.log('foo') to no avail.

Output will be a part of the Electron stdout. Run your script with DEBUG and you'll have better luck.

> ...yet all images from that website were blocked, I would expect that should not happen?

That's odd. Maybe this is a quirk of later versions of Electron or Chromium - I would expect whole matches (e.g., http://www.google.com) to match only that address, but it looks like that filter is completely ignored. In fact, I'd expect it to behave the way WebRequest match patterns work. I'll dig into this as time permits.

It looks like it works as expected if you are willing to use wildcards. Your example, slightly modified:

const rules = [
    'http://google.com/*',
    // 'edgesuite.net'
]

this.browser
    .filter( { urls: rules }, ( details, cb ) => cb({ cancel: details.url.indexOf('edgesuite.net') != -1 }) )
    .goto( 'http://www.appledailytw.com/realtimenews/article/nextmag/20160531/874328/' )

coodoo · 2016-05-31T02:50:18Z

Very interesting findings! After playing with it a bit more, I found the URL pattern must contain :// and a / after the domain, so something like *://google.com/* works; any other form won't.
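
To make that rule concrete, here is a minimal sketch of a shape check for patterns of the form scheme://host/path. The regular expression and the name looksQualified are assumptions made for illustration; this is only an approximation and not the parser Electron or nightmare-load-filter actually uses.

```javascript
// Rough shape check for a match pattern: <scheme>://<host>/<path>.
// Approximation for illustration only, not Electron's own pattern parser.
const MATCH_PATTERN = /^(\*|[a-z][a-z0-9+.-]*):\/\/(\*|(\*\.)?[^\/*]+)\/.*$/i;

// Hypothetical helper: true when the pattern has a scheme, "://", a host,
// and a "/" after the host.
function looksQualified(pattern) {
  return MATCH_PATTERN.test(pattern);
}

const candidates = [
  'google.com',             // no scheme and no path
  'http://google.com',      // scheme, but no "/" after the host
  '*://google.com/*',       // wildcard scheme, host, wildcard path
  'https://*.github.com/*'  // wildcard subdomain
];

for (const pattern of candidates) {
  console.log(pattern, looksQualified(pattern));
}
```

Running a check like this over the rules before handing them to .filter() would flag entries such as google.com up front, instead of leaving them to be silently ignored.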

rosshinkley · 2016-05-31T02:57:21Z

That doesn't surprise me as much: google.com is ambiguous and should be "fully qualified" (even if the full qualification is with wildcards, it's required to be explicit about what you expect). I can kind of understand why that wouldn't work.
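
If bare hosts are more convenient to write, one option is to expand them into explicit wildcard patterns before building the filter. The sketch below assumes a hypothetical helper named toWildcardPattern, which is not part of Nightmare or nightmare-load-filter, and it treats anything containing :// as already qualified.

```javascript
// Hypothetical helper: turn a bare host such as "google.com" into an explicit
// wildcard pattern. Entries that already contain "://" are returned unchanged.
function toWildcardPattern(hostOrPattern) {
  if (hostOrPattern.includes('://')) {
    return hostOrPattern;
  }
  // Drop any path the caller appended, keeping only the host part.
  const host = hostOrPattern.replace(/\/.*$/, '');
  return `*://${host}/*`;
}

const rules = ['google.com', 'http://google.com/*'].map(toWildcardPattern);
console.log(rules);
```

The resulting array can be passed as the urls option in the examples above. Whether subdomains should be covered as well (for instance with a *.google.com host) is a separate decision that this sketch leaves to the caller.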
