Hi, I've been looking at this library and it's really promising. It saves a lot of time writing boilerplate.
But I'm missing one feature to really be able to use it for my use case.
Is your feature request related to a problem? Please describe.
The issue I'm running into is that I don't need to open each link to scrape it.
My first page is the listings page, and it has pagination.
The way the library is set up, I have to .Follow(...) each link and .Parse(...) each opened page. But in my case I don't need to: the data I need is already on the listings page.
Describe the solution you'd like
The ability to parse a list of elements, maybe using a JArray for the value returned in the entity.
Describe alternatives you've considered
I didn't find a workaround. I did try something like this:
    .Parse([..Enumerable.Range(0, 10).Select(x =>
    {
        return new Schema($"Listing{x}")
        {
            new SchemaElement("Name", "div.min-w-0 > a > h2"),
            new SchemaElement("Amount", "div.min-w-0 > p.font-semibold")
        };
    })])
But all the listings come out the same, since the query selector only grabs the first match: https://github.com/pavlovtech/WebReaper/blob/master/WebReaper/Core/Parser/Concrete/AngleSharpContentParser.cs#L85
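To illustrate why every generated schema ends up with the same values, here is a minimal sketch using AngleSharp directly rather than WebReaper. The HTML snippet is made up for the example and only reuses the class names from the selectors above; it assumes the AngleSharp NuGet package is referenced.

```csharp
using System;
using System.Linq;
using AngleSharp.Html.Parser;

// Made-up markup that mirrors the class names used in the selectors above.
const string sampleHtml = """
<div class="min-w-0"><a><h2>First</h2></a><p class="font-semibold">10</p></div>
<div class="min-w-0"><a><h2>Second</h2></a><p class="font-semibold">20</p></div>
""";

var sampleDocument = new HtmlParser().ParseDocument(sampleHtml);

// QuerySelector returns only the first matching element, no matter how
// many schemas ask for the same selector.
var firstName = sampleDocument.QuerySelector("div.min-w-0 > a > h2")?.TextContent;

// QuerySelectorAll returns every match in document order.
var allNames = sampleDocument.QuerySelectorAll("div.min-w-0 > a > h2")
    .Select(e => e.TextContent)
    .ToList();

Console.WriteLine(firstName);
Console.WriteLine(string.Join(", ", allNames));
```

With ten schemas that all carry the same selector, each one would get the first listing again, which matches what I'm seeing.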
Additional context
To keep backwards compatibility, I think this needs to be implemented on SchemaElement with a new property, maybe IsList or IsArray. In FillOutput() (https://github.com/pavlovtech/WebReaper/blob/master/WebReaper/Core/Parser/Concrete/AngleSharpContentParser.cs#L43), inside the try block, we can check whether the element is a list; if so, GetListData() returns a list of data to add to a JArray.
I'm willing to work on a PR with some guidance/approval.
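A rough, standalone sketch of the idea, written against AngleSharp and Newtonsoft.Json instead of the actual WebReaper types. The tuple with an IsList flag stands in for the proposed SchemaElement property, and GetSingleData/GetListData are hypothetical helpers, not existing library methods. The sample HTML is made up.

```csharp
using System;
using AngleSharp.Dom;
using AngleSharp.Html.Parser;
using Newtonsoft.Json.Linq;

// Made-up listing page with two items.
const string listingHtml = """
<div class="min-w-0"><a><h2>First</h2></a><p class="font-semibold">10</p></div>
<div class="min-w-0"><a><h2>Second</h2></a><p class="font-semibold">20</p></div>
""";

var listingDocument = new HtmlParser().ParseDocument(listingHtml);

// Stand-in for SchemaElement plus the proposed flag (hypothetical shape).
var elements = new (string Field, string Selector, bool IsList)[]
{
    ("Name", "div.min-w-0 > a > h2", true),
    ("Amount", "div.min-w-0 > p.font-semibold", true),
    ("FirstName", "div.min-w-0 > a > h2", false), // existing single-value behaviour
};

var output = new JObject();
foreach (var element in elements)
{
    // Branch on the flag: list elements become a JArray, others stay a single value.
    JToken value = element.IsList
        ? GetListData(listingDocument, element.Selector)
        : GetSingleData(listingDocument, element.Selector);
    output[element.Field] = value;
}

Console.WriteLine(output.ToString());

// Hypothetical helper: first match only, as the parser does today.
static JToken GetSingleData(IParentNode root, string selector) =>
    new JValue(root.QuerySelector(selector)?.TextContent.Trim());

// Hypothetical helper: every match, collected into a JArray.
static JArray GetListData(IParentNode root, string selector)
{
    var array = new JArray();
    foreach (var node in root.QuerySelectorAll(selector))
    {
        array.Add(node.TextContent.Trim());
    }
    return array;
}
```

Elements without the flag keep returning a single value, so existing schemas would not change. Note that this produces one array per field; grouping Name and Amount per listing would need something more, which I'd be happy to discuss.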