Pipeline.Text #95

sahilgupta2105 · 2021-05-26T02:29:00Z

Some of the methods feel incomplete. For example, from_folder ingests a set of text files from a folder, but what about the labels?

aiqc · 2021-05-26T17:06:21Z

Option a) Add a list argument for labels, and validate that the number of list elements matches the number of text data entries?

Option b) When I faced this problem for Dataset.Image, I created the higher-level Pipeline.Image, which constructs both a Dataset.Tabular for the labels and a Dataset.Image for the images. That is the main reason Splitset accepts labels and features from different datasets.

If you choose not to include label columns in Dataset.Text, you are free to name the columns whatever you like, and the text-based encoding methods can be applied to them automatically by default.

So there are pros and cons.
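For illustration, Option a could look roughly like the sketch below. It is not AIQC code: `ingest_text_folder` is a hypothetical standalone helper, and the `*.txt` glob and the sorted file order are assumptions made only so that the labels line up with the files.

```python
from pathlib import Path
from typing import List, Sequence, Tuple


def ingest_text_folder(folder: str, labels: Sequence[str]) -> List[Tuple[str, str]]:
    """Hypothetical helper (not part of AIQC): pair each .txt file with a label."""
    # Sort the paths so the file order is deterministic and matches `labels`.
    paths = sorted(Path(folder).glob("*.txt"))

    # Option a: reject the call when the label count and file count differ.
    if len(labels) != len(paths):
        raise ValueError(
            f"Got {len(labels)} labels for {len(paths)} text files in {folder!r}."
        )

    # Return (text, label) pairs in file order.
    return [(p.read_text(encoding="utf-8"), label) for p, label in zip(paths, labels)]
```

The trade-off is that the caller has to know the file order in advance. Option b avoids that by keeping the labels in their own dataset.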

sahilgupta2105 added the bug label May 26, 2021

sahilgupta2105 self-assigned this May 26, 2021

aiqc added the feature label May 8, 2022

aiqc changed the title ~~revisit data ingestion methods for text dataset~~ Pipeline.Text May 8, 2022

aiqc removed the bug label May 8, 2022
