Scala case class fields shuffling #225

AlexisBRENON · 2021-11-26T13:10:09Z

Hi, I just created #224 to highlight a weird behavior when using ScalaPB encoders with "classic" Scala case classes.

Given a case class with two fields of the same type (AddressLike(street: Option[String], city: Option[String])), I save a dataset using the first field (street) as the partitioning column.
Then, when loading the dataset, Spark creates a dataframe with columns city and street, in that order.
Finally, when collecting this dataframe to an Array[AddressLike], the value of the street field ends up in the city field and vice versa.

This seems to happen because the dataframe schema has the street field at the end (while it is the first field of the case class):

root
 |-- city: string (nullable = true)
 |-- street: string (nullable = true)

Providing the schema before calling .load does not change the column order of the loaded dataframe.

Finally, when deserializing to the Scala case class, the mapping seems to be done by position instead of by name, so city values are mapped to the street field and vice versa.

This index-based mapping can also be problematic if you have a dataframe with "useless" columns and try to "cast" it to a case class with fewer fields. I will add a test to highlight this too.
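To make the difference between the two mapping strategies concrete, here is a minimal sketch in plain Scala, without Spark. It is only an illustration of the suspected behavior, not the actual frameless or ScalaPB code: the Row alias, MappingSketch, decodeByPosition and decodeByName are hypothetical names, and a row is modelled as (column name, value) pairs in schema order.

```scala
// Hypothetical model, not the real encoder: a row is a sequence of
// (columnName, value) pairs, in the order given by the dataframe schema.
final case class AddressLike(street: Option[String], city: Option[String])

object MappingSketch {
  type Row = Seq[(String, Option[String])]

  // Positional decoding: the n-th column feeds the n-th constructor
  // parameter, whatever the column is called.
  def decodeByPosition(row: Row): AddressLike =
    AddressLike(row(0)._2, row(1)._2)

  // Name-based decoding: each constructor parameter is looked up by
  // column name, so column order and extra columns do not matter.
  def decodeByName(row: Row): AddressLike = {
    val byName = row.toMap
    AddressLike(byName("street"), byName("city"))
  }

  def main(args: Array[String]): Unit = {
    // After loading, the partition column (street) comes last.
    val reordered: Row =
      Seq("city" -> Option("Paris"), "street" -> Option("Main St"))
    println(decodeByPosition(reordered)) // AddressLike(Some(Paris),Some(Main St))
    println(decodeByName(reordered))     // AddressLike(Some(Main St),Some(Paris))

    // A dataframe with an extra leading column that the case class lacks.
    val withExtra: Row = ("id" -> Option("42")) +: reordered.reverse
    println(decodeByPosition(withExtra)) // AddressLike(Some(42),Some(Main St))
    println(decodeByName(withExtra))     // AddressLike(Some(Main St),Some(Paris))
  }
}
```

In this model, positional decoding silently swaps the two fields in the first case and picks up the unrelated id column in the second, while name-based decoding returns the expected value in both.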

Maybe this is more related to the frameless encoders than to the ScalaPB ones. I can forward this issue there if required.


AlexisBRENON · 2021-11-26T14:40:44Z

This seems to be a frameless issue.
I reproduced the errors on their repo: typelevel/frameless@master...AlexisBRENON:case_class_support#diff-dd83f3b1d1a249804b5620473177ce6034efbc5f36b45a9b1ef01283cafd50f9R540
And it seems that others have already reported it: typelevel/frameless#411

thesamet · 2021-12-21T16:30:52Z

I suggest we keep tracking this upstream. My understanding from the above is that it is not actionable by ScalaPB.

thesamet closed this as completed Dec 21, 2021
