-
Notifications
You must be signed in to change notification settings - Fork 604
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Updates to the MF4 reader to enable more general MF4 support #1884
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, will be great to support more Mf4 files.
My only concern really is how this will behave with large mf4 files on resource constrained systems as this looks to load all DataFrame data into memory at once during class initialization instead of iterating through the data frame by frame.
Obviously there should be upsides to this bulk processing if a device has the resources for it, but consider a lower footprint approach.
can/io/mf4.py
Outdated
default_object = can.Message() | ||
|
||
column_channel = data["CAN_DataFrame.BusChannel"].tolist() if "CAN_DataFrame.BusChannel" in names else [default_object.channel for _ in range(num_records)] | ||
column_ide = data["CAN_DataFrame.IDE"].astype(bool).tolist() if "CAN_DataFrame.IDE" in names else [default_object.is_extended_id for _ in range(num_records)] | ||
column_dir = data["CAN_DataFrame.Dir"].astype(bool).tolist() if "CAN_DataFrame.Dir" in names else [default_object.is_rx for _ in range(num_records)] | ||
column_edl = data["CAN_DataFrame.EDL"].astype(bool).tolist() if "CAN_DataFrame.EDL" in names else [default_object.is_fd for _ in range(num_records)] | ||
column_brs = data["CAN_DataFrame.BRS"].astype(bool).tolist() if "CAN_DataFrame.BRS" in names else [default_object.bitrate_switch for _ in range(num_records)] | ||
column_esi = data["CAN_DataFrame.ESI"].astype(bool).tolist() if "CAN_DataFrame.ESI" in names else [default_object.error_state_indicator for _ in range(num_records)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another option here is to add all the arguments that are set to a dict
then construct the Message
from the dict:
Message(**message_data)
That way we don't have to create this default_object.
can/io/mf4.py
Outdated
# Transform to python-can Messages | ||
for i in range(num_records): | ||
self._samples.append( | ||
Message( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh these function names don't quite line up with the behavior - consider revising for clarity
Hi Brian, thanks for your feedback! We have made an updated PR and include a few comments below:
Below is an example comparison of performance: Data: 10 MB finalized MF4 file Close mimic of current implementation:
Previous PR:
Updated PR (using 1000 row chunk size):
In other words, our proposed alternative PR would yield the same RAM usage, while improving the readout speed by 20x+. If you believe the stats above and the proposed alternative PR would be acceptable, we can update the PR accordingly.
|
I recommend to rebase onto the main branch, there was a problem with CI. |
Related to feature request posted below:
#1883