Skip to content

Check for messages[mtype]._instances is None#822

Closed
clydemcqueen wants to merge 0 commit into
ArduPilot:masterfrom
clydemcqueen:clyde_fix_807
Closed

Check for messages[mtype]._instances is None#822
clydemcqueen wants to merge 0 commit into
ArduPilot:masterfrom
clydemcqueen:clyde_fix_807

Conversation

@clydemcqueen

Copy link
Copy Markdown

This is a small change to fix #807. I've tested this against quite a few tlog files over the past 2 weeks and it works great. Running pytest comes up clean.

See the longer discussion in https://discuss.bluerobotics.com/t/mavlogdump-py-crashes-while-reading-distance-sensor-msgs-in-some-tlog-files/14228

Thanks,
/Clyde

Comment thread mavutil.py Outdated
return
instance_value = getattr(msg, msg._instance_field)
if not mtype in messages:
if mtype not in messages or messages[mtype]._instances is None:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not check for msg.get_type() == 'BAD_DATA' at the top of this method, and not do silly things with invalid data?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review! @peterbarker

That fixes the bug, and pytest succeeds. But has the side effect of never adding BAD_DATA messages to mavfile sysid_state.messages, which is a slight change to behavior. I looked for all usages in this repo, and I don't see any problems, and in fact I imagine it might fix an obscure class of bugs. If you are OK w/ this I'll make the change.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made this change and rebased. Tests pass locally. Thanks.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops, I added the if test before the docstring. Fixed.

@clydemcqueen

Copy link
Copy Markdown
Author

Whoops, I lost track of this. Thanks @peterbarker and @tridge for the review and fix. I rebased.

Comment thread mavutil.py Outdated
def add_message(messages, mtype, msg):
'''add a msg to array of messages, taking account of instance messages'''
if msg.get_type() == 'BAD_DATA':
# avoid dealing with instance field

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you made sure that BAD_DATA still appears in MAVProxy's messages command after this change?

@clydemcqueen clydemcqueen Dec 30, 2025

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have confirmed that BAD_DATA messages are still delivered by recv_match() from a tlog file after this change.

Without fix, crashes on the 5th bad message:

(.venv) pymavlink ~/dive_logs/seaq_logs (main=)$ /home/clyde/projects/ardusub_log_tools/tlog_bad_data.py --verbose 2023_04_18_surftrak3/2023-04-18\ 11-03-34.tlog
Processing 1 files
-------------------
Results for 2023_04_18_surftrak3/2023-04-18 11-03-34.tlog
BadDataMsg mavlink2=True sysid=1 compid=194 msg_id=132 reason: invalid MAVLink CRC in msgID 132 0x2fb3 should be 0x6d00
BadDataMsg mavlink2=True sysid=1 compid=194 msg_id=132 reason: invalid MAVLink CRC in msgID 132 0x8261 should be 0x4c16
BadDataMsg mavlink2=True sysid=1 compid=194 msg_id=132 reason: invalid MAVLink CRC in msgID 132 0xe104 should be 0x27d7
BadDataMsg mavlink2=True sysid=1 compid=194 msg_id=132 reason: invalid MAVLink CRC in msgID 132 0x63c9 should be 0x1e79
CRASH WITH ERROR "'NoneType' object does not support item assignment" READING 2023_04_18_surftrak3/2023-04-18 11-03-34.tlog

Same file with fix (not verbose). Most (not all) bad messages were DISTANCE_SENSOR.

(.venv) pymavlink ~/dive_logs/seaq_logs (main=)$ /home/clyde/projects/ardusub_log_tools/tlog_bad_data.py 2023_04_18_surftrak3/2023-04-18\ 11-03-34.tlog
Processing 1 files
-------------------
Results for 2023_04_18_surftrak3/2023-04-18 11-03-34.tlog
msg_id 21 count 6
msg_id 76 count 4
msg_id 77 count 4
msg_id 132 count 1949
msg_id 259 count 1
msg_id 260 count 1
msg_id 262 count 1
msg_id 269 count 1
1967 BAD_DATA messages, 1967 of them were CRC errors

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With MAVExplorer.py, before fix:

MAV> dump DISTANCE_SENSOR
MAV> ERROR in command ['DISTANCE_SENSOR']: 'NoneType' object does not support item assignment

With fix:

MAV> dump DISTANCE_SENSOR
MAV> 2023-04-18 10:58:18.373 DISTANCE_SENSOR {time_boot_ms : 214433, min_distance : 20, max_distance : 5000, current_distance : 3948, type : 1, id : 1, orientation : 25, covariance : 255, horizontal_fov : 0.5199999809265137, vertical_fov : 0.5199999809265137, quaternion : [0.0, 0.0, 0.0, 0.0], signal_quality : 6}
2023-04-18 10:58:18.373 DISTANCE_SENSOR {time_boot_ms : 214517, min_distance : 20, max_distance : 5000, current_distance : 3942, type : 1, id : 1, orientation : 25, covariance : 255, horizontal_fov : 0.5199999809265137, vertical_fov : 0.5199999809265137, quaternion : [0.0, 0.0, 0.0, 0.0], signal_quality : 7}
...

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I think I do need to understand what's going on here.

It feels like we might be adding a BAD_DATA message to messages['BAD_DATA'] - and that object does not behave enough like a "normal" message to have things work. We may need to work out in what way the BAD_DATA doesn't match expected behaviour and either fix the caller or fix the object.

Where is the 'NoneType' object does not support item assignment exception being raised? set debug 100 (I made the number up) might give you a full stack trace to work that out.

You didn't actually answer my question here, BTW. If I'm understanding this even a little then messages in MAVProxy when running on a poor link will now not contain the BAD_DATA message, which would be a regression. We should be able to treat the thing as a normal message.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @peterbarker -- I added the stack trace and some notes to the linked issue.

Yeah, I didn't test the "bad comms" case. I will need to rig something up to test this.

I need to spend more time thinking about the appropriate fix. Feel free to close this PR.

Happy New Year!

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @peterbarker -- I added the stack trace and some notes to the linked issue.

Yeah, I didn't test the "bad comms" case. I will need to rig something up to test this.

I need to spend more time thinking about the appropriate fix. Feel free to close this PR.

Nono - this is obviously a problem, and I'd not want to lose the thinking that's been put in here.

Happy New Year!

Thanks!

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got nerd-sniped by this, so decided to look into it with Clyde's latest debug info, and have written up investigation notes in the issue that I believe explain what's going on and how to properly fix it.

Tl;dr:

  1. add_message is being erroneously called with a name (instead of a type), so the handling code is using conflicting information
  2. I don't think it should be being passed the type in the first place, since it can just get it from the message object using msg.get_type().

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Exception in mavutil.py after bad_data multi-instance message

4 participants