Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Log is misleading when a post fails #1325

Open
reidsunderland opened this issue Dec 5, 2024 · 4 comments
Open

Log is misleading when a post fails #1325

reidsunderland opened this issue Dec 5, 2024 · 4 comments
Assignees
Labels
bug Something isn't working enhancement New feature or request good first issue Good for newcomers log log messages inscrutable Priority 5 - Defect Missing and/or broken functionality - do not forget

Comments

@reidsunderland
Copy link
Member

Related to/witnessed in #1322 and #1289.

When a message is not posted successfully, the log plugin still logs a message saying the message was posted. If you are very familiar with sr3, you might notice that it says ... posted to a file ... instead of the correct ... posted to exchange: ... topic: ... a file ....

This is really misleading and confusing, especially if you are not very familiar with sr3. When we are grepping through logs, we always search for the keyword posted.

When there is a failure posting, we should not log a message saying the file was posted.

[ERROR] sarracenia.flow post flowCallback plugin <bound method Message.post of <sarracenia.flowcb.post.message.Message object at 0x7fd4d6d40940>> crashed: 'identity'
...
[INFO] sarracenia.flowcb.log after_post posted to a file with baseUrl:

vs a successful post to an AMQP broker:

[INFO] sarracenia.flowcb.log after_post posted to exchange: xpublic topic: v02.post.20241205 a file with baseUrl:
@reidsunderland reidsunderland added bug Something isn't working enhancement New feature or request Priority 5 - Defect Missing and/or broken functionality - do not forget good first issue Good for newcomers labels Dec 5, 2024
@reidsunderland
Copy link
Member Author

The problem is here:

def post(self, worklist):
still_ok = []
all_good=True
for m in worklist.ok:
if all_good and hasattr(self.poster,'putNewMessage') and self.poster.putNewMessage(m):
still_ok.append(m)
else:
all_good=False
worklist.failed.append(m)
worklist.ok = still_ok

When putNewMessage returns False, this code does the right thing. But if putNewMessage throws an exception it doesn't get handled and I the message just stays in worklist.ok.

It should be a pretty easy fix.

@reidsunderland
Copy link
Member Author

(Either we handle exceptions in the post/message.py post method, or fix AMQP and all other moth classes' putNewMessage to handle exceptions internally and always return False, or maybe both)

@andreleblanc11
Copy link
Member

( Including this in this issue as it has to do with logging being misintepreted. )

When a file fails to be put through a ftp transfer with a sender, the logging gives a very confusing response. It states that it fails but then goes on to confirm success of the transfer. I confirmed that the transfer did in fact fail and the success log is wrong.

2024-12-06 16:04:45,324 [ERROR] sarracenia.transfer.ftp put failed to put cmcrfe.2024120600_036 to cmcrfe.2024120600_036: 550 Permission denied.
2024-12-06 16:04:45,325 [DEBUG] sarracenia.transfer.ftp chmod sr_ftp chmod 420 cmcrfe.2024120600_036
2024-12-06 16:04:45,389 [INFO] sarracenia.flow send Sent: /apps/sarra/public_data//20241206/MSC-SCI-CMC-OPS/noaa/cmcrfe.2024120600_036  into
/in/cmcrfe.2024120600_036 0-41063
2024-12-06 16:04:45,389 [DEBUG] sarracenia.flow.sender do processing 1 messages worked!
2024-12-06 16:04:45,389 [DEBUG] sarracenia.flowcb.retry after_work loading from post_retry_001: qty=24 ... got: 0
2024-12-06 16:04:45,389 [INFO] sarracenia.flowcb.log after_work sent ok: /in/cmcrfe.2024120600_036

I think the problem is that the try/except statement in here

elif inflight[0] == '.':
new_inflight_path = new_file + inflight
if not self.o.dry_run:
try:
if accelerated:
len_written = self.proto[self.scheme].putAccelerated(
msg, local_file, new_inflight_path)
else:
len_written = self.proto[self.scheme].put(msg, local_file, new_inflight_path)
except Exception as ex:
logger.error( f"could not send {local_dir}{os.sep}{local_file} inflight={inflight} {sendTo} {msg['new_dir']}/{new_file}: {ex}" )
return 0
try:
self.proto[self.scheme].rename(new_inflight_path, new_file)
except Exception as ex:
logger.error( f"could not rename inflight={inflight} {sendTo} {msg['new_dir']}/{new_file}: {ex}" )
return 0

we should be checking if the len_written is greater then 0, and indicate a failed transfer if it is.

@petersilva petersilva added the log log messages inscrutable label Dec 9, 2024
@petersilva petersilva self-assigned this Dec 9, 2024
@petersilva petersilva reopened this Dec 19, 2024
@petersilva
Copy link
Contributor

the intermittent failures of the flakey_broker flow test in many contexts are likely due to over counting of successful posts because of this bug.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working enhancement New feature or request good first issue Good for newcomers log log messages inscrutable Priority 5 - Defect Missing and/or broken functionality - do not forget
Projects
None yet
Development

No branches or pull requests

3 participants