Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't finish the lambda until all logs will appear in the output #64

Open
AntonZaets opened this issue May 1, 2019 · 10 comments
Open
Labels
enhancement New feature or request

Comments

@AntonZaets
Copy link
Contributor

Problem statement
Due to asynchronous nature of the output we have a situation when logs for the particular invocation doesn't appear between START <invocation id> and END <invocation id>:

  1. Appear in the next invocation;
  2. Don't appear at all if lambda's container is killed due to some reasons.

In the past (in js version) we waited for special sequence in the output of Erlang VM and later directly called console.log from js. I don't exactly know why it worked better (maybe it was overloaded, maybe it prints synchronously).

Example of the logs that was truncated:


START RequestId: e4031f8b-b8de-4dec-bde4-8804ceb606d4 Version: $LATEST
17:11:12 <normal logs without any errors started here>
17:11:12 <last normal log without any errors>
17:11:16 END RequestId: e4031f8b-b8de-4dec-bde4-8804ceb606d4
17:11:16 REPORT RequestId: e4031f8b-b8de-4dec-bde4-8804ceb606d4  Duration: 4551.62 ms    Billed Duration: 4600 ms Memory Size: 2048 MB   Max Memory Used: 354 MB
17:11:24 START RequestId: a7ce76e0-bbda-44d9-87c8-13bd393f24ef Version: $LATEST
17:11:29 END RequestId: a7ce76e0-bbda-44d9-87c8-13bd393f24ef
17:11:29 REPORT RequestId: a7ce76e0-bbda-44d9-87c8-13bd393f24ef  Duration: 4623.58 ms    Billed Duration: 4700 ms Memory Size: 2048 Max Memory Used: 2048 

e4031f8b-b8de-4dec-bde4-8804ceb606d4 really finished with an error {"Unhandled",<<"{\"errorMessage\":\"{{invalid_cast,{read,{error,{http_error,timeout}}}},\\n {gen_server,call,[<0.521.0>,next,30000]}}\",\"errorType\":\"HandlerFailure\"}">>} - we see this in the place where lambda is called. However, there aren't any logs in the output.

I see that the next invocation is failed with OOM (Memory Size: 2048 Max Memory Used: 2048 ) and perhaps this the cause why logs from previous execution didn't appear in the logs for a new invocation.

Proposal

I propose to add some option to erllambda config like wait_for_logs or sync_logs with boolean value and that option will trigger mechanism for logs synchronisation. The mechanism should be found in experiments.

@motobob @velimir what do you think, guys?

@lehoff FYI

@AntonZaets AntonZaets added the enhancement New feature or request label May 1, 2019
@motobob
Copy link
Contributor

motobob commented May 1, 2019

will not just a stdout flush help?

@AntonZaets
Copy link
Contributor Author

AntonZaets commented May 1, 2019

Does erlang have this? I tried to find this in the past and didn't find.

@velimir
Copy link
Contributor

velimir commented Jul 6, 2019

Seems like the main reason why messages would appear only on a subsequent call is that messages are being processed asynchronously by error_logger gen event (meaning only lambda with erlang < 21 would suffer from this as in 21 release error_logger backend was rewritten).

I assume that a workaround like this would reduce the number of "missed" messages, waiting error_logger to process all the "log" messages before thaw kicks in:

diff --git a/src/erllambda_poller.erl b/src/erllambda_poller.erl
index af9314d..36e7cf1 100644
--- a/src/erllambda_poller.erl
+++ b/src/erllambda_poller.erl
@@ -160,6 +160,7 @@ invoke_next(#state{runtime_addr = Addr, aws_cfg = AwsCfg}) ->
         "/", ?API_VERSION/binary,
         ?INVOKE_NEXT_PATH/binary>>),
     erllambda:message("Invoke Next path ~p ~s", [os:system_time(millisecond), FullPath]),
+    gen_event:sync_notify(error_logger, thaw),
     %% infinity due to container Freeze/thaw behaviour
     case request(FullPath, get, [], "", infinity, AwsCfg) of
         {ok, {{200, _}, Hdrs, Rsp}} ->
@@ -179,6 +180,7 @@ invoke_success(#state{runtime_addr = Addr, aws_cfg = AwsCfg}, AwsReqId, Body) ->
         "/", ?API_VERSION/binary,
         (?INVOKE_REPLAY_SUCCESS_PATH(AwsReqId))/binary>>),
     erllambda:message("Invoke Success path ~p ~s", [os:system_time(millisecond), FullPath]),
+    gen_event:sync_notify(error_logger, thaw),
     %% infinity due to container Freeze/thaw behaviour
     case request(FullPath, post, [], encode_body(Body), infinity, AwsCfg) of
         {ok, {{202, _}, _Hdrs, _Rsp}} ->
@@ -201,6 +203,7 @@ invoke_error(#state{runtime_addr = Addr, aws_cfg = AwsCfg}, AwsReqId, Body) ->
         "/", ?API_VERSION/binary,
         (?INVOKE_REPLAY_ERROR_PATH(AwsReqId))/binary>>),
     erllambda:message("Invoke Error path ~p ~s", [os:system_time(millisecond), FullPath]),
+    gen_event:sync_notify(error_logger, thaw),
     %% infinity due to container Freeze/thaw behaviour
     case request(FullPath, post, [], encode_body(Body), infinity, AwsCfg) of
         {ok, {{202, _}, _Hdrs, _Rsp}} ->

@motobob
Copy link
Contributor

motobob commented Jul 11, 2019

@AntonZaets i thought you're on R21+. do you still see the issue?
if so - i've fine with such a change proposed by @velimir added with caution using am explicit config

@al-pavel-baturko
Copy link
Contributor

@motobob as far as I see it's still an issue (with 21.3.8.4). Do you know any workarounds for that?

@motobob
Copy link
Contributor

motobob commented Apr 16, 2020

but the path ^ #64 (comment) is not applied as I see. try it

@al-pavel-baturko
Copy link
Contributor

@motobob looks like it's improving the situation with logs a lot

@motobob
Copy link
Contributor

motobob commented Apr 16, 2020

in the expense of...? how much longer the invocation becomes? if < 50ms send a PR and let's close it

@al-pavel-baturko
Copy link
Contributor

my lambda communicates with other services so can't really say how invocation time has changed with this patch, it varies a lot from call to call.. need some synthetic tests I think.

@motobob
Copy link
Contributor

motobob commented Apr 16, 2020

just measure the AVG...

ts = now()
...
metric(now() - ts) -> logs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Development

No branches or pull requests

4 participants