Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Xray initial support #273

Open
wants to merge 7 commits into
base: master
Choose a base branch
from
Open

Xray initial support #273

wants to merge 7 commits into from

Conversation

artemnikitin
Copy link
Contributor

@artemnikitin artemnikitin commented Jun 24, 2018

Initial implementation of #169

Issue Type

  • Feature Pull Request

Summary

Added initial support for X-Ray tracing.
To be enabled it requires to enable option Enable active tracing for Lambda. See https://docs.aws.amazon.com/xray/latest/devguide/xray-services-lambda.html for details.
Current implementation only trace calls to AWS API.

Code contribution checklist

  1. The contribution fixes a single existing github issue, and it is linked
    to it.
  2. The code is as simple as possible, readable and follows the idiomatic Go
    guidelines.
  3. All new functionality is covered by automated test cases so the overall
    test coverage doesn't decrease.
  4. No issues are reported when running make full-test.
  5. Functionality not applicable to all users should be configurable.
  6. Configurations should be exposed through Lambda function environment
    variables which are also passed as parameters to the
    CloudFormation
    and
    Terraform
    stacks defined as infrastructure code.
  7. Global configurations set from the infrastructure stack level should also
    support per-group overrides using tags.
  8. Tags names and expected values should be similar to the other existing
    configurations.
  9. Both global and tag-based configuration mechanisms should be tested and
    proven to work using log output from various test runs.
  10. The logs should be kept as clean as possible (use log levels as
    appropriate) and formatted consistently to the existing log output.
  11. The documentation is updated to cover the new behavior, as well as the
    new configuration options for both stack parameters and tag overrides.
  12. A code reviewer reproduced the problem and can confirm the code
    contribution actually resolves it.

@artemnikitin artemnikitin requested review from xlr-8 and cristim June 24, 2018 10:28
@coveralls
Copy link

coveralls commented Jun 24, 2018

Coverage Status

Coverage increased (+0.02%) to 80.952% when pulling 245d7f7 on artemnikitin:xray-initial-support into cedc8ba on cristim:master.

@artemnikitin
Copy link
Contributor Author

It's not clear how to test it properly. I tested it with uploading to my personal AWS account and running from it. There were no any EC2 instances in it, but first AWS request DescribeRegions was traced properly. I'm assuming that all other requests will be properly traced too.

@artemnikitin
Copy link
Contributor Author

Regarding codeclimate issue. It doesn't like then function accepting 5 arguments...

@cristim
Copy link
Member

cristim commented Jun 25, 2018

Is is really needed to pass the ctx as argument to all the functions?

Can't we set it as a field in the structs and take it from there?

@artemnikitin
Copy link
Contributor Author

@cristim According to https://github.com/golang/go/wiki/CodeReviewComments#contexts it's the best practice...

@xlr-8
Copy link
Contributor

xlr-8 commented Jun 25, 2018

Does this PR requires a review? Whether yes or no, I guess it could be great to label it based on its status

@artemnikitin
Copy link
Contributor Author

@xlr-8 I added some, but feel free to add more if you think that this is necessary.

@cristim
Copy link
Member

cristim commented Jun 25, 2018

@artemnikitin fair enough, I guess codeclimate should not count contexts in the argument list of this check

@artemnikitin
Copy link
Contributor Author

@cristim anything else from your side?

@cristim
Copy link
Member

cristim commented Jun 28, 2018

@artemnikitin tomorrow is my 20% day, I'll look at it in more depth.

@cristim
Copy link
Member

cristim commented Jun 29, 2018

I tried run this locally and got panics because of the missing context segments.
I can't merge this before we ensure that the binary also works when executed locally.

$ make && ./autospotting                                                                                                                                                          
ok	all files passed go fmt
ok	all files passed go vet
GOOS=linux go build -ldflags="-X main.Version=custom-852d35b" -o autospotting
ok  	github.com/cristim/autospotting/core	0.010s	coverage: 80.3% of statements
2018/06/29 15:10:06 Starting autospotting agent, build custom-852d35b
2018/06/29 15:10:06 Parsed command line flags: regions='' min_on_demand_number=0 min_on_demand_percentage=0.0 allowed_instance_types= disallowed_instance_types= on_demand_price_multiplier=1.00 spot_price_buffer_percentage=10.000 bidding_policy=normal tag_filters= tag_filter_mode=opt-in spot_product_description=Linux/UNIX (Amazon VPC)
2018/06/29 15:10:06 main.go:127: Scanning for available AWS regions
panic: failed to begin subsegment named 'ec2': segment cannot be found.

goroutine 1 [running]:
github.com/cristim/autospotting/vendor/github.com/aws/aws-xray-sdk-go/strategy/ctxmissing.(*DefaultRuntimeErrorStrategy).ContextMissing(0x11e0af0, 0xa5fd00, 0xc4212d5d30)
	/home/magherus/devel/gocode/src/github.com/cristim/autospotting/vendor/github.com/aws/aws-xray-sdk-go/strategy/ctxmissing/default_context_missing.go:47 +0x35
github.com/cristim/autospotting/vendor/github.com/aws/aws-xray-sdk-go/xray.BeginSubsegment(0xca5d80, 0xc4200ac010, 0xbf8c86, 0x3, 0x0, 0x0, 0x0)
	/home/magherus/devel/gocode/src/github.com/cristim/autospotting/vendor/github.com/aws/aws-xray-sdk-go/xray/segment.go:170 +0x661
github.com/cristim/autospotting/vendor/github.com/aws/aws-xray-sdk-go/xray.glob..func1(0xc42020c800)
	/home/magherus/devel/gocode/src/github.com/cristim/autospotting/vendor/github.com/aws/aws-xray-sdk-go/xray/aws.go:58 +0xa1
github.com/cristim/autospotting/vendor/github.com/aws/aws-sdk-go/aws/request.(*HandlerList).Run(0xc42020c948, 0xc42020c800)
	/home/magherus/devel/gocode/src/github.com/cristim/autospotting/vendor/github.com/aws/aws-sdk-go/aws/request/handlers.go:213 +0x9d
github.com/cristim/autospotting/vendor/github.com/aws/aws-sdk-go/aws/request.(*Request).Build(0xc42020c800, 0xc42006fb00, 0x484376)
	/home/magherus/devel/gocode/src/github.com/cristim/autospotting/vendor/github.com/aws/aws-sdk-go/aws/request/request.go:357 +0x68
github.com/cristim/autospotting/vendor/github.com/aws/aws-sdk-go/aws/request.(*Request).Sign(0xc42020c800, 0x60589e8, 0x11c32e0)
	/home/magherus/devel/gocode/src/github.com/cristim/autospotting/vendor/github.com/aws/aws-sdk-go/aws/request/request.go:378 +0x2f
github.com/cristim/autospotting/vendor/github.com/aws/aws-sdk-go/aws/request.(*Request).Send(0xc42020c800, 0x0, 0x0)
	/home/magherus/devel/gocode/src/github.com/cristim/autospotting/vendor/github.com/aws/aws-sdk-go/aws/request/request.go:486 +0x16d
github.com/cristim/autospotting/vendor/github.com/aws/aws-sdk-go/service/ec2.(*EC2).DescribeRegionsWithContext(0xc420170080, 0x7f6fef4b6678, 0xc4200ac010, 0xc420079b80, 0x0, 0x0, 0x0, 0x49, 0xc42006fc98, 0x55bc08)
	/home/magherus/devel/gocode/src/github.com/cristim/autospotting/vendor/github.com/aws/aws-sdk-go/service/ec2/api.go:12169 +0xa4
github.com/cristim/autospotting/core.getRegions(0xca5d80, 0xc4200ac010, 0xcb4880, 0xc420170080, 0x30, 0xc4200b2120, 0x0, 0x0, 0x4cf570)
	/home/magherus/devel/gocode/src/github.com/cristim/autospotting/core/main.go:129 +0x12a
github.com/cristim/autospotting/core.Run(0xca5d80, 0xc4200ac010, 0xc42025c270)
	/home/magherus/devel/gocode/src/github.com/cristim/autospotting/core/main.go:37 +0x131
main.run(0xca5d80, 0xc4200ac010)
	/home/magherus/devel/gocode/src/github.com/cristim/autospotting/autospotting.go:62 +0x41b
main.main()
	/home/magherus/devel/gocode/src/github.com/cristim/autospotting/autospotting.go:30 +0x82
FAIL: 2

@cristim
Copy link
Member

cristim commented Jun 29, 2018

Forgot to mention:

It also fails if I tried to make the ContextMissingStrategy not panic by using this in main, but it just panics somewhere else:

func main() {
	if os.Getenv("AWS_LAMBDA_FUNCTION_NAME") != "" {
		lambda.Start(run)
	} else {
		ctx, _ := xray.ContextWithConfig(
			context.Background(),
			xray.Config{ContextMissingStrategy: ctxmissing.NewDefaultLogErrorStrategy()})

		run(ctx)
	}
}```

@artemnikitin
Copy link
Contributor Author

@cristim I will look into that. Didn't try to run it locally...

@artemnikitin
Copy link
Contributor Author

@cristim I added change with the fix for a panic. For me locally everything works. Please check it when you will have time.

autospotting.go Outdated
@@ -58,7 +59,10 @@ func run(ctx context.Context) {
conf.TagFilteringMode,
conf.SpotProductDescription)

xray.Configure(xray.Config{LogLevel: "error"})
xray.Configure(xray.Config{
ContextMissingStrategy: ctxmissing.NewDefaultLogErrorStrategy(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this looks much better than I had attempted... I'm just sorry I wasted a few hours trying to fix this myself without any results.

I'll try it out tomorrow.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worked like a charm!

Copy link
Member

@cristim cristim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I confirm the panics are fixed, but I'd leave the final verdict to @xlr-8

@artemnikitin
Copy link
Contributor Author

@xlr-8 will you have time to look on it?

Copy link
Contributor

@xlr-8 xlr-8 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! I wonder however if it shouldn't be made optional as X-Ray isn't free, we currently have very little view on how much it would cost depending on the setups.

Traces are pretty cheap for sure, but if one is generated at each method call, and then runs of lambda every 2min or so I wouldn't like to leave users with a significant cost increase.

On top of that, X-Ray is mostly for the profiling of the application. So it might be worth enabling it by default on all installs.

WDYS?

autospotting.go Outdated
xray.Configure(xray.Config{
ContextMissingStrategy: ctxmissing.NewDefaultLogErrorStrategy(),
LogLevel: "error",
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be configurable from an env variable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What exactly do you want to configure in this place? Log level?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made log level configurable via flag, but it looks like there is a bug in X-Ray SDK. aws/aws-xray-sdk-go#60. At the moment, it will post a particular error message ignoring log level...

core/region.go Outdated
@@ -79,25 +81,25 @@ func (r *region) processRegion() {
r.setupAsgFilters()

logger.Println("Scanning for enabled AutoScaling groups in ", r.name)
r.scanForEnabledAutoScalingGroups()
r.scanForEnabledAutoScalingGroups(context.Background())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The context.Background is usually used for empty context (test/main), so I'm a bit surprised to see it invoked various times. What's the reason behind this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... context.Background() should be used in tests... I will check, there shouldn't be such things in code

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching

@artemnikitin
Copy link
Contributor Author

artemnikitin commented Jul 8, 2018

@xlr-8 to enable it you need to check Enable active tracing in Lambda. With our setup scripts, it's disabled by default. The same option also enables or disables X-Ray in general.
In my opinion, it makes sense to have this option configurable and disabled as it is right now.

@artemnikitin
Copy link
Contributor Author

Now log level for X-Ray configured via xray_log_level parameter

@xlr-8
Copy link
Contributor

xlr-8 commented Jul 10, 2018

@xlr-8 to enable it you need to check Enable active tracing in Lambda. With our setup scripts, it's disabled by default. The same option also enables or disables X-Ray in general.
In my opinion, it makes sense to have this option configurable and disabled as it is right now.

Isn't it something that can be enabled via terraform / cloudformation directly?
Otherwise sounds good to me!

EDIT: Also, the IAM permission of the lambda function might need to be enriched for the tracing

@artemnikitin
Copy link
Contributor Author

Isn't it something that can be enabled via terraform / cloudformation directly?

Yes, it should be possible to do with these tools. I will check it...

@cristim
Copy link
Member

cristim commented Nov 29, 2018

@artemnikitin please consider rebasing this PR on top of the recent state of the code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants