This repository has been archived by the owner on Aug 26, 2021. It is now read-only.

kirillk77/cyanite-remover


cyanite-remover


cyanite-remover is a Cyanite data removal tool.


Building

Dependencies

cyanite-remover is a Clojure application that uses Leiningen as its build tool. Building it requires a working Leiningen installation and a JDK.

Building a Standalone JAR-file

lein uberjar

The built JAR file is placed in the target/uberjar directory. You can launch the tool by running the ./cyanite-remover command.

Building a Deb-package

Building the cyanite-remover deb package requires the dpkg-dev and fakeroot packages to be installed.

lein fatdeb

The built package is placed in the target directory.

Usage

Quick Help

cyanite-remover [options] remove-metrics <tenant> <rollup,...> <path,...> <cassandra_host,...> <elasticsearch_url>
cyanite-remover [options] remove-paths <tenant> <path,...> <elasticsearch_url>
cyanite-remover [options] remove-obsolete-data <tenant> <rollup,...> <path,...> <cassandra_host,...> <elasticsearch_url>
cyanite-remover [options] remove-empty-paths <tenant> <path,...> <elasticsearch_url>
cyanite-remover [options] list-metrics <tenant> <rollup,...> <path,...> <cassandra_host,...> <elasticsearch_url>
cyanite-remover [options] list-paths <tenant> <path,...> <elasticsearch_url>
cyanite-remover [options] list-obsolete-data <tenant> <rollup,...> <path,...> <cassandra_host,...> <elasticsearch_url>
cyanite-remover [options] list-empty-paths <tenant> <path,...> <elasticsearch_url>
cyanite-remover help

See commands, arguments and options for more details.

Commands

remove-metrics

Remove metrics from Cassandra.

cyanite-remover remove-metrics [options] tenant rollup(s) path(s) cassandra_host(s) elasticsearch_url

Available options: cassandra-batch-rate, cassandra-batch-size, cassandra-channel-size, cassandra-keyspace, cassandra-options, disable-log, disable-progress, elasticsearch-index, elasticsearch-scroll-batch-rate, elasticsearch-scroll-batch-size, exclude-paths, from, jobs, log-file, log-level, run, sort, stop-on-error, to.

See the example under Removing Metrics from Cassandra.

Before removing data, make sure that you are going to remove the desired data!

remove-paths

Remove paths from Elasticsearch.

cyanite-remover remove-paths [options] tenant path(s) elasticsearch_url

Available options: disable-log, disable-progress, elasticsearch-index, elasticsearch-scroll-batch-rate, elasticsearch-scroll-batch-size, exclude-paths, log-file, log-level, run, sort.

See the example under Removing Paths from Elasticsearch.

Before removing data, make sure that you are going to remove the desired data!

Always remove metrics first. Once the paths are deleted, the corresponding metrics can no longer be found and removed!

remove-obsolete-data

Remove obsolete data from Cassandra and Elasticsearch.

cyanite-remover remove-obsolete-data [options] tenant rollup(s) path(s) cassandra_host(s) elasticsearch_url

Obsolete data are metrics that have not been updated for a while, together with their paths.

By default, a metric is considered obsolete if it has not been updated for 2678400 seconds (31 days).

The obsolescence threshold can be adjusted using the threshold option.
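
The default value can be checked with simple arithmetic. The following Python sketch converts a number of days into the seconds value that the threshold option expects. The helper name `days_to_seconds` is illustrative only and is not part of cyanite-remover.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def days_to_seconds(days):
    """Convert whole days to seconds, the unit used by --threshold."""
    return days * SECONDS_PER_DAY


print(days_to_seconds(31))  # 2678400, the default threshold
print(days_to_seconds(62))  # 5356800
```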

Available options: cassandra-batch-rate, cassandra-batch-size, cassandra-channel-size, cassandra-keyspace, cassandra-options, disable-log, disable-progress, elasticsearch-delete-request-rate, elasticsearch-index, elasticsearch-scroll-batch-rate, elasticsearch-scroll-batch-size, exclude-paths, jobs, log-file, log-level, run, sort, stop-on-error, threshold.

See the example under Removing Obsolete Data.

Before removing data, make sure that you are going to remove the desired data!

remove-empty-paths

Remove empty paths.

cyanite-remover remove-empty-paths [options] tenant path(s) elasticsearch_url

An empty path is a non-leaf path that has no children.
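
The definition above can be illustrated with a small Python sketch. It assumes a hypothetical in-memory mapping from dot-separated paths to a leaf flag; this is not how cyanite-remover or the Elasticsearch index actually represents paths, only a way to show which entries count as empty.

```python
def find_empty_paths(paths):
    """Return non-leaf paths that are not the parent of any other path.

    paths: dict mapping a dot-separated path to True if it is a leaf.
    """
    # Every path with at least one dot contributes its parent.
    parents = {p.rsplit(".", 1)[0] for p in paths if "." in p}
    return sorted(p for p, is_leaf in paths.items()
                  if not is_leaf and p not in parents)


index = {
    "node1": False,
    "node1.cpu": False,
    "node1.cpu.0": True,
    "node1.disk": False,  # non-leaf with no children
}
print(find_empty_paths(index))  # ['node1.disk']
```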

Available options: disable-log, disable-progress, elasticsearch-delete-request-rate, elasticsearch-index, elasticsearch-scroll-batch-rate, elasticsearch-scroll-batch-size, jobs, log-file, log-level, run, sort, stop-on-error.

See the example under Removing Empty Paths.

Before removing data, make sure that you are going to remove the desired data!

list-metrics

List metrics from Cassandra.

cyanite-remover list-metrics [options] tenant rollup(s) path(s) cassandra_host(s) elasticsearch_url

Available options: cassandra-keyspace, cassandra-options, elasticsearch-index, elasticsearch-scroll-batch-rate, elasticsearch-scroll-batch-size, exclude-paths, from, sort, to.

See the example under Listing Metrics from Cassandra.

list-paths

List paths from Elasticsearch.

cyanite-remover list-paths [options] tenant path(s) elasticsearch_url

Available options: elasticsearch-index, elasticsearch-scroll-batch-rate, elasticsearch-scroll-batch-size, exclude-paths, sort.

See the example under Listing Paths from Elasticsearch.

list-obsolete-data

List obsolete data.

cyanite-remover list-obsolete-data [options] tenant rollup(s) path(s) cassandra_host(s) elasticsearch_url

See command remove-obsolete-data for more details.

Available options: cassandra-keyspace, cassandra-options, elasticsearch-index, elasticsearch-scroll-batch-rate, elasticsearch-scroll-batch-size, exclude-paths, jobs, sort, threshold.

See the example under Listing Obsolete Data.

list-empty-paths

List empty paths.

cyanite-remover list-empty-paths [options] tenant path(s) elasticsearch_url

See command remove-empty-paths for more details.

Available options: elasticsearch-index, elasticsearch-scroll-batch-rate, elasticsearch-scroll-batch-size, jobs, sort.

See the example under Listing Empty Paths.

help

cyanite-remover help

Show help.

Arguments

tenant

A tenant name.

rollup(s)

A comma-separated list of rollups.

Format: <seconds_per_point:retention,...>

Example: 60:5356800,900:62208000
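
To make the format concrete, here is a minimal Python sketch that splits a rollup specification and derives how long each rollup is kept. It assumes that retention is expressed in seconds; the `Rollup` type and `parse_rollups` function are illustrative names, not part of the tool.

```python
from typing import List, NamedTuple


class Rollup(NamedTuple):
    seconds_per_point: int
    retention: int  # assumed to be in seconds


def parse_rollups(spec: str) -> List[Rollup]:
    """Parse 'seconds_per_point:retention,...' into Rollup tuples."""
    rollups = []
    for item in spec.split(","):
        period, retention = item.split(":")
        rollups.append(Rollup(int(period), int(retention)))
    return rollups


for r in parse_rollups("60:5356800,900:62208000"):
    days = r.retention // 86400
    points = r.retention // r.seconds_per_point
    print(f"{r.seconds_per_point}s per point: {days} days, {points} points")
```

Under that assumption, the example above describes one-minute points kept for 62 days and 15-minute points kept for 720 days.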

path(s)

A semicolon-separated list of paths.

Accepted wildcards are:

  • An asterisk *. Matches any number of characters. Example: requests.nginx.*
  • A question mark ?. Matches a single character only. Example: node1.cpu.?
  • A list {path1,path2,...}. Matches any string in a list. Example: {nginx,apache}.cpu.0
  • A range [M-N]. Matches any number in the range from M to N. Example: node[3-17].cpu.0

Example: "requests.nginx.*;node[3-17].cpu.?"
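
The wildcard rules can be sketched in Python by translating a pattern into a regular expression. This is an illustration of the rules as listed above, not the tool's actual matching code. In particular, it assumes that an asterisk may also match dots and that list items contain no nested wildcards; the function names are hypothetical.

```python
import re

# One alternative per wildcard kind: {a,b}, [M-N], *, ?
_TOKEN = re.compile(r"\{([^{}]*)\}|\[(\d+)-(\d+)\]|(\*)|(\?)")


def pattern_to_regex(pattern):
    """Translate a single path pattern into a compiled regular expression."""
    parts = []
    pos = 0
    for m in _TOKEN.finditer(pattern):
        parts.append(re.escape(pattern[pos:m.start()]))  # literal text
        if m.group(1) is not None:  # list {path1,path2,...}
            items = (re.escape(s) for s in m.group(1).split(","))
            parts.append("(?:" + "|".join(items) + ")")
        elif m.group(2) is not None:  # numeric range [M-N], inclusive
            low, high = int(m.group(2)), int(m.group(3))
            numbers = (str(n) for n in range(low, high + 1))
            parts.append("(?:" + "|".join(numbers) + ")")
        elif m.group(4) is not None:  # asterisk: any number of characters
            parts.append(".*")
        else:  # question mark: exactly one character
            parts.append(".")
        pos = m.end()
    parts.append(re.escape(pattern[pos:]))
    return re.compile("".join(parts))


def matches(pattern, path):
    """Return True if the whole path matches the pattern."""
    return pattern_to_regex(pattern).fullmatch(path) is not None


# A semicolon-separated argument is simply a list of such patterns.
for pattern in "requests.nginx.*;node[3-17].cpu.?".split(";"):
    print(pattern, matches(pattern, "node12.cpu.0"))
```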

cassandra_host(s)

A comma-separated list of Cassandra hosts.

Example: cass1.example.org,cass2.example.org

elasticsearch_url

An Elasticsearch REST service URL.

Example: http://es.example.org:9200

Options

Options in alphabetical order:

cassandra-batch-rate

--cassandra-batch-rate RATE

Set the Cassandra batch rate (batches per second).

Throttling is not used by default.

cassandra-batch-size

--cassandra-batch-size SIZE

Set the Cassandra batch size.

Default: 1000

cassandra-channel-size

--cassandra-channel-size SIZE

Set the Cassandra channel size.

Default: 10000

cassandra-keyspace

--cassandra-keyspace KEYSPACE

Set the Cassandra keyspace.

Default: metric

cassandra-options

-O, --cassandra-options OPTIONS

Set Cassandra options. See Alia documentation for more details.

Example: "{:compression :lz4}"

disable-progress

-P, --disable-progress

Disable the progress bar.

elasticsearch-index

--elasticsearch-index INDEX

Set the Elasticsearch index.

Default: cyanite_paths

elasticsearch-delete-request-rate

--elasticsearch-delete-request-rate RATE

Set the Elasticsearch delete request rate (requests per second).

Throttling is not used by default.

elasticsearch-scroll-batch-rate

--elasticsearch-scroll-batch-rate RATE

Set the Elasticsearch scroll batch rate (batches per second).

Throttling is not used by default.

elasticsearch-scroll-batch-size

--elasticsearch-scroll-batch-size SIZE

Set the Elasticsearch scroll batch size.

Default: 100000

exclude-paths

-e, --exclude-paths PATHS

A semicolon-separated list of paths to exclude from processing.

See path(s) for more details.

from

-f, --from FROM

Set the start time (from) in Unix (POSIX, epoch) time format.

Example: 1420070400

jobs

-j, --jobs JOBS

Set the number of jobs to run simultaneously.

log-file

-l, --log-file FILE

Set the log file.

Default: cyanite-remover.log

log-level

-L, --log-level LEVEL

Set the log level.

Available log levels: all, trace, debug, info, warn, error, fatal, off.

Default: info

run

-r, --run

Force a normal run. By default, the tool performs a dry run.

sort

-s, --sort

Sort paths in alphabetical order. Since version 0.6.1, output is unsorted by default.

stop-on-error

-S, --stop-on-error

Stop on the first non-fatal error.

threshold

-T, --threshold THRESHOLD

Set the obsolescence threshold in seconds. This option is used when searching for obsolete data.

See commands remove-obsolete-data and list-obsolete-data for more details.

to

-t, --to TO

Set the end time (until) in Unix (POSIX, epoch) time format.

Example: 1421280000
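
Values for the from and to options can be computed from calendar dates. The Python sketch below assumes the dates are meant as midnight UTC; the helper name `to_epoch` is illustrative.

```python
from datetime import datetime, timezone


def to_epoch(year, month, day):
    """Return the Unix timestamp of midnight UTC on the given date."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


print(to_epoch(2015, 1, 1))   # 1420070400, the --from example
print(to_epoch(2015, 1, 15))  # 1421280000, the --to example
```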

Usage Scenarios

Inspecting

Before removing data, you may want to inspect the data to be removed.

Listing Metrics from Cassandra

cyanite-remover --sort list-metrics my_tenant 60:5356800,900:62208000 \
  "requests.nginx.*;node[3-17].cpu.?" cass1.example.org \
  http://es.example.org:9200

See command list-metrics for more details.

Listing Paths from Elasticsearch

cyanite-remover --sort list-paths my_tenant "requests.nginx.*;node[3-17].cpu.?" \
  http://es.example.org:9200

See command list-paths for more details.

Listing Obsolete Data

cyanite-remover --threshold 5356800 --exclude-paths "billing.*" --jobs 64 \
  --sort list-obsolete-data my_tenant 60:5356800,900:62208000 "*" \
  cass1.example.org http://es.example.org:9200

See command list-obsolete-data for more details.

Listing Empty Paths

cyanite-remover --sort list-empty-paths my_tenant "*" http://es.example.org:9200

See command list-empty-paths for more details.

Removing

Removing Metrics from Cassandra

cyanite-remover --run --jobs 8 --sort --cassandra-options "{:compression :lz4}" \
  remove-metrics my_tenant 60:5356800,900:62208000 \
  "requests.nginx.*;node[3-17].cpu.?" cass1.example.org \
  http://es.example.org:9200

See command remove-metrics for more details.

Before removing data, make sure that you are going to remove the desired data!

Removing Paths from Elasticsearch

cyanite-remover --run --sort remove-paths my_tenant \
  "requests.nginx.*;node[3-17].cpu.?" http://es.example.org:9200

See command remove-paths for more details.

Before removing data, make sure that you are going to remove the desired data!

Always remove metrics first. Once the paths are deleted, the corresponding metrics can no longer be found and removed!
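
When scripting a cleanup, the required order can be encoded once. The following Python sketch only builds the two command lines, metrics first and paths second, using the commands and the run option documented above. The function name `removal_commands` and the wrapper itself are hypothetical and not shipped with cyanite-remover.

```python
import shlex


def removal_commands(tenant, rollups, paths, cassandra_hosts, es_url, run=False):
    """Return the command lines in the safe order: metrics, then paths."""
    opts = ["--run"] if run else []  # without --run the tool performs a dry run
    metrics_cmd = ["cyanite-remover", *opts, "remove-metrics",
                   tenant, rollups, paths, cassandra_hosts, es_url]
    paths_cmd = ["cyanite-remover", *opts, "remove-paths",
                 tenant, paths, es_url]
    return [metrics_cmd, paths_cmd]


commands = removal_commands(
    "my_tenant", "60:5356800,900:62208000",
    "requests.nginx.*;node[3-17].cpu.?",
    "cass1.example.org", "http://es.example.org:9200",
)
for cmd in commands:
    # shlex.quote keeps wildcard characters safe when pasted into a shell
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Each list could then be passed to `subprocess.run(cmd, check=True)` so that a failed metrics removal stops the script before any paths are deleted.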

Removing Obsolete Data

cyanite-remover --run --threshold 5356800 --exclude-paths "billing.*" \
  --jobs 64 remove-obsolete-data my_tenant 60:5356800,900:62208000 "*" \
  cass1.example.org http://es.example.org:9200

See command remove-obsolete-data for more details.

Before removing data, make sure that you are going to remove the desired data!

Removing Empty Paths

cyanite-remover --run --sort remove-empty-paths my_tenant "*" \
  http://es.example.org:9200

See command remove-empty-paths for more details.

Before removing data, make sure that you are going to remove the desired data!

License

cyanite-remover is covered by the MIT License.

Thanks

Thanks to Pierre-Yves Ritschard aka @pyr for his work on Cyanite.
