- Never have to look at log files again!
- Make error debugging easy by capturing context around errors and traces instead of only a traceback and locals
- Make health monitoring happen automatically
- Make performance monitoring easy by automatically tracking and reporting slow spans
- Support see once, annotate, and forget about it (instead of re-labeling the same error over and over...)
- traces instead of flat log messages
- a trace can have multiple "spans" as nodes
- on each "node"
- file_location
- start and end timestamps
- status: started|succeeded|failed
- optional fields:
- exit_error: if something fails without `except`
- handled_errors: if we have errors caught with `except` and logged
- level
- each trace uses a ksuid to ensure it is unique
- each span has a name which is explicitly set or based on function name/call-context
- traces are linked together when a new thread/task is started from an existing trace
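The trace and span fields listed above could be modelled roughly as follows. This is a minimal sketch, not the library's actual classes: the names `Span`, `Trace`, `finish`, `start_span`, and `parent_trace_id` are assumptions, and `uuid.uuid4().hex` stands in for a real ksuid, which would need a third-party package.

```python
import time
import uuid
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Span:
    name: str           # explicitly set, or derived from the function name
    file_location: str  # e.g. "worker.py:42"
    ts_start: float = field(default_factory=time.time)
    ts_end: Optional[float] = None
    status: str = "started"  # started | succeeded | failed
    exit_error: Optional[str] = None  # error that escaped without an except
    handled_errors: List[str] = field(default_factory=list)  # caught and logged

    def finish(self, error: Optional[BaseException] = None) -> None:
        """Close the span and record whether it succeeded or failed."""
        self.ts_end = time.time()
        if error is None:
            self.status = "succeeded"
        else:
            self.status = "failed"
            self.exit_error = repr(error)


@dataclass
class Trace:
    # Assumption: uuid4().hex (32 hex characters) is a placeholder for a ksuid.
    trace_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    # Set when this trace was started from a thread/task of another trace.
    parent_trace_id: Optional[str] = None
    spans: List[Span] = field(default_factory=list)

    def start_span(self, name: str, file_location: str) -> Span:
        span = Span(name=name, file_location=file_location)
        self.spans.append(span)
        return span
```

A trace started from an existing one would be created as `Trace(parent_trace_id=parent.trace_id)`, which is one simple way to express the linking described above.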
- Smart printing
- to terminal when running on localhost
- only json when running on cloud/only normal logging no stdout/stderr?
- smart at grouping together tasks and flushing before exit
- "test-mode": record all traces instead of just printing
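One possible reading of the smart-printing idea is to choose the output format from the stream itself: human-readable text on an interactive terminal, JSON otherwise. The sketch below assumes spans are plain dicts and uses `isatty()` as a stand-in for "running on localhost"; the function names `render_span` and `emit` are hypothetical.

```python
import json
import sys


def render_span(span: dict, pretty: bool) -> str:
    """Return a one-line human-readable summary, or the span as JSON."""
    if pretty:
        return f"[{span['status']}] {span['name']} ({span['file_location']})"
    return json.dumps(span, sort_keys=True)


def emit(span: dict, stream=None) -> None:
    """Print the span, choosing the format from the target stream."""
    if stream is None:
        stream = sys.stdout
    # Assumption: an interactive terminal means a developer is watching.
    pretty = stream.isatty()
    print(render_span(span, pretty), file=stream)
```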
- crash_reports/annotating errors
- support marking error as crash/silenced in UI
- support marking error with OK/WARN/ALERT
- find slow functions/threads
- distributed tracing
- support updating trace/spans (same published many times)
- Find traces: {trace_id}
- either a 32 bytes hex
- or a full ksuid
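A lookup field that accepts both ID forms needs to tell them apart. The sketch below assumes "32 bytes hex" means 32 hexadecimal characters, and relies on the fact that a ksuid's string form is 27 base62 characters; `classify_trace_id` is a hypothetical helper name.

```python
import re

_HEX32 = re.compile(r"[0-9a-fA-F]{32}")
_KSUID = re.compile(r"[0-9A-Za-z]{27}")


def classify_trace_id(value: str) -> str:
    """Return "hex" or "ksuid", or raise ValueError for anything else."""
    if _HEX32.fullmatch(value):
        return "hex"
    if _KSUID.fullmatch(value):
        return "ksuid"
    raise ValueError(f"not a recognised trace id: {value!r}")
```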
- Select namespace (or selected directly if it already exists)
- Select traces based on tags
- See dashboard of Name|Status|Counts|LastTime
- Toggle error/crash/ok/all
- Toggle show slow/fast/all
- Query for filtering tasks
- Inside a span
- Toggle for Debug/Info/Warning/Error/Critical
- Arrow keys for choosing parents or scrolling down
- when running on localhost
- depend on rich to see traces directly
- when running in a container/lambda service
- by default dumps the trace to stdout (need to configure receiver to parse these logs somehow)
- use httpx/requests for forwarding directly
- when debugging
- can run with database dependency directly and inject to local database to help localhost debugging
- api-key in header
- payload(tags, list(span))
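The publish request could be assembled as below. This is only a sketch: the header name `api-key` and the JSON keys `tags` and `spans` are assumptions taken from the two bullets above, and `build_publish_request` is a hypothetical helper that stops short of sending anything.

```python
import json
from typing import Dict, List, Tuple


def build_publish_request(
    api_key: str, tags: Dict[str, str], spans: List[dict]
) -> Tuple[Dict[str, str], bytes]:
    """Return (headers, body) for posting one batch of spans to a receiver."""
    headers = {
        "api-key": api_key,  # assumed header name
        "Content-Type": "application/json",
    }
    body = json.dumps({"tags": tags, "spans": spans}).encode("utf-8")
    return headers, body
```

The returned pair can be handed to whichever HTTP client is chosen for forwarding.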
- Use a pydantic class for finding all ref_src and ref_dest
- Video of traditional/trace based logging
- src stdout on one side
- stdout on other side
- ref_src|ref_dest
- uuid4
- ref_src created on a trace when dumping a message and adding the ref to e.g., metadata
- during injection an alternative index of ref_src -> trace
- ref_dest used on a trace when parsing back the message then logging with log_ref(ref_dest)
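The ref_src/ref_dest flow can be sketched with plain dicts standing in for traces. Everything here is illustrative: `dump_message` and `build_ref_index` are hypothetical names, and this `log_ref` takes the trace as an extra argument because the sketch has no notion of a "current" trace.

```python
import uuid
from typing import Dict, List


def dump_message(trace: dict, payload: dict) -> dict:
    """Producer side: create a ref_src, record it on the trace, attach it to the message."""
    ref = str(uuid.uuid4())
    trace.setdefault("ref_src", []).append(ref)
    return {"payload": payload, "metadata": {"ref": ref}}


def log_ref(trace: dict, ref_dest: str) -> None:
    """Consumer side: record the ref found in a parsed message on the consuming trace."""
    trace.setdefault("ref_dest", []).append(ref_dest)


def build_ref_index(traces: List[dict]) -> Dict[str, str]:
    """Alternative index built at injection time: ref_src -> trace_id."""
    return {
        ref: trace["trace_id"]
        for trace in traces
        for ref in trace.get("ref_src", [])
    }
```

Looking up a consumer's `ref_dest` in the index yields the trace that produced the message, which is what links the two traces.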
- RunSequenceGetter
- unique sequence number per tags combination
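A minimal in-memory sketch of the RunSequenceGetter idea, assuming tags are a flat string-to-string mapping and that counters start at 1. The method name `next` and the use of a lock are assumptions; a real implementation would presumably persist the counters.

```python
import threading
from collections import defaultdict
from typing import Mapping


class RunSequenceGetter:
    """Hands out an increasing sequence number per distinct tag combination."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._counters = defaultdict(int)

    def next(self, tags: Mapping[str, str]) -> int:
        # frozenset makes the key hashable and independent of tag order.
        key = frozenset(tags.items())
        with self._lock:
            self._counters[key] += 1
            return self._counters[key]
```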
- API
- small DB wrapper support using CLI or future frontend app
- UI?
- using same python code as rich and returning html?
- rich based tracebacks & locals collection?
- Long tracebacks might not be necessary if I have call location
- How to do sampling?
- Later
- Support creating alarms, graphs, etc.
- Support a search feature like Kibana's
- Support pre-defined dashboards
- principles
- Never more than 1 span active per task, when "root-span" finishes, ALL subtasks must finish
- Errors are stored and tracked when the parent completes
- only logged with traceback if `logger.exception(error)` is called, or `__exit__` of the root span has the error
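The traceback rule can be illustrated with a context manager whose `__exit__` records every failure but keeps the full traceback only on the root span. This is a sketch under assumptions: the `is_root` flag and the `traceback` attribute are hypothetical, and the `logger.exception(error)` path is left out.

```python
import traceback


class Span:
    def __init__(self, name: str, is_root: bool = False) -> None:
        self.name = name
        self.is_root = is_root
        self.status = "started"
        self.exit_error = None
        self.traceback = None

    def __enter__(self) -> "Span":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        if exc is None:
            self.status = "succeeded"
            return False
        self.status = "failed"
        self.exit_error = repr(exc)
        # Only the root span keeps the full traceback text.
        if self.is_root:
            self.traceback = "".join(
                traceback.format_exception(exc_type, exc, tb)
            )
        return False  # never swallow the exception
```

A nested span that fails still records `exit_error`, so the error is tracked at every level while the verbose traceback is stored once.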
- DataModel
- user
- last_namespace
- last_access
- client
- list(namespaces)
- namespace
- api_key
- tags|labels
- all apps
- name
- versions
- counter
- last_ts
- trace/trace
- list(span)
- spans
- ts_start
- ts_end
- status=runs|OK|FAIL|CRASH?
- kind? Producer/Consumer, Client/Server
- user
- monkeypatches both `Thread.__init__` and `ThreadPoolExecutor.submit`
- Minimal library implementation, a 0.0.1 release, and a TestTrace as an "integration" test
- Loop-slow ~10ms
- Simple-math
- url-get
- different ways of dumping/parsing
- yaml
- toml
- json
- A local "minimal-system" working
- span-trace used in library with a publisher that writes to a DB
- a textual CLI for viewing the traces
- A local "full-system" working
- Support annotating traces
- Support publishing status based on annotations
- Support metrics publishing
- Support health report
- Cloud receiver and storage
- a receiver lambda working with an API token
- library support to "post" messages to receiver
- CLI configured to use online DB
- Cloud signup with openid and basic markdown UI