Nested NamedAxisArrays Are Difficult to Read #55

Open
TheCedarPrince opened this issue Nov 7, 2020 · 2 comments

@TheCedarPrince

Hi @Tokazama - love this package for the work going on in NeuriViz!

One issue I have on the user interface side is that the output of a nested NamedAxisArray is sometimes quite difficult to parse. To illustrate, here is the code I use to create a nested NamedAxisArray:

    subject_data = NamedAxisArray(
        [NamedAxisArray(
            [NamedAxisArray(
                [
                    DataFrame(eeg_data, copycols=false),
                    DataFrame(electrodes_data, copycols=false),
                    DataFrame(event_data, copycols=false),
                    nosedir,
                    times,
                    sampling_freq,
                ],
                information = [
                    :data,
                    :electrodes,
                    :events,
                    :nosedir,
                    :times,
                    :sampling_freq,
                ],
            )],
            session = [1],
        )],
        subject = [1],
    )

This layout is desirable because accessing information becomes as easy as subject_data[subject = 1][session = 1][information = :electrodes].

Deepest Nested NamedAxisArray

Starting from the most deeply nested NamedAxisArray, the output is not terribly hard to read:

julia> subject_data[1][1]
6-element NamedDimsArray(AxisArray(::Array{Any,1}
  • axes:
     information = [:data, :electrodes, :events, :nosedir, :times, :sampling_freq]
))
                  1
  :data               206440×31 DataFrame. Omitted printing of 25 columns
│ Row    │ x1       │ x2       │ x3       │ x4       │ x5       │ x6       │
│        │ Float32  │ Float32  │ Float32  │ Float32  │ Float32  │ Float32  │
├────────┼──────────┼──────────┼──────────┼──────────┼──────────┼──────────┤
│ 1      │ -19.9952 │ -46.3864 │ -28.3298 │ 81.1222  │ -67.0026 │ 21.798   │
│ 2      │ -19.5624 │ -42.738  │ -28.3298 │ 80.7722  │ -66.5652 │ 21.7112  │
│ 3      │ -19.1296 │ -39.6108 │ -28.1576 │ 80.3346  │ -66.2153 │ 21.6243  │
│ 4      │ -18.9564 │ -38.5685 │ -27.7271 │ 79.5471  │ -65.9529 │ 21.6243  │
│ 5      │ -19.1296 │ -39.8714 │ -27.3826 │ 78.7595  │ -65.603  │ 21.6243  │
│ 6      │ -19.1296 │ -42.5643 │ -27.2104 │ 78.4969  │ -65.3406 │ 21.4507  │
│ 7      │ -18.9564 │ -45.1703 │ -27.2965 │ 78.4094  │ -64.9907 │ 21.1901  │
⋮
│ 206433 │ 200.038  │ 1.73732  │ 135.708  │ -102.65  │ -18.7187 │ -110.988 │
│ 206434 │ 198.22   │ -3.3009  │ 133.469  │ -102.475 │ -20.2057 │ -111.074 │
│ 206435 │ 196.403  │ -7.81793 │ 131.919  │ -101.862 │ -21.4303 │ -110.467 │
│ 206436 │ 194.498  │ -12.6824 │ 131.402  │ -100.987 │ -22.0426 │ -109.338 │
│ 206437 │ 192.334  │ -18.5024 │ 132.005  │ -100.987 │ -21.7802 │ -108.122 │
│ 206438 │ 189.824  │ -25.3648 │ 132.952  │ -102.387 │ -21.0804 │ -107.861 │
│ 206439 │ 187.314  │ -32.401  │ 133.469  │ -104.313 │ -20.6431 │ -108.903 │
│ 206440 │ 185.496  │ -38.3947 │ 133.297  │ -105.538 │ -20.818  │ -110.467 │
  :electrodes         31×4 DataFrame
│ Row │ name   │ x       │ y       │ z       │
│     │ String │ Float64 │ Float64 │ Float64 │
├─────┼────────┼─────────┼─────────┼─────────┤
│ 1   │ FP1    │ 0.83    │ 0.27    │ 0.48    │
│ 2   │ FP2    │ 0.83    │ -0.27   │ 0.48    │
│ 3   │ F3     │ 0.5     │ 0.4     │ 0.77    │
│ 4   │ F4     │ 0.5     │ -0.4    │ 0.77    │
│ 5   │ C3     │ 0.0     │ 0.51    │ 0.86    │
│ 6   │ C4     │ 0.0     │ -0.51   │ 0.86    │
│ 7   │ P3     │ -0.5    │ 0.4     │ 0.77    │
⋮
│ 24  │ P4"    │ -0.66   │ -0.37   │ 0.65    │
│ 25  │ PZ"    │ -0.72   │ -0.0    │ 0.69    │
│ 26  │ OZ     │ -0.88   │ -0.0    │ 0.48    │
│ 27  │ I      │ -0.97   │ -0.0    │ 0.23    │
│ 28  │ CB1"   │ -0.93   │ 0.3     │ 0.23    │
│ 29  │ CB2"   │ -0.93   │ -0.3    │ 0.23    │
│ 30  │ CB1    │ -0.79   │ 0.57    │ 0.23    │
│ 31  │ CB2    │ -0.79   │ -0.57   │ 0.23    │
  :events             151×8 DataFrame. Omitted printing of 2 columns
│ Row │ onset   │ duration │ sample │ trial_type │ response_time │ stim_file  │
│     │ Float64 │ String   │ String │ String     │ String        │ String     │
├─────┼─────────┼──────────┼────────┼────────────┼───────────────┼────────────┤
│ 1   │ 5.035   │ n/a      │ n/a    │ stimulus   │ 335           │ 105064.jpg │
│ 2   │ 5.37    │ n/a      │ n/a    │ response   │ n/a           │ n/a        │
│ 3   │ 6.837   │ n/a      │ n/a    │ stimulus   │ n/a           │ 38068.jpg  │
│ 4   │ 8.651   │ n/a      │ n/a    │ stimulus   │ 289           │ 136095.jpg │
│ 5   │ 8.94    │ n/a      │ n/a    │ response   │ n/a           │ n/a        │
│ 6   │ 10.801  │ n/a      │ n/a    │ stimulus   │ n/a           │ 38014.jpg  │
│ 7   │ 12.684  │ n/a      │ n/a    │ stimulus   │ n/a           │ 82063.jpg  │
⋮
│ 144 │ 193.182 │ n/a      │ n/a    │ stimulus   │ n/a           │ 63093.jpg  │
│ 145 │ 195.219 │ n/a      │ n/a    │ stimulus   │ n/a           │ 307043.jpg │
│ 146 │ 197.224 │ n/a      │ n/a    │ stimulus   │ 482           │ 194061.jpg │
│ 147 │ 197.706 │ n/a      │ n/a    │ response   │ n/a           │ n/a        │
│ 148 │ 199.145 │ n/a      │ n/a    │ stimulus   │ 325           │ 49069.jpg  │
│ 149 │ 199.47  │ n/a      │ n/a    │ response   │ n/a           │ n/a        │
│ 150 │ 201.014 │ n/a      │ n/a    │ stimulus   │ n/a           │ 83070.jpg  │
│ 151 │ 203.063 │ n/a      │ n/a    │ stimulus   │ n/a           │ 166026.jpg │
  :nosedir            "+X"
  :times              1:206440
  :sampling_freq      1000

There are a few things that would be nice to have displayed better. First:

julia> subject_data[1][1]
6-element NamedDimsArray(AxisArray(::Array{Any,1}
  • axes:
     information = [:data, :electrodes, :events, :nosedir, :times, :sampling_freq]
))

could possibly be better displayed as:

julia> subject_data[1][1]
6-element Axis
  • axes:
     information = [:data, :electrodes, :events, :nosedir, :times, :sampling_freq]

I think this looks clearer and is easier to read. Second, rather than printing each element in full, it would be nice to have a summary like:

...
  :events         151×8 DataFrame
  :nosedir        String
  :times          UnitRange
  :sampling_freq  Int
...

With optional verbosity levels (i.e. show me directly the values stored in these axes versus tell me the types and dimensions only).
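
A minimal sketch of what such a verbosity switch could look like, using only Base Julia. The function name print_entries and its verbose keyword are hypothetical and not part of AxisIndices.jl; the summary line relies on Base.summary, so the exact text (for example the element count shown for a range) follows whatever Base prints rather than the mock-up above.

```julia
# Hypothetical helper, not part of AxisIndices.jl: print one line per entry,
# showing either a short type/size summary or the stored value itself.
function print_entries(io::IO, labels::AbstractVector{Symbol}, entries::AbstractVector;
                       verbose::Bool=false)
    # Pad every label to the width of the longest one so the columns line up.
    width = maximum(length ∘ repr, labels)
    for (label, entry) in zip(labels, entries)
        text = verbose ? repr(entry) : summary(entry)
        println(io, "  ", rpad(repr(label), width), "  ", text)
    end
end

labels = [:nosedir, :times, :sampling_freq]
entries = Any["+X", 1:206440, 1000]

print_entries(stdout, labels, entries)                # types and sizes only
print_entries(stdout, labels, entries; verbose=true)  # the values themselves
```

A DataFrame entry would go through the same summary call, so no DataFrames-specific code should be needed in the non-verbose mode.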

Third, when printing DataFrames, it might be nice to add a blank line after each data frame for cleanliness.

First Nested NamedAxisArray

Moving up one level in the nesting, things start to get very messy:

julia> subject_data[1]
1-element NamedDimsArray(AxisArray(::Array{NamedDims.NamedDimsArray{(:information,),Any,1,AxisArray{Any,1,Array{Any,1},Tuple{Axis{Symbol,Int64,Array{Symbol,1},SimpleAxis{Int64,StaticRanges.OneToMRange{Int64}}}}}},1}
  • axes:
     session = [1]
))
     1
  1   [206440×31 DataFrame. Omitted printing of 25 columns
│ Row    │ x1       │ x2       │ x3       │ x4       │ x5       │ x6       │
│        │ Float32  │ Float32  │ Float32  │ Float32  │ Float32  │ Float32  │
├────────┼──────────┼──────────┼──────────┼──────────┼──────────┼──────────┤
│ 1      │ -19.9952 │ -46.3864 │ -28.3298 │ 81.1222  │ -67.0026 │ 21.798   │
│ 2      │ -19.5624 │ -42.738  │ -28.3298 │ 80.7722  │ -66.5652 │ 21.7112  │
│ 3      │ -19.1296 │ -39.6108 │ -28.1576 │ 80.3346  │ -66.2153 │ 21.6243  │
│ 4      │ -18.9564 │ -38.5685 │ -27.7271 │ 79.5471  │ -65.9529 │ 21.6243  │
│ 5      │ -19.1296 │ -39.8714 │ -27.3826 │ 78.7595  │ -65.603  │ 21.6243  │
│ 6      │ -19.1296 │ -42.5643 │ -27.2104 │ 78.4969  │ -65.3406 │ 21.4507  │
│ 7      │ -18.9564 │ -45.1703 │ -27.2965 │ 78.4094  │ -64.9907 │ 21.1901  │
⋮
│ 206433 │ 200.038  │ 1.73732  │ 135.708  │ -102.65  │ -18.7187 │ -110.988 │
│ 206434 │ 198.22   │ -3.3009  │ 133.469  │ -102.475 │ -20.2057 │ -111.074 │
│ 206435 │ 196.403  │ -7.81793 │ 131.919  │ -101.862 │ -21.4303 │ -110.467 │
│ 206436 │ 194.498  │ -12.6824 │ 131.402  │ -100.987 │ -22.0426 │ -109.338 │
│ 206437 │ 192.334  │ -18.5024 │ 132.005  │ -100.987 │ -21.7802 │ -108.122 │
│ 206438 │ 189.824  │ -25.3648 │ 132.952  │ -102.387 │ -21.0804 │ -107.861 │
│ 206439 │ 187.314  │ -32.401  │ 133.469  │ -104.313 │ -20.6431 │ -108.903 │
│ 206440 │ 185.496  │ -38.3947 │ 133.297  │ -105.538 │ -20.818  │ -110.467 │, 31×4 DataFrame
│ Row │ name   │ x       │ y       │ z       │
│     │ String │ Float64 │ Float64 │ Float64 │
├─────┼────────┼─────────┼─────────┼─────────┤
│ 1   │ FP1    │ 0.83    │ 0.27    │ 0.48    │
│ 2   │ FP2    │ 0.83    │ -0.27   │ 0.48    │
│ 3   │ F3     │ 0.5     │ 0.4     │ 0.77    │
│ 4   │ F4     │ 0.5     │ -0.4    │ 0.77    │
│ 5   │ C3     │ 0.0     │ 0.51    │ 0.86    │
│ 6   │ C4     │ 0.0     │ -0.51   │ 0.86    │
│ 7   │ P3     │ -0.5    │ 0.4     │ 0.77    │
⋮
│ 24  │ P4"    │ -0.66   │ -0.37   │ 0.65    │
│ 25  │ PZ"    │ -0.72   │ -0.0    │ 0.69    │
│ 26  │ OZ     │ -0.88   │ -0.0    │ 0.48    │
│ 27  │ I      │ -0.97   │ -0.0    │ 0.23    │
│ 28  │ CB1"   │ -0.93   │ 0.3     │ 0.23    │
│ 29  │ CB2"   │ -0.93   │ -0.3    │ 0.23    │
│ 30  │ CB1    │ -0.79   │ 0.57    │ 0.23    │
│ 31  │ CB2    │ -0.79   │ -0.57   │ 0.23    │, 151×8 DataFrame. Omitted printing of 2 columns
│ Row │ onset   │ duration │ sample │ trial_type │ response_time │ stim_file  │
│     │ Float64 │ String   │ String │ String     │ String        │ String     │
├─────┼─────────┼──────────┼────────┼────────────┼───────────────┼────────────┤
│ 1   │ 5.035   │ n/a      │ n/a    │ stimulus   │ 335           │ 105064.jpg │
│ 2   │ 5.37    │ n/a      │ n/a    │ response   │ n/a           │ n/a        │
│ 3   │ 6.837   │ n/a      │ n/a    │ stimulus   │ n/a           │ 38068.jpg  │
│ 4   │ 8.651   │ n/a      │ n/a    │ stimulus   │ 289           │ 136095.jpg │
│ 5   │ 8.94    │ n/a      │ n/a    │ response   │ n/a           │ n/a        │
│ 6   │ 10.801  │ n/a      │ n/a    │ stimulus   │ n/a           │ 38014.jpg  │
│ 7   │ 12.684  │ n/a      │ n/a    │ stimulus   │ n/a           │ 82063.jpg  │
⋮
│ 144 │ 193.182 │ n/a      │ n/a    │ stimulus   │ n/a           │ 63093.jpg  │
│ 145 │ 195.219 │ n/a      │ n/a    │ stimulus   │ n/a           │ 307043.jpg │
│ 146 │ 197.224 │ n/a      │ n/a    │ stimulus   │ 482           │ 194061.jpg │
│ 147 │ 197.706 │ n/a      │ n/a    │ response   │ n/a           │ n/a        │
│ 148 │ 199.145 │ n/a      │ n/a    │ stimulus   │ 325           │ 49069.jpg  │
│ 149 │ 199.47  │ n/a      │ n/a    │ response   │ n/a           │ n/a        │
│ 150 │ 201.014 │ n/a      │ n/a    │ stimulus   │ n/a           │ 83070.jpg  │
│ 151 │ 203.063 │ n/a      │ n/a    │ stimulus   │ n/a           │ 166026.jpg │, "+X", 1:206440, 1000]

The following seems quite messy in my opinion:

julia> subject_data[1]
1-element NamedDimsArray(AxisArray(::Array{NamedDims.NamedDimsArray{(:information,),Any,1,AxisArray{Any,1,Array{Any,1},Tuple{Axis{Symbol,Int64,Array{Symbol,1},SimpleAxis{Int64,StaticRanges.OneToMRange{Int64}}}}}},1}
  • axes:
     session = [1]
))
     1

It would be nice if it could be more like:

julia> subject_data[1]
1-element Axis
  • axes:
     session = [1]

Furthermore, I am not sure why, but the display recursively descends and prints the values of the deepest nested NamedAxisArray, with no explanation of or information about that nested NamedAxisArray's fields.

Highest Level

The highest level is where the output becomes most obscure:

julia> subject_data
1-element NamedDimsArray(AxisArray(::Array{NamedDims.NamedDimsArray{(:session,),NamedDims.NamedDimsArray{(:information,),Any,1,AxisArray{Any,1,Array{Any,1},Tuple{Axis{Symbol,Int64,Array{Symbol,1},SimpleAxis{Int64,StaticRanges.OneToMRange{Int64}}}}}},1,AxisArray{NamedDims.NamedDimsArray{(:information,),Any,1,AxisArray{Any,1,Array{Any,1},Tuple{Axis{Symbol,Int64,Array{Symbol,1},SimpleAxis{Int64,StaticRanges.OneToMRange{Int64}}}}}},1,Array{NamedDims.NamedDimsArray{(:information,),Any,1,AxisArray{Any,1,Array{Any,1},Tuple{Axis{Symbol,Int64,Array{Symbol,1},SimpleAxis{Int64,StaticRanges.OneToMRange{Int64}}}}}},1},Tuple{Axis{Int64,Int64,Array{Int64,1},SimpleAxis{Int64,StaticRanges.OneToMRange{Int64}}}}}},1}
  • axes:
     subject = [1]
))

Conclusion

In short, it would be nice to somehow adjust the verbosity of the output and maybe instead show something like this for the nested NamedAxisArray:

julia> subject_data
┌1-element Axis
│ • axes:
│     subject = [1]
│       ┌1-element Axis
│       │ • axes:
│       │      session = [1]
│       │         ┌6-element Axis
│       │         │ • axes:
│       │         │      information = [:data, :electrodes, :events, :nosedir, :times, :sampling_freq]
│       │         └ NamedDimsArray(AxisArray(::Array{Any,1}))
│       └ NamedDimsArray(AxisArray(::Array{Any,1}))
└ NamedDimsArray(AxisArray(::Array{Any,1}))

Of course, it's not perfect, but I like it a bit better than what is currently displayed.

What do you think, @Tokazama? I feel like this would actually lead to better tracebacks and make debugging easier.
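
For illustration only, here is a rough sketch of how an indented outline like the one above could be produced by recursing through the nesting. LabeledVector and print_outline are hypothetical stand-ins written in plain Base Julia; they are not AxisIndices.jl types or functions, and the real NamedAxisArray would need its own way of exposing the axis name and keys.

```julia
# Hypothetical stand-in for a one-dimensional NamedAxisArray: an axis name,
# the keys along that axis, and the stored items. Not the AxisIndices.jl type.
struct LabeledVector
    name::Symbol
    labels::Vector
    items::Vector
end

# Print one compact line per nesting level. Only nested LabeledVectors are
# descended into; leaf values (DataFrames, strings, ranges) are never printed.
function print_outline(io::IO, x::LabeledVector, depth::Int=0)
    pad = "  " ^ depth
    println(io, pad, length(x.items), "-element axis ", x.name, " = ", x.labels)
    for item in x.items
        item isa LabeledVector && print_outline(io, item, depth + 1)
    end
end

info = LabeledVector(:information, [:nosedir, :times], Any["+X", 1:206440])
session = LabeledVector(:session, [1], [info])
subject = LabeledVector(:subject, [1], [session])

print_outline(stdout, subject)
```

Because the recursion stops at anything that is not itself a nested container, the long type signatures and the DataFrame contents never reach the top-level display.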

@TheCedarPrince
Author

I wonder if AbstractTrees.jl would help with this... Seems promising!
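
As a rough idea of what that might look like: AbstractTrees.jl lets a type opt in by defining children and printnode, after which print_tree draws the nesting. The AxisNode type below is a hypothetical placeholder holding a pre-built label for each level; how a real NamedAxisArray would expose its nested elements as children is an open assumption here.

```julia
using AbstractTrees

# Hypothetical node type: one node per nesting level, carrying only a label.
struct AxisNode
    label::String
    children::Vector{AxisNode}
end

# Opt in to the AbstractTrees.jl interface.
AbstractTrees.children(node::AxisNode) = node.children
AbstractTrees.printnode(io::IO, node::AxisNode) = print(io, node.label)

information = AxisNode("information = [:data, :electrodes, :events]", AxisNode[])
session = AxisNode("session = [1]", [information])
subject = AxisNode("subject = [1]", [session])

print_tree(subject)
```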

@Tokazama
Owner

Tokazama commented Nov 7, 2020

> I wonder if AbstractTrees.jl would help with this... Seems promising!

I think this is probably a better direction to go as a graph/tree could solve a lot of this. I've contemplated doing more with this through AxisGraphs.jl, but I'm still incubating ideas on the specifics of how to approach this (I have a fair amount of scratch code thinking through this, but it's not organized or tested yet).

Did you have any specific ideas for what you'd like out of using something like AbstractTrees.jl in terms of interface or is printing your main concern?
