Skip to content

Commit

Permalink
doc: book section on node defaults (#491)
Browse files Browse the repository at this point in the history
  • Loading branch information
molpopgen authored Mar 16, 2023
1 parent a24c618 commit 4900163
Show file tree
Hide file tree
Showing 2 changed files with 302 additions and 0 deletions.
92 changes: 92 additions & 0 deletions book/src/table_collection_adding_rows.md
Original file line number Diff line number Diff line change
Expand Up @@ -24,3 +24,95 @@ Again, we can take advantage of being able to pass in any type that is `Into<_>`
```

See the [API docs](https://docs.rs/tskit) for more details and examples.

### Adding nodes using default values

This section is more advanced and may be skipped during a first read.

For some tables it may be common to input the same values over and over for some fields when adding rows.
Let's take a look at how to use default values when adding rows to a node table.

Default instances of `NodeDefaults` contain default values for the `flags`, `individual`, and `population` fields:

```rust, noplaygound, ignore
{{#include ../../tests/book_table_collection.rs:node_defaults}}
```

Add a node with these values and a given birth time:

```rust, noplaygound, ignore
{{#include ../../tests/book_table_collection.rs:add_node_defaults}}
```

We can use struct update syntax to create a new node marked as a sample while re-using our other defaults:

```rust, noplaygound, ignore
{{#include ../../tests/book_table_collection.rs:add_node_defaults_sample}}
```

See the [`NodeDefaults`](https://docs.rs/tskit/latest/tskit/struct.NodeDefaults.html) section of the API reference for more.

#### Metadata

[Metadata](metadata.md#Metadata) can complicate the picture a bit:

* Metadata types are defined by the client and are thus a generic in the `tskit` API.
* We do not want to impose too many trait bounds on the client-defined types.
* Metadata is optional on a per-row basis for any given table.

[`NodeDefaultsWithMetadata`](https://docs.rs/tskit/latest/tskit/struct.NodeDefaultsWithMetadata.html) handles the case where rows may or may not have metadata.
The metadata type is generic with trait bound [`tskit::NodeMetadata`](https://docs.rs/tskit/latest/tskit/metadata/trait.NodeMetadata.html).
Because metadata are optional per-row, any metadata defaults are stored as an [`Option`](https://doc.rust-lang.org/std/option/).

For the following examples, this will be our metadata type:

```rust, noplaygound, ignore
{{#include ../../tests/book_table_collection.rs:node_metadata}}
```

##### Case 1: no default metadata

A common use case is that the metadata differs for every row.
For this case, it makes sense for the default value to be the `None` variant of the `Option`.

This case is straightforward:

```rust, noplaygound, ignore
{{#include ../../tests/book_table_collection.rs:node_defaults_with_metadata}}
```

##### Case 2: default metadata

TL;DR:

* If table row defaults *include* metadata, you can run into use-after-move issues.
Fortunately, the compiler will catch this as an error.
* The solution is for your metadata type to implement `Clone`.

Consider the following case:

```rust, noplaygound, ignore
{{#include ../../tests/book_table_collection.rs:node_defaults_with_some_metadata_default}}
```

Imagine that the first row we add uses different metadata but all the other default values:

```rust, noplaygound, ignore
{{#include ../../tests/book_table_collection.rs:node_defaults_with_some_metadata_default_add_first_row}}
```

Nothing interesting has happened.
However, let's take a look at what we need to do if our next row uses a non-default `population` field and the default metadata:

```rust, noplaygound, ignore
{{#include ../../tests/book_table_collection.rs:node_defaults_with_some_metadata_default_add_second_row}}
```

Note the call to `..defaults.clone()`.
(For that call to compile, `NodeMetadata` must implement `Clone`!.)
Without that, our `defaults` instance would have *moved*, leading to a move-after-use compiler error when we add a third row:

```rust, noplaygound, ignore
{{#include ../../tests/book_table_collection.rs:node_defaults_with_some_metadata_default_add_third_row}}
```

210 changes: 210 additions & 0 deletions tests/book_table_collection.rs
Original file line number Diff line number Diff line change
Expand Up @@ -165,3 +165,213 @@ fn get_data_from_edge_table() {
.check_integrity(tskit::TableIntegrityCheckFlags::default())
.is_ok());
}

#[test]
fn test_adding_node_table_row_with_defaults() {
let mut tables = tskit::TableCollection::new(10.).unwrap();
// ANCHOR: node_defaults
let defaults = tskit::NodeDefaults::default();
// ANCHOR_END: node_defaults
// ANCHOR: add_node_defaults
let node = tables.add_node_with_defaults(0.0, &defaults).unwrap();
// ANCHOR_END: add_node_defaults
assert_eq!(node, 0);

// ANCHOR: add_node_defaults_sample
let node = tables
.add_node_with_defaults(
0.0,
// Create a new, temporary defaults instance
&tskit::NodeDefaults {
// Mark the new node as a sample
flags: tskit::NodeFlags::new_sample(),
// Use remaining values from our current defaults
..defaults
},
)
.unwrap();
// ANCHOR_END: add_node_defaults_sample
assert!(tables.nodes().flags(node).unwrap().is_sample());
}

macro_rules! impl_node_metadata_traits {
() => {
impl tskit::metadata::MetadataRoundtrip for NodeMetadata {
fn encode(&self) -> Result<Vec<u8>, tskit::metadata::MetadataError> {
match serde_json::to_string(self) {
Ok(x) => Ok(x.as_bytes().to_vec()),
Err(e) => {
Err(::tskit::metadata::MetadataError::RoundtripError { value: Box::new(e) })
}
}
}
fn decode(md: &[u8]) -> Result<Self, tskit::metadata::MetadataError>
where
Self: Sized,
{
match serde_json::from_slice(md) {
Ok(v) => Ok(v),
Err(e) => {
Err(::tskit::metadata::MetadataError::RoundtripError { value: Box::new(e) })
}
}
}
}
impl tskit::metadata::NodeMetadata for NodeMetadata {}
};
}

mod node_metadata {
#[derive(Debug, serde::Serialize, serde::Deserialize, Eq, PartialEq)]
// ANCHOR: node_metadata
pub struct NodeMetadata {
pub value: i32,
}
// ANCHOR_END: node_metadata
impl_node_metadata_traits!();
}

mod node_metadata_clone {
#[derive(Debug, Eq, PartialEq, Clone, serde::Serialize, serde::Deserialize)]
pub struct NodeMetadata {
pub value: i32,
}
impl_node_metadata_traits!();
}

#[test]
fn test_adding_node_table_row_with_defaults_and_metadata() {
use node_metadata::NodeMetadata;
let mut tables = tskit::TableCollection::new(10.0).unwrap();
// ANCHOR: node_defaults_with_metadata

// Create a type alias for brevity
type DefaultsWithMetadata = tskit::NodeDefaultsWithMetadata<NodeMetadata>;
// Default metadata is None
let defaults = DefaultsWithMetadata::default();

// A row with no metadata
let n0 = tables.add_node_with_defaults(0.0, &defaults).unwrap();

// A row with metadata
let n1 = tables
.add_node_with_defaults(
0.0,
&DefaultsWithMetadata {
population: 3.into(),
metadata: Some(NodeMetadata { value: 42 }),
..defaults
},
)
.unwrap();

// Another row with metadata, different from the last.
let n2 = tables
.add_node_with_defaults(
0.0,
&DefaultsWithMetadata {
population: 1.into(),
metadata: Some(NodeMetadata { value: 1234 }),
..defaults
},
)
.unwrap();
// ANCHOR_END: node_defaults_with_metadata
assert!(tables.nodes().metadata::<NodeMetadata>(n0).is_none());
assert_eq!(
tables
.nodes()
.metadata::<NodeMetadata>(n1)
.unwrap()
.unwrap(),
NodeMetadata { value: 42 }
);
assert_eq!(
tables
.nodes()
.metadata::<NodeMetadata>(n2)
.unwrap()
.unwrap(),
NodeMetadata { value: 1234 }
);
}

#[test]
fn test_adding_node_table_row_with_defaults_and_metadata_requiring_clone() {
use node_metadata_clone::NodeMetadata;
let mut tables = tskit::TableCollection::new(10.0).unwrap();
type DefaultsWithMetadata = tskit::NodeDefaultsWithMetadata<NodeMetadata>;

// ANCHOR: node_defaults_with_some_metadata_default
// What if there is default metadata for all rows?
let defaults = DefaultsWithMetadata {
metadata: Some(NodeMetadata { value: 42 }),
..Default::default()
};
// ANCHOR_END: node_defaults_with_some_metadata_default

// ANCHOR: node_defaults_with_some_metadata_default_add_first_row
let n0 = tables
.add_node_with_defaults(
0.0,
&DefaultsWithMetadata {
metadata: Some(NodeMetadata { value: 2 * 42 }),
..defaults
},
)
.unwrap();
// ANCHOR_END: node_defaults_with_some_metadata_default_add_first_row
assert_eq!(
tables
.nodes()
.metadata::<NodeMetadata>(n0)
.unwrap()
.unwrap(),
NodeMetadata { value: 2 * 42 }
);

// But now, we start to cause a problem:
// If we don't clone here, our metadata type moves,
// so our defaults are moved.
// ANCHOR: node_defaults_with_some_metadata_default_add_second_row
let n1 = tables
.add_node_with_defaults(
0.0,
&DefaultsWithMetadata {
population: 6.into(),
..defaults.clone()
},
)
.unwrap();
// ANCHOR_END: node_defaults_with_some_metadata_default_add_second_row
assert_eq!(
tables
.nodes()
.metadata::<NodeMetadata>(n1)
.unwrap()
.unwrap(),
NodeMetadata { value: 42 }
);

// Now, we have a use-after-move error
// if we hadn't cloned in the last step.
// ANCHOR: node_defaults_with_some_metadata_default_add_third_row
let n2 = tables
.add_node_with_defaults(
0.0,
&DefaultsWithMetadata {
individual: 7.into(),
..defaults
},
)
.unwrap();
// ANCHOR_END: node_defaults_with_some_metadata_default_add_third_row
assert_eq!(
tables
.nodes()
.metadata::<NodeMetadata>(n2)
.unwrap()
.unwrap(),
NodeMetadata { value: 42 }
);
}

0 comments on commit 4900163

Please sign in to comment.