lentendu/EukFunc

EukFunc is an R package providing a database and tools for the functional assignment of microbial eukaryotes (fungi, protists, and nematodes) in terrestrial environments from 18S rRNA gene sequence data. By function, we mean the type of nutrient uptake (including prey information when the organisms are consumers). Therefore, the database does not include trait data.

The database is based on PR2 v5.0.0 (Guillou et al., 2013), a curated database of 18S rRNA gene sequences following the most recent eukaryotic taxonomy (Adl et al., 2019). Only taxa found in terrestrial environments are kept (i.e. exclusively marine and exclusively freshwater organisms are removed).

Installation

EukFunc can be installed from GitHub using:

install.packages("remotes")
remotes::install_github("lentendu/EukFunc")
library(EukFunc)

Usage

The database is provided in six different flavors:

  • a species database: DBu or data(DBu)
  • a species database with only the main functional class, with symbiotrophs further detailed (parasites, mycorrhiza, host phototroph, others): DBu_main or data(DBu_main)
  • an accession-based database: DBf or data(DBf)
  • a taxonomy-condensed database over all functional information: DBc or data(DBc)
  • a taxonomy-condensed database over the main functional class, with symbiotrophs further detailed (parasites, mycorrhiza, host phototroph, others): DBc_main or data(DBc_main)
  • a taxonomy-condensed database with only the main functional class: DBc_minimal or data(DBc_minimal)

For convenience, the species database is also available as a TAB-separated file, EukFunc.pr2.5.0.0.tsv.
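As a minimal sketch of how the objects listed above can be inspected, the following loads the species database from the package and, if present, reads the TAB-separated export with base R. The local file path and the presence of a header row in the TSV are assumptions, and no column names are assumed: str() is used to discover them.

```r
library(EukFunc)

# Load the species database shipped with the package
data(DBu)
str(DBu)  # inspect the structure before relying on any column

# Alternatively, read the TAB-separated export with base R.
# Assumption: the file was downloaded to the working directory
# and its first line is a header row.
tsv_path <- "EukFunc.pr2.5.0.0.tsv"
if (file.exists(tsv_path)) {
  eukfunc_tsv <- read.delim(tsv_path, header = TRUE, stringsAsFactors = FALSE)
  str(eukfunc_tsv)
}
```

The other flavors (DBu_main, DBf, DBc, DBc_main, DBc_minimal) can be loaded the same way by swapping the object name in data().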

The main functional classes are: phototroph, predator, saprotroph, symbiotroph and unknown.

The intention of the database is to provide a functional annotation of 18S rRNA gene reads obtained from high-throughput sequencing by comparing them to functionally annotated reference sequences. As functional annotations are linked to a taxonomic path (either species, genus or family), these functional groups could also be used with other types of data (other genomic markers, other taxonomic identification methods).

Tools are provided to assign the functional information from a taxonomic table or list of taxonomic paths (assign_path), a list of species names (assign_sp) and a list of best match accessions (assign_genbank). The assign_clade function can be used to assign function from a taxonomic path not created with PR2 v5.0.0 (e.g. GenBank, Silva, UNITE, PR2 version 4.x with the 8-rank taxonomy). The assign_majority function can be used to improve the proportion of assigned clades, but it needs to be used with caution as it can cause misassignments in some rare cases. Read each function's help page for usage information and examples.

When assigning from a taxonomic table, a path or a clade (assign_path, assign_clade), different condensed databases can be provided (e.g. DBc, DBc_main or DBc_minimal). A custom condensed database can be created with the functionize tool.
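Because argument names and input formats are documented in the help pages rather than here, a reasonable first step is to open those pages from an R session. The sketch below only uses base R's help() and example() on the function names mentioned above; it makes no assumption about their arguments.

```r
library(EukFunc)

# Open the help pages of the assignment tools named above
help("assign_path")
help("assign_sp")
help("assign_genbank")
help("assign_clade")
help("assign_majority")
help("functionize")

# Run the examples bundled with a help page, if it provides any
example("assign_path", package = "EukFunc")
```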

Database governance

Database:

  • Guillaume Lentendu
  • David Singer
  • Stefan Geisen
  • Enrique Lara

R package:

  • Guillaume Lentendu

Fungi:

  • Mohammad Bahram
  • S. Emilia Hannula
  • Leho Tedersoo

Ciliates:

  • Sabine Agatha

Other protists:

  • Enrique Lara

Nematodes:

  • Stefan Geisen
  • Johannes Helder
  • Walter Traunspurger

Citation

Guillaume Lentendu, David Singer, Sabine Agatha, Mohammad Bahram, S. Emilia Hannula, Johannes A Helder, Leho Tedersoo, Walter Traunspurger, Enrique Lara, Stefan Geisen (2025), EukFunc: A holistic Eukaryotic Functional reference for automated profiling of soil eukaryotes, submitted

License

MIT (see LICENSE.md). The repository also contains a LICENSE file whose license type is not identified.

Stars

Watchers

Forks

Packages

No packages published

Languages