Data Note
Revised

A draft reference assembly of the Psilocybe cubensis genome

[version 2; peer review: 2 approved]
Previously titled: A draft sequence reference of the Psilocybe cubensis genome
PUBLISHED 15 Jun 2021

This article is included in the Genomics and Genetics gateway.

This article is included in the Bioinformatics gateway.

This article is included in the Cell & Molecular Biology gateway.

This article is included in the Draft Genomes collection.

Abstract

We describe the use of high-fidelity single molecule sequencing to assemble the genome of the psychoactive Psilocybe cubensis mushroom. The genome is 46.6Mb, 46% GC, and in 32 contigs with an N50 of 3.3Mb. The BUSCO completeness scores are 97.6% with 1.2% duplicates. The Psilocybin synthesis cluster exists in a single 3.2Mb contig. The dataset is available from NCBI BioProject with accessions PRJNA687911 and PRJNA700437.

Keywords

Psilocybe cubensis, Genome, Single molecule sequencing, Psilocybin

Revised Amendments from Version 1

To address the reviewers' very helpful comments:

  1. We have included more description of the SNP calling, including a GitHub version of the code that one can run to reproduce it.
  2. We have expanded the analysis to include variant calling on the HiFi reads mapped back to their own reference and, in doing so, re-called the SNPs from the MGC unknown strain mapped to the P.envy reference using the same code for a controlled comparison. This leveraged different variant callers and produced more variants.
  3. We have improved the readability of the figure tracks.
  4. Adjusted the title language, the mushroom number, and the Fungi vs. Fungus typo.
  5. Addressed the axenic growth and bacterial contamination concerns.
  6. Clarified that these citations are to NCBI/SRA submissions, which currently do not have a DOI publication associated with them.
  7. We have clarified the text to underscore the importance of other variants in the pathway. We have included the reviewer's suggested reference here and added links to a SnpEff file that can be used to compare these variants to those he listed.
  8. In performing SNP calling on the HiFi reads mapped back to the reference, we note that 98% of the variants found are heterozygous variants with balanced alleles. Coverage maps can also be viewed in the CoGe genome browser provided to address additional questions regarding aneuploidy/multiple nuclei, but coverage for all scaffolds is very consistent with the exception of scaffold_26, which is mitochondrial.

See the authors' detailed response to the review by Jason C. Slot

Introduction

There are hundreds of mushroom species capable of synthesizing the psychoactive compound psilocybin. This compound has been classified as a “breakthrough therapy” for depression by the FDA (Johnson and Griffiths 2017). The psilocybin pathway was identified by Fricke et al., but to date no public references exist in NCBI with N50s longer than 50 kb (Fricke et al. 2017; Blei et al. 2018; Fricke et al. 2019a; Fricke et al. 2019b; Blei et al. 2020; Demmler et al. 2020; Fricke et al. 2020). A more contiguous genome assembly can assist in further resolution of the genetic diversity in the fungus’s secondary metabolite production.

Methods

DNA isolation

Dried stems from Psilocybe cubensis strain P.envy were obtained in Somerville, MA, US. The strain name is anecdotal, and the material was reported to have been grown axenically (unknown conditions). These samples were used for isolation of high molecular weight (HMW) DNA using a modified CTAB/chloroform and SPRI protocol. Briefly, 300 mg of stem sample was ground to a fine powder using a mortar and pestle frozen at -80°C. 150 mg of powder was then aliquoted into each of two 2 mL conical tubes (USA Scientific) with 1.5 mL cetrimonium bromide (CTAB). These tubes were then incubated at room temperature on a tube rotator for 10 minutes. 6 uL of RNase A (Promega 4 mg/mL) was then added and both tubes were incubated at 37°C for one hour, vortexing every 15 minutes. Following this incubation, 7.5 uL Proteinase K (New England Biolabs 20 mg/mL) was added and the tubes were incubated at 60°C for 30 minutes, vortexing every 10 minutes. At the conclusion of the Proteinase K incubation, both tubes were incubated on ice for 10 minutes. The samples were then centrifuged for 5 minutes at 14000 rpm. 600 uL of supernatant was removed from each tube and added to 600 uL chloroform. The tubes were then vortexed until opaque and spun for 5 minutes at 14000 rpm. 400 uL of the aqueous layer was removed using a wide bore tip and added to a 1.5 mL Eppendorf tube. 400 uL MIP (marijuana infused products) Solution B and 400 uL DNA Binding Beads (Medicinal Genomics PN 420004) were added to the Eppendorf tube, which was inverted to homogenize. The tubes were then incubated at room temperature on the tube rotator for 15 minutes. The tubes were then removed from the rotator and placed on a magnetic tube rack for 3 minutes. The supernatant was removed, and the beads were washed twice with 1 mL of 70% ethanol and allowed to dry for 5 minutes. The beads were then eluted in 100 uL of 56°C Elution Buffer (Medicinal Genomics PN 420004) using a wide bore tip and incubated at 56°C for 5 minutes. Following this incubation, the tubes were returned to the magnetic rack, and the supernatants from both tubes were removed using a wide bore tip and pooled in a fresh Eppendorf tube.

HMW DNA integrity was assessed on an Agilent TapeStation, giving a DNA integrity number (DIN) of 8.1. A Qubit Fluorometer (Thermo Fisher Scientific) measured 55 ng/uL. A Nanodrop Spectrophotometer (Thermo Fisher Scientific) measured 104 ng/uL with a 260/280 nm ratio of 1.85 and a 260/230 nm ratio of 0.95.

Sequencing

Sequencing libraries were constructed according to the manufacturer’s instructions for Pacific Biosciences Sequel II HiFi sequencing. 773,735 CCS reads were generated. Quast (Gurevich et al. 2013) was used to assess the quality of the input FASTA sequence file (read N50 = 13.9 kb) and the output assembly FASTA file (contig N50 = 3.33 Mb).
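For readers unfamiliar with the N50 statistic reported by Quast, the following is a minimal sketch of how it is commonly defined: the length L such that sequences of length L or longer contain at least half of all bases. The function name and the toy lengths are illustrative assumptions and are not taken from the assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are sorted from longest to shortest and accumulated until
    the running total reaches half of the summed length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no lengths given")


# Hypothetical contig lengths, for illustration only.
toy_contigs = [4_600_000, 3_300_000, 2_000_000, 1_000_000, 32_000]
print(n50(toy_contigs))
```

The same function applies to read lengths (input N50) and contig lengths (assembly N50); only the input list differs.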

Assembly and annotation

The unfiltered CCS data was assembled using the Peregrine assembler (pg_asm 0.3.5,arm_config5e69f3d+) (Chin 2019). Reads were assembled into 32 contigs with lengths ranging from 32 kilobases to 4.6 megabases (Figure 1). The Peregrine assembler requires at least 2 HiFi reads to substantially overlap to contribute to a contig and, as a result, we did not observe any bacterial contamination in the assembly. BUSCO v3.0.2 completeness scores (97.6%) were measured using the agaricales_odb10.2020-08-05 BUSCO lineage database (Table 1) (Simao et al. 2015; Waterhouse et al. 2018). FunAnnotate v1.8.4 was used to annotate the genome (Li and Wang 2021), resulting in 13,478 genes.


Figure 1. Psilocybe cubensis P.envy contig length distribution (n = 32).
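A contig length distribution and overall GC content like those reported here can be derived directly from an assembly FASTA file. The sketch below is one possible way to do this with the Python standard library; the function name and the toy sequences are assumptions for illustration, not part of the authors' pipeline.

```python
import io


def contig_stats(handle):
    """Yield (name, length, gc_fraction) for each record in a FASTA stream."""
    name, length, gc = None, 0, 0
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Emit the previous record before starting a new one.
            if name is not None:
                yield name, length, (gc / length if length else 0.0)
            fields = line[1:].split()
            name, length, gc = (fields[0] if fields else ""), 0, 0
        else:
            seq = line.upper()
            length += len(seq)
            gc += seq.count("G") + seq.count("C")
    if name is not None:
        yield name, length, (gc / length if length else 0.0)


# Toy FASTA content, for illustration only.
toy_fasta = io.StringIO(">contig_1 toy\nGGCCAATT\nGC\n>contig_2\nATAT\n")
for record in contig_stats(toy_fasta):
    print(record)
```

For a real assembly, the same generator could be fed an open file handle, and the resulting lengths plotted as a histogram.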

Table 1. BUSCO completeness scores using agaricales_odb10.2020-08-05.

Total BUSCOs | Single-copy | Duplicated | Fragmented | Missing
3870 | 3729 | 45 | 9 | 87
97.60% | 96.40% | 1.20% | 0.20% | 2.20%
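The percentages in Table 1 can be rechecked from the raw counts. The sketch below is an illustrative recalculation, not BUSCO's own code. It assumes the 97.6% completeness figure is the sum of the rounded single-copy and duplicated percentages; dividing the raw counts directly, (3729 + 45) / 3870, gives approximately 97.5%.

```python
def busco_percentages(counts, total):
    """Convert BUSCO category counts to percentages rounded to one decimal."""
    pct = {name: round(100 * n / total, 1) for name, n in counts.items()}
    # Assumption: "complete" is reported as rounded single-copy + rounded duplicated.
    pct["complete"] = round(pct["single_copy"] + pct["duplicated"], 1)
    return pct


# Counts taken from Table 1.
table1_counts = {"single_copy": 3729, "duplicated": 45, "fragmented": 9, "missing": 87}
table1_total = 3870

# The four categories should account for every BUSCO searched.
assert sum(table1_counts.values()) == table1_total
print(busco_percentages(table1_counts, table1_total))
```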

The final genome assembly was aligned to three other public Psilocybe cubensis datasets (Fricke et al. 2017; Torrens-Spence et al. 2018; McKernan et al. 2020) and one different Psilocybe species (Psilocybe cyanescens; Reynolds et al. 2018) to verify taxonomic identification (Table 2). In total, 96-98.75% of these Psilocybe cubensis sequences align to the new HiFi-generated Psilocybe cubensis P.envy reference using minimap2 and bwa-mem (Li and Durbin 2010; Li 2018). Mapping rates were determined using samtools flagstat on BAM files (Li et al. 2009). Alignments were visualized with MUMmer V4.0.0beta2 and Integrative Genomics Viewer v2.4.16 (Delcher et al. 2003; Robinson et al. 2011; Thorvaldsdottir et al. 2013).
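As an illustration of how a mapping rate can be read from samtools flagstat text output, the sketch below parses the "in total" and "mapped" lines and divides one by the other. The function name and the example report are assumptions made for illustration; the numbers are invented and are not from this study.

```python
import re


def mapping_rate(flagstat_text):
    """Return the percentage of QC-passed reads that are mapped.

    Only lines of the form 'N + M in total ...' and 'N + M mapped ...'
    are used; lines such as 'primary mapped' are ignored.
    """
    counts = {}
    for line in flagstat_text.splitlines():
        match = re.match(r"(\d+) \+ (\d+) (in total|mapped)\b", line.strip())
        if match:
            counts[match.group(3)] = int(match.group(1))
    return 100.0 * counts["mapped"] / counts["in total"]


# Invented flagstat-style report, for illustration only.
example_report = """\
1000 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
960 + 0 mapped (96.00% : N/A)
"""
print(mapping_rate(example_report))
```

Note that flagstat counts secondary and supplementary alignments in these totals, so rates for long-read or assembly-to-assembly alignments may need a stricter definition.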

Table 2. Three Psilocybe cubensis data sets in NCBI and JGI were aligned to the P.envy HiFi reference.

A different Psilocybe species (Psilocybe cyanescens) genome was also mapped with much lower mapping efficiency.

Author | Accession | Data type | Mapping rate | Tool | Species
Fricke et al. 2017 | https://mycocosm.jgi.doe.gov/Psicub1_1/Psicub1_1.home.html | Illumina Assembly | 98.8% | Minimap2 | P. cubensis
McKernan et al. 2020 | NCBI Project: PRJNA687911 | Illumina FastQ | 96.0% | bwa-mem | P. cubensis
Torrens-Spence et al. 2018 | NCBI Project: PRJNA450675 | RNA-Seq Assembly | 98.5% | Minimap2 | P. cubensis
Reynolds et al. 2018 | NCBI Project: PRJNA387735 | Illumina Assembly | 56.8% | Minimap2 | P. cyanescens

Three Illumina genome assemblies (Reynolds et al., McKernan et al., Fricke et al.) were additionally aligned using MUMmer for whole genome alignment plots (Figure 2).


Figure 2. Whole genome alignments of short read Illumina assemblies to Psilocybe cubensis strain P. envy.

Left is Psilocybe cyanescens from Reynolds et al. Middle is the McKernan et al. (MGC) Illumina assembly. Right is the Fricke et al. (JGI) assembly.

Polymorphisms

Illumina whole-genome shotgun data (McKernan et al. NCBI Project: PRJNA687911) was mapped to the P. envy HiFi reference assembly using bwa-mem (version 0.7.17-r1188) and samtools (version 1.8), sorted with sambamba (version 0.6.7), and variants were identified using GATK HaplotypeCaller (version 4.1.6.0) with default arguments. The annotation from the funannotate pipeline was converted from GFF3 format into a SnpEff (version 4.3t 2017-11-24) database as described here (https://pcingola.github.io/SnpEff/se_buildingdb/) and the variants that came out of HaplotypeCaller were annotated. 553,716 variants (471,443 SNPs and 82,273 small insertions and deletions) were called and annotated, equating to a SNP every 99 bp and a variant every 83 bp including indels. Of these, 375,896 (67.9%) are heterozygous and 177,820 (32.1%) are homozygous, a ratio of just over 2 to 1 heterozygous:homozygous variants. Lastly, as a quality check, the original Pacific Biosciences CCS corrected shotgun reads were mapped back to the reference with minimap2 (version 2.17-r941) and variants were called again using GATK HaplotypeCaller. A total of 15,963 variants were identified, of which 15,674 (98.2%) are heterozygous and only 289 are homozygous. Whole genome shotgun reads mapped back to their consensus reference should produce predominantly heterozygous calls in a diploid organism. Scripts used for variant calling are on GitHub and described in the Data availability section.
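The heterozygous and homozygous fractions quoted above follow from simple arithmetic on the variant counts. The sketch below reproduces that arithmetic for both call sets; the function name is an illustrative assumption, and the counts are copied from the text.

```python
def zygosity_summary(heterozygous, homozygous):
    """Summarise heterozygous vs homozygous variant counts."""
    total = heterozygous + homozygous
    return {
        "total": total,
        "het_pct": round(100 * heterozygous / total, 1),
        "hom_pct": round(100 * homozygous / total, 1),
        "het_to_hom": round(heterozygous / homozygous, 2),
    }


# Illumina reads from the MGC strain mapped to the P.envy reference.
illumina = zygosity_summary(heterozygous=375_896, homozygous=177_820)
# HiFi CCS reads mapped back to their own reference.
hifi_self = zygosity_summary(heterozygous=15_674, homozygous=289)

# SNPs plus small indels should equal the total number of Illumina calls.
assert 471_443 + 82_273 == illumina["total"]
print(illumina)
print(hifi_self)
```

The contrast between roughly 68% heterozygous calls in the cross-strain comparison and roughly 98% in the self-mapping check is the pattern the quality check relies on.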

Structural variation

The N-methyltransferase gene responsible for psilocybin production in P.envy contains a structural variation not seen in previous P. cubensis surveys (Figure 3). Illumina read mapping of the McKernan et al. P. cubensis assembly in NCBI (NCBI Project: PRJNA687911) demonstrates multiple read pairs spanning a 4.6 kb insertion in the HiFi P. cubensis strain P.envy (SRA submission SRP299420). This insertion extends the 3’ end of the P.envy N-methyltransferase gene. The 4.6 kb insertion is also observed as a deletion in Psilocybe cyanescens (Reynolds et al. 2018) and as a deletion in RNA-Seq data from Torrens-Spence et al. (NCBI Project: PRJNA450675). Other SNPs also exist in these genes and should be considered in the context of this deletion. Further work is required to understand the biological significance of this variation.


Figure 3. IGV display of Illumina reads mapped to HiFi Psilocybe cubensis P.envy assembly.

Top track is Medicinal Genomics Illumina whole genome shotgun data of a different P. cubensis (strain name unknown: NCBI Project: PRJNA687911) mapped to the HiFi P. cubensis strain P.envy. Second track contains RNA-Seq data from a third P. cubensis genome (strain name also unknown: NCBI Project: PRJNA450675) hosted at JGI. Third track is Psilocybe cyanescens genome mapped to HiFi P. cubensis P.envy reference genome. Fourth track is FunAnnotate GFF3 annotation of the HiFi P. cubensis P.envy genome.

Conclusions

A highly contiguous Psilocybe cubensis genome has been generated. The contig N50 is 75-fold longer than that of the existing assembly available at JGI. This reference can aid in the identification of genetic variation that may impact psilocybin, psilocin, norpsilocin, baeocystin, norbaeocystin and aeruginascin production.

Data availability

GenBank: Psilocybe cubensis strain MGC-MH-2018, whole genome shotgun sequencing project, Accession number JAFIQS000000000.1: https://www.ncbi.nlm.nih.gov/nuccore/JAFIQS000000000.1/.

BioProject: Psilocybe cubensis, Accession number PRJNA687911: https://www.ncbi.nlm.nih.gov/bioproject/PRJNA687911

BioProject: Psilocybe cubensis strain: MGC-MH-2018, Accession number PRJNA700437: https://www.ncbi.nlm.nih.gov/bioproject/PRJNA700437

CoGe genome browser: Psilocybe cubensis (Psilocybe cubensis P.envy), https://genomevolution.org/coge/GenomeInfo.pl?gid=60487

Variant calling scripts: https://github.com/mclaugsf/mgc-public/tree/master/f1000_10-281. The final list of annotated variants and the accompanying SnpEff output files are available here (https://github.com/mclaugsf/mgc-public/tree/master/f1000_10-281/nextflow/annotated-variants). The GFF3 file that was used to perform the SnpEff annotation is available for download (https://github.com/mclaugsf/mgc-public/blob/master/f1000_10-281/gff/P-Envy-05-25-2021.gff3.gz), as are the Dockerized workflows written in Nextflow that were used to perform the mapping, variant calling and annotation analysis (https://github.com/mclaugsf/mgc-public/tree/master/f1000_10-281/nextflow).

how to cite this article
McKernan K, Kane LT, Crawford S et al. A draft reference assembly of the Psilocybe cubensis genome [version 2; peer review: 2 approved] F1000Research 2021, 10:281 (https://doi.org/10.12688/f1000research.51613.2)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.

Open Peer Review

Key to Reviewer Statuses
Approved: The paper is scientifically sound in its current form and only minor, if any, improvements are suggested.
Approved with reservations: A number of small changes, sometimes more significant revisions, are required to address specific details and improve the paper's academic merit.
Not approved: Fundamental flaws in the paper seriously undermine the findings and conclusions.
Version 2
Reviewer Report 22 Jun 2021
Jason C. Slot, Department of Plant Pathology, The Ohio State University, Columbus, OH, USA 
Approved
The authors have provided changes for all the comments and requested edits and I deem all of them to be acceptable, and to have much improved the manuscript. There remains one typo that the authors may wish to correct: In ...
HOW TO CITE THIS REPORT
Slot JC. Reviewer Report For: A draft reference assembly of the Psilocybe cubensis genome [version 2; peer review: 2 approved]. F1000Research 2021, 10:281 (https://doi.org/10.5256/f1000research.57314.r87516)
Version 1
PUBLISHED 09 Apr 2021
Reviewer Report 21 May 2021
Anders Goncalves da Silva, Lighthouse Genomics Inc., British Columbia, Canada 
Philippe Henry, Egret Bioscience Ltd., West Kelowna, Canada;  Lighthouse Genomics Inc., BC, Canada 
Approved
McKernan and colleagues present on the first highly contiguous draft genome for the magic mushroom Psilocybe cubensis. We commend their use of High accuracy long read sequencing and an advanced bioinformatics pipeline to build a much more complete picture of the ...
HOW TO CITE THIS REPORT
Goncalves da Silva A and Henry P. Reviewer Report For: A draft reference assembly of the Psilocybe cubensis genome [version 2; peer review: 2 approved]. F1000Research 2021, 10:281 (https://doi.org/10.5256/f1000research.54802.r83076)
Reviewer Report 26 Apr 2021
Jason C. Slot, Department of Plant Pathology, The Ohio State University, Columbus, OH, USA 
Approved with Reservations
Summary:
The manuscript presents a high quality assembly of the historically and medicinally important fungus, Psilocybe cubensis. Best practices were observed in sequencing, assembly, and annotation. The manuscript notes a potentially important structural variation present in the P. envy ...
HOW TO CITE THIS REPORT
Slot JC. Reviewer Report For: A draft reference assembly of the Psilocybe cubensis genome [version 2; peer review: 2 approved]. F1000Research 2021, 10:281 (https://doi.org/10.5256/f1000research.54802.r83396)
  • Author Response 26 May 2021
    Kevin McKernan, R&D, Medicinal Genomics, Beverly, 01915, USA
    26 May 2021
    Author Response
    The reviewer makes excellent points. We will be making these suggested changes to the final manuscript.
    Competing Interests: No competing interests were disclosed.