AlignmentViewer: Sequence Analysis of Large Protein Families

Roc Reguant; Yevgeniy Antipin; Rob Sheridan; Christian Dallago; Drew Diamantoukos; Augustin Luna; Chris Sander; Nicholas Paul Gauthier

doi:10.12688/f1000research.22242.2

Software Tool Article

Revised

[version 2; peer review: 2 approved]

Roc Reguant^1-3, Yevgeniy Antipin^4, Rob Sheridan^5, Christian Dallago^1-3, Drew Diamantoukos^2,3, Augustin Luna^2,3,6, Chris Sander^2,3,6, Nicholas Paul Gauthier^2,3,6

PUBLISHED 15 Oct 2020

¹ Department of Informatics, Technical University of Munich, Munich, Germany
² cBio Center, Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
³ Department of Cell Biology, Harvard Medical School, Boston, MA, USA
⁴ Icahn School of Medicine, Mount Sinai, New York, NY, USA
⁵ Knowledge Systems Group, Computational Oncology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
⁶ Broad Institute of MIT and Harvard, Cambridge, MA, USA

Roc Reguant
Roles: Methodology, Project Administration, Software, Writing – Original Draft Preparation, Writing – Review & Editing

Yevgeniy Antipin
Roles: Conceptualization, Methodology, Project Administration, Software, Supervision

Rob Sheridan
Roles: Software, Writing – Review & Editing

Christian Dallago
Roles: Software, Writing – Review & Editing

Drew Diamantoukos
Roles: Software

Augustin Luna
Roles: Project Administration, Software, Supervision, Writing – Original Draft Preparation, Writing – Review & Editing

Chris Sander
Roles: Funding Acquisition, Project Administration, Supervision, Writing – Review & Editing

Nicholas Paul Gauthier
Roles: Project Administration, Supervision, Writing – Review & Editing

This article is included in the Bioinformatics gateway.

Abstract

AlignmentViewer is a web-based tool to view and analyze multiple sequence alignments of protein families. The particular strengths of AlignmentViewer include flexible visualization at different scales as well as analysis of conservation patterns and of the distribution of proteins in sequence space. The tool is directly accessible in web browsers without the need for software installation. It can handle protein families with tens of thousands of sequences and is particularly suitable for evolutionary coupling analysis, e.g. via EVcouplings.org.

Keywords

alignment viewer, MSA, JavaScript, protein alignments, web-based, tool

Corresponding authors: Chris Sander, Nicholas Paul Gauthier

Competing interests: No competing interests were disclosed.

Grant information: The project was supported by the Human Frontier Science Program (HFSP) (RGP0055/2015), the National Resource for Network Biology (NRNB) from the National Institute of General Medical Sciences (NIGMS) (P41 GM103504), and the Department of Cell Biology at Harvard Medical School.
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Copyright: © 2020 Reguant R et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite: Reguant R, Antipin Y, Sheridan R et al. AlignmentViewer: Sequence Analysis of Large Protein Families [version 2; peer review: 2 approved]. F1000Research 2020, 9:213 (https://doi.org/10.12688/f1000research.22242.2) First published: 27 Mar 2020, 9:213 (https://doi.org/10.12688/f1000research.22242.1) Latest published: 15 Oct 2020, 9:213 (https://doi.org/10.12688/f1000research.22242.2)

Revised Amendments from Version 1

We have updated the manuscript and the software according to reviewer comments. We have added a new feature to select the coloring scheme of the alignment, addressed cross-browser compatibility issues, and improved the usability. We have also addressed comments from the reviewers meant to improve the clarity of the manuscript.

See the authors' detailed response to the review by Alex Bateman
See the authors' detailed response to the review by Erik Larsson Lekholm

Introduction

Multiple sequence alignment (MSA) analysis (e.g., analysis of sequence patterns, subfamilies, specificity residues, evolutionary couplings) and visualization allows researchers to extract information and gain a better understanding of protein families. MSA is a basic step in many protein analysis workflows, including 3D structure prediction (Marks et al., 2011), structure detection in flexible (‘disordered’) domains (Toth-Petroczy et al., 2016), function prediction (Tamames et al., 1998) and intracellular localization (Goldberg et al., 2014).

A number of useful tools exist for the visualization of protein MSAs, such as MView, Wasabi, AliView, MSAViewer and Jalview (Brown et al., 1998; Larsson, 2014; Veidenberg et al., 2016; Waterhouse et al., 2009; Yachdav et al., 2016). MView was one of the first online browser-based MSA viewers, with alignments formatted as an HTML document. Wasabi is a web-based tool that is particularly useful for phylogenetic analysis and incorporates phylogeny-aware alignment methods. AliView is a desktop application with features such as sorting, viewing, removing, editing and merging sequences from large nucleotide sequence datasets. MSAViewer is an interactive MSA visualizer written in JavaScript that implements basic features of viewing, scrolling and motif selection. Jalview is a Java-based desktop tool that is also accessible through websites via an embeddable applet, but unfortunately the technology for these applets is no longer supported in most browsers.

AlignmentViewer complements these MSA tools and provides the following features: (i) in-browser and serverless execution, (ii) visualization of very large MSAs, (iii) visualization of conservation patterns, (iv) sequence filtering, (v) logo display, (vi) pairwise sequence identity map, (vii) sequence space exploration by UMAP dimensionality reduction, and (viii) display of top-ranked evolutionary couplings (Hopf et al., 2019).

An earlier version of this article can be found on bioRxiv (DOI: https://doi.org/10.1101/269720); additional features have been implemented since the earlier version.

Methods

Operation

AlignmentViewer is a web-based tool written in JavaScript with minimal system requirements. AlignmentViewer works best on Chrome regardless of operating system. AlignmentViewer is developed with the D3 library (d3js.org) to produce dynamic and interactive data visualizations, with performance (speed) for large alignments a major consideration. The tool is entirely client-based, running inside a web browser without the need for server-side computation.

Implementation

Users can access AlignmentViewer and all its features directly at alignmentviewer.org, and because execution is serverless, anyone can quickly start a local copy for online or offline use. Hyperlinks for lookup in background databases, such as UniProt or Pfam, are made directly from the client. Alignments can also be passed to AlignmentViewer via a URL query parameter, provided that the alignment is served over https and the parameter value is properly encoded (e.g., https://alignmentviewer.org/?url=https://alignmentviewer.org/example/1bkr_A.1-108.msa.txt). This enables seamless integration with external web services via a simple link; for example, the EVcouplings web server (evcouplings.org; Hopf et al., 2019) offers visualization of computed alignments via a link to AlignmentViewer. The tool has been thoroughly tested with many large alignments. An alignment with, e.g., 50,000 sequences (about 13 MB of memory) loads in the Safari browser within one minute; further speedup is planned.
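The link-based integration described above can be sketched in a few lines of JavaScript. This is a minimal illustration rather than code from AlignmentViewer: the helper name buildViewerLink is hypothetical, and the only behavior taken from the text is that the alignment location is passed in the "url" query parameter, must be served over https, and must be URL-encoded.

```javascript
// Build an AlignmentViewer link for an alignment hosted elsewhere.
// Assumption: the viewer reads the alignment location from the "url" query
// parameter, as in the example link given in the text. The function name
// and the default base URL argument are illustrative only.
function buildViewerLink(alignmentUrl, viewerBase = "https://alignmentviewer.org/") {
  const parsed = new URL(alignmentUrl); // throws on malformed input
  if (parsed.protocol !== "https:") {
    throw new Error("Alignment must be served over https: " + alignmentUrl);
  }
  // encodeURIComponent escapes ":", "/", "?", "&" and "=" so that the whole
  // alignment URL travels as a single parameter value.
  return viewerBase + "?url=" + encodeURIComponent(alignmentUrl);
}

const link = buildViewerLink("https://alignmentviewer.org/example/1bkr_A.1-108.msa.txt");
console.log(link);
```

An external service could emit such a link next to each computed alignment. Rejecting http locations up front avoids the browser silently blocking the request as mixed content.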

Use case

Figure 1 shows the main functionalities of AlignmentViewer, which are explained in more detail in the following subsections. The top sub-figure shows the MSA view, dominated by the sequence logo and the alignment. This view lets the user examine the alignment in depth. Each amino acid position is represented in the sequence logo, where the height shows the information content of the position in bits. Below, from left to right, are the pixel view, a part of the stats view, and the sequence space (with annotations). The pixel view gives a coarse overview of the alignment that facilitates visualization and pattern identification. The all-versus-all sequence identity sub-figure (part of the stats view) allows users to identify possible clusters in the alignment based on sequence identity. The bottom right sub-figure displays the sequences clustered by similarity (see section Sequence space) and highlighted by user-provided annotations to aid in interpretation of the clusters.

Figure 1. AlignmentViewer visualization of the beta-lactamase protein domain family.

Bars above the sequence alignment quantify residue conservation. The alignment consensus logo (just below the bar chart) is based on the amino acid frequencies. Lower left: pixel view of the alignment especially useful for large families; lower middle: protein-protein sequence similarity matrix graded by percentage identity; lower right: distribution of sequences in sequence space (UMAP projection), colored by species groups.

MSA view

Alignment details. The MSA view page shows summary information: number of sequences, conservation and gap counts for each position, a sequence logo, and the residues in one-letter code. By default, columns with gaps in the reference sequence (first row) are omitted in order to facilitate visual focus on sequence patterns relative to a protein of interest and to avoid extremely gapped alignment views typical of many MSA presentations. The amino acids are colored using a conventional coloring scheme, adopted from MView, based on amino acid properties.
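As an illustration of a per-position conservation measure expressed in bits, the sketch below computes the information content of one alignment column using a common definition, log2(20) minus the Shannon entropy of the observed amino acid frequencies. The article does not state whether AlignmentViewer uses exactly this formula, applies a small-sample correction, or treats gaps differently, so the function name and the gap handling here are assumptions.

```javascript
const AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY";

// Information content (in bits) of one alignment column.
// Assumption of this sketch: gaps and non-standard symbols are ignored and
// no small-sample correction is applied.
function columnInformation(sequences, col) {
  const counts = new Map();
  let total = 0;
  for (const seq of sequences) {
    const aa = seq[col].toUpperCase();
    if (!AMINO_ACIDS.includes(aa)) continue; // skip gaps and unknown symbols
    counts.set(aa, (counts.get(aa) || 0) + 1);
    total += 1;
  }
  if (total === 0) return 0; // column consists only of gaps
  let entropy = 0;
  for (const n of counts.values()) {
    const p = n / total;
    entropy -= p * Math.log2(p);
  }
  // Maximum possible entropy for 20 amino acids minus the observed entropy.
  return Math.log2(AMINO_ACIDS.length) - entropy;
}

const toyAlignment = ["AC", "AD", "A-"];
console.log(columnInformation(toyAlignment, 0)); // fully conserved column
console.log(columnInformation(toyAlignment, 1)); // two residues, one gap
```

Under this definition a fully conserved column reaches the maximum of log2(20) bits, and a column split evenly between two residues scores one bit less.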

Sequence attributes and sorting. Sequences in the alignment can be sorted using one of four different methods: (i) the original order provided by the user, (ii–iii) by % sequence identity between a particular sequence and the reference (top) sequence, computed relative to either the first or the second sequence (gaps not counted), and (iv) by user-provided sequence weights or other attributes (upload annotations tab), such as alignment profile scores (e.g., HMM bit scores). Sequences can be filtered by sequence identity relative to a reference sequence or by percentage of gaps.
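The sorting and filtering options above can be sketched as follows. This is one possible reading, not the tool's implementation: "relative to the first or the second" is interpreted here as dividing the number of identical aligned residues by the number of non-gap residues in the reference or in the compared sequence, respectively. All function and option names are hypothetical, and sequences are assumed to have equal aligned length.

```javascript
const isGap = (c) => c === "-" || c === ".";

// Percent identity of `seq` to `ref`. Assumption: the denominator is the
// number of non-gap residues in the first (reference) or second sequence.
function percentIdentity(ref, seq, relativeTo = "first") {
  let matches = 0, refLen = 0, seqLen = 0;
  for (let i = 0; i < ref.length; i++) {
    const a = ref[i], b = seq[i];
    if (!isGap(a)) refLen++;
    if (!isGap(b)) seqLen++;
    if (!isGap(a) && !isGap(b) && a.toUpperCase() === b.toUpperCase()) matches++;
  }
  const denom = relativeTo === "first" ? refLen : seqLen;
  return denom === 0 ? 0 : (100 * matches) / denom;
}

// Percentage of gap characters in an aligned sequence.
function gapPercent(seq) {
  let gaps = 0;
  for (const c of seq) if (isGap(c)) gaps++;
  return seq.length === 0 ? 0 : (100 * gaps) / seq.length;
}

// Annotate each record, drop those outside the thresholds, and sort by
// identity to the reference (first record), highest first.
function filterAndSort(alignment, { minIdentity = 0, maxGapPercent = 100 } = {}) {
  const ref = alignment[0].seq;
  return alignment
    .map((rec) => ({ ...rec, identity: percentIdentity(ref, rec.seq), gaps: gapPercent(rec.seq) }))
    .filter((rec) => rec.identity >= minIdentity && rec.gaps <= maxGapPercent)
    .sort((x, y) => y.identity - x.identity);
}

const records = [
  { id: "ref", seq: "ACDE" },
  { id: "a", seq: "AC-F" },
  { id: "b", seq: "ACDF" },
];
console.log(filterAndSort(records, { minIdentity: 60 }).map((r) => r.id));
```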

Pixel view (suitable for large families)

The pixel view (image view tab on the website) provides an overview of the entire depth and breadth of an MSA. The amino acid letters are represented by small rectangles of pixels that retain the amino acid type coloring. This visual impression can reveal patterns of conservation and variation, especially for large alignments, and gives an intuitive view of sequence properties, noise at the uncertain edges of a protein family, and subfamily distributions. The coloring scheme can be by (1) amino acid properties, (2) hydrophobicity (red to blue) or (3) mutational difference (stronger color) in a sequence relative to the reference (first row) sequence.

Stats view

The stats view tab leads to plots of statistical properties of the set of sequences in the alignment, including (i) sequence identity relative to the reference sequence; (ii) the min, max, and average of (i); and (iii) a pairwise sequence identity matrix in which each pixel represents the degree of similarity between two sequences. A block-diagonal structure of the matrix is indicative of distinct subfamilies, given, e.g., a tree-derived sequence order as user input. The ordering of sequences by phylogeny is (currently) not part of the tool and can be performed using external tools, e.g., Wasabi (Veidenberg et al., 2016).
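A pairwise identity matrix of the kind described above might be computed as in the sketch below. It is an illustration under stated assumptions: identity is taken over columns where both sequences carry a residue, the matrix is stored in a flat typed array, and the names pairIdentity and identityMatrix are hypothetical rather than taken from the AlignmentViewer source.

```javascript
const GAP_CHARS = new Set(["-", "."]);

// Percent identity over columns where both sequences carry a residue
// (an assumption of this sketch; other normalizations are possible).
function pairIdentity(a, b) {
  let same = 0, compared = 0;
  for (let i = 0; i < a.length; i++) {
    if (GAP_CHARS.has(a[i]) || GAP_CHARS.has(b[i])) continue;
    compared++;
    if (a[i] === b[i]) same++;
  }
  return compared === 0 ? 0 : (100 * same) / compared;
}

// Symmetric N x N matrix stored row-major in a typed array.
// Entry (i, j) lives at index i * n + j.
function identityMatrix(seqs) {
  const n = seqs.length;
  const m = new Float32Array(n * n);
  for (let i = 0; i < n; i++) {
    m[i * n + i] = 100; // each sequence is identical to itself
    for (let j = i + 1; j < n; j++) {
      const v = pairIdentity(seqs[i], seqs[j]);
      m[i * n + j] = v;
      m[j * n + i] = v;
    }
  }
  return m;
}

// Two toy "subfamilies": rows 0-1 and rows 2-3.
const toySeqs = ["AAAA", "AAAT", "CCCC", "CCCG"];
const matrix = identityMatrix(toySeqs);
console.log(Array.from(matrix));
```

In the toy input the high values cluster in two 2 x 2 blocks along the diagonal, which is the block-diagonal pattern the text associates with subfamilies. The number of cells grows quadratically with the number of sequences, so for very large alignments this computation dominates the cost.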

Annotations and evolutionary couplings

Users can upload custom numerical attributes or labels for the sequences in the MSA (upload annotations) or evolutionary couplings between residue positions (load couplings). Adding these attributes allows users to use sequence weights, compare different measures of sequence fitness (e.g., bitscore, sequence identity, statistical energy) or visualize evolutionary coupling constraints for pairs of positions.

Sequence space

Users can view representations of the MSA sequences in two- or three-dimensional space under the “sequence space” tab. These representations are generated using the Uniform Manifold Approximation and Projection (UMAP) dimensionality reduction algorithm (McInnes et al., 2018), which has been adapted for JavaScript in the umap-js library (https://github.com/PAIR-code/umap-js). The alignmentviewer.org implementation uses the number of amino acid differences between pairs of sequences (the Hamming distance) as the distance metric parameter. The algorithm then iteratively calculates an embedding in two- or three-dimensional space, which is displayed in real time for the end users. UMAP hyperparameters are set to reasonable defaults, but can also be configured via the settings panel. Sequences can be colored by user-provided annotations ("upload annotations" tab).
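The distance metric named above is simple to state in code. The sketch below encodes aligned sequences as integer vectors and counts differing positions; the encoding scheme and function names are assumptions for illustration, not the tool's own. The commented lines indicate how such a function could be handed to umap-js, which, to our understanding, accepts a custom distance function; the option names there should be checked against the library documentation.

```javascript
// Map each residue (or gap) to a small integer so that sequences become
// numeric vectors. The alphabet and codes are illustrative assumptions.
const ALPHABET = "ACDEFGHIKLMNPQRSTVWY-";
function encodeSequence(seq) {
  return Array.from(seq.toUpperCase(), (c) => {
    const idx = ALPHABET.indexOf(c === "." ? "-" : c);
    return idx === -1 ? ALPHABET.length : idx; // unknown symbols share one code
  });
}

// Hamming distance: number of positions at which two equal-length vectors differ.
function hamming(a, b) {
  if (a.length !== b.length) throw new Error("vectors must have equal length");
  let d = 0;
  for (let i = 0; i < a.length; i++) if (a[i] !== b[i]) d++;
  return d;
}

console.log(hamming(encodeSequence("ACDE"), encodeSequence("ACDF")));

// Hypothetical wiring with umap-js (not executed here; option names should
// be verified against the library documentation):
//   const umap = new UMAP({ nComponents: 2, distanceFn: hamming });
//   const embedding = umap.fit(sequences.map(encodeSequence));
```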

Conclusion

AlignmentViewer is a lightweight online viewer for biological multiple sequence alignments that focuses on usability and performance. Written in JavaScript, this tool can be used in many browsers. The architecture of AlignmentViewer allows its use without software installation and without an internet connection. The visualization capabilities, analysis features and metrics in AlignmentViewer are useful in many areas of biology, especially evolutionary, structural, synthetic and chemical biology. In the future we plan to add a visualization of species diversity, predicted contact maps, and organization by sequence subfamilies with specificity residues. A standalone version of AlignmentViewer is available at alignmentviewer.org and is in use by external services including EVcouplings.org. AlignmentViewer is an open source project hosted on GitHub, which welcomes engagement of interested members of the community.

Data availability

All data underlying the results are available as part of the article and no additional source data are required.

Software availability

AlignmentViewer website and demo can be found at: https://alignmentviewer.org/.

Source code available at: https://github.com/sanderlab/alignmentviewer.

Archived source code at time of publication: https://doi.org/10.5281/zenodo.4063551 (Reguant, 2020).

License: MIT license.

Acknowledgments

We thank Debora Marks for constructive discussions.

References

Brown NP, Leroy C, Sander C, et al.: MView: a web-compatible database search or multiple alignment viewer. Bioinformatics. 1998; 14(4): 380–381.
Goldberg T, Hecht M, Hamp T, et al.: LocTree3 prediction of localization. Nucleic Acids Res. 2014; 42(Web Server issue): W350–W355.
Hopf TA, Green AG, Schubert B, et al.: The EVcouplings Python framework for coevolutionary sequence analysis. Bioinformatics. 2019; 35(9): 1582–1584.
Larsson A: AliView: a fast and lightweight alignment viewer and editor for large datasets. Bioinformatics. 2014; 30(22): 3276–3278.
Marks DS, Colwell LJ, Sheridan R, et al.: Protein 3D structure computed from evolutionary sequence variation. PLoS One. 2011; 6(12): e28766.
McInnes L, Healy J, Melville J: UMAP: Uniform manifold approximation and projection for dimension reduction. arXiv preprint. arXiv: 1802.03426. 2018.
Reguant R: Alignmentviewer (Version v2.1). Zenodo. 2020. http://www.doi.org/10.5281/zenodo.4063551
Tamames J, Ouzounis C, Casari G, et al.: EUCLID: automatic classification of proteins in functional classes by their database annotations. Bioinformatics. 1998; 14(6): 542–543.
Toth-Petroczy A, Palmedo P, Ingraham J, et al.: Structured States of Disordered Proteins from Genomic Sequences. Cell. 2016; 167(1): 158–170.e12.
Veidenberg A, Medlar A, Löytynoja A: Wasabi: An Integrated Platform for Evolutionary Sequence Analysis and Data Visualization. Mol Biol Evol. 2016; 33(4): 1126–1130.
Waterhouse AM, Procter JB, Martin DM, et al.: Jalview Version 2--a multiple sequence alignment editor and analysis workbench. Bioinformatics. 2009; 25(9): 1189–1191.
Yachdav G, Wilzbach S, Rauscher B, et al.: MSAViewer: interactive JavaScript visualization of multiple sequence alignments. Bioinformatics. 2016; 32(22): 3501–3503.

Open Peer Review

Version 2 (published 15 Oct 2020)


Reviewer Report 16 Oct 2020

Alex Bateman, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK

Approved

https://doi.org/10.5256/f1000research.30135.r73164

The authors have addressed my concerns fully.

The authors may wish to double check the alignment of the example 1KR:A on the website. It does not look well aligned to my eyes. This suggestion is not required ...

Competing Interests: No competing interests were disclosed.

Reviewer Expertise: Computational biology with specialism in biological databases and analysis of protein families.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Version 1 (published 27 Mar 2020)


Reviewer Report 16 Apr 2020

Alex Bateman, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK

Approved with Reservations

https://doi.org/10.5256/f1000research.24533.r61763

Overall there is a need for Javascript based alignment viewers which can handle large numbers of sequences. So this software is rather timely. However, I think the current implementation seems to still contain significant bugs and the web page requires a round of user experience testing to make it easier to use and interpret the results. Once these improvements are made then I think this has the potential to be a very useful tool.

Manuscript changes:

Please fix capitalisation of PFAM to Pfam

Software/Website changes:

The computing conservation box is annoyingly placed. It does not go away and covers the UMAP view significantly.
I really need an estimate of how long it will take for the pairwise identity to load. Some time passes…I didn’t realise I had to click on calculate to get the pairwise identity to show. Please UX test this page to make it easier/clearer for the user to understand what to do.

For the top graph on the stats view the three colours used for max/average and min identity are hard to distinguish. Please use a broader colour palette. Please add axis legends for this graph and make it clearer what the title of this graph is. The title is very clear for the bottom plot.

Please add alternative colouring schemes for viewing the alignments. It makes a surprisingly big difference for experienced users ability to interpret an alignment. I personally like the ClustalX colouring scheme that is widely adopted, but there are other popular schemes it would be nice to incorporate as options.

“Based on this and other small tests we can safely state that any computer that is able to run any modern web browser will be able to run any alignment that requires visualization”. This is overstating the case. Needs to be toned down. Try testing the software on the ABC transporter family (PF00005) alignment in Pfam. The NCBI alignment contains 2.6 million sequences. If it works seamlessly for that then you can safely state.

I tried to hook the viewer up to a Pfam alignment which is not very large (840 sequences). I got a box that said Fetching file…and no response beyond that.

https://alignmentviewer.org/?url=http://pfam.xfam.org/family/PF00571/alignment/seed/format?format=fasta&alnType=seed&order=t&case=l&gaps=default&download=0

also tried Stockholm format with no luck.

https://alignmentviewer.org/?url=http://pfam.xfam.org/family/PF00571/alignment/seed/format?format=stockholm&alnType=seed&order=t&case=l&gaps=default&download=0

I also tried with a Pfam family with just 3 sequences. Still with no luck.

I got the viewer to work by uploading a Pfam alignment in fasta format with 3 sequences. Then I tried uploading the seed alignment for the CBS domain downloaded from Pfam in fasta format. I then got the following error:

Parsing error: sequence #3 has different length.

The first few lines of the alignment are included below and the third sequence appears to be the same length as all the others.

>Q8EHI4_SHEON/79-134
I........H.Q...V........MTR........NP.........V.......TVAPY.
...VS.LDVAS....RT..L.....L....E.....HN..I.....G..CL.P.V.L...
.........ENG....D......LV.GIVTWKDLLRAYCA
>Q97V95_SULSO/193-248
V........L.D...A.......G.TK........NP.........I.......TINRY.
...YS.ILNAA....KL..M.....I....E.....KR..I.....G..TL.L.V.M...
.......E..NQ....K......LV.GIVTERDLMYAYIN
>Q6M020_METMP/269-321
V........K.E...I.........MS........PP.........V.......MVSPE.
...AA.LNELI....KG..M.....A....N.........T.....D..RV.Y.V.V...
.........DNG....N......IL.GIISKTDIVRTLSI
>Q72KI1_THET2/367-424
V........E.G...F.........LA........RA.........V.......VLPPS.
...TP.LSQVE....PR..L..........R.....EA..G.....G..RV.L.V.GE..
.......RAGEG...WR......LL.GIYTRTDLYRSAPK
>Q8TX96_METKA/244-298
A........R.N...Y.......L..Q........EM.........V.......VVPPE.
...TP.LHEAL....WE..V.........ID.....KM..S.....D..RI.Y.V.M...
..........DG.....R....KLT.GVVPLIDAVYTLAK
>Q836T2_ENTFA/77-132
V........Q.E...I........MSP........PL.................MVAQD.
...TS.IRDAI....TN..L................FMYDV.....G..SL.Y.V.M...
........DEAK....E......LL.GVLSRKDLLRASLN
>Q8XIV1_CLOPE/77-131
V........K.D...I.........MS........KP.........V.......TVCEE.
...TM.LHDAI....VH.LF.....L...ND.........V.....G..TM.F.VEN...
..........GG....V......LT.GAVSRKDFLKVAIG

If I delete the third sequence and re-upload I get exactly the same error.

For info I carried out tests with Firefox for Mac version 74.0 and Safari Version 13.1
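For illustration only, the sketch below shows FASTA parsing in which wrapped sequence lines are concatenated into one record before record lengths are compared. It is not AlignmentViewer's parser, the function names are hypothetical, and the cause of the error reported above is not stated in the source.

```javascript
// Parse FASTA text into records, joining sequence lines that wrap across
// several lines (as in the alignment excerpt above).
function parseFasta(text) {
  const records = [];
  let current = null;
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "") continue; // ignore blank lines
    if (line.startsWith(">")) {
      current = { id: line.slice(1), seq: "" };
      records.push(current);
    } else if (current) {
      current.seq += line; // continuation of the current record
    }
  }
  return records;
}

// Verify that all aligned sequences have the same total length and report
// the offending record with its actual and expected lengths otherwise.
function checkEqualLength(records) {
  const expected = records.length > 0 ? records[0].seq.length : 0;
  records.forEach((rec, i) => {
    if (rec.seq.length !== expected) {
      throw new Error(
        `sequence #${i + 1} (${rec.id}) has length ${rec.seq.length}, expected ${expected}`
      );
    }
  });
  return expected;
}

// Two records with the same content but different line wrapping.
const wrapped = ">a\nAC.\n.DE\n>b\nAC..\nDE\n";
const parsedRecords = parseFasta(wrapped);
console.log(parsedRecords, checkEqualLength(parsedRecords));
```

Reporting both the observed and the expected length in the error message would make a mismatch like the one above easier to diagnose.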

Is the rationale for developing the new software tool clearly explained?

Yes
Is the description of the software tool technically sound?

Yes
Are sufficient details of the code, methods and analysis (if applicable) provided to allow replication of the software development and its use by others?

Partly
Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool?

Yes
Are the conclusions about the tool and its performance adequately supported by the findings presented in the article?

Partly

Competing Interests: No competing interests were disclosed.

Reviewer Expertise: Computational biology with specialism in biological databases and analysis of protein families.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

Author Response 15 Oct 2020

Roc Reguant, Department of Cell Biology, Harvard Medical School, Boston, USA

Manuscript changes:
Please fix capitalisation of PFAM to Pfam
Response: The manuscript has been updated to Pfam.

Software/Website changes:
The computing conservation box is annoyingly placed. It does not go away and covers the UMAP view significantly.
Response: This bug has now been fixed - the pop-up will disappear after the alignment has been loaded.

I really need an estimate of how long it will take for the pairwise identity to load. Some time passes…I didn’t realise I had to click on calculate to get the pairwise identity to show. Please UX test this page to make it easier/clearer for the user to understand what to do.
Response: We have added a button into the middle of the plot to indicate that an action needs to be taken by the user to begin calculating pairwise sequence identity.

For the top graph on the stats view the three colours used for max/average and min identity are hard to distinguish. Please use a broader colour palette. Please add axis legends for this graph and make it clearer what the title of this graph is. The title is very clear for the bottom plot.
Response: We have updated the color palette to increase the contrast, added axis labels and added a clear title.

Please add alternative colouring schemes for viewing the alignments. It makes a surprisingly big difference for experienced users ability to interpret an alignment. I personally like the ClustalX colouring scheme that is widely adopted, but there are other popular schemes it would be nice to incorporate as options.
Response: We agree alternative coloring schemes are useful and have added the clustal colors as a drop-down option on the main page as well as in the image view.

“Based on this and other small tests we can safely state that any computer that is able to run any modern web browser will be able to run any alignment that requires visualization”. This is overstating the case. Needs to be toned down. Try testing the software on the ABC transporter family (PF00005) alignment in Pfam. The NCBI alignment contains 2.6 million sequences. If it works seamlessly for that then you can safely state.
Response: We agree with the reviewer that the phrasing was optimistic and have removed the sentence from the main manuscript.

I tried to hook the viewer up to a Pfam alignment which is not very large (840 sequences). I got a box that said Fetching file…and no response beyond that.
https://alignmentviewer.org/?url=http://pfam.xfam.org/family/PF00571/alignment/seed/format?format=fasta&alnType=seed&order=t&case=l&gaps=default&download=0
also tried Stockholm format with no luck.
https://alignmentviewer.org/?url=http://pfam.xfam.org/family/PF00571/alignment/seed/format?format=stockholm&alnType=seed&order=t&case=l&gaps=default&download=0

Response: There are two technical limitations inherent to how requests happen in the browser. We take as an example your suggested URL: https://alignmentviewer.org/?url=http://pfam.xfam.org/family/PF00571/alignment/seed/format?format=stockholm&alnType=seed&order=t&case=l&gaps=default&download=0

Limitation 1: The alignmentviewer.org website is served securely over https, but the requested alignment (from Pfam) is addressed over an insecure connection (http). For security reasons, browsers do not allow an https page to load http resources (so-called "mixed content", see https://developer.mozilla.org/en-US/docs/Web/Security/Mixed_content).
Limitation 2: The string passed to the "url" query parameter (i.e., what follows "?url=") needs to be sanitized (percent-encoded) before being passed. More information about this can be found here: https://en.wikipedia.org/wiki/Query_string#URL_encoding. For clarity, we have updated the example in the main text to be sanitized / encoded.
In your example, the alignment lives at "http://pfam.xfam.org/family/PF00571/alignment/seed/format?format=stockholm&alnType=seed&order=t&case=l&gaps=default&download=0". Passed as-is, the string is ambiguous: the browser will interpret e.g. "&gaps=" as a query parameter for alignmentviewer.org rather than for pfam.xfam.org. Many languages offer built-in functions to sanitize strings to be passed to URL queries (e.g. in a JavaScript frontend application, the window.encodeURIComponent() function; an interactive website can be found here: https://www.url-encode-decode.com). Encoded this way, the string proposed for the alignment becomes: "http%3A%2F%2Fpfam.xfam.org%2Ffamily%2FPF00571%2Falignment%2Fseed%2Fformat%3Fformat%3Dstockholm%26alnType%3Dseed%26order%3Dt%26case%3Dl%26gaps%3Ddefault%26download%3D0".
Conclusion: considering the two inherent browser limitations mentioned above, your requests can be sanitized and fixed as follows:
https://alignmentviewer.org/?url=https%3A%2F%2Fpfam.xfam.org%2Ffamily%2FPF00571%2Falignment%2Fseed%2Fformat%3Fformat%3Dfasta%26alnType%3Dseed%26order%3Dt%26case%3Dl%26gaps%3Ddefault%26download%3D0
Again, note that for both URLs the protocol has been changed from "http" to "https", and that the string that follows "?url=" has been sanitized in Chrome's built-in console using the "window.encodeURIComponent(x)" function, where x is the string to be sanitized. We have added some text to the manuscript to highlight these requirements.
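
The two requirements described in this response can be illustrated with a minimal JavaScript sketch. It relies only on the standard URL and encodeURIComponent built-ins. The helper name buildViewerUrl is hypothetical and not part of AlignmentViewer, and the sketch assumes that the remote host serves the same path over https, as in the example above.

```javascript
// Hypothetical helper (not part of AlignmentViewer): builds a viewer link
// from the address of a remote alignment file.
function buildViewerUrl(alignmentUrl) {
  // Limitation 1: request the alignment over https to avoid mixed content.
  // Assumption: the remote host serves the same path over https.
  const secureUrl = alignmentUrl.replace(/^http:\/\//, "https://");
  // Limitation 2: percent-encode the whole address so that its own "?", "&"
  // and "=" characters are not read as part of the viewer's query string.
  return "https://alignmentviewer.org/?url=" + encodeURIComponent(secureUrl);
}

const pfamUrl =
  "http://pfam.xfam.org/family/PF00571/alignment/seed/format" +
  "?format=fasta&alnType=seed&order=t&case=l&gaps=default&download=0";

// Unencoded: the viewer's "url" parameter is cut off at the first "&",
// and "gaps" becomes a parameter of the viewer URL itself.
const naive = new URL("https://alignmentviewer.org/?url=" + pfamUrl);
const truncated = naive.searchParams.get("url");
const strayGaps = naive.searchParams.get("gaps");

// Encoded: the full https address survives a round trip through the query.
const fixedLink = buildViewerUrl(pfamUrl);
const recovered = new URL(fixedLink).searchParams.get("url");

console.log(truncated);
console.log(strayGaps);
console.log(fixedLink);
console.log(recovered);
```

Parsing the unencoded link shows the ambiguity directly: everything after the first "&" is attributed to the viewer rather than to the Pfam address.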

I also tried with a Pfam family with just 3 sequences. Still with no luck.
I got the viewer to work by uploading a Pfam alignment in fasta format with 3 sequences. Then I tried uploading the seed alignment for the CBS domain downloaded from Pfam in fasta format. I then got the following error:
Parsing error: sequence #3 has different length.
The first few lines of the alignment are included below and the third sequence appears to be the same length as all the others.
>Q8EHI4_SHEON/79-134
I........H.Q...V........MTR........NP.........V.......TVAPY.
...VS.LDVAS....RT..L.....L....E.....HN..I.....G..CL.P.V.L...
.........ENG....D......LV.GIVTWKDLLRAYCA
>Q97V95_SULSO/193-248
V........L.D...A.......G.TK........NP.........I.......TINRY.
...YS.ILNAA....KL..M.....I....E.....KR..I.....G..TL.L.V.M...
.......E..NQ....K......LV.GIVTERDLMYAYIN
>Q6M020_METMP/269-321
V........K.E...I.........MS........PP.........V.......MVSPE.
...AA.LNELI....KG..M.....A....N.........T.....D..RV.Y.V.V...
.........DNG....N......IL.GIISKTDIVRTLSI
>Q72KI1_THET2/367-424
V........E.G...F.........LA........RA.........V.......VLPPS.
...TP.LSQVE....PR..L..........R.....EA..G.....G..RV.L.V.GE..
.......RAGEG...WR......LL.GIYTRTDLYRSAPK
>Q8TX96_METKA/244-298
A........R.N...Y.......L..Q........EM.........V.......VVPPE.
...TP.LHEAL....WE..V.........ID.....KM..S.....D..RI.Y.V.M...
..........DG.....R....KLT.GVVPLIDAVYTLAK
>Q836T2_ENTFA/77-132
V........Q.E...I........MSP........PL.................MVAQD.
...TS.IRDAI....TN..L................FMYDV.....G..SL.Y.V.M...
........DEAK....E......LL.GVLSRKDLLRASLN
>Q8XIV1_CLOPE/77-131
V........K.D...I.........MS........KP.........V.......TVCEE.
...TM.LHDAI....VH.LF.....L...ND.........V.....G..TM.F.VEN...
..........GG....V......LT.GAVSRKDFLKVAIG
If I delete the third sequence and re-upload I get exactly the same error.
For info I carried out tests with Firefox for Mac version 74.0 and Safari Version 13.1
Response: Thank you for pointing out this issue. This bug occurred when refreshing an internal variable, and has now been fixed.
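
For readers who want to check an alignment before uploading it, the following JavaScript sketch shows one way to read wrapped (multi-line) FASTA records and confirm that all aligned sequences have the same length. It is an illustration only, not the AlignmentViewer parser. The function names parseFasta and firstLengthMismatch and the toy sequences are hypothetical.

```javascript
// Hypothetical sketch (not the AlignmentViewer parser): reads FASTA text in
// which each sequence may be wrapped over several lines.
function parseFasta(text) {
  const records = [];
  let current = null;
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "") continue; // skip blank lines
    if (line.startsWith(">")) {
      // Header line: start a fresh record with an empty sequence buffer.
      current = { id: line.slice(1), sequence: "" };
      records.push(current);
    } else if (current !== null) {
      // Sequence line: append to the record that is currently open.
      current.sequence += line;
    }
  }
  return records;
}

// Returns the 1-based number of the first sequence whose length differs
// from that of the first sequence, or null if all lengths agree.
function firstLengthMismatch(records) {
  if (records.length === 0) return null;
  const expected = records[0].sequence.length;
  for (let i = 1; i < records.length; i++) {
    if (records[i].sequence.length !== expected) return i + 1;
  }
  return null;
}

// Toy alignment (made up): three sequences of 8 columns, wrapped at 5.
const aligned = [
  ">seq1/1-5", "AC..G", "T.A",
  ">seq2/1-4", "A.C.G", "T..",
  ">seq3/1-4", "A...G", "T.C",
].join("\n");

// Same alignment with the last column of the third sequence missing.
const broken = aligned.slice(0, -1);

const alignedRecords = parseFasta(aligned);
const alignedCheck = firstLengthMismatch(alignedRecords);
const brokenCheck = firstLengthMismatch(parseFasta(broken));

console.log(alignedRecords.map((r) => r.sequence));
console.log(alignedCheck, brokenCheck);
```

A check of this kind reports a length problem only when the concatenated sequence, rather than an individual wrapped line, differs in length from the others.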
Competing Interests: No competing interests were disclosed.

Reviewer Report 09 Apr 2020

Erik Larsson Lekholm, Department of Medical Biochemistry and Cell Biology, Institute of Biomedicine, The Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden

Approved

https://doi.org/10.5256/f1000research.24533.r61767

This Software Tool article, describing a web-based sequence alignment viewer (AlignmentViewer), is well-written and very clearly presented. The introduction nicely motivates the need for this tool, given the many other available alternatives. The main features of the software are clearly explained in the article. The tool in itself seems useful and ran quickly and efficiently when tested on sample data. I have only a few minor comments that may be considered by the authors:

Minor:

A little bit of detail may be added to the “Sequence space” section. “Two- or three-dimensional sequence space” is at first a bit confusing, as no information has been given at that stage as to what it means. How is UMAP applied to the Hamming distances? Is each sequence first represented by a vector of Hamming distances, with each element corresponding to one of the included sequences? Maybe this can be clarified with a few additional words.
The following sentence is somewhat confusing: "Figure 1 shows the main functionalities from AlignmentViewer (Reguant, 2020) explained in more detail in the next subsections.”
"This view lets the user examine in depth the alignment.” - should it be “examine the alignment in depth"?
When loading an alignment using a somewhat older version of Safari (12.1.2), the “computing conservation” window does not disappear after the operation finishes. This problem was absent in Chrome.

Is the rationale for developing the new software tool clearly explained?
Yes

Is the description of the software tool technically sound?
Yes

Are sufficient details of the code, methods and analysis (if applicable) provided to allow replication of the software development and its use by others?
Yes

Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool?
Yes

Are the conclusions about the tool and its performance adequately supported by the findings presented in the article?
Yes

Competing Interests: I was a postdoc in Chris Sander's lab at MSKCC, NYC, during 2009-2011. I confirm that this competing interest has not affected my ability to write an objective and unbiased review of the article.

Reviewer Expertise: Genomics, bioinformatics

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Author Response 30 Apr 2020

Chris Sander, Department of Cell Biology, Harvard Medical School, Boston, USA

Thank you for your comments and suggestions - they are spot on.
We will work on taking these into account on the web site and in the next version of the manuscript.
Chris Sander - DFCI and HMS
Competing Interests: No competing interests were disclosed.

Author Response 15 Oct 2020

Roc Reguant, Department of Cell Biology, Harvard Medical School, Boston, USA

A little bit of detail may be added to the “Sequence space” section. “Two- or three-dimensional sequence space” is at first a bit confusing, as no information has been given at that stage as to what it means. How is UMAP applied to the Hamming distances? Is each sequence first represented by a vector of Hamming distances, with each element corresponding to one of the included sequences? Maybe this can be clarified with a few additional words.
Response: We agree that a bit more detail is warranted and have updated the manuscript to describe the use of the Hamming distance between individual sequences as the distance metric (parameter) for UMAP.
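
As a rough illustration of the distance metric named in this response, the JavaScript sketch below computes the Hamming distance between equal-length aligned sequences and assembles a symmetric pairwise distance matrix. The function names and toy sequences are hypothetical. How AlignmentViewer encodes sequences and hands the metric to its UMAP implementation is not specified here, so the sketch stops at the distances themselves.

```javascript
// Hamming distance: number of alignment columns at which two aligned
// sequences differ. Both sequences must have the same length.
function hammingDistance(a, b) {
  if (a.length !== b.length) {
    throw new Error("Aligned sequences must have equal length");
  }
  let distance = 0;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) distance++;
  }
  return distance;
}

// Symmetric matrix of pairwise Hamming distances (zero on the diagonal).
function pairwiseDistances(sequences) {
  const n = sequences.length;
  const matrix = Array.from({ length: n }, () => new Array(n).fill(0));
  for (let i = 0; i < n; i++) {
    for (let j = i + 1; j < n; j++) {
      const d = hammingDistance(sequences[i], sequences[j]);
      matrix[i][j] = d;
      matrix[j][i] = d;
    }
  }
  return matrix;
}

// Toy aligned sequences (made up), with "." as the gap character.
const toySequences = ["MK.LV", "MR.LV", "MKALI"];
const distanceMatrix = pairwiseDistances(toySequences);

console.log(distanceMatrix);
```

In this reading, each sequence is not turned into a vector of distances by hand; instead the pairwise distance is what a UMAP implementation that accepts a custom or precomputed metric would use to build its neighbor graph.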

The following sentence is somewhat confusing: "Figure 1 shows the main functionalities from AlignmentViewer (Reguant, 2020) explained in more detail in the next subsections.”
Response: We have updated the manuscript to make it more understandable. The citation to Reguant 2020 was also confusing (it points to the codebase) so we moved that codebase DOI to a single instance at the end of the manuscript.

"This view lets the user examine in depth the alignment.” - should it be “examine the alignment in depth"?
Response: The manuscript has been corrected.

When loading an alignment using a somewhat older version of Safari (12.1.2), the “computing conservation” window does not disappear after the operation finishes. This problem was absent in Chrome.
Response: Thank you for noticing this bug. We have now corrected the issue and the dialog disappears when using Safari.
Competing Interests: No competing interests were disclosed.
Report a concern
Respond or Comment

Open Peer Review

Reviewer Reports

Version	Date	Invited Reviewer 1	Invited Reviewer 2
Version 2 (revision)	15 Oct 2020	-	report
Version 1	27 Mar 2020	report	report

Invited Reviewers
1. Erik Larsson Lekholm, University of Gothenburg, Gothenburg, Sweden
2. Alex Bateman, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK

Reviewer Report

16 Oct 2020 | for Version 2

Alex Bateman, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK

Approved

Competing Interests

No competing interests were disclosed.

Reviewer Expertise

Computational biology with specialism in biological databases and analysis of protein families.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Reviewer Report

16 Apr 2020 | for Version 1

Alex Bateman, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK

Approved With Reservations

Is the rationale for developing the new software tool clearly explained?
Yes

Is the description of the software tool technically sound?
Yes

Are sufficient details of the code, methods and analysis (if applicable) provided to allow replication of the software development and its use by others?
Partly

Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool?
Yes

Are the conclusions about the tool and its performance adequately supported by the findings presented in the article?
Partly

Competing Interests

No competing interests were disclosed.

Reviewer Expertise

Computational biology with specialism in biological databases and analysis of protein families.

Responses (1)

Author Response

15 Oct 2020

Roc Reguant, Department of Cell Biology, Harvard Medical School, Boston, USA

Manuscript changes:
Please fix capitalisation of PFAM to Pfam
Response: The manuscript has been updated to Pfam.

Software/Website changes:
The computing conservation box is annoyingly placed. It does not go away and covers the UMAP view significantly.
Response: This bug has now been fixed - the pop-up will disappear after the alignment has been loaded.

I really need an estimate of how long it will take for the pairwise identity to load. Some time passes…I didn’t realise I had to click on calculate to get the pairwise identity to show. Please UX test this page to make it easier/clearer for the user to understand what to do.
Response: We have added a button into the middle of the plot to indicate that an action needs to be taken by the user to begin calculating pairwise sequence identity.

For the top graph on the stats view the three colours used for max/average and min identity are hard to distinguish. Please use a broader colour palette. Please add axis legends for this graph and make it clearer what the title of this graph is. The title is very clear for the bottom plot.
Response: We have updated the color palette to increase the contrast, added axis labels and added a clear title.

Please add alternative colouring schemes for viewing the alignments. It makes a surprisingly big difference for experienced users' ability to interpret an alignment. I personally like the ClustalX colouring scheme that is widely adopted, but there are other popular schemes it would be nice to incorporate as options.
Response: We agree alternative coloring schemes are useful and have added the clustal colors as a drop-down option on the main page as well as in the image view.

“Based on this and other small tests we can safely state that any computer that is able to run any modern web browser will be able to run any alignment that requires visualization”. This is overstating the case. Needs to be toned down. Try testing the software on the ABC transporter family (PF00005) alignment in Pfam. The NCBI alignment contains 2.6 million sequences. If it works seamlessly for that then you can safely state.
Response: We agree with the reviewer that the phrasing was optimistic and have removed the sentence from the main manuscript.

I tried to hook the viewer up to a Pfam alignment which is not very large (840 sequences). I got a box that said Fetching file…and no response beyond that.
https://alignmentviewer.org/?url=http://pfam.xfam.org/family/PF00571/alignment/seed/format?format=fasta&alnType=seed&order=t&case=l&gaps=default&download=0
also tried Stockholm format with no luck.
https://alignmentviewer.org/?url=http://pfam.xfam.org/family/PF00571/alignment/seed/format?format=stockholm&alnType=seed&order=t&case=l&gaps=default&download=0

Response: There are two technical limitations inherent to how requests happen in the browser. We take as an example your suggested URL: https://alignmentviewer.org/?url=http://pfam.xfam.org/family/PF00571/alignment/seed/format?format=stockholm&alnType=seed&order=t&case=l&gaps=default&download=0

Limitation 1: The alignmentviewer.org website is served over a secure connection (https), but the requested alignment (from Pfam) is addressed over an insecure one (http). For security reasons, browsers do not allow this mixing of protocols (also called "mixed content", see https://developer.mozilla.org/en-US/docs/Web/Security/Mixed_content).
Limitation 2: The string passed to the query parameter "url" (i.e., what follows "?url=") needs to be sanitized (percent-encoded) before being passed. More information about this can be found here: https://en.wikipedia.org/wiki/Query_string#URL_encoding. For clarity, we have updated the example in the main text to be sanitized / encoded.
In your example, the alignment lives at "http://pfam.xfam.org/family/PF00571/alignment/seed/format?format=stockholm&alnType=seed&order=t&case=l&gaps=default&download=0". Passing the string as-is confuses the browser, which will interpret e.g. "&gaps=" as a query parameter for alignmentviewer rather than for pfam.xfam.org. Many languages offer built-in functions to sanitize strings that are passed in URL queries (e.g., the window.encodeURIComponent() function in a JavaScript frontend application; an interactive website can be found here: https://www.url-encode-decode.com). Encoding the string proposed for the alignment results in: "http%3A%2F%2Fpfam.xfam.org%2Ffamily%2FPF00571%2Falignment%2Fseed%2Fformat%3Fformat%3Dstockholm%26alnType%3Dseed%26order%3Dt%26case%3Dl%26gaps%3Ddefault%26download%3D0".
Conclusion: Considering the two inherent browser limitations mentioned above, your requests can be sanitized and fixed as follows:
https://alignmentviewer.org/?url=https%3A%2F%2Fpfam.xfam.org%2Ffamily%2FPF00571%2Falignment%2Fseed%2Fformat%3Fformat%3Dfasta%26alnType%3Dseed%26order%3Dt%26case%3Dl%26gaps%3Ddefault%26download%3D0
Again, note that for both URLs the protocol has been changed to "https" instead of "http", and that the string that follows "?url=" has been sanitized using Chrome's built-in console and the "window.encodeURIComponent(x)" function, where x is the string to be sanitized. We have added some text to the manuscript to highlight these requirements.
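The two requirements above can be illustrated with a minimal JavaScript sketch. The helper name buildViewerLink is hypothetical and is not part of AlignmentViewer. The sketch also assumes that the alignment host serves the same file over https, and it relies only on the standard encodeURIComponent function.

```javascript
// Hypothetical helper (not part of AlignmentViewer): builds a viewer link
// for an alignment that is hosted on another server.
function buildViewerLink(alignmentUrl, viewerBase = "https://alignmentviewer.org/") {
  // Limitation 1: a page served over https cannot fetch http resources,
  // so upgrade the protocol (assumes the host also answers over https).
  const secureUrl = alignmentUrl.replace(/^http:\/\//i, "https://");
  // Limitation 2: percent-encode the whole URL so that its own "?" and "&"
  // are not read as part of the viewer's query string.
  return `${viewerBase}?url=${encodeURIComponent(secureUrl)}`;
}

const pfamUrl =
  "http://pfam.xfam.org/family/PF00571/alignment/seed/format" +
  "?format=fasta&alnType=seed&order=t&case=l&gaps=default&download=0";

const viewerLink = buildViewerLink(pfamUrl);
console.log(viewerLink);
```

For the FASTA request, the printed link is identical to the corrected URL given in the conclusion above.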

I also tried with a Pfam family with just 3 sequences. Still with no luck.
I got the viewer to work by uploading a Pfam alignment in fasta format with 3 sequences. Then I tried uploading the seed alignment for the CBS domain downloaded from Pfam in fasta format. I then got the following error:
Parsing error: sequence #3 has different length.
The first few lines of the alignment are included below and the third sequence appears to be the same length as all the others.
>Q8EHI4_SHEON/79-134
I........H.Q...V........MTR........NP.........V.......TVAPY.
...VS.LDVAS....RT..L.....L....E.....HN..I.....G..CL.P.V.L...
.........ENG....D......LV.GIVTWKDLLRAYCA
>Q97V95_SULSO/193-248
V........L.D...A.......G.TK........NP.........I.......TINRY.
...YS.ILNAA....KL..M.....I....E.....KR..I.....G..TL.L.V.M...
.......E..NQ....K......LV.GIVTERDLMYAYIN
>Q6M020_METMP/269-321
V........K.E...I.........MS........PP.........V.......MVSPE.
...AA.LNELI....KG..M.....A....N.........T.....D..RV.Y.V.V...
.........DNG....N......IL.GIISKTDIVRTLSI
>Q72KI1_THET2/367-424
V........E.G...F.........LA........RA.........V.......VLPPS.
...TP.LSQVE....PR..L..........R.....EA..G.....G..RV.L.V.GE..
.......RAGEG...WR......LL.GIYTRTDLYRSAPK
>Q8TX96_METKA/244-298
A........R.N...Y.......L..Q........EM.........V.......VVPPE.
...TP.LHEAL....WE..V.........ID.....KM..S.....D..RI.Y.V.M...
..........DG.....R....KLT.GVVPLIDAVYTLAK
>Q836T2_ENTFA/77-132
V........Q.E...I........MSP........PL.................MVAQD.
...TS.IRDAI....TN..L................FMYDV.....G..SL.Y.V.M...
........DEAK....E......LL.GVLSRKDLLRASLN
>Q8XIV1_CLOPE/77-131
V........K.D...I.........MS........KP.........V.......TVCEE.
...TM.LHDAI....VH.LF.....L...ND.........V.....G..TM.F.VEN...
..........GG....V......LT.GAVSRKDFLKVAIG
If I delete the third sequence and re-upload I get exactly the same error.
For info I carried out tests with Firefox for Mac version 74.0 and Safari Version 13.1
Response: Thank you for pointing out this issue. This bug occurred when refreshing an internal variable, and has now been fixed.
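For illustration only, the following JavaScript sketch shows one way a parser can avoid this class of error: a fresh record is started at every header line, and wrapped sequence lines are joined before lengths are compared. This is not the AlignmentViewer code; the function name parseAlignedFasta, the record fields, and the toy input are assumptions made for the example.

```javascript
// Minimal aligned-FASTA parser sketch; all names here are hypothetical.
function parseAlignedFasta(text) {
  const records = [];
  let current = null;
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "") continue; // skip blank lines
    if (line.startsWith(">")) {
      // Each header starts a fresh record, so no state carries over
      // from the previous sequence.
      current = { id: line.slice(1), sequence: "" };
      records.push(current);
    } else if (current !== null) {
      // Sequence data may wrap over several lines; join before measuring.
      current.sequence += line;
    } else {
      throw new Error("Parsing error: sequence data before the first header.");
    }
  }
  if (records.length === 0) {
    throw new Error("Parsing error: no sequences found.");
  }
  // All aligned sequences must span the same number of columns.
  const expectedLength = records[0].sequence.length;
  records.forEach((record, index) => {
    if (record.sequence.length !== expectedLength) {
      throw new Error(`Parsing error: sequence #${index + 1} has different length.`);
    }
  });
  return records;
}

// Toy input (not real data): each sequence is wrapped over two lines.
const wrapped = [">toy1", "AC.D", "EF", ">toy2", "AC.D", ".F"].join("\n");
const parsed = parseAlignedFasta(wrapped);
console.log(parsed.map((record) => `${record.id}: ${record.sequence.length} columns`));
```

Lengths are compared only after all lines of a record have been joined, so a line-wrapped alignment such as the one quoted above is measured per sequence rather than per line.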

Competing Interests

No competing interests were disclosed.

Reviewer Report

09 Apr 2020 | for Version 1

Erik Larsson Lekholm, Department of Medical Biochemistry and Cell Biology, Institute of Biomedicine, The Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden

Approved

A little bit of detail may be added to the “Sequence space” section. “Two- or three-dimensional sequence space” is at first a bit confusing, as no information has been given at that stage as to what it means. How is UMAP applied to the Hamming distances? Is each sequence first represented by a vector of Hamming distances, with each element corresponding to one of the included sequences? Maybe this can be clarified with a few additional words.
The following sentence is somewhat confusing: "Figure 1 shows the main functionalities from AlignmentViewer (Reguant, 2020) explained in more detail in the next subsections.”
"This view lets the user examine in depth the alignment.” - should it be “examine the alignment in depth"?
When loading an alignment using a somewhat older version of Safari (12.1.2), the “computing conservation” window does not disappear after the operation finishes. This problem was absent in Chrome.

Is the rationale for developing the new software tool clearly explained?
Yes

Is the description of the software tool technically sound?
Yes

Are sufficient details of the code, methods and analysis (if applicable) provided to allow replication of the software development and its use by others?
Yes

Is sufficient information provided to allow interpretation of the expected output datasets and any results generated using the tool?
Yes

Are the conclusions about the tool and its performance adequately supported by the findings presented in the article?
Yes

Competing Interests

I was a postdoc in Chris Sander's lab at MSKCC, NYC, during 2009-2011. I confirm that this competing interest has not affected my ability to write an objective and unbiased review of the article.

Reviewer Expertise

Genomics, bioinformatics

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Responses (2)

Author Response

30 Apr 2020

Chris Sander, Department of Cell Biology, Harvard Medical School, Boston, USA

Thank you for your comments and suggestions - they are spot on.
We will work on taking these into account on the web site and in the next version of the manuscript.
Chris Sander - DFCI and HMS

Competing Interests

No competing interests were disclosed.

Author Response

15 Oct 2020

Roc Reguant, Department of Cell Biology, Harvard Medical School, Boston, USA

A little bit of detail may be added to the “Sequence space” section. “Two- or three-dimensional sequence space” is at first a bit confusing, as no information has been given at that stage as to what it means. How is UMAP applied to the Hamming distances? Is each sequence first represented by a vector of Hamming distances, with each element corresponding to one of the included sequences? Maybe this can be clarified with a few additional words.
Response: We agree that a bit more detail is warranted and have updated the manuscript, which now describes using the Hamming distance between individual sequences as the distance metric (parameter) for UMAP.
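As a rough illustration of the metric named in this response, the JavaScript sketch below counts the alignment columns at which two aligned sequences differ and collects the counts into a pairwise matrix. The function names and toy sequences are hypothetical, and treating a gap like any other symbol is an assumption; the tool itself may handle gaps differently. How the distances are handed to a UMAP implementation depends on the library used and is not shown.

```javascript
// Hamming distance between two aligned sequences of equal length:
// the number of columns at which they differ. Gaps are compared like
// any other character here, which is an assumption of this sketch.
function hammingDistance(a, b) {
  if (a.length !== b.length) {
    throw new Error("Aligned sequences must have equal length.");
  }
  let mismatches = 0;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) mismatches++;
  }
  return mismatches;
}

// Symmetric matrix of pairwise distances; entry [i][j] is the distance
// between sequence i and sequence j.
function pairwiseDistances(sequences) {
  const n = sequences.length;
  const matrix = Array.from({ length: n }, () => new Array(n).fill(0));
  for (let i = 0; i < n; i++) {
    for (let j = i + 1; j < n; j++) {
      const distance = hammingDistance(sequences[i], sequences[j]);
      matrix[i][j] = distance;
      matrix[j][i] = distance;
    }
  }
  return matrix;
}

// Toy aligned sequences (not real data).
const toyAlignment = ["ACDE-", "ACDF-", "TCDFG"];
const distances = pairwiseDistances(toyAlignment);
console.log(distances); // [[0, 1, 3], [1, 0, 2], [3, 2, 0]]
```

In this reading, each sequence is a single point and the Hamming distance is the dissimilarity between two points; a sequence is not first converted into a vector of distances to all other sequences.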

The following sentence is somewhat confusing: "Figure 1 shows the main functionalities from AlignmentViewer (Reguant, 2020) explained in more detail in the next subsections.”
Response: We have updated the manuscript to make it more understandable. The citation to Reguant 2020 was also confusing (it points to the codebase) so we moved that codebase DOI to a single instance at the end of the manuscript.

"This view lets the user examine in depth the alignment.” - should it be “examine the alignment in depth"?
Response: The manuscript has been corrected.

When loading an alignment using a somewhat older version of Safari (12.1.2), the “computing conservation” window does not disappear after the operation finishes. This problem was absent in Chrome.
Response: Thank you for noticing this bug. We have now corrected the issue and the dialog disappears when using Safari.

Competing Interests

No competing interests were disclosed.

Alongside their report, reviewers assign a status to the article:

Approved - the paper is scientifically sound in its current form and only minor, if any, improvements are suggested

Approved with reservations - a number of small changes, or sometimes more significant revisions, are required to address specific details and improve the paper's academic merit

Not approved - fundamental flaws in the paper seriously undermine the findings and conclusions

[1] Brown NP, Leroy C, Sander C, et al.: MView: a web-compatible database search or multiple alignment viewer. Bioinformatics. 1998; 14(4): 380–381.

[2] Goldberg T, Hecht M, Hamp T, et al.: LocTree3 prediction of localization. Nucleic Acids Res. 2014; 42(Web Server issue): W350–W355.

[3] Hopf TA, Green AG, Schubert B, et al.: The EVcouplings Python framework for coevolutionary sequence analysis. Bioinformatics. 2019; 35(9): 1582–1584.

[4] Larsson A: AliView: a fast and lightweight alignment viewer and editor for large datasets. Bioinformatics. 2014; 30(22): 3276–3278.

[5] Marks DS, Colwell LJ, Sheridan R, et al.: Protein 3D structure computed from evolutionary sequence variation. PLoS One. 2011; 6(12): e28766.

[6] McInnes L, Healy J, Melville J: UMAP: Uniform manifold approximation and projection for dimension reduction. arXiv preprint. arXiv:1802.03426. 2018.

[7] Reguant R: Alignmentviewer (Version v2.1). Zenodo. 2020. http://www.doi.org/10.5281/zenodo.4063551

[8] Tamames J, Ouzounis C, Casari G, et al.: EUCLID: automatic classification of proteins in functional classes by their database annotations. Bioinformatics. 1998; 14(6): 542–543.

[9] Toth-Petroczy A, Palmedo P, Ingraham J, et al.: Structured States of Disordered Proteins from Genomic Sequences. Cell. 2016; 167(1): 158–170.e12.

[10] Veidenberg A, Medlar A, Löytynoja A: Wasabi: An Integrated Platform for Evolutionary Sequence Analysis and Data Visualization. Mol Biol Evol. 2016; 33(4): 1126–1130.

[11] Waterhouse AM, Procter JB, Martin DM, et al.: Jalview Version 2--a multiple sequence alignment editor and analysis workbench. Bioinformatics. 2009; 25(9): 1189–1191.

[12] Yachdav G, Wilzbach S, Rauscher B, et al.: MSAViewer: interactive JavaScript visualization of multiple sequence alignments. Bioinformatics. 2016; 32(22): 3501–3503.
