FAQ: VNTRfinder and PolyPredictR

What does this webserver do?

Have you any examples?

What if I don’t get a tandem repeat detected when I expect one?

What if a repeat detected in the reference is not detected in the target sequences?

What if I find a variant but I suspect that the target sequence containing a repeat is not homologous to the repeat region?

I’m not sure how to fill in the query form, is there a tutorial?

How long does a typical search take?

A repeat I know to be polymorphic has not been predicted to be so?

 

If your question(s) are not addressed here or in the documentation, please send your query to vntrfinder@gmail.com

 

What does this webserver do?

Two things:

Firstly, it can predict potentially polymorphic repeats in one or more sequences (the reference sequence(s)). To use only this function, enter sequences in the first sequence box only.

Secondly, it can search for repeat variation by aligning repeats from the reference sequence(s) with repeats in one or more target sequences. To use this function, enter sequences in both the first and second sequence boxes.

Have you any examples?

Examples (Test1 and Test2) are available on the main search page. Examples are also presented and discussed on the examples page.

What if I don’t get a tandem repeat detected when I expect one?

Try relaxing the Tandem Repeats Finder parameters by lowering the minscore, mismatch or indel values.
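For users who run Tandem Repeats Finder locally, the effect of relaxing these parameters can be sketched as follows. This is only an illustration. It assumes the standalone TRF command-line order (File, Match, Mismatch, Delta, PM, PI, Minscore, MaxPeriod) and the commonly used values 2 7 7 80 10 50 500. The helper name trf_command, the executable name "trf", the file name reference.fa and the relaxed values are hypothetical and are not settings taken from this webserver.

```python
def trf_command(fasta_path, match=2, mismatch=7, delta=7,
                pm=80, pi=10, minscore=50, maxperiod=500):
    """Build an argument list for a local Tandem Repeats Finder run.

    Assumed positional order: File Match Mismatch Delta PM PI Minscore MaxPeriod.
    The executable name "trf" is an assumption; it depends on your installation.
    """
    values = [match, mismatch, delta, pm, pi, minscore, maxperiod]
    # -d asks TRF for a data file, -h suppresses the HTML report
    return ["trf", fasta_path] + [str(v) for v in values] + ["-d", "-h"]


default_cmd = trf_command("reference.fa")

# Relaxed run: lower mismatch and indel (delta) penalties, lower minimum score.
# These particular values are illustrative only.
relaxed_cmd = trf_command("reference.fa", mismatch=5, delta=5, minscore=30)
print(" ".join(relaxed_cmd))
```

The resulting list can be passed to subprocess.run if TRF is installed locally. Lower penalties and a lower minimum score let shorter or less perfect repeats through, at the cost of more spurious hits.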

What if a repeat detected in the reference is not detected in the target sequences?

(a)       Try lowering the size of the required flanks, because there may be insufficient sequence flanking a tandem repeat.

(b)       Try increasing the flanks as well as the mismatch parameter:

The VNTRfinder program starts with a mismatch frequency of “0” and automatically increases it until it finds a match. If there are multiple possible matches, the program does not return a result, on the grounds that the mapping is ambiguous; only the positions of unambiguous matches are returned, in the third output file (“repeats detected in the reference(s) and found in the target(s)”). If you suspect an ambiguous mapping, edit down the query and target sequences to restrict the analysis to regions that exclude the ambiguous matches that you are certain should not match; however, first make sure that the remaining sequences are truly homologous.
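The search strategy described above can be sketched roughly as follows. This is not the VNTRfinder code: it assumes a single flank compared by simple substitution counting (Hamming distance, no indels), and the function names and the max_mismatches cap are hypothetical.

```python
def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_unambiguous(flank, target, max_mismatches=5):
    """Return (position, mismatches) for a unique hit, or None.

    The allowed number of mismatches starts at 0 and is raised one step
    at a time. The search stops at the first level that gives any hit:
    one hit is reported, several hits are treated as ambiguous.
    """
    n = len(flank)
    for k in range(max_mismatches + 1):
        hits = [i for i in range(len(target) - n + 1)
                if hamming(flank, target[i:i + n]) <= k]
        if len(hits) == 1:
            return hits[0], k
        if len(hits) > 1:
            return None  # ambiguous mapping, so no result is returned
    return None  # nothing found within the allowed mismatches


unique_hit = find_unambiguous("ACGTAC", "TTACGTTCGG")
ambiguous_hit = find_unambiguous("ACGTAC", "ACGTACTTACGTAC")
print(unique_hit, ambiguous_hit)
```

In the second call the flank occurs twice in the target, so no position is reported. Trimming the target to exclude one of the two copies would make the hit unique, which is the effect of editing down the sequences.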

What if I find a variant but I suspect that the target sequence containing a repeat is not homologous to the repeat region?

The program proceeds until it finds a single unambiguous hit. When inspecting the output, pay close attention to the final number of mismatches between the query and target, which appears in the third-last column of the third output file (“repeats detected in the reference(s) and found in the target(s)”). Consider how many mismatches seem reasonable given the evolutionary distance you would expect between the query and target. The program offers no guarantee that the matching target and query regions are truly homologous; it simply finds the best-matching sequence.

 

In addition, the query form allows you to choose one of four criteria that a match in the target must satisfy: (a) the length difference is consistent with a change in the repeat copy number, (b) the match has the same repeat unit length and motif, (c) the match contains any repetitive sequence, or (d) the match is any sequence (i.e. no check). Choosing a stricter option such as (a) or (b) instead of (d) increases the stringency of the search for homology, and only matches that satisfy the chosen criterion are reported; e.g. with option (b), only matches containing the same tandem repeat unit length and motif.
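The four options can be pictured as progressively looser filters on a candidate match. The sketch below is an interpretation, not the webserver's implementation. The Region structure and the passes function are hypothetical, option (a) is assumed to mean that the length difference is a whole multiple of the reference repeat unit length, and motifs are compared as exact strings.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Region:
    length: int                 # length of the matched region in bp
    unit_length: Optional[int]  # repeat unit length, None if no repeat found
    motif: Optional[str]        # repeat motif, None if no repeat found


def passes(option, ref, tgt):
    """Check a target region against the chosen option (a, b, c or d)."""
    if option == "d":
        return True  # no check at all
    if option == "c":
        return tgt.unit_length is not None  # any repetitive sequence
    if option == "b":
        return (tgt.unit_length is not None
                and tgt.unit_length == ref.unit_length
                and tgt.motif == ref.motif)
    if option == "a":
        # Assumed reading: the length difference is a whole number of repeat units
        return (tgt.length - ref.length) % ref.unit_length == 0
    raise ValueError("option must be one of a, b, c, d")


ref = Region(length=30, unit_length=3, motif="CAG")
expanded = Region(length=36, unit_length=3, motif="CAG")    # two extra copies
unrelated = Region(length=35, unit_length=None, motif=None)  # no repeat found

print([passes(o, ref, expanded) for o in "abcd"])
print([passes(o, ref, unrelated) for o in "abcd"])
```

Under these assumptions the expanded region passes every option, while the unrelated region passes only option (d), which illustrates why the stricter options give more confidence that the matched regions are homologous.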

I’m not sure how to fill in the query form, is there a tutorial?

There is a tutorial for Internet Explorer users at http://www.bioinformatics.rcsi.ie/vntrfinder/Tutorial.mht. Users of other browsers can download it at http://www.bioinformatics.rcsi.ie/vntrfinder/Tutorial.pps. See also the documentation at http://www.bioinformatics.rcsi.ie/vntrfinder/documentation.html.

How long does a typical search take?

If you are comparing two genes of about 4 kb each, the search will take about 5 seconds. The example files take less than 20 seconds. Genomes will take much longer, and you are advised to download the programs and run them locally for large jobs. A search can also take a little longer than normal if you are searching against more than one target sequence or genome.

A repeat I know to be polymorphic has not been predicted to be so?

PolyPredictR predicts whether a repeat is polymorphic using rules described in the previous literature. As discussed in that literature, the rules are not error-free (rates of false positives are reported). Also, the predictions are based on repeats in human sequences and may not be applicable to all genomes.

 

 
