Using IQ-TREE: Ultra-Fast Construction of Evolutionary Trees by Maximum Likelihood

I've known about IQ-tree for a long time, but I have not been using it. The main reason is that my usual tool for fast ML tree construction is FastTree, and when accuracy matters I run RAxML on the server, so I never had a need for it. However, probably the biggest advantage of IQ-tree is that it supports automatic selection of the substitution model, which would indeed save a lot of work. Since I've had a little more time recently, I decided to go through the IQ-tree documentation.

IQ-tree officially seems to be only available in a multi-threaded version now ....

If you're in a hurry, just scroll to the end.
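For reference, the most basic run might look like the sketch below. It assumes an IQ-TREE 1.x binary named iqtree on the PATH and uses example.phy as a placeholder alignment; the guard only prints the command if either is missing.

```shell
# Assumption: IQ-TREE 1.x is installed as `iqtree`; example.phy is a placeholder alignment.
cmd="iqtree -s example.phy"
if command -v iqtree >/dev/null 2>&1 && [ -f example.phy ]; then
    $cmd                      # build an ML tree from the alignment
else
    echo "skipped: $cmd"      # nothing to run on this machine
fi
```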

The -s parameter takes the input multiple sequence alignment. A basic run with only -s produces three output files:

example.phy.iqtree records relatively specific information about the construction of the evolutionary tree.

example.phy.treefile records the Newick text of the constructed tree. This should be the most important output file.

example.phy.log is the run log, mainly useful to the authors for debugging the software.

The authors mention in the documentation that this is a very important file.

This is a very resourceful operation ....

IQ-tree saves the result of each successfully completed step while it runs. In other words, if a run is interrupted, it restarts from the last checkpoint. This is a great benefit for large data sets. Sometimes, though, we just want to start from scratch, and then we need to add the -redo argument.
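A rerun from scratch could be sketched as follows, again assuming an iqtree 1.x binary and the placeholder alignment example.phy:

```shell
# Assumption: IQ-TREE 1.x as `iqtree`; example.phy is a placeholder alignment.
cmd="iqtree -s example.phy -redo"   # -redo ignores the checkpoint and overwrites earlier output
if command -v iqtree >/dev/null 2>&1 && [ -f example.phy ]; then
    $cmd
else
    echo "skipped: $cmd"
fi
```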

By default, IQ-tree uses the name of the input alignment file as the prefix of its output files. We can change this with the -pre argument.
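A sketch of a run with a custom prefix is shown here. The prefix run1 is a made-up name for illustration, and the 1.x option spelling is assumed (IQ-TREE 2 calls it --prefix).

```shell
# Assumption: IQ-TREE 1.x as `iqtree`; "run1" is a hypothetical prefix.
cmd="iqtree -s example.phy -pre run1"   # outputs become run1.iqtree, run1.treefile, run1.log
if command -v iqtree >/dev/null 2>&1 && [ -f example.phy ]; then
    $cmd
else
    echo "skipped: $cmd"
fi
```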

But in fact, I don't think it's necessary at all, unless you want to keep tweaking the tree-building parameters.

IQ-tree supports a wide range of substitution models for different types of input data, including DNA, protein, codon, binary and morphological data.

Setting the -m MFP parameter makes IQ-tree automatically test and choose the best substitution model.
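A minimal sketch of model selection followed by tree building, under the same assumptions (iqtree 1.x, placeholder alignment example.phy):

```shell
# Assumption: IQ-TREE 1.x as `iqtree`; example.phy is a placeholder alignment.
cmd="iqtree -s example.phy -m MFP"   # MFP: find the best-fit model, then build the tree with it
if command -v iqtree >/dev/null 2>&1 && [ -f example.phy ]; then
    $cmd
else
    echo "skipped: $cmd"
fi
```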

This parameter can actually be left out, because recent versions apply -m MFP by default.

Once the best substitution model has been estimated, an additional file is output: example.phy.model, which records the likelihood information for all tested models.

In fact, the best substitution model is also recorded in example.phy.iqtree. If the multiple sequence alignment has not changed, the best model will not change either, so you can specify it directly and skip the test. For example, if you already know that the best model is TIM2+I+G, you can pass it with -m TIM2+I+G.
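Reusing a known model might look like this sketch; TIM2+I+G is the model named in the text, the rest (binary name, file name) is assumed:

```shell
# Assumption: IQ-TREE 1.x as `iqtree`; example.phy is a placeholder alignment.
cmd="iqtree -s example.phy -m TIM2+I+G"   # fixed model, so no model test is run
if command -v iqtree >/dev/null 2>&1 && [ -f example.phy ]; then
    $cmd
else
    echo "skipped: $cmd"
fi
```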

Of course, sometimes you just want to see what the best substitution model is, without building an evolutionary tree, which is relatively time-consuming. In that case you can run model selection on its own.
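One way to do this, as a sketch: -m MF runs the model test without the subsequent tree search. Whether the original post used -m MF or the older -m TESTONLY is not recoverable, so treat the choice here as an assumption.

```shell
# Assumption: IQ-TREE 1.x (1.5.4 or later) as `iqtree`; example.phy is a placeholder alignment.
cmd="iqtree -s example.phy -m MF"   # MF: model selection only, no tree reconstruction
if command -v iqtree >/dev/null 2>&1 && [ -f example.phy ]; then
    $cmd
else
    echo "skipped: $cmd"
fi
```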

If computational resources allow, the best way is to add the -mtree parameter, which performs a full tree search for every candidate model and gives a more accurate result.
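Combined with a model-only run, this could be sketched as follows (same assumed binary and placeholder file):

```shell
# Assumption: IQ-TREE 1.x as `iqtree`; example.phy is a placeholder alignment.
cmd="iqtree -s example.phy -m MF -mtree"   # -mtree: full tree search per model, much slower
if command -v iqtree >/dev/null 2>&1 && [ -f example.phy ]; then
    $cmd
else
    echo "skipped: $cmd"
fi
```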

If your input data is SNP data, then you need to add +ASC (ascertainment bias correction) to the model.
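For SNP data the model string simply gets the +ASC suffix. In this sketch SNP_data.phy is a placeholder name for an alignment that contains only variable sites:

```shell
# Assumption: IQ-TREE 1.x as `iqtree`; SNP_data.phy is a placeholder SNP alignment.
cmd="iqtree -s SNP_data.phy -m MFP+ASC"   # model selection restricted to +ASC models
if command -v iqtree >/dev/null 2>&1 && [ -f SNP_data.phy ]; then
    $cmd
else
    echo "skipped: $cmd"
fi
```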

Of course, based on anecdotal evidence and personal experience, the NJ method seems to perform just as well as the ML method on SNP data.

There is only one true evolutionary history, and we always hope to recover it from limited sequence information. Whether we can recover it is one question; whether the sequences we use reflect that history truthfully and stably is another. The bootstrap (BS) is the most commonly used test of branch reliability, especially for trees built with the ML method. Its biggest problem is the computational logic: resample and rebuild, resample and rebuild, repeating until convergence or for a specified number of replicates, say 1000. The computation is heavy and time-consuming. The IQ-tree authors proposed a fast BS method (the ultrafast bootstrap) and eventually integrated it into IQ-tree. It is used through the -bb parameter.
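A fast BS run could be sketched like this. The 1000 replicates and the explicit -m MFP are my choices for illustration; IQ-TREE 2 spells the option -B instead of -bb.

```shell
# Assumption: IQ-TREE 1.x as `iqtree`; example.phy is a placeholder alignment.
cmd="iqtree -s example.phy -m MFP -bb 1000"   # -bb: ultrafast bootstrap with 1000 replicates
if command -v iqtree >/dev/null 2>&1 && [ -f example.phy ]; then
    $cmd
else
    echo "skipped: $cmd"
fi
```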

Notice that with these parameters, a MAXIMUM LIKELIHOOD TREE section is added to the output file example.phy.iqtree, where the specific BS results are recorded. The corresponding Newick text can be found in example.phy.treefile.

In addition, three output files are added

The authors remind

Personally, I actually tried IQ-tree very early on, but after two or three data sets I didn't think it performed as well as RAxML, so I stopped using it. The authors mention in the documentation that fast BS can overestimate BS values when the model is severely violated, and they recommend adding the -bnni parameter to the command.
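With -bnni added, the fast BS command might look like the sketch below (assumed binary name, placeholder alignment, 1000 replicates chosen for illustration):

```shell
# Assumption: IQ-TREE 1.6 or later as `iqtree`; example.phy is a placeholder alignment.
cmd="iqtree -s example.phy -m MFP -bb 1000 -bnni"   # -bnni: extra NNI optimisation of each bootstrap tree
if command -v iqtree >/dev/null 2>&1 && [ -f example.phy ]; then
    $cmd
else
    echo "skipped: $cmd"
fi
```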

Of course, the authors still provide the standard BS: instead of -bb, just use -b. It is an option when we are not in a hurry.
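A standard bootstrap run, sketched with 100 replicates (the replicate count is my assumption, not taken from the post):

```shell
# Assumption: IQ-TREE 1.x as `iqtree`; example.phy is a placeholder alignment.
cmd="iqtree -s example.phy -b 100"   # -b: standard nonparametric bootstrap, 100 replicates
if command -v iqtree >/dev/null 2>&1 && [ -f example.phy ]; then
    $cmd
else
    echo "skipped: $cmd"
fi
```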

In addition, IQ-tree supports other branch support methods, such as the SH-like approximate likelihood ratio test (SH-aLRT).
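The SH-like test is requested with -alrt. A sketch with 1000 replicates, under the usual assumptions about the binary name and the placeholder alignment:

```shell
# Assumption: IQ-TREE 1.x as `iqtree`; example.phy is a placeholder alignment.
cmd="iqtree -s example.phy -alrt 1000"   # -alrt: SH-like aLRT branch test, 1000 replicates
if command -v iqtree >/dev/null 2>&1 && [ -f example.phy ]; then
    $cmd
else
    echo "skipped: $cmd"
fi
```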

You can even run both branch support calculations at the same time.
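Combining the two might look like this sketch; each branch in the resulting tree then carries both support values:

```shell
# Assumption: IQ-TREE 1.x as `iqtree`; example.phy is a placeholder alignment.
cmd="iqtree -s example.phy -alrt 1000 -bb 1000"   # SH-aLRT and ultrafast bootstrap in one run
if command -v iqtree >/dev/null 2>&1 && [ -f example.phy ]; then
    $cmd
else
    echo "skipped: $cmd"
fi
```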

Well, as someone who basically doesn't do much evolutionary analysis, I seem to think that having a BS is enough, after all, this seems to be what most people care about.

Emm... I thought iqtree itself directly supported multithreading, but it seems from the documentation that a separate version of iqtree is required, iqtree-omp.

Notice that multithreading only pays off with long alignments. The best approach is to let IQ-tree decide the number of threads itself.
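Letting IQ-tree pick the thread count is done with -nt AUTO. This sketch assumes a build where the plain iqtree binary already supports threads (older releases needed iqtree-omp; IQ-TREE 2 spells the option -T).

```shell
# Assumption: a multithreaded IQ-TREE 1.x build as `iqtree`; example.phy is a placeholder alignment.
cmd="iqtree -s example.phy -nt AUTO"   # AUTO: benchmark and choose the number of threads
if command -v iqtree >/dev/null 2>&1 && [ -f example.phy ]; then
    $cmd
else
    echo "skipped: $cmd"
fi
```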

The current official builds, though, should support multithreading out of the box.

To summarize, a one-step approach to building an evolutionary tree with IQ-tree might combine automatic model selection with the fast BS.
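The exact command from the original post is not preserved; a plausible reconstruction from the options discussed earlier is sketched here and should be read as an assumption:

```shell
# Assumption: IQ-TREE 1.x as `iqtree`; example.phy is a placeholder alignment.
# Reconstructed one-step command: model selection, ML tree, ultrafast bootstrap.
cmd="iqtree -s example.phy -m MFP -bb 1000"
if command -v iqtree >/dev/null 2>&1 && [ -f example.phy ]; then
    $cmd
else
    echo "skipped: $cmd"
fi
```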

Finally, look at the resulting file example.phy.treefile.

If you're worried that fast BS doesn't work well, then consider adding -bnni or falling back to the standard bootstrap with -b.
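Two candidate commands are sketched here, since the one originally shown is lost. Both use only options covered earlier in this post; the replicate counts are illustrative.

```shell
# Assumption: IQ-TREE 1.6 or later as `iqtree`; example.phy is a placeholder alignment.
cmd_bnni="iqtree -s example.phy -m MFP -bb 1000 -bnni"   # fast BS with the -bnni safeguard
cmd_std="iqtree -s example.phy -m MFP -b 100"            # slower standard bootstrap
cmd="$cmd_bnni"                                          # pick whichever fits your time budget
if command -v iqtree >/dev/null 2>&1 && [ -f example.phy ]; then
    $cmd
else
    echo "skipped: $cmd"
fi
```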

If your alignment is long enough, the documentation suggests increasing -cmax, the maximum number of rate categories tested, which defaults to 10. The limit is mostly a matter of computational resources.
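Raising the limit could be sketched as follows; the value 15 is just an example, and the model-only mode -m MF is my choice to keep the run short:

```shell
# Assumption: IQ-TREE 1.x as `iqtree`; example.phy is a placeholder alignment.
cmd="iqtree -s example.phy -m MF -cmax 15"   # test rate-heterogeneity models with up to 15 categories
if command -v iqtree >/dev/null 2>&1 && [ -f example.phy ]; then
    $cmd
else
    echo "skipped: $cmd"
fi
```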

Of course, in many cases you will also want to add the multithreading parameter (-nt).
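Putting the pieces together, a full command with threading might look like this sketch. It is assembled from the options covered in this post rather than copied from the original, so adjust it to your data.

```shell
# Assumption: a multithreaded IQ-TREE 1.6 or later build as `iqtree`; example.phy is a placeholder.
cmd="iqtree -s example.phy -m MFP -bb 1000 -bnni -nt AUTO"
if command -v iqtree >/dev/null 2>&1 && [ -f example.phy ]; then
    $cmd                      # model selection, ML tree, fast BS, automatic thread count
else
    echo "skipped: $cmd"
fi
```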