ipyrad-analysis toolkit: Astral species trees
Once you’ve run treeslider you now have a distribution of gene trees that you can use as input to a multi-locu species tree or network inference program. Here I demonstrate how to input your data to astral.
Required software¶
[1]:
# conda install ipyrad -c bioconda
# conda install toytree -c eaton-lab
[2]:
import pandas as pd
import toytree
Short Tutorial:¶
[4]:
# load the tree table from CSV
tree_table = pd.read_csv(
"./analysis-treeslider/test.tree_table.csv",
index_col=0,
)
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
<ipython-input-4-6a21e2a50b73> in <module>
2 tree_table = pd.read_csv(
3 "./analysis-treeslider/test.tree_table.csv",
----> 4 index_col=0,
5 )
~/miniconda3/lib/python3.7/site-packages/pandas/io/parsers.py in parser_f(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, skipfooter, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, cache_dates, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, doublequote, escapechar, comment, encoding, dialect, error_bad_lines, warn_bad_lines, delim_whitespace, low_memory, memory_map, float_precision)
683 )
684
--> 685 return _read(filepath_or_buffer, kwds)
686
687 parser_f.__name__ = name
~/miniconda3/lib/python3.7/site-packages/pandas/io/parsers.py in _read(filepath_or_buffer, kwds)
455
456 # Create the parser.
--> 457 parser = TextFileReader(fp_or_buf, **kwds)
458
459 if chunksize or iterator:
~/miniconda3/lib/python3.7/site-packages/pandas/io/parsers.py in __init__(self, f, engine, **kwds)
893 self.options["has_index_names"] = kwds["has_index_names"]
894
--> 895 self._make_engine(self.engine)
896
897 def close(self):
~/miniconda3/lib/python3.7/site-packages/pandas/io/parsers.py in _make_engine(self, engine)
1133 def _make_engine(self, engine="c"):
1134 if engine == "c":
-> 1135 self._engine = CParserWrapper(self.f, **self.options)
1136 else:
1137 if engine == "python":
~/miniconda3/lib/python3.7/site-packages/pandas/io/parsers.py in __init__(self, src, **kwds)
1915 kwds["usecols"] = self.usecols
1916
-> 1917 self._reader = parsers.TextReader(src, **kwds)
1918 self.unnamed_cols = self._reader.unnamed_cols
1919
pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader.__cinit__()
pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._setup_parser_source()
FileNotFoundError: [Errno 2] File b'./analysis-treeslider/test.tree_table.csv' does not exist: b'./analysis-treeslider/test.tree_table.csv'
[59]:
# examine top of table
tree_table.head()
[59]:
scaffold | start | end | sites | snps | samples | missing | tree | |
---|---|---|---|---|---|---|---|---|
0 | 2 | 0 | 2000000 | 13263 | 155 | 9 | 0.0 | (sagr:0.00343708,(oleo:0.00266064,(mini:0.0020... |
1 | 2 | 2000000 | 4000000 | 10544 | 112 | 9 | 0.0 | (fusi-N:0.00441769,reference:0.0186764,((bran:... |
2 | 2 | 4000000 | 6000000 | 5544 | 46 | 9 | 0.0 | (virg:0.00297301,fusi-N:0.00243431,(oleo:0.003... |
3 | 2 | 6000000 | 8000000 | 12777 | 138 | 9 | 0.0 | (fusi-N:0.00283693,(bran:0.00363545,reference:... |
4 | 2 | 8000000 | 10000000 | 14441 | 166 | 9 | 0.0 | (bran:0.00446094,reference:0.0119105,(fusi-S:0... |
Write the trees column to a file¶
[11]:
outfile = open("trees.nwk", "w")
outfile.write("\n".join(tree_table.tree.tolist()))
outfile.close()
Get Astral¶
The following command will download and unzip Astral into the current directory. A binary to call the program astral will be located in the Astral directory.
[12]:
%%bash
wget -q https://github.com/smirarab/ASTRAL/raw/master/Astral.5.6.3.zip
unzip -qo Astral.5.6.3
Run Astral¶
This will write the results (stdout) to a file called astral.tre, and it will write other information printed to the screen (stderr) to a file called astral.err. By default astral runs 100 bootstrap replicates by gene-tree resampling. To turn on gene and site resampling would require inputting the bootstrap files, which we did not currently save (we could…).
[13]:
%%bash
java -jar Astral/astral.5.6.3.jar -i trees.nwk > astral.tre 2>astral.err
Plot astral species tree¶
[61]:
tre = toytree.rtree.coaltree(10)
tre.
[61]:
<toytree.Toytree.ToyTree at 0x7f0474bb2160>
[71]:
tre.rotate_node(names=["r0", "r1", "r2"])
[71]:
<toytree.Toytree.ToyTree at 0x7f047506c358>
[68]:
tre.draw(node_labels=True);
[60]:
tre = toytree.tree("astral.tre").root('reference')
tre.draw(node_labels="support", tip_labels_align=True);
---------------------------------------------------------------------------
NewickError Traceback (most recent call last)
<ipython-input-60-40b2dc83ebdf> in <module>
----> 1 tre = toytree.tree("astral.tre").root('reference')
2 tre.draw(node_labels="support", tip_labels_align=True);
~/Documents/toytree/toytree/Toytree.py in __init__(self, newick, tree_format, fixed_order)
54 # prase a str, URL, or file
55 elif isinstance(newick, (str, bytes)):
---> 56 self.treenode = TreeParser(newick, tree_format).treenodes[0]
57
58 # make an empty tree
~/Documents/toytree/toytree/TreeParser.py in __init__(self, intree, tree_format, multitree, debug)
64 # parse intree
65 if not self.debug:
---> 66 self._run()
67
68
~/Documents/toytree/toytree/TreeParser.py in _run(self)
80
81 # parse newick strings to treenodes list
---> 82 self.get_treenodes()
83
84 # apply names from tdict
~/Documents/toytree/toytree/TreeParser.py in get_treenodes(self)
148
149 # extract one tree
--> 150 self.treenodes.append(extractor.newick_from_string())
151
152 else:
~/Documents/toytree/toytree/TreeParser.py in newick_from_string(self)
251 closing_internal = closing_internal.rstrip(";")
252 # read internal node data and go up one level
--> 253 self.apply_node_data(closing_internal, "internal")
254 self.current_parent = self.current_parent.up
255 return self.root
~/Documents/toytree/toytree/TreeParser.py in apply_node_data(self, subnw, node_type)
306 self.current_node.add_feature(fname, fvalue)
307 else:
--> 308 raise NewickError("Unexpected newick format {}".format(subnw))
309
310
NewickError: Unexpected newick format '[q1=0.4761904761904762;q2=0.2619047619047619;q3=0.2619047619047619;f1=13.333333333333334;f2=7.333333333333334;f3=7.333333333333334;pp1=0.8444981624253073;pp2=0.07775091878734611;pp3=0.07775091878734611;QC=6;EN=28.0]':0.2102954088363608
Plot astral species tree with quartet supports¶
Not yet supported in toytree, what a wonky file format…