2. Pairwise
The pairwise command in DBRetina performs pairwise comparisons between supergroups based on their shared features. It requires the index prefix as input; the number of threads, the distance metric, and a cutoff are optional.
Usage: DBRetina pairwise [OPTIONS]
Calculate pairwise distances.
Options:
-i, --index-prefix TEXT Index file prefix [required]
-t, --threads INTEGER number of cores
-d, --dist-type TEXT select from ['min_cont', 'avg_cont', 'max_cont',
'ochiai', 'jaccard'] [default: max_cont]
-c, --cutoff FLOAT RANGE filter out distances < cutoff [default: 0.0;
0<=x<=100]
--help Show this message and exit.
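As a minimal sketch of how the options above combine, the following Python helper assembles an argument list for the command. It uses only the flags listed in the help text; the function name `build_pairwise_command` and the prefix `my_index` are hypothetical, and the helper does not run DBRetina itself.

```python
import shlex


def build_pairwise_command(index_prefix, threads=None, dist_type=None, cutoff=None):
    """Assemble an argument list for `DBRetina pairwise` (hypothetical helper).

    Only the flags shown in the help text are used.
    """
    allowed = ("min_cont", "avg_cont", "max_cont", "ochiai", "jaccard")
    cmd = ["DBRetina", "pairwise", "-i", index_prefix]
    if threads is not None:
        cmd += ["-t", str(int(threads))]
    if dist_type is not None:
        if dist_type not in allowed:
            raise ValueError(f"dist_type must be one of {allowed}")
        cmd += ["-d", dist_type]
    if cutoff is not None:
        if not 0 <= cutoff <= 100:
            raise ValueError("cutoff must satisfy 0 <= x <= 100")
        cmd += ["-c", str(cutoff)]
    return cmd


# "my_index" is a placeholder for the prefix chosen in the indexing step.
command = build_pairwise_command("my_index", threads=4, dist_type="jaccard", cutoff=10.0)
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` on a machine where DBRetina is installed.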
2.1 Command arguments
-i, --index-prefix TEXT Index file prefix [required]
This is the user-defined prefix that was used in the indexing step.
-t, --threads INTEGER number of cores
The number of processing cores to be used for parallel computation during the pairwise comparisons.
-d, --dist-type TEXT select from ['min_cont', 'avg_cont', 'max_cont', 'ochiai', 'jaccard'] [default: max_cont]
-c, --cutoff FLOAT RANGE filter out distances < cutoff [default: 0.0; 0<=x<=100]
The -d and -c parameters select the distance metric and the cutoff applied to it. Any pairwise comparison whose distance under the selected metric is lower than the cutoff is removed from the output. The cutoff ranges from 0 to 100, and the default of 0.0 keeps every pair.
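To make the five metrics concrete, here is a rough Python sketch that computes them for two feature sets using the textbook definitions of containment, Ochiai, and Jaccard, scaled to the 0 to 100 range implied by the cutoff. This is an assumption about how the values relate, not DBRetina's actual implementation; in particular, treating avg_cont as the mean of the two containments is a guess. The function name and the example gene symbols are hypothetical.

```python
import math


def pairwise_metrics(features_a, features_b):
    """Textbook set-overlap metrics as percentages (assumed definitions)."""
    a, b = set(features_a), set(features_b)
    if not a or not b:
        raise ValueError("both supergroups need at least one feature")
    shared = len(a & b)
    cont_a = 100.0 * shared / len(a)  # share of a's features found in b
    cont_b = 100.0 * shared / len(b)  # share of b's features found in a
    return {
        "shared_features": shared,
        "min_cont": min(cont_a, cont_b),
        "avg_cont": (cont_a + cont_b) / 2,  # assumption: mean of both containments
        "max_cont": max(cont_a, cont_b),
        "ochiai": 100.0 * shared / math.sqrt(len(a) * len(b)),
        "jaccard": 100.0 * shared / len(a | b),
    }


def passes_cutoff(metrics, dist_type="max_cont", cutoff=0.0):
    """A pair is kept unless its selected metric is below the cutoff."""
    return metrics[dist_type] >= cutoff


# Hypothetical supergroups: a small one fully contained in a larger one.
small = {"TP53", "BRCA1"}
large = {"TP53", "BRCA1", "EGFR", "MYC", "KRAS", "PTEN", "AKT1", "RB1"}
m = pairwise_metrics(small, large)
print(m)
print(passes_cutoff(m, "max_cont", 30.0), passes_cutoff(m, "jaccard", 30.0))
```

In this example the same pair survives a cutoff of 30 under max_cont but is filtered out under jaccard, which illustrates why the choice of -d matters when a cutoff is set.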
2.2 Output files format
{index_prefix}_DBRetina_pairwise.tsv
A TSV file that provides information about shared features between each pair of supergroups. The TSV columns are defined as follows:
| Column | Description |
|---|---|
| group_1_ID | ID of the first supergroup in a pair |
| group_2_ID | ID of the second supergroup in a pair |
| group_1_name | name of the first supergroup in a pair |
| group_2_name | name of the second supergroup in a pair |
| shared_features | number of features shared between the two supergroups |
| min_containment | minimum containment between the two supergroups |
| avg_containment | average containment between the two supergroups |
| max_containment | maximum containment between the two supergroups |
| ochiai | Ochiai distance between the two supergroups |
| jaccard | Jaccard distance between the two supergroups |
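The column layout above can be read with Python's standard csv module. The sketch below is illustrative only: the rows are made-up values, `load_pairs` is a hypothetical helper, and skipping lines that start with `#` is a defensive assumption in case the file carries comment lines before the header.

```python
import csv
import io


def load_pairs(handle, metric="jaccard", cutoff=0.0):
    """Read pairwise rows and keep those whose metric is >= cutoff."""
    # Assumption: any line starting with '#' is a comment, not data.
    data_lines = (line for line in handle if not line.startswith("#"))
    reader = csv.DictReader(data_lines, delimiter="\t")
    kept = []
    for row in reader:
        row[metric] = float(row[metric])
        if row[metric] >= cutoff:
            kept.append(row)
    return kept


# Made-up sample with the ten documented columns.
header = ["group_1_ID", "group_2_ID", "group_1_name", "group_2_name",
          "shared_features", "min_containment", "avg_containment",
          "max_containment", "ochiai", "jaccard"]
rows = [
    ["1", "2", "groupA", "groupB", "2", "25.0", "62.5", "100.0", "50.0", "25.0"],
    ["1", "3", "groupA", "groupC", "1", "10.0", "30.0", "50.0", "22.36", "9.09"],
]
sample = "\n".join("\t".join(r) for r in [header] + rows) + "\n"

kept = load_pairs(io.StringIO(sample), metric="jaccard", cutoff=20.0)
for row in kept:
    print(row["group_1_name"], row["group_2_name"], row["jaccard"])
```

For a real run, the `io.StringIO(sample)` argument would be replaced by an open file handle on the pairwise TSV.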
Output PNG files: histograms of pairwise distances
{index_prefix}_DBRetina_distance_metrics_plot_log.png
A clustered bar chart that shows the frequency distribution of the five distance metrics (min_cont, avg_cont, max_cont, ochiai, and jaccard) across distance ranges. The y-axis uses a logarithmic scale to accommodate the wide range of frequencies observed in the data.
{index_prefix}_DBRetina_distance_metrics_plot_linear.png
Same as above, but the y-axis is displayed on a linear scale.
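The kind of frequency distribution these plots summarize can be approximated by counting how many distances fall into each range. The sketch below uses a bin width of 10, which is an assumption; the actual ranges used in the plots may differ, and `bin_counts` is a hypothetical helper with made-up input values.

```python
from collections import Counter


def bin_counts(values, bin_width=10):
    """Count values (0..100) per range; the last range includes 100."""
    counts = Counter()
    for v in values:
        start = min(int(v // bin_width) * bin_width, 100 - bin_width)
        counts[(start, start + bin_width)] += 1
    return dict(sorted(counts.items()))


# Made-up jaccard values for five pairs.
jaccard_values = [5.0, 12.5, 25.0, 25.0, 100.0]
histogram = bin_counts(jaccard_values)
print(histogram)
```

When a few ranges hold far more pairs than the others, plotting these counts on a logarithmic y-axis keeps the sparse ranges visible, which is the motivation for providing both the log and linear versions.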