Parenclitic and Synolytic Networks Revisited

Nazarenko, Tatiana; Whitwell, Harry J.; Blyuss, Oleg; Zaikin, Alexey

dc.contributor.author	Nazarenko, Tatiana
dc.contributor.author	Whitwell, Harry J.
dc.contributor.author	Blyuss, Oleg
dc.contributor.author	Zaikin, Alexey
dc.date.accessioned	2021-11-03T13:00:01Z
dc.date.available	2021-11-03T13:00:01Z
dc.date.issued	2021-10-20
dc.identifier.citation	Nazarenko, T, Whitwell, H J, Blyuss, O & Zaikin, A 2021, 'Parenclitic and Synolytic Networks Revisited', Frontiers in Genetics, vol. 12, 733783. https://doi.org/10.3389/fgene.2021.733783
dc.identifier.other	Jisc: a41945ac88734cbda2d6833cd55d6d89
dc.identifier.other	publisher-id: 733783
dc.identifier.other	ORCID: /0000-0002-0194-6389/work/102685380
dc.identifier.uri	http://hdl.handle.net/2299/25165
dc.description	© 2021 Nazarenko, Whitwell, Blyuss and Zaikin. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). https://creativecommons.org/licenses/by/4.0/
dc.description.abstract	Parenclitic networks provide a powerful and relatively new way to coerce multidimensional data into a graph form, enabling the application of graph theory to evaluate features. Different algorithms have been published for constructing parenclitic networks, leading to the question: which algorithm should be chosen? Initially, it was suggested to calculate the weight of an edge between two nodes of the network as the deviation from a linear regression of one of these features on the other. This method works well, except when the relationship between the features is not linear. To overcome this, it was suggested to calculate edge weights as the distance from the area of most probable values, using kernel density estimation. In these two approaches only one class (typically controls or a healthy population) is used to construct the model. To take account of a second class, we have introduced synolytic networks, which use a boundary between two classes on the feature-feature plane to estimate the weight of the edge between these features. Common to all these approaches is that topological indices can be used to evaluate the structure represented by the graphs. To compare these network approaches alongside more traditional machine-learning algorithms, we performed a substantial analysis using both synthetic data with a priori known structure and publicly available datasets used for the benchmarking of ML algorithms. This comparison has shown that the main advantage of parenclitic and synolytic networks is their resistance to over-fitting (occurring when the number of features is greater than the number of subjects) compared to other ML approaches. A second advantage is the capability to visualise data in a structured form, even when this structure is not a priori available, which allows for visual inspection and the application of well-established graph theory to interpret the data, eliminating the “black-box” nature of other ML approaches.	en
dc.format.extent	3059160
dc.language.iso	eng
dc.relation.ispartof	Frontiers in Genetics
dc.subject	Genetics
dc.subject	networks
dc.subject	graphs
dc.subject	parenclitic
dc.subject	synolytic
dc.subject	complexity
dc.title	Parenclitic and Synolytic Networks Revisited	en
dc.contributor.institution	Department of Physics, Astronomy and Mathematics
dc.contributor.institution	School of Physics, Engineering & Computer Science
dc.description.status	Peer reviewed
rioxxterms.versionofrecord	10.3389/fgene.2021.733783
rioxxterms.type	Journal Article/Review
herts.preservation.rarelyaccessed	true

Files in this item

Name: fgene_12_733783_1_.pdf
Size: 2.917Mb
Format: PDF
