
Summary of the results of Phase 1

The bibliographic database has been provided by Epistemio SRL. It includes more than 46 million publications. To improve the coverage of the database, Epistemio has also developed software modules that allow and motivate users of the Epistemio website to add references to the database. The statistical properties of the extracted database have been investigated by the project coordinator. Moreover, three clustering methods (Link-community, ModuLand, GANXiS) have been implemented and tested on 6 different reference graphs and on the extracted citation network. Then, divisive clustering algorithms based on ECC and L-betweenness, applicable to large networks, have been developed and tested on the available networks.

Summary of the results of Phase 2

A Voronoi cluster detection method has been developed and tested on different benchmark and real-world networks. A Local Cluster Detection (LCD) method has also been developed and tested. The LCD method has been used to find a normalization method for scientific indicators. The P-index, which is the Hirsch index of individual publications, has also been proposed, showing that the h-index can be adapted to evaluate individual publications. The relation between the P-index and the number of citations has also been checked. The citation database has been improved and a web page has been created for each publication. Web interfaces for creating personal profiles have also been implemented.
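The idea of applying the h-index to a single publication can be sketched as follows. This is only an illustration under a stated assumption: the index of a publication is taken here to be the h-index computed over the citation counts of the papers that cite it. The exact definition of the P-index used in the project may differ, and the function names and toy data are hypothetical.

```python
def h_index(citation_counts):
    """Largest h such that at least h of the given counts are >= h."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def publication_h_index(paper, cited_by, citation_count):
    """ASSUMED single-publication variant (not necessarily the project's
    P-index): the h-index over the papers that cite `paper`."""
    citing = cited_by.get(paper, [])
    return h_index(citation_count.get(p, 0) for p in citing)


# Toy data, made up for illustration: paper -> list of papers citing it.
cited_by = {"A": ["B", "C", "D"], "B": ["C", "D"], "C": ["D"], "D": []}
citation_count = {paper: len(citing) for paper, citing in cited_by.items()}

for paper in cited_by:
    print(paper, citation_count[paper],
          publication_h_index(paper, cited_by, citation_count))
```

Under this assumption, a publication scores highly only if it is cited by papers that are themselves well cited, which is what distinguishes such an index from the plain citation count.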

Summary of the results of Phase 3

Relying on citation networks extracted from PubMed, CrossRef, Microsoft Academic Search and Web of Science, we found that calculating a paper's PageRank restricted to its local cluster, as obtained by local cluster detection, yields values that allow a simple and efficient normalization across different fields. The much simpler citation number shows similar behavior, though not as clearly as PageRank. Based on citation number, we have started assessing journal-level performance.
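A minimal sketch of PageRank restricted to a local cluster is given below. It assumes the cluster is already known (here it is simply passed in as a set of paper identifiers) and uses a standard power iteration on the subgraph induced by that cluster. The function name, the damping factor of 0.85 and the toy data are illustrative assumptions, not the project's implementation.

```python
def local_pagerank(edges, cluster, damping=0.85, iterations=100):
    """PageRank on the subgraph induced by `cluster`.

    edges:   iterable of (citing, cited) pairs
    cluster: set of paper identifiers forming the local cluster
    Papers with no outgoing citation inside the cluster spread their
    score uniformly over the cluster, so the scores always sum to 1.
    """
    nodes = sorted(cluster)
    n = len(nodes)
    if n == 0:
        return {}
    # Keep only the citations whose both endpoints lie inside the cluster.
    out_links = {v: [] for v in nodes}
    for citing, cited in edges:
        if citing in out_links and cited in out_links:
            out_links[citing].append(cited)

    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {v: base for v in nodes}
        for v in nodes:
            if out_links[v]:
                share = damping * rank[v] / len(out_links[v])
                for w in out_links[v]:
                    new_rank[w] += share
        rank = new_rank
    return rank


# Toy citation data, made up for illustration; "x1" lies outside the cluster.
edges = [("p2", "p1"), ("p3", "p1"), ("p3", "p2"), ("x1", "p1")]
ranks = local_pagerank(edges, cluster={"p1", "p2", "p3"})
for paper, score in sorted(ranks.items()):
    print(paper, round(score, 4))
```

Because the scores are computed inside each cluster separately, citations coming from outside the cluster (such as the one from `x1` above) do not contribute, which is the property that makes values from different fields comparable in this sketch.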

Local Cluster Detection, in tandem with local network properties such as degree, PageRank or similarity, has also been employed for establishing an automated procedure for reviewer selection. Conclusions are to be drawn in the next phase of the project.

The theoretical study of a parameter-free version of the community detection method developed in Phase 2 indicates the possibility of further improving performance while reducing bias. The scope of the new stochastic method includes networks with overlapping and hierarchical community structure.

In response to a concrete need of Epistemio, we have also developed a method for visualizing the large-scale structure of giant networks. Our algorithm could handle networks of over 11 million nodes.

Epistemio has made important progress in optimizing the response time of searches in a large-scale Solr index. The benefits of a new publication rating system have been explored and documented in the project's annual report.

Summary of the results of Phase 4

Using our previous results and algorithms, we have developed a prototype system for automatic reviewer selection. The software has been tested on a citation network extracted from the Web of Science database, and it has been integrated with the Epistemio databases.

Moreover, the development of a biologically inspired, genetic-like algorithm for evaluating scientific publications has been started.

The study of the new Epistemio® scale for rating scientific publications has been initiated.

Summary of the results of Phase 5

The automatic reviewer selection system has been tested and improved. Moreover, a conflict-of-interest detection feature has been implemented by measuring the overlap between author profiles at the level of a co-authorship network. This feature has also been tested on real citation networks.
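One possible way to measure the overlap between two author profiles in a co-authorship network is sketched below. It assumes that a profile is the set of an author's co-authors together with the author, and that overlap is measured with the Jaccard index. The overlap measure, the threshold of 0.3, the function names and the toy data are all assumptions made for illustration and are not taken from the project.

```python
from itertools import combinations


def coauthor_sets(papers):
    """Map each author to the set of their co-authors.

    papers: iterable of author lists, one list per publication.
    """
    neighbors = {}
    for authors in papers:
        unique = set(authors)
        for a in unique:
            neighbors.setdefault(a, set())
        for a, b in combinations(unique, 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return neighbors


def profile_overlap(a, b, neighbors):
    """Jaccard overlap of the two closed co-author neighbourhoods
    (each author is counted as part of their own profile)."""
    profile_a = neighbors.get(a, set()) | {a}
    profile_b = neighbors.get(b, set()) | {b}
    return len(profile_a & profile_b) / len(profile_a | profile_b)


def has_conflict(author, reviewer, neighbors, threshold=0.3):
    """Flag a candidate reviewer whose profile overlaps too much
    with the author's; the threshold is a hypothetical value."""
    return profile_overlap(author, reviewer, neighbors) >= threshold


# Toy publication list, made up for illustration.
papers = [["Ana", "Bogdan", "Carmen"], ["Bogdan", "Dan"], ["Elena", "Florin"]]
neighbors = coauthor_sets(papers)
for reviewer in ["Bogdan", "Dan", "Elena"]:
    print(reviewer, profile_overlap("Ana", reviewer, neighbors),
          has_conflict("Ana", reviewer, neighbors))
```

Including each author in their own profile means that direct co-authors always share at least two members, so they overlap more strongly than authors who merely have a collaborator in common.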

The biologically inspired fitness indicator has also been proposed. It was studied on different citation networks, and the fitness defined for new ideas was shown to be a potentially better scientometric indicator than the simple citation number of papers. This model could also provide a framework for tracking individual ideas and innovations within the abundance of intertwined citations, giving additional insight from historical, sociological and economic perspectives.

The mathematical background of the stochastic graph Voronoi tessellation and several of its potential applications, such as community detection, have also been studied.
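For orientation, the basic (deterministic) graph Voronoi tessellation can be sketched in a few lines: every node is assigned to the seed node that is closest in hop distance. This is a simplified baseline written under stated assumptions (unweighted, undirected graph, seeds given in advance). It does not reproduce the stochastic variant studied in the project, and the function name and toy graph are hypothetical.

```python
from collections import deque


def graph_voronoi(adjacency, seeds):
    """Assign every reachable node to its nearest seed by hop distance.

    adjacency: dict mapping a node to an iterable of its neighbours
    seeds:     list of seed nodes, one per Voronoi cell
    A multi-source breadth-first search is used; when two seeds are
    equally close, the node goes to the seed whose frontier reaches it first.
    """
    cell = {s: s for s in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in cell:
                cell[neighbour] = cell[node]
                queue.append(neighbour)
    return cell


# Toy graph, made up for illustration: a path 1-2-3-4-5-6.
adjacency = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 6], 6: [5]}
cells = graph_voronoi(adjacency, seeds=[1, 6])
print(cells)
```

Used for community detection, each resulting cell is read as one candidate community, so the quality of the partition depends on how the seeds are chosen.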


Phase 1: Completed (Technical report delivered in December 2012) - 507 705 RON

Phase 2: Completed (Technical report delivered in December 2013) - 633 248 RON

Phase 3: Completed (Technical report delivered in December 2014) - 753 665 RON

Phase 4: Completed (Technical report delivered in December 2015) - 614 880.75 RON

Phase 5: Completed (Final technical report delivered in December 2016) - 721 749.86 RON