excel - Calculating Clustering Coefficient


I have data in a form such as:

-------------------------------------------------------------------------------
author_id   year    coauthor_count  high    medium  low     deviant paper_count
-------------------------------------------------------------------------------
677         2005    1               1.00    0.00    0.00    0.00    3
677         2007    3               0.66    0.00    0.33    0.00    1
677         2009    1               0.00    1.00    0.00    0.00    1
677         2011    5               0.60    0.00    0.40    0.00    1
677         2012    2               1.00    0.00    0.00    0.00    1
677         2013    5               0.60    0.40    0.00    0.00    2
1359        2005    11              0.00    0.00    0.81    0.18    11
1359        2006    27              0.00    0.14    0.70    0.14    20
1359        2007    29              0.00    0.06    0.62    0.31    12
1359        2008    29              0.00    0.10    0.55    0.34    13
1359        2009    28              0.00    0.32    0.53    0.14    18
1359        2010    22              0.04    0.18    0.59    0.18    14
...         ...     ...

Here the high, medium, low, and deviant columns represent the similarity values between an author and their coauthors. I also have data in the same form for author-venue similarities and counts.

I have used Microsoft Clustering to cluster these data and succeeded in assigning a cluster label to each row.

The issue is that I want to calculate clustering coefficients for these data, whereas the data would need to be in graph form (nodes, edges) to calculate clustering coefficients.

How can the clustering coefficient of these data be calculated?

Microsoft Clustering does not give you a formula to calculate the (local) clustering coefficient of an author.

Rather, Microsoft Clustering offers (according to its documentation) two algorithms, k-means clustering and EM clustering (which is related to k-means, but more general). Broadly speaking, these methods find structure in the dataset as a whole.

The "clustering coefficient" you are looking for is more a property of the author's relationship network.

This is a case of unfortunate naming. One name is attached to two different concepts:

  • "clustering algorithms": unsupervised machine-learning methods
  • "clustering coefficient": a measure from graph theory

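The local clustering coefficient can be calculated as follows:

for each author
    create a list of the coauthor IDs of this author (this column is missing in your table)
    for this coauthor-ID list,
        count the unique mutual coauthorship pairs among them, not counting pairs with the author himself
    divide by the number of possible coauthor pairs, k * (k - 1) / 2, where k is the number of distinct coauthors (you have this one, coauthor_count, if it counts distinct coauthors)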
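The steps above can be sketched in Python. This is only a minimal sketch under stated assumptions: the `coauthors` mapping (author ID to the set of that author's coauthor IDs) is hypothetical, since the table in the question does not contain coauthor IDs, and the sample IDs other than 677 are made up for illustration. It uses the standard graph-theory definition of the local clustering coefficient, which divides the number of links among an author's coauthors by the number of possible pairs, k(k-1)/2, and it assigns 0.0 to authors with fewer than two coauthors.

```python
from itertools import combinations


def local_clustering(coauthors):
    """Return {author_id: local clustering coefficient}.

    coauthors: dict mapping author_id -> set of coauthor ids
    (hypothetical input; coauthorship is treated as undirected).
    """
    coefficients = {}
    for author, neighbours in coauthors.items():
        # Ignore a possible self-reference.
        neighbours = set(neighbours) - {author}
        k = len(neighbours)
        if k < 2:
            # No pair of coauthors exists; use 0.0 by convention.
            coefficients[author] = 0.0
            continue
        # Count coauthor pairs that have themselves published together.
        links = sum(
            1
            for a, b in combinations(sorted(neighbours), 2)
            if b in coauthors.get(a, set()) or a in coauthors.get(b, set())
        )
        # Divide by the number of possible pairs, k * (k - 1) / 2.
        coefficients[author] = 2.0 * links / (k * (k - 1))
    return coefficients


# Made-up example: author 677 has coauthors 1, 2 and 3;
# of those, only 1 and 2 have also written together.
coauthors = {
    677: {1, 2, 3},
    1: {677, 2},
    2: {677, 1},
    3: {677},
}
result = local_clustering(coauthors)
```

With real data, the mapping would have to be built from a paper-to-authors list, which is the piece missing from the table in the question.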

See the illustration on the right of the Wikipedia page linked above.

I haven't looked into Excel plugins, VBA modules, or add-ins for this.

