TEAM FORMATION IN SOCIAL NETWORKS
BASED ON DENSEST SUBGRAPHS
Given a task Τ, a set of experts V with multiple skills and a social network G(V, W) reflecting the compatibility among the experts, team formation is the problem of identifying a team C ⊆ V that is both competent in performing the task Τ and compatible in working together. Existing methods for this problem make too restrictive assumptions and thus cannot model practical scenarios. The goal of this paper is to consider the team formation problem in a realistic setting and present a novel formulation based on densest subgraphs. Our formulation allows modeling of many natural requirements such as
- inclusion of a designated team leader and/or a group of given experts,
- restriction of the size or more generally cost of the team
- enforcing locality of the team, e.g., in a geographical sense or social sense
The proposed formulation leads to a generalized version of the classical densest subgraph problem with cardinality constraints (DSP), which is an NP hard problem and has many applications in social network analysis. In this paper, we present a new method for (approximately) solving the generalized DSP (GDSP). Our method, FORTE, is based on solving an equivalent continuous relaxation of GDSP. The solution found by our method has a quality guarantee and always satisfies the constraints of GDSP. Experiments show that the proposed formulation (GDSP) is useful in modeling a broader range of team formation problems and that our method produces more coherent and compact teams of high quality. We also show, with the help of an LP relaxation of GDSP, that our method gives close to optimal solutions to GDSP.
DOWNLOAD AND LICENSE
FORTE has been developed by Syama Sundar Rangapuram, Thomas Bühler and Matthias Hein, Department of Computer Science, Saarland University, Germany. The code for FORTE is published as free software under the terms of the GNU GPL v3.0. Please include a reference to the paper Towards Realistic Team Formation in Social Networks based on Densest Subgraphs and include the original documentation and copyright notice.
DBLP CO-AUTHOR NETWORK DATASET:
We also provide here a weighted co-author network built, using a simple parser for dblp.xml, from the snapshot of the DBLP database downloaded on the 9th of July 2012.
In this network, a vertex corresponds to an author and an edge between two vertices indicates the collarboration between the authors represented by those vertices. The weight of the edge is the number of shared publications.
Apart from the weight matrix W, full details of each author is also provided in the variable author_details. Here you find for each author, a list of conferences/journals he has publications in along with the number of publications in each conference/journal. One can use this author information to extract a subgraph of authors who have publications in selected conferences/journals and/or build skill matrix for a chosen skill set.
For the team formation problem considered in the paper, we restricted to four fields of computer science: Databases (DB), Theory (T), Data Mining (DM), Artificial Intelligence (AI). Conferences that we consider for each field are given as follows:
- DB = SIGMOD, VLDB, ICDE, ICDT, PODS
- T = SODA, FOCS, STOC, STACS, ICALP, ESA
- DM = WWW, KDD, SDM, PKDD, ICDM, WSDM
- AI = IJCAI, NIPS, ICML, COLT, UAI, CVPR
For the experiments, we took the largest component of the subgraph of the authors who have at least three publications in any of the above 23 conferences. You can download the datasets used in the paper here:
 S. Rangapuram, T. Bühler, and M. Hein
Towards Realistic Team Formation in Social Networks based on Densest Subgraphs,
Proceedings of the 22nd International conference on World Wide Web (WWW 2013). PDF