Abstract |
Structured real-world data can be represented with graphs whose structure encodes independence assumptions within the data. Owing to their statistical advantages over generative graphical models, Conditional Random Fields (CRFs) are used in a wide range of classification tasks on structured data sets. CRFs can be learned from both fully and partially supervised data, and can be used to infer labels for fully unlabeled or partially labeled data. However, performing inference in CRFs with an arbitrary graphical structure on a large amount of data is computationally expensive and nearly intractable on a researcher's workstation. Hence, we take advantage of recent developments in computer hardware, namely general-purpose Graphics Processing Units (GPUs). Rather than merely running existing algorithms on GPUs, we present a novel framework of parallel algorithms at several levels for training general CRFs on very large data sets. We evaluate their performance in terms of runtime and F1 score.
|