The paper “Multilabel Subject-based Classification of Poetry” by Andrés Lou, Diana Inkpen, and Chris Tănăsescu (MARGENTO) has been accepted to the 28th Florida Artificial Intelligence Research Society (FLAIRS) Conference. The paper is part of the larger MARGENTO project Poetry Computational Graphs and the Graph Poem.
Here is the abstract:
Multilabel Subject-based Classification of Poetry
by Andrés Lou, Diana Inkpen, and Chris Tănăsescu (MARGENTO)
University of Ottawa, School of Electrical Engineering and Computer Science
Oftentimes, the question “what is this poem about?” has no trivial answer, regardless of length, style, author, or context in which the poem is found. We propose a simple system of multilabel classification of poems based on their subjects following the categories and subcategories as laid out by the Poetry Foundation. We make use of a model that combines the methodologies of tf-idf and Latent Dirichlet Allocation for feature extraction, and a Support Vector Machine model for the classification task. We determine how likely it is for our models to correctly classify each poem they read into one or more main categories and subcategories. Our contribution is, thus, a new method to automatically classify poetry given a set and various subsets of categories.
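For readers unfamiliar with the terms in the abstract, here is a minimal sketch of two of the ingredients it names: tf-idf weighting and the multilabel target format. It is not the authors' implementation. The abstract does not say which tf-idf variant was used, so the formula here (relative term frequency times log(N/df)) is an assumption, the three toy "poems" and the labels "Love" and "Nature" are hypothetical placeholders, and the Latent Dirichlet Allocation features and the Support Vector Machine itself are left out.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: (count / document length) * log(N / document frequency).
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document
    vectors = []
    for tokens in docs:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def binarize(label_sets, categories):
    """Turn each poem's set of labels into a 0/1 row, one column per category."""
    return [[1 if c in labels else 0 for c in categories] for labels in label_sets]


# Hypothetical toy data, for illustration only.
poems = ["the sea and the sky", "my love the sea", "love and loss"]
labels = [{"Nature"}, {"Love", "Nature"}, {"Love"}]
categories = ["Love", "Nature"]

vectors = tfidf([p.split() for p in poems])
targets = binarize(labels, categories)
```

Each row of `targets` can contain more than one 1, which is what makes the task multilabel rather than multiclass. One common way to use such a matrix is to train a separate binary classifier for each column; whether the paper does exactly this is not stated in the abstract.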