Poster in Workshop: CtrlGen: Controllable Generative Modeling in Language and Vision
Controllable Paraphrase Generation with Multiple Types of Constraints
Gwénolé Lecorvé
One of the current challenges in paraphrase generation is enforcing linguistic constraints on the desired output. These constraints may relate to the length of the output sentence, its syntactic structure, the presence of specific words, etc. While recent works focus on such constraints in isolation, this paper studies a variety of constraints imposed separately or in combination with one another. The constraints cover different linguistic factors (surface words, syntax and semantics) of the input sequence, the output sequence, or both, and different data shapes (scalar value, sequence and tree). They are integrated into the paraphrase generation process using an attention-based encoder-decoder model trained and evaluated on the ParaNMT-50M corpus. The results show that the models respect the constraints well and that the constraints improve the quality of the generated paraphrases. This multiple-constraint-driven model opens new possibilities for controllable paraphrase generation. The code is publicly available: https://gitlab.inria.fr/expression/tremolo/controllable-paraphrase-generation
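To make the three data shapes concrete, the sketch below shows one way a scalar constraint (target length), a sequence constraint (required words) and a tree constraint (syntactic structure) could be serialized together in front of a source sentence. The abstract does not say how the constraints are fed to the encoder-decoder, so the prefix-token scheme, the marker tokens (`<len>`, `<words>`, `<syn>`, `<src>`) and the function names are all assumptions made for illustration, not the paper's implementation.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical tree representation: (label, list of child trees).
Tree = Tuple[str, list]


def linearize_tree(tree: Tree) -> List[str]:
    """Flatten a (label, children) tree into bracketed tokens."""
    label, children = tree
    tokens = ["(", label]
    for child in children:
        tokens.extend(linearize_tree(child))
    tokens.append(")")
    return tokens


def build_input(
    source: Sequence[str],
    length: Optional[int] = None,
    required_words: Optional[Sequence[str]] = None,
    syntax: Optional[Tree] = None,
) -> List[str]:
    """Prepend any combination of constraints to the source tokens.

    Each constraint is optional, so constraints can be applied
    separately or in combination. Marker tokens are illustrative.
    """
    prefix: List[str] = []
    if length is not None:        # scalar-shaped constraint
        prefix += ["<len>", str(length)]
    if required_words:            # sequence-shaped constraint
        prefix += ["<words>", *required_words]
    if syntax is not None:        # tree-shaped constraint
        prefix += ["<syn>", *linearize_tree(syntax)]
    return prefix + ["<src>"] + list(source)


if __name__ == "__main__":
    example = build_input(
        ["the", "cat", "sleeps"],
        length=4,
        required_words=["feline"],
        syntax=("S", [("NP", []), ("VP", [])]),
    )
    print(" ".join(example))
```

Omitting an argument simply drops that constraint from the prefix, which mirrors the idea of imposing constraints individually or jointly.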