Abstract
Objective: The baseline performance of transformers can be raised by incorporating the progress of the last decade. Background: Current efforts begin with surface forms and build up meaning from them for tagging, parsing, identifying semantic relationships, and similar tasks. However, a parser and outside resources can also supply these morphological, syntactic, and semantic features directly. Method: Instead of feeding tokens of surface forms into a transformer, we start from token vectors whose summed embeddings represent the deep features of the words. This empirical study examines how inputs built from such token vectors perform on masked word prediction.
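The idea of a token vector built from summed feature embeddings can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and not taken from the study: the feature strings (`lemma=dog`, `pos=NOUN`, and so on), the embedding width `EMBED_DIM`, and the helpers `make_embedding_table` and `token_vector`. Random vectors stand in for embeddings that would normally be learned.

```python
import random

EMBED_DIM = 8  # hypothetical embedding width, kept small for illustration


def make_embedding_table(features, dim, seed=0):
    """Give each feature string a fixed random vector (stand-in for learned weights)."""
    rng = random.Random(seed)
    return {f: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for f in features}


def token_vector(token_features, table):
    """Sum the embeddings of all features attached to one token."""
    dim = len(next(iter(table.values())))
    total = [0.0] * dim
    for feat in token_features:
        vec = table[feat]
        for i in range(dim):
            total[i] += vec[i]
    return total


# Hypothetical parser output: one list of deep features per word of "dogs barked".
parsed_sentence = [
    ["lemma=dog", "pos=NOUN", "number=plural", "sem=animal"],
    ["lemma=bark", "pos=VERB", "tense=past", "sem=sound"],
]

# Build one embedding per distinct feature, then one summed vector per token.
vocabulary = sorted({f for tok in parsed_sentence for f in tok})
table = make_embedding_table(vocabulary, EMBED_DIM)
inputs = [token_vector(tok, table) for tok in parsed_sentence]
```

In a setup of this kind, `inputs` would take the place of the usual surface-token embedding lookup at the transformer's input layer, and a masked position could be represented by withholding some or all of that token's features.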
Supplementary weblinks
Title
Parse2Vec tables
Description
A few CSV tables that can be integrated into the parser output to reproduce the results.

