
Emerging trends: Deep nets for poets

Published online by Cambridge University Press: 01 September 2021

Kenneth Ward Church* (Baidu, Sunnyvale, CA, USA)
Xiaopeng Yuan (Baidu, Beijing, China)
Sheng Guo (Baidu, Beijing, China)
Zewu Wu (Baidu, Beijing, China)
Yehua Yang (Baidu, Beijing, China)
Zeyu Chen (Baidu, Beijing, China)
*Corresponding author. E-mail: KennethChurch@baidu.com

Abstract

Deep nets have done well with early adopters, but the future will soon depend on crossing the chasm. The goal of this paper is to make deep nets more accessible to a broader audience, including people with little or no programming skills and people with little interest in training new models. A GitHub repository is provided with simple implementations of image classification, optical character recognition, sentiment analysis, named entity recognition, question answering (QA/SQuAD), machine translation, text to speech (TTS), and speech to text (STT). The emphasis is on instant gratification. Non-programmers should be able to install these programs and use them in 15 minutes or less (per program). Programs are short (10–100 lines each) and readable by users with modest programming skills. Much of the complexity is hidden behind abstractions such as pipelines and auto classes, and behind pretrained models and datasets provided by hubs: PaddleHub, PaddleNLP, HuggingFaceHub, and Fairseq. Hubs have different priorities than research. Research focuses on training models from corpora and fine-tuning them for tasks. But users are already overwhelmed with an embarrassment of riches (13k models and 1k datasets). Do they want more? We believe the broader market is more interested in inference (how to run pretrained models on novel inputs) and less interested in training (how to create even more models).
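
To give a flavor of the instant-gratification style described above, here is a minimal sketch of inference with the pipeline abstraction (assuming HuggingFace's transformers package; the default checkpoints and example inputs are ours, not necessarily those used in the paper):

    # Minimal inference with HuggingFace pipelines (pip install transformers).
    # Pretrained checkpoints are downloaded from the hub on first use.
    from transformers import pipeline

    # Sentiment analysis: one line to load a pretrained model, one line to run it.
    sentiment = pipeline("sentiment-analysis")
    print(sentiment("Deep nets for poets is a fun read."))

    # Question answering (QA/SQuAD): extract an answer span from a context.
    qa = pipeline("question-answering")
    print(qa(question="Where is Baidu's US office?",
             context="Baidu has offices in Beijing and in Sunnyvale, CA."))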

Information

Type
Emerging Trends
Creative Commons
CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
© The Author(s), 2021. Published by Cambridge University Press

Table 1. A simple API for several deep net inference applications


Table 2. See the table footnotes for short-cut links to code for each checkmark (✓), with an emphasis on instant gratification, simplicity, and readability. More examples will be added soon


Figure 1. Example of combining OCR and translation with Unix pipes.
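
The figure itself is not reproduced here. As a hedged sketch of the idea, each stage can be a small Python filter that reads stdin and writes stdout, so OCR and translation compose with an ordinary Unix pipe, e.g. python ocr.py scan.png | python translate.py (the script names are hypothetical):

    # translate.py -- a hypothetical stdin-to-stdout translation filter, sketched
    # with a MarianMT checkpoint from HuggingFace; the Chinese-to-English model
    # below is an assumption (swap in whichever pair matches the OCR output).
    import sys
    from transformers import pipeline

    translator = pipeline("translation", model="Helsinki-NLP/opus-mt-zh-en")

    for line in sys.stdin:
        line = line.strip()
        if line:  # skip blank lines from the OCR stage
            print(translator(line)[0]["translation_text"])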


Figure 2. Sentiment analysis using PaddleHub.
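
Figure 2 is not reproduced here; the following minimal sketch shows sentiment analysis with PaddleHub along the lines the figure describes (senta_bilstm is one of several pretrained sentiment modules on PaddleHub; the example texts are ours):

    # Sentiment analysis with PaddleHub (pip install paddlehub paddlepaddle).
    import paddlehub as hub

    # Load a pretrained Chinese sentiment model from the hub -- no training needed.
    senta = hub.Module(name="senta_bilstm")

    # Inference on novel inputs.
    results = senta.sentiment_classify(texts=["这家餐厅很好吃", "这部电影真的很差劲"])
    for r in results:
        print(r["text"], r["sentiment_key"], r["positive_probs"])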


Figure 3. Translate between many language pairs.
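
Hubs host pretrained checkpoints for many language pairs, so switching pairs is largely a matter of swapping the model name. A hedged sketch (the MarianMT checkpoints below are HuggingFace models; the paper's figure may use different hubs or models):

    # Translating between several language pairs by swapping pretrained checkpoints.
    from transformers import pipeline

    for model_name in ["Helsinki-NLP/opus-mt-en-de",   # English -> German
                       "Helsinki-NLP/opus-mt-en-fr",   # English -> French
                       "Helsinki-NLP/opus-mt-en-zh"]:  # English -> Chinese
        translator = pipeline("translation", model=model_name)
        print(model_name, "=>", translator("Deep nets for poets")[0]["translation_text"])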


Table 3. Unmasking: replace each input word with [MASK] and predict fillers. Red is added to highlight differences between the top prediction and the original input
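
The unmasking procedure in Table 3 can be sketched with the fill-mask pipeline: mask each position in turn and ask a pretrained model for its top filler (bert-base-uncased and the example sentence are assumptions, not the paper's exact setup):

    # Unmasking sketch: replace each word with [MASK] and report the top prediction.
    from transformers import pipeline

    unmask = pipeline("fill-mask", model="bert-base-uncased")

    words = "deep nets for poets".split()
    for i, w in enumerate(words):
        masked = " ".join(words[:i] + [unmask.tokenizer.mask_token] + words[i+1:])
        top = unmask(masked)[0]  # highest-scoring filler
        print(f"{masked!r}: predicted {top['token_str']!r} (input was {w!r})")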


Figure 4. Auto-completion from a search engine.


Table 4. Unmasking applied to cliches. Incorrect predictions are highlighted in red


Figure 5. Calibration shows scores are too high (black points are below red line).
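
Figure 5 is a reliability plot. As a hedged sketch of how such a plot is typically produced (the data arrays below are placeholders, not the paper's results): bin examples by predicted score and compare each bin's mean score to its observed accuracy; points below the diagonal mean the scores are too high.

    # Sketch of a reliability diagram: mean predicted score vs. observed accuracy.
    import numpy as np
    import matplotlib.pyplot as plt

    scores = np.random.rand(1000)                    # placeholder model confidences
    correct = np.random.rand(1000) < 0.8 * scores    # placeholder correctness labels

    bins = np.linspace(0, 1, 11)
    idx = np.digitize(scores, bins) - 1
    xs = [scores[idx == b].mean() for b in range(10) if (idx == b).any()]
    ys = [correct[idx == b].mean() for b in range(10) if (idx == b).any()]

    plt.plot([0, 1], [0, 1], "r-")  # red line: perfect calibration
    plt.plot(xs, ys, "k.")          # black points: observed calibration
    plt.xlabel("mean predicted score")
    plt.ylabel("observed accuracy")
    plt.show()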


Figure 6. BERT scores increase with the frequency of candidate fillers. Frequencies are estimated from the training set.
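
Figure 6 relates fill-mask scores to how frequent the candidate fillers are. A hedged sketch of the comparison (the toy corpus below is a stand-in for the training set the paper uses, and bert-base-uncased is an assumption):

    # Sketch: compare BERT fill-mask scores with filler frequency in a corpus.
    from collections import Counter
    from transformers import pipeline

    unmask = pipeline("fill-mask", model="bert-base-uncased")

    # Toy corpus; the paper estimates frequencies from the model's training set.
    freq = Counter("the cat sat on the mat . the dog sat on the rug .".split())

    for cand in unmask(f"the cat sat on the {unmask.tokenizer.mask_token} ."):
        word = cand["token_str"].strip()
        print(f"{word}: BERT score {cand['score']:.3f}, corpus frequency {freq[word]}")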