Keras Notes (BERT, Part 4)

Keras python BERT

Continuing from the previous post. This time we look at the FeedForward layer that makes up the Transformer. An excerpt from the paper "Attention Is All You Need": In addition to attention sub-layers, each of the layers in our encoder and decoder contains a fully connected feed-forward network,…

2019-06-15

Keras Notes (BERT, Part 3)

Keras python BERT

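Continuing from the post before last. This time we look at the MultiHeadAttention layer that makes up the Transformer. When the input to the MultiHeadAttention layer has shape (batch_size, 512, 768) and "head_num" is "12", the parallelization is as shown in the figure below. In the figure, "Wq", "Wk", "Wv", and "Wo" are M…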
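The figure itself is not reproduced on this listing page, so here is a minimal sketch of the shape arithmetic behind splitting a (batch_size, 512, 768) input across 12 heads. The function name `split_heads_shape` and the (batch, head, sequence, per-head dimension) layout are assumptions made for illustration; the actual layer may arrange the axes differently.

```python
def split_heads_shape(batch_size, seq_len, hidden_size, head_num):
    """Return the tensor shape after dividing the hidden axis across heads.

    Hypothetical helper: the (batch, head, seq, head_dim) layout is one
    common convention, not necessarily the one used by the library.
    """
    if hidden_size % head_num != 0:
        raise ValueError("hidden_size must be divisible by head_num")
    head_dim = hidden_size // head_num  # 768 // 12 = 64
    return (batch_size, head_num, seq_len, head_dim)


# batch_size=2 is an arbitrary example value.
shape = split_heads_shape(batch_size=2, seq_len=512, hidden_size=768, head_num=12)
print(shape)  # (2, 12, 512, 64)
```

Each head therefore works on a 64-dimensional slice of the 768-dimensional hidden vector, and the total number of elements is unchanged by the split.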

2019-06-12

Keras Notes (BERT, Part 2)

Keras python BERT

Continuing from the previous post. This time we look at the Position Embedding layer.
model.summary
Layer (type)                              Output Shape        Param #
==========================================================================
Embedding-Position (PositionEmbedding)    (None, 512, 768)    393216
====…
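The parameter count in the summary can be checked by hand. The sketch below assumes the layer holds exactly one trainable vector of the hidden size per position and nothing else (no bias); the function name `position_embedding_params` is made up for this example.

```python
def position_embedding_params(max_positions, hidden_size):
    """Parameter count of a lookup table with one vector per position.

    Assumption: the layer has a single (max_positions, hidden_size)
    weight matrix and no other trainable weights.
    """
    return max_positions * hidden_size


params = position_embedding_params(max_positions=512, hidden_size=768)
print(params)  # 393216
```

Under that assumption the result agrees with the 393216 reported for Embedding-Position in the summary.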

2019-06-10

Keras Notes (BERT, Part 1)

Keras python BERT

Trying out BERT (Bidirectional Encoder Representations from Transformers). The paper presents two models. the number of layers (i.e., Transformer blocks) as L, the hidden size as H, the number of self-attention heads as A. BERT(BASE)…

ichou1's Blog

Mostly about speech recognition, with the occasional post on data analysis

Posts from the month starting 2019-06-01
