Keras notes (addition with seq2seq)

Using a book as a reference, I try out addition with seq2seq.

Each record has a fixed length of 12 characters; positions with no value are padded with blanks (spaces).

Length of the expression part: fixed at 7
Length of the answer part: fixed at 5 (including the "_" that stands in for the equals sign)

Sample data (addition.txt)

16+75  _91  
52+607 _659 
75+22  _97  
63+22  _85  
795+3  _798 
706+796_1502
...
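
The fixed-width layout above can be reproduced with a few lines of Python. This is a minimal sketch, not the code that generated addition.txt: the function name make_sample is made up for illustration, and it assumes both operands are between 0 and 999 so that the expression fits in 7 characters and the answer in 5.

```python
def make_sample(a: int, b: int) -> str:
    """Format one addition problem as a 12-character record (hypothetical helper)."""
    # Expression part: "a+b", right-padded with blanks to 7 characters.
    expression = f"{a}+{b}".ljust(7)
    # Answer part: "_" (stands in for "=") plus the sum, right-padded to 5 characters.
    answer = f"_{a + b}".ljust(5)
    return expression + answer


for a, b in [(16, 75), (52, 607), (706, 796)]:
    # repr() makes the trailing blanks visible.
    print(repr(make_sample(a, b)))
```

Splitting a record back into its two parts is then just slicing: record[:7] is the expression and record[7:] is the answer.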

There are 50,000 samples in total; 90% are used for training and the remaining 10% for validation.
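
A 90/10 split like this might look as follows. This is only a sketch: the helper name split_train_val, the shuffling step, and the fixed seed are my assumptions, since the memo does not say whether or how the data is shuffled before splitting.

```python
import random


def split_train_val(samples, train_ratio=0.9, seed=0):
    """Shuffle a copy of the samples and split it into (train, validation)."""
    shuffled = list(samples)
    # Assumption: shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_ratio)
    return shuffled[:n_train], shuffled[n_train:]


# Stand-in data: integers instead of the real 50,000 records.
samples = list(range(50000))
train, val = split_train_val(samples)
print(len(train), len(val))
```

With 50,000 samples this gives 45,000 for training and 5,000 for validation.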

The characters that appear are the digits 0 through 9, the plus sign, the blank, and the underscore: 13 kinds in total.
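
To feed these characters to a model they first have to be mapped to integer IDs. Here is a minimal sketch of such a mapping; the ID order (digits first, then "+", blank, "_") and the names char_to_id, encode and decode are arbitrary choices for illustration and may differ from the actual code.

```python
# The 13 characters: ten digits, plus sign, blank, underscore.
CHARS = "0123456789+ _"

# Assumption: IDs are assigned in the order the characters appear in CHARS.
char_to_id = {ch: i for i, ch in enumerate(CHARS)}
id_to_char = {i: ch for ch, i in char_to_id.items()}


def encode(text):
    """Convert a string into a list of character IDs."""
    return [char_to_id[ch] for ch in text]


def decode(ids):
    """Convert a list of character IDs back into a string."""
    return "".join(id_to_char[i] for i in ids)


ids = encode("16+75  ")
print(ids)
print(repr(decode(ids)))
```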

Each character is converted to a feature vector by an Embedding layer with "input_dim" = 13 and "output_dim" = 16.
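
Conceptually, an Embedding layer is a lookup table with input_dim rows and output_dim columns: each character ID selects one row. The sketch below imitates that lookup in plain Python, without Keras. The random values stand in for weights that would be learned during training, and the example IDs are hypothetical.

```python
import random

INPUT_DIM = 13   # number of distinct characters
OUTPUT_DIM = 16  # size of each feature vector

# Stand-in for the trainable weight matrix of shape (INPUT_DIM, OUTPUT_DIM).
rng = random.Random(0)
embedding_table = [
    [rng.uniform(-0.05, 0.05) for _ in range(OUTPUT_DIM)]
    for _ in range(INPUT_DIM)
]


def embed(ids):
    """Look up one OUTPUT_DIM-sized vector per character ID."""
    return [embedding_table[i] for i in ids]


# A 7-character expression, given as hypothetical character IDs.
vectors = embed([1, 6, 10, 7, 5, 11, 11])
print(len(vectors), len(vectors[0]))
```

A sequence of 7 character IDs therefore becomes a 7 x 16 matrix, which is what the recurrent layer receives.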

Here I try two patterns: one where the Encoder and the Decoder each have their own Embedding layer, and one where they share a single Embedding layer.
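
One visible difference between the two patterns is the number of embedding weights. An Embedding layer holds a single input_dim x output_dim matrix with no bias, so the count can be worked out by hand. The sketch below only counts the Embedding weights and says nothing about the other layers in the model.

```python
INPUT_DIM = 13
OUTPUT_DIM = 16


def embedding_params(input_dim, output_dim):
    """Number of weights in one embedding matrix (no bias term)."""
    return input_dim * output_dim


# Separate pattern: Encoder and Decoder each own one embedding matrix.
separate = 2 * embedding_params(INPUT_DIM, OUTPUT_DIM)
# Shared pattern: both sides look up the same matrix.
shared = embedding_params(INPUT_DIM, OUTPUT_DIM)

print(separate, shared)  # 416 208
```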

I rewrote the model in Keras.

Pattern with separate Embedding layers

Training model

[Figure: training model, separate Embedding layers]

Inference model

[Figure: inference model, separate Embedding layers]

Pattern with a shared Embedding layer

Training model

[Figure: training model, shared Embedding layer]

Inference model

[Figure: inference model, shared Embedding layer]

The training parameters were set as follows.

batch_size = 128  # Batch size for training.
epochs = 20  # Number of epochs to train for.
train_model.compile(optimizer='adam', loss='categorical_crossentropy')
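
For reference, categorical cross-entropy for a single output position is the negative log of the probability the model assigns to the correct character. The following is a simplified plain-Python sketch of that quantity, not Keras's implementation; the function signature, the eps guard, and the example probabilities are all made up for illustration.

```python
import math


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy between a one-hot target and predicted probabilities."""
    # max(p, eps) guards against log(0); a simplification chosen for this sketch.
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


NUM_CLASSES = 13

# One-hot target: the correct character has ID 2 (hypothetical).
y_true = [0.0] * NUM_CLASSES
y_true[2] = 1.0

# Hypothetical prediction: 0.7 on the correct class, the rest spread evenly.
y_pred = [0.3 / (NUM_CLASSES - 1)] * NUM_CLASSES
y_pred[2] = 0.7

loss = categorical_crossentropy(y_true, y_pred)
print(loss)
```

Because the target is one-hot, only the term for the correct class contributes, so the loss here equals -log(0.7).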

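Part of the results obtained after training by passing the validation data to the inference model. The second example is an incorrect prediction.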

Input sentence:   ['3', '6', '7', '+', '5', '5', ' ']
Decoded sentence: ['4', '2', '2', ' ']
label sentence:   ['4', '2', '2', ' ']
-
Input sentence:   ['6', '0', '0', '+', '2', '5', '7']
Decoded sentence: ['8', '6', '7', ' ']
label sentence:   ['8', '5', '7', ' ']
-
Input sentence:   ['7', '6', '1', '+', '2', '9', '2']
Decoded sentence: ['1', '0', '5', '3']
label sentence:   ['1', '0', '5', '3']
-
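
One simple way to score output like this is exact match: a prediction counts as correct only if every decoded character equals the label. The sketch below applies that to the three examples shown above. The function name is hypothetical, and the resulting number describes only this excerpt, not the whole validation set.

```python
# (decoded, label) pairs copied from the three examples above.
results = [
    (['4', '2', '2', ' '], ['4', '2', '2', ' ']),
    (['8', '6', '7', ' '], ['8', '5', '7', ' ']),
    (['1', '0', '5', '3'], ['1', '0', '5', '3']),
]


def exact_match_accuracy(pairs):
    """Fraction of pairs whose decoded sequence equals the label exactly."""
    correct = sum(1 for decoded, label in pairs if decoded == label)
    return correct / len(pairs)


accuracy = exact_match_accuracy(results)
print(accuracy)
```

Here two of the three examples match, because the second one differs from its label in a single digit.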

This time I created three kinds of models.