BERT notes (structural probes), part 1

Trying out "A Structural Probe for Finding Syntax in Word Representations".

Source code:
https://github.com/john-hewitt/structural-probes

First, run the demo that uses the pre-trained structural probes.

printf "The chef that went to the stores was out of food" | python structural-probes/run_demo.py example/demo-bert.yaml

Roughly, the sentence says that the chef who went to the stores had run out of food.

When the run finishes, it outputs data for TikZ and two visualization images.

demo-dist-pred0.png

(Image: demo-dist-pred0.png)
The predicted distance between "chef" and "out" is small.

demo-depth-pred0.png

(Image: demo-depth-pred0.png)

Next, look at the contents of the config.

example/demo-bert.yaml (excerpt)

model:
  hidden_dim: 1024
  model_layer: 16 # ELMo: {0,1,2}; BERT-base: {0,...,11}; BERT-large: {0,...,23}

The demo uses the encoder output of the 17th transformer layer of BERT-large (the index is 0-based, so model_layer: 16 is the 17th layer).
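As a sanity check on the model section, the layer ranges listed in the config comment can be combined with each encoder's hidden size (1024 for ELMo and BERT-large, 768 for BERT-base). The sketch below is illustrative only: `ENCODERS` and `describe_layer` are hypothetical names and are not part of the structural-probes repository.

```python
# Hypothetical helper; not part of the structural-probes repository.
# Layer counts follow the comment in demo-bert.yaml; hidden sizes are the
# published sizes of each encoder.
ENCODERS = {
    "ELMo": {"hidden_dim": 1024, "num_layers": 3},
    "BERT-base": {"hidden_dim": 768, "num_layers": 12},
    "BERT-large": {"hidden_dim": 1024, "num_layers": 24},
}


def describe_layer(encoder, hidden_dim, model_layer):
    """Validate a (hidden_dim, model_layer) pair and return the 1-based layer number."""
    spec = ENCODERS[encoder]
    if hidden_dim != spec["hidden_dim"]:
        raise ValueError(f"{encoder} has hidden_dim {spec['hidden_dim']}, got {hidden_dim}")
    if not 0 <= model_layer < spec["num_layers"]:
        raise ValueError(f"{encoder} layers are 0..{spec['num_layers'] - 1}, got {model_layer}")
    return model_layer + 1  # the config index is 0-based


# Values taken from example/demo-bert.yaml
layer_number = describe_layer("BERT-large", hidden_dim=1024, model_layer=16)
print(layer_number)  # 17
```

A mismatch such as hidden_dim: 768 with a BERT-large layer index would raise a ValueError in this sketch, which is the kind of mistake that is easy to make when switching a config between models.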

probe:
  maximum_rank: 1024
  depth_params_path: example/data/bertlarge16-depth-probe.params
  distance_params_path: example/data/bertlarge16-distance-probe.params

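maximum_rank sets the second dimension of the projection matrix (shape: [hidden_dim, maximum_rank]).
The section also specifies the paths to the parameter files for the depth probe and the distance probe.
(These parameter files are generated by running structural-probes/run_experiment.py.)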
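To make the role of maximum_rank concrete, here is a minimal NumPy sketch of what the two probes compute, following the definitions in the paper: the distance probe scores a word pair by the squared L2 norm of the projected difference of their hidden states, and the depth probe scores a word by the squared L2 norm of its projected hidden state. The sizes are shrunk and the inputs are random, so the numbers themselves are meaningless. The names `probe_distances` and `probe_depths` are illustrative and are not the repository's API.

```python
import numpy as np

hidden_dim = 8      # 1024 in demo-bert.yaml; shrunk here for illustration
maximum_rank = 4    # 1024 in demo-bert.yaml
num_words = 5

rng = np.random.default_rng(0)
# Random stand-ins for one sentence's hidden states and a trained projection matrix.
hidden = rng.normal(size=(num_words, hidden_dim))
proj = rng.normal(size=(hidden_dim, maximum_rank))


def probe_distances(hidden, proj):
    """Squared L2 distance between every pair of projected word vectors."""
    transformed = hidden @ proj                                 # [num_words, maximum_rank]
    diffs = transformed[:, None, :] - transformed[None, :, :]   # [num_words, num_words, maximum_rank]
    return (diffs ** 2).sum(axis=-1)                            # [num_words, num_words]


def probe_depths(hidden, proj):
    """Squared L2 norm of each projected word vector."""
    transformed = hidden @ proj                                 # [num_words, maximum_rank]
    return (transformed ** 2).sum(axis=-1)                      # [num_words]


distances = probe_distances(hidden, proj)
depths = probe_depths(hidden, proj)
print(distances.shape, depths.shape)  # (5, 5) (5,)
```

A smaller maximum_rank only narrows the projected space; the output shapes stay the same, one distance per word pair and one depth per word.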

The following setting appears to be used during training.

probe:
  psd_parameters: True

structural-probes/run_experiment.py

if args['probe']['psd_parameters']:
    # Positive semidefinite (PSD) matrix parameterization
    return probe.OneWordPSDProbe
else:
    # Non-PSD parameterization (positive semidefiniteness is not guaranteed)
    return probe.OneWordNonPSDProbe
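The difference between the two options can be sketched for the depth case. This is a minimal illustration, assuming the PSD variant scores a word as the squared norm of its projected hidden state (equivalent to a quadratic form with the matrix proj proj^T) and the non-PSD variant uses an unconstrained square matrix in that quadratic form. The function names and matrices are made up for the example and are not the repository's code.

```python
import numpy as np

rng = np.random.default_rng(0)
hidden_dim, rank = 6, 3
h = rng.normal(size=hidden_dim)  # random stand-in for one word's hidden state


def psd_score(h, proj):
    """h^T (proj proj^T) h, computed as a squared norm, so it is never negative."""
    transformed = h @ proj
    return float(transformed @ transformed)


def non_psd_score(h, matrix):
    """h^T A h for an unconstrained square matrix A; the sign is not guaranteed."""
    return float(h @ matrix @ h)


proj = rng.normal(size=(hidden_dim, rank))
print(psd_score(h, proj) >= 0.0)                  # True for any proj

negative_definite = -np.eye(hidden_dim)           # an allowed value for a non-PSD parameter
print(non_psd_score(h, negative_definite) < 0.0)  # True: the score equals -||h||^2
```

Since a squared distance or depth should not be negative, the PSD parameterization builds that constraint into the probe, while the non-PSD one leaves it to training.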

