Lines with multiple tabs are not handled by load_data
def read_vocabulary(corpus_file) should return question-answer pairs, but 198 out of 221282 "pairs" are lists of at least three strings. load_and_prepare_data passes these malformed pairs on through interface_training and trainIters to batch2TrainData. There, only the first two parts are picked, even though in these cases they are either part of the same utterance or the second is the empty string.
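One possible way to surface the problem early is to separate well-formed pairs from malformed lines at load time. The sketch below is not the project's code: split_pairs is a hypothetical helper, and it assumes that each corpus line is a tab-separated string and that a valid pair has exactly two non-empty fields. Whether malformed lines should be dropped, repaired, or reported is left open.

```python
def split_pairs(lines):
    """Separate well-formed question-answer pairs from malformed lines.

    Hypothetical helper: assumes one tab-separated record per line.
    """
    pairs, malformed = [], []
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        # Assumption: a valid pair has exactly two non-empty fields.
        if len(parts) == 2 and all(p.strip() for p in parts):
            pairs.append(parts)
        else:
            # Covers extra tabs (three or more fields) and empty fields.
            malformed.append(parts)
    return pairs, malformed


sample = ["hi\thello\n", "how are\tyou\ttoday\n", "what?\t\tnothing\n"]
pairs, malformed = split_pairs(sample)
print(len(pairs), "valid,", len(malformed), "malformed")
```

Running such a check on the corpus file would show which lines produce the lists of three or more strings, so they can be inspected before they reach batch2TrainData.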