# DATA

## DE Wikipedia
Average sentence length of Wikipedia: 15.307692307692308

Most common words in Wikipedia: [('der', 37692107), ('und', 30308189), ('die', 22311233), ('in', 22255489), ('von', 14906763), ('den', 10872509), ('des', 10847730), ('im', 10071944), ('mit', 8975642), ('dem', 7205673), ('er', 6678509), ('das', 6638634), ('für', 6476150), ('als', 6366424), ('wurde', 6015610), ('zu', 5984934), ('ist', 5893142), ('auf', 5663895), ('eine', 5492910), ('ein', 5256055)]
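For orientation, statistics of this kind can be computed along the following lines. This is only a minimal sketch, not the script that produced the numbers above: it assumes the corpus is already available as a list of tokenized sentences, and the function names `average_sentence_length` and `most_common_words` are hypothetical.

```python
from collections import Counter


def average_sentence_length(sentences):
    """Mean number of tokens per sentence (sentences: list of token lists)."""
    if not sentences:
        return 0.0
    return sum(len(sentence) for sentence in sentences) / len(sentences)


def most_common_words(sentences, n=20):
    """Return the n most frequent tokens as (token, count) pairs."""
    counts = Counter(token for sentence in sentences for token in sentence)
    return counts.most_common(n)


# Toy input for illustration only (assumption: pre-tokenized sentences).
toy = [
    ["der", "Hund", "bellt", "."],
    ["die", "Katze", "und", "der", "Hund", "schlafen"],
]
print(average_sentence_length(toy))  # 5.0
print(most_common_words(toy, n=2))   # [('der', 2), ('Hund', 2)]
```

How sentences are split and tokenized strongly affects both numbers, so the tokenizer used for each corpus should be kept in mind when comparing them.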
## TIGER
Average sentence length of Tiger: 5.5

Most common words in Tiger: [(',', 43659), ('.', 41057), ('der', 26771), ('die', 24459), ('und', 16282), ('in', 13720), ('den', 9688), ("''", 8811), ('``', 8809), ('von', 8267), ('zu', 7140), ('mit', 6277), ('das', 6078), ('auf', 5987), ('des', 5911), ('für', 5859), ('sich', 5585), ('Die', 5562), ('nicht', 5143), ('im', 5138)]
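The Tiger list counts punctuation tokens (`,`, `.`, `''`, and the double backtick) and keeps `Die` separate from `die`, whereas the Wikipedia list shows neither. If the two lists are meant to be compared directly, one option is to normalize tokens before counting. The sketch below illustrates that idea under the assumption that dropping punctuation-only tokens and lowercasing is acceptable; `normalized_counts` is a hypothetical helper, not part of the original analysis.

```python
import string
from collections import Counter


def normalized_counts(tokens, lowercase=True):
    """Count tokens, skipping those made up only of ASCII punctuation."""
    punctuation = set(string.punctuation)
    counts = Counter()
    for token in tokens:
        # Skip tokens such as ',', '.', "''" or '``'.
        if all(ch in punctuation for ch in token):
            continue
        counts[token.lower() if lowercase else token] += 1
    return counts


# Toy input for illustration only.
sample = ["Die", "Katze", ",", "die", "schläft", ".", "``", "''"]
print(normalized_counts(sample).most_common(1))  # [('die', 2)]
```

Lowercasing also merges German nouns with identically spelled lowercase words, so it may be preferable to pass `lowercase=False` when that distinction matters.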
# EXPERIMENTS