
Efficiency Experiment

Chen Yuan edited this page Nov 29, 2018 · 3 revisions

Settings

Each functionality, as well as a final experiment with all functionalities together, is run on either the sampledata dataset (3 lines), labeled sample, or the bigdata dataset (600 lines), labeled big, with either 1 mapper or 8 mappers. Each experiment is timed.

Single Mapper Experiment

We roughly divide the functionalities into 4 categories by the time each one needs on the bigdata dataset with 1 mapper.

Category       Time        Functionalities
Fast           < 1 min     tokenize, cleanxml, ssplit, pos, lemma, depparse
Medium Fast    ~5 min      ner, regexner (~2.5 min); quote, openie (~5 min)
Medium Slow    ~20 min     sentiment, parse
Slow           ~2 hours    dcoref, coref, relation, natlog, all functionalities

Multiple Mappers Comparison Experiment w.r.t. Different Functionalities

Functionality Dataset Mappers Elapsed (m:ss.ss or h:mm:ss)
tokenize sample 1 0:11.71
tokenize sample 8 0:10.25
tokenize big 1 0:10.41
tokenize big 8 0:10.23
cleanxml sample 1 0:10.71
cleanxml sample 8 0:10.57
cleanxml big 1 0:10.30
cleanxml big 8 0:10.85
ssplit sample 1 0:10.65
ssplit sample 8 0:11.59
ssplit big 1 0:12.08
ssplit big 8 0:09.46
pos sample 1 0:10.37
pos sample 8 0:10.59
pos big 1 0:11.43
pos big 8 0:10.89
lemma sample 1 0:10.84
lemma sample 8 0:10.55
lemma big 1 0:12.89
lemma big 8 0:12.19
ner sample 1 0:23.49
ner sample 8 0:23.17
ner big 1 2:36.27
ner big 8 0:49.74
regexner sample 1 0:20.04
regexner sample 8 0:24.11
regexner big 1 2:36.63
regexner big 8 0:46.11
sentiment sample 1 0:17.11
sentiment sample 8 0:14.98
sentiment big 1 19:34.27
sentiment big 8 3:09.27
parse sample 1 0:17.46
parse sample 8 0:14.21
parse big 1 18:27.94
parse big 8 3:12.87
depparse sample 1 0:21.69
depparse sample 8 0:22.87
depparse big 1 0:59.23
depparse big 8 0:27.54
dcoref sample 1 0:50.87
dcoref sample 8 0:49.43
dcoref big 1 1:57:18
dcoref big 8 22:51.10
coref sample 1 1:20.70
coref sample 8 1:04.75
coref big 1 2:01:09
coref big 8 12:57.37
relation sample 1 1:15.16
relation sample 8 0:52.42
relation big 1 1:59:47
relation big 8 20:02.67
natlog sample 1 0:37.45
natlog sample 8 0:28.31
natlog big 1 2:15:43
natlog big 8 28:46.11
quote sample 1 2:29.81
quote sample 8 1:59.08
quote big 1 5:24.74
quote big 8 19:15.83
openie sample 1 0:50.37
openie sample 8 0:46.85
openie big 1 4:20.89
openie big 8 1:01.10
all sample 1 2:57.00
all sample 8 2:28.39
all big 1 2:15:59
all big 8 8:51.14
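
To compare the 1-mapper and 8-mapper runs more directly, the elapsed times in the table above can be converted to seconds and divided. The following is a minimal sketch, not part of the original experiment: the helper names `parse_elapsed` and `speedup` are hypothetical, and it assumes that two-field entries are minutes:seconds and three-field entries are hours:minutes:seconds. The timings are copied from the big-dataset rows above.

```python
def parse_elapsed(text: str) -> float:
    """Convert 'm:ss.ss' or 'h:mm:ss' (assumed formats) into seconds."""
    seconds = 0.0
    for part in text.split(":"):
        # Each field to the right is worth 1/60 of the field to its left.
        seconds = seconds * 60 + float(part)
    return seconds


def speedup(one_mapper: str, eight_mappers: str) -> float:
    """Ratio of the 1-mapper time to the 8-mapper time."""
    return parse_elapsed(one_mapper) / parse_elapsed(eight_mappers)


# (1 mapper, 8 mappers) timings on the big dataset, copied from the table.
BIG_TIMINGS = {
    "ner": ("2:36.27", "0:49.74"),
    "sentiment": ("19:34.27", "3:09.27"),
    "coref": ("2:01:09", "12:57.37"),
    "quote": ("5:24.74", "19:15.83"),
    "all": ("2:15:59", "8:51.14"),
}

for name, (one, eight) in BIG_TIMINGS.items():
    print(f"{name:10s} {speedup(one, eight):6.2f}x")
```

A ratio above 1 means the 8-mapper run was faster. A ratio below 1, as for quote on the big dataset (5:24.74 with 1 mapper versus 19:15.83 with 8), means it was slower.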
