Efficiency Experiment
Chen Yuan edited this page Nov 29, 2018
Each functionality, as well as a final experiment with all functionalities together, is run on either the sampledata dataset (3 lines), labeled sample, or the bigdata dataset (600 lines), labeled big, and with either 1 mapper or 8 mappers. Each experiment is timed.
We roughly divide the functionalities into 4 categories by the time they need on the bigdata dataset with 1 mapper.
| Category | Time | Functionalities |
|---|---|---|
| Fast | < 1 min | tokenize, cleanxml, ssplit, pos, lemma, depparse |
| Medium Fast | ~5 mins | ner, regexner (~2.5 min); quote, openie (~5 min) |
| Medium Slow | ~20 mins | sentiment, parse |
| Slow | ~2 hours | dcoref, coref, relation, natlog, all (every functionality together) |

The table below lists the time measured for every run.

| Functionality | Dataset | Mappers | Time (m:ss.ss or h:mm:ss) |
|---|---|---|---|
| tokenize | sample | 1 | 0:11.71 |
| tokenize | sample | 8 | 0:10.25 |
| tokenize | big | 1 | 0:10.41 |
| tokenize | big | 8 | 0:10.23 |
| cleanxml | sample | 1 | 0:10.71 |
| cleanxml | sample | 8 | 0:10.57 |
| cleanxml | big | 1 | 0:10.30 |
| cleanxml | big | 8 | 0:10.85 |
| ssplit | sample | 1 | 0:10.65 |
| ssplit | sample | 8 | 0:11.59 |
| ssplit | big | 1 | 0:12.08 |
| ssplit | big | 8 | 0:09.46 |
| pos | sample | 1 | 0:10.37 |
| pos | sample | 8 | 0:10.59 |
| pos | big | 1 | 0:11.43 |
| pos | big | 8 | 0:10.89 |
| lemma | sample | 1 | 0:10.84 |
| lemma | sample | 8 | 0:10.55 |
| lemma | big | 1 | 0:12.89 |
| lemma | big | 8 | 0:12.19 |
| ner | sample | 1 | 0:23.49 |
| ner | sample | 8 | 0:23.17 |
| ner | big | 1 | 2:36.27 |
| ner | big | 8 | 0:49.74 |
| regexner | sample | 1 | 0:20.04 |
| regexner | sample | 8 | 0:24.11 |
| regexner | big | 1 | 2:36.63 |
| regexner | big | 8 | 0:46.11 |
| sentiment | sample | 1 | 0:17.11 |
| sentiment | sample | 8 | 0:14.98 |
| sentiment | big | 1 | 19:34.27 |
| sentiment | big | 8 | 3:09.27 |
| parse | sample | 1 | 0:17.46 |
| parse | sample | 8 | 0:14.21 |
| parse | big | 1 | 18:27.94 |
| parse | big | 8 | 3:12.87 |
| depparse | sample | 1 | 0:21.69 |
| depparse | sample | 8 | 0:22.87 |
| depparse | big | 1 | 0:59.23 |
| depparse | big | 8 | 0:27.54 |
| dcoref | sample | 1 | 0:50.87 |
| dcoref | sample | 8 | 0:49.43 |
| dcoref | big | 1 | 1:57:18 |
| dcoref | big | 8 | 22:51.10 |
| coref | sample | 1 | 1:20.70 |
| coref | sample | 8 | 1:04.75 |
| coref | big | 1 | 2:01:09 |
| coref | big | 8 | 12:57.37 |
| relation | sample | 1 | 1:15.16 |
| relation | sample | 8 | 0:52.42 |
| relation | big | 1 | 1:59:47 |
| relation | big | 8 | 20:02.67 |
| natlog | sample | 1 | 0:37.45 |
| natlog | sample | 8 | 0:28.31 |
| natlog | big | 1 | 2:15:43 |
| natlog | big | 8 | 28:46.11 |
| quote | sample | 1 | 2:29.81 |
| quote | sample | 8 | 1:59.08 |
| quote | big | 1 | 5:24.74 |
| quote | big | 8 | 19:15.83 |
| openie | sample | 1 | 0:50.37 |
| openie | sample | 8 | 0:46.85 |
| openie | big | 1 | 4:20.89 |
| openie | big | 8 | 1:01.10 |
| all | sample | 1 | 2:57.00 |
| all | sample | 8 | 2:28.39 |
| all | big | 1 | 2:15:59 |
| all | big | 8 | 8:51.14 |
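
The effect of the extra mappers can be read off the table as a speedup ratio (time with 1 mapper divided by time with 8 mappers). The following is a minimal sketch, assuming the values are elapsed times written as `m:ss.ss` or `h:mm:ss`; that format is inferred from the table rather than stated on this page. The helper names `parse_elapsed` and `speedup` are hypothetical, and only a few bigdata rows are copied in for illustration.

```python
def parse_elapsed(text: str) -> float:
    """Convert 'm:ss.ss' or 'h:mm:ss' (assumed formats) into seconds."""
    seconds = 0.0
    # Each colon-separated field is worth 60x the field to its right.
    for part in text.strip().split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def speedup(one_mapper: str, eight_mappers: str) -> float:
    """Ratio of the 1-mapper time to the 8-mapper time (> 1 means faster)."""
    return parse_elapsed(one_mapper) / parse_elapsed(eight_mappers)


# A few bigdata rows copied from the table: (1 mapper, 8 mappers).
BIG_RESULTS = {
    "tokenize": ("0:10.41", "0:10.23"),
    "ner": ("2:36.27", "0:49.74"),
    "sentiment": ("19:34.27", "3:09.27"),
    "quote": ("5:24.74", "19:15.83"),
    "openie": ("4:20.89", "1:01.10"),
}

if __name__ == "__main__":
    for name, (t1, t8) in BIG_RESULTS.items():
        print(f"{name:10s} speedup with 8 mappers: {speedup(t1, t8):.2f}x")
```

A ratio close to 1 means the run is dominated by a fixed cost that does not shrink with more mappers, and a ratio below 1 means the 8-mapper run was slower than the 1-mapper run.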