Skip to content

Conversation

@danielgranat
Copy link

First thanks for sharing this code. It's exactly what i needed, and i didn't find anything i liked in NodeJS.

This is more a request to get feedback then actual pull request.
The problem i encountered is that wikipedia does not work with the image and title extraction that's implemented now.
The Image is not in the header, but is the first image in the '.infobox'.
Title- When splitting the Title, the longest part is usually not the important part, like in 'Thomas Edison - Wikipedia, the free encyclopedia'
Trying to tackle this problem i saw 2 options:

  1. Re-factor the current extraction implementation to support wikipedia structure. I don't think it's a good option. First it will cause the code to be less readable. Second, what will happen when i need more customization?!
  2. Second option is to have something like domain specific plugins.

Obviously i decided to use the second option.

There is still work to be done and issues to address, but I would like to get your input on the proposed solution.

Thank you for your time!

@ageitgey
Copy link
Owner

Hey, thanks for the PR. I do agree that domain-specific plugins are a better path here than hacks on top of hacks in the main code.

I'll take a look at the PR in detail when I have a little free time and let you know what I think.

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants