
General thoughts/suggestions on NLTK content #20

@cameronmclean

A few comments based on experience/reflection from day 1 of ResBaz - feel free to discuss, discard, or modify as appropriate...

  • when introducing significant whitespace, should we use the concept of a 'code block'? See e.g. http://en.wikipedia.org/wiki/Block_%28programming%29
  • when talking about lists, the exercise/demo uses a list (sent4 or sent7) that was defined elsewhere and is invisible to the learner. Would it be better to have the user create two new lists from scratch, and then see how to join and manipulate them?
  • some examples use the Python print and others just type the variable name and have the interpreter display the contents. We should be consistent and (I suggest) always use print variable - this way learners get used to the idea of using lots of print statements to look inside their variables, a useful debugging skill.
  • defining variables challenge 1 - some people tried to solve the more general problem and write a function that recognises a ; in any array and stores/prints everything before it, but were stumped because they had not yet had enough practice. Perhaps reword this example so it's clear that we 'know' what is in the array and that the task is to count and slice given known content.
  • similar to the last point - in the first challenge that follows, some folks tried to write a general function that takes four corpora as input and compares them (which proved difficult with their current practice/skills/knowledge), but I think the aim is just to write one function that takes one input text and returns the 15 most common words.
  • occasionally an example in the explanation/notes is one that you wouldn't want the user to actually type because the output is too large, e.g. sorted(whole_corpora) or array[8:] type things - check that all examples are "runnable" if the user tries them on the loaded texts.
  • the Python construct [len(w) for w in text] etc. - I think it might be better to write these out longform, especially as learners haven't been introduced to for loops yet.
  • the challenge to write code that will find all the words in a text that are more than seven letters long and occur more than seven times - this requires combining two conditions with 'and', which wasn't introduced earlier...
  • variable names - sometimes we use w for "word" and sometimes we use word - we should be consistent. Perhaps the longer form is easier for learners to follow than w.
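
To make a few of these suggestions concrete, here is a rough sketch of what the reworked examples might look like. It is only an illustration, not the lesson's actual code: the two lists and all function names are made up, it uses Python 3's print() and the standard library's collections.Counter rather than the NLTK texts so that it runs without loading a corpus, and the loops are written out longform with the full variable name word.

```python
from collections import Counter

# Two small lists built from scratch (hypothetical stand-ins for sent4 / sent7),
# so the learner can see exactly what is inside them.
first_sentence = ["the", "whale", "surfaced", "near", "the", "ship"]
second_sentence = ["the", "sailors", "watched", "the", "whale"]

# Joining two lists with + makes a new list; the originals are unchanged.
combined = first_sentence + second_sentence
print(combined)

# Longform equivalent of [len(word) for word in combined].
word_lengths = []
for word in combined:
    word_lengths.append(len(word))
print(word_lengths)


def most_common_words(text, how_many=15):
    """Return the how_many most frequent words in ONE text (hypothetical helper)."""
    counts = Counter(text)
    top_words = []
    for word, count in counts.most_common(how_many):
        top_words.append(word)
    return top_words


def long_frequent_words(text, min_length=7, min_count=7):
    """Return words longer than min_length that occur more than min_count times."""
    counts = Counter(text)
    selected = []
    for word in sorted(counts):
        # Both conditions must hold, so they are combined with 'and'.
        if len(word) > min_length and counts[word] > min_count:
            selected.append(word)
    return selected


print(most_common_words(combined, 2))
```

With an NLTK text loaded, the same functions could presumably be called on it directly (for example most_common_words(text1)), since they only need a sequence of words, but I haven't checked that against the lesson material.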
