Syllabus
This page lists the readings for each week. Books are referred to only as "Author Year"; the full list of books can be found at the bottom of the page.
Week 1: Bagging, boosting and nested cross validation
- Raschka 2017, Chapter 7 and Chapter 6 (pp. 203-205)
Week 2: Neural networks and gradient descent
- 3Blue1Brown's first two videos (1 and 2) on neural networks
- Or if you would rather read:
- Nielsen 2018, Chapter 1, or
- Raschka 2017, Chapter 12
Week 3: Backpropagation, regularization and the vanishing gradient problem
- 3Blue1Brown's last two videos (3 and 4) on neural networks.
- Or if you would rather read: Nielsen 2018, Chapter 2
- Nielsen 2018, Chapter 3
- Nielsen 2018, Chapter 5
Week 4: Deep learning (CNN and RNN)
- On Convolutional Neural Networks:
- Reading: Nielsen 2018, Chapter 6
- Reading: cs231n course notes (Stanford). They're extremely good.
- Video: cs231n course lecture given by Serena Yeung or Andrej Karpathy
- On Recurrent Neural Networks:
- Reading: Goodfellow 2016, Chapter 10
- Reading: Andrej Karpathy, "The Unreasonable Effectiveness of RNNs". It's a great blog post!
- Video: cs231n course lecture given by Justin Johnson or Andrej Karpathy.
Week 5: Networks 1
Week 6: Networks 2
- Barabási 2018, chapter 9 and chapter 10.
Week 7: Networks 3 - peer effects
- Manski, C.F., 1993. Identification of endogenous social effects: The reflection problem. The Review of Economic Studies, 60(3), pp.531-542.
- Sacerdote, B., 2001. Peer effects with random assignment: Results for Dartmouth roommates. The Quarterly Journal of Economics, 116(2), pp.681-704.
- Sacerdote, B., 2011. Peer effects in education: How might they work, how big are they and how much do we know thus far?. In Handbook of the Economics of Education (Vol. 3, pp. 249-277). Elsevier.
- Carrell, S.E., Sacerdote, B.I. and West, J.E., 2013. From natural variation to optimal policy? The importance of endogenous peer group formation. Econometrica, 81(3), pp.855-882.
Week 8: Networks 4 - network formation
- McPherson, M., Smith-Lovin, L., & Cook, J. M. (2001). Birds of a feather: Homophily in social networks. Annual Review of Sociology, 27(1), 415-444.
- Rivera, M. T., Soderstrom, S. B., & Uzzi, B. (2010). Dynamics of dyads in social networks: Assortative, relational, and proximity mechanisms. Annual Review of Sociology, 36, 91-115.
Week 9: Spatial data 1 - fundamentals
Learn to use geolocation data efficiently. Topics include working with spatial shapes (polygons, lines, etc.) and methods for storage, manipulation, changing coordinate systems, and feature extraction, as well as plotting data.
- Preparation:
- Install geo packages with the following command:
conda install -c conda-forge geopandas folium -y
- Gimond (2017), read the following chapters/subsections
- 1, 2, 3.1.1, 5, 8.2, 8.3, 9, 14.1
- Optional inspirational reading:
- Good maps: Gimond (2017) chapter 4, 6
- Alex Hern in The Guardian on Jan 28, 2018: Fitness tracking app Strava gives away location of secret US army bases
- Ali Winston in The Verge on Feb 27, 2018: Palantir has secretly been using New Orleans to test its predictive policing technology
- Glaeser, E. L., Kincaid, M. S., & Naik, N. (2018). Computer Vision and Real Estate: Do Looks Matter and Do Incentives Determine Looks (No. w25174). National Bureau of Economic Research.
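To confirm that the geo packages from the preparation step are available before the session, a minimal check along the following lines may help. It is only a sketch: the helper name `missing_packages` is hypothetical and not part of the course material, and it only tests whether the packages can be found in the active environment, not whether they work correctly.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level package names that cannot be found
    in the currently active Python environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Package names taken from the conda command in the preparation step.
    missing = missing_packages(["geopandas", "folium"])
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All geo packages found.")
```

If anything is reported as missing, rerunning the conda command in the environment you use for the course is the likely fix.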
Week 10: Spatial data 2 - methods for identification
- Black, S. E. (1999). Do better schools matter? Parental valuation of elementary education. The Quarterly Journal of Economics, 114(2), 577-599.
- Baylis, P., Obradovich, N., Kryvasheyeu, Y., Chen, H., Coviello, L., Moro, E., ... & Fowler, J. H. (2018). Weather impacts expressed sentiment. PloS one, 13(4), e0195750.
- Optional, background reading
Week 11 (April 23): Text as Data 1 [SR] - fundamentals
- Preparation:
- Install text processing packages with the following commands:
conda install -c anaconda nltk
conda install -c conda-forge spacy
conda install -c anaconda gensim
Brush-up reading
- building regular expressions: https://web.stanford.edu/~jurafsky/slp3/2.pdf
- basic operations with strings, text files and webdata (especially chapter 3)
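As a small warm-up for the regular-expression brush-up, the sketch below uses Python's built-in `re` module to pull word tokens and ISO-style dates out of a string. The sample sentence and the pattern names `TOKEN` and `DATE` are made up for illustration and are not taken from the readings.

```python
import re

# Hypothetical sample text, not taken from the course material.
text = "Contact us at info@example.org or visit https://example.org on 2019-04-23."

# One or more ASCII letters: a very crude word tokenizer.
TOKEN = re.compile(r"[A-Za-z]+")

# A date written as YYYY-MM-DD, captured as three groups.
DATE = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")

# Lowercased word tokens; digits and punctuation are dropped.
words = [w.lower() for w in TOKEN.findall(text)]

# With capture groups, findall returns a list of (year, month, day) tuples.
dates = DATE.findall(text)

print(words)
print(dates)
```

Note that a crude pattern like `TOKEN` splits e-mail addresses and URLs into pieces; handling those cases is the kind of refinement the brush-up reading covers.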
Introductory texts
Pick one of the following texts.
- "Machine Translation: Mining Text for Social Theory" by James A. Evans and Pedro Aceves: https://www.annualreviews.org/doi/pdf/10.1146/annurev-soc-081715-074206
- "Text as Data" by Matthew Gentzkow, Bryan T. Kelly and Matt Taddy: https://web.stanford.edu/~gentzkow/research/text-as-data.pdf
- "Text as Data: The Promise and Pitfalls of Automatic Content Analysis Methods for Political Texts" by Justin Grimmer and Brandon M. Stewart: https://web.stanford.edu/~jgrimmer/tad2.pdf
Week 11 (April 29): Text as Data 2 [SR] - Data-driven discovery and measurement
This week we shall discuss how to do good and reliable measurement in text. We will situate this discussion in relation to the many off-the-shelf methods for data mining, in particular rule-based and lexical approaches and topic modelling.
Keywords: measurement, prototyping, information extraction, text clustering
Readings
- Blei 2012: "Probabilistic Topic Models"
- Nelson 2017: "Computational Grounded Theory: A Methodological Framework"
- (re)read Grimmer and Stewart 2013: "Text as Data: The Promise and Pitfalls of Automatic Content Analysis Methods for Political Texts"
Inspirational
- Large-Scale Computerized Text Analysis in Political Science: Opportunities and Challenges
- Gerlach, Peixoto and Altmann 2018: "A network approach to topic models"
- Egami et al 2018 preprint: How to Make Causal Inferences Using Texts
Week 12 (May 7): Text as Data 3 [SR] - Text Classification and Bias in Measurement
In the final session we shall discuss using NLP and supervised learning as measurement devices. We will cover recent progress in NLP using transfer learning. Finally, we will discuss the gain in performance in relation to differential bias across e.g. social groups, gender and ethnicity.
Readings:
- Wang and Manning 2012: "Baselines and Bigrams: Simple, Good Sentiment and Topic Classification"
- Felbo et al. 2017: "Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm" (DeepMoji)
- Kiritchenko and Mohammad 2018: "Examining Gender and Race Bias in Two Hundred Sentiment Analysis Systems"
Inspiration:
State-of-the-art language modelling
- Devlin et al. 2018: "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"
- Peters et al. 2018: "Deep contextualized word representations" (ELMo)
- Howard and Ruder 2018: "Universal Language Model Fine-tuning for Text Classification"
Articles on Bias
- On the issue in the NLP community: Bender and Friedman: "Data Statements for Natural Language Processing: Toward Mitigating System Bias and Enabling Better Science"
- Bolukbasi et al. 2016: "Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings": https://arxiv.org/abs/1607.06520
- Manzini et al. 2019: "Black is to Criminal as Caucasian is to Police: Detecting and Removing Multiclass Bias in Word Embeddings"
- Johannsen, Hovy and Søgaard 2015: "Cross-lingual syntactic variation over age and gender"
- Hovy and Søgaard 2015: "Tagging performance correlates with author age"
Books
- Raschka, Sebastian, and Vahid Mirjalili. Python Machine Learning, 2nd Ed. Packt Publishing, 2017.
- Nielsen, Michael. Neural Networks and Deep Learning, 2018. Web book available free here.
- Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press, 2016. Web book available free here.
- Barabási, Albert-László. Network Science. Cambridge University Press, 2016. Web book available free here.
- Gimond, Manuel. Intro to GIS and Spatial Analysis. Preprint, 2017. Web book available free here.
- Jurafsky, Dan, and James H. Martin. Speech and language processing. Vol. 3. London: Pearson, 2014.