Some background notes for work on machine learning
For now, this is an ad hoc page of links and ideas.
Machine learning and language technology
Mainstream language technology is dominated by neural networks. They provide powerful tools for language modeling, and there is now an increasing number of university classes, seminars and workshops about them.
Good to read
- NN for NLP course at Carnegie Mellon University
- NLPL winter school
- also interesting: Recent trends in deep learning
Things we have looked at
- Grave et al. 2018: Learning Word Vectors for 157 Languages
- Modelling Natural Language, Programs, and their Intersection (slides)
Ideas we have
- MT baseline, cf. our rule-based MT
- Spell-checker suggestions, cf. our work on proofing
- Missing-list component
Possible student projects
We have language data (in Korp now, but we have more):
- sme: over 32M tokens, almost 3M sentences
- nob-sme: almost 2.5M tokens, more than 150K sentences