Waleed Ammar of CMU

To watch GPU usage in real time: watch -n 0.1 nvidia-smi
If multiple GPUs are available, you can limit which ones a process uses by setting the CUDA_VISIBLE_DEVICES environment variable before launching it, e.g. CUDA_VISIBLE_DEVICES=5
Disk out of space and not sure which directories to blame: du -a /home | sort -n -r | head -n 10
1:1 meetings
apt-get cheatsheet
Markdown cheatsheet
checkout a remote branch: git checkout --track origin/branch_name
create a new branch: git checkout -b new_branch_name
(ana)conda cheat sheet.
To grep multiple files for several patterns at the same time: grep -E 'fatal|error|critical|failure|warning' *.log
To recursively find the size of a directory on linux: du -sh ~
On picking/installing/configuring a GPU machine for deep learning.
Slav Ivanov's blog post on picking/installing/configuring a GPU machine for deep learning.
Change access permissions to a bunch of files/directories at once.
Wikipedia and Wiktionary dumps.
the only thing you need to remember about string encoding in python is: decode what you read, encode what you write.
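A minimal sketch of that rule, assuming Python 3 and UTF-8 (the file name and sample text are made up for illustration): bytes coming in are decoded to str, and str going out is encoded to bytes.

```python
import os
import tempfile

text = "naïve café 日本語"  # a str holding non-ASCII characters

# Encode what you write: str -> bytes, with an explicit encoding.
path = os.path.join(tempfile.mkdtemp(), "example.txt")  # hypothetical file
with open(path, "wb") as f:
    f.write(text.encode("utf-8"))

# Decode what you read: bytes -> str, using the same encoding.
with open(path, "rb") as f:
    raw = f.read()
decoded = raw.decode("utf-8")

round_trip_ok = decoded == text
```

Opening the file in text mode with an explicit encoding argument, e.g. open(path, encoding="utf-8"), does the same decoding and encoding for you.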
max marginal pruning (in vine.)
A nice explanation of LSTMs.
It's expensive to compute the softmax layer over the vocabulary to compute p(word|context). Three solutions which have been shown to work are: hierarchical softmax (Goodman 2001), noise contrastive estimation (Gutmann and Hyvarinen 2010), self-normalizing neural networks (Devlin et al. 2014).
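To make the first idea concrete, here is a rough sketch of a two-level, class-factored softmax, the simplest form of the hierarchical approach: p(word|context) = p(class|context) * p(word|class, context). The class assignment (round-robin), the random parameters, and all names below are assumptions for illustration only, not taken from any of the cited papers.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


class TwoLevelSoftmax:
    """p(word|context) = p(class|context) * p(word|class, context)."""

    def __init__(self, vocab, num_classes, dim, seed=0):
        rng = random.Random(seed)
        # Hypothetical class assignment: round-robin over the vocabulary.
        self.word_class = {w: i % num_classes for i, w in enumerate(vocab)}
        self.class_words = [
            [w for w in vocab if self.word_class[w] == c]
            for c in range(num_classes)
        ]
        # Randomly initialized (untrained) parameters.
        self.class_vecs = [
            [rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(num_classes)
        ]
        self.word_vecs = {
            w: [rng.gauss(0, 0.1) for _ in range(dim)] for w in vocab
        }

    def prob(self, word, context):
        c = self.word_class[word]
        # Normalize over the classes only ...
        p_class = softmax([dot(v, context) for v in self.class_vecs])[c]
        # ... then over the words of that one class only.
        members = self.class_words[c]
        scores = [dot(self.word_vecs[w], context) for w in members]
        p_word = softmax(scores)[members.index(word)]
        return p_class * p_word


vocab = ["w%d" % i for i in range(10)]
model = TwoLevelSoftmax(vocab, num_classes=3, dim=4)
context = [0.5, -1.0, 0.25, 2.0]
total = sum(model.prob(w, context) for w in vocab)  # should be ~1.0
```

Each call normalizes over the classes plus the words of a single class, rather than over the whole vocabulary, which is where the savings come from.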
Chris Dyer's blog.
a paper that compares "off the shelf" dependency parsers.
a tutorial and a blog post on spectral clustering.
Andrew Jones' brief explanation for Xavier Glorot's initialization of neural network parameters.
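As a rough illustration, the uniform variant of Glorot initialization draws each weight from U(-a, a) with a = sqrt(6 / (fan_in + fan_out)), which gives the weights a variance of 2 / (fan_in + fan_out). The function name and layer sizes below are hypothetical.

```python
import math
import random


def glorot_uniform(fan_in, fan_out, seed=0):
    """Return a fan_in x fan_out weight matrix drawn from U(-a, a)."""
    rng = random.Random(seed)
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [
        [rng.uniform(-limit, limit) for _ in range(fan_out)]
        for _ in range(fan_in)
    ]


# Hypothetical layer: 300 inputs, 100 outputs.
weights = glorot_uniform(300, 100)
flat = [w for row in weights for w in row]
mean = sum(flat) / len(flat)
variance = sum((w - mean) ** 2 for w in flat) / len(flat)
target_variance = 2.0 / (300 + 100)
```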
Sergey Sundukovskiy's slides on prototypes, minimal viable products (MVPs), etc.
Brendan's tool for visualizing syntactic trees (parseviz).
A bunch of monolingual corpora.
Compress: tar -czvf compressed-output.tar.gz uncompressed-input-files.* OR tar -cvf mystuff.tar foo.tex fig1.eps fig2.eps && gzip mystuff.tar
Decompress: tar -xzvf compressed-input.tar.gz [-C uncompressed-output-dir]
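The same compress/decompress round trip can be sketched from Python with the standard tarfile module. The file and directory names below are placeholders created in a temporary directory.

```python
import os
import tarfile
import tempfile

work_dir = tempfile.mkdtemp()

# A small placeholder input file.
source_path = os.path.join(work_dir, "notes.txt")
with open(source_path, "w", encoding="utf-8") as f:
    f.write("hello tar\n")

# Compress: create a gzip-compressed tar archive.
archive_path = os.path.join(work_dir, "compressed-output.tar.gz")
with tarfile.open(archive_path, "w:gz") as archive:
    archive.add(source_path, arcname="notes.txt")

# Decompress into a separate output directory.
output_dir = os.path.join(work_dir, "uncompressed-output-dir")
os.makedirs(output_dir)
with tarfile.open(archive_path, "r:gz") as archive:
    archive.extractall(output_dir)

with open(os.path.join(output_dir, "notes.txt"), encoding="utf-8") as f:
    restored = f.read()
```

Only extract archives you trust this way, since extractall writes whatever paths the archive contains.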
Boyd and Vandenberghe's book "Convex Optimization"
to initialize submodules after a `git clone', execute `git submodule update --init' at the root directory (reference).
Groups, rings, fields, and vector spaces
Unicode points for Math symbols, Greek letters, Math operators (handy for preparing slides).
Tips and tricks in stochastic gradient descent land.
How to use stochastic gradient descent with L1-regularization? prox-grad, dual averaging, FTRL
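A minimal sketch of the prox-grad option: take an ordinary gradient step on the loss, then apply the proximal operator of the L1 penalty, which is soft-thresholding. The function names and the numbers in the example are made up for illustration.

```python
def soft_threshold(value, threshold):
    """Proximal operator of threshold * |value|: shrink toward zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def prox_sgd_step(weights, gradient, learning_rate, l1_strength):
    """One proximal SGD step for loss(w) + l1_strength * ||w||_1."""
    return [
        soft_threshold(w - learning_rate * g, learning_rate * l1_strength)
        for w, g in zip(weights, gradient)
    ]


# With a zero gradient, the step only applies the L1 shrinkage:
# small weights are set exactly to zero, which is how sparsity arises.
updated = prox_sgd_step(
    weights=[1.0, 0.05, -1.0],
    gradient=[0.0, 0.0, 0.0],
    learning_rate=0.1,
    l1_strength=1.0,
)
```

Unlike adding the L1 subgradient to the update, this step produces weights that are exactly zero.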
installing standard R packages, custom packages in R, and what to do when C++ compilation fails while installing custom R packages
locality sensitive hashing (LSH)
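One common LSH family, sketched here under assumed names and sizes, uses random hyperplanes: each bit of the signature records which side of a random hyperplane a vector falls on, so vectors separated by a small angle tend to agree on most bits.

```python
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random hyperplane normals with Gaussian entries."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_bits)]


def signature(vector, hyperplanes):
    """One bit per hyperplane: 1 if the vector is on its positive side."""
    return tuple(
        1 if sum(h * v for h, v in zip(plane, vector)) >= 0 else 0
        for plane in hyperplanes
    )


def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))


planes = make_hyperplanes(num_bits=16, dim=3)
v = [1.0, 2.0, -0.5]
same_direction = [2.0, 4.0, -1.0]      # v scaled by 2
opposite_direction = [-1.0, -2.0, 0.5]  # v negated

d_same = hamming(signature(v, planes), signature(same_direction, planes))
d_opposite = hamming(signature(v, planes), signature(opposite_direction, planes))
```

Vectors pointing the same way share a signature, and opposite vectors differ on every bit; in practice the signature (or chunks of it) is used as a hash-table key to find candidate neighbors.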
history of deep learning
count-min sketches (a cool data structure that approximates counts of elements in a stream)
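A small sketch of the data structure, assuming SHA-256 with a row prefix as the hash family (a choice made here for simplicity, not a requirement): each item increments one counter per row, and the estimate is the minimum over rows, so it can overestimate a count but never underestimate it.

```python
import hashlib


class CountMinSketch:
    def __init__(self, width=1000, depth=5):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # Assumed hash family: SHA-256 of "row:item", reduced modulo width.
        digest = hashlib.sha256(f"{row}:{item}".encode("utf-8")).hexdigest()
        return int(digest, 16) % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        # Collisions only ever add to a counter, so the minimum is the
        # tightest available upper bound on the true count.
        return min(
            self.table[row][self._index(item, row)]
            for row in range(self.depth)
        )


sketch = CountMinSketch()
for word in ["the", "cat", "the", "the", "cat", "the", "the"]:
    sketch.add(word)

the_estimate = sketch.estimate("the")
cat_estimate = sketch.estimate("cat")
```

Memory use is width * depth counters regardless of how many distinct items are seen.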
style guidelines for python
an introduction to GCC
NLP conferences
simulations of beta (and other) distribution density
evaluating clusterings (a ps version of the paper which I like more)
sequence labeling tutorial
configure; make; make install
step-by-step example for using GDB within Emacs to debug a C or C++ program. See this for more GDB commands.
gentle tutorial on using valgrind to find memory problems in c++ code
using screen to survive dropped ssh connections while running your jobs
productivity tips for using ssh
blacklight frontend machine blacklight.psc.teragrid.org
learning topic models; beyond svd. slides, paper
mit's matrix cookbook, and Tom Minka's awesome writeup on matrix derivatives.
EM tutorial
Why are the objectives of logistic regression and crf models convex?
LaTeX on blogger
git concepts
Eigen: a c++ matrix library