dm.cs.tu-dortmund.de/en/mlbits/sequential-models-maximum-entropy-models/
Maximum Entropy Markov Models (MEMM) – Lecture Notes
\(P(\ldots,y_2) = .3\cdot .3 \cdot .3 = 0.027\)
\(P(\textcolor[RGB]{0,155,170}{y_1,y_2,y_1,y_2}) = .6\cdot .2 \cdot .5 = 0.06\)
\(P(\textcolor[RGB]{191,2,127}{y_1,y_1,y_2,y_2}) = .4\cdot .55 \cdot .3 = 0.066\)
Most [...] w y_2\) is most likely
In \(y_2\), \(y_2\rightarrow y_2\) is most likely
\(P(\textcolor[RGB]{132,184,24}{y_1,y_1,y_1,y_1}) = .4\cdot .45 \cdot .5 = 0.09\)
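The path products above can be checked with a few lines of Python. This is only a sketch: the factors are copied from the three fully legible products in these notes, and the names `path_factors`, `path_probs`, and `best_path` are made up for illustration. It shows that paths are compared as a whole: the first factor .6 looks better than .4 locally, yet the best of these paths starts with .4.

```python
from math import prod

# Per-step factors, copied from the path products in the notes.
# Only paths whose label sequence and factors are fully legible are listed.
path_factors = {
    ("y1", "y2", "y1", "y2"): [0.6, 0.2, 0.5],
    ("y1", "y1", "y2", "y2"): [0.4, 0.55, 0.3],
    ("y1", "y1", "y1", "y1"): [0.4, 0.45, 0.5],
}

# The probability of a path is the product of its per-step factors.
path_probs = {path: prod(factors) for path, factors in path_factors.items()}

# Compare whole paths, not single steps.
best_path = max(path_probs, key=path_probs.get)

for path, p in sorted(path_probs.items(), key=lambda item: -item[1]):
    print(",".join(path), round(p, 3))
```

Enumerating paths like this only works for tiny examples, since the number of paths grows exponentially with the sequence length.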
\(P(\textcolor[RGB]{246,180,63}{y_2,y_2,y_2,y_2}) [...] were unknown.
But: in natural-language text, most unknown words are actually nouns (names)!
Idea 2: use word similarity to infer the most likely type.
Uppercase: most likely a noun. Lowercase: ?
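The capitalization rule can be written as a tiny fallback for unknown words. This is a minimal sketch, not the method from the lecture: the function name `guess_unknown_tag` and the tag string `"NOUN"` are hypothetical, and it covers only the uppercase case, not word similarity. For lowercase words it returns no guess, matching the open question above.

```python
from typing import Optional


def guess_unknown_tag(word: str) -> Optional[str]:
    """Guess a tag for an unknown word from its first letter.

    Hypothetical helper: "NOUN" is a placeholder tag name, not taken
    from any particular tag set.
    """
    # Uppercase first letter: most likely a noun (name).
    if word[:1].isupper():
        return "NOUN"
    # Lowercase (or empty) word: no guess from this rule alone.
    return None


print(guess_unknown_tag("Dortmund"))
print(guess_unknown_tag("quickly"))
```

A fixed rule like this ignores sentence-initial capitalization, which is one reason to treat capitalization as one signal among several rather than as a final decision.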
Can we learn …