Tuesday, January 20, 2015

Getting Stanford CoreNLP working

It's not trivial.

Installing on Ubuntu.
Had to get java on there
sudo apt-get install default-jdk

Or do I just need to do ant?
sudo apt-get install ant

https://ant.apache.org/manual/tutorial-HelloWorldWithAnt.html

Tried to run ant but got an error; I think it was complaining that it needed Java 1.8.

Going to try to get the real deal from here
http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html

Untar instructions, per usual
http://www.pendrivelinux.com/how-to-open-a-tar-file-in-unix-or-linux/
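
For reference, unpacking a gzipped tarball usually comes down to a single tar call. Here is a minimal sketch, wrapped in a small helper so the archive and destination are explicit. The function name and both paths are placeholders for illustration, not the actual JDK file name:

```shell
# unpack_tgz is a hypothetical helper name; the archive and destination
# arguments are placeholders, not the real JDK download name.
unpack_tgz() {
  archive="$1"
  dest="$2"
  mkdir -p "$dest"
  # -x extract, -z gunzip, -f read from the named file, -C unpack into dest
  tar -xzf "$archive" -C "$dest"
}

# Example: unpack_tgz jdk.tar.gz "$HOME/opt"
```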

---
I ended up going to just the server version of Ubuntu. The performance of the desktop version was awful on my machine, and I don't think I really need any GUI.

Tried using the default-jdk, forgetting that I needed Java 8. Trying to curl the download file directly proved problematic. After failing to get SSH going between my computer and the VM, I found this curl instruction, which adds some flags and a header to make it work:

http://stackoverflow.com/questions/10268583/how-to-automate-download-and-installation-of-java-jdk-on-linux
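
Roughly, the trick is to send a cookie header that stands in for clicking the license checkbox. The sketch below is an approximation of that idea, not a copy of the linked answer: the helper name is made up, the URL is left as an argument, and the cookie value is the one commonly quoted for this workaround, so treat it as an assumption that may stop working if Oracle changes the site.

```shell
# fetch_jdk is a hypothetical helper; pass the real download URL yourself.
# The cookie value is an assumption (commonly quoted for this workaround).
fetch_jdk() {
  url="$1"
  out="$2"
  # -L follows redirects, -H adds the header, -o names the output file
  curl -L -H "Cookie: oraclelicense=accept-securebackup-cookie" -o "$out" "$url"
}

# Example: fetch_jdk "<download URL from the Oracle page>" jdk.tar.gz
```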

Once the jdk was unpacked, I linked all the binaries inside to my personal "bin" folder:

http://stackoverflow.com/questions/1347105/linux-link-all-files-from-one-to-another-directory
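
The linking step can be sketched as a loop over the JDK's bin directory. This is one possible way to do it rather than the exact command from the link; the helper name and both directories are placeholders, and it assumes the destination folder is already on PATH.

```shell
# link_bins is a hypothetical helper; both directory arguments are placeholders.
link_bins() {
  src="$1"   # e.g. the bin/ directory inside the unpacked JDK (use an absolute path)
  dest="$2"  # e.g. "$HOME/bin", assumed to be on PATH already
  mkdir -p "$dest"
  for f in "$src"/*; do
    # skip the literal pattern if the source directory is empty
    [ -e "$f" ] || continue
    # -s symbolic link, -f replace an existing link of the same name
    ln -sf "$f" "$dest/$(basename "$f")"
  done
}

# Example: link_bins /path/to/jdk/bin "$HOME/bin"
```

An absolute source path matters here: a relative one would produce links that dangle once you change directory.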

Ran ant. Build successful! Now, how do I use this?

(Side note: I had an issue with alt-tabbing away from the VM; the solution is here: https://forums.virtualbox.org/viewtopic.php?f=9&t=8329)

---
Turns out I had to build the JARs first, with command "ant jar", while in the directory with the build.xml file.

https://ant.apache.org/manual/tutorial-HelloWorldWithAnt.html

Then tried running:

java -cp "*" -Xmx2g edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,pos,lemma,ner,parse,dcoref

but got errors. Looks like complaints about models. Wasn't I supposed to download these from somewhere?
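
Since the command uses -cp "*", anything the pipeline needs has to be sitting in the current directory as a jar. A quick way to check for the models is sketched below, assuming they ship as a separate jar with "models" in the file name (as in the packaged CoreNLP download); the helper name is made up for illustration.

```shell
# has_models_jar is a hypothetical helper. It assumes the models live in a
# jar whose file name contains "models".
has_models_jar() {
  dir="$1"
  for f in "$dir"/*models*.jar; do
    # the pattern stays literal when nothing matches, so test for existence
    [ -e "$f" ] && return 0
  done
  return 1
}

# Example: has_models_jar . || echo "no models jar next to the CoreNLP jar"
```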

---
Scratch a bunch of that, or possibly all of it. There was an actual download link directly on the Stanford website that appears to have everything ready to go. I missed it because it's center-aligned while the rest of the page is left-aligned... Anyway, I was able to get the interpreter running by directly copying the provided command.

Testing with some sample text, the tool seems to do something. But I don't really know how to interpret any of it.
