Wednesday, March 4, 2015

Cutting down data

I cut the dataset down to a smaller sample of 500 entries, but it still fails: I get an OutOfMemoryError referring to the Java heap space. Perhaps I will try allocating more memory to the JVM from the command line (the -Xmx option sets the maximum heap size).

With only 100 lines, there are no crashes.
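
For reference, here is a rough sketch of how a sample like this could be cut in Python 3. It assumes the data has one entry per line; the function name and file paths are made up for illustration and are not the ones I actually used.

```python
from itertools import islice


def take_first_lines(src_path, dst_path, n):
    """Copy at most the first n lines of src_path into dst_path."""
    with open(src_path, "r", encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        # islice stops after n lines, so the rest of the file is never read.
        dst.writelines(islice(src, n))
```

Calling take_first_lines("full.txt", "sample.txt", 100) would write a 100-line sample. If an entry spans several lines, cutting by line count can leave the last entry incomplete.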

Now I have something to work with, though it is quite the XML jumble.

Currently working with this tutorial:

https://docs.python.org/2/library/xml.etree.elementtree.html
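
A first step for untangling the jumble might be to print the element tree as an indented outline. This is only a sketch using xml.etree.ElementTree from that tutorial; the SAMPLE document and its tag names are invented stand-ins, since the real data's structure is not shown here, and outline is a helper name I made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real dataset's tags will differ.
SAMPLE = """<records>
  <record id="1"><title>First</title></record>
  <record id="2"><title>Second</title></record>
</records>"""


def outline(elem, depth=0):
    """Return one indented line per element: its tag plus its attributes."""
    lines = ["  " * depth + elem.tag + " " + str(elem.attrib)]
    for child in elem:
        # Iterating over an element yields its direct children.
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(SAMPLE)
print("\n".join(outline(root)))
```

For a file on disk, ET.parse(path).getroot() would take the place of ET.fromstring.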
