Quick Hack: Nokogiri backed Hash-ish class
July 20th, 2009
Many months ago I was working on some code that needed to handle small bits of XML. For the sake of expediency I went with using Rails’ Hash.from_xml method. This worked great for quite a while.
Eventually, the XML I was handling got quite a bit bigger. Now I'm dealing with several megabytes of XML in each file. With this much data Hash.from_xml is painfully slow: it maxes out the CPU for several minutes.
So, I spent a few minutes on Friday afternoon and hacked together a class that acts like the data structure that Hash.from_xml returns. However, instead of building the entire data structure up front, it implements a handful of Hash and Array methods that call Nokogiri methods on demand. Even when parsing the entire document, it is more than an order of magnitude faster, roughly 30 times in the benchmark below (times in seconds).
                     user     system      total        real
Hash.from_xml  663.820000   3.220000 667.040000 (675.076590)
NokoHash        22.410000   0.190000  22.600000 ( 22.771999)
I also threw in some #method_missing magic to allow for method-call style access of the structure instead of just hash/key access.
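The method-call style access can be sketched roughly like this. The MethodAccess module is a hypothetical name and this is not the code from the post; it assumes only that the object responds to [] and key?, so a plain Hash is used here to keep the example self-contained.

```ruby
# Hypothetical illustration: turns unknown method calls into key lookups
# on any object that responds to [] and key?.
module MethodAccess
  def method_missing(name, *args, &block)
    key = name.to_s
    # Only treat bare calls (no arguments, no block) on known keys as lookups.
    return super unless args.empty? && block.nil? && key?(key)

    value = self[key]
    # Nested hashes get the same behaviour so calls can be chained.
    value.is_a?(Hash) ? value.extend(MethodAccess) : value
  end

  # Keep respond_to? consistent with method_missing.
  def respond_to_missing?(name, include_private = false)
    key?(name.to_s) || super
  end
end

data = { 'catalog' => { 'book' => { 'title' => 'A' } } }.extend(MethodAccess)
title = data.catalog.book.title
```

One caveat with this approach: a key that shares its name with an existing method (count or size on a Hash, for example) never reaches method_missing, so those keys still need the data['count'] form.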
Warning
This code is only tested as far as my specific needs for this specific system. I have precisely zero confidence that this will work smoothly for you. I just found this to be an interesting bit of code, and I hope you do too.
1 Response to “Quick Hack: Nokogiri backed Hash-ish class”
July 21st, 2009 at 07:40 AM
late loading is a beautiful thing when it comes to performance and memory constraints. nice job.