![]() |
Articles Feed |
Categories
Archives
- March 2010 (1)
- February 2010 (2)
- January 2010 (1)
- December 2009 (1)
- October 2009 (4)
- September 2009 (2)
- August 2009 (2)
- July 2009 (5)
- June 2009 (2)
- May 2009 (2)
- April 2009 (8)
- March 2009 (7)
- January 2009 (2)
- December 2008 (3)
- November 2008 (5)
- October 2008 (4)
- September 2008 (6)
- August 2008 (4)
- July 2008 (5)
- June 2008 (5)
- May 2008 (4)
- April 2008 (2)
- February 2008 (4)
- January 2008 (2)
- December 2007 (2)
- November 2007 (2)
- October 2007 (2)
- September 2007 (1)
- August 2007 (3)
- July 2007 (1)
- June 2007 (4)
- May 2007 (7)
- April 2007 (2)
- February 2007 (3)
- January 2007 (3)
- November 2006 (3)
- October 2006 (3)
- September 2006 (17)
- November 2004 (1)
Storing Binary Data in Postgres? Beware!
by: micah | February 18th, 2007 |
If your situation matches the following conditions, beware!
- You’re working in rails.
- You’re using Postgresql.
- You’re storing binary data in the database.
This was the situation on a project of mine. We were storing PDFs and PNG images in our Postgrs database. Everything was fine during development and testing where we used files that ranged up to a few dozen kilobytes in size. The situation went rapidly downhill when file sizes got up in the hundreds of kilobytes to megabytes. The worst part about it was that the errors we got were misleading and seemingly random. They included:
undefined method `<<' for nil:NilClassinvalid end of bufferundefined class/module Packet(Packetbeing a model class which was definitely defined)
These errors sent us on a wile goose chase. The clue that finally pointed us toward the problem was the fact that it took 6 seconds to load a 4M PDF document from the database. That was far too long especially considering the same document could be loaded from a file instantaneously.
Apparently, binary data stored in Postgresql’s bytea data type has to be parsed on save and load to escape and unescape certain characters. Unfortunately the native C postgres gem doesn’t do the parsing. It’s done in the PostgreSQLAdapter.unescape_bytea and PostgreSQLAdapter.escape_bytea methods (Ruby code) of ActiveRecord and the parsing is a bit too intensive for Ruby. This is where the meltdown begins. It consumes too much memory, or too much processing power, or … well I don’t know exactly. But I know it breaks.
We refactored our model such that all the binary data gets stored in flat files on disk rather than in the database. After this, our Rails app came back to life. It was MUCH faster too!
Here’s hoping that if and when you encounter this problem, Google points you to this blog entry and you find it helpful.
