<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/css" href="/stylesheets/rss.css"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">
  <channel>
    <title>8th Light Blog: Storing Binary Data in Postgres? Beware!</title>
    <link>http://blog.8thlight.com/articles/2007/02/18/storing-binary-data-in-postgres-beware</link>
    <language>en-us</language>
    <ttl>40</ttl>
    <description>In the minds of the craftsmen...</description>
    <item>
      <title>Storing Binary Data in Postgres? Beware!</title>
      <description>&lt;p&gt;If your situation matches the following conditions, beware!&lt;/p&gt;

&lt;ol&gt;
    &lt;li&gt;You&amp;#8217;re working in rails.&lt;/li&gt;
    &lt;li&gt;You&amp;#8217;re using Postgresql.&lt;/li&gt;
    &lt;li&gt;You&amp;#8217;re storing binary data in the database.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This was the situation on a project of mine.  We were storing PDFs and PNG images in our Postgres database.  Everything was fine during development and testing, where we used files that ranged up to a few dozen kilobytes in size.  The situation went rapidly downhill when file sizes reached hundreds of kilobytes to megabytes.  The worst part was that the errors we got were misleading and seemingly random.  They included:&lt;/p&gt;

&lt;ul&gt;
    &lt;li&gt;&lt;code&gt;undefined method `&amp;lt;&amp;lt;' for nil:NilClass&lt;/code&gt;&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;invalid end of buffer&lt;/code&gt;&lt;/li&gt;
    &lt;li&gt;&lt;code&gt;undefined class/module Packet&lt;/code&gt; (&lt;code&gt;Packet&lt;/code&gt; being a model class which was definitely defined)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These errors sent us on a wild goose chase.  The clue that finally pointed us toward the problem was the fact that it took 6 seconds to load a 4 MB PDF document from the database.  That was far too long, especially considering the same document could be loaded from a file instantaneously.&lt;/p&gt;

&lt;p&gt;Apparently, binary data stored in Postgresql&amp;#8217;s &lt;code&gt;bytea&lt;/code&gt; data type has to be parsed on save and load to escape and unescape certain characters.  Unfortunately the native C postgres gem doesn&amp;#8217;t do the parsing.  It&amp;#8217;s done in the &lt;code&gt;PostgreSQLAdapter.unescape_bytea&lt;/code&gt; and &lt;code&gt;PostgreSQLAdapter.escape_bytea&lt;/code&gt; methods (Ruby code) of ActiveRecord and the parsing is a bit too intensive for Ruby.  This is where the meltdown begins.  It consumes too much memory, or too much processing power, or &amp;#8230; well I don&amp;#8217;t know exactly.  But I know it breaks.&lt;/p&gt;
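
&lt;p&gt;To give a rough sense of the per-byte work involved, here is a minimal sketch of bytea-style escaping and unescaping in plain Ruby.  The method names are hypothetical and this is not the ActiveRecord code.  It only assumes the escaped text form in which a backslash is doubled and every byte outside printable ASCII becomes a three-digit octal escape, so a multi-megabyte file means millions of small string operations in each direction.&lt;/p&gt;

&lt;pre&gt;
```ruby
# Hypothetical helpers for illustration only; not the ActiveRecord implementation.
# Assumed format: backslash is doubled, bytes outside printable ASCII become \ooo.

def escape_bytea_sketch(data)
  data.each_byte.map do |byte|
    if byte == 92                  # a literal backslash is written as two backslashes
      "\\\\"
    elsif byte.between?(32, 126)   # printable ASCII passes through unchanged
      byte.chr
    else                           # everything else becomes a three-digit octal escape
      format("\\%03o", byte)
    end
  end.join
end

def unescape_bytea_sketch(text)
  # Split into tokens: a doubled backslash, an octal escape, or a single character.
  bytes = text.scan(/\\\\|\\[0-7]{3}|./m).map do |token|
    if token == "\\\\"
      92
    elsif token.start_with?("\\")
      token[1, 3].oct
    else
      token.ord
    end
  end
  bytes.pack("C*")                 # rebuild the original binary string
end
```
&lt;/pre&gt;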

&lt;p&gt;We refactored our model such that all the binary data gets stored in flat files on disk rather than in the database.  After this, our Rails app came back to life.  It was MUCH faster too!  &lt;/p&gt;

&lt;p&gt;Here&amp;#8217;s hoping that if and when you encounter this problem, Google points you to this blog entry and you find it helpful.&lt;/p&gt;</description>
      <pubDate>Sun, 18 Feb 2007 06:05:00 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:f828a6f0-869a-4c89-86f8-f50cc6561beb</guid>
      <author>Micah</author>
      <link>http://blog.8thlight.com/articles/2007/02/18/storing-binary-data-in-postgres-beware</link>
      <category>Coding</category>
      <category>Micah</category>
    </item>
    <item>
      <title>"Storing Binary Data in Postgres? Beware!" by fotos@funtasia.gr</title>
      <description>&lt;p&gt;Well you are mostly correct. The ruby escaping/unescaping happens only if you use the pure ruby postgresql adapter and not the native one. &lt;/p&gt;

&lt;p&gt;If you install this gem: postgres-pr then you get the pure ruby (pr) adapter. If you install this gem: postgres you get the native (as in C) adapter (homepage: &lt;a href="http://ruby.scripting.ca/postgres/" rel="nofollow"&gt;http://ruby.scripting.ca/postgres/&lt;/a&gt;). I think it also goes by the name ruby-postgres (&lt;a href="http://rubyforge.org/projects/ruby-postgres/" rel="nofollow"&gt;http://rubyforge.org/projects/ruby-postgres/&lt;/a&gt;). &lt;/p&gt;

&lt;p&gt;You can see in activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb that if the PGconn object responds to the &amp;#8216;escape_bytea&amp;#8217; method, it will try to do it that way. And the native adapter does it in C, using the libpq interface (&lt;a href="http://www.postgresql.org/docs/8.2/static/libpq-exec.html#LIBPQ-EXEC-ESCAPE-BYTEA" rel="nofollow"&gt;http://www.postgresql.org/docs/8.2/static/libpq-exec.html#LIBPQ-EXEC-ESCAPE-BYTEA&lt;/a&gt;), which should be fast enough.&lt;/p&gt;

&lt;p&gt;If you run Windows you can get the native adapter (precompiled) from &lt;a href="http://www.vandomburg.net/pages/postgres-ruby-windows" rel="nofollow"&gt;http://www.vandomburg.net/pages/postgres-ruby-windows&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Also if you use PostgreSQL 8.2 you need to patch postgresql_adapter and add an &amp;#8216;E&amp;#8217; for the quoting of binary strings.&lt;/p&gt;

&lt;p&gt;Have fun and do better research next time! :)&lt;/p&gt;

&lt;p&gt;-fot&lt;/p&gt;</description>
      <pubDate>Mon, 16 Jul 2007 17:26:11 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:d84544cf-14d5-480a-9b64-9089b6843147</guid>
      <link>http://blog.8thlight.com/articles/2007/02/18/storing-binary-data-in-postgres-beware#comment-278</link>
    </item>
    <item>
      <title>"Storing Binary Data in Postgres? Beware!" by Fotos Georgiadis</title>
      <description>&lt;p&gt;Well you are mostly correct. The ruby escaping/unescaping happens only if you use the pure ruby postgresql adapter and not the native one. &lt;/p&gt;

&lt;p&gt;If you install this gem: postgres-pr then you get the pure ruby (pr) adapter. If you install this gem: postgres you get the native (as in C) adapter (homepage: &lt;a href="http://ruby.scripting.ca/postgres/" rel="nofollow"&gt;http://ruby.scripting.ca/postgres/&lt;/a&gt;). I think it also goes by the name ruby-postgres (&lt;a href="http://rubyforge.org/projects/ruby-postgres/" rel="nofollow"&gt;http://rubyforge.org/projects/ruby-postgres/&lt;/a&gt;). &lt;/p&gt;

&lt;p&gt;You can see in activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb that if the PGconn object responds to the &amp;#8216;escape_bytea&amp;#8217; method, it will try to do it that way. And the native adapter does it in C, using the libpq interface (&lt;a href="http://www.postgresql.org/docs/8.2/static/libpq-exec.html#LIBPQ-EXEC-ESCAPE-BYTEA" rel="nofollow"&gt;http://www.postgresql.org/docs/8.2/static/libpq-exec.html#LIBPQ-EXEC-ESCAPE-BYTEA&lt;/a&gt;), which should be fast enough.&lt;/p&gt;

&lt;p&gt;If you run Windows you can get the native adapter (precompiled) from &lt;a href="http://www.vandomburg.net/pages/postgres-ruby-windows" rel="nofollow"&gt;http://www.vandomburg.net/pages/postgres-ruby-windows&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Have fun and do better research next time! :)&lt;/p&gt;

&lt;p&gt;-fot&lt;/p&gt;</description>
      <pubDate>Wed, 04 Jul 2007 19:55:13 +0000</pubDate>
      <guid isPermaLink="false">urn:uuid:e70292db-9189-458b-b8b6-73639321175d</guid>
      <link>http://blog.8thlight.com/articles/2007/02/18/storing-binary-data-in-postgres-beware#comment-276</link>
    </item>
  </channel>
</rss>
