Speed up MySQL in Rails

I was looking at how ActiveRecord objects are instantiated from the database, and how the @attributes attribute is populated. There is a method called all_hashes which generates the hashes that are used for @attributes, so I looked at this. Generating a whole hash for each database row is a little bit expensive in time and memory. Perhaps we can do better?

I note that Stefan Kaes did some work a year and a half ago on implementing all_hashes in C, but it still relies on hashes being generated. I don’t think anyone has attempted what I do here.

So without further ado, I present slim_attributes, the non-hash implementation of all_hashes.

Here are the important but unscientific benchmarks (to give you an idea). Notice that the speed relative to using plain ActiveRecord depends on how many attributes are accessed in the model objects (because slim_attributes lazily instantiates them into strings). There were 2 models used; one had 44 and the other had 104 attributes.


View the plugin here, and install with:

script/plugin install http://pennysmalls.com/rails_plugins/slim_attributes

then follow the instructions to compile it given in the README below (yes, it should be made into a gem that compiles itself).

[UPDATE: there is now a better rubygem, see here and here.]

==========

SlimAttributes

This is a small patch to the ActiveRecord MySQL adapter that stops Rails from using the existing all_hashes / each_hash mechanism, which is what is called when you do a find.

It is faster, and uses less memory.

Measuring with just ActiveRecord code (fetching stuff from the database), we see anything from very little up to a 50% (or more) speed increase, but I suppose it really depends on your system and environment, and what you are doing with the results from the database. Measure your own system and send me the results!

Installation

You’re going to need the mysql headers for this to work.

cd vendor/plugins/slim_attributes
ruby extconf.rb --with-mysql-config
make
sudo make install

Description

The reason for overriding all_hashes is threefold:

* making a hash of each and every row returned from the database is slow
* ruby makes frozen copies of each column name string (for the keys), which results in a great many strings that are not really needed
* we observe that it’s not often that all the fields of rows fetched from the database are actually used

So this is an alternative implementation of all_hashes that returns a ‘fake hash’ which contains a hash of the column names (the same hash of names is used for every row), and also contains the row data in an area memcpy’d directly from the mysql API.

The field contents are then instantiated into Ruby strings on demand: ruby strings are only made if you need them. Note that if you always look at all the columns when you fetch data from the database then this won’t necessarily be faster than the unpatched mysql adapter. But it won’t be much slower either, and we do expect that most times not all the columns from a result set are accessed.

Note that the ‘fake hash’ quacks like a hash in many ways, but not all ways. So @attributes in an ActiveRecord object may not behave as you are expecting it to, and in particular it won’t work if you try to add a key to it that is not a column name in the result set.

@attributes["not a column name"] = "something"
=> RuntimeError: Key was not a column name from the result set
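To make the idea concrete, here is a minimal pure-Ruby sketch of such a fake hash. It is not the plugin's code (that is a C extension working on a memcpy'd buffer). The class name LazyRow, the materialized_count helper, and the choice to return nil when reading an unknown key are assumptions made purely for illustration.

```ruby
# Hypothetical illustration only: the real slim_attributes is written in C.
class LazyRow
  # column_index: one Hash (column name => position) shared by every row
  # raw_row:      Array of raw field values, standing in for the memcpy'd data
  def initialize(column_index, raw_row)
    @column_index = column_index
    @raw_row = raw_row
    @materialized = {} # values that have actually been asked for or assigned
  end

  # Build the Ruby string for a column only on first access.
  def [](name)
    return @materialized[name] if @materialized.key?(name)
    pos = @column_index[name]
    return nil if pos.nil? # assumption: unknown key reads as nil, like Hash
    raw = @raw_row[pos]
    @materialized[name] = raw.nil? ? nil : raw.dup
  end

  # Only keys that are column names of the result set may be assigned.
  def []=(name, value)
    unless @column_index.key?(name)
      raise RuntimeError, "Key was not a column name from the result set"
    end
    @materialized[name] = value
  end

  def key?(name)
    @column_index.key?(name)
  end

  def keys
    @column_index.keys
  end

  # Helper for this sketch: how many values have been instantiated so far.
  def materialized_count
    @materialized.size
  end
end

# The same column index object is reused for every row of the result set.
SHARED_INDEX = { "id" => 0, "name" => 1, "email" => 2 }.freeze
rows = [["1", "Ann", "ann@example.com"], ["2", "Bob", nil]].map do |raw|
  LazyRow.new(SHARED_INDEX, raw)
end

rows.each { |row| puts row["name"] } # only the "name" strings are created
```

The point of the design is that the per-row cost is one small object plus the raw data; the column names exist once, and field strings are created only when something reads them.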

Hash has many methods that are not supported by the fake hash, but I found that the ones I have implemented have been sufficient for use in our Rails app. It should be fairly easy to implement most of the missing methods if needed, but I did not wish this patch to be larger than necessary.

===========

No warranty: this plugin should be considered experimental and likely needs some more work if you want it to be foolproof. However, that said, we are using it in our production environment with good results.

==========

Finally it’s interesting to note that Dan Chak wrote some code to actually return hashes from the database rather than ActiveRecord objects, when you just want the data without any fancy associations and so on. It’s much faster, proving that creating the ActiveRecord objects is fairly slow. I’ll take a look at combining this with slim_attributes; returning fake hashes should be faster still. (Combining his 50% improvement with my 50% should yield instant results 🙂)

Update

I have now tested hash_extension with and against slim_attributes. My test fetched all records from two separate ActiveRecord models 100 times.

Plain ActiveRecord 38.3s
Using find_as_hashes 35.1s
Using slim_attributes 13.0s
Using both 10.4s

Clearly slim_attributes makes the biggest difference, but it should be noted that this is really the ideal case – where Model.find(:all) is done without actually accessing any of the attributes.
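If you want a feel for how the gain shrinks as more attributes are read, a rough stand-alone experiment along these lines may help. It does not touch ActiveRecord, MySQL, or the plugin: eager_fetch and lazy_read are hypothetical stand-ins for "build a full hash per row" and "instantiate a string on demand", and the row and column counts are arbitrary (44 columns simply echoes the smaller model mentioned earlier). Treat the timings as illustrative only.

```ruby
# Hypothetical micro-benchmark; names and sizes are assumptions for illustration.

# Time a block with the monotonic clock and return elapsed seconds.
def elapsed
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

BENCH_COLUMNS = (1..44).map { |i| "col#{i}" }.freeze
BENCH_INDEX = BENCH_COLUMNS.each_with_index.to_h.freeze # name => position
RAW_ROWS = Array.new(2_000) { |r| BENCH_COLUMNS.map { |c| "#{c}-#{r}" } }

# Eager: copy every field and build one full Hash per row.
def eager_fetch(raw_rows)
  raw_rows.map { |raw| BENCH_COLUMNS.zip(raw.map(&:dup)).to_h }
end

# Lazy: copy a field only when it is asked for, via the shared index.
def lazy_read(raw, name)
  raw[BENCH_INDEX.fetch(name)].dup
end

[0, 5, 44].each do |accessed|
  names = BENCH_COLUMNS.first(accessed)
  eager = elapsed { eager_fetch(RAW_ROWS).each { |h| names.each { |n| h[n] } } }
  lazy  = elapsed { RAW_ROWS.each { |raw| names.each { |n| lazy_read(raw, n) } } }
  puts format("%2d attributes accessed: eager %.4fs, lazy %.4fs", accessed, eager, lazy)
end
```

With zero attributes accessed the lazy path does almost no work, which corresponds to the ideal case measured above; reading all 44 is the case where the two approaches should be closest.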