I was consumed with dreams about Stephen Hawking’s black hole entropy formula last night, which is frustrating because the math is, sadly, beyond me. But I mention it to you today because, knowing so little about black holes, my mind instead kept trying to change it into a formula to calculate the Financial Entropy of a webapp subscriber.
So, I put it to you, dear reader: have you, or any of your math-enabled friends, come up with a formula for calculating the financial entropy of a webapp subscriber? If you haven’t, but you could, there are many, many entrepreneurs who would sing your praises and happily buy you a drink.
On a related note: a little Googling brings up this fascinating looking book whose price of $199.95 puts it beyond anything I’ll buy without having a damn good idea of what exactly is inside. Also, why is it that I never heard the phrase “Financial Entropy” until last night in my dreams?
It was foggy the other day in Boston and I couldn’t help but take some pictures. Something about this picture just makes me feel like I’m home. I don’t really understand it though, because it’s the old parts of Boston that I truly love.
I’m not sure where I originally came across the extract_fixtures rake task (maybe here), but there’s nothing better than using real data to run your Rails unit tests. Well, real in the sense that it was generated by actually using your app. But there’s a problem with extract_fixtures. Once you get some real data to base your tests on, you don’t want it to change because that would break your tests. So, after the first run, extract_fixtures becomes almost useless because it’ll wipe out the fixtures you’ve been working with.
So what’s a girl to do? Well, in my case the answer is to beef it up a bit. The following version of extract_fixtures takes two optional parameters:
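OUTPUT_DIR=path/to/fixtures
OUTPUT_DIR is the directory the fixture files get written to. If you leave it out they go to test/fixtures, same as before.
TABLES=foos,bars,other_foos
TABLES takes a comma-delimited (no spaces) list of the table names that you want to extract. If you pass this in, it will only extract those tables.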
If you improve it further please drop me a line and I’ll add or link to your enhancements here.
    desc "Create YAML test fixtures from data in an existing database. " +
         " Defaults to development database. Set RAILS_ENV to override. " +
         "\nSet OUTPUT_DIR to specify an output directory. Defaults to test/fixtures. " +
         "\nSet TABLES (a comma separated list of table names) to specify which tables to extract. " +
         "Leaving it blank will extract all tables."
    task :extract_fixtures => :environment do
      sql = "SELECT * FROM %s"
      skip_tables = ["schema_info"]
      ActiveRecord::Base.establish_connection

      # Extract every table (minus skip_tables) unless TABLES names specific ones.
      if (not ENV['TABLES'])
        tables = ActiveRecord::Base.connection.tables - skip_tables
      else
        tables = ENV['TABLES'].split(/, */)
      end

      # Write to test/fixtures unless OUTPUT_DIR overrides it (trailing slash stripped).
      if (not ENV['OUTPUT_DIR'])
        output_dir = "#{RAILS_ROOT}/test/fixtures"
      else
        output_dir = ENV['OUTPUT_DIR'].sub(/\/$/, '')
      end

      (tables).each do |table_name|
        i = "000"
        File.open("#{output_dir}/#{table_name}.yml", 'w') do |file|
          data = ActiveRecord::Base.connection.select_all(sql % table_name)
          # Key each record as <table_name>_001, <table_name>_002, and so on.
          file.write data.inject({}) { |hash, record|
            hash["#{table_name}_#{i.succ!}"] = record
            hash
          }.to_yaml
          puts "wrote #{table_name} to #{output_dir}/"
        end
      end
    end
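If you want to see what the task produces without pointing it at a database, here’s a minimal sketch of its two small tricks in plain Ruby: splitting the TABLES value and building the fixture keys with inject. The sample rows and column names are made up to stand in for whatever select_all would return for your tables.

```ruby
require 'yaml'

# The same split the task applies to ENV['TABLES'].
tables = 'foos,bars,other_foos'.split(/, */)

# Made-up rows standing in for the result of select_all on a "foos" table.
records = [
  { 'id' => 1, 'name' => 'first foo' },
  { 'id' => 2, 'name' => 'second foo' }
]

table_name = tables.first
i = '000'.dup # succ! mutates the counter in place, so use an unfrozen copy

# Each record gets a key of the form <table_name>_NNN, counting up from 001.
fixtures = records.inject({}) do |hash, record|
  hash["#{table_name}_#{i.succ!}"] = record
  hash
end

puts fixtures.to_yaml
```

The keys come out as foos_001 and foos_002, which is why the fixture names in the generated files are stable from run to run as long as the rows come back in the same order.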
Ever since migrations were introduced to Rails I’ve heard nothing but praise for them, and truth be told, they are a far better way of setting up your database than the standard raw SQL import. But that’s where the goodness ends.
The problem is in the concept of going up or down in database versions. The core concept is great, to be able to roll back to a previous version of the database, but the implementation is completely out of sync with the version control systems we use to manage the codebase that depends on that database. I’ll use Subversion as an example because (for those of you still stuck in CVS land) every time you do a check-in the system gets tagged with a new revision number.
Everything starts out fine. The initial migration reflects the needs of the initial codebase. But after that they’re never the same. There’s no way to know what migration corresponds to what version of the codebase. What if in check-in 400 I add a new migration that changes the schema, and a few weeks later I need to roll back the codebase to revision 380? What migration number should I roll the database back to, if any? Unless you happen to remember what migration number you were on when the codebase was at 380, you’re screwed.
So what’s the solution?
Well, the solution starts with the schema.rb file that’s created every time you run a migration. If you’re like me, that gets checked in whenever it changes and the schema.rb file is always in sync with your codebase. So you’ve always got a representation of the appropriate database configuration for any revision of your codebase. That’s good. In fact that’s better than the actual migration scripts because it’s always in sync. But there’s one more problem…
**Priming the database**
Most webapps don’t function well with a completely empty database. There are usually some default settings, maybe an admin account to log in with the first time, things like that. And there are a few rake tasks out there to [bootstrap your database](http://rails.techno-weenie.net/forums/2/topics/778?page=1) with the contents of some .yaml files. There are others to [dump your database](http://media.pragprog.com/titles/fr_rr/code/CreateFixturesFromLiveData/lib/tasks/extract_fixtures.rake "warning, that's a raw rake file") into the same files for later import, or for generating real data to run your tests against.
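If you haven’t looked inside one of those bootstrap tasks, the loading side is roughly the extract task in reverse. Here’s a rough sketch in plain Ruby. It is not the code from either task linked above: `each_bootstrap_row`, the `settings` table and its columns are all names I made up for illustration, and a real task would hand each row to the database instead of collecting it in an array.

```ruby
require 'yaml'

# Hypothetical helper: parses fixture-style YAML and yields one row at a time,
# in label order, so rows load in the same order on every run.
def each_bootstrap_row(table_name, yaml_text)
  rows = YAML.load(yaml_text) || {}
  rows.sort_by { |label, _attributes| label }.each do |label, attributes|
    yield table_name, label, attributes
  end
end

# Made-up default settings, in the same shape extract_fixtures writes out.
SETTINGS_YAML = <<~YAML
  settings_002:
    id: 2
    name: signups_open
    value: "true"
  settings_001:
    id: 1
    name: site_title
    value: My Webapp
YAML

loaded = []
each_bootstrap_row('settings', SETTINGS_YAML) do |table, label, attributes|
  # A real bootstrap task would insert the row into the table here.
  loaded << [table, label, attributes['name']]
end

loaded.each { |table, label, name| puts "#{table}: #{label} (#{name})" }
```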
**Putting it all together**
If you make sure that your .yaml files for bootstrapping are always in sync (you should be keeping them in sync anyway) then you can avoid the problem entirely by just making your own schema.rb file (it’s just a migration file for all the tables at once) and never running `rake db:migrate` again. Instead, run your schema.rb (make sure it forces table creation) and then `rake db:bootstrap`. When you need to modify your database schema, don’t make a new migration script. It’ll only confuse matters down the road. Instead, modify your schema.rb and make sure your .yaml files are in sync (you’d have to do that even if you were using migrations).
[Damien Tanner](http://iamrice.org/) points out that migrations are *very useful* when migrating a live site. And I have to agree. My proposed solution is seriously lacking in that department. What would most likely be best is a blending of both concepts. Don’t use migrations to muck about with your database during development. Instead, create migration scripts *only* for migrating data on live sites. Don’t title them “`create_foo_table`” or “`add_foo_column`”; instead give them a name like “`live_site_version_4`” that makes it clear exactly what each one is for and how it pertains to your site.
There’s been a lot on my mind lately. Gears a-moving. Projects in motion. And things to say…
I’ve had a lot to say lately, not that you’d know it from my lack of posting. But that’s mostly because I find spam really demoralizing and some Arse-hole was manually submitting it! Bots I can understand; they suck, but at least it’s just some mindless thing that happened to find a site it knew how to submit to. But no. weblog.masukomi.org was a one-of-a-kind installation, so writing a bot for it would be pointless. ARGH, someone took the time to manually add spam to all my new posts!
I couldn’t take it.
So, I sucked in my gut, upgraded MySQL (after doubly covering my butt against potentially hosing client databases on this box), and installed [Mephisto](http://mephistoblog.com/) (edge version). I said “screw it” to dealing with making a custom theme. There’s just way too much on my plate right now that I consider far more important. Like:
* The new web based version of ListfulThinking (not released yet).
* Another web app I’ve been working on to support ListfulThinking, then realized I could sell it too!
* Bug fixes for [ServerWatcher](http://serverwatcher.masukomi.org/) (not formally released but available in [CVS](http://sourceforge.net/projects/serverwatcher))
* A book on managing open source projects.
* And the big one you’ll see in a couple days… Caterpillar 3.0.
Caterpillar 1 and 2 were extremely compact news aggregators written in Java with a Swing UI. Caterpillar 3.0 adds Bayesian filtering to find “interesting” articles. Why? Well, you see I, unlike most people, subscribe to a LOT of feeds. 220+ last time I checked. Mostly blogs of programmers like me or people like [Dooce](http://www.dooce.com/) who just rock. But, while I’m definitely the exception, you probably do exactly the same thing with mailing lists. You end up subscribed to many interesting lists, get flooded with so much mail you couldn’t possibly keep up with it all, and then get frustrated because you *know* you’ve missed something interesting in there.
So what’s the solution? The same thing Thunderbird and just about every other mail app uses to keep out spam. Just reverse it and apply it to feeds. Instead of learning what articles suck (spam), you train it on what articles rock.
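For the curious, here’s roughly what that looks like as a toy naive Bayes classifier. To be clear, this is not Caterpillar’s code (that’s Java); it’s a minimal sketch in Ruby, and `InterestFilter`, its methods and the sample articles are all made up for illustration. It counts words in articles you’ve rated, then scores a new article by how much more its words look like the “interesting” pile than the “boring” one, with add-one smoothing so unseen words don’t zero everything out.

```ruby
# Toy naive Bayes filter for feed articles. All names here are hypothetical.
class InterestFilter
  LABELS = [:interesting, :boring].freeze

  def initialize
    @word_counts = { interesting: Hash.new(0), boring: Hash.new(0) }
    @doc_counts  = Hash.new(0)
  end

  # Record the words of an article the reader has rated.
  def train(label, text)
    @doc_counts[label] += 1
    tokenize(text).each { |word| @word_counts[label][word] += 1 }
  end

  # Returns a value between 0 and 1: the estimated chance the text is interesting.
  def interesting_probability(text)
    scores = LABELS.map { |label| [label, log_score(label, text)] }.to_h
    # Shift by the largest score before exponentiating to avoid underflow.
    max = scores.values.max
    weights = scores.transform_values { |score| Math.exp(score - max) }
    weights[:interesting] / weights.values.sum
  end

  private

  def tokenize(text)
    text.downcase.scan(/[a-z']+/)
  end

  def vocabulary_size
    LABELS.flat_map { |label| @word_counts[label].keys }.uniq.size
  end

  # Log of (smoothed prior) times (smoothed likelihood of each word).
  def log_score(label, text)
    total_docs  = @doc_counts.values.sum
    total_words = @word_counts[label].values.sum
    # The extra 1 in the denominator leaves room for words never seen in training.
    denominator = total_words + vocabulary_size + 1
    prior = Math.log((@doc_counts[label] + 1.0) / (total_docs + LABELS.size))
    tokenize(text).inject(prior) do |sum, word|
      sum + Math.log((@word_counts[label][word] + 1.0) / denominator)
    end
  end
end

filter = InterestFilter.new
filter.train(:interesting, 'Rails migrations and rake tasks for fixtures')
filter.train(:interesting, 'Ruby tips for testing Rails apps')
filter.train(:boring, 'Celebrity gossip and sports scores')
filter.train(:boring, 'Sports results and weather gossip')

rails_score  = filter.interesting_probability('New rake tasks for Rails testing')
gossip_score = filter.interesting_probability('Weekend sports gossip')
puts "rails article:  #{rails_score.round(3)}"
puts "gossip article: #{gossip_score.round(3)}"
```

With only four training articles the numbers mean very little, but the Rails article should land well above 0.5 and the gossip one well below it. A real aggregator would need a lot more training data and smarter tokenizing.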
I’ve been sitting on this for over a year now. It’s a great proof of concept and I knew that with a bit more work it could be something sellable. Except, I have other projects that I am far more interested in working on and I realized that I’d been completely ignoring my own advice on when to set an app free.
As I’ve said before, I think that there are some great reasons not to open source an app. The biggest of which is money. But far too many developers keep things under wraps because they “might sell it someday”. Well, that’s just what I had been doing. Actually I was serious about selling it for a while but things changed. I met a girl named [Ruby](http://www.ruby-lang.org/) and fell in love. Java…. well, we won’t talk about him. Boys are icky anyway.
So, yeah. I commented out the registration bits, made some bug fixes, and will be releasing it shortly. But, I don’t really have much interest in setting up yet another project and adding yet another source of bugs needing fixing to my plate (what with two potentially profitable and far more enjoyable Ruby on Rails projects on the horizon). So it will be a drop shipment kind of release. Kersplat. code… dribble….
Regarding this site:
The feed is… well, don’t subscribe to it quite yet. I still need to hook it back into Feedster, which means it’ll probably be repointed within a few days, but it’s late, and I’ve been poking at annoying bits for hours now.
The old articles… no clue. Maybe I’ll take the time to import them. Maybe I won’t.
In the future you should expect a different kind of blogging from me. I’m planning on taking a [Paul Graham](http://www.paulgraham.com/) kind of approach to it. Well-written articles (unlike this one) with a point. I’ve got a few queued up in my head already.