I had never heard of the acronym VLDB (very large database) until I tried to import a tab-delimited file with more than 77 million records into SQL Server Express. I didn’t realize what a headache I was in for.
For AffPortal.com I have purchased about 15 to 20 different databases to power the keyword research functionality and to build initial lists with. I was thinking I could offer this search function to members, which would add a lot of value to a subscription that is only $27/month to begin with. It’s a serious amount of data to pore through.
Every attempt to import this data into SQL Server failed for one reason or another: truncation errors, primary table disk space errors or freeze-ups. I even tried splitting the large flat file into about 40 smaller files to import one by one. Still no luck. After much forehead slapping, I was able to set the initial size of the database to about 4 GB and the import began. After about 35% of the import had completed, the database was full. I had maxed out the 4 GB limit on SQL Server Express, and the full-blown version of SQL Server costs thousands. Good times.
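For anyone trying the same split-the-file trick, here is a minimal Python sketch of one way to do it. The function name, the `chunk_NNN.tsv` naming and the UTF-8 encoding are assumptions for illustration, not the tool actually used here.

```python
from pathlib import Path


def split_file(src, out_dir, lines_per_chunk):
    """Split a text file into numbered chunks of at most lines_per_chunk lines.

    Hypothetical helper: the chunk naming and the UTF-8 encoding are assumptions.
    Returns the list of chunk paths in order.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    chunk_paths = []
    out = None
    # newline="" keeps the original line endings untouched on read and write.
    with open(src, "r", encoding="utf-8", newline="") as f:
        for i, line in enumerate(f):
            # Start a new chunk file every lines_per_chunk lines.
            if i % lines_per_chunk == 0:
                if out is not None:
                    out.close()
                path = out_dir / f"chunk_{len(chunk_paths):03d}.tsv"
                out = open(path, "w", encoding="utf-8", newline="")
                chunk_paths.append(path)
            out.write(line)
    if out is not None:
        out.close()
    return chunk_paths
```

The file is streamed line by line, so memory use stays flat no matter how big the source is. With roughly 77 million records, something like `split_file("keywords.txt", "chunks", 2_000_000)` would produce about 40 files (the file name is made up).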
My db vendor suggested I try Firebird, an open source db like MySQL. I downloaded it, installed it and, look at that, command line only, no user interface… no thanks. But wait, there are all these third-party UIs, awesome. So I start the import. BLAM. Primary table memory error again. Keep in mind this is after about 6 hours of messing with this, well into last Saturday morning.
I got an idea. Thanks to the guys at IronTech, I had a MySQL db running already on the server, and I knew it had a user-friendly GUI front end. So I crank it up. Wait, there is no import tool. So off to Google I go and find several, but I’m already in the hole quite a bit financially, so I don’t feel like spending another couple of hundred on an import tool. Instead I find a nice little bulk import query. Finally, something free, and in loads the database.
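The exact query isn't shown here, but MySQL's built-in bulk loader is the `LOAD DATA INFILE` statement, so it was probably something along these lines. Below is a rough Python sketch that builds such a statement for a tab-delimited file. The function name, the table name and the file path are all made up for illustration, and the line terminator is assumed to be `\n`.

```python
def build_load_data_sql(file_path, table, has_header=False):
    """Build a MySQL LOAD DATA statement for a tab-delimited file.

    Hypothetical helper: the table name and path come from the caller, and
    the "\\n" line terminator is an assumption about the file.
    """
    if "`" in table:
        raise ValueError("table name must not contain a backtick")
    # Escape backslashes and single quotes so the path is a valid SQL string literal.
    escaped = file_path.replace("\\", "\\\\").replace("'", "\\'")
    sql = (
        f"LOAD DATA LOCAL INFILE '{escaped}' "
        f"INTO TABLE `{table}` "
        "FIELDS TERMINATED BY '\\t' "
        "LINES TERMINATED BY '\\n'"
    )
    if has_header:
        # Skip the first line when it holds column names instead of data.
        sql += " IGNORE 1 LINES"
    return sql + ";"


# Example with a made-up path and table name.
print(build_load_data_sql("/data/keywords.txt", "keywords"))
```

The printed statement can be pasted into any MySQL client. Note that `LOCAL` reads the file from the client machine and may need to be enabled on both client and server; dropping `LOCAL` makes the server read the file from its own disk instead.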
After about an hour of cranking away, with me convinced it was going to time out at minute 59, the query actually completed. I FINALLY had a database full of all 77,300,000 keyword phrases…
I thought I was in the home stretch… nope… (continued)