Friday, March 20, 2009

CUIL Search Engine & my robots.txt file

Back in early September, I'd written about the new search engine - CUIL. Like many people, I'm pretty wed to my Google, but CUIL had a lot of positive buzz. Unfortunately, every time I searched on the words that put my site first in the Google results - the ones users would actually search on to get to our site - I did not get our site. I got a Wikipedia page and a few other related pages, but not my site. Very problematic! I immediately submitted a form to CUIL asking for our site to be included. That didn't seem to work. A few days after submitting our site to be crawled, I tried again - same lousy results. I kept at it for a few weeks: each time we didn't appear, I filled in the form to contact CUIL & ask for help. I never heard anything back. I'd pretty much written off CUIL after that, giving up on it & not recommending it to anyone. But then someone reminded me of CUIL yesterday. I tried it again - now some of our site's subpages are appearing, but not our home page. Still frustrated, I filled in the form and asked them to write back ASAP, noting that I'd never gotten a response, etc., etc.
This time, I got an immediate reply from their VP of Marketing, I believe, followed up by further emails. By this AM, I had a full assessment of what - from their bot's point of view - the problem with my site's robots.txt file was. Now, understand, I've never spent a LOT of time on our robots.txt file or on learning how it all hangs together. I knew the basic syntax and put it together in a way that the googlebot didn't mind. Apparently, CUIL's bot did mind. In fairness to CUIL, I had redundant rules in my robots.txt file because I wasn't careful when I constructed it, so it makes sense that their bot does what it does. In fairness to me, if it worked with Google, it seemed reasonable.
My big mistake was trying to spell out each directory level with its own wildcard rule - /*/*.css, for example. (No, I have no idea why I was so paranoid that a single wildcard pattern wouldn't be enough to block the pattern at every directory level. My bad.) It's worth remembering that wildcards aren't part of the original robots.txt standard anyway; they're an extension that Google's crawler happens to honor, so other bots may well parse them differently.
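To illustrate, here's roughly what I had versus what I have now - the .css pattern is just an example, and the exact paths in my real file differ:

# Before: redundant rules trying to cover each directory depth
User-agent: *
Disallow: /*.css
Disallow: /*/*.css
Disallow: /*/*/*.css

# After: one wildcard pattern already matches .css files at any depth
User-agent: *
Disallow: /*.css

(Some crawlers also support a $ anchor - Disallow: /*.css$ - to match only URLs that actually end in .css, but that's another extension, not something every bot understands.)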
So I've fixed it & we'll see if that fixes the CUIL results. If it does, then I'll probably start recommending it to others out there. In the meantime, they definitely get an A for customer service to webmasters this go 'round.
