Yesterday, the new website at work was supposed to launch. We’d tested it out pretty thoroughly–unit tests, stress tests, and a “training” copy of the content management system for users in the company. We’d even set up a beta copy on the live server where we added in some of the virtual directories used for services within the company (support ticketing system, etc.) to make sure they all functioned right under the new content management system. Everything was supposed to go smoothly.
And everything did, for the most part.
Except suddenly, some people weren’t able to get to their email through Outlook. Outlook Web Access worked just fine. Outlook worked just fine for some people. And there was really no common thread linking all of the users who had problems. Some people externally had problems, some people internally had problems, but nothing we could easily reproduce consistently in the IT office.
What ensued was a few hours of scrambling to figure out the problem and get a solution. It’s a sticky situation as to what to do next–you’ve told management that the new website is up and they’ve relayed that news on to the rest of the company, but some people can’t get their email through Outlook now.
Since it wasn’t easy to narrow down the cause of the problem, it wasn’t easy to find a solution. Without being able to narrow down the root cause, you’re limited to trial and error. And trial and error often produces a lot of “magic” solutions that seem to work if you change setting X on the server and change setting Y on the client and log in with your domain and username and hold your head just right.
Trial and error can also result in hosing things up even more than normal. We stopped trying when it was clear we were out of ideas–because when you’re out of ideas, you start tinkering with things you don’t fully understand. So it didn’t get fixed until this morning–which looked bad for us, but wasn’t a complete lack of service as Outlook Web Access still worked just fine.
The culprit, it turns out, was wildcard extension mappings in IIS. The new website’s content management system uses this feature to route all file requests, regardless of extension, through ASP.Net, where the requested URLs are matched up against content in a database.
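To make the idea concrete, here is a toy sketch of that kind of catch-all routing. It is not the actual CMS code (which runs on ASP.Net); the `PAGES` table, `normalize`, and `handle_request` are made-up names standing in for the database lookup.

```python
from typing import Dict, Tuple

# Hypothetical stand-in for the CMS database: URL path -> page body.
PAGES: Dict[str, str] = {
    "/": "Home",
    "/about/": "About us",
    "/support/faq.html": "FAQ",
}


def normalize(path: str) -> str:
    """Drop any query string and lower-case the path before lookup."""
    return path.split("?", 1)[0].lower()


def handle_request(path: str, pages: Dict[str, str] = PAGES) -> Tuple[int, str]:
    """Resolve any requested path, whatever its extension, against the page table.

    With a wildcard mapping, every request on the site reaches a handler like
    this one, including requests that were never meant for the CMS.
    """
    key = normalize(path)
    if key in pages:
        return 200, pages[key]
    return 404, "Not found"
```

The point of the sketch is the last part of the docstring: once everything is funneled through one handler, that handler sees traffic it was never designed for.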
The problem was that the Exchange RPC front-end was installed on the Default Web Site. I’m not exactly sure of the inner workings of IIS, ASP.Net, and Outlook RPC, but somewhere in the mix, IIS was throwing back 405 (Method Not Allowed) errors, refusing to handle requests that used the RPC_IN_DATA and RPC_OUT_DATA HTTP methods.
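For anyone chasing a similar problem, a quick tally of which HTTP methods are getting 405 responses would have pointed at this much sooner. The following is a minimal sketch, assuming the server writes W3C extended format logs with a `#Fields:` line that includes `cs-method` and `sc-status`; the function name is my own, and which fields are actually logged depends on how logging is configured.

```python
from collections import Counter
from typing import Iterable


def count_405_by_method(lines: Iterable[str]) -> Counter:
    """Tally the HTTP methods of requests that were answered with status 405.

    Expects W3C extended log lines, where a '#Fields:' directive names the
    space-separated columns of the entries that follow it.
    """
    fields = []
    counts = Counter()
    for raw in lines:
        line = raw.strip()
        if line.startswith("#Fields:"):
            # Column layout can change mid-file, so re-read it every time.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue  # skip blanks, other directives, and rows with no layout yet
        row = dict(zip(fields, line.split()))
        if row.get("sc-status") == "405":
            counts[row.get("cs-method", "?")] += 1
    return counts
```

Feed it an open log file (a file object iterates line by line) and look for anything other than the usual GET and POST in the result.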
Now, as to why we didn’t test this, it’s apparently a pain to set up Outlook RPC (or so I’m told, I’ve never done it). And we’d already tested that virtual directories as a whole work. Of course my fear there was that configuration settings from the main site would affect virtual directories under it. That really wasn’t the problem; the problem was that configuration settings on the main site kept certain HTTP methods from being handled at all.
I should also mention that we couldn’t easily find anything on Google about this. I did manage to turn up some posts about problems with SharePoint Services and Outlook RPC. (SharePoint Services, at least version 3, is built on ASP.Net and uses wildcard extension mappings to handle requests.) They didn’t offer any hope other than saying you shouldn’t run SharePoint and Outlook RPC on the same website.
At this point we’ve replaced all of the content on the site with static copies. There weren’t that many pages so it wasn’t as painful as it could have been.
I’ve modified the content management system to work around our little dilemma. (In the span of about half a day, I might add.) Basically what that means is we’ve mapped *.aspx and *.html files (the only file extensions you can give a file in the CMS) to go through ASP.Net without verifying that the file exists. The CMS will also create blank “default.aspx” or “index.html” placeholder files when new pages are created in the system, which make IIS think that default documents exist, and then pass requests through ASP.Net.
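The placeholder trick is simple enough to sketch. This is an illustration in Python rather than the real CMS code, and `ensure_placeholder`, its parameters, and the choice of `default.aspx` as the default are all assumptions made for the example.

```python
from pathlib import Path


def ensure_placeholder(site_root: Path, url_path: str,
                       default_name: str = "default.aspx") -> Path:
    """Create an empty default document for a directory-style CMS URL.

    The file has no content of its own; it only needs to exist so the web
    server finds a default document and hands the request to the handler
    mapped to that extension.
    """
    root = site_root.resolve()
    target_dir = (root / url_path.strip("/")).resolve()
    # Refuse paths that would escape the site root (e.g. '../').
    if target_dir != root and root not in target_dir.parents:
        raise ValueError(f"{url_path!r} is outside the site root")
    target_dir.mkdir(parents=True, exist_ok=True)
    placeholder = target_dir / default_name
    placeholder.touch(exist_ok=True)  # leaves an existing file untouched
    return placeholder
```

Calling something like this whenever a page is created keeps the file system in step with the database, which is exactly the extra mess I complain about below.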
It ain’t pretty but it works–believe me, I’ve checked the fix thoroughly. It’s sort of depressing because I took great care in making the system as maintainable and efficient as possible (or at least a balance between the two) and the code as clean as possible. This fix sort of flip-flops some of the logic, and adds a lot of extra mess to the file system (what with the placeholder files and all). And sadly, since Outlook RPC is such a pain to set up, this is probably the long-term solution.
But with the fix, we no longer have to choose between allowing users to access Exchange from the Internet and having a working website. And while the whole matter was a mistake on our part–although how could you realistically test for something like this?–we diagnosed the problem and created a fix relatively quickly without a lot of (potentially dangerous) knee-jerk reactions.