How To: Troubleshoot a Vserver Down Issue
In this AskSupport How To video, you will learn how to troubleshoot a Vserver down issue.
Tags: technical support netscaler how to
Transcript: Hi, my name is Ronan O’Brien, and I work in the Citrix Tech Support Readiness Team. In today’s How To, I’m going to take a look at how to troubleshoot a Vserver down issue. There are a couple of different types of Vserver, as we know, on the appliance, so we will primarily be taking a look at the normal load balancing Vservers and also the Access Gateway Enterprise Edition Vserver.
So we have a VIP, like so. This is the response we’re getting: HTTP/1.1 Service Unavailable. All right. So, this is what you’d normally get called about, or this would be the first sign that you have a problem. So the first thing to do, I’m going to take a look at the virtual server itself. Right? And we can see, in fact, that here it is here, because I’m going to port 80, HTTP. It is, in fact, down. Okay. So, we have a problem. First thing I’m going to do is just hit refresh, just to make sure that it is down, that it’s not saved from some previous state. It’s still down.
So, what brings a Vserver down? If the services which are bound to it are down, that will bring the Vserver down. So, I will need to find which services are bound. So, on my box, I don’t really have that many things configured, that many entities. But on a bigger system, you may need to go in here, just to see which services we’re looking at. So, I’m going to narrow this down. We have one service. Let’s take a look at that. So I drop into Services. I double click this service. Okay. Why am I double clicking it? What brings a service down? The monitor that’s bound to it. That decides if the service is healthy or not. In this case, we have one HTTP monitor. What does an HTTP monitor do? It connects over port 80, sends a request, and looks for a 200 OK response code. So let’s click on this monitor here, and what do we get? We get the details of the monitor here. We can see the state is down. We see the probes. We see how many have failed out of the total. Okay?
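As a rough illustration of the dependency chain described so far (the monitor decides whether the service is healthy, and the bound services decide whether the Vserver is up), here is a minimal Python sketch. It is a simplified model, not the appliance’s actual logic: the class names are hypothetical, and it assumes that a service is up only when all of its bound monitors pass and that a Vserver is up when at least one bound service is up.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Monitor:
    """Hypothetical stand-in for a monitor bound to a service."""
    name: str
    probe_ok: bool  # result of the most recent probe


@dataclass
class Service:
    """Hypothetical stand-in for a backend service."""
    name: str
    monitors: List[Monitor] = field(default_factory=list)

    def is_up(self) -> bool:
        # Assumption: every bound monitor must pass for the service to be up.
        return all(m.probe_ok for m in self.monitors)


@dataclass
class VServer:
    """Hypothetical stand-in for a load balancing virtual server."""
    name: str
    services: List[Service] = field(default_factory=list)

    def is_up(self) -> bool:
        # Assumption: one healthy bound service is enough to keep the Vserver up.
        return any(s.is_up() for s in self.services)


def failing_monitors(vserver: VServer) -> List[str]:
    """List 'service/monitor' pairs whose last probe failed."""
    return [
        f"{s.name}/{m.name}"
        for s in vserver.services
        for m in s.monitors
        if not m.probe_ok
    ]
```

Walking from the Vserver to its services and then to the failing monitors mirrors the order of the clicks in the demonstration.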
We get the reason why it’s failed. This is the key to finding out what’s going on. Failure - no MIP or SNIP available to send the monitor probe. That’s an interesting little error. Okay. What does that mean, no MIP or SNIP available? We all know that the NetScaler appliance talks to the backend servers using its MIP or SNIP. Okay, that’s mapped IP address or subnet IP address. So, by default, it will use this IP, and it says it isn’t available, or that there’s none available. So we go to Network, I click on IPs, and I see we have a SNIP IP address. So what’s the problem? Okay, let’s refresh this again just to make sure no one has changed it while we’re working. So, it’s still there. Okay?
Now, I happen to know what the cause of this problem is. The next step, we need to make sure that the mode is enabled. Okay. So I go to Change Modes here under Settings. And we can see that Use Subnet IP mode is not turned on. Okay. That’s the reason for that particular error. So I click this, okay? Use Subnet IP. Click OK. Yes. That pop-up box just wanted to know if we wanted to make changes. We do, indeed. We need to have Use Subnet IP mode turned on.
So I go back to my virtual servers. I’m going to hit refresh, and they’re still down. Okay? Now what’s the problem? So I go back to my services again. Let’s refresh these. I can see my DNS service has come up. Okay. That’s a good thing. But my service here is still down. So let me open up this service again and simply click HTTP. So I can see here, it still says no MIP or SNIP available to send the monitor probe. Okay? I’m just going to do a Refresh All and open this up again. Now you can see that the issue has changed. It’s now gone to Last Response: Failure - HTTP response code 404 received. Okay. What does that mean? 404 is Not Found, right? So, this is the URL. Now we can do several things at this stage. We could take a network trace, right, to see what the issue is.
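The two failure reasons seen so far each point to a different next step, which can be captured in a small lookup. This is only a sketch in Python: the function name is hypothetical, the matched strings are taken from the wording quoted in this transcript (the exact text shown by a given firmware version may differ), and the advice simply restates the steps taken in the video.

```python
def next_step(last_response: str) -> str:
    """Suggest a next troubleshooting step for a monitor's failure reason.

    The substrings below come from the transcript's wording and are
    assumptions about what the monitor detail screen displays.
    """
    text = last_response.lower()
    if "no mip or snip available" in text:
        # The appliance has no source IP it is allowed to use for probes.
        return ("Check that a MIP or SNIP is configured and that "
                "Use Subnet IP mode is enabled.")
    if "http response code 404" in text:
        # The probe reached the server, but the server rejected the request.
        return ("Send the same request to the backend server directly and "
                "ask the web server admins why it returns 404.")
    return "Reason not recognized; consider taking a network trace."
```

For example, `next_step("Failure - HTTP response code 404 received")` returns the advice to test the backend server directly.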
Let’s take a look at the HTTP monitor. Let’s see exactly what it’s doing. This is just the default monitor, and it’s doing a HEAD on the root directory. Okay. So the HEAD simply requests headers in the response. It’s not looking for any body in the response. It’s saying, “Give me the response headers only, please.” That makes it nice and efficient. We’re not wasting bandwidth by transferring HTML body content. We’re just looking for this response code. So, we’re doing this on the root. Okay.
So I’ve a couple of choices here. I can drop into telnet here, and I can do the same thing. Right? Just from my own workstation, let’s just see if I get the same response. Okay? So I’m bypassing the NetScaler at this point, and we’re just going to go from a workstation. So I’m telnetting to port 80. Okay, so the port opens up. We know that the backend server is listening. Now I’m just going to open up Notepad here, and I’m going to craft an HTTP request that should be very similar to what the NetScaler is sending. So…okay? Now we also have to give a Host header to be HTTP/1.1 compliant. I’m just putting in the IP. And that’s it. So I’m just going to copy and paste this into my terminal, into my telnet window. Basically I’m using telnet here to act like a browser. And we can see the response: 404 Object Not Found, which is quite strange. Okay?
Now at this point, we realize that that’s what the web server is returning. At this point, we would go to the web server admins and say, “Why is your web server returning a 404, okay, when it should clearly be returning a 200?” In this particular case, I’ve just paused my web server. I’ve paused the process so it’s not serving any content. I’m going to start this again. So I’ve now started it. Let’s paste this in again and see what we get. Okay, so we’re getting a bad request. I don’t think it liked… Let’s try this one more time. Paste. And there we are. Okay?
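The manual telnet check above can also be scripted. The following Python sketch opens a TCP connection, sends a HEAD request for the root with a Host header, and reads the status code from the first line of the reply. It approximates the request described in the video and is not the exact bytes the NetScaler monitor sends; the function names are hypothetical, and the `Connection: close` header is an addition so that the server closes the connection after replying.

```python
import socket


def build_head_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 HEAD request (Host is required by HTTP/1.1)."""
    return (
        f"HEAD {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"  # assumption: added so the reply ends cleanly
        "\r\n"
    ).encode("ascii")


def parse_status_code(response: bytes) -> int:
    """Extract the numeric status code from the response's status line."""
    status_line = response.split(b"\r\n", 1)[0].decode("ascii", errors="replace")
    parts = status_line.split(" ", 2)
    if len(parts) < 2 or not parts[0].startswith("HTTP/"):
        raise ValueError(f"not an HTTP status line: {status_line!r}")
    return int(parts[1])


def probe(host: str, port: int = 80, path: str = "/", timeout: float = 5.0) -> int:
    """Connect directly to the backend server and return its HTTP status code."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_head_request(host, path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection
                break
            chunks.append(data)
    return parse_status_code(b"".join(chunks))
```

Calling `probe("192.0.2.10")` (a placeholder address) from a workstation bypasses the NetScaler in the same way as the telnet session: a result of 200 means the server would satisfy the default check, while 404 reproduces the failure seen in the monitor.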
So it’s getting a 200 OK. So now the web server’s up and running. My telnet works. Let’s jump back to my… Let’s go to our config and hit Refresh All. And we can see that it’s up. Okay? You double click. Go to our monitor. And there we see the response code. So that’s just a quick example of how to troubleshoot HTTP load balancing Vservers when they’re down. Okay? We step into the services, we look at the monitor result, and then we take it from there.
Now, as regards Access Gateway Enterprise Edition, we can see here that this is down. And it’s very rare, okay, that we see this sort of error. The reason for that being that there are no services bound to an Access Gateway virtual server, but it is SSL. One of the few things which can bring down an SSL Vserver is the lack of a certificate. So we can see here there are no certificates configured. So I just need to add my test certificate (I have one of these test certs here), and that’s what brings it up. That’s all that’s required for that particular case.
So just a few short pointers on the different directions we can go to resolve these sorts of issues. And the tools are available: a simple command prompt and telnet, and we can check the health of our web servers. We can do the same from the NetScaler as well; we can run telnet there, too. So, hopefully, that’s useful. Hopefully you’ll get some use out of these particular troubleshooting points. Thanks for listening, and have a nice day.