XenServer + Neutron used to work well, but support broke as more and more changes were made in Neutron while there was no CI environment running XenServer to catch them.

I started getting XenServer + Neutron working again, building on the previous blog post openstack-networking-quantum-on-xenserver.

[[code]]
    Deployment environment:
        XenServer: 6.5
        OpenStack: latest master code (September 2015)
        Network: ML2 plugin, OVS driver, VLAN type
        Single Box installation
[[/code]]

I made some changes to the DevStack scripts so that XenServer + Neutron can be installed and run properly. Below are the debugging steps I went through when newly launched VMs could not get an IP address from the DHCP agent automatically.

Brief description of how VMs get an IP address via DHCP:

When VMs boot, they broadcast a DHCP request within their broadcast domain and wait for the DHCP server’s reply.

If a VM cannot get an IP address, the natural first step is to check whether the packets from the VM are received by the DHCP server; see picture 1.

[Picture 1: flow-VM-to-DomU]

Dump traffic in Network Node

Since I use DevStack with a single-box installation, all OpenStack nodes reside in the same DomU.

1. Check namespace that DHCP agent uses

Execute sudo ip netns in DomU; you will probably get output like this:

[[code]]
    qrouter-17bdbe51-93df-4bd8-93fd-bb399ed3d4c1
    qdhcp-49a623fd-c168-4f27-ad82-946bfb6df3d7
[[/code]]

Note: qdhcp-xxx is the namespace for DHCP agent
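When there are many namespaces, it can help to filter for the DHCP ones. The following is a minimal sketch; the function name list_dhcp_namespaces is my own and not part of any tool. It matches only the namespace name, because newer iproute2 versions may append extra text such as " (id: 0)" to each line.

```shell
# Print only the DHCP namespaces from `ip netns` output read on stdin.
# list_dhcp_namespaces is a hypothetical helper name used for illustration.
list_dhcp_namespaces() {
    grep -oE '^qdhcp-[0-9a-f-]+'
}

# Usage (in DomU):
#   sudo ip netns | list_dhcp_namespaces
```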

2. Check the interface the DHCP agent uses for L3 packets

Execute sudo ip netns exec qdhcp-49a623fd-c168-4f27-ad82-946bfb6df3d7 ifconfig; you will see an interface named like tapXXX:

[[code]]
  lo        Link encap:Local Loopback
            inet addr:127.0.0.1  Mask:255.0.0.0
            inet6 addr: ::1/128 Scope:Host
            UP LOOPBACK RUNNING  MTU:65536  Metric:1
            RX packets:0 errors:0 dropped:0 overruns:0 frame:0
            TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
            collisions:0 txqueuelen:0
            RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

  tap7b39ecad-81 Link encap:Ethernet  HWaddr fa:16:3e:e3:46:c1
            inet addr:10.0.0.2  Bcast:10.0.0.255  Mask:255.255.255.0
            inet6 addr: fe80::f816:3eff:fee3:46c1/64 Scope:Link
            inet6 addr: fdff:631:9696:0:f816:3eff:fee3:46c1/64 Scope:Global
            UP BROADCAST RUNNING  MTU:1500  Metric:1
            RX packets:42606 errors:0 dropped:0 overruns:0 frame:0
            TX packets:38 errors:0 dropped:0 overruns:0 carrier:0
            collisions:0 txqueuelen:0
            RX bytes:4687150 (4.6 MB)  TX bytes:4867 (4.8 KB)
[[/code]]

3. Monitor traffic on the DHCP agent’s interface tapXXX

Execute sudo ip netns exec qdhcp-49a623fd-c168-4f27-ad82-946bfb6df3d7 tcpdump -i tap7b39ecad-81 -s0 -w dhcp.cap to capture the traffic on this interface.

Theoretically, when launching a new instance, you should see DHCP request and reply messages like this:

[[code]]
      16:29:40.710953 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:f9:f6:b0 (oui Unknown), length 302
      16:29:40.713625 IP 172.20.0.1.bootps > 172.20.0.10.bootpc: BOOTP/DHCP, Reply, length 330
[[/code]]
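To see at a glance whether a capture contains both directions, the text output of tcpdump can be counted. This is a rough sketch that relies on tcpdump labelling DHCP packets as "BOOTP/DHCP, Request" and "BOOTP/DHCP, Reply"; the function name dhcp_summary is my own.

```shell
# Count DHCP requests and replies in tcpdump text output read on stdin.
# dhcp_summary is a hypothetical helper name used for illustration.
dhcp_summary() {
    awk '/BOOTP\/DHCP, Request/ { req++ }
         /BOOTP\/DHCP, Reply/   { rep++ }
         END { printf "requests=%d replies=%d\n", req, rep }'
}

# Usage (reading back the capture file written above):
#   sudo tcpdump -n -r dhcp.cap | dhcp_summary
```

Requests without any replies point at the DHCP agent; no requests at all, as in my case, point at something between the VM and the agent.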

Dump traffic in Compute Node

Meanwhile, you will also want to dump traffic on the VM side. This should be done on the compute node, which with XenServer is actually Dom0.

When a new instance is launched, a new virtual interface named “vifX.Y” is created. X is the domain ID of the new VM and Y is the ID of the VIF defined in XAPI. Domain IDs are sequential: if the latest interface is vif20.0, the next one will most likely be vif21.0. You can then try tcpdump -i vif21.0. It may fail at first because the virtual interface has not been created yet, but if you retry a few times you can monitor the packets once the interface exists.
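Instead of retrying by hand, a small polling loop can wait for the interface to appear. This is only a sketch: the function name wait_for_iface and the SYSFS_NET variable are my own, and it assumes Dom0 exposes network interfaces under /sys/class/net as Linux normally does.

```shell
# Wait until a network interface exists, polling once per second.
# $1 = interface name (e.g. vif21.0), $2 = timeout in seconds (default 30).
# SYSFS_NET is a hypothetical override, there only to make testing easy.
wait_for_iface() {
    iface=$1
    timeout=${2:-30}
    sysfs=${SYSFS_NET:-/sys/class/net}
    elapsed=0
    while [ ! -e "$sysfs/$iface" ]; do
        # Give up once the timeout has been reached.
        [ "$elapsed" -ge "$timeout" ] && return 1
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Usage (in Dom0, started just before launching the instance):
#   wait_for_iface vif21.0 60 && tcpdump -n -i vif21.0 port 67 or port 68
```

The "port 67 or port 68" filter limits the capture to DHCP traffic.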

In theory you should see the DHCP request and reply in Dom0, just as on the DHCP agent side.

Note: If you cannot capture the packets at the instance’s launch time, you can also log in to the instance via XenCenter and run “ifup eth0”, which also triggers the instance to send a DHCP request.

1. Check that the DHCP request goes out on the VM side

In most cases you should see the DHCP request packet leaving through Dom0. This means the VM itself is OK: it has sent out its DHCP request.

Note: Some images keep sending DHCP requests periodically until they get a response. Others only try a few times, e.g. three times, and then give up even if they never received a DHCP response. In some scenarios this means the instance loses its chance to get an address, which is why some people on the internet suggest changing the image when a launched instance cannot get an IP.

2. Check that the DHCP request comes in on the DHCP server side

In my case, however, I could not see any DHCP request on the DHCP agent side.

Where did the request packet go? Was it dropped? If so, who dropped it, and why?

Thinking about it a bit more, the packet must have been dropped at either L2 or L3, so we can check these one by one. For L3/L4, I had not set up a firewall and the security group’s default rule lets all packets through, so I did not spend much effort on this part.

For L2, since we use OVS, I began by checking the OVS rules. This takes a lot of time if you are not familiar with OVS; I certainly spent a long time before I fully understood the mechanism and the rules.

The main aim is to check all existing rules in Dom0 and DomU and find out which rule causes the packets to be dropped.

Check OVS flow rules

1. OVS flow rules in Network Node

Execute sudo ovs-ofctl show br-int to get the port information on bridge br-int:

[[code]]
  stack@DevStackOSDomU:~$ sudo ovs-ofctl show br-int
  OFPT_FEATURES_REPLY (xid=0x2): dpid:0000ba78580d604a
  n_tables:254, n_buffers:256
  capabilities: FLOW_STATS TABLE_STATS PORT_STATS QUEUE_STATS ARP_MATCH_IP
  actions: OUTPUT SET_VLAN_VID SET_VLAN_PCP STRIP_VLAN SET_DL_SRC SET_DL_DST SET_NW_SRC SET_NW_DST SET_NW_TOS SET_TP_SRC SET_TP_DST ENQUEUE
    1(int-br-eth1): addr:1a:2d:5f:48:64:47
        config:     0
        state:      0
        speed: 0 Mbps now, 0 Mbps max
    2(tap7b39ecad-81): addr:00:00:00:00:00:00
      config:     PORT_DOWN
      state:      LINK_DOWN
      speed: 0 Mbps now, 0 Mbps max
    3(qr-78592dd4-ec): addr:00:00:00:00:00:00
      config:     PORT_DOWN
      state:      LINK_DOWN
      speed: 0 Mbps now, 0 Mbps max
    4(qr-55af50c7-32): addr:00:00:00:00:00:00
      config:     PORT_DOWN
      state:      LINK_DOWN
      speed: 0 Mbps now, 0 Mbps max
    LOCAL(br-int): addr:9e:04:94:a4:95:bb
      config:     PORT_DOWN
      state:      LINK_DOWN
      speed: 0 Mbps now, 0 Mbps max
  OFPT_GET_CONFIG_REPLY (xid=0x4): frags=normal miss_send_len=0
[[/code]]

Execute sudo ovs-ofctl dump-flows br-int to get the flow rules:
[[code]]
  stack@DevStackOSDomU:~$ sudo ovs-ofctl dump-flows br-int
  NXST_FLOW reply (xid=0x4):
    cookie=0x9bf3d60450c2ae94, duration=277625.02s, table=0, n_packets=31, n_bytes=4076, idle_age=15793, hard_age=65534, priority=3,in_port=1,dl_vlan=1041 actions=mod_vlan_vid:1,NORMAL
    cookie=0x9bf3d60450c2ae94, duration=277631.928s, table=0, n_packets=2, n_bytes=180, idle_age=65534, hard_age=65534, priority=2,in_port=1 actions=drop
    cookie=0x9bf3d60450c2ae94, duration=277632.116s, table=0, n_packets=42782, n_bytes=4706099, idle_age=1, hard_age=65534, priority=0 actions=NORMAL
    cookie=0x9bf3d60450c2ae94, duration=277632.103s, table=23, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=0 actions=drop
    cookie=0x9bf3d60450c2ae94, duration=277632.09s, table=24, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=0 actions=drop
[[/code]]

These rules in DomU look normal, with nothing suspicious, so let’s move on to Dom0 and try to find more.

2. OVS flow rules in Compute Node

As analyzed in the traffic flow in picture 1, traffic from the VM to DHCP goes xapiX -> xapiY (in Dom0), then -> br-eth1 -> br-int (in DomU).

So perhaps some OVS rule filtered the packets at layer 2. I suspected xapiY, although I could not give a direct reason, so I checked the rules on xapiY, which in our case is xapi3.

Execute ovs-ofctl show xapi3 to get the port information:

[[code]]
  [root@rbobo ~]# ovs-ofctl show xapi3
  OFPT_FEATURES_REPLY (xid=0x2): dpid:00008ec00170b013
  n_tables:254, n_buffers:256
  capabilities: FLOW_STATS TABLE_STATS PORT_STATS QUEUE_STATS ARP_MATCH_IP
  actions: OUTPUT SET_VLAN_VID SET_VLAN_PCP STRIP_VLAN SET_DL_SRC SET_DL_DST SET_NW_SRC SET_NW_DST SET_NW_TOS SET_TP_SRC SET_TP_DST ENQUEUE
    1(vif15.1): addr:fe:ff:ff:ff:ff:ff
      config:     0
      state:      0
      speed: 0 Mbps now, 0 Mbps max
    2(phy-xapi3): addr:d6:37:17:1d:01:ee
      config:     0
      state:      0
      speed: 0 Mbps now, 0 Mbps max
    LOCAL(xapi3): addr:5a:46:65:a2:3b:4f
      config:     0
      state:      0
      speed: 0 Mbps now, 0 Mbps max
  OFPT_GET_CONFIG_REPLY (xid=0x4): frags=normal miss_send_len=0
[[/code]]

Execute ovs-ofctl dump-flows xapi3 to get the flow rules:
[[code]]
  [root@rbobo ~]# ovs-ofctl dump-flows xapi3
  NXST_FLOW reply (xid=0x4):
    cookie=0x0, duration=278700.004s, table=0, n_packets=42917, n_bytes=4836933, idle_age=0, hard_age=65534, priority=0 actions=NORMAL
    cookie=0x0, duration=276117.558s, table=0, n_packets=31, n_bytes=3976, idle_age=16859, hard_age=65534, priority=4,in_port=2,dl_vlan=1 actions=mod_vlan_vid:1041,NORMAL
    cookie=0x0, duration=278694.945s, table=0, n_packets=7, n_bytes=799, idle_age=65534, hard_age=65534, priority=2,in_port=2 actions=drop
[[/code]]

Please pay attention to port 2 (phy-xapi3); it has two specific rules:

• The rule with the higher priority=4 is matched first: if the packet has dl_vlan=1, its VLAN tag is rewritten and it then goes through NORMAL processing, which lets the flow through.

• The rule with the lower priority=2 is matched next, and it drops the flow. So will the flows be dropped? Any flow arriving on this port without dl_vlan=1 will definitely be dropped.
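A quick way to see whether a drop rule is actually being hit is to look at its n_packets counter. The sketch below filters ovs-ofctl dump-flows output for drop rules that have matched at least one packet; the function name drop_hits is my own, and the parsing assumes the whitespace-separated field layout that dump-flows printed in my environment.

```shell
# From `ovs-ofctl dump-flows` output on stdin, print "<n_packets> <match>"
# for every drop rule that has matched at least one packet.
# drop_hits is a hypothetical helper name used for illustration.
drop_hits() {
    awk '/actions=drop/ {
        n = 0; match_part = ""
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^n_packets=/) {
                n = $i
                sub(/^n_packets=/, "", n)   # strip the field name
                sub(/,$/, "", n)            # strip the trailing comma
            }
            if ($i ~ /^priority=/) { match_part = $i }
        }
        if (n + 0 > 0) print n, match_part
    }'
}

# Usage (in Dom0):
#   ovs-ofctl dump-flows xapi3 | drop_hits
```

If the counter of the in_port=2 drop rule grows each time an instance retries DHCP, that rule is the one discarding the requests.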

Note:

(1) dl_vlan=1 is the VLAN tag ID, which corresponds to the port’s tag.

(2) For a long time I did not realize that the problem was a missing tag on the newly launched instance’s port, because I did not understand OVS well enough, so it did not occur to me to check the port’s tag at first. Next time we hit this problem, we can check this part first.

With this question in mind, I checked the newly launched instance’s port information by running ovs-vsctl show in Dom0, which gives output like this:

[[code]]
Bridge "xapi5"
    fail_mode: secure
    Port "xapi5"
        Interface "xapi5"
            type: internal
    Port "vif16.0"
        Interface "vif16.0"
    Port "int-xapi3"
        Interface "int-xapi3"
            type: patch
            options: {peer="phy-xapi3"}
[[/code]]

Port vif16.0 indeed has no tag with value 1, so its traffic will be dropped without doubt.
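To spot such ports quickly, the ovs-vsctl show output can be scanned for ports that have no tag line. This is a minimal sketch; the function name untagged_ports is my own, and it assumes a tagged port shows a "tag: N" line under its Port entry. A single port can also be checked directly with ovs-vsctl get Port vif16.0 tag, which prints [] when no tag is set.

```shell
# From `ovs-vsctl show` output on stdin, list ports without a "tag:" line.
# Internal and patch ports are listed too; filter by name as needed.
# untagged_ports is a hypothetical helper name used for illustration.
untagged_ports() {
    awk '
        function emit() { if (port != "" && !tagged) print port }
        $1 == "Bridge" { emit(); port = "" }
        $1 == "Port"   { emit(); port = $2; gsub(/"/, "", port); tagged = 0 }
        $1 == "tag:"   { tagged = 1 }
        END            { emit() }
    '
}

# Usage (in Dom0), showing only VM interfaces:
#   ovs-vsctl show | untagged_ports | grep '^vif'
```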

Note: When a new instance is launched under XenServer, it gets a virtual network interface named vifX.0, and on the OVS side a port is created and bound to that interface.

Check why tag is not set

The next step is to find out why the newly launched instance does not have a tag in OVS. There were no obvious clues for a newcomer like me, so I just read the code over and over, made assumptions, tested them, and so forth.

But after trying this and that for a while, I found that each time I restarted neutron-openvswitch-agent (q-agt) on the compute node, the VM could get an IP if I ran the ifup eth0 command.

So there must be something that is done when q-agt restarts but not when a new instance is launched. With this finding, reading the code became much more targeted.

Finally I found that, with XenServer, q-agt cannot detect the newly added port when a new instance is launched, and consequently it does not add a tag to this port.

But why can q-agt not detect port changes? We have a session from DomU to Dom0 to monitor port changes, but it seems not to work as expected.

With this in mind, I first ran ovsdb-client monitor Interface name,ofport in Dom0, which gives output like this:

[[code]]
            [root@rbobo ~]# ovsdb-client monitor Interface name,ofport
            row                                  action  name        ofport
            ------------------------------------ ------- ----------- ------
            54bcda61-de64-4d0e-a1c8-d339a2cabb50 initial "eth1"      1
            987be636-b352-47a3-a570-8118b59c7bbc initial "xapi3"     65534
            bb6a4f70-9f9c-4362-9397-010760f85a06 initial "xapi5"     65534
            9ddff368-0be5-4f23-a03c-7940543d0ccc initial "vif15.2"   1
            ba3af0f5-e8ed-4bdb-8c3d-67a638b81091 initial "phy-xapi3" 2
            b57284cf-1dcd-4a10-bee1-42516afe2573 initial "eth0"      1
            38a0dd37-173f-421c-9aba-3e03a5b8c900 initial "vif16.0"   2
            58b83fe4-5f33-40f3-9dd9-d5d4b3f25981 initial "xenbr0"    65534
            6c792964-3930-477c-bafa-5415259dea96 initial "int-xapi3" 1
            caa52d63-59ed-4917-9ec3-1ea957470d5e initial "vif15.1"   1
            d8805d05-bbd2-40cb-b219-eb9177c217dc initial "vif15.0"   6
            8131dcd2-69ea-401a-a65e-4d4a17203e0c initial "xapi4"     65534
            086e6e3a-1ab2-469f-9604-56bbd4c2fe86 initial "xenbr1"    65534
[[/code]]

Then I launched a new instance to see whether the OVS monitor reports anything for it, and I did get output like this:
[[code]]
            row                                  action name      ofport
            ------------------------------------ ------ --------- ------
            249c424a-4c9a-47b4-991a-bded9ec63ada insert "vif17.0" []

            row                                  action name      ofport
            ------------------------------------ ------ --------- ------
            249c424a-4c9a-47b4-991a-bded9ec63ada old              []
                                                 new    "vif17.0" 3
[[/code]]
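The interesting event is the "new" row, where the interface receives a numeric ofport. As a rough sketch, those rows can be picked out of the monitor’s table output as follows; the function name new_ofports is my own, and this is not how q-agt itself consumes the monitor.

```shell
# From `ovsdb-client monitor Interface name,ofport` table output on stdin,
# print "<interface> <ofport>" whenever an interface gets a numeric ofport.
# new_ofports is a hypothetical helper name used for illustration.
new_ofports() {
    awk '$1 == "new" && $3 ~ /^[0-9]+$/ {
        name = $2
        gsub(/"/, "", name)   # drop the quotes around the interface name
        print name, $3
    }'
}

# Usage (in Dom0); output may lag because of pipe buffering:
#   ovsdb-client monitor Interface name,ofport | new_ofports
```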

So the OVS monitor itself works well! There may be other errors in the code that does the monitoring. It seems we are getting much nearer to the root cause 🙂

Finally, I found that with XenServer our current implementation cannot get the OVS monitor’s output, and thus q-agt does not know that a new port has been added. Luckily, the L2 agent provides another way of detecting port changes, so we can use that instead for now.

Setting minimize_polling=false in the L2 agent’s configuration file ensures the Agent does not rely on “ovsdb-client monitor”, which means that the port will be identified and the tag gets added!
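To confirm the setting before restarting q-agt, the value can be read back from the configuration file. This is a sketch under two assumptions that you should verify in your own environment: that the option lives in the [agent] section, and that the file path you pass is the one your L2 agent actually loads (the path in the usage line is only an example). The function name get_minimize_polling is my own.

```shell
# Print the value of minimize_polling from the [agent] section of an ini file.
# Expected fragment in the L2 agent's configuration file (assumed section name):
#   [agent]
#   minimize_polling = False
# get_minimize_polling is a hypothetical helper name used for illustration.
get_minimize_polling() {
    awk -F= '
        /^\[/ { in_agent = (tolower($0) ~ /^\[agent\]/) }
        in_agent && $1 ~ /^[ \t]*minimize_polling[ \t]*$/ {
            v = $2
            gsub(/[ \t]/, "", v)
            print tolower(v)
        }
    ' "$1"
}

# Usage (example path only, adjust to your deployment):
#   get_minimize_polling /etc/neutron/plugins/ml2/ml2_conf.ini
```

An empty result means the option is not set in that section, so the agent falls back to its default.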

In this case, this is all that was needed to get an IP address and everything else worked normally.