Discussion for RFD 158 #121

Open
danmcd opened this issue Nov 10, 2018 · 14 comments

Comments

@danmcd

danmcd commented Nov 10, 2018

Subject says it all.

@papertigers
Contributor

nit: Our UpdateAIP endpoint could probably take some sort of query param such as ?detach=true to simply remove any associations.

@danmcd
Author

danmcd commented Dec 19, 2018

> nit: Our UpdateAIP endpoint could probably take some sort of query param such as ?detach=true to simply remove any associations.

Happy to change this, but my REST is weak, an example appreciated.

@jasonbking
Contributor

For the Triton requirements, as an initial stab at it, how about triton post-setup fabrics creates a single default vxlnat zone (likely on the HN) -- with the change in how public IPs will be handled, there don't seem to be too many scenarios where one would create a fabric and not have a vxlnat zone. Additional NAT zones could be created via sdcadm create vxlnat [-s CN...] [-I image_uuid].

I think this would follow what other services do, so it wouldn't be an oddball.

@jasonbking
Contributor

> AIP assignments to the NAT Reform zones that match the default-router assignment for its inner encapsulating network. (e.g. an AIP for an instance attached to 10.3.3.0/24 must reside on the same NAT Reform zone as the default router for 10.3.3.0/24.)

I'm confused by this. Why should the vxlnat zone care about the network or netmask of the instance? An AIP is allocated from the pool of AIP addresses, and a NAT rule is created to do static NAT between the AIP and the fabric (internal) IP of the instance plus its vnet ID. On ingress, the vxlnat zone will do the NAT, encapsulate the IP datagram, and send it to the UL3 of the CN hosting the instance. On egress, the fabric routing will send any non-fabric packets to the vxlnat instance, which decapsulates them, performs the NAT, and sends them off to the Internet.
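
To make the static NAT described above concrete, here is a minimal sketch of a one-to-one rule table. It is not vxlnat's actual data structure: the class and method names are made up for illustration, and the addresses and vnet IDs are arbitrary. The only point it shows is that the inner side of a rule is keyed by (vnet ID, fabric IP), since the same fabric IP can exist on more than one vnet.

```python
import ipaddress
from typing import NamedTuple


class InnerEndpoint(NamedTuple):
    """Inner side of a rule: the vnet ID plus the instance's fabric IP."""
    vnet_id: int
    fabric_ip: object  # an ipaddress.IPv4Address or IPv6Address


class StaticNatTable:
    """Hypothetical 1:1 NAT table: AIP <-> (vnet ID, fabric IP)."""

    def __init__(self):
        self._by_aip = {}    # external AIP -> InnerEndpoint
        self._by_inner = {}  # InnerEndpoint -> external AIP

    def add_rule(self, aip, vnet_id, fabric_ip):
        aip = ipaddress.ip_address(aip)
        inner = InnerEndpoint(vnet_id, ipaddress.ip_address(fabric_ip))
        if aip in self._by_aip or inner in self._by_inner:
            raise ValueError("AIP or inner endpoint is already mapped")
        self._by_aip[aip] = inner
        self._by_inner[inner] = aip

    def ingress(self, dst_aip):
        """Packet from outside addressed to an AIP: which inner endpoint?"""
        return self._by_aip.get(ipaddress.ip_address(dst_aip))

    def egress(self, vnet_id, src_fabric_ip):
        """Decapsulated packet from the fabric: which AIP becomes its source?"""
        key = InnerEndpoint(vnet_id, ipaddress.ip_address(src_fabric_ip))
        return self._by_inner.get(key)


table = StaticNatTable()
table.add_rule("203.0.113.10", vnet_id=1001, fabric_ip="10.3.3.7")
print(table.ingress("203.0.113.10"))
print(table.egress(1001, "10.3.3.7"))
print(table.egress(2002, "10.3.3.7"))  # same fabric IP, different vnet: no rule
```

Encapsulation and the choice of UL3 destination are left out; they happen after the lookup on ingress and before it on egress.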

@papertigers
Contributor

For the given use case example of adding a new vxlnat zone or replacing an old failed one, it is not clear to me what our intentions are. Do we plan on not disrupting existing flows, allowing the new or replaced zone to steadily pick up new flows, or do we plan on shuffling things around anyway to balance traffic? There are probably trade-offs to both approaches that need to be considered.

@papertigers
Contributor

> > nit: Our UpdateAIP endpoint could probably take some sort of query param such as ?detach=true to simply remove any associations.
>
> Happy to change this, but my REST is weak, an example appreciated.

https://apidocs.joyent.com/cloudapi/#StartMachine
Here are some examples of Start, Stop, Restart that are POSTs with corresponding actions of start, stop, restart.

We could possibly change one of our endpoints to be a POST with an action of action=detach or action=attach vs the current version of the UpdateAIP endpoint. Thoughts?
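
As a rough illustration of the action-style shape, here is a sketch that only builds the request line. The `/:login/aips/:id` path, the `instance` parameter, and the helper name are assumptions of mine, modeled on StartMachine's `POST /:login/machines/:id?action=start`; RFD 158 does not define them.

```python
from urllib.parse import urlencode

# Hypothetical actions for an AIP resource; not defined by RFD 158.
AIP_ACTIONS = {"attach", "detach"}


def build_aip_action_request(login, aip_id, action, **params):
    """Return (method, path) for an action-style AIP request."""
    if action not in AIP_ACTIONS:
        raise ValueError(f"unsupported AIP action: {action!r}")
    # The action goes first in the query string, followed by any
    # action-specific parameters (for example the instance to attach to).
    query = urlencode({"action": action, **params})
    return "POST", f"/{login}/aips/{aip_id}?{query}"


print(build_aip_action_request("my", "aip-1234", "detach"))
print(build_aip_action_request("my", "aip-1234", "attach", instance="vm-5678"))
```

With this shape, detach needs no body or extra flag, and attach carries the target instance as a parameter, so a single endpoint covers both.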

@papertigers
Contributor

papertigers commented Dec 19, 2018

Also note on the napi side, we will most likely have very similar api endpoints to the cloudapi ones.
CloudAPI typically makes similar api calls to the triton services such as vmapi and napi.

I think we need to think a bit more about https://github.com/joyent/rfd/tree/master/rfd/0158#portolan
We should document what we expect to happen, and then we can figure out how to make that happen. For example, detecting whether a CN is up or down may not be enough. There may be an issue with the vxlnat kernel module or the vxlnatd instance that causes traffic to stop flowing. We should have a way of detecting this so we can shuffle things around on the backend instead of leaving customers stuck in limbo where their flows never make forward progress. We may want kstats that tell us whether packet counts going through vxlnat are incrementing, or to expose an interface to vxlnatd that can be scraped for similar info.
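
One way the counter idea could look, as a sketch only: take two samples of per-zone packet counters and flag the zone when packets keep arriving but none are forwarded. The counter names (`received`, `forwarded`) are hypothetical; no such kstats or vxlnatd interface exists yet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CounterSample:
    """One scrape of hypothetical vxlnat packet counters."""
    timestamp: float  # seconds
    received: int     # packets that arrived at the NAT zone
    forwarded: int    # packets the NAT zone sent onward


def looks_stalled(prev, curr):
    """True if traffic arrived between the samples but none was forwarded."""
    if curr.timestamp <= prev.timestamp:
        raise ValueError("samples must be in increasing time order")
    received_delta = curr.received - prev.received
    forwarded_delta = curr.forwarded - prev.forwarded
    return received_delta > 0 and forwarded_delta == 0


before = CounterSample(timestamp=0.0, received=100, forwarded=100)
print(looks_stalled(before, CounterSample(10.0, received=150, forwarded=100)))
print(looks_stalled(before, CounterSample(10.0, received=150, forwarded=140)))
```

An idle zone (no packets received) is deliberately not flagged here. A real check would also need to handle counter resets and would probably want several consecutive bad samples before shuffling anything.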

@danmcd
Author

danmcd commented Dec 19, 2018

> For the Triton requirements, as an initial stab at it, how about triton post-setup fabrics creates a single default vxlnat zone (likely on the HN) -- with the change in how public IPs will be handled, there don't seem to be too many scenarios where one would create a fabric and not have a vxlnat zone. Additional NAT zones could be created via sdcadm create vxlnat [-s CN...] [-I image_uuid].
>
> I think this would follow what other services do, so it wouldn't be an oddball.

That was part of my do-before-looking-at-feedback task today, so I may have something close to this. Apologies if I missed something here.

@danmcd
Author

danmcd commented Dec 19, 2018

> > AIP assignments to the NAT Reform zones that match the default-router assignment for its inner encapsulating network. (e.g. an AIP for an instance attached to 10.3.3.0/24 must reside on the same NAT Reform zone as the default router for 10.3.3.0/24.)
>
> I'm confused by this. Why should the vxlnat zone care about the network or netmask of the instance? An AIP is allocated from the pool of AIP addresses, and a NAT rule is created to do static NAT between the AIP and the fabric (internal) IP of the instance plus its vnet ID. On ingress, the vxlnat zone will do the NAT, encapsulate the IP datagram, and send it to the UL3 of the CN hosting the instance. On egress, the fabric routing will send any non-fabric packets to the vxlnat instance, which decapsulates them, performs the NAT, and sends them off to the Internet.

I put this restriction in place because if I have 10.1.1.1/24 as the default router for 10.1.1.0/24, ANY AIP on 10.1.1.0/24 will share the same default router as non-AIPs on 10.1.1.0/24 (namely .1), and will get sent to the same UL3 address, therefore the same NAT Reform zone. I thought I was clear, maybe I need to be clearer? (And does the AIP picture, now fixed, make that more clear?)
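
A small sketch of that restriction, under assumptions: the mapping of (vnet ID, subnet) to the NAT Reform zone that hosts the subnet's default router is a made-up structure, and keying it by vnet ID is my guess at how overlapping prefixes on different fabrics would be kept apart.

```python
import ipaddress


def required_nat_zone(vnet_id, fabric_ip, router_zones):
    """Return the NAT zone that must hold the AIP rule for this instance.

    router_zones is a hypothetical map of (vnet_id, subnet) -> the NAT
    Reform zone hosting that subnet's default router.
    """
    addr = ipaddress.ip_address(fabric_ip)
    for (vid, subnet), zone in router_zones.items():
        if vid == vnet_id and addr in ipaddress.ip_network(subnet):
            return zone
    raise LookupError(f"no default router known for {fabric_ip} on vnet {vnet_id}")


router_zones = {
    (1001, "10.1.1.0/24"): "nat-zone-a",
    (1001, "10.3.3.0/24"): "nat-zone-b",
}

# Egress from 10.3.3.42 goes to the default router for 10.3.3.0/24, which
# lives on nat-zone-b, so the AIP rule has to live there too.
print(required_nat_zone(1001, "10.3.3.42", router_zones))
```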

@danmcd
Author

danmcd commented Dec 19, 2018

> For the given use case example of adding a new vxlnat zone or replacing an old failed one, it is not clear to me what our intentions are. Do we plan on not disrupting existing flows, allowing the new or replaced zone to steadily pick up new flows, or do we plan on shuffling things around anyway to balance traffic? There are probably trade-offs to both approaches that need to be considered.

My initial thought was "new NAT Reform zones only get new fabrics" to prevent disruption. If there's enough load, however, disruption may be a small price to pay. I do need to spell out that tradeoff more, thank you.
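
For discussion, a sketch of the non-disruptive option: a fabric that already has a NAT zone keeps it, and only a new fabric is placed. Picking the zone with the fewest fabrics is my assumption for illustration, as are the names; the RFD does not specify a placement rule.

```python
def assign_fabric(fabric, assignments, zones):
    """Sticky placement: existing fabrics never move; new ones are placed.

    assignments maps fabric -> zone and is updated in place. The
    fewest-fabrics choice is an illustrative assumption, not RFD text.
    """
    if fabric in assignments:
        return assignments[fabric]  # never disturb existing flows
    if not zones:
        raise ValueError("no NAT zones available")
    load = {zone: 0 for zone in zones}
    for zone in assignments.values():
        if zone in load:
            load[zone] += 1
    chosen = min(zones, key=lambda zone: load[zone])
    assignments[fabric] = chosen
    return chosen


assignments = {"fabric-a": "nat-1", "fabric-b": "nat-1"}
zones = ["nat-1", "nat-2"]  # nat-2 was just added
print(assign_fabric("fabric-c", assignments, zones))  # new fabric lands on nat-2
print(assign_fabric("fabric-a", assignments, zones))  # existing fabric stays put
```

The cost is visible in the example: nat-1 keeps both of its fabrics no matter how busy it is, which is the load-versus-disruption tradeoff in question.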

@danmcd
Author

danmcd commented Dec 19, 2018

> > > nit: Our UpdateAIP endpoint could probably take some sort of query param such as ?detach=true to simply remove any associations.
> >
> > Happy to change this, but my REST is weak, an example appreciated.
>
> https://apidocs.joyent.com/cloudapi/#StartMachine
> Here are some examples of Start, Stop, Restart that are POSTs with corresponding actions of start, stop, restart.
>
> We could possibly change one of our endpoints to be a POST with an action of action=detach or action=attach vs the current version of the UpdateAIP endpoint. Thoughts?

Would appreciate help with details here, but sure!

@danmcd
Author

danmcd commented Dec 19, 2018

> Also note on the napi side, we will most likely have very similar api endpoints to the cloudapi ones.
> CloudAPI typically makes similar api calls to the triton services such as vmapi and napi.
>
> I think we need to think a bit more about https://github.com/joyent/rfd/tree/master/rfd/0158#portolan
> We should document what we expect to happen, and then we can figure out how to make that happen. For example, detecting whether a CN is up or down may not be enough. There may be an issue with the vxlnat kernel module or the vxlnatd instance that causes traffic to stop flowing. We should have a way of detecting this so we can shuffle things around on the backend instead of leaving customers stuck in limbo where their flows never make forward progress. We may want kstats that tell us whether packet counts going through vxlnat are incrementing, or to expose an interface to vxlnatd that can be scraped for similar info.

Agreed. It may be a worthy discussion topic for VPC on Thursday.

@danmcd
Author

danmcd commented Dec 19, 2018

THANKS for the comments so far. I plan on folding in my followups tonight and tomorrow morning.

@danmcd
Author

danmcd commented Dec 20, 2018

> nit: Our UpdateAIP endpoint could probably take some sort of query param such as ?detach=true to simply remove any associations.

I added a DetachAIP primitive per your suggestion. It may be wrong, but I think it's what you had in mind.
