RIPE 90
Main room
15 May 2025
Routing Working Group
2 p.m.

The Routing Working Group session is starting. Here on stage are three fine co‑chairs: Sebastian, Ben and me, Ignas. Let's start with a few administrative announcements and matters.

First, the minutes of the previous Working Group session: they have been approved, reviewed by the community, hopefully, and archived. Then another important aspect, and this is seriously important: please rate the talks. This is a feedback channel that we use for post‑meeting analysis, and it also helps guide us on what you as a community liked or didn't like. Therefore this is important. Please do it; it's not difficult. You need to log in with your credentials into the archive and provide the feedback there. Please do.

BEN COX: Obviously, while we love where we are right now, the Routing Working Group would really love you to submit your upcoming talks, which we would love to review as soon as possible, because every single Working Group ends up with this massive crunch at the last moment, which we don't like, and I am sure you don't like either. If you have an idea, just let us know. Even if you don't have slides, let us know; we'd love to see you at RIPE 91 and have your talk, but we would love as much notice as possible. Rate the talks, and send your new talks in as soon as possible.

IGNAS BAGDONAS: There is one other message that we received as a letter from the NCC; it's about the Programme Committee elections. Just to prove that we really received the letter, yes. There are Programme Committee elections going on, and you can vote until 5 p.m. today; please do. This is serious: please consider who gets selected to the Programme Committee, because that group eventually controls quite a lot of the content that gets onto the agenda. That is important.

So with this, let's move to the content part of the meeting, and I am happy to invite the first speaker, Tim, from the RIPE NCC, talking about what's going on with RPKI functionality.

TIM BRUIJNZEELS: Hello everyone. My name is Tim, I am a principal engineer at the RIPE NCC. I have been doing RPKI for a long time, so some of you will have seen me before. I also see some new faces here, which is nice.

Today, I'd like to talk to you about the features that we're working on at the RIPE NCC in RPKI. Of course we also do other stuff, like working on infrastructure, and you know all that. But today I really want to focus on the features that are visible to you. So I'll start by talking a bit about what we have done so far this year, then I'll move to the big things that we plan to work on soon.

So, what have we done? As you may recall, we did a big overhaul of the RPKI dashboard last year. It's been revamped, we like it a lot, and we're getting feedback from users; we had a feedback form, as mentioned in the Database Working Group. If you see that: to be brutally honest, when I get asked to give feedback, I usually say no thanks, but it does really help us, so if you would, please.

By and large, feedback is very positive. We have had some specific feedback that we were able to incorporate and make small changes.

Other than that, we have been working on ROA history in particular. We keep a history of all the changes in the RPKI CA. This has always been there, but it may have been a bit difficult to find in the past; now it's more prominently featured on the landing page. And until recently, the history with regard to ROAs was actually text based, so not that easy to parse. One of the things we worked on is making that more readable, as shown here.

We have also introduced filters, so you can just look at your ROA changes and not everything else.
We have also just deployed a rollback feature. This allows you to find a change that you made in the past and say: actually, I want to revert to the state of the configuration just before this change. Now, if you do that, we don't just change things immediately; instead, you get to review your changes. In this example, in our test environment, we don't actually have matching BGP announcements, but if we do, in your case, then what we would show you is: this is the change needed to go back to that earlier point in time, and this is how it would affect BGP as we know it today. So you get to choose: add it to the pending changes, potentially make more changes, or accept and commit them. It's a bit like a second check; I tend to think of it as a bit of, you know, change management, or Git or something.

Other things:

A fairly small feature in terms of implementation complexity, but hopefully very useful: we have always had alerts, where, when we see a change in the routing information that we get from RIS as compared to your ROAs, we send you daily or weekly e‑mails informing you about not-found or invalid announcements. What we have added is that you can opt in to receive e‑mails when ROA configurations are changed. This is opt‑in because we also have people using the API, potentially making a lot of changes; if we made this the default, we were a bit afraid we would spam you a lot, which we didn't want to do without your consent. So, if this is something you would like, you can go into the configuration dashboard and enable it.

Now, switching topics a bit to the next things. I'll go from talking about UX, essentially, into a bit of routing security.

So, BGPsec has been ‑‑ the standards have been around for quite a while. BGPsec is about verifiable paths in BGP.

How does this work? Well, a complete explanation of how BGPsec works would take much longer than the time I have today, so I'll do a quick overview. If there are questions, bring them up.

BGPsec is about signed paths. So it is a way to detect lies on the path, essentially; path spoofing would be evident. The way this works in the RPKI is that an RPKI CA creates a router certificate that associates your AS number with a key used in your router. The router can then sign and validate announcements and updates.
There are some challenges around deploying BGPsec. Performance is one of them, and there has been work on this in the past. There are some concerns around downgrades, because with incremental deployment, how do you know what should be BGPsec, and how do you know what is okay to not be BGPsec? And what if there is an issue with the RPKI itself? A BGPsec path can only be valid or invalid; it cannot be not-found. So, in that case, everything would become invalid. I'm not saying that these challenges cannot be resolved, but they have been part of the reason why we didn't enable or support signing the router certificates earlier.

On the other hand, it's a chicken-and-egg problem, right? We have had requests to do this in the past, and the signing side of this is actually relatively easy. So we figured, well, you know, maybe we should do the right thing and stop postponing it. So, we support signing this in the API only; the UI will come later, when this takes off. At least this can help implementers and get the standards moving forward.

The next thing is a completely different take on this, which has been discussed in the IETF recently, over the last five years, more or less, I think.

A different approach: AS Provider Attestations, ASPA. This is all about plausible paths. It's not about having verifiable signed paths and routers doing signing and validation; it's a slightly different approach, as I'll show. Maybe this is a bit detailed, but if we look at ASPA objects, they are built similarly to how a ROA object is built: we have an RPKI signed object with an EE certificate inside, which is just all the plumbing that is needed to make validation work.

The essential takeaway here is that, as the holder of an AS number, you can make a signed statement that says: my AS, the customer AS, can have these provider AS numbers after it in the BGP path. The RPKI software takes care of the rest, essentially, but that's what you need to think about.

How do we then use that? Well, as I said, it's about having plausible paths; if we turn it around, it's about not having implausible paths. What I mean by that is that it actually supports the state where you have no information, no attestation, right. So, something that is not a provider would be wrong, would be implausible.

So, if we look at the path, and we look at each AS-to-AS hop, we can have a number of situations. We can have the next AS listed as a provider, which is good. We could have a statement from the AS about who their providers are, where the next one is not their provider. Or there could be no attestation, as I just said.
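To make that hop check concrete, here is a minimal sketch in Python. This is my own illustration, not code from the slides or the draft text; the ASPAS map is made-up example data, using the small AS numbers from the slides that follow.

```python
# Hedged sketch of the ASPA per-hop check. ASPAS maps a customer ASN to
# its declared provider set; a missing entry means "no attestation",
# and a provider set of {0} (AS0) means "I have no providers".
ASPAS: dict[int, set[int]] = {
    6: {3},   # example data: AS6 declares AS3 as its only provider
    3: {1},
    2: {1},
    5: {2},
}

def hop(customer: int, candidate: int) -> str:
    """Classify one AS-to-AS hop: is `candidate` a provider of `customer`?"""
    providers = ASPAS.get(customer)
    if providers is None:
        return "no-attestation"   # nothing can be concluded for this hop
    return "provider" if candidate in providers else "not-provider"
```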

In bullet points here I explain what is maybe better seen graphically.

If we think about validating routes received from customers: imagine you are a provider; in this case AS1 is a provider. And I didn't use documentation AS numbers here because they didn't fit, so no offence meant towards AS5, whoever that may be.

So, AS1 receives routes from customers. On the one hand we have AS4 to 2 to 1, and all is good. But we have an evil AS5 here who is prepending the path: they put AS4 in front of themselves and announce a prefix that is actually okay if you look at the ROA. How can ASPA help? Well, if AS4 made an attestation, a statement about who their providers are, and AS5 is not on that list, then AS1 can fairly easily detect that this was not okay.

Now, in this case I talk about AS5 being evil, but a more real-world case might be that, you know, these two had a peering relationship and there was a leak: AS5 leaks this announcement upwards to AS3 when they shouldn't have. In terms of validation, it's the same thing.

So, let me highlight how I tried to denote the validation in this case. If we look at the full path here, 4 through 5 through 3 to 1, then the ASPA hop from 4 to 5 would be considered invalid, because AS4 declared that AS2 is in its list of providers and AS5 is not. The other path here, 4, 2, 1, is okay, because there are no ASPA objects that say otherwise.

Now, if we take this up a notch and think about how to do this across providers, for routes that you have learned from providers, then things become a bit more complicated and you need to wrap your head around doing it differently. The concept that ASPA uses for this is up ramps and down ramps. Essentially we can think of these as customer-to-provider links going upward to a common point. So, in this case, here on my left (also your left, I guess) we have an announcement going up from this stub network through the provider to a Tier 1, and on the other side it goes down. The path continues, but we have another up ramp coming from the other side, so we need to think of a down ramp: a reversed up ramp, essentially. In terms of the verification algorithm, we can look at the full path from both ends. We try to create the longest possible plausible up ramp, one that does not contain a not-provider hop, and we do the same from the other side. Then we can have a number of situations. The ramps can overlap, typically when there is not enough information; that's okay. They can meet at a single point, a common provider, for example a Tier 1 transit or a common provider between stub networks. Or they can meet at a peer pair, one hop apart.

It's invalid in case we have a separation of more than one hop. I'll go through this by example.
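Continuing the sketch from above, here is how this up ramp / down ramp rule might look in code. This is my simplified reading of the verification draft, not its full algorithm; it reuses the hypothetical hop() helper and encodes the two examples that are walked through next.

```python
def validate_from_provider(path: list[int]) -> str:
    """path[0] is the origin AS, path[-1] the neighbour the route came from."""
    n = len(path)
    # Longest plausible up ramp from the origin: extend until a hop is
    # explicitly "not-provider" (no attestation does not stop the ramp).
    up = 1
    while up < n and hop(path[up - 1], path[up]) != "not-provider":
        up += 1
    # Longest plausible down ramp from the receiving side, symmetrically.
    down = 1
    while down < n and hop(path[-down], path[-down - 1]) != "not-provider":
        down += 1
    # The ramps may overlap, or meet at one peer-to-peer hop (up + down == n);
    # a separation of more than one hop is not a plausible path.
    return "invalid" if up + down < n else "valid-or-unknown"

print(validate_from_provider([6, 3, 2, 5]))        # ramps meet at the peer pair
ASPAS[1] = {0}                                     # the Tier 1 signs "no providers"
print(validate_from_provider([6, 7, 4, 1, 2, 5]))  # spoofed path: invalid
```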
So, here is the example with peers. In this case everybody is doing ROAs and ASPA. We have the path 6, 3, 2, 5, and all these ASNs have made declarations, and it's all good; well, the path is good. But if we do the analysis, then 6 to 3 is okay, and 3 to 2 is where the up ramp stops, because 2 is not a provider of 3; 1 is a provider of 3.

From the other side we go: 5 to 2 is okay; 2 to 3 is not okay, because 1 is their provider and 3 is not. This is where they meet at the peer pair, and this is a feature. It might be counterintuitive that something looking invalid in the middle is okay, but then, you know, think about Internet Exchange points and having a lot of peers: if you had to explicitly name each and every one of them all the time, that could be quite error-prone, right.

So, this helps with that case.

Now, partial deployment: like I said, if you have no other information, then things are okay. In this case AS6 and AS3 are not creating ASPA objects, and the up ramp is longer. That's okay; they meet. If you have even less, almost no deployment, then you can have a situation where the full path is okay from both ends and there is a complete overlap. So this is okay. That's a feature, because we want to allow incremental deployment.

Now, suppose that AS7 here is actually evil and starts spoofing. They prepend AS6 and make a more specific announcement, in this case a /24. Now look at the full path: AS5 said that AS2 is their provider, and otherwise we don't have any other information. So if you look from one side, 6, 7, 4, 1, it's all okay; and from the other side it's also all okay. So the hijack succeeds.

Then you might think: well, what if AS6 just signs an ASPA, then we should be good, right? Well, no, unfortunately not yet. But we're getting there.

So, in this case 6 says: AS3 is my provider, 7 is not. That's clear, but if we now do the analysis, the up ramp stops very quickly while the down ramp, looked at from the other side, is still long. In this picture we are inclined to see that this is clearly wrong, but that's because I drew the arrows and called one a provider and one a Tier 1, and so on. If you don't have the information in ASPA, you can't draw this conclusion: it could be that 2 is a provider of 5, 1 is a provider of 2, 4 is a provider of 1, and so forth. That's why, in this case, it's still accepted.

But with even more deployment, this changes. For example, if the Tier 1 in this picture decides to sign an ASPA object, it essentially says: I have no providers. They say: I have one provider, it's AS0; and since AS0 cannot legitimately appear in a path, that essentially means "I have no providers". Now, if you do the up and down ramp analysis, the down ramp, as seen from 5 in this case, stops at 1. We have a separation, and therefore this is invalid.

Now, you might say: well, that's okay, AS7 can just, you know, prepend 3 and then 6 and then we're okay, right? Yes. But if 3 also deploys ASPA, then this becomes more difficult. It doesn't protect against all forms straight away, but with more and more deployment it becomes harder and harder to make plausible paths; or to spoof plausible paths, I should say. Okay, that's all on the way the verification algorithm actually works. Now, what about the deployment model?
The flow of information in the deployment is really the same as what we have for ROAs. You have an RPKI CA, the RPKI dashboard, where you create ASPA objects; they are published into a repository; they are fetched and validated by RPKI validators; and then there is a protocol that communicates this information to your router. The router essentially gets a table; it doesn't have to do any crypto, but it can use this table to do the analysis we just talked about.

If we look at the standards in the IETF, I think the discussion on almost all of these things has been resolved by now. There was some remaining discussion about the last hop here, the protocol between the RPKI validator and the router that gets the information into the router, but that is being resolved right now as well. So that's good; we're moving forward.

If you look at current implementations, and if you know of others that I have left out here, then please let me know. On the signing side, this has been implemented in Krill, so if you run a delegated CA under an RIR, you can use this today.

At the RIPE NCC we have an implementation of this in our public test environment; an e-mail was sent out about this about a year ago. I want to make sure that we put this information on our website and give it more prominence. We hadn't done so so far because of the ongoing discussions in the IETF, but I think the time is right to do so now.

On the validation side, we have support at least in Routinator and in rpki-client; there may be others, but I don't think so.

Routers: well, among software routers, OpenBGPD and BIRD support it, and I have learned recently that Cisco is actually also working on an implementation; they might reach out to people to see if anybody wants to test with them. What I agreed with them is that after this presentation I will send a link about this to the mailing list for the people that missed it, and they can respond and say: okay, if you want to test with us, contact us. I'll leave that part to them.

So, there is movement. What do we want to do as a next step? We want to work on extending the support in the user interface. What we'll do in this case, just like with the API, is use a feature flag, meaning that we use the same code in our test environment and in production, but to begin with it will only be available in the test environment. Once we have reached the point where we believe this is feature-complete, and we're confident enough that things won't change anymore in the IETF, we can enable it in production.

We're currently working out ideas for how the UI should look; that's Antanelle sitting there, waving. If you have any questions or ideas, or are just curious about how we might do this, you can talk to us. We would be happy to get your feedback, and also happy to take your contact details and contact you later as we work on this going forward.

And ARIN and APNIC are also planning to work at least on supporting this in test environments.

And I think that brings me almost to the end; I have one more slide. If you want to read the ASPA verification draft that's in the IETF, or, if I can mention you, Alexander: talk to Alexander, who knows more about this than I do, then please do. There are also examples that are being used for implementation testing, and there is a formal proof. I find those very difficult to read, but maybe you feel differently, and you'd be very welcome to have a read, of course.

And with that, I get to the end and would like to open the floor for questions and comments.

AUDIENCE SPEAKER: Gus Caplan. I just wanted to comment on compatibility: RTRTR was released last week with ASPA support in both the output and the JSON input, and in all the places you would expect it to be.

TIM BRUIJNZEELS: Nice, good.

AUDIENCE SPEAKER: Maria, developer of BIRD. I am one of the reasons why that thing is still in the Internet-Draft stage, because we got into some serious discussions about the wording of the validation procedure. So anybody who'd like to read the validation procedure and say something about it, please reach out to me or Alexander. Alexander, sorry, I owe you an e-mail on this, so please look at it; there should be more eyes on it than just mine. Thank you.

AUDIENCE SPEAKER: I have a question: how would this validation work in a scenario where an autonomous system is connected to an Internet Exchange and peers there directly, via the route server, not through an upstream provider, and some peers announce the prefixes learned from the IX to their downstreams, and these downstreams sometimes announce them to their own downstreams? Would this be considered invalid if those autonomous systems are not the provider?

TIM BRUIJNZEELS: I think in this case: do you have a non-transparent route server in mind, is that what you mean? They would appear in the path.

SPEAKER: There are two options. First: if it is a transparent Internet Exchange point, you are mainly peering with other members of the Internet Exchange point, and the peering just works fine; nothing happens. If it's not a transparent Internet Exchange point, you should add the AS of the Internet Exchange point's route server into your ASPA record to make it work correctly. So, the document says: if you are not sure whether your Internet Exchange point is transparent or not, include the AS of the route server in the ASPA record.

AUDIENCE SPEAKER: I am talking about a scenario where a peer at this Internet Exchange announces the prefixes learned from the Internet Exchange point to their downstream.

AUDIENCE SPEAKER: Yeah, yeah, I got it. It should work fine if you follow the recommendations from the document. If you are still unsure, catch up with me after the session; I will chat with you and show you how it works.

AUDIENCE SPEAKER: I am speaking as a participant. Could you talk a little bit about comparing ASPA and BGPsec? What does this mean for both the community and the industry? Is one a replacement for the other, or how should this evolve?

TIM BRUIJNZEELS: Well, they do different things. With BGPsec, you get certainty that nothing was changed in the path by any speaker in the path. But you can still have leaks, because it doesn't say anything about whether the path was correct in terms of policy, let's say. With ASPA, on the other hand, you have information about policy that can be used to prevent leaks. In this sense, they complement each other, and it also means that if we do ASPA, it doesn't mean there is no reason at all to do BGPsec. All that being said, and Randy can correct me if I'm wrong, I do think there is a feeling that as ASPA makes it harder and harder to create plausible spoofed paths, it may not solve the problem a hundred percent, but it at least addresses some of it.

RANDY BUSH: A significant difference we're missing here is that ASPA is path based, whereas BGPsec is prefix based. So ASPA could say that this announcement is valid because the path was valid for my red routes, when in fact my blue routes shouldn't be travelling on it, and you will fail to detect that.

IGNAS BAGDONAS: Any other questions or comments? Well then, please, a round of applause.

(Applause)
Next talk.


SIMON LEINEN: Thank you, and thanks Tim for the Tour de France of what's going on in RPKI. This may be a bit lighter. I am talking about a specific infrastructure aspect of running RPKI, in particular route origin validation. I work for Switch, the Swiss Research and Education Network, AS559. We do origin validation; we also publish ROAs, of course, but I'm not talking about that, although I could take questions.

The particular point I'm looking at is rsync as a transport protocol, and trying to use an alternative implementation. The Go in the title is Go the language; I could have written Golang, but I thought it was a nice pun.

So, basically, this was supposed to be just a short experience talk, and this is the gist of the message: I tried this particular alternative rsync implementation, called gokrazy rsync, which was written by Michael Stapelberg, whom some of you might know. We used it to fetch RPKI material in some of our relying party implementations, namely Routinator and FORT; we tried it with both. The result: it works. There are some caveats, which I'll go into, but it seems to be viable, at least with these two implementations. I'm not claiming that we should go back to rsync, in case anybody is insinuating that or starting to misunderstand this. RRDP is great, we love it. But for now, let's assume we still have to support rsync; you can entertain that discussion.

Also, some footnotes: Michael did all the work for this talk; he made some changes to his programme to get it to this stage. Thank you, Michael. Also thanks to Ignas and Tim, who sat through a rehearsal of the talk and gave great feedback.

Why do we think about diversity? I hope many of you here are in the same boat, doing ROV and maybe running your own relying parties. The way we do it, we intentionally run two diverse implementations. We used to use Routinator and OctoRPKI; then we replaced OctoRPKI with rpki-client. That's what we're using in production. Then there's also FORT, which we have not been using so far.

The reasons why we run diverse implementations might extend to the underlying library infrastructure, such as rsync, in the system as well; so that is my weak argument for even looking at this. Again, the assumption is that rsync can't just be switched off tomorrow or something, so...
We had a look at alternatives to the rsync implementation that we, and I think many people, have been using, which is the "Tridge" rsync, the original rsync coded by Andrew Tridgell. Many people also use OpenBSD rsync, or openrsync; I don't think we do, we use whatever comes with Debian.
But those two are written in C and fairly widely used. There are others, and the particular one I have been looking at is called gokrazy rsync. It was written from scratch, from the paper, I think, that specifies rsync, which Andrew Tridgell published long ago. It implements something like version 27 of the protocol; the protocol has evolved to version 31, but he implemented version 27, I am not a hundred percent sure.

Michael Stapelberg may be known to you; he must really like Go. He works for Google in Zurich, and he wrote a complete home router implementation in Go, all the user space in Go; it's called router7, and it might be interesting to you. I think the rsync work comes from that space.

So that's what I'll be talking about. There are other implementations I have not looked at; for sure, many people have tried to rewrite it in Rust, but maybe that's for future study, or for others.

So, why consider Go for implementing something like rsync? One reason is memory safety. I know there have been debates about how valuable that really is as a feature. I have no personal experience with Go, but I have lots of experience with other memory-safe languages like Lisp and Python, and I think there is a point that it's actually safer if you don't have to worry about writing beyond array boundaries. But of course, people have been doing good work in languages without that feature.

But maybe the impetus for looking at this at all: I don't know whether you remember, but in January this year there were some vulnerabilities reported against the common rsync implementation that had fairly high risk scores; some of us were trying to upgrade, and these issues were related to exactly the kind of memory handling that is hard to get wrong in a language like Go. So, I think there is some empirical basis for the suggestion to move to memory-safe languages. Also, Go has nice features for making use of multiple threads, and today all computers are very multi-threaded, so there is potential for performance improvements. There are other reasons too; it has a modern standard library and an ecosystem that make it easier to write safe programmes, handling files and so on, but I'm not going into them further.

Why this particular gokrazy implementation? I think there are other Go implementations, but for me this one was attractive because the author has been working with large systems, seems to be a known person in the Go community, and has been publishing stuff, so it's not just recreational programming. More importantly, when I reached out about the idea, he was very open to helping me, or the community, both with advice and by implementing missing features, and he has done a bit of that. Also, we both used to live in Zurich, until I moved to another part of the country. For me that was an argument: if he was doing too much work, I could easily buy him a beer or something.

What's the status of compatibility? I didn't manage, or really try, to make it work with rpki-client, because rpki-client makes quite some hard-wired assumptions about command-line options that must be present in the rsync client, options that are of course present in openrsync; when I started trying, many of them were missing from gokrazy. But Routinator, the production implementation we use, could be made to use this alternative rsync: I had to tell it which command-line arguments are safe to use; there is a configuration example for that. And later I also discovered FORT, which I hadn't used previously; it can also be configured in a similar way to be compatible with gokrazy. So it's not a one-to-one drop-in replacement yet, but you can make it work in this context for sure.

Then I got it all running on test servers, and I wanted to know whether it also produces reasonable results, because at some point I got no more error messages, but that's maybe not enough to trust it. So I did this, just this week, from the hotel: I set up two Debian VMs in our VM infrastructure and installed the same recent version of FORT. That is what makes FORT interesting for these tests: you can set priorities for RRDP and rsync, so you can tell it to use rsync wherever available and to try RRDP only if that doesn't work.

So this is a configuration that will exercise rsync a lot; it's harder to do with Routinator, so I set this up, started with an empty cache, let it run, and looked at what happened. Starting with an empty cache, the system, like all relying parties I guess, will go and collect the RPKI material from everywhere, starting from the trust anchors. Once it has done that, it validates the data, which then becomes available for routers to use. And it logs when it has done that, because it's happy.

Subsequently it will periodically, I think with FORT it's about once an hour, go through it all again and update the information it has received, if anything changed in the repository.

So, here are the results; I hope I can read them well. There are two columns of numbers: the left column holds the results for the standard rsync, and the right column the results for gokrazy rsync. You see that with the alternative gokrazy implementation, this first run to collect the data takes quite a bit longer. That seems unfortunate. But you also see, in the third row of numbers, that after this initial run, gokrazy rsync ends up with 5,000 more validated ROAs. This is just about 1%, but it seemed to be repeatable; okay, this is from one run, but I didn't trust my eyes at first, so I ran it a couple of times, and the results were very similar.
So, that could be worrying: maybe it's getting some stuff that shouldn't really be there, and that would be bad. But I thought about this and came to the conclusion that it's just more patient: it picks up stuff that takes too long for the old rsync, where the old rsync gives up while the new one is still trying to get the information. The reason I think this is that if you leave both systems running for a while, the subsequent runs, which don't have to transfer everything, just check what has changed and transfer that data, are very similar in speed, no clear winner; and the number of validated ROAs with traditional rsync converges to what gokrazy rsync got on the first run. So that's reassuring; I guess we don't get much extra stuff.

We do get a little extra stuff in terms of files. I looked at that: for example, there are files called something-dot-ASPA which, with traditional rsync, aren't transferred because they are intentionally excluded; the new rsync doesn't support the exclusion option yet, so it just transfers them. But I don't think that hurts a lot.

Performance-wise, my conclusion would be: okay, this doesn't take ten times longer, it's very similar in performance, and it seems to get the right data. Highly unscientific measurements; I hope there are not too many measurement research people listening, because this is not how you should do measurements, but it's what an operator does when they want to know whether something works.

Looking towards the future, what could we do? Maybe implement those missing options; I think include and exclude could potentially be valuable. I don't think they are that vital, because there is not that much other stuff that we are interested in, but it could be good.

One thing that I noticed: unlike many utilities that, when they are done, either exit with a zero exit code or with a single error exit code like 1, Tridge, the original rsync, has a lot of different exit codes meaning different things, like "I could not reach the server", or "I was able to talk to the server but then it timed out" or something. gokrazy rsync does not have that; it's just 0 or 1. And there are actually some relying party implementations that expose these error codes as Prometheus metrics, so you can make a Grafana dashboard from them; this is a little piece of functionality that you lose when you replace traditional rsync with gokrazy rsync. Maybe it could be implemented as well.

Another possibility: if you are considering using or writing a Go-based relying party implementation, which I don't think exists anymore (OctoRPKI was one, but that seems to be gone), then you could easily integrate this as a library. For relying parties in other languages, it may not be attractive to use a Go library; I'm not sure.

The big question is: why do we spend time on this? I mentioned I like RRDP; there has been a lot of discussion about why that's not good enough. Before this week I was actually more defensive, or more worried, about this argument. My thinking was: even if we manage to deprecate rsync use with RPKI, which some people have tried doing, but it's a bit hard to manage, we will have to live with it for a couple of years. Since then I have actually heard some arguments why it would be good to keep it, for example to use rsync as a fallback protocol in cases where RRDP doesn't work or has problems. So maybe I'm less negative about that now.

And the general moral here is: yes, this is clearly the old thing, but as a community I think we have been quite good at keeping old stuff (maybe IPv4 is a bad example) running despite new, sexier things coming up. This might not be where the money is; don't work on these kinds of things if you want to be the next whatever, or make a career. But it's part of the gardening, the chores that we have to do to keep this thing running, right. And of course the actual reason was that I was interested in Go, and I thought: maybe this would be a good opportunity to get my feet wet a little bit and do some coding in Go, and when I run into issues I can always ask this pro to fix them. But of course, as these things go, Michael did all the coding work, and I am still waiting for an opportunity to learn some Go. That's it. I hope you found some of this relevant; I'm happy to take questions. Thanks a lot.

(Applause)

AUDIENCE SPEAKER: Tim, RIPE NCC. Thank you for doing this research; I think it's good to have diversity in rsync clients. I was one of the people who at some point tried to really deprecate rsync, but the message back from the IETF was that RRDP is preferred, but we want a fallback protocol. Given that, we actually need to run an rsync server, and diversity in server implementations would also be very helpful. Do you know if there is an effort to make this into a server?

SIMON LEINEN: Honestly, I didn't think about that before this week, because we only validate, and we outsource all the publication to the RIPE NCC, who have been doing a great job. But thinking about it, I think gokrazy rsync will probably be even more useful on the server side, the publishing side, because Michael had quite a few thoughts about doing this securely, making use of restrictive mechanisms and safe library calls specifically on the server side, so that you can reduce the attack surface there. I encourage people to look at this; but as I said, we don't publish, we let you publish. I think that would be potentially even more useful.

AUDIENCE SPEAKER: Ben from BGP tools, speaking for myself, not the Working Group. Something worth keeping in mind: the lack of include/exclude is a bit of a problem, because if somebody accidentally puts a 5-gigabyte Windows 11 image on their rsync endpoint by mistake, a lot of people's VMs don't have enough disk space, and without those flags it will sync it down. In theory, people could put up a 4-gigabyte X.509 certificate, but that's slightly less likely. Slightly. So it's not bullet-proof against misconfiguration, but it definitely helps against the really silly mistakes.

SIMON LEINEN: Yes, that's a good point; I have certainly not done any security testing. I know there is also another option, max-size or something, that may be supported; it might be more effective against that case, because you could also just call the file something-dot-ROA and it would be transferred. But yeah, that's a good point.

AUDIENCE SPEAKER: Also, to the previous comment: I think openrsync does have a server back-end that you can use, which was one of the major motivations for writing it.

SIMON LEINEN: Yeah, I am sure openrsync has many well-thought-out security features; I just don't know it intimately, but yeah...

RANDY BUSH: Ben, don't let Windows on your machine.

SIMON LEINEN: I haven't tried it on Windows, but it should work.

AUDIENCE SPEAKER: Actually, a question about tools versus libraries. You have now developed the equivalent of a tool. What would you think of developing a library instead, in one particular language of choice, like C++, and defining bindings to other, more or less popular, languages? Instead of maintaining separate tools, you maintain a smaller number of libraries that can get reused. What's your opinion?

SIMON LEINEN: Yes, so there are aspects of this related to the Go implementation; I am sure the author has library use in mind. But I also think this might not be helpful for people wanting to use languages other than Go. So I think you are alluding to focusing work on writing a library in a traditional language, or a language that interfaces more easily with other runtime systems, like C, C++, maybe Rust. That is a fair argument, and I don't want to attract people to contributing to this Go implementation or Go ecosystem without first thinking about whether it would be better for the community to invest that effort into maintaining quality implementations in more conventional languages that interface with other systems more easily. Does that answer your question?

IGNAS BAGDONAS: Not that I was expecting a definitive answer to that. This is something that goes much more into the software engineering aspects of overall systems, not just a particular tool. Specifically, there are languages that are easier to interface with, for the reason that they have a stable ABI, and some of them don't and probably don't plan to. It certainly can be done; the question is at what cost and at what complexity.

SIMON LEINEN: Yeah, absolutely; that's an important consideration. From what I know as an outsider, I think the Go community has been quite conscious of this and has done a relatively good job of keeping compatibility, at least across versions, so the stable-interfaces consideration is, I think, very much part of the Go mindset. That's my impression. If you try to go outside of that language ecosystem, though: in Java you have the VM and you can mix languages easily, I don't know, use a Scala library from a Kotlin programme or something. That's not Go. But otherwise, for versioning, I would place some trust in Go as a platform. I have very weak opinions on Go though; I am not an expert by any stretch.

IGNAS BAGDONAS: Excellent, thank you. Any more questions? Do we have anything online? Last call for questions. No questions; then, a round of applause.

(Applause)
Now we are moving to a slightly different talk. Don't be surprised to see Alexander on stage: he will not be talking about ASPA, he will be talking about something very different, but that's just a different aspect of networking.

ALEXANDER AZIMOV: Hello everybody. This is the Routing Working Group, and yet I will not be speaking about BGP; I'll not be speaking about RPKI or routing incidents.

Let's talk about load balancing. In this talk I will present the results of our research into whether it is feasible to build a stateless Layer 4 load balancer.
You are all network engineers; imagine that you have backends. Normally these are some kind of HTTP servers, and you need to balance ingress traffic towards them. How to do it? Of course, you can just plug these backends directly into a switch; you will have multipath, and it will work. It will balance traffic, but will it scale? Unfortunately, it will not scale beyond the rack, or beyond the size of the multipath group. There are lots of limitations that really don't meet the requirements of applications with significant traffic volume.

So, usually the high-level design of a system that processes HTTP requests looks like this one: there is a network; there is multipath that balances traffic across the Layer 4 load balancers; and the Layer 4 load balancers perform a kind of magic, which we will go deep into during this talk, and deliver traffic to the backends.

So, how does a Layer 4 load balancer work? It depends on the implementation, but normally it starts with DNS: when you type a name into your browser, you receive A or AAAA records, and the Layer 4 load balancers advertise these prefixes into your network. When this traffic is received, it may be encapsulated, as shown in this picture: the Layer 4 load balancer selects a backend to send traffic to for a specific TCP connection, or UDP connection, it doesn't really matter. This is why it's Layer 4: it works not only with the IP header; normally it also works with the TCP or UDP headers. What is quite important is that the response from the backends goes directly to the end user, so the traffic is asymmetric. The reason for this is cost, because the volume of traffic that your backends send back to the world is normally much higher than the volume of traffic they get from the world.

But what if one of the backends fails? It's the responsibility of the Layer 4 load balancer to detect this and change its balancing policy so that the traffic is rescheduled to the backends that are still alive.

It uses hash tables to select the backends and to load-balance traffic between them. In a simple situation, it applies, for example, a hash function and then takes the result modulo the number of backends. As you can see in this example, on the left side there is a map of keys to backends when we have 3 backends, and on the right side when we have 4 backends. The result is a significant change in this map, and I believe it is not the expectation of end users or of application developers that every TCP session is reset whenever one backend goes offline. So if we have started sending traffic to a selected backend for a given TCP session, we should keep sending it there.
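As an illustration of that remapping (a made-up sketch, not the speaker's code), here is what naive modulo hashing does when the backend count changes:

```python
# Naive hash-then-modulo backend selection: going from 3 to 4 backends
# remaps most flow keys, which would reset most live TCP sessions.
import hashlib

def pick_backend(flow_key: str, n_backends: int) -> int:
    digest = hashlib.sha256(flow_key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % n_backends

flows = [f"198.51.100.{i}:49152" for i in range(1, 101)]
moved = sum(pick_backend(f, 3) != pick_backend(f, 4) for f in flows)
print(f"{moved} of {len(flows)} flows change backend")  # typically around 75%
```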

And this gives us a problem; not a problem, but it makes our system stateful. Let's see how it works. When we receive a SYN packet, we select a backend according to our current hash table, and we add the connection to a connection state table. When we receive an ACK packet, we check this table: if there is a match, we send the traffic to the selected backend; if there is no match, we just drop the packet.

But there is a question: how do we delete connections from the connection table? We see only half of the packets, only the ingress flow, so there is no reliable way other than a time-out (watching for FIN packets is not enough). So, if a session is not active for a selected period of time, our garbage collection can remove it from the table, and everything should be fine.
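Put together, the stateful scheme just described might look like the following; a hedged sketch of the logic with hypothetical names, not the actual load balancer code:

```python
# SYN: pick a backend and record it; other packets: follow the table or
# drop. Since only ingress packets are visible, an idle time-out is the
# only reliable way to expire entries.
import time

SESSION_TIMEOUT = 300.0                            # assumed idle time-out, seconds
conn_table: dict[tuple, tuple[int, float]] = {}    # flow -> (backend, last_seen)

def handle_packet(flow: tuple, is_syn: bool, backends: list[int]):
    now = time.monotonic()
    if is_syn:
        backend = backends[hash(flow) % len(backends)]  # current hash table
        conn_table[flow] = (backend, now)
        return backend                         # encapsulate towards this backend
    entry = conn_table.get(flow)
    if entry is None:
        return None                            # no state: drop
    conn_table[flow] = (entry[0], now)         # refresh last_seen
    return entry[0]

def garbage_collect():
    # Expire sessions that have been idle longer than the time-out.
    now = time.monotonic()
    for flow, (_, last_seen) in list(conn_table.items()):
        if now - last_seen > SESSION_TIMEOUT:
            del conn_table[flow]
```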
So, what have we learned?
Commonly, Layer 4 load balancers work only with ingress traffic. They perform active health checks, maybe TCP based, maybe HTTP based, maybe based on some specific protocol, but active health checks. They are stateful. And they are vulnerable to DDoS attacks. Let's go back to one of the previous slides with the high-level design of our system, and imagine there is a SYN flood attacking one of our services. What we expect is that the Layer 3 switches are simply invulnerable to this DDoS attack, and on the server side we have Linux or something like it with SYN cookies, which kind of protects the servers against the SYN flood. But the Layer 4 load balancers can't protect themselves in any way. The result may be that the connection tracking table grows, and grows fast; you get collisions, and once you are facing collisions you face service degradation first, and after that you may experience a full denial of service.

And this problem is not a new one. There has been a lot of work in academia trying to find ways to remove state from Layer 4 load balancers.

Commonly, we can cluster the approaches into two main options. The first: move flow state into the network. For example, offload it, using maybe SRv6, maybe another tunnelling technology, so that traffic can be rerouted between backends: when a backend receives traffic that is not supposed to be local, when there is no locally active connection for it, it is rerouted to another backend. Another approach is to use TCP itself to offload data about the chosen backend: there is some space in the TCP options, and you can put a key, some identifier of the backend, into this options field. But there is a problem: both scenarios require deep integration between your application and the end users, or between your application and the network. To be realistic, solutions that require integration between your application and the end users, the browsers, just don't work. If it is only integration with your network, it may work, but you will still have issues; it will be really hard to deploy in production. So the question was: is it possible to get rid of state without requiring support at the end-user level or in the network?
Let's take a little detour and talk about consistent hashing. Previously we saw that rehashing can dramatically change the map between keys and backends. Consistent hashing is a family of hashing methods that provide a way to hash such that the fraction of keys that change corresponds to the fraction of backends that changed. For example, if you have 100 backends and only one of them flaps, it will not affect the full map; it will affect only a small fraction, in an ideal situation 1% of the table.
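As a toy illustration of the idea (my own sketch; the speaker's prototype uses maglev hashing, which works differently inside), a classic consistent hash ring with virtual nodes behaves like this:

```python
# Each backend gets many points on a ring; a key maps to the next point
# clockwise. Adding one backend only steals the keys near its own points.
import bisect
import hashlib

def _h(s: str) -> int:
    return int.from_bytes(hashlib.sha256(s.encode()).digest()[:8], "big")

class HashRing:
    def __init__(self, backends: list[str], vnodes: int = 100):
        self._points = sorted((_h(f"{b}#{i}"), b)
                              for b in backends for i in range(vnodes))
        self._keys = [p for p, _ in self._points]

    def lookup(self, flow_key: str) -> str:
        idx = bisect.bisect(self._keys, _h(flow_key)) % len(self._keys)
        return self._points[idx][1]

old_ring = HashRing(["b1", "b2", "b3"])
new_ring = HashRing(["b1", "b2", "b3", "b4"])
flows = [f"flow-{i}" for i in range(1000)]
moved = sum(old_ring.lookup(f) != new_ring.lookup(f) for f in flows)
print(f"{moved} of 1000 flows moved")   # roughly a quarter, not ~75% as with modulo
```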

Here is an example of how it works; it was a real implementation in Python, I believe. As you see, in this example we added one backend, and the hash ring changed, but not as much as when we were just using division modulo. Is this enough to get rid of state? Unfortunately, it is not, because there are still changes in the key map, and we want all previously established TCP connections to stay established; we don't want them to be reset. The problem remains that when the set of backends changes, the hash changes, and we don't know how it changed. And this holds until we change the rules. What if, when we rehash, we don't drop and replace the old hash ring, but keep both, the old hash ring and the new hash ring? If we do that, we can see which keys are stable and which keys are unstable. Do we need state for stable keys? No, we don't, because such a key selected some backend under the old hash and selects the same backend under the new hash. With unstable keys it's more difficult. Let's see how it works.

In the case of a SYN packet, first we check: is the hash stable for this key? If it is stable, we just follow the regular procedure: we encapsulate the packet and don't create any state. If it is not stable, we create state and, again, encapsulate the packet. With an ACK packet it is more difficult. First, we check whether state exists; if it does, we send the traffic to the backend selected there. Second, if the hash is stable for this key, once again we don't need state: we know we were previously sending traffic to the selected backend, and we can keep sending traffic to it. And if it is not stable, we can guess that we were using the old hash to send traffic to the backend, so we create state and keep sending traffic using the old hash.
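Here is my reconstruction of that packet-handling logic as a sketch, reusing the toy HashRing, old_ring and new_ring from above (the real prototype works on maglev tables inside the kernel, so this is illustrative only):

```python
# State is created only for flows whose mapping differs between the old
# ring and the candidate ring; stable flows need no state at all.
state: dict[tuple, str] = {}   # only unstable flows end up here

def balance(flow: tuple, flow_key: str, is_syn: bool) -> str:
    old_b = old_ring.lookup(flow_key)
    new_b = new_ring.lookup(flow_key)
    if is_syn:
        if old_b != new_b:
            state[flow] = new_b      # unstable: remember the choice
        return new_b
    if flow in state:
        return state[flow]           # unstable flow we have seen before
    if old_b == new_b:
        return new_b                 # stable: both rings agree, no state needed
    # Unstable flow with no state: it must predate the ring change, so
    # assume it was mapped by the old ring, and pin it from now on.
    state[flow] = old_b
    return old_b
```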

It's also important to understand how to merge. When the set of backends changes, we have two hash rings, and at some point we would like to merge them; when we merge them, we get back to the fully stable system where we don't need state at all. If there are continuous changes to the set of backends, we don't create a third or a fourth hash ring; we only replace the candidate hash ring. And if there are no further updates for a time-out equal to the session time-out, we can guarantee that all sessions established before the change were mapped through the proper hash ring, with state created where needed, and we can merge the rings and have a single hash again.
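And the merge rule, in the same hedged sketch (reusing HashRing, the rings, the state table and SESSION_TIMEOUT from the earlier sketches): further changes only replace the candidate ring, and after one quiet session time-out the rings collapse into one and the per-flow state can be dropped.

```python
import time

last_change = time.monotonic()

def on_backend_set_change(backends: list[str]) -> None:
    global new_ring, last_change
    new_ring = HashRing(backends)    # replace the candidate; never a third ring
    last_change = time.monotonic()

def maybe_merge() -> None:
    global old_ring
    if time.monotonic() - last_change > SESSION_TIMEOUT:
        old_ring = new_ring          # the rings converge again...
        state.clear()                # ...and all per-flow state can go
```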

So, that is a nice theory, and we decided to check it, in our lab at least. We needed an implementation of a consistent hash ring, and we needed a Layer 4 load balancer to build on. At the time we did this research, our choice was IPVS, with the implementation of maglev hashing that was published a few years ago by Google. So we took this piece of open source and implemented our scheme on top of maglev hashing: stateless maglev. Here is how it works: in this scenario, we were making a lot of frequent requests to our backends, and we also introduced some instability into the backends, a number of flaps, and we measured how the fraction of flaps affects the system. As you can see, if there are no flaps, there are zero states, because the hash ring is stable and we don't keep track of any TCP states in the system.

With 10% flaps, we still need an order of magnitude fewer states than the vanilla system. More importantly, and it's quite important in terms of CPU usage, it reduces CPU usage by nearly four times compared to the classic maglev implementation, which is not affected by the instability: it always keeps states and always performs lookups in this growing table. In our case, yes, it was quite effective.

So, what we learned: it is possible to create a nearly stateless Layer 4 load balancer. It significantly reduces both the number of states in the connection tracking table and the CPU usage, in our implementation on top of IPVS. Most importantly, this approach doesn't require any kind of integration with your network or with the end users.

And our prototype is open source; you can follow the link and see it on GitHub, and if by chance you are still using IPVS in your environment, you can try it.

Thank you for listening.

(Applause)

AUDIENCE SPEAKER: Hi, Christopher. Have you compared it to, for example, the GitHub GLB implementation, or Cloudflare Unimog, which use a second-chance hop instead of keeping state on the load balancer?

ALEXANDER AZIMOV: From what we have seen of the attempts in the industry, and we took a look at what the GitHub team was doing, all of them require integration with the backends, and this is the main difference here. Comparing performance is kind of irrelevant, because you should compare performance on the same base system. The approach itself you can implement on any kind of Layer 4 load balancer; the only requirement is an implementation of consistent hashing. And, to be fully open: in our environment, this project was kind of unlucky. When we started it, there was a parallel activity, an ongoing project to switch our Layer 4 balancer environment to a new system built on top of DPDK, with much higher performance; that system is called Yanet. It's also open source, but without this double-hashing model, and I hope it will eventually be open sourced with this model too.

AUDIENCE SPEAKER: Ray. I am wondering how this compares with an application-level load balancer, which would typically be used in production to solve this issue of TCP connections failing when a backend fails, because the connection is terminated there and a new one is set up. Would this have an advantage compared to application-level load balancers, in that you can, for example, guarantee end-to-end encryption between client and server?

ALEXANDER AZIMOV: Normally... let's go back to the high-level design picture.

Normally, in a large environment, when you have complicated applications, these backends are what you would call the application load balancers themselves. So the question is how to balance the traffic coming into your network between the application load balancers. It's all part of the same system, and there is no conflict of interest; they just work together.

AUDIENCE SPEAKER: Hi. Hash tables are unsorted, and you have to do lookups and insertions into the connection table, so I guess performance could suffer once the hash tables get really large. Have you done any testing with a significant number of connections; I don't know, thousands, millions?

ALEXANDER AZIMOV: As you can see, what we were testing was about 250,000 simultaneous connections, every second.

AUDIENCE SPEAKER: So that was not a problem in practice.

ALEXANDER AZIMOV: No, no problem. It was quite an interesting project; I really loved working on it. The precondition of checking the stability of the hash ring just works.

IGNAS BAGDONAS: Any other questions? No more questions; then thank you.

(Applause)
This brings the content of our meeting to the end. What remains? Again, an important message: if you are thinking about submitting something for RIPE 91, do it earlier rather than later. If you want to discuss whether you should submit, you can find the chairs and discuss it. And another one: the PC elections, that's the message we received. You still have an hour, or an hour and a half, so cast your vote. That is important.

With that, the meeting comes to an end.

BEN COX: We are finishing early, so you have plenty of time to log into the portal and rate the talks. Which I appreciate.

IGNAS BAGDONAS: This is important. This is a feedback channel for us, and it influences what content you will eventually see at upcoming meetings. Thank you.

(Applause)
(Coffee break)